https://bz.apache.org/ooo/show_bug.cgi?id=91028
--- Comment #10 from [email protected] ---

---snip---
sal_Bool ScImportExport::ImportStream( SvStream& rStrm, const String& rBaseURL, sal_uLong nFmt )
{
    if( nFmt == FORMAT_STRING )
    {
        if( ExtText2Doc( rStrm ) )      // pExtOptions auswerten (evaluate pExtOptions)
            return sal_True;
    }
    ...
---snip---

The nFmt is FORMAT_STRING, so it immediately calls "ExtText2Doc( rStrm )", as is visible in frame #0 in my previous comment.

Let's set a breakpoint there and load a CSV file with a huge line: five fields, 20000 each of a, b, c, d, e, separated by commas, 100004 characters on one line (5 x 20000 plus 4 commas).

This line:

---line---
rStrm.ReadCsvLine( aLine, !bFixed, rSeps, cStr);
---line---

calls SvStream::ReadCsvLine() in main/tools/source/stream/stream.cxx, which ultimately ends up in SvStream::ReadLine( ByteString& rStr ). There, it incrementally reads chunks of at most 256 bytes from the file, looks for the end of line, and appends each chunk to the string that was passed in:

---line---
rStr.Append( buf, n );
---line---

Putting a breakpoint there and printing the length of the rStr string shows that it increases on each loop iteration, reaches 65535, then remains stuck there:

---snip---
Thread 1 hit Breakpoint 17, SvStream::ReadLine (this=this@entry=0x80b409830, rStr=...) at source/stream/stream.cxx:736
736         rStr.Append( buf, n );
$396 = 65024

Thread 1 hit Breakpoint 17, SvStream::ReadLine (this=this@entry=0x80b409830, rStr=...) at source/stream/stream.cxx:736
736         rStr.Append( buf, n );
$397 = 65280

Thread 1 hit Breakpoint 17, SvStream::ReadLine (this=this@entry=0x80b409830, rStr=...) at source/stream/stream.cxx:736
736         rStr.Append( buf, n );
$398 = 65535

Thread 1 hit Breakpoint 17, SvStream::ReadLine (this=this@entry=0x80b409830, rStr=...) at source/stream/stream.cxx:736
736         rStr.Append( buf, n );
$399 = 65535
---snip---

That's because rStr is of type ByteString, which in tools/inc/tools/string.hxx is defined with a 16-bit maximum length:

---snip---
#ifdef STRING32
#define STRING_NOTFOUND    ((xub_StrLen)0x7FFFFFFF)
#define STRING_MATCH       ((xub_StrLen)0x7FFFFFFF)
#define STRING_LEN         ((xub_StrLen)0x7FFFFFFF)
#define STRING_MAXLEN      ((xub_StrLen)0x7FFFFFFF)
#else
#define STRING_NOTFOUND    ((xub_StrLen)0xFFFF)
#define STRING_MATCH       ((xub_StrLen)0xFFFF)
#define STRING_LEN         ((xub_StrLen)0xFFFF)
#define STRING_MAXLEN      ((xub_StrLen)0xFFFF)
#endif
---snip---

There are multiple ways to fix this:

1. Globally define STRING32, which could have many unintended consequences, such as larger spreadsheet cell strings.
2. Pass a different string buffer type to SvStream::ReadCsvLine(), so that it is unaffected by that limit.
3. Drop the stream entirely and switch to push-model CSV parsing, which is the lightest on memory and could be paused and resumed, but needs more complex code.
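
To make the saturation seen in the debugger output easier to reason about, here is a minimal standalone sketch of the same bookkeeping. It is not the real ByteString: CappedLength and SimulateReadLine are hypothetical names, and the silent clamp in Append is an assumption inferred from the 65280 -> 65535 step in the output above, not taken from the ByteString source.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a string whose length type is 16 bits wide.
// It models only the length bookkeeping, not the character data.
struct CappedLength
{
    static constexpr std::size_t kMaxLen = 0xFFFF; // same value as STRING_MAXLEN without STRING32
    std::uint16_t nLen = 0;

    // Append n bytes, silently dropping whatever does not fit under the cap
    // (assumed behaviour, inferred from the debugger output).
    void Append(std::size_t n)
    {
        std::size_t nRoom = kMaxLen - nLen;
        nLen = static_cast<std::uint16_t>(nLen + std::min(n, nRoom));
    }
};

// Feed nLineLen bytes in chunks of at most nChunk bytes, the way the
// ReadLine loop does, and return the length the capped string ends up with.
inline std::size_t SimulateReadLine(std::size_t nLineLen, std::size_t nChunk)
{
    assert(nChunk > 0);
    CappedLength aStr;
    for (std::size_t nRead = 0; nRead < nLineLen; nRead += nChunk)
        aStr.Append(std::min(nChunk, nLineLen - nRead));
    return aStr.nLen;
}
```

Under these assumptions, SimulateReadLine(100004, 256) returns 65535: the 256th chunk only has room for 255 bytes, and every later chunk is dropped entirely, which matches the lengths printed at the breakpoint.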

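For the push-model option, a rough sketch of what such a parser could look like is below. Everything in it is hypothetical (PushCsvParser, Feed, Finish and the callbacks do not exist in the OpenOffice tree), and it assumes plain quote-doubling for escapes and '\n' or "\r\n" record ends. The point is only that the caller pushes chunks of any size and the parser keeps its state between calls, so no whole-line buffer is ever needed.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Hypothetical push-model CSV tokenizer: one callback per completed field,
// one per completed record.
class PushCsvParser
{
public:
    using FieldFn  = std::function<void(const std::string&)>;
    using RecordFn = std::function<void()>;

    PushCsvParser(FieldFn onField, RecordFn onRecord, char cSep = ',', char cQuote = '"')
        : m_onField(std::move(onField)), m_onRecord(std::move(onRecord)),
          m_cSep(cSep), m_cQuote(cQuote)
    {
        assert(m_onField && m_onRecord);
    }

    // Consume one chunk. State survives between calls, so a field or a
    // quoted section may straddle chunk boundaries.
    void Feed(const char* pBuf, std::size_t nLen)
    {
        for (std::size_t i = 0; i < nLen; ++i)
            Step(pBuf[i]);
    }

    // Signal end of input, flushing a last record that has no newline.
    void Finish()
    {
        if (m_bDirty)
            EndRecord();
    }

private:
    enum class State { Plain, Quoted, QuoteInQuoted };

    void Step(char c)
    {
        switch (m_eState)
        {
        case State::Quoted:
            if (c == m_cQuote) m_eState = State::QuoteInQuoted;
            else m_aField += c;
            return;
        case State::QuoteInQuoted:
            if (c == m_cQuote) { m_aField += c; m_eState = State::Quoted; return; }
            m_eState = State::Plain; // that was the closing quote; handle c as plain
            break;
        case State::Plain:
            break;
        }
        if (c == m_cQuote && m_aField.empty()) { m_eState = State::Quoted; m_bDirty = true; }
        else if (c == m_cSep) { EndField(); m_bDirty = true; }
        else if (c == '\n') EndRecord();
        else if (c != '\r') { m_aField += c; m_bDirty = true; }
    }

    void EndField()  { m_onField(m_aField); m_aField.clear(); }
    void EndRecord() { EndField(); m_onRecord(); m_bDirty = false; }

    FieldFn     m_onField;
    RecordFn    m_onRecord;
    char        m_cSep;
    char        m_cQuote;
    State       m_eState = State::Plain;
    std::string m_aField;         // only the current field is buffered
    bool        m_bDirty = false; // true once the current record has content
};
```

A caller would read the file in fixed-size chunks, call Feed() on each, and call Finish() at end of file. Memory use is then bounded by the longest single field rather than the longest line, and parsing can be paused between any two Feed() calls.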