Hi Damjan

Excellent news!

Although I do not use Base (yet!), I use the CSV import/export in Calc on a
regular basis.

I will build my own 41X binaries on Ubuntu and give them a try!

Can you share a link to a test >64 KiB CSV?
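
In the meantime, a test file could probably be generated locally. Below is a
minimal sketch in Python, not an official test case: the function name
write_test_csv, the column names and the 70,000 character target are all my
own assumptions. It writes one quoted field of just over 64 KiB characters
that spans multiple lines, followed by a normal row, so it should be easy to
see whether the data after the large field is still read correctly.

```python
import csv

# Hypothetical target size: a bit above the old 64 KiB (65,536) character limit.
FIELD_CHARS = 70_000


def write_test_csv(path, field_chars=FIELD_CHARS):
    """Write a CSV with one quoted multi-line field longer than field_chars.

    Returns the length of the large field in characters.
    """
    line = "x" * 99  # 99 characters plus a line feed = 100 per line
    big_field = "\n".join([line] * (field_chars // 100 + 1))
    # newline="" lets the csv module control line endings itself,
    # so the embedded line feeds inside the quoted field are kept as-is.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, quoting=csv.QUOTE_ALL)
        writer.writerow(["id", "text"])
        writer.writerow([1, big_field])
        writer.writerow([2, "row after the big field"])
    return len(big_field)


if __name__ == "__main__":
    size = write_test_csv("big_field_test.csv")
    print(f"Wrote big_field_test.csv with a {size} character field")
```

If the import works, the row with id 2 should show up intact after the large
field.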

All the best,
Pedro

> On 07/22/2024 5:43 PM WEST Damjan Jovanovic <dam...@apache.org> wrote:
> 
>  
> Hi
> 
> A long and painful bug with CSV files has been that OpenOffice has a row
> (line) limit of 64 KiB (in characters), as well as a field limit of 64 KiB.
> While individual lines of >= 64 KiB are rare, quoted CSV fields can span
> multiple lines.
> 
> Now, at least in Base, this limitation is finally gone!!! With my commit
> 7b2bc0e6bba2fbc38d078306fe10d875115d6c86:
> - New member functions were added to the main/tools SvStream class to work
> with 32 bit OUString and OStringBuilder when reading lines.
> - The helper class QuotedString had to be upgraded from using the 16 bit
> String to the 32 bit OUString.
> - The CSV database driver was patched to use OUString and 32 bit indexes in
> various places.
> - Luckily, little other work was needed, as the ORowSetValue class already
> uses 32 bit OUString, and was previously converting 16 bit String to 32 bit
> OUString internally anyway.
> 
> And it works now, and works well! CSV files load fully. A field over 64 KiB
> no longer corrupts reading of further data from the file. Cells copy to
> Calc, copy out to other applications (admittedly truncated when over 64
> KiB, and with line feeds removed, but that's due to other bugs in the
> clipboard, because pasting to Calc doesn't remove line feeds). SQL queries
> on CSV files work and give correct results.
> 
> Oh and the new storage limit imposed by OUString is the signed 32 bit
> limit, about 2 GiB.
> 
> This should be seen as phase 1. In phase 2 I want to unify CSV parsing
> between Base and Calc, reducing the amount of code we need. And at least,
> Calc should read 32 bit OUString rows, and then possibly impose 64 KiB
> field size limits, instead of the current 16 bit String sized lines and
> corruption of further data
> (https://bz.apache.org/ooo/show_bug.cgi?id=91028).
> 
> Cherry-picked to AOO42X in 556cbcf7b90911f7cbf5dcdaffd2767ad2b2e230.
> Cherry-picked to AOO41X in f13410cf5cd4d68144c38af7cf9e805599c0d5cf (and 3
> previous backported CSV patches).
> 
> Regards
> Damjan
