https://bugs.documentfoundation.org/show_bug.cgi?id=150141
--- Comment #3 from Pierre Fortin <[email protected]> ---
(In reply to Roman Kuznetsov from comment #2)
> Calc supports only ~1 million rows by default

Yes, but this report is against the new jumbo feature of 16M rows, which is
almost enough for what my team needs.

> Anyway please attach your CSV here

Plenty of examples are available at https://dl.ncsbe.gov/?prefix=data/ --
look for the big files. These zip files mostly contain a single .txt (mostly
tab-separated "CSV"); see also the Snapshots folder. Files may contain tab-
or comma-separated data, but vary in encoding. If the 16-bit-encoded files
give you trouble, you can "clean" them up with the Linux command:

  tr -d '\000"\r\377\376\275' < infile.txt > outfile.csv

Cool! This daily build has a progress bar. Loading a 5.7GB sheet: the
progress bar reached the end 1 minute (+/- 5 seconds) after starting the
load. While I was waiting for the sheet to display, after another 1:40 the
load failed with "too many rows"; less than a minute after clicking OK, the
sheet appeared. That is a HUGE speed improvement over my initial tests about
a week ago. The sheet shows 16,777,216 rows.

This file is from https://s3.amazonaws.com/dl.ncsbe.gov/data/ncvhis_Statewide.zip

$ ll ncvhis_Statewide-20220723-070658.csv
[snip] 4265533961 Jul 23 07:06 ncvhis_Statewide-20220723-070658.csv
$ wc -l ncvhis_Statewide-20220723-070658.csv
33686293 ncvhis_Statewide-20220723-070658.csv
^^^^^^^^

Even if Calc doubled the number of jumbo rows to 33,554,432, I'd still leave
131,861 rows on the cutting-room floor... :)

While it would be great to load such sheets whole, we have to split them up.
I have one sheet covering 2012-2022 which we reduced to around 77M records...
but seriously, 16M rows is something we'd be happy with for a while. We have
lots of ways to slice and dice these large sheets, and 16M rows is a big
help. I'm using the daily builds almost exclusively, when they work.

-- 
You are receiving this mail because:
You are the assignee for the bug.
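For anyone trying to reproduce the cleanup step: here is a minimal sketch of
the tr command from the comment above, run against a tiny synthetic input
(the infile.txt/outfile.csv names and the fabricated input bytes are
placeholders, not the real NCSBE data):

```shell
# Synthetic stand-in for a 16-bit-encoded export: NUL bytes between
# characters and CRLF line endings (real files come from the NCSBE zips).
printf 'a\000,\0001\000\r\nb\000,\0002\000\r\n' > infile.txt

# Delete NULs, double quotes, CRs, the UTF-16 BOM bytes (0xFF 0xFE),
# and 0xBD, as in the cleanup command quoted in the comment.
tr -d '\000"\r\377\376\275' < infile.txt > outfile.csv
cat outfile.csv
```

Note that this also strips legitimate double quotes, so it is only safe on
files whose fields never need CSV quoting.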
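The "split them up" workflow mentioned above could look something like the
following sketch. The file names, the tiny synthetic input, and the chunk
size are stand-ins; for Calc's jumbo builds the real chunk size would be
16,777,215 data rows plus a repeated header line:

```shell
# Build a tiny stand-in for the big CSV (header + 5 data rows).
printf 'id,val\n1,a\n2,b\n3,c\n4,d\n5,e\n' > big.csv

# Keep the header aside, split the data rows into fixed-size chunks
# (real use: split -l 16777215), then prepend the header to each chunk.
head -n 1 big.csv > header.csv
tail -n +2 big.csv | split -l 2 - part_
for f in part_aa part_ab part_ac; do
  cat header.csv "$f" > "$f.csv"
  rm "$f"
done
wc -l part_*.csv
```

Each resulting part_*.csv then fits under the row limit and still carries
the column headers.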
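The 131,861 figure above is just the file's line count minus twice the
current jumbo limit, which shell arithmetic confirms:

```shell
# 33,686,293 lines in the file; 16,777,216 (2^24) is the jumbo row limit.
echo $(( 33686293 - 2 * 16777216 ))   # → 131861
```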
