Caolan McNamara wrote:
I have a large csv file
(http://people.redhat.com/caolanm/speed/rh198788.csv.bz2) with 39,000
rows and 36 columns. It imports reasonably well, but on save to .ods it
takes an astounding amount of time before the export progress bar
appears. Some minimal poking shows that the time is eaten up by this
loop in ScUniqueCellFormatsObj::GetObjects_Impl:

ScUniqueFormatsHashMap aHashMap;
while (aIter.GetNext( nCol1, nCol2, nRow1, nRow2 ) )
{
    ScRange aRange( nCol1, nRow1, nTab, nCol2, nRow2, nTab );
    const ScPatternAttr* pPattern = pDoc->GetPattern(nCol1, nRow1,
        nTab);
    aHashMap[pPattern].Join( aRange );
}

where the Join calls take up all the time. I see from the cvs log that
this area has been optimized before, so I assume that there's little
that can be done to improve this at this exact point. But I wonder if
it would be possible at .csv import time, where we have control over
how we import the csv, to do something which would reduce the final
export time?
Is this totally unavoidable and directly linked to the number of cells,
or (*hand waving*) linked to hard coding of cell attributes, which could
be avoided by generating styles at import time and applying those
instead?

The earlier optimization was the use of the hash map to find the right range list, which becomes important when there are many different cell formats. The Join calls were there both before and after that change.

Importing your csv file results in around 67000 different cell ranges sharing the same format (scientific number format), which slows down ScRangeList::Join a lot. Changing the csv import to use styles wouldn't help without further changes, because they would still be collected in the same loop. To improve this, it would probably be best to use something different from the ScRangeList to collect the ranges in that loop. Does anyone want to try this? The file is unusual but reasonable, so something should be done about it.
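
To make the suggestion concrete, here is a rough, standalone sketch of one possible replacement for the range list in that loop. It is not Calc code: SimpleRange and RangeCollector are hypothetical names, and the sketch assumes that it is enough to merge a new range into an already collected one when it lies directly below it with identical columns, and that column numbers fit into 16 bits. Each add is then a single hash lookup instead of a scan over all ranges collected so far.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for ScRange on a single sheet (not the real class).
struct SimpleRange {
    int col1, row1, col2, row2;
};

// Hypothetical collector: appends ranges, and merges a new range into an
// existing one only if it sits directly below it with the same columns.
class RangeCollector {
public:
    void add(const SimpleRange& r) {
        assert(r.col1 >= 0 && r.col1 <= r.col2 && r.col2 < 65536);
        assert(r.row1 >= 0 && r.row1 <= r.row2);

        // Look for a collected range with the same columns ending at row1 - 1.
        auto it = m_open.find(key(r.col1, r.col2, r.row1));
        if (it != m_open.end()) {
            const std::size_t idx = it->second;
            m_open.erase(it);
            m_ranges[idx].row2 = r.row2;  // extend the existing range downwards
            m_open[key(r.col1, r.col2, r.row2 + 1)] = idx;
        } else {
            m_ranges.push_back(r);
            m_open[key(r.col1, r.col2, r.row2 + 1)] = m_ranges.size() - 1;
        }
    }

    const std::vector<SimpleRange>& ranges() const { return m_ranges; }

private:
    // Pack (col1, col2, next free row) into one key; assumes 16-bit columns.
    static std::uint64_t key(int col1, int col2, int nextRow) {
        return (static_cast<std::uint64_t>(col1) << 48)
             | (static_cast<std::uint64_t>(col2) << 32)
             | static_cast<std::uint32_t>(nextRow);
    }

    std::vector<SimpleRange> m_ranges;
    // Maps "columns + row directly below a range" to that range's index.
    std::unordered_map<std::uint64_t, std::size_t> m_open;
};
```

This does not merge horizontally adjacent ranges, so the result can contain more ranges than ScRangeList::Join would produce; whether that is acceptable for the export would have to be checked.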

Niklas
