https://bz.apache.org/ooo/show_bug.cgi?id=128552
Issue ID: 128552
Issue Type: DEFECT
Summary: oox::xls::SheetDataBuffer::setCellFormat() takes over
3 hours to load XLSX spreadsheet
Product: Calc
Version: 4.2.0-dev
Hardware: All
OS: All
Status: CONFIRMED
Severity: Normal
Priority: P5 (lowest)
Component: open-import
Assignee: [email protected]
Reporter: [email protected]
Target Milestone: ---
A large XLSX spreadsheet takes over 3 hours to load. The time appears to be
spent in the method oox::xls::SheetDataBuffer::setCellFormat() of
main/oox/source/xls/sheetdatabuffer.cxx, which does linear searches over the
std::map maXfIdRanges.
Since there are 100000+ rows of cells, and the linear search over the ranges is
done once per cell, we have O(n*m) algorithmic complexity for n cells and m
ranges, resulting in terrible performance.
We really need to improve the data structure and algorithm. Perhaps the ranges
should be stored in some kind of spatial index, like an R-tree, so that a cell
can rapidly find the range it fits in without having to traverse them all.
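As a rough illustration of the kind of lookup meant here, the sketch below
uses a simpler structure than an R-tree: a one-dimensional ordered index of
non-overlapping row ranges for a single column, keyed by first row. It is not
taken from sheetdatabuffer.cxx; the names RowRange and ColumnRangeIndex are
hypothetical, the non-overlap guarantee is an assumption, and it needs C++17
for std::optional. A lookup costs O(log m) instead of O(m).

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>

// Hypothetical illustration type; not part of the OpenOffice code base.
struct RowRange {
    std::int32_t lastRow;  // inclusive last row of the range
    std::int32_t xfId;     // format id shared by every row in the range
};

// Non-overlapping row ranges of one column, keyed by their first row.
class ColumnRangeIndex {
public:
    // Store the range [firstRow, lastRow]. Assumption: the caller
    // guarantees that ranges never overlap.
    void insert(std::int32_t firstRow, std::int32_t lastRow, std::int32_t xfId) {
        assert(firstRow <= lastRow);
        ranges_[firstRow] = RowRange{lastRow, xfId};
    }

    // Return the xfId of the range containing `row`, or nullopt if no
    // range contains it. Runs in O(log m) for m stored ranges.
    std::optional<std::int32_t> find(std::int32_t row) const {
        auto it = ranges_.upper_bound(row);  // first range starting after row
        if (it == ranges_.begin())
            return std::nullopt;             // every range starts after row
        --it;                                // last range starting at or before row
        if (row <= it->second.lastRow)
            return it->second.xfId;
        return std::nullopt;                 // row falls in a gap between ranges
    }

private:
    std::map<std::int32_t, RowRange> ranges_;
};
```

With one such index per column, the per-cell search would be logarithmic in
the number of ranges in that column. Whether this fits the way maXfIdRanges is
actually populated and merged would need to be checked against the real code.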