https://bz.apache.org/ooo/show_bug.cgi?id=128552

          Issue ID: 128552
        Issue Type: DEFECT
           Summary: oox::xls::SheetDataBuffer::setCellFormat() takes over
                    3 hours to load XLSX spreadsheet
           Product: Calc
           Version: 4.2.0-dev
          Hardware: All
                OS: All
            Status: CONFIRMED
          Severity: Normal
          Priority: P5 (lowest)
         Component: open-import
          Assignee: [email protected]
          Reporter: [email protected]
  Target Milestone: ---

A large XLSX spreadsheet takes over 3 hours to load. The time appears to be
spent in oox::xls::SheetDataBuffer::setCellFormat() in
main/oox/source/xls/sheetdatabuffer.cxx, which does linear searches on the
std::map maXfIdRanges.

The sheet has more than 100,000 rows of cells, and the linear search over the
ranges is done once per cell. With n cells and m ranges, that is O(n*m) work
overall, which explains the terrible performance.

We really need to improve the data structure and algorithm. Perhaps the ranges
should be stored in some kind of spatial index, like an R-tree, so that each
cell can quickly find the range it fits in without traversing them all.
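
As a rough illustration of the idea, here is a minimal sketch of a logarithmic
lookup. It is not an R-tree and not the existing oox code: it assumes,
hypothetically, that ranges can be kept per column in a std::map keyed by
their first row, so that upper_bound() finds the only candidate range in
O(log m). ColumnXfRanges, RowRange, findXfId() and rangeCount() are made-up
names, and the sketch needs C++17 for std::optional.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <map>
#include <optional>

// Hypothetical stand-in for one column's format ranges; this is NOT the
// real layout of maXfIdRanges in oox::xls::SheetDataBuffer.
struct RowRange {
    std::int32_t lastRow;  // last row of the range, inclusive
    std::int32_t xfId;     // cell format (XF) identifier shared by the range
};

class ColumnXfRanges {
public:
    // Records that `row` uses `xfId`. Extends the preceding range when the
    // row is directly below it and has the same format; otherwise starts a
    // new range. Assumes each row is set at most once: a row that is
    // already covered keeps its existing format.
    void setCellFormat(std::int32_t row, std::int32_t xfId) {
        assert(row >= 0);
        auto next = ranges_.upper_bound(row);  // first range starting after row
        if (next != ranges_.begin()) {
            auto prev = std::prev(next);       // last range starting at or before row
            if (row <= prev->second.lastRow)
                return;                        // already covered
            if (prev->second.lastRow + 1 == row && prev->second.xfId == xfId) {
                prev->second.lastRow = row;    // grow the existing range
                return;
            }
        }
        ranges_.emplace_hint(next, row, RowRange{row, xfId});
    }

    // Returns the XF id covering `row`, or std::nullopt if no range does.
    // One ordered lookup instead of a scan over every range.
    std::optional<std::int32_t> findXfId(std::int32_t row) const {
        auto next = ranges_.upper_bound(row);
        if (next == ranges_.begin())
            return std::nullopt;
        auto prev = std::prev(next);
        if (row <= prev->second.lastRow)
            return prev->second.xfId;
        return std::nullopt;
    }

    std::size_t rangeCount() const { return ranges_.size(); }

private:
    std::map<std::int32_t, RowRange> ranges_;  // key: first row of the range
};
```

With one such object per column, each cell costs O(log m) rather than O(m),
so the whole load would be roughly O(n log m). A true two-dimensional index
such as an R-tree would only be needed if ranges have to be looked up across
columns as well.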
