https://issues.apache.org/bugzilla/show_bug.cgi?id=54283
Bug ID: 54283
Summary: Very slow opening XSSF spreadsheets
Product: POI
Version: 3.9-dev
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P2
Component: XSSF
Assignee: [email protected]
Reporter: [email protected]
Classification: Unclassified
Created attachment 29746
--> https://issues.apache.org/bugzilla/attachment.cgi?id=29746&action=edit
Screenshot from profiler, showing large number of calls when opening
spreadsheet
I'm seeing very slow load times for XSSF spreadsheets. One example spreadsheet
takes several minutes to open, whereas the same spreadsheet saved as xls/HSSF
opens in a fraction of a second.
This is about the same spreadsheets mentioned in bug 54282, but this one looks
harder to fix.
Attached is a screenshot from a profiler session that highlights the problem.
Basically, when opening a single spreadsheet that contains two sheets, there is
behaviour that looks like it's O(N^3) with respect to the number of columns in
the spreadsheet. The sequence is roughly as follows:
- 1 call to XSSFWorkbook.onDocumentRead()
- 2 calls to XSSFSheet.onDocumentRead()
<...>
- 2 calls to ColumnHelper.cleanColumns()
- 5,317 calls to ColumnHelper.addCleanColIntoCols()
- 5,317 calls to ColumnHelper.sortColumns()
- 7,812,463 calls to Xobj.find_element_user()
- 8,243,994,339 calls to Xobj.isElem() and QName.equals().
There's a similar bottleneck that goes like this:
- 1 call to XSSFWorkbook.onDocumentRead()
- 2 calls to XSSFSheet.onDocumentRead()
- 2 calls to ColumnHelper.cleanColumns()
- 5,317 calls to ColumnHelper.addCleanColIntoCols()
- 7,807,146 calls to ColumnHelper.getColArray()
- 8,243,994,339 calls to Xobj.isElem() and QName.equals().
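To illustrate why call counts can grow like this, here is a minimal sketch, not POI or XMLBeans code. It assumes a store whose indexed lookup walks sibling elements from the start, which is what the Xobj.find_element_user() counts above suggest. The LinkedStore class and its step counter are hypothetical stand-ins introduced only for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class IndexedScanCost {
    /** Hypothetical stand-in for an XML store with linear-time indexed lookup. */
    static final class LinkedStore {
        private final List<Integer> items = new ArrayList<>();
        long steps = 0; // number of sibling nodes visited so far

        void add(int v) { items.add(v); }
        int size() { return items.size(); }

        // Simulates walking siblings from the start: reaching index i visits i + 1 nodes.
        int get(int i) {
            steps += i + 1;
            return items.get(i);
        }

        // Simulates a bulk read: one pass over all nodes.
        int[] toArray() {
            steps += items.size();
            int[] out = new int[items.size()];
            for (int i = 0; i < out.length; i++) {
                out[i] = items.get(i);
            }
            return out;
        }
    }

    // Visits every element through the indexed accessor: n(n+1)/2 node visits.
    static long stepsWithIndexedAccess(int n) {
        LinkedStore s = new LinkedStore();
        for (int i = 0; i < n; i++) s.add(i);
        for (int i = 0; i < s.size(); i++) s.get(i);
        return s.steps;
    }

    // Copies the elements out once, then iterates the copy: n node visits.
    static long stepsWithSnapshot(int n) {
        LinkedStore s = new LinkedStore();
        for (int i = 0; i < n; i++) s.add(i);
        int[] snapshot = s.toArray();
        long checksum = 0;
        for (int v : snapshot) checksum += v; // touches only the local copy
        return s.steps;
    }

    public static void main(String[] args) {
        int n = 1000; // illustrative size, not taken from the profiler run
        System.out.println("indexed access: " + stepsWithIndexedAccess(n));
        System.out.println("snapshot:       " + stepsWithSnapshot(n));
    }
}
```

If a loop like the first one runs once per column added, the total becomes cubic in the number of columns, which would be consistent with the counts in the profiler trace.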
I realise that this code bottoms out in XMLBeans so maybe this is partly an
issue there. I did find this bug report on XMLBeans which sounds relevant, but
it's been open for a couple of years:
https://issues.apache.org/jira/browse/XMLBEANS-438
Still, I wonder if there's something that could be done in the POI code to
avoid hitting the XMLBeans data structures so hard. As it is, it unfortunately
renders the API unusable for certain spreadsheets.
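As a rough idea of the kind of change I have in mind, here is a sketch in plain Java. It does not use the real POI classes. It assumes the column definitions could be collected into an ordinary list, sorted once, and only then written back to the XMLBeans structure. Whether cleanColumns() can actually be restructured like this is an assumption on my part, and the Col class below is a hypothetical stand-in, not CTCol.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SortOnceSketch {
    /** Hypothetical column range (first and last column index); not POI's CTCol. */
    static final class Col {
        final long min;
        final long max;
        Col(long min, long max) {
            this.min = min;
            this.max = max;
        }
    }

    static final Comparator<Col> BY_MIN = Comparator.comparingLong((Col c) -> c.min);

    // Pattern resembling the profiler trace: re-sort after every single insertion.
    static List<Col> sortAfterEachAdd(List<Col> input) {
        List<Col> out = new ArrayList<>();
        for (Col c : input) {
            out.add(c);
            out.sort(BY_MIN); // one full sort per added column
        }
        return out;
    }

    // Alternative: work on a plain in-memory copy and sort it exactly once.
    static List<Col> sortOnce(List<Col> input) {
        List<Col> out = new ArrayList<>(input);
        out.sort(BY_MIN);
        return out;
    }

    public static void main(String[] args) {
        List<Col> cols = new ArrayList<>();
        cols.add(new Col(5, 5));
        cols.add(new Col(1, 2));
        cols.add(new Col(3, 4));
        for (Col c : sortOnce(cols)) {
            System.out.println(c.min + "-" + c.max);
        }
    }
}
```

Both methods return the columns in the same order. The difference is that the second performs one sort on a local list instead of one sort per column on the underlying XML store.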