https://issues.apache.org/bugzilla/show_bug.cgi?id=52834

             Bug #: 52834
           Summary: ColumnHelper.getColumnWidth does not scale to large
                    numbers of rows, with large numbers of merged regions
           Product: POI
           Version: 3.7
          Platform: PC
            Status: NEW
          Severity: normal
          Priority: P2
         Component: XSSF
        AssignedTo: [email protected]
        ReportedBy: [email protected]
    Classification: Unclassified


Created attachment 28423
  --> https://issues.apache.org/bugzilla/attachment.cgi?id=28423
Patch to resolve issue against http://svn.apache.org/repos/asf/poi/tags/REL_3_7

Running ColumnHelper.getColumnWidth on a large worksheet (>110,000 rows) with a
large number of merged regions (>40,000) takes a very long time: in my case it
ran for an entire weekend before it was noticed and aborted.

This method contains a loop across merged regions nested within a loop across
rows, and several substantial methods are called within the inner loop. In my
example that is potentially several billion calls (110,000 rows x 40,000
regions is about 4.4 billion inner iterations).

However, many of these time-consuming calls return data that does not change
while this method runs, so the code can be refactored so that they are run only
once (or once per row) and the results cached. The main problem calls are:

getNumMergedRegions()
getMergedRegion(i)
getRowNum()

With the code refactored this way, the method completed within a few minutes.
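
The idea can be sketched in isolation as follows. This is only an illustration
of hoisting loop-invariant calls out of a nested loop, not the attached patch
and not POI code: HoistingSketch, FakeSheet, Region, and both count methods are
hypothetical stand-ins, and the call counter exists only to make the difference
visible. A per-row value such as the row number would be read once at the top
of the outer loop in the same way.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; these are NOT the POI classes.
class HoistingSketch {

    /** A rectangular merged area, inclusive on all four sides. */
    static final class Region {
        final int firstRow, lastRow, firstCol, lastCol;
        Region(int firstRow, int lastRow, int firstCol, int lastCol) {
            this.firstRow = firstRow; this.lastRow = lastRow;
            this.firstCol = firstCol; this.lastCol = lastCol;
        }
        boolean contains(int row, int col) {
            return firstRow <= row && row <= lastRow
                && firstCol <= col && col <= lastCol;
        }
    }

    /** Minimal sheet whose accessors count how often they are invoked. */
    static final class FakeSheet {
        final int rowCount;
        final List<Region> regions = new ArrayList<>();
        long accessorCalls = 0;
        FakeSheet(int rowCount) { this.rowCount = rowCount; }
        int getNumMergedRegions() { accessorCalls++; return regions.size(); }
        Region getMergedRegion(int i) { accessorCalls++; return regions.get(i); }
    }

    /** Nested-loop shape: accessors are re-invoked on every inner iteration. */
    static int countMergedRowsNaive(FakeSheet sheet, int column) {
        int merged = 0;
        for (int r = 0; r < sheet.rowCount; r++) {
            for (int i = 0; i < sheet.getNumMergedRegions(); i++) {
                if (sheet.getMergedRegion(i).contains(r, column)) {
                    merged++;
                    break;
                }
            }
        }
        return merged;
    }

    /** Refactored shape: invariant lookups are done once, before the loops. */
    static int countMergedRowsCached(FakeSheet sheet, int column) {
        int n = sheet.getNumMergedRegions();
        Region[] cached = new Region[n];
        for (int i = 0; i < n; i++) {
            cached[i] = sheet.getMergedRegion(i);
        }
        int merged = 0;
        for (int r = 0; r < sheet.rowCount; r++) {
            for (Region region : cached) {
                if (region.contains(r, column)) {
                    merged++;
                    break;
                }
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        FakeSheet sheet = new FakeSheet(1000);
        for (int i = 0; i < 100; i++) {
            sheet.regions.add(new Region(10 * i, 10 * i + 1, 0, 1));
        }
        int naive = countMergedRowsNaive(sheet, 0);
        long naiveCalls = sheet.accessorCalls;
        sheet.accessorCalls = 0;
        int cached = countMergedRowsCached(sheet, 0);
        System.out.println("naive:  " + naive + " rows, " + naiveCalls + " accessor calls");
        System.out.println("cached: " + cached + " rows, " + sheet.accessorCalls + " accessor calls");
    }
}
```

Both methods return the same count; the cached variant calls the sheet
accessors once per region plus once for the region count, regardless of how
many rows the sheet has. The inner comparison loop is still rows x regions, so
this removes the per-iteration call overhead rather than the quadratic shape.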

A patch is attached with these suggested changes. The patch is based on the
REL_3_7 tag.
