https://issues.apache.org/bugzilla/show_bug.cgi?id=47405

           Summary: RowRecordsAggregate.getStartRowNumberForBlock /
                    getEndRowNumberForBlock not performing with high row
                    count
           Product: POI
           Version: 3.2-FINAL
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Keywords: PatchAvailable
          Severity: major
          Priority: P2
         Component: HSSF
        AssignedTo: [email protected]
        ReportedBy: [email protected]


The methods RowRecordsAggregate.getStartRowNumberForBlock and
getEndRowNumberForBlock create a new iterator and walk the rows from the
first row for every block in the sheet. Serialization time therefore grows
roughly quadratically with the row count.
The following change caches the row records in a list and looks them up by
index. It reduces the serialization time of a workbook with 1 sheet, 25
columns and 30000 rows from 7.5 seconds to 1.6 seconds in my test
environment.

    private List _rowRecordsList = null;

    /** Returns the physical row number of the first row in a block */
    private int getStartRowNumberForBlock(int block) {
      // Given that we basically iterate through the rows in order,
      // TODO - For a performance improvement, it would be better to return an
      // instance of an iterator and use that instance throughout, rather than
      // recreating one and having to move it to the right position.

      if (_rowRecordsList == null) {
          // build up block-based list of row records
          _rowRecordsList = new ArrayList(_rowRecords.values());
      }
      int startIndex = block * DBCellRecord.BLOCK_SIZE;
      RowRecord row = null;
      if (startIndex < _rowRecordsList.size()) {
          row = (RowRecord) _rowRecordsList.get(startIndex);
      }
      /*
      Iterator rowIter = _rowRecords.values().iterator();
      //Position the iterator at the start of the block
      for (int i=0; i<=startIndex;i++) {
        row = (RowRecord)rowIter.next();
      }
      */
      if (row == null) {
          throw new RuntimeException("Did not find start row for block " + block);
      }

      return row.getRowNumber();
    }

    /** Returns the physical row number of the end row in a block */
    private int getEndRowNumberForBlock(int block) {
      if (_rowRecordsList == null) {
          // build up block-based list of row records
          _rowRecordsList = new ArrayList(_rowRecords.values());
      }

      int endIndex = ((block + 1) * DBCellRecord.BLOCK_SIZE) - 1;
      if (endIndex >= _rowRecordsList.size())
        endIndex = _rowRecordsList.size() - 1;

      RowRecord row = null;
      // endIndex is -1 when there are no rows at all
      if (endIndex >= 0 && endIndex < _rowRecordsList.size()) {
          row = (RowRecord) _rowRecordsList.get(endIndex);
      }
      /*
      Iterator rowIter = _rowRecords.values().iterator();
      for (int i=0; i<=endIndex;i++) {
        row = (RowRecord)rowIter.next();
      }
      */
      if (row == null) {
          throw new RuntimeException("Did not find end row for block " + block);
      }
      return row.getRowNumber();
    }
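
One caveat with a cached list like _rowRecordsList is that it probably has to
be discarded whenever a row is added or removed, otherwise later calls could
read a stale snapshot. Below is a minimal, self-contained sketch of that idea,
not POI code: BlockRowIndex, insertRow, startRowOfBlock and endRowOfBlock are
hypothetical names, plain row numbers stand in for RowRecord, and
BLOCK_SIZE = 32 is only assumed to mirror DBCellRecord.BLOCK_SIZE.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical stand-in for RowRecordsAggregate; not POI code. */
class BlockRowIndex {
    /** Assumed to mirror DBCellRecord.BLOCK_SIZE. */
    static final int BLOCK_SIZE = 32;

    /** Row numbers kept in sorted order, standing in for _rowRecords. */
    private final TreeMap<Integer, Integer> rows = new TreeMap<>();

    /** Lazily built copy of rows.values(); null means "rebuild on next use". */
    private List<Integer> rowList = null;

    void insertRow(int rowNumber) {
        rows.put(rowNumber, rowNumber);
        rowList = null; // the snapshot is stale as soon as the map changes
    }

    /** Returns the cached list, rebuilding it once after any modification. */
    private List<Integer> snapshot() {
        if (rowList == null) {
            rowList = new ArrayList<>(rows.values());
        }
        return rowList;
    }

    /** Row number of the first row in the block, found by index lookup. */
    int startRowOfBlock(int block) {
        List<Integer> list = snapshot();
        int startIndex = block * BLOCK_SIZE;
        if (block < 0 || startIndex >= list.size()) {
            throw new IllegalArgumentException("Did not find start row for block " + block);
        }
        return list.get(startIndex);
    }

    /** Row number of the last row in the block; the last block may be short. */
    int endRowOfBlock(int block) {
        List<Integer> list = snapshot();
        int startIndex = block * BLOCK_SIZE;
        if (block < 0 || startIndex >= list.size()) {
            throw new IllegalArgumentException("Did not find end row for block " + block);
        }
        int endIndex = Math.min(startIndex + BLOCK_SIZE, list.size()) - 1;
        return list.get(endIndex);
    }
}
```

Each lookup is a constant-time list access once the snapshot exists, and the
snapshot is rebuilt at most once per modification. Whether RowRecordsAggregate
can be modified between two serializations is something the patch above does
not address, so this invalidation step is a suggestion rather than a confirmed
requirement.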

