https://issues.apache.org/bugzilla/show_bug.cgi?id=47405
Summary: RowRecordsAggregate.getStartRowNumberForBlock /
getEndRowNumberForBlock not performing with high row
count
Product: POI
Version: 3.2-FINAL
Platform: PC
OS/Version: Windows XP
Status: NEW
Keywords: PatchAvailable
Severity: major
Priority: P2
Component: HSSF
AssignedTo: [email protected]
ReportedBy: [email protected]
The methods RowRecordsAggregate.getStartRowNumberForBlock and
getEndRowNumberForBlock iterate over the rows, starting from the first row,
for every block in the sheet. As a result, serialization time grows sharply
as the row count increases.
The following code change reduces the serialization time for a workbook with
1 sheet, 25 columns and 30000 rows from 7.5 seconds to 1.6 seconds in my test
environment.
    // Cached, index-addressable copy of _rowRecords.values(), built on first use.
    // Note: nothing in this patch resets it, so it would need to be set back to
    // null whenever rows are added or removed.
    private List _rowRecordsList = null;

    /** Returns the physical row number of the first row in a block */
    private int getStartRowNumberForBlock(int block) {
        // Given that we basically iterate through the rows in order,
        // TODO - For a performance improvement, it would be better to return an
        // instance of an iterator and use that instance throughout, rather than
        // recreating one and having to move it to the right position.
        if (_rowRecordsList == null) {
            // build up block-based list of row records
            _rowRecordsList = new ArrayList(_rowRecords.values());
        }
        int startIndex = block * DBCellRecord.BLOCK_SIZE;
        RowRecord row = null;
        if (startIndex < _rowRecordsList.size()) {
            row = (RowRecord) _rowRecordsList.get(startIndex);
        }
        /*
        Iterator rowIter = _rowRecords.values().iterator();
        //Position the iterator at the start of the block
        for (int i=0; i<=startIndex;i++) {
            row = (RowRecord)rowIter.next();
        }
        */
        if (row == null) {
            throw new RuntimeException("Did not find start row for block " + block);
        }
        return row.getRowNumber();
    }

    /** Returns the physical row number of the end row in a block */
    private int getEndRowNumberForBlock(int block) {
        if (_rowRecordsList == null) {
            // build up block-based list of row records
            _rowRecordsList = new ArrayList(_rowRecords.values());
        }
        int endIndex = ((block + 1) * DBCellRecord.BLOCK_SIZE) - 1;
        // the last block may be only partially filled
        if (endIndex >= _rowRecordsList.size()) {
            endIndex = _rowRecordsList.size() - 1;
        }
        RowRecord row = null;
        // endIndex is -1 when there are no rows at all
        if (endIndex >= 0 && endIndex < _rowRecordsList.size()) {
            row = (RowRecord) _rowRecordsList.get(endIndex);
        }
        /*
        Iterator rowIter = _rowRecords.values().iterator();
        for (int i=0; i<=endIndex;i++) {
            row = (RowRecord)rowIter.next();
        }
        */
        if (row == null) {
            throw new RuntimeException("Did not find end row for block " + block);
        }
        return row.getRowNumber();
    }
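
The cached-list lookup used in the patch can be sketched in isolation. The
class below is a hypothetical stand-in and not POI code: BlockRowIndex,
insertRow and the TreeSet of row numbers are assumptions made for
illustration, and BLOCK_SIZE = 32 is assumed to match
DBCellRecord.BLOCK_SIZE. Unlike the patch, this sketch also discards the
cached list whenever a row is inserted, which is one possible way to keep it
from going stale.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical stand-in for RowRecordsAggregate; it stores row numbers only. */
final class BlockRowIndex {
    /** Assumed to match DBCellRecord.BLOCK_SIZE. */
    static final int BLOCK_SIZE = 32;

    /** Row numbers kept in ascending order, standing in for _rowRecords. */
    private final TreeSet<Integer> rows = new TreeSet<>();

    /** Lazily built, index-addressable snapshot of rows; null when stale. */
    private List<Integer> rowList;

    void insertRow(int rowNumber) {
        rows.add(rowNumber);
        rowList = null; // invalidate the snapshot so the next lookup rebuilds it
    }

    private List<Integer> rowList() {
        if (rowList == null) {
            rowList = new ArrayList<>(rows); // one O(n) copy, then O(1) lookups
        }
        return rowList;
    }

    /** Row number of the first row in the given block. */
    int getStartRowNumberForBlock(int block) {
        List<Integer> list = rowList();
        int startIndex = block * BLOCK_SIZE;
        if (startIndex >= list.size()) {
            throw new IllegalArgumentException("No start row for block " + block);
        }
        return list.get(startIndex);
    }

    /** Row number of the last row in the given block (the block may be partial). */
    int getEndRowNumberForBlock(int block) {
        List<Integer> list = rowList();
        int startIndex = block * BLOCK_SIZE;
        int endIndex = Math.min((block + 1) * BLOCK_SIZE, list.size()) - 1;
        if (endIndex < startIndex) {
            throw new IllegalArgumentException("No end row for block " + block);
        }
        return list.get(endIndex);
    }
}
```

With the even row numbers 0, 2, ..., 98 inserted (50 rows), block 0 covers row
numbers 0 to 62 and block 1 covers 64 to 98. Each lookup is a single indexed
read rather than a walk from the first row.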