Hi
I used the Apache POI to read an excel file. The file has 16000 rows and 50
columns.
Each cell is read as a string and they are put into a 2D array
(e.g. a row is a string array with size 50 and 16000 rows are put in another
array).
the code looks like
try (FileInputStream fIP = new FileInputStream(selectedFile)) {
XSSFWorkbook wb = new XSSFWorkbook(fIP);
XSSFSheet ws = wb.getSheetAt(0);
maxRows = ws.getLastRowNum() + 1;
maxColumns = ws.getRow(0).getLastCellNum();
theRows = new OneRow[maxRows];
Iterator< Row > rowIt = ws.iterator();
XSSFRow row;
int i = 0;
long chs = 0;
while ( rowIt.hasNext() ) {
row = (XSSFRow) rowIt.next();
Iterator< Cell > cellIt = row.cellIterator();
String [] str = new String[maxColumns];
int j = 0;
long ch = 0;
while ( cellIt.hasNext() ) {
Cell cell = cellIt.next();
str[ j ] = cell.getStringCellValue();
ch += str[ j ].length(); // sum one row's length
System.out.print( str[ j ].length() + " " );
++j;
}
System.out.print( "\n ->" + ch + "\n" );
chs += ch; // sum total characters length
theRows[ i ] = new OneRow( maxColumns );
theRows[ i ].add( str );
++i;
}
System.out.print( "\n ==>" + chs + "\n" );
}
the program also counts the length of each cell, ch and the size of all
rows are updated with chs += ch. Then the total number of characters are chs.
The value is 3,230,798 (3M chars) but the used heap is about 1GB. That means a
character is consuming 330 bytes!! (more or less).
Using the profilers, I see that needs a lot of heap and I have to increase
the heap size to 2048. Also the cpu time is devoted to GC which is undesirable.
I have to say that XSSFSheet ws = wb.getSheetAt(0); is the most time and
memory consuming part. In order to investigate that, I manually count the data
structure sizes with ch and chs.
Please see the profiler at https://i.stack.imgur.com/ZfLHj.jpg
In the memory chart and from left, the first peak is where it finishes
loading the excel file.
Regards,
Mahmood
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]