1000 per second isn't bad provided there are a few cells in there (which
you have). that 42000 cells per second, which
is pretty good actually considering. Thats 0.000023809524 seconds per
cell.... I'm happy with that :-)
The "stupid thing" that you're doing is that your pauses could be longer
in between. meaning if my heap is like 2gb and my sheet is only like
10k then it will be a
LOOOOONG time before it hits the 70% mark. The trouble is that as the
sheet gets bigger...the number of full collections become more
frequent...thus say instead of those ints you put a BIG FAT string that
took a lot more bytes each... You could kill perofrmance by keeping it
all in memory and hitting the 70% mark more frequently (default is 70 =
major or full collection depending)... So big heap == better
performance as counter-intuitive as that seems. Isn't that cool?
-Andy
Avik Sengupta wrote:
On a 1.8GHz linux (2.6) with jdk1.4.2, I can insert rows at approx 1000
per second. Except for brief pauses due to garbage collection, the rate
is resonably consistent. I've tried inserting 5000 rows with 42 cols
each.
Attached is a graph showing this performance, the behaviour should be
pretty obvious (and its quite what you would expect). The code follows,
tell me I didnt do anything stupid!
public void testPerf() {
HSSFWorkbook wb = new HSSFWorkbook();
HSSFSheet s = wb.createSheet();
HSSFRow r ;
HSSFCell c;
long t = System.currentTimeMillis();
long newT =0;
for (int i=0;i < 5000;i++) {
if (i % 100 == 0 && i>0) {
newT=System.currentTimeMillis();
//System.out.println("Rows " + (i-100) +" to "+i+" inserted in
" +(newT-t) + " ms - at " + (100000/(newT-t) + " rows per sec"));
System.out.println(100000/(newT-t));
t=newT;
}
r = s.createRow(i);
for (short j=0;j<42;j++) {
c = r.createCell(j);
c.setCellValue(i+j);
}
}
On Tue, 2005-06-14 at 07:44 -0400, [EMAIL PROTECTED] wrote:
Thats some odd time. I achieve much better on my box. First off set
your MINIMUM heap size so that it doesn't grow and shrink. Second off,
make sure the heap is at least 50-60% larger than the high water mark of
usage (due to generational garbage collection. Last off, use the 2.6
kernel. Lastly, I'd like to see that code, it may not be POI at all.
-Andy
Brett Knights wrote:
Hello,
I have recently tried to use POI to add a few thousand rows to a
spreadsheet.
It doesn't make much difference if I start with an almost blank
spreadsheet or one with dummy values in the all the cells that will be
populated on a run of known size.
I have 42 columns.
Operations move fairly quickly for the first 600 to 650 rows and then
slow down considerably.
e.g.
On a test run on a 1GHz Windows machine:
Time to update the first 600 rows takes about 8 seconds.
Time to update the following rows to 1850 takes about 4 minutes. At
around row 700 the code is updating around 13 rows a second.
By row 1800 it's down to 5 rows a second.
On a live run with better hardware than my test setup populating 6500
rows takes close to 25 minutes. This is on a 2GHz Debian machine.
On neither machine does memory use approach the max allocated.
jdk 1.4.2
poi-2.5.1-final-20040804
Any help, tips, pointers etc would be most appreciated. It would make my
life easier if I don't have to redo this as a csv.
TIA
Brett Knights
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
Mailing List: http://jakarta.apache.org/site/mail2.html#poi
The Apache Jakarta Poi Project: http://jakarta.apache.org/poi/
.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
Mailing List: http://jakarta.apache.org/site/mail2.html#poi
The Apache Jakarta Poi Project: http://jakarta.apache.org/poi/