1000 rows per second isn't bad, given that each row has a fair number of cells in it (which yours do). That's 42,000 cells per second, which is actually pretty good, considering. That's 0.000023809524 seconds per cell.... I'm happy with that :-)

The "stupid thing" that you're doing is that your pauses could be further apart. Meaning, if my heap is 2 GB and my sheet is only 10k, it will be a LOOOOONG time before the heap hits the 70% mark. The trouble is that as the sheet gets bigger, full collections become more frequent. So if, instead of those ints, you put in a BIG FAT string that takes a lot more bytes per cell, you could kill performance by keeping it all in memory and hitting the 70% mark more often (the default threshold is 70, which triggers a major or full collection, depending). So big heap == better performance, as counter-intuitive as that seems. Isn't that cool?
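To watch how close a run gets to that mark, something like the following could be called every few hundred rows. This is only a minimal sketch: the class and method names (HeapOccupancy, usedFraction, currentUsedFraction) are made up for illustration, and it only reports how full the heap is. The 70% figure is the one quoted above; when a collection actually kicks in depends on the JVM and the collector in use.

```java
// Hypothetical helper, not part of POI: reports heap occupancy as a fraction of the maximum heap.
class HeapOccupancy {

    /** Fraction of the maximum heap in use, given the three figures Runtime reports (in bytes). */
    static double usedFraction(long totalBytes, long freeBytes, long maxBytes) {
        long usedBytes = totalBytes - freeBytes; // allocated heap minus the free part of it
        return (double) usedBytes / (double) maxBytes;
    }

    /** Same calculation for the running JVM. */
    static double currentUsedFraction() {
        Runtime rt = Runtime.getRuntime();
        return usedFraction(rt.totalMemory(), rt.freeMemory(), rt.maxMemory());
    }

    public static void main(String[] args) {
        long percent = Math.round(currentUsedFraction() * 100);
        System.out.println("heap used: " + percent + "% of max");
    }
}
```

Printing this next to the rows-per-second figure should show whether the slow stretches line up with a nearly full heap.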

-Andy

Avik Sengupta wrote:

On a 1.8GHz linux (2.6) with jdk1.4.2, I can insert rows at approx 1000
per second. Except for brief pauses due to garbage collection, the rate
is reasonably consistent. I've tried inserting 5000 rows with 42 cols
each.
Attached is a graph showing this performance; the behaviour should be
pretty obvious (and it's quite what you would expect). The code follows,
tell me I didn't do anything stupid!

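// Uses HSSFWorkbook, HSSFSheet, HSSFRow and HSSFCell from
// org.apache.poi.hssf.usermodel.
public void testPerf() {
       HSSFWorkbook wb = new HSSFWorkbook();
       HSSFSheet s = wb.createSheet();
       HSSFRow r;
       HSSFCell c;
       long t = System.currentTimeMillis();
       long newT = 0;
       for (int i = 0; i < 5000; i++) {
               // Every 100 rows, print the insert rate for the last batch.
               if (i % 100 == 0 && i > 0) {
                       newT = System.currentTimeMillis();
                       // 100 rows in (newT - t) ms is 100000 / (newT - t) rows per sec;
                       // clamp the elapsed time to 1 ms to avoid dividing by zero.
                       long elapsed = Math.max(1, newT - t);
                       //System.out.println("Rows " + (i-100) + " to " + i + " inserted in " + elapsed + " ms - at " + (100000/elapsed) + " rows per sec");
                       System.out.println(100000 / elapsed);
                       t = newT;
               }
               r = s.createRow(i);
               for (short j = 0; j < 42; j++) {
                       c = r.createCell(j);
                       c.setCellValue(i + j);
               }
       }
}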



On Tue, 2005-06-14 at 07:44 -0400, [EMAIL PROTECTED] wrote:
That's some odd timing. I achieve much better on my box. First off, set your MINIMUM heap size so that the heap doesn't grow and shrink. Second, make sure the heap is at least 50-60% larger than the high water mark of usage (due to generational garbage collection). Third, use the 2.6 kernel. Lastly, I'd like to see that code; it may not be POI at all.
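As a rough illustration of the sizing advice, here is a small sketch that turns a measured high water mark into heap flags. The class and method names (HeapSizing, suggestedHeapMb, heapFlags) and the 200 MB figure are hypothetical. -Xms and -Xmx are the standard JVM options for initial and maximum heap size; setting them to the same value is one way to keep the heap from growing and shrinking, and is an assumption here rather than something spelled out above.

```java
// Hypothetical helper: sizes the heap from an observed high water mark plus headroom.
class HeapSizing {

    /** High water mark plus headroomPercent of it (rounded up), in megabytes. */
    static long suggestedHeapMb(long highWaterMarkMb, int headroomPercent) {
        long headroomMb = (highWaterMarkMb * headroomPercent + 99) / 100; // integer ceiling
        return highWaterMarkMb + headroomMb;
    }

    /** JVM flags that pin the initial and maximum heap to the same size. */
    static String heapFlags(long heapMb) {
        return "-Xms" + heapMb + "m -Xmx" + heapMb + "m";
    }

    public static void main(String[] args) {
        // Assumed example: a run that peaks at 200 MB, with 60% headroom.
        System.out.println(heapFlags(suggestedHeapMb(200, 60)));
    }
}
```

The printed flags would go on the java command line that launches the import job.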

-Andy

Brett Knights wrote:
Hello,

I have recently tried to use POI to add a few thousand rows to a spreadsheet. It doesn't make much difference if I start with an almost blank spreadsheet or one with dummy values in all the cells that will be populated on a run of known size.

I have 42 columns.
Operations move fairly quickly for the first 600 to 650 rows and then slow down considerably.

e.g.
On a test run on a 1GHz Windows machine:
Updating the first 600 rows takes about 8 seconds.
Updating the following rows, up to row 1850, takes about 4 minutes. At around row 700 the code is updating around 13 rows a second.
By row 1800 it's down to 5 rows a second.

On a live run with better hardware than my test setup populating 6500 rows takes close to 25 minutes. This is on a 2GHz Debian machine.

On neither machine does memory use approach the max allocated.

jdk 1.4.2
poi-2.5.1-final-20040804

Any help, tips, pointers etc would be most appreciated. It would make my life easier if I don't have to redo this as a csv.

TIA

Brett Knights


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
Mailing List:     http://jakarta.apache.org/site/mail2.html#poi
The Apache Jakarta Poi Project:  http://jakarta.apache.org/poi/