Hi,

  I have an app that reads and parses a legacy data-export file into
an in-process data structure.  The original file is around 30 MB, but
my Java process ends up peaking around 400 MB, 300 MB of which is the
data structure!  An old parser, written in Perl, consumed around 80 MB
after parsing the data into a similar in-process structure...

class Record {
        Vector fields;
        // methods to get/add/replace fields by name
}

class Field extends TreeMap {
        String name;
        // methods to get/set name, get/set name/value pairs of TreeMap
}

The input file contains lines like

  @FIELDNAME: |a a field value named "a" |b another field value named "b"

which are read into a StringBuffer and eventually "parsed" with a regular
expression (jakarta-regexp) to populate a Field...
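To make the format concrete, here is a minimal sketch of that kind of
parse using java.util.regex instead of jakarta-regexp (the exact tag
syntax, the class name ParseSketch, and the use of a raw TreeMap for the
pairs are my assumptions, not the real RecordReader code):

```java
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ParseSketch {
    // Matches "|<tag> <value>" pairs, e.g. |a a field value named "a"
    private static final Pattern PAIR = Pattern.compile("\\|(\\S+)\\s+([^|]*)");

    // Parse one "@FIELDNAME: |a ... |b ..." line into tag/value pairs.
    static TreeMap parseLine(String line) {
        TreeMap pairs = new TreeMap();
        int colon = line.indexOf(':');
        Matcher m = PAIR.matcher(line.substring(colon + 1));
        while (m.find()) {
            pairs.put(m.group(1), m.group(2).trim());
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println(parseLine(
            "@FIELDNAME: |a a field value named \"a\" |b another field value named \"b\""));
    }
}
```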

I've run hprof on the program and see the following, which I don't
quite understand.  Where is this 207 MB of char arrays ([C), and why
can't it be GC'd?  I've called sb.setLength(0) as well as
sb.delete(0, sb.length()) in hopes of freeing the memory in that
StringBuffer, but I still get this massive memory usage.
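In case it helps to see the shape of the loop: this is roughly what the
reading/clearing looks like, boiled down (a simplified sketch, not the
real RecordReader; the "@" record delimiter and the class/method names
here are assumptions):

```java
public class ClearSketch {
    // One reused StringBuffer per record, cleared between records in
    // the hope that the old contents become collectable.
    static int countRecords(String data) {
        StringBuffer sb = new StringBuffer();
        int records = 0;
        String[] lines = data.split("\n");
        for (int i = 0; i < lines.length; i++) {
            if (lines[i].startsWith("@") && sb.length() > 0) {
                records++;           // hand sb's contents to the parser here
                sb.setLength(0);     // also tried sb.delete(0, sb.length())
            }
            sb.append(lines[i]).append('\n');
        }
        if (sb.length() > 0) records++;
        return records;
    }

    public static void main(String[] args) {
        System.out.println(countRecords("@A: |a one\n@B: |b two\n"));
    }
}
```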

  [...]

TRACE 878:
  java.lang.StringBuffer.expandCapacity(StringBuffer.java:202)
  java.lang.StringBuffer.append(StringBuffer.java:401)
  xxx.xxx.xxxxxxx.importer.tagged.RecordReader.read(RecordReader.java:324)
  xxx.xxx.xxxxxxx.importer.tagged.RecordReader.next(RecordReader.java:388) 

  [...]

SITES BEGIN (ordered by live bytes) Sun Jul 17 13:37:47 2005
          percent         live       alloc'ed  stack class
 rank   self  accum    bytes objs   bytes objs trace name
    1 36.80% 36.80% 207366792 222011 207366792 222011   878 [C

  [...]


  Any ideas?  FWIW, the JVM is Sun 1.4.2_08.

Thanks.
        Brent


_______________________________________________
Juglist mailing list
[email protected]
http://trijug.org/mailman/listinfo/juglist_trijug.org
