I do not think the StringBuffer is a big part of your memory problem.
Rather, I guess that you are reading the entire 30 MB file into a single
data structure (perhaps a Collection that holds all your Records), which
then resides entirely in memory at once. Is that right?
And is every Field in that data structure its own object? If so, you
are creating as many Field objects as there are fields in the 30 MB
file. How many is that? Something like 220,000? Even though a given
field value may be a short String of only a few characters, the overhead
of the String object that holds it is probably many times larger.
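To put a rough number on that, here is a small sketch that divides the
live bytes by the live object count from the SITES line of your hprof
output. The class and method names are just mine for illustration, and I
am assuming I have read the columns correctly.

```java
public class HprofAverage {
    // Figures copied from the "[C" row of the SITES section in the hprof output.
    static final long LIVE_BYTES = 207366792L;
    static final long LIVE_OBJECTS = 222011L;

    /** Average size in bytes of one live char[] (integer division). */
    static long averageBytesPerArray() {
        return LIVE_BYTES / LIVE_OBJECTS;
    }

    public static void main(String[] args) {
        long bytes = averageBytesPerArray();
        // A Java char is two bytes, so halve the byte count to estimate chars.
        System.out.println("average bytes per char[]: " + bytes);
        System.out.println("approximate chars per char[]: " + (bytes / 2));
    }
}
```

That works out to about 934 bytes, or roughly 467 chars, per live char[],
which is far more than a field value of a few characters would need.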
Why does your Field extend TreeMap? Do you need the functionality of
TreeMap for what appears to be a simple field? I would expect a TreeMap
to be a fairly large object, and you are making possibly 220,000 of them.
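If a Field only ever holds a handful of name/value pairs, something
flatter might do. The following is only a sketch under that assumption.
CompactField is a made-up name, not a class from your code, and a linear
search like this is only sensible when each field has few subfields.

```java
/** Hypothetical replacement for a Field that extends TreeMap. */
public class CompactField {
    private final String name;
    // Parallel arrays instead of one tree node object per entry.
    private String[] keys = new String[2];
    private String[] values = new String[2];
    private int size = 0;

    public CompactField(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }

    /** Adds the pair, or replaces the value if the key is already present. */
    public void put(String key, String value) {
        for (int i = 0; i < size; i++) {
            if (keys[i].equals(key)) {
                values[i] = value;
                return;
            }
        }
        if (size == keys.length) {
            // Grow both arrays together when they are full.
            String[] newKeys = new String[size * 2];
            String[] newValues = new String[size * 2];
            System.arraycopy(keys, 0, newKeys, 0, size);
            System.arraycopy(values, 0, newValues, 0, size);
            keys = newKeys;
            values = newValues;
        }
        keys[size] = key;
        values[size] = value;
        size++;
    }

    /** Returns the value stored under key, or null if there is none. */
    public String get(String key) {
        for (int i = 0; i < size; i++) {
            if (keys[i].equals(key)) {
                return values[i];
            }
        }
        return null;
    }

    public int size() {
        return size;
    }
}
```

Unlike a TreeMap, this does not keep the pairs sorted, so it is not a
drop-in replacement if you rely on ordering.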
Objects do not become eligible for garbage collection until they are
unreachable. So if you are keeping a reference to all your Records
somewhere, then all of their Fields remain reachable too, and none of
them can be collected.
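If your application can work on one record at a time, one way around
this is to hand each record to a callback and then let go of it, rather
than collecting them all. This is a rough sketch only. StreamingImport
and LineHandler are hypothetical names, and I am assuming each field
line starts with "@" as in your sample input.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class StreamingImport {
    /** Hypothetical callback that consumes one field line at a time. */
    public interface LineHandler {
        void handle(String line);
    }

    /**
     * Passes each "@..." line to the handler and keeps no reference to it
     * afterwards, so the line becomes unreachable once the handler returns
     * (unless the handler itself stores it).
     */
    public static int importAll(Reader in, LineHandler handler) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        int count = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.startsWith("@")) {
                handler.handle(line);
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        String sample = "@FIELDNAME: |a first value |b second value\n";
        int n = importAll(new StringReader(sample), new LineHandler() {
            public void handle(String line) {
                System.out.println("got: " + line);
            }
        });
        System.out.println(n + " line(s) handled");
    }
}
```

Of course this only helps if you do not need every Record in memory at
the same moment.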
Thanks for mentioning hprof. This is the first I've heard of it. I
hope I've interpreted its output correctly.
Rich Hammer
Brent Verner wrote:
Hi,
I have an app that reads and parses a legacy data export file into
an in-process data structure. The original file is around 30MB, but
my java process ends up peaking around 400MB, 300M of which is the
data structure! An old parser, written in perl, consumed around 80M
after parsing the data into a similar in-process structure...
class Record {
    Vector fields;
    // methods to get/add/replace fields by name
}
class Field extends TreeMap {
    String name;
    // methods to get/set name, get/set name/value pairs of TreeMap
}
The input file contains lines like
@FIELDNAME: |a a field value named "a" |b another field value named "b"
which are read into a StringBuffer and eventually "parsed" with a regular
expression (jakarta-regexp) to populate a Field...
I've run hprof on the program, and see the following, which I
do not quite understand. Where is this 207MB being held, such that
it can't be GC'd? I've called sb.setLength(0) as well as
sb.delete(0,sb.length()) in hopes of freeing the memory in that
StringBuffer, but still I get this massive memory usage.
[...]
TRACE 878:
    java.lang.StringBuffer.expandCapacity(StringBuffer.java:202)
    java.lang.StringBuffer.append(StringBuffer.java:401)
    xxx.xxx.xxxxxxx.importer.tagged.RecordReader.read(RecordReader.java:324)
    xxx.xxx.xxxxxxx.importer.tagged.RecordReader.next(RecordReader.java:388)
[...]
SITES BEGIN (ordered by live bytes) Sun Jul 17 13:37:47 2005
         percent          live             alloc'ed     stack class
 rank   self  accum     bytes   objs     bytes   objs trace name
    1 36.80% 36.80% 207366792 222011 207366792 222011   878 [C
[...]
Any ideas? FWIW, the JVM is Sun 1.4.2_08.