On 2/6/12 3:09 PM, "W.P. McNeill" <[email protected]> wrote:
> I'm debugging out of memory errors in my application. I suspect that some of
> my Avro objects are really big. Is there a way to tell how many bytes a given
> Avro object occupies in memory? My current solution is to count the number of
> characters in its stringification, but this is a bit of a hack.

What Avro language implementation?

For Java, there is nothing built in to do this. The answer will differ
depending on the object format in use (Specific or Generic?) and on whether
the JVM is running in 32-bit, 64-bit, or 64-bit with compressed oops.

For quick debugging, a 'jmap -histo:live <pid>' print-out may help you
identify what is taking up the memory.

To understand the size of a record, here are a few points:

Primitive values are boxed in most cases, so the size per
boolean/int/float/long/double is going to be 16 to 24 bytes, depending on the
size of the type and the JVM configuration.

Generic Records are Object[] underneath the covers. This adds
16 + (# of fields * 4) bytes per record, except with 64-bit non-compressed
pointers, where each field costs 8 bytes instead of 4.

SpecificRecords are a little smaller, especially for primitive fields.

In general, I would expect all language implementations to use RAM for a
record that is somewhere between 2x and 16x the size of its serialized binary
form. There are corner cases that will match the serialized size more closely
(a large byte array) or differ from it more (maps and arrays of records).
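
As a rough illustration of the arithmetic above, here is a minimal sketch in
plain Java. It does not use the Avro API; the class name RecordSizeEstimate and
the method estimateBytes are hypothetical, and the constants are just the
figures quoted in this message. It ignores the record object's own header,
the schema reference, alignment padding, and the contents of strings, byte
arrays, and nested records, so treat the result as a ballpark lower bound
rather than a measurement.

```java
// Back-of-the-envelope estimate for a Generic record backed by an Object[].
// All names here are hypothetical; the constants come from the figures above.
class RecordSizeEstimate {

    // Approximate fixed overhead of the backing Object[] in bytes.
    static final long ARRAY_OVERHEAD_BYTES = 16;

    /**
     * @param fieldCount       number of fields in the record
     * @param refBytes         bytes per array slot: 4 for 32-bit or compressed
     *                         oops, 8 for 64-bit non-compressed pointers
     * @param boxedFieldCount  how many fields hold boxed primitive values
     * @param boxedBytes       assumed bytes per boxed value (16 to 24)
     */
    static long estimateBytes(int fieldCount, int refBytes,
                              int boxedFieldCount, int boxedBytes) {
        long arrayBytes = ARRAY_OVERHEAD_BYTES + (long) fieldCount * refBytes;
        long boxedTotal = (long) boxedFieldCount * boxedBytes;
        return arrayBytes + boxedTotal;
    }

    public static void main(String[] args) {
        // 10 boxed fields, compressed oops, 16 bytes per boxed value.
        System.out.println(estimateBytes(10, 4, 10, 16));
        // Same record on a 64-bit JVM without compressed oops, 24 bytes each.
        System.out.println(estimateBytes(10, 8, 10, 24));
    }
}
```

For a real answer, compare an estimate like this against the per-class totals
in the jmap histogram, since the JVM's actual layout can differ.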
