Hello,

After profiling my application, I found that each time an XML file is
parsed, approximately 4 kB is allocated inside
XMLEntityManager.ScannedEntity for an internal buffer. The application
processes a very large number of XHTML-IM formatted messages, which it
has to parse in order to transform them into other formats. This results
in hundreds of megabytes being allocated and released per minute.

As a workaround, I've reduced the internal buffer size to 256 bytes. The
messages are usually small, so that shouldn't cause a significant
performance hit, if any at all (I'll profile that next anyway). However,
I'd be happier with a more scalable solution, such as a ScannedEntity
pool.

I'm not familiar with Xerces' internals, but from looking at the call
tree starting at ScannedEntity's constructor, it seems impossible to
avoid this code path. I was therefore thinking that ScannedEntity
instances could be pooled to save the garbage collector a significant
amount of work.
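
To make the idea concrete, here is a minimal sketch of the kind of pool
I have in mind. CharBufferPool and its methods are hypothetical names,
not Xerces API, and I'm assuming it is the 4 kB buffer, rather than the
whole ScannedEntity, that is worth reusing.

```java
import java.util.ArrayDeque;

/** Hypothetical bounded pool of fixed-size char buffers; not part of Xerces. */
final class CharBufferPool {
    private final ArrayDeque<char[]> free = new ArrayDeque<>();
    private final int bufferSize;
    private final int maxPooled;

    CharBufferPool(int bufferSize, int maxPooled) {
        this.bufferSize = bufferSize;
        this.maxPooled = maxPooled;
    }

    /** Returns a pooled buffer if one is available, otherwise allocates a new one. */
    synchronized char[] acquire() {
        char[] buf = free.pollFirst();
        return (buf != null) ? buf : new char[bufferSize];
    }

    /** Hands a buffer back; wrong-sized or surplus buffers are left to the GC. */
    synchronized void release(char[] buf) {
        if (buf != null && buf.length == bufferSize && free.size() < maxPooled) {
            free.addFirst(buf);
        }
    }

    /** Number of buffers currently held by the pool. */
    synchronized int pooledCount() {
        return free.size();
    }
}
```

In the parser, acquire() would stand in for wherever ScannedEntity
currently allocates its buffer, and release() would be called once the
entity is closed. I haven't checked where that hook would best live.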

I could implement the pooling, but I'd like to hear your opinion first.
I'm definitely open to alternative solutions. Please honor the Reply-To
header, as I'm not subscribed to this list.

Thanks,
-- 
Javier Kohen <[EMAIL PROTECTED]>
ICQ: blashyrkh #2361802
Jabber: [EMAIL PROTECTED]
