Hello Javier,

You must be looking at the source prior to Xerces 2.7.0. As of 2.7.0 the 
character buffers are managed by a pool. You can control how large the 
internal buffers are by setting the 
http://apache.org/xml/properties/input-buffer-size [1] property.
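
For illustration, here is a minimal sketch of setting that property through JAXP/SAX. The class name SmallBufferParse and the helper newReader are hypothetical, and 256 is simply the size mentioned in the quoted message below. It assumes the XMLReader returned by JAXP is Xerces 2.7.0 or later (or a parser derived from it); other parsers may reject the property, so the sketch catches that case and falls back to the parser's default buffer size.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class SmallBufferParse {
    // Property identifier from the Xerces2-J properties page.
    static final String INPUT_BUFFER_SIZE =
        "http://apache.org/xml/properties/input-buffer-size";

    // Hypothetical helper: builds a reader and tries to shrink its input buffer.
    static XMLReader newReader(int bufferSize) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        try {
            // The property value is documented as a java.lang.Integer.
            reader.setProperty(INPUT_BUFFER_SIZE, Integer.valueOf(bufferSize));
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            // Assumption: a parser that rejects the property keeps its default size.
            System.err.println("input-buffer-size not applied: " + e.getMessage());
        }
        reader.setContentHandler(new DefaultHandler());
        return reader;
    }

    public static void main(String[] args) throws Exception {
        XMLReader reader = newReader(256);
        String message =
            "<body xmlns='http://www.w3.org/1999/xhtml'><p>hello</p></body>";
        reader.parse(new InputSource(new StringReader(message)));
    }
}
```

Whether a smaller buffer actually helps should be confirmed by profiling, since the right size depends on typical message length.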

Thanks.

[1] http://xml.apache.org/xerces2-j/properties.html#input-buffer-size

Javier Kohen <[EMAIL PROTECTED]> wrote on 08/12/2005 03:37:28 PM:

> Hello,
> 
> After profiling my application I found out that each time an XML file is
> parsed, approximately 4kB are allocated inside
> XMLEntityManager.ScannedEntity for an internal buffer. This application
> processes a very large amount of XHTML-IM formatted messages, which it
> has to parse in order to transform them to other formats. This results
> in hundreds of megabytes being allocated and released per minute.
> 
> As a workaround, I've reduced the internal buffer size to 256 bytes; the
> messages are usually small, so that shouldn't provoke a significant
> performance hit, if any at all (I'll profile that next anyway). However,
> I'll be happier with a more scalable solution, like a ScannedEntity
> pool.
> 
> I'm not familiar with Xerces' internals, but from looking at the call
> tree starting at ScannedEntity's constructor, it seems that it's
> impossible to avoid this code path; therefore I was thinking that
> ScannedEntity instances could be pooled to save the garbage collector some
> significant work.
> 
> I could implement the pooling, but I'd like to hear your opinion first.
> I'm definitely open to alternate solutions. Please honor the Reply-To
> header, as I'm not subscribed to this list.
> 
> Thanks,
> -- 
> Javier Kohen <[EMAIL PROTECTED]>
> ICQ: blashyrkh #2361802
> Jabber: [EMAIL PROTECTED]

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: [EMAIL PROTECTED]
