Hello Javier,

You must be looking at the source prior to Xerces 2.7.0. As of 2.7.0 the character buffers are managed by a pool. You can control how large the internal buffers are by setting the http://apache.org/xml/properties/input-buffer-size [1] property.
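
For illustration, setting this property through JAXP might look roughly like the sketch below. The class and method names (SmallBufferParser, newReader, countElements) are hypothetical, the value 256 is simply the size mentioned in the original message, and the sketch assumes the JAXP implementation in use is Xerces or derived from it; other parsers may reject the property, which the code tolerates by falling back to the parser's default.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class for illustration; not part of Xerces.
class SmallBufferParser {
    // Property identifier taken from the Xerces2-J properties page [1].
    static final String INPUT_BUFFER_SIZE =
        "http://apache.org/xml/properties/input-buffer-size";

    // Creates a SAX reader and asks it to use the given input buffer size.
    static XMLReader newReader(int bufferSize) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        try {
            // The property value is a java.lang.Integer.
            reader.setProperty(INPUT_BUFFER_SIZE, Integer.valueOf(bufferSize));
        } catch (SAXException e) {
            // Assumption: a parser that does not recognize or support the
            // property throws here; keep its default buffer size.
        }
        return reader;
    }

    // Parses a small document with the configured reader and counts its elements.
    static int countElements(String xml, int bufferSize) throws Exception {
        final int[] count = {0};
        XMLReader reader = newReader(bufferSize);
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                count[0]++;
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return count[0];
    }
}
```

A call such as SmallBufferParser.countElements("<body><p>hi</p></body>", 256) would parse one short message with the smaller buffer. Whether a small buffer actually helps or hurts throughput for your messages is something only profiling can tell.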

Thanks.

[1] http://xml.apache.org/xerces2-j/properties.html#input-buffer-size

Javier Kohen <[EMAIL PROTECTED]> wrote on 08/12/2005 03:37:28 PM:

> Hello,
>
> After profiling my application I found out that each time an XML file is
> parsed, approximately 4kB are allocated inside
> XMLEntityManager.ScannedEntity for an internal buffer. This application
> processes a very large amount of XHTML-IM formatted messages, which it
> has to parse in order to transform them to other formats. This results
> in hundreds of megabytes being allocated and released per minute.
>
> As a workaround, I've reduced the internal buffer size to 256 bytes; the
> messages are usually small, so that shouldn't provoke a significant
> performance hit, if any at all (I'll profile that next anyway). However,
> I'll be happier with a more scalable solution, like a ScannedEntity
> pool.
>
> I'm not familiar with Xerces' internals, but from looking at the call
> tree starting at ScannedEntity's constructor, it seems that it's
> impossible to avoid this code path; therefore I was thinking that
> ScannedEntityS could be pooled to save the garbage collector some
> significant work.
>
> I could implement the pooling, but I'd like to hear your opinion first.
> I'm definitely open to alternate solutions. Please honor the Reply-To
> header, as I'm not subscribed to this list.
>
> Thanks,
> --
> Javier Kohen <[EMAIL PROTECTED]>
> ICQ: blashyrkh #2361802
> Jabber: [EMAIL PROTECTED]

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: [EMAIL PROTECTED]
E-mail: [EMAIL PROTECTED]
