Hi Carlo,

"De Rossi Heim Carlo (KIRF 741)" <carlo.dero...@credit-suisse.com> wrote on 04/22/2010 03:59:58 AM:

> Dear Michael Glavassevich,
>
> thank you for your answer.
> Indeed we also have the impression that too many objects in general
> and too many parsers are created: for example we expected a pooling
> to be implemented in the DocumentBuilderImpl: if you look at the
> stack trace in my first email everything starts there. But this
> would be another discussion.

DocumentBuilderImpl is the parser. Everything "starts" in some stack frames you haven't shown; the frames beneath "org.apache.xerces.jaxp.DocumentBuilderFactoryImpl.newDocumentBuilder()". Something in your application is calling newDocumentBuilder(), presumably quite frequently.

> A synchronization is not a performance killer, as long as what is
> done in the synchronized method does not take too long: in this case
> it is the operation that can take as long as possible on a complex
> environment, i.e. Class.forName(...). Unfortunately removing the
> synchronization simply moves the blocking point forward, i.e. on the
> findParserClass of the ObjectFactory.

I understand the potential costs of querying ClassLoaders. Xerces-J 2.x was designed to be highly configurable / flexible and ObjectFactories are part of that framework. Frequently creating and burning parser instances is expensive. It's not an area that we spend time optimizing. It's just not the sweet spot for where applications should be operating. I have never seen ObjectFactory show up in a profile of an application which is effectively reusing / pooling parser instances.

> We are caching our parsers indeed, but like every cache it goes (in
> production) shrinking and growing according to the different load
> the system is facing, as expected within a (non overdimensioned)
> 24x7 system. Tests done with the Xerces 2.9.1 library show some
> improvement, but still the code seems to have a high "dynamic": we
> put the system under stress with a UseCase where the same kind of
> message with minimal variations was sent back and forth.
> The loading of objects and classes is huge, considering that the
> message was always the same. We understand that the code needs also
> to support very dynamic changes in configuration and plugging at
> runtime of different implementations, but we asked if the huge amount
> of objects generated cannot be somehow reduced by setting more
> restrictive parameters. To give you a rough idea: a 20kb message can
> generate up to more than 5 MB of temporary objects, over and over
> again: the same message.
>
> Any suggestion would be appreciated.

If for some reason you can't change your application's pattern of creating parsers perhaps you could change the ClassLoader to one which does a better job of caching the classes it's found / loaded.

> Best regards
> Carlo de Rossi
>
>
> From: Michael Glavassevich [mailto:mrgla...@ca.ibm.com]
> Sent: Wednesday, April 21, 2010 12:49 PM
> To: De Rossi Heim Carlo (KIRF 741)
> Subject: Re: FW: DTDDVFactory
>
> Hi,
>
> The thread contention issue with DTDDVFactory was addressed [1] in 2007.
> The fix should be available in Xerces-J 2.9.1.
>
> Note that the fact that you hit this issue is likely a sign that your
> application is excessively creating new parser instances. They are meant to
> be reused and you will get much better performance if you pool [2] them.
>
> Thanks.
>
> [1] http://svn.apache.org/viewvc?view=revision&revision=558581
> [2] http://www.ibm.com/developerworks/library/x-perfap2.html/#reuse
>
> Michael Glavassevich
> XML Parser Development
> IBM Toronto Lab
> E-mail: mrgla...@ca.ibm.com
> E-mail: mrgla...@apache.org
>
> "De Rossi Heim Carlo (KIRF 741)" <carlo.dero...@credit-suisse.com>
> wrote on 04/21/2010 05:31:57 AM:
>
> > Dear Mr. Glavassevich
> > I've seen from the Apache mail thread that you are reported as the
> > author of this class: I therefore allowed myself to forward our
> > concern directly.
> > In case there should be an interest in enhancing the functionality
> > of the below mentioned class, we are at your disposal, as we have
> > already given some thought to a possible implementation.
> > Gladly awaiting your feedback.
> > Best regards
> > Carlo de Rossi
> > Carlo de Rossi
> > CREDIT SUISSE AG
> > Technology Infrastructure Services
> > Java Infrastructure Engineering - KIRF 741
> > Uetlibergstrasse 231
> > P.O. Box 600
> > 8070 Zurich
> > Switzerland
> > Phone +41 44 334 03 62
> > mailto:carlo.dero...@credit-suisse.com
> > www.credit-suisse.com
> >
> > This message may contain confidential, proprietary or legally
> > privileged information and is intended only for the use of the
> > addressee named above. No confidentiality or privilege is waived or
> > lost by any mistransmission. If you are not the intended recipient
> > of this message you are hereby notified that you must not use,
> > disseminate, copy it in any form or take any action in reliance on
> > it. If you have received this message in error please delete it and
> > any copies of it and notify the sender immediately. Credit Suisse
> > Group AG and its subsidiaries reserve the right to intercept and
> > monitor any e-mail communication through its networks if legally allowed.
> >
> > ______________________________________________
> > From: De Rossi Heim Carlo (KIRF 741)
> > Sent: Wednesday, April 21, 2010 11:23 AM
> > To: 'j-...@xerces.apache.org'
> > Subject: DTDDVFactory
> > Dear Madame or Sir
> > we are facing some serious performance issues, caused by the
> > implementation of the "org.apache.xerces.impl.dv.DTDDVFactory" and
> > the "org.apache.xerces.impl.dv.ObjectFactory" classes.
> > Our environment is based on Oracle WebLogic server 10.3.2, Sun JDK
> > 160_16 and Xerces 2.8.1
> > Basically what we see is a high number of threads blocked by the
> > method getInstance of the "org.apache.xerces.impl.dv.DTDDVFactory" class.
> > An example is given here:
> > ---------------------------------- THREAD DUMP START ---------------------------------------------
> > "[ACTIVE] ExecuteThread: '99' for queue: 'weblogic.kernel.Default (self-tuning)'" daemon prio=6 tid=0x08597400 nid=0x1444 waiting for monitor entry [0x4143e000]
> >    java.lang.Thread.State: BLOCKED (on object monitor)
> >         at org.apache.xerces.impl.dv.DTDDVFactory.getInstance(Unknown Source)
> >         - waiting to lock <0x2a008228> (a java.lang.Class for org.apache.xerces.impl.dv.DTDDVFactory)
> >         at org.apache.xerces.parsers.XML11Configuration.<init>(Unknown Source)
> >         at org.apache.xerces.parsers.XIncludeAwareParserConfiguration.<init>(Unknown Source)
> >         at org.apache.xerces.parsers.XMLGrammarCachingConfiguration.<init>(Unknown Source)
> >         at org.apache.xerces.parsers.XMLGrammarCachingConfiguration.<init>(Unknown Source)
> >         at sun.reflect.GeneratedConstructorAccessor61.newInstance(Unknown Source)
> >         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> >         at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> >         at java.lang.Class.newInstance0(Class.java:355)
> >         at java.lang.Class.newInstance(Class.java:308)
> >         at org.apache.xerces.parsers.ObjectFactory.newInstance(Unknown Source)
> >         at org.apache.xerces.parsers.ObjectFactory.createObject(Unknown Source)
> >         at org.apache.xerces.parsers.ObjectFactory.createObject(Unknown Source)
> >         at org.apache.xerces.parsers.DOMParser.<init>(Unknown Source)
> >         at org.apache.xerces.parsers.DOMParser.<init>(Unknown Source)
> >         at org.apache.xerces.jaxp.DocumentBuilderImpl.<init>(Unknown Source)
> >         at org.apache.xerces.jaxp.DocumentBuilderFactoryImpl.newDocumentBuilder(Unknown Source)
> > ---------------------------------- THREAD DUMP END ---------------------------------------------
> > Out of 100 threads available 72 are blocked there when
> > the server is under stress.
> > To understand the issue, it is useful to remember that when a
> > Class.forName is done (like the ObjectFactory class invoked from the
> > DTDDVFactory does) it can take (depending on the size of the
> > classpath) up to 2-4 seconds which is a huge time if this is invoked
> > continuously.
> > Our suggestion would be to cache the class objects per ClassLoader
> > in the DTDDVFactory, at least for the default implementation
> > ("org.apache.xerces.impl.dv.dtd.DTDDVFactoryImpl") so that the search
> > of this class object in the context of a ClassLoader is done only
> > once. If there is any concern about backward compatibility this
> > could be done only if a specific System Property is set, which would
> > enable the feature and/or configure the size of this (LRU) cache.
> > A similar issue (but less heavy in performance) can be found in the
> > TransformerFactory.
> >
> > Would such a change be possible?
> > Gladly awaiting your answer and feedback.
> > Kind regards
> > Carlo de Rossi

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrgla...@ca.ibm.com
E-mail: mrgla...@apache.org
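
To illustrate the reuse / pooling advice given in this thread, here is a minimal sketch in Java. It is not code from Xerces or from the application discussed above: the class name PooledBuilders and its methods are hypothetical, it assumes Java 8 or later, and it assumes that one parser per worker thread is an acceptable pooling policy. Only standard JAXP calls are used (DocumentBuilderFactory.newDocumentBuilder(), DocumentBuilder.parse(), DocumentBuilder.reset()).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical helper: keeps one DocumentBuilder per thread so that
// newDocumentBuilder() is called once per thread, not once per message.
final class PooledBuilders {

    // The factory is created once; creating it also involves class lookups.
    private static final DocumentBuilderFactory FACTORY = createFactory();

    // One lazily created parser per thread.
    private static final ThreadLocal<DocumentBuilder> BUILDER =
            ThreadLocal.withInitial(PooledBuilders::newBuilder);

    private PooledBuilders() {
    }

    private static DocumentBuilderFactory createFactory() {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // assumption: namespace-aware parsing is wanted
        return factory;
    }

    private static DocumentBuilder newBuilder() {
        try {
            // The factory is not guaranteed to be thread-safe, so guard it.
            synchronized (FACTORY) {
                return FACTORY.newDocumentBuilder();
            }
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("Cannot create DocumentBuilder", e);
        }
    }

    // Returns the parser owned by the calling thread.
    static DocumentBuilder builderForCurrentThread() {
        return BUILDER.get();
    }

    // Parses a message with the calling thread's parser and resets it afterwards.
    static Document parse(byte[] xml) throws IOException, SAXException {
        DocumentBuilder builder = BUILDER.get();
        try {
            return builder.parse(new ByteArrayInputStream(xml));
        } finally {
            builder.reset(); // restore the configuration the factory gave it
        }
    }
}
```

With a fixed set of execute threads, each thread would create its parser on first use and keep it, so the number of parsers would no longer follow the load. In an application server a ThreadLocal can keep the application's ClassLoader reachable after undeployment, so a bounded pool with explicit borrow / return may be the better fit there.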