As I said, we had a bunch of very large string arrays allocated and collected... These huge arrays contain the following list of strings over and over again (as shown in the linked picture):
    atom
    entry
    http://www.w3.org/2005/Atom
    cmisra
    content
    http://docs.oasis-open.org/ns/cmis/restatom/200908/
    chemistry
    filename
    http://chemistry.apache.org/

Since Alfresco Workdesk 4.1.1.1 uses the OpenCMIS 0.8.0 AtomPub client, the chemistry:filename element is included in the cmisra:content element. The parser in OpenCMIS 0.7.0 tries to skip this element (because it doesn't know about the chemistry:filename extension). Notice that we aren't seeing cmisra:mediatype or cmisra:base64 here. For some reason this introduces a parsing loop where we re-parse the atom:entry over and over again. I guess it's possible the client is producing this repetition too...
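To make the suspected failure mode concrete, here is a minimal, self-contained sketch of the depth-counted skip pattern (the class and method names are mine and simplified for illustration; this is not the exact OpenCMIS 0.7.0 source). If the underlying stream reader ever stops making forward progress, or keeps handing back the same events, a loop like this spins indefinitely while allocating strings, which would match the frozen threads and the giant String[]s we observed:

    import java.io.StringReader;

    import javax.xml.stream.XMLInputFactory;
    import javax.xml.stream.XMLStreamConstants;
    import javax.xml.stream.XMLStreamException;
    import javax.xml.stream.XMLStreamReader;

    // Simplified sketch of skipping an unknown extension element;
    // NOT the actual OpenCMIS AtomEntryParser source.
    public class SkipSketch {

        // Hypothetical payload: a chemistry:filename extension inside
        // cmisra:content, which an 0.7.0-era parser doesn't recognize
        // and therefore tries to skip.
        private static final String XML =
              "<atom:entry xmlns:atom=\"http://www.w3.org/2005/Atom\""
            + " xmlns:cmisra=\"http://docs.oasis-open.org/ns/cmis/restatom/200908/\""
            + " xmlns:chemistry=\"http://chemistry.apache.org/\">"
            + "<cmisra:content>"
            + "<chemistry:filename>test.txt</chemistry:filename>"
            + "</cmisra:content>"
            + "</atom:entry>";

        public static void main(String[] args) throws XMLStreamException {
            XMLStreamReader parser = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(XML));
            while (parser.hasNext()) {
                if (parser.next() == XMLStreamConstants.START_ELEMENT
                        && "filename".equals(parser.getLocalName())) {
                    skip(parser); // unknown extension -> skip its subtree
                }
            }
        }

        // Depth-counted skip: consume events until the matching END_ELEMENT.
        // This terminates only if every call to next() makes progress; if the
        // reader stalls or replays events, it spins forever and keeps
        // allocating element-name strings.
        private static void skip(XMLStreamReader parser) throws XMLStreamException {
            int level = 1;
            while (parser.hasNext()) {
                int event = parser.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    level++;
                } else if (event == XMLStreamConstants.END_ELEMENT && --level == 0) {
                    return;
                }
            }
        }
    }

Judging by the stack trace further down, the stall would presumably be under the Woodstox next() call (StreamScanner.getNext / BasicStreamReader.skipToken) rather than in the skip logic itself, but the allocation pattern would look the same either way.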
On Wed, Mar 19, 2014 at 3:18 PM, Bindu Wavell <[email protected]> wrote:

> I believe they are using 3.2.4, assuming this is the right
> jar: wstx-asl-3.2.4.jar
>
>
> On Wed, Mar 19, 2014 at 2:11 PM, Florian Müller <[email protected]> wrote:
>
>> Hi Bindu,
>>
>> I've never seen something like this. Do you know which Woodstox version
>> is used?
>>
>> The CappedInputStream is something different. It prevents malicious
>> clients from sending endless XML and consuming server resources.
>>
>>
>> - Florian
>>
>>
>>
>> > Our server is Alfresco Enterprise 4.1.3.0, which is in turn using
>> > OpenCMIS 0.7.0. Our client is Alfresco Workdesk 4.1.1.1, which is in
>> > turn using OpenCMIS 0.8.0.
>> >
>> > For the last few weeks we have been running into a situation where the
>> > Alfresco server becomes unresponsive. Specifically, it doesn't crash; it
>> > simply becomes so overloaded that we can't log in via Workdesk/CMIS.
>> > Using JConsole and subsequently YourKit, we can see that the server runs
>> > fine and uses between 3-6GB of RAM with nice slow growth and cleanup.
>> > Then at some point we see a huge amount of memory allocation and
>> > deallocation between 3GB and 15GB every couple of seconds, and this
>> > continues until we restart the server. The parallel GC cleaning up this
>> > memory consumes so much CPU that folks can't actually use the system.
>> >
>> > To begin with, we took a stackshot every 10 seconds for a minute. We saw
>> > that there were 5 OpenCMIS threads "frozen" while parsing an AtomPub
>> > document. They were not deadlocked. Subsequently we found the same thing
>> > using YourKit, which calls these frozen threads. Following is an example
>> > stack trace from YourKit:
>> >
>> > http-127.0.0.1-8080-2 <--- Frozen for at least 47m 15 sec
>> > com.ctc.wstx.sr.StreamScanner.getNext()
>> > com.ctc.wstx.sr.BasicStreamReader.skipToken()
>> > com.ctc.wstx.sr.BasicStreamReader.nextFromTree()
>> > com.ctc.wstx.sr.BasicStreamReader.next()
>> > org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.next(XMLStreamReader)
>> > org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.skip(XMLStreamReader)
>> > org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.parseCmisContent(XMLStreamReader)
>> > org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.parseEntry(XMLStreamReader)
>> > org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.parse(InputStream)
>> > org.apache.chemistry.opencmis.server.impl.atompub.ObjectService.create(CallContext, CmisService, String, HttpServletRequest, HttpServletResponse)
>> > sun.reflect.NativeMethodAccessorImpl.invoke0(Method, Object, Object[])
>> > sun.reflect.NativeMethodAccessorImpl.invoke(Object, Object[])
>> > sun.reflect.DelegatingMethodAccessorImpl.invoke(Object, Object[])
>> > java.lang.reflect.Method.invoke(Object, Object[])
>> > org.apache.chemistry.opencmis.server.shared.Dispatcher.dispatch(String, String, CallContext, CmisService, String, HttpServletRequest, HttpServletResponse)
>> > org.apache.chemistry.opencmis.server.impl.atompub.CmisAtomPubServlet.dispatch(CallContext, HttpServletRequest, HttpServletResponse)
>> > org.apache.chemistry.opencmis.server.impl.atompub.CmisAtomPubServlet.service(HttpServletRequest, HttpServletResponse)
>> > javax.servlet.http.HttpServlet.service(ServletRequest, ServletResponse)
>> > org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ServletRequest, ServletResponse)
>> > org.apache.catalina.core.ApplicationFilterChain.doFilter(ServletRequest, ServletResponse)
>> > org.alfresco.web.app.servlet.GlobalLocalizationFilter.doFilter(ServletRequest, ServletResponse, FilterChain)
>> > org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ServletRequest, ServletResponse)
>> > org.apache.catalina.core.ApplicationFilterChain.doFilter(ServletRequest, ServletResponse)
>> > org.apache.catalina.core.StandardWrapperValve.invoke(Request, Response)
>> > org.apache.catalina.core.StandardContextValve.invoke(Request, Response)
>> > org.apache.catalina.authenticator.AuthenticatorBase.invoke(Request, Response)
>> > org.apache.catalina.core.StandardHostValve.invoke(Request, Response)
>> > org.apache.catalina.valves.ErrorReportValve.invoke(Request, Response)
>> > org.apache.catalina.core.StandardEngineValve.invoke(Request, Response)
>> > org.apache.catalina.connector.CoyoteAdapter.service(Request, Response)
>> > org.apache.coyote.http11.Http11Processor.process(Socket)
>> > org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Socket)
>> > org.apache.tomcat.util.net.JIoEndpoint$Worker.run()
>> > java.lang.Thread.run()
>> >
>> > By inspecting garbage-collected objects we can see that this is almost
>> > all String[] tied back to AtomEntryParser (and the Woodstox parser stuff
>> > under it). If we look at one of these arrays we see the same stuff over
>> > and over again... see a screenshot here of the first few elements in
>> > such an array: http://i.imgur.com/RzNQBYG.png .
>> > There are a bunch of these VERY large string arrays, and they all look
>> > like they are looping over the front of an AtomPub document.
>> >
>> > I noticed that in OpenCMIS 0.9.0 the implementation of AtomEntryParser
>> > adds a CappedInputStream while parsing the AtomPub stream. This appears
>> > to be limited to 10MB. I'm not sure if that is related to this issue,
>> > or if others have run into it. Any thoughts/guidance would be greatly
>> > appreciated.
>> >
>> > I'm bcc'ing Florian and Gab (the spy), but figured it was worth having
>> > this public, as I've been trying to track this issue down for a while
>> > and have not found an answer.
>> >
>> > Thanks all!
>> >
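A side note on the CappedInputStream idea, for anyone following along: I haven't studied the actual OpenCMIS implementation, but I'd expect the concept to look roughly like the sketch below -- a FilterInputStream that refuses to hand out more than a fixed byte budget. The class name and details here are my own guesses for illustration, not the real org.apache.chemistry.opencmis class.

    import java.io.FilterInputStream;
    import java.io.IOException;
    import java.io.InputStream;

    // Rough sketch of the capping idea: stop reading once a byte budget is
    // exhausted, so a malicious or endless stream can't be consumed forever.
    // Illustrative only; not the actual OpenCMIS CappedInputStream.
    public class ByteCappedInputStream extends FilterInputStream {

        private long remaining;

        public ByteCappedInputStream(InputStream in, long maxBytes) {
            super(in);
            this.remaining = maxBytes;
        }

        @Override
        public int read() throws IOException {
            if (remaining <= 0) {
                throw new IOException("Byte cap exceeded");
            }
            int b = super.read();
            if (b >= 0) {
                remaining--;
            }
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            if (remaining <= 0) {
                throw new IOException("Byte cap exceeded");
            }
            int n = super.read(buf, off, (int) Math.min(len, remaining));
            if (n > 0) {
                remaining -= n;
            }
            return n;
        }
    }

Worth noting: if our loop is replaying content the parser has already buffered rather than pulling fresh bytes off the wire, a byte cap like this presumably wouldn't break the loop, which might explain why we see frozen threads rather than an error.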
