Hi Bindu,

I've scanned the OpenCMIS and Woodstox code again, but I can't find the cause of this issue. It's difficult without being able to reproduce it. I can at least rule out that OpenCMIS parses the entry over and over again: the incoming stream is read and parsed directly, and there is (at least for OpenCMIS) no way to jump back and parse it again.

There might be some kind of I/O issue (see the exceptions). Which servlet container do you use? Is there a proxy or firewall between the client and the server? In one of your emails I found this: "Frozen for at least 47m 15 sec". Usually a connection times out when no bytes are transferred. Did you disable the timeouts on the client and the server? If not, the client must be sending more and more data, but I don't see how; the OpenCMIS client doesn't do that. Could you check whether there is steady incoming network traffic when the server is in that state?
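
In case it helps, this is roughly how explicit timeouts can be set on the OpenCMIS client through session parameters. The URL and credentials below are placeholders, and the exact constants should be checked against the OpenCMIS version Workdesk ships; treat it as a sketch, not a recommendation of specific values:

    import java.util.HashMap;
    import java.util.Map;

    import org.apache.chemistry.opencmis.commons.SessionParameter;
    import org.apache.chemistry.opencmis.commons.enums.BindingType;

    public class CmisTimeoutSettings {

        public static Map<String, String> sessionParameters() {
            Map<String, String> parameters = new HashMap<String, String>();
            parameters.put(SessionParameter.BINDING_TYPE, BindingType.ATOMPUB.value());
            parameters.put(SessionParameter.ATOMPUB_URL, "http://server:8080/alfresco/cmisatom");
            parameters.put(SessionParameter.USER, "admin");
            parameters.put(SessionParameter.PASSWORD, "secret");
            // Values are milliseconds. Leaving them unset (or setting 0) can let a
            // connection hang indefinitely, which is what you want to avoid here.
            parameters.put(SessionParameter.CONNECT_TIMEOUT, "10000");
            parameters.put(SessionParameter.READ_TIMEOUT, "120000");
            return parameters;
        }
    }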

- Florian



I should add that I'm seeing 3.6 million of the following three exceptions
(with the associated stacks leading to each exception):

com.ctc.wstx.exc.WstxParsingException
  BasicStreamReader.readPIPrimary()
    BasicStreamReader.nextFromTree()
      BasicStreamReader.next()
        AtomEntryParser.next()
          AtomEntryParser.skip()
            AtomEntryParser.parseCmisContent()
              AtomEntryParser.parseEntry()
                AtomEntryParser.parse()
                  ObjectService.create()

java.io.IOException
  ChunkedInputFilter.doRead()
    InternalInputBuffer.doRead()
      Request.doRead()
        InputBuffer.realReadBytes()
          ByteChunk.substract()
            InputBuffer.read()
              CoyoteInputStream.read()
                UTF8Reader.loadMore()
                  UTF8Reader.read()
                    ReaderSource.readInto()
                      BranchingReaderSource.readInto()
                        StreamScanner.loadMore()

com.ctc.wstx.exc.WstxIOException
  BasicStreamReader.next()
    AtomEntryParser.next()
      AtomEntryParser.skip()
        AtomEntryParser.parseCmisContent()
          AtomEntryParser.parseEntry()
            AtomEntryParser.parse()
              ObjectService.create()


Looking at the 3.2.4 version of com.ctc.wstx.sr.BasicStreamReader I see the
following in readPIPrimary()


    if (target.equalsIgnoreCase("xml")) {
        // 07-Oct-2005, TSa: Still legal in multi-doc mode...
        if (!mConfig.inputParsingModeDocuments()) {
            throwParseError(ErrorConsts.ERR_WF_PI_XML_TARGET, target);
        }


I also see next() catching IOException and throwing
com.ctc.wstx.exc.WstxIOException.

Does this mean that cmisra:content somehow contains another XML document, complete with its own XML declaration (an "xml" processing instruction)? Since I'm not sure how to capture the source AtomPub document that is triggering this, I feel like I'm really grasping at straws. I hope someone with more experience in the code base
can shed some light :)
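
If anyone wants to try, here is a rough sketch of one way to capture the source AtomPub document: a servlet filter (entirely hypothetical, not part of OpenCMIS or Alfresco) that tees the request body to a file while the CMIS servlet reads it. It assumes a Servlet 2.5/3.0 container, where ServletInputStream only requires read() to be implemented; the class name and capture directory are made up:

    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.OutputStream;

    import javax.servlet.Filter;
    import javax.servlet.FilterChain;
    import javax.servlet.FilterConfig;
    import javax.servlet.ServletException;
    import javax.servlet.ServletInputStream;
    import javax.servlet.ServletRequest;
    import javax.servlet.ServletResponse;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletRequestWrapper;

    public class RequestCaptureFilter implements Filter {

        public void init(FilterConfig config) { }
        public void destroy() { }

        public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
                throws IOException, ServletException {
            final HttpServletRequest httpReq = (HttpServletRequest) req;
            // One capture file per request; adjust the directory to taste.
            File captureFile = File.createTempFile("atompub-", ".xml", new File("/tmp"));
            final OutputStream capture = new FileOutputStream(captureFile);
            try {
                chain.doFilter(new HttpServletRequestWrapper(httpReq) {
                    @Override
                    public ServletInputStream getInputStream() throws IOException {
                        final ServletInputStream original = httpReq.getInputStream();
                        return new ServletInputStream() {
                            @Override
                            public int read() throws IOException {
                                int b = original.read();
                                if (b != -1) {
                                    capture.write(b); // tee each byte the parser actually consumes
                                }
                                return b;
                            }
                        };
                    }
                }, res);
            } finally {
                capture.close();
            }
        }
    }

The filter would have to be mapped in web.xml ahead of the CMIS servlet, and the capture directory fills up quickly, so this is only practical for a short diagnostic window.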

Thanks for reading through my ramblings...



On Wed, Mar 19, 2014 at 8:33 PM, Bindu Wavell <[email protected]> wrote:

As I said, we had a bunch of very large string arrays allocated and
collected... These huge arrays contain the following list of strings over and
over again (as shown in the linked picture):

atom
entry
http://www.w3.org/2005/Atom
cmisra
content
http://docs.oasis-open.org/ns/cmis/restatom/200908/
chemistry
filename
http://chemistry.apache.org/

Since Alfresco Workdesk 4.1.1.1 uses the openchemistry 0.8.0 AtomPub client, the chemistry:filename element is included in the cmisra:content element. The parser in openchemistry 0.7.0 tries to skip it (because it doesn't know about the chemistry:filename extension). Notice that we
aren't seeing cmisra:mediatype or cmisra:base64 here.

For some reason this is introducing a parsing loop where we re-parse the
atom:entry over and over again. I guess it's possible the client is
producing this repetition too...
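
For whatever it's worth, here is the kind of stand-alone check I have in mind: a hand-written entry whose cmisra:content carries the unknown chemistry:filename extension, parsed with Woodstox through the standard StAX API, and skipped with a naive depth-tracking loop. Both the entry XML and the skip loop are my own approximations, not the actual OpenCMIS AtomEntryParser code or a captured Workdesk request:

    import java.io.StringReader;

    import javax.xml.stream.XMLInputFactory;
    import javax.xml.stream.XMLStreamConstants;
    import javax.xml.stream.XMLStreamReader;

    public class SkipLoopRepro {

        public static void main(String[] args) throws Exception {
            // Hand-written approximation of the entry Workdesk might send; NOT a captured request.
            String entry =
                "<?xml version='1.0' encoding='UTF-8'?>"
                + "<atom:entry xmlns:atom='http://www.w3.org/2005/Atom'"
                + " xmlns:cmisra='http://docs.oasis-open.org/ns/cmis/restatom/200908/'"
                + " xmlns:chemistry='http://chemistry.apache.org/'>"
                + "<cmisra:content>"
                + "<chemistry:filename>test.txt</chemistry:filename>"
                + "</cmisra:content>"
                + "</atom:entry>";

            // Woodstox is picked up automatically when it is on the classpath.
            XMLStreamReader parser = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(entry));

            // Walk to chemistry:filename, then skip it the naive way: keep calling
            // next() until the matching END_ELEMENT closes the subtree.
            while (parser.hasNext()) {
                if (parser.next() == XMLStreamConstants.START_ELEMENT
                        && "filename".equals(parser.getLocalName())) {
                    int depth = 1;
                    while (depth > 0) {
                        int event = parser.next();
                        if (event == XMLStreamConstants.START_ELEMENT) { depth++; }
                        if (event == XMLStreamConstants.END_ELEMENT) { depth--; }
                    }
                    System.out.println("chemistry:filename skipped without looping");
                }
            }
            parser.close();
        }
    }

If this terminates normally, as I'd expect for a well-formed entry, the next step would be feeding it a captured request instead of this hand-written one.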



On Wed, Mar 19, 2014 at 3:18 PM, Bindu Wavell <[email protected]> wrote:

I believe they are using 3.2.4, assuming this is the right
jar: wstx-asl-3.2.4.jar


On Wed, Mar 19, 2014 at 2:11 PM, Florian Müller <[email protected]> wrote:

Hi Bindu,

I've never seen anything like this. Do you know which Woodstox version
is being used?

The CappedInputStream is something different: it prevents malicious clients from sending an endless XML document and consuming server resources.
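
Roughly, the idea is a stream wrapper that counts what has been read and fails once a limit is exceeded. A simplified sketch of the concept (not the actual OpenCMIS class, which handles the limit differently):

    import java.io.FilterInputStream;
    import java.io.IOException;
    import java.io.InputStream;

    public class SimpleCappedInputStream extends FilterInputStream {

        private final long max;
        private long count;

        public SimpleCappedInputStream(InputStream in, long max) {
            super(in);
            this.max = max;
        }

        @Override
        public int read() throws IOException {
            int b = super.read();
            if (b != -1 && ++count > max) {
                throw new IOException("Request body exceeds " + max + " bytes");
            }
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n > 0 && (count += n) > max) {
                throw new IOException("Request body exceeds " + max + " bytes");
            }
            return n;
        }
    }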


- Florian



> Our server is Alfresco Enterprise 4.1.3.0, which is in turn using
> openchemistry 0.7.0. Our client is Alfresco Workdesk 4.1.1.1, which is in
> turn using openchemistry 0.8.0.
>
> For the last few weeks we have run into a situation where the Alfresco
> server becomes unresponsive. Specifically, it doesn't crash; it simply
> becomes so overloaded that we can't log in via Workdesk/CMIS. Using
> JConsole and subsequently YourKit we can see that the server runs fine
> and uses between 3-6GB of RAM with nice slow growth and cleanup. Then at
> some point we see a huge amount of memory allocation and deallocation
> between 3GB and 15GB every couple of seconds, and this continues forever
> until we restart the server. The parallel GC cleaning up this memory
> consumes so much CPU that folks can't actually use the system.
>
> To begin with we took a stackshot every 10 seconds for a minute. We saw
> that there were 5 opencmis threads "frozen" while parsing an AtomPub
> document. They were not deadlocked. Subsequently we found the same thing
> using YourKit, which calls these "frozen" threads. Following is an example
> stack trace from YourKit:
>
> http-127.0.0.1-8080-2 <--- Frozen for at least 47m 15 sec
> com.ctc.wstx.sr.StreamScanner.getNext()
> com.ctc.wstx.sr.BasicStreamReader.skipToken()
> com.ctc.wstx.sr.BasicStreamReader.nextFromTree()
> com.ctc.wstx.sr.BasicStreamReader.next()
> org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.next(XMLStreamReader)
> org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.skip(XMLStreamReader)
> org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.parseCmisContent(XMLStreamReader)
> org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.parseEntry(XMLStreamReader)
> org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.parse(InputStream)
> org.apache.chemistry.opencmis.server.impl.atompub.ObjectService.create(CallContext, CmisService, String, HttpServletRequest, HttpServletResponse)
> sun.reflect.NativeMethodAccessorImpl.invoke0(Method, Object, Object[])
> sun.reflect.NativeMethodAccessorImpl.invoke(Object, Object[])
> sun.reflect.DelegatingMethodAccessorImpl.invoke(Object, Object[])
> java.lang.reflect.Method.invoke(Object, Object[])
> org.apache.chemistry.opencmis.server.shared.Dispatcher.dispatch(String, String, CallContext, CmisService, String, HttpServletRequest, HttpServletResponse)
> org.apache.chemistry.opencmis.server.impl.atompub.CmisAtomPubServlet.dispatch(CallContext, HttpServletRequest, HttpServletResponse)
> org.apache.chemistry.opencmis.server.impl.atompub.CmisAtomPubServlet.service(HttpServletRequest, HttpServletResponse)
> javax.servlet.http.HttpServlet.service(ServletRequest, ServletResponse)
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ServletRequest, ServletResponse)
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ServletRequest, ServletResponse)
> org.alfresco.web.app.servlet.GlobalLocalizationFilter.doFilter(ServletRequest, ServletResponse, FilterChain)
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ServletRequest, ServletResponse)
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ServletRequest, ServletResponse)
> org.apache.catalina.core.StandardWrapperValve.invoke(Request, Response)
> org.apache.catalina.core.StandardContextValve.invoke(Request, Response)
> org.apache.catalina.authenticator.AuthenticatorBase.invoke(Request, Response)
> org.apache.catalina.core.StandardHostValve.invoke(Request, Response)
> org.apache.catalina.valves.ErrorReportValve.invoke(Request, Response)
> org.apache.catalina.core.StandardEngineValve.invoke(Request, Response)
> org.apache.catalina.connector.CoyoteAdapter.service(Request, Response)
> org.apache.coyote.http11.Http11Processor.process(Socket)
> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Socket)
> org.apache.tomcat.util.net.JIoEndpoint$Worker.run()
> java.lang.Thread.run()
>
> By inspecting garbage-collected objects we can see that this is almost
> all String[] tied back to AtomEntryParser (and the Woodstox parser stuff
> under it). If we look at one of these arrays we see the same stuff over
> and over again... see a screenshot here of the first few elements in
> such an array: http://i.imgur.com/RzNQBYG.png . There are a bunch of
> these VERY large string arrays and they all look like they are looping
> over the front of an AtomPub document.
>
> I noticed that in opencmis 0.9.0 the implementation of AtomEntryParser
> added a CappedInputStream while parsing the AtomPub stream. This appears
> to be limited to 10MB. Not sure if that is related to this issue, or if
> others have run into this issue. Any thoughts/guidance would be greatly
> appreciated.
>
> I'm bcc'ing Florian and Gab (the spy) but figured it was worth having
> this public, as I've been trying to track this issue down for a while
> and have not found an answer.
>
> Thanks all!




