That's really great news, thanks Dan!

--John

On Jun 19, 2013, at 3:22 PM, Daniel Kulp <[email protected]> wrote:
>
> On Jun 14, 2013, at 2:10 PM, John Bellassai <[email protected]> wrote:
>
>> I went ahead and opened a ticket yesterday and provided a patch:
>> https://issues.apache.org/jira/browse/CXF-5078.
>
> Patch applied. Next releases should have it. Major thanks!
>
> Dan
>
>
>>
>> I'm not sure what the development cycle for CXF looks like, but do you think
>> chances are good that this patch could be included in the next bunch of
>> releases? If so I will wait for the next release, but if not, I will
>> probably need to provide a build to our customers which includes this
>> functionality as a new WSDLGetInterceptor class which places itself before
>> the existing one in the interceptor chain.
>>
>> At the very least I was hoping one of the CXF devs could have a look at it
>> to make sure I'm not doing something stupid.
>>
>> --John
>>
>> On Jun 7, 2013, at 2:55 PM, Daniel Kulp <[email protected]> wrote:
>>
>>>
>>> On Jun 7, 2013, at 2:01 PM, John Bellassai <[email protected]> wrote:
>>>> Hi Daniel. I was under the impression that a new Document object is
>>>> generated and only accessible within each invocation of handleMessage()
>>>> and thus would not be susceptible to thread-safety issues, but I'll take
>>>> your advice over my very limited understanding of the inner workings of
>>>> CXF ;).
>>>
>>> It SHOULD be caching the documents and only creating a new one if there is
>>> a required change (like the URL). I think.
>>>
>>>> I like your idea of writing it to a CachedOutputStream inside the lock. I
>>>> will give that a try and look into submitting a patch.
>>>
>>> Sounds good!
>>> Dan
>>>
>>>
>>>>
>>>> Thanks again!
>>>>
>>>> --John
>>>>
>>>>
>>>> On Jun 7, 2013, at 12:21 PM, Daniel Kulp <[email protected]> wrote:
>>>>
>>>>>
>>>>> On Jun 7, 2013, at 12:30 PM, John Bellassai <[email protected]> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> We've been seeing an issue in production for the past few months where
>>>>>> after running smoothly for a couple weeks, our application hangs and
>>>>>> stops responding to new requests until we bounce the container (Tomcat
>>>>>> 7/CXF 2.5.3).
>>>>>>
>>>>>> Some thread/heap dump analysis for a few of these hang events have shown
>>>>>> a running theme. It seems that all of Tomcat's HTTPS handler threads
>>>>>> except one are waiting on a specific lock to become available. The
>>>>>> problem is in the org.apache.cxf.frontend.WSDLGetInterceptor class,
>>>>>> specifically in the synchronized block in the handleMessage method.
>>>>>>
>>>>>> What we are experiencing seems to not technically be a deadlock because
>>>>>> one thread is legitimately holding the lock and is still runnable and
>>>>>> writing the WSDL to the client, but for some reason (network issues
>>>>>> perhaps), it is not making very quick progress while other threads
>>>>>> continue to pile up. Eventually all of Tomcat's handler threads are in
>>>>>> use and are waiting for this lock so from the outside, the server does
>>>>>> not respond to new requests.
>>>>>>
>>>>>> In examining the code for this interceptor I wonder if the synchronized
>>>>>> block needs to be as coarse as it is currently. In other words, would
>>>>>> it be possible to lock only while creating the WSDL Document object,
>>>>>> then actually write to the XMLStreamWriter outside of the synchronized
>>>>>> block such that all threads can make progress even if one client happens
>>>>>> to be slow or experiencing network issues?
>>>>>
>>>>> I don't think that would be possible for two reasons:
>>>>>
>>>>> 1) Another thread could then modify the document while it's being written
>>>>> out. I'm not exactly sure what would happen in that case.
>>>>>
>>>>> 2) Traversing a DOM object is also not thread safe:
>>>>> http://xerces.apache.org/xerces2-j/faq-dom.html#faq-1
>>>>> Thus, you don't want 2 threads traversing it at the same time.
>>>>>
>>>>> One potential fix that might work would be to write it to a
>>>>> CachedOutputStream within the lock. That should be fairly fast. Then
>>>>> outside the lock, write that to the network stream. Would you care to
>>>>> give that a try and maybe submit a patch?
>>>>>
>>>>>
>>>>> --
>>>>> Daniel Kulp
>>>>> [email protected] - http://dankulp.com/blog
>>>>> Talend Community Coder - http://coders.talend.com
>>>>>
>>>>
>>>
>>> --
>>> Daniel Kulp
>>> [email protected] - http://dankulp.com/blog
>>> Talend Community Coder - http://coders.talend.com
>>>
>>
>
> --
> Daniel Kulp
> [email protected] - http://dankulp.com/blog
> Talend Community Coder - http://coders.talend.com
>
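
For anyone finding this thread later, here is a rough sketch of the approach discussed above: serialize the shared DOM Document into an in-memory buffer while holding the lock, then write the buffered bytes to the (possibly slow) client stream after releasing it. This is NOT the code from the CXF-5078 patch. The class name BufferedWsdlWriter and the writeTo method are made up for illustration, and a plain ByteArrayOutputStream stands in for CXF's CachedOutputStream so that the example runs with only the JDK.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

// Hypothetical helper, not part of CXF: illustrates "buffer inside the
// lock, write to the network outside the lock".
final class BufferedWsdlWriter {
    private final Object lock = new Object();
    private final Document wsdl; // shared DOM, not safe for concurrent traversal

    BufferedWsdlWriter(Document wsdl) {
        this.wsdl = wsdl;
    }

    void writeTo(OutputStream client) throws Exception {
        byte[] snapshot;
        synchronized (lock) {
            // Only the in-memory serialization happens under the lock, so
            // the lock is held for a time independent of the client's speed.
            // ByteArrayOutputStream is an assumption standing in for
            // CXF's CachedOutputStream.
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(wsdl), new StreamResult(buffer));
            snapshot = buffer.toByteArray();
        }
        // A slow client now blocks only the thread serving it; other
        // threads can acquire the lock and take their own snapshot.
        client.write(snapshot);
        client.flush();
    }

    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        doc.appendChild(doc.createElement("definitions"));
        new BufferedWsdlWriter(doc).writeTo(System.out);
    }
}
```

The trade-off is one extra copy of the serialized WSDL per request in memory, in exchange for never holding the lock across network I/O. Any code that modifies the Document would need to synchronize on the same lock for the snapshot to be consistent.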
