So have you looked at what resources the fetcher is fetching?
On Fri, Sep 5, 2014 at 12:17 PM, Merrill, Matt <mmerr...@mitre.org> wrote:
> Yes, we have. During a couple of the outages we did a thread dump and saw
> that all (or almost all) of the threads were blocking on the
> BasicHTTPFetcher fetch method. We also saw the number of threads jump up
> to around the same number of threads we have in our Tomcat HTTP thread
> pool (300).
>
> As best I can tell, the issue is that there are now MORE calls being made
> to the various shindig servlets, which is causing all of the HTTP threads
> to get consumed, but we can’t explain why, as the load is the same. Once
> we roll back to the version of the application which uses shindig 2.0.0,
> everything is absolutely fine.
>
> I’m very hesitant to just increase the thread pool without a good
> understanding of what could cause this. If someone knows of something that
> changed between the 2.0.0 and 2.5.0-update1 versions that may have caused
> more calls to be made, whether through the opensocial java API or
> internally inside shindig, that would be great to know.
>
> Or, perhaps a configuration parameter was introduced that we have set
> wrong that may have caused all these extra calls?
>
> We have already made sure our HTTP responses are cached at a very high
> level per your excellent advice. However, because the majority of the
> calls which seem to be taking a long time are RPC calls, it doesn’t appear
> these get cached anyway, so that wouldn’t affect this problem.
>
> And if someone knows the answers to the configuration/extension questions
> about pipelining, that would be great.
>
> Thanks!
>
> -Matt
>
> On 9/5/14, 11:35 AM, "Ryan Baxter" <rbaxte...@apache.org> wrote:
>
>>So Matt, have you looked into what those threads are doing? I agree
>>that it seems odd that with 2.5.0-update1 you are running out of
>>threads, but it is hard to pinpoint the reason without knowing what all
>>those extra threads might be doing.
>>
>>
>>On Thu, Sep 4, 2014 at 11:04 AM, Merrill, Matt <mmerr...@mitre.org> wrote:
>>> Hi all,
>>>
>>> I haven’t heard back on this, so I thought I’d provide some more
>>> information in the hopes that perhaps someone has some ideas as to what
>>> could be causing the issues we’re seeing with shindig’s “loopback” http
>>> calls.
>>>
>>> We have a situation where under load we hit a deadlock-like situation
>>> because of the HTTP calls shindig makes to itself when pipelining gadget
>>> data. Basically, the HTTP request threadpools inside our Shindig Tomcat
>>> container are getting maxed out, and when shindig makes an http rpc call
>>> to itself to render a gadget which pipelines data, the request gets held
>>> up waiting for the rpc call, which might itself be blocked waiting for
>>> the Tomcat container to free a thread to handle it. This only happens
>>> under load, of course.
>>>
>>> This is puzzling to me because when we were running Shindig 2.0.0, we had
>>> the same size threadpool, and now that we’ve upgraded to Shindig
>>> 2.5.0-update1, the threadpools seem to be getting maxed out. I took
>>> some timings inside of our various shindig SPI implementations
>>> (PersonService, AppDataService) and I didn’t see anything alarming.
>>> There are also no spikes in user traffic.
>>>
>>> As I see it, we have a few options I could explore:
>>>
>>> 1) The “nuclear” option would be to simply increase our tomcat HTTP
>>> threadpools, but that doesn’t seem prudent since the old version of
>>> shindig worked just fine with that thread pool setting. I feel like a
>>> greater problem is being masked. Is there anything that changed between
>>> Shindig 2.0.0 and 2.5.0-update1 that could have caused some kind of
>>> increase in traffic to shindig? I tried looking at release notes in Jira,
>>> but that honestly wasn’t very helpful at all.
>>>
>>> 2) Re-configure Shindig to use implemented SPI methods (java method calls)
>>> instead of making HTTP calls to itself through the RPC API shindig
>>> exposes? Based on Stanton’s note below, it seems like there are some
>>> configuration options for the RPC calls, but they’re mostly related to how
>>> the client-side javascript makes the calls. Is there anything server side
>>> I can configure? Perhaps with Guice modules?
>>>
>>> 3) Explore whether there are hooks in the code to write custom code to
>>> do this. I see in PipelinedDataPreloader.executeSocialRequest that the
>>> javadoc mentions that:
>>> “Subclasses can override to provide special handling (e.g., directly
>>> invoking a local API)” However, I’m missing something because I can’t
>>> find out where the preloader gets instantiated. I see that the
>>> PipelineExecutor takes in a Guice injected instance of
>>> PipelinedDataPreloader; however, I don’t see it getting created anywhere.
>>> Where is this being configured?
>>
>>The intention was probably to make this possible via Guice, but there
>>is no interface you can bind an implementation to. You would have to
>>replace the classes where PipelinedDataPreloader is used and then
>>keep going up the chain until you get to a class where you can inject
>>something via Guice. Looks like a messy situation right now with the
>>current way the code is written.
>>
>>>
>>> Any help is appreciated!
>>>
>>> Thanks!
>>> -Matt
>>>
>>> On 8/25/14, 4:55 PM, "Merrill, Matt" <mmerr...@mitre.org> wrote:
>>>
>>>>Thanks Stanton!
>>>>
>>>>I’m assuming that you mean the javascript calls will call listMethods and
>>>>then make any necessary RPC calls, is that correct? Is there any other
>>>>documentation on the introspection part?
>>>>
>>>>The reason I ask is that we’re having problems server side when Shindig is
>>>>pipelining data.
>>>>For example, when you do the following in a gadget:
>>>><os:ViewerRequest key="viewer" />
>>>><os:DataRequest key="appData" method="appdata.get" userId="@viewer"
>>>>appId="@app"/>
>>>>
>>>>
>>>>Shindig appears to make HTTP requests to the rpc endpoint to itself in the
>>>>process of rendering the gadget. I could be missing something fundamental
>>>>here, but is there any way to configure this differently so that shindig
>>>>simply uses its SPI methods to retrieve this data instead? Is this really
>>>>just more of a convenience for the gadget developer than anything else?
>>>>
>>>>-Matt
>>>>
>>>>On 8/20/14, 4:14 PM, "Stanton Sievers" <ssiev...@apache.org> wrote:
>>>>
>>>>>Hi Matt,
>>>>>
>>>>>This behavior is configured in container.js in the "gadgets.features"
>>>>>object. If you look for "osapi" and "osapi.services", you'll see some
>>>>>comments about this configuration and the behavior.
>>>>>features/container/service.js is where this configuration is used and where
>>>>>the osapi services are instantiated. As you've seen, Shindig introspects
>>>>>to find available services by default.
>>>>>
>>>>>If I knew at one point why this behaves this way, I've since forgotten.
>>>>>There is a system.listMethods API[1] defined in the Core API Server spec
>>>>>that this might simply be re-using to discover the available services.
>>>>>
>>>>>I hope that helps.
>>>>>
>>>>>-Stanton
>>>>>
>>>>>[1]
>>>>>http://opensocial.github.io/spec/trunk/Core-API-Server.xml#System-Service-ListMethods
>>>>>
>>>>>
>>>>>On Tue, Aug 19, 2014 at 8:13 AM, Merrill, Matt <mmerr...@mitre.org>
>>>>>wrote:
>>>>>
>>>>>> Good morning,
>>>>>>
>>>>>> I’m hoping some shindig veterans can help shed some light into the reason
>>>>>> that Shindig makes HTTP rpc calls to itself as part of the gadget rendering
>>>>>> process. Why is this done as opposed to retrieving information via
>>>>>> internal Java method calls?
>>>>>> We are having lots of issues where this
>>>>>> approach seems to be causing a cascading failure when calls get hung up in
>>>>>> the HTTPFetcher class.
>>>>>>
>>>>>> Also, I’m curious what calls are made in this manner and how they can be
>>>>>> configured. I have seen retrieval of viewer data done this way, as well as
>>>>>> application data.
>>>>>>
>>>>>> I’ve looked for documentation on this topic before and have not seen any.
>>>>>> Any help is much appreciated.
>>>>>>
>>>>>> Thanks!
>>>>>> -Matt Merrill
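
The deadlock-like situation described in this thread (request threads blocking on loopback calls that themselves need a thread from the same pool) can be reproduced outside Shindig. The following is a minimal sketch in plain Java, not Shindig or Tomcat code. The class LoopbackStarvationDemo, the method countTimedOut, and the two latches are hypothetical names made up for illustration, and the fixed thread pools only stand in for the Tomcat HTTP thread pool under the assumption that every request thread is busy at the same moment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class; it does not use any Shindig or Tomcat API.
class LoopbackStarvationDemo {

    /**
     * Runs poolSize "render" requests at once. Each request makes one
     * "loopback" call and waits up to timeoutMillis for the result.
     * If sharedPool is true, the loopback call needs a thread from the
     * same pool that is serving the requests. Returns how many requests
     * timed out waiting for their loopback call.
     */
    public static int countTimedOut(int poolSize, boolean sharedPool, long timeoutMillis)
            throws Exception {
        ExecutorService requestPool = Executors.newFixedThreadPool(poolSize);
        ExecutorService loopbackPool =
                sharedPool ? requestPool : Executors.newFixedThreadPool(poolSize);
        // Opens once every request thread is occupied (simulates "under load").
        CountDownLatch allBusy = new CountDownLatch(poolSize);
        // Holds each request thread until all requests have resolved, so the
        // result does not depend on which timeout happens to fire first.
        CountDownLatch allResolved = new CountDownLatch(poolSize);
        try {
            List<Future<Boolean>> requests = new ArrayList<>();
            for (int i = 0; i < poolSize; i++) {
                requests.add(requestPool.submit(() -> {
                    allBusy.countDown();
                    allBusy.await();
                    // The loopback call: queued behind busy threads if the pool is shared.
                    Future<String> inner = loopbackPool.submit(() -> "viewer data");
                    boolean timedOut;
                    try {
                        inner.get(timeoutMillis, TimeUnit.MILLISECONDS);
                        timedOut = false;
                    } catch (TimeoutException e) {
                        inner.cancel(true);
                        timedOut = true;
                    }
                    allResolved.countDown();
                    allResolved.await();
                    return timedOut;
                }));
            }
            int count = 0;
            for (Future<Boolean> request : requests) {
                if (request.get()) {
                    count++;
                }
            }
            return count;
        } finally {
            requestPool.shutdownNow();
            loopbackPool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("shared pool, timed out:   " + countTimedOut(4, true, 500));
        System.out.println("separate pool, timed out: " + countTimedOut(4, false, 500));
    }
}
```

With a shared pool, every request in this sketch should time out, because no thread is left to run the loopback calls. With a separate pool, none should, given a generous timeout. This only illustrates the mechanism; it does not explain why 2.5.0-update1 would make more loopback calls than 2.0.0.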