Tim, I have added some new guards to prevent more than one thread from accessing a client. The current implementation will throw a runtime exception if that happens, and the exception will print the stack trace of the owning thread as well as the accessing thread's stack trace.
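To illustrate the kind of guard described above, here is a minimal sketch. It is not the code from the commit: the class name `ClientGuard` and its `acquire`/`release` methods are hypothetical, and it assumes each client call can be wrapped in an acquire/release pair. The thrown exception carries the accessing thread's stack trace, and its cause carries the stack trace captured when the owning thread took the client.

```java
/**
 * Hypothetical single-owner guard for a client object (illustration only,
 * not the class added in the commit).
 */
final class ClientGuard {
  // Thread currently using the client, or null when the client is free.
  private Thread owner;
  // Captured at acquire time so the owner's stack can be reported later.
  private Throwable ownerTrace;

  /** Call before using the client; throws if another thread already owns it. */
  synchronized void acquire() {
    Thread current = Thread.currentThread();
    if (owner != null && owner != current) {
      // The exception itself records the accessing thread's stack;
      // the cause records where the owning thread acquired the client.
      throw new IllegalStateException(
          "Client owned by thread [" + owner.getName()
              + "] was accessed by thread [" + current.getName() + "]",
          ownerTrace);
    }
    if (owner == null) {
      owner = current;
      ownerTrace = new Throwable(
          "Owning thread [" + current.getName() + "] acquired the client here");
    }
    // A repeated acquire by the owning thread is treated as a no-op.
  }

  /** Call when done with the client; only the owning thread can release it. */
  synchronized void release() {
    if (owner == Thread.currentThread()) {
      owner = null;
      ownerTrace = null;
    }
  }
}
```

A caller would invoke `guard.acquire()` before each client call and `guard.release()` in a `finally` block afterwards. Printing the resulting exception with `printStackTrace()` shows the accessing thread's trace followed by the owning thread's trace under "Caused by:".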
This will not fix the problem; however, I think it will provide enough information to track down where the issue is coming from in the code.

https://git-wip-us.apache.org/repos/asf?p=incubator-blur.git;a=commit;h=33f083166af2df02558fc238ad7f537ad475f890

Aaron


On Tue, Dec 17, 2013 at 5:37 PM, Aaron McCurry <[email protected]> wrote:

> I'm wondering if I can somehow put some tracking code in the client to
> figure out what's going on. And have it be able to be enabled and disabled
> (for when we figure out what's going on).
>
> I will try and write a unit test tonight to reproduce.
>
> Aaron
>
>
> On Tue, Dec 17, 2013 at 4:19 PM, Tim Williams <[email protected]> wrote:
>
>> I think multiple queries does the trick. Here's an evil scenario that
>> can produce it if done a couple times.
>>
>> 1) Run a cruelly long query (e.g. * )
>> 2) Run several non-trivial, but not evil queries over and over.
>> 3) One of the non-trivial queries fails with the "out of sequence" error.
>> 4) Then, Mr. Cruel Query fails with a "TTransportException:
>> j.n.SocketTimeoutException: Read Timed Out"
>>
>> I think the corpus has to be reasonably large for it to be slow enough to
>> occur.
>>
>> Thanks,
>> --tim
>>
>> On Tue, Dec 17, 2013 at 3:48 PM, Aaron McCurry <[email protected]> wrote:
>> > So it's during a query call (+ some other call)?
>> >
>> >
>> > On Tue, Dec 17, 2013 at 3:38 PM, Tim Williams <[email protected]> wrote:
>> >
>> >> Appears to be between controller and shards, with the exception on the
>> >> controller side...
>> >>
>> >> BlurResultIterableClient.performSearch(BlurResultIterableClient.java:77)
>> >> ...
>> >> BlurControllerServer.call(BlurControllerServer.java:396)
>> >> ...
>> >>
>> >> That's where it heads into generated code territory. The whole trace
>> >> isn't easily pastable for me so let me know if more context is
>> >> necessary...
>> >>
>> >> Thanks,
>> >> --tim
>> >>
>> >> On Tue, Dec 17, 2013 at 3:08 PM, Aaron McCurry <[email protected]> wrote:
>> >> > Tim you are right this only occurs when a thrift client has more than
>> >> > one thread using it. Can you isolate where the error is occurring?
>> >> > Meaning is it between the controller and shard or between your client
>> >> > and the controller? Because if it's between your client and the
>> >> > controller it's likely something going on in your application logic.
>> >> > If not, then there's a bug somewhere in the client use/reuse.
>> >> >
>> >> > Aaron
>> >> >
>> >> >
>> >> > On Tue, Dec 17, 2013 at 2:54 PM, Garrett Barton <[email protected]> wrote:
>> >> >
>> >> >> Used to see that with Blur when I did not set an id on the query.
>> >> >> Don't know if Aaron sets some unique identifier on calls like schema
>> >> >> though...
>> >> >>
>> >> >>
>> >> >> On Tue, Dec 17, 2013 at 2:46 PM, Tim Williams <[email protected]> wrote:
>> >> >>
>> >> >> > I'm periodically seeing this on various calls - both query and
>> >> >> > seemingly harmless ones (e.g. schema). Google hints that it happens
>> >> >> > when the client is used across threads. Anyone see it before? Know
>> >> >> > how to solve it?
>> >> >> >
>> >> >> > Thanks,
>> >> >> > --tim
