Phew, that's a good bug. AsyncKuduClient.sendRpcToTablet() can call delayedSendRpcToTablet(), which checks whether the RPC has hit too many retries and, if so, both returns a Deferred.fromError and calls errback on the RPC. That all seems good, except that in this case sendRpcToTablet() ignores what delayedSendRpcToTablet() returned and just returns the RPC's Deferred. The problem? That Deferred is bogus if you've hit too many retries, because the RPC was already errback'd, so whoever waits on it never hears about the error.
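For anyone who'd rather read code than prose, here is a rough sketch of that sequence. It is not the actual client code: FakeRpc, RetrySketch, MAX_ATTEMPTS and the method names are made up for illustration, and java.util.concurrent.CompletableFuture stands in for the Deferred class. It also assumes that an RPC drops its Deferred once it has been fired and hands out a fresh one on the next request, which is my reading of why the returned Deferred is "bogus". fixedSend() shows one possible way to avoid the problem, not necessarily the fix that will go in.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for an RPC object. CompletableFuture plays the
// role of the Deferred; none of these names come from the Kudu client.
final class FakeRpc {
    static final int MAX_ATTEMPTS = 3; // illustrative limit only
    int attempts;
    private CompletableFuture<String> deferred = new CompletableFuture<>();

    CompletableFuture<String> getDeferred() {
        return deferred;
    }

    // Assumption: firing the errback completes the current future and
    // leaves the RPC holding a fresh one that nothing will ever complete.
    void errback(Exception e) {
        CompletableFuture<String> fired = deferred;
        deferred = new CompletableFuture<>();
        fired.completeExceptionally(e);
    }
}

final class RetrySketch {
    // Models the retry check: on too many attempts it both errbacks the
    // RPC and returns an already-failed future.
    static CompletableFuture<String> delayedSend(FakeRpc rpc) {
        if (rpc.attempts >= FakeRpc.MAX_ATTEMPTS) {
            Exception e = new IllegalStateException("too many attempts");
            rpc.errback(e);
            CompletableFuture<String> failed = new CompletableFuture<>();
            failed.completeExceptionally(e);
            return failed;
        }
        rpc.attempts++;
        return rpc.getDeferred();
    }

    // The buggy shape: the result of delayedSend() is ignored, so the
    // caller gets the RPC's fresh future, which never completes.
    static CompletableFuture<String> buggySend(FakeRpc rpc) {
        delayedSend(rpc);
        return rpc.getDeferred();
    }

    // One possible repair: hand back whatever delayedSend() returned.
    static CompletableFuture<String> fixedSend(FakeRpc rpc) {
        return delayedSend(rpc);
    }
}
```

With an RPC that is already at the attempt limit, buggySend() hands back a future that is never completed, which would look like a hang (or a test timeout) to the caller, while fixedSend() hands back one that has already failed.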
This probably reads like $foreign_language, but I think Dan and Adar understand what it means. I'm trying to get a test that repros it, but it's a really tight race condition that gets us into this situation.

J-D

On Thu, Oct 6, 2016 at 8:59 AM, Todd Lipcon <[email protected]> wrote:

> Looks like one of Dan's builds failed with a test timeout on
> TestAsyncKuduSession:
> http://104.196.14.100/job/kudu-gerrit/3870/BUILD_TYPE=ASAN/console
>
> On Wed, Oct 5, 2016 at 8:17 PM, Jean-Daniel Cryans <[email protected]>
> wrote:
>
> > Having that would be pretty awesome :)
> >
> > On Wed, Oct 5, 2016 at 8:16 PM, Todd Lipcon <[email protected]> wrote:
> >
> > > Thanks, JD!
> > >
> > > I'll see if I can also spend some time getting tracking of Java flakiness
> > > into the flaky test dashboard some time in the next couple weeks. That way
> > > we can get an easier handle on how flaky they are and catch regressions.
> > >
> > > -Todd
> > >
> > > On Wed, Oct 5, 2016 at 4:54 PM, Jean-Daniel Cryans <[email protected]>
> > > wrote:
> > >
> > > > Hey Kudu devs,
> > > >
> > > > I just got a load of patches in that fix various flakiness and actual bugs
> > > > in the Java client and its testing infrastructure. If you have a build
> > > > failure due to Java tests on a patch that has bce1dd7, please reach out to
> > > > me so I can take a look at it.
> > > >
> > > > Thanks,
> > > >
> > > > J-D
> > > >
> > >
> > > --
> > > Todd Lipcon
> > > Software Engineer, Cloudera
> > >
> >
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>
