David and I chatted about this briefly offline. We think we should probably
revert the original bug fix from the 1.0.0 branch, since we're not 100%
confident in the "fix of the fix". This isn't a regression, so it will just
have to be a known bug in 1.0.0, to be addressed in 1.1 (or a 1.0.1 if we
decide to do one).

-Todd

On Sun, Sep 11, 2016 at 10:03 PM, David Alves <[email protected]> wrote:

> Looked at it for a bit and I think I figured out the problem and fixed it
> (the put on the ip2client map is synchronized with the put on
> client2tablets, so it makes sense that the removal should be too).
> The problem is that I'm having a hard time reproducing the first failure.
> I'll ask for JD's opinion tomorrow.
>
> -david
>
> On Sun, Sep 11, 2016 at 8:25 PM, Todd Lipcon <[email protected]> wrote:
>
> > I just had a Jenkins job time out when the TestAsyncKuduSession Java test
> > took an hour:
> > http://104.196.14.100/job/kudu-gerrit/3359/BUILD_TYPE=RELEASE/console
> >
> > Looking at the test output
> > http://104.196.14.100/job/kudu-gerrit/3359/BUILD_TYPE=RELEASE/artifact/java/kudu-client/target/surefire-reports/org.apache.kudu.client.TestAsyncKuduSession-output.txt
> > ,
> > I see the following NPE after an unexpected disconnect:
> >
> > 00:50:11.056 [WARN - main] (AsyncKuduSession.java:334) unexpected tablet lookup failure for operation KuduRpc(method=Write, tablet=null, attempt=0, DeadlineTracker(timeout=0, elapsed=32), Deferred@1645276019(state=PENDING, result=null, callback=<none>, errback=<none>)) row_key=(int32 key=1)
> > java.lang.NullPointerException
> >         at org.apache.kudu.client.AsyncKuduClient$RemoteTablet.addTabletClient(AsyncKuduClient.java:2089)
> >         at org.apache.kudu.client.AsyncKuduClient$RemoteTablet.refreshTabletClients(AsyncKuduClient.java:2049)
> >         at org.apache.kudu.client.AsyncKuduClient.discoverTablets(AsyncKuduClient.java:1404)
> >         at org.apache.kudu.client.AsyncKuduClient$MasterLookupCB.call(AsyncKuduClient.java:1317)
> >         at org.apache.kudu.client.AsyncKuduClient$MasterLookupCB.call(AsyncKuduClient.java:1297)
> >         at com.stumbleupon.async.Deferred.doCall(Deferred.java:1280)
> >         at com.stumbleupon.async.Deferred.addCallbacks(Deferred.java:685)
> >         at com.stumbleupon.async.Deferred.addCallback(Deferred.java:721)
> >         at org.apache.kudu.client.AsyncKuduClient.locateTablet(AsyncKuduClient.java:1106)
> >         at org.apache.kudu.client.AsyncKuduClient.loopLocateTable(AsyncKuduClient.java:1206)
> >         at org.apache.kudu.client.AsyncKuduClient.locateTable(AsyncKuduClient.java:1241)
> >         at org.apache.kudu.client.AsyncKuduClient.getTabletLocation(AsyncKuduClient.java:1450)
> >         at org.apache.kudu.client.AsyncKuduSession.apply(AsyncKuduSession.java:509)
> >         at org.apache.kudu.client.TestAsyncKuduSession.testInsertIntoUnavailableTablet(TestAsyncKuduSession.java:162)
> >         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >         at java.lang.reflect.Method.invoke(Method.java:606)
> >         at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> >         at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> >         at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> >         at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> >         at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
> >
> >
> > I'm guessing this is an artifact
> > of d5082d8ec1218e3f3bd2143d117ddd64772a6162, which was committed Friday.
> > David, since you committed that change, do you mind looking into this and
> > deciding whether we should revert this patch for 1.0 so we have some more
> > time to thoroughly test it?
> >
> > -Todd
> > --
> > Todd Lipcon
> > Software Engineer, Cloudera
> >
>



-- 
Todd Lipcon
Software Engineer, Cloudera
