David and I chatted about this briefly offline. We think we should probably revert the original bug fix from the 1.0.0 branch, since we're not 100% confident in the "fix of the fix". This isn't a regression, so it will just have to be a known bug in 1.0.0, to be addressed in 1.1 (or a 1.0.1 if we decide to do one).
-Todd

On Sun, Sep 11, 2016 at 10:03 PM, David Alves <[email protected]> wrote:

> Look at it for a bit and think I figured out the problem and fixed it (put
> on the ip2client map is synchronized with the put on client2tablets so it
> makes sense that the removing is too).
> The problem is that I'm having a hard time reproing the first failure.
> I'll ask JD's opinion tomorrow.
>
> -david
>
> On Sun, Sep 11, 2016 at 8:25 PM, Todd Lipcon <[email protected]> wrote:
>
> > I just had a Jenkins job time out when TestAsyncKuduSession java test took
> > an hour:
> > http://104.196.14.100/job/kudu-gerrit/3359/BUILD_TYPE=RELEASE/console
> >
> > Looking at the test output
> > http://104.196.14.100/job/kudu-gerrit/3359/BUILD_TYPE=RELEASE/artifact/java/kudu-client/target/surefire-reports/org.apache.kudu.client.TestAsyncKuduSession-output.txt
> > ,
> > I see the following NPE after an unexpected disconnect:
> >
> > 00:50:11.056 [WARN - main] (AsyncKuduSession.java:334) unexpected tablet lookup failure for operation KuduRpc(method=Write, tablet=null, attempt=0, DeadlineTracker(timeout=0, elapsed=32), Deferred@1645276019(state=PENDING, result=null, callback=<none>, errback=<none>)) row_key=(int32 key=1)
> > java.lang.NullPointerException
> >     at org.apache.kudu.client.AsyncKuduClient$RemoteTablet.addTabletClient(AsyncKuduClient.java:2089)
> >     at org.apache.kudu.client.AsyncKuduClient$RemoteTablet.refreshTabletClients(AsyncKuduClient.java:2049)
> >     at org.apache.kudu.client.AsyncKuduClient.discoverTablets(AsyncKuduClient.java:1404)
> >     at org.apache.kudu.client.AsyncKuduClient$MasterLookupCB.call(AsyncKuduClient.java:1317)
> >     at org.apache.kudu.client.AsyncKuduClient$MasterLookupCB.call(AsyncKuduClient.java:1297)
> >     at com.stumbleupon.async.Deferred.doCall(Deferred.java:1280)
> >     at com.stumbleupon.async.Deferred.addCallbacks(Deferred.java:685)
> >     at com.stumbleupon.async.Deferred.addCallback(Deferred.java:721)
> >     at org.apache.kudu.client.AsyncKuduClient.locateTablet(AsyncKuduClient.java:1106)
> >     at org.apache.kudu.client.AsyncKuduClient.loopLocateTable(AsyncKuduClient.java:1206)
> >     at org.apache.kudu.client.AsyncKuduClient.locateTable(AsyncKuduClient.java:1241)
> >     at org.apache.kudu.client.AsyncKuduClient.getTabletLocation(AsyncKuduClient.java:1450)
> >     at org.apache.kudu.client.AsyncKuduSession.apply(AsyncKuduSession.java:509)
> >     at org.apache.kudu.client.TestAsyncKuduSession.testInsertIntoUnavailableTablet(TestAsyncKuduSession.java:162)
> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >     at java.lang.reflect.Method.invoke(Method.java:606)
> >     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> >     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> >     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> >     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> >     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
> >
> >
> > I'm guessing this is an artifact
> > of d5082d8ec1218e3f3bd2143d117ddd64772a6162 which was committed Friday.
> > David, since you committed that change, do you mind looking into this and
> > decide if we should revert this patch for 1.0 so we have some more time to
> > thoroughly test it?
> >
> > -Todd
> > --
> > Todd Lipcon
> > Software Engineer, Cloudera
> >
>

--
Todd Lipcon
Software Engineer, Cloudera
