> > Any chance of similar hack in this test?
If we don't need multiple regionservers in the minicluster for this test, yes.

On Fri, Dec 4, 2015 at 11:00 AM, Stack <[email protected]> wrote:

> I see. Any chance of similar hack in this test? Or disabling this test in
> all but master branch? Or a generic version of your hack (probably not)?
>
> Getting a successful test run requires our going through all unit tests
> twice, first on jdk7 and then on jdk8. The probability for fail is high
> (smile) or at least, for flakies to raise their heads. Its a pity after
> running thousands of unit tests, that all fail because of a single missed
> watcher.
>
> You think the test written wrong then Andrew? It should be done more
> defensively prepared to miss a watcher? If the latter, I could disable it
> until this had been addressed?
>
> Thanks for the back and forth,
> St.Ack
>
> On Fri, Dec 4, 2015 at 10:04 AM, Andrew Purtell <[email protected]>
> wrote:
>
> > Would be a pity to disable the test. On the other hand we seem to flake
> > wherever using watcher triggers in miniclusters to move state forward.
> > That's fixed by porting the notification to ProcV2. Otherwise, we hack
> > around the edges (like HBASE-14209).
> >
> > On Fri, Dec 4, 2015 at 9:47 AM, Stack <[email protected]> wrote:
> >
> > > It shuts down fine. It just fails too often in scheme of things. I could
> > > just disable it.
> > > St.Ack
> > >
> > > On Fri, Dec 4, 2015 at 9:42 AM, Andrew Purtell <[email protected]>
> > > wrote:
> > >
> > > > > Snapshot of AccessController state does not include instance on region
> > > >
> > > > We update a znode and wait for a state change driven by processing a watch
> > > > notification for the znode change. The watch notification is apparently
> > > > lost. Yeah, once that happens the test is dead. It shouldn't hang
> > > > indefinitely, the predicate should only wait for 10 seconds, then error
> > > > out. If that isn't happening we've got some kind of test shutdown hang bug.
> > > >
> > > > On Fri, Dec 4, 2015 at 9:29 AM, Stack <[email protected]> wrote:
> > > >
> > > > > Anyone up for taking a look at this flakey test?
> > > > >
> > > > > See here for example:
> > > > >
> > > > > https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.2/419/jdk=latest1.7,label=Hadoop/testReport/junit/org.apache.hadoop.hbase.security.visibility/TestVisibilityLabelsWithACL/org_apache_hadoop_hbase_security_visibility_TestVisibilityLabelsWithACL/
> > > > >
> > > > > I see it fail from time to time.
> > > > >
> > > > > Something is odd. Says we time out on setup after ten seconds. Digging in
> > > > > more, I see this around startup:
> > > > >
> > > > > 2015-12-02 23:08:42,790 DEBUG [B.defaultRpcServer.handler=1,queue=0,port=47849] ipc.CallRunner(112): B.defaultRpcServer.handler=1,queue=0,port=47849: callId: 0 service: RegionServerStatusService methodName: RegionServerStartup size: 45 connection: 67.195.81.153:43968
> > > > > org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server is not running yet
> > > > >     at org.apache.hadoop.hbase.master.HMaster.checkServiceStarted(HMaster.java:2265)
> > > > >     at org.apache.hadoop.hbase.master.MasterRpcServices.regionServerStartup(MasterRpcServices.java:351)
> > > > >     at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:8615)
> > > > >     at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2168)
> > > > >     at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109)
> > > > >     at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
> > > > >     at org.apache.
> > > > > ...[truncated 182514 chars]...
> > > > > ecureTestUtil$1(333): Snapshot of AccessController state does not include instance on region hbase:acl,,1449097729021.ec6be7579802c2fa1182dc62f5fb6137.
> > > > > 2015-12-02 23:09:00,167 ERROR [main] access.SecureTestUtil$1(333): Snapshot of AccessController state does not include instance on region hbase:acl,,1449097729021.ec6be7579802c2fa1182dc62f5fb6137.
> > > > > 2015-12-02 23:09:00,275 ERROR [main] access.SecureTestUtil$1(333): Snapshot of AccessController state does not include instance on region hbase:acl,,1449097729021.ec6be7579802c2fa1182dc62f5fb6137.
> > > > >
> > > > > ....
> > > > >
> > > > > We seem to just hang.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > St.Ack
> > > >
> > > > --
> > > > Best regards,
> > > >
> > > >    - Andy
> > > >
> > > > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > > > (via Tom White)
> >
> > --
> > Best regards,
> >
> >    - Andy
> >
> > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > (via Tom White)

--
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
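
For reference, the two ideas discussed in this thread (a predicate that waits a bounded time and then errors out, and a test written "more defensively prepared to miss a watcher") could be sketched roughly as below. This is only an illustration in plain Java, not the code in SecureTestUtil or any HBase test utility. The class name BoundedWait, the methods waitFor and waitWithRetrigger, and the idea of re-running the triggering update on each attempt are all assumptions made for the example.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical helper; names and behavior are illustrative, not HBase API. */
public class BoundedWait {

    /**
     * Polls the condition until it holds or timeoutMs elapses.
     * Returns true if the condition held, false on timeout or interrupt,
     * so a caller can fail the test instead of hanging.
     */
    public static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // bounded: give up once the deadline has passed
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
    }

    /**
     * Assumed "defensive" variant: run the triggering update again before each
     * bounded wait, so one lost notification does not doom the whole run.
     */
    public static boolean waitWithRetrigger(int attempts, long timeoutMs, long intervalMs,
                                            Runnable trigger, BooleanSupplier condition) {
        for (int i = 0; i < attempts; i++) {
            trigger.run(); // e.g. rewrite the znode (hypothetical caller-supplied action)
            if (waitFor(timeoutMs, intervalMs, condition)) {
                return true;
            }
        }
        return false;
    }
}
```

A test would call something like `BoundedWait.waitWithRetrigger(3, 10_000, 100, updateZnode, stateChanged)` and fail with an assertion if it returns false. Whether re-triggering is safe depends on the update being idempotent, which the thread does not establish.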
