>
> Any chance of similar hack in this test?

If we don't need multiple regionservers in the minicluster for this test,
yes.
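
For reference, the bounded wait described further down the thread (change
state, then poll a predicate for at most 10 seconds and error out) could be
sketched roughly as below. This is a standalone illustration only, not the
actual SecureTestUtil code: the class name BoundedWait, the method waitFor,
and its parameters are hypothetical and assumed for the example.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of HBase: polls a condition with a hard deadline
// so that a lost notification turns into a test failure instead of a hang.
class BoundedWait {
    /**
     * Polls the condition every intervalMs until it holds or timeoutMs elapses.
     * Returns true if the condition was met, false on timeout or interruption.
     */
    public static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // deadline passed, give up rather than wait forever
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
        return true;
    }
}
```

A caller would fail the test when waitFor(10_000, 100, ...) returns false. A
more defensive variant could re-read the underlying state on each poll rather
than rely only on the watch having fired.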


On Fri, Dec 4, 2015 at 11:00 AM, Stack <[email protected]> wrote:

> I see. Any chance of similar hack in this test? Or disabling this test in
> all but master branch? Or a generic version of your hack (probably not)?
>
> Getting a successful test run requires our going through all unit tests
> twice, first on jdk7 and then on jdk8. The probability of failure is high
> (smile) or at least, for flakies to raise their heads. It's a pity after
> running thousands of unit tests, that all fail because of a single missed
> watcher.
>
> You think the test is written wrong then, Andrew? Should it be written more
> defensively, prepared to miss a watcher? If the latter, I could disable it
> until this has been addressed?
>
> Thanks for the back and forth,
> St.Ack
>
> On Fri, Dec 4, 2015 at 10:04 AM, Andrew Purtell <[email protected]>
> wrote:
>
> > Would be a pity to disable the test. On the other hand we seem to flake
> > wherever using watcher triggers in miniclusters to move state forward.
> > That's fixed by porting the notification to ProcV2. Otherwise, we hack
> > around the edges (like HBASE-14209).
> >
> > On Fri, Dec 4, 2015 at 9:47 AM, Stack <[email protected]> wrote:
> >
> > > It shuts down fine. It just fails too often in scheme of things. I could
> > > just disable it.
> > > St.Ack
> > >
> > > On Fri, Dec 4, 2015 at 9:42 AM, Andrew Purtell <[email protected]>
> > > wrote:
> > >
> > > > > Snapshot of AccessController state does not include instance on region
> > > >
> > > > We update a znode and wait for a state change driven by processing a watch
> > > > notification for the znode change. The watch notification is apparently
> > > > lost. Yeah, once that happens the test is dead. It shouldn't hang
> > > > indefinitely, the predicate should only wait for 10 seconds, then error
> > > > out. If that isn't happening we've got some kind of test shutdown hang
> > > > bug.
> > > >
> > > >
> > > >
> > > > On Fri, Dec 4, 2015 at 9:29 AM, Stack <[email protected]> wrote:
> > > >
> > > > > Anyone up for taking a look at this flakey test?
> > > > >
> > > > > See here for example:
> > > > >
> > > > >
> > > >
> > >
> >
> https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.2/419/jdk=latest1.7,label=Hadoop/testReport/junit/org.apache.hadoop.hbase.security.visibility/TestVisibilityLabelsWithACL/org_apache_hadoop_hbase_security_visibility_TestVisibilityLabelsWithACL/
> > > > >
> > > > > I see it fail from time to time.
> > > > >
> > > > > > Something is odd. Says we time out on setup after ten seconds. Digging in
> > > > > > more, I see this around startup:
> > > > >
> > > > >
> > > > > 2015-12-02 23:08:42,790 DEBUG
> > > > > [B.defaultRpcServer.handler=1,queue=0,port=47849]
> > ipc.CallRunner(112):
> > > > > B.defaultRpcServer.handler=1,queue=0,port=47849: callId: 0 service:
> > > > > RegionServerStatusService methodName: RegionServerStartup size: 45
> > > > > connection: 67.195.81.153:43968
> > > > > org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server is
> > > > > not running yet
> > > > >         at
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.hbase.master.HMaster.checkServiceStarted(HMaster.java:2265)
> > > > >         at
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.hbase.master.MasterRpcServices.regionServerStartup(MasterRpcServices.java:351)
> > > > >         at
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:8615)
> > > > >         at
> > > > org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2168)
> > > > >         at
> > > > org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109)
> > > > >         at
> > > > >
> > > >
> > >
> >
> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
> > > > >         at org.apache.
> > > > > ...[truncated 182514 chars]...
> > > > > ecureTestUtil$1(333): Snapshot of AccessController state does not
> > > > > include instance on region
> > > > > hbase:acl,,1449097729021.ec6be7579802c2fa1182dc62f5fb6137.
> > > > > 2015-12-02 23:09:00,167 ERROR [main] access.SecureTestUtil$1(333):
> > > > > Snapshot of AccessController state does not include instance on
> > region
> > > > > hbase:acl,,1449097729021.ec6be7579802c2fa1182dc62f5fb6137.
> > > > > 2015-12-02 23:09:00,275 ERROR [main] access.SecureTestUtil$1(333):
> > > > > Snapshot of AccessController state does not include instance on
> > region
> > > > > hbase:acl,,1449097729021.ec6be7579802c2fa1182dc62f5fb6137.
> > > > >
> > > > >
> > > > > ....
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > We seem to just hang.
> > > > >
> > > > >
> > > > > Thanks,
> > > > >
> > > > > St.Ack
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Best regards,
> > > >
> > > >    - Andy
> > > >
> > > > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > > > (via Tom White)
> > > >
> > >
> >
> >
> >
> > --
> > Best regards,
> >
> >    - Andy
> >
> > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > (via Tom White)
> >
>



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
