ZooKeeper was up and running. I suspect this is a new problem, possibly due to some recent commits, and I am pretty sure it was not happening in an older version of the trunk code. As time permits, I will look into this flush issue.
Similarly, stopping and starting the region server does not allow the assignment to complete (as there is only one RS), so the table does not get assigned either. BTW, thanks Jerry for your help with this.

Regards
Ram

On Thu, Mar 12, 2015 at 10:54 AM, Jerry He <[email protected]> wrote:

> The master (coordinator) would clean up unfinished/aborted procedures on the
> ZK nodes.
> I wonder how the case you saw could happen. Was Zookeeper down at the same
> time?
> In the meantime, manually cleaning up the zk nodes (under
> /hbase/flush-table-proc/) would let you move forward for now.
>
> Thanks,
>
> Jerry
>
> On Wed, Mar 11, 2015 at 9:35 PM, ramkrishna vasudevan <[email protected]> wrote:
>
> > Yes. Before restarting the server I tried running a flush on a table.
> > Was that the reason for this?
> >
> > On Thu, Mar 12, 2015 at 5:03 AM, Jerry He <[email protected]> wrote:
> >
> > > Hi, Ram
> > >
> > > Could you tell a little more about the context of what happened? Were
> > > you running any flush table prior to the restart of the region server?
> > >
> > > Thanks,
> > >
> > > Jerry
> > >
> > > On Wed, Mar 11, 2015 at 4:07 AM, ramkrishna vasudevan <[email protected]> wrote:
> > >
> > > > Hi All
> > > >
> > > > The latest trunk hangs after we do a stop and start of the Region
> > > > Server with the following error
> > > >
> > > > org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable via stobdtserver3,16040,1426090566331:org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable: java.io.IOException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /hbase/flush-table-proc/acquired/TestTable/stobdtserver3,16040,1426090566331
> > > >     at org.apache.hadoop.hbase.errorhandling.ForeignException.deserialize(ForeignException.java:171)
> > > >     at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.abort(ZKProcedureMemberRpcs.java:329)
> > > >     at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.watchForAbortedProcedures(ZKProcedureMemberRpcs.java:142)
> > > >     at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.start(ZKProcedureMemberRpcs.java:352)
> > > >     at org.apache.hadoop.hbase.procedure.flush.RegionServerFlushTableProcedureManager.start(RegionServerFlushTableProcedureManager.java:102)
> > > >     at org.apache.hadoop.hbase.procedure.RegionServerProcedureManagerHost.start(RegionServerProcedureManagerHost.java:53)
> > > >     at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:882)
> > > >     at java.lang.Thread.run(Thread.java:745)
> > > > Caused by: org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable: java.io.IOException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /hbase/flush-table-proc/acquired/TestTable/stobdtserver3,16040,1426090566331
> > > >     at org.apache.hadoop.hbase.procedure.Subprocedure.cancel(Subprocedure.java:273)
> > > >     at org.apache.hadoop.hbase.procedure.ProcedureMember.controllerConnectionFailure(ProcedureMember.java:225)
> > > >     at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.sendMemberAcquired(ZKProcedureMemberRpcs.java:254)
> > > >     at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:166)
> > > >     at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:52)
> > > >     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> > > >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > > >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > > >
> > > > Even when we try to flush we get the above error. Because of this the
> > > > system hangs and we are not able to proceed with performing operations,
> > > > particularly after we restart the region server.
> > > >
> > > > I have a single RS and single master installation for internal testing.
> > > > Any hints on why this happens? It was not happening till the update that
> > > > I had taken 3 days back.
> > > >
> > > > Regards
> > > > Ram
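
For anyone hitting the same error: below is a rough sketch, in Python, of pulling the znode path out of the NoNodeException message, so it is clear which entry under /hbase/flush-table-proc/ to inspect before cleaning up by hand as suggested in the thread. The function name stale_znode and the three-part layout (state/procedure/member) are assumptions inferred only from the single path in the trace; the code does not connect to ZooKeeper and does not delete anything.

```python
import re

# Matches the znode path reported in ZooKeeper's NoNode error message.
NO_NODE = re.compile(r"KeeperErrorCode = NoNode for\s+(/\S+)")


def stale_znode(message, proc_root="/hbase/flush-table-proc"):
    """Return (path, procedure, member) for a NoNode error under proc_root, else None.

    Hypothetical helper: the layout <proc_root>/<state>/<procedure>/<member>
    is assumed from the one path seen in the stack trace.
    """
    match = NO_NODE.search(message)
    if match is None:
        return None
    path = match.group(1)
    if not path.startswith(proc_root + "/"):
        return None
    # Strip the root and split the remainder into its three assumed components.
    parts = path[len(proc_root) + 1:].split("/")
    if len(parts) != 3:
        return None
    _state, procedure, member = parts
    return path, procedure, member


error = (
    "KeeperErrorCode = NoNode for "
    "/hbase/flush-table-proc/acquired/TestTable/stobdtserver3,16040,1426090566331"
)
print(stale_znode(error))
```

The returned procedure name (here the table name) and member (the region server's name as it appears in the path) show which subtree the failed flush left behind; the actual removal would still be done with a ZooKeeper client of your choice.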
