[ 
https://issues.apache.org/jira/browse/HBASE-6012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13289141#comment-13289141
 ] 

stack commented on HBASE-6012:
------------------------------

Chunhui, is this method only called when bulk assigning?  Why would there be a 
znode at all if we are bulk assigning?  Should we clean up znode state before 
we bulk assign?  Deleting the znode ahead of forcing it offline makes me 
nervous.  If this issue only started showing up because we added bulk assigning 
to SSH (HBASE-5914), then maybe SSH should clean up zk before it does the bulk 
assign, rather than doing this delete and then offline here?  What do you 
think?
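
To make the suggested ordering concrete, here is a minimal sketch of "clean up stale znodes first, then bulk assign with a plain create". It is a toy model only: a HashMap stands in for the unassigned znodes, and every name in it (BulkAssignSketch, createOffline, cleanStaleNodes, bulkAssign, the "CLOSING" state string, the region name "regionB") is hypothetical and not taken from AssignmentManager or ZKAssign.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model: a HashMap stands in for the unassigned znodes in ZooKeeper. */
class BulkAssignSketch {
    // Hypothetical stand-in for the unassigned directory: region name -> state.
    final Map<String, String> unassigned = new HashMap<>();

    /** Mimics a plain create: returns false if a node already exists. */
    boolean createOffline(String region) {
        return unassigned.putIfAbsent(region, "OFFLINE") == null;
    }

    /** Hypothetical cleanup run once before bulk assign; returns what it removed. */
    List<String> cleanStaleNodes(List<String> regions) {
        List<String> removed = new ArrayList<>();
        for (String region : regions) {
            if (unassigned.remove(region) != null) {
                removed.add(region);
            }
        }
        return removed;
    }

    /** Bulk assign; returns the regions whose OFFLINE node could not be created. */
    List<String> bulkAssign(List<String> regions) {
        List<String> failed = new ArrayList<>();
        for (String region : regions) {
            if (!createOffline(region)) {
                failed.add(region);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        List<String> regions = List.of("674da422fc7cb9a7d42c74499ace1d93", "regionB");

        // Without cleanup: a leftover node makes the plain create fail.
        BulkAssignSketch dirty = new BulkAssignSketch();
        dirty.unassigned.put("674da422fc7cb9a7d42c74499ace1d93", "CLOSING");
        System.out.println("failed without cleanup: " + dirty.bulkAssign(regions));

        // With cleanup first: every create succeeds, with no delete inside the assign path.
        BulkAssignSketch clean = new BulkAssignSketch();
        clean.unassigned.put("674da422fc7cb9a7d42c74499ace1d93", "CLOSING");
        clean.cleanStaleNodes(regions);
        System.out.println("failed with cleanup: " + clean.bulkAssign(regions));
    }
}
```

The point of the sketch is only the ordering: the cleanup happens once in the caller, so the per-region create never has to delete and recreate a node on its own.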
                
> AssignmentManager#asyncSetOfflineInZooKeeper wouldn't force node offline
> ------------------------------------------------------------------------
>
>                 Key: HBASE-6012
>                 URL: https://issues.apache.org/jira/browse/HBASE-6012
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.96.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6012.patch, HBASE-6012v2.patch
>
>
> As the javadoc of the method and the log message indicate:
> {code}
> /**
>    * Set region as OFFLINED up in zookeeper asynchronously.
>    */
> boolean asyncSetOfflineInZooKeeper(
> ...
> master.abort("Unexpected ZK exception creating/setting node OFFLINE", e);
> ...
> }
> {code}
> I think AssignmentManager#asyncSetOfflineInZooKeeper should also force the 
> node offline, just as AssignmentManager#setOfflineInZooKeeper does. Otherwise, 
> a bulk assign that calls this method may fail.
> Error log on the master caused by the issue:
> 2012-05-12 01:40:09,437 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
> was=writetest,1YTQDPGLXBTICHOPQ6IL,1336590857771.674da422fc7cb9a7d42c74499ace1d93.
>  state=PENDING_CLOSE, ts=1336757876856 
> 2012-05-12 01:40:09,437 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> master:60000-0x23736bf74780082 Async create of unassigned node for 
> 674da422fc7cb9a7d42c74499ace1d93 with OFFLINE state 
> 2012-05-12 01:40:09,446 WARN 
> org.apache.hadoop.hbase.master.AssignmentManager$CreateUnassignedAsyncCallback:
>  rc != 0 for /hbase-func1/unassigned/674da422fc7cb9a7d42c74499ace1d93 -- 
> retryable connectionloss -- FIX see 
> http://wiki.apache.org/hadoop/ZooKeeper/FAQ#A2 
> 2012-05-12 01:40:09,447 FATAL org.apache.hadoop.hbase.master.HMaster: 
> Connectionloss writing unassigned at 
> /hbase-func1/unassigned/674da422fc7cb9a7d42c74499ace1d93, rc=-110 
