[jira] [Commented] (HBASE-4357) Region in transition - in closing state

2012-01-06 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13181229#comment-13181229
 ] 

ramkrishna.s.vasudevan commented on HBASE-4357:
---

@Ming Ma
+1 on patch.  
When we started this discussion, the timeout monitor did nothing for unassign.
Now it also forcibly performs an unassign. :)


 Region in transition - in closing state
 ---

 Key: HBASE-4357
 URL: https://issues.apache.org/jira/browse/HBASE-4357
 Project: HBase
  Issue Type: Bug
Reporter: Ming Ma
Assignee: Ming Ma
 Attachments: HBASE-4357-0.92.patch


 Got the following during testing:
 1. On a given machine, kill the RS process. Then kill the HMaster process.
 2. Start the RS first via bin/hbase-daemon.sh --config ./conf start regionserver.
 Then start HMaster via bin/hbase-daemon.sh --config ./conf start master.
 One region of a table stayed in the CLOSING state.
 According to zookeeper,
 794a6ff17a4de0dd0a19b984ba18eea9 
 miweng_500region,H\xB49X\x10bM\xB1,1315338786464.794a6ff17a4de0dd0a19b984ba18eea9.
  state=CLOSING, ts=Wed Sep 07 17:21:44 PDT 2011 (75701s ago), 
 server=sea-esxi-0,6,1315428682281 
 According to the .META. table, the region has been reassigned from sea-esxi-0 to 
 sea-esxi-4.
 miweng_500region,H\xB49X\x10bM\xB1,1315338786464.794a6ff17a4de0dd0a19b984ba18eea9.
  sea-esxi-4:60030  H\xB49X\x10bM\xB1 I7K\xC6\xA7\xEF\x9D\x90 0 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4357) Region in transition - in closing state

2012-01-06 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13181437#comment-13181437
 ] 

Hadoop QA commented on HBASE-4357:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12509667/HBASE-4357-0.92.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/686//console

This message is automatically generated.

 Region in transition - in closing state
 ---

 Key: HBASE-4357
 URL: https://issues.apache.org/jira/browse/HBASE-4357
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Ming Ma
Assignee: Ming Ma
 Fix For: 0.92.0, 0.94.0

 Attachments: HBASE-4357-0.92.patch






[jira] [Commented] (HBASE-4357) Region in transition - in closing state

2012-01-06 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13181438#comment-13181438
 ] 

Zhihong Yu commented on HBASE-4357:
---

+1 on patch, if tests pass.

Minor comment:
{code}
-   * @return Transition znode to CLOSED state.
+   * @return if Transition znode to RS_ZK_REGION_FAILED_OPEN state succeeds or
+   *  not
{code}
The above should read '@return whether znode transition to ...'





[jira] [Commented] (HBASE-4357) Region in transition - in closing state

2012-01-06 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13181440#comment-13181440
 ] 

Zhihong Yu commented on HBASE-4357:
---

Please fix the following and submit new patch:
{code}
Hunk #6 FAILED at 111.
2 out of 6 hunks FAILED -- saving rejects to file 
src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestCloseRegionHandler.java.rej
{code}





[jira] [Commented] (HBASE-4357) Region in transition - in closing state

2012-01-06 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13181531#comment-13181531
 ] 

Hadoop QA commented on HBASE-4357:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12509708/HBASE-4357-0.92.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/688//console

This message is automatically generated.

 Region in transition - in closing state
 ---

 Key: HBASE-4357
 URL: https://issues.apache.org/jira/browse/HBASE-4357
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Ming Ma
Assignee: Ming Ma
 Fix For: 0.92.0, 0.94.0

 Attachments: HBASE-4357-0.92.patch, HBASE-4357-0.92.patch






[jira] [Commented] (HBASE-4357) Region in transition - in closing state

2012-01-06 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13181577#comment-13181577
 ] 

Hadoop QA commented on HBASE-4357:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12509710/4357.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 9 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -151 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 79 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hbase.mapreduce.TestImportTsv
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/689//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/689//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/689//console

This message is automatically generated.

 Region in transition - in closing state
 ---

 Key: HBASE-4357
 URL: https://issues.apache.org/jira/browse/HBASE-4357
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Ming Ma
Assignee: Ming Ma
 Fix For: 0.92.0, 0.94.0

 Attachments: 4357.txt, HBASE-4357-0.92.patch






[jira] [Commented] (HBASE-4357) Region in transition - in closing state

2012-01-06 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13181595#comment-13181595
 ] 

Zhihong Yu commented on HBASE-4357:
---

There were no hung tests.
The above test failures are known failures caused by MAPREDUCE-3583.

The latest patch is ready to be checked in.





[jira] [Commented] (HBASE-4357) Region in transition - in closing state

2011-09-30 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13117970#comment-13117970
 ] 

Ted Yu commented on HBASE-4357:
---

I think points 1 and 2 above are reasonable.





[jira] [Commented] (HBASE-4357) Region in transition - in closing state

2011-09-30 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13118239#comment-13118239
 ] 

ramkrishna.s.vasudevan commented on HBASE-4357:
---

@Ming
I am not sure about the 3rd point.
But even the 2nd point may not be needed, right?
In any case, if the timeout monitor sees the state as CLOSING, it will just 
update it to CLOSING again.
And if closeRegion is taking longer because of some bug, how do we determine 
after how long we should take corrective action?






[jira] [Commented] (HBASE-4357) Region in transition - in closing state

2011-09-29 Thread Ming Ma (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13117900#comment-13117900
 ] 

Ming Ma commented on HBASE-4357:


How about the following, similar to how OPENING is handled?

1. If closeRegion fails for some reason, the RS will transition the ZK state to a new 
state, ZK_FAILED_CLOSE. The AM will reissue the closeRegion request to the same RS 
right away.
2. The closeRegion operation could take a long time (doing a flush, etc.). It will call 
tickClosing so that the AM's timeoutMonitor won't kick in.
3. When the AM's timeoutMonitor kicks in, it will try to reissue closeRegion to the 
same RS. However, if the RS stays stuck closing the region due to some bug, I am not 
sure reissuing closeRegion will help much. Could the Master restart the RS instead?
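
Point 2 above can be sketched roughly as follows. This is only an in-memory illustration, not the HBase implementation: the class name ClosingRegionState and its methods are hypothetical, and in a real cluster the tick would presumably refresh the region's znode so that the master sees a newer timestamp.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical record of a region in CLOSING state and when it last reported progress. */
final class ClosingRegionState {
    private final AtomicLong lastUpdateMillis;

    ClosingRegionState(long nowMillis) {
        this.lastUpdateMillis = new AtomicLong(nowMillis);
    }

    /** Called by the (hypothetical) close handler while the close is still making progress. */
    void tick(long nowMillis) {
        lastUpdateMillis.set(nowMillis);
    }

    /** Called by the (hypothetical) monitor: true when no tick arrived within timeoutMillis. */
    boolean isTimedOut(long nowMillis, long timeoutMillis) {
        return nowMillis - lastUpdateMillis.get() > timeoutMillis;
    }
}
```

With this shape, a slow but healthy close keeps pushing the timestamp forward, so the monitor only acts on a region whose close has stopped reporting progress.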





[jira] [Commented] (HBASE-4357) Region in transition - in closing state

2011-09-12 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102478#comment-13102478
 ] 

ramkrishna.s.vasudevan commented on HBASE-4357:
---

@Ming ma
Yes what you say is correct.  Are you working on this issue?





[jira] [Commented] (HBASE-4357) Region in transition - in closing state

2011-09-12 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102973#comment-13102973
 ] 

stack commented on HBASE-4357:
--

@Ming In TRUNK, we have RecoverableZooKeeper.  It does the following when it is 
trying to get the version:

{code}
  /**
   * exists is an idempotent operation. Retry before throw out exception
   * @param path
   * @param watcher
   * @return
   * @throws KeeperException
   * @throws InterruptedException
   */
  public Stat exists(String path, Watcher watcher)
  throws KeeperException, InterruptedException {
    RetryCounter retryCounter = retryCounterFactory.create();
    while (true) {
      try {
        return zk.exists(path, watcher);
      } catch (KeeperException e) {
        switch (e.code()) {
          case CONNECTIONLOSS:
          case OPERATIONTIMEOUT:
            LOG.warn("Possibly transient ZooKeeper exception: " + e);
            if (!retryCounter.shouldRetry()) {
              LOG.error("ZooKeeper exists failed after "
                + retryCounter.getMaxRetries() + " retries");
              throw e;
            }
            break;

          default:
            throw e;
        }
      }
      LOG.info("The " + retryCounter.getAttemptTimes() + " times to retry " +
          "ZooKeeper after sleeping " + retryIntervalMillis + " ms");
      retryCounter.sleepUntilNextRetry();
      retryCounter.useRetry();
    }
  }
{code}

That is, it retries.

We should probably do your #2 above too.





[jira] [Commented] (HBASE-4357) Region in transition - in closing state

2011-09-09 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101273#comment-13101273
 ] 

Ming Ma commented on HBASE-4357:


Stack, it is the trunk. I don't know the root cause yet.





[jira] [Commented] (HBASE-4357) Region in transition - in closing state

2011-09-09 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101574#comment-13101574
 ] 

Ming Ma commented on HBASE-4357:


Here is the issue. It has nothing to do with master restart.

CloseRegionHandler.getCurrentVersion failed. Thus regionserver can't close the 
region properly. One reason it can't get data from zookeeper could be that 
there are lots of regions in transition.


11/09/07 17:21:48 WARN handler.CloseRegionHandler: Error getting node's version 
in CLOSING state, aborting close of 
miweng_500region,H\xB49X\x10bM\xB1,1315338786464.794a6ff17a4de0dd0a19b984ba18eea9.


Possible fixes:

1. Perhaps CloseRegionHandler.getCurrentVersion should retry on calls to 
ZKAssign.getVersion?
2. The timeout monitor doesn't do anything for a region that stays in CLOSING state 
for a long time. Perhaps it could try to repair it, for example by reissuing a 
closeRegion request to the RS?
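
Fix 1 above could look something like the sketch below. It is not the patch attached to this issue: RetryingVersionReader, NO_VERSION and getVersionWithRetry are hypothetical names, and the Callable stands in for whatever actually reads the znode version (for example a call to ZKAssign.getVersion). It assumes a negative return value or an exception means the read failed and may be transient.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper that retries a version lookup a bounded number of times. */
final class RetryingVersionReader {
    /** Hypothetical sentinel meaning "version could not be read". */
    static final int NO_VERSION = -1;

    /**
     * Calls fetch up to maxAttempts times, sleeping sleepMillis between
     * attempts, and returns the first non-negative version it gets.
     * Returns NO_VERSION if every attempt fails.
     */
    static int getVersionWithRetry(Callable<Integer> fetch,
            int maxAttempts, long sleepMillis) throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                Integer v = fetch.call();
                if (v != null && v >= 0) {
                    return v;
                }
            } catch (InterruptedException e) {
                // Never swallow interruption; let the caller abort the close.
                throw e;
            } catch (Exception e) {
                // Assumed to be possibly transient; fall through and retry.
            }
            if (attempt < maxAttempts) {
                Thread.sleep(sleepMillis);
            }
        }
        return NO_VERSION;
    }
}
```

The caller would still abort the close when NO_VERSION comes back, so the timeout monitor repair in point 2 is still needed as a backstop.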





[jira] [Commented] (HBASE-4357) Region in transition - in closing state

2011-09-08 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100912#comment-13100912
 ] 

stack commented on HBASE-4357:
--

What version of HBase, Ming?


