from:"Jira"


[ 
https://issues.apache.org/jira/browse/HBASE-5548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13266725#comment-13266725
 ] 

stack commented on HBASE-5548:
--

Hmm... did I?   It doesn't list the files above.  Let me check.

 Add ability to get a table in the shell
 ---

 Key: HBASE-5548
 URL: https://issues.apache.org/jira/browse/HBASE-5548
 Project: HBase
  Issue Type: Improvement
  Components: shell
Reporter: Jesse Yates
Assignee: Jesse Yates
 Fix For: 0.96.0

 Attachments: ruby_HBASE-5528-v0.patch, 
 ruby_HBASE-5548-addendum.patch, ruby_HBASE-5548-v1.patch, 
 ruby_HBASE-5548-v2.patch, ruby_HBASE-5548-v3.patch, ruby_HBASE-5548-v5.patch


 Currently, all the commands that operate on a table in the shell first have 
 to take the table as name as input. 
 There are two main considerations:
 * It is annoying to have to write the table name every time, when you should 
 just be able to get a reference to a table
 * the current implementation is very wasteful - it creates a new HTable for 
 each call (but reuses the connection since it uses the same configuration)
 We should be able to get a handle to a single HTable and then operate on that.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5548) Add ability to get a table in the shell


[ 
https://issues.apache.org/jira/browse/HBASE-5548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13266732#comment-13266732
 ] 

stack commented on HBASE-5548:
--

Ok.  Over in hbase-5840 I say I'm going to leave it in but instead I'm going to 
back it out so its easier on the fellows who are trying to follow behind us 
trying to make sense of our actions.

 Add ability to get a table in the shell
 ---

 Key: HBASE-5548
 URL: https://issues.apache.org/jira/browse/HBASE-5548
 Project: HBase
  Issue Type: Improvement
  Components: shell
Reporter: Jesse Yates
Assignee: Jesse Yates
 Fix For: 0.96.0

 Attachments: ruby_HBASE-5528-v0.patch, 
 ruby_HBASE-5548-addendum.patch, ruby_HBASE-5548-v1.patch, 
 ruby_HBASE-5548-v2.patch, ruby_HBASE-5548-v3.patch, ruby_HBASE-5548-v5.patch


 Currently, all the commands that operate on a table in the shell first have 
 to take the table as name as input. 
 There are two main considerations:
 * It is annoying to have to write the table name every time, when you should 
 just be able to get a reference to a table
 * the current implementation is very wasteful - it creates a new HTable for 
 each call (but reuses the connection since it uses the same configuration)
 We should be able to get a handle to a single HTable and then operate on that.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5548) Add ability to get a table in the shell


[ 
https://issues.apache.org/jira/browse/HBASE-5548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13266735#comment-13266735
 ] 

stack commented on HBASE-5548:
--

Backed out the miscommit of hbase-5840 that went in with this.  Sorry for the 
mess.  Thanks Ram for fingering it.

 Add ability to get a table in the shell
 ---

 Key: HBASE-5548
 URL: https://issues.apache.org/jira/browse/HBASE-5548
 Project: HBase
  Issue Type: Improvement
  Components: shell
Reporter: Jesse Yates
Assignee: Jesse Yates
 Fix For: 0.96.0

 Attachments: ruby_HBASE-5528-v0.patch, 
 ruby_HBASE-5548-addendum.patch, ruby_HBASE-5548-v1.patch, 
 ruby_HBASE-5548-v2.patch, ruby_HBASE-5548-v3.patch, ruby_HBASE-5548-v5.patch


 Currently, all the commands that operate on a table in the shell first have 
 to take the table as name as input. 
 There are two main considerations:
 * It is annoying to have to write the table name every time, when you should 
 just be able to get a reference to a table
 * the current implementation is very wasteful - it creates a new HTable for 
 each call (but reuses the connection since it uses the same configuration)
 We should be able to get a handle to a single HTable and then operate on that.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5840) Open Region FAILED_OPEN doesn't clear the TaskMonitor Status, keeps showing the old status


[ 
https://issues.apache.org/jira/browse/HBASE-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13266736#comment-13266736
 ] 

stack commented on HBASE-5840:
--

Changed my mind.  Backed out the miscommit of this trunk patch that went in w/ 
the commit of hbase-5548 by mistake.

So, this patch is still to be committed on trunk and branch.  Sorry for my mess.

 Open Region FAILED_OPEN doesn't clear the TaskMonitor Status, keeps showing 
 the old status
 --

 Key: HBASE-5840
 URL: https://issues.apache.org/jira/browse/HBASE-5840
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: rajeshbabu
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-5840.patch, HBASE-5840_trunk.patch, 
 HBASE-5840_v2.patch


 TaskMonitor Status will not be cleared in case Regions FAILED_OPEN. This will 
 keeps showing old status.
 This will miss leads the user.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5840) Open Region FAILED_OPEN doesn't clear the TaskMonitor Status, keeps showing the old status

2012-05-02 Thread ramkrishna.s.vasudevan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13266751#comment-13266751
 ] 

ramkrishna.s.vasudevan commented on HBASE-5840:
---

Thanks Stack.  Committed to trunk.  Waiting for Lars to confirm on 0.94.  Then 
will commit there.

 Open Region FAILED_OPEN doesn't clear the TaskMonitor Status, keeps showing 
 the old status
 --

 Key: HBASE-5840
 URL: https://issues.apache.org/jira/browse/HBASE-5840
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: rajeshbabu
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-5840.patch, HBASE-5840_trunk.patch, 
 HBASE-5840_v2.patch


 TaskMonitor Status will not be cleared in case Regions FAILED_OPEN. This will 
 keeps showing old status.
 This will miss leads the user.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5869) Move SplitLogManager splitlog taskstate and AssignmentManager RegionTransitionData znode datas to pb


[ 
https://issues.apache.org/jira/browse/HBASE-5869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13266765#comment-13266765
 ] 

Hudson commented on HBASE-5869:
---

Integrated in HBase-TRUNK #2836 (See 
[https://builds.apache.org/job/HBase-TRUNK/2836/])
HBASE-5869 Move SplitLogManager splitlog taskstate and AssignmentManager 
RegionTransitionData znode datas to pb (Revision 1333099)

 Result = SUCCESS
stack : 
Files : 
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/DeserializationException.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/EmptyWatcher.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HBaseException.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/RegionTransition.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ServerName.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/SplitLogCounters.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/SplitLogTask.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/executor/EventHandler.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/executor/ExecutorService.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/executor/RegionTransitionData.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/ActiveMasterManager.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ZooKeeperProtos.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/SplitLogWorker.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/EmptyWatcher.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/MasterAddressTracker.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/RootRegionTracker.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKSplitLog.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java
* /hbase/trunk/src/main/protobuf/ZooKeeper.proto
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestSerialization.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestZooKeeper.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/TestWALObserver.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/Mocking.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestMasterFailover.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestMasterNoCluster.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestSplitLogManager.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitLogWorker.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestCloseRegionHandler.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestOpenRegionHandler.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java


 Move SplitLogManager splitlog taskstate and AssignmentManager 
 RegionTransitionData znode datas to pb 
 -

 Key: HBASE-5869
 URL: https://issues.apache.org/jira/browse/HBASE-5869
 Project: HBase
  Issue Type: Task
Reporter: stack
Assignee: stack
 Fix For: 0.96.0

 Attachments: 5869v7.txt, 5869v8.txt, 5869v9.txt, firstcut.txt, 
 secondcut.txt, v10.txt, v11.txt, v12.txt, v13.txt, v13.txt, v4.txt, v5.txt, 
 v6.txt




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https

[jira] [Commented] (HBASE-5869) Move SplitLogManager splitlog taskstate and AssignmentManager RegionTransitionData znode datas to pb


[ 
https://issues.apache.org/jira/browse/HBASE-5869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13266773#comment-13266773
 ] 

stack commented on HBASE-5869:
--

No.  I committed the patch that passed hadoopqa.  Will do new issue to address 
your comments.

 Move SplitLogManager splitlog taskstate and AssignmentManager 
 RegionTransitionData znode datas to pb 
 -

 Key: HBASE-5869
 URL: https://issues.apache.org/jira/browse/HBASE-5869
 Project: HBase
  Issue Type: Task
Reporter: stack
Assignee: stack
 Fix For: 0.96.0

 Attachments: 5869v7.txt, 5869v8.txt, 5869v9.txt, firstcut.txt, 
 secondcut.txt, v10.txt, v11.txt, v12.txt, v13.txt, v13.txt, v4.txt, v5.txt, 
 v6.txt




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Created] (HBASE-5919) Add missing Ted review fixes for HBASE-5869

stack created HBASE-5919:


 Summary: Add missing Ted review fixes for HBASE-5869
 Key: HBASE-5919
 URL: https://issues.apache.org/jira/browse/HBASE-5919
 Project: HBase
  Issue Type: Bug
Reporter: stack
Priority: Blocker


I missed addressing a few of Ted's comments on the end of my navigating 
HBASE-5869 commit.  Fix here.  Make it a blocker.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5840) Open Region FAILED_OPEN doesn't clear the TaskMonitor Status, keeps showing the old status


[ 
https://issues.apache.org/jira/browse/HBASE-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13266816#comment-13266816
 ] 

Hudson commented on HBASE-5840:
---

Integrated in HBase-TRUNK #2837 (See 
[https://builds.apache.org/job/HBase-TRUNK/2837/])
HBASE-5840 Open Region FAILED_OPEN doesn't clear the TaskMonitor Status, 
keeps showing the old status (RajeshBabu) (Revision 1333124)
HBASE-5548 Add ability to get a table in the shell; BACKING OUT MISTAKEN 
CO-COMMIT OF HBASE-5840 (Revision 1333123)

 Result = SUCCESS
ramkrishna : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java

stack : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java


 Open Region FAILED_OPEN doesn't clear the TaskMonitor Status, keeps showing 
 the old status
 --

 Key: HBASE-5840
 URL: https://issues.apache.org/jira/browse/HBASE-5840
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: rajeshbabu
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-5840.patch, HBASE-5840_trunk.patch, 
 HBASE-5840_v2.patch


 TaskMonitor Status will not be cleared in case Regions FAILED_OPEN. This will 
 keeps showing old status.
 This will miss leads the user.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-2214) Do HBASE-1996 -- setting size to return in scan rather than count of rows -- properly


[ 
https://issues.apache.org/jira/browse/HBASE-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13266818#comment-13266818
 ] 

Hudson commented on HBASE-2214:
---

Integrated in HBase-TRUNK #2837 (See 
[https://builds.apache.org/job/HBase-TRUNK/2837/])
HBASE-2214 Do HBASE-1996 -- setting size to return in scan rather than 
count of rows -- properly (Ferdy Galema) (Revision 1333122)

 Result = SUCCESS
tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/AdminProtos.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ClientProtos.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/HBaseProtos.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/RPCProtos.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ZooKeeperProtos.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScanner.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServer.java
* /hbase/trunk/src/main/protobuf/Client.proto
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java


 Do HBASE-1996 -- setting size to return in scan rather than count of rows -- 
 properly
 -

 Key: HBASE-2214
 URL: https://issues.apache.org/jira/browse/HBASE-2214
 Project: HBase
  Issue Type: New Feature
Reporter: stack
Assignee: Ferdy Galema
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-2214-0.94-v2.txt, HBASE-2214-0.94-v3.txt, 
 HBASE-2214-0.94.txt, HBASE-2214-v4.txt, HBASE-2214-v5.txt, HBASE-2214-v6.txt, 
 HBASE-2214-v7.txt, HBASE-2214_with_broken_TestShell.txt


 The notion that you set size rather than row count specifying how many rows a 
 scanner should return in each cycle was raised over in hbase-1966.  Its a 
 good one making hbase regular though the data under it may vary.  
 HBase-1966 was committed but the patch was constrained by the fact that it 
 needed to not change RPC interface.  This issue is about doing hbase-1966 for 
 0.21 in a clean, unconstrained way.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5548) Add ability to get a table in the shell


[ 
https://issues.apache.org/jira/browse/HBASE-5548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13266817#comment-13266817
 ] 

Hudson commented on HBASE-5548:
---

Integrated in HBase-TRUNK #2837 (See 
[https://builds.apache.org/job/HBase-TRUNK/2837/])
HBASE-5548 Add ability to get a table in the shell; BACKING OUT MISTAKEN 
CO-COMMIT OF HBASE-5840 (Revision 1333123)

 Result = SUCCESS
stack : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java


 Add ability to get a table in the shell
 ---

 Key: HBASE-5548
 URL: https://issues.apache.org/jira/browse/HBASE-5548
 Project: HBase
  Issue Type: Improvement
  Components: shell
Reporter: Jesse Yates
Assignee: Jesse Yates
 Fix For: 0.96.0

 Attachments: ruby_HBASE-5528-v0.patch, 
 ruby_HBASE-5548-addendum.patch, ruby_HBASE-5548-v1.patch, 
 ruby_HBASE-5548-v2.patch, ruby_HBASE-5548-v3.patch, ruby_HBASE-5548-v5.patch


 Currently, all the commands that operate on a table in the shell first have 
 to take the table as name as input. 
 There are two main considerations:
 * It is annoying to have to write the table name every time, when you should 
 just be able to get a reference to a table
 * the current implementation is very wasteful - it creates a new HTable for 
 each call (but reuses the connection since it uses the same configuration)
 We should be able to get a handle to a single HTable and then operate on that.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-1996) Configure scanner buffer in bytes instead of number of rows


[ 
https://issues.apache.org/jira/browse/HBASE-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13266815#comment-13266815
 ] 

Hudson commented on HBASE-1996:
---

Integrated in HBase-TRUNK #2837 (See 
[https://builds.apache.org/job/HBase-TRUNK/2837/])
HBASE-2214 Do HBASE-1996 -- setting size to return in scan rather than 
count of rows -- properly (Ferdy Galema) (Revision 1333122)

 Result = SUCCESS
tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/AdminProtos.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ClientProtos.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/HBaseProtos.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/RPCProtos.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ZooKeeperProtos.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScanner.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServer.java
* /hbase/trunk/src/main/protobuf/Client.proto
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java


 Configure scanner buffer in bytes instead of number of rows
 ---

 Key: HBASE-1996
 URL: https://issues.apache.org/jira/browse/HBASE-1996
 Project: HBase
  Issue Type: Improvement
Reporter: Dave Latham
Assignee: Dave Latham
 Fix For: 0.90.0

 Attachments: 1966.patch, 1996-0.20.3-v2.patch, 1996-0.20.3-v3.patch, 
 1996-0.20.3.patch


 Currently, the default scanner fetches a single row at a time.  This makes 
 for very slow scans on tables where the rows are not large.  You can change 
 the setting for an HTable instance or for each Scan.
 It would be better to have a default that performs reasonably well so that 
 people stop running into slow scans because they are evaluating HBase, aren't 
 familiar with the setting, or simply forgot.  Unfortunately, if we increase 
 the value of the current setting, then we run the risk of running OOM for 
 tables with large rows.  Let's change the setting so that it works with a 
 size in bytes, rather than in rows.  This will allow us to set a reasonable 
 default so that tables with small rows will scan performantly and tables with 
 large rows will not run OOM.
 Note that the case is very similar to table writes as well.  When disabling 
 auto flush, we buffer a list of Put's to commit at once.  That buffer is 
 measured in bytes, so that a small number of large Puts or a lot of small 
 Puts can each fit in a single flush.  If that buffer were measured in number 
 of Put's it would have the same problem that we have for the scan buffer, and 
 we wouldn't be able to set a good default value for tables with different 
 size rows.  Changing the scan buffer to be configured like the write buffer 
 will make it more consistent.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5877) When a query fails because the region has moved, let the regionserver return the new address to the client

2012-05-02 Thread nkeywal (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13266827#comment-13266827
 ] 

nkeywal commented on HBASE-5877:


I have the same locally, so it's likely my patch...

 When a query fails because the region has moved, let the regionserver return 
 the new address to the client
 --

 Key: HBASE-5877
 URL: https://issues.apache.org/jira/browse/HBASE-5877
 Project: HBase
  Issue Type: Improvement
  Components: client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5877.v1.patch, 5877.v6.patch


 This is mainly useful when we do a rolling restart. This will decrease the 
 load on the master and the network load.
 Note that a region is not immediately opened after a close. So:
 - it seems preferable to wait before retrying on the other server. An 
 optimisation would be to have an heuristic depending on when the region was 
 closed.
 - during a rolling restart, the server moves the regions then stops. So we 
 may have failures when the server is stopped, and this patch won't help.
 The implementation in the first patch does:
 - on the region move, there is an added parameter on the regionserver#close 
 to say where we are sending the region
 - the regionserver keeps a list of what was moved. Each entry is kept 100 
 seconds.
 - the regionserver sends a specific exception when it receives a query on a 
 moved region. This exception contains the new address.
 - the client analyses the exeptions and update its cache accordingly...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5905) Protobuf interface for Admin: split between the internal and the external/customer interface

2012-05-02 Thread nkeywal (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13266839#comment-13266839
 ] 

nkeywal commented on HBASE-5905:


This would make sense if we think that customers should/will use the protobuf 
interface.

 Protobuf interface for Admin: split between the internal and the 
 external/customer interface
 

 Key: HBASE-5905
 URL: https://issues.apache.org/jira/browse/HBASE-5905
 Project: HBase
  Issue Type: Improvement
  Components: client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal

 After a short discussion with Stack, I create a jira.
 --
 I'am a little bit confused by the protobuf interface for closeRegion.
 We have two types of closeRegion today:
 1) the external ones; available in client.HBaseAdmin. They take the server 
 and the region identifier as a parameter and nothing else.
 2) The internal ones, called for example by the master. They have more 
 parameters (like versionOfClosingNode or transitionInZK).
 When I look at protobuf.ProtobufUtil, I see:
   public static void closeRegion(final AdminProtocol admin,
   final byte[] regionName, final boolean transitionInZK) throws 
 IOException {
 CloseRegionRequest closeRegionRequest =
   RequestConverter.buildCloseRegionRequest(regionName, transitionInZK);
 try {
   admin.closeRegion(null, closeRegionRequest);
 } catch (ServiceException se) {
   throw getRemoteException(se);
 }
   }
 In other words, it seems that we merged the two interfaces into a single one. 
 Is that the intend?
 I checked, the internal fields in closeRegionRequest are all optional (that's 
 good). Still, it means that the end user could use them or at least would 
 need to distinguish between the optional for functional reasons and the 
 optional - do not use.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-1996) Configure scanner buffer in bytes instead of number of rows


[ 
https://issues.apache.org/jira/browse/HBASE-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13266870#comment-13266870
 ] 

Hudson commented on HBASE-1996:
---

Integrated in HBase-0.94 #170 (See 
[https://builds.apache.org/job/HBase-0.94/170/])
HBASE-2214 Do HBASE-1996 -- setting size to return in scan rather than 
count of rows -- properly (Ferdy Galema) (Revision 1333157)

 Result = FAILURE
tedyu : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/Scan.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScanner.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java


 Configure scanner buffer in bytes instead of number of rows
 ---

 Key: HBASE-1996
 URL: https://issues.apache.org/jira/browse/HBASE-1996
 Project: HBase
  Issue Type: Improvement
Reporter: Dave Latham
Assignee: Dave Latham
 Fix For: 0.90.0

 Attachments: 1966.patch, 1996-0.20.3-v2.patch, 1996-0.20.3-v3.patch, 
 1996-0.20.3.patch


 Currently, the default scanner fetches a single row at a time.  This makes 
 for very slow scans on tables where the rows are not large.  You can change 
 the setting for an HTable instance or for each Scan.
 It would be better to have a default that performs reasonably well so that 
 people stop running into slow scans because they are evaluating HBase, aren't 
 familiar with the setting, or simply forgot.  Unfortunately, if we increase 
 the value of the current setting, then we run the risk of running OOM for 
 tables with large rows.  Let's change the setting so that it works with a 
 size in bytes, rather than in rows.  This will allow us to set a reasonable 
 default so that tables with small rows will scan performantly and tables with 
 large rows will not run OOM.
 Note that the case is very similar to table writes as well.  When disabling 
 auto flush, we buffer a list of Put's to commit at once.  That buffer is 
 measured in bytes, so that a small number of large Puts or a lot of small 
 Puts can each fit in a single flush.  If that buffer were measured in number 
 of Put's it would have the same problem that we have for the scan buffer, and 
 we wouldn't be able to set a good default value for tables with different 
 size rows.  Changing the scan buffer to be configured like the write buffer 
 will make it more consistent.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-2214) Do HBASE-1996 -- setting size to return in scan rather than count of rows -- properly


[ 
https://issues.apache.org/jira/browse/HBASE-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13266871#comment-13266871
 ] 

Hudson commented on HBASE-2214:
---

Integrated in HBase-0.94 #170 (See 
[https://builds.apache.org/job/HBase-0.94/170/])
HBASE-2214 Do HBASE-1996 -- setting size to return in scan rather than 
count of rows -- properly (Ferdy Galema) (Revision 1333157)

 Result = FAILURE
tedyu : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/Scan.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScanner.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java


 Do HBASE-1996 -- setting size to return in scan rather than count of rows -- 
 properly
 -

 Key: HBASE-2214
 URL: https://issues.apache.org/jira/browse/HBASE-2214
 Project: HBase
  Issue Type: New Feature
Reporter: stack
Assignee: Ferdy Galema
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-2214-0.94-v2.txt, HBASE-2214-0.94-v3.txt, 
 HBASE-2214-0.94.txt, HBASE-2214-v4.txt, HBASE-2214-v5.txt, HBASE-2214-v6.txt, 
 HBASE-2214-v7.txt, HBASE-2214_with_broken_TestShell.txt


 The notion that you set size rather than row count specifying how many rows a 
 scanner should return in each cycle was raised over in hbase-1966.  Its a 
 good one making hbase regular though the data under it may vary.  
 HBase-1966 was committed but the patch was constrained by the fact that it 
 needed to not change RPC interface.  This issue is about doing hbase-1966 for 
 0.21 in a clean, unconstrained way.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5907) enhance HLog pretty printer to print additional useful stats

2012-05-02 Thread Phabricator (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13266904#comment-13266904
 ] 

Phabricator commented on HBASE-5907:


mbautin has committed the revision [jira] [HBASE-5907] [89-fb] enhance HLog 
pretty printer to print additional useful stats.

REVISION DETAIL
  https://reviews.facebook.net/D2979

COMMIT
  https://reviews.facebook.net/rHBASEEIGHTNINEFBBRANCH1333198


 enhance HLog pretty printer to print additional useful stats
 

 Key: HBASE-5907
 URL: https://issues.apache.org/jira/browse/HBASE-5907
 Project: HBase
  Issue Type: Improvement
Reporter: Kannan Muthukkaruppan
Priority: Minor
 Attachments: D2979.1.patch, D2979.2.patch


 It would be useful for analysis purposes to enhance the HLog pretty printer 
 to optionally print a bunch of additional stats such as:
 1) # of txns
 2) # of KVs updated
 3) avg size of txns
 4) avg size of KVs
 5) avg # of KVs written per txn
 5) unique CF signatures involved in put/delete operatons; and breakdown of 
 some of the above metrics by these signatures, etc.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5879) Enable JMX metrics collection for the Thrift proxy

2012-05-02 Thread Phabricator (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13266903#comment-13266903
 ] 

Phabricator commented on HBASE-5879:


mbautin has committed the revision [jira] [HBASE-5879] [89-fb] Enable JMX 
metrics collection for the Thrift proxy.

REVISION DETAIL
  https://reviews.facebook.net/D2955

COMMIT
  https://reviews.facebook.net/rHBASEEIGHTNINEFBBRANCH1333194


 Enable JMX metrics collection for the Thrift proxy
 --

 Key: HBASE-5879
 URL: https://issues.apache.org/jira/browse/HBASE-5879
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5879_trunk.txt, D2955.1.patch


 We need to enable JMX on the Thrift proxy on a separate port different from 
 the JMX port used by regionserver. This is necessary for metrics collection.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-2625) Make testDynamicBloom()'s randomness deterministic


[ 
https://issues.apache.org/jira/browse/HBASE-2625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13266912#comment-13266912
 ] 

Hudson commented on HBASE-2625:
---

Integrated in HBase-TRUNK #2838 (See 
[https://builds.apache.org/job/HBase-TRUNK/2838/])
HBASE-2625 Avoid byte buffer allocations when reading a value from a Result 
object (Tudor Scurtu) (Revision 1333159)

 Result = SUCCESS
tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Result.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestKeyValue.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestResult.java


 Make testDynamicBloom()'s randomness deterministic
 

 Key: HBASE-2625
 URL: https://issues.apache.org/jira/browse/HBASE-2625
 Project: HBase
  Issue Type: Test
  Components: test
Affects Versions: 0.90.0
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Minor
 Fix For: 0.90.0

 Attachments: hbase-2625.patch


 Had a failure with testDynamicBloom on Hudson today.  Will investigate, 
 however it would be nice to reproduce the problem to make sure it's not the 
 fault of my test assumptions.  I plan to seed the Random number generator 
 with the current time and print that out for post-mortem analysis.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-4990) Document secure HBase setup


 [ 
https://issues.apache.org/jira/browse/HBASE-4990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4990:
-

Attachment: 4990v2.txt

Version two; worked on it w/ Andrew sitting beside me.

 Document secure HBase setup
 ---

 Key: HBASE-4990
 URL: https://issues.apache.org/jira/browse/HBASE-4990
 Project: HBase
  Issue Type: Sub-task
Affects Versions: 0.92.0
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Attachments: 4990.txt, 4990v2.txt




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Resolved] (HBASE-4990) Document secure HBase setup


 [ 
https://issues.apache.org/jira/browse/HBASE-4990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-4990.
--

   Resolution: Fixed
Fix Version/s: 0.96.0
 Hadoop Flags: Reviewed

Committed to trunk.

 Document secure HBase setup
 ---

 Key: HBASE-4990
 URL: https://issues.apache.org/jira/browse/HBASE-4990
 Project: HBase
  Issue Type: Sub-task
Affects Versions: 0.92.0
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.96.0

 Attachments: 4990.txt, 4990v2.txt




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-5919) Add missing Ted review fixes for HBASE-5869


 [ 
https://issues.apache.org/jira/browse/HBASE-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5919:
-

Attachment: 5919.txt

Address issues raised by Ted over in hbase-5869 that I did not want to address 
there because the criticims' were relatively nits for a patch of  300k.  I'd 
gotten two clean hadoopqa builds on v13... and was afraid my patch would fail 
to apply if I did more hadoopqa cycles.

 Add missing Ted review fixes for HBASE-5869
 ---

 Key: HBASE-5919
 URL: https://issues.apache.org/jira/browse/HBASE-5919
 Project: HBase
  Issue Type: Bug
Reporter: stack
Priority: Blocker
 Attachments: 5919.txt


 I missed addressing a few of Ted's comments on the end of my navigating 
 HBASE-5869 commit.  Fix here.  Make it a blocker.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-5919) Add missing Ted review fixes for HBASE-5869


 [ 
https://issues.apache.org/jira/browse/HBASE-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5919:
-

Status: Patch Available  (was: Open)

 Add missing Ted review fixes for HBASE-5869
 ---

 Key: HBASE-5919
 URL: https://issues.apache.org/jira/browse/HBASE-5919
 Project: HBase
  Issue Type: Bug
Reporter: stack
Priority: Blocker
 Attachments: 5919.txt


 I missed addressing a few of Ted's comments on the end of my navigating 
 HBASE-5869 commit.  Fix here.  Make it a blocker.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5919) Add missing Ted review fixes for HBASE-5869


[ 
https://issues.apache.org/jira/browse/HBASE-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13266981#comment-13266981
 ] 

stack commented on HBASE-5919:
--

Review how this method is called.  Look up in the file two methods.

 Add missing Ted review fixes for HBASE-5869
 ---

 Key: HBASE-5919
 URL: https://issues.apache.org/jira/browse/HBASE-5919
 Project: HBase
  Issue Type: Bug
Reporter: stack
Priority: Blocker
 Attachments: 5919.txt


 I missed addressing a few of Ted's comments on the end of my navigating 
 HBASE-5869 commit.  Fix here.  Make it a blocker.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-4990) Document secure HBase setup


[ 
https://issues.apache.org/jira/browse/HBASE-4990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13266991#comment-13266991
 ] 

Hudson commented on HBASE-4990:
---

Integrated in HBase-TRUNK #2839 (See 
[https://builds.apache.org/job/HBase-TRUNK/2839/])
HBASE-4990 Document secure HBase setup (Revision 1333212)

 Result = SUCCESS
stack : 
Files : 
* /hbase/trunk/src/docbkx/book.xml
* /hbase/trunk/src/docbkx/security.xml


 Document secure HBase setup
 ---

 Key: HBASE-4990
 URL: https://issues.apache.org/jira/browse/HBASE-4990
 Project: HBase
  Issue Type: Sub-task
Affects Versions: 0.92.0
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.96.0

 Attachments: 4990.txt, 4990v2.txt




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5922) HalfStoreFileReader seekBefore causes StackOverflowError


[ 
https://issues.apache.org/jira/browse/HBASE-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13266998#comment-13266998
 ] 

stack commented on HBASE-5922:
--

When does this issue arise?  When we try to getclosest on split key?

 HalfStoreFileReader seekBefore causes StackOverflowError
 

 Key: HBASE-5922
 URL: https://issues.apache.org/jira/browse/HBASE-5922
 Project: HBase
  Issue Type: Bug
  Components: client, io
Affects Versions: 0.90.0
 Environment: HBase 0.90.4
Reporter: Nate Putnam
Assignee: Nate Putnam
 Fix For: 0.90.0

 Attachments: HBase-5922.patch


 Calling HRegionServer.getClosestRowBefore() can cause a stack overflow if the 
 underlying store file is a reference and the row key is in the bottom.
 java.io.IOException: java.io.IOException: java.lang.StackOverflowError
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:990)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:978)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1651)
 at sun.reflect.GeneratedMethodAccessor174.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
 at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
 Caused by: java.lang.StackOverflowError
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:147)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5922) HalfStoreFileReader seekBefore causes StackOverflowError


[ 
https://issues.apache.org/jira/browse/HBASE-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267002#comment-13267002
 ] 

stack commented on HBASE-5922:
--

{code}
===
--- src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java   
(revision 1169834)
+++ src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java   
(revision )
@@ -146,7 +146,7 @@
 } else {
   if (getComparator().compare(key, offset, length, splitkey, 0,
   splitkey.length) = 0) {
-return seekBefore(splitkey, 0, splitkey.length);
+return false;
   }
{code}

So, if result is  splitKey, we need to get something before the splitkey, not 
fail?

Maybe we should check what comes out of the seekBefore  how is it that its 
returning us splitkey again?

 HalfStoreFileReader seekBefore causes StackOverflowError
 

 Key: HBASE-5922
 URL: https://issues.apache.org/jira/browse/HBASE-5922
 Project: HBase
  Issue Type: Bug
  Components: client, io
Affects Versions: 0.90.0
 Environment: HBase 0.90.4
Reporter: Nate Putnam
Assignee: Nate Putnam
 Fix For: 0.90.0

 Attachments: HBase-5922.patch


 Calling HRegionServer.getClosestRowBefore() can cause a stack overflow if the 
 underlying store file is a reference and the row key is in the bottom.
 java.io.IOException: java.io.IOException: java.lang.StackOverflowError
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:990)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:978)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1651)
 at sun.reflect.GeneratedMethodAccessor174.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
 at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
 Caused by: java.lang.StackOverflowError
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:147)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5922) HalfStoreFileReader seekBefore causes StackOverflowError


[ 
https://issues.apache.org/jira/browse/HBASE-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267015#comment-13267015
 ] 

stack commented on HBASE-5922:
--

Hmm... maybe your fix is right.  There are other tests of half file, ones we 
added when we ran into issues w/ it in the past.  Lets try your patch against 
hadoopqa.

 HalfStoreFileReader seekBefore causes StackOverflowError
 

 Key: HBASE-5922
 URL: https://issues.apache.org/jira/browse/HBASE-5922
 Project: HBase
  Issue Type: Bug
  Components: client, io
Affects Versions: 0.90.0
 Environment: HBase 0.90.4
Reporter: Nate Putnam
Assignee: Nate Putnam
 Fix For: 0.90.0

 Attachments: HBase-5922.patch


 Calling HRegionServer.getClosestRowBefore() can cause a stack overflow if the 
 underlying store file is a reference and the row key is in the bottom.
 java.io.IOException: java.io.IOException: java.lang.StackOverflowError
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:990)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:978)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1651)
 at sun.reflect.GeneratedMethodAccessor174.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
 at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
 Caused by: java.lang.StackOverflowError
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:147)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5444) Add PB-based calls to HMasterRegionInterface


[ 
https://issues.apache.org/jira/browse/HBASE-5444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267040#comment-13267040
 ] 

stack commented on HBASE-5444:
--

Is v6 what is up in rb?  And you've submitted it to hadoopqa?  Thanks Gregory.

 Add PB-based calls to HMasterRegionInterface
 

 Key: HBASE-5444
 URL: https://issues.apache.org/jira/browse/HBASE-5444
 Project: HBase
  Issue Type: Sub-task
  Components: ipc, master, migration, regionserver
Reporter: Todd Lipcon
Assignee: Gregory Chanan
 Attachments: HBASE-5444-v6-trunk.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Assigned] (HBASE-5919) Add fixes for Ted's review comments from HBASE-5869


 [ 
https://issues.apache.org/jira/browse/HBASE-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack reassigned HBASE-5919:


Assignee: Ted Yu

I thought this was my issue but you seem to have taken it  over Ted.  Assigning 
you.

 Add fixes for Ted's review comments from HBASE-5869
 ---

 Key: HBASE-5919
 URL: https://issues.apache.org/jira/browse/HBASE-5919
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: Ted Yu
Priority: Blocker
 Attachments: 5919-v2.txt, 5919-v4.txt, 5919.txt


 I missed addressing a few of Ted's comments on the end of my navigating 
 HBASE-5869 commit.  Fix here.  Make it a blocker.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5914) Bulk assign regions in the process of ServerShutdownHandler

2012-05-02 Thread ramkrishna.s.vasudevan (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267154#comment-13267154
]

ramkrishna.s.vasudevan commented on HBASE-5914:
---

@Chunhui
Yes, in SSH case retain assignment is not valid. But what if i don't need the
roundrobin assignment itself to be called like how we have in master startup? I
just meant a similar configuration like the
'hbase.master.startup.retainassign', for eg.,
'hbase.master.servershutdown.roundrobinassign'something like this. Just my
thoughts.

Bulk assign regions in the process of ServerShutdownHandler
---

Key: HBASE-5914
URL: https://issues.apache.org/jira/browse/HBASE-5914
Project: HBase
Issue Type: Improvement
Reporter: chunhui shen
Assignee: chunhui shen
Fix For: 0.96.0

Attachments: HBASE-5914.patch, HBASE-5914v2.patch

In the process of ServerShutdownHandler, we currently assign regions singly.
In the large cluster, one regionserver always carried many regions, this
action is quite slow.
What about using bulk assign regions like cluster start up.
In current logic, if we failed assigning many regions to one destination
server, we will wait unitl timeout,
however in the process of ServerShutdownHandler, we should retry it to
another server.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5444) Add PB-based calls to HMasterRegionInterface


[ 
https://issues.apache.org/jira/browse/HBASE-5444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267156#comment-13267156
 ] 

stack commented on HBASE-5444:
--

Sorry Gregory, I meant to ask, would you like me commit this and then in 
separate issues work on the outstanding stuff or do you want to update a new 
patch?

 Add PB-based calls to HMasterRegionInterface
 

 Key: HBASE-5444
 URL: https://issues.apache.org/jira/browse/HBASE-5444
 Project: HBase
  Issue Type: Sub-task
  Components: ipc, master, migration, regionserver
Reporter: Todd Lipcon
Assignee: Gregory Chanan
 Attachments: HBASE-5444-v6-trunk.patch, HBASE-5444-v9-trunk.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5919) Add fixes for Ted's review comments from HBASE-5869


[ 
https://issues.apache.org/jira/browse/HBASE-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267163#comment-13267163
 ] 

stack commented on HBASE-5919:
--

None.

 Add fixes for Ted's review comments from HBASE-5869
 ---

 Key: HBASE-5919
 URL: https://issues.apache.org/jira/browse/HBASE-5919
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: Ted Yu
Priority: Blocker
 Attachments: 5919-v2.txt, 5919-v4.txt, 5919.txt


 I missed addressing a few of Ted's comments on the end of my navigating 
 HBASE-5869 commit.  Fix here.  Make it a blocker.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5914) Bulk assign regions in the process of ServerShutdownHandler

2012-05-02 Thread ramkrishna.s.vasudevan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267164#comment-13267164
 ] 

ramkrishna.s.vasudevan commented on HBASE-5914:
---

Fine Ted.  I agree.

 Bulk assign regions in the process of ServerShutdownHandler
 ---

 Key: HBASE-5914
 URL: https://issues.apache.org/jira/browse/HBASE-5914
 Project: HBase
  Issue Type: Improvement
Reporter: chunhui shen
Assignee: chunhui shen
 Fix For: 0.96.0

 Attachments: HBASE-5914.patch, HBASE-5914v2.patch


 In the process of ServerShutdownHandler, we currently assign regions singly.
 In the large cluster, one regionserver always carried many regions, this 
 action is  quite slow.
 What about using bulk assign regions like cluster start up.
 In current logic,  if we failed assigning many regions to one destination 
 server, we will wait unitl timeout, 
 however in the process of ServerShutdownHandler, we should retry it to 
 another server.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5918) Master will block forever when startup if root server died between assign root and assign meta

2012-05-02 Thread ramkrishna.s.vasudevan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267175#comment-13267175
 ] 

ramkrishna.s.vasudevan commented on HBASE-5918:
---

+1 on patch.  

 Master will block forever when startup if root server died between assign 
 root and assign meta
 --

 Key: HBASE-5918
 URL: https://issues.apache.org/jira/browse/HBASE-5918
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.1
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-5918.patch, HBASE-5918.patch


 When master is initializing, if root server died between assign root and 
 assign meta, master will block at 
 HMaster#assignRootAndMeta:{code}assignmentManager.assignMeta();
 this.catalogTracker.waitForMeta();{code}
 because ServerShutdownHandler is disabled,
 So we should enable ServerShutdownHandler after called 
 assignmentManager.assignMeta();

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-5922) HalfStoreFileReader seekBefore causes StackOverflowError


 [ 
https://issues.apache.org/jira/browse/HBASE-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5922:
-

Attachment: HBASE-5922.patch

Retry

 HalfStoreFileReader seekBefore causes StackOverflowError
 

 Key: HBASE-5922
 URL: https://issues.apache.org/jira/browse/HBASE-5922
 Project: HBase
  Issue Type: Bug
  Components: client, io
Affects Versions: 0.90.0
 Environment: HBase 0.90.4
Reporter: Nate Putnam
Assignee: Nate Putnam
 Fix For: 0.90.0

 Attachments: HBASE-5922.patch, HBASE-5922.patch


 Calling HRegionServer.getClosestRowBefore() can cause a stack overflow if the 
 underlying store file is a reference and the row key is in the bottom.
 java.io.IOException: java.io.IOException: java.lang.StackOverflowError
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:990)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:978)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1651)
 at sun.reflect.GeneratedMethodAccessor174.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
 at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
 Caused by: java.lang.StackOverflowError
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:147)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-5922) HalfStoreFileReader seekBefore causes StackOverflowError


 [ 
https://issues.apache.org/jira/browse/HBASE-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5922:
-

Status: Patch Available  (was: Open)

 HalfStoreFileReader seekBefore causes StackOverflowError
 

 Key: HBASE-5922
 URL: https://issues.apache.org/jira/browse/HBASE-5922
 Project: HBase
  Issue Type: Bug
  Components: client, io
Affects Versions: 0.90.0
 Environment: HBase 0.90.4
Reporter: Nate Putnam
Assignee: Nate Putnam
 Fix For: 0.90.0

 Attachments: HBASE-5922.patch, HBASE-5922.patch


 Calling HRegionServer.getClosestRowBefore() can cause a stack overflow if the 
 underlying store file is a reference and the row key is in the bottom.
 java.io.IOException: java.io.IOException: java.lang.StackOverflowError
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:990)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:978)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1651)
 at sun.reflect.GeneratedMethodAccessor174.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
 at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
 Caused by: java.lang.StackOverflowError
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:147)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-5922) HalfStoreFileReader seekBefore causes StackOverflowError


 [ 
https://issues.apache.org/jira/browse/HBASE-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5922:
-

Status: Open  (was: Patch Available)

 HalfStoreFileReader seekBefore causes StackOverflowError
 

 Key: HBASE-5922
 URL: https://issues.apache.org/jira/browse/HBASE-5922
 Project: HBase
  Issue Type: Bug
  Components: client, io
Affects Versions: 0.90.0
 Environment: HBase 0.90.4
Reporter: Nate Putnam
Assignee: Nate Putnam
 Fix For: 0.90.0

 Attachments: HBASE-5922.patch, HBASE-5922.patch


 Calling HRegionServer.getClosestRowBefore() can cause a stack overflow if the 
 underlying store file is a reference and the row key is in the bottom.
 java.io.IOException: java.io.IOException: java.lang.StackOverflowError
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:990)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:978)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1651)
 at sun.reflect.GeneratedMethodAccessor174.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
 at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
 Caused by: java.lang.StackOverflowError
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:147)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5913) Speed up the full scan of META


[ 
https://issues.apache.org/jira/browse/HBASE-5913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267178#comment-13267178
 ] 

stack commented on HBASE-5913:
--

+1 on patch

 Speed up the full scan of META
 --

 Key: HBASE-5913
 URL: https://issues.apache.org/jira/browse/HBASE-5913
 Project: HBase
  Issue Type: Improvement
Reporter: chunhui shen
Assignee: chunhui shen
 Fix For: 0.96.0, 0.94.1

 Attachments: 5913-v2.txt, HBASE-5913.patch


 In the master, we will do the full scan of META in some situations
 for example,
 1.master start up
 2.CatalogJanitor do the full scan per 5 mins
 3.ServerShutdownHandler, getServerUserRegions for dead server.
 For the online applications, we should try the best to reduce the process 
 time of ServerShutdownHandler in the situation 3. 
 However, we found MetaReader#getServerUserRegions take 14mins for 10w regions 
 in our production environment.
 And it is caused by two reasons:
 The first, we don't use cache and get one row per next() when fully scan 
 .META.
 The second, hbase.ipc.client.tcpnodelay is false as default, and in our 
 environment it take 40ms for per next() (It is related to the length of row 
 in the .META. , if someone also found, could try to set it true)
 For this issue, I think we could set the caching when do the full scan of META

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5913) Speed up the full scan of META


[ 
https://issues.apache.org/jira/browse/HBASE-5913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267179#comment-13267179
 ] 

Hudson commented on HBASE-5913:
---

Integrated in HBase-TRUNK #2840 (See 
[https://builds.apache.org/job/HBase-TRUNK/2840/])
HBASE-5913 Speed up the full scan of META (Chunhui) (Revision 1333283)

 Result = SUCCESS
tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java


 Speed up the full scan of META
 --

 Key: HBASE-5913
 URL: https://issues.apache.org/jira/browse/HBASE-5913
 Project: HBase
  Issue Type: Improvement
Reporter: chunhui shen
Assignee: chunhui shen
 Fix For: 0.96.0, 0.94.1

 Attachments: 5913-v2.txt, HBASE-5913.patch


 In the master, we will do the full scan of META in some situations
 for example,
 1.master start up
 2.CatalogJanitor do the full scan per 5 mins
 3.ServerShutdownHandler, getServerUserRegions for dead server.
 For the online applications, we should try the best to reduce the process 
 time of ServerShutdownHandler in the situation 3. 
 However, we found MetaReader#getServerUserRegions take 14mins for 10w regions 
 in our production environment.
 And it is caused by two reasons:
 The first, we don't use cache and get one row per next() when fully scan 
 .META.
 The second, hbase.ipc.client.tcpnodelay is false as default, and in our 
 environment it take 40ms for per next() (It is related to the length of row 
 in the .META. , if someone also found, could try to set it true)
 For this issue, I think we could set the caching when do the full scan of META

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-5916) RS restart just before master intialization we make the cluster non operative

2012-05-02 Thread ramkrishna.s.vasudevan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-5916:
--

Priority: Critical  (was: Major)

Upped the priority of this defect. While master is coming up
{code}
1-Wait for region server to register
2-Get the online server list
3- Start splitting the logs
{code}

Between step 2 and 3 if another new region server registers, we just split the 
logs of the new region server and in fact delete the HLog folder for that new 
region server. This seems critical. 
While analysing the issue for  which this JIRA was created we ended up in this 
problem.


 RS restart just before master intialization we make the cluster non operative
 -

 Key: HBASE-5916
 URL: https://issues.apache.org/jira/browse/HBASE-5916
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.94.1


 Consider a case where my master is getting restarted.  RS that was alive when 
 the master restart started, gets restarted before the master initializes the 
 ServerShutDownHandler.
 {code}
 serverShutdownHandlerEnabled = true;
 {code}
 In this case when the RS tries to register with the master, the master will 
 try to expire the server but the server cannot be expired as still the 
 serverShutdownHandler is not enabled.
 This case may happen when i have only one RS gets restarted or all the RS 
 gets restarted at the same time.(before assignRootandMeta).
 {code}
 LOG.info(message);
   if (existingServer.getStartcode()  serverName.getStartcode()) {
 LOG.info(Triggering server recovery; existingServer  +
   existingServer +  looks stale, new server: + serverName);
 expireServer(existingServer);
   }
 {code}
 If another RS is brought up then the cluster comes back to normalcy.
 May be a very corner case.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5918) Master will block forever when startup if root server died between assign root and assign meta


[ 
https://issues.apache.org/jira/browse/HBASE-5918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267187#comment-13267187
 ] 

stack commented on HBASE-5918:
--

Shouldn't we remove the setting of this flag that happens later in 
finishInitialization?

+1 on the patch otherwise.  This is good stuff.

Any chance of a test?  It looks like it'd be hard to get one in here but it be 
good if you fellas at least said why a test is hard to squeeze in here to show 
you at least tried figuring how to test this stuff.  The flag disabling 
shutdown handler was only added recently but here we find an issue w/ it 
already.

{code}
Author: Michael Stack st...@apache.org  2012-03-13 08:35:54
Committer: Michael Stack st...@apache.org  2012-03-13 08:35:54
Parent: fbd4bebd5cca129f49e91ec9936f604998a7025a (HBASE-5314 racefully rolling 
restart region servers in rolling-restart.sh)
Child:  59e5460807a1dc0fb5763e4b12dda4be49ef3bb4 (HBASE-5574 
DEFAULT_MAX_FILE_SIZE defaults to a negative value)
Branches: 094.testfail, 5833trunk, hanging, pbwork, 
remotes/origin/instant_schema_alter, remotes/origin/trunk, v10, v4, v6
Follows: 
Precedes: 

HBASE-5179 Handle potential data loss due to concurrent processing of 
processFaileOver and ServerShutdownHandler

git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1300194 
13f79535-47bb-0310-9956-ffa450edef68

- src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java -
inde
{code}


 Master will block forever when startup if root server died between assign 
 root and assign meta
 --

 Key: HBASE-5918
 URL: https://issues.apache.org/jira/browse/HBASE-5918
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.1
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-5918.patch, HBASE-5918.patch


 When master is initializing, if root server died between assign root and 
 assign meta, master will block at 
 HMaster#assignRootAndMeta:{code}assignmentManager.assignMeta();
 this.catalogTracker.waitForMeta();{code}
 because ServerShutdownHandler is disabled,
 So we should enable ServerShutdownHandler after called 
 assignmentManager.assignMeta();

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5444) Add PB-based calls to HMasterRegionInterface


[ 
https://issues.apache.org/jira/browse/HBASE-5444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267189#comment-13267189
 ] 

stack commented on HBASE-5444:
--

@Gregory np  Please file the other issues.  Waiting on hadoopqa before 
committing.  Good stuff.

 Add PB-based calls to HMasterRegionInterface
 

 Key: HBASE-5444
 URL: https://issues.apache.org/jira/browse/HBASE-5444
 Project: HBase
  Issue Type: Sub-task
  Components: ipc, master, migration, regionserver
Reporter: Todd Lipcon
Assignee: Gregory Chanan
 Attachments: HBASE-5444-v10-trunk.patch, HBASE-5444-v6-trunk.patch, 
 HBASE-5444-v9-trunk.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5918) Master will block forever when startup if root server died between assign root and assign meta

2012-05-02 Thread ramkrishna.s.vasudevan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267192#comment-13267192
 ] 

ramkrishna.s.vasudevan commented on HBASE-5918:
---

Even HBASE-5916 is also due to this.  I will try on working on a testcase.

 Master will block forever when startup if root server died between assign 
 root and assign meta
 --

 Key: HBASE-5918
 URL: https://issues.apache.org/jira/browse/HBASE-5918
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.1
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-5918.patch, HBASE-5918.patch


 When master is initializing, if root server died between assign root and 
 assign meta, master will block at 
 HMaster#assignRootAndMeta:{code}assignmentManager.assignMeta();
 this.catalogTracker.waitForMeta();{code}
 because ServerShutdownHandler is disabled,
 So we should enable ServerShutdownHandler after called 
 assignmentManager.assignMeta();

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5916) RS restart just before master intialization we make the cluster non operative


[ 
https://issues.apache.org/jira/browse/HBASE-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267198#comment-13267198
 ] 

stack commented on HBASE-5916:
--

Tell us more Ram.  If new regionserver comes in and we split its logs, so what? 
 It is not carrying any regions, right?  We'll assign it regions when done 
splitting?  Or are you talking about a case where a regionserver is just very 
slow about checking in?

 RS restart just before master intialization we make the cluster non operative
 -

 Key: HBASE-5916
 URL: https://issues.apache.org/jira/browse/HBASE-5916
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.94.1


 Consider a case where my master is getting restarted.  RS that was alive when 
 the master restart started, gets restarted before the master initializes the 
 ServerShutDownHandler.
 {code}
 serverShutdownHandlerEnabled = true;
 {code}
 In this case when the RS tries to register with the master, the master will 
 try to expire the server but the server cannot be expired as still the 
 serverShutdownHandler is not enabled.
 This case may happen when i have only one RS gets restarted or all the RS 
 gets restarted at the same time.(before assignRootandMeta).
 {code}
 LOG.info(message);
   if (existingServer.getStartcode()  serverName.getStartcode()) {
 LOG.info(Triggering server recovery; existingServer  +
   existingServer +  looks stale, new server: + serverName);
 expireServer(existingServer);
   }
 {code}
 If another RS is brought up then the cluster comes back to normalcy.
 May be a very corner case.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5916) RS restart just before master intialization we make the cluster non operative

2012-05-02 Thread ramkrishna.s.vasudevan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267205#comment-13267205
 ] 

ramkrishna.s.vasudevan commented on HBASE-5916:
---

Yes the RS is slow in checking in.
The problem here is the master will do the split of the newly checked in RS as 
he is the RS that is not there in the online list.
{code}
SetServerName onlineServers = new HashSetServerName(serverManager
.getOnlineServers().keySet());
// TODO: Should do this in background rather than block master startup
status.setStatus(Splitting logs after master startup);
splitLogAfterStartup(this.fileSystemManager, onlineServers);
{code}
To split if i find any log folder which is not belonging to any of those in 
'onlineServers'(the online list is already got) we will call split log and 
finally delete the log folder.  So though the server is online i will not be 
able to use the hLog and i get filenotfoundException. Sorry if am not clear.


 RS restart just before master intialization we make the cluster non operative
 -

 Key: HBASE-5916
 URL: https://issues.apache.org/jira/browse/HBASE-5916
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.94.1


 Consider a case where my master is getting restarted.  RS that was alive when 
 the master restart started, gets restarted before the master initializes the 
 ServerShutDownHandler.
 {code}
 serverShutdownHandlerEnabled = true;
 {code}
 In this case when the RS tries to register with the master, the master will 
 try to expire the server but the server cannot be expired as still the 
 serverShutdownHandler is not enabled.
 This case may happen when i have only one RS gets restarted or all the RS 
 gets restarted at the same time.(before assignRootandMeta).
 {code}
 LOG.info(message);
   if (existingServer.getStartcode()  serverName.getStartcode()) {
 LOG.info(Triggering server recovery; existingServer  +
   existingServer +  looks stale, new server: + serverName);
 expireServer(existingServer);
   }
 {code}
 If another RS is brought up then the cluster comes back to normalcy.
 May be a very corner case.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5875) Process RIT and Master restart may remove an online server considering it as a dead server

[
https://issues.apache.org/jira/browse/HBASE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267209#comment-13267209
]

stack commented on HBASE-5875:
--

The patch looks dodgy -- saying a region is online, if it is root or meta,
seems incorrect.

bq. Consider the case where my ROOT node is found in RIT. Hence the processRIT
will trigger the assignment.

What is the above referring to? Which part of the code?

bq. It so happened that when i try to verifyRootRegionLocation the root node is
created but the OpenRegionHandler has not added the ROOT region in its
memory(very very corner case and this happened once while testing). So the
verifyRootRegionLocation returns false and hence the master thinks it an server
to be expired.

Can the master not detect this corner case just by looking at whats in zk?

Process RIT and Master restart may remove an online server considering it as
a dead server
--

Key: HBASE-5875
URL: https://issues.apache.org/jira/browse/HBASE-5875
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.1
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Fix For: 0.94.1

Attachments: HBASE-5875.patch

If on master restart it finds the ROOT/META to be in RIT state, master tries
to assign the ROOT region through ProcessRIT.
Master will trigger the assignment and next will try to verify the Root
Region Location.
Root region location verification is done seeing if the RS has the region in
its online list.
If the master triggered assignment has not yet been completed in RS then the
verify root region location will fail.
Because it failed
{code}
splitLogAndExpireIfOnline(currentRootServer);
{code}
we do split log and also remove the server from online server list. Ideally
here there is nothing to do in splitlog as no region server was restarted.
So master, though the server is online, master just invalidates the region
server.
In a special case, if i have only one RS then my cluster will become non
operative.

[jira] [Commented] (HBASE-5919) Add fixes for Ted's review comments from HBASE-5869


[ 
https://issues.apache.org/jira/browse/HBASE-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267212#comment-13267212
 ] 

Hudson commented on HBASE-5919:
---

Integrated in HBase-TRUNK #2841 (See 
[https://builds.apache.org/job/HBase-TRUNK/2841/])
HBASE-5919 Add fixes for Ted's review comments from HBASE-5869 (Revision 
104)

 Result = SUCCESS
tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/Bytes.java


 Add fixes for Ted's review comments from HBASE-5869
 ---

 Key: HBASE-5919
 URL: https://issues.apache.org/jira/browse/HBASE-5919
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: Ted Yu
Priority: Blocker
 Attachments: 5919-v2.txt, 5919-v4.txt, 5919.txt


 I missed addressing a few of Ted's comments on the end of my navigating 
 HBASE-5869 commit.  Fix here.  Make it a blocker.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5869) Move SplitLogManager splitlog taskstate and AssignmentManager RegionTransitionData znode datas to pb


[ 
https://issues.apache.org/jira/browse/HBASE-5869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267213#comment-13267213
 ] 

Hudson commented on HBASE-5869:
---

Integrated in HBase-TRUNK #2841 (See 
[https://builds.apache.org/job/HBase-TRUNK/2841/])
HBASE-5919 Add fixes for Ted's review comments from HBASE-5869 (Revision 
104)

 Result = SUCCESS
tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/Bytes.java


 Move SplitLogManager splitlog taskstate and AssignmentManager 
 RegionTransitionData znode datas to pb 
 -

 Key: HBASE-5869
 URL: https://issues.apache.org/jira/browse/HBASE-5869
 Project: HBase
  Issue Type: Task
Reporter: stack
Assignee: stack
 Fix For: 0.96.0

 Attachments: 5869v7.txt, 5869v8.txt, 5869v9.txt, firstcut.txt, 
 secondcut.txt, v10.txt, v11.txt, v12.txt, v13.txt, v13.txt, v4.txt, v5.txt, 
 v6.txt




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5916) RS restart just before master intialization we make the cluster non operative


[ 
https://issues.apache.org/jira/browse/HBASE-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267214#comment-13267214
 ] 

stack commented on HBASE-5916:
--

bq.  So though the server is online i will not be able to use the hLog and i 
get filenotfoundException. 

Master should not take over hlogs that were created at about same time as 
master start and that have no content in them?

Should we check for regionserver znodes and if present, wait a little longer?  
Give them another chance?

 RS restart just before master intialization we make the cluster non operative
 -

 Key: HBASE-5916
 URL: https://issues.apache.org/jira/browse/HBASE-5916
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.94.1


 Consider a case where my master is getting restarted.  RS that was alive when 
 the master restart started, gets restarted before the master initializes the 
 ServerShutDownHandler.
 {code}
 serverShutdownHandlerEnabled = true;
 {code}
 In this case when the RS tries to register with the master, the master will 
 try to expire the server but the server cannot be expired as still the 
 serverShutdownHandler is not enabled.
 This case may happen when i have only one RS gets restarted or all the RS 
 gets restarted at the same time.(before assignRootandMeta).
 {code}
 LOG.info(message);
   if (existingServer.getStartcode()  serverName.getStartcode()) {
 LOG.info(Triggering server recovery; existingServer  +
   existingServer +  looks stale, new server: + serverName);
 expireServer(existingServer);
   }
 {code}
 If another RS is brought up then the cluster comes back to normalcy.
 May be a very corner case.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-4990) Document secure HBase setup


[ 
https://issues.apache.org/jira/browse/HBASE-4990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267219#comment-13267219
 ] 

Hudson commented on HBASE-4990:
---

Integrated in HBase-TRUNK-security #190 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/190/])
HBASE-4990 Document secure HBase setup (Revision 1333212)

 Result = SUCCESS
stack : 
Files : 
* /hbase/trunk/src/docbkx/book.xml
* /hbase/trunk/src/docbkx/security.xml


 Document secure HBase setup
 ---

 Key: HBASE-4990
 URL: https://issues.apache.org/jira/browse/HBASE-4990
 Project: HBase
  Issue Type: Sub-task
Affects Versions: 0.92.0
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.96.0

 Attachments: 4990.txt, 4990v2.txt




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5840) Open Region FAILED_OPEN doesn't clear the TaskMonitor Status, keeps showing the old status


[ 
https://issues.apache.org/jira/browse/HBASE-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267221#comment-13267221
 ] 

Hudson commented on HBASE-5840:
---

Integrated in HBase-TRUNK-security #190 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/190/])
HBASE-5840 Open Region FAILED_OPEN doesn't clear the TaskMonitor Status, 
keeps showing the old status (RajeshBabu) (Revision 1333124)
HBASE-5548 Add ability to get a table in the shell; BACKING OUT MISTAKEN 
CO-COMMIT OF HBASE-5840 (Revision 1333123)

 Result = SUCCESS
ramkrishna : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java

stack : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java


 Open Region FAILED_OPEN doesn't clear the TaskMonitor Status, keeps showing 
 the old status
 --

 Key: HBASE-5840
 URL: https://issues.apache.org/jira/browse/HBASE-5840
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: rajeshbabu
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-5840.patch, HBASE-5840_trunk.patch, 
 HBASE-5840_v2.patch


 TaskMonitor Status will not be cleared in case Regions FAILED_OPEN. This will 
 keeps showing old status.
 This will miss leads the user.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-1996) Configure scanner buffer in bytes instead of number of rows


[ 
https://issues.apache.org/jira/browse/HBASE-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267220#comment-13267220
 ] 

Hudson commented on HBASE-1996:
---

Integrated in HBase-TRUNK-security #190 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/190/])
HBASE-2214 Do HBASE-1996 -- setting size to return in scan rather than 
count of rows -- properly (Ferdy Galema) (Revision 1333122)

 Result = SUCCESS
tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/AdminProtos.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ClientProtos.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/HBaseProtos.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/RPCProtos.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ZooKeeperProtos.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScanner.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServer.java
* /hbase/trunk/src/main/protobuf/Client.proto
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java


 Configure scanner buffer in bytes instead of number of rows
 ---

 Key: HBASE-1996
 URL: https://issues.apache.org/jira/browse/HBASE-1996
 Project: HBase
  Issue Type: Improvement
Reporter: Dave Latham
Assignee: Dave Latham
 Fix For: 0.90.0

 Attachments: 1966.patch, 1996-0.20.3-v2.patch, 1996-0.20.3-v3.patch, 
 1996-0.20.3.patch


 Currently, the default scanner fetches a single row at a time.  This makes 
 for very slow scans on tables where the rows are not large.  You can change 
 the setting for an HTable instance or for each Scan.
 It would be better to have a default that performs reasonably well so that 
 people stop running into slow scans because they are evaluating HBase, aren't 
 familiar with the setting, or simply forgot.  Unfortunately, if we increase 
 the value of the current setting, then we run the risk of running OOM for 
 tables with large rows.  Let's change the setting so that it works with a 
 size in bytes, rather than in rows.  This will allow us to set a reasonable 
 default so that tables with small rows will scan performantly and tables with 
 large rows will not run OOM.
 Note that the case is very similar to table writes as well.  When disabling 
 auto flush, we buffer a list of Put's to commit at once.  That buffer is 
 measured in bytes, so that a small number of large Puts or a lot of small 
 Puts can each fit in a single flush.  If that buffer were measured in number 
 of Put's it would have the same problem that we have for the scan buffer, and 
 we wouldn't be able to set a good default value for tables with different 
 size rows.  Changing the scan buffer to be configured like the write buffer 
 will make it more consistent.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5919) Add fixes for Ted's review comments from HBASE-5869


[ 
https://issues.apache.org/jira/browse/HBASE-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267223#comment-13267223
 ] 

Hudson commented on HBASE-5919:
---

Integrated in HBase-TRUNK-security #190 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/190/])
HBASE-5919 Add fixes for Ted's review comments from HBASE-5869 (Revision 
104)

 Result = SUCCESS
tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/Bytes.java


 Add fixes for Ted's review comments from HBASE-5869
 ---

 Key: HBASE-5919
 URL: https://issues.apache.org/jira/browse/HBASE-5919
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: Ted Yu
Priority: Blocker
 Attachments: 5919-v2.txt, 5919-v4.txt, 5919.txt


 I missed addressing a few of Ted's comments on the end of my navigating 
 HBASE-5869 commit.  Fix here.  Make it a blocker.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5548) Add ability to get a table in the shell


[ 
https://issues.apache.org/jira/browse/HBASE-5548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267222#comment-13267222
 ] 

Hudson commented on HBASE-5548:
---

Integrated in HBase-TRUNK-security #190 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/190/])
HBASE-5548 Add ability to get a table in the shell; BACKING OUT MISTAKEN 
CO-COMMIT OF HBASE-5840 (Revision 1333123)

 Result = SUCCESS
stack : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java


 Add ability to get a table in the shell
 ---

 Key: HBASE-5548
 URL: https://issues.apache.org/jira/browse/HBASE-5548
 Project: HBase
  Issue Type: Improvement
  Components: shell
Reporter: Jesse Yates
Assignee: Jesse Yates
 Fix For: 0.96.0

 Attachments: ruby_HBASE-5528-v0.patch, 
 ruby_HBASE-5548-addendum.patch, ruby_HBASE-5548-v1.patch, 
 ruby_HBASE-5548-v2.patch, ruby_HBASE-5548-v3.patch, ruby_HBASE-5548-v5.patch


 Currently, all the commands that operate on a table in the shell first have 
 to take the table as name as input. 
 There are two main considerations:
 * It is annoying to have to write the table name every time, when you should 
 just be able to get a reference to a table
 * the current implementation is very wasteful - it creates a new HTable for 
 each call (but reuses the connection since it uses the same configuration)
 We should be able to get a handle to a single HTable and then operate on that.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5913) Speed up the full scan of META


[ 
https://issues.apache.org/jira/browse/HBASE-5913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267225#comment-13267225
 ] 

Hudson commented on HBASE-5913:
---

Integrated in HBase-TRUNK-security #190 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/190/])
HBASE-5913 Speed up the full scan of META (Chunhui) (Revision 1333283)

 Result = SUCCESS
tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java


 Speed up the full scan of META
 --

 Key: HBASE-5913
 URL: https://issues.apache.org/jira/browse/HBASE-5913
 Project: HBase
  Issue Type: Improvement
Reporter: chunhui shen
Assignee: chunhui shen
 Fix For: 0.96.0, 0.94.1

 Attachments: 5913-v2.txt, HBASE-5913.patch


 In the master, we will do the full scan of META in some situations
 for example,
 1.master start up
 2.CatalogJanitor do the full scan per 5 mins
 3.ServerShutdownHandler, getServerUserRegions for dead server.
 For the online applications, we should try the best to reduce the process 
 time of ServerShutdownHandler in the situation 3. 
 However, we found MetaReader#getServerUserRegions take 14mins for 10w regions 
 in our production environment.
 And it is caused by two reasons:
 The first, we don't use cache and get one row per next() when fully scan 
 .META.
 The second, hbase.ipc.client.tcpnodelay is false as default, and in our 
 environment it take 40ms for per next() (It is related to the length of row 
 in the .META. , if someone also found, could try to set it true)
 For this issue, I think we could set the caching when do the full scan of META

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5869) Move SplitLogManager splitlog taskstate and AssignmentManager RegionTransitionData znode datas to pb


[ 
https://issues.apache.org/jira/browse/HBASE-5869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267224#comment-13267224
 ] 

Hudson commented on HBASE-5869:
---

Integrated in HBase-TRUNK-security #190 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/190/])
HBASE-5919 Add fixes for Ted's review comments from HBASE-5869 (Revision 
104)
HBASE-5869 Move SplitLogManager splitlog taskstate and AssignmentManager 
RegionTransitionData znode datas to pb (Revision 1333099)

 Result = SUCCESS
tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/Bytes.java

stack : 
Files : 
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/DeserializationException.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/EmptyWatcher.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HBaseException.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/RegionTransition.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ServerName.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/SplitLogCounters.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/SplitLogTask.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/executor/EventHandler.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/executor/ExecutorService.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/executor/RegionTransitionData.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/ActiveMasterManager.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ZooKeeperProtos.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/SplitLogWorker.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/EmptyWatcher.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/MasterAddressTracker.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/RootRegionTracker.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKSplitLog.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java
* /hbase/trunk/src/main/protobuf/ZooKeeper.proto
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestSerialization.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestZooKeeper.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/TestWALObserver.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/Mocking.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestMasterFailover.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestMasterNoCluster.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestSplitLogManager.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitLogWorker.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestCloseRegionHandler.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestOpenRegionHandler.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java


 Move SplitLogManager splitlog taskstate and AssignmentManager 
 RegionTransitionData znode datas to pb 
 -

 Key: HBASE-5869
 URL: https://issues.apache.org/jira/browse/HBASE-5869
 Project: HBase
  Issue Type: Task
Reporter: stack
Assignee

[jira] [Commented] (HBASE-5625) Avoid byte buffer allocations when reading a value from a Result object


[ 
https://issues.apache.org/jira/browse/HBASE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267226#comment-13267226
 ] 

Hudson commented on HBASE-5625:
---

Integrated in HBase-TRUNK-security #190 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/190/])
HBASE-5625 Avoid byte buffer allocations when reading a value from a Result 
object (Tudor Scurtu) (Revision 1333159)

 Result = SUCCESS
tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Result.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestKeyValue.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestResult.java


 Avoid byte buffer allocations when reading a value from a Result object
 ---

 Key: HBASE-5625
 URL: https://issues.apache.org/jira/browse/HBASE-5625
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.92.1
Reporter: Tudor Scurtu
Assignee: Tudor Scurtu
  Labels: patch
 Fix For: 0.96.0

 Attachments: 5625.txt, 5625v2.txt, 5625v3.txt, 5625v4.txt, 
 5625v5.txt, 5625v6.txt, 5625v7.txt, 5625v8.txt


 When calling Result.getValue(), an extra dummy KeyValue and its associated 
 underlying byte array are allocated, as well as a persistent buffer that will 
 contain the returned value.
 These can be avoided by reusing a static array for the dummy object and by 
 passing a ByteBuffer object as a value destination buffer to the read method.
 The current functionality is maintained, and we have added a separate method 
 call stack that employs the described changes. I will provide more details 
 with the patch.
 Running tests with a profiler, the reduction of read time seems to be of up 
 to 40%.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-2214) Do HBASE-1996 -- setting size to return in scan rather than count of rows -- properly


[ 
https://issues.apache.org/jira/browse/HBASE-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267227#comment-13267227
 ] 

Hudson commented on HBASE-2214:
---

Integrated in HBase-TRUNK-security #190 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/190/])
HBASE-2214 Do HBASE-1996 -- setting size to return in scan rather than 
count of rows -- properly (Ferdy Galema) (Revision 1333122)

 Result = SUCCESS
tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/AdminProtos.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ClientProtos.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/HBaseProtos.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/RPCProtos.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ZooKeeperProtos.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScanner.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServer.java
* /hbase/trunk/src/main/protobuf/Client.proto
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java


 Do HBASE-1996 -- setting size to return in scan rather than count of rows -- 
 properly
 -

 Key: HBASE-2214
 URL: https://issues.apache.org/jira/browse/HBASE-2214
 Project: HBase
  Issue Type: New Feature
Reporter: stack
Assignee: Ferdy Galema
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-2214-0.94-v2.txt, HBASE-2214-0.94-v3.txt, 
 HBASE-2214-0.94.txt, HBASE-2214-v4.txt, HBASE-2214-v5.txt, HBASE-2214-v6.txt, 
 HBASE-2214-v7.txt, HBASE-2214_with_broken_TestShell.txt


 The notion that you set size rather than row count specifying how many rows a 
 scanner should return in each cycle was raised over in hbase-1966.  Its a 
 good one making hbase regular though the data under it may vary.  
 HBase-1966 was committed but the patch was constrained by the fact that it 
 needed to not change RPC interface.  This issue is about doing hbase-1966 for 
 0.21 in a clean, unconstrained way.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5922) HalfStoreFileReader seekBefore causes StackOverflowError


[ 
https://issues.apache.org/jira/browse/HBASE-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267228#comment-13267228
 ] 

stack commented on HBASE-5922:
--

@Anoop Its late, but what you say makes sense.  Looking over in our half file 
test, TestHalfStoreFileReader, it seems pretty poor coverage.  What do you 
think?  It does not seem to test the boundary condition Nate ran into or that 
you reason above?

 HalfStoreFileReader seekBefore causes StackOverflowError
 

 Key: HBASE-5922
 URL: https://issues.apache.org/jira/browse/HBASE-5922
 Project: HBase
  Issue Type: Bug
  Components: client, io
Affects Versions: 0.90.0
 Environment: HBase 0.90.4
Reporter: Nate Putnam
Assignee: Nate Putnam
 Fix For: 0.90.0

 Attachments: HBASE-5922.patch, HBASE-5922.patch


 Calling HRegionServer.getClosestRowBefore() can cause a stack overflow if the 
 underlying store file is a reference and the row key is in the bottom.
 java.io.IOException: java.io.IOException: java.lang.StackOverflowError
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:990)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:978)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1651)
 at sun.reflect.GeneratedMethodAccessor174.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
 at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
 Caused by: java.lang.StackOverflowError
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:147)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5916) RS restart just before master intialization we make the cluster non operative


[ 
https://issues.apache.org/jira/browse/HBASE-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267232#comment-13267232
 ] 

stack commented on HBASE-5916:
--

Well, thats useful, right?  Its useful in case where a regionserver crashes and 
a new one comes up fast, before the original regionserver's znode has expired 
in zk.  We shouldn't remove it.

On startup, you should not get this exception unless you have a condition like 
that described above where there was a regionserver on same host and port 
registered previously in the master and then a new regionserver comes in w/ 
same host and port but with different startcode?

 RS restart just before master intialization we make the cluster non operative
 -

 Key: HBASE-5916
 URL: https://issues.apache.org/jira/browse/HBASE-5916
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.94.1


 Consider a case where my master is getting restarted.  RS that was alive when 
 the master restart started, gets restarted before the master initializes the 
 ServerShutDownHandler.
 {code}
 serverShutdownHandlerEnabled = true;
 {code}
 In this case when the RS tries to register with the master, the master will 
 try to expire the server but the server cannot be expired as still the 
 serverShutdownHandler is not enabled.
 This case may happen when i have only one RS gets restarted or all the RS 
 gets restarted at the same time.(before assignRootandMeta).
 {code}
 LOG.info(message);
   if (existingServer.getStartcode()  serverName.getStartcode()) {
 LOG.info(Triggering server recovery; existingServer  +
   existingServer +  looks stale, new server: + serverName);
 expireServer(existingServer);
   }
 {code}
 If another RS is brought up then the cluster comes back to normalcy.
 May be a very corner case.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-5444) Add PB-based calls to HMasterRegionInterface


 [ 
https://issues.apache.org/jira/browse/HBASE-5444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5444:
-

   Resolution: Fixed
Fix Version/s: 0.96.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to trunk.  Thanks for the patch Gregory.

 Add PB-based calls to HMasterRegionInterface
 

 Key: HBASE-5444
 URL: https://issues.apache.org/jira/browse/HBASE-5444
 Project: HBase
  Issue Type: Sub-task
  Components: ipc, master, migration, regionserver
Reporter: Todd Lipcon
Assignee: Gregory Chanan
 Fix For: 0.96.0

 Attachments: HBASE-5444-v10-trunk.patch, HBASE-5444-v6-trunk.patch, 
 HBASE-5444-v9-trunk.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5918) Master will block forever when startup if root server died between assign root and assign meta


[ 
https://issues.apache.org/jira/browse/HBASE-5918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267243#comment-13267243
 ] 

stack commented on HBASE-5918:
--

@Chunhui I suppose so Its just odd have flags set in such different 
locations... It makes the tracking of stuff difficult.  At a minimum I'd think 
we'd make a single method that set the flag and then did the call to 
expireDeadNotExpiredServers so they are grouped... 

What about the call to expireDeadNotExpiredServers that is done twice?  On 
first call, we'd process possibly the server that was carrying root.  What 
happens when we call it again later out in finishInitialization?  Could we end 
up processing same server twice at all?

Thanks. 

 Master will block forever when startup if root server died between assign 
 root and assign meta
 --

 Key: HBASE-5918
 URL: https://issues.apache.org/jira/browse/HBASE-5918
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.1
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-5918.patch, HBASE-5918.patch


 When master is initializing, if root server died between assign root and 
 assign meta, master will block at 
 HMaster#assignRootAndMeta:{code}assignmentManager.assignMeta();
 this.catalogTracker.waitForMeta();{code}
 because ServerShutdownHandler is disabled,
 So we should enable ServerShutdownHandler after called 
 assignmentManager.assignMeta();

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5916) RS restart just before master intialization we make the cluster non operative

2012-05-03 Thread ramkrishna.s.vasudevan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267253#comment-13267253
 ] 

ramkrishna.s.vasudevan commented on HBASE-5916:
---

We should not remove PleaseHoldException(message) directly.
{code}
  if (services.isServerShutdownHandlerEnabled()) {
// master has completed the initialization
throw new PleaseHoldException(message);
  }
{code}
This solved the actual problem.  But the problem due to filenotfoundException 
should be addressed in a different way.

 RS restart just before master intialization we make the cluster non operative
 -

 Key: HBASE-5916
 URL: https://issues.apache.org/jira/browse/HBASE-5916
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.94.1


 Consider a case where my master is getting restarted.  RS that was alive when 
 the master restart started, gets restarted before the master initializes the 
 ServerShutDownHandler.
 {code}
 serverShutdownHandlerEnabled = true;
 {code}
 In this case when the RS tries to register with the master, the master will 
 try to expire the server but the server cannot be expired as still the 
 serverShutdownHandler is not enabled.
 This case may happen when i have only one RS gets restarted or all the RS 
 gets restarted at the same time.(before assignRootandMeta).
 {code}
 LOG.info(message);
   if (existingServer.getStartcode()  serverName.getStartcode()) {
 LOG.info(Triggering server recovery; existingServer  +
   existingServer +  looks stale, new server: + serverName);
 expireServer(existingServer);
   }
 {code}
 If another RS is brought up then the cluster comes back to normalcy.
 May be a very corner case.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5913) Speed up the full scan of META


[ 
https://issues.apache.org/jira/browse/HBASE-5913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267263#comment-13267263
 ] 

Hudson commented on HBASE-5913:
---

Integrated in HBase-0.94 #174 (See 
[https://builds.apache.org/job/HBase-0.94/174/])
HBASE-5913 Speed up the full scan of META (Revision 115)

 Result = SUCCESS
larsh : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java


 Speed up the full scan of META
 --

 Key: HBASE-5913
 URL: https://issues.apache.org/jira/browse/HBASE-5913
 Project: HBase
  Issue Type: Improvement
Reporter: chunhui shen
Assignee: chunhui shen
 Fix For: 0.96.0, 0.94.1

 Attachments: 5913-v2.txt, HBASE-5913.patch


 In the master, we will do the full scan of META in some situations
 for example,
 1.master start up
 2.CatalogJanitor do the full scan per 5 mins
 3.ServerShutdownHandler, getServerUserRegions for dead server.
 For the online applications, we should try the best to reduce the process 
 time of ServerShutdownHandler in the situation 3. 
 However, we found MetaReader#getServerUserRegions take 14mins for 10w regions 
 in our production environment.
 And it is caused by two reasons:
 The first, we don't use cache and get one row per next() when fully scan 
 .META.
 The second, hbase.ipc.client.tcpnodelay is false as default, and in our 
 environment it take 40ms for per next() (It is related to the length of row 
 in the .META. , if someone also found, could try to set it true)
 For this issue, I think we could set the caching when do the full scan of META

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5444) Add PB-based calls to HMasterRegionInterface


[ 
https://issues.apache.org/jira/browse/HBASE-5444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267284#comment-13267284
 ] 

Hudson commented on HBASE-5444:
---

Integrated in HBase-TRUNK #2842 (See 
[https://builds.apache.org/job/HBase-TRUNK/2842/])
HBASE-5444 Add PB-based calls to HMasterRegionInterface (Revision 119)

 Result = FAILURE
stack : 
Files : 
* 
/hbase/trunk/src/main/jamon/org/apache/hadoop/hbase/tmpl/master/MasterStatusTmpl.jamon
* 
/hbase/trunk/src/main/jamon/org/apache/hadoop/hbase/tmpl/regionserver/RSStatusTmpl.jamon
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ClusterStatus.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ServerLoad.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseRpcMetrics.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HMasterRegionInterface.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/Invocation.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/RegionServerStatusProtocol.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/MXBean.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/MXBeanImpl.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/MasterDumpServlet.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/HBaseProtos.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/RegionServerStatusProtos.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/trunk/src/main/protobuf/RegionServerStatus.proto
* /hbase/trunk/src/main/protobuf/hbase.proto
* /hbase/trunk/src/main/resources/hbase-webapps/master/table.jsp
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/MiniHBaseCluster.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassLoading.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestMXBean.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestMasterNoCluster.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestServerCustomProtocol.java


 Add PB-based calls to HMasterRegionInterface
 

 Key: HBASE-5444
 URL: https://issues.apache.org/jira/browse/HBASE-5444
 Project: HBase
  Issue Type: Sub-task
  Components: ipc, master, migration, regionserver
Reporter: Todd Lipcon
Assignee: Gregory Chanan
 Fix For: 0.96.0

 Attachments: HBASE-5444-v10-trunk.patch, HBASE-5444-v6-trunk.patch, 
 HBASE-5444-v9-trunk.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-5844) Delete the region servers znode after a regions server crash


 [ 
https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5844:
---

Attachment: 5844.v3.patch

 Delete the region servers znode after a regions server crash
 

 Key: HBASE-5844
 URL: https://issues.apache.org/jira/browse/HBASE-5844
 Project: HBase
  Issue Type: Improvement
  Components: regionserver, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.96.0

 Attachments: 5844.v1.patch, 5844.v2.patch, 5844.v3.patch, 
 5844.v3.patch, 5844.v4.patch


 today, if the regions server crashes, its znode is not deleted in ZooKeeper. 
 So the recovery process will stop only after a timeout, usually 30s.
 By deleting the znode in start script, we remove this delay and the recovery 
 starts immediately.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-5844) Delete the region servers znode after a regions server crash


 [ 
https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5844:
---

Status: Open  (was: Patch Available)

 Delete the region servers znode after a regions server crash
 

 Key: HBASE-5844
 URL: https://issues.apache.org/jira/browse/HBASE-5844
 Project: HBase
  Issue Type: Improvement
  Components: regionserver, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.96.0

 Attachments: 5844.v1.patch, 5844.v2.patch, 5844.v3.patch, 
 5844.v3.patch, 5844.v4.patch


 today, if the regions server crashes, its znode is not deleted in ZooKeeper. 
 So the recovery process will stop only after a timeout, usually 30s.
 By deleting the znode in start script, we remove this delay and the recovery 
 starts immediately.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-5844) Delete the region servers znode after a regions server crash


 [ 
https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5844:
---

Attachment: 5844.v4.patch

 Delete the region servers znode after a regions server crash
 

 Key: HBASE-5844
 URL: https://issues.apache.org/jira/browse/HBASE-5844
 Project: HBase
  Issue Type: Improvement
  Components: regionserver, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.96.0

 Attachments: 5844.v1.patch, 5844.v2.patch, 5844.v3.patch, 
 5844.v3.patch, 5844.v4.patch


 today, if the regions server crashes, its znode is not deleted in ZooKeeper. 
 So the recovery process will stop only after a timeout, usually 30s.
 By deleting the znode in start script, we remove this delay and the recovery 
 starts immediately.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-5844) Delete the region servers znode after a regions server crash


 [ 
https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5844:
---

Status: Patch Available  (was: Reopened)

 Delete the region servers znode after a regions server crash
 

 Key: HBASE-5844
 URL: https://issues.apache.org/jira/browse/HBASE-5844
 Project: HBase
  Issue Type: Improvement
  Components: regionserver, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.96.0

 Attachments: 5844.v1.patch, 5844.v2.patch, 5844.v3.patch, 
 5844.v3.patch, 5844.v4.patch


 today, if the regions server crashes, its znode is not deleted in ZooKeeper. 
 So the recovery process will stop only after a timeout, usually 30s.
 By deleting the znode in start script, we remove this delay and the recovery 
 starts immediately.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-5844) Delete the region servers znode after a regions server crash


 [ 
https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5844:
---

Status: Patch Available  (was: Open)

 Delete the region servers znode after a regions server crash
 

 Key: HBASE-5844
 URL: https://issues.apache.org/jira/browse/HBASE-5844
 Project: HBase
  Issue Type: Improvement
  Components: regionserver, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.96.0

 Attachments: 5844.v1.patch, 5844.v2.patch, 5844.v3.patch, 
 5844.v3.patch, 5844.v4.patch


 today, if the regions server crashes, its znode is not deleted in ZooKeeper. 
 So the recovery process will stop only after a timeout, usually 30s.
 By deleting the znode in start script, we remove this delay and the recovery 
 starts immediately.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5844) Delete the region servers znode after a regions server crash


[ 
https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267332#comment-13267332
 ] 

nkeywal commented on HBASE-5844:


v4 should be ok.
I will do another jira for the master.

 Delete the region servers znode after a regions server crash
 

 Key: HBASE-5844
 URL: https://issues.apache.org/jira/browse/HBASE-5844
 Project: HBase
  Issue Type: Improvement
  Components: regionserver, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.96.0

 Attachments: 5844.v1.patch, 5844.v2.patch, 5844.v3.patch, 
 5844.v3.patch, 5844.v4.patch


 today, if the regions server crashes, its znode is not deleted in ZooKeeper. 
 So the recovery process will stop only after a timeout, usually 30s.
 By deleting the znode in start script, we remove this delay and the recovery 
 starts immediately.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Created] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

nkeywal created HBASE-5924:
--

 Summary: In the client code, don't wait for all the requests to be 
executed before resubmitting a request in error.
 Key: HBASE-5924
 URL: https://issues.apache.org/jira/browse/HBASE-5924
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor


The client (in the function HConnectionManager#processBatchCallback) works in 
two steps:
 - make the requests
 - collect the failures and successes and prepare for retry

It means that when there is an immediate error (region moved, split, dead 
server, ...) we still wait for all the initial requests to be executed before 
submitting again the failed request. If we have a scenario with all the 
requests taking 5 seconds we have a final execution time of: 5 (initial 
requests) + 1 (wait time) + 5 (final request) = 11s.

We could improve this by analyzing immediately the results. This would lead us, 
for the scenario mentioned above, to 6 seconds. 

So we could have a performance improvement of nearly 50% in many cases, and 
much more than 50% if the request execution time is different.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-5902) Some scripts are not executable


 [ 
https://issues.apache.org/jira/browse/HBASE-5902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5902:
---

Status: Patch Available  (was: Open)

 Some scripts are not executable
 ---

 Key: HBASE-5902
 URL: https://issues.apache.org/jira/browse/HBASE-5902
 Project: HBase
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Attachments: 5902.v1.patch


 -rw-rw-r--  graceful_stop.sh
 -rw-rw-r--  hbase-config.sh
 -rw-rw-r--  local-master-backup.sh
 -rw-rw-r--  local-regionservers.sh

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Issue Comment Edited] (HBASE-5875) Process RIT and Master restart may remove an online server considering it as a dead server

2012-05-03 Thread ramkrishna.s.vasudevan (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267392#comment-13267392
]

ramkrishna.s.vasudevan edited comment on HBASE-5875 at 5/3/12 12:44 PM:

bq.What is the above referring to? Which part of the code?

In assignRootAndMeta()
{code}
boolean rit = this.assignmentManager.

processRegionInTransitionAndBlockUntilAssigned(HRegionInfo.ROOT_REGIONINFO);

{code}
bq.Can the master not detect this corner case just by looking at whats in zk?
Here zk you mean the RS node or the ROOT region node?

was (Author: ram_krish):
bq.What is the above referring to? Which part of the code?

In assignRootAndMeta()
{code}
boolean rit = this.assignmentManager.

processRegionInTransitionAndBlockUntilAssigned(HRegionInfo.ROOT_REGIONINFO);

{code}

Process RIT and Master restart may remove an online server considering it as
a dead server
--

Attachments: HBASE-5875.patch

[jira] [Commented] (HBASE-5875) Process RIT and Master restart may remove an online server considering it as a dead server

2012-05-03 Thread ramkrishna.s.vasudevan (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267392#comment-13267392
]

ramkrishna.s.vasudevan commented on HBASE-5875:
---

bq.What is the above referring to? Which part of the code?

In assignRootAndMeta()
{code}
boolean rit = this.assignmentManager.

processRegionInTransitionAndBlockUntilAssigned(HRegionInfo.ROOT_REGIONINFO);

{code}

Process RIT and Master restart may remove an online server considering it as
a dead server
--

Attachments: HBASE-5875.patch

[jira] [Created] (HBASE-5926) Delete the master znode after a znode crash

nkeywal created HBASE-5926:
--

 Summary: Delete the master znode after a znode crash
 Key: HBASE-5926
 URL: https://issues.apache.org/jira/browse/HBASE-5926
 Project: HBase
  Issue Type: Improvement
  Components: master, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor


This is the continuation of the work done in HBASE-5844.
But we can't apply exactly the same strategy: for the region server, there is a 
znode per region server, while for the master  backup master there is a single 
znode for both.

So if we apply the same strategy as for a regionserver, we may have this 
scenario:
1) Master starts
2) Backup master starts
3) Master dies
4) ZK detects it
5) Backup master receives the update from ZK
6) Backup master creates the new master node and become the main master
7) Previous master script continues
8) Previous master script delete the master node in ZK
9) = issue: we deleted the node just created by the new master

This should not happen often (usually the znode will be delete soon enough), 
but it can happen.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-5926) Delete the master znode after a znode crash


 [ 
https://issues.apache.org/jira/browse/HBASE-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5926:
---

Description: 
This is the continuation of the work done in HBASE-5844.
But we can't apply exactly the same strategy: for the region server, there is a 
znode per region server, while for the master  backup master there is a single 
znode for both.

So if we apply the same strategy as for a regionserver, we may have this 
scenario:
1) Master starts
2) Backup master starts
3) Master dies
4) ZK detects it
5) Backup master receives the update from ZK
6) Backup master creates the new master node and become the main master
7) Previous master script continues
8) Previous master script deletes the master node in ZK
9) = issue: we deleted the node just created by the new master

This should not happen often (usually the znode will be deleted soon enough), 
but it can happen.

  was:
This is the continuation of the work done in HBASE-5844.
But we can't apply exactly the same strategy: for the region server, there is a 
znode per region server, while for the master  backup master there is a single 
znode for both.

So if we apply the same strategy as for a regionserver, we may have this 
scenario:
1) Master starts
2) Backup master starts
3) Master dies
4) ZK detects it
5) Backup master receives the update from ZK
6) Backup master creates the new master node and become the main master
7) Previous master script continues
8) Previous master script delete the master node in ZK
9) = issue: we deleted the node just created by the new master

This should not happen often (usually the znode will be delete soon enough), 
but it can happen.


 Delete the master znode after a znode crash
 ---

 Key: HBASE-5926
 URL: https://issues.apache.org/jira/browse/HBASE-5926
 Project: HBase
  Issue Type: Improvement
  Components: master, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor

 This is the continuation of the work done in HBASE-5844.
 But we can't apply exactly the same strategy: for the region server, there is 
 a znode per region server, while for the master  backup master there is a 
 single znode for both.
 So if we apply the same strategy as for a regionserver, we may have this 
 scenario:
 1) Master starts
 2) Backup master starts
 3) Master dies
 4) ZK detects it
 5) Backup master receives the update from ZK
 6) Backup master creates the new master node and become the main master
 7) Previous master script continues
 8) Previous master script deletes the master node in ZK
 9) = issue: we deleted the node just created by the new master
 This should not happen often (usually the znode will be deleted soon enough), 
 but it can happen.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-5926) Delete the master znode after a master crash


 [ 
https://issues.apache.org/jira/browse/HBASE-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5926:
---

Summary: Delete the master znode after a master crash  (was: Delete the 
master znode after a znode crash)

 Delete the master znode after a master crash
 

 Key: HBASE-5926
 URL: https://issues.apache.org/jira/browse/HBASE-5926
 Project: HBase
  Issue Type: Improvement
  Components: master, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor

 This is the continuation of the work done in HBASE-5844.
 But we can't apply exactly the same strategy: for the region server, there is 
 a znode per region server, while for the master  backup master there is a 
 single znode for both.
 So if we apply the same strategy as for a regionserver, we may have this 
 scenario:
 1) Master starts
 2) Backup master starts
 3) Master dies
 4) ZK detects it
 5) Backup master receives the update from ZK
 6) Backup master creates the new master node and become the main master
 7) Previous master script continues
 8) Previous master script deletes the master node in ZK
 9) = issue: we deleted the node just created by the new master
 This should not happen often (usually the znode will be deleted soon enough), 
 but it can happen.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5875) Process RIT and Master restart may remove an online server considering it as a dead server

2012-05-03 Thread ramkrishna.s.vasudevan (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267521#comment-13267521
]

ramkrishna.s.vasudevan commented on HBASE-5875:
---

I have reproduced the scenario addressing the title of the JIRA with a testcase.
I have tried follow a approach that Bijieshan had suggested in
https://issues.apache.org/jira/browse/HBASE-5875?focusedCommentId=13264874page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13264874
to solve the problem. Tomorrow i can upload the testcase.

Process RIT and Master restart may remove an online server considering it as
a dead server
--

Attachments: HBASE-5875.patch

[jira] [Commented] (HBASE-5923) Cleanup checkAndXXX logic


[ 
https://issues.apache.org/jira/browse/HBASE-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267568#comment-13267568
 ] 

stack commented on HBASE-5923:
--

This patch is great.  Thanks for going back and doing the cleanup.

This class should not be in filter package?
+import org.apache.hadoop.hbase.filter.WritableByteArrayComparable;

Probably hard to move it now?  Its part of a public API?  Could deprecate and 
replace w/ a more generic, non-filter specific class?  Moving it should not be 
part of this patch.  Its not so bad anyways having this filter package 
pollution since its in client facing code and clients need access to filter 
stuff...

Would think pollution:

+import 
org.apache.hadoop.hbase.protobuf.generated.ClientProtos.Condition.CompareType;

Should be pulling in a non-pb class into an Interface like this.  Can we 
encapsulate these Client conditions in a non-pb class?

 Cleanup checkAndXXX logic
 -

 Key: HBASE-5923
 URL: https://issues.apache.org/jira/browse/HBASE-5923
 Project: HBase
  Issue Type: Improvement
  Components: client, regionserver
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Minor
 Fix For: 0.96.0, 0.94.1

 Attachments: 5923-trunk.txt


 1. the checkAnd{Put|Delete} method that takes a CompareOP is not exposed via 
 HTable[Interface].
 2. there is unnecessary duplicate code in the check{Put|Delete} code in 
 HRegionServer.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5923) Cleanup checkAndXXX logic


[ 
https://issues.apache.org/jira/browse/HBASE-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267593#comment-13267593
 ] 

stack commented on HBASE-5923:
--

bq. That also means that is 0.94 there would be a dependency on CompareFilter 
in HTableInterface.

Thats better than a generated pb dependency IMO.  If you'd like, I can make it 
so you can do same or similar in trunk: i.e. not have to import generated pb 
but rather the filter.CompareFilter or some such similar class?  Just say.


 Cleanup checkAndXXX logic
 -

 Key: HBASE-5923
 URL: https://issues.apache.org/jira/browse/HBASE-5923
 Project: HBase
  Issue Type: Improvement
  Components: client, regionserver
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Minor
 Fix For: 0.96.0, 0.94.1

 Attachments: 5923-0.94.txt, 5923-trunk.txt


 1. the checkAnd{Put|Delete} method that takes a CompareOP is not exposed via 
 HTable[Interface].
 2. there is unnecessary duplicate code in the check{Put|Delete} code in 
 HRegionServer.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5883) Backup master is going down due to connection refused exception


[ 
https://issues.apache.org/jira/browse/HBASE-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267624#comment-13267624
 ] 

Hudson commented on HBASE-5883:
---

Integrated in HBase-TRUNK #2843 (See 
[https://builds.apache.org/job/HBase-TRUNK/2843/])
HBASE-5883 Backup master is going down due to connection refused exception 
(Jieshan) (Revision 1333530)

 Result = SUCCESS
tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java


 Backup master is going down due to connection refused exception
 ---

 Key: HBASE-5883
 URL: https://issues.apache.org/jira/browse/HBASE-5883
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.6, 0.92.1, 0.94.0
Reporter: Gopinathan A
Assignee: Jieshan Bean
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-5883-90.patch, HBASE-5883-92.patch, 
 HBASE-5883-94.patch, HBASE-5883-trunk.patch


 The active master node network was down for some time (This node contains 
 Master,DN,ZK,RS). Here backup node got 
 notification, and started to became active. Immedietly backup node got 
 aborted with the below exception.
 {noformat}
 2012-04-09 10:42:24,270 INFO org.apache.hadoop.hbase.master.SplitLogManager: 
 finished splitting (more than or equal to) 861248320 bytes in 4 log files in 
 [hdfs://192.168.47.205:9000/hbase/.logs/HOST-192-168-47-202,60020,1333715537172-splitting]
  in 26374ms
 2012-04-09 10:42:24,316 FATAL org.apache.hadoop.hbase.master.HMaster: Master 
 server abort: loaded coprocessors are: []
 2012-04-09 10:42:24,333 FATAL org.apache.hadoop.hbase.master.HMaster: 
 Unhandled exception. Starting shutdown.
 java.io.IOException: java.net.ConnectException: Connection refused
   at 
 org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:375)
   at 
 org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1045)
   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:897)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
   at $Proxy13.getProtocolVersion(Unknown Source)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:183)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:303)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:280)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:332)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:236)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1276)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1233)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1220)
   at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:569)
   at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.getRootServerConnection(CatalogTracker.java:369)
   at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.waitForRootServerConnection(CatalogTracker.java:353)
   at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.verifyRootRegionLocation(CatalogTracker.java:660)
   at 
 org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:616)
   at 
 org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:540)
   at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:363)
   at java.lang.Thread.run(Thread.java:662)
 Caused by: java.net.ConnectException: Connection refused
   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
   at 
 sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
   at 
 org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:488)
   at 
 org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:328)
   at 
 org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:362)
   ... 20 more
 2012-04-09 10:42:24,336 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
 2012-04-09 10:42:24,336 DEBUG org.apache.hadoop.hbase.master.HMaster: 
 Stopping service threads
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more

[jira] [Commented] (HBASE-5883) Backup master is going down due to connection refused exception


[ 
https://issues.apache.org/jira/browse/HBASE-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267633#comment-13267633
 ] 

Hudson commented on HBASE-5883:
---

Integrated in HBase-0.94 #175 (See 
[https://builds.apache.org/job/HBase-0.94/175/])
HBASE-5883 Backup master is going down due to connection refused exception 
(Jieshan) (Revision 1333533)

 Result = FAILURE
tedyu : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java


 Backup master is going down due to connection refused exception
 ---

 Key: HBASE-5883
 URL: https://issues.apache.org/jira/browse/HBASE-5883
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.6, 0.92.1, 0.94.0
Reporter: Gopinathan A
Assignee: Jieshan Bean
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-5883-90.patch, HBASE-5883-92.patch, 
 HBASE-5883-94.patch, HBASE-5883-trunk.patch


 The active master node network was down for some time (This node contains 
 Master,DN,ZK,RS). Here backup node got 
 notification, and started to became active. Immedietly backup node got 
 aborted with the below exception.
 {noformat}
 2012-04-09 10:42:24,270 INFO org.apache.hadoop.hbase.master.SplitLogManager: 
 finished splitting (more than or equal to) 861248320 bytes in 4 log files in 
 [hdfs://192.168.47.205:9000/hbase/.logs/HOST-192-168-47-202,60020,1333715537172-splitting]
  in 26374ms
 2012-04-09 10:42:24,316 FATAL org.apache.hadoop.hbase.master.HMaster: Master 
 server abort: loaded coprocessors are: []
 2012-04-09 10:42:24,333 FATAL org.apache.hadoop.hbase.master.HMaster: 
 Unhandled exception. Starting shutdown.
 java.io.IOException: java.net.ConnectException: Connection refused
   at 
 org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:375)
   at 
 org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1045)
   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:897)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
   at $Proxy13.getProtocolVersion(Unknown Source)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:183)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:303)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:280)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:332)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:236)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1276)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1233)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1220)
   at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:569)
   at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.getRootServerConnection(CatalogTracker.java:369)
   at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.waitForRootServerConnection(CatalogTracker.java:353)
   at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.verifyRootRegionLocation(CatalogTracker.java:660)
   at 
 org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:616)
   at 
 org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:540)
   at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:363)
   at java.lang.Thread.run(Thread.java:662)
 Caused by: java.net.ConnectException: Connection refused
   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
   at 
 sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
   at 
 org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:488)
   at 
 org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:328)
   at 
 org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:362)
   ... 20 more
 2012-04-09 10:42:24,336 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
 2012-04-09 10:42:24,336 DEBUG org.apache.hadoop.hbase.master.HMaster: 
 Stopping service threads
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa

[jira] [Updated] (HBASE-5494) Introduce a zk hosted table-wide read/write lock so only one table operation at a time


 [ 
https://issues.apache.org/jira/browse/HBASE-5494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5494:
---

Attachment: D2997.3.patch

avf requested code review of [jira] [HBASE-5494] [89-fb] Table-level locks for 
schema changing operations..
Reviewers: Kannan, mbautin, Liyin, JIRA

  Since concurrent modification (e.g., disabling and dropping a table under
  creation) could leave a cluster in an inconsistent state, we need table-level
  locks for schema changing operations.

  A ZooKeeper-based distributed lock has been implemented that
  attempts to create a persistent ZNode (one ZNode per entity being locked, 
i.e.,
  one per table) if one does not exist. Currently in case a master crashes while
  holding the lock, the lock must be manually removed using the ZooKeeper 
command
  line (locks being stored in /hbase/tableLock/).

  The locks implemented are not fair or re-entrant. RecoverableZooKeeper is used
  to correctly handle connection loss.

  To test the locks, InjectionHandler and InjectionEvent have been introduced,
  allowing for injection of arbitrary events, in this case adding delays during
  schema changing operations as to induce a race condition.

  Future work involves automatically deleting stale lock ZNodes upon server
  recovery (providing the attempted operations are not resumed), adding metrics
  around locks (e.g., list all locks held).

TEST PLAN
  Since concurrent modification (e.g., disabling and dropping a table
  under creation) could leave a cluster in an inconsistent state, we
  need table-level locks for schema changing operations.

  A ZooKeeper-based distributed lock has been implemented that attempts
  to create a persistent ZNode (one ZNode per entity being locked, i.e.,
  one per table) if one does not exist. Currently in case a master
  crashes while holding the lock, the lock must be manually removed
  using the ZooKeeper command line (locks being stored in
  /hbase/tableLock/).

  The locks implemented are not fair or re-entrant. RecoverableZooKeeper
  is used to correctly handle connection loss.

  To test the locks, InjectionHandler and InjectionEvent have been
  introduced, allowing for injection of arbitrary events, in this case
  adding delays during schema changing operations as to induce a race
  condition.

  Future work involves automatically deleting stale lock ZNodes upon
  server recovery (providing the attempted operations are not resumed),
  adding metrics around locks (e.g., list all locks held).


REVISION DETAIL
  https://reviews.facebook.net/D2997

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/HConstants.java
  src/main/java/org/apache/hadoop/hbase/TableLockTimeoutException.java
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java
  src/main/java/org/apache/hadoop/hbase/master/TableLockManager.java
  src/main/java/org/apache/hadoop/hbase/util/InjectionEvent.java
  src/main/java/org/apache/hadoop/hbase/util/InjectionHandler.java
  src/main/java/org/apache/hadoop/hbase/zookeeper/DistributedLock.java
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java
  src/test/java/org/apache/hadoop/hbase/master/TestSchemaModificationLocks.java
  src/test/java/org/apache/hadoop/hbase/util/DelayInducingInjectionHandler.java
  src/test/java/org/apache/hadoop/hbase/zookeeper/TestDistributedLock.java


 Introduce a zk hosted table-wide read/write lock so only one table operation 
 at a time
 --

 Key: HBASE-5494
 URL: https://issues.apache.org/jira/browse/HBASE-5494
 Project: HBase
  Issue Type: Improvement
Reporter: stack
 Attachments: D2997.3.patch


 I saw this facility over in the accumulo code base.
 Currently we just try to sort out the mess when splits come in during an 
 online schema edit; somehow we figure we can figure all possible region 
 transition combinations and make the right call.
 We could try and narrow the number of combinations by taking out a zk table 
 lock when doing table operations.
 For example, on split or merge, we could take a read-only lock meaning the 
 table can't be disabled while these are running.
 We could then take a write only lock if we want to ensure the table doesn't 
 change while disabling or enabling process is happening.
 Shouldn't be too hard to add.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5494) Introduce a zk hosted table-wide read/write lock so only one table operation at a time

[
https://issues.apache.org/jira/browse/HBASE-5494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267689#comment-13267689
]

Phabricator commented on HBASE-5494:

tedyu has commented on the revision [jira] [HBASE-5494] [89-fb] Table-level
locks for schema changing operations..

I only reviewed part of the patch.

Would this feature be refined in 0.89-fb branch before being ported to Apache
HBase trunk ?

INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/HConstants.java:98 Schema changes would
always involve master.
'master.' can be omitted.
src/main/java/org/apache/hadoop/hbase/HConstants.java:108 Is this value big
enough in cluster testing ?
src/main/java/org/apache/hadoop/hbase/TableLockTimeoutException.java:2 No
year is needed.
src/main/java/org/apache/hadoop/hbase/master/HMaster.java:1353 This lock is
used to prevent two concurrent table creation attempts.
tryLockTable() is more desirable here.
src/main/java/org/apache/hadoop/hbase/master/HMaster.java:1310 Can we add
tryLockTable() ?
It would be useful for the non-winning thread to exit quickly.
src/main/java/org/apache/hadoop/hbase/master/TableLockManager.java:2 No year,
please.
src/main/java/org/apache/hadoop/hbase/master/TableLockManager.java:47 Should
Bytes.toStringBinary() be used here ?
src/main/java/org/apache/hadoop/hbase/master/TableLockManager.java:53 Add 'be
' before 'released'
src/main/java/org/apache/hadoop/hbase/master/TableLockManager.java:137 What
if lock release fails ?
src/main/java/org/apache/hadoop/hbase/master/TableLockManager.java:27 Can you
tell me which zookeeper branch provides this lock ?
In http://svn.apache.org/repos/asf/zookeeper/trunk, I don't seem to find this
class.

REVISION DETAIL
https://reviews.facebook.net/D2997

Introduce a zk hosted table-wide read/write lock so only one table operation
at a time
--

Key: HBASE-5494
URL: https://issues.apache.org/jira/browse/HBASE-5494
Project: HBase
Issue Type: Improvement
Reporter: stack
Attachments: D2997.3.patch

I saw this facility over in the accumulo code base.
Currently we just try to sort out the mess when splits come in during an
online schema edit; somehow we figure we can figure all possible region
transition combinations and make the right call.
We could try and narrow the number of combinations by taking out a zk table
lock when doing table operations.
For example, on split or merge, we could take a read-only lock meaning the
table can't be disabled while these are running.
We could then take a write only lock if we want to ensure the table doesn't
change while disabling or enabling process is happening.
Shouldn't be too hard to add.

[jira] [Commented] (HBASE-5883) Backup master is going down due to connection refused exception


[ 
https://issues.apache.org/jira/browse/HBASE-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267714#comment-13267714
 ] 

Hudson commented on HBASE-5883:
---

Integrated in HBase-0.92 #396 (See 
[https://builds.apache.org/job/HBase-0.92/396/])
HBASE-5883  Backup master is going down due to connection refused exception 
(Jieshan) (Revision 1333537)

 Result = FAILURE
tedyu : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java


 Backup master is going down due to connection refused exception
 ---

 Key: HBASE-5883
 URL: https://issues.apache.org/jira/browse/HBASE-5883
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.6, 0.92.1, 0.94.0
Reporter: Gopinathan A
Assignee: Jieshan Bean
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-5883-90.patch, HBASE-5883-92.patch, 
 HBASE-5883-94.patch, HBASE-5883-trunk.patch


 The active master node network was down for some time (This node contains 
 Master,DN,ZK,RS). Here backup node got 
 notification, and started to became active. Immedietly backup node got 
 aborted with the below exception.
 {noformat}
 2012-04-09 10:42:24,270 INFO org.apache.hadoop.hbase.master.SplitLogManager: 
 finished splitting (more than or equal to) 861248320 bytes in 4 log files in 
 [hdfs://192.168.47.205:9000/hbase/.logs/HOST-192-168-47-202,60020,1333715537172-splitting]
  in 26374ms
 2012-04-09 10:42:24,316 FATAL org.apache.hadoop.hbase.master.HMaster: Master 
 server abort: loaded coprocessors are: []
 2012-04-09 10:42:24,333 FATAL org.apache.hadoop.hbase.master.HMaster: 
 Unhandled exception. Starting shutdown.
 java.io.IOException: java.net.ConnectException: Connection refused
   at 
 org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:375)
   at 
 org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1045)
   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:897)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
   at $Proxy13.getProtocolVersion(Unknown Source)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:183)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:303)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:280)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:332)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:236)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1276)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1233)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1220)
   at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:569)
   at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.getRootServerConnection(CatalogTracker.java:369)
   at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.waitForRootServerConnection(CatalogTracker.java:353)
   at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.verifyRootRegionLocation(CatalogTracker.java:660)
   at 
 org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:616)
   at 
 org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:540)
   at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:363)
   at java.lang.Thread.run(Thread.java:662)
 Caused by: java.net.ConnectException: Connection refused
   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
   at 
 sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
   at 
 org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:488)
   at 
 org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:328)
   at 
 org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:362)
   ... 20 more
 2012-04-09 10:42:24,336 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
 2012-04-09 10:42:24,336 DEBUG org.apache.hadoop.hbase.master.HMaster: 
 Stopping service threads
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure

[jira] [Commented] (HBASE-5494) Introduce a zk hosted table-wide read/write lock so only one table operation at a time


[ 
https://issues.apache.org/jira/browse/HBASE-5494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267728#comment-13267728
 ] 

Phabricator commented on HBASE-5494:


avf has commented on the revision [jira] [HBASE-5494] [89-fb] Table-level 
locks for schema changing operations..

  Thanks for the inline comments, @tedyu -- I've replied to a few quick ones 
inline.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/master/TableLockManager.java:27 
DistributedLock is implemented as part of the patch (see DistributedLock.java)
  src/main/java/org/apache/hadoop/hbase/master/TableLockManager.java:47 
Metadata for table level locks is stored as plain text -- this is to allow 
operations to view lock information from the zookeeper CLI: toStringBinary() 
would not be needed here.
  src/main/java/org/apache/hadoop/hbase/master/TableLockManager.java:137 In 
this case, an IOException is thrown up to the caller: this is to indicate a 
non-recoverable ZooKeeper error (DistributedLock uses RecoverableZooKeeper 
class under the covers). .release() may also throw an IllegalStateException -- 
but this is essentially used an assertion in this case (releasing a lock that 
isn't held).

REVISION DETAIL
  https://reviews.facebook.net/D2997


 Introduce a zk hosted table-wide read/write lock so only one table operation 
 at a time
 --

 Key: HBASE-5494
 URL: https://issues.apache.org/jira/browse/HBASE-5494
 Project: HBase
  Issue Type: Improvement
Reporter: stack
 Attachments: D2997.3.patch


 I saw this facility over in the accumulo code base.
 Currently we just try to sort out the mess when splits come in during an 
 online schema edit; somehow we figure we can figure all possible region 
 transition combinations and make the right call.
 We could try and narrow the number of combinations by taking out a zk table 
 lock when doing table operations.
 For example, on split or merge, we could take a read-only lock meaning the 
 table can't be disabled while these are running.
 We could then take a write only lock if we want to ensure the table doesn't 
 change while disabling or enabling process is happening.
 Shouldn't be too hard to add.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5883) Backup master is going down due to connection refused exception


[ 
https://issues.apache.org/jira/browse/HBASE-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267743#comment-13267743
 ] 

stack commented on HBASE-5883:
--

Can't we at least check the message to ensure its what we expect?  (See the 
second catch below where we look for connection reset).  Can we be sure what 
comes up here is the ConnectException we set down in HBaseRPC?

{code}
+  if (ioe instanceof ConnectException) {
+// Catch. Connect refused.
{code}

This redoing of an exception seems problematic.  Its really necessary?

{code}
+} else if (ioex.getMessage().toLowerCase()
+.contains(connection refused)) {
+  ce = new ConnectException(ioex.getMessage());
+  ioe = ce;
{code}

I'd feel better about this fix if we could figure where the exception came from 
(Its not from the rpc stringifying of exceptions to pass them from server to 
client?

 Backup master is going down due to connection refused exception
 ---

 Key: HBASE-5883
 URL: https://issues.apache.org/jira/browse/HBASE-5883
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.6, 0.92.1, 0.94.0
Reporter: Gopinathan A
Assignee: Jieshan Bean
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-5883-90.patch, HBASE-5883-92.patch, 
 HBASE-5883-94.patch, HBASE-5883-trunk.patch


 The active master node network was down for some time (This node contains 
 Master,DN,ZK,RS). Here backup node got 
 notification, and started to became active. Immedietly backup node got 
 aborted with the below exception.
 {noformat}
 2012-04-09 10:42:24,270 INFO org.apache.hadoop.hbase.master.SplitLogManager: 
 finished splitting (more than or equal to) 861248320 bytes in 4 log files in 
 [hdfs://192.168.47.205:9000/hbase/.logs/HOST-192-168-47-202,60020,1333715537172-splitting]
  in 26374ms
 2012-04-09 10:42:24,316 FATAL org.apache.hadoop.hbase.master.HMaster: Master 
 server abort: loaded coprocessors are: []
 2012-04-09 10:42:24,333 FATAL org.apache.hadoop.hbase.master.HMaster: 
 Unhandled exception. Starting shutdown.
 java.io.IOException: java.net.ConnectException: Connection refused
   at 
 org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:375)
   at 
 org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1045)
   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:897)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
   at $Proxy13.getProtocolVersion(Unknown Source)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:183)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:303)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:280)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:332)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:236)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1276)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1233)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1220)
   at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:569)
   at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.getRootServerConnection(CatalogTracker.java:369)
   at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.waitForRootServerConnection(CatalogTracker.java:353)
   at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.verifyRootRegionLocation(CatalogTracker.java:660)
   at 
 org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:616)
   at 
 org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:540)
   at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:363)
   at java.lang.Thread.run(Thread.java:662)
 Caused by: java.net.ConnectException: Connection refused
   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
   at 
 sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
   at 
 org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:488)
   at 
 org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:328)
   at 
 org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:362)
   ... 20 more
 2012-04-09 10:42:24,336 INFO

[jira] [Commented] (HBASE-5928) Hbck shouldn't npe when there are no tables.


[ 
https://issues.apache.org/jira/browse/HBASE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267744#comment-13267744
 ] 

stack commented on HBASE-5928:
--

+1 on patch.

Jon Hsieh?

 Hbck shouldn't npe when there are no tables.
 

 Key: HBASE-5928
 URL: https://issues.apache.org/jira/browse/HBASE-5928
 Project: HBase
  Issue Type: Bug
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Minor
 Attachments: HBASE-5928-0.patch


 hbase fsck errors out when there are no tables.
 Exception in thread main java.lang.NullPointerException
   at 
 org.apache.hadoop.hbase.util.HBaseFsck.reportTablesInFlux(HBaseFsck.java:560)
   at 
 org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:346)
   at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:382)
   at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:3120)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5877) When a query fails because the region has moved, let the regionserver return the new address to the client

[
https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267751#comment-13267751
]

nkeywal commented on HBASE-5877:

v12, should be final.

1) ServerName is used everywhere in the interface, thanks to protobuf
2) hadoop.ipc serialization of exception is based on the #getMessage. So we
have to parse it internally. It's not visisble to the exception user.
3) The code to manage the error in the client package is quite complex. We have
the exception at the very beginning, and then it's checked again, but we don't
have the real exception anymore. I used a new historyList to make it works.
There is another JIRA for other improvement, in which I could get rid of this
(HBASE-5924)
4) Generated with protobuf 2.4.1
5) The destination is the closeRegion interface is a kind of interface
hijacking. Other options would be:
- sharing the region state in zookeeper
- letting the regionserver calls the master to get the new server. On paper
this would be more efficient than a client - master call. In both cases we
could consider that the client should not connect to the master except for
cluster administration (create table, split regin; ...). That would increase
global reliability. That's for another discussion as well I think.
6) RegionServerServices has been modified to set a destination when removing a
region from the online regions.
7) In another JIRA I will manage the case when the destination is not specified
when calling the move function.

When a query fails because the region has moved, let the regionserver return
the new address to the client
--

Key: HBASE-5877
URL: https://issues.apache.org/jira/browse/HBASE-5877
Project: HBase
Issue Type: Improvement
Components: client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
Fix For: 0.96.0

Attachments: 5877.v1.patch, 5877.v12.patch, 5877.v6.patch

This is mainly useful when we do a rolling restart. This will decrease the
load on the master and the network load.
Note that a region is not immediately opened after a close. So:
- it seems preferable to wait before retrying on the other server. An
optimisation would be to have an heuristic depending on when the region was
closed.
- during a rolling restart, the server moves the regions then stops. So we
may have failures when the server is stopped, and this patch won't help.
The implementation in the first patch does:
- on the region move, there is an added parameter on the regionserver#close
to say where we are sending the region
- the regionserver keeps a list of what was moved. Each entry is kept 100
seconds.
- the regionserver sends a specific exception when it receives a query on a
moved region. This exception contains the new address.
- the client analyses the exeptions and update its cache accordingly...

[jira] [Updated] (HBASE-5877) When a query fails because the region has moved, let the regionserver return the new address to the client


 [ 
https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5877:
---

Attachment: 5877.v12.patch

 When a query fails because the region has moved, let the regionserver return 
 the new address to the client
 --

 Key: HBASE-5877
 URL: https://issues.apache.org/jira/browse/HBASE-5877
 Project: HBase
  Issue Type: Improvement
  Components: client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5877.v1.patch, 5877.v12.patch, 5877.v6.patch


 This is mainly useful when we do a rolling restart. This will decrease the 
 load on the master and the network load.
 Note that a region is not immediately opened after a close. So:
 - it seems preferable to wait before retrying on the other server. An 
 optimisation would be to have an heuristic depending on when the region was 
 closed.
 - during a rolling restart, the server moves the regions then stops. So we 
 may have failures when the server is stopped, and this patch won't help.
 The implementation in the first patch does:
 - on the region move, there is an added parameter on the regionserver#close 
 to say where we are sending the region
 - the regionserver keeps a list of what was moved. Each entry is kept 100 
 seconds.
 - the regionserver sends a specific exception when it receives a query on a 
 moved region. This exception contains the new address.
 - the client analyses the exeptions and update its cache accordingly...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-5877) When a query fails because the region has moved, let the regionserver return the new address to the client


 [ 
https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5877:
---

Status: Open  (was: Patch Available)

 When a query fails because the region has moved, let the regionserver return 
 the new address to the client
 --

 Key: HBASE-5877
 URL: https://issues.apache.org/jira/browse/HBASE-5877
 Project: HBase
  Issue Type: Improvement
  Components: client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5877.v1.patch, 5877.v12.patch, 5877.v6.patch


 This is mainly useful when we do a rolling restart. This will decrease the 
 load on the master and the network load.
 Note that a region is not immediately opened after a close. So:
 - it seems preferable to wait before retrying on the other server. An 
 optimisation would be to have an heuristic depending on when the region was 
 closed.
 - during a rolling restart, the server moves the regions then stops. So we 
 may have failures when the server is stopped, and this patch won't help.
 The implementation in the first patch does:
 - on the region move, there is an added parameter on the regionserver#close 
 to say where we are sending the region
 - the regionserver keeps a list of what was moved. Each entry is kept 100 
 seconds.
 - the regionserver sends a specific exception when it receives a query on a 
 moved region. This exception contains the new address.
 - the client analyses the exeptions and update its cache accordingly...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-5877) When a query fails because the region has moved, let the regionserver return the new address to the client


 [ 
https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5877:
---

Status: Patch Available  (was: Open)

 When a query fails because the region has moved, let the regionserver return 
 the new address to the client
 --

 Key: HBASE-5877
 URL: https://issues.apache.org/jira/browse/HBASE-5877
 Project: HBase
  Issue Type: Improvement
  Components: client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5877.v1.patch, 5877.v12.patch, 5877.v6.patch


 This is mainly useful when we do a rolling restart. This will decrease the 
 load on the master and the network load.
 Note that a region is not immediately opened after a close. So:
 - it seems preferable to wait before retrying on the other server. An 
 optimisation would be to have an heuristic depending on when the region was 
 closed.
 - during a rolling restart, the server moves the regions then stops. So we 
 may have failures when the server is stopped, and this patch won't help.
 The implementation in the first patch does:
 - on the region move, there is an added parameter on the regionserver#close 
 to say where we are sending the region
 - the regionserver keeps a list of what was moved. Each entry is kept 100 
 seconds.
 - the regionserver sends a specific exception when it receives a query on a 
 moved region. This exception contains the new address.
 - the client analyses the exeptions and update its cache accordingly...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5929) HBaseAdmin.majorCompact and hbase shell randomly throw exceptions when asked to majorcompact regions.


[ 
https://issues.apache.org/jira/browse/HBASE-5929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267753#comment-13267753
 ] 

stack commented on HBASE-5929:
--

This seems uninterpretable as table name or region name 
'org.apache.hadoop.hbase.TableNotFoundException: ROOT,,0'... I'd have expected 
it to be -ROOT-,,0 if hbase was to have any chance?  Is this coming in via 
jruby mighty Aravind?  Does 
'ad_daily,49842:2009-07-10,1269763588508.1997607018' exist on the cluster? (I 
know I should look myself).

 HBaseAdmin.majorCompact and hbase shell randomly throw exceptions when asked 
 to majorcompact regions.
 -

 Key: HBASE-5929
 URL: https://issues.apache.org/jira/browse/HBASE-5929
 Project: HBase
  Issue Type: Bug
  Components: client, shell
Affects Versions: 0.92.1
 Environment: Linux Ubuntu Lucid 64bit
Reporter: Aravind Gottipati
Priority: Minor

 I have been noticing that calls to HBaseAdmin.majorCompact throws exceptions 
 randomly for some regions.  I could not find a pattern to these exception.  
 The code I have simply does this 
 admin.majorCompact(region.getRegionNameAsString()).  admin is an instance of 
 HBaseAdmin and region is an instance of HRegionInfo.  The exception I get is 
 org.apache.hadoop.hbase.TableNotFoundException: -ROOT-,,0
 at 
 org.apache.hadoop.hbase.client.HBaseAdmin.tableNameString(HBaseAdmin.java:1473)
  ~[hbase-0.92.1.jar:0.92.1]
 at 
 org.apache.hadoop.hbase.client.HBaseAdmin.compact(HBaseAdmin.java:1235) 
 ~[hbase-0.92.1.jar:0.92.1]
 at 
 org.apache.hadoop.hbase.client.HBaseAdmin.majorCompact(HBaseAdmin.java:1209) 
 ~[hbase-0.92.1.jar:0.92.1]
 at com.stumbleupon.hbaseadmin.HBaseCompact.compactAllServers(Unknown 
 Source) [hbase_compact.jar:na]
 In this case it's the root region, but I get similar exceptions for other 
 tables, like this.
 2012-05-03 19:03:42,994 WARN  [main] HBaseCompact: Could not compact:
 org.apache.hadoop.hbase.TableNotFoundException: 
 ad_daily,49842:2009-07-10,1269763588508.1997607018
 at 
 org.apache.hadoop.hbase.client.HBaseAdmin.tableNameString(HBaseAdmin.java:1473)
  ~[hbase-0.92.1.jar:0.92.1]
 at 
 org.apache.hadoop.hbase.client.HBaseAdmin.compact(HBaseAdmin.java:1235) 
 ~[hbase-0.92.1.jar:0.92.1]
 at 
 org.apache.hadoop.hbase.client.HBaseAdmin.majorCompact(HBaseAdmin.java:1209) 
 ~[hbase-0.92.1.jar:0.92.1]
 at 
 org.apache.hadoop.hbase.client.HBaseAdmin.majorCompact(HBaseAdmin.java:1196) 
 ~[hbase-0.92.1.jar:0.92.1]
 at com.stumbleupon.hbaseadmin.HBaseCompact.compactAllServers(Unknown 
 Source) [hbase_compact.jar:na]
 at com.stumbleupon.hbaseadmin.HBaseCompact.main(Unknown Source) 
 [hbase_compact.jar:na]
 I see this on hbase shell as well.  However, I don't see these exceptions if 
 I use admin.majorCompact(region.getRegionName()), so it looks like something 
 gets lost when I use getRegionNameAsString().
 Let me know if I can provide more information.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-5931) HBase security profile doesn't compile


 [ 
https://issues.apache.org/jira/browse/HBASE-5931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5931:
-

   Resolution: Fixed
Fix Version/s: 0.96.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to trunk.  Thanks for the patch Jimmy

 HBase security profile doesn't compile 
 ---

 Key: HBASE-5931
 URL: https://issues.apache.org/jira/browse/HBASE-5931
 Project: HBase
  Issue Type: Bug
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.96.0

 Attachments: hbase-5931.patch


 The compilation is broken.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5494) Introduce a zk hosted table-wide read/write lock so only one table operation at a time


[ 
https://issues.apache.org/jira/browse/HBASE-5494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267771#comment-13267771
 ] 

Phabricator commented on HBASE-5494:


tedyu has commented on the revision [jira] [HBASE-5494] [89-fb] Table-level 
locks for schema changing operations..

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/master/TableLockManager.java:137 Is 
there a chance that acquiredTableLocks is out of sync with the lock status up 
in zk ?
  src/main/java/org/apache/hadoop/hbase/zookeeper/DistributedLock.java:87 Add a 
space after 'wait'
  src/main/java/org/apache/hadoop/hbase/zookeeper/DistributedLock.java:104 Do 
we really need this exception ?
  src/main/java/org/apache/hadoop/hbase/zookeeper/DistributedLock.java:111 
Please include lockZNodeVersion here.
  src/main/java/org/apache/hadoop/hbase/zookeeper/DistributedLock.java:180 
fullyQualifiedZNode should be included.
  src/main/java/org/apache/hadoop/hbase/zookeeper/DistributedLock.java:292 
Please include fullyQualifiedZNode
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java:1310 
Please make the second part of the sentence correct
  src/main/java/org/apache/hadoop/hbase/util/InjectionHandler.java:144 Can this 
method be made non-public ?

REVISION DETAIL
  https://reviews.facebook.net/D2997


 Introduce a zk hosted table-wide read/write lock so only one table operation 
 at a time
 --

 Key: HBASE-5494
 URL: https://issues.apache.org/jira/browse/HBASE-5494
 Project: HBase
  Issue Type: Improvement
Reporter: stack
 Attachments: D2997.3.patch


 I saw this facility over in the accumulo code base.
 Currently we just try to sort out the mess when splits come in during an 
 online schema edit; somehow we figure we can figure all possible region 
 transition combinations and make the right call.
 We could try and narrow the number of combinations by taking out a zk table 
 lock when doing table operations.
 For example, on split or merge, we could take a read-only lock meaning the 
 table can't be disabled while these are running.
 We could then take a write only lock if we want to ensure the table doesn't 
 change while disabling or enabling process is happening.
 Shouldn't be too hard to add.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5877) When a query fails because the region has moved, let the regionserver return the new address to the client


[ 
https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267787#comment-13267787
 ] 

stack commented on HBASE-5877:
--

You don't want to have RegionMovedException carry a ServerName#toString instead 
of host and port?  Or it doesn't make sense when our cached region exceptions 
are keyed by hostname+port only?

Is this a bug fix?

{code}
@@ -1910,6 +1989,7 @@ public class HConnectionManager {
 }
   } catch (ExecutionException e) {
 LOG.debug(Failed all from  + loc, e);
+updateCachedLocations(updateHistory, loc, e);
{code}

Put the history of moved regions out into its own class?

Don't presize this I'd say:

+  private static final long TIMEOUT_REGION_MOVED = (2L * 60L * 1000L);

Stuff is lazily cleared from movedRegions?  Should we have a cleaner come visit 
occasionally?

Patch looks fine to me.  Nice fat test.

bq. 5) The destination is the closeRegion interface is a kind of interface 
hijacking. Other options would be:

Why you say the above?  When we protobuf it, it'll just be an option so it 
shouldn't be too bad?

The HCM stuff is ugly but thats not your fault.



 When a query fails because the region has moved, let the regionserver return 
 the new address to the client
 --

 Key: HBASE-5877
 URL: https://issues.apache.org/jira/browse/HBASE-5877
 Project: HBase
  Issue Type: Improvement
  Components: client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5877.v1.patch, 5877.v12.patch, 5877.v6.patch


 This is mainly useful when we do a rolling restart. This will decrease the 
 load on the master and the network load.
 Note that a region is not immediately opened after a close. So:
 - it seems preferable to wait before retrying on the other server. An 
 optimisation would be to have an heuristic depending on when the region was 
 closed.
 - during a rolling restart, the server moves the regions then stops. So we 
 may have failures when the server is stopped, and this patch won't help.
 The implementation in the first patch does:
 - on the region move, there is an added parameter on the regionserver#close 
 to say where we are sending the region
 - the regionserver keeps a list of what was moved. Each entry is kept 100 
 seconds.
 - the regionserver sends a specific exception when it receives a query on a 
 moved region. This exception contains the new address.
 - the client analyses the exeptions and update its cache accordingly...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5931) HBase security profile doesn't compile


[ 
https://issues.apache.org/jira/browse/HBASE-5931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267791#comment-13267791
 ] 

Hudson commented on HBASE-5931:
---

Integrated in HBase-TRUNK #2844 (See 
[https://builds.apache.org/job/HBase-TRUNK/2844/])
HBASE-5931 HBase security profile doesn't compile (Revision 1333600)

 Result = SUCCESS
stack : 
Files : 
* 
/hbase/trunk/security/src/main/java/org/apache/hadoop/hbase/security/HBasePolicyProvider.java
* /hbase/trunk/src/main/resources/hbase-webapps/master/table.jsp


 HBase security profile doesn't compile 
 ---

 Key: HBASE-5931
 URL: https://issues.apache.org/jira/browse/HBASE-5931
 Project: HBase
  Issue Type: Bug
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.96.0

 Attachments: hbase-5931.patch


 The compilation is broken.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5494) Introduce a zk hosted table-wide read/write lock so only one table operation at a time