date:20120503

[
https://issues.apache.org/jira/browse/HBASE-5444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267217#comment-13267217
]

Hadoop QA commented on HBASE-5444:
--

+1 overall. Here are the results of testing the latest attachment

http://issues.apache.org/jira/secure/attachment/12525392/HBASE-5444-v10-trunk.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 18 new or modified tests.

+1 hadoop23. The patch compiles against the hadoop 0.23.x profile.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9)
warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed unit tests in .

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/1740//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/1740//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/1740//console

This message is automatically generated.

Add PB-based calls to HMasterRegionInterface

Key: HBASE-5444
URL: https://issues.apache.org/jira/browse/HBASE-5444
Project: HBase
Issue Type: Sub-task
Components: ipc, master, migration, regionserver
Reporter: Todd Lipcon
Assignee: Gregory Chanan
Attachments: HBASE-5444-v10-trunk.patch, HBASE-5444-v6-trunk.patch,
HBASE-5444-v9-trunk.patch

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5916) RS restart just before master intialization we make the cluster non operative

2012-05-03 Thread chunhui shen (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267218#comment-13267218
 ] 

chunhui shen commented on HBASE-5916:
-

We have also encountered this issue.

What about removing
{code}throw new PleaseHoldException(message);
{code}
in ServerManager#checkAlreadySameHostPort?

 RS restart just before master intialization we make the cluster non operative
 -

 Key: HBASE-5916
 URL: https://issues.apache.org/jira/browse/HBASE-5916
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.94.1


 Consider a case where the master is being restarted.  An RS that was alive 
 when the master restart began is itself restarted before the master enables 
 the ServerShutdownHandler.
 {code}
 serverShutdownHandlerEnabled = true;
 {code}
 In this case, when the RS tries to register with the master, the master 
 tries to expire the old server instance, but it cannot be expired because the 
 serverShutdownHandler is not yet enabled.
 This can happen when only one RS is restarted or when all the RSs are 
 restarted at the same time (before assignRootandMeta).
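 {code}
 LOG.info(message);
 // A larger startcode means the registering server started later, so the
 // existing entry for the same host and port looks stale.
 if (existingServer.getStartcode() < serverName.getStartcode()) {
   LOG.info("Triggering server recovery; existingServer " +
     existingServer + " looks stale, new server: " + serverName);
   expireServer(existingServer);
 }
 {code}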
 If another RS is brought up, the cluster returns to normal.
 This may be a rare corner case.
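
The scenario above can be modelled in a few lines. This is only a sketch under stated assumptions: RegistrationSketch, register and enableServerShutdownHandler are hypothetical names, returning false stands in for the PleaseHoldException, and expiry is assumed to do nothing while the handler is disabled. It is not the ServerManager code.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of the registration check; not the HBase ServerManager. */
public class RegistrationSketch {
    // host:port -> startcode of the server currently registered there
    private final Map<String, Long> onlineServers = new HashMap<>();
    // Mirrors the serverShutdownHandlerEnabled flag from the description
    private boolean serverShutdownHandlerEnabled = false;

    public void enableServerShutdownHandler() {
        serverShutdownHandlerEnabled = true;
    }

    /** Returns true if the server was registered, false if it was told to hold. */
    public boolean register(String hostPort, long startcode) {
        Long existing = onlineServers.get(hostPort);
        if (existing == null) {
            onlineServers.put(hostPort, startcode);
            return true;
        }
        if (existing < startcode) {
            // The existing entry looks stale, so try to expire it
            expireServer(hostPort);
        }
        // Stands in for "throw new PleaseHoldException(message)"
        return false;
    }

    private void expireServer(String hostPort) {
        if (!serverShutdownHandlerEnabled) {
            // Assumption: expiry has no effect until the handler is enabled
            return;
        }
        onlineServers.remove(hostPort);
    }
}
```

In this model a restarted RS is told to hold on every attempt until the handler is enabled. After that, one further rejected attempt expires the stale entry and the next attempt registers.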


[jira] [Commented] (HBASE-4990) Document secure HBase setup


[ 
https://issues.apache.org/jira/browse/HBASE-4990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267219#comment-13267219
 ] 

Hudson commented on HBASE-4990:
---

Integrated in HBase-TRUNK-security #190 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/190/])
HBASE-4990 Document secure HBase setup (Revision 1333212)

 Result = SUCCESS
stack : 
Files : 
* /hbase/trunk/src/docbkx/book.xml
* /hbase/trunk/src/docbkx/security.xml


 Document secure HBase setup
 ---

 Key: HBASE-4990
 URL: https://issues.apache.org/jira/browse/HBASE-4990
 Project: HBase
  Issue Type: Sub-task
Affects Versions: 0.92.0
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.96.0

 Attachments: 4990.txt, 4990v2.txt





[jira] [Commented] (HBASE-5840) Open Region FAILED_OPEN doesn't clear the TaskMonitor Status, keeps showing the old status


[ 
https://issues.apache.org/jira/browse/HBASE-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267221#comment-13267221
 ] 

Hudson commented on HBASE-5840:
---

Integrated in HBase-TRUNK-security #190 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/190/])
HBASE-5840 Open Region FAILED_OPEN doesn't clear the TaskMonitor Status, 
keeps showing the old status (RajeshBabu) (Revision 1333124)
HBASE-5548 Add ability to get a table in the shell; BACKING OUT MISTAKEN 
CO-COMMIT OF HBASE-5840 (Revision 1333123)

 Result = SUCCESS
ramkrishna : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java

stack : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java


 Open Region FAILED_OPEN doesn't clear the TaskMonitor Status, keeps showing 
 the old status
 --

 Key: HBASE-5840
 URL: https://issues.apache.org/jira/browse/HBASE-5840
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: rajeshbabu
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-5840.patch, HBASE-5840_trunk.patch, 
 HBASE-5840_v2.patch


 The TaskMonitor status is not cleared when a region goes to FAILED_OPEN, so 
 it keeps showing the old status.
 This misleads the user.


[jira] [Commented] (HBASE-1996) Configure scanner buffer in bytes instead of number of rows


[ 
https://issues.apache.org/jira/browse/HBASE-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267220#comment-13267220
 ] 

Hudson commented on HBASE-1996:
---

Integrated in HBase-TRUNK-security #190 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/190/])
HBASE-2214 Do HBASE-1996 -- setting size to return in scan rather than 
count of rows -- properly (Ferdy Galema) (Revision 1333122)

 Result = SUCCESS
tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/AdminProtos.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ClientProtos.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/HBaseProtos.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/RPCProtos.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ZooKeeperProtos.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScanner.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServer.java
* /hbase/trunk/src/main/protobuf/Client.proto
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java


 Configure scanner buffer in bytes instead of number of rows
 ---

 Key: HBASE-1996
 URL: https://issues.apache.org/jira/browse/HBASE-1996
 Project: HBase
  Issue Type: Improvement
Reporter: Dave Latham
Assignee: Dave Latham
 Fix For: 0.90.0

 Attachments: 1966.patch, 1996-0.20.3-v2.patch, 1996-0.20.3-v3.patch, 
 1996-0.20.3.patch


 Currently, the default scanner fetches a single row at a time.  This makes 
 for very slow scans on tables where the rows are not large.  You can change 
 the setting for an HTable instance or for each Scan.
 It would be better to have a default that performs reasonably well so that 
 people stop running into slow scans because they are evaluating HBase, aren't 
 familiar with the setting, or simply forgot.  Unfortunately, if we increase 
 the value of the current setting, then we run the risk of running OOM for 
 tables with large rows.  Let's change the setting so that it works with a 
 size in bytes, rather than in rows.  This will allow us to set a reasonable 
 default so that tables with small rows will scan performantly and tables with 
 large rows will not run OOM.
 Note that the case is very similar for table writes.  When auto flush is 
 disabled, we buffer a list of Puts to commit at once.  That buffer is 
 measured in bytes, so that a small number of large Puts or a lot of small 
 Puts can each fit in a single flush.  If that buffer were measured in number 
 of Puts, it would have the same problem that we have for the scan buffer, and 
 we wouldn't be able to set a good default value for tables with different 
 size rows.  Changing the scan buffer to be configured like the write buffer 
 will make it more consistent.
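
As a rough illustration of the proposal, a batch can be bounded by accumulated bytes rather than by row count. The class and method names here (SizeBoundedBatch, nextBatch) are hypothetical and rows are plain byte arrays. This is a sketch of the idea, not HBase's scanner implementation.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical helper; illustrates size-bounded batching, not HBase's scanner code. */
public class SizeBoundedBatch {
    /**
     * Collects rows until their combined length reaches maxBytes.
     * At least one row is returned whenever any remain, so a single
     * oversized row still makes progress.
     */
    public static List<byte[]> nextBatch(Iterator<byte[]> rows, long maxBytes) {
        List<byte[]> batch = new ArrayList<>();
        long bytes = 0;
        while (rows.hasNext()) {
            byte[] row = rows.next();
            batch.add(row);
            bytes += row.length;
            if (bytes >= maxBytes) {
                break; // size budget reached; stop before pulling more rows
            }
        }
        return batch;
    }
}
```

With one budget, many small rows are packed into a batch, while a table with large rows gets only a row or two per batch. That is the property the description asks of a default.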


[jira] [Commented] (HBASE-5919) Add fixes for Ted's review comments from HBASE-5869


[ 
https://issues.apache.org/jira/browse/HBASE-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267223#comment-13267223
 ] 

Hudson commented on HBASE-5919:
---

Integrated in HBase-TRUNK-security #190 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/190/])
HBASE-5919 Add fixes for Ted's review comments from HBASE-5869 (Revision 
104)

 Result = SUCCESS
tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/Bytes.java


 Add fixes for Ted's review comments from HBASE-5869
 ---

 Key: HBASE-5919
 URL: https://issues.apache.org/jira/browse/HBASE-5919
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: Ted Yu
Priority: Blocker
 Attachments: 5919-v2.txt, 5919-v4.txt, 5919.txt


 I missed addressing a few of Ted's comments on the end of my navigating 
 HBASE-5869 commit.  Fix here.  Make it a blocker.


[jira] [Commented] (HBASE-5548) Add ability to get a table in the shell


[ 
https://issues.apache.org/jira/browse/HBASE-5548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267222#comment-13267222
 ] 

Hudson commented on HBASE-5548:
---

Integrated in HBase-TRUNK-security #190 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/190/])
HBASE-5548 Add ability to get a table in the shell; BACKING OUT MISTAKEN 
CO-COMMIT OF HBASE-5840 (Revision 1333123)

 Result = SUCCESS
stack : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java


 Add ability to get a table in the shell
 ---

 Key: HBASE-5548
 URL: https://issues.apache.org/jira/browse/HBASE-5548
 Project: HBase
  Issue Type: Improvement
  Components: shell
Reporter: Jesse Yates
Assignee: Jesse Yates
 Fix For: 0.96.0

 Attachments: ruby_HBASE-5528-v0.patch, 
 ruby_HBASE-5548-addendum.patch, ruby_HBASE-5548-v1.patch, 
 ruby_HBASE-5548-v2.patch, ruby_HBASE-5548-v3.patch, ruby_HBASE-5548-v5.patch


 Currently, all the commands that operate on a table in the shell first have 
 to take the table name as input. 
 There are two main considerations:
 * It is annoying to have to write the table name every time, when you should 
 just be able to get a reference to a table
 * the current implementation is very wasteful - it creates a new HTable for 
 each call (but reuses the connection since it uses the same configuration)
 We should be able to get a handle to a single HTable and then operate on that.


[jira] [Commented] (HBASE-5913) Speed up the full scan of META


[ 
https://issues.apache.org/jira/browse/HBASE-5913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267225#comment-13267225
 ] 

Hudson commented on HBASE-5913:
---

Integrated in HBase-TRUNK-security #190 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/190/])
HBASE-5913 Speed up the full scan of META (Chunhui) (Revision 1333283)

 Result = SUCCESS
tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java


 Speed up the full scan of META
 --

 Key: HBASE-5913
 URL: https://issues.apache.org/jira/browse/HBASE-5913
 Project: HBase
  Issue Type: Improvement
Reporter: chunhui shen
Assignee: chunhui shen
 Fix For: 0.96.0, 0.94.1

 Attachments: 5913-v2.txt, HBASE-5913.patch


 In the master, we do a full scan of META in several situations, 
 for example:
 1. master startup
 2. CatalogJanitor does a full scan every 5 minutes
 3. ServerShutdownHandler calls getServerUserRegions for a dead server
 For online applications, we should do our best to reduce the processing 
 time of ServerShutdownHandler in situation 3. 
 However, we found that MetaReader#getServerUserRegions takes 14 minutes for 
 100,000 (10w) regions in our production environment.
 There are two causes:
 First, we don't use caching, so each next() fetches one row during a full 
 scan of .META.
 Second, hbase.ipc.client.tcpnodelay is false by default, and in our 
 environment each next() takes 40 ms (this is related to the row length in 
 .META.; anyone who sees the same thing could try setting it to true).
 For this issue, I think we could set the caching when doing the full scan of 
 META.
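
To see why caching matters here, it helps to count scanner round trips. In the HBase client, Scan#setCaching(int) sets how many rows each next() call to the server returns. The helper below is hypothetical (ScanRoundTrips is not part of HBase) and only does the ceiling division. The caching value 1000 in the note after it is an illustrative assumption, not the value used by the patch.

```java
/** Hypothetical arithmetic helper; not part of HBase. */
public class ScanRoundTrips {
    /** Number of next() calls needed to return the given rows at `caching` rows per call. */
    public static long roundTrips(long rows, int caching) {
        if (rows < 0 || caching < 1) {
            throw new IllegalArgumentException("rows must be >= 0 and caching >= 1");
        }
        return (rows + caching - 1) / caching; // ceiling division
    }
}
```

For 100,000 rows, roundTrips(100000, 1) is 100000 while roundTrips(100000, 1000) is 100, so any fixed per-call cost is paid far less often.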


[jira] [Commented] (HBASE-5869) Move SplitLogManager splitlog taskstate and AssignmentManager RegionTransitionData znode datas to pb


[ 
https://issues.apache.org/jira/browse/HBASE-5869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267224#comment-13267224
 ] 

Hudson commented on HBASE-5869:
---

Integrated in HBase-TRUNK-security #190 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/190/])
HBASE-5919 Add fixes for Ted's review comments from HBASE-5869 (Revision 
104)
HBASE-5869 Move SplitLogManager splitlog taskstate and AssignmentManager 
RegionTransitionData znode datas to pb (Revision 1333099)

 Result = SUCCESS
tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/Bytes.java

stack : 
Files : 
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/DeserializationException.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/EmptyWatcher.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/HBaseException.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/RegionTransition.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ServerName.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/SplitLogCounters.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/SplitLogTask.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/executor/EventHandler.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/executor/ExecutorService.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/executor/RegionTransitionData.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/ActiveMasterManager.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ZooKeeperProtos.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/SplitLogWorker.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/EmptyWatcher.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/MasterAddressTracker.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/RootRegionTracker.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKSplitLog.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java
* /hbase/trunk/src/main/protobuf/ZooKeeper.proto
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestSerialization.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestZooKeeper.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/TestWALObserver.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/Mocking.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestMasterFailover.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestMasterNoCluster.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestSplitLogManager.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitLogWorker.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestCloseRegionHandler.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestOpenRegionHandler.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java


 Move SplitLogManager splitlog taskstate and AssignmentManager 
 RegionTransitionData znode datas to pb 
 -

 Key: HBASE-5869
 URL: https://issues.apache.org/jira/browse/HBASE-5869
 Project: HBase
  Issue Type: Task
Reporter: stack
Assignee:

[jira] [Commented] (HBASE-5625) Avoid byte buffer allocations when reading a value from a Result object


[ 
https://issues.apache.org/jira/browse/HBASE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267226#comment-13267226
 ] 

Hudson commented on HBASE-5625:
---

Integrated in HBase-TRUNK-security #190 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/190/])
HBASE-5625 Avoid byte buffer allocations when reading a value from a Result 
object (Tudor Scurtu) (Revision 1333159)

 Result = SUCCESS
tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Result.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestKeyValue.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestResult.java


 Avoid byte buffer allocations when reading a value from a Result object
 ---

 Key: HBASE-5625
 URL: https://issues.apache.org/jira/browse/HBASE-5625
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.92.1
Reporter: Tudor Scurtu
Assignee: Tudor Scurtu
  Labels: patch
 Fix For: 0.96.0

 Attachments: 5625.txt, 5625v2.txt, 5625v3.txt, 5625v4.txt, 
 5625v5.txt, 5625v6.txt, 5625v7.txt, 5625v8.txt


 When calling Result.getValue(), an extra dummy KeyValue and its associated 
 underlying byte array are allocated, as well as a persistent buffer that will 
 contain the returned value.
 These can be avoided by reusing a static array for the dummy object and by 
 passing a ByteBuffer object as a value destination buffer to the read method.
 The current functionality is maintained, and we have added a separate method 
 call stack that employs the described changes. I will provide more details 
 with the patch.
 Running tests with a profiler, read time appears to drop by up to 40%.
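
The destination-buffer idea can be sketched as follows. ValueCopySketch and copyValueInto are hypothetical names chosen for illustration, and the real method signatures are in the patch. The point is only that the caller owns and reuses the ByteBuffer, so reading a value allocates nothing.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper; shows the caller-supplied buffer idea, not the patch itself. */
public class ValueCopySketch {
    /**
     * Copies length bytes starting at offset from backing into dst.
     * Returns false, leaving dst untouched, when dst lacks room.
     */
    public static boolean copyValueInto(byte[] backing, int offset, int length, ByteBuffer dst) {
        if (dst.remaining() < length) {
            return false;
        }
        // Bulk put advances dst's position by length; no new array is created
        dst.put(backing, offset, length);
        return true;
    }
}
```

A caller would allocate one ByteBuffer up front, then clear() it before each read and flip() it afterwards to consume the value.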


[jira] [Commented] (HBASE-2214) Do HBASE-1996 -- setting size to return in scan rather than count of rows -- properly


[ 
https://issues.apache.org/jira/browse/HBASE-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267227#comment-13267227
 ] 

Hudson commented on HBASE-2214:
---

Integrated in HBase-TRUNK-security #190 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/190/])
HBASE-2214 Do HBASE-1996 -- setting size to return in scan rather than 
count of rows -- properly (Ferdy Galema) (Revision 1333122)

 Result = SUCCESS
tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/AdminProtos.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ClientProtos.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/HBaseProtos.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/RPCProtos.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ZooKeeperProtos.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScanner.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServer.java
* /hbase/trunk/src/main/protobuf/Client.proto
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java


 Do HBASE-1996 -- setting size to return in scan rather than count of rows -- 
 properly
 -

 Key: HBASE-2214
 URL: https://issues.apache.org/jira/browse/HBASE-2214
 Project: HBase
  Issue Type: New Feature
Reporter: stack
Assignee: Ferdy Galema
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-2214-0.94-v2.txt, HBASE-2214-0.94-v3.txt, 
 HBASE-2214-0.94.txt, HBASE-2214-v4.txt, HBASE-2214-v5.txt, HBASE-2214-v6.txt, 
 HBASE-2214-v7.txt, HBASE-2214_with_broken_TestShell.txt


 The idea of setting a size, rather than a row count, to specify how much a 
 scanner should return in each cycle was raised over in HBASE-1996.  It's a 
 good one, making HBase behave regularly though the data under it may vary.  
 HBASE-1996 was committed, but the patch was constrained by the fact that it 
 needed to not change the RPC interface.  This issue is about doing HBASE-1996 
 for 0.21 in a clean, unconstrained way.


[jira] [Commented] (HBASE-5922) HalfStoreFileReader seekBefore causes StackOverflowError


[ 
https://issues.apache.org/jira/browse/HBASE-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267228#comment-13267228
 ] 

stack commented on HBASE-5922:
--

@Anoop It's late, but what you say makes sense.  Looking at our half file 
test, TestHalfStoreFileReader, coverage seems pretty poor.  What do you 
think?  It does not seem to test the boundary condition Nate ran into or the 
one you reason about above?

 HalfStoreFileReader seekBefore causes StackOverflowError
 

 Key: HBASE-5922
 URL: https://issues.apache.org/jira/browse/HBASE-5922
 Project: HBase
  Issue Type: Bug
  Components: client, io
Affects Versions: 0.90.0
 Environment: HBase 0.90.4
Reporter: Nate Putnam
Assignee: Nate Putnam
 Fix For: 0.90.0

 Attachments: HBASE-5922.patch, HBASE-5922.patch


 Calling HRegionServer.getClosestRowBefore() can cause a stack overflow if the 
 underlying store file is a reference and the row key is in the bottom.
 java.io.IOException: java.io.IOException: java.lang.StackOverflowError
 at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:990)
 at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:978)
 at org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1651)
 at sun.reflect.GeneratedMethodAccessor174.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
 at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
 Caused by: java.lang.StackOverflowError
 at org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:147)
 at org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
 at org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
 at org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
 at org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)


[jira] [Updated] (HBASE-5913) Speed up the full scan of META


 [ 
https://issues.apache.org/jira/browse/HBASE-5913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5913:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to 0.94 as well.

 Speed up the full scan of META
 --

 Key: HBASE-5913
 URL: https://issues.apache.org/jira/browse/HBASE-5913
 Project: HBase
  Issue Type: Improvement
Reporter: chunhui shen
Assignee: chunhui shen
 Fix For: 0.96.0, 0.94.1

 Attachments: 5913-v2.txt, HBASE-5913.patch


 In the master, we do a full scan of META in several situations, 
 for example:
 1. master startup
 2. CatalogJanitor does a full scan every 5 minutes
 3. ServerShutdownHandler calls getServerUserRegions for a dead server
 For online applications, we should do our best to reduce the processing 
 time of ServerShutdownHandler in situation 3. 
 However, we found that MetaReader#getServerUserRegions takes 14 minutes for 
 100,000 (10w) regions in our production environment.
 There are two causes:
 First, we don't use caching, so each next() fetches one row during a full 
 scan of .META.
 Second, hbase.ipc.client.tcpnodelay is false by default, and in our 
 environment each next() takes 40 ms (this is related to the row length in 
 .META.; anyone who sees the same thing could try setting it to true).
 For this issue, I think we could set the caching when doing the full scan of 
 META.


[jira] [Commented] (HBASE-5916) RS restart just before master intialization we make the cluster non operative


[ 
https://issues.apache.org/jira/browse/HBASE-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267232#comment-13267232
 ] 

stack commented on HBASE-5916:
--

Well, that's useful, right?  It's useful in the case where a regionserver 
crashes and a new one comes up fast, before the original regionserver's znode 
has expired in zk.  We shouldn't remove it.

On startup, you should not get this exception unless you have a condition like 
that described above, where a regionserver on the same host and port was 
registered previously in the master and then a new regionserver comes in w/ 
the same host and port but a different startcode?

 RS restart just before master intialization we make the cluster non operative
 -

 Key: HBASE-5916
 URL: https://issues.apache.org/jira/browse/HBASE-5916
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.94.1


 Consider a case where the master is being restarted.  An RS that was alive 
 when the master restart began is itself restarted before the master enables 
 the ServerShutdownHandler.
 {code}
 serverShutdownHandlerEnabled = true;
 {code}
 In this case, when the RS tries to register with the master, the master 
 tries to expire the old server instance, but it cannot be expired because the 
 serverShutdownHandler is not yet enabled.
 This can happen when only one RS is restarted or when all the RSs are 
 restarted at the same time (before assignRootandMeta).
 {code}
 LOG.info(message);
   if (existingServer.getStartcode()  serverName.getStartcode()) {
 LOG.info(Triggering server recovery; existingServer  +
   existingServer +  looks stale, new server: + serverName);
 expireServer(existingServer);
   }
 {code}
 If another RS is brought up then the cluster comes back to normalcy.
 May be a very corner case.


[jira] [Updated] (HBASE-5444) Add PB-based calls to HMasterRegionInterface


 [ 
https://issues.apache.org/jira/browse/HBASE-5444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5444:
-

   Resolution: Fixed
Fix Version/s: 0.96.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to trunk.  Thanks for the patch Gregory.

 Add PB-based calls to HMasterRegionInterface
 

 Key: HBASE-5444
 URL: https://issues.apache.org/jira/browse/HBASE-5444
 Project: HBase
  Issue Type: Sub-task
  Components: ipc, master, migration, regionserver
Reporter: Todd Lipcon
Assignee: Gregory Chanan
 Fix For: 0.96.0

 Attachments: HBASE-5444-v10-trunk.patch, HBASE-5444-v6-trunk.patch, 
 HBASE-5444-v9-trunk.patch





[jira] [Commented] (HBASE-5918) Master will block forever when startup if root server died between assign root and assign meta


[ 
https://issues.apache.org/jira/browse/HBASE-5918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267243#comment-13267243
 ] 

stack commented on HBASE-5918:
--

@Chunhui I suppose so.  It's just odd to have flags set in such different 
locations...  It makes tracking things difficult.  At a minimum I'd think 
we'd make a single method that sets the flag and then calls 
expireDeadNotExpiredServers so they are grouped...

What about the call to expireDeadNotExpiredServers that is done twice?  On the 
first call, we'd possibly process the server that was carrying root.  What 
happens when we call it again later in finishInitialization?  Could we end 
up processing the same server twice?

Thanks.

 Master will block forever when startup if root server died between assign 
 root and assign meta
 --

 Key: HBASE-5918
 URL: https://issues.apache.org/jira/browse/HBASE-5918
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.1
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-5918.patch, HBASE-5918.patch


 When the master is initializing, if the root server dies between assign root 
 and assign meta, the master will block at 
 HMaster#assignRootAndMeta:{code}assignmentManager.assignMeta();
 this.catalogTracker.waitForMeta();{code}
 because ServerShutdownHandler is disabled.
 So we should enable ServerShutdownHandler after calling 
 assignmentManager.assignMeta();


[jira] [Commented] (HBASE-5916) RS restart just before master intialization we make the cluster non operative


[ 
https://issues.apache.org/jira/browse/HBASE-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267253#comment-13267253
 ] 

ramkrishna.s.vasudevan commented on HBASE-5916:
---

We should not remove PleaseHoldException(message) outright.
{code}
  if (services.isServerShutdownHandlerEnabled()) {
    // master has completed the initialization
    throw new PleaseHoldException(message);
  }
{code}
This solved the actual problem.  But the problem caused by the 
FileNotFoundException should be addressed in a different way.
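
The guard above can be illustrated with a small stand-alone model. This is only a sketch under assumptions: RegistrationCheckSketch, its map of startcodes, and the unchecked PleaseHoldException stand-in are hypothetical and are not the ServerManager code; in particular, accepting the newcomer while the handler is disabled is an assumed reading of the snippet.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified, hypothetical model of the master's same-host-and-port check. */
final class RegistrationCheckSketch {
    /** Stand-in for HBase's PleaseHoldException (unchecked here to keep the sketch short). */
    static final class PleaseHoldException extends RuntimeException {
        PleaseHoldException(String message) { super(message); }
    }

    // host:port -> startcode of the server currently registered there
    private final Map<String, Long> startcodeByHostPort = new HashMap<>();
    private boolean serverShutdownHandlerEnabled = false;

    void enableServerShutdownHandler() { serverShutdownHandlerEnabled = true; }

    /** Registers a region server, or asks it to hold while a stale entry is expired. */
    void register(String hostPort, long startcode) {
        Long existing = startcodeByHostPort.get(hostPort);
        if (existing != null) {
            if (existing < startcode && serverShutdownHandlerEnabled) {
                // Models expireServer(existingServer): the stale entry goes away.
                startcodeByHostPort.remove(hostPort);
            }
            if (serverShutdownHandlerEnabled) {
                // Master has completed initialization, so holding is safe.
                throw new PleaseHoldException("Server " + hostPort + " rejected; retry later");
            }
            // Assumption: before the handler is enabled, accept the newcomer
            // instead of making it wait for an expiry that cannot happen yet.
        }
        startcodeByHostPort.put(hostPort, startcode);
    }

    boolean isRegistered(String hostPort, long startcode) {
        Long existing = startcodeByHostPort.get(hostPort);
        return existing != null && existing == startcode;
    }
}
```

In this model, an RS that restarts before the handler is enabled registers straight away, while one that restarts afterwards is held once and succeeds on its retry.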

 RS restart just before master intialization we make the cluster non operative
 -

 Key: HBASE-5916
 URL: https://issues.apache.org/jira/browse/HBASE-5916
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1, 0.94.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.94.1


 Consider a case where the master is being restarted.  An RS that was alive when 
 the master restart began is itself restarted before the master enables the 
 ServerShutdownHandler:
 {code}
 serverShutdownHandlerEnabled = true;
 {code}
 In this case, when the RS tries to register with the master, the master will 
 try to expire the old server entry, but it cannot be expired because the 
 ServerShutdownHandler is not yet enabled.
 This can happen when only one RS is restarted or when all the RSs are 
 restarted at the same time (before assignRootAndMeta).
 {code}
 LOG.info(message);
 if (existingServer.getStartcode() < serverName.getStartcode()) {
   LOG.info("Triggering server recovery; existingServer " +
     existingServer + " looks stale, new server:" + serverName);
   expireServer(existingServer);
 }
 {code}
 If another RS is brought up, the cluster returns to normal.
 This may be a rare corner case.


[jira] [Commented] (HBASE-5922) HalfStoreFileReader seekBefore causes StackOverflowError


[ 
https://issues.apache.org/jira/browse/HBASE-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267254#comment-13267254
 ] 

Anoop Sam John commented on HBASE-5922:
---

@Stack, I have not gone through the test cases.  It seems the boundary 
condition is not checked.  As far as my analysis goes, there are 2 bugs in 
this seekBefore().  I will take a look at the tests and the other methods of 
HalfStoreFileReader.
Bugs
1. As in Nate's case, a StackOverflowError when seekBefore() is called with a 
key = splitKey on the bottom half file.
2. On the top half file, a seekBefore() call with key = splitKey is supposed 
to return false, but that won't happen.  It will try to seek into the bottom 
half, I fear.
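
The two boundary cases can be sketched with a toy model. HalfFileSeekSketch is hypothetical: it uses a TreeSet of strings in place of the real scanner and returns the found key (or null) instead of a boolean. It assumes the top half holds keys at or after the split key and the bottom half holds keys before it.

```java
import java.util.TreeSet;

/** Simplified, hypothetical model of a reader over one half of a split store file. */
final class HalfFileSeekSketch {
    private final TreeSet<String> keys; // all keys in the underlying (parent) file
    private final String splitKey;
    private final boolean top;          // true: this half holds keys >= splitKey

    HalfFileSeekSketch(TreeSet<String> keys, String splitKey, boolean top) {
        this.keys = keys;
        this.splitKey = splitKey;
        this.top = top;
    }

    /** Returns the greatest key strictly before {@code key} within this half, or null. */
    String seekBefore(String key) {
        if (top) {
            // Nothing in the top half sorts before the split key.
            if (key.compareTo(splitKey) <= 0) {
                return null;
            }
            String candidate = keys.lower(key);
            // Never hand back a key that belongs to the bottom half.
            return (candidate != null && candidate.compareTo(splitKey) >= 0) ? candidate : null;
        }
        // Bottom half: clamp the target at the split key and ask the underlying
        // file directly, rather than calling seekBefore on this reader again
        // (which is what recursed without end in the reported stack trace).
        String bound = key.compareTo(splitKey) >= 0 ? splitKey : key;
        return keys.lower(bound);
    }
}
```

With keys a to e and split key c, the bottom half answers b for any target at or past c, and the top half answers null for a target of c.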

 HalfStoreFileReader seekBefore causes StackOverflowError
 

 Key: HBASE-5922
 URL: https://issues.apache.org/jira/browse/HBASE-5922
 Project: HBase
  Issue Type: Bug
  Components: client, io
Affects Versions: 0.90.0
 Environment: HBase 0.90.4
Reporter: Nate Putnam
Assignee: Nate Putnam
 Fix For: 0.90.0

 Attachments: HBASE-5922.patch, HBASE-5922.patch


 Calling HRegionServer.getClosestRowBefore() can cause a stack overflow if the 
 underlying store file is a reference and the row key is in the bottom.
 java.io.IOException: java.io.IOException: java.lang.StackOverflowError
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:990)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:978)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1651)
 at sun.reflect.GeneratedMethodAccessor174.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
 at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
 Caused by: java.lang.StackOverflowError
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:147)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
 at 
 org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)


[jira] [Commented] (HBASE-5923) Cleanup checkAndXXX logic

[
https://issues.apache.org/jira/browse/HBASE-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267260#comment-13267260
]

Hadoop QA commented on HBASE-5923:
--

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12525400/5923-trunk.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 3 new or modified tests.

+1 hadoop23. The patch compiles against the hadoop 0.23.x profile.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9)
warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed unit tests in .

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/1741//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/1741//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/1741//console

This message is automatically generated.

Cleanup checkAndXXX logic
-

Key: HBASE-5923
URL: https://issues.apache.org/jira/browse/HBASE-5923
Project: HBase
Issue Type: Improvement
Components: client, regionserver
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Minor
Fix For: 0.96.0, 0.94.1

Attachments: 5923-trunk.txt

1. the checkAnd{Put|Delete} method that takes a CompareOP is not exposed via
HTable[Interface].
2. there is unnecessary duplicate code in the check{Put|Delete} code in
HRegionServer.

[jira] [Updated] (HBASE-5918) Master will block forever when startup if root server died between assign root and assign meta

2012-05-03 Thread chunhui shen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chunhui shen updated HBASE-5918:


Attachment: HBASE-5918V2.patch

In the v2 patch, I made a single method that sets the flag and then calls 
expireDeadNotExpiredServers.  We now call this method only once.
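
As a rough illustration of the idea (not the contents of the v2 patch), the sketch below groups the flag and the expiry of queued servers in one idempotent method. ShutdownHandlerGateSketch and its fields are hypothetical names; server names are plain strings here.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model: dead servers queue up until the shutdown handler is enabled. */
final class ShutdownHandlerGateSketch {
    private final Set<String> deadNotExpired = new LinkedHashSet<>();
    private final List<String> processed = new ArrayList<>();
    private boolean enabled = false;

    /** Expires a server now if the handler is enabled, otherwise remembers it for later. */
    synchronized void expireServer(String serverName) {
        if (!enabled) {
            deadNotExpired.add(serverName);
            return;
        }
        processed.add(serverName);
    }

    /** Sets the flag and drains the queue in one place; later calls do nothing. */
    synchronized void enableServerShutdownHandler() {
        if (enabled) {
            return;
        }
        enabled = true;
        processed.addAll(deadNotExpired);
        deadNotExpired.clear();
    }

    /** Snapshot of the servers handed to shutdown processing, in order. */
    synchronized List<String> processedServers() {
        return new ArrayList<>(processed);
    }
}
```

Because the queue is cleared when it is drained and the method returns early once enabled, a server that was carrying root is processed at most once in this model, however many times the method is called.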

 Master will block forever when startup if root server died between assign 
 root and assign meta
 --

 Key: HBASE-5918
 URL: https://issues.apache.org/jira/browse/HBASE-5918
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.1
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-5918.patch, HBASE-5918.patch, HBASE-5918V2.patch


 When the master is initializing, if the root server dies between assign root 
 and assign meta, the master will block at 
 HMaster#assignRootAndMeta:{code}assignmentManager.assignMeta();
 this.catalogTracker.waitForMeta();{code}
 because ServerShutdownHandler is disabled.
 So we should enable ServerShutdownHandler after calling 
 assignmentManager.assignMeta();


[jira] [Commented] (HBASE-5913) Speed up the full scan of META


[ 
https://issues.apache.org/jira/browse/HBASE-5913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267263#comment-13267263
 ] 

Hudson commented on HBASE-5913:
---

Integrated in HBase-0.94 #174 (See 
[https://builds.apache.org/job/HBase-0.94/174/])
HBASE-5913 Speed up the full scan of META (Revision 115)

 Result = SUCCESS
larsh : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java


 Speed up the full scan of META
 --

 Key: HBASE-5913
 URL: https://issues.apache.org/jira/browse/HBASE-5913
 Project: HBase
  Issue Type: Improvement
Reporter: chunhui shen
Assignee: chunhui shen
 Fix For: 0.96.0, 0.94.1

 Attachments: 5913-v2.txt, HBASE-5913.patch


 In the master, we do a full scan of META in some situations, 
 for example:
 1. master startup
 2. CatalogJanitor does a full scan every 5 mins
 3. ServerShutdownHandler calls getServerUserRegions for a dead server.
 For online applications, we should do our best to reduce the processing 
 time of ServerShutdownHandler in situation 3. 
 However, we found that MetaReader#getServerUserRegions takes 14 mins for 10w 
 (100,000) regions in our production environment.
 There are two causes:
 First, we don't use caching and so fetch one row per next() when fully 
 scanning .META.
 Second, hbase.ipc.client.tcpnodelay is false by default, and in our 
 environment each next() takes 40ms (this is related to the length of the 
 rows in .META.; if you see the same thing, try setting it to true).
 For this issue, I think we could set the scanner caching when doing the full 
 scan of META.
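
To see why caching matters, here is a back-of-the-envelope sketch of how many scanner round trips a full scan needs. ScanRoundTripsSketch is a hypothetical helper, the caching value of 500 is only an example, and the model ignores the final empty fetch that ends a scan. In the HBase client the corresponding knob is Scan#setCaching(int).

```java
/** Hypothetical helper: estimates scanner round trips for a full scan. */
final class ScanRoundTripsSketch {
    /**
     * Number of next() round trips needed to fetch {@code rows} rows when each
     * round trip returns up to {@code caching} rows (ceiling division).
     */
    static long roundTrips(long rows, int caching) {
        if (rows < 0 || caching < 1) {
            throw new IllegalArgumentException("rows must be >= 0 and caching >= 1");
        }
        return (rows + caching - 1) / caching;
    }

    public static void main(String[] args) {
        long rows = 100_000L; // assumed size of the scan, one row per region
        System.out.println("caching=1:   " + roundTrips(rows, 1) + " round trips");
        System.out.println("caching=500: " + roundTrips(rows, 500) + " round trips");
    }
}
```

Any fixed per-call latency is paid once per round trip, so cutting the number of round trips cuts that part of the scan time by the same factor.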


[jira] [Commented] (HBASE-5732) Remove the SecureRPCEngine and merge the security-related logic in the core engine

2012-05-03 Thread Devaraj Das (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267270#comment-13267270
]

Devaraj Das commented on HBASE-5732:

bq. There is no corresponding remove of the /security directory. Should it be
included here?

Yeah, it shouldn't be there. However, I generated the patch with
--no-diff-deleted and hence these files still show up but if you download the
patch you will see a bunch of lines that say Index: security/... (deleted).
The person who commits needs to be aware of this I guess and run the
appropriate svn commands.

bq. I don't see you regenerating pb stuff after making these changes in this
proto file

There is actually - RPCProtos.java.

bq. What is this? Mistake?

(comment to do with the conf file change). I merged in the stuff from
hbase-site.xml from the security/src/test/resources into the src/test/resources
one since the security one would go away (yeah you won't know about it unless
you do a manual diff of the two hbase-site.xml files).

I am in the process of setting up a secure cluster etc. for some manual
testing.. Fingers crossed.

Remove the SecureRPCEngine and merge the security-related logic in the core
engine
--

Key: HBASE-5732
URL: https://issues.apache.org/jira/browse/HBASE-5732
Project: HBase
Issue Type: Improvement
Reporter: Devaraj Das
Assignee: Devaraj Das
Attachments: rpcengine-merge.3.patch, rpcengine-merge.4.patch,
rpcengine-merge.patch

Remove the SecureRPCEngine and merge the security-related logic in the core
engine. Follow up to HBASE-5727.

[jira] [Commented] (HBASE-5444) Add PB-based calls to HMasterRegionInterface


[ 
https://issues.apache.org/jira/browse/HBASE-5444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267284#comment-13267284
 ] 

Hudson commented on HBASE-5444:
---

Integrated in HBase-TRUNK #2842 (See 
[https://builds.apache.org/job/HBase-TRUNK/2842/])
HBASE-5444 Add PB-based calls to HMasterRegionInterface (Revision 119)

 Result = FAILURE
stack : 
Files : 
* 
/hbase/trunk/src/main/jamon/org/apache/hadoop/hbase/tmpl/master/MasterStatusTmpl.jamon
* 
/hbase/trunk/src/main/jamon/org/apache/hadoop/hbase/tmpl/regionserver/RSStatusTmpl.jamon
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ClusterStatus.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ServerLoad.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseRpcMetrics.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HMasterRegionInterface.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/Invocation.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/RegionServerStatusProtocol.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/MXBean.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/MXBeanImpl.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/MasterDumpServlet.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/HBaseProtos.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/RegionServerStatusProtos.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/trunk/src/main/protobuf/RegionServerStatus.proto
* /hbase/trunk/src/main/protobuf/hbase.proto
* /hbase/trunk/src/main/resources/hbase-webapps/master/table.jsp
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/MiniHBaseCluster.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassLoading.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestMXBean.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestMasterNoCluster.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestServerCustomProtocol.java


 Add PB-based calls to HMasterRegionInterface
 

 Key: HBASE-5444
 URL: https://issues.apache.org/jira/browse/HBASE-5444
 Project: HBase
  Issue Type: Sub-task
  Components: ipc, master, migration, regionserver
Reporter: Todd Lipcon
Assignee: Gregory Chanan
 Fix For: 0.96.0

 Attachments: HBASE-5444-v10-trunk.patch, HBASE-5444-v6-trunk.patch, 
 HBASE-5444-v9-trunk.patch





[jira] [Commented] (HBASE-5918) Master will block forever when startup if root server died between assign root and assign meta

[
https://issues.apache.org/jira/browse/HBASE-5918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267287#comment-13267287
]

Hadoop QA commented on HBASE-5918:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12525408/HBASE-5918V2.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified
tests.
Please justify why no new tests are needed for this
patch.
Also please list what manual steps were performed to
verify this patch.

+1 hadoop23. The patch compiles against the hadoop 0.23.x profile.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9)
warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed unit tests in .

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/1742//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/1742//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/1742//console

This message is automatically generated.

Master will block forever when startup if root server died between assign
root and assign meta
--

Key: HBASE-5918
URL: https://issues.apache.org/jira/browse/HBASE-5918
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.92.1
Reporter: chunhui shen
Assignee: chunhui shen
Attachments: HBASE-5918.patch, HBASE-5918.patch, HBASE-5918V2.patch

When the master is initializing, if the root server dies between assign root
and assign meta, the master will block at
HMaster#assignRootAndMeta:{code}assignmentManager.assignMeta();
this.catalogTracker.waitForMeta();{code}
because ServerShutdownHandler is disabled.
So we should enable ServerShutdownHandler after calling
assignmentManager.assignMeta();

[jira] [Commented] (HBASE-5876) TestImportExport has been failing against hadoop 0.23 profile

2012-05-03 Thread Jonathan Hsieh (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267310#comment-13267310
 ] 

Jonathan Hsieh commented on HBASE-5876:
---

This is related to HBASE-5697 -- it has to do with some sort of inconsistency 
between old mr properties (fs.default.name) and new mr properties (fs.defaultFS).  
When only the new hadoop 23 properties are used, this test passes consistently.

 TestImportExport has been failing against hadoop 0.23 profile
 -

 Key: HBASE-5876
 URL: https://issues.apache.org/jira/browse/HBASE-5876
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0, 0.96.0
Reporter: Zhihong Yu
Assignee: Jonathan Hsieh

 TestImportExport has been failing against hadoop 0.23 profile


[jira] [Updated] (HBASE-5844) Delete the region servers znode after a regions server crash


 [ 
https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5844:
---

Attachment: 5844.v3.patch

 Delete the region servers znode after a regions server crash
 

 Key: HBASE-5844
 URL: https://issues.apache.org/jira/browse/HBASE-5844
 Project: HBase
  Issue Type: Improvement
  Components: regionserver, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.96.0

 Attachments: 5844.v1.patch, 5844.v2.patch, 5844.v3.patch, 
 5844.v3.patch, 5844.v4.patch


 Today, if the region server crashes, its znode is not deleted in ZooKeeper. 
 So the recovery process starts only after a timeout, usually 30s.
 By deleting the znode in the start script, we remove this delay and the 
 recovery starts immediately.


[jira] [Updated] (HBASE-5844) Delete the region servers znode after a regions server crash


 [ 
https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5844:
---

Status: Open  (was: Patch Available)

 Delete the region servers znode after a regions server crash
 

 Key: HBASE-5844
 URL: https://issues.apache.org/jira/browse/HBASE-5844
 Project: HBase
  Issue Type: Improvement
  Components: regionserver, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.96.0

 Attachments: 5844.v1.patch, 5844.v2.patch, 5844.v3.patch, 
 5844.v3.patch, 5844.v4.patch


 Today, if the region server crashes, its znode is not deleted in ZooKeeper. 
 So the recovery process starts only after a timeout, usually 30s.
 By deleting the znode in the start script, we remove this delay and the 
 recovery starts immediately.


[jira] [Updated] (HBASE-5844) Delete the region servers znode after a regions server crash


 [ 
https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5844:
---

Attachment: 5844.v4.patch

 Delete the region servers znode after a regions server crash
 

 Key: HBASE-5844
 URL: https://issues.apache.org/jira/browse/HBASE-5844
 Project: HBase
  Issue Type: Improvement
  Components: regionserver, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.96.0

 Attachments: 5844.v1.patch, 5844.v2.patch, 5844.v3.patch, 
 5844.v3.patch, 5844.v4.patch


 Today, if the region server crashes, its znode is not deleted in ZooKeeper. 
 So the recovery process starts only after a timeout, usually 30s.
 By deleting the znode in the start script, we remove this delay and the 
 recovery starts immediately.


[jira] [Updated] (HBASE-5844) Delete the region servers znode after a regions server crash


 [ 
https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5844:
---

Status: Patch Available  (was: Reopened)

 Delete the region servers znode after a regions server crash
 

 Key: HBASE-5844
 URL: https://issues.apache.org/jira/browse/HBASE-5844
 Project: HBase
  Issue Type: Improvement
  Components: regionserver, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.96.0

 Attachments: 5844.v1.patch, 5844.v2.patch, 5844.v3.patch, 
 5844.v3.patch, 5844.v4.patch


 Today, if the region server crashes, its znode is not deleted in ZooKeeper. 
 So the recovery process starts only after a timeout, usually 30s.
 By deleting the znode in the start script, we remove this delay and the 
 recovery starts immediately.


[jira] [Updated] (HBASE-5844) Delete the region servers znode after a regions server crash


 [ 
https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5844:
---

Status: Patch Available  (was: Open)

 Delete the region servers znode after a regions server crash
 

 Key: HBASE-5844
 URL: https://issues.apache.org/jira/browse/HBASE-5844
 Project: HBase
  Issue Type: Improvement
  Components: regionserver, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.96.0

 Attachments: 5844.v1.patch, 5844.v2.patch, 5844.v3.patch, 
 5844.v3.patch, 5844.v4.patch


 Today, if the region server crashes, its znode is not deleted in ZooKeeper. 
 So the recovery process starts only after a timeout, usually 30s.
 By deleting the znode in the start script, we remove this delay and the 
 recovery starts immediately.


[jira] [Commented] (HBASE-5844) Delete the region servers znode after a regions server crash


[ 
https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267332#comment-13267332
 ] 

nkeywal commented on HBASE-5844:


v4 should be ok.
I will do another jira for the master.

 Delete the region servers znode after a regions server crash
 

 Key: HBASE-5844
 URL: https://issues.apache.org/jira/browse/HBASE-5844
 Project: HBase
  Issue Type: Improvement
  Components: regionserver, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.96.0

 Attachments: 5844.v1.patch, 5844.v2.patch, 5844.v3.patch, 
 5844.v3.patch, 5844.v4.patch


 Today, if the region server crashes, its znode is not deleted in ZooKeeper. 
 So the recovery process starts only after a timeout, usually 30s.
 By deleting the znode in the start script, we remove this delay and the 
 recovery starts immediately.


[jira] [Created] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

nkeywal created HBASE-5924:
--

 Summary: In the client code, don't wait for all the requests to be 
executed before resubmitting a request in error.
 Key: HBASE-5924
 URL: https://issues.apache.org/jira/browse/HBASE-5924
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor


The client (in the function HConnectionManager#processBatchCallback) works in 
two steps:
 - make the requests
 - collect the failures and successes and prepare for retry

This means that when there is an immediate error (region moved, split, dead 
server, ...) we still wait for all the initial requests to finish before 
resubmitting the failed request. In a scenario where every request takes 
5 seconds, the final execution time is: 5 (initial requests) + 1 (wait time) 
+ 5 (final request) = 11s.

We could improve this by analyzing the results immediately. For the scenario 
above, this would bring the total down to 6 seconds. 

So we could see a performance improvement of nearly 50% in many cases, and 
much more than 50% if the request execution times differ.
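
The arithmetic above can be written out as a small sketch. BatchRetryTimingSketch and its parameter names are hypothetical; the model assumes a single failed request that is detected at a known time and retried once after the wait.

```java
/** Hypothetical timing model for one failed request in a batch. */
final class BatchRetryTimingSketch {
    /** Current behaviour: wait for every initial request, then wait, then retry. */
    static double twoPhaseSeconds(double slowestInitial, double waitBeforeRetry, double retry) {
        return slowestInitial + waitBeforeRetry + retry;
    }

    /**
     * Proposed behaviour: start the wait-and-retry as soon as the failure is seen.
     * The batch ends when both the retry and the slowest initial request are done.
     */
    static double immediateSeconds(double failureDetectedAt, double slowestInitial,
                                   double waitBeforeRetry, double retry) {
        return Math.max(slowestInitial, failureDetectedAt + waitBeforeRetry + retry);
    }

    public static void main(String[] args) {
        double twoPhase = twoPhaseSeconds(5, 1, 5);
        double immediate = immediateSeconds(0, 5, 1, 5);
        System.out.println("two-phase: " + twoPhase + "s, immediate: " + immediate + "s");
    }
}
```

With the numbers from the description (5s requests, 1s wait, failure seen at once), the two methods give 11s and 6s, a saving of 5/11, or about 45%.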



[jira] [Commented] (HBASE-5922) HalfStoreFileReader seekBefore causes StackOverflowError


[ 
https://issues.apache.org/jira/browse/HBASE-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267346#comment-13267346
 ] 

Anoop Sam John commented on HBASE-5922:
---

Checked the other methods in HalfStoreFileReader. They look ok to me...

As Stack also asked, how did you hit this issue in a cluster? [Functionally reproduce]

Anyway, I feel the code is supposed to handle these cases and needs a fix.

 HalfStoreFileReader seekBefore causes StackOverflowError
 

 Key: HBASE-5922
 URL: https://issues.apache.org/jira/browse/HBASE-5922
 Project: HBase
  Issue Type: Bug
  Components: client, io
Affects Versions: 0.90.0
 Environment: HBase 0.90.4
Reporter: Nate Putnam
Assignee: Nate Putnam
 Fix For: 0.90.0

 Attachments: HBASE-5922.patch, HBASE-5922.patch


 Calling HRegionServer.getClosestRowBefore() can cause a stack overflow if the 
 underlying store file is a reference and the row key is in the bottom.
 java.io.IOException: java.io.IOException: java.lang.StackOverflowError
   at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:990)
   at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:978)
   at org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1651)
   at sun.reflect.GeneratedMethodAccessor174.invoke(Unknown Source)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
   at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
 Caused by: java.lang.StackOverflowError
   at org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:147)
   at org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
   at org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
   at org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
   at org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)


[jira] [Updated] (HBASE-5902) Some scripts are not executable


 [ 
https://issues.apache.org/jira/browse/HBASE-5902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5902:
---

Status: Patch Available  (was: Open)

 Some scripts are not executable
 ---

 Key: HBASE-5902
 URL: https://issues.apache.org/jira/browse/HBASE-5902
 Project: HBase
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Trivial
 Attachments: 5902.v1.patch


 -rw-rw-r--  graceful_stop.sh
 -rw-rw-r--  hbase-config.sh
 -rw-rw-r--  local-master-backup.sh
 -rw-rw-r--  local-regionservers.sh


[jira] [Commented] (HBASE-5902) Some scripts are not executable

[
https://issues.apache.org/jira/browse/HBASE-5902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267350#comment-13267350
]

Hadoop QA commented on HBASE-5902:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12525056/5902.v1.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

-1 patch. The patch command could not apply the patch.

Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/1744//console

This message is automatically generated.


[jira] [Commented] (HBASE-5844) Delete the region servers znode after a regions server crash

[
https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267352#comment-13267352
]

Hadoop QA commented on HBASE-5844:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12525420/5844.v4.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 hadoop23. The patch compiles against the hadoop 0.23.x profile.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9)
warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed unit tests in .

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/1743//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/1743//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/1743//console

This message is automatically generated.

Delete the region servers znode after a regions server crash

Key: HBASE-5844
URL: https://issues.apache.org/jira/browse/HBASE-5844
Project: HBase
Issue Type: Improvement
Components: regionserver, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Fix For: 0.96.0

Attachments: 5844.v1.patch, 5844.v2.patch, 5844.v3.patch,
5844.v3.patch, 5844.v4.patch

Today, if the region server crashes, its znode is not deleted in ZooKeeper.
So the recovery process will start only after a timeout, usually 30s.
By deleting the znode in the start script, we remove this delay and the recovery
starts immediately.

[jira] [Updated] (HBASE-5883) Backup master is going down due to connection refused exception


 [ 
https://issues.apache.org/jira/browse/HBASE-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jieshan Bean updated HBASE-5883:


Attachment: (was: HBASE-5883-94.patch)

 Backup master is going down due to connection refused exception
 ---

 Key: HBASE-5883
 URL: https://issues.apache.org/jira/browse/HBASE-5883
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.6, 0.92.1, 0.94.0
Reporter: Gopinathan A
Assignee: Jieshan Bean
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-5883-90.patch, HBASE-5883-92.patch, 
 HBASE-5883-94.patch, HBASE-5883-trunk.patch


 The active master node's network was down for some time (this node hosts 
 Master, DN, ZK and RS). The backup node got the 
 notification and started to become active. The backup node was immediately 
 aborted with the exception below.
 {noformat}
 2012-04-09 10:42:24,270 INFO org.apache.hadoop.hbase.master.SplitLogManager: finished splitting (more than or equal to) 861248320 bytes in 4 log files in [hdfs://192.168.47.205:9000/hbase/.logs/HOST-192-168-47-202,60020,1333715537172-splitting] in 26374ms
 2012-04-09 10:42:24,316 FATAL org.apache.hadoop.hbase.master.HMaster: Master server abort: loaded coprocessors are: []
 2012-04-09 10:42:24,333 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
 java.io.IOException: java.net.ConnectException: Connection refused
   at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:375)
   at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1045)
   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:897)
   at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
   at $Proxy13.getProtocolVersion(Unknown Source)
   at org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:183)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:303)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:280)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:332)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:236)
   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1276)
   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1233)
   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1220)
   at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:569)
   at org.apache.hadoop.hbase.catalog.CatalogTracker.getRootServerConnection(CatalogTracker.java:369)
   at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForRootServerConnection(CatalogTracker.java:353)
   at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyRootRegionLocation(CatalogTracker.java:660)
   at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:616)
   at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:540)
   at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:363)
   at java.lang.Thread.run(Thread.java:662)
 Caused by: java.net.ConnectException: Connection refused
   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
   at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:488)
   at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:328)
   at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:362)
   ... 20 more
 2012-04-09 10:42:24,336 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
 2012-04-09 10:42:24,336 DEBUG org.apache.hadoop.hbase.master.HMaster: Stopping service threads
 {noformat}


[jira] [Updated] (HBASE-5883) Backup master is going down due to connection refused exception


 [ 
https://issues.apache.org/jira/browse/HBASE-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jieshan Bean updated HBASE-5883:


Attachment: HBASE-5883-90.patch
HBASE-5883-92.patch
HBASE-5883-94.patch

Patches for all the branches. All test cases passed.


[jira] [Issue Comment Edited] (HBASE-5875) Process RIT and Master restart may remove an online server considering it as a dead server

[
https://issues.apache.org/jira/browse/HBASE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267392#comment-13267392
]

ramkrishna.s.vasudevan edited comment on HBASE-5875 at 5/3/12 12:44 PM:

bq.What is the above referring to? Which part of the code?

In assignRootAndMeta()
{code}
boolean rit = this.assignmentManager.
    processRegionInTransitionAndBlockUntilAssigned(HRegionInfo.ROOT_REGIONINFO);
{code}
bq.Can the master not detect this corner case just by looking at whats in zk?
By zk here, do you mean the RS node or the ROOT region node?

was (Author: ram_krish):
bq.What is the above referring to? Which part of the code?

In assignRootAndMeta()
{code}
boolean rit = this.assignmentManager.
    processRegionInTransitionAndBlockUntilAssigned(HRegionInfo.ROOT_REGIONINFO);
{code}

Process RIT and Master restart may remove an online server considering it as
a dead server
--

Key: HBASE-5875
URL: https://issues.apache.org/jira/browse/HBASE-5875
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.1
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Fix For: 0.94.1

Attachments: HBASE-5875.patch

On restart, if the master finds ROOT/META in RIT state, it tries to assign
the ROOT region through ProcessRIT.
The master triggers the assignment and then tries to verify the root region
location.
Root region location verification checks whether the RS has the region in
its online list.
If the master-triggered assignment has not yet completed on the RS, the
root region location verification fails.
Because it failed, the master calls
{code}
splitLogAndExpireIfOnline(currentRootServer);
{code}
which splits the log and also removes the server from the online server list.
Ideally there is nothing to do in the log split here, as no region server was
restarted.
So although the server is online, the master invalidates the region server.
In the special case where there is only one RS, the cluster becomes
non-operative.

[jira] [Commented] (HBASE-5875) Process RIT and Master restart may remove an online server considering it as a dead server

[
https://issues.apache.org/jira/browse/HBASE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267392#comment-13267392
]

ramkrishna.s.vasudevan commented on HBASE-5875:
---

bq.What is the above referring to? Which part of the code?

In assignRootAndMeta()
{code}
boolean rit = this.assignmentManager.
    processRegionInTransitionAndBlockUntilAssigned(HRegionInfo.ROOT_REGIONINFO);
{code}

Process RIT and Master restart may remove an online server considering it as
a dead server
--

Attachments: HBASE-5875.patch

[jira] [Commented] (HBASE-5883) Backup master is going down due to connection refused exception

[
https://issues.apache.org/jira/browse/HBASE-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267395#comment-13267395
]

Hadoop QA commented on HBASE-5883:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12525434/HBASE-5883-90.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 hadoop23. The patch compiles against the hadoop 0.23.x profile.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9)
warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

-1 core tests. The patch failed these unit tests:

org.apache.hadoop.hbase.io.hfile.TestForceCacheImportantBlocks

org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/1745//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/1745//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/1745//console

This message is automatically generated.


[jira] [Commented] (HBASE-5922) HalfStoreFileReader seekBefore causes StackOverflowError

2012-05-03 Thread Todd Johnson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267424#comment-13267424
 ] 

Todd Johnson commented on HBASE-5922:
-

I worked with Nate on this yesterday. We couldn't think of any reason you would 
want the delegate to search for the splitkey, nor any reason this method would 
need to recursively call itself.

Our reading of the code was that there are two reasons to return false: if 
'top' is true (you're in the top half of the split file) and the search key is 
greater than the splitkey (this works now) OR if 'top' is false (you're in the 
bottom half of the file) and the search key is less-than-or-equal-to the 
splitkey (presumably, the splitkey is stored in the top half, thus 
or-equal-to). 

If neither of those conditions holds, the search key may be found in the 
half-file you're looking at, so you call the delegate. 


[jira] [Commented] (HBASE-5922) HalfStoreFileReader seekBefore causes StackOverflowError


[ 
https://issues.apache.org/jira/browse/HBASE-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267453#comment-13267453
 ] 

Anoop Sam John commented on HBASE-5922:
---

In the case of the bottom file, if the passed key is >= splitkey, that is the 
case where we need to work with the passed key. It is not a case to return 
false. In this case the scanner should ideally end up pointing to the last key 
in the bottom file. Yes, the bottom file will not have the split key in it.
So we should change the key and seekBefore the splitKey, which in turn 
moves the pointer to the last key. I think it is clear to you why the stack 
overflow was happening: it is because of the = check, which is unwanted.
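
The behaviour described in this comment can be sketched without any recursion. This is an illustrative model only, not the HalfStoreFileReader code or the attached patch: the class name is hypothetical, a `TreeSet<String>` stands in for the underlying store file, the bottom half is assumed to hold keys below the split key and the top half keys at or above it, and `seekBefore` returns the key found (or `null`) instead of a boolean.

```java
import java.util.Arrays;
import java.util.TreeSet;

class HalfFileSeekSketch {
    private final TreeSet<String> file; // stand-in for the full file's sorted keys
    private final String splitKey;
    private final boolean top;

    HalfFileSeekSketch(TreeSet<String> file, String splitKey, boolean top) {
        this.file = file;
        this.splitKey = splitKey;
        this.top = top;
    }

    /** Greatest key of this half that is strictly less than key, or null if none. */
    String seekBefore(String key) {
        if (top) {
            // Nothing in the top half lies before a key at or below the split key.
            if (key.compareTo(splitKey) <= 0) {
                return null;
            }
            String found = file.lower(key);
            return (found != null && found.compareTo(splitKey) >= 0) ? found : null;
        }
        // Bottom half: clamp the key to the split key and ask the delegate once,
        // rather than calling seekBefore on this object again.
        String bound = key.compareTo(splitKey) >= 0 ? splitKey : key;
        return file.lower(bound);
    }

    public static void main(String[] args) {
        TreeSet<String> keys = new TreeSet<>(Arrays.asList("a", "b", "c", "d", "e"));
        HalfFileSeekSketch bottom = new HalfFileSeekSketch(keys, "c", false);
        System.out.println(bottom.seekBefore("e")); // prints b
    }
}
```

The clamped call goes to the delegate, so a search key equal to the split key cannot re-enter the same branch.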


[jira] [Created] (HBASE-5925) Issue with only using the old config param hbase.hstore.compactionThreshold but not the corresponding new one

Anoop Sam John created HBASE-5925:
-

 Summary: Issue with only using the old config param 
hbase.hstore.compactionThreshold but not the corresponding new one
 Key: HBASE-5925
 URL: https://issues.apache.org/jira/browse/HBASE-5925
 Project: HBase
  Issue Type: Bug
Reporter: Anoop Sam John
Priority: Minor


One observation while going through the code:

In the MemStoreFlusher constructor:
{code}
this.blockingStoreFilesNumber =
  conf.getInt("hbase.hstore.blockingStoreFiles", 7);
if (this.blockingStoreFilesNumber == -1) {
  this.blockingStoreFilesNumber = 1 +
    conf.getInt("hbase.hstore.compactionThreshold", 3);
}
{code}
As per this code, if hbase.hstore.blockingStoreFiles is configured as -1, the 
value becomes 1 + the minimum number of files to compact.

But here we read only the old config item!

Here too we need to read the new config first and, if it is not set, fall back 
to the old one. Is this a miss?

Something like:
conf.getInt("hbase.hstore.compaction.min",
 conf.getInt("hbase.hstore.compactionThreshold", 3))
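
A minimal, self-contained sketch of the proposed lookup order follows. It assumes a plain `Map<String, String>` as a stand-in for the Hadoop `Configuration` object, and the class and helper names are hypothetical; only the three property names and the defaults come from the snippet above.

```java
import java.util.HashMap;
import java.util.Map;

class CompactionMinLookup {
    static final String NEW_KEY = "hbase.hstore.compaction.min";
    static final String OLD_KEY = "hbase.hstore.compactionThreshold";
    static final String BLOCKING_KEY = "hbase.hstore.blockingStoreFiles";

    /** Stand-in for conf.getInt(key, default): parse the value if set, else the default. */
    static int getInt(Map<String, String> conf, String key, int defaultValue) {
        String value = conf.get(key);
        return value == null ? defaultValue : Integer.parseInt(value);
    }

    /** New key first, then the old key, then the built-in default of 3. */
    static int blockingStoreFiles(Map<String, String> conf) {
        int blocking = getInt(conf, BLOCKING_KEY, 7);
        if (blocking == -1) {
            blocking = 1 + getInt(conf, NEW_KEY, getInt(conf, OLD_KEY, 3));
        }
        return blocking;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(BLOCKING_KEY, "-1");
        conf.put(OLD_KEY, "5");
        conf.put(NEW_KEY, "10");
        System.out.println(blockingStoreFiles(conf)); // prints 11
    }
}
```

Nesting the old key as the default of the new key keeps existing deployments that only set hbase.hstore.compactionThreshold working unchanged.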


[jira] [Created] (HBASE-5926) Delete the master znode after a znode crash

nkeywal created HBASE-5926:
--

 Summary: Delete the master znode after a znode crash
 Key: HBASE-5926
 URL: https://issues.apache.org/jira/browse/HBASE-5926
 Project: HBase
  Issue Type: Improvement
  Components: master, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor


This is the continuation of the work done in HBASE-5844.
But we can't apply exactly the same strategy: for the region server, there is a 
znode per region server, while for the master and backup master there is a 
single znode for both.

So if we apply the same strategy as for a regionserver, we may have this 
scenario:
1) Master starts
2) Backup master starts
3) Master dies
4) ZK detects it
5) Backup master receives the update from ZK
6) Backup master creates the new master node and becomes the main master
7) Previous master script continues
8) Previous master script deletes the master node in ZK
9) => issue: we deleted the node just created by the new master

This should not happen often (usually the znode will be deleted soon enough), 
but it can happen.
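
The scenario can be replayed against an in-memory stand-in for ZooKeeper. This is only a sketch: a `ConcurrentHashMap` plays the role of ZK, the znode path and server names are assumed for illustration, and the conditional delete at the end is one possible guard that this issue does not itself propose.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class MasterZnodeRace {
    static final String MASTER_ZNODE = "/hbase/master"; // assumed path, for illustration
    private final ConcurrentMap<String, String> zk = new ConcurrentHashMap<>();

    /** Steps 1 and 6: create the single master znode; fails if one already exists. */
    boolean becomeMaster(String serverName) {
        return zk.putIfAbsent(MASTER_ZNODE, serverName) == null;
    }

    /** Step 4: ZK notices the dead master and drops its ephemeral node. */
    void sessionExpired(String serverName) {
        zk.remove(MASTER_ZNODE, serverName);
    }

    /** Step 8 as described: the old master's script deletes whatever is there. */
    void unconditionalDelete() {
        zk.remove(MASTER_ZNODE);
    }

    /** Possible guard: delete only if the node still names the crashed master. */
    boolean deleteIfOwnedBy(String serverName) {
        return zk.remove(MASTER_ZNODE, serverName);
    }

    String currentMaster() {
        return zk.get(MASTER_ZNODE);
    }

    public static void main(String[] args) {
        MasterZnodeRace cluster = new MasterZnodeRace();
        cluster.becomeMaster("master-1");
        cluster.sessionExpired("master-1");
        cluster.becomeMaster("backup-1");
        cluster.unconditionalDelete();
        System.out.println(cluster.currentMaster()); // prints null: the new master's node is gone
    }
}
```

Replacing the last call with `deleteIfOwnedBy("master-1")` leaves the node created by the backup master in place, because the stored name no longer matches the crashed master.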


[jira] [Updated] (HBASE-5926) Delete the master znode after a znode crash


 [ 
https://issues.apache.org/jira/browse/HBASE-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5926:
---

Description: 
This is the continuation of the work done in HBASE-5844.
But we can't apply exactly the same strategy: for the region server, there is a 
znode per region server, while for the master and backup master there is a 
single znode for both.

So if we apply the same strategy as for a regionserver, we may have this 
scenario:
1) Master starts
2) Backup master starts
3) Master dies
4) ZK detects it
5) Backup master receives the update from ZK
6) Backup master creates the new master node and becomes the main master
7) Previous master script continues
8) Previous master script deletes the master node in ZK
9) => issue: we deleted the node just created by the new master

This should not happen often (usually the znode will be deleted soon enough), 
but it can happen.

  was:
This is the continuation of the work done in HBASE-5844.
But we can't apply exactly the same strategy: for the region server, there is a 
znode per region server, while for the master and backup master there is a 
single znode for both.

So if we apply the same strategy as for a regionserver, we may have this 
scenario:
1) Master starts
2) Backup master starts
3) Master dies
4) ZK detects it
5) Backup master receives the update from ZK
6) Backup master creates the new master node and become the main master
7) Previous master script continues
8) Previous master script delete the master node in ZK
9) => issue: we deleted the node just created by the new master

This should not happen often (usually the znode will be delete soon enough), 
but it can happen.


 Delete the master znode after a znode crash
 ---

 Key: HBASE-5926
 URL: https://issues.apache.org/jira/browse/HBASE-5926
 Project: HBase
  Issue Type: Improvement
  Components: master, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor



[jira] [Updated] (HBASE-5919) Add fixes for Ted's review comments from HBASE-5869


 [ 
https://issues.apache.org/jira/browse/HBASE-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5919:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

 Add fixes for Ted's review comments from HBASE-5869
 ---

 Key: HBASE-5919
 URL: https://issues.apache.org/jira/browse/HBASE-5919
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: Ted Yu
Priority: Blocker
 Attachments: 5919-v2.txt, 5919-v4.txt, 5919.txt


 I missed addressing a few of Ted's comments on the end of my navigating 
 HBASE-5869 commit.  Fix here.  Make it a blocker.


[jira] [Commented] (HBASE-5922) HalfStoreFileReader seekBefore causes StackOverflowError

2012-05-03 Thread Todd Johnson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267484#comment-13267484
 ] 

Todd Johnson commented on HBASE-5922:
-

Yeah, that's not how it seems to me. But then, I didn't write the code 
originally, so perhaps I misunderstand it.

We added a test case that causes infinite recursion with the old code, but 
appears to work with the patch. The test case searches for a key that is not 
equal to the split key. Given this, I don't see how the equals check could be 
the problem. 

 HalfStoreFileReader seekBefore causes StackOverflowError
 

 Key: HBASE-5922
 URL: https://issues.apache.org/jira/browse/HBASE-5922
 Project: HBase
  Issue Type: Bug
  Components: client, io
Affects Versions: 0.90.0
 Environment: HBase 0.90.4
Reporter: Nate Putnam
Assignee: Nate Putnam
 Fix For: 0.90.0

 Attachments: HBASE-5922.patch, HBASE-5922.patch


 Calling HRegionServer.getClosestRowBefore() can cause a stack overflow if the 
 underlying store file is a reference and the row key is in the bottom.
 java.io.IOException: java.io.IOException: java.lang.StackOverflowError
   at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:990)
   at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:978)
   at org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1651)
   at sun.reflect.GeneratedMethodAccessor174.invoke(Unknown Source)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
   at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
 Caused by: java.lang.StackOverflowError
   at org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:147)
   at org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
   at org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
   at org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)
   at org.apache.hadoop.hbase.io.HalfStoreFileReader$1.seekBefore(HalfStoreFileReader.java:149)


[jira] [Updated] (HBASE-5926) Delete the master znode after a master crash


 [ 
https://issues.apache.org/jira/browse/HBASE-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5926:
---

Summary: Delete the master znode after a master crash  (was: Delete the 
master znode after a znode crash)

 Delete the master znode after a master crash
 

 Key: HBASE-5926
 URL: https://issues.apache.org/jira/browse/HBASE-5926
 Project: HBase
  Issue Type: Improvement
  Components: master, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor



[jira] [Commented] (HBASE-5922) HalfStoreFileReader seekBefore causes StackOverflowError

2012-05-03 Thread Nate Putnam (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267502#comment-13267502
 ] 

Nate Putnam commented on HBASE-5922:


@Anoop as far as reproducing the issue, I'm not sure of the exact steps that 
would cause this in a production environment. The test case in the patch will 
reproduce the issue, though. 

 HalfStoreFileReader seekBefore causes StackOverflowError
 

 Key: HBASE-5922
 URL: https://issues.apache.org/jira/browse/HBASE-5922
 Project: HBase
  Issue Type: Bug
  Components: client, io
Affects Versions: 0.90.0
 Environment: HBase 0.90.4
Reporter: Nate Putnam
Assignee: Nate Putnam
 Fix For: 0.90.0

 Attachments: HBASE-5922.patch, HBASE-5922.patch




[jira] [Commented] (HBASE-5876) TestImportExport has been failing against hadoop 0.23 profile

2012-05-03 Thread Jonathan Hsieh (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267513#comment-13267513
 ] 

Jonathan Hsieh commented on HBASE-5876:
---

Problems in previous code:
# The yarn execution framework was not used because 
HBaseTestingUtility.startMiniCluster().getConfiguration() was used instead of 
HBaseTestingUtility.getConfiguration().  
# hadoop 1's mapred.output.dir versus hadoop 2's fileoutputformat.outputdir 
caused the export job's data to get lost.

Currently running full builds against hadoop 1.0 and hadoop 0.23.x.
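
To illustrate the second problem, here is a rough sketch of a helper that writes and reads the job output directory under both property names, so a job configured one way is still found by code that looks the other way. A plain `Map` stands in for the Hadoop `Configuration`; the class `OutputDirKeys` and its methods are hypothetical, and the full hadoop 2 key `mapreduce.output.fileoutputformat.outputdir` is an assumption (the comment above only gives the `fileoutputformat.outputdir` suffix). This is not necessarily how the test was fixed.

```java
import java.util.Map;

/** Hypothetical helper; a Map stands in for the Hadoop Configuration object. */
class OutputDirKeys {
  static final String HADOOP1_KEY = "mapred.output.dir";
  // Assumed full name of the hadoop 2 property.
  static final String HADOOP2_KEY = "mapreduce.output.fileoutputformat.outputdir";

  /** Stores the output directory under both the hadoop 1 and hadoop 2 keys. */
  static void setOutputDir(Map<String, String> conf, String dir) {
    conf.put(HADOOP1_KEY, dir);
    conf.put(HADOOP2_KEY, dir);
  }

  /** Reads the output directory, preferring the hadoop 2 key when both are set. */
  static String getOutputDir(Map<String, String> conf) {
    String dir = conf.get(HADOOP2_KEY);
    return dir != null ? dir : conf.get(HADOOP1_KEY);
  }
}
```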

 TestImportExport has been failing against hadoop 0.23 profile
 -

 Key: HBASE-5876
 URL: https://issues.apache.org/jira/browse/HBASE-5876
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0, 0.96.0
Reporter: Zhihong Yu
Assignee: Jonathan Hsieh

 TestImportExport has been failing against hadoop 0.23 profile


[jira] [Commented] (HBASE-5875) Process RIT and Master restart may remove an online server considering it as a dead server

[
https://issues.apache.org/jira/browse/HBASE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267521#comment-13267521
]

ramkrishna.s.vasudevan commented on HBASE-5875:
---

I have reproduced the scenario described in the title of this JIRA with a testcase.
I have tried to follow an approach that Bijieshan suggested in
https://issues.apache.org/jira/browse/HBASE-5875?focusedCommentId=13264874&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13264874
to solve the problem. I can upload the testcase tomorrow.

Process RIT and Master restart may remove an online server considering it as
a dead server
--

Attachments: HBASE-5875.patch

[jira] [Created] (HBASE-5927) AM#unassign should handle local exceptions after calling sendRegionClose

Jieshan Bean created HBASE-5927:
---

 Summary: AM#unassign should handle local exceptions after calling 
sendRegionClose
 Key: HBASE-5927
 URL: https://issues.apache.org/jira/browse/HBASE-5927
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.1, 0.96.0, 0.94.1
Reporter: Jieshan Bean
Assignee: Jieshan Bean
 Fix For: 0.92.2, 0.96.0, 0.94.1


A possible exception: if the related regionserver was just killed (but HMaster 
has not yet noticed), we will get a local exception, "Connection reset by 
peer". If this region belongs to a disabling table, what will happen?

ServerShutdownHandler will remove this region from AM#regions, but the region 
still exists in RIT. TimeoutMonitor will take care of it once it times out and 
invoke unassign again. But the region has been removed from AM#regions, so 
unassign will return directly due to the code below:

  public void unassign(HRegionInfo region, boolean force) {
    // TODO: Method needs refactoring.  Ugly buried returns throughout.  Beware!
    LOG.debug("Starting unassignment of region " +
      region.getRegionNameAsString() + " (offlining)");

    synchronized (this.regions) {
      // Check if this region is currently assigned
      if (!regions.containsKey(region)) {
        LOG.debug("Attempted to unassign region " +
          region.getRegionNameAsString() + " but it is not " +
          "currently assigned anywhere");
        return;
      }
    }

This then leads to an endless loop.
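
The loop described above can be sketched in isolation. In the simplified model below, the early-return path also clears the stale regions-in-transition entry so that a timeout monitor has nothing left to retry. The class `UnassignSketch`, its fields and the use of plain strings for regions are hypothetical; this is only one possible remedy, not the real AssignmentManager code or the patch for this issue.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Simplified, hypothetical model of the unassign path; not the real AssignmentManager. */
class UnassignSketch {
  final Map<String, String> regions = new HashMap<>();     // region name -> server name
  final Set<String> regionsInTransition = new HashSet<>(); // stand-in for RIT

  /** Returns true if a close request would be sent, false on the early-return path. */
  boolean unassign(String region) {
    synchronized (regions) {
      if (!regions.containsKey(region)) {
        // Assumed remedy: drop the stale RIT entry so the timeout monitor
        // does not keep calling unassign for a region that is no longer assigned.
        regionsInTransition.remove(region);
        return false;
      }
    }
    regionsInTransition.add(region);
    return true;
  }
}
```

Without the `remove` call, a region that sits in `regionsInTransition` but not in `regions` would be retried on every timeout and take the same early return each time.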



[jira] [Updated] (HBASE-5927) AM#unassign should handle local exceptions after calling sendRegionClose


 [ 
https://issues.apache.org/jira/browse/HBASE-5927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jieshan Bean updated HBASE-5927:


Description: 
A possible exception: if the related regionserver was just killed (but HMaster 
has not yet noticed), we will get a local exception, "Connection reset by 
peer". If this region belongs to a disabling table, what will happen?

ServerShutdownHandler will remove this region from AM#regions, but the region 
still exists in RIT. TimeoutMonitor will take care of it once it times out and 
invoke unassign again. Since this region has been removed from AM#regions, 
unassign will return directly due to the code below:

    synchronized (this.regions) {
      // Check if this region is currently assigned
      if (!regions.containsKey(region)) {
        LOG.debug("Attempted to unassign region " +
          region.getRegionNameAsString() + " but it is not " +
          "currently assigned anywhere");
        return;
      }
    }

This then leads to an endless loop.


  was:
A possible exception: if the related regionserver was just killed (but HMaster 
has not yet noticed), we will get a local exception, "Connection reset by 
peer". If this region belongs to a disabling table, what will happen?

ServerShutdownHandler will remove this region from AM#regions, but the region 
still exists in RIT. TimeoutMonitor will take care of it once it times out and 
invoke unassign again. But the region has been removed from AM#regions, so 
unassign will return directly due to the code below:

  public void unassign(HRegionInfo region, boolean force) {
    // TODO: Method needs refactoring.  Ugly buried returns throughout.  Beware!
    LOG.debug("Starting unassignment of region " +
      region.getRegionNameAsString() + " (offlining)");

    synchronized (this.regions) {
      // Check if this region is currently assigned
      if (!regions.containsKey(region)) {
        LOG.debug("Attempted to unassign region " +
          region.getRegionNameAsString() + " but it is not " +
          "currently assigned anywhere");
        return;
      }
    }

This then leads to an endless loop.



 AM#unassign should handle local exceptions after calling sendRegionClose
 

 Key: HBASE-5927
 URL: https://issues.apache.org/jira/browse/HBASE-5927
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.1, 0.96.0, 0.94.1
Reporter: Jieshan Bean
Assignee: Jieshan Bean
 Fix For: 0.92.2, 0.96.0, 0.94.1




[jira] [Commented] (HBASE-5927) AM#unassign should handle local exceptions after calling sendRegionClose


[ 
https://issues.apache.org/jira/browse/HBASE-5927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267550#comment-13267550
 ] 

Zhihong Yu commented on HBASE-5927:
---

@Jieshan:
Can you add a new test case that shows this possibility?

 AM#unassign should handle local exceptions after calling sendRegionClose
 

 Key: HBASE-5927
 URL: https://issues.apache.org/jira/browse/HBASE-5927
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.1, 0.96.0, 0.94.1
Reporter: Jieshan Bean
Assignee: Jieshan Bean
 Fix For: 0.92.2, 0.96.0, 0.94.1




[jira] [Commented] (HBASE-5883) Backup master is going down due to connection refused exception


[ 
https://issues.apache.org/jira/browse/HBASE-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267560#comment-13267560
 ] 

Zhihong Yu commented on HBASE-5883:
---

Integrated to 0.94 and trunk.

 Backup master is going down due to connection refused exception
 ---

 Key: HBASE-5883
 URL: https://issues.apache.org/jira/browse/HBASE-5883
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.6, 0.92.1, 0.94.0
Reporter: Gopinathan A
Assignee: Jieshan Bean
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-5883-90.patch, HBASE-5883-92.patch, 
 HBASE-5883-94.patch, HBASE-5883-trunk.patch


 The active master node's network was down for some time (this node contains 
 Master, DN, ZK, RS). The backup node got the notification and started to 
 become active. Immediately the backup node aborted with the exception below.
 {noformat}
 2012-04-09 10:42:24,270 INFO org.apache.hadoop.hbase.master.SplitLogManager: finished splitting (more than or equal to) 861248320 bytes in 4 log files in [hdfs://192.168.47.205:9000/hbase/.logs/HOST-192-168-47-202,60020,1333715537172-splitting] in 26374ms
 2012-04-09 10:42:24,316 FATAL org.apache.hadoop.hbase.master.HMaster: Master server abort: loaded coprocessors are: []
 2012-04-09 10:42:24,333 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
 java.io.IOException: java.net.ConnectException: Connection refused
   at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:375)
   at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1045)
   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:897)
   at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
   at $Proxy13.getProtocolVersion(Unknown Source)
   at org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:183)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:303)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:280)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:332)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:236)
   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1276)
   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1233)
   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1220)
   at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:569)
   at org.apache.hadoop.hbase.catalog.CatalogTracker.getRootServerConnection(CatalogTracker.java:369)
   at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForRootServerConnection(CatalogTracker.java:353)
   at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyRootRegionLocation(CatalogTracker.java:660)
   at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:616)
   at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:540)
   at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:363)
   at java.lang.Thread.run(Thread.java:662)
 Caused by: java.net.ConnectException: Connection refused
   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
   at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:488)
   at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:328)
   at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:362)
   ... 20 more
 2012-04-09 10:42:24,336 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
 2012-04-09 10:42:24,336 DEBUG org.apache.hadoop.hbase.master.HMaster: Stopping service threads
 {noformat}


[jira] [Updated] (HBASE-5927) AM#unassign should handle local exceptions after calling sendRegionClose


 [ 
https://issues.apache.org/jira/browse/HBASE-5927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5927:
--

Description: 
A possible exception: if the related regionserver was just killed (but HMaster 
has not yet noticed), we will get a local exception, "Connection reset by 
peer". If this region belongs to a disabling table, what will happen?

ServerShutdownHandler will remove this region from AM#regions, but the region 
still exists in RIT. TimeoutMonitor will take care of it once it times out and 
invoke unassign again. Since this region has been removed from AM#regions, 
unassign will return directly due to the code below:
{code}
    synchronized (this.regions) {
      // Check if this region is currently assigned
      if (!regions.containsKey(region)) {
        LOG.debug("Attempted to unassign region " +
          region.getRegionNameAsString() + " but it is not " +
          "currently assigned anywhere");
        return;
      }
    }
{code}
This then leads to an endless loop.


  was:
A possible exception: if the related regionserver was just killed (but HMaster 
has not yet noticed), we will get a local exception, "Connection reset by 
peer". If this region belongs to a disabling table, what will happen?

ServerShutdownHandler will remove this region from AM#regions, but the region 
still exists in RIT. TimeoutMonitor will take care of it once it times out and 
invoke unassign again. Since this region has been removed from AM#regions, 
unassign will return directly due to the code below:

    synchronized (this.regions) {
      // Check if this region is currently assigned
      if (!regions.containsKey(region)) {
        LOG.debug("Attempted to unassign region " +
          region.getRegionNameAsString() + " but it is not " +
          "currently assigned anywhere");
        return;
      }
    }

This then leads to an endless loop.



 AM#unassign should handle local exceptions after calling sendRegionClose
 

 Key: HBASE-5927
 URL: https://issues.apache.org/jira/browse/HBASE-5927
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.1, 0.96.0, 0.94.1
Reporter: Jieshan Bean
Assignee: Jieshan Bean
 Fix For: 0.92.2, 0.96.0, 0.94.1




[jira] [Commented] (HBASE-5923) Cleanup checkAndXXX logic


[ 
https://issues.apache.org/jira/browse/HBASE-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267568#comment-13267568
 ] 

stack commented on HBASE-5923:
--

This patch is great.  Thanks for going back and doing the cleanup.

Shouldn't this class live outside the filter package?
+import org.apache.hadoop.hbase.filter.WritableByteArrayComparable;

Probably hard to move it now?  It's part of a public API?  Could deprecate and 
replace w/ a more generic, non-filter-specific class?  Moving it should not be 
part of this patch.  It's not so bad anyway having this filter package 
pollution since it's in client-facing code and clients need access to filter 
stuff...

I would think this is pollution:

+import org.apache.hadoop.hbase.protobuf.generated.ClientProtos.Condition.CompareType;

We should be pulling a non-pb class into an Interface like this.  Can we 
encapsulate these Client conditions in a non-pb class?
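
As a rough illustration of the encapsulation suggested above, a client-facing enum can mirror the generated one and convert at the RPC boundary, so that interfaces never import protobuf classes. Everything in this sketch is hypothetical: `PbCompareType` is a local stand-in for the generated `CompareType`, the three constants are only an illustrative subset, and the `CompareOp` shown here is not the existing HBase class.

```java
/** Hypothetical sketch: keeping a protobuf-generated enum out of client interfaces. */
class CompareTypeSketch {
  /** Local stand-in for the protobuf-generated enum (illustrative subset only). */
  enum PbCompareType { LESS, EQUAL, GREATER }

  /** Client-facing enum with no protobuf dependency. */
  enum CompareOp {
    LESS, EQUAL, GREATER;

    /** Converts to the wire-level enum; only RPC code would call this. */
    PbCompareType toPb() {
      return PbCompareType.valueOf(name());
    }

    /** Converts from the wire-level enum back to the client-facing one. */
    static CompareOp fromPb(PbCompareType type) {
      return CompareOp.valueOf(type.name());
    }
  }
}
```

The conversion by name relies on both enums keeping identical constant names; an explicit switch would be safer if the two were allowed to drift apart.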

 Cleanup checkAndXXX logic
 -

 Key: HBASE-5923
 URL: https://issues.apache.org/jira/browse/HBASE-5923
 Project: HBase
  Issue Type: Improvement
  Components: client, regionserver
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Minor
 Fix For: 0.96.0, 0.94.1

 Attachments: 5923-trunk.txt


 1. the checkAnd{Put|Delete} method that takes a CompareOP is not exposed via 
 HTable[Interface].
 2. there is unnecessary duplicate code in the check{Put|Delete} code in 
 HRegionServer.


[jira] [Updated] (HBASE-5883) Backup master is going down due to connection refused exception

[
https://issues.apache.org/jira/browse/HBASE-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Zhihong Yu updated HBASE-5883:
--

Comment: was deleted

(was: -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12525434/HBASE-5883-90.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 hadoop23. The patch compiles against the hadoop 0.23.x profile.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9)
warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

-1 core tests. The patch failed these unit tests:

org.apache.hadoop.hbase.io.hfile.TestForceCacheImportantBlocks

org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster

This message is automatically generated.)

Backup master is going down due to connection refused exception
---

Attachments: HBASE-5883-90.patch, HBASE-5883-92.patch,
HBASE-5883-94.patch, HBASE-5883-trunk.patch

[jira] [Commented] (HBASE-5883) Backup master is going down due to connection refused exception


[ 
https://issues.apache.org/jira/browse/HBASE-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267574#comment-13267574
 ] 

Zhihong Yu commented on HBASE-5883:
---

Integrated to 0.92 and 0.90 as well.

Thanks for the patch Jieshan.

Thanks for the review, Lars.

 Backup master is going down due to connection refused exception
 ---

 Key: HBASE-5883
 URL: https://issues.apache.org/jira/browse/HBASE-5883
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.6, 0.92.1, 0.94.0
Reporter: Gopinathan A
Assignee: Jieshan Bean
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-5883-90.patch, HBASE-5883-92.patch, 
 HBASE-5883-94.patch, HBASE-5883-trunk.patch


 The active master node's network was down for some time (this node contains 
 Master, DN, ZK, RS). The backup node got the notification and started to 
 become active. Immediately the backup node aborted with the exception below.
 {noformat}
 2012-04-09 10:42:24,270 INFO org.apache.hadoop.hbase.master.SplitLogManager: finished splitting (more than or equal to) 861248320 bytes in 4 log files in [hdfs://192.168.47.205:9000/hbase/.logs/HOST-192-168-47-202,60020,1333715537172-splitting] in 26374ms
 2012-04-09 10:42:24,316 FATAL org.apache.hadoop.hbase.master.HMaster: Master server abort: loaded coprocessors are: []
 2012-04-09 10:42:24,333 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
 java.io.IOException: java.net.ConnectException: Connection refused
   at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:375)
   at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1045)
   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:897)
   at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
   at $Proxy13.getProtocolVersion(Unknown Source)
   at org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:183)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:303)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:280)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:332)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:236)
   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1276)
   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1233)
   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1220)
   at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:569)
   at org.apache.hadoop.hbase.catalog.CatalogTracker.getRootServerConnection(CatalogTracker.java:369)
   at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForRootServerConnection(CatalogTracker.java:353)
   at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyRootRegionLocation(CatalogTracker.java:660)
   at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:616)
   at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:540)
   at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:363)
   at java.lang.Thread.run(Thread.java:662)
 Caused by: java.net.ConnectException: Connection refused
   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
   at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:488)
   at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:328)
   at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:362)
   ... 20 more
 2012-04-09 10:42:24,336 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
 2012-04-09 10:42:24,336 DEBUG org.apache.hadoop.hbase.master.HMaster: Stopping service threads
 {noformat}
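
The trace shows a java.net.ConnectException, wrapped in an IOException, travelling from CatalogTracker.verifyRootRegionLocation all the way up to HMaster.run, where it aborts the master. The following is a minimal sketch of one general way to handle such a case, assuming that "connection refused" should be read as "the server hosting the region is unreachable" rather than as a fatal error. RegionProbe, LocationVerifier, isConnectionRefused and verifyLocation are hypothetical names used for illustration only; they are neither HBase API nor the contents of the attached patches.

```java
import java.io.IOException;
import java.net.ConnectException;

/** Hypothetical stand-in for an RPC that checks whether a region server answers. */
interface RegionProbe {
  boolean ping() throws IOException;
}

final class LocationVerifier {
  private LocationVerifier() {}

  /** Walks the cause chain, because the ConnectException may be wrapped in an IOException. */
  static boolean isConnectionRefused(Throwable t) {
    for (Throwable cur = t; cur != null; cur = cur.getCause()) {
      if (cur instanceof ConnectException) {
        return true;
      }
    }
    return false;
  }

  /**
   * Returns the probe result if the server answered, or false if the
   * connection was refused. Any other I/O failure still propagates.
   */
  static boolean verifyLocation(RegionProbe probe) throws IOException {
    try {
      return probe.ping();
    } catch (IOException e) {
      if (isConnectionRefused(e)) {
        return false; // assumption: the caller treats this as a stale location
      }
      throw e;
    }
  }
}
```

With this shape the caller can react to a false result (for example by reassigning the region) instead of shutting down, while unrelated I/O errors keep their original behaviour.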

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-5923) Cleanup checkAndXXX logic


 [ 
https://issues.apache.org/jira/browse/HBASE-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5923:
-

Status: Open  (was: Patch Available)

 Cleanup checkAndXXX logic
 -

 Key: HBASE-5923
 URL: https://issues.apache.org/jira/browse/HBASE-5923
 Project: HBase
  Issue Type: Improvement
  Components: client, regionserver
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Minor
 Fix For: 0.96.0, 0.94.1

 Attachments: 5923-trunk.txt


 1. the checkAnd{Put|Delete} method that takes a CompareOp is not exposed via 
 HTable[Interface].
 2. there is unnecessary duplicate code in the checkAnd{Put|Delete} code in 
 HRegionServer.


[jira] [Updated] (HBASE-5923) Cleanup checkAndXXX logic


 [ 
https://issues.apache.org/jira/browse/HBASE-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5923:
-

Attachment: 5923-0.94.txt

0.94 patch.
Looking at the two patches now, the PB stuff is leaking through.
I.e., in trunk the generated CompareType is used by a client, whereas in 0.94 
CompareFilter.CompareOp has to be used.

That also means that in 0.94 there would be a dependency on CompareFilter in 
HTableInterface.

Please let me know what you think.


[jira] [Commented] (HBASE-5923) Cleanup checkAndXXX logic


[ 
https://issues.apache.org/jira/browse/HBASE-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267593#comment-13267593
 ] 

stack commented on HBASE-5923:
--

bq. That also means that in 0.94 there would be a dependency on CompareFilter 
in HTableInterface.

That's better than a generated pb dependency IMO.  If you'd like, I can make it 
so you can do the same or similar in trunk, i.e. not have to import the generated 
pb class but rather filter.CompareFilter or some similar class.  Just say.



[jira] [Commented] (HBASE-3996) Support multiple tables and scanners as input to the mapper in map/reduce jobs


[ 
https://issues.apache.org/jira/browse/HBASE-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267594#comment-13267594
 ] 

Zhihong Yu commented on HBASE-3996:
---

There are a few suggestions from Stack pending.

@Stack:
Can you take a look at Eran's comments from Apr 5th?

 Support multiple tables and scanners as input to the mapper in map/reduce jobs
 --

 Key: HBASE-3996
 URL: https://issues.apache.org/jira/browse/HBASE-3996
 Project: HBase
  Issue Type: Improvement
  Components: mapreduce
Reporter: Eran Kutner
Assignee: Eran Kutner
 Fix For: 0.96.0

 Attachments: 3996-v2.txt, 3996-v3.txt, 3996-v4.txt, 3996-v5.txt, 
 3996-v6.txt, 3996-v7.txt, HBase-3996.patch


 It seems that in many cases feeding data from multiple tables or multiple 
 scanners on a single table can save a lot of time when running map/reduce 
 jobs.
 I propose a new MultiTableInputFormat class that would allow doing this.
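
The idea of feeding several tables, or several scanners on one table, into a single job can be sketched in plain Java as follows. This is only an illustration of the concept under stated assumptions: ScanSpec, RegionLookup and MultiScanSplitter are hypothetical names, not HBase classes, and the sketch does not reflect how the attached patches implement MultiTableInputFormat.

```java
import java.util.ArrayList;
import java.util.List;

/** One unit of input: a table plus a row range. Hypothetical type, not an HBase class. */
final class ScanSpec {
  final String table;
  final String startRow;
  final String stopRow;

  ScanSpec(String table, String startRow, String stopRow) {
    this.table = table;
    this.startRow = startRow;
    this.stopRow = stopRow;
  }
}

/** Hypothetical lookup that returns the names of the regions a scan would touch. */
interface RegionLookup {
  List<String> regionsFor(ScanSpec scan);
}

final class MultiScanSplitter {
  private MultiScanSplitter() {}

  /** Concatenates the per-scan splits so that one job can consume all of them. */
  static List<String> splits(List<ScanSpec> scans, RegionLookup lookup) {
    List<String> all = new ArrayList<>();
    for (ScanSpec scan : scans) {
      for (String region : lookup.regionsFor(scan)) {
        all.add(scan.table + "/" + region);
      }
    }
    return all;
  }
}
```

Each resulting entry would correspond to one mapper input, so a job over two tables sees the union of both tables' splits instead of requiring two separate jobs.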


[jira] [Commented] (HBASE-5923) Cleanup checkAndXXX logic

[
https://issues.apache.org/jira/browse/HBASE-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267597#comment-13267597
]

Lars Hofhansl commented on HBASE-5923:
--

Thanks Stack. These are exactly the concerns I had.

It becomes even more pronounced when looking at the 0.94 patch, which needs to
have a slightly different client-facing API, since the PB stuff does not exist
there.

I can see a few solutions:
* Only allow using WritableByteArrayComparable, i.e. make it implied and don't
even pass it (and hence only create the dependency for HTable but not
HTableInterface).
* As you said, have a separate CompareOp class that gets translated to the
correct compareType in HTable (again would allow only HTable having the
dependency, but not HTableInterface)
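
The second option above, a separate client-facing operator that is translated to the generated type inside HTable, might look roughly like this. It is a sketch under assumptions: both enums below are local stand-ins declared only for this example (the real CompareFilter.CompareOp and the PB-generated CompareType are not imported), and CompareOpConverter is a hypothetical helper name.

```java
/** Client-facing operator; local stand-in for CompareFilter.CompareOp. */
enum CompareOp { LESS, LESS_OR_EQUAL, EQUAL, NOT_EQUAL, GREATER_OR_EQUAL, GREATER, NO_OP }

/** Wire-level operator; local stand-in for the PB-generated CompareType. */
enum CompareType { LESS, LESS_OR_EQUAL, EQUAL, NOT_EQUAL, GREATER_OR_EQUAL, GREATER, NO_OP }

final class CompareOpConverter {
  private CompareOpConverter() {}

  /**
   * Explicit mapping, so the two enums can evolve independently and a
   * missing case fails loudly instead of silently picking a wrong value.
   */
  static CompareType toCompareType(CompareOp op) {
    switch (op) {
      case LESS:             return CompareType.LESS;
      case LESS_OR_EQUAL:    return CompareType.LESS_OR_EQUAL;
      case EQUAL:            return CompareType.EQUAL;
      case NOT_EQUAL:        return CompareType.NOT_EQUAL;
      case GREATER_OR_EQUAL: return CompareType.GREATER_OR_EQUAL;
      case GREATER:          return CompareType.GREATER;
      case NO_OP:            return CompareType.NO_OP;
      default:
        throw new IllegalArgumentException("Unknown CompareOp: " + op);
    }
  }
}
```

Keeping the conversion in one place means only the implementation class needs to know about the generated type, while the interface exposes the same operator in 0.94 and trunk.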


[jira] [Updated] (HBASE-5886) Add new metric for possible data loss due to puts without WAL

2012-05-03 Thread Matteo Bertozzi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-5886:
---

Attachment: HBASE-5886-v4.patch

 Add new metric for possible data loss due to puts without WAL 
 --

 Key: HBASE-5886
 URL: https://issues.apache.org/jira/browse/HBASE-5886
 Project: HBase
  Issue Type: New Feature
  Components: metrics, regionserver
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
  Labels: metrics
 Attachments: HBASE-5886-v0.patch, HBASE-5886-v1.patch, 
 HBASE-5886-v2.patch, HBASE-5886-v3.patch, HBASE-5886-v4.patch


 Add a metric to keep track of puts without WAL and the size of the possible data loss.
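
As a rough illustration of what such a metric could track, here is a minimal sketch using two counters. The class and method names (NoWalPutMetrics, recordPut, onFlush) are hypothetical, and the assumption that a flush makes the data durable and therefore resets the at-risk byte count is mine, not taken from the attached patches.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Tracks puts that skipped the write-ahead log and how many bytes are at risk. */
final class NoWalPutMetrics {
  private final AtomicLong putsWithoutWal = new AtomicLong();
  private final AtomicLong bytesAtRisk = new AtomicLong();

  /** Records one put; only puts that bypass the WAL are counted. */
  void recordPut(boolean writeToWal, long sizeBytes) {
    if (writeToWal) {
      return;
    }
    putsWithoutWal.incrementAndGet();
    bytesAtRisk.addAndGet(sizeBytes);
  }

  /** Assumption: once the memstore is flushed the data is on disk, so nothing is at risk. */
  void onFlush() {
    bytesAtRisk.set(0);
  }

  long getPutsWithoutWal() {
    return putsWithoutWal.get();
  }

  long getBytesAtRisk() {
    return bytesAtRisk.get();
  }
}
```

The put counter is cumulative, while the byte counter reflects only data that would be lost if the server crashed right now.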


[jira] [Commented] (HBASE-5923) Cleanup checkAndXXX logic


[ 
https://issues.apache.org/jira/browse/HBASE-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267604#comment-13267604
 ] 

Lars Hofhansl commented on HBASE-5923:
--

@Stack: You mean have a CompareFilter.CompareOp to 
o.a.h.h.p.g.ClientProtos.Condition.CompareType mapping?
That'd be nice as the client facing interface would not change between 0.94 and 
trunk.
Or have a completely separate CompareOp/CompareType class?


[jira] [Updated] (HBASE-5889) Remove HRegionInterface


 [ 
https://issues.apache.org/jira/browse/HBASE-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-5889:
---

Status: Patch Available  (was: Open)

 Remove HRegionInterface
 ---

 Key: HBASE-5889
 URL: https://issues.apache.org/jira/browse/HBASE-5889
 Project: HBase
  Issue Type: Improvement
  Components: client, ipc, regionserver
Affects Versions: 0.96.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.96.0

 Attachments: hbase_5889.patch


 As a step towards moving the internals to PB, and so avoiding the conversion 
 for performance reasons, we should remove HRegionInterface. 
 The region server then supports only ClientProtocol and AdminProtocol.  
 Later on, HRegion can work with PB messages directly.


[jira] [Updated] (HBASE-5889) Remove HRegionInterface


 [ 
https://issues.apache.org/jira/browse/HBASE-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-5889:
---

Attachment: hbase_5889.patch


[jira] [Commented] (HBASE-5889) Remove HRegionInterface

2012-05-03 Thread jirapos...@reviews.apache.org (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267613#comment-13267613
 ] 

jirapos...@reviews.apache.org commented on HBASE-5889:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4993/
---

Review request for hbase.


Summary
---

Removed HRegionInterface, cleaned up HRegionServer, and moved the pb code from 
RegionServer back to HRegionServer.

The goal is to avoid maintaining two copies of the region server code, and to 
make it possible to avoid data type conversion on the server side.

Fixed some unit tests.  Now all region server unit tests test the new pb 
functions.

Enhanced getServerInfo so that it returns the webui port too.


This addresses bug HBASE-5889.
https://issues.apache.org/jira/browse/HBASE-5889


Diffs
-

  conf/hbase-policy.xml e45f23c
  security/src/main/java/org/apache/hadoop/hbase/security/HBasePolicyProvider.java 0c4b4cb
  src/main/jamon/org/apache/hadoop/hbase/tmpl/regionserver/RSStatusTmpl.jamon 87f04f4
  src/main/java/org/apache/hadoop/hbase/HConstants.java a9d80a0
  src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java e3912c2
  src/main/java/org/apache/hadoop/hbase/ipc/HBaseRpcMetrics.java fc9176d
  src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 757f98e
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.java cd9b528
  src/main/java/org/apache/hadoop/hbase/master/CatalogJanitor.java 79d5fdd
  src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java 212ee3e
  src/main/java/org/apache/hadoop/hbase/protobuf/RequestConverter.java d1e0993
  src/main/java/org/apache/hadoop/hbase/protobuf/ResponseConverter.java 81603af
  src/main/java/org/apache/hadoop/hbase/protobuf/generated/AdminProtos.java fbf0127
  src/main/java/org/apache/hadoop/hbase/protobuf/generated/ClientProtos.java db1333b
  src/main/java/org/apache/hadoop/hbase/protobuf/generated/HBaseProtos.java ae2094d
  src/main/java/org/apache/hadoop/hbase/protobuf/generated/RPCProtos.java 8b45f03
  src/main/java/org/apache/hadoop/hbase/protobuf/generated/ZooKeeperProtos.java 827fb23
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 96ac8bd
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionThriftServer.java 4cb070e
  src/main/java/org/apache/hadoop/hbase/regionserver/RegionServer.java c2c89ea
  src/main/protobuf/Admin.proto 2ad6fb0
  src/main/protobuf/RPC.proto 105fb3f
  src/main/resources/hbase-default.xml f54b345
  src/main/resources/hbase-webapps/master/table.jsp ca7310c
  src/test/java/org/apache/hadoop/hbase/TestDrainingServer.java a1992c3
  src/test/java/org/apache/hadoop/hbase/TestGlobalMemStoreSize.java ad77e0a
  src/test/java/org/apache/hadoop/hbase/TestRegionRebalancing.java 5574b7f
  src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java 3dfc94e
  src/test/java/org/apache/hadoop/hbase/client/HConnectionTestingUtility.java 42092b7
  src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java c270e28
  src/test/java/org/apache/hadoop/hbase/client/TestMultiParallel.java c36272f
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterObserver.java bdec3ee
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverInterface.java 7dbba66
  src/test/java/org/apache/hadoop/hbase/mapreduce/TestLoadIncrementalHFilesSplitRecovery.java 3acb988
  src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java eb546a5
  src/test/java/org/apache/hadoop/hbase/master/TestMasterFailover.java ceba5cd
  src/test/java/org/apache/hadoop/hbase/master/TestMasterRestartAfterDisablingTable.java ec08b17
  src/test/java/org/apache/hadoop/hbase/master/TestRollingRestart.java 30c6cf1
  src/test/java/org/apache/hadoop/hbase/master/TestZKBasedOpenCloseRegion.java 8c3f67e
  src/test/java/org/apache/hadoop/hbase/regionserver/TestEndToEndSplitTransaction.java 7bfe4cd
  src/test/java/org/apache/hadoop/hbase/regionserver/TestRSStatusServlet.java ffce7e8
  src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServerMetrics.java aa5ca37
  src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java 65fa948
  src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java fc4a0a7

Diff: https://reviews.apache.org/r/4993/diff


Testing
---

All regular and security profile tests were green before I rebased to the latest 
today.


Thanks,

Jimmy




[jira] [Commented] (HBASE-5883) Backup master is going down due to connection refused exception


[ 
https://issues.apache.org/jira/browse/HBASE-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267624#comment-13267624
 ] 

Hudson commented on HBASE-5883:
---

Integrated in HBase-TRUNK #2843 (See 
[https://builds.apache.org/job/HBase-TRUNK/2843/])
HBASE-5883 Backup master is going down due to connection refused exception 
(Jieshan) (Revision 1333530)

 Result = SUCCESS
tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java



[jira] [Commented] (HBASE-5883) Backup master is going down due to connection refused exception


[ 
https://issues.apache.org/jira/browse/HBASE-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267633#comment-13267633
 ] 

Hudson commented on HBASE-5883:
---

Integrated in HBase-0.94 #175 (See 
[https://builds.apache.org/job/HBase-0.94/175/])
HBASE-5883 Backup master is going down due to connection refused exception 
(Jieshan) (Revision 1333533)

 Result = FAILURE
tedyu : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java



[jira] [Created] (HBASE-5928) Hbck shouldn't npe when there are no tables.

Elliott Clark created HBASE-5928:


 Summary: Hbck shouldn't npe when there are no tables.
 Key: HBASE-5928
 URL: https://issues.apache.org/jira/browse/HBASE-5928
 Project: HBase
  Issue Type: Bug
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Minor


hbck (HBaseFsck) errors out with a NullPointerException when there are no tables.

Exception in thread "main" java.lang.NullPointerException
at 
org.apache.hadoop.hbase.util.HBaseFsck.reportTablesInFlux(HBaseFsck.java:560)
at 
org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:346)
at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:382)
at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:3120)
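
The report does not say which reference is null inside reportTablesInFlux. Assuming it is the collection of tables itself, which may be null rather than empty when the cluster has no tables, a defensive version of such a reporting step could be sketched as below. TablesInFluxReport and describe are hypothetical names for illustration and do not mirror the HBaseFsck source.

```java
import java.util.ArrayList;
import java.util.List;

final class TablesInFluxReport {
  private TablesInFluxReport() {}

  /**
   * Builds the report lines. A null or empty input is treated as
   * "no tables" instead of being dereferenced.
   */
  static List<String> describe(String[] tablesInFlux) {
    List<String> lines = new ArrayList<>();
    if (tablesInFlux == null || tablesInFlux.length == 0) {
      lines.add("No tables in flux.");
      return lines;
    }
    for (String table : tablesInFlux) {
      lines.add("Table " + table + " is in flux.");
    }
    return lines;
  }
}
```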


[jira] [Commented] (HBASE-5886) Add new metric for possible data loss due to puts without WAL

[
https://issues.apache.org/jira/browse/HBASE-5886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267642#comment-13267642
]

Hadoop QA commented on HBASE-5886:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12525470/HBASE-5886-v4.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 hadoop23. The patch compiles against the hadoop 0.23.x profile.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9)
warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed unit tests in .

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/1746//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/1746//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/1746//console

This message is automatically generated.


[jira] [Commented] (HBASE-5889) Remove HRegionInterface


[ 
https://issues.apache.org/jira/browse/HBASE-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267645#comment-13267645
 ] 

Zhihong Yu commented on HBASE-5889:
---

HRegionInterface is used by asynchbase (in ./src/RegionClient.java):
{code}
  writeHBaseString(buf, "org.apache.hadoop.hbase.ipc.HRegionInterface");
  final String klass = "org.apache.hadoop.hbase.ipc.HRegionInterface";
{code}
Should we start a discussion on dev@hbase to get wider feedback about the 
roadmap for non-bundled (third-party) HBase clients?


[jira] [Commented] (HBASE-5889) Remove HRegionInterface

[
https://issues.apache.org/jira/browse/HBASE-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267654#comment-13267654
]

Hadoop QA commented on HBASE-5889:
--

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12525477/hbase_5889.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 60 new or modified tests.

+1 hadoop23. The patch compiles against the hadoop 0.23.x profile.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9)
warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed unit tests in .

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/1747//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/1747//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/1747//console

This message is automatically generated.

Remove HRegionInterface
---

Key: HBASE-5889
URL: https://issues.apache.org/jira/browse/HBASE-5889
Project: HBase
Issue Type: Improvement
Components: client, ipc, regionserver
Affects Versions: 0.96.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Fix For: 0.96.0

Attachments: hbase_5889.patch

As a step to move internals to PB, so as to avoid the conversion for
performance reason, we should remove the HRegionInterface.
Therefore region server only supports ClientProtocol and AdminProtocol.
Later on, HRegion can work with PB messages directly.

[jira] [Updated] (HBASE-5494) Introduce a zk hosted table-wide read/write lock so only one table operation at a time

2012-05-03 Thread Phabricator (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5494:
---

Attachment: D2997.3.patch

avf requested code review of [jira] [HBASE-5494] [89-fb] Table-level locks for 
schema changing operations..
Reviewers: Kannan, mbautin, Liyin, JIRA

  Since concurrent modification (e.g., disabling and dropping a table under
  creation) could leave a cluster in an inconsistent state, we need table-level
  locks for schema changing operations.

  A ZooKeeper-based distributed lock has been implemented that
  attempts to create a persistent ZNode (one ZNode per entity being locked,
  i.e., one per table) if one does not exist. Currently, if a master crashes
  while holding the lock, the lock must be manually removed using the
  ZooKeeper command line (locks are stored in /hbase/tableLock/).

  The locks implemented are not fair or re-entrant. RecoverableZooKeeper is used
  to correctly handle connection loss.

  To test the locks, InjectionHandler and InjectionEvent have been introduced,
  allowing for injection of arbitrary events, in this case adding delays during
  schema changing operations so as to induce a race condition.

  Future work involves automatically deleting stale lock ZNodes upon server
  recovery (provided the attempted operations are not resumed) and adding
  metrics around locks (e.g., list all locks held).
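
The locking scheme described above can be sketched roughly as follows. This is not the code from D2997: an in-memory ConcurrentMap stands in for ZooKeeper so the example runs on its own, and the names SketchTableLock, acquire and release are hypothetical. Only the /hbase/tableLock/ prefix and the "create the node if it does not exist" idea come from the description.

```java
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch: an in-memory map stands in for ZooKeeper, with each key
// playing the role of a persistent lock ZNode under /hbase/tableLock/.
class SketchTableLock {
  private static final String LOCK_ROOT = "/hbase/tableLock/";

  private final ConcurrentMap<String, String> znodes; // path -> owner metadata
  private final String path;
  private final String owner;
  private boolean held; // sketch assumes one thread per lock instance

  SketchTableLock(ConcurrentMap<String, String> znodes, String table, String owner) {
    this.znodes = znodes;
    this.path = LOCK_ROOT + table;
    this.owner = owner;
  }

  // Retries "create if absent" until it succeeds or the timeout elapses.
  // Not fair and not re-entrant: a second acquire by the holder just times out.
  boolean acquire(long timeoutMs) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      if (znodes.putIfAbsent(path, owner) == null) {
        held = true;
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      try {
        Thread.sleep(5);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }

  // Releasing a lock that is not held is treated as a programming error.
  void release() {
    if (!held) {
      throw new IllegalStateException("lock not held: " + path);
    }
    znodes.remove(path, owner);
    held = false;
  }
}
```

Because the entry is only removed by release(), a holder that disappears leaves its entry behind, which mirrors the stale-lock case mentioned above where the ZNode has to be removed by hand.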

TEST PLAN
  Since concurrent modification (e.g., disabling and dropping a table
  under creation) could leave a cluster in an inconsistent state, we
  need table-level locks for schema changing operations.

  A ZooKeeper-based distributed lock has been implemented that attempts
  to create a persistent ZNode (one ZNode per entity being locked, i.e.,
  one per table) if one does not exist. Currently in case a master
  crashes while holding the lock, the lock must be manually removed
  using the ZooKeeper command line (locks being stored in
  /hbase/tableLock/).

  The locks implemented are not fair or re-entrant. RecoverableZooKeeper
  is used to correctly handle connection loss.

  To test the locks, InjectionHandler and InjectionEvent have been
  introduced, allowing for injection of arbitrary events, in this case
  adding delays during schema changing operations so as to induce a race
  condition.

  Future work involves automatically deleting stale lock ZNodes upon
  server recovery (provided the attempted operations are not resumed)
  and adding metrics around locks (e.g., list all locks held).


REVISION DETAIL
  https://reviews.facebook.net/D2997

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/HConstants.java
  src/main/java/org/apache/hadoop/hbase/TableLockTimeoutException.java
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java
  src/main/java/org/apache/hadoop/hbase/master/TableLockManager.java
  src/main/java/org/apache/hadoop/hbase/util/InjectionEvent.java
  src/main/java/org/apache/hadoop/hbase/util/InjectionHandler.java
  src/main/java/org/apache/hadoop/hbase/zookeeper/DistributedLock.java
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java
  src/test/java/org/apache/hadoop/hbase/master/TestSchemaModificationLocks.java
  src/test/java/org/apache/hadoop/hbase/util/DelayInducingInjectionHandler.java
  src/test/java/org/apache/hadoop/hbase/zookeeper/TestDistributedLock.java


 Introduce a zk hosted table-wide read/write lock so only one table operation 
 at a time
 --

 Key: HBASE-5494
 URL: https://issues.apache.org/jira/browse/HBASE-5494
 Project: HBase
  Issue Type: Improvement
Reporter: stack
 Attachments: D2997.3.patch


 I saw this facility over in the accumulo code base.
 Currently we just try to sort out the mess when splits come in during an 
 online schema edit; somehow we figure we can figure all possible region 
 transition combinations and make the right call.
 We could try and narrow the number of combinations by taking out a zk table 
 lock when doing table operations.
 For example, on split or merge, we could take a read-only lock meaning the 
 table can't be disabled while these are running.
 We could then take a write only lock if we want to ensure the table doesn't 
 change while disabling or enabling process is happening.
 Shouldn't be too hard to add.


[jira] [Updated] (HBASE-5928) Hbck shouldn't npe when there are no tables.


 [ 
https://issues.apache.org/jira/browse/HBASE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-5928:
-

Attachment: HBASE-5928-0.patch

Pretty small patch.
HConnectionManager.getHTableDescriptors returns null when there are no tables.

I assumed this was expected so handling the null is needed.
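
A minimal sketch of the kind of null guard described here, not the actual patch. Plain strings stand in for table descriptors, and lookupDescriptors and countTables are hypothetical names; the only behaviour taken from the comment above is that the lookup returns null rather than an empty array when there are no tables.

```java
import java.util.List;

// Hypothetical sketch only; not the HBASE-5928 patch.
class TablesInFluxSketch {
  // Stand-in for a lookup that returns null (not an empty array) when there are no tables.
  static String[] lookupDescriptors(List<String> tableNames) {
    return tableNames.isEmpty() ? null : tableNames.toArray(new String[0]);
  }

  // Caller that tolerates the null result instead of dereferencing it.
  static int countTables(List<String> tableNames) {
    String[] descriptors = lookupDescriptors(tableNames);
    if (descriptors == null) {
      return 0; // no tables: nothing to report
    }
    return descriptors.length;
  }
}
```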

 Hbck shouldn't npe when there are no tables.
 

 Key: HBASE-5928
 URL: https://issues.apache.org/jira/browse/HBASE-5928
 Project: HBase
  Issue Type: Bug
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Minor
 Attachments: HBASE-5928-0.patch


 hbase fsck errors out when there are no tables.
 Exception in thread "main" java.lang.NullPointerException
   at org.apache.hadoop.hbase.util.HBaseFsck.reportTablesInFlux(HBaseFsck.java:560)
   at org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:346)
   at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:382)
   at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:3120)


[jira] [Updated] (HBASE-5928) Hbck shouldn't npe when there are no tables.


 [ 
https://issues.apache.org/jira/browse/HBASE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-5928:
-

Status: Patch Available  (was: Open)

 Hbck shouldn't npe when there are no tables.
 

 Key: HBASE-5928
 URL: https://issues.apache.org/jira/browse/HBASE-5928
 Project: HBase
  Issue Type: Bug
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Minor
 Attachments: HBASE-5928-0.patch


 hbase fsck errors out when there are no tables.
 Exception in thread "main" java.lang.NullPointerException
   at org.apache.hadoop.hbase.util.HBaseFsck.reportTablesInFlux(HBaseFsck.java:560)
   at org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:346)
   at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:382)
   at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:3120)


[jira] [Commented] (HBASE-5373) Table level lock to prevent the race of multiple table level operation

2012-05-03 Thread Liyin Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267667#comment-13267667
 ] 

Liyin Tang commented on HBASE-5373:
---

Sure. I am glad that Alex is working on this JIRA right now, and I will help with 
the code review.

 Table level lock to prevent the race of multiple table level operation
 --

 Key: HBASE-5373
 URL: https://issues.apache.org/jira/browse/HBASE-5373
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang

 A table level lock can guarantee that only one table operation happens 
 at a time for each table. The master should acquire and release these table 
 locks correctly during failover. One proposal is to keep track of 
 the lock and its corresponding operation in ZooKeeper. If there is a 
 master failover, the secondary should have a way to check whether these 
 operations have succeeded or not before releasing the lock.


[jira] [Commented] (HBASE-5494) Introduce a zk hosted table-wide read/write lock so only one table operation at a time

2012-05-03 Thread Alex Feinberg (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267681#comment-13267681
]

Alex Feinberg commented on HBASE-5494:
--

This patch implements a ZK-hosted mutual exclusion lock (DistributedLock), and
table level locks (TableLockManager), and ensures that all schema changing
operations are serialized. Further work would be needed to add read-write locks
to handle region splitting and merges.

Introduce a zk hosted table-wide read/write lock so only one table operation
at a time
--

Key: HBASE-5494
URL: https://issues.apache.org/jira/browse/HBASE-5494
Project: HBase
Issue Type: Improvement
Reporter: stack
Attachments: D2997.3.patch

I saw this facility over in the accumulo code base.
Currently we just try to sort out the mess when splits come in during an
online schema edit; somehow we figure we can figure all possible region
transition combinations and make the right call.
We could try and narrow the number of combinations by taking out a zk table
lock when doing table operations.
For example, on split or merge, we could take a read-only lock meaning the
table can't be disabled while these are running.
We could then take a write only lock if we want to ensure the table doesn't
change while disabling or enabling process is happening.
Shouldn't be too hard to add.

[jira] [Commented] (HBASE-5889) Remove HRegionInterface


[ 
https://issues.apache.org/jira/browse/HBASE-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267684#comment-13267684
 ] 

Jimmy Xiang commented on HBASE-5889:


@Ted, I posted a message to dev@hbase as suggested. I think it is to their 
benefit to migrate as well.

 Remove HRegionInterface
 ---

 Key: HBASE-5889
 URL: https://issues.apache.org/jira/browse/HBASE-5889
 Project: HBase
  Issue Type: Improvement
  Components: client, ipc, regionserver
Affects Versions: 0.96.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.96.0

 Attachments: hbase_5889.patch


 As a step to move internals to PB, so as to avoid the conversion for 
 performance reason, we should remove the HRegionInterface. 
 Therefore region server only supports ClientProtocol and AdminProtocol.  
 Later on, HRegion can work with PB messages directly.


[jira] [Commented] (HBASE-5494) Introduce a zk hosted table-wide read/write lock so only one table operation at a time

2012-05-03 Thread Phabricator (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267689#comment-13267689
]

Phabricator commented on HBASE-5494:

tedyu has commented on the revision [jira] [HBASE-5494] [89-fb] Table-level
locks for schema changing operations..

I only reviewed part of the patch.

Would this feature be refined in 0.89-fb branch before being ported to Apache
HBase trunk ?

INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/HConstants.java:98
  Schema changes would always involve master. 'master.' can be omitted.
src/main/java/org/apache/hadoop/hbase/HConstants.java:108
  Is this value big enough in cluster testing ?
src/main/java/org/apache/hadoop/hbase/TableLockTimeoutException.java:2
  No year is needed.
src/main/java/org/apache/hadoop/hbase/master/HMaster.java:1353
  This lock is used to prevent two concurrent table creation attempts.
  tryLockTable() is more desirable here.
src/main/java/org/apache/hadoop/hbase/master/HMaster.java:1310
  Can we add tryLockTable() ?
  It would be useful for the non-winning thread to exit quickly.
src/main/java/org/apache/hadoop/hbase/master/TableLockManager.java:2
  No year, please.
src/main/java/org/apache/hadoop/hbase/master/TableLockManager.java:47
  Should Bytes.toStringBinary() be used here ?
src/main/java/org/apache/hadoop/hbase/master/TableLockManager.java:53
  Add 'be ' before 'released'
src/main/java/org/apache/hadoop/hbase/master/TableLockManager.java:137
  What if lock release fails ?
src/main/java/org/apache/hadoop/hbase/master/TableLockManager.java:27
  Can you tell me which zookeeper branch provides this lock ?
  In http://svn.apache.org/repos/asf/zookeeper/trunk, I don't seem to find this
  class.
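
To make the tryLockTable() suggestion concrete, here is one way a non-blocking variant could look. It is only a sketch: the set of locked table names is kept in memory rather than in ZooKeeper, and TryLockSketch, tryLockTable and unlockTable are illustrative names, not the TableLockManager API from the patch.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a non-blocking table lock; not the code under review.
class TryLockSketch {
  private final Set<String> lockedTables = ConcurrentHashMap.newKeySet();

  // Returns immediately: true if this caller took the lock, false if someone else holds it.
  boolean tryLockTable(String tableName) {
    return lockedTables.add(tableName);
  }

  void unlockTable(String tableName) {
    if (!lockedTables.remove(tableName)) {
      throw new IllegalStateException("table is not locked: " + tableName);
    }
  }
}
```

With this shape, the thread that loses a race to create the same table sees false right away and can fail fast instead of waiting for a timeout.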

REVISION DETAIL
https://reviews.facebook.net/D2997

Introduce a zk hosted table-wide read/write lock so only one table operation
at a time
--

Key: HBASE-5494
URL: https://issues.apache.org/jira/browse/HBASE-5494
Project: HBase
Issue Type: Improvement
Reporter: stack
Attachments: D2997.3.patch

[jira] [Created] (HBASE-5929) HBaseAdmin.majorCompact and hbase shell randomly throw exceptions when asked to majorcompact regions.

2012-05-03 Thread Aravind Gottipati (JIRA)

Aravind Gottipati created HBASE-5929:


 Summary: HBaseAdmin.majorCompact and hbase shell randomly throw 
exceptions when asked to majorcompact regions.
 Key: HBASE-5929
 URL: https://issues.apache.org/jira/browse/HBASE-5929
 Project: HBase
  Issue Type: Bug
  Components: client, shell
Affects Versions: 0.92.1
 Environment: Linux Ubuntu Lucid 64bit
Reporter: Aravind Gottipati
Priority: Minor


I have been noticing that calls to HBaseAdmin.majorCompact throw exceptions 
randomly for some regions.  I could not find a pattern to these exceptions.  The 
code I have simply does this: 
admin.majorCompact(region.getRegionNameAsString()).  admin is an instance of 
HBaseAdmin and region is an instance of HRegionInfo.  The exception I get is: 

org.apache.hadoop.hbase.TableNotFoundException: -ROOT-,,0
    at org.apache.hadoop.hbase.client.HBaseAdmin.tableNameString(HBaseAdmin.java:1473) ~[hbase-0.92.1.jar:0.92.1]
    at org.apache.hadoop.hbase.client.HBaseAdmin.compact(HBaseAdmin.java:1235) ~[hbase-0.92.1.jar:0.92.1]
    at org.apache.hadoop.hbase.client.HBaseAdmin.majorCompact(HBaseAdmin.java:1209) ~[hbase-0.92.1.jar:0.92.1]
    at com.stumbleupon.hbaseadmin.HBaseCompact.compactAllServers(Unknown Source) [hbase_compact.jar:na]


In this case it's the root region, but I get similar exceptions for other 
tables, like this.


2012-05-03 19:03:42,994 WARN  [main] HBaseCompact: Could not compact:
org.apache.hadoop.hbase.TableNotFoundException: ad_daily,49842:2009-07-10,1269763588508.1997607018
    at org.apache.hadoop.hbase.client.HBaseAdmin.tableNameString(HBaseAdmin.java:1473) ~[hbase-0.92.1.jar:0.92.1]
    at org.apache.hadoop.hbase.client.HBaseAdmin.compact(HBaseAdmin.java:1235) ~[hbase-0.92.1.jar:0.92.1]
    at org.apache.hadoop.hbase.client.HBaseAdmin.majorCompact(HBaseAdmin.java:1209) ~[hbase-0.92.1.jar:0.92.1]
    at org.apache.hadoop.hbase.client.HBaseAdmin.majorCompact(HBaseAdmin.java:1196) ~[hbase-0.92.1.jar:0.92.1]
    at com.stumbleupon.hbaseadmin.HBaseCompact.compactAllServers(Unknown Source) [hbase_compact.jar:na]
    at com.stumbleupon.hbaseadmin.HBaseCompact.main(Unknown Source) [hbase_compact.jar:na]


I see this on hbase shell as well.  However, I don't see these exceptions if I 
use admin.majorCompact(region.getRegionName()), so it looks like something gets 
lost when I use getRegionNameAsString().

Let me know if I can provide more information.



[jira] [Commented] (HBASE-5928) Hbck shouldn't npe when there are no tables.

[
https://issues.apache.org/jira/browse/HBASE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267713#comment-13267713
]

Hadoop QA commented on HBASE-5928:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12525489/HBASE-5928-0.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 hadoop23. The patch compiles against the hadoop 0.23.x profile.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9)
warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.master.TestAssignmentManager

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/1748//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/1748//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/1748//console

This message is automatically generated.

Hbck shouldn't npe when there are no tables.

Key: HBASE-5928
URL: https://issues.apache.org/jira/browse/HBASE-5928
Project: HBase
Issue Type: Bug
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Minor
Attachments: HBASE-5928-0.patch

hbase fsck errors out when there are no tables.
Exception in thread "main" java.lang.NullPointerException
  at org.apache.hadoop.hbase.util.HBaseFsck.reportTablesInFlux(HBaseFsck.java:560)
  at org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:346)
  at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:382)
  at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:3120)

[jira] [Commented] (HBASE-5883) Backup master is going down due to connection refused exception


[ 
https://issues.apache.org/jira/browse/HBASE-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267714#comment-13267714
 ] 

Hudson commented on HBASE-5883:
---

Integrated in HBase-0.92 #396 (See 
[https://builds.apache.org/job/HBase-0.92/396/])
HBASE-5883  Backup master is going down due to connection refused exception 
(Jieshan) (Revision 1333537)

 Result = FAILURE
tedyu : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java


 Backup master is going down due to connection refused exception
 ---

 Key: HBASE-5883
 URL: https://issues.apache.org/jira/browse/HBASE-5883
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.6, 0.92.1, 0.94.0
Reporter: Gopinathan A
Assignee: Jieshan Bean
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-5883-90.patch, HBASE-5883-92.patch, 
 HBASE-5883-94.patch, HBASE-5883-trunk.patch


 The active master node's network was down for some time (this node hosts 
 Master, DN, ZK, RS). The backup node got the notification and started to 
 become active. Immediately afterwards, the backup node aborted with the 
 exception below.
 {noformat}
 2012-04-09 10:42:24,270 INFO org.apache.hadoop.hbase.master.SplitLogManager: 
 finished splitting (more than or equal to) 861248320 bytes in 4 log files in 
 [hdfs://192.168.47.205:9000/hbase/.logs/HOST-192-168-47-202,60020,1333715537172-splitting]
  in 26374ms
 2012-04-09 10:42:24,316 FATAL org.apache.hadoop.hbase.master.HMaster: Master 
 server abort: loaded coprocessors are: []
 2012-04-09 10:42:24,333 FATAL org.apache.hadoop.hbase.master.HMaster: 
 Unhandled exception. Starting shutdown.
 java.io.IOException: java.net.ConnectException: Connection refused
   at 
 org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:375)
   at 
 org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1045)
   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:897)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
   at $Proxy13.getProtocolVersion(Unknown Source)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:183)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:303)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:280)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:332)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:236)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1276)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1233)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1220)
   at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:569)
   at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.getRootServerConnection(CatalogTracker.java:369)
   at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.waitForRootServerConnection(CatalogTracker.java:353)
   at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.verifyRootRegionLocation(CatalogTracker.java:660)
   at 
 org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:616)
   at 
 org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:540)
   at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:363)
   at java.lang.Thread.run(Thread.java:662)
 Caused by: java.net.ConnectException: Connection refused
   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
   at 
 sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
   at 
 org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:488)
   at 
 org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:328)
   at 
 org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:362)
   ... 20 more
 2012-04-09 10:42:24,336 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
 2012-04-09 10:42:24,336 DEBUG org.apache.hadoop.hbase.master.HMaster: 
 Stopping service threads
 {noformat}


[jira] [Commented] (HBASE-5494) Introduce a zk hosted table-wide read/write lock so only one table operation at a time

2012-05-03 Thread Phabricator (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267728#comment-13267728
 ] 

Phabricator commented on HBASE-5494:


avf has commented on the revision [jira] [HBASE-5494] [89-fb] Table-level 
locks for schema changing operations..

  Thanks for the inline comments, @tedyu -- I've replied to a few quick ones 
inline.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/master/TableLockManager.java:27 
DistributedLock is implemented as part of the patch (see DistributedLock.java)
  src/main/java/org/apache/hadoop/hbase/master/TableLockManager.java:47 
Metadata for table level locks is stored as plain text -- this is to allow 
operations to view lock information from the zookeeper CLI: toStringBinary() 
would not be needed here.
  src/main/java/org/apache/hadoop/hbase/master/TableLockManager.java:137 In 
this case, an IOException is thrown up to the caller: this is to indicate a 
non-recoverable ZooKeeper error (DistributedLock uses RecoverableZooKeeper 
class under the covers). .release() may also throw an IllegalStateException -- 
but this is essentially used as an assertion in this case (releasing a lock that 
isn't held).

REVISION DETAIL
  https://reviews.facebook.net/D2997


 Introduce a zk hosted table-wide read/write lock so only one table operation 
 at a time
 --

 Key: HBASE-5494
 URL: https://issues.apache.org/jira/browse/HBASE-5494
 Project: HBase
  Issue Type: Improvement
Reporter: stack
 Attachments: D2997.3.patch


 I saw this facility over in the accumulo code base.
 Currently we just try to sort out the mess when splits come in during an 
 online schema edit; somehow we figure we can figure all possible region 
 transition combinations and make the right call.
 We could try and narrow the number of combinations by taking out a zk table 
 lock when doing table operations.
 For example, on split or merge, we could take a read-only lock meaning the 
 table can't be disabled while these are running.
 We could then take a write only lock if we want to ensure the table doesn't 
 change while disabling or enabling process is happening.
 Shouldn't be too hard to add.


[jira] [Commented] (HBASE-5883) Backup master is going down due to connection refused exception


[ 
https://issues.apache.org/jira/browse/HBASE-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267743#comment-13267743
 ] 

stack commented on HBASE-5883:
--

Can't we at least check the message to ensure it's what we expect?  (See the 
second catch below where we look for connection reset).  Can we be sure what 
comes up here is the ConnectException we set down in HBaseRPC?

{code}
+  if (ioe instanceof ConnectException) {
+// Catch. Connect refused.
{code}

This redoing of an exception seems problematic.  Is it really necessary?

{code}
+} else if (ioex.getMessage().toLowerCase()
+.contains("connection refused")) {
+  ce = new ConnectException(ioex.getMessage());
+  ioe = ce;
{code}

I'd feel better about this fix if we could figure out where the exception came 
from. (Is it not from the RPC stringifying of exceptions to pass them from 
server to client?)
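
For comparison with matching on the message text, one alternative is to look for the ConnectException in the cause chain. The helper below is only a sketch with hypothetical names (CauseChainSketch, findCause); it is not part of the patch, and it only helps when the original exception object is still attached as a cause. If the exception was turned into a string on its way through the RPC layer, the chain is gone and the helper returns null.

```java
// Hypothetical helper; not part of the HBASE-5883 patch.
class CauseChainSketch {
  // Returns the first throwable of the requested type in t's cause chain, or null if none.
  static <T extends Throwable> T findCause(Throwable t, Class<T> type) {
    // Bound the walk to guard against cyclic cause chains.
    for (int depth = 0; t != null && depth < 32; depth++) {
      if (type.isInstance(t)) {
        return type.cast(t);
      }
      t = t.getCause();
    }
    return null;
  }
}
```

A caller could then write findCause(ioe, java.net.ConnectException.class) != null where the snippet above inspects the message.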

 Backup master is going down due to connection refused exception
 ---

 Key: HBASE-5883
 URL: https://issues.apache.org/jira/browse/HBASE-5883
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.6, 0.92.1, 0.94.0
Reporter: Gopinathan A
Assignee: Jieshan Bean
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-5883-90.patch, HBASE-5883-92.patch, 
 HBASE-5883-94.patch, HBASE-5883-trunk.patch


 The active master node's network was down for some time (this node hosts 
 Master, DN, ZK, RS). The backup node got the notification and started to 
 become active. Immediately afterwards, the backup node aborted with the 
 exception below.
 {noformat}
 2012-04-09 10:42:24,270 INFO org.apache.hadoop.hbase.master.SplitLogManager: 
 finished splitting (more than or equal to) 861248320 bytes in 4 log files in 
 [hdfs://192.168.47.205:9000/hbase/.logs/HOST-192-168-47-202,60020,1333715537172-splitting]
  in 26374ms
 2012-04-09 10:42:24,316 FATAL org.apache.hadoop.hbase.master.HMaster: Master 
 server abort: loaded coprocessors are: []
 2012-04-09 10:42:24,333 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
 java.io.IOException: java.net.ConnectException: Connection refused
   at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:375)
   at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1045)
   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:897)
   at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
   at $Proxy13.getProtocolVersion(Unknown Source)
   at org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:183)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:303)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:280)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:332)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:236)
   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1276)
   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1233)
   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1220)
   at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:569)
   at org.apache.hadoop.hbase.catalog.CatalogTracker.getRootServerConnection(CatalogTracker.java:369)
   at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForRootServerConnection(CatalogTracker.java:353)
   at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyRootRegionLocation(CatalogTracker.java:660)
   at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:616)
   at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:540)
   at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:363)
   at java.lang.Thread.run(Thread.java:662)
 Caused by: java.net.ConnectException: Connection refused
   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
   at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:488)
   at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:328)
   at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:362)
   ... 20 more
 2012-04-09 10:42:24,336 INFO

[jira] [Commented] (HBASE-5928) Hbck shouldn't npe when there are no tables.


[ 
https://issues.apache.org/jira/browse/HBASE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267744#comment-13267744
 ] 

stack commented on HBASE-5928:
--

+1 on patch.

Jon Hsieh?

 Hbck shouldn't npe when there are no tables.
 

 Key: HBASE-5928
 URL: https://issues.apache.org/jira/browse/HBASE-5928
 Project: HBase
  Issue Type: Bug
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Minor
 Attachments: HBASE-5928-0.patch


 hbase fsck errors out when there are no tables.
 Exception in thread main java.lang.NullPointerException
   at org.apache.hadoop.hbase.util.HBaseFsck.reportTablesInFlux(HBaseFsck.java:560)
   at org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:346)
   at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:382)
   at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:3120)


[jira] [Commented] (HBASE-5928) Hbck shouldn't npe when there are no tables.


[ 
https://issues.apache.org/jira/browse/HBASE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267746#comment-13267746
 ] 

Elliott Clark commented on HBASE-5928:
--

I looped TestAssignmentManager several times locally and it always passes.

 Hbck shouldn't npe when there are no tables.
 

 Key: HBASE-5928
 URL: https://issues.apache.org/jira/browse/HBASE-5928
 Project: HBase
  Issue Type: Bug
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Minor
 Attachments: HBASE-5928-0.patch


 hbase fsck errors out when there are no tables.
 Exception in thread main java.lang.NullPointerException
   at org.apache.hadoop.hbase.util.HBaseFsck.reportTablesInFlux(HBaseFsck.java:560)
   at org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:346)
   at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:382)
   at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:3120)


[jira] [Commented] (HBASE-5928) Hbck shouldn't npe when there are no tables.

2012-05-03 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267747#comment-13267747
 ] 

Todd Lipcon commented on HBASE-5928:


Why not make it return an empty list instead? Returning null instead of empty 
collections is just begging for bugs like this.
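
A minimal sketch of the suggestion above. The class and method names
(TableLister, listTableNames, countTablesInFlux) are hypothetical and are not
taken from HBaseFsck or the attached patch; they only illustrate why callers
are safer when "no tables" is an empty list rather than null.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not HBase code: shows the "empty list, never null" convention.
final class TableLister {
  private final List<String> tables;

  TableLister(List<String> tables) {
    // Normalise a null input once, so no caller ever sees null.
    this.tables = (tables == null)
        ? Collections.<String>emptyList()
        : new ArrayList<String>(tables);
  }

  /** Returns the table names; an empty list when there are no tables. */
  List<String> listTableNames() {
    return Collections.unmodifiableList(tables);
  }

  /** Callers can iterate unconditionally; zero tables means zero iterations. */
  int countTablesInFlux(String marker) {
    int count = 0;
    for (String name : listTableNames()) {
      if (name.contains(marker)) {
        count++;
      }
    }
    return count;
  }
}
```

With this convention the loop in countTablesInFlux needs no null check, which
is the class of bug the stack trace in this issue shows.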

 Hbck shouldn't npe when there are no tables.
 

 Key: HBASE-5928
 URL: https://issues.apache.org/jira/browse/HBASE-5928
 Project: HBase
  Issue Type: Bug
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Minor
 Attachments: HBASE-5928-0.patch


 hbase fsck errors out when there are no tables.
 Exception in thread main java.lang.NullPointerException
   at org.apache.hadoop.hbase.util.HBaseFsck.reportTablesInFlux(HBaseFsck.java:560)
   at org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:346)
   at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:382)
   at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:3120)


[jira] [Created] (HBASE-5931) HBase security profile doesn't compile

Jimmy Xiang created HBASE-5931:
--

 Summary: HBase security profile doesn't compile 
 Key: HBASE-5931
 URL: https://issues.apache.org/jira/browse/HBASE-5931
 Project: HBase
  Issue Type: Bug
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang


The compilation is broken.


[jira] [Created] (HBASE-5930) Periodically flush the Memstore?

Lars Hofhansl created HBASE-5930:


 Summary: Periodically flush the Memstore?
 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Priority: Minor


A colleague of mine ran into an interesting issue.
He inserted some data with the WAL disabled, which happened to fit in the 
aggregate Memstores memory.

Two weeks later he had a problem with the HDFS cluster, which caused the region 
servers to abort. He found that his data was lost. Looking at the log we found 
that the Memstores were not flushed at all during these two weeks.

Should we have an option to flush memstores periodically? There are obvious 
downsides to this, like many small storefiles, etc.
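
One way to picture such an option is an age-based check, sketched below. This
is only an illustration under assumptions: PeriodicFlushPolicy, its
constructor argument, and the idea of tracking the oldest unflushed edit are
all hypothetical, not an existing HBase class or configuration setting.

```java
// Hypothetical policy, not an existing HBase class or configuration option.
final class PeriodicFlushPolicy {
  private final long maxAgeMillis;

  PeriodicFlushPolicy(long maxAgeMillis) {
    if (maxAgeMillis <= 0) {
      throw new IllegalArgumentException("maxAgeMillis must be positive");
    }
    this.maxAgeMillis = maxAgeMillis;
  }

  /**
   * Decides whether a memstore is old enough to be flushed.
   *
   * @param oldestEditMillis timestamp of the oldest unflushed edit,
   *                         or a negative value if the memstore is empty
   * @param nowMillis        current time in milliseconds
   */
  boolean shouldFlush(long oldestEditMillis, long nowMillis) {
    if (oldestEditMillis < 0) {
      return false; // nothing to flush; avoids writing empty store files
    }
    return nowMillis - oldestEditMillis >= maxAgeMillis;
  }
}
```

A long interval (hours rather than minutes) would limit the small-storefile
downside mentioned above while still bounding how much unflushed data can be
lost when the WAL is disabled.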



[jira] [Updated] (HBASE-5931) HBase security profile doesn't compile


 [ 
https://issues.apache.org/jira/browse/HBASE-5931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-5931:
---

Attachment: hbase-5931.patch

 HBase security profile doesn't compile 
 ---

 Key: HBASE-5931
 URL: https://issues.apache.org/jira/browse/HBASE-5931
 Project: HBase
  Issue Type: Bug
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: hbase-5931.patch


 The compilation is broken.


[jira] [Commented] (HBASE-5877) When a query fails because the region has moved, let the regionserver return the new address to the client

[
https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267751#comment-13267751
]

nkeywal commented on HBASE-5877:

v12, should be final.

1) ServerName is used everywhere in the interface, thanks to protobuf
2) hadoop.ipc serialization of exceptions is based on #getMessage, so we
have to parse it internally. It's not visible to the exception user.
3) The code to manage the error in the client package is quite complex. We have
the exception at the very beginning, and then it's checked again, but we don't
have the real exception anymore. I used a new historyList to make it work.
There is another JIRA for other improvements, in which I could get rid of this
(HBASE-5924).
4) Generated with protobuf 2.4.1
5) The destination in the closeRegion interface is a kind of interface
hijacking. Other options would be:
- sharing the region state in zookeeper
- letting the regionserver call the master to get the new server. On paper
this would be more efficient than a client - master call. In both cases we
could consider that the client should not connect to the master except for
cluster administration (create table, split region, ...). That would increase
global reliability. That's for another discussion as well I think.
6) RegionServerServices has been modified to set a destination when removing a
region from the online regions.
7) In another JIRA I will manage the case when the destination is not specified
when calling the move function.
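
Point 2 above (the new address travels only inside the exception message and
is parsed back out) could look roughly like the sketch below. The class name
RegionMovedNoticeException and the "hostname=... port=..." message layout are
assumptions made for illustration; they are not taken from the v12 patch.

```java
import java.io.IOException;

// Hypothetical exception; the message layout is an assumption for illustration.
class RegionMovedNoticeException extends IOException {
  private static final String HOST_FIELD = "hostname=";
  private static final String PORT_FIELD = " port=";

  private final String hostname;
  private final int port;

  RegionMovedNoticeException(String hostname, int port) {
    // Everything the client needs is encoded in the message, because only
    // the message survives the RPC serialization described above.
    super("Region moved to: " + HOST_FIELD + hostname + PORT_FIELD + port);
    this.hostname = hostname;
    this.port = port;
  }

  String getHostname() { return hostname; }

  int getPort() { return port; }

  /** Rebuilds the exception from a message; returns null if the text does not match. */
  static RegionMovedNoticeException fromMessage(String message) {
    if (message == null) {
      return null;
    }
    int hostStart = message.indexOf(HOST_FIELD);
    if (hostStart < 0) {
      return null;
    }
    int portStart = message.indexOf(PORT_FIELD, hostStart);
    if (portStart < 0) {
      return null;
    }
    String host = message.substring(hostStart + HOST_FIELD.length(), portStart);
    try {
      int port = Integer.parseInt(message.substring(portStart + PORT_FIELD.length()).trim());
      return new RegionMovedNoticeException(host, port);
    } catch (NumberFormatException e) {
      return null; // message looked similar but carried no valid port
    }
  }
}
```

The parsing stays inside the exception class, so code that catches it only
sees getHostname() and getPort().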

When a query fails because the region has moved, let the regionserver return
the new address to the client
--

Key: HBASE-5877
URL: https://issues.apache.org/jira/browse/HBASE-5877
Project: HBase
Issue Type: Improvement
Components: client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
Fix For: 0.96.0

Attachments: 5877.v1.patch, 5877.v12.patch, 5877.v6.patch

This is mainly useful when we do a rolling restart. This will decrease the
load on the master and the network load.
Note that a region is not immediately opened after a close. So:
- it seems preferable to wait before retrying on the other server. An
optimisation would be to have a heuristic depending on when the region was
closed.
- during a rolling restart, the server moves the regions then stops. So we
may have failures when the server is stopped, and this patch won't help.
The implementation in the first patch does:
- on the region move, there is an added parameter on the regionserver#close
to say where we are sending the region
- the regionserver keeps a list of what was moved. Each entry is kept 100
seconds.
- the regionserver sends a specific exception when it receives a query on a
moved region. This exception contains the new address.
- the client analyses the exceptions and updates its cache accordingly...
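
The regionserver-side bookkeeping described in the list above (remember where
each moved region went, keep each entry for a limited time) can be sketched
as follows. MovedRegionsCache and its methods are hypothetical names; the
patch itself may be structured differently. The 100 second lifetime from the
description would be passed as ttlMillis = 100000.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the "list of what was moved"; not the code from the patch.
final class MovedRegionsCache {
  /** Destination of a moved region plus the time the move was recorded. */
  static final class Entry {
    final String destination;
    final long recordedAtMillis;

    Entry(String destination, long recordedAtMillis) {
      this.destination = destination;
      this.recordedAtMillis = recordedAtMillis;
    }
  }

  private final long ttlMillis;
  private final Map<String, Entry> moved = new ConcurrentHashMap<String, Entry>();

  MovedRegionsCache(long ttlMillis) {
    this.ttlMillis = ttlMillis;
  }

  /** Called when a region is closed because it is being moved to another server. */
  void recordMove(String regionName, String destination, long nowMillis) {
    moved.put(regionName, new Entry(destination, nowMillis));
  }

  /** Returns the destination if the region moved within the TTL, otherwise null. */
  String getDestination(String regionName, long nowMillis) {
    Entry entry = moved.get(regionName);
    if (entry == null) {
      return null;
    }
    if (nowMillis - entry.recordedAtMillis > ttlMillis) {
      moved.remove(regionName, entry); // expired: drop lazily on lookup
      return null;
    }
    return entry.destination;
  }

  /** Removes all expired entries; could be driven by a periodic background task. */
  void cleanUp(long nowMillis) {
    for (Iterator<Entry> it = moved.values().iterator(); it.hasNext();) {
      if (nowMillis - it.next().recordedAtMillis > ttlMillis) {
        it.remove();
      }
    }
  }

  int size() {
    return moved.size();
  }
}
```

A query arriving for a region that is no longer online would consult
getDestination and, if it returns a server, answer with the exception that
carries the new address.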

[jira] [Updated] (HBASE-5877) When a query fails because the region has moved, let the regionserver return the new address to the client


 [ 
https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5877:
---

Attachment: 5877.v12.patch

 When a query fails because the region has moved, let the regionserver return 
 the new address to the client
 --

 Key: HBASE-5877
 URL: https://issues.apache.org/jira/browse/HBASE-5877
 Project: HBase
  Issue Type: Improvement
  Components: client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5877.v1.patch, 5877.v12.patch, 5877.v6.patch


 This is mainly useful when we do a rolling restart. This will decrease the 
 load on the master and the network load.
 Note that a region is not immediately opened after a close. So:
 - it seems preferable to wait before retrying on the other server. An 
 optimisation would be to have a heuristic depending on when the region was 
 closed.
 - during a rolling restart, the server moves the regions then stops. So we 
 may have failures when the server is stopped, and this patch won't help.
 The implementation in the first patch does:
 - on the region move, there is an added parameter on the regionserver#close 
 to say where we are sending the region
 - the regionserver keeps a list of what was moved. Each entry is kept 100 
 seconds.
 - the regionserver sends a specific exception when it receives a query on a 
 moved region. This exception contains the new address.
 - the client analyses the exceptions and updates its cache accordingly...


[jira] [Updated] (HBASE-5931) HBase security profile doesn't compile


 [ 
https://issues.apache.org/jira/browse/HBASE-5931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-5931:
---

Status: Patch Available  (was: Open)

 HBase security profile doesn't compile 
 ---

 Key: HBASE-5931
 URL: https://issues.apache.org/jira/browse/HBASE-5931
 Project: HBase
  Issue Type: Bug
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: hbase-5931.patch


 The compilation is broken.


[jira] [Updated] (HBASE-5877) When a query fails because the region has moved, let the regionserver return the new address to the client


 [ 
https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5877:
---

Status: Open  (was: Patch Available)

 When a query fails because the region has moved, let the regionserver return 
 the new address to the client
 --

 Key: HBASE-5877
 URL: https://issues.apache.org/jira/browse/HBASE-5877
 Project: HBase
  Issue Type: Improvement
  Components: client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5877.v1.patch, 5877.v12.patch, 5877.v6.patch


 This is mainly useful when we do a rolling restart. This will decrease the 
 load on the master and the network load.
 Note that a region is not immediately opened after a close. So:
 - it seems preferable to wait before retrying on the other server. An 
 optimisation would be to have a heuristic depending on when the region was 
 closed.
 - during a rolling restart, the server moves the regions then stops. So we 
 may have failures when the server is stopped, and this patch won't help.
 The implementation in the first patch does:
 - on the region move, there is an added parameter on the regionserver#close 
 to say where we are sending the region
 - the regionserver keeps a list of what was moved. Each entry is kept 100 
 seconds.
 - the regionserver sends a specific exception when it receives a query on a 
 moved region. This exception contains the new address.
 - the client analyses the exceptions and updates its cache accordingly...


[jira] [Updated] (HBASE-5877) When a query fails because the region has moved, let the regionserver return the new address to the client


 [ 
https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5877:
---

Status: Patch Available  (was: Open)

 When a query fails because the region has moved, let the regionserver return 
 the new address to the client
 --

 Key: HBASE-5877
 URL: https://issues.apache.org/jira/browse/HBASE-5877
 Project: HBase
  Issue Type: Improvement
  Components: client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5877.v1.patch, 5877.v12.patch, 5877.v6.patch


 This is mainly useful when we do a rolling restart. This will decrease the 
 load on the master and the network load.
 Note that a region is not immediately opened after a close. So:
 - it seems preferable to wait before retrying on the other server. An 
 optimisation would be to have a heuristic depending on when the region was 
 closed.
 - during a rolling restart, the server moves the regions then stops. So we 
 may have failures when the server is stopped, and this patch won't help.
 The implementation in the first patch does:
 - on the region move, there is an added parameter on the regionserver#close 
 to say where we are sending the region
 - the regionserver keeps a list of what was moved. Each entry is kept 100 
 seconds.
 - the regionserver sends a specific exception when it receives a query on a 
 moved region. This exception contains the new address.
 - the client analyses the exceptions and updates its cache accordingly...


[jira] [Commented] (HBASE-5929) HBaseAdmin.majorCompact and hbase shell randomly throw exceptions when asked to majorcompact regions.