date:20130212


[ 
https://issues.apache.org/jira/browse/HBASE-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576467#comment-13576467
 ] 

nkeywal commented on HBASE-7798:


+  String existingServer = (rt.getServerName() == null) ? "" : rt.getServerName().toString();
It would be better to log explicitly that we don't have the server name (with 
something like 'unknown' or even 'null'); if it happens, it makes the problem 
easier to catch in the logs.

Otherwise, +1 for the patch. I can commit it if you're ok with my comment (you 
don't need to upload a new patch, I will change the code to 'unknown' on 
commit).
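
A minimal sketch of the fallback suggested above, for illustration only. The `describeServer` helper and the `ServerNameLogging` class are hypothetical names, not part of the patch or of HBase; the argument is typed as `Object` so the example does not depend on the real `ServerName` class.

```java
final class ServerNameLogging {
    /** Marker logged when no server name is available ('unknown', as suggested in the comment). */
    static final String UNKNOWN = "unknown";

    /** Returns the server name as text, or an explicit marker when it is null. */
    static String describeServer(Object serverName) {
        return (serverName == null) ? UNKNOWN : serverName.toString();
    }

    public static void main(String[] args) {
        // A missing name shows up as 'unknown' in the log line instead of an empty string.
        System.out.println("existing server: " + describeServer(null));
        System.out.println("existing server: " + describeServer("host1,60020,1360000000000"));
    }
}
```

With an explicit marker, a log search for 'unknown' finds every transition where the server name was missing.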


 ZKAssign logs the wrong server if the transition fails
 --

 Key: HBASE-7798
 URL: https://issues.apache.org/jira/browse/HBASE-7798
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Trivial
 Attachments: 7798-v0.patch, HBASE-7798-v0.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-3171) Drop ROOT and instead store META location(s) directly in ZooKeeper


[ 
https://issues.apache.org/jira/browse/HBASE-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576470#comment-13576470
 ] 

nkeywal commented on HBASE-3171:


Maybe it's a totally stupid question, but would it make sense to use this 
migration to rename '.META.' as well, to make it compatible with the Windows 
file naming policy mentioned by [~enis] in another Jira?

 Drop ROOT and instead store META location(s) directly in ZooKeeper
 --

 Key: HBASE-3171
 URL: https://issues.apache.org/jira/browse/HBASE-3171
 Project: HBase
  Issue Type: Improvement
  Components: Client, master, regionserver, Zookeeper
Reporter: Jonathan Gray
 Attachments: HBASE-3171.patch, HBASE-3171-v2.patch, 
 HBASE-3171-v3.patch, HBASE-3171-v4.patch


 Rather than storing the ROOT region location in ZooKeeper, going to ROOT, and 
 reading the META location, we should just store the META location directly in 
 ZooKeeper.
 The purpose of the root region in the Bigtable paper was to support 
 multiple meta regions.  Currently, we explicitly support only a single meta 
 region, so the translation from our current code of a single root location to 
 a single meta location will be very simple.  Long-term, it seems reasonable 
 that we could store several meta region locations in ZK.  There's been some 
 discussion in HBASE-1755 about actually moving META into ZK, but I think this 
 jira is a good step towards taking some of the complexity out of how we have 
 to deal with catalog tables everywhere.
 As-is, a new client already requires ZK to get the root location, so this 
 would not change those requirements in any way.
 The primary motivation for this is to simplify things like CatalogTracker.  
 The way we handle root in that class is really simple, but the tracking of 
 meta is difficult and a bit hacky.  This hack in the tracking of the meta 
 location is what caused one of the bugs over in HBASE-3159.


[jira] [Updated] (HBASE-7407) TestMasterFailover under tests some cases and over tests some others


 [ 
https://issues.apache.org/jira/browse/HBASE-7407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7407:
---

   Resolution: Fixed
Fix Version/s: 0.96.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

 TestMasterFailover under tests some cases and over tests some others
 

 Key: HBASE-7407
 URL: https://issues.apache.org/jira/browse/HBASE-7407
 Project: HBase
  Issue Type: Bug
  Components: master, Region Assignment, test
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 7407.v1.patch, 7407.v2.patch, 7407.v3.patch, 
 7407.v5.patch, 7407.v6.patch, 7407_v7.patch


 The tests are done with these settings:
 conf.setInt("hbase.master.assignment.timeoutmonitor.period", 2000);
 conf.setInt("hbase.master.assignment.timeoutmonitor.timeout", 4000);
 As a result:
 1) Some tests seem to work, but in real life the recovery would take 5 
 minutes or more, as in production these timeouts are always higher. So we 
 don't see the real issues.
 2) The tests include specific cases that should not happen in production. It 
 works because the timeout catches everything, but these scenarios do not need 
 to be optimized, as they cannot happen. 


[jira] [Commented] (HBASE-7407) TestMasterFailover under tests some cases and over tests some others


[ 
https://issues.apache.org/jira/browse/HBASE-7407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576493#comment-13576493
 ] 

nkeywal commented on HBASE-7407:


Committed, thanks a lot for the review, Jimmy and Stack!

 TestMasterFailover under tests some cases and over tests some others
 

 Key: HBASE-7407
 URL: https://issues.apache.org/jira/browse/HBASE-7407
 Project: HBase
  Issue Type: Bug
  Components: master, Region Assignment, test
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 7407.v1.patch, 7407.v2.patch, 7407.v3.patch, 
 7407.v5.patch, 7407.v6.patch, 7407_v7.patch



[jira] [Updated] (HBASE-7789) Clean DeadServer.java and add a Jitter method in ConnectionUtils


 [ 
https://issues.apache.org/jira/browse/HBASE-7789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7789:
---

Attachment: 7789.v2.patch

 Clean DeadServer.java and add a Jitter method in ConnectionUtils
 

 Key: HBASE-7789
 URL: https://issues.apache.org/jira/browse/HBASE-7789
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.96.0

 Attachments: 7789.v1.patch, 7789.v2.patch


 I need to make some changes in DeadServer because of HBASE-7590. To minimize 
 the patch size and simplify the feedback, I prefer to isolate the issue.
 Changes are:
  - Add the time when the server was declared dead. It's what I need in 
 HBASE-7590, but it makes sense even without it, for example to be shown in 
 the UI.
  - Remove the 'extends' on Set and clean up all the unused methods.
  - Use the object directly instead of a copy.
 For connection utils, we currently have a jitter of 1%. I need a bigger one 
 for sure in one case, but I wonder if we should not increase it in all cases? 
 Instead of plus 1%, we should have plus or minus 10% imho.
 Tests are in progress locally; I will add the patch when they're ok.
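
To make the 'plus or minus 10%' proposal concrete, here is a hedged sketch of a symmetric jitter. The `addJitter` name, its signature and the `JitterSketch` class are assumptions made for this example; they are not taken from the attached patches.

```java
import java.util.Random;

final class JitterSketch {
    /**
     * Returns pauseMs shifted by a random amount within +/- (jitter * pauseMs).
     * Hypothetical helper: jitter = 0.10 means "plus or minus 10%".
     */
    static long addJitter(long pauseMs, double jitter, Random rnd) {
        // nextDouble() is in [0, 1); map it to [-1, 1) and scale by the jitter ratio.
        double offset = (rnd.nextDouble() * 2.0 - 1.0) * jitter;
        long result = pauseMs + (long) (pauseMs * offset);
        // Never return a negative pause.
        return Math.max(0L, result);
    }

    public static void main(String[] args) {
        Random rnd = new Random();
        for (int i = 0; i < 5; i++) {
            // Each value falls between 900 and 1100 for a 1000 ms pause.
            System.out.println(addJitter(1000L, 0.10, rnd));
        }
    }
}
```

A symmetric jitter spreads retries on both sides of the nominal pause, so clients that fail at the same moment are less likely to retry at the same moment.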


[jira] [Commented] (HBASE-7829) zookeeper kerberos conf keytab and principal parameters interchanged


[ 
https://issues.apache.org/jira/browse/HBASE-7829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576502#comment-13576502
 ] 

Matteo Bertozzi commented on HBASE-7829:


yup, that's my fault.. looking at my configuration the values are swapped
+1 on the patch (for 0.94 and trunk), thanks!

 zookeeper kerberos conf keytab and principal parameters interchanged
 

 Key: HBASE-7829
 URL: https://issues.apache.org/jira/browse/HBASE-7829
 Project: HBase
  Issue Type: Bug
Reporter: Francis Liu
Assignee: Francis Liu
Priority: Minor
 Fix For: 0.94.6

 Attachments: HBASE-7829_94.patch





[jira] [Updated] (HBASE-7789) Clean DeadServer.java and add a Jitter method in ConnectionUtils

[
https://issues.apache.org/jira/browse/HBASE-7789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

nkeywal updated HBASE-7789:
---

Resolution: Fixed
Hadoop Flags: Reviewed
Status: Resolved (was: Patch Available)

Clean DeadServer.java and add a Jitter method in ConnectionUtils

Key: HBASE-7789
URL: https://issues.apache.org/jira/browse/HBASE-7789
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Fix For: 0.96.0

Attachments: 7789.v1.patch, 7789.v2.patch


[jira] [Commented] (HBASE-7789) Clean DeadServer.java and add a Jitter method in ConnectionUtils

[
https://issues.apache.org/jira/browse/HBASE-7789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576521#comment-13576521
]

nkeywal commented on HBASE-7789:

Committed (v2). Thanks for the review, Ted!

Clean DeadServer.java and add a Jitter method in ConnectionUtils

Key: HBASE-7789
URL: https://issues.apache.org/jira/browse/HBASE-7789
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Fix For: 0.96.0

Attachments: 7789.v1.patch, 7789.v2.patch

[jira] [Commented] (HBASE-7789) Clean DeadServer.java and add a Jitter method in ConnectionUtils

[
https://issues.apache.org/jira/browse/HBASE-7789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576537#comment-13576537
]

Hadoop QA commented on HBASE-7789:
--

{color:green}+1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12568947/7789.v2.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author
tags.

{color:green}+1 tests included{color}. The patch appears to include 9 new
or modified tests.

{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop
2.0 profile.

{color:green}+1 javadoc{color}. The javadoc tool did not generate any
warning messages.

{color:green}+1 javac{color}. The applied patch does not increase the
total number of javac compiler warnings.

{color:green}+1 findbugs{color}. The patch does not introduce any new
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase
the total number of release audit warnings.

{color:green}+1 lineLengths{color}. The patch does not introduce lines
longer than 100

{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/4421//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4421//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4421//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4421//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4421//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4421//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4421//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4421//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/4421//console

This message is automatically generated.

Clean DeadServer.java and add a Jitter method in ConnectionUtils

Key: HBASE-7789
URL: https://issues.apache.org/jira/browse/HBASE-7789
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Fix For: 0.96.0

Attachments: 7789.v1.patch, 7789.v2.patch

[jira] [Commented] (HBASE-7407) TestMasterFailover under tests some cases and over tests some others


[ 
https://issues.apache.org/jira/browse/HBASE-7407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576535#comment-13576535
 ] 

Hudson commented on HBASE-7407:
---

Integrated in HBase-TRUNK #3869 (See 
[https://builds.apache.org/job/HBase-TRUNK/3869/])
HBASE-7407 TestMasterFailover under tests some cases and over tests some 
others (Revision 1445074)

 Result = FAILURE
nkeywal : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/DoNotRetryIOException.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/PleaseHoldException.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/executor/EventHandler.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterFailover.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterFailoverBalancerPersistence.java


 TestMasterFailover under tests some cases and over tests some others
 

 Key: HBASE-7407
 URL: https://issues.apache.org/jira/browse/HBASE-7407
 Project: HBase
  Issue Type: Bug
  Components: master, Region Assignment, test
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 7407.v1.patch, 7407.v2.patch, 7407.v3.patch, 
 7407.v5.patch, 7407.v6.patch, 7407_v7.patch



[jira] [Commented] (HBASE-7802) Use the Store interface instead of HStore (coprocessor, compaction, ...)


[ 
https://issues.apache.org/jira/browse/HBASE-7802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576542#comment-13576542
 ] 

Matteo Bertozzi commented on HBASE-7802:


I'm going to commit this one if there are no other comments
(it's mostly a search and replace of HStore with Store)

 Use the Store interface instead of HStore (coprocessor, compaction, ...)
 

 Key: HBASE-7802
 URL: https://issues.apache.org/jira/browse/HBASE-7802
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Trivial
 Attachments: HBASE-7802-v0.patch, HBASE-7802-v1.patch, 
 HBASE-7802-v2.patch


 We have a Store interface that is almost never used; coprocessors and 
 compactions are using HStore directly.
 Replace HStore with Store to make it possible to create a custom Store for 
 testing, instead of injecting stuff into HStore.
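
The benefit described above can be sketched as follows. Everything here is a simplified, hypothetical stand-in: `SimpleStore` and `CompactionCheck` are invented for the example and do not reflect the actual HBase Store interface or compaction code.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical, heavily simplified store abstraction (not the HBase Store interface). */
interface SimpleStore {
    List<Long> getStoreFileSizes();
}

/** Consumer coded against the interface, so any implementation can be plugged in. */
final class CompactionCheck {
    static boolean needsCompaction(SimpleStore store, int maxFiles) {
        return store.getStoreFileSizes().size() > maxFiles;
    }
}

final class StoreInterfaceSketch {
    public static void main(String[] args) {
        // A custom store for testing: no real files, just canned sizes.
        SimpleStore fake = () -> Arrays.asList(10L, 20L, 30L, 40L);
        System.out.println(CompactionCheck.needsCompaction(fake, 3));
    }
}
```

Because the consumer only sees the interface, a test can supply canned data without instantiating or patching the concrete class.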


[jira] [Commented] (HBASE-6972) HBase Shell deleteall should not require column to be defined


[ 
https://issues.apache.org/jira/browse/HBASE-6972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576545#comment-13576545
 ] 

Matteo Bertozzi commented on HBASE-6972:


+1 on the .3 patch

 HBase Shell deleteall should not require column to be defined 
 --

 Key: HBASE-6972
 URL: https://issues.apache.org/jira/browse/HBASE-6972
 Project: HBase
  Issue Type: Bug
  Components: shell
Affects Versions: 0.96.0
Reporter: Ricky Saltzer
Assignee: Ricky Saltzer
 Fix For: 0.96.0

 Attachments: HBASE-6972.2.patch, HBASE-6972.3.patch, HBASE-6972.patch


 It appears that the shell does not allow users to delete a row without 
 specifying a column (deleteall). It looks like the deleteall.rb used to 
 pre-define column as nil, making it optional. 
 I've created a patch and confirmed it to be working in standalone mode, I 
 will upload it shortly. 


[jira] [Commented] (HBASE-7789) Clean DeadServer.java and add a Jitter method in ConnectionUtils


[ 
https://issues.apache.org/jira/browse/HBASE-7789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576573#comment-13576573
 ] 

Hudson commented on HBASE-7789:
---

Integrated in HBase-TRUNK #3870 (See 
[https://builds.apache.org/job/HBase-TRUNK/3870/])
HBASE-7789 Clean DeadServer.java and add a Jitter method in ConnectionUtils 
(Revision 1445087)

 Result = FAILURE
nkeywal : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/ConnectionUtils.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/DeadServer.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterStatusServlet.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestDeadServer.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestRollingRestart.java


 Clean DeadServer.java and add a Jitter method in ConnectionUtils
 

 Key: HBASE-7789
 URL: https://issues.apache.org/jira/browse/HBASE-7789
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.96.0

 Attachments: 7789.v1.patch, 7789.v2.patch



[jira] [Commented] (HBASE-7821) Remove Duplicated code in CompactSelection


[ 
https://issues.apache.org/jira/browse/HBASE-7821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576584#comment-13576584
 ] 

Hudson commented on HBASE-7821:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #404 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/404/])
HBASE-7821 Remove Duplicated code in CompactSelection (Revision 1444998)

 Result = FAILURE
eclark : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/CompactSelection.java


 Remove Duplicated code in CompactSelection
 --

 Key: HBASE-7821
 URL: https://issues.apache.org/jira/browse/HBASE-7821
 Project: HBase
  Issue Type: Task
  Components: Compaction
Affects Versions: 0.96.0
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Minor
 Fix For: 0.96.0

 Attachments: HBASE-7821-0.patch





[jira] [Commented] (HBASE-7789) Clean DeadServer.java and add a Jitter method in ConnectionUtils


[ 
https://issues.apache.org/jira/browse/HBASE-7789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576583#comment-13576583
 ] 

Hudson commented on HBASE-7789:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #404 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/404/])
HBASE-7789 Clean DeadServer.java and add a Jitter method in ConnectionUtils 
(Revision 1445087)

 Result = FAILURE
nkeywal : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/ConnectionUtils.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/DeadServer.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterStatusServlet.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestDeadServer.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestRollingRestart.java


 Clean DeadServer.java and add a Jitter method in ConnectionUtils
 

 Key: HBASE-7789
 URL: https://issues.apache.org/jira/browse/HBASE-7789
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.96.0

 Attachments: 7789.v1.patch, 7789.v2.patch



[jira] [Commented] (HBASE-7407) TestMasterFailover under tests some cases and over tests some others


[ 
https://issues.apache.org/jira/browse/HBASE-7407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576585#comment-13576585
 ] 

Hudson commented on HBASE-7407:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #404 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/404/])
HBASE-7407 TestMasterFailover under tests some cases and over tests some 
others (Revision 1445074)

 Result = FAILURE
nkeywal : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/DoNotRetryIOException.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/PleaseHoldException.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/executor/EventHandler.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterFailover.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterFailoverBalancerPersistence.java


 TestMasterFailover under tests some cases and over tests some others
 

 Key: HBASE-7407
 URL: https://issues.apache.org/jira/browse/HBASE-7407
 Project: HBase
  Issue Type: Bug
  Components: master, Region Assignment, test
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 7407.v1.patch, 7407.v2.patch, 7407.v3.patch, 
 7407.v5.patch, 7407.v6.patch, 7407_v7.patch



[jira] [Updated] (HBASE-7590) Add a costless notifications mechanism from master to regionservers clients

[
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

nkeywal updated HBASE-7590:
---

Attachment: 7590.inprogress.patch

Add a costless notifications mechanism from master to regionservers clients
-

Key: HBASE-7590
URL: https://issues.apache.org/jira/browse/HBASE-7590
Project: HBase
Issue Type: Bug
Components: Client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Attachments: 7590.inprogress.patch

It would be very useful to add a mechanism to distribute some information to
the clients and regionservers. In particular, it would be useful to know
globally (regionservers + client apps) that some regionservers are dead. This
would allow:
- lowering the load on the system, as clients would no longer use stale
information and go to dead machines;
- making the recovery faster from a client point of view. It's common to use
large timeouts on the client side, so the client may need a lot of time
before declaring a region server dead and trying another one. If the client
receives the information about a region server's state separately, it can
take the right decision and continue or stop waiting accordingly.
We can also send more information, for example instructions like 'slow down'
to tell the client to increase the retry delay, and so on.
Technically, the master could send this information. To lower the load on
the system:
- we should have a multicast communication (i.e. the master does not have to
connect to all servers by TCP), with one packet every 10 seconds or so;
- receivers should not depend on this: if the information is available, great;
if not, it should not break anything;
- it should be optional.
So in the end we would have a thread in the master sending a protobuf message
about the dead servers on a multicast socket. If the socket is not
configured, it does nothing. On the client side, when we receive the
information that a node is dead, we refresh the cache about it.
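
The client-side step described above (refresh the cache when a node is reported dead) could look roughly like the sketch below. It covers only the receiving side and leaves out the transport (multicast socket, protobuf decoding). The class and method names are hypothetical and are not taken from the in-progress patch.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class DeadServerNotificationSketch {
    // Region name -> server currently believed to host it (hypothetical client cache).
    private final Map<String, String> regionLocations = new ConcurrentHashMap<>();

    void cacheLocation(String region, String server) {
        regionLocations.put(region, server);
    }

    /** Returns the cached server for a region, or null if nothing is cached. */
    String cachedLocation(String region) {
        return regionLocations.get(region);
    }

    /** Called when a notification listing dead servers arrives: drop entries pointing at them. */
    void onDeadServers(Set<String> deadServers) {
        regionLocations.values().removeIf(deadServers::contains);
    }

    public static void main(String[] args) {
        DeadServerNotificationSketch cache = new DeadServerNotificationSketch();
        cache.cacheLocation("region-a", "rs1");
        cache.cacheLocation("region-b", "rs2");
        cache.onDeadServers(Collections.singleton("rs1"));
        // region-a is no longer cached, so the next lookup must ask for a fresh location.
        System.out.println(cache.cachedLocation("region-a"));
        System.out.println(cache.cachedLocation("region-b"));
    }
}
```

If no notification ever arrives, the cache simply behaves as before, which matches the requirement that receivers must not depend on the mechanism.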

[jira] [Commented] (HBASE-7590) Add a costless notifications mechanism from master to regionservers clients

[
https://issues.apache.org/jira/browse/HBASE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576641#comment-13576641
]

nkeywal commented on HBASE-7590:

The current patch shows the work in progress. All tests pass, with or without
multicast activated. It also works on a real cluster.
I still have some work to do:
- I've hijacked the current ClusterStatus protobuf; I'm going to create a
specific one.
- I need to do some cleanup around ServerName and ServerCallable.
- Plus various other things.

Add a costless notifications mechanism from master to regionservers clients
-

[jira] [Commented] (HBASE-7829) zookeeper kerberos conf keytab and principal parameters interchanged


[ 
https://issues.apache.org/jira/browse/HBASE-7829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576710#comment-13576710
 ] 

Ted Yu commented on HBASE-7829:
---

Integrated to 0.94 and trunk.

Thanks for the patch, Francis.

Thanks for the reviews, Lars and Matteo.

 zookeeper kerberos conf keytab and principal parameters interchanged
 

 Key: HBASE-7829
 URL: https://issues.apache.org/jira/browse/HBASE-7829
 Project: HBase
  Issue Type: Bug
Reporter: Francis Liu
Assignee: Francis Liu
Priority: Minor
 Fix For: 0.94.6

 Attachments: HBASE-7829_94.patch





[jira] [Updated] (HBASE-7829) zookeeper kerberos conf keytab and principal parameters interchanged


 [ 
https://issues.apache.org/jira/browse/HBASE-7829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-7829:
--

Fix Version/s: 0.96.0
 Hadoop Flags: Reviewed

 zookeeper kerberos conf keytab and principal parameters interchanged
 

 Key: HBASE-7829
 URL: https://issues.apache.org/jira/browse/HBASE-7829
 Project: HBase
  Issue Type: Bug
Reporter: Francis Liu
Assignee: Francis Liu
Priority: Minor
 Fix For: 0.96.0, 0.94.6

 Attachments: HBASE-7829_94.patch





[jira] [Updated] (HBASE-7829) zookeeper kerberos conf keytab and principal parameters interchanged


 [ 
https://issues.apache.org/jira/browse/HBASE-7829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-7829:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

 zookeeper kerberos conf keytab and principal parameters interchanged
 

 Key: HBASE-7829
 URL: https://issues.apache.org/jira/browse/HBASE-7829
 Project: HBase
  Issue Type: Bug
Reporter: Francis Liu
Assignee: Francis Liu
Priority: Minor
 Fix For: 0.96.0, 0.94.6

 Attachments: HBASE-7829_94.patch





[jira] [Commented] (HBASE-7823) Add ignores to hbase-prefix-tree


[ 
https://issues.apache.org/jira/browse/HBASE-7823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576725#comment-13576725
 ] 

stack commented on HBASE-7823:
--

+1

 Add ignores to hbase-prefix-tree
 

 Key: HBASE-7823
 URL: https://issues.apache.org/jira/browse/HBASE-7823
 Project: HBase
  Issue Type: Task
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Trivial
 Attachments: HBASE-7823-0.patch





[jira] [Commented] (HBASE-7521) fix HBASE-6060 (regions stuck in opening state) in 0.94

[
https://issues.apache.org/jira/browse/HBASE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576782#comment-13576782
]

ramkrishna.s.vasudevan commented on HBASE-7521:
---

@Sergey
Do you see any problem with rolling restarts with this?
I don't have experience with rolling restarts. Also, looking at the changes, the
patch neither introduces new states nor changes the order of the states.
@Jimmy
I agree that there may be some hidden cases. On the user mailing list we still
see people saying that regions are in transition and that they are not sure how
to get around it.
At least this patch solves the obvious cases of a region stuck in OPENING state.
Jimmy, do you think this will affect rolling restarts?
@Stack
What do you think before we integrate to 0.94?

fix HBASE-6060 (regions stuck in opening state) in 0.94
---

Key: HBASE-7521
URL: https://issues.apache.org/jira/browse/HBASE-7521
Project: HBase
Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Fix For: 0.94.6

Attachments: 7521-original-patch-ported-v4.patch,
HBASE-7521-original-patch-ported-v0.patch,
HBASE-7521-original-patch-ported-v1.patch,
HBASE-7521-original-patch-ported-v2.patch,
HBASE-7521-original-patch-ported-v3.patch, HBASE-7521-v0.patch,
HBASE-7521-v1.patch, HBASE-7521-v5.patch

Discussion in HBASE-6060 implies that the fix there does not work on 0.94.
Still, we may want to fix the issue in 0.94 (via some different fix) because
regions stuck in opening for ridiculous amounts of time are not a good
thing to have.

[jira] [Commented] (HBASE-7829) zookeeper kerberos conf keytab and principal parameters interchanged


[ 
https://issues.apache.org/jira/browse/HBASE-7829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576791#comment-13576791
 ] 

Hudson commented on HBASE-7829:
---

Integrated in HBase-0.94 #838 (See 
[https://builds.apache.org/job/HBase-0.94/838/])
HBASE-7829 zookeeper kerberos conf keytab and principal parameters 
interchanged (Francis) (Revision 1445211)

 Result = FAILURE
tedyu : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java


 zookeeper kerberos conf keytab and principal parameters interchanged
 

 Key: HBASE-7829
 URL: https://issues.apache.org/jira/browse/HBASE-7829
 Project: HBase
  Issue Type: Bug
Reporter: Francis Liu
Assignee: Francis Liu
Priority: Minor
 Fix For: 0.96.0, 0.94.6

 Attachments: HBASE-7829_94.patch





[jira] [Updated] (HBASE-7781) Provide the ability to run integration tests with security enabled


 [ 
https://issues.apache.org/jira/browse/HBASE-7781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-7781:
--

Priority: Critical  (was: Major)

 Provide the ability to run integration tests with security enabled
 --

 Key: HBASE-7781
 URL: https://issues.apache.org/jira/browse/HBASE-7781
 Project: HBase
  Issue Type: Test
  Components: security, test
Reporter: Gary Helmling
Priority: Critical

 We currently have large holes in the test coverage of HBase with security 
 enabled.  Two recent examples of bugs which really should have been caught 
 with testing are HBASE-7771 and HBASE-7772.  The long standing problem with 
 testing with security enabled has been the requirement for supporting 
 kerberos infrastructure.
 We need to close this gap and provide some automated testing with security 
 enabled, if necessary standing up and provisioning a temporary KDC as an 
 option for running integration tests, see HADOOP-8078 and HADOOP-9004 where a 
 similar approach was taken.


[jira] [Updated] (HBASE-7814) Port HBASE-6963 'unable to run hbck on a secure cluster' to 0.94


 [ 
https://issues.apache.org/jira/browse/HBASE-7814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-7814:
--

Attachment: 7814-v2.txt

Patch v2 uses reflection to see if ugi.getShortUserName() is supported.

checkAccess() is only called by hbck at the beginning of the execution. So 
performance impact is negligible.
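
A minimal sketch of this kind of reflection probe, assuming only the JDK. The
helper class below is hypothetical and is not the code from the patch; it only
illustrates "call the method if this version of the class has it, otherwise
fall back", using a plain String as the target so it runs without Hadoop.

```java
import java.lang.reflect.Method;

/** Hypothetical helper: call an optional public no-arg method, else fall back. */
class OptionalMethod {
  /** Returns the public no-arg method if the class has it, otherwise null. */
  static Method find(Class<?> type, String name) {
    try {
      return type.getMethod(name);
    } catch (NoSuchMethodException e) {
      return null;
    }
  }

  /** Invokes target.name() when available; returns the fallback otherwise. */
  static String callOrDefault(Object target, String name, String fallback) {
    Method m = find(target.getClass(), name);
    if (m == null) {
      return fallback;
    }
    try {
      return String.valueOf(m.invoke(target));
    } catch (ReflectiveOperationException e) {
      // Method exists but could not be called; treat it as unsupported.
      return fallback;
    }
  }
}
```

Because the lookup happens once per hbck run, the cost of reflection does not
matter here; in a hot path the resolved Method would normally be cached.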

 Port HBASE-6963 'unable to run hbck on a secure cluster' to 0.94
 

 Key: HBASE-7814
 URL: https://issues.apache.org/jira/browse/HBASE-7814
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Sergey Shelukhin
 Fix For: 0.94.6

 Attachments: 7814-v2.txt, HBASE-7814-v0.patch





[jira] [Assigned] (HBASE-7814) Port HBASE-6963 'unable to run hbck on a secure cluster' to 0.94


 [ 
https://issues.apache.org/jira/browse/HBASE-7814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-7814:
-

Assignee: Ted Yu  (was: Sergey Shelukhin)

 Port HBASE-6963 'unable to run hbck on a secure cluster' to 0.94
 

 Key: HBASE-7814
 URL: https://issues.apache.org/jira/browse/HBASE-7814
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.94.6

 Attachments: 7814-v2.txt, HBASE-7814-v0.patch





[jira] [Commented] (HBASE-7829) zookeeper kerberos conf keytab and principal parameters interchanged


[ 
https://issues.apache.org/jira/browse/HBASE-7829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576794#comment-13576794
 ] 

Andrew Purtell commented on HBASE-7829:
---

Thanks. This shows that we are generally not testing security. I raised the 
priority of HBASE-7781 to critical and hope to have some time for it soon.

 zookeeper kerberos conf keytab and principal parameters interchanged
 

 Key: HBASE-7829
 URL: https://issues.apache.org/jira/browse/HBASE-7829
 Project: HBase
  Issue Type: Bug
Reporter: Francis Liu
Assignee: Francis Liu
Priority: Minor
 Fix For: 0.96.0, 0.94.6

 Attachments: HBASE-7829_94.patch





[jira] [Commented] (HBASE-7829) zookeeper kerberos conf keytab and principal parameters interchanged


[ 
https://issues.apache.org/jira/browse/HBASE-7829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576795#comment-13576795
 ] 

Hudson commented on HBASE-7829:
---

Integrated in HBase-TRUNK #3873 (See 
[https://builds.apache.org/job/HBase-TRUNK/3873/])
HBASE-7829 zookeeper kerberos conf keytab and principal parameters 
interchanged (Francis) (Revision 1445213)

 Result = FAILURE
tedyu : 
Files : 
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java


 zookeeper kerberos conf keytab and principal parameters interchanged
 

 Key: HBASE-7829
 URL: https://issues.apache.org/jira/browse/HBASE-7829
 Project: HBase
  Issue Type: Bug
Reporter: Francis Liu
Assignee: Francis Liu
Priority: Minor
 Fix For: 0.96.0, 0.94.6

 Attachments: HBASE-7829_94.patch





[jira] [Commented] (HBASE-7781) Provide the ability to run integration tests with security enabled


[ 
https://issues.apache.org/jira/browse/HBASE-7781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576796#comment-13576796
 ] 

Andrew Purtell commented on HBASE-7781:
---

HBASE-7829 is another.

 Provide the ability to run integration tests with security enabled
 --

 Key: HBASE-7781
 URL: https://issues.apache.org/jira/browse/HBASE-7781
 Project: HBase
  Issue Type: Test
  Components: security, test
Reporter: Gary Helmling
Priority: Critical


[jira] [Commented] (HBASE-7667) Support stripe compaction

[
https://issues.apache.org/jira/browse/HBASE-7667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576798#comment-13576798
]

Sergey Shelukhin commented on HBASE-7667:
-

The stripe boundaries are supposed to be an internal detail not visible to the
user; the user only configures the scheme (number of stripes, or size-based
stripes as described for time series data).
What kind of stripe configuration do you have in mind?
After talking about splits yesterday I am planning to make it a little more
flexible for splits, but still internal.

Support stripe compaction
-

Key: HBASE-7667
URL: https://issues.apache.org/jira/browse/HBASE-7667
Project: HBase
Issue Type: New Feature
Components: Compaction
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin

So I was thinking about having many regions as the way to make compactions
more manageable, and writing the level db doc about how level db range
overlap and data mixing breaks seqNum sorting, and discussing it with Jimmy,
Matteo and Ted, and thinking about how to avoid Level DB I/O multiplication
factor.
And I suggest the following idea, let's call it stripe compactions. It's a
mix between level db ideas and having many small regions.
It allows us to have a subset of benefits of many regions (wrt reads and
compactions) without many of the drawbacks (managing and current
memstore/etc. limitation).
It also doesn't break seqNum-based file sorting for any one key.
It works like this.
The region key space is separated into a configurable number of fixed-boundary
stripes (determined the first time we stripe the data, see below).
All the data from memstores is written to normal files with all keys present
(not striped), similar to L0 in LevelDb, or current files.
Compaction policy does 3 types of compactions.
First is L0 compaction, which takes all L0 files and breaks them down by
stripe. It may be optimized by adding more small files from different
stripes, but the main logical outcome is that there are no more L0 files and
all data is striped.
Second is exactly like the current compaction, but compacting one single
stripe. In the future, nothing prevents us from applying compaction rules and
compacting part of the stripe (e.g. similar to the current policy with ratios
and stuff, tiers, whatever), but for the first cut I'd argue let it major
compact the entire stripe. Or just have the ratio and no more complexity.
Finally, the third addresses the concern of the fixed boundaries causing
stripes to be very unbalanced.
It's exactly like the 2nd, except it takes 2+ adjacent stripes and writes the
results out with different boundaries.
There's a tradeoff here - if we always take 2 adjacent stripes, compactions
will be smaller but rebalancing will take a ridiculous amount of I/O.
If we take many stripes we are essentially getting into the
epic-major-compaction problem again. Some heuristics will have to be in place.
In general, if, before stripes are determined, we initially let L0 grow
before determining the stripes, we will get better boundaries.
Also, unless unbalancing is really large we don't need to rebalance really.
Obviously this scheme (as well as level) is not applicable for all scenarios,
e.g. if timestamp is your key it completely falls apart.
The end result:
- many small compactions that can be spread out in time.
- reads still read from a small number of files (one stripe + L0).
- region splits become marvelously simple (if we could move files between
regions, no references would be needed).
Main advantage over Level (for HBase) is that default store can still open
the files and get correct results - there are no range overlap shenanigans.
It also needs no metadata, although we may record some for convenience.
It also would appear to not cause as much I/O.
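
The L0 step described above can be sketched roughly as follows. This is only an
illustration under simplifying assumptions: row keys are Strings rather than
byte arrays, a "file" is just a list of keys, and the Stripes class and its
methods are hypothetical names, not part of any patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical model of fixed-boundary stripes; row keys are Strings for brevity. */
class Stripes {
  // Sorted split points; stripe i holds keys in [splits[i-1], splits[i]).
  private final List<String> splits;

  Stripes(String... splitPoints) {
    this.splits = Arrays.asList(splitPoints);
  }

  int count() {
    return splits.size() + 1;
  }

  /** Index of the stripe whose key range contains the given row key. */
  int stripeOf(String rowKey) {
    int pos = Collections.binarySearch(splits, rowKey);
    // An exact hit on a split point belongs to the stripe starting at that point.
    return pos >= 0 ? pos + 1 : -(pos + 1);
  }

  /** Models an L0 compaction: breaks unstriped keys down into one sorted list per stripe. */
  List<List<String>> compactL0(List<String> l0Keys) {
    List<List<String>> out = new ArrayList<>();
    for (int i = 0; i < count(); i++) {
      out.add(new ArrayList<>());
    }
    List<String> sorted = new ArrayList<>(l0Keys);
    Collections.sort(sorted);
    for (String key : sorted) {
      out.get(stripeOf(key)).add(key);
    }
    return out;
  }
}
```

In this model a read for one key only needs the list for stripeOf(key) plus any
remaining L0 data, which is the "one stripe + L0" property listed above.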

[jira] [Updated] (HBASE-7798) ZKAssign logs the wrong server if the transition fails


 [ 
https://issues.apache.org/jira/browse/HBASE-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-7798:


Attachment: HBASE-7798-v1.patch

Updating patch

 ZKAssign logs the wrong server if the transition fails
 --

 Key: HBASE-7798
 URL: https://issues.apache.org/jira/browse/HBASE-7798
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Trivial
 Attachments: 7798-v0.patch, HBASE-7798-v0.patch, HBASE-7798-v1.patch





[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment

[
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576806#comment-13576806
]

Andrew Purtell commented on HBASE-6721:
---

bq. With this, master, zk and META are still shared across different groups
with little isolation guarantees or SLAs. I am wondering whether this is worth
the effort without having proper multi-tenancy at all levels.

I agree with this, but in defense of Francis' approach here, HDFS is also
shared (though it helps that writes prefer the local datanode, as do
short-circuit reads). Admission control / QoS is a hard problem that requires a full stack
conversation and, in my opinion, calls for considering a framework for
admission control spanning from Common outward, something similar to a security
access token (perhaps even an attribute of same) with support at each layer for
mapping a logical notion of priority or quota to the actual enforcement
mechanism, whatever it is. That will be a difficult project to say the least
and in the meantime we could make incremental improvements at various layers of
the stack if there is a demonstrated benefit.

RegionServer Group based Assignment
---

Key: HBASE-6721
URL: https://issues.apache.org/jira/browse/HBASE-6721
Project: HBase
Issue Type: New Feature
Reporter: Francis Liu
Assignee: Vandana Ayyalasomayajula
Fix For: 0.96.0

Attachments: HBASE-6721_94_2.patch, HBASE-6721_94_3.patch,
HBASE-6721_94_3.patch, HBASE-6721_94_4.patch, HBASE-6721_94_5.patch,
HBASE-6721_94_6.patch, HBASE-6721_94_7.patch, HBASE-6721_94.patch,
HBASE-6721_94.patch, HBASE-6721-DesigDoc.pdf

In multi-tenant deployments of HBase, it is likely that a RegionServer will
be serving out regions from a number of different tables owned by various
client applications. Being able to group a subset of running RegionServers
and assign specific tables to it, provides a client application a level of
isolation and resource allocation.
The proposal essentially is to have an AssignmentManager which is aware of
RegionServer groups and assigns tables to region servers based on groupings.
Load balancing will occur on a per group basis as well.
This is essentially a simplification of the approach taken in HBASE-4120. See
attached document.
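
A rough sketch of the grouping idea, assuming round-robin placement within a
group. The GroupAssigner class and its methods are hypothetical and much
simpler than a real AssignmentManager; they only show that a table's regions
are spread over the servers of its own group and never leave it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical group-aware assigner: tables belong to groups, groups own servers. */
class GroupAssigner {
  private final Map<String, List<String>> groupToServers = new HashMap<>();
  private final Map<String, String> tableToGroup = new HashMap<>();

  void addGroup(String group, List<String> servers) {
    groupToServers.put(group, new ArrayList<>(servers));
  }

  void assignTable(String table, String group) {
    tableToGroup.put(table, group);
  }

  /** Spreads the table's regions round-robin over the servers of its group only. */
  Map<String, String> assign(String table, List<String> regions) {
    String group = tableToGroup.get(table);
    List<String> servers = group == null ? null : groupToServers.get(group);
    if (servers == null || servers.isEmpty()) {
      throw new IllegalStateException("No servers available for table " + table);
    }
    Map<String, String> plan = new HashMap<>();
    for (int i = 0; i < regions.size(); i++) {
      plan.put(regions.get(i), servers.get(i % servers.size()));
    }
    return plan;
  }
}
```

Balancing per group then amounts to running the same placement logic with only
that group's servers as candidates.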

[jira] [Updated] (HBASE-7678) make storefile management pluggable, together with compaction


 [ 
https://issues.apache.org/jira/browse/HBASE-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-7678:


Attachment: HBASE-7678-v2.patch

attach patch to rerun tests...

 make storefile management pluggable, together with compaction
 -

 Key: HBASE-7678
 URL: https://issues.apache.org/jira/browse/HBASE-7678
 Project: HBase
  Issue Type: Sub-task
  Components: Compaction
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-7678--and-7603.patch, HBASE-7678-v0.patch, 
 HBASE-7678-v1.patch, HBASE-7678-v2.patch, HBASE-7678-v2.patch, Pluggable 
 compactions doc.pdf





[jira] [Commented] (HBASE-7822) clean up compactionrequest and compactselection - part 1


[ 
https://issues.apache.org/jira/browse/HBASE-7822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576816#comment-13576816
 ] 

Sergey Shelukhin commented on HBASE-7822:
-

org.apache.hadoop.hbase.security.access.TestAccessController passes on local...

 clean up compactionrequest and compactselection - part 1
 

 Key: HBASE-7822
 URL: https://issues.apache.org/jira/browse/HBASE-7822
 Project: HBase
  Issue Type: Improvement
  Components: Compaction
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-7822-v0.patch, HBASE-7822-v0.patch


 Certain parts of CompactionRequest are unnecessary.
 Off-peak hour management is part way in selection, part way in Store and part 
 way in policy.
 Needs to be cleaned up in preparation of having CompactionPolicy return 
 CompactionRequest.


[jira] [Updated] (HBASE-7822) clean up compactionrequest and compactselection - part 1


 [ 
https://issues.apache.org/jira/browse/HBASE-7822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-7822:


Attachment: HBASE-7822-v0.patch

 clean up compactionrequest and compactselection - part 1
 

 Key: HBASE-7822
 URL: https://issues.apache.org/jira/browse/HBASE-7822
 Project: HBase
  Issue Type: Improvement
  Components: Compaction
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-7822-v0.patch, HBASE-7822-v0.patch



[jira] [Commented] (HBASE-7798) ZKAssign logs the wrong server if the transition fails


[ 
https://issues.apache.org/jira/browse/HBASE-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576817#comment-13576817
 ] 

Hadoop QA commented on HBASE-7798:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12569019/HBASE-7798-v1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4423//console

This message is automatically generated.

 ZKAssign logs the wrong server if the transition fails
 --

 Key: HBASE-7798
 URL: https://issues.apache.org/jira/browse/HBASE-7798
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Trivial
 Attachments: 7798-v0.patch, HBASE-7798-v0.patch, HBASE-7798-v1.patch





[jira] [Commented] (HBASE-7521) fix HBASE-6060 (regions stuck in opening state) in 0.94


[ 
https://issues.apache.org/jira/browse/HBASE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576819#comment-13576819
 ] 

stack commented on HBASE-7521:
--

What do I think?  This is a big change in AM internals.  I see Enis has tested 
it above but it only passes most of the time.  Any chance of some more 
testing before it gets committed?

 fix HBASE-6060 (regions stuck in opening state) in 0.94
 ---

 Key: HBASE-7521
 URL: https://issues.apache.org/jira/browse/HBASE-7521
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.94.6

 Attachments: 7521-original-patch-ported-v4.patch, 
 HBASE-7521-original-patch-ported-v0.patch, 
 HBASE-7521-original-patch-ported-v1.patch, 
 HBASE-7521-original-patch-ported-v2.patch, 
 HBASE-7521-original-patch-ported-v3.patch, HBASE-7521-v0.patch, 
 HBASE-7521-v1.patch, HBASE-7521-v5.patch



[jira] [Commented] (HBASE-6972) HBase Shell deleteall should not require column to be defined


[ 
https://issues.apache.org/jira/browse/HBASE-6972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576824#comment-13576824
 ] 

stack commented on HBASE-6972:
--

+1 on patch.  Let's commit in a new issue since this issue has gone stale.  Should 
it be in 0.94?

 HBase Shell deleteall should not require column to be defined 
 --

 Key: HBASE-6972
 URL: https://issues.apache.org/jira/browse/HBASE-6972
 Project: HBase
  Issue Type: Bug
  Components: shell
Affects Versions: 0.96.0
Reporter: Ricky Saltzer
Assignee: Ricky Saltzer
 Fix For: 0.96.0

 Attachments: HBASE-6972.2.patch, HBASE-6972.3.patch, HBASE-6972.patch


 It appears that the shell does not allow users to delete a row without 
 specifying a column (deleteall). It looks like the deleteall.rb used to 
 pre-define column as nil, making it optional. 
 I've created a patch and confirmed it to be working in standalone mode, I 
 will upload it shortly. 


[jira] [Commented] (HBASE-7006) [MTTR] Study distributed log splitting to see how we can make it faster

2013-02-12 Thread Jeffrey Zhong (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-7006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576821#comment-13576821
 ] 

Jeffrey Zhong commented on HBASE-7006:
--

@Ted,

Yes, my first patch will include the major logic for this JIRA. It will be attached 
to a sub-task JIRA (to be created) and submitted within the next two days. There 
will be two more sub-task JIRAs: one to create a replay command and the other 
to add metrics for better reporting.

Thanks,
-Jeffrey 

 [MTTR] Study distributed log splitting to see how we can make it faster
 ---

 Key: HBASE-7006
 URL: https://issues.apache.org/jira/browse/HBASE-7006
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: Jeffrey Zhong
 Fix For: 0.96.0

 Attachments: LogSplitting Comparison.pdf, 
 ProposaltoimprovelogsplittingprocessregardingtoHBASE-7006.pdf


 Just saw an interesting issue where a cluster went down hard and 30 nodes had 
 1700 WALs to replay.  Replay took almost an hour.  It looks like it could run 
 faster; much of the time is spent zk'ing and nn'ing.
 Putting in 0.96 so it gets a look at least.  Can always punt.


[jira] [Commented] (HBASE-7667) Support stripe compaction

[
https://issues.apache.org/jira/browse/HBASE-7667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576826#comment-13576826
]

stack commented on HBASE-7667:
--

bq. Currently Sergey puts the boundaries in metadata of store files.

What is wrong w/ this? It is the bounds of the keys the file contains? (Is
this not there already, the first and last key?)

What are the thoughts on figuring out region stripe boundaries? Will it be done by
looking at the memstore content just before flush and dividing it into n files?
Or will it be done after the first flush, on subsequent compactions, by looking
at the content of L0?

Support stripe compaction
-

Key: HBASE-7667
URL: https://issues.apache.org/jira/browse/HBASE-7667
Project: HBase
Issue Type: New Feature
Components: Compaction
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin

[jira] [Commented] (HBASE-7667) Support stripe compaction

[
https://issues.apache.org/jira/browse/HBASE-7667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576829#comment-13576829
]

Sergey Shelukhin commented on HBASE-7667:
-

In the N stripes scheme, boundaries will be determined at the first L0
compaction. I intend to add a config parameter to make the first L0
compaction wait for more files, to get better boundaries. Then, the stripes can be
rebalanced, but the hope is that it happens infrequently.
In the growing key range scheme (for time series above) stripe boundaries
don't matter much, and are basically determined based on size. If nothing
intervenes, by EOW or next week I hope to have a preliminary compaction
policy/compactor patch.
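
One possible way to choose boundaries at the first L0 compaction, shown only as
a sketch: take the keys seen so far, sort them, and cut at equal-count
positions. The StripeBoundaryPicker class is hypothetical, uses String keys for
brevity, and picks by key count rather than by byte size; the actual policy may
well differ.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical boundary chooser for the N stripes scheme. */
class StripeBoundaryPicker {
  /**
   * Picks at most stripeCount - 1 strictly increasing split points so that
   * each stripe receives a similar number of the observed keys.
   */
  static List<String> pick(List<String> l0Keys, int stripeCount) {
    if (stripeCount < 1) {
      throw new IllegalArgumentException("stripeCount must be >= 1");
    }
    List<String> sorted = new ArrayList<>(l0Keys);
    Collections.sort(sorted);
    List<String> splits = new ArrayList<>();
    for (int i = 1; i < stripeCount; i++) {
      int idx = (int) ((long) i * sorted.size() / stripeCount);
      if (idx == 0 || idx >= sorted.size()) {
        continue; // Too few keys to place this boundary usefully.
      }
      String candidate = sorted.get(idx);
      // Skip repeated keys so boundaries stay strictly increasing.
      if (splits.isEmpty() || splits.get(splits.size() - 1).compareTo(candidate) < 0) {
        splits.add(candidate);
      }
    }
    return splits;
  }
}
```

The more keys L0 has accumulated before this runs, the closer the cut points
are to the true distribution, which is the motivation for letting the first L0
compaction wait for more files.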

Support stripe compaction
-

Key: HBASE-7667
URL: https://issues.apache.org/jira/browse/HBASE-7667
Project: HBase
Issue Type: New Feature
Components: Compaction
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin

[jira] [Created] (HBASE-7830) Disable IntegrationTestRebalanceAndKillServersTargeted temporarily

stack created HBASE-7830:


 Summary: Disable IntegrationTestRebalanceAndKillServersTargeted temporarily
 Key: HBASE-7830
 URL: https://issues.apache.org/jira/browse/HBASE-7830
 Project: HBase
  Issue Type: Bug
Reporter: stack


Disabling IntegrationTestRebalanceAndKillServersTargeted until HBASE-7520 is 
fixed so we can move forward with getting hbase-it running on bigtop infra.


[jira] [Commented] (HBASE-7520) org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted fails when I cd hbase-it and mvn verify


[ 
https://issues.apache.org/jira/browse/HBASE-7520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576832#comment-13576832
 ] 

stack commented on HBASE-7520:
--

I see it is still hung.  I suppose yeah, for now, I'll disable it since I don't 
have time at the moment to dig in.  Let me do it in a separate issue.  Thanks Sergey for 
looking.

 org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted fails 
 when I cd hbase-it and mvn verify
 --

 Key: HBASE-7520
 URL: https://issues.apache.org/jira/browse/HBASE-7520
 Project: HBase
  Issue Type: Bug
  Components: test
 Environment: macosx trunk
Reporter: stack
Assignee: Sergey Shelukhin
Priority: Critical

 Trying to make up something to hand off to bigtop project, running the hbase 
 it tests, this one fails.
 {code}
 durruti:failsafe-reports stack$ more 
 org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted.txt 
 ---
 Test set: 
 org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted
 ---
 Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 206.538 sec 
  FAILURE!
 testDataIngest(org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted)
   Time elapsed: 206.395 sec   FAILURE!
 junit.framework.AssertionFailedError: Load failed with error code 1
 at junit.framework.Assert.fail(Assert.java:50)
 at 
 org.apache.hadoop.hbase.IngestIntegrationTestBase.runIngestTest(IngestIntegrationTestBase.java:98)
 at 
 org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted.testDataIngest(IntegrationTestRebalanceAndKillServersTargeted.java:121)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 ...
 {code}
 org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted-output.txt
   has nothing in it.


[jira] [Commented] (HBASE-7667) Support stripe compaction

[
https://issues.apache.org/jira/browse/HBASE-7667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576836#comment-13576836
]

Ted Yu commented on HBASE-7667:
---

bq. It is the bounds of the keys the file contains? (Is this not there already,
the first and last key?)
Currently HFile provides the following methods:
{code}
byte[] getFirstRowKey();

byte[] getLastRowKey();
{code}
Stripe boundaries should be different from the above.
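
A minimal sketch of why the two notions can differ, assuming stripes are
defined by fixed boundary keys that partition the key space. The StripeSketch
class and its methods are hypothetical and are not HBase API. A file's first
and last row keys are whatever rows it happens to contain, while the stripe
that owns those rows is delimited by the boundaries.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names come from HBase.
final class StripeSketch {
  /** Sorted boundary keys; stripe i covers [boundaries[i-1], boundaries[i]). */
  private final List<String> boundaries;

  StripeSketch(String... sortedBoundaries) {
    this.boundaries = Arrays.asList(sortedBoundaries);
  }

  /** Index of the stripe that owns rowKey: the number of boundaries <= rowKey. */
  int stripeOf(String rowKey) {
    int idx = 0;
    while (idx < boundaries.size() && boundaries.get(idx).compareTo(rowKey) <= 0) {
      idx++;
    }
    return idx;
  }

  /** True if a file whose rows span [firstRowKey, lastRowKey] fits in one stripe. */
  boolean fitsOneStripe(String firstRowKey, String lastRowKey) {
    return stripeOf(firstRowKey) == stripeOf(lastRowKey);
  }

  public static void main(String[] args) {
    StripeSketch stripes = new StripeSketch("g", "p");
    // A file holding rows "h".."k" has narrower bounds than its stripe [g, p).
    System.out.println(stripes.stripeOf("h") + " " + stripes.fitsOneStripe("h", "k")); // 1 true
    // A file holding rows "a".."z" crosses both boundaries.
    System.out.println(stripes.fitsOneStripe("a", "z")); // false
  }
}
```

Under this assumption, getFirstRowKey()/getLastRowKey() describe a file's
contents, and a separate piece of metadata would be needed to record which
stripe the file was written for.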

Support stripe compaction
-

Key: HBASE-7667
URL: https://issues.apache.org/jira/browse/HBASE-7667
Project: HBase
Issue Type: New Feature
Components: Compaction
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin

[jira] [Commented] (HBASE-7799) reassigning region stuck in open still may not work correctly due to leftover ZK node


[ 
https://issues.apache.org/jira/browse/HBASE-7799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576835#comment-13576835
 ] 

ramkrishna.s.vasudevan commented on HBASE-7799:
---

Found the problem.  It is a straightforward scenario.
A region is in transition and being moved to RS1.  RS1 gets killed midway, 
and the state in ZK is RS_ZK_REGION_OPENING.
Now SSH kicks in.  Jimmy's HBASE-7701 picks up this region so that SSH can 
start assigning it.

The RIT state is forcibly set to CLOSED, and then the GeneralBulkAssigner 
kicks in.
Here we try to create a znode in OFFLINE state, but if the node already 
exists we silently return.  
In AM.asyncSetOfflineInZooKeeper():
{code}
try {
  ZKAssign.asyncCreateNodeOffline(watcher, state.getRegion(),
    destination, cb, state);
} catch (KeeperException e) {
  if (e instanceof NodeExistsException) {
    LOG.warn("Node for " + state.getRegion() + " already exists");
  } else {
    server.abort("Unexpected ZK exception creating/setting node OFFLINE", e);
  }
  return false;
}
{code}

Now when the new RS2 tries to transition the znode, expecting it to be in 
M_ZK_REGION_OFFLINE state, the transition fails, which leads to an infinite 
loop.
I will come up with a patch later as it's late here.  
Please correct me if I am wrong here.
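
The failure mode described above can be sketched with a toy in-memory model.
This is a minimal illustration, assuming a znode's state is just a map entry;
StaleNodeSketch and its methods are hypothetical and are not the ZKAssign or
ZooKeeper API. forceOffline shows one possible remedy (overwriting the stale
node), not necessarily the fix that ends up being committed.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the unassigned znodes; all names here are hypothetical.
final class StaleNodeSketch {
  enum State { M_ZK_REGION_OFFLINE, RS_ZK_REGION_OPENING }

  private final Map<String, State> znodes = new HashMap<>();

  /** Simulates the node left behind by a region server that died mid-open. */
  void leaveStaleOpening(String region) {
    znodes.put(region, State.RS_ZK_REGION_OPENING);
  }

  /** Mimics "create OFFLINE, but return quietly if the node already exists". */
  boolean createOfflineIfAbsent(String region) {
    return znodes.putIfAbsent(region, State.M_ZK_REGION_OFFLINE) == null;
  }

  /** Alternative: overwrite whatever is there so the node really is OFFLINE. */
  void forceOffline(String region) {
    znodes.put(region, State.M_ZK_REGION_OFFLINE);
  }

  /** Region server side: OFFLINE -> OPENING succeeds only from the expected state. */
  boolean transitionToOpening(String region) {
    if (znodes.get(region) != State.M_ZK_REGION_OFFLINE) {
      return false; // node missing or in an unexpected state
    }
    znodes.put(region, State.RS_ZK_REGION_OPENING);
    return true;
  }

  public static void main(String[] args) {
    StaleNodeSketch zk = new StaleNodeSketch();
    zk.leaveStaleOpening("r1");
    zk.createOfflineIfAbsent("r1");                   // node exists, nothing changes
    System.out.println(zk.transitionToOpening("r1")); // false, and so on every retry
    zk.forceOffline("r1");
    System.out.println(zk.transitionToOpening("r1")); // true
  }
}
```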


 reassigning region stuck in open still may not work correctly due to leftover 
 ZK node
 -

 Key: HBASE-7799
 URL: https://issues.apache.org/jira/browse/HBASE-7799
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin
 Attachments: 
 org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted-output.txt.gz


 (logs grepped by region name, and abridged.)
 META server was dead so OpenRegionHandler for the region took a while, and 
 was interrupted:
 {code}
 2013-02-08 14:35:01,555 DEBUG 
 [RS_OPEN_REGION-10.11.2.92,64485,1360362800564-2] 
 handler.OpenRegionHandler(255): Interrupting thread 
 Thread[PostOpenDeployTasks:871d1c3bdf98a2c93b527cb6cc61327d,5,main]
 {code}
 Then master tried to force region offline and reassign:
 {code}
 2013-02-08 14:35:06,500 INFO  
 [MASTER_SERVER_OPERATIONS-10.11.2.92,64483,1360362800340-1] 
 master.RegionStates(347): Found opening region 
 {IntegrationTestRebalanceAndKillServersTargeted,732c,1360362805563.871d1c3bdf98a2c93b527cb6cc61327d.
  state=OPENING, ts=1360362901596, server=10.11.2.92,64485,1360362800564} to 
 be reassigned by SSH for 10.11.2.92,64485,1360362800564
 2013-02-08 14:35:06,500 INFO  
 [MASTER_SERVER_OPERATIONS-10.11.2.92,64483,1360362800340-1] 
 master.RegionStates(242): Region {NAME = 
 'IntegrationTestRebalanceAndKillServersTargeted,732c,1360362805563.871d1c3bdf98a2c93b527cb6cc61327d.',
  STARTKEY = '732c', ENDKEY = '7ff8', ENCODED = 
 871d1c3bdf98a2c93b527cb6cc61327d,} transitioned from 
 {IntegrationTestRebalanceAndKillServersTargeted,732c,1360362805563.871d1c3bdf98a2c93b527cb6cc61327d.
  state=OPENING, ts=1360362901596, server=10.11.2.92,64485,1360362800564} to 
 {IntegrationTestRebalanceAndKillServersTargeted,732c,1360362805563.871d1c3bdf98a2c93b527cb6cc61327d.
  state=CLOSED, ts=1360362906500, server=null}
 2013-02-08 14:35:06,505 DEBUG 
 [10.11.2.92,64483,1360362800340-GeneralBulkAssigner-1] 
 master.AssignmentManager(1530): Forcing OFFLINE; 
 was={IntegrationTestRebalanceAndKillServersTargeted,732c,1360362805563.871d1c3bdf98a2c93b527cb6cc61327d.
  state=CLOSED, ts=1360362906500, server=null}
 2013-02-08 14:35:06,506 DEBUG 
 [10.11.2.92,64483,1360362800340-GeneralBulkAssigner-1] 
 zookeeper.ZKAssign(176): master:64483-0x13cbbf1025d Async create of 
 unassigned node for 871d1c3bdf98a2c93b527cb6cc61327d with OFFLINE state
 {code}
 But didn't delete the original ZK node?
 {code}
 2013-02-08 14:35:06,509 WARN  [main-EventThread] master.OfflineCallback(59): 
 Node for /hbase/region-in-transition/871d1c3bdf98a2c93b527cb6cc61327d already 
 exists
 2013-02-08 14:35:06,509 DEBUG [main-EventThread] master.OfflineCallback(69): 
 rs={IntegrationTestRebalanceAndKillServersTargeted,732c,1360362805563.871d1c3bdf98a2c93b527cb6cc61327d.
  state=OFFLINE, ts=1360362906506, server=null}, 
 server=10.11.2.92,64488,1360362800651
 2013-02-08 14:35:06,512 DEBUG [main-EventThread] 
 master.OfflineCallback$ExistCallback(106): 
 rs={IntegrationTestRebalanceAndKillServersTargeted,732c,1360362805563.871d1c3bdf98a2c93b527cb6cc61327d.
  state=OFFLINE, ts=1360362906506, server=null}, 
 server=10.11.2.92,64488,1360362800651
 {code}
 So it went into infinite cycle of failing to assign due to this:
 {code}
 2013-02-08 14:35:06,517 INFO  [PRI IPC Server handler 7 on 64488] 
 regionserver.HRegionServer(3435): Received request to open region:

[jira] [Commented] (HBASE-7799) reassigning region stuck in open still may not work correctly due to leftover ZK node


[ 
https://issues.apache.org/jira/browse/HBASE-7799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576837#comment-13576837
 ] 

ramkrishna.s.vasudevan commented on HBASE-7799:
---

One interesting thing about AM is that every time we get a new type of issue :).

 reassigning region stuck in open still may not work correctly due to leftover 
 ZK node
 -

 Key: HBASE-7799
 URL: https://issues.apache.org/jira/browse/HBASE-7799
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin
 Attachments: 
 org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted-output.txt.gz


 (logs grepped by region name, and abridged.)
 META server was dead so OpenRegionHandler for the region took a while, and 
 was interrupted:
 {code}
 2013-02-08 14:35:01,555 DEBUG 
 [RS_OPEN_REGION-10.11.2.92,64485,1360362800564-2] 
 handler.OpenRegionHandler(255): Interrupting thread 
 Thread[PostOpenDeployTasks:871d1c3bdf98a2c93b527cb6cc61327d,5,main]
 {code}
 Then master tried to force region offline and reassign:
 {code}
 2013-02-08 14:35:06,500 INFO  
 [MASTER_SERVER_OPERATIONS-10.11.2.92,64483,1360362800340-1] 
 master.RegionStates(347): Found opening region 
 {IntegrationTestRebalanceAndKillServersTargeted,732c,1360362805563.871d1c3bdf98a2c93b527cb6cc61327d.
  state=OPENING, ts=1360362901596, server=10.11.2.92,64485,1360362800564} to 
 be reassigned by SSH for 10.11.2.92,64485,1360362800564
 2013-02-08 14:35:06,500 INFO  
 [MASTER_SERVER_OPERATIONS-10.11.2.92,64483,1360362800340-1] 
 master.RegionStates(242): Region {NAME = 
 'IntegrationTestRebalanceAndKillServersTargeted,732c,1360362805563.871d1c3bdf98a2c93b527cb6cc61327d.',
  STARTKEY = '732c', ENDKEY = '7ff8', ENCODED = 
 871d1c3bdf98a2c93b527cb6cc61327d,} transitioned from 
 {IntegrationTestRebalanceAndKillServersTargeted,732c,1360362805563.871d1c3bdf98a2c93b527cb6cc61327d.
  state=OPENING, ts=1360362901596, server=10.11.2.92,64485,1360362800564} to 
 {IntegrationTestRebalanceAndKillServersTargeted,732c,1360362805563.871d1c3bdf98a2c93b527cb6cc61327d.
  state=CLOSED, ts=1360362906500, server=null}
 2013-02-08 14:35:06,505 DEBUG 
 [10.11.2.92,64483,1360362800340-GeneralBulkAssigner-1] 
 master.AssignmentManager(1530): Forcing OFFLINE; 
 was={IntegrationTestRebalanceAndKillServersTargeted,732c,1360362805563.871d1c3bdf98a2c93b527cb6cc61327d.
  state=CLOSED, ts=1360362906500, server=null}
 2013-02-08 14:35:06,506 DEBUG 
 [10.11.2.92,64483,1360362800340-GeneralBulkAssigner-1] 
 zookeeper.ZKAssign(176): master:64483-0x13cbbf1025d Async create of 
 unassigned node for 871d1c3bdf98a2c93b527cb6cc61327d with OFFLINE state
 {code}
 But didn't delete the original ZK node?
 {code}
 2013-02-08 14:35:06,509 WARN  [main-EventThread] master.OfflineCallback(59): 
 Node for /hbase/region-in-transition/871d1c3bdf98a2c93b527cb6cc61327d already 
 exists
 2013-02-08 14:35:06,509 DEBUG [main-EventThread] master.OfflineCallback(69): 
 rs={IntegrationTestRebalanceAndKillServersTargeted,732c,1360362805563.871d1c3bdf98a2c93b527cb6cc61327d.
  state=OFFLINE, ts=1360362906506, server=null}, 
 server=10.11.2.92,64488,1360362800651
 2013-02-08 14:35:06,512 DEBUG [main-EventThread] 
 master.OfflineCallback$ExistCallback(106): 
 rs={IntegrationTestRebalanceAndKillServersTargeted,732c,1360362805563.871d1c3bdf98a2c93b527cb6cc61327d.
  state=OFFLINE, ts=1360362906506, server=null}, 
 server=10.11.2.92,64488,1360362800651
 {code}
 So it went into infinite cycle of failing to assign due to this:
 {code}
 2013-02-08 14:35:06,517 INFO  [PRI IPC Server handler 7 on 64488] 
 regionserver.HRegionServer(3435): Received request to open region: 
 IntegrationTestRebalanceAndKillServersTargeted,732c,1360362805563.871d1c3bdf98a2c93b527cb6cc61327d.
  on 10.11.2.92,64488,1360362800651
 2013-02-08 14:35:06,521 WARN  
 [RS_OPEN_REGION-10.11.2.92,64488,1360362800651-0] zookeeper.ZKAssign(762): 
 regionserver:64488-0x13cbbf1025d0004 Attempt to transition the unassigned 
 node for 871d1c3bdf98a2c93b527cb6cc61327d from M_ZK_REGION_OFFLINE to 
 RS_ZK_REGION_OPENING failed, the node existed but was in the state 
 RS_ZK_REGION_OPENING set by the server [wrong server name redacted, see 
 HBASE-7798]
 {code}
 Transitioning failed-to-open similarly fails.
 It seems like master needs to nuke ZK node unconditionally to offline?


[jira] [Commented] (HBASE-7520) org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted fails when I cd hbase-it and mvn verify


[ 
https://issues.apache.org/jira/browse/HBASE-7520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576843#comment-13576843
 ] 

Sergey Shelukhin commented on HBASE-7520:
-

Yeah, the improvements are not in yet; I found other issues while testing it :)
I suppose it would still make sense to rerun the test with Jimmy's fix for AM; 
I will try to squeeze it in by EOW.

 org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted fails 
 when I cd hbase-it and mvn verify
 --

 Key: HBASE-7520
 URL: https://issues.apache.org/jira/browse/HBASE-7520
 Project: HBase
  Issue Type: Bug
  Components: test
 Environment: macosx trunk
Reporter: stack
Assignee: Sergey Shelukhin
Priority: Critical


[jira] [Commented] (HBASE-7814) Port HBASE-6963 'unable to run hbck on a secure cluster' to 0.94


[ 
https://issues.apache.org/jira/browse/HBASE-7814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576844#comment-13576844
 ] 

Jonathan Hsieh commented on HBASE-7814:
---

So this means it won't compile unless we compile against hadoop 1.0+/0.22+, 
but the binary should still work against the older hadoop, right?  I think 
I'm OK with that, but please add a comment and a release note about this.

{code}
+// See HBASE-7814. UserGroupInformation from hadoop 0.20.x may not support getShortUserName().
+String username;
+try {
+  UserGroupInformation.class.getMethod("getShortUserName", new Class<?>[]{});
+  username = ugi.getShortUserName();
+} catch (Exception e) {
{code}

Can we be more specific about the Exception -- maybe figure out what the 
specific reflection exception is and use it?  
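
For a lookup like this, the only checked exception that Class.getMethod
declares is NoSuchMethodException, so that is the natural one to catch. A
minimal sketch, assuming two stand-in classes (OldUgi and NewUgi are
hypothetical and are not Hadoop classes):

```java
import java.lang.reflect.Method;

// Hypothetical stand-ins for an old and a new UserGroupInformation.
final class OldUgi {
  public String getUserName() { return "alice@EXAMPLE.COM"; }
}

final class NewUgi {
  public String getUserName() { return "alice@EXAMPLE.COM"; }
  public String getShortUserName() { return "alice"; }
}

final class MethodProbe {
  /** Returns true if clazz has a public no-arg method with the given name. */
  static boolean hasMethod(Class<?> clazz, String name) {
    try {
      Method m = clazz.getMethod(name);
      return m != null;
    } catch (NoSuchMethodException e) {
      // The method is absent; any other failure is not swallowed here.
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(hasMethod(OldUgi.class, "getShortUserName")); // false
    System.out.println(hasMethod(NewUgi.class, "getShortUserName")); // true
  }
}
```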


 Port HBASE-6963 'unable to run hbck on a secure cluster' to 0.94
 

 Key: HBASE-7814
 URL: https://issues.apache.org/jira/browse/HBASE-7814
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.94.6

 Attachments: 7814-v2.txt, HBASE-7814-v0.patch





[jira] [Commented] (HBASE-7667) Support stripe compaction

[
https://issues.apache.org/jira/browse/HBASE-7667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576845#comment-13576845
]

stack commented on HBASE-7667:
--

[~ted_yu] You raise a 'concern' that is nebulous to shoot down a suggested
implementation and then, when asked why, you do not answer the question asked.

[~sershe] Would it be easier looking at memstore than at content of L0? Would
have to do appropriate weighting... in that there will be hot spots in the
keyspace and these should get more stripes. Rejiggering the stripes joining up
the infrequently used and allotting more stripes to the hot keyspace area
sounds right. Good stuff.

Maybe it's worth writing up a doc at this stage. This issue is full of
goodness. A doc could distill it and make it easier to digest and easier to
think about.

Support stripe compaction
-

Key: HBASE-7667
URL: https://issues.apache.org/jira/browse/HBASE-7667
Project: HBase
Issue Type: New Feature
Components: Compaction
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin

[jira] [Commented] (HBASE-7799) reassigning region stuck in open still may not work correctly due to leftover ZK node


[ 
https://issues.apache.org/jira/browse/HBASE-7799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576847#comment-13576847
 ] 

Sergey Shelukhin commented on HBASE-7799:
-

Yeah, I was starting to come to a similar conclusion yesterday, although I 
didn't finish looking (that forcing offline is forcing only internal state, 
not ZK).
Wrt issues - there was HBASE-5something-or-other JIRA about making assignment 
simpler by getting rid/replacing/etc. things :)


 reassigning region stuck in open still may not work correctly due to leftover 
 ZK node
 -

 Key: HBASE-7799
 URL: https://issues.apache.org/jira/browse/HBASE-7799
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin
 Attachments: 
 org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted-output.txt.gz



[jira] [Updated] (HBASE-7830) Disable IntegrationTestRebalanceAndKillServersTargeted temporarily


 [ 
https://issues.apache.org/jira/browse/HBASE-7830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-7830:
-

Attachment: 7830.txt

Patch to disable the test.  Tried it locally and now hbase-it verify works.

 Disable IntegrationTestRebalanceAndKillServersTargeted temporarily
 --

 Key: HBASE-7830
 URL: https://issues.apache.org/jira/browse/HBASE-7830
 Project: HBase
  Issue Type: Bug
Reporter: stack
 Attachments: 7830.txt


 Disabling IntegrationTestRebalanceAndKillServersTargeted until hbase-7520 is 
 fixed so can move forward with getting hbase-it running on bigtop infra.


[jira] [Resolved] (HBASE-7830) Disable IntegrationTestRebalanceAndKillServersTargeted temporarily


 [ 
https://issues.apache.org/jira/browse/HBASE-7830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-7830.
--

   Resolution: Fixed
Fix Version/s: 0.96.0
 Assignee: stack
 Release Note: Disabled test

 Disable IntegrationTestRebalanceAndKillServersTargeted temporarily
 --

 Key: HBASE-7830
 URL: https://issues.apache.org/jira/browse/HBASE-7830
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.96.0

 Attachments: 7830.txt


 Disabling IntegrationTestRebalanceAndKillServersTargeted until hbase-7520 is 
 fixed so can move forward with getting hbase-it running on bigtop infra.


[jira] [Commented] (HBASE-7814) Port HBASE-6963 'unable to run hbck on a secure cluster' to 0.94


[ 
https://issues.apache.org/jira/browse/HBASE-7814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576850#comment-13576850
 ] 

Ted Yu commented on HBASE-7814:
---

I followed the same style in existing code:
{code}
  private static boolean isInSafeMode(DistributedFileSystem dfs) throws IOException {
    boolean inSafeMode = false;
    try {
      Method m = DistributedFileSystem.class.getMethod("setSafeMode", new Class<?>[]{
          org.apache.hadoop.hdfs.protocol.FSConstants.SafeModeAction.class, boolean.class});
      inSafeMode = (Boolean) m.invoke(dfs,
          org.apache.hadoop.hdfs.protocol.FSConstants.SafeModeAction.SAFEMODE_GET, true);
    } catch (Exception e) {
{code}


 Port HBASE-6963 'unable to run hbck on a secure cluster' to 0.94
 

 Key: HBASE-7814
 URL: https://issues.apache.org/jira/browse/HBASE-7814
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.94.6

 Attachments: 7814-v2.txt, HBASE-7814-v0.patch





[jira] [Commented] (HBASE-7799) reassigning region stuck in open still may not work correctly due to leftover ZK node


[ 
https://issues.apache.org/jira/browse/HBASE-7799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576851#comment-13576851
 ] 

ramkrishna.s.vasudevan commented on HBASE-7799:
---

One correction here.  I think the code in AM.asyncSetOfflineInZooKeeper() is 
fine.  But OfflineCallback does not throw NodeExistsException in 
processResult; it only logs a warning.
{code}
  @Override
  public void processResult(int rc, String path, Object ctx, String name) {
    if (rc == KeeperException.Code.NODEEXISTS.intValue()) {
      LOG.warn("Node for " + path + " already exists");
    }
{code}
I see the above message in the log file, and not the 
LOG.warn("Node for " + state.getRegion() + " already exists"); from AM.
{code}
master.OfflineCallback(59): Node for 
/hbase/region-in-transition/871d1c3bdf98a2c93b527cb6cc61327d already exists
{code}
So OfflineCallback is eating up the NodeExistsException.  I may be wrong 
here because I am new to this callback code introduced in trunk.
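
With an async-style callback the result arrives as a return code, so there is
no exception for the caller to catch; the callback has to pass the outcome on
itself. A minimal sketch, assuming a listener that the assigner could act on.
The Rc and Outcome enums and the class name are hypothetical and are not
ZooKeeper or HBase types.

```java
import java.util.function.Consumer;

// Hypothetical sketch of a create-node callback that reports its outcome.
final class OfflineCallbackSketch {
  enum Rc { OK, NODEEXISTS, OTHER_ERROR }
  enum Outcome { CREATED, ALREADY_EXISTS, FAILED }

  private final Consumer<Outcome> listener;

  OfflineCallbackSketch(Consumer<Outcome> listener) {
    this.listener = listener;
  }

  /** Completion handler: maps the result code to an outcome and hands it on. */
  void processResult(Rc rc, String path) {
    switch (rc) {
      case OK:
        listener.accept(Outcome.CREATED);
        break;
      case NODEEXISTS:
        // Logging alone would hide this from the assigner, so report it too.
        System.err.println("Node for " + path + " already exists");
        listener.accept(Outcome.ALREADY_EXISTS);
        break;
      default:
        listener.accept(Outcome.FAILED);
        break;
    }
  }
}
```

A caller could then decide, on ALREADY_EXISTS, whether to inspect the existing
node or overwrite it rather than proceeding as if the create had succeeded.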

 reassigning region stuck in open still may not work correctly due to leftover 
 ZK node
 -

 Key: HBASE-7799
 URL: https://issues.apache.org/jira/browse/HBASE-7799
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin
 Attachments: 
 org.apache.hadoop.hbase.IntegrationTestRebalanceAndKillServersTargeted-output.txt.gz



[jira] [Commented] (HBASE-7814) Port HBASE-6963 'unable to run hbck on a secure cluster' to 0.94


[ 
https://issues.apache.org/jira/browse/HBASE-7814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576855#comment-13576855
 ] 

Sergey Shelukhin commented on HBASE-7814:
-

Why not invoke the method via reflection instead, to avoid potential 
compilation issues? It doesn't appear that perf would matter here.
Also, what if getShortUserName throws some logical exception? +1 to a specific 
exception, or to moving the call out of this try-catch.
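
Both points can be sketched together: look the method up and invoke it
reflectively so nothing depends on it at compile time, fall back only when the
method is missing, and let an exception thrown by the method itself propagate.
This is a minimal sketch under those assumptions; UgiLike and
ReflectiveShortName are hypothetical names, not Hadoop or HBase classes.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical stand-in for a UserGroupInformation-like class.
final class UgiLike {
  private final String fullName;

  UgiLike(String fullName) { this.fullName = fullName; }

  public String getUserName() { return fullName; }

  public String getShortUserName() {
    int at = fullName.indexOf('@');
    if (at == 0) {
      throw new IllegalStateException("empty short name");
    }
    return at < 0 ? fullName : fullName.substring(0, at);
  }
}

final class ReflectiveShortName {
  /** Calls the preferred no-arg method, or the fallback if the preferred one is absent. */
  static String call(Object target, String preferred, String fallback) {
    try {
      Method m;
      try {
        m = target.getClass().getMethod(preferred);
      } catch (NoSuchMethodException e) {
        m = target.getClass().getMethod(fallback); // only a missing method triggers this
      }
      return (String) m.invoke(target);
    } catch (InvocationTargetException e) {
      // The method exists but failed; surface its own exception instead of hiding it.
      Throwable cause = e.getCause();
      if (cause instanceof RuntimeException) {
        throw (RuntimeException) cause;
      }
      throw new IllegalStateException(cause);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("reflection failed", e);
    }
  }

  public static void main(String[] args) {
    UgiLike ugi = new UgiLike("alice@EXAMPLE.COM");
    System.out.println(call(ugi, "getShortUserName", "getUserName")); // alice
    System.out.println(call(ugi, "noSuchMethod", "getUserName"));     // alice@EXAMPLE.COM
  }
}
```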

 Port HBASE-6963 'unable to run hbck on a secure cluster' to 0.94
 

 Key: HBASE-7814
 URL: https://issues.apache.org/jira/browse/HBASE-7814
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.94.6

 Attachments: 7814-v2.txt, HBASE-7814-v0.patch





[jira] [Commented] (HBASE-7521) fix HBASE-6060 (regions stuck in opening state) in 0.94


[ 
https://issues.apache.org/jira/browse/HBASE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576857#comment-13576857
 ] 

Sergey Shelukhin commented on HBASE-7521:
-

Why should there be problem with rolling restart in particular, as opposed to 
regular restart?
I tested with regular restart and it appeared to be normal, although 
integration tests on 94 are not very stable to begin with.

Alternatively, we can consider improving on the simpler patch I submitted (it 
fixes only specific issue in this JIRA, not the other one; but it's 
straightforward and as far as I was told there's only one (existing) race to 
fix to make it work, between retry and immediate reassignment).


 fix HBASE-6060 (regions stuck in opening state) in 0.94
 ---

 Key: HBASE-7521
 URL: https://issues.apache.org/jira/browse/HBASE-7521
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.94.6

 Attachments: 7521-original-patch-ported-v4.patch, 
 HBASE-7521-original-patch-ported-v0.patch, 
 HBASE-7521-original-patch-ported-v1.patch, 
 HBASE-7521-original-patch-ported-v2.patch, 
 HBASE-7521-original-patch-ported-v3.patch, HBASE-7521-v0.patch, 
 HBASE-7521-v1.patch, HBASE-7521-v5.patch


 Discussion in HBASE-6060 implies that the fix there does not work on 0.94. 
 Still, we may want to fix the issue in 0.94 (via some different fix) because 
 the regions stuck in opening for ridiculous amounts of time is not a good 
 thing to have.


[jira] [Created] (HBASE-7831) lightweight way to make RS commit suicide

Sergey Shelukhin created HBASE-7831:
---

 Summary: lightweight way to make RS commit suicide
 Key: HBASE-7831
 URL: https://issues.apache.org/jira/browse/HBASE-7831
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Priority: Minor


There are times in the life of many a region server when it realizes that it 
cannot go on any longer in the hostile, inhospitable environment. Or it might 
realize that there's something inherently wrong with it on the inside (probably 
because a developer screwed up). The only way out is killing itself.
There's code in some places currently that calls Runtime.halt() in such cases, 
and there may be code that calls an HRegionServer method to do this.
I think we need some easy-to-access (lightweight interface, or static) way to 
trigger reliable (no catching exceptions on the upper levels, etc.) RS death in 
such cases.
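
One possible shape for such a helper, as a minimal sketch: a static method
that logs and then calls Runtime.halt(), which terminates the JVM without
running shutdown hooks. The ProcessKiller name, its exit status, and the test
hook are assumptions for illustration and are not existing HBase code.

```java
import java.util.function.IntConsumer;

// Hypothetical helper; illustrative only.
final class ProcessKiller {
  // Runtime.halt skips shutdown hooks, so nothing registered upstream can delay it.
  private static volatile IntConsumer halter = code -> Runtime.getRuntime().halt(code);

  private ProcessKiller() {}

  /** Logs the reason to stderr, then terminates the JVM with exit status 1. */
  static void die(String reason, Throwable cause) {
    try {
      System.err.println("FATAL, halting: " + reason);
      if (cause != null) {
        cause.printStackTrace();
      }
    } finally {
      halter.accept(1); // runs even if the logging above throws
    }
  }

  /** Test hook: replace the halt action so die() can be exercised without exiting. */
  static void setHalterForTesting(IntConsumer replacement) {
    halter = replacement;
  }
}
```

A caller would use ProcessKiller.die("cannot continue", e) from any thread;
because the halt sits in a finally block, an exception thrown while logging
does not prevent it.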


[jira] [Updated] (HBASE-7814) Port HBASE-6963 'unable to run hbck on a secure cluster' to 0.94


 [ 
https://issues.apache.org/jira/browse/HBASE-7814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-7814:
--

Attachment: 7814-v3.txt

 Port HBASE-6963 'unable to run hbck on a secure cluster' to 0.94
 

 Key: HBASE-7814
 URL: https://issues.apache.org/jira/browse/HBASE-7814
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.94.6

 Attachments: 7814-v2.txt, 7814-v3.txt, HBASE-7814-v0.patch





[jira] [Commented] (HBASE-7814) Port HBASE-6963 'unable to run hbck on a secure cluster' to 0.94


[ 
https://issues.apache.org/jira/browse/HBASE-7814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576864#comment-13576864
 ] 

Ted Yu commented on HBASE-7814:
---

Patch v3 invokes the method through a Method object obtained via reflection.

 Port HBASE-6963 'unable to run hbck on a secure cluster' to 0.94
 

 Key: HBASE-7814
 URL: https://issues.apache.org/jira/browse/HBASE-7814
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.94.6

 Attachments: 7814-v2.txt, 7814-v3.txt, HBASE-7814-v0.patch





[jira] [Commented] (HBASE-7831) lightweight way to make RS commit suicide

2013-02-12 Thread Jean-Daniel Cryans (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-7831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576866#comment-13576866
 ] 

Jean-Daniel Cryans commented on HBASE-7831:
---

In HRS we have the concept of aborting on shutting down (abortRequested). It 
probably needs improvement.

 lightweight way to make RS commit suicide
 -

 Key: HBASE-7831
 URL: https://issues.apache.org/jira/browse/HBASE-7831
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Priority: Minor

 There are times in the life of many a region server when it realizes that it 
 cannot go on any longer in the hostile, inhospitable environment. Or it might 
 realize that there's something inherently wrong with it on the inside 
 (probably because a developer screwed up). The only way out is killing itself.
 There's code in some places currently that does Runtime halt in such cases, 
 and there may be code that calls HRegionServer method to do this.
 I think we need some easy-to-access (lightweight interface, or static) way to 
 trigger reliable (no catching exceptions on the upper levels, etc.) RS death 
 in such cases.


[jira] [Commented] (HBASE-7822) clean up compactionrequest and compactselection - part 1

[
https://issues.apache.org/jira/browse/HBASE-7822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576868#comment-13576868
]

Hadoop QA commented on HBASE-7822:
--

{color:green}+1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12569022/HBASE-7822-v0.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author
tags.

{color:green}+1 tests included{color}. The patch appears to include 12 new
or modified tests.

{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop
2.0 profile.

{color:green}+1 javadoc{color}. The javadoc tool did not generate any
warning messages.

{color:green}+1 javac{color}. The applied patch does not increase the
total number of javac compiler warnings.

{color:green}+1 findbugs{color}. The patch does not introduce any new
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase
the total number of release audit warnings.

{color:green}+1 lineLengths{color}. The patch does not introduce lines
longer than 100

{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/4422//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4422//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4422//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4422//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4422//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4422//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4422//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4422//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/4422//console

This message is automatically generated.

clean up compactionrequest and compactselection - part 1

Key: HBASE-7822
URL: https://issues.apache.org/jira/browse/HBASE-7822
Project: HBase
Issue Type: Improvement
Components: Compaction
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Attachments: HBASE-7822-v0.patch, HBASE-7822-v0.patch

Certain parts of CompactionRequest are unnecessary.
Off-peak hour management is part way in selection, part way in Store and part
way in policy.
Needs to be cleaned up in preparation of having CompactionPolicy return
CompactionRequest.

[jira] [Commented] (HBASE-7814) Port HBASE-6963 'unable to run hbck on a secure cluster' to 0.94


[ 
https://issues.apache.org/jira/browse/HBASE-7814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576871#comment-13576871
 ] 

Jonathan Hsieh commented on HBASE-7814:
---

I agree with Sergey.  The v3 invoke is better.  Let's get the specific 
exception as well.

 Port HBASE-6963 'unable to run hbck on a secure cluster' to 0.94
 

 Key: HBASE-7814
 URL: https://issues.apache.org/jira/browse/HBASE-7814
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.94.6

 Attachments: 7814-v2.txt, 7814-v3.txt, HBASE-7814-v0.patch





[jira] [Commented] (HBASE-3171) Drop ROOT and instead store META location(s) directly in ZooKeeper

2013-02-12 Thread Jean-Daniel Cryans (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576878#comment-13576878
]

Jean-Daniel Cryans commented on HBASE-3171:
---

[~nkeywal] I'd rather have it in a subsequent Jira.

Drop ROOT and instead store META location(s) directly in ZooKeeper
--

Key: HBASE-3171
URL: https://issues.apache.org/jira/browse/HBASE-3171
Project: HBase
Issue Type: Improvement
Components: Client, master, regionserver, Zookeeper
Reporter: Jonathan Gray
Attachments: HBASE-3171.patch, HBASE-3171-v2.patch,
HBASE-3171-v3.patch, HBASE-3171-v4.patch

Rather than storing the ROOT region location in ZooKeeper, going to ROOT, and
reading the META location, we should just store the META location directly in
ZooKeeper.
The purpose of the root region from the bigtable paper was to support
multiple meta regions. Currently, we explicitly only support a single meta
region, so the translation from our current code of a single root location to
a single meta location will be very simple. Long-term, it seems reasonable
that we could store several meta region locations in ZK. There's been some
discussion in HBASE-1755 about actually moving META into ZK, but I think this
jira is a good step towards taking some of the complexity out of how we have
to deal with catalog tables everywhere.
As-is, a new client already requires ZK to get the root location, so this
would not change those requirements in any way.
The primary motivation for this is to simplify things like CatalogTracker.
The way we can handle root in that class is really simple but the tracking of
meta is difficult and a bit hacky. This hack on tracking of the meta
location is what caused one of the bugs over in HBASE-3159.

[jira] [Updated] (HBASE-7814) Port HBASE-6963 'unable to run hbck on a secure cluster' to 0.94


 [ 
https://issues.apache.org/jira/browse/HBASE-7814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-7814:
--

Attachment: 7814-v4.txt

Patch v4 deals with individual exceptions.
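
As a rough illustration of what handling each reflection exception individually can look like, here is a minimal Java sketch. It is not the code from the attached patch: the class name ReflectiveCall, the method invokeNoArg, and the fallback behaviour are all assumptions made for the example. It looks up a public no-argument method by name, which is the usual trick when the method may be missing from an older library version, and treats each failure mode separately.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

/** Hypothetical helper for illustration only; not taken from the HBASE-7814 patch. */
final class ReflectiveCall {
  private ReflectiveCall() {}

  /**
   * Invokes a public no-arg method by name on the target object.
   * Returns the fallback when the method does not exist (e.g. older library version).
   */
  static Object invokeNoArg(Object target, String methodName, Object fallback) {
    try {
      Method m = target.getClass().getMethod(methodName);
      return m.invoke(target);
    } catch (NoSuchMethodException e) {
      // Method is absent in this version of the class: degrade gracefully.
      return fallback;
    } catch (IllegalAccessException e) {
      // Method exists but cannot be called from here: treat as a setup error.
      throw new IllegalStateException("Cannot access " + methodName, e);
    } catch (InvocationTargetException e) {
      // The method itself threw: surface the real cause, not the wrapper.
      Throwable cause = e.getCause();
      if (cause instanceof RuntimeException) {
        throw (RuntimeException) cause;
      }
      throw new IllegalStateException(methodName + " failed", cause);
    }
  }
}
```

For example, ReflectiveCall.invokeNoArg("hbck", "length", -1) goes through the reflective path, while asking for a method that does not exist returns the fallback instead of failing. Catching the three checked exceptions separately keeps "method missing" distinguishable from "method threw".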

 Port HBASE-6963 'unable to run hbck on a secure cluster' to 0.94
 

 Key: HBASE-7814
 URL: https://issues.apache.org/jira/browse/HBASE-7814
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.94.6

 Attachments: 7814-v2.txt, 7814-v3.txt, 7814-v4.txt, 
 HBASE-7814-v0.patch





[jira] [Commented] (HBASE-7667) Support stripe compaction

[
https://issues.apache.org/jira/browse/HBASE-7667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576881#comment-13576881
]

Sergey Shelukhin commented on HBASE-7667:
-

Discussed with Ted, what was meant is configuring it like region splitting,
which is not planned at this point.
Doc - probably needed... I will get to it eventually.

Support stripe compaction
-

Key: HBASE-7667
URL: https://issues.apache.org/jira/browse/HBASE-7667
Project: HBase
Issue Type: New Feature
Components: Compaction
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin

[jira] [Commented] (HBASE-7831) lightweight way to make RS commit suicide

[
https://issues.apache.org/jira/browse/HBASE-7831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576883#comment-13576883
]

Jonathan Hsieh commented on HBASE-7831:
---

Can you talk about how this would be used by a user or dev? (or is this in the
software itself -- which the Abortable stuff should take care of).

Is the idea to remotely euthanize (via shell possibly) an unhappy region
server? Something similar may be nice for masters as well (HA HDFS has a way
to force a master transition -- this could be similar for our Masters or
RS's).

lightweight way to make RS commit suicide
-

Key: HBASE-7831
URL: https://issues.apache.org/jira/browse/HBASE-7831
Project: HBase
Issue Type: Improvement
Reporter: Sergey Shelukhin
Priority: Minor

There are times in the life of many a region server when it realizes that it
cannot go on any longer in the hostile, inhospitable environment. Or it might
realize that there's something inherently wrong with it on the inside
(probably because a developer screwed up). The only way out is killing itself.
There's code in some places currently that does Runtime halt in such cases,
and there may be code that calls HRegionServer method to do this.
I think we need some easy-to-access (lightweight interface, or static) way to
trigger reliable (no catching exceptions on the upper levels, etc.) RS death
in such cases.

[jira] [Commented] (HBASE-7814) Port HBASE-6963 'unable to run hbck on a secure cluster' to 0.94


[ 
https://issues.apache.org/jira/browse/HBASE-7814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576887#comment-13576887
 ] 

Jonathan Hsieh commented on HBASE-7814:
---

lgtm. ideally this is tested before commit (tweak the pom to use an older 
hadoop jar?).

 Port HBASE-6963 'unable to run hbck on a secure cluster' to 0.94
 

 Key: HBASE-7814
 URL: https://issues.apache.org/jira/browse/HBASE-7814
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.94.6

 Attachments: 7814-v2.txt, 7814-v3.txt, 7814-v4.txt, 
 HBASE-7814-v0.patch





[jira] [Commented] (HBASE-7521) fix HBASE-6060 (regions stuck in opening state) in 0.94

2013-02-12 Thread Jimmy Xiang (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576888#comment-13576888
]

Jimmy Xiang commented on HBASE-7521:

This patch removed a method from RegionServerServices. If some coprocessor
happens to use it, it will be broken. It also added two new methods to this
interface. Other than that, I don't see any other rolling restart/compatibility
issues. However, the chance of a coprocessor using these methods is very low.

fix HBASE-6060 (regions stuck in opening state) in 0.94
---

Key: HBASE-7521
URL: https://issues.apache.org/jira/browse/HBASE-7521
Project: HBase
Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Fix For: 0.94.6

[jira] [Commented] (HBASE-7521) fix HBASE-6060 (regions stuck in opening state) in 0.94

2013-02-12 Thread Enis Soztutar (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576890#comment-13576890
]

Enis Soztutar commented on HBASE-7521:
--

bq. but it only passes most of the time
Sorry I should have been more clear. The integration test actually does not
pass all the time, because of other problems, not directly related to region
assignment. We are running out of retries and giving up on writes. I should
inspect more and create issues for that. Without the patch, the integration
test does not even complete half the time because of the RITs in OPENING state.

fix HBASE-6060 (regions stuck in opening state) in 0.94
---

Key: HBASE-7521
URL: https://issues.apache.org/jira/browse/HBASE-7521
Project: HBase
Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Fix For: 0.94.6

[jira] [Commented] (HBASE-7521) fix HBASE-6060 (regions stuck in opening state) in 0.94


[ 
https://issues.apache.org/jira/browse/HBASE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576896#comment-13576896
 ] 

Ted Yu commented on HBASE-7521:
---

bq. This patch removed a method from RegionServerServices.
I noticed this too.
{code}
-  public Map<byte[], Boolean> getRegionsInTransitionInRS();
+  public boolean removeFromRegionsInTransition(HRegionInfo hri);
{code}
getRegionsInTransitionInRS() is supposed to be internal (from trunk):
{code}
@InterfaceAudience.Private
public interface RegionServerServices extends OnlineRegions {
{code}

 fix HBASE-6060 (regions stuck in opening state) in 0.94
 ---

 Key: HBASE-7521
 URL: https://issues.apache.org/jira/browse/HBASE-7521
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.94.6

 Attachments: 7521-original-patch-ported-v4.patch, 
 HBASE-7521-original-patch-ported-v0.patch, 
 HBASE-7521-original-patch-ported-v1.patch, 
 HBASE-7521-original-patch-ported-v2.patch, 
 HBASE-7521-original-patch-ported-v3.patch, HBASE-7521-v0.patch, 
 HBASE-7521-v1.patch, HBASE-7521-v5.patch


 Discussion in HBASE-6060 implies that the fix there does not work on 0.94. 
 Still, we may want to fix the issue in 0.94 (via some different fix) because 
 the regions stuck in opening for ridiculous amounts of time is not a good 
 thing to have.


[jira] [Commented] (HBASE-7521) fix HBASE-6060 (regions stuck in opening state) in 0.94


[ 
https://issues.apache.org/jira/browse/HBASE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576902#comment-13576902
 ] 

Lars Hofhansl commented on HBASE-7521:
--

I specifically meant a rolling upgrade where some servers might still run 
0.94.x (or even 0.92.x) and some will run 0.94.6.
The coprocessor bit is fine I think (we didn't say we'll keep the internal APIs 
stable).

Again, I'm not the decider. If you think this is an important fix and there is 
reasonable confidence that this won't introduce any bad side-effects then 
please integrate.


 fix HBASE-6060 (regions stuck in opening state) in 0.94
 ---

 Key: HBASE-7521
 URL: https://issues.apache.org/jira/browse/HBASE-7521
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.94.6

 Attachments: 7521-original-patch-ported-v4.patch, 
 HBASE-7521-original-patch-ported-v0.patch, 
 HBASE-7521-original-patch-ported-v1.patch, 
 HBASE-7521-original-patch-ported-v2.patch, 
 HBASE-7521-original-patch-ported-v3.patch, HBASE-7521-v0.patch, 
 HBASE-7521-v1.patch, HBASE-7521-v5.patch


 Discussion in HBASE-6060 implies that the fix there does not work on 0.94. 
 Still, we may want to fix the issue in 0.94 (via some different fix) because 
 the regions stuck in opening for ridiculous amounts of time is not a good 
 thing to have.


[jira] [Resolved] (HBASE-7779) [snapshot 130201 merge] Fix TestMultiParallel


 [ 
https://issues.apache.org/jira/browse/HBASE-7779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh resolved HBASE-7779.
---

   Resolution: Fixed
Fix Version/s: hbase-7290
 Assignee: Ted Yu
 Hadoop Flags: Reviewed

 [snapshot 130201 merge] Fix TestMultiParallel
 -

 Key: HBASE-7779
 URL: https://issues.apache.org/jira/browse/HBASE-7779
 Project: HBase
  Issue Type: Sub-task
Reporter: Jonathan Hsieh
Assignee: Ted Yu
 Fix For: hbase-7290

 Attachments: 7779-v1.txt, 7779-v2.txt


 TestMultiParallel has three tests that always fail on the merged branch: 
 #testFlushCommitsWithAbort, #testActiveThreadsCount and 
 #testflushCommitsNoAbort.  There were some changes introduced in HBASE-7299 
 which happened on trunk before the merge and are likely related.  (that patch 
 addresses problems on the same tests).


[jira] [Commented] (HBASE-7814) Port HBASE-6963 'unable to run hbck on a secure cluster' to 0.94


[ 
https://issues.apache.org/jira/browse/HBASE-7814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576910#comment-13576910
 ] 

Ted Yu commented on HBASE-7814:
---

I looked through the 0.92 pom.xml and found no profile for hadoop 0.20.x.
I ran the following command with the patch applied and it passed:

mvn clean test -PlocalTests,security -Dtest=TestHBaseFsck

[~lhofhansl]:
Do you want to take a look at patch v4 ?

 Port HBASE-6963 'unable to run hbck on a secure cluster' to 0.94
 

 Key: HBASE-7814
 URL: https://issues.apache.org/jira/browse/HBASE-7814
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.94.6

 Attachments: 7814-v2.txt, 7814-v3.txt, 7814-v4.txt, 
 HBASE-7814-v0.patch





[jira] [Commented] (HBASE-7521) fix HBASE-6060 (regions stuck in opening state) in 0.94


[ 
https://issues.apache.org/jira/browse/HBASE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576914#comment-13576914
 ] 

Ted Yu commented on HBASE-7521:
---

I would vote for this patch to go in.

HBASE-6060 was opened about 9 months ago. We should address this issue.

 fix HBASE-6060 (regions stuck in opening state) in 0.94
 ---

 Key: HBASE-7521
 URL: https://issues.apache.org/jira/browse/HBASE-7521
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.94.6

 Attachments: 7521-original-patch-ported-v4.patch, 
 HBASE-7521-original-patch-ported-v0.patch, 
 HBASE-7521-original-patch-ported-v1.patch, 
 HBASE-7521-original-patch-ported-v2.patch, 
 HBASE-7521-original-patch-ported-v3.patch, HBASE-7521-v0.patch, 
 HBASE-7521-v1.patch, HBASE-7521-v5.patch


 Discussion in HBASE-6060 implies that the fix there does not work on 0.94. 
 Still, we may want to fix the issue in 0.94 (via some different fix) because 
 the regions stuck in opening for ridiculous amounts of time is not a good 
 thing to have.


[jira] [Commented] (HBASE-7831) lightweight way to make RS commit suicide


[ 
https://issues.apache.org/jira/browse/HBASE-7831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576916#comment-13576916
 ] 

Sergey Shelukhin commented on HBASE-7831:
-

What I had in mind is code like this:

{code}
if ((currentSeqNum != null)
    && (currentSeqNum.longValue() <= seqNumBeforeFlushStarts.longValue())) {
  String errorStr = "Region " + Bytes.toString(encodedRegionName) +
    "acquired edits out of order current memstore seq=" + currentSeqNum
    + ", previous oldest unflushed id=" + seqNumBeforeFlushStarts;
  LOG.error(errorStr);
  assert false : errorStr;
  Runtime.getRuntime().halt(1);
}
{code}

or 
{code}
if (!checkFileSystem()) {
  LOG.warn("Bad Filesystem, exiting");
  Runtime.getRuntime().halt(1);
}
{code}
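
To make the request concrete, call sites like the two above could funnel through a single static helper. The following is only a sketch under assumptions: FatalExit, die, and setHalterForTesting are hypothetical names, not existing HBase APIs, and System.err stands in for the real logger. The only JDK call it relies on is Runtime.halt(int).

```java
import java.util.function.IntConsumer;

/** Hypothetical static helper for illustration; not part of HBase. */
final class FatalExit {
  // Indirection so tests can observe the halt request instead of killing the JVM.
  private static volatile IntConsumer halter =
      status -> Runtime.getRuntime().halt(status);

  private FatalExit() {}

  /** Test hook (assumption): swap the real halt for a recording stub. */
  static void setHalterForTesting(IntConsumer replacement) {
    halter = replacement;
  }

  /**
   * Reports the reason and then halts. The halt sits in a finally block so
   * that a failure while reporting cannot prevent the process from dying.
   */
  static void die(String reason, Throwable cause) {
    try {
      System.err.println("FATAL, halting: " + reason);
      if (cause != null) {
        cause.printStackTrace();
      }
    } finally {
      halter.accept(1);
    }
  }
}
```

With something like this, the second snippet would shrink to a single FatalExit.die("Bad Filesystem, exiting", null) call inside the if block. Because the helper is static and does not throw on the normal path, callers higher up the stack have no exception to catch and swallow.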

 lightweight way to make RS commit suicide
 -

 Key: HBASE-7831
 URL: https://issues.apache.org/jira/browse/HBASE-7831
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Priority: Minor

 There are times in the life of many a region server when it realizes that it 
 cannot go on any longer in the hostile, inhospitable environment. Or it might 
 realize that there's something inherently wrong with it on the inside 
 (probably because a developer screwed up). The only way out is killing itself.
 There's code in some places currently that does Runtime halt in such cases, 
 and there may be code that calls HRegionServer method to do this.
 I think we need some easy-to-access (lightweight interface, or static) way to 
 trigger reliable (no catching exceptions on the upper levels, etc.) RS death 
 in such cases.


[jira] [Updated] (HBASE-7831) lightweight way to make RS commit suicide (internally, from code)


 [ 
https://issues.apache.org/jira/browse/HBASE-7831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-7831:


Summary: lightweight way to make RS commit suicide (internally, from code)  
(was: lightweight way to make RS commit suicide)

 lightweight way to make RS commit suicide (internally, from code)
 -

 Key: HBASE-7831
 URL: https://issues.apache.org/jira/browse/HBASE-7831
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Priority: Minor

 There are times in the life of many a region server when it realizes that it 
 cannot go on any longer in the hostile, inhospitable environment. Or it might 
 realize that there's something inherently wrong with it on the inside 
 (probably because a developer screwed up). The only way out is killing itself.
 There's code in some places currently that does Runtime halt in such cases, 
 and there may be code that calls HRegionServer method to do this.
 I think we need some easy-to-access (lightweight interface, or static) way to 
 trigger reliable (no catching exceptions on the upper levels, etc.) RS death 
 in such cases.


[jira] [Commented] (HBASE-4210) Allow coprocessor to interact with batches per region sent from a client(?)

2013-02-12 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576917#comment-13576917
 ] 

Anoop Sam John commented on HBASE-4210:
---

Work in progress. Will attach initial version tomorrow.

 Allow coprocessor to interact with batches per region sent from a client(?)
 ---

 Key: HBASE-4210
 URL: https://issues.apache.org/jira/browse/HBASE-4210
 Project: HBase
  Issue Type: New Feature
Affects Versions: 0.94.0
Reporter: Lars Hofhansl
Assignee: Anoop Sam John
Priority: Minor
 Fix For: 0.96.0, 0.94.6


 Currently the coprocessor write hooks - {pre|post}{Put|Delete} - are strictly 
 one row|cell operations.
 It might be a good idea to allow a coprocessor to deal with batches of puts 
 and deletes as they arrive from the client.


[jira] [Commented] (HBASE-7822) clean up compactionrequest and compactselection - part 1

2013-02-12 Thread Jimmy Xiang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-7822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576932#comment-13576932
 ] 

Jimmy Xiang commented on HBASE-7822:


Looks good to me.

 clean up compactionrequest and compactselection - part 1
 

 Key: HBASE-7822
 URL: https://issues.apache.org/jira/browse/HBASE-7822
 Project: HBase
  Issue Type: Improvement
  Components: Compaction
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-7822-v0.patch, HBASE-7822-v0.patch


 Certain parts of CompactionRequest are unnecessary.
 Off-peak hour management is part way in selection, part way in Store and part 
 way in policy.
 Needs to be cleaned up in preparation of having CompactionPolicy return 
 CompactionRequest.


[jira] [Commented] (HBASE-7831) lightweight way to make RS commit suicide (internally, from code)


[ 
https://issues.apache.org/jira/browse/HBASE-7831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576934#comment-13576934
 ] 

Jonathan Hsieh commented on HBASE-7831:
---

This seems like what the abort(...) call from the Abortable interface is for on 
the HMaster and HRegionServer.  Does it cover the need you bring up?

https://github.com/apache/hbase/blob/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java#L1913

https://github.com/apache/hbase/blob/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java#L1701
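
For readers comparing the two approaches, here is a toy sketch of the abort-style pattern. The interface below is a local stand-in modeled on the abort(...) call mentioned above, with an assumed (String, Throwable) signature; AbortableSketch and ToyServer are hypothetical names, and the real HMaster and HRegionServer implementations do considerably more.

```java
/** Local stand-in for an abort-style interface; not the HBase class itself. */
interface AbortableSketch {
  void abort(String why, Throwable cause);
  boolean isAborted();
}

/** Toy server whose work loop stops once an abort has been requested. */
final class ToyServer implements AbortableSketch {
  private volatile boolean aborted;
  private volatile String abortReason;

  @Override
  public void abort(String why, Throwable cause) {
    abortReason = why;  // record the reason before publishing the flag
    aborted = true;
  }

  @Override
  public boolean isAborted() {
    return aborted;
  }

  String abortReason() {
    return abortReason;
  }

  /** Runs up to maxSteps; aborts itself when the given step hits a fatal problem. */
  int run(int maxSteps, int failAtStep) {
    int completed = 0;
    while (!aborted && completed < maxSteps) {
      completed++;
      if (completed == failAtStep) {
        abort("fatal problem at step " + completed, null);
      }
    }
    return completed;
  }
}
```

The difference from a direct halt is that this style is cooperative: the flag only takes effect when the work loop next checks it, which is presumably why the issue description asks for something that cannot be intercepted at upper levels.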

 lightweight way to make RS commit suicide (internally, from code)
 -

 Key: HBASE-7831
 URL: https://issues.apache.org/jira/browse/HBASE-7831
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Priority: Minor

 There are times in the life of many a region server when it realizes that it 
 cannot go on any longer in the hostile, inhospitable environment. Or it might 
 realize that there's something inherently wrong with it on the inside 
 (probably because a developer screwed up). The only way out is killing itself.
 There's code in some places currently that does Runtime halt in such cases, 
 and there may be code that calls HRegionServer method to do this.
 I think we need some easy-to-access (lightweight interface, or static) way to 
 trigger reliable (no catching exceptions on the upper levels, etc.) RS death 
 in such cases.


[jira] [Commented] (HBASE-7521) fix HBASE-6060 (regions stuck in opening state) in 0.94


[ 
https://issues.apache.org/jira/browse/HBASE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576955#comment-13576955
 ] 

Lars Hofhansl commented on HBASE-7521:
--

There are already enough +1's to commit.

 fix HBASE-6060 (regions stuck in opening state) in 0.94
 ---

 Key: HBASE-7521
 URL: https://issues.apache.org/jira/browse/HBASE-7521
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.94.6

 Attachments: 7521-original-patch-ported-v4.patch, 
 HBASE-7521-original-patch-ported-v0.patch, 
 HBASE-7521-original-patch-ported-v1.patch, 
 HBASE-7521-original-patch-ported-v2.patch, 
 HBASE-7521-original-patch-ported-v3.patch, HBASE-7521-v0.patch, 
 HBASE-7521-v1.patch, HBASE-7521-v5.patch


 Discussion in HBASE-6060 implies that the fix there does not work on 0.94. 
 Still, we may want to fix the issue in 0.94 (via some different fix) because 
 the regions stuck in opening for ridiculous amounts of time is not a good 
 thing to have.


[jira] [Commented] (HBASE-7830) Disable IntegrationTestRebalanceAndKillServersTargeted temporarily


[ 
https://issues.apache.org/jira/browse/HBASE-7830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576983#comment-13576983
 ] 

Sergey Shelukhin commented on HBASE-7830:
-

+1

 Disable IntegrationTestRebalanceAndKillServersTargeted temporarily
 --

 Key: HBASE-7830
 URL: https://issues.apache.org/jira/browse/HBASE-7830
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.96.0

 Attachments: 7830.txt


 Disabling IntegrationTestRebalanceAndKillServersTargeted until hbase-7520 is 
 fixed so we can move forward with getting hbase-it running on bigtop infra.


[jira] [Commented] (HBASE-7521) fix HBASE-6060 (regions stuck in opening state) in 0.94


[ 
https://issues.apache.org/jira/browse/HBASE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576995#comment-13576995
 ] 

Ted Yu commented on HBASE-7521:
---

Integrated to 0.94

Thanks for the patch, Sergey.

Thanks for the reviews, Stack, Lars, Ram, Rajesh and Enis.

 fix HBASE-6060 (regions stuck in opening state) in 0.94
 ---

 Key: HBASE-7521
 URL: https://issues.apache.org/jira/browse/HBASE-7521
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.94.6

 Attachments: 7521-original-patch-ported-v4.patch, 
 HBASE-7521-original-patch-ported-v0.patch, 
 HBASE-7521-original-patch-ported-v1.patch, 
 HBASE-7521-original-patch-ported-v2.patch, 
 HBASE-7521-original-patch-ported-v3.patch, HBASE-7521-v0.patch, 
 HBASE-7521-v1.patch, HBASE-7521-v5.patch


 Discussion in HBASE-6060 implies that the fix there does not work on 0.94.
 Still, we may want to fix the issue in 0.94 (via some different fix) because
 regions stuck in opening for ridiculous amounts of time are not a good
 thing to have.


[jira] [Commented] (HBASE-7521) fix HBASE-6060 (regions stuck in opening state) in 0.94


[ 
https://issues.apache.org/jira/browse/HBASE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13577005#comment-13577005
 ] 

Hudson commented on HBASE-7521:
---

Integrated in HBase-0.94 #839 (See 
[https://builds.apache.org/job/HBase-0.94/839/])
HBASE-7521 fix HBASE-6060 (regions stuck in opening state) in 0.94 (Sergey 
and Rajeshbabu) (Revision 1445350)

 Result = FAILURE
tedyu : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseRegionHandler.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/Mocking.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestZKBasedOpenCloseRegion.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestCloseRegionHandler.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestOpenRegionHandler.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java


 fix HBASE-6060 (regions stuck in opening state) in 0.94
 ---

 Key: HBASE-7521
 URL: https://issues.apache.org/jira/browse/HBASE-7521
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.94.6

 Attachments: 7521-original-patch-ported-v4.patch, 
 HBASE-7521-original-patch-ported-v0.patch, 
 HBASE-7521-original-patch-ported-v1.patch, 
 HBASE-7521-original-patch-ported-v2.patch, 
 HBASE-7521-original-patch-ported-v3.patch, HBASE-7521-v0.patch, 
 HBASE-7521-v1.patch, HBASE-7521-v5.patch


 Discussion in HBASE-6060 implies that the fix there does not work on 0.94.
 Still, we may want to fix the issue in 0.94 (via some different fix) because
 regions stuck in opening for ridiculous amounts of time are not a good
 thing to have.


[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover


[ 
https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13577006#comment-13577006
 ] 

Hudson commented on HBASE-6060:
---

Integrated in HBase-0.94 #839 (See 
[https://builds.apache.org/job/HBase-0.94/839/])
HBASE-7521 fix HBASE-6060 (regions stuck in opening state) in 0.94 (Sergey 
and Rajeshbabu) (Revision 1445350)

 Result = FAILURE
tedyu : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseRegionHandler.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/Mocking.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestZKBasedOpenCloseRegion.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestCloseRegionHandler.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestOpenRegionHandler.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java


 Regions's in OPENING state from failed regionservers takes a long time to 
 recover
 -

 Key: HBASE-6060
 URL: https://issues.apache.org/jira/browse/HBASE-6060
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver
Reporter: Enis Soztutar
Assignee: Jimmy Xiang
 Fix For: 0.96.0

 Attachments: 6060-94-v3.patch, 6060-94-v4_1.patch, 
 6060-94-v4_1.patch, 6060-94-v4.patch, 6060_alternative_suggestion.txt, 
 6060_suggestion2_based_off_v3.patch, 6060_suggestion_based_off_v3.patch, 
 6060_suggestion_toassign_rs_wentdown_beforerequest.patch, 6060-trunk_2.patch, 
 6060-trunk_3.patch, 6060-trunk.patch, 6060-trunk.patch, HBASE-6060-92.patch, 
 HBASE-6060-94.patch, HBASE-6060_latest.patch, HBASE-6060_latest.patch, 
 HBASE-6060_latest.patch, HBASE-6060-trunk_4.patch, HBASE-6060_trunk_5.patch, 
 trunk-6060.patch, trunk-6060_v2.patch, trunk-6060_v3.3.patch


 We have seen a pattern in tests where regions are stuck in the OPENING state
 for a very long time when the region server that is opening the region fails.
 My understanding of the process:

  - The master calls the rs to open the region. If the rs is offline, a new plan
 is generated (a new rs is chosen). RegionState is set to PENDING_OPEN (only in
 master memory; zk still shows OFFLINE). See HRegionServer.openRegion(),
 HMaster.assign().
  - The RegionServer starts opening the region and changes the state in the
 znode. But that znode is not ephemeral (see ZkAssign).
  - The rs transitions the zk node from OFFLINE to OPENING. See
 OpenRegionHandler.process().
  - The rs then opens the region and changes the znode from OPENING to OPENED.
  - When the rs is killed between the OPENING and OPENED states, zk shows the
 OPENING state and the master just waits for the rs to change the region state,
 but since the rs is down, that won't happen.
  - There is an AssignmentManager.TimeoutMonitor, which guards against exactly
 these kinds of conditions. It periodically checks (every 10 sec by default) the
 regions in transition to see whether they have timed out
 (hbase.master.assignment.timeoutmonitor.timeout). The default timeout is 30 min,
 which explains what you and I are seeing.
  - ServerShutdownHandler in the master does not reassign regions in the OPENING
 state, although it handles other states.
 Lowering that threshold in the configuration is one option, but I still
 think we can do better.
 Will investigate more.
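
 The two recovery paths described above can be illustrated with a small
 stand-alone model. This is only a sketch under stated assumptions: the class
 RegionTransitionSketch and its methods are hypothetical and are not HBase
 code. It contrasts a timeout-style check (nothing happens until the state is
 older than the timeout) with a shutdown-style check (regions that were
 OPENING on a dead server are found immediately).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of regions in transition as seen by the master; not HBase code. */
class RegionTransitionSketch {
  enum State { OFFLINE, PENDING_OPEN, OPENING, OPENED }

  /** State of one region, the server working on it, and when the state last changed. */
  static final class Transition {
    final State state;
    final String server;
    final long stampMs;

    Transition(State state, String server, long stampMs) {
      this.state = state;
      this.server = server;
      this.stampMs = stampMs;
    }
  }

  private final Map<String, Transition> inTransition =
      new LinkedHashMap<String, Transition>();

  /** Records a state change for a region at time nowMs. */
  void update(String region, State state, String server, long nowMs) {
    inTransition.put(region, new Transition(state, server, nowMs));
  }

  /** Timeout-style check: regions not yet OPENED whose state is older than timeoutMs. */
  List<String> timedOut(long nowMs, long timeoutMs) {
    List<String> result = new ArrayList<String>();
    for (Map.Entry<String, Transition> e : inTransition.entrySet()) {
      Transition t = e.getValue();
      if (t.state != State.OPENED && nowMs - t.stampMs > timeoutMs) {
        result.add(e.getKey());
      }
    }
    return result;
  }

  /** Shutdown-style check: regions that were OPENING on a server now known to be dead. */
  List<String> openingOn(String deadServer) {
    List<String> result = new ArrayList<String>();
    for (Map.Entry<String, Transition> e : inTransition.entrySet()) {
      Transition t = e.getValue();
      if (t.state == State.OPENING && deadServer.equals(t.server)) {
        result.add(e.getKey());
      }
    }
    return result;
  }
}
```

 With a 30 min timeout, timedOut() returns nothing for a region that entered
 OPENING ten seconds ago, while openingOn() for the dead server returns it
 right away, which is the gap the description points at.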


[jira] [Updated] (HBASE-7290) Online snapshots


 [ 
https://issues.apache.org/jira/browse/HBASE-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-7290:
--

Attachment: hbase-7290.mega.patch

Attached megapatch after 2/12/13 merge for initial merge reviews.

 Online snapshots 
 -

 Key: HBASE-7290
 URL: https://issues.apache.org/jira/browse/HBASE-7290
 Project: HBase
  Issue Type: Bug
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Attachments: hbase-7290.mega.patch


 HBASE-6055 will be closed when the offline snapshots pieces get merged with 
 trunk.  This umbrella issue has all the online snapshot specific patches.  
 This will get merged once one of the implementations makes it into trunk.  
 Other flavors of online snapshots can then be done as normal patches instead 
 of on a development branch.  (was: HBASE-6055 will be closed when the online 
 snapshots pieces get merged with trunk.  This umbrella issue has all the 
 online snapshot specific patches.  This will get merged once one of the 
 implementations makes it into trunk.  Other flavors of online snapshots can 
 then be done as normal patches instead of on a development branch.)
 (not a fan of the quick edit description jira feature)


[jira] [Updated] (HBASE-7290) Online snapshots


 [ 
https://issues.apache.org/jira/browse/HBASE-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-7290:
--

Status: Patch Available  (was: Open)

Submitting megapatch to get apache jenkins build machines to give it a try.

 Online snapshots 
 -

 Key: HBASE-7290
 URL: https://issues.apache.org/jira/browse/HBASE-7290
 Project: HBase
  Issue Type: Bug
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Attachments: hbase-7290.mega.patch


 HBASE-6055 will be closed when the offline snapshots pieces get merged with 
 trunk.  This umbrella issue has all the online snapshot specific patches.  
 This will get merged once one of the implementations makes it into trunk.  
 Other flavors of online snapshots can then be done as normal patches instead 
 of on a development branch.  (was: HBASE-6055 will be closed when the online 
 snapshots pieces get merged with trunk.  This umbrella issue has all the 
 online snapshot specific patches.  This will get merged once one of the 
 implementations makes it into trunk.  Other flavors of online snapshots can 
 then be done as normal patches instead of on a development branch.)
 (not a fan of the quick edit description jira feature)


[jira] [Resolved] (HBASE-7521) fix HBASE-6060 (regions stuck in opening state) in 0.94


 [ 
https://issues.apache.org/jira/browse/HBASE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-7521.
--

  Resolution: Fixed
Hadoop Flags: Reviewed

 fix HBASE-6060 (regions stuck in opening state) in 0.94
 ---

 Key: HBASE-7521
 URL: https://issues.apache.org/jira/browse/HBASE-7521
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.94.6

 Attachments: 7521-original-patch-ported-v4.patch, 
 HBASE-7521-original-patch-ported-v0.patch, 
 HBASE-7521-original-patch-ported-v1.patch, 
 HBASE-7521-original-patch-ported-v2.patch, 
 HBASE-7521-original-patch-ported-v3.patch, HBASE-7521-v0.patch, 
 HBASE-7521-v1.patch, HBASE-7521-v5.patch


 Discussion in HBASE-6060 implies that the fix there does not work on 0.94.
 Still, we may want to fix the issue in 0.94 (via some different fix) because
 regions stuck in opening for ridiculous amounts of time are not a good
 thing to have.


[jira] [Commented] (HBASE-7822) clean up compactionrequest and compactselection - part 1


[ 
https://issues.apache.org/jira/browse/HBASE-7822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13577033#comment-13577033
 ] 

Ted Yu commented on HBASE-7822:
---

{code}
+public class OffPeakCompactions {
{code}
Please add javadoc for the new class. Annotation for audience, too.
{code}
-  + CompactSelection.getNumOutStandingOffPeakCompactions());
+  LOG.info("Running an off-peak compaction, selection ratio = " + ratio);
...
+  public long getNumOutStandingOffPeakCompactions() {
{code}
I don't see the new getNumOutStandingOffPeakCompactions() method called.

 clean up compactionrequest and compactselection - part 1
 

 Key: HBASE-7822
 URL: https://issues.apache.org/jira/browse/HBASE-7822
 Project: HBase
  Issue Type: Improvement
  Components: Compaction
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-7822-v0.patch, HBASE-7822-v0.patch


 Certain parts of CompactionRequest are unnecessary.
 Off-peak hour management is split between selection, Store and policy.
 This needs to be cleaned up in preparation for having CompactionPolicy return
 CompactionRequest.
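
 As an illustration of keeping off-peak bookkeeping in one place, here is a
 minimal sketch. It is not the code from the patch: the class name
 OffPeakCompactionsSketch, the methods tryStart(), end() and isOffPeakHour(),
 and the maxOutstanding limit are all assumptions made for this example. Only
 the getter name getNumOutStandingOffPeakCompactions() is taken from the
 review comment above.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical single owner of off-peak compaction state; not the patch's class. */
class OffPeakCompactionsSketch {
  private final AtomicLong outstanding = new AtomicLong();
  private final long maxOutstanding; // assumed limit, for illustration only

  OffPeakCompactionsSketch(long maxOutstanding) {
    this.maxOutstanding = maxOutstanding;
  }

  /** True if hour (0-23) falls in [startHour, endHour), allowing a window that wraps midnight. */
  static boolean isOffPeakHour(int startHour, int endHour, int hour) {
    if (startHour <= endHour) {
      return hour >= startHour && hour < endHour;
    }
    return hour >= startHour || hour < endHour;
  }

  /** Atomically claims an off-peak slot; returns false when the limit is already reached. */
  boolean tryStart() {
    while (true) {
      long current = outstanding.get();
      if (current >= maxOutstanding) {
        return false;
      }
      if (outstanding.compareAndSet(current, current + 1)) {
        return true;
      }
    }
  }

  /** Releases a slot previously claimed with tryStart(). */
  void end() {
    outstanding.decrementAndGet();
  }

  long getNumOutStandingOffPeakCompactions() {
    return outstanding.get();
  }
}
```

 The point of the design is that selection, Store and policy would all ask the
 same object, instead of each holding part of the state.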


[jira] [Commented] (HBASE-7814) Port HBASE-6963 'unable to run hbck on a secure cluster' to 0.94


[ 
https://issues.apache.org/jira/browse/HBASE-7814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13577044#comment-13577044
 ] 

Lars Hofhansl commented on HBASE-7814:
--

+1

(Test with older version would be nice, but the change is obvious enough that I 
would not make that a requirement.)

 Port HBASE-6963 'unable to run hbck on a secure cluster' to 0.94
 

 Key: HBASE-7814
 URL: https://issues.apache.org/jira/browse/HBASE-7814
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.94.6

 Attachments: 7814-v2.txt, 7814-v3.txt, 7814-v4.txt, 
 HBASE-7814-v0.patch





[jira] [Commented] (HBASE-7814) Port HBASE-6963 'unable to run hbck on a secure cluster' to 0.94


[ 
https://issues.apache.org/jira/browse/HBASE-7814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13577053#comment-13577053
 ] 

Ted Yu commented on HBASE-7814:
---

Integrated to 0.94

Thanks for the reviews, Jon, Sergey and Lars.

 Port HBASE-6963 'unable to run hbck on a secure cluster' to 0.94
 

 Key: HBASE-7814
 URL: https://issues.apache.org/jira/browse/HBASE-7814
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.94.6

 Attachments: 7814-v2.txt, 7814-v3.txt, 7814-v4.txt, 
 HBASE-7814-v0.patch





[jira] [Updated] (HBASE-7818) add region level metrics readReqeustCount and writeRequestCount

2013-02-12 Thread Tianying Chang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tianying Chang updated HBASE-7818:
--

Attachment: HBASE-7818.patch

Patch attached. 

 add region level metrics readReqeustCount and writeRequestCount 
 

 Key: HBASE-7818
 URL: https://issues.apache.org/jira/browse/HBASE-7818
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.4
Reporter: Tianying Chang
Assignee: Tianying Chang
Priority: Minor
 Fix For: 0.94.6

 Attachments: HBASE-7818.patch


 Request rate at the region server level can help identify a hot region server.
 But it would be good if we could further identify the hot regions on that region
 server. That way, we can easily spot unbalanced-region problems.
 Currently, readRequestCount and writeRequestCount per region are exposed in the
 web UI. It would be more useful to expose them through the hadoop metrics framework
 and/or JMX, so that people can see the history of when a region was hot.
 I am exposing the existing readRequestCount/writeRequestCount through the
 dynamic region level metrics framework. I am not changing/exposing them as a rate
 because our openTSDB takes the raw read/write counts and already applies a
 rate function to display the rate.
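
 For readers unfamiliar with the raw-counter approach, the following sketch
 shows how a rate can be derived downstream from two samples of a
 monotonically increasing counter. It is an illustration only: the class
 CounterRateSketch and its method are hypothetical, and treating a decrease as
 a counter reset is an assumption of this example, not something stated in the
 issue.

```java
/** Hypothetical helper that turns two raw counter samples into a per-second rate. */
class CounterRateSketch {
  /**
   * Requests per second between two samples of an increasing counter.
   * A smaller current value is treated as a reset to zero (an assumption,
   * for example after the region is reopened).
   */
  static double ratePerSecond(long prevCount, long prevMs, long curCount, long curMs) {
    long elapsedMs = curMs - prevMs;
    if (elapsedMs <= 0) {
      throw new IllegalArgumentException("samples must be in increasing time order");
    }
    long delta = curCount - prevCount;
    if (delta < 0) {
      delta = curCount; // counter restarted from zero between the samples
    }
    return delta * 1000.0 / elapsedMs;
  }
}
```

 For example, a read count that goes from 100 to 400 over 60 seconds gives
 300 requests in 60 s, or 5 requests per second.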


[jira] [Commented] (HBASE-7802) Use the Store interface instead of HStore (coprocessor, compaction, ...)


[ 
https://issues.apache.org/jira/browse/HBASE-7802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13577060#comment-13577060
 ] 

Ted Yu commented on HBASE-7802:
---

+1 with minor comment:
{code}
+ * Immutable information for scans over a store.
+ */
+public class ScanInfo {
{code}
Please add annotation for the new class.

 Use the Store interface instead of HStore (coprocessor, compaction, ...)
 

 Key: HBASE-7802
 URL: https://issues.apache.org/jira/browse/HBASE-7802
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Trivial
 Attachments: HBASE-7802-v0.patch, HBASE-7802-v1.patch, 
 HBASE-7802-v2.patch


 We have a Store interface that is almost never used; coprocessors and
 compactions use HStore directly.
 Replace HStore with Store so that a custom Store can be created for
 testing, instead of injecting stuff into HStore.
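
 The testing benefit can be sketched as follows. The names here are
 hypothetical stand-ins, not the real HBase Store interface: StoreSketch has a
 single assumed method and CompactionCheckSketch is an invented caller. The
 sketch only shows that code written against an interface can be exercised
 with a trivial stub instead of a fully constructed store.

```java
/** Hypothetical stand-in for a store interface; the real Store has many more methods. */
interface StoreSketch {
  int getStorefilesCount();
}

/** Hypothetical caller that depends only on the interface, not on a concrete store. */
class CompactionCheckSketch {
  /** Returns true when the store holds at least minFiles files. */
  static boolean needsCompaction(StoreSketch store, int minFiles) {
    return store.getStorefilesCount() >= minFiles;
  }
}
```

 A test can then pass an anonymous StoreSketch that returns a fixed file
 count, with no file system or region set up behind it.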


[jira] [Commented] (HBASE-7830) Disable IntegrationTestRebalanceAndKillServersTargeted temporarily


[ 
https://issues.apache.org/jira/browse/HBASE-7830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13577061#comment-13577061
 ] 

Hudson commented on HBASE-7830:
---

Integrated in HBase-TRUNK #3874 (See 
[https://builds.apache.org/job/HBase-TRUNK/3874/])
HBASE-7830 Disable IntegrationTestRebalanceAndKillServersTargeted 
temporarily (Revision 1445294)

 Result = FAILURE
stack : 
Files : 
* /hbase/trunk/hbase-it/src/test/java/org/apache/hadoop/hbase/IntegrationTestRebalanceAndKillServersTargeted.java


 Disable IntegrationTestRebalanceAndKillServersTargeted temporarily
 --

 Key: HBASE-7830
 URL: https://issues.apache.org/jira/browse/HBASE-7830
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.96.0

 Attachments: 7830.txt


 Disabling IntegrationTestRebalanceAndKillServersTargeted until HBASE-7520 is
 fixed, so we can move forward with getting hbase-it running on bigtop infra.


[jira] [Resolved] (HBASE-7814) Port HBASE-6963 'unable to run hbck on a secure cluster' to 0.94


 [ 
https://issues.apache.org/jira/browse/HBASE-7814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-7814.
--

Resolution: Fixed

 Port HBASE-6963 'unable to run hbck on a secure cluster' to 0.94
 

 Key: HBASE-7814
 URL: https://issues.apache.org/jira/browse/HBASE-7814
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.94.6

 Attachments: 7814-v2.txt, 7814-v3.txt, 7814-v4.txt, 
 HBASE-7814-v0.patch





[jira] [Commented] (HBASE-3171) Drop ROOT and instead store META location(s) directly in ZooKeeper

2013-02-12 Thread Enis Soztutar (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13577083#comment-13577083
]

Enis Soztutar commented on HBASE-3171:
--

bq. May be it's totally stupid question, but would it make sense to use this
migration to rename '.META.' as well
I thought I would never see this day. The jira for Windows is HBASE-6818, but it
renames only the table dir name, not the table itself. I'll open another one
to discuss the rename.

Drop ROOT and instead store META location(s) directly in ZooKeeper
--

[jira] [Commented] (HBASE-7818) add region level metrics readReqeustCount and writeRequestCount


[ 
https://issues.apache.org/jira/browse/HBASE-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13577089#comment-13577089
 ] 

Ted Yu commented on HBASE-7818:
---

{code}
+  public void SetOpMetricsReadRequestCount(long value)
{code}
Lower case the starting 'S', same with SetOpMetricsWriteRequestCount().
{code}
+  private void doUpdateNumericPersistentMetrics(Set<byte[]> columnFamilies, String key, long value) {
+   String cfPrefix = null;
{code}
Indentation should be two spaces.
{code}
* This deletes all old metrics this instance has ever created or updated.
*/
-  public void closeMetrics() {
-RegionMetricsStorage.clear();
+  public void closeMetrics(String regionEncodedName) {
{code}
Javadoc no longer matches code.

Can you upload patch for trunk ?

Thanks

 add region level metrics readReqeustCount and writeRequestCount 
 

 Key: HBASE-7818
 URL: https://issues.apache.org/jira/browse/HBASE-7818
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.4
Reporter: Tianying Chang
Assignee: Tianying Chang
Priority: Minor
 Fix For: 0.94.6

 Attachments: HBASE-7818.patch


 Request rate at the region server level can help identify a hot region server.
 But it would be good if we could further identify the hot regions on that region
 server. That way, we can easily spot unbalanced-region problems.
 Currently, readRequestCount and writeRequestCount per region are exposed in the
 web UI. It would be more useful to expose them through the hadoop metrics framework
 and/or JMX, so that people can see the history of when a region was hot.
 I am exposing the existing readRequestCount/writeRequestCount through the
 dynamic region level metrics framework. I am not changing/exposing them as a rate
 because our openTSDB takes the raw read/write counts and already applies a
 rate function to display the rate.


[jira] [Commented] (HBASE-7290) Online snapshots

[
https://issues.apache.org/jira/browse/HBASE-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13577104#comment-13577104
]

Hadoop QA commented on HBASE-7290:
--

{color:red}-1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12569057/hbase-7290.mega.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author
tags.

{color:green}+1 tests included{color}. The patch appears to include 91 new
or modified tests.

{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop
2.0 profile.

{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 6
warning messages.

{color:green}+1 javac{color}. The applied patch does not increase the
total number of javac compiler warnings.

{color:red}-1 findbugs{color}. The patch appears to introduce 7 new
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase
the total number of release audit warnings.

{color:red}-1 lineLengths{color}. The patch introduces lines longer than
100

{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/4425//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4425//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4425//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4425//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4425//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4425//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4425//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/4425//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/4425//console

This message is automatically generated.

Online snapshots
-

Key: HBASE-7290
URL: https://issues.apache.org/jira/browse/HBASE-7290
Project: HBase
Issue Type: Bug
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Attachments: hbase-7290.mega.patch

HBASE-6055 will be closed when the offline snapshots pieces get merged with
trunk. This umbrella issue has all the online snapshot specific patches.
This will get merged once one of the implementations makes it into trunk.
Other flavors of online snapshots can then be done as normal patches instead
of on a development branch. (was: HBASE-6055 will be closed when the online
snapshots pieces get merged with trunk. This umbrella issue has all the
online snapshot specific patches. This will get merged once one of the
implementations makes it into trunk. Other flavors of online snapshots can
then be done as normal patches instead of on a development branch.)
(not a fan of the quick edit description jira feature)

[jira] [Commented] (HBASE-7290) Online snapshots


[ 
https://issues.apache.org/jira/browse/HBASE-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13577112#comment-13577112
 ] 

Jonathan Hsieh commented on HBASE-7290:
---

We can fix these qabot issues before the merge.

 Online snapshots 
 -

 Key: HBASE-7290
 URL: https://issues.apache.org/jira/browse/HBASE-7290
 Project: HBase
  Issue Type: Bug
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Attachments: hbase-7290.mega.patch


 HBASE-6055 will be closed when the offline snapshots pieces get merged with 
 trunk.  This umbrella issue has all the online snapshot specific patches.  
 This will get merged once one of the implementations makes it into trunk.  
 Other flavors of online snapshots can then be done as normal patches instead 
 of on a development branch.  (was: HBASE-6055 will be closed when the online 
 snapshots pieces get merged with trunk.  This umbrella issue has all the 
 online snapshot specific patches.  This will get merged once one of the 
 implementations makes it into trunk.  Other flavors of online snapshots can 
 then be done as normal patches instead of on a development branch.)
 (not a fan of the quick edit description jira feature)


[jira] [Updated] (HBASE-7802) Use the Store interface instead of HStore (coprocessor, compaction, ...)