[jira] [Commented] (HBASE-14227) Fold special cased MOB APIs into existing APIs

2015-08-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14718126#comment-14718126
 ] 

Hadoop QA commented on HBASE-14227:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12752927/HBASE-14227_v5.patch
  against master branch at commit cc1542828de93b8d54cc14497fd5937989ea1b6d.
  ATTACHMENT ID: 12752927

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 7 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0 2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:red}-1 checkstyle{color}.  The applied patch generated 
1851 checkstyle errors (more than the master's current 1849 errors).

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.TestIOFencing

 {color:red}-1 core zombie tests{color}.  There are 7 zombie test(s):   
at 
org.apache.hadoop.hbase.security.access.TestAccessController.testAccessControllerUserPermsRegexHandling(TestAccessController.java:2560)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15313//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15313//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15313//artifact/patchprocess/checkstyle-aggregate.html

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15313//console

This message is automatically generated.

 Fold special cased MOB APIs into existing APIs
 --

 Key: HBASE-14227
 URL: https://issues.apache.org/jira/browse/HBASE-14227
 Project: HBase
  Issue Type: Task
  Components: mob
Affects Versions: 2.0.0
Reporter: Andrew Purtell
Assignee: Heng Chen
Priority: Blocker
 Fix For: 2.0.0

 Attachments: HBASE-14227.patch, HBASE-14227_v1.patch, 
 HBASE-14227_v2.patch, HBASE-14227_v3.patch, HBASE-14227_v4.patch, 
 HBASE-14227_v5.patch


 There are a number of APIs that came in with MOB that are not new actions for 
 HBase, simply new actions for a MOB implementation:
 - compactMob
 - compactMobs
 - majorCompactMob
 - majorCompactMobs
 - getMobCompactionState
 And in HBaseAdmin:
 - validateMobColumnFamily
 Remove these special cases from the Admin API where possible by folding them 
 into existing APIs.
 We definitely don't need one method for a singleton and another for 
 collections.
 Ideally we will not have any APIs named *Mob when finished; whether MOBs are 
 in use on a table or not should be largely an internal detail. Exposing it as a 
 schema option would be fine; this conforms to existing practice for other 
 features.
 Marking critical because I think removing the *Mob special cased APIs should 
 be a precondition for release of this feature either in 2.0 or as a backport.
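 A purely illustrative sketch of the requested folding (the class and method 
 names below are hypothetical, not the real HBase Admin API): one varargs entry 
 point replaces the singleton/collection pair, and the MOB case becomes an 
 internal detail rather than a caller-visible method name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only -- not the actual HBase Admin interface.
// One varargs method replaces the compactMob/compactMobs pair; whether a
// table uses MOBs would be decided internally, not by the caller.
class AdminSketch {
  final List<String> compacted = new ArrayList<>();

  // Accepts one table or many; no separate singleton/collection methods.
  void compact(String... tables) {
    for (String t : tables) {
      // A real implementation would submit a compaction request here,
      // handling the MOB case internally based on the table's schema.
      compacted.add(t);
    }
  }
}
```

 A caller would then write {{admin.compact("t1")}} or 
 {{admin.compact("t1", "t2")}} without caring whether the tables store MOBs.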



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14327) TestIOFencing#testFencingAroundCompactionAfterWALSync is flaky

2015-08-28 Thread Heng Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14718138#comment-14718138
 ] 

Heng Chen commented on HBASE-14327:
---

I met this problem too.
https://builds.apache.org/job/PreCommit-HBASE-Build/15313//testReport/



 TestIOFencing#testFencingAroundCompactionAfterWALSync is flaky
 --

 Key: HBASE-14327
 URL: https://issues.apache.org/jira/browse/HBASE-14327
 Project: HBase
  Issue Type: Bug
  Components: test
Reporter: Dima Spivak
Priority: Critical

 I'm looking into some more of the flaky tests on trunk and this one seems to 
 be particularly gross, failing about half the time in recent days. Some 
 probably-relevant output from [a recent 
 run|https://builds.apache.org/job/HBase-TRUNK/6761/testReport/org.apache.hadoop.hbase/TestIOFencing/testFencingAroundCompactionAfterWALSync/]:
 {noformat}
 2015-08-27 18:50:14,318 INFO  [main] hbase.TestIOFencing(326): Allowing 
 compaction to proceed
 2015-08-27 18:50:14,318 DEBUG [main] 
 hbase.TestIOFencing$CompactionBlockerRegion(110): allowing compactions
 2015-08-27 18:50:14,318 DEBUG 
 [RS:0;hemera:35619-shortCompactions-1440701403303] regionserver.HStore(1732): 
 Removing store files after compaction...
 2015-08-27 18:50:14,323 DEBUG 
 [RS:0;hemera:35619-longCompactions-1440701391112] regionserver.HStore(1732): 
 Removing store files after compaction...
 2015-08-27 18:50:14,330 DEBUG 
 [RS:0;hemera:35619-longCompactions-1440701391112] backup.HFileArchiver(224): 
 Archiving compacted store files.
 2015-08-27 18:50:14,331 DEBUG 
 [RS:0;hemera:35619-shortCompactions-1440701403303] backup.HFileArchiver(224): 
 Archiving compacted store files.
 2015-08-27 18:50:14,337 DEBUG 
 [RS:0;hemera:35619-longCompactions-1440701391112] backup.HFileArchiver(438): 
 Finished archiving from class 
 org.apache.hadoop.hbase.backup.HFileArchiver$FileableStoreFile, 
 file:hdfs://localhost:34675/user/jenkins/test-data/19edea13-027b-4c6a-9f3f-edaf1fc590ab/data/default/tabletest/94d6f21f7cf387d73d8622f535c67311/family/99e903ad7e0f4029862d0e35c5548464,
  to 
 hdfs://localhost:34675/user/jenkins/test-data/19edea13-027b-4c6a-9f3f-edaf1fc590ab/archive/data/default/tabletest/94d6f21f7cf387d73d8622f535c67311/family/99e903ad7e0f4029862d0e35c5548464
 2015-08-27 18:50:14,337 DEBUG 
 [RS:0;hemera:35619-shortCompactions-1440701403303] backup.HFileArchiver(438): 
 Finished archiving from class 
 org.apache.hadoop.hbase.backup.HFileArchiver$FileableStoreFile, 
 file:hdfs://localhost:34675/user/jenkins/test-data/19edea13-027b-4c6a-9f3f-edaf1fc590ab/data/default/tabletest/94d6f21f7cf387d73d8622f535c67311/family/74a80cc06d134361941085bc2bb905fe,
  to 
 hdfs://localhost:34675/user/jenkins/test-data/19edea13-027b-4c6a-9f3f-edaf1fc590ab/archive/data/default/tabletest/94d6f21f7cf387d73d8622f535c67311/family/74a80cc06d134361941085bc2bb905fe
 2015-08-27 18:50:14,341 DEBUG 
 [RS:0;hemera:35619-longCompactions-1440701391112] backup.HFileArchiver(438): 
 Finished archiving from class 
 org.apache.hadoop.hbase.backup.HFileArchiver$FileableStoreFile, 
 file:hdfs://localhost:34675/user/jenkins/test-data/19edea13-027b-4c6a-9f3f-edaf1fc590ab/data/default/tabletest/94d6f21f7cf387d73d8622f535c67311/family/7067addd325446089ba15ec2c77becbc,
  to 
 hdfs://localhost:34675/user/jenkins/test-data/19edea13-027b-4c6a-9f3f-edaf1fc590ab/archive/data/default/tabletest/94d6f21f7cf387d73d8622f535c67311/family/7067addd325446089ba15ec2c77becbc
 2015-08-27 18:50:14,342 INFO  
 [RS:0;hemera:35619-longCompactions-1440701391112] regionserver.HStore(1353): 
 Completed compaction of 2 (all) file(s) in family of 
 tabletest,,1440701396419.94d6f21f7cf387d73d8622f535c67311. into 
 e138bb0ec6c64ad19efab3b44dbbcb1a(size=68.7 K), total size for store is 146.9 
 K. This selection was in queue for 0sec, and took 10sec to execute.
 2015-08-27 18:50:14,343 INFO  
 [RS:0;hemera:35619-longCompactions-1440701391112] 
 regionserver.CompactSplitThread$CompactionRunner(527): Completed compaction: 
 Request = 
 regionName=tabletest,,1440701396419.94d6f21f7cf387d73d8622f535c67311., 
 storeName=family, fileCount=2, fileSize=73.1 K, priority=998, 
 time=525052314434020; duration=10sec
 2015-08-27 18:50:14,343 DEBUG 
 [RS:0;hemera:35619-shortCompactions-1440701403303] backup.HFileArchiver(438): 
 Finished archiving from class 
 org.apache.hadoop.hbase.backup.HFileArchiver$FileableStoreFile, 
 file:hdfs://localhost:34675/user/jenkins/test-data/19edea13-027b-4c6a-9f3f-edaf1fc590ab/data/default/tabletest/94d6f21f7cf387d73d8622f535c67311/family/2926c09f1941416eb557ee5d283d7e2b,
  to 
 hdfs://localhost:34675/user/jenkins/test-data/19edea13-027b-4c6a-9f3f-edaf1fc590ab/archive/data/default/tabletest/94d6f21f7cf387d73d8622f535c67311/family/2926c09f1941416eb557ee5d283d7e2b
 2015-08-27 18:50:14,347 

[jira] [Commented] (HBASE-14261) Enhance Chaos Monkey framework by adding zookeeper and datanode fault injections.

2015-08-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14718186#comment-14718186
 ] 

Hadoop QA commented on HBASE-14261:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12752938/HBASE-14261.branch-1_v2.patch
  against branch-1 branch at commit cc1542828de93b8d54cc14497fd5937989ea1b6d.
  ATTACHMENT ID: 12752938

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 30 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0 2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:red}-1 checkstyle{color}.  The applied patch generated 
3816 checkstyle errors (more than the master's current 3815 errors).

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+clusterManager.kill(ServiceType.HBASE_REGIONSERVER, 
serverName.getHostname(), serverName.getPort());
+clusterManager.stop(ServiceType.HBASE_REGIONSERVER, 
serverName.getHostname(), serverName.getPort());
+clusterManager.kill(ServiceType.HBASE_ZOOKEEPER, serverName.getHostname(), 
serverName.getPort());
+clusterManager.stop(ServiceType.HBASE_ZOOKEEPER, serverName.getHostname(), 
serverName.getPort());
+clusterManager.start(ServiceType.HADOOP_DATANODE, 
serverName.getHostname(), serverName.getPort());
+clusterManager.kill(ServiceType.HADOOP_DATANODE, serverName.getHostname(), 
serverName.getPort());
+clusterManager.stop(ServiceType.HADOOP_DATANODE, serverName.getHostname(), 
serverName.getPort());
+   * 1 SSH options, 2 user name , 3 @ if username is set, 4 host, 5 original 
command, 6 service user.
+  private static final String DEFAULT_TUNNEL_CMD = "/usr/bin/ssh %1$s 
%2$s%3$s%4$s \"sudo -u %6$s %5$s\"";
+  String cmd = String.format(tunnelCmd, sshOptions, sshUserName, at, 
hostname, remoteCmd, getServiceUser());

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

 {color:red}-1 core zombie tests{color}.  There are 5 zombie test(s):   
at 
org.apache.cloudstack.GetServiceProviderMetaDataCmdTest.testAuthenticate(GetServiceProviderMetaDataCmdTest.java:83)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15314//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15314//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15314//artifact/patchprocess/checkstyle-aggregate.html

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15314//console

This message is automatically generated.

 Enhance Chaos Monkey framework by adding zookeeper and datanode fault 
 injections.
 -

 Key: HBASE-14261
 URL: https://issues.apache.org/jira/browse/HBASE-14261
 Project: HBase
  Issue Type: Improvement
Reporter: Srikanth Srungarapu
Assignee: Srikanth Srungarapu
 Attachments: HBASE-14261-branch-1.patch, HBASE-14261.branch-1_v2.patch


 One of the shortcomings of existing ChaosMonkey framework is lack of fault 
 injections for hbase dependencies like zookeeper, hdfs etc. This patch 
 attempts to solve this problem partially by adding datanode and zk node fault 
 injections.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-13153) enable bulkload to support replication

2015-08-28 Thread Ashish Singhi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Singhi updated HBASE-13153:
--
Fix Version/s: 2.0.0
  Component/s: (was: API)
   Replication
   Issue Type: New Feature  (was: Bug)

 enable bulkload to support replication
 --

 Key: HBASE-13153
 URL: https://issues.apache.org/jira/browse/HBASE-13153
 Project: HBase
  Issue Type: New Feature
  Components: Replication
Reporter: sunhaitao
Assignee: Ashish Singhi
 Fix For: 2.0.0

 Attachments: HBase Bulk Load Replication.pdf


 Currently we plan to use the HBase replication feature to deal with a disaster 
 tolerance scenario. But we have encountered an issue: we use bulkload very 
 frequently, and because bulkload bypasses the write path it does not generate 
 WAL, so the data will not be replicated to the backup cluster. It is 
 inappropriate to bulkload twice, on both the active cluster and the backup 
 cluster. So I advise making some modifications to the bulkload feature to 
 enable bulkload to both the active cluster and the backup cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14327) TestIOFencing#testFencingAroundCompactionAfterWALSync is flaky

2015-08-28 Thread Heng Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14718387#comment-14718387
 ] 

Heng Chen commented on HBASE-14327:
---

As the test case's comment says:
{code}
Test that puts up a regionserver, starts a compaction on a loaded region but
holds the compaction completion until after we have killed the server and the
region has come up on a new regionserver altogether.
{code}

It seems we should check the store files on the new region before the 
compaction finishes. But look at the code below:
{code}
  LOG.info("Allowing compaction to proceed");
  compactingRegion.allowCompactions();
  while (compactingRegion.compactCount == 0) {
    Thread.sleep(1000);
  }
  // The server we killed stays up until the compaction that was started
  // before it was killed completes.  In logs you should see the old
  // regionserver now going down.
  LOG.info("Compaction finished");
  // After compaction of old region finishes on the server that was going
  // down, make sure that all the files we expect are still working when
  // region is up in new location.
  FileSystem fs = newRegion.getFilesystem();
  for (String f: newRegion.getStoreFileList(new byte [][] {FAMILY})) {
    assertTrue("After compaction, does not exist: " + f, fs.exists(new Path(f)));
  }
{code}

It seems we actually check the store files after the compaction finishes.  Is 
that right?






 TestIOFencing#testFencingAroundCompactionAfterWALSync is flaky
 --

 Key: HBASE-14327
 URL: https://issues.apache.org/jira/browse/HBASE-14327
 Project: HBase
  Issue Type: Bug
  Components: test
Reporter: Dima Spivak
Priority: Critical


[jira] [Updated] (HBASE-14331) a single callQueue related improvements

2015-08-28 Thread Hiroshi Ikeda (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hiroshi Ikeda updated HBASE-14331:
--
Attachment: CallQueuePerformanceTestApp.java

Added a simple application to check performance between {{LinkedBlockingQueue}} 
and {{ConcurrentLinkedQueue}} with {{Semaphore}}. The callQueue here is not the 
one in HBase but is defined inside as a mere simple interface. The application 
also shows how many threads reading from the queue were woken simultaneously.

In my environment, using {{ConcurrentLinkedQueue}} is about 1.5-2 times faster, 
aside from whether the performance test practically makes sense.
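For reference, a minimal sketch of the pattern being benchmarked (the class and 
method names are illustrative, not taken from the attached 
CallQueuePerformanceTestApp): a {{ConcurrentLinkedQueue}} paired with a 
{{Semaphore}} counting available elements, so takers block on the semaphore 
rather than on a queue lock.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Semaphore;

// Sketch of the ConcurrentLinkedQueue + Semaphore idea; names are
// illustrative, not from the attached test application.
class SemaphoreCallQueue<E> {
  private final ConcurrentLinkedQueue<E> queue = new ConcurrentLinkedQueue<>();
  private final Semaphore available = new Semaphore(0);

  void put(E e) {
    queue.offer(e);      // lock-free enqueue
    available.release(); // make one element claimable by a taker
  }

  E take() throws InterruptedException {
    available.acquire(); // blocks without holding any queue lock
    // Non-null is guaranteed: every successful acquire is matched by a
    // release that happened after an offer.
    return queue.poll();
  }
}
```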


 a single callQueue related improvements
 ---

 Key: HBASE-14331
 URL: https://issues.apache.org/jira/browse/HBASE-14331
 Project: HBase
  Issue Type: Improvement
  Components: IPC/RPC, Performance
Reporter: Hiroshi Ikeda
Priority: Minor
 Attachments: CallQueuePerformanceTestApp.java


 {{LinkedBlockingQueue}} separates the lock between the {{take}} method and 
 the {{put}} method well, but not between takers, and not between putters. 
 These methods take their locks at almost the very beginning of their logic. 
 HBASE-11355 introduces multiple call-queues to reduce such possible 
 congestion, but I doubt that it is required to stick to {{BlockingQueue}}.
 There are other shortcomings to using {{BlockingQueue}}. When using 
 multiple queues, since {{BlockingQueue}} blocks threads, it is required to 
 prepare enough threads for each queue. It is possible for one queue to be 
 starving for threads while threads sit idle on another queue. 
 Even if you can tune parameters to avoid such situations, the tuning is not 
 trivial.
 I suggest using a single {{ConcurrentLinkedQueue}} with {{Semaphore}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14279) Race condition in ConcurrentIndex

2015-08-28 Thread Hiroshi Ikeda (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14718316#comment-14718316
 ] 

Hiroshi Ikeda commented on HBASE-14279:
---

There is a similar class, {{KeyLocker}}, whose key type is a generic one, and 
you can use it without copying its implementation logic. The implementation of 
{{IdLocker}} and {{KeyLocker}} is not good and should be rewritten, and I 
believe it is better to avoid copying their implementation.

To begin with, using two {{ConcurrentHashMap}} objects is not good because 
{{putIfAbsent}} and {{remove}} just use lock striping internally, and the 
window of time in which threads may conflict doubles. It is better to use lock 
striping manually instead of {{ConcurrentHashMap}}.

BTW, {{ConcurrentHashMap.get}} locks the segment only when accessing a 
value, and it is an idiom to use {{get}} before {{putIfAbsent}}. This has the 
potential to reduce the granularity of locks, though it seems not trivial to 
implement.
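A sketch of that get-before-putIfAbsent idiom (the class name and value type 
here are hypothetical, not code from {{IdLocker}} or {{KeyLocker}}): the 
read-only {{get}} serves the common case without touching the write-side 
striped lock, falling back to {{putIfAbsent}} only on a miss.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical example of the get-before-putIfAbsent idiom.
class LockPool {
  private final ConcurrentMap<String, Object> locks = new ConcurrentHashMap<>();

  Object lockFor(String key) {
    Object lock = locks.get(key); // fast path: read-only lookup
    if (lock == null) {
      Object fresh = new Object();
      // Only the first racing thread installs its object; everyone else
      // adopts the winner's.
      Object raced = locks.putIfAbsent(key, fresh);
      lock = (raced == null) ? fresh : raced;
    }
    return lock;
  }
}
```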

There is another thing to be concerned about. From the point of view of 
object-oriented programming, exposing internal values via the {{values}} method 
breaks the invariant that an empty set should be removed from the map. The set 
should be defensively copied, though that requires changing the API, and it may 
also require changing the code of the clients which use this class, making the 
patch more likely to become stale.


 Race condition in ConcurrentIndex
 -

 Key: HBASE-14279
 URL: https://issues.apache.org/jira/browse/HBASE-14279
 Project: HBase
  Issue Type: Bug
Reporter: Hiroshi Ikeda
Assignee: Heng Chen
Priority: Minor
 Attachments: HBASE-14279.patch, HBASE-14279_v2.patch


 {{ConcurrentIndex.put}} and {{remove}} are in a race condition. It is possible 
 to remove a non-empty set, and to add a value to a removed set. Also, 
 {{ConcurrentIndex.values}} is vague in the sense that the returned set 
 sometimes tracks the current state and sometimes doesn't.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-14331) a single callQueue related improvements

2015-08-28 Thread Hiroshi Ikeda (JIRA)
Hiroshi Ikeda created HBASE-14331:
-

 Summary: a single callQueue related improvements
 Key: HBASE-14331
 URL: https://issues.apache.org/jira/browse/HBASE-14331
 Project: HBase
  Issue Type: Improvement
  Components: IPC/RPC, Performance
Reporter: Hiroshi Ikeda
Priority: Minor


{{LinkedBlockingQueue}} separates the lock between the {{take}} method and 
the {{put}} method well, but not between takers, and not between putters. These 
methods take their locks at almost the very beginning of their logic. 
HBASE-11355 introduces multiple call-queues to reduce such possible congestion, 
but I doubt that it is required to stick to {{BlockingQueue}}.

There are other shortcomings to using {{BlockingQueue}}. When using 
multiple queues, since {{BlockingQueue}} blocks threads, it is required to 
prepare enough threads for each queue. It is possible for one queue to be 
starving for threads while threads sit idle on another queue. Even 
if you can tune parameters to avoid such situations, the tuning is not 
trivial.

I suggest using a single {{ConcurrentLinkedQueue}} with {{Semaphore}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14279) Race condition in ConcurrentIndex

2015-08-28 Thread Heng Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Heng Chen updated HBASE-14279:
--
Status: Patch Available  (was: Open)

 Race condition in ConcurrentIndex
 -

 Key: HBASE-14279
 URL: https://issues.apache.org/jira/browse/HBASE-14279
 Project: HBase
  Issue Type: Bug
Reporter: Hiroshi Ikeda
Assignee: Heng Chen
Priority: Minor
 Attachments: HBASE-14279.patch, HBASE-14279_v2.patch


 {{ConcurrentIndex.put}} and {{remove}} are in a race condition. It is possible 
 to remove a non-empty set, and to add a value to a removed set. Also, 
 {{ConcurrentIndex.values}} is vague in the sense that the returned set 
 sometimes tracks the current state and sometimes doesn't.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14327) TestIOFencing#testFencingAroundCompactionAfterWALSync is flaky

2015-08-28 Thread Heng Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14718366#comment-14718366
 ] 

Heng Chen commented on HBASE-14327:
---

After analyzing the log, we can see that the store file which was not found had 
been archived to another place.
{code}
2015-08-27 18:50:14,341 DEBUG [RS:0;hemera:35619-longCompactions-1440701391112] 
backup.HFileArchiver(438): Finished archiving from class 
org.apache.hadoop.hbase.backup.HFileArchiver$FileableStoreFile, 
file:hdfs://localhost:34675/user/jenkins/test-data/19edea13-027b-4c6a-9f3f-edaf1fc590ab/data/default/tabletest/94d6f21f7cf387d73d8622f535c67311/family/7067addd325446089ba15ec2c77becbc,
 to 
hdfs://localhost:34675/user/jenkins/test-data/19edea13-027b-4c6a-9f3f-edaf1fc590ab/archive/data/default/tabletest/94d6f21f7cf387d73d8622f535c67311/family/7067addd325446089ba15ec2c77becbc
{code}

So the reader for this store file was closed, and an NPE is thrown when we use 
this reader to generate the log message:
{code}
for (StoreFile sf: sfs) {
  message.append(sf.getPath().getName());
  message.append(" (size=");
  message.append(TraditionalBinaryPrefix.long2String(sf.getReader().length(), "", 1));
  message.append("), ");
}
{code}

So, to fix this issue, we can stop archiving the store files, or delay the 
archiving.



 TestIOFencing#testFencingAroundCompactionAfterWALSync is flaky
 --

 Key: HBASE-14327
 URL: https://issues.apache.org/jira/browse/HBASE-14327
 Project: HBase
  Issue Type: Bug
  Components: test
Reporter: Dima Spivak
Priority: Critical


[jira] [Commented] (HBASE-14279) Race condition in ConcurrentIndex

2015-08-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14718440#comment-14718440
 ] 

Hadoop QA commented on HBASE-14279:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12752702/HBASE-14279_v2.patch
  against master branch at commit cc1542828de93b8d54cc14497fd5937989ea1b6d.
  ATTACHMENT ID: 12752702

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0 2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:red}-1 checkstyle{color}.  The applied patch generated 
1850 checkstyle errors (more than the master's current 1849 errors).

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

 {color:red}-1 core zombie tests{color}.  There are 10 zombie test(s):  
at 
org.apache.hadoop.hbase.mapreduce.MultiTableInputFormatTestBase.testScan(MultiTableInputFormatTestBase.java:255)
at 
org.apache.hadoop.hbase.mapreduce.MultiTableInputFormatTestBase.testScanEmptyToAPP(MultiTableInputFormatTestBase.java:202)
at 
org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFiles.testSimpleHFileSplit(TestLoadIncrementalHFiles.java:153)
at 
org.apache.hadoop.hbase.mapreduce.TestRowCounter.testRowCounterExclusiveColumn(TestRowCounter.java:111)
at 
org.apache.hadoop.hbase.mapreduce.TestImportTSVWithVisibilityLabels.testMROnTableWithDeletes(TestImportTSVWithVisibilityLabels.java:182)
at 
org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFiles.testRegionCrossingHFileSplit(TestLoadIncrementalHFiles.java:193)
at 
org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFiles.testRegionCrossingHFileSplit(TestLoadIncrementalHFiles.java:171)
at 
org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFiles.testSimpleHFileSplit(TestLoadIncrementalHFiles.java:153)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15315//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15315//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15315//artifact/patchprocess/checkstyle-aggregate.html

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15315//console

This message is automatically generated.

 Race condition in ConcurrentIndex
 -

 Key: HBASE-14279
 URL: https://issues.apache.org/jira/browse/HBASE-14279
 Project: HBase
  Issue Type: Bug
Reporter: Hiroshi Ikeda
Assignee: Heng Chen
Priority: Minor
 Attachments: HBASE-14279.patch, HBASE-14279_v2.patch


 {{ConcurrentIndex.put}} and {{remove}} are in a race condition. It is possible 
 to remove a non-empty set, and to add a value to a removed set. Also, 
 {{ConcurrentIndex.values}} is vague in the sense that the returned set 
 sometimes traces the current state and sometimes doesn't.
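One way to avoid both races (a hedged sketch using a hypothetical {{AtomicIndex}} class, not the actual {{ConcurrentIndex}} API) is to make put and remove atomic per key via {{ConcurrentHashMap.compute}}, so only an empty set is ever dropped and a value is never added to a removed set:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: an index from key -> set of values where put and
// remove are made atomic per key by ConcurrentHashMap.compute, so a
// non-empty set can never be removed and a value can never be added to an
// already-removed set.
public class AtomicIndex<K, V> {
    private final ConcurrentHashMap<K, Set<V>> index = new ConcurrentHashMap<>();

    public void put(K key, V value) {
        index.compute(key, (k, set) -> {
            if (set == null) {
                set = ConcurrentHashMap.newKeySet();
            }
            set.add(value); // runs atomically for this key
            return set;
        });
    }

    public void remove(K key, V value) {
        index.compute(key, (k, set) -> {
            if (set == null) {
                return null;
            }
            set.remove(value);
            return set.isEmpty() ? null : set; // only empty sets are dropped
        });
    }

    public Set<V> values(K key) {
        return index.get(key);
    }
}
```

This trades some per-key contention inside compute for the atomicity that the separate put/remove calls lack.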



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-14332) give the table state when we encountered exception while disable/enable table

2015-08-28 Thread Nick.han (JIRA)
Nick.han created HBASE-14332:


 Summary: give the table state when we encountered exception while 
disable/enable table
 Key: HBASE-14332
 URL: https://issues.apache.org/jira/browse/HBASE-14332
 Project: HBase
  Issue Type: Improvement
Affects Versions: 2.0.0
Reporter: Nick.han
Assignee: Nick.han
Priority: Minor
 Fix For: 2.0.0


This patch is a suggestion to improve the user experience.
Reason:
When we disable a table and the table is not enabled, we receive an exception, 
but the exception is too brief. Sometimes we want to know what state the table 
is in, so that we can understand why the table cannot be disabled. For example, 
I once encountered a problem where the table was neither disabled nor enabled 
after my region server crashed; if the table state were reported, the problem 
could be found more quickly.
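A minimal sketch of the suggested behavior (the class and enum here are illustrative stand-ins, not the real HBase API): include the table's current state in the rejection message instead of only the table name:

```java
// Illustrative sketch, not the actual HBase master code: when a disable
// request is rejected, report the table's current state so the operator can
// see *why* the disable cannot proceed (e.g. a table stuck in DISABLING
// after a region server crash).
public class DisableCheck {
    public enum TableState { ENABLED, DISABLED, ENABLING, DISABLING }

    public static void checkDisablable(String table, TableState state) {
        if (state != TableState.ENABLED) {
            // The state in the message is the improvement being proposed.
            throw new IllegalStateException(
                "Table " + table + " cannot be disabled: current state is " + state);
        }
    }
}
```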



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14325) Add snapshotinfo command to hbase script

2015-08-28 Thread Nick.han (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick.han updated HBASE-14325:
-
Attachment: (was: HBASE-14332.patch)

 Add snapshotinfo command to hbase script
 

 Key: HBASE-14325
 URL: https://issues.apache.org/jira/browse/HBASE-14325
 Project: HBase
  Issue Type: Improvement
  Components: scripts
Affects Versions: 2.0.0, 1.2.0
Reporter: Samir Ahmic
Assignee: Samir Ahmic
Priority: Minor
 Attachments: HBASE-14325.patch


 Since we already have commands like hbck, hfile, wal, etc. that are used for 
 getting various types of information about HBase components, it makes sense to 
 me to add the SnapshotInfo tool to the collection. 
 If nobody objects, I will add a patch for this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14325) Add snapshotinfo command to hbase script

2015-08-28 Thread Nick.han (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick.han updated HBASE-14325:
-
Attachment: HBASE-14332.patch

 Add snapshotinfo command to hbase script
 

 Key: HBASE-14325
 URL: https://issues.apache.org/jira/browse/HBASE-14325
 Project: HBase
  Issue Type: Improvement
  Components: scripts
Affects Versions: 2.0.0, 1.2.0
Reporter: Samir Ahmic
Assignee: Samir Ahmic
Priority: Minor
 Attachments: HBASE-14325.patch, HBASE-14332.patch


 Since we already have commands like hbck, hfile, wal, etc. that are used for 
 getting various types of information about HBase components, it makes sense to 
 me to add the SnapshotInfo tool to the collection. 
 If nobody objects, I will add a patch for this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14332) give the table state when we encountered exception while disable/enable table

2015-08-28 Thread Nick.han (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick.han updated HBASE-14332:
-
Attachment: HBASE-14332.patch

 give the table state when we encountered exception while disable/enable table
 -

 Key: HBASE-14332
 URL: https://issues.apache.org/jira/browse/HBASE-14332
 Project: HBase
  Issue Type: Improvement
Affects Versions: 2.0.0
Reporter: Nick.han
Assignee: Nick.han
Priority: Minor
 Fix For: 2.0.0

 Attachments: HBASE-14332.patch


 This patch is a suggestion to improve the user experience.
 Reason:
 When we disable a table and the table is not enabled, we receive an exception, 
 but the exception is too brief. Sometimes we want to know what state the table 
 is in, so that we can understand why the table cannot be disabled. For example, 
 I once encountered a problem where the table was neither disabled nor enabled 
 after my region server crashed; if the table state were reported, the problem 
 could be found more quickly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14289) Backport HBASE-13965 'Stochastic Load Balancer JMX Metrics' to 0.98

2015-08-28 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14289:
---
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Thanks for the review, Andrew.

 Backport HBASE-13965 'Stochastic Load Balancer JMX Metrics' to 0.98
 ---

 Key: HBASE-14289
 URL: https://issues.apache.org/jira/browse/HBASE-14289
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.98.15

 Attachments: 14289-0.98-v2.txt, 14289-0.98-v3.txt, 14289-0.98-v4.txt, 
 14289-0.98-v5.txt


 The default HBase load balancer (the Stochastic load balancer) is cost 
 function based. The cost function weights are tunable but no visibility into 
 those cost function results is directly provided.
 This issue backports HBASE-13965 to 0.98 branch to provide visibility via JMX 
 into each cost function of the stochastic load balancer, as well as the 
 overall cost of the balancing plan.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14332) give the table state when we encountered exception while disable/enable table

2015-08-28 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14332:
---
Status: Patch Available  (was: Open)

 give the table state when we encountered exception while disable/enable table
 -

 Key: HBASE-14332
 URL: https://issues.apache.org/jira/browse/HBASE-14332
 Project: HBase
  Issue Type: Improvement
Affects Versions: 2.0.0
Reporter: Nick.han
Assignee: Nick.han
Priority: Minor
 Fix For: 2.0.0

 Attachments: HBASE-14332.patch


 This patch is a suggestion to improve the user experience.
 Reason:
 When we disable a table and the table is not enabled, we receive an exception, 
 but the exception is too brief. Sometimes we want to know what state the table 
 is in, so that we can understand why the table cannot be disabled. For example, 
 I once encountered a problem where the table was neither disabled nor enabled 
 after my region server crashed; if the table state were reported, the problem 
 could be found more quickly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14332) Show the table state when we encounter exception while disabling / enabling table

2015-08-28 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14332:
---
Summary: Show the table state when we encounter exception while disabling / 
enabling table  (was: give the table state when we encountered exception while 
disable/enable table)

 Show the table state when we encounter exception while disabling / enabling 
 table
 -

 Key: HBASE-14332
 URL: https://issues.apache.org/jira/browse/HBASE-14332
 Project: HBase
  Issue Type: Improvement
Affects Versions: 2.0.0
Reporter: Nick.han
Assignee: Nick.han
Priority: Minor
 Fix For: 2.0.0

 Attachments: HBASE-14332.patch


 This patch is a suggestion to improve the user experience.
 Reason:
 When we disable a table and the table is not enabled, we receive an exception, 
 but the exception is too brief. Sometimes we want to know what state the table 
 is in, so that we can understand why the table cannot be disabled. For example, 
 I once encountered a problem where the table was neither disabled nor enabled 
 after my region server crashed; if the table state were reported, the problem 
 could be found more quickly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12187) Review in source the paper Simple Testing Can Prevent Most Critical Failures

2015-08-28 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718631#comment-14718631
 ] 

Sean Busbey commented on HBASE-12187:
-

[~d.yuan], did your changes to error-prone ever get accepted into the main 
error-prone repository? I'm having some difficulty finding them.

 Review in source the paper Simple Testing Can Prevent Most Critical Failures
 --

 Key: HBASE-12187
 URL: https://issues.apache.org/jira/browse/HBASE-12187
 Project: HBase
  Issue Type: Bug
Reporter: stack
Priority: Critical
 Attachments: HBASE-12187.patch, abortInOvercatch.warnings.txt, 
 emptyCatch.warnings.txt, todoInCatch.warnings.txt


 Review the helpful paper 
 https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-yuan.pdf
 It describes 'catastrophic failures', especially issues where exceptions are 
 thrown but not properly handled.  Their static analysis tool Aspirator turns 
 up a bunch of the obvious offenders (Lets add to test-patch.sh alongside 
 findbugs?).  This issue is about going through code base making sub-issues to 
 root out these and others (Don't we have the test described in figure #6 
 already? I thought we did?  If we don't, need to add).
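For illustration, here is the kind of swallowed-exception pattern Aspirator flags, next to a handled version (example code, not taken from the HBase code base):

```java
// Example of the failure pattern the paper targets: an exception is caught
// but silently dropped, so the caller proceeds on a bogus default value.
// The fix is to handle it, log it meaningfully, or rethrow with context.
public class CatchExample {
    public static int parsePortBad(String s) {
        try {
            return Integer.parseInt(s);
        } catch (NumberFormatException e) {
            // BUG: swallowed exception -- static checkers like Aspirator flag this
        }
        return 0; // silently falls through to a meaningless default
    }

    public static int parsePortGood(String s) {
        try {
            return Integer.parseInt(s);
        } catch (NumberFormatException e) {
            // propagate with context instead of swallowing
            throw new IllegalArgumentException("invalid port: " + s, e);
        }
    }
}
```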



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14331) a single callQueue related improvements

2015-08-28 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720048#comment-14720048
 ] 

Ted Yu commented on HBASE-14331:


In HBase, there is BoundedConcurrentLinkedQueue.
Can you expand your program by including BoundedConcurrentLinkedQueue?

Thanks

 a single callQueue related improvements
 ---

 Key: HBASE-14331
 URL: https://issues.apache.org/jira/browse/HBASE-14331
 Project: HBase
  Issue Type: Improvement
  Components: IPC/RPC, Performance
Reporter: Hiroshi Ikeda
Priority: Minor
 Attachments: CallQueuePerformanceTestApp.java


 {{LinkedBlockingQueue}} well separates locks between the {{take}} method and 
 the {{put}} method, but not between takers, and not between putters. These 
 methods are implemented to take locks almost at the beginning of their logic. 
 HBASE-11355 introduces multiple call-queues to reduce such possible 
 congestion, but I doubt that it is required to stick to {{BlockingQueue}}.
 There are the other shortcomings of using {{BlockingQueue}}. When using 
 multiple queues, since {{BlockingQueue}} blocks threads it is required to 
 prepare enough threads for each queue. It is possible that there is a queue 
 starving for threads while there is another queue where threads are idle. 
 Even if you can tune parameters to avoid such situations, the tuning is not 
 so trivial.
 I suggest using a single {{ConcurrentLinkedQueue}} with {{Semaphore}}.
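A sketch of that suggestion (simplified; a real patch would also need bounding and shutdown handling): a single lock-free queue whose takers wait on a semaphore that counts available items, so putters never block and takers never contend on a queue lock:

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Semaphore;

// Hedged sketch of the proposal: one lock-free ConcurrentLinkedQueue paired
// with a Semaphore counting available elements. Takers block on the
// semaphore rather than on a queue lock; putters never block at all.
public class SemaphoreQueue<E> {
    private final Queue<E> queue = new ConcurrentLinkedQueue<>();
    private final Semaphore available = new Semaphore(0);

    public void put(E e) {
        queue.offer(e);       // lock-free enqueue
        available.release();  // signal one waiting taker
    }

    public E take() throws InterruptedException {
        available.acquire();  // blocks only while the queue is logically empty
        return queue.poll();  // a permit guarantees an element is present
    }
}
```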



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-13469) [branch-1.1] Procedure V2 - Make procedure v2 configurable in branch-1.1

2015-08-28 Thread Stephen Yuan Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Yuan Jiang updated HBASE-13469:
---
Parent Issue: HBASE-14336  (was: HBASE-12439)

 [branch-1.1] Procedure V2 - Make procedure v2 configurable in branch-1.1
 

 Key: HBASE-13469
 URL: https://issues.apache.org/jira/browse/HBASE-13469
 Project: HBase
  Issue Type: Sub-task
  Components: master
Affects Versions: 1.1.0
Reporter: Enis Soztutar
Assignee: Stephen Yuan Jiang
 Fix For: 1.1.0

 Attachments: HBASE-13469.v2-branch-1.1.patch


 In branch-1, I think we want proc v2 to be configurable, so that if any 
 non-recoverable issue is found, at least there is a workaround. We already 
 have the handlers and code laying around. It will be just introducing the 
 config to enable / disable. We can even make it dynamically configurable via 
 the new framework. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-13415) Procedure V2 - Use nonces for double submits from client

2015-08-28 Thread Stephen Yuan Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Yuan Jiang updated HBASE-13415:
---
Parent Issue: HBASE-14336  (was: HBASE-12439)

 Procedure V2 - Use nonces for double submits from client
 

 Key: HBASE-13415
 URL: https://issues.apache.org/jira/browse/HBASE-13415
 Project: HBase
  Issue Type: Sub-task
  Components: master
Reporter: Enis Soztutar
Assignee: Stephen Yuan Jiang
Priority: Blocker
 Fix For: 2.0.0, 1.2.0, 1.3.0

 Attachments: HBASE-13415.v1-master.patch, 
 HBASE-13415.v2-master.patch, HBASE-13415.v3-master.patch


 The client can submit a procedure, but before getting the procId back, the 
 master might fail. In this case, the client request will fail and the client 
 will re-submit the request. With a 1.1 client, or if there is no contention for 
 the table lock, the time window is pretty small, but it might still happen. 
 If the proc was accepted and stored in the procedure store, a re-submit from 
 the client will add another procedure, which will execute after the first 
 one. The first one will likely succeed, and the second one will fail (for 
 example in the case of create table, the second one will throw 
 TableExistsException). 
 One idea is to use client generated nonces (that we already have) to guard 
 against these cases. The client will submit the request with the nonce and 
 the nonce will be saved together with the procedure in the store. In case of 
 a double submit, the nonce-cache is checked and the procId of the original 
 request is returned. 
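An illustrative sketch of the nonce-cache idea (the names here are hypothetical, not the actual procedure-store API): the first submission with a given nonce registers a procId, and a re-submission of the same nonce gets the original procId back instead of creating a duplicate procedure:

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the nonce idea, not the real procedure-store code:
// the nonce is saved alongside the procedure; a double submit hits the
// nonce cache and returns the original procId instead of enqueueing a
// second procedure.
public class NonceCache {
    private final ConcurrentHashMap<Long, Long> nonceToProcId = new ConcurrentHashMap<>();
    private long nextProcId = 1;

    public synchronized long submit(long nonce) {
        Long existing = nonceToProcId.get(nonce);
        if (existing != null) {
            return existing;          // double submit: return the original procId
        }
        long procId = nextProcId++;   // stand-in for storing a new procedure
        nonceToProcId.put(nonce, procId);
        return procId;
    }
}
```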



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-13211) Procedure V2 - master Enable/Disable table

2015-08-28 Thread Stephen Yuan Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Yuan Jiang updated HBASE-13211:
---
Parent Issue: HBASE-14336  (was: HBASE-12439)

 Procedure V2 - master Enable/Disable table
 --

 Key: HBASE-13211
 URL: https://issues.apache.org/jira/browse/HBASE-13211
 Project: HBase
  Issue Type: Sub-task
  Components: master
Affects Versions: 2.0.0, 1.1.0
Reporter: Stephen Yuan Jiang
Assignee: Stephen Yuan Jiang
  Labels: reliability
 Fix For: 2.0.0, 1.1.0

 Attachments: EnableDisableTableServer-no-gen-files.v1-master.patch, 
 HBASE-13211-v2-branch-1.patch, HBASE-13211-v2.patch

   Original Estimate: 120h
  Time Spent: 216h
  Remaining Estimate: 0h

 master side, part of HBASE-12439
 starts up the procedure executor on the master
 and replaces the enable/disable table handlers with the procedure version.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-13212) Procedure V2 - master Create/Modify/Delete namespace

2015-08-28 Thread Stephen Yuan Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Yuan Jiang updated HBASE-13212:
---
Parent Issue: HBASE-14336  (was: HBASE-12439)

 Procedure V2 - master Create/Modify/Delete namespace
 

 Key: HBASE-13212
 URL: https://issues.apache.org/jira/browse/HBASE-13212
 Project: HBase
  Issue Type: Sub-task
  Components: master
Affects Versions: 2.0.0
Reporter: Stephen Yuan Jiang
Assignee: Stephen Yuan Jiang
  Labels: reliability
 Fix For: 2.0.0, 1.3.0

 Attachments: HBASE-13212.v1-branch-1.patch, 
 HBASE-13212.v1-master.patch, HBASE-13212.v2-master.patch, 
 HBASE-13212.v3-master.patch

   Original Estimate: 168h
  Remaining Estimate: 168h

 master side, part of HBASE-12439
 starts up the procedure executor on the master
 and replaces the create/modify/delete namespace handlers with the procedure 
 version.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-13290) Procedure v2 - client enable/disable table sync

2015-08-28 Thread Stephen Yuan Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Yuan Jiang updated HBASE-13290:
---
Parent Issue: HBASE-14336  (was: HBASE-12439)

 Procedure v2 - client enable/disable table sync
 ---

 Key: HBASE-13290
 URL: https://issues.apache.org/jira/browse/HBASE-13290
 Project: HBase
  Issue Type: Sub-task
  Components: Client
Affects Versions: 2.0.0, 1.1.0
Reporter: Stephen Yuan Jiang
Assignee: Stephen Yuan Jiang
 Fix For: 2.0.0, 1.1.0

 Attachments: EnableDisableTableClientPV2-draft.v0-master.patch, 
 HBASE-13290-v2-branch-1.patch, HBASE-13290-v2.patch

   Original Estimate: 120h
  Time Spent: 72h
  Remaining Estimate: 0h

 client side, part of HBASE-12439/HBASE-13211
 it uses the new procedure code to know when the procedure is completed, 
 and to have proper sync/async behavior on enable/disable table.  
 For 1.1, it has to be binary compatible (1.0 client can talk to 1.1 server & 
 1.1 client can talk to 1.0 server). Binary compatibility is TBD for 2.0. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14289) Backport HBASE-13965 'Stochastic Load Balancer JMX Metrics' to 0.98

2015-08-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720901#comment-14720901
 ] 

Hudson commented on HBASE-14289:


FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #1058 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/1058/])
HBASE-14289 Revert due to compilation problem in downstream project(s) (tedyu: 
rev c78260cd1edc85a5cc1d7e7caaedf499b12e67b9)
* 
hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/master/balancer/MetricsStochasticBalancerSourceImpl.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestBaseLoadBalancer.java
* 
hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/master/balancer/MetricsStochasticBalancerSourceImpl.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/TestStochasticBalancerJmxMetrics.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/SimpleLoadBalancer.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/FavoredNodeLoadBalancer.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/MetricsStochasticBalancer.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/BaseLoadBalancer.java
* hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java
* 
hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/master/balancer/MetricsStochasticBalancerSource.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/MetricsBalancer.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java
* 
hbase-hadoop2-compat/src/main/resources/META-INF/services/org.apache.hadoop.hbase.master.balancer.MetricsStochasticBalancerSource
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStates.java
* 
hbase-hadoop1-compat/src/main/resources/META-INF/services/org.apache.hadoop.hbase.master.balancer.MetricsStochasticBalancerSource
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/LoadBalancer.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java


 Backport HBASE-13965 'Stochastic Load Balancer JMX Metrics' to 0.98
 ---

 Key: HBASE-14289
 URL: https://issues.apache.org/jira/browse/HBASE-14289
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.98.15

 Attachments: 14289-0.98-v2.txt, 14289-0.98-v3.txt, 14289-0.98-v4.txt, 
 14289-0.98-v5.txt


 The default HBase load balancer (the Stochastic load balancer) is cost 
 function based. The cost function weights are tunable but no visibility into 
 those cost function results is directly provided.
 This issue backports HBASE-13965 to 0.98 branch to provide visibility via JMX 
 into each cost function of the stochastic load balancer, as well as the 
 overall cost of the balancing plan.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14315) Save one call to KeyValueHeap.peek per row

2015-08-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720902#comment-14720902
 ] 

Hudson commented on HBASE-14315:


FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #1058 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/1058/])
HBASE-14315 Save one call to KeyValueHeap.peek per row. (larsh: rev 
7a4fa7f20cf8084ce57dcaa83e0c4f430d290736)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java


 Save one call to KeyValueHeap.peek per row
 --

 Key: HBASE-14315
 URL: https://issues.apache.org/jira/browse/HBASE-14315
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 2.0.0, 1.1.2, 1.3.0, 0.98.15, 1.2.1, 1.0.3

 Attachments: 14315-0.98.txt, 14315-master.txt


 Another one of my micro optimizations.
 In StoreScanner.next(...) we can actually save a call to KeyValueHeap.peek, 
 which in my runs of scan heavy loads shows up at top.
 Based on the run and the data, this can save between 3 and 10% of runtime.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment

2015-08-28 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720933#comment-14720933
 ] 

Elliott Clark commented on HBASE-6721:
--

bq.Maybe you can elaborate on this point?
Sure let me write up something more on that point.

bq.Let's not let the perfect be the enemy of the good.
I'm not asking for perfect. I'm asking not to take a half step back for most 
users so that one user can merge this feature and take a single step forward.

bq.Meanwhile we have a patch on deck and we need to be evaluating it and its 
contributor's concerns on their merit.
Yeah and I have evaluated it and used my knowledge and judgement and cast my 
vote on this patch and feature in accordance with the Apache Foundation rules 
(http://www.apache.org/foundation/voting.html#votes-on-code-modification)  . I 
have added my technical reasons. I have even outlined what I would need to vote 
a different way.

bq. the 0.90 master rewrite (remember that?)
Yeah that has since proved to be the wrong way to go. It put way too much in ZK 
and now we've spent years undoing it. We would have been better served with 
asking for parts to be pluggable and not on by default.

bq.or the memcache-based block cache carried
That followed the exact same bar that I'm requesting here. No added complexity 
on the default use case. I've even moved it into a different module so that it 
will be exactly what I'm asking for here.

 RegionServer Group based Assignment
 ---

 Key: HBASE-6721
 URL: https://issues.apache.org/jira/browse/HBASE-6721
 Project: HBase
  Issue Type: New Feature
Reporter: Francis Liu
Assignee: Francis Liu
  Labels: hbase-6721
 Attachments: 6721-master-webUI.patch, HBASE-6721 
 GroupBasedLoadBalancer Sequence Diagram.xml, HBASE-6721-DesigDoc.pdf, 
 HBASE-6721-DesigDoc.pdf, HBASE-6721-DesigDoc.pdf, HBASE-6721-DesigDoc.pdf, 
 HBASE-6721_0.98_2.patch, HBASE-6721_10.patch, HBASE-6721_11.patch, 
 HBASE-6721_12.patch, HBASE-6721_8.patch, HBASE-6721_9.patch, 
 HBASE-6721_9.patch, HBASE-6721_94.patch, HBASE-6721_94.patch, 
 HBASE-6721_94_2.patch, HBASE-6721_94_3.patch, HBASE-6721_94_3.patch, 
 HBASE-6721_94_4.patch, HBASE-6721_94_5.patch, HBASE-6721_94_6.patch, 
 HBASE-6721_94_7.patch, HBASE-6721_98_1.patch, HBASE-6721_98_2.patch, 
 HBASE-6721_hbase-6721_addendum.patch, HBASE-6721_trunk.patch, 
 HBASE-6721_trunk.patch, HBASE-6721_trunk.patch, HBASE-6721_trunk1.patch, 
 HBASE-6721_trunk2.patch, balanceCluster Sequence Diagram.svg, 
 immediateAssignments Sequence Diagram.svg, randomAssignment Sequence 
 Diagram.svg, retainAssignment Sequence Diagram.svg, roundRobinAssignment 
 Sequence Diagram.svg


 In multi-tenant deployments of HBase, it is likely that a RegionServer will 
 be serving out regions from a number of different tables owned by various 
 client applications. Being able to group a subset of running RegionServers 
 and assign specific tables to it, provides a client application a level of 
 isolation and resource allocation.
 The proposal essentially is to have an AssignmentManager which is aware of 
 RegionServer groups and assigns tables to region servers based on groupings. 
 Load balancing will occur on a per group basis as well. 
 This is essentially a simplification of the approach taken in HBASE-4120. See 
 attached document.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14283) Reverse scan doesn’t work with HFile inline index/bloom blocks

2015-08-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720960#comment-14720960
 ] 

Hadoop QA commented on HBASE-14283:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12753103/HBASE-14283-v2.patch
  against master branch at commit 0d06d8ddd0aa9aff7476fb6a7acd6af1d24ba3fc.
  ATTACHMENT ID: 12753103

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 11 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0 2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:red}-1 checkstyle{color}.  The applied patch generated 
1849 checkstyle errors (more than the master's current 1846 errors).

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15322//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15322//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15322//artifact/patchprocess/checkstyle-aggregate.html

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15322//console

This message is automatically generated.

 Reverse scan doesn’t work with HFile inline index/bloom blocks
 --

 Key: HBASE-14283
 URL: https://issues.apache.org/jira/browse/HBASE-14283
 Project: HBase
  Issue Type: Bug
Reporter: Ben Lau
Assignee: Ben Lau
 Attachments: HBASE-14283-v2.patch, HBASE-14283.patch, 
 hfile-seek-before.patch


 Reverse scans do not work if an HFile contains inline bloom blocks or leaf 
 level index blocks.  The reason is because the seekBefore() call calculates 
 the previous data block’s size by assuming data blocks are contiguous which 
 is not the case in HFile V2 and beyond.
 Attached is a first cut patch (targeting 
 bcef28eefaf192b0ad48c8011f98b8e944340da5 on trunk) which includes:
 (1) a unit test which exposes the bug and demonstrates failures for both 
 inline bloom blocks and inline index blocks
 (2) a proposed fix for inline index blocks that does not require a new HFile 
 version change, but is only performant for 1 and 2-level indexes and not 3+.  
 3+ requires an HFile format update for optimal performance.
 This patch does not fix the bloom filter blocks bug.  But the fix should be 
 similar to the case of inline index blocks.  The reason I haven’t made the 
 change yet is I want to confirm that you guys would be fine with me revising 
 the HFile.Reader interface.
 Specifically, these 2 functions (getGeneralBloomFilterMetadata and 
 getDeleteBloomFilterMetadata) need to return the BloomFilter.  Right now the 
 HFileReader class doesn’t have a reference to the bloom filters (and hence 
 their indices) and only constructs the IO streams and hence has no way to 
 know where the bloom blocks are in the HFile.  It seems that the HFile.Reader 
 bloom method comments state that they “know nothing about how that metadata 
 is structured” but I do not know if that is a requirement of the abstraction 
 (why?) or just an incidental current property. 
 We would like to do 3 things with community approval:
 (1) Update the HFile.Reader interface and implementation to contain and 
 return BloomFilters directly rather than unstructured IO streams
 (2) Merge the fixes for index blocks and bloom blocks into open source
 (3) Create a new Jira ticket for open source HBase to add a ‘prevBlockSize’ 
 field in the block header in the next HFile version, so that seekBefore() 
 calls can not only be correct but performant in all cases.
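The contiguity assumption can be sketched numerically. This is a minimal illustration only: the class, constants, and method below are hypothetical, not the actual HBase reader code, and the offsets are made up for the example.

```java
// Flawed seekBefore() arithmetic described above: with an inline index
// block sitting between two data blocks, the previous data block does NOT
// end where the current one begins, so "current offset - previous offset"
// overstates the previous data block's size.
public class SeekBeforeSketch {
    // fictional block layout: data1, inline leaf index, data2
    static final long DATA1_OFFSET = 0, DATA1_SIZE = 100;
    static final long LEAF_INDEX_SIZE = 40;
    static final long DATA2_OFFSET = DATA1_OFFSET + DATA1_SIZE + LEAF_INDEX_SIZE;

    // flawed: assumes the previous data block ends where this one begins
    static long assumedPrevBlockSize(long curDataOffset, long prevDataOffset) {
        return curDataOffset - prevDataOffset;
    }

    public static void main(String[] args) {
        long assumed = assumedPrevBlockSize(DATA2_OFFSET, DATA1_OFFSET);
        // the inline index block's 40 bytes get miscounted as data
        System.out.println("assumed=" + assumed + " actual=" + DATA1_SIZE);
        // prints: assumed=140 actual=100
    }
}
```

In HFile v1 the subtraction was safe because data blocks really were laid out back to back; inline blocks break that invariant.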



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-14336) Procedure V2 Phase 1 - Procedure Framework and Making DDL Operations fault tolerant

2015-08-28 Thread Stephen Yuan Jiang (JIRA)
Stephen Yuan Jiang created HBASE-14336:
--

 Summary: Procedure V2 Phase 1 - Procedure Framework and Making DDL 
Operations fault tolerant
 Key: HBASE-14336
 URL: https://issues.apache.org/jira/browse/HBASE-14336
 Project: HBase
  Issue Type: Task
  Components: proc-v2
Affects Versions: 1.1.0, 2.0.0, 1.2.0, 1.3.0
Reporter: Stephen Yuan Jiang
Assignee: Stephen Yuan Jiang


This is the first phase of Procedure V2 (HBASE-12439):
- Core framework
- re-implement Namespace/Table/Column DDLs as multi-step procedures with
rollback/roll-forward ability in case of failure





[jira] [Updated] (HBASE-14108) Administrative Task: provide an API to abort a procedure

2015-08-28 Thread Stephen Yuan Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Yuan Jiang updated HBASE-14108:
---
Parent Issue: HBASE-14336  (was: HBASE-12439)

 Administrative Task: provide an API to abort a procedure
 

 Key: HBASE-14108
 URL: https://issues.apache.org/jira/browse/HBASE-14108
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Affects Versions: 2.0.0, 1.3.0
Reporter: Stephen Yuan Jiang
Assignee: Stephen Yuan Jiang
 Attachments: HBASE-14108.v1-master.patch


 With Procedure-V2 in production since the HBase 1.1 release, there is a need
 to abort a procedure (e.g. a long-running procedure that is stuck somewhere
 and blocks others).
 This task tracks the work to provide an API to abort a procedure (either
 rolling it back or simply quitting).  This API could be used from either the
 shell or the Web UI.





[jira] [Updated] (HBASE-14212) Add IT test for procedure-v2-based namespace DDL

2015-08-28 Thread Stephen Yuan Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Yuan Jiang updated HBASE-14212:
---
Parent Issue: HBASE-14336  (was: HBASE-12439)

 Add IT test for procedure-v2-based namespace DDL
 

 Key: HBASE-14212
 URL: https://issues.apache.org/jira/browse/HBASE-14212
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Affects Versions: 2.0.0, 1.3.0
Reporter: Stephen Yuan Jiang
Assignee: Stephen Yuan Jiang

 An integration test for proc-v2-based table DDLs was created in HBASE-12439
 during the HBase 1.1 release.  With HBASE-13212, proc-v2-based namespace DDLs
 are introduced.  We need to enhance the IT from HBASE-12439 to include
 namespace DDLs.





[jira] [Updated] (HBASE-13993) WALProcedureStore fencing is not effective if new WAL rolls

2015-08-28 Thread Stephen Yuan Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Yuan Jiang updated HBASE-13993:
---
Parent Issue: HBASE-14336  (was: HBASE-12439)

 WALProcedureStore fencing is not effective if new WAL rolls 
 

 Key: HBASE-13993
 URL: https://issues.apache.org/jira/browse/HBASE-13993
 Project: HBase
  Issue Type: Sub-task
  Components: master
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 2.0.0, 1.2.0, 1.1.2

 Attachments: HBASE-13993-v2.patch, HBASE-13993-v3.patch, 
 hbase-13993_v1.patch


 WAL fencing for the WALProcedureStore is a bit different from the fencing
 done for region server WALs.
 With the following sequence of events, the WAL is not fenced (especially with
 the HBASE-13832 patch):
  - master1 creates WAL with logId = 1: 
 {{/MasterProcWALs/state-0001.log}} 
  - master2 takes over, fences logId = 1 with recoverLease(), creates logId=2: 
 {{/MasterProcWALs/state-0002.log}}.
  - master2 writes some procedures and rolls the logId2, and creates logId = 
 3, and deletes logId = 2. 
  - master1 now tries to write a procedure, gets lease mismatch, rolls the log 
 from 1 to 2, and succeeds the write since it can write logId = 2 (master2 
 uses logId=3 now). 
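The hole in the fencing can be modeled as a tiny state machine over log ids. The sketch below is purely illustrative (the class and method are hypothetical, not WALProcedureStore code): recoverLease() only fences the file that exists at takeover time, so once the new master rolls past an id and deletes it, that id becomes claimable again by the old master.

```java
import java.util.TreeSet;

public class FencingSketch {
    /** Replays the event sequence above; returns the log id the old master
     *  ends up writing to successfully, or -1 if it is properly fenced. */
    static int replay() {
        TreeSet<Integer> live = new TreeSet<>();
        live.add(1);                          // master1 creates state-0001.log
        live.remove(1); live.add(2);          // master2 fences log 1, creates log 2
        live.remove(2); live.add(3);          // master2 rolls 2 -> 3, deletes log 2
        // master1 hits a lease mismatch on log 1 and rolls 1 -> 2;
        // nothing holds id 2 any more, so the write succeeds unfenced
        if (!live.contains(2)) { live.add(2); return 2; }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("old master wrote log " + replay());
        // prints: old master wrote log 2
    }
}
```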
  





[jira] [Updated] (HBASE-13832) Procedure V2: master fail to start due to WALProcedureStore sync failures when HDFS data nodes count is low

2015-08-28 Thread Stephen Yuan Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Yuan Jiang updated HBASE-13832:
---
Parent Issue: HBASE-14336  (was: HBASE-12439)

 Procedure V2: master fail to start due to WALProcedureStore sync failures 
 when HDFS data nodes count is low
 ---

 Key: HBASE-13832
 URL: https://issues.apache.org/jira/browse/HBASE-13832
 Project: HBase
  Issue Type: Sub-task
  Components: master, proc-v2
Affects Versions: 2.0.0, 1.1.0, 1.2.0
Reporter: Stephen Yuan Jiang
Assignee: Matteo Bertozzi
Priority: Blocker
 Fix For: 2.0.0, 1.2.0, 1.1.2, 1.3.0

 Attachments: HBASE-13832-v0.patch, HBASE-13832-v1.patch, 
 HBASE-13832-v2.patch, HBASE-13832-v4.patch, HBASE-13832-v5.patch, 
 HBASE-13832-v6.patch, HDFSPipeline.java, hbase-13832-test-hang.patch, 
 hbase-13832-v3.patch


 When the data node count is < 3, we get a failure in
 WALProcedureStore#syncLoop() during master start.  The failure prevents the
 master from starting.
 {noformat}
 2015-05-29 13:27:16,625 ERROR [WALProcedureStoreSyncThread] 
 wal.WALProcedureStore: Sync slot failed, abort.
 java.io.IOException: Failed to replace a bad datanode on the existing 
 pipeline due to no more good datanodes being available to try. (Nodes: 
 current=[DatanodeInfoWithStorage[10.333.444.555:50010,DS-3ced-93f4-47b6-9c23-1426f7a6acdc,DISK],
  
 DatanodeInfoWithStorage[10.222.666.777:50010,DS-f9c983b4-1f10-4d5e-8983-490ece56c772,DISK]],
  
 original=[DatanodeInfoWithStorage[10.333.444.555:50010,DS-3ced-93f4-47b6-9c23-1426f7a6acdc,DISK],
  DatanodeInfoWithStorage[10.222.666.777:50010,DS-f9c983b4-1f10-4d5e-8983-
 490ece56c772,DISK]]). The current failed datanode replacement policy is 
 DEFAULT, and a client may configure this via 
 'dfs.client.block.write.replace-datanode-on-failure.policy'  in its 
 configuration.
   at 
 org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:951)
 {noformat}
 One proposal is to implement logic similar to FSHLog: if an IOException is
 thrown during syncLoop in WALProcedureStore#start(), instead of aborting
 immediately, we could try to roll the log and see whether that resolves the
 issue; if the new log cannot be created, or rolling the log throws further
 exceptions, we then abort.
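The roll-before-abort idea can be sketched as follows. This is a hedged sketch of the proposal, not the committed fix; the interface and method names are invented for illustration.

```java
import java.io.IOException;

public class SyncLoopSketch {
    // minimal stand-in for the store's WAL (hypothetical interface)
    interface Wal {
        void sync() throws IOException;
        boolean roll(); // true if a new log was created successfully
    }

    /** One sync attempt; returns true if the store should abort. */
    static boolean syncOnce(Wal wal) {
        try {
            wal.sync();
            return false;
        } catch (IOException e) {
            // instead of aborting immediately, attempt one log roll;
            // abort only if the new log cannot be created either
            return !wal.roll();
        }
    }

    public static void main(String[] args) {
        Wal brokenPipeline = new Wal() {
            public void sync() throws IOException { throw new IOException("bad pipeline"); }
            public boolean roll() { return true; } // roll succeeds -> keep going
        };
        System.out.println("abort=" + syncOnce(brokenPipeline));
        // prints: abort=false
    }
}
```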





[jira] [Updated] (HBASE-13950) Add a NoopProcedureStore for testing

2015-08-28 Thread Stephen Yuan Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Yuan Jiang updated HBASE-13950:
---
Parent Issue: HBASE-14336  (was: HBASE-12439)

 Add a NoopProcedureStore for testing
 

 Key: HBASE-13950
 URL: https://issues.apache.org/jira/browse/HBASE-13950
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Affects Versions: 2.0.0, 1.2.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Trivial
 Fix For: 2.0.0, 1.2.0

 Attachments: HBASE-13950-v0-branch-1.patch, 
 HBASE-13950-v1-branch-1.patch, HBASE-13950-v1.patch


 Add a NoopProcedureStore and a helper in ProcedureTestingUtil to
 submitAndWait() a procedure without having to do anything else.
 This is useful to avoid extra code, as in the case of
 TestAssignmentManager.processServerShutdownHandler()





[jira] [Updated] (HBASE-14051) Undo workarounds in IntegrationTestDDLMasterFailover for client double submit

2015-08-28 Thread Stephen Yuan Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Yuan Jiang updated HBASE-14051:
---
Parent Issue: HBASE-14336  (was: HBASE-12439)

 Undo workarounds in IntegrationTestDDLMasterFailover for client double submit
 -

 Key: HBASE-14051
 URL: https://issues.apache.org/jira/browse/HBASE-14051
 Project: HBase
  Issue Type: Sub-task
  Components: master
Reporter: Enis Soztutar
Assignee: Sophia Feng
 Fix For: 2.0.0, 1.2.0, 1.3.0


 Now that nonce support for dealing with double-submits from the client
 (HBASE-13415) is committed, we should undo the workarounds done in
 HBASE-13470 for this part.  We needed these workarounds for 1.1.1, but we
 should not keep them if we want proper testing for 1.2+.
 [~syuanjiang], [~fengs], [~mbertozzi] FYI. 





[jira] [Updated] (HBASE-14107) Administrative Task: Provide an API to List all procedures

2015-08-28 Thread Stephen Yuan Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Yuan Jiang updated HBASE-14107:
---
Parent Issue: HBASE-14336  (was: HBASE-12439)

 Administrative Task: Provide an API to List all procedures 
 ---

 Key: HBASE-14107
 URL: https://issues.apache.org/jira/browse/HBASE-14107
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Affects Versions: 2.0.0, 1.3.0
Reporter: Stephen Yuan Jiang
Assignee: Stephen Yuan Jiang

 With Procedure V2 in production since the HBase 1.1 release, there is a need
 to list all procedures (running, queued, recently completed) from the HBase
 shell (or Web UI).
 This JIRA tracks the work to add an API to list all procedures.





[jira] [Updated] (HBASE-14017) Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue deletion

2015-08-28 Thread Stephen Yuan Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Yuan Jiang updated HBASE-14017:
---
Parent Issue: HBASE-14336  (was: HBASE-12439)

 Procedure v2 - MasterProcedureQueue fix concurrency issue on table queue 
 deletion
 -

 Key: HBASE-14017
 URL: https://issues.apache.org/jira/browse/HBASE-14017
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Affects Versions: 2.0.0, 1.2.0, 1.1.1, 1.3.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Blocker
 Fix For: 2.0.0, 1.2.0, 1.1.2

 Attachments: HBASE-14017-v0.patch, HBASE-14017-v0.patch, 
 HBASE-14017.as-pushed-master.patch, HBASE-14017.v1-branch1.1.patch, 
 HBASE-14017.v1-branch1.1.patch


 [~syuanjiang] found a concurrency issue in the procedure queue delete where
 we don't have an exclusive lock before deleting the table queue
 {noformat}
 Thread 1: Create table is running - the queue is empty and wlock is false 
 Thread 2: markTableAsDeleted see the queue empty and wlock= false
 Thread 1: tryWrite() set wlock=true; too late
 Thread 2: delete the queue
 Thread 1: never able to release the lock - NPE when trying to get the queue
 {noformat}
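The race and the shape of the fix can be condensed into a few lines. This is an illustrative model only (the real fix lives in MasterProcedureQueue, and these names are invented): the empty/wlock check and the queue deletion must run under the same monitor the writer uses to acquire wlock, so the interleaving in the ticket cannot delete the queue between the check and tryWrite().

```java
public class QueueDeleteSketch {
    private boolean wlock = false;
    private boolean deleted = false;

    /** Writer (e.g. create table): acquire the exclusive lock atomically. */
    synchronized boolean tryWrite() {
        if (deleted || wlock) return false;
        wlock = true;
        return true;
    }

    /** Cleaner: delete only while no writer holds (or can race for) the lock. */
    synchronized boolean markTableAsDeleted(boolean queueEmpty) {
        if (queueEmpty && !wlock) { deleted = true; return true; }
        return false;
    }

    public static void main(String[] args) {
        QueueDeleteSketch q = new QueueDeleteSketch();
        q.tryWrite();                                    // thread 1 wins the lock
        System.out.println(q.markTableAsDeleted(true));  // thread 2 cannot delete
        // prints: false
    }
}
```

Because both methods are synchronized on the same object, "see wlock == false" and "delete the queue" are one atomic step, which is exactly what the interleaving in the {noformat} block violates.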





[jira] [Updated] (HBASE-14334) Move Memcached block cache in to it's own optional module.

2015-08-28 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-14334:
--
Issue Type: Improvement  (was: Bug)

 Move Memcached block cache in to it's own optional module.
 --

 Key: HBASE-14334
 URL: https://issues.apache.org/jira/browse/HBASE-14334
 Project: HBase
  Issue Type: Improvement
Affects Versions: 1.2.0
Reporter: Elliott Clark
Assignee: Elliott Clark
 Fix For: 2.0.0, 1.2.0

 Attachments: HBASE-14334.patch








[jira] [Commented] (HBASE-14315) Save one call to KeyValueHeap.peek per row

2015-08-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720908#comment-14720908
 ] 

Hudson commented on HBASE-14315:


FAILURE: Integrated in HBase-1.3 #139 (See 
[https://builds.apache.org/job/HBase-1.3/139/])
HBASE-14315 Save one call to KeyValueHeap.peek per row. (larsh: rev 
c277166fd1f3104c0db9011f700eebf404b812ad)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java


 Save one call to KeyValueHeap.peek per row
 --

 Key: HBASE-14315
 URL: https://issues.apache.org/jira/browse/HBASE-14315
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 2.0.0, 1.1.2, 1.3.0, 0.98.15, 1.2.1, 1.0.3

 Attachments: 14315-0.98.txt, 14315-master.txt


 Another one of my micro optimizations.
 In StoreScanner.next(...) we can actually save a call to KeyValueHeap.peek,
 which in my runs of scan-heavy loads shows up at the top of profiles.
 Depending on the run and data, this can save between 3 and 10% of runtime.





[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment

2015-08-28 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720921#comment-14720921
 ] 

Andrew Purtell commented on HBASE-6721:
---

Let's not let the perfect be the enemy of the good. A proper multi-layer 
admission control change isn't on the table, it isn't on anyone's roadmap, it 
isn't even something proposed on a JIRA and/or in a design document. Even if we 
have a proposal for HBase this will certainly be considered imperfect and 
incomplete by some without 100% agreement and a plan at the HDFS level, and 
getting that is as likely as finding a unicorn wandering around downtown SF. 
(Ok... maybe a horse dressed to look like a unicorn could be a thing...)  
Meanwhile we have a patch on deck and we need to be evaluating it and its 
contributor's concerns on their merit.

This is something that one of our esteemed users runs in production, is 
persistent about getting in and responsive to feedback, and both of those 
things in my opinion should carry a lot of weight. The same kind of weight that 
previously proposed changes like HFileV2 or the 0.90 master rewrite (remember 
that?), or the memcache-based block cache carried, or pending IPv6 related 
changes.

That said, I don't think it's ready to be merged into master. We have it up in 
a feature branch. Let's continue that, address concerns, make sure it's totally 
optional for those who don't want it, measure its impact. 

 RegionServer Group based Assignment
 ---

 Key: HBASE-6721
 URL: https://issues.apache.org/jira/browse/HBASE-6721
 Project: HBase
  Issue Type: New Feature
Reporter: Francis Liu
Assignee: Francis Liu
  Labels: hbase-6721
 Attachments: 6721-master-webUI.patch, HBASE-6721 
 GroupBasedLoadBalancer Sequence Diagram.xml, HBASE-6721-DesigDoc.pdf, 
 HBASE-6721-DesigDoc.pdf, HBASE-6721-DesigDoc.pdf, HBASE-6721-DesigDoc.pdf, 
 HBASE-6721_0.98_2.patch, HBASE-6721_10.patch, HBASE-6721_11.patch, 
 HBASE-6721_12.patch, HBASE-6721_8.patch, HBASE-6721_9.patch, 
 HBASE-6721_9.patch, HBASE-6721_94.patch, HBASE-6721_94.patch, 
 HBASE-6721_94_2.patch, HBASE-6721_94_3.patch, HBASE-6721_94_3.patch, 
 HBASE-6721_94_4.patch, HBASE-6721_94_5.patch, HBASE-6721_94_6.patch, 
 HBASE-6721_94_7.patch, HBASE-6721_98_1.patch, HBASE-6721_98_2.patch, 
 HBASE-6721_hbase-6721_addendum.patch, HBASE-6721_trunk.patch, 
 HBASE-6721_trunk.patch, HBASE-6721_trunk.patch, HBASE-6721_trunk1.patch, 
 HBASE-6721_trunk2.patch, balanceCluster Sequence Diagram.svg, 
 immediateAssignments Sequence Diagram.svg, randomAssignment Sequence 
 Diagram.svg, retainAssignment Sequence Diagram.svg, roundRobinAssignment 
 Sequence Diagram.svg


 In multi-tenant deployments of HBase, it is likely that a RegionServer will
 be serving out regions from a number of different tables owned by various
 client applications.  Being able to group a subset of running RegionServers
 and assign specific tables to it provides a client application a level of
 isolation and resource allocation.
 The proposal essentially is to have an AssignmentManager which is aware of
 RegionServer groups and assigns tables to region servers based on groupings.
 Load balancing will occur on a per-group basis as well.
 This is essentially a simplification of the approach taken in HBASE-4120.
 See the attached document.
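At its core, group-based assignment is two maps: table to group, and group to servers, with assignment constrained to the table's group. The toy model below is an assumption-laden sketch (all names invented, no relation to the actual GroupBasedLoadBalancer API):

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GroupAssignmentSketch {
    static Map<String, String> tableToGroup = new HashMap<>();
    static Map<String, List<String>> groupToServers = new HashMap<>();

    /** Pick a server for a region, only from the table's own group. */
    static String assign(String table, int regionIndex) {
        List<String> servers = groupToServers.get(tableToGroup.get(table));
        return servers.get(regionIndex % servers.size()); // round-robin in group
    }

    public static void main(String[] args) {
        groupToServers.put("analytics", Arrays.asList("rs1", "rs2"));
        groupToServers.put("online", Arrays.asList("rs3"));
        tableToGroup.put("clicks", "analytics");
        tableToGroup.put("users", "online");
        System.out.println(assign("clicks", 0)); // rs1
        System.out.println(assign("users", 7));  // rs3
    }
}
```

The isolation property falls out of the lookup: a table in "online" can never land on an "analytics" server, no matter how loaded its own group gets.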





[jira] [Commented] (HBASE-14322) Master still not using more than it's priority threads

2015-08-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720938#comment-14720938
 ] 

Hudson commented on HBASE-14322:


FAILURE: Integrated in HBase-1.2 #143 (See 
[https://builds.apache.org/job/HBase-1.2/143/])
HBASE-14322 Add a master priority function to let master use it's threads 
(eclark: rev 5f15583d342cd3129648452cdd7a6c686ad7fa81)
* hbase-server/src/test/java/org/apache/hadoop/hbase/QosTestHelper.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/AnnotationReadingPriorityFunction.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterRpcServices.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterAnnotationReadingPriorityFunction.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterQosFunction.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestQosFunction.java


 Master still not using more than it's priority threads
 --

 Key: HBASE-14322
 URL: https://issues.apache.org/jira/browse/HBASE-14322
 Project: HBase
  Issue Type: Sub-task
  Components: master, rpc
Affects Versions: 1.2.0
Reporter: Elliott Clark
Assignee: Elliott Clark
 Fix For: 2.0.0, 1.2.0

 Attachments: HBASE-14322-v1.patch, HBASE-14322-v2.patch, 
 HBASE-14322-v3-branch-1.patch, HBASE-14322-v3.patch, 
 HBASE-14322-v4-branch-1.patch, HBASE-14322-v5-branch-1.patch, 
 HBASE-14322-v6.patch, HBASE-14322-v7.patch, HBASE-14322.patch


 Master and regionserver will be running as the same user.
 Superusers by default includes the current user as a super user, and super
 users' requests always go to the priority threads.





[jira] [Updated] (HBASE-13204) Procedure v2 - client create/delete table sync

2015-08-28 Thread Stephen Yuan Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Yuan Jiang updated HBASE-13204:
---
Parent Issue: HBASE-14336  (was: HBASE-12439)

 Procedure v2 - client create/delete table sync
 --

 Key: HBASE-13204
 URL: https://issues.apache.org/jira/browse/HBASE-13204
 Project: HBase
  Issue Type: Sub-task
  Components: master
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Fix For: 2.0.0, 1.1.0

 Attachments: HBASE-13204-v0-branch-1.patch, HBASE-13204-v0.patch


 client side, part of HBASE-12439/HBASE-13203
 it uses the new procedure code to know when the procedure is completed, and
 to have proper sync behavior on create/delete table.
 Review: https://reviews.apache.org/r/32391/





[jira] [Updated] (HBASE-13209) Procedure V2 - master Add/Modify/Delete Column Family

2015-08-28 Thread Stephen Yuan Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Yuan Jiang updated HBASE-13209:
---
Parent Issue: HBASE-14336  (was: HBASE-12439)

 Procedure V2 - master Add/Modify/Delete Column Family
 -

 Key: HBASE-13209
 URL: https://issues.apache.org/jira/browse/HBASE-13209
 Project: HBase
  Issue Type: Sub-task
  Components: master
Affects Versions: 2.0.0, 1.1.0
Reporter: Stephen Yuan Jiang
Assignee: Stephen Yuan Jiang
  Labels: reliability
 Fix For: 2.0.0, 1.1.0

 Attachments: AlterColumnFamily-no-gen-file.v1-master.patch, 
 HBASE-13209-v2.patch, HBASE-13209-v3-branch-1.patch, HBASE-13209-v3.patch

   Original Estimate: 168h
  Time Spent: 384h
  Remaining Estimate: 0h

 master side, part of HBASE-12439
 starts up the procedure executor on the master
 and replaces the add/modify/delete handlers with the procedure version.





[jira] [Updated] (HBASE-13210) Procedure V2 - master Modify table

2015-08-28 Thread Stephen Yuan Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Yuan Jiang updated HBASE-13210:
---
Parent Issue: HBASE-14336  (was: HBASE-12439)

 Procedure V2 - master Modify table
 --

 Key: HBASE-13210
 URL: https://issues.apache.org/jira/browse/HBASE-13210
 Project: HBase
  Issue Type: Sub-task
  Components: master
Affects Versions: 2.0.0, 1.1.0
Reporter: Stephen Yuan Jiang
Assignee: Stephen Yuan Jiang
  Labels: reliablity
 Fix For: 2.0.0, 1.1.0

 Attachments: HBASE-13210-v2.patch, HBASE-13210-v3.patch, 
 ModifyTableProcedure-no-gen-file.v1-master.patch

   Original Estimate: 72h
  Time Spent: 168h
  Remaining Estimate: 0h

 master side, part of HBASE-12439
 starts up the procedure executor on the master
 and replaces the modify table handlers with the procedure version.





[jira] [Updated] (HBASE-13202) Procedure v2 - core framework

2015-08-28 Thread Stephen Yuan Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Yuan Jiang updated HBASE-13202:
---
Parent Issue: HBASE-14336  (was: HBASE-12439)

 Procedure v2 - core framework
 -

 Key: HBASE-13202
 URL: https://issues.apache.org/jira/browse/HBASE-13202
 Project: HBase
  Issue Type: Sub-task
  Components: master
Affects Versions: 2.0.0, 1.1.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 2.0.0, 1.1.0

 Attachments: HBASE-13202-v0-hbase-12439.patch, 
 HBASE-13202-v1-hbase-12439.patch, HBASE-13202-v2.patch, 
 HBASE-13203-v2-branch_1.patch, ProcedureV2-overview.pdf


 core package, part of HBASE-12439
 this is just the proc-v2 submodule. it depends only on hbase-common.
 https://reviews.apache.org/r/27703/





[jira] [Updated] (HBASE-13203) Procedure v2 - master create/delete table

2015-08-28 Thread Stephen Yuan Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Yuan Jiang updated HBASE-13203:
---
Parent Issue: HBASE-14336  (was: HBASE-12439)

 Procedure v2 - master create/delete table
 -

 Key: HBASE-13203
 URL: https://issues.apache.org/jira/browse/HBASE-13203
 Project: HBase
  Issue Type: Sub-task
  Components: master
Affects Versions: 2.0.0, 1.1.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Fix For: 2.0.0, 1.1.0

 Attachments: HBASE-13203-v0.patch, HBASE-13203-v1-branch-1.patch, 
 HBASE-13203-v1.patch


 master side, part of HBASE-12439
 starts up the procedure executor on the master
 and replaces the create/delete table handlers with the procedure version.





[jira] [Updated] (HBASE-13685) Procedure v2 - Add maxProcId to the wal header

2015-08-28 Thread Stephen Yuan Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Yuan Jiang updated HBASE-13685:
---
Parent Issue: HBASE-14336  (was: HBASE-12439)

 Procedure v2 - Add maxProcId to the wal header
 --

 Key: HBASE-13685
 URL: https://issues.apache.org/jira/browse/HBASE-13685
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Affects Versions: 1.1.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Blocker
 Fix For: 1.1.0

 Attachments: HBASE-13685-v0.patch


 While working on HBASE-13476 I found that having the max-proc-id in the WAL
 header allows some nice optimizations.
 [~ndimiduk], since 1.1 is not released yet, can we get this in so we can
 avoid all the extra handling?





[jira] [Updated] (HBASE-13759) Improve procedure yielding

2015-08-28 Thread Stephen Yuan Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Yuan Jiang updated HBASE-13759:
---
Parent Issue: HBASE-14336  (was: HBASE-12439)

 Improve procedure yielding
 --

 Key: HBASE-13759
 URL: https://issues.apache.org/jira/browse/HBASE-13759
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Affects Versions: 2.0.0, 1.2.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Trivial
 Fix For: 2.0.0, 1.2.0

 Attachments: HBASE-13759-v0.patch, HBASE-13759-v1.patch


 Adds the ability to yield the procedure at every execution step.
 By default, a procedure will try to run from begin to end without stopping;
 yielding allows procedures to be nice to other procedures.  One usage example
 is ServerShutdownHandler, where we want everyone to make some progress.
 Also allows procedures to throw InterruptedException.  The default handling
 is: ask the master if there is an abort or stop.  If there is, stop execution
 and exit; else, clear the IE and carry on executing.  The interrupted
 procedure will retry.
 If the procedure implementor wants different behavior, the IE can be caught
 and custom handling can be performed.
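The default interrupt handling described above can be sketched in a few lines. The names below are invented for illustration (the real logic lives in the procedure framework, not in a class with this shape):

```java
public class YieldSketch {
    // minimal stand-in for the master's abort/stop state (hypothetical)
    interface Master {
        boolean isAborting();
    }

    /** Default handling on InterruptedException: returns true if execution
     *  should stop, false if the procedure should retry. */
    static boolean handleInterrupt(Master master) {
        if (master.isAborting()) {
            return true;              // abort or stop requested: exit executions
        }
        Thread.interrupted();         // clear the interrupt flag and carry on
        return false;                 // the interrupted procedure will retry
    }

    public static void main(String[] args) {
        System.out.println("stop=" + handleInterrupt(() -> false));
        // prints: stop=false
    }
}
```

Note that Thread.interrupted() both reads and clears the thread's interrupt status, which is exactly the "clear the IE and carry on" step; a custom handler that catches the IE itself would skip this path entirely.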





[jira] [Updated] (HBASE-13551) Procedure V2 - Procedure classes should not be InterfaceAudience.Public

2015-08-28 Thread Stephen Yuan Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Yuan Jiang updated HBASE-13551:
---
Parent Issue: HBASE-14336  (was: HBASE-12439)

 Procedure V2 - Procedure classes should not be InterfaceAudience.Public
 ---

 Key: HBASE-13551
 URL: https://issues.apache.org/jira/browse/HBASE-13551
 Project: HBase
  Issue Type: Sub-task
  Components: master
Reporter: Enis Soztutar
Assignee: Enis Soztutar
Priority: Blocker
 Fix For: 2.0.0, 1.1.0

 Attachments: hbase-13551_v1.patch


 Just noticed this.  We have ProcedureStore, exceptions, and some procedures
 declared as {{InterfaceAudience.Public}}.  We should really make them Private
 or LimitedPrivate, since Public is for user consumption.  Are we exposing
 Procedures to coprocessors?  I do not see a use case for that, so they should
 be Private IMO.





[jira] [Resolved] (HBASE-11080) TestZKSecretWatcher#testKeyUpdate occasionally fails

2015-08-28 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HBASE-11080.

Resolution: Cannot Reproduce

 TestZKSecretWatcher#testKeyUpdate occasionally fails
 

 Key: HBASE-11080
 URL: https://issues.apache.org/jira/browse/HBASE-11080
 Project: HBase
  Issue Type: Test
Affects Versions: 0.98.1
Reporter: Ted Yu
Priority: Minor

 From 
 https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/280/testReport/junit/org.apache.hadoop.hbase.security.token/TestZKSecretWatcher/testKeyUpdate/
  :
 {code}
 java.lang.AssertionError
   at org.junit.Assert.fail(Assert.java:86)
   at org.junit.Assert.assertTrue(Assert.java:41)
   at org.junit.Assert.assertNotNull(Assert.java:621)
   at org.junit.Assert.assertNotNull(Assert.java:631)
   at 
 org.apache.hadoop.hbase.security.token.TestZKSecretWatcher.testKeyUpdate(TestZKSecretWatcher.java:221)
 {code}
 Here is the assertion that failed:
 {code}
 assertNotNull(newMaster);
 {code}
 Looks like the new master did not come up within 5 tries.
 One potential fix is to increase the number of attempts.





[jira] [Commented] (HBASE-14315) Save one call to KeyValueHeap.peek per row

2015-08-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720930#comment-14720930
 ] 

Hudson commented on HBASE-14315:


FAILURE: Integrated in HBase-0.98 #1104 (See 
[https://builds.apache.org/job/HBase-0.98/1104/])
HBASE-14315 Save one call to KeyValueHeap.peek per row. (larsh: rev 
7a4fa7f20cf8084ce57dcaa83e0c4f430d290736)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java


 Save one call to KeyValueHeap.peek per row
 --

 Key: HBASE-14315
 URL: https://issues.apache.org/jira/browse/HBASE-14315
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 2.0.0, 1.1.2, 1.3.0, 0.98.15, 1.2.1, 1.0.3

 Attachments: 14315-0.98.txt, 14315-master.txt


 Another one of my micro optimizations.
 In StoreScanner.next(...) we can actually save a call to KeyValueHeap.peek,
 which in my runs of scan-heavy loads shows up at the top of profiles.
 Depending on the run and data, this can save between 3 and 10% of runtime.





[jira] [Updated] (HBASE-13476) Procedure v2 - Add Replay Order logic for child procedures

2015-08-28 Thread Stephen Yuan Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Yuan Jiang updated HBASE-13476:
---
Parent Issue: HBASE-14336  (was: HBASE-12439)

 Procedure v2 - Add Replay Order logic for child procedures
 --

 Key: HBASE-13476
 URL: https://issues.apache.org/jira/browse/HBASE-13476
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Affects Versions: 2.0.0, 1.1.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 2.0.0, 1.2.0

 Attachments: HBASE-13476-v0.patch, HBASE-13476-v1.patch, 
 HBASE-13476-v2.patch


 The current replay order logic is only for single-level procedures (which is
 what we are using today for master operations).
 To complete the implementation for the notification-bus we need to be able to
 replay child procs in the correct order too.
 This will not impact the current procs implementation
 (create/delete/modify/...); it is just a change at the framework level.
 https://reviews.apache.org/r/34289/





[jira] [Updated] (HBASE-13529) Procedure v2 - WAL Improvements

2015-08-28 Thread Stephen Yuan Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Yuan Jiang updated HBASE-13529:
---
Parent Issue: HBASE-14336  (was: HBASE-12439)

 Procedure v2 - WAL Improvements
 ---

 Key: HBASE-13529
 URL: https://issues.apache.org/jira/browse/HBASE-13529
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Affects Versions: 2.0.0, 1.1.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Blocker
 Fix For: 2.0.0, 1.1.0

 Attachments: HBASE-13529-v0.patch, HBASE-13529-v1.patch, 
 HBASE-13529-v2.patch, ProcedureStoreTest.java


 From the discussion in HBASE-12439, the WAL turned out to be slow.
  * there is an error around the wakeup of slotCond.await(), causing more 
 waiting than necessary
  * ArrayBlockingQueue is dog slow; replace it with ConcurrentLinkedQueue
  * roll the WAL only when it reaches a threshold (a configurable number of 
 ops) to amortize the cost
  * hsync() is used by default, while the normal WAL uses just hflush(); 
 make it tunable via conf
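As an illustration of the roll-threshold point above, here is a minimal sketch (hypothetical names, not the actual HBase WAL code) of rolling only after a configurable number of appended ops, using the lock-free ConcurrentLinkedQueue mentioned in the list:

```java
import java.util.concurrent.ConcurrentLinkedQueue;

// Sketch only: amortize the roll cost by rolling the log after a
// configurable number of ops instead of on every sync. All names here
// are illustrative, not HBase's real ProcedureWALFile internals.
public class WalRollSketch {
    // lock-free enqueue, in contrast to ArrayBlockingQueue
    final ConcurrentLinkedQueue<byte[]> pending = new ConcurrentLinkedQueue<>();
    final int rollThreshold;   // would come from a conf key in real code
    int opsSinceRoll = 0;
    int rolls = 0;

    WalRollSketch(int rollThreshold) { this.rollThreshold = rollThreshold; }

    void append(byte[] entry) {
        pending.add(entry);
        if (++opsSinceRoll >= rollThreshold) {  // roll only at the threshold
            pending.clear();                    // stand-in for starting a new log file
            opsSinceRoll = 0;
            rolls++;
        }
    }

    public static void main(String[] args) {
        WalRollSketch wal = new WalRollSketch(100);
        for (int i = 0; i < 250; i++) wal.append(new byte[0]);
        if (wal.rolls != 2) throw new AssertionError("expected 2 rolls, got " + wal.rolls);
        System.out.println("rolls=" + wal.rolls);
    }
}
```

With a threshold of 100, 250 appends trigger exactly two rolls; the per-op cost of rolling shrinks as the threshold grows.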





[jira] [Updated] (HBASE-13470) High level Integration test for master DDL operations

2015-08-28 Thread Stephen Yuan Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Yuan Jiang updated HBASE-13470:
---
Parent Issue: HBASE-14336  (was: HBASE-12439)

 High level Integration test for master DDL operations
 -

 Key: HBASE-13470
 URL: https://issues.apache.org/jira/browse/HBASE-13470
 Project: HBase
  Issue Type: Sub-task
  Components: master
Reporter: Enis Soztutar
Assignee: Sophia Feng
 Fix For: 2.0.0, 1.2.0, 1.1.1, 1.3.0

 Attachments: HBASE-13470-v0.patch, HBASE-13470-v1.patch, 
 HBASE-13470-v2.patch, HBASE-13470-v3.patch, HBASE-13470-v4.patch, 
 hbase-13740_v5.patch, hbase-13740_v6.patch, hbase-13740_v6.patch


 Our [~fengs] has an integration test that executes DDL operations, with a new 
 monkey that kills the active master, as a high-level test for the proc v2 
 changes. 
 The test does random DDL operations from 20 client threads. The DDL 
 statements are create / delete / modify / enable / disable table and CF 
 operations. It runs HBCK to verify the end state. 
 The test can be run on a single-master or multi-master setup. 





[jira] [Updated] (HBASE-14336) Procedure V2 Phase 1 - Procedure Framework and Making DDL Operations fault tolerant

2015-08-28 Thread Stephen Yuan Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Yuan Jiang updated HBASE-14336:
---
Labels: reliability  (was: )

 Procedure V2 Phase 1 - Procedure Framework and Making DDL Operations fault 
 tolerant
 ---

 Key: HBASE-14336
 URL: https://issues.apache.org/jira/browse/HBASE-14336
 Project: HBase
  Issue Type: Task
  Components: proc-v2
Affects Versions: 2.0.0, 1.1.0, 1.2.0, 1.3.0
Reporter: Stephen Yuan Jiang
Assignee: Stephen Yuan Jiang
  Labels: reliability

 This is the first phase of Procedure V2 (HBASE-12439):
 - Core framework
 - re-implement Namespace/Table/Column DDLs as multi-step procedures with 
 rollback/roll-forward ability in case of failure
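A minimal sketch of the rollback idea (hypothetical names; the real framework also persists procedure state to a store so it can roll forward after a master restart): each completed step records an undo action, and a failure unwinds the completed steps in reverse order.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Illustrative sketch only, not the actual Procedure V2 framework:
// a DDL operation modeled as a sequence of steps, each with a rollback.
public class ProcedureSketch {
    interface Step { void execute() throws Exception; void rollback(); }

    static final List<String> log = new ArrayList<>();

    static boolean run(List<Step> steps) {
        Deque<Step> done = new ArrayDeque<>();
        for (Step s : steps) {
            try {
                s.execute();
                done.push(s);
            } catch (Exception e) {
                // failure: undo every completed step in reverse order
                while (!done.isEmpty()) done.pop().rollback();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Step createLayout = new Step() {
            public void execute() { log.add("create-fs-layout"); }
            public void rollback() { log.add("undo-create-fs-layout"); }
        };
        Step addMeta = new Step() {
            public void execute() throws Exception { throw new Exception("meta write failed"); }
            public void rollback() { log.add("undo-add-meta"); }
        };
        boolean ok = run(List.of(createLayout, addMeta));
        if (ok || !log.equals(List.of("create-fs-layout", "undo-create-fs-layout")))
            throw new AssertionError(log);
        System.out.println(log);
    }
}
```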





[jira] [Updated] (HBASE-13455) Procedure V2 - master truncate table

2015-08-28 Thread Stephen Yuan Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Yuan Jiang updated HBASE-13455:
---
Parent Issue: HBASE-14336  (was: HBASE-12439)

 Procedure V2 - master truncate table
 

 Key: HBASE-13455
 URL: https://issues.apache.org/jira/browse/HBASE-13455
 Project: HBase
  Issue Type: Sub-task
  Components: master
Affects Versions: 2.0.0, 1.1.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Fix For: 2.0.0, 1.1.0

 Attachments: HBASE-13455-v0.patch, HBASE-13455-v1.patch


 Master side, part of HBASE-12439; replaces the truncate table handlers with 
 the procedure version.
 https://reviews.apache.org/r/33102





[jira] [Resolved] (HBASE-11159) Some medium tests should be classified as large tests

2015-08-28 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HBASE-11159.

Resolution: Later

 Some medium tests should be classified as large tests
 -

 Key: HBASE-11159
 URL: https://issues.apache.org/jira/browse/HBASE-11159
 Project: HBase
  Issue Type: Test
Reporter: Ted Yu
Priority: Minor

 Jeff Bowles made this observation based on a Jenkins build.
 From https://builds.apache.org/job/HBase-TRUNK/5131/consoleFull :
 {code}
 Running org.apache.hadoop.hbase.client.TestHCM
 Tests run: 20, Failures: 0, Errors: 0, Skipped: 3, Time elapsed: 197.448 sec
 Running org.apache.hadoop.hbase.client.TestClientOperationInterrupt
 Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 72.34 sec
 {code}
 The above tests should be classified as large tests.
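For reference, reclassifying a test is a one-line annotation change. The Category/LargeTests types below are minimal local stand-ins so the sketch is self-contained and checkable; the real ones are JUnit's org.junit.experimental.categories.Category and HBase's test classification marker classes.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Sketch of JUnit-style test categorization. Stand-in annotation and
// marker interfaces are declared here to keep the example self-contained.
public class CategorySketch {
    @Retention(RetentionPolicy.RUNTIME)
    @interface Category { Class<?> value(); }

    interface MediumTests {}
    interface LargeTests {}

    // A slow test (e.g. TestHCM at ~197s) gets the Large tag, so the build
    // profile enforcing the medium-test time budget no longer runs it.
    @Category(LargeTests.class)   // was: @Category(MediumTests.class)
    static class TestHCM {}

    public static void main(String[] args) {
        Class<?> tag = TestHCM.class.getAnnotation(Category.class).value();
        if (tag != LargeTests.class) throw new AssertionError("expected LargeTests");
        System.out.println("TestHCM categorized as " + tag.getSimpleName());
    }
}
```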





[jira] [Resolved] (HBASE-11198) test-patch.sh should handle the case where trunk patch is attached along with patch for 0.98

2015-08-28 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HBASE-11198.

Resolution: Won't Fix

 test-patch.sh should handle the case where trunk patch is attached along with 
 patch for 0.98
 

 Key: HBASE-11198
 URL: https://issues.apache.org/jira/browse/HBASE-11198
 Project: HBase
  Issue Type: Test
Reporter: Ted Yu

 From https://builds.apache.org/job/PreCommit-HBASE-Build/9531//console :
 {code}
 -1 overall. Here are the results of testing the latest attachment 
 http://issues.apache.org/jira/secure/attachment/12645375/HBASE-11104_98_v3.patch
 against trunk revision .
 ATTACHMENT ID: 12645375
 +1 @author. The patch does not contain any @author tags.
 +1 tests included. The patch appears to include 3 new or modified tests.
 -1 patch. The patch command could not apply the patch.
 {code}
 The cause was that the patch for 0.98 was slightly newer than the trunk patch.
 test-patch.sh should handle this case by recognizing the 'could not apply the 
 patch' error and retrying with the trunk patch.





[jira] [Commented] (HBASE-14322) Master still not using more than it's priority threads

2015-08-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720970#comment-14720970
 ] 

Hudson commented on HBASE-14322:


SUCCESS: Integrated in HBase-1.2-IT #119 (See 
[https://builds.apache.org/job/HBase-1.2-IT/119/])
HBASE-14322 Add a master priority function to let master use it's threads 
(eclark: rev 5f15583d342cd3129648452cdd7a6c686ad7fa81)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestQosFunction.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterAnnotationReadingPriorityFunction.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterRpcServices.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/AnnotationReadingPriorityFunction.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/QosTestHelper.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterQosFunction.java


 Master still not using more than it's priority threads
 --

 Key: HBASE-14322
 URL: https://issues.apache.org/jira/browse/HBASE-14322
 Project: HBase
  Issue Type: Sub-task
  Components: master, rpc
Affects Versions: 1.2.0
Reporter: Elliott Clark
Assignee: Elliott Clark
 Fix For: 2.0.0, 1.2.0

 Attachments: HBASE-14322-v1.patch, HBASE-14322-v2.patch, 
 HBASE-14322-v3-branch-1.patch, HBASE-14322-v3.patch, 
 HBASE-14322-v4-branch-1.patch, HBASE-14322-v5-branch-1.patch, 
 HBASE-14322-v6.patch, HBASE-14322-v7.patch, HBASE-14322.patch


 Master and regionserver will be running as the same user.
 The superusers list by default includes the current user as a super user.
 Super users' requests always go to the priority threads.
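A toy sketch of why this starves the general handler pool (hypothetical names and QoS constants, not the actual priority function): if the priority function promotes any super user, and master and regionserver run as the same super user, then every request between them lands on the priority handlers.

```java
import java.util.Set;

// Illustrative sketch only: how a super-user check in a priority
// function routes all same-user daemon traffic to the priority pool.
public class QosSketch {
    static final int NORMAL_QOS = 0;
    static final int HIGH_QOS = 200;

    static int getPriority(String user, Set<String> superUsers) {
        // Super users always get the high-QoS (priority) handlers.
        return superUsers.contains(user) ? HIGH_QOS : NORMAL_QOS;
    }

    public static void main(String[] args) {
        Set<String> superUsers = Set.of("hbase"); // default: the process owner
        // Both daemons run as "hbase", so every request is "priority":
        if (getPriority("hbase", superUsers) != HIGH_QOS) throw new AssertionError();
        if (getPriority("alice", superUsers) != NORMAL_QOS) throw new AssertionError();
        System.out.println("ok");
    }
}
```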





[jira] [Commented] (HBASE-14315) Save one call to KeyValueHeap.peek per row

2015-08-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720888#comment-14720888
 ] 

Hudson commented on HBASE-14315:


SUCCESS: Integrated in HBase-1.3-IT #121 (See 
[https://builds.apache.org/job/HBase-1.3-IT/121/])
HBASE-14315 Save one call to KeyValueHeap.peek per row. (larsh: rev 
c277166fd1f3104c0db9011f700eebf404b812ad)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java


 Save one call to KeyValueHeap.peek per row
 --

 Key: HBASE-14315
 URL: https://issues.apache.org/jira/browse/HBASE-14315
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 2.0.0, 1.1.2, 1.3.0, 0.98.15, 1.2.1, 1.0.3

 Attachments: 14315-0.98.txt, 14315-master.txt


 Another one of my micro-optimizations.
 In StoreScanner.next(...) we can actually save a call to KeyValueHeap.peek, 
 which in my runs of scan-heavy loads shows up at the top.
 Based on the run and data this can save between 3 and 10% of runtime.
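A simplified sketch of the idea (a hypothetical toy heap, not the real KeyValueHeap): instead of calling peek() again at the start of every row, reuse the cell that was already in hand when the previous row's boundary was detected, so only one peek is needed for the whole scan.

```java
import java.util.Iterator;
import java.util.List;

// Sketch only: saving repeated peek() calls by carrying the current
// head cell across row boundaries. Cells are "row/qualifier" strings.
public class PeekCacheSketch {
    static int peekCalls = 0;

    static class Heap {
        final Iterator<String> it;
        String head;                       // cell currently at the top
        Heap(List<String> cells) { it = cells.iterator(); advance(); }
        void advance() { head = it.hasNext() ? it.next() : null; }
        String peek() { peekCalls++; return head; }
        String next() { String c = head; advance(); return c; }
    }

    public static void main(String[] args) {
        Heap heap = new Heap(List.of("r1/a", "r1/b", "r2/a"));
        String peeked = heap.peek();       // a single peek up front
        int rows = 0;
        while (peeked != null) {
            String row = peeked.split("/")[0];
            while (peeked != null && peeked.startsWith(row + "/")) {
                heap.next();
                peeked = heap.head;        // reuse heap state: no extra peek()
            }
            rows++;                        // row boundary reached
        }
        if (rows != 2 || peekCalls != 1) throw new AssertionError(rows + "," + peekCalls);
        System.out.println("rows=" + rows + " peekCalls=" + peekCalls);
    }
}
```

The scan visits two rows while issuing a single peek(); in a scan-heavy workload that per-row saving is where the reported 3-10% comes from.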





[jira] [Commented] (HBASE-14334) Move Memcached block cache in to it's own optional module.

2015-08-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720915#comment-14720915
 ] 

Hadoop QA commented on HBASE-14334:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12753087/HBASE-14334.patch
  against master branch at commit cf4c0fb71ccb8b15549b2410083434398aa0ebb3.
  ATTACHMENT ID: 12753087

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0 2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+ xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
http://maven.apache.org/xsd/maven-4.0.0.xsd"
+  
<outputFile>${project.build.directory}/test-classes/mrapp-generated-classpath</outputFile>
+  
<outputFile>${project.build.directory}/test-classes/mrapp-generated-classpath</outputFile>

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.master.TestDistributedLogSplitting

 {color:red}-1 core zombie tests{color}.  There are 10 zombie test(s):  
at 
org.apache.hadoop.hbase.master.balancer.BalancerTestBase.testWithCluster(BalancerTestBase.java:432)
at 
org.apache.hadoop.hbase.master.balancer.TestStochasticLoadBalancer.testRegionReplicationOnMidClusterSameHosts(TestStochasticLoadBalancer.java:454)
at 
org.apache.hadoop.hbase.master.balancer.BalancerTestBase.testWithCluster(BalancerTestBase.java:432)
at 
org.apache.hadoop.hbase.master.balancer.BalancerTestBase.testWithCluster(BalancerTestBase.java:422)
at 
org.apache.hadoop.hbase.master.balancer.TestStochasticLoadBalancer2.testRegionReplicasOnMidClusterHighReplication(TestStochasticLoadBalancer2.java:73)
at 
org.apache.hadoop.hbase.security.access.TestAccessController2.testACLTableAccess(TestAccessController2.java:273)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15320//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15320//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15320//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15320//console

This message is automatically generated.

 Move Memcached block cache in to it's own optional module.
 --

 Key: HBASE-14334
 URL: https://issues.apache.org/jira/browse/HBASE-14334
 Project: HBase
  Issue Type: Improvement
Affects Versions: 1.2.0
Reporter: Elliott Clark
Assignee: Elliott Clark
 Fix For: 2.0.0, 1.2.0

 Attachments: HBASE-14334.patch








[jira] [Commented] (HBASE-14315) Save one call to KeyValueHeap.peek per row

2015-08-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720920#comment-14720920
 ] 

Hudson commented on HBASE-14315:


FAILURE: Integrated in HBase-TRUNK #6763 (See 
[https://builds.apache.org/job/HBase-TRUNK/6763/])
HBASE-14315 Save one call to KeyValueHeap.peek per row. (larsh: rev 
cf4c0fb71ccb8b15549b2410083434398aa0ebb3)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java


 Save one call to KeyValueHeap.peek per row
 --

 Key: HBASE-14315
 URL: https://issues.apache.org/jira/browse/HBASE-14315
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 2.0.0, 1.1.2, 1.3.0, 0.98.15, 1.2.1, 1.0.3

 Attachments: 14315-0.98.txt, 14315-master.txt


 Another one of my micro-optimizations.
 In StoreScanner.next(...) we can actually save a call to KeyValueHeap.peek, 
 which in my runs of scan-heavy loads shows up at the top.
 Based on the run and data this can save between 3 and 10% of runtime.





[jira] [Commented] (HBASE-14322) Master still not using more than it's priority threads

2015-08-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720932#comment-14720932
 ] 

Hadoop QA commented on HBASE-14322:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12753101/HBASE-14322-v7.patch
  against master branch at commit cf4c0fb71ccb8b15549b2410083434398aa0ebb3.
  ATTACHMENT ID: 12753101

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 15 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0 2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

 {color:red}-1 core zombie tests{color}.  There are 3 zombie test(s):   
at 
org.apache.hadoop.hbase.regionserver.TestRegionReplicas.testRefreshStoreFiles(TestRegionReplicas.java:241)
at 
org.apache.hadoop.hbase.regionserver.TestCorruptedRegionStoreFile.testLosingFileAfterScannerInit(TestCorruptedRegionStoreFile.java:173)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15321//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15321//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15321//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15321//console

This message is automatically generated.

 Master still not using more than it's priority threads
 --

 Key: HBASE-14322
 URL: https://issues.apache.org/jira/browse/HBASE-14322
 Project: HBase
  Issue Type: Sub-task
  Components: master, rpc
Affects Versions: 1.2.0
Reporter: Elliott Clark
Assignee: Elliott Clark
 Fix For: 2.0.0, 1.2.0

 Attachments: HBASE-14322-v1.patch, HBASE-14322-v2.patch, 
 HBASE-14322-v3-branch-1.patch, HBASE-14322-v3.patch, 
 HBASE-14322-v4-branch-1.patch, HBASE-14322-v5-branch-1.patch, 
 HBASE-14322-v6.patch, HBASE-14322-v7.patch, HBASE-14322.patch


 Master and regionserver will be running as the same user.
 The superusers list by default includes the current user as a super user.
 Super users' requests always go to the priority threads.





[jira] [Commented] (HBASE-14322) Master still not using more than it's priority threads

2015-08-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720939#comment-14720939
 ] 

Hudson commented on HBASE-14322:


FAILURE: Integrated in HBase-TRUNK #6764 (See 
[https://builds.apache.org/job/HBase-TRUNK/6764/])
HBASE-14322 Add a master priority function to let master use it's threads 
(eclark: rev 0d06d8ddd0aa9aff7476fb6a7acd6af1d24ba3fc)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestQosFunction.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterRpcServices.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterAnnotationReadingPriorityFunction.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterPriorityRpc.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/QosTestHelper.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/AnnotationReadingPriorityFunction.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterQosFunction.java


 Master still not using more than it's priority threads
 --

 Key: HBASE-14322
 URL: https://issues.apache.org/jira/browse/HBASE-14322
 Project: HBase
  Issue Type: Sub-task
  Components: master, rpc
Affects Versions: 1.2.0
Reporter: Elliott Clark
Assignee: Elliott Clark
 Fix For: 2.0.0, 1.2.0

 Attachments: HBASE-14322-v1.patch, HBASE-14322-v2.patch, 
 HBASE-14322-v3-branch-1.patch, HBASE-14322-v3.patch, 
 HBASE-14322-v4-branch-1.patch, HBASE-14322-v5-branch-1.patch, 
 HBASE-14322-v6.patch, HBASE-14322-v7.patch, HBASE-14322.patch


 Master and regionserver will be running as the same user.
 The superusers list by default includes the current user as a super user.
 Super users' requests always go to the priority threads.





[jira] [Commented] (HBASE-14322) Master still not using more than it's priority threads

2015-08-28 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720940#comment-14720940
 ] 

stack commented on HBASE-14322:
---

bq. Or am I mis-understanding what you're saying?

My bad. My misreading of patch. Ignore.

 Master still not using more than it's priority threads
 --

 Key: HBASE-14322
 URL: https://issues.apache.org/jira/browse/HBASE-14322
 Project: HBase
  Issue Type: Sub-task
  Components: master, rpc
Affects Versions: 1.2.0
Reporter: Elliott Clark
Assignee: Elliott Clark
 Fix For: 2.0.0, 1.2.0

 Attachments: HBASE-14322-v1.patch, HBASE-14322-v2.patch, 
 HBASE-14322-v3-branch-1.patch, HBASE-14322-v3.patch, 
 HBASE-14322-v4-branch-1.patch, HBASE-14322-v5-branch-1.patch, 
 HBASE-14322-v6.patch, HBASE-14322-v7.patch, HBASE-14322.patch


 Master and regionserver will be running as the same user.
 The superusers list by default includes the current user as a super user.
 Super users' requests always go to the priority threads.





[jira] [Commented] (HBASE-14258) Make region_mover.rb script case insensitive with regard to hostname

2015-08-28 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720955#comment-14720955
 ] 

Ted Yu commented on HBASE-14258:


@Vlad:
If you agree with the formulation in addendum 2, I can commit it.

 Make region_mover.rb script case insensitive with regard to hostname
 

 Key: HBASE-14258
 URL: https://issues.apache.org/jira/browse/HBASE-14258
 Project: HBase
  Issue Type: Bug
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov
Priority: Minor
 Fix For: 2.0.0, 1.2.0, 1.3.0, 1.1.3

 Attachments: 14258-addendum.2, HBASE-14258.patch, 
 HBASE-14258.patch.add


 The script is case sensitive and fails when the case of a host name being 
 unloaded does not match the case of a region server name returned by the 
 HBase API. 
 This doc clarifies the IETF rules on case insensitivity in DNS:
 https://www.ietf.org/rfc/rfc4343.txt
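Per RFC 4343, DNS name comparison is case-insensitive for ASCII letters, so the fix amounts to normalizing case before comparing host names. region_mover.rb itself is Ruby; this equivalent Java sketch (hypothetical names) shows the comparison:

```java
// Sketch of an RFC 4343-compliant host name match: ASCII letters are
// compared case-insensitively, so "Node-1.Example.COM" and
// "node-1.example.com" name the same host.
public class HostnameMatchSketch {
    static boolean sameHost(String a, String b) {
        return a.equalsIgnoreCase(b);   // case-insensitive match
    }

    public static void main(String[] args) {
        if (!sameHost("Node-1.Example.COM", "node-1.example.com")) throw new AssertionError();
        if (sameHost("node-1.example.com", "node-2.example.com")) throw new AssertionError();
        System.out.println("ok");
    }
}
```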





[jira] [Updated] (HBASE-14258) Make region_mover.rb script case insensitive with regard to hostname

2015-08-28 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14258:
---
Attachment: 14258-addendum.2

 Make region_mover.rb script case insensitive with regard to hostname
 

 Key: HBASE-14258
 URL: https://issues.apache.org/jira/browse/HBASE-14258
 Project: HBase
  Issue Type: Bug
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov
Priority: Minor
 Fix For: 2.0.0, 1.2.0, 1.3.0, 1.1.3

 Attachments: 14258-addendum.2, HBASE-14258.patch, 
 HBASE-14258.patch.add


 The script is case sensitive and fails when case of a host name being 
 unloaded does not match with a case of a region server name returned by HBase 
 API. 
 This doc clarifies IETF rules on case insensitivities in DNS:
 https://www.ietf.org/rfc/rfc4343.txt





[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment

2015-08-28 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720115#comment-14720115
 ] 

Francis Liu commented on HBASE-6721:


Here's a new RB request:

https://reviews.apache.org/r/27673/

 RegionServer Group based Assignment
 ---

 Key: HBASE-6721
 URL: https://issues.apache.org/jira/browse/HBASE-6721
 Project: HBase
  Issue Type: New Feature
Reporter: Francis Liu
Assignee: Francis Liu
  Labels: hbase-6721
 Attachments: 6721-master-webUI.patch, HBASE-6721 
 GroupBasedLoadBalancer Sequence Diagram.xml, HBASE-6721-DesigDoc.pdf, 
 HBASE-6721-DesigDoc.pdf, HBASE-6721-DesigDoc.pdf, HBASE-6721-DesigDoc.pdf, 
 HBASE-6721_0.98_2.patch, HBASE-6721_10.patch, HBASE-6721_11.patch, 
 HBASE-6721_12.patch, HBASE-6721_8.patch, HBASE-6721_9.patch, 
 HBASE-6721_9.patch, HBASE-6721_94.patch, HBASE-6721_94.patch, 
 HBASE-6721_94_2.patch, HBASE-6721_94_3.patch, HBASE-6721_94_3.patch, 
 HBASE-6721_94_4.patch, HBASE-6721_94_5.patch, HBASE-6721_94_6.patch, 
 HBASE-6721_94_7.patch, HBASE-6721_98_1.patch, HBASE-6721_98_2.patch, 
 HBASE-6721_hbase-6721_addendum.patch, HBASE-6721_trunk.patch, 
 HBASE-6721_trunk.patch, HBASE-6721_trunk.patch, HBASE-6721_trunk1.patch, 
 HBASE-6721_trunk2.patch, balanceCluster Sequence Diagram.svg, 
 immediateAssignments Sequence Diagram.svg, randomAssignment Sequence 
 Diagram.svg, retainAssignment Sequence Diagram.svg, roundRobinAssignment 
 Sequence Diagram.svg


 In multi-tenant deployments of HBase, it is likely that a RegionServer will 
 be serving out regions from a number of different tables owned by various 
 client applications. Being able to group a subset of running RegionServers 
 and assign specific tables to it gives a client application a level of 
 isolation and resource allocation.
 The proposal essentially is to have an AssignmentManager which is aware of 
 RegionServer groups and assigns tables to region servers based on groupings. 
 Load balancing will occur on a per-group basis as well. 
 This is essentially a simplification of the approach taken in HBASE-4120. See 
 the attached document.





[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment

2015-08-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720145#comment-14720145
 ] 

Hadoop QA commented on HBASE-6721:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12753020/HBASE-6721_12.patch
  against master branch at commit cc1542828de93b8d54cc14497fd5937989ea1b6d.
  ATTACHMENT ID: 12753020

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 33 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15317//console

This message is automatically generated.

 RegionServer Group based Assignment
 ---

 Key: HBASE-6721
 URL: https://issues.apache.org/jira/browse/HBASE-6721
 Project: HBase
  Issue Type: New Feature
Reporter: Francis Liu
Assignee: Francis Liu
  Labels: hbase-6721
 Attachments: 6721-master-webUI.patch, HBASE-6721 
 GroupBasedLoadBalancer Sequence Diagram.xml, HBASE-6721-DesigDoc.pdf, 
 HBASE-6721-DesigDoc.pdf, HBASE-6721-DesigDoc.pdf, HBASE-6721-DesigDoc.pdf, 
 HBASE-6721_0.98_2.patch, HBASE-6721_10.patch, HBASE-6721_11.patch, 
 HBASE-6721_12.patch, HBASE-6721_8.patch, HBASE-6721_9.patch, 
 HBASE-6721_9.patch, HBASE-6721_94.patch, HBASE-6721_94.patch, 
 HBASE-6721_94_2.patch, HBASE-6721_94_3.patch, HBASE-6721_94_3.patch, 
 HBASE-6721_94_4.patch, HBASE-6721_94_5.patch, HBASE-6721_94_6.patch, 
 HBASE-6721_94_7.patch, HBASE-6721_98_1.patch, HBASE-6721_98_2.patch, 
 HBASE-6721_hbase-6721_addendum.patch, HBASE-6721_trunk.patch, 
 HBASE-6721_trunk.patch, HBASE-6721_trunk.patch, HBASE-6721_trunk1.patch, 
 HBASE-6721_trunk2.patch, balanceCluster Sequence Diagram.svg, 
 immediateAssignments Sequence Diagram.svg, randomAssignment Sequence 
 Diagram.svg, retainAssignment Sequence Diagram.svg, roundRobinAssignment 
 Sequence Diagram.svg


 In multi-tenant deployments of HBase, it is likely that a RegionServer will 
 be serving out regions from a number of different tables owned by various 
 client applications. Being able to group a subset of running RegionServers 
 and assign specific tables to it gives a client application a level of 
 isolation and resource allocation.
 The proposal essentially is to have an AssignmentManager which is aware of 
 RegionServer groups and assigns tables to region servers based on groupings. 
 Load balancing will occur on a per-group basis as well. 
 This is essentially a simplification of the approach taken in HBASE-4120. See 
 the attached document.





[jira] [Commented] (HBASE-14309) Allow load balancer to operate when there is region in transition by adding force flag

2015-08-28 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720239#comment-14720239
 ] 

Devaraj Das commented on HBASE-14309:
-

I think it's good to ensure that meta is not in transition when we attempt to 
initiate the balancer.

 Allow load balancer to operate when there is region in transition by adding 
 force flag
 --

 Key: HBASE-14309
 URL: https://issues.apache.org/jira/browse/HBASE-14309
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 2.0.0, 1.3.0

 Attachments: 14309-branch-1.1.txt, 14309-v1.txt, 14309-v2.txt, 
 14309-v3.txt, 14309-v4.txt, 14309-v5-branch-1.txt, 14309-v5.txt, 
 14309-v5.txt, 14309-v6.txt


 This issue adds a boolean parameter, force, to the 'balancer' command so that 
 an admin can force region balancing even when there is a region in 
 transition - assuming the RIT is transient.
 This enhancement was requested by a customer.
 The assumption of this change is that the operator has run hbck and has a 
 reasonable idea why regions are stuck in transition before using the force 
 flag.
 There was a recent event at the customer where a cluster ended up with a 
 small number of regionservers hosting most of the regions on the cluster (one 
 regionserver had 50% of the roughly 20,000 regions). The balancer couldn't be 
 run due to the small number of regions that were stuck in transition. The 
 admin ended up killing the regionservers so that reassignment would yield a 
 more equitable distribution of the regions.
 On a different cluster, there was a single store file that had corrupt HDFS 
 blocks (the SSDs on the cluster were known to lose data). However, since this 
 single region (out of 10s of 1000s of regions on this cluster) was stuck in 
 transition, the balancer couldn't run.
 While HBase's state keeping isn't yet good enough for the balancer to be 
 kicked off automatically in such scenarios, knowing when it is safe to do so 
 and when it is not, having this option available for the operator to 
 use as he / she sees fit seems prudent.
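A sketch of the gatekeeping described above (hypothetical method name, not the actual Master code), incorporating the point raised in the comments that meta in transition should still block balancing even when force is set:

```java
// Illustrative sketch only: normally the balancer refuses to run while
// regions are in transition; force=true overrides that, but a meta
// region in transition still blocks it unconditionally.
public class BalancerGateSketch {
    static boolean shouldRunBalancer(boolean force, int regionsInTransition,
                                     boolean metaInTransition) {
        if (metaInTransition) return false;      // never balance while meta moves
        return force || regionsInTransition == 0;
    }

    public static void main(String[] args) {
        if (shouldRunBalancer(false, 3, false)) throw new AssertionError();  // RIT blocks
        if (!shouldRunBalancer(true, 3, false)) throw new AssertionError();  // force overrides
        if (shouldRunBalancer(true, 3, true)) throw new AssertionError();    // meta still blocks
        System.out.println("ok");
    }
}
```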





[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment

2015-08-28 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720069#comment-14720069
 ] 

Francis Liu commented on HBASE-6721:


Sounds good. I don't have any other changes pending so I'm going to update RB 
with the new patch on trunk. 

[~apurtell] thanks for taking care of the backport to 0.98 for the patches.

 RegionServer Group based Assignment
 ---

 Key: HBASE-6721
 URL: https://issues.apache.org/jira/browse/HBASE-6721
 Project: HBase
  Issue Type: New Feature
Reporter: Francis Liu
Assignee: Francis Liu
  Labels: hbase-6721
 Attachments: 6721-master-webUI.patch, HBASE-6721 
 GroupBasedLoadBalancer Sequence Diagram.xml, HBASE-6721-DesigDoc.pdf, 
 HBASE-6721-DesigDoc.pdf, HBASE-6721-DesigDoc.pdf, HBASE-6721-DesigDoc.pdf, 
 HBASE-6721_0.98_2.patch, HBASE-6721_10.patch, HBASE-6721_11.patch, 
 HBASE-6721_8.patch, HBASE-6721_9.patch, HBASE-6721_9.patch, 
 HBASE-6721_94.patch, HBASE-6721_94.patch, HBASE-6721_94_2.patch, 
 HBASE-6721_94_3.patch, HBASE-6721_94_3.patch, HBASE-6721_94_4.patch, 
 HBASE-6721_94_5.patch, HBASE-6721_94_6.patch, HBASE-6721_94_7.patch, 
 HBASE-6721_98_1.patch, HBASE-6721_98_2.patch, 
 HBASE-6721_hbase-6721_addendum.patch, HBASE-6721_trunk.patch, 
 HBASE-6721_trunk.patch, HBASE-6721_trunk.patch, HBASE-6721_trunk1.patch, 
 HBASE-6721_trunk2.patch, balanceCluster Sequence Diagram.svg, 
 immediateAssignments Sequence Diagram.svg, randomAssignment Sequence 
 Diagram.svg, retainAssignment Sequence Diagram.svg, roundRobinAssignment 
 Sequence Diagram.svg


 In multi-tenant deployments of HBase, it is likely that a RegionServer will 
 be serving out regions from a number of different tables owned by various 
 client applications. Being able to group a subset of running RegionServers 
 and assign specific tables to it, provides a client application a level of 
 isolation and resource allocation.
 The proposal essentially is to have an AssignmentManager which is aware of 
 RegionServer groups and assigns tables to region servers based on groupings. 
 Load balancing will occur on a per group basis as well. 
 This is essentially a simplification of the approach taken in HBASE-4120. See 
 attached document.





[jira] [Commented] (HBASE-14327) TestIOFencing#testFencingAroundCompactionAfterWALSync is flaky

2015-08-28 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720087#comment-14720087
 ] 

stack commented on HBASE-14327:
---

The archiver is running in between the close and open of the region in the new
location? Is the file removed the result of the compaction, or is it one of the
inputs on the original region? The archiver thinks it is safe to remove it though
the compaction has not 'completed' yet? That sounds like another issue. In this
case, if the archiver is moving the file aside before the new region open has
a chance to work on it, then yeah, delay or stop the archiver in the test.

Thank you for digging in on this [~chenheng]

 TestIOFencing#testFencingAroundCompactionAfterWALSync is flaky
 --

 Key: HBASE-14327
 URL: https://issues.apache.org/jira/browse/HBASE-14327
 Project: HBase
  Issue Type: Bug
  Components: test
Reporter: Dima Spivak
Priority: Critical

 I'm looking into some more of the flaky tests on trunk and this one seems to
 be particularly gross, failing about half the time in recent days. Some
 probably-relevant output from [a recent 
 run|https://builds.apache.org/job/HBase-TRUNK/6761/testReport/org.apache.hadoop.hbase/TestIOFencing/testFencingAroundCompactionAfterWALSync/]:
 {noformat}
 2015-08-27 18:50:14,318 INFO  [main] hbase.TestIOFencing(326): Allowing 
 compaction to proceed
 2015-08-27 18:50:14,318 DEBUG [main] 
 hbase.TestIOFencing$CompactionBlockerRegion(110): allowing compactions
 2015-08-27 18:50:14,318 DEBUG 
 [RS:0;hemera:35619-shortCompactions-1440701403303] regionserver.HStore(1732): 
 Removing store files after compaction...
 2015-08-27 18:50:14,323 DEBUG 
 [RS:0;hemera:35619-longCompactions-1440701391112] regionserver.HStore(1732): 
 Removing store files after compaction...
 2015-08-27 18:50:14,330 DEBUG 
 [RS:0;hemera:35619-longCompactions-1440701391112] backup.HFileArchiver(224): 
 Archiving compacted store files.
 2015-08-27 18:50:14,331 DEBUG 
 [RS:0;hemera:35619-shortCompactions-1440701403303] backup.HFileArchiver(224): 
 Archiving compacted store files.
 2015-08-27 18:50:14,337 DEBUG 
 [RS:0;hemera:35619-longCompactions-1440701391112] backup.HFileArchiver(438): 
 Finished archiving from class 
 org.apache.hadoop.hbase.backup.HFileArchiver$FileableStoreFile, 
 file:hdfs://localhost:34675/user/jenkins/test-data/19edea13-027b-4c6a-9f3f-edaf1fc590ab/data/default/tabletest/94d6f21f7cf387d73d8622f535c67311/family/99e903ad7e0f4029862d0e35c5548464,
  to 
 hdfs://localhost:34675/user/jenkins/test-data/19edea13-027b-4c6a-9f3f-edaf1fc590ab/archive/data/default/tabletest/94d6f21f7cf387d73d8622f535c67311/family/99e903ad7e0f4029862d0e35c5548464
 2015-08-27 18:50:14,337 DEBUG 
 [RS:0;hemera:35619-shortCompactions-1440701403303] backup.HFileArchiver(438): 
 Finished archiving from class 
 org.apache.hadoop.hbase.backup.HFileArchiver$FileableStoreFile, 
 file:hdfs://localhost:34675/user/jenkins/test-data/19edea13-027b-4c6a-9f3f-edaf1fc590ab/data/default/tabletest/94d6f21f7cf387d73d8622f535c67311/family/74a80cc06d134361941085bc2bb905fe,
  to 
 hdfs://localhost:34675/user/jenkins/test-data/19edea13-027b-4c6a-9f3f-edaf1fc590ab/archive/data/default/tabletest/94d6f21f7cf387d73d8622f535c67311/family/74a80cc06d134361941085bc2bb905fe
 2015-08-27 18:50:14,341 DEBUG 
 [RS:0;hemera:35619-longCompactions-1440701391112] backup.HFileArchiver(438): 
 Finished archiving from class 
 org.apache.hadoop.hbase.backup.HFileArchiver$FileableStoreFile, 
 file:hdfs://localhost:34675/user/jenkins/test-data/19edea13-027b-4c6a-9f3f-edaf1fc590ab/data/default/tabletest/94d6f21f7cf387d73d8622f535c67311/family/7067addd325446089ba15ec2c77becbc,
  to 
 hdfs://localhost:34675/user/jenkins/test-data/19edea13-027b-4c6a-9f3f-edaf1fc590ab/archive/data/default/tabletest/94d6f21f7cf387d73d8622f535c67311/family/7067addd325446089ba15ec2c77becbc
 2015-08-27 18:50:14,342 INFO  
 [RS:0;hemera:35619-longCompactions-1440701391112] regionserver.HStore(1353): 
 Completed compaction of 2 (all) file(s) in family of 
 tabletest,,1440701396419.94d6f21f7cf387d73d8622f535c67311. into 
 e138bb0ec6c64ad19efab3b44dbbcb1a(size=68.7 K), total size for store is 146.9 
 K. This selection was in queue for 0sec, and took 10sec to execute.
 2015-08-27 18:50:14,343 INFO  
 [RS:0;hemera:35619-longCompactions-1440701391112] 
 regionserver.CompactSplitThread$CompactionRunner(527): Completed compaction: 
 Request = 
 regionName=tabletest,,1440701396419.94d6f21f7cf387d73d8622f535c67311., 
 storeName=family, fileCount=2, fileSize=73.1 K, priority=998, 
 time=525052314434020; duration=10sec
 2015-08-27 18:50:14,343 DEBUG 
 [RS:0;hemera:35619-shortCompactions-1440701403303] backup.HFileArchiver(438): 
 Finished archiving from class 
 

[jira] [Updated] (HBASE-14309) Allow load balancer to operate when there is region in transition by adding force flag

2015-08-28 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14309:
---
Attachment: 14309-v7.txt

Patch v7 refines the WARN message in balancer.rb.

When hbase:meta is in transition, the force flag is ignored for the balancer command.

 Allow load balancer to operate when there is region in transition by adding 
 force flag
 --

 Key: HBASE-14309
 URL: https://issues.apache.org/jira/browse/HBASE-14309
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 2.0.0, 1.3.0

 Attachments: 14309-branch-1.1.txt, 14309-v1.txt, 14309-v2.txt, 
 14309-v3.txt, 14309-v4.txt, 14309-v5-branch-1.txt, 14309-v5.txt, 
 14309-v5.txt, 14309-v6.txt, 14309-v7.txt


 This issue adds boolean parameter, force, to 'balancer' command so that admin 
 can force region balancing even when there is region in transition - assuming 
 RIT being transient.
 This enhancement was requested by some customer.
 The assumption of this change is that the operator has run hbck and has a 
 reasonable idea why regions are stuck in transition before using the force 
 flag.
 There was a recent event at the customer where a cluster ended up with a 
 small number of regionservers hosting most of the regions on the cluster (one 
 regionserver had 50% of the roughly 20,000 regions). The balancer couldn't be 
 run due to the small number of regions that were stuck in transition. The 
 admin ended up killing the regionservers so that reassignment would yield a 
 more equitable distribution of the regions.
 On a different cluster, there was a single store file that had corrupt HDFS 
 blocks (the SSDs on the cluster were known to lose data). However, since this 
 single region (out of 10s of 1000s of regions on this cluster) was stuck in 
 transition, the balancer couldn't run.
 While the state keeping in HBase isn't so good yet that the admin can kick 
 off the balancer automatically in such scenarios knowing when it is safe to 
 do so and when it is not, having this option available for the operator to 
 use as he / she sees fit seems prudent.





[jira] [Updated] (HBASE-14309) Allow load balancer to operate when there is region in transition by adding force flag

2015-08-28 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14309:
---
Description: 
This issue adds boolean parameter, force, to 'balancer' command so that admin 
can force region balancing even when there is region in transition - assuming 
RIT being transient.

This enhancement was requested by some customer.

The assumption of this change is that the operator has run hbck and has a 
reasonable idea why regions are stuck in transition before using the force flag.

There was a recent event at the customer where a cluster ended up with a small 
number of regionservers hosting most of the regions on the cluster (one 
regionserver had 50% of the roughly 20,000 regions). The balancer couldn't be 
run due to the small number of regions that were stuck in transition. The admin 
ended up killing the regionservers so that reassignment would yield a more 
equitable distribution of the regions.

On a different cluster, there was a single store file that had corrupt HDFS 
blocks (the SSDs on the cluster were known to lose data). However, since this 
single region (out of 10s of 1000s of regions on this cluster) was stuck in 
transition, the balancer couldn't run.
While the state keeping in HBase isn't so good yet that the admin can kick off 
the balancer automatically in such scenarios knowing when it is safe to do so 
and when it is not, having this option available for the operator to use as he 
/ she sees fit seems prudent.

  was:
This issue adds boolean parameter, force, to 'balancer' command so that admin 
can force region balancing even when there is region in transition - assuming 
RIT being transient.

This enhancement was requested by some customer.


 Allow load balancer to operate when there is region in transition by adding 
 force flag
 --

 Key: HBASE-14309
 URL: https://issues.apache.org/jira/browse/HBASE-14309
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 2.0.0, 1.3.0

 Attachments: 14309-branch-1.1.txt, 14309-v1.txt, 14309-v2.txt, 
 14309-v3.txt, 14309-v4.txt, 14309-v5-branch-1.txt, 14309-v5.txt, 
 14309-v5.txt, 14309-v6.txt


 This issue adds boolean parameter, force, to 'balancer' command so that admin 
 can force region balancing even when there is region in transition - assuming 
 RIT being transient.
 This enhancement was requested by some customer.
 The assumption of this change is that the operator has run hbck and has a 
 reasonable idea why regions are stuck in transition before using the force 
 flag.
 There was a recent event at the customer where a cluster ended up with a 
 small number of regionservers hosting most of the regions on the cluster (one 
 regionserver had 50% of the roughly 20,000 regions). The balancer couldn't be 
 run due to the small number of regions that were stuck in transition. The 
 admin ended up killing the regionservers so that reassignment would yield a 
 more equitable distribution of the regions.
 On a different cluster, there was a single store file that had corrupt HDFS 
 blocks (the SSDs on the cluster were known to lose data). However, since this 
 single region (out of 10s of 1000s of regions on this cluster) was stuck in 
 transition, the balancer couldn't run.
 While the state keeping in HBase isn't so good yet that the admin can kick 
 off the balancer automatically in such scenarios knowing when it is safe to 
 do so and when it is not, having this option available for the operator to 
 use as he / she sees fit seems prudent.





[jira] [Commented] (HBASE-14309) Allow load balancer to operate when there is region in transition by adding force flag

2015-08-28 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720293#comment-14720293
 ] 

Jerry He commented on HBASE-14309:
--

+1 on the WARNING and the guard on meta.
force should always be a last resort :-)

{code}
-    LOG.debug("Not running balancer because " + regionsInTransition.size() +
+    // if hbase:meta region is in transition, result of assignment cannot be recorded
+    // ignore the force flag in that case
+    String prefix = force &&
+        !assignmentManager.getRegionStates().isMetaRegionInTransition() ?
+        "R" : "Not r";
+    LOG.debug(prefix + "unning balancer because " + regionsInTransition.size() +
       " region(s) in transition: " + org.apache.commons.lang.StringUtils.
         abbreviate(regionsInTransition.toString(), 256));
-    return false;
+    if (!force) return false;
{code}
Should return false if isMetaRegionInTransition is true.
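The intended gating can be captured as a small decision function. Below is a minimal sketch of that logic; the class and method names are illustrative, not the actual HMaster/AssignmentManager code:

```java
// Sketch of the balancer gating discussed in the review above (HBASE-14309).
// Names here are illustrative; the real check lives in HMaster.
final class BalancerGate {
  /**
   * Decide whether the balancer may run.
   * @param regionsInTransition number of regions currently in transition
   * @param force               operator passed the force flag
   * @param metaInTransition    hbase:meta itself is in transition
   */
  static boolean shouldRunBalancer(int regionsInTransition, boolean force,
      boolean metaInTransition) {
    if (regionsInTransition == 0) {
      return true;   // nothing blocking, run as usual
    }
    if (metaInTransition) {
      return false;  // results of assignment cannot be recorded
    }
    return force;    // RIT present: only run when forced
  }
}
```

With this shape, force is honored for ordinary regions in transition but never overrides an hbase:meta RIT, which is the return-false case flagged above.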



 Allow load balancer to operate when there is region in transition by adding 
 force flag
 --

 Key: HBASE-14309
 URL: https://issues.apache.org/jira/browse/HBASE-14309
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 2.0.0, 1.3.0

 Attachments: 14309-branch-1.1.txt, 14309-v1.txt, 14309-v2.txt, 
 14309-v3.txt, 14309-v4.txt, 14309-v5-branch-1.txt, 14309-v5.txt, 
 14309-v5.txt, 14309-v6.txt, 14309-v7.txt


 This issue adds boolean parameter, force, to 'balancer' command so that admin 
 can force region balancing even when there is region in transition - assuming 
 RIT being transient.
 This enhancement was requested by some customer.
 The assumption of this change is that the operator has run hbck and has a 
 reasonable idea why regions are stuck in transition before using the force 
 flag.
 There was a recent event at the customer where a cluster ended up with a 
 small number of regionservers hosting most of the regions on the cluster (one 
 regionserver had 50% of the roughly 20,000 regions). The balancer couldn't be 
 run due to the small number of regions that were stuck in transition. The 
 admin ended up killing the regionservers so that reassignment would yield a 
 more equitable distribution of the regions.
 On a different cluster, there was a single store file that had corrupt HDFS 
 blocks (the SSDs on the cluster were known to lose data). However, since this 
 single region (out of 10s of 1000s of regions on this cluster) was stuck in 
 transition, the balancer couldn't run.
 While the state keeping in HBase isn't so good yet that the admin can kick 
 off the balancer automatically in such scenarios knowing when it is safe to 
 do so and when it is not, having this option available for the operator to 
 use as he / she sees fit seems prudent.





[jira] [Updated] (HBASE-6721) RegionServer Group based Assignment

2015-08-28 Thread Francis Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francis Liu updated HBASE-6721:
---
Attachment: HBASE-6721_12.patch

 RegionServer Group based Assignment
 ---

 Key: HBASE-6721
 URL: https://issues.apache.org/jira/browse/HBASE-6721
 Project: HBase
  Issue Type: New Feature
Reporter: Francis Liu
Assignee: Francis Liu
  Labels: hbase-6721
 Attachments: 6721-master-webUI.patch, HBASE-6721 
 GroupBasedLoadBalancer Sequence Diagram.xml, HBASE-6721-DesigDoc.pdf, 
 HBASE-6721-DesigDoc.pdf, HBASE-6721-DesigDoc.pdf, HBASE-6721-DesigDoc.pdf, 
 HBASE-6721_0.98_2.patch, HBASE-6721_10.patch, HBASE-6721_11.patch, 
 HBASE-6721_12.patch, HBASE-6721_8.patch, HBASE-6721_9.patch, 
 HBASE-6721_9.patch, HBASE-6721_94.patch, HBASE-6721_94.patch, 
 HBASE-6721_94_2.patch, HBASE-6721_94_3.patch, HBASE-6721_94_3.patch, 
 HBASE-6721_94_4.patch, HBASE-6721_94_5.patch, HBASE-6721_94_6.patch, 
 HBASE-6721_94_7.patch, HBASE-6721_98_1.patch, HBASE-6721_98_2.patch, 
 HBASE-6721_hbase-6721_addendum.patch, HBASE-6721_trunk.patch, 
 HBASE-6721_trunk.patch, HBASE-6721_trunk.patch, HBASE-6721_trunk1.patch, 
 HBASE-6721_trunk2.patch, balanceCluster Sequence Diagram.svg, 
 immediateAssignments Sequence Diagram.svg, randomAssignment Sequence 
 Diagram.svg, retainAssignment Sequence Diagram.svg, roundRobinAssignment 
 Sequence Diagram.svg


 In multi-tenant deployments of HBase, it is likely that a RegionServer will 
 be serving out regions from a number of different tables owned by various 
 client applications. Being able to group a subset of running RegionServers 
 and assign specific tables to it, provides a client application a level of 
 isolation and resource allocation.
 The proposal essentially is to have an AssignmentManager which is aware of 
 RegionServer groups and assigns tables to region servers based on groupings. 
 Load balancing will occur on a per group basis as well. 
 This is essentially a simplification of the approach taken in HBASE-4120. See 
 attached document.





[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment

2015-08-28 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720119#comment-14720119
 ] 

Francis Liu commented on HBASE-6721:


Sorry, it's not a new RB request; I updated the old one. Was wondering whether I
should just create a new RB request.

 RegionServer Group based Assignment
 ---

 Key: HBASE-6721
 URL: https://issues.apache.org/jira/browse/HBASE-6721
 Project: HBase
  Issue Type: New Feature
Reporter: Francis Liu
Assignee: Francis Liu
  Labels: hbase-6721
 Attachments: 6721-master-webUI.patch, HBASE-6721 
 GroupBasedLoadBalancer Sequence Diagram.xml, HBASE-6721-DesigDoc.pdf, 
 HBASE-6721-DesigDoc.pdf, HBASE-6721-DesigDoc.pdf, HBASE-6721-DesigDoc.pdf, 
 HBASE-6721_0.98_2.patch, HBASE-6721_10.patch, HBASE-6721_11.patch, 
 HBASE-6721_12.patch, HBASE-6721_8.patch, HBASE-6721_9.patch, 
 HBASE-6721_9.patch, HBASE-6721_94.patch, HBASE-6721_94.patch, 
 HBASE-6721_94_2.patch, HBASE-6721_94_3.patch, HBASE-6721_94_3.patch, 
 HBASE-6721_94_4.patch, HBASE-6721_94_5.patch, HBASE-6721_94_6.patch, 
 HBASE-6721_94_7.patch, HBASE-6721_98_1.patch, HBASE-6721_98_2.patch, 
 HBASE-6721_hbase-6721_addendum.patch, HBASE-6721_trunk.patch, 
 HBASE-6721_trunk.patch, HBASE-6721_trunk.patch, HBASE-6721_trunk1.patch, 
 HBASE-6721_trunk2.patch, balanceCluster Sequence Diagram.svg, 
 immediateAssignments Sequence Diagram.svg, randomAssignment Sequence 
 Diagram.svg, retainAssignment Sequence Diagram.svg, roundRobinAssignment 
 Sequence Diagram.svg


 In multi-tenant deployments of HBase, it is likely that a RegionServer will 
 be serving out regions from a number of different tables owned by various 
 client applications. Being able to group a subset of running RegionServers 
 and assign specific tables to it, provides a client application a level of 
 isolation and resource allocation.
 The proposal essentially is to have an AssignmentManager which is aware of 
 RegionServer groups and assigns tables to region servers based on groupings. 
 Load balancing will occur on a per group basis as well. 
 This is essentially a simplification of the approach taken in HBASE-4120. See 
 attached document.





[jira] [Commented] (HBASE-14309) Allow load balancer to operate when there is region in transition by adding force flag

2015-08-28 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720203#comment-14720203
 ] 

stack commented on HBASE-14309:
---

bq. The assumption of this change is that the operator has run hbck and has a 
reasonable idea why regions are stuck in transition before using the force flag.

There was nothing to this effect in patches until v6. Now it has the following:

29  WARNING: Have you run hbck, etc, to determine the cause for region 
stuck in transition
30before using the force flag ?
31  Examples:

Should be clearer that it can do damage, and make less reference to 'hbck',
'etc.', and 'rit'. 'For experts only. Forcing a balance may do more damage than
repair when assignment is confused.' Would be good to then link to a section in
the refguide on the pluses/minuses/implications.

Agree it is good to expose tools to help in the extreme. Dangerous options that
may do more damage than good need proper couching with warnings, including
justification for why we need the option when you open the issue. The
original, la-de-dah text is: "This issue adds boolean parameter, force, to
'balancer' command so that admin can force region balancing even when there is
region in transition - assuming RIT being transient." which comes across as if you
have nothing better to do all day but add options on commands when in fact you
have good cause.

bq. In patch v6, added guard against system table region in transition even if 
force parameter carries value of true.

What is this about? Why, if the operator thinks a force is needed, do we
internally pass on it because a system table is in transition? Seems arbitrary. You
provide no reasoning for the exclusion.


 Allow load balancer to operate when there is region in transition by adding 
 force flag
 --

 Key: HBASE-14309
 URL: https://issues.apache.org/jira/browse/HBASE-14309
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 2.0.0, 1.3.0

 Attachments: 14309-branch-1.1.txt, 14309-v1.txt, 14309-v2.txt, 
 14309-v3.txt, 14309-v4.txt, 14309-v5-branch-1.txt, 14309-v5.txt, 
 14309-v5.txt, 14309-v6.txt


 This issue adds boolean parameter, force, to 'balancer' command so that admin 
 can force region balancing even when there is region in transition - assuming 
 RIT being transient.
 This enhancement was requested by some customer.
 The assumption of this change is that the operator has run hbck and has a 
 reasonable idea why regions are stuck in transition before using the force 
 flag.
 There was a recent event at the customer where a cluster ended up with a 
 small number of regionservers hosting most of the regions on the cluster (one 
 regionserver had 50% of the roughly 20,000 regions). The balancer couldn't be 
 run due to the small number of regions that were stuck in transition. The 
 admin ended up killing the regionservers so that reassignment would yield a 
 more equitable distribution of the regions.
 On a different cluster, there was a single store file that had corrupt HDFS 
 blocks (the SSDs on the cluster were known to lose data). However, since this 
 single region (out of 10s of 1000s of regions on this cluster) was stuck in 
 transition, the balancer couldn't run.
 While the state keeping in HBase isn't so good yet that the admin can kick 
 off the balancer automatically in such scenarios knowing when it is safe to 
 do so and when it is not, having this option available for the operator to 
 use as he / she sees fit seems prudent.





[jira] [Commented] (HBASE-14309) Allow load balancer to operate when there is region in transition by adding force flag

2015-08-28 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720221#comment-14720221
 ] 

Ted Yu commented on HBASE-14309:


I can adopt the suggested warning in balancer.rb in the next patch.

w.r.t. the guard against system table regions in transition: the reasoning is
that if, e.g., hbase:meta is in transition, the results of reassignments can't
be written. I can narrow the condition to checking hbase:meta only, if you think
that is proper.

 Allow load balancer to operate when there is region in transition by adding 
 force flag
 --

 Key: HBASE-14309
 URL: https://issues.apache.org/jira/browse/HBASE-14309
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 2.0.0, 1.3.0

 Attachments: 14309-branch-1.1.txt, 14309-v1.txt, 14309-v2.txt, 
 14309-v3.txt, 14309-v4.txt, 14309-v5-branch-1.txt, 14309-v5.txt, 
 14309-v5.txt, 14309-v6.txt


 This issue adds boolean parameter, force, to 'balancer' command so that admin 
 can force region balancing even when there is region in transition - assuming 
 RIT being transient.
 This enhancement was requested by some customer.
 The assumption of this change is that the operator has run hbck and has a 
 reasonable idea why regions are stuck in transition before using the force 
 flag.
 There was a recent event at the customer where a cluster ended up with a 
 small number of regionservers hosting most of the regions on the cluster (one 
 regionserver had 50% of the roughly 20,000 regions). The balancer couldn't be 
 run due to the small number of regions that were stuck in transition. The 
 admin ended up killing the regionservers so that reassignment would yield a 
 more equitable distribution of the regions.
 On a different cluster, there was a single store file that had corrupt HDFS 
 blocks (the SSDs on the cluster were known to lose data). However, since this 
 single region (out of 10s of 1000s of regions on this cluster) was stuck in 
 transition, the balancer couldn't run.
 While the state keeping in HBase isn't so good yet that the admin can kick 
 off the balancer automatically in such scenarios knowing when it is safe to 
 do so and when it is not, having this option available for the operator to 
 use as he / she sees fit seems prudent.





[jira] [Updated] (HBASE-14309) Allow load balancer to operate when there is region in transition by adding force flag

2015-08-28 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14309:
---
Release Note: 
This issue adds boolean parameter, force, to 'balancer' command so that admin 
can force region balancing even when there is region (other than hbase:meta) in 
transition - assuming RIT being transient.

WARNING: For experts only. Forcing a balance may do more damage than repair
when assignment is confused.

 Allow load balancer to operate when there is region in transition by adding 
 force flag
 --

 Key: HBASE-14309
 URL: https://issues.apache.org/jira/browse/HBASE-14309
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 2.0.0, 1.3.0

 Attachments: 14309-branch-1.1.txt, 14309-v1.txt, 14309-v2.txt, 
 14309-v3.txt, 14309-v4.txt, 14309-v5-branch-1.txt, 14309-v5.txt, 
 14309-v5.txt, 14309-v6.txt, 14309-v7.txt


 This issue adds boolean parameter, force, to 'balancer' command so that admin 
 can force region balancing even when there is region in transition - assuming 
 RIT being transient.
 This enhancement was requested by some customer.
 The assumption of this change is that the operator has run hbck and has a 
 reasonable idea why regions are stuck in transition before using the force 
 flag.
 There was a recent event at the customer where a cluster ended up with a 
 small number of regionservers hosting most of the regions on the cluster (one 
 regionserver had 50% of the roughly 20,000 regions). The balancer couldn't be 
 run due to the small number of regions that were stuck in transition. The 
 admin ended up killing the regionservers so that reassignment would yield a 
 more equitable distribution of the regions.
 On a different cluster, there was a single store file that had corrupt HDFS 
 blocks (the SSDs on the cluster were known to lose data). However, since this 
 single region (out of 10s of 1000s of regions on this cluster) was stuck in 
 transition, the balancer couldn't run.
 While the state keeping in HBase isn't so good yet that the admin can kick 
 off the balancer automatically in such scenarios knowing when it is safe to 
 do so and when it is not, having this option available for the operator to 
 use as he / she sees fit seems prudent.





[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment

2015-08-28 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720273#comment-14720273
 ] 

Elliott Clark commented on HBASE-6721:
--

I'm still officially -1 as long as this is built into the core. 99.99% (assuming
10k HBase users) of HBase users should never, ever run something like this. It
will make an already very operationally complex system unworkable. Because of
that, anything that's in the default code, adds to the default admin, and is
built in is something I can't see ever being OK with.

All of this has already been tried at FB and it was a mistake. This ends up
looking functionally very similar to 0.89-fb's favored nodes (only assign
regions to a specific set of machines configured by the admin). It's so
bad that almost every time we try to solve an issue on a cluster with favored
nodes, the first thing we do is turn off the balancer so that we don't have to
worry about which nodes are configured to have which regions. That's literally
step one of debugging: turn off this feature.
We'll have a party when FB no longer has this operational nightmare. I won't 
sign anyone up for the same. I won't sign myself up for the same.

So I'm -1 on anything that I can't completely remove. rm -rf.

* The assignment manager is already too complex; adding more complexity is awful.
* Region movement is already too stateful; adding more state is awful.
* Configuration of HBase is already way too complex; multiplying that across
multiple groups is awful.
* Admin already has way too many things for users to do that cause issues;
adding more ways for a cluster to be borked is awful.

 RegionServer Group based Assignment
 ---

 Key: HBASE-6721
 URL: https://issues.apache.org/jira/browse/HBASE-6721
 Project: HBase
  Issue Type: New Feature
Reporter: Francis Liu
Assignee: Francis Liu
  Labels: hbase-6721
 Attachments: 6721-master-webUI.patch, HBASE-6721 
 GroupBasedLoadBalancer Sequence Diagram.xml, HBASE-6721-DesigDoc.pdf, 
 HBASE-6721-DesigDoc.pdf, HBASE-6721-DesigDoc.pdf, HBASE-6721-DesigDoc.pdf, 
 HBASE-6721_0.98_2.patch, HBASE-6721_10.patch, HBASE-6721_11.patch, 
 HBASE-6721_12.patch, HBASE-6721_8.patch, HBASE-6721_9.patch, 
 HBASE-6721_9.patch, HBASE-6721_94.patch, HBASE-6721_94.patch, 
 HBASE-6721_94_2.patch, HBASE-6721_94_3.patch, HBASE-6721_94_3.patch, 
 HBASE-6721_94_4.patch, HBASE-6721_94_5.patch, HBASE-6721_94_6.patch, 
 HBASE-6721_94_7.patch, HBASE-6721_98_1.patch, HBASE-6721_98_2.patch, 
 HBASE-6721_hbase-6721_addendum.patch, HBASE-6721_trunk.patch, 
 HBASE-6721_trunk.patch, HBASE-6721_trunk.patch, HBASE-6721_trunk1.patch, 
 HBASE-6721_trunk2.patch, balanceCluster Sequence Diagram.svg, 
 immediateAssignments Sequence Diagram.svg, randomAssignment Sequence 
 Diagram.svg, retainAssignment Sequence Diagram.svg, roundRobinAssignment 
 Sequence Diagram.svg


 In multi-tenant deployments of HBase, it is likely that a RegionServer will 
 be serving out regions from a number of different tables owned by various 
 client applications. Being able to group a subset of running RegionServers 
 and assign specific tables to it, provides a client application a level of 
 isolation and resource allocation.
 The proposal essentially is to have an AssignmentManager which is aware of 
 RegionServer groups and assigns tables to region servers based on groupings. 
 Load balancing will occur on a per group basis as well. 
 This is essentially a simplification of the approach taken in HBASE-4120. See 
 attached document.





[jira] [Updated] (HBASE-14309) Allow load balancer to operate when there is region in transition by adding force flag

2015-08-28 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14309:
---
Attachment: 14309-v7.txt

Nice catch, Jerry.

 Allow load balancer to operate when there is region in transition by adding 
 force flag
 --

 Key: HBASE-14309
 URL: https://issues.apache.org/jira/browse/HBASE-14309
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 2.0.0, 1.3.0

 Attachments: 14309-branch-1.1.txt, 14309-v1.txt, 14309-v2.txt, 
 14309-v3.txt, 14309-v4.txt, 14309-v5-branch-1.txt, 14309-v5.txt, 
 14309-v5.txt, 14309-v6.txt, 14309-v7.txt


 This issue adds a boolean parameter, force, to the 'balancer' command so that an 
 admin can force region balancing even when there is a region in transition, 
 assuming the RIT is transient.
 This enhancement was requested by a customer.
 The assumption of this change is that the operator has run hbck and has a 
 reasonable idea why regions are stuck in transition before using the force 
 flag.
 There was a recent event at the customer where a cluster ended up with a 
 small number of regionservers hosting most of the regions on the cluster (one 
 regionserver had 50% of the roughly 20,000 regions). The balancer couldn't be 
 run due to the small number of regions that were stuck in transition. The 
 admin ended up killing the regionservers so that reassignment would yield a 
 more equitable distribution of the regions.
 On a different cluster, there was a single store file that had corrupt HDFS 
 blocks (the SSDs on the cluster were known to lose data). However, since this 
 single region (out of 10s of 1000s of regions on this cluster) was stuck in 
 transition, the balancer couldn't run.
 While the state keeping in HBase isn't so good yet that the admin can kick 
 off the balancer automatically in such scenarios knowing when it is safe to 
 do so and when it is not, having this option available for the operator to 
 use as he / she sees fit seems prudent.





[jira] [Commented] (HBASE-14322) Master still not using more than it's priority threads

2015-08-28 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720383#comment-14720383
 ] 

Mikhail Antonov commented on HBASE-14322:
-

Sorry for the delay, and thanks for the clarification. I see now: the piece that 
handles region transition reports about system tables was there before, but it was 
misplaced in AnnotationReadingPriorityFunction, so when requests come from the 
admin user we never reach that code path. Let me take a look a bit more.

 Master still not using more than it's priority threads
 --

 Key: HBASE-14322
 URL: https://issues.apache.org/jira/browse/HBASE-14322
 Project: HBase
  Issue Type: Sub-task
  Components: master, rpc
Affects Versions: 1.2.0
Reporter: Elliott Clark
Assignee: Elliott Clark
 Fix For: 1.2.0

 Attachments: HBASE-14322-v1.patch, HBASE-14322-v2.patch, 
 HBASE-14322-v3-branch-1.patch, HBASE-14322-v3.patch, 
 HBASE-14322-v4-branch-1.patch, HBASE-14322-v5-branch-1.patch, 
 HBASE-14322-v6.patch, HBASE-14322.patch


 Master and regionserver will be running as the same user.
 Superusers by default adds the current user as a super user.
 Super users' requests always go to the priority threads.





[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment

2015-08-28 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720395#comment-14720395
 ] 

Francis Liu commented on HBASE-6721:


{quote}
All of this has already been tried at FB and it was a mistake. This ends up 
looking functionally very similar to 0.89-fb's favored nodes. ( Only assign 
regions to specific set of machines that's configured by the admin ). It's so 
bad that almost every time we try and solve an issue on a cluster with favored 
nodes, the first thing we do is turn off the balancer so that we don't have to 
worry about which nodes are configured to have which regions. That's literally 
step one of debugging. Turn off this feature.
We'll have a party when FB no longer has this operational nightmare. I won't 
sign anyone up for the same. I won't sign myself up for the same.
{quote}
IMHO that's not an objective comparison. Favored Nodes and Region Server groups 
are very different. Their use cases are very different and their 
implementations are also very different.  

As for how useful it is for us (and potentially for others), if we actually 
removed region server groups I'm pretty sure our HBase team and SEs would 
revolt :-). If we didn't have this feature we would be managing around 80 hbase 
clusters right now instead of the 6 multi-tenant clusters we are currently 
running. Step one of debugging is not turning off the balancer; that would make 
things worse. In fact, one of the useful features of region server groups is 
quickly isolating tables to a new group if they are misbehaving or their 
workload has changed. This can be done in a few minutes, if not seconds. 

{quote}
Assignment manager is already too complex adding more complexity is awful
{quote}
If you look at the patch, the change in AM is an extra 20 lines of code. 6 
lines are just bugfixes that should be done to AM anyway, and the other 14 lines, 
which are fairly straightforward, we can even live without if that's what it 
takes.

{quote}
region movement is already too stateful. Adding more is awful
{quote}
Adding more states?

{quote}
Configuration of HBase is already way too complex. Multiplying that with 
multiple groups is awful.
{quote}
Not sure what the concern here is? That there's an option to configure a 
different balancer? 







 RegionServer Group based Assignment
 ---

 Key: HBASE-6721
 URL: https://issues.apache.org/jira/browse/HBASE-6721
 Project: HBase
  Issue Type: New Feature
Reporter: Francis Liu
Assignee: Francis Liu
  Labels: hbase-6721
 Attachments: 6721-master-webUI.patch, HBASE-6721 
 GroupBasedLoadBalancer Sequence Diagram.xml, HBASE-6721-DesigDoc.pdf, 
 HBASE-6721-DesigDoc.pdf, HBASE-6721-DesigDoc.pdf, HBASE-6721-DesigDoc.pdf, 
 HBASE-6721_0.98_2.patch, HBASE-6721_10.patch, HBASE-6721_11.patch, 
 HBASE-6721_12.patch, HBASE-6721_8.patch, HBASE-6721_9.patch, 
 HBASE-6721_9.patch, HBASE-6721_94.patch, HBASE-6721_94.patch, 
 HBASE-6721_94_2.patch, HBASE-6721_94_3.patch, HBASE-6721_94_3.patch, 
 HBASE-6721_94_4.patch, HBASE-6721_94_5.patch, HBASE-6721_94_6.patch, 
 HBASE-6721_94_7.patch, HBASE-6721_98_1.patch, HBASE-6721_98_2.patch, 
 HBASE-6721_hbase-6721_addendum.patch, HBASE-6721_trunk.patch, 
 HBASE-6721_trunk.patch, HBASE-6721_trunk.patch, HBASE-6721_trunk1.patch, 
 HBASE-6721_trunk2.patch, balanceCluster Sequence Diagram.svg, 
 immediateAssignments Sequence Diagram.svg, randomAssignment Sequence 
 Diagram.svg, retainAssignment Sequence Diagram.svg, roundRobinAssignment 
 Sequence Diagram.svg


 In multi-tenant deployments of HBase, it is likely that a RegionServer will 
 be serving out regions from a number of different tables owned by various 
 client applications. Being able to group a subset of running RegionServers 
 and assign specific tables to it, provides a client application a level of 
 isolation and resource allocation.
 The proposal essentially is to have an AssignmentManager which is aware of 
 RegionServer groups and assigns tables to region servers based on groupings. 
 Load balancing will occur on a per group basis as well. 
 This is essentially a simplification of the approach taken in HBASE-4120. See 
 attached document.





[jira] [Commented] (HBASE-14322) Master still not using more than it's priority threads

2015-08-28 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720408#comment-14720408
 ] 

Mikhail Antonov commented on HBASE-14322:
-

Looks good to me, +1.

A few nits: QosTestHelper needs an audience annotation and some class javadoc? 
There is also an unused import in the newly added test.

I'd consider adding your first comment (basically, reporting that a region is in 
transition needs to access meta; if reporting that meta is online is stuck 
behind requests trying to access meta, then everything times out and fails) 
to the javadoc somewhere in MAPRF?

 Master still not using more than it's priority threads
 --

 Key: HBASE-14322
 URL: https://issues.apache.org/jira/browse/HBASE-14322
 Project: HBase
  Issue Type: Sub-task
  Components: master, rpc
Affects Versions: 1.2.0
Reporter: Elliott Clark
Assignee: Elliott Clark
 Fix For: 1.2.0

 Attachments: HBASE-14322-v1.patch, HBASE-14322-v2.patch, 
 HBASE-14322-v3-branch-1.patch, HBASE-14322-v3.patch, 
 HBASE-14322-v4-branch-1.patch, HBASE-14322-v5-branch-1.patch, 
 HBASE-14322-v6.patch, HBASE-14322.patch


 Master and regionserver will be running as the same user.
 Superusers by default adds the current user as a super user.
 Super users' requests always go to the priority threads.





[jira] [Commented] (HBASE-14322) Master still not using more than it's priority threads

2015-08-28 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720323#comment-14720323
 ] 

Elliott Clark commented on HBASE-14322:
---

Ping?

This is running in production for us. It made it possible to restart the master 
while doing a rolling deploy (even if things are still weird, they work with 
this patch).

 Master still not using more than it's priority threads
 --

 Key: HBASE-14322
 URL: https://issues.apache.org/jira/browse/HBASE-14322
 Project: HBase
  Issue Type: Sub-task
  Components: master, rpc
Affects Versions: 1.2.0
Reporter: Elliott Clark
Assignee: Elliott Clark
 Fix For: 1.2.0

 Attachments: HBASE-14322-v1.patch, HBASE-14322-v2.patch, 
 HBASE-14322-v3-branch-1.patch, HBASE-14322-v3.patch, 
 HBASE-14322-v4-branch-1.patch, HBASE-14322-v5-branch-1.patch, 
 HBASE-14322-v6.patch, HBASE-14322.patch


 Master and regionserver will be running as the same user.
 Superusers by default adds the current user as a super user.
 Super users' requests always go to the priority threads.





[jira] [Updated] (HBASE-14309) Allow load balancer to operate when there is region in transition by adding force flag

2015-08-28 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14309:
---
Attachment: (was: 14309-v7.txt)

 Allow load balancer to operate when there is region in transition by adding 
 force flag
 --

 Key: HBASE-14309
 URL: https://issues.apache.org/jira/browse/HBASE-14309
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 2.0.0, 1.3.0

 Attachments: 14309-branch-1.1.txt, 14309-v1.txt, 14309-v2.txt, 
 14309-v3.txt, 14309-v4.txt, 14309-v5-branch-1.txt, 14309-v5.txt, 
 14309-v5.txt, 14309-v6.txt


 This issue adds a boolean parameter, force, to the 'balancer' command so that an 
 admin can force region balancing even when there is a region in transition, 
 assuming the RIT is transient.
 This enhancement was requested by a customer.
 The assumption of this change is that the operator has run hbck and has a 
 reasonable idea why regions are stuck in transition before using the force 
 flag.
 There was a recent event at the customer where a cluster ended up with a 
 small number of regionservers hosting most of the regions on the cluster (one 
 regionserver had 50% of the roughly 20,000 regions). The balancer couldn't be 
 run due to the small number of regions that were stuck in transition. The 
 admin ended up killing the regionservers so that reassignment would yield a 
 more equitable distribution of the regions.
 On a different cluster, there was a single store file that had corrupt HDFS 
 blocks (the SSDs on the cluster were known to lose data). However, since this 
 single region (out of 10s of 1000s of regions on this cluster) was stuck in 
 transition, the balancer couldn't run.
 While the state keeping in HBase isn't so good yet that the admin can kick 
 off the balancer automatically in such scenarios knowing when it is safe to 
 do so and when it is not, having this option available for the operator to 
 use as he / she sees fit seems prudent.





[jira] [Commented] (HBASE-13965) Stochastic Load Balancer JMX Metrics

2015-08-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720350#comment-14720350
 ] 

Hudson commented on HBASE-13965:


FAILURE: Integrated in HBase-0.98 #1103 (See 
[https://builds.apache.org/job/HBase-0.98/1103/])
HBASE-14289 Backport HBASE-13965 'Stochastic Load Balancer JMX Metrics' to 0.98 
(tedyu: rev 7a0d36fecfebb1f61811181347f90a9b88a9d09b)
* 
hbase-hadoop1-compat/src/main/resources/META-INF/services/org.apache.hadoop.hbase.master.balancer.MetricsStochasticBalancerSource
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/FavoredNodeLoadBalancer.java
* 
hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/master/balancer/MetricsStochasticBalancerSource.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/MetricsStochasticBalancer.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestBaseLoadBalancer.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/MetricsBalancer.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/LoadBalancer.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/BaseLoadBalancer.java
* 
hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/master/balancer/MetricsStochasticBalancerSourceImpl.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/TestStochasticBalancerJmxMetrics.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java
* 
hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/master/balancer/MetricsStochasticBalancerSourceImpl.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java
* hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStates.java
* 
hbase-hadoop2-compat/src/main/resources/META-INF/services/org.apache.hadoop.hbase.master.balancer.MetricsStochasticBalancerSource
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/SimpleLoadBalancer.java


 Stochastic Load Balancer JMX Metrics
 

 Key: HBASE-13965
 URL: https://issues.apache.org/jira/browse/HBASE-13965
 Project: HBase
  Issue Type: Improvement
  Components: Balancer, metrics
Reporter: Lei Chen
Assignee: Lei Chen
 Fix For: 2.0.0, 1.3.0

 Attachments: 13965-addendum.txt, HBASE-13965-branch-1-v2.patch, 
 HBASE-13965-branch-1.patch, HBASE-13965-v10.patch, HBASE-13965-v11.patch, 
 HBASE-13965-v3.patch, HBASE-13965-v4.patch, HBASE-13965-v5.patch, 
 HBASE-13965-v6.patch, HBASE-13965-v7.patch, HBASE-13965-v8.patch, 
 HBASE-13965-v9.patch, HBASE-13965_v2.patch, HBase-13965-JConsole.png, 
 HBase-13965-v1.patch, stochasticloadbalancerclasses_v2.png


 Today’s default HBase load balancer (the Stochastic load balancer) is cost 
 function based. The cost function weights are tunable but no visibility into 
 those cost function results is directly provided.
 A driving example is a cluster we have been tuning which has skewed rack size 
 (one rack has half the nodes of the other few racks). We are tuning the 
 cluster for uniform response time from all region servers with the ability to 
 tolerate a rack failure. Balancing LocalityCost, RegionReplicaRack Cost and 
 RegionCountSkew Cost is difficult without a way to attribute each cost 
 function’s contribution to overall cost. 
 What this jira proposes is to provide visibility via JMX into each cost 
 function of the stochastic load balancer, as well as the overall cost of the 
 balancing plan.
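To make the weighted structure concrete, here is a small illustrative sketch (the class, map layout, cost-function names, and weight values are assumptions for illustration, not the actual StochasticLoadBalancer code): the overall cost is a weighted sum of named cost functions, and the proposal would export each named term alongside the total via JMX.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CostReport {
  // Weighted sum over named cost functions; each (weight, cost) term is
  // what the proposal would expose via JMX next to the overall total.
  public static double totalCost(Map<String, double[]> costs) {
    double total = 0.0;
    for (double[] weightAndCost : costs.values()) {
      total += weightAndCost[0] * weightAndCost[1]; // weight * cost
    }
    return total;
  }

  public static void main(String[] args) {
    Map<String, double[]> costs = new LinkedHashMap<>();
    costs.put("LocalityCost", new double[] {25.0, 0.4});          // 25 * 0.4 = 10
    costs.put("RegionCountSkewCost", new double[] {500.0, 0.1});  // 500 * 0.1 = 50
    System.out.println(totalCost(costs));
  }
}
```

With per-term visibility, an operator tuning the skewed-rack cluster described above could see which term dominates the total before adjusting weights.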





[jira] [Commented] (HBASE-14289) Backport HBASE-13965 'Stochastic Load Balancer JMX Metrics' to 0.98

2015-08-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720351#comment-14720351
 ] 

Hudson commented on HBASE-14289:


FAILURE: Integrated in HBase-0.98 #1103 (See 
[https://builds.apache.org/job/HBase-0.98/1103/])
HBASE-14289 Backport HBASE-13965 'Stochastic Load Balancer JMX Metrics' to 0.98 
(tedyu: rev 7a0d36fecfebb1f61811181347f90a9b88a9d09b)
* 
hbase-hadoop1-compat/src/main/resources/META-INF/services/org.apache.hadoop.hbase.master.balancer.MetricsStochasticBalancerSource
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/FavoredNodeLoadBalancer.java
* 
hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/master/balancer/MetricsStochasticBalancerSource.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/MetricsStochasticBalancer.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestBaseLoadBalancer.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/MetricsBalancer.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/LoadBalancer.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/BaseLoadBalancer.java
* 
hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/master/balancer/MetricsStochasticBalancerSourceImpl.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/TestStochasticBalancerJmxMetrics.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java
* 
hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/master/balancer/MetricsStochasticBalancerSourceImpl.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java
* hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStates.java
* 
hbase-hadoop2-compat/src/main/resources/META-INF/services/org.apache.hadoop.hbase.master.balancer.MetricsStochasticBalancerSource
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/SimpleLoadBalancer.java


 Backport HBASE-13965 'Stochastic Load Balancer JMX Metrics' to 0.98
 ---

 Key: HBASE-14289
 URL: https://issues.apache.org/jira/browse/HBASE-14289
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.98.15

 Attachments: 14289-0.98-v2.txt, 14289-0.98-v3.txt, 14289-0.98-v4.txt, 
 14289-0.98-v5.txt


 The default HBase load balancer (the Stochastic load balancer) is cost 
 function based. The cost function weights are tunable but no visibility into 
 those cost function results is directly provided.
 This issue backports HBASE-13965 to 0.98 branch to provide visibility via JMX 
 into each cost function of the stochastic load balancer, as well as the 
 overall cost of the balancing plan.





[jira] [Reopened] (HBASE-14289) Backport HBASE-13965 'Stochastic Load Balancer JMX Metrics' to 0.98

2015-08-28 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reopened HBASE-14289:


 Backport HBASE-13965 'Stochastic Load Balancer JMX Metrics' to 0.98
 ---

 Key: HBASE-14289
 URL: https://issues.apache.org/jira/browse/HBASE-14289
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.98.15

 Attachments: 14289-0.98-v2.txt, 14289-0.98-v3.txt, 14289-0.98-v4.txt, 
 14289-0.98-v5.txt


 The default HBase load balancer (the Stochastic load balancer) is cost 
 function based. The cost function weights are tunable but no visibility into 
 those cost function results is directly provided.
 This issue backports HBASE-13965 to 0.98 branch to provide visibility via JMX 
 into each cost function of the stochastic load balancer, as well as the 
 overall cost of the balancing plan.





[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment

2015-08-28 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720745#comment-14720745
 ] 

Nick Dimiduk commented on HBASE-6721:
-

{quote}
bq.This is not only a multi-tenancy solution but an isolation solution.
Isolation is the worse solution for getting multi-tenancy.
{quote}

Maybe you can elaborate on this point? Seems we need to quarantine users from 
each other, whether that's physically as per this patch or via imposed resource 
controls within a single process. Either way, quarantine is the same as 
isolation; we're isolating users from each other to achieve fairness of 
service delivery in a multi-tenant environment.

 RegionServer Group based Assignment
 ---

 Key: HBASE-6721
 URL: https://issues.apache.org/jira/browse/HBASE-6721
 Project: HBase
  Issue Type: New Feature
Reporter: Francis Liu
Assignee: Francis Liu
  Labels: hbase-6721
 Attachments: 6721-master-webUI.patch, HBASE-6721 
 GroupBasedLoadBalancer Sequence Diagram.xml, HBASE-6721-DesigDoc.pdf, 
 HBASE-6721-DesigDoc.pdf, HBASE-6721-DesigDoc.pdf, HBASE-6721-DesigDoc.pdf, 
 HBASE-6721_0.98_2.patch, HBASE-6721_10.patch, HBASE-6721_11.patch, 
 HBASE-6721_12.patch, HBASE-6721_8.patch, HBASE-6721_9.patch, 
 HBASE-6721_9.patch, HBASE-6721_94.patch, HBASE-6721_94.patch, 
 HBASE-6721_94_2.patch, HBASE-6721_94_3.patch, HBASE-6721_94_3.patch, 
 HBASE-6721_94_4.patch, HBASE-6721_94_5.patch, HBASE-6721_94_6.patch, 
 HBASE-6721_94_7.patch, HBASE-6721_98_1.patch, HBASE-6721_98_2.patch, 
 HBASE-6721_hbase-6721_addendum.patch, HBASE-6721_trunk.patch, 
 HBASE-6721_trunk.patch, HBASE-6721_trunk.patch, HBASE-6721_trunk1.patch, 
 HBASE-6721_trunk2.patch, balanceCluster Sequence Diagram.svg, 
 immediateAssignments Sequence Diagram.svg, randomAssignment Sequence 
 Diagram.svg, retainAssignment Sequence Diagram.svg, roundRobinAssignment 
 Sequence Diagram.svg


 In multi-tenant deployments of HBase, it is likely that a RegionServer will 
 be serving out regions from a number of different tables owned by various 
 client applications. Being able to group a subset of running RegionServers 
 and assign specific tables to it, provides a client application a level of 
 isolation and resource allocation.
 The proposal essentially is to have an AssignmentManager which is aware of 
 RegionServer groups and assigns tables to region servers based on groupings. 
 Load balancing will occur on a per group basis as well. 
 This is essentially a simplification of the approach taken in HBASE-4120. See 
 attached document.





[jira] [Commented] (HBASE-14289) Backport HBASE-13965 'Stochastic Load Balancer JMX Metrics' to 0.98

2015-08-28 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720751#comment-14720751
 ] 

Ted Yu commented on HBASE-14289:


Reverted.

Let me know the branch of Phoenix you used for compilation.

 Backport HBASE-13965 'Stochastic Load Balancer JMX Metrics' to 0.98
 ---

 Key: HBASE-14289
 URL: https://issues.apache.org/jira/browse/HBASE-14289
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.98.15

 Attachments: 14289-0.98-v2.txt, 14289-0.98-v3.txt, 14289-0.98-v4.txt, 
 14289-0.98-v5.txt


 The default HBase load balancer (the Stochastic load balancer) is cost 
 function based. The cost function weights are tunable but no visibility into 
 those cost function results is directly provided.
 This issue backports HBASE-13965 to 0.98 branch to provide visibility via JMX 
 into each cost function of the stochastic load balancer, as well as the 
 overall cost of the balancing plan.





[jira] [Commented] (HBASE-14315) Save one call to KeyValueHeap.peek per row

2015-08-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720758#comment-14720758
 ] 

Hudson commented on HBASE-14315:


FAILURE: Integrated in HBase-1.1 #641 (See 
[https://builds.apache.org/job/HBase-1.1/641/])
HBASE-14315 Save one call to KeyValueHeap.peek per row. (larsh: rev 
74c5ee41ed10971b8d7cc515695d3ca07faff13a)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java


 Save one call to KeyValueHeap.peek per row
 --

 Key: HBASE-14315
 URL: https://issues.apache.org/jira/browse/HBASE-14315
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 2.0.0, 1.1.2, 1.3.0, 0.98.15, 1.2.1, 1.0.3

 Attachments: 14315-0.98.txt, 14315-master.txt


 Another one of my micro optimizations.
 In StoreScanner.next(...) we can actually save a call to KeyValueHeap.peek, 
 which in my runs of scan-heavy loads shows up at the top.
 Depending on the run and data, this can save between 3 and 10% of runtime.
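The pattern can be sketched generically (illustrative Java over a plain priority queue, not the actual StoreScanner/KeyValueHeap code): cache the element returned by peek() and reuse it, rather than peeking once to test for exhaustion and a second time to read the value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class PeekOnce {
  // Drain the heap with exactly one peek() per element: the cached value
  // serves both as the loop condition and as the element to process.
  public static List<Integer> drain(Queue<Integer> heap) {
    List<Integer> out = new ArrayList<>();
    Integer current = heap.peek();   // single peek for this element
    while (current != null) {
      out.add(current);              // "process" the cached value
      heap.poll();                   // consume it
      current = heap.peek();         // peek once for the next iteration
    }
    return out;
  }
}
```

In a tight per-row scan loop, removing the redundant peek per row is exactly the kind of saving the description reports.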





[jira] [Commented] (HBASE-14261) Enhance Chaos Monkey framework by adding zookeeper and datanode fault injections.

2015-08-28 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720761#comment-14720761
 ] 

Enis Soztutar commented on HBASE-14261:
---

Thanks Srikanth. The patch looks almost there. Some more comments: 

bq. Currently, zk nodes are being read using ZKServerTool and the actions get 
triggered on those nodes pointed to by the output. Not sure what exactly was 
being pointed to here.
Sorry I missed this part in the first patch. 

This still prints to System.out and creates a new Configuration. We want to 
pass the conf in and only print to System.out when ZkServerTool.main() is 
called. 
{code}
+  public static ServerName[] readZKNodes() {
{code}
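The shape of the requested change, sketched with hypothetical names (the real ZKServerTool works with a Hadoop Configuration; java.util.Properties stands in here to keep the sketch self-contained): the reading method accepts the caller's configuration instead of constructing its own, and printing to System.out is confined to main().

```java
import java.util.Properties;

public class ZkServerToolSketch {
  // Accept the caller's configuration rather than creating a new one,
  // and return the data instead of printing it.
  public static String[] readZKNodes(Properties conf) {
    String quorum = conf.getProperty("hbase.zookeeper.quorum", "localhost");
    return quorum.split(",");
  }

  // Printing happens only when the class is run as a command-line tool.
  public static void main(String[] args) {
    for (String node : readZKNodes(new Properties())) {
      System.out.println(node);
    }
  }
}
```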

This should be ZOOKEEPER_SERVER since it is not a daemon of HBase. 
{code}
HBASE_ZOOKEEPER("zookeeper")
{code}

Is there a way to ask the NN for the list of datanodes instead of reading the 
{{slaves}} file? In my deployment, the slaves file is under {{$HADOOP_CONF_DIR}}. 
{code}
+  for (String line: FileUtils.readLines(new File(hadoopHome + "/etc/hadoop/slaves"))) {
{code}

This should default to 'hbase' rather than 'root': 
{code}
+  return conf.get("hbase.it.clustermanager.hbase.user", "root");
{code}

In deployments I have seen, hdfs daemons and mapred daemons are run as 
different users (typically hdfs and mapred). We are only interested in killing 
hdfs datanodes for now, so maybe {{hbase.it.clustermanager.hadoop.hdfs.user}}? 
{code}
+  return conf.get("hbase.it.clustermanager.hadoop.user", "hadoop");
{code}

This does not belong in RemoteShell IMO. Passing the user directly is better. 
{code}
+private String getServiceUser() {
{code}

Better to sleep 100ms rather than 1sec. 
{code}
+  Threads.sleep(1000);
{code}

Some lines are using 4 instead of 2 spaces as indentation. 


 Enhance Chaos Monkey framework by adding zookeeper and datanode fault 
 injections.
 -

 Key: HBASE-14261
 URL: https://issues.apache.org/jira/browse/HBASE-14261
 Project: HBase
  Issue Type: Improvement
Reporter: Srikanth Srungarapu
Assignee: Srikanth Srungarapu
 Attachments: HBASE-14261-branch-1.patch, HBASE-14261.branch-1_v2.patch


 One of the shortcomings of existing ChaosMonkey framework is lack of fault 
 injections for hbase dependencies like zookeeper, hdfs etc. This patch 
 attempts to solve this problem partially by adding datanode and zk node fault 
 injections.





[jira] [Updated] (HBASE-14334) Move Memcached block cache in to it's own optional module.

2015-08-28 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-14334:
--
Attachment: HBASE-14334.patch

Just a move; classes are referred to by name so there's no compile-time dependency.

 Move Memcached block cache in to it's own optional module.
 --

 Key: HBASE-14334
 URL: https://issues.apache.org/jira/browse/HBASE-14334
 Project: HBase
  Issue Type: Bug
Reporter: Elliott Clark
Assignee: Elliott Clark
 Attachments: HBASE-14334.patch








[jira] [Updated] (HBASE-14334) Move Memcached block cache in to it's own optional module.

2015-08-28 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-14334:
--
Fix Version/s: 1.2.0
   2.0.0
Affects Version/s: 1.2.0
   Status: Patch Available  (was: Open)

 Move Memcached block cache in to it's own optional module.
 --

 Key: HBASE-14334
 URL: https://issues.apache.org/jira/browse/HBASE-14334
 Project: HBase
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Elliott Clark
Assignee: Elliott Clark
 Fix For: 2.0.0, 1.2.0

 Attachments: HBASE-14334.patch








[jira] [Updated] (HBASE-14258) Make region_mover.rb script case insensitive with regard to hostname

2015-08-28 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-14258:
--
Attachment: HBASE-14258.patch.add

Addendum fixes case sensitivity in 'load' 

 Make region_mover.rb script case insensitive with regard to hostname
 

 Key: HBASE-14258
 URL: https://issues.apache.org/jira/browse/HBASE-14258
 Project: HBase
  Issue Type: Bug
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov
Priority: Minor
 Fix For: 2.0.0, 1.2.0, 1.3.0, 1.1.3

 Attachments: HBASE-14258.patch, HBASE-14258.patch.add


 The script is case sensitive and fails when the case of a host name being 
 unloaded does not match the case of a region server name returned by the HBase 
 API. 
 This doc clarifies IETF rules on case insensitivities in DNS:
 https://www.ietf.org/rfc/rfc4343.txt





[jira] [Commented] (HBASE-14258) Make region_mover.rb script case insensitive with regard to hostname

2015-08-28 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720779#comment-14720779
 ] 

Vladimir Rodionov commented on HBASE-14258:
---

[~te...@apache.org], here is an addendum to the original patch that fixes case 
sensitivity in the 'load' method.

 Make region_mover.rb script case insensitive with regard to hostname
 

 Key: HBASE-14258
 URL: https://issues.apache.org/jira/browse/HBASE-14258
 Project: HBase
  Issue Type: Bug
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov
Priority: Minor
 Fix For: 2.0.0, 1.2.0, 1.3.0, 1.1.3

 Attachments: HBASE-14258.patch, HBASE-14258.patch.add


 The script is case sensitive and fails when the case of a host name being 
 unloaded does not match the case of a region server name returned by the HBase 
 API. 
 This doc clarifies IETF rules on case insensitivities in DNS:
 https://www.ietf.org/rfc/rfc4343.txt





[jira] [Commented] (HBASE-14258) Make region_mover.rb script case insensitive with regard to hostname

2015-08-28 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720784#comment-14720784
 ] 

Ted Yu commented on HBASE-14258:


{code}
+if hostFromServerName == hostname.upcase and portFromServerName == port
{code}
Uppercasing the hostname can be done outside the loop, right?
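The hoisting being suggested, sketched in Java for illustration (region_mover.rb is Ruby, and the names below are made up): compute the uppercased hostname once before the loop instead of on every iteration.

```java
public class HostMatch {
  // Uppercase the needle once, outside the loop, rather than per iteration.
  public static String findServer(String[] servers, String hostname, String port) {
    String hostUp = hostname.toUpperCase();  // hoisted: computed a single time
    for (String server : servers) {
      String[] parts = server.split(",");
      // Compare hostnames case-insensitively, ports exactly.
      if (parts[0].toUpperCase().equals(hostUp) && parts[1].equals(port)) {
        return server;
      }
    }
    return null;
  }
}
```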

 Make region_mover.rb script case insensitive with regard to hostname
 

 Key: HBASE-14258
 URL: https://issues.apache.org/jira/browse/HBASE-14258
 Project: HBase
  Issue Type: Bug
Reporter: Vladimir Rodionov
Assignee: Vladimir Rodionov
Priority: Minor
 Fix For: 2.0.0, 1.2.0, 1.3.0, 1.1.3

 Attachments: HBASE-14258.patch, HBASE-14258.patch.add


 The script is case sensitive and fails when the case of a host name being 
 unloaded does not match the case of a region server name returned by the HBase 
 API. 
 This doc clarifies IETF rules on case insensitivities in DNS:
 https://www.ietf.org/rfc/rfc4343.txt




