[jira] [Commented] (HBASE-14608) testWalRollOnLowReplication has some risk to assert failed after HBASE-14600

2015-10-19 Thread Heng Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14962934#comment-14962934
 ] 

Heng Chen commented on HBASE-14608:
---

{quote}
if we add {{throw e}} as you mention, the sync abort exception will still 
interrupt this test case
{quote}
{code}
if (msg != null && msg.toLowerCase().contains("sync aborted") && i > 50) {
  return;
}
+ throw re;
}
{code}
As I said above, I think we should NOT add {{throw e}}.

Otherwise it will break this test case; we'd better just catch the "Sync 
aborted" exception.

What do you think, [~stack]?


> testWalRollOnLowReplication has some risk to assert failed after HBASE-14600
> 
>
> Key: HBASE-14608
> URL: https://issues.apache.org/jira/browse/HBASE-14608
> Project: HBase
>  Issue Type: Bug
>Reporter: Heng Chen
>Assignee: Heng Chen
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: HBASE-14608.patch, HBASE-14608_v1.patch
>
>
> After HBASE-14600, we catch the runtime exception thrown when the DN recovers 
> slowly, but there is still some risk that the assertion fails.
> For example, https://builds.apache.org/job/HBase-TRUNK/6907/testReport/
> The reason is that we catch the exception, but {{WALProcedureStore}} will 
> still stop the Procedure. So when we assert {{store.isRunning}}, it fails.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14604) Improve MoveCostFunction in StochasticLoadBalancer

2015-10-19 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-14604:
---
Attachment: HBASE-14604_98_with_ut.diff

> Improve MoveCostFunction in StochasticLoadBalancer
> --
>
> Key: HBASE-14604
> URL: https://issues.apache.org/jira/browse/HBASE-14604
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Attachments: HBASE-14604_98.diff, HBASE-14604_98_with_ut.diff
>
>
> The code in MoveCostFunction:
> {code}
> return scale(0, cluster.numRegions + META_MOVE_COST_MULT, moveCost);
> {code}
> It uses cluster.numRegions + META_MOVE_COST_MULT as the max value when scaling 
> moveCost to [0,1]. But it should use maxMoves as the max value when the cluster 
> has a lot of regions.
> Assume a cluster has 10000 regions and maxMoves is 2500; then moveCost is only 
> scaled to [0, 0.25].
> Improve moveCost by using maxMoves:
> {code}
> return scale(0, Math.min(cluster.numRegions, maxMoves) + META_MOVE_COST_MULT, 
> moveCost);
> {code}
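A small, self-contained sketch of the scaling math above ({{scale}} here is a stand-in with linear min/max clamping, and the META_MOVE_COST_MULT value is illustrative only; neither is copied from the balancer source):
{code}
public class MoveCostScaleDemo {
  // Stand-in for the balancer's scale(min, max, value): linear map into [0,1].
  static double scale(double min, double max, double value) {
    if (max <= min || value <= min) {
      return 0.0;
    }
    return Math.min(1.0, (value - min) / (max - min));
  }

  public static void main(String[] args) {
    int numRegions = 10000;
    int maxMoves = 2500;          // e.g. 25% of the regions
    int metaMoveCostMult = 9;     // illustrative constant, not the real value
    double moveCost = maxMoves;   // worst case: every allowed move is used

    // Current formula: even the worst case only reaches about 0.25.
    System.out.println(scale(0, numRegions + metaMoveCostMult, moveCost));
    // Proposed formula: the worst case scales close to 1.
    System.out.println(scale(0, Math.min(numRegions, maxMoves) + metaMoveCostMult, moveCost));
  }
}
{code}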



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14631) Region merge request should be audited with request user through proper scope of doAs() calls to region observer notifications

2015-10-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14962949#comment-14962949
 ] 

Hadoop QA commented on HBASE-14631:
---

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12767292/14631-branch-1.0.txt
  against branch-1.0 branch at commit 8e6316a80cf96f4d4cd6bd10f4c647ebf45c7e02.
  ATTACHMENT ID: 12767292

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 
2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16087//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16087//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16087//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16087//console

This message is automatically generated.

> Region merge request should be audited with request user through proper scope 
> of doAs() calls to region observer notifications
> --
>
> Key: HBASE-14631
> URL: https://issues.apache.org/jira/browse/HBASE-14631
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: 14631-branch-0.98.txt, 14631-branch-1.0.txt, 
> 14631-branch-1.txt, 14631-v1.txt
>
>
> HBASE-14475 and HBASE-14605 narrowed the scope of doAs() calls to region 
> observer notifications for region splitting.
> During review of HBASE-14605, Andrew brought up the case of region merge.
> This JIRA is to implement a similar scope-narrowing technique for region 
> merging.
> The majority of the change would be in the RegionMergeTransactionImpl class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14420) Zombie Stomping Session

2015-10-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14962921#comment-14962921
 ] 

Hadoop QA commented on HBASE-14420:
---

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12767283/none_fix.txt
  against master branch at commit f1b6355fc54616c08480c0f1b0965a244252.
  ATTACHMENT ID: 12767283

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+0 tests included{color}.  The patch appears to be a 
documentation, build,
or dev-support patch that doesn't require tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 
2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16085//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16085//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16085//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16085//console

This message is automatically generated.

> Zombie Stomping Session
> ---
>
> Key: HBASE-14420
> URL: https://issues.apache.org/jira/browse/HBASE-14420
> Project: HBase
>  Issue Type: Umbrella
>  Components: test
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Attachments: hangers.txt, none_fix (1).txt, none_fix.txt, 
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, 
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, 
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, 
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt
>
>
> Patch builds are now failing most of the time because we are dropping zombies. 
> I confirm we are doing this on non-apache build boxes too.
> Left-over zombies consume resources on build boxes (OOME cannot create native 
> threads). Having to do multiple test runs in the hope that we can get a 
> non-zombie-making build, or making (arbitrary) rulings that the zombies are 
> 'not related', is a productivity sink. And so on...
> This is an umbrella issue for a zombie stomping session that started earlier 
> this week. Will hang sub-issues off this one. Am running builds back-to-back 
> on a little cluster to turn out the monsters.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14641) Move JDO example from Wiki to Ref Guide

2015-10-19 Thread Misty Stanley-Jones (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misty Stanley-Jones updated HBASE-14641:

Attachment: HBASE-14641-v1.patch

Removed @author tag

> Move JDO example from Wiki to Ref Guide
> ---
>
> Key: HBASE-14641
> URL: https://issues.apache.org/jira/browse/HBASE-14641
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 2.0.0
>Reporter: Misty Stanley-Jones
>Assignee: Misty Stanley-Jones
> Fix For: 2.0.0
>
> Attachments: HBASE-14641-v1.patch, HBASE-14641.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14600) Make #testWalRollOnLowReplication looser still

2015-10-19 Thread Heng Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14962936#comment-14962936
 ] 

Heng Chen commented on HBASE-14600:
---


{code}
if (msg != null && msg.toLowerCase().contains("sync aborted") && i > 50) {
  return;
}
+ throw re;
}
{code}
As I said in HBASE-14608, I think we should NOT add {{throw e}}.

Otherwise it will break this test case when i < 50; we'd better just catch 
the "Sync aborted" exception.

What do you think, [~stack]?


> Make #testWalRollOnLowReplication looser still
> --
>
> Key: HBASE-14600
> URL: https://issues.apache.org/jira/browse/HBASE-14600
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14600.addendum.txt, 14600.txt
>
>
> The parent upped timeouts on testWalRollOnLowReplication. It still fails on 
> occasion. Chatting w/ [~mbertozzi], he suggested that if we've made progress 
> in the test, we return the test as completed successfully if we get a 
> RuntimeException out of the sync call (because the DN is slow to recover).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-13153) Bulk Loaded HFile Replication

2015-10-19 Thread Ashish Singhi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Singhi updated HBASE-13153:
--
Attachment: HBASE-13153-v11.patch

> Bulk Loaded HFile Replication
> -
>
> Key: HBASE-13153
> URL: https://issues.apache.org/jira/browse/HBASE-13153
> Project: HBase
>  Issue Type: New Feature
>  Components: Replication
>Reporter: sunhaitao
>Assignee: Ashish Singhi
> Fix For: 2.0.0
>
> Attachments: HBASE-13153-v1.patch, HBASE-13153-v10.patch, 
> HBASE-13153-v11.patch, HBASE-13153-v2.patch, HBASE-13153-v3.patch, 
> HBASE-13153-v4.patch, HBASE-13153-v5.patch, HBASE-13153-v6.patch, 
> HBASE-13153-v7.patch, HBASE-13153-v8.patch, HBASE-13153-v9.patch, 
> HBASE-13153.patch, HBase Bulk Load Replication-v1-1.pdf, HBase Bulk Load 
> Replication-v2.pdf, HBase Bulk Load Replication.pdf
>
>
> Currently we plan to use the HBase replication feature to deal with a disaster 
> tolerance scenario. But we encounter an issue: we use bulkload very 
> frequently, and because bulkload bypasses the write path it will not generate WAL, so 
> the data will not be replicated to the backup cluster. It's inappropriate to 
> bulkload twice, on both the active cluster and the backup cluster. So I advise doing some 
> modification to the bulkload feature to enable bulkloading to both the active cluster and 
> the backup cluster



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14600) Make #testWalRollOnLowReplication looser still

2015-10-19 Thread Heng Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14962937#comment-14962937
 ] 

Heng Chen commented on HBASE-14600:
---


{code}
if (msg != null && msg.toLowerCase().contains("sync aborted") && i > 50) {
  return;
}
+ throw re;
}
{code}
As I said in HBASE-14608, I think we should NOT add {{throw e}}.

Otherwise it will break this test case when i < 50; we'd better just catch 
the "Sync aborted" exception.

What do you think, [~stack]?


> Make #testWalRollOnLowReplication looser still
> --
>
> Key: HBASE-14600
> URL: https://issues.apache.org/jira/browse/HBASE-14600
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14600.addendum.txt, 14600.txt
>
>
> The parent upped timeouts on testWalRollOnLowReplication. It still fails on 
> occasion. Chatting w/ [~mbertozzi], he suggested that if we've made progress 
> in the test, we return the test as completed successfully if we get a 
> RuntimeException out of the sync call (because the DN is slow to recover).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14604) Improve MoveCostFunction in StochasticLoadBalancer

2015-10-19 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14962978#comment-14962978
 ] 

Guanghao Zhang commented on HBASE-14604:


Add ut for MoveCostFunction.

---
 T E S T S
---
Running org.apache.hadoop.hbase.master.TestMasterFailoverBalancerPersistence
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 12.08 sec - in 
org.apache.hadoop.hbase.master.TestMasterFailoverBalancerPersistence
Running org.apache.hadoop.hbase.master.balancer.TestDefaultLoadBalancer
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.812 sec - in 
org.apache.hadoop.hbase.master.balancer.TestDefaultLoadBalancer
Running org.apache.hadoop.hbase.master.balancer.TestStochasticLoadBalancer
Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 28.241 sec - 
in org.apache.hadoop.hbase.master.balancer.TestStochasticLoadBalancer
Running org.apache.hadoop.hbase.master.balancer.TestBaseLoadBalancer
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.601 sec - in 
org.apache.hadoop.hbase.master.balancer.TestBaseLoadBalancer

Results :

Tests run: 21, Failures: 0, Errors: 0, Skipped: 0

> Improve MoveCostFunction in StochasticLoadBalancer
> --
>
> Key: HBASE-14604
> URL: https://issues.apache.org/jira/browse/HBASE-14604
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Attachments: HBASE-14604_98.diff, HBASE-14604_98_with_ut.diff
>
>
> The code in MoveCostFunction:
> {code}
> return scale(0, cluster.numRegions + META_MOVE_COST_MULT, moveCost);
> {code}
> It uses cluster.numRegions + META_MOVE_COST_MULT as the max value when scaling 
> moveCost to [0,1]. But it should use maxMoves as the max value when the cluster 
> has a lot of regions.
> Assume a cluster has 10000 regions and maxMoves is 2500; then moveCost is only 
> scaled to [0, 0.25].
> Improve moveCost by using maxMoves:
> {code}
> return scale(0, Math.min(cluster.numRegions, maxMoves) + META_MOVE_COST_MULT, 
> moveCost);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14641) Move JDO example from Wiki to Ref Guide

2015-10-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14962907#comment-14962907
 ] 

Hadoop QA commented on HBASE-14641:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12767284/HBASE-14641.patch
  against master branch at commit f1b6355fc54616c08480c0f1b0965a244252.
  ATTACHMENT ID: 12767284

{color:red}-1 @author{color}.  The patch appears to contain 1 @author tags 
which the Hadoop community has agreed to not allow in code contributions.

{color:green}+0 tests included{color}.  The patch appears to be a 
documentation patch that doesn't require tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 
2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.thrift.TestThriftHttpServer

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16086//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16086//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16086//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16086//console

This message is automatically generated.

> Move JDO example from Wiki to Ref Guide
> ---
>
> Key: HBASE-14641
> URL: https://issues.apache.org/jira/browse/HBASE-14641
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 2.0.0
>Reporter: Misty Stanley-Jones
>Assignee: Misty Stanley-Jones
> Fix For: 2.0.0
>
> Attachments: HBASE-14641.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14642) Disable flakey TestMultiParallel#testActiveThreadsCount

2015-10-19 Thread Heng Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14962899#comment-14962899
 ] 

Heng Chen commented on HBASE-14642:
---

Any more information?  I have time recently and want to dig into it.  :)

> Disable flakey TestMultiParallel#testActiveThreadsCount
> ---
>
> Key: HBASE-14642
> URL: https://issues.apache.org/jira/browse/HBASE-14642
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
> Attachments: 14642.txt
>
>
> Failed twice in a row on the 1.2 build... Disabling for now. Unless someone 
> wants to dig in and fix it, that is...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-13153) Bulk Loaded HFile Replication

2015-10-19 Thread Ashish Singhi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Singhi updated HBASE-13153:
--
Attachment: (was: HBASE-13153-v11.patch)

> Bulk Loaded HFile Replication
> -
>
> Key: HBASE-13153
> URL: https://issues.apache.org/jira/browse/HBASE-13153
> Project: HBase
>  Issue Type: New Feature
>  Components: Replication
>Reporter: sunhaitao
>Assignee: Ashish Singhi
> Fix For: 2.0.0
>
> Attachments: HBASE-13153-v1.patch, HBASE-13153-v10.patch, 
> HBASE-13153-v2.patch, HBASE-13153-v3.patch, HBASE-13153-v4.patch, 
> HBASE-13153-v5.patch, HBASE-13153-v6.patch, HBASE-13153-v7.patch, 
> HBASE-13153-v8.patch, HBASE-13153-v9.patch, HBASE-13153.patch, HBase Bulk 
> Load Replication-v1-1.pdf, HBase Bulk Load Replication-v2.pdf, HBase Bulk 
> Load Replication.pdf
>
>
> Currently we plan to use the HBase replication feature to deal with a disaster 
> tolerance scenario. But we encounter an issue: we use bulkload very 
> frequently, and because bulkload bypasses the write path it will not generate WAL, so 
> the data will not be replicated to the backup cluster. It's inappropriate to 
> bulkload twice, on both the active cluster and the backup cluster. So I advise doing some 
> modification to the bulkload feature to enable bulkloading to both the active cluster and 
> the backup cluster



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-13153) Bulk Loaded HFile Replication

2015-10-19 Thread Ashish Singhi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Singhi updated HBASE-13153:
--
Release Note: 
This jira enhances HBase replication to support replication of bulk loaded 
data. This is configurable; by default it is set to false, which means it will 
not replicate the bulk loaded data to the sink(s). To enable it, set 
"hbase.replication.bulkload.enabled" to true.
As part of this we have made the following changes to the LoadIncrementalHFiles 
class, which is marked as a Public and Stable class:
 a. Raised the visibility scope of the LoadQueueItem class from package private to 
public.
 b. Added a new field splittigDir, which allows the client to tell the tool in 
which directory it would like to get HFiles split during the operation.
 c. Added a new method loadHFileQueue, which loads the queue of LoadQueueItem 
into the table as per the region keys provided.

Added release note. Let me know if any update is required.
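For reference, a minimal sketch of turning the feature on from code (the property name is taken from the release note above; the rest is ordinary configuration boilerplate, not tied to a particular patch version):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class EnableBulkLoadReplication {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    // Disabled by default; set to true on clusters that should ship
    // bulk loaded HFiles to the replication sink(s).
    conf.setBoolean("hbase.replication.bulkload.enabled", true);
    System.out.println(conf.getBoolean("hbase.replication.bulkload.enabled", false));
  }
}
{code}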

> Bulk Loaded HFile Replication
> -
>
> Key: HBASE-13153
> URL: https://issues.apache.org/jira/browse/HBASE-13153
> Project: HBase
>  Issue Type: New Feature
>  Components: Replication
>Reporter: sunhaitao
>Assignee: Ashish Singhi
> Fix For: 2.0.0
>
> Attachments: HBASE-13153-v1.patch, HBASE-13153-v10.patch, 
> HBASE-13153-v11.patch, HBASE-13153-v2.patch, HBASE-13153-v3.patch, 
> HBASE-13153-v4.patch, HBASE-13153-v5.patch, HBASE-13153-v6.patch, 
> HBASE-13153-v7.patch, HBASE-13153-v8.patch, HBASE-13153-v9.patch, 
> HBASE-13153.patch, HBase Bulk Load Replication-v1-1.pdf, HBase Bulk Load 
> Replication-v2.pdf, HBase Bulk Load Replication.pdf
>
>
> Currently we plan to use the HBase replication feature to deal with a disaster 
> tolerance scenario. But we encounter an issue: we use bulkload very 
> frequently, and because bulkload bypasses the write path it will not generate WAL, so 
> the data will not be replicated to the backup cluster. It's inappropriate to 
> bulkload twice, on both the active cluster and the backup cluster. So I advise doing some 
> modification to the bulkload feature to enable bulkloading to both the active cluster and 
> the backup cluster



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14604) Improve MoveCostFunction in StochasticLoadBalancer

2015-10-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963258#comment-14963258
 ] 

Hadoop QA commented on HBASE-14604:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12767319/HBASE-14604_98_with_ut.diff
  against master branch at commit 8e6316a80cf96f4d4cd6bd10f4c647ebf45c7e02.
  ATTACHMENT ID: 12767319

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16094//console

This message is automatically generated.

> Improve MoveCostFunction in StochasticLoadBalancer
> --
>
> Key: HBASE-14604
> URL: https://issues.apache.org/jira/browse/HBASE-14604
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer
>Affects Versions: 0.98.15
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Attachments: HBASE-14604_98.diff, HBASE-14604_98_with_ut.diff
>
>
> The code in MoveCostFunction:
> {code}
> return scale(0, cluster.numRegions + META_MOVE_COST_MULT, moveCost);
> {code}
> It uses cluster.numRegions + META_MOVE_COST_MULT as the max value when scaling 
> moveCost to [0,1]. But it should use maxMoves as the max value when the cluster 
> has a lot of regions.
> Assume a cluster has 10000 regions and maxMoves is 2500; then moveCost is only 
> scaled to [0, 0.25].
> Improve moveCost by using Math.min(cluster.numRegions, maxMoves) as the max 
> cost, so it can scale moveCost to [0,1]:
> {code}
> return scale(0, Math.min(cluster.numRegions, maxMoves) + META_MOVE_COST_MULT, 
> moveCost);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14463) Severe performance downgrade when parallel reading a single key from BucketCache

2015-10-19 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963215#comment-14963215
 ] 

Yu Li commented on HBASE-14463:
---

[~stack] boss, anything more to be addressed or could I get your +1 here? 
Thanks.

> Severe performance downgrade when parallel reading a single key from 
> BucketCache
> 
>
> Key: HBASE-14463
> URL: https://issues.apache.org/jira/browse/HBASE-14463
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.14, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
> Fix For: 2.0.0, 1.2.0, 1.3.0, 0.98.16
>
> Attachments: GC_with_WeakObjectPool.png, HBASE-14463.patch, 
> HBASE-14463_v11.patch, HBASE-14463_v12.patch, HBASE-14463_v2.patch, 
> HBASE-14463_v3.patch, HBASE-14463_v4.patch, HBASE-14463_v5.patch, 
> TestBucketCache-new_with_IdLock.png, 
> TestBucketCache-new_with_IdReadWriteLock.png, 
> TestBucketCache_with_IdLock-latest.png, TestBucketCache_with_IdLock.png, 
> TestBucketCache_with_IdReadWriteLock-latest.png, 
> TestBucketCache_with_IdReadWriteLock-resolveLockLeak.png, 
> TestBucketCache_with_IdReadWriteLock.png
>
>
> We store feature data of online items in HBase, do machine learning on these 
> features, and supply the outputs to our online search engine. In such a 
> scenario we will launch hundreds of yarn workers and each worker will read 
> all features of one item (i.e. a single rowkey in HBase), so there'll be heavy 
> parallel reading on a single rowkey.
> We were using LruCache but started to try BucketCache recently to resolve a gc 
> issue, and just as titled we have observed a severe performance downgrade. 
> After some analysis we found the root cause is the lock in 
> BucketCache#getBlock, as shown below
> {code}
>   try {
> lockEntry = offsetLock.getLockEntry(bucketEntry.offset());
> // ...
> if (bucketEntry.equals(backingMap.get(key))) {
>   // ...
>   int len = bucketEntry.getLength();
>   Cacheable cachedBlock = ioEngine.read(bucketEntry.offset(), len,
>   bucketEntry.deserializerReference(this.deserialiserMap));
> {code}
> Since ioEngine.read involves an array copy, it's much more time-consuming than the 
> operation in LruCache. And since we're using synchronized in 
> IdLock#getLockEntry, parallel reads hitting the same bucket are 
> executed serially, which causes really bad performance.
> To resolve the problem, we propose to use ReentrantReadWriteLock in 
> BucketCache, and introduce a new class called IdReadWriteLock to implement it.
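To illustrate the idea only (not the IdReadWriteLock implementation in the patch), a bare-bones per-id read/write lock pool looks like the sketch below; readers of the same offset proceed in parallel while writers still get exclusion:
{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch: one ReentrantReadWriteLock per id (e.g. a bucket offset). The real
// IdReadWriteLock additionally recycles idle locks via a weak-reference pool.
public class SimpleIdReadWriteLock {
  private final ConcurrentHashMap<Long, ReentrantReadWriteLock> locks =
      new ConcurrentHashMap<>();

  public ReentrantReadWriteLock getLock(long id) {
    return locks.computeIfAbsent(id, k -> new ReentrantReadWriteLock());
  }
}
{code}
On the read path the caller would take the read lock for {{bucketEntry.offset()}} around the {{ioEngine.read}} call, so concurrent reads of the same block no longer serialize on a single mutex.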



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14463) Severe performance downgrade when parallel reading a single key from BucketCache

2015-10-19 Thread Yu Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Li updated HBASE-14463:
--
Attachment: GC_with_WeakObjectPool.png

Launched several YCSB workloadc tests against a 4-node dev cluster with 
zipfian/hotspot request distribution, each round running for around 20 minutes, 
and watched the GC status with VisualVM; the result shows no memory leak in the 
new implementation with WeakObjectPool. See the attached screenshot for more 
details.

> Severe performance downgrade when parallel reading a single key from 
> BucketCache
> 
>
> Key: HBASE-14463
> URL: https://issues.apache.org/jira/browse/HBASE-14463
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.14, 1.1.2
>Reporter: Yu Li
>Assignee: Yu Li
> Fix For: 2.0.0, 1.2.0, 1.3.0, 0.98.16
>
> Attachments: GC_with_WeakObjectPool.png, HBASE-14463.patch, 
> HBASE-14463_v11.patch, HBASE-14463_v12.patch, HBASE-14463_v2.patch, 
> HBASE-14463_v3.patch, HBASE-14463_v4.patch, HBASE-14463_v5.patch, 
> TestBucketCache-new_with_IdLock.png, 
> TestBucketCache-new_with_IdReadWriteLock.png, 
> TestBucketCache_with_IdLock-latest.png, TestBucketCache_with_IdLock.png, 
> TestBucketCache_with_IdReadWriteLock-latest.png, 
> TestBucketCache_with_IdReadWriteLock-resolveLockLeak.png, 
> TestBucketCache_with_IdReadWriteLock.png
>
>
> We store feature data of online items in HBase, do machine learning on these 
> features, and supply the outputs to our online search engine. In such a 
> scenario we will launch hundreds of yarn workers and each worker will read 
> all features of one item (i.e. a single rowkey in HBase), so there'll be heavy 
> parallel reading on a single rowkey.
> We were using LruCache but started to try BucketCache recently to resolve a gc 
> issue, and just as titled we have observed a severe performance downgrade. 
> After some analysis we found the root cause is the lock in 
> BucketCache#getBlock, as shown below
> {code}
>   try {
> lockEntry = offsetLock.getLockEntry(bucketEntry.offset());
> // ...
> if (bucketEntry.equals(backingMap.get(key))) {
>   // ...
>   int len = bucketEntry.getLength();
>   Cacheable cachedBlock = ioEngine.read(bucketEntry.offset(), len,
>   bucketEntry.deserializerReference(this.deserialiserMap));
> {code}
> Since ioEngine.read involves an array copy, it's much more time-consuming than the 
> operation in LruCache. And since we're using synchronized in 
> IdLock#getLockEntry, parallel reads hitting the same bucket are 
> executed serially, which causes really bad performance.
> To resolve the problem, we propose to use ReentrantReadWriteLock in 
> BucketCache, and introduce a new class called IdReadWriteLock to implement it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14604) Improve MoveCostFunction in StochasticLoadBalancer

2015-10-19 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-14604:
---
Status: Patch Available  (was: In Progress)

> Improve MoveCostFunction in StochasticLoadBalancer
> --
>
> Key: HBASE-14604
> URL: https://issues.apache.org/jira/browse/HBASE-14604
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer
>Affects Versions: 0.98.15
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Attachments: HBASE-14604_98.diff, HBASE-14604_98_with_ut.diff
>
>
> The code in MoveCostFunction:
> {code}
> return scale(0, cluster.numRegions + META_MOVE_COST_MULT, moveCost);
> {code}
> It uses cluster.numRegions + META_MOVE_COST_MULT as the max value when scaling 
> moveCost to [0,1]. But it should use maxMoves as the max value when the cluster 
> has a lot of regions.
> Assume a cluster has 10000 regions and maxMoves is 2500; then moveCost is only 
> scaled to [0, 0.25].
> Improve moveCost by using Math.min(cluster.numRegions, maxMoves) as the max 
> cost, so it can scale moveCost to [0,1]:
> {code}
> return scale(0, Math.min(cluster.numRegions, maxMoves) + META_MOVE_COST_MULT, 
> moveCost);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14631) Region merge request should be audited with request user through proper scope of doAs() calls to region observer notifications

2015-10-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14962989#comment-14962989
 ] 

Hadoop QA commented on HBASE-14631:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12767306/14631-branch-0.98.txt
  against 0.98 branch at commit 8e6316a80cf96f4d4cd6bd10f4c647ebf45c7e02.
  ATTACHMENT ID: 12767306

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 
2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
29 warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16089//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16089//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16089//artifact/patchprocess/checkstyle-aggregate.html

  Javadoc warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16089//artifact/patchprocess/patchJavadocWarnings.txt
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16089//console

This message is automatically generated.

> Region merge request should be audited with request user through proper scope 
> of doAs() calls to region observer notifications
> --
>
> Key: HBASE-14631
> URL: https://issues.apache.org/jira/browse/HBASE-14631
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: 14631-branch-0.98.txt, 14631-branch-1.0.txt, 
> 14631-branch-1.txt, 14631-v1.txt
>
>
> HBASE-14475 and HBASE-14605 narrowed the scope of doAs() calls to region 
> observer notifications for region splitting.
> During review of HBASE-14605, Andrew brought up the case of region merge.
> This JIRA is to implement a similar scope-narrowing technique for region 
> merging.
> The majority of the change would be in the RegionMergeTransactionImpl class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14604) Improve MoveCostFunction in StochasticLoadBalancer

2015-10-19 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-14604:
---
Affects Version/s: 0.98.15

> Improve MoveCostFunction in StochasticLoadBalancer
> --
>
> Key: HBASE-14604
> URL: https://issues.apache.org/jira/browse/HBASE-14604
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer
>Affects Versions: 0.98.15
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Attachments: HBASE-14604_98.diff, HBASE-14604_98_with_ut.diff
>
>
> The code in MoveCostFunction:
> {code}
> return scale(0, cluster.numRegions + META_MOVE_COST_MULT, moveCost);
> {code}
> It uses cluster.numRegions + META_MOVE_COST_MULT as the max value when scaling 
> moveCost to [0,1]. But it should use maxMoves as the max value when the cluster 
> has a lot of regions.
> Assume a cluster has 10000 regions and maxMoves is 2500; then moveCost is only 
> scaled to [0, 0.25].
> Improve moveCost by using Math.min(cluster.numRegions, maxMoves) as the max 
> cost, so it can scale moveCost to [0,1]:
> {code}
> return scale(0, Math.min(cluster.numRegions, maxMoves) + META_MOVE_COST_MULT, 
> moveCost);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14642) Disable flakey TestMultiParallel#testActiveThreadsCount

2015-10-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963001#comment-14963001
 ] 

Hudson commented on HBASE-14642:


SUCCESS: Integrated in HBase-1.3-IT #250 (See 
[https://builds.apache.org/job/HBase-1.3-IT/250/])
HBASE-14642 Disable flakey TestMultiParallel#testActiveThreadsCount (stack: rev 
39f6a4eb0b27bf16d3c9ea98b3f5e8db8c594e78)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestMultiParallel.java


> Disable flakey TestMultiParallel#testActiveThreadsCount
> ---
>
> Key: HBASE-14642
> URL: https://issues.apache.org/jira/browse/HBASE-14642
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
> Attachments: 14642.txt
>
>
> Failed twice in a row on the 1.2 build... Disabling for now. Unless someone 
> wants to dig in and fix it, that is...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14643) Avoid Splits from once again opening a closed reader for fetching the first and last key

2015-10-19 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-14643:
---
Summary: Avoid Splits from once again opening a closed reader for fetching 
the first and last key  (was: Avoid Splits to create a closed reader for 
fetching the first and last key)

> Avoid Splits from once again opening a closed reader for fetching the first 
> and last key
> 
>
> Key: HBASE-14643
> URL: https://issues.apache.org/jira/browse/HBASE-14643
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: ramkrishna.s.vasudevan
>
> Currently the split flow is such that we close the parent region and all its 
> store file readers are also closed.  After that, in order to split the 
> reference files we need the first and last keys, for which we once again open the 
> readers on those store files.  This could be a costly operation considering 
> the fact that it has to contact HDFS for this close and open operation. 
> This JIRA is to see if we can improve this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14629) Create replication zk node in postLogRoll instead of preLogRoll

2015-10-19 Thread He Liangliang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963048#comment-14963048
 ] 

He Liangliang commented on HBASE-14629:
---

I just can't figure out the reason. In our case, ignoring FNFE is dangerous and 
it's disabled in our production cluster. So avoiding FNFE is a helpful improvement.

+ [~jdcryans][~stack] any hints?

> Create replication zk node in postLogRoll instead of preLogRoll
> ---
>
> Key: HBASE-14629
> URL: https://issues.apache.org/jira/browse/HBASE-14629
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Reporter: He Liangliang
>
> Currently the hlog zk node is added before creating the log file, so it's 
> possible to raise a FileNotFoundException if the server crashes between these two 
> operations. Moving this step to after the file is created can avoid this exception 
> (then FNFE can confidently be treated as an error). If there is a crash after log 
> creation but before creating the zk node, the log will be replayed and there 
> should be no data loss either.
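A rough sketch of the proposed ordering (the listener interface and method names below are simplified placeholders, not the actual HBase WAL listener API):
{code}
// Simplified placeholders, not actual HBase code.
interface RollListener {
  void preLogRoll(String newWalPath);   // today: replication zk node added here
  void postLogRoll(String newWalPath);  // proposal: add the zk node here instead
}

class WalRoller {
  private final RollListener listener;

  WalRoller(RollListener listener) {
    this.listener = listener;
  }

  void rollWriter(String newWalPath) {
    listener.preLogRoll(newWalPath);
    createWalFile(newWalPath);         // a crash between the zk add and this create
                                       // is what produces the FileNotFoundException
    listener.postLogRoll(newWalPath);  // after this point the file is known to exist
  }

  private void createWalFile(String path) {
    // filesystem create elided in this sketch
  }
}
{code}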



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13153) Bulk Loaded HFile Replication

2015-10-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963077#comment-14963077
 ] 

Hadoop QA commented on HBASE-13153:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12767313/HBASE-13153-v11.patch
  against master branch at commit 8e6316a80cf96f4d4cd6bd10f4c647ebf45c7e02.
  ATTACHMENT ID: 12767313

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 41 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 
2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16091//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16091//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16091//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16091//console

This message is automatically generated.

> Bulk Loaded HFile Replication
> -
>
> Key: HBASE-13153
> URL: https://issues.apache.org/jira/browse/HBASE-13153
> Project: HBase
>  Issue Type: New Feature
>  Components: Replication
>Reporter: sunhaitao
>Assignee: Ashish Singhi
> Fix For: 2.0.0
>
> Attachments: HBASE-13153-v1.patch, HBASE-13153-v10.patch, 
> HBASE-13153-v11.patch, HBASE-13153-v2.patch, HBASE-13153-v3.patch, 
> HBASE-13153-v4.patch, HBASE-13153-v5.patch, HBASE-13153-v6.patch, 
> HBASE-13153-v7.patch, HBASE-13153-v8.patch, HBASE-13153-v9.patch, 
> HBASE-13153.patch, HBase Bulk Load Replication-v1-1.pdf, HBase Bulk Load 
> Replication-v2.pdf, HBase Bulk Load Replication.pdf
>
>
> Currently we plan to use the HBase replication feature to deal with a disaster 
> tolerance scenario. But we encounter an issue: we use bulkload very 
> frequently, and because bulkload bypasses the write path it will not generate WAL, so 
> the data will not be replicated to the backup cluster. It's inappropriate to 
> bulkload twice, on both the active cluster and the backup cluster. So I advise doing some 
> modification to the bulkload feature to enable bulkloading to both the active cluster and 
> the backup cluster



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (HBASE-14604) Improve MoveCostFunction in StochasticLoadBalancer

2015-10-19 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-14604 started by Guanghao Zhang.
--
> Improve MoveCostFunction in StochasticLoadBalancer
> --
>
> Key: HBASE-14604
> URL: https://issues.apache.org/jira/browse/HBASE-14604
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer
>Affects Versions: 0.98.15
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Attachments: HBASE-14604_98.diff, HBASE-14604_98_with_ut.diff
>
>
> The code in MoveCoseFunction:
> {code}
> return scale(0, cluster.numRegions + META_MOVE_COST_MULT, moveCost);
> {code}
> It uses cluster.numRegions + META_MOVE_COST_MULT as the max value when scale 
> moveCost to [0,1]. But this should use maxMoves as the max value when cluster 
> have a lot of regions.
> Assume a cluster have 1 regions, maxMoves is 2500, it only scale moveCost 
> to [0, 0.25].
> Improve moveCost by use Math.min(cluster.numRegions, maxMoves) as the max 
> cost, so it can scale moveCost to [0,1].
> {code}
> return scale(0, Math.min(cluster.numRegions, maxMoves) + META_MOVE_COST_MULT, 
> moveCost);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14604) Improve MoveCostFunction in StochasticLoadBalancer

2015-10-19 Thread Guanghao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-14604:
---
Description: 
The code in MoveCostFunction:
{code}
return scale(0, cluster.numRegions + META_MOVE_COST_MULT, moveCost);
{code}
It uses cluster.numRegions + META_MOVE_COST_MULT as the max value when scaling 
moveCost to [0,1]. But it should use maxMoves as the max value when the cluster 
has a lot of regions.

Assume a cluster has 10000 regions and maxMoves is 2500; then moveCost is only 
scaled to [0, 0.25].

Improve moveCost by using Math.min(cluster.numRegions, maxMoves) as the max cost, 
so it can scale moveCost to [0,1]:
{code}
return scale(0, Math.min(cluster.numRegions, maxMoves) + META_MOVE_COST_MULT, 
moveCost);
{code}

  was:
The code in MoveCostFunction:
{code}
return scale(0, cluster.numRegions + META_MOVE_COST_MULT, moveCost);
{code}
It uses cluster.numRegions + META_MOVE_COST_MULT as the max value when scaling 
moveCost to [0,1]. But it should use maxMoves as the max value when the cluster 
has a lot of regions.

Assume a cluster has 10000 regions and maxMoves is 2500; then moveCost is only 
scaled to [0, 0.25].

Improve moveCost by using maxMoves:
{code}
return scale(0, Math.min(cluster.numRegions, maxMoves) + META_MOVE_COST_MULT, 
moveCost);
{code}


> Improve MoveCostFunction in StochasticLoadBalancer
> --
>
> Key: HBASE-14604
> URL: https://issues.apache.org/jira/browse/HBASE-14604
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Attachments: HBASE-14604_98.diff, HBASE-14604_98_with_ut.diff
>
>
> The code in MoveCostFunction:
> {code}
> return scale(0, cluster.numRegions + META_MOVE_COST_MULT, moveCost);
> {code}
> It uses cluster.numRegions + META_MOVE_COST_MULT as the max value when scaling 
> moveCost to [0,1]. But it should use maxMoves as the max value when the cluster 
> has a lot of regions.
> Assume a cluster has 10000 regions and maxMoves is 2500; then moveCost is only 
> scaled to [0, 0.25].
> Improve moveCost by using Math.min(cluster.numRegions, maxMoves) as the max 
> cost, so it can scale moveCost to [0,1]:
> {code}
> return scale(0, Math.min(cluster.numRegions, maxMoves) + META_MOVE_COST_MULT, 
> moveCost);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14641) Move JDO example from Wiki to Ref Guide

2015-10-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963088#comment-14963088
 ] 

Hadoop QA commented on HBASE-14641:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12767309/HBASE-14641-v1.patch
  against master branch at commit 8e6316a80cf96f4d4cd6bd10f4c647ebf45c7e02.
  ATTACHMENT ID: 12767309

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+0 tests included{color}.  The patch appears to be a 
documentation patch that doesn't require tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 
2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16090//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16090//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16090//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16090//console

This message is automatically generated.

> Move JDO example from Wiki to Ref Guide
> ---
>
> Key: HBASE-14641
> URL: https://issues.apache.org/jira/browse/HBASE-14641
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 2.0.0
>Reporter: Misty Stanley-Jones
>Assignee: Misty Stanley-Jones
> Fix For: 2.0.0
>
> Attachments: HBASE-14641-v1.patch, HBASE-14641.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner

2015-10-19 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-13082:
---
Attachment: HBASE-13082.pdf

A simple one-pager describing the high-level approach to the versioned 
store file design. 

> Coarsen StoreScanner locks to RegionScanner
> ---
>
> Key: HBASE-13082
> URL: https://issues.apache.org/jira/browse/HBASE-13082
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: ramkrishna.s.vasudevan
> Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, 
> 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1_WIP.patch, 
> HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, gc.png, 
> gc.png, gc.png, hits.png, next.png, next.png
>
>
> Continuing where HBASE-10015 left off.
> We can avoid locking (and memory fencing) inside StoreScanner by deferring to 
> the lock already held by the RegionScanner.
> In tests this shows quite a scan improvement and reduced CPU (the fences make 
> the cores wait for memory fetches).
> There are some drawbacks too:
> * All calls to RegionScanner need to remain synchronized
> * Implementors of coprocessors need to be diligent in following the locking 
> contract. For example Phoenix does not lock RegionScanner.nextRaw() as 
> required in the documentation (not picking on Phoenix, this one is my fault 
> as I told them it's OK)
> * possible starvation of flushes and compactions with heavy read load. 
> RegionScanner operations would keep getting the locks and the 
> flushes/compactions would not be able to finalize the set of files.
> I'll have a patch soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-14643) Avoid Splits to create a closed reader for fetching the first and last key

2015-10-19 Thread ramkrishna.s.vasudevan (JIRA)
ramkrishna.s.vasudevan created HBASE-14643:
--

 Summary: Avoid Splits to create a closed reader for fetching the 
first and last key
 Key: HBASE-14643
 URL: https://issues.apache.org/jira/browse/HBASE-14643
 Project: HBase
  Issue Type: Improvement
Affects Versions: 2.0.0
Reporter: ramkrishna.s.vasudevan


Currently the split flow is such that we close the parent region and all its store 
file readers are also closed.  After that, in order to split the reference files 
we need the first and last keys, for which we once again open the readers on those 
store files.  This could be a costly operation considering the fact that it has 
to contact HDFS for this close and open operation. This JIRA is to see if 
we can improve this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner

2015-10-19 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963297#comment-14963297
 ] 

Anoop Sam John commented on HBASE-13082:


So we won't reset the heap of files during the scan.  The compaction case is 
fine: the versioned approach, where we won't remove the old compacted files until 
the scanners have finished, will solve it.  What about flush?   Once the flush 
is finished, we will have to clear the snapshot of cells in the Memstore, no?  Then 
we will have to open this newly added file for reading.  If we don't open it, we 
cannot release the memstore snapshot.  That means we will have to keep the cells 
until the scanner completes!  That will not be acceptable IMO.

> Coarsen StoreScanner locks to RegionScanner
> ---
>
> Key: HBASE-13082
> URL: https://issues.apache.org/jira/browse/HBASE-13082
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: ramkrishna.s.vasudevan
> Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, 
> 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1_WIP.patch, 
> HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, gc.png, 
> gc.png, gc.png, hits.png, next.png, next.png
>
>
> Continuing where HBASE-10015 left off.
> We can avoid locking (and memory fencing) inside StoreScanner by deferring to 
> the lock already held by the RegionScanner.
> In tests this shows quite a scan improvement and reduced CPU (the fences make 
> the cores wait for memory fetches).
> There are some drawbacks too:
> * All calls to RegionScanner need to remain synchronized
> * Implementors of coprocessors need to be diligent in following the locking 
> contract. For example Phoenix does not lock RegionScanner.nextRaw() as 
> required in the documentation (not picking on Phoenix, this one is my fault 
> as I told them it's OK)
> * possible starvation of flushes and compactions with heavy read load. 
> RegionScanner operations would keep getting the locks and the 
> flushes/compactions would not be able to finalize the set of files.
> I'll have a patch soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14631) Region merge request should be audited with request user through proper scope of doAs() calls to region observer notifications

2015-10-19 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963357#comment-14963357
 ] 

Ted Yu commented on HBASE-14631:


Planning to integrate into all branches soon, if there are no more review comments.

> Region merge request should be audited with request user through proper scope 
> of doAs() calls to region observer notifications
> --
>
> Key: HBASE-14631
> URL: https://issues.apache.org/jira/browse/HBASE-14631
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: 14631-branch-0.98.txt, 14631-branch-1.0.txt, 
> 14631-branch-1.txt, 14631-v1.txt
>
>
> HBASE-14475 and HBASE-14605 narrowed the scope of doAs() calls to region 
> observer notifications for region splitting.
> During review of HBASE-14605, Andrew brought up the case for region merge.
> This JIRA is to implement similar scope narrowing technique for region 
> merging.
> The majority of the change would be in RegionMergeTransactionImpl class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner

2015-10-19 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963400#comment-14963400
 ] 

Anoop Sam John commented on HBASE-13082:


Good doc explaining the change.

bq. Along with the ref count, we also have CompactionStatus (COMPACTED and 
NON_COMPACTED) added to these store file readers
This status is in the reader? It suits better in StoreFile, no? Even the ref count 
is in the reader? I would say it is better suited in StoreFile.  Rather than calling the 
state Compacted, should we better call it a file to be discarded?  When we mark a 
file as compacted, the confusion can be whether this file is a file created 
out of compaction. (?)

bq. Ensure that a background thread periodically runs that scans the list 
of store files and checks for the state as COMPACTED...
This is a new Chore thread being added to the system..  Mention it clearly. It 
will be one thread per RS. Also what is its interval? Is it configurable? How?

bq.Hence it requires us to add new APIs which ensures that a split can go ahead 
even if reference files are present but they are in the COMPACTED state
hmm.. making this more complex :-)

> Coarsen StoreScanner locks to RegionScanner
> ---
>
> Key: HBASE-13082
> URL: https://issues.apache.org/jira/browse/HBASE-13082
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: ramkrishna.s.vasudevan
> Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, 
> 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1_WIP.patch, 
> HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, gc.png, 
> gc.png, gc.png, hits.png, next.png, next.png
>
>
> Continuing where HBASE-10015 left of.
> We can avoid locking (and memory fencing) inside StoreScanner by deferring to 
> the lock already held by the RegionScanner.
> In tests this shows quite a scan improvement and reduced CPU (the fences make 
> the cores wait for memory fetches).
> There are some drawbacks too:
> * All calls to RegionScanner need to be remain synchronized
> * Implementors of coprocessors need to be diligent in following the locking 
> contract. For example Phoenix does not lock RegionScanner.nextRaw() and 
> required in the documentation (not picking on Phoenix, this one is my fault 
> as I told them it's OK)
> * possible starving of flushes and compaction with heavy read load. 
> RegionScanner operations would keep getting the locks and the 
> flushes/compactions would not be able finalize the set of files.
> I'll have a patch soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14604) Improve MoveCostFunction in StochasticLoadBalancer

2015-10-19 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963411#comment-14963411
 ] 

Ted Yu commented on HBASE-14604:


The classes are in the same package. Please make the new methods package 
private.

Patch for 0.98 branch should contain 0.98 in the filename.

Extension should be either .txt or .patch.

Please attach patch for master branch.

> Improve MoveCostFunction in StochasticLoadBalancer
> --
>
> Key: HBASE-14604
> URL: https://issues.apache.org/jira/browse/HBASE-14604
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer
>Affects Versions: 0.98.15
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Attachments: HBASE-14604_98.diff, HBASE-14604_98_with_ut.diff
>
>
> The code in MoveCoseFunction:
> {code}
> return scale(0, cluster.numRegions + META_MOVE_COST_MULT, moveCost);
> {code}
> It uses cluster.numRegions + META_MOVE_COST_MULT as the max value when scale 
> moveCost to [0,1]. But this should use maxMoves as the max value when cluster 
> have a lot of regions.
> Assume a cluster have 1 regions, maxMoves is 2500, it only scale moveCost 
> to [0, 0.25].
> Improve moveCost by use Math.min(cluster.numRegions, maxMoves) as the max 
> cost, so it can scale moveCost to [0,1].
> {code}
> return scale(0, Math.min(cluster.numRegions, maxMoves) + META_MOVE_COST_MULT, 
> moveCost);
> {code}
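
For readers skimming the thread, a small self-contained illustration of why the denominator matters. The scale() below is a simplified linear normalization standing in for the balancer's real method, and the constants (10,000 regions, 2,500 max moves, the META multiplier value) are illustrative, not taken from the patch:
{code}
public class MoveCostScaleDemo {
  // Simplified stand-in for the balancer's scale(): clamp a linear normalization to [0,1].
  static double scale(double min, double max, double value) {
    if (max <= min || value <= min) {
      return 0d;
    }
    return Math.max(0d, Math.min(1d, (value - min) / (max - min)));
  }

  public static void main(String[] args) {
    int numRegions = 10_000;     // hypothetical large cluster
    int maxMoves = 2_500;        // cap on moves the balancer will plan
    int metaMoveCostMult = 9;    // illustrative stand-in for META_MOVE_COST_MULT
    double moveCost = maxMoves;  // every allowed move is used

    // Before the fix: scaling against numRegions keeps the cost near 0.25 at most.
    double before = scale(0, numRegions + metaMoveCostMult, moveCost);
    // After the fix: capping the denominator by maxMoves lets the cost reach ~1.0.
    double after = scale(0, Math.min(numRegions, maxMoves) + metaMoveCostMult, moveCost);

    System.out.printf("before=%.3f after=%.3f%n", before, after);
  }
}
{code}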



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14643) Avoid Splits from once again opening a closed reader for fetching the first and last key

2015-10-19 Thread Heng Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963442#comment-14963442
 ] 

Heng Chen commented on HBASE-14643:
---

IMO we can get the first and last keys before closing the store files and pass them 
as params into {{SplitTransactionImpl.splitStoreFiles}}, any concerns?
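
A minimal, self-contained sketch of that caching idea follows. It deliberately uses plain Java stand-ins: the Boundary holder, readFirstKey/readLastKey and the extra parameter to the reference-creation step are hypothetical, not the actual StoreFile.Reader or SplitTransactionImpl API:
{code}
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SplitBoundaryKeyCacheSketch {

  // Hypothetical holder for a store file's first and last key.
  static final class Boundary {
    final byte[] first, last;
    Boundary(byte[] first, byte[] last) { this.first = first; this.last = last; }
  }

  public static void main(String[] args) {
    // Stand-ins for the parent region's store file paths.
    List<String> storeFilePaths = Arrays.asList("cf/file-a", "cf/file-b");
    Map<String, Boundary> boundaryKeys = new HashMap<>();

    // 1. While the parent's readers are still open, read each file's boundary keys once.
    for (String path : storeFilePaths) {
      boundaryKeys.put(path, new Boundary(readFirstKey(path), readLastKey(path)));
    }

    // 2. Close the parent region here; its store file readers go away with it (omitted).

    // 3. Reference-file creation (splitStoreFiles in the real code) would take the
    //    cached keys as an extra parameter instead of reopening readers just to
    //    fetch the first and last key again.
    boundaryKeys.forEach((path, b) ->
        System.out.println(path + " -> [" + new String(b.first) + ", " + new String(b.last) + "]"));
  }

  // Hypothetical stand-ins for reading the keys from a still-open reader.
  static byte[] readFirstKey(String path) { return ("first-of-" + path).getBytes(); }
  static byte[] readLastKey(String path)  { return ("last-of-" + path).getBytes(); }
}
{code}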

> Avoid Splits from once again opening a closed reader for fetching the first 
> and last key
> 
>
> Key: HBASE-14643
> URL: https://issues.apache.org/jira/browse/HBASE-14643
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: ramkrishna.s.vasudevan
>
> Currently split flow is such that we close the parent region and all its 
> store file readers are also closed.  After that inorder to split the 
> reference files we need the first and last keys for which once again open the 
> readers on those store files.  This could be costlier operation considering 
> the fact that it has to contact the HDFS for this close and open operation. 
> This JIRA is to see if we can improve this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14624) BucketCache.freeBlock is too expensive

2015-10-19 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963471#comment-14963471
 ] 

Anoop Sam John commented on HBASE-14624:


Not able to see from log that bucket cache free call takes time..  Missing some 
logs?

> BucketCache.freeBlock is too expensive
> --
>
> Key: HBASE-14624
> URL: https://issues.apache.org/jira/browse/HBASE-14624
> Project: HBase
>  Issue Type: Improvement
>  Components: BlockCache
>Affects Versions: 1.0.0
>Reporter: Randy Fox
>
> Moving regions is unacceptably slow when using bucket cache, as it takes too 
> long to free all the blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13153) Bulk Loaded HFile Replication

2015-10-19 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963386#comment-14963386
 ] 

Anoop Sam John commented on HBASE-13153:


Checking the op flow: 
Considering the scenario where the peer cluster is not secure (no secure EP) and 
the bulk load to the peer cluster needs a split, we will do the split by reading each 
cell from the remote src cluster's HFile. This will be a costly op. The suggestion is 
that when we have to bulk load to the peer cluster, we make sure the big file is 
copied to the dest peer cluster first and then do the split and read of the file there.  
Had a call with Ashish and discussed this. He will come back with the change in the flow 
and the doc.
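
A rough sketch of the "copy first, split locally" flow being suggested. The cluster URIs and paths are hypothetical, and only the copy step is shown; the existing split/group logic would then run against the staged copy:
{code}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

public class CopyThenSplitSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Hypothetical cluster URIs and file locations.
    FileSystem srcFs = FileSystem.get(URI.create("hdfs://source-cluster:8020"), conf);
    FileSystem peerFs = FileSystem.get(URI.create("hdfs://peer-cluster:8020"), conf);
    Path remoteHFile = new Path("/hbase/bulkload/incoming/hfile-1");
    Path stagedCopy = new Path("/hbase/staging/hfile-1");

    // Copy the whole HFile to the peer cluster once, instead of splitting it by
    // reading every cell across the inter-cluster link.
    FileUtil.copy(srcFs, remoteHFile, peerFs, stagedCopy, false, conf);

    // ...then the existing split logic (e.g. in LoadIncrementalHFiles) runs against
    // stagedCopy, touching only the peer cluster's local HDFS.
  }
}
{code}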

> Bulk Loaded HFile Replication
> -
>
> Key: HBASE-13153
> URL: https://issues.apache.org/jira/browse/HBASE-13153
> Project: HBase
>  Issue Type: New Feature
>  Components: Replication
>Reporter: sunhaitao
>Assignee: Ashish Singhi
> Fix For: 2.0.0
>
> Attachments: HBASE-13153-v1.patch, HBASE-13153-v10.patch, 
> HBASE-13153-v11.patch, HBASE-13153-v2.patch, HBASE-13153-v3.patch, 
> HBASE-13153-v4.patch, HBASE-13153-v5.patch, HBASE-13153-v6.patch, 
> HBASE-13153-v7.patch, HBASE-13153-v8.patch, HBASE-13153-v9.patch, 
> HBASE-13153.patch, HBase Bulk Load Replication-v1-1.pdf, HBase Bulk Load 
> Replication-v2.pdf, HBase Bulk Load Replication.pdf
>
>
> Currently we plan to use HBase Replication feature to deal with disaster 
> tolerance scenario.But we encounter an issue that we will use bulkload very 
> frequently,because bulkload bypass write path, and will not generate WAL, so 
> the data will not be replicated to backup cluster. It's inappropriate to 
> bukload twice both on active cluster and backup cluster. So i advise do some 
> modification to bulkload feature to enable bukload to both active cluster and 
> backup cluster



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14624) BucketCache.freeBlock is too expensive

2015-10-19 Thread Randy Fox (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963452#comment-14963452
 ] 

Randy Fox commented on HBASE-14624:
---

The region was about 1.4G compressed and it took about 2.5 minutes to move.  
The bucket cache is configured at 72G, but was not close to full.  The bucket 
sizes: 9216, 17408, 33792, 66560

2015-10-15 08:34:28,510 INFO 
org.apache.hadoop.hbase.regionserver.RSRpcServices: Close 
dad6c71ed395df19220ef1056a110086, moving to 
hb17.prod1.connexity.net,60020,125924021
2015-10-15 08:34:28,510 DEBUG 
org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: Processing 
close of 
Wildfire_graph3,\x00)"y\x1B\xF0\x13-,1434402364692.dad6c71ed395df19220ef1056a110086.
2015-10-15 08:34:28,511 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
Closing 
Wildfire_graph3,\x00)"y\x1B\xF0\x13-,1434402364692.dad6c71ed395df19220ef1056a110086.:
 disabling compactions & flushes
2015-10-15 08:34:28,511 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
Running close preflush of 
Wildfire_graph3,\x00)"y\x1B\xF0\x13-,1434402364692.dad6c71ed395df19220ef1056a110086.
2015-10-15 08:34:28,511 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
Started memstore flush for 
Wildfire_graph3,\x00)"y\x1B\xF0\x13-,1434402364692.dad6c71ed395df19220ef1056a110086.,
 current region memstore size 44.89 MB, and 1/1 column families' memstores are 
being flushed.
2015-10-15 08:34:28,511 WARN org.apache.hadoop.hbase.regionserver.wal.FSHLog: 
Couldn't find oldest seqNum for the region we are about to flush: 
[dad6c71ed395df19220ef1056a110086]
2015-10-15 08:34:29,137 INFO 
org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher: Flushed, 
sequenceid=70839856845, memsize=44.9 M, hasBloomFilter=false, into tmp file 
hdfs://woz/hbase/data/default/Wildfire_graph3/dad6c71ed395df19220ef1056a110086/.tmp/3600b47839a945de9733cf17581458e0
2015-10-15 08:34:29,144 DEBUG 
org.apache.hadoop.hbase.regionserver.HRegionFileSystem: Committing store file 
hdfs://woz/hbase/data/default/Wildfire_graph3/dad6c71ed395df19220ef1056a110086/.tmp/3600b47839a945de9733cf17581458e0
 as 
hdfs://woz/hbase/data/default/Wildfire_graph3/dad6c71ed395df19220ef1056a110086/L/3600b47839a945de9733cf17581458e0
2015-10-15 08:34:29,150 INFO org.apache.hadoop.hbase.regionserver.HStore: Added 
hdfs://woz/hbase/data/default/Wildfire_graph3/dad6c71ed395df19220ef1056a110086/L/3600b47839a945de9733cf17581458e0,
 entries=190131, sequenceid=70839856845, filesize=3.2 M
2015-10-15 08:34:29,151 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
Finished memstore flush of ~44.89 MB/47066832, currentsize=22.39 KB/22928 for 
region 
Wildfire_graph3,\x00)"y\x1B\xF0\x13-,1434402364692.dad6c71ed395df19220ef1056a110086.
 in 640ms, sequenceid=70839856845, compaction requested=false
2015-10-15 08:34:29,152 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: 
Updates disabled for region 
Wildfire_graph3,\x00)"y\x1B\xF0\x13-,1434402364692.dad6c71ed395df19220ef1056a110086.
2015-10-15 08:34:29,152 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
Started memstore flush for 
Wildfire_graph3,\x00)"y\x1B\xF0\x13-,1434402364692.dad6c71ed395df19220ef1056a110086.,
 current region memstore size 22.39 KB, and 1/1 column families' memstores are 
being flushed.
2015-10-15 08:34:29,152 WARN org.apache.hadoop.hbase.regionserver.wal.FSHLog: 
Couldn't find oldest seqNum for the region we are about to flush: 
[dad6c71ed395df19220ef1056a110086]
2015-10-15 08:34:29,225 INFO 
org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher: Flushed, 
sequenceid=70839856876, memsize=22.4 K, hasBloomFilter=false, into tmp file 
hdfs://woz/hbase/data/default/Wildfire_graph3/dad6c71ed395df19220ef1056a110086/.tmp/a8a74549c7a44174b105cbed23cfaea1
2015-10-15 08:34:29,231 DEBUG 
org.apache.hadoop.hbase.regionserver.HRegionFileSystem: Committing store file 
hdfs://woz/hbase/data/default/Wildfire_graph3/dad6c71ed395df19220ef1056a110086/.tmp/a8a74549c7a44174b105cbed23cfaea1
 as 
hdfs://woz/hbase/data/default/Wildfire_graph3/dad6c71ed395df19220ef1056a110086/L/a8a74549c7a44174b105cbed23cfaea1
2015-10-15 08:34:29,279 INFO org.apache.hadoop.hbase.regionserver.HStore: Added 
hdfs://woz/hbase/data/default/Wildfire_graph3/dad6c71ed395df19220ef1056a110086/L/a8a74549c7a44174b105cbed23cfaea1,
 entries=128, sequenceid=70839856876, filesize=2.6 K
2015-10-15 08:34:29,280 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
Finished memstore flush of ~22.39 KB/22928, currentsize=0 B/0 for region 
Wildfire_graph3,\x00)"y\x1B\xF0\x13-,1434402364692.dad6c71ed395df19220ef1056a110086.
 in 128ms, sequenceid=70839856876, compaction requested=true

...

2015-10-15 08:37:02,945 INFO org.apache.hadoop.hbase.regionserver.HStore: 
Closed L
2015-10-15 08:37:02,954 DEBUG org.apache.hadoop.hbase.wal.WALSplitter: Wrote 
region 
seqId=hdfs://woz/hbase/data/default/Wildfire_graph3/dad6c71ed395df19220ef1056a110086/recovered.edits/70839856879.seqid
 to 

[jira] [Commented] (HBASE-14624) BucketCache.freeBlock is too expensive

2015-10-19 Thread Randy Fox (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963477#comment-14963477
 ] 

Randy Fox commented on HBASE-14624:
---

There are no logs for that. I deduced it from the code and the fact that when I 
added a table to the cache it also started taking a long time to move when it was 
previously quick.  Vladimir Rodionov suggested I open this Jira after 
discussions on the mailing list. 

> BucketCache.freeBlock is too expensive
> --
>
> Key: HBASE-14624
> URL: https://issues.apache.org/jira/browse/HBASE-14624
> Project: HBase
>  Issue Type: Improvement
>  Components: BlockCache
>Affects Versions: 1.0.0
>Reporter: Randy Fox
>
> Moving regions is unacceptably slow when using bucket cache, as it takes too 
> long to free all the blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14631) Region merge request should be audited with request user through proper scope of doAs() calls to region observer notifications

2015-10-19 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14631:
---
 Hadoop Flags: Reviewed
Fix Version/s: 0.98.16
   1.1.3
   1.0.3
   1.3.0
   1.2.0
   2.0.0

> Region merge request should be audited with request user through proper scope 
> of doAs() calls to region observer notifications
> --
>
> Key: HBASE-14631
> URL: https://issues.apache.org/jira/browse/HBASE-14631
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3, 0.98.16
>
> Attachments: 14631-branch-0.98.txt, 14631-branch-1.0.txt, 
> 14631-branch-1.txt, 14631-v1.txt
>
>
> HBASE-14475 and HBASE-14605 narrowed the scope of doAs() calls to region 
> observer notifications for region splitting.
> During review of HBASE-14605, Andrew brought up the case for region merge.
> This JIRA is to implement similar scope narrowing technique for region 
> merging.
> The majority of the change would be in RegionMergeTransactionImpl class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner

2015-10-19 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963502#comment-14963502
 ] 

ramkrishna.s.vasudevan edited comment on HBASE-13082 at 10/19/15 3:58 PM:
--

bq.What about flush? Once the flush is finished, we will have to clear the 
snapshot of cells in Memstore no? Then we will have to open this newly added 
file for reading.. If we dont open it, we can not release the memstore 
snapshot. 
We do open the reader for sure. But one thing to check is that if we 
open it while the current scanner heap is not yet reset, will it have any impact on 
the current scan?  Because during flush the reads can still be served from the 
snapshot; only after the flush is the point to be noted. 
bq.This status is in reader? It suits better in StoreFile no?
The reader is the common object here.  Hence I was referencing it from there. 
Initially I had it in the StoreFile only but felt that StoreFile is more volatile. 
Let me check. It can be done.  In fact I was also not very sure of having the state 
in the reader. 
We can change the states, no problem.
bq.This is a new Chore thread adding to the system.. Mention about it clearly. 
It will be one thread per RS. Also what is its interval? Is it configurable? 
How?
Forgot to add those config details. It is per store and the chore is started by 
the HStore and not the RS.  It can be configured, maybe once in 2 mins or so?  
Currently in the patch it is once in 5 min.
bq.Hence it requires us to add new APIs which ensures that a split can go ahead 
even if reference files are present but they are in the COMPACTED state
This is a bit of a sticky area.  In the case of merge I have tried to forcefully clear 
the compacted files. Maybe we need to do the same with split also. But in 
split do we explicitly call compact?  I was not quite sure on that. But in 
merge we do. 
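
For context, a very simplified sketch of what such a per-store cleanup chore might look like. It uses a plain ScheduledExecutorService and a made-up CompactedFile holder rather than the patch's actual classes; the 5 minute interval is just the value mentioned above and would come from config:
{code}
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class CompactedFilesCleanerSketch {
  // Hypothetical stand-in for a store file with the bookkeeping being discussed.
  static final class CompactedFile {
    final String path;
    volatile boolean compactedAway; // data now lives in the file produced by compaction
    volatile int refCount;          // scanners still reading this file
    CompactedFile(String path) { this.path = path; }
  }

  private final List<CompactedFile> storeFiles = new CopyOnWriteArrayList<>();
  private final ScheduledExecutorService chore = Executors.newSingleThreadScheduledExecutor();

  // One such cleaner per store; the interval is configurable in the real patch.
  void start() {
    chore.scheduleAtFixedRate(this::cleanup, 5, 5, TimeUnit.MINUTES);
  }

  private void cleanup() {
    for (CompactedFile f : storeFiles) {
      // Only archive files that were compacted away AND have no readers left on them.
      if (f.compactedAway && f.refCount == 0) {
        storeFiles.remove(f);
        // moving f.path to the archive directory is omitted in this sketch
      }
    }
  }
}
{code}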




was (Author: ram_krish):
bq.What about flush? Once the flush is finished, we will have to clear the 
snapshot of cells in Memstore no? Then we will have to open this newly added 
file for reading.. If we dont open it, we can not release the memstore 
snapshot. 
We do open the reader for sure. But one thing may be to check is that if we 
open but still the current scanner heap is not reset will it have any impact on 
the current scan is what needs to be checked?  Because during flush the reads 
can still be served from the snapshot. Only after flush is a point to be noted. 
bq.This status is in reader? It suits better in StoreFile no?
The reader is the common object here.  Hence was referencing it from there. 
Initially had it in storefile only but felt that StoreFile is more volatile. 
Let me check. Can be done.  Infact I was also not very sure of having the state 
in the reader. 
We can change the states no problem.
bq.This is a new Chore thread adding to the system.. Mention about it clearly. 
It will be one thread per RS. Also what is its interval? Is it configurable? 
How?
For to add that config details. It is per store and the chore is started by the 
HStore and not the RS. 
bq.Hence it requires us to add new APIs which ensures that a split can go ahead 
even if reference files are present but they are in the COMPACTED state
This is bit of sticky area.  In case of merge I have tried to forcefully clear 
the compacted files. May be we need to do the same with split also. But in 
split do we explicitly call compact?  I was not pretty sure on that. But in 
merge we do. 



> Coarsen StoreScanner locks to RegionScanner
> ---
>
> Key: HBASE-13082
> URL: https://issues.apache.org/jira/browse/HBASE-13082
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: ramkrishna.s.vasudevan
> Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, 
> 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1_WIP.patch, 
> HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, gc.png, 
> gc.png, gc.png, hits.png, next.png, next.png
>
>
> Continuing where HBASE-10015 left of.
> We can avoid locking (and memory fencing) inside StoreScanner by deferring to 
> the lock already held by the RegionScanner.
> In tests this shows quite a scan improvement and reduced CPU (the fences make 
> the cores wait for memory fetches).
> There are some drawbacks too:
> * All calls to RegionScanner need to be remain synchronized
> * Implementors of coprocessors need to be diligent in following the locking 
> contract. For example Phoenix does not lock RegionScanner.nextRaw() and 
> required in the documentation (not picking on Phoenix, this one is my fault 
> as I told them it's OK)
> * possible starving of flushes and compaction with heavy read load. 
> RegionScanner operations would keep getting the locks and the 
> 

[jira] [Commented] (HBASE-14643) Avoid Splits from once again opening a closed reader for fetching the first and last key

2015-10-19 Thread Heng Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963543#comment-14963543
 ] 

Heng Chen commented on HBASE-14643:
---

Yeah, we can do this in HRegion.close(),  but I found that there are a lot of 
usages of {{HRegion.Close}} in the project. It would change many places. 

If we do it outside HRegion.close, it will change little.  I made a simple patch, 
could I upload it?

> Avoid Splits from once again opening a closed reader for fetching the first 
> and last key
> 
>
> Key: HBASE-14643
> URL: https://issues.apache.org/jira/browse/HBASE-14643
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>
> Currently split flow is such that we close the parent region and all its 
> store file readers are also closed.  After that inorder to split the 
> reference files we need the first and last keys for which once again open the 
> readers on those store files.  This could be costlier operation considering 
> the fact that it has to contact the HDFS for this close and open operation. 
> This JIRA is to see if we can improve this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HBASE-14643) Avoid Splits from once again opening a closed reader for fetching the first and last key

2015-10-19 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan reassigned HBASE-14643:
--

Assignee: ramkrishna.s.vasudevan

> Avoid Splits from once again opening a closed reader for fetching the first 
> and last key
> 
>
> Key: HBASE-14643
> URL: https://issues.apache.org/jira/browse/HBASE-14643
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>
> Currently split flow is such that we close the parent region and all its 
> store file readers are also closed.  After that inorder to split the 
> reference files we need the first and last keys for which once again open the 
> readers on those store files.  This could be costlier operation considering 
> the fact that it has to contact the HDFS for this close and open operation. 
> This JIRA is to see if we can improve this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14643) Avoid Splits from once again opening a closed reader for fetching the first and last key

2015-10-19 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963517#comment-14963517
 ] 

ramkrishna.s.vasudevan commented on HBASE-14643:


Exactly, that was the plan. I noticed this while working on something else. I 
wanted to change the return type of HRegion.close() itself so that the 
same can be used in the split flow. 

> Avoid Splits from once again opening a closed reader for fetching the first 
> and last key
> 
>
> Key: HBASE-14643
> URL: https://issues.apache.org/jira/browse/HBASE-14643
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: ramkrishna.s.vasudevan
>
> Currently split flow is such that we close the parent region and all its 
> store file readers are also closed.  After that inorder to split the 
> reference files we need the first and last keys for which once again open the 
> readers on those store files.  This could be costlier operation considering 
> the fact that it has to contact the HDFS for this close and open operation. 
> This JIRA is to see if we can improve this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14631) Region merge request should be audited with request user through proper scope of doAs() calls to region observer notifications

2015-10-19 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14631:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Region merge request should be audited with request user through proper scope 
> of doAs() calls to region observer notifications
> --
>
> Key: HBASE-14631
> URL: https://issues.apache.org/jira/browse/HBASE-14631
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3, 0.98.16
>
> Attachments: 14631-branch-0.98.txt, 14631-branch-1.0.txt, 
> 14631-branch-1.txt, 14631-v1.txt
>
>
> HBASE-14475 and HBASE-14605 narrowed the scope of doAs() calls to region 
> observer notifications for region splitting.
> During review of HBASE-14605, Andrew brought up the case for region merge.
> This JIRA is to implement similar scope narrowing technique for region 
> merging.
> The majority of the change would be in RegionMergeTransactionImpl class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner

2015-10-19 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963502#comment-14963502
 ] 

ramkrishna.s.vasudevan commented on HBASE-13082:


bq.What about flush? Once the flush is finished, we will have to clear the 
snapshot of cells in Memstore no? Then we will have to open this newly added 
file for reading.. If we dont open it, we can not release the memstore 
snapshot. 
We do open the reader for sure. But one thing may be to check is that if we 
open but still the current scanner heap is not reset will it have any impact on 
the current scan is what needs to be checked?  Because during flush the reads 
can still be served from the snapshot. Only after flush is a point to be noted. 
bq.This status is in reader? It suits better in StoreFile no?
The reader is the common object here.  Hence was referencing it from there. 
Initially had it in storefile only but felt that StoreFile is more volatile. 
Let me check. Can be done.  Infact I was also not very sure of having the state 
in the reader. 
We can change the states no problem.
bq.This is a new Chore thread adding to the system.. Mention about it clearly. 
It will be one thread per RS. Also what is its interval? Is it configurable? 
How?
For to add that config details. It is per store and the chore is started by the 
HStore and not the RS. 
bq.Hence it requires us to add new APIs which ensures that a split can go ahead 
even if reference files are present but they are in the COMPACTED state
This is bit of sticky area.  In case of merge I have tried to forcefully clear 
the compacted files. May be we need to do the same with split also. But in 
split do we explicitly call compact?  I was not pretty sure on that. But in 
merge we do. 



> Coarsen StoreScanner locks to RegionScanner
> ---
>
> Key: HBASE-13082
> URL: https://issues.apache.org/jira/browse/HBASE-13082
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: ramkrishna.s.vasudevan
> Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, 
> 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1_WIP.patch, 
> HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, gc.png, 
> gc.png, gc.png, hits.png, next.png, next.png
>
>
> Continuing where HBASE-10015 left of.
> We can avoid locking (and memory fencing) inside StoreScanner by deferring to 
> the lock already held by the RegionScanner.
> In tests this shows quite a scan improvement and reduced CPU (the fences make 
> the cores wait for memory fetches).
> There are some drawbacks too:
> * All calls to RegionScanner need to be remain synchronized
> * Implementors of coprocessors need to be diligent in following the locking 
> contract. For example Phoenix does not lock RegionScanner.nextRaw() and 
> required in the documentation (not picking on Phoenix, this one is my fault 
> as I told them it's OK)
> * possible starving of flushes and compaction with heavy read load. 
> RegionScanner operations would keep getting the locks and the 
> flushes/compactions would not be able finalize the set of files.
> I'll have a patch soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14636) Clear HFileScannerImpl#prevBlocks in between Compaction flow

2015-10-19 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963544#comment-14963544
 ] 

stack commented on HBASE-14636:
---

Want me to try something?

> Clear HFileScannerImpl#prevBlocks in between Compaction flow
> 
>
> Key: HBASE-14636
> URL: https://issues.apache.org/jira/browse/HBASE-14636
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
>Priority: Blocker
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-14645) ServerManager's lock is a serious point of contention

2015-10-19 Thread Elliott Clark (JIRA)
Elliott Clark created HBASE-14645:
-

 Summary: ServerManager's lock is a serious point of contention
 Key: HBASE-14645
 URL: https://issues.apache.org/jira/browse/HBASE-14645
 Project: HBase
  Issue Type: Bug
Reporter: Elliott Clark


On cluster instability the server manager lock is where all threads go to hang 
out.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-14646) Move TestCellACLs from medium to large category

2015-10-19 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-14646.
---
   Resolution: Fixed
Fix Version/s: 1.3.0
   1.2.0
   2.0.0

> Move TestCellACLs from medium to large category
> ---
>
> Key: HBASE-14646
> URL: https://issues.apache.org/jira/browse/HBASE-14646
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
>Priority: Minor
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14646.txt
>
>
> Move this test to the large category because on my local rig, I got this on a 
> run:
> {code}
> org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute 
> goal org.apache.maven.plugins:maven-surefire-plugin:2.18.1:test 
> (secondPartTestsExecution) on project hbase-server: There was a timeout or 
> other error in the fork
> {code}
>  looking at the test output, TestCellACLs seems to be the 'hanging' 
> test.. the test that is not completing.  Lets see if moving the test to 
> different category helps with these fails because of timeout.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14636) Clear HFileScannerImpl#prevBlocks in between Compaction flow

2015-10-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964016#comment-14964016
 ] 

Hadoop QA commented on HBASE-14636:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12767386/HBASE-14636.patch
  against master branch at commit ea0cf399b4b665d6a0daa0c4e616e893e377a283.
  ATTACHMENT ID: 12767386

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 
2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16096//testReport/
Release Findbugs (version 2.0.3)warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16096//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16096//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16096//console

This message is automatically generated.

> Clear HFileScannerImpl#prevBlocks in between Compaction flow
> 
>
> Key: HBASE-14636
> URL: https://issues.apache.org/jira/browse/HBASE-14636
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
>Priority: Blocker
> Fix For: 2.0.0
>
> Attachments: HBASE-14636.patch, HBASE-14636.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14541) TestHFileOutputFormat.testMRIncrementalLoadWithSplit failed due to too many splits and few retries

2015-10-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964075#comment-14964075
 ] 

Hudson commented on HBASE-14541:


FAILURE: Integrated in HBase-TRUNK #6928 (See 
[https://builds.apache.org/job/HBase-TRUNK/6928/])
HBASE-14541 TestHFileOutputFormat.testMRIncrementalLoadWithSplit failed 
(matteo.bertozzi: rev fb583dd1ea5850f8d826bd39f9f8a61f5053e8e3)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestLoadIncrementalHFiles.java


> TestHFileOutputFormat.testMRIncrementalLoadWithSplit failed due to too many 
> splits and few retries
> --
>
> Key: HBASE-14541
> URL: https://issues.apache.org/jira/browse/HBASE-14541
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: Matteo Bertozzi
> Attachments: HBASE-14541-test.patch, HBASE-14541-v0.patch, 
> HBASE-14541-v0.patch
>
>
> This one seems worth a dig. We seem to be making progress but here is what we 
> are trying to load which seems weird:
> {code}
> 2015-10-01 17:19:41,322 INFO  [main] mapreduce.LoadIncrementalHFiles(360): 
> Split occured while grouping HFiles, retry attempt 10 with 4 files remaining 
> to group or split
> 2015-10-01 17:19:41,323 ERROR [main] mapreduce.LoadIncrementalHFiles(402): 
> -
> Bulk load aborted with some files not yet loaded:
> -
>   
> hdfs://localhost:39540/user/jenkins/test-data/720ae36a-2495-456b-ba68-19e260685a35/testLocalMRIncrementalLoad/info-B/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/ce11cbe2490d444d8958264004286aff.bottom
>   
> hdfs://localhost:39540/user/jenkins/test-data/720ae36a-2495-456b-ba68-19e260685a35/testLocalMRIncrementalLoad/info-B/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/ce11cbe2490d444d8958264004286aff.top
>   
> hdfs://localhost:39540/user/jenkins/test-data/720ae36a-2495-456b-ba68-19e260685a35/testLocalMRIncrementalLoad/info-A/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/30c58eeb23a6464da21117e6e1bc565c.bottom
>   
> hdfs://localhost:39540/user/jenkins/test-data/720ae36a-2495-456b-ba68-19e260685a35/testLocalMRIncrementalLoad/info-A/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/30c58eeb23a6464da21117e6e1bc565c.top
> {code}
> Whats that about?
> Making note here. Will keep an eye on this one.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14631) Region merge request should be audited with request user through proper scope of doAs() calls to region observer notifications

2015-10-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964057#comment-14964057
 ] 

Hudson commented on HBASE-14631:


FAILURE: Integrated in HBase-0.98 #1161 (See 
[https://builds.apache.org/job/HBase-0.98/1161/])
HBASE-14631 Region merge request should be audited with request user (tedyu: 
rev aaf5653767c5b73ad15eb65cb644e1b985ac21d7)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionMergeRequest.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionMergeTransaction.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerObserver.java


> Region merge request should be audited with request user through proper scope 
> of doAs() calls to region observer notifications
> --
>
> Key: HBASE-14631
> URL: https://issues.apache.org/jira/browse/HBASE-14631
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3, 0.98.16
>
> Attachments: 14631-branch-0.98.txt, 14631-branch-1.0.txt, 
> 14631-branch-1.txt, 14631-v1.txt
>
>
> HBASE-14475 and HBASE-14605 narrowed the scope of doAs() calls to region 
> observer notifications for region splitting.
> During review of HBASE-14605, Andrew brought up the case for region merge.
> This JIRA is to implement similar scope narrowing technique for region 
> merging.
> The majority of the change would be in RegionMergeTransactionImpl class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14642) Disable flakey TestMultiParallel#testActiveThreadsCount

2015-10-19 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963952#comment-14963952
 ] 

stack commented on HBASE-14642:
---

It can also fail in a different manner. See 
https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.2/264/jdk=latest1.7,label=Hadoop/consoleText

> Disable flakey TestMultiParallel#testActiveThreadsCount
> ---
>
> Key: HBASE-14642
> URL: https://issues.apache.org/jira/browse/HBASE-14642
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
> Attachments: 14642.txt
>
>
> Failed twice in a row on 1.2 build... Disabling for now Unless someone 
> wants to dig in and fix it that is...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner

2015-10-19 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964005#comment-14964005
 ] 

stack commented on HBASE-13082:
---

The one pager is great ([~lhofhansl]!). Nice summary of the issue up top. 
Explanation of the approach your fix takes helps. 

I am trying to understand the COMPACTED vs NON_COMPACTED state. Is 
NON_COMPACTED a freshly-flushed file? Is a COMPACTED file a file that is made 
up of N other storefiles and you want to make sure the scan doesn't include 
duplicated info -- the compacted file and the compactions inputs?

Or I think you are saying that when a file is COMPACTED, then its content can 
be found in another file and so it should not be included in a scan? Should the 
state be COMPACTED_AWAY or REPLACED (by file...). Do we need the NON_COMPACTED 
state? It is the default. No need to call this state anything (Active?)

How can the following happen? "Now in case of the versioned storefile approach 
change, there is a chance that there are reference files which are not yet 
archived after compaction because of the change in the store file management."

Where will you write the COMPACTED_AWAY state? In memory or into the file? (I 
suppose you can't write it to the file because it is already written)

Doc is great [~ram_krish]





> Coarsen StoreScanner locks to RegionScanner
> ---
>
> Key: HBASE-13082
> URL: https://issues.apache.org/jira/browse/HBASE-13082
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: ramkrishna.s.vasudevan
> Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, 
> 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1_WIP.patch, 
> HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, gc.png, 
> gc.png, gc.png, hits.png, next.png, next.png
>
>
> Continuing where HBASE-10015 left of.
> We can avoid locking (and memory fencing) inside StoreScanner by deferring to 
> the lock already held by the RegionScanner.
> In tests this shows quite a scan improvement and reduced CPU (the fences make 
> the cores wait for memory fetches).
> There are some drawbacks too:
> * All calls to RegionScanner need to be remain synchronized
> * Implementors of coprocessors need to be diligent in following the locking 
> contract. For example Phoenix does not lock RegionScanner.nextRaw() and 
> required in the documentation (not picking on Phoenix, this one is my fault 
> as I told them it's OK)
> * possible starving of flushes and compaction with heavy read load. 
> RegionScanner operations would keep getting the locks and the 
> flushes/compactions would not be able finalize the set of files.
> I'll have a patch soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14636) Clear HFileScannerImpl#prevBlocks in between Compaction flow

2015-10-19 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964056#comment-14964056
 ] 

stack commented on HBASE-14636:
---

What about closeCheckInterval? We are already on an interval checking for a 
close. Could we do the shipped() call in here on this same interval? Or at least unite 
the two time-based checks?

> Clear HFileScannerImpl#prevBlocks in between Compaction flow
> 
>
> Key: HBASE-14636
> URL: https://issues.apache.org/jira/browse/HBASE-14636
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
>Priority: Blocker
> Fix For: 2.0.0
>
> Attachments: HBASE-14636.patch, HBASE-14636.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14647) Disable TestWALProcedureStoreOnHDFS#testWalRollOnLowReplication

2015-10-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964085#comment-14964085
 ] 

Hudson commented on HBASE-14647:


FAILURE: Integrated in HBase-1.3 #281 (See 
[https://builds.apache.org/job/HBase-1.3/281/])
HBASE-14647 Disable (stack: rev 8940be05978950d18916d7d67d65c1fa915bd802)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestWALProcedureStoreOnHDFS.java


> Disable TestWALProcedureStoreOnHDFS#testWalRollOnLowReplication
> ---
>
> Key: HBASE-14647
> URL: https://issues.apache.org/jira/browse/HBASE-14647
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14647.txt
>
>
> It failed on two trunk builds. Even after attempts at making the test looser, 
> we still fail. Needs work. Disabling for now while trying to stabilize  build.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14646) Move TestCellACLs from medium to large category

2015-10-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964086#comment-14964086
 ] 

Hudson commented on HBASE-14646:


SUCCESS: Integrated in HBase-1.3-IT #252 (See 
[https://builds.apache.org/job/HBase-1.3-IT/252/])
HBASE-14646 Move TestCellACLs from medium to large category (stack: rev 
b921ed422215dbdb3c90a2f3bbda08c5aecee9ab)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestCellACLs.java
 HBASE-14646 Move TestCellACLs from medium to large category; ADDENDUM (stack: 
rev 74ad3947e998301052503cbb4f7b9aade8ef42ef)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestCellACLs.java


> Move TestCellACLs from medium to large category
> ---
>
> Key: HBASE-14646
> URL: https://issues.apache.org/jira/browse/HBASE-14646
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
>Priority: Minor
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14646.txt
>
>
> Move this test to the large category because on my local rig, I got this on a 
> run:
> {code}
> org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute 
> goal org.apache.maven.plugins:maven-surefire-plugin:2.18.1:test 
> (secondPartTestsExecution) on project hbase-server: There was a timeout or 
> other error in the fork
> {code}
>  looking at the test output, TestCellACLs seems to be the 'hanging' 
> test.. the test that is not completing.  Lets see if moving the test to 
> different category helps with these fails because of timeout.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HBASE-14636) Clear HFileScannerImpl#prevBlocks in between Compaction flow

2015-10-19 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964042#comment-14964042
 ] 

stack edited comment on HBASE-14636 at 10/19/15 9:06 PM:
-

The patch seems to be doing well in my testbed (it takes a good few hours to 
run).  Will keep you posted. We got much further than w/o the patch.

On the patch:

this.curBlock.getMemoryType() == MemoryType.SHARED

Can we keep this internal to the block rather than have HFileScannerImpl have 
to know about SHARED?... Maybe a method on block... 

Sort of similar, this bit where we do a check every two seconds.. could we 
change it to be size based?

Maybe not; every two seconds just seems a little arbitrary. Pity there's not a more 
'natural' place to do the housekeeping.

I had other comments here but edited them out because I remembered how this stuff 
worked after I'd made the comment... my comment made no sense after I 
remembered what shipped does. Please excuse.




was (Author: stack):
The patch seems to be doing well in my testbed (it takes a good few hours to 
run).  Will keep you posted. We got much further than w/o the patch.

On the patch:

this.curBlock.getMemoryType() == MemoryType.SHARED

Can we keep this internal to the block rather than have HFileScannerImpl have 
to know about SHARED?... Maybe a method on block... 

Sort of similiar, this bit where we do a check every two seconds.. could we 
change it to be size based?

  private static final long COMPACTION_PROGRESS_SHIPPED_CALL_INTERVAL = 2 * 
1000;

Could it be a method in KeyValueScanner that checks if we need to ship?

Could call  kvs.shipped(); everytime through and it figures when to ship?  Or 
add a 'ship' method and if it returns true, then called shipped (Is that right? 
The method is 'shipped'. Have the KVs been shipped at this stage or does this 
method ship them? If it ships them, the method should be 'ship' rather than 
'shipped'?)



> Clear HFileScannerImpl#prevBlocks in between Compaction flow
> 
>
> Key: HBASE-14636
> URL: https://issues.apache.org/jira/browse/HBASE-14636
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
>Priority: Blocker
> Fix For: 2.0.0
>
> Attachments: HBASE-14636.patch, HBASE-14636.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14420) Zombie Stomping Session

2015-10-19 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-14420:
--
Attachment: none_fix.txt

> Zombie Stomping Session
> ---
>
> Key: HBASE-14420
> URL: https://issues.apache.org/jira/browse/HBASE-14420
> Project: HBase
>  Issue Type: Umbrella
>  Components: test
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Attachments: hangers.txt, none_fix (1).txt, none_fix.txt, 
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, 
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, 
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, 
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, 
> none_fix.txt, none_fix.txt
>
>
> Patch build are now failing most of the time because we are dropping zombies. 
> I confirm we are doing this on non-apache build boxes too.
> Left-over zombies consume resources on build boxes (OOME cannot create native 
> threads). Having to do multiple test runs in the hope that we can get a 
> non-zombie-making build or making (arbitrary) rulings that the zombies are 
> 'not related' is a productivity sink. And so on...
> This is an umbrella issue for a zombie stomping session that started earlier 
> this week. Will hang sub-issues of this one. Am running builds back-to-back 
> on little cluster to turn out the monsters.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14636) Clear HFileScannerImpl#prevBlocks in between Compaction flow

2015-10-19 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964042#comment-14964042
 ] 

stack commented on HBASE-14636:
---

The patch seems to be doing well in my testbed (it takes a good few hours to 
run).  Will keep you posted. We got much further than w/o the patch.

On the patch:

this.curBlock.getMemoryType() == MemoryType.SHARED

Can we keep this internal to the block rather than have HFileScannerImpl have 
to know about SHARED?... Maybe a method on block... 

Sort of similiar, this bit where we do a check every two seconds.. could we 
change it to be size based?

  private static final long COMPACTION_PROGRESS_SHIPPED_CALL_INTERVAL = 2 * 
1000;

Could it be a method in KeyValueScanner that checks if we need to ship?

Could call  kvs.shipped(); everytime through and it figures when to ship?  Or 
add a 'ship' method and if it returns true, then called shipped (Is that right? 
The method is 'shipped'. Have the KVs been shipped at this stage or does this 
method ship them? If it ships them, the method should be 'ship' rather than 
'shipped'?)
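
To make the size-based idea concrete, a rough sketch follows. The threshold, field and method names are hypothetical; the Shipper interface here just mirrors the existing shipped() callback that lets blocks read so far be released:
{code}
import java.io.IOException;

public class SizeBasedShippedTrigger {
  // Stand-in for the existing shipped() callback on the scanner.
  interface Shipper { void shipped() throws IOException; }

  // Illustrative threshold: call shipped() after a fixed amount of data instead of
  // every N seconds.
  private static final long BYTES_BETWEEN_SHIPPED_CALLS = 32L * 1024 * 1024;
  private long bytesSinceLastShipped = 0;
  private final Shipper scanner;

  SizeBasedShippedTrigger(Shipper scanner) { this.scanner = scanner; }

  // Called by the compaction loop for every cell it writes out.
  void onCellWritten(int cellSerializedSize) throws IOException {
    bytesSinceLastShipped += cellSerializedSize;
    if (bytesSinceLastShipped >= BYTES_BETWEEN_SHIPPED_CALLS) {
      scanner.shipped();          // release references to the blocks read so far
      bytesSinceLastShipped = 0;
    }
  }
}
{code}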



> Clear HFileScannerImpl#prevBlocks in between Compaction flow
> 
>
> Key: HBASE-14636
> URL: https://issues.apache.org/jira/browse/HBASE-14636
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
>Priority: Blocker
> Fix For: 2.0.0
>
> Attachments: HBASE-14636.patch, HBASE-14636.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14646) Move TestCellACLs from medium to large category

2015-10-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964063#comment-14964063
 ] 

Hudson commented on HBASE-14646:


FAILURE: Integrated in HBase-1.2 #275 (See 
[https://builds.apache.org/job/HBase-1.2/275/])
 HBASE-14646 Move TestCellACLs from medium to large category; ADDENDUM (stack: 
rev 443de6ef405887a052b66fbbd56282caf931f5a5)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestCellACLs.java


> Move TestCellACLs from medium to large category
> ---
>
> Key: HBASE-14646
> URL: https://issues.apache.org/jira/browse/HBASE-14646
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
>Priority: Minor
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14646.txt
>
>
> Move this test to the large category because on my local rig, I got this on a 
> run:
> {code}
> org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute 
> goal org.apache.maven.plugins:maven-surefire-plugin:2.18.1:test 
> (secondPartTestsExecution) on project hbase-server: There was a timeout or 
> other error in the fork
> {code}
>  looking at the test output, TestCellACLs seems to be the 'hanging' 
> test.. the test that is not completing.  Lets see if moving the test to 
> different category helps with these fails because of timeout.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14646) Move TestCellACLs from medium to large category

2015-10-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964074#comment-14964074
 ] 

Hudson commented on HBASE-14646:


FAILURE: Integrated in HBase-TRUNK #6928 (See 
[https://builds.apache.org/job/HBase-TRUNK/6928/])
HBASE-14646 Move TestCellACLs from medium to large category (stack: rev 
ea0cf399b4b665d6a0daa0c4e616e893e377a283)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestCellACLs.java


> Move TestCellACLs from medium to large category
> ---
>
> Key: HBASE-14646
> URL: https://issues.apache.org/jira/browse/HBASE-14646
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
>Priority: Minor
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14646.txt
>
>
> Move this test to the large category because on my local rig, I got this on a 
> run:
> {code}
> org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute 
> goal org.apache.maven.plugins:maven-surefire-plugin:2.18.1:test 
> (secondPartTestsExecution) on project hbase-server: There was a timeout or 
> other error in the fork
> {code}
>  looking at the test output, TestCellACLs seems to be the 'hanging' 
> test.. the test that is not completing.  Lets see if moving the test to 
> different category helps with these fails because of timeout.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14647) Disable TestWALProcedureStoreOnHDFS#testWalRollOnLowReplication

2015-10-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964076#comment-14964076
 ] 

Hudson commented on HBASE-14647:


FAILURE: Integrated in HBase-TRUNK #6928 (See 
[https://builds.apache.org/job/HBase-TRUNK/6928/])
HBASE-14647 Disable (stack: rev c1f0442045f44fcbb3935f9244794929a5d0caea)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestWALProcedureStoreOnHDFS.java


> Disable TestWALProcedureStoreOnHDFS#testWalRollOnLowReplication
> ---
>
> Key: HBASE-14647
> URL: https://issues.apache.org/jira/browse/HBASE-14647
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14647.txt
>
>
> It failed on two trunk builds. Even after attempts at making the test looser, 
> we still fail. Needs work. Disabling for now while trying to stabilize  build.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14646) Move TestCellACLs from medium to large category

2015-10-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964083#comment-14964083
 ] 

Hudson commented on HBASE-14646:


FAILURE: Integrated in HBase-1.3 #281 (See 
[https://builds.apache.org/job/HBase-1.3/281/])
HBASE-14646 Move TestCellACLs from medium to large category (stack: rev 
b921ed422215dbdb3c90a2f3bbda08c5aecee9ab)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestCellACLs.java
 HBASE-14646 Move TestCellACLs from medium to large category; ADDENDUM (stack: 
rev 74ad3947e998301052503cbb4f7b9aade8ef42ef)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestCellACLs.java


> Move TestCellACLs from medium to large category
> ---
>
> Key: HBASE-14646
> URL: https://issues.apache.org/jira/browse/HBASE-14646
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
>Priority: Minor
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14646.txt
>
>
> Move this test to the large category because on my local rig, I got this on a 
> run:
> {code}
> org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute 
> goal org.apache.maven.plugins:maven-surefire-plugin:2.18.1:test 
> (secondPartTestsExecution) on project hbase-server: There was a timeout or 
> other error in the fork
> {code}
>  looking at the test output, TestCellACLs seems to be the 'hanging' 
> test.. the test that is not completing.  Lets see if moving the test to 
> different category helps with these fails because of timeout.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14647) Disable TestWALProcedureStoreOnHDFS#testWalRollOnLowReplication

2015-10-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964087#comment-14964087
 ] 

Hudson commented on HBASE-14647:


SUCCESS: Integrated in HBase-1.3-IT #252 (See 
[https://builds.apache.org/job/HBase-1.3-IT/252/])
HBASE-14647 Disable (stack: rev 8940be05978950d18916d7d67d65c1fa915bd802)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestWALProcedureStoreOnHDFS.java


> Disable TestWALProcedureStoreOnHDFS#testWalRollOnLowReplication
> ---
>
> Key: HBASE-14647
> URL: https://issues.apache.org/jira/browse/HBASE-14647
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14647.txt
>
>
> It failed on two trunk builds. Even after attempts at making the test looser, 
> we still fail. Needs work. Disabling for now while trying to stabilize  build.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14541) TestHFileOutputFormat.testMRIncrementalLoadWithSplit failed due to too many splits and few retries

2015-10-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964084#comment-14964084
 ] 

Hudson commented on HBASE-14541:


FAILURE: Integrated in HBase-1.3 #281 (See 
[https://builds.apache.org/job/HBase-1.3/281/])
HBASE-14541 TestHFileOutputFormat.testMRIncrementalLoadWithSplit failed 
(matteo.bertozzi: rev d01063c9a33608fd93a3043a3c3a96e83959cdfb)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestLoadIncrementalHFiles.java


> TestHFileOutputFormat.testMRIncrementalLoadWithSplit failed due to too many 
> splits and few retries
> --
>
> Key: HBASE-14541
> URL: https://issues.apache.org/jira/browse/HBASE-14541
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: Matteo Bertozzi
> Attachments: HBASE-14541-test.patch, HBASE-14541-v0.patch, 
> HBASE-14541-v0.patch
>
>
> This one seems worth a dig. We seem to be making progress but here is what we 
> are trying to load which seems weird:
> {code}
> 2015-10-01 17:19:41,322 INFO  [main] mapreduce.LoadIncrementalHFiles(360): 
> Split occured while grouping HFiles, retry attempt 10 with 4 files remaining 
> to group or split
> 2015-10-01 17:19:41,323 ERROR [main] mapreduce.LoadIncrementalHFiles(402): 
> -
> Bulk load aborted with some files not yet loaded:
> -
>   
> hdfs://localhost:39540/user/jenkins/test-data/720ae36a-2495-456b-ba68-19e260685a35/testLocalMRIncrementalLoad/info-B/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/ce11cbe2490d444d8958264004286aff.bottom
>   
> hdfs://localhost:39540/user/jenkins/test-data/720ae36a-2495-456b-ba68-19e260685a35/testLocalMRIncrementalLoad/info-B/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/ce11cbe2490d444d8958264004286aff.top
>   
> hdfs://localhost:39540/user/jenkins/test-data/720ae36a-2495-456b-ba68-19e260685a35/testLocalMRIncrementalLoad/info-A/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/30c58eeb23a6464da21117e6e1bc565c.bottom
>   
> hdfs://localhost:39540/user/jenkins/test-data/720ae36a-2495-456b-ba68-19e260685a35/testLocalMRIncrementalLoad/info-A/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/30c58eeb23a6464da21117e6e1bc565c.top
> {code}
> Whats that about?
> Making note here. Will keep an eye on this one.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner

2015-10-19 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964031#comment-14964031
 ] 

stack commented on HBASE-13082:
---

bq. When we mark a file as compacted the confusion can be like whether this 
file is a file created out of compaction.

Yeah, that was my first thought too... same as Anoop.

Is that a new Chore per Store per Region?  Every two minutes seems like a long 
time to hold on to files?

Yeah, this is a good point by [~anoop.hbase]:

bq. What about flush? .So we wont change the store file's heap during the 
scan is not really possible?

The other thing to consider is getting bulk loaded files in here while scans 
are going on. Seems like they could go in on open of a new scan. That'll work 
nicely. A file will be bulk loaded but won't be seen by ongoing scans.

The hard thing then is what to do about flush (we've been here before!)

On flush, what if we let all Scans complete before letting go the snapshot? 
More memory pressure. Simpler implementation.

Otherwise, flush registers that it has happened at the region level. Scanners check 
for flush events every so often (we already have checkpoints in the scan to 
ensure we don't go over size or time constraints... we could check at those times) 
and when they find one, they swap in the flushed file. When all Scanners have 
done this, then we let go of the snapshot. This might be a different sort of 
event to the one described in the doc here... where we swap in compacted files 
on new scan creation... but yeah, the implementation would be cleaner if swapping in 
flushed files and compacted files all happened in the same manner.
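
To make the checkpoint idea above concrete, here is a minimal sketch of what it could look like, assuming a region-level flush counter and a latch per snapshot; every name below is made up for illustration and is not an actual HBase class or method:

{code}
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch only -- none of these names exist in HBase today.
class FlushAwareScannerSketch {
  private final AtomicLong regionFlushSeqId;    // bumped by the region when a flush completes
  private final CountDownLatch snapshotRelease; // region frees the snapshot when this reaches zero
  private long seenFlushSeqId;                  // last flush this scanner has accounted for

  FlushAwareScannerSketch(AtomicLong regionFlushSeqId, CountDownLatch snapshotRelease) {
    this.regionFlushSeqId = regionFlushSeqId;
    this.snapshotRelease = snapshotRelease;
    this.seenFlushSeqId = regionFlushSeqId.get();
  }

  // Called at the same checkpoints where size/time limits are already enforced.
  void checkpoint() {
    long current = regionFlushSeqId.get();
    if (current != seenFlushSeqId) {
      swapInFlushedFiles();          // rebuild this scanner's view over the newly flushed file
      seenFlushSeqId = current;
      snapshotRelease.countDown();   // once every scanner has swapped, the snapshot can be let go
    }
  }

  private void swapInFlushedFiles() {
    // re-open the store file heap so it includes the flushed file (details elided)
  }
}
{code}

The same swap-in hook could be reused for compacted and bulk-loaded files if we decide to handle them all in the one manner.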





> Coarsen StoreScanner locks to RegionScanner
> ---
>
> Key: HBASE-13082
> URL: https://issues.apache.org/jira/browse/HBASE-13082
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: ramkrishna.s.vasudevan
> Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, 
> 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1_WIP.patch, 
> HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, gc.png, 
> gc.png, gc.png, hits.png, next.png, next.png
>
>
> Continuing where HBASE-10015 left of.
> We can avoid locking (and memory fencing) inside StoreScanner by deferring to 
> the lock already held by the RegionScanner.
> In tests this shows quite a scan improvement and reduced CPU (the fences make 
> the cores wait for memory fetches).
> There are some drawbacks too:
> * All calls to RegionScanner need to be remain synchronized
> * Implementors of coprocessors need to be diligent in following the locking 
> contract. For example Phoenix does not lock RegionScanner.nextRaw() and 
> required in the documentation (not picking on Phoenix, this one is my fault 
> as I told them it's OK)
> * possible starving of flushes and compaction with heavy read load. 
> RegionScanner operations would keep getting the locks and the 
> flushes/compactions would not be able finalize the set of files.
> I'll have a patch soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14420) Zombie Stomping Session

2015-10-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964049#comment-14964049
 ] 

Hadoop QA commented on HBASE-14420:
---

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12767392/none_fix.txt
  against master branch at commit c1f0442045f44fcbb3935f9244794929a5d0caea.
  ATTACHMENT ID: 12767392

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+0 tests included{color}.  The patch appears to be a 
documentation, build,
or dev-support patch that doesn't require tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 
2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16097//testReport/
Release Findbugs (version 2.0.3)warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16097//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16097//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/16097//console

This message is automatically generated.

> Zombie Stomping Session
> ---
>
> Key: HBASE-14420
> URL: https://issues.apache.org/jira/browse/HBASE-14420
> Project: HBase
>  Issue Type: Umbrella
>  Components: test
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Attachments: hangers.txt, none_fix (1).txt, none_fix.txt, 
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, 
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, 
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, 
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, 
> none_fix.txt
>
>
> Patch build are now failing most of the time because we are dropping zombies. 
> I confirm we are doing this on non-apache build boxes too.
> Left-over zombies consume resources on build boxes (OOME cannot create native 
> threads). Having to do multiple test runs in the hope that we can get a 
> non-zombie-making build or making (arbitrary) rulings that the zombies are 
> 'not related' is a productivity sink. And so on...
> This is an umbrella issue for a zombie stomping session that started earlier 
> this week. Will hang sub-issues of this one. Am running builds back-to-back 
> on little cluster to turn out the monsters.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14646) Move TestCellACLs from medium to large category

2015-10-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964101#comment-14964101
 ] 

Hudson commented on HBASE-14646:


FAILURE: Integrated in HBase-1.2-IT #224 (See 
[https://builds.apache.org/job/HBase-1.2-IT/224/])
HBASE-14646 Move TestCellACLs from medium to large category (stack: rev 
78e6da0c7caf80d0d9af66f75aad98684b9f176f)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestCellACLs.java
 HBASE-14646 Move TestCellACLs from medium to large category; ADDENDUM (stack: 
rev 443de6ef405887a052b66fbbd56282caf931f5a5)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestCellACLs.java


> Move TestCellACLs from medium to large category
> ---
>
> Key: HBASE-14646
> URL: https://issues.apache.org/jira/browse/HBASE-14646
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
>Priority: Minor
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14646.txt
>
>
> Move this test to the large category because on my local rig, I got this on a 
> run:
> {code}
> org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute 
> goal org.apache.maven.plugins:maven-surefire-plugin:2.18.1:test 
> (secondPartTestsExecution) on project hbase-server: There was a timeout or 
> other error in the fork
> {code}
>  looking at the test output, TestCellACLs seems to be the 'hanging' 
> test.. the test that is not completing.  Lets see if moving the test to 
> different category helps with these fails because of timeout.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14647) Disable TestWALProcedureStoreOnHDFS#testWalRollOnLowReplication

2015-10-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964102#comment-14964102
 ] 

Hudson commented on HBASE-14647:


FAILURE: Integrated in HBase-1.2-IT #224 (See 
[https://builds.apache.org/job/HBase-1.2-IT/224/])
HBASE-14647 Disable (stack: rev 102637f3beafe7faa8badb730762a3642a53940a)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestWALProcedureStoreOnHDFS.java


> Disable TestWALProcedureStoreOnHDFS#testWalRollOnLowReplication
> ---
>
> Key: HBASE-14647
> URL: https://issues.apache.org/jira/browse/HBASE-14647
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14647.txt
>
>
> It failed on two trunk builds. Even after attempts at making the test looser, 
> we still fail. Needs work. Disabling for now while trying to stabilize  build.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14647) Disable TestWALProcedureStoreOnHDFS#testWalRollOnLowReplication

2015-10-19 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-14647:
--
Fix Version/s: (was: 1.1.3)

> Disable TestWALProcedureStoreOnHDFS#testWalRollOnLowReplication
> ---
>
> Key: HBASE-14647
> URL: https://issues.apache.org/jira/browse/HBASE-14647
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14647.txt
>
>
> It failed on two trunk builds. Even after attempts at making the test looser, 
> we still fail. Needs work. Disabling for now while trying to stabilize  build.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14362) org.apache.hadoop.hbase.master.procedure.TestWALProcedureStoreOnHDFS is super duper flaky

2015-10-19 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-14362:
--
Fix Version/s: (was: 1.1.3)

> org.apache.hadoop.hbase.master.procedure.TestWALProcedureStoreOnHDFS is super 
> duper flaky
> -
>
> Key: HBASE-14362
> URL: https://issues.apache.org/jira/browse/HBASE-14362
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.0.0
>Reporter: Dima Spivak
>Assignee: Heng Chen
>Priority: Critical
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: HBASE-14362.patch, HBASE-14362.patch, 
> HBASE-14362_v1.patch
>
>
> [As seen in 
> Jenkins|https://builds.apache.org/job/HBase-TRUNK/lastCompletedBuild/testReport/org.apache.hadoop.hbase.master.procedure/TestWALProcedureStoreOnHDFS/history/],
>  this test has been super flaky and we should probably address it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14631) Region merge request should be audited with request user through proper scope of doAs() calls to region observer notifications

2015-10-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963684#comment-14963684
 ] 

Hudson commented on HBASE-14631:


SUCCESS: Integrated in HBase-1.3-IT #251 (See 
[https://builds.apache.org/job/HBase-1.3-IT/251/])
HBASE-14631 Region merge request should be audited with request user (tedyu: 
rev 61b11e0765206b0acf2099958a46604216287ea7)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionMergeTransactionImpl.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionMergeRequest.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionMergeTransaction.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerObserver.java


> Region merge request should be audited with request user through proper scope 
> of doAs() calls to region observer notifications
> --
>
> Key: HBASE-14631
> URL: https://issues.apache.org/jira/browse/HBASE-14631
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3, 0.98.16
>
> Attachments: 14631-branch-0.98.txt, 14631-branch-1.0.txt, 
> 14631-branch-1.txt, 14631-v1.txt
>
>
> HBASE-14475 and HBASE-14605 narrowed the scope of doAs() calls to region 
> observer notifications for region splitting.
> During review of HBASE-14605, Andrew brought up the case for region merge.
> This JIRA is to implement similar scope narrowing technique for region 
> merging.
> The majority of the change would be in RegionMergeTransactionImpl class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-14647) Disable TestWALProcedureStoreOnHDFS#testWalRollOnLowReplication

2015-10-19 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-14647.
---
Resolution: Fixed

Pushed to branch-1.2+ Resolving

> Disable TestWALProcedureStoreOnHDFS#testWalRollOnLowReplication
> ---
>
> Key: HBASE-14647
> URL: https://issues.apache.org/jira/browse/HBASE-14647
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14647.txt
>
>
> It failed on two trunk builds. Even after attempts at making the test looser, 
> we still fail. Needs work. Disabling for now while trying to stabilize  build.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14646) Move TestCellACLs from medium to large category

2015-10-19 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-14646:
--
Attachment: 14646.txt

Pushed to branch-1.2+

> Move TestCellACLs from medium to large category
> ---
>
> Key: HBASE-14646
> URL: https://issues.apache.org/jira/browse/HBASE-14646
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
>Priority: Minor
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14646.txt
>
>
> Move this test to the large category because on my local rig, I got this on a 
> run:
> {code}
> org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute 
> goal org.apache.maven.plugins:maven-surefire-plugin:2.18.1:test 
> (secondPartTestsExecution) on project hbase-server: There was a timeout or 
> other error in the fork
> {code}
>  looking at the test output, TestCellACLs seems to be the 'hanging' 
> test.. the test that is not completing.  Lets see if moving the test to 
> different category helps with these fails because of timeout.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner

2015-10-19 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963860#comment-14963860
 ] 

ramkrishna.s.vasudevan commented on HBASE-13082:


But again, if handling flushes is a problem, then resetting the heap should be 
done in any case. 

> Coarsen StoreScanner locks to RegionScanner
> ---
>
> Key: HBASE-13082
> URL: https://issues.apache.org/jira/browse/HBASE-13082
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: ramkrishna.s.vasudevan
> Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, 
> 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1_WIP.patch, 
> HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, gc.png, 
> gc.png, gc.png, hits.png, next.png, next.png
>
>
> Continuing where HBASE-10015 left of.
> We can avoid locking (and memory fencing) inside StoreScanner by deferring to 
> the lock already held by the RegionScanner.
> In tests this shows quite a scan improvement and reduced CPU (the fences make 
> the cores wait for memory fetches).
> There are some drawbacks too:
> * All calls to RegionScanner need to be remain synchronized
> * Implementors of coprocessors need to be diligent in following the locking 
> contract. For example Phoenix does not lock RegionScanner.nextRaw() and 
> required in the documentation (not picking on Phoenix, this one is my fault 
> as I told them it's OK)
> * possible starving of flushes and compaction with heavy read load. 
> RegionScanner operations would keep getting the locks and the 
> flushes/compactions would not be able finalize the set of files.
> I'll have a patch soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14642) Disable flakey TestMultiParallel#testActiveThreadsCount

2015-10-19 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963946#comment-14963946
 ] 

stack commented on HBASE-14642:
---

[~chenheng] Thanks. Here is an example fail: 
https://builds.apache.org/job/HBase-1.3/jdk=latest1.8,label=Hadoop/277/console

Says
{code}
Failed tests: 
  TestMultiParallel.testActiveThreadsCount:160 expected:<5> but was:<4>
{code}

Above it says branch-1.2 but it should have been branch-1, i.e. 1.3. I 
googled and found that this test has been failing for years... on and off. 
Thanks.

> Disable flakey TestMultiParallel#testActiveThreadsCount
> ---
>
> Key: HBASE-14642
> URL: https://issues.apache.org/jira/browse/HBASE-14642
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
> Attachments: 14642.txt
>
>
> Failed twice in a row on 1.2 build... Disabling for now Unless someone 
> wants to dig in and fix it that is...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14631) Region merge request should be audited with request user through proper scope of doAs() calls to region observer notifications

2015-10-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963669#comment-14963669
 ] 

Hudson commented on HBASE-14631:


SUCCESS: Integrated in HBase-1.2-IT #223 (See 
[https://builds.apache.org/job/HBase-1.2-IT/223/])
HBASE-14631 Region merge request should be audited with request user (tedyu: 
rev f62bbc9b669bdbae4b8267b1728a7743db847f8a)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionMergeTransaction.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionMergeRequest.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerObserver.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionMergeTransactionImpl.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java


> Region merge request should be audited with request user through proper scope 
> of doAs() calls to region observer notifications
> --
>
> Key: HBASE-14631
> URL: https://issues.apache.org/jira/browse/HBASE-14631
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3, 0.98.16
>
> Attachments: 14631-branch-0.98.txt, 14631-branch-1.0.txt, 
> 14631-branch-1.txt, 14631-v1.txt
>
>
> HBASE-14475 and HBASE-14605 narrowed the scope of doAs() calls to region 
> observer notifications for region splitting.
> During review of HBASE-14605, Andrew brought up the case for region merge.
> This JIRA is to implement similar scope narrowing technique for region 
> merging.
> The majority of the change would be in RegionMergeTransactionImpl class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13318) RpcServer.Listener.getAddress should be synchronized

2015-10-19 Thread Nitin Aggarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963780#comment-14963780
 ] 

Nitin Aggarwal commented on HBASE-13318:


We are seeing this issue as well. Are there any updates on it?

> RpcServer.Listener.getAddress should be synchronized
> 
>
> Key: HBASE-13318
> URL: https://issues.apache.org/jira/browse/HBASE-13318
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.10.1
>Reporter: Lars Hofhansl
>Priority: Minor
>  Labels: thread-safety
>
> We just saw exceptions like these:
> {noformat}
> Exception in thread "B.DefaultRpcServer.handler=45,queue=0,port=60020" 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hbase.ipc.RpcServer$Listener.getAddress(RpcServer.java:753)
>   at 
> org.apache.hadoop.hbase.ipc.RpcServer.getListenerAddress(RpcServer.java:2157)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:146)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Looks like RpcServer$Listener.getAddress should be synchronized 
> (acceptChannel is set to null upon exiting the thread under in a synchronized 
> block).
> Should be happening very rarely only.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14631) Region merge request should be audited with request user through proper scope of doAs() calls to region observer notifications

2015-10-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963832#comment-14963832
 ] 

Hudson commented on HBASE-14631:


SUCCESS: Integrated in HBase-TRUNK #6927 (See 
[https://builds.apache.org/job/HBase-TRUNK/6927/])
HBASE-14631 Region merge request should be audited with request user (tedyu: 
rev 57fea77074ea319024b00ea1fab9cc6f1068d696)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionMergeTransaction.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionMergeTransactionImpl.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionMergeRequest.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerObserver.java


> Region merge request should be audited with request user through proper scope 
> of doAs() calls to region observer notifications
> --
>
> Key: HBASE-14631
> URL: https://issues.apache.org/jira/browse/HBASE-14631
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3, 0.98.16
>
> Attachments: 14631-branch-0.98.txt, 14631-branch-1.0.txt, 
> 14631-branch-1.txt, 14631-v1.txt
>
>
> HBASE-14475 and HBASE-14605 narrowed the scope of doAs() calls to region 
> observer notifications for region splitting.
> During review of HBASE-14605, Andrew brought up the case for region merge.
> This JIRA is to implement similar scope narrowing technique for region 
> merging.
> The majority of the change would be in RegionMergeTransactionImpl class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14631) Region merge request should be audited with request user through proper scope of doAs() calls to region observer notifications

2015-10-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963831#comment-14963831
 ] 

Hudson commented on HBASE-14631:


FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #1114 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/1114/])
HBASE-14631 Region merge request should be audited with request user (tedyu: 
rev aaf5653767c5b73ad15eb65cb644e1b985ac21d7)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionMergeRequest.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionMergeTransaction.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerObserver.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java


> Region merge request should be audited with request user through proper scope 
> of doAs() calls to region observer notifications
> --
>
> Key: HBASE-14631
> URL: https://issues.apache.org/jira/browse/HBASE-14631
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3, 0.98.16
>
> Attachments: 14631-branch-0.98.txt, 14631-branch-1.0.txt, 
> 14631-branch-1.txt, 14631-v1.txt
>
>
> HBASE-14475 and HBASE-14605 narrowed the scope of doAs() calls to region 
> observer notifications for region splitting.
> During review of HBASE-14605, Andrew brought up the case for region merge.
> This JIRA is to implement similar scope narrowing technique for region 
> merging.
> The majority of the change would be in RegionMergeTransactionImpl class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-14648) Reenable TestWALProcedureStoreOnHDFS#testWalRollOnLowReplication

2015-10-19 Thread stack (JIRA)
stack created HBASE-14648:
-

 Summary: Reenable 
TestWALProcedureStoreOnHDFS#testWalRollOnLowReplication
 Key: HBASE-14648
 URL: https://issues.apache.org/jira/browse/HBASE-14648
 Project: HBase
  Issue Type: Sub-task
  Components: test
Reporter: stack






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14647) Disable TestWALProcedureStoreOnHDFS#testWalRollOnLowReplication

2015-10-19 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-14647:
--
Attachment: 14647.txt

Disabled for now. Needs work.

Pushed to branch-1.2+

The push added more logging to try and help with this case:

{code}
java.lang.RuntimeException: sync aborted
at 
org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:492)
at 
org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.insert(WALProcedureStore.java:335)
at 
org.apache.hadoop.hbase.master.procedure.TestWALProcedureStoreOnHDFS.testWalRollOnLowReplication(TestWALProcedureStoreOnHDFS.java:201)
{code}

... looks like i < 50 inserts but I didn't have a log to say how many.

The exception is:

{code}
2015-10-19 04:32:36,051 WARN  [Thread-416] 
hdfs.DFSOutputStream$DataStreamer(558): DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File 
/test-logs/state-0015.log could only be replicated to 2 nodes 
instead of minReplication (=3).  There are 3 datanode(s) running and 3 node(s) 
are excluded in this operation.
...
{code}

So, somehow we are marking all our replicas as bad.

Would upping the datanode count to 6 or 10 make this test more likely to pass?

For study. Let me file an issue to re-enable it.
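
If we do try upping the datanode count when re-enabling, a minimal sketch of the setup change could look like the below, assuming the test keeps standing up the mini DFS cluster through HBaseTestingUtility; the counts are illustrative only:

{code}
import org.apache.hadoop.hbase.HBaseTestingUtility;

public class WalRollSetupSketch {
  public static void main(String[] args) throws Exception {
    // Illustrative only: run more datanodes than the 3 the test uses today, so that
    // excluding a couple of replicas during the roll still satisfies minReplication (=3).
    HBaseTestingUtility util = new HBaseTestingUtility();
    util.getConfiguration().setInt("dfs.replication", 3);
    util.startMiniDFSCluster(6); // 6 datanodes instead of 3
  }
}
{code}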

> Disable TestWALProcedureStoreOnHDFS#testWalRollOnLowReplication
> ---
>
> Key: HBASE-14647
> URL: https://issues.apache.org/jira/browse/HBASE-14647
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.1.3
>
> Attachments: 14647.txt
>
>
> It failed on two trunk builds. Even after attempts at making the test looser, 
> we still fail. Needs work. Disabling for now while trying to stabilize  build.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14647) Disable TestWALProcedureStoreOnHDFS#testWalRollOnLowReplication

2015-10-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963841#comment-14963841
 ] 

Hudson commented on HBASE-14647:


FAILURE: Integrated in HBase-1.2 #274 (See 
[https://builds.apache.org/job/HBase-1.2/274/])
HBASE-14647 Disable (stack: rev 102637f3beafe7faa8badb730762a3642a53940a)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestWALProcedureStoreOnHDFS.java


> Disable TestWALProcedureStoreOnHDFS#testWalRollOnLowReplication
> ---
>
> Key: HBASE-14647
> URL: https://issues.apache.org/jira/browse/HBASE-14647
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14647.txt
>
>
> It failed on two trunk builds. Even after attempts at making the test looser, 
> we still fail. Needs work. Disabling for now while trying to stabilize  build.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14631) Region merge request should be audited with request user through proper scope of doAs() calls to region observer notifications

2015-10-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963876#comment-14963876
 ] 

Hudson commented on HBASE-14631:


FAILURE: Integrated in HBase-1.3 #280 (See 
[https://builds.apache.org/job/HBase-1.3/280/])
HBASE-14631 Region merge request should be audited with request user (tedyu: 
rev 61b11e0765206b0acf2099958a46604216287ea7)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerObserver.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionMergeRequest.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionMergeTransaction.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionMergeTransactionImpl.java


> Region merge request should be audited with request user through proper scope 
> of doAs() calls to region observer notifications
> --
>
> Key: HBASE-14631
> URL: https://issues.apache.org/jira/browse/HBASE-14631
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3, 0.98.16
>
> Attachments: 14631-branch-0.98.txt, 14631-branch-1.0.txt, 
> 14631-branch-1.txt, 14631-v1.txt
>
>
> HBASE-14475 and HBASE-14605 narrowed the scope of doAs() calls to region 
> observer notifications for region splitting.
> During review of HBASE-14605, Andrew brought up the case for region merge.
> This JIRA is to implement similar scope narrowing technique for region 
> merging.
> The majority of the change would be in RegionMergeTransactionImpl class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-14646) Move TestCellACLs from medium to large category

2015-10-19 Thread stack (JIRA)
stack created HBASE-14646:
-

 Summary: Move TestCellACLs from medium to large category
 Key: HBASE-14646
 URL: https://issues.apache.org/jira/browse/HBASE-14646
 Project: HBase
  Issue Type: Sub-task
  Components: test
Reporter: stack
Assignee: stack
Priority: Minor


Move this test to the large category because on my local rig, I got this on a 
run:

{code}
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.18.1:test 
(secondPartTestsExecution) on project hbase-server: There was a timeout or 
other error in the fork
{code}

looking at the test output, TestCellACLs seems to be the 'hanging' test... 
the test that is not completing. Let's see if moving the test to a different 
category helps with these failures because of timeout.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14648) Reenable TestWALProcedureStoreOnHDFS#testWalRollOnLowReplication

2015-10-19 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963682#comment-14963682
 ] 

stack commented on HBASE-14648:
---

See HBASE-14647 for some observations on recent fails. This is a critical test. 
Needs to work.

> Reenable TestWALProcedureStoreOnHDFS#testWalRollOnLowReplication
> 
>
> Key: HBASE-14648
> URL: https://issues.apache.org/jira/browse/HBASE-14648
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.1.3
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14631) Region merge request should be audited with request user through proper scope of doAs() calls to region observer notifications

2015-10-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963710#comment-14963710
 ] 

Hudson commented on HBASE-14631:


FAILURE: Integrated in HBase-1.0 #1091 (See 
[https://builds.apache.org/job/HBase-1.0/1091/])
HBASE-14631 Region merge request should be audited with request user (tedyu: 
rev 1b816a518572fbfa4928a792bd160cabf948fb4c)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionMergeTransaction.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerObserver.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionMergeRequest.java


> Region merge request should be audited with request user through proper scope 
> of doAs() calls to region observer notifications
> --
>
> Key: HBASE-14631
> URL: https://issues.apache.org/jira/browse/HBASE-14631
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3, 0.98.16
>
> Attachments: 14631-branch-0.98.txt, 14631-branch-1.0.txt, 
> 14631-branch-1.txt, 14631-v1.txt
>
>
> HBASE-14475 and HBASE-14605 narrowed the scope of doAs() calls to region 
> observer notifications for region splitting.
> During review of HBASE-14605, Andrew brought up the case for region merge.
> This JIRA is to implement similar scope narrowing technique for region 
> merging.
> The majority of the change would be in RegionMergeTransactionImpl class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-14647) Disable TestWALProcedureStoreOnHDFS#testWalRollOnLowReplication

2015-10-19 Thread stack (JIRA)
stack created HBASE-14647:
-

 Summary: Disable 
TestWALProcedureStoreOnHDFS#testWalRollOnLowReplication
 Key: HBASE-14647
 URL: https://issues.apache.org/jira/browse/HBASE-14647
 Project: HBase
  Issue Type: Sub-task
Reporter: stack
Assignee: stack


It failed on two trunk builds. Even after attempts at making the test looser, 
we still fail. Needs work. Disabling for now while trying to stabilize the build.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14646) Move TestCellACLs from medium to large category

2015-10-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963840#comment-14963840
 ] 

Hudson commented on HBASE-14646:


FAILURE: Integrated in HBase-1.2 #274 (See 
[https://builds.apache.org/job/HBase-1.2/274/])
HBASE-14646 Move TestCellACLs from medium to large category (stack: rev 
78e6da0c7caf80d0d9af66f75aad98684b9f176f)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestCellACLs.java


> Move TestCellACLs from medium to large category
> ---
>
> Key: HBASE-14646
> URL: https://issues.apache.org/jira/browse/HBASE-14646
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
>Priority: Minor
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14646.txt
>
>
> Move this test to the large category because on my local rig, I got this on a 
> run:
> {code}
> org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute 
> goal org.apache.maven.plugins:maven-surefire-plugin:2.18.1:test 
> (secondPartTestsExecution) on project hbase-server: There was a timeout or 
> other error in the fork
> {code}
>  looking at the test output, TestCellACLs seems to be the 'hanging' 
> test.. the test that is not completing.  Lets see if moving the test to 
> different category helps with these fails because of timeout.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner

2015-10-19 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963858#comment-14963858
 ] 

ramkrishna.s.vasudevan commented on HBASE-13082:


bq.So we wont change the store file's heap during the scan is not really 
possible?
I think our comments were published at the same time and in between, JIRA was 
down. Still, I think we can do this for a pure bulk-loaded case. 

> Coarsen StoreScanner locks to RegionScanner
> ---
>
> Key: HBASE-13082
> URL: https://issues.apache.org/jira/browse/HBASE-13082
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: ramkrishna.s.vasudevan
> Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, 
> 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1_WIP.patch, 
> HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, gc.png, 
> gc.png, gc.png, hits.png, next.png, next.png
>
>
> Continuing where HBASE-10015 left of.
> We can avoid locking (and memory fencing) inside StoreScanner by deferring to 
> the lock already held by the RegionScanner.
> In tests this shows quite a scan improvement and reduced CPU (the fences make 
> the cores wait for memory fetches).
> There are some drawbacks too:
> * All calls to RegionScanner need to be remain synchronized
> * Implementors of coprocessors need to be diligent in following the locking 
> contract. For example Phoenix does not lock RegionScanner.nextRaw() and 
> required in the documentation (not picking on Phoenix, this one is my fault 
> as I told them it's OK)
> * possible starving of flushes and compaction with heavy read load. 
> RegionScanner operations would keep getting the locks and the 
> flushes/compactions would not be able finalize the set of files.
> I'll have a patch soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14648) Reenable TestWALProcedureStoreOnHDFS#testWalRollOnLowReplication

2015-10-19 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-14648:
--
Priority: Critical  (was: Major)

> Reenable TestWALProcedureStoreOnHDFS#testWalRollOnLowReplication
> 
>
> Key: HBASE-14648
> URL: https://issues.apache.org/jira/browse/HBASE-14648
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Priority: Critical
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.1.3
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14541) TestHFileOutputFormat.testMRIncrementalLoadWithSplit failed due to too many splits and few retries

2015-10-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963685#comment-14963685
 ] 

Hudson commented on HBASE-14541:


SUCCESS: Integrated in HBase-1.3-IT #251 (See 
[https://builds.apache.org/job/HBase-1.3-IT/251/])
HBASE-14541 TestHFileOutputFormat.testMRIncrementalLoadWithSplit failed 
(matteo.bertozzi: rev d01063c9a33608fd93a3043a3c3a96e83959cdfb)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestLoadIncrementalHFiles.java


> TestHFileOutputFormat.testMRIncrementalLoadWithSplit failed due to too many 
> splits and few retries
> --
>
> Key: HBASE-14541
> URL: https://issues.apache.org/jira/browse/HBASE-14541
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: Matteo Bertozzi
> Attachments: HBASE-14541-test.patch, HBASE-14541-v0.patch, 
> HBASE-14541-v0.patch
>
>
> This one seems worth a dig. We seem to be making progress but here is what we 
> are trying to load which seems weird:
> {code}
> 2015-10-01 17:19:41,322 INFO  [main] mapreduce.LoadIncrementalHFiles(360): 
> Split occured while grouping HFiles, retry attempt 10 with 4 files remaining 
> to group or split
> 2015-10-01 17:19:41,323 ERROR [main] mapreduce.LoadIncrementalHFiles(402): 
> -
> Bulk load aborted with some files not yet loaded:
> -
>   
> hdfs://localhost:39540/user/jenkins/test-data/720ae36a-2495-456b-ba68-19e260685a35/testLocalMRIncrementalLoad/info-B/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/ce11cbe2490d444d8958264004286aff.bottom
>   
> hdfs://localhost:39540/user/jenkins/test-data/720ae36a-2495-456b-ba68-19e260685a35/testLocalMRIncrementalLoad/info-B/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/ce11cbe2490d444d8958264004286aff.top
>   
> hdfs://localhost:39540/user/jenkins/test-data/720ae36a-2495-456b-ba68-19e260685a35/testLocalMRIncrementalLoad/info-A/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/30c58eeb23a6464da21117e6e1bc565c.bottom
>   
> hdfs://localhost:39540/user/jenkins/test-data/720ae36a-2495-456b-ba68-19e260685a35/testLocalMRIncrementalLoad/info-A/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/_tmp/30c58eeb23a6464da21117e6e1bc565c.top
> {code}
> Whats that about?
> Making note here. Will keep an eye on this one.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14631) Region merge request should be audited with request user through proper scope of doAs() calls to region observer notifications

2015-10-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963702#comment-14963702
 ] 

Hudson commented on HBASE-14631:


FAILURE: Integrated in HBase-1.2 #273 (See 
[https://builds.apache.org/job/HBase-1.2/273/])
HBASE-14631 Region merge request should be audited with request user (tedyu: 
rev f62bbc9b669bdbae4b8267b1728a7743db847f8a)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionMergeRequest.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionMergeTransaction.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionMergeTransactionImpl.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerObserver.java


> Region merge request should be audited with request user through proper scope 
> of doAs() calls to region observer notifications
> --
>
> Key: HBASE-14631
> URL: https://issues.apache.org/jira/browse/HBASE-14631
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3, 0.98.16
>
> Attachments: 14631-branch-0.98.txt, 14631-branch-1.0.txt, 
> 14631-branch-1.txt, 14631-v1.txt
>
>
> HBASE-14475 and HBASE-14605 narrowed the scope of doAs() calls to region 
> observer notifications for region splitting.
> During review of HBASE-14605, Andrew brought up the case for region merge.
> This JIRA is to implement similar scope narrowing technique for region 
> merging.
> The majority of the change would be in RegionMergeTransactionImpl class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14636) Clear HFileScannerImpl#prevBlocks in between Compaction flow

2015-10-19 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963854#comment-14963854
 ] 

ramkrishna.s.vasudevan commented on HBASE-14636:


With 9G heap space and the LRU cache, just after loading 20G of data and running 
workload c, we see pauses of up to 1.5 secs even with the patch applied. So is it 
something else that is leading to these bigger GCs?

> Clear HFileScannerImpl#prevBlocks in between Compaction flow
> 
>
> Key: HBASE-14636
> URL: https://issues.apache.org/jira/browse/HBASE-14636
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
>Priority: Blocker
> Fix For: 2.0.0
>
> Attachments: HBASE-14636.patch, HBASE-14636.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14387) Compaction improvements: Maximum off-peak compaction size

2015-10-19 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-14387:
--
Summary: Compaction improvements: Maximum off-peak compaction size  (was: 
Compaction improvements: Maximum compaction size)

> Compaction improvements: Maximum off-peak compaction size
> -
>
> Key: HBASE-14387
> URL: https://issues.apache.org/jira/browse/HBASE-14387
> Project: HBase
>  Issue Type: Improvement
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
> Fix For: 2.0.0
>
> Attachments: HBASE-14387-v1.patch
>
>
> Make max compaction size for peak and off peak separate config options.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14387) Compaction improvements: Maximum compaction size

2015-10-19 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-14387:
--
Status: Patch Available  (was: Open)

> Compaction improvements: Maximum compaction size
> 
>
> Key: HBASE-14387
> URL: https://issues.apache.org/jira/browse/HBASE-14387
> Project: HBase
>  Issue Type: Improvement
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
> Fix For: 2.0.0
>
> Attachments: HBASE-14387-v1.patch
>
>
> Make max compaction size for peak and off peak separate config options.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14387) Compaction improvements: Maximum compaction size

2015-10-19 Thread Vladimir Rodionov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Rodionov updated HBASE-14387:
--
Attachment: HBASE-14387-v1.patch

Patch v1. Added a new compaction configuration option:
*hbase.hstore.compaction.max.size.offpeak* - maximum selection size eligible 
for minor compaction during off-peak hours.

*hbase.hstore.compaction.max.size* - this remains the default maximum if no 
off-peak hours are defined or if no maximum off-peak size is defined. 
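
For anyone trying the patch, a quick way to exercise both options from code could look like the below; the sizes and hours are arbitrary examples, not recommendations (the off-peak window options already exist, only the off-peak max size is new in this patch):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class OffPeakCompactionConfigExample {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    // Existing cap on the total size of files selected for a minor compaction.
    conf.setLong("hbase.hstore.compaction.max.size", 8L * 1024 * 1024 * 1024);          // 8 GB
    // New cap from this patch, applied only inside the configured off-peak window.
    conf.setLong("hbase.hstore.compaction.max.size.offpeak", 20L * 1024 * 1024 * 1024); // 20 GB
    // Existing off-peak window options; without a window the off-peak cap never applies.
    conf.setInt("hbase.offpeak.start.hour", 1);
    conf.setInt("hbase.offpeak.end.hour", 6);
  }
}
{code}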

> Compaction improvements: Maximum compaction size
> 
>
> Key: HBASE-14387
> URL: https://issues.apache.org/jira/browse/HBASE-14387
> Project: HBase
>  Issue Type: Improvement
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
> Fix For: 2.0.0
>
> Attachments: HBASE-14387-v1.patch
>
>
> Make max compaction size for peak and off peak separate config options.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14636) Clear HFileScannerImpl#prevBlocks in between Compaction flow

2015-10-19 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964258#comment-14964258
 ] 

stack commented on HBASE-14636:
---

The patch seems to have made it so we survived the loading. Great. I'm trying 
workload c again... and then will try again but w/ less memory. Will report back.

> Clear HFileScannerImpl#prevBlocks in between Compaction flow
> 
>
> Key: HBASE-14636
> URL: https://issues.apache.org/jira/browse/HBASE-14636
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
>Priority: Blocker
> Fix For: 2.0.0
>
> Attachments: HBASE-14636.patch, HBASE-14636.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-8939) Hanging unit tests

2015-10-19 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-8939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-8939.
--
Resolution: Fixed
  Assignee: stack

Resolving parent issue because all subtasks are done and this issue was 
subsumed by HBASE-14420.

> Hanging unit tests
> --
>
> Key: HBASE-8939
> URL: https://issues.apache.org/jira/browse/HBASE-8939
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Reporter: stack
>Assignee: stack
> Attachments: 8939.txt
>
>
> We have hanging tests.  Here's a few from this morning's review:
> {code}
> durruti:0.95 stack$ ./dev-support/findHangingTest.sh  
> https://builds.apache.org/job/hbase-0.95-on-hadoop2/176/consoleText
>   % Total% Received % Xferd  Average Speed   TimeTime Time  
> Current
>  Dload  Upload   Total   SpentLeft  Speed
> 100 3300k0 3300k0 0   508k  0 --:--:--  0:00:06 --:--:--  621k
> Hanging test: Running org.apache.hadoop.hbase.TestIOFencing
> Hanging test: Running org.apache.hadoop.hbase.regionserver.wal.TestLogRolling
> {code}
> And...
> {code}
> durruti:0.95 stack$ ./dev-support/findHangingTest.sh 
> http://54.241.6.143/job/HBase-TRUNK-Hadoop-2/396/consoleText
>   % Total% Received % Xferd  Average Speed   TimeTime Time  
> Current
>  Dload  Upload   Total   SpentLeft  Speed
> 100  779k0  779k0 0   538k  0 --:--:--  0:00:01 --:--:--  559k
> Hanging test: Running org.apache.hadoop.hbase.TestIOFencing
> Hanging test: Running 
> org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort
> Hanging test: Running org.apache.hadoop.hbase.client.TestFromClientSide3
> {code}
> and
> {code}
> durruti:0.95 stack$ ./dev-support/findHangingTest.sh  
> http://54.241.6.143/job/HBase-0.95/607/consoleText
>   % Total% Received % Xferd  Average Speed   TimeTime Time  
> Current
>  Dload  Upload   Total   SpentLeft  Speed
> 100  445k0  445k0 0   490k  0 --:--:-- --:--:-- --:--:--  522k
> Hanging test: Running 
> org.apache.hadoop.hbase.replication.TestReplicationDisableInactivePeer
> Hanging test: Running org.apache.hadoop.hbase.master.TestAssignmentManager
> Hanging test: Running org.apache.hadoop.hbase.util.TestHBaseFsck
> Hanging test: Running 
> org.apache.hadoop.hbase.regionserver.TestStoreFileBlockCacheSummary
> Hanging test: Running 
> org.apache.hadoop.hbase.IntegrationTestDataIngestSlowDeterministic
> {code}
> and...
> {code}
> durruti:0.95 stack$ ./dev-support/findHangingTest.sh  
> http://54.241.6.143/job/HBase-0.95-Hadoop-2/607/consoleText
>   % Total% Received % Xferd  Average Speed   TimeTime Time  
> Current
>  Dload  Upload   Total   SpentLeft  Speed
> 100  781k0  781k0 0   240k  0 --:--:--  0:00:03 --:--:--  244k
> Hanging test: Running 
> org.apache.hadoop.hbase.coprocessor.TestCoprocessorEndpoint
> Hanging test: Running org.apache.hadoop.hbase.client.TestFromClientSide
> Hanging test: Running org.apache.hadoop.hbase.TestIOFencing
> Hanging test: Running 
> org.apache.hadoop.hbase.master.TestMasterFailoverBalancerPersistence
> Hanging test: Running 
> org.apache.hadoop.hbase.master.TestDistributedLogSplitting
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-12558) Disable TestHCM.testClusterStatus Unexpected exception, expected but was

2015-10-19 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-12558.
---
Resolution: Fixed

Resolving. This test was disabled. Make a new issue to re-enable it after the 
flakiness is fixed.

> Disable TestHCM.testClusterStatus Unexpected exception, 
> expected 
> but was
> -
>
> Key: HBASE-12558
> URL: https://issues.apache.org/jira/browse/HBASE-12558
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0
>
> Attachments: 12558-master.patch, 12558.ignore.txt
>
>
> Happens for me reliably on Mac OS X. I looked at fixing it. The listener is 
> not noticing the publish for whatever reason; that's where I stopped (see the 
> sketch after the stack trace below).
> {code}
> java.lang.Exception: Unexpected exception, 
> expected 
> but was
>   at junit.framework.Assert.fail(Assert.java:57)
>   at org.apache.hadoop.hbase.Waiter.waitFor(Waiter.java:193)
>   at 
> org.apache.hadoop.hbase.HBaseTestingUtility.waitFor(HBaseTestingUtility.java:3537)
>   at 
> org.apache.hadoop.hbase.client.TestHCM.testClusterStatus(TestHCM.java:273)
> {code}
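
For context, a minimal sketch of the kind of bounded wait involved (assuming the 
{{org.apache.hadoop.hbase.Waiter}} utility visible in the stack trace above, taking a 
Configuration, a timeout in milliseconds, and a {{Waiter.Predicate}}; the flag and the 
timings are hypothetical and only stand in for "the listener noticed the publish"):

{code}
import java.util.concurrent.atomic.AtomicBoolean;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.Waiter;

public class WaitForPublishSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // Stand-in for "the cluster status listener saw the published status".
    final AtomicBoolean listenerSawPublish = new AtomicBoolean(false);

    // Simulate the publish arriving after a short delay.
    new Thread(new Runnable() {
      @Override
      public void run() {
        try {
          Thread.sleep(2000);
        } catch (InterruptedException ignored) {
          Thread.currentThread().interrupt();
        }
        listenerSawPublish.set(true);
      }
    }).start();

    // Waiter.waitFor polls the predicate until it returns true or the timeout expires.
    // In the flaky failure above, the equivalent condition never becomes true because
    // the listener does not notice the publish, so the wait times out and the test fails.
    Waiter.waitFor(conf, 10000, new Waiter.Predicate<Exception>() {
      @Override
      public boolean evaluate() throws Exception {
        return listenerSawPublish.get();
      }
    });
    System.out.println("Condition observed before timeout");
  }
}
{code}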



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-14569) Disable hanging test TestNamespaceAuditor

2015-10-19 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-14569.
---
Resolution: Fixed
  Assignee: stack  (was: Vandana Ayyalasomayajula)

HBASE-14650 is the issue to re-enable this flaky test once it is fixed.

> Disable hanging test TestNamespaceAuditor
> -
>
> Key: HBASE-14569
> URL: https://issues.apache.org/jira/browse/HBASE-14569
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Attachments: 14569.txt, 14569.txt
>
>
> The test hung here: 
> https://builds.apache.org/job/PreCommit-HBASE-Build/15893//console and it hangs 
> quite regularly. Any chance of taking a look, [~avandana]? Otherwise, I'll just 
> disable it so we can get clean builds again. Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

