[jira] [Updated] (HBASE-16157) The incorrect block cache count and size are caused by removing duplicate block key in the LruBlockCache

2016-07-01 Thread ChiaPing Tsai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ChiaPing Tsai updated HBASE-16157:
--
Status: Patch Available  (was: Open)

I ran the failed test locally, and it passed.

> The incorrect block cache count and size are caused by removing duplicate 
> block key in the LruBlockCache
> 
>
> Key: HBASE-16157
> URL: https://issues.apache.org/jira/browse/HBASE-16157
> Project: HBase
>  Issue Type: Bug
>Reporter: ChiaPing Tsai
>Assignee: ChiaPing Tsai
>Priority: Trivial
> Attachments: HBASE-16157-v1.patch, HBASE-16157-v2.patch, 
> HBASE-16157-v3.patch, HBASE-16157-v4.patch
>
>
> {code:title=LruBlockCache.java|borderStyle=solid}
> // Check return value from the Map#remove before updating the metrics
>   protected long evictBlock(LruCachedBlock block, boolean evictedByEvictionProcess) {
>     map.remove(block.getCacheKey());
>     updateSizeMetrics(block, true);
>     ...
>   }
> {code}
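
As a rough illustration of the comment in the snippet above, the sketch below checks the return value of Map#remove before touching the metrics, so that a second eviction of the same key changes nothing. GuardedEvictionCache and all of its members are hypothetical names and far simpler than LruBlockCache; this is not the attached patch.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical, simplified stand-in for a block cache; not the real LruBlockCache. */
class GuardedEvictionCache {
  private final ConcurrentMap<String, byte[]> map = new ConcurrentHashMap<>();
  private final AtomicLong size = new AtomicLong();
  private final AtomicLong count = new AtomicLong();

  void cacheBlock(String key, byte[] block) {
    // Only count the block if the key was not cached already.
    if (map.putIfAbsent(key, block) == null) {
      size.addAndGet(block.length);
      count.incrementAndGet();
    }
  }

  /** Returns the number of bytes freed, or 0 if the key was already gone. */
  long evictBlock(String key) {
    byte[] removed = map.remove(key);
    if (removed == null) {
      // Another caller already evicted this key; leave the metrics untouched.
      return 0;
    }
    size.addAndGet(-removed.length);
    count.decrementAndGet();
    return removed.length;
  }

  long getSize() {
    return size.get();
  }

  long getCount() {
    return count.get();
  }
}
```

With this guard, evicting the same key twice decrements the size and count only once.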



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16157) The incorrect block cache count and size are caused by removing duplicate block key in the LruBlockCache

2016-07-01 Thread ChiaPing Tsai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ChiaPing Tsai updated HBASE-16157:
--
Status: Open  (was: Patch Available)

> The incorrect block cache count and size are caused by removing duplicate 
> block key in the LruBlockCache
> 
>
> Key: HBASE-16157
> URL: https://issues.apache.org/jira/browse/HBASE-16157
> Project: HBase
>  Issue Type: Bug
>Reporter: ChiaPing Tsai
>Assignee: ChiaPing Tsai
>Priority: Trivial
> Attachments: HBASE-16157-v1.patch, HBASE-16157-v2.patch, 
> HBASE-16157-v3.patch, HBASE-16157-v4.patch
>
>
> {code:title=LruBlockCache.java|borderStyle=solid}
> // Check return value from the Map#remove before updating the metrics
>   protected long evictBlock(LruCachedBlock block, boolean evictedByEvictionProcess) {
>     map.remove(block.getCacheKey());
>     updateSizeMetrics(block, true);
>     ...
>   }
> {code}





[jira] [Commented] (HBASE-14743) Add metrics around HeapMemoryManager

2016-07-01 Thread Reid Chan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15360004#comment-15360004
 ] 

Reid Chan commented on HBASE-14743:
---

The previous run failed because of 
"TEST-org.apache.hadoop.hbase.TestAcidGuarantees.xml"; 
this run failed because of 
"TEST-org.apache.hadoop.hbase.client.TestHCM.xml".

...I don't understand.



> Add metrics around HeapMemoryManager
> 
>
> Key: HBASE-14743
> URL: https://issues.apache.org/jira/browse/HBASE-14743
> Project: HBase
>  Issue Type: Improvement
>Reporter: Elliott Clark
>Assignee: Reid Chan
>Priority: Minor
> Attachments: HBASE-14743.009.patch, HBASE-14743.009.rw3.patch, 
> HBASE-14743.009.v2.patch, HBASE-14743.010.patch, HBASE-14743.010.v2.patch, 
> HBASE-14743.011.patch, Metrics snapshot 2016-6-30.png, Screen Shot 2016-06-16 
> at 5.39.13 PM.png, test2_1.png, test2_2.png, test2_3.png, test2_4.png
>
>
> it would be good to know how many invocations there have been.
> How many decided to expand memstore.
> How many decided to expand block cache.
> How many decided to do nothing.
> etc.
> When that's done use those metrics to clean up the tests.
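
The counters asked for above could be sketched roughly as follows. This is only an illustration under assumptions: TunerDecisionCounters, the Decision enum and the method names are hypothetical and are not taken from the attached patches or from HBase's metrics classes.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical counters for heap-tuning decisions; not the metrics class from the patch. */
class TunerDecisionCounters {
  /** The outcomes listed in the issue description. */
  enum Decision { EXPAND_MEMSTORE, EXPAND_BLOCK_CACHE, DO_NOTHING }

  private final LongAdder invocations = new LongAdder();
  private final Map<Decision, LongAdder> byDecision = new EnumMap<>(Decision.class);

  TunerDecisionCounters() {
    // Pre-populate so the map is never modified after construction.
    for (Decision d : Decision.values()) {
      byDecision.put(d, new LongAdder());
    }
  }

  /** Records one tuner invocation and the decision it reached. */
  void record(Decision decision) {
    invocations.increment();
    byDecision.get(decision).increment();
  }

  long invocations() {
    return invocations.sum();
  }

  long count(Decision decision) {
    return byDecision.get(decision).sum();
  }
}
```

A test could then assert on count(Decision.EXPAND_MEMSTORE) and similar values instead of inferring the decision indirectly.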





[jira] [Updated] (HBASE-16168) Split cache usage into tables

2016-07-01 Thread darion yaphet (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

darion yaphet updated HBASE-16168:
--
Description: Currently, all tables on a region server share one set of cache 
usage statistics, so it is hard to tell how much cache memory each table uses. 
I think we should split the cache usage statistics by table, which would make 
each table's share of the cache easy to see.

> Split cache usage into tables
> -
>
> Key: HBASE-16168
> URL: https://issues.apache.org/jira/browse/HBASE-16168
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 1.1.1, 0.98.20
>Reporter: darion yaphet
>
> Currently, all tables on a region server share one set of cache usage 
> statistics, so it is hard to tell how much cache memory each table uses. I 
> think we should split the cache usage statistics by table, which would make 
> each table's share of the cache easy to see.
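
One way to picture the proposal is a per-table byte counter kept alongside the cache. The sketch below is an assumption-laden illustration: PerTableCacheUsage and its methods are hypothetical names, and nothing here reflects how the HBase block cache actually tracks usage.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical per-table cache accounting; not an existing HBase class. */
class PerTableCacheUsage {
  private final ConcurrentMap<String, LongAdder> bytesByTable = new ConcurrentHashMap<>();
  private final LongAdder totalBytes = new LongAdder();

  void onBlockCached(String table, long bytes) {
    adjust(table, bytes);
  }

  void onBlockEvicted(String table, long bytes) {
    adjust(table, -bytes);
  }

  private void adjust(String table, long delta) {
    bytesByTable.computeIfAbsent(table, t -> new LongAdder()).add(delta);
    totalBytes.add(delta);
  }

  /** Bytes currently attributed to the table, or 0 if it was never seen. */
  long bytesFor(String table) {
    LongAdder adder = bytesByTable.get(table);
    return adder == null ? 0 : adder.sum();
  }

  /** Fraction of all cached bytes that belong to the table, or 0 when the cache is empty. */
  double proportion(String table) {
    long total = totalBytes.sum();
    return total == 0 ? 0.0 : (double) bytesFor(table) / total;
  }
}
```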





[jira] [Updated] (HBASE-16170) Phoenix4.4-HBase1.1 Unit Test hanging for infinite time.

2016-07-01 Thread sonali shrivastava (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sonali shrivastava updated HBASE-16170:
---
Description: 
Hello HBase Team,

I am seeing a hang while building "Phoenix4.4-HBase1.1", which depends on 
HBase, on RHEL 7.2 ppc64le.

My environment is an IOP setup with OpenJDK 1.8 and Maven 3.3.9, with HBase 
1.1.1 installed and the IOP Hadoop services and Ambari running on it.

When I build "Phoenix4.4-HBase1.1", the build hangs without any error logs, 
and the hang occurs at a different point each time.
In the first build, it hung indefinitely after the following lines:
Running org.apache.hadoop.hbase.regionserver.PhoenixRpcSchedulerFactoryTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.017 sec - in 
org.apache.hadoop.hbase.regionserver.PhoenixRpcSchedulerFactoryTest

In the second build, the hang occurred at a different point. We suspect an 
HBase issue.

Team: Could you please help us with this and let us know the reason for the 
hang? Thank you.

Thanks & Regards,
Sonali Shrivastava



  was:

Hello Thrift Team,

I am seeing a hang while building "Phoenix4.4-HBase1.1", which depends on 
HBase, on RHEL 7.2 ppc64le.

My environment is an IOP setup with OpenJDK 1.8 and Maven 3.3.9, with HBase 
1.1.1 installed and the IOP Hadoop services and Ambari running on it.

When I build "Phoenix4.4-HBase1.1", the build hangs without any error logs, 
and the hang occurs at a different point each time.
In the first build, it hung indefinitely after the following lines:
Running org.apache.hadoop.hbase.regionserver.PhoenixRpcSchedulerFactoryTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.017 sec - in 
org.apache.hadoop.hbase.regionserver.PhoenixRpcSchedulerFactoryTest

In the second build, the hang occurred at a different point. We suspect an 
HBase issue.

Team: Could you please help us with this and let us know the reason for the 
hang? Thank you.

Thanks & Regards,
Sonali Shrivastava




> Phoenix4.4-HBase1.1 Unit Test hanging for infinite time.
> 
>
> Key: HBASE-16170
> URL: https://issues.apache.org/jira/browse/HBASE-16170
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.1.0, 1.1.1
> Environment: open jdk 1.8, maven 3.3.9,IOP setup with HBase 1.1.1 
> installed and service running
>Reporter: sonali shrivastava
>
> Hello HBase Team,
> I am seeing a hang while building "Phoenix4.4-HBase1.1", which depends on 
> HBase, on RHEL 7.2 ppc64le.
> My environment is an IOP setup with OpenJDK 1.8 and Maven 3.3.9, with HBase 
> 1.1.1 installed and the IOP Hadoop services and Ambari running on it.
> When I build "Phoenix4.4-HBase1.1", the build hangs without any error logs, 
> and the hang occurs at a different point each time.
> In the first build, it hung indefinitely after the following lines:
> Running org.apache.hadoop.hbase.regionserver.PhoenixRpcSchedulerFactoryTest
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.017 sec - 
> in org.apache.hadoop.hbase.regionserver.PhoenixRpcSchedulerFactoryTest
> In the second build, the hang occurred at a different point. We suspect an 
> HBase issue.
> Team: Could you please help us with this and let us know the reason for the 
> hang? Thank you.
> Thanks & Regards,
> Sonali Shrivastava





[jira] [Created] (HBASE-16170) Phoenix4.4-HBase1.1 Unit Test hanging for infinite time.

2016-07-01 Thread sonali shrivastava (JIRA)
sonali shrivastava created HBASE-16170:
--

 Summary: Phoenix4.4-HBase1.1 Unit Test hanging for infinite time.
 Key: HBASE-16170
 URL: https://issues.apache.org/jira/browse/HBASE-16170
 Project: HBase
  Issue Type: Bug
  Components: hbase
Affects Versions: 1.1.1, 1.1.0
 Environment: open jdk 1.8, maven 3.3.9,IOP setup with HBase 1.1.1 
installed and service running
Reporter: sonali shrivastava



Hello Thrift Team,

I am seeing a hang while building "Phoenix4.4-HBase1.1", which depends on 
HBase, on RHEL 7.2 ppc64le.

My environment is an IOP setup with OpenJDK 1.8 and Maven 3.3.9, with HBase 
1.1.1 installed and the IOP Hadoop services and Ambari running on it.

When I build "Phoenix4.4-HBase1.1", the build hangs without any error logs, 
and the hang occurs at a different point each time.
In the first build, it hung indefinitely after the following lines:
Running org.apache.hadoop.hbase.regionserver.PhoenixRpcSchedulerFactoryTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.017 sec - in 
org.apache.hadoop.hbase.regionserver.PhoenixRpcSchedulerFactoryTest

In the second build, the hang occurred at a different point. We suspect an 
HBase issue.

Team: Could you please help us with this and let us know the reason for the 
hang? Thank you.

Thanks & Regards,
Sonali Shrivastava







[jira] [Commented] (HBASE-14743) Add metrics around HeapMemoryManager

2016-07-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359996#comment-15359996
 ] 

Hadoop QA commented on HBASE-14743:
---

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| 0 | mvndep | 0m 11s | Maven dependency ordering for branch |
| +1 | mvninstall | 3m 1s | master passed |
| +1 | compile | 1m 12s | master passed with JDK v1.8.0 |
| +1 | compile | 0m 52s | master passed with JDK v1.7.0_80 |
| +1 | checkstyle | 0m 28s | master passed |
| +1 | mvneclipse | 0m 33s | master passed |
| +1 | findbugs | 2m 39s | master passed |
| +1 | javadoc | 1m 6s | master passed with JDK v1.8.0 |
| +1 | javadoc | 0m 54s | master passed with JDK v1.7.0_80 |
| 0 | mvndep | 0m 10s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 6s | the patch passed |
| +1 | compile | 1m 18s | the patch passed with JDK v1.8.0 |
| +1 | javac | 1m 18s | the patch passed |
| +1 | compile | 0m 55s | the patch passed with JDK v1.7.0_80 |
| +1 | javac | 0m 55s | the patch passed |
| +1 | checkstyle | 0m 28s | the patch passed |
| +1 | mvneclipse | 0m 34s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | hadoopcheck | 26m 36s | Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. |
| +1 | findbugs | 3m 20s | the patch passed |
| +1 | javadoc | 1m 10s | the patch passed with JDK v1.8.0 |
| +1 | javadoc | 0m 57s | the patch passed with JDK v1.7.0_80 |
| +1 | unit | 0m 15s | hbase-hadoop-compat in the patch passed. |
| +1 | unit | 0m 21s | hbase-hadoop2-compat in the patch passed. |
| -1 | unit | 97m 51s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 43s | Patch does not generate ASF License warnings. |
| | | 147m 39s | |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12815851/HBASE-14743.010.v2.patch |
| JIRA Issue | HBASE-14743 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 

[jira] [Commented] (HBASE-16166) Create a memstore impl that does not compact but creates pipelines and flushes all segments from pipeline

2016-07-01 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359993#comment-15359993
 ] 

stack commented on HBASE-16166:
---

Are you fellows finding that Segments get in the way in the default case? What 
are you seeing in terms of friction when the segments pipeline is running? 
(I've not tried it.) Thanks.

> Create a memstore impl that does not compact but creates pipelines and 
> flushes all segments from pipeline
> -
>
> Key: HBASE-16166
> URL: https://issues.apache.org/jira/browse/HBASE-16166
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0
>
>
> A new memstore impl for the default cases, where we could create pipelines 
> and then directly flush all the segments in pipelines. 





[jira] [Commented] (HBASE-16149) Log the underlying RPC exception in RpcRetryingCallerImpl

2016-07-01 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359988#comment-15359988
 ] 

stack commented on HBASE-16149:
---

Looks like you applied this to branch-1 [~jerryhe]? Correct me if I have it 
wrong. Thanks boss.

> Log the underlying RPC exception in RpcRetryingCallerImpl 
> --
>
> Key: HBASE-16149
> URL: https://issues.apache.org/jira/browse/HBASE-16149
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.2.0
>Reporter: Jerry He
>Assignee: Jerry He
>Priority: Minor
> Fix For: 2.0.0, 1.3.0, 1.4.0
>
> Attachments: HBASE-16149-branch-1.patch, HBASE-16149.patch
>
>
> In RpcRetryingCallerImpl:
> {code}
>   public T callWithRetries(RetryingCallable callable, int callTimeout)
>   throws IOException, RuntimeException {
>     ...
>     for (int tries = 0;; tries++) {
>       try {
>         ...
>         return callable.call(getTimeout(callTimeout));
>         ...
>       } catch (Throwable t) {
>         ExceptionUtil.rethrowIfInterrupt(t);
>         if (tries > startLogErrorsCnt) {
>           LOG.info("Call exception, tries=" + tries + ", maxAttempts=" + maxAttempts + ", started="
>               + (EnvironmentEdgeManager.currentTime() - tracker.getStartTime()) + " ms ago, "
>               + "cancelled=" + cancelled.get() + ", msg="
>               + callable.getExceptionMessageAdditionalDetail());
>         }
>         ...
> {code}
> We log the callable.getExceptionMessageAdditionalDetail() message, but it 
> may not include the underlying cause.
> For example, in AbstractRegionServerCallable:
> {code}
>   public String getExceptionMessageAdditionalDetail() {
>     return "row '" + Bytes.toString(row) + "' on table '" + tableName + "' at " + location;
>   }
> {code}
> Let's add the underlying exception cause to the message as well.
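
A minimal sketch of what adding the exception to the message could look like is shown below. RetryLogMessage and format are hypothetical names, the root-cause walk is an illustrative extra rather than something the issue asks for, and this is not the attached patch.

```java
/** Hypothetical helper that builds a retry log line; not part of RpcRetryingCallerImpl. */
class RetryLogMessage {
  /** Builds a log line that carries both the callable detail and the caught exception. */
  static String format(int tries, int maxAttempts, String additionalDetail, Throwable t) {
    // Walk towards the root cause so a wrapped exception is still visible.
    // The depth bound guards against a cyclic cause chain.
    Throwable root = t;
    for (int depth = 0; root.getCause() != null && depth < 16; depth++) {
      root = root.getCause();
    }
    return "Call exception, tries=" + tries + ", maxAttempts=" + maxAttempts
        + ", msg=" + additionalDetail
        + ", exception=" + t
        + (root != t ? ", rootCause=" + root : "");
  }
}
```

Because Throwable#toString prints the class name and message, the log line names the failure type even when getExceptionMessageAdditionalDetail() only describes the row, table and location.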





[jira] [Commented] (HBASE-16168) Split cache usage into tables

2016-07-01 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359984#comment-15359984
 ] 

stack commented on HBASE-16168:
---

Please say more about what this is about, [~darion]. Thanks.

> Split cache usage into tables
> -
>
> Key: HBASE-16168
> URL: https://issues.apache.org/jira/browse/HBASE-16168
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 1.1.1, 0.98.20
>Reporter: darion yaphet
>






[jira] [Commented] (HBASE-16157) The incorrect block cache count and size are caused by removing duplicate block key in the LruBlockCache

2016-07-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359978#comment-15359978
 ] 

Hadoop QA commented on HBASE-16157:
---

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 2m 52s | master passed |
| +1 | compile | 0m 37s | master passed with JDK v1.8.0 |
| +1 | compile | 0m 33s | master passed with JDK v1.7.0_80 |
| +1 | checkstyle | 0m 53s | master passed |
| +1 | mvneclipse | 0m 15s | master passed |
| +1 | findbugs | 1m 57s | master passed |
| +1 | javadoc | 0m 27s | master passed with JDK v1.8.0 |
| +1 | javadoc | 0m 33s | master passed with JDK v1.7.0_80 |
| +1 | mvninstall | 0m 44s | the patch passed |
| +1 | compile | 0m 38s | the patch passed with JDK v1.8.0 |
| +1 | javac | 0m 38s | the patch passed |
| +1 | compile | 0m 34s | the patch passed with JDK v1.7.0_80 |
| +1 | javac | 0m 34s | the patch passed |
| +1 | checkstyle | 0m 52s | the patch passed |
| +1 | mvneclipse | 0m 15s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | hadoopcheck | 26m 1s | Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. |
| +1 | findbugs | 2m 13s | the patch passed |
| +1 | javadoc | 0m 26s | the patch passed with JDK v1.8.0 |
| +1 | javadoc | 0m 41s | the patch passed with JDK v1.7.0_80 |
| -1 | unit | 92m 5s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 18s | Patch does not generate ASF License warnings. |
| | | 133m 22s | |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12815808/HBASE-16157-v4.patch |
| JIRA Issue | HBASE-16157 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 561eb82 |
| Default Java | 1.7.0_80 |
| Multi-JDK versions | /home/jenkins/tools/java/jdk1.8.0:1.8.0 /home/jenkins/jenkins-slave/tools/hudson.model.JDK/JDK_1.7_latest_:1.7.0_80 |
| findbugs | v3.0.0 |
| unit | 

[jira] [Comment Edited] (HBASE-16132) Scan does not return all the result when regionserver is busy

2016-07-01 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358843#comment-15358843
 ] 

Yu Li edited comment on HBASE-16132 at 7/2/16 4:26 AM:
---

From the current code it seems to me that RpcRetryingCallerWithReadReplicas.call() 
also uses {{ResultBoundedCompletionService#poll}}; mind double-checking or 
explaining further? Thanks.


was (Author: carp84):
From the current code it seems to me that RpcRetryingCallerWithReadReplicas.call() 
also uses {{ResultBoundedCompletionService#call}}; mind double-checking or 
explaining further? Thanks.

> Scan does not return all the result when regionserver is busy
> -
>
> Key: HBASE-16132
> URL: https://issues.apache.org/jira/browse/HBASE-16132
> Project: HBase
>  Issue Type: Bug
>Reporter: binlijin
>Assignee: binlijin
> Attachments: HBASE-16132.patch, HBASE-16132_v2.patch, 
> HBASE-16132_v3.patch, HBASE-16132_v3.patch, TestScanMissingData.java
>
>
> We have found a corner case that occurs when a regionserver stays busy for a 
> long time: some scanners may return null even though they have not scanned 
> all the data.
> In ScannerCallableWithReplicas there is a case that is not handled 
> correctly: when cs.poll times out without returning any result, the call 
> returns null, so the scan gets a null result and ends.
>  {code}
> try {
>   Future<Pair<Result[], ScannerCallable>> f = cs.poll(timeout, TimeUnit.MILLISECONDS);
>   if (f != null) {
>     Pair<Result[], ScannerCallable> r = f.get(timeout, TimeUnit.MILLISECONDS);
>     if (r != null && r.getSecond() != null) {
>       updateCurrentlyServingReplica(r.getSecond(), r.getFirst(), done, pool);
>     }
>     return r == null ? null : r.getFirst(); // great we got an answer
>   }
> } catch (ExecutionException e) {
>   RpcRetryingCallerWithReadReplicas.throwEnrichedException(e, retries);
> } catch (CancellationException e) {
>   throw new InterruptedIOException(e.getMessage());
> } catch (InterruptedException e) {
>   throw new InterruptedIOException(e.getMessage());
> } catch (TimeoutException e) {
>   throw new InterruptedIOException(e.getMessage());
> } finally {
>   // We get there because we were interrupted or because one or more of the
>   // calls succeeded or failed. In all case, we stop all our tasks.
>   cs.cancelAll();
> }
> return null; // unreachable
>  {code}
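
One possible way to avoid the silent null, sketched under assumptions: treat a null from poll as a timeout and raise an exception instead of falling through to "return null". The sketch uses the JDK's ExecutorCompletionService as a stand-in for HBase's ResultBoundedCompletionService, PollTimeoutSketch and callWithTimeout are hypothetical names, and this is not necessarily what the attached patches do.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper; the JDK completion service stands in for the HBase one. */
class PollTimeoutSketch {
  /** Runs the task and waits at most timeoutMs; a timeout is an error, never a null result. */
  static <T> T callWithTimeout(Callable<T> task, long timeoutMs) throws IOException {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      CompletionService<T> cs = new ExecutorCompletionService<>(pool);
      cs.submit(task);
      Future<T> f = cs.poll(timeoutMs, TimeUnit.MILLISECONDS);
      if (f == null) {
        // poll() returns null when nothing completed in time; surface that explicitly.
        throw new InterruptedIOException("No result within " + timeoutMs + " ms");
      }
      return f.get(); // the future is already complete, so this does not block
    } catch (ExecutionException e) {
      throw new IOException(e.getCause());
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new InterruptedIOException(e.getMessage());
    } finally {
      pool.shutdownNow(); // stop anything still running
    }
  }
}
```

With this shape a caller can tell "the scan finished" apart from "the server did not answer in time", which the original null return does not allow.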





[jira] [Commented] (HBASE-16132) Scan does not return all the result when regionserver is busy

2016-07-01 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359972#comment-15359972
 ] 

Yu Li commented on HBASE-16132:
---

Thanks for further clarification, got your point now.

So in both ScannerCallableWithReplicas and RpcRetryingCallerWithReadReplicas it 
calls {{Future.get}}, and the main difference is that 
RpcRetryingCallerWithReadReplicas calls {{cs.take}} instead of {{cs.poll}} for 
the second replica, which means we will dead-wait on the second one if the 
first replica timed out. Since RpcRetryingCallerWithReadReplicas is used by get 
and get is a special type of scan, I agree that it's better to follow the same 
way in ScannerCallableWithReplicas.

Let me push this patch in first (to solve the problem) and open another JIRA 
about your proposal, sir (a new JIRA will be more visible, so we can better see 
others' thoughts :-)). Thanks for pointing this out, [~devaraj].
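
The difference between {{cs.take}} and {{cs.poll}} mentioned above can be seen in a small sketch. It uses the JDK's ExecutorCompletionService rather than HBase's ResultBoundedCompletionService, and TakeVersusPoll and demo are hypothetical names chosen for illustration.

```java
import java.util.concurrent.CompletionService;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo of poll-with-timeout versus take on a completion service. */
class TakeVersusPoll {
  static String demo() throws InterruptedException, ExecutionException {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      CompletionService<String> cs = new ExecutorCompletionService<>(pool);
      CountDownLatch release = new CountDownLatch(1);
      // The task cannot finish until the latch is released.
      cs.submit(() -> {
        release.await();
        return "replica-result";
      });
      // poll(timeout) gives up after the timeout and signals that with null.
      Future<String> polled = cs.poll(20, TimeUnit.MILLISECONDS);
      release.countDown();
      // take() has no timeout: it blocks until some task completes.
      Future<String> taken = cs.take();
      return (polled == null ? "poll:null" : "poll:done") + ",take:" + taken.get();
    } finally {
      pool.shutdownNow();
    }
  }
}
```

Here demo() returns "poll:null,take:replica-result": the timed poll comes back empty while the task is still blocked, and take waits for as long as the task needs.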

> Scan does not return all the result when regionserver is busy
> -
>
> Key: HBASE-16132
> URL: https://issues.apache.org/jira/browse/HBASE-16132
> Project: HBase
>  Issue Type: Bug
>Reporter: binlijin
>Assignee: binlijin
> Attachments: HBASE-16132.patch, HBASE-16132_v2.patch, 
> HBASE-16132_v3.patch, HBASE-16132_v3.patch, TestScanMissingData.java
>
>
> We have found a corner case that occurs when a regionserver stays busy for a 
> long time: some scanners may return null even though they have not scanned 
> all the data.
> In ScannerCallableWithReplicas there is a case that is not handled 
> correctly: when cs.poll times out without returning any result, the call 
> returns null, so the scan gets a null result and ends.
>  {code}
> try {
>   Future<Pair<Result[], ScannerCallable>> f = cs.poll(timeout, TimeUnit.MILLISECONDS);
>   if (f != null) {
>     Pair<Result[], ScannerCallable> r = f.get(timeout, TimeUnit.MILLISECONDS);
>     if (r != null && r.getSecond() != null) {
>       updateCurrentlyServingReplica(r.getSecond(), r.getFirst(), done, pool);
>     }
>     return r == null ? null : r.getFirst(); // great we got an answer
>   }
> } catch (ExecutionException e) {
>   RpcRetryingCallerWithReadReplicas.throwEnrichedException(e, retries);
> } catch (CancellationException e) {
>   throw new InterruptedIOException(e.getMessage());
> } catch (InterruptedException e) {
>   throw new InterruptedIOException(e.getMessage());
> } catch (TimeoutException e) {
>   throw new InterruptedIOException(e.getMessage());
> } finally {
>   // We get there because we were interrupted or because one or more of the
>   // calls succeeded or failed. In all case, we stop all our tasks.
>   cs.cancelAll();
> }
> return null; // unreachable
>  {code}





[jira] [Commented] (HBASE-16159) OutOfMemory exception when using AsyncRpcClient with encryption to read rpc response

2016-07-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359944#comment-15359944
 ] 

Hudson commented on HBASE-16159:


SUCCESS: Integrated in HBase-Trunk_matrix #1152 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/1152/])
HBASE-16159 OutOfMemory exception when using AsyncRpcClient with (tedyu: rev 
af9422c04a343730d9ed6f3cf46cd6fc9d77c04a)
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/security/SaslClientHandler.java


> OutOfMemory exception when using AsyncRpcClient with encryption to read rpc 
> response
> 
>
> Key: HBASE-16159
> URL: https://issues.apache.org/jira/browse/HBASE-16159
> Project: HBase
>  Issue Type: Bug
>Reporter: Colin Ma
>Assignee: Colin Ma
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16159.001.patch
>
>
> Running Get against an AsyncRpcClient with encryption enabled in an infinite 
> loop results in an OOM exception. The root cause is the same as HBASE-16054.





[jira] [Updated] (HBASE-14743) Add metrics around HeapMemoryManager

2016-07-01 Thread Reid Chan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reid Chan updated HBASE-14743:
--
Attachment: HBASE-14743.010.v2.patch

try again

> Add metrics around HeapMemoryManager
> 
>
> Key: HBASE-14743
> URL: https://issues.apache.org/jira/browse/HBASE-14743
> Project: HBase
>  Issue Type: Improvement
>Reporter: Elliott Clark
>Assignee: Reid Chan
>Priority: Minor
> Attachments: HBASE-14743.009.patch, HBASE-14743.009.rw3.patch, 
> HBASE-14743.009.v2.patch, HBASE-14743.010.patch, HBASE-14743.010.v2.patch, 
> HBASE-14743.011.patch, Metrics snapshot 2016-6-30.png, Screen Shot 2016-06-16 
> at 5.39.13 PM.png, test2_1.png, test2_2.png, test2_3.png, test2_4.png
>
>
> it would be good to know how many invocations there have been.
> How many decided to expand memstore.
> How many decided to expand block cache.
> How many decided to do nothing.
> etc.
> When that's done use those metrics to clean up the tests.





[jira] [Updated] (HBASE-14743) Add metrics around HeapMemoryManager

2016-07-01 Thread Reid Chan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reid Chan updated HBASE-14743:
--
Attachment: (was: HBASE-14743.010.v2.patch)

> Add metrics around HeapMemoryManager
> 
>
> Key: HBASE-14743
> URL: https://issues.apache.org/jira/browse/HBASE-14743
> Project: HBase
>  Issue Type: Improvement
>Reporter: Elliott Clark
>Assignee: Reid Chan
>Priority: Minor
> Attachments: HBASE-14743.009.patch, HBASE-14743.009.rw3.patch, 
> HBASE-14743.009.v2.patch, HBASE-14743.010.patch, HBASE-14743.010.v2.patch, 
> HBASE-14743.011.patch, Metrics snapshot 2016-6-30.png, Screen Shot 2016-06-16 
> at 5.39.13 PM.png, test2_1.png, test2_2.png, test2_3.png, test2_4.png
>
>
> it would be good to know how many invocations there have been.
> How many decided to expand memstore.
> How many decided to expand block cache.
> How many decided to do nothing.
> etc.
> When that's done use those metrics to clean up the tests.





[jira] [Updated] (HBASE-14743) Add metrics around HeapMemoryManager

2016-07-01 Thread Reid Chan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reid Chan updated HBASE-14743:
--
Attachment: (was: HBASE-14743.008.patch)

> Add metrics around HeapMemoryManager
> 
>
> Key: HBASE-14743
> URL: https://issues.apache.org/jira/browse/HBASE-14743
> Project: HBase
>  Issue Type: Improvement
>Reporter: Elliott Clark
>Assignee: Reid Chan
>Priority: Minor
> Attachments: HBASE-14743.009.patch, HBASE-14743.009.rw3.patch, 
> HBASE-14743.009.v2.patch, HBASE-14743.010.patch, HBASE-14743.010.v2.patch, 
> HBASE-14743.011.patch, Metrics snapshot 2016-6-30.png, Screen Shot 2016-06-16 
> at 5.39.13 PM.png, test2_1.png, test2_2.png, test2_3.png, test2_4.png
>
>
> it would be good to know how many invocations there have been.
> How many decided to expand memstore.
> How many decided to expand block cache.
> How many decided to do nothing.
> etc.
> When that's done use those metrics to clean up the tests.





[jira] [Updated] (HBASE-14743) Add metrics around HeapMemoryManager

2016-07-01 Thread Reid Chan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reid Chan updated HBASE-14743:
--
Attachment: (was: HBASE-14743.003.patch)

> Add metrics around HeapMemoryManager
> 
>
> Key: HBASE-14743
> URL: https://issues.apache.org/jira/browse/HBASE-14743
> Project: HBase
>  Issue Type: Improvement
>Reporter: Elliott Clark
>Assignee: Reid Chan
>Priority: Minor
> Attachments: HBASE-14743.009.patch, HBASE-14743.009.rw3.patch, 
> HBASE-14743.009.v2.patch, HBASE-14743.010.patch, HBASE-14743.010.v2.patch, 
> HBASE-14743.011.patch, Metrics snapshot 2016-6-30.png, Screen Shot 2016-06-16 
> at 5.39.13 PM.png, test2_1.png, test2_2.png, test2_3.png, test2_4.png
>
>
> it would be good to know how many invocations there have been.
> How many decided to expand memstore.
> How many decided to expand block cache.
> How many decided to do nothing.
> etc.
> When that's done use those metrics to clean up the tests.





[jira] [Updated] (HBASE-14743) Add metrics around HeapMemoryManager

2016-07-01 Thread Reid Chan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reid Chan updated HBASE-14743:
--
Attachment: (was: HBASE-14743.002.patch)

> Add metrics around HeapMemoryManager
> 
>
> Key: HBASE-14743
> URL: https://issues.apache.org/jira/browse/HBASE-14743
> Project: HBase
>  Issue Type: Improvement
>Reporter: Elliott Clark
>Assignee: Reid Chan
>Priority: Minor
> Attachments: HBASE-14743.009.patch, HBASE-14743.009.rw3.patch, 
> HBASE-14743.009.v2.patch, HBASE-14743.010.patch, HBASE-14743.010.v2.patch, 
> HBASE-14743.011.patch, Metrics snapshot 2016-6-30.png, Screen Shot 2016-06-16 
> at 5.39.13 PM.png, test2_1.png, test2_2.png, test2_3.png, test2_4.png
>
>
> it would be good to know how many invocations there have been.
> How many decided to expand memstore.
> How many decided to expand block cache.
> How many decided to do nothing.
> etc.
> When that's done use those metrics to clean up the tests.





[jira] [Updated] (HBASE-14743) Add metrics around HeapMemoryManager

2016-07-01 Thread Reid Chan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reid Chan updated HBASE-14743:
--
Attachment: (was: HBASE-14743.005.patch)

> Add metrics around HeapMemoryManager
> 
>
> Key: HBASE-14743
> URL: https://issues.apache.org/jira/browse/HBASE-14743
> Project: HBase
>  Issue Type: Improvement
>Reporter: Elliott Clark
>Assignee: Reid Chan
>Priority: Minor
> Attachments: HBASE-14743.009.patch, HBASE-14743.009.rw3.patch, 
> HBASE-14743.009.v2.patch, HBASE-14743.010.patch, HBASE-14743.010.v2.patch, 
> HBASE-14743.011.patch, Metrics snapshot 2016-6-30.png, Screen Shot 2016-06-16 
> at 5.39.13 PM.png, test2_1.png, test2_2.png, test2_3.png, test2_4.png
>
>
> it would be good to know how many invocations there have been.
> How many decided to expand memstore.
> How many decided to expand block cache.
> How many decided to do nothing.
> etc.
> When that's done use those metrics to clean up the tests.





[jira] [Updated] (HBASE-14743) Add metrics around HeapMemoryManager

2016-07-01 Thread Reid Chan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reid Chan updated HBASE-14743:
--
Attachment: (was: HBASE-14743.001.patch)

> Add metrics around HeapMemoryManager
> 
>
> Key: HBASE-14743
> URL: https://issues.apache.org/jira/browse/HBASE-14743
> Project: HBase
>  Issue Type: Improvement
>Reporter: Elliott Clark
>Assignee: Reid Chan
>Priority: Minor
> Attachments: HBASE-14743.002.patch, HBASE-14743.003.patch, 
> HBASE-14743.004.patch, HBASE-14743.005.patch, HBASE-14743.006.patch, 
> HBASE-14743.007.patch, HBASE-14743.008.patch, HBASE-14743.009.patch, 
> HBASE-14743.009.rw3.patch, HBASE-14743.009.v2.patch, HBASE-14743.010.patch, 
> HBASE-14743.010.v2.patch, HBASE-14743.011.patch, Metrics snapshot 
> 2016-6-30.png, Screen Shot 2016-06-16 at 5.39.13 PM.png, test2_1.png, 
> test2_2.png, test2_3.png, test2_4.png
>
>
> it would be good to know how many invocations there have been.
> How many decided to expand memstore.
> How many decided to expand block cache.
> How many decided to do nothing.
> etc.
> When that's done use those metrics to clean up the tests.





[jira] [Updated] (HBASE-16157) The incorrect block cache count and size are caused by removing duplicate block key in the LruBlockCache

2016-07-01 Thread ChiaPing Tsai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ChiaPing Tsai updated HBASE-16157:
--
Assignee: ChiaPing Tsai
  Status: Patch Available  (was: Open)

> The incorrect block cache count and size are caused by removing duplicate 
> block key in the LruBlockCache
> 
>
> Key: HBASE-16157
> URL: https://issues.apache.org/jira/browse/HBASE-16157
> Project: HBase
>  Issue Type: Bug
>Reporter: ChiaPing Tsai
>Assignee: ChiaPing Tsai
>Priority: Trivial
> Attachments: HBASE-16157-v1.patch, HBASE-16157-v2.patch, 
> HBASE-16157-v3.patch, HBASE-16157-v4.patch
>
>
> {code:title=LruBlockCache.java|borderStyle=solid}
> // Check return value from the Map#remove before updating the metrics
> protected long evictBlock(LruCachedBlock block, boolean evictedByEvictionProcess) {
>   map.remove(block.getCacheKey());
>   updateSizeMetrics(block, true);
>   ...
> }
> {code}
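
The check that the comment in the snippet above asks for can be sketched as follows. This is a simplified, hypothetical stand-in (EvictionSketch, String keys, byte[] blocks), not the LruBlockCache class and not the attached patch; it only shows metrics being updated when Map#remove actually removed an entry.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified stand-in for a block cache; not the HBase class.
class EvictionSketch {
  private final ConcurrentHashMap<String, byte[]> map = new ConcurrentHashMap<>();
  private final AtomicLong size = new AtomicLong();
  private final AtomicLong count = new AtomicLong();

  void cacheBlock(String key, byte[] block) {
    // Only count the block if the key was not cached already.
    if (map.putIfAbsent(key, block) == null) {
      size.addAndGet(block.length);
      count.incrementAndGet();
    }
  }

  // Returns the number of bytes freed, or 0 if the key was already gone.
  long evictBlock(String key) {
    byte[] removed = map.remove(key);
    if (removed == null) {
      // Another caller removed this key first; leave the metrics untouched.
      return 0;
    }
    size.addAndGet(-removed.length);
    count.decrementAndGet();
    return removed.length;
  }

  long size() {
    return size.get();
  }

  long count() {
    return count.get();
  }

  public static void main(String[] args) {
    EvictionSketch cache = new EvictionSketch();
    cache.cacheBlock("block-1", new byte[64]);
    cache.evictBlock("block-1");
    cache.evictBlock("block-1"); // duplicate eviction must not change the metrics
    System.out.println(cache.count() + " blocks, " + cache.size() + " bytes");
  }
}
```

Without the null check, the second evictBlock call would subtract the block's size and count a second time, which is the kind of drift the issue title describes.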





[jira] [Commented] (HBASE-16159) OutOfMemory exception when using AsyncRpcClient with encryption to read rpc response

2016-07-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359929#comment-15359929
 ] 

Hudson commented on HBASE-16159:


SUCCESS: Integrated in HBase-1.4 #265 (See 
[https://builds.apache.org/job/HBase-1.4/265/])
HBASE-16159 OutOfMemory exception when using AsyncRpcClient with (tedyu: rev 
7121bc41e7555980924ecaf4dc595a402c130dd2)
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/security/SaslClientHandler.java


> OutOfMemory exception when using AsyncRpcClient with encryption to read rpc 
> response
> 
>
> Key: HBASE-16159
> URL: https://issues.apache.org/jira/browse/HBASE-16159
> Project: HBase
>  Issue Type: Bug
>Reporter: Colin Ma
>Assignee: Colin Ma
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16159.001.patch
>
>
> Running Get requests through AsyncRpcClient with encryption enabled in an 
> infinite loop eventually triggers an OOM exception. The root cause is the 
> same as HBASE-16054.





[jira] [Commented] (HBASE-16052) Improve HBaseFsck Scalability

2016-07-01 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359917#comment-15359917
 ] 

Ted Yu commented on HBASE-16052:


Ben, when you have a chance (after long weekend), can you backport to 0.98 
branch ?

Thanks

> Improve HBaseFsck Scalability
> -
>
> Key: HBASE-16052
> URL: https://issues.apache.org/jira/browse/HBASE-16052
> Project: HBase
>  Issue Type: Improvement
>  Components: hbck
>Reporter: Ben Lau
>Assignee: Ben Lau
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16052-master.patch, HBASE-16052-v3-branch-1.patch, 
> HBASE-16052-v3-master.patch
>
>
> There are some problems with HBaseFsck that make it unnecessarily slow 
> especially for large tables or clusters with many regions.  
> This patch tries to fix the biggest bottlenecks and also include a couple of 
> bug fixes for some of the race conditions caused by gathering and holding 
> state about a live cluster that is no longer true by the time you use that 
> state in Fsck processing.  These race conditions cause Fsck to crash and 
> become unusable on large clusters with lots of region splits/merges.
> Here are some scalability/performance problems in HBaseFsck and the changes 
> the patch makes:
> - Unnecessary I/O and RPCs caused by fetching an array of FileStatuses and 
> then discarding everything but the Paths, then passing the Paths to a 
> PathFilter, and then having the filter look up the (previously discarded) 
> FileStatuses of the paths again.  This is actually worse than double I/O 
> because the first lookup obtains a batch of FileStatuses while all the other 
> lookups are individual RPCs performed sequentially.
> -- Avoid this by adding a FileStatusFilter so that filtering can happen 
> directly on FileStatuses
> -- This performance bug affects more than Fsck, but also to some extent 
> things like snapshots, hfile archival, etc.  I didn't have time to look too 
> deep into other things affected and didn't want to increase the scope of this 
> ticket so I focus mostly on Fsck and make only a few improvements to other 
> codepaths.  The changes in this patch though should make it fairly easy to 
> fix other code paths in later jiras if we feel there are some other features 
> strongly impacted by this problem.  
> - OfflineReferenceFileRepair is the most expensive part of Fsck (often 50% of 
> Fsck runtime) and the running time scales with the number of store files, yet 
> the function is completely serial
> -- Make offlineReferenceFileRepair multithreaded
> - LoadHdfsRegionDirs() uses table-level concurrency, which is a big 
> bottleneck if you have 1 large cluster with 1 very large table that has 
> nearly all the regions
> -- Change loadHdfsRegionDirs() to region-level parallelism instead of 
> table-level parallelism for operations.
> The changes benefit all clusters but are especially noticeable for large 
> clusters with a few very large tables.  On our version of 0.98 with the 
> original patch we had a moderately sized production cluster with 2 (user) 
> tables and ~160k regions where HBaseFsck went from taking 18 min to 5 minutes.
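
To make the first bullet concrete: the idea is to filter on the status objects that the batch listing already returned, rather than reducing them to paths and looking each status up again. The sketch below uses a hypothetical Status class in place of Hadoop's FileStatus, and a plain java.util.function.Predicate in place of the FileStatusFilter the patch introduces, so it should be read as an illustration of the approach only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch; Status stands in for Hadoop's FileStatus.
class StatusFilterSketch {
  static final class Status {
    final String path;
    final boolean isDirectory;

    Status(String path, boolean isDirectory) {
      this.path = path;
      this.isDirectory = isDirectory;
    }
  }

  // Filters using metadata already present in the listing, so no
  // additional per-path lookup is needed.
  static List<String> filter(List<Status> listing, Predicate<Status> accept) {
    List<String> kept = new ArrayList<>();
    for (Status status : listing) {
      if (accept.test(status)) {
        kept.add(status.path);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    List<Status> listing = Arrays.asList(
        new Status("/hbase/data/t1/region-a", true),
        new Status("/hbase/data/t1/.tabledesc", true),
        new Status("/hbase/data/t1/stray-file", false));
    // Keep directories whose last path component does not start with a dot.
    Predicate<Status> regionDirs = s -> s.isDirectory
        && !s.path.substring(s.path.lastIndexOf('/') + 1).startsWith(".");
    System.out.println(filter(listing, regionDirs));
  }
}
```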





[jira] [Comment Edited] (HBASE-16052) Improve HBaseFsck Scalability

2016-07-01 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359917#comment-15359917
 ] 

Ted Yu edited comment on HBASE-16052 at 7/2/16 2:02 AM:


Ben, when you have a chance (after long weekend), can you attach backported 
patch for 0.98 branch ?

Thanks


was (Author: yuzhih...@gmail.com):
Ben, when you have a chance (after long weekend), can you backport to 0.98 
branch ?

Thanks

> Improve HBaseFsck Scalability
> -
>
> Key: HBASE-16052
> URL: https://issues.apache.org/jira/browse/HBASE-16052
> Project: HBase
>  Issue Type: Improvement
>  Components: hbck
>Reporter: Ben Lau
>Assignee: Ben Lau
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16052-master.patch, HBASE-16052-v3-branch-1.patch, 
> HBASE-16052-v3-master.patch
>
>
> There are some problems with HBaseFsck that make it unnecessarily slow 
> especially for large tables or clusters with many regions.  
> This patch tries to fix the biggest bottlenecks and also include a couple of 
> bug fixes for some of the race conditions caused by gathering and holding 
> state about a live cluster that is no longer true by the time you use that 
> state in Fsck processing.  These race conditions cause Fsck to crash and 
> become unusable on large clusters with lots of region splits/merges.
> Here are some scalability/performance problems in HBaseFsck and the changes 
> the patch makes:
> - Unnecessary I/O and RPCs caused by fetching an array of FileStatuses and 
> then discarding everything but the Paths, then passing the Paths to a 
> PathFilter, and then having the filter look up the (previously discarded) 
> FileStatuses of the paths again.  This is actually worse than double I/O 
> because the first lookup obtains a batch of FileStatuses while all the other 
> lookups are individual RPCs performed sequentially.
> -- Avoid this by adding a FileStatusFilter so that filtering can happen 
> directly on FileStatuses
> -- This performance bug affects more than Fsck, but also to some extent 
> things like snapshots, hfile archival, etc.  I didn't have time to look too 
> deep into other things affected and didn't want to increase the scope of this 
> ticket so I focus mostly on Fsck and make only a few improvements to other 
> codepaths.  The changes in this patch though should make it fairly easy to 
> fix other code paths in later jiras if we feel there are some other features 
> strongly impacted by this problem.  
> - OfflineReferenceFileRepair is the most expensive part of Fsck (often 50% of 
> Fsck runtime) and the running time scales with the number of store files, yet 
> the function is completely serial
> -- Make offlineReferenceFileRepair multithreaded
> - LoadHdfsRegionDirs() uses table-level concurrency, which is a big 
> bottleneck if you have 1 large cluster with 1 very large table that has 
> nearly all the regions
> -- Change loadHdfsRegionDirs() to region-level parallelism instead of 
> table-level parallelism for operations.
> The changes benefit all clusters but are especially noticeable for large 
> clusters with a few very large tables.  On our version of 0.98 with the 
> original patch we had a moderately sized production cluster with 2 (user) 
> tables and ~160k regions where HBaseFsck went from taking 18 min to 5 minutes.





[jira] [Updated] (HBASE-15532) core favored nodes enhancements

2016-07-01 Thread Thiruvel Thirumoolan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HBASE-15532:
-
Attachment: HBASE-15532.master.000.patch

Attaching a preliminary draft of our implementation. A few portions still need 
cleanup, after which I will open it up for review. Please treat this as a draft 
version.

> core favored nodes enhancements
> ---
>
> Key: HBASE-15532
> URL: https://issues.apache.org/jira/browse/HBASE-15532
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Francis Liu
>Assignee: Thiruvel Thirumoolan
> Attachments: HBASE-15532.master.000.patch
>
>






[jira] [Commented] (HBASE-15844) We should respect hfile.block.index.cacheonwrite when write intermediate index Block

2016-07-01 Thread Heng Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359885#comment-15359885
 ] 

Heng Chen commented on HBASE-15844:
---

Pushed to master and branch-1.

> We should respect hfile.block.index.cacheonwrite when write intermediate 
> index Block
> 
>
> Key: HBASE-15844
> URL: https://issues.apache.org/jira/browse/HBASE-15844
> Project: HBase
>  Issue Type: Bug
>Reporter: Heng Chen
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-15844.patch, HBASE-15844_v1.patch
>
>
> {code: title=BlockIndexWriter#writeIntermediateBlock}
>   if (cacheConf != null) {
>     HFileBlock blockForCaching = blockWriter.getBlockForCaching(cacheConf);
>     cacheConf.getBlockCache().cacheBlock(new BlockCacheKey(nameForCaching,
>       beginOffset, true, blockForCaching.getBlockType()), blockForCaching);
>   }
> {code}
> Should the if condition be the following instead?
> {code}
> if (cacheConf != null && cacheConf.shouldCacheIndexesOnWrite()) 
> {code} 





[jira] [Updated] (HBASE-15844) We should respect hfile.block.index.cacheonwrite when write intermediate index Block

2016-07-01 Thread Heng Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Heng Chen updated HBASE-15844:
--
Resolution: Fixed
  Assignee: Heng Chen
Status: Resolved  (was: Patch Available)

> We should respect hfile.block.index.cacheonwrite when write intermediate 
> index Block
> 
>
> Key: HBASE-15844
> URL: https://issues.apache.org/jira/browse/HBASE-15844
> Project: HBase
>  Issue Type: Bug
>Reporter: Heng Chen
>Assignee: Heng Chen
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-15844.patch, HBASE-15844_v1.patch
>
>
> {code: title=BlockIndexWriter#writeIntermediateBlock}
>   if (cacheConf != null) {
>     HFileBlock blockForCaching = blockWriter.getBlockForCaching(cacheConf);
>     cacheConf.getBlockCache().cacheBlock(new BlockCacheKey(nameForCaching,
>       beginOffset, true, blockForCaching.getBlockType()), blockForCaching);
>   }
> {code}
> Should the if condition be the following instead?
> {code}
> if (cacheConf != null && cacheConf.shouldCacheIndexesOnWrite()) 
> {code} 





[jira] [Updated] (HBASE-16165) Decrease RpcServer.callQueueSize before writeResponse causes OOM

2016-07-01 Thread Duo Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-16165:
--
Priority: Minor  (was: Major)

> Decrease RpcServer.callQueueSize before writeResponse causes OOM
> 
>
> Key: HBASE-16165
> URL: https://issues.apache.org/jira/browse/HBASE-16165
> Project: HBase
>  Issue Type: Bug
>Reporter: Duo Zhang
>Priority: Minor
>
> In RpcServer, we use {{callQueueSizeInBytes}} to avoid queuing too many calls, 
> which would cause OOM. But in {{CallRunner.run}}, we decrease it before sending 
> the response back. And even after calling {{sendResponseIfReady}}, the call 
> object could stay in our heap for a long time if we cannot write out the 
> response (that's why we need a Responder thread...). This makes it possible 
> for the actual size of all call objects in the heap to exceed 
> {{maxQueueSizeInBytes}} and cause OOM.
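
One direction the description points toward is to release a call's bytes from the counter only once its response has actually been written. The sketch below illustrates that accounting under stated assumptions: CallQueueAccountingSketch, tryAdmit and onResponseWritten are hypothetical names, and this is not the RpcServer code or a proposed patch.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of queue-size accounting; not the HBase RpcServer.
class CallQueueAccountingSketch {
  private final long maxQueueSizeInBytes;
  private final AtomicLong callQueueSizeInBytes = new AtomicLong();

  CallQueueAccountingSketch(long maxQueueSizeInBytes) {
    this.maxQueueSizeInBytes = maxQueueSizeInBytes;
  }

  // Reserves space for a call; returns false if the limit would be exceeded.
  boolean tryAdmit(long callSizeInBytes) {
    long updated = callQueueSizeInBytes.addAndGet(callSizeInBytes);
    if (updated > maxQueueSizeInBytes) {
      callQueueSizeInBytes.addAndGet(-callSizeInBytes); // roll back the reservation
      return false;
    }
    return true;
  }

  // Invoked after the response is fully written, not when the handler returns,
  // so the counter keeps covering calls that are still held in the heap.
  void onResponseWritten(long callSizeInBytes) {
    callQueueSizeInBytes.addAndGet(-callSizeInBytes);
  }

  long currentSizeInBytes() {
    return callQueueSizeInBytes.get();
  }

  public static void main(String[] args) {
    CallQueueAccountingSketch queue = new CallQueueAccountingSketch(100);
    System.out.println(queue.tryAdmit(80)); // fits under the limit
    System.out.println(queue.tryAdmit(30)); // rejected while the first call is pending
    queue.onResponseWritten(80);
    System.out.println(queue.tryAdmit(30)); // fits once the first response is written
  }
}
```

The trade-off is that a slow client can hold its reservation for longer, so new calls may be rejected earlier than they are with the current accounting.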





[jira] [Commented] (HBASE-16165) Decrease RpcServer.callQueueSize before writeResponse causes OOM

2016-07-01 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359877#comment-15359877
 ] 

Duo Zhang commented on HBASE-16165:
---

We hit this recently, but it only happens on our legacy 0.94 clusters. And we 
found that there is another bug in 0.94.

In 0.94, when we cannot write back the whole response on the first attempt, we 
attach the call to the channel's SelectionKey and never detach it. So if 
we have lots of connections whose selection key has a call attached, and 
the call's param field is large (this usually happens when replication is 
enabled), then we will run into OOM.

So for HBase 0.98+, I think this is only theoretical. It could only happen if a 
client keeps sending large put requests but never receives the responses. Let's 
lower the priority. :)

> Decrease RpcServer.callQueueSize before writeResponse causes OOM
> 
>
> Key: HBASE-16165
> URL: https://issues.apache.org/jira/browse/HBASE-16165
> Project: HBase
>  Issue Type: Bug
>Reporter: Duo Zhang
>
> In RpcServer, we use {{callQueueSizeInBytes}} to avoid queuing too many calls, 
> which would cause OOM. But in {{CallRunner.run}}, we decrease it before sending 
> the response back. And even after calling {{sendResponseIfReady}}, the call 
> object could stay in our heap for a long time if we cannot write out the 
> response (that's why we need a Responder thread...). This makes it possible 
> for the actual size of all call objects in the heap to exceed 
> {{maxQueueSizeInBytes}} and cause OOM.





[jira] [Commented] (HBASE-15945) Patch for Cell and CellImpl

2016-07-01 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359864#comment-15359864
 ] 

Enis Soztutar commented on HBASE-15945:
---

+1. [~eclark] wdyt? 

> Patch for Cell and CellImpl
> ---
>
> Key: HBASE-15945
> URL: https://issues.apache.org/jira/browse/HBASE-15945
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Sudeep Sunthankar
>Assignee: Sudeep Sunthankar
> Attachments: HBASE-15945-HBASE-14850.v2.patch, 
> HBASE-15945-HBASE-14850.v3.patch, HBASE-15945-HBASE-14850.v4.patch, 
> HBASE-15945.HBASE-14850.v1.patch, HBASE-15945.HBASE-14850.v5.patch
>
>
> This patch contains an implementation of KeyValue, Bytes and Cell modeled 
> along the lines of the Java implementation.  





[jira] [Updated] (HBASE-16052) Improve HBaseFsck Scalability

2016-07-01 Thread Ben Lau (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Lau updated HBASE-16052:

Release Note: 
HBASE-16052 improves the performance and scalability of HBaseFsck, especially 
for large clusters with a small number of large tables.  

Searching for lingering reference files is now a multi-threaded operation.  
Loading HDFS region directory information is now multi-threaded at the 
region-level instead of the table-level to maximize concurrency.  A performance 
bug in HBaseFsck that resulted in redundant I/O and RPCs was fixed by 
introducing a FileStatusFilter that filters FileStatus objects directly.  
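
The move from table-level to region-level parallelism can be sketched as below. The names are hypothetical (RegionLevelLoadSketch, loadRegionDir, a map from table name to region names) and the load step is a placeholder rather than real HDFS access; the point is only that one task is submitted per region, so a single very large table no longer ends up on a single thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; not the HBaseFsck implementation.
class RegionLevelLoadSketch {
  // Placeholder for loading one region directory's information.
  static String loadRegionDir(String table, String region) {
    return table + "/" + region;
  }

  static List<String> loadAll(Map<String, List<String>> regionsByTable, int threads) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      // One task per region rather than one task per table.
      List<Callable<String>> tasks = new ArrayList<>();
      for (Map.Entry<String, List<String>> entry : regionsByTable.entrySet()) {
        String table = entry.getKey();
        for (String region : entry.getValue()) {
          tasks.add(() -> loadRegionDir(table, region));
        }
      }
      // invokeAll returns the futures in the same order as the tasks.
      List<String> loaded = new ArrayList<>();
      for (Future<String> result : pool.invokeAll(tasks)) {
        loaded.add(result.get());
      }
      return loaded;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while loading region dirs", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("failed to load a region dir", e);
    } finally {
      pool.shutdown();
    }
  }
}
```

With table-level tasks, a cluster with two tables would use at most two threads regardless of pool size; with region-level tasks the pool stays busy as long as regions remain.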

> Improve HBaseFsck Scalability
> -
>
> Key: HBASE-16052
> URL: https://issues.apache.org/jira/browse/HBASE-16052
> Project: HBase
>  Issue Type: Improvement
>  Components: hbck
>Reporter: Ben Lau
>Assignee: Ben Lau
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16052-master.patch, HBASE-16052-v3-branch-1.patch, 
> HBASE-16052-v3-master.patch
>
>
> There are some problems with HBaseFsck that make it unnecessarily slow 
> especially for large tables or clusters with many regions.  
> This patch tries to fix the biggest bottlenecks and also include a couple of 
> bug fixes for some of the race conditions caused by gathering and holding 
> state about a live cluster that is no longer true by the time you use that 
> state in Fsck processing.  These race conditions cause Fsck to crash and 
> become unusable on large clusters with lots of region splits/merges.
> Here are some scalability/performance problems in HBaseFsck and the changes 
> the patch makes:
> - Unnecessary I/O and RPCs caused by fetching an array of FileStatuses and 
> then discarding everything but the Paths, then passing the Paths to a 
> PathFilter, and then having the filter look up the (previously discarded) 
> FileStatuses of the paths again.  This is actually worse than double I/O 
> because the first lookup obtains a batch of FileStatuses while all the other 
> lookups are individual RPCs performed sequentially.
> -- Avoid this by adding a FileStatusFilter so that filtering can happen 
> directly on FileStatuses
> -- This performance bug affects more than Fsck, but also to some extent 
> things like snapshots, hfile archival, etc.  I didn't have time to look too 
> deep into other things affected and didn't want to increase the scope of this 
> ticket so I focus mostly on Fsck and make only a few improvements to other 
> codepaths.  The changes in this patch though should make it fairly easy to 
> fix other code paths in later jiras if we feel there are some other features 
> strongly impacted by this problem.  
> - OfflineReferenceFileRepair is the most expensive part of Fsck (often 50% of 
> Fsck runtime) and the running time scales with the number of store files, yet 
> the function is completely serial
> -- Make offlineReferenceFileRepair multithreaded
> - LoadHdfsRegionDirs() uses table-level concurrency, which is a big 
> bottleneck if you have 1 large cluster with 1 very large table that has 
> nearly all the regions
> -- Change loadHdfsRegionDirs() to region-level parallelism instead of 
> table-level parallelism for operations.
> The changes benefit all clusters but are especially noticeable for large 
> clusters with a few very large tables.  On our version of 0.98 with the 
> original patch we had a moderately sized production cluster with 2 (user) 
> tables and ~160k regions where HBaseFsck went from taking 18 min to 5 minutes.





[jira] [Commented] (HBASE-16052) Improve HBaseFsck Scalability

2016-07-01 Thread Ben Lau (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359848#comment-15359848
 ] 

Ben Lau commented on HBASE-16052:
-

Okay let me know if you guys have a consensus for what other versions should be 
patched.  Re: 0.98-- yes we tested the original version of this patch in 0.98.  
However if we want to patch 0.98 it probably makes more sense to apply a 
backport of the trunk patch than our original 0.98 patch.  The trunk patch has 
some improvements that make the code cleaner on trunk but weren't that 
important on 0.98 originally (eg there are a lot more PathFilter classes in 
trunk so adding an abstract class to avoid too much code duplication became a 
no brainer).  To keep the code from diverging too much it would make sense to 
backport from trunk if we decide to patch 0.98.  I'll add a release note later 
based on the Jira description.  Feel free to expand/shorten it.

> Improve HBaseFsck Scalability
> -
>
> Key: HBASE-16052
> URL: https://issues.apache.org/jira/browse/HBASE-16052
> Project: HBase
>  Issue Type: Improvement
>  Components: hbck
>Reporter: Ben Lau
>Assignee: Ben Lau
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16052-master.patch, HBASE-16052-v3-branch-1.patch, 
> HBASE-16052-v3-master.patch
>
>
> There are some problems with HBaseFsck that make it unnecessarily slow 
> especially for large tables or clusters with many regions.  
> This patch tries to fix the biggest bottlenecks and also include a couple of 
> bug fixes for some of the race conditions caused by gathering and holding 
> state about a live cluster that is no longer true by the time you use that 
> state in Fsck processing.  These race conditions cause Fsck to crash and 
> become unusable on large clusters with lots of region splits/merges.
> Here are some scalability/performance problems in HBaseFsck and the changes 
> the patch makes:
> - Unnecessary I/O and RPCs caused by fetching an array of FileStatuses and 
> then discarding everything but the Paths, then passing the Paths to a 
> PathFilter, and then having the filter look up the (previously discarded) 
> FileStatuses of the paths again.  This is actually worse than double I/O 
> because the first lookup obtains a batch of FileStatuses while all the other 
> lookups are individual RPCs performed sequentially.
> -- Avoid this by adding a FileStatusFilter so that filtering can happen 
> directly on FileStatuses
> -- This performance bug affects more than Fsck, but also to some extent 
> things like snapshots, hfile archival, etc.  I didn't have time to look too 
> deep into other things affected and didn't want to increase the scope of this 
> ticket so I focus mostly on Fsck and make only a few improvements to other 
> codepaths.  The changes in this patch though should make it fairly easy to 
> fix other code paths in later jiras if we feel there are some other features 
> strongly impacted by this problem.  
> - OfflineReferenceFileRepair is the most expensive part of Fsck (often 50% of 
> Fsck runtime) and the running time scales with the number of store files, yet 
> the function is completely serial
> -- Make offlineReferenceFileRepair multithreaded
> - LoadHdfsRegionDirs() uses table-level concurrency, which is a big 
> bottleneck if you have 1 large cluster with 1 very large table that has 
> nearly all the regions
> -- Change loadHdfsRegionDirs() to region-level parallelism instead of 
> table-level parallelism for operations.
> The changes benefit all clusters but are especially noticeable for large 
> clusters with a few very large tables.  On our version of 0.98 with the 
> original patch we had a moderately sized production cluster with 2 (user) 
> tables and ~160k regions where HBaseFsck went from taking 18 min to 5 minutes.





[jira] [Commented] (HBASE-16052) Improve HBaseFsck Scalability

2016-07-01 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359832#comment-15359832
 ] 

Nick Dimiduk commented on HBASE-16052:
--

Sounds like an enhancement to me, not a bug (correctness) fix.

> Improve HBaseFsck Scalability
> -
>
> Key: HBASE-16052
> URL: https://issues.apache.org/jira/browse/HBASE-16052
> Project: HBase
>  Issue Type: Improvement
>  Components: hbck
>Reporter: Ben Lau
>Assignee: Ben Lau
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16052-master.patch, HBASE-16052-v3-branch-1.patch, 
> HBASE-16052-v3-master.patch
>
>
> There are some problems with HBaseFsck that make it unnecessarily slow 
> especially for large tables or clusters with many regions.  
> This patch tries to fix the biggest bottlenecks and also include a couple of 
> bug fixes for some of the race conditions caused by gathering and holding 
> state about a live cluster that is no longer true by the time you use that 
> state in Fsck processing.  These race conditions cause Fsck to crash and 
> become unusable on large clusters with lots of region splits/merges.
> Here are some scalability/performance problems in HBaseFsck and the changes 
> the patch makes:
> - Unnecessary I/O and RPCs caused by fetching an array of FileStatuses and 
> then discarding everything but the Paths, then passing the Paths to a 
> PathFilter, and then having the filter look up the (previously discarded) 
> FileStatuses of the paths again.  This is actually worse than double I/O 
> because the first lookup obtains a batch of FileStatuses while all the other 
> lookups are individual RPCs performed sequentially.
> -- Avoid this by adding a FileStatusFilter so that filtering can happen 
> directly on FileStatuses
> -- This performance bug affects more than Fsck, but also to some extent 
> things like snapshots, hfile archival, etc.  I didn't have time to look too 
> deep into other things affected and didn't want to increase the scope of this 
> ticket so I focus mostly on Fsck and make only a few improvements to other 
> codepaths.  The changes in this patch though should make it fairly easy to 
> fix other code paths in later jiras if we feel there are some other features 
> strongly impacted by this problem.  
> - OfflineReferenceFileRepair is the most expensive part of Fsck (often 50% of 
> Fsck runtime) and the running time scales with the number of store files, yet 
> the function is completely serial
> -- Make offlineReferenceFileRepair multithreaded
> - LoadHdfsRegionDirs() uses table-level concurrency, which is a big 
> bottleneck if you have 1 large cluster with 1 very large table that has 
> nearly all the regions
> -- Change loadHdfsRegionDirs() to region-level parallelism instead of 
> table-level parallelism for operations.
> The changes benefit all clusters but are especially noticeable for large 
> clusters with a few very large tables.  On our version of 0.98 with the 
> original patch we had a moderately sized production cluster with 2 (user) 
> tables and ~160k regions where HBaseFsck went from taking 18 min to 5 minutes.
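
The move from table-level to region-level parallelism described above can be sketched roughly as follows. This is only an illustration, not the code from the patch: the class name, loadRegionDir() and its return value are hypothetical stand-ins for the real per-region work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class RegionLevelParallelism {
    // Hypothetical stand-in for the per-region work (e.g. listing one region directory).
    static int loadRegionDir(String regionName) {
        return regionName.length();
    }

    // Submits one task per region instead of one per table, so a single very
    // large table is spread across all worker threads.
    static int loadAll(Map<String, List<String>> regionsByTable, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (List<String> regions : regionsByTable.values()) {
                for (String region : regions) {
                    Callable<Integer> task = () -> loadRegionDir(region);
                    pending.add(pool.submit(task));
                }
            }
            int total = 0;
            for (Future<Integer> result : pending) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while loading regions", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("region task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

With one task per table, a cluster whose regions nearly all belong to one table would keep a single thread busy; with one task per region the same pool stays saturated regardless of how regions are distributed over tables.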





[jira] [Updated] (HBASE-15945) Patch for Cell and CellImpl

2016-07-01 Thread Sudeep Sunthankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sudeep Sunthankar updated HBASE-15945:
--
Attachment: HBASE-15945.HBASE-14850.v5.patch

Hi, this patch consists of name changes per the Google style guide, plus tests.

> Patch for Cell and CellImpl
> ---
>
> Key: HBASE-15945
> URL: https://issues.apache.org/jira/browse/HBASE-15945
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Sudeep Sunthankar
>Assignee: Sudeep Sunthankar
> Attachments: HBASE-15945-HBASE-14850.v2.patch, 
> HBASE-15945-HBASE-14850.v3.patch, HBASE-15945-HBASE-14850.v4.patch, 
> HBASE-15945.HBASE-14850.v1.patch, HBASE-15945.HBASE-14850.v5.patch
>
>
> This patch contains an implementation of KeyValue, Bytes and Cell modeled 
> along the lines of the Java implementation.





[jira] [Commented] (HBASE-16030) All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is on, causing flush spike

2016-07-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359807#comment-15359807
 ] 

Hadoop QA commented on HBASE-16030:
---

| (x) -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| -1 | patch | 0m 4s | HBASE-16030 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.2.1/precommit-patchnames for help. |

|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12815834/hbase-16030-v3.patch |
| JIRA Issue | HBASE-16030 |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/2490/console |
| Powered by | Apache Yetus 0.2.1   http://yetus.apache.org |


This message was automatically generated.



> All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is 
> on, causing flush spike
> --
>
> Key: HBASE-16030
> URL: https://issues.apache.org/jira/browse/HBASE-16030
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.2.1
>Reporter: Tianying Chang
>Assignee: Tianying Chang
> Fix For: 2.0.0, 1.4.0, 1.3.1, 1.2.3
>
> Attachments: Screen Shot 2016-06-15 at 11.35.42 PM.png, Screen Shot 
> 2016-06-15 at 11.52.38 PM.png, hbase-16030-v2.patch, hbase-16030-v3.patch, 
> hbase-16030.patch
>
>
> In our production cluster, we observed memstore flushes spiking every hour 
> for all regions/RS (we use the default memstore periodic flush time of 1 
> hour).
> This happens when two conditions are met:
> 1. the memstore does not have enough data to be flushed before the 1 hour 
> limit is reached;
> 2. all regions are opened around the same time (e.g. all RS are started at 
> the same time when starting a cluster).
> Under these two conditions, all the regions are flushed at around the same 
> time, startTime+1hour-delay, again and again.
> We added a flush jitter to randomize the flush time of each region, so that 
> they don't get flushed at around the same time. We have had this feature 
> running in our 94.7 and 94.26 clusters. We recently upgraded to 1.2 and found 
> the issue still present there, so we are porting the change to the 1.2 branch.
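
The flush jitter idea described above can be illustrated with a minimal sketch. It is not taken from the attached patches: the class name, method name and both parameters are assumptions chosen for the example, and the real change may compute or apply the jitter differently.

```java
import java.util.Random;

final class FlushJitter {
    // Returns the periodic flush interval shifted by a random offset in
    // [-maxJitterMs, +maxJitterMs). Both parameters are illustrative.
    static long jitteredIntervalMs(long baseIntervalMs, long maxJitterMs, Random rng) {
        if (maxJitterMs <= 0) {
            return baseIntervalMs; // jitter disabled
        }
        long offset = (long) (rng.nextDouble() * 2 * maxJitterMs) - maxJitterMs;
        return baseIntervalMs + offset;
    }
}
```

If every region draws its own offset, regions that were opened at the same moment stop reaching the flush deadline together, which spreads the hourly spike over the jitter window.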





[jira] [Commented] (HBASE-16030) All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is on, causing flush spike

2016-07-01 Thread Tianying Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359803#comment-15359803
 ] 

Tianying Chang commented on HBASE-16030:


Attached a new patch. Still use my old way, but updated to fit 1.2, and 
addressed the earlier CR comments. 

> All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is 
> on, causing flush spike
> --
>
> Key: HBASE-16030
> URL: https://issues.apache.org/jira/browse/HBASE-16030
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.2.1
>Reporter: Tianying Chang
>Assignee: Tianying Chang
> Fix For: 2.0.0, 1.4.0, 1.3.1, 1.2.3
>
> Attachments: Screen Shot 2016-06-15 at 11.35.42 PM.png, Screen Shot 
> 2016-06-15 at 11.52.38 PM.png, hbase-16030-v2.patch, hbase-16030-v3.patch, 
> hbase-16030.patch
>
>
> In our production cluster, we observed memstore flushes spiking every hour 
> for all regions/RS (we use the default memstore periodic flush time of 1 
> hour).
> This happens when two conditions are met:
> 1. the memstore does not have enough data to be flushed before the 1 hour 
> limit is reached;
> 2. all regions are opened around the same time (e.g. all RS are started at 
> the same time when starting a cluster).
> Under these two conditions, all the regions are flushed at around the same 
> time, startTime+1hour-delay, again and again.
> We added a flush jitter to randomize the flush time of each region, so that 
> they don't get flushed at around the same time. We have had this feature 
> running in our 94.7 and 94.26 clusters. We recently upgraded to 1.2 and found 
> the issue still present there, so we are porting the change to the 1.2 branch.





[jira] [Updated] (HBASE-16030) All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is on, causing flush spike

2016-07-01 Thread Tianying Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tianying Chang updated HBASE-16030:
---
Attachment: hbase-16030-v3.patch

> All Regions are flushed at about same time when MEMSTORE_PERIODIC_FLUSH is 
> on, causing flush spike
> --
>
> Key: HBASE-16030
> URL: https://issues.apache.org/jira/browse/HBASE-16030
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.2.1
>Reporter: Tianying Chang
>Assignee: Tianying Chang
> Fix For: 2.0.0, 1.4.0, 1.3.1, 1.2.3
>
> Attachments: Screen Shot 2016-06-15 at 11.35.42 PM.png, Screen Shot 
> 2016-06-15 at 11.52.38 PM.png, hbase-16030-v2.patch, hbase-16030-v3.patch, 
> hbase-16030.patch
>
>
> In our production cluster, we observed memstore flushes spiking every hour 
> for all regions/RS (we use the default memstore periodic flush time of 1 
> hour).
> This happens when two conditions are met:
> 1. the memstore does not have enough data to be flushed before the 1 hour 
> limit is reached;
> 2. all regions are opened around the same time (e.g. all RS are started at 
> the same time when starting a cluster).
> Under these two conditions, all the regions are flushed at around the same 
> time, startTime+1hour-delay, again and again.
> We added a flush jitter to randomize the flush time of each region, so that 
> they don't get flushed at around the same time. We have had this feature 
> running in our 94.7 and 94.26 clusters. We recently upgraded to 1.2 and found 
> the issue still present there, so we are porting the change to the 1.2 branch.





[jira] [Commented] (HBASE-16157) The incorrect block cache count and size are caused by removing duplicate block key in the LruBlockCache

2016-07-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359801#comment-15359801
 ] 

Hudson commented on HBASE-16157:


FAILURE: Integrated in HBase-Trunk_matrix #1151 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/1151/])
HBASE-16157 Revert - ChiaPing has new fix (tedyu: rev 
fe75edb556649a101482b974eead2d506f78efe4)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestLruBlockCache.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java


> The incorrect block cache count and size are caused by removing duplicate 
> block key in the LruBlockCache
> 
>
> Key: HBASE-16157
> URL: https://issues.apache.org/jira/browse/HBASE-16157
> Project: HBase
>  Issue Type: Bug
>Reporter: ChiaPing Tsai
>Priority: Trivial
> Attachments: HBASE-16157-v1.patch, HBASE-16157-v2.patch, 
> HBASE-16157-v3.patch, HBASE-16157-v4.patch
>
>
> {code:title=LruBlockCache.java|borderStyle=solid}
> // Check return value from the Map#remove before updating the metrics
> protected long evictBlock(LruCachedBlock block, boolean evictedByEvictionProcess) {
>   map.remove(block.getCacheKey());
>   updateSizeMetrics(block, true);
>   ...
> }
> {code}
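
The point of the snippet above is that the metrics should only change when Map#remove actually removed something. A self-contained sketch of that pattern follows; CountingCache and its fields are hypothetical and much simpler than LruBlockCache, so treat it as an illustration of the check rather than the committed fix.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

final class CountingCache {
    private final ConcurrentMap<String, byte[]> map = new ConcurrentHashMap<>();
    private final AtomicLong size = new AtomicLong();
    private final AtomicLong count = new AtomicLong();

    void put(String key, byte[] block) {
        byte[] previous = map.put(key, block);
        if (previous != null) {
            size.addAndGet(-previous.length); // replaced an existing entry
        } else {
            count.incrementAndGet();
        }
        size.addAndGet(block.length);
    }

    // Returns the number of bytes freed, or 0 if the key was already gone.
    long evict(String key) {
        byte[] removed = map.remove(key);
        if (removed == null) {
            return 0; // another caller evicted it first: leave the metrics alone
        }
        size.addAndGet(-removed.length);
        count.decrementAndGet();
        return removed.length;
    }

    long size() { return size.get(); }
    long count() { return count.get(); }
}
```

Without the null check, evicting the same key twice would subtract the block size and count twice, which is the kind of drift in cache count and size the issue title describes.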





[jira] [Commented] (HBASE-16091) Canary takes lot more time when there are delete markers in the table

2016-07-01 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359775#comment-15359775
 ] 

Andrew Purtell commented on HBASE-16091:


Patch lgtm. If no further comment or objection I'll commit early next week 
after the holiday.


> Canary takes lot more time when there are delete markers in the table
> -
>
> Key: HBASE-16091
> URL: https://issues.apache.org/jira/browse/HBASE-16091
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Vishal Khandelwal
>Assignee: Vishal Khandelwal
> Fix For: 2.0.0, 1.4.0, 1.1.6, 0.98.21, 1.2.3
>
> Attachments: HBASE-16091.00.patch, HBASE-16091.01.patch, 
> HBASE-16091.02.patch
>
>
> We have a table with a lot of delete markers, and we run the Canary test on 
> a regular interval. Sometimes the tests time out because reading the first 
> row has to skip all these delete markers. Since the purpose of Canary is to 
> check the health of the region, I think setting raw=true would not defeat 
> that purpose and would provide a good perf improvement.
> Following is an example of one such scan. Without changing the code, it took 
> 62.3 sec for one region scan:
> 2016-06-23 08:49:11,670 INFO  [pool-2-thread-1] tool.Canary - read from 
> region  . column family 0 in 62338ms
> whereas after setting raw=true, it dropped to 58ms:
> 2016-06-23 08:45:20,259 INFO  [pool-2-thread-1] tests.Canary - read from 
> region . column family 0 in 58ms
> Applied across multiple tables with multiple regions, this would be a good 
> performance gain.





[jira] [Updated] (HBASE-16091) Canary takes lot more time when there are delete markers in the table

2016-07-01 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-16091:
---
Fix Version/s: (was: 0.98.22)
   0.98.21
   1.4.0
   2.0.0

> Canary takes lot more time when there are delete markers in the table
> -
>
> Key: HBASE-16091
> URL: https://issues.apache.org/jira/browse/HBASE-16091
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Vishal Khandelwal
>Assignee: Vishal Khandelwal
> Fix For: 2.0.0, 1.4.0, 1.1.6, 0.98.21, 1.2.3
>
> Attachments: HBASE-16091.00.patch, HBASE-16091.01.patch, 
> HBASE-16091.02.patch
>
>
> We have a table with a lot of delete markers, and we run the Canary test on 
> a regular interval. Sometimes the tests time out because reading the first 
> row has to skip all these delete markers. Since the purpose of Canary is to 
> check the health of the region, I think setting raw=true would not defeat 
> that purpose and would provide a good perf improvement.
> Following is an example of one such scan. Without changing the code, it took 
> 62.3 sec for one region scan:
> 2016-06-23 08:49:11,670 INFO  [pool-2-thread-1] tool.Canary - read from 
> region  . column family 0 in 62338ms
> whereas after setting raw=true, it dropped to 58ms:
> 2016-06-23 08:45:20,259 INFO  [pool-2-thread-1] tests.Canary - read from 
> region . column family 0 in 58ms
> Applied across multiple tables with multiple regions, this would be a good 
> performance gain.





[jira] [Commented] (HBASE-16125) RegionMover uses hardcoded, Unix-style tmp folder - breaks Windows

2016-07-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359732#comment-15359732
 ] 

Hudson commented on HBASE-16125:


SUCCESS: Integrated in HBase-1.4 #264 (See 
[https://builds.apache.org/job/HBase-1.4/264/])
HBASE-16125 RegionMover uses hardcoded, Unix-style tmp folder - breaks (tedyu: 
rev 2aa8cdc98979e6353276100d4cc7ba7001795696)
* bin/region_mover.rb


> RegionMover uses hardcoded, Unix-style tmp folder - breaks Windows
> --
>
> Key: HBASE-16125
> URL: https://issues.apache.org/jira/browse/HBASE-16125
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16125-branch-1.1-v1.patch, HBASE-16125-v1.patch
>
>
> The issue exists in all branches.
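
As a general illustration of avoiding a hardcoded Unix-style tmp folder, a JVM program can ask the platform for its temp directory. The fix itself is in bin/region_mover.rb and may look different; the class and method names below are hypothetical, and only the java.io.tmpdir system property is a standard JVM feature.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class PortableTmp {
    // Resolves a file name against the platform temp directory instead of "/tmp".
    static Path tmpFile(String fileName) {
        return Paths.get(System.getProperty("java.io.tmpdir"), fileName);
    }
}
```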





[jira] [Commented] (HBASE-16108) RowCounter should support multiple key ranges

2016-07-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359731#comment-15359731
 ] 

Hudson commented on HBASE-16108:


SUCCESS: Integrated in HBase-1.4 #264 (See 
[https://builds.apache.org/job/HBase-1.4/264/])
HBASE-16108 RowCounter should support multiple key ranges (Konstantin (tedyu: 
rev a345aa8707e86405751fda7caa08990aa0842e23)
* hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/RowCounter.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestRowCounter.java


> RowCounter should support multiple key ranges
> -
>
> Key: HBASE-16108
> URL: https://issues.apache.org/jira/browse/HBASE-16108
> Project: HBase
>  Issue Type: Improvement
>Reporter: Geoffrey Jacoby
>Assignee: Konstantin Ryakhovskiy
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16108.branch-1.001.patch, 
> HBASE-16108.master.001.patch, HBASE-16108.master.003.patch, 
> HBASE-16108.master.004.patch, test_HBASE-16108.log, 
> test_TestTableBasedReplicationSourceManagerImpl.log
>
>
> Currently, RowCounter only allows a single key range to be used as a filter. 
> It would be useful in some cases to be able to specify multiple key ranges 
> (or prefixes) in the same job. (For example, counting over a set of Phoenix 
> tenant ids in an unsalted table)
> This could be done by enhancing the existing key range parameter to take 
> multiple start/stop row pairs. Alternately, a new --row-prefixes option could 
> be added, similar to what HBASE-15847 did for VerifyReplication. 
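
One way to picture the first option, a key range parameter that takes multiple start/stop row pairs, is the parsing sketch below. The separators (";" between pairs, "," between start and stop) and the class name are assumptions made for this example and are not confirmed by the issue text.

```java
import java.util.ArrayList;
import java.util.List;

final class KeyRangeParser {
    // Parses "start,stop;start,stop" into a list of two-element arrays.
    // An empty start or stop means the range is open on that side.
    static List<String[]> parse(String spec) {
        List<String[]> ranges = new ArrayList<>();
        for (String pair : spec.split(";")) {
            if (pair.isEmpty()) {
                continue; // tolerate a stray separator
            }
            String[] bounds = pair.split(",", -1); // -1 keeps a trailing empty stop row
            if (bounds.length != 2) {
                throw new IllegalArgumentException("Expected start,stop but got: " + pair);
            }
            ranges.add(bounds);
        }
        return ranges;
    }
}
```

Each parsed pair could then drive its own scan range within the same job.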





[jira] [Commented] (HBASE-15716) HRegion#RegionScannerImpl scannerReadPoints synchronization constrains random read

2016-07-01 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359682#comment-15359682
 ] 

stack commented on HBASE-15716:
---

Can you make a patch, [~ikeda]? I tried looking at HBASE-14479 and interpreting 
your comment above, but I have holes in my understanding. Thanks.

> HRegion#RegionScannerImpl scannerReadPoints synchronization constrains random 
> read
> --
>
> Key: HBASE-15716
> URL: https://issues.apache.org/jira/browse/HBASE-15716
> Project: HBase
>  Issue Type: Bug
>  Components: Performance
>Reporter: stack
>Assignee: stack
> Attachments: 
> 15716.implementation.using.ScannerReadPoints.branch-1.patch, 
> 15716.prune.synchronizations.patch, 15716.prune.synchronizations.v3.patch, 
> 15716.prune.synchronizations.v4.patch, 15716.prune.synchronizations.v4.patch, 
> 15716.wip.more_to_be_done.patch, HBASE-15716.branch-1.001.patch, 
> HBASE-15716.branch-1.002.patch, HBASE-15716.branch-1.003.patch, 
> HBASE-15716.branch-1.004.patch, HBASE-15716.branch-1.005.patch, 
> ScannerReadPoints.java, Screen Shot 2016-04-26 at 2.05.45 PM.png, Screen Shot 
> 2016-04-26 at 2.06.14 PM.png, Screen Shot 2016-04-26 at 2.07.06 PM.png, 
> Screen Shot 2016-04-26 at 2.25.26 PM.png, Screen Shot 2016-04-26 at 6.02.29 
> PM.png, Screen Shot 2016-04-27 at 9.49.35 AM.png, Screen Shot 2016-06-30 at 
> 9.52.52 PM.png, Screen Shot 2016-06-30 at 9.54.08 PM.png, 
> TestScannerReadPoints.java, before_after.png, 
> current-branch-1.vs.NoSynchronization.vs.Patch.png, hits.png, 
> remove.locks.patch, remove_cslm.patch
>
>
> Here is a [~lhofhansl] special.
> When we construct the region scanner, we get our read point and then store it 
> with the scanner instance in a Region scoped CSLM. This is done under a 
> synchronize on the CSLM.
> This synchronize on a region-scoped Map creating region scanners is the 
> outstanding point of lock contention according to flight recorder (My work 
> load is workload c, random reads).





[jira] [Updated] (HBASE-16159) OutOfMemory exception when using AsyncRpcClient with encryption to read rpc response

2016-07-01 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-16159:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 1.4.0
   2.0.0
   Status: Resolved  (was: Patch Available)

Thanks for the patch, Colin.

> OutOfMemory exception when using AsyncRpcClient with encryption to read rpc 
> response
> 
>
> Key: HBASE-16159
> URL: https://issues.apache.org/jira/browse/HBASE-16159
> Project: HBase
>  Issue Type: Bug
>Reporter: Colin Ma
>Assignee: Colin Ma
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16159.001.patch
>
>
> Running Get through AsyncRpcClient with encryption enabled in an infinite 
> loop produces an OOM exception. The root cause is the same as HBASE-16054.





[jira] [Commented] (HBASE-16055) PutSortReducer loses any Visibility/acl attribute set on the Puts

2016-07-01 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359671#comment-15359671
 ] 

Andrew Purtell commented on HBASE-16055:


The fix versions specify multiple branches but the change was only checked into 
trunk and branch-1. Are you planning to commit to the other branches 
[~ram_krish] ? I can do this as part of 0.98 maintenance also, let me know. 

> PutSortReducer loses any Visibility/acl attribute set on the Puts 
> --
>
> Key: HBASE-16055
> URL: https://issues.apache.org/jira/browse/HBASE-16055
> Project: HBase
>  Issue Type: Bug
>  Components: security
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 2.0.0, 1.0.4, 1.4.0, 0.98.21
>
> Attachments: HBASE-16055_1.patch, HBASE-16055_2.patch
>
>
> Based on a user discussion, and as the user rightly pointed out, when a 
> PutSortReducer is used any visibility attribute or external attribute set on 
> the Put is lost, because we create KVs out of the cells in the Puts whereas 
> the ACL and visibility are all set as Attributes.
> In TextSortReducer we read this information from the parsed line, but here 
> in PutSortReducer we don't. I think this problem exists in all the existing 
> versions where we support Tags. Correct me if I am wrong here.
> [~anoop.hbase], [~andrew.purt...@gmail.com]?





[jira] [Updated] (HBASE-16149) Log the underlying RPC exception in RpcRetryingCallerImpl

2016-07-01 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-16149:
-
Fix Version/s: 1.4.0
   1.3.0

> Log the underlying RPC exception in RpcRetryingCallerImpl 
> --
>
> Key: HBASE-16149
> URL: https://issues.apache.org/jira/browse/HBASE-16149
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.2.0
>Reporter: Jerry He
>Assignee: Jerry He
>Priority: Minor
> Fix For: 2.0.0, 1.3.0, 1.4.0
>
> Attachments: HBASE-16149-branch-1.patch, HBASE-16149.patch
>
>
> In RpcRetryingCallerImpl:
> {code}
>   public T callWithRetries(RetryingCallable callable, int callTimeout)
>       throws IOException, RuntimeException {
>     ...
>     for (int tries = 0;; tries++) {
>       try {
>         ...
>         return callable.call(getTimeout(callTimeout));
>         ...
>       } catch (Throwable t) {
>         ExceptionUtil.rethrowIfInterrupt(t);
>         if (tries > startLogErrorsCnt) {
>           LOG.info("Call exception, tries=" + tries + ", maxAttempts=" + maxAttempts + ", started="
>               + (EnvironmentEdgeManager.currentTime() - tracker.getStartTime()) + " ms ago, "
>               + "cancelled=" + cancelled.get() + ", msg="
>               + callable.getExceptionMessageAdditionalDetail());
>         }
>         ...
> {code}
> We log the callable.getExceptionMessageAdditionalDetail() msg, but that 
> message may not include the underlying cause.
> For example, in AbstractRegionServerCallable, 
> {code}
>   public String getExceptionMessageAdditionalDetail() {
>     return "row '" + Bytes.toString(row) + "' on table '" + tableName + "' at " + location;
>   }
> {code}
> Let's add the underlying exception cause to the message as well.
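
A rough sketch of what adding the underlying cause to the message could look like is shown below. It is not the committed change: RetryLogMessage, describe() and the "exception=" / "rootCause=" labels are hypothetical, and the real log line carries more fields.

```java
final class RetryLogMessage {
    // Builds a log line that carries both the callable's detail string and
    // the exception that triggered the retry, including its root cause.
    static String describe(int tries, int maxAttempts, String detail, Throwable t) {
        Throwable root = t;
        while (root.getCause() != null) {
            root = root.getCause();
        }
        StringBuilder msg = new StringBuilder();
        msg.append("Call exception, tries=").append(tries)
           .append(", maxAttempts=").append(maxAttempts)
           .append(", msg=").append(detail)
           .append(", exception=").append(t);
        if (root != t) {
            msg.append(", rootCause=").append(root);
        }
        return msg.toString();
    }
}
```

With this shape, a wrapped IOException would show up in the retry log next to the row and table detail instead of being visible only after the final attempt fails.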





[jira] [Commented] (HBASE-16128) add support for p999 histogram metrics

2016-07-01 Thread Tianying Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359650#comment-15359650
 ] 

Tianying Chang commented on HBASE-16128:


Should I provide a patch for a different version? 

> add support for p999 histogram metrics
> --
>
> Key: HBASE-16128
> URL: https://issues.apache.org/jira/browse/HBASE-16128
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Affects Versions: 1.2.1
>Reporter: Tianying Chang
>Assignee: Tianying Chang
>Priority: Minor
> Attachments: HBase-16128.patch
>
>
> Currently there is support for p75, p90 and p99, but no support for p999. We 
> need p999 metrics to reflect p99 latency at the client level, especially when 
> the client side makes fan-out calls.
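
The link between server-side p999 and client-side p99 under fan-out can be checked with a little arithmetic. The sketch below assumes the parallel calls are independent, which is a simplification, and the class and method names are made up for the example.

```java
final class FanoutQuantile {
    // Probability that all `fanout` independent parallel calls finish within
    // the latency that a single call meets with probability `perCallQuantile`.
    static double allWithin(double perCallQuantile, int fanout) {
        return Math.pow(perCallQuantile, fanout);
    }
}
```

For a fan-out of 10, allWithin(0.999, 10) is about 0.990, so the server p999 roughly bounds the client p99, while allWithin(0.99, 10) is only about 0.904, meaning the server p99 says little about the client p99.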





[jira] [Commented] (HBASE-14422) Fix TestFastFailWithoutTestUtil

2016-07-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359658#comment-15359658
 ] 

Hadoop QA commented on HBASE-14422:
---

| (/) +1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| +1 | mvninstall | 2m 52s | master passed |
| +1 | compile | 0m 17s | master passed with JDK v1.8.0 |
| +1 | compile | 0m 17s | master passed with JDK v1.7.0_80 |
| +1 | checkstyle | 0m 25s | master passed |
| +1 | mvneclipse | 0m 10s | master passed |
| +1 | findbugs | 0m 56s | master passed |
| +1 | javadoc | 0m 26s | master passed with JDK v1.8.0 |
| +1 | javadoc | 0m 19s | master passed with JDK v1.7.0_80 |
| +1 | mvninstall | 0m 18s | the patch passed |
| +1 | compile | 0m 25s | the patch passed with JDK v1.8.0 |
| +1 | javac | 0m 25s | the patch passed |
| +1 | compile | 0m 17s | the patch passed with JDK v1.7.0_80 |
| +1 | javac | 0m 17s | the patch passed |
| +1 | checkstyle | 0m 24s | the patch passed |
| +1 | mvneclipse | 0m 10s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | hadoopcheck | 26m 56s | Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. |
| +1 | findbugs | 1m 14s | the patch passed |
| +1 | javadoc | 0m 19s | the patch passed with JDK v1.8.0 |
| +1 | javadoc | 0m 20s | the patch passed with JDK v1.7.0_80 |
| +1 | unit | 1m 8s | hbase-client in the patch passed. |
| +1 | asflicense | 0m 7s | Patch does not generate ASF License warnings. |
|  |  | 37m 42s |  |

|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12815814/HBASE-14422.master.007.patch |
| JIRA Issue | HBASE-14422 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / fe75edb |
| Default Java | 1.7.0_80 |
| Multi-JDK versions | /home/jenkins/tools/java/jdk1.8.0:1.8.0 /home/jenkins/jenkins-slave/tools/hudson.model.JDK/JDK_1.7_latest_:1.7.0_80 |
| findbugs | v3.0.0 |
|  Test Results | 

[jira] [Updated] (HBASE-14422) Fix TestFastFailWithoutTestUtil

2016-07-01 Thread Konstantin Ryakhovskiy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Ryakhovskiy updated HBASE-14422:
---
Status: Patch Available  (was: Open)

> Fix TestFastFailWithoutTestUtil
> ---
>
> Key: HBASE-14422
> URL: https://issues.apache.org/jira/browse/HBASE-14422
> Project: HBase
>  Issue Type: Task
>  Components: test
>Reporter: stack
>Assignee: Konstantin Ryakhovskiy
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-14422.master.001.patch, 
> HBASE-14422.master.002.patch, HBASE-14422.master.003.patch, 
> HBASE-14422.master.004.patch, HBASE-14422.master.005.patch, 
> HBASE-14422.master.006.patch, HBASE-14422.master.007.patch, log.txt, trace.log
>
>
> TestFastFailWithoutTestUtil has a unit test, 
> testInterceptorIntercept50Times. Usually it passes, but on occasion the 
> latching between thread 1 and thread 2 goes awry and the test hangs. It 
> depends on the hardware, but it seems to happen about one in four runs here 
> on an internal rig.
> HBASE-14421 changed the wait-on-latch to time out, do a thread dump, and 
> just let the test keep going.
> This issue is about digging in to figure out why the test hangs on the 
> latches and then fixing it so the test doesn't need the latch timeout. 
> Hopefully the thread dump helps.
> This one could be hard to fix since it is not easy to reproduce. Marking it 
> beginner anyway.
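
The wait-on-latch with a timeout and a thread dump, as the description attributes to HBASE-14421, follows a common pattern that can be sketched as below. This is not the test's actual code: LatchWithTimeout and awaitOrDump() are hypothetical names, and the dump format is arbitrary.

```java
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class LatchWithTimeout {
    // Waits for the latch; on timeout prints every thread's stack to stderr
    // and returns false so the caller can keep going instead of hanging.
    static boolean awaitOrDump(CountDownLatch latch, long timeoutMs) {
        try {
            if (latch.await(timeoutMs, TimeUnit.MILLISECONDS)) {
                return true;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            System.err.println(entry.getKey());
            for (StackTraceElement frame : entry.getValue()) {
                System.err.println("    at " + frame);
            }
        }
        return false;
    }
}
```

The timeout only hides the hang; the dump is what gives a starting point for finding why the two threads miss each other on the latch.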





[jira] [Updated] (HBASE-14422) Fix TestFastFailWithoutTestUtil

2016-07-01 Thread Konstantin Ryakhovskiy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Ryakhovskiy updated HBASE-14422:
---
Attachment: HBASE-14422.master.007.patch

another attempt, expecting a fail

> Fix TestFastFailWithoutTestUtil
> ---
>
> Key: HBASE-14422
> URL: https://issues.apache.org/jira/browse/HBASE-14422
> Project: HBase
>  Issue Type: Task
>  Components: test
>Reporter: stack
>Assignee: Konstantin Ryakhovskiy
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-14422.master.001.patch, 
> HBASE-14422.master.002.patch, HBASE-14422.master.003.patch, 
> HBASE-14422.master.004.patch, HBASE-14422.master.005.patch, 
> HBASE-14422.master.006.patch, HBASE-14422.master.007.patch, log.txt, trace.log
>
>
> TestFastFailWithoutTestUtil has a unit test, 
> testInterceptorIntercept50Times. Usually it passes, but on occasion the 
> latching between thread 1 and thread 2 goes awry and the test hangs. 
> It depends on the hardware, but it seems to happen about one in 
> four runs here on an internal rig.
> HBASE-14421 changed the wait-on-latch to time out, do a thread dump, and 
> just let the test keep going.
> This issue is about figuring out why the test hangs on the latches and 
> then fixing it so the test doesn't need the latch timeout. Hopefully 
> the thread dump helps.
> This one could be hard to fix since it is not easy to reproduce. Marking it 
> beginner anyway.
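
The workaround attributed to HBASE-14421 in the description above (wait on the latch with a timeout, dump the threads, let the test carry on) can be sketched roughly as follows. This is only an illustration under assumptions: the class name LatchTimeoutSketch and the method awaitOrDump are hypothetical and are not taken from TestFastFailWithoutTestUtil.

```java
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of HBase.
final class LatchTimeoutSketch {
  /**
   * Waits on the latch for at most timeoutMs milliseconds.
   * Returns true if the latch reached zero in time; otherwise prints a
   * thread dump to stderr and returns false so the caller can keep going.
   */
  public static boolean awaitOrDump(CountDownLatch latch, long timeoutMs) {
    try {
      if (latch.await(timeoutMs, TimeUnit.MILLISECONDS)) {
        return true;
      }
    } catch (InterruptedException e) {
      // Preserve the interrupt for the caller and treat it as "not released".
      Thread.currentThread().interrupt();
      return false;
    }
    // Timed out: print every live thread's stack to help diagnose the hang.
    for (Map.Entry<Thread, StackTraceElement[]> entry
        : Thread.getAllStackTraces().entrySet()) {
      System.err.println(entry.getKey());
      for (StackTraceElement frame : entry.getValue()) {
        System.err.println("    at " + frame);
      }
    }
    return false;
  }
}
```

A timeout like this only keeps the test from hanging forever; it does not explain why the latch was never released, which is what this issue asks for.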





[jira] [Updated] (HBASE-14422) Fix TestFastFailWithoutTestUtil

2016-07-01 Thread Konstantin Ryakhovskiy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Ryakhovskiy updated HBASE-14422:
---
Status: Open  (was: Patch Available)

> Fix TestFastFailWithoutTestUtil
> ---
>
> Key: HBASE-14422
> URL: https://issues.apache.org/jira/browse/HBASE-14422
> Project: HBase
>  Issue Type: Task
>  Components: test
>Reporter: stack
>Assignee: Konstantin Ryakhovskiy
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-14422.master.001.patch, 
> HBASE-14422.master.002.patch, HBASE-14422.master.003.patch, 
> HBASE-14422.master.004.patch, HBASE-14422.master.005.patch, 
> HBASE-14422.master.006.patch, log.txt, trace.log
>
>
> TestFastFailWithoutTestUtil has a unit test, 
> testInterceptorIntercept50Times. Usually it passes, but on occasion the 
> latching between thread 1 and thread 2 goes awry and the test hangs. 
> It depends on the hardware, but it seems to happen about one in 
> four runs here on an internal rig.
> HBASE-14421 changed the wait-on-latch to time out, do a thread dump, and 
> just let the test keep going.
> This issue is about figuring out why the test hangs on the latches and 
> then fixing it so the test doesn't need the latch timeout. Hopefully 
> the thread dump helps.
> This one could be hard to fix since it is not easy to reproduce. Marking it 
> beginner anyway.





[jira] [Commented] (HBASE-14422) Fix TestFastFailWithoutTestUtil

2016-07-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359602#comment-15359602
 ] 

Hadoop QA commented on HBASE-14422:
---

| (/) *+1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| +1 | mvninstall | 2m 53s | master passed |
| +1 | compile | 0m 21s | master passed with JDK v1.8.0 |
| +1 | compile | 0m 16s | master passed with JDK v1.7.0_80 |
| +1 | checkstyle | 0m 25s | master passed |
| +1 | mvneclipse | 0m 10s | master passed |
| +1 | findbugs | 0m 57s | master passed |
| +1 | javadoc | 0m 26s | master passed with JDK v1.8.0 |
| +1 | javadoc | 0m 19s | master passed with JDK v1.7.0_80 |
| +1 | mvninstall | 0m 19s | the patch passed |
| +1 | compile | 0m 21s | the patch passed with JDK v1.8.0 |
| +1 | javac | 0m 21s | the patch passed |
| +1 | compile | 0m 17s | the patch passed with JDK v1.7.0_80 |
| +1 | javac | 0m 17s | the patch passed |
| +1 | checkstyle | 0m 25s | the patch passed |
| +1 | mvneclipse | 0m 10s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | hadoopcheck | 25m 56s | Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. |
| +1 | findbugs | 1m 10s | the patch passed |
| +1 | javadoc | 0m 25s | the patch passed with JDK v1.8.0 |
| +1 | javadoc | 0m 19s | the patch passed with JDK v1.7.0_80 |
| +1 | unit | 1m 3s | hbase-client in the patch passed. |
| +1 | asflicense | 0m 7s | Patch does not generate ASF License warnings. |
| | | 36m 44s | |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12815811/HBASE-14422.master.006.patch |
| JIRA Issue | HBASE-14422 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / fe75edb |
| Default Java | 1.7.0_80 |
| Multi-JDK versions | /home/jenkins/tools/java/jdk1.8.0:1.8.0 /home/jenkins/jenkins-slave/tools/hudson.model.JDK/JDK_1.7_latest_:1.7.0_80 |
| findbugs | v3.0.0 |
|  Test Results | 

[jira] [Commented] (HBASE-15935) Have a separate Walker task running concurrently with Generator

2016-07-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359562#comment-15359562
 ] 

Hadoop QA commented on HBASE-15935:
---

| (/) *+1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
| +1 | mvninstall | 3m 17s | master passed |
| +1 | compile | 0m 22s | master passed with JDK v1.8.0 |
| +1 | compile | 0m 16s | master passed with JDK v1.7.0_80 |
| +1 | checkstyle | 0m 10s | master passed |
| +1 | mvneclipse | 0m 15s | master passed |
| +1 | findbugs | 0m 0s | master passed |
| +1 | javadoc | 0m 18s | master passed with JDK v1.8.0 |
| +1 | javadoc | 0m 8s | master passed with JDK v1.7.0_80 |
| +1 | mvninstall | 0m 17s | the patch passed |
| +1 | compile | 0m 25s | the patch passed with JDK v1.8.0 |
| +1 | javac | 0m 25s | the patch passed |
| +1 | compile | 0m 16s | the patch passed with JDK v1.7.0_80 |
| +1 | javac | 0m 16s | the patch passed |
| +1 | checkstyle | 0m 8s | the patch passed |
| +1 | mvneclipse | 0m 16s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | hadoopcheck | 28m 40s | Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. |
| +1 | findbugs | 0m 0s | the patch passed |
| +1 | javadoc | 0m 18s | the patch passed with JDK v1.8.0 |
| +1 | javadoc | 0m 9s | the patch passed with JDK v1.7.0_80 |
| +1 | unit | 0m 18s | hbase-it in the patch passed. |
| +1 | asflicense | 0m 8s | Patch does not generate ASF License warnings. |
| | | 35m 54s | |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12815810/HBASE-15935.patch |
| JIRA Issue | HBASE-15935 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / d92a99d |
| Default Java | 1.7.0_80 |
| Multi-JDK versions | /home/jenkins/tools/java/jdk1.8.0:1.8.0 /home/jenkins/jenkins-slave/tools/hudson.model.JDK/JDK_1.7_latest_:1.7.0_80 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/2486/testReport/ |
| modules | C: hbase-it U: 

[jira] [Updated] (HBASE-16108) RowCounter should support multiple key ranges

2016-07-01 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-16108:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks for the patch, Konstantin

> RowCounter should support multiple key ranges
> -
>
> Key: HBASE-16108
> URL: https://issues.apache.org/jira/browse/HBASE-16108
> Project: HBase
>  Issue Type: Improvement
>Reporter: Geoffrey Jacoby
>Assignee: Konstantin Ryakhovskiy
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16108.branch-1.001.patch, 
> HBASE-16108.master.001.patch, HBASE-16108.master.003.patch, 
> HBASE-16108.master.004.patch, test_HBASE-16108.log, 
> test_TestTableBasedReplicationSourceManagerImpl.log
>
>
> Currently, RowCounter only allows a single key range to be used as a filter. 
> It would be useful in some cases to be able to specify multiple key ranges 
> (or prefixes) in the same job. (For example, counting over a set of Phoenix 
> tenant ids in an unsalted table)
> This could be done by enhancing the existing key range parameter to take 
> multiple start/stop row pairs. Alternately, a new --row-prefixes option could 
> be added, similar to what HBASE-15847 did for VerifyReplication. 





[jira] [Updated] (HBASE-14422) Fix TestFastFailWithoutTestUtil

2016-07-01 Thread Konstantin Ryakhovskiy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Ryakhovskiy updated HBASE-14422:
---
Status: Open  (was: Patch Available)

> Fix TestFastFailWithoutTestUtil
> ---
>
> Key: HBASE-14422
> URL: https://issues.apache.org/jira/browse/HBASE-14422
> Project: HBase
>  Issue Type: Task
>  Components: test
>Reporter: stack
>Assignee: Konstantin Ryakhovskiy
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-14422.master.001.patch, 
> HBASE-14422.master.002.patch, HBASE-14422.master.003.patch, 
> HBASE-14422.master.004.patch, HBASE-14422.master.005.patch, log.txt, trace.log
>
>
> TestFastFailWithoutTestUtil has a unit test, 
> testInterceptorIntercept50Times. Usually it passes, but on occasion the 
> latching between thread 1 and thread 2 goes awry and the test hangs. 
> It depends on the hardware, but it seems to happen about one in 
> four runs here on an internal rig.
> HBASE-14421 changed the wait-on-latch to time out, do a thread dump, and 
> just let the test keep going.
> This issue is about figuring out why the test hangs on the latches and 
> then fixing it so the test doesn't need the latch timeout. Hopefully 
> the thread dump helps.
> This one could be hard to fix since it is not easy to reproduce. Marking it 
> beginner anyway.





[jira] [Updated] (HBASE-14422) Fix TestFastFailWithoutTestUtil

2016-07-01 Thread Konstantin Ryakhovskiy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Ryakhovskiy updated HBASE-14422:
---
Status: Patch Available  (was: Open)

> Fix TestFastFailWithoutTestUtil
> ---
>
> Key: HBASE-14422
> URL: https://issues.apache.org/jira/browse/HBASE-14422
> Project: HBase
>  Issue Type: Task
>  Components: test
>Reporter: stack
>Assignee: Konstantin Ryakhovskiy
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-14422.master.001.patch, 
> HBASE-14422.master.002.patch, HBASE-14422.master.003.patch, 
> HBASE-14422.master.004.patch, HBASE-14422.master.005.patch, 
> HBASE-14422.master.006.patch, log.txt, trace.log
>
>
> TestFastFailWithoutTestUtil has a unit test, 
> testInterceptorIntercept50Times. Usually it passes, but on occasion the 
> latching between thread 1 and thread 2 goes awry and the test hangs. 
> It depends on the hardware, but it seems to happen about one in 
> four runs here on an internal rig.
> HBASE-14421 changed the wait-on-latch to time out, do a thread dump, and 
> just let the test keep going.
> This issue is about figuring out why the test hangs on the latches and 
> then fixing it so the test doesn't need the latch timeout. Hopefully 
> the thread dump helps.
> This one could be hard to fix since it is not easy to reproduce. Marking it 
> beginner anyway.





[jira] [Updated] (HBASE-14422) Fix TestFastFailWithoutTestUtil

2016-07-01 Thread Konstantin Ryakhovskiy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Ryakhovskiy updated HBASE-14422:
---
Attachment: HBASE-14422.master.006.patch

Another try; hope never dies.
It is weird, but I can easily reproduce the bug on my rig.

> Fix TestFastFailWithoutTestUtil
> ---
>
> Key: HBASE-14422
> URL: https://issues.apache.org/jira/browse/HBASE-14422
> Project: HBase
>  Issue Type: Task
>  Components: test
>Reporter: stack
>Assignee: Konstantin Ryakhovskiy
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-14422.master.001.patch, 
> HBASE-14422.master.002.patch, HBASE-14422.master.003.patch, 
> HBASE-14422.master.004.patch, HBASE-14422.master.005.patch, 
> HBASE-14422.master.006.patch, log.txt, trace.log
>
>
> TestFastFailWithoutTestUtil has a unit test, 
> testInterceptorIntercept50Times. Usually it passes, but on occasion the 
> latching between thread 1 and thread 2 goes awry and the test hangs. 
> It depends on the hardware, but it seems to happen about one in 
> four runs here on an internal rig.
> HBASE-14421 changed the wait-on-latch to time out, do a thread dump, and 
> just let the test keep going.
> This issue is about figuring out why the test hangs on the latches and 
> then fixing it so the test doesn't need the latch timeout. Hopefully 
> the thread dump helps.
> This one could be hard to fix since it is not easy to reproduce. Marking it 
> beginner anyway.





[jira] [Commented] (HBASE-16108) RowCounter should support multiple key ranges

2016-07-01 Thread Konstantin Ryakhovskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359549#comment-15359549
 ] 

Konstantin Ryakhovskiy commented on HBASE-16108:


The Findbugs warning is for HFileWriterV2, which is not touched by the patch, 
so it was introduced previously.
The failed test TestStochasticLoadBalancer2 is not related to the patch 
either. I ran the test locally with the patch and it passed.


> RowCounter should support multiple key ranges
> -
>
> Key: HBASE-16108
> URL: https://issues.apache.org/jira/browse/HBASE-16108
> Project: HBase
>  Issue Type: Improvement
>Reporter: Geoffrey Jacoby
>Assignee: Konstantin Ryakhovskiy
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16108.branch-1.001.patch, 
> HBASE-16108.master.001.patch, HBASE-16108.master.003.patch, 
> HBASE-16108.master.004.patch, test_HBASE-16108.log, 
> test_TestTableBasedReplicationSourceManagerImpl.log
>
>
> Currently, RowCounter only allows a single key range to be used as a filter. 
> It would be useful in some cases to be able to specify multiple key ranges 
> (or prefixes) in the same job. (For example, counting over a set of Phoenix 
> tenant ids in an unsalted table)
> This could be done by enhancing the existing key range parameter to take 
> multiple start/stop row pairs. Alternately, a new --row-prefixes option could 
> be added, similar to what HBASE-15847 did for VerifyReplication. 





[jira] [Commented] (HBASE-16052) Improve HBaseFsck Scalability

2016-07-01 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359542#comment-15359542
 ] 

Sean Busbey commented on HBASE-16052:
-

Personally, I'd rather see this limited to new minor releases. (e.g. 0.98, 1.3, 
etc.) Also could use a release note.

> Improve HBaseFsck Scalability
> -
>
> Key: HBASE-16052
> URL: https://issues.apache.org/jira/browse/HBASE-16052
> Project: HBase
>  Issue Type: Improvement
>  Components: hbck
>Reporter: Ben Lau
>Assignee: Ben Lau
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16052-master.patch, HBASE-16052-v3-branch-1.patch, 
> HBASE-16052-v3-master.patch
>
>
> There are some problems with HBaseFsck that make it unnecessarily slow 
> especially for large tables or clusters with many regions.  
> This patch tries to fix the biggest bottlenecks and also includes a couple of 
> bug fixes for some of the race conditions caused by gathering and holding 
> state about a live cluster that is no longer true by the time you use that 
> state in Fsck processing.  These race conditions cause Fsck to crash and 
> become unusable on large clusters with lots of region splits/merges.
> Here are some scalability/performance problems in HBaseFsck and the changes 
> the patch makes:
> - Unnecessary I/O and RPCs caused by fetching an array of FileStatuses and 
> then discarding everything but the Paths, then passing the Paths to a 
> PathFilter, and then having the filter look up the (previously discarded) 
> FileStatuses of the paths again.  This is actually worse than double I/O 
> because the first lookup obtains a batch of FileStatuses while all the other 
> lookups are individual RPCs performed sequentially.
> -- Avoid this by adding a FileStatusFilter so that filtering can happen 
> directly on FileStatuses
> -- This performance bug affects not only Fsck but also, to some extent, 
> things like snapshots, hfile archival, etc.  I didn't have time to look too 
> deep into other things affected and didn't want to increase the scope of this 
> ticket so I focus mostly on Fsck and make only a few improvements to other 
> codepaths.  The changes in this patch though should make it fairly easy to 
> fix other code paths in later jiras if we feel there are some other features 
> strongly impacted by this problem.  
> - OfflineReferenceFileRepair is the most expensive part of Fsck (often 50% of 
> Fsck runtime) and the running time scales with the number of store files, yet 
> the function is completely serial
> -- Make offlineReferenceFileRepair multithreaded
> - LoadHdfsRegionDirs() uses table-level concurrency, which is a big 
> bottleneck if you have 1 large cluster with 1 very large table that has 
> nearly all the regions
> -- Change loadHdfsRegionDirs() to region-level parallelism instead of 
> table-level parallelism for operations.
> The changes benefit all clusters but are especially noticeable for large 
> clusters with a few very large tables.  On our version of 0.98 with the 
> original patch we had a moderately sized production cluster with 2 (user) 
> tables and ~160k regions where HBaseFsck went from taking 18 min to 5 minutes.
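
The first bottleneck in the description comes from throwing away the statuses returned by one listing call and then looking each one up again inside a path-only filter. The idea of filtering on the status objects directly can be sketched as below. This is a self-contained illustration with stand-in types: Status and StatusFilterSketch are hypothetical and are not Hadoop's FileStatus or the FileStatusFilter added by the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-ins, not Hadoop or HBase classes.
final class StatusFilterSketch {
  /** A path plus the metadata that the listing call already returned. */
  static final class Status {
    final String path;
    final boolean isDir;

    Status(String path, boolean isDir) {
      this.path = path;
      this.isDir = isDir;
    }
  }

  /**
   * Filters the statuses from a single listing call. The predicate sees the
   * full status, so no per-path lookup (one RPC each) is needed afterwards.
   */
  public static List<String> filter(List<Status> listing, Predicate<Status> keep) {
    List<String> kept = new ArrayList<>();
    for (Status status : listing) {
      if (keep.test(status)) {
        kept.add(status.path);
      }
    }
    return kept;
  }
}
```

With a path-only filter, the equivalent check would have to fetch the status of every path again, sequentially, which is the extra I/O the description complains about.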





[jira] [Commented] (HBASE-16157) The incorrect block cache count and size are caused by removing duplicate block key in the LruBlockCache

2016-07-01 Thread ChiaPing Tsai (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359521#comment-15359521
 ] 

ChiaPing Tsai commented on HBASE-16157:
---

It seems to me the v4 patch is better, for the following reasons:
1) The v4 patch makes LruBlockCache#evictBlocksByHfileName return the correct 
number of evicted blocks.
2) The v4 patch uses multiple threads to test the thread safety of cache eviction.


> The incorrect block cache count and size are caused by removing duplicate 
> block key in the LruBlockCache
> 
>
> Key: HBASE-16157
> URL: https://issues.apache.org/jira/browse/HBASE-16157
> Project: HBase
>  Issue Type: Bug
>Reporter: ChiaPing Tsai
>Priority: Trivial
> Attachments: HBASE-16157-v1.patch, HBASE-16157-v2.patch, 
> HBASE-16157-v3.patch, HBASE-16157-v4.patch
>
>
> {code:title=LruBlockCache.java|borderStyle=solid}
> // Check return value from the Map#remove before updating the metrics
> protected long evictBlock(LruCachedBlock block, boolean evictedByEvictionProcess) {
>   map.remove(block.getCacheKey());
>   updateSizeMetrics(block, true);
>   ...
> }
> {code}
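
The comment in the snippet above asks for the return value of Map#remove to be checked before the metrics are updated. A minimal sketch of that idea follows, assuming a simplified cache keyed by String with a Long heap size per entry; the class EvictSketch and its methods are hypothetical and are not the LruBlockCache code or the attached patch.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical simplified cache, not LruBlockCache.
final class EvictSketch {
  private final ConcurrentHashMap<String, Long> map = new ConcurrentHashMap<>();
  private final AtomicLong size = new AtomicLong();
  private final AtomicLong elements = new AtomicLong();

  /** Adds an entry and counts it only if the key was not already present. */
  public void cache(String key, long heapSize) {
    if (map.putIfAbsent(key, heapSize) == null) {
      size.addAndGet(heapSize);
      elements.incrementAndGet();
    }
  }

  /** Returns the bytes freed, or 0 if the key had already been removed. */
  public long evict(String key) {
    Long removed = map.remove(key);
    if (removed == null) {
      // Nothing was removed (for example, a concurrent caller got there
      // first), so the count and size metrics must stay unchanged.
      return 0L;
    }
    size.addAndGet(-removed);
    elements.decrementAndGet();
    return removed;
  }

  public long size() {
    return size.get();
  }

  public long count() {
    return elements.get();
  }
}
```

Without the null check, evicting the same key twice would decrement the count and size twice, which matches the incorrect metrics named in the issue title.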





[jira] [Updated] (HBASE-16096) Replication keeps accumulating znodes

2016-07-01 Thread Joseph (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph updated HBASE-16096:
---
Status: Open  (was: Patch Available)

> Replication keeps accumulating znodes
> -
>
> Key: HBASE-16096
> URL: https://issues.apache.org/jira/browse/HBASE-16096
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 1.2.0, 2.0.0, 1.3.0
>Reporter: Ashu Pachauri
>Assignee: Joseph
> Attachments: HBASE-16096.patch
>
>
> If there is an error while creating the replication source on adding the 
> peer, the source is not added to the in-memory list of sources but the 
> replication peer is. 
> However, in such a scenario, when you remove the peer, it is deleted from 
> zookeeper successfully, but before removing it from the in-memory list of 
> peers we wait for the corresponding sources to get deleted (which, as we 
> said, don't exist because of the error creating the source). 
> The problem here is the ordering of operations for adding/removing source and 
> peer. 





[jira] [Updated] (HBASE-16096) Replication keeps accumulating znodes

2016-07-01 Thread Joseph (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph updated HBASE-16096:
---
Status: Patch Available  (was: Open)

> Replication keeps accumulating znodes
> -
>
> Key: HBASE-16096
> URL: https://issues.apache.org/jira/browse/HBASE-16096
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 1.2.0, 2.0.0, 1.3.0
>Reporter: Ashu Pachauri
>Assignee: Joseph
> Attachments: HBASE-16096.patch
>
>
> If there is an error while creating the replication source on adding the 
> peer, the source is not added to the in-memory list of sources but the 
> replication peer is. 
> However, in such a scenario, when you remove the peer, it is deleted from 
> zookeeper successfully, but before removing it from the in-memory list of 
> peers we wait for the corresponding sources to get deleted (which, as we 
> said, don't exist because of the error creating the source). 
> The problem here is the ordering of operations for adding/removing source and 
> peer. 





[jira] [Updated] (HBASE-15935) Have a separate Walker task running concurrently with Generator

2016-07-01 Thread Joseph (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph updated HBASE-15935:
---
Status: Patch Available  (was: Open)

> Have a separate Walker task running concurrently with Generator   
> --
>
> Key: HBASE-15935
> URL: https://issues.apache.org/jira/browse/HBASE-15935
> Project: HBase
>  Issue Type: Sub-task
>  Components: integration tests
>Reporter: Joseph
>Assignee: Joseph
>Priority: Minor
> Attachments: HBASE-15935.patch
>
>
> Keep track of which linked lists have been flushed in an HBase table, so that 
> we can concurrently Walk these lists during the Generation phase. This will 
> allow us to test:
> 1. HBase under concurrent read/writes
> 2. Availability of data immediately after flushes (as opposed to waiting till 
> the Verification phase)
> The review diff can be found at:
> https://reviews.apache.org/r/48294/





[jira] [Updated] (HBASE-15935) Have a separate Walker task running concurrently with Generator

2016-07-01 Thread Joseph (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph updated HBASE-15935:
---
Attachment: HBASE-15935.patch

> Have a separate Walker task running concurrently with Generator   
> --
>
> Key: HBASE-15935
> URL: https://issues.apache.org/jira/browse/HBASE-15935
> Project: HBase
>  Issue Type: Sub-task
>  Components: integration tests
>Reporter: Joseph
>Assignee: Joseph
>Priority: Minor
> Attachments: HBASE-15935.patch
>
>
> Keep track of which linked lists have been flushed in an HBase table, so that 
> we can concurrently Walk these lists during the Generation phase. This will 
> allow us to test:
> 1. HBase under concurrent read/writes
> 2. Availability of data immediately after flushes (as opposed to waiting till 
> the Verification phase)
> The review diff can be found at:
> https://reviews.apache.org/r/48294/





[jira] [Updated] (HBASE-15935) Have a separate Walker task running concurrently with Generator

2016-07-01 Thread Joseph (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph updated HBASE-15935:
---
Attachment: (was: HBASE-15935.patch)

> Have a separate Walker task running concurrently with Generator   
> --
>
> Key: HBASE-15935
> URL: https://issues.apache.org/jira/browse/HBASE-15935
> Project: HBase
>  Issue Type: Sub-task
>  Components: integration tests
>Reporter: Joseph
>Assignee: Joseph
>Priority: Minor
>
> Keep track of which linked lists have been flushed in an HBase table, so that 
> we can concurrently Walk these lists during the Generation phase. This will 
> allow us to test:
> 1. HBase under concurrent read/writes
> 2. Availability of data immediately after flushes (as opposed to waiting till 
> the Verification phase)
> The review diff can be found at:
> https://reviews.apache.org/r/48294/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16125) RegionMover uses hardcoded, Unix-style tmp folder - breaks Windows

2016-07-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359512#comment-15359512
 ] 

Hudson commented on HBASE-16125:


FAILURE: Integrated in HBase-Trunk_matrix #1150 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/1150/])
HBASE-16125 RegionMover uses hardcoded, Unix-style tmp folder - breaks (tedyu: 
rev d92a99da0e37563ece782f61f63745b4d8fb9bdf)
* hbase-server/src/main/java/org/apache/hadoop/hbase/util/RegionMover.java


> RegionMover uses hardcoded, Unix-style tmp folder - breaks Windows
> --
>
> Key: HBASE-16125
> URL: https://issues.apache.org/jira/browse/HBASE-16125
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16125-branch-1.1-v1.patch, HBASE-16125-v1.patch
>
>
> The issue exists in all branches.
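
The usual portable alternative to a hardcoded Unix-style "/tmp" path is the JVM's java.io.tmpdir system property. The sketch below shows only that general technique; the class TmpDirSketch and the method tmpFile are hypothetical, and this is not claimed to be what the attached patch does in RegionMover.

```java
import java.io.File;

// Hypothetical helper, not part of RegionMover.
final class TmpDirSketch {
  /** Builds a file under the platform's temp directory instead of "/tmp". */
  public static File tmpFile(String name) {
    // java.io.tmpdir resolves to a platform-appropriate location,
    // for example a per-user temp folder on Windows.
    return new File(System.getProperty("java.io.tmpdir"), name);
  }
}
```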



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16157) The incorrect block cache count and size are caused by removing duplicate block key in the LruBlockCache

2016-07-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359510#comment-15359510
 ] 

Hudson commented on HBASE-16157:


FAILURE: Integrated in HBase-Trunk_matrix #1150 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/1150/])
HBASE-16157 The incorrect block cache count and size are caused by (tedyu: rev 
5a7c9939cbe115c0d12d9cfe8bf9f3b3d11ac69e)
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestLruBlockCache.java


> The incorrect block cache count and size are caused by removing duplicate 
> block key in the LruBlockCache
> 
>
> Key: HBASE-16157
> URL: https://issues.apache.org/jira/browse/HBASE-16157
> Project: HBase
>  Issue Type: Bug
>Reporter: ChiaPing Tsai
>Priority: Trivial
> Attachments: HBASE-16157-v1.patch, HBASE-16157-v2.patch, 
> HBASE-16157-v3.patch, HBASE-16157-v4.patch
>
>
> {code:title=LruBlockCache.java|borderStyle=solid}
> // Check return value from the Map#remove before updating the metrics
> protected long evictBlock(LruCachedBlock block, boolean evictedByEvictionProcess) {
>   map.remove(block.getCacheKey());
>   updateSizeMetrics(block, true);
>   ...
> }
> {code}
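
The comment in the snippet asks for the return value of Map#remove to be checked before the metrics are updated. A minimal sketch of that idea follows, assuming a heavily simplified cache. TinyBlockCache, its String keys and its two counters are hypothetical stand-ins for illustration, not the LruBlockCache API or the committed patch.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified cache used only to illustrate the idea; not the HBase class.
class TinyBlockCache {
  private final ConcurrentHashMap<String, byte[]> map = new ConcurrentHashMap<>();
  private final AtomicLong size = new AtomicLong();
  private final AtomicLong count = new AtomicLong();

  /** Caches a block and updates the metrics only if the key was not already present. */
  void cacheBlock(String key, byte[] block) {
    if (map.putIfAbsent(key, block) == null) {
      size.addAndGet(block.length);
      count.incrementAndGet();
    }
  }

  /** Returns the number of bytes freed, or 0 if the key had already been removed. */
  long evictBlock(String key) {
    byte[] removed = map.remove(key);
    if (removed == null) {
      // Someone else evicted this key first: leave size and count untouched.
      return 0;
    }
    size.addAndGet(-removed.length);
    count.decrementAndGet();
    return removed.length;
  }

  long size() { return size.get(); }

  long count() { return count.get(); }
}
```

Evicting the same key twice leaves both counters at zero in this sketch, whereas an unconditional metrics update would subtract the block a second time.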





[jira] [Commented] (HBASE-16133) RSGroupBasedLoadBalancer.retainAssignment() might miss a region

2016-07-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359511#comment-15359511
 ] 

Hudson commented on HBASE-16133:


FAILURE: Integrated in HBase-Trunk_matrix #1150 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/1150/])
HBASE-16133 RSGroupBasedLoadBalancer.retainAssignment() might miss a (enis: rev 
bc70dc00bb05580bdc597cedf152f8add1f48d90)
* hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupBasedLoadBalancer.java


> RSGroupBasedLoadBalancer.retainAssignment() might miss a region
> ---
>
> Key: HBASE-16133
> URL: https://issues.apache.org/jira/browse/HBASE-16133
> Project: HBase
>  Issue Type: Bug
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 2.0.0
>
> Attachments: hbase-16133_v1.patch
>
>
> We have seen in the tests through the IntegrationTestRSGroup that we may miss 
> assigning a region. 
> It is a simple logic error here: 
> {code}
> if (server != null && !assignments.containsKey(server)) {
>   assignments.put(server, new ArrayList());
> } else if (server != null) {
>   assignments.get(server).add(region);
> } else {
> {code}
> In the first branch, we create the ArrayList for the server but never add 
> the region to it, so that region is dropped. 
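
One way to avoid this class of bug is to create the list and add the region in a single step. The sketch below assumes plain Strings in place of the server and region types; RetainAssignmentSketch and assign() are hypothetical names for illustration and are not taken from the attached patch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; Strings stand in for the real server and region types.
class RetainAssignmentSketch {
  /** Groups regions by server, creating each server's list on first use. */
  static Map<String, List<String>> assign(Map<String, String> regionToServer) {
    Map<String, List<String>> assignments = new HashMap<>();
    for (Map.Entry<String, String> entry : regionToServer.entrySet()) {
      String region = entry.getKey();
      String server = entry.getValue();
      if (server != null) {
        // Create the list if needed and add the region in the same step,
        // so the first region seen for a server is not lost.
        assignments.computeIfAbsent(server, s -> new ArrayList<>()).add(region);
      }
      // Regions without a server fall to the else branch that the quoted
      // snippet elides; they are simply skipped in this sketch.
    }
    return assignments;
  }
}
```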





[jira] [Updated] (HBASE-16157) The incorrect block cache count and size are caused by removing duplicate block key in the LruBlockCache

2016-07-01 Thread ChiaPing Tsai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ChiaPing Tsai updated HBASE-16157:
--
Attachment: HBASE-16157-v4.patch

> The incorrect block cache count and size are caused by removing duplicate 
> block key in the LruBlockCache
> 
>
> Key: HBASE-16157
> URL: https://issues.apache.org/jira/browse/HBASE-16157
> Project: HBase
>  Issue Type: Bug
>Reporter: ChiaPing Tsai
>Priority: Trivial
> Attachments: HBASE-16157-v1.patch, HBASE-16157-v2.patch, 
> HBASE-16157-v3.patch, HBASE-16157-v4.patch
>
>
> {code:title=LruBlockCache.java|borderStyle=solid}
> // Check return value from the Map#remove before updating the metrics
> protected long evictBlock(LruCachedBlock block, boolean evictedByEvictionProcess) {
>   map.remove(block.getCacheKey());
>   updateSizeMetrics(block, true);
>   ...
> }
> {code}





[jira] [Updated] (HBASE-15935) Have a separate Walker task running concurrently with Generator

2016-07-01 Thread Joseph (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph updated HBASE-15935:
---
Attachment: (was: HBASE-15935.patch)

> Have a separate Walker task running concurrently with Generator   
> --
>
> Key: HBASE-15935
> URL: https://issues.apache.org/jira/browse/HBASE-15935
> Project: HBase
>  Issue Type: Sub-task
>  Components: integration tests
>Reporter: Joseph
>Assignee: Joseph
>Priority: Minor
> Attachments: HBASE-15935.patch
>
>
> Keep track of which linked lists have been flushed in an HBase table, so that 
> we can concurrently Walk these lists during the Generation phase. This will 
> allow us to test:
> 1. HBase under concurrent read/writes
> 2. Availability of data immediately after flushes (as opposed to waiting till 
> the Verification phase)
> The review diff can be found at:
> https://reviews.apache.org/r/48294/





[jira] [Updated] (HBASE-15935) Have a separate Walker task running concurrently with Generator

2016-07-01 Thread Joseph (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph updated HBASE-15935:
---
Status: Open  (was: Patch Available)

> Have a separate Walker task running concurrently with Generator   
> --
>
> Key: HBASE-15935
> URL: https://issues.apache.org/jira/browse/HBASE-15935
> Project: HBase
>  Issue Type: Sub-task
>  Components: integration tests
>Reporter: Joseph
>Assignee: Joseph
>Priority: Minor
> Attachments: HBASE-15935.patch
>
>
> Keep track of which linked lists have been flushed in an HBase table, so that 
> we can concurrently Walk these lists during the Generation phase. This will 
> allow us to test:
> 1. HBase under concurrent read/writes
> 2. Availability of data immediately after flushes (as opposed to waiting till 
> the Verification phase)
> The review diff can be found at:
> https://reviews.apache.org/r/48294/





[jira] [Updated] (HBASE-15935) Have a separate Walker task running concurrently with Generator

2016-07-01 Thread Joseph (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph updated HBASE-15935:
---
Attachment: HBASE-15935.patch

> Have a separate Walker task running concurrently with Generator   
> --
>
> Key: HBASE-15935
> URL: https://issues.apache.org/jira/browse/HBASE-15935
> Project: HBase
>  Issue Type: Sub-task
>  Components: integration tests
>Reporter: Joseph
>Assignee: Joseph
>Priority: Minor
> Attachments: HBASE-15935.patch
>
>
> Keep track of which linked lists have been flushed in an HBase table, so that 
> we can concurrently Walk these lists during the Generation phase. This will 
> allow us to test:
> 1. HBase under concurrent read/writes
> 2. Availability of data immediately after flushes (as opposed to waiting till 
> the Verification phase)
> The review diff can be found at:
> https://reviews.apache.org/r/48294/





[jira] [Updated] (HBASE-16157) The incorrect block cache count and size are caused by removing duplicate block key in the LruBlockCache

2016-07-01 Thread ChiaPing Tsai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ChiaPing Tsai updated HBASE-16157:
--
Status: Open  (was: Patch Available)

> The incorrect block cache count and size are caused by removing duplicate 
> block key in the LruBlockCache
> 
>
> Key: HBASE-16157
> URL: https://issues.apache.org/jira/browse/HBASE-16157
> Project: HBase
>  Issue Type: Bug
>Reporter: ChiaPing Tsai
>Priority: Trivial
> Attachments: HBASE-16157-v1.patch, HBASE-16157-v2.patch, 
> HBASE-16157-v3.patch
>
>
> {code:title=LruBlockCache.java|borderStyle=solid}
> // Check return value from the Map#remove before updating the metrics
> protected long evictBlock(LruCachedBlock block, boolean evictedByEvictionProcess) {
>   map.remove(block.getCacheKey());
>   updateSizeMetrics(block, true);
>   ...
> }
> {code}





[jira] [Commented] (HBASE-16132) Scan does not return all the result when regionserver is busy

2016-07-01 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359493#comment-15359493
 ] 

Devaraj Das commented on HBASE-16132:
-

So if you look at the RpcRetryingCallerWithReadReplicas.call() implementation, 
it first does a poll (to wait for a certain timeout) -
{code}
  try {
    // wait for the timeout to see whether the primary responds back
    Future f = cs.poll(timeBeforeReplicas, TimeUnit.MICROSECONDS); // Yes, microseconds
    if (f != null) {
      return f.get(); //great we got a response
    }
  }
{code}
After that, it does a take() / get()
{code}
try {
  try {
    Future f = cs.take();
    return f.get();
  } catch (ExecutionException e) {
    throwEnrichedException(e, retries);
  }
} catch (CancellationException e) {
{code}
In ScannerCallableWithReplicas.call(), it does a poll() in both places. But 
after the second poll(), it might be better to do a get(). That should take 
care of throwing the exception (look at the implementation of get()). On a 
related note, should the second call to poll() be replaced with a call to 
take()? The two differ: poll() gives up and returns null once the timeout 
expires, while take() blocks until a result is available. I haven't analyzed 
the side effects of doing that...
I am okay with your patch but wanted to bring the above up and see if it makes 
sense.
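
To make the poll()/take() difference concrete, here is a small sketch that uses java.util.concurrent.ExecutorCompletionService as a stand-in. That is an assumption: the completion service HBase uses in these classes is its own and may behave differently in detail. PollVersusTake, demo() and the "row" result are hypothetical names for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; uses the JDK completion service, not the HBase one.
class PollVersusTake {
  static String demo() {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      CompletionService<String> cs = new ExecutorCompletionService<>(pool);
      CountDownLatch release = new CountDownLatch(1);

      // A "busy server": the task cannot finish until the latch is released.
      Callable<String> slowCall = () -> {
        release.await();
        return "row";
      };
      cs.submit(slowCall);

      // Timed poll(): nothing has completed yet, so it returns null after the timeout.
      Future<String> early = cs.poll(50, TimeUnit.MILLISECONDS);

      // take(): blocks until the task completes, so it never returns null.
      release.countDown();
      Future<String> done = cs.take();

      return (early == null ? "poll=null" : "poll=future") + ", take=" + done.get();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdownNow();
    }
  }
}
```

A caller that treats the null from a timed-out poll() as "no more data" ends early, which matches the symptom described in the issue; a blocking take() followed by get() either yields a result or surfaces the failure as an exception.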

> Scan does not return all the result when regionserver is busy
> -
>
> Key: HBASE-16132
> URL: https://issues.apache.org/jira/browse/HBASE-16132
> Project: HBase
>  Issue Type: Bug
>Reporter: binlijin
>Assignee: binlijin
> Attachments: HBASE-16132.patch, HBASE-16132_v2.patch, 
> HBASE-16132_v3.patch, HBASE-16132_v3.patch, TestScanMissingData.java
>
>
> We have found a corner case: when a regionserver stays busy for a long 
> time, some scanners may return null even though they have not scanned all 
> the data. In ScannerCallableWithReplicas there is a case that is not handled 
> correctly: when cs.poll times out without returning any result, the call 
> returns null, so the scan gets a null result and ends. 
>  {code}
> try {
>   Future<Pair<Result[], ScannerCallable>> f = cs.poll(timeout, TimeUnit.MILLISECONDS);
>   if (f != null) {
>     Pair<Result[], ScannerCallable> r = f.get(timeout, TimeUnit.MILLISECONDS);
>     if (r != null && r.getSecond() != null) {
>       updateCurrentlyServingReplica(r.getSecond(), r.getFirst(), done, pool);
>     }
>     return r == null ? null : r.getFirst(); // great we got an answer
>   }
> } catch (ExecutionException e) {
>   RpcRetryingCallerWithReadReplicas.throwEnrichedException(e, retries);
> } catch (CancellationException e) {
>   throw new InterruptedIOException(e.getMessage());
> } catch (InterruptedException e) {
>   throw new InterruptedIOException(e.getMessage());
> } catch (TimeoutException e) {
>   throw new InterruptedIOException(e.getMessage());
> } finally {
>   // We get there because we were interrupted or because one or more of the
>   // calls succeeded or failed. In all case, we stop all our tasks.
>   cs.cancelAll();
> }
> return null; // unreachable
>  {code}





[jira] [Commented] (HBASE-15935) Have a separate Walker task running concurrently with Generator

2016-07-01 Thread Joseph (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359491#comment-15359491
 ] 

Joseph commented on HBASE-15935:


I've done a bit of testing of this on a cluster and am pretty confident in this 
version. Could I get a review of this patch? 

> Have a separate Walker task running concurrently with Generator   
> --
>
> Key: HBASE-15935
> URL: https://issues.apache.org/jira/browse/HBASE-15935
> Project: HBase
>  Issue Type: Sub-task
>  Components: integration tests
>Reporter: Joseph
>Assignee: Joseph
>Priority: Minor
> Attachments: HBASE-15935.patch
>
>
> Keep track of which linked lists have been flushed in an HBase table, so that 
> we can concurrently Walk these lists during the Generation phase. This will 
> allow us to test:
> 1. HBase under concurrent read/writes
> 2. Availability of data immediately after flushes (as opposed to waiting till 
> the Verification phase)
> The review diff can be found at:
> https://reviews.apache.org/r/48294/





[jira] [Updated] (HBASE-15935) Have a separate Walker task running concurrently with Generator

2016-07-01 Thread Joseph (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph updated HBASE-15935:
---
Attachment: HBASE-15935.patch

> Have a separate Walker task running concurrently with Generator   
> --
>
> Key: HBASE-15935
> URL: https://issues.apache.org/jira/browse/HBASE-15935
> Project: HBase
>  Issue Type: Sub-task
>  Components: integration tests
>Reporter: Joseph
>Assignee: Joseph
>Priority: Minor
> Attachments: HBASE-15935.patch
>
>
> Keep track of which linked lists have been flushed in an HBase table, so that 
> we can concurrently Walk these lists during the Generation phase. This will 
> allow us to test:
> 1. HBase under concurrent read/writes
> 2. Availability of data immediately after flushes (as opposed to waiting till 
> the Verification phase)
> The review diff can be found at:
> https://reviews.apache.org/r/48294/





[jira] [Commented] (HBASE-15935) Have a separate Walker task running concurrently with Generator

2016-07-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359490#comment-15359490
 ] 

Hadoop QA commented on HBASE-15935:
---

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| -1 | patch | 0m 3s | HBASE-15935 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.2.1/precommit-patchnames for help. |

|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12815806/HBASE-15935.patch |
| JIRA Issue | HBASE-15935 |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/2485/console |
| Powered by | Apache Yetus 0.2.1   http://yetus.apache.org |


This message was automatically generated.



> Have a separate Walker task running concurrently with Generator   
> --
>
> Key: HBASE-15935
> URL: https://issues.apache.org/jira/browse/HBASE-15935
> Project: HBase
>  Issue Type: Sub-task
>  Components: integration tests
>Reporter: Joseph
>Assignee: Joseph
>Priority: Minor
> Attachments: HBASE-15935.patch
>
>
> Keep track of which linked lists have been flushed in an HBase table, so that 
> we can concurrently Walk these lists during the Generation phase. This will 
> allow us to test:
> 1. HBase under concurrent read/writes
> 2. Availability of data immediately after flushes (as opposed to waiting till 
> the Verification phase)
> The review diff can be found at:
> https://reviews.apache.org/r/48294/





[jira] [Commented] (HBASE-16157) The incorrect block cache count and size are caused by removing duplicate block key in the LruBlockCache

2016-07-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359488#comment-15359488
 ] 

Hadoop QA commented on HBASE-16157:
---

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| -1 | patch | 0m 4s | HBASE-16157 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.2.1/precommit-patchnames for help. |

|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12815805/HBASE-16157-v3.patch |
| JIRA Issue | HBASE-16157 |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/2484/console |
| Powered by | Apache Yetus 0.2.1   http://yetus.apache.org |


This message was automatically generated.



> The incorrect block cache count and size are caused by removing duplicate 
> block key in the LruBlockCache
> 
>
> Key: HBASE-16157
> URL: https://issues.apache.org/jira/browse/HBASE-16157
> Project: HBase
>  Issue Type: Bug
>Reporter: ChiaPing Tsai
>Priority: Trivial
> Attachments: HBASE-16157-v1.patch, HBASE-16157-v2.patch, 
> HBASE-16157-v3.patch
>
>
> {code:title=LruBlockCache.java|borderStyle=solid}
> // Check return value from the Map#remove before updating the metrics
> protected long evictBlock(LruCachedBlock block, boolean evictedByEvictionProcess) {
>   map.remove(block.getCacheKey());
>   updateSizeMetrics(block, true);
>   ...
> }
> {code}





[jira] [Updated] (HBASE-15935) Have a separate Walker task running concurrently with Generator

2016-07-01 Thread Joseph (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph updated HBASE-15935:
---
Status: Open  (was: Patch Available)

> Have a separate Walker task running concurrently with Generator   
> --
>
> Key: HBASE-15935
> URL: https://issues.apache.org/jira/browse/HBASE-15935
> Project: HBase
>  Issue Type: Sub-task
>  Components: integration tests
>Reporter: Joseph
>Assignee: Joseph
>Priority: Minor
> Attachments: HBASE-15935.patch
>
>
> Keep track of which linked lists have been flushed in an HBase table, so that 
> we can concurrently Walk these lists during the Generation phase. This will 
> allow us to test:
> 1. HBase under concurrent read/writes
> 2. Availability of data immediately after flushes (as opposed to waiting till 
> the Verification phase)
> The review diff can be found at:
> https://reviews.apache.org/r/48294/





[jira] [Updated] (HBASE-15935) Have a separate Walker task running concurrently with Generator

2016-07-01 Thread Joseph (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph updated HBASE-15935:
---
Status: Patch Available  (was: Open)

> Have a separate Walker task running concurrently with Generator   
> --
>
> Key: HBASE-15935
> URL: https://issues.apache.org/jira/browse/HBASE-15935
> Project: HBase
>  Issue Type: Sub-task
>  Components: integration tests
>Reporter: Joseph
>Assignee: Joseph
>Priority: Minor
> Attachments: HBASE-15935.patch
>
>
> Keep track of which linked lists have been flushed in an HBase table, so that 
> we can concurrently Walk these lists during the Generation phase. This will 
> allow us to test:
> 1. HBase under concurrent read/writes
> 2. Availability of data immediately after flushes (as opposed to waiting till 
> the Verification phase)
> The review diff can be found at:
> https://reviews.apache.org/r/48294/





[jira] [Assigned] (HBASE-16158) Automate runs of check_compatibility.sh on upstream infra

2016-07-01 Thread Dima Spivak (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dima Spivak reassigned HBASE-16158:
---

Assignee: Dima Spivak

> Automate runs of check_compatibility.sh on upstream infra
> -
>
> Key: HBASE-16158
> URL: https://issues.apache.org/jira/browse/HBASE-16158
> Project: HBase
>  Issue Type: Task
>  Components: test
>Reporter: Dima Spivak
>Assignee: Dima Spivak
>
> Now that we got {{check_compatibility.sh}} working again, perhaps we should 
> think about having it run regularly upstream? One possibility would be to tie 
> it into Yetus runs so that the tool gets run on every commit between the 
> branch in question and a designated Git reference (e.g. branch-1.2 could run 
> against the earlier release of the 1.2 line) and simply grepping the output 
> to make sure that the number of problems and warnings doesn't exceed a 
> designated number. The other would be to run every branch in its own Jenkins 
> job following the same workflow, but on a nightly basis. What do you guys and 
> gals think would be best?





[jira] [Updated] (HBASE-15935) Have a separate Walker task running concurrently with Generator

2016-07-01 Thread Joseph (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph updated HBASE-15935:
---
Attachment: (was: HBASE-15935.patch)

> Have a separate Walker task running concurrently with Generator   
> --
>
> Key: HBASE-15935
> URL: https://issues.apache.org/jira/browse/HBASE-15935
> Project: HBase
>  Issue Type: Sub-task
>  Components: integration tests
>Reporter: Joseph
>Assignee: Joseph
>Priority: Minor
>
> Keep track of which linked lists have been flushed in an HBase table, so that 
> we can concurrently Walk these lists during the Generation phase. This will 
> allow us to test:
> 1. HBase under concurrent read/writes
> 2. Availability of data immediately after flushes (as opposed to waiting till 
> the Verification phase)
> The review diff can be found at:
> https://reviews.apache.org/r/48294/





[jira] [Updated] (HBASE-16157) The incorrect block cache count and size are caused by removing duplicate block key in the LruBlockCache

2016-07-01 Thread ChiaPing Tsai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ChiaPing Tsai updated HBASE-16157:
--
Attachment: HBASE-16157-v3.patch

> The incorrect block cache count and size are caused by removing duplicate 
> block key in the LruBlockCache
> 
>
> Key: HBASE-16157
> URL: https://issues.apache.org/jira/browse/HBASE-16157
> Project: HBase
>  Issue Type: Bug
>Reporter: ChiaPing Tsai
>Priority: Trivial
> Attachments: HBASE-16157-v1.patch, HBASE-16157-v2.patch, 
> HBASE-16157-v3.patch
>
>
> {code:title=LruBlockCache.java|borderStyle=solid}
> // Check return value from the Map#remove before updating the metrics
> protected long evictBlock(LruCachedBlock block, boolean evictedByEvictionProcess) {
>   map.remove(block.getCacheKey());
>   updateSizeMetrics(block, true);
>   ...
> }
> {code}





[jira] [Updated] (HBASE-16157) The incorrect block cache count and size are caused by removing duplicate block key in the LruBlockCache

2016-07-01 Thread ChiaPing Tsai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ChiaPing Tsai updated HBASE-16157:
--
Status: Patch Available  (was: Open)

> The incorrect block cache count and size are caused by removing duplicate 
> block key in the LruBlockCache
> 
>
> Key: HBASE-16157
> URL: https://issues.apache.org/jira/browse/HBASE-16157
> Project: HBase
>  Issue Type: Bug
>Reporter: ChiaPing Tsai
>Priority: Trivial
> Attachments: HBASE-16157-v1.patch, HBASE-16157-v2.patch, 
> HBASE-16157-v3.patch
>
>
> {code:title=LruBlockCache.java|borderStyle=solid}
> // Check return value from the Map#remove before updating the metrics
> protected long evictBlock(LruCachedBlock block, boolean evictedByEvictionProcess) {
>   map.remove(block.getCacheKey());
>   updateSizeMetrics(block, true);
>   ...
> }
> {code}





[jira] [Commented] (HBASE-3727) MultiHFileOutputFormat

2016-07-01 Thread Esteban Gutierrez (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359471#comment-15359471
 ] 

Esteban Gutierrez commented on HBASE-3727:
--

Sure [~easyliangjob]

> MultiHFileOutputFormat
> --
>
> Key: HBASE-3727
> URL: https://issues.apache.org/jira/browse/HBASE-3727
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 2.0.0
>Reporter: Andrew Purtell
>Assignee: yi liang
>Priority: Minor
> Attachments: HBASE-3727-V3.patch, MH2.patch, 
> MultiHFileOutputFormat.java, MultiHFileOutputFormat.java, 
> MultiHFileOutputFormat.java, TestMultiHFileOutputFormat.java
>
>
> Like MultiTableOutputFormat, but outputting HFiles. Key is tablename as an 
> IBW. Creates sub-writers (code cut and pasted from HFileOutputFormat) on 
> demand that produce HFiles in per-table subdirectories of the configured 
> output path. Does not currently support partitioning for existing tables / 
> incremental update.





[jira] [Commented] (HBASE-16114) Get regionLocation of required regions only for MR jobs

2016-07-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359463#comment-15359463
 ] 

Hudson commented on HBASE-16114:


SUCCESS: Integrated in HBase-1.4 #263 (See 
[https://builds.apache.org/job/HBase-1.4/263/])
HBASE-16114 Get regionLocation of required regions only for MR jobs (tedyu: rev 
5bc06555292d28e0e53861f4b51300c38db1184e)
* hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java


> Get regionLocation of required regions only for MR jobs
> ---
>
> Key: HBASE-16114
> URL: https://issues.apache.org/jira/browse/HBASE-16114
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 1.2.1
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
>Priority: Minor
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16114.branch-1.000.patch, 
> HBASE-16114.master.001.patch, HBASE-16114.master.001.patch, 
> HBASE-16114.master.002.patch
>
>
> We should only get the location of regions required during the MR job. This 
> will help for jobs with large regions but the job itself scans only a small 
> portion of it. Similar changes can be seen in MultiInputFormatBase.java.





[jira] [Commented] (HBASE-16052) Improve HBaseFsck Scalability

2016-07-01 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359443#comment-15359443
 ] 

Ted Yu commented on HBASE-16052:


Ping [~mantonov] [~busbey] [~ndimiduk] for green light

> Improve HBaseFsck Scalability
> -
>
> Key: HBASE-16052
> URL: https://issues.apache.org/jira/browse/HBASE-16052
> Project: HBase
>  Issue Type: Improvement
>  Components: hbck
>Reporter: Ben Lau
>Assignee: Ben Lau
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16052-master.patch, HBASE-16052-v3-branch-1.patch, 
> HBASE-16052-v3-master.patch
>
>
> There are some problems with HBaseFsck that make it unnecessarily slow 
> especially for large tables or clusters with many regions.  
> This patch tries to fix the biggest bottlenecks and also include a couple of 
> bug fixes for some of the race conditions caused by gathering and holding 
> state about a live cluster that is no longer true by the time you use that 
> state in Fsck processing.  These race conditions cause Fsck to crash and 
> become unusable on large clusters with lots of region splits/merges.
> Here are some scalability/performance problems in HBaseFsck and the changes 
> the patch makes:
> - Unnecessary I/O and RPCs caused by fetching an array of FileStatuses and 
> then discarding everything but the Paths, then passing the Paths to a 
> PathFilter, and then having the filter look up the (previously discarded) 
> FileStatuses of the paths again.  This is actually worse than double I/O 
> because the first lookup obtains a batch of FileStatuses while all the other 
> lookups are individual RPCs performed sequentially.
> -- Avoid this by adding a FileStatusFilter so that filtering can happen 
> directly on FileStatuses
> -- This performance bug affects more than Fsck, but also to some extent 
> things like snapshots, hfile archival, etc.  I didn't have time to look too 
> deep into other things affected and didn't want to increase the scope of this 
> ticket so I focus mostly on Fsck and make only a few improvements to other 
> codepaths.  The changes in this patch though should make it fairly easy to 
> fix other code paths in later jiras if we feel there are some other features 
> strongly impacted by this problem.  
> - OfflineReferenceFileRepair is the most expensive part of Fsck (often 50% of 
> Fsck runtime) and the running time scales with the number of store files, yet 
> the function is completely serial
> -- Make offlineReferenceFileRepair multithreaded
> - LoadHdfsRegionDirs() uses table-level concurrency, which is a big 
> bottleneck if you have 1 large cluster with 1 very large table that has 
> nearly all the regions
> -- Change loadHdfsRegionDirs() to region-level parallelism instead of 
> table-level parallelism for operations.
> The changes benefit all clusters but are especially noticeable for large 
> clusters with a few very large tables.  On our version of 0.98 with the 
> original patch we had a moderately sized production cluster with 2 (user) 
> tables and ~160k regions where HBaseFsck went from taking 18 min to 5 minutes.
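
The first bullet above (filtering on paths forces each status to be looked up again) can be sketched with a toy file system that counts single-path lookups. Everything here is a hypothetical stand-in for illustration: Status, CountingFs and the two helper methods are not the Hadoop or HBase APIs, and the FileStatusFilter in the patch may be shaped differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch; none of these types are the Hadoop or HBase classes.
class StatusFilterSketch {
  /** Stand-in for a file status record: a path plus one piece of metadata. */
  static final class Status {
    final String path;
    final boolean isDir;

    Status(String path, boolean isDir) {
      this.path = path;
      this.isDir = isDir;
    }
  }

  /** Toy file system that counts single-path lookups; each would be one RPC. */
  static final class CountingFs {
    private final Map<String, Status> entries = new LinkedHashMap<>();
    int singleLookups = 0;

    void add(String path, boolean isDir) {
      entries.put(path, new Status(path, isDir));
    }

    /** One batched listing call that already carries every status. */
    List<Status> listStatus() {
      return new ArrayList<>(entries.values());
    }

    /** One lookup per path. */
    Status getStatus(String path) {
      singleLookups++;
      return entries.get(path);
    }
  }

  /** Path-based filtering: listed statuses are discarded, then fetched again one by one. */
  static List<String> dirsViaPathFilter(CountingFs fs) {
    List<String> dirs = new ArrayList<>();
    for (Status listed : fs.listStatus()) {
      String path = listed.path;        // only the path is kept
      if (fs.getStatus(path).isDir) {   // status looked up again, one call per path
        dirs.add(path);
      }
    }
    return dirs;
  }

  /** Status-based filtering: the filter sees the status the listing already returned. */
  static List<String> dirsViaStatusFilter(CountingFs fs, Predicate<Status> filter) {
    List<String> dirs = new ArrayList<>();
    for (Status listed : fs.listStatus()) {
      if (filter.test(listed)) {
        dirs.add(listed.path);
      }
    }
    return dirs;
  }
}
```

Both helpers return the same directories, but the path-based one performs one extra lookup per listed entry while the status-based one performs none.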





[jira] [Commented] (HBASE-16052) Improve HBaseFsck Scalability

2016-07-01 Thread Stephen Yuan Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359439#comment-15359439
 ] 

Stephen Yuan Jiang commented on HBASE-16052:


[~benlau] and [~tedyu], any reason that this patch did not go to other 1.x 
branches (eg. branch-1.1)?  Also I thought Ben did this for 0.98 and tested in 
production, so maybe we should also put it into 0.98.

> Improve HBaseFsck Scalability
> -
>
> Key: HBASE-16052
> URL: https://issues.apache.org/jira/browse/HBASE-16052
> Project: HBase
>  Issue Type: Improvement
>  Components: hbck
>Reporter: Ben Lau
>Assignee: Ben Lau
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16052-master.patch, HBASE-16052-v3-branch-1.patch, 
> HBASE-16052-v3-master.patch
>
>
> There are some problems with HBaseFsck that make it unnecessarily slow 
> especially for large tables or clusters with many regions.  
> This patch tries to fix the biggest bottlenecks and also include a couple of 
> bug fixes for some of the race conditions caused by gathering and holding 
> state about a live cluster that is no longer true by the time you use that 
> state in Fsck processing.  These race conditions cause Fsck to crash and 
> become unusable on large clusters with lots of region splits/merges.
> Here are some scalability/performance problems in HBaseFsck and the changes 
> the patch makes:
> - Unnecessary I/O and RPCs caused by fetching an array of FileStatuses and 
> then discarding everything but the Paths, then passing the Paths to a 
> PathFilter, and then having the filter look up the (previously discarded) 
> FileStatuses of the paths again.  This is actually worse than double I/O 
> because the first lookup obtains a batch of FileStatuses while all the other 
> lookups are individual RPCs performed sequentially.
> -- Avoid this by adding a FileStatusFilter so that filtering can happen 
> directly on FileStatuses
> -- This performance bug affects more than Fsck, but also to some extent 
> things like snapshots, hfile archival, etc.  I didn't have time to look too 
> deep into other things affected and didn't want to increase the scope of this 
> ticket so I focus mostly on Fsck and make only a few improvements to other 
> codepaths.  The changes in this patch though should make it fairly easy to 
> fix other code paths in later jiras if we feel there are some other features 
> strongly impacted by this problem.  
> - offlineReferenceFileRepair() is the most expensive part of Fsck (often 50% of 
> Fsck runtime) and its running time scales with the number of store files, yet 
> the function is completely serial
> -- Make offlineReferenceFileRepair() multithreaded
> - loadHdfsRegionDirs() uses table-level concurrency, which is a big 
> bottleneck if you have 1 large cluster with 1 very large table that has 
> nearly all the regions
> -- Change loadHdfsRegionDirs() to use region-level parallelism instead of 
> table-level parallelism for operations.
> The changes benefit all clusters but are especially noticeable for large 
> clusters with a few very large tables.  On our version of 0.98 with the 
> original patch we had a moderately sized production cluster with 2 (user) 
> tables and ~160k regions where HBaseFsck went from taking 18 minutes to 5 minutes.
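
The double-lookup problem described in the first bullet can be sketched in isolation. The snippet below is only an illustration under stated assumptions: StatusEntry, CountingFs and StatusFilterSketch are hypothetical stand-ins invented here, not Hadoop's FileStatus/FileSystem and not the FileStatusFilter from the patch. It counts how many individual lookups each filtering style triggers after one batched listing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for a file status entry (NOT Hadoop's FileStatus).
final class StatusEntry {
    final String path;
    final boolean isDirectory;

    StatusEntry(String path, boolean isDirectory) {
        this.path = path;
        this.isDirectory = isDirectory;
    }
}

// Hypothetical file system that counts individual per-path lookups.
final class CountingFs {
    private final List<StatusEntry> entries;
    int individualLookups = 0;

    CountingFs(List<StatusEntry> entries) {
        this.entries = new ArrayList<>(entries);
    }

    // One batched call that returns every status at once.
    List<StatusEntry> listStatus() {
        return new ArrayList<>(entries);
    }

    // One lookup per path; stands in for a sequential, individual RPC.
    StatusEntry getStatus(String path) {
        individualLookups++;
        for (StatusEntry e : entries) {
            if (e.path.equals(path)) {
                return e;
            }
        }
        throw new IllegalArgumentException("unknown path: " + path);
    }
}

final class StatusFilterSketch {
    // Path-based filtering: keep only the paths, then look each status up again.
    static List<String> dirsViaPathFilter(CountingFs fs) {
        List<String> paths = new ArrayList<>();
        for (StatusEntry e : fs.listStatus()) {
            paths.add(e.path); // the status itself is discarded here
        }
        List<String> dirs = new ArrayList<>();
        for (String p : paths) {
            if (fs.getStatus(p).isDirectory) { // repeated individual lookup
                dirs.add(p);
            }
        }
        return dirs;
    }

    // Status-based filtering: decide directly on the entries already fetched.
    static List<String> dirsViaStatusFilter(CountingFs fs, Predicate<StatusEntry> filter) {
        List<String> dirs = new ArrayList<>();
        for (StatusEntry e : fs.listStatus()) {
            if (filter.test(e)) {
                dirs.add(e.path);
            }
        }
        return dirs;
    }
}
```

Both methods return the same paths; the path-based variant performs one extra lookup per listed entry, while the status-based variant performs none beyond the batched listing.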



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16169) RegionSizeCalculator should not depend on master

2016-07-01 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359411#comment-15359411
 ] 

Thiruvel Thirumoolan commented on HBASE-16169:
--

Will upload a patch in a couple of days.

> RegionSizeCalculator should not depend on master
> 
>
> Key: HBASE-16169
> URL: https://issues.apache.org/jira/browse/HBASE-16169
> Project: HBase
>  Issue Type: Sub-task
>  Components: mapreduce, scaling
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
> Fix For: 2.0.0, 1.4.0
>
>
> RegionSizeCalculator is needed for better split generation of MR jobs. This 
> requires RegionLoad which can be obtained via ClusterStatus, i.e. accessing 
> Master. We don't want master to be in this path.
> The proposal is to add an API to the RegionServer that gets RegionLoad of all 
> regions hosted on it or those of a table if specified. RegionSizeCalculator 
> can use the latter.





[jira] [Created] (HBASE-16169) RegionSizeCalculator should not depend on master

2016-07-01 Thread Thiruvel Thirumoolan (JIRA)
Thiruvel Thirumoolan created HBASE-16169:


 Summary: RegionSizeCalculator should not depend on master
 Key: HBASE-16169
 URL: https://issues.apache.org/jira/browse/HBASE-16169
 Project: HBase
  Issue Type: Sub-task
  Components: mapreduce, scaling
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Fix For: 2.0.0, 1.4.0


RegionSizeCalculator is needed for better split generation of MR jobs. This 
requires RegionLoad which can be obtained via ClusterStatus, i.e. accessing 
Master. We don't want master to be in this path.

The proposal is to add an API to the RegionServer that gets RegionLoad of all 
regions hosted on it or those of a table if specified. RegionSizeCalculator can 
use the latter.





[jira] [Updated] (HBASE-16157) The incorrect block cache count and size are caused by removing duplicate block key in the LruBlockCache

2016-07-01 Thread ChiaPing Tsai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ChiaPing Tsai updated HBASE-16157:
--
Status: Open  (was: Patch Available)

> The incorrect block cache count and size are caused by removing duplicate 
> block key in the LruBlockCache
> 
>
> Key: HBASE-16157
> URL: https://issues.apache.org/jira/browse/HBASE-16157
> Project: HBase
>  Issue Type: Bug
>Reporter: ChiaPing Tsai
>Priority: Trivial
> Attachments: HBASE-16157-v1.patch, HBASE-16157-v2.patch
>
>
> {code:title=LruBlockCache.java|borderStyle=solid}
> // Check return value from the Map#remove before updating the metrics
>   protected long evictBlock(LruCachedBlock block, boolean 
> evictedByEvictionProcess) {
> map.remove(block.getCacheKey());
> updateSizeMetrics(block, true);
> ...
> }
> {code}
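
The idea in the quoted comment (check the return value of Map#remove before touching the metrics) can be illustrated with a minimal sketch. This is not the content of any of the attached patches: the class EvictionAccountingSketch, its String keys and its Long sizes are hypothetical simplifications of the block cache, chosen only so the example is self-contained.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified cache: keys are Strings, values are block sizes in bytes.
final class EvictionAccountingSketch {
    private final ConcurrentHashMap<String, Long> map = new ConcurrentHashMap<>();
    private final AtomicLong size = new AtomicLong();
    private final AtomicLong elements = new AtomicLong();

    // Adds a block and updates the metrics only if the key was not already cached.
    void cacheBlock(String key, long heapSize) {
        if (map.putIfAbsent(key, heapSize) == null) {
            size.addAndGet(heapSize);
            elements.incrementAndGet();
        }
    }

    // Removes a block; the metrics change only when remove() actually removed an entry.
    long evictBlock(String key) {
        Long removed = map.remove(key);
        if (removed == null) {
            return 0L; // key already gone: leave count and size untouched
        }
        size.addAndGet(-removed);
        elements.decrementAndGet();
        return removed;
    }

    long size() {
        return size.get();
    }

    long elements() {
        return elements.get();
    }
}
```

With this guard, evicting the same key twice leaves the count and size at zero instead of driving them negative, which is the symptom named in the issue title.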





[jira] [Commented] (HBASE-16157) The incorrect block cache count and size are caused by removing duplicate block key in the LruBlockCache

2016-07-01 Thread ChiaPing Tsai (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359358#comment-15359358
 ] 

ChiaPing Tsai commented on HBASE-16157:
---

The TestLruBlockCache#testCurrentSize may fail because the eviction thread can 
finish too quickly for the test to observe the "in progress" state.

I'll update the patch asap.

> The incorrect block cache count and size are caused by removing duplicate 
> block key in the LruBlockCache
> 
>
> Key: HBASE-16157
> URL: https://issues.apache.org/jira/browse/HBASE-16157
> Project: HBase
>  Issue Type: Bug
>Reporter: ChiaPing Tsai
>Priority: Trivial
> Attachments: HBASE-16157-v1.patch, HBASE-16157-v2.patch
>
>
> {code:title=LruBlockCache.java|borderStyle=solid}
> // Check return value from the Map#remove before updating the metrics
>   protected long evictBlock(LruCachedBlock block, boolean 
> evictedByEvictionProcess) {
> map.remove(block.getCacheKey());
> updateSizeMetrics(block, true);
> ...
> }
> {code}





[jira] [Commented] (HBASE-16108) RowCounter should support multiple key ranges

2016-07-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359343#comment-15359343
 ] 

Hadoop QA commented on HBASE-16108:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 36s {color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 3s {color} | {color:green} branch-1 passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s {color} | {color:green} branch-1 passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 8s {color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 20s {color} | {color:green} branch-1 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 32s {color} | {color:red} hbase-server in branch-1 has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 48s {color} | {color:green} branch-1 passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 44s {color} | {color:green} branch-1 passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 55s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s {color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 57s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s {color} | {color:green} the patch passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 42s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 9s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 19s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 20m 0s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 49s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 43s {color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 44s {color} | {color:green} the patch passed with JDK v1.7.0_80 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 124m 7s {color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 163m 6s {color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.master.balancer.TestStochasticLoadBalancer2 |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12815763/HBASE-16108.branch-1.001.patch |
| JIRA Issue | HBASE-16108 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux asf910.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | branch-1 / 5bc0655 |
| Default Java | 1.7.0_80 |
| Multi-JDK versions |  /home/jenkins/tools/java/jdk1.8.0:1.8.0 

[jira] [Commented] (HBASE-16161) Remove a few unnecessary (uncontended) synchronizes

2016-07-01 Thread Ann rorrer (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359333#comment-15359333
 ] 

Ann rorrer commented on HBASE-16161:


How do I download 

> Remove a few unnecessary (uncontended) synchronizes
> ---
>
> Key: HBASE-16161
> URL: https://issues.apache.org/jira/browse/HBASE-16161
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
>Priority: Minor
> Attachments: HBASE-16161.branch-1.001.patch
>
>
> This is a follow-on from HBASE-15716. We have a few odd-looking synchronizes.  
> A few are probably elided after analysis concludes there is a single-threaded 
> consumer, but let's remove them to be clear.





[jira] [Commented] (HBASE-16165) Decrease RpcServer.callQueueSize before writeResponse causes OOM

2016-07-01 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359323#comment-15359323
 ] 

Enis Soztutar commented on HBASE-16165:
---

Did you see this happening, or is it theoretical, from reading the code? Just asking.

> Decrease RpcServer.callQueueSize before writeResponse causes OOM
> 
>
> Key: HBASE-16165
> URL: https://issues.apache.org/jira/browse/HBASE-16165
> Project: HBase
>  Issue Type: Bug
>Reporter: Duo Zhang
>
> In RpcServer, we use {{callQueueSizeInBytes}} to avoid queuing too many calls, 
> which would cause OOM. But in {{CallRunner.run}}, we decrease it before sending 
> the response back. And even after calling {{sendResponseIfReady}}, the call 
> object could stay in our heap for a long time if we cannot write out the 
> response (that's why we need a Responder thread...). This makes it possible 
> for the actual size of all call objects in the heap to exceed 
> {{maxQueueSizeInBytes}} and cause OOM.
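
The accounting gap described above can be shown with a small model. The sketch below is an assumption-laden illustration, not RpcServer code: CallAccountingSketch and its methods accept, finishRun and responseWritten are hypothetical names invented here. It tracks two numbers separately: the bytes the admission check believes are queued, and the bytes still retained by call objects whose responses have not been written.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical model of the accounting gap; none of these names come from RpcServer.
final class CallAccountingSketch {
    // What the admission check sees.
    final AtomicLong queuedBytes = new AtomicLong();
    // What is actually still held on the heap by call objects.
    final AtomicLong retainedBytes = new AtomicLong();

    // Admits a call only if the counter stays within the limit.
    boolean accept(long callSize, long maxQueueBytes) {
        if (queuedBytes.get() + callSize > maxQueueBytes) {
            return false;
        }
        queuedBytes.addAndGet(callSize);
        retainedBytes.addAndGet(callSize);
        return true;
    }

    // The handler finished running: the counter is released before the response is written.
    void finishRun(long callSize) {
        queuedBytes.addAndGet(-callSize);
    }

    // The response was written out: only now can the call object be collected.
    void responseWritten(long callSize) {
        retainedBytes.addAndGet(-callSize);
    }
}
```

In this model, a call that has finished running but whose response is still pending no longer counts against the limit, so new calls keep being admitted while retainedBytes grows past the configured maximum.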





[jira] [Commented] (HBASE-15119) Include git SHA in check_compatibility reports

2016-07-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359316#comment-15359316
 ] 

Hudson commented on HBASE-15119:


FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #1234 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/1234/])
HBASE-15119 Include git SHA in check_compatibility reports (busbey: rev 
0f1633eab20bcfe09b0cd908013030ce341917f7)
* dev-support/check_compatibility.sh


> Include git SHA in check_compatibility reports
> --
>
> Key: HBASE-15119
> URL: https://issues.apache.org/jira/browse/HBASE-15119
> Project: HBase
>  Issue Type: Improvement
>  Components: build
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
>Priority: Minor
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.2.2, 1.1.6, 0.98.21
>
> Attachments: HBASE-15119.v00.patch
>
>
> Since some refs change over time (ie, branches), it would be nice to include 
> git shas in the version info included in check compatibility reports. It'll 
> also help interested parties to be sure of what they're looking at.





[jira] [Commented] (HBASE-15729) Remove old JDiff wrapper scripts in dev-support

2016-07-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359314#comment-15359314
 ] 

Hudson commented on HBASE-15729:


FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #1234 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/1234/])
HBASE-15729 Remove old JDiff wrapper scripts in dev-support (busbey: rev 
34f409e74b240517dd649476f0c0fa71c0ed0695)
* dev-support/hbase_jdiff_acrossSingularityTemplate.xml
* dev-support/jdiffHBasePublicAPI_common.sh
* dev-support/hbase_jdiff_template.xml
* dev-support/hbase_jdiff_afterSingularityTemplate.xml
* dev-support/jdiffHBasePublicAPI.sh


> Remove old JDiff wrapper scripts in dev-support
> ---
>
> Key: HBASE-15729
> URL: https://issues.apache.org/jira/browse/HBASE-15729
> Project: HBase
>  Issue Type: Task
>  Components: build, community
>Reporter: Dima Spivak
>Assignee: Dima Spivak
>Priority: Minor
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.2.2, 1.1.6, 0.98.21
>
> Attachments: HBASE-15729.patch
>
>
> Since HBASE-12808, we've been using the Java API Compliance Checker instead 
> of JDiff to look at API compatibility. Probably makes sense to remove the old 
> wrapper scripts that aren't being used anymore.





[jira] [Commented] (HBASE-16124) Make check_compatibility.sh less verbose when building HBase

2016-07-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359317#comment-15359317
 ] 

Hudson commented on HBASE-16124:


FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #1234 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/1234/])
HBASE-16124 Make check_compatibility.sh less verbose when building HBase 
(busbey: rev a3ebafed7955decebcc7f0f89fd792a01ef89986)
* dev-support/check_compatibility.sh


> Make check_compatibility.sh less verbose when building HBase
> 
>
> Key: HBASE-16124
> URL: https://issues.apache.org/jira/browse/HBASE-16124
> Project: HBase
>  Issue Type: Improvement
>  Components: build, test
>Reporter: Dima Spivak
>Assignee: Dima Spivak
>Priority: Minor
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.2.2, 1.1.6, 0.98.21
>
> Attachments: HBASE-16124_v1.patch
>
>
> {{[check_compatibility.sh|https://github.com/apache/hbase/blob/master/dev-support/check_compatibility.sh]}}
>  is a bit verbose when building HBase JARs, which makes it kind of a 
> nightmare when used in a Jenkins job. Let's run those steps in Maven's batch 
> mode, which means less unnecessary output and no expectation of user 
> interaction.





[jira] [Commented] (HBASE-16129) check_compatibility.sh is broken when using Java API Compliance Checker v1.7

2016-07-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359315#comment-15359315
 ] 

Hudson commented on HBASE-16129:


FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #1234 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/1234/])
HBASE-16129 check_compatibility.sh is broken when using Java API (busbey: rev 
dcd6f625203347829d973b41302d8d8fb042d431)
* dev-support/check_compatibility.sh


> check_compatibility.sh is broken when using Java API Compliance Checker v1.7
> 
>
> Key: HBASE-16129
> URL: https://issues.apache.org/jira/browse/HBASE-16129
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Reporter: Dima Spivak
>Assignee: Dima Spivak
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.2.2, 1.1.6, 0.98.21
>
> Attachments: HBASE-16129_v1.patch, HBASE-16129_v2.patch, 
> HBASE-16129_v3.patch
>
>
> As part of HBASE-16073, we hardcoded check_compatibility.sh to check out the 
> v1.7 tag of Java ACC. Unfortunately, just running it between two branches 
> that I know have incompatibilities, I get 0 incompatibilities (and 0 classes 
> read). Looks like this version doesn't properly traverse through JARs.




