[jira] [Updated] (HBASE-14443) Add request parameter to the TooSlow/TooLarge warn message of RpcServer

2015-09-18 Thread Jianwei Cui (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jianwei Cui updated HBASE-14443:

Affects Version/s: 1.2.1
Fix Version/s: (was: 1.2.1)

> Add request parameter to the TooSlow/TooLarge warn message of RpcServer
> ---
>
> Key: HBASE-14443
> URL: https://issues.apache.org/jira/browse/HBASE-14443
> Project: HBase
>  Issue Type: Improvement
>  Components: rpc
>Affects Versions: 1.2.1
>Reporter: Jianwei Cui
>Priority: Minor
> Attachments: HBASE-14443-trunk-v1.patch
>
>
> The RpcServer logs a warn message for a TooSlow or TooLarge request as follows:
> {code}
> logResponse(new Object[]{param},
> md.getName(), md.getName() + "(" + param.getClass().getName() + 
> ")",
> (tooLarge ? "TooLarge" : "TooSlow"),
> status.getClient(), startTime, processingTime, qTime,
> responseSize);
> {code}
> The RpcServer#logResponse will create the warn message as:
> {code}
> if (params.length == 2 && server instanceof HRegionServer &&
> params[0] instanceof byte[] &&
> params[1] instanceof Operation) {
>   ...
>   responseInfo.putAll(((Operation) params[1]).toMap());
>   ...
> } else if (params.length == 1 && server instanceof HRegionServer &&
> params[0] instanceof Operation) {
>   ...
>   responseInfo.putAll(((Operation) params[0]).toMap());
>   ...
> } else {
>   ...
> }
> {code}
> Because the parameter is always a protobuf message, not an instance of 
> Operation, the request parameter is never added to the warn message. The 
> parameter helps in diagnosing the problem; for example, knowing the 
> startRow/endRow is useful for a TooSlow scan. To improve the warn message, we 
> can use ProtobufUtil to convert the protobuf request message into the 
> corresponding Operation subclass object, so that it can be added to the warn 
> message. Suggestions and discussion are welcome.
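
A minimal sketch of the idea described above, not the attached patch: turn the request parameter into a map of loggable fields before the warn message is built, and fall back to the class name alone for parameter types that are not understood. `SlowRequestLogSketch`, `ScanRequestLike` and `describe` are hypothetical names used only for illustration; the real change would go through ProtobufUtil and an Operation subclass.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SlowRequestLogSketch {
  /** Hypothetical stand-in for a protobuf scan request. */
  static final class ScanRequestLike {
    final String startRow;
    final String endRow;

    ScanRequestLike(String startRow, String endRow) {
      this.startRow = startRow;
      this.endRow = endRow;
    }
  }

  /** Builds the fields for the warn message; unknown parameter types contribute only their class name. */
  static Map<String, Object> describe(String tag, Object param) {
    Map<String, Object> info = new LinkedHashMap<>();
    info.put("tag", tag);
    info.put("class", param.getClass().getSimpleName());
    if (param instanceof ScanRequestLike) {
      ScanRequestLike scan = (ScanRequestLike) param;
      info.put("startRow", scan.startRow);
      info.put("endRow", scan.endRow);
    }
    return info;
  }

  public static void main(String[] args) {
    System.out.println(describe("TooSlow", new ScanRequestLike("row-000", "row-999")));
  }
}
```

With this shape, a TooSlow scan would carry its row range in the log entry, while other request types keep today's behaviour.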



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-13770) Programmatic JAAS configuration option for secure zookeeper may be broken

2015-09-18 Thread Maddineni Sukumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maddineni Sukumar updated HBASE-13770:
--
Attachment: HBASE-13770-v2-0.98.patch

Fixed the line length issues reported by the build job. 
The other errors, such as javadoc and checkstyle, are not related to the files I 
modified, so I am ignoring them. 
Thanks [~ashish singhi] for your help. 



> Programmatic JAAS configuration option for secure zookeeper may be broken
> -
>
> Key: HBASE-13770
> URL: https://issues.apache.org/jira/browse/HBASE-13770
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 1.0.1, 1.1.0, 0.98.13, 1.2.0
>Reporter: Andrew Purtell
>Assignee: Maddineni Sukumar
> Fix For: 0.98.13
>
> Attachments: HBASE-13770-0.98.patch, HBASE-13770-v1.patch, 
> HBASE-13770-v2-0.98.patch, HBASE-13770-v2.patch
>
>
> While verifying the patch fix for HBASE-13768 we were unable to successfully 
> test the programmatic JAAS configuration option for secure ZooKeeper 
> integration. Unclear if that was due to a bug or incorrect test configuration.
> Update the security section of the online book with clear instructions for 
> setting up the programmatic JAAS configuration option for secure ZooKeeper 
> integration.
> Verify it works.
> Fix as necessary.





[jira] [Updated] (HBASE-14436) HTableDescriptor#addCoprocessor will always make RegionCoprocessorHost create new Configuration

2015-09-18 Thread Jianwei Cui (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jianwei Cui updated HBASE-14436:

Attachment: HBASE-14436-trunk-v1.patch

A simple fix: also check that 'cfgSpec' is not empty before creating a new 
Configuration.

> HTableDescriptor#addCoprocessor will always make RegionCoprocessorHost create 
> new Configuration
> ---
>
> Key: HBASE-14436
> URL: https://issues.apache.org/jira/browse/HBASE-14436
> Project: HBase
>  Issue Type: Improvement
>  Components: Coprocessors
>Affects Versions: 1.2.1
>Reporter: Jianwei Cui
>Priority: Minor
> Attachments: HBASE-14436-trunk-v1.patch
>
>
> HTableDescriptor#addCoprocessor sets the coprocessor value in the following 
> format:
> {code}
>  public HTableDescriptor addCoprocessor(String className, Path jarFilePath,
>  int priority, final Map<String, String> kvs)
>   throws IOException {
>   ...
>   String value = ((jarFilePath == null)? "" : jarFilePath.toString()) +
> "|" + className + "|" + Integer.toString(priority) + "|" +
> kvString.toString();
>   ...
> }
> {code}
> If 'jarFilePath' is null, the 'value' will always have the format 
> '|className|priority|', even if 'kvs' is null (meaning there are no extra 
> arguments for the coprocessor). Then, on the server side, 
> RegionCoprocessorHost#getTableCoprocessorAttrsFromSchema loads the table 
> coprocessors as:
> {code}
>   static List 
> getTableCoprocessorAttrsFromSchema(Configuration conf,
>   HTableDescriptor htd) {
> ...
> try {
>   // cfgSpec will be '|' for the format '|className|priority|'
>   cfgSpec = matcher.group(4);
> } catch (IndexOutOfBoundsException ex) {
>   // ignore
> }
> Configuration ourConf;
> // cfgSpec will be '|' for the format '|className|priority|'
> if (cfgSpec != null) {
>   ourConf = new Configuration(false);
>   HBaseConfiguration.merge(ourConf, conf);
> }
> ...
> }
> {code}
> For a coprocessor formatted as '|className|priority|', 'cfgSpec' will be '|', 
> so a new Configuration is always created.
> In our production clusters, many tables have table-level coprocessors, 
> so the region server creates a new Configuration for each region of 
> such a table; this consumes a certain amount of memory when there are many 
> such regions.
> To fix the problem, we can make HTableDescriptor not append the '|' when there 
> are no extra arguments for the coprocessor, or check 'cfgSpec' more strictly on 
> the server side, which would also avoid creating new Configurations for 
> existing regions once they are reopened. Discussion and suggestions are welcome.
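
A minimal sketch of the stricter server-side check suggested above: only treat the spec as carrying arguments when something non-blank follows the last '|'. The class name, the method name and the regular expression are assumptions made for this illustration; the pattern is not copied from HBase and the attached patch may differ.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CoprocessorSpecSketch {
  // Assumed pattern for 'jarPath|className|priority|key=value,...' (illustrative only).
  private static final Pattern SPEC =
      Pattern.compile("^([^|]*)\\|([^|]+)\\|\\s*(\\d*)\\s*(\\|.*)?$");

  /** Returns true only when at least one non-blank argument follows the last '|'. */
  static boolean hasExtraArguments(String spec) {
    Matcher matcher = SPEC.matcher(spec);
    if (!matcher.matches()) {
      return false;
    }
    String cfgSpec = matcher.group(4); // null, "|", or "|k=v,..."
    return cfgSpec != null && !cfgSpec.substring(1).trim().isEmpty();
  }

  public static void main(String[] args) {
    System.out.println(hasExtraArguments("|com.example.MyObserver|1001|"));
    System.out.println(hasExtraArguments("|com.example.MyObserver|1001|k1=v1"));
  }
}
```

A caller would create a separate Configuration only when `hasExtraArguments` returns true, so regions whose coprocessor value ends in a bare '|' could keep sharing the server's Configuration.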





[jira] [Commented] (HBASE-13770) Programmatic JAAS configuration option for secure zookeeper may be broken

2015-09-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14805067#comment-14805067
 ] 

Hadoop QA commented on HBASE-13770:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12757217/HBASE-13770-v2-0.98.patch
  against 0.98 branch at commit d81fba59cfab5ed368fe888ff811a7f5064b18cc.
  ATTACHMENT ID: 12757217

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15628//console

This message is automatically generated.

> Programmatic JAAS configuration option for secure zookeeper may be broken
> -
>
> Key: HBASE-13770
> URL: https://issues.apache.org/jira/browse/HBASE-13770
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 1.0.1, 1.1.0, 0.98.13, 1.2.0
>Reporter: Andrew Purtell
>Assignee: Maddineni Sukumar
> Fix For: 0.98.13
>
> Attachments: HBASE-13770-0.98.patch, HBASE-13770-v1.patch, 
> HBASE-13770-v2-0.98.patch, HBASE-13770-v2.patch
>
>
> While verifying the patch fix for HBASE-13768 we were unable to successfully 
> test the programmatic JAAS configuration option for secure ZooKeeper 
> integration. Unclear if that was due to a bug or incorrect test configuration.
> Update the security section of the online book with clear instructions for 
> setting up the programmatic JAAS configuration option for secure ZooKeeper 
> integration.
> Verify it works.
> Fix as necessary.





[jira] [Comment Edited] (HBASE-6721) RegionServer Group based Assignment

2015-09-18 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876096#comment-14876096
 ] 

Andrew Purtell edited comment on HBASE-6721 at 9/18/15 6:31 PM:


bq. Andrew Purtell The new patch is based off of current master. Should we just 
replace the current branch with a new branch to go with the new patch?

I would say so if you want it. 

bq. Also was wondering since this is now cp-based. Do we still need a branch?

Up to you. We can rebase or delete, either way let me know.  

Some issues with the current security integration. Coprocessors must not call into 
the internals of other coprocessors. I understand why this was done, but we 
can't have it. Keeping coprocessors out of the internals of other coprocessors 
is a non-negotiable point for the sake of sanity in maintenance of 
separate optional extensions. It's a catch-22 imposed on this change by the 
requirement that it be a coprocessor-only implementation.

What I would suggest is introducing hooks for the group admin APIs into the 
MasterObserver API. Let the implementation of the group admin APIs and the 
authoritative security decisions both be separate mix-ins provided by different 
coprocessors. There needs to be common plumbing for the two. That belongs in 
MasterObserver. The plumbing could look like:
- MasterObserver support for pre/post group admin API action hooks
- In GroupAdminEndpoint, get the coprocessor host with 
getMasterCoprocessorHost()
- Invoke the public (technically, LimitedPrivate(COPROC)) APIs for pre/post 
group admin API actions.
- AccessController implements the new MasterObserver APIs to provide security 
for the group admin APIs.

This is much more in spirit with current interfaces and audience scoping. It 
decouples GroupAdminEndpoint from AccessController. (If the AC is not 
installed, no harm, no NPEs, no security checking (by intention), it's all 
good.) It also addresses concerns about zero impact in the default case. Those 
upcalls will never be made unless the GroupAdminEndpoint is installed.
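
The plumbing described above could be sketched roughly as follows. Every type and method name here (`GroupAdminObserver`, `Host`, `preMoveServers`, and so on) is a hypothetical stand-in chosen for the illustration, not the real MasterObserver, MasterCoprocessorHost or AccessController API; the point is only that the endpoint talks to the host and never to the security observer.

```java
import java.util.ArrayList;
import java.util.List;

public class GroupAdminHooksSketch {
  /** Hypothetical pre/post hooks for one group admin action. */
  interface GroupAdminObserver {
    default void preMoveServers(String user, String targetGroup) {}
    default void postMoveServers(String user, String targetGroup) {}
  }

  /** Stand-in for the coprocessor host: fans each upcall out to the installed observers. */
  static final class Host {
    private final List<GroupAdminObserver> observers = new ArrayList<>();

    void install(GroupAdminObserver observer) {
      observers.add(observer);
    }

    void preMoveServers(String user, String group) {
      for (GroupAdminObserver o : observers) {
        o.preMoveServers(user, group);
      }
    }

    void postMoveServers(String user, String group) {
      for (GroupAdminObserver o : observers) {
        o.postMoveServers(user, group);
      }
    }
  }

  /** Security decisions live in their own observer. */
  static final class AccessCheckingObserver implements GroupAdminObserver {
    @Override
    public void preMoveServers(String user, String targetGroup) {
      if (!"admin".equals(user)) {
        throw new SecurityException(user + " may not move servers to " + targetGroup);
      }
    }
  }

  /** The endpoint only invokes host hooks, so it works with or without a security observer. */
  static final class GroupAdminEndpoint {
    private final Host host;
    final List<String> moves = new ArrayList<>();

    GroupAdminEndpoint(Host host) {
      this.host = host;
    }

    void moveServers(String user, String targetGroup) {
      host.preMoveServers(user, targetGroup);
      moves.add(targetGroup);
      host.postMoveServers(user, targetGroup);
    }
  }

  public static void main(String[] args) {
    Host host = new Host();
    host.install(new AccessCheckingObserver());
    GroupAdminEndpoint endpoint = new GroupAdminEndpoint(host);
    endpoint.moveServers("admin", "groupA");
    System.out.println(endpoint.moves);
  }
}
```

If no security observer is installed, the pre hook is a no-op and the action simply proceeds, which matches the "no harm, no NPEs, no security checking" behaviour described in the comment.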


was (Author: apurtell):
bq. Andrew Purtell The new patch is based off of current master. Should we just 
replace the current branch with a new branch to go with the new patch?

I would say so if you want it. 

bq. Also was wondering since this is now cp-based. Do we still need a branch?

Up to you. We can rebase or delete, either way let me know.  

Some issues with the current security integration. Coprocessors can't call into 
the internals of other coprocessors. I understand why this was done, but we 
can't have it. Coprocessors calling into the internals of other coprocessors, 
this is a non-negotiable point for the sake of sanity in maintenance of 
separate optional extensions. It's a catch-22 imposed on this change by the 
requirement it be a coprocessor only implementation.

What I would suggest is introduce into the MasterObserver API hooks for the 
group admin APIs. Let the implementation of the group admin APIs and the 
authoritative security decisions both be separate mix-ins provided by different 
coprocessors. There needs to be common plumbing for the two. That belongs in 
MasterObserver. The plumbing could look like:
- MasterObserver support for pre/post group admin API action hooks
- In GroupAdminEndpoint, get the coprocessor host with 
getMasterCoprocessorHost()
- Invoke the public (technically, LimitedPrivate(COPROC)) APIs for pre/post 
group admin API actions.

This is much more in spirit with current interfaces and audience scoping. It 
also addresses concerns about zero impact in the default case. Those upcalls 
will never be made unless the GroupAdminEndpoint is installed.

> RegionServer Group based Assignment
> ---
>
> Key: HBASE-6721
> URL: https://issues.apache.org/jira/browse/HBASE-6721
> Project: HBase
>  Issue Type: New Feature
>Reporter: Francis Liu
>Assignee: Francis Liu
>  Labels: hbase-6721
> Attachments: 6721-master-webUI.patch, HBASE-6721 
> GroupBasedLoadBalancer Sequence Diagram.xml, HBASE-6721-DesigDoc.pdf, 
> HBASE-6721-DesigDoc.pdf, HBASE-6721-DesigDoc.pdf, HBASE-6721-DesigDoc.pdf, 
> HBASE-6721_0.98_2.patch, HBASE-6721_10.patch, HBASE-6721_11.patch, 
> HBASE-6721_12.patch, HBASE-6721_13.patch, HBASE-6721_14.patch, 
> HBASE-6721_8.patch, HBASE-6721_9.patch, HBASE-6721_9.patch, 
> HBASE-6721_94.patch, HBASE-6721_94.patch, HBASE-6721_94_2.patch, 
> HBASE-6721_94_3.patch, HBASE-6721_94_3.patch, HBASE-6721_94_4.patch, 
> HBASE-6721_94_5.patch, HBASE-6721_94_6.patch, HBASE-6721_94_7.patch, 
> HBASE-6721_98_1.patch, HBASE-6721_98_2.patch, 
> HBASE-6721_hbase-6721_addendum.patch, HBASE-6721_trunk.patch, 
> HBASE-6721_trunk.patch, HBASE-6721_trunk.patch, HBASE-6721_trunk1.patch, 
> HBASE-6721_trunk2.patch, balanceCluster Sequence Diagram.svg, 

[jira] [Commented] (HBASE-14383) Compaction improvements

2015-09-18 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876130#comment-14876130
 ] 

Vladimir Rodionov commented on HBASE-14383:
---

{quote}
Hmm, does this policy mean that we may end up not flushing data even with 
periodic flusher? The periodic flusher should be like a force flush to be 
effective.
{quote}

For small stores, a flush happens only if they hold data that is older than the 
periodic flush interval. This is how it works today. In theory, if you have a 
small heap and a large number of regions, you won't be able to load data fast 
without being totally blocked periodically. All memstores are < 16MB and they 
will be flushed once an hour.

> Compaction improvements
> ---
>
> Key: HBASE-14383
> URL: https://issues.apache.org/jira/browse/HBASE-14383
> Project: HBase
>  Issue Type: Improvement
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
> Fix For: 2.0.0
>
>
> Compactions are still a major issue in many production environments. The 
> general recommendation, disabling region splitting and major compactions to 
> reduce unpredictable IO/CPU spikes, especially during peak times, and running 
> them manually during off-peak times, still does not resolve the issues completely.
> h3. Flush storms
> * rolling WAL events across cluster can be highly correlated, hence flushing 
> memstores, hence triggering minor compactions, that can be promoted to major 
> ones. These events are highly correlated in time if there is a balanced 
> write-load on the regions in a table.
> *  the same is true for memstore flushing due to periodic memstore flusher 
> operation. 
> Both of the above may produce *flush storms*, which are as bad as *compaction 
> storms*. 
> What can be done here? We can spread these events over time by randomizing 
> (with jitter) several config options:
> # hbase.regionserver.optionalcacheflushinterval
> # hbase.regionserver.flush.per.changes
> # hbase.regionserver.maxlogs   
> h3. ExploringCompactionPolicy max compaction size
> One more optimization can be added to ExploringCompactionPolicy. To limit the 
> size of a compaction, one can use the config parameter 
> hbase.hstore.compaction.max.size. It would be nice to have two separate 
> limits: one for peak and one for off-peak hours.
> h3. ExploringCompactionPolicy selection evaluation algorithm
> Too simple? Selection with more files always wins, selection of smaller size 
> wins if number of files is the same. 
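
A minimal sketch of the jitter idea from the "Flush storms" section above, assuming each region server (or region) draws its own effective value from a band around the configured one. The class name, the method name and the 20% fraction are illustrative assumptions, not existing HBase settings.

```java
import java.util.Random;

public class JitterSketch {
  /** Returns a value drawn uniformly from about [base * (1 - fraction), base * (1 + fraction)]. */
  static long withJitter(long baseMillis, double fraction, Random random) {
    double offset = (random.nextDouble() * 2.0 - 1.0) * fraction; // in [-fraction, +fraction)
    return Math.round(baseMillis * (1.0 + offset));
  }

  public static void main(String[] args) {
    long oneHour = 3_600_000L; // e.g. a periodic flush interval of one hour
    System.out.println(withJitter(oneHour, 0.2, new Random()));
  }
}
```

Because every caller ends up with a slightly different interval, flushes that would otherwise fire at the same moment across the cluster drift apart over time.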





[jira] [Commented] (HBASE-14452) Allow enabling tracing from configuration

2015-09-18 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876698#comment-14876698
 ] 

stack commented on HBASE-14452:
---

Was hoping we'd have a generic dynamic config infrastructure and then we'd 
piggy-back on this to do on/off tracing but if that ain't showing up any time 
soon, let's add a switch... even if it is ugly, sticking out on the side...

> Allow enabling tracing from configuration
> -
>
> Key: HBASE-14452
> URL: https://issues.apache.org/jira/browse/HBASE-14452
> Project: HBase
>  Issue Type: Bug
>  Components: Client, Operability
>Reporter: Nick Dimiduk
>
> Over on HDFS-8213 [~colinmccabe] convinced me that we should enable operators 
> to trace HDFS requests independent of applications enabling the same. At the 
> risk of adding a new, superset configuration, I think we should allow the 
> same for HBase. Any objections to following HDFS's lead on this?





[jira] [Created] (HBASE-14454) TestFailedAppendAndSync fail

2015-09-18 Thread stack (JIRA)
stack created HBASE-14454:
-

 Summary: TestFailedAppendAndSync fail
 Key: HBASE-14454
 URL: https://issues.apache.org/jira/browse/HBASE-14454
 Project: HBase
  Issue Type: Sub-task
  Components: test
Reporter: stack
Assignee: stack


https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/15633/testReport/org.apache.hadoop.hbase.regionserver/TestFailedAppendAndSync/testLockupAroundBadAssignSync/






[jira] [Commented] (HBASE-14230) replace reflection in FSHlog with HdfsDataOutputStream#getCurrentBlockReplication()

2015-09-18 Thread Heng Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876723#comment-14876723
 ] 

Heng Chen commented on HBASE-14230:
---

Sorry about that  [~ndimiduk] [~stack],  I will fix it ASAP.

> replace reflection in FSHlog with 
> HdfsDataOutputStream#getCurrentBlockReplication()
> ---
>
> Key: HBASE-14230
> URL: https://issues.apache.org/jira/browse/HBASE-14230
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Heng Chen
>Assignee: Heng Chen
>Priority: Minor
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: HBASE-14230.patch
>
>
> As the TODO comment says, we use 
> {{HdfsDataOutputStream#getCurrentBlockReplication}} and 
> {{DFSOutputStream.getPipeLine}} to replace reflection in FSHlog





[jira] [Commented] (HBASE-14230) replace reflection in FSHlog with HdfsDataOutputStream#getCurrentBlockReplication()

2015-09-18 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876703#comment-14876703
 ] 

stack commented on HBASE-14230:
---

Sorry about that [~ndimiduk] Thanks for the revert.

> replace reflection in FSHlog with 
> HdfsDataOutputStream#getCurrentBlockReplication()
> ---
>
> Key: HBASE-14230
> URL: https://issues.apache.org/jira/browse/HBASE-14230
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Heng Chen
>Assignee: Heng Chen
>Priority: Minor
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: HBASE-14230.patch
>
>
> As the TODO comment says, we use 
> {{HdfsDataOutputStream#getCurrentBlockReplication}} and 
> {{DFSOutputStream.getPipeLine}} to replace reflection in FSHlog





[jira] [Commented] (HBASE-14449) Rewrite deadlock prevention for concurrent connection close

2015-09-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876718#comment-14876718
 ] 

Hudson commented on HBASE-14449:


FAILURE: Integrated in HBase-1.0 #1054 (See 
[https://builds.apache.org/job/HBase-1.0/1054/])
HBASE-14449 Rewrite deadlock prevention for concurrent connection close (tedyu: 
rev 8fa6d4261dfd542b43a39a8ae71d031fee61966e)
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/RpcClientImpl.java


> Rewrite deadlock prevention for concurrent connection close
> ---
>
> Key: HBASE-14449
> URL: https://issues.apache.org/jira/browse/HBASE-14449
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3
>
> Attachments: 14449-branch-1.0.txt, 14449-v1.txt
>
>
> The deadlock prevention approach used in HBASE-14241 introduces unnecessary 
> logic which is not intuitive.
> Depending on the value of the config hbase.ipc.client.specificThreadForWriting, 
> there may or may not be CallSender threads running.
> The attached patch simplifies deadlock prevention by using a Set which 
> represents the Connections to be closed. Outside the synchronized 
> (connections) block, this Set is iterated where the Connections are closed.
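
The approach described above can be sketched as follows: collect the connections into a Set while holding the lock, then close them after the lock is released. This is an illustration under assumptions, not the code from the attached patch; `CloseOutsideLockSketch` and its nested `Connection` class are hypothetical stand-ins.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class CloseOutsideLockSketch {
  /** Hypothetical connection; in a real client close() may block or take other locks. */
  static final class Connection {
    final String name;
    boolean closed;

    Connection(String name) {
      this.name = name;
    }

    void close() {
      closed = true;
    }
  }

  private final Map<String, Connection> connections = new HashMap<>();

  void add(Connection c) {
    synchronized (connections) {
      connections.put(c.name, c);
    }
  }

  /** Collects the connections under the lock, then closes them after releasing it. */
  int closeAll() {
    Set<Connection> toClose = new HashSet<>();
    synchronized (connections) {
      toClose.addAll(connections.values());
      connections.clear();
    }
    for (Connection c : toClose) {
      c.close(); // runs without holding the 'connections' monitor
    }
    return toClose.size();
  }

  public static void main(String[] args) {
    CloseOutsideLockSketch pool = new CloseOutsideLockSketch();
    pool.add(new Connection("rs1"));
    pool.add(new Connection("rs2"));
    System.out.println(pool.closeAll());
  }
}
```

Since `close()` never runs while the `connections` monitor is held, a close path that needs another lock cannot form a lock-ordering cycle with threads waiting on `connections`.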





[jira] [Commented] (HBASE-14230) replace reflection in FSHlog with HdfsDataOutputStream#getCurrentBlockReplication()

2015-09-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876727#comment-14876727
 ] 

Hudson commented on HBASE-14230:


FAILURE: Integrated in HBase-1.2 #183 (See 
[https://builds.apache.org/job/HBase-1.2/183/])
Revert "HBASE-14230 replace reflection in FSHlog with 
HdfsDataOutputStream#getCurrentBlockReplication()" (ndimiduk: rev 
c3b936df78ae4ef7b58dddacdc84451109914798)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java


> replace reflection in FSHlog with 
> HdfsDataOutputStream#getCurrentBlockReplication()
> ---
>
> Key: HBASE-14230
> URL: https://issues.apache.org/jira/browse/HBASE-14230
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Heng Chen
>Assignee: Heng Chen
>Priority: Minor
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: HBASE-14230.patch
>
>
> As the TODO comment says, we use 
> {{HdfsDataOutputStream#getCurrentBlockReplication}} and 
> {{DFSOutputStream.getPipeLine}} to replace reflection in FSHlog





[jira] [Comment Edited] (HBASE-14420) Zombie Stomping Session

2015-09-18 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876733#comment-14876733
 ] 

stack edited comment on HBASE-14420 at 9/19/15 12:18 AM:
-

Looking at recent builds, not bad but then this on ubuntu-1 doing HBASE-14407 
NotServingRegion: hbase region closed forever against branch-1.2.

kalashnikov:hbase.git stack$ python ./dev-support/findHangingTests.py 
https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/15635/consoleFull
Fetching the console output from the URL
Printing hanging tests
Hanging test : 
org.apache.hadoop.hbase.security.access.TestWithDisabledAuthorization
Hanging test : org.apache.hadoop.hbase.security.access.TestAccessController2
Hanging test : org.apache.hadoop.hbase.security.access.TestScanEarlyTermination
Hanging test : 
org.apache.hadoop.hbase.master.balancer.TestStochasticLoadBalancer
Printing Failing tests
Failing test : org.apache.hadoop.hbase.client.TestSnapshotCloneIndependence




was (Author: stack):
Looking at recent builds, not bad but then this on ubuntu-1 doing HBASE-14407 
NotServingRegion: hbase region closed forever

kalashnikov:hbase.git stack$ python ./dev-support/findHangingTests.py 
https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/15635/consoleFull
Fetching the console output from the URL
Printing hanging tests
Hanging test : 
org.apache.hadoop.hbase.security.access.TestWithDisabledAuthorization
Hanging test : org.apache.hadoop.hbase.security.access.TestAccessController2
Hanging test : org.apache.hadoop.hbase.security.access.TestScanEarlyTermination
Hanging test : 
org.apache.hadoop.hbase.master.balancer.TestStochasticLoadBalancer
Printing Failing tests
Failing test : org.apache.hadoop.hbase.client.TestSnapshotCloneIndependence



> Zombie Stomping Session
> ---
>
> Key: HBASE-14420
> URL: https://issues.apache.org/jira/browse/HBASE-14420
> Project: HBase
>  Issue Type: Umbrella
>  Components: test
>Reporter: stack
>Assignee: stack
>Priority: Critical
>
> Patch builds are now failing most of the time because we are dropping zombies. 
> I confirm we are doing this on non-apache build boxes too.
> Left-over zombies consume resources on build boxes (OOME cannot create native 
> threads). Having to do multiple test runs in the hope that we can get a 
> non-zombie-making build or making (arbitrary) rulings that the zombies are 
> 'not related' is a productivity sink. And so on...
> This is an umbrella issue for a zombie stomping session that started earlier 
> this week. Will hang sub-issues of this one. Am running builds back-to-back 
> on little cluster to turn out the monsters.





[jira] [Commented] (HBASE-14420) Zombie Stomping Session

2015-09-18 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876733#comment-14876733
 ] 

stack commented on HBASE-14420:
---

Looking at recent builds, not bad but then this on ubuntu-1 doing HBASE-14407 
NotServingRegion: hbase region closed forever

kalashnikov:hbase.git stack$ python ./dev-support/findHangingTests.py 
https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/15635/consoleFull
Fetching the console output from the URL
Printing hanging tests
Hanging test : 
org.apache.hadoop.hbase.security.access.TestWithDisabledAuthorization
Hanging test : org.apache.hadoop.hbase.security.access.TestAccessController2
Hanging test : org.apache.hadoop.hbase.security.access.TestScanEarlyTermination
Hanging test : 
org.apache.hadoop.hbase.master.balancer.TestStochasticLoadBalancer
Printing Failing tests
Failing test : org.apache.hadoop.hbase.client.TestSnapshotCloneIndependence



> Zombie Stomping Session
> ---
>
> Key: HBASE-14420
> URL: https://issues.apache.org/jira/browse/HBASE-14420
> Project: HBase
>  Issue Type: Umbrella
>  Components: test
>Reporter: stack
>Assignee: stack
>Priority: Critical
>
> Patch builds are now failing most of the time because we are dropping zombies. 
> I confirm we are doing this on non-apache build boxes too.
> Left-over zombies consume resources on build boxes (OOME cannot create native 
> threads). Having to do multiple test runs in the hope that we can get a 
> non-zombie-making build or making (arbitrary) rulings that the zombies are 
> 'not related' is a productivity sink. And so on...
> This is an umbrella issue for a zombie stomping session that started earlier 
> this week. Will hang sub-issues of this one. Am running builds back-to-back 
> on little cluster to turn out the monsters.





[jira] [Commented] (HBASE-14230) replace reflection in FSHlog with HdfsDataOutputStream#getCurrentBlockReplication()

2015-09-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876753#comment-14876753
 ] 

Hudson commented on HBASE-14230:


FAILURE: Integrated in HBase-TRUNK #6820 (See 
[https://builds.apache.org/job/HBase-TRUNK/6820/])
Revert "HBASE-14230 replace reflection in FSHlog with 
HdfsDataOutputStream#getCurrentBlockReplication()" (ndimiduk: rev 
8cdf4a8e03d348908883e1829b86dbe9e1b30907)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java


> replace reflection in FSHlog with 
> HdfsDataOutputStream#getCurrentBlockReplication()
> ---
>
> Key: HBASE-14230
> URL: https://issues.apache.org/jira/browse/HBASE-14230
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Heng Chen
>Assignee: Heng Chen
>Priority: Minor
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: HBASE-14230.patch
>
>
> As the TODO comment says, we use 
> {{HdfsDataOutputStream#getCurrentBlockReplication}} and 
> {{DFSOutputStream.getPipeLine}} to replace reflection in FSHlog





[jira] [Commented] (HBASE-14453) HBaseAdmin#deleteTable should relocate META when cached location is stale

2015-09-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876764#comment-14876764
 ] 

Hadoop QA commented on HBASE-14453:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12761211/HBASE-14453-0.98.patch
  against 0.98 branch at commit b0f52332651ecbb8af11557df5af3189c7283212.
  ATTACHMENT ID: 12761211

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0 2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
23 warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s):   
at 
org.apache.reef.io.network.DeprecatedNetworkConnectionServiceTest.testMultithreadedSharedConnMessagingNetworkConnServiceRate(DeprecatedNetworkConnectionServiceTest.java:343)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15637//testReport/
Release Findbugs (version 2.0.3)warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15637//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15637//artifact/patchprocess/checkstyle-aggregate.html

  Javadoc warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15637//artifact/patchprocess/patchJavadocWarnings.txt
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15637//console

This message is automatically generated.

> HBaseAdmin#deleteTable should relocate META when cached location is stale
> -
>
> Key: HBASE-14453
> URL: https://issues.apache.org/jira/browse/HBASE-14453
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.14
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 0.98.15
>
> Attachments: HBASE-14453-0.98.patch, HBASE-14453-0.98.patch
>
>
> After HBASE-14275, in HBaseAdmin#deleteTable, when using MetaReader to wait 
> until all regions are deleted, we won't attempt to relocate META should its 
> cached location be stale.





[jira] [Commented] (HBASE-14449) Rewrite deadlock prevention for concurrent connection close

2015-09-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876770#comment-14876770
 ] 

Hudson commented on HBASE-14449:


FAILURE: Integrated in HBase-1.3 #184 (See 
[https://builds.apache.org/job/HBase-1.3/184/])
HBASE-14449 Rewrite deadlock prevention for concurrent connection close (tedyu: 
rev 00f467b225db2f818127f392712f7fcb2a5e30ac)
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/RpcClientImpl.java


> Rewrite deadlock prevention for concurrent connection close
> ---
>
> Key: HBASE-14449
> URL: https://issues.apache.org/jira/browse/HBASE-14449
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3
>
> Attachments: 14449-branch-1.0.txt, 14449-v1.txt
>
>
> The deadlock prevention approach used in HBASE-14241 introduces unnecessary 
> logic which is not intuitive.
> Depending on the value for config hbase.ipc.client.specificThreadForWriting , 
> there may or may not be CallSender threads running.
> The attached patch simplifies deadlock prevention by using a Set which 
> represents the Connections to be closed. Outside the synchronized 
> (connections) block, this Set is iterated where the Connections are closed.
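
The approach described above can be sketched as follows. This is a minimal illustration, not the attached patch: SketchConnection, SketchConnectionPool and closeAll() are hypothetical names, and the real RpcClientImpl code may differ. The idea shown is to collect the Connections to be closed into a Set while holding the lock, then close them only after the lock has been released.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for an RPC connection; not the real RpcClientImpl type.
class SketchConnection {
  private final String name;
  private boolean closed;

  SketchConnection(String name) {
    this.name = name;
  }

  String getName() {
    return name;
  }

  // In a real client, close() may block or take other locks, which is why
  // the pool below avoids calling it while holding its own lock.
  synchronized void close() {
    closed = true;
  }

  synchronized boolean isClosed() {
    return closed;
  }
}

// Hypothetical pool guarded by a lock on the "connections" map.
class SketchConnectionPool {
  private final Map<String, SketchConnection> connections = new HashMap<>();

  void add(SketchConnection c) {
    synchronized (connections) {
      connections.put(c.getName(), c);
    }
  }

  int size() {
    synchronized (connections) {
      return connections.size();
    }
  }

  // Collects the connections to close under the lock, then closes them
  // after the lock is released. Returns how many were closed.
  int closeAll() {
    Set<SketchConnection> toClose = new HashSet<>();
    synchronized (connections) {
      toClose.addAll(connections.values());
      connections.clear();
    }
    for (SketchConnection c : toClose) {
      c.close();
    }
    return toClose.size();
  }
}

class CloseOutsideLockDemo {
  public static void main(String[] args) {
    SketchConnectionPool pool = new SketchConnectionPool();
    pool.add(new SketchConnection("rs1"));
    pool.add(new SketchConnection("rs2"));
    System.out.println("closed " + pool.closeAll() + " connections");
  }
}
```

Because close() runs outside the synchronized block, a thread that needs the pool lock while a connection is shutting down is not blocked by that shutdown.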





[jira] [Commented] (HBASE-14453) HBaseAdmin#deleteTable should relocate META when cached location is stale

2015-09-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876776#comment-14876776
 ] 

Hadoop QA commented on HBASE-14453:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12761206/HBASE-14453-0.98.patch
  against 0.98 branch at commit b0f52332651ecbb8af11557df5af3189c7283212.
  ATTACHMENT ID: 12761206

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0 2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
23 warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s):   
at 
org.apache.hadoop.hbase.TestChoreService.testCorePoolDecrease(TestChoreService.java:462)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15636//testReport/
Release Findbugs (version 2.0.3)warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15636//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15636//artifact/patchprocess/checkstyle-aggregate.html

  Javadoc warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15636//artifact/patchprocess/patchJavadocWarnings.txt
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15636//console

This message is automatically generated.

> HBaseAdmin#deleteTable should relocate META when cached location is stale
> -
>
> Key: HBASE-14453
> URL: https://issues.apache.org/jira/browse/HBASE-14453
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.14
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 0.98.15
>
> Attachments: HBASE-14453-0.98.patch, HBASE-14453-0.98.patch
>
>
> After HBASE-14275, in HBaseAdmin#deleteTable, when using MetaReader to wait 
> until all regions are deleted, we won't attempt to relocate META should its 
> cached location be stale.





[jira] [Commented] (HBASE-14404) Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98

2015-09-18 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876777#comment-14876777
 ] 

Andrew Purtell commented on HBASE-14404:


Never mind, that's not going to work. Skimming again I missed writes :-/ . We 
don't have separate parameters for those. Back in a bit.

> Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98
> ---
>
> Key: HBASE-14404
> URL: https://issues.apache.org/jira/browse/HBASE-14404
> Project: HBase
>  Issue Type: Task
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 0.98.15
>
> Attachments: HBASE-14404-0.98.patch, HBASE-14404-0.98.patch
>
>
> HBASE-14098 adds a new configuration toggle - 
> "hbase.hfile.drop.behind.compaction" - which if set to "true" tells 
> compactions to drop pages from the OS blockcache after write.  It's on by 
> default where committed so far but a backport to 0.98 would default it to 
> off. (The backport would also retain compat methods to LimitedPrivate 
> interface StoreFileScanner.) What could make it a controversial change in 
> 0.98 is it changes the default setting of 
> 'hbase.regionserver.compaction.private.readers' from "false" to "true".  I 
> think it's fine, we use private readers in production. They're stable and do 
> not present perf issues.





[jira] [Commented] (HBASE-14374) Backport parent 'HBASE-14317 Stuck FSHLog' issue to 1.1 and 1.0

2015-09-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876778#comment-14876778
 ] 

Hadoop QA commented on HBASE-14374:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12761230/14317.branch-1.1.v6.txt
  against branch-1.1 branch at commit b0f52332651ecbb8af11557df5af3189c7283212.
  ATTACHMENT ID: 12761230

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 19 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0 2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15638//testReport/
Release Findbugs (version 2.0.3)warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15638//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15638//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15638//console

This message is automatically generated.

> Backport parent 'HBASE-14317 Stuck FSHLog' issue to 1.1 and 1.0
> ---
>
> Key: HBASE-14374
> URL: https://issues.apache.org/jira/browse/HBASE-14374
> Project: HBase
>  Issue Type: Sub-task
>Reporter: stack
>Assignee: stack
> Fix For: 1.0.3, 1.1.3
>
> Attachments: 14317-branch-1.1.txt, 14317.branch-1.1.v2.txt, 
> 14317.branch-1.1.v2.txt, 14317.branch-1.1.v2.txt, 14317.branch-1.1.v6.txt, 
> 14374.branch-1.1.v3.txt, 14374.branch-1.1.v4.txt, 14374.branch-1.1.v4.txt, 
> 14374.branch-1.1.v4.txt, 14374.branch-1.1.v4.txt, 14374.branch-1.1.v4.txt, 
> 14374.branch-1.1.v4.txt, 14374.branch-1.1.v5.txt
>
>
> Backport parent issue to branch-1.1. and branch-1.0.





[jira] [Updated] (HBASE-14404) Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98

2015-09-18 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-14404:
---
Attachment: HBASE-14404-0.98.patch

Updated patch that carries forward the configured HDFS setting (or its default) 
if the HBase level configuration parameter is unset. 
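
A minimal sketch of that fallback, assuming a plain java.util.Properties store rather than the real Hadoop Configuration class. The HBase key is the one named in the issue description; the HDFS-level key name and the resolveDropBehind helper are hypothetical and are not taken from the patch.

```java
import java.util.Properties;

class DropBehindFallbackSketch {
  // Key named in the issue description.
  static final String HBASE_KEY = "hbase.hfile.drop.behind.compaction";
  // Hypothetical placeholder for the HDFS-level setting; not a real key.
  static final String HDFS_KEY = "example.hdfs.drop.behind";

  // Uses the HBase-level value when it is set; otherwise carries forward
  // the HDFS-level value, or the supplied HDFS default if that is unset too.
  static boolean resolveDropBehind(Properties conf, boolean hdfsDefault) {
    String hbaseValue = conf.getProperty(HBASE_KEY);
    if (hbaseValue != null) {
      return Boolean.parseBoolean(hbaseValue);
    }
    String hdfsValue = conf.getProperty(HDFS_KEY);
    return hdfsValue != null ? Boolean.parseBoolean(hdfsValue) : hdfsDefault;
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    conf.setProperty(HDFS_KEY, "true");
    System.out.println("drop behind: " + resolveDropBehind(conf, false));
  }
}
```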

> Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98
> ---
>
> Key: HBASE-14404
> URL: https://issues.apache.org/jira/browse/HBASE-14404
> Project: HBase
>  Issue Type: Task
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 0.98.15
>
> Attachments: HBASE-14404-0.98.patch, HBASE-14404-0.98.patch
>
>
> HBASE-14098 adds a new configuration toggle - 
> "hbase.hfile.drop.behind.compaction" - which if set to "true" tells 
> compactions to drop pages from the OS blockcache after write.  It's on by 
> default where committed so far but a backport to 0.98 would default it to 
> off. (The backport would also retain compat methods to LimitedPrivate 
> interface StoreFileScanner.) What could make it a controversial change in 
> 0.98 is it changes the default setting of 
> 'hbase.regionserver.compaction.private.readers' from "false" to "true".  I 
> think it's fine, we use private readers in production. They're stable and do 
> not present perf issues.





[jira] [Commented] (HBASE-14230) replace reflection in FSHlog with HdfsDataOutputStream#getCurrentBlockReplication()

2015-09-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876784#comment-14876784
 ] 

Hudson commented on HBASE-14230:


SUCCESS: Integrated in HBase-1.2-IT #155 (See 
[https://builds.apache.org/job/HBase-1.2-IT/155/])
Revert "HBASE-14230 replace reflection in FSHlog with 
HdfsDataOutputStream#getCurrentBlockReplication()" (ndimiduk: rev 
c3b936df78ae4ef7b58dddacdc84451109914798)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java


> replace reflection in FSHlog with 
> HdfsDataOutputStream#getCurrentBlockReplication()
> ---
>
> Key: HBASE-14230
> URL: https://issues.apache.org/jira/browse/HBASE-14230
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Heng Chen
>Assignee: Heng Chen
>Priority: Minor
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: HBASE-14230.patch
>
>
> As comment TODO said, we use 
> {{HdfsDataOutputStream#getCurrentBlockReplication}} and 
> {{DFSOutputStream.getPipeLine}} to replace reflection in FSHlog





[jira] [Commented] (HBASE-14449) Rewrite deadlock prevention for concurrent connection close

2015-09-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876785#comment-14876785
 ] 

Hudson commented on HBASE-14449:


SUCCESS: Integrated in HBase-1.2-IT #155 (See 
[https://builds.apache.org/job/HBase-1.2-IT/155/])
HBASE-14449 Rewrite deadlock prevention for concurrent connection close (tedyu: 
rev 936693b923a1e700d4db564f7012d652a4d6daad)
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/RpcClientImpl.java


> Rewrite deadlock prevention for concurrent connection close
> ---
>
> Key: HBASE-14449
> URL: https://issues.apache.org/jira/browse/HBASE-14449
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3
>
> Attachments: 14449-branch-1.0.txt, 14449-v1.txt
>
>
> The deadlock prevention approach used in HBASE-14241 introduces unnecessary 
> logic which is not intuitive.
> Depending on the value for config hbase.ipc.client.specificThreadForWriting , 
> there may or may not be CallSender threads running.
> The attached patch simplifies deadlock prevention by using a Set which 
> represents the Connections to be closed. Outside the synchronized 
> (connections) block, this Set is iterated where the Connections are closed.





[jira] [Commented] (HBASE-14230) replace reflection in FSHlog with HdfsDataOutputStream#getCurrentBlockReplication()

2015-09-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876790#comment-14876790
 ] 

Hudson commented on HBASE-14230:


SUCCESS: Integrated in HBase-1.3-IT #166 (See 
[https://builds.apache.org/job/HBase-1.3-IT/166/])
Revert "HBASE-14230 replace reflection in FSHlog with 
HdfsDataOutputStream#getCurrentBlockReplication()" (ndimiduk: rev 
7fb12e33315504a51578fdf747f9b8050d62bffb)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java


> replace reflection in FSHlog with 
> HdfsDataOutputStream#getCurrentBlockReplication()
> ---
>
> Key: HBASE-14230
> URL: https://issues.apache.org/jira/browse/HBASE-14230
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Heng Chen
>Assignee: Heng Chen
>Priority: Minor
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: HBASE-14230.patch
>
>
> As comment TODO said, we use 
> {{HdfsDataOutputStream#getCurrentBlockReplication}} and 
> {{DFSOutputStream.getPipeLine}} to replace reflection in FSHlog





[jira] [Commented] (HBASE-14449) Rewrite deadlock prevention for concurrent connection close

2015-09-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876791#comment-14876791
 ] 

Hudson commented on HBASE-14449:


SUCCESS: Integrated in HBase-1.3-IT #166 (See 
[https://builds.apache.org/job/HBase-1.3-IT/166/])
HBASE-14449 Rewrite deadlock prevention for concurrent connection close (tedyu: 
rev 00f467b225db2f818127f392712f7fcb2a5e30ac)
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/RpcClientImpl.java


> Rewrite deadlock prevention for concurrent connection close
> ---
>
> Key: HBASE-14449
> URL: https://issues.apache.org/jira/browse/HBASE-14449
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3
>
> Attachments: 14449-branch-1.0.txt, 14449-v1.txt
>
>
> The deadlock prevention approach used in HBASE-14241 introduces unnecessary 
> logic which is not intuitive.
> Depending on the value for config hbase.ipc.client.specificThreadForWriting , 
> there may or may not be CallSender threads running.
> The attached patch simplifies deadlock prevention by using a Set which 
> represents the Connections to be closed. Outside the synchronized 
> (connections) block, this Set is iterated where the Connections are closed.





[jira] [Updated] (HBASE-14453) HBaseAdmin#deleteTable should relocate META when cached location is stale

2015-09-18 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-14453:
---
Status: Open  (was: Patch Available)

> HBaseAdmin#deleteTable should relocate META when cached location is stale
> -
>
> Key: HBASE-14453
> URL: https://issues.apache.org/jira/browse/HBASE-14453
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.14
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 0.98.15
>
> Attachments: HBASE-14453-0.98.patch, HBASE-14453-0.98.patch
>
>
> After HBASE-14275, in HBaseAdmin#deleteTable, when using MetaReader to wait 
> until all regions are deleted, we won't attempt to relocate META should its 
> cached location be stale.





[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment

2015-09-18 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876096#comment-14876096
 ] 

Andrew Purtell commented on HBASE-6721:
---

bq. Andrew Purtell The new patch is based off of current master. Should we just 
replace the current branch with a new branch to go with the new patch?

I would say so if you want it. 

bq. Also was wondering since this is now cp-based. Do we still need a branch?

Up to you. We can rebase or delete, either way let me know.  

There are some issues with the current security integration. Coprocessors can't 
call into the internals of other coprocessors. I understand why this was done, 
but we can't have it. Keeping coprocessors out of each other's internals is a 
non-negotiable point for the sake of sanity in maintaining separate optional 
extensions. It's a catch-22 imposed on this change by the requirement that it 
be a coprocessor-only implementation.

What I would suggest is introducing hooks for the group admin APIs into the 
MasterObserver API. Let the implementation of the group admin APIs and the 
authoritative security decisions both be separate mix-ins provided by different 
coprocessors. There needs to be common plumbing for the two. That belongs in 
MasterObserver. The plumbing could look like:
- MasterObserver support for pre/post group admin API action hooks
- In GroupAdminEndpoint, get the coprocessor host with 
getMasterCoprocessorHost()
- Invoke the public (technically, LimitedPrivate(COPROC)) APIs for pre/post 
group admin API actions.

This is much more in keeping with the spirit of the current interfaces and 
audience scoping. It also addresses concerns about zero impact in the default 
case: those upcalls will never be made unless the GroupAdminEndpoint is 
installed.
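
A rough sketch of that plumbing, using stand-in types only. GroupAdminObserverSketch, CoprocessorHostSketch, AccessCheckObserverSketch and the moveTables action are hypothetical names chosen for illustration; they are not the actual MasterObserver, MasterCoprocessorHost or GroupAdminEndpoint signatures. The endpoint calls pre/post hooks on the host, and a separate security observer makes the authoritative decision without either side touching the other's internals.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical hooks standing in for new pre/post group admin methods
// on MasterObserver. Default no-op bodies keep other observers unaffected.
interface GroupAdminObserverSketch {
  default void preMoveTables(String group, List<String> tables) {}
  default void postMoveTables(String group, List<String> tables) {}
}

// Hypothetical stand-in for the master coprocessor host: the common plumbing
// that fans each hook out to every registered observer.
class CoprocessorHostSketch {
  private final List<GroupAdminObserverSketch> observers = new ArrayList<>();

  void register(GroupAdminObserverSketch observer) {
    observers.add(observer);
  }

  void preMoveTables(String group, List<String> tables) {
    for (GroupAdminObserverSketch o : observers) {
      o.preMoveTables(group, tables);
    }
  }

  void postMoveTables(String group, List<String> tables) {
    for (GroupAdminObserverSketch o : observers) {
      o.postMoveTables(group, tables);
    }
  }
}

// Hypothetical security coprocessor: makes the authoritative decision
// in the pre hook, with no knowledge of the endpoint's internals.
class AccessCheckObserverSketch implements GroupAdminObserverSketch {
  private final boolean callerIsAdmin;

  AccessCheckObserverSketch(boolean callerIsAdmin) {
    this.callerIsAdmin = callerIsAdmin;
  }

  @Override
  public void preMoveTables(String group, List<String> tables) {
    if (!callerIsAdmin) {
      throw new SecurityException("caller may not move tables to group " + group);
    }
  }
}

// Hypothetical group admin endpoint: implements the action and only talks
// to the host's hook methods. The upcalls happen only if it is installed.
class GroupAdminEndpointSketch {
  private final CoprocessorHostSketch host;
  private final Map<String, List<String>> groupTables = new HashMap<>();

  GroupAdminEndpointSketch(CoprocessorHostSketch host) {
    this.host = host;
  }

  void moveTables(String group, List<String> tables) {
    host.preMoveTables(group, tables);
    groupTables.computeIfAbsent(group, g -> new ArrayList<>()).addAll(tables);
    host.postMoveTables(group, tables);
  }

  List<String> tablesOf(String group) {
    return groupTables.getOrDefault(group, Collections.emptyList());
  }
}

class GroupHookDemo {
  public static void main(String[] args) {
    CoprocessorHostSketch host = new CoprocessorHostSketch();
    host.register(new AccessCheckObserverSketch(true));
    GroupAdminEndpointSketch endpoint = new GroupAdminEndpointSketch(host);
    endpoint.moveTables("group1", Collections.singletonList("table1"));
    System.out.println("group1 tables: " + endpoint.tablesOf("group1"));
  }
}
```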

> RegionServer Group based Assignment
> ---
>
> Key: HBASE-6721
> URL: https://issues.apache.org/jira/browse/HBASE-6721
> Project: HBase
>  Issue Type: New Feature
>Reporter: Francis Liu
>Assignee: Francis Liu
>  Labels: hbase-6721
> Attachments: 6721-master-webUI.patch, HBASE-6721 
> GroupBasedLoadBalancer Sequence Diagram.xml, HBASE-6721-DesigDoc.pdf, 
> HBASE-6721-DesigDoc.pdf, HBASE-6721-DesigDoc.pdf, HBASE-6721-DesigDoc.pdf, 
> HBASE-6721_0.98_2.patch, HBASE-6721_10.patch, HBASE-6721_11.patch, 
> HBASE-6721_12.patch, HBASE-6721_13.patch, HBASE-6721_14.patch, 
> HBASE-6721_8.patch, HBASE-6721_9.patch, HBASE-6721_9.patch, 
> HBASE-6721_94.patch, HBASE-6721_94.patch, HBASE-6721_94_2.patch, 
> HBASE-6721_94_3.patch, HBASE-6721_94_3.patch, HBASE-6721_94_4.patch, 
> HBASE-6721_94_5.patch, HBASE-6721_94_6.patch, HBASE-6721_94_7.patch, 
> HBASE-6721_98_1.patch, HBASE-6721_98_2.patch, 
> HBASE-6721_hbase-6721_addendum.patch, HBASE-6721_trunk.patch, 
> HBASE-6721_trunk.patch, HBASE-6721_trunk.patch, HBASE-6721_trunk1.patch, 
> HBASE-6721_trunk2.patch, balanceCluster Sequence Diagram.svg, 
> immediateAssignments Sequence Diagram.svg, randomAssignment Sequence 
> Diagram.svg, retainAssignment Sequence Diagram.svg, roundRobinAssignment 
> Sequence Diagram.svg
>
>
> In multi-tenant deployments of HBase, it is likely that a RegionServer will 
> be serving out regions from a number of different tables owned by various 
> client applications. Being able to group a subset of running RegionServers 
> and assign specific tables to it, provides a client application a level of 
> isolation and resource allocation.
> The proposal essentially is to have an AssignmentManager which is aware of 
> RegionServer groups and assigns tables to region servers based on groupings. 
> Load balancing will occur on a per group basis as well. 
> This is essentially a simplification of the approach taken in HBASE-4120. See 
> attached document.





[jira] [Updated] (HBASE-14451) Move on to htrace-4.0.0 (from htrace-3.2.0)

2015-09-18 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-14451:
--
Attachment: 14451.txt

Here is a patch to convert all tracing.

Currently stuck on the fact that our Hadoop depends on the old htrace:

{code}
org.apache.hadoop.hbase.trace.TestHTraceHooks  Time elapsed: 4.56 sec  <<< ERROR!
java.lang.NoClassDefFoundError: org/apache/htrace/SamplerBuilder
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:635)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:619)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:609)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:600)
    at org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2250)
    at org.apache.hadoop.hdfs.MiniDFSCluster.waitActive(MiniDFSCluster.java:2272)
    at org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:1489)
    at org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:832)
    at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:742)
    at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniDFSCluster(HBaseTestingUtility.java:632)
    at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:1026)
    at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:907)
    at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:901)
    at org.apache.hadoop.hbase.trace.TestHTraceHooks.before(TestHTraceHooks.java:59)
{code}

> Move on to htrace-4.0.0 (from htrace-3.2.0)
> ---
>
> Key: HBASE-14451
> URL: https://issues.apache.org/jira/browse/HBASE-14451
> Project: HBase
>  Issue Type: Task
>Reporter: stack
>Assignee: stack
> Attachments: 14451.txt
>
>
> htrace-4.0.0 was just released with a new API. Get up on it.





[jira] [Commented] (HBASE-14454) TestFailedAppendAndSync fail

2015-09-18 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876728#comment-14876728
 ] 

stack commented on HBASE-14454:
---

It and some accompanying tests are all failing here:

Directory /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase/hbase-server/target/test-data/1eac55dd-3ea0-4e40-94e8-51af55755e9a/TestHRegiontestLockupAroundBadAssignSync/testLockupAroundBadAssignSync is not empty

java.io.IOException: Directory /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase/hbase-server/target/test-data/1eac55dd-3ea0-4e40-94e8-51af55755e9a/TestHRegiontestLockupAroundBadAssignSync/testLockupAroundBadAssignSync is not empty
    at org.apache.hadoop.fs.RawLocalFileSystem.delete(RawLocalFileSystem.java:418)
    at org.apache.hadoop.fs.ChecksumFileSystem.delete(ChecksumFileSystem.java:546)
    at org.apache.hadoop.hbase.regionserver.wal.FSHLog.close(FSHLog.java:1006)
    at org.apache.hadoop.hbase.regionserver.TestFailedAppendAndSync.testLockupAroundBadAssignSync(TestFailedAppendAndSync.java:256)

And in TestHRegion failures:

java.io.IOException: Directory /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase/hbase-server/target/test-data/a5674681-76e8-41bf-8af5-3edc00e4265d/TestHRegiontestMemstoreSizeWithFlushCanceling/testMemstoreSizeWithFlushCanceling is not empty
    at org.apache.hadoop.fs.RawLocalFileSystem.delete(RawLocalFileSystem.java:418)
    at org.apache.hadoop.fs.ChecksumFileSystem.delete(ChecksumFileSystem.java:546)
    at org.apache.hadoop.hbase.regionserver.wal.FSHLog.close(FSHLog.java:1006)
    at org.apache.hadoop.hbase.HBaseTestingUtility.closeRegionAndWAL(HBaseTestingUtility.java:354)
    at org.apache.hadoop.hbase.regionserver.TestHRegion.testMemstoreSizeWithFlushCanceling(TestHRegion.java:408)

> TestFailedAppendAndSync fail
> 
>
> Key: HBASE-14454
> URL: https://issues.apache.org/jira/browse/HBASE-14454
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
>
> https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/15633/testReport/org.apache.hadoop.hbase.regionserver/TestFailedAppendAndSync/testLockupAroundBadAssignSync/





[jira] [Commented] (HBASE-14449) Rewrite deadlock prevention for concurrent connection close

2015-09-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876740#comment-14876740
 ] 

Hudson commented on HBASE-14449:


SUCCESS: Integrated in HBase-1.1 #666 (See 
[https://builds.apache.org/job/HBase-1.1/666/])
HBASE-14449 Rewrite deadlock prevention for concurrent connection close (tedyu: 
rev 05618091b8f488e5b2ff81372cac2df11fd6e0d9)
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/RpcClientImpl.java


> Rewrite deadlock prevention for concurrent connection close
> ---
>
> Key: HBASE-14449
> URL: https://issues.apache.org/jira/browse/HBASE-14449
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3
>
> Attachments: 14449-branch-1.0.txt, 14449-v1.txt
>
>
> The deadlock prevention approach used in HBASE-14241 introduces unnecessary 
> logic which is not intuitive.
> Depending on the value for config hbase.ipc.client.specificThreadForWriting , 
> there may or may not be CallSender threads running.
> The attached patch simplifies deadlock prevention by using a Set which 
> represents the Connections to be closed. Outside the synchronized 
> (connections) block, this Set is iterated where the Connections are closed.





[jira] [Commented] (HBASE-14449) Rewrite deadlock prevention for concurrent connection close

2015-09-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876754#comment-14876754
 ] 

Hudson commented on HBASE-14449:


FAILURE: Integrated in HBase-TRUNK #6820 (See 
[https://builds.apache.org/job/HBase-TRUNK/6820/])
HBASE-14449 Rewrite deadlock prevention for concurrent connection close (tedyu: 
rev b0f52332651ecbb8af11557df5af3189c7283212)
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/RpcClientImpl.java


> Rewrite deadlock prevention for concurrent connection close
> ---
>
> Key: HBASE-14449
> URL: https://issues.apache.org/jira/browse/HBASE-14449
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3
>
> Attachments: 14449-branch-1.0.txt, 14449-v1.txt
>
>
> The deadlock prevention approach used in HBASE-14241 introduces unnecessary 
> logic which is not intuitive.
> Depending on the value for config hbase.ipc.client.specificThreadForWriting , 
> there may or may not be CallSender threads running.
> The attached patch simplifies deadlock prevention by using a Set which 
> represents the Connections to be closed. Outside the synchronized 
> (connections) block, this Set is iterated where the Connections are closed.





[jira] [Commented] (HBASE-14230) replace reflection in FSHlog with HdfsDataOutputStream#getCurrentBlockReplication()

2015-09-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876769#comment-14876769
 ] 

Hudson commented on HBASE-14230:


FAILURE: Integrated in HBase-1.3 #184 (See 
[https://builds.apache.org/job/HBase-1.3/184/])
Revert "HBASE-14230 replace reflection in FSHlog with 
HdfsDataOutputStream#getCurrentBlockReplication()" (ndimiduk: rev 
7fb12e33315504a51578fdf747f9b8050d62bffb)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java


> replace reflection in FSHlog with 
> HdfsDataOutputStream#getCurrentBlockReplication()
> ---
>
> Key: HBASE-14230
> URL: https://issues.apache.org/jira/browse/HBASE-14230
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Heng Chen
>Assignee: Heng Chen
>Priority: Minor
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: HBASE-14230.patch
>
>
> As comment TODO said, we use 
> {{HdfsDataOutputStream#getCurrentBlockReplication}} and 
> {{DFSOutputStream.getPipeLine}} to replace reflection in FSHlog





[jira] [Commented] (HBASE-14404) Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98

2015-09-18 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876786#comment-14876786
 ] 

Andrew Purtell commented on HBASE-14404:


As a backport issue we are carrying back a change from later branches that 
alters reader and writer behavior unconditionally. I can make it conditional 
upon HBase settings changes, but then I can't use the backport. It has to be a 
different approach that looks at configuration where we set up the readers and 
writers, rather than once in CacheConfig. It would no longer be a backport, but 
instead something unique to 0.98. I don't think that's what we want. Since I 
have a +1 to commit the earlier patch, I'm going to drop the half-assed thing I 
put up a few minutes ago and commit the original backport patch shortly.

> Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98
> ---
>
> Key: HBASE-14404
> URL: https://issues.apache.org/jira/browse/HBASE-14404
> Project: HBase
>  Issue Type: Task
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 0.98.15
>
> Attachments: HBASE-14404-0.98.patch
>
>
> HBASE-14098 adds a new configuration toggle - 
> "hbase.hfile.drop.behind.compaction" - which if set to "true" tells 
> compactions to drop pages from the OS blockcache after write.  It's on by 
> default where committed so far but a backport to 0.98 would default it to 
> off. (The backport would also retain compat methods to LimitedPrivate 
> interface StoreFileScanner.) What could make it a controversial change in 
> 0.98 is it changes the default setting of 
> 'hbase.regionserver.compaction.private.readers' from "false" to "true".  I 
> think it's fine, we use private readers in production. They're stable and do 
> not present perf issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14404) Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98

2015-09-18 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-14404:
---
Attachment: (was: HBASE-14404-0.98.patch)

> Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98
> ---
>
> Key: HBASE-14404
> URL: https://issues.apache.org/jira/browse/HBASE-14404
> Project: HBase
>  Issue Type: Task
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 0.98.15
>
> Attachments: HBASE-14404-0.98.patch
>
>
> HBASE-14098 adds a new configuration toggle - 
> "hbase.hfile.drop.behind.compaction" - which if set to "true" tells 
> compactions to drop pages from the OS blockcache after write.  It's on by 
> default where committed so far but a backport to 0.98 would default it to 
> off. (The backport would also retain compat methods to LimitedPrivate 
> interface StoreFileScanner.) What could make it a controversial change in 
> 0.98 is it changes the default setting of 
> 'hbase.regionserver.compaction.private.readers' from "false" to "true".  I 
> think it's fine, we use private readers in production. They're stable and do 
> not present perf issues.





[jira] [Commented] (HBASE-14453) HBaseAdmin#deleteTable should relocate META when cached location is stale

2015-09-18 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876794#comment-14876794
 ] 

Andrew Purtell commented on HBASE-14453:


Those zombies aren't from HBase or 0.98, respectively. 

> HBaseAdmin#deleteTable should relocate META when cached location is stale
> -
>
> Key: HBASE-14453
> URL: https://issues.apache.org/jira/browse/HBASE-14453
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.14
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 0.98.15
>
> Attachments: HBASE-14453-0.98.patch, HBASE-14453-0.98.patch
>
>
> After HBASE-14275, in HBaseAdmin#deleteTable, when using MetaReader to wait 
> until all regions are deleted, we won't attempt to relocate META should its 
> cached location be stale.
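
The behavior the description asks for can be sketched in plain Java. This is only an illustration of "invalidate the cached location and relocate on failure"; CachedLocation, RelocatingCaller and callWithRelocate are hypothetical names, not HBase classes, and the attached patch may take a different approach.

```java
import java.util.function.Supplier;

/** Hypothetical cache of a single server location; not an HBase class. */
final class CachedLocation {
    private final Supplier<String> locator; // looks up the current location
    private String cached;                  // null means "nothing cached"

    CachedLocation(Supplier<String> locator) {
        this.locator = locator;
    }

    synchronized String get() {
        if (cached == null) {
            cached = locator.get();
        }
        return cached;
    }

    synchronized void invalidate() {
        cached = null;
    }
}

final class RelocatingCaller {
    /** An operation that fails when it is sent to a stale location. */
    interface Call<T> {
        T run(String location) throws Exception;
    }

    /** Runs the call, dropping the cached location after each failure so the next attempt relocates. */
    static <T> T callWithRelocate(CachedLocation loc, Call<T> call, int maxAttempts) throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return call.run(loc.get());
            } catch (Exception e) {
                last = e;
                loc.invalidate(); // the cached entry may be stale: force a fresh lookup
            }
        }
        throw last != null ? last : new IllegalArgumentException("maxAttempts must be positive");
    }
}
```

A real client would probably invalidate only on exceptions that indicate a moved region, rather than on every failure as this sketch does.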





[jira] [Updated] (HBASE-14374) Backport parent 'HBASE-14317 Stuck FSHLog' issue to 1.1 and 1.0

2015-09-18 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-14374:
--
Attachment: 14317.branch-1.1.v6.txt

Rebase and include the stamping always patch that is also under the parent 
issue (HBASE-14401).

> Backport parent 'HBASE-14317 Stuck FSHLog' issue to 1.1 and 1.0
> ---
>
> Key: HBASE-14374
> URL: https://issues.apache.org/jira/browse/HBASE-14374
> Project: HBase
>  Issue Type: Sub-task
>Reporter: stack
>Assignee: stack
> Fix For: 1.0.3, 1.1.3
>
> Attachments: 14317-branch-1.1.txt, 14317.branch-1.1.v2.txt, 
> 14317.branch-1.1.v2.txt, 14317.branch-1.1.v2.txt, 14317.branch-1.1.v6.txt, 
> 14374.branch-1.1.v3.txt, 14374.branch-1.1.v4.txt, 14374.branch-1.1.v4.txt, 
> 14374.branch-1.1.v4.txt, 14374.branch-1.1.v4.txt, 14374.branch-1.1.v4.txt, 
> 14374.branch-1.1.v4.txt, 14374.branch-1.1.v5.txt
>
>
> Backport parent issue to branch-1.1 and branch-1.0.





[jira] [Updated] (HBASE-14449) Rewrite deadlock prevention for concurrent connection close

2015-09-18 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14449:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks for the review, Stephen.

> Rewrite deadlock prevention for concurrent connection close
> ---
>
> Key: HBASE-14449
> URL: https://issues.apache.org/jira/browse/HBASE-14449
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3
>
> Attachments: 14449-branch-1.0.txt, 14449-v1.txt
>
>
> The deadlock prevention approach used in HBASE-14241 introduces unnecessary 
> logic which is not intuitive.
> Depending on the value for config hbase.ipc.client.specificThreadForWriting, 
> there may or may not be CallSender threads running.
> The attached patch simplifies deadlock prevention by using a Set which 
> represents the Connections to be closed. Outside the synchronized 
> (connections) block, this Set is iterated where the Connections are closed.
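
The pattern described above (collect under the lock, close outside it) can be sketched as follows. SketchConnection and SketchConnectionPool are hypothetical stand-ins rather than the RpcClient classes touched by the patch, so this shows the idea and not the actual change.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for an RPC connection; not the HBase class. */
final class SketchConnection {
    final String id;
    private boolean closed;

    SketchConnection(String id) {
        this.id = id;
    }

    synchronized void close() {
        closed = true;
    }

    synchronized boolean isClosed() {
        return closed;
    }
}

final class SketchConnectionPool {
    private final Map<String, SketchConnection> connections = new HashMap<>();

    void add(SketchConnection c) {
        synchronized (connections) {
            connections.put(c.id, c);
        }
    }

    /** Detaches every connection while holding the lock, then closes them after releasing it. */
    int closeAll() {
        Set<SketchConnection> toClose = new HashSet<>();
        synchronized (connections) {
            toClose.addAll(connections.values());
            connections.clear();
        }
        // close() may take the connection's own monitor; doing it here means the
        // pool lock is never held at the same time, which avoids lock-order deadlocks.
        for (SketchConnection c : toClose) {
            c.close();
        }
        return toClose.size();
    }

    int size() {
        synchronized (connections) {
            return connections.size();
        }
    }
}
```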





[jira] [Comment Edited] (HBASE-14404) Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98

2015-09-18 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876786#comment-14876786
 ] 

Andrew Purtell edited comment on HBASE-14404 at 9/19/15 1:18 AM:
-

As a backport issue we are carrying back a change from later branches that 
alters reader and writer behavior unconditionally. I can make it conditional 
upon HBase settings changes (falling back to the HDFS level settings for 
dropping behind reads and writes, or simply not changing anything if our config 
doesn't say) but can't use the backport in that case. It has to be a different 
approach that looks at configuration where we set up the readers and writers, 
rather than once in CacheConfig. Would no longer be a backport, but instead 
something unique to 0.98. I don't think that's what we want. Since I have a +1 
to commit the earlier patch I'm going to drop the half-assed thing I put up a 
few minutes ago and commit the original backport patch shortly.
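
For illustration only, the conditional alternative mentioned above (override when HBase configuration says so, otherwise leave the lower-level setting alone) amounts to a tri-state lookup. The sketch below uses java.util.Properties as a stand-in for the HBase Configuration class, and DropBehindSetting is a hypothetical helper that is not part of any patch on this issue.

```java
import java.util.Optional;
import java.util.Properties;

final class DropBehindSetting {
    // Key name taken from the issue description.
    static final String KEY = "hbase.hfile.drop.behind.compaction";

    /** Returns empty when the key is absent, so the caller can leave the lower-level setting untouched. */
    static Optional<Boolean> read(Properties conf) {
        String raw = conf.getProperty(KEY);
        return raw == null ? Optional.empty() : Optional.of(Boolean.parseBoolean(raw.trim()));
    }

    /** Applies the override only if it is configured; otherwise keeps the filesystem-level default. */
    static boolean effective(Properties conf, boolean fsDefault) {
        return read(conf).orElse(fsDefault);
    }
}
```

The lookup would have to run wherever readers and writers are set up, which is the reason given above for this no longer being a straight backport.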


was (Author: apurtell):
As a backport issue we are carrying back a change from later branches that 
alter reader and writer behavior unconditionally. I can make it conditional 
upon HBase settings changes but can't use the backport. It has to be a 
different approach that looks at configuration where we set up the readers and 
writers, rather than once in CacheConfig. Would no longer be a backport, but 
instead something unique to 0.98. I don't think that's what we want. Since I 
have a +1 to commit the earlier patch I'm going to drop the half-assed thing I 
put up a few minutes ago and commit the original backport patch shortly.

> Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98
> ---
>
> Key: HBASE-14404
> URL: https://issues.apache.org/jira/browse/HBASE-14404
> Project: HBase
>  Issue Type: Task
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 0.98.15
>
> Attachments: HBASE-14404-0.98.patch
>
>
> HBASE-14098 adds a new configuration toggle - 
> "hbase.hfile.drop.behind.compaction" - which if set to "true" tells 
> compactions to drop pages from the OS blockcache after write.  It's on by 
> default where committed so far but a backport to 0.98 would default it to 
> off. (The backport would also retain compat methods to LimitedPrivate 
> interface StoreFileScanner.) What could make it a controversial change in 
> 0.98 is it changes the default setting of 
> 'hbase.regionserver.compaction.private.readers' from "false" to "true".  I 
> think it's fine, we use private readers in production. They're stable and do 
> not present perf issues.





[jira] [Updated] (HBASE-14453) HBaseAdmin#deleteTable should relocate META when cached location is stale

2015-09-18 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-14453:
---
Status: Patch Available  (was: Open)

Retry

> HBaseAdmin#deleteTable should relocate META when cached location is stale
> -
>
> Key: HBASE-14453
> URL: https://issues.apache.org/jira/browse/HBASE-14453
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.14
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 0.98.15
>
> Attachments: HBASE-14453-0.98.patch, HBASE-14453-0.98.patch
>
>
> After HBASE-14275, in HBaseAdmin#deleteTable, when using MetaReader to wait 
> until all regions are deleted, we won't attempt to relocate META should its 
> cached location be stale.





[jira] [Updated] (HBASE-14436) HTableDescriptor#addCoprocessor will always make RegionCoprocessorHost create new Configuration

2015-09-18 Thread Jianwei Cui (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jianwei Cui updated HBASE-14436:

Attachment: HBASE-14436-trunk-v2.patch

> HTableDescriptor#addCoprocessor will always make RegionCoprocessorHost create 
> new Configuration
> ---
>
> Key: HBASE-14436
> URL: https://issues.apache.org/jira/browse/HBASE-14436
> Project: HBase
>  Issue Type: Improvement
>  Components: Coprocessors
>Affects Versions: 1.2.1
>Reporter: Jianwei Cui
>Priority: Minor
> Attachments: HBASE-14436-trunk-v1.patch, HBASE-14436-trunk-v2.patch
>
>
> HTableDescriptor#addCoprocessor will set the coprocessor value in the 
> following format:
> {code}
>  public HTableDescriptor addCoprocessor(String className, Path jarFilePath,
>  int priority, final Map kvs)
>   throws IOException {
>   ...
>   String value = ((jarFilePath == null)? "" : jarFilePath.toString()) +
> "|" + className + "|" + Integer.toString(priority) + "|" +
> kvString.toString();
>   ...
> }
> {code}
> If 'jarFilePath' is null, the 'value' will always have the format 
> '|className|priority|' even if 'kvs' is null, which means there are no extra 
> arguments for the coprocessor. Then, on the server side, 
> RegionCoprocessorHost#getTableCoprocessorAttrsFromSchema will load the table 
> coprocessors as:
> {code}
>   static List getTableCoprocessorAttrsFromSchema(Configuration conf,
>       HTableDescriptor htd) {
>     ...
>     try {
>       // => cfgSpec will be '|' for the format '|className|priority|'
>       cfgSpec = matcher.group(4);
>     } catch (IndexOutOfBoundsException ex) {
>       // ignore
>     }
>     Configuration ourConf;
>     // => cfgSpec will be '|' for the format '|className|priority|'
>     if (cfgSpec != null) {
>       ourConf = new Configuration(false);
>       HBaseConfiguration.merge(ourConf, conf);
>     }
>     ...
>   }
> {code}
> The 'cfgSpec' will be '|' for a coprocessor formatted as 
> '|className|priority|', so a new Configuration is always created.
> In our production clusters, many tables have table-level coprocessors, 
> so the region server creates a new Configuration for each region of 
> such a table. This consumes a certain amount of memory when we have many 
> such regions.
> To fix the problem, we can make HTableDescriptor not append the '|' when 
> there are no extra arguments for the coprocessor, or check 'cfgSpec' more 
> strictly on the server side, which would also avoid creating new 
> Configurations for existing regions once they are reopened. Discussion and 
> suggestions are welcome.
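
Both options in the description can be sketched together. CoprocessorSpecSketch below is a hypothetical helper, not the HTableDescriptor or RegionCoprocessorHost code, and the "k=v,k=v" argument encoding is an assumption made for the example.

```java
import java.util.Map;

final class CoprocessorSpecSketch {
    /** Builds "jar|class|priority" and appends "|k=v,..." only when arguments exist. */
    static String toSpec(String jarPath, String className, int priority, Map<String, String> kvs) {
        StringBuilder sb = new StringBuilder();
        sb.append(jarPath == null ? "" : jarPath)
          .append('|').append(className)
          .append('|').append(priority);
        if (kvs != null && !kvs.isEmpty()) {
            StringBuilder args = new StringBuilder();
            for (Map.Entry<String, String> e : kvs.entrySet()) {
                if (args.length() > 0) {
                    args.append(',');
                }
                args.append(e.getKey()).append('=').append(e.getValue());
            }
            sb.append('|').append(args);
        }
        return sb.toString();
    }

    /** Stricter server-side check: true only if the spec carries at least one key=value argument. */
    static boolean hasArguments(String spec) {
        String[] parts = spec.split("\\|", -1); // limit -1 keeps trailing empty fields
        return parts.length >= 4 && parts[3].contains("=");
    }
}
```

With a check like hasArguments, a value already stored as '|className|priority|' would no longer trigger a new Configuration, which matters for tables created before any client-side fix.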





[jira] [Updated] (HBASE-13770) Programmatic JAAS configuration option for secure zookeeper may be broken

2015-09-18 Thread Maddineni Sukumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maddineni Sukumar updated HBASE-13770:
--
Attachment: HBASE-13770-0.98.patch

Trying my luck again :(


> Programmatic JAAS configuration option for secure zookeeper may be broken
> -
>
> Key: HBASE-13770
> URL: https://issues.apache.org/jira/browse/HBASE-13770
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 1.0.1, 1.1.0, 0.98.13, 1.2.0
>Reporter: Andrew Purtell
>Assignee: Maddineni Sukumar
> Fix For: 0.98.13
>
> Attachments: HBASE-13770-0.98.patch, HBASE-13770-0.98.patch, 
> HBASE-13770-v1.patch, HBASE-13770-v2-0.98.patch, HBASE-13770-v2.patch
>
>
> While verifying the patch fix for HBASE-13768 we were unable to successfully 
> test the programmatic JAAS configuration option for secure ZooKeeper 
> integration. Unclear if that was due to a bug or incorrect test configuration.
> Update the security section of the online book with clear instructions for 
> setting up the programmatic JAAS configuration option for secure ZooKeeper 
> integration.
> Verify it works.
> Fix as necessary.
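
As a rough illustration of what "programmatic JAAS configuration" means in general, the standard javax.security.auth.login API lets a process supply login entries from code instead of a jaas.conf file. The sketch below is not the HBase implementation; KeytabJaasConfiguration is a hypothetical class, and the section name, principal and keytab path passed to it are placeholders.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.AppConfigurationEntry.LoginModuleControlFlag;
import javax.security.auth.login.Configuration;

/** Serves one keytab-based Kerberos login entry for a single JAAS section name. */
final class KeytabJaasConfiguration extends Configuration {
    private final String sectionName;
    private final AppConfigurationEntry[] entries;

    KeytabJaasConfiguration(String sectionName, String principal, String keytabPath) {
        this.sectionName = sectionName;
        // Option names understood by the JDK's Krb5LoginModule.
        Map<String, String> options = new HashMap<>();
        options.put("useKeyTab", "true");
        options.put("storeKey", "true");
        options.put("useTicketCache", "false");
        options.put("keyTab", keytabPath);
        options.put("principal", principal);
        this.entries = new AppConfigurationEntry[] {
            new AppConfigurationEntry(
                "com.sun.security.auth.module.Krb5LoginModule",
                LoginModuleControlFlag.REQUIRED,
                options)
        };
    }

    @Override
    public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
        // Returning null tells JAAS there is no entry for this section.
        return sectionName.equals(name) ? entries : null;
    }
}
```

Such a configuration would be installed with Configuration.setConfiguration(...) before the ZooKeeper client logs in. Which section name the client looks up is something to confirm against the ZooKeeper documentation rather than take from this sketch.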





[jira] [Commented] (HBASE-14275) Backport to 0.98 HBASE-10785 Metas own location should be cached

2015-09-18 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14805157#comment-14805157
 ] 

Andrew Purtell commented on HBASE-14275:


Haven't had time to look. May just revert from the branch for now. Let me see 
if I can find some time to debug tomorrow. 

> Backport to 0.98 HBASE-10785 Metas own location should be cached
> 
>
> Key: HBASE-14275
> URL: https://issues.apache.org/jira/browse/HBASE-14275
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jerry He
>Assignee: Jerry He
> Fix For: 0.98.14
>
> Attachments: HBASE-14275-0.98.patch
>
>
> We've seen a similar problem reported on 0.98.
> It is a good improvement to have.
> This will cover HBASE-10785 and the later HBASE-11332.





[jira] [Created] (HBASE-14451) Move on to htrace-4.0.0 (from htrace-3.2.0)

2015-09-18 Thread stack (JIRA)
stack created HBASE-14451:
-

 Summary: Move on to htrace-4.0.0 (from htrace-3.2.0)
 Key: HBASE-14451
 URL: https://issues.apache.org/jira/browse/HBASE-14451
 Project: HBase
  Issue Type: Task
Reporter: stack
Assignee: stack


htrace-4.0.0 was just released with a new API. Get up on it.





[jira] [Created] (HBASE-14452) Allow enabling tracing from configuration

2015-09-18 Thread Nick Dimiduk (JIRA)
Nick Dimiduk created HBASE-14452:


 Summary: Allow enabling tracing from configuration
 Key: HBASE-14452
 URL: https://issues.apache.org/jira/browse/HBASE-14452
 Project: HBase
  Issue Type: Bug
  Components: Client, Operability
Reporter: Nick Dimiduk


Over on HDFS-8213 [~colinmccabe] convinced me that we should enable operators 
to trace HDFS requests independent of applications enabling the same. At the 
risk of adding a new, superset configuration, I think we should allow the same 
for HBase. Any objections to following HDFS's lead on this?





[jira] [Commented] (HBASE-14391) Empty regionserver WAL will never be deleted although the coresponding regionserver has been stale

2015-09-18 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876151#comment-14876151
 ] 

Jerry He commented on HBASE-14391:
--

Continuing from my last comment: in theory, it looks like deleting the logDir 
should be done at the WALFactory level, since it knows when all Providers are 
closed and will do a final cleanup of what is shared by the WALFactory.
But it does not make logical sense for the WALFactory to go down to the FSHLog 
level and know the FS layout in this case. 
There are still holes in the abstraction work for WALFactory and 
WALProvider.

Maybe [~busbey] has some insight or suggestions.

> Empty regionserver WAL will never be deleted although the coresponding 
> regionserver has been stale
> --
>
> Key: HBASE-14391
> URL: https://issues.apache.org/jira/browse/HBASE-14391
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 1.0.2
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
> Attachments: HBASE-14391-master-v3.patch, HBASE_14391_trunk_v1.patch, 
> HBASE_14391_trunk_v2.patch, WALs-leftover-dir.txt
>
>
> When I restarted an hbase cluster that held very little data, I found two 
> directories for one host with different timestamps, which indicates 
> that the old regionserver wal directory was not deleted.
> FSHLog#989
> {code}
>  @Override
>   public void close() throws IOException {
> shutdown();
> final FileStatus[] files = getFiles();
> if (null != files && 0 != files.length) {
>   for (FileStatus file : files) {
> Path p = getWALArchivePath(this.fullPathArchiveDir, file.getPath());
> // Tell our listeners that a log is going to be archived.
> if (!this.listeners.isEmpty()) {
>   for (WALActionsListener i : this.listeners) {
> i.preLogArchive(file.getPath(), p);
>   }
> }
> if (!FSUtils.renameAndSetModifyTime(fs, file.getPath(), p)) {
>   throw new IOException("Unable to rename " + file.getPath() + " to " 
> + p);
> }
> // Tell our listeners that a log was archived.
> if (!this.listeners.isEmpty()) {
>   for (WALActionsListener i : this.listeners) {
> i.postLogArchive(file.getPath(), p);
>   }
> }
>   }
>   LOG.debug("Moved " + files.length + " WAL file(s) to " +
> FSUtils.getPath(this.fullPathArchiveDir));
> }
> LOG.info("Closed WAL: " + toString());
>   }
> {code}
> When the regionserver is stopped, the hlog is archived, so the 
> wal/regionserver directory is empty in hdfs.
> MasterFileSystem#252
> {code}
> if (curLogFiles == null || curLogFiles.length == 0) {
> // Empty log folder. No recovery needed
> continue;
>   }
> {code}
> The regionserver directory will not be split, which makes sense. But it will 
> not be deleted either.
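
A minimal sketch of the missing cleanup step follows: when a log directory turns out to be empty, remove it instead of only skipping it. It uses java.nio.file on a local filesystem as a stand-in for the Hadoop FileSystem API. EmptyLogDirCleaner is a hypothetical helper, and where such a call belongs (MasterFileSystem, WALFactory or elsewhere) is the open question in this issue.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class EmptyLogDirCleaner {
    /** Deletes dir when it contains no entries; returns true if it was removed. */
    static boolean deleteIfEmpty(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            return false;
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            if (entries.iterator().hasNext()) {
                return false; // still holds WAL files, so it needs splitting rather than removal
            }
        }
        Files.delete(dir);
        return true;
    }
}
```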





[jira] [Commented] (HBASE-14449) Rewrite deadlock prevention due to concurrent connection close

2015-09-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876109#comment-14876109
 ] 

Hadoop QA commented on HBASE-14449:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12761124/14449-v1.txt
  against master branch at commit d81fba59cfab5ed368fe888ff811a7f5064b18cc.
  ATTACHMENT ID: 12761124

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0 2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s):   
at 
org.apache.hadoop.hdfs.server.namenode.TestAuditLogger.testWebHdfsAuditLogger(TestAuditLogger.java:126)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15631//testReport/
Release Findbugs (version 2.0.3)warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15631//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15631//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15631//console

This message is automatically generated.

> Rewrite deadlock prevention due to concurrent connection close
> --
>
> Key: HBASE-14449
> URL: https://issues.apache.org/jira/browse/HBASE-14449
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3
>
> Attachments: 14449-branch-1.0.txt, 14449-v1.txt
>
>
> The deadlock prevention approach used in HBASE-14241 introduces unnecessary 
> logic which is not intuitive.
> Depending on the value for config hbase.ipc.client.specificThreadForWriting, 
> there may or may not be CallSender threads running.
> The attached patch simplifies deadlock prevention by using a Set which 
> represents the Connections to be closed. Outside the synchronized 
> (connections) block, this Set is iterated where the Connections are closed.





[jira] [Commented] (HBASE-12911) Client-side metrics

2015-09-18 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876140#comment-14876140
 ] 

Nick Dimiduk commented on HBASE-12911:
--

Thoughts on the above increase in client dependencies? Is the extra weight 
worth it? Better to scrap hadoop metrics entirely and go with 
dropwizard alone?

> Client-side metrics
> ---
>
> Key: HBASE-12911
> URL: https://issues.apache.org/jira/browse/HBASE-12911
> Project: HBase
>  Issue Type: New Feature
>  Components: Client, Operability, Performance
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
> Fix For: 2.0.0, 1.3.0
>
> Attachments: 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 12911-0.98.00.patch, 
> 12911-branch-1.00.patch, am.jpg, client metrics RS-Master.jpg, client metrics 
> client.jpg, conn_agg.jpg, connection attributes.jpg, ltt.jpg, standalone.jpg
>
>
> There's very little visibility into the hbase client. Folks who care to add 
> some kind of metrics collection end up wrapping Table method invocations with 
> {{System.currentTimeMillis()}}. For a crude example of this, have a look at 
> what I did in {{PerformanceEvaluation}} for exposing request latencies up to 
> {{IntegrationTestRegionReplicaPerf}}. The client is quite complex, there's a 
> lot going on under the hood that is impossible to see right now without a 
> profiler. Being a crucial part of the performance of this distributed system, 
> we should have deeper visibility into the client's function.
> I'm not sure that wiring into the hadoop metrics system is the right choice 
> because the client is often embedded as a library in a user's application. We 
> should have integration with our metrics tools so that, e.g., a client 
> embedded in a coprocessor can report metrics through the usual RS channels, 
> or a client used in a MR job can do the same.
> I would propose an interface-based system with pluggable implementations. Out 
> of the box we'd include a hadoop-metrics implementation and one other, 
> possibly [dropwizard/metrics|https://github.com/dropwizard/metrics].
> Thoughts?
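
The interface-based proposal above could look roughly like this. ClientMetricsSink, InMemoryMetricsSink and TimedCalls are hypothetical names invented for the sketch and are not the API in the attached patches. A hadoop-metrics or dropwizard backend would be another implementation of the same interface.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical sink interface; each metrics backend would supply its own implementation. */
interface ClientMetricsSink {
    void recordLatency(String operation, long nanos);
}

/** Simple in-process implementation, useful when the client is embedded as a library. */
final class InMemoryMetricsSink implements ClientMetricsSink {
    private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();
    private final Map<String, LongAdder> totalNanos = new ConcurrentHashMap<>();

    @Override
    public void recordLatency(String operation, long nanos) {
        counts.computeIfAbsent(operation, k -> new LongAdder()).increment();
        totalNanos.computeIfAbsent(operation, k -> new LongAdder()).add(nanos);
    }

    long count(String operation) {
        LongAdder a = counts.get(operation);
        return a == null ? 0 : a.sum();
    }

    long totalNanos(String operation) {
        LongAdder a = totalNanos.get(operation);
        return a == null ? 0 : a.sum();
    }
}

final class TimedCalls {
    /** Runs the call and reports its latency to the sink, whether it succeeds or throws. */
    static <T> T timed(ClientMetricsSink sink, String operation, Callable<T> call) throws Exception {
        long start = System.nanoTime();
        try {
            return call.call();
        } finally {
            sink.recordLatency(operation, System.nanoTime() - start);
        }
    }
}
```

Instrumenting inside the client this way replaces the System.currentTimeMillis() wrappers that callers currently write around Table methods.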





[jira] [Commented] (HBASE-14452) Allow enabling tracing from configuration

2015-09-18 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876152#comment-14876152
 ] 

Nick Dimiduk commented on HBASE-14452:
--

FYI [~stack], [~clayb], [~enis], [~adriancole]

> Allow enabling tracing from configuration
> -
>
> Key: HBASE-14452
> URL: https://issues.apache.org/jira/browse/HBASE-14452
> Project: HBase
>  Issue Type: Bug
>  Components: Client, Operability
>Reporter: Nick Dimiduk
>
> Over on HDFS-8213 [~colinmccabe] convinced me that we should enable operators 
> to trace HDFS requests independent of applications enabling the same. At the 
> risk of adding a new, superset configuration, I think we should allow the 
> same for HBase. Any objections to following HDFS's lead on this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14449) Rewrite deadlock prevention due to concurrent connection close

2015-09-18 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876241#comment-14876241
 ] 

Ted Yu commented on HBASE-14449:


Ran the failed tests shown above locally; they passed.

TestMasterMetricsWrapper passes in the above QA run.

> Rewrite deadlock prevention due to concurrent connection close
> --
>
> Key: HBASE-14449
> URL: https://issues.apache.org/jira/browse/HBASE-14449
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3
>
> Attachments: 14449-branch-1.0.txt, 14449-v1.txt
>
>
> The deadlock prevention approach used in HBASE-14241 introduces unnecessary 
> logic which is not intuitive.
> Depending on the value for config hbase.ipc.client.specificThreadForWriting, 
> there may or may not be CallSender threads running.
> The attached patch simplifies deadlock prevention by using a Set which 
> represents the Connections to be closed. Outside the synchronized 
> (connections) block, this Set is iterated where the Connections are closed.





[jira] [Updated] (HBASE-14360) Client GC log path is not computed

2015-09-18 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HBASE-14360:
-
Release Note:   (was: HBASE-14360 Correct  replace in 
hbase-env.sh)

> Client GC log path is not computed
> --
>
> Key: HBASE-14360
> URL: https://issues.apache.org/jira/browse/HBASE-14360
> Project: HBase
>  Issue Type: Bug
>  Components: scripts
>Reporter: Nick Dimiduk
>Assignee: Gabor Liptak
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-14360.1.patch
>
>
> Looking for GC logs on the client side, I noticed the nice work from 
> HBASE-7817 that gives us the settings, just uncomment and run. Giving this a 
> try with ltt, looks like {{}} is not replaced according to the 
> comments. Seems this work is done by {{bin/hbase-daemon.sh}}, not 
> {{bin/hbase}}. The result is my ltt produced a file {{.0}} in 
> {{$(pwd)}}.





[jira] [Commented] (HBASE-13770) Programmatic JAAS configuration option for secure zookeeper may be broken

2015-09-18 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876059#comment-14876059
 ] 

Andrew Purtell commented on HBASE-13770:


bq. Should it be ZK_CLIENT_KERBEROS_PRINCIPAL ?
Yes, please fix s/PRINCIPLE/PRINCIPAL/g 

Otherwise lgtm


> Programmatic JAAS configuration option for secure zookeeper may be broken
> -
>
> Key: HBASE-13770
> URL: https://issues.apache.org/jira/browse/HBASE-13770
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 1.0.1, 1.1.0, 0.98.13, 1.2.0
>Reporter: Andrew Purtell
>Assignee: Maddineni Sukumar
> Fix For: 0.98.13
>
> Attachments: HBASE-13770-0.98.patch, HBASE-13770-0.98.patch, 
> HBASE-13770-v1.patch, HBASE-13770-v2-0.98.patch, HBASE-13770-v2.patch, 
> HBASE-13770-v3-0.98.patch
>
>
> While verifying the patch fix for HBASE-13768 we were unable to successfully 
> test the programmatic JAAS configuration option for secure ZooKeeper 
> integration. Unclear if that was due to a bug or incorrect test configuration.
> Update the security section of the online book with clear instructions for 
> setting up the programmatic JAAS configuration option for secure ZooKeeper 
> integration.
> Verify it works.
> Fix as necessary.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14391) Empty regionserver WAL will never be deleted although the coresponding regionserver has been stale

2015-09-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876075#comment-14876075
 ] 

Hadoop QA commented on HBASE-14391:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12761139/HBASE-14391-master-v3.patch
  against master branch at commit d81fba59cfab5ed368fe888ff811a7f5064b18cc.
  ATTACHMENT ID: 12761139

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0 2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildBase
  org.apache.hadoop.hbase.regionserver.TestHRegion
  org.apache.hadoop.hbase.regionserver.TestFailedAppendAndSync

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15633//testReport/
Release Findbugs (version 2.0.3)warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15633//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15633//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15633//console

This message is automatically generated.

> Empty regionserver WAL will never be deleted although the coresponding 
> regionserver has been stale
> --
>
> Key: HBASE-14391
> URL: https://issues.apache.org/jira/browse/HBASE-14391
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 1.0.2
>Reporter: Qianxi Zhang
>Assignee: Qianxi Zhang
> Attachments: HBASE-14391-master-v3.patch, HBASE_14391_trunk_v1.patch, 
> HBASE_14391_trunk_v2.patch, WALs-leftover-dir.txt
>
>
> When I restarted an hbase cluster that held very little data, I found two 
> directories for one host with different timestamps, which indicates 
> that the old regionserver wal directory was not deleted.
> FSHLog#989
> {code}
>   @Override
>   public void close() throws IOException {
>     shutdown();
>     final FileStatus[] files = getFiles();
>     if (null != files && 0 != files.length) {
>       for (FileStatus file : files) {
>         Path p = getWALArchivePath(this.fullPathArchiveDir, file.getPath());
>         // Tell our listeners that a log is going to be archived.
>         if (!this.listeners.isEmpty()) {
>           for (WALActionsListener i : this.listeners) {
>             i.preLogArchive(file.getPath(), p);
>           }
>         }
>         if (!FSUtils.renameAndSetModifyTime(fs, file.getPath(), p)) {
>           throw new IOException("Unable to rename " + file.getPath() + " to " + p);
>         }
>         // Tell our listeners that a log was archived.
>         if (!this.listeners.isEmpty()) {
>           for (WALActionsListener i : this.listeners) {
>             i.postLogArchive(file.getPath(), p);
>           }
>         }
>       }
>       LOG.debug("Moved " + files.length + " WAL file(s) to " +
>         FSUtils.getPath(this.fullPathArchiveDir));
>     }
>     LOG.info("Closed WAL: " + toString());
>   }
> {code}
> When the regionserver is stopped, its WAL files are archived, so the 
> wal/regionserver directory is left empty in HDFS.
> MasterFileSystem#252
> {code}
> if (curLogFiles == null || curLogFiles.length == 0) {
>   // Empty log folder. No recovery needed
>   continue;
> }

[jira] [Updated] (HBASE-14407) NotServingRegion: hbase region closed forever

2015-09-18 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14407:
---
Attachment: 14407-branch-1.2.patch

> NotServingRegion: hbase region closed forever
> -
>
> Key: HBASE-14407
> URL: https://issues.apache.org/jira/browse/HBASE-14407
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Affects Versions: 0.98.10, 1.2.0, 1.1.2, 1.3.0
>Reporter: Shuaifeng Zhou
>Assignee: Shuaifeng Zhou
>Priority: Critical
> Attachments: 14407-branch-1.2.patch, hbase-14407-0.98.patch, 
> hbase-14407-1.1.patch, hbase-14407-1.2.patch, hs4.log, master.log
>
>
> I found a situation that may leave a region closed forever. It happens 
> regularly on my cluster (version 0.98.10), and 1.1.2 has the same problem:
> 1. The master sends a region open request to the regionserver.
> 2. The RS starts a handler to open the region.
> 3. The RS returns a response to the master.
> 4. The master does not receive the response, or times out, and sends the 
> open region request again.
> 5. The RS has already opened the region.
> 6. The master runs processAlreadyOpenedRegion and updates the region state 
> to open in master memory.
> 7. The master receives the ZK "region opened" message (late for some 
> reason, e.g. the network) and triggers an update of the region state to 
> open, but finds that the region is already open: ERROR!
> 8. The master sends a close region request, and the region stays closed 
> forever.
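
The sequence in the quoted description can be replayed with a toy model. This is only a sketch under assumptions: ToyRegionState and ToyMaster are hypothetical names invented for illustration, not HBase classes, and the handler logic is a simplification of what the report describes (a late "opened" event that finds the region already open is treated as an error and answered with a close).

```java
// Toy model only: these names are hypothetical and are not HBase classes.
enum ToyRegionState { PENDING_OPEN, OPEN, CLOSED }

class ToyMaster {
  ToyRegionState state = ToyRegionState.PENDING_OPEN;

  // The retried open request finds the region already open on the RS,
  // so the master marks it OPEN in memory.
  void onAlreadyOpened() {
    state = ToyRegionState.OPEN;
  }

  // The ZK "opened" event. This toy handler expects PENDING_OPEN; any other
  // state is treated as an error and the region is closed.
  void onZkOpened() {
    if (state == ToyRegionState.PENDING_OPEN) {
      state = ToyRegionState.OPEN;
    } else {
      state = ToyRegionState.CLOSED;
    }
  }

  public static void main(String[] args) {
    ToyMaster raced = new ToyMaster();
    raced.onAlreadyOpened(); // retry path wins the race
    raced.onZkOpened();      // late ZK event arrives afterwards
    System.out.println("state after late ZK event: " + raced.state);
  }
}
```

In this model the same two events in the normal order (only the ZK event) leave the region OPEN, which is why the problem only shows up when the response to the first open request is lost or late.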



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14360) Client GC log path is not computed

2015-09-18 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876310#comment-14876310
 ] 

Nick Dimiduk commented on HBASE-14360:
--

So I applied the patch, uncommented a line in hbase-env.sh, and I'm seeing a 
file named {{.0}} in {{$(pwd)}}.

{noformat}
$ git diff conf/hbase-env.sh 
diff --git a/conf/hbase-env.sh b/conf/hbase-env.sh
index b7d00d1..21a45dc 100644
--- a/conf/hbase-env.sh
+++ b/conf/hbase-env.sh
@@ -70,7 +70,7 @@ export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -XX:PermSize=128m -XX:M
 
 # This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.
 # If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .
-# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc: -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"
+export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc: -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"
 
 # See the package documentation for org.apache.hadoop.hbase.io.hfile for other configurations
 # needed setting up off-heap block caching. 
{noformat}

The client I'm using is {{ltt}} via hbase shell, so {{./bin/hbase ltt 
-data_block_encoding FAST_DIFF -bloom ROWCOL -num_keys 10 -read 50 -write 
10:1000}}.

> Client GC log path is not computed
> --
>
> Key: HBASE-14360
> URL: https://issues.apache.org/jira/browse/HBASE-14360
> Project: HBase
>  Issue Type: Bug
>  Components: scripts
>Reporter: Nick Dimiduk
>Assignee: Gabor Liptak
>Priority: Minor
>  Labels: beginner
> Attachments: HBASE-14360.1.patch
>
>
> Looking for GC logs on the client side, I noticed the nice work from 
> HBASE-7817 that gives us the settings, just uncomment and run. Giving this a 
> try with ltt, looks like {{}} is not replaced according to the 
> comments. Seems this work is done by {{bin/hbase-daemon.sh}}, not 
> {{bin/hbase}}. The result is my ltt produced a file {{.0}} in 
> {{$(pwd)}}.





[jira] [Reopened] (HBASE-14230) replace reflection in FSHlog with HdfsDataOutputStream#getCurrentBlockReplication()

2015-09-18 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk reopened HBASE-14230:
--

Hi folks.

Running ltt vs master in standalone mode, after a couple lines of the test 
getting started my RS log is flooded with

{noformat}
2015-09-18 13:18:33,719 WARN  [sync.2] wal.FSHLog: DFSOutputStream.getNumCurrentReplicas failed because of java.lang.ClassCastException, continuing...
2015-09-18 13:18:33,720 WARN  [sync.4] wal.FSHLog: DFSOutputStream.getNumCurrentReplicas failed because of java.lang.ClassCastException, continuing...
2015-09-18 13:18:33,720 WARN  [sync.4] wal.FSHLog: DFSOutputStream.getNumCurrentReplicas failed because of java.lang.ClassCastException, continuing...
2015-09-18 13:18:33,720 WARN  [sync.1] wal.FSHLog: DFSOutputStream.getNumCurrentReplicas failed because of java.lang.ClassCastException, continuing...
2015-09-18 13:18:33,720 WARN  [sync.2] wal.FSHLog: DFSOutputStream.getNumCurrentReplicas failed because of java.lang.ClassCastException, continuing...
2015-09-18 13:18:33,721 WARN  [sync.3] wal.FSHLog: DFSOutputStream.getNumCurrentReplicas failed because of java.lang.ClassCastException, continuing...
{noformat}

I killed it before my drive filled :)

This change looks suspicious. I grabbed a jstack too, if that's helpful. I'm 
running on hadoop 2.6.0/apache via homebrew. Mind taking a look?

> replace reflection in FSHlog with 
> HdfsDataOutputStream#getCurrentBlockReplication()
> ---
>
> Key: HBASE-14230
> URL: https://issues.apache.org/jira/browse/HBASE-14230
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Heng Chen
>Assignee: Heng Chen
>Priority: Minor
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: HBASE-14230.patch
>
>
> As the TODO comment says, use 
> {{HdfsDataOutputStream#getCurrentBlockReplication}} and 
> {{DFSOutputStream.getPipeLine}} to replace reflection in FSHLog.





[jira] [Commented] (HBASE-14451) Move on to htrace-4.0.0 (from htrace-3.2.0)

2015-09-18 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876127#comment-14876127
 ] 

stack commented on HBASE-14451:
---

Currently, the 'fix' would be checking in htrace-3.2.0 to satisfy the hadoop 
dependency requirement. We are not 'excluding' the transitive hadoop include of 
3.2, but mvn will pick the latest, which in this patch's case is 4.0.0. I tried 
various means of pulling in two versions of an artifact and arrived at checking 
it in as the only sure way; see HTRACE-114. Perhaps someone else has a better idea.

> Move on to htrace-4.0.0 (from htrace-3.2.0)
> ---
>
> Key: HBASE-14451
> URL: https://issues.apache.org/jira/browse/HBASE-14451
> Project: HBase
>  Issue Type: Task
>Reporter: stack
>Assignee: stack
> Attachments: 14451.txt
>
>
> htrace-4.0.0 was just released with a new API. Get up on it.





[jira] [Commented] (HBASE-14449) Rewrite deadlock prevention due to concurrent connection close

2015-09-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876197#comment-14876197
 ] 

Hadoop QA commented on HBASE-14449:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12761150/14449-branch-1.0.txt
  against branch-1.0 branch at commit d81fba59cfab5ed368fe888ff811a7f5064b18cc.
  ATTACHMENT ID: 12761150

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0 2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.mapreduce.TestImportExport
  org.apache.hadoop.hbase.util.TestProcessBasedCluster

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15634//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15634//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15634//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15634//console

This message is automatically generated.

> Rewrite deadlock prevention due to concurrent connection close
> --
>
> Key: HBASE-14449
> URL: https://issues.apache.org/jira/browse/HBASE-14449
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3
>
> Attachments: 14449-branch-1.0.txt, 14449-v1.txt
>
>
> The deadlock prevention approach used in HBASE-14241 introduces unnecessary 
> logic which is not intuitive.
> Depending on the value for config hbase.ipc.client.specificThreadForWriting , 
> there may or may not be CallSender threads running.
> The attached patch simplifies deadlock prevention by using a Set which 
> represents the Connections to be closed. Outside the synchronized 
> (connections) block, this Set is iterated where the Connections are closed.
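
The approach in the quoted description (collect the Connections into a Set while holding the lock, then close them after releasing it) can be sketched roughly as below. This is a minimal illustration, not the attached patch: SketchConnection and DeferredCloseSketch are hypothetical stand-ins, and the real client code has more state to manage.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for an RPC client connection; not the HBase class.
class SketchConnection {
  private final String name;
  private boolean closed;

  SketchConnection(String name) {
    this.name = name;
  }

  // In a real client, close() may take other locks, which is why the sketch
  // calls it outside synchronized (connections).
  synchronized void close() {
    closed = true;
  }

  synchronized boolean isClosed() {
    return closed;
  }

  String name() {
    return name;
  }
}

class DeferredCloseSketch {
  private final Map<String, SketchConnection> connections = new HashMap<>();

  void add(SketchConnection c) {
    synchronized (connections) {
      connections.put(c.name(), c);
    }
  }

  int size() {
    synchronized (connections) {
      return connections.size();
    }
  }

  // Returns the number of connections that were closed.
  int closeAll() {
    Set<SketchConnection> toClose = new HashSet<>();
    synchronized (connections) {
      // Under the lock: only collect and unregister, never call close().
      toClose.addAll(connections.values());
      connections.clear();
    }
    // Outside the lock: close each collected connection.
    for (SketchConnection c : toClose) {
      c.close();
    }
    return toClose.size();
  }

  public static void main(String[] args) {
    DeferredCloseSketch pool = new DeferredCloseSketch();
    pool.add(new SketchConnection("rs1"));
    pool.add(new SketchConnection("rs2"));
    System.out.println("closed " + pool.closeAll() + " connection(s)");
  }
}
```

The point of the pattern is that no thread ever holds the connections lock while waiting on a lock taken inside close(), so the two locks cannot be acquired in conflicting orders.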





[jira] [Commented] (HBASE-14230) replace reflection in FSHlog with HdfsDataOutputStream#getCurrentBlockReplication()

2015-09-18 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876348#comment-14876348
 ] 

Nick Dimiduk commented on HBASE-14230:
--

Attaching a debugger, I see {{this.hdfs_out}} is an instance of 
{{FSDataOutputStream}}.

> replace reflection in FSHlog with 
> HdfsDataOutputStream#getCurrentBlockReplication()
> ---
>
> Key: HBASE-14230
> URL: https://issues.apache.org/jira/browse/HBASE-14230
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Heng Chen
>Assignee: Heng Chen
>Priority: Minor
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: HBASE-14230.patch
>
>
> As the TODO comment says, use 
> {{HdfsDataOutputStream#getCurrentBlockReplication}} and 
> {{DFSOutputStream.getPipeLine}} to replace reflection in FSHLog.





[jira] [Updated] (HBASE-14449) Rewrite deadlock prevention due to concurrent connection close

2015-09-18 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14449:
---
 Hadoop Flags: Reviewed
Fix Version/s: 1.1.3
   1.0.3
   1.3.0
   1.2.0
   2.0.0

> Rewrite deadlock prevention due to concurrent connection close
> --
>
> Key: HBASE-14449
> URL: https://issues.apache.org/jira/browse/HBASE-14449
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3
>
> Attachments: 14449-branch-1.0.txt, 14449-v1.txt
>
>
> The deadlock prevention approach used in HBASE-14241 introduces unnecessary 
> logic which is not intuitive.
> Depending on the value for config hbase.ipc.client.specificThreadForWriting , 
> there may or may not be CallSender threads running.
> The attached patch simplifies deadlock prevention by using a Set which 
> represents the Connections to be closed. Outside the synchronized 
> (connections) block, this Set is iterated where the Connections are closed.





[jira] [Commented] (HBASE-14383) Compaction improvements

2015-09-18 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876121#comment-14876121
 ] 

Andrew Purtell commented on HBASE-14383:


A couple of suggestions above propose setting maxlogs dynamically. I concur. 
There is then little (or no) point in leaving it as something a user can set to 
a fixed value. Remove it. One config knob down thanks to autotuning, a ton more 
to go. (smile)

> Compaction improvements
> ---
>
> Key: HBASE-14383
> URL: https://issues.apache.org/jira/browse/HBASE-14383
> Project: HBase
>  Issue Type: Improvement
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
> Fix For: 2.0.0
>
>
> Compactions are still a major issue in many production environments. The 
> general recommendation is to disable region splitting and major compactions 
> to reduce unpredictable IO/CPU spikes, especially during peak times, and to 
> run them manually during off-peak times. This still does not resolve the 
> issues completely.
> h3. Flush storms
> * WAL rolling events across a cluster can be highly correlated, and so are 
> the memstore flushes they cause and the minor compactions those trigger, 
> which can be promoted to major ones. These events are highly correlated in 
> time if there is a balanced write load on the regions in a table.
> * The same is true for memstore flushing due to the periodic memstore 
> flusher operation. 
> Both of the above may produce *flush storms*, which are as bad as 
> *compaction storms*. 
> What can be done here? We can spread these events over time by randomizing 
> (with jitter) several config options:
> # hbase.regionserver.optionalcacheflushinterval
> # hbase.regionserver.flush.per.changes
> # hbase.regionserver.maxlogs   
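
One way to read "randomizing (with jitter)" is to scale each configured value by a random factor per server, so that servers started together do not hit their thresholds at the same moment. The sketch below assumes a symmetric jitter fraction; JitterSketch, withJitter and the 0.2 fraction are illustrative choices, not existing HBase code or settings.

```java
import java.util.Random;

class JitterSketch {
  // Returns base scaled by a random factor in [1 - jitter, 1 + jitter).
  // jitter is a fraction, for example 0.2 for roughly +/-20%.
  static long withJitter(long base, double jitter, Random rng) {
    double factor = 1.0 + jitter * (2.0 * rng.nextDouble() - 1.0);
    return Math.round(base * factor);
  }

  public static void main(String[] args) {
    Random rng = new Random();
    long baseFlushIntervalMs = 3_600_000L; // illustrative base value
    for (int server = 0; server < 3; server++) {
      long effective = withJitter(baseFlushIntervalMs, 0.2, rng);
      System.out.println("server " + server + ": " + effective + " ms");
    }
  }
}
```

Each regionserver would compute its own effective value once (or once per period), which spreads periodic flushes and WAL rolls over a window instead of aligning them.
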
> h3. ExploringCompactionPolicy max compaction size
> One more optimization can be added to ExploringCompactionPolicy. The size of 
> a compaction can already be limited with the config parameter 
> hbase.hstore.compaction.max.size. It would be nice to have two separate 
> limits: one for peak and one for off-peak hours.
> h3. ExploringCompactionPolicy selection evaluation algorithm
> Too simple? The selection with more files always wins; if the number of 
> files is the same, the selection with the smaller size wins. 
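
The selection rule stated in the last quoted paragraph (more files wins, then smaller total size) can be written as a two-level comparator. This is a sketch of the rule as described in the issue text only; CandidateSelection and SelectionRuleSketch are hypothetical names, and the actual ExploringCompactionPolicy code may apply further conditions.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical summary of one candidate selection; not an HBase class.
class CandidateSelection {
  final int fileCount;   // number of store files in the selection
  final long totalSize;  // combined size of those files, in bytes

  CandidateSelection(int fileCount, long totalSize) {
    this.fileCount = fileCount;
    this.totalSize = totalSize;
  }
}

class SelectionRuleSketch {
  // Better selections sort first: more files wins, then smaller total size.
  static final Comparator<CandidateSelection> BETTER_FIRST =
      Comparator.comparingInt((CandidateSelection c) -> c.fileCount)
          .reversed()
          .thenComparingLong(c -> c.totalSize);

  // Returns the preferred selection, or null when there are no candidates.
  static CandidateSelection best(List<CandidateSelection> candidates) {
    return candidates.stream().min(BETTER_FIRST).orElse(null);
  }

  public static void main(String[] args) {
    CandidateSelection picked = best(Arrays.asList(
        new CandidateSelection(3, 100L),
        new CandidateSelection(4, 900L),
        new CandidateSelection(4, 500L)));
    System.out.println(picked.fileCount + " files, " + picked.totalSize + " bytes");
  }
}
```

Written this way it is easy to see what the description calls "too simple": total size only ever acts as a tie-breaker, so a selection with one extra file wins no matter how much larger it is.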





[jira] [Commented] (HBASE-14451) Move on to htrace-4.0.0 (from htrace-3.2.0)

2015-09-18 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876183#comment-14876183
 ] 

Sean Busbey commented on HBASE-14451:
-

Are we just trying to land this on master, or also branch-1? Do we know when 
hadoop will have a release that uses 4.0?

> Move on to htrace-4.0.0 (from htrace-3.2.0)
> ---
>
> Key: HBASE-14451
> URL: https://issues.apache.org/jira/browse/HBASE-14451
> Project: HBase
>  Issue Type: Task
>Reporter: stack
>Assignee: stack
> Attachments: 14451.txt
>
>
> htrace-4.0.0 was just released with a new API. Get up on it.





[jira] [Commented] (HBASE-14280) Bulk Upload from HA cluster to remote HA hbase cluster fails

2015-09-18 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876376#comment-14876376
 ] 

Ted Yu commented on HBASE-14280:


Checkstyle warnings are from the patch:
{code}



{code}

> Bulk Upload from HA cluster to remote HA hbase cluster fails
> 
>
> Key: HBASE-14280
> URL: https://issues.apache.org/jira/browse/HBASE-14280
> Project: HBase
>  Issue Type: Bug
>  Components: hadoop2, regionserver
>Affects Versions: 0.98.4
>Reporter: Ankit Singhal
>Assignee: Ankit Singhal
>Priority: Minor
>  Labels: easyfix, patch
> Attachments: HBASE-14280_v1.0.patch, HBASE-14280_v2.patch, 
> HBASE-14280_v3.patch
>
>
> Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(java.io.IOException): java.io.IOException: Wrong FS: hdfs://ha-aggregation-nameservice1/hbase_upload/82c89692-6e78-46ef-bbea-c9e825318bfe/A/131358d641c69d6c34b803c187b0, expected: hdfs://ha-hbase-nameservice1
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2113)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalArgumentException: Wrong FS: hdfs://ha-aggregation-nameservice1/hbase_upload/82c89692-6e78-46ef-bbea-c9e825318bfe/A/131358d641c69d6c34b803c187b0, expected: hdfs://ha-hbase-nameservice1
>   at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
>   at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193)
>   at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:105)
>   at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1136)
>   at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1132)
>   at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1132)
>   at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:414)
>   at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1423)
>   at org.apache.hadoop.hbase.regionserver.HRegionFileSystem.commitStoreFile(HRegionFileSystem.java:372)
>   at org.apache.hadoop.hbase.regionserver.HRegionFileSystem.bulkLoadStoreFile(HRegionFileSystem.java:451)
>   at org.apache.hadoop.hbase.regionserver.HStore.bulkLoadHFile(HStore.java:750)
>   at org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:4894)
>   at org.apache.hadoop.hbase.regionserver.HRegion.bulkLoadHFiles(HRegion.java:4799)
>   at org.apache.hadoop.hbase.regionserver.HRegionServer.bulkLoadHFile(HRegionServer.java:3377)
>   at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29996)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2078)
>   ... 4 more
>   at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1498)
>   at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1684)
>   at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1737)
>   at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.bulkLoadHFile(ClientProtos.java:29276)
>   at org.apache.hadoop.hbase.protobuf.ProtobufUtil.bulkLoadHFile(ProtobufUtil.java:1548)
>   ... 11 more
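
The "Wrong FS" message in the trace comes from a check that a path belongs to the file system it is being used with. A much simplified version of that kind of check, using only java.net.URI, might look like the sketch below. WrongFsSketch and belongsTo are hypothetical names, the staged file path is shortened to an example, and the real Hadoop check handles more cases (default ports, canonical names), so treat this as an approximation.

```java
import java.net.URI;
import java.util.Objects;

class WrongFsSketch {
  // True when the path has no scheme, or when its scheme and authority both
  // match the file system URI. A simplification of the real check.
  static boolean belongsTo(URI fsUri, URI path) {
    if (path.getScheme() == null) {
      return true;
    }
    return path.getScheme().equalsIgnoreCase(fsUri.getScheme())
        && Objects.equals(path.getAuthority(), fsUri.getAuthority());
  }

  public static void main(String[] args) {
    URI hbaseFs = URI.create("hdfs://ha-hbase-nameservice1");
    // Example path on the other nameservice; the file name is made up.
    URI staged = URI.create("hdfs://ha-aggregation-nameservice1/hbase_upload/example-hfile");
    System.out.println("staged file belongs to HBase FS: " + belongsTo(hbaseFs, staged));
  }
}
```

Both URIs use the hdfs scheme, so only the nameservice (the authority) differs, which matches the "expected: hdfs://ha-hbase-nameservice1" part of the message.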





[jira] [Comment Edited] (HBASE-14404) Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98

2015-09-18 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876816#comment-14876816
 ] 

Andrew Purtell edited comment on HBASE-14404 at 9/19/15 2:01 AM:
-

Ah, I found a compile nit with Hadoop 1.1. New patch in a bit.


was (Author: apurtell):
Ah, I found a compile nit with Hadoop 1.1. FSDataOutputStream doesn't have a 
setDropBehind method there, only FSDataInputStream does. New patch in a bit.

> Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98
> ---
>
> Key: HBASE-14404
> URL: https://issues.apache.org/jira/browse/HBASE-14404
> Project: HBase
>  Issue Type: Task
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 0.98.15
>
> Attachments: HBASE-14404-0.98.patch
>
>
> HBASE-14098 adds a new configuration toggle - 
> "hbase.hfile.drop.behind.compaction" - which if set to "true" tells 
> compactions to drop pages from the OS blockcache after write.  It's on by 
> default where committed so far but a backport to 0.98 would default it to 
> off. (The backport would also retain compat methods to LimitedPrivate 
> interface StoreFileScanner.) What could make it a controversial change in 
> 0.98 is it changes the default setting of 
> 'hbase.regionserver.compaction.private.readers' from "false" to "true".  I 
> think it's fine, we use private readers in production. They're stable and do 
> not present perf issues.





[jira] [Commented] (HBASE-14404) Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98

2015-09-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876864#comment-14876864
 ] 

Hadoop QA commented on HBASE-14404:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12761240/HBASE-14404-0.98.patch
  against 0.98 branch at commit b0f52332651ecbb8af11557df5af3189c7283212.
  ATTACHMENT ID: 12761240

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 20 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0 2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
23 warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.master.TestDistributedLogSplitting

 {color:red}-1 core zombie tests{color}.  There are 2 zombie test(s):   
at 
org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat2.testMRIncrementalLoadWithSplit(TestHFileOutputFormat2.java:366)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15639//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15639//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15639//artifact/patchprocess/checkstyle-aggregate.html

  Javadoc warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15639//artifact/patchprocess/patchJavadocWarnings.txt
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15639//console

This message is automatically generated.

> Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98
> ---
>
> Key: HBASE-14404
> URL: https://issues.apache.org/jira/browse/HBASE-14404
> Project: HBase
>  Issue Type: Task
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 0.98.15
>
> Attachments: HBASE-14404-0.98.patch, HBASE-14404-0.98.patch
>
>
> HBASE-14098 adds a new configuration toggle - 
> "hbase.hfile.drop.behind.compaction" - which if set to "true" tells 
> compactions to drop pages from the OS blockcache after write.  It's on by 
> default where committed so far but a backport to 0.98 would default it to 
> off. (The backport would also retain compat methods to LimitedPrivate 
> interface StoreFileScanner.) What could make it a controversial change in 
> 0.98 is it changes the default setting of 
> 'hbase.regionserver.compaction.private.readers' from "false" to "true".  I 
> think it's fine, we use private readers in production. They're stable and do 
> not present perf issues.





[jira] [Commented] (HBASE-14453) HBaseAdmin#deleteTable should relocate META when cached location is stale

2015-09-18 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876870#comment-14876870
 ] 

Jerry He commented on HBASE-14453:
--

+1

Ran TestAssignmentManagerOnCluster 20 times in a loop. All runs succeeded.
Without the patch, it failed almost half of the time.

> HBaseAdmin#deleteTable should relocate META when cached location is stale
> -
>
> Key: HBASE-14453
> URL: https://issues.apache.org/jira/browse/HBASE-14453
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.14
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 0.98.15
>
> Attachments: HBASE-14453-0.98.patch, HBASE-14453-0.98.patch
>
>
> After HBASE-14275, in HBaseAdmin#deleteTable, when using MetaReader to wait 
> until all regions are deleted, we won't attempt to relocate META should its 
> cached location be stale.





[jira] [Commented] (HBASE-14230) replace reflection in FSHlog with HdfsDataOutputStream#getCurrentBlockReplication()

2015-09-18 Thread Heng Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876874#comment-14876874
 ] 

Heng Chen commented on HBASE-14230:
---

{quote}
Running ltt vs master in standalone mode
{quote}

Hi [~ndimiduk], I don't know what {{ltt}} is. Could you tell me how to 
reproduce the problem locally? Thanks!

I have fixed it as below, but I want to test it first before uploading the patch.
{code}
  int getLogReplication() {
    try {
      if (this.hdfs_out instanceof HdfsDataOutputStream) {
        return ((HdfsDataOutputStream) this.hdfs_out).getCurrentBlockReplication();
      }
    } catch (IOException e) {
      LOG.warn("", e);
    }
    return 0;
  }
{code}
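
For anyone following along without a Hadoop classpath, the difference between the unchecked cast and the instanceof guard above can be reproduced with stand-in types. BaseStream and ReplicatedStream below are hypothetical placeholders for FSDataOutputStream and HdfsDataOutputStream; the sketch only illustrates the cast behaviour, not the real stream classes.

```java
// Stand-ins for the Hadoop stream types; only the subclass knows replication.
class BaseStream {}

class ReplicatedStream extends BaseStream {
  private final int replication;

  ReplicatedStream(int replication) {
    this.replication = replication;
  }

  int getCurrentBlockReplication() {
    return replication;
  }
}

class ReplicationGuardSketch {
  // Unchecked cast: throws ClassCastException for a plain BaseStream.
  static int unguarded(BaseStream out) {
    return ((ReplicatedStream) out).getCurrentBlockReplication();
  }

  // Guarded: returns 0 when the stream cannot report its replication.
  static int guarded(BaseStream out) {
    if (out instanceof ReplicatedStream) {
      return ((ReplicatedStream) out).getCurrentBlockReplication();
    }
    return 0;
  }

  public static void main(String[] args) {
    System.out.println("guarded, plain stream: " + guarded(new BaseStream()));
    System.out.println("guarded, replicated stream: " + guarded(new ReplicatedStream(3)));
  }
}
```

With the guard, a stream of the base type (as seen in standalone mode on a local file system) no longer throws on every sync; the caller just sees 0 and has to decide what that means.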

> replace reflection in FSHlog with 
> HdfsDataOutputStream#getCurrentBlockReplication()
> ---
>
> Key: HBASE-14230
> URL: https://issues.apache.org/jira/browse/HBASE-14230
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Heng Chen
>Assignee: Heng Chen
>Priority: Minor
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: HBASE-14230.patch
>
>
> As the TODO comment says, use 
> {{HdfsDataOutputStream#getCurrentBlockReplication}} and 
> {{DFSOutputStream.getPipeLine}} to replace reflection in FSHLog.





[jira] [Commented] (HBASE-14404) Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98

2015-09-18 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876816#comment-14876816
 ] 

Andrew Purtell commented on HBASE-14404:


Ah, I found a compile nit with Hadoop 1.1. FSDataOutputStream doesn't have a 
setDropBehind method there, only FSDataInputStream does. New patch in a bit.

> Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98
> ---
>
> Key: HBASE-14404
> URL: https://issues.apache.org/jira/browse/HBASE-14404
> Project: HBase
>  Issue Type: Task
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 0.98.15
>
> Attachments: HBASE-14404-0.98.patch
>
>
> HBASE-14098 adds a new configuration toggle - 
> "hbase.hfile.drop.behind.compaction" - which if set to "true" tells 
> compactions to drop pages from the OS blockcache after write.  It's on by 
> default where committed so far but a backport to 0.98 would default it to 
> off. (The backport would also retain compat methods to LimitedPrivate 
> interface StoreFileScanner.) What could make it a controversial change in 
> 0.98 is it changes the default setting of 
> 'hbase.regionserver.compaction.private.readers' from "false" to "true".  I 
> think it's fine, we use private readers in production. They're stable and do 
> not present perf issues.





[jira] [Updated] (HBASE-14404) Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98

2015-09-18 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-14404:
---
Attachment: HBASE-14404-0.98.patch

Updated patch uses reflection to call setDropBehind on the streams. 
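
As a rough illustration of calling setDropBehind through reflection, so that the code still compiles and runs where the method is absent, something like the following would work. This is a sketch with stand-in classes (NewStream, OldStream) and a hypothetical helper name (trySetDropBehind); it is not the code in the attached patch.

```java
import java.lang.reflect.Method;

// Stand-in for a stream type that has setDropBehind.
class NewStream {
  Boolean dropBehind;

  public void setDropBehind(Boolean value) {
    this.dropBehind = value;
  }
}

// Stand-in for a stream type from an older API without the method.
class OldStream {}

class DropBehindReflectionSketch {
  // Calls setDropBehind(Boolean) if the stream's class has it.
  // Returns true when the call was made, false when the method is missing.
  static boolean trySetDropBehind(Object stream, boolean value) {
    try {
      Method m = stream.getClass().getMethod("setDropBehind", Boolean.class);
      m.invoke(stream, value);
      return true;
    } catch (NoSuchMethodException e) {
      return false; // older API: silently skip the hint
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("setDropBehind failed", e);
    }
  }

  public static void main(String[] args) {
    System.out.println("new API: " + trySetDropBehind(new NewStream(), true));
    System.out.println("old API: " + trySetDropBehind(new OldStream(), true));
  }
}
```

Because drop-behind is only a caching hint, skipping it on older versions should not change correctness, only page cache behaviour.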

> Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98
> ---
>
> Key: HBASE-14404
> URL: https://issues.apache.org/jira/browse/HBASE-14404
> Project: HBase
>  Issue Type: Task
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 0.98.15
>
> Attachments: HBASE-14404-0.98.patch, HBASE-14404-0.98.patch
>
>
> HBASE-14098 adds a new configuration toggle - 
> "hbase.hfile.drop.behind.compaction" - which if set to "true" tells 
> compactions to drop pages from the OS blockcache after write.  It's on by 
> default where committed so far but a backport to 0.98 would default it to 
> off. (The backport would also retain compat methods to LimitedPrivate 
> interface StoreFileScanner.) What could make it a controversial change in 
> 0.98 is it changes the default setting of 
> 'hbase.regionserver.compaction.private.readers' from "false" to "true".  I 
> think it's fine, we use private readers in production. They're stable and do 
> not present perf issues.





[jira] [Commented] (HBASE-14404) Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98

2015-09-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876876#comment-14876876
 ] 

Hadoop QA commented on HBASE-14404:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12761249/HBASE-14404-0.98.patch
  against 0.98 branch at commit b0f52332651ecbb8af11557df5af3189c7283212.
  ATTACHMENT ID: 12761249

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 20 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0 2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
23 warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.regionserver.TestTags
   org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithDeletes
   org.apache.hadoop.hbase.master.TestDistributedLogSplitting
   org.apache.hadoop.hbase.regionserver.TestHRegion
   org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithCustomVisLabService
   org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithDefaultVisLabelService

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s): 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15640//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15640//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15640//artifact/patchprocess/checkstyle-aggregate.html

  Javadoc warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15640//artifact/patchprocess/patchJavadocWarnings.txt
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15640//console

This message is automatically generated.

> Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98
> ---
>
> Key: HBASE-14404
> URL: https://issues.apache.org/jira/browse/HBASE-14404
> Project: HBase
>  Issue Type: Task
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 0.98.15
>
> Attachments: HBASE-14404-0.98.patch, HBASE-14404-0.98.patch
>
>
> HBASE-14098 adds a new configuration toggle - 
> "hbase.hfile.drop.behind.compaction" - which if set to "true" tells 
> compactions to drop pages from the OS blockcache after write.  It's on by 
> default where committed so far but a backport to 0.98 would default it to 
> off. (The backport would also retain compat methods to LimitedPrivate 
> interface StoreFileScanner.) What could make it a controversial change in 
> 0.98 is it changes the default setting of 
> 'hbase.regionserver.compaction.private.readers' from "false" to "true".  I 
> think it's fine, we use private readers in production. They're stable and do 
> not present perf issues.





[jira] [Updated] (HBASE-14453) HBaseAdmin#deleteTable should relocate META when cached location is stale

2015-09-18 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-14453:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to 0.98. Thanks [~jerryhe] 

> HBaseAdmin#deleteTable should relocate META when cached location is stale
> -
>
> Key: HBASE-14453
> URL: https://issues.apache.org/jira/browse/HBASE-14453
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.14
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 0.98.15
>
> Attachments: HBASE-14453-0.98.patch, HBASE-14453-0.98.patch
>
>
> After HBASE-14275, in HBaseAdmin#deleteTable, when using MetaReader to wait 
> until all regions are deleted, we won't attempt to relocate META should its 
> cached location be stale.





[jira] [Updated] (HBASE-14404) Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98

2015-09-18 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-14404:
---
   Resolution: Invalid
 Assignee: (was: Andrew Purtell)
Fix Version/s: (was: 0.98.15)
   Status: Resolved  (was: Patch Available)

The test results are not good. Take TestTags for example:
{noformat}
Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 28.383 sec <<< FAILURE! - in org.apache.hadoop.hbase.regionserver.TestTags
testFlushAndCompactionwithCombinations(org.apache.hadoop.hbase.regionserver.TestTags)  Time elapsed: 3.899 sec  <<< FAILURE!
java.lang.AssertionError: expected:<1> but was:<0>
        at org.junit.Assert.fail(Assert.java:88)
        at org.junit.Assert.failNotEquals(Assert.java:743)
        at org.junit.Assert.assertEquals(Assert.java:118)
        at org.junit.Assert.assertEquals(Assert.java:555)
        at org.junit.Assert.assertEquals(Assert.java:542)
        at org.apache.hadoop.hbase.regionserver.TestTags.testFlushAndCompactionwithCombinations(TestTags.java:326)

Results :

Failed tests: 
  TestTags.testFlushAndCompactionwithCombinations:326 expected:<1> but was:<0>
{noformat}

This change triggers that failure (and the others). Checked the logs and there 
are no unexpected errors or warnings and no occurrences of the new debug log 
line "Unable to set drop behind on ..." to indicate anything amiss. Yet we 
don't read back the data we are expecting.

No interest in proceeding further, resolving as Invalid. 

> Backport HBASE-14098 (Allow dropping caches behind compactions) to 0.98
> ---
>
> Key: HBASE-14404
> URL: https://issues.apache.org/jira/browse/HBASE-14404
> Project: HBase
>  Issue Type: Task
>Reporter: Andrew Purtell
> Attachments: HBASE-14404-0.98.patch, HBASE-14404-0.98.patch
>
>
> HBASE-14098 adds a new configuration toggle - 
> "hbase.hfile.drop.behind.compaction" - which if set to "true" tells 
> compactions to drop pages from the OS blockcache after write.  It's on by 
> default where committed so far but a backport to 0.98 would default it to 
> off. (The backport would also retain compat methods to LimitedPrivate 
> interface StoreFileScanner.) What could make it a controversial change in 
> 0.98 is it changes the default setting of 
> 'hbase.regionserver.compaction.private.readers' from "false" to "true".  I 
> think it's fine, we use private readers in production. They're stable and do 
> not present perf issues.





[jira] [Commented] (HBASE-14453) HBaseAdmin#deleteTable should relocate META when cached location is stale

2015-09-18 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876824#comment-14876824
 ] 

Jerry He commented on HBASE-14453:
--

Patch looks good.
As you said,  getFirstMetaServerForTable() does nothing to refresh cached meta 
location.  This private method is not used anywhere else, and probably can be 
removed altogether.
The MetaScanner.metaScan(conf, connection, visitor, tableName) uses the usual 
HTable to scan meta. I would assume this route would do all the right things.  
Correct?

There are so many ways to do the same thing (get meta and scan meta).  I 
wouldn't be surprised if there is another place where it is misused.

> HBaseAdmin#deleteTable should relocate META when cached location is stale
> -
>
> Key: HBASE-14453
> URL: https://issues.apache.org/jira/browse/HBASE-14453
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.14
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 0.98.15
>
> Attachments: HBASE-14453-0.98.patch, HBASE-14453-0.98.patch
>
>
> After HBASE-14275, in HBaseAdmin#deleteTable, when using MetaReader to wait 
> until all regions are deleted, we won't attempt to relocate META should its 
> cached location be stale.





[jira] [Commented] (HBASE-14453) HBaseAdmin#deleteTable should relocate META when cached location is stale

2015-09-18 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876836#comment-14876836
 ] 

Andrew Purtell commented on HBASE-14453:


bq. I would assume this route would do all the right things. Correct?

That's correct. 

bq. I wouldn't be surprised if there is another place where it is misused.

Me neither but I checked the rest of HBaseAdmin.

bq. Patch looks good.

Will assume that's a +1; please let me know if otherwise.


> HBaseAdmin#deleteTable should relocate META when cached location is stale
> -
>
> Key: HBASE-14453
> URL: https://issues.apache.org/jira/browse/HBASE-14453
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.14
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 0.98.15
>
> Attachments: HBASE-14453-0.98.patch, HBASE-14453-0.98.patch
>
>
> After HBASE-14275, in HBaseAdmin#deleteTable, when using MetaReader to wait 
> until all regions are deleted, we won't attempt to relocate META should its 
> cached location be stale.





[jira] [Commented] (HBASE-14449) Rewrite deadlock prevention for concurrent connection close

2015-09-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876855#comment-14876855
 ] 

Hudson commented on HBASE-14449:


FAILURE: Integrated in HBase-1.2 #184 (See 
[https://builds.apache.org/job/HBase-1.2/184/])
HBASE-14449 Rewrite deadlock prevention for concurrent connection close (tedyu: 
rev 936693b923a1e700d4db564f7012d652a4d6daad)
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/RpcClientImpl.java


> Rewrite deadlock prevention for concurrent connection close
> ---
>
> Key: HBASE-14449
> URL: https://issues.apache.org/jira/browse/HBASE-14449
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3
>
> Attachments: 14449-branch-1.0.txt, 14449-v1.txt
>
>
> The deadlock prevention approach used in HBASE-14241 introduces unnecessary 
> logic which is not intuitive.
> Depending on the value for config hbase.ipc.client.specificThreadForWriting , 
> there may or may not be CallSender threads running.
> The attached patch simplifies deadlock prevention by using a Set which 
> represents the Connections to be closed. Outside the synchronized 
> (connections) block, this Set is iterated where the Connections are closed.
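
The collect-then-close pattern described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the attached patch: `ConnectionPoolSketch`, `Conn`, and `closeAll` are hypothetical names. The point is that `close()` may take its own locks, so it runs only after the lock on the connection map has been released.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; names are illustrative and not taken from RpcClientImpl.
final class ConnectionPoolSketch {

    /** Hypothetical connection whose close() acquires its own monitor. */
    static final class Conn {
        private boolean closed = false;

        synchronized void close() {
            closed = true;
        }

        synchronized boolean isClosed() {
            return closed;
        }
    }

    private final Map<String, Conn> connections = new HashMap<>();

    void add(String key, Conn conn) {
        synchronized (connections) {
            connections.put(key, conn);
        }
    }

    /** Closes every pooled connection and returns how many were closed. */
    int closeAll() {
        Set<Conn> toClose = new HashSet<>();
        synchronized (connections) {
            // Only collect while holding the map lock.
            toClose.addAll(connections.values());
            connections.clear();
        }
        // Close outside the synchronized block so that no connection lock
        // is ever acquired while the map lock is held.
        for (Conn conn : toClose) {
            conn.close();
        }
        return toClose.size();
    }

    public static void main(String[] args) {
        ConnectionPoolSketch pool = new ConnectionPoolSketch();
        pool.add("rs1", new Conn());
        pool.add("rs2", new Conn());
        System.out.println(pool.closeAll());
    }
}
```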





[jira] [Commented] (HBASE-14453) HBaseAdmin#deleteTable should relocate META when cached location is stale

2015-09-18 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876905#comment-14876905
 ] 

Jerry He commented on HBASE-14453:
--

Thanks for the nice detective work!

> HBaseAdmin#deleteTable should relocate META when cached location is stale
> -
>
> Key: HBASE-14453
> URL: https://issues.apache.org/jira/browse/HBASE-14453
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.14
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 0.98.15
>
> Attachments: HBASE-14453-0.98.patch, HBASE-14453-0.98.patch
>
>
> After HBASE-14275, in HBaseAdmin#deleteTable, when using MetaReader to wait 
> until all regions are deleted, we won't attempt to relocate META should its 
> cached location be stale.





[jira] [Commented] (HBASE-14370) Use separate thread for calling ZKPermissionWatcher#refreshNodes()

2015-09-18 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876396#comment-14876396
 ] 

Ted Yu commented on HBASE-14370:


Waiting for green branch-1.x build before integrating to those branches.

> Use separate thread for calling ZKPermissionWatcher#refreshNodes()
> --
>
> Key: HBASE-14370
> URL: https://issues.apache.org/jira/browse/HBASE-14370
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 2.0.0, 1.2.0, 1.3.0, 0.98.15, 1.0.3, 1.1.3
>
> Attachments: 14370-0.98-v10.txt, 14370-branch-1-v10.txt, 
> 14370-branch-1-v10.txt, 14370-v1.txt, 14370-v10.txt, 14370-v3.txt, 
> 14370-v5.txt, 14370-v7.txt, 14370-v8.txt, 14370-wait-nofity-v2.txt, 
> 14370-wait-nofity.txt, hbase-14370_v4.patch, test-acl3-branch-1.stack
>
>
> I came off a support case (0.98.0) where main zk thread was seen doing the 
> following:
> {code}
>   at org.apache.hadoop.hbase.security.access.ZKPermissionWatcher.refreshAuthManager(ZKPermissionWatcher.java:152)
>   at org.apache.hadoop.hbase.security.access.ZKPermissionWatcher.refreshNodes(ZKPermissionWatcher.java:135)
>   at org.apache.hadoop.hbase.security.access.ZKPermissionWatcher.nodeChildrenChanged(ZKPermissionWatcher.java:121)
>   at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:348)
>   at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
>   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
> {code}
> There were 62000 nodes under /acl due to lack of fix from HBASE-12635, 
> leading to slowness in table creation because zk notification for region 
> offline was blocked by the above.
> The attached patch separates refreshNodes() call into its own thread.
> Thanks to Enis and Devaraj for offline discussion.
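
One way to move a slow refresh off a shared event thread is sketched below. This is an illustration under stated assumptions, not the attached patch: `RefreshOffloadSketch` and its methods are hypothetical, and a single-thread `ExecutorService` is just one possible choice. The callback hands the work to the executor and returns at once, so later notifications are not stuck behind the refresh.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; not the ZKPermissionWatcher implementation.
final class RefreshOffloadSketch {

    // A single worker thread keeps refreshes ordered relative to each other.
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private final AtomicInteger refreshed = new AtomicInteger();

    /** Called on the event thread; queues the refresh and returns immediately. */
    Future<?> nodeChildrenChanged(final int nodeCount) {
        return executor.submit(() -> refreshNodes(nodeCount));
    }

    /** Stand-in for the expensive per-node refresh work. */
    private void refreshNodes(int nodeCount) {
        for (int i = 0; i < nodeCount; i++) {
            refreshed.incrementAndGet();
        }
    }

    int refreshedCount() {
        return refreshed.get();
    }

    /** Stops accepting work and waits briefly for queued refreshes to finish. */
    void close() {
        executor.shutdown();
        try {
            executor.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        RefreshOffloadSketch watcher = new RefreshOffloadSketch();
        watcher.nodeChildrenChanged(62000);
        watcher.close();
        System.out.println(watcher.refreshedCount());
    }
}
```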





[jira] [Commented] (HBASE-14230) replace reflection in FSHlog with HdfsDataOutputStream#getCurrentBlockReplication()

2015-09-18 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876428#comment-14876428
 ] 

Nick Dimiduk commented on HBASE-14230:
--

Same thing on branch-1 and branch-1.2, except with more useful details

{noformat}
2015-09-18 14:11:23,357 WARN  [sync.1] wal.FSHLog: 
DFSOutputStream.getNumCurrentReplicas failed because of 
java.lang.ClassCastException: org.apache.hadoop.fs.FSDataOutputStream cannot be 
cast to org.apache.hadoop.hdfs.client.HdfsDataOutputStream, continuing...
{noformat}
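
The exception above arises when the output stream is not the HDFS-specific subclass. A type check before the cast avoids it, as in the sketch below. This is only an illustration and not the fix adopted by the project: `BaseStream`, `HdfsStream`, and `currentReplicas` are hypothetical stand-ins for the real Hadoop classes, and the `-1` fallback value is an assumption.

```java
// Hypothetical sketch; the classes below stand in for the real Hadoop stream types.
final class ReplicationCheckSketch {

    /** Stand-in for a generic output stream, such as one backed by a local filesystem. */
    static class BaseStream {
    }

    /** Stand-in for the HDFS-specific stream that can report replication. */
    static class HdfsStream extends BaseStream {
        private final int replicas;

        HdfsStream(int replicas) {
            this.replicas = replicas;
        }

        int getCurrentBlockReplication() {
            return replicas;
        }
    }

    /** Returns the replica count, or -1 when the stream cannot report one. */
    static int currentReplicas(BaseStream out) {
        if (out instanceof HdfsStream) {
            return ((HdfsStream) out).getCurrentBlockReplication();
        }
        // Not an HDFS stream: skip the cast instead of failing.
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(currentReplicas(new HdfsStream(3)));
        System.out.println(currentReplicas(new BaseStream()));
    }
}
```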

FYI [~busbey].

> replace reflection in FSHlog with 
> HdfsDataOutputStream#getCurrentBlockReplication()
> ---
>
> Key: HBASE-14230
> URL: https://issues.apache.org/jira/browse/HBASE-14230
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Heng Chen
>Assignee: Heng Chen
>Priority: Minor
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: HBASE-14230.patch
>
>
> As the TODO comment says, use 
> {{HdfsDataOutputStream#getCurrentBlockReplication}} and 
> {{DFSOutputStream.getPipeLine}} to replace reflection in FSHLog.





[jira] [Created] (HBASE-14453) Relocate META when cached location is stale

2015-09-18 Thread Andrew Purtell (JIRA)
Andrew Purtell created HBASE-14453:
--

 Summary: Relocate META when cached location is stale
 Key: HBASE-14453
 URL: https://issues.apache.org/jira/browse/HBASE-14453
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.14
Reporter: Andrew Purtell
 Fix For: 0.98.15


After HBASE-14275, in HBaseAdmin#deleteTable, when using MetaReader to wait 
until all regions are deleted, we won't attempt to relocate META should its 
cached location be stale.





[jira] [Commented] (HBASE-14453) HBaseAdmin#deleteTable should relocate META when cached location is stale

2015-09-18 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876529#comment-14876529
 ] 

Andrew Purtell commented on HBASE-14453:


This is only an issue in 0.98. Fix coming soon.


> HBaseAdmin#deleteTable should relocate META when cached location is stale
> -
>
> Key: HBASE-14453
> URL: https://issues.apache.org/jira/browse/HBASE-14453
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.14
>Reporter: Andrew Purtell
> Fix For: 0.98.15
>
>
> After HBASE-14275, in HBaseAdmin#deleteTable, when using MetaReader to wait 
> until all regions are deleted, we won't attempt to relocate META should its 
> cached location be stale.





[jira] [Assigned] (HBASE-14453) HBaseAdmin#deleteTable should relocate META when cached location is stale

2015-09-18 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell reassigned HBASE-14453:
--

Assignee: Andrew Purtell

> HBaseAdmin#deleteTable should relocate META when cached location is stale
> -
>
> Key: HBASE-14453
> URL: https://issues.apache.org/jira/browse/HBASE-14453
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.14
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 0.98.15
>
>
> After HBASE-14275, in HBaseAdmin#deleteTable, when using MetaReader to wait 
> until all regions are deleted, we won't attempt to relocate META should its 
> cached location be stale.





[jira] [Commented] (HBASE-14212) Add IT test for procedure-v2-based namespace DDL

2015-09-18 Thread Stephen Yuan Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876420#comment-14876420
 ] 

Stephen Yuan Jiang commented on HBASE-14212:


Sorry, [~fengs] is the target Sophia.

> Add IT test for procedure-v2-based namespace DDL
> 
>
> Key: HBASE-14212
> URL: https://issues.apache.org/jira/browse/HBASE-14212
> Project: HBase
>  Issue Type: Sub-task
>  Components: proc-v2
>Affects Versions: 2.0.0, 1.3.0
>Reporter: Stephen Yuan Jiang
>Assignee: Stephen Yuan Jiang
>
> Integration test for proc-v2-based table DDLs was created in HBASE-12439 
> during the HBase 1.1 release.  With HBASE-13212, proc-v2-based namespace DDLs are 
> introduced.  We need to enhance the IT from HBASE-12429 to include namespace 
> DDLs.





[jira] [Commented] (HBASE-14230) replace reflection in FSHlog with HdfsDataOutputStream#getCurrentBlockReplication()

2015-09-18 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876453#comment-14876453
 ] 

Sean Busbey commented on HBASE-14230:
-

Does standalone mode use the local filesystem implementation instead of 
single-node HDFS?

> replace reflection in FSHlog with 
> HdfsDataOutputStream#getCurrentBlockReplication()
> ---
>
> Key: HBASE-14230
> URL: https://issues.apache.org/jira/browse/HBASE-14230
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Heng Chen
>Assignee: Heng Chen
>Priority: Minor
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: HBASE-14230.patch
>
>
> As the TODO comment says, use 
> {{HdfsDataOutputStream#getCurrentBlockReplication}} and 
> {{DFSOutputStream.getPipeLine}} to replace reflection in FSHLog.





[jira] [Commented] (HBASE-14230) replace reflection in FSHlog with HdfsDataOutputStream#getCurrentBlockReplication()

2015-09-18 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876468#comment-14876468
 ] 

Nick Dimiduk commented on HBASE-14230:
--

I believe so, yes.

> replace reflection in FSHlog with 
> HdfsDataOutputStream#getCurrentBlockReplication()
> ---
>
> Key: HBASE-14230
> URL: https://issues.apache.org/jira/browse/HBASE-14230
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Heng Chen
>Assignee: Heng Chen
>Priority: Minor
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: HBASE-14230.patch
>
>
> As the TODO comment says, use 
> {{HdfsDataOutputStream#getCurrentBlockReplication}} and 
> {{DFSOutputStream.getPipeLine}} to replace reflection in FSHLog.





[jira] [Commented] (HBASE-14230) replace reflection in FSHlog with HdfsDataOutputStream#getCurrentBlockReplication()

2015-09-18 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876483#comment-14876483
 ] 

Nick Dimiduk commented on HBASE-14230:
--

I pushed revert commits to all 3 branches until a correct fix can be sorted out. 
Had to also reintroduce the NO_ARGS instance that was removed in HBASE-14401.

> replace reflection in FSHlog with 
> HdfsDataOutputStream#getCurrentBlockReplication()
> ---
>
> Key: HBASE-14230
> URL: https://issues.apache.org/jira/browse/HBASE-14230
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Heng Chen
>Assignee: Heng Chen
>Priority: Minor
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: HBASE-14230.patch
>
>
> As the TODO comment says, use 
> {{HdfsDataOutputStream#getCurrentBlockReplication}} and 
> {{DFSOutputStream.getPipeLine}} to replace reflection in FSHLog.





[jira] [Resolved] (HBASE-14275) Backport to 0.98 HBASE-10785 Metas own location should be cached

2015-09-18 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-14275.

Resolution: Fixed

Moving to HBASE-14453

> Backport to 0.98 HBASE-10785 Metas own location should be cached
> 
>
> Key: HBASE-14275
> URL: https://issues.apache.org/jira/browse/HBASE-14275
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jerry He
>Assignee: Jerry He
> Fix For: 0.98.14
>
> Attachments: HBASE-14275-0.98.patch
>
>
> We've seen a similar problem reported on 0.98.
> It is a good improvement to have.
> This will cover HBASE-10785 and the later HBASE-11332.





[jira] [Updated] (HBASE-14453) HBaseAdmin#deleteTable should relocate META when cached location is stale

2015-09-18 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-14453:
---
Summary: HBaseAdmin#deleteTable should relocate META when cached location 
is stale  (was: Relocate META when cached location is stale)

> HBaseAdmin#deleteTable should relocate META when cached location is stale
> -
>
> Key: HBASE-14453
> URL: https://issues.apache.org/jira/browse/HBASE-14453
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.14
>Reporter: Andrew Purtell
> Fix For: 0.98.15
>
>
> After HBASE-14275, in HBaseAdmin#deleteTable, when using MetaReader to wait 
> until all regions are deleted, we won't attempt to relocate META should its 
> cached location be stale.





[jira] [Assigned] (HBASE-14443) Add request parameter to the TooSlow/TooLarge warn message of RpcServer

2015-09-18 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-14443:
--

Assignee: Jianwei Cui

> Add request parameter to the TooSlow/TooLarge warn message of RpcServer
> ---
>
> Key: HBASE-14443
> URL: https://issues.apache.org/jira/browse/HBASE-14443
> Project: HBase
>  Issue Type: Improvement
>  Components: Operability, rpc
>Affects Versions: 1.2.1
>Reporter: Jianwei Cui
>Assignee: Jianwei Cui
>Priority: Minor
> Attachments: HBASE-14443-trunk-v1.patch
>
>
> The RpcServer will log a warn message for TooSlow or TooLarge request as:
> {code}
> logResponse(new Object[]{param},
> md.getName(), md.getName() + "(" + param.getClass().getName() + 
> ")",
> (tooLarge ? "TooLarge" : "TooSlow"),
> status.getClient(), startTime, processingTime, qTime,
> responseSize);
> {code}
> The RpcServer#logResponse will create the warn message as:
> {code}
> if (params.length == 2 && server instanceof HRegionServer &&
> params[0] instanceof byte[] &&
> params[1] instanceof Operation) {
>   ...
>   responseInfo.putAll(((Operation) params[1]).toMap());
>   ...
> } else if (params.length == 1 && server instanceof HRegionServer &&
> params[0] instanceof Operation) {
>   ...
>   responseInfo.putAll(((Operation) params[0]).toMap());
>   ...
> } else {
>   ...
> }
> {code}
> Because the parameter is always a protobuf message, not an instance of 
> Operation, the request parameter is never added to the warn message. The 
> parameter helps diagnose the problem; for example, knowing the 
> startRow/endRow is useful for a TooSlow scan. To improve the warn message, we 
> can convert the protobuf request message to the corresponding Operation 
> subclass object with ProtobufUtil, so that it can be added to the warn message. 
> Suggestions and discussion are welcome.





[jira] [Commented] (HBASE-14407) NotServingRegion: hbase region closed forever

2015-09-18 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876626#comment-14876626
 ] 

Ted Yu commented on HBASE-14407:


lgtm

> NotServingRegion: hbase region closed forever
> -
>
> Key: HBASE-14407
> URL: https://issues.apache.org/jira/browse/HBASE-14407
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Affects Versions: 0.98.10, 1.2.0, 1.1.2, 1.3.0
>Reporter: Shuaifeng Zhou
>Assignee: Shuaifeng Zhou
>Priority: Critical
> Attachments: 14407-branch-1.2.patch, hbase-14407-0.98.patch, 
> hbase-14407-1.1.patch, hbase-14407-1.2.patch, hs4.log, master.log
>
>
> I found a situation that may cause a region to stay closed forever. It 
> happens often on my cluster (version 0.98.10), but 1.1.2 also has the 
> problem:
> 1, master sends region open to the regionserver
> 2, rs starts a handler to open the region
> 3, rs returns a response to the master
> 4, master does not receive the response, or times out, and sends region open again
> 5, rs has already opened the region
> 6, master runs processAlreadyOpenedRegion and updates the region state to open in 
> master memory
> 7, master receives the zk message that the region opened (late for some reason, e.g. 
> the network), and triggers an update of the region state to open, but finds that the 
> region is already opened, ERROR!
> 8, master sends close region, and the region stays closed forever.





[jira] [Commented] (HBASE-14452) Allow enabling tracing from configuration

2015-09-18 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876650#comment-14876650
 ] 

Enis Soztutar commented on HBASE-14452:
---

Sounds good. There are cases where you cannot instrument each and every client 
with tracing. 

> Allow enabling tracing from configuration
> -
>
> Key: HBASE-14452
> URL: https://issues.apache.org/jira/browse/HBASE-14452
> Project: HBase
>  Issue Type: Bug
>  Components: Client, Operability
>Reporter: Nick Dimiduk
>
> Over on HDFS-8213 [~colinmccabe] convinced me that we should enable operators 
> to trace HDFS requests independent of applications enabling the same. At the 
> risk of adding a new, superset configuration, I think we should allow the 
> same for HBase. Any objections to following HDFS's lead on this?





[jira] [Updated] (HBASE-14443) Add request parameter to the TooSlow/TooLarge warn message of RpcServer

2015-09-18 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HBASE-14443:
-
Component/s: Operability

> Add request parameter to the TooSlow/TooLarge warn message of RpcServer
> ---
>
> Key: HBASE-14443
> URL: https://issues.apache.org/jira/browse/HBASE-14443
> Project: HBase
>  Issue Type: Improvement
>  Components: Operability, rpc
>Affects Versions: 1.2.1
>Reporter: Jianwei Cui
>Priority: Minor
> Attachments: HBASE-14443-trunk-v1.patch
>
>
> The RpcServer will log a warn message for TooSlow or TooLarge request as:
> {code}
> logResponse(new Object[]{param},
> md.getName(), md.getName() + "(" + param.getClass().getName() + 
> ")",
> (tooLarge ? "TooLarge" : "TooSlow"),
> status.getClient(), startTime, processingTime, qTime,
> responseSize);
> {code}
> The RpcServer#logResponse will create the warn message as:
> {code}
> if (params.length == 2 && server instanceof HRegionServer &&
> params[0] instanceof byte[] &&
> params[1] instanceof Operation) {
>   ...
>   responseInfo.putAll(((Operation) params[1]).toMap());
>   ...
> } else if (params.length == 1 && server instanceof HRegionServer &&
> params[0] instanceof Operation) {
>   ...
>   responseInfo.putAll(((Operation) params[0]).toMap());
>   ...
> } else {
>   ...
> }
> {code}
> Because the parameter is always a protobuf message, not an instance of 
> Operation, the request parameter is never added to the warn message. The 
> parameter helps diagnose the problem; for example, knowing the 
> startRow/endRow is useful for a TooSlow scan. To improve the warn message, we 
> can convert the protobuf request message to the corresponding Operation 
> subclass object with ProtobufUtil, so that it can be added to the warn message. 
> Suggestions and discussion are welcome.





[jira] [Commented] (HBASE-14443) Add request parameter to the TooSlow/TooLarge warn message of RpcServer

2015-09-18 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876505#comment-14876505
 ] 

Nick Dimiduk commented on HBASE-14443:
--

Those are sample log lines from your testing runs?

Patch looks alright on the face of it. This looks like something that will 
easily suffer bit-rot because there's all this special code per operation type. 
Any way we can reflect- or automate-away some of those if/switch statements to 
keep this utility working long into the future?
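
One possible shape for that, sketched under stated assumptions: replace the per-type if/switch chain with a lookup table from request class to a converter function. Everything here is hypothetical (`ParamDescriberSketch`, `ScanRequest`, `register`, `describe`); the real request types are protobuf messages and the real conversion would go through whatever the patch uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch; ScanRequest stands in for a real protobuf request type.
final class ParamDescriberSketch {

    /** Stand-in request type carrying the fields worth logging. */
    static final class ScanRequest {
        final String startRow;
        final String stopRow;

        ScanRequest(String startRow, String stopRow) {
            this.startRow = startRow;
            this.stopRow = stopRow;
        }
    }

    // Request class -> function that extracts loggable fields.
    private final Map<Class<?>, Function<Object, Map<String, Object>>> converters =
        new LinkedHashMap<>();

    /** Registers a converter; supporting a new request type is one call here. */
    <T> void register(Class<T> type, Function<T, Map<String, Object>> fn) {
        converters.put(type, obj -> fn.apply(type.cast(obj)));
    }

    /** Describes a request, falling back to its class name when unregistered. */
    Map<String, Object> describe(Object param) {
        Function<Object, Map<String, Object>> fn = converters.get(param.getClass());
        if (fn == null) {
            Map<String, Object> fallback = new LinkedHashMap<>();
            fallback.put("param", param.getClass().getSimpleName());
            return fallback;
        }
        return fn.apply(param);
    }

    /** Builds a describer with the sample converter registered. */
    static ParamDescriberSketch withDefaults() {
        ParamDescriberSketch describer = new ParamDescriberSketch();
        describer.register(ScanRequest.class, r -> {
            Map<String, Object> info = new LinkedHashMap<>();
            info.put("startRow", r.startRow);
            info.put("stopRow", r.stopRow);
            return info;
        });
        return describer;
    }

    public static void main(String[] args) {
        ParamDescriberSketch describer = withDefaults();
        System.out.println(describer.describe(new ScanRequest("a", "z")));
        System.out.println(describer.describe(Integer.valueOf(1)));
    }
}
```

With a table like this, an unregistered request type degrades to logging only its class name rather than silently dropping out of an if/else chain.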

Nice work [~cuijianwei].

> Add request parameter to the TooSlow/TooLarge warn message of RpcServer
> ---
>
> Key: HBASE-14443
> URL: https://issues.apache.org/jira/browse/HBASE-14443
> Project: HBase
>  Issue Type: Improvement
>  Components: Operability, rpc
>Affects Versions: 1.2.1
>Reporter: Jianwei Cui
>Priority: Minor
> Attachments: HBASE-14443-trunk-v1.patch
>
>
> The RpcServer will log a warn message for TooSlow or TooLarge request as:
> {code}
> logResponse(new Object[]{param},
> md.getName(), md.getName() + "(" + param.getClass().getName() + 
> ")",
> (tooLarge ? "TooLarge" : "TooSlow"),
> status.getClient(), startTime, processingTime, qTime,
> responseSize);
> {code}
> The RpcServer#logResponse will create the warn message as:
> {code}
> if (params.length == 2 && server instanceof HRegionServer &&
> params[0] instanceof byte[] &&
> params[1] instanceof Operation) {
>   ...
>   responseInfo.putAll(((Operation) params[1]).toMap());
>   ...
> } else if (params.length == 1 && server instanceof HRegionServer &&
> params[0] instanceof Operation) {
>   ...
>   responseInfo.putAll(((Operation) params[0]).toMap());
>   ...
> } else {
>   ...
> }
> {code}
> Because the parameter is always a protobuf message, not an instance of
> Operation, the request parameter is never added to the warn message. The
> parameter helps diagnose the problem; for example, knowing the
> startRow/endRow is useful for a TooSlow scan. To improve the warn message, we
> can convert the protobuf request message to the corresponding Operation
> subclass object with ProtobufUtil, so that it can be added to the warn message.
> Suggestions and discussion are welcome.





[jira] [Updated] (HBASE-14453) HBaseAdmin#deleteTable should relocate META when cached location is stale

2015-09-18 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-14453:
---
Attachment: HBASE-14453-0.98.patch

Rebased on latest 0.98. 

> HBaseAdmin#deleteTable should relocate META when cached location is stale
> -
>
> Key: HBASE-14453
> URL: https://issues.apache.org/jira/browse/HBASE-14453
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.14
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 0.98.15
>
> Attachments: HBASE-14453-0.98.patch, HBASE-14453-0.98.patch
>
>
> After HBASE-14275, in HBaseAdmin#deleteTable, when using MetaReader to wait 
> until all regions are deleted, we won't attempt to relocate META should its 
> cached location be stale.





[jira] [Commented] (HBASE-14451) Move on to htrace-4.0.0 (from htrace-3.2.0)

2015-09-18 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876610#comment-14876610
 ] 

stack commented on HBASE-14451:
---

On 1.2 hopefully (If the RM will have it -- smile).

Fix for above is a new htrace version, 4.0.1, where 4.0.0 has a different 
artifactId. Being rolled now. No clashing going forward.

First hadoop will be 2.8.

> Move on to htrace-4.0.0 (from htrace-3.2.0)
> ---
>
> Key: HBASE-14451
> URL: https://issues.apache.org/jira/browse/HBASE-14451
> Project: HBase
>  Issue Type: Task
>Reporter: stack
>Assignee: stack
> Attachments: 14451.txt
>
>
> htrace-4.0.0 was just released with a new API. Get up on it.





[jira] [Commented] (HBASE-14212) Add IT test for procedure-v2-based namespace DDL

2015-09-18 Thread Stephen Yuan Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876417#comment-14876417
 ] 

Stephen Yuan Jiang commented on HBASE-14212:


[~Sophia] and [~enis], the patch can be reviewed in 
https://reviews.apache.org/r/38509/

This patch extends the IntegrationTestDDLMasterFailover IT test to test proc-V2 
based namespace DDL.

Note: the goal is to test the stability of namespace DDL under the Proc-V2 
implementation.  I did not create tables under the namespace to test 
concurrent table/namespace DDL (today it is broken, because we don’t have a 
namespace/table lock hierarchy), as this is NOT the focus of the proc-v2 
implementation.

> Add IT test for procedure-v2-based namespace DDL
> 
>
> Key: HBASE-14212
> URL: https://issues.apache.org/jira/browse/HBASE-14212
> Project: HBase
>  Issue Type: Sub-task
>  Components: proc-v2
>Affects Versions: 2.0.0, 1.3.0
>Reporter: Stephen Yuan Jiang
>Assignee: Stephen Yuan Jiang
>
> An integration test for proc-v2-based table DDLs was created in HBASE-12439 
> during the HBase 1.1 release.  With HBASE-13212, proc-v2-based namespace DDLs 
> were introduced.  We need to enhance the IT from HBASE-12429 to include 
> namespace DDLs.





[jira] [Commented] (HBASE-14275) Backport to 0.98 HBASE-10785 Metas own location should be cached

2015-09-18 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876415#comment-14876415
 ] 

Andrew Purtell commented on HBASE-14275:


Go to this commit in 0.98, run TestAssignmentManagerOnCluster a few times, we 
will eventually get stuck cleaning up at the end of 
testSSHWhenDisablingTableRegionsInOpeningOrPendingOpenState.

The deleteTable at line 1581 of HTU called from line 678 of the test gets 
stuck. This only happens when above in the test, at line 639, META is moved 
when colocated with the test table on the victim regionserver. If I add this:
{noformat}
@@ -645,6 +645,8 @@ public class TestAssignmentManagerOnCluster {
 .move(HRegionInfo.FIRST_META_REGIONINFO.getEncodedNameAsBytes(),
   Bytes.toBytes(rs.getServerName().getServerName()));
 am.waitForAssignment(HRegionInfo.FIRST_META_REGIONINFO);
+// Drop the cached location for META
+TEST_UTIL.getHBaseAdmin().getConnection().clearCaches(metaServerName);
   }
   
   am.regionOffline(hri);
{noformat}
then the test passes reliably.

Looks like when using MetaReader to wait until all regions are deleted, 
deleteTable won't attempt to relocate META should its cached location be stale. 
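
The behavior being asked for (drop the stale cached location, relocate, retry) can be illustrated in isolation. This is a minimal sketch under assumptions: `CachedLocator` and `callWithRelocate` are hypothetical names, not HBase APIs, and real code would catch a specific "wrong server" exception rather than `RuntimeException`.

```java
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical sketch; none of these types exist in HBase.
final class CachedLocator<L> {
  private final Supplier<L> lookup; // authoritative (slow) lookup of the current location
  private L cached;

  CachedLocator(Supplier<L> lookup) {
    this.lookup = lookup;
  }

  // Return the cached location, filling the cache on first use.
  synchronized L get() {
    if (cached == null) {
      cached = lookup.get();
    }
    return cached;
  }

  // Forget the cached location so the next get() relocates.
  synchronized void invalidate() {
    cached = null;
  }

  // Run the call against the cached location; if it fails, drop the cache,
  // relocate, and retry exactly once.
  <R> R callWithRelocate(Function<L, R> call) {
    try {
      return call.apply(get());
    } catch (RuntimeException possiblyStale) {
      invalidate();
      return call.apply(get());
    }
  }
}
```

A single retry keeps the sketch short; a caller that polls until all regions are deleted would more likely invalidate on each failed attempt within its existing wait loop.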

> Backport to 0.98 HBASE-10785 Metas own location should be cached
> 
>
> Key: HBASE-14275
> URL: https://issues.apache.org/jira/browse/HBASE-14275
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jerry He
>Assignee: Jerry He
> Fix For: 0.98.14
>
> Attachments: HBASE-14275-0.98.patch
>
>
> We've seen a similar problem reported on 0.98.
> It is a good improvement to have.
> This will cover HBASE-10785 and the later HBASE-11332.





[jira] [Commented] (HBASE-14407) NotServingRegion: hbase region closed forever

2015-09-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876472#comment-14876472
 ] 

Hadoop QA commented on HBASE-14407:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12761165/14407-branch-1.2.patch
  against branch-1.2 branch at commit d81fba59cfab5ed368fe888ff811a7f5064b18cc.
  ATTACHMENT ID: 12761165

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0 2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

 {color:red}-1 core zombie tests{color}.  There are 6 zombie test(s):   
at 
org.apache.hadoop.hbase.security.access.TestAccessController2.testACLTableAccess(TestAccessController2.java:255)
at 
org.apache.hadoop.hbase.security.access.TestScanEarlyTermination.testEarlyScanTermination(TestScanEarlyTermination.java:243)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15635//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15635//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15635//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15635//console

This message is automatically generated.

> NotServingRegion: hbase region closed forever
> -
>
> Key: HBASE-14407
> URL: https://issues.apache.org/jira/browse/HBASE-14407
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Affects Versions: 0.98.10, 1.2.0, 1.1.2, 1.3.0
>Reporter: Shuaifeng Zhou
>Assignee: Shuaifeng Zhou
>Priority: Critical
> Attachments: 14407-branch-1.2.patch, hbase-14407-0.98.patch, 
> hbase-14407-1.1.patch, hbase-14407-1.2.patch, hs4.log, master.log
>
>
> I found a situation that may cause a region to stay closed forever. It 
> happens often on my cluster, which runs 0.98.10, but 1.1.2 also has the 
> problem:
> 1, master sends region open to the regionserver
> 2, rs starts a handler to open the region
> 3, rs returns the response to master
> 4, master does not receive the response, or times out, and sends open region again
> 5, rs has already opened the region
> 6, master runs processAlreadyOpenedRegion and updates the region state to open in 
> master memory
> 7, master receives the zk message that the region opened (late for some reason, 
> e.g. the network), and triggers an update of the region state to open, but finds 
> the region already opened, ERROR!
> 8, master sends close region, and the region stays closed forever.





[jira] [Resolved] (HBASE-14333) responseTooLarge/Slow logs are not actionable when the rpc method is ExecService

2015-09-18 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk resolved HBASE-14333.
--
Resolution: Duplicate

Superseded by efforts on HBASE-14443.

> responseTooLarge/Slow logs are not actionable when the rpc method is 
> ExecService
> 
>
> Key: HBASE-14333
> URL: https://issues.apache.org/jira/browse/HBASE-14333
> Project: HBase
>  Issue Type: Improvement
>  Components: Operability
>Reporter: Nick Dimiduk
>  Labels: beginner
>
> This seems to come up mostly for users running Phoenix. We get logs saying a 
> response is too large or slow, but the coprocessor being invoked is 
> masked. This makes it hard to diagnose which aspect of the query needs to be 
> optimized. Adding the coprocessor class and method would make these log lines 
> much more informative.





[jira] [Commented] (HBASE-14420) Zombie Stomping Session

2015-09-18 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876617#comment-14876617
 ] 

stack commented on HBASE-14420:
---

bq. Otherwise stuff is generally passing

So, now focus moves back to apache build.

> Zombie Stomping Session
> ---
>
> Key: HBASE-14420
> URL: https://issues.apache.org/jira/browse/HBASE-14420
> Project: HBase
>  Issue Type: Umbrella
>  Components: test
>Reporter: stack
>Assignee: stack
>Priority: Critical
>
> Patch builds are now failing most of the time because we are dropping zombies. 
> I confirm we are doing this on non-apache build boxes too.
> Left-over zombies consume resources on build boxes (OOME cannot create native 
> threads). Having to do multiple test runs in the hope that we can get a 
> non-zombie-making build or making (arbitrary) rulings that the zombies are 
> 'not related' is a productivity sink. And so on...
> This is an umbrella issue for a zombie stomping session that started earlier 
> this week. Will hang sub-issues of this one. Am running builds back-to-back 
> on little cluster to turn out the monsters.





[jira] [Commented] (HBASE-14420) Zombie Stomping Session

2015-09-18 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876614#comment-14876614
 ] 

stack commented on HBASE-14420:
---

On my test rig, got these failures in recent runs:

Failed tests:
  
TestStochasticLoadBalancer2.testRegionReplicasOnMidClusterHighReplication:73->BalancerTestBase.testWithCluster:422->BalancerTestBase.testWithCluster:450->BalancerTestBase.assertRegionReplicaPlacement:225
 Two or more region replicas are hosted on the same host after balance

Got this again too...

TestHttpServerLifecycle.testStoppedServerIsNotAlive:97->HttpServerFunctionalTest.stop:195
 » TestTimedOut

Otherwise stuff is generally passing.

> Zombie Stomping Session
> ---
>
> Key: HBASE-14420
> URL: https://issues.apache.org/jira/browse/HBASE-14420
> Project: HBase
>  Issue Type: Umbrella
>  Components: test
>Reporter: stack
>Assignee: stack
>Priority: Critical
>
> Patch builds are now failing most of the time because we are dropping zombies. 
> I confirm we are doing this on non-apache build boxes too.
> Left-over zombies consume resources on build boxes (OOME cannot create native 
> threads). Having to do multiple test runs in the hope that we can get a 
> non-zombie-making build or making (arbitrary) rulings that the zombies are 
> 'not related' is a productivity sink. And so on...
> This is an umbrella issue for a zombie stomping session that started earlier 
> this week. Will hang sub-issues of this one. Am running builds back-to-back 
> on little cluster to turn out the monsters.





[jira] [Updated] (HBASE-14449) Rewrite deadlock prevention for concurrent connection close

2015-09-18 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14449:
---
Summary: Rewrite deadlock prevention for concurrent connection close  (was: 
Rewrite deadlock prevention due to concurrent connection close)

> Rewrite deadlock prevention for concurrent connection close
> ---
>
> Key: HBASE-14449
> URL: https://issues.apache.org/jira/browse/HBASE-14449
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3
>
> Attachments: 14449-branch-1.0.txt, 14449-v1.txt
>
>
> The deadlock prevention approach used in HBASE-14241 introduces unnecessary 
> logic which is not intuitive.
> Depending on the value for config hbase.ipc.client.specificThreadForWriting , 
> there may or may not be CallSender threads running.
> The attached patch simplifies deadlock prevention by using a Set which 
> represents the Connections to be closed. Outside the synchronized 
> (connections) block, this Set is iterated where the Connections are closed.
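
The approach described above (collect under the lock, close outside it) might look roughly like this. It is a sketch under assumptions: `ConnectionPool` and `Conn` are illustrative stand-ins, not the classes touched by the attached patch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; class and method names are illustrative, not from the patch.
final class ConnectionPool {
  // Minimal stand-in for an RPC connection.
  static final class Conn {
    final String id;
    boolean closed;

    Conn(String id) {
      this.id = id;
    }

    void close() {
      closed = true;
    }
  }

  private final Map<String, Conn> connections = new HashMap<>();

  void add(Conn c) {
    synchronized (connections) {
      connections.put(c.id, c);
    }
  }

  int size() {
    synchronized (connections) {
      return connections.size();
    }
  }

  // Phase 1: under the lock, only move the connections into a local Set.
  // Phase 2: outside the lock, close them, so close() never runs while
  // the pool monitor is held.
  List<String> closeAll() {
    Set<Conn> toClose = new HashSet<>();
    synchronized (connections) {
      toClose.addAll(connections.values());
      connections.clear();
    }
    List<String> closedIds = new ArrayList<>();
    for (Conn c : toClose) {
      c.close();
      closedIds.add(c.id);
    }
    return closedIds;
  }
}
```

Because `close()` is called without holding the `connections` monitor, a close path that itself needs that monitor (or another lock held by a thread waiting on it) cannot form a lock cycle with `closeAll()`.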




