[jira] [Updated] (HBASE-18656) Address issues found by error-prone in hbase-common

2017-08-24 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18656:
--
Status: Patch Available  (was: Open)

> Address issues found by error-prone in hbase-common
> ---
>
> Key: HBASE-18656
> URL: https://issues.apache.org/jira/browse/HBASE-18656
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: Mike Drob
>Assignee: Mike Drob
>  Labels: beginner
> Fix For: 3.0.0
>
> Attachments: HBASE-18656.patch, HBASE-18656.v2.patch, 
> HBASE-18656.v3.patch
>
>
> We should address the new compilation errors found by running with 
> {{-PerrorProne}}.
> Can convert this to a top-level task and add subtasks for modules if desired 
> (in which case, link it back to parent issue HBASE-12187, please)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18656) Address issues found by error-prone in hbase-common

2017-08-24 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18656:
--
Fix Version/s: 2.0.0-alpha-3

> Address issues found by error-prone in hbase-common
> ---
>
> Key: HBASE-18656
> URL: https://issues.apache.org/jira/browse/HBASE-18656
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: Mike Drob
>Assignee: Mike Drob
>  Labels: beginner
> Fix For: 3.0.0, 2.0.0-alpha-3
>
> Attachments: HBASE-18656.patch, HBASE-18656.v2.patch, 
> HBASE-18656.v3.patch
>
>
> We should address the new compilation errors found by running with 
> {{-PerrorProne}}.
> Can convert this to a top-level task and add subtasks for modules if desired 
> (in which case, link it back to parent issue HBASE-12187, please)





[jira] [Commented] (HBASE-18656) Address issues found by error-prone in hbase-common

2017-08-24 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16140336#comment-16140336
 ] 

Mike Drob commented on HBASE-18656:
---

Pushed to branch-2 and master, will put up a branch-1 patch shortly.

> Address issues found by error-prone in hbase-common
> ---
>
> Key: HBASE-18656
> URL: https://issues.apache.org/jira/browse/HBASE-18656
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: Mike Drob
>Assignee: Mike Drob
>  Labels: beginner
> Fix For: 3.0.0, 2.0.0-alpha-3
>
> Attachments: HBASE-18656.patch, HBASE-18656.v2.patch, 
> HBASE-18656.v3.patch
>
>
> We should address the new compilation errors found by running with 
> {{-PerrorProne}}.
> Can convert this to a top-level task and add subtasks for modules if desired 
> (in which case, link it back to parent issue HBASE-12187, please)





[jira] [Updated] (HBASE-18656) Address issues found by error-prone in hbase-common

2017-08-24 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18656:
--
Attachment: HBASE-18656.branch-1.patch

> Address issues found by error-prone in hbase-common
> ---
>
> Key: HBASE-18656
> URL: https://issues.apache.org/jira/browse/HBASE-18656
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: Mike Drob
>Assignee: Mike Drob
>  Labels: beginner
> Fix For: 3.0.0, 2.0.0-alpha-3
>
> Attachments: HBASE-18656.branch-1.patch, HBASE-18656.patch, 
> HBASE-18656.v2.patch, HBASE-18656.v3.patch
>
>
> We should address the new compilation errors found by running with 
> {{-PerrorProne}}.
> Can convert this to a top-level task and add subtasks for modules if desired 
> (in which case, link it back to parent issue HBASE-12187, please)





[jira] [Updated] (HBASE-18656) Address issues found by error-prone in hbase-common

2017-08-24 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18656:
--
Status: Open  (was: Patch Available)

> Address issues found by error-prone in hbase-common
> ---
>
> Key: HBASE-18656
> URL: https://issues.apache.org/jira/browse/HBASE-18656
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: Mike Drob
>Assignee: Mike Drob
>  Labels: beginner
> Fix For: 3.0.0, 2.0.0-alpha-3
>
> Attachments: HBASE-18656.branch-1.patch, HBASE-18656.patch, 
> HBASE-18656.v2.patch, HBASE-18656.v3.patch
>
>
> We should address the new compilation errors found by running with 
> {{-PerrorProne}}.
> Can convert this to a top-level task and add subtasks for modules if desired 
> (in which case, link it back to parent issue HBASE-12187, please)





[jira] [Created] (HBASE-18676) Update branch 1.3 pom version

2017-08-24 Thread Mike Drob (JIRA)
Mike Drob created HBASE-18676:
-

 Summary: Update branch 1.3 pom version
 Key: HBASE-18676
 URL: https://issues.apache.org/jira/browse/HBASE-18676
 Project: HBase
  Issue Type: Task
  Components: build
Affects Versions: 1.3.1
Reporter: Mike Drob
Priority: Minor
 Fix For: 1.3.2


Branch 1.3 currently has version=1.3.1 set in the poms. I think this should be 
1.3.2-SNAPSHOT, in case anybody tries to deploy nightlies for any reason.

WDYT [~mantonov]?





[jira] [Updated] (HBASE-18656) Address issues found by error-prone in hbase-common

2017-08-24 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18656:
--
Status: Patch Available  (was: Open)

Submitting patch for branch-1 QA.

> Address issues found by error-prone in hbase-common
> ---
>
> Key: HBASE-18656
> URL: https://issues.apache.org/jira/browse/HBASE-18656
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: Mike Drob
>Assignee: Mike Drob
>  Labels: beginner
> Fix For: 3.0.0, 2.0.0-alpha-3
>
> Attachments: HBASE-18656.branch-1.patch, HBASE-18656.patch, 
> HBASE-18656.v2.patch, HBASE-18656.v3.patch
>
>
> We should address the new compilation errors found by running with 
> {{-PerrorProne}}.
> Can convert this to a top-level task and add subtasks for modules if desired 
> (in which case, link it back to parent issue HBASE-12187, please)





[jira] [Commented] (HBASE-18656) Address issues found by error-prone in hbase-common

2017-08-24 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16140530#comment-16140530
 ] 

Mike Drob commented on HBASE-18656:
---

[~apurtell] - could you take a look at the branch-1 patch? If you agree it looks OK, I'll push.

> Address issues found by error-prone in hbase-common
> ---
>
> Key: HBASE-18656
> URL: https://issues.apache.org/jira/browse/HBASE-18656
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: Mike Drob
>Assignee: Mike Drob
>  Labels: beginner
> Fix For: 3.0.0, 2.0.0-alpha-3
>
> Attachments: HBASE-18656.branch-1.patch, HBASE-18656.patch, 
> HBASE-18656.v2.patch, HBASE-18656.v3.patch
>
>
> We should address the new compilation errors found by running with 
> {{-PerrorProne}}.
> Can convert this to a top-level task and add subtasks for modules if desired 
> (in which case, link it back to parent issue HBASE-12187, please)





[jira] [Updated] (HBASE-18523) FilterList should break if the filter succeeds (row passes) in case of MUST_PASS_ONE and if the filter fails (row is skipped) for MUST_PASS_ALL

2017-08-24 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18523:
--
Labels: beginner patch  (was: newbie patch)

> FilterList should break if the filter succeeds (row passes) in case of 
> MUST_PASS_ONE and if the filter fails (row is skipped) for MUST_PASS_ALL
> ---
>
> Key: HBASE-18523
> URL: https://issues.apache.org/jira/browse/HBASE-18523
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.1, 0.98.24, 1.2.6, 1.1.12
>Reporter: Akshita Malhotra
>  Labels: beginner, patch
>
> FilterList should break if the filter succeeds (row passes) in case of 
> MUST_PASS_ONE and if the filter fails (row is skipped) for MUST_PASS_ALL
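
The short-circuit behaviour requested above can be sketched as follows. This is a minimal illustration built on plain `java.util.function.Predicate`, not the HBase `Filter` API; the class name `ShortCircuitFilterList`, the method `passes`, and the handling of an empty list are assumptions made only for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for a filter list; it does not use HBase classes.
class ShortCircuitFilterList {
  enum Operator { MUST_PASS_ALL, MUST_PASS_ONE }

  // Evaluates the filters in order and stops as soon as the outcome is decided.
  static <T> boolean passes(Operator op, List<Predicate<T>> filters, T row) {
    for (Predicate<T> filter : filters) {
      boolean ok = filter.test(row);
      if (op == Operator.MUST_PASS_ONE && ok) {
        return true;   // one success is enough, skip the remaining filters
      }
      if (op == Operator.MUST_PASS_ALL && !ok) {
        return false;  // one failure is enough, skip the remaining filters
      }
    }
    // Nothing decided early: ALL passed every filter, ONE passed none.
    // (For an empty list this sketch returns true for ALL and false for ONE.)
    return op == Operator.MUST_PASS_ALL;
  }

  public static void main(String[] args) {
    List<Predicate<String>> filters =
        Arrays.asList(s -> s.startsWith("a"), s -> s.length() > 3);
    System.out.println(passes(Operator.MUST_PASS_ONE, filters, "ab")); // true
    System.out.println(passes(Operator.MUST_PASS_ALL, filters, "ab")); // false
  }
}
```

In the first call only the first filter runs; in the second call the loop stops at the second filter, which is the "break" the issue title asks for.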





[jira] [Updated] (HBASE-18656) Address issues found by error-prone in hbase-common

2017-08-24 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18656:
--
   Resolution: Fixed
Fix Version/s: 1.2.7
               1.3.2
               1.4.0
       Status: Resolved  (was: Patch Available)

Thanks for reviews, Stack and Andrew!

> Address issues found by error-prone in hbase-common
> ---
>
> Key: HBASE-18656
> URL: https://issues.apache.org/jira/browse/HBASE-18656
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: Mike Drob
>Assignee: Mike Drob
>  Labels: beginner
> Fix For: 3.0.0, 1.4.0, 1.3.2, 1.2.7, 2.0.0-alpha-3
>
> Attachments: HBASE-18656.branch-1.patch, HBASE-18656.patch, 
> HBASE-18656.v2.patch, HBASE-18656.v3.patch
>
>
> We should address the new compilation errors found by running with 
> {{-PerrorProne}}.
> Can convert this to a top-level task and add subtasks for modules if desired 
> (in which case, link it back to parent issue HBASE-12187, please)





[jira] [Created] (HBASE-18677) type in namesapce docs

2017-08-24 Thread Mike Drob (JIRA)
Mike Drob created HBASE-18677:
-

 Summary: type in namesapce docs
 Key: HBASE-18677
 URL: https://issues.apache.org/jira/browse/HBASE-18677
 Project: HBase
  Issue Type: Improvement
  Components: documentation
Reporter: Mike Drob
Priority: Trivial


In the docs at http://hbase.apache.org/book.html#_namespace - "Region server 
groups (HBASE-6721) - A namespace/table can be pinned onto a subset of 
RegionServers thus guaranteeing a course level of isolation."

Should be "coarse"





[jira] [Updated] (HBASE-18677) typo in namespace docs

2017-08-24 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18677:
--
Summary: typo in namespace docs  (was: type in namesapce docs)

> typo in namespace docs
> --
>
> Key: HBASE-18677
> URL: https://issues.apache.org/jira/browse/HBASE-18677
> Project: HBase
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Mike Drob
>Priority: Trivial
>
> In the docs at http://hbase.apache.org/book.html#_namespace - "Region server 
> groups (HBASE-6721) - A namespace/table can be pinned onto a subset of 
> RegionServers thus guaranteeing a course level of isolation."
> Should be "coarse"





[jira] [Commented] (HBASE-18675) Making {max,min}SessionTimeout configurable for MiniZooKeeperCluster

2017-08-24 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16140731#comment-16140731
 ] 

Mike Drob commented on HBASE-18675:
---

So if we set the defaults based on tickTime, then that's the only knob somebody 
has to tune it? I don't know if there is a use case where you would need to set 
that range or not. What do we get for session timeout settings without the 
patch?
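
For context on the tickTime question: a stock ZooKeeper server derives its session timeout bounds from tickTime when they are not set explicitly (minSessionTimeout defaults to 2 ticks, maxSessionTimeout to 20 ticks) and clamps the timeout a client requests into that range. The sketch below only reproduces that arithmetic; the class and method names are hypothetical, and the tickTime of 2000 ms is an assumed example value, not necessarily what MiniZooKeeperCluster uses.

```java
// Hypothetical helper that mirrors ZooKeeper's default session timeout bounds.
class SessionTimeoutBounds {
  // Default lower bound: 2 ticks.
  static int defaultMin(int tickTimeMs) {
    return 2 * tickTimeMs;
  }

  // Default upper bound: 20 ticks.
  static int defaultMax(int tickTimeMs) {
    return 20 * tickTimeMs;
  }

  // The server clamps the client's requested timeout into [minMs, maxMs].
  static int negotiate(int requestedMs, int minMs, int maxMs) {
    return Math.max(minMs, Math.min(maxMs, requestedMs));
  }

  public static void main(String[] args) {
    int tickTimeMs = 2000; // assumed example value
    int min = defaultMin(tickTimeMs);
    int max = defaultMax(tickTimeMs);
    // A client asking for 90 s is capped at the 40 s upper bound.
    System.out.println(negotiate(90000, min, max)); // prints 40000
  }
}
```

With these defaults, tickTime is indeed the only knob: a longer timeout can only be granted by raising tickTime or by making the bounds themselves configurable.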

bq. I added a comment, let me know if I should remove that.
Never remove comments! ;)

> Making {max,min}SessionTimeout configurable for MiniZooKeeperCluster
> 
>
> Key: HBASE-18675
> URL: https://issues.apache.org/jira/browse/HBASE-18675
> Project: HBase
>  Issue Type: Bug
>  Components: Zookeeper
>Reporter: Cesar Delgado
>Assignee: Cesar Delgado
>Priority: Minor
> Attachments: MiniZooKeeperCluster_HBASE_8675.patch, 
> MiniZooKeeperCluster_HBASE_8675.patch, MiniZooKeeperCluster_HBASE_8675.patch
>
>
> Right now the mini cluster on application developers' laptops keeps crashing 
> when the laptop goes to sleep because ZooKeeper times out. We've seen this 
> for a while and [~ekoontz] had worked on it before. Now that we've tried to 
> upgrade, it has bitten us, so we'd like to push this up, as I'm sure we're not 
> the only ones getting bitten by this.





[jira] [Commented] (HBASE-18628) ZKPermissionWatcher blocks all ZK notifications

2017-08-21 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16135899#comment-16135899
 ] 

Mike Drob commented on HBASE-18628:
---

No, I was not able to figure out the root cause. I thought the executor 
_shouldn't_ be shut down, because we can still see its thread in the jstack.

The childrenChangedFuture does not need to be volatile because it is always 
read from the same thread - the ZK EventThread.

There _is_ a guarantee that there will be no more than one refreshNodes action 
running at a time, because they are processed by a single-threaded executor (and 
all the ZK notifications also come in through a single thread). So there will be 
at most one action happening, and one new action coming in. If we always cancel 
the preceding action, then we can maintain this invariant.
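
The invariant described above can be sketched roughly as follows. This is not the code from the attached patches; `LatestOnlyRefresher`, `onChildrenChanged`, and the `pending` field are hypothetical names, and using `cancel(true)` (which interrupts a refresh that is already running) is a choice made for this illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: keep only the most recent refresh request alive.
class LatestOnlyRefresher {
  // A single worker thread, so at most one refresh runs at any moment.
  private final ExecutorService executor = Executors.newSingleThreadExecutor();

  // Read and written only by the one notification thread, so it does not
  // need to be volatile in this sketch.
  private Future<?> pending;

  // Assumed to be called from a single notification thread.
  void onChildrenChanged(Runnable refresh) {
    if (pending != null) {
      // Drop the preceding action: it is either still queued or working on stale data.
      pending.cancel(true);
    }
    pending = executor.submit(refresh);
  }

  // Lets the last submitted refresh finish, then stops the worker.
  void shutdown() {
    executor.shutdown();
    try {
      executor.awaitTermination(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  public static void main(String[] args) {
    LatestOnlyRefresher refresher = new LatestOnlyRefresher();
    for (int i = 1; i <= 3; i++) {
      final int update = i;
      refresher.onChildrenChanged(
          () -> System.out.println("refreshing with update " + update));
    }
    // The last update is always processed; earlier ones may be skipped.
    refresher.shutdown();
  }
}
```

Because every new notification cancels its predecessor before submitting, the executor never holds more than one running and one queued refresh.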

The original reason for preemption was performance - either you or Ted claimed 
that updating permissions on a large cluster can take several minutes, and if 
multiple updates come in close proximity then there is no reason to process the 
stale data. I'm happy to see this code removed because I agree that it is 
complicated and brittle, but I did manage to run this against the one cluster 
I've seen the problem on and the issue went away (without other issues 
appearing). No promises that my new code is foolproof, but I feel good about it.

> ZKPermissionWatcher blocks all ZK notifications
> ---
>
> Key: HBASE-18628
> URL: https://issues.apache.org/jira/browse/HBASE-18628
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Critical
> Fix For: 3.0.0, 2.0.0-alpha-3
>
> Attachments: HBASE-18628.branch-1.v5.patch, HBASE-18628.patch, 
> HBASE-18628.v2.patch, HBASE-18628.v3.patch, HBASE-18628.v4.patch, 
> HBASE-18628.v5.patch, jstack
>
>
> Buckle up folks, we're going for a ride here. I've been seeing this on a branch-2 
> based build, but I think the problem will affect branch-1 as well. I'm not 
> able to easily reproduce the issue, but it will usually come up within an 
> hour on a given cluster that I have, at which point the problem persists 
> until an RS restart. I've been seeing the problem and paying attention for 
> maybe two months, but I suspect it's been happening much longer than that.
> h3. Problem
> When running in a secure cluster, sometimes the ZK EventThread will get stuck 
> on a permissions update and not be able to process new notifications. This 
> happens to also block flush and snapshot, which is how we found it.
> h3. Analysis
> The main smoking gun is seeing this in repeated jstacks:
> {noformat}
> "main-EventThread" #43 daemon prio=5 os_prio=0 tid=0x7f0b92644000 
> nid=0x6e69 waiting on condition [0x7f0b6730f000]
>java.lang.Thread.State: TIMED_WAITING (sleeping)
> at java.lang.Thread.sleep(Native Method)
> at 
> org.apache.hadoop.hbase.security.access.ZKPermissionWatcher.nodeChildrenChanged(ZKPermissionWatcher.java:191)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:503)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> {noformat}
> That sleep is a 20ms sleep in an {{AtomicReference.compareAndSet}} loop - but 
> it never gets past the condition.
> {code}
> while (!nodes.compareAndSet(null, nodeList)) {
>   try {
>     Thread.sleep(20);
>   } catch (InterruptedException e) {
>     LOG.warn("Interrupted while setting node list", e);
>     Thread.currentThread().interrupt();
>   }
> }
> {code}
> The warning never shows up in the logs, it just keeps looping and looping. 
> The last relevant line from the watcher in logs is:
> {noformat}
> 2017-08-17 21:25:12,379 DEBUG 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: 
> regionserver:22101-0x15df38884c80024, quorum=zk1:2181,zk2:2181,zk3:2181, 
> baseZNode=/hbase Received ZooKeeper Event, type=NodeChildrenChanged, 
> state=SyncConnected, path=/hbase/acl
> {noformat}
> Which makes sense, because the code snippet is from permission watcher's 
> {{nodeChildrenChanged}} handler.
> The separate thread introduced in HBASE-14370 is present, but not doing 
> anything. And this event hasn't gotten to the part where it splits off into a 
> thread:
> {noformat}
> "zk-permission-watcher4-thread-1" #160 daemon prio=5 os_prio=0 
> tid=0x01750800 nid=0x6fd9 waiting on condition [0x7f0b5dce5000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x0007436ecea0> (a 
> 
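
As a side note on the `compareAndSet(null, nodeList)` loop quoted in the description: that call succeeds only while the reference currently holds null, so the loop can finish only if some other thread resets the reference. The sketch below shows this with a bounded number of attempts. `CasHandoff` and `offer` are hypothetical names, and the bound is an addition for the illustration, not something the original code has; the thread does not establish that a missing reset is the actual root cause.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of a single-slot handoff guarded by compareAndSet.
class CasHandoff {
  // Tries to publish value into an empty (null) slot, sleeping between attempts.
  // Returns false if the slot stayed occupied or the thread was interrupted.
  static <T> boolean offer(AtomicReference<T> slot, T value, int maxAttempts, long sleepMs) {
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      if (slot.compareAndSet(null, value)) {
        return true; // slot was empty, value is now published
      }
      try {
        Thread.sleep(sleepMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false; // give up instead of spinning with the interrupt flag set
      }
    }
    return false; // nobody cleared the slot within the allowed attempts
  }

  public static void main(String[] args) {
    AtomicReference<String> nodes = new AtomicReference<>("pending list");
    System.out.println(offer(nodes, "new list", 5, 20L)); // false: slot never cleared
    nodes.set(null); // what a consumer would do after taking the list
    System.out.println(offer(nodes, "new list", 5, 20L)); // true
  }
}
```

Without the attempt limit, the first call would keep sleeping and retrying indefinitely, which matches the TIMED_WAITING (sleeping) state shown in the jstack.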

[jira] [Updated] (HBASE-18628) ZKPermissionWatcher blocks all ZK notifications

2017-08-21 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18628:
--
Attachment: HBASE-18628.branch-1.v5.patch

Attaching branch-1 patch.

> ZKPermissionWatcher blocks all ZK notifications
> ---
>
> Key: HBASE-18628
> URL: https://issues.apache.org/jira/browse/HBASE-18628
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Critical
> Fix For: 3.0.0, 2.0.0-alpha-3
>
> Attachments: HBASE-18628.branch-1.v5.patch, HBASE-18628.patch, 
> HBASE-18628.v2.patch, HBASE-18628.v3.patch, HBASE-18628.v4.patch, 
> HBASE-18628.v5.patch, jstack
>
>
> Buckle up folks, we're going for a ride here. I've been seeing this on a branch-2 
> based build, but I think the problem will affect branch-1 as well. I'm not 
> able to easily reproduce the issue, but it will usually come up within an 
> hour on a given cluster that I have, at which point the problem persists 
> until an RS restart. I've been seeing the problem and paying attention for 
> maybe two months, but I suspect it's been happening much longer than that.
> h3. Problem
> When running in a secure cluster, sometimes the ZK EventThread will get stuck 
> on a permissions update and not be able to process new notifications. This 
> happens to also block flush and snapshot, which is how we found it.
> h3. Analysis
> The main smoking gun is seeing this in repeated jstacks:
> {noformat}
> "main-EventThread" #43 daemon prio=5 os_prio=0 tid=0x7f0b92644000 
> nid=0x6e69 waiting on condition [0x7f0b6730f000]
>java.lang.Thread.State: TIMED_WAITING (sleeping)
> at java.lang.Thread.sleep(Native Method)
> at 
> org.apache.hadoop.hbase.security.access.ZKPermissionWatcher.nodeChildrenChanged(ZKPermissionWatcher.java:191)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:503)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> {noformat}
> That sleep is a 20ms sleep in an {{AtomicReference.compareAndSet}} loop - but 
> it never gets past the condition.
> {code}
> while (!nodes.compareAndSet(null, nodeList)) {
>   try {
>     Thread.sleep(20);
>   } catch (InterruptedException e) {
>     LOG.warn("Interrupted while setting node list", e);
>     Thread.currentThread().interrupt();
>   }
> }
> {code}
> The warning never shows up in the logs, it just keeps looping and looping. 
> The last relevant line from the watcher in logs is:
> {noformat}
> 2017-08-17 21:25:12,379 DEBUG 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: 
> regionserver:22101-0x15df38884c80024, quorum=zk1:2181,zk2:2181,zk3:2181, 
> baseZNode=/hbase Received ZooKeeper Event, type=NodeChildrenChanged, 
> state=SyncConnected, path=/hbase/acl
> {noformat}
> Which makes sense, because the code snippet is from permission watcher's 
> {{nodeChildrenChanged}} handler.
> The separate thread introduced in HBASE-14370 is present, but not doing 
> anything. And this event hasn't gotten to the part where it splits off into a 
> thread:
> {noformat}
> "zk-permission-watcher4-thread-1" #160 daemon prio=5 os_prio=0 
> tid=0x01750800 nid=0x6fd9 waiting on condition [0x7f0b5dce5000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x0007436ecea0> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
> h3. Solutions
> There are a few approaches we can take to fix this; I think they are all 
> complementary. It might be useful to file subtasks or new issues for some of 
> the solutions if they are longer term.
> # Move flush and snapshot to ProcedureV2. This makes my proximate problem go 
> away, but it's only relevant to branch-2 and master, and doesn't fix anything 
> on branch-1. Also, Permissions updates would still get stuck, preventing 
> future permissions updates. I 

[jira] [Commented] (HBASE-14351) Procedure V2 Phase 3: Notification Bus

2017-08-21 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136123#comment-16136123
 ] 

Mike Drob commented on HBASE-14351:
---

[~apurtell], [~stack] - There are some subtasks here that are still not done. 
Does this need to make 2.0? Or can it wait for 2.1?

> Procedure V2 Phase 3: Notification Bus
> --
>
> Key: HBASE-14351
> URL: https://issues.apache.org/jira/browse/HBASE-14351
> Project: HBase
>  Issue Type: Task
>Reporter: Stephen Yuan Jiang
>Assignee: Matteo Bertozzi
>
> This is the third phase of Procedure V2 (HBASE-12439) feature. Built on top 
> of state machine from Phase 1 (HBASE-14336), the notification bus is just
> an exchange of messages between the multiple machines (e.g. master and 
> regionservers).  The notification bus allows master to send 
> notifications/procedures to the Region Servers. Two examples are snapshot for 
> OnePhaseProcedure and ACL update for TwoPhaseProcedure (check HBASE-12439 for 
> high-level design).





[jira] [Commented] (HBASE-12349) Add Maven build support module for a custom version of error-prone

2017-08-21 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136144#comment-16136144
 ] 

Mike Drob commented on HBASE-12349:
---

org.apache.hadoop.hbase.client.TestAsyncClusterAdminApi2 fails locally for me 
without the patch as well.

> Add Maven build support module for a custom version of error-prone
> --
>
> Key: HBASE-12349
> URL: https://issues.apache.org/jira/browse/HBASE-12349
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Mike Drob
>Priority: Blocker
> Fix For: 2.0.0
>
> Attachments: HBASE-12349.patch, HBASE-12349.v2.patch, 
> HBASE-12349.v3.patch, HBASE-12349.v4.patch, HBASE-12349.v5.patch, 
> HBASE-12349.v6.patch
>
>
> Add a new Maven build support module that builds and publishes a custom 
> error-prone artifact for use by the rest of the build.





[jira] [Commented] (HBASE-18628) ZKPermissionWatcher blocks all ZK notifications

2017-08-21 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136131#comment-16136131
 ] 

Mike Drob commented on HBASE-18628:
---

bq. @apurtell - Yes I would like to see it. Do you need assistance?
I have a branch-1 patch up, I think it looks ok.

bq. We no longer need to use ZK for a notification bus for ACL permission 
changes now that we have ProcedureV2.
Agree, but that code isn't there yet, hence we have this issue. Happy to see it 
all gone when those are completed.

bq. [~huaxiang] - Curious about the root cause, I think that the task submit 
did not succeed. Does the following diff achieve the goal?
I can try it and will let you know.

> ZKPermissionWatcher blocks all ZK notifications
> ---
>
> Key: HBASE-18628
> URL: https://issues.apache.org/jira/browse/HBASE-18628
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Critical
> Fix For: 3.0.0, 2.0.0-alpha-3
>
> Attachments: HBASE-18628.branch-1.v5.patch, HBASE-18628.patch, 
> HBASE-18628.v2.patch, HBASE-18628.v3.patch, HBASE-18628.v4.patch, 
> HBASE-18628.v5.patch, jstack
>
>

[jira] [Comment Edited] (HBASE-18628) ZKPermissionWatcher blocks all ZK notifications

2017-08-21 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136131#comment-16136131
 ] 

Mike Drob edited comment on HBASE-18628 at 8/22/17 1:34 AM:


bq. [~apurtell] - Yes I would like to see it. Do you need assistance?
I have a branch-1 patch up, I think it looks ok.

bq. We no longer need to use ZK for a notification bus for ACL permission 
changes now that we have ProcedureV2.
Agree, but that code isn't there yet, hence we have this issue. Happy to see it 
all gone when those are completed.

bq. [~huaxiang] - Curious about the root cause, I think that the task submit 
did not succeed. Does the following diff achieve the goal?
I can try it and will let you know.


was (Author: mdrob):
bq. @apurtell - Yes I would like to see it. Do you need assistance?
I have a branch-1 patch up, I think it looks ok.

bq. We no longer need to use ZK for a notification bus for ACL permission 
changes now that we have ProcedureV2.
Agree, but that code isn't there yet, hence we have this issue. Happy to see it 
all gone when those are completed.

bq. [~huaxiang] - Curious about the root cause, I think that the task submit 
did not succeed. Does the following diff achieve the goal?
I can try it and will let you know.

> ZKPermissionWatcher blocks all ZK notifications
> ---
>
> Key: HBASE-18628
> URL: https://issues.apache.org/jira/browse/HBASE-18628
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Critical
> Fix For: 3.0.0, 2.0.0-alpha-3
>
> Attachments: HBASE-18628.branch-1.v5.patch, HBASE-18628.patch, 
> HBASE-18628.v2.patch, HBASE-18628.v3.patch, HBASE-18628.v4.patch, 
> HBASE-18628.v5.patch, jstack
>
>
> Buckle up folks, we're going for a ride here. I've seeing this on a branch-2 
> based build, but I think the problem will affect branch-1 as well. I'm not 
> able to easily reproduce the issue, but it will usually come up within an 
> hour on a given cluster that I have, at which point the problem persists 
> until an RS restart. I've been seeing the problem and paying attention for 
> maybe two months, but I suspect it's been happening much longer than that.
> h3. Problem
> When running in a secure cluster, sometimes the ZK EventThread will get stuck 
> on a permissions update and not be able to process new notifications. This 
> happens to also block flush and snapshot, which is how we found it.
> h3. Analysis
> The main smoking gun is seeing this in repeated jstacks:
> {noformat}
> "main-EventThread" #43 daemon prio=5 os_prio=0 tid=0x7f0b92644000 nid=0x6e69 waiting on condition [0x7f0b6730f000]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>         at java.lang.Thread.sleep(Native Method)
>         at org.apache.hadoop.hbase.security.access.ZKPermissionWatcher.nodeChildrenChanged(ZKPermissionWatcher.java:191)
>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:503)
>         at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> {noformat}
> That sleep is a 20ms sleep in an {{AtomicReference.compareAndSet}} loop - but 
> it never gets past the condition.
> {code}
> while (!nodes.compareAndSet(null, nodeList)) {
>   try {
>     Thread.sleep(20);
>   } catch (InterruptedException e) {
>     LOG.warn("Interrupted while setting node list", e);
>     Thread.currentThread().interrupt();
>   }
> }
> {code}
> The warning never shows up in the logs, it just keeps looping and looping. 
> The last relevant line from the watcher in logs is:
> {noformat}
> 2017-08-17 21:25:12,379 DEBUG 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: 
> regionserver:22101-0x15df38884c80024, quorum=zk1:2181,zk2:2181,zk3:2181, 
> baseZNode=/hbase Received ZooKeeper Event, type=NodeChildrenChanged, 
> state=SyncConnected, path=/hbase/acl
> {noformat}
> Which makes sense, because the code snippet is from permission watcher's 
> {{nodeChildrenChanged}} handler.
> The separate thread introduced in HBASE-14370 is present, but not doing 
> anything. And this event hasn't gotten to the part where it splits off into a 
> thread:
> {noformat}
> "zk-permission-watcher4-thread-1" #160 daemon prio=5 os_prio=0 tid=0x01750800 nid=0x6fd9 waiting on condition [0x7f0b5dce5000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x0007436ecea0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at 

[jira] [Commented] (HBASE-18628) ZKPermissionWatcher blocks all ZK notifications

2017-08-21 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16135258#comment-16135258
 ] 

Mike Drob commented on HBASE-18628:
---

Ran v2 on a cluster over the weekend and didn't see the issue come up.

> ZKPermissionWatcher blocks all ZK notifications
> ---
>
> Key: HBASE-18628
> URL: https://issues.apache.org/jira/browse/HBASE-18628
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Reporter: Mike Drob
> Assignee: Mike Drob
> Priority: Critical
> Attachments: HBASE-18628.patch, HBASE-18628.v2.patch, jstack
>
>

[jira] [Updated] (HBASE-18628) ZKPermissionWatcher blocks all ZK notifications

2017-08-21 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18628:
--
Attachment: HBASE-18628.v4.patch

v4:
Actually apply the correct patch, since I put up the wrong file for v3

> ZKPermissionWatcher blocks all ZK notifications
> ---
>
> Key: HBASE-18628
> URL: https://issues.apache.org/jira/browse/HBASE-18628
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Reporter: Mike Drob
> Assignee: Mike Drob
> Priority: Critical
> Attachments: HBASE-18628.patch, HBASE-18628.v2.patch, 
> HBASE-18628.v3.patch, HBASE-18628.v4.patch, jstack
>
>

[jira] [Updated] (HBASE-18628) ZKPermissionWatcher blocks all ZK notifications

2017-08-21 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18628:
--
Status: Open  (was: Patch Available)

> ZKPermissionWatcher blocks all ZK notifications
> ---
>
> Key: HBASE-18628
> URL: https://issues.apache.org/jira/browse/HBASE-18628
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Reporter: Mike Drob
> Assignee: Mike Drob
> Priority: Critical
> Attachments: HBASE-18628.patch, HBASE-18628.v2.patch, 
> HBASE-18628.v3.patch, HBASE-18628.v4.patch, jstack
>
>

[jira] [Commented] (HBASE-18629) Enhance ChaosMonkeyRunner with interruptibility

2017-08-21 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16135309#comment-16135309
 ] 

Mike Drob commented on HBASE-18629:
---

Are we updating anything to use the new interruptibility feature? Follow-on 
JIRA?

Otherwise LGTM.

> Enhance ChaosMonkeyRunner with interruptibility
> ---
>
> Key: HBASE-18629
> URL: https://issues.apache.org/jira/browse/HBASE-18629
> Project: HBase
> Issue Type: Improvement
> Reporter: Ted Yu
> Attachments: 18629.v1.txt, 18629.v2.txt
>
>
> Currently ChaosMonkeyRunner performs looping unconditionally:
> {code}
> while (true) { // loop here until killed
>   Thread.sleep(1);
> }
> {code}
> When ChaosMonkeyRunner is invoked programmatically, it is desirable to add 
> interruptibility to the runner so that the caller can manage its lifetime.
> Another enhancement is to allow passing the path to hbase-site.xml where 
> chaos monkey parameters are specified.
> This is useful when the underlying hbase-site.xml is not on classpath.
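The interruptibility described above can be sketched roughly as follows. This is a minimal illustration, not the attached patch: the class name `InterruptibleRunnerSketch`, the method `runUntilInterrupted`, and the `ticks` counter are all hypothetical names chosen for the example, and the real ChaosMonkeyRunner may signal shutdown differently.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class InterruptibleRunnerSketch {
  // Hypothetical counter, only here so the loop does observable work.
  static final AtomicInteger ticks = new AtomicInteger();

  /** Loops until the calling thread is interrupted, then returns normally. */
  static void runUntilInterrupted(long sleepMs) {
    while (!Thread.currentThread().isInterrupted()) {
      ticks.incrementAndGet();
      try {
        Thread.sleep(sleepMs);
      } catch (InterruptedException e) {
        // Restore the flag so the loop condition sees the interrupt and exits.
        Thread.currentThread().interrupt();
      }
    }
  }

  public static void main(String[] args) throws InterruptedException {
    Thread runner = new Thread(() -> runUntilInterrupted(10));
    runner.start();
    Thread.sleep(50);
    runner.interrupt(); // the caller decides when the runner stops
    runner.join(1000);
    System.out.println(runner.isAlive()); // prints "false"
  }
}
```

Checking the interrupt flag in the loop condition, rather than looping on `true`, is what lets a programmatic caller end the run without killing the JVM.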





[jira] [Commented] (HBASE-18629) Enhance ChaosMonkeyRunner with interruptibility

2017-08-21 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16135322#comment-16135322
 ] 

Mike Drob commented on HBASE-18629:
---

Cool.

+1

> Enhance ChaosMonkeyRunner with interruptibility
> ---
>
> Key: HBASE-18629
> URL: https://issues.apache.org/jira/browse/HBASE-18629
> Project: HBase
> Issue Type: Improvement
> Reporter: Ted Yu
> Assignee: Ted Yu
> Fix For: 3.0.0, 2.0.0-alpha-3
>
> Attachments: 18629.v1.txt, 18629.v2.txt
>
>





[jira] [Updated] (HBASE-18628) ZKPermissionWatcher blocks all ZK notifications

2017-08-21 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18628:
--
Status: Open  (was: Patch Available)

> ZKPermissionWatcher blocks all ZK notifications
> ---
>
> Key: HBASE-18628
> URL: https://issues.apache.org/jira/browse/HBASE-18628
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Reporter: Mike Drob
> Assignee: Mike Drob
> Priority: Critical
> Attachments: HBASE-18628.patch, HBASE-18628.v2.patch, jstack
>
>
> Buckle up folks, we're going for a ride here. I've been seeing this on a branch-2 
> based build, but I think the problem will affect branch-1 as well. I'm not 
> able to easily reproduce the issue, but it will usually come up within an 
> hour on a given cluster that I have, at which point the problem persists 
> until an RS restart. I've been seeing the problem and paying attention for 
> maybe two months, but I suspect it's been happening much longer than that.
> h3. Problem
> When running in a secure cluster, sometimes the ZK EventThread will get stuck 
> on a permissions update and not be able to process new notifications. This 
> happens to also block flush and snapshot, which is how we found it.
> h3. Analysis
> The main smoking gun is seeing this in repeated jstacks:
> {noformat}
> "main-EventThread" #43 daemon prio=5 os_prio=0 tid=0x7f0b92644000 nid=0x6e69 waiting on condition [0x7f0b6730f000]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>         at java.lang.Thread.sleep(Native Method)
>         at org.apache.hadoop.hbase.security.access.ZKPermissionWatcher.nodeChildrenChanged(ZKPermissionWatcher.java:191)
>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:503)
>         at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> {noformat}
> That sleep is a 20ms sleep in an {{AtomicReference.compareAndSet}} loop - but 
> it never gets past the condition.
> {code}
> while (!nodes.compareAndSet(null, nodeList)) {
>   try {
>     Thread.sleep(20);
>   } catch (InterruptedException e) {
>     LOG.warn("Interrupted while setting node list", e);
>     Thread.currentThread().interrupt();
>   }
> }
> {code}
> The warning never shows up in the logs, it just keeps looping and looping. 
> The last relevant line from the watcher in logs is:
> {noformat}
> 2017-08-17 21:25:12,379 DEBUG 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: 
> regionserver:22101-0x15df38884c80024, quorum=zk1:2181,zk2:2181,zk3:2181, 
> baseZNode=/hbase Received ZooKeeper Event, type=NodeChildrenChanged, 
> state=SyncConnected, path=/hbase/acl
> {noformat}
> Which makes sense, because the code snippet is from permission watcher's 
> {{nodeChildrenChanged}} handler.
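To make the failure mode concrete, here is a small standalone sketch of why a `compareAndSet(null, ...)` loop never exits if nothing ever resets the reference to null. It is not the HBase code: the class `CasHandOffSketch`, the `publish` helper, and the deadline are assumptions added so the example terminates instead of hanging the way the real loop does.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

public class CasHandOffSketch {
  // Hypothetical stand-in for the watcher's shared slot; not the HBase field itself.
  static final AtomicReference<List<String>> nodes = new AtomicReference<>();

  /** Tries to publish nodeList, giving up after maxWaitMs instead of spinning forever. */
  static boolean publish(List<String> nodeList, long maxWaitMs) {
    long deadline = System.nanoTime() + maxWaitMs * 1_000_000L;
    while (!nodes.compareAndSet(null, nodeList)) {
      if (System.nanoTime() - deadline >= 0) {
        return false; // slot was never cleared; an unbounded loop would hang here
      }
      try {
        Thread.sleep(20);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    boolean first = publish(Arrays.asList("acl-a"), 100);
    // No consumer resets the slot to null, so the second publish cannot succeed.
    boolean second = publish(Arrays.asList("acl-b"), 100);
    System.out.println(first + " " + second); // prints "true false"
  }
}
```

The second call times out only because of the added deadline. Without it, the caller (here the ZK event thread) would sleep and retry indefinitely, which matches the repeated jstack output.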
> The separate thread introduced in HBASE-14370 is present, but not doing 
> anything. And this event hasn't gotten to the part where it splits off into a 
> thread:
> {noformat}
> "zk-permission-watcher4-thread-1" #160 daemon prio=5 os_prio=0 tid=0x01750800 nid=0x6fd9 waiting on condition [0x7f0b5dce5000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x0007436ecea0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
>         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
>         at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> {noformat}
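For contrast with the stuck event thread, the general idea of handing the work to an executor so the notification callback returns immediately can be sketched like this. It is an assumption-laden illustration, not what HBASE-14370 or any attached patch actually does: `OffloadingWatcherSketch`, `refreshed`, and `shutdownAndWait` are hypothetical names, and the "refresh" is just a list append.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class OffloadingWatcherSketch {
  // A single worker thread keeps updates in the order they were received.
  private final ExecutorService executor = Executors.newSingleThreadExecutor();
  // Hypothetical record of processed children, standing in for the real refresh work.
  final List<String> refreshed = new CopyOnWriteArrayList<>();

  /** Called on the (simulated) event thread; it only enqueues work and returns. */
  void nodeChildrenChanged(List<String> children) {
    executor.execute(() -> refreshed.addAll(children));
  }

  /** Stops accepting work and waits for queued tasks to finish. */
  boolean shutdownAndWait(long timeoutMs) {
    executor.shutdown();
    try {
      return executor.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    OffloadingWatcherSketch w = new OffloadingWatcherSketch();
    w.nodeChildrenChanged(Arrays.asList("acl-a", "acl-b"));
    w.nodeChildrenChanged(Arrays.asList("acl-c"));
    boolean done = w.shutdownAndWait(1000);
    System.out.println(done + " " + w.refreshed); // prints "true [acl-a, acl-b, acl-c]"
  }
}
```

Because the callback never blocks, a slow or stalled refresh delays only later permission updates queued behind it, not unrelated ZK notifications such as flush and snapshot.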
> h3. Solutions
> There are a few approaches we can take to fix this; I think they are all 
> complementary. It might be useful to file subtasks or new issues for some of 
> the solutions if they are longer term.
> # Move flush and snapshot to ProcedureV2. This makes my proximate problem go 
> away, but it's only relevant to branch-2 and master, and doesn't fix anything 
> on branch-1. Also, Permissions updates would still get stuck, preventing 
> future permissions updates. I think this is important long term for the 
> robustness of the system, but not a viable short term fix.
> # Add an Executor to ZookeeperWatcher and launch threads from there. Maybe 
> 

[jira] [Updated] (HBASE-18628) ZKPermissionWatcher blocks all ZK notifications

2017-08-21 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18628:
--
Status: Patch Available  (was: Open)

> ZKPermissionWatcher blocks all ZK notifications
> ---
>
> Key: HBASE-18628
> URL: https://issues.apache.org/jira/browse/HBASE-18628
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Reporter: Mike Drob
> Assignee: Mike Drob
> Priority: Critical
> Attachments: HBASE-18628.patch, HBASE-18628.v2.patch, 
> HBASE-18628.v3.patch, HBASE-18628.v4.patch, jstack
>
>

[jira] [Commented] (HBASE-18628) ZKPermissionWatcher blocks all ZK notifications

2017-08-21 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16135300#comment-16135300
 ] 

Mike Drob commented on HBASE-18628:
---

[~apurtell], [~mantonov] - The branch-1 backport is not clean. As the release 
managers, are you interested in seeing this come back to 1.3/1.4, or do you 
think that, since nobody else has reported it on those branches, it may not be 
a relevant fix?

(Shouldn't affect 1.2 since HBASE-14370 never got there)

> ZKPermissionWatcher blocks all ZK notifications
> ---
>
> Key: HBASE-18628
> URL: https://issues.apache.org/jira/browse/HBASE-18628
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Reporter: Mike Drob
> Assignee: Mike Drob
> Priority: Critical
> Attachments: HBASE-18628.patch, HBASE-18628.v2.patch, 
> HBASE-18628.v3.patch, HBASE-18628.v4.patch, jstack
>
>
> Buckle up folks, we're going for a ride here. I've seeing this on a branch-2 
> based build, but I think the problem will affect branch-1 as well. I'm not 
> able to easily reproduce the issue, but it will usually come up within an 
> hour on a given cluster that I have, at which point the problem persists 
> until an RS restart. I've been seeing the problem and paying attention for 
> maybe two months, but I suspect it's been happening much longer than that.
> h3. Problem
> When running in a secure cluster, sometimes the ZK EventThread will get stuck 
> on a permissions update and not be able to process new notifications. This 
> happens to also block flush and snapshot, which is how we found it.
> h3. Analysis
> The main smoking gun is seeing this in repeated jstacks:
> {noformat}
> "main-EventThread" #43 daemon prio=5 os_prio=0 tid=0x7f0b92644000 
> nid=0x6e69 waiting on condition [0x7f0b6730f000]
>java.lang.Thread.State: TIMED_WAITING (sleeping)
> at java.lang.Thread.sleep(Native Method)
> at 
> org.apache.hadoop.hbase.security.access.ZKPermissionWatcher.nodeChildrenChanged(ZKPermissionWatcher.java:191)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:503)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> {noformat}
> That sleep is a 20ms sleep in an {{AtomicReference.compareAndSet}} loop - but 
> it never gets past the condition.
> {code}
> while (!nodes.compareAndSet(null, nodeList)) {
>   try {
> Thread.sleep(20);
>   } catch (InterruptedException e) {
> LOG.warn("Interrupted while setting node list", e);
> Thread.currentThread().interrupt();
>   }
> }
> {code}
> The warning never shows up in the logs, it just keeps looping and looping. 
> The last relevant line from the watcher in logs is:
> {noformat}
> 2017-08-17 21:25:12,379 DEBUG 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: 
> regionserver:22101-0x15df38884c80024, quorum=zk1:2181,zk2:2181,zk3:2181, 
> baseZNode=/hbase Received ZooKeeper Event, type=NodeChildrenChanged, 
> state=SyncConnected, path=/hbase/acl
> {noformat}
> Which makes sense, because the code snippet is from permission watcher's 
> {{nodeChildrenChanged}} handler.
> The separate thread introduced in HBASE-14370 is present, but not doing 
> anything. And this event hasn't gotten to the part where it splits off into a 
> thread:
> {noformat}
> "zk-permission-watcher4-thread-1" #160 daemon prio=5 os_prio=0 
> tid=0x01750800 nid=0x6fd9 waiting on condition [0x7f0b5dce5000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x0007436ecea0> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
> h3. Solutions
> There are a few approaches we can take to fix this; I think they are all 
> complementary. It might be useful to file subtasks or new issues for some of 
> the solutions if they are longer term.
> # Move flush and snapshot to ProcedureV2. This makes my proximate problem go 
> away, but it's only 

[jira] [Updated] (HBASE-18628) ZKPermissionWatcher blocks all ZK notifications

2017-08-21 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18628:
--
Fix Version/s: 2.0.0-alpha-3
   3.0.0

> ZKPermissionWatcher blocks all ZK notifications
> ---
>
> Key: HBASE-18628
> URL: https://issues.apache.org/jira/browse/HBASE-18628
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Critical
> Fix For: 3.0.0, 2.0.0-alpha-3
>
> Attachments: HBASE-18628.patch, HBASE-18628.v2.patch, 
> HBASE-18628.v3.patch, HBASE-18628.v4.patch, jstack
>
>
> Buckle up folks, we're going for a ride here. I've been seeing this on a branch-2 
> based build, but I think the problem will affect branch-1 as well. I'm not 
> able to easily reproduce the issue, but it will usually come up within an 
> hour on a given cluster that I have, at which point the problem persists 
> until an RS restart. I've been seeing the problem and paying attention for 
> maybe two months, but I suspect it's been happening much longer than that.
> h3. Problem
> When running in a secure cluster, sometimes the ZK EventThread will get stuck 
> on a permissions update and not be able to process new notifications. This 
> happens to also block flush and snapshot, which is how we found it.
> h3. Analysis
> The main smoking gun is seeing this in repeated jstacks:
> {noformat}
> "main-EventThread" #43 daemon prio=5 os_prio=0 tid=0x7f0b92644000 
> nid=0x6e69 waiting on condition [0x7f0b6730f000]
>java.lang.Thread.State: TIMED_WAITING (sleeping)
> at java.lang.Thread.sleep(Native Method)
> at 
> org.apache.hadoop.hbase.security.access.ZKPermissionWatcher.nodeChildrenChanged(ZKPermissionWatcher.java:191)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:503)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> {noformat}
> That sleep is a 20ms sleep in an {{AtomicReference.compareAndSet}} loop - but 
> it never gets past the condition.
> {code}
> while (!nodes.compareAndSet(null, nodeList)) {
>   try {
>     Thread.sleep(20);
>   } catch (InterruptedException e) {
>     LOG.warn("Interrupted while setting node list", e);
>     Thread.currentThread().interrupt();
>   }
> }
> {code}
> The warning never shows up in the logs, it just keeps looping and looping. 
> The last relevant line from the watcher in logs is:
> {noformat}
> 2017-08-17 21:25:12,379 DEBUG 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: 
> regionserver:22101-0x15df38884c80024, quorum=zk1:2181,zk2:2181,zk3:2181, 
> baseZNode=/hbase Received ZooKeeper Event, type=NodeChildrenChanged, 
> state=SyncConnected, path=/hbase/acl
> {noformat}
> Which makes sense, because the code snippet is from permission watcher's 
> {{nodeChildrenChanged}} handler.
> The separate thread introduced in HBASE-14370 is present, but not doing 
> anything. And this event hasn't gotten to the part where it splits off into a 
> thread:
> {noformat}
> "zk-permission-watcher4-thread-1" #160 daemon prio=5 os_prio=0 
> tid=0x01750800 nid=0x6fd9 waiting on condition [0x7f0b5dce5000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x0007436ecea0> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
> h3. Solutions
> There are a few approaches we can take to fix this; I think they are all 
> complementary. It might be useful to file subtasks or new issues for some of 
> the solutions if they are longer term.
> # Move flush and snapshot to ProcedureV2. This makes my proximate problem go 
> away, but it's only relevant to branch-2 and master, and doesn't fix anything 
> on branch-1. Also, Permissions updates would still get stuck, preventing 
> future permissions updates. I think this is important long term for the 
> robustness of the system, but 

[jira] [Updated] (HBASE-18628) ZKPermissionWatcher blocks all ZK notifications

2017-08-21 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18628:
--
Status: Patch Available  (was: Open)

> ZKPermissionWatcher blocks all ZK notifications
> ---
>
> Key: HBASE-18628
> URL: https://issues.apache.org/jira/browse/HBASE-18628
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Critical
> Attachments: HBASE-18628.patch, HBASE-18628.v2.patch, 
> HBASE-18628.v3.patch, jstack
>
>
> Buckle up folks, we're going for a ride here. I've been seeing this on a branch-2 
> based build, but I think the problem will affect branch-1 as well. I'm not 
> able to easily reproduce the issue, but it will usually come up within an 
> hour on a given cluster that I have, at which point the problem persists 
> until an RS restart. I've been seeing the problem and paying attention for 
> maybe two months, but I suspect it's been happening much longer than that.
> h3. Problem
> When running in a secure cluster, sometimes the ZK EventThread will get stuck 
> on a permissions update and not be able to process new notifications. This 
> happens to also block flush and snapshot, which is how we found it.
> h3. Analysis
> The main smoking gun is seeing this in repeated jstacks:
> {noformat}
> "main-EventThread" #43 daemon prio=5 os_prio=0 tid=0x7f0b92644000 
> nid=0x6e69 waiting on condition [0x7f0b6730f000]
>java.lang.Thread.State: TIMED_WAITING (sleeping)
> at java.lang.Thread.sleep(Native Method)
> at 
> org.apache.hadoop.hbase.security.access.ZKPermissionWatcher.nodeChildrenChanged(ZKPermissionWatcher.java:191)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:503)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> {noformat}
> That sleep is a 20ms sleep in an {{AtomicReference.compareAndSet}} loop - but 
> it never gets past the condition.
> {code}
> while (!nodes.compareAndSet(null, nodeList)) {
>   try {
>     Thread.sleep(20);
>   } catch (InterruptedException e) {
>     LOG.warn("Interrupted while setting node list", e);
>     Thread.currentThread().interrupt();
>   }
> }
> {code}
> The warning never shows up in the logs, it just keeps looping and looping. 
> The last relevant line from the watcher in logs is:
> {noformat}
> 2017-08-17 21:25:12,379 DEBUG 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: 
> regionserver:22101-0x15df38884c80024, quorum=zk1:2181,zk2:2181,zk3:2181, 
> baseZNode=/hbase Received ZooKeeper Event, type=NodeChildrenChanged, 
> state=SyncConnected, path=/hbase/acl
> {noformat}
> Which makes sense, because the code snippet is from permission watcher's 
> {{nodeChildrenChanged}} handler.
> The separate thread introduced in HBASE-14370 is present, but not doing 
> anything. And this event hasn't gotten to the part where it splits off into a 
> thread:
> {noformat}
> "zk-permission-watcher4-thread-1" #160 daemon prio=5 os_prio=0 
> tid=0x01750800 nid=0x6fd9 waiting on condition [0x7f0b5dce5000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x0007436ecea0> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
> h3. Solutions
> There are a few approaches we can take to fix this; I think they are all 
> complementary. It might be useful to file subtasks or new issues for some of 
> the solutions if they are longer term.
> # Move flush and snapshot to ProcedureV2. This makes my proximate problem go 
> away, but it's only relevant to branch-2 and master, and doesn't fix anything 
> on branch-1. Also, Permissions updates would still get stuck, preventing 
> future permissions updates. I think this is important long term for the 
> robustness of the system, but not a viable short term fix.
> # Add an Executor to ZookeeperWatcher and launch 

[jira] [Updated] (HBASE-18628) ZKPermissionWatcher blocks all ZK notifications

2017-08-21 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18628:
--
Attachment: HBASE-18628.v3.patch

v3:
* fix empty if block
* verify whitespace changes

> ZKPermissionWatcher blocks all ZK notifications
> ---
>
> Key: HBASE-18628
> URL: https://issues.apache.org/jira/browse/HBASE-18628
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Critical
> Attachments: HBASE-18628.patch, HBASE-18628.v2.patch, 
> HBASE-18628.v3.patch, jstack
>
>
> Buckle up folks, we're going for a ride here. I've been seeing this on a branch-2 
> based build, but I think the problem will affect branch-1 as well. I'm not 
> able to easily reproduce the issue, but it will usually come up within an 
> hour on a given cluster that I have, at which point the problem persists 
> until an RS restart. I've been seeing the problem and paying attention for 
> maybe two months, but I suspect it's been happening much longer than that.
> h3. Problem
> When running in a secure cluster, sometimes the ZK EventThread will get stuck 
> on a permissions update and not be able to process new notifications. This 
> happens to also block flush and snapshot, which is how we found it.
> h3. Analysis
> The main smoking gun is seeing this in repeated jstacks:
> {noformat}
> "main-EventThread" #43 daemon prio=5 os_prio=0 tid=0x7f0b92644000 
> nid=0x6e69 waiting on condition [0x7f0b6730f000]
>java.lang.Thread.State: TIMED_WAITING (sleeping)
> at java.lang.Thread.sleep(Native Method)
> at 
> org.apache.hadoop.hbase.security.access.ZKPermissionWatcher.nodeChildrenChanged(ZKPermissionWatcher.java:191)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:503)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> {noformat}
> That sleep is a 20ms sleep in an {{AtomicReference.compareAndSet}} loop - but 
> it never gets past the condition.
> {code}
> while (!nodes.compareAndSet(null, nodeList)) {
>   try {
>     Thread.sleep(20);
>   } catch (InterruptedException e) {
>     LOG.warn("Interrupted while setting node list", e);
>     Thread.currentThread().interrupt();
>   }
> }
> {code}
> The warning never shows up in the logs, it just keeps looping and looping. 
> The last relevant line from the watcher in logs is:
> {noformat}
> 2017-08-17 21:25:12,379 DEBUG 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: 
> regionserver:22101-0x15df38884c80024, quorum=zk1:2181,zk2:2181,zk3:2181, 
> baseZNode=/hbase Received ZooKeeper Event, type=NodeChildrenChanged, 
> state=SyncConnected, path=/hbase/acl
> {noformat}
> Which makes sense, because the code snippet is from permission watcher's 
> {{nodeChildrenChanged}} handler.
> The separate thread introduced in HBASE-14370 is present, but not doing 
> anything. And this event hasn't gotten to the part where it splits off into a 
> thread:
> {noformat}
> "zk-permission-watcher4-thread-1" #160 daemon prio=5 os_prio=0 
> tid=0x01750800 nid=0x6fd9 waiting on condition [0x7f0b5dce5000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x0007436ecea0> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
> h3. Solutions
> There are a few approaches we can take to fix this; I think they are all 
> complementary. It might be useful to file subtasks or new issues for some of 
> the solutions if they are longer term.
> # Move flush and snapshot to ProcedureV2. This makes my proximate problem go 
> away, but it's only relevant to branch-2 and master, and doesn't fix anything 
> on branch-1. Also, Permissions updates would still get stuck, preventing 
> future permissions updates. I think this is important long term for the 
> robustness of the system, but not a viable short term fix.
> # 
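
The spin described in the Analysis section can be reproduced in isolation. The following is a minimal sketch, not the ZKPermissionWatcher code: the class name CasLoopSketch and its methods are hypothetical, and the bounded retry and getAndSet variants are only illustrations of ways to avoid blocking the calling thread, not the fix in the attached patches. It shows that compareAndSet(null, x) keeps failing for as long as nothing resets the reference to null.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch, not HBase code.
class CasLoopSketch {
  // Stand-in for the 'nodes' reference from the description.
  static final AtomicReference<List<String>> nodes = new AtomicReference<>();

  // Same shape as the loop in the description, but it gives up after
  // maxAttempts (or on interrupt) instead of retrying forever.
  static boolean trySetBounded(List<String> nodeList, int maxAttempts) {
    for (int i = 0; i < maxAttempts; i++) {
      if (nodes.compareAndSet(null, nodeList)) {
        return true;
      }
      try {
        Thread.sleep(20);
      } catch (InterruptedException e) {
        // Restore the interrupt flag and stop retrying.
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return false;
  }

  // Alternative: unconditionally replace whatever is pending and
  // hand back the previous value (possibly null).
  static List<String> replacePending(List<String> nodeList) {
    return nodes.getAndSet(nodeList);
  }

  public static void main(String[] args) {
    nodes.set(null);
    // The reference is null, so the first attempt succeeds.
    System.out.println(trySetBounded(Arrays.asList("a"), 3));
    // Nothing has cleared the reference, so every further attempt fails;
    // an unbounded loop would never exit here.
    System.out.println(trySetBounded(Arrays.asList("b"), 3));
  }
}
```

Whether bounding the retries or replacing the pending value is acceptable depends on what the consumer of the reference expects, which this sketch does not model.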

[jira] [Updated] (HBASE-18656) Address issues found by error-prone in hbase-common

2017-08-23 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18656:
--
Status: Patch Available  (was: Open)

> Address issues found by error-prone in hbase-common
> ---
>
> Key: HBASE-18656
> URL: https://issues.apache.org/jira/browse/HBASE-18656
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: Mike Drob
>Assignee: Mike Drob
>  Labels: beginner
> Fix For: 3.0.0
>
> Attachments: HBASE-18656.patch, HBASE-18656.v2.patch
>
>
> We should address the new compilation errors found by running with 
> {{-PerrorProne}}.
> Can convert this to a top-level task and add subtasks for modules if desired 
> (in which case, link it back to parent issue HBASE-12187, please)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18656) Address issues found by error-prone in hbase-common

2017-08-23 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18656:
--
Attachment: HBASE-18656.v2.patch

v2: Update tests for ConcatenatedList

> Address issues found by error-prone in hbase-common
> ---
>
> Key: HBASE-18656
> URL: https://issues.apache.org/jira/browse/HBASE-18656
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: Mike Drob
>Assignee: Mike Drob
>  Labels: beginner
> Fix For: 3.0.0
>
> Attachments: HBASE-18656.patch, HBASE-18656.v2.patch
>
>
> We should address the new compilation errors found by running with 
> {{-PerrorProne}}.
> Can convert this to a top-level task and add subtasks for modules if desired 
> (in which case, link it back to parent issue HBASE-12187, please)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18656) Address issues found by error-prone in hbase-common

2017-08-23 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18656:
--
Status: Open  (was: Patch Available)

> Address issues found by error-prone in hbase-common
> ---
>
> Key: HBASE-18656
> URL: https://issues.apache.org/jira/browse/HBASE-18656
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: Mike Drob
>Assignee: Mike Drob
>  Labels: beginner
> Fix For: 3.0.0
>
> Attachments: HBASE-18656.patch
>
>
> We should address the new compilation errors found by running with 
> {{-PerrorProne}}.
> Can convert this to a top-level task and add subtasks for modules if desired 
> (in which case, link it back to parent issue HBASE-12187, please)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18503) Change ***Util and Master to use TableDescriptor and ColumnFamilyDescriptor

2017-08-23 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138423#comment-16138423
 ] 

Mike Drob commented on HBASE-18503:
---

+1

> Change ***Util and Master to use TableDescriptor and ColumnFamilyDescriptor
> ---
>
> Key: HBASE-18503
> URL: https://issues.apache.org/jira/browse/HBASE-18503
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
> Fix For: 2.0.0
>
> Attachments: HBASE-18503.v0.patch, HBASE-18503.v1.patch, 
> HBASE-18503.v1.patch, HBASE-18503.v2.patch, HBASE-18503.v2.patch, 
> HBASE-18503.v2.patch, HBASE-18503.v3.patch
>
>
> These helper classes accept the HTD and HCD as argument. We need to make some 
> changes for them, otherwise we will be forced to use HTD and HCD.
> # SecureTestUtil
> # MobSnapshotTestingUtils
> # HMaster



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18665) ReversedScannerCallable invokes getRegionLocations incorrectly

2017-08-23 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138420#comment-16138420
 ] 

Mike Drob commented on HBASE-18665:
---

Good find, [~psomogyi]! Do you know what is the impact here?

Performance? Correctness?

> ReversedScannerCallable invokes getRegionLocations incorrectly
> --
>
> Key: HBASE-18665
> URL: https://issues.apache.org/jira/browse/HBASE-18665
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 3.0.0, 1.4.0, 1.3.2, 1.5.0, 1.2.7, 1.1.13
>Reporter: Peter Somogyi
>Assignee: Peter Somogyi
>
> The ReversedScannerCallable#prepare [1] and ScannerCallable#prepare [2] 
> methods differ in how they call the 
> RpcRetryingCallerWithReadReplicas.getRegionLocations method.
> The reversed scanner uses the 'reload' parameter directly as the first 
> argument - RpcRetryingCallerWithReadReplicas.getRegionLocations(reload, id, 
> getConnection(), getTableName(), getRow()) - however, the forward scanner 
> passes '!reload'. The first parameter of getRegionLocations is 'useCache', so 
> the way it is called in ScannerCallable is the correct one.
> The same call can be found in ReversedScannerCallable#locateRegionsInRange 
> [3] also without negating its value.
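
The inverted flag described above can be illustrated with a small stand-in. This is a sketch only: UseCacheSketch and its methods are hypothetical and take a single boolean, whereas the real getRegionLocations takes further arguments as quoted in the description. It shows only how passing 'reload' where 'useCache' is expected flips the meaning of the call.

```java
// Hypothetical stand-in, not HBase code.
class UseCacheSketch {
  // Stand-in for a lookup whose first parameter means "use the cache".
  static String getRegionLocations(boolean useCache) {
    return useCache ? "cached" : "fresh lookup";
  }

  // Forward-scanner style: a reload request disables the cache.
  static String forwardPrepare(boolean reload) {
    return getRegionLocations(!reload);
  }

  // Reversed-scanner style as reported: 'reload' is passed straight through,
  // so a reload request ends up enabling the cache.
  static String reversedPrepareAsReported(boolean reload) {
    return getRegionLocations(reload);
  }

  public static void main(String[] args) {
    System.out.println("forward,  reload=true: " + forwardPrepare(true));
    System.out.println("reversed, reload=true: " + reversedPrepareAsReported(true));
  }
}
```

With reload=true the forward variant asks for a fresh lookup while the reported variant asks for the cached value, and the reverse holds for reload=false.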



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18677) typo in namespace docs

2017-08-25 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18677:
--
Labels: beginner  (was: )

> typo in namespace docs
> --
>
> Key: HBASE-18677
> URL: https://issues.apache.org/jira/browse/HBASE-18677
> Project: HBase
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Mike Drob
>Priority: Trivial
>  Labels: beginner
>
> In the docs at http://hbase.apache.org/book.html#_namespace - "Region server 
> groups (HBASE-6721) - A namespace/table can be pinned onto a subset of 
> RegionServers thus guaranteeing a course level of isolation."
> Should be "coarse"



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18676) Update branch 1.3 pom version

2017-08-25 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18676:
--
Status: Patch Available  (was: Open)

> Update branch 1.3 pom version
> -
>
> Key: HBASE-18676
> URL: https://issues.apache.org/jira/browse/HBASE-18676
> Project: HBase
>  Issue Type: Task
>  Components: build
>Affects Versions: 1.3.1
>Reporter: Mike Drob
>Priority: Minor
> Fix For: 1.3.2
>
> Attachments: HBASE-18676.branch-1.3.patch
>
>
> Branch 1.3 currently has version=1.3.1 set in the poms. I think this should 
> be 1.3.2-SNAPSHOT, in case anybody tries to deploy nightlies for any reason.
> WDYT [~mantonov]?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18676) Update branch 1.3 pom version

2017-08-25 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18676:
--
Attachment: HBASE-18676.branch-1.3.patch

Patch with version update. I'm going to have spotty connectivity for a bit, so 
whoever reviews this can go ahead and push it as well if there are no issues.

> Update branch 1.3 pom version
> -
>
> Key: HBASE-18676
> URL: https://issues.apache.org/jira/browse/HBASE-18676
> Project: HBase
>  Issue Type: Task
>  Components: build
>Affects Versions: 1.3.1
>Reporter: Mike Drob
>Priority: Minor
> Fix For: 1.3.2
>
> Attachments: HBASE-18676.branch-1.3.patch
>
>
> Branch 1.3 currently has version=1.3.1 set in the poms. I think this should 
> be 1.3.2-SNAPSHOT, in case anybody tries to deploy nightlies for any reason.
> WDYT [~mantonov]?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HBASE-18676) Update branch 1.3 pom version

2017-08-25 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob reassigned HBASE-18676:
-

Assignee: Mike Drob

> Update branch 1.3 pom version
> -
>
> Key: HBASE-18676
> URL: https://issues.apache.org/jira/browse/HBASE-18676
> Project: HBase
>  Issue Type: Task
>  Components: build
>Affects Versions: 1.3.1
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Minor
> Fix For: 1.3.2
>
> Attachments: HBASE-18676.branch-1.3.patch
>
>
> Branch 1.3 currently has version=1.3.1 set in the poms. I think this should 
> be 1.3.2-SNAPSHOT, in case anybody tries to deploy nightlies for any reason.
> WDYT [~mantonov]?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18629) Enhance ChaosMonkeyRunner with interruptibility

2017-08-21 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16135439#comment-16135439
 ] 

Mike Drob commented on HBASE-18629:
---

Do we have to start the monkey using ChaosMonkeyRunner? Can we start a monkey 
directly?

> Enhance ChaosMonkeyRunner with interruptibility
> ---
>
> Key: HBASE-18629
> URL: https://issues.apache.org/jira/browse/HBASE-18629
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 3.0.0, 2.0.0-alpha-3
>
> Attachments: 18629.addendum, 18629.v1.txt, 18629.v2.txt
>
>
> Currently ChaosMonkeyRunner performs looping unconditionally:
> {code}
> while (true) {// loop here until got killed
>   Thread.sleep(1);
> }
> {code}
> When ChaosMonkeyRunner is invoked programmatically, it is desirable to add 
> interruptibility to the runner so that the caller can manage its lifetime.
> Another enhancement is to allow passing the path to hbase-site.xml where 
> chaos monkey parameters are specified.
> This is useful when the underlying hbase-site.xml is not on classpath.
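
One way to make such a loop interruptible is sketched below. This is an illustration under assumptions, not the change in the attached patches: InterruptibleRunnerSketch, stop() and isFinished() are hypothetical names, and a CountDownLatch is just one possible stop signal.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not the ChaosMonkeyRunner implementation.
class InterruptibleRunnerSketch {
  private final CountDownLatch stopRequested = new CountDownLatch(1);
  private volatile boolean finished = false;

  // Blocks until stop() is called or the running thread is interrupted,
  // instead of looping unconditionally.
  void run() {
    try {
      while (!stopRequested.await(1, TimeUnit.SECONDS)) {
        // Still running; periodic work could go here.
      }
    } catch (InterruptedException e) {
      // Preserve the interrupt status for the caller.
      Thread.currentThread().interrupt();
    } finally {
      finished = true;
    }
  }

  // Lets the caller end the runner's lifetime.
  void stop() {
    stopRequested.countDown();
  }

  boolean isFinished() {
    return finished;
  }

  public static void main(String[] args) throws InterruptedException {
    InterruptibleRunnerSketch runner = new InterruptibleRunnerSketch();
    Thread t = new Thread(runner::run);
    t.start();
    runner.stop();
    t.join();
    System.out.println("finished: " + runner.isFinished());
  }
}
```

Because the stop signal lives on the instance rather than in static state, several such runners could be started and stopped independently.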



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18629) Enhance ChaosMonkeyRunner with interruptibility

2017-08-21 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16135432#comment-16135432
 ] 

Mike Drob commented on HBASE-18629:
---

Hmmm, I'm not super happy with that approach. Is it conceivable that somebody 
could want to run multiple chaos monkeys? I think so.

Would it make more sense to expose the runner instead?

> Enhance ChaosMonkeyRunner with interruptibility
> ---
>
> Key: HBASE-18629
> URL: https://issues.apache.org/jira/browse/HBASE-18629
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 3.0.0, 2.0.0-alpha-3
>
> Attachments: 18629.addendum, 18629.v1.txt, 18629.v2.txt
>
>
> Currently ChaosMonkeyRunner performs looping unconditionally:
> {code}
> while (true) {// loop here until got killed
>   Thread.sleep(1);
> }
> {code}
> When ChaosMonkeyRunner is invoked programmatically, it is desirable to add 
> interruptibility to the runner so that the caller can manage its lifetime.
> Another enhancement is to allow passing the path to hbase-site.xml where 
> chaos monkey parameters are specified.
> This is useful when the underlying hbase-site.xml is not on classpath.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-12349) Add Maven build support module for a custom version of error-prone

2017-08-21 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-12349:
--
Attachment: HBASE-12349.v6.patch

v6.

So... the reason that the hbase-error-prone/target shows up and something like 
hbase-shaded-client/target doesn't is because error-prone module is built 
before the assembly, while shaded-client is built after. Updated the exclude 
set to {{**/target/}} to capture sub-directories.

I don't know what is up with hbase-protocol-shaded/dependency-reduced-pom.xml

> Add Maven build support module for a custom version of error-prone
> --
>
> Key: HBASE-12349
> URL: https://issues.apache.org/jira/browse/HBASE-12349
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Mike Drob
>Priority: Blocker
> Fix For: 2.0.0
>
> Attachments: HBASE-12349.patch, HBASE-12349.v2.patch, 
> HBASE-12349.v3.patch, HBASE-12349.v4.patch, HBASE-12349.v5.patch, 
> HBASE-12349.v6.patch
>
>
> Add a new Maven build support module that builds and publishes a custom 
> error-prone artifact for use by the rest of the build.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-12349) Add Maven build support module for a custom version of error-prone

2017-08-21 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-12349:
--
Status: Patch Available  (was: Open)

> Add Maven build support module for a custom version of error-prone
> --
>
> Key: HBASE-12349
> URL: https://issues.apache.org/jira/browse/HBASE-12349
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Mike Drob
>Priority: Blocker
> Fix For: 2.0.0
>
> Attachments: HBASE-12349.patch, HBASE-12349.v2.patch, 
> HBASE-12349.v3.patch, HBASE-12349.v4.patch, HBASE-12349.v5.patch, 
> HBASE-12349.v6.patch
>
>
> Add a new Maven build support module that builds and publishes a custom 
> error-prone artifact for use by the rest of the build.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-14785) Hamburger menu for mobile site

2017-08-19 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16134213#comment-16134213
 ] 

Mike Drob commented on HBASE-14785:
---

sometimes probably means that it happens with some commands and not others, but 
i haven't been paying enough attention to figure out which ones. e.g. maybe 
site triggers it, but package doesn't.

> Hamburger menu for mobile site
> --
>
> Key: HBASE-14785
> URL: https://issues.apache.org/jira/browse/HBASE-14785
> Project: HBase
>  Issue Type: Sub-task
>  Components: website
>Affects Versions: 2.0.0
>Reporter: Misty Stanley-Jones
>Assignee: Misty Stanley-Jones
> Fix For: 2.0.0
>
> Attachments: 
> 0001-HBASE-14774-Addendum-Exclude-src-main-site-resources.patch, 
> HBASE-14774-addendum2_tweak_css.patch, HBASE-14785-addendum.patch, 
> HBASE-14785-addendum-v1.patch, HBASE-14785.patch, HBASE-14785-v1.patch, 
> HBASE-14785-v2.patch, maven-fluido-skin-1.5-HBASE.jar
>
>
> Figure out how to do a hamburger menu on mobile.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18628) ZKPermissionWatcher blocks all ZK notifications

2017-08-18 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18628:
--
Attachment: jstack

Attaching jstack from the RS.

> ZKPermissionWatcher blocks all ZK notifications
> ---
>
> Key: HBASE-18628
> URL: https://issues.apache.org/jira/browse/HBASE-18628
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Mike Drob
>Priority: Critical
> Attachments: jstack
>
>
> Buckle up folks, we're going for a ride here. I've been seeing this on a branch-2 
> based build, but I think the problem will affect branch-1 as well. I'm not 
> able to easily reproduce the issue, but it will usually come up within an 
> hour on a given cluster that I have, at which point the problem persists 
> until an RS restart. I've been seeing the problem and paying attention for 
> maybe two months, but I suspect it's been happening much longer than that.
> h3. Problem
> When running in a secure cluster, sometimes the ZK EventThread will get stuck 
> on a permissions update and not be able to process new notifications. This 
> happens to also block flush and snapshot, which is how we found it.
> h3. Analysis
> The main smoking gun is seeing this in repeated jstacks:
> {noformat}
> "main-EventThread" #43 daemon prio=5 os_prio=0 tid=0x7f0b92644000 
> nid=0x6e69 waiting on condition [0x7f0b6730f000]
>java.lang.Thread.State: TIMED_WAITING (sleeping)
> at java.lang.Thread.sleep(Native Method)
> at 
> org.apache.hadoop.hbase.security.access.ZKPermissionWatcher.nodeChildrenChanged(ZKPermissionWatcher.java:191)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:503)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> {noformat}
> That sleep is a 20ms sleep in an {{AtomicReference.compareAndSet}} loop - but 
> it never gets past the condition.
> {code}
> while (!nodes.compareAndSet(null, nodeList)) {
>   try {
>     Thread.sleep(20);
>   } catch (InterruptedException e) {
>     LOG.warn("Interrupted while setting node list", e);
>     Thread.currentThread().interrupt();
>   }
> }
> {code}
> The warning never shows up in the logs; it just keeps looping and looping. 
> The last relevant line from the watcher in logs is:
> {noformat}
> 2017-08-17 21:25:12,379 DEBUG 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: 
> regionserver:22101-0x15df38884c80024, quorum=zk1:2181,zk2:2181,zk3:2181, 
> baseZNode=/hbase Received ZooKeeper Event, type=NodeChildrenChanged, 
> state=SyncConnected, path=/hbase/acl
> {noformat}
> Which makes sense, because the code snippet is from permission watcher's 
> {{nodeChildrenChanged}} handler.
> The separate thread introduced in HBASE-14370 is present, but not doing 
> anything. And this event hasn't gotten to the part where it splits off into a 
> thread:
> {noformat}
> "zk-permission-watcher4-thread-1" #160 daemon prio=5 os_prio=0 tid=0x01750800 nid=0x6fd9 waiting on condition [0x7f0b5dce5000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x0007436ecea0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
>         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
>         at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> {noformat}
> h3. Solutions
> There are a few approaches we can take to fix this; I think they are all 
> complementary. It might be useful to file subtasks or new issues for some of 
> the solutions if they are longer term.
> # Move flush and snapshot to ProcedureV2. This makes my proximate problem go 
> away, but it's only relevant to branch-2 and master, and doesn't fix anything 
> on branch-1. Also, permissions updates would still get stuck, preventing 
> future permissions updates. I think this is important long term for the 
> robustness of the system, but not a viable short-term fix.
> # Add an Executor to ZooKeeperWatcher and launch threads from there. Maybe 
> we'd want to pull the Executor out of ZKPW, but that's not 
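
To make the loop in the Analysis section concrete: `compareAndSet(null, nodeList)` can only succeed while the reference currently holds `null`. The sketch below is a hypothetical, stand-alone illustration (the class `CasLoopDemo`, the method `publish`, and the retry cap are assumptions made for the demo, not HBase code) of how a caller keeps failing when nothing ever resets the slot to `null`. It does not claim to identify why the slot stays non-null on the affected cluster.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

public class CasLoopDemo {
  /**
   * Tries to publish nodeList the same way the quoted loop does, but gives up
   * after maxAttempts so that the demo terminates. Returns true on success.
   */
  static boolean publish(AtomicReference<List<String>> nodes, List<String> nodeList,
      int maxAttempts) throws InterruptedException {
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      // The CAS succeeds only if the slot currently holds null.
      if (nodes.compareAndSet(null, nodeList)) {
        return true;
      }
      Thread.sleep(20);
    }
    return false;
  }

  public static void main(String[] args) throws InterruptedException {
    AtomicReference<List<String>> nodes = new AtomicReference<>(null);
    // First update: the slot is empty, so the CAS succeeds immediately.
    System.out.println(publish(nodes, Arrays.asList("acl-a"), 3)); // prints true
    // Second update: nobody cleared the slot, so every attempt fails.
    System.out.println(publish(nodes, Arrays.asList("acl-b"), 3)); // prints false
  }
}
```

Without the retry cap, the second call would sleep and retry indefinitely, which matches the `TIMED_WAITING (sleeping)` state seen in the jstack.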

[jira] [Assigned] (HBASE-18628) ZKPermissionWatcher blocks all ZK notifications

2017-08-18 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob reassigned HBASE-18628:
-

Assignee: Mike Drob

> ZKPermissionWatcher blocks all ZK notifications
> ---
>
> Key: HBASE-18628
> URL: https://issues.apache.org/jira/browse/HBASE-18628
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Critical
> Attachments: HBASE-18628.patch, jstack
>
>

[jira] [Updated] (HBASE-18628) ZKPermissionWatcher blocks all ZK notifications

2017-08-18 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18628:
--
Attachment: HBASE-18628.patch

Modifying code and redeploying to this particular cluster is kind of a 
troublesome process, so I'm not able to easily add additional debug statements.

I had also reached the same conclusion that there's something stuck in the CAS 
there, so why not avoid it entirely? Here's a strawman patch (no new tests) to 
do pre-emption using Java features instead of rolling our own method.
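
For readers without the attachment, here is one way "pre-emption using Java features" could look. This is a minimal sketch under assumptions, not the contents of HBASE-18628.patch: it assumes a single-threaded executor and uses `Future.cancel(true)` to supersede an older refresh, and the names `PreemptingRefresher`, `onChildrenChanged`, `apply`, and `lastApplied` are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicReference;

public class PreemptingRefresher {
  private final ExecutorService executor = Executors.newSingleThreadExecutor();
  private final AtomicReference<List<String>> lastApplied = new AtomicReference<>();
  private Future<?> pending; // guarded by "this"

  /** Called from the event thread. It only schedules work and never waits. */
  public synchronized Future<?> onChildrenChanged(List<String> nodeList) {
    if (pending != null) {
      // Pre-empt a refresh that is still queued or running.
      pending.cancel(true);
    }
    pending = executor.submit(() -> apply(nodeList));
    return pending;
  }

  /** Runs on the executor thread; stops early if a newer update cancelled it. */
  private void apply(List<String> nodeList) {
    for (String node : nodeList) {
      if (Thread.currentThread().isInterrupted()) {
        return; // superseded by a newer update
      }
      // Placeholder: a real handler would refresh the permissions for this node.
    }
    lastApplied.set(nodeList);
  }

  public List<String> lastApplied() {
    return lastApplied.get();
  }

  public void shutdown() {
    executor.shutdown();
  }

  public static void main(String[] args) throws Exception {
    PreemptingRefresher refresher = new PreemptingRefresher();
    refresher.onChildrenChanged(Arrays.asList("t1"));
    Future<?> latest = refresher.onChildrenChanged(Arrays.asList("t1", "t2"));
    latest.get(); // wait for the newest refresh only
    System.out.println(refresher.lastApplied()); // prints [t1, t2]
    refresher.shutdown();
  }
}
```

The point of the design is that the event thread hands off the node list and returns immediately, so a slow or stuck refresh cannot hold up other ZooKeeper notifications.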

> ZKPermissionWatcher blocks all ZK notifications
> ---
>
> Key: HBASE-18628
> URL: https://issues.apache.org/jira/browse/HBASE-18628
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Mike Drob
>Priority: Critical
> Attachments: HBASE-18628.patch, jstack
>
>

[jira] [Updated] (HBASE-18628) ZKPermissionWatcher blocks all ZK notifications

2017-08-18 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18628:
--
Status: Patch Available  (was: Open)

> ZKPermissionWatcher blocks all ZK notifications
> ---
>
> Key: HBASE-18628
> URL: https://issues.apache.org/jira/browse/HBASE-18628
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Critical
> Attachments: HBASE-18628.patch, jstack
>
>

[jira] [Updated] (HBASE-18628) ZKPermissionWatcher blocks all ZK notifications

2017-08-21 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18628:
--
Status: Patch Available  (was: Open)

> ZKPermissionWatcher blocks all ZK notifications
> ---
>
> Key: HBASE-18628
> URL: https://issues.apache.org/jira/browse/HBASE-18628
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Critical
> Fix For: 3.0.0, 2.0.0-alpha-3
>
> Attachments: HBASE-18628.patch, HBASE-18628.v2.patch, 
> HBASE-18628.v3.patch, HBASE-18628.v4.patch, HBASE-18628.v5.patch, jstack
>
>

[jira] [Updated] (HBASE-18628) ZKPermissionWatcher blocks all ZK notifications

2017-08-21 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18628:
--
Status: Open  (was: Patch Available)

> ZKPermissionWatcher blocks all ZK notifications
> ---
>
> Key: HBASE-18628
> URL: https://issues.apache.org/jira/browse/HBASE-18628
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Critical
> Fix For: 3.0.0, 2.0.0-alpha-3
>
> Attachments: HBASE-18628.patch, HBASE-18628.v2.patch, 
> HBASE-18628.v3.patch, HBASE-18628.v4.patch, HBASE-18628.v5.patch, jstack
>
>

[jira] [Updated] (HBASE-18628) ZKPermissionWatcher blocks all ZK notifications

2017-08-21 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18628:
--
Attachment: HBASE-18628.v5.patch

v5: Retry v4. Also, actually fix the whitespace this time. :\

> ZKPermissionWatcher blocks all ZK notifications
> ---
>
> Key: HBASE-18628
> URL: https://issues.apache.org/jira/browse/HBASE-18628
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Critical
> Fix For: 3.0.0, 2.0.0-alpha-3
>
> Attachments: HBASE-18628.patch, HBASE-18628.v2.patch, 
> HBASE-18628.v3.patch, HBASE-18628.v4.patch, HBASE-18628.v5.patch, jstack
>
>

[jira] [Updated] (HBASE-10504) Define Replication Interface

2017-08-17 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-10504:
--
Fix Version/s: (was: 2.0.0)
   2.0.0-alpha-3

> Define Replication Interface
> 
>
> Key: HBASE-10504
> URL: https://issues.apache.org/jira/browse/HBASE-10504
> Project: HBase
>  Issue Type: Task
>  Components: Replication
>Reporter: stack
>Assignee: Enis Soztutar
>Priority: Blocker
> Fix For: 2.0.0-alpha-3
>
> Attachments: hbase-10504_v1.patch, hbase-10504_wip1.patch
>
>
> HBase has replication.  Fellas have been hijacking the replication apis to do 
> all kinds of perverse stuff like indexing hbase content (hbase-indexer 
> https://github.com/NGDATA/hbase-indexer) and our [~toffer] just showed up w/ 
> overrides that replicate via an alternate channel (over a secure thrift 
> channel between dcs over on HBASE-9360).  This issue is about surfacing these 
> APIs as public with guarantees to downstreamers similar to those we have on 
> our public client-facing APIs (and so we don't break them for downstreamers).
> Any input [~phunt] or [~gabriel.reid] or [~toffer]?
> Thanks.
>  





[jira] [Updated] (HBASE-10462) Recategorize some of the client facing Public / Private interfaces

2017-08-17 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-10462:
--
Fix Version/s: (was: 2.0.0)
   2.0.0-alpha-3

> Recategorize some of the client facing Public / Private interfaces
> --
>
> Key: HBASE-10462
> URL: https://issues.apache.org/jira/browse/HBASE-10462
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
>Priority: Blocker
> Fix For: 2.0.0-alpha-3
>
> Attachments: hbase-10462_wip1.patch
>
>
> We should go over the list of InterfaceAudience.Public interfaces one more 
> time to remove those that are NOT indeed public interfaces. 
> From current trunk, we should change these from public to private: 
> {code}
> ReversedScannerCallable
> ReversedClientScanner
> ClientScanner  (note that ResultScanner is public interface, while 
> ClientScanner should not be) 
> ClientSmallScanner
> TableSnapshotScanner -> We need a way of constructing this since it cannot be 
> constructed from HConnection / HTable. Maybe a basic factory. 
> {code}
> These are not marked: 
> {code}
> Registry, 
> ZooKeeperRegistry
> RpcRetryingCallerFactory
> ZooKeeperKeepAliveConnection
> AsyncProcess
> DelegatingRetryingCallable
> HConnectionKey
> MasterKeepAliveConnection
> MultiServerCallable
> {code}
> We can think about making these public interfaces: 
> {code}
> ScanMetrics
> {code}
> Add javadoc to: 
> {code}
> Query
> {code}
> We can add a test to find out all classes in client package to check for 
> interface mark. 
> We can extend this to brainstorm on the preferred API options. We probably 
> want the clients to use HTableInterface, instead of HTable everywhere. 
> HConnectionManager comes with a bazillion methods which are not intended for 
> public use, etc. 
> Raising this as blocker to 1.0
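
The test suggested above (find every class in the client package and check for an interface mark) could look roughly like the following. This is a minimal sketch under stated assumptions: `AudienceMarker` is a hypothetical stand-in for the real audience annotation, the three sample classes are invented for illustration, and a real test would discover classes by scanning the package rather than listing them by hand.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an InterfaceAudience-style marker annotation.
@Retention(RetentionPolicy.RUNTIME)
@interface AudienceMarker {
  String value();
}

@AudienceMarker("Public")
class MarkedScanner {}

@AudienceMarker("Private")
class MarkedCallable {}

// Deliberately left unmarked so the check has something to report.
class UnmarkedRegistry {}

class AudienceCheck {
  // Returns the simple names of the classes that carry no audience marker.
  static List<String> findUnmarked(List<Class<?>> classes) {
    List<String> unmarked = new ArrayList<>();
    for (Class<?> c : classes) {
      if (!c.isAnnotationPresent(AudienceMarker.class)) {
        unmarked.add(c.getSimpleName());
      }
    }
    return unmarked;
  }

  public static void main(String[] args) {
    List<Class<?>> candidates =
        Arrays.asList(MarkedScanner.class, MarkedCallable.class, UnmarkedRegistry.class);
    System.out.println("Unmarked classes: " + findUnmarked(candidates));
  }
}
```

A unit test built on this idea would simply assert that the returned list is empty.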





[jira] [Updated] (HBASE-14996) Some more API cleanup for 2.0

2017-08-17 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-14996:
--
Fix Version/s: (was: 2.0.0)
   2.0.0-alpha-3

> Some more API cleanup for 2.0 
> --
>
> Key: HBASE-14996
> URL: https://issues.apache.org/jira/browse/HBASE-14996
> Project: HBase
>  Issue Type: Improvement
>Reporter: Enis Soztutar
>Priority: Blocker
> Fix For: 2.0.0-alpha-3
>
>
> Parent jira to keep track of some more API cleanup that did not happen in 1.x 
> timeframe. 





[jira] [Updated] (HBASE-15982) Interface ReplicationEndpoint extends Guava's Service

2017-08-17 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-15982:
--
Fix Version/s: (was: 2.0.0)
   2.0.0-alpha-3

> Interface ReplicationEndpoint extends Guava's Service
> -
>
> Key: HBASE-15982
> URL: https://issues.apache.org/jira/browse/HBASE-15982
> Project: HBase
>  Issue Type: Bug
>Reporter: Andrew Purtell
>Assignee: stack
>Priority: Blocker
> Fix For: 2.0.0-alpha-3
>
> Attachments: HBASE-15982.master.001.patch
>
>
> We have Guava's Service leaking into the LimitedPrivate interface 
> ReplicationEndpoint:
> {code}
> public interface ReplicationEndpoint extends Service, 
> ReplicationPeerConfigListener
> {code}
> This required a private patch when I updated Guava for our internal 
> deployments. This is going to be a problem for us for long term maintenance 
> and implementers of pluggable replication endpoints. LP is only less than 
> public by a degree. We shouldn't leak types from third-party code into either 
> Public or LP APIs in my opinion. Let's fix.
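
One way to keep a third-party type out of the interface is to define a project-owned lifecycle contract and delegate to the library internally. The sketch below is only an illustration: `EndpointLifecycle`, `ThirdPartyService`, and `DelegatingEndpoint` are hypothetical names, and `ThirdPartyService` merely stands in for a library class such as Guava's Service. It is not the fix that was committed for this issue.

```java
// Hypothetical, project-owned lifecycle contract. Nothing here comes from Guava.
interface EndpointLifecycle {
  void start();
  void stop();
  boolean isRunning();
}

// Hypothetical stand-in for a third-party service class.
class ThirdPartyService {
  private boolean running;
  void startAsync() { running = true; }
  void stopAsync() { running = false; }
  boolean running() { return running; }
}

// The endpoint exposes only the project-owned interface; the third-party
// type is a private implementation detail reached through delegation.
class DelegatingEndpoint implements EndpointLifecycle {
  private final ThirdPartyService delegate = new ThirdPartyService();

  @Override public void start() { delegate.startAsync(); }
  @Override public void stop() { delegate.stopAsync(); }
  @Override public boolean isRunning() { return delegate.running(); }
}

class LifecycleDemo {
  public static void main(String[] args) {
    EndpointLifecycle endpoint = new DelegatingEndpoint();
    endpoint.start();
    System.out.println("running after start: " + endpoint.isRunning());
    endpoint.stop();
    System.out.println("running after stop: " + endpoint.isRunning());
  }
}
```

With this shape, upgrading the library changes only the delegating class, not the signature that endpoint implementers compile against.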





[jira] [Updated] (HBASE-14998) Unify synchronous and asynchronous methods in Admin and cleanup

2017-08-17 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-14998:
--
Fix Version/s: (was: 2.0.0)
   2.0.0-alpha-3

> Unify synchronous and asynchronous methods in Admin and cleanup
> ---
>
> Key: HBASE-14998
> URL: https://issues.apache.org/jira/browse/HBASE-14998
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Enis Soztutar
>Priority: Blocker
> Fix For: 2.0.0-alpha-3
>
>
> Admin has a bunch of methods, some are async, some are sync. Needs some 
> unification in method naming, and method signatures. 
>  - We use modify and alter interchangeably. Pick one and stick with it 
> (modifyTable(), versus getAlterStatus()). Shell uses {{alter}}. 
>  - Remove getAlterStatus(), should not be needed. 
>  - remove already deprecated methods 
>  -  isTableAvailable(TableName tableName, byte[][] splitKeys) should be 
> removed. 
>  - Consistently use Async as a prefix for all async methods. 
>  - Other ideas? 
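
As a rough illustration of the unification described in the list above, the sketch below uses one verb (`modify`) and one consistent marker for the asynchronous variant, with the synchronous method defined in terms of the asynchronous one. `SketchAdmin` and `InMemoryAdmin` are hypothetical and are not the real Admin interface; placing the marker at the end of the method name is an assumption made for the sketch.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical, trimmed-down admin interface; not the real HBase Admin.
interface SketchAdmin {
  // Asynchronous form: returns immediately with a handle to the result.
  CompletableFuture<String> modifyTableAsync(String tableName);

  // Synchronous form: same verb, defined once in terms of the async form.
  default String modifyTable(String tableName) {
    return modifyTableAsync(tableName).join();
  }
}

class InMemoryAdmin implements SketchAdmin {
  @Override
  public CompletableFuture<String> modifyTableAsync(String tableName) {
    // Pretend the work happens on another thread.
    return CompletableFuture.supplyAsync(() -> "modified:" + tableName);
  }
}

class AdminNamingDemo {
  public static void main(String[] args) {
    SketchAdmin admin = new InMemoryAdmin();
    System.out.println(admin.modifyTable("t1"));
    System.out.println(admin.modifyTableAsync("t2").join());
  }
}
```

Because the caller holds the returned future, a separate status call in the style of getAlterStatus() is not needed in this sketch.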





[jira] [Updated] (HBASE-18622) Mitigate compatibility concerns between branch-1 and branch-2

2017-08-17 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18622:
--
Fix Version/s: 2.0.0-alpha-3

> Mitigate compatibility concerns between branch-1 and branch-2
> -
>
> Key: HBASE-18622
> URL: https://issues.apache.org/jira/browse/HBASE-18622
> Project: HBase
>  Issue Type: Bug
>  Components: API
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Fix For: 2.0.0-alpha-3
>
>
> This project is to do what [~apurtell] did in the issue "HBASE-18431 Mitigate 
> compatibility concerns between branch-1.3 and branch-1.4" only do it between 
> branch-1 and branch-2.





[jira] [Updated] (HBASE-16060) 1.x clients cannot access table state talking to 2.0 cluster

2017-08-17 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-16060:
--
Fix Version/s: (was: 2.0.0)
   2.0.0-alpha-3

> 1.x clients cannot access table state talking to 2.0 cluster
> 
>
> Key: HBASE-16060
> URL: https://issues.apache.org/jira/browse/HBASE-16060
> Project: HBase
>  Issue Type: Bug
>Reporter: Enis Soztutar
>Priority: Blocker
> Fix For: 2.0.0-alpha-3
>
>
> Since table state is migrated to meta instead of zk in 2.0, 1.x clients 
> talking to 2.0 cluster cannot access the table state. This causes some weird 
> behavior since from a client perspective, {{Admin.isTableEnabled()}} and 
> {{Admin.isTableDisabled()}} both return false. 
> One option we can do is to add code in 1.x clients so that they can access 
> the table state in meta if needed. Otherwise, we can mirror the table state 
> in zk (while keeping meta as the source of truth) during 2.x lifecycle so 
> that any 1.x client can still work correctly. 
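
The behavior and the mirroring option can be modeled in a few lines. This is a toy sketch, assuming two in-memory maps stand in for meta and ZooKeeper; `MirroringStateStore` and its methods are hypothetical and do not correspond to HBase classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model; none of these types are HBase classes.
enum SketchTableState { ENABLED, DISABLED }

class MirroringStateStore {
  // Source of truth in this sketch (stands in for meta).
  private final Map<String, SketchTableState> meta = new HashMap<>();
  // Mirror read by old clients (stands in for ZooKeeper).
  private final Map<String, SketchTableState> zkMirror = new HashMap<>();
  private final boolean mirrorEnabled;

  MirroringStateStore(boolean mirrorEnabled) {
    this.mirrorEnabled = mirrorEnabled;
  }

  void setState(String table, SketchTableState state) {
    meta.put(table, state);
    if (mirrorEnabled) {
      zkMirror.put(table, state);
    }
  }

  // What an old client sees: only the mirror, never meta.
  boolean oldClientIsEnabled(String table) {
    return zkMirror.get(table) == SketchTableState.ENABLED;
  }

  boolean oldClientIsDisabled(String table) {
    return zkMirror.get(table) == SketchTableState.DISABLED;
  }
}

class TableStateDemo {
  public static void main(String[] args) {
    MirroringStateStore noMirror = new MirroringStateStore(false);
    noMirror.setState("t", SketchTableState.ENABLED);
    // Without a mirror the old client finds no state, so both checks are false.
    System.out.println(noMirror.oldClientIsEnabled("t") + " " + noMirror.oldClientIsDisabled("t"));

    MirroringStateStore mirrored = new MirroringStateStore(true);
    mirrored.setState("t", SketchTableState.ENABLED);
    System.out.println(mirrored.oldClientIsEnabled("t") + " " + mirrored.oldClientIsDisabled("t"));
  }
}
```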





[jira] [Updated] (HBASE-16550) Procedure v2 - Add AM compatibility for 2.x Master and 1.x RSs; i.e. support Rolling Upgrade from hbase-1 to -2.

2017-08-17 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-16550:
--
Fix Version/s: (was: 2.0.0)
   2.0.0-alpha-3

> Procedure v2 - Add AM compatibility for 2.x Master and 1.x RSs; i.e. support 
> Rolling Upgrade from hbase-1 to -2.
> 
>
> Key: HBASE-16550
> URL: https://issues.apache.org/jira/browse/HBASE-16550
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2, Region Assignment
>Affects Versions: 2.0.0
>Reporter: Matteo Bertozzi
>Assignee: Matteo Bertozzi
>Priority: Blocker
> Fix For: 2.0.0-alpha-3
>
>
> Core AM HBASE-14614 relies on the RS to be using zkless assignment. Add 
> support for the old, plain non-zkless AM.





[jira] [Updated] (HBASE-17442) Move most of the replication related classes to hbase-server package

2017-08-17 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-17442:
--
Fix Version/s: (was: 2.0.0)
   2.0.0-alpha-3

> Move most of the replication related classes to hbase-server package
> 
>
> Key: HBASE-17442
> URL: https://issues.apache.org/jira/browse/HBASE-17442
> Project: HBase
>  Issue Type: Sub-task
>  Components: build, Replication
>Affects Versions: 2.0.0
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Critical
> Fix For: 2.0.0-alpha-3
>
> Attachments: 0001-hbase-replication-module.patch, 
> HBASE-17442.v1.patch, HBASE-17442.v2.patch, HBASE-17442.v2.patch, 
> HBASE-17442.v3.patch
>
>
> After the replication requests are routed through master, replication 
> implementation details no longer need to be exposed to the client. We should move most 
> of the replication related classes to hbase-server package.





[jira] [Updated] (HBASE-18106) Redo ProcedureInfo and LockInfo

2017-08-17 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18106:
--
Fix Version/s: (was: 2.0.0)
   2.0.0-alpha-3

> Redo ProcedureInfo and LockInfo
> ---
>
> Key: HBASE-18106
> URL: https://issues.apache.org/jira/browse/HBASE-18106
> Project: HBase
>  Issue Type: Sub-task
>  Components: proc-v2
>Affects Versions: 2.0.0
>Reporter: stack
>Priority: Critical
> Fix For: 3.0.0, 2.0.0-alpha-3
>
> Attachments: HBASE-18106.master.001.patch, 
> HBASE-18106.master.002.patch, HBASE-18106.master.003.patch
>
>
> ProcedureInfo was introduced as a lowest-common-denominator POJO that could 
> be used as a facade on PB Procedures. It was good for showing state of 
> Procedure framework in shell and UI.
> It's a bit weird though. It's up in hbase-common rather than in Procedure and 
> it can only ever show a subset of the Procedure info.
> I was thinking we could use the pb3.1 pb->JSON utility instead and emit a 
> JSON String wherever we need to export a view on procedure internals.
> This issue is about exploring this possibility. Would depend on our having an 
> upgraded guava (so probably depends on the 'pre-build' project).
> From "ProcedureInfo and LockInfo need fixing" in 
> https://docs.google.com/document/d/1eVKa7FHdeoJ1-9o8yZcOTAQbv0u0bblBlCCzVSIn69g/edit#heading=h.kid1jzo114xw





[jira] [Updated] (HBASE-9417) SecureBulkLoadEndpoint should be folded in core

2017-08-17 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-9417:
-
Fix Version/s: (was: 2.0.0)
   2.0.0-alpha-3

> SecureBulkLoadEndpoint should be folded in core
> ---
>
> Key: HBASE-9417
> URL: https://issues.apache.org/jira/browse/HBASE-9417
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver, security
>Reporter: Enis Soztutar
>Priority: Critical
> Fix For: 2.0.0-alpha-3
>
>
> In unsecure bulk loading, the client creates the files to be bulk loaded, and 
> asks the regionservers to do the operation. Bulk loading is performed by a 
> move, which would mean that the hbase user has to have WRITE permissions for 
> the bulk loaded files. If the client who has generated the files is different 
> than the hbase user, this creates an access denied exception if complete bulk 
> load is not run as the hbase user.
> I think even for unsecure mode, we should mimic what SecureBulkLoadEndpoint 
> does, where hbase creates a staging directory and the client hands off the 
> files to that directory with global perms. 
> Update: Now that HBASE-12052 enables running SecureBulkLoadEndpoint even in 
> unsecure deployments, we should consider bringing SecureBulkLoad into core 
> HBase (meaning implement the functionality in RegionServer instead of in the 
> coprocessor). 





[jira] [Updated] (HBASE-18596) A hbase1 cluster should be able to replicate to a hbase2 cluster; verify

2017-08-17 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18596:
--
Fix Version/s: (was: 2.0.0)
   2.0.0-alpha-3

> A hbase1 cluster should be able to replicate to a hbase2 cluster; verify
> 
>
> Key: HBASE-18596
> URL: https://issues.apache.org/jira/browse/HBASE-18596
> Project: HBase
>  Issue Type: Task
>Reporter: stack
>Assignee: Esteban Gutierrez
>Priority: Blocker
> Fix For: 2.0.0-alpha-3
>
>
> From the mailing list thread "[DISCUSS] hbase-2.0.0 compatibility 
> expectations", [~esteban] asks:
> bq. Should we add additional details around replication as well? for 
> instance, shall we consider a hbase-1.x cluster as a client for a hbase-2.x 
> cluster?
> The latter should be a blocker. Verify it works.





[jira] [Updated] (HBASE-17143) Scan improvement

2017-08-17 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-17143:
--
Fix Version/s: (was: 2.0.0)
   2.0.0-alpha-3

> Scan improvement
> 
>
> Key: HBASE-17143
> URL: https://issues.apache.org/jira/browse/HBASE-17143
> Project: HBase
>  Issue Type: Umbrella
>  Components: Client, scan
>Affects Versions: 2.0.0, 1.4.0
>Reporter: Duo Zhang
>Priority: Blocker
> Fix For: 2.0.0-alpha-3
>
>
> Parent issue to track some improvements of the current scan.
> Timeout per scan, unify batch and allowPartial, add inclusive and exclusive 
> of startKey and endKey, start scan from the middle of a record, use mvcc to 
> keep row atomic when allowPartial, etc.





[jira] [Updated] (HBASE-10944) Remove all kv.getBuffer() and kv.getRow() references existing in the code

2017-08-17 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-10944:
--
Fix Version/s: (was: 2.0.0)
   2.0.0-alpha-3

> Remove all kv.getBuffer() and kv.getRow() references existing in the code
> -
>
> Key: HBASE-10944
> URL: https://issues.apache.org/jira/browse/HBASE-10944
> Project: HBase
>  Issue Type: Sub-task
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 1.5.0, 2.0.0-alpha-3
>
>
> kv.getRow() and kv.getBuffer() are still used in places to form key byte[] 
> and row byte[].  Removing all such instances including testcases will make 
> the usage of Cell complete.
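
For illustration, a row can be copied out of a cell using only array, offset, and length accessors, without assuming one contiguous KeyValue buffer. In the sketch below, `RowView`, `BackedCell`, and `RowCopy` are hypothetical names; `RowView` only mirrors the accessor style of the Cell interface and is not the HBase type.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical, minimal view of a cell's row using array/offset/length accessors.
interface RowView {
  byte[] getRowArray();
  int getRowOffset();
  short getRowLength();
}

// A cell whose row lives somewhere inside a larger shared backing array.
class BackedCell implements RowView {
  private final byte[] backing;
  private final int rowOffset;
  private final short rowLength;

  BackedCell(byte[] backing, int rowOffset, short rowLength) {
    this.backing = backing;
    this.rowOffset = rowOffset;
    this.rowLength = rowLength;
  }

  @Override public byte[] getRowArray() { return backing; }
  @Override public int getRowOffset() { return rowOffset; }
  @Override public short getRowLength() { return rowLength; }
}

class RowCopy {
  // Copies only the row bytes out of the shared backing array.
  static byte[] cloneRow(RowView cell) {
    int from = cell.getRowOffset();
    return Arrays.copyOfRange(cell.getRowArray(), from, from + cell.getRowLength());
  }

  public static void main(String[] args) {
    byte[] backing = "xxrow1yy".getBytes(StandardCharsets.UTF_8);
    byte[] row = cloneRow(new BackedCell(backing, 2, (short) 4));
    System.out.println(new String(row, StandardCharsets.UTF_8));
  }
}
```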





[jira] [Updated] (HBASE-18169) Coprocessor fix and cleanup before 2.0.0 release

2017-08-17 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18169:
--
Fix Version/s: (was: 2.0.0)
   2.0.0-alpha-3

> Coprocessor fix and cleanup before 2.0.0 release
> 
>
> Key: HBASE-18169
> URL: https://issues.apache.org/jira/browse/HBASE-18169
> Project: HBase
>  Issue Type: Improvement
>  Components: Coprocessors
>Affects Versions: 2.0.0-alpha-1
>Reporter: Duo Zhang
>Priority: Blocker
> Fix For: 2.0.0-alpha-3
>
>
> As discussed in HBASE-18038. In RegionServerServices, Region and StoreFile 
> interfaces we expose too many unnecessary methods. We need to find a way to 
> not expose these methods to CP.





[jira] [Updated] (HBASE-16769) Deprecate/remove PB references from MasterObserver and RegionServerObserver

2017-08-17 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-16769:
--
Fix Version/s: (was: 2.0.0)
   2.0.0-alpha-3

> Deprecate/remove PB references from MasterObserver and RegionServerObserver
> ---
>
> Key: HBASE-16769
> URL: https://issues.apache.org/jira/browse/HBASE-16769
> Project: HBase
>  Issue Type: Bug
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
>Priority: Blocker
> Fix For: 2.0.0-alpha-3
>
>
> This is effectively a sub-task for HBASE-15174.
> CP Methods
> MasterObserver
>   preListSnapshot
>   postListSnapshot
>   preSnapshot
>   postSnapshot
>   preCloneSnapshot
>   postCloneSnapshot
>   preRestoreSnapshot
>   postRestoreSnapshot
>   preDeleteSnapshot
>   postDeleteSnapshot
>   
>   preSetUserQuota
>   postSetUserQuota
>   preSetUserQuota
>   postSetUserQuota
>   preSetUserQuota
>   postSetUserQuota
>   preSetTableQuota
>   postSetTableQuota
>   preSetNamespaceQuota
>   postSetNamespaceQuota
>   
> RegionServerObserver
>   preReplicateLogEntries
>   postReplicateLogEntries





[jira] [Updated] (HBASE-18298) RegionServerServices Interface cleanup for CP expose

2017-08-17 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18298:
--
Fix Version/s: (was: 2.0.0)
   2.0.0-alpha-3

> RegionServerServices Interface cleanup for CP expose
> 
>
> Key: HBASE-18298
> URL: https://issues.apache.org/jira/browse/HBASE-18298
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
>Priority: Critical
> Fix For: 2.0.0-alpha-3
>
> Attachments: HBASE-18298.patch, HBASE-18298_V2.patch, 
> HBASE-18298_V3.patch
>
>






[jira] [Updated] (HBASE-15284) Make TimeRange constructors IA.Private and remove unused TimeRange constructors

2017-08-17 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-15284:
--
Fix Version/s: (was: 2.0.0)
   2.0.0-alpha-3

> Make TimeRange constructors IA.Private and remove unused TimeRange 
> constructors
> ---
>
> Key: HBASE-15284
> URL: https://issues.apache.org/jira/browse/HBASE-15284
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
> Fix For: 2.0.0-alpha-3
>
> Attachments: hbase-15284.patch, hbase-15284.v2.patch, 
> hbase-15284.v3.patch
>
>






[jira] [Updated] (HBASE-14997) Move compareOp and Comparators out of filter to client package

2017-08-17 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-14997:
--
Fix Version/s: (was: 2.0.0)
   2.0.0-alpha-3

> Move compareOp and Comparators out of filter to client package
> --
>
> Key: HBASE-14997
> URL: https://issues.apache.org/jira/browse/HBASE-14997
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Enis Soztutar
>Priority: Critical
> Fix For: 2.0.0-alpha-3
>
>
> {{Table.checkAndPut()}} and its cousins depend on CompareOp from the filter 
> package. Originally, CompareOp and ByteArrayComparable, and various 
> "comparators" have been used in filters, so these are in the filter package. 
> However, for checkAndPut(), etc we depend on the filter subpackage although 
> these are not filter related operations. We can use some clean up.
> {code}
>   boolean checkAndPut(byte[] row, byte[] family, byte[] qualifier,
> CompareFilter.CompareOp compareOp, byte[] value, Put put) throws 
> IOException;
> {code}
> Some ideas
>  - Cleanup ByteArrayComparable interface (see the TODO at the class) 
>  - Maybe introduce a {{Condition}} or a similar concept and do 
> {{checkAndPut(Condition condition, Put put)}} and change filters to use that 
> as well. 
>  - Introducing Condition like thing will allow us to have an interface like: 
> {{checkAndMutate(List<Condition> conditions, List<Mutation> mutations)}}. 
>  - BinaryComparator, etc are not "Comparators", they are comparables. 
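
The `Condition` idea from the list above might be sketched as follows. Everything here is hypothetical: the `Condition` fields, the `Op` enum, and `SketchTable` are invented for illustration and use strings for the coordinates to keep the example short. The treatment of a null expected value as "the cell is absent" is also an assumption of the sketch.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical value object bundling the coordinates, operator, and expected value.
final class Condition {
  enum Op { EQUAL, NOT_EQUAL }

  private final String row;
  private final String family;
  private final String qualifier;
  private final Op op;
  private final byte[] expected; // null means "the cell is absent" in this sketch

  Condition(String row, String family, String qualifier, Op op, byte[] expected) {
    this.row = row;
    this.family = family;
    this.qualifier = qualifier;
    this.op = op;
    this.expected = expected;
  }

  String cellKey() {
    return row + "/" + family + "/" + qualifier;
  }

  // Evaluates the condition against the currently stored value.
  boolean matches(byte[] actual) {
    boolean equal = Arrays.equals(expected, actual);
    return op == Op.EQUAL ? equal : !equal;
  }
}

// Hypothetical in-memory table; not the HBase Table interface.
class SketchTable {
  private final Map<String, byte[]> cells = new HashMap<>();

  // Applies the write only when the condition holds for the current value.
  boolean checkAndPut(Condition condition, byte[] newValue) {
    byte[] current = cells.get(condition.cellKey());
    if (!condition.matches(current)) {
      return false;
    }
    cells.put(condition.cellKey(), newValue);
    return true;
  }

  byte[] get(String row, String family, String qualifier) {
    return cells.get(row + "/" + family + "/" + qualifier);
  }
}

class ConditionDemo {
  public static void main(String[] args) {
    SketchTable table = new SketchTable();
    Condition absent = new Condition("r", "f", "q", Condition.Op.EQUAL, null);
    System.out.println(table.checkAndPut(absent, new byte[] {1})); // cell was absent
    System.out.println(table.checkAndPut(absent, new byte[] {2})); // cell now exists
  }
}
```

Since the condition is a plain object, the same type could be reused by filters or passed as a list, which is the point made in the description.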





[jira] [Updated] (HBASE-13346) Clean up Filter package for post 1.0

2017-08-17 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-13346:
--
Fix Version/s: (was: 2.0.0)
   2.0.0-alpha-3

> Clean up Filter package for post 1.0
> 
>
> Key: HBASE-13346
> URL: https://issues.apache.org/jira/browse/HBASE-13346
> Project: HBase
>  Issue Type: Bug
>  Components: API, Filters
>Affects Versions: 2.0.0
>Reporter: Lars George
>Priority: Critical
> Fix For: 2.0.0-alpha-3
>
>
> Since we have a bit of a messy Filter API with KeyValue vs Cell reference 
> mixed up all over the place, I recommend cleaning this up once and for all. 
> There should be no {{KeyValue}} (or {{kv}}, {{kvs}} etc.) in any method or 
> parameter name.
> This includes deprecating and renaming filters too, for example 
> {{FirstKeyOnlyFilter}}, which really should be named {{FirstKeyValueFilter}} 
> as it does _not_ just return the key, but the entire cell. It should be 
> deprecated and renamed to {{FirstCellFilter}} (or {{FirstColumnFilter}} if 
> you prefer).
> In general we should clarify and settle on {{KeyValue}} vs {{Cell}} vs 
> {{Column}} in our naming. The latter two are the only ones going forward with 
> the public API, and are used synonymously. We should carefully check which is 
> better suited (is it really a specific cell, or the newest cell, aka the 
> newest column value) and settle on a naming schema.





[jira] [Updated] (HBASE-14255) Simplify Cell creation post 1.0

2017-08-17 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-14255:
--
Fix Version/s: (was: 2.0.0)
   2.0.0-alpha-3

> Simplify Cell creation post 1.0
> ---
>
> Key: HBASE-14255
> URL: https://issues.apache.org/jira/browse/HBASE-14255
> Project: HBase
>  Issue Type: Improvement
>  Components: Client
>Affects Versions: 1.0.0, 2.0.0
>Reporter: Lars George
>Priority: Critical
> Fix For: 2.0.0-alpha-3
>
>
> After the switch to the new Cell based client API, and making KeyValue 
> private (but especially as soon as DBB backed Cells land) it is rather 
> difficult to create a {{Cell}} instance. I am using this now:
> {code}
>   @Override
>   public void postGetOp(ObserverContext<RegionCoprocessorEnvironment> e,
>       Get get, List<Cell> results) throws IOException {
>     // Build a throwaway Put only to obtain a Cell instance from it.
>     Put put = new Put(get.getRow());
>     put.addColumn(get.getRow(), FIXED_COLUMN, Bytes.toBytes(counter.get()));
>     CellScanner scanner = put.cellScanner();
>     scanner.advance();
>     Cell cell = scanner.current();
>     LOG.debug("Adding fake cell: " + cell);
>     results.add(cell);
>   }
> {code}
> That is, I have to create a {{Put}} instance to add a cell and then retrieve 
> its instance. The {{KeyValue}} methods are private now and should not be 
> used. Create a CellBuilder helper?
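
A builder-style helper of the kind asked for here could look roughly like this. It is a sketch only: `SketchCell` and `SketchCellBuilder` are hypothetical types that model a cell as four byte arrays, and they make no claim about what a real CellBuilder in HBase would expose.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical immutable cell value; not the HBase Cell interface.
final class SketchCell {
  private final byte[] row;
  private final byte[] family;
  private final byte[] qualifier;
  private final byte[] value;

  SketchCell(byte[] row, byte[] family, byte[] qualifier, byte[] value) {
    this.row = row;
    this.family = family;
    this.qualifier = qualifier;
    this.value = value;
  }

  byte[] row() { return row.clone(); }
  byte[] family() { return family.clone(); }
  byte[] qualifier() { return qualifier.clone(); }
  byte[] value() { return value.clone(); }
}

// Hypothetical fluent builder: the caller never constructs a Put just to get a cell.
final class SketchCellBuilder {
  private byte[] row;
  private byte[] family;
  private byte[] qualifier;
  private byte[] value = new byte[0];

  SketchCellBuilder row(byte[] row) { this.row = row.clone(); return this; }
  SketchCellBuilder family(byte[] family) { this.family = family.clone(); return this; }
  SketchCellBuilder qualifier(byte[] qualifier) { this.qualifier = qualifier.clone(); return this; }
  SketchCellBuilder value(byte[] value) { this.value = value.clone(); return this; }

  SketchCell build() {
    if (row == null || family == null || qualifier == null) {
      throw new IllegalStateException("row, family and qualifier are required");
    }
    return new SketchCell(row, family, qualifier, value);
  }
}

class CellBuilderDemo {
  public static void main(String[] args) {
    SketchCell cell = new SketchCellBuilder()
        .row("r1".getBytes(StandardCharsets.UTF_8))
        .family("f".getBytes(StandardCharsets.UTF_8))
        .qualifier("q".getBytes(StandardCharsets.UTF_8))
        .value("v".getBytes(StandardCharsets.UTF_8))
        .build();
    System.out.println(new String(cell.row(), StandardCharsets.UTF_8));
  }
}
```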





[jira] [Updated] (HBASE-13271) Table#puts(List) operation is indeterminate; needs fixing

2017-08-17 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-13271:
--
Fix Version/s: (was: 2.0.0)
   2.0.0-alpha-3

> Table#puts(List) operation is indeterminate; needs fixing
> --
>
> Key: HBASE-13271
> URL: https://issues.apache.org/jira/browse/HBASE-13271
> Project: HBase
>  Issue Type: Improvement
>  Components: API
>Affects Versions: 1.0.0
>Reporter: stack
>Priority: Critical
> Fix For: 2.0.0-alpha-3
>
>
> Another API issue found by [~larsgeorge]:
> "Table.put(List)"
> {code}
> [Mar-17 9:21 AM] Lars George: Table.put(List) is weird since you cannot 
> flush partial lists
> [Mar-17 9:21 AM] Lars George: Say out of 5 the third is broken, then the 
> put() call returns with a local exception (say empty Put) and then you have 2 
> that are in the buffer
> [Mar-17 9:21 AM] Lars George: but how to you force commit them?
> [Mar-17 9:22 AM] Lars George: In the past you would call flushCache(), but 
> that is "gone" now
> [Mar-17 9:22 AM] Lars George: and flush() is not available on a Table
> [Mar-17 9:22 AM] Lars George: And you cannot access the underlying 
> BufferedMutator either
> [Mar-17 9:23 AM] Lars George: You can *only* add more Puts if you can, or 
> call close()
> [Mar-17 9:23 AM] Lars George: that is just weird to explain
> {code}
> So, Table needs to get flush back or we deprecate this method or it flushes 
> immediately and does not return until complete in the implementation.
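
The scenario in the chat log, and the first of the options above (giving Table an explicit flush), can be modeled with a small in-memory buffer. This is a toy sketch: `SketchBufferedTable` is hypothetical, puts are represented as strings, and an empty string plays the role of the broken third Put.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical buffered table; puts are modeled as plain strings.
class SketchBufferedTable {
  private final List<String> buffer = new ArrayList<>();
  private final List<String> committed = new ArrayList<>();

  // Validates and buffers puts one by one; stops at the first invalid entry,
  // leaving the earlier entries sitting in the buffer.
  void put(List<String> puts) {
    for (String p : puts) {
      if (p.isEmpty()) {
        throw new IllegalArgumentException("empty put");
      }
      buffer.add(p);
    }
  }

  // Explicit flush: commits whatever is buffered and reports how many entries moved.
  int flush() {
    int flushed = buffer.size();
    committed.addAll(buffer);
    buffer.clear();
    return flushed;
  }

  List<String> committed() {
    return committed;
  }
}

class BufferedPutDemo {
  public static void main(String[] args) {
    SketchBufferedTable table = new SketchBufferedTable();
    try {
      table.put(Arrays.asList("a", "b", "", "d", "e")); // the third entry is broken
    } catch (IllegalArgumentException ex) {
      System.out.println("put failed: " + ex.getMessage());
    }
    System.out.println("flushed " + table.flush() + " buffered puts");
    System.out.println("committed: " + table.committed());
  }
}
```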





[jira] [Updated] (HBASE-13740) Stop using Hadoop private interfaces

2017-08-17 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-13740:
--
Fix Version/s: (was: 2.0.0)
   2.0.0-alpha-3

> Stop using Hadoop private interfaces
> 
>
> Key: HBASE-13740
> URL: https://issues.apache.org/jira/browse/HBASE-13740
> Project: HBase
>  Issue Type: Umbrella
>Affects Versions: 2.0.0
>Reporter: Sean Busbey
>Priority: Blocker
> Fix For: 2.0.0-alpha-3
>
>
> Now that we are pushing downstream folks to stay off of our private interfaces, 
> we should provide a good example by doing the same with Hadoop.
> Things to do in this umbrella
> * We need a good way to check; manual inspection is untenable
> * For anything where Hadoop isn't maintaining an isolated API (i.e. they 
> include a non-org.apache.hadoop or jvm class), we should just rip things out
> * For the rest we'll need to determine if we ask for upgrading things to 
> LimitedPrivate(HBase) or Public 





[jira] [Updated] (HBASE-18601) Remove Htrace 3.2

2017-08-17 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18601:
--
Fix Version/s: (was: 2.0.0)
   2.0.0-alpha-3

> Remove Htrace 3.2
> -
>
> Key: HBASE-18601
> URL: https://issues.apache.org/jira/browse/HBASE-18601
> Project: HBase
>  Issue Type: Task
>Affects Versions: 2.0.0, 3.0.0
>Reporter: Tamas Penzes
>Assignee: Andrew Purtell
> Fix For: 2.0.0-alpha-3
>
> Attachments: HBASE-18601.master.001.patch
>
>
> HTrace is not perfectly integrated into HBase, the version 3.2.0 is buggy, 
> the upgrade to 4.x is not trivial and would take time. It might not be worth 
> keeping it in this state, so it would be better to remove it.
> Of course it doesn't mean tracing would be useless, just that in this form 
> the use of HTrace 3.2 might not add any value to the project and fixing it 
> would be far too much effort.





[jira] [Updated] (HBASE-18577) shaded client includes several non-relocated third party dependencies

2017-08-17 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18577:
--
Fix Version/s: (was: 2.0.0-alpha-2)
   2.0.0-alpha-3

> shaded client includes several non-relocated third party dependencies
> -
>
> Key: HBASE-18577
> URL: https://issues.apache.org/jira/browse/HBASE-18577
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 1.2.0, 1.1.2, 1.3.0, 2.0.0-alpha-1
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Critical
> Fix For: 3.0.0, 1.4.0, 1.3.2, 1.5.0, 1.2.7, 2.0.0-alpha-3, 1.1.13
>
> Attachments: HBASE-18577.WIP.0.patch, HBASE-18577.WIP.-1.patch, 
> HBASE-18577.WIP.1.patch
>
>
> we have some unexpected unrelocated third party dependencies in our shaded 
> artifacts.





[jira] [Commented] (HBASE-18577) shaded client includes several non-relocated third party dependencies

2017-08-17 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16131204#comment-16131204
 ] 

Mike Drob commented on HBASE-18577:
---

Retargeting to alpha-3 in anticipation that the current alpha-2 vote passes. If 
the vote fails and this makes it into the next RC, please update the jira 
appropriately.

> shaded client includes several non-relocated third party dependencies
> -
>
> Key: HBASE-18577
> URL: https://issues.apache.org/jira/browse/HBASE-18577
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 1.2.0, 1.1.2, 1.3.0, 2.0.0-alpha-1
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Critical
> Fix For: 3.0.0, 1.4.0, 1.3.2, 1.5.0, 1.2.7, 2.0.0-alpha-3, 1.1.13
>
> Attachments: HBASE-18577.WIP.0.patch, HBASE-18577.WIP.-1.patch, 
> HBASE-18577.WIP.1.patch
>
>
> we have some unexpected unrelocated third party dependencies in our shaded 
> artifacts.





[jira] [Commented] (HBASE-18503) Change ***Util and Master to use TableDescriptor and ColumnFamilyDescriptor

2017-08-17 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16131209#comment-16131209
 ] 

Mike Drob commented on HBASE-18503:
---

don't take this as blocking commit, btw. we can continue to discuss with or 
without the patch, since that code is already in there and not new.

> Change ***Util and Master to use TableDescriptor and ColumnFamilyDescriptor
> ---
>
> Key: HBASE-18503
> URL: https://issues.apache.org/jira/browse/HBASE-18503
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
> Fix For: 2.0.0
>
> Attachments: HBASE-18503.v0.patch, HBASE-18503.v1.patch, 
> HBASE-18503.v1.patch, HBASE-18503.v2.patch, HBASE-18503.v2.patch, 
> HBASE-18503.v2.patch
>
>
> These helper classes accept the HTD and HCD as argument. We need to make some 
> changes for them, otherwise we will be forced to use HTD and HCD.
> # SecureTestUtil
> # MobSnapshotTestingUtils
> # HMaster



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18503) Change ***Util and Master to use TableDescriptor and ColumnFamilyDescriptor

2017-08-17 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16131207#comment-16131207
 ] 

Mike Drob commented on HBASE-18503:
---

assertEquals() as in from junit? I was pretty sure that always called the 
object form, no?

> Change ***Util and Master to use TableDescriptor and ColumnFamilyDescriptor
> ---
>
> Key: HBASE-18503
> URL: https://issues.apache.org/jira/browse/HBASE-18503
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
> Fix For: 2.0.0
>
> Attachments: HBASE-18503.v0.patch, HBASE-18503.v1.patch, 
> HBASE-18503.v1.patch, HBASE-18503.v2.patch, HBASE-18503.v2.patch, 
> HBASE-18503.v2.patch
>
>
> These helper classes accept the HTD and HCD as argument. We need to make some 
> changes for them, otherwise we will be forced to use HTD and HCD.
> # SecureTestUtil
> # MobSnapshotTestingUtils
> # HMaster





[jira] [Commented] (HBASE-16196) Update jruby to a newer version.

2017-10-08 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16196429#comment-16196429
 ] 

Mike Drob commented on HBASE-16196:
---

I don't think it's worth a revert for that regression - can you file a new 
issue for it [~chia7712]? I'll take a look tomorrow and also ping JRuby folks 
on IRC about it.

Interesting that the issue has been around since at least JRuby 1.7.9 (we were 
on 1.6 before).

> Update jruby to a newer version.
> 
>
> Key: HBASE-16196
> URL: https://issues.apache.org/jira/browse/HBASE-16196
> Project: HBase
> Issue Type: Improvement
> Components: dependencies, shell
> Reporter: Elliott Clark
> Assignee: Mike Drob
> Priority: Critical
> Fix For: 2.0.0
>
> Attachments: 0001-Update-to-JRuby-9.1.2.0-and-JLine-2.12.patch, 
> HBASE-16196-branch-1.v9.patch, HBASE-16196.v5.patch, HBASE-16196.v6.patch, 
> HBASE-16196.v7.patch, HBASE-16196.v8.patch, HBASE-16196.v9.patch, 
> hbase-16196.branch-1.patch, hbase-16196.v2.branch-1.patch, 
> hbase-16196.v3.branch-1.patch, hbase-16196.v4.branch-1.patch
>
>
> Ruby 1.8.7 is no longer maintained.
> The TTY library in the old jruby is bad. The newer one is less bad.
> Since this is only a dependency on the hbase-shell module and not on 
> hbase-client or hbase-server this should be a pretty simple thing that 
> doesn't have any backwards compat issues.





[jira] [Updated] (HBASE-18867) maven enforcer plugin needs update to work with jdk9

2017-10-09 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18867:
--
Attachment: HBASE-18867.branch-1.3.patch

Attaching a branch-1.3 patch as well, since that one _also_ doesn't cherry-pick 
cleanly from branch-1/branch-2. 1.3 patch is good all the way back to 1.1 
though.

Will wait for a review on both flavors of branch-1 patches before pushing 
anything else.

> maven enforcer plugin needs update to work with jdk9
> 
>
> Key: HBASE-18867
> URL: https://issues.apache.org/jira/browse/HBASE-18867
> Project: HBase
> Issue Type: Sub-task
> Components: build
> Affects Versions: 2.0.0-alpha-3
> Reporter: Sean Busbey
> Assignee: Sean Busbey
> Priority: Blocker
> Fix For: 2.0.0, 1.4.0, 1.5.0
>
> Attachments: HBASE-18867.0.patch, HBASE-18867.branch-1.3.patch, 
> HBASE-18867.branch-1.patch
>
>
> build fails under jdk9, even when targeting java 8
> {code}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-enforcer-plugin:1.4:enforce 
> (min-maven-min-java-banned-xerces) on project hbase: Execution 
> min-maven-min-java-banned-xerces of goal 
> org.apache.maven.plugins:maven-enforcer-plugin:1.4:enforce failed: An API 
> incompatibility was encountered while executing 
> org.apache.maven.plugins:maven-enforcer-plugin:1.4:enforce: 
> java.lang.ExceptionInInitializerError: null
> [ERROR] -
> [ERROR] realm =plugin>org.apache.maven.plugins:maven-enforcer-plugin:1.4
> [ERROR] strategy = org.codehaus.plexus.classworlds.strategy.SelfFirstStrategy
> [ERROR] urls[0] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/plugins/maven-enforcer-plugin/1.4/maven-enforcer-plugin-1.4.jar
> [ERROR] urls[1] = 
> file:/Users/busbey/.m2/repository/org/codehaus/mojo/extra-enforcer-rules/1.0-beta-6/extra-enforcer-rules-1.0-beta-6.jar
> [ERROR] urls[2] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/shared/maven-dependency-tree/2.2/maven-dependency-tree-2.2.jar
> [ERROR] urls[3] = 
> file:/Users/busbey/.m2/repository/org/codehaus/plexus/plexus-component-annotations/1.5.5/plexus-component-annotations-1.5.5.jar
> [ERROR] urls[4] = 
> file:/Users/busbey/.m2/repository/org/eclipse/aether/aether-util/0.9.0.M2/aether-util-0.9.0.M2.jar
> [ERROR] urls[5] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/shared/maven-common-artifact-filters/1.4/maven-common-artifact-filters-1.4.jar
> [ERROR] urls[6] = 
> file:/Users/busbey/.m2/repository/com/ibm/icu/icu4j/56.1/icu4j-56.1.jar
> [ERROR] urls[7] = 
> file:/Users/busbey/.m2/repository/backport-util-concurrent/backport-util-concurrent/3.1/backport-util-concurrent-3.1.jar
> [ERROR] urls[8] = 
> file:/Users/busbey/.m2/repository/org/codehaus/plexus/plexus-interpolation/1.11/plexus-interpolation-1.11.jar
> [ERROR] urls[9] = 
> file:/Users/busbey/.m2/repository/org/slf4j/slf4j-jdk14/1.5.6/slf4j-jdk14-1.5.6.jar
> [ERROR] urls[10] = 
> file:/Users/busbey/.m2/repository/org/slf4j/jcl-over-slf4j/1.5.6/jcl-over-slf4j-1.5.6.jar
> [ERROR] urls[11] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/reporting/maven-reporting-api/2.2.1/maven-reporting-api-2.2.1.jar
> [ERROR] urls[12] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/doxia/doxia-sink-api/1.1/doxia-sink-api-1.1.jar
> [ERROR] urls[13] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/doxia/doxia-logging-api/1.1/doxia-logging-api-1.1.jar
> [ERROR] urls[14] = 
> file:/Users/busbey/.m2/repository/commons-cli/commons-cli/1.2/commons-cli-1.2.jar
> [ERROR] urls[15] = 
> file:/Users/busbey/.m2/repository/org/codehaus/plexus/plexus-interactivity-api/1.0-alpha-4/plexus-interactivity-api-1.0-alpha-4.jar
> [ERROR] urls[16] = 
> file:/Users/busbey/.m2/repository/org/sonatype/plexus/plexus-sec-dispatcher/1.3/plexus-sec-dispatcher-1.3.jar
> [ERROR] urls[17] = 
> file:/Users/busbey/.m2/repository/org/sonatype/plexus/plexus-cipher/1.4/plexus-cipher-1.4.jar
> [ERROR] urls[18] = 
> file:/Users/busbey/.m2/repository/org/codehaus/plexus/plexus-utils/3.0.20/plexus-utils-3.0.20.jar
> [ERROR] urls[19] = 
> file:/Users/busbey/.m2/repository/commons-lang/commons-lang/2.3/commons-lang-2.3.jar
> [ERROR] urls[20] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/enforcer/enforcer-api/1.4/enforcer-api-1.4.jar
> [ERROR] urls[21] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/enforcer/enforcer-rules/1.4/enforcer-rules-1.4.jar
> [ERROR] urls[22] = 
> file:/Users/busbey/.m2/repository/org/beanshell/bsh/2.0b4/bsh-2.0b4.jar
> [ERROR] urls[23] = 
> file:/Users/busbey/.m2/repository/org/codehaus/plexus/plexus-i18n/1.0-beta-6/plexus-i18n-1.0-beta-6.jar
> [ERROR] urls[24] = 
> 

[jira] [Updated] (HBASE-18867) maven enforcer plugin needs update to work with jdk9

2017-10-09 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18867:
--
Fix Version/s: 1.1.13
   1.2.7
   1.3.2

> maven enforcer plugin needs update to work with jdk9
> 
>
> Key: HBASE-18867
> URL: https://issues.apache.org/jira/browse/HBASE-18867
> Project: HBase
> Issue Type: Sub-task
> Components: build
> Affects Versions: 2.0.0-alpha-3
> Reporter: Sean Busbey
> Assignee: Sean Busbey
> Priority: Blocker
> Fix For: 2.0.0, 1.4.0, 1.3.2, 1.5.0, 1.2.7, 1.1.13
>
> Attachments: HBASE-18867.0.patch, HBASE-18867.branch-1.3.patch, 
> HBASE-18867.branch-1.patch
>
>
> build fails under jdk9, even when targeting java 8
> {code}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-enforcer-plugin:1.4:enforce 
> (min-maven-min-java-banned-xerces) on project hbase: Execution 
> min-maven-min-java-banned-xerces of goal 
> org.apache.maven.plugins:maven-enforcer-plugin:1.4:enforce failed: An API 
> incompatibility was encountered while executing 
> org.apache.maven.plugins:maven-enforcer-plugin:1.4:enforce: 
> java.lang.ExceptionInInitializerError: null
> [ERROR] -
> [ERROR] realm =plugin>org.apache.maven.plugins:maven-enforcer-plugin:1.4
> [ERROR] strategy = org.codehaus.plexus.classworlds.strategy.SelfFirstStrategy
> [ERROR] urls[0] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/plugins/maven-enforcer-plugin/1.4/maven-enforcer-plugin-1.4.jar
> [ERROR] urls[1] = 
> file:/Users/busbey/.m2/repository/org/codehaus/mojo/extra-enforcer-rules/1.0-beta-6/extra-enforcer-rules-1.0-beta-6.jar
> [ERROR] urls[2] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/shared/maven-dependency-tree/2.2/maven-dependency-tree-2.2.jar
> [ERROR] urls[3] = 
> file:/Users/busbey/.m2/repository/org/codehaus/plexus/plexus-component-annotations/1.5.5/plexus-component-annotations-1.5.5.jar
> [ERROR] urls[4] = 
> file:/Users/busbey/.m2/repository/org/eclipse/aether/aether-util/0.9.0.M2/aether-util-0.9.0.M2.jar
> [ERROR] urls[5] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/shared/maven-common-artifact-filters/1.4/maven-common-artifact-filters-1.4.jar
> [ERROR] urls[6] = 
> file:/Users/busbey/.m2/repository/com/ibm/icu/icu4j/56.1/icu4j-56.1.jar
> [ERROR] urls[7] = 
> file:/Users/busbey/.m2/repository/backport-util-concurrent/backport-util-concurrent/3.1/backport-util-concurrent-3.1.jar
> [ERROR] urls[8] = 
> file:/Users/busbey/.m2/repository/org/codehaus/plexus/plexus-interpolation/1.11/plexus-interpolation-1.11.jar
> [ERROR] urls[9] = 
> file:/Users/busbey/.m2/repository/org/slf4j/slf4j-jdk14/1.5.6/slf4j-jdk14-1.5.6.jar
> [ERROR] urls[10] = 
> file:/Users/busbey/.m2/repository/org/slf4j/jcl-over-slf4j/1.5.6/jcl-over-slf4j-1.5.6.jar
> [ERROR] urls[11] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/reporting/maven-reporting-api/2.2.1/maven-reporting-api-2.2.1.jar
> [ERROR] urls[12] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/doxia/doxia-sink-api/1.1/doxia-sink-api-1.1.jar
> [ERROR] urls[13] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/doxia/doxia-logging-api/1.1/doxia-logging-api-1.1.jar
> [ERROR] urls[14] = 
> file:/Users/busbey/.m2/repository/commons-cli/commons-cli/1.2/commons-cli-1.2.jar
> [ERROR] urls[15] = 
> file:/Users/busbey/.m2/repository/org/codehaus/plexus/plexus-interactivity-api/1.0-alpha-4/plexus-interactivity-api-1.0-alpha-4.jar
> [ERROR] urls[16] = 
> file:/Users/busbey/.m2/repository/org/sonatype/plexus/plexus-sec-dispatcher/1.3/plexus-sec-dispatcher-1.3.jar
> [ERROR] urls[17] = 
> file:/Users/busbey/.m2/repository/org/sonatype/plexus/plexus-cipher/1.4/plexus-cipher-1.4.jar
> [ERROR] urls[18] = 
> file:/Users/busbey/.m2/repository/org/codehaus/plexus/plexus-utils/3.0.20/plexus-utils-3.0.20.jar
> [ERROR] urls[19] = 
> file:/Users/busbey/.m2/repository/commons-lang/commons-lang/2.3/commons-lang-2.3.jar
> [ERROR] urls[20] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/enforcer/enforcer-api/1.4/enforcer-api-1.4.jar
> [ERROR] urls[21] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/enforcer/enforcer-rules/1.4/enforcer-rules-1.4.jar
> [ERROR] urls[22] = 
> file:/Users/busbey/.m2/repository/org/beanshell/bsh/2.0b4/bsh-2.0b4.jar
> [ERROR] urls[23] = 
> file:/Users/busbey/.m2/repository/org/codehaus/plexus/plexus-i18n/1.0-beta-6/plexus-i18n-1.0-beta-6.jar
> [ERROR] urls[24] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/plugin-testing/maven-plugin-testing-harness/1.3/maven-plugin-testing-harness-1.3.jar
> [ERROR] urls[25] = 
> file:/Users/busbey/.m2/repository/org/codehaus/plexus/plexus-archiver/2.2/plexus-archiver-2.2.jar
> [ERROR] urls[26] = 
> 

[jira] [Updated] (HBASE-18867) maven enforcer plugin needs update to work with jdk9

2017-10-09 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18867:
--
Attachment: HBASE-18867.branch-1.patch

Pushed to branch-2 and master, attaching a branch-1 patch since that didn't 
apply cleanly.

> maven enforcer plugin needs update to work with jdk9
> 
>
> Key: HBASE-18867
> URL: https://issues.apache.org/jira/browse/HBASE-18867
> Project: HBase
> Issue Type: Sub-task
> Components: build
> Affects Versions: 2.0.0-alpha-3
> Reporter: Sean Busbey
> Assignee: Sean Busbey
> Priority: Blocker
> Fix For: 2.0.0, 1.4.0, 1.5.0
>
> Attachments: HBASE-18867.0.patch, HBASE-18867.branch-1.patch
>
>
> build fails under jdk9, even when targeting java 8

[jira] [Comment Edited] (HBASE-18867) maven enforcer plugin needs update to work with jdk9

2017-10-09 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198009#comment-16198009
 ] 

Mike Drob edited comment on HBASE-18867 at 10/10/17 1:41 AM:
-

Pushed to branch-2 and master, attaching a branch-1 patch since that didn't 
apply cleanly.

[~apurtell] - do you even want this on branch-1.4? Not sure whether compiling with 
Java 9 is a goal for the release.


was (Author: mdrob):
Pushed to branch-2 and master, attaching a branch-1 patch since that didn't 
apply cleanly.

> maven enforcer plugin needs update to work with jdk9
> 
>
> Key: HBASE-18867
> URL: https://issues.apache.org/jira/browse/HBASE-18867
> Project: HBase
> Issue Type: Sub-task
> Components: build
> Affects Versions: 2.0.0-alpha-3
> Reporter: Sean Busbey
> Assignee: Sean Busbey
> Priority: Blocker
> Fix For: 2.0.0, 1.4.0, 1.5.0
>
> Attachments: HBASE-18867.0.patch, HBASE-18867.branch-1.patch
>
>
> build fails under jdk9, even when targeting java 8

[jira] [Commented] (HBASE-18867) maven enforcer plugin needs update to work with jdk9

2017-10-09 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198032#comment-16198032
 ] 

Mike Drob commented on HBASE-18867:
---

cc: [~mantonov] [~busbey] [~ndimiduk]

> maven enforcer plugin needs update to work with jdk9
> 
>
> Key: HBASE-18867
> URL: https://issues.apache.org/jira/browse/HBASE-18867
> Project: HBase
> Issue Type: Sub-task
> Components: build
> Affects Versions: 2.0.0-alpha-3
> Reporter: Sean Busbey
> Assignee: Sean Busbey
> Priority: Blocker
> Fix For: 2.0.0, 1.4.0, 1.3.2, 1.5.0, 1.2.7, 1.1.13
>
> Attachments: HBASE-18867.0.patch, HBASE-18867.branch-1.3.patch, 
> HBASE-18867.branch-1.patch
>
>
> build fails under jdk9, even when targeting java 8

[jira] [Commented] (HBASE-16338) update jackson to 2.y

2017-10-05 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193643#comment-16193643
 ] 

Mike Drob commented on HBASE-16338:
---

bq. Could you describe the problem caused by it remaining '$'?
The new version of jackson doesn't know what to do with it when deserializing, 
and it leads to a bunch of unit test failures. One example:

{noformat}
testMultiCellGetWithColsJSON[0](org.apache.hadoop.hbase.rest.TestMultiRowResource)
  Time elapsed: 0.083 sec  <<< ERROR!
org.codehaus.jackson.map.exc.UnrecognizedPropertyException:
Unrecognized field "value" (Class 
org.apache.hadoop.hbase.rest.model.CellModel), not marked as ignorable
 at [Source: [B@7e5efcab; line: 1, column: 91] (through reference chain: 
org.apache.hadoop.hbase.rest.model.CellSetModel["Row"]->org.apache.hadoop.hbase.rest.model.RowModel["Cell"]->org.apache.hadoop.hbase.rest.model.CellModel["value"])
at 
org.apache.hadoop.hbase.rest.TestMultiRowResource.testMultiCellGetWithColsJSON(TestMultiRowResource.java:205)
{noformat}

The relevant test code being:

{code}
ObjectMapper mapper = new JacksonJaxbJsonProvider()
    .locateMapper(CellSetModel.class, MediaType.APPLICATION_JSON_TYPE);
CellSetModel cellSet = mapper.readValue(response.getBody(), CellSetModel.class);
{code}

> update jackson to 2.y
> -
>
> Key: HBASE-16338
> URL: https://issues.apache.org/jira/browse/HBASE-16338
> Project: HBase
> Issue Type: Task
> Components: dependencies
> Reporter: Sean Busbey
> Assignee: Mike Drob
> Fix For: 2.0.0-beta-2
>
> Attachments: 16338.txt, HBASE-16338.v2.patch, HBASE-16338.v3.patch
>
>
> Our jackson dependency is from ~3 years ago. Update to the jackson 2.y line, 
> using 2.7.0+.





[jira] [Commented] (HBASE-16338) update jackson to 2.y

2017-10-05 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193736#comment-16193736
 ] 

Mike Drob commented on HBASE-16338:
---

Dropping the JSON annotation seems to have worked; I'll go with that in my next 
patch.


Currently trying to debug why CellSetModelStream.Row isn't getting populated in 
TableScanResource.

> update jackson to 2.y
> -
>
> Key: HBASE-16338
> URL: https://issues.apache.org/jira/browse/HBASE-16338
> Project: HBase
> Issue Type: Task
> Components: dependencies
> Reporter: Sean Busbey
> Assignee: Mike Drob
> Fix For: 2.0.0-beta-2
>
> Attachments: 16338.txt, HBASE-16338.v2.patch, HBASE-16338.v3.patch
>
>
> Our jackson dependency is from ~3 years ago. Update to the jackson 2.y line, 
> using 2.7.0+.





[jira] [Commented] (HBASE-18957) add test that establishes branch-1 behavior for filterlist w/OR

2017-10-07 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16195811#comment-16195811
 ] 

Mike Drob commented on HBASE-18957:
---

Should we link this to the revertible issues?

> add test that establishes branch-1 behavior for filterlist w/OR
> ---
>
> Key: HBASE-18957
> URL: https://issues.apache.org/jira/browse/HBASE-18957
> Project: HBase
> Issue Type: Sub-task
> Components: Filters
> Reporter: Sean Busbey
> Assignee: Peter Somogyi
> Priority: Critical
> Fix For: 2.0.0-alpha-4
>
> Attachments: HBASE-18957-branch-1.2.v0.patch, 
> HBASE-18957-branch-1.2.v1.patch, HBASE-18957-branch-1.4.v1.patch, 
> HBASE-18957-branch-1.v1.patch, HBASE-18957-master.v1.patch
>
>
> we need a test that shows the expected behavior for filter lists that rely on 
> OR prior to our filterlist improvements so we have a baseline to show 
> compatibility (and/or document incompatibilities that end up being 
> introduced).





[jira] [Commented] (HBASE-18987) Raise value of HConstants#MAX_ROW_LENGTH

2017-10-11 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200932#comment-16200932
 ] 

Mike Drob commented on HBASE-18987:
---

{code}
+String nameStr = Bytes.toString(name);
{code}
This one is unused too, I should have caught it the first time.

+1 assuming tests pass otherwise

> Raise value of HConstants#MAX_ROW_LENGTH
> 
>
> Key: HBASE-18987
> URL: https://issues.apache.org/jira/browse/HBASE-18987
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 1.0.0, 2.0.0
> Reporter: Esteban Gutierrez
> Assignee: Esteban Gutierrez
> Priority: Minor
> Attachments: HBASE-18987.master.001.patch, 
> HBASE-18987.master.002.patch
>
>
> Short.MAX_VALUE hasn't been a problem for a long time, but one of our 
> customers ran into an edge case when the midKey used for the split point was 
> very close to Short.MAX_VALUE in length. When the split is submitted, we 
> attempt to create the two new daughter regions, and we name those regions via 
> {{HRegionInfo.createRegionName()}} so they can be added to META. 
> Unfortunately, since {{HRegionInfo.createRegionName()}} uses midKey as the 
> startKey, the {{Put}} fails because the resulting row key is too long to 
> pass checkRow, which causes the split to fail.
> I tried a couple of alternatives to address this problem, e.g. truncating the 
> startKey, but the number of code changes isn't justified for this edge 
> condition. Since we already use {{Integer.MAX_VALUE - 1}} for 
> {{HConstants#MAXIMUM_VALUE_LENGTH}}, it should be OK to use the same limit for 
> the maximum row key.
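
The failure described above comes down to simple length arithmetic, which can be sketched without HBase on the classpath. Everything in this block is a simplified assumption made for illustration: RowLengthSketch, regionName and fitsAsRow are hypothetical stand-ins, not the real HRegionInfo.createRegionName() or checkRow, and the name layout (table, start key, region id joined by commas) is reduced to the minimum needed to show the effect.

```java
import java.nio.charset.StandardCharsets;

class RowLengthSketch {
    // The limit the issue describes: row keys longer than Short.MAX_VALUE are rejected.
    static final int MAX_ROW_LENGTH = Short.MAX_VALUE;

    // Hypothetical stand-in for a region-name builder: table,startKey,regionId
    static byte[] regionName(byte[] table, byte[] startKey, long regionId) {
        byte[] id = Long.toString(regionId).getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[table.length + 1 + startKey.length + 1 + id.length];
        int pos = 0;
        System.arraycopy(table, 0, out, pos, table.length);
        pos += table.length;
        out[pos++] = ',';
        System.arraycopy(startKey, 0, out, pos, startKey.length);
        pos += startKey.length;
        out[pos++] = ',';
        System.arraycopy(id, 0, out, pos, id.length);
        return out;
    }

    // Hypothetical stand-in for the row-length check applied to a Put's row key.
    static boolean fitsAsRow(byte[] row) {
        return row.length <= MAX_ROW_LENGTH;
    }

    public static void main(String[] args) {
        byte[] table = "t1".getBytes(StandardCharsets.UTF_8);
        // A split point that is itself a legal row key, just under the limit.
        byte[] midKey = new byte[Short.MAX_VALUE - 10];
        byte[] name = regionName(table, midKey, 1507700000000L);

        System.out.println("midKey fits as a row: " + fitsAsRow(midKey));
        System.out.println("region name length:   " + name.length);
        System.out.println("region name fits:     " + fitsAsRow(name));
    }
}
```

The point of the sketch is that the region name embeds the whole start key plus extra bytes, so a start key that is legal on its own can yield a META row key that is over the limit.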





[jira] [Updated] (HBASE-18867) maven enforcer plugin needs update to work with jdk9

2017-10-11 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18867:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> maven enforcer plugin needs update to work with jdk9
> 
>
> Key: HBASE-18867
> URL: https://issues.apache.org/jira/browse/HBASE-18867
> Project: HBase
> Issue Type: Sub-task
> Components: build
> Affects Versions: 2.0.0-alpha-3
> Reporter: Sean Busbey
> Assignee: Sean Busbey
> Priority: Blocker
> Fix For: 3.0.0, 1.4.0, 1.3.2, 1.5.0, 1.2.7, 1.1.13, 2.0.0-alpha-4
>
> Attachments: HBASE-18867.0.patch, HBASE-18867.branch-1.3.patch, 
> HBASE-18867.branch-1.patch
>
>
> build fails under jdk9, even when targeting java 8
> {code}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-enforcer-plugin:1.4:enforce 
> (min-maven-min-java-banned-xerces) on project hbase: Execution 
> min-maven-min-java-banned-xerces of goal 
> org.apache.maven.plugins:maven-enforcer-plugin:1.4:enforce failed: An API 
> incompatibility was encountered while executing 
> org.apache.maven.plugins:maven-enforcer-plugin:1.4:enforce: 
> java.lang.ExceptionInInitializerError: null
> [ERROR] -
> [ERROR] realm =plugin>org.apache.maven.plugins:maven-enforcer-plugin:1.4
> [ERROR] strategy = org.codehaus.plexus.classworlds.strategy.SelfFirstStrategy
> [ERROR] urls[0] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/plugins/maven-enforcer-plugin/1.4/maven-enforcer-plugin-1.4.jar
> [ERROR] urls[1] = 
> file:/Users/busbey/.m2/repository/org/codehaus/mojo/extra-enforcer-rules/1.0-beta-6/extra-enforcer-rules-1.0-beta-6.jar
> [ERROR] urls[2] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/shared/maven-dependency-tree/2.2/maven-dependency-tree-2.2.jar
> [ERROR] urls[3] = 
> file:/Users/busbey/.m2/repository/org/codehaus/plexus/plexus-component-annotations/1.5.5/plexus-component-annotations-1.5.5.jar
> [ERROR] urls[4] = 
> file:/Users/busbey/.m2/repository/org/eclipse/aether/aether-util/0.9.0.M2/aether-util-0.9.0.M2.jar
> [ERROR] urls[5] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/shared/maven-common-artifact-filters/1.4/maven-common-artifact-filters-1.4.jar
> [ERROR] urls[6] = 
> file:/Users/busbey/.m2/repository/com/ibm/icu/icu4j/56.1/icu4j-56.1.jar
> [ERROR] urls[7] = 
> file:/Users/busbey/.m2/repository/backport-util-concurrent/backport-util-concurrent/3.1/backport-util-concurrent-3.1.jar
> [ERROR] urls[8] = 
> file:/Users/busbey/.m2/repository/org/codehaus/plexus/plexus-interpolation/1.11/plexus-interpolation-1.11.jar
> [ERROR] urls[9] = 
> file:/Users/busbey/.m2/repository/org/slf4j/slf4j-jdk14/1.5.6/slf4j-jdk14-1.5.6.jar
> [ERROR] urls[10] = 
> file:/Users/busbey/.m2/repository/org/slf4j/jcl-over-slf4j/1.5.6/jcl-over-slf4j-1.5.6.jar
> [ERROR] urls[11] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/reporting/maven-reporting-api/2.2.1/maven-reporting-api-2.2.1.jar
> [ERROR] urls[12] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/doxia/doxia-sink-api/1.1/doxia-sink-api-1.1.jar
> [ERROR] urls[13] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/doxia/doxia-logging-api/1.1/doxia-logging-api-1.1.jar
> [ERROR] urls[14] = 
> file:/Users/busbey/.m2/repository/commons-cli/commons-cli/1.2/commons-cli-1.2.jar
> [ERROR] urls[15] = 
> file:/Users/busbey/.m2/repository/org/codehaus/plexus/plexus-interactivity-api/1.0-alpha-4/plexus-interactivity-api-1.0-alpha-4.jar
> [ERROR] urls[16] = 
> file:/Users/busbey/.m2/repository/org/sonatype/plexus/plexus-sec-dispatcher/1.3/plexus-sec-dispatcher-1.3.jar
> [ERROR] urls[17] = 
> file:/Users/busbey/.m2/repository/org/sonatype/plexus/plexus-cipher/1.4/plexus-cipher-1.4.jar
> [ERROR] urls[18] = 
> file:/Users/busbey/.m2/repository/org/codehaus/plexus/plexus-utils/3.0.20/plexus-utils-3.0.20.jar
> [ERROR] urls[19] = 
> file:/Users/busbey/.m2/repository/commons-lang/commons-lang/2.3/commons-lang-2.3.jar
> [ERROR] urls[20] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/enforcer/enforcer-api/1.4/enforcer-api-1.4.jar
> [ERROR] urls[21] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/enforcer/enforcer-rules/1.4/enforcer-rules-1.4.jar
> [ERROR] urls[22] = 
> file:/Users/busbey/.m2/repository/org/beanshell/bsh/2.0b4/bsh-2.0b4.jar
> [ERROR] urls[23] = 
> file:/Users/busbey/.m2/repository/org/codehaus/plexus/plexus-i18n/1.0-beta-6/plexus-i18n-1.0-beta-6.jar
> [ERROR] urls[24] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/plugin-testing/maven-plugin-testing-harness/1.3/maven-plugin-testing-harness-1.3.jar
> [ERROR] urls[25] = 
> file:/Users/busbey/.m2/repository/org/codehaus/plexus/plexus-archiver/2.2/plexus-archiver-2.2.jar
> [ERROR] urls[26] = 
> 

[jira] [Updated] (HBASE-18867) maven enforcer plugin needs update to work with jdk9

2017-10-11 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18867:
--
Fix Version/s: (was: 2.0.0)
   2.0.0-alpha-4
   3.0.0

> maven enforcer plugin needs update to work with jdk9
> 
>
> Key: HBASE-18867
> URL: https://issues.apache.org/jira/browse/HBASE-18867
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Affects Versions: 2.0.0-alpha-3
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Blocker
> Fix For: 3.0.0, 1.4.0, 1.3.2, 1.5.0, 1.2.7, 1.1.13, 2.0.0-alpha-4
>
> Attachments: HBASE-18867.0.patch, HBASE-18867.branch-1.3.patch, 
> HBASE-18867.branch-1.patch
>
>
> build fails under jdk9, even when targeting java 8
> {code}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-enforcer-plugin:1.4:enforce 
> (min-maven-min-java-banned-xerces) on project hbase: Execution 
> min-maven-min-java-banned-xerces of goal 
> org.apache.maven.plugins:maven-enforcer-plugin:1.4:enforce failed: An API 
> incompatibility was encountered while executing 
> org.apache.maven.plugins:maven-enforcer-plugin:1.4:enforce: 
> java.lang.ExceptionInInitializerError: null
> [ERROR] -
> [ERROR] realm =plugin>org.apache.maven.plugins:maven-enforcer-plugin:1.4
> [ERROR] strategy = org.codehaus.plexus.classworlds.strategy.SelfFirstStrategy
> [ERROR] urls[0] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/plugins/maven-enforcer-plugin/1.4/maven-enforcer-plugin-1.4.jar
> [ERROR] urls[1] = 
> file:/Users/busbey/.m2/repository/org/codehaus/mojo/extra-enforcer-rules/1.0-beta-6/extra-enforcer-rules-1.0-beta-6.jar
> [ERROR] urls[2] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/shared/maven-dependency-tree/2.2/maven-dependency-tree-2.2.jar
> [ERROR] urls[3] = 
> file:/Users/busbey/.m2/repository/org/codehaus/plexus/plexus-component-annotations/1.5.5/plexus-component-annotations-1.5.5.jar
> [ERROR] urls[4] = 
> file:/Users/busbey/.m2/repository/org/eclipse/aether/aether-util/0.9.0.M2/aether-util-0.9.0.M2.jar
> [ERROR] urls[5] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/shared/maven-common-artifact-filters/1.4/maven-common-artifact-filters-1.4.jar
> [ERROR] urls[6] = 
> file:/Users/busbey/.m2/repository/com/ibm/icu/icu4j/56.1/icu4j-56.1.jar
> [ERROR] urls[7] = 
> file:/Users/busbey/.m2/repository/backport-util-concurrent/backport-util-concurrent/3.1/backport-util-concurrent-3.1.jar
> [ERROR] urls[8] = 
> file:/Users/busbey/.m2/repository/org/codehaus/plexus/plexus-interpolation/1.11/plexus-interpolation-1.11.jar
> [ERROR] urls[9] = 
> file:/Users/busbey/.m2/repository/org/slf4j/slf4j-jdk14/1.5.6/slf4j-jdk14-1.5.6.jar
> [ERROR] urls[10] = 
> file:/Users/busbey/.m2/repository/org/slf4j/jcl-over-slf4j/1.5.6/jcl-over-slf4j-1.5.6.jar
> [ERROR] urls[11] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/reporting/maven-reporting-api/2.2.1/maven-reporting-api-2.2.1.jar
> [ERROR] urls[12] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/doxia/doxia-sink-api/1.1/doxia-sink-api-1.1.jar
> [ERROR] urls[13] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/doxia/doxia-logging-api/1.1/doxia-logging-api-1.1.jar
> [ERROR] urls[14] = 
> file:/Users/busbey/.m2/repository/commons-cli/commons-cli/1.2/commons-cli-1.2.jar
> [ERROR] urls[15] = 
> file:/Users/busbey/.m2/repository/org/codehaus/plexus/plexus-interactivity-api/1.0-alpha-4/plexus-interactivity-api-1.0-alpha-4.jar
> [ERROR] urls[16] = 
> file:/Users/busbey/.m2/repository/org/sonatype/plexus/plexus-sec-dispatcher/1.3/plexus-sec-dispatcher-1.3.jar
> [ERROR] urls[17] = 
> file:/Users/busbey/.m2/repository/org/sonatype/plexus/plexus-cipher/1.4/plexus-cipher-1.4.jar
> [ERROR] urls[18] = 
> file:/Users/busbey/.m2/repository/org/codehaus/plexus/plexus-utils/3.0.20/plexus-utils-3.0.20.jar
> [ERROR] urls[19] = 
> file:/Users/busbey/.m2/repository/commons-lang/commons-lang/2.3/commons-lang-2.3.jar
> [ERROR] urls[20] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/enforcer/enforcer-api/1.4/enforcer-api-1.4.jar
> [ERROR] urls[21] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/enforcer/enforcer-rules/1.4/enforcer-rules-1.4.jar
> [ERROR] urls[22] = 
> file:/Users/busbey/.m2/repository/org/beanshell/bsh/2.0b4/bsh-2.0b4.jar
> [ERROR] urls[23] = 
> file:/Users/busbey/.m2/repository/org/codehaus/plexus/plexus-i18n/1.0-beta-6/plexus-i18n-1.0-beta-6.jar
> [ERROR] urls[24] = 
> file:/Users/busbey/.m2/repository/org/apache/maven/plugin-testing/maven-plugin-testing-harness/1.3/maven-plugin-testing-harness-1.3.jar
> [ERROR] urls[25] = 
> file:/Users/busbey/.m2/repository/org/codehaus/plexus/plexus-archiver/2.2/plexus-archiver-2.2.jar
> [ERROR] 

[jira] [Updated] (HBASE-16338) update jackson to 2.y

2017-10-11 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-16338:
--
Status: Patch Available  (was: Open)

> update jackson to 2.y
> -
>
> Key: HBASE-16338
> URL: https://issues.apache.org/jira/browse/HBASE-16338
> Project: HBase
>  Issue Type: Task
>  Components: dependencies
>Reporter: Sean Busbey
>Assignee: Mike Drob
> Fix For: 2.0.0-beta-2
>
> Attachments: 16338.txt, HBASE-16338.v2.patch, HBASE-16338.v3.patch, 
> HBASE-16338.v5.patch, HBASE-16338.v6.patch, HBASE-16338.v7.patch, 
> HBASE-16338.v8.patch
>
>
> Our jackson dependency is from ~3 years ago. Update to the jackson 2.y line, 
> using 2.7.0+.





[jira] [Updated] (HBASE-16338) update jackson to 2.y

2017-10-11 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-16338:
--
Attachment: HBASE-16338.v8.patch

v8: Try to massage the maven dependency graph a little more to not fail with 
ClassNotFound

> update jackson to 2.y
> -
>
> Key: HBASE-16338
> URL: https://issues.apache.org/jira/browse/HBASE-16338
> Project: HBase
>  Issue Type: Task
>  Components: dependencies
>Reporter: Sean Busbey
>Assignee: Mike Drob
> Fix For: 2.0.0-beta-2
>
> Attachments: 16338.txt, HBASE-16338.v2.patch, HBASE-16338.v3.patch, 
> HBASE-16338.v5.patch, HBASE-16338.v6.patch, HBASE-16338.v7.patch, 
> HBASE-16338.v8.patch
>
>
> Our jackson dependency is from ~3 years ago. Update to the jackson 2.y line, 
> using 2.7.0+.





[jira] [Updated] (HBASE-16338) update jackson to 2.y

2017-10-11 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-16338:
--
Status: Open  (was: Patch Available)

> update jackson to 2.y
> -
>
> Key: HBASE-16338
> URL: https://issues.apache.org/jira/browse/HBASE-16338
> Project: HBase
>  Issue Type: Task
>  Components: dependencies
>Reporter: Sean Busbey
>Assignee: Mike Drob
> Fix For: 2.0.0-beta-2
>
> Attachments: 16338.txt, HBASE-16338.v2.patch, HBASE-16338.v3.patch, 
> HBASE-16338.v5.patch, HBASE-16338.v6.patch, HBASE-16338.v7.patch, 
> HBASE-16338.v8.patch
>
>
> Our jackson dependency is from ~3 years ago. Update to the jackson 2.y line, 
> using 2.7.0+.





[jira] [Updated] (HBASE-18667) Disable error-prone for hbase-protocol-shaded

2017-10-11 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18667:
--
 Assignee: Mike Drob
Fix Version/s: 3.0.0
   Status: Patch Available  (was: Open)

v1: Speeds up compilation for me with error-prone from 5 minutes to ~30 seconds 
for protobuf modules. This may also improve other static analysis tools that we 
have.

> Disable error-prone for hbase-protocol-shaded
> -
>
> Key: HBASE-18667
> URL: https://issues.apache.org/jira/browse/HBASE-18667
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: Mike Drob
>Assignee: Mike Drob
> Fix For: 3.0.0
>
>
> This is all generated code that we shouldn't be running extra analysis on 
> because it adds a lot of noise to the build, and also takes a very long time 
> (15 minutes on my machine). Let's make it fast and simple.
> Even when we run with error-prone enabled for the rest of the build, it 
> should not apply here.





[jira] [Updated] (HBASE-18667) Disable error-prone for hbase-protocol-shaded

2017-10-11 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18667:
--
Attachment: HBASE-18667.patch

> Disable error-prone for hbase-protocol-shaded
> -
>
> Key: HBASE-18667
> URL: https://issues.apache.org/jira/browse/HBASE-18667
> Project: HBase
>  Issue Type: Sub-task
>  Components: build
>Reporter: Mike Drob
>Assignee: Mike Drob
> Fix For: 3.0.0
>
> Attachments: HBASE-18667.patch
>
>
> This is all generated code that we shouldn't be running extra analysis on 
> because it adds a lot of noise to the build, and also takes a very long time 
> (15 minutes on my machine). Let's make it fast and simple.
> Even when we run with error-prone enabled for the rest of the build, it 
> should not apply here.





[jira] [Assigned] (HBASE-18974) Document "Becoming a Committer"

2017-10-11 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob reassigned HBASE-18974:
-

Assignee: Mike Drob

> Document "Becoming a Committer"
> ---
>
> Key: HBASE-18974
> URL: https://issues.apache.org/jira/browse/HBASE-18974
> Project: HBase
>  Issue Type: Bug
>  Components: community, documentation
>Reporter: Mike Drob
>Assignee: Mike Drob
> Attachments: HBASE-18974.patch
>
>
> Based on the mailing list discussion at 
> https://lists.apache.org/thread.html/81c633cbe1f6f78421cbdad5b9549643c67803a723a9d86a513264c0@%3Cdev.hbase.apache.org%3E
>  it sounds like we should record some of the thoughts for future contributors 
> to refer to.





[jira] [Updated] (HBASE-18974) Document "Becoming a Committer"

2017-10-11 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18974:
--
Attachment: HBASE-18974.patch

v1: Joined [~misty] and [~elserj]'s lists from the mailing list discussion, 
borrowed some opening verbiage from Hadoop's documentation, and tried to slot 
it into the book naturally.

[~apurtell], [~misty] - please review!

> Document "Becoming a Committer"
> ---
>
> Key: HBASE-18974
> URL: https://issues.apache.org/jira/browse/HBASE-18974
> Project: HBase
>  Issue Type: Bug
>  Components: community, documentation
>Reporter: Mike Drob
> Attachments: HBASE-18974.patch
>
>
> Based on the mailing list discussion at 
> https://lists.apache.org/thread.html/81c633cbe1f6f78421cbdad5b9549643c67803a723a9d86a513264c0@%3Cdev.hbase.apache.org%3E
>  it sounds like we should record some of the thoughts for future contributors 
> to refer to.





[jira] [Commented] (HBASE-18987) Raise value of HConstants#MAX_ROW_LENGTH

2017-10-11 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200877#comment-16200877
 ] 

Mike Drob commented on HBASE-18987:
---

{code}
+String md5HashInHex = MD5Hash.getMD5AsHex(name, 0, name.length);
{code}
This is unused?

{code}
+try {
+  Put hri_a = new Put(sk);
+} catch (Exception e) {
+  fail("Put shouldn't have failed with oversized row key");
+}
{code}
Let the exception propagate, or do {{assertNotNull(hri_a)}}. The catch/fail is 
icky, I think.
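
A minimal sketch of the "let the exception propagate" style suggested above.
It is framework-free so that it stands alone; SketchPut is a hypothetical
stand-in for Put (not the HBase class), and the limits passed in are
assumptions for illustration only.

```java
import java.util.Objects;

/** Hypothetical stand-in for Put: rejects null or over-long row keys. */
final class SketchPut {
    private final byte[] row;

    SketchPut(byte[] row, int maxRowLength) {
        if (row == null || row.length > maxRowLength) {
            int len = (row == null) ? 0 : row.length;
            throw new IllegalArgumentException("Row length " + len + " is > " + maxRowLength);
        }
        this.row = row;
    }

    int rowLength() {
        return row.length;
    }
}

final class OversizedRowKeySketch {
    /**
     * Test-style method with no try/catch: if the constructor throws, the
     * exception reaches the caller (in a real test, the test runner) with
     * its original message and stack trace intact.
     */
    static SketchPut buildPutForLongRow(int rowLength, int maxRowLength) {
        byte[] sk = new byte[rowLength];
        SketchPut put = new SketchPut(sk, maxRowLength);
        // Framework-free equivalent of assertNotNull(put).
        Objects.requireNonNull(put, "put");
        return put;
    }
}
```

Compared with catch/fail, this keeps the original exception instead of
replacing it with a generic failure message.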

> Raise value of HConstants#MAX_ROW_LENGTH
> 
>
> Key: HBASE-18987
> URL: https://issues.apache.org/jira/browse/HBASE-18987
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 1.0.0, 2.0.0
>Reporter: Esteban Gutierrez
>Assignee: Esteban Gutierrez
>Priority: Minor
> Attachments: HBASE-18987.master.001.patch
>
>
> A row key limit of Short.MAX_VALUE hasn't been a problem for a long time, but
> one of our customers ran into an edge case where the midKey used for the split
> point was very close to Short.MAX_VALUE. When the split is submitted, we
> attempt to create the two new daughter regions, and we name those regions via
> {{HRegionInfo.createRegionName()}} so that they can be added to META.
> Unfortunately, since {{HRegionInfo.createRegionName()}} uses midKey as the
> startKey, the {{Put}} fails because the resulting row key is too long to pass
> checkRow, which causes the split to fail.
> I tried a couple of alternatives to address this problem, e.g. truncating the
> startKey, but the number of code changes required isn't justified for this
> edge condition. Since we already use {{Integer.MAX_VALUE - 1}} for
> {{HConstants#MAXIMUM_VALUE_LENGTH}}, it should be OK to use the same limit for
> the maximum row key length.





[jira] [Commented] (HBASE-19042) Oracle Java 8u144 downloader broken in precommit check

2017-10-18 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16210491#comment-16210491
 ] 

Mike Drob commented on HBASE-19042:
---

Hmmm...

{noformat}
02:23:20 Successfully built 8549bfd1b815
02:23:20 Successfully tagged yetus/hbase:tp-30526
02:23:20 
02:23:20 Total Elapsed time:   0m  2s
02:23:20 
02:23:20 /usr/bin/docker: invalid reference format: repository name must be 
lowercase.
02:23:20 See '/usr/bin/docker run --help'.
02:23:21 Build step 'Execute shell' marked build as failure
{noformat}

> Oracle Java 8u144 downloader broken in precommit check
> --
>
> Key: HBASE-19042
> URL: https://issues.apache.org/jira/browse/HBASE-19042
> Project: HBase
>  Issue Type: Bug
>  Components: build
>Reporter: Peter Somogyi
>Assignee: Misty Stanley-Jones
>Priority: Blocker
> Attachments: Dockerfile, HBASE-19042.patch
>
>
> The precommit job fails to install Oracle Java 8 into the Docker image. The
> failure is due to Oracle's new Java version, 8u151.
> As this thread points out, we probably need to upgrade to the latest Java 8
> version: https://ubuntuforums.org/showthread.php?t=2374686
> {code}
> 06:45:14 Setting up java-common (0.51) ...
> 06:45:14 Setting up oracle-java8-installer (8u144-1~webupd8~0) ...
> 06:45:14 No /var/cache/oracle-jdk8-installer/wgetrc file found.
> 06:45:14 Creating /var/cache/oracle-jdk8-installer/wgetrc and
> 06:45:14 using default oracle-java8-installer wgetrc settings for it.
> 06:45:14 Downloading Oracle Java 8...
> 06:45:14 --2017-10-18 13:45:14--  
> http://download.oracle.com/otn-pub/java/jdk/8u144-b01/090f390dda5b47b9b721c7dfaa008135/jdk-8u144-linux-x64.tar.gz
> 06:45:14 Resolving download.oracle.com (download.oracle.com)... 
> 23.59.189.81, 23.59.189.91
> 06:45:14 Connecting to download.oracle.com 
> (download.oracle.com)|23.59.189.81|:80... connected.
> 06:45:14 HTTP request sent, awaiting response... 302 Moved 
> Temporarily
> 06:45:14 Location: 
> https://edelivery.oracle.com/otn-pub/java/jdk/8u144-b01/090f390dda5b47b9b721c7dfaa008135/jdk-8u144-linux-x64.tar.gz
>  [following]
> 06:45:14 --2017-10-18 13:45:14--  
> https://edelivery.oracle.com/otn-pub/java/jdk/8u144-b01/090f390dda5b47b9b721c7dfaa008135/jdk-8u144-linux-x64.tar.gz
> 06:45:14 Resolving edelivery.oracle.com (edelivery.oracle.com)... 
> 23.39.16.136, 2600:1409:a:39e::2d3e, 2600:1409:a:39c::2d3e
> 06:45:14 Connecting to edelivery.oracle.com 
> (edelivery.oracle.com)|23.39.16.136|:443... connected.
> 06:45:14 HTTP request sent, awaiting response... 302 Moved 
> Temporarily
> 06:45:14 Location: 
> http://download.oracle.com/otn-pub/java/jdk/8u144-b01/090f390dda5b47b9b721c7dfaa008135/jdk-8u144-linux-x64.tar.gz?AuthParam=1508334434_7da3c9610b0368a45f954cd47d91121c
>  [following]
> 06:45:14 --2017-10-18 13:45:14--  
> http://download.oracle.com/otn-pub/java/jdk/8u144-b01/090f390dda5b47b9b721c7dfaa008135/jdk-8u144-linux-x64.tar.gz?AuthParam=1508334434_7da3c9610b0368a45f954cd47d91121c
> 06:45:14 Connecting to download.oracle.com 
> (download.oracle.com)|23.59.189.81|:80... connected.
> 06:45:14 HTTP request sent, awaiting response... 404 Not Found
> 06:45:14 2017-10-18 13:45:14 ERROR 404: Not Found.
> 06:45:14 
> 06:45:14 download failed
> 06:45:14 Oracle JDK 8 is NOT installed.
> 06:45:14 dpkg: error processing package oracle-java8-installer 
> (--configure):
> 06:45:14  subprocess installed post-installation script returned error exit 
> status 1
> 06:45:14 Errors were encountered while processing:
> 06:45:14  oracle-java8-installer
> 06:45:29 E: Sub-process /usr/bin/dpkg returned an error code (1)
> 06:45:29 The command '/bin/sh -c apt-get -q update && apt-get -q install 
> --no-install-recommends -y oracle-java8-installer' returned a non-zero code: 
> 100
> 06:45:29 
> 06:45:29 Total Elapsed time:   3m 19s
> 06:45:29 
> 06:45:29 ERROR: Docker failed to build image.
> {code}
> Workaround mentioned in the forum post:
> {code}
> sudo sed -i 's|JAVA_VERSION=8u144|JAVA_VERSION=8u152|' oracle-java8-installer.*
> sudo sed -i 's|PARTNER_URL=http://download.oracle.com/otn-pub/java/jdk/8u144-b01/090f390dda5b47b9b721c7dfaa008135/|PARTNER_URL=http://download.oracle.com/otn-pub/java/jdk/8u152-b16/aa0333dd3019491ca4f6ddbe78cdb6d0/|' oracle-java8-installer.*
> sudo sed -i 's|SHA256SUM_TGZ="e8a341ce566f32c3d06f6d0f0eeea9a0f434f538d22af949ae58bc86f2eeaae4"|SHA256SUM_TGZ="218b3b340c3f6d05d940b817d0270dfe0cfd657a636bad074dcabe0c111961bf"|' oracle-java8-installer.*
> sudo sed -i 's|J_DIR=jdk1.8.0_144|J_DIR=jdk1.8.0_152|' oracle-java8-installer.*
> {code}




