[jira] [Resolved] (HBASE-10305) Batch update performance drops as the number of regions grows

2014-01-10 Thread Chao Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Shi resolved HBASE-10305.
--

Resolution: Not A Problem

We're running 0.94.10. ASYNC_WAL does work for me. Thanks Lars. Closing this 
issue.
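
(For reference, a minimal sketch of the ASYNC_WAL-style setting on the 0.94 API, i.e. table-level deferred log flush; the table name and the choice to alter an existing table are illustrative assumptions, not something prescribed on this issue.)

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class EnableDeferredLogFlush {
  public static void main(String[] args) throws Exception {
    String table = "my_table";                 // illustrative table name
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    try {
      HTableDescriptor desc = admin.getTableDescriptor(Bytes.toBytes(table));
      // 0.94.x table-level setting; on 0.98+ the per-mutation equivalent
      // is put.setDurability(Durability.ASYNC_WAL).
      desc.setDeferredLogFlush(true);
      admin.disableTable(table);
      admin.modifyTable(Bytes.toBytes(table), desc);
      admin.enableTable(table);
    } finally {
      admin.close();
    }
  }
}
{code}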

 Batch update performance drops as the number of regions grows
 -

 Key: HBASE-10305
 URL: https://issues.apache.org/jira/browse/HBASE-10305
 Project: HBase
  Issue Type: Bug
  Components: Performance
Reporter: Chao Shi

 In our use case, we use a small number (~5) of proxy programs that read from 
 a queue and batch updates to HBase. Our program is multi-threaded and the 
 HBase client batches mutations to each RS.
 We found we get lower TPS when there are more regions. I think the reason is 
 that the RS syncs the HLog once per region. Suppose there is a single region: 
 the batch update will only touch that region and therefore syncs the HLog 
 once. Now suppose there are 10 regions per server: in RS#multi() it has to 
 process the updates for each individual region and sync the HLog 10 times.
 Please note that in our scenario, batched mutations are usually independent 
 of each other and touch a varying number of regions.
 We are using the 0.94 series, but after a quick look at the code I think 
 trunk has the same problem.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10305) Batch update performance drops as the number of regions grows

2014-01-10 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867616#comment-13867616
 ] 

Lars Hofhansl commented on HBASE-10305:
---

Great.
Maybe this should be documented more prominently in the HBase book.

 Batch update performance drops as the number of regions grows
 -

 Key: HBASE-10305
 URL: https://issues.apache.org/jira/browse/HBASE-10305
 Project: HBase
  Issue Type: Bug
  Components: Performance
Reporter: Chao Shi

 In our use case, we use a small number (~5) of proxy programs that read from 
 a queue and batch updates to HBase. Our program is multi-threaded and the 
 HBase client batches mutations to each RS.
 We found we get lower TPS when there are more regions. I think the reason is 
 that the RS syncs the HLog once per region. Suppose there is a single region: 
 the batch update will only touch that region and therefore syncs the HLog 
 once. Now suppose there are 10 regions per server: in RS#multi() it has to 
 process the updates for each individual region and sync the HLog 10 times.
 Please note that in our scenario, batched mutations are usually independent 
 of each other and touch a varying number of regions.
 We are using the 0.94 series, but after a quick look at the code I think 
 trunk has the same problem.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-6581) Build with hadoop.profile=3.0

2014-01-10 Thread Eric Charles (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Charles updated HBASE-6581:


Attachment: HBASE-6581-6.patch

HBASE-6581-6.patch rebased to latest trunk.

 Build with hadoop.profile=3.0
 -

 Key: HBASE-6581
 URL: https://issues.apache.org/jira/browse/HBASE-6581
 Project: HBase
  Issue Type: Bug
Reporter: Eric Charles
Assignee: Eric Charles
 Fix For: 0.98.1

 Attachments: HBASE-6581-1.patch, HBASE-6581-2.patch, 
 HBASE-6581-20130821.patch, HBASE-6581-3.patch, HBASE-6581-4.patch, 
 HBASE-6581-5.patch, HBASE-6581-6.patch, HBASE-6581.diff, HBASE-6581.diff


 Building trunk with hadoop.profile=3.0 gives exceptions (see [1]) due to a 
 change in the hadoop maven module naming (and also the use of 3.0-SNAPSHOT 
 instead of 3.0.0-SNAPSHOT in hbase-common).
 I can provide a patch that moves most of the hadoop dependencies into their 
 respective profiles and defines the correct hadoop deps in the 3.0 profile.
 Please tell me if it's ok to go this way.
 Thx, Eric
 [1]
 $ mvn clean install -Dhadoop.profile=3.0
 [INFO] Scanning for projects...
 [ERROR] The build could not read 3 projects - [Help 1]
 [ERROR]   
 [ERROR]   The project org.apache.hbase:hbase-server:0.95-SNAPSHOT 
 (/d/hbase.svn/hbase-server/pom.xml) has 3 errors
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-common:jar is missing. @ line 655, column 21
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-annotations:jar is missing. @ line 659, column 21
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-minicluster:jar is missing. @ line 663, column 21
 [ERROR]   
 [ERROR]   The project org.apache.hbase:hbase-common:0.95-SNAPSHOT 
 (/d/hbase.svn/hbase-common/pom.xml) has 3 errors
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-common:jar is missing. @ line 170, column 21
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-annotations:jar is missing. @ line 174, column 21
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-minicluster:jar is missing. @ line 178, column 21
 [ERROR]   
 [ERROR]   The project org.apache.hbase:hbase-it:0.95-SNAPSHOT 
 (/d/hbase.svn/hbase-it/pom.xml) has 3 errors
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-common:jar is missing. @ line 220, column 18
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-annotations:jar is missing. @ line 224, column 21
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-minicluster:jar is missing. @ line 228, column 21
 [ERROR] 



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-6581) Build with hadoop.profile=3.0

2014-01-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867675#comment-13867675
 ] 

Hadoop QA commented on HBASE-6581:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12622372/HBASE-6581-6.patch
  against trunk revision .
  ATTACHMENT ID: 12622372

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8384//console

This message is automatically generated.

 Build with hadoop.profile=3.0
 -

 Key: HBASE-6581
 URL: https://issues.apache.org/jira/browse/HBASE-6581
 Project: HBase
  Issue Type: Bug
Reporter: Eric Charles
Assignee: Eric Charles
 Fix For: 0.98.1

 Attachments: HBASE-6581-1.patch, HBASE-6581-2.patch, 
 HBASE-6581-20130821.patch, HBASE-6581-3.patch, HBASE-6581-4.patch, 
 HBASE-6581-5.patch, HBASE-6581-6.patch, HBASE-6581.diff, HBASE-6581.diff


 Building trunk with hadoop.profile=3.0 gives exceptions (see [1]) due to a 
 change in the hadoop maven module naming (and also the use of 3.0-SNAPSHOT 
 instead of 3.0.0-SNAPSHOT in hbase-common).
 I can provide a patch that moves most of the hadoop dependencies into their 
 respective profiles and defines the correct hadoop deps in the 3.0 profile.
 Please tell me if it's ok to go this way.
 Thx, Eric
 [1]
 $ mvn clean install -Dhadoop.profile=3.0
 [INFO] Scanning for projects...
 [ERROR] The build could not read 3 projects - [Help 1]
 [ERROR]   
 [ERROR]   The project org.apache.hbase:hbase-server:0.95-SNAPSHOT 
 (/d/hbase.svn/hbase-server/pom.xml) has 3 errors
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-common:jar is missing. @ line 655, column 21
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-annotations:jar is missing. @ line 659, column 21
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-minicluster:jar is missing. @ line 663, column 21
 [ERROR]   
 [ERROR]   The project org.apache.hbase:hbase-common:0.95-SNAPSHOT 
 (/d/hbase.svn/hbase-common/pom.xml) has 3 errors
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-common:jar is missing. @ line 170, column 21
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-annotations:jar is missing. @ line 174, column 21
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-minicluster:jar is missing. @ line 178, column 21
 [ERROR]   
 [ERROR]   The project org.apache.hbase:hbase-it:0.95-SNAPSHOT 
 (/d/hbase.svn/hbase-it/pom.xml) has 3 errors
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-common:jar is missing. @ line 220, column 18
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-annotations:jar is missing. @ line 224, column 21
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-minicluster:jar is missing. @ line 228, column 21
 [ERROR] 



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HBASE-10309) Add support to delete empty regions in 0.94.x series

2014-01-10 Thread AcCud (JIRA)
AcCud created HBASE-10309:
-

 Summary: Add support to delete empty regions in 0.94.x series
 Key: HBASE-10309
 URL: https://issues.apache.org/jira/browse/HBASE-10309
 Project: HBase
  Issue Type: New Feature
Reporter: AcCud
 Fix For: 0.94.16


My use case: I have several tables whose keys start with a timestamp. Because 
of this, combined with the fact that I have set a 15-day retention period, 
empty regions accumulate after a while.
I am sure that no writes will occur in these regions.
It would be nice to have a tool to delete regions without having to stop the 
cluster.
The easiest way for me would be a tool that deletes all empty regions, but it 
would also be fine to specify which region to delete.
Something like:
deleteRegion tableName region
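
(As a rough illustration of the detection half of such a tool, a hedged sketch against the 0.94 client API that probes each region's key range for a single row; the table name is an illustrative assumption, and the online region deletion itself is exactly the missing piece this request asks for.)

{code}
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.hbase.util.Pair;

public class FindEmptyRegions {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "events");   // illustrative table name
    try {
      Pair<byte[][], byte[][]> keys = table.getStartEndKeys();
      List<String> empty = new ArrayList<String>();
      for (int i = 0; i < keys.getFirst().length; i++) {
        // Probe the region's key range for a single row.
        Scan scan = new Scan(keys.getFirst()[i], keys.getSecond()[i]);
        scan.setCaching(1);
        scan.setFilter(new FirstKeyOnlyFilter());
        ResultScanner rs = table.getScanner(scan);
        try {
          if (rs.next() == null) {
            empty.add(Bytes.toStringBinary(keys.getFirst()[i]));
          }
        } finally {
          rs.close();
        }
      }
      System.out.println("Empty regions (by start key): " + empty);
    } finally {
      table.close();
    }
  }
}
{code}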




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-9977) Define C interface of HBase Client Asynchronous APIs

2014-01-10 Thread Cosmin Lehene (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867782#comment-13867782
 ] 

Cosmin Lehene commented on HBASE-9977:
--

[~eclark] Is there a JIRA umbrella for the C++ (core)?
It looks like HBASE-10168 is for JNI and HBASE-1015 suggests wrapping Thrift. 

 Define C interface of HBase Client Asynchronous APIs
 

 Key: HBASE-9977
 URL: https://issues.apache.org/jira/browse/HBASE-9977
 Project: HBase
  Issue Type: Sub-task
  Components: Client
Reporter: Elliott Clark
Assignee: Elliott Clark
 Fix For: 0.99.0

 Attachments: HBASE-9977-0.patch, HBASE-9977-1.patch, 
 HBASE-9977-2.patch, HBASE-9977-3.patch, HBASE-9977-4.patch, HBASE-9977-5.patch






--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-1015) pure C and C++ client libraries

2014-01-10 Thread Cosmin Lehene (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867787#comment-13867787
 ] 

Cosmin Lehene commented on HBASE-1015:
--

HBASE-9977 suggests a C++ async client and C sync/async wrappers. 
Given that HBase talks protobuf natively, is a native wrapper around Thrift 
still a goal?


 pure C and C++ client libraries
 ---

 Key: HBASE-1015
 URL: https://issues.apache.org/jira/browse/HBASE-1015
 Project: HBase
  Issue Type: New Feature
  Components: Client
Affects Versions: 0.20.6
Reporter: Andrew Purtell
Priority: Minor

 If via HBASE-794 first class support for talking via Thrift directly to 
 HMaster and HRS is available, then pure C and C++ client libraries are 
 possible. 
 The C client library would wrap a Thrift core. 
 The C++ client library can provide a class hierarchy quite close to 
 o.a.h.h.client and, ideally, identical semantics. It should be just a 
 wrapper around the C API, for economy.
 Internally to my employer there is a lot of resistance to HBase because many 
 dev teams have a strong C/C++ bias. The real issue, however, is client-side 
 integration, not a fundamental objection. (What runs server-side and how it 
 is managed is a secondary consideration.)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HBASE-10310) ZNodeCleaner.java KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/master

2014-01-10 Thread Samir Ahmic (JIRA)
Samir Ahmic created HBASE-10310:
---

 Summary: ZNodeCleaner.java 
KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for 
/hbase/master
 Key: HBASE-10310
 URL: https://issues.apache.org/jira/browse/HBASE-10310
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.96.1.1
 Environment: x86_64 GNU/Linux
Reporter: Samir Ahmic


I was testing the 'hbase master clear' command while working on [HBASE-7386]; 
here are the command and exception:
{code}
$ export HBASE_ZNODE_FILE=/tmp/hbase-hadoop-master.znode; ./hbase master clear

14/01/10 14:05:44 INFO zookeeper.ZooKeeper: Initiating client connection, 
connectString=zk1:2181 sessionTimeout=9 watcher=clean znode for master, 
quorum=zk1:2181, baseZNode=/hbase
14/01/10 14:05:44 INFO zookeeper.RecoverableZooKeeper: Process identifier=clean 
znode for master connecting to ZooKeeper ensemble=zk1:2181
14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Opening socket connection to 
server zk1/172.17.33.5:2181. Will not attempt to authenticate using SASL 
(Unable to locate a login configuration)
14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Socket connection established to 
zk11/172.17.33.5:2181, initiating session
14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Session establishment complete on 
server zk1/172.17.33.5:2181, sessionid = 0x1427a96bfea4a8a, negotiated timeout 
= 4
14/01/10 14:05:44 INFO zookeeper.ZooKeeper: Session: 0x1427a96bfea4a8a closed
14/01/10 14:05:44 INFO zookeeper.ClientCnxn: EventThread shut down
14/01/10 14:05:44 WARN zookeeper.RecoverableZooKeeper: Possibly transient 
ZooKeeper, quorum=zk1:2181, 
exception=org.apache.zookeeper.KeeperException$SessionExpiredException: 
KeeperErrorCode = Session expired for /hbase/master
14/01/10 14:05:44 INFO util.RetryCounter: Sleeping 1000ms before retry #0...
14/01/10 14:05:45 WARN zookeeper.RecoverableZooKeeper: Possibly transient 
ZooKeeper, quorum=zk1:2181, 
exception=org.apache.zookeeper.KeeperException$SessionExpiredException: 
KeeperErrorCode = Session expired for /hbase/master
14/01/10 14:05:45 ERROR zookeeper.RecoverableZooKeeper: ZooKeeper getData 
failed after 1 attempts
14/01/10 14:05:45 WARN zookeeper.ZKUtil: clean znode for 
master-0x1427a96bfea4a8a, quorum=zk1:2181, baseZNode=/hbase Unable to get data 
of znode /hbase/master
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = 
Session expired for /hbase/master
at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
at 
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:337)
at 
org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:777)
at 
org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.deleteIfEquals(MasterAddressTracker.java:170)
at org.apache.hadoop.hbase.ZNodeClearer.clear(ZNodeClearer.java:160)
at 
org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:138)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at 
org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2779)
14/01/10 14:05:45 ERROR zookeeper.ZooKeeperWatcher: clean znode for 
master-0x1427a96bfea4a8a, quorum=zk1:2181, baseZNode=/hbase Received unexpected 
KeeperException, re-throwing exception
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = 
Session expired for /hbase/master
at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
at 
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:337)
at 
org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:777)
at 
org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.deleteIfEquals(MasterAddressTracker.java:170)
at org.apache.hadoop.hbase.ZNodeClearer.clear(ZNodeClearer.java:160)
at 
org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:138)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at 
org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2779)
14/01/10 14:05:45 WARN zookeeper.ZooKeeperNodeTracker: Can't get or delete the 
master znode
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = 
Session expired for /hbase/master
at 

[jira] [Updated] (HBASE-10310) ZNodeCleaner.java KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/master

2014-01-10 Thread Samir Ahmic (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samir Ahmic updated HBASE-10310:


Attachment: HBASE-10310.patch

Here is the patch. 

 ZNodeCleaner.java KeeperException$SessionExpiredException: KeeperErrorCode = 
 Session expired for /hbase/master
 --

 Key: HBASE-10310
 URL: https://issues.apache.org/jira/browse/HBASE-10310
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.96.1.1
 Environment: x86_64 GNU/Linux
Reporter: Samir Ahmic
 Attachments: HBASE-10310.patch


 I was testing the 'hbase master clear' command while working on 
 [HBASE-7386]; here are the command and exception:
 {code}
 $ export HBASE_ZNODE_FILE=/tmp/hbase-hadoop-master.znode; ./hbase master clear
 14/01/10 14:05:44 INFO zookeeper.ZooKeeper: Initiating client connection, 
 connectString=zk1:2181 sessionTimeout=9 watcher=clean znode for master, 
 quorum=zk1:2181, baseZNode=/hbase
 14/01/10 14:05:44 INFO zookeeper.RecoverableZooKeeper: Process 
 identifier=clean znode for master connecting to ZooKeeper ensemble=zk1:2181
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Opening socket connection to 
 server zk1/172.17.33.5:2181. Will not attempt to authenticate using SASL 
 (Unable to locate a login configuration)
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Socket connection established to 
 zk11/172.17.33.5:2181, initiating session
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Session establishment complete 
 on server zk1/172.17.33.5:2181, sessionid = 0x1427a96bfea4a8a, negotiated 
 timeout = 4
 14/01/10 14:05:44 INFO zookeeper.ZooKeeper: Session: 0x1427a96bfea4a8a closed
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: EventThread shut down
 14/01/10 14:05:44 WARN zookeeper.RecoverableZooKeeper: Possibly transient 
 ZooKeeper, quorum=zk1:2181, 
 exception=org.apache.zookeeper.KeeperException$SessionExpiredException: 
 KeeperErrorCode = Session expired for /hbase/master
 14/01/10 14:05:44 INFO util.RetryCounter: Sleeping 1000ms before retry #0...
 14/01/10 14:05:45 WARN zookeeper.RecoverableZooKeeper: Possibly transient 
 ZooKeeper, quorum=zk1:2181, 
 exception=org.apache.zookeeper.KeeperException$SessionExpiredException: 
 KeeperErrorCode = Session expired for /hbase/master
 14/01/10 14:05:45 ERROR zookeeper.RecoverableZooKeeper: ZooKeeper getData 
 failed after 1 attempts
 14/01/10 14:05:45 WARN zookeeper.ZKUtil: clean znode for 
 master-0x1427a96bfea4a8a, quorum=zk1:2181, baseZNode=/hbase Unable to get 
 data of znode /hbase/master
 org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
 = Session expired for /hbase/master
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:337)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:777)
   at 
 org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.deleteIfEquals(MasterAddressTracker.java:170)
   at org.apache.hadoop.hbase.ZNodeClearer.clear(ZNodeClearer.java:160)
   at 
 org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:138)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
   at 
 org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
   at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2779)
 14/01/10 14:05:45 ERROR zookeeper.ZooKeeperWatcher: clean znode for 
 master-0x1427a96bfea4a8a, quorum=zk1:2181, baseZNode=/hbase Received 
 unexpected KeeperException, re-throwing exception
 org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
 = Session expired for /hbase/master
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:337)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:777)
   at 
 org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.deleteIfEquals(MasterAddressTracker.java:170)
   at org.apache.hadoop.hbase.ZNodeClearer.clear(ZNodeClearer.java:160)
   at 
 org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:138)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
   at 
 

[jira] [Assigned] (HBASE-10310) ZNodeCleaner.java KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/master

2014-01-10 Thread Samir Ahmic (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samir Ahmic reassigned HBASE-10310:
---

Assignee: Samir Ahmic

 ZNodeCleaner.java KeeperException$SessionExpiredException: KeeperErrorCode = 
 Session expired for /hbase/master
 --

 Key: HBASE-10310
 URL: https://issues.apache.org/jira/browse/HBASE-10310
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.96.1.1
 Environment: x86_64 GNU/Linux
Reporter: Samir Ahmic
Assignee: Samir Ahmic
 Attachments: HBASE-10310.patch


 I was testing the 'hbase master clear' command while working on 
 [HBASE-7386]; here are the command and exception:
 {code}
 $ export HBASE_ZNODE_FILE=/tmp/hbase-hadoop-master.znode; ./hbase master clear
 14/01/10 14:05:44 INFO zookeeper.ZooKeeper: Initiating client connection, 
 connectString=zk1:2181 sessionTimeout=9 watcher=clean znode for master, 
 quorum=zk1:2181, baseZNode=/hbase
 14/01/10 14:05:44 INFO zookeeper.RecoverableZooKeeper: Process 
 identifier=clean znode for master connecting to ZooKeeper ensemble=zk1:2181
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Opening socket connection to 
 server zk1/172.17.33.5:2181. Will not attempt to authenticate using SASL 
 (Unable to locate a login configuration)
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Socket connection established to 
 zk11/172.17.33.5:2181, initiating session
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Session establishment complete 
 on server zk1/172.17.33.5:2181, sessionid = 0x1427a96bfea4a8a, negotiated 
 timeout = 4
 14/01/10 14:05:44 INFO zookeeper.ZooKeeper: Session: 0x1427a96bfea4a8a closed
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: EventThread shut down
 14/01/10 14:05:44 WARN zookeeper.RecoverableZooKeeper: Possibly transient 
 ZooKeeper, quorum=zk1:2181, 
 exception=org.apache.zookeeper.KeeperException$SessionExpiredException: 
 KeeperErrorCode = Session expired for /hbase/master
 14/01/10 14:05:44 INFO util.RetryCounter: Sleeping 1000ms before retry #0...
 14/01/10 14:05:45 WARN zookeeper.RecoverableZooKeeper: Possibly transient 
 ZooKeeper, quorum=zk1:2181, 
 exception=org.apache.zookeeper.KeeperException$SessionExpiredException: 
 KeeperErrorCode = Session expired for /hbase/master
 14/01/10 14:05:45 ERROR zookeeper.RecoverableZooKeeper: ZooKeeper getData 
 failed after 1 attempts
 14/01/10 14:05:45 WARN zookeeper.ZKUtil: clean znode for 
 master-0x1427a96bfea4a8a, quorum=zk1:2181, baseZNode=/hbase Unable to get 
 data of znode /hbase/master
 org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
 = Session expired for /hbase/master
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:337)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:777)
   at 
 org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.deleteIfEquals(MasterAddressTracker.java:170)
   at org.apache.hadoop.hbase.ZNodeClearer.clear(ZNodeClearer.java:160)
   at 
 org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:138)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
   at 
 org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
   at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2779)
 14/01/10 14:05:45 ERROR zookeeper.ZooKeeperWatcher: clean znode for 
 master-0x1427a96bfea4a8a, quorum=zk1:2181, baseZNode=/hbase Received 
 unexpected KeeperException, re-throwing exception
 org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
 = Session expired for /hbase/master
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:337)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:777)
   at 
 org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.deleteIfEquals(MasterAddressTracker.java:170)
   at org.apache.hadoop.hbase.ZNodeClearer.clear(ZNodeClearer.java:160)
   at 
 org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:138)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
   at 
 

[jira] [Commented] (HBASE-10123) Change default ports; move them out of linux ephemeral port range

2014-01-10 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867936#comment-13867936
 ] 

Jonathan Hsieh commented on HBASE-10123:


Is this something that would need to wait for 1.0, or is there any chance of 
this in 0.98, [~apurtell]?

 Change default ports; move them out of linux ephemeral port range
 -

 Key: HBASE-10123
 URL: https://issues.apache.org/jira/browse/HBASE-10123
 Project: HBase
  Issue Type: Bug
Reporter: stack

 Our defaults clash w/ the range linux assigns itself for creating come-and-go 
 ephemeral ports; likely in our history we've clashed w/ a random, short-lived 
 process.  While easy to change the defaults, we should just ship w/ defaults 
 that make sense.  We could hoist ourselves up into the 7 or 8k range.
 See http://www.ncftp.com/ncftpd/doc/misc/ephemeral_ports.html



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HBASE-10311) Add Scan object to preScannerNext and postScannerNext methods on RegionObserver

2014-01-10 Thread Neil Ferguson (JIRA)
Neil Ferguson created HBASE-10311:
-

 Summary: Add Scan object to preScannerNext and postScannerNext 
methods on RegionObserver
 Key: HBASE-10311
 URL: https://issues.apache.org/jira/browse/HBASE-10311
 Project: HBase
  Issue Type: New Feature
  Components: Coprocessors
Affects Versions: 0.96.1.1
Reporter: Neil Ferguson


I'd like to be able to access the org.apache.hadoop.hbase.client.Scan that was 
used to create a scanner in the RegionObserver.preScannerNext and 
RegionObserver.postScannerNext methods.

The Scan object is available in the preScannerOpen method, but not in the 
preScannerNext or postScannerNext methods.

The reason is that I'd like to access the attributes of the Scan object. I want 
to do some resource management in the coprocessor based on some attributes of 
the Scan object (like, who created it).

Alternatively, does anybody know of another way to get hold of the Scan object 
in these methods without modifying things?



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10311) Add Scan object to preScannerNext and postScannerNext methods on RegionObserver

2014-01-10 Thread Neil Ferguson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Ferguson updated HBASE-10311:
--

Attachment: HBASE-10311.patch

Patch attached

 Add Scan object to preScannerNext and postScannerNext methods on 
 RegionObserver
 ---

 Key: HBASE-10311
 URL: https://issues.apache.org/jira/browse/HBASE-10311
 Project: HBase
  Issue Type: New Feature
  Components: Coprocessors
Affects Versions: 0.96.1.1
Reporter: Neil Ferguson
 Attachments: HBASE-10311.patch


 I'd like to be able to access the org.apache.hadoop.hbase.client.Scan that 
 was used to create a scanner in the RegionObserver.preScannerNext and 
 RegionObserver.postScannerNext methods.
 The Scan object is available in the preScannerOpen method, but not in the 
 preScannerNext or postScannerNext methods.
 The reason is that I'd like to access the attributes of the Scan object. I 
 want to do some resource management in the coprocessor based on some 
 attributes of the Scan object (like, who created it).
 Alternatively, does anybody know of another way to get hold of the Scan 
 object in these methods without modifying things?



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10311) Add Scan object to preScannerNext and postScannerNext methods on RegionObserver

2014-01-10 Thread Neil Ferguson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867960#comment-13867960
 ] 

Neil Ferguson commented on HBASE-10311:
---

Just made this patch, then realised that I can accomplish what I want by 
mapping RegionScanner -> Scan in postScannerOpen, then looking up this map in 
preScannerNext and postScannerNext. 

VisibilityController seems to do something similar already using a weak 
hashmap. 

This approach seems a little brittle, since there's theoretically no guarantee 
that the scanner that is passed to postScannerOpen is the same one that is 
passed to preScannerNext and postScannerNext. Perhaps we should change the docs 
to explicitly state that this will always be the case.

Anyway, since it doesn't involve changing the coprocessor interface, I'll take 
this approach. The patch to modify the coprocessor interface is attached if 
anyone wants it. Feel free to close this ticket otherwise.
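
(A rough, untested sketch of that workaround against the 0.96 coprocessor API; the "caller" attribute name is an illustrative assumption.)

{code}
import java.io.IOException;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;

import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.regionserver.InternalScanner;
import org.apache.hadoop.hbase.regionserver.RegionScanner;

public class ScanTrackingObserver extends BaseRegionObserver {

  // Weak keys so entries disappear once the scanner is closed and collected,
  // similar to what VisibilityController does.
  private final Map<InternalScanner, Scan> scans =
      Collections.synchronizedMap(new WeakHashMap<InternalScanner, Scan>());

  @Override
  public RegionScanner postScannerOpen(ObserverContext<RegionCoprocessorEnvironment> c,
      Scan scan, RegionScanner s) throws IOException {
    // Remember which Scan produced this scanner.
    scans.put(s, scan);
    return s;
  }

  @Override
  public boolean preScannerNext(ObserverContext<RegionCoprocessorEnvironment> c,
      InternalScanner s, List<Result> results, int limit, boolean hasNext)
      throws IOException {
    Scan scan = scans.get(s);
    if (scan != null) {
      byte[] caller = scan.getAttribute("caller");  // illustrative attribute
      // ... resource accounting based on the attribute would go here ...
    }
    return hasNext;
  }
}
{code}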

 Add Scan object to preScannerNext and postScannerNext methods on 
 RegionObserver
 ---

 Key: HBASE-10311
 URL: https://issues.apache.org/jira/browse/HBASE-10311
 Project: HBase
  Issue Type: New Feature
  Components: Coprocessors
Affects Versions: 0.96.1.1
Reporter: Neil Ferguson
 Attachments: HBASE-10311.patch


 I'd like to be able to access the org.apache.hadoop.hbase.client.Scan that 
 was used to create a scanner in the RegionObserver.preScannerNext and 
 RegionObserver.postScannerNext methods.
 The Scan object is available in the preScannerOpen method, but not in the 
 preScannerNext or postScannerNext methods.
 The reason is that I'd like to access the attributes of the Scan object. I 
 want to do some resource management in the coprocessor based on some 
 attributes of the Scan object (like, who created it).
 Alternatively, does anybody know of another way to get hold of the Scan 
 object in these methods without modifying things?



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10304) Running an hbase job jar: IllegalAccessError: class com.google.protobuf.ZeroCopyLiteralByteString cannot access its superclass com.google.protobuf.LiteralByteString

2014-01-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867968#comment-13867968
 ] 

stack commented on HBASE-10304:
---

[~enis] That pointer helps.

 Running an hbase job jar: IllegalAccessError: class 
 com.google.protobuf.ZeroCopyLiteralByteString cannot access its superclass 
 com.google.protobuf.LiteralByteString
 

 Key: HBASE-10304
 URL: https://issues.apache.org/jira/browse/HBASE-10304
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.98.0, 0.96.1.1
Reporter: stack
Priority: Blocker
 Fix For: 0.98.0

 Attachments: hbase-10304_not_tested.patch, jobjar.xml


 (Jimmy has been working on this one internally.  I'm just the messenger 
 raising this critical issue upstream).
 So, if you make a job jar and bundle hbase up inside it because you want to 
 access hbase from your mapreduce task, the deploy of the job jar to the 
 cluster fails with:
 {code}
 14/01/05 08:59:19 INFO Configuration.deprecation: 
 topology.node.switch.mapping.impl is deprecated. Instead, use 
 net.topology.node.switch.mapping.impl
 14/01/05 08:59:19 INFO Configuration.deprecation: io.bytes.per.checksum is 
 deprecated. Instead, use dfs.bytes-per-checksum
 Exception in thread main java.lang.IllegalAccessError: class 
 com.google.protobuf.ZeroCopyLiteralByteString cannot access its superclass 
 com.google.protobuf.LiteralByteString
   at java.lang.ClassLoader.defineClass1(Native Method)
   at java.lang.ClassLoader.defineClass(ClassLoader.java:792)
   at 
 java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
   at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
   at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
   at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
   at 
 org.apache.hadoop.hbase.protobuf.ProtobufUtil.toScan(ProtobufUtil.java:818)
   at 
 org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.convertScanToString(TableMapReduceUtil.java:433)
   at 
 org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:186)
   at 
 org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:147)
   at 
 org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:270)
   at 
 org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:100)
   at 
 com.ngdata.hbaseindexer.mr.HBaseMapReduceIndexerTool.run(HBaseMapReduceIndexerTool.java:124)
   at 
 com.ngdata.hbaseindexer.mr.HBaseMapReduceIndexerTool.run(HBaseMapReduceIndexerTool.java:64)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
   at 
 com.ngdata.hbaseindexer.mr.HBaseMapReduceIndexerTool.main(HBaseMapReduceIndexerTool.java:51)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 {code}
 So, ZCLBS is a hack.  This class is in the hbase-protocol module.  It is in 
 the com.google.protobuf package.  All is well and good usually.
 But when we make a job jar and bundle hbase up inside it, our 'trick' breaks. 
  RunJar makes a new class loader to run the job jar.  This URLClassLoader 
 'attaches' all the jars and classes that are in the job jar so they can be 
 found when it goes to do a lookup.  Only, classloaders work by always 
 delegating to their parent first (unless you are a WAR file in a container, 
 where delegation is 'off' for the most part), and in this case the parent 
 classloader will have access to a pb jar since pb is in the hadoop CLASSPATH. 
 So, the parent loads the pb classes.
 We then load ZCLBS, only this is done in the classloader made by RunJar; 
 ZCLBS has a different classloader from its superclass and we get the above 
 IllegalAccessError.
 Now (Jimmy's work comes in here), this can't be fixed by reflection -- you 
 can't setAccess on a 'Class' -- and though it probably could be fixed by 
 hacking RunJar so it was somehow made 

[jira] [Commented] (HBASE-10311) Add Scan object to preScannerNext and postScannerNext methods on RegionObserver

2014-01-10 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867979#comment-13867979
 ] 

Anoop Sam John commented on HBASE-10311:


Actually, it will be the same scanner object being passed to next() and the 
next CP hooks. Yes, we already use this in VisibilityController. I was about 
to come here and suggest this and then saw you realized it already on your 
own.. Good :)
We cannot change the CP signature in released major versions, only in trunk 
(if needed), and here it looks not needed at all. So I will close this issue.

 Add Scan object to preScannerNext and postScannerNext methods on 
 RegionObserver
 ---

 Key: HBASE-10311
 URL: https://issues.apache.org/jira/browse/HBASE-10311
 Project: HBase
  Issue Type: New Feature
  Components: Coprocessors
Affects Versions: 0.96.1.1
Reporter: Neil Ferguson
 Attachments: HBASE-10311.patch


 I'd like to be able to access the org.apache.hadoop.hbase.client.Scan that 
 was used to create a scanner in the RegionObserver.preScannerNext and 
 RegionObserver.postScannerNext methods.
 The Scan object is available in the preScannerOpen method, but not in the 
 preScannerNext or postScannerNext methods.
 The reason is that I'd like to access the attributes of the Scan object. I 
 want to do some resource management in the coprocessor based on some 
 attributes of the Scan object (like, who created it).
 Alternatively, does anybody know of another way to get hold of the Scan 
 object in these methods without modifying things?



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Resolved] (HBASE-10311) Add Scan object to preScannerNext and postScannerNext methods on RegionObserver

2014-01-10 Thread Anoop Sam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John resolved HBASE-10311.


Resolution: Invalid

 Add Scan object to preScannerNext and postScannerNext methods on 
 RegionObserver
 ---

 Key: HBASE-10311
 URL: https://issues.apache.org/jira/browse/HBASE-10311
 Project: HBase
  Issue Type: New Feature
  Components: Coprocessors
Affects Versions: 0.96.1.1
Reporter: Neil Ferguson
 Attachments: HBASE-10311.patch


 I'd like to be able to access the org.apache.hadoop.hbase.client.Scan that 
 was used to create a scanner in the RegionObserver.preScannerNext and 
 RegionObserver.postScannerNext methods.
 The Scan object is available in the preScannerOpen method, but not in the 
 preScannerNext or postScannerNext methods.
 The reason is that I'd like to access the attributes of the Scan object. I 
 want to do some resource management in the coprocessor based on some 
 attributes of the Scan object (like, who created it).
 Alternatively, does anybody know of another way to get hold of the Scan 
 object in these methods without modifying things?



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10294) Some synchronization on ServerManager#onlineServers can be removed

2014-01-10 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868027#comment-13868027
 ] 

Sergey Shelukhin commented on HBASE-10294:
--

Yes, the above is unnecessary. Patch? :)
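
(Presumably the change amounts to dropping the synchronized block; a sketch of what the method could look like, not the attached patch:)

{code}
  /**
   * onlineServers is a ConcurrentHashMap, so no extra locking is needed to
   * hand out a read-only view of it.
   */
  public Map<ServerName, ServerLoad> getOnlineServers() {
    // Presumption is that iterating the returned Map is OK.
    return Collections.unmodifiableMap(this.onlineServers);
  }
{code}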

 Some synchronization on ServerManager#onlineServers can be removed
 --

 Key: HBASE-10294
 URL: https://issues.apache.org/jira/browse/HBASE-10294
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu
Priority: Minor

 ServerManager#onlineServers is a ConcurrentHashMap.
 Yet I found that some accesses to it are synchronized, unnecessarily.
 Here is one example:
 {code}
   public Map<ServerName, ServerLoad> getOnlineServers() {
 // Presumption is that iterating the returned Map is OK.
 synchronized (this.onlineServers) {
   return Collections.unmodifiableMap(this.onlineServers);
 {code}
 Note: not all accesses to ServerManager#onlineServers are synchronized.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Assigned] (HBASE-10294) Some synchronization on ServerManager#onlineServers can be removed

2014-01-10 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-10294:
--

Assignee: Ted Yu

 Some synchronization on ServerManager#onlineServers can be removed
 --

 Key: HBASE-10294
 URL: https://issues.apache.org/jira/browse/HBASE-10294
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Minor

 ServerManager#onlineServers is a ConcurrentHashMap.
 Yet I found that some accesses to it are synchronized, unnecessarily.
 Here is one example:
 {code}
   public Map<ServerName, ServerLoad> getOnlineServers() {
 // Presumption is that iterating the returned Map is OK.
 synchronized (this.onlineServers) {
   return Collections.unmodifiableMap(this.onlineServers);
 {code}
 Note: not all accesses to ServerManager#onlineServers are synchronized.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10294) Some synchronization on ServerManager#onlineServers can be removed

2014-01-10 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10294:
---

Attachment: 10294-v1.txt

 Some synchronization on ServerManager#onlineServers can be removed
 --

 Key: HBASE-10294
 URL: https://issues.apache.org/jira/browse/HBASE-10294
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Minor
 Attachments: 10294-v1.txt


 ServerManager#onlineServers is a ConcurrentHashMap.
 Yet I found that some accesses to it are synchronized, unnecessarily.
 Here is one example:
 {code}
   public Map<ServerName, ServerLoad> getOnlineServers() {
 // Presumption is that iterating the returned Map is OK.
 synchronized (this.onlineServers) {
   return Collections.unmodifiableMap(this.onlineServers);
 {code}
 Note: not all accesses to ServerManager#onlineServers are synchronized.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10294) Some synchronization on ServerManager#onlineServers can be removed

2014-01-10 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10294:
---

Status: Patch Available  (was: Open)

 Some synchronization on ServerManager#onlineServers can be removed
 --

 Key: HBASE-10294
 URL: https://issues.apache.org/jira/browse/HBASE-10294
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Minor
 Attachments: 10294-v1.txt


 ServerManager#onlineServers is a ConcurrentHashMap.
 Yet I found that some accesses to it are synchronized, unnecessarily.
 Here is one example:
 {code}
   public Map<ServerName, ServerLoad> getOnlineServers() {
 // Presumption is that iterating the returned Map is OK.
 synchronized (this.onlineServers) {
   return Collections.unmodifiableMap(this.onlineServers);
 {code}
 Note: not all accesses to ServerManager#onlineServers are synchronized.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10307) IntegrationTestIngestWithEncryption assumes localhost cluster

2014-01-10 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868041#comment-13868041
 ] 

Andrew Purtell commented on HBASE-10307:


I need to commit this to move forward with 0.98, so will do so using CTR in a 
few hours.

 IntegrationTestIngestWithEncryption assumes localhost cluster
 -

 Key: HBASE-10307
 URL: https://issues.apache.org/jira/browse/HBASE-10307
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0, 0.99.0
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.98.0, 0.99.0

 Attachments: 10307.patch, 10307.patch


 We forgot to update IntegrationTestIngestWithEncryption to handle the 
 distributed cluster case.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10307) IntegrationTestIngestWithEncryption assumes localhost cluster

2014-01-10 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868053#comment-13868053
 ] 

Ted Yu commented on HBASE-10307:


+1

 IntegrationTestIngestWithEncryption assumes localhost cluster
 -

 Key: HBASE-10307
 URL: https://issues.apache.org/jira/browse/HBASE-10307
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0, 0.99.0
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.98.0, 0.99.0

 Attachments: 10307.patch, 10307.patch


 We forgot to update IntegrationTestIngestWithEncryption to handle the 
 distributed cluster case.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10123) Change default ports; move them out of linux ephemeral port range

2014-01-10 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868055#comment-13868055
 ] 

Andrew Purtell commented on HBASE-10123:


No problem at all [~jmhsieh], go for it.

 Change default ports; move them out of linux ephemeral port range
 -

 Key: HBASE-10123
 URL: https://issues.apache.org/jira/browse/HBASE-10123
 Project: HBase
  Issue Type: Bug
Reporter: stack

 Our defaults clash w/ the range linux assigns itself for creating come-and-go 
 ephemeral ports; likely in our history we've clashed w/ a random, short-lived 
 process.  While easy to change the defaults, we should just ship w/ defaults 
 that make sense.  We could hoist ourselves up into the 7 or 8k range.
 See http://www.ncftp.com/ncftpd/doc/misc/ephemeral_ports.html



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10304) Running an hbase job jar: IllegalAccessError: class com.google.protobuf.ZeroCopyLiteralByteString cannot access its superclass com.google.protobuf.LiteralByteString

2014-01-10 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868061#comment-13868061
 ] 

Andrew Purtell commented on HBASE-10304:


{quote}
I think something like below should be the standard way to launch an HBase 
job. Java developers are used to thinking about the classpath, so I don't think 
it's a burden on anyone.

{noformat}
$ HADOOP_CLASSPATH=$(hbase mapredcp) hadoop jar foo.jar MainClass
{noformat}

or perhaps, if you're fancy

{noformat}
$ HADOOP_CLASSPATH=/path/to/hbase_config:$(hbase mapredcp) hadoop jar foo.jar
{noformat}
{quote}

So can we get away with a doc change / manual update as the fix for this issue?
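
(As a data point for the doc-change route: a hedged sketch of a driver that relies on the launch classpath plus TableMapReduceUtil.addDependencyJars instead of bundling hbase inside the job jar; the class, job and table names are illustrative assumptions.)

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.IdentityTableMapper;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class ScanDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "scan-my-table");     // illustrative job name
    job.setJarByClass(ScanDriver.class);
    TableMapReduceUtil.initTableMapperJob("my_table", new Scan(),
        IdentityTableMapper.class, ImmutableBytesWritable.class,
        Result.class, job);
    // Ships the hbase (and protobuf) jars via the distributed cache instead
    // of bundling them in the job jar, sidestepping the classloader clash.
    TableMapReduceUtil.addDependencyJars(job);
    job.setOutputFormatClass(NullOutputFormat.class);
    job.setNumReduceTasks(0);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
    // Launched as in the quote above:
    //   HADOOP_CLASSPATH=$(hbase mapredcp) hadoop jar scan-driver.jar ScanDriver
  }
}
{code}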

 Running an hbase job jar: IllegalAccessError: class 
 com.google.protobuf.ZeroCopyLiteralByteString cannot access its superclass 
 com.google.protobuf.LiteralByteString
 

 Key: HBASE-10304
 URL: https://issues.apache.org/jira/browse/HBASE-10304
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.98.0, 0.96.1.1
Reporter: stack
Priority: Blocker
 Fix For: 0.98.0

 Attachments: hbase-10304_not_tested.patch, jobjar.xml


 (Jimmy has been working on this one internally.  I'm just the messenger 
 raising this critical issue upstream).
 So, if you make a job jar and bundle hbase up inside it because you want to 
 access hbase from your mapreduce task, the deploy of the job jar to the 
 cluster fails with:
 {code}
 14/01/05 08:59:19 INFO Configuration.deprecation: 
 topology.node.switch.mapping.impl is deprecated. Instead, use 
 net.topology.node.switch.mapping.impl
 14/01/05 08:59:19 INFO Configuration.deprecation: io.bytes.per.checksum is 
 deprecated. Instead, use dfs.bytes-per-checksum
 Exception in thread main java.lang.IllegalAccessError: class 
 com.google.protobuf.ZeroCopyLiteralByteString cannot access its superclass 
 com.google.protobuf.LiteralByteString
   at java.lang.ClassLoader.defineClass1(Native Method)
   at java.lang.ClassLoader.defineClass(ClassLoader.java:792)
   at 
 java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
   at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
   at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
   at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
   at 
 org.apache.hadoop.hbase.protobuf.ProtobufUtil.toScan(ProtobufUtil.java:818)
   at 
 org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.convertScanToString(TableMapReduceUtil.java:433)
   at 
 org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:186)
   at 
 org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:147)
   at 
 org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:270)
   at 
 org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:100)
   at 
 com.ngdata.hbaseindexer.mr.HBaseMapReduceIndexerTool.run(HBaseMapReduceIndexerTool.java:124)
   at 
 com.ngdata.hbaseindexer.mr.HBaseMapReduceIndexerTool.run(HBaseMapReduceIndexerTool.java:64)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
   at 
 com.ngdata.hbaseindexer.mr.HBaseMapReduceIndexerTool.main(HBaseMapReduceIndexerTool.java:51)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 {code}
 So, ZCLBS is a hack.  This class is in the hbase-protocol module.  It is in 
 the com.google.protobuf package.  All is well and good usually.
 But when we make a job jar and bundle hbase up inside it, our 'trick' breaks. 
  RunJar makes a new class loader to run the job jar.  This URLClassLoader 
 'attaches' all the jars and classes that are in the job jar so they can be 
 found when it goes to do a lookup.  Only, classloaders work by always 
 delegating to their parent first (unless you are a WAR file in a container, 
 where delegation is 'off' for the most part), and in this case the parent 

[jira] [Resolved] (HBASE-10299) TestZKProcedure fails occasionally

2014-01-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-10299.


   Resolution: Duplicate
Fix Version/s: (was: 0.99.0)
   (was: 0.98.0)
 Assignee: (was: Andrew Purtell)

Dup of HBASE-10308

 TestZKProcedure fails occasionally
 --

 Key: HBASE-10299
 URL: https://issues.apache.org/jira/browse/HBASE-10299
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: Andrew Purtell

 I can reproduce this using JDK 6 on Ubuntu 12. 
 {noformat}
 Running org.apache.hadoop.hbase.procedure.TestZKProcedure
 Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 5.941 sec  
 FAILURE!
 [...]
 Failed tests:   
 testMultiCohortWithMemberTimeoutDuringPrepare(org.apache.hadoop.hbase.procedure.TestZKProcedure):
  (..)
 {noformat}
 Not seen running the test standalone. Quite rare, seen after 46 previous 
 successful test suite runs. No failure trace available yet.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10308) TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare fails occasionally

2014-01-10 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868063#comment-13868063
 ] 

Andrew Purtell commented on HBASE-10308:


I filed HBASE-10299 for this but will close that as a dup since this issue has 
more detail.

 TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare fails 
 occasionally
 

 Key: HBASE-10308
 URL: https://issues.apache.org/jira/browse/HBASE-10308
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Fix For: 0.94.16, 0.96.2, 0.98.1, 0.99.0


 Seen in 0.94 (both JDK6 and JDK7 builds)
 {code}
 Error Message
  Wanted but not invoked: procedure.sendGlobalBarrierComplete(); - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:344)
   However, there were other interactions with this mock: - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:306)
  - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:311)
  - at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:204) 
 - at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:204) - 
 at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.memberAcquiredBarrier(ProcedureCoordinator.java:262)
  - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.memberAcquiredBarrier(ProcedureCoordinator.java:262)
  - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.abortProcedure(ProcedureCoordinator.java:217)
  - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:337)
  
 Stacktrace
 Wanted but not invoked:
 procedure.sendGlobalBarrierComplete();
 - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:344)
 However, there were other interactions with this mock:
 - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:306)
 - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:311)
 - at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:204)
 - at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:204)
 - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.memberAcquiredBarrier(ProcedureCoordinator.java:262)
 - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.memberAcquiredBarrier(ProcedureCoordinator.java:262)
 - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.abortProcedure(ProcedureCoordinator.java:217)
 - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:337)
   at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:344)
   at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:319)
 {code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10308) TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare fails occasionally

2014-01-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-10308:
---

Fix Version/s: 0.99.0
   0.98.1
   0.96.2

 TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare fails 
 occasionally
 

 Key: HBASE-10308
 URL: https://issues.apache.org/jira/browse/HBASE-10308
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Fix For: 0.94.16, 0.96.2, 0.98.1, 0.99.0


 Seen in 0.94 (both JDK6 and JDK7 builds)
 {code}
 Error Message
  Wanted but not invoked: procedure.sendGlobalBarrierComplete(); - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:344)
   However, there were other interactions with this mock: - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:306)
  - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:311)
  - at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:204) 
 - at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:204) - 
 at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.memberAcquiredBarrier(ProcedureCoordinator.java:262)
  - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.memberAcquiredBarrier(ProcedureCoordinator.java:262)
  - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.abortProcedure(ProcedureCoordinator.java:217)
  - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:337)
  
 Stacktrace
 Wanted but not invoked:
 procedure.sendGlobalBarrierComplete();
 - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:344)
 However, there were other interactions with this mock:
 - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:306)
 - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:311)
 - at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:204)
 - at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:204)
 - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.memberAcquiredBarrier(ProcedureCoordinator.java:262)
 - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.memberAcquiredBarrier(ProcedureCoordinator.java:262)
 - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.abortProcedure(ProcedureCoordinator.java:217)
 - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:337)
   at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:344)
   at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:319)
 {code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10307) IntegrationTestIngestWithEncryption assumes localhost cluster

2014-01-10 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868071#comment-13868071
 ] 

Andrew Purtell commented on HBASE-10307:


Thanks [~te...@apache.org], appreciate you taking a look.

 IntegrationTestIngestWithEncryption assumes localhost cluster
 -

 Key: HBASE-10307
 URL: https://issues.apache.org/jira/browse/HBASE-10307
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0, 0.99.0
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.98.0, 0.99.0

 Attachments: 10307.patch, 10307.patch


 We forgot to update IntegrationTestIngestWithEncryption to handle the 
 distributed cluster case.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10307) IntegrationTestIngestWithEncryption assumes localhost cluster

2014-01-10 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868091#comment-13868091
 ] 

Anoop Sam John commented on HBASE-10307:


Patch LGTM.. +1

 IntegrationTestIngestWithEncryption assumes localhost cluster
 -

 Key: HBASE-10307
 URL: https://issues.apache.org/jira/browse/HBASE-10307
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0, 0.99.0
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.98.0, 0.99.0

 Attachments: 10307.patch, 10307.patch


 We forgot to update IntegrationTestIngestWithEncryption to handle the 
 distributed cluster case.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10310) ZNodeCleaner.java KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/master

2014-01-10 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868092#comment-13868092
 ] 

Andrew Purtell commented on HBASE-10310:


+1, will commit this in a bit to trunk, 0.98, and 0.96 since it's an obvious 
fix. (Ping [~stack]).

 ZNodeCleaner.java KeeperException$SessionExpiredException: KeeperErrorCode = 
 Session expired for /hbase/master
 --

 Key: HBASE-10310
 URL: https://issues.apache.org/jira/browse/HBASE-10310
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.96.1.1
 Environment: x86_64 GNU/Linux
Reporter: Samir Ahmic
Assignee: Samir Ahmic
 Attachments: HBASE-10310.patch


 I was testing hbase master clear command while working on [HBASE-7386] here 
 is command and exception:
 {code}
 $ export HBASE_ZNODE_FILE=/tmp/hbase-hadoop-master.znode; ./hbase master clear
 14/01/10 14:05:44 INFO zookeeper.ZooKeeper: Initiating client connection, 
 connectString=zk1:2181 sessionTimeout=9 watcher=clean znode for master, 
 quorum=zk1:2181, baseZNode=/hbase
 14/01/10 14:05:44 INFO zookeeper.RecoverableZooKeeper: Process 
 identifier=clean znode for master connecting to ZooKeeper ensemble=zk1:2181
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Opening socket connection to 
 server zk1/172.17.33.5:2181. Will not attempt to authenticate using SASL 
 (Unable to locate a login configuration)
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Socket connection established to 
 zk11/172.17.33.5:2181, initiating session
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Session establishment complete 
 on server zk1/172.17.33.5:2181, sessionid = 0x1427a96bfea4a8a, negotiated 
 timeout = 4
 14/01/10 14:05:44 INFO zookeeper.ZooKeeper: Session: 0x1427a96bfea4a8a closed
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: EventThread shut down
 14/01/10 14:05:44 WARN zookeeper.RecoverableZooKeeper: Possibly transient 
 ZooKeeper, quorum=zk1:2181, 
 exception=org.apache.zookeeper.KeeperException$SessionExpiredException: 
 KeeperErrorCode = Session expired for /hbase/master
 14/01/10 14:05:44 INFO util.RetryCounter: Sleeping 1000ms before retry #0...
 14/01/10 14:05:45 WARN zookeeper.RecoverableZooKeeper: Possibly transient 
 ZooKeeper, quorum=zk1:2181, 
 exception=org.apache.zookeeper.KeeperException$SessionExpiredException: 
 KeeperErrorCode = Session expired for /hbase/master
 14/01/10 14:05:45 ERROR zookeeper.RecoverableZooKeeper: ZooKeeper getData 
 failed after 1 attempts
 14/01/10 14:05:45 WARN zookeeper.ZKUtil: clean znode for 
 master-0x1427a96bfea4a8a, quorum=zk1:2181, baseZNode=/hbase Unable to get 
 data of znode /hbase/master
 org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
 = Session expired for /hbase/master
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:337)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:777)
   at 
 org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.deleteIfEquals(MasterAddressTracker.java:170)
   at org.apache.hadoop.hbase.ZNodeClearer.clear(ZNodeClearer.java:160)
   at 
 org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:138)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
   at 
 org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
   at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2779)
 14/01/10 14:05:45 ERROR zookeeper.ZooKeeperWatcher: clean znode for 
 master-0x1427a96bfea4a8a, quorum=zk1:2181, baseZNode=/hbase Received 
 unexpected KeeperException, re-throwing exception
 org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
 = Session expired for /hbase/master
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:337)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:777)
   at 
 org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.deleteIfEquals(MasterAddressTracker.java:170)
   at org.apache.hadoop.hbase.ZNodeClearer.clear(ZNodeClearer.java:160)
   at 
 org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:138)
   

[jira] [Commented] (HBASE-10308) TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare fails occasionally

2014-01-10 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868068#comment-13868068
 ] 

Andrew Purtell commented on HBASE-10308:


Updated fix versions since this happens AFAIK on all branches.

 TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare fails 
 occasionally
 

 Key: HBASE-10308
 URL: https://issues.apache.org/jira/browse/HBASE-10308
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Fix For: 0.94.16, 0.96.2, 0.98.1, 0.99.0


 Seen in 0.94 (both JDK6 and JDK7 builds)
 {code}
 Error Message
  Wanted but not invoked: procedure.sendGlobalBarrierComplete(); - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:344)
   However, there were other interactions with this mock: - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:306)
  - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:311)
  - at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:204) 
 - at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:204) - 
 at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.memberAcquiredBarrier(ProcedureCoordinator.java:262)
  - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.memberAcquiredBarrier(ProcedureCoordinator.java:262)
  - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.abortProcedure(ProcedureCoordinator.java:217)
  - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:337)
  
 Stacktrace
 Wanted but not invoked:
 procedure.sendGlobalBarrierComplete();
 - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:344)
 However, there were other interactions with this mock:
 - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:306)
 - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:311)
 - at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:204)
 - at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:204)
 - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.memberAcquiredBarrier(ProcedureCoordinator.java:262)
 - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.memberAcquiredBarrier(ProcedureCoordinator.java:262)
 - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.abortProcedure(ProcedureCoordinator.java:217)
 - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:337)
   at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:344)
   at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:319)
 {code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10288) make mvcc an (optional) part of KV serialization

2014-01-10 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868094#comment-13868094
 ] 

Andrew Purtell commented on HBASE-10288:


bq.  It can be done using tags, but we may not want the overhead given that it 
will be in many KVs, so it might require HFileFormat vN+1

Only a minor version increment should be necessary. 

 make mvcc an (optional) part of KV serialization
 

 Key: HBASE-10288
 URL: https://issues.apache.org/jira/browse/HBASE-10288
 Project: HBase
  Issue Type: Improvement
  Components: HFile
Reporter: Sergey Shelukhin
Priority: Minor

 This has been suggested in HBASE-10241. Mvcc can currently be serialized in 
 HFile, but the mechanism is... magical. We might want to make it a part of 
 proper serialization of the KV. It can be done using tags, but we may not 
 want the overhead given that it will be in many KVs, so it might require 
 HFileFormat vN+1. Regardless, the external  mechanism would need to be 
 removed while also preserving backward compat.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10309) Add support to delete empty regions in 0.94.x series

2014-01-10 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868085#comment-13868085
 ] 

Andrew Purtell commented on HBASE-10309:


Or you could create your keys based on a modulus of the timestamp? Could work 
if you can put an upper bound on the number of records stored within the 
retention period, call it N, then key = (timestamp mod N) ... 
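For illustration, a minimal sketch of that keying scheme in Java; the value of N, the zero-padding width, and the idea of appending the raw timestamp to keep keys unique are assumptions made up for the example, not anything from this issue:

{code}
import org.apache.hadoop.hbase.util.Bytes;

public class WrappingKeys {
  // Assumed upper bound on rows stored within the retention period.
  static final long N = 100000000L;

  // Row key = zero-padded (timestamp mod N) so keys sort lexicographically,
  // with the raw timestamp appended to keep keys unique. Old key slots get
  // rewritten instead of leaving permanently empty regions behind.
  static byte[] rowKey(long timestampMillis) {
    long slot = timestampMillis % N;
    return Bytes.add(Bytes.toBytes(String.format("%012d", slot)),
        Bytes.toBytes(timestampMillis));
  }
}
{code}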

 Add support to delete empty regions in 0.94.x series
 

 Key: HBASE-10309
 URL: https://issues.apache.org/jira/browse/HBASE-10309
 Project: HBase
  Issue Type: New Feature
Reporter: AcCud
 Fix For: 0.94.16


 My use case: I have several tables where keys start with a timestamp. Because 
 of this, combined with the fact that I have set a 15-day retention period, 
 empty regions result after a period of time.
 I am sure that no writes will occur in these regions.
 It would be nice to have a tool to delete regions without it being necessary 
 to stop the cluster.
 The easiest way for me is to have a tool that is able to delete all empty 
 regions, but there wouldn't be any problem with specifying which region to 
 delete.
 Something like:
 deleteRegion tableName region



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10304) Running an hbase job jar: IllegalAccessError: class com.google.protobuf.ZeroCopyLiteralByteString cannot access its superclass com.google.protobuf.LiteralByteString

2014-01-10 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868095#comment-13868095
 ] 

Jimmy Xiang commented on HBASE-10304:
-

I have verified that the two workarounds work fine. With the fat hbase jobjar, I 
can run row counter and get correct results.

1. run the job like

HADOOP_CLASSPATH=/path/to/hbase_config:/path/to/hbase-protocol.jar hadoop jar 
fat-hbase-job.jar

2. put the hbase-protocol jar under hadoop/lib so that MR can pick it up, and 
run the job as before

+1 on doc change as the fix for this issue.
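For reference, a minimal sketch of the approach the doc change will likely recommend instead of a fat job jar: build a thin jar and let TableMapReduceUtil ship the HBase jars found on the client classpath (hbase-protocol included) via the distributed cache. MyMapper and the table name "t1" below are hypothetical placeholders:

{code}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class ThinJobJarExample {

  // Hypothetical mapper; per-row work would go in map().
  static class MyMapper extends TableMapper<ImmutableBytesWritable, NullWritable> {
    @Override
    protected void map(ImmutableBytesWritable row, Result value, Context context)
        throws IOException, InterruptedException {
      // no-op
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = Job.getInstance(conf, "thin-jobjar-example");
    job.setJarByClass(ThinJobJarExample.class);
    // initTableMapperJob adds the HBase dependency jars it finds on the client
    // classpath to the job's distributed cache, so nothing HBase-related needs
    // to be baked into the job jar itself.
    TableMapReduceUtil.initTableMapperJob("t1", new Scan(), MyMapper.class,
        ImmutableBytesWritable.class, NullWritable.class, job);
    job.setNumReduceTasks(0);
    job.setOutputFormatClass(NullOutputFormat.class);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
{code}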





 Running an hbase job jar: IllegalAccessError: class 
 com.google.protobuf.ZeroCopyLiteralByteString cannot access its superclass 
 com.google.protobuf.LiteralByteString
 

 Key: HBASE-10304
 URL: https://issues.apache.org/jira/browse/HBASE-10304
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.98.0, 0.96.1.1
Reporter: stack
Priority: Blocker
 Fix For: 0.98.0

 Attachments: hbase-10304_not_tested.patch, jobjar.xml


 (Jimmy has been working on this one internally.  I'm just the messenger 
 raising this critical issue upstream).
 So, if you make a job jar and bundle up hbase inside it because you want to 
 access hbase from your mapreduce task, the deploy of the job jar to the 
 cluster fails with:
 {code}
 14/01/05 08:59:19 INFO Configuration.deprecation: 
 topology.node.switch.mapping.impl is deprecated. Instead, use 
 net.topology.node.switch.mapping.impl
 14/01/05 08:59:19 INFO Configuration.deprecation: io.bytes.per.checksum is 
 deprecated. Instead, use dfs.bytes-per-checksum
 Exception in thread main java.lang.IllegalAccessError: class 
 com.google.protobuf.ZeroCopyLiteralByteString cannot access its superclass 
 com.google.protobuf.LiteralByteString
   at java.lang.ClassLoader.defineClass1(Native Method)
   at java.lang.ClassLoader.defineClass(ClassLoader.java:792)
   at 
 java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
   at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
   at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
   at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
   at 
 org.apache.hadoop.hbase.protobuf.ProtobufUtil.toScan(ProtobufUtil.java:818)
   at 
 org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.convertScanToString(TableMapReduceUtil.java:433)
   at 
 org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:186)
   at 
 org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:147)
   at 
 org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:270)
   at 
 org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:100)
   at 
 com.ngdata.hbaseindexer.mr.HBaseMapReduceIndexerTool.run(HBaseMapReduceIndexerTool.java:124)
   at 
 com.ngdata.hbaseindexer.mr.HBaseMapReduceIndexerTool.run(HBaseMapReduceIndexerTool.java:64)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
   at 
 com.ngdata.hbaseindexer.mr.HBaseMapReduceIndexerTool.main(HBaseMapReduceIndexerTool.java:51)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 {code}
 So, ZCLBS is a hack.  This class is in the hbase-protocol module.  It is in 
 the com.google.protobuf package.  All is well and good usually.
 But when we make a job jar and bundle up hbase inside it, our 'trick' breaks. 
 RunJar makes a new class loader to run the job jar. This URLClassLoader 
 'attaches' all the jars and classes that are in the jobjar so they can be found 
 when it goes to do a lookup. Only, classloaders work by always delegating to 
 their parent first (unless you are a WAR file in a container, where delegation 
 is 'off' for the most part), and in this case the parent classloader will 
 have access to a pb jar since pb is in the hadoop CLASSPATH. So, the parent 
 loads the pb 

[jira] [Updated] (HBASE-10307) IntegrationTestIngestWithEncryption assumes localhost cluster

2014-01-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-10307:
---

Attachment: 10307.patch

What I committed to trunk and 0.98. It's the same patch as reviewed but with 
the addition of a main method. Confirmed to work on a small test cluster.
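For anyone curious, the added main follows the usual integration-test pattern; this is a sketch of that pattern, not necessarily the exact committed code:

{code}
// Inside IntegrationTestIngestWithEncryption; imports assumed:
// org.apache.hadoop.conf.Configuration, org.apache.hadoop.hbase.HBaseConfiguration,
// org.apache.hadoop.hbase.IntegrationTestingUtility, org.apache.hadoop.util.ToolRunner
public static void main(String[] args) throws Exception {
  Configuration conf = HBaseConfiguration.create();
  // Tell the test to drive an existing distributed cluster instead of spinning
  // up a local mini cluster.
  IntegrationTestingUtility.setUseDistributedCluster(conf);
  int status = ToolRunner.run(conf, new IntegrationTestIngestWithEncryption(), args);
  System.exit(status);
}
{code}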

 IntegrationTestIngestWithEncryption assumes localhost cluster
 -

 Key: HBASE-10307
 URL: https://issues.apache.org/jira/browse/HBASE-10307
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0, 0.99.0
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.98.0, 0.99.0

 Attachments: 10307.patch, 10307.patch, 10307.patch


 We forgot to update IntegrationTestIngestWithEncryption to handle the 
 distributed cluster case.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10200) Better error message when HttpServer fails to start due to java.net.BindException

2014-01-10 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10200:
---

Labels: noob  (was: )

 Better error message when HttpServer fails to start due to 
 java.net.BindException
 -

 Key: HBASE-10200
 URL: https://issues.apache.org/jira/browse/HBASE-10200
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu
Priority: Minor
  Labels: noob

 Starting HBase using Hoya, I saw the following in log:
 {code}
 2013-12-17 21:49:06,758 INFO  [master:hor12n19:42587] http.HttpServer: 
 HttpServer.start() threw a non Bind IOException
 java.net.BindException: Port in use: hor12n14.gq1.ygridcore.net:12432
 at org.apache.hadoop.http.HttpServer.openListener(HttpServer.java:742)
 at org.apache.hadoop.http.HttpServer.start(HttpServer.java:686)
 at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:586)
 at java.lang.Thread.run(Thread.java:722)
 Caused by: java.net.BindException: Cannot assign requested address
 at sun.nio.ch.Net.bind0(Native Method)
 at sun.nio.ch.Net.bind(Net.java:344)
 at sun.nio.ch.Net.bind(Net.java:336)
 at 
 sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:199)
 at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
 at 
 org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
 at org.apache.hadoop.http.HttpServer.openListener(HttpServer.java:738)
 {code}
 This was due to hbase.master.info.bindAddress giving a static address while 
 Hoya allocates the master dynamically.
 A better error message should be provided: when bindAddress points to a host 
 other than the local host, the message should remind the user to remove / 
 adjust the hbase.master.info.bindAddress config param in hbase-site.xml.
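As a rough illustration of the requested behaviour (a sketch only, not the actual HttpServer/HMaster code), the check behind such a message could look something like this:

{code}
import java.net.InetAddress;
import java.net.NetworkInterface;

import org.apache.hadoop.conf.Configuration;

public class BindAddressCheck {
  // Warn when the configured info-server bind address is not an address of
  // this host, which is what produces the confusing BindException above.
  static void warnIfNotLocal(Configuration conf) throws Exception {
    String bindAddress = conf.get("hbase.master.info.bindAddress", "0.0.0.0");
    InetAddress addr = InetAddress.getByName(bindAddress);
    if (!addr.isAnyLocalAddress() && NetworkInterface.getByInetAddress(addr) == null) {
      System.err.println("hbase.master.info.bindAddress=" + bindAddress
          + " is not an address of this host; remove or adjust it in hbase-site.xml");
    }
  }
}
{code}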



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Resolved] (HBASE-10307) IntegrationTestIngestWithEncryption assumes localhost cluster

2014-01-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-10307.


  Resolution: Fixed
Hadoop Flags: Reviewed

 IntegrationTestIngestWithEncryption assumes localhost cluster
 -

 Key: HBASE-10307
 URL: https://issues.apache.org/jira/browse/HBASE-10307
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0, 0.99.0
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.98.0, 0.99.0

 Attachments: 10307.patch, 10307.patch, 10307.patch


 We forgot to update IntegrationTestIngestWithEncryption to handle the 
 distributed cluster case.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10294) Some synchronization on ServerManager#onlineServers can be removed

2014-01-10 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868111#comment-13868111
 ] 

Sergey Shelukhin commented on HBASE-10294:
--

+1

 Some synchronization on ServerManager#onlineServers can be removed
 --

 Key: HBASE-10294
 URL: https://issues.apache.org/jira/browse/HBASE-10294
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Minor
 Attachments: 10294-v1.txt


 ServerManager#onlineServers is a ConcurrentHashMap
 Yet I found that some accesses to it are synchronized and unnecessary.
 Here is one example:
 {code}
   public Map<ServerName, ServerLoad> getOnlineServers() {
 // Presumption is that iterating the returned Map is OK.
 synchronized (this.onlineServers) {
   return Collections.unmodifiableMap(this.onlineServers);
 {code}
 Note: not all accesses to ServerManager#onlineServers are synchronized.
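A minimal sketch (placeholder types, not the actual ServerManager code) of why the synchronized block adds nothing here: ConcurrentHashMap iterators are weakly consistent and never throw ConcurrentModificationException, so the unmodifiable view can be returned directly.

{code}
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class OnlineServersSketch {
  // String/Integer stand in for ServerName/ServerLoad.
  private final Map<String, Integer> onlineServers =
      new ConcurrentHashMap<String, Integer>();

  // No synchronization needed: the returned view is backed by a concurrent map,
  // so callers may iterate it while other threads add or remove servers.
  public Map<String, Integer> getOnlineServers() {
    return Collections.unmodifiableMap(this.onlineServers);
  }
}
{code}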



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10304) Running an hbase job jar: IllegalAccessError: class com.google.protobuf.ZeroCopyLiteralByteString cannot access its superclass com.google.protobuf.LiteralByteString

2014-01-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868114#comment-13868114
 ] 

stack commented on HBASE-10304:
---

Agree with [~jxiang]

I could have a go at it, np, but [~ndimiduk], you have an opinion on where we 
should be going that I like (Deprecate fat job jar) and you are a better 
writer... do you want to do up a doc patch?

 Running an hbase job jar: IllegalAccessError: class 
 com.google.protobuf.ZeroCopyLiteralByteString cannot access its superclass 
 com.google.protobuf.LiteralByteString
 

 Key: HBASE-10304
 URL: https://issues.apache.org/jira/browse/HBASE-10304
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.98.0, 0.96.1.1
Reporter: stack
Priority: Blocker
 Fix For: 0.98.0

 Attachments: hbase-10304_not_tested.patch, jobjar.xml


 (Jimmy has been working on this one internally.  I'm just the messenger 
 raising this critical issue upstream).
 So, if you make a job jar and bundle up hbase inside it because you want to 
 access hbase from your mapreduce task, the deploy of the job jar to the 
 cluster fails with:
 {code}
 14/01/05 08:59:19 INFO Configuration.deprecation: 
 topology.node.switch.mapping.impl is deprecated. Instead, use 
 net.topology.node.switch.mapping.impl
 14/01/05 08:59:19 INFO Configuration.deprecation: io.bytes.per.checksum is 
 deprecated. Instead, use dfs.bytes-per-checksum
 Exception in thread main java.lang.IllegalAccessError: class 
 com.google.protobuf.ZeroCopyLiteralByteString cannot access its superclass 
 com.google.protobuf.LiteralByteString
   at java.lang.ClassLoader.defineClass1(Native Method)
   at java.lang.ClassLoader.defineClass(ClassLoader.java:792)
   at 
 java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
   at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
   at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
   at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
   at 
 org.apache.hadoop.hbase.protobuf.ProtobufUtil.toScan(ProtobufUtil.java:818)
   at 
 org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.convertScanToString(TableMapReduceUtil.java:433)
   at 
 org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:186)
   at 
 org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:147)
   at 
 org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:270)
   at 
 org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:100)
   at 
 com.ngdata.hbaseindexer.mr.HBaseMapReduceIndexerTool.run(HBaseMapReduceIndexerTool.java:124)
   at 
 com.ngdata.hbaseindexer.mr.HBaseMapReduceIndexerTool.run(HBaseMapReduceIndexerTool.java:64)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
   at 
 com.ngdata.hbaseindexer.mr.HBaseMapReduceIndexerTool.main(HBaseMapReduceIndexerTool.java:51)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 {code}
 So, ZCLBS is a hack.  This class is in the hbase-protocol module.  It is in 
 the com.google.protobuf package.  All is well and good usually.
 But when we make a job jar and bundle up hbase inside it, our 'trick' breaks. 
 RunJar makes a new class loader to run the job jar. This URLClassLoader 
 'attaches' all the jars and classes that are in the jobjar so they can be found 
 when it goes to do a lookup. Only, classloaders work by always delegating to 
 their parent first (unless you are a WAR file in a container, where delegation 
 is 'off' for the most part), and in this case the parent classloader will 
 have access to a pb jar since pb is in the hadoop CLASSPATH. So, the parent 
 loads the pb classes.
 We then load ZCLBS, only this is done in the classloader made by RunJar; 
 ZCLBS has a different classloader from its superclass, and we get the above 
 IllegalAccessError.
 

[jira] [Closed] (HBASE-10311) Add Scan object to preScannerNext and postScannerNext methods on RegionObserver

2014-01-10 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack closed HBASE-10311.
-


Closing at Mr [~neilf]'s suggestion.  This seems like something we should doc 
in the CP section of the refguide given two of you fellas bumped into the prob.  
Let me try to do that and point to this issue.

 Add Scan object to preScannerNext and postScannerNext methods on 
 RegionObserver
 ---

 Key: HBASE-10311
 URL: https://issues.apache.org/jira/browse/HBASE-10311
 Project: HBase
  Issue Type: New Feature
  Components: Coprocessors
Affects Versions: 0.96.1.1
Reporter: Neil Ferguson
 Attachments: HBASE-10311.patch


 I'd like to be able to access the org.apache.hadoop.hbase.client.Scan that 
 was used to create a scanner in the RegionObserver.preScannerNext and 
 RegionObserver.postScannerNext methods.
 The Scan object is available in the preScannerOpen method, but not in the 
 preScannerNext or postScannerNext methods.
 The reason is that I'd like to access the attributes of the Scan object. I 
 want to do some resource management in the coprocessor based on some 
 attributes of the Scan object (like, who created it).
 Alternatively, does anybody know of another way to get hold of the Scan 
 object in these methods without modifying things?



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10310) ZNodeCleaner.java KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/master

2014-01-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868126#comment-13868126
 ] 

stack commented on HBASE-10310:
---

+1 for 0.96.  Bug fix.  Thanks [~asamir]

 ZNodeCleaner.java KeeperException$SessionExpiredException: KeeperErrorCode = 
 Session expired for /hbase/master
 --

 Key: HBASE-10310
 URL: https://issues.apache.org/jira/browse/HBASE-10310
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.96.1.1
 Environment: x86_64 GNU/Linux
Reporter: Samir Ahmic
Assignee: Samir Ahmic
 Attachments: HBASE-10310.patch


 I was testing hbase master clear command while working on [HBASE-7386] here 
 is command and exception:
 {code}
 $ export HBASE_ZNODE_FILE=/tmp/hbase-hadoop-master.znode; ./hbase master clear
 14/01/10 14:05:44 INFO zookeeper.ZooKeeper: Initiating client connection, 
 connectString=zk1:2181 sessionTimeout=9 watcher=clean znode for master, 
 quorum=zk1:2181, baseZNode=/hbase
 14/01/10 14:05:44 INFO zookeeper.RecoverableZooKeeper: Process 
 identifier=clean znode for master connecting to ZooKeeper ensemble=zk1:2181
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Opening socket connection to 
 server zk1/172.17.33.5:2181. Will not attempt to authenticate using SASL 
 (Unable to locate a login configuration)
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Socket connection established to 
 zk11/172.17.33.5:2181, initiating session
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Session establishment complete 
 on server zk1/172.17.33.5:2181, sessionid = 0x1427a96bfea4a8a, negotiated 
 timeout = 4
 14/01/10 14:05:44 INFO zookeeper.ZooKeeper: Session: 0x1427a96bfea4a8a closed
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: EventThread shut down
 14/01/10 14:05:44 WARN zookeeper.RecoverableZooKeeper: Possibly transient 
 ZooKeeper, quorum=zk1:2181, 
 exception=org.apache.zookeeper.KeeperException$SessionExpiredException: 
 KeeperErrorCode = Session expired for /hbase/master
 14/01/10 14:05:44 INFO util.RetryCounter: Sleeping 1000ms before retry #0...
 14/01/10 14:05:45 WARN zookeeper.RecoverableZooKeeper: Possibly transient 
 ZooKeeper, quorum=zk1:2181, 
 exception=org.apache.zookeeper.KeeperException$SessionExpiredException: 
 KeeperErrorCode = Session expired for /hbase/master
 14/01/10 14:05:45 ERROR zookeeper.RecoverableZooKeeper: ZooKeeper getData 
 failed after 1 attempts
 14/01/10 14:05:45 WARN zookeeper.ZKUtil: clean znode for 
 master-0x1427a96bfea4a8a, quorum=zk1:2181, baseZNode=/hbase Unable to get 
 data of znode /hbase/master
 org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
 = Session expired for /hbase/master
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:337)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:777)
   at 
 org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.deleteIfEquals(MasterAddressTracker.java:170)
   at org.apache.hadoop.hbase.ZNodeClearer.clear(ZNodeClearer.java:160)
   at 
 org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:138)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
   at 
 org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
   at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2779)
 14/01/10 14:05:45 ERROR zookeeper.ZooKeeperWatcher: clean znode for 
 master-0x1427a96bfea4a8a, quorum=zk1:2181, baseZNode=/hbase Received 
 unexpected KeeperException, re-throwing exception
 org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
 = Session expired for /hbase/master
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:337)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:777)
   at 
 org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.deleteIfEquals(MasterAddressTracker.java:170)
   at org.apache.hadoop.hbase.ZNodeClearer.clear(ZNodeClearer.java:160)
   at 
 org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:138)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
   at 
 

[jira] [Updated] (HBASE-9914) Port fix for HBASE-9836 'Intermittent TestRegionObserverScannerOpenHook#testRegionObserverCompactionTimeStacking failure' to 0.94

2014-01-10 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-9914:
--

Labels: noob  (was: )

 Port fix for HBASE-9836 'Intermittent 
 TestRegionObserverScannerOpenHook#testRegionObserverCompactionTimeStacking 
 failure' to 0.94
 -

 Key: HBASE-9914
 URL: https://issues.apache.org/jira/browse/HBASE-9914
 Project: HBase
  Issue Type: Test
Reporter: Ted Yu
  Labels: noob

 According to this thread: http://search-hadoop.com/m/3CzC31BQsDd , 
 TestRegionObserverScannerOpenHook#testRegionObserverCompactionTimeStacking 
 sometimes failed.
 This issue is to port the fix from HBASE-9836 to 0.94



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Reopened] (HBASE-10292) TestRegionServerCoprocessorExceptionWithAbort fails occasionally

2014-01-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell reopened HBASE-10292:



Still seeing this occasionally

 TestRegionServerCoprocessorExceptionWithAbort fails occasionally
 

 Key: HBASE-10292
 URL: https://issues.apache.org/jira/browse/HBASE-10292
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.98.0, 0.99.0

 Attachments: 10292.patch, 10292.patch


 TestRegionServerCoprocessorExceptionWithAbort has occasionally failed for a 
 very long time now. Fix or disable.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10309) Add support to delete empty regions in 0.94.x series

2014-01-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868136#comment-13868136
 ] 

stack commented on HBASE-10309:
---

Removing a single empty region is not possible without a region merge facility.  
If there are multiple adjacent empty regions, you could replace them all with a 
single empty region that spans the deleted regions easily enough.

 Add support to delete empty regions in 0.94.x series
 

 Key: HBASE-10309
 URL: https://issues.apache.org/jira/browse/HBASE-10309
 Project: HBase
  Issue Type: New Feature
Reporter: AcCud
 Fix For: 0.94.16


 My use case: I have several tables where keys start with a timestamp. Because 
 of this, combined with the fact that I have set a 15-day retention period, 
 empty regions result after a period of time.
 I am sure that no writes will occur in these regions.
 It would be nice to have a tool to delete regions without it being necessary 
 to stop the cluster.
 The easiest way for me is to have a tool that is able to delete all empty 
 regions, but there wouldn't be any problem with specifying which region to 
 delete.
 Something like:
 deleteRegion tableName region



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10305) Batch update performance drops as the number of regions grows

2014-01-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868140#comment-13868140
 ] 

stack commented on HBASE-10305:
---

Does HBASE-8755 help?  It batches up the sync invocations so that each 
filesystem sync satisfies more than just the one Handler API sync invocation.  
You'll get to a sync cadence that should be roughly independent of the number 
of times the Handler calls sync.
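To make the batching idea concrete, here is a minimal group-commit sketch (an illustration only, not the HBASE-8755 code): handler threads park on a latch, and a single syncer thread drains every pending request, performs one simulated filesystem sync, and then releases the whole batch.

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;

public class GroupCommitSketch {
  private final LinkedBlockingQueue<CountDownLatch> pending =
      new LinkedBlockingQueue<CountDownLatch>();

  public GroupCommitSketch() {
    Thread syncer = new Thread(new Runnable() {
      public void run() {
        try {
          while (true) {
            List<CountDownLatch> batch = new ArrayList<CountDownLatch>();
            batch.add(pending.take());   // wait for at least one sync request
            pending.drainTo(batch);      // grab everything else queued so far
            syncFileSystem();            // ONE sync covers the whole batch
            for (CountDownLatch latch : batch) {
              latch.countDown();
            }
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      }
    }, "syncer");
    syncer.setDaemon(true);
    syncer.start();
  }

  // Called by handler threads; returns once the requested sync is durable.
  public void sync() throws InterruptedException {
    CountDownLatch done = new CountDownLatch(1);
    pending.put(done);
    done.await();
  }

  private void syncFileSystem() {
    // stand-in for an expensive HDFS hflush/hsync
  }
}
{code}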

 Batch update performance drops as the number of regions grows
 -

 Key: HBASE-10305
 URL: https://issues.apache.org/jira/browse/HBASE-10305
 Project: HBase
  Issue Type: Bug
  Components: Performance
Reporter: Chao Shi

 In our use case, we use a small number (~5) of proxy programs that read from 
 a queue and batch update to HBase. Our program is multi-threaded and HBase 
 client will batch mutations to each RS.
 We found we're getting lower TPS when there are more regions. I think the 
 reason is RS syncs HLog for each region. Suppose there is a single region, 
 the batch update will only touch one region and therefore syncs HLog once. 
 And suppose there are 10 regions per server, in RS#multi() it have to process 
 update for each individual region and sync HLog 10 times.
 Please note that in our scenario, batched mutations usually are independent 
 with each other and need to touch a various number of regions.
 We are using the 0.94 series, but I think the trunk should have the same 
 problem after a quick look into the code.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10294) Some synchronization on ServerManager#onlineServers can be removed

2014-01-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868143#comment-13868143
 ] 

Hadoop QA commented on HBASE-10294:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12622406/10294-v1.txt
  against trunk revision .
  ATTACHMENT ID: 12622406

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8385//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8385//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8385//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8385//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8385//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8385//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8385//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8385//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8385//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8385//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8385//console

This message is automatically generated.

 Some synchronization on ServerManager#onlineServers can be removed
 --

 Key: HBASE-10294
 URL: https://issues.apache.org/jira/browse/HBASE-10294
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Minor
 Attachments: 10294-v1.txt


 ServerManager#onlineServers is a ConcurrentHashMap
 Yet I found that some accesses to it are synchronized and unnecessary.
 Here is one example:
 {code}
   public Map<ServerName, ServerLoad> getOnlineServers() {
 // Presumption is that iterating the returned Map is OK.
 synchronized (this.onlineServers) {
   return Collections.unmodifiableMap(this.onlineServers);
 {code}
 Note: not all accesses to ServerManager#onlineServers are synchronized.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HBASE-10312) Flooding the cluster with administrative actions leads to collapse

2014-01-10 Thread Andrew Purtell (JIRA)
Andrew Purtell created HBASE-10312:
--

 Summary: Flooding the cluster with administrative actions leads to 
collapse
 Key: HBASE-10312
 URL: https://issues.apache.org/jira/browse/HBASE-10312
 Project: HBase
  Issue Type: Bug
Reporter: Andrew Purtell


Steps to reproduce:
1. Start a cluster.
2. Start an ingest process.
3. In the HBase shell, do this:
{noformat}
while true ; do
   flush 'table'
end
{noformat}

We should reject abuse via administrative requests like this.

What happens on the cluster is the requests back up, leading to lots of these:
{noformat}
2014-01-10 18:55:55,293 WARN  [Priority.RpcServer.handler=2,port=8120] 
monitoring.TaskMonitor: Too many actions in action monitor! Purging some.
{noformat}

At this point we could lower a gate on further requests for actions until the 
backlog clears.

Continuing, all of the regionservers will eventually die with a 
StackOverflowError of unknown origin because, stack overflow:

{noformat}
2014-01-10 19:02:02,783 ERROR [Priority.RpcServer.handler=3,port=8120] 
ipc.RpcServer: Unexpected throwable object java.lang.StackOverflowError
at java.util.ArrayList$SubList.add(ArrayList.java:965)
[...]
{noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HBASE-10313) Duplicate servlet-api jars in hbase 0.96.0

2014-01-10 Thread stack (JIRA)
stack created HBASE-10313:
-

 Summary: Duplicate servlet-api jars in hbase 0.96.0
 Key: HBASE-10313
 URL: https://issues.apache.org/jira/browse/HBASE-10313
 Project: HBase
  Issue Type: Bug
Reporter: stack
Priority: Critical
 Fix For: 0.96.2


On mailing list, http://search-hadoop.com/m/wtCkHs5Ujq, [~jerryhe] reports we 
have doubled jars:

{code}
[biadmin@hdtest009 lib]$ ls -l jsp-api*
-rw-rw-r-- 1 biadmin biadmin 134910 Sep 17 01:13 jsp-api-2.1-6.1.14.jar
-rw-rw-r-- 1 biadmin biadmin 100636 Sep 17 01:27 jsp-api-2.1.jar

[biadmin@hdtest009 lib]$ ls -l servlet-api*
-rw-rw-r-- 1 biadmin biadmin 132368 Sep 17 01:13 servlet-api-2.5-6.1.14.jar
-rw-rw-r-- 1 biadmin biadmin 105112 Sep 17 01:12 servlet-api-2.5.jar
{code}

Fix in 0.96.2.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10313) Duplicate servlet-api jars in hbase 0.96.0

2014-01-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868190#comment-13868190
 ] 

stack commented on HBASE-10313:
---

stax-api also came up recently in offline discussion.

 Duplicate servlet-api jars in hbase 0.96.0
 --

 Key: HBASE-10313
 URL: https://issues.apache.org/jira/browse/HBASE-10313
 Project: HBase
  Issue Type: Bug
Reporter: stack
Priority: Critical
 Fix For: 0.96.2


 On mailing list, http://search-hadoop.com/m/wtCkHs5Ujq, [~jerryhe] reports we 
 have doubled jars:
 {code}
 [biadmin@hdtest009 lib]$ ls -l jsp-api*
 -rw-rw-r-- 1 biadmin biadmin 134910 Sep 17 01:13 jsp-api-2.1-6.1.14.jar
 -rw-rw-r-- 1 biadmin biadmin 100636 Sep 17 01:27 jsp-api-2.1.jar
 [biadmin@hdtest009 lib]$ ls -l servlet-api*
 -rw-rw-r-- 1 biadmin biadmin 132368 Sep 17 01:13 servlet-api-2.5-6.1.14.jar
 -rw-rw-r-- 1 biadmin biadmin 105112 Sep 17 01:12 servlet-api-2.5.jar
 {code}
 Fix in 0.96.2.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10312) Flooding the cluster with administrative actions leads to collapse

2014-01-10 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868196#comment-13868196
 ] 

Andrew Purtell commented on HBASE-10312:


With the AccessController active, only users granted ADMIN privilege can do 
this, so it's not a critical issue unless enabling security is not an option 
for the deployment.

 Flooding the cluster with administrative actions leads to collapse
 --

 Key: HBASE-10312
 URL: https://issues.apache.org/jira/browse/HBASE-10312
 Project: HBase
  Issue Type: Bug
Reporter: Andrew Purtell

 Steps to reproduce:
 1. Start a cluster.
 2. Start an ingest process.
 3. In the HBase shell, do this:
 {noformat}
 while true ; do
flush 'table'
 end
 {noformat}
 We should reject abuse via administrative requests like this.
 What happens on the cluster is the requests back up, leading to lots of these:
 {noformat}
 2014-01-10 18:55:55,293 WARN  [Priority.RpcServer.handler=2,port=8120] 
 monitoring.TaskMonitor: Too many actions in action monitor! Purging some.
 {noformat}
 At this point we could lower a gate on further requests for actions until the 
 backlog clears.
 Continuing, all of the regionservers will eventually die with a 
 StackOverflowError of unknown origin because, stack overflow:
 {noformat}
 2014-01-10 19:02:02,783 ERROR [Priority.RpcServer.handler=3,port=8120] 
 ipc.RpcServer: Unexpected throwable object java.lang.StackOverflowError
 at java.util.ArrayList$SubList.add(ArrayList.java:965)
 [...]
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10303) Have snappy support properly documented would be helpful to hadoop and hbase users

2014-01-10 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-10303:
--

 Priority: Blocker  (was: Major)
Fix Version/s: 0.96.2

Making this a blocker against 0.96.2. If anyone has anything to add to Rural's 
notes, it'd be appreciated. ([~jmspaggi] Do we need to integrate your page into 
the refguide?)

 Have snappy support properly documented would be helpful to hadoop and hbase 
 users
 --

 Key: HBASE-10303
 URL: https://issues.apache.org/jira/browse/HBASE-10303
 Project: HBase
  Issue Type: Task
  Components: documentation
Reporter: Rural Hunter
Priority: Blocker
 Fix For: 0.96.2


 The current document for configuring snappy 
 support (http://hbase.apache.org/book/snappy.compression.html) is not complete 
 and it's a bit obscure. IMO, there are several improvements that can be made:
 1. Describe the relationship among hadoop, hbase, and snappy. Is snappy 
 actually needed by hadoop hdfs or by hbase itself? That's to make clear whether 
 you need to configure snappy support in hbase or in hadoop.
 2. It doesn't mention that the default hadoop binary package is compiled 
 without snappy support and that you need to compile it with the snappy option 
 manually. Actually it didn't work with any native libs on a 64-bit OS, as the 
 libhadoop.so in the binary package is only for a 32-bit OS (this of course is a 
 hadoop issue, not hbase, but it's good to mention it).
 3. In my experience, I actually needed to install both snappy and 
 hadoop-snappy. So the doc lacks the steps to install hadoop-snappy. 
 4. During my setup, I found a difference in where hadoop and hbase pick up the 
 native lib files: hadoop picks those files up in ./lib while hbase picks them 
 up in ./lib/[PLATFORM]. If that's correct, it can also be mentioned.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HBASE-10314) Add Chaos Monkey that doesn't touch the master

2014-01-10 Thread Elliott Clark (JIRA)
Elliott Clark created HBASE-10314:
-

 Summary: Add Chaos Monkey that doesn't touch the master
 Key: HBASE-10314
 URL: https://issues.apache.org/jira/browse/HBASE-10314
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.96.1.1, 0.98.0, 0.99.0
Reporter: Elliott Clark
Assignee: Elliott Clark






--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion

2014-01-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868220#comment-13868220
 ] 

stack commented on HBASE-7386:
--

Related, from our boys at Xiaomi  https://github.com/XiaoMi/minos

 Investigate providing some supervisor support for znode deletion
 

 Key: HBASE-7386
 URL: https://issues.apache.org/jira/browse/HBASE-7386
 Project: HBase
  Issue Type: Task
  Components: master, regionserver, scripts
Reporter: Gregory Chanan
Assignee: stack
Priority: Blocker
 Attachments: HBASE-7386-bin-v2.patch, HBASE-7386-bin.patch, 
 HBASE-7386-conf-v2.patch, HBASE-7386-conf.patch, HBASE-7386-src.patch, 
 HBASE-7386-v0.patch, supervisordconfigs-v0.patch


 There a couple of JIRAs for deleting the znode on a process failure:
 HBASE-5844 (RS)
 HBASE-5926 (Master)
 which are pretty neat; on process failure, they delete the znode of the 
 underlying process so HBase can recover faster.
 These JIRAs were implemented via the startup scripts; i.e. the script hangs 
 around and waits for the process to exit, then deletes the znode.
 There are a few problems associated with this approach, as listed in the 
 below JIRAs:
 1) Hides startup output in script
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 2) two hbase processes listed per launched daemon
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 3) Not run by a real supervisor
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
 4) Weird output after kill -9 actual process in standalone mode
 https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
 5) Can kill existing RS if called again
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
 6) Hides stdout/stderr[6]
 https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
 I suspect running in via something like supervisor.d can solve these issues 
 if we provide the right support.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10308) TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare fails occasionally

2014-01-10 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868218#comment-13868218
 ] 

Lars Hofhansl commented on HBASE-10308:
---

Sorry I missed the earlier issue.

Do you have any hunches about what the problem might be?


 TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare fails 
 occasionally
 

 Key: HBASE-10308
 URL: https://issues.apache.org/jira/browse/HBASE-10308
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Fix For: 0.94.16, 0.96.2, 0.98.1, 0.99.0


 Seen in 0.94 (both JDK6 and JDK7 builds)
 {code}
 Error Message
  Wanted but not invoked: procedure.sendGlobalBarrierComplete(); - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:344)
   However, there were other interactions with this mock: - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:306)
  - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:311)
  - at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:204) 
 - at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:204) - 
 at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.memberAcquiredBarrier(ProcedureCoordinator.java:262)
  - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.memberAcquiredBarrier(ProcedureCoordinator.java:262)
  - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.abortProcedure(ProcedureCoordinator.java:217)
  - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:337)
  
 Stacktrace
 Wanted but not invoked:
 procedure.sendGlobalBarrierComplete();
 - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:344)
 However, there were other interactions with this mock:
 - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:306)
 - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:311)
 - at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:204)
 - at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:204)
 - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.memberAcquiredBarrier(ProcedureCoordinator.java:262)
 - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.memberAcquiredBarrier(ProcedureCoordinator.java:262)
 - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.abortProcedure(ProcedureCoordinator.java:217)
 - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:337)
   at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:344)
   at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:319)
 {code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10312) Flooding the cluster with administrative actions leads to collapse

2014-01-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-10312:
---

Description: 
Steps to reproduce:
1. Start a cluster.
2. Start an ingest process.
3. In the HBase shell, do this:
{noformat}
while true do
   flush 'table'
end
{noformat}

We should reject abuse via administrative requests like this.

What happens on the cluster is the requests back up, leading to lots of these:
{noformat}
2014-01-10 18:55:55,293 WARN  [Priority.RpcServer.handler=2,port=8120] 
monitoring.TaskMonitor: Too many actions in action monitor! Purging some.
{noformat}

At this point we could lower a gate on further requests for actions until the 
backlog clears.

Continuing, all of the regionservers will eventually die with a 
StackOverflowError of unknown origin because, stack overflow:

{noformat}
2014-01-10 19:02:02,783 ERROR [Priority.RpcServer.handler=3,port=8120] 
ipc.RpcServer: Unexpected throwable object java.lang.StackOverflowError
at java.util.ArrayList$SubList.add(ArrayList.java:965)
[...]
{noformat}

  was:
Steps to reproduce:
1. Start a cluster.
2. Start an ingest process.
3. In the HBase shell, do this:
{noformat}
while true ; do
   flush 'table'
end
{noformat}

We should reject abuse via administrative requests like this.

What happens on the cluster is the requests back up, leading to lots of these:
{noformat}
2014-01-10 18:55:55,293 WARN  [Priority.RpcServer.handler=2,port=8120] 
monitoring.TaskMonitor: Too many actions in action monitor! Purging some.
{noformat}

At this point we could lower a gate on further requests for actions until the 
backlog clears.

Continuing, all of the regionservers will eventually die with a 
StackOverflowError of unknown origin because, stack overflow:

{noformat}
2014-01-10 19:02:02,783 ERROR [Priority.RpcServer.handler=3,port=8120] 
ipc.RpcServer: Unexpected throwable object java.lang.StackOverflowError
at java.util.ArrayList$SubList.add(ArrayList.java:965)
[...]
{noformat}


 Flooding the cluster with administrative actions leads to collapse
 --

 Key: HBASE-10312
 URL: https://issues.apache.org/jira/browse/HBASE-10312
 Project: HBase
  Issue Type: Bug
Reporter: Andrew Purtell

 Steps to reproduce:
 1. Start a cluster.
 2. Start an ingest process.
 3. In the HBase shell, do this:
 {noformat}
 while true do
flush 'table'
 end
 {noformat}
 We should reject abuse via administrative requests like this.
 What happens on the cluster is the requests back up, leading to lots of these:
 {noformat}
 2014-01-10 18:55:55,293 WARN  [Priority.RpcServer.handler=2,port=8120] 
 monitoring.TaskMonitor: Too many actions in action monitor! Purging some.
 {noformat}
 At this point we could lower a gate on further requests for actions until the 
 backlog clears.
 Continuing, all of the regionservers will eventually die with a 
 StackOverflowError of unknown origin because, stack overflow:
 {noformat}
 2014-01-10 19:02:02,783 ERROR [Priority.RpcServer.handler=3,port=8120] 
 ipc.RpcServer: Unexpected throwable object java.lang.StackOverflowError
 at java.util.ArrayList$SubList.add(ArrayList.java:965)
 [...]
 {noformat}
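
As a rough sketch of the "lower a gate" idea above (all names here are hypothetical; this is not existing HBase API), a per-server admission check for administrative actions could be as simple as a bounded counter that rejects new requests once the backlog is too deep:

{code}
// Hypothetical sketch: reject admin actions instead of letting them pile up.
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

public class AdminActionGate {
  private final int maxPendingActions;
  private final AtomicInteger pending = new AtomicInteger();

  public AdminActionGate(int maxPendingActions) {
    this.maxPendingActions = maxPendingActions;
  }

  /** Call before queuing a flush/compact/etc. request. */
  public void admit() throws IOException {
    if (pending.incrementAndGet() > maxPendingActions) {
      pending.decrementAndGet();
      throw new IOException("Too many pending admin actions, rejecting request");
    }
  }

  /** Call when the action completes (or fails). */
  public void release() {
    pending.decrementAndGet();
  }
}
{code}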



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10309) Add support to delete empty regions in 0.94.x series

2014-01-10 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868224#comment-13868224
 ] 

Lars Hofhansl commented on HBASE-10309:
---

Is it not simpler to remove an empty region? The equivalent of removing the 
directory from HDFS and fixing META the way HBCK does?

This part of the code is not my area of expertise so I might be way off, but it 
seems it should be easier than actually merging regions with data.


 Add support to delete empty regions in 0.94.x series
 

 Key: HBASE-10309
 URL: https://issues.apache.org/jira/browse/HBASE-10309
 Project: HBase
  Issue Type: New Feature
Reporter: AcCud
 Fix For: 0.94.17


 My use case: I have several tables where keys start with a timestamp. Because 
 of this, combined with the fact that I have set a 15-day retention period, 
 empty regions result after a period of time.
 I am sure that no writes will occur in these regions.
 It would be nice to have a tool to delete regions without it being necessary to 
 stop the cluster.
 The easiest way for me is to have a tool that is able to delete all empty 
 regions, but there would be no problem with specifying which region to delete.
 Something like:
 deleteRegion tableName region



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10308) TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare fails occasionally

2014-01-10 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868221#comment-13868221
 ] 

Andrew Purtell commented on HBASE-10308:


No I haven't looked into it.

 TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare fails 
 occasionally
 

 Key: HBASE-10308
 URL: https://issues.apache.org/jira/browse/HBASE-10308
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Fix For: 0.94.16, 0.96.2, 0.98.1, 0.99.0


 Seen in 0.94 (both JDK6 and JDK7 builds)
 {code}
 Error Message
  Wanted but not invoked: procedure.sendGlobalBarrierComplete(); - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:344)
   However, there were other interactions with this mock: - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:306)
  - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:311)
  - at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:204) 
 - at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:204) - 
 at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.memberAcquiredBarrier(ProcedureCoordinator.java:262)
  - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.memberAcquiredBarrier(ProcedureCoordinator.java:262)
  - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.abortProcedure(ProcedureCoordinator.java:217)
  - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:337)
  
 Stacktrace
 Wanted but not invoked:
 procedure.sendGlobalBarrierComplete();
 - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:344)
 However, there were other interactions with this mock:
 - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:306)
 - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:311)
 - at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:204)
 - at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:204)
 - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.memberAcquiredBarrier(ProcedureCoordinator.java:262)
 - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.memberAcquiredBarrier(ProcedureCoordinator.java:262)
 - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.abortProcedure(ProcedureCoordinator.java:217)
 - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:337)
   at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:344)
   at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:319)
 {code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10309) Add support to delete empty regions in 0.94.x series

2014-01-10 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-10309:
--

Fix Version/s: (was: 0.94.16)
   0.94.17

 Add support to delete empty regions in 0.94.x series
 

 Key: HBASE-10309
 URL: https://issues.apache.org/jira/browse/HBASE-10309
 Project: HBase
  Issue Type: New Feature
Reporter: AcCud
 Fix For: 0.94.17


 My use case: I have several tables where keys start with a timestamp. Because 
 of this, combined with the fact that I have set a 15-day retention period, 
 empty regions result after a period of time.
 I am sure that no writes will occur in these regions.
 It would be nice to have a tool to delete regions without it being necessary to 
 stop the cluster.
 The easiest way for me is to have a tool that is able to delete all empty 
 regions, but there would be no problem with specifying which region to delete.
 Something like:
 deleteRegion tableName region



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10308) TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare fails occasionally

2014-01-10 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-10308:
--

Fix Version/s: (was: 0.94.16)
   0.94.17

 TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare fails 
 occasionally
 

 Key: HBASE-10308
 URL: https://issues.apache.org/jira/browse/HBASE-10308
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Fix For: 0.96.2, 0.98.1, 0.99.0, 0.94.17


 Seen in 0.94 (both JDK6 and JDK7 builds)
 {code}
 Error Message
  Wanted but not invoked: procedure.sendGlobalBarrierComplete(); - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:344)
   However, there were other interactions with this mock: - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:306)
  - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:311)
  - at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:204) 
 - at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:204) - 
 at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.memberAcquiredBarrier(ProcedureCoordinator.java:262)
  - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.memberAcquiredBarrier(ProcedureCoordinator.java:262)
  - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.abortProcedure(ProcedureCoordinator.java:217)
  - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:337)
  
 Stacktrace
 Wanted but not invoked:
 procedure.sendGlobalBarrierComplete();
 - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:344)
 However, there were other interactions with this mock:
 - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:306)
 - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:311)
 - at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:204)
 - at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:204)
 - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.memberAcquiredBarrier(ProcedureCoordinator.java:262)
 - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.memberAcquiredBarrier(ProcedureCoordinator.java:262)
 - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.abortProcedure(ProcedureCoordinator.java:217)
 - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:337)
   at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:344)
   at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:319)
 {code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-9005) Improve documentation around KEEP_DELETED_CELLS, time range scans, and delete markers

2014-01-10 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868229#comment-13868229
 ] 

Lars Hofhansl commented on HBASE-9005:
--

Only targeting 0.99 since this will go into the general documentation area on 
the site.

 Improve documentation around KEEP_DELETED_CELLS, time range scans, and delete 
 markers
 -

 Key: HBASE-9005
 URL: https://issues.apache.org/jira/browse/HBASE-9005
 Project: HBase
  Issue Type: Bug
  Components: documentation
Reporter: Lars Hofhansl
 Fix For: 0.99.0

 Attachments: 9005.txt


 Without KEEP_DELETED_CELLS all timerange queries are broken if their range 
 covers a delete marker.
 As some internal discussions with colleagues showed, this feature is not well 
 understood or documented.
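
For readers of this thread, the flag can be set on a column family from the shell (table and family names below are placeholders; depending on the version the table may need to be disabled first):

{noformat}
hbase> alter 'mytable', {NAME => 'cf', KEEP_DELETED_CELLS => true}
{noformat}

With that set, a time-range scan (e.g. {{Scan.setTimeRange(minStamp, maxStamp)}}) whose range ends before a delete marker can still return the cells the delete would otherwise have masked.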



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-9005) Improve documentation around KEEP_DELETED_CELLS, time range scans, and delete markers

2014-01-10 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-9005:
-

Priority: Minor  (was: Major)

 Improve documentation around KEEP_DELETED_CELLS, time range scans, and delete 
 markers
 -

 Key: HBASE-9005
 URL: https://issues.apache.org/jira/browse/HBASE-9005
 Project: HBase
  Issue Type: Bug
  Components: documentation
Reporter: Lars Hofhansl
Priority: Minor
 Fix For: 0.99.0

 Attachments: 9005.txt


 Without KEEP_DELETED_CELLS all timerange queries are broken if their range 
 covers a delete marker.
 As some internal discussions with colleagues showed, this feature is not well 
 understood or documented.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-9005) Improve documentation around KEEP_DELETED_CELLS, time range scans, and delete markers

2014-01-10 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-9005:
-

Fix Version/s: (was: 0.98.1)
   (was: 0.96.2)
   (was: 0.94.16)
   0.99.0

 Improve documentation around KEEP_DELETED_CELLS, time range scans, and delete 
 markers
 -

 Key: HBASE-9005
 URL: https://issues.apache.org/jira/browse/HBASE-9005
 Project: HBase
  Issue Type: Bug
  Components: documentation
Reporter: Lars Hofhansl
 Fix For: 0.99.0

 Attachments: 9005.txt


 Without KEEP_DELETED_CELLS all timerange queries are broken if their range 
 covers a delete marker.
 As some internal discussions with colleagues showed, this feature is not well 
 understood or documented.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HBASE-10315) Canary shouldn't exit with 3 if there is no master running.

2014-01-10 Thread Elliott Clark (JIRA)
Elliott Clark created HBASE-10315:
-

 Summary: Canary shouldn't exit with 3 if there is no master 
running.
 Key: HBASE-10315
 URL: https://issues.apache.org/jira/browse/HBASE-10315
 Project: HBase
  Issue Type: Bug
  Components: util
Affects Versions: 0.96.1.1, 0.98.0
Reporter: Elliott Clark
Assignee: Elliott Clark


It's possible to time out (when the timeout is below the number of retries to the master) 
before even initializing if there is no master up.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10295) Refactor the replication implementation to eliminate permanent zk node

2014-01-10 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868266#comment-13868266
 ] 

stack commented on HBASE-10295:
---

Make Master arbiter for these new system tables -- only the master can mod them 
-- and then add a response on the heartbeat to update regionservers on the last 
edit? Currently we return a void. See RegionServerReportResponse in 
http://svn.apache.org/viewvc/hbase/trunk/hbase-protocol/src/main/protobuf/RegionServerStatus.proto?view=markup
Could be as simple as the master just replying w/ the timestamp of the last edit. 
If the RS has not seen the new edit, it goes and reads the table.
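
A minimal region-server-side sketch of that idea, assuming a hypothetical timestamp piggybacked on the heartbeat response (none of these names exist today):

{code}
// Hypothetical sketch only: illustrates "master replies w/ timestamp of last edit".
public class ReplicationStateWatcher {
  private long lastSeenEditTs = -1;

  /** Called with the last-edit timestamp the master returned on the heartbeat. */
  public void onHeartbeatResponse(long masterLastEditTs) {
    if (masterLastEditTs > lastSeenEditTs) {
      refreshFromReplicationTable();   // re-read peer/state rows from the system table
      lastSeenEditTs = masterLastEditTs;
    }
  }

  private void refreshFromReplicationTable() {
    // placeholder: scan the replication system table and rebuild in-memory state
  }
}
{code}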

 Refactor the replication  implementation to eliminate permanent zk node
 ---

 Key: HBASE-10295
 URL: https://issues.apache.org/jira/browse/HBASE-10295
 Project: HBase
  Issue Type: Bug
  Components: Replication
Reporter: Feng Honghua
Assignee: Feng Honghua
 Fix For: 0.99.0


 Though this is a broader and bigger change, its original motivation derives 
 from [HBASE-8751|https://issues.apache.org/jira/browse/HBASE-8751]: the newly 
 introduced per-peer tableCFs attribute should be treated the same way as the 
 peer-state, which is a permanent sub-node under the peer node, but using 
 permanent zk nodes is deemed an incorrect practice. So let's refactor to 
 eliminate the permanent zk node. HBASE-8751 can then align its newly 
 introduced per-peer tableCFs attribute with this *correct* implementation theme.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10307) IntegrationTestIngestWithEncryption assumes localhost cluster

2014-01-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868311#comment-13868311
 ] 

Hudson commented on HBASE-10307:


FAILURE: Integrated in HBase-TRUNK #4805 (See 
[https://builds.apache.org/job/HBase-TRUNK/4805/])
HBASE-10307. IntegrationTestIngestWithEncryption assumes localhost cluster 
(apurtell: rev 1557219)
* 
/hbase/trunk/hbase-it/src/test/java/org/apache/hadoop/hbase/IntegrationTestIngestWithEncryption.java


 IntegrationTestIngestWithEncryption assumes localhost cluster
 -

 Key: HBASE-10307
 URL: https://issues.apache.org/jira/browse/HBASE-10307
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0, 0.99.0
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.98.0, 0.99.0

 Attachments: 10307.patch, 10307.patch, 10307.patch


 We forgot to update IntegrationTestIngestWithEncryption to handle the 
 distributed cluster case.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10307) IntegrationTestIngestWithEncryption assumes localhost cluster

2014-01-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868322#comment-13868322
 ] 

Hudson commented on HBASE-10307:


FAILURE: Integrated in HBase-0.98 #68 (See 
[https://builds.apache.org/job/HBase-0.98/68/])
HBASE-10307. IntegrationTestIngestWithEncryption assumes localhost cluster 
(apurtell: rev 1557220)
* 
/hbase/branches/0.98/hbase-it/src/test/java/org/apache/hadoop/hbase/IntegrationTestIngestWithEncryption.java


 IntegrationTestIngestWithEncryption assumes localhost cluster
 -

 Key: HBASE-10307
 URL: https://issues.apache.org/jira/browse/HBASE-10307
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0, 0.99.0
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.98.0, 0.99.0

 Attachments: 10307.patch, 10307.patch, 10307.patch


 We forgot to update IntegrationTestIngestWithEncryption to handle the 
 distributed cluster case.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10304) Running an hbase job jar: IllegalAccessError: class com.google.protobuf.ZeroCopyLiteralByteString cannot access its superclass com.google.protobuf.LiteralByteString

2014-01-10 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868348#comment-13868348
 ] 

Nick Dimiduk commented on HBASE-10304:
--

Sure, I can write something up. I suppose there's no need to deprecate the fat 
jar approach so long as the docs are clear.

 Running an hbase job jar: IllegalAccessError: class 
 com.google.protobuf.ZeroCopyLiteralByteString cannot access its superclass 
 com.google.protobuf.LiteralByteString
 

 Key: HBASE-10304
 URL: https://issues.apache.org/jira/browse/HBASE-10304
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.98.0, 0.96.1.1
Reporter: stack
Priority: Blocker
 Fix For: 0.98.0

 Attachments: hbase-10304_not_tested.patch, jobjar.xml


 (Jimmy has been working on this one internally.  I'm just the messenger 
 raising this critical issue upstream).
 So, if you make a job jar and bundle up hbase inside it because you want to 
 access hbase from your mapreduce task, the deploy of the job jar to the 
 cluster fails with:
 {code}
 14/01/05 08:59:19 INFO Configuration.deprecation: 
 topology.node.switch.mapping.impl is deprecated. Instead, use 
 net.topology.node.switch.mapping.impl
 14/01/05 08:59:19 INFO Configuration.deprecation: io.bytes.per.checksum is 
 deprecated. Instead, use dfs.bytes-per-checksum
 Exception in thread main java.lang.IllegalAccessError: class 
 com.google.protobuf.ZeroCopyLiteralByteString cannot access its superclass 
 com.google.protobuf.LiteralByteString
   at java.lang.ClassLoader.defineClass1(Native Method)
   at java.lang.ClassLoader.defineClass(ClassLoader.java:792)
   at 
 java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
   at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
   at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
   at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
   at 
 org.apache.hadoop.hbase.protobuf.ProtobufUtil.toScan(ProtobufUtil.java:818)
   at 
 org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.convertScanToString(TableMapReduceUtil.java:433)
   at 
 org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:186)
   at 
 org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:147)
   at 
 org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:270)
   at 
 org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:100)
   at 
 com.ngdata.hbaseindexer.mr.HBaseMapReduceIndexerTool.run(HBaseMapReduceIndexerTool.java:124)
   at 
 com.ngdata.hbaseindexer.mr.HBaseMapReduceIndexerTool.run(HBaseMapReduceIndexerTool.java:64)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
   at 
 com.ngdata.hbaseindexer.mr.HBaseMapReduceIndexerTool.main(HBaseMapReduceIndexerTool.java:51)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 {code}
 So, ZCLBS is a hack.  This class is in the hbase-protocol module.  It is in 
 the com.google.protobuf package.  All is well and good usually.
 But when we make a job jar and bundle up hbase inside it, our 'trick' breaks. 
 RunJar makes a new classloader to run the job jar. This URLClassLoader 
 'attaches' all the jars and classes that are in the job jar so they can be found 
 when it goes to do a lookup. Only, classloaders work by always delegating to 
 their parent first (unless you are a WAR file in a container where delegation 
 is 'off' for the most part), and in this case the parent classloader will 
 have access to a pb jar since pb is in the hadoop CLASSPATH. So, the parent 
 loads the pb classes.
 We then load ZCLBS, only this is done in the classloader made by RunJar; 
 ZCLBS has a different classloader from its superclass and we get the above 
 IllegalAccessError.
 Now (Jimmy's work comes in here), this can't be fixed by reflection -- you 
 can't 

[jira] [Commented] (HBASE-10304) Running an hbase job jar: IllegalAccessError: class com.google.protobuf.ZeroCopyLiteralByteString cannot access its superclass com.google.protobuf.LiteralByteString

2014-01-10 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868362#comment-13868362
 ] 

Nick Dimiduk commented on HBASE-10304:
--

Here's some copy we can use. Where in the book would you want something like 
this to live? I suggest the package-info be updated as well.

h3. Problem
Mapreduce jobs submitted to the cluster via a fat jar, that is, a jar 
containing a 'lib' directory with their runtime dependencies, fail to launch. 
The symptom is an exception similar to the following:

{noformat}
Exception in thread main java.lang.IllegalAccessError: class 
com.google.protobuf.ZeroCopyLiteralByteString cannot access its superclass 
com.google.protobuf.LiteralByteString
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:792)
at 
java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at 
org.apache.hadoop.hbase.protobuf.ProtobufUtil.toScan(ProtobufUtil.java:818)
at 
org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.convertScanToString(TableMapReduceUtil.java:433)
at 
org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:186)
at 
org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:147)
at 
org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:270)
at 
org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:100)
...
{noformat}

This is because of an optimization introduced in 
[HBASE-9867|https://issues.apache.org/jira/browse/HBASE-9867] that 
inadvertently introduced a classloader dependency.

Jobs submitted using a regular jar and specifying their runtime dependencies 
using the -libjars parameter are not affected by this regression. More details 
about using the -libjars parameter are available in this [blog 
post|http://grepalex.com/2013/02/25/hadoop-libjars/].
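
A concrete -libjars invocation might look like the following (paths are illustrative; note that {{hbase mapredcp}} prints a colon-separated classpath while -libjars expects a comma-separated list, hence the tr):

{noformat}
$ HADOOP_CLASSPATH=$(hbase mapredcp):/etc/hbase/conf \
    hadoop jar MyJob.jar MyJobMainClass -libjars $(hbase mapredcp | tr ':' ',')
{noformat}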

h3. Solution
In order to satisfy the new classloader requirements, hbase-protocol.jar must 
be included in Hadoop's classpath. This can be resolved system-wide by 
including a reference to the hbase-protocol.jar in hadoop's lib directory, via 
a symlink or by copying the jar into the new location.

This can also be achieved on a per-job launch basis by specifying a value for 
{{HADOOP_CLASSPATH}} at job submission time. All three of the following job 
launching commands satisfy this requirement:

{noformat}
$ HADOOP_CLASSPATH=/path/to/hbase-protocol.jar hadoop jar MyJob.jar 
MyJobMainClass
$ HADOOP_CLASSPATH=$(hbase mapredcp) hadoop jar MyJob.jar MyJobMainClass
$ HADOOP_CLASSPATH=$(hbase classpath) hadoop jar MyJob.jar MyJobMainClass
{noformat}

h3. Apache Reference JIRA
See also [HBASE-10304|https://issues.apache.org/jira/browse/HBASE-10304].

 Running an hbase job jar: IllegalAccessError: class 
 com.google.protobuf.ZeroCopyLiteralByteString cannot access its superclass 
 com.google.protobuf.LiteralByteString
 

 Key: HBASE-10304
 URL: https://issues.apache.org/jira/browse/HBASE-10304
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.98.0, 0.96.1.1
Reporter: stack
Priority: Blocker
 Fix For: 0.98.0

 Attachments: hbase-10304_not_tested.patch, jobjar.xml


 (Jimmy has been working on this one internally.  I'm just the messenger 
 raising this critical issue upstream).
 So, if you make a job jar and bundle up hbase inside it because you want to 
 access hbase from your mapreduce task, the deploy of the job jar to the 
 cluster fails with:
 {code}
 14/01/05 08:59:19 INFO Configuration.deprecation: 
 topology.node.switch.mapping.impl is deprecated. Instead, use 
 net.topology.node.switch.mapping.impl
 14/01/05 08:59:19 INFO Configuration.deprecation: io.bytes.per.checksum is 
 deprecated. Instead, use dfs.bytes-per-checksum
 Exception in thread main java.lang.IllegalAccessError: class 
 com.google.protobuf.ZeroCopyLiteralByteString cannot access its superclass 
 

[jira] [Updated] (HBASE-10292) TestRegionServerCoprocessorExceptionWithAbort fails occasionally

2014-01-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-10292:
---

Attachment: 10292-addendum-1.patch

Testing this addendum. On another issue Sergey mentioned that AsyncProcess may 
return the error for a given put on the next one. I'm not familiar with this 
area of the code, but let's try it.

 TestRegionServerCoprocessorExceptionWithAbort fails occasionally
 

 Key: HBASE-10292
 URL: https://issues.apache.org/jira/browse/HBASE-10292
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.98.0, 0.99.0

 Attachments: 10292-addendum-1.patch, 10292.patch, 10292.patch


 TestRegionServerCoprocessorExceptionWithAbort has occasionally failed for a 
 very long time now. Fix or disable.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10314) Add Chaos Monkey that doesn't touch the master

2014-01-10 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868376#comment-13868376
 ] 

Nick Dimiduk commented on HBASE-10314:
--

Instead of defining all these monkeys in code, is it possible to define them 
via configuration? I haven't looked closely at the implementation, but I'd 
think the actions should be composable.
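
A minimal sketch of that direction, assuming a simple Action interface and a made-up configuration key (neither is claimed to exist in this form):

{code}
// Hypothetical: compose a monkey from action class names listed in configuration.
import org.apache.hadoop.conf.Configuration;

public class ConfiguredMonkeyFactory {
  public interface Action { void perform() throws Exception; }

  public static Action[] buildActions(Configuration conf) throws Exception {
    String[] names = conf.getStrings("hbase.it.chaosmonkey.actions"); // assumed key
    if (names == null) {
      return new Action[0];
    }
    Action[] actions = new Action[names.length];
    for (int i = 0; i < names.length; i++) {
      actions[i] = Class.forName(names[i]).asSubclass(Action.class).newInstance();
    }
    return actions;
  }
}
{code}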

 Add Chaos Monkey that doesn't touch the master
 --

 Key: HBASE-10314
 URL: https://issues.apache.org/jira/browse/HBASE-10314
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.98.0, 0.99.0, 0.96.1.1
Reporter: Elliott Clark
Assignee: Elliott Clark





--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-1015) pure C and C++ client libraries

2014-01-10 Thread Ted Dunning (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868382#comment-13868382
 ] 

Ted Dunning commented on HBASE-1015:


Another way to put this: if nobody cares enough to even put up a patch after 5 
years, is this issue simply moot?

Shouldn't reality be recognized?  Shouldn't this be closed as WONT_FIX?

 pure C and C++ client libraries
 ---

 Key: HBASE-1015
 URL: https://issues.apache.org/jira/browse/HBASE-1015
 Project: HBase
  Issue Type: New Feature
  Components: Client
Affects Versions: 0.20.6
Reporter: Andrew Purtell
Priority: Minor

 If via HBASE-794 first class support for talking via Thrift directly to 
 HMaster and HRS is available, then pure C and C++ client libraries are 
 possible. 
 The C client library would wrap a Thrift core. 
 The C++ client library can provide a class hierarchy quite close to 
 o.a.h.h.client and, ideally, identical semantics. It  should be just a 
 wrapper around the C API, for economy.
 Internally to my employer there is a lot of resistance to HBase because many 
 dev teams have a strong C/C++ bias. The real issue however is really client 
 side integration, not a fundamental objection. (What runs server side and how 
 it is managed is a secondary consideration.)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10307) IntegrationTestIngestWithEncryption assumes localhost cluster

2014-01-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868387#comment-13868387
 ] 

Hudson commented on HBASE-10307:


SUCCESS: Integrated in HBase-0.98-on-Hadoop-1.1 #63 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/63/])
HBASE-10307. IntegrationTestIngestWithEncryption assumes localhost cluster 
(apurtell: rev 1557220)
* 
/hbase/branches/0.98/hbase-it/src/test/java/org/apache/hadoop/hbase/IntegrationTestIngestWithEncryption.java


 IntegrationTestIngestWithEncryption assumes localhost cluster
 -

 Key: HBASE-10307
 URL: https://issues.apache.org/jira/browse/HBASE-10307
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0, 0.99.0
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.98.0, 0.99.0

 Attachments: 10307.patch, 10307.patch, 10307.patch


 We forgot to update IntegrationTestIngestWithEncryption to handle the 
 distributed cluster case.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HBASE-10316) Canary#RegionServerMonitor#monitorRegionServers() should close the scanner returned by table.getScanner()

2014-01-10 Thread Ted Yu (JIRA)
Ted Yu created HBASE-10316:
--

 Summary: Canary#RegionServerMonitor#monitorRegionServers() should 
close the scanner returned by table.getScanner()
 Key: HBASE-10316
 URL: https://issues.apache.org/jira/browse/HBASE-10316
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Priority: Minor


At line 624, in the else block, ResultScanner returned by table.getScanner() is 
not closed.
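
For reference, the usual fix pattern (a sketch, not the actual Canary code) is to close the scanner in a finally block:

{code}
// Sketch of the fix pattern only; not the actual Canary code.
import java.io.IOException;
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;

public class ScannerCloseExample {
  static void probe(HTableInterface table, Scan scan) throws IOException {
    ResultScanner scanner = table.getScanner(scan);
    try {
      scanner.next();    // touch the region
    } finally {
      scanner.close();   // always release the scanner, even on error
    }
  }
}
{code}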




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Assigned] (HBASE-10316) Canary#RegionServerMonitor#monitorRegionServers() should close the scanner returned by table.getScanner()

2014-01-10 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-10316:
--

Assignee: Ted Yu

 Canary#RegionServerMonitor#monitorRegionServers() should close the scanner 
 returned by table.getScanner()
 -

 Key: HBASE-10316
 URL: https://issues.apache.org/jira/browse/HBASE-10316
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Minor
 Attachments: 10316.txt


 At line 624, in the else block, ResultScanner returned by table.getScanner() 
 is not closed.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10316) Canary#RegionServerMonitor#monitorRegionServers() should close the scanner returned by table.getScanner()

2014-01-10 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10316:
---

Attachment: 10316.txt

 Canary#RegionServerMonitor#monitorRegionServers() should close the scanner 
 returned by table.getScanner()
 -

 Key: HBASE-10316
 URL: https://issues.apache.org/jira/browse/HBASE-10316
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Priority: Minor
 Attachments: 10316.txt


 At line 624, in the else block, ResultScanner returned by table.getScanner() 
 is not closed.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10316) Canary#RegionServerMonitor#monitorRegionServers() should close the scanner returned by table.getScanner()

2014-01-10 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10316:
---

Status: Patch Available  (was: Open)

 Canary#RegionServerMonitor#monitorRegionServers() should close the scanner 
 returned by table.getScanner()
 -

 Key: HBASE-10316
 URL: https://issues.apache.org/jira/browse/HBASE-10316
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Minor
 Attachments: 10316.txt


 At line 624, in the else block, ResultScanner returned by table.getScanner() 
 is not closed.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10263) make LruBlockCache single/multi/in-memory ratio user-configurable and provide preemptive mode for in-memory type block

2014-01-10 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868394#comment-13868394
 ] 

Nick Dimiduk commented on HBASE-10263:
--

Actually, [~xieliang007] do you mind committing also to 0.98? I don't want to 
steal your thunder on commit ;)

 make LruBlockCache single/multi/in-memory ratio user-configurable and provide 
 preemptive mode for in-memory type block
 --

 Key: HBASE-10263
 URL: https://issues.apache.org/jira/browse/HBASE-10263
 Project: HBase
  Issue Type: Improvement
  Components: io
Reporter: Feng Honghua
Assignee: Feng Honghua
 Fix For: 0.99.0

 Attachments: HBASE-10263-trunk_v0.patch, HBASE-10263-trunk_v1.patch, 
 HBASE-10263-trunk_v2.patch


 currently the single/multi/in-memory ratio in LruBlockCache is hardcoded 
 1:2:1, which can lead to somewhat counter-intuitive behavior in some user 
 scenarios where an in-memory table's read performance is much worse than an 
 ordinary table's when the two tables' data sizes are almost equal and larger 
 than the regionserver's cache size (we did such an experiment and verified 
 that in-memory table random read performance was two times worse than the 
 ordinary table's).
 this patch fixes the above issue and provides:
 1. make the single/multi/in-memory ratio user-configurable
 2. provide a configurable switch which can make in-memory blocks preemptive; 
 by preemptive we mean that when this switch is on an in-memory block can kick 
 out any ordinary block to make room until there are no ordinary blocks left, 
 and when this switch is off (by default) the behavior is the same as before, 
 using the single/multi/in-memory ratio to determine eviction.
 by default, the above two changes are both off and the behavior stays the same 
 as before applying this patch. it's the client/user's choice whether to use 
 either behavior by enabling one of these two enhancements.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-1015) pure C and C++ client libraries

2014-01-10 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868410#comment-13868410
 ] 

Andrew Purtell commented on HBASE-1015:
---

bq. Another way to put this is that if nobody cares enough to even put up a 
patch after 5 years is this issue simply moot?

This issue has been superseded by the use of protobuf in RPCs instead of Thrift 
and the commit of the start of a C/C++ client library, see HBASE-9977. Closing 
this issue in lieu of something else is fine, but WONTFIX is the incorrect 
resolution.

 pure C and C++ client libraries
 ---

 Key: HBASE-1015
 URL: https://issues.apache.org/jira/browse/HBASE-1015
 Project: HBase
  Issue Type: New Feature
  Components: Client
Affects Versions: 0.20.6
Reporter: Andrew Purtell
Priority: Minor

 If via HBASE-794 first class support for talking via Thrift directly to 
 HMaster and HRS is available, then pure C and C++ client libraries are 
 possible. 
 The C client library would wrap a Thrift core. 
 The C++ client library can provide a class hierarchy quite close to 
 o.a.h.h.client and, ideally, identical semantics. It  should be just a 
 wrapper around the C API, for economy.
 Internally to my employer there is a lot of resistance to HBase because many 
 dev teams have a strong C/C++ bias. The real issue however is really client 
 side integration, not a fundamental objection. (What runs server side and how 
 it is managed is a secondary consideration.)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10310) ZNodeCleaner session expired for /hbase/master

2014-01-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-10310:
---

Summary: ZNodeCleaner session expired for /hbase/master  (was: 
ZNodeCleaner.java KeeperException$SessionExpiredException: KeeperErrorCode = 
Session expired for /hbase/master)

 ZNodeCleaner session expired for /hbase/master
 --

 Key: HBASE-10310
 URL: https://issues.apache.org/jira/browse/HBASE-10310
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.96.1.1
 Environment: x86_64 GNU/Linux
Reporter: Samir Ahmic
Assignee: Samir Ahmic
 Attachments: HBASE-10310.patch


 I was testing the hbase master clear command while working on [HBASE-7386]; here 
 are the command and exception:
 {code}
 $ export HBASE_ZNODE_FILE=/tmp/hbase-hadoop-master.znode; ./hbase master clear
 14/01/10 14:05:44 INFO zookeeper.ZooKeeper: Initiating client connection, 
 connectString=zk1:2181 sessionTimeout=9 watcher=clean znode for master, 
 quorum=zk1:2181, baseZNode=/hbase
 14/01/10 14:05:44 INFO zookeeper.RecoverableZooKeeper: Process 
 identifier=clean znode for master connecting to ZooKeeper ensemble=zk1:2181
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Opening socket connection to 
 server zk1/172.17.33.5:2181. Will not attempt to authenticate using SASL 
 (Unable to locate a login configuration)
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Socket connection established to 
 zk11/172.17.33.5:2181, initiating session
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Session establishment complete 
 on server zk1/172.17.33.5:2181, sessionid = 0x1427a96bfea4a8a, negotiated 
 timeout = 4
 14/01/10 14:05:44 INFO zookeeper.ZooKeeper: Session: 0x1427a96bfea4a8a closed
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: EventThread shut down
 14/01/10 14:05:44 WARN zookeeper.RecoverableZooKeeper: Possibly transient 
 ZooKeeper, quorum=zk1:2181, 
 exception=org.apache.zookeeper.KeeperException$SessionExpiredException: 
 KeeperErrorCode = Session expired for /hbase/master
 14/01/10 14:05:44 INFO util.RetryCounter: Sleeping 1000ms before retry #0...
 14/01/10 14:05:45 WARN zookeeper.RecoverableZooKeeper: Possibly transient 
 ZooKeeper, quorum=zk1:2181, 
 exception=org.apache.zookeeper.KeeperException$SessionExpiredException: 
 KeeperErrorCode = Session expired for /hbase/master
 14/01/10 14:05:45 ERROR zookeeper.RecoverableZooKeeper: ZooKeeper getData 
 failed after 1 attempts
 14/01/10 14:05:45 WARN zookeeper.ZKUtil: clean znode for 
 master-0x1427a96bfea4a8a, quorum=zk1:2181, baseZNode=/hbase Unable to get 
 data of znode /hbase/master
 org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
 = Session expired for /hbase/master
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:337)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:777)
   at 
 org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.deleteIfEquals(MasterAddressTracker.java:170)
   at org.apache.hadoop.hbase.ZNodeClearer.clear(ZNodeClearer.java:160)
   at 
 org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:138)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
   at 
 org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
   at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2779)
 14/01/10 14:05:45 ERROR zookeeper.ZooKeeperWatcher: clean znode for 
 master-0x1427a96bfea4a8a, quorum=zk1:2181, baseZNode=/hbase Received 
 unexpected KeeperException, re-throwing exception
 org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
 = Session expired for /hbase/master
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:337)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:777)
   at 
 org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.deleteIfEquals(MasterAddressTracker.java:170)
   at org.apache.hadoop.hbase.ZNodeClearer.clear(ZNodeClearer.java:160)
   at 
 org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:138)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
   at 
 

[jira] [Resolved] (HBASE-10310) ZNodeCleaner session expired for /hbase/master

2014-01-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-10310.


   Resolution: Fixed
Fix Version/s: 0.99.0
   0.96.2
   0.98.0
 Hadoop Flags: Reviewed

Committed to trunk, 0.98, and 0.96. Thanks for the patch Samir!

 ZNodeCleaner session expired for /hbase/master
 --

 Key: HBASE-10310
 URL: https://issues.apache.org/jira/browse/HBASE-10310
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.96.1.1
 Environment: x86_64 GNU/Linux
Reporter: Samir Ahmic
Assignee: Samir Ahmic
 Fix For: 0.98.0, 0.96.2, 0.99.0

 Attachments: HBASE-10310.patch


 I was testing the hbase master clear command while working on [HBASE-7386]; here 
 are the command and exception:
 {code}
 $ export HBASE_ZNODE_FILE=/tmp/hbase-hadoop-master.znode; ./hbase master clear
 14/01/10 14:05:44 INFO zookeeper.ZooKeeper: Initiating client connection, 
 connectString=zk1:2181 sessionTimeout=9 watcher=clean znode for master, 
 quorum=zk1:2181, baseZNode=/hbase
 14/01/10 14:05:44 INFO zookeeper.RecoverableZooKeeper: Process 
 identifier=clean znode for master connecting to ZooKeeper ensemble=zk1:2181
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Opening socket connection to 
 server zk1/172.17.33.5:2181. Will not attempt to authenticate using SASL 
 (Unable to locate a login configuration)
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Socket connection established to 
 zk11/172.17.33.5:2181, initiating session
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Session establishment complete 
 on server zk1/172.17.33.5:2181, sessionid = 0x1427a96bfea4a8a, negotiated 
 timeout = 4
 14/01/10 14:05:44 INFO zookeeper.ZooKeeper: Session: 0x1427a96bfea4a8a closed
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: EventThread shut down
 14/01/10 14:05:44 WARN zookeeper.RecoverableZooKeeper: Possibly transient 
 ZooKeeper, quorum=zk1:2181, 
 exception=org.apache.zookeeper.KeeperException$SessionExpiredException: 
 KeeperErrorCode = Session expired for /hbase/master
 14/01/10 14:05:44 INFO util.RetryCounter: Sleeping 1000ms before retry #0...
 14/01/10 14:05:45 WARN zookeeper.RecoverableZooKeeper: Possibly transient 
 ZooKeeper, quorum=zk1:2181, 
 exception=org.apache.zookeeper.KeeperException$SessionExpiredException: 
 KeeperErrorCode = Session expired for /hbase/master
 14/01/10 14:05:45 ERROR zookeeper.RecoverableZooKeeper: ZooKeeper getData 
 failed after 1 attempts
 14/01/10 14:05:45 WARN zookeeper.ZKUtil: clean znode for 
 master-0x1427a96bfea4a8a, quorum=zk1:2181, baseZNode=/hbase Unable to get 
 data of znode /hbase/master
 org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
 = Session expired for /hbase/master
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:337)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:777)
   at 
 org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.deleteIfEquals(MasterAddressTracker.java:170)
   at org.apache.hadoop.hbase.ZNodeClearer.clear(ZNodeClearer.java:160)
   at 
 org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:138)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
   at 
 org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
   at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2779)
 14/01/10 14:05:45 ERROR zookeeper.ZooKeeperWatcher: clean znode for 
 master-0x1427a96bfea4a8a, quorum=zk1:2181, baseZNode=/hbase Received 
 unexpected KeeperException, re-throwing exception
 org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
 = Session expired for /hbase/master
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:337)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:777)
   at 
 org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.deleteIfEquals(MasterAddressTracker.java:170)
   at org.apache.hadoop.hbase.ZNodeClearer.clear(ZNodeClearer.java:160)
   at 
 org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:138)
   at 

[jira] [Commented] (HBASE-9426) Make custom distributed barrier procedure pluggable

2014-01-10 Thread Richard Ding (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868447#comment-13868447
 ] 

Richard Ding commented on HBASE-9426:
-

[~jmhsieh], can you please take a look at the new patch and let me know what 
you think? Thanks.

 Make custom distributed barrier procedure pluggable 
 

 Key: HBASE-9426
 URL: https://issues.apache.org/jira/browse/HBASE-9426
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.95.2, 0.94.11
Reporter: Richard Ding
Assignee: Richard Ding
 Attachments: HBASE-9426-4.patch, HBASE-9426-4.patch, 
 HBASE-9426-6.patch, HBASE-9426.patch.1, HBASE-9426.patch.2, HBASE-9426.patch.3


 Currently if one wants to implement a custom distributed barrier procedure 
 (e.g., distributed log roll or distributed table flush), the HBase core code 
 needs to be modified in order for the procedure to work.
 Looking into the snapshot code (especially on region server side), most of 
 the code to enable the procedure are generic life-cycle management (i.e., 
 init, start, stop). We can make this part pluggable.
 Here is the proposal. Following the coprocessor example, we define two 
 properties:
 {code}
 hbase.procedure.regionserver.classes
 hbase.procedure.master.classes
 {code}
 The values for both are comma delimited list of classes. On region server 
 side, the classes implements the following interface:
 {code}
 public interface RegionServerProcedureManager {
   public void initialize(RegionServerServices rss) throws KeeperException;
   public void start();
   public void stop(boolean force) throws IOException;
   public String getProcedureName();
 }
 {code}
 While on Master side, the classes implement the interface:
 {code}
 public interface MasterProcedureManager {
   public void initialize(MasterServices master) throws KeeperException, 
 IOException, UnsupportedOperationException;
   public void stop(String why);
   public String getProcedureName();
   public void execProcedure(ProcedureDescription desc) throws IOException;
 }
 {code}
 Where the ProcedureDescription is defined as
 {code}
 message ProcedureDescription {
   required string name = 1;
   required string instance = 2;
   optional int64 creationTime = 3 [default = 0];
   message Property {
 required string tag = 1;
 optional string value = 2;
   }
   repeated Property props = 4;
 }
 {code}
 A generic API can be defined on HMaster to trigger a procedure:
 {code}
 public boolean execProcedure(ProcedureDescription desc) throws IOException;
 {code}
 _SnapshotManager_ and _RegionServerSnapshotManager_ are special examples of 
 _MasterProcedureManager_ and _RegionServerProcedureManager_. They will be 
 automatically included (users don't need to specify them in the conf file).
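
To make the plug-in shape concrete, a do-nothing region server manager written against the proposed interface could look like the sketch below (the interface is as proposed above; the class and procedure name are hypothetical), and it would then be listed under hbase.procedure.regionserver.classes:

{code}
// Hypothetical skeleton against the proposed RegionServerProcedureManager interface.
import java.io.IOException;
import org.apache.hadoop.hbase.regionserver.RegionServerServices;
import org.apache.zookeeper.KeeperException;

public class LogRollProcedureManager implements RegionServerProcedureManager {
  private RegionServerServices rss;

  public void initialize(RegionServerServices rss) throws KeeperException {
    this.rss = rss;              // keep server services for later use
  }

  public void start() {
    // set up ZK watchers / member controller here
  }

  public void stop(boolean force) throws IOException {
    // tear down watchers and worker threads; 'force' skips graceful shutdown
  }

  public String getProcedureName() {
    return "rolllog";            // name used to route procedure requests
  }
}
{code}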



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HBASE-10317) getClientPort method of MiniZooKeeperCluster does not always return the correct value

2014-01-10 Thread Vasu Mariyala (JIRA)
Vasu Mariyala created HBASE-10317:
-

 Summary: getClientPort method of MiniZooKeeperCluster does not 
always return the correct value
 Key: HBASE-10317
 URL: https://issues.apache.org/jira/browse/HBASE-10317
 Project: HBase
  Issue Type: Bug
Reporter: Vasu Mariyala
Priority: Minor


{code}
//Starting 5 zk servers
MiniZooKeeperCluster cluster = hbt.startMiniZKCluster(5);
int defaultClientPort = 21818;
cluster.setDefaultClientPort(defaultClientPort);
cluster.killCurrentActiveZooKeeperServer();
cluster.getClientPort(); //Still returns the port of the zk server that was 
killed in the previous step
{code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10317) getClientPort method of MiniZooKeeperCluster does not always return the correct value

2014-01-10 Thread Vasu Mariyala (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vasu Mariyala updated HBASE-10317:
--

Attachment: HBASE-10317.patch

 getClientPort method of MiniZooKeeperCluster does not always return the 
 correct value
 -

 Key: HBASE-10317
 URL: https://issues.apache.org/jira/browse/HBASE-10317
 Project: HBase
  Issue Type: Bug
Reporter: Vasu Mariyala
Priority: Minor
 Attachments: HBASE-10317.patch


 {code}
 //Starting 5 zk servers
 MiniZooKeeperCluster cluster = hbt.startMiniZKCluster(5);
 int defaultClientPort = 21818;
 cluster.setDefaultClientPort(defaultClientPort);
 cluster.killCurrentActiveZooKeeperServer();
 cluster.getClientPort(); //Still returns the port of the zk server that was 
 killed in the previous step
 {code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10317) getClientPort method of MiniZooKeeperCluster does not always return the correct value

2014-01-10 Thread Vasu Mariyala (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vasu Mariyala updated HBASE-10317:
--

Status: Patch Available  (was: Open)

 getClientPort method of MiniZooKeeperCluster does not always return the 
 correct value
 -

 Key: HBASE-10317
 URL: https://issues.apache.org/jira/browse/HBASE-10317
 Project: HBase
  Issue Type: Bug
Reporter: Vasu Mariyala
Priority: Minor
 Attachments: HBASE-10317.patch


 {code}
 //Starting 5 zk servers
 MiniZooKeeperCluster cluster = hbt.startMiniZKCluster(5);
 int defaultClientPort = 21818;
 cluster.setDefaultClientPort(defaultClientPort);
 cluster.killCurrentActiveZooKeeperServer();
 cluster.getClientPort(); //Still returns the port of the zk server that was 
 killed in the previous step
 {code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10304) Running an hbase job jar: IllegalAccessError: class com.google.protobuf.ZeroCopyLiteralByteString cannot access its superclass com.google.protobuf.LiteralByteString

2014-01-10 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868472#comment-13868472
 ] 

Jimmy Xiang commented on HBASE-10304:
-

I tried with -libjars, and it gave me the same problem. So it is not working 
for me.

I also tried the three suggestions. The first two of them need some tweaking, 
while the third one works as-is.

bq. $ HADOOP_CLASSPATH=/path/to/hbase-protocol.jar hadoop jar MyJob.jar 
MyJobMainClass

I got this:
{noformat}
14/01/10 15:31:05 WARN zookeeper.ClientCnxn: Session 0x0 for server null, 
unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:708)
at 
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
{noformat}

Basically, I can't connect to the ZK.  I have to add the hbase conf dir as 
below:

{noformat}
$ HADOOP_CLASSPATH=/path/to/hbase-protocol.jar:/path/to/hbase-conf hadoop jar 
MyJob.jar MyJobMainClass
{noformat}

bq. $ HADOOP_CLASSPATH=$(hbase mapredcp) hadoop jar MyJob.jar MyJobMainClass

Same as above. I need to add hbase conf dir to the path:
{noformat}
$ HADOOP_CLASSPATH=$(hbase mapredcp):/path/to/hbase-conf hadoop jar MyJob.jar 
MyJobMainClass
{noformat}


bq. $ HADOOP_CLASSPATH=$(hbase classpath) hadoop jar MyJob.jar MyJobMainClass

Works for me.
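
For the task side of a job, another option is to let the driver ship the HBase 
dependency jars with the job via the distributed cache instead of bundling them 
into a fat job jar. A minimal sketch, assuming a plain TableMapper job; 
TableMapReduceUtil.initTableMapperJob (which calls addDependencyJars by default) 
is a real HBase API, but the job wiring and table name here are illustrative 
only, and the client side still needs the HBase conf (and hbase-protocol) on 
its classpath as shown above:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class MyJobMainClass {
  // Hypothetical no-op mapper, just to make the example self-contained.
  static class MyMapper extends TableMapper<ImmutableBytesWritable, Result> {}

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create(); // reads hbase-site.xml from the classpath
    Job job = Job.getInstance(conf, "my-hbase-job");
    job.setJarByClass(MyJobMainClass.class);
    // Ships hbase-protocol and the other HBase client jars to the tasks via
    // the distributed cache, so they need not be bundled inside the job jar.
    TableMapReduceUtil.initTableMapperJob("mytable", new Scan(), MyMapper.class,
        ImmutableBytesWritable.class, Result.class, job);
    job.setOutputFormatClass(NullOutputFormat.class);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
{code}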


 Running an hbase job jar: IllegalAccessError: class 
 com.google.protobuf.ZeroCopyLiteralByteString cannot access its superclass 
 com.google.protobuf.LiteralByteString
 

 Key: HBASE-10304
 URL: https://issues.apache.org/jira/browse/HBASE-10304
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.98.0, 0.96.1.1
Reporter: stack
Priority: Blocker
 Fix For: 0.98.0

 Attachments: hbase-10304_not_tested.patch, jobjar.xml


 (Jimmy has been working on this one internally.  I'm just the messenger 
 raising this critical issue upstream).
 So, if you make a job jar and bundle hbase inside it because you want to 
 access hbase from your mapreduce task, the deploy of the job jar to the 
 cluster fails with:
 {code}
 14/01/05 08:59:19 INFO Configuration.deprecation: 
 topology.node.switch.mapping.impl is deprecated. Instead, use 
 net.topology.node.switch.mapping.impl
 14/01/05 08:59:19 INFO Configuration.deprecation: io.bytes.per.checksum is 
 deprecated. Instead, use dfs.bytes-per-checksum
 Exception in thread main java.lang.IllegalAccessError: class 
 com.google.protobuf.ZeroCopyLiteralByteString cannot access its superclass 
 com.google.protobuf.LiteralByteString
   at java.lang.ClassLoader.defineClass1(Native Method)
   at java.lang.ClassLoader.defineClass(ClassLoader.java:792)
   at 
 java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
   at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
   at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
   at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
   at 
 org.apache.hadoop.hbase.protobuf.ProtobufUtil.toScan(ProtobufUtil.java:818)
   at 
 org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.convertScanToString(TableMapReduceUtil.java:433)
   at 
 org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:186)
   at 
 org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:147)
   at 
 org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:270)
   at 
 org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:100)
   at 
 com.ngdata.hbaseindexer.mr.HBaseMapReduceIndexerTool.run(HBaseMapReduceIndexerTool.java:124)
   at 
 com.ngdata.hbaseindexer.mr.HBaseMapReduceIndexerTool.run(HBaseMapReduceIndexerTool.java:64)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
   at 
 com.ngdata.hbaseindexer.mr.HBaseMapReduceIndexerTool.main(HBaseMapReduceIndexerTool.java:51)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 

[jira] [Commented] (HBASE-10316) Canary#RegionServerMonitor#monitorRegionServers() should close the scanner returned by table.getScanner()

2014-01-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868480#comment-13868480
 ] 

Hadoop QA commented on HBASE-10316:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12622448/10316.txt
  against trunk revision .
  ATTACHMENT ID: 12622448

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8386//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8386//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8386//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8386//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8386//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8386//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8386//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8386//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8386//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8386//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8386//console

This message is automatically generated.

 Canary#RegionServerMonitor#monitorRegionServers() should close the scanner 
 returned by table.getScanner()
 -

 Key: HBASE-10316
 URL: https://issues.apache.org/jira/browse/HBASE-10316
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Minor
 Attachments: 10316.txt


 At line 624, in the else block, the ResultScanner returned by 
 table.getScanner() is not closed.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10312) Flooding the cluster with administrative actions leads to collapse

2014-01-10 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-10312:
---

Fix Version/s: 0.99.0

 Flooding the cluster with administrative actions leads to collapse
 --

 Key: HBASE-10312
 URL: https://issues.apache.org/jira/browse/HBASE-10312
 Project: HBase
  Issue Type: Bug
Reporter: Andrew Purtell
 Fix For: 0.99.0


 Steps to reproduce:
 1. Start a cluster.
 2. Start an ingest process.
 3. In the HBase shell, do this:
 {noformat}
 while true do
   flush 'table'
 end
 {noformat}
 We should reject abuse via administrative requests like this.
 What happens on the cluster is the requests back up, leading to lots of these:
 {noformat}
 2014-01-10 18:55:55,293 WARN  [Priority.RpcServer.handler=2,port=8120] 
 monitoring.TaskMonitor: Too many actions in action monitor! Purging some.
 {noformat}
 At this point we could lower a gate on further requests for actions until the 
 backlog clears.
 Continuing, all of the regionservers will eventually die with a 
 StackOverflowError of unknown origin (unknown because, well, the stack 
 overflowed):
 {noformat}
 2014-01-10 19:02:02,783 ERROR [Priority.RpcServer.handler=3,port=8120] 
 ipc.RpcServer: Unexpected throwable object java.lang.StackOverflowError
 at java.util.ArrayList$SubList.add(ArrayList.java:965)
 [...]
 {noformat}
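 
 A hedged sketch of the kind of gate described above; this is not from any 
 patch, and the class, names and threshold are hypothetical:
 {code}
 import java.io.IOException;
 import java.util.concurrent.atomic.AtomicInteger;
 
 /** Hypothetical backpressure gate for administrative actions such as flush. */
 public class AdminActionGate {
   private final int maxPending;
   private final AtomicInteger pending = new AtomicInteger();
 
   public AdminActionGate(int maxPending) {
     this.maxPending = maxPending;
   }
 
   /** Called when an admin request arrives; reject it if we are backlogged. */
   public void enter() throws IOException {
     if (pending.incrementAndGet() > maxPending) {
       pending.decrementAndGet();
       throw new IOException("Too many pending administrative actions, retry later");
     }
   }
 
   /** Called when the action completes or fails. */
   public void exit() {
     pending.decrementAndGet();
   }
 }
 {code}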



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10314) Add Chaos Monkey that doesn't touch the master

2014-01-10 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868519#comment-13868519
 ] 

Elliott Clark commented on HBASE-10314:
---

The actions are easily composable, but doing that on the command line is 
pretty awful, so imo it's much better to have several of these pre-built and 
ready for easy use.

 Add Chaos Monkey that doesn't touch the master
 --

 Key: HBASE-10314
 URL: https://issues.apache.org/jira/browse/HBASE-10314
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.98.0, 0.99.0, 0.96.1.1
Reporter: Elliott Clark
Assignee: Elliott Clark





--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10308) TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare fails occasionally

2014-01-10 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868521#comment-13868521
 ] 

Lars Hofhansl commented on HBASE-10308:
---

I ran the test locally in a loop while I was away for meetings. After 1658 
runs it has not failed once. :(


 TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare fails 
 occasionally
 

 Key: HBASE-10308
 URL: https://issues.apache.org/jira/browse/HBASE-10308
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
 Fix For: 0.96.2, 0.98.1, 0.99.0, 0.94.17


 Seen in 0.94 (both JDK6 and JDK7 builds)
 {code}
 Error Message
  Wanted but not invoked: procedure.sendGlobalBarrierComplete(); - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:344)
   However, there were other interactions with this mock: - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:306)
  - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:311)
  - at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:204) 
 - at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:204) - 
 at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.memberAcquiredBarrier(ProcedureCoordinator.java:262)
  - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.memberAcquiredBarrier(ProcedureCoordinator.java:262)
  - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.abortProcedure(ProcedureCoordinator.java:217)
  - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:337)
  
 Stacktrace
 Wanted but not invoked:
 procedure.sendGlobalBarrierComplete();
 - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:344)
 However, there were other interactions with this mock:
 - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:306)
 - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:311)
 - at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:204)
 - at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:204)
 - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.memberAcquiredBarrier(ProcedureCoordinator.java:262)
 - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.memberAcquiredBarrier(ProcedureCoordinator.java:262)
 - at 
 org.apache.hadoop.hbase.procedure.ProcedureCoordinator.abortProcedure(ProcedureCoordinator.java:217)
 - at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:337)
   at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.waitAndVerifyProc(TestZKProcedure.java:344)
   at 
 org.apache.hadoop.hbase.procedure.TestZKProcedure.testMultiCohortWithMemberTimeoutDuringPrepare(TestZKProcedure.java:319)
 {code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10317) getClientPort method of MiniZooKeeperCluster does not always return the correct value

2014-01-10 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868535#comment-13868535
 ] 

Lars Hofhansl commented on HBASE-10317:
---

Looks good to me.

 getClientPort method of MiniZooKeeperCluster does not always return the 
 correct value
 -

 Key: HBASE-10317
 URL: https://issues.apache.org/jira/browse/HBASE-10317
 Project: HBase
  Issue Type: Bug
Reporter: Vasu Mariyala
Priority: Minor
 Attachments: HBASE-10317.patch


 {code}
 //Starting 5 zk servers
 MiniZooKeeperCluster cluster = hbt.startMiniZKCluster(5);
 int defaultClientPort = 21818;
 cluster.setDefaultClientPort(defaultClientPort);
 cluster.killCurrentActiveZooKeeperServer();
 cluster.getClientPort(); //Still returns the port of the zk server that was 
 killed in the previous step
 {code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HBASE-10318) generate-hadoopX-poms.sh expects the version to have one extra '-'

2014-01-10 Thread Raja Aluri (JIRA)
Raja Aluri created HBASE-10318:
--

 Summary: generate-hadoopX-poms.sh expects the version to have one 
extra '-'
 Key: HBASE-10318
 URL: https://issues.apache.org/jira/browse/HBASE-10318
 Project: HBase
  Issue Type: Bug
  Components: build
Affects Versions: 0.98.0
Reporter: Raja Aluri


This change is in 0.96 branch, but missing in 0.98.
Including the commit that made this 
[change|https://github.com/apache/hbase/commit/09442ca]



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10318) generate-hadoopX-poms.sh expects the version to have one extra '-'

2014-01-10 Thread Raja Aluri (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raja Aluri updated HBASE-10318:
---

Status: Patch Available  (was: Open)

 generate-hadoopX-poms.sh expects the version to have one extra '-'
 --

 Key: HBASE-10318
 URL: https://issues.apache.org/jira/browse/HBASE-10318
 Project: HBase
  Issue Type: Bug
  Components: build
Affects Versions: 0.98.0
Reporter: Raja Aluri
 Attachments: 
 0001-HBASE-10318-generate-hadoopX-poms.sh-expects-the-ver.patch


 This change is in 0.96 branch, but missing in 0.98.
 Including the commit that made this 
 [change|https://github.com/apache/hbase/commit/09442ca]



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10318) generate-hadoopX-poms.sh expects the version to have one extra '-'

2014-01-10 Thread Raja Aluri (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raja Aluri updated HBASE-10318:
---

Attachment: 0001-HBASE-10318-generate-hadoopX-poms.sh-expects-the-ver.patch

 generate-hadoopX-poms.sh expects the version to have one extra '-'
 --

 Key: HBASE-10318
 URL: https://issues.apache.org/jira/browse/HBASE-10318
 Project: HBase
  Issue Type: Bug
  Components: build
Affects Versions: 0.98.0
Reporter: Raja Aluri
 Attachments: 
 0001-HBASE-10318-generate-hadoopX-poms.sh-expects-the-ver.patch


 This change is in 0.96 branch, but missing in 0.98.
 Including the commit that made this 
 [change|https://github.com/apache/hbase/commit/09442ca]



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10310) ZNodeCleaner session expired for /hbase/master

2014-01-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868545#comment-13868545
 ] 

Hudson commented on HBASE-10310:


FAILURE: Integrated in HBase-TRUNK #4806 (See 
[https://builds.apache.org/job/HBase-TRUNK/4806/])
HBASE-10310. ZNodeCleaner session expired for /hbase/master (Samir Ahmic) 
(apurtell: rev 1557273)
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ZNodeClearer.java


 ZNodeCleaner session expired for /hbase/master
 --

 Key: HBASE-10310
 URL: https://issues.apache.org/jira/browse/HBASE-10310
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.96.1.1
 Environment: x86_64 GNU/Linux
Reporter: Samir Ahmic
Assignee: Samir Ahmic
 Fix For: 0.98.0, 0.96.2, 0.99.0

 Attachments: HBASE-10310.patch


 I was testing hbase master clear command while working on [HBASE-7386] here 
 is command and exception:
 {code}
 $ export HBASE_ZNODE_FILE=/tmp/hbase-hadoop-master.znode; ./hbase master clear
 14/01/10 14:05:44 INFO zookeeper.ZooKeeper: Initiating client connection, 
 connectString=zk1:2181 sessionTimeout=9 watcher=clean znode for master, 
 quorum=zk1:2181, baseZNode=/hbase
 14/01/10 14:05:44 INFO zookeeper.RecoverableZooKeeper: Process 
 identifier=clean znode for master connecting to ZooKeeper ensemble=zk1:2181
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Opening socket connection to 
 server zk1/172.17.33.5:2181. Will not attempt to authenticate using SASL 
 (Unable to locate a login configuration)
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Socket connection established to 
 zk11/172.17.33.5:2181, initiating session
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Session establishment complete 
 on server zk1/172.17.33.5:2181, sessionid = 0x1427a96bfea4a8a, negotiated 
 timeout = 4
 14/01/10 14:05:44 INFO zookeeper.ZooKeeper: Session: 0x1427a96bfea4a8a closed
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: EventThread shut down
 14/01/10 14:05:44 WARN zookeeper.RecoverableZooKeeper: Possibly transient 
 ZooKeeper, quorum=zk1:2181, 
 exception=org.apache.zookeeper.KeeperException$SessionExpiredException: 
 KeeperErrorCode = Session expired for /hbase/master
 14/01/10 14:05:44 INFO util.RetryCounter: Sleeping 1000ms before retry #0...
 14/01/10 14:05:45 WARN zookeeper.RecoverableZooKeeper: Possibly transient 
 ZooKeeper, quorum=zk1:2181, 
 exception=org.apache.zookeeper.KeeperException$SessionExpiredException: 
 KeeperErrorCode = Session expired for /hbase/master
 14/01/10 14:05:45 ERROR zookeeper.RecoverableZooKeeper: ZooKeeper getData 
 failed after 1 attempts
 14/01/10 14:05:45 WARN zookeeper.ZKUtil: clean znode for 
 master-0x1427a96bfea4a8a, quorum=zk1:2181, baseZNode=/hbase Unable to get 
 data of znode /hbase/master
 org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
 = Session expired for /hbase/master
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:337)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:777)
   at 
 org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.deleteIfEquals(MasterAddressTracker.java:170)
   at org.apache.hadoop.hbase.ZNodeClearer.clear(ZNodeClearer.java:160)
   at 
 org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:138)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
   at 
 org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
   at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2779)
 14/01/10 14:05:45 ERROR zookeeper.ZooKeeperWatcher: clean znode for 
 master-0x1427a96bfea4a8a, quorum=zk1:2181, baseZNode=/hbase Received 
 unexpected KeeperException, re-throwing exception
 org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
 = Session expired for /hbase/master
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:337)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:777)
   at 
 org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.deleteIfEquals(MasterAddressTracker.java:170)
   at org.apache.hadoop.hbase.ZNodeClearer.clear(ZNodeClearer.java:160)
   at 
 

[jira] [Commented] (HBASE-10310) ZNodeCleaner session expired for /hbase/master

2014-01-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868548#comment-13868548
 ] 

Hudson commented on HBASE-10310:


FAILURE: Integrated in hbase-0.96-hadoop2 #172 (See 
[https://builds.apache.org/job/hbase-0.96-hadoop2/172/])
HBASE-10310. ZNodeCleaner session expired for /hbase/master (Samir Ahmic) 
(apurtell: rev 1557275)
* 
/hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/ZNodeClearer.java


 ZNodeCleaner session expired for /hbase/master
 --

 Key: HBASE-10310
 URL: https://issues.apache.org/jira/browse/HBASE-10310
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.96.1.1
 Environment: x86_64 GNU/Linux
Reporter: Samir Ahmic
Assignee: Samir Ahmic
 Fix For: 0.98.0, 0.96.2, 0.99.0

 Attachments: HBASE-10310.patch


 I was testing hbase master clear command while working on [HBASE-7386] here 
 is command and exception:
 {code}
 $ export HBASE_ZNODE_FILE=/tmp/hbase-hadoop-master.znode; ./hbase master clear
 14/01/10 14:05:44 INFO zookeeper.ZooKeeper: Initiating client connection, 
 connectString=zk1:2181 sessionTimeout=9 watcher=clean znode for master, 
 quorum=zk1:2181, baseZNode=/hbase
 14/01/10 14:05:44 INFO zookeeper.RecoverableZooKeeper: Process 
 identifier=clean znode for master connecting to ZooKeeper ensemble=zk1:2181
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Opening socket connection to 
 server zk1/172.17.33.5:2181. Will not attempt to authenticate using SASL 
 (Unable to locate a login configuration)
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Socket connection established to 
 zk11/172.17.33.5:2181, initiating session
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Session establishment complete 
 on server zk1/172.17.33.5:2181, sessionid = 0x1427a96bfea4a8a, negotiated 
 timeout = 4
 14/01/10 14:05:44 INFO zookeeper.ZooKeeper: Session: 0x1427a96bfea4a8a closed
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: EventThread shut down
 14/01/10 14:05:44 WARN zookeeper.RecoverableZooKeeper: Possibly transient 
 ZooKeeper, quorum=zk1:2181, 
 exception=org.apache.zookeeper.KeeperException$SessionExpiredException: 
 KeeperErrorCode = Session expired for /hbase/master
 14/01/10 14:05:44 INFO util.RetryCounter: Sleeping 1000ms before retry #0...
 14/01/10 14:05:45 WARN zookeeper.RecoverableZooKeeper: Possibly transient 
 ZooKeeper, quorum=zk1:2181, 
 exception=org.apache.zookeeper.KeeperException$SessionExpiredException: 
 KeeperErrorCode = Session expired for /hbase/master
 14/01/10 14:05:45 ERROR zookeeper.RecoverableZooKeeper: ZooKeeper getData 
 failed after 1 attempts
 14/01/10 14:05:45 WARN zookeeper.ZKUtil: clean znode for 
 master-0x1427a96bfea4a8a, quorum=zk1:2181, baseZNode=/hbase Unable to get 
 data of znode /hbase/master
 org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
 = Session expired for /hbase/master
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:337)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:777)
   at 
 org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.deleteIfEquals(MasterAddressTracker.java:170)
   at org.apache.hadoop.hbase.ZNodeClearer.clear(ZNodeClearer.java:160)
   at 
 org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:138)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
   at 
 org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
   at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2779)
 14/01/10 14:05:45 ERROR zookeeper.ZooKeeperWatcher: clean znode for 
 master-0x1427a96bfea4a8a, quorum=zk1:2181, baseZNode=/hbase Received 
 unexpected KeeperException, re-throwing exception
 org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
 = Session expired for /hbase/master
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:337)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:777)
   at 
 org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.deleteIfEquals(MasterAddressTracker.java:170)
   at org.apache.hadoop.hbase.ZNodeClearer.clear(ZNodeClearer.java:160)

[jira] [Commented] (HBASE-10307) IntegrationTestIngestWithEncryption assumes localhost cluster

2014-01-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868550#comment-13868550
 ] 

Hudson commented on HBASE-10307:


FAILURE: Integrated in HBase-TRUNK-on-Hadoop-1.1 #49 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-1.1/49/])
HBASE-10307. IntegrationTestIngestWithEncryption assumes localhost cluster 
(apurtell: rev 1557219)
* 
/hbase/trunk/hbase-it/src/test/java/org/apache/hadoop/hbase/IntegrationTestIngestWithEncryption.java


 IntegrationTestIngestWithEncryption assumes localhost cluster
 -

 Key: HBASE-10307
 URL: https://issues.apache.org/jira/browse/HBASE-10307
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0, 0.99.0
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.98.0, 0.99.0

 Attachments: 10307.patch, 10307.patch, 10307.patch


 We forgot to update IntegrationTestIngestWithEncryption to handle the 
 distributed cluster case.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10278) Provide better write predictability

2014-01-10 Thread Himanshu Vashishtha (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868551#comment-13868551
 ] 

Himanshu Vashishtha commented on HBASE-10278:
-

Thanks for reviewing the doc, Liang (and sorry about the delay in replying).

True, to handle longer outages (a rack going down, for example), we could tune 
the switching policy to avoid tiny log files (for example, by taking the number 
of append ops since the last switch into account).

Yes, 300ms is the avg time (total time for 1k ops was about 30sec). I didn't 
really dig into why it is better than the single-file scenario, but for me the 
interesting bit was that about 568/1000 ops took more than a second.

Yes, replication needs to handle two open files. To minimize the impact on 
replication, I am thinking of adding a separate ReplicationSource thread for 
the second WAL, but I still need to look into whether there is a better way to 
achieve this.
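
To make the switching-policy point concrete, here is a hedged sketch of one 
possible predicate; the class and thresholds are hypothetical and not taken 
from the attached design doc or prototype:
{code}
/** Hypothetical WAL switch policy: react to slow syncs, but avoid tiny files. */
public class WalSwitchPolicy {
  private final long slowSyncNanos;             // e.g. 100ms, in nanoseconds
  private final long minAppendsBetweenSwitches; // guards against tiny log files

  public WalSwitchPolicy(long slowSyncNanos, long minAppendsBetweenSwitches) {
    this.slowSyncNanos = slowSyncNanos;
    this.minAppendsBetweenSwitches = minAppendsBetweenSwitches;
  }

  /**
   * @param lastSyncNanos      latency of the most recent sync on the active WAL
   * @param appendsSinceSwitch append ops written since the last switch
   * @return true if the writer should switch to the other WAL
   */
  public boolean shouldSwitch(long lastSyncNanos, long appendsSinceSwitch) {
    return lastSyncNanos > slowSyncNanos
        && appendsSinceSwitch >= minAppendsBetweenSwitches;
  }
}
{code}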

 Provide better write predictability
 ---

 Key: HBASE-10278
 URL: https://issues.apache.org/jira/browse/HBASE-10278
 Project: HBase
  Issue Type: New Feature
Reporter: Himanshu Vashishtha
Assignee: Himanshu Vashishtha
 Attachments: Multiwaldesigndoc.pdf


 Currently, HBase has one WAL per region server. 
 Whenever there is any latency in the write pipeline (due to whatever reasons 
 such as n/w blip, a node in the pipeline having a bad disk, etc), the overall 
 write latency suffers. 
 Jonathan Hsieh and I analyzed various approaches to tackle this issue. We 
 also looked at HBASE-5699, which talks about adding concurrent multi WALs. 
 Along with performance numbers, we also focussed on design simplicity, 
 minimum impact on MTTR  Replication, and compatibility with 0.96 and 0.98. 
 Considering all these parameters, we propose a new HLog implementation with 
 WAL Switching functionality.
 Please find attached the design doc for the same. It introduces the WAL 
 Switching feature, and experiments/results of a prototype implementation, 
 showing the benefits of this feature.
 The second goal of this work is to serve as a building block for concurrent 
 multiple WALs feature.
 Please review the doc.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10278) Provide better write predictability

2014-01-10 Thread Himanshu Vashishtha (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868556#comment-13868556
 ] 

Himanshu Vashishtha commented on HBASE-10278:
-

As mentioned in the doc, I will work on this feature on a different branch and 
merge it into trunk when it is ready. 
I have created a branch at my github 
(https://github.com/HimanshuVashishtha/hbase/tree/HBASE-10278).

 Provide better write predictability
 ---

 Key: HBASE-10278
 URL: https://issues.apache.org/jira/browse/HBASE-10278
 Project: HBase
  Issue Type: New Feature
Reporter: Himanshu Vashishtha
Assignee: Himanshu Vashishtha
 Attachments: Multiwaldesigndoc.pdf


 Currently, HBase has one WAL per region server. 
 Whenever there is any latency in the write pipeline (due to whatever reasons 
 such as n/w blip, a node in the pipeline having a bad disk, etc), the overall 
 write latency suffers. 
 Jonathan Hsieh and I analyzed various approaches to tackle this issue. We 
 also looked at HBASE-5699, which talks about adding concurrent multi WALs. 
 Along with performance numbers, we also focussed on design simplicity, 
 minimum impact on MTTR  Replication, and compatibility with 0.96 and 0.98. 
 Considering all these parameters, we propose a new HLog implementation with 
 WAL Switching functionality.
 Please find attached the design doc for the same. It introduces the WAL 
 Switching feature, and experiments/results of a prototype implementation, 
 showing the benefits of this feature.
 The second goal of this work is to serve as a building block for concurrent 
 multiple WALs feature.
 Please review the doc.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HBASE-10319) HLog should roll periodically to allow DN decommission to eventually complete.

2014-01-10 Thread Jonathan Hsieh (JIRA)
Jonathan Hsieh created HBASE-10319:
--

 Summary: HLog should roll periodically to allow DN decommission to 
eventually complete.
 Key: HBASE-10319
 URL: https://issues.apache.org/jira/browse/HBASE-10319
 Project: HBase
  Issue Type: Bug
Reporter: Jonathan Hsieh


We encountered a situation where we had an essentially read-only table and 
attempted to do a clean HDFS DN decommission. A DN cannot decommission if there 
are blocks on it that are currently open for write. Because the HBase HLog file 
was open and held some data (the HLog header), the DN could not decommission 
itself. Since no new data is ever written, the existing periodic check is never 
triggered.

After discussing with [~atm], it seems that although an HDFS semantics change 
would be ideal (e.g. HBase wouldn't have to be aware of HDFS decommission and 
the client would roll over), that would take much more effort than having HBase 
periodically force a log roll. This would let the HDFS DN decommission complete.
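
A hedged sketch of the forced periodic roll being proposed; the chore class and 
the requestLogRoll hook are hypothetical, not the actual LogRoller code:
{code}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical chore: roll the HLog on a timer even if no edits arrived. */
public class PeriodicLogRollChore {
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

  public void start(Runnable requestLogRoll, long periodMs) {
    // Unlike an edits-driven check, this fires unconditionally, so a WAL that
    // only holds its header still gets closed and a new one opened, letting
    // the DN that hosts the old block finish decommissioning.
    scheduler.scheduleAtFixedRate(requestLogRoll, periodMs, periodMs, TimeUnit.MILLISECONDS);
  }

  public void stop() {
    scheduler.shutdownNow();
  }
}
{code}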



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

