[jira] [Commented] (HBASE-5833) 0.92 build has been failing pretty consistently on TestMasterFailover....

2012-04-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13258662#comment-13258662
 ] 

stack commented on HBASE-5833:
--

Looking at this more, the actual TMF tests seem to run pretty fast, but then 
the test gets stuck for a long time finishing up at the end... like 10x more 
time finishing than was spent running the test.  This long cleanup seems to be 
our 'timeout'.  In the TMF case, digging, there are orphan logsyncer threads 
outstanding trying to sync a filesystem that has since been closed.  This seems 
to be part of the issue (there are other orphan threads too: executors).  TMF 
does HRegion.createRegions but it doesn't pass in a WAL to use in creation.  
In this case, the method makes an individual WAL for the create to use.  In TMF, 
and in a bunch of other tests, this hlog is never closed, leaving logsyncer 
threads running (and some executors).  Will attach a patch that goes through 
all tests that use this special createRegions method (there are a bunch) and 
closes and cleans up their WALs.
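
A minimal sketch of the leak described above, using hypothetical stand-in classes (SketchLog, CloseLogSketch) rather than the real HRegion/HLog API: a log starts a background syncer thread when it is created, and the test helper closes it in a finally block so that no thread outlives the test.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a WAL that owns a background syncer thread. */
final class SketchLog {
  private final AtomicBoolean closed = new AtomicBoolean(false);
  private final Thread syncer;

  SketchLog(String name) {
    // The syncer loops until the log is closed, as an orphaned logsyncer would.
    syncer = new Thread(() -> {
      while (!closed.get()) {
        try {
          Thread.sleep(10);
        } catch (InterruptedException e) {
          return;  // close() interrupted us; stop syncing
        }
      }
    }, name + ".logSyncer");
    syncer.start();
  }

  boolean isSyncerAlive() {
    return syncer.isAlive();
  }

  /** Stops the syncer and waits for it, so the caller knows the thread is gone. */
  void close() {
    closed.set(true);
    syncer.interrupt();
    try {
      syncer.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}

final class CloseLogSketch {
  /** Creates a log for the "test", runs the body, and always closes the log. */
  static SketchLog runWithLog(Runnable testBody) {
    SketchLog log = new SketchLog("sketch");
    try {
      testBody.run();
    } finally {
      log.close();
    }
    return log;
  }

  public static void main(String[] args) {
    SketchLog log = runWithLog(() -> { });
    System.out.println("syncer alive after close: " + log.isSyncerAlive());
  }
}
```

Without the finally block, a test body that throws would leave the syncer running, which is the orphan-thread symptom seen at test shutdown.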

 0.92 build has been failing pretty consistently on TestMasterFailover
 -

 Key: HBASE-5833
 URL: https://issues.apache.org/jira/browse/HBASE-5833
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.92.2

 Attachments: 5833.txt


 Trunk seems fine but 0.92 fails on this test pretty regularly.  Running it 
 local it seems to hang for me.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5833) 0.92 build has been failing pretty consistently on TestMasterFailover....

2012-04-20 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5833:
-

Attachment: closehregions.txt

This patch is for 0.92 only; thats where I'm doing investigation. Doesn't apply 
to trunk yet.

 0.92 build has been failing pretty consistently on TestMasterFailover
 -

 Key: HBASE-5833
 URL: https://issues.apache.org/jira/browse/HBASE-5833
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.92.2

 Attachments: 5833.txt, closehregions.txt


 Trunk seems fine but 0.92 fails on this test pretty regularly.  Running it 
 local it seems to hang for me.





[jira] [Reopened] (HBASE-5833) 0.92 build has been failing pretty consistently on TestMasterFailover....

2012-04-20 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack reopened HBASE-5833:
--


Reopen

 0.92 build has been failing pretty consistently on TestMasterFailover
 -

 Key: HBASE-5833
 URL: https://issues.apache.org/jira/browse/HBASE-5833
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.92.2

 Attachments: 5833.txt, closehregions.txt


 Trunk seems fine but 0.92 fails on this test pretty regularly.  Running it 
 local it seems to hang for me.





[jira] [Commented] (HBASE-5844) Delete the region servers znode after a regions server crash

2012-04-21 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13258796#comment-13258796
 ] 

stack commented on HBASE-5844:
--

If we reach a count of 100, we just continue the startup?  Is that what you want?

{code}
+while (!tracker.checkIfBaseNodeAvailable() && ++count<100) {
+  Thread.sleep(100);
+}
{code}
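
One way to make the give-up case explicit, sketched with hypothetical names (BoundedWaitSketch, waitFor; the lambda stands in for tracker.checkIfBaseNodeAvailable()): return whether the condition was ever met, so the caller chooses between aborting and continuing the startup.

```java
import java.util.function.BooleanSupplier;

final class BoundedWaitSketch {
  /**
   * Polls the condition up to maxAttempts times, sleeping between attempts.
   * Returns true if the condition held, false if we gave up or were interrupted.
   */
  static boolean waitFor(BooleanSupplier condition, int maxAttempts, long sleepMillis) {
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      if (condition.getAsBoolean()) {
        return true;
      }
      try {
        Thread.sleep(sleepMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // Hypothetical stand-in for a base node that never shows up.
    boolean available = waitFor(() -> false, 3, 1);
    if (!available) {
      System.out.println("base node still missing; caller decides what to do");
    }
  }
}
```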

Follow the rest of the code regarding spaces, i.e. spaces around operators.  For example:


+
+if (fileName==null){


Maybe you don't need deleteMyEphemeralNodeOnDisk if you instead use 
http://docs.oracle.com/javase/6/docs/api/java/io/File.html#deleteOnExit() 
inside writeMyEphemeralNodeOnDisk?
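
A rough sketch of that suggestion, assuming the file simply holds the znode path. The class name, method name, temp-file location and example path are all hypothetical; only java.io.File#deleteOnExit() is the real API being illustrated.

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;

final class EphemeralNodeFileSketch {
  /** Writes the znode path to a temp file and registers it for deletion. */
  static File writeZnodeFile(String znodePath) {
    try {
      File file = File.createTempFile("znode", ".txt");
      try (Writer out = new FileWriter(file)) {
        out.write(znodePath);
      }
      // Deletion is attempted only on normal JVM termination;
      // after a crash the file is still on disk.
      file.deleteOnExit();
      return file;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    File f = writeZnodeFile("/hbase/rs/example-server");
    System.out.println("wrote " + f.length() + " bytes to " + f);
  }
}
```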

Patch looks good N.

We upped the timeout because noobs would install hbase then run big mapreduce 
jobs w/o tuning the jvm and so hit big GCs.  We figured they'd rather have their 
regionserver ride over the big pauses than have it be 'sensitive' out of the 
box.

 Delete the region servers znode after a regions server crash
 

 Key: HBASE-5844
 URL: https://issues.apache.org/jira/browse/HBASE-5844
 Project: HBase
  Issue Type: Improvement
  Components: regionserver, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 5844.v1.patch


 Today, if the region server crashes, its znode is not deleted in ZooKeeper. 
 So the recovery process will start only after a timeout, usually 30s.
 By deleting the znode in the start script, we remove this delay and the 
 recovery starts immediately.





[jira] [Updated] (HBASE-5621) Convert admin protocol of HRegionInterface to PB

2012-04-21 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5621:
-

Attachment: hbase_5621_v4.patch

Retrying

 Convert admin protocol of HRegionInterface to PB
 

 Key: HBASE-5621
 URL: https://issues.apache.org/jira/browse/HBASE-5621
 Project: HBase
  Issue Type: Sub-task
  Components: ipc, master, migration, regionserver
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.96.0

 Attachments: hbase-5621_v3.patch, hbase_5621_v4.patch, 
 hbase_5621_v4.patch








[jira] [Updated] (HBASE-5621) Convert admin protocol of HRegionInterface to PB

2012-04-21 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5621:
-

Status: Patch Available  (was: Open)

 Convert admin protocol of HRegionInterface to PB
 

 Key: HBASE-5621
 URL: https://issues.apache.org/jira/browse/HBASE-5621
 Project: HBase
  Issue Type: Sub-task
  Components: ipc, master, migration, regionserver
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.96.0

 Attachments: hbase-5621_v3.patch, hbase_5621_v4.patch, 
 hbase_5621_v4.patch








[jira] [Updated] (HBASE-5621) Convert admin protocol of HRegionInterface to PB

2012-04-21 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5621:
-

Status: Open  (was: Patch Available)

 Convert admin protocol of HRegionInterface to PB
 

 Key: HBASE-5621
 URL: https://issues.apache.org/jira/browse/HBASE-5621
 Project: HBase
  Issue Type: Sub-task
  Components: ipc, master, migration, regionserver
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.96.0

 Attachments: hbase-5621_v3.patch, hbase_5621_v4.patch, 
 hbase_5621_v4.patch








[jira] [Commented] (HBASE-5844) Delete the region servers znode after a regions server crash

2012-04-21 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13258988#comment-13258988
 ] 

stack commented on HBASE-5844:
--

Ok on your reasoning for not using deleteOnExit.  Try and have the two methods 
share more code like getting the name of the file w/ the znode name in it.  
Otherwise, sounds good.

 Delete the region servers znode after a regions server crash
 

 Key: HBASE-5844
 URL: https://issues.apache.org/jira/browse/HBASE-5844
 Project: HBase
  Issue Type: Improvement
  Components: regionserver, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 5844.v1.patch


 Today, if the region server crashes, its znode is not deleted in ZooKeeper. 
 So the recovery process will start only after a timeout, usually 30s.
 By deleting the znode in the start script, we remove this delay and the 
 recovery starts immediately.





[jira] [Commented] (HBASE-5621) Convert admin protocol of HRegionInterface to PB

2012-04-21 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13258994#comment-13258994
 ] 

stack commented on HBASE-5621:
--

We seem to have run a good few fewer tests, ~880 vs 906 or so.  Does it pass 
all tests for you, Jimmy?  Thanks.

 Convert admin protocol of HRegionInterface to PB
 

 Key: HBASE-5621
 URL: https://issues.apache.org/jira/browse/HBASE-5621
 Project: HBase
  Issue Type: Sub-task
  Components: ipc, master, migration, regionserver
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.96.0

 Attachments: hbase-5621_v3.patch, hbase_5621_v4.patch, 
 hbase_5621_v4.patch








[jira] [Commented] (HBASE-5833) 0.92 build has been failing pretty consistently on TestMasterFailover....

2012-04-21 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13258998#comment-13258998
 ] 

stack commented on HBASE-5833:
--

More digging.  The newest test added here, 
testShouldCheckMasterFailOverWhenMETAIsInOpenedState, is a little interesting.  
It was added by this commit:

{code}

r1172063 | tedyu | 2011-09-17 13:27:00 -0700 (Sat, 17 Sep 2011) | 3 lines

HBASE-4400  .META. getting stuck if RS hosting it is dead and znode state is in
   RS_ZK_REGION_OPENED (Ramkrishna)

{code}

The test is a bunch of copy/paste confirming stuff it's not using.  It then does 
a cluster shutdown, but does it explicitly on a cluster object and not via 
HBaseTestingUtility, though it subsequently starts a cluster with 
HBaseTestingUtility.  Not using HTU to do both the shutdown and the startup can 
leave the HTU state confused about whether there is a master available, so we 
just wait forever.  This seems to be responsible for the case where the test 
would time out after 15 minutes and say no tests run and none failed.

I added a timeout for this test of 3 minutes.

Other interesting stuff is that this TestMasterFailover starts clusters per 
method but shutdown leaves around some threads.  I dug in some and was able to 
clean up an LruBlockCache eviction thread but others persist and would take a 
little more work to undo.  They seem harmless but I'll list them anyways:

{code}
TestMasterFailover [JUnit]
  org.eclipse.jdt.internal.junit.runner.RemoteTestRunner at localhost:54811
    Thread [main] (Running)
    Thread [ReaderThread] (Running)
    Thread [Thread-2] (Suspended (breakpoint at line 587 in HBaseTestingUtility))
      HBaseTestingUtility.shutdownMiniCluster() line: 587
      TestMasterFailover.testSimpleMasterFailover() line: 178
      NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]
      NativeMethodAccessorImpl.invoke(Object, Object[]) line: 39
      DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 25
      Method.invoke(Object, Object...) line: 597
      FrameworkMethod$1.runReflectiveCall() line: 45
      FrameworkMethod$1(ReflectiveCallable).run() line: 15
      FrameworkMethod.invokeExplosively(Object, Object...) line: 42
      InvokeMethod.evaluate() line: 20
      FailOnTimeout$StatementThread.run() line: 62
    Daemon Thread [Poller SunPKCS11-Darwin] (Running)
    Thread [pool-1-thread-1] (Running)
    Thread [pool-2-thread-1] (Running)
    Thread [pool-3-thread-1] (Running)
    Thread [pool-4-thread-1] (Running)
    Daemon Thread [LeaseChecker] (Running)
    Daemon Thread [RegionServer:2;192.168.1.74,54842,1335066804457.decayingSampleTick.1] (Running)
    Daemon Thread [Master:2;192.168.1.74,54838,1335066803952-SendThread(fe80:0:0:0:0:0:0:1%1:21818)] (Running)
    Daemon Thread [Master:2;192.168.1.74,54838,1335066803952-EventThread] (Running)
    Daemon Thread [Master:1;192.168.1.74,54836,1335066798880-EventThread] (Running)
    Daemon Thread [Master:1;192.168.1.74,54836,1335066798880-SendThread(localhost:21818)] (Running)
  /System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home/bin/java (Apr 21, 2012 8:53:07 PM)
{code}

The thread names are enhanced (v2 of this patch), but things like 
decayingSampleTick are set in a static so are hard to get rid of in test setup.  
The SendThread/EventThread are zk client hangouts.  Not sure what threads like 
pool-4-thread-1 are (I've enhanced the HTable executor to include htable in its 
name so these are identifiable going forward, but the above executors do not 
seem to be HTable).

 0.92 build has been failing pretty consistently on TestMasterFailover
 -

 Key: HBASE-5833
 URL: https://issues.apache.org/jira/browse/HBASE-5833
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.92.2

 Attachments: 5833.txt, closehregions.txt


 Trunk seems fine but 0.92 fails on this test pretty regularly.  Running it 
 local it seems to hang for me.


[jira] [Updated] (HBASE-5833) 0.92 build has been failing pretty consistently on TestMasterFailover....

2012-04-22 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5833:
-

Attachment: 5833-v2.092.txt

{code}
M src/main/java/org/apache/hadoop/hbase/client/HTable.java
  Give the htable executor an htable prefix.
M src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
  Add a shutdown so I can close out the eviction thread.
M src/main/java/org/apache/hadoop/hbase/metrics/histogram/ExponentiallyDecayingSample.java
  Add hosting thread prefix to the executor made thread names.
M src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
  Add javadoc warning regions made must be closed when done and their
  logs closed too to avoid leaving threads hanging.
  Added utility closeHRegion to answer createHRegion.
M src/main/java/org/apache/hadoop/hbase/util/JVMClusterUtil.java
  When closing out masters in the cluster, close out the backup masters first.
M src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
  Javadoc that you have to close out regions made with createNewRegion
  and close their log.
M src/test/java/org/apache/hadoop/hbase/filter/TestColumnPrefixFilter.java
M src/test/java/org/apache/hadoop/hbase/filter/TestDependentColumnFilter.java
M src/test/java/org/apache/hadoop/hbase/filter/TestFilter.java
M src/test/java/org/apache/hadoop/hbase/filter/TestMultipleColumnPrefixFilter.java
M src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java
M src/test/java/org/apache/hadoop/hbase/regionserver/TestBlocksRead.java
M src/test/java/org/apache/hadoop/hbase/regionserver/TestColumnSeeking.java
M src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
M src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
  Do close of region and close and delete of wal log when done w/ tests.
M src/test/java/org/apache/hadoop/hbase/master/TestMasterFailover.java
  Do close of region and close and delete of wal log when done w/ tests.
  Also, some cleanup in testShouldCheckMasterFailOverWhenMETAIsInOpenedState
  Removed useless code, made it use HBaseTestingUtility to do cluster shutdown
  so HTU was clear on what state of test was and added a timeout.
{code}

 0.92 build has been failing pretty consistently on TestMasterFailover
 -

 Key: HBASE-5833
 URL: https://issues.apache.org/jira/browse/HBASE-5833
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.92.2

 Attachments: 5833-v2.092.txt, 5833.txt, closehregions.txt


 Trunk seems fine but 0.92 fails on this test pretty regularly.  Running it 
 local it seems to hang for me.





[jira] [Updated] (HBASE-5833) 0.92 build has been failing pretty consistently on TestMasterFailover....

2012-04-22 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5833:
-

Attachment: 5833-trunk.txt

Trunk patch

 0.92 build has been failing pretty consistently on TestMasterFailover
 -

 Key: HBASE-5833
 URL: https://issues.apache.org/jira/browse/HBASE-5833
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.92.2

 Attachments: 5833-trunk.txt, 5833-v2.092.txt, 5833.txt, 
 closehregions.txt


 Trunk seems fine but 0.92 fails on this test pretty regularly.  Running it 
 local it seems to hang for me.





[jira] [Updated] (HBASE-5833) 0.92 build has been failing pretty consistently on TestMasterFailover....

2012-04-22 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5833:
-

Status: Patch Available  (was: Reopened)

Trying trunk patch against hadoopqa.  Trunk patch is not as fat as the 
0.92/0.90 patch because a bunch of the fixing has been done in trunk already.

 0.92 build has been failing pretty consistently on TestMasterFailover
 -

 Key: HBASE-5833
 URL: https://issues.apache.org/jira/browse/HBASE-5833
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.92.2

 Attachments: 5833-trunk.txt, 5833-v2.092.txt, 5833.txt, 
 closehregions.txt


 Trunk seems fine but 0.92 fails on this test pretty regularly.  Running it 
 local it seems to hang for me.





[jira] [Updated] (HBASE-5833) 0.92 build has been failing pretty consistently on TestMasterFailover....

2012-04-22 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5833:
-

Attachment: 5833v3092.txt

Running tests, I found that I'd failed to convert a method in TestHRegion.

 0.92 build has been failing pretty consistently on TestMasterFailover
 -

 Key: HBASE-5833
 URL: https://issues.apache.org/jira/browse/HBASE-5833
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.92.2

 Attachments: 5833-trunk.txt, 5833-v2.092.txt, 5833.txt, 
 5833v3092.txt, closehregions.txt


 Trunk seems fine but 0.92 fails on this test pretty regularly.  Running it 
 local it seems to hang for me.





[jira] [Updated] (HBASE-5833) 0.92 build has been failing pretty consistently on TestMasterFailover....

2012-04-22 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5833:
-

Attachment: 5833v4092.txt

Address last of Ted's comments.

 0.92 build has been failing pretty consistently on TestMasterFailover
 -

 Key: HBASE-5833
 URL: https://issues.apache.org/jira/browse/HBASE-5833
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.92.2

 Attachments: 5833-trunk.txt, 5833-v2.092.txt, 5833.txt, 
 5833v3092.txt, 5833v4092.txt, closehregions.txt


 Trunk seems fine but 0.92 fails on this test pretty regularly.  Running it 
 local it seems to hang for me.





[jira] [Commented] (HBASE-5833) 0.92 build has been failing pretty consistently on TestMasterFailover....

2012-04-22 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13259355#comment-13259355
 ] 

stack commented on HBASE-5833:
--

I ran through complete test suite twice and got the following failures (I fixed 
the failed TestHRegion w/ last patch).  This is on 092.

{code}
Results :

Failed tests:
  testGetScanner_WithOkFamilies(org.apache.hadoop.hbase.regionserver.TestHRegion): Families could not be found in Region
  testCyclicReplication(org.apache.hadoop.hbase.replication.TestMasterReplication): Waited too much time for put replication
  testLeaderSelection(org.apache.hadoop.hbase.zookeeper.TestZKLeaderManager): New leader should exist after stop

Tests in error:
  testTableOperations(org.apache.hadoop.hbase.coprocessor.TestMasterObserver): org.apache.hadoop.hbase.InvalidFamilyOperationException: Column family 'fam2' does not exist
  testRegionTransitionOperations(org.apache.hadoop.hbase.coprocessor.TestMasterObserver): org.apache.hadoop.hbase.TableExistsException: observed_table
  testShouldCheckMasterFailOverWhenMETAIsInOpenedState(org.apache.hadoop.hbase.master.TestMasterFailover): test timed out after 18 milliseconds
  testSimplePutDelete(org.apache.hadoop.hbase.replication.TestMasterReplication): Cluster already running at /Users/stack/checkouts/hbase/target/test-data/49fb7ad7-156f-4461-964b-7bb54d70db63/dfscluster_c7c5efc0-4085-4914-9be5-c2b3e1530af9

Tests run: 1074, Failures: 3, Errors: 4, Skipped: 8



Results :

Failed tests:
  testGetScanner_WithOkFamilies(org.apache.hadoop.hbase.regionserver.TestHRegion): Families could not be found in Region
  testCyclicReplication(org.apache.hadoop.hbase.replication.TestMasterReplication): Waited too much time for put replication

Tests in error:
  org.apache.hadoop.hbase.client.TestAdmin: No server address listed in -ROOT- for region .META.,,1.1028785192
  testSimplePutDelete(org.apache.hadoop.hbase.replication.TestMasterReplication): Cluster already running at /Users/stack/checkouts/hbase/target/test-data/fcfffe27-e2ce-4f91-84fc-bd65521b6426/dfscluster_bd55ee62-0532-40d4-86e5-487bd45d9565

Tests run: 1075, Failures: 2, Errors: 2, Skipped: 8
{code}

Let me commit this on 0.92.  Will then work on getting it into 0.90 (the same 
breakage is there, but it will take some massaging and testing to get this 
patch in) and ditto on 0.94 and trunk (something is up w/ TestHRegion on 
trunk at the moment).

 0.92 build has been failing pretty consistently on TestMasterFailover
 -

 Key: HBASE-5833
 URL: https://issues.apache.org/jira/browse/HBASE-5833
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.92.2

 Attachments: 5833-trunk.txt, 5833-v2.092.txt, 5833.txt, 
 5833v3092.txt, 5833v4092.txt, closehregions.txt


 Trunk seems fine but 0.92 fails on this test pretty regularly.  Running it 
 local it seems to hang for me.





[jira] [Commented] (HBASE-5833) 0.92 build has been failing pretty consistently on TestMasterFailover....

2012-04-22 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13259380#comment-13259380
 ] 

stack commented on HBASE-5833:
--

Committed 5833v4092.txt to 0.92 branch.

 0.92 build has been failing pretty consistently on TestMasterFailover
 -

 Key: HBASE-5833
 URL: https://issues.apache.org/jira/browse/HBASE-5833
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.92.2

 Attachments: 5833-trunk.txt, 5833-v2.092.txt, 5833.txt, 
 5833v3092.txt, 5833v4092.txt, closehregions.txt


 Trunk seems fine but 0.92 fails on this test pretty regularly.  Running it 
 local it seems to hang for me.





[jira] [Updated] (HBASE-5833) 0.92 build has been failing pretty consistently on TestMasterFailover....

2012-04-23 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5833:
-

Attachment: 5833v4090.txt

Here is patch for 0.90.  Ran all tests and all pass.  Will commit.

 0.92 build has been failing pretty consistently on TestMasterFailover
 -

 Key: HBASE-5833
 URL: https://issues.apache.org/jira/browse/HBASE-5833
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.92.2

 Attachments: 5833-trunk.txt, 5833-v2.092.txt, 5833.txt, 
 5833v3092.txt, 5833v4090.txt, 5833v4092.txt, closehregions.txt


 Trunk seems fine but 0.92 fails on this test pretty regularly.  Running it 
 local it seems to hang for me.





[jira] [Commented] (HBASE-5857) RIT map in RS not getting cleared while region opening

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13259649#comment-13259649
 ] 

stack commented on HBASE-5857:
--

@Chinna Good find.  Nice use of Mockito Whitebox too.  Does your test confirm 
the fix though?  Maybe I'm not reading it right.  Should it check the 
regionserver instance does not have the test region in its 
this.regionsInTransitionInRS Map?  Thanks.

 RIT map in RS not getting cleared while region opening
 --

 Key: HBASE-5857
 URL: https://issues.apache.org/jira/browse/HBASE-5857
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Chinna Rao Lalam
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-5857_0.92.patch, HBASE-5857_94.patch, 
 HBASE-5857_trunk.patch


 While opening the region in the RS, after adding the region to 
 regionsInTransitionInRS, if tableDescriptors.get() throws an exception the 
 region won't be cleared from regionsInTransitionInRS.  So the next time it 
 tries to open the region in the same RS, it will throw 
 RegionAlreadyInTransitionException.
 If the two statements below are swapped, this issue won't occur.
 {code}
 this.regionsInTransitionInRS.putIfAbsent(region.getEncodedNameAsBytes(),true);
 HTableDescriptor htd = this.tableDescriptors.get(region.getTableName());
 {code}
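
 The failure mode can be sketched in isolation. The class and method names below are hypothetical stand-ins, not the region server code: the map entry is added first, and a catch block removes it again if the descriptor lookup throws, so a later open of the same region is not rejected.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

/** Hypothetical stand-in for the region server's in-transition bookkeeping. */
final class RegionsInTransitionSketch {
  final ConcurrentMap<String, Boolean> regionsInTransition =
      new ConcurrentHashMap<String, Boolean>();

  /**
   * Marks the region as in transition, then loads its descriptor.
   * If loading throws, the mark is removed so a later open can be retried.
   */
  <T> T openRegion(String encodedName, Supplier<T> loadDescriptor) {
    if (regionsInTransition.putIfAbsent(encodedName, Boolean.TRUE) != null) {
      throw new IllegalStateException(encodedName + " is already in transition");
    }
    try {
      return loadDescriptor.get();
    } catch (RuntimeException e) {
      regionsInTransition.remove(encodedName);  // undo the mark on failure
      throw e;
    }
  }

  public static void main(String[] args) {
    RegionsInTransitionSketch rs = new RegionsInTransitionSketch();
    try {
      rs.openRegion("region-1", () -> {
        throw new IllegalArgumentException("descriptor lookup failed");
      });
    } catch (IllegalArgumentException expected) {
      System.out.println("first open failed: " + expected.getMessage());
    }
    // The failed attempt left no entry behind, so the retry is accepted.
    System.out.println("retry result: " + rs.openRegion("region-1", () -> "descriptor"));
  }
}
```

 Swapping the two statements, as the description proposes, reaches the same end state without the catch block, because nothing is in the map yet when the lookup fails.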





[jira] [Commented] (HBASE-5861) Hadoop 23 compile broken due to tests introduced in HBASE-5064

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13259773#comment-13259773
 ] 

stack commented on HBASE-5861:
--

@Jon Which code breaks the build?  The hbase-5064 changes have been in there 
for a good while now.

 Hadoop 23 compile broken due to tests introduced in HBASE-5064 
 ---

 Key: HBASE-5861
 URL: https://issues.apache.org/jira/browse/HBASE-5861
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Priority: Blocker
 Fix For: 0.94.0, 0.96.0


 When attempting to compile HBase 0.94rc1 against hadoop 23, I got this set of 
 compilation error messages:
 {code}
 jon@swoop:~/proj/hbase-0.94$ mvn clean test -Dhadoop.profile=23 -DskipTests
 ...
 [INFO] ------------------------------------------------------------------------
 [INFO] BUILD FAILURE
 [INFO] ------------------------------------------------------------------------
 [INFO] Total time: 18.926s
 [INFO] Finished at: Mon Apr 23 10:38:47 PDT 2012
 [INFO] Final Memory: 55M/555M
 [INFO] ------------------------------------------------------------------------
 [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile (default-testCompile) on project hbase: Compilation failure: Compilation failure:
 [ERROR] /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[147,46] org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
 [ERROR] 
 [ERROR] /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[153,29] org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
 [ERROR] 
 [ERROR] /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[194,46] org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
 [ERROR] 
 [ERROR] /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[206,29] org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
 [ERROR] 
 [ERROR] /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[213,29] org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
 [ERROR] 
 [ERROR] /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[226,29] org.apache.hadoop.mapreduce.TaskAttemptContext is abstract; cannot be instantiated
 [ERROR] -> [Help 1]
 {code}
 Upon further investigation this issue is due to code introduced in HBASE-5064 
 and is also present in trunk.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5830) Cleanup SequenceFileLogWriter to use syncFs api from SequenceFile#Writer directly in trunk.

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13259784#comment-13259784
 ] 

stack commented on HBASE-5830:
--

Uma, will this work for both Hadoop 1.0.x and Hadoop 2.x?

 Cleanup SequenceFileLogWriter to use syncFs api from SequenceFile#Writer 
 directly in trunk.
 ---

 Key: HBASE-5830
 URL: https://issues.apache.org/jira/browse/HBASE-5830
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.96.0
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G
 Attachments: HBASE-5830.patch








[jira] [Commented] (HBASE-5621) Convert admin protocol of HRegionInterface to PB

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13259788#comment-13259788
 ] 

stack commented on HBASE-5621:
--

TestHRegion has a problem in the way it's written.  Will fix it in the 
HBASE-5833 commit that is coming up.

 Convert admin protocol of HRegionInterface to PB
 

 Key: HBASE-5621
 URL: https://issues.apache.org/jira/browse/HBASE-5621
 Project: HBase
  Issue Type: Sub-task
  Components: ipc, master, migration, regionserver
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.96.0

 Attachments: hbase-5621_v3.patch, hbase_5621_v4.patch, 
 hbase_5621_v4.patch, hbase_5621_v5.patch








[jira] [Updated] (HBASE-5621) Convert admin protocol of HRegionInterface to PB

2012-04-23 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5621:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Applied to trunk.  Thanks for the patch Jimmy.

 Convert admin protocol of HRegionInterface to PB
 

 Key: HBASE-5621
 URL: https://issues.apache.org/jira/browse/HBASE-5621
 Project: HBase
  Issue Type: Sub-task
  Components: ipc, master, migration, regionserver
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.96.0

 Attachments: hbase-5621_v3.patch, hbase_5621_v4.patch, 
 hbase_5621_v4.patch, hbase_5621_v5.patch








[jira] [Commented] (HBASE-5830) Cleanup SequenceFileLogWriter to use syncFs api from SequenceFile#Writer directly in trunk.

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13259802#comment-13259802
 ] 

stack commented on HBASE-5830:
--

+1 on patch then.

 Cleanup SequenceFileLogWriter to use syncFs api from SequenceFile#Writer 
 directly in trunk.
 ---

 Key: HBASE-5830
 URL: https://issues.apache.org/jira/browse/HBASE-5830
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.96.0
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G
 Attachments: HBASE-5830.patch








[jira] [Commented] (HBASE-5851) TestProcessBasedCluster sometimes fails

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13259805#comment-13259805
 ] 

stack commented on HBASE-5851:
--

Is this secure HBase only?  If so, we should change the subject.  Thanks for 
digging in, Jimmy.  Make it a blocker too?

 TestProcessBasedCluster sometimes fails
 ---

 Key: HBASE-5851
 URL: https://issues.apache.org/jira/browse/HBASE-5851
 Project: HBase
  Issue Type: Test
Reporter: Zhihong Yu
Assignee: Jimmy Xiang

 TestProcessBasedCluster failed in 
 https://builds.apache.org/job/HBase-TRUNK-security/178
 Looks like cluster failed to start:
 {code}
 2012-04-21 14:22:32,666 INFO  [Thread-1] 
 util.ProcessBasedLocalHBaseCluster(176): Waiting for HBase to startup. 
 Retries left: 2
 java.io.IOException: Giving up trying to location region in meta: thread is 
 interrupted.
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1173)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:956)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:917)
   at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:252)
   at org.apache.hadoop.hbase.client.HTable.init(HTable.java:192)
   at 
 org.apache.hadoop.hbase.util.ProcessBasedLocalHBaseCluster.startHBase(ProcessBasedLocalHBaseCluster.java:174)
   at 
 org.apache.hadoop.hbase.util.TestProcessBasedCluster.testProcessBasedCluster(TestProcessBasedCluster.java:56)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
   at 
 org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
   at 
 org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
   at 
 org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
   at 
 org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:62)
 java.lang.InterruptedException: sleep interrupted at 
 java.lang.Thread.sleep(Native Method)
   at org.apache.hadoop.hbase.util.Threads.sleep(Threads.java:134)
   at 
 org.apache.hadoop.hbase.util.ProcessBasedLocalHBaseCluster.startHBase(ProcessBasedLocalHBaseCluster.java:178)
   at 
 org.apache.hadoop.hbase.util.TestProcessBasedCluster.testProcessBasedCluster(TestProcessBasedCluster.java:56)
 {code}





[jira] [Commented] (HBASE-5844) Delete the region servers znode after a regions server crash

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13259811#comment-13259811
 ] 

stack commented on HBASE-5844:
--

Should the master write out its znode name too?  If it crashes, could this code 
bring up the second master faster?

 Delete the region servers znode after a regions server crash
 

 Key: HBASE-5844
 URL: https://issues.apache.org/jira/browse/HBASE-5844
 Project: HBase
  Issue Type: Improvement
  Components: regionserver, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 5844.v1.patch, 5844.v2.patch


 Today, if the region server crashes, its znode is not deleted in ZooKeeper, 
 so the recovery process starts only after a timeout, usually 30s.
 By deleting the znode in the start script, we remove this delay and recovery 
 starts immediately.





[jira] [Commented] (HBASE-5856) byte - String is not consistent between HBaseAdmin and HRegionInfo

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13259817#comment-13259817
 ] 

stack commented on HBASE-5856:
--

I think regionNameStr in HRI is wrong.  We should only do toBytesBinary for the 
cases where we are outputting in shell or in ui; toBytesBinary is for human 
consumption.  HBase should be about undoctored bytes.

 byte - String is not consistent between HBaseAdmin and HRegionInfo
 

 Key: HBASE-5856
 URL: https://issues.apache.org/jira/browse/HBASE-5856
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.6, 0.92.1
Reporter: binlijin

 In HBaseAdmin:
   public void split(final String tableNameOrRegionName)
       throws IOException, InterruptedException {
     split(Bytes.toBytes(tableNameOrRegionName));  // String -> byte[]
   }
 In HRegionInfo:
   this.regionNameStr = Bytes.toStringBinary(this.regionName);  // byte[] -> String
 Should we use Bytes.toBytesBinary in HBaseAdmin?
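
To make the mismatch in the report concrete, here is a minimal, self-contained sketch. The two helper methods are simplified stand-ins written for this illustration (they escape the backslash and any byte outside printable ASCII as \xNN); they are not the HBase Bytes implementations, whose exact escaping rules may differ. The plain UTF-8 encoding stands in for Bytes.toBytes(String). The class name and sample region name are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class BinaryNameRoundTrip {
    // Simplified stand-in for Bytes.toStringBinary: printable ASCII is kept,
    // the backslash and every other byte are written as \xNN.
    static String toStringBinary(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte value : bytes) {
            int ch = value & 0xFF;
            if (ch >= 0x20 && ch <= 0x7E && ch != '\\') {
                sb.append((char) ch);
            } else {
                sb.append(String.format("\\x%02X", ch));
            }
        }
        return sb.toString();
    }

    // Simplified stand-in for Bytes.toBytesBinary: inverse of the method above.
    // Assumes every \x is followed by two hex digits, as the encoder guarantees.
    static byte[] toBytesBinary(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\' && i + 3 < s.length() && s.charAt(i + 1) == 'x') {
                out.write(Integer.parseInt(s.substring(i + 2, i + 4), 16));
                i += 3; // skip the "xNN" that was just consumed
            } else {
                out.write(c);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] regionName = {'t', ',', 0x00, (byte) 0xFF}; // hypothetical binary name
        String shown = toStringBinary(regionName);          // what a user would see
        byte[] viaPlain = shown.getBytes(StandardCharsets.UTF_8);
        byte[] viaBinary = toBytesBinary(shown);
        System.out.println("printable form:          " + shown);
        System.out.println("plain decode matches:    " + Arrays.equals(viaPlain, regionName));
        System.out.println("escaped decode matches:  " + Arrays.equals(viaBinary, regionName));
    }
}
```

The point of the sketch is only that a name rendered with an escaping encoder cannot be turned back into the original bytes by a plain string-to-bytes conversion; which side should change is the open question in this issue.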





[jira] [Commented] (HBASE-5833) 0.92 build has been failing pretty consistently on TestMasterFailover....

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13259866#comment-13259866
 ] 

stack commented on HBASE-5833:
--

Two 0.92 builds in a row just passed (previously it failed four in a row).  Of 
the last three 0.90 builds, two passed; the previous four failed in a row (the 
failure in the middle was a timed-out TestShell).



 0.92 build has been failing pretty consistently on TestMasterFailover
 -

 Key: HBASE-5833
 URL: https://issues.apache.org/jira/browse/HBASE-5833
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.92.2

 Attachments: 5833-trunk.txt, 5833-v2.092.txt, 5833.txt, 
 5833v3092.txt, 5833v4090.txt, 5833v4092.txt, closehregions.txt


 Trunk seems fine but 0.92 fails on this test pretty regularly.  Running it 
 local it seems to hang for me.





[jira] [Updated] (HBASE-5833) 0.92 build has been failing pretty consistently on TestMasterFailover....

2012-04-23 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5833:
-

Attachment: 5833v4trunk.txt

Patch for trunk.  Smaller than the 0.92 patch because some of its work is 
already done in trunk.

One extra in this patch is a fix for TestHRegion.  The test 
testDataCorrectnessReplayingRecoveredEdits had a flaky way of calculating 
which regionserver to kill, so we'd end up stuck in an endless loop because we'd 
want to move a region back to where it was already running.  I fixed the 
calculation and moved this test out to a new class, TestHRegionOnCluster, 
because it was the only test in TestHRegion that spun up a cluster.
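
As an illustration of the kind of calculation involved (this is not the code from the patch, which is not shown in this thread), a target server can be chosen so that it can never be the server the region is already on. The class and method names below are hypothetical, and servers are modelled as plain indices.

```java
class MoveTargetPicker {
    // Returns an index in [0, serverCount) that is guaranteed to differ from
    // currentIndex, so a move request always changes the hosting server.
    static int pickOtherServer(int currentIndex, int serverCount) {
        if (serverCount < 2) {
            throw new IllegalArgumentException("need at least two servers to move a region");
        }
        if (currentIndex < 0 || currentIndex >= serverCount) {
            throw new IllegalArgumentException("currentIndex out of range: " + currentIndex);
        }
        return (currentIndex + 1) % serverCount;
    }

    public static void main(String[] args) {
        int servers = 3;
        for (int current = 0; current < servers; current++) {
            System.out.println("region on rs" + current
                + " -> move to rs" + pickOtherServer(current, servers));
        }
    }
}
```

A test that waits for the region to appear on the chosen server then terminates, because the chosen server is never the one already hosting the region.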

 0.92 build has been failing pretty consistently on TestMasterFailover
 -

 Key: HBASE-5833
 URL: https://issues.apache.org/jira/browse/HBASE-5833
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.92.2

 Attachments: 5833-trunk.txt, 5833-v2.092.txt, 5833.txt, 
 5833v3092.txt, 5833v4090.txt, 5833v4092.txt, 5833v4trunk.txt, 
 closehregions.txt


 Trunk seems fine but 0.92 fails on this test pretty regularly.  Running it 
 local it seems to hang for me.





[jira] [Updated] (HBASE-5833) 0.92 build has been failing pretty consistently on TestMasterFailover....

2012-04-23 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5833:
-

Attachment: 5833v5trunk.txt

What I committed to trunk.

 0.92 build has been failing pretty consistently on TestMasterFailover
 -

 Key: HBASE-5833
 URL: https://issues.apache.org/jira/browse/HBASE-5833
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.92.2

 Attachments: 5833-trunk.txt, 5833-v2.092.txt, 5833.txt, 
 5833v3092.txt, 5833v4090.txt, 5833v4092.txt, 5833v4trunk.txt, 
 5833v5trunk.txt, closehregions.txt


 Trunk seems fine but 0.92 fails on this test pretty regularly.  Running it 
 local it seems to hang for me.





[jira] [Updated] (HBASE-5833) 0.92 build has been failing pretty consistently on TestMasterFailover....

2012-04-23 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5833:
-

Attachment: 5833v5094.txt

What I applied to 0.94 branch.

 0.92 build has been failing pretty consistently on TestMasterFailover
 -

 Key: HBASE-5833
 URL: https://issues.apache.org/jira/browse/HBASE-5833
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.90.7, 0.92.2, 0.94.1

 Attachments: 5833-trunk.txt, 5833-v2.092.txt, 5833.txt, 
 5833v3092.txt, 5833v4090.txt, 5833v4092.txt, 5833v4trunk.txt, 5833v5094.txt, 
 5833v5trunk.txt, closehregions.txt


 Trunk seems fine but 0.92 fails on this test pretty regularly.  Running it 
 local it seems to hang for me.





[jira] [Updated] (HBASE-5833) 0.92 build has been failing pretty consistently on TestMasterFailover....

2012-04-23 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5833:
-

   Resolution: Fixed
Fix Version/s: 0.94.1
   0.90.7
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to 0.90, 0.92, 0.94, and trunk.

 0.92 build has been failing pretty consistently on TestMasterFailover
 -

 Key: HBASE-5833
 URL: https://issues.apache.org/jira/browse/HBASE-5833
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.90.7, 0.92.2, 0.94.1

 Attachments: 5833-trunk.txt, 5833-v2.092.txt, 5833.txt, 
 5833v3092.txt, 5833v4090.txt, 5833v4092.txt, 5833v4trunk.txt, 5833v5094.txt, 
 5833v5trunk.txt, closehregions.txt


 Trunk seems fine but 0.92 fails on this test pretty regularly.  Running it 
 local it seems to hang for me.





[jira] [Commented] (HBASE-5844) Delete the region servers znode after a regions server crash

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13259921#comment-13259921
 ] 

stack commented on HBASE-5844:
--

@N OK.  I'll commit this, then in a new JIRA move the common code out to util 
or some such place.  I'll address Ted's comment on commit.

 Delete the region servers znode after a regions server crash
 

 Key: HBASE-5844
 URL: https://issues.apache.org/jira/browse/HBASE-5844
 Project: HBase
  Issue Type: Improvement
  Components: regionserver, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 5844.v1.patch, 5844.v2.patch


 Today, if the region server crashes, its znode is not deleted in ZooKeeper, 
 so the recovery process starts only after a timeout, usually 30s.
 By deleting the znode in the start script, we remove this delay and recovery 
 starts immediately.





[jira] [Resolved] (HBASE-5844) Delete the region servers znode after a regions server crash

2012-04-23 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-5844.
--

   Resolution: Fixed
Fix Version/s: 0.96.0
 Hadoop Flags: Reviewed

Committed to trunk.  Thanks for the patch N.

 Delete the region servers znode after a regions server crash
 

 Key: HBASE-5844
 URL: https://issues.apache.org/jira/browse/HBASE-5844
 Project: HBase
  Issue Type: Improvement
  Components: regionserver, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.96.0

 Attachments: 5844.v1.patch, 5844.v2.patch, 5844.v3.patch


 Today, if the region server crashes, its znode is not deleted in ZooKeeper, 
 so the recovery process starts only after a timeout, usually 30s.
 By deleting the znode in the start script, we remove this delay and recovery 
 starts immediately.





[jira] [Updated] (HBASE-5844) Delete the region servers znode after a regions server crash

2012-04-23 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5844:
-

Attachment: 5844.v3.patch

What I committed.  It's v2 plus a change addressing Ted's comment.

 Delete the region servers znode after a regions server crash
 

 Key: HBASE-5844
 URL: https://issues.apache.org/jira/browse/HBASE-5844
 Project: HBase
  Issue Type: Improvement
  Components: regionserver, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.96.0

 Attachments: 5844.v1.patch, 5844.v2.patch, 5844.v3.patch


 Today, if the region server crashes, its znode is not deleted in ZooKeeper, 
 so the recovery process starts only after a timeout, usually 30s.
 By deleting the znode in the start script, we remove this delay and recovery 
 starts immediately.





[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13259928#comment-13259928
 ] 

stack commented on HBASE-5849:
--

Sounds good Enis.  What should RS do then?

 On first cluster startup, RS aborts if root znode is not available
 --

 Key: HBASE-5849
 URL: https://issues.apache.org/jira/browse/HBASE-5849
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar

 When launching a brand-new cluster, the master has to be started first, which 
 can create a race condition when the master and RS are started at the same time. 
 Master startup code is something like this: 
  - establish zk connection
  - create root znodes in zk (/hbase)
  - create ephemeral node for master (/hbase/master)
 Region server startup code is something like this: 
  - establish zk connection
  - check whether the root znode (/hbase) is there. If not, shut down. 
  - wait for the master to create the znode /hbase/master
 So the problem is that on the very first launch of the cluster, the RS aborts 
 at startup since the /hbase znode might not have been created yet (only the 
 master creates it if needed). Since /hbase/ is not deleted on cluster shutdown, 
 on subsequent cluster starts it does not matter in which order the servers are 
 started. So this affects only first launches. 
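
One possible alternative to aborting, sketched here purely for illustration, is for the region server to poll for the root znode for a bounded time before giving up. This is not taken from the attached patch. The existence check is passed in as a callback so the sketch stays self-contained; an in-memory set stands in for ZooKeeper, and the class name, method name, and timeout values are all hypothetical.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

class RootZNodeWaiter {
    // Polls rootExists until it reports true or timeoutMs has elapsed.
    // Returns true if the node appeared, false on timeout or interruption.
    static boolean waitForRootZNode(BooleanSupplier rootExists, long timeoutMs, long pollMs) {
        long start = System.nanoTime();
        long timeoutNanos = TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (!rootExists.getAsBoolean()) {
            if (System.nanoTime() - start >= timeoutNanos) {
                return false;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> znodes = ConcurrentHashMap.newKeySet(); // in-memory stand-in for ZooKeeper
        Thread master = new Thread(() -> {
            try {
                Thread.sleep(50); // the master comes up slightly after the RS
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            znodes.add("/hbase");
        });
        master.start();
        boolean ok = waitForRootZNode(() -> znodes.contains("/hbase"), 5_000, 10);
        System.out.println(ok ? "root znode present, continuing startup"
                              : "timed out waiting for root znode, aborting");
    }
}
```

With a real ZooKeeper client a watch on the parent path would avoid polling; the loop above only shows that startup order stops mattering once the RS waits instead of shutting down.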





[jira] [Updated] (HBASE-5857) RIT map in RS not getting cleared while region opening

2012-04-23 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5857:
-

Attachment: HBASE-5857_trunk.1.patch

Retry against hadoopqa

 RIT map in RS not getting cleared while region opening
 --

 Key: HBASE-5857
 URL: https://issues.apache.org/jira/browse/HBASE-5857
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-5857_0.92.1.patch, HBASE-5857_0.92.patch, 
 HBASE-5857_0.94.1.patch, HBASE-5857_94.patch, HBASE-5857_trunk.1.patch, 
 HBASE-5857_trunk.1.patch, HBASE-5857_trunk.patch


 While opening a region in the RS, after the region has been added to 
 regionsInTransitionInRS, if tableDescriptors.get() throws an exception the 
 region won't be cleared from regionsInTransitionInRS. So the next time there is 
 an attempt to open the region on the same RS, it will throw 
 RegionAlreadyInTransitionException.
 If the two statements below are swapped, this issue does not occur.
 {code}
 this.regionsInTransitionInRS.putIfAbsent(region.getEncodedNameAsBytes(),true);
 HTableDescriptor htd = this.tableDescriptors.get(region.getTableName());
 {code}
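
The effect of the statement order can be reproduced with a small self-contained sketch. A ConcurrentHashMap stands in for regionsInTransitionInRS and a callback that throws stands in for tableDescriptors.get(); the class, interface, and method names are hypothetical and are not taken from the region server code.

```java
import java.io.IOException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class RegionOpenSketch {
    // Stand-in for the table descriptor lookup, which may fail with an IOException.
    interface DescriptorSource {
        String get(String tableName) throws IOException;
    }

    final ConcurrentMap<String, Boolean> regionsInTransition = new ConcurrentHashMap<>();

    // Order from the report: mark the region first, then fetch the descriptor.
    // If the fetch throws, the marker is left behind.
    void openMarkFirst(String region, String table, DescriptorSource source) throws IOException {
        regionsInTransition.putIfAbsent(region, true);
        source.get(table);
    }

    // Swapped order: a failed fetch happens before anything is recorded.
    void openFetchFirst(String region, String table, DescriptorSource source) throws IOException {
        source.get(table);
        regionsInTransition.putIfAbsent(region, true);
    }

    public static void main(String[] args) {
        DescriptorSource failing = table -> {
            throw new IOException("descriptor unavailable");
        };
        RegionOpenSketch rs = new RegionOpenSketch();
        try {
            rs.openMarkFirst("r1", "t", failing);
        } catch (IOException expected) {
            // the lookup failed, as arranged above
        }
        try {
            rs.openFetchFirst("r2", "t", failing);
        } catch (IOException expected) {
            // the lookup failed, as arranged above
        }
        System.out.println("stale entry after mark-first:  " + rs.regionsInTransition.containsKey("r1"));
        System.out.println("stale entry after fetch-first: " + rs.regionsInTransition.containsKey("r2"));
    }
}
```

Another option, not shown here, would be to keep the original order and remove the entry on the failure path; swapping the statements is simply the smaller change the reporter proposes.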





[jira] [Updated] (HBASE-5857) RIT map in RS not getting cleared while region opening

2012-04-23 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5857:
-

Status: Open  (was: Patch Available)

 RIT map in RS not getting cleared while region opening
 --

 Key: HBASE-5857
 URL: https://issues.apache.org/jira/browse/HBASE-5857
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-5857_0.92.1.patch, HBASE-5857_0.92.patch, 
 HBASE-5857_0.94.1.patch, HBASE-5857_94.patch, HBASE-5857_trunk.1.patch, 
 HBASE-5857_trunk.1.patch, HBASE-5857_trunk.patch


 While opening a region in the RS, after the region has been added to 
 regionsInTransitionInRS, if tableDescriptors.get() throws an exception the 
 region won't be cleared from regionsInTransitionInRS. So the next time there is 
 an attempt to open the region on the same RS, it will throw 
 RegionAlreadyInTransitionException.
 If the two statements below are swapped, this issue does not occur.
 {code}
 this.regionsInTransitionInRS.putIfAbsent(region.getEncodedNameAsBytes(),true);
 HTableDescriptor htd = this.tableDescriptors.get(region.getTableName());
 {code}





[jira] [Commented] (HBASE-5857) RIT map in RS not getting cleared while region opening

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13259932#comment-13259932
 ] 

stack commented on HBASE-5857:
--

+1 on patch.  Formatting is off but we can fix that on commit.  Will wait on the 
next hadoopqa run to see if the previous failure was real.

 RIT map in RS not getting cleared while region opening
 --

 Key: HBASE-5857
 URL: https://issues.apache.org/jira/browse/HBASE-5857
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-5857_0.92.1.patch, HBASE-5857_0.92.patch, 
 HBASE-5857_0.94.1.patch, HBASE-5857_94.patch, HBASE-5857_trunk.1.patch, 
 HBASE-5857_trunk.1.patch, HBASE-5857_trunk.patch


 While opening a region in the RS, after the region has been added to 
 regionsInTransitionInRS, if tableDescriptors.get() throws an exception the 
 region won't be cleared from regionsInTransitionInRS. So the next time there is 
 an attempt to open the region on the same RS, it will throw 
 RegionAlreadyInTransitionException.
 If the two statements below are swapped, this issue does not occur.
 {code}
 this.regionsInTransitionInRS.putIfAbsent(region.getEncodedNameAsBytes(),true);
 HTableDescriptor htd = this.tableDescriptors.get(region.getTableName());
 {code}





[jira] [Updated] (HBASE-5857) RIT map in RS not getting cleared while region opening

2012-04-23 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5857:
-

Status: Patch Available  (was: Open)

 RIT map in RS not getting cleared while region opening
 --

 Key: HBASE-5857
 URL: https://issues.apache.org/jira/browse/HBASE-5857
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-5857_0.92.1.patch, HBASE-5857_0.92.patch, 
 HBASE-5857_0.94.1.patch, HBASE-5857_94.patch, HBASE-5857_trunk.1.patch, 
 HBASE-5857_trunk.1.patch, HBASE-5857_trunk.patch


 While opening a region in the RS, after the region has been added to 
 regionsInTransitionInRS, if tableDescriptors.get() throws an exception the 
 region won't be cleared from regionsInTransitionInRS. So the next time there is 
 an attempt to open the region on the same RS, it will throw 
 RegionAlreadyInTransitionException.
 If the two statements below are swapped, this issue does not occur.
 {code}
 this.regionsInTransitionInRS.putIfAbsent(region.getEncodedNameAsBytes(),true);
 HTableDescriptor htd = this.tableDescriptors.get(region.getTableName());
 {code}





[jira] [Commented] (HBASE-5848) Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13259935#comment-13259935
 ] 

stack commented on HBASE-5848:
--

@Ram This is a nit comment for future patches, not for this one.

I would suggest you avoid changes like below going forward:

{code}
-  hRegionInfos = new HRegionInfo[]{
-  new HRegionInfo(hTableDescriptor.getName(), null, null)};
+  hRegionInfos = new HRegionInfo[] { new HRegionInfo(hTableDescriptor
+  .getName(), null, null) };
{code}

The replacement is harder to read w/ its break in the middle of a phrase.

Patch lgtm.

Lars, does this fix the issue you saw?


 Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to 
 abort
 

 Key: HBASE-5848
 URL: https://issues.apache.org/jira/browse/HBASE-5848
 Project: HBase
  Issue Type: Bug
  Components: client
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Minor
 Attachments: HBASE-5848.patch


 A coworker of mine just had this scenario. It does not make sense to pass 
 EMPTY_START_ROW as a splitKey (since the region with the empty start key is 
 implicit), but it should not cause the HMaster to abort.
 The abort happens because it tries to bulk assign the same region twice and 
 then runs into race conditions with ZK.
 The same would (presumably) happen when two identical split keys are passed, 
 but the client blocks that. The simplest solution here is to have the client 
 also reject null or EMPTY_START_ROW as a split key.
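
A client-side check along the lines described above might look like the following sketch. It is not the attached patch. EMPTY_START_ROW is modelled here as a zero-length byte array, and the class name, method names, and error messages are hypothetical.

```java
import java.util.Arrays;

class SplitKeyCheck {
    // Local stand-in for the HBase constant: an empty byte array.
    static final byte[] EMPTY_START_ROW = new byte[0];

    // Rejects split keys that are null, empty, or duplicated.
    // A null array means "no pre-splitting" and is accepted.
    static void validateSplitKeys(byte[][] splitKeys) {
        if (splitKeys == null) {
            return;
        }
        for (byte[] key : splitKeys) {
            if (key == null || key.length == 0) {
                throw new IllegalArgumentException("Split key must not be null or empty");
            }
        }
        byte[][] sorted = splitKeys.clone(); // leave the caller's array order untouched
        Arrays.sort(sorted, SplitKeyCheck::compareUnsigned);
        for (int i = 1; i < sorted.length; i++) {
            if (Arrays.equals(sorted[i - 1], sorted[i])) {
                throw new IllegalArgumentException("Split keys must be unique");
            }
        }
    }

    // Lexicographic comparison treating bytes as unsigned values.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        try {
            validateSplitKeys(new byte[][] {EMPTY_START_ROW, {'m'}});
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing fast in the client keeps the invalid request from ever reaching the master, which is why the report calls it the simplest fix; hardening the master against the double assignment would be a separate change.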





[jira] [Updated] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available

2012-04-23 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5849:
-

Status: Patch Available  (was: Open)

 On first cluster startup, RS aborts if root znode is not available
 --

 Key: HBASE-5849
 URL: https://issues.apache.org/jira/browse/HBASE-5849
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: HBASE-5849_v1.patch


 When launching a fresh new cluster, the master has to be started first, which
 can create a race condition when the master and rs are started at the same time.
 Master startup code is something like this:
  - establish zk connection
  - create root znodes in zk (/hbase)
  - create ephemeral node for master (/hbase/master)
 Region server startup code is something like this:
  - establish zk connection
  - check whether the root znode (/hbase) is there. If not, shut down.
  - wait for the master to create the znode /hbase/master
 So, the problem is that on the very first launch of the cluster, the RS aborts
 on startup since the /hbase znode might not have been created yet (only the
 master creates it if needed). Since /hbase is not deleted on cluster shutdown,
 on subsequent cluster starts it does not matter in which order the servers are
 started. So this affects only first launches.
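
One way to avoid the abort described above is for the region server to wait a bounded time for the root znode instead of shutting down on the first failed check. The sketch below illustrates that idea only; it is not taken from HBASE-5849_v1.patch. `RootZnodeWaiter`, `waitForRootZnode`, and the `rootZnodeExists` probe are hypothetical, with the probe standing in for a ZooKeeper existence check on /hbase.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper illustrating "wait instead of abort"; not HBase code. */
final class RootZnodeWaiter {
  /**
   * Polls the probe until it reports true or the timeout elapses.
   * Returns true if the znode showed up in time, false on timeout or interrupt.
   */
  static boolean waitForRootZnode(BooleanSupplier rootZnodeExists,
                                  long timeoutMs, long pollMs) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      if (rootZnodeExists.getAsBoolean()) {
        return true; // the master has created the root znode
      }
      if (System.nanoTime() - deadline >= 0) {
        return false; // gave up; the caller decides whether to abort
      }
      try {
        Thread.sleep(pollMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        return false;
      }
    }
  }
}
```

With a wait like this, start order on a first launch would matter much less: the region server aborts only if the master never creates the root znode within the timeout.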





[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13259937#comment-13259937
 ] 

stack commented on HBASE-5849:
--

Patch lgtm.

 On first cluster startup, RS aborts if root znode is not available
 --

 Key: HBASE-5849
 URL: https://issues.apache.org/jira/browse/HBASE-5849
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: HBASE-5849_v1.patch


 When launching a fresh new cluster, the master has to be started first, which
 can create a race condition when the master and rs are started at the same time.
 Master startup code is something like this:
  - establish zk connection
  - create root znodes in zk (/hbase)
  - create ephemeral node for master (/hbase/master)
 Region server startup code is something like this:
  - establish zk connection
  - check whether the root znode (/hbase) is there. If not, shut down.
  - wait for the master to create the znode /hbase/master
 So, the problem is that on the very first launch of the cluster, the RS aborts
 on startup since the /hbase znode might not have been created yet (only the
 master creates it if needed). Since /hbase is not deleted on cluster shutdown,
 on subsequent cluster starts it does not matter in which order the servers are
 started. So this affects only first launches.





[jira] [Commented] (HBASE-5861) Hadoop 23 compile broken due to tests introduced in HBASE-5064

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13259950#comment-13259950
 ] 

stack commented on HBASE-5861:
--

@Jon Good stuff.  Looks like our Andrew saw this before we did over in 
HBASE-5807: JobContext and TaskAttemptContext are only interfaces in 0.23+. 
TestHLogRecordReader should be reimplemented with a different approach.  I'll 
mark that issue dup of this.

 Hadoop 23 compile broken due to tests introduced in HBASE-5064 
 ---

 Key: HBASE-5861
 URL: https://issues.apache.org/jira/browse/HBASE-5861
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Priority: Blocker
 Fix For: 0.94.0, 0.96.0


 When attempting to compile HBase 0.94rc1 against hadoop 23, I got this set of 
 compilation error messages:
 {code}
 jon@swoop:~/proj/hbase-0.94$ mvn clean test -Dhadoop.profile=23 -DskipTests
 ...
 [INFO] 
 
 [INFO] BUILD FAILURE
 [INFO] 
 
 [INFO] Total time: 18.926s
 [INFO] Finished at: Mon Apr 23 10:38:47 PDT 2012
 [INFO] Final Memory: 55M/555M
 [INFO] 
 
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile 
 (default-testCompile) on project hbase: Compilation failure: Compilation 
 failure:
 [ERROR] 
 /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[147,46]
  org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
 [ERROR] 
 [ERROR] 
 /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[153,29]
  org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
 [ERROR] 
 [ERROR] 
 /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[194,46]
  org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
 [ERROR] 
 [ERROR] 
 /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[206,29]
  org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
 [ERROR] 
 [ERROR] 
 /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[213,29]
  org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
 [ERROR] 
 [ERROR] 
 /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[226,29]
  org.apache.hadoop.mapreduce.TaskAttemptContext is abstract; cannot be 
 instantiated
 [ERROR] - [Help 1]
 {code}
 Upon further investigation this issue is due to code introduced in HBASE-5064 
 and is also present in trunk.  
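
A type that is a concrete class in one Hadoop line and abstract (or an interface) in another cannot be created with `new` in code that must compile against both. One possible workaround is to pick the concrete class by name at runtime through reflection. The sketch below shows that general technique only; it is not necessarily the approach taken in the eventual fix, and `ReflectiveFactory` and `newInstanceOfFirstAvailable` are hypothetical names.

```java
import java.lang.reflect.Constructor;

/** Hypothetical helper; instantiates the first candidate class that works. */
final class ReflectiveFactory {
  static Object newInstanceOfFirstAvailable(String[] classNames,
                                            Class<?>[] paramTypes,
                                            Object[] args) {
    for (String name : classNames) {
      try {
        Class<?> clazz = Class.forName(name);
        Constructor<?> ctor = clazz.getConstructor(paramTypes);
        return ctor.newInstance(args);
      } catch (ReflectiveOperationException | LinkageError e) {
        // Class missing, abstract, lacking this constructor, or failing to
        // construct in this version: fall through to the next candidate.
      }
    }
    throw new IllegalStateException("No instantiable class among the candidates");
  }
}
```

For the Hadoop case, the candidates might be `org.apache.hadoop.mapreduce.task.JobContextImpl` (0.23+) followed by `org.apache.hadoop.mapreduce.JobContext` (older lines); treat those names as assumptions to verify against the Hadoop version in use.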





[jira] [Resolved] (HBASE-5807) TestHLogRecordReader does not compile against Hadoop 2

2012-04-23 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-5807.
--

Resolution: Duplicate

Marking as dup of HBASE-5861 (You saw it first Andrew but Jon is going to work 
on this over in 5861..)

 TestHLogRecordReader does not compile against Hadoop 2
 --

 Key: HBASE-5807
 URL: https://issues.apache.org/jira/browse/HBASE-5807
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0, 0.96.0
Reporter: Andrew Purtell

 See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-23/144/console
 I see this trying to compile against branch 2 also.
 {noformat}
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile 
 (default-testCompile) on project hbase: Compilation failure: Compilation 
 failure:
 [ERROR] 
 /home/jenkins/jenkins-slave/workspace/HBase-TRUNK-on-Hadoop-23/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35]
  cannot find symbol
 [ERROR] symbol  : method getJobTracker()
 [ERROR] location: class 
 org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner
 [ERROR] 
 [ERROR] 
 /home/jenkins/jenkins-slave/workspace/HBase-TRUNK-on-Hadoop-23/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[147,46]
  org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
 [ERROR] 
 [ERROR] 
 /home/jenkins/jenkins-slave/workspace/HBase-TRUNK-on-Hadoop-23/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[153,29]
  org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
 [ERROR] 
 [ERROR] 
 /home/jenkins/jenkins-slave/workspace/HBase-TRUNK-on-Hadoop-23/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[194,46]
  org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
 [ERROR] 
 [ERROR] 
 /home/jenkins/jenkins-slave/workspace/HBase-TRUNK-on-Hadoop-23/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[206,29]
  org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
 [ERROR] 
 [ERROR] 
 /home/jenkins/jenkins-slave/workspace/HBase-TRUNK-on-Hadoop-23/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[213,29]
  org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
 [ERROR] 
 [ERROR] 
 /home/jenkins/jenkins-slave/workspace/HBase-TRUNK-on-Hadoop-23/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[226,29]
  org.apache.hadoop.mapreduce.TaskAttemptContext is abstract; cannot be 
 instantiated
 {noformat}





[jira] [Updated] (HBASE-5830) Cleanup SequenceFileLogWriter to use syncFs api from SequenceFile#Writer directly in trunk.

2012-04-23 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5830:
-

   Resolution: Fixed
Fix Version/s: 0.96.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to trunk.  Thanks for the patch Uma.

 Cleanup SequenceFileLogWriter to use syncFs api from SequenceFile#Writer 
 directly in trunk.
 ---

 Key: HBASE-5830
 URL: https://issues.apache.org/jira/browse/HBASE-5830
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.96.0
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G
 Fix For: 0.96.0

 Attachments: HBASE-5830.patch








[jira] [Commented] (HBASE-5652) [findbugs] Fix lock release on all paths

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13259957#comment-13259957
 ] 

stack commented on HBASE-5652:
--

I ran TestSplitLogManager ten times on trunk and it passed each time. FYI.

 [findbugs] Fix lock release on all paths 
 -

 Key: HBASE-5652
 URL: https://issues.apache.org/jira/browse/HBASE-5652
 Project: HBase
  Issue Type: Sub-task
  Components: scripts
Reporter: Jonathan Hsieh
Assignee: Gregory Chanan
 Attachments: HBASE-5652-v0.patch, HBASE-5652-v1.patch


 See 
 https://builds.apache.org/job/PreCommit-HBASE-Build/1313//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html#Warnings_MT_CORRECTNESS
 Category UL





[jira] [Updated] (HBASE-5857) RIT map in RS not getting cleared while region opening

2012-04-23 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5857:
-

Attachment: HBASE-5857_trunk.2.patch

This is the version I applied.  Same as trunk.1.patch but w/ the tabs replaced
by two spaces.

 RIT map in RS not getting cleared while region opening
 --

 Key: HBASE-5857
 URL: https://issues.apache.org/jira/browse/HBASE-5857
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-5857_0.92.1.patch, HBASE-5857_0.92.patch, 
 HBASE-5857_0.94.1.patch, HBASE-5857_94.patch, HBASE-5857_trunk.1.patch, 
 HBASE-5857_trunk.1.patch, HBASE-5857_trunk.2.patch, HBASE-5857_trunk.patch


 While opening a region in the RS, if tableDescriptors.get() throws an exception
 after the region has been added to regionsInTransitionInRS, the region won't
 be cleared from regionsInTransitionInRS. So the next time the same RS tries to
 open the region, it will throw RegionAlreadyInTransitionException.
 If we swap the statements below, this issue won't occur.
 {code}
 this.regionsInTransitionInRS.putIfAbsent(region.getEncodedNameAsBytes(),true);
 HTableDescriptor htd = this.tableDescriptors.get(region.getTableName());
 {code}
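
The reordering suggested above can be sketched as follows. This is an illustration under assumptions, not the committed patch: `RegionOpener`, `DescriptorSource`, and `openRegion` are hypothetical stand-ins, and strings replace the real region and descriptor types. Only the `regionsInTransitionInRS` map name and the putIfAbsent call come from the issue text.

```java
import java.io.IOException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical stand-in for the region-opening path in the region server. */
final class RegionOpener {
  /** Hypothetical stand-in for the table descriptor lookup, which may throw. */
  interface DescriptorSource {
    String get(String tableName) throws IOException;
  }

  final ConcurrentMap<String, Boolean> regionsInTransitionInRS =
      new ConcurrentHashMap<>();

  /** Looks up the descriptor first so a failed lookup leaves no stale entry. */
  String openRegion(String encodedRegionName, String tableName,
                    DescriptorSource descriptors) throws IOException {
    // May throw; the in-transition map has not been touched yet.
    String htd = descriptors.get(tableName);
    // Only now mark the region as in transition.
    if (regionsInTransitionInRS.putIfAbsent(encodedRegionName, Boolean.TRUE) != null) {
      throw new IllegalStateException(
          "Region already in transition: " + encodedRegionName);
    }
    return htd;
  }
}
```

An alternative with the same effect would be to keep the original order and remove the map entry in a catch or finally block when the lookup fails.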





[jira] [Updated] (HBASE-5857) RIT map in RS not getting cleared while region opening

2012-04-23 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5857:
-

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Applied to 0.92, 0.94, and to trunk.  Thanks for the patch Chinna.

 RIT map in RS not getting cleared while region opening
 --

 Key: HBASE-5857
 URL: https://issues.apache.org/jira/browse/HBASE-5857
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Fix For: 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-5857_0.92.1.patch, HBASE-5857_0.92.patch, 
 HBASE-5857_0.94.1.patch, HBASE-5857_94.patch, HBASE-5857_trunk.1.patch, 
 HBASE-5857_trunk.1.patch, HBASE-5857_trunk.2.patch, HBASE-5857_trunk.patch


 While opening a region in the RS, if tableDescriptors.get() throws an exception
 after the region has been added to regionsInTransitionInRS, the region won't
 be cleared from regionsInTransitionInRS. So the next time the same RS tries to
 open the region, it will throw RegionAlreadyInTransitionException.
 If we swap the statements below, this issue won't occur.
 {code}
 this.regionsInTransitionInRS.putIfAbsent(region.getEncodedNameAsBytes(),true);
 HTableDescriptor htd = this.tableDescriptors.get(region.getTableName());
 {code}





[jira] [Updated] (HBASE-5857) RIT map in RS not getting cleared while region opening

2012-04-23 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5857:
-

Fix Version/s: (was: 0.92.2)

NOT applied to 0.92, just to 0.94 and trunk.  Patch doesn't cleanly apply on 
0.92.

 RIT map in RS not getting cleared while region opening
 --

 Key: HBASE-5857
 URL: https://issues.apache.org/jira/browse/HBASE-5857
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-5857_0.92.1.patch, HBASE-5857_0.92.patch, 
 HBASE-5857_0.94.1.patch, HBASE-5857_94.patch, HBASE-5857_trunk.1.patch, 
 HBASE-5857_trunk.1.patch, HBASE-5857_trunk.2.patch, HBASE-5857_trunk.patch


 While opening a region in the RS, if tableDescriptors.get() throws an exception
 after the region has been added to regionsInTransitionInRS, the region won't
 be cleared from regionsInTransitionInRS. So the next time the same RS tries to
 open the region, it will throw RegionAlreadyInTransitionException.
 If we swap the statements below, this issue won't occur.
 {code}
 this.regionsInTransitionInRS.putIfAbsent(region.getEncodedNameAsBytes(),true);
 HTableDescriptor htd = this.tableDescriptors.get(region.getTableName());
 {code}





[jira] [Commented] (HBASE-5699) Run with 1 WAL in HRegionServer

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260005#comment-13260005
 ] 

stack commented on HBASE-5699:
--

@Ted Why delete a comment, especially someone else's?

 Run with  1 WAL in HRegionServer
 -

 Key: HBASE-5699
 URL: https://issues.apache.org/jira/browse/HBASE-5699
 Project: HBase
  Issue Type: Improvement
Reporter: binlijin
Assignee: Li Pi







[jira] [Commented] (HBASE-5699) Run with 1 WAL in HRegionServer

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260020#comment-13260020
 ] 

stack commented on HBASE-5699:
--

@Ted Would suggest you just leave it. When you delete, we all get a message in 
our mailbox about the delete transaction.  Then we start to wonder...

 Run with  1 WAL in HRegionServer
 -

 Key: HBASE-5699
 URL: https://issues.apache.org/jira/browse/HBASE-5699
 Project: HBase
  Issue Type: Improvement
Reporter: binlijin
Assignee: Li Pi







[jira] [Commented] (HBASE-5851) TestProcessBasedCluster sometimes fails

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260024#comment-13260024
 ] 

stack commented on HBASE-5851:
--

It fails for me too.  Will attach the two failure types I saw.

 TestProcessBasedCluster sometimes fails
 ---

 Key: HBASE-5851
 URL: https://issues.apache.org/jira/browse/HBASE-5851
 Project: HBase
  Issue Type: Test
Reporter: Zhihong Yu
Assignee: Jimmy Xiang
 Attachments: hbase-5851.patch


 TestProcessBasedCluster failed in 
 https://builds.apache.org/job/HBase-TRUNK-security/178
 Looks like cluster failed to start:
 {code}
 2012-04-21 14:22:32,666 INFO  [Thread-1] 
 util.ProcessBasedLocalHBaseCluster(176): Waiting for HBase to startup. 
 Retries left: 2
 java.io.IOException: Giving up trying to location region in meta: thread is 
 interrupted.
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1173)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:956)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:917)
   at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:252)
   at org.apache.hadoop.hbase.client.HTable.init(HTable.java:192)
   at 
 org.apache.hadoop.hbase.util.ProcessBasedLocalHBaseCluster.startHBase(ProcessBasedLocalHBaseCluster.java:174)
   at 
 org.apache.hadoop.hbase.util.TestProcessBasedCluster.testProcessBasedCluster(TestProcessBasedCluster.java:56)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
   at 
 org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
   at 
 org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
   at 
 org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
   at 
 org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:62)
 java.lang.InterruptedException: sleep interrupted at 
 java.lang.Thread.sleep(Native Method)
   at org.apache.hadoop.hbase.util.Threads.sleep(Threads.java:134)
   at 
 org.apache.hadoop.hbase.util.ProcessBasedLocalHBaseCluster.startHBase(ProcessBasedLocalHBaseCluster.java:178)
   at 
 org.apache.hadoop.hbase.util.TestProcessBasedCluster.testProcessBasedCluster(TestProcessBasedCluster.java:56)
 {code}





[jira] [Updated] (HBASE-5851) TestProcessBasedCluster sometimes fails

2012-04-23 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5851:
-

Attachment: metahang.txt

Here we just kept scanning meta w/ no movement beyond that.

 TestProcessBasedCluster sometimes fails
 ---

 Key: HBASE-5851
 URL: https://issues.apache.org/jira/browse/HBASE-5851
 Project: HBase
  Issue Type: Test
Reporter: Zhihong Yu
Assignee: Jimmy Xiang
 Attachments: hbase-5851.patch, metahang.txt, zkfail.txt


 TestProcessBasedCluster failed in 
 https://builds.apache.org/job/HBase-TRUNK-security/178
 Looks like cluster failed to start:
 {code}
 2012-04-21 14:22:32,666 INFO  [Thread-1] 
 util.ProcessBasedLocalHBaseCluster(176): Waiting for HBase to startup. 
 Retries left: 2
 java.io.IOException: Giving up trying to location region in meta: thread is 
 interrupted.
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1173)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:956)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:917)
   at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:252)
   at org.apache.hadoop.hbase.client.HTable.init(HTable.java:192)
   at 
 org.apache.hadoop.hbase.util.ProcessBasedLocalHBaseCluster.startHBase(ProcessBasedLocalHBaseCluster.java:174)
   at 
 org.apache.hadoop.hbase.util.TestProcessBasedCluster.testProcessBasedCluster(TestProcessBasedCluster.java:56)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
   at 
 org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
   at 
 org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
   at 
 org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
   at 
 org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:62)
 java.lang.InterruptedException: sleep interrupted at 
 java.lang.Thread.sleep(Native Method)
   at org.apache.hadoop.hbase.util.Threads.sleep(Threads.java:134)
   at 
 org.apache.hadoop.hbase.util.ProcessBasedLocalHBaseCluster.startHBase(ProcessBasedLocalHBaseCluster.java:178)
   at 
 org.apache.hadoop.hbase.util.TestProcessBasedCluster.testProcessBasedCluster(TestProcessBasedCluster.java:56)
 {code}





[jira] [Updated] (HBASE-5851) TestProcessBasedCluster sometimes fails

2012-04-23 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5851:
-

Attachment: zkfail.txt

Here is a case of zk not being able to make a connection.  Goes on and on.

 TestProcessBasedCluster sometimes fails
 ---

 Key: HBASE-5851
 URL: https://issues.apache.org/jira/browse/HBASE-5851
 Project: HBase
  Issue Type: Test
Reporter: Zhihong Yu
Assignee: Jimmy Xiang
 Attachments: hbase-5851.patch, metahang.txt, zkfail.txt


 TestProcessBasedCluster failed in 
 https://builds.apache.org/job/HBase-TRUNK-security/178
 Looks like cluster failed to start:
 {code}
 2012-04-21 14:22:32,666 INFO  [Thread-1] 
 util.ProcessBasedLocalHBaseCluster(176): Waiting for HBase to startup. 
 Retries left: 2
 java.io.IOException: Giving up trying to location region in meta: thread is 
 interrupted.
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1173)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:956)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:917)
   at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:252)
   at org.apache.hadoop.hbase.client.HTable.init(HTable.java:192)
   at 
 org.apache.hadoop.hbase.util.ProcessBasedLocalHBaseCluster.startHBase(ProcessBasedLocalHBaseCluster.java:174)
   at 
 org.apache.hadoop.hbase.util.TestProcessBasedCluster.testProcessBasedCluster(TestProcessBasedCluster.java:56)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
   at 
 org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
   at 
 org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
   at 
 org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
   at 
 org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:62)
 java.lang.InterruptedException: sleep interrupted at 
 java.lang.Thread.sleep(Native Method)
   at org.apache.hadoop.hbase.util.Threads.sleep(Threads.java:134)
   at 
 org.apache.hadoop.hbase.util.ProcessBasedLocalHBaseCluster.startHBase(ProcessBasedLocalHBaseCluster.java:178)
   at 
 org.apache.hadoop.hbase.util.TestProcessBasedCluster.testProcessBasedCluster(TestProcessBasedCluster.java:56)
 {code}





[jira] [Updated] (HBASE-5851) TestProcessBasedCluster sometimes fails

2012-04-23 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5851:
-

Attachment: disable.txt

Patch to disable the flaky test for now.  This is not a critical functional
piece and I would like to have passing tests for a while, so I will commit this
and leave the test disabled until this issue is fixed.

 TestProcessBasedCluster sometimes fails
 ---

 Key: HBASE-5851
 URL: https://issues.apache.org/jira/browse/HBASE-5851
 Project: HBase
  Issue Type: Test
Reporter: Zhihong Yu
Assignee: Jimmy Xiang
 Attachments: disable.txt, hbase-5851.patch, metahang.txt, zkfail.txt


 TestProcessBasedCluster failed in 
 https://builds.apache.org/job/HBase-TRUNK-security/178
 Looks like cluster failed to start:
 {code}
 2012-04-21 14:22:32,666 INFO  [Thread-1] 
 util.ProcessBasedLocalHBaseCluster(176): Waiting for HBase to startup. 
 Retries left: 2
 java.io.IOException: Giving up trying to location region in meta: thread is 
 interrupted.
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1173)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:956)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:917)
   at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:252)
   at org.apache.hadoop.hbase.client.HTable.init(HTable.java:192)
   at 
 org.apache.hadoop.hbase.util.ProcessBasedLocalHBaseCluster.startHBase(ProcessBasedLocalHBaseCluster.java:174)
   at 
 org.apache.hadoop.hbase.util.TestProcessBasedCluster.testProcessBasedCluster(TestProcessBasedCluster.java:56)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
   at 
 org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
   at 
 org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
   at 
 org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
   at 
 org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:62)
 java.lang.InterruptedException: sleep interrupted at 
 java.lang.Thread.sleep(Native Method)
   at org.apache.hadoop.hbase.util.Threads.sleep(Threads.java:134)
   at 
 org.apache.hadoop.hbase.util.ProcessBasedLocalHBaseCluster.startHBase(ProcessBasedLocalHBaseCluster.java:178)
   at 
 org.apache.hadoop.hbase.util.TestProcessBasedCluster.testProcessBasedCluster(TestProcessBasedCluster.java:56)
 {code}





[jira] [Commented] (HBASE-5851) TestProcessBasedCluster sometimes fails

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260036#comment-13260036
 ] 

stack commented on HBASE-5851:
--

Committed to trunk the attached disable.txt to disable the failing test.

 TestProcessBasedCluster sometimes fails
 ---

 Key: HBASE-5851
 URL: https://issues.apache.org/jira/browse/HBASE-5851
 Project: HBase
  Issue Type: Test
Reporter: Zhihong Yu
Assignee: Jimmy Xiang
 Attachments: disable.txt, hbase-5851.patch, metahang.txt, zkfail.txt


 TestProcessBasedCluster failed in 
 https://builds.apache.org/job/HBase-TRUNK-security/178
 Looks like cluster failed to start:
 {code}
 2012-04-21 14:22:32,666 INFO  [Thread-1] util.ProcessBasedLocalHBaseCluster(176): Waiting for HBase to startup. Retries left: 2
 java.io.IOException: Giving up trying to location region in meta: thread is interrupted.
   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1173)
   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:956)
   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:917)
   at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:252)
   at org.apache.hadoop.hbase.client.HTable.init(HTable.java:192)
   at org.apache.hadoop.hbase.util.ProcessBasedLocalHBaseCluster.startHBase(ProcessBasedLocalHBaseCluster.java:174)
   at org.apache.hadoop.hbase.util.TestProcessBasedCluster.testProcessBasedCluster(TestProcessBasedCluster.java:56)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
   at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:62)
 java.lang.InterruptedException: sleep interrupted
   at java.lang.Thread.sleep(Native Method)
   at org.apache.hadoop.hbase.util.Threads.sleep(Threads.java:134)
   at org.apache.hadoop.hbase.util.ProcessBasedLocalHBaseCluster.startHBase(ProcessBasedLocalHBaseCluster.java:178)
   at org.apache.hadoop.hbase.util.TestProcessBasedCluster.testProcessBasedCluster(TestProcessBasedCluster.java:56)
 {code}





[jira] [Updated] (HBASE-5851) TestProcessBasedCluster sometimes fails; currently disabled -- needs to be fixed and reenabled

2012-04-23 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5851:
-

Summary: TestProcessBasedCluster sometimes fails; currently disabled -- 
needs to be fixed and reenabled  (was: TestProcessBasedCluster sometimes fails)

Updated the subject.

 TestProcessBasedCluster sometimes fails; currently disabled -- needs to be 
 fixed and reenabled
 --

 Key: HBASE-5851
 URL: https://issues.apache.org/jira/browse/HBASE-5851
 Project: HBase
  Issue Type: Test
Reporter: Zhihong Yu
Assignee: Jimmy Xiang
 Attachments: disable.txt, hbase-5851.patch, metahang.txt, zkfail.txt







[jira] [Commented] (HBASE-5851) TestProcessBasedCluster sometimes fails; currently disabled -- needs to be fixed and reenabled

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260054#comment-13260054
 ] 

stack commented on HBASE-5851:
--

@Jimmy Thanks.  Add the patch and we'll commit.

 TestProcessBasedCluster sometimes fails; currently disabled -- needs to be 
 fixed and reenabled
 --

 Key: HBASE-5851
 URL: https://issues.apache.org/jira/browse/HBASE-5851
 Project: HBase
  Issue Type: Test
Reporter: Zhihong Yu
Assignee: Jimmy Xiang
 Attachments: disable.txt, hbase-5851.patch, metahang.txt, zkfail.txt







[jira] [Commented] (HBASE-5805) TestServerCustomProtocol failing intermittently.

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260055#comment-13260055
 ] 

stack commented on HBASE-5805:
--

I ran this test ten times on trunk and it passes for me -- but I've seen it 
fail up on jenkins.

 TestServerCustomProtocol failing intermittently.
 

 Key: HBASE-5805
 URL: https://issues.apache.org/jira/browse/HBASE-5805
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.96.0
Reporter: Uma Maheswara Rao G
 Attachments: TestServerCustomProtocol.log


 Trace:
 java.lang.AssertionError: Results should contain region 
 test,ccc,1334638013935.b9d77206f6eb226928b898e66fd1d508. for row 'ccc'
   at org.junit.Assert.fail(Assert.java:93)
   at org.junit.Assert.assertTrue(Assert.java:43)
   at 
 org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol.verifyRegionResults(TestServerCustomProtocol.java:363)
   at 
 org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol.testNullReturn(TestServerCustomProtocol.java:330)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
   at 
 org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
   at 
 org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
   at 
 org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)





[jira] [Commented] (HBASE-5851) TestProcessBasedCluster sometimes fails; currently disabled -- needs to be fixed and reenabled

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260066#comment-13260066
 ] 

stack commented on HBASE-5851:
--

Sorry Jimmy.  Missed your patch upload.

Why do we need to do this now?  We've not had to up to now:

{code}
+conf.reloadConfiguration();
{code}

And here, could we not just update the conf by putting the current conf into 
place?

{code}
+// copy some important settings from configuration from this.conf
+if (conf.get("hbase.rpc.engine") != null) {
+  confMap.put("hbase.rpc.engine", conf.get("hbase.rpc.engine"));
+}
{code}

Would the test pass if we skipped the reload and just updated the contents of 
confMap with what the current Configuration holds?

BTW, this feels like the right solution.  Good stuff Jimmy.
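
For illustration, the alternative asked about above (copying selected settings straight from the live configuration into confMap, with no reload) could look roughly like the following. This is a minimal sketch under stated assumptions: plain `java.util.Map` stands in for the Hadoop `Configuration` object, and `ConfCopy` and `copyIfSet` are hypothetical names that do not appear in the patch.

```java
import java.util.Map;

// Hypothetical helper, not part of the patch: copies only the keys that
// are actually set in the source, leaving other target entries untouched.
final class ConfCopy {
    private ConfCopy() {}

    static void copyIfSet(Map<String, String> source,
                          Map<String, String> target,
                          String... keys) {
        for (String key : keys) {
            String value = source.get(key);
            if (value != null) {
                // Overrides whatever default the target held for this key.
                target.put(key, value);
            }
        }
    }
}
```

A caller would pass the keys it cares about, for example `ConfCopy.copyIfSet(current, confMap, "hbase.rpc.engine")`; keys that are unset in the source are skipped, so defaults in the target survive.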

 TestProcessBasedCluster sometimes fails; currently disabled -- needs to be 
 fixed and reenabled
 --

 Key: HBASE-5851
 URL: https://issues.apache.org/jira/browse/HBASE-5851
 Project: HBase
  Issue Type: Test
Reporter: Zhihong Yu
Assignee: Jimmy Xiang
 Attachments: disable.txt, hbase-5851.patch, metahang.txt, zkfail.txt







[jira] [Commented] (HBASE-5861) Hadoop 23 compile broken due to tests introduced in HBASE-5064

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260137#comment-13260137
 ] 

stack commented on HBASE-5861:
--

You uploaded the wrong patch, boss?

 Hadoop 23 compile broken due to tests introduced in HBASE-5064 
 ---

 Key: HBASE-5861
 URL: https://issues.apache.org/jira/browse/HBASE-5861
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Priority: Blocker
 Fix For: 0.94.0, 0.96.0

 Attachments: 5782-v3.txt, 5861.txt, hbase-5861-jon.patch, 
 hbase-5861-v2.patch


 When attempting to compile HBase 0.94rc1 against hadoop 23, I got this set of 
 compilation error messages:
 {code}
 jon@swoop:~/proj/hbase-0.94$ mvn clean test -Dhadoop.profile=23 -DskipTests
 ...
 [INFO] 
 
 [INFO] BUILD FAILURE
 [INFO] 
 
 [INFO] Total time: 18.926s
 [INFO] Finished at: Mon Apr 23 10:38:47 PDT 2012
 [INFO] Final Memory: 55M/555M
 [INFO] 
 
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile 
 (default-testCompile) on project hbase: Compilation failure: Compilation 
 failure:
 [ERROR] 
 /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[147,46]
  org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
 [ERROR] 
 [ERROR] 
 /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[153,29]
  org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
 [ERROR] 
 [ERROR] 
 /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[194,46]
  org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
 [ERROR] 
 [ERROR] 
 /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[206,29]
  org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
 [ERROR] 
 [ERROR] 
 /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[213,29]
  org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated
 [ERROR] 
 [ERROR] 
 /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[226,29]
  org.apache.hadoop.mapreduce.TaskAttemptContext is abstract; cannot be 
 instantiated
 [ERROR] - [Help 1]
 {code}
 Upon further investigation this issue is due to code introduced in HBASE-5064 
 and is also present in trunk.  
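
 The errors above arise where test code instantiates a type that the compiler reports as abstract under the Hadoop 23 profile. One generic way to tolerate that kind of difference is to pick the concrete class at run time through reflection. The sketch below only illustrates the idea with stand-in types: `FakeJobContext`, `FakeJobContextImpl`, and `ContextFactory` are hypothetical names, not Hadoop or HBase classes, and this is not necessarily the approach taken in the attached patches.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Modifier;

// Stand-in for a type that is abstract in one library version (hypothetical).
abstract class FakeJobContext {
    final String name;
    FakeJobContext(String name) { this.name = name; }
}

// Stand-in for the concrete implementation of that type (hypothetical).
class FakeJobContextImpl extends FakeJobContext {
    FakeJobContextImpl(String name) { super(name); }
}

final class ContextFactory {
    private ContextFactory() {}

    // Instantiates 'declared' if it is concrete, otherwise 'fallbackImpl'.
    // Both are assumed to have a constructor taking a single String.
    static <T> T newInstance(Class<T> declared,
                             Class<? extends T> fallbackImpl,
                             String arg) {
        Class<? extends T> target =
            Modifier.isAbstract(declared.getModifiers()) ? fallbackImpl : declared;
        try {
            Constructor<? extends T> ctor = target.getDeclaredConstructor(String.class);
            ctor.setAccessible(true);
            return ctor.newInstance(arg);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + target.getName(), e);
        }
    }
}
```

 The trade-off is that a missing or mismatched constructor is only detected at run time rather than at compile time, which is why the sketch wraps reflective failures in an exception that names the target class.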





[jira] [Commented] (HBASE-5850) Backport HBASE-5454 to 90 and 92 Refuse operations from Admin before master is initialized

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260138#comment-13260138
 ] 

stack commented on HBASE-5850:
--

+1

 Backport HBASE-5454 to 90 and 92  Refuse operations from Admin before master 
 is initialized
 ---

 Key: HBASE-5850
 URL: https://issues.apache.org/jira/browse/HBASE-5850
 Project: HBase
  Issue Type: Bug
Reporter: xufeng
Assignee: xufeng
 Fix For: 0.90.7, 0.92.2, 0.94.0

 Attachments: 5850-trunk.txt, No_patch_90_surefire-report.html, 
 backport-5454(createTable)-to-94.patch, 
 backport-5454(createTable)-to-94_surefire-report.html, 
 backport-5454(createTable)-to-trunk.patch, 
 backport-5454(createTable)-to-trunk_surefire-report.html, 
 backport-5454-to-90-surefire-report.html, backport-5454-to-90.patch, 
 backport-5454-to-92.patch, backport-5454-to-92_surefire-report.html


 This fix is also needed in 0.90 and 0.92.
 Also update the HBASE-5454 patch so that it adds checkInitialized() to 
 HMaster#createTable().
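
 As a rough illustration of the guard being backported, the sketch below refuses an admin operation until initialization has finished. It is a simplified stand-in, not HMaster code: `GuardedMaster`, `finishInitialization`, and `tableCount` are hypothetical names, and `IllegalStateException` is used here only as a placeholder for whatever exception the real method throws.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for a master that rejects admin calls
// until its startup sequence has completed.
final class GuardedMaster {
    private volatile boolean initialized;
    private final List<String> tables = new ArrayList<>();

    // Called once startup work is done.
    void finishInitialization() {
        initialized = true;
    }

    // Guard placed at the top of every admin-facing operation.
    private void checkInitialized() {
        if (!initialized) {
            throw new IllegalStateException("Master is initializing");
        }
    }

    synchronized void createTable(String name) {
        checkInitialized();
        tables.add(name);
    }

    synchronized int tableCount() {
        return tables.size();
    }
}
```

 Calling `createTable` before `finishInitialization` throws and leaves the table list untouched; the same call succeeds afterwards.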





[jira] [Updated] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available

2012-04-23 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5849:
-

Attachment: 5849v3.txt

Enis's v2 patch with this added to end of test:

{code}
+  @org.junit.Rule
+  public org.apache.hadoop.hbase.ResourceCheckerJUnitRule cu =
+new org.apache.hadoop.hbase.ResourceCheckerJUnitRule();
{code}

Nice test.

 On first cluster startup, RS aborts if root znode is not available
 --

 Key: HBASE-5849
 URL: https://issues.apache.org/jira/browse/HBASE-5849
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: 5849v3.txt, HBASE-5849_v1.patch, HBASE-5849_v2.patch


 When launching a fresh new cluster, the master has to be started first, which 
 can create a race condition when the master and RS are started at the same time. 
 Master startup code is something like this: 
  - establish zk connection
  - create root znodes in zk (/hbase)
  - create ephemeral node for master, /hbase/master
 Region server startup code is something like this: 
  - establish zk connection
  - check whether the root znode (/hbase) is there. If not, shut down. 
  - wait for the master to create the znode /hbase/master
 So the problem is that on the very first launch of the cluster, the RS aborts 
 at startup since the /hbase znode might not have been created yet (only the 
 master creates it if needed). Since /hbase is not deleted on cluster shutdown, 
 on subsequent cluster starts it does not matter in which order the servers are 
 started. So this affects only first launches. 
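
 The remedy for this kind of race is to have the region server wait for the root znode instead of shutting down when it is missing. The sketch below shows that wait as a simple poll with a deadline. It is an illustration only: `ZNodeWaiter` and `waitForNode` are hypothetical names, the `BooleanSupplier` stands in for a real ZooKeeper existence check, and the actual patch may use a different mechanism.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical helper: polls an existence check until it succeeds,
// the timeout elapses, or the calling thread is interrupted.
final class ZNodeWaiter {
    private ZNodeWaiter() {}

    static boolean waitForNode(BooleanSupplier exists,
                               long timeoutMillis,
                               long pollMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            if (exists.getAsBoolean()) {
                return true;               // node is there, carry on with startup
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;              // gave the master long enough
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();  // keep the flag for callers
                return false;
            }
        }
    }
}
```

 With this shape, startup order no longer matters on first launch as long as the master creates the node within the timeout.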





[jira] [Updated] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available

2012-04-23 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5849:
-

   Resolution: Fixed
Fix Version/s: 0.94.0, 0.92.2
 Release Note: Rather than exit, the regionserver will now wait even though the root directory in zookeeper has yet to be created.
 Hadoop Flags: Reviewed
       Status: Resolved  (was: Patch Available)

Committed to 0.92, 0.94, and to trunk.  Thanks for the patch Enis.

 On first cluster startup, RS aborts if root znode is not available
 --

 Key: HBASE-5849
 URL: https://issues.apache.org/jira/browse/HBASE-5849
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.92.2, 0.94.0

 Attachments: 5849v3.txt, HBASE-5849_v1.patch, HBASE-5849_v2.patch







[jira] [Commented] (HBASE-5862) After Region Close remove the Operation Metrics.

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260148#comment-13260148
 ] 

stack commented on HBASE-5862:
--

@Elliott What happens?  The region looks like it's on a regionserver it's no 
longer on?  The counters just don't change?  What does that mean?  That our 
per-region metrics are going to be messy if regions move?  Good on you.

 After Region Close remove the Operation Metrics.
 

 Key: HBASE-5862
 URL: https://issues.apache.org/jira/browse/HBASE-5862
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Minor
 Attachments: HBASE-5862-0.patch


 If a region is closed then Hadoop metrics shouldn't still be reporting about 
 that region.
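
 One way to picture the requested behaviour is a per-region registry of operation counters that drops a region's entries when the region closes. This is a minimal sketch under stated assumptions, not the attached patch: `RegionOperationMetrics` and its methods are hypothetical names, and real HBase metrics are published through Hadoop's metrics framework rather than a plain map.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical registry: region name -> (operation name -> count).
final class RegionOperationMetrics {
    private final ConcurrentMap<String, ConcurrentMap<String, LongAdder>> byRegion =
        new ConcurrentHashMap<>();

    void increment(String region, String operation) {
        byRegion.computeIfAbsent(region, r -> new ConcurrentHashMap<>())
                .computeIfAbsent(operation, o -> new LongAdder())
                .increment();
    }

    long get(String region, String operation) {
        ConcurrentMap<String, LongAdder> ops = byRegion.get(region);
        if (ops == null) {
            return 0L;
        }
        LongAdder counter = ops.get(operation);
        return counter == null ? 0L : counter.sum();
    }

    // Called when a region closes so stale counters stop being reported.
    void regionClosed(String region) {
        byRegion.remove(region);
    }

    Set<String> reportedRegions() {
        return byRegion.keySet();
    }
}
```

 Keying the outer map by region name means a close is a single removal, and a region that later reopens on the same server starts again from zero.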





[jira] [Commented] (HBASE-5851) TestProcessBasedCluster sometimes fails; currently disabled -- needs to be fixed and reenabled

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260193#comment-13260193
 ] 

stack commented on HBASE-5851:
--

It failed on the 9th iteration.  (It used to fail every other time for me, so 
it's an improvement.)

{code}
---
 T E S T S
---
Running org.apache.hadoop.hbase.util.TestProcessBasedCluster
2012-04-23 19:38:20.210 java[97418:a003] Unable to load realm info from 
SCDynamicStore
Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 300.829 sec  
FAILURE!

Results :

Tests in error: 
  
testProcessBasedCluster(org.apache.hadoop.hbase.util.TestProcessBasedCluster): 
test timed out after 30 milliseconds

Tests run: 2, Failures: 0, Errors: 1, Skipped: 0
{code}

There is nothing in the .out...

 TestProcessBasedCluster sometimes fails; currently disabled -- needs to be 
 fixed and reenabled
 --

 Key: HBASE-5851
 URL: https://issues.apache.org/jira/browse/HBASE-5851
 Project: HBase
  Issue Type: Test
Reporter: Zhihong Yu
Assignee: Jimmy Xiang
 Attachments: disable.txt, hbase-5851.patch, hbase-5851_v2.patch, 
 metahang.txt, zkfail.txt







[jira] [Commented] (HBASE-5861) Hadoop 23 compile broken due to tests introduced in HBASE-5064

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260209#comment-13260209
 ] 

stack commented on HBASE-5861:
--

Does HBASE-5853 need to get fixed too?  It looks like a 0.23 issue also?

 Hadoop 23 compile broken due to tests introduced in HBASE-5064 
 ---

 Key: HBASE-5861
 URL: https://issues.apache.org/jira/browse/HBASE-5861
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.94.0, 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Priority: Blocker
 Fix For: 0.94.0, 0.96.0

 Attachments: 5861.txt, hbase-5861-jon.patch, hbase-5861-v2.patch







[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260208#comment-13260208
 ] 

stack commented on HBASE-5849:
--

I tried it before committing and it passed then.  I just tried it on trunk now:

{code}
---
 T E S T S
---
Running org.apache.hadoop.hbase.TestClusterBootOrder
2012-04-23 21:27:45.213 java[97823:d007] Unable to load realm info from 
SCDynamicStore
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 19.727 sec

Results :

Tests run: 2, Failures: 0, Errors: 0, Skipped: 0

[INFO] 
[INFO] --- maven-surefire-plugin:2.10:test (secondPartTestsExecution) @ hbase 
---
[INFO] Tests are skipped.
[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 34.313s
[INFO] Finished at: Mon Apr 23 21:28:02 PDT 2012
[INFO] Final Memory: 21M/81M
[INFO] 
{code}

 On first cluster startup, RS aborts if root znode is not available
 --

 Key: HBASE-5849
 URL: https://issues.apache.org/jira/browse/HBASE-5849
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.92.2, 0.94.0

 Attachments: 5849v3.txt, HBASE-5849_v1.patch, HBASE-5849_v2.patch


 When launching a fresh new cluster, the master has to be started first, which 
 might create race conditions for starting master and rs at the same time. 
 Master startup code is something like this: 
  - establish zk connection
  - create root znodes in zk (/hbase)
  - create ephemeral node for master /hbase/master
  Region server startup code is something like this: 
  - establish zk connection
  - check whether the root znode (/hbase) is there. If not, shut down. 
  - wait for the master to create znodes /hbase/master
 So, the problem is that on the very first launch of the cluster, the RS aborts 
 on startup since the /hbase znode might not have been created yet (only the 
 master creates it if needed). Since /hbase/ is not deleted on cluster shutdown, 
 on subsequent cluster starts it does not matter in which order the servers are 
 started. So this affects only first launches. 
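
 The race described above can be sketched in plain Java. This is only an illustration under stated assumptions: the class RootZnodeWaitSketch, the in-memory ZNODES set standing in for ZooKeeper, and the methods startOrAbort and startOrWait are all hypothetical names, not HBase or ZooKeeper APIs, and the polling approach is one possible remedy rather than what the attached patches necessarily do.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model of the startup race; ZNODES is a stand-in for ZooKeeper, not its API. */
class RootZnodeWaitSketch {
    // Hypothetical in-memory "znode tree": just a thread-safe set of paths.
    static final Set<String> ZNODES = ConcurrentHashMap.newKeySet();

    /** Behavior described in the issue: give up at once if the root znode is missing. */
    static boolean startOrAbort(String rootZnode) {
        return ZNODES.contains(rootZnode);
    }

    /** Alternative: poll until the root znode appears or the timeout expires. */
    static boolean startOrWait(String rootZnode, long timeoutMs, long pollMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!ZNODES.contains(rootZnode)) {
            if (System.currentTimeMillis() >= deadline) {
                return false; // master never showed up in time
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        // Region server comes up first on a brand-new cluster: /hbase is absent.
        System.out.println("abort-style start succeeded: " + startOrAbort("/hbase"));

        // The "master" creates /hbase a little later.
        Thread master = new Thread(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                return;
            }
            ZNODES.add("/hbase");
        });
        master.start();
        System.out.println("wait-style start succeeded: " + startOrWait("/hbase", 5000, 50));
        master.join();
    }
}
```

 A real region server would more likely use a ZooKeeper watch than a sleep loop; polling is used here only to keep the sketch free of dependencies.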





[jira] [Commented] (HBASE-5852) Standalone HBase-0.92.1 fails to start master when coexisting with Hadoop-1.0.1, unnecessarily trying connecting to namenode.

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260211#comment-13260211
 ] 

stack commented on HBASE-5852:
--

You should ask about this issue on the mailing list first.  Are you sure the 
hadoop classpath is not being picked up by hbase when it's started?

 Standalone HBase-0.92.1 fails to start master when coexisting with 
 Hadoop-1.0.1, unnecessarily trying connecting to namenode. 
 --

 Key: HBASE-5852
 URL: https://issues.apache.org/jira/browse/HBASE-5852
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.1
 Environment: Scientific Linux 6.2 x86_64 with Oracle JDK 1.6
 Mac OS X Lion 10.7.1 with Oracle JDK 1.6
Reporter: Jianwen WEI
  Labels: hadoop, hbase, newbie
   Original Estimate: 168h
  Remaining Estimate: 168h

 I want to run a standalone HBase instance for development and testing, which 
 requires no HDFS. It works well on my server. However, when I add a hadoop 
 directory to my server, HBase seems to notice that change and tries to 
 connect to Hadoop's namenode. The failure to connect to the Hadoop 
 NameNode causes HBase to fail to start.
 I extracted HBase-0.92.1 to my home directory:
   ~/hbase -> ~/hbase-0.92.1
 In configuration file ~/hbase/config/hbase-site.xml, I set HBase to 
 standalone mode and specify the data directory it uses.
   <configuration>
     <property>
       <name>hbase.rootdir</name>
       <value>file:///home/jianwen/.hbase.data</value>
     </property>
   </configuration>
 Then I start HBase service with start-hbase.sh, enter HBase shell. Tests go 
 well.
 But things change when I install Hadoop into the same server. Hadoop-1.0.1 
 lies in my home directory too.
   ~/hadoop -> ~/hadoop-1.0.1
 In configuration file ~/hadoop/config/core-site.xml, I set Hadoop to run in a 
 pseudo distributed environment and specify the data directory for HDFS.
   <configuration>
     <property>
       <name>hadoop.tmp.dir</name>
       <value>/home/jianwen/.hdfs.data</value>
       <description>A base for other temporary directories.</description>
     </property>
     <property>
       <name>fs.default.name</name>
       <value>hdfs://localhost:9000</value>
     </property>
   </configuration>
 Then I format the namenode, start the hadoop service, and run some MapReduce 
 test programs, such as Pi, grep, etc. Hadoop works in my pseudo distributed 
 environment. Then I stop the hadoop service.
 Since I added hadoop to my home directory, HBase fails to start. The HBase log 
 shows that HBase tries to connect to the Hadoop NameNode when starting up, then 
 fails. That's ridiculous because HBase in standalone mode should have NOTHING 
 to do with NameNode and HDFS.
 In summary, there may be two problems:
 - Standalone HBase attempts to connect to the Hadoop NameNode at startup when 
 a hadoop directory is co-located in home.
 - Although nothing in HBase's configuration files says so, HBase seems to 
 implicitly search for a hadoop directory nearby and read its configuration, 
 such as the NameNode address in core-site.xml. This unclear behavior confuses 
 me a lot.
 Log for standalone HBase starting up:
   ...
   2012-04-22 11:50:41,078 DEBUG 
 org.apache.hadoop.hbase.master.LogCleaner: Add log cleaner in chain: 
 org.apache.hadoop.hbase.master.TimeToLiveLogCleaner
   2012-04-22 11:50:41,115 INFO org.mortbay.log: Logging to 
 org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via 
 org.mortbay.log.Slf4jLog
   2012-04-22 11:50:41,160 INFO org.apache.hadoop.http.HttpServer: Added 
 global filtersafety 
 (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
   2012-04-22 11:50:41,165 INFO org.apache.hadoop.http.HttpServer: Port 
 returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. 
 Opening the listener on 60010
   2012-04-22 11:50:41,165 INFO org.apache.hadoop.http.HttpServer: 
 listener.getLocalPort() returned 60010 
 webServer.getConnectors()[0].getLocalPort() returned 60010
   2012-04-22 11:50:41,165 INFO org.apache.hadoop.http.HttpServer: Jetty 
 bound to port 60010
   2012-04-22 11:50:41,165 INFO org.mortbay.log: jetty-6.1.26
   2012-04-22 11:50:41,548 INFO org.mortbay.log: Started 
 SelectChannelConnector@0.0.0.0:60010
   2012-04-22 11:50:41,548 DEBUG org.apache.hadoop.hbase.master.HMaster: 
 Started service threads
   2012-04-22 11:50:41,790 INFO org.apache.hadoop.ipc.Client: Retrying 
 connect to server: localhost/127.0.0.1:9000. Already tried 0 time(s).
   2012-04-22 11:50:42,792 INFO org.apache.hadoop.ipc.Client: Retrying 
 connect to server: localhost/127.0.0.1:9000. Already tried 1 time(s).
   2012-04-22 11:50:43,050 INFO 
 

[jira] [Commented] (HBASE-5829) Inconsistency between the regions map and the servers map in AssignmentManager

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260238#comment-13260238
 ] 

stack commented on HBASE-5829:
--

Do you have a patch for us Maryann?  The first at least seems legit (For the 
second, there is no associated server, right?)

 Inconsistency between the regions map and the servers map in 
 AssignmentManager
 --

 Key: HBASE-5829
 URL: https://issues.apache.org/jira/browse/HBASE-5829
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.6, 0.92.1
Reporter: Maryann Xue

 There are occurrences in AM where this.servers is not kept consistent with 
 this.regions. This might cause balancer to offline a region from the RS that 
 already returned NotServingRegionException at a previous offline attempt.
 In AssignmentManager.unassign(HRegionInfo, boolean)
 try {
   // TODO: We should consider making this look more like it does for the
   // region open where we catch all throwables and never abort
   if (serverManager.sendRegionClose(server, state.getRegion(),
       versionOfClosingNode)) {
     LOG.debug("Sent CLOSE to " + server + " for region " +
       region.getRegionNameAsString());
     return;
   }
   // This never happens. Currently regionserver close always return true.
   LOG.warn("Server " + server + " region CLOSE RPC returned false for " +
     region.getRegionNameAsString());
 } catch (NotServingRegionException nsre) {
   LOG.info("Server " + server + " returned " + nsre + " for " +
     region.getRegionNameAsString());
   // Presume that master has stale data.  Presume remote side just split.
   // Presume that the split message when it comes in will fix up the master's
   // in memory cluster state.
 } catch (Throwable t) {
   if (t instanceof RemoteException) {
     t = ((RemoteException)t).unwrapRemoteException();
     if (t instanceof NotServingRegionException) {
       if (checkIfRegionBelongsToDisabling(region)) {
         // Remove from the regionsinTransition map
         LOG.info("While trying to recover the table "
             + region.getTableNameAsString()
             + " to DISABLED state the region " + region
             + " was offlined but the table was in DISABLING state");
         synchronized (this.regionsInTransition) {
           this.regionsInTransition.remove(region.getEncodedName());
         }
         // Remove from the regionsMap
         synchronized (this.regions) {
           this.regions.remove(region);
         }
         deleteClosingOrClosedNode(region);
       }
     }
     // RS is already processing this region, only need to update the timestamp
     if (t instanceof RegionAlreadyInTransitionException) {
       LOG.debug("update " + state + " the timestamp.");
       state.update(state.getState());
     }
   }
 In AssignmentManager.assign(HRegionInfo, RegionState, boolean, boolean, boolean)
   synchronized (this.regions) {
     this.regions.put(plan.getRegionInfo(), plan.getDestination());
   }
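
 Both excerpts touch this.regions without touching this.servers. A minimal sketch of one way to keep two such maps in step is to route every change through methods that update both under a single lock. Everything here is an assumption made for illustration: the class RegionBookSketch, its method names, and the use of plain String keys in place of HRegionInfo and server objects. It is not the AssignmentManager code or a proposed patch.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical bookkeeping class: region -> server and server -> regions, updated together. */
class RegionBookSketch {
    private final Map<String, String> regions = new HashMap<>();      // region -> hosting server
    private final Map<String, Set<String>> servers = new HashMap<>(); // server -> hosted regions

    /** Records an assignment in both maps, first dropping any stale entry for the region. */
    public synchronized void assign(String region, String server) {
        unassign(region);
        regions.put(region, server);
        servers.computeIfAbsent(server, s -> new HashSet<>()).add(region);
    }

    /** Removes the region from both maps; a no-op if the region is unknown. */
    public synchronized void unassign(String region) {
        String server = regions.remove(region);
        if (server == null) {
            return;
        }
        Set<String> hosted = servers.get(server);
        if (hosted != null) {
            hosted.remove(region);
            if (hosted.isEmpty()) {
                servers.remove(server);
            }
        }
    }

    public synchronized String serverOf(String region) {
        return regions.get(region);
    }

    public synchronized Set<String> regionsOn(String server) {
        Set<String> hosted = servers.get(server);
        return hosted == null ? Collections.<String>emptySet() : new HashSet<>(hosted);
    }

    /** True when every entry in one map has its mirror entry in the other. */
    public synchronized boolean isConsistent() {
        int mirrored = 0;
        for (Map.Entry<String, Set<String>> e : servers.entrySet()) {
            for (String region : e.getValue()) {
                if (!e.getKey().equals(regions.get(region))) {
                    return false;
                }
                mirrored++;
            }
        }
        return mirrored == regions.size();
    }

    public static void main(String[] args) {
        RegionBookSketch book = new RegionBookSketch();
        book.assign("r1", "rs1");
        book.assign("r1", "rs2"); // reassignment must not leave r1 listed under rs1
        System.out.println("r1 is on " + book.serverOf("r1")
            + ", consistent=" + book.isConsistent());
    }
}
```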





[jira] [Commented] (HBASE-5816) Balancer and ServerShutdownHandler concurrently reassigning the same region

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260240#comment-13260240
 ] 

stack commented on HBASE-5816:
--

@Ram I think the aim would be a simplification; one queue to assign from rather 
than multiple.  Also, as is, I think state is a little too distributed across 
multiple variables and maps.  We should coalesce if possible.  I think 
Maryann's suggestion of trying a double or triple concurrent assign in a unit 
test is a good start.
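
A rough idea of what such a concurrent-assign test could exercise, as a sketch only: several threads try to assign the same region at once, and a single compare-and-set on the region state lets exactly one of them through. The class ConcurrentAssignSketch, its two-value State enum, and the methods tryAssign and raceAssigns are hypothetical and much simpler than the real AssignmentManager state handling.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical single-region model: only one caller may move it out of OFFLINE. */
class ConcurrentAssignSketch {
    enum State { OFFLINE, PENDING_OPEN }

    private final AtomicReference<State> state = new AtomicReference<>(State.OFFLINE);

    /** Returns true only for the caller that wins the OFFLINE -> PENDING_OPEN transition. */
    boolean tryAssign() {
        return state.compareAndSet(State.OFFLINE, State.PENDING_OPEN);
    }

    /** Releases the given number of assigners at the same moment and counts the winners. */
    static int raceAssigns(int callers) {
        ConcurrentAssignSketch region = new ConcurrentAssignSketch();
        CountDownLatch start = new CountDownLatch(1);
        AtomicInteger winners = new AtomicInteger();
        Thread[] threads = new Thread[callers];
        for (int i = 0; i < callers; i++) {
            threads[i] = new Thread(() -> {
                try {
                    start.await(); // line every assigner up behind the same gate
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (region.tryAssign()) {
                    winners.incrementAndGet();
                }
            });
            threads[i].start();
        }
        start.countDown();
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return winners.get();
    }

    public static void main(String[] args) {
        System.out.println("successful assigns out of 3: " + raceAssigns(3));
    }
}
```

In this model the losers simply get false back; whether the real code should retry, queue, or ignore a losing assign is exactly the design question under discussion.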

 Balancer and ServerShutdownHandler concurrently reassigning the same region
 ---

 Key: HBASE-5816
 URL: https://issues.apache.org/jira/browse/HBASE-5816
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.6
Reporter: Maryann Xue
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Attachments: HBASE-5816.patch


 The first assign thread exits with success after updating the RegionState to 
 PENDING_OPEN, while the second assign follows immediately into assign and 
 fails the RegionState check in setOfflineInZooKeeper(). This causes the 
 master to abort.
 In the below case, the two concurrent assigns occurred when AM tried to 
 assign a region to a dying/dead RS, and meanwhile the ShutdownServerHandler 
 tried to assign this region (from the region plan) spontaneously.
 2012-04-17 05:44:57,648 INFO org.apache.hadoop.hbase.master.HMaster: balance 
 hri=TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b., 
 src=hadoop05.sh.intel.com,60020,1334544902186, 
 dest=xmlqa-clv16.sh.intel.com,60020,1334612497253
 2012-04-17 05:44:57,648 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of 
 region TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b. 
 (offlining)
 2012-04-17 05:44:57,648 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to 
 serverName=hadoop05.sh.intel.com,60020,1334544902186, load=(requests=0, 
 regions=0, usedHeap=0, maxHeap=0) for region 
 TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b.
 2012-04-17 05:44:57,666 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned 
 node: /hbase/unassigned/fe38fe31caf40b6e607a3e6bbed6404b 
 (region=TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b.,
  server=hadoop05.sh.intel.com,60020,1334544902186, state=RS_ZK_REGION_CLOSING)
 2012-04-17 05:52:58,984 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
 was=TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b. 
 state=CLOSED, ts=1334612697672, 
 server=hadoop05.sh.intel.com,60020,1334544902186
 2012-04-17 05:52:58,984 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x236b912e9b3000e Creating (or updating) unassigned node for 
 fe38fe31caf40b6e607a3e6bbed6404b with OFFLINE state
 2012-04-17 05:52:59,096 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Using pre-existing plan for 
 region TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b.; 
 plan=hri=TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b.,
  src=hadoop05.sh.intel.com,60020,1334544902186, 
 dest=xmlqa-clv16.sh.intel.com,60020,1334612497253
 2012-04-17 05:52:59,096 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Assigning region 
 TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b. to 
 xmlqa-clv16.sh.intel.com,60020,1334612497253
 2012-04-17 05:54:19,159 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
 was=TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b. 
 state=PENDING_OPEN, ts=1334613179096, 
 server=xmlqa-clv16.sh.intel.com,60020,1334612497253
 2012-04-17 05:54:59,033 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of 
 TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b. to 
 serverName=xmlqa-clv16.sh.intel.com,60020,1334612497253, load=(requests=0, 
 regions=0, usedHeap=0, maxHeap=0), trying to assign elsewhere instead; retry=0
 java.net.SocketTimeoutException: Call to /10.239.47.87:60020 failed on socket 
 timeout exception: java.net.SocketTimeoutException: 12 millis timeout 
 while waiting for channel to be ready for read. ch : 
 java.nio.channels.SocketChannel[connected local=/10.239.47.89:41302 
 remote=/10.239.47.87:60020]
 at 
 org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:805)
 at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:778)
 at 
 org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:283)
 at $Proxy7.openRegion(Unknown Source)
 at 
 

[jira] [Updated] (HBASE-5831) hadoopqa builds not completing

2012-04-23 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5831:
-

Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

hadoopqa has completed a few times since the work fixing up tests and changing 
client retry so it no longer retries 100 times but the default 10. Closing as 
won't fix.

 hadoopqa builds not completing
 --

 Key: HBASE-5831
 URL: https://issues.apache.org/jira/browse/HBASE-5831
 Project: HBase
  Issue Type: Bug
  Components: test
Reporter: stack
Assignee: stack
Priority: Blocker
 Attachments: 5831.remove.TestLoadIncrementalHFilesSplitRecovery.txt, 
 5831.remove.TestLoadIncrementalHFilesSplitRecovery.txt, 
 5831.remove.TestLoadIncrementalHFilesSplitRecovery.txt, 
 5831.remove.TestLoadIncrementalHFilesSplitRecovery.txt, 
 5831.remove.TestLoadIncrementalHFilesSplitRecovery.txt, 
 5831.remove.all.mapreduce.txt, 5831.remove.all.mapreduce.txt, 
 5831.remove.all.mapreduce.txt, 5831.remove.all.mapreduce.txt, 
 5831.remove.all.mapreduce.txt


 No test failures but build complains it has failed.  trunk build seems to 
 have the same affliction:
 {code}
 Results :
 Tests run: 909, Failures: 0, Errors: 0, Skipped: 9
 [INFO] 
 
 [INFO] BUILD FAILURE
 [INFO] 
 
 [INFO] Total time: 41:19.273s
 [INFO] Finished at: Wed Apr 18 21:54:31 UTC 2012
 [INFO] Final Memory: 59M/451M
 [INFO] 
 
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-surefire-plugin:2.12-TRUNK-HBASE-2:test 
 (secondPartTestsExecution) on project hbase: Failure or timeout - [Help 1]
 [ERROR] 
 [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
 switch.
 [ERROR] Re-run Maven using the -X switch to enable full debug logging.
 [ERROR] 
 [ERROR] For more information about the errors and possible solutions, please 
 read the following articles:
 [ERROR] [Help 1] 
 http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
 -1 overall.  Here are the results of testing the latest attachment 
   http://issues.apache.org/jira/secure/attachment/12523250/5811+%281%29.txt
   against trunk revision .
 +1 @author.  The patch does not contain any @author tags.
 +1 tests included.  The patch appears to include 3 new or modified tests.
 +1 javadoc.  The javadoc tool did not generate any warning messages.
 +1 javac.  The applied patch does not increase the total number of javac 
 compiler warnings.
 -1 findbugs.  The patch appears to introduce 6 new Findbugs (version 
 1.3.9) warnings.
 +1 release audit.  The applied patch does not increase the total number 
 of release audit warnings.
  -1 core tests.  The patch failed these unit tests:
 {code}
 It's not apparent that any particular test is not finishing.





[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260246#comment-13260246
 ] 

stack commented on HBASE-5849:
--

There is something wrong now.  This test won't complete for me (though it has 
previously).  I thought it was the subsequent commit:

{code}
r1329555 | larsh | 2012-04-23 22:12:45 -0700 (Mon, 23 Apr 2012) | 1 line

Refuse operations from Admin before master is initialized - fix for all branches
{code}

..that was bringing on the problem but after removing that, it's still not 
completing.

I poked around in the debugger and was getting an NPE in reportForDuty after 
master came up because this.hbaseMaster was null; we were failing allocating the 
Interface (hard to trace because toString would throw its own exception).

For now backing this out.





[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260247#comment-13260247
 ] 

stack commented on HBASE-5849:
--

I mean, it even passed hadoopqa above, apart from my testing.  Backing it out 
though... it's an ugly hang when it happens.





[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available

2012-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260248#comment-13260248
 ] 

stack commented on HBASE-5849:
--

So, yes, I'm seeing what Ted reports above.





[jira] [Reopened] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available

2012-04-23 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack reopened HBASE-5849:
--


Reopening.  Backing out patch.





[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available

2012-04-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260255#comment-13260255
 ] 

stack commented on HBASE-5849:
--

I killed all running builds in case they'd run into this hang.





[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available

2012-04-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260256#comment-13260256
 ] 

stack commented on HBASE-5849:
--

Enis, mind taking a look at this?





[jira] [Commented] (HBASE-5842) Passing shell commands as an argument

2012-04-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260267#comment-13260267
 ] 

stack commented on HBASE-5842:
--

We used to have it so you could pass a formatter class that the shell would 
use.  Default is console formatting.  We used to have an html output one which 
was useful when you could type in shell commands on the ui and get results as an 
html page.  Another formatter would emit results one per line so they were 
greppable.

 Passing shell commands as an argument
 -

 Key: HBASE-5842
 URL: https://issues.apache.org/jira/browse/HBASE-5842
 Project: HBase
  Issue Type: Improvement
  Components: shell
Affects Versions: 0.94.0
Reporter: Harsh J
Priority: Minor

 Many times we've required scans of .META. to analyze issues with the cluster 
 we work on, and to have the result in a file we can pass around we usually 
 end up doing something like:
 {{echo "scan '.META.'" | hbase shell > meta-scan.txt}}
 This could be simplified to something like the following instead, with 
 support for an argument that reads commands:
 {{hbase shell -c "scan '.META.'"}}
 [Note though: File reading is possible already, i.e. {{hbase shell file.hs}}, 
 but then that's two steps and we usually don't keep a file around for just a 
 meta table scan.]





[jira] [Commented] (HBASE-5847) Support createTable(splitKeys) in Thrift

2012-04-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260270#comment-13260270
 ] 

stack commented on HBASE-5847:
--

Does the thrift2 package support this Nicolas?

 Support createTable(splitKeys) in Thrift
 

 Key: HBASE-5847
 URL: https://issues.apache.org/jira/browse/HBASE-5847
 Project: HBase
  Issue Type: Improvement
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Trivial

 The Thrift API does not allow a user to create a table with multiple split 
 keys.  This is needed for a handful of new internal projects that are written 
 in PHP/C++.





[jira] [Updated] (HBASE-5677) The master never does balance because duplicate openhandled the one region

2012-04-24 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5677:
-

Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

Will be fixed over in HBASE-5850.

 The master never does balance because duplicate openhandled the one region
 --

 Key: HBASE-5677
 URL: https://issues.apache.org/jira/browse/HBASE-5677
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
 Environment: 0.90
Reporter: xufeng
Assignee: xufeng
 Fix For: 0.90.7, 0.92.2

 Attachments: 5677-proposal.txt, 5677-proposal.txt, 
 Backport-HBASE-5454-to-90.patch, Backport-HBASE-5454-to-92.patch, 
 HBASE-5677-90-v1.patch, surefire-report_no_patched_v1.html, 
 surefire-report_patched_v1.html


 If a region is assigned while the master is initializing (before it runs 
 processFailover), the region's open is handled twice, 
 because the unassigned node in ZooKeeper is handled again in 
 AssignmentManager#processFailover().
 This leaves the region in RIT, so the master never balances.





[jira] [Commented] (HBASE-5564) Bulkload is discarding duplicate records

2012-04-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260282#comment-13260282
 ] 

stack commented on HBASE-5564:
--

@Laxman Any luck?

 Bulkload is discarding duplicate records
 

 Key: HBASE-5564
 URL: https://issues.apache.org/jira/browse/HBASE-5564
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.96.0
 Environment: HBase 0.92
Reporter: Laxman
Assignee: Laxman
  Labels: bulkloader
 Fix For: 0.96.0

 Attachments: 5564.lint, 5564v5.txt, HBASE-5564_trunk.1.patch, 
 HBASE-5564_trunk.1.patch, HBASE-5564_trunk.2.patch, HBASE-5564_trunk.3.patch, 
 HBASE-5564_trunk.4_final.patch, HBASE-5564_trunk.patch


 Duplicate records are discarded when they exist in the same input file, and 
 more specifically when they exist in the same split.
 Duplicate records are considered only if they come from different splits.
 Version under test: HBase 0.92





[jira] [Commented] (HBASE-5732) Remove the SecureRPCEngine and merge the security-related logic in the core engine

2012-04-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260280#comment-13260280
 ] 

stack commented on HBASE-5732:
--

@Devaraj If this patch only works against 1.0.x hadoop, what do we do when we 
want to run hbase 0.96 on hadoop 2.0.x?

Here is some more feedback on posted patch:

It's big!  It's mostly deletes and generated code though, thankfully.

So, you are going to move to UGI over User?

Are you going to get rid of the security dir that is at top level in hbase?  Is 
there anything left in it after this patch?

What kind of hadoop will be required?  One that supports security, i.e. apache 
hadoop 1.0.x, 2.0.x?  And then for the others?  They will need to have an 
answer for the security methods?

Just remove rather than do this commenting out:

{code}
-builder.setError(error != null);
+//builder.setStatus(
{code}

This is going to make a copy of the response?

{code}
+  token = connection.saslServer.wrap(buf.array(),
+  buf.arrayOffset(), buf.remaining());
{code}

Do we have to?  Can't we feed it out on the output stream, first the wrapping, 
then the response?

Any tests?

Good stuff.

 Remove the SecureRPCEngine and merge the security-related logic in the core 
 engine
 --

 Key: HBASE-5732
 URL: https://issues.apache.org/jira/browse/HBASE-5732
 Project: HBase
  Issue Type: Improvement
Reporter: Devaraj Das
Assignee: Devaraj Das
 Attachments: rpcengine-merge.patch


 Remove the SecureRPCEngine and merge the security-related logic in the core 
 engine. Follow up to HBASE-5727.





[jira] [Updated] (HBASE-5672) TestLruBlockCache#testBackgroundEvictionThread fails occasionally

2012-04-24 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5672:
-

Status: Patch Available  (was: Open)

+1 on patch.  Submitting to hadoopqa.

 TestLruBlockCache#testBackgroundEvictionThread fails occasionally
 -

 Key: HBASE-5672
 URL: https://issues.apache.org/jira/browse/HBASE-5672
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-5672.patch, HBASE-5672v2.patch


 We find TestLruBlockCache#testBackgroundEvictionThread fails occasionally.
 I think it's a problem of the test case.
 This is because runEviction() only calls evictionThread.evict():
 {code}
 public void evict() {
   synchronized(this) {
     this.notify(); // FindBugs NN_NAKED_NOTIFY
   }
 }
 {code}
 However, when we call evictionThread.evict(), the evictionThread may not yet 
 have entered run() in TestLruBlockCache#testBackgroundEvictionThread, so the 
 notify is lost.
 If we run the test many times, we can reproduce the failure easily.
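
The lost-notify race described above can be avoided by pairing the notify with a flag that the waiting thread checks before it waits. The following is a minimal sketch under that assumption, not the code from the attached patch; GuardedEvictionThread, awaitEvictions and shutdown are hypothetical names, and the counter stands in for real eviction work.

```java
// Hypothetical stand-in for a background eviction thread; not the HBase class.
final class GuardedEvictionThread extends Thread {
  private boolean evictRequested = false; // guarded by this
  private boolean running = true;         // guarded by this
  private int evictions = 0;              // guarded by this

  GuardedEvictionThread() {
    setDaemon(true);
  }

  @Override
  public void run() {
    while (true) {
      synchronized (this) {
        // Wait only while nothing is pending, so a request made before
        // run() was entered is still seen on the first pass.
        while (running && !evictRequested) {
          try {
            wait();
          } catch (InterruptedException e) {
            return;
          }
        }
        if (!running) {
          return;
        }
        evictRequested = false;
        evictions++;   // placeholder for the actual eviction work
        notifyAll();   // wake callers blocked in awaitEvictions()
      }
    }
  }

  /** Records the request in a flag, then notifies; the flag survives a missed notify. */
  synchronized void evict() {
    evictRequested = true;
    notifyAll();
  }

  synchronized void shutdown() {
    running = false;
    notifyAll();
  }

  /** Blocks until at least n evictions have run; returns false on timeout or interrupt. */
  synchronized boolean awaitEvictions(int n, long timeoutMs) {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (evictions < n) {
      long left = deadline - System.currentTimeMillis();
      if (left <= 0) {
        return false;
      }
      try {
        wait(left);
      } catch (InterruptedException e) {
        return false;
      }
    }
    return true;
  }
}
```

With this shape a test can call evict() before or after start() and then block in awaitEvictions() rather than sleeping for a fixed time.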





[jira] [Commented] (HBASE-5672) TestLruBlockCache#testBackgroundEvictionThread fails occasionally

2012-04-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260307#comment-13260307
 ] 

stack commented on HBASE-5672:
--

Sorry Chunhui, we let the patch rot.  Mind updating it?  Thanks.

 TestLruBlockCache#testBackgroundEvictionThread fails occasionally
 -

 Key: HBASE-5672
 URL: https://issues.apache.org/jira/browse/HBASE-5672
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-5672.patch, HBASE-5672v2.patch


 We find TestLruBlockCache#testBackgroundEvictionThread fails occasionally.
 I think it's a problem of the test case.
 This is because runEviction() only calls evictionThread.evict():
 {code}
 public void evict() {
   synchronized(this) {
     this.notify(); // FindBugs NN_NAKED_NOTIFY
   }
 }
 {code}
 However, when we call evictionThread.evict(), the evictionThread may not yet 
 have entered run() in TestLruBlockCache#testBackgroundEvictionThread, so the 
 notify is lost.
 If we run the test many times, we can reproduce the failure easily.





[jira] [Updated] (HBASE-4393) Implement a canary monitoring program

2012-04-24 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4393:
-

   Resolution: Fixed
Fix Version/s: 0.96.0
 Release Note: Tool to check cluster.  See $ ./bin/hbase 
org.apache.hadoop.hbase.tool.Canary -help for how to use.
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to trunk.  Thanks for the patch Matteo. I tried it out.  Does the 
basics.  Nice. Thanks.

 Implement a canary monitoring program
 -

 Key: HBASE-4393
 URL: https://issues.apache.org/jira/browse/HBASE-4393
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Matteo Bertozzi
 Fix For: 0.96.0

 Attachments: Canary-v0.java, HBASE-4393-v0.patch, HBaseCanary.java


 This JIRA is to implement a standalone program that can be used to do canary 
 monitoring of a running HBase cluster. This program would gather a list of 
 the regions in the cluster, then iterate over them doing lightweight 
 operations (eg short scans) to provide metrics about latency as well as alert 
 on availability issues.
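
The sweep described above can be sketched as follows. This is only an illustration under stated assumptions: CanarySketch, RegionProbe and sweep are hypothetical names, and the probe callback stands in for whatever lightweight client operation (such as a short scan) the real tool performs.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a canary sweep; not the committed Canary tool.
final class CanarySketch {
  /** Stand-in for one lightweight operation against a single region. */
  interface RegionProbe {
    void probe(String regionName) throws Exception;
  }

  /**
   * Probes every region in order and returns the latency in milliseconds per
   * region, or -1 for a region whose probe threw (an availability problem).
   */
  static Map<String, Long> sweep(List<String> regions, RegionProbe probe) {
    Map<String, Long> latencies = new LinkedHashMap<>();
    for (String region : regions) {
      long start = System.nanoTime();
      try {
        probe.probe(region);
        latencies.put(region, (System.nanoTime() - start) / 1_000_000L);
      } catch (Exception e) {
        latencies.put(region, -1L);
      }
    }
    return latencies;
  }
}
```

A monitoring wrapper could then alert on any -1 entry and export the remaining values as latency metrics.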





[jira] [Updated] (HBASE-4393) Implement a canary monitoring program

2012-04-24 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4393:
-

Fix Version/s: 0.94.0

Committed to 0.94 (thought you might like this Lars).

 Implement a canary monitoring program
 -

 Key: HBASE-4393
 URL: https://issues.apache.org/jira/browse/HBASE-4393
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Matteo Bertozzi
 Fix For: 0.94.0, 0.96.0

 Attachments: Canary-v0.java, HBASE-4393-v0.patch, HBaseCanary.java


 This JIRA is to implement a standalone program that can be used to do canary 
 monitoring of a running HBase cluster. This program would gather a list of 
 the regions in the cluster, then iterate over them doing lightweight 
 operations (eg short scans) to provide metrics about latency as well as alert 
 on availability issues.





[jira] [Commented] (HBASE-5864) Error while reading from hfile in 0.94

2012-04-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260645#comment-13260645
 ] 

stack commented on HBASE-5864:
--

Good find lads.  I'm not sure I follow.  Is it fixable?

 Error while reading from hfile in 0.94
 --

 Key: HBASE-5864
 URL: https://issues.apache.org/jira/browse/HBASE-5864
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.94.0

 Attachments: HBASE-5864_test.patch


 Got the following stacktrace during region split.
 {noformat}
 2012-04-24 16:05:42,168 WARN org.apache.hadoop.hbase.regionserver.Store: 
 Failed getting store size for value
 java.io.IOException: Requested block is out of range: 2906737606134037404, 
 lastDataBlockOffset: 84764558
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:278)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.midkey(HFileBlockIndex.java:285)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2.midkey(HFileReaderV2.java:402)
   at 
 org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1638)
   at 
 org.apache.hadoop.hbase.regionserver.Store.getSplitPoint(Store.java:1943)
   at 
 org.apache.hadoop.hbase.regionserver.RegionSplitPolicy.getSplitPoint(RegionSplitPolicy.java:77)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:4921)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.splitRegion(HRegionServer.java:2901)
 {noformat}





[jira] [Commented] (HBASE-4393) Implement a canary monitoring program

2012-04-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260651#comment-13260651
 ] 

stack commented on HBASE-4393:
--

No.  This patch has a license.  The failure was because of the OOME.  RAT 
complaint is this:

{code}

Unapproved licenses:

  hs_err_pid23951.log

...
{code}

 Implement a canary monitoring program
 -

 Key: HBASE-4393
 URL: https://issues.apache.org/jira/browse/HBASE-4393
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Matteo Bertozzi
 Fix For: 0.94.0, 0.96.0

 Attachments: Canary-v0.java, HBASE-4393-v0.patch, HBaseCanary.java


 This JIRA is to implement a standalone program that can be used to do canary 
 monitoring of a running HBase cluster. This program would gather a list of 
 the regions in the cluster, then iterate over them doing lightweight 
 operations (eg short scans) to provide metrics about latency as well as alert 
 on availability issues.





[jira] [Commented] (HBASE-5856) byte - String is not consistent between HBaseAdmin and HRegionInfo

2012-04-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260661#comment-13260661
 ] 

stack commented on HBASE-5856:
--

Help me understand Binlijin.  This does not seem right to me but you are 
obviously on to something.  The String that is being passed in here is coming 
from the shell?  Is jruby shell doing the escaping before it passes the String 
to HBaseAdmin?  Thanks.

 byte - String is not consistent between HBaseAdmin and HRegionInfo
 

 Key: HBASE-5856
 URL: https://issues.apache.org/jira/browse/HBASE-5856
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.6, 0.92.1
Reporter: binlijin
 Attachments: HBASE-5856-0.92.patch


 In HBaseAdmin:
   public void split(final String tableNameOrRegionName)
       throws IOException, InterruptedException {
     split(Bytes.toBytes(tableNameOrRegionName));  // string -> byte
   }
 In HRegionInfo:
   this.regionNameStr = Bytes.toStringBinary(this.regionName);  // byte -> string
 Should we use Bytes.toBytesBinary in HBaseAdmin?
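
To illustrate why the two conversions need to be inverses of each other, here is a small self-contained sketch. It uses a simplified escaping scheme (printable ASCII kept as is, everything else written as \xHH) that is only assumed to resemble what Bytes.toStringBinary and Bytes.toBytesBinary do; BinaryNameSketch and both helper methods are hypothetical and are not the HBase implementations.

```java
import java.io.ByteArrayOutputStream;

// Simplified, hypothetical escaping helpers; not the HBase Bytes class.
final class BinaryNameSketch {
  /** Printable ASCII (except backslash) stays as is; other bytes become \xHH. */
  static String toStringBinary(byte[] bytes) {
    StringBuilder sb = new StringBuilder();
    for (byte b : bytes) {
      int ch = b & 0xFF;
      if (ch >= 0x20 && ch < 0x7F && ch != '\\') {
        sb.append((char) ch);
      } else {
        sb.append(String.format("\\x%02X", ch));
      }
    }
    return sb.toString();
  }

  /** Inverse of toStringBinary above: decodes each \xHH escape back to one byte. */
  static byte[] toBytesBinary(String s) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      if (c == '\\' && i + 3 < s.length() && s.charAt(i + 1) == 'x') {
        int hi = Character.digit(s.charAt(i + 2), 16);
        int lo = Character.digit(s.charAt(i + 3), 16);
        if (hi >= 0 && lo >= 0) {
          out.write((hi << 4) | lo);
          i += 3; // skip the rest of the escape
          continue;
        }
      }
      out.write((byte) c);
    }
    return out.toByteArray();
  }
}
```

In this sketch a name containing the byte 0xFF is displayed as the four characters \xFF; encoding that displayed string with a plain UTF-8 conversion yields four bytes rather than one, so only the matching binary decoder recovers the original name.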





[jira] [Updated] (HBASE-5863) Improve the graceful_stop.sh CLI help (especially about reloads)

2012-04-24 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5863:
-

   Resolution: Fixed
Fix Version/s: 0.94.1
   0.92.2
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to 0.92, 0.94 and trunk.  Thanks for the patch Harsh.

 Improve the graceful_stop.sh CLI help (especially about reloads)
 

 Key: HBASE-5863
 URL: https://issues.apache.org/jira/browse/HBASE-5863
 Project: HBase
  Issue Type: Improvement
  Components: scripts
Affects Versions: 0.94.0
Reporter: Harsh J
Assignee: Harsh J
Priority: Minor
 Fix For: 0.92.2, 0.94.1

 Attachments: HBASE-5863.patch


 Right now, graceful_stop.sh prints:
 {code}
 Usage: graceful_stop.sh [--config conf-dir] [--restart] [--reload] 
 [--thrift] [--rest] hostname
  thrift    If we should stop/start thrift before/after the hbase stop/start
  rest      If we should stop/start rest before/after the hbase stop/start
  restart   If we should restart after graceful stop
  reload    Move offloaded regions back on to the stopped server
  debug     Move offloaded regions back on to the stopped server
  hostname  Hostname of server we are to stop
 {code}
 This does not help us specify that reload is actually a sub/additive-option 
 to restart.
 Also, the debug line seems to still have an old copy/paste mistake.
 I've updated these two in the patch here.





[jira] [Commented] (HBASE-5864) Error while reading from hfile in 0.94

2012-04-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260672#comment-13260672
 ] 

stack commented on HBASE-5864:
--

I'm not sure I follow what your patch is doing Ram.  And maybe we need a test 
around split of hfile?

What is this doing:

{code}
-final int ENTRY_COUNT = 1;
+final int ENTRY_COUNT = 5;
{code}

This is asking for too many entries?

Good stuff.

 Error while reading from hfile in 0.94
 --

 Key: HBASE-5864
 URL: https://issues.apache.org/jira/browse/HBASE-5864
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.94.0

 Attachments: HBASE-5864_1.patch, HBASE-5864_test.patch


 Got the following stacktrace during region split.
 {noformat}
 2012-04-24 16:05:42,168 WARN org.apache.hadoop.hbase.regionserver.Store: 
 Failed getting store size for value
 java.io.IOException: Requested block is out of range: 2906737606134037404, 
 lastDataBlockOffset: 84764558
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:278)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.midkey(HFileBlockIndex.java:285)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2.midkey(HFileReaderV2.java:402)
   at 
 org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1638)
   at 
 org.apache.hadoop.hbase.regionserver.Store.getSplitPoint(Store.java:1943)
   at 
 org.apache.hadoop.hbase.regionserver.RegionSplitPolicy.getSplitPoint(RegionSplitPolicy.java:77)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:4921)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.splitRegion(HRegionServer.java:2901)
 {noformat}





[jira] [Commented] (HBASE-4393) Implement a canary monitoring program

2012-04-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260726#comment-13260726
 ] 

stack commented on HBASE-4393:
--

@Ted FYI, if you go under artifacts produced by the build into the target dir, 
you can see the rat.txt now.  That's where I got the above from.  Thinking on 
it, I also went to change the build order so site goes first and we fail fast 
if there is a rat problem, but it seems the build is already this way -- it 
must run unit tests up front anyway.

 Implement a canary monitoring program
 -

 Key: HBASE-4393
 URL: https://issues.apache.org/jira/browse/HBASE-4393
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Matteo Bertozzi
 Fix For: 0.94.0, 0.96.0

 Attachments: Canary-v0.java, HBASE-4393-v0.patch, HBaseCanary.java


 This JIRA is to implement a standalone program that can be used to do canary 
 monitoring of a running HBase cluster. This program would gather a list of 
 the regions in the cluster, then iterate over them doing lightweight 
 operations (eg short scans) to provide metrics about latency as well as alert 
 on availability issues.





[jira] [Created] (HBASE-5866) Canary in tool package but says its in tools.

2012-04-24 Thread stack (JIRA)
stack created HBASE-5866:


 Summary: Canary in tool package but says its in tools.
 Key: HBASE-5866
 URL: https://issues.apache.org/jira/browse/HBASE-5866
 Project: HBase
  Issue Type: Bug
Reporter: stack








[jira] [Updated] (HBASE-5866) Canary in tool package but says its in tools.

2012-04-24 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5866:
-

Attachment: 5866.txt

Spotted by Lars Franke.  Moved Canary into tool package.

 Canary in tool package but says its in tools.
 -

 Key: HBASE-5866
 URL: https://issues.apache.org/jira/browse/HBASE-5866
 Project: HBase
  Issue Type: Bug
Reporter: stack
 Fix For: 0.94.0, 0.96.0

 Attachments: 5866.txt








[jira] [Resolved] (HBASE-5866) Canary in tool package but says its in tools.

2012-04-24 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-5866.
--

   Resolution: Fixed
Fix Version/s: 0.96.0
   0.94.0
 Assignee: stack

Committed trunk and 0.94.

 Canary in tool package but says its in tools.
 -

 Key: HBASE-5866
 URL: https://issues.apache.org/jira/browse/HBASE-5866
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.94.0, 0.96.0

 Attachments: 5866.txt








[jira] [Commented] (HBASE-4393) Implement a canary monitoring program

2012-04-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260766#comment-13260766
 ] 

stack commented on HBASE-4393:
--

Let me take care of it Lars.  I should have seen that.  Thanks for pointing it 
out.  Fixed over in HBASE-5866.

 Implement a canary monitoring program
 -

 Key: HBASE-4393
 URL: https://issues.apache.org/jira/browse/HBASE-4393
 Project: HBase
  Issue Type: New Feature
  Components: monitoring
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Matteo Bertozzi
 Fix For: 0.94.0, 0.96.0

 Attachments: Canary-v0.java, HBASE-4393-v0.patch, HBaseCanary.java


 This JIRA is to implement a standalone program that can be used to do canary 
 monitoring of a running HBase cluster. This program would gather a list of 
 the regions in the cluster, then iterate over them doing lightweight 
 operations (eg short scans) to provide metrics about latency as well as alert 
 on availability issues.





[jira] [Resolved] (HBASE-5865) test-util.sh broken with unittest updates

2012-04-24 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-5865.
--

  Resolution: Fixed
Hadoop Flags: Reviewed

Tried it. Seems to work.  Committed trunk and 0.94.  Thanks for the patch Jesse.

 test-util.sh broken with unittest updates
 -

 Key: HBASE-5865
 URL: https://issues.apache.org/jira/browse/HBASE-5865
 Project: HBase
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.96.0, 0.94.1
Reporter: Jesse Yates
Assignee: Jesse Yates
Priority: Minor
 Fix For: 0.96.0, 0.94.1

 Attachments: sh_HBASE-5865-v0.patch


 Since the default maven test is meant to be run on the server, this test 
 script always fails. Needs to take into account the location of where the 
 script is being run as well as some debugging options for future fixes.





[jira] [Commented] (HBASE-5840) Open Region FAILED_OPEN doesn't clear the TaskMonitor Status, keeps showing the old status

2012-04-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260774#comment-13260774
 ] 

stack commented on HBASE-5840:
--

Patch looks good.  Is it just moving all of this (too long) method into a try 
block and then adding a finally that sets status.abort on the end?  (FYI, it 
needs spacing around the operators in the status.abort line so it's the same 
code style as the rest of the file.)  Do you have to convert the Exception to 
an IOE?  Why is that?  What does this method let out?  IOEs only?  If so, why 
do we catch Exception?  In case it's a non-checked exception?  The test looks 
good too, but in the finally you might want to use the new 
HRegion.closeHRegion(region) to clean up the wal log that gets made by the 
constructor.
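
For reference, here is a minimal sketch of the try/finally shape being discussed, showing that a finally block can abort the status without catching or converting anything: IOExceptions and unchecked exceptions both propagate unchanged. This is an assumption about one way to structure it, not the code in the patch; OpenRegionSketch, Status and Step are hypothetical stand-ins rather than the HBase classes.

```java
import java.io.IOException;

// Hypothetical sketch; Status is a stand-in, not the HBase monitored-task class.
final class OpenRegionSketch {
  static final class Status {
    String state = "RUNNING";
    String message = "";

    void setStatus(String m) { message = m; }
    void markComplete(String m) { state = "COMPLETE"; message = m; }
    void abort(String m) { state = "ABORTED"; message = m; }
  }

  /** The work done while opening; it may throw IOException or an unchecked exception. */
  interface Step {
    void run() throws IOException;
  }

  static void open(Status status, Step step) throws IOException {
    boolean opened = false;
    try {
      status.setStatus("Opening region");
      step.run();
      status.markComplete("Region opened");
      opened = true;
    } finally {
      // Runs on every exit path, so a failed open never leaves a stale status.
      if (!opened) {
        status.abort("Open failed");
      }
    }
  }
}
```

Because nothing is caught, the method's signature still only declares IOException, and a RuntimeException from the step reaches the caller as is.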

 Open Region FAILED_OPEN doesn't clear the TaskMonitor Status, keeps showing 
 the old status
 --

 Key: HBASE-5840
 URL: https://issues.apache.org/jira/browse/HBASE-5840
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: rajeshbabu
 Fix For: 0.96.0, 0.94.1

 Attachments: HBASE-5840.patch


 The TaskMonitor status is not cleared when a region goes to FAILED_OPEN, so 
 it keeps showing the old status.
 This misleads the user.




