[jira] [Commented] (HBASE-4742) Split dead server's log in parallel

2011-11-06 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144928#comment-13144928
 ] 

Phabricator commented on HBASE-4742:


tedyu has commented on the revision [jira] [HBASE-4742] Split dead server's 
log in parallel.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java:311 I think a thread 
index starting from 1 would be more readable to the user.
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java:314 Please the 
following call before returning:
  (logSplitThreadPool).allowCoreThreadTimeOut(true);
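
A minimal sketch of what the two inline comments ask for, not the code from the patch. The class name, the "LogSplitter-" prefix, the pool size, and the keep-alive value are assumptions made for illustration; only the allowCoreThreadTimeOut(true) call comes from the review comment.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class LogSplitPoolSketch {
  /** Builds a pool whose idle threads, core threads included, exit after the keep-alive. */
  static ThreadPoolExecutor newLogSplitPool(int maxThreads, long keepAliveSeconds) {
    // Hypothetical naming scheme: indices start at 1, as the first comment suggests.
    final AtomicInteger nextIndex = new AtomicInteger(1);
    ThreadFactory factory = new ThreadFactory() {
      @Override
      public Thread newThread(Runnable r) {
        return new Thread(r, "LogSplitter-" + nextIndex.getAndIncrement());
      }
    };
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        maxThreads, maxThreads, keepAliveSeconds, TimeUnit.SECONDS,
        new LinkedBlockingQueue<Runnable>(), factory);
    // Requires keepAliveSeconds > 0, otherwise IllegalArgumentException is thrown.
    pool.allowCoreThreadTimeOut(true);
    return pool;
  }

  public static void main(String[] args) {
    ThreadPoolExecutor pool = newLogSplitPool(3, 60);
    pool.execute(() -> System.out.println(Thread.currentThread().getName()));
    pool.shutdown();
  }
}
```

With core thread timeout enabled, a pool that is only busy while dead servers are being processed does not keep idle threads around afterwards.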

REVISION DETAIL
  https://reviews.facebook.net/D237


 Split dead server's log in parallel
 ---

 Key: HBASE-4742
 URL: https://issues.apache.org/jira/browse/HBASE-4742
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: D237.1.patch, D237.2.patch, D237.3.patch, D237.4.patch, 
 D237.5.patch


 When a region server goes down, the master shuts down the region server 
 and splits its log.
 However, splitting a log is a blocking call and can take some time.
 If more than one region server goes down, the master splits their logs one by 
 one, which is not efficient.
 Since we have distributed log splitting, we could split the logs from the 
 dead servers in parallel. 
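
 The idea in the description can be sketched as follows. This is an illustration under assumptions, not the patch: splitLog is a hypothetical stand-in for the blocking per-server split, and the server names and thread count are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelSplitSketch {
  /** Hypothetical stand-in for the blocking split of one dead server's log. */
  static String splitLog(String deadServer) {
    return "split:" + deadServer;
  }

  /** Runs one split task per dead server and returns the results in input order. */
  static List<String> splitAll(List<String> deadServers, int threads) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Callable<String>> tasks = new ArrayList<>();
      for (final String server : deadServers) {
        tasks.add(() -> splitLog(server));
      }
      List<String> results = new ArrayList<>();
      // invokeAll blocks until every task is done and keeps the task order.
      for (Future<String> f : pool.invokeAll(tasks)) {
        results.add(f.get());
      }
      return results;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while splitting logs", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("log split failed", e.getCause());
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    System.out.println(splitAll(Arrays.asList("rs1", "rs2", "rs3"), 2));
  }
}
```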

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache

2011-11-06 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144929#comment-13144929
 ] 

Lars Hofhansl commented on HBASE-4752:
--

Hmm... The blocks are still sorted in the PriorityQueue. Could use a TreeSet 
with a comparator, but the properties will be different.
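
One property that differs, shown as a small sketch: a TreeSet treats elements that compare as equal as duplicates and keeps only one, while a PriorityQueue keeps both. The Block class and its fields are hypothetical and are not HBase's CachedBlock.

```java
import java.util.Comparator;
import java.util.PriorityQueue;
import java.util.TreeSet;

class QueueVsSetSketch {
  /** Hypothetical block: a name plus its last access time. */
  static final class Block {
    final String name;
    final long accessTime;

    Block(String name, long accessTime) {
      this.name = name;
      this.accessTime = accessTime;
    }
  }

  static final Comparator<Block> BY_ACCESS_TIME =
      Comparator.comparingLong((Block b) -> b.accessTime);

  /** Number of elements retained by a PriorityQueue ordered by access time. */
  static int sizeInPriorityQueue(Block... blocks) {
    PriorityQueue<Block> queue = new PriorityQueue<>(11, BY_ACCESS_TIME);
    for (Block b : blocks) {
      queue.add(b);
    }
    return queue.size();
  }

  /** Number of elements retained by a TreeSet using the same comparator. */
  static int sizeInTreeSet(Block... blocks) {
    TreeSet<Block> set = new TreeSet<>(BY_ACCESS_TIME);
    for (Block b : blocks) {
      set.add(b); // silently ignored if an element with equal access time exists
    }
    return set.size();
  }

  public static void main(String[] args) {
    Block a = new Block("a", 100L);
    Block b = new Block("b", 100L); // same access time as a
    System.out.println(sizeInPriorityQueue(a, b));
    System.out.println(sizeInTreeSet(a, b));
  }
}
```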

The minimal optimization would be to use an ArrayList and fill it from the 
back (the size of the queue is known, and since PriorityQueue is not thread-safe 
and there is no thread synchronization in CachedBlockQueue, I assume there is 
no multithreaded access here).
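
A sketch of the fill-from-the-back idea, assuming single-threaded access as stated above. It uses a pre-sized array wrapped as a list rather than an ArrayList, because an empty ArrayList cannot be written at its last index; Long values stand in for cached blocks.

```java
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

class FillFromBackSketch {
  /** Drains the queue (smallest element first) into a list ordered largest first. */
  static List<Long> drainReversed(PriorityQueue<Long> queue) {
    Long[] out = new Long[queue.size()]; // the size is known up front
    for (int i = out.length - 1; i >= 0; i--) {
      out[i] = queue.poll(); // the smallest remaining element goes to the back
    }
    return Arrays.asList(out);
  }

  public static void main(String[] args) {
    PriorityQueue<Long> queue = new PriorityQueue<>(Arrays.asList(3L, 1L, 2L));
    System.out.println(drainReversed(queue));
  }
}
```

This avoids the per-element node allocation of a LinkedList while still reversing the order in which the queue hands out its elements.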

Strange that those tests pass, though. You do agree that with the patch the 
entries would be evicted in the opposite order from before, no?


 Don't create an unnecessary LinkedList when evicting from the BlockCache
 

 Key: HBASE-4752
 URL: https://issues.apache.org/jira/browse/HBASE-4752
 Project: HBase
  Issue Type: Improvement
  Components: performance, regionserver
Affects Versions: 0.90.4
Reporter: Benoit Sigoure
Assignee: Benoit Sigoure
Priority: Minor
 Attachments: 
 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 
 4752-trunk.txt


 When evicting from the BlockCache, the code creates a LinkedList containing 
 every single block sorted by access time.  This list is created from a 
 PriorityQueue.  I don't believe it is necessary, as the PriorityQueue can be 
 used directly.





[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144930#comment-13144930
 ] 

Phabricator commented on HBASE-4746:


tedyu has commented on the revision [jira] [HBASE-4746] Use a random ZK client 
port in unit tests so we can run them in parallel.

INLINE COMMENTS
  
  src/test/java/org/apache/hadoop/hbase/thrift2/TestThriftHBaseServiceHandler.java:117 
Thanks for taking care of HBASE-4750.

REVISION DETAIL
  https://reviews.facebook.net/D279


 Use a random ZK client port in unit tests so we can run them in parallel
 

 Key: HBASE-4746
 URL: https://issues.apache.org/jira/browse/HBASE-4746
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
 Attachments: 4746-trunk-v2.txt, D255.1.patch, D279.1.patch


 The hard-coded ZK client port has long been a problem for running the HBase test 
 suite in parallel. The mini ZK cluster should run on a random free port, and 
 that port should be passed to all parts of the unit tests that need to talk 
 to the mini cluster. In fact, randomizing the port exposes a lot of places in 
 the code where a new configuration is instantiated, and as a result the 
 client tries to talk to the default ZK client port and times out.
 The initial fix is for 0.89-fb, where it already allows unit tests to run in 
 parallel in 10 minutes. A fix for the trunk will follow.
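
 One common way to pick a free port is to bind to port 0 and read back the port the OS assigned. The sketch below shows that technique and passes the port on through a configuration object. It is an assumption about the approach, not the patch itself: java.util.Properties stands in for the HBase Configuration, and the port can in principle be taken by another process between the probe and its later use.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.util.Properties;

class FreePortSketch {
  /** Asks the OS for a currently free TCP port by binding to port 0. */
  static int findFreePort() {
    try (ServerSocket socket = new ServerSocket(0)) {
      return socket.getLocalPort();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Builds a configuration that carries the chosen ZK client port to its users. */
  static Properties configFor(int zkClientPort) {
    Properties conf = new Properties();
    conf.setProperty("hbase.zookeeper.property.clientPort", Integer.toString(zkClientPort));
    return conf;
  }

  public static void main(String[] args) {
    Properties conf = configFor(findFreePort());
    System.out.println(conf.getProperty("hbase.zookeeper.property.clientPort"));
  }
}
```

 Code that creates a fresh configuration instead of reusing this one falls back to the default port, which is the failure mode the description mentions.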





[jira] [Commented] (HBASE-4742) Split dead server's log in parallel

2011-11-06 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144932#comment-13144932
 ] 

Phabricator commented on HBASE-4742:


tedyu has commented on the revision [jira] [HBASE-4742] Split dead server's 
log in parallel.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java:314 Should read: 
'Please add the following call before returning'

REVISION DETAIL
  https://reviews.facebook.net/D237






[jira] [Commented] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache

2011-11-06 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144934#comment-13144934
 ] 

Ted Yu commented on HBASE-4752:
---

@Lars:
If you were talking about my patch, I assume you have noticed the following:
{code}
-  for(CachedBlock cb: blocks) {
+  while ((cb = queue.pollLast()) != null) {
{code}






[jira] [Commented] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache

2011-11-06 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144937#comment-13144937
 ] 

Lars Hofhansl commented on HBASE-4752:
--

Oh? ... Nope, was still referring to Benoit's patch. I missed your new patch.

I am not familiar with MinMaxPriorityQueue, but if it does what its name 
suggests the patch looks good to me.
Wanna set the initial size (with a builder)? Or remove the initial size 
calculation?






[jira] [Commented] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache

2011-11-06 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144940#comment-13144940
 ] 

Lars Hofhansl commented on HBASE-4752:
--

Looking at MinMaxPriorityQueue... can we just get rid of CachedBlockQueue? It 
only exists to provide the entries in reverse, it seems.








[jira] [Commented] (HBASE-4740) [bulk load] the HBASE-4552 API can't tell if errors on region server is recoverable or unrecoverable error.

2011-11-06 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144942#comment-13144942
 ] 

Hadoop QA commented on HBASE-4740:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12502640/4740.v3
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -164 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 48 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandler
  org.apache.hadoop.hbase.master.TestDistributedLogSplitting
  org.apache.hadoop.hbase.master.TestMasterFailover
  org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithRemove

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/192//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/192//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/192//console

This message is automatically generated.

 [bulk load]  the HBASE-4552 API can't tell if errors on region server is 
 recoverable or unrecoverable error.
 

 Key: HBASE-4740
 URL: https://issues.apache.org/jira/browse/HBASE-4740
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Priority: Blocker
 Fix For: 0.92.0

 Attachments: 
 0001-HBASE-4740-bulkload-HBASE-4552-API-can-t-tell-if-err.patch, 4740.v3, 
 hbase-4740.v2.patch


 Running TestHFileOutputFormat more frequently seems to show that it has 
 become flaky.   It is difficult to tell if this is because of an unrecoverable 
 failure or a recoverable failure.   To make this visible to tests and to 
 users, we need to change the bulkload call's interface on 
 HRegionServer.  The change should make successful RPCs return true, 
 recoverable failures return false, and unrecoverable failures throw an 
 IOException.  This is an RPC change, so it would be really good to get this 
 API right before the final 0.92 goes out.
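
 From the caller's side, the proposed contract could be used as sketched below. The BulkLoader interface, its method name, and the retry limit are hypothetical stand-ins for illustration; they are not the HRegionServer RPC signature.

```java
import java.io.IOException;

class BulkLoadRetrySketch {
  /** Hypothetical stand-in for the region server bulk load call. */
  interface BulkLoader {
    /** Returns true on success, false on a recoverable failure; throws if unrecoverable. */
    boolean bulkLoad(String hfilePath) throws IOException;
  }

  /** Retries recoverable failures up to maxAttempts and lets IOException propagate. */
  static boolean loadWithRetries(BulkLoader loader, String hfilePath, int maxAttempts)
      throws IOException {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      if (loader.bulkLoad(hfilePath)) {
        return true; // success
      }
      // false: recoverable failure, so the loop tries again
    }
    return false; // still failing after maxAttempts recoverable failures
  }

  public static void main(String[] args) throws IOException {
    final int[] calls = {0};
    // Simulated server: fails recoverably twice, then succeeds.
    boolean ok = loadWithRetries(path -> ++calls[0] >= 3, "/tmp/example-hfile", 5);
    System.out.println(ok + " after " + calls[0] + " calls");
  }
}
```

 A test can then tell the two cases apart: a false return means the retries were exhausted, while an exception means the failure was never retryable.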





[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144944#comment-13144944
 ] 

Hadoop QA commented on HBASE-4746:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12502644/4746-trunk-v2.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 39 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -164 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 48 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hbase.master.TestDistributedLogSplitting
  org.apache.hadoop.hbase.replication.TestMasterReplication
  org.apache.hadoop.hbase.replication.TestReplication
  org.apache.hadoop.hbase.regionserver.TestColumnSeeking

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/193//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/193//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/193//console

This message is automatically generated.





[jira] [Commented] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache

2011-11-06 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144957#comment-13144957
 ] 

Hadoop QA commented on HBASE-4752:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12502646/4752-trunk.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -164 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 49 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hbase.master.TestMasterFailover
  org.apache.hadoop.hbase.io.hfile.TestCachedBlockQueue

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/194//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/194//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/194//console

This message is automatically generated.





[jira] [Created] (HBASE-4753) org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws NPE on trunk

2011-11-06 Thread nkeywal (Created) (JIRA)
org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws NPE 
on trunk


 Key: HBASE-4753
 URL: https://issues.apache.org/jira/browse/HBASE-4753
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal


testGetSetOfHTD(org.apache.hadoop.hbase.regionserver.TestHRegionInfo)  Time elapsed: 0.011 sec   ERROR!
java.lang.NullPointerException
    at org.apache.hadoop.hbase.util.FSTableDescriptors.deleteTableDescriptorIfExists(FSTableDescriptors.java:433)
    at org.apache.hadoop.hbase.regionserver.TestHRegionInfo.testGetSetOfHTD(TestHRegionInfo.java:72)

This happens because 'getTableInfoPath' can return null and the result is not checked.
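
The kind of guard that avoids this NPE can be sketched as follows. This is not the FSTableDescriptors code or the attached patch: the in-memory map stands in for the file system, and both methods are hypothetical simplifications that only borrow the names from the stack trace.

```java
import java.util.HashMap;
import java.util.Map;

class NullGuardSketch {
  /** Hypothetical stand-in for the file system: table name to descriptor path. */
  static final Map<String, String> TABLE_INFO = new HashMap<>();

  /** Hypothetical lookup that returns null when the table has no descriptor. */
  static String getTableInfoPath(String table) {
    return TABLE_INFO.get(table);
  }

  /** Deletes the descriptor if present; returns false instead of throwing when absent. */
  static boolean deleteTableDescriptorIfExists(String table) {
    String path = getTableInfoPath(table);
    if (path == null) {
      return false; // nothing to delete, and no dereference of a null path
    }
    TABLE_INFO.remove(table);
    return true;
  }

  public static void main(String[] args) {
    System.out.println(deleteTableDescriptorIfExists("missing"));
    TABLE_INFO.put("t1", "/hbase/t1/.tableinfo");
    System.out.println(deleteTableDescriptorIfExists("t1"));
  }
}
```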






[jira] [Updated] (HBASE-4753) org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws NPE on trunk

2011-11-06 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4753:
---

Status: Patch Available  (was: Open)

 org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws 
 NPE on trunk
 

 Key: HBASE-4753
 URL: https://issues.apache.org/jira/browse/HBASE-4753
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 4753_FSTableDescriptors.patch


 testGetSetOfHTD(org.apache.hadoop.hbase.regionserver.TestHRegionInfo)  Time elapsed: 0.011 sec   ERROR!
 java.lang.NullPointerException
     at org.apache.hadoop.hbase.util.FSTableDescriptors.deleteTableDescriptorIfExists(FSTableDescriptors.java:433)
     at org.apache.hadoop.hbase.regionserver.TestHRegionInfo.testGetSetOfHTD(TestHRegionInfo.java:72)
 This happens because 'getTableInfoPath' can return null and the result is not checked.





[jira] [Updated] (HBASE-4753) org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws NPE on trunk

2011-11-06 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4753:
---

Attachment: 4753_FSTableDescriptors.patch

It seems that without the patch it works only if the environment has been 
'initialized' by another test or by a previous successful run.





[jira] [Commented] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM

2011-11-06 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144965#comment-13144965
 ] 

nkeywal commented on HBASE-4737:


@Mikhail Thank you for the zookeeper patch. TestHRegionInfo.testGetSetOfHTD is 
not related to the parallel runs; it can fail on an isolated run. I created 
HBASE-4753 for it.

 Split the tests in small/medium/large; allow small tests to be ran in 
 parallel within a single JVM
 --

 Key: HBASE-4737
 URL: https://issues.apache.org/jira/browse/HBASE-4737
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 2003_4737_pom.dummy.patch, 2003_4737_pom.patch, 
 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 
 2003_4737_pom.v2.patch, 2003_4737_pom.v2.patch, hbasetests.sh


 1) Split the tests into 3 categories
  - small: no cluster, less than 15s, can be run in parallel with other tests 
 in a JVM
  - medium: 45s, not flaky, useful to detect bugs immediately
  - large: remaining
  
 2) Allow running a subset: developers should need to run only small and 
 medium before submitting a patch
  - will need a surefire patch, see 
 http://jira.codehaus.org/browse/SUREFIRE-329
 Small is the default. All other tests will have to be marked Medium or Large 
 with a JUnit category.
 Proposed split:
 Small: 122 classes, 479 methods, ~3 minutes when not run in parallel
 Medium: 78 classes, 373 methods, ~23 minutes
 Large: 34 classes, 221 methods, ~60 minutes
 I will have to extract the methods that are today in large or medium but 
 could be in small (typically io.hfile.TestHFileBlock#testBlockHeapSize); this 
 will be done in a second step (and another JIRA).
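
 The "small is the default, everything else is marked" rule can be sketched in plain Java. This is a stand-in that uses a custom annotation and reflection; it is not the JUnit category API, and every name in it is hypothetical.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

class CategorySketch {
  enum Size { SMALL, MEDIUM, LARGE }

  /** Hypothetical marker playing the role of a test category. */
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.TYPE)
  @interface TestSize {
    Size value();
  }

  /** Unmarked test class: treated as small by default. */
  static class TestUnmarked {}

  @TestSize(Size.MEDIUM)
  static class TestMarkedMedium {}

  @TestSize(Size.LARGE)
  static class TestMarkedLarge {}

  /** Returns the declared size of a test class, or SMALL when it carries no marker. */
  static Size sizeOf(Class<?> testClass) {
    TestSize marker = testClass.getAnnotation(TestSize.class);
    return marker == null ? Size.SMALL : marker.value();
  }

  public static void main(String[] args) {
    System.out.println(sizeOf(TestUnmarked.class));
    System.out.println(sizeOf(TestMarkedMedium.class));
    System.out.println(sizeOf(TestMarkedLarge.class));
  }
}
```

 A runner that wants only the small and medium tests would filter on the value returned by sizeOf before executing a class.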
 MEDIUM LIST (name; number of methods; time in seconds)
 org.apache.hadoop.hbase.avro.TestAvroServer; 3; 31.468
 org.apache.hadoop.hbase.catalog.TestCatalogTracker; 8; 4.174
 org.apache.hadoop.hbase.catalog.TestMetaReaderEditorNoCluster; 1; 3.888
 org.apache.hadoop.hbase.catalog.TestMetaReaderEditor; 5; 30.157
 org.apache.hadoop.hbase.client.replication.TestReplicationAdmin; 1; 0.762
 org.apache.hadoop.hbase.client.TestHCM; 3; 21.961
 org.apache.hadoop.hbase.client.TestHTablePool; 18; 26.274
 org.apache.hadoop.hbase.client.TestHTableUtil; 2; 16.997
 org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD; 3; 24.629
 org.apache.hadoop.hbase.client.TestMetaScanner; 1; 16.365
 org.apache.hadoop.hbase.client.TestMultiParallel; 10; 34.077
 org.apache.hadoop.hbase.client.TestTimestampsFilter; 3; 27.547
 org.apache.hadoop.hbase.coprocessor.TestAggregateProtocol; 44; 16.834
 org.apache.hadoop.hbase.coprocessor.TestClassLoading; 5; 31.346
 org.apache.hadoop.hbase.coprocessor.TestCoprocessorEndpoint; 2; 32.736
 org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithAbort; 1; 13.874
 org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithRemove; 1; 16.923
 org.apache.hadoop.hbase.coprocessor.TestMasterObserver; 3; 29.97
 org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass; 2; 14.976
 org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface; 5; 33.353
 org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort; 1; 16.596
 org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithRemove; 1; 18.183
 org.apache.hadoop.hbase.coprocessor.TestWALObserver; 3; 19.373
 org.apache.hadoop.hbase.filter.TestColumnRangeFilter; 1; 19.045
 org.apache.hadoop.hbase.io.hfile.slab.TestSingleSizeCache; 5; 24.294
 org.apache.hadoop.hbase.io.hfile.slab.TestSlabCache; 7; 19.818
 org.apache.hadoop.hbase.io.hfile.TestHFileBlock; 7; 25.226
 org.apache.hadoop.hbase.io.hfile.TestLruBlockCache; 7; 0.343
 org.apache.hadoop.hbase.mapreduce.TestImportTsv; 8; 40.391
 org.apache.hadoop.hbase.master.TestActiveMasterManager; 2; 0.724
 org.apache.hadoop.hbase.master.TestHMasterRPCException; 1; 1.17
 org.apache.hadoop.hbase.master.TestLogsCleaner; 1; 2.953
 org.apache.hadoop.hbase.master.TestMaster; 1; 18.918
 org.apache.hadoop.hbase.master.TestOpenedRegionHandler; 2; 20.57
 org.apache.hadoop.hbase.master.TestSplitLogManager; 10; 13.979
 org.apache.hadoop.hbase.master.TestZKBasedOpenCloseRegion; 3; 21.675
 org.apache.hadoop.hbase.regionserver.handler.TestOpenRegionHandler; 3; 0.887
 org.apache.hadoop.hbase.regionserver.TestBlocksRead; 4; 1.42
 

[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144966#comment-13144966
 ] 

Hadoop QA commented on HBASE-4746:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12502651/D279.2.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 39 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -164 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 48 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hbase.master.TestDistributedLogSplitting
  org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort
  org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithRemove

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/195//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/195//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/195//console

This message is automatically generated.

 Use a random ZK client port in unit tests so we can run them in parallel
 

 Key: HBASE-4746
 URL: https://issues.apache.org/jira/browse/HBASE-4746
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
 Attachments: 4746-trunk-v2.txt, D255.1.patch, D279.1.patch, 
 D279.2.patch


 The hard-coded ZK client port has long been a problem for running the HBase test 
 suite in parallel. The mini ZK cluster should run on a random free port, and 
 that port should be passed to all parts of the unit tests that need to talk 
 to the mini cluster. In fact, randomizing the port exposes a lot of places in 
 the code where a new configuration is instantiated, and as a result the 
 client tries to talk to the default ZK client port and times out.
 The initial fix is for 0.89-fb, where it already allows unit tests to run in 
 parallel in 10 minutes. A fix for the trunk will follow.





[jira] [Commented] (HBASE-4511) There is data loss when master failovers

2011-11-06 Thread gaojinchao (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144982#comment-13144982
 ] 

gaojinchao commented on HBASE-4511:
---

+1

 There is data loss when master failovers
 

 Key: HBASE-4511
 URL: https://issues.apache.org/jira/browse/HBASE-4511
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0
Reporter: gaojinchao
Assignee: stack
Priority: Minor
 Fix For: 0.92.0

 Attachments: 4511-v2.txt, 4511.txt, 
 org.apache.hadoop.hbase.master.TestMasterFailover-output.rar, sketch.txt


 It goes like this:
 The master crashes; at the same time the RS hosting meta is crashing, but the 
 RS does not exit.
 The master starts up again and finds all living RSs.
 The master's verification of meta fails because this RS is crashing.
 The master reassigns meta, but it does not split the HLog.
 So some meta data is lost.
 Below are the logs from a failing failover test case.
 //The test says it wants to kill an RS
 2011-09-28 19:54:45,694 INFO  [Thread-988] regionserver.HRegionServer(1443): 
 STOPPED: Killing for unit test
 2011-09-28 19:54:45,694 INFO  [Thread-988] master.TestMasterFailover(1007): 
 RS 192.168.2.102,54385,1317264874629 killed 
 //Rs didn't crash. 
 2011-09-28 19:54:51,763 INFO  [Master:0;192.168.2.102,54557,1317264885720] 
 master.HMaster(458): Registering server found up in zk: 
 192.168.2.102,54385,1317264874629
 2011-09-28 19:54:51,763 INFO  [Master:0;192.168.2.102,54557,1317264885720] 
 master.ServerManager(232): Registering 
 server=192.168.2.102,54385,1317264874629
 2011-09-28 19:54:51,770 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 zookeeper.ZKUtil(491): master:54557-0x132b31adbb30005 Unable to get data of 
 znode /hbase/unassigned/1028785192 because node does not exist (not an error)
 2011-09-28 19:54:51,771 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) 
 of data from znode /hbase/root-region-server and set watcher; 
 192.168.2.102,54383,131726487...
 //Meta verification failed and the master reassigned meta, so all the regions in 
 meta are lost.
 2011-09-28 19:54:51,773 INFO  [Master:0;192.168.2.102,54557,1317264885720] 
 catalog.CatalogTracker(476): Failed verification of .META.,,1 at 
 address=192.168.2.102,54385,1317264874629; 
 org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: 
 org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 
 192.168.2.102,54385,1317264874629 not running, aborting
 2011-09-28 19:54:51,773 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 catalog.CatalogTracker(316): new .META. server: 
 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null
 2011-09-28 19:54:52,274 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) 
 of data from znode /hbase/root-region-server and set watcher; 
 192.168.2.102,54383,131726487...
 2011-09-28 19:54:52,277 INFO  [Master:0;192.168.2.102,54557,1317264885720] 
 catalog.CatalogTracker(476): Failed verification of .META.,,1 at 
 address=192.168.2.102,54385,1317264874629; 
 org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: 
 org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 
 192.168.2.102,54385,1317264874629 not running, aborting
 2011-09-28 19:54:52,277 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 catalog.CatalogTracker(316): new .META. server: 
 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null
 2011-09-28 19:54:52,778 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) 
 of data from znode /hbase/root-region-server and set watcher; 
 192.168.2.102,54383,131726487...
 2011-09-28 19:54:52,782 INFO  [Master:0;192.168.2.102,54557,1317264885720] 
 catalog.CatalogTracker(476): Failed verification of .META.,,1 at 
 address=192.168.2.102,54385,1317264874629; 
 org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: 
 org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 
 192.168.2.102,54385,1317264874629 not running, aborting
 2011-09-28 19:54:52,782 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 catalog.CatalogTracker(316): new .META. server: 
 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null
 2011-09-28 19:54:52,782 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 zookeeper.ZKAssign(264): master:54557-0x132b31adbb30005 Creating (or 
 updating) unassigned node for 1028785192 with OFFLINE state
 2011-09-28 19:54:52,825 DEBUG [Thread-988-EventThread] 
 zookeeper.ZooKeeperWatcher(233): master:54557-0x132b31adbb30005 Received 
 

[jira] [Commented] (HBASE-4753) org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws NPE on trunk

2011-11-06 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144983#comment-13144983
 ] 

Hadoop QA commented on HBASE-4753:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12502654/4753_FSTableDescriptors.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 javadoc.  The javadoc tool appears to have generated -164 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 48 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.master.TestMasterFailover
  org.apache.hadoop.hbase.master.TestDistributedLogSplitting

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/196//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/196//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/196//console

This message is automatically generated.

 org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws 
 NPE on trunk
 

 Key: HBASE-4753
 URL: https://issues.apache.org/jira/browse/HBASE-4753
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 4753_FSTableDescriptors.patch


 testGetSetOfHTD(org.apache.hadoop.hbase.regionserver.TestHRegionInfo)  Time 
 elapsed: 0.011 sec   ERROR!
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hbase.util.FSTableDescriptors.deleteTableDescriptorIfExists(FSTableDescriptors.java:433)
   at 
 org.apache.hadoop.hbase.regionserver.TestHRegionInfo.testGetSetOfHTD(TestHRegionInfo.java:72)
 Because the 'getTableInfoPath' can return null and it's not tested.
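
 The kind of null guard the description calls for might look like the sketch 
 below. It is self-contained and uses java.nio.file.Path as a stand-in for the 
 Hadoop filesystem API; TableDescriptorCleanup, findTableInfoPath and the 
 ".tableinfo" file name are assumptions made for the example, not the code in 
 FSTableDescriptors.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative stand-in; the real FSTableDescriptors works on Hadoop's FileSystem.
final class TableDescriptorCleanup {
  private TableDescriptorCleanup() {}

  // Hypothetical lookup that, like getTableInfoPath in the report, may return null.
  static Path findTableInfoPath(Path tableDir) {
    Path candidate = tableDir.resolve(".tableinfo");
    return Files.exists(candidate) ? candidate : null;
  }

  // Deletes the descriptor file if there is one; returns whether a file was deleted.
  static boolean deleteTableDescriptorIfExists(Path tableDir) {
    Path info = findTableInfoPath(tableDir);
    if (info == null) {
      // Nothing to delete: this is the case that previously led to the NPE.
      return false;
    }
    try {
      return Files.deleteIfExists(info);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```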





[jira] [Updated] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache

2011-11-06 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4752:
--

Status: Open  (was: Patch Available)

 Don't create an unnecessary LinkedList when evicting from the BlockCache
 

 Key: HBASE-4752
 URL: https://issues.apache.org/jira/browse/HBASE-4752
 Project: HBase
  Issue Type: Improvement
  Components: performance, regionserver
Affects Versions: 0.90.4
Reporter: Benoit Sigoure
Assignee: Benoit Sigoure
Priority: Minor
 Attachments: 
 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 
 4752-trunk.txt


 When evicting from the BlockCache, the code creates a LinkedList containing 
 every single block sorted by access time.  This list is created from a 
 PriorityQueue.  I don't believe it is necessary, as the PriorityQueue can be 
 used directly.
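
 To make the suggestion concrete, here is a minimal sketch of evicting by 
 polling a priority queue directly instead of first copying it into a sorted 
 list. It uses java.util.PriorityQueue and invented names (EvictionSketch, 
 Block, free); the actual BlockCache classes and their ordering are not 
 reproduced here.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Illustration only; class and field names are assumptions, not HBase code.
final class EvictionSketch {
  private EvictionSketch() {}

  static final class Block {
    final String name;
    final long accessTime;
    final long heapSize;

    Block(String name, long accessTime, long heapSize) {
      this.name = name;
      this.accessTime = accessTime;
      this.heapSize = heapSize;
    }
  }

  // Queue whose head is always the least recently accessed block.
  static PriorityQueue<Block> byAccessTime() {
    return new PriorityQueue<>(Comparator.comparingLong((Block b) -> b.accessTime));
  }

  // Polls blocks straight off the queue until at least toFree bytes are reclaimed.
  static long free(PriorityQueue<Block> queue, long toFree) {
    long freedBytes = 0;
    Block block;
    while (freedBytes < toFree && (block = queue.poll()) != null) {
      freedBytes += block.heapSize; // a real cache would evict the block here
    }
    return freedBytes;
  }
}
```

 Each poll() costs O(log n), so when only a few blocks need to be evicted this 
 avoids materializing and ordering every element up front.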





[jira] [Assigned] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache

2011-11-06 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4752:
-

Assignee: Ted Yu  (was: Benoit Sigoure)

 Don't create an unnecessary LinkedList when evicting from the BlockCache
 

 Key: HBASE-4752
 URL: https://issues.apache.org/jira/browse/HBASE-4752
 Project: HBase
  Issue Type: Improvement
  Components: performance, regionserver
Affects Versions: 0.90.4
Reporter: Benoit Sigoure
Assignee: Ted Yu
Priority: Minor
 Attachments: 
 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 
 4752-trunk.txt


 When evicting from the BlockCache, the code creates a LinkedList containing 
 every single block sorted by access time.  This list is created from a 
 PriorityQueue.  I don't believe it is necessary, as the PriorityQueue can be 
 used directly.





[jira] [Commented] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache

2011-11-06 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145004#comment-13145004
 ] 

Ted Yu commented on HBASE-4752:
---

The gist of CachedBlockQueue now lies in its add() method which does heap size 
accounting.
I think we should keep this class.

 Don't create an unnecessary LinkedList when evicting from the BlockCache
 

 Key: HBASE-4752
 URL: https://issues.apache.org/jira/browse/HBASE-4752
 Project: HBase
  Issue Type: Improvement
  Components: performance, regionserver
Affects Versions: 0.90.4
Reporter: Benoit Sigoure
Assignee: Ted Yu
Priority: Minor
 Attachments: 
 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 
 4752-trunk-v2.txt, 4752-trunk.txt


 When evicting from the BlockCache, the code creates a LinkedList containing 
 every single block sorted by access time.  This list is created from a 
 PriorityQueue.  I don't believe it is necessary, as the PriorityQueue can be 
 used directly.





[jira] [Updated] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache

2011-11-06 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4752:
--

Status: Patch Available  (was: Open)

Patch testing trunk v2.

 Don't create an unnecessary LinkedList when evicting from the BlockCache
 

 Key: HBASE-4752
 URL: https://issues.apache.org/jira/browse/HBASE-4752
 Project: HBase
  Issue Type: Improvement
  Components: performance, regionserver
Affects Versions: 0.90.4
Reporter: Benoit Sigoure
Assignee: Ted Yu
Priority: Minor
 Attachments: 
 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 
 4752-trunk-v2.txt, 4752-trunk.txt


 When evicting from the BlockCache, the code creates a LinkedList containing 
 every single block sorted by access time.  This list is created from a 
 PriorityQueue.  I don't believe it is necessary, as the PriorityQueue can be 
 used directly.





[jira] [Updated] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache

2011-11-06 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4752:
--

Attachment: 4752-trunk-v2.txt

Added initial size for MinMaxPriorityQueue.

Modified TestCachedBlockQueue to reflect its original intention. Also 
simplified the assertions with a loop.
TestCachedBlockQueue passes.

 Don't create an unnecessary LinkedList when evicting from the BlockCache
 

 Key: HBASE-4752
 URL: https://issues.apache.org/jira/browse/HBASE-4752
 Project: HBase
  Issue Type: Improvement
  Components: performance, regionserver
Affects Versions: 0.90.4
Reporter: Benoit Sigoure
Assignee: Ted Yu
Priority: Minor
 Attachments: 
 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 
 4752-trunk-v2.txt, 4752-trunk.txt


 When evicting from the BlockCache, the code creates a LinkedList containing 
 every single block sorted by access time.  This list is created from a 
 PriorityQueue.  I don't believe it is necessary, as the PriorityQueue can be 
 used directly.





[jira] [Commented] (HBASE-4740) [bulk load] the HBASE-4552 API can't tell if errors on region server is recoverable or unrecoverable error.

2011-11-06 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145008#comment-13145008
 ] 

Ted Yu commented on HBASE-4740:
---

Some test failures were due to 'Too many open files'

I couldn't reproduce TestThriftHBaseServiceHandler-related failures:
{code}
Running org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandler
Tests run: 11, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 15.595 sec

Results :

Tests run: 11, Failures: 0, Errors: 0, Skipped: 0

[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 21.351s
[INFO] Finished at: Sun Nov 06 05:56:09 PST 2011
{code}

 [bulk load]  the HBASE-4552 API can't tell if errors on region server is 
 recoverable or unrecoverable error.
 

 Key: HBASE-4740
 URL: https://issues.apache.org/jira/browse/HBASE-4740
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Priority: Blocker
 Fix For: 0.92.0

 Attachments: 
 0001-HBASE-4740-bulkload-HBASE-4552-API-can-t-tell-if-err.patch, 4740.v3, 
 hbase-4740.v2.patch


 Running TestHFileOutputFormat more frequently seems to show that it has 
 become flaky.   It is difficult to tell if this is because of an unrecoverable 
 failure or a recoverable failure.   To make this visible from tests and for 
 users, we need to make a change to the bulkload call's interface on 
 HRegionServer.  The change should make successful rpcs return true, 
 recoverable failures return false, and unrecoverable failure throw an 
 IOException.  This is an RPC change, so it would be really good to get this 
 api right before the final 0.92 goes out.
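
 The proposed contract (true on success, false on a recoverable failure, 
 IOException on an unrecoverable one) lets a caller decide when to retry. The 
 sketch below shows one possible client-side loop; BulkLoader, bulkLoad and 
 loadWithRetries are hypothetical names for this example and do not reflect the 
 actual HRegionServer RPC signature.

```java
import java.io.IOException;

// Illustration of the proposed return contract; names are assumptions.
final class BulkLoadRetrySketch {
  private BulkLoadRetrySketch() {}

  interface BulkLoader {
    // true: loaded; false: recoverable failure, retry; IOException: unrecoverable.
    boolean bulkLoad(String hfilePath) throws IOException;
  }

  // Returns the number of attempts it took to load the file.
  static int loadWithRetries(BulkLoader loader, String hfilePath, int maxAttempts)
      throws IOException {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      // An IOException from bulkLoad is unrecoverable and propagates immediately.
      if (loader.bulkLoad(hfilePath)) {
        return attempt;
      }
      // false means a recoverable failure, so simply try again.
    }
    throw new IOException("gave up after " + maxAttempts + " attempts: " + hfilePath);
  }
}
```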





[jira] [Assigned] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4746:
-

Assignee: Mikhail Bautin

 Use a random ZK client port in unit tests so we can run them in parallel
 

 Key: HBASE-4746
 URL: https://issues.apache.org/jira/browse/HBASE-4746
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 4746-trunk-v2.txt, D255.1.patch, D279.1.patch, 
 D279.2.patch


 The hard-coded ZK client port has long been a problem for running HBase test 
 suite in parallel. The mini ZK cluster should run on a random free port, and 
 that port should be passed to all parts of the unit tests that need to talk 
 to the mini cluster. In fact, randomizing the port exposes a lot of places in 
 the code where a new configuration is instantiated, and as a result the 
 client tries to talk to the default ZK client port and times out.
 The initial fix is for 0.89-fb, where it already allows unit tests to run in 
 parallel in 10 minutes. A fix for the trunk will follow.





[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145009#comment-13145009
 ] 

Ted Yu commented on HBASE-4746:
---

+1 on D279 patch v2.

Nice work, Mikhail.

 Use a random ZK client port in unit tests so we can run them in parallel
 

 Key: HBASE-4746
 URL: https://issues.apache.org/jira/browse/HBASE-4746
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 4746-trunk-v2.txt, D255.1.patch, D279.1.patch, 
 D279.2.patch


 The hard-coded ZK client port has long been a problem for running HBase test 
 suite in parallel. The mini ZK cluster should run on a random free port, and 
 that port should be passed to all parts of the unit tests that need to talk 
 to the mini cluster. In fact, randomizing the port exposes a lot of places in 
 the code where a new configuration is instantiated, and as a result the 
 client tries to talk to the default ZK client port and times out.
 The initial fix is for 0.89-fb, where it already allows unit tests to run in 
 parallel in 10 minutes. A fix for the trunk will follow.





[jira] [Commented] (HBASE-4753) org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws NPE on trunk

2011-11-06 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145010#comment-13145010
 ] 

Ted Yu commented on HBASE-4753:
---

+1 on patch.

 org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws 
 NPE on trunk
 

 Key: HBASE-4753
 URL: https://issues.apache.org/jira/browse/HBASE-4753
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 4753_FSTableDescriptors.patch


 testGetSetOfHTD(org.apache.hadoop.hbase.regionserver.TestHRegionInfo)  Time 
 elapsed: 0.011 sec   ERROR!
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hbase.util.FSTableDescriptors.deleteTableDescriptorIfExists(FSTableDescriptors.java:433)
   at 
 org.apache.hadoop.hbase.regionserver.TestHRegionInfo.testGetSetOfHTD(TestHRegionInfo.java:72)
 Because the 'getTableInfoPath' can return null and it's not tested.





[jira] [Commented] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache

2011-11-06 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145025#comment-13145025
 ] 

Hadoop QA commented on HBASE-4752:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12502664/4752-trunk-v2.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -164 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 48 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandler
  org.apache.hadoop.hbase.coprocessor.TestMasterObserver
  org.apache.hadoop.hbase.master.TestDistributedLogSplitting
  org.apache.hadoop.hbase.master.TestMasterFailover

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/197//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/197//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/197//console

This message is automatically generated.

 Don't create an unnecessary LinkedList when evicting from the BlockCache
 

 Key: HBASE-4752
 URL: https://issues.apache.org/jira/browse/HBASE-4752
 Project: HBase
  Issue Type: Improvement
  Components: performance, regionserver
Affects Versions: 0.90.4
Reporter: Benoit Sigoure
Assignee: Ted Yu
Priority: Minor
 Attachments: 
 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 
 4752-trunk-v2.txt, 4752-trunk.txt


 When evicting from the BlockCache, the code creates a LinkedList containing 
 every single block sorted by access time.  This list is created from a 
 PriorityQueue.  I don't believe it is necessary, as the PriorityQueue can be 
 used directly.





[jira] [Updated] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache

2011-11-06 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4752:
--

Fix Version/s: 0.94.0

 Don't create an unnecessary LinkedList when evicting from the BlockCache
 

 Key: HBASE-4752
 URL: https://issues.apache.org/jira/browse/HBASE-4752
 Project: HBase
  Issue Type: Improvement
  Components: performance, regionserver
Affects Versions: 0.90.4
Reporter: Benoit Sigoure
Assignee: Ted Yu
Priority: Minor
 Fix For: 0.94.0

 Attachments: 
 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 
 4752-trunk-v2.txt, 4752-trunk.txt


 When evicting from the BlockCache, the code creates a LinkedList containing 
 every single block sorted by access time.  This list is created from a 
 PriorityQueue.  I don't believe it is necessary, as the PriorityQueue can be 
 used directly.





[jira] [Commented] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache

2011-11-06 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145034#comment-13145034
 ] 

Ted Yu commented on HBASE-4752:
---

I ran TestMasterObserver#testRegionTransitionOperations and it passed.
Test failures didn't seem to be related to the patch in this JIRA.

 Don't create an unnecessary LinkedList when evicting from the BlockCache
 

 Key: HBASE-4752
 URL: https://issues.apache.org/jira/browse/HBASE-4752
 Project: HBase
  Issue Type: Improvement
  Components: performance, regionserver
Affects Versions: 0.90.4
Reporter: Benoit Sigoure
Assignee: Ted Yu
Priority: Minor
 Fix For: 0.94.0

 Attachments: 
 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 
 4752-trunk-v2.txt, 4752-trunk.txt


 When evicting from the BlockCache, the code creates a LinkedList containing 
 every single block sorted by access time.  This list is created from a 
 PriorityQueue.  I don't believe it is necessary, as the PriorityQueue can be 
 used directly.





[jira] [Commented] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache

2011-11-06 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145039#comment-13145039
 ] 

Lars Hofhansl commented on HBASE-4752:
--

One part is still different: in the old implementation, LruBlockCache.free() 
always drained the entire queue. Now we only drain until we have evicted enough.
Not sure whether that was intentional or not, but to be functionally equivalent 
we should do this:
{code}
while(...) {
  ...
  if (freedBytes >= toFree) {
    queue.clear();
    return freedBytes;
  }
}
queue.clear();
...
{code}

Or similar (or clear() in a finally block).
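
The finally-block variant could be sketched as follows. This is a standalone 
example with invented names (DrainingFree, a queue of block sizes); it is not 
the LruBlockCache code, only the shape of the control flow.

```java
import java.util.Queue;

// Illustration only: a queue of block sizes stands in for the cached-block queue.
final class DrainingFree {
  private DrainingFree() {}

  // Frees until toFree bytes are reclaimed, then empties the queue in all cases.
  static long free(Queue<Long> blockSizes, long toFree) {
    long freedBytes = 0;
    try {
      Long size;
      while (freedBytes < toFree && (size = blockSizes.poll()) != null) {
        freedBytes += size;
      }
      return freedBytes;
    } finally {
      // Runs on early return and on exceptions, matching the old "always drained" behavior.
      blockSizes.clear();
    }
  }
}
```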


 Don't create an unnecessary LinkedList when evicting from the BlockCache
 

 Key: HBASE-4752
 URL: https://issues.apache.org/jira/browse/HBASE-4752
 Project: HBase
  Issue Type: Improvement
  Components: performance, regionserver
Affects Versions: 0.90.4
Reporter: Benoit Sigoure
Assignee: Ted Yu
Priority: Minor
 Fix For: 0.94.0

 Attachments: 
 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 
 4752-trunk-v2.txt, 4752-trunk.txt


 When evicting from the BlockCache, the code creates a LinkedList containing 
 every single block sorted by access time.  This list is created from a 
 PriorityQueue.  I don't believe it is necessary, as the PriorityQueue can be 
 used directly.





[jira] [Commented] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache

2011-11-06 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145046#comment-13145046
 ] 

Ted Yu commented on HBASE-4752:
---

bq. Now we only drain until we have evicted enough.
I would say this should be the correct behavior.

 Don't create an unnecessary LinkedList when evicting from the BlockCache
 

 Key: HBASE-4752
 URL: https://issues.apache.org/jira/browse/HBASE-4752
 Project: HBase
  Issue Type: Improvement
  Components: performance, regionserver
Affects Versions: 0.90.4
Reporter: Benoit Sigoure
Assignee: Ted Yu
Priority: Minor
 Fix For: 0.94.0

 Attachments: 
 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 
 4752-trunk-v2.txt, 4752-trunk.txt


 When evicting from the BlockCache, the code creates a LinkedList containing 
 every single block sorted by access time.  This list is created from a 
 PriorityQueue.  I don't believe it is necessary, as the PriorityQueue can be 
 used directly.





[jira] [Commented] (HBASE-4742) Split dead server's log in parallel

2011-11-06 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145050#comment-13145050
 ] 

Phabricator commented on HBASE-4742:


Liyin has commented on the revision [jira] [HBASE-4742] Split dead server's 
log in parallel.

  Thanks for reviewing it over the weekend.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java:311 OK, I can start 
with 1 instead of 0.
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java:314 I don't think 
it is necessary here, because the core thread number is 0:
  no thread launched by this thread pool belongs to the core pool.
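
  For readers following the two inline comments, here is a small sketch of such 
a pool using java.util.concurrent.ThreadPoolExecutor. The class name, thread 
name prefix, and 60-second keep-alive are assumptions for illustration and are 
not taken from the D237 patch.

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Illustration only; names and parameters are assumptions, not the patch's code.
final class LogSplitPoolSketch {
  private LogSplitPoolSketch() {}

  static ThreadPoolExecutor newPool(int maxThreads) {
    ThreadFactory factory = new ThreadFactory() {
      // Thread index starts from 1, as suggested in the review.
      private final AtomicInteger next = new AtomicInteger(1);

      @Override
      public Thread newThread(Runnable r) {
        Thread t = new Thread(r, "LogSplitThread-" + next.getAndIncrement());
        t.setDaemon(true);
        return t;
      }
    };
    // corePoolSize is 0, so every worker is a non-core thread and exits after
    // being idle for the keep-alive time, without allowCoreThreadTimeOut(true).
    return new ThreadPoolExecutor(0, maxThreads, 60L, TimeUnit.SECONDS,
        new SynchronousQueue<Runnable>(), factory);
  }
}
```

  With a SynchronousQueue, tasks are handed straight to a thread, and a 
submission is rejected once maxThreads tasks are already running, so 
maxThreads would need to be at least the number of servers being split at once.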

REVISION DETAIL
  https://reviews.facebook.net/D237


 Split dead server's log in parallel
 ---

 Key: HBASE-4742
 URL: https://issues.apache.org/jira/browse/HBASE-4742
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: D237.1.patch, D237.2.patch, D237.3.patch, D237.4.patch, 
 D237.5.patch


 When one region server goes down, the master will shut down the region server 
 and split its log.
 However, splitting the log is a blocking call and it takes some time.
 If more than one region server goes down, the master will split their logs one 
 by one, which is not efficient.
 Since we have the distributed log split, we could split these logs from the 
 dead servers in parallel. 





[jira] [Commented] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache

2011-11-06 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145054#comment-13145054
 ] 

Lars Hofhansl commented on HBASE-4752:
--

Hard to say, but I would agree. Maybe jgray can chime in?




 Don't create an unnecessary LinkedList when evicting from the BlockCache
 

 Key: HBASE-4752
 URL: https://issues.apache.org/jira/browse/HBASE-4752
 Project: HBase
  Issue Type: Improvement
  Components: performance, regionserver
Affects Versions: 0.90.4
Reporter: Benoit Sigoure
Assignee: Ted Yu
Priority: Minor
 Fix For: 0.94.0

 Attachments: 
 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 
 4752-trunk-v2.txt, 4752-trunk.txt


 When evicting from the BlockCache, the code creates a LinkedList containing 
 every single block sorted by access time.  This list is created from a 
 PriorityQueue.  I don't believe it is necessary, as the PriorityQueue can be 
 used directly.





[jira] [Commented] (HBASE-4742) Split dead server's log in parallel

2011-11-06 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145059#comment-13145059
 ] 

Phabricator commented on HBASE-4742:


khemani has commented on the revision [jira] [HBASE-4742] Split dead server's 
log in parallel.

  If you call splitLog(list-of-servernames), the function won't return until 
all the servers' logs are split.

  If you call splitLog(servername) multiple times, each call returns as soon as 
it is done. This is what we are doing now and we should stay with it.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java:305 managing 
threads through a thread pool is a good thing. But I doubt that there is any 
scenario where the number of threads will increase without limit. The number of 
threads will always be limited by the number of region servers. And a new 
incarnation of a region server is not instantiated by the master unless the old 
one's log splitting is done.

  But I think we should still use the thread pool. It is a better abstraction.
  src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java:304 
Can you please do a new Runnable inline over here. That will make it easier for 
someone to read ProcessServerShutdown() code. Having two methods run() and 
process() kind of makes it confusing to understand ProcessServerShutdown.
  src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java:323 
This should be LOG.debug. You might consider rate limiting this debug output.
  src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java:333 
This should be LOG.warn
  src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java:339 
Runtime.halt()? Even if you throw an exception the retry mechanism will kick in.
  src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java:304 
If master shutdown is called (abort or regular shutdown) then I am not sure 
that the pool is shutting down properly. Someone has to interrupt each of the 
threads that have called splitLog(). Or those splitLog() threads have to be 
daemons.
  src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java:316 
Now that you have a thread pool you can look at the status of the submitted 
tasks instead of maintaining  special states :)
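
The review points above (a bounded pool, daemon threads so shutdown is not blocked, idle threads timing out, and reading task status from the pool instead of tracking special states) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the patch: `LogSplitPoolSketch`, `newPool`, `splitAll`, the thread name prefix, and the placeholder `splitLog` body are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class LogSplitPoolSketch {

  // Builds a bounded pool whose threads are daemons and exit when idle.
  static ThreadPoolExecutor newPool(int maxThreads) {
    final AtomicInteger index = new AtomicInteger(0);
    ThreadFactory factory = r -> {
      // incrementAndGet() makes the thread index start at 1.
      Thread t = new Thread(r, "log-split-" + index.incrementAndGet());
      t.setDaemon(true); // daemon threads do not keep the JVM alive on shutdown
      return t;
    };
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        maxThreads, maxThreads, 60L, TimeUnit.SECONDS,
        new LinkedBlockingQueue<Runnable>(), factory);
    pool.allowCoreThreadTimeOut(true); // let idle core threads terminate
    return pool;
  }

  // Hypothetical placeholder for the blocking splitLog(servername) call.
  static String splitLog(String serverName) {
    return "split:" + serverName;
  }

  // Submits one task per dead server; each task finishes independently.
  // Waiting on every Future here is only for the demo.
  static List<String> splitAll(List<String> deadServers) {
    ThreadPoolExecutor pool = newPool(Math.max(1, deadServers.size()));
    try {
      List<Future<String>> futures = new ArrayList<>();
      for (String server : deadServers) {
        Callable<String> task = () -> splitLog(server);
        futures.add(pool.submit(task));
      }
      List<String> results = new ArrayList<>();
      for (Future<String> f : futures) {
        results.add(f.get()); // f.isDone() could be polled instead
      }
      return results;
    } catch (Exception e) {
      throw new RuntimeException(e);
    } finally {
      pool.shutdownNow(); // interrupts any task that is still running
    }
  }
}
```

A caller that must not block could keep the returned `Future` objects and check `isDone()` or `isCancelled()` on each one, which is one way to read task status from the pool rather than maintaining separate state.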


REVISION DETAIL
  https://reviews.facebook.net/D237


 Split dead server's log in parallel
 ---

 Key: HBASE-4742
 URL: https://issues.apache.org/jira/browse/HBASE-4742
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: D237.1.patch, D237.2.patch, D237.3.patch, D237.4.patch, 
 D237.5.patch


 When one region server goes down, the master will shut down the region server 
 and split its log.
 However, splitting the log is a blocking call and it takes some time.
 If more than one region server goes down, the master will split their logs one 
 by one, which is not efficient.
 Since we have distributed log splitting, we could split the logs from the 
 dead servers in parallel. 





[jira] [Updated] (HBASE-4742) Split dead server's log in parallel

2011-11-06 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-4742:
---

Attachment: D237.6.patch

Liyin updated the revision [jira] [HBASE-4742] Split dead server's log in 
parallel.
Reviewers: Kannan, khemani, Karthik, mbautin, JIRA

  Addressed Prakash's and Ted's comments.
  Thanks for reviewing over the weekend.

REVISION DETAIL
  https://reviews.facebook.net/D237

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java
  src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java
  src/main/java/org/apache/hadoop/hbase/master/RegionServerOperationQueue.java
  
src/test/java/org/apache/hadoop/hbase/master/TestMultiRegionServerShutDown.java


 Split dead server's log in parallel
 ---

 Key: HBASE-4742
 URL: https://issues.apache.org/jira/browse/HBASE-4742
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: D237.1.patch, D237.2.patch, D237.3.patch, D237.4.patch, 
 D237.5.patch, D237.6.patch


 When one region server goes down, the master will shut down the region server 
 and split its log.
 However, splitting the log is a blocking call and it takes some time.
 If more than one region server goes down, the master will split their logs one 
 by one, which is not efficient.
 Since we have distributed log splitting, we could split the logs from the 
 dead servers in parallel. 





[jira] [Commented] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache

2011-11-06 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145089#comment-13145089
 ] 

Lars Hofhansl commented on HBASE-4752:
--

+1 still on v2




 Don't create an unnecessary LinkedList when evicting from the BlockCache
 

 Key: HBASE-4752
 URL: https://issues.apache.org/jira/browse/HBASE-4752
 Project: HBase
  Issue Type: Improvement
  Components: performance, regionserver
Affects Versions: 0.90.4
Reporter: Benoit Sigoure
Assignee: Ted Yu
Priority: Minor
 Fix For: 0.94.0

 Attachments: 
 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 
 4752-trunk-v2.txt, 4752-trunk.txt


 When evicting from the BlockCache, the code creates a LinkedList containing 
 every single block sorted by access time.  This list is created from a 
 PriorityQueue.  I don't believe it is necessary, as the PriorityQueue can be 
 used directly.





[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Mikhail Bautin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145098#comment-13145098
 ] 

Mikhail Bautin commented on HBASE-4746:
---

@Ted: how did you generate 4746-trunk-v2.txt? When diffing D279.2.patch against 
4746-trunk-v2.txt, I get a lot of differences.
Can we use the D279.2.patch? The default ThriftHBaseServiceHandler constructor 
is not needed anymore in ThriftServer, because I instantiated the configuration 
in ThriftServer on constructor invocation. But if you think I should add the 
default constructor back (e.g. if it is part of the client API), please let me 
know and I will upload a D279.3.patch.

Thank you!
--Mikhail

 Use a random ZK client port in unit tests so we can run them in parallel
 

 Key: HBASE-4746
 URL: https://issues.apache.org/jira/browse/HBASE-4746
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 4746-trunk-v2.txt, D255.1.patch, D279.1.patch, 
 D279.2.patch


 The hard-coded ZK client port has long been a problem for running the HBase 
 test suite in parallel. The mini ZK cluster should run on a random free port, 
 and that port should be passed to all parts of the unit tests that need to talk 
 to the mini cluster. In fact, randomizing the port exposes a lot of places in 
 the code where a new configuration is instantiated, and as a result the 
 client tries to talk to the default ZK client port and times out.
 The initial fix is for 0.89-fb, where it already allows unit tests to run in 
 parallel in 10 minutes. A fix for the trunk will follow.





[jira] [Updated] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM

2011-11-06 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4737:
---

Attachment: 4737_all.patch

impacts on java files. Just JUnit categories.

 Split the tests in small/medium/large; allow small tests to be ran in 
 parallel within a single JVM
 --

 Key: HBASE-4737
 URL: https://issues.apache.org/jira/browse/HBASE-4737
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 2003_4737_pom.dummy.patch, 2003_4737_pom.patch, 
 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 
 2003_4737_pom.v2.patch, 2003_4737_pom.v2.patch, 4737_all.patch, 
 hbasetests.sh


 1) Split the tests into 3 categories
  - small: no cluster, less than 15s, can be run in parallel with other tests 
 in a JVM
  - medium: 45s, not flaky, useful to detect bugs immediately
  - large: remaining
  
 2) Allow running a subset: developers should need to run only small and 
 medium before submitting a patch
  - will need a surefire patch, see 
 http://jira.codehaus.org/browse/SUREFIRE-329
 Small is the default. All other tests will have to be marked Medium or Large 
 with a JUnit category.
 Proposed split:
 Small: 122 classes, 479 methods, ~3 minutes (when not run in parallel)
 Medium: 78 classes, 373 methods, ~23 minutes
 Large: 34 classes, 221 methods, ~60 minutes
 I will have to extract the methods that are today in large or medium but 
 could be in small (typically io.hfile.TestHFileBlock#testBlockHeapSize). This 
 will be done in a second step (and another JIRA).
 MEDIUM LIST (name; number of methods, time)
 org.apache.hadoop.hbase.avro.TestAvroServer   3   31.468
 org.apache.hadoop.hbase.catalog.TestCatalogTracker   8   4.174
 org.apache.hadoop.hbase.catalog.TestMetaReaderEditorNoCluster   1   3.888
 org.apache.hadoop.hbase.catalog.TestMetaReaderEditor   5   30.157
 org.apache.hadoop.hbase.client.replication.TestReplicationAdmin   1   0.762
 org.apache.hadoop.hbase.client.TestHCM   3   21.961
 org.apache.hadoop.hbase.client.TestHTablePool   18   26.274
 org.apache.hadoop.hbase.client.TestHTableUtil   2   16.997
 org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD   3   24.629
 org.apache.hadoop.hbase.client.TestMetaScanner   1   16.365
 org.apache.hadoop.hbase.client.TestMultiParallel   10   34.077
 org.apache.hadoop.hbase.client.TestTimestampsFilter   3   27.547
 org.apache.hadoop.hbase.coprocessor.TestAggregateProtocol   44   16.834
 org.apache.hadoop.hbase.coprocessor.TestClassLoading   5   31.346
 org.apache.hadoop.hbase.coprocessor.TestCoprocessorEndpoint   2   32.736
 org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithAbort   1   13.874
 org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithRemove   1   16.923
 org.apache.hadoop.hbase.coprocessor.TestMasterObserver   3   29.97
 org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass   2   14.976
 org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface   5   33.353
 org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort   1   16.596
 org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithRemove   1   18.183
 org.apache.hadoop.hbase.coprocessor.TestWALObserver   3   19.373
 org.apache.hadoop.hbase.filter.TestColumnRangeFilter   1   19.045
 org.apache.hadoop.hbase.io.hfile.slab.TestSingleSizeCache   5   24.294
 org.apache.hadoop.hbase.io.hfile.slab.TestSlabCache   7   19.818
 org.apache.hadoop.hbase.io.hfile.TestHFileBlock   7   25.226
 org.apache.hadoop.hbase.io.hfile.TestLruBlockCache   7   0.343
 org.apache.hadoop.hbase.mapreduce.TestImportTsv   8   40.391
 org.apache.hadoop.hbase.master.TestActiveMasterManager   2   0.724
 org.apache.hadoop.hbase.master.TestHMasterRPCException   1   1.17
 org.apache.hadoop.hbase.master.TestLogsCleaner   1   2.953
 org.apache.hadoop.hbase.master.TestMaster   1   18.918
 org.apache.hadoop.hbase.master.TestOpenedRegionHandler   2   20.57
 org.apache.hadoop.hbase.master.TestSplitLogManager   10   13.979
 org.apache.hadoop.hbase.master.TestZKBasedOpenCloseRegion   3   21.675
 org.apache.hadoop.hbase.regionserver.handler.TestOpenRegionHandler   3   0.887
 org.apache.hadoop.hbase.regionserver.TestBlocksRead   4   1.42
 org.apache.hadoop.hbase.regionserver.TestCompoundBloomFilter   3   22.694
 org.apache.hadoop.hbase.regionserver.TestFSErrorsExposed   3   29.764
 

[jira] [Updated] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM

2011-11-06 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4737:
---

Status: Patch Available  (was: Open)

 Split the tests in small/medium/large; allow small tests to be ran in 
 parallel within a single JVM
 --

 Key: HBASE-4737
 URL: https://issues.apache.org/jira/browse/HBASE-4737
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 2003_4737_pom.dummy.patch, 2003_4737_pom.patch, 
 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 
 2003_4737_pom.v2.patch, 2003_4737_pom.v2.patch, 4737_all.patch, 
 hbasetests.sh


 1) Split the tests into 3 categories
  - small: no cluster, less than 15s, can be run in parallel with other tests 
 in a JVM
  - medium: 45s, not flaky, useful to detect bugs immediately
  - large: remaining
  
 2) Allow running a subset: developers should need to run only small and 
 medium before submitting a patch
  - will need a surefire patch, see 
 http://jira.codehaus.org/browse/SUREFIRE-329
 Small is the default. All other tests will have to be marked Medium or Large 
 with a JUnit category.
 Proposed split:
 Small: 122 classes, 479 methods, ~3 minutes (when not run in parallel)
 Medium: 78 classes, 373 methods, ~23 minutes
 Large: 34 classes, 221 methods, ~60 minutes
 I will have to extract the methods that are today in large or medium but 
 could be in small (typically io.hfile.TestHFileBlock#testBlockHeapSize). This 
 will be done in a second step (and another JIRA).
 MEDIUM LIST (name; number of methods, time)
 org.apache.hadoop.hbase.avro.TestAvroServer   3   31.468
 org.apache.hadoop.hbase.catalog.TestCatalogTracker   8   4.174
 org.apache.hadoop.hbase.catalog.TestMetaReaderEditorNoCluster   1   3.888
 org.apache.hadoop.hbase.catalog.TestMetaReaderEditor   5   30.157
 org.apache.hadoop.hbase.client.replication.TestReplicationAdmin   1   0.762
 org.apache.hadoop.hbase.client.TestHCM   3   21.961
 org.apache.hadoop.hbase.client.TestHTablePool   18   26.274
 org.apache.hadoop.hbase.client.TestHTableUtil   2   16.997
 org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD   3   24.629
 org.apache.hadoop.hbase.client.TestMetaScanner   1   16.365
 org.apache.hadoop.hbase.client.TestMultiParallel   10   34.077
 org.apache.hadoop.hbase.client.TestTimestampsFilter   3   27.547
 org.apache.hadoop.hbase.coprocessor.TestAggregateProtocol   44   16.834
 org.apache.hadoop.hbase.coprocessor.TestClassLoading   5   31.346
 org.apache.hadoop.hbase.coprocessor.TestCoprocessorEndpoint   2   32.736
 org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithAbort   1   13.874
 org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithRemove   1   16.923
 org.apache.hadoop.hbase.coprocessor.TestMasterObserver   3   29.97
 org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass   2   14.976
 org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface   5   33.353
 org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort   1   16.596
 org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithRemove   1   18.183
 org.apache.hadoop.hbase.coprocessor.TestWALObserver   3   19.373
 org.apache.hadoop.hbase.filter.TestColumnRangeFilter   1   19.045
 org.apache.hadoop.hbase.io.hfile.slab.TestSingleSizeCache   5   24.294
 org.apache.hadoop.hbase.io.hfile.slab.TestSlabCache   7   19.818
 org.apache.hadoop.hbase.io.hfile.TestHFileBlock   7   25.226
 org.apache.hadoop.hbase.io.hfile.TestLruBlockCache   7   0.343
 org.apache.hadoop.hbase.mapreduce.TestImportTsv   8   40.391
 org.apache.hadoop.hbase.master.TestActiveMasterManager   2   0.724
 org.apache.hadoop.hbase.master.TestHMasterRPCException   1   1.17
 org.apache.hadoop.hbase.master.TestLogsCleaner   1   2.953
 org.apache.hadoop.hbase.master.TestMaster   1   18.918
 org.apache.hadoop.hbase.master.TestOpenedRegionHandler   2   20.57
 org.apache.hadoop.hbase.master.TestSplitLogManager   10   13.979
 org.apache.hadoop.hbase.master.TestZKBasedOpenCloseRegion   3   21.675
 org.apache.hadoop.hbase.regionserver.handler.TestOpenRegionHandler   3   0.887
 org.apache.hadoop.hbase.regionserver.TestBlocksRead   4   1.42
 org.apache.hadoop.hbase.regionserver.TestCompoundBloomFilter   3   22.694
 org.apache.hadoop.hbase.regionserver.TestFSErrorsExposed   3   29.764
 org.apache.hadoop.hbase.regionserver.TestHRegion  57  

[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Ted Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145107#comment-13145107
 ] 

Ted Yu commented on HBASE-4746:
---

@Mikhail:
I planned to use D279.2.patch since its patch testing results were clean.

Omitting default ThriftHBaseServiceHandler ctor is also good since the caller 
has to pass Configuration parameter.

I generated 4746-trunk-v2.txt by applying D279.1.patch, adding the default ctor 
and using 'svn diff ' command.

 Use a random ZK client port in unit tests so we can run them in parallel
 

 Key: HBASE-4746
 URL: https://issues.apache.org/jira/browse/HBASE-4746
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 4746-trunk-v2.txt, D255.1.patch, D279.1.patch, 
 D279.2.patch


 The hard-coded ZK client port has long been a problem for running the HBase 
 test suite in parallel. The mini ZK cluster should run on a random free port, 
 and that port should be passed to all parts of the unit tests that need to talk 
 to the mini cluster. In fact, randomizing the port exposes a lot of places in 
 the code where a new configuration is instantiated, and as a result the 
 client tries to talk to the default ZK client port and times out.
 The initial fix is for 0.89-fb, where it already allows unit tests to run in 
 parallel in 10 minutes. A fix for the trunk will follow.





[jira] [Commented] (HBASE-4742) Split dead server's log in parallel

2011-11-06 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145109#comment-13145109
 ] 

Phabricator commented on HBASE-4742:


lhofhansl has commented on the revision [jira] [HBASE-4742] Split dead 
server's log in parallel.

  Very nice improvement!

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java:285 
This method is not used anywhere, correct?

REVISION DETAIL
  https://reviews.facebook.net/D237


 Split dead server's log in parallel
 ---

 Key: HBASE-4742
 URL: https://issues.apache.org/jira/browse/HBASE-4742
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: D237.1.patch, D237.2.patch, D237.3.patch, D237.4.patch, 
 D237.5.patch, D237.6.patch


 When one region server goes down, the master will shut down the region server 
 and split its log.
 However, splitting the log is a blocking call and it takes some time.
 If more than one region server goes down, the master will split their logs one 
 by one, which is not efficient.
 Since we have distributed log splitting, we could split the logs from the 
 dead servers in parallel. 





[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145113#comment-13145113
 ] 

Lars Hofhansl commented on HBASE-4746:
--

+1 on D279.2. Thanks for doing this Mikhail.

{code}
public int getRandomPort() {
  return 0xc000 + new Random().nextInt(0x3f00);
}
{code}
Could use a comment on the choice of values.

Why did the statics have to be removed from TestMasterReplication but not from 
TestReplication and TestMultislaveReplication (and I do apologize for the 
cut-and-pasting here)?
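
On the choice of values in the getRandomPort snippet: 0xc000 is 49152, the first port of the IANA dynamic/private range (49152 to 65535), and 0x3f00 is 16128, so the method returns a port in [49152, 65279]. A sketch with those values named and commented might look like the following. The class and constant names are hypothetical, and the rationale in the comments is an inference from the numbers, not something stated in the patch.

```java
import java.util.Random;

final class RandomPortSketch {

  // 0xc000 = 49152: start of the IANA dynamic/private port range.
  static final int MIN_PORT = 0xc000;

  // 0x3f00 = 16128 candidate ports, so the highest result is 65279,
  // which stays below the 65535 upper limit for TCP ports.
  static final int PORT_SPAN = 0x3f00;

  // Returns a pseudo-random port in [MIN_PORT, MIN_PORT + PORT_SPAN - 1].
  // The port is not checked for availability; the caller must still
  // handle a bind failure and retry.
  static int getRandomPort(Random random) {
    return MIN_PORT + random.nextInt(PORT_SPAN);
  }
}
```

Passing the `Random` in, rather than creating a new one per call as the snippet does, is a design choice made here only so the sketch can be exercised with a fixed seed.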


 Use a random ZK client port in unit tests so we can run them in parallel
 

 Key: HBASE-4746
 URL: https://issues.apache.org/jira/browse/HBASE-4746
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 4746-trunk-v2.txt, D255.1.patch, D279.1.patch, 
 D279.2.patch


 The hard-coded ZK client port has long been a problem for running the HBase 
 test suite in parallel. The mini ZK cluster should run on a random free port, 
 and that port should be passed to all parts of the unit tests that need to talk 
 to the mini cluster. In fact, randomizing the port exposes a lot of places in 
 the code where a new configuration is instantiated, and as a result the 
 client tries to talk to the default ZK client port and times out.
 The initial fix is for 0.89-fb, where it already allows unit tests to run in 
 parallel in 10 minutes. A fix for the trunk will follow.





[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145116#comment-13145116
 ] 

Phabricator commented on HBASE-4746:


stack has accepted the revision [jira] [HBASE-4746] [89-fb] Use a random ZK 
client port in unit tests so we can run them in parallel.

  Very nice Mikhail.  Minor nits.  Please attach patch to JIRA so we can 
commit.  If you have a moment, perhaps address the above else we'll do it on 
commit.  Good stuff.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/MiniZooKeeperCluster.java:113 I'd think 
this log message is rare since the port is randomly chosen.  When it does 
happen, I think it should be noticeable so I'm fine w/ INFO.  DEBUG would work 
too.
  src/main/java/org/apache/hadoop/hbase/avro/AvroServer.java:142 Good
  src/main/java/org/apache/hadoop/hbase/client/HTable.java:90 +1
  src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java:200 Deprecate?
  src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:46 Why avro 
package stuff in here?
  src/test/resources/hbase-site.xml:128 Good catch.

REVISION DETAIL
  https://reviews.facebook.net/D255


 Use a random ZK client port in unit tests so we can run them in parallel
 

 Key: HBASE-4746
 URL: https://issues.apache.org/jira/browse/HBASE-4746
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 4746-trunk-v2.txt, D255.1.patch, D279.1.patch, 
 D279.2.patch


 The hard-coded ZK client port has long been a problem for running the HBase 
 test suite in parallel. The mini ZK cluster should run on a random free port, 
 and that port should be passed to all parts of the unit tests that need to talk 
 to the mini cluster. In fact, randomizing the port exposes a lot of places in 
 the code where a new configuration is instantiated, and as a result the 
 client tries to talk to the default ZK client port and times out.
 The initial fix is for 0.89-fb, where it already allows unit tests to run in 
 parallel in 10 minutes. A fix for the trunk will follow.





[jira] [Updated] (HBASE-3939) Some crossports of Hadoop IPC fixes

2011-11-06 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-3939:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed branch and trunk.  Thanks to all you who helped get this in.

 Some crossports of Hadoop IPC fixes
 ---

 Key: HBASE-3939
 URL: https://issues.apache.org/jira/browse/HBASE-3939
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Priority: Critical
 Fix For: 0.92.0

 Attachments: 3939-v2.txt, 3939-v3.txt, 3939-v4.txt, 3939-v5.txt, 
 3939-v6.txt, 3939-v7.txt, 3939-v8.txt, 3939-v9.txt, 3939.txt


 A few fixes from Hadoop IPC that we should probably cross-port into our copy:
 - HADOOP-7227: remove the protocol version check at call time
 - HADOOP-7146: fix a socket leak in server
 - HADOOP-7121: fix behavior when response serialization throws an exception
 - HADOOP-7346: send back nicer error response when client is using an out of 
 date IPC version





[jira] [Commented] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache

2011-11-06 Thread Benoit Sigoure (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145127#comment-13145127
 ] 

Benoit Sigoure commented on HBASE-4752:
---

+1 too.  Sorry my initial patch was getting entries in the reverse order.  As I 
said, I couldn't test it because branch 0.90 didn't even compile for me.

 Don't create an unnecessary LinkedList when evicting from the BlockCache
 

 Key: HBASE-4752
 URL: https://issues.apache.org/jira/browse/HBASE-4752
 Project: HBase
  Issue Type: Improvement
  Components: performance, regionserver
Affects Versions: 0.90.4
Reporter: Benoit Sigoure
Assignee: Ted Yu
Priority: Minor
 Fix For: 0.94.0

 Attachments: 
 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 
 4752-trunk-v2.txt, 4752-trunk.txt


 When evicting from the BlockCache, the code creates a LinkedList containing 
 every single block sorted by access time.  This list is created from a 
 PriorityQueue.  I don't believe it is necessary, as the PriorityQueue can be 
 used directly.





[jira] [Updated] (HBASE-4683) Create config option to only cache index blocks

2011-11-06 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-4683:
-

Attachment: 4683-v2.txt

v2 has a config option to cache index blocks with in-memory priority.

 Create config option to only cache index blocks
 ---

 Key: HBASE-4683
 URL: https://issues.apache.org/jira/browse/HBASE-4683
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl
Priority: Minor
 Fix For: 0.94.0

 Attachments: 4683-v2.txt, 4683.txt


 This would add a new boolean config option: hfile.block.cache.datablocks
 Default would be true.
 Setting this to false puts HBase in a mode where only index blocks are 
 cached, which is useful for analytical scenarios where a useful working set 
 of the data cannot be expected to fit into the (aggregate) cache.
 This is the equivalent of setting cacheBlocks to false on all scans 
 (including scans on behalf of gets).
 I would like to get a general feeling about what folks think about this.
 The change itself would be simple.
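
A rough sketch of the decision the proposed option implies, under the assumption that index blocks are always cached and data blocks are cached only when both the global option and the per-scan cacheBlocks flag allow it. `CachePolicySketch`, `BlockType`, and `shouldCache` are hypothetical names, not HBase APIs.

```java
final class CachePolicySketch {

  // Hypothetical block kinds; real HFiles have more block types.
  enum BlockType { DATA, INDEX }

  // cacheDataBlocks models the proposed hfile.block.cache.datablocks option
  // (default true). scanCacheBlocks models the per-scan cacheBlocks flag.
  static boolean shouldCache(BlockType type, boolean cacheDataBlocks,
                             boolean scanCacheBlocks) {
    if (type == BlockType.INDEX) {
      return true; // assumption: index blocks are cached in every mode
    }
    // With the option set to false, data blocks behave as if every scan
    // had cacheBlocks set to false.
    return cacheDataBlocks && scanCacheBlocks;
  }
}
```

With the option at its default of true, the result for data blocks depends only on the scan's own flag, which matches the equivalence described above.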





[jira] [Updated] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM

2011-11-06 Thread nkeywal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-4737:
---

Attachment: hbasetests.sh

Shell script to execute the tests:
- small in // within the JVM
- medium and large with 2 JVMs

The JIRA contains 3 patches:
- 1 for the pom
- 1 for the Java modifications
- 1 for the shell script

I will create another Jira when I commit the impact for surefire.

 Split the tests in small/medium/large; allow small tests to be ran in 
 parallel within a single JVM
 --

 Key: HBASE-4737
 URL: https://issues.apache.org/jira/browse/HBASE-4737
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 2003_4737_pom.dummy.patch, 2003_4737_pom.patch, 
 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 
 2003_4737_pom.v2.patch, 2003_4737_pom.v2.patch, 4737_all.patch, 
 hbasetests.sh, hbasetests.sh


 1) Split the tests into 3 categories
  - small: no cluster, less than 15s, can be run in parallel with other tests 
 in a JVM
  - medium: 45s, not flaky, useful to detect bugs immediately
  - large: remaining
  
 2) Allow running a subset: developers should need to run only small and 
 medium before submitting a patch
  - will need a surefire patch, see 
 http://jira.codehaus.org/browse/SUREFIRE-329
 Small is the default. All other tests will have to be marked Medium or Large 
 with a JUnit category.
 Proposed split:
 Small: 122 classes, 479 methods, ~3 minutes (when not run in parallel)
 Medium: 78 classes, 373 methods, ~23 minutes
 Large: 34 classes, 221 methods, ~60 minutes
 I will have to extract the methods that are today in large or medium but 
 could be in small (typically io.hfile.TestHFileBlock#testBlockHeapSize). This 
 will be done in a second step (and another JIRA).
 MEDIUM LIST (name; number of methods; time)
 org.apache.hadoop.hbase.avro.TestAvroServer   3   31.468
 org.apache.hadoop.hbase.catalog.TestCatalogTracker   8   4.174
 org.apache.hadoop.hbase.catalog.TestMetaReaderEditorNoCluster   1   3.888
 org.apache.hadoop.hbase.catalog.TestMetaReaderEditor   5   30.157
 org.apache.hadoop.hbase.client.replication.TestReplicationAdmin   1   0.762
 org.apache.hadoop.hbase.client.TestHCM   3   21.961
 org.apache.hadoop.hbase.client.TestHTablePool   18   26.274
 org.apache.hadoop.hbase.client.TestHTableUtil   2   16.997
 org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD   3   24.629
 org.apache.hadoop.hbase.client.TestMetaScanner   1   16.365
 org.apache.hadoop.hbase.client.TestMultiParallel   10   34.077
 org.apache.hadoop.hbase.client.TestTimestampsFilter   3   27.547
 org.apache.hadoop.hbase.coprocessor.TestAggregateProtocol   44   16.834
 org.apache.hadoop.hbase.coprocessor.TestClassLoading   5   31.346
 org.apache.hadoop.hbase.coprocessor.TestCoprocessorEndpoint   2   32.736
 org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithAbort   1   13.874
 org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithRemove   1   16.923
 org.apache.hadoop.hbase.coprocessor.TestMasterObserver   3   29.97
 org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass   2   14.976
 org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface   5   33.353
 org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort   1   16.596
 org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithRemove   1   18.183
 org.apache.hadoop.hbase.coprocessor.TestWALObserver   3   19.373
 org.apache.hadoop.hbase.filter.TestColumnRangeFilter   1   19.045
 org.apache.hadoop.hbase.io.hfile.slab.TestSingleSizeCache   5   24.294
 org.apache.hadoop.hbase.io.hfile.slab.TestSlabCache   7   19.818
 org.apache.hadoop.hbase.io.hfile.TestHFileBlock   7   25.226
 org.apache.hadoop.hbase.io.hfile.TestLruBlockCache   7   0.343
 org.apache.hadoop.hbase.mapreduce.TestImportTsv   8   40.391
 org.apache.hadoop.hbase.master.TestActiveMasterManager   2   0.724
 org.apache.hadoop.hbase.master.TestHMasterRPCException   1   1.17
 org.apache.hadoop.hbase.master.TestLogsCleaner   1   2.953
 org.apache.hadoop.hbase.master.TestMaster   1   18.918
 org.apache.hadoop.hbase.master.TestOpenedRegionHandler   2   20.57
 org.apache.hadoop.hbase.master.TestSplitLogManager   10   13.979
 org.apache.hadoop.hbase.master.TestZKBasedOpenCloseRegion   3   21.675
 org.apache.hadoop.hbase.regionserver.handler.TestOpenRegionHandler   3   0.887
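
The category rules in the description can be sketched as a small classifier. This is only an illustration: the names TestSize and TestSizeClassifier are hypothetical and not part of the patch, and it assumes that "45s" for medium is meant as an upper bound on the run time.

```java
// Sketch only: thresholds are taken from the issue text; all names are hypothetical.
enum TestSize { SMALL, MEDIUM, LARGE }

final class TestSizeClassifier {
    static final double SMALL_LIMIT_SECONDS = 15.0;
    // Assumption: the issue says "medium: 45s"; treated here as an upper bound.
    static final double MEDIUM_LIMIT_SECONDS = 45.0;

    private TestSizeClassifier() {}

    /**
     * Classifies a test class from three observations:
     * whether it starts a cluster, whether it is flaky, and how long it runs.
     */
    static TestSize classify(boolean startsCluster, boolean flaky, double seconds) {
        // small: no cluster and less than 15s
        if (!startsCluster && seconds < SMALL_LIMIT_SECONDS) {
            return TestSize.SMALL;
        }
        // medium: not flaky and under the assumed 45s bound
        if (!flaky && seconds < MEDIUM_LIMIT_SECONDS) {
            return TestSize.MEDIUM;
        }
        // large: everything that remains
        return TestSize.LARGE;
    }
}
```

With these rules a fast test that starts a cluster still lands in medium, which fits the note above that some methods currently in medium or large could be extracted into small.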
 

[jira] [Commented] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM

2011-11-06 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145136#comment-13145136
 ] 

Hadoop QA commented on HBASE-4737:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12502696/hbasetests.sh
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/199//console

This message is automatically generated.


[jira] [Updated] (HBASE-4753) org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws NPE on trunk

2011-11-06 Thread stack (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4753:
-

   Resolution: Fixed
Fix Version/s: 0.92.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Applied to trunk and to 0.92.  Thanks for the patch N.

 org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws 
 NPE on trunk
 

 Key: HBASE-4753
 URL: https://issues.apache.org/jira/browse/HBASE-4753
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.92.0

 Attachments: 4753_FSTableDescriptors.patch


 testGetSetOfHTD(org.apache.hadoop.hbase.regionserver.TestHRegionInfo)  Time 
 elapsed: 0.011 sec   ERROR!
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hbase.util.FSTableDescriptors.deleteTableDescriptorIfExists(FSTableDescriptors.java:433)
   at 
 org.apache.hadoop.hbase.regionserver.TestHRegionInfo.testGetSetOfHTD(TestHRegionInfo.java:72)
 This happens because 'getTableInfoPath' can return null and the result is not checked.
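
A minimal sketch of the kind of null check involved. DescriptorStore is a hypothetical stand-in backed by a map; it is not the real FSTableDescriptors class, and only the two method names are borrowed from the stack trace for readability.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a descriptor store; not the real FSTableDescriptors API.
final class DescriptorStore {
    private final Map<String, String> pathsByTable = new HashMap<>();

    void register(String tableName, String path) {
        pathsByTable.put(tableName, path);
    }

    /** Returns the descriptor path, or null when the table has no descriptor. */
    String getTableInfoPath(String tableName) {
        return pathsByTable.get(tableName);
    }

    /** Deletes the descriptor if present; returns false instead of throwing when it is absent. */
    boolean deleteTableDescriptorIfExists(String tableName) {
        String path = getTableInfoPath(tableName);
        if (path == null) {
            // Nothing to delete. Without this check, any use of 'path' below would throw an NPE.
            return false;
        }
        pathsByTable.remove(tableName);
        return true;
    }
}
```

Returning a boolean keeps the "if exists" contract explicit: callers can tell whether anything was deleted without catching an exception.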





[jira] [Commented] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM

2011-11-06 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145145#comment-13145145
 ] 

Hadoop QA commented on HBASE-4737:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12502689/4737_all.patch
  against trunk revision .

-1 @author.  The patch appears to contain 1 @author tags which the Hadoop 
community has agreed to not allow in code contributions.

+1 tests included.  The patch appears to include 376 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -164 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 48 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandler
  org.apache.hadoop.hbase.master.TestDistributedLogSplitting
  org.apache.hadoop.hbase.master.TestMasterFailover

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/198//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/198//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/198//console

This message is automatically generated.


[jira] [Updated] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-4746:
---

Attachment: D279.3.patch

mbautin updated the revision [jira] [HBASE-4746] Use a random ZK client port 
in unit tests so we can run them in parallel.
Reviewers: tedyu, JIRA

  Making follow-up changes after addressing Ted's and Miguel's comments from 
D255.

REVISION DETAIL
  https://reviews.facebook.net/D279

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/HConstants.java
  src/main/java/org/apache/hadoop/hbase/client/HTable.java
  src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.java
  src/main/java/org/apache/hadoop/hbase/master/HMasterCommandLine.java
  src/main/java/org/apache/hadoop/hbase/thrift2/ThriftHBaseServiceHandler.java
  src/main/java/org/apache/hadoop/hbase/thrift2/ThriftServer.java
  src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
  src/main/java/org/apache/hadoop/hbase/util/RegionSplitter.java
  src/main/java/org/apache/hadoop/hbase/zookeeper/MiniZooKeeperCluster.java
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZKConfig.java
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java
  src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
  src/test/java/org/apache/hadoop/hbase/PerformanceEvaluation.java
  src/test/java/org/apache/hadoop/hbase/TestHBaseTestingUtility.java
  src/test/java/org/apache/hadoop/hbase/TestZooKeeper.java
  src/test/java/org/apache/hadoop/hbase/replication/TestMasterReplication.java
  src/test/java/org/apache/hadoop/hbase/replication/TestMultiSlaveReplication.java
  src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java
  src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java
  src/test/java/org/apache/hadoop/hbase/thrift2/TestThriftHBaseServiceHandler.java
  src/test/java/org/apache/hadoop/hbase/util/TestMergeTool.java
  src/test/java/org/apache/hadoop/hbase/util/TestRegionSplitter.java
  src/test/java/org/apache/hadoop/hbase/zookeeper/TestHQuorumPeer.java
  src/test/java/org/apache/hadoop/hbase/zookeeper/TestZooKeeperMainServerArg.java


 Use a random ZK client port in unit tests so we can run them in parallel
 

 Key: HBASE-4746
 URL: https://issues.apache.org/jira/browse/HBASE-4746
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 4746-trunk-v2.txt, D255.1.patch, D279.1.patch, 
 D279.2.patch, D279.3.patch


 The hard-coded ZK client port has long been a problem for running the HBase 
 test suite in parallel. The mini ZK cluster should run on a random free port, 
 and that port should be passed to all parts of the unit tests that need to 
 talk to the mini cluster. In fact, randomizing the port exposes a lot of 
 places in the code where a new configuration is instantiated, and as a result 
 the client tries to talk to the default ZK client port and times out.
 The initial fix is for 0.89-fb, where it already allows running the unit 
 tests in parallel in 10 minutes. A fix for the trunk will follow.
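
The idea can be sketched in plain Java, assuming nothing about the actual patch. RandomPortSketch and its methods are hypothetical, java.util.Properties stands in for the HBase Configuration object, and the key name is assumed to be HBase's ZK client port setting. Binding a ServerSocket to port 0 asks the OS for a free port; reading the port back with a default of 2181 shows how a freshly created configuration silently falls back to the default ZK port.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.util.Properties;

// Hypothetical sketch; Properties stands in for the HBase Configuration object.
final class RandomPortSketch {
    // Assumption: this is the configuration key HBase uses for the ZK client port.
    static final String CLIENT_PORT_KEY = "hbase.zookeeper.property.clientPort";
    static final String DEFAULT_ZK_CLIENT_PORT = "2181";

    private RandomPortSketch() {}

    /** Asks the OS for a currently free port by binding to port 0, then releases it. */
    static int findFreePort() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds the configuration that must be handed to every component of the test. */
    static Properties configFor(int clientPort) {
        Properties conf = new Properties();
        conf.setProperty(CLIENT_PORT_KEY, Integer.toString(clientPort));
        return conf;
    }

    /** Reads the client port; a fresh, empty configuration yields the default port. */
    static int clientPort(Properties conf) {
        return Integer.parseInt(conf.getProperty(CLIENT_PORT_KEY, DEFAULT_ZK_CLIENT_PORT));
    }
}
```

Note that the port is released before the caller uses it, so another process could grab it in between; a real mini cluster would probably need to retry on a bind failure.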





[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145155#comment-13145155
 ] 

Hadoop QA commented on HBASE-4746:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12502699/D279.3.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 39 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/200//console

This message is automatically generated.





[jira] [Updated] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-4746:
---

Attachment: D255.2.patch

mbautin updated the revision [jira] [HBASE-4746] [89-fb] Use a random ZK 
client port in unit tests so we can run them in parallel.
Reviewers: Kannan, Karthik, nspiegelberg, JIRA, stack

  Making this diff more consistent with D279, and addressing Ted's and Miguel's 
comments. The only unit tests that fail are TestReplication, 
TestFullLogReconstruction, and TestZooKeeper, but these failures have existed 
for a while and Liyin is working on a fix for those. The time to run the test 
suite is consistently under 10 minutes.

REVISION DETAIL
  https://reviews.facebook.net/D255

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/HConstants.java
  src/main/java/org/apache/hadoop/hbase/MiniZooKeeperCluster.java
  src/main/java/org/apache/hadoop/hbase/avro/AvroServer.java
  src/main/java/org/apache/hadoop/hbase/client/HTable.java
  src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.java
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java
  src/main/java/org/apache/hadoop/hbase/master/ThrottledRegionReopener.java
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
  src/main/java/org/apache/hadoop/hbase/replication/ReplicationZookeeperWrapper.java
  src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java
  src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java
  src/main/java/org/apache/hadoop/hbase/zookeeper/HQuorumPeer.java
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java
  src/main/ruby/hbase/admin.rb
  src/test/java/org/apache/hadoop/hbase/HBaseClusterTestCase.java
  src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
  src/test/java/org/apache/hadoop/hbase/PerformanceEvaluation.java
  src/test/java/org/apache/hadoop/hbase/TestFullLogReconstruction.java
  src/test/java/org/apache/hadoop/hbase/TestMultiParallelPut.java
  src/test/java/org/apache/hadoop/hbase/TestZooKeeper.java
  src/test/java/org/apache/hadoop/hbase/avro/TestAvroServer.java
  src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
  src/test/java/org/apache/hadoop/hbase/loadtest/RegionSplitter.java
  src/test/java/org/apache/hadoop/hbase/mapred/TestLegacyTableMapReduce.java
  src/test/java/org/apache/hadoop/hbase/mapred/TestTableMapReduce.java
  src/test/java/org/apache/hadoop/hbase/mapreduce/TestLoadIncrementalHFiles.java
  src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java
  src/test/java/org/apache/hadoop/hbase/master/TestMaster.java
  src/test/java/org/apache/hadoop/hbase/master/TestRegionManager.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionClose.java
  src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java
  src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSink.java
  src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java
  src/test/java/org/apache/hadoop/hbase/rest/HBaseRESTClusterTestBase.java
  src/test/java/org/apache/hadoop/hbase/rest/client/TestRemoteTable.java
  src/test/java/org/apache/hadoop/hbase/thrift/TestThriftServer.java
  src/test/java/org/apache/hadoop/hbase/zookeeper/TestHQuorumPeer.java
  src/test/resources/hbase-site.xml
  src/test/ruby/shell/shell_test.rb



[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145157#comment-13145157
 ] 

Phabricator commented on HBASE-4746:


mbautin has commented on the revision [jira] [HBASE-4746] [89-fb] Use a random 
ZK client port in unit tests so we can run them in parallel.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java:200 Removed 
this constructor.

REVISION DETAIL
  https://reviews.facebook.net/D255






[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145160#comment-13145160
 ] 

Phabricator commented on HBASE-4746:


mbautin has commented on the revision [jira] [HBASE-4746] Use a random ZK 
client port in unit tests so we can run them in parallel.

  An update on unit test runs:

  Run 1: 239 test classes run, 1092 tests, 16 min 13 sec. 
TestRegionServerCoprocessorExceptionWithAbort failed but succeeded when rerun 
locally
  Run 2: 239 test classes run, 1096 tests, 9 min 39 sec, no failures.

  I will look into the discrepancy between the number of tests in these two 
runs, but I think it is because the MapReduce framework can run the same 
task multiple times (speculative execution). So I guess we are good to go 
with this patch.

REVISION DETAIL
  https://reviews.facebook.net/D279






[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145161#comment-13145161
 ] 

Hadoop QA commented on HBASE-4746:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12502700/D255.2.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 77 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/201//console

This message is automatically generated.





[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145162#comment-13145162
 ] 

Phabricator commented on HBASE-4746:


mbautin has commented on the revision [jira] [HBASE-4746] Use a random ZK 
client port in unit tests so we can run them in parallel.

  Actually, I just found a potential bug that affects command-line invocations 
of HMaster -- will upload a new diff shortly. Sorry for so many emails.

REVISION DETAIL
  https://reviews.facebook.net/D279


 Use a random ZK client port in unit tests so we can run them in parallel
 

 Key: HBASE-4746
 URL: https://issues.apache.org/jira/browse/HBASE-4746
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, 
 D279.1.patch, D279.2.patch, D279.3.patch


 The hard-coded ZK client port has long been a problem for running the HBase 
 test suite in parallel. The mini ZK cluster should run on a random free port, 
 and that port should be passed to all parts of the unit tests that need to 
 talk to the mini cluster. In fact, randomizing the port exposes a lot of 
 places in the code where a new configuration is instantiated, and as a result 
 the client tries to talk to the default ZK client port and times out.
 The initial fix is for 0.89-fb, where it already allows the unit tests to be 
 run in parallel in 10 minutes. A fix for the trunk will follow.
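
As a rough illustration of the idea described above (not the code from the attached patches), the sketch below asks the operating system for a free port and records it in a shared configuration. The class RandomPortSketch and its methods are hypothetical, and a plain Map stands in for an HBase Configuration; only the property name hbase.zookeeper.property.clientPort and the default ZK client port 2181 are taken from HBase and ZooKeeper themselves.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of HBase. */
final class RandomPortSketch {
    static final String CLIENT_PORT_KEY = "hbase.zookeeper.property.clientPort";
    static final int DEFAULT_ZK_CLIENT_PORT = 2181;

    /** Binds to port 0 so the OS picks a currently free port, then releases it. */
    static int findFreePort() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Records the chosen port in the map that every test component should share. */
    static Map<String, String> configFor(int port) {
        Map<String, String> conf = new HashMap<>();
        conf.put(CLIENT_PORT_KEY, Integer.toString(port));
        return conf;
    }

    /** Reads the port a client would use; a freshly created config falls back to the default. */
    static int clientPort(Map<String, String> conf) {
        String value = conf.get(CLIENT_PORT_KEY);
        return value == null ? DEFAULT_ZK_CLIENT_PORT : Integer.parseInt(value);
    }

    public static void main(String[] args) {
        Map<String, String> shared = configFor(findFreePort());
        System.out.println("shared config port: " + clientPort(shared));
        // A component that builds its own config loses the random port.
        System.out.println("fresh config port:  " + clientPort(new HashMap<>()));
    }
}
```

Note that another process may grab the port between the moment it is released and the moment the mini cluster binds to it, so a real test harness would probably need to retry on a bind failure.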





[jira] [Commented] (HBASE-3939) Some crossports of Hadoop IPC fixes

2011-11-06 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145167#comment-13145167
 ] 

Hudson commented on HBASE-3939:
---

Integrated in HBase-0.92 #115 (See 
[https://builds.apache.org/job/HBase-0.92/115/])
HBASE-3939 Some crossports of Hadoop IPC fixes
HBASE-3939 Some crossports of Hadoop IPC fixes

stack : 
Files : 
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ipc/ProtocolSignature.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ipc/Status.java

stack : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/coprocessor/BaseEndpointCoprocessor.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ipc/CoprocessorProtocol.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ipc/Invocation.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ipc/VersionedProtocol.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ipc/WritableRpcEngine.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/ipc/TestDelayedRpc.java
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/regionserver/TestServerCustomProtocol.java


 Some crossports of Hadoop IPC fixes
 ---

 Key: HBASE-3939
 URL: https://issues.apache.org/jira/browse/HBASE-3939
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Priority: Critical
 Fix For: 0.92.0

 Attachments: 3939-v2.txt, 3939-v3.txt, 3939-v4.txt, 3939-v5.txt, 
 3939-v6.txt, 3939-v7.txt, 3939-v8.txt, 3939-v9.txt, 3939.txt


 A few fixes from Hadoop IPC that we should probably cross-port into our copy:
 - HADOOP-7227: remove the protocol version check at call time
 - HADOOP-7146: fix a socket leak in server
 - HADOOP-7121: fix behavior when response serialization throws an exception
 - HADOOP-7346: send back nicer error response when client is using an out of 
 date IPC version





[jira] [Updated] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-4746:
---

Attachment: D279.4.patch

mbautin updated the revision [jira] [HBASE-4746] Use a random ZK client port 
in unit tests so we can run them in parallel.
Reviewers: tedyu, JIRA

  Addressing the problems with running HMaster from the command line. Also 
fixing intermittent issues with TestRegionServerCoprocessorExceptionWithAbort 
and TestRegionServerCoprocessorExceptionWithRemove. 1096 tests in 239 classes 
succeeded.

REVISION DETAIL
  https://reviews.facebook.net/D279

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/HConstants.java
  src/main/java/org/apache/hadoop/hbase/client/HTable.java
  src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.java
  src/main/java/org/apache/hadoop/hbase/master/HMasterCommandLine.java
  src/main/java/org/apache/hadoop/hbase/thrift2/ThriftHBaseServiceHandler.java
  src/main/java/org/apache/hadoop/hbase/thrift2/ThriftServer.java
  src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
  src/main/java/org/apache/hadoop/hbase/util/RegionSplitter.java
  src/main/java/org/apache/hadoop/hbase/zookeeper/MiniZooKeeperCluster.java
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZKConfig.java
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java
  src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
  src/test/java/org/apache/hadoop/hbase/PerformanceEvaluation.java
  src/test/java/org/apache/hadoop/hbase/TestHBaseTestingUtility.java
  src/test/java/org/apache/hadoop/hbase/TestZooKeeper.java
  
src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorExceptionWithAbort.java
  
src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorExceptionWithRemove.java
  src/test/java/org/apache/hadoop/hbase/replication/TestMasterReplication.java
  
src/test/java/org/apache/hadoop/hbase/replication/TestMultiSlaveReplication.java
  src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java
  
src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java
  
src/test/java/org/apache/hadoop/hbase/thrift2/TestThriftHBaseServiceHandler.java
  src/test/java/org/apache/hadoop/hbase/util/TestMergeTool.java
  src/test/java/org/apache/hadoop/hbase/util/TestRegionSplitter.java
  src/test/java/org/apache/hadoop/hbase/zookeeper/TestHQuorumPeer.java
  
src/test/java/org/apache/hadoop/hbase/zookeeper/TestZooKeeperMainServerArg.java


 Use a random ZK client port in unit tests so we can run them in parallel
 

 Key: HBASE-4746
 URL: https://issues.apache.org/jira/browse/HBASE-4746
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, 
 D279.1.patch, D279.2.patch, D279.3.patch, D279.4.patch


 The hard-coded ZK client port has long been a problem for running the HBase 
 test suite in parallel. The mini ZK cluster should run on a random free port, 
 and that port should be passed to all parts of the unit tests that need to 
 talk to the mini cluster. In fact, randomizing the port exposes a lot of 
 places in the code where a new configuration is instantiated, and as a result 
 the client tries to talk to the default ZK client port and times out.
 The initial fix is for 0.89-fb, where it already allows the unit tests to be 
 run in parallel in 10 minutes. A fix for the trunk will follow.





[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145177#comment-13145177
 ] 

Hadoop QA commented on HBASE-4746:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12502707/D279.4.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 45 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/202//console

This message is automatically generated.

 Use a random ZK client port in unit tests so we can run them in parallel
 

 Key: HBASE-4746
 URL: https://issues.apache.org/jira/browse/HBASE-4746
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, 
 D279.1.patch, D279.2.patch, D279.3.patch, D279.4.patch


 The hard-coded ZK client port has long been a problem for running the HBase 
 test suite in parallel. The mini ZK cluster should run on a random free port, 
 and that port should be passed to all parts of the unit tests that need to 
 talk to the mini cluster. In fact, randomizing the port exposes a lot of 
 places in the code where a new configuration is instantiated, and as a result 
 the client tries to talk to the default ZK client port and times out.
 The initial fix is for 0.89-fb, where it already allows the unit tests to be 
 run in parallel in 10 minutes. A fix for the trunk will follow.





[jira] [Updated] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-4746:
---

Attachment: D279.5.patch

mbautin updated the revision [jira] [HBASE-4746] Use a random ZK client port 
in unit tests so we can run them in parallel.
Reviewers: tedyu, JIRA

  Follow-up cleanup to TestRegionServerCoprocessorExceptionWithRemove.

REVISION DETAIL
  https://reviews.facebook.net/D279

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/HConstants.java
  src/main/java/org/apache/hadoop/hbase/client/HTable.java
  src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.java
  src/main/java/org/apache/hadoop/hbase/master/HMasterCommandLine.java
  src/main/java/org/apache/hadoop/hbase/thrift2/ThriftHBaseServiceHandler.java
  src/main/java/org/apache/hadoop/hbase/thrift2/ThriftServer.java
  src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
  src/main/java/org/apache/hadoop/hbase/util/RegionSplitter.java
  src/main/java/org/apache/hadoop/hbase/zookeeper/MiniZooKeeperCluster.java
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZKConfig.java
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java
  src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
  src/test/java/org/apache/hadoop/hbase/PerformanceEvaluation.java
  src/test/java/org/apache/hadoop/hbase/TestHBaseTestingUtility.java
  src/test/java/org/apache/hadoop/hbase/TestZooKeeper.java
  
src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorExceptionWithAbort.java
  
src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorExceptionWithRemove.java
  src/test/java/org/apache/hadoop/hbase/replication/TestMasterReplication.java
  
src/test/java/org/apache/hadoop/hbase/replication/TestMultiSlaveReplication.java
  src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java
  
src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java
  
src/test/java/org/apache/hadoop/hbase/thrift2/TestThriftHBaseServiceHandler.java
  src/test/java/org/apache/hadoop/hbase/util/TestMergeTool.java
  src/test/java/org/apache/hadoop/hbase/util/TestRegionSplitter.java
  src/test/java/org/apache/hadoop/hbase/zookeeper/TestHQuorumPeer.java
  
src/test/java/org/apache/hadoop/hbase/zookeeper/TestZooKeeperMainServerArg.java


 Use a random ZK client port in unit tests so we can run them in parallel
 

 Key: HBASE-4746
 URL: https://issues.apache.org/jira/browse/HBASE-4746
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, 
 D279.1.patch, D279.2.patch, D279.3.patch, D279.4.patch, D279.5.patch


 The hard-coded ZK client port has long been a problem for running the HBase 
 test suite in parallel. The mini ZK cluster should run on a random free port, 
 and that port should be passed to all parts of the unit tests that need to 
 talk to the mini cluster. In fact, randomizing the port exposes a lot of 
 places in the code where a new configuration is instantiated, and as a result 
 the client tries to talk to the default ZK client port and times out.
 The initial fix is for 0.89-fb, where it already allows the unit tests to be 
 run in parallel in 10 minutes. A fix for the trunk will follow.





[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145180#comment-13145180
 ] 

Hadoop QA commented on HBASE-4746:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12502708/D279.5.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 45 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/203//console

This message is automatically generated.

 Use a random ZK client port in unit tests so we can run them in parallel
 

 Key: HBASE-4746
 URL: https://issues.apache.org/jira/browse/HBASE-4746
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, 
 D279.1.patch, D279.2.patch, D279.3.patch, D279.4.patch, D279.5.patch


 The hard-coded ZK client port has long been a problem for running the HBase 
 test suite in parallel. The mini ZK cluster should run on a random free port, 
 and that port should be passed to all parts of the unit tests that need to 
 talk to the mini cluster. In fact, randomizing the port exposes a lot of 
 places in the code where a new configuration is instantiated, and as a result 
 the client tries to talk to the default ZK client port and times out.
 The initial fix is for 0.89-fb, where it already allows the unit tests to be 
 run in parallel in 10 minutes. A fix for the trunk will follow.





[jira] [Updated] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4746:
--

Status: Open  (was: Patch Available)

 Use a random ZK client port in unit tests so we can run them in parallel
 

 Key: HBASE-4746
 URL: https://issues.apache.org/jira/browse/HBASE-4746
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, 
 D279-trunk-v5.txt, D279.1.patch, D279.2.patch, D279.3.patch, D279.4.patch, 
 D279.5.patch


 The hard-coded ZK client port has long been a problem for running the HBase 
 test suite in parallel. The mini ZK cluster should run on a random free port, 
 and that port should be passed to all parts of the unit tests that need to 
 talk to the mini cluster. In fact, randomizing the port exposes a lot of 
 places in the code where a new configuration is instantiated, and as a result 
 the client tries to talk to the default ZK client port and times out.
 The initial fix is for 0.89-fb, where it already allows the unit tests to be 
 run in parallel in 10 minutes. A fix for the trunk will follow.





[jira] [Updated] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4746:
--

Status: Patch Available  (was: Open)

Patch testing v5.

 Use a random ZK client port in unit tests so we can run them in parallel
 

 Key: HBASE-4746
 URL: https://issues.apache.org/jira/browse/HBASE-4746
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, 
 D279-trunk-v5.txt, D279.1.patch, D279.2.patch, D279.3.patch, D279.4.patch, 
 D279.5.patch


 The hard-coded ZK client port has long been a problem for running the HBase 
 test suite in parallel. The mini ZK cluster should run on a random free port, 
 and that port should be passed to all parts of the unit tests that need to 
 talk to the mini cluster. In fact, randomizing the port exposes a lot of 
 places in the code where a new configuration is instantiated, and as a result 
 the client tries to talk to the default ZK client port and times out.
 The initial fix is for 0.89-fb, where it already allows the unit tests to be 
 run in parallel in 10 minutes. A fix for the trunk will follow.





[jira] [Updated] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4746:
--

Attachment: D279-trunk-v5.txt

Changes to FSTableDescriptors.java are already in TRUNK.

 Use a random ZK client port in unit tests so we can run them in parallel
 

 Key: HBASE-4746
 URL: https://issues.apache.org/jira/browse/HBASE-4746
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, 
 D279-trunk-v5.txt, D279.1.patch, D279.2.patch, D279.3.patch, D279.4.patch, 
 D279.5.patch


 The hard-coded ZK client port has long been a problem for running the HBase 
 test suite in parallel. The mini ZK cluster should run on a random free port, 
 and that port should be passed to all parts of the unit tests that need to 
 talk to the mini cluster. In fact, randomizing the port exposes a lot of 
 places in the code where a new configuration is instantiated, and as a result 
 the client tries to talk to the default ZK client port and times out.
 The initial fix is for 0.89-fb, where it already allows the unit tests to be 
 run in parallel in 10 minutes. A fix for the trunk will follow.





[jira] [Updated] (HBASE-4740) [bulk load] the HBASE-4552 API can't tell if errors on region server are recoverable

2011-11-06 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4740:
--

Summary: [bulk load]  the HBASE-4552 API can't tell if errors on region 
server are recoverable  (was: [bulk load]  the HBASE-4552 API can't tell if 
errors on region server is recoverable or unrecoverable error.)

 [bulk load]  the HBASE-4552 API can't tell if errors on region server are 
 recoverable
 -

 Key: HBASE-4740
 URL: https://issues.apache.org/jira/browse/HBASE-4740
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Priority: Blocker
 Fix For: 0.92.0

 Attachments: 
 0001-HBASE-4740-bulkload-HBASE-4552-API-can-t-tell-if-err.patch, 4740.v3, 
 hbase-4740.v2.patch


 Running TestHFileOutputFormat more frequently seems to show that it has 
 become flaky. It is difficult to tell whether this is because of an 
 unrecoverable failure or a recoverable failure. To make this visible to tests 
 and to users, we need to change the interface of the bulk load call on 
 HRegionServer. The change should make successful RPCs return true, 
 recoverable failures return false, and unrecoverable failures throw an 
 IOException. This is an RPC change, so it would be really good to get this 
 API right before the final 0.92 goes out.
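
To make the proposed contract concrete, here is a minimal sketch of how a caller might act on the three outcomes: stop on true, retry on false, and give up immediately on an IOException. The BulkLoadCall interface and the loadWithRetries helper are hypothetical names used only for this illustration; they are not the HRegionServer interface or the client code from the patch.

```java
import java.io.IOException;

/** Hypothetical illustration of the true / false / IOException contract. */
final class BulkLoadRetrySketch {
    /** Stand-in for the region server bulk load call. */
    interface BulkLoadCall {
        /**
         * @return true on success, false on a recoverable failure
         * @throws IOException on an unrecoverable failure
         */
        boolean attempt() throws IOException;
    }

    /**
     * Retries while the call reports a recoverable failure.
     *
     * @return the number of attempts that were needed
     * @throws IOException if the call fails unrecoverably or the retry budget runs out
     */
    static int loadWithRetries(BulkLoadCall call, int maxAttempts) throws IOException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (call.attempt()) {
                return attempt; // success
            }
            // false: recoverable failure, so fall through and try again
        }
        throw new IOException("gave up after " + maxAttempts + " recoverable failures");
    }
}
```

With this shape, a test can tell a flaky-but-recoverable run (more than one attempt) from a hard failure (an exception), which is the distinction the description asks for.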





[jira] [Commented] (HBASE-4753) org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws NPE on trunk

2011-11-06 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145188#comment-13145188
 ] 

Hudson commented on HBASE-4753:
---

Integrated in HBase-TRUNK #2416 (See 
[https://builds.apache.org/job/HBase-TRUNK/2416/])
HBASE-4753 
org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws NPE 
on trunk

stack : 
Files : 
* /hbase/trunk/CHANGES.txt
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java


 org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws 
 NPE on trunk
 

 Key: HBASE-4753
 URL: https://issues.apache.org/jira/browse/HBASE-4753
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.92.0

 Attachments: 4753_FSTableDescriptors.patch


 testGetSetOfHTD(org.apache.hadoop.hbase.regionserver.TestHRegionInfo)  Time 
 elapsed: 0.011 sec   ERROR!
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hbase.util.FSTableDescriptors.deleteTableDescriptorIfExists(FSTableDescriptors.java:433)
   at 
 org.apache.hadoop.hbase.regionserver.TestHRegionInfo.testGetSetOfHTD(TestHRegionInfo.java:72)
 This happens because 'getTableInfoPath' can return null and the result is not 
 checked before it is used.
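
The usual remedy for this kind of failure is a null check before the value is used. The sketch below shows the pattern in isolation. It borrows the method names from the stack trace, but the class NullGuardSketch, its map-backed storage, and the method signatures are assumptions made for the example; it is not the FSTableDescriptors code or the attached patch.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in showing a null guard; not the real FSTableDescriptors. */
final class NullGuardSketch {
    private final Map<String, String> infoPaths = new HashMap<>();

    void register(String table, String path) {
        infoPaths.put(table, path);
    }

    /** May return null when no descriptor exists for the table. */
    String getTableInfoPath(String table) {
        return infoPaths.get(table);
    }

    /** Deletes the descriptor if present; returns false instead of failing when absent. */
    boolean deleteTableDescriptorIfExists(String table) {
        String path = getTableInfoPath(table);
        if (path == null) {
            return false; // nothing to delete, so do not dereference the missing path
        }
        infoPaths.remove(table);
        return true;
    }
}
```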





[jira] [Commented] (HBASE-3939) Some crossports of Hadoop IPC fixes

2011-11-06 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145189#comment-13145189
 ] 

Hudson commented on HBASE-3939:
---

Integrated in HBase-TRUNK #2416 (See 
[https://builds.apache.org/job/HBase-TRUNK/2416/])
HBASE-3939 Some crossports of Hadoop IPC fixes

stack : 
Files : 
* /hbase/trunk/CHANGES.txt
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/coprocessor/BaseEndpointCoprocessor.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/CoprocessorProtocol.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/Invocation.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/ProtocolSignature.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/Status.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/VersionedProtocol.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/WritableRpcEngine.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/ipc/TestDelayedRpc.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestServerCustomProtocol.java


 Some crossports of Hadoop IPC fixes
 ---

 Key: HBASE-3939
 URL: https://issues.apache.org/jira/browse/HBASE-3939
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Priority: Critical
 Fix For: 0.92.0

 Attachments: 3939-v2.txt, 3939-v3.txt, 3939-v4.txt, 3939-v5.txt, 
 3939-v6.txt, 3939-v7.txt, 3939-v8.txt, 3939-v9.txt, 3939.txt


 A few fixes from Hadoop IPC that we should probably cross-port into our copy:
 - HADOOP-7227: remove the protocol version check at call time
 - HADOOP-7146: fix a socket leak in server
 - HADOOP-7121: fix behavior when response serialization throws an exception
 - HADOOP-7346: send back nicer error response when client is using an out of 
 date IPC version





[jira] [Updated] (HBASE-3792) TableInputFormat leaks ZK connections

2011-11-06 Thread Bryan Keller (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bryan Keller updated HBASE-3792:


Attachment: tableinput.patch

Here's a patch demonstrating the changes I have implemented in my system, as 
described above. The patch is for trunk, so the changes are slightly different 
than what I am using for 0.90.4.

 TableInputFormat leaks ZK connections
 -

 Key: HBASE-3792
 URL: https://issues.apache.org/jira/browse/HBASE-3792
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.90.1
 Environment: Java 1.6.0_24, Mac OS X 10.6.7
Reporter: Bryan Keller
 Attachments: tableinput.patch


 The TableInputFormat creates an HTable using a new Configuration object, and 
 it never cleans it up. When running a Mapper, the TableInputFormat is 
 instantiated and the ZK connection is created. While this connection is not 
 explicitly cleaned up, the Mapper process eventually exits and thus the 
 connection is closed. Ideally the TableRecordReader would close the 
 connection in its close() method rather than relying on the process to die 
 for connection cleanup. This is fairly easy to implement by overriding 
 TableRecordReader, and also overriding TableInputFormat to specify the new 
 record reader.
 The leak occurs when the JobClient is initializing and needs to retrieve the 
 splits. To get the splits, it instantiates a TableInputFormat. Doing so 
 creates a ZK connection that is never cleaned up. Unlike the mapper, however, 
 my job client process does not die. Thus the ZK connections accumulate.
 I was able to fix the problem by writing my own TableInputFormat that does 
 not initialize the HTable in the getConf() method and does not have an HTable 
 member variable. Rather, it has a variable for the table name. The HTable is 
 instantiated where needed and then cleaned up. For example, in the 
 getSplits() method, I create the HTable, then close the connection once the 
 splits are retrieved. I also create the HTable when creating the record 
 reader, and I have a record reader that closes the connection when done.
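
The approach described above (keep only the table name, open the table where it is needed, and close it as soon as the work is done) can be sketched as follows. TableHandle is a hypothetical stand-in for an HTable that holds a ZK connection, and the OPEN counter exists only so the example can show that nothing is left open; none of this is the code in tableinput.patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical input format that holds a table name instead of an open table. */
final class ScopedTableSketch {
    /** Number of handles currently open, to make leaks observable. */
    static final AtomicInteger OPEN = new AtomicInteger();

    /** Stand-in for a table object that owns a connection. */
    static final class TableHandle implements AutoCloseable {
        private final String name;

        TableHandle(String name) {
            this.name = name;
            OPEN.incrementAndGet();
        }

        List<String> splits() {
            List<String> result = new ArrayList<>();
            result.add(name + "-split-0");
            result.add(name + "-split-1");
            return result;
        }

        @Override
        public void close() {
            OPEN.decrementAndGet();
        }
    }

    private final String tableName; // only the name is kept as a member

    ScopedTableSketch(String tableName) {
        this.tableName = tableName;
    }

    /** Opens the table just for this call and closes it before returning. */
    List<String> getSplits() {
        try (TableHandle table = new TableHandle(tableName)) {
            return table.splits();
        }
    }
}
```

Because the handle never outlives the method, a long-lived job client can call getSplits() repeatedly without accumulating connections.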





[jira] [Commented] (HBASE-3792) TableInputFormat leaks ZK connections

2011-11-06 Thread Bryan Keller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145191#comment-13145191
 ] 

Bryan Keller commented on HBASE-3792:
-

BTW, I am having trouble running the tests on trunk, so I wasn't able to verify 
this patch. I'll work on getting my dev environment more functional.

 TableInputFormat leaks ZK connections
 -

 Key: HBASE-3792
 URL: https://issues.apache.org/jira/browse/HBASE-3792
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.90.1
 Environment: Java 1.6.0_24, Mac OS X 10.6.7
Reporter: Bryan Keller
 Attachments: tableinput.patch


 The TableInputFormat creates an HTable using a new Configuration object, and 
 it never cleans it up. When running a Mapper, the TableInputFormat is 
 instantiated and the ZK connection is created. While this connection is not 
 explicitly cleaned up, the Mapper process eventually exits and thus the 
 connection is closed. Ideally the TableRecordReader would close the 
 connection in its close() method rather than relying on the process to die 
 for connection cleanup. This is fairly easy to implement by overriding 
 TableRecordReader, and also overriding TableInputFormat to specify the new 
 record reader.
 The leak occurs when the JobClient is initializing and needs to retrieve the 
 splits. To get the splits, it instantiates a TableInputFormat. Doing so 
 creates a ZK connection that is never cleaned up. Unlike the mapper, however, 
 my job client process does not die. Thus the ZK connections accumulate.
 I was able to fix the problem by writing my own TableInputFormat that does 
 not initialize the HTable in the getConf() method and does not have an HTable 
 member variable. Rather, it has a variable for the table name. The HTable is 
 instantiated where needed and then cleaned up. For example, in the 
 getSplits() method, I create the HTable, then close the connection once the 
 splits are retrieved. I also create the HTable when creating the record 
 reader, and I have a record reader that closes the connection when done.





[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.

2011-11-06 Thread mingjian (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145192#comment-13145192
 ] 

mingjian commented on HBASE-4377:
-

@Jonathan If a region is in the middle of a split, how do we fix it without 
the parent and daughters being online?

 [hbck] Offline rebuild .META. from fs data only.
 

 Key: HBASE-4377
 URL: https://issues.apache.org/jira/browse/HBASE-4377
 Project: HBase
  Issue Type: New Feature
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Attachments: 
 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 
 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 
 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 
 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 
 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, 
 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch, 
 EXT_AC.regioninfo, EXT_ATU_05f84d32cbc0bdabf00e00bc2f3570f0.regioninfo, 
 hbase-4377-trunk.v2.patch, hbase-4377.0.90.v6.patch, hbase-4377.trunk.v3.txt, 
 hbase-4377.trunk.v4.txt, hbase-4377.trunk.v5.txt, hbase-4377.trunk.v6.patch


 In a worst case situation, it may be helpful to have an offline .META. 
 rebuilder that just looks at the file system's .regioninfos and rebuilds meta 
 from scratch.  Users could move bad regions out until there is a clean 
 rebuild.  
 It would likely fill in region split holes.  Follow-on work could give 
 options to merge or select regions that overlap, or do online rebuilds.





[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145195#comment-13145195
 ] 

Hadoop QA commented on HBASE-4746:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12502710/D279-trunk-v5.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 45 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -164 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 51 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/204//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/204//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/204//console

This message is automatically generated.

 Use a random ZK client port in unit tests so we can run them in parallel
 

 Key: HBASE-4746
 URL: https://issues.apache.org/jira/browse/HBASE-4746
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, 
 D279-trunk-v5.txt, D279.1.patch, D279.2.patch, D279.3.patch, D279.4.patch, 
 D279.5.patch


 The hard-coded ZK client port has long been a problem for running the HBase test 
 suite in parallel. The mini ZK cluster should run on a random free port, and 
 that port should be passed to all parts of the unit tests that need to talk 
 to the mini cluster. In fact, randomizing the port exposes a lot of places in 
 the code where a new configuration is instantiated, and as a result the 
 client tries to talk to the default ZK client port and times out.
 The initial fix is for 0.89-fb, where it already allows unit tests to run in 
 parallel in 10 minutes. A fix for the trunk will follow.
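
 One common way to obtain a random free port is to bind to port 0 and let the 
 OS choose. The sketch below shows only that idea; FreePortFinder is a 
 hypothetical name and this is not taken from the attached patches. Note the 
 inherent race: another process could take the port between the probe socket 
 closing and the mini cluster binding to it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

/** Hypothetical helper: asks the OS for a currently unused TCP port. */
final class FreePortFinder {

    /** Binds a probe socket to port 0, reads the assigned port, then releases it. */
    static int findFreePort() {
        try (ServerSocket probe = new ServerSocket(0)) {
            return probe.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int port = findFreePort();
        // A test utility would store this port in the one configuration object
        // that is handed to every client, instead of creating new configurations.
        System.out.println("mini cluster could listen on port " + port);
    }
}
```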





[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.

2011-11-06 Thread Jonathan Hsieh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145197#comment-13145197
 ] 

Jonathan Hsieh commented on HBASE-4377:
---

@mingjian 

If there was a split that didn't complete cleanly, a parent region with 
daughters should look like an overlap.  The tool will tell you where these 
overlaps are.

One way to fix the problem is to keep the parent region and then move or remove 
the daughter regions from hdfs.  Since it is in the middle of a split, the 
parent should have all the data.  Alternatively, you could copy the store files 
from the daughters into the dir of the parent and then run the offline 
rebuilder.

I plan on writing a blog post and hopefully adding to the book on how to fix 
these problems.

 [hbck] Offline rebuild .META. from fs data only.
 

 Key: HBASE-4377
 URL: https://issues.apache.org/jira/browse/HBASE-4377
 Project: HBase
  Issue Type: New Feature
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Attachments: 
 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 
 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 
 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 
 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 
 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, 
 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch, 
 EXT_AC.regioninfo, EXT_ATU_05f84d32cbc0bdabf00e00bc2f3570f0.regioninfo, 
 hbase-4377-trunk.v2.patch, hbase-4377.0.90.v6.patch, hbase-4377.trunk.v3.txt, 
 hbase-4377.trunk.v4.txt, hbase-4377.trunk.v5.txt, hbase-4377.trunk.v6.patch


 In a worst case situation, it may be helpful to have an offline .META. 
 rebuilder that just looks at the file system's .regioninfos and rebuilds meta 
 from scratch.  Users could move bad regions out until there is a clean 
 rebuild.  
 It would likely fill in region split holes.  Follow-on work could give 
 options to merge or select regions that overlap, or do online rebuilds.





[jira] [Updated] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-4746:
---

Attachment: D279.6.patch

mbautin updated the revision [jira] [HBASE-4746] Use a random ZK client port 
in unit tests so we can run them in parallel.
Reviewers: tedyu, JIRA

  Rebasing on recent fixes in the trunk.

REVISION DETAIL
  https://reviews.facebook.net/D279

AFFECTED FILES
  CHANGES.txt
  src/main/java/org/apache/hadoop/hbase/HConstants.java
  src/main/java/org/apache/hadoop/hbase/client/HTable.java
  src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java
  src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java
  src/main/java/org/apache/hadoop/hbase/coprocessor/BaseEndpointCoprocessor.java
  src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java
  src/main/java/org/apache/hadoop/hbase/ipc/CoprocessorProtocol.java
  src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java
  src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java
  src/main/java/org/apache/hadoop/hbase/ipc/Invocation.java
  src/main/java/org/apache/hadoop/hbase/ipc/ProtocolSignature.java
  src/main/java/org/apache/hadoop/hbase/ipc/Status.java
  src/main/java/org/apache/hadoop/hbase/ipc/VersionedProtocol.java
  src/main/java/org/apache/hadoop/hbase/ipc/WritableRpcEngine.java
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.java
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java
  src/main/java/org/apache/hadoop/hbase/master/HMasterCommandLine.java
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
  src/main/java/org/apache/hadoop/hbase/thrift2/ThriftHBaseServiceHandler.java
  src/main/java/org/apache/hadoop/hbase/thrift2/ThriftServer.java
  src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
  src/main/java/org/apache/hadoop/hbase/util/RegionSplitter.java
  src/main/java/org/apache/hadoop/hbase/zookeeper/MiniZooKeeperCluster.java
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZKConfig.java
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java
  src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
  src/test/java/org/apache/hadoop/hbase/PerformanceEvaluation.java
  src/test/java/org/apache/hadoop/hbase/TestHBaseTestingUtility.java
  src/test/java/org/apache/hadoop/hbase/TestZooKeeper.java
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorExceptionWithAbort.java
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorExceptionWithRemove.java
  src/test/java/org/apache/hadoop/hbase/ipc/TestDelayedRpc.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestServerCustomProtocol.java
  src/test/java/org/apache/hadoop/hbase/replication/TestMasterReplication.java
  src/test/java/org/apache/hadoop/hbase/replication/TestMultiSlaveReplication.java
  src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java
  src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java
  src/test/java/org/apache/hadoop/hbase/thrift2/TestThriftHBaseServiceHandler.java
  src/test/java/org/apache/hadoop/hbase/util/TestMergeTool.java
  src/test/java/org/apache/hadoop/hbase/util/TestRegionSplitter.java
  src/test/java/org/apache/hadoop/hbase/zookeeper/TestHQuorumPeer.java
  src/test/java/org/apache/hadoop/hbase/zookeeper/TestZooKeeperMainServerArg.java


 Use a random ZK client port in unit tests so we can run them in parallel
 

 Key: HBASE-4746
 URL: https://issues.apache.org/jira/browse/HBASE-4746
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, 
 D279-trunk-v5.txt, D279.1.patch, D279.2.patch, D279.3.patch, D279.4.patch, 
 D279.5.patch, D279.6.patch


 The hard-coded ZK client port has long been a problem for running the HBase test 
 suite in parallel. The mini ZK cluster should run on a random free port, and 
 that port should be passed to all parts of the unit tests that need to talk 
 to the mini cluster. In fact, randomizing the port exposes a lot of places in 
 the code where a new configuration is instantiated, and as a result the 
 client tries to talk to the default ZK client port and times out.
 The initial fix is for 0.89-fb, where it already allows unit tests to run in 
 parallel in 10 minutes. A fix for the trunk will follow.





[jira] [Commented] (HBASE-4742) Split dead server's log in parallel

2011-11-06 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145200#comment-13145200
 ] 

Phabricator commented on HBASE-4742:


khemani has commented on the revision [jira] [HBASE-4742] Split dead server's 
log in parallel.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java:306 Why use a 
SynchronousQueue? Won't that cause the task-submitting thread to block if 
the thread pool is at capacity? (Not that we will hit this issue with the 
default capacity set at 1000.)
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java:313 I am not very 
sure about the ThreadPoolExecutor's behavior. Is it possible that, while 
waiting for a task, the thread pool keeps its threads in daemon mode, and that 
a thread is made into a non-daemon thread before it is chosen to execute a task?

REVISION DETAIL
  https://reviews.facebook.net/D237
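
Both inline questions can be checked against the JDK with a small experiment. 
The sketch below is not the code under review: the class name and the pool 
size of 2 are made up for illustration. It relies on two documented JDK 
behaviors: a ThreadPoolExecutor backed by a SynchronousQueue does not block 
the submitter when saturated but, under the default AbortPolicy, throws 
RejectedExecutionException; and a thread's daemon flag is set by the 
ThreadFactory before the thread starts and cannot be changed afterwards.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo of a saturated pool that hands tasks off through a SynchronousQueue. */
final class SynchronousQueuePoolDemo {

    /** Returns true if submitting to a saturated pool is rejected instead of blocking. */
    static boolean rejectsWhenSaturated() {
        // The daemon flag is decided here, before the worker thread is started.
        ThreadFactory daemonFactory = runnable -> {
            Thread t = new Thread(runnable);
            t.setDaemon(true);
            return t;
        };
        // Illustrative sizing: at most 2 threads, idle threads retire after 1 second.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 1, TimeUnit.SECONDS, new SynchronousQueue<Runnable>(), daemonFactory);
        pool.allowCoreThreadTimeOut(true);

        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        try {
            pool.execute(blocker); // occupies the first thread
            pool.execute(blocker); // occupies the second thread
            pool.execute(blocker); // no idle thread and no queue capacity
            return false;
        } catch (RejectedExecutionException expected) {
            return true;
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected when saturated: " + rejectsWhenSaturated());
    }
}
```

If blocking or queuing is preferred over rejection, a different queue or a 
custom RejectedExecutionHandler would be needed.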


 Split dead server's log in parallel
 ---

 Key: HBASE-4742
 URL: https://issues.apache.org/jira/browse/HBASE-4742
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: D237.1.patch, D237.2.patch, D237.3.patch, D237.4.patch, 
 D237.5.patch, D237.6.patch


 When one region server goes down, the master will shutdown the region server 
 and split its log.
 However, splitting log is a blocking call and it would take some time.
 If more than one region server go down, the master will split its log one by 
 one, which is not efficient.
 Since we have the distributed log split, we could split these logs from the 
 dead servers in parallel. 





[jira] [Updated] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-4746:
---

Attachment: D279.7.patch

mbautin updated the revision [jira] [HBASE-4746] Use a random ZK client port 
in unit tests so we can run them in parallel.
Reviewers: tedyu, JIRA

  Please ignore the previous update, some unrelated changes got into the patch. 
Creating a new patch that applies on top of the trunk. Sorry for spam.

REVISION DETAIL
  https://reviews.facebook.net/D279

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/HConstants.java
  src/main/java/org/apache/hadoop/hbase/client/HTable.java
  src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.java
  src/main/java/org/apache/hadoop/hbase/master/HMasterCommandLine.java
  src/main/java/org/apache/hadoop/hbase/thrift2/ThriftHBaseServiceHandler.java
  src/main/java/org/apache/hadoop/hbase/thrift2/ThriftServer.java
  src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
  src/main/java/org/apache/hadoop/hbase/util/RegionSplitter.java
  src/main/java/org/apache/hadoop/hbase/zookeeper/MiniZooKeeperCluster.java
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZKConfig.java
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java
  src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
  src/test/java/org/apache/hadoop/hbase/PerformanceEvaluation.java
  src/test/java/org/apache/hadoop/hbase/TestHBaseTestingUtility.java
  src/test/java/org/apache/hadoop/hbase/TestZooKeeper.java
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorExceptionWithAbort.java
  src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorExceptionWithRemove.java
  src/test/java/org/apache/hadoop/hbase/replication/TestMasterReplication.java
  src/test/java/org/apache/hadoop/hbase/replication/TestMultiSlaveReplication.java
  src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java
  src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java
  src/test/java/org/apache/hadoop/hbase/thrift2/TestThriftHBaseServiceHandler.java
  src/test/java/org/apache/hadoop/hbase/util/TestMergeTool.java
  src/test/java/org/apache/hadoop/hbase/util/TestRegionSplitter.java
  src/test/java/org/apache/hadoop/hbase/zookeeper/TestHQuorumPeer.java
  src/test/java/org/apache/hadoop/hbase/zookeeper/TestZooKeeperMainServerArg.java


 Use a random ZK client port in unit tests so we can run them in parallel
 

 Key: HBASE-4746
 URL: https://issues.apache.org/jira/browse/HBASE-4746
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, 
 D279-trunk-v5.txt, D279.1.patch, D279.2.patch, D279.3.patch, D279.4.patch, 
 D279.5.patch, D279.6.patch, D279.7.patch


 The hard-coded ZK client port has long been a problem for running the HBase test 
 suite in parallel. The mini ZK cluster should run on a random free port, and 
 that port should be passed to all parts of the unit tests that need to talk 
 to the mini cluster. In fact, randomizing the port exposes a lot of places in 
 the code where a new configuration is instantiated, and as a result the 
 client tries to talk to the default ZK client port and times out.
 The initial fix is for 0.89-fb, where it already allows unit tests to run in 
 parallel in 10 minutes. A fix for the trunk will follow.





[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145204#comment-13145204
 ] 

Hadoop QA commented on HBASE-4746:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12502713/D279.7.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 45 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/205//console

This message is automatically generated.

 Use a random ZK client port in unit tests so we can run them in parallel
 

 Key: HBASE-4746
 URL: https://issues.apache.org/jira/browse/HBASE-4746
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, 
 D279-trunk-v5.txt, D279.1.patch, D279.2.patch, D279.3.patch, D279.4.patch, 
 D279.5.patch, D279.6.patch, D279.7.patch


 The hard-coded ZK client port has long been a problem for running the HBase test 
 suite in parallel. The mini ZK cluster should run on a random free port, and 
 that port should be passed to all parts of the unit tests that need to talk 
 to the mini cluster. In fact, randomizing the port exposes a lot of places in 
 the code where a new configuration is instantiated, and as a result the 
 client tries to talk to the default ZK client port and times out.
 The initial fix is for 0.89-fb, where it already allows unit tests to run in 
 parallel in 10 minutes. A fix for the trunk will follow.





[jira] [Commented] (HBASE-4511) There is data loss when master failovers

2011-11-06 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145227#comment-13145227
 ] 

ramkrishna.s.vasudevan commented on HBASE-4511:
---

@Stack
Patch looks fine. I have one suggestion
{code}
+  if (isExpiring(expiredServer, currentMetaServer) ||
+  expireIfOnline(currentMetaServer)) {
+// We are expiring the server that is carrying meta because unreachable
+// The expiration processing will take care of reassigning meta.
+  }
{code}

As you clearly explained, if we are already expiring a server while assigning 
meta, then we will not expire it once again. 
So can we rename isExpiring to isAlreadyExpiring()?  Also, can we split the 
condition, because currently the if block is empty?
So we can add isAlreadyExpiring() and if true we can go with expireIfOnline.  
Just a thought.  You can decide Stack. :)
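
One possible reading of this suggestion is sketched below. Everything here is 
hypothetical: the class, the string-based server names, and the method bodies 
are stand-ins, since only the method names come from the patch. The sketch 
keeps the short-circuit behavior of the original `||` (expireIfOnline is not 
called when the server is already expiring) while avoiding an empty if block.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of splitting the combined condition into two explicit steps. */
final class ExpireCheckSketch {
    private final Set<String> online = new HashSet<>();

    void markOnline(String server) {
        online.add(server);
    }

    boolean isOnline(String server) {
        return online.contains(server);
    }

    /** Stand-in check: the server being expired is the one carrying meta. */
    boolean isAlreadyExpiring(String expiredServer, String currentMetaServer) {
        return expiredServer != null && expiredServer.equals(currentMetaServer);
    }

    /** Stand-in: expires the server only if it is currently online. */
    boolean expireIfOnline(String server) {
        return server != null && online.remove(server);
    }

    /** Returns true if the meta server is being expired, either already or as of this call. */
    boolean ensureMetaServerExpiring(String expiredServer, String currentMetaServer) {
        if (isAlreadyExpiring(expiredServer, currentMetaServer)) {
            // Expiration processing is already under way; do not expire again.
            return true;
        }
        return expireIfOnline(currentMetaServer);
    }

    public static void main(String[] args) {
        ExpireCheckSketch sketch = new ExpireCheckSketch();
        sketch.markOnline("rs1");
        System.out.println(sketch.ensureMetaServerExpiring("rs1", "rs1"));
        System.out.println(sketch.ensureMetaServerExpiring(null, "rs1"));
    }
}
```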

 There is data loss when master failovers
 

 Key: HBASE-4511
 URL: https://issues.apache.org/jira/browse/HBASE-4511
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0
Reporter: gaojinchao
Assignee: stack
Priority: Minor
 Fix For: 0.92.0

 Attachments: 4511-v2.txt, 4511.txt, 
 org.apache.hadoop.hbase.master.TestMasterFailover-output.rar, sketch.txt


 It goes like this:
 The master crashed; at the same time the RS carrying meta was crashing, but 
 the RS didn't exit.
 The master starts up again and finds all living RSs. 
 The master's verification of meta fails because this RS is crashing.
 The master reassigns meta, but it doesn't split the HLog. 
 So some meta data is lost.
 The logs from the failing failover test case follow. 
 //It said that we want to kill a RS
 2011-09-28 19:54:45,694 INFO  [Thread-988] regionserver.HRegionServer(1443): 
 STOPPED: Killing for unit test
 2011-09-28 19:54:45,694 INFO  [Thread-988] master.TestMasterFailover(1007): 
 RS 192.168.2.102,54385,1317264874629 killed 
 //Rs didn't crash. 
 2011-09-28 19:54:51,763 INFO  [Master:0;192.168.2.102,54557,1317264885720] 
 master.HMaster(458): Registering server found up in zk: 
 192.168.2.102,54385,1317264874629
 2011-09-28 19:54:51,763 INFO  [Master:0;192.168.2.102,54557,1317264885720] 
 master.ServerManager(232): Registering 
 server=192.168.2.102,54385,1317264874629
 2011-09-28 19:54:51,770 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 zookeeper.ZKUtil(491): master:54557-0x132b31adbb30005 Unable to get data of 
 znode /hbase/unassigned/1028785192 because node does not exist (not an error)
 2011-09-28 19:54:51,771 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) 
 of data from znode /hbase/root-region-server and set watcher; 
 192.168.2.102,54383,131726487...
 //Meta verification failed and the master reassigned meta. So all the regions 
 in meta are lost.
 2011-09-28 19:54:51,773 INFO  [Master:0;192.168.2.102,54557,1317264885720] 
 catalog.CatalogTracker(476): Failed verification of .META.,,1 at 
 address=192.168.2.102,54385,1317264874629; 
 org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: 
 org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 
 192.168.2.102,54385,1317264874629 not running, aborting
 2011-09-28 19:54:51,773 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 catalog.CatalogTracker(316): new .META. server: 
 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null
 2011-09-28 19:54:52,274 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) 
 of data from znode /hbase/root-region-server and set watcher; 
 192.168.2.102,54383,131726487...
 2011-09-28 19:54:52,277 INFO  [Master:0;192.168.2.102,54557,1317264885720] 
 catalog.CatalogTracker(476): Failed verification of .META.,,1 at 
 address=192.168.2.102,54385,1317264874629; 
 org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: 
 org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 
 192.168.2.102,54385,1317264874629 not running, aborting
 2011-09-28 19:54:52,277 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 catalog.CatalogTracker(316): new .META. server: 
 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null
 2011-09-28 19:54:52,778 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) 
 of data from znode /hbase/root-region-server and set watcher; 
 192.168.2.102,54383,131726487...
 2011-09-28 19:54:52,782 INFO  [Master:0;192.168.2.102,54557,1317264885720] 
 catalog.CatalogTracker(476): Failed verification of .META.,,1 at 
 address=192.168.2.102,54385,1317264874629; 
 

[jira] [Created] (HBASE-4754) FSTableDescriptors.getTableInfoPath() should be able to handle FileNotFoundException

2011-11-06 Thread Ted Yu (Created) (JIRA)
FSTableDescriptors.getTableInfoPath() should be able to handle 
FileNotFoundException


 Key: HBASE-4754
 URL: https://issues.apache.org/jira/browse/HBASE-4754
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.92.0


As reported by Roman in the thread entitled 'HBase 0.92/Hadoop 0.22 test 
results', table creation would result in the following if hadoop 0.22 is the 
underlying platform:
{code}
11/11/05 19:08:48 INFO handler.CreateTableHandler: Attemping to create the table b
11/11/05 19:08:48 ERROR handler.CreateTableHandler: Error trying to create the table b
java.io.FileNotFoundException: File hdfs://ip-10-110-254-200.ec2.internal:17020/hbase/b does not exist.
   at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:387)
   at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1085)
   at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1110)
   at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:257)
   at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:243)
   at org.apache.hadoop.hbase.util.FSTableDescriptors.createTableDescriptor(FSTableDescriptors.java:566)
   at org.apache.hadoop.hbase.util.FSTableDescriptors.createTableDescriptor(FSTableDescriptors.java:535)
   at org.apache.hadoop.hbase.util.FSTableDescriptors.createTableDescriptor(FSTableDescriptors.java:519)
   at org.apache.hadoop.hbase.master.handler.CreateTableHandler.handleCreateTable(CreateTableHandler.java:140)
   at org.apache.hadoop.hbase.master.handler.CreateTableHandler.process(CreateTableHandler.java:126)
   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:168)
   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:619)
{code}
This was due to how DistributedFileSystem.listStatus() in 0.22 handles a 
non-existent directory:
{code}
  @Override
  public FileStatus[] listStatus(Path p) throws IOException {
    String src = getPathName(p);

    // fetch the first batch of entries in the directory
    DirectoryListing thisListing = dfs.listPaths(
        src, HdfsFileStatus.EMPTY_NAME);

    if (thisListing == null) { // the directory does not exist
      throw new FileNotFoundException("File " + p + " does not exist.");
    }
{code}
So in FSTableDescriptors.getTableInfoPath(), we should catch 
FileNotFoundException and treat it the same way as status being null.
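
The proposed handling could look roughly like the sketch below. It is not the 
attached patch: DirLister is a hypothetical stand-in for a 
FileSystem.listStatus(Path)-style call, and plain strings replace FileStatus 
so that the example is self-contained. The only point it illustrates is that 
a FileNotFoundException is mapped to the same result as a null listing.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical sketch: treat a missing directory like a null listing. */
final class TableInfoLookupSketch {

    /** Stand-in for a listStatus-style call that may return null or throw. */
    interface DirLister {
        String[] list(String dir) throws IOException;
    }

    /**
     * Returns the directory listing, or null when the directory does not exist,
     * whether the lister signals that with null or with FileNotFoundException.
     */
    static String[] listOrNull(DirLister lister, String dir) {
        try {
            return lister.list(dir);
        } catch (FileNotFoundException e) {
            // Newer listers throw instead of returning null; handle both the same way.
            return null;
        } catch (IOException e) {
            // Any other I/O failure is still a real error.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] missing = listOrNull(d -> {
            throw new FileNotFoundException("File " + d + " does not exist.");
        }, "/hbase/b");
        System.out.println("missing directory yields null: " + (missing == null));
    }
}
```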





[jira] [Updated] (HBASE-4754) FSTableDescriptors.getTableInfoPath() should be able to handle FileNotFoundException

2011-11-06 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4754:
--

Attachment: 4754.txt

 FSTableDescriptors.getTableInfoPath() should be able to handle 
 FileNotFoundException
 

 Key: HBASE-4754
 URL: https://issues.apache.org/jira/browse/HBASE-4754
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.92.0

 Attachments: 4754.txt


 As reported by Roman in the thread entitled 'HBase 0.92/Hadoop 0.22 test 
 results', table creation would result in the following if hadoop 0.22 is the 
 underlying platform:
 {code}
 11/11/05 19:08:48 INFO handler.CreateTableHandler: Attemping to create the table b
 11/11/05 19:08:48 ERROR handler.CreateTableHandler: Error trying to create the table b
 java.io.FileNotFoundException: File hdfs://ip-10-110-254-200.ec2.internal:17020/hbase/b does not exist.
    at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:387)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1085)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1110)
    at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:257)
    at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:243)
    at org.apache.hadoop.hbase.util.FSTableDescriptors.createTableDescriptor(FSTableDescriptors.java:566)
    at org.apache.hadoop.hbase.util.FSTableDescriptors.createTableDescriptor(FSTableDescriptors.java:535)
    at org.apache.hadoop.hbase.util.FSTableDescriptors.createTableDescriptor(FSTableDescriptors.java:519)
    at org.apache.hadoop.hbase.master.handler.CreateTableHandler.handleCreateTable(CreateTableHandler.java:140)
    at org.apache.hadoop.hbase.master.handler.CreateTableHandler.process(CreateTableHandler.java:126)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:168)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
 {code}
 This was due to how DistributedFileSystem.listStatus() in 0.22 handles a 
 non-existent directory:
 {code}
   @Override
   public FileStatus[] listStatus(Path p) throws IOException {
     String src = getPathName(p);
     // fetch the first batch of entries in the directory
     DirectoryListing thisListing = dfs.listPaths(
         src, HdfsFileStatus.EMPTY_NAME);
     if (thisListing == null) { // the directory does not exist
       throw new FileNotFoundException("File " + p + " does not exist.");
     }
 {code}
 So in FSTableDescriptors.getTableInfoPath(), we should catch 
 FileNotFoundException and treat it the same way as status being null.





[jira] [Updated] (HBASE-4754) FSTableDescriptors.getTableInfoPath() should be able to handle FileNotFoundException

2011-11-06 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4754:
--

Status: Patch Available  (was: Open)

 FSTableDescriptors.getTableInfoPath() should be able to handle 
 FileNotFoundException
 

 Key: HBASE-4754
 URL: https://issues.apache.org/jira/browse/HBASE-4754
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.92.0

 Attachments: 4754.txt


 As reported by Roman in the thread entitled 'HBase 0.92/Hadoop 0.22 test 
 results', table creation would result in the following if hadoop 0.22 is the 
 underlying platform:
 {code}
 11/11/05 19:08:48 INFO handler.CreateTableHandler: Attemping to create the table b
 11/11/05 19:08:48 ERROR handler.CreateTableHandler: Error trying to create the table b
 java.io.FileNotFoundException: File hdfs://ip-10-110-254-200.ec2.internal:17020/hbase/b does not exist.
     at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:387)
     at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1085)
     at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1110)
     at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:257)
     at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:243)
     at org.apache.hadoop.hbase.util.FSTableDescriptors.createTableDescriptor(FSTableDescriptors.java:566)
     at org.apache.hadoop.hbase.util.FSTableDescriptors.createTableDescriptor(FSTableDescriptors.java:535)
     at org.apache.hadoop.hbase.util.FSTableDescriptors.createTableDescriptor(FSTableDescriptors.java:519)
     at org.apache.hadoop.hbase.master.handler.CreateTableHandler.handleCreateTable(CreateTableHandler.java:140)
     at org.apache.hadoop.hbase.master.handler.CreateTableHandler.process(CreateTableHandler.java:126)
     at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:168)
     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
     at java.lang.Thread.run(Thread.java:619)
 {code}
 This was due to how DistributedFileSystem.listStatus() in 0.22 handles a 
 non-existent directory:
 {code}
   @Override
   public FileStatus[] listStatus(Path p) throws IOException {
     String src = getPathName(p);
     // fetch the first batch of entries in the directory
     DirectoryListing thisListing = dfs.listPaths(
         src, HdfsFileStatus.EMPTY_NAME);
     if (thisListing == null) { // the directory does not exist
       throw new FileNotFoundException("File " + p + " does not exist.");
     }
 {code}
 So in FSTableDescriptors.getTableInfoPath(), we should catch 
 FileNotFoundException and treat it the same way as status being null.





[jira] [Updated] (HBASE-4748) Restarting the cluster after alter table(online) completely losses the table information

2011-11-06 Thread ramkrishna.s.vasudevan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-4748:
--

Summary: Restarting the cluster after alter table(online) completely losses 
the table information  (was: Restart the cluster after alter table(online) 
completely losses the table information)

 Restarting the cluster after alter table(online) completely losses the table 
 information
 

 Key: HBASE-4748
 URL: https://issues.apache.org/jira/browse/HBASE-4748
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan

 1. Start a cluster.
 2. Alter a table
 3. Restart the master using ./hbase-daemon.sh restart master
 4. Kill the RS after master restarts.
 5. Start RS again.
 6. No table operations can be performed on the table that was altered but 
 admin.listTables() is able to list the altered table.





[jira] [Updated] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4746:
--

Status: Open  (was: Patch Available)

 Use a random ZK client port in unit tests so we can run them in parallel
 

 Key: HBASE-4746
 URL: https://issues.apache.org/jira/browse/HBASE-4746
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, 
 D279-trunk-v5.txt, D279.1.patch, D279.2.patch, D279.3.patch, D279.4.patch, 
 D279.5.patch, D279.6.patch, D279.7.patch


 The hard-coded ZK client port has long been a problem for running HBase test 
 suite in parallel. The mini ZK cluster should run on a random free port, and 
 that port should be passed to all parts of the unit tests that need to talk 
 to the mini cluster. In fact, randomizing the port exposes a lot of places in 
 the code where a new configuration is instantiated, and as a result the 
 client tries to talk to the default ZK client port and times out.
 The initial fix is for 0.89-fb, where it already allows running the unit tests 
 in parallel in 10 minutes. A fix for the trunk will follow.
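
 One common way to obtain a random free port is to bind to port 0 and let the 
 OS choose. The sketch below illustrates that idea only; it is not taken from 
 the attached patches. FreePort and testConfig are hypothetical names, 
 java.util.Properties stands in for the HBase Configuration object, and the 
 key hbase.zookeeper.property.clientPort is assumed to be the setting that 
 clients read.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.Properties;

class FreePort {
    // Binds to port 0 so the OS assigns an unused ephemeral port, then releases it.
    static int findFreePort() throws IOException {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        }
    }

    // Builds one shared config carrying the chosen port (Properties is a stand-in
    // for the real configuration class; the key name is an assumption).
    static Properties testConfig() throws IOException {
        Properties conf = new Properties();
        conf.setProperty("hbase.zookeeper.property.clientPort",
                Integer.toString(findFreePort()));
        return conf;
    }
}
```

 The port is released before the mini cluster binds it, so another process 
 could in principle grab it in between; a retry on bind failure would cover 
 that. Every test component should receive this one config object rather than 
 instantiating a fresh one, which would fall back to the default port.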





[jira] [Updated] (HBASE-4748) Restarting the cluster after alter table (online) completely loses the table information

2011-11-06 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4748:
--

Summary: Restarting the cluster after alter table (online) completely loses 
the table information  (was: Restarting the cluster after alter table(online) 
completely losses the table information)

 Restarting the cluster after alter table (online) completely loses the table 
 information
 

 Key: HBASE-4748
 URL: https://issues.apache.org/jira/browse/HBASE-4748
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan

 1. Start a cluster.
 2. Alter a table
 3. Restart the master using ./hbase-daemon.sh restart master
 4. Kill the RS after master restarts.
 5. Start RS again.
 6. No table operations can be performed on the table that was altered but 
 admin.listTables() is able to list the altered table.





[jira] [Updated] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4746:
--

Attachment: D279-trunk-v7.txt

Same as D279.v7

 Use a random ZK client port in unit tests so we can run them in parallel
 

 Key: HBASE-4746
 URL: https://issues.apache.org/jira/browse/HBASE-4746
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, 
 D279-trunk-v5.txt, D279-trunk-v7.txt, D279.1.patch, D279.2.patch, 
 D279.3.patch, D279.4.patch, D279.5.patch, D279.6.patch, D279.7.patch


 The hard-coded ZK client port has long been a problem for running HBase test 
 suite in parallel. The mini ZK cluster should run on a random free port, and 
 that port should be passed to all parts of the unit tests that need to talk 
 to the mini cluster. In fact, randomizing the port exposes a lot of places in 
 the code where a new configuration is instantiated, and as a result the 
 client tries to talk to the default ZK client port and times out.
 The initial fix is for 0.89-fb, where it already allows running the unit tests 
 in parallel in 10 minutes. A fix for the trunk will follow.





[jira] [Updated] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4746:
--

Status: Patch Available  (was: Open)

 Use a random ZK client port in unit tests so we can run them in parallel
 

 Key: HBASE-4746
 URL: https://issues.apache.org/jira/browse/HBASE-4746
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, 
 D279-trunk-v5.txt, D279-trunk-v7.txt, D279.1.patch, D279.2.patch, 
 D279.3.patch, D279.4.patch, D279.5.patch, D279.6.patch, D279.7.patch


 The hard-coded ZK client port has long been a problem for running HBase test 
 suite in parallel. The mini ZK cluster should run on a random free port, and 
 that port should be passed to all parts of the unit tests that need to talk 
 to the mini cluster. In fact, randomizing the port exposes a lot of places in 
 the code where a new configuration is instantiated, and as a result the 
 client tries to talk to the default ZK client port and times out.
 The initial fix is for 0.89-fb, where it already allows running the unit tests 
 in parallel in 10 minutes. A fix for the trunk will follow.





[jira] [Updated] (HBASE-4754) FSTableDescriptors.getTableInfoPath() should be able to handle FileNotFoundException

2011-11-06 Thread Ted Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4754:
--

 Priority: Blocker  (was: Major)
Affects Version/s: 0.92.0

 FSTableDescriptors.getTableInfoPath() should be able to handle 
 FileNotFoundException
 

 Key: HBASE-4754
 URL: https://issues.apache.org/jira/browse/HBASE-4754
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Blocker
 Fix For: 0.92.0

 Attachments: 4754.txt


 As reported by Roman in the thread entitled 'HBase 0.92/Hadoop 0.22 test 
 results', table creation would result in the following if hadoop 0.22 is the 
 underlying platform:
 {code}
 11/11/05 19:08:48 INFO handler.CreateTableHandler: Attemping to create the table b
 11/11/05 19:08:48 ERROR handler.CreateTableHandler: Error trying to create the table b
 java.io.FileNotFoundException: File hdfs://ip-10-110-254-200.ec2.internal:17020/hbase/b does not exist.
     at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:387)
     at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1085)
     at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1110)
     at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:257)
     at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:243)
     at org.apache.hadoop.hbase.util.FSTableDescriptors.createTableDescriptor(FSTableDescriptors.java:566)
     at org.apache.hadoop.hbase.util.FSTableDescriptors.createTableDescriptor(FSTableDescriptors.java:535)
     at org.apache.hadoop.hbase.util.FSTableDescriptors.createTableDescriptor(FSTableDescriptors.java:519)
     at org.apache.hadoop.hbase.master.handler.CreateTableHandler.handleCreateTable(CreateTableHandler.java:140)
     at org.apache.hadoop.hbase.master.handler.CreateTableHandler.process(CreateTableHandler.java:126)
     at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:168)
     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
     at java.lang.Thread.run(Thread.java:619)
 {code}
 This was due to how DistributedFileSystem.listStatus() in 0.22 handles a 
 non-existent directory:
 {code}
   @Override
   public FileStatus[] listStatus(Path p) throws IOException {
     String src = getPathName(p);
     // fetch the first batch of entries in the directory
     DirectoryListing thisListing = dfs.listPaths(
         src, HdfsFileStatus.EMPTY_NAME);
     if (thisListing == null) { // the directory does not exist
       throw new FileNotFoundException("File " + p + " does not exist.");
     }
 {code}
 So in FSTableDescriptors.getTableInfoPath(), we should catch 
 FileNotFoundException and treat it the same way as status being null.





[jira] [Commented] (HBASE-4511) There is data loss when master failovers

2011-11-06 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145239#comment-13145239
 ] 

ramkrishna.s.vasudevan commented on HBASE-4511:
---

bq. So we can add isAlreadyExpiring() and if true

Sorry, it should be:
So we can add isAlreadyExpiring(), and if false we can go with expireIfOnline

 There is data loss when master failovers
 

 Key: HBASE-4511
 URL: https://issues.apache.org/jira/browse/HBASE-4511
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0
Reporter: gaojinchao
Assignee: stack
Priority: Minor
 Fix For: 0.92.0

 Attachments: 4511-v2.txt, 4511.txt, 
 org.apache.hadoop.hbase.master.TestMasterFailover-output.rar, sketch.txt


 It goes like this:
 The master crashes; at the same time the RS hosting meta is crashing, but the 
 RS doesn't exit.
 The master starts up again and finds all living RSs.
 The master's verification of meta fails because this RS is crashing.
 The master reassigns meta, but it doesn't split the HLog.
 So some meta data is lost.
 Below are the logs from a failing failover test case.
 //The test reports that it wants to kill an RS
 2011-09-28 19:54:45,694 INFO  [Thread-988] regionserver.HRegionServer(1443): 
 STOPPED: Killing for unit test
 2011-09-28 19:54:45,694 INFO  [Thread-988] master.TestMasterFailover(1007): 
 RS 192.168.2.102,54385,1317264874629 killed 
 //The RS didn't crash.
 2011-09-28 19:54:51,763 INFO  [Master:0;192.168.2.102,54557,1317264885720] 
 master.HMaster(458): Registering server found up in zk: 
 192.168.2.102,54385,1317264874629
 2011-09-28 19:54:51,763 INFO  [Master:0;192.168.2.102,54557,1317264885720] 
 master.ServerManager(232): Registering 
 server=192.168.2.102,54385,1317264874629
 2011-09-28 19:54:51,770 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 zookeeper.ZKUtil(491): master:54557-0x132b31adbb30005 Unable to get data of 
 znode /hbase/unassigned/1028785192 because node does not exist (not an error)
 2011-09-28 19:54:51,771 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) 
 of data from znode /hbase/root-region-server and set watcher; 
 192.168.2.102,54383,131726487...
 //Meta verification failed and the master reassigned meta, so all the regions 
 in meta are lost.
 2011-09-28 19:54:51,773 INFO  [Master:0;192.168.2.102,54557,1317264885720] 
 catalog.CatalogTracker(476): Failed verification of .META.,,1 at 
 address=192.168.2.102,54385,1317264874629; 
 org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: 
 org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 
 192.168.2.102,54385,1317264874629 not running, aborting
 2011-09-28 19:54:51,773 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 catalog.CatalogTracker(316): new .META. server: 
 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null
 2011-09-28 19:54:52,274 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) 
 of data from znode /hbase/root-region-server and set watcher; 
 192.168.2.102,54383,131726487...
 2011-09-28 19:54:52,277 INFO  [Master:0;192.168.2.102,54557,1317264885720] 
 catalog.CatalogTracker(476): Failed verification of .META.,,1 at 
 address=192.168.2.102,54385,1317264874629; 
 org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: 
 org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 
 192.168.2.102,54385,1317264874629 not running, aborting
 2011-09-28 19:54:52,277 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 catalog.CatalogTracker(316): new .META. server: 
 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null
 2011-09-28 19:54:52,778 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) 
 of data from znode /hbase/root-region-server and set watcher; 
 192.168.2.102,54383,131726487...
 2011-09-28 19:54:52,782 INFO  [Master:0;192.168.2.102,54557,1317264885720] 
 catalog.CatalogTracker(476): Failed verification of .META.,,1 at 
 address=192.168.2.102,54385,1317264874629; 
 org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: 
 org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 
 192.168.2.102,54385,1317264874629 not running, aborting
 2011-09-28 19:54:52,782 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 catalog.CatalogTracker(316): new .META. server: 
 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null
 2011-09-28 19:54:52,782 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 zookeeper.ZKAssign(264): master:54557-0x132b31adbb30005 Creating (or 
 updating) unassigned node for 

[jira] [Commented] (HBASE-4511) There is data loss when master failovers

2011-11-06 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145242#comment-13145242
 ] 

stack commented on HBASE-4511:
--

I like your suggestion @Ram.  New patch to follow.

 There is data loss when master failovers
 

 Key: HBASE-4511
 URL: https://issues.apache.org/jira/browse/HBASE-4511
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.0
Reporter: gaojinchao
Assignee: stack
Priority: Minor
 Fix For: 0.92.0

 Attachments: 4511-v2.txt, 4511.txt, 
 org.apache.hadoop.hbase.master.TestMasterFailover-output.rar, sketch.txt


 It goes like this:
 The master crashes; at the same time the RS hosting meta is crashing, but the 
 RS doesn't exit.
 The master starts up again and finds all living RSs.
 The master's verification of meta fails because this RS is crashing.
 The master reassigns meta, but it doesn't split the HLog.
 So some meta data is lost.
 Below are the logs from a failing failover test case.
 //The test reports that it wants to kill an RS
 2011-09-28 19:54:45,694 INFO  [Thread-988] regionserver.HRegionServer(1443): 
 STOPPED: Killing for unit test
 2011-09-28 19:54:45,694 INFO  [Thread-988] master.TestMasterFailover(1007): 
 RS 192.168.2.102,54385,1317264874629 killed 
 //The RS didn't crash.
 2011-09-28 19:54:51,763 INFO  [Master:0;192.168.2.102,54557,1317264885720] 
 master.HMaster(458): Registering server found up in zk: 
 192.168.2.102,54385,1317264874629
 2011-09-28 19:54:51,763 INFO  [Master:0;192.168.2.102,54557,1317264885720] 
 master.ServerManager(232): Registering 
 server=192.168.2.102,54385,1317264874629
 2011-09-28 19:54:51,770 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 zookeeper.ZKUtil(491): master:54557-0x132b31adbb30005 Unable to get data of 
 znode /hbase/unassigned/1028785192 because node does not exist (not an error)
 2011-09-28 19:54:51,771 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) 
 of data from znode /hbase/root-region-server and set watcher; 
 192.168.2.102,54383,131726487...
 //Meta verification failed and the master reassigned meta, so all the regions 
 in meta are lost.
 2011-09-28 19:54:51,773 INFO  [Master:0;192.168.2.102,54557,1317264885720] 
 catalog.CatalogTracker(476): Failed verification of .META.,,1 at 
 address=192.168.2.102,54385,1317264874629; 
 org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: 
 org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 
 192.168.2.102,54385,1317264874629 not running, aborting
 2011-09-28 19:54:51,773 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 catalog.CatalogTracker(316): new .META. server: 
 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null
 2011-09-28 19:54:52,274 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) 
 of data from znode /hbase/root-region-server and set watcher; 
 192.168.2.102,54383,131726487...
 2011-09-28 19:54:52,277 INFO  [Master:0;192.168.2.102,54557,1317264885720] 
 catalog.CatalogTracker(476): Failed verification of .META.,,1 at 
 address=192.168.2.102,54385,1317264874629; 
 org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: 
 org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 
 192.168.2.102,54385,1317264874629 not running, aborting
 2011-09-28 19:54:52,277 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 catalog.CatalogTracker(316): new .META. server: 
 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null
 2011-09-28 19:54:52,778 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) 
 of data from znode /hbase/root-region-server and set watcher; 
 192.168.2.102,54383,131726487...
 2011-09-28 19:54:52,782 INFO  [Master:0;192.168.2.102,54557,1317264885720] 
 catalog.CatalogTracker(476): Failed verification of .META.,,1 at 
 address=192.168.2.102,54385,1317264874629; 
 org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: 
 org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 
 192.168.2.102,54385,1317264874629 not running, aborting
 2011-09-28 19:54:52,782 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 catalog.CatalogTracker(316): new .META. server: 
 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null
 2011-09-28 19:54:52,782 DEBUG [Master:0;192.168.2.102,54557,1317264885720] 
 zookeeper.ZKAssign(264): master:54557-0x132b31adbb30005 Creating (or 
 updating) unassigned node for 1028785192 with OFFLINE state
 2011-09-28 19:54:52,825 DEBUG [Thread-988-EventThread] 
 zookeeper.ZooKeeperWatcher(233): 

[jira] [Commented] (HBASE-4754) FSTableDescriptors.getTableInfoPath() should be able to handle FileNotFoundException

2011-11-06 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145243#comment-13145243
 ] 

stack commented on HBASE-4754:
--

Ted, I'd say we should check the whole HBase codebase.  We'll likely see other 
forms of this issue.

 FSTableDescriptors.getTableInfoPath() should be able to handle 
 FileNotFoundException
 

 Key: HBASE-4754
 URL: https://issues.apache.org/jira/browse/HBASE-4754
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Blocker
 Fix For: 0.92.0

 Attachments: 4754.txt


 As reported by Roman in the thread entitled 'HBase 0.92/Hadoop 0.22 test 
 results', table creation would result in the following if hadoop 0.22 is the 
 underlying platform:
 {code}
 11/11/05 19:08:48 INFO handler.CreateTableHandler: Attemping to create the table b
 11/11/05 19:08:48 ERROR handler.CreateTableHandler: Error trying to create the table b
 java.io.FileNotFoundException: File hdfs://ip-10-110-254-200.ec2.internal:17020/hbase/b does not exist.
     at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:387)
     at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1085)
     at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1110)
     at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:257)
     at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:243)
     at org.apache.hadoop.hbase.util.FSTableDescriptors.createTableDescriptor(FSTableDescriptors.java:566)
     at org.apache.hadoop.hbase.util.FSTableDescriptors.createTableDescriptor(FSTableDescriptors.java:535)
     at org.apache.hadoop.hbase.util.FSTableDescriptors.createTableDescriptor(FSTableDescriptors.java:519)
     at org.apache.hadoop.hbase.master.handler.CreateTableHandler.handleCreateTable(CreateTableHandler.java:140)
     at org.apache.hadoop.hbase.master.handler.CreateTableHandler.process(CreateTableHandler.java:126)
     at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:168)
     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
     at java.lang.Thread.run(Thread.java:619)
 {code}
 This was due to how DistributedFileSystem.listStatus() in 0.22 handles a 
 non-existent directory:
 {code}
   @Override
   public FileStatus[] listStatus(Path p) throws IOException {
     String src = getPathName(p);
     // fetch the first batch of entries in the directory
     DirectoryListing thisListing = dfs.listPaths(
         src, HdfsFileStatus.EMPTY_NAME);
     if (thisListing == null) { // the directory does not exist
       throw new FileNotFoundException("File " + p + " does not exist.");
     }
 {code}
 So in FSTableDescriptors.getTableInfoPath(), we should catch 
 FileNotFoundException and treat it the same way as status being null.





[jira] [Commented] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache

2011-11-06 Thread Li Pi (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145244#comment-13145244
 ] 

Li Pi commented on HBASE-4752:
--

This should work. I agree we should drain until we're below minsize; this seems 
to align with the way the code has been written.

+1 from me.

 Don't create an unnecessary LinkedList when evicting from the BlockCache
 

 Key: HBASE-4752
 URL: https://issues.apache.org/jira/browse/HBASE-4752
 Project: HBase
  Issue Type: Improvement
  Components: performance, regionserver
Affects Versions: 0.90.4
Reporter: Benoit Sigoure
Assignee: Ted Yu
Priority: Minor
 Fix For: 0.94.0

 Attachments: 
 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 
 4752-trunk-v2.txt, 4752-trunk.txt


 When evicting from the BlockCache, the code creates a LinkedList containing 
 every single block sorted by access time.  This list is created from a 
 PriorityQueue.  I don't believe it is necessary, as the PriorityQueue can be 
 used directly.
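
 Polling the PriorityQueue directly, without first copying it into a 
 LinkedList, might look like the sketch below. It is only an illustration of 
 the idea, not the attached patch: EvictionSketch, Block, and evict are 
 hypothetical names, and the real BlockCache tracks more state per block.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

class EvictionSketch {
    // Hypothetical cached block carrying only the fields this sketch needs.
    static final class Block {
        final long accessTime;
        final long size;
        Block(long accessTime, long size) {
            this.accessTime = accessTime;
            this.size = size;
        }
    }

    // Queue ordered so that the least recently accessed block is polled first.
    static PriorityQueue<Block> newQueue() {
        return new PriorityQueue<>(Comparator.comparingLong((Block b) -> b.accessTime));
    }

    // Polls blocks straight from the queue until enough bytes are freed
    // or the queue is empty; no intermediate sorted list is built.
    static long evict(PriorityQueue<Block> byAccessTime, long bytesToFree) {
        long freed = 0;
        Block b;
        while (freed < bytesToFree && (b = byAccessTime.poll()) != null) {
            freed += b.size;
        }
        return freed;
    }
}
```

 Each poll() costs O(log n), so only the blocks actually evicted are ordered, 
 rather than every block in the cache.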





[jira] [Commented] (HBASE-4742) Split dead server's log in parallel

2011-11-06 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145251#comment-13145251
 ] 

Phabricator commented on HBASE-4742:


Liyin has commented on the revision [jira] [HBASE-4742] Split dead server's 
log in parallel.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java:304 I 
have made these threads daemon threads.

  src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java:285 
yes, stale code. I shall remove this function.
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java:306 Good point, 
Prakash.
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java:313 I think all the 
threads created by this thread executor pool will run in daemon mode, which 
means each thread will run either until it completes or until all user threads 
have completed.

REVISION DETAIL
  https://reviews.facebook.net/D237


 Split dead server's log in parallel
 ---

 Key: HBASE-4742
 URL: https://issues.apache.org/jira/browse/HBASE-4742
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: D237.1.patch, D237.2.patch, D237.3.patch, D237.4.patch, 
 D237.5.patch, D237.6.patch


 When one region server goes down, the master will shut down the region server 
 and split its log.
 However, splitting the log is a blocking call and can take some time.
 If more than one region server goes down, the master will split their logs one 
 by one, which is not efficient.
 Since we have distributed log splitting, we could split the logs of the 
 dead servers in parallel. 
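
 A pool along the lines discussed in the review (daemon threads, thread index 
 starting from 1, core threads allowed to time out) could be sketched as 
 below. This is an illustration under assumptions, not the D237 patch: 
 LogSplitPool, the thread name prefix, and the 60-second keep-alive are all 
 hypothetical choices.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class LogSplitPool {
    // Creates a fixed-size pool whose workers are named daemon threads.
    static ThreadPoolExecutor create(int threads) {
        final AtomicInteger index = new AtomicInteger(1); // thread index starts at 1
        ThreadFactory factory = r -> {
            Thread t = new Thread(r, "log-split-" + index.getAndIncrement());
            t.setDaemon(true); // daemon threads do not block JVM exit
            return t;
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads,
                60L, TimeUnit.SECONDS, // keep-alive must be > 0 for core timeout
                new LinkedBlockingQueue<Runnable>(), factory);
        // Let idle core threads exit, so the pool holds no threads while
        // no server is being processed.
        pool.allowCoreThreadTimeOut(true);
        return pool;
    }
}
```

 Each dead server's log split would then be submitted as one task, so several 
 splits proceed concurrently instead of one by one.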





[jira] [Updated] (HBASE-4742) Split dead server's log in parallel

2011-11-06 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-4742:
---

Attachment: D237.7.patch

Liyin updated the revision [jira] [HBASE-4742] Split dead server's log in 
parallel.
Reviewers: Kannan, khemani, Karthik, mbautin, JIRA

  Address Prakash's and lhofhansl's comments.

REVISION DETAIL
  https://reviews.facebook.net/D237

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java
  src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java
  src/main/java/org/apache/hadoop/hbase/master/RegionServerOperationQueue.java
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
  src/test/java/org/apache/hadoop/hbase/master/TestMultiRegionServerShutDown.java


 Split dead server's log in parallel
 ---

 Key: HBASE-4742
 URL: https://issues.apache.org/jira/browse/HBASE-4742
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: D237.1.patch, D237.2.patch, D237.3.patch, D237.4.patch, 
 D237.5.patch, D237.6.patch, D237.7.patch


 When one region server goes down, the master will shut down the region server 
 and split its log.
 However, splitting the log is a blocking call and can take some time.
 If more than one region server goes down, the master will split their logs one 
 by one, which is not efficient.
 Since we have distributed log splitting, we could split the logs of the 
 dead servers in parallel. 





[jira] [Commented] (HBASE-2742) Provide strong authentication with a secure RPC engine

2011-11-06 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145254#comment-13145254
 ] 

jirapos...@reviews.apache.org commented on HBASE-2742:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1991/#review3072
---

Ship it!


+1 on getting it in.


security/src/main/java/org/apache/hadoop/hbase/ipc/SecureClient.java
https://reviews.apache.org/r/1991/#comment6843

Good.  We're pretty far from hadoop rpc now, huh..



security/src/main/java/org/apache/hadoop/hbase/ipc/SecureClient.java
https://reviews.apache.org/r/1991/#comment6844

Does this happen on every rpc?  It won't show in hbase logs because our logger 
is up in the hadoop ipc package, so that should be fine.



security/src/main/java/org/apache/hadoop/hbase/ipc/SecureClient.java
https://reviews.apache.org/r/1991/#comment6845

This if/else seems a little fuzzy.  Server-side, this is how it chooses an 
auth method?



security/src/main/java/org/apache/hadoop/hbase/ipc/SecureClient.java
https://reviews.apache.org/r/1991/#comment6846

Should we set saslRpcClient to null after dispose?  Is it good to just 
swallow the exception?  Should log at least?



security/src/main/java/org/apache/hadoop/hbase/ipc/SecureClient.java
https://reviews.apache.org/r/1991/#comment6847

Should this be configurable?



security/src/main/java/org/apache/hadoop/hbase/ipc/SecureClient.java
https://reviews.apache.org/r/1991/#comment6848

Formatting.  This copied from hadoop?



security/src/main/java/org/apache/hadoop/hbase/ipc/SecureClient.java
https://reviews.apache.org/r/1991/#comment6849

What about the 'length' thing that we added to hbase? I don't see you 
processing it here.  I suppose it's ok because these are different clients and 
servers and they just don't do it (and things like asynchbase are not going to 
do secure hbase any time soon).



security/src/main/java/org/apache/hadoop/hbase/ipc/SecureRpcEngine.java
https://reviews.apache.org/r/1991/#comment6850

In secure hbase we don't do that delayed response stuff that was recently 
added to insecure rpc?



security/src/main/java/org/apache/hadoop/hbase/ipc/SecureRpcEngine.java
https://reviews.apache.org/r/1991/#comment6851

Why do we need this?  Is this in all our rpc'ing?



security/src/main/java/org/apache/hadoop/hbase/ipc/SecureRpcEngine.java
https://reviews.apache.org/r/1991/#comment6852

Where does TRACELOG come from?  Why not LOG.trace?  Is it from parent?  
Should logging be from this class?



security/src/main/java/org/apache/hadoop/hbase/ipc/SecureServer.java
https://reviews.apache.org/r/1991/#comment6853

An insecure client cannot talk to a secure server (which makes 'sense')



security/src/main/java/org/apache/hadoop/hbase/ipc/SecureServer.java
https://reviews.apache.org/r/1991/#comment6854

If result is null, this will work?



security/src/main/java/org/apache/hadoop/hbase/ipc/SecureServer.java
https://reviews.apache.org/r/1991/#comment6855

Is this copy/pasted from hadoop?  Else it's a style violation (minor nit)



security/src/main/java/org/apache/hadoop/hbase/ipc/Status.java
https://reviews.apache.org/r/1991/#comment6856

This has been added already.



security/src/main/java/org/apache/hadoop/hbase/security/token/AuthenticationTokenIdentifier.java
https://reviews.apache.org/r/1991/#comment6857

Are these copied from hadoop?  They don't look like Gary's style.



security/src/main/java/org/apache/hadoop/hbase/security/token/AuthenticationTokenSecretManager.java
https://reviews.apache.org/r/1991/#comment6858

Should a 'secret manager' have a public constructor?



security/src/main/java/org/apache/hadoop/hbase/security/token/AuthenticationTokenSecretManager.java
https://reviews.apache.org/r/1991/#comment6859

ditto



security/src/main/java/org/apache/hadoop/hbase/security/token/ZKLeaderManager.java
https://reviews.apache.org/r/1991/#comment6860

Is this 'generic' zk facility?  Put in zk package?


- Michael


On 2011-10-26 20:23:19, Gary Helmling wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1991/
bq.  ---
bq.  
bq.  (Updated 2011-10-26 20:23:19)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  This patch creates a new secure RPC engine for HBase, which provides 
Kerberos based authentication of clients, and a token-based authentication 
mechanism for mapreduce jobs.  Primary components of the patch are:
bq.  
bq.  - a new maven profile for secure Hadoop/HBase: hadoop-0.20S
bq.- Secure Hadoop dependent classes are separated under a 

[jira] [Commented] (HBASE-4753) org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws NPE on trunk

2011-11-06 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145259#comment-13145259
 ] 

Hudson commented on HBASE-4753:
---

Integrated in HBase-0.92 #116 (See 
[https://builds.apache.org/job/HBase-0.92/116/])
HBASE-4753 
org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws NPE 
on trunk

stack : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java


 org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws 
 NPE on trunk
 

 Key: HBASE-4753
 URL: https://issues.apache.org/jira/browse/HBASE-4753
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.92.0

 Attachments: 4753_FSTableDescriptors.patch


 testGetSetOfHTD(org.apache.hadoop.hbase.regionserver.TestHRegionInfo)  Time 
 elapsed: 0.011 sec   ERROR!
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hbase.util.FSTableDescriptors.deleteTableDescriptorIfExists(FSTableDescriptors.java:433)
   at 
 org.apache.hadoop.hbase.regionserver.TestHRegionInfo.testGetSetOfHTD(TestHRegionInfo.java:72)
 Because the 'getTableInfoPath' can return null and it's not tested.





[jira] [Commented] (HBASE-4754) FSTableDescriptors.getTableInfoPath() should be able to handle FileNotFoundException

2011-11-06 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145262#comment-13145262
 ] 

Hadoop QA commented on HBASE-4754:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12502723/4754.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 javadoc.  The javadoc tool appears to have generated -164 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 51 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandler
  
org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithRemove

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/206//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/206//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/206//console

This message is automatically generated.

 FSTableDescriptors.getTableInfoPath() should be able to handle 
 FileNotFoundException
 

 Key: HBASE-4754
 URL: https://issues.apache.org/jira/browse/HBASE-4754
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Blocker
 Fix For: 0.92.0

 Attachments: 4754.txt


 As reported by Roman in the thread entitled 'HBase 0.92/Hadoop 0.22 test 
 results', table creation would result in the following if hadoop 0.22 is the 
 underlying platform:
 {code}
 11/11/05 19:08:48 INFO handler.CreateTableHandler: Attemping to create
 the table b
 11/11/05 19:08:48 ERROR handler.CreateTableHandler: Error trying to
 create the table b
 java.io.FileNotFoundException: File
 hdfs://ip-10-110-254-200.ec2.internal:17020/hbase/b does not exist.
at 
 org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:387)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1085)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1110)
at 
 org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:257)
at 
 org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:243)
at 
 org.apache.hadoop.hbase.util.FSTableDescriptors.createTableDescriptor(FSTableDescriptors.java:566)
at 
 org.apache.hadoop.hbase.util.FSTableDescriptors.createTableDescriptor(FSTableDescriptors.java:535)
at 
 org.apache.hadoop.hbase.util.FSTableDescriptors.createTableDescriptor(FSTableDescriptors.java:519)
at 
 org.apache.hadoop.hbase.master.handler.CreateTableHandler.handleCreateTable(CreateTableHandler.java:140)
at 
 org.apache.hadoop.hbase.master.handler.CreateTableHandler.process(CreateTableHandler.java:126)
at 
 org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:168)
at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
 {code}
 This was due to how DistributedFileSystem.listStatus() in 0.22 handles 
 non-existent directory:
 {code}
   @Override
   public FileStatus[] listStatus(Path p) throws IOException {
     String src = getPathName(p);
     // fetch the first batch of entries in the directory
     DirectoryListing thisListing = dfs.listPaths(
         src, HdfsFileStatus.EMPTY_NAME);
     if (thisListing == null) { // the directory does not exist
       throw new FileNotFoundException("File " + p + " does not exist.");
     }
 {code}
 So in FSTableDescriptors.getTableInfoPath(), we should catch 
 FileNotFoundException and treat it the same way as status being null.
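
The proposed handling could look roughly like the sketch below. It is an illustration under stated assumptions, not the content of 4754.txt: DirLister is a hypothetical stand-in for FileSystem.listStatus(), and listOrNull is a hypothetical helper name. The point is only that a missing directory yields null whether the underlying call returns null or throws FileNotFoundException.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

// Hypothetical sketch: DirLister stands in for FileSystem.listStatus().
final class ListStatusSketch {

  /** Minimal stand-in for a file system listing call. */
  interface DirLister {
    String[] list(String dir) throws IOException;
  }

  /**
   * Returns the listing, or null when the directory is missing, whether the
   * underlying call signals that by returning null or by throwing.
   */
  static String[] listOrNull(DirLister fs, String dir) throws IOException {
    try {
      return fs.list(dir);
    } catch (FileNotFoundException e) {
      // Newer behaviour: a missing directory is reported via an exception.
      return null;
    }
  }
}
```

Callers then need a single null check, and other IOExceptions still propagate unchanged.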


  

[jira] [Commented] (HBASE-4605) Constraints

2011-11-06 Thread Jesse Yates (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145266#comment-13145266
 ] 

Jesse Yates commented on HBASE-4605:


@Ted:
{quote}
I only see synchronized keyword on Constraints.add().
Have you tried using synchronization on other methods ?
{quote}
No I haven't, but this was the beginnings of making the setting synchronized. 
However, after thinking about it, that doesn't really make all that much sense 
for a bunch of static setting methods. I was going to push this off onto the 
user.

{quote}
Also, HTableDescriptor.values is protected field. We can change its actual 
implementation to ConcurrentHashMap, etc to accommodate for the concurrency you 
described.
{quote}
Seems a little excessive especially since all the synchronization on HTD has 
been done by users when necessary in the past. My hunch is that this is rarely 
an issue.

{quote}
If we store metadata about constraints in the Configuration object as I 
described @ 29/Oct/11 04:20, we utilize the available serialization mechanism.
The current approach deals with serialization itself. This is not as flexible 
as the above approach.
{quote}
+1 Making the changes. My initial thought was to mimic how CPs are added to the 
table, which handle writing their own bytes. But, yeah, your recommendation 
makes it much easier (and consistent).

@Gary:
{quote}
Some runtime exceptions could indicate programming errors or bugs, in which 
case I think it's best to handle them the same way we handle unexpected errors 
in coprocessors – by either unloading or aborting, depending on configuration. 
We need to be careful for what we allow from user code running in process on 
region servers.
{quote}
AND
{quote}
I disagree here. Treating all possible exceptions the same seems wrong. An 
OutOfMemoryError is not equivalent to a NumberFormatException. We need some 
structure in place for what types of errors are allowed/expected, instead of 
swallowing everything and assuming it's just a constraint violation. I don't 
think asking the Constraint implementations to think about what errors they 
want to throw is much of a burden.
{quote}
+1 having the user propagate a DoNotRetryIOException subclass (e.g. 
ConstraintException) - you make a good point. Working on the best way to handle 
unload/total failure.

{quote}
Sometimes people do weird things (I've seen people doing exactly what I was 
testing against), so while ugly, it is not unheard of. An examples module would be 
great - I'll comment on 4336 about that. In the meantime, maybe just having an 
examples package (main/java/…hbase.constraint.examples) that can be updated as 
people find more uses for constraints? Though I'm not averse to the book 
either.


 Constraints
 ---

 Key: HBASE-4605
 URL: https://issues.apache.org/jira/browse/HBASE-4605
 Project: HBase
  Issue Type: Improvement
  Components: client, coprocessors
Affects Versions: 0.94.0
Reporter: Jesse Yates
Assignee: Jesse Yates
 Attachments: constraint_as_cp.txt, java_Constraint_v2.patch


 From Jesse's comment on dev:
 {quote}
 What I would like to propose is a simple interface that people can use to 
 implement a 'constraint' (matching the classic database definition). This 
 would help ease of adoption by helping HBase more easily check that box, help 
 minimize code duplication across organizations, and lead to easier adoption.
 Essentially, people would implement a 'Constraint' interface for checking 
 keys before they are put into a table. Puts that are valid get written to the 
 table, but if not, people can throw an exception that gets propagated 
 back to the client explaining why the put was invalid.
 Constraints would be set on a per-table basis and the user would be expected 
 to ensure the jars containing the constraint are present on the machines 
 serving that table.
 Yes, people could roll their own mechanism for doing this via coprocessors 
 each time, but this would make it easier to do so, so you only have to 
 implement a very minimal interface and not worry about the specifics.
 {quote}
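
A minimal sketch of what such an interface might look like is given below. Everything in it is hypothetical and is not taken from the attached patches: the Constraint signature, the NonNegativeInt example, and the validate helper are illustrations only. ConstraintException here extends plain Exception to keep the sketch self-contained; in the discussion above it is meant to be a DoNotRetryIOException subclass.

```java
import java.util.List;

// Hypothetical sketch: none of these types are taken from the attached patches.
final class ConstraintSketch {

  /** Thrown when a put violates a constraint; reported back to the client. */
  static final class ConstraintException extends Exception {
    ConstraintException(String msg) { super(msg); }
  }

  /** Checks a row key/value pair before it is written. */
  interface Constraint {
    void check(String rowKey, String value) throws ConstraintException;
  }

  /** Example constraint: the value must parse as a non-negative integer. */
  static final class NonNegativeInt implements Constraint {
    @Override
    public void check(String rowKey, String value) throws ConstraintException {
      try {
        if (Integer.parseInt(value) < 0) {
          throw new ConstraintException("negative value for row " + rowKey);
        }
      } catch (NumberFormatException e) {
        // Translate the expected failure; other runtime errors still propagate.
        throw new ConstraintException("not an integer for row " + rowKey);
      }
    }
  }

  /** Runs every constraint; returns null if valid, else the rejection reason. */
  static String validate(List<Constraint> constraints, String rowKey, String value) {
    for (Constraint c : constraints) {
      try {
        c.check(rowKey, value);
      } catch (ConstraintException e) {
        return e.getMessage();
      }
    }
    return null;
  }
}
```

Only the declared ConstraintException is treated as a violation. Any other runtime error escapes validate, which matches the point raised in the thread that not all exceptions should be swallowed as constraint failures.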





[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145270#comment-13145270
 ] 

Hadoop QA commented on HBASE-4746:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12502724/D279-trunk-v7.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 45 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -164 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 51 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/207//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/207//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/207//console

This message is automatically generated.

 Use a random ZK client port in unit tests so we can run them in parallel
 

 Key: HBASE-4746
 URL: https://issues.apache.org/jira/browse/HBASE-4746
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, 
 D279-trunk-v5.txt, D279-trunk-v7.txt, D279.1.patch, D279.2.patch, 
 D279.3.patch, D279.4.patch, D279.5.patch, D279.6.patch, D279.7.patch


 The hard-coded ZK client port has long been a problem for running the HBase test 
 suite in parallel. The mini ZK cluster should run on a random free port, and 
 that port should be passed to all parts of the unit tests that need to talk 
 to the mini cluster. In fact, randomizing the port exposes a lot of places in 
 the code where a new configuration is instantiated, and as a result the 
 client tries to talk to the default ZK client port and times out.
 The initial fix is for 0.89-fb, where it already allows running unit tests in 
 parallel in 10 minutes. A fix for the trunk will follow.
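
One common way to obtain a random free port is to bind a socket to port 0 and read back the port the OS assigned. The sketch below shows that technique only. It is not claimed to be what the attached patches do, and findFreePort is a hypothetical helper name.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical sketch: findFreePort is an illustrative helper, not HBase code.
final class FreePortSketch {

  /** Asks the OS for an unused TCP port by binding to port 0. */
  static int findFreePort() throws IOException {
    try (ServerSocket socket = new ServerSocket(0)) {
      return socket.getLocalPort();
    }
  }
}
```

The port is released before the caller uses it, so another process could grab it in between. A test harness would normally retry on a bind failure and must pass the chosen port to every component that builds its own configuration.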





[jira] [Commented] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM

2011-11-06 Thread nkeywal (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145275#comment-13145275
 ] 

nkeywal commented on HBASE-4737:


The -1 on @author is a bug in the validation tool: I actually removed an author 
tag rather than adding one.

Other errors are the usual ones (this patch does not modify the behavior; just 
adds tags). So I think it's ok for integration. Parallel builds can be tested 
with the shell script.

 Split the tests in small/medium/large; allow small tests to be ran in 
 parallel within a single JVM
 --

 Key: HBASE-4737
 URL: https://issues.apache.org/jira/browse/HBASE-4737
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 2003_4737_pom.dummy.patch, 2003_4737_pom.patch, 
 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 
 2003_4737_pom.v2.patch, 2003_4737_pom.v2.patch, 4737_all.patch, 
 hbasetests.sh, hbasetests.sh


 1) Split the tests in 3 categories
  - small: no cluster, less than 15s, can be run in parallel with other tests 
 in a JVM
  - medium: 45s, not flaky, useful to detect bugs immediately
  - large: remaining
  
 2) Allow running a subset: developers should need to run only small and 
 medium before submitting a patch
  - will need a surefire patch, see 
 http://jira.codehaus.org/browse/SUREFIRE-329
 Small is the default. All other tests will have to be marked Medium or Large 
 with a JUnit category.
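
The "small by default, mark everything else" rule can be illustrated with the self-contained sketch below. It deliberately does not use JUnit: SizeCategory is a home-grown annotation standing in for JUnit's category marker, and all class names are hypothetical. With JUnit itself, test classes would instead be tagged with @Category and a marker class.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a home-grown stand-in for JUnit categories.
final class TestSizeSketch {

  enum Size { SMALL, MEDIUM, LARGE }

  /** Marks a test class with its size; unmarked classes count as SMALL. */
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.TYPE)
  @interface SizeCategory { Size value(); }

  @SizeCategory(Size.MEDIUM) static final class ExampleClusterTest {}
  @SizeCategory(Size.LARGE) static final class ExampleSlowTest {}
  static final class ExampleUnitTest {} // no annotation: SMALL by default

  /** Reads the size from the annotation, falling back to SMALL. */
  static Size sizeOf(Class<?> testClass) {
    SizeCategory c = testClass.getAnnotation(SizeCategory.class);
    return c == null ? Size.SMALL : c.value();
  }

  /** Keeps only the classes no larger than the given size. */
  static List<Class<?>> select(List<Class<?>> all, Size max) {
    List<Class<?>> picked = new ArrayList<>();
    for (Class<?> k : all) {
      if (sizeOf(k).compareTo(max) <= 0) {
        picked.add(k);
      }
    }
    return picked;
  }
}
```

Selecting with Size.MEDIUM corresponds to the "small and medium before submitting a patch" subset described above.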
 Proposed split:
 Small: 122 classes, 479 methods, ~3 minutes (when not run in parallel)
 Medium: 78 classes, 373 methods, ~23 minutes
 Large: 34 classes, 221 methods, ~60 minutes
 I will have to extract the methods that are today in large or medium but 
 could be in small (typically io.hfile.TestHFileBlock#testBlockHeapSize); this 
 will be done in a second step (and another JIRA).
 MEDIUM LIST (name; number of methods, time)
 org.apache.hadoop.hbase.avro.TestAvroServer  3  31.468
 org.apache.hadoop.hbase.catalog.TestCatalogTracker  8  4.174
 org.apache.hadoop.hbase.catalog.TestMetaReaderEditorNoCluster  1  3.888
 org.apache.hadoop.hbase.catalog.TestMetaReaderEditor  5  30.157
 org.apache.hadoop.hbase.client.replication.TestReplicationAdmin  1  0.762
 org.apache.hadoop.hbase.client.TestHCM  3  21.961
 org.apache.hadoop.hbase.client.TestHTablePool  18  26.274
 org.apache.hadoop.hbase.client.TestHTableUtil  2  16.997
 org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD  3  24.629
 org.apache.hadoop.hbase.client.TestMetaScanner  1  16.365
 org.apache.hadoop.hbase.client.TestMultiParallel  10  34.077
 org.apache.hadoop.hbase.client.TestTimestampsFilter  3  27.547
 org.apache.hadoop.hbase.coprocessor.TestAggregateProtocol  44  16.834
 org.apache.hadoop.hbase.coprocessor.TestClassLoading  5  31.346
 org.apache.hadoop.hbase.coprocessor.TestCoprocessorEndpoint  2  32.736
 org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithAbort  1  13.874
 org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithRemove  1  16.923
 org.apache.hadoop.hbase.coprocessor.TestMasterObserver  3  29.97
 org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass  2  14.976
 org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface  5  33.353
 org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort  1  16.596
 org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithRemove  1  18.183
 org.apache.hadoop.hbase.coprocessor.TestWALObserver  3  19.373
 org.apache.hadoop.hbase.filter.TestColumnRangeFilter  1  19.045
 org.apache.hadoop.hbase.io.hfile.slab.TestSingleSizeCache  5  24.294
 org.apache.hadoop.hbase.io.hfile.slab.TestSlabCache  7  19.818
 org.apache.hadoop.hbase.io.hfile.TestHFileBlock  7  25.226
 org.apache.hadoop.hbase.io.hfile.TestLruBlockCache  7  0.343
 org.apache.hadoop.hbase.mapreduce.TestImportTsv  8  40.391
 org.apache.hadoop.hbase.master.TestActiveMasterManager  2  0.724
 org.apache.hadoop.hbase.master.TestHMasterRPCException  1  1.17
 org.apache.hadoop.hbase.master.TestLogsCleaner  1  2.953
 org.apache.hadoop.hbase.master.TestMaster  1  18.918
 org.apache.hadoop.hbase.master.TestOpenedRegionHandler  2  20.57
 org.apache.hadoop.hbase.master.TestSplitLogManager  10  13.979
 org.apache.hadoop.hbase.master.TestZKBasedOpenCloseRegion  3  21.675