[jira] [Commented] (HBASE-4742) Split dead server's log in parallel
[ https://issues.apache.org/jira/browse/HBASE-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144928#comment-13144928 ]

Phabricator commented on HBASE-4742:

tedyu has commented on the revision "[jira] [HBASE-4742] Split dead server's log in parallel".

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java:311 I think thread index starting from 1 would be more readable to user.
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java:314 Please the following call before returning:
    (logSplitThreadPool).allowCoreThreadTimeOut(true);

REVISION DETAIL
  https://reviews.facebook.net/D237

Split dead server's log in parallel
Key: HBASE-4742
URL: https://issues.apache.org/jira/browse/HBASE-4742
Project: HBase
Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
Attachments: D237.1.patch, D237.2.patch, D237.3.patch, D237.4.patch, D237.5.patch

When one region server goes down, the master will shut down the region server and split its log. However, splitting the log is a blocking call and takes some time. If more than one region server goes down, the master splits their logs one by one, which is not efficient. Since we have distributed log splitting, we could split the logs of the dead servers in parallel.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache
[ https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144929#comment-13144929 ]

Lars Hofhansl commented on HBASE-4752:

Hmm... The blocks are still sorted in the PriorityQueue. One could use a TreeSet with a comparator, but its properties would be different.
The least one could do to optimize this is to use an ArrayList and fill it from the back (the size of the queue is known, and since PriorityQueue is not thread-safe and there is no thread synchronization in CachedBlockQueue, I assume there is no multithreaded access here).
Strange that those tests pass, though. You do agree that with the patch the entries would be evicted in the opposite order than before, no?

Don't create an unnecessary LinkedList when evicting from the BlockCache
Key: HBASE-4752
URL: https://issues.apache.org/jira/browse/HBASE-4752
Project: HBase
Issue Type: Improvement
Components: performance, regionserver
Affects Versions: 0.90.4
Reporter: Benoit Sigoure
Assignee: Benoit Sigoure
Priority: Minor
Attachments: 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 4752-trunk.txt

When evicting from the BlockCache, the code creates a LinkedList containing every single block sorted by access time. This list is created from a PriorityQueue. I don't believe it is necessary, as the PriorityQueue can be used directly.
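A minimal sketch of the ArrayList idea from the comment above, using only java.util. The class and method names are made up for illustration and are not taken from either patch; which end of the list is evicted first depends on the comparator used by CachedBlockQueue, which is not shown in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical helper, not HBase code: drains a PriorityQueue into a
// pre-sized ArrayList, writing from the last index towards the first.
final class ReverseDrain {
    static <T> List<T> drainReversed(PriorityQueue<T> queue) {
        int n = queue.size(); // the size is known up front
        // Pre-fill with nulls so that set(i, ...) is valid for every index.
        List<T> out = new ArrayList<>(Collections.<T>nCopies(n, null));
        for (int i = n - 1; i >= 0; i--) {
            out.set(i, queue.poll()); // head of the queue goes to the back
        }
        return out;
    }

    public static void main(String[] args) {
        PriorityQueue<Long> accessTimes =
            new PriorityQueue<>(Arrays.asList(30L, 10L, 20L));
        System.out.println(drainReversed(accessTimes)); // prints [30, 20, 10]
    }
}
```

Like PriorityQueue itself, this is not thread-safe, which matches the single-threaded assumption stated in the comment.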
[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
[ https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144930#comment-13144930 ]

Phabricator commented on HBASE-4746:

tedyu has commented on the revision "[jira] [HBASE-4746] Use a random ZK client port in unit tests so we can run them in parallel".

INLINE COMMENTS
  src/test/java/org/apache/hadoop/hbase/thrift2/TestThriftHBaseServiceHandler.java:117 Thanks for taking care of HBASE-4750.

REVISION DETAIL
  https://reviews.facebook.net/D279

Use a random ZK client port in unit tests so we can run them in parallel
Key: HBASE-4746
URL: https://issues.apache.org/jira/browse/HBASE-4746
Project: HBase
Issue Type: Improvement
Reporter: Mikhail Bautin
Attachments: 4746-trunk-v2.txt, D255.1.patch, D279.1.patch

The hard-coded ZK client port has long been a problem for running the HBase test suite in parallel. The mini ZK cluster should run on a random free port, and that port should be passed to all parts of the unit tests that need to talk to the mini cluster. In fact, randomizing the port exposes a lot of places in the code where a new configuration is instantiated, and as a result the client tries to talk to the default ZK client port and times out. The initial fix is for 0.89-fb, where it already allows running the unit tests in parallel in 10 minutes. A fix for the trunk will follow.
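One common way to obtain a random free port, sketched with the JDK only. This is an illustration, not the code from the attached patches: the class name is invented, and java.util.Properties stands in for the HBase Configuration object. Note the small race: another process could grab the port between the probe socket closing and the mini cluster binding to it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.util.Properties;

// Hypothetical helper, not HBase code.
final class FreePort {
    // Binding to port 0 asks the OS for any currently free port.
    static int findFreePort() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int port = findFreePort();
        // Stand-in for the test configuration. The point made in the issue is
        // that this one object has to reach every client; code that builds a
        // fresh configuration falls back to the default port and times out.
        Properties conf = new Properties();
        conf.setProperty("hbase.zookeeper.property.clientPort", Integer.toString(port));
        System.out.println("mini ZK cluster would listen on port " + port);
    }
}
```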
[jira] [Commented] (HBASE-4742) Split dead server's log in parallel
[ https://issues.apache.org/jira/browse/HBASE-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144932#comment-13144932 ]

Phabricator commented on HBASE-4742:

tedyu has commented on the revision "[jira] [HBASE-4742] Split dead server's log in parallel".

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java:314 Should read: 'Please add the following call before returning'

REVISION DETAIL
  https://reviews.facebook.net/D237

Split dead server's log in parallel
Key: HBASE-4742
URL: https://issues.apache.org/jira/browse/HBASE-4742
Project: HBase
Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
Attachments: D237.1.patch, D237.2.patch, D237.3.patch, D237.4.patch, D237.5.patch

When one region server goes down, the master will shut down the region server and split its log. However, splitting the log is a blocking call and takes some time. If more than one region server goes down, the master splits their logs one by one, which is not efficient. Since we have distributed log splitting, we could split the logs of the dead servers in parallel.
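The two review comments on HMaster.java (thread index starting from 1, and calling allowCoreThreadTimeOut(true) before returning the pool) could look roughly like the sketch below. Only allowCoreThreadTimeOut comes from the review; the class name, the "log-split-" thread name prefix, the pool size parameter and the 60 second keep-alive are assumptions made for the example.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical factory, not the code under review.
final class LogSplitPool {
    static ThreadPoolExecutor create(int maxThreads) {
        // Start numbering at 1 so thread names read naturally in logs.
        final AtomicInteger index = new AtomicInteger(1);
        ThreadFactory factory = r -> {
            Thread t = new Thread(r, "log-split-" + index.getAndIncrement());
            t.setDaemon(true);
            return t;
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            maxThreads, maxThreads,
            60L, TimeUnit.SECONDS, // keep-alive must be > 0 for the call below
            new LinkedBlockingQueue<Runnable>(), factory);
        // Let idle core threads exit, so the pool holds no threads
        // while there are no dead servers to process.
        pool.allowCoreThreadTimeOut(true);
        return pool;
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = create(2);
        pool.execute(() -> System.out.println(Thread.currentThread().getName()));
        pool.shutdown();
    }
}
```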
[jira] [Commented] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache
[ https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144934#comment-13144934 ]

Ted Yu commented on HBASE-4752:

@Lars:
If you were talking about my patch, I assume you have noticed the following:
{code}
-      for(CachedBlock cb: blocks) {
+      while ((cb = queue.pollLast()) != null) {
{code}

Don't create an unnecessary LinkedList when evicting from the BlockCache
Key: HBASE-4752
URL: https://issues.apache.org/jira/browse/HBASE-4752
Project: HBase
Issue Type: Improvement
Components: performance, regionserver
Affects Versions: 0.90.4
Reporter: Benoit Sigoure
Assignee: Benoit Sigoure
Priority: Minor
Attachments: 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 4752-trunk.txt

When evicting from the BlockCache, the code creates a LinkedList containing every single block sorted by access time. This list is created from a PriorityQueue. I don't believe it is necessary, as the PriorityQueue can be used directly.
[jira] [Commented] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache
[ https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144937#comment-13144937 ]

Lars Hofhansl commented on HBASE-4752:

Oh? ... Nope, was still referring to Benoit's patch. I missed your new patch.
I am not familiar with MinMaxPriorityQueue, but if it does what its name suggests the patch looks good to me.
Wanna set the initial size (with a builder)? Or remove the initial size calculation?

Don't create an unnecessary LinkedList when evicting from the BlockCache
Key: HBASE-4752
URL: https://issues.apache.org/jira/browse/HBASE-4752
Project: HBase
Issue Type: Improvement
Components: performance, regionserver
Affects Versions: 0.90.4
Reporter: Benoit Sigoure
Assignee: Benoit Sigoure
Priority: Minor
Attachments: 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 4752-trunk.txt

When evicting from the BlockCache, the code creates a LinkedList containing every single block sorted by access time. This list is created from a PriorityQueue. I don't believe it is necessary, as the PriorityQueue can be used directly.
[jira] [Commented] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache
[ https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144940#comment-13144940 ]

Lars Hofhansl commented on HBASE-4752:

Looking at MinMaxPriorityQueue... can we just get rid of CachedBlockQueue? It seems to exist only to provide the entries in reverse order.

Don't create an unnecessary LinkedList when evicting from the BlockCache
Key: HBASE-4752
URL: https://issues.apache.org/jira/browse/HBASE-4752
Project: HBase
Issue Type: Improvement
Components: performance, regionserver
Affects Versions: 0.90.4
Reporter: Benoit Sigoure
Assignee: Benoit Sigoure
Priority: Minor
Attachments: 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 4752-trunk.txt

When evicting from the BlockCache, the code creates a LinkedList containing every single block sorted by access time. This list is created from a PriorityQueue. I don't believe it is necessary, as the PriorityQueue can be used directly.
[jira] [Commented] (HBASE-4740) [bulk load] the HBASE-4552 API can't tell if errors on region server is recoverable or unrecoverable error.
[ https://issues.apache.org/jira/browse/HBASE-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144942#comment-13144942 ]

Hadoop QA commented on HBASE-4740:

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12502640/4740.v3
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 3 new or modified tests.
    -1 javadoc. The javadoc tool appears to have generated -164 warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    -1 findbugs. The patch appears to introduce 48 new Findbugs (version 1.3.9) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandler
        org.apache.hadoop.hbase.master.TestDistributedLogSplitting
        org.apache.hadoop.hbase.master.TestMasterFailover
        org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithRemove

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/192//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/192//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/192//console

This message is automatically generated.

[bulk load] the HBASE-4552 API can't tell if errors on region server is recoverable or unrecoverable error.
Key: HBASE-4740
URL: https://issues.apache.org/jira/browse/HBASE-4740
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Priority: Blocker
Fix For: 0.92.0
Attachments: 0001-HBASE-4740-bulkload-HBASE-4552-API-can-t-tell-if-err.patch, 4740.v3, hbase-4740.v2.patch

Running TestHFileOutputFormat more frequently seems to show that it has become flaky. It is difficult to tell if this is because of an unrecoverable failure or a recoverable failure. To make this visible to tests and to users, we need to change the bulkload call's interface on HRegionServer. The change should make successful RPCs return true, recoverable failures return false, and unrecoverable failures throw an IOException. This is an RPC change, so it would be really good to get this API right before the final 0.92 goes out.
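To illustrate the proposed contract (true for success, false for a recoverable failure, IOException for an unrecoverable one), here is a caller-side sketch. The interface and method names are placeholders invented for this example and are not the HRegionServer RPC signature.

```java
import java.io.IOException;

// Hypothetical caller-side retry loop, not HBase code.
final class BulkLoadRetry {
    // Stand-in for the region server call described in the issue.
    interface BulkLoadCall {
        // true: loaded; false: recoverable failure, the caller may retry;
        // IOException: unrecoverable failure, the caller must give up.
        boolean attempt() throws IOException;
    }

    static boolean loadWithRetries(BulkLoadCall call, int maxAttempts) throws IOException {
        for (int i = 1; i <= maxAttempts; i++) {
            if (call.attempt()) {
                return true; // success
            }
            // false means recoverable: fall through and try again
        }
        return false; // still failing after maxAttempts recoverable failures
    }

    public static void main(String[] args) throws IOException {
        int[] calls = {0};
        // Fails recoverably twice, then succeeds on the third attempt.
        boolean loaded = loadWithRetries(() -> ++calls[0] >= 3, 5);
        System.out.println(loaded + " after " + calls[0] + " attempts"); // prints "true after 3 attempts"
    }
}
```

With this shape a test can tell the two failure modes apart: an IOException propagates immediately, while a false return only shows up after the retries are exhausted.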
[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
[ https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144944#comment-13144944 ] Hadoop QA commented on HBASE-4746: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12502644/4746-trunk-v2.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 39 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -164 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 48 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.master.TestDistributedLogSplitting org.apache.hadoop.hbase.replication.TestMasterReplication org.apache.hadoop.hbase.replication.TestReplication org.apache.hadoop.hbase.regionserver.TestColumnSeeking Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/193//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/193//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/193//console This message is automatically generated. Use a random ZK client port in unit tests so we can run them in parallel Key: HBASE-4746 URL: https://issues.apache.org/jira/browse/HBASE-4746 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Attachments: 4746-trunk-v2.txt, D255.1.patch, D279.1.patch The hard-coded ZK client port has long been a problem for running HBase test suite in parallel. The mini ZK cluster should run on a random free port, and that port should be passed to all parts of the unit tests that need to talk to the mini cluster. 
In fact, randomizing the port exposes a lot of places in the code where a new configuration is instantiated, and as a result the client tries to talk to the default ZK client port and times out. The initial fix is for 0.89-fb, where it already allows running the unit tests in parallel in 10 minutes. A fix for the trunk will follow.
[jira] [Commented] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache
[ https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144957#comment-13144957 ] Hadoop QA commented on HBASE-4752: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12502646/4752-trunk.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -164 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 49 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.master.TestMasterFailover org.apache.hadoop.hbase.io.hfile.TestCachedBlockQueue Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/194//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/194//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/194//console This message is automatically generated. Don't create an unnecessary LinkedList when evicting from the BlockCache Key: HBASE-4752 URL: https://issues.apache.org/jira/browse/HBASE-4752 Project: HBase Issue Type: Improvement Components: performance, regionserver Affects Versions: 0.90.4 Reporter: Benoit Sigoure Assignee: Benoit Sigoure Priority: Minor Attachments: 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 4752-trunk.txt When evicting from the BlockCache, the code creates a LinkedList containing every single block sorted by access time. This list is created from a PriorityQueue. I don't believe it is necessary, as the PriorityQueue can be used directly. 
[jira] [Created] (HBASE-4753) org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws NPE on trunk
org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws NPE on trunk
Key: HBASE-4753
URL: https://issues.apache.org/jira/browse/HBASE-4753
Project: HBase
Issue Type: Bug
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal

testGetSetOfHTD(org.apache.hadoop.hbase.regionserver.TestHRegionInfo) Time elapsed: 0.011 sec ERROR!
java.lang.NullPointerException
    at org.apache.hadoop.hbase.util.FSTableDescriptors.deleteTableDescriptorIfExists(FSTableDescriptors.java:433)
    at org.apache.hadoop.hbase.regionserver.TestHRegionInfo.testGetSetOfHTD(TestHRegionInfo.java:72)

This happens because 'getTableInfoPath' can return null and the result is not checked.
[jira] [Updated] (HBASE-4753) org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws NPE on trunk
[ https://issues.apache.org/jira/browse/HBASE-4753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-4753:

Status: Patch Available (was: Open)

org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws NPE on trunk
Key: HBASE-4753
URL: https://issues.apache.org/jira/browse/HBASE-4753
Project: HBase
Issue Type: Bug
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Attachments: 4753_FSTableDescriptors.patch

testGetSetOfHTD(org.apache.hadoop.hbase.regionserver.TestHRegionInfo) Time elapsed: 0.011 sec ERROR!
java.lang.NullPointerException
    at org.apache.hadoop.hbase.util.FSTableDescriptors.deleteTableDescriptorIfExists(FSTableDescriptors.java:433)
    at org.apache.hadoop.hbase.regionserver.TestHRegionInfo.testGetSetOfHTD(TestHRegionInfo.java:72)

This happens because 'getTableInfoPath' can return null and the result is not checked.
[jira] [Updated] (HBASE-4753) org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws NPE on trunk
[ https://issues.apache.org/jira/browse/HBASE-4753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-4753:

Attachment: 4753_FSTableDescriptors.patch

It seems that without the patch it works only if the environment has been 'initialized' by another test or by a previous successful run.

org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws NPE on trunk
Key: HBASE-4753
URL: https://issues.apache.org/jira/browse/HBASE-4753
Project: HBase
Issue Type: Bug
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Attachments: 4753_FSTableDescriptors.patch

testGetSetOfHTD(org.apache.hadoop.hbase.regionserver.TestHRegionInfo) Time elapsed: 0.011 sec ERROR!
java.lang.NullPointerException
    at org.apache.hadoop.hbase.util.FSTableDescriptors.deleteTableDescriptorIfExists(FSTableDescriptors.java:433)
    at org.apache.hadoop.hbase.regionserver.TestHRegionInfo.testGetSetOfHTD(TestHRegionInfo.java:72)

This happens because 'getTableInfoPath' can return null and the result is not checked.
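The attached patch is not reproduced in this thread, so the following is only a guess at the general shape of such a guard, written against java.nio.file rather than the Hadoop FileSystem API. The class name and the ".tableinfo" file name are placeholders for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration of a null guard, not the FSTableDescriptors code.
final class TableDescriptorCleanup {
    // Returns null when the table directory has no descriptor file,
    // for example on a fresh environment that no earlier test has populated.
    static Path getTableInfoPath(Path tableDir) {
        Path info = tableDir.resolve(".tableinfo");
        return Files.exists(info) ? info : null;
    }

    static boolean deleteTableDescriptorIfExists(Path tableDir) throws IOException {
        Path info = getTableInfoPath(tableDir);
        if (info == null) {
            return false; // nothing to delete; avoids dereferencing null
        }
        return Files.deleteIfExists(info);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("table");
        System.out.println(deleteTableDescriptorIfExists(dir)); // prints false
        Files.createFile(dir.resolve(".tableinfo"));
        System.out.println(deleteTableDescriptorIfExists(dir)); // prints true
        Files.delete(dir);
    }
}
```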
[jira] [Commented] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
[ https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144965#comment-13144965 ]

nkeywal commented on HBASE-4737:

@Mikhail
Thank you for the zookeeper patch.
TestHRegionInfo.testGetSetOfHTD is not related to the parallel runs; it can fail on an isolated run. I created HBASE-4753 for it.

Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
Key: HBASE-4737
URL: https://issues.apache.org/jira/browse/HBASE-4737
Project: HBase
Issue Type: Improvement
Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
Attachments: 2003_4737_pom.dummy.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.v2.patch, 2003_4737_pom.v2.patch, hbasetests.sh

1) Split the tests in 3 categories
 - small: no cluster, less than 15s, can be run in parallel with other tests in a JVM
 - medium: 45s, not flaky, useful to detect bugs immediately
 - large: remaining
2) Allow running a subset: developers should need to run only small and medium before submitting a patch
 - will need a surefire patch, see http://jira.codehaus.org/browse/SUREFIRE-329

Small is the default. All other tests will have to be marked Medium or Large with a JUnit category.

Proposed split:
Small: 122 classes, 479 methods, ~3 minutes (when not run in parallel)
Medium: 78 classes, 373 methods, ~23 minutes
Large: 34 classes, 221 methods, ~60 minutes

I will have to extract the methods that are today in large or medium but could be in small (typically io.hfile.TestHFileBlock#testBlockHeapSize); this will be done in a second step (and another JIRA).
MEDIUM LIST (name; number of methods, time)
org.apache.hadoop.hbase.avro.TestAvroServer 3 31.468
org.apache.hadoop.hbase.catalog.TestCatalogTracker 8 4.174
org.apache.hadoop.hbase.catalog.TestMetaReaderEditorNoCluster 1 3.888
org.apache.hadoop.hbase.catalog.TestMetaReaderEditor 5 30.157
org.apache.hadoop.hbase.client.replication.TestReplicationAdmin 1 0.762
org.apache.hadoop.hbase.client.TestHCM 3 21.961
org.apache.hadoop.hbase.client.TestHTablePool 18 26.274
org.apache.hadoop.hbase.client.TestHTableUtil 2 16.997
org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD 3 24.629
org.apache.hadoop.hbase.client.TestMetaScanner 1 16.365
org.apache.hadoop.hbase.client.TestMultiParallel 10 34.077
org.apache.hadoop.hbase.client.TestTimestampsFilter 3 27.547
org.apache.hadoop.hbase.coprocessor.TestAggregateProtocol 44 16.834
org.apache.hadoop.hbase.coprocessor.TestClassLoading 5 31.346
org.apache.hadoop.hbase.coprocessor.TestCoprocessorEndpoint 2 32.736
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithAbort 1 13.874
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithRemove 1 16.923
org.apache.hadoop.hbase.coprocessor.TestMasterObserver 3 29.97
org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass 2 14.976
org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface 5 33.353
org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort 1 16.596
org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithRemove 1 18.183
org.apache.hadoop.hbase.coprocessor.TestWALObserver 3 19.373
org.apache.hadoop.hbase.filter.TestColumnRangeFilter 1 19.045
org.apache.hadoop.hbase.io.hfile.slab.TestSingleSizeCache 5 24.294
org.apache.hadoop.hbase.io.hfile.slab.TestSlabCache 7 19.818
org.apache.hadoop.hbase.io.hfile.TestHFileBlock 7 25.226
org.apache.hadoop.hbase.io.hfile.TestLruBlockCache 7 0.343
org.apache.hadoop.hbase.mapreduce.TestImportTsv 8 40.391
org.apache.hadoop.hbase.master.TestActiveMasterManager 2 0.724
org.apache.hadoop.hbase.master.TestHMasterRPCException 1 1.17
org.apache.hadoop.hbase.master.TestLogsCleaner 1 2.953
org.apache.hadoop.hbase.master.TestMaster 1 18.918
org.apache.hadoop.hbase.master.TestOpenedRegionHandler 2 20.57
org.apache.hadoop.hbase.master.TestSplitLogManager 10 13.979
org.apache.hadoop.hbase.master.TestZKBasedOpenCloseRegion 3 21.675
org.apache.hadoop.hbase.regionserver.handler.TestOpenRegionHandler 3 0.887
org.apache.hadoop.hbase.regionserver.TestBlocksRead 4 1.42
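The "small is the default, everything else is marked" rule from the HBASE-4737 description can be sketched with a plain runtime annotation. The issue proposes JUnit categories for the real thing; the @TestSize annotation, the enum and the example classes below are invented stand-ins so that the sketch runs without JUnit on the classpath.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for JUnit categories, not HBase code.
final class TestSizes {
    enum Size { SMALL, MEDIUM, LARGE }

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface TestSize { Size value(); }

    // Unmarked classes count as small, mirroring "Small is the default".
    static Size sizeOf(Class<?> testClass) {
        TestSize marker = testClass.getAnnotation(TestSize.class);
        return marker == null ? Size.SMALL : marker.value();
    }

    // Keeps only the classes whose size is one of the wanted sizes.
    static List<Class<?>> select(List<Class<?>> all, Size... wanted) {
        List<Size> keep = Arrays.asList(wanted);
        List<Class<?>> out = new ArrayList<>();
        for (Class<?> c : all) {
            if (keep.contains(sizeOf(c))) {
                out.add(c);
            }
        }
        return out;
    }

    // Placeholder test classes for the example.
    static class ExampleSmallTest {}
    @TestSize(Size.MEDIUM) static class ExampleMediumTest {}
    @TestSize(Size.LARGE) static class ExampleLargeTest {}

    public static void main(String[] args) {
        List<Class<?>> all = Arrays.<Class<?>>asList(
            ExampleSmallTest.class, ExampleMediumTest.class, ExampleLargeTest.class);
        // What a developer would run before submitting a patch.
        for (Class<?> c : select(all, Size.SMALL, Size.MEDIUM)) {
            System.out.println(c.getSimpleName() + " " + sizeOf(c));
        }
    }
}
```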
[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
[ https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144966#comment-13144966 ] Hadoop QA commented on HBASE-4746: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12502651/D279.2.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 39 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -164 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 48 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.master.TestDistributedLogSplitting org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithRemove Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/195//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/195//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/195//console This message is automatically generated. Use a random ZK client port in unit tests so we can run them in parallel Key: HBASE-4746 URL: https://issues.apache.org/jira/browse/HBASE-4746 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Attachments: 4746-trunk-v2.txt, D255.1.patch, D279.1.patch, D279.2.patch The hard-coded ZK client port has long been a problem for running HBase test suite in parallel. The mini ZK cluster should run on a random free port, and that port should be passed to all parts of the unit tests that need to talk to the mini cluster. 
In fact, randomizing the port exposes a lot of places in the code where a new configuration is instantiated, and as a result the client tries to talk to the default ZK client port and times out. The initial fix is for 0.89-fb, where it already allows running the unit tests in parallel in 10 minutes. A fix for the trunk will follow.
[jira] [Commented] (HBASE-4511) There is data loss when master failovers
[ https://issues.apache.org/jira/browse/HBASE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144982#comment-13144982 ]

gaojinchao commented on HBASE-4511:

+1

There is data loss when master failovers
Key: HBASE-4511
URL: https://issues.apache.org/jira/browse/HBASE-4511
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.92.0
Reporter: gaojinchao
Assignee: stack
Priority: Minor
Fix For: 0.92.0
Attachments: 4511-v2.txt, 4511.txt, org.apache.hadoop.hbase.master.TestMasterFailover-output.rar, sketch.txt

It goes like this:
The master crashed; at the same time the RS holding meta is crashing, but the RS doesn't exit.
The master starts up again and finds all living RSs.
The master's verification of meta fails, because this RS is crashing.
The master reassigns meta, but it doesn't split the HLog.
So some meta data is lost.

Logs from a failing failover test case:

// It says that we want to kill an RS
2011-09-28 19:54:45,694 INFO [Thread-988] regionserver.HRegionServer(1443): STOPPED: Killing for unit test
2011-09-28 19:54:45,694 INFO [Thread-988] master.TestMasterFailover(1007): RS 192.168.2.102,54385,1317264874629 killed

// The RS didn't crash.
2011-09-28 19:54:51,763 INFO [Master:0;192.168.2.102,54557,1317264885720] master.HMaster(458): Registering server found up in zk: 192.168.2.102,54385,1317264874629
2011-09-28 19:54:51,763 INFO [Master:0;192.168.2.102,54557,1317264885720] master.ServerManager(232): Registering server=192.168.2.102,54385,1317264874629
2011-09-28 19:54:51,770 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(491): master:54557-0x132b31adbb30005 Unable to get data of znode /hbase/unassigned/1028785192 because node does not exist (not an error)
2011-09-28 19:54:51,771 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) of data from znode /hbase/root-region-server and set watcher; 192.168.2.102,54383,131726487...

// Meta verification failed and the meta was reassigned. So all the regions in the meta are lost.
2011-09-28 19:54:51,773 INFO [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(476): Failed verification of .META.,,1 at address=192.168.2.102,54385,1317264874629; org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 192.168.2.102,54385,1317264874629 not running, aborting
2011-09-28 19:54:51,773 DEBUG [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(316): new .META. server: 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null
2011-09-28 19:54:52,274 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) of data from znode /hbase/root-region-server and set watcher; 192.168.2.102,54383,131726487...
2011-09-28 19:54:52,277 INFO [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(476): Failed verification of .META.,,1 at address=192.168.2.102,54385,1317264874629; org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 192.168.2.102,54385,1317264874629 not running, aborting
2011-09-28 19:54:52,277 DEBUG [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(316): new .META. server: 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null
2011-09-28 19:54:52,778 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) of data from znode /hbase/root-region-server and set watcher; 192.168.2.102,54383,131726487...
2011-09-28 19:54:52,782 INFO [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(476): Failed verification of .META.,,1 at address=192.168.2.102,54385,1317264874629; org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 192.168.2.102,54385,1317264874629 not running, aborting
2011-09-28 19:54:52,782 DEBUG [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(316): new .META. server: 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null
2011-09-28 19:54:52,782 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKAssign(264): master:54557-0x132b31adbb30005 Creating (or updating) unassigned node for 1028785192 with OFFLINE state
2011-09-28 19:54:52,825 DEBUG [Thread-988-EventThread] zookeeper.ZooKeeperWatcher(233): master:54557-0x132b31adbb30005 Received
[jira] [Commented] (HBASE-4753) org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws NPE on trunk
[ https://issues.apache.org/jira/browse/HBASE-4753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144983#comment-13144983 ] Hadoop QA commented on HBASE-4753: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12502654/4753_FSTableDescriptors.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 javadoc. The javadoc tool appears to have generated -164 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 48 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.master.TestMasterFailover org.apache.hadoop.hbase.master.TestDistributedLogSplitting Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/196//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/196//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/196//console This message is automatically generated. org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws NPE on trunk Key: HBASE-4753 URL: https://issues.apache.org/jira/browse/HBASE-4753 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: nkeywal Assignee: nkeywal Attachments: 4753_FSTableDescriptors.patch testGetSetOfHTD(org.apache.hadoop.hbase.regionserver.TestHRegionInfo) Time elapsed: 0.011 sec ERROR! 
java.lang.NullPointerException at org.apache.hadoop.hbase.util.FSTableDescriptors.deleteTableDescriptorIfExists(FSTableDescriptors.java:433) at org.apache.hadoop.hbase.regionserver.TestHRegionInfo.testGetSetOfHTD(TestHRegionInfo.java:72) Because the 'getTableInfoPath' can return null and it's not tested. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
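The failure mode described above (a lookup that can return null, followed by an unchecked dereference) can be illustrated with a small standalone sketch. This is not the HBase code or the attached patch: the class, the map-based lookup, and the example path are all hypothetical stand-ins for 'getTableInfoPath' and the file system.

```java
import java.util.HashMap;
import java.util.Map;

public class NullSafeDeleteSketch {
    // Hypothetical stand-in for a lookup such as getTableInfoPath:
    // returns null when no descriptor exists for the table.
    public static String findTableInfoPath(Map<String, String> paths, String table) {
        return paths.get(table);
    }

    // Returns true if a descriptor was removed, false if there was none.
    public static boolean deleteIfExists(Map<String, String> paths, String table) {
        String path = findTableInfoPath(paths, table);
        if (path == null) {
            // Nothing to delete; returning here avoids dereferencing null.
            return false;
        }
        paths.remove(table);
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> paths = new HashMap<String, String>();
        paths.put("t1", "/tmp/example/t1"); // made-up path for illustration
        System.out.println(deleteIfExists(paths, "t1"));
        System.out.println(deleteIfExists(paths, "missing"));
    }
}
```

Whether a missing descriptor should be a silent no-op or an error is a design choice for the real method; the sketch only shows the guard.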
[jira] [Updated] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache
[ https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4752: -- Status: Open (was: Patch Available) Don't create an unnecessary LinkedList when evicting from the BlockCache Key: HBASE-4752 URL: https://issues.apache.org/jira/browse/HBASE-4752 Project: HBase Issue Type: Improvement Components: performance, regionserver Affects Versions: 0.90.4 Reporter: Benoit Sigoure Assignee: Benoit Sigoure Priority: Minor Attachments: 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 4752-trunk.txt When evicting from the BlockCache, the code creates a LinkedList containing every single block sorted by access time. This list is created from a PriorityQueue. I don't believe it is necessary, as the PriorityQueue can be used directly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache
[ https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu reassigned HBASE-4752: - Assignee: Ted Yu (was: Benoit Sigoure) Don't create an unnecessary LinkedList when evicting from the BlockCache Key: HBASE-4752 URL: https://issues.apache.org/jira/browse/HBASE-4752 Project: HBase Issue Type: Improvement Components: performance, regionserver Affects Versions: 0.90.4 Reporter: Benoit Sigoure Assignee: Ted Yu Priority: Minor Attachments: 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 4752-trunk.txt When evicting from the BlockCache, the code creates a LinkedList containing every single block sorted by access time. This list is created from a PriorityQueue. I don't believe it is necessary, as the PriorityQueue can be used directly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache
[ https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145004#comment-13145004 ] Ted Yu commented on HBASE-4752: --- The gist of CachedBlockQueue now lies in its add() method which does heap size accounting. I think we should keep this class. Don't create an unnecessary LinkedList when evicting from the BlockCache Key: HBASE-4752 URL: https://issues.apache.org/jira/browse/HBASE-4752 Project: HBase Issue Type: Improvement Components: performance, regionserver Affects Versions: 0.90.4 Reporter: Benoit Sigoure Assignee: Ted Yu Priority: Minor Attachments: 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 4752-trunk-v2.txt, 4752-trunk.txt When evicting from the BlockCache, the code creates a LinkedList containing every single block sorted by access time. This list is created from a PriorityQueue. I don't believe it is necessary, as the PriorityQueue can be used directly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache
[ https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4752: -- Status: Patch Available (was: Open) Patch testing trunk v2. Don't create an unnecessary LinkedList when evicting from the BlockCache Key: HBASE-4752 URL: https://issues.apache.org/jira/browse/HBASE-4752 Project: HBase Issue Type: Improvement Components: performance, regionserver Affects Versions: 0.90.4 Reporter: Benoit Sigoure Assignee: Ted Yu Priority: Minor Attachments: 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 4752-trunk-v2.txt, 4752-trunk.txt When evicting from the BlockCache, the code creates a LinkedList containing every single block sorted by access time. This list is created from a PriorityQueue. I don't believe it is necessary, as the PriorityQueue can be used directly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache
[ https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4752: -- Attachment: 4752-trunk-v2.txt Added initial size for MinMaxPriorityQueue. Modified TestCachedBlockQueue to reflect its original intention. Also simplified the assertions with a loop. TestCachedBlockQueue passes. Don't create an unnecessary LinkedList when evicting from the BlockCache Key: HBASE-4752 URL: https://issues.apache.org/jira/browse/HBASE-4752 Project: HBase Issue Type: Improvement Components: performance, regionserver Affects Versions: 0.90.4 Reporter: Benoit Sigoure Assignee: Ted Yu Priority: Minor Attachments: 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 4752-trunk-v2.txt, 4752-trunk.txt When evicting from the BlockCache, the code creates a LinkedList containing every single block sorted by access time. This list is created from a PriorityQueue. I don't believe it is necessary, as the PriorityQueue can be used directly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
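For readers unfamiliar with the idea of a queue whose add() does heap-size accounting, here is a rough sketch. It is not the CachedBlockQueue source and it uses java.util.PriorityQueue rather than MinMaxPriorityQueue; the Block type, field names, and the exact trimming rule are assumptions made for illustration. The queue keeps the least recently accessed entries and drops the newest ones once the remainder still covers the size budget.

```java
import java.util.PriorityQueue;

public class SizeBoundedQueueSketch {
    // Hypothetical block record: last access time and size in bytes.
    public static final class Block {
        public final long accessTime;
        public final long size;
        public Block(long accessTime, long size) {
            this.accessTime = accessTime;
            this.size = size;
        }
    }

    // Head of the queue is the most recently accessed entry still kept.
    private final PriorityQueue<Block> queue = new PriorityQueue<Block>(
        16, (a, b) -> Long.compare(b.accessTime, a.accessTime));
    private final long maxSize;
    private long heapSize = 0;

    public SizeBoundedQueueSketch(long maxSize) {
        this.maxSize = maxSize;
    }

    public void add(Block b) {
        if (heapSize < maxSize) {
            // Budget not yet covered: always accept.
            queue.add(b);
            heapSize += b.size;
            return;
        }
        Block newest = queue.peek();
        if (newest != null && b.accessTime < newest.accessTime) {
            // The new block is older than the newest kept block: accept it,
            // then drop newest entries while the rest still covers maxSize.
            queue.add(b);
            heapSize += b.size;
            while (heapSize - queue.peek().size >= maxSize) {
                heapSize -= queue.poll().size;
            }
        }
    }

    public long heapSize() { return heapSize; }
    public int count() { return queue.size(); }
}
```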
[jira] [Commented] (HBASE-4740) [bulk load] the HBASE-4552 API can't tell if errors on region server is recoverable or unrecoverable error.
[ https://issues.apache.org/jira/browse/HBASE-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145008#comment-13145008 ] Ted Yu commented on HBASE-4740: --- Some test failures were due to 'Too many open files'. I couldn't reproduce the TestThriftHBaseServiceHandler-related failures: {code} Running org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandler Tests run: 11, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 15.595 sec Results : Tests run: 11, Failures: 0, Errors: 0, Skipped: 0 [INFO] [INFO] BUILD SUCCESS [INFO] [INFO] Total time: 21.351s [INFO] Finished at: Sun Nov 06 05:56:09 PST 2011 {code} [bulk load] the HBASE-4552 API can't tell if errors on region server is recoverable or unrecoverable error. Key: HBASE-4740 URL: https://issues.apache.org/jira/browse/HBASE-4740 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Priority: Blocker Fix For: 0.92.0 Attachments: 0001-HBASE-4740-bulkload-HBASE-4552-API-can-t-tell-if-err.patch, 4740.v3, hbase-4740.v2.patch Running TestHFileOutputFormat more frequently seems to show that it has become flaky. It is difficult to tell whether this is because of an unrecoverable failure or a recoverable failure. To make this visible from tests and to users, we need to change the interface of the bulkload call on HRegionServer. The change should make successful RPCs return true, recoverable failures return false, and unrecoverable failures throw an IOException. This is an RPC change, so it would be really good to get this API right before the final 0.92 goes out. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
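The three-way contract proposed in the description (true on success, false on a recoverable failure, IOException on an unrecoverable one) lends itself to a simple caller-side retry loop. The sketch below only illustrates that contract; BulkLoadCall and loadWithRetries are hypothetical names, not the HRegionServer interface or the code in the attached patches.

```java
import java.io.IOException;

public class BulkLoadRetrySketch {
    // Hypothetical view of the proposed contract:
    // true = loaded, false = recoverable failure (retry),
    // IOException = unrecoverable failure (give up immediately).
    public interface BulkLoadCall {
        boolean attempt() throws IOException;
    }

    // Returns the number of attempts it took to succeed.
    public static int loadWithRetries(BulkLoadCall call, int maxAttempts)
            throws IOException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (call.attempt()) {
                return attempt;
            }
            // false: recoverable, so loop and try again.
        }
        throw new IOException(
            "gave up after " + maxAttempts + " recoverable failures");
    }

    public static void main(String[] args) throws IOException {
        final int[] calls = {0};
        // Simulated call: fails recoverably twice, then succeeds.
        int attempts = loadWithRetries(() -> ++calls[0] >= 3, 5);
        System.out.println("succeeded after " + attempts + " attempts");
    }
}
```

An IOException thrown by attempt() propagates straight out of the loop, which is what lets a test or a user tell the two failure kinds apart.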
[jira] [Assigned] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
[ https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu reassigned HBASE-4746: - Assignee: Mikhail Bautin Use a random ZK client port in unit tests so we can run them in parallel Key: HBASE-4746 URL: https://issues.apache.org/jira/browse/HBASE-4746 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: 4746-trunk-v2.txt, D255.1.patch, D279.1.patch, D279.2.patch The hard-coded ZK client port has long been a problem for running HBase test suite in parallel. The mini ZK cluster should run on a random free port, and that port should be passed to all parts of the unit tests that need to talk to the mini cluster. In fact, randomizing the port exposes a lot of places in the code where a new configuration is instantiated, and as a result the client tries to talk to the default ZK client port and times out. The initial fix is for 0.89-fb, where it already allows to run unit tests in parallel in 10 minutes. A fix for the trunk will follow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
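One common way to obtain a random free port in a test, sketched here under the assumption that a brief race is acceptable, is to bind a ServerSocket to port 0 and read back the port the OS assigned. This is not taken from the D255/D279 patches; FreePortSketch and configFor are hypothetical names, and the property key shown is assumed to be the one the mini cluster reads.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.util.Properties;

public class FreePortSketch {
    // Asks the OS for an unused port by binding to port 0.
    // Caveat: another process could grab the port after the socket closes.
    public static int findFreePort() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds one configuration object carrying the chosen port. The point
    // made in the issue is that every component must share this object
    // instead of instantiating a fresh default configuration.
    public static Properties configFor(int port) {
        Properties conf = new Properties();
        // Key name assumed here for illustration.
        conf.setProperty("hbase.zookeeper.property.clientPort",
            String.valueOf(port));
        return conf;
    }

    public static void main(String[] args) {
        int port = findFreePort();
        System.out.println(configFor(port));
    }
}
```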
[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
[ https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145009#comment-13145009 ] Ted Yu commented on HBASE-4746: --- +1 on D279 patch v2. Nice work, Mikhail. Use a random ZK client port in unit tests so we can run them in parallel Key: HBASE-4746 URL: https://issues.apache.org/jira/browse/HBASE-4746 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: 4746-trunk-v2.txt, D255.1.patch, D279.1.patch, D279.2.patch The hard-coded ZK client port has long been a problem for running HBase test suite in parallel. The mini ZK cluster should run on a random free port, and that port should be passed to all parts of the unit tests that need to talk to the mini cluster. In fact, randomizing the port exposes a lot of places in the code where a new configuration is instantiated, and as a result the client tries to talk to the default ZK client port and times out. The initial fix is for 0.89-fb, where it already allows to run unit tests in parallel in 10 minutes. A fix for the trunk will follow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4753) org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws NPE on trunk
[ https://issues.apache.org/jira/browse/HBASE-4753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145010#comment-13145010 ] Ted Yu commented on HBASE-4753: --- +1 on patch. org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws NPE on trunk Key: HBASE-4753 URL: https://issues.apache.org/jira/browse/HBASE-4753 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: nkeywal Assignee: nkeywal Attachments: 4753_FSTableDescriptors.patch testGetSetOfHTD(org.apache.hadoop.hbase.regionserver.TestHRegionInfo) Time elapsed: 0.011 sec ERROR! java.lang.NullPointerException at org.apache.hadoop.hbase.util.FSTableDescriptors.deleteTableDescriptorIfExists(FSTableDescriptors.java:433) at org.apache.hadoop.hbase.regionserver.TestHRegionInfo.testGetSetOfHTD(TestHRegionInfo.java:72) Because the 'getTableInfoPath' can return null and it's not tested. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache
[ https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145025#comment-13145025 ] Hadoop QA commented on HBASE-4752: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12502664/4752-trunk-v2.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -164 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 48 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandler org.apache.hadoop.hbase.coprocessor.TestMasterObserver org.apache.hadoop.hbase.master.TestDistributedLogSplitting org.apache.hadoop.hbase.master.TestMasterFailover Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/197//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/197//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/197//console This message is automatically generated. Don't create an unnecessary LinkedList when evicting from the BlockCache Key: HBASE-4752 URL: https://issues.apache.org/jira/browse/HBASE-4752 Project: HBase Issue Type: Improvement Components: performance, regionserver Affects Versions: 0.90.4 Reporter: Benoit Sigoure Assignee: Ted Yu Priority: Minor Attachments: 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 4752-trunk-v2.txt, 4752-trunk.txt When evicting from the BlockCache, the code creates a LinkedList containing every single block sorted by access time. 
This list is created from a PriorityQueue. I don't believe it is necessary, as the PriorityQueue can be used directly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache
[ https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4752: -- Fix Version/s: 0.94.0 Don't create an unnecessary LinkedList when evicting from the BlockCache Key: HBASE-4752 URL: https://issues.apache.org/jira/browse/HBASE-4752 Project: HBase Issue Type: Improvement Components: performance, regionserver Affects Versions: 0.90.4 Reporter: Benoit Sigoure Assignee: Ted Yu Priority: Minor Fix For: 0.94.0 Attachments: 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 4752-trunk-v2.txt, 4752-trunk.txt When evicting from the BlockCache, the code creates a LinkedList containing every single block sorted by access time. This list is created from a PriorityQueue. I don't believe it is necessary, as the PriorityQueue can be used directly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache
[ https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145034#comment-13145034 ] Ted Yu commented on HBASE-4752: --- I ran TestMasterObserver#testRegionTransitionOperations and it passed. Test failures didn't seem to be related to the patch in this JIRA. Don't create an unnecessary LinkedList when evicting from the BlockCache Key: HBASE-4752 URL: https://issues.apache.org/jira/browse/HBASE-4752 Project: HBase Issue Type: Improvement Components: performance, regionserver Affects Versions: 0.90.4 Reporter: Benoit Sigoure Assignee: Ted Yu Priority: Minor Fix For: 0.94.0 Attachments: 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 4752-trunk-v2.txt, 4752-trunk.txt When evicting from the BlockCache, the code creates a LinkedList containing every single block sorted by access time. This list is created from a PriorityQueue. I don't believe it is necessary, as the PriorityQueue can be used directly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache
[ https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145039#comment-13145039 ] Lars Hofhansl commented on HBASE-4752: -- One part is still different: in the old implementation, LruBlockCache.free() always drained the entire queue. Now we only drain until we have evicted enough. Not sure whether that was intentional or not, but to be functionally equivalent we should do this: {code} while(...) { ... if (freedBytes >= toFree) { queue.clear(); return freedBytes; } } queue.clear(); ... {code} Or similar (or clear() in a finally block). Don't create an unnecessary LinkedList when evicting from the BlockCache Key: HBASE-4752 URL: https://issues.apache.org/jira/browse/HBASE-4752 Project: HBase Issue Type: Improvement Components: performance, regionserver Affects Versions: 0.90.4 Reporter: Benoit Sigoure Assignee: Ted Yu Priority: Minor Fix For: 0.94.0 Attachments: 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 4752-trunk-v2.txt, 4752-trunk.txt When evicting from the BlockCache, the code creates a LinkedList containing every single block sorted by access time. This list is created from a PriorityQueue. I don't believe it is necessary, as the PriorityQueue can be used directly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
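The "clear() in a finally block" variant mentioned in the comment can be sketched as a standalone method. This is only an illustration of the pattern, not LruBlockCache.free(): the queue here holds plain byte counts ordered by size, whereas the real queue holds cached blocks ordered by access time.

```java
import java.util.PriorityQueue;

public class DrainSketch {
    // Polls entries until at least toFree bytes have been freed, and
    // guarantees the queue is empty on every exit path via finally.
    public static long free(PriorityQueue<Long> queue, long toFree) {
        long freedBytes = 0;
        try {
            Long size;
            while ((size = queue.poll()) != null) {
                freedBytes += size;
                if (freedBytes >= toFree) {
                    return freedBytes; // early exit; finally still clears
                }
            }
            return freedBytes;
        } finally {
            queue.clear();
        }
    }

    public static void main(String[] args) {
        PriorityQueue<Long> queue = new PriorityQueue<Long>();
        queue.add(10L);
        queue.add(20L);
        queue.add(30L);
        System.out.println(free(queue, 25L) + " bytes freed, "
            + queue.size() + " entries left");
    }
}
```

Putting the clear() in finally means the early return and the fall-through share one cleanup path, so the two call sites in the comment's version cannot drift apart.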
[jira] [Commented] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache
[ https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145046#comment-13145046 ] Ted Yu commented on HBASE-4752: --- bq. Now we only drain until we have evicted enough. I would say this should be the correct behavior. Don't create an unnecessary LinkedList when evicting from the BlockCache Key: HBASE-4752 URL: https://issues.apache.org/jira/browse/HBASE-4752 Project: HBase Issue Type: Improvement Components: performance, regionserver Affects Versions: 0.90.4 Reporter: Benoit Sigoure Assignee: Ted Yu Priority: Minor Fix For: 0.94.0 Attachments: 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 4752-trunk-v2.txt, 4752-trunk.txt When evicting from the BlockCache, the code creates a LinkedList containing every single block sorted by access time. This list is created from a PriorityQueue. I don't believe it is necessary, as the PriorityQueue can be used directly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4742) Split dead server's log in parallel
[ https://issues.apache.org/jira/browse/HBASE-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145050#comment-13145050 ] Phabricator commented on HBASE-4742: Liyin has commented on the revision [jira] [HBASE-4742] Split dead server's log in parallel. Thanks for reviewing it in the weekend. INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/master/HMaster.java:311 OK I can start with 1 instead of 0 src/main/java/org/apache/hadoop/hbase/master/HMaster.java:314 I don't think it is necessary here. Because the coreThreadNumber is 0. Every thread launched by this thread pool is not from core pool. REVISION DETAIL https://reviews.facebook.net/D237 Split dead server's log in parallel --- Key: HBASE-4742 URL: https://issues.apache.org/jira/browse/HBASE-4742 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang Attachments: D237.1.patch, D237.2.patch, D237.3.patch, D237.4.patch, D237.5.patch When one region server goes down, the master will shutdown the region server and split its log. However, splitting log is a blocking call and it would take some time. If more than one region server go down, the master will split its log one by one, which is not efficient. Since we have the distributed log split, we could split these logs from the dead servers in parallel. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
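The two inline points above (thread indexes starting at 1, and a pool whose core size is 0) can be shown with java.util.concurrent directly. This is a sketch, not the code in D237: the class name, the "log-split-" prefix, the one-second keep-alive, and the choice of SynchronousQueue are assumptions.

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SplitPoolSketch {
    public static ThreadPoolExecutor newPool(int maxThreads) {
        final AtomicInteger index = new AtomicInteger(0);
        ThreadFactory factory = new ThreadFactory() {
            @Override
            public Thread newThread(Runnable r) {
                // incrementAndGet() makes the first thread number 1, not 0.
                return new Thread(r, "log-split-" + index.incrementAndGet());
            }
        };
        // corePoolSize = 0: every worker is a non-core thread, so each one
        // exits once it has been idle for the keep-alive time.
        return new ThreadPoolExecutor(0, maxThreads, 1L, TimeUnit.SECONDS,
            new SynchronousQueue<Runnable>(), factory);
    }

    public static void main(String[] args) throws Exception {
        ThreadPoolExecutor pool = newPool(4);
        pool.submit(new Runnable() {
            @Override
            public void run() {
                System.out.println(Thread.currentThread().getName());
            }
        }).get();
        pool.shutdown();
    }
}
```

With a core size of 0, idle threads already time out through the ordinary keep-alive, which is consistent with Liyin's reply that allowCoreThreadTimeOut(true) would add nothing here. A SynchronousQueue is used so that concurrent submissions start new threads up to maxThreads rather than queueing behind a single worker; submissions beyond maxThreads would be rejected under the default handler.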
[jira] [Commented] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache
[ https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145054#comment-13145054 ] Lars Hofhansl commented on HBASE-4752: -- Hard to say, but i would agree. Maybe jgray can chime in? Don't create an unnecessary LinkedList when evicting from the BlockCache Key: HBASE-4752 URL: https://issues.apache.org/jira/browse/HBASE-4752 Project: HBase Issue Type: Improvement Components: performance, regionserver Affects Versions: 0.90.4 Reporter: Benoit Sigoure Assignee: Ted Yu Priority: Minor Fix For: 0.94.0 Attachments: 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 4752-trunk-v2.txt, 4752-trunk.txt When evicting from the BlockCache, the code creates a LinkedList containing every single block sorted by access time. This list is created from a PriorityQueue. I don't believe it is necessary, as the PriorityQueue can be used directly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4742) Split dead server's log in parallel
[ https://issues.apache.org/jira/browse/HBASE-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145059#comment-13145059 ] Phabricator commented on HBASE-4742: khemani has commented on the revision [jira] [HBASE-4742] Split dead server's log in parallel. If you splitLog(list-of-servernames) then the function won't return until all the servers' logs are split. If you call multiple splitLog(servername) then each one returns as soon as it is done. This is what we are doing now and we should stay with it. INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/master/HMaster.java:305 managing threads through a thread pool is a good thing. But I doubt that there is any scenario where the number of threads will increase without limit. The number of threads will always be limited by the number of region servers. And a new incarnation of a region server is not instantiated by the master unless the old one's log splitting is done. but i think we should still use the thread pool. it is a better abstraction. src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java:304 Can you please do a new Runnable inline over here. That will make it easier for someone to read ProcessServerShutdown() code. Having two methods run() and process() kind of makes it confusing to understand ProcessServerShutdown. src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java:323 This should be LOG.debug. You might consider rate limiting this debug output. src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java:333 This should be LOG.warn src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java:339 Runtime.halt()? Even if you throw an exception the retry mechanism will kick in. src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java:304 If master shutdown is called (abort or regular shutdown) then I am not sure that the pool is shutting down properly. 
Someone has to interrupt each of the threads that have called splitLog(). Or those splitLog() threads have to be daemons. src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java:316 Now that you have a thread pool you can look at the status of the submitted tasks instead of maintaining special states :) REVISION DETAIL https://reviews.facebook.net/D237 Split dead server's log in parallel --- Key: HBASE-4742 URL: https://issues.apache.org/jira/browse/HBASE-4742 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang Attachments: D237.1.patch, D237.2.patch, D237.3.patch, D237.4.patch, D237.5.patch When one region server goes down, the master will shutdown the region server and split its log. However, splitting log is a blocking call and it would take some time. If more than one region server go down, the master will split its log one by one, which is not efficient. Since we have the distributed log split, we could split these logs from the dead servers in parallel. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
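Two of the review points above, making the worker threads daemons and reading task status from the pool instead of tracking special states, can be sketched with standard java.util.concurrent types. The class and method names are hypothetical and the cached pool is an assumption; this is not the ProcessServerShutdown code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;

public class SplitTaskStatusSketch {
    // Daemon workers do not keep the JVM alive if shutdown forgets them.
    public static ExecutorService newDaemonPool() {
        return Executors.newCachedThreadPool(new ThreadFactory() {
            @Override
            public Thread newThread(Runnable r) {
                Thread t = new Thread(r);
                t.setDaemon(true);
                return t;
            }
        });
    }

    // The Future returned by submit() already records the task's state.
    // isCancelled() is checked first because isDone() is also true for
    // cancelled tasks.
    public static String describe(Future<?> task) {
        if (task.isCancelled()) {
            return "cancelled";
        }
        return task.isDone() ? "done" : "pending";
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = newDaemonPool();
        Future<?> task = pool.submit(new Runnable() {
            @Override
            public void run() { /* stand-in for a splitLog() call */ }
        });
        task.get();
        System.out.println(describe(task));
        // shutdownNow() interrupts workers that are still running.
        pool.shutdownNow();
    }
}
```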
[jira] [Updated] (HBASE-4742) Split dead server's log in parallel
[ https://issues.apache.org/jira/browse/HBASE-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-4742: --- Attachment: D237.6.patch Liyin updated the revision [jira] [HBASE-4742] Split dead server's log in parallel. Reviewers: Kannan, khemani, Karthik, mbautin, JIRA Address Prakash and Ted's comments. Thanks for reviewing in the weekend. REVISION DETAIL https://reviews.facebook.net/D237 AFFECTED FILES src/main/java/org/apache/hadoop/hbase/master/HMaster.java src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java src/main/java/org/apache/hadoop/hbase/master/RegionServerOperationQueue.java src/test/java/org/apache/hadoop/hbase/master/TestMultiRegionServerShutDown.java Split dead server's log in parallel --- Key: HBASE-4742 URL: https://issues.apache.org/jira/browse/HBASE-4742 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang Attachments: D237.1.patch, D237.2.patch, D237.3.patch, D237.4.patch, D237.5.patch, D237.6.patch When one region server goes down, the master will shutdown the region server and split its log. However, splitting log is a blocking call and it would take some time. If more than one region server go down, the master will split its log one by one, which is not efficient. Since we have the distributed log split, we could split these logs from the dead servers in parallel. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache
[ https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145089#comment-13145089 ] Lars Hofhansl commented on HBASE-4752: -- +1 still on v2 Don't create an unnecessary LinkedList when evicting from the BlockCache Key: HBASE-4752 URL: https://issues.apache.org/jira/browse/HBASE-4752 Project: HBase Issue Type: Improvement Components: performance, regionserver Affects Versions: 0.90.4 Reporter: Benoit Sigoure Assignee: Ted Yu Priority: Minor Fix For: 0.94.0 Attachments: 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 4752-trunk-v2.txt, 4752-trunk.txt When evicting from the BlockCache, the code creates a LinkedList containing every single block sorted by access time. This list is created from a PriorityQueue. I don't believe it is necessary, as the PriorityQueue can be used directly.
[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
[ https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145098#comment-13145098 ] Mikhail Bautin commented on HBASE-4746: --- @Ted: how did you generate 4746-trunk-v2.txt? When diffing D279.2.patch against 4746-trunk-v2.txt, I get a lot of differences. Can we use the D279.2.patch? The default ThriftHBaseServiceHandler constructor is not needed anymore in ThriftServer, because I instantiated the configuration in ThriftServer on constructor invocation. But if you think I should add the default constructor back (e.g. if it is part of the client API), please let me know and I will upload a D279.3.patch. Thank you! --Mikhail Use a random ZK client port in unit tests so we can run them in parallel Key: HBASE-4746 URL: https://issues.apache.org/jira/browse/HBASE-4746 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: 4746-trunk-v2.txt, D255.1.patch, D279.1.patch, D279.2.patch The hard-coded ZK client port has long been a problem for running the HBase test suite in parallel. The mini ZK cluster should run on a random free port, and that port should be passed to all parts of the unit tests that need to talk to the mini cluster. In fact, randomizing the port exposes a lot of places in the code where a new configuration is instantiated, and as a result the client tries to talk to the default ZK client port and times out. The initial fix is for 0.89-fb, where it already allows running the unit tests in parallel in 10 minutes. A fix for the trunk will follow.
[jira] [Updated] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
[ https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-4737: --- Attachment: 4737_all.patch The impact on the Java files: just JUnit categories. Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM -- Key: HBASE-4737 URL: https://issues.apache.org/jira/browse/HBASE-4737 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 2003_4737_pom.dummy.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.v2.patch, 2003_4737_pom.v2.patch, 4737_all.patch, hbasetests.sh
1) Split the tests into 3 categories
 - small: no cluster, less than 15s, can be run in parallel with other tests in a JVM
 - medium: 45s, not flaky, useful to detect bugs immediately
 - large: remaining
2) Allow running a subset: developers should need to run only small and medium before submitting a patch
 - will need a surefire patch, see http://jira.codehaus.org/browse/SUREFIRE-329
Small is the default. All other tests will have to be marked Medium or Large with a JUnit category.
Proposed split:
 Small: 122 classes, 479 methods, ~3 minutes (when not run in parallel)
 Medium: 78 classes, 373 methods, ~23 minutes
 Large: 34 classes, 221 methods, ~60 minutes
I will have to extract the methods that are today in large or medium but could be in small (typically io.hfile.TestHFileBlock#testBlockHeapSize); this will be done in a second step (and another JIRA).
MEDIUM LIST (name; number of methods, time)
org.apache.hadoop.hbase.avro.TestAvroServer 3 31.468
org.apache.hadoop.hbase.catalog.TestCatalogTracker 8 4.174
org.apache.hadoop.hbase.catalog.TestMetaReaderEditorNoCluster 1 3.888
org.apache.hadoop.hbase.catalog.TestMetaReaderEditor 5 30.157
org.apache.hadoop.hbase.client.replication.TestReplicationAdmin 1 0.762
org.apache.hadoop.hbase.client.TestHCM 3 21.961
org.apache.hadoop.hbase.client.TestHTablePool 18 26.274
org.apache.hadoop.hbase.client.TestHTableUtil 2 16.997
org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD 3 24.629
org.apache.hadoop.hbase.client.TestMetaScanner 1 16.365
org.apache.hadoop.hbase.client.TestMultiParallel 10 34.077
org.apache.hadoop.hbase.client.TestTimestampsFilter 3 27.547
org.apache.hadoop.hbase.coprocessor.TestAggregateProtocol 44 16.834
org.apache.hadoop.hbase.coprocessor.TestClassLoading 5 31.346
org.apache.hadoop.hbase.coprocessor.TestCoprocessorEndpoint 2 32.736
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithAbort 1 13.874
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithRemove 1 16.923
org.apache.hadoop.hbase.coprocessor.TestMasterObserver 3 29.97
org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass 2 14.976
org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface 5 33.353
org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort 1 16.596
org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithRemove 1 18.183
org.apache.hadoop.hbase.coprocessor.TestWALObserver 3 19.373
org.apache.hadoop.hbase.filter.TestColumnRangeFilter 1 19.045
org.apache.hadoop.hbase.io.hfile.slab.TestSingleSizeCache 5 24.294
org.apache.hadoop.hbase.io.hfile.slab.TestSlabCache 7 19.818
org.apache.hadoop.hbase.io.hfile.TestHFileBlock 7 25.226
org.apache.hadoop.hbase.io.hfile.TestLruBlockCache 7 0.343
org.apache.hadoop.hbase.mapreduce.TestImportTsv 8 40.391
org.apache.hadoop.hbase.master.TestActiveMasterManager 2 0.724
org.apache.hadoop.hbase.master.TestHMasterRPCException 1 1.17
org.apache.hadoop.hbase.master.TestLogsCleaner 1 2.953
org.apache.hadoop.hbase.master.TestMaster 1 18.918
org.apache.hadoop.hbase.master.TestOpenedRegionHandler 2 20.57
org.apache.hadoop.hbase.master.TestSplitLogManager 10 13.979
org.apache.hadoop.hbase.master.TestZKBasedOpenCloseRegion 3 21.675
org.apache.hadoop.hbase.regionserver.handler.TestOpenRegionHandler 3 0.887
org.apache.hadoop.hbase.regionserver.TestBlocksRead 4 1.42
org.apache.hadoop.hbase.regionserver.TestCompoundBloomFilter 3 22.694
org.apache.hadoop.hbase.regionserver.TestFSErrorsExposed 3 29.764
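The small/medium/large rules in the HBASE-4737 description can be read as a simple decision function. The sketch below is one possible reading, not part of the patch: it treats "medium: 45s" as an upper bound of 45 seconds and sends flaky tests to large, both of which are assumptions, and the class and method names are hypothetical.

```java
public class TestSizeSketch {

  enum Size { SMALL, MEDIUM, LARGE }

  /**
   * Classifies a test class from three properties.
   * Assumed reading: small = no cluster and under 15s;
   * medium = not flaky and at most 45s; large = everything else.
   */
  static Size classify(boolean startsCluster, boolean flaky, double seconds) {
    if (!startsCluster && seconds < 15.0) {
      return Size.SMALL;
    }
    if (!flaky && seconds <= 45.0) {
      return Size.MEDIUM;
    }
    return Size.LARGE;
  }

  public static void main(String[] args) {
    System.out.println(classify(false, false, 3.0));
    System.out.println(classify(true, false, 31.5));
    System.out.println(classify(true, false, 60.0));
  }
}
```

Under this reading a cluster-based test stays medium even when it is fast, which would be consistent with the short run times of some entries in the list, though the description does not say so explicitly.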
[jira] [Updated] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
[ https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-4737: --- Status: Patch Available (was: Open) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM -- Key: HBASE-4737 URL: https://issues.apache.org/jira/browse/HBASE-4737 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 2003_4737_pom.dummy.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.v2.patch, 2003_4737_pom.v2.patch, 4737_all.patch, hbasetests.sh
1) Split the tests into 3 categories
 - small: no cluster, less than 15s, can be run in parallel with other tests in a JVM
 - medium: 45s, not flaky, useful to detect bugs immediately
 - large: remaining
2) Allow running a subset: developers should need to run only small and medium before submitting a patch
 - will need a surefire patch, see http://jira.codehaus.org/browse/SUREFIRE-329
Small is the default. All other tests will have to be marked Medium or Large with a JUnit category.
Proposed split:
 Small: 122 classes, 479 methods, ~3 minutes (when not run in parallel)
 Medium: 78 classes, 373 methods, ~23 minutes
 Large: 34 classes, 221 methods, ~60 minutes
I will have to extract the methods that are today in large or medium but could be in small (typically io.hfile.TestHFileBlock#testBlockHeapSize); this will be done in a second step (and another JIRA).
[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
[ https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145107#comment-13145107 ] Ted Yu commented on HBASE-4746: --- @Mikhail: I planned to use D279.2.patch since its patch testing results were clean. Omitting the default ThriftHBaseServiceHandler ctor is also good since the caller has to pass a Configuration parameter. I generated 4746-trunk-v2.txt by applying D279.1.patch, adding the default ctor and using the 'svn diff' command. Use a random ZK client port in unit tests so we can run them in parallel Key: HBASE-4746 URL: https://issues.apache.org/jira/browse/HBASE-4746 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: 4746-trunk-v2.txt, D255.1.patch, D279.1.patch, D279.2.patch The hard-coded ZK client port has long been a problem for running the HBase test suite in parallel. The mini ZK cluster should run on a random free port, and that port should be passed to all parts of the unit tests that need to talk to the mini cluster. In fact, randomizing the port exposes a lot of places in the code where a new configuration is instantiated, and as a result the client tries to talk to the default ZK client port and times out. The initial fix is for 0.89-fb, where it already allows running the unit tests in parallel in 10 minutes. A fix for the trunk will follow.
[jira] [Commented] (HBASE-4742) Split dead server's log in parallel
[ https://issues.apache.org/jira/browse/HBASE-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145109#comment-13145109 ] Phabricator commented on HBASE-4742: lhofhansl has commented on the revision [jira] [HBASE-4742] Split dead server's log in parallel. Very nice improvement! INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java:285 This method is not used anywhere, correct? REVISION DETAIL https://reviews.facebook.net/D237 Split dead server's log in parallel --- Key: HBASE-4742 URL: https://issues.apache.org/jira/browse/HBASE-4742 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang Attachments: D237.1.patch, D237.2.patch, D237.3.patch, D237.4.patch, D237.5.patch, D237.6.patch When one region server goes down, the master will shut down the region server and split its log. However, splitting the log is a blocking call and takes some time. If more than one region server goes down, the master will split their logs one by one, which is not efficient. Since we have the distributed log split, we could split these logs from the dead servers in parallel.
[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
[ https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145113#comment-13145113 ] Lars Hofhansl commented on HBASE-4746: -- +1 on D279.2. Thanks for doing this Mikhail.
{code}
public int getRandomPort() {
  return 0xc000 + new Random().nextInt(0x3f00);
}
{code}
Could use a comment on the choice of values. Why did the statics have to be removed from TestMasterReplication but not from TestReplication and TestMultislaveReplication (and I do apologize for the cut-and-pasting here)? Use a random ZK client port in unit tests so we can run them in parallel Key: HBASE-4746 URL: https://issues.apache.org/jira/browse/HBASE-4746 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: 4746-trunk-v2.txt, D255.1.patch, D279.1.patch, D279.2.patch The hard-coded ZK client port has long been a problem for running the HBase test suite in parallel. The mini ZK cluster should run on a random free port, and that port should be passed to all parts of the unit tests that need to talk to the mini cluster. In fact, randomizing the port exposes a lot of places in the code where a new configuration is instantiated, and as a result the client tries to talk to the default ZK client port and times out. The initial fix is for 0.89-fb, where it already allows running the unit tests in parallel in 10 minutes. A fix for the trunk will follow.
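On the choice of values in the getRandomPort() snippet quoted in this comment: 0xc000 is 49152 and 0x3f00 is 16128, so the method returns ports from 49152 to 65279 inclusive. That interval lies inside the range IANA designates for dynamic/private ports (49152 to 65535); whether this was the author's reasoning is not stated in the thread. The sketch below only works through that arithmetic. The class name and the Random parameter are additions for the example.

```java
import java.util.Random;

public class RandomPortRange {
  // Constants copied from the quoted snippet.
  static final int BASE = 0xc000; // 49152
  static final int SPAN = 0x3f00; // 16128

  /** Same formula as the quoted method, with the Random passed in for testing. */
  static int getRandomPort(Random random) {
    return BASE + random.nextInt(SPAN); // nextInt(n) returns a value in [0, n)
  }

  public static void main(String[] args) {
    int lowest = BASE;             // smallest possible result
    int highest = BASE + SPAN - 1; // largest possible result
    System.out.println("range: " + lowest + ".." + highest);
    System.out.println("sample: " + getRandomPort(new Random()));
  }
}
```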
[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
[ https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145116#comment-13145116 ] Phabricator commented on HBASE-4746: stack has accepted the revision [jira] [HBASE-4746] [89-fb] Use a random ZK client port in unit tests so we can run them in parallel. Very nice Mikhail. Minor nits. Please attach patch to JIRA so we can commit. If you have a moment, perhaps address the above else we'll do it on commit. Good stuff. INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/MiniZooKeeperCluster.java:113 I'd think this log message rare since the port randomly chosen. When it does happen, I think it should be noticeable so I'm fine w/ INFO. DEBUG would work too. src/main/java/org/apache/hadoop/hbase/avro/AvroServer.java:142 Good src/main/java/org/apache/hadoop/hbase/client/HTable.java:90 +1 src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java:200 Deprecate? src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:46 Why avro package stuff in here? src/test/resources/hbase-site.xml:128 Good catch. REVISION DETAIL https://reviews.facebook.net/D255 Use a random ZK client port in unit tests so we can run them in parallel Key: HBASE-4746 URL: https://issues.apache.org/jira/browse/HBASE-4746 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: 4746-trunk-v2.txt, D255.1.patch, D279.1.patch, D279.2.patch The hard-coded ZK client port has long been a problem for running HBase test suite in parallel. The mini ZK cluster should run on a random free port, and that port should be passed to all parts of the unit tests that need to talk to the mini cluster. In fact, randomizing the port exposes a lot of places in the code where a new configuration is instantiated, and as a result the client tries to talk to the default ZK client port and times out. 
The initial fix is for 0.89-fb, where it already allows running the unit tests in parallel in 10 minutes. A fix for the trunk will follow.
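The inline comment on MiniZooKeeperCluster.java:113 suggests that a message is logged when the randomly chosen port turns out to be unavailable. A minimal sketch of such a pick-and-retry loop is shown below. It is an illustration under assumptions and not the MiniZooKeeperCluster code: the class, the method names, the attempt limit and the message text are hypothetical, and the port formula is borrowed from the getRandomPort() snippet quoted earlier in this thread.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.Random;
import java.util.function.IntPredicate;

public class RandomPortRetrySketch {

  /** Picks random candidates until isFree accepts one or the attempts run out. */
  static int pickPort(Random random, IntPredicate isFree, int maxAttempts) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      int candidate = 0xc000 + random.nextInt(0x3f00);
      if (isFree.test(candidate)) {
        return candidate;
      }
      // Expected to be rare because the port is chosen at random.
      System.out.println("Port " + candidate + " is taken, trying another one");
    }
    throw new IllegalStateException("No free port after " + maxAttempts + " attempts");
  }

  /** Probe that briefly binds the port; another process may still take it afterwards. */
  static boolean canBind(int port) {
    try (ServerSocket socket = new ServerSocket(port)) {
      return true;
    } catch (IOException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    int port = pickPort(new Random(), RandomPortRetrySketch::canBind, 10);
    System.out.println("Would start the mini ZK cluster on port " + port);
  }
}
```

A real server would keep the bound socket rather than probe and release it, since the probe leaves a small window in which the port can be claimed by someone else.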
[jira] [Updated] (HBASE-3939) Some crossports of Hadoop IPC fixes
[ https://issues.apache.org/jira/browse/HBASE-3939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-3939: - Resolution: Fixed Status: Resolved (was: Patch Available) Committed branch and trunk. Thanks to all you who helped get this in. Some crossports of Hadoop IPC fixes --- Key: HBASE-3939 URL: https://issues.apache.org/jira/browse/HBASE-3939 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Todd Lipcon Priority: Critical Fix For: 0.92.0 Attachments: 3939-v2.txt, 3939-v3.txt, 3939-v4.txt, 3939-v5.txt, 3939-v6.txt, 3939-v7.txt, 3939-v8.txt, 3939-v9.txt, 3939.txt A few fixes from Hadoop IPC that we should probably cross-port into our copy: - HADOOP-7227: remove the protocol version check at call time - HADOOP-7146: fix a socket leak in server - HADOOP-7121: fix behavior when response serialization throws an exception - HADOOP-7346: send back nicer error response when client is using an out of date IPC version
[jira] [Commented] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache
[ https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145127#comment-13145127 ] Benoit Sigoure commented on HBASE-4752: --- +1 too. Sorry my initial patch was getting entries in the reverse order. As I said, I couldn't test it because branch 0.90 didn't even compile for me. Don't create an unnecessary LinkedList when evicting from the BlockCache Key: HBASE-4752 URL: https://issues.apache.org/jira/browse/HBASE-4752 Project: HBase Issue Type: Improvement Components: performance, regionserver Affects Versions: 0.90.4 Reporter: Benoit Sigoure Assignee: Ted Yu Priority: Minor Fix For: 0.94.0 Attachments: 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 4752-trunk-v2.txt, 4752-trunk.txt When evicting from the BlockCache, the code creates a LinkedList containing every single block sorted by access time. This list is created from a PriorityQueue. I don't believe it is necessary, as the PriorityQueue can be used directly.
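The ordering problem discussed in this thread comes down to how a java.util.PriorityQueue is drained: poll() always returns the smallest remaining element, so copying in poll order and filling an array from the back produce opposite orders. The sketch below shows only that general behaviour with plain long values standing in for access times. It is not the CachedBlockQueue code, the names are hypothetical, and it makes no claim about which order the eviction code needs.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

public class PollOrderSketch {

  /** Drains the queue front to back: the result is in ascending order. */
  static long[] drainFront(PriorityQueue<Long> queue) {
    long[] out = new long[queue.size()];
    for (int i = 0; i < out.length; i++) {
      out[i] = queue.poll(); // smallest remaining element first
    }
    return out;
  }

  /** Drains the queue while filling the array from the back: descending order. */
  static long[] drainBack(PriorityQueue<Long> queue) {
    long[] out = new long[queue.size()]; // size is known up front
    for (int i = out.length - 1; i >= 0; i--) {
      out[i] = queue.poll();
    }
    return out;
  }

  public static void main(String[] args) {
    PriorityQueue<Long> a = new PriorityQueue<Long>(Arrays.asList(5L, 1L, 9L, 3L));
    PriorityQueue<Long> b = new PriorityQueue<Long>(Arrays.asList(5L, 1L, 9L, 3L));
    System.out.println(Arrays.toString(drainFront(a)));
    System.out.println(Arrays.toString(drainBack(b)));
  }
}
```

Both variants avoid an intermediate LinkedList because the target array can be sized once from queue.size().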
[jira] [Updated] (HBASE-4683) Create config option to only cache index blocks
[ https://issues.apache.org/jira/browse/HBASE-4683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-4683: - Attachment: 4683-v2.txt v2 has a config option to cache index blocks with in-memory priority. Create config option to only cache index blocks --- Key: HBASE-4683 URL: https://issues.apache.org/jira/browse/HBASE-4683 Project: HBase Issue Type: New Feature Reporter: Lars Hofhansl Priority: Minor Fix For: 0.94.0 Attachments: 4683-v2.txt, 4683.txt This would add a new boolean config option: hfile.block.cache.datablocks Default would be true. Setting this to false allows running HBase in a mode where only index blocks are cached, which is useful for analytical scenarios where a useful working set of the data cannot be expected to fit into the (aggregate) cache. This is the equivalent of setting cacheBlocks to false on all scans (including scans on behalf of gets). I would like to get a general feeling about what folks think about this. The change itself would be simple.
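The behaviour proposed in HBASE-4683 can be sketched as a single decision: index blocks are always cacheable, data blocks only when hfile.block.cache.datablocks is true (the proposed default). The example below is an illustration only. It uses java.util.Properties as a stand-in for the HBase configuration object so that it stays self-contained, and the class and method names are hypothetical rather than taken from the attached patches.

```java
import java.util.Properties;

public class IndexOnlyCacheSketch {
  // Option name as given in the issue description.
  static final String CACHE_DATA_BLOCKS_KEY = "hfile.block.cache.datablocks";

  /** Returns true if a block of the given kind should go into the block cache. */
  static boolean shouldCache(Properties conf, boolean isDataBlock) {
    // A missing value means true, matching the proposed default.
    boolean cacheDataBlocks =
        Boolean.parseBoolean(conf.getProperty(CACHE_DATA_BLOCKS_KEY, "true"));
    return !isDataBlock || cacheDataBlocks;
  }

  public static void main(String[] args) {
    Properties indexOnly = new Properties();
    indexOnly.setProperty(CACHE_DATA_BLOCKS_KEY, "false");
    System.out.println("data block cached: " + shouldCache(indexOnly, true));
    System.out.println("index block cached: " + shouldCache(indexOnly, false));
  }
}
```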
[jira] [Updated] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
[ https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-4737: --- Attachment: hbasetests.sh
Shell script to execute the tests:
 - small tests in parallel within the JVM
 - medium and large tests with 2 JVMs
The JIRA contains 3 patches:
 - 1 for the pom
 - 1 for the Java modifications
 - 1 for the shell script
I will create another JIRA when I commit the impact for surefire.
Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM -- Key: HBASE-4737 URL: https://issues.apache.org/jira/browse/HBASE-4737 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 2003_4737_pom.dummy.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.v2.patch, 2003_4737_pom.v2.patch, 4737_all.patch, hbasetests.sh, hbasetests.sh
1) Split the tests into 3 categories
 - small: no cluster, less than 15s, can be run in parallel with other tests in a JVM
 - medium: 45s, not flaky, useful to detect bugs immediately
 - large: remaining
2) Allow running a subset: developers should need to run only small and medium before submitting a patch
 - will need a surefire patch, see http://jira.codehaus.org/browse/SUREFIRE-329
Small is the default. All other tests will have to be marked Medium or Large with a JUnit category.
Proposed split:
 Small: 122 classes, 479 methods, ~3 minutes (when not run in parallel)
 Medium: 78 classes, 373 methods, ~23 minutes
 Large: 34 classes, 221 methods, ~60 minutes
I will have to extract the methods that are today in large or medium but could be in small (typically io.hfile.TestHFileBlock#testBlockHeapSize); this will be done in a second step (and another JIRA).
[jira] [Commented] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
[ https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145136#comment-13145136 ] Hadoop QA commented on HBASE-4737: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12502696/hbasetests.sh against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 1 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/199//console This message is automatically generated. Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM -- Key: HBASE-4737 URL: https://issues.apache.org/jira/browse/HBASE-4737 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 2003_4737_pom.dummy.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.v2.patch, 2003_4737_pom.v2.patch, 4737_all.patch, hbasetests.sh, hbasetests.sh 1) Split the tests in 3 categories - small: no cluster, less than 15s, can be run in parallel with other tests in a JVM - medium: 45s, no flaky, useful to detect bugs immediatly - large: remaining 2) Allow to run a subset: developpers should need to run only small and medium before submitting a patch - will need a surefire patch, see http://jira.codehaus.org/browse/SUREFIRE-329 Small is the default. All other tests will have to be marked Medium or Large with a JUnit category. 
Proposed split: Small:122 classes, 479 methods, ~3 minutes when no //) Medium: 78 classes, 373 methods, ~23 minutes Large: 34 classes, 221 methods, ~60 minutes I will have to extract the methods that are today in large or medium but could be in small (typically io.hfile.TestHFileBlock#testBlockHeapSize), it will be done in a second step (and another JIRA). MEDIUM LIST (name; number of methods, time) org.apache.hadoop.hbase.avro.TestAvroServer 3 31.468 org.apache.hadoop.hbase.catalog.TestCatalogTracker8 4.174 org.apache.hadoop.hbase.catalog.TestMetaReaderEditorNoCluster 1 3.888 org.apache.hadoop.hbase.catalog.TestMetaReaderEditor 5 30.157 org.apache.hadoop.hbase.client.replication.TestReplicationAdmin 1 0.762 org.apache.hadoop.hbase.client.TestHCM3 21.961 org.apache.hadoop.hbase.client.TestHTablePool 18 26.274 org.apache.hadoop.hbase.client.TestHTableUtil 2 16.997 org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD 3 24.629 org.apache.hadoop.hbase.client.TestMetaScanner1 16.365 org.apache.hadoop.hbase.client.TestMultiParallel 10 34.077 org.apache.hadoop.hbase.client.TestTimestampsFilter 3 27.547 org.apache.hadoop.hbase.coprocessor.TestAggregateProtocol 44 16.834 org.apache.hadoop.hbase.coprocessor.TestClassLoading 5 31.346 org.apache.hadoop.hbase.coprocessor.TestCoprocessorEndpoint 2 32.736 org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithAbort 1 13.874 org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithRemove 1 16.923 org.apache.hadoop.hbase.coprocessor.TestMasterObserver3 29.97 org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass 2 14.976 org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface 5 33.353 org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort 1 16.596 org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithRemove 1 18.183 org.apache.hadoop.hbase.coprocessor.TestWALObserver 3 19.373 org.apache.hadoop.hbase.filter.TestColumnRangeFilter 1 
19.045 org.apache.hadoop.hbase.io.hfile.slab.TestSingleSizeCache 5 24.294 org.apache.hadoop.hbase.io.hfile.slab.TestSlabCache 7 19.818 org.apache.hadoop.hbase.io.hfile.TestHFileBlock 7 25.226 org.apache.hadoop.hbase.io.hfile.TestLruBlockCache7 0.343 org.apache.hadoop.hbase.mapreduce.TestImportTsv 8 40.391 org.apache.hadoop.hbase.master.TestActiveMasterManager2 0.724 org.apache.hadoop.hbase.master.TestHMasterRPCException1 1.17 org.apache.hadoop.hbase.master.TestLogsCleaner1 2.953 org.apache.hadoop.hbase.master.TestMaster 1 18.918 org.apache.hadoop.hbase.master.TestOpenedRegionHandler
[jira] [Updated] (HBASE-4753) org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws NPE on trunk
[ https://issues.apache.org/jira/browse/HBASE-4753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-4753:
Resolution: Fixed
Fix Version/s: 0.92.0
Hadoop Flags: Reviewed
Status: Resolved (was: Patch Available)

Applied branch and to 0.92. Thanks for the patch N.

org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws NPE on trunk
Key: HBASE-4753
URL: https://issues.apache.org/jira/browse/HBASE-4753
Project: HBase
Issue Type: Bug
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Fix For: 0.92.0
Attachments: 4753_FSTableDescriptors.patch

testGetSetOfHTD(org.apache.hadoop.hbase.regionserver.TestHRegionInfo) Time elapsed: 0.011 sec ERROR!
java.lang.NullPointerException
at org.apache.hadoop.hbase.util.FSTableDescriptors.deleteTableDescriptorIfExists(FSTableDescriptors.java:433)
at org.apache.hadoop.hbase.regionserver.TestHRegionInfo.testGetSetOfHTD(TestHRegionInfo.java:72)

The cause: 'getTableInfoPath' can return null, and the caller does not check for that.

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
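The failure mode reported above (a lookup that may return null, followed by an unguarded use) can be illustrated with a minimal sketch. This is not the attached patch and not the real FSTableDescriptors code: the class TableDescriptorStore, its map-backed storage, and the register helper are hypothetical stand-ins; only the two method names are borrowed from the stack trace and the report.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a table-descriptor store; NOT the real FSTableDescriptors. */
public class TableDescriptorStore {
    // table name -> descriptor path; a missing entry models "no descriptor file found"
    private final Map<String, String> infoPaths = new HashMap<>();

    /** Hypothetical helper used only to set up the example. */
    public void register(String table, String path) {
        infoPaths.put(table, path);
    }

    /** May return null when the table has no descriptor, like the lookup named in the report. */
    String getTableInfoPath(String table) {
        return infoPaths.get(table);
    }

    /** Returns true if a descriptor was removed, false if there was nothing to delete. */
    public boolean deleteTableDescriptorIfExists(String table) {
        String path = getTableInfoPath(table);
        if (path == null) {
            // Guard: without this check, any use of 'path' below would throw an NPE.
            return false;
        }
        infoPaths.remove(table);
        return true;
    }

    public static void main(String[] args) {
        TableDescriptorStore store = new TableDescriptorStore();
        store.register("t1", "/tmp/t1/descriptor");
        System.out.println(store.deleteTableDescriptorIfExists("t1"));
        System.out.println(store.deleteTableDescriptorIfExists("missing"));
    }
}
```

Returning false for the missing case keeps the "IfExists" contract: asking to delete a descriptor that is not there is treated as a no-op rather than an error.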
[jira] [Commented] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
[ https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145145#comment-13145145 ] Hadoop QA commented on HBASE-4737: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12502689/4737_all.patch against trunk revision . -1 @author. The patch appears to contain 1 @author tags which the Hadoop community has agreed to not allow in code contributions. +1 tests included. The patch appears to include 376 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -164 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 48 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandler org.apache.hadoop.hbase.master.TestDistributedLogSplitting org.apache.hadoop.hbase.master.TestMasterFailover Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/198//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/198//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/198//console This message is automatically generated. 
Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
Key: HBASE-4737
URL: https://issues.apache.org/jira/browse/HBASE-4737
Project: HBase
Issue Type: Improvement
Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
Attachments: 2003_4737_pom.dummy.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.v2.patch, 2003_4737_pom.v2.patch, 4737_all.patch, hbasetests.sh, hbasetests.sh

1) Split the tests into 3 categories:
- small: no cluster, less than 15s, can be run in parallel with other tests in a JVM
- medium: 45s, not flaky, useful to detect bugs immediately
- large: the remaining tests
2) Allow running a subset: developers should need to run only small and medium before submitting a patch
- will need a surefire patch, see http://jira.codehaus.org/browse/SUREFIRE-329

Small is the default. All other tests will have to be marked Medium or Large with a JUnit category.

Proposed split:
Small: 122 classes, 479 methods, ~3 minutes (when not run in parallel)
Medium: 78 classes, 373 methods, ~23 minutes
Large: 34 classes, 221 methods, ~60 minutes
I will have to extract the methods that are today in large or medium but could be in small (typically io.hfile.TestHFileBlock#testBlockHeapSize); that will be done in a second step (and another JIRA).
MEDIUM LIST (name; number of methods; time in seconds)
org.apache.hadoop.hbase.avro.TestAvroServer 3 31.468
org.apache.hadoop.hbase.catalog.TestCatalogTracker 8 4.174
org.apache.hadoop.hbase.catalog.TestMetaReaderEditorNoCluster 1 3.888
org.apache.hadoop.hbase.catalog.TestMetaReaderEditor 5 30.157
org.apache.hadoop.hbase.client.replication.TestReplicationAdmin 1 0.762
org.apache.hadoop.hbase.client.TestHCM 3 21.961
org.apache.hadoop.hbase.client.TestHTablePool 18 26.274
org.apache.hadoop.hbase.client.TestHTableUtil 2 16.997
org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD 3 24.629
org.apache.hadoop.hbase.client.TestMetaScanner 1 16.365
org.apache.hadoop.hbase.client.TestMultiParallel 10 34.077
org.apache.hadoop.hbase.client.TestTimestampsFilter 3 27.547
org.apache.hadoop.hbase.coprocessor.TestAggregateProtocol 44 16.834
org.apache.hadoop.hbase.coprocessor.TestClassLoading 5 31.346
org.apache.hadoop.hbase.coprocessor.TestCoprocessorEndpoint 2 32.736
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithAbort 1 13.874
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithRemove 1 16.923
org.apache.hadoop.hbase.coprocessor.TestMasterObserver 3 29.97
org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass 2 14.976
org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface 5 33.353
org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort 1 16.596
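The size rules from the HBASE-4737 description can be sketched as a small classifier. This is only an illustration under stated assumptions: it reads "medium: 45s" as an upper bound of 45 seconds, and the class name, enum, and parameters are hypothetical. It does not model every reason a test might land in a given list, and it is not the JUnit category mechanism the patch uses.

```java
/** Illustrative only: encodes the small/medium/large rules as described in the issue text. */
public class TestSizeClassifier {
    enum Size { SMALL, MEDIUM, LARGE }

    /**
     * @param startsCluster whether the test starts a mini cluster
     * @param flaky         whether the test is known to fail intermittently
     * @param seconds       measured run time of the test class, in seconds
     */
    static Size classify(boolean startsCluster, boolean flaky, double seconds) {
        // small: no cluster, less than 15s
        if (!startsCluster && seconds < 15.0) {
            return Size.SMALL;
        }
        // medium: not flaky; 45s is ASSUMED here to be an upper bound
        if (!flaky && seconds < 45.0) {
            return Size.MEDIUM;
        }
        // large: everything else
        return Size.LARGE;
    }

    public static void main(String[] args) {
        // A cluster-based test with a run time like TestAvroServer's in the list (31.468 s).
        System.out.println(classify(true, false, 31.468));
        // A fast test that needs no cluster.
        System.out.println(classify(false, false, 0.5));
        // A slow cluster-based test.
        System.out.println(classify(true, false, 60.0));
    }
}
```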
[jira] [Updated] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
[ https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-4746: --- Attachment: D279.3.patch mbautin updated the revision [jira] [HBASE-4746] Use a random ZK client port in unit tests so we can run them in parallel. Reviewers: tedyu, JIRA Making follow-up changes after addressing Ted's and Miguel's comments from D255. REVISION DETAIL https://reviews.facebook.net/D279 AFFECTED FILES src/main/java/org/apache/hadoop/hbase/HConstants.java src/main/java/org/apache/hadoop/hbase/client/HTable.java src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java src/main/java/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.java src/main/java/org/apache/hadoop/hbase/master/HMasterCommandLine.java src/main/java/org/apache/hadoop/hbase/thrift2/ThriftHBaseServiceHandler.java src/main/java/org/apache/hadoop/hbase/thrift2/ThriftServer.java src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java src/main/java/org/apache/hadoop/hbase/util/RegionSplitter.java src/main/java/org/apache/hadoop/hbase/zookeeper/MiniZooKeeperCluster.java src/main/java/org/apache/hadoop/hbase/zookeeper/ZKConfig.java src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java src/test/java/org/apache/hadoop/hbase/PerformanceEvaluation.java src/test/java/org/apache/hadoop/hbase/TestHBaseTestingUtility.java src/test/java/org/apache/hadoop/hbase/TestZooKeeper.java src/test/java/org/apache/hadoop/hbase/replication/TestMasterReplication.java src/test/java/org/apache/hadoop/hbase/replication/TestMultiSlaveReplication.java src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java src/test/java/org/apache/hadoop/hbase/thrift2/TestThriftHBaseServiceHandler.java src/test/java/org/apache/hadoop/hbase/util/TestMergeTool.java 
src/test/java/org/apache/hadoop/hbase/util/TestRegionSplitter.java src/test/java/org/apache/hadoop/hbase/zookeeper/TestHQuorumPeer.java src/test/java/org/apache/hadoop/hbase/zookeeper/TestZooKeeperMainServerArg.java Use a random ZK client port in unit tests so we can run them in parallel Key: HBASE-4746 URL: https://issues.apache.org/jira/browse/HBASE-4746 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: 4746-trunk-v2.txt, D255.1.patch, D279.1.patch, D279.2.patch, D279.3.patch The hard-coded ZK client port has long been a problem for running HBase test suite in parallel. The mini ZK cluster should run on a random free port, and that port should be passed to all parts of the unit tests that need to talk to the mini cluster. In fact, randomizing the port exposes a lot of places in the code where a new configuration is instantiated, and as a result the client tries to talk to the default ZK client port and times out. The initial fix is for 0.89-fb, where it already allows to run unit tests in parallel in 10 minutes. A fix for the trunk will follow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
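The two ideas in the description above (pick a random free port, and make sure every client reads that port instead of falling back to a default) can be sketched with the JDK alone. This is a minimal illustration, not the code in D255 or D279: the class RandomPortSketch and the key "zk.client.port" are hypothetical, and 2181 is used only because it is ZooKeeper's conventional default client port.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.Properties;

/** Illustrative sketch; names and the config key are hypothetical, not HBase APIs. */
public class RandomPortSketch {
    static final String CLIENT_PORT_KEY = "zk.client.port"; // hypothetical key
    static final int DEFAULT_CLIENT_PORT = 2181;            // conventional ZooKeeper default

    /** Binds to port 0 so the OS picks a free ephemeral port, then releases it. */
    static int findFreePort() throws IOException {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        }
    }

    /** Reads the client port; a configuration without the key falls back to the default. */
    static int clientPort(Properties conf) {
        String value = conf.getProperty(CLIENT_PORT_KEY, String.valueOf(DEFAULT_CLIENT_PORT));
        return Integer.parseInt(value);
    }

    public static void main(String[] args) throws IOException {
        int port = findFreePort();

        // Shared configuration: carries the randomly chosen port.
        Properties shared = new Properties();
        shared.setProperty(CLIENT_PORT_KEY, String.valueOf(port));

        // Freshly instantiated configuration: silently falls back to the default port,
        // which is the kind of mistake the description says randomization exposes.
        Properties fresh = new Properties();

        System.out.println("shared conf port: " + clientPort(shared));
        System.out.println("fresh conf port:  " + clientPort(fresh));
    }
}
```

One caveat with this approach: the port is released before the server binds to it, so another process could claim it in between. A more robust variant binds the server socket directly and reads the port back afterwards.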
[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
[ https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145155#comment-13145155 ] Hadoop QA commented on HBASE-4746: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12502699/D279.3.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 39 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/200//console This message is automatically generated. Use a random ZK client port in unit tests so we can run them in parallel Key: HBASE-4746 URL: https://issues.apache.org/jira/browse/HBASE-4746 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, D279.1.patch, D279.2.patch, D279.3.patch The hard-coded ZK client port has long been a problem for running HBase test suite in parallel. The mini ZK cluster should run on a random free port, and that port should be passed to all parts of the unit tests that need to talk to the mini cluster. In fact, randomizing the port exposes a lot of places in the code where a new configuration is instantiated, and as a result the client tries to talk to the default ZK client port and times out. The initial fix is for 0.89-fb, where it already allows to run unit tests in parallel in 10 minutes. A fix for the trunk will follow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
[ https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-4746: --- Attachment: D255.2.patch mbautin updated the revision [jira] [HBASE-4746] [89-fb] Use a random ZK client port in unit tests so we can run them in parallel. Reviewers: Kannan, Karthik, nspiegelberg, JIRA, stack Making this diff more consistent with D279, and addressing Ted's and Miguel's comments. The only unit tests that fail are TestReplication, TestFullLogReconstruction, and TestZooKeeper, but these failures have existed for a while and Liyin is working on a fix for those. The time to run the test suite is consistently under 10 minutes. REVISION DETAIL https://reviews.facebook.net/D255 AFFECTED FILES src/main/java/org/apache/hadoop/hbase/HConstants.java src/main/java/org/apache/hadoop/hbase/MiniZooKeeperCluster.java src/main/java/org/apache/hadoop/hbase/avro/AvroServer.java src/main/java/org/apache/hadoop/hbase/client/HTable.java src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java src/main/java/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.java src/main/java/org/apache/hadoop/hbase/master/HMaster.java src/main/java/org/apache/hadoop/hbase/master/ThrottledRegionReopener.java src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java src/main/java/org/apache/hadoop/hbase/replication/ReplicationZookeeperWrapper.java src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java src/main/java/org/apache/hadoop/hbase/zookeeper/HQuorumPeer.java src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java src/main/ruby/hbase/admin.rb src/test/java/org/apache/hadoop/hbase/HBaseClusterTestCase.java src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java src/test/java/org/apache/hadoop/hbase/PerformanceEvaluation.java 
src/test/java/org/apache/hadoop/hbase/TestFullLogReconstruction.java src/test/java/org/apache/hadoop/hbase/TestMultiParallelPut.java src/test/java/org/apache/hadoop/hbase/TestZooKeeper.java src/test/java/org/apache/hadoop/hbase/avro/TestAvroServer.java src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java src/test/java/org/apache/hadoop/hbase/loadtest/RegionSplitter.java src/test/java/org/apache/hadoop/hbase/mapred/TestLegacyTableMapReduce.java src/test/java/org/apache/hadoop/hbase/mapred/TestTableMapReduce.java src/test/java/org/apache/hadoop/hbase/mapreduce/TestLoadIncrementalHFiles.java src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java src/test/java/org/apache/hadoop/hbase/master/TestMaster.java src/test/java/org/apache/hadoop/hbase/master/TestRegionManager.java src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionClose.java src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSink.java src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java src/test/java/org/apache/hadoop/hbase/rest/HBaseRESTClusterTestBase.java src/test/java/org/apache/hadoop/hbase/rest/client/TestRemoteTable.java src/test/java/org/apache/hadoop/hbase/thrift/TestThriftServer.java src/test/java/org/apache/hadoop/hbase/zookeeper/TestHQuorumPeer.java src/test/resources/hbase-site.xml src/test/ruby/shell/shell_test.rb Use a random ZK client port in unit tests so we can run them in parallel Key: HBASE-4746 URL: https://issues.apache.org/jira/browse/HBASE-4746 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, D279.1.patch, D279.2.patch, D279.3.patch The hard-coded ZK client port has long been a problem for running HBase test suite in parallel. 
The mini ZK cluster should run on a random free port, and that port should be passed to all parts of the unit tests that need to talk to the mini cluster. In fact, randomizing the port exposes a lot of places in the code where a new configuration is instantiated, and as a result the client tries to talk to the default ZK client port and times out. The initial fix is for 0.89-fb, where it already allows running the unit tests in parallel in 10 minutes. A fix for the trunk will follow.
[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
[ https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145157#comment-13145157 ] Phabricator commented on HBASE-4746: mbautin has commented on the revision [jira] [HBASE-4746] [89-fb] Use a random ZK client port in unit tests so we can run them in parallel. INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java:200 Removed this constructor. REVISION DETAIL https://reviews.facebook.net/D255 Use a random ZK client port in unit tests so we can run them in parallel Key: HBASE-4746 URL: https://issues.apache.org/jira/browse/HBASE-4746 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, D279.1.patch, D279.2.patch, D279.3.patch The hard-coded ZK client port has long been a problem for running HBase test suite in parallel. The mini ZK cluster should run on a random free port, and that port should be passed to all parts of the unit tests that need to talk to the mini cluster. In fact, randomizing the port exposes a lot of places in the code where a new configuration is instantiated, and as a result the client tries to talk to the default ZK client port and times out. The initial fix is for 0.89-fb, where it already allows to run unit tests in parallel in 10 minutes. A fix for the trunk will follow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
[ https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145160#comment-13145160 ]

Phabricator commented on HBASE-4746:

mbautin has commented on the revision [jira] [HBASE-4746] Use a random ZK client port in unit tests so we can run them in parallel.

An update on unit test runs:
Run 1: 239 test classes run, 1092 tests, 16 min 13 sec. TestRegionServerCoprocessorExceptionWithAbort failed but succeeded when rerun locally.
Run 2: 239 test classes run, 1096 tests, 9 min 39 sec, no failures.
I will look into the discrepancy between the number of tests in these two cases, but I think this is because the MapReduce framework can run the same task multiple times (speculative execution). So I guess we are good to go with this patch.

REVISION DETAIL
https://reviews.facebook.net/D279

Use a random ZK client port in unit tests so we can run them in parallel
Key: HBASE-4746
URL: https://issues.apache.org/jira/browse/HBASE-4746
Project: HBase
Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, D279.1.patch, D279.2.patch, D279.3.patch

The hard-coded ZK client port has long been a problem for running HBase test suite in parallel. The mini ZK cluster should run on a random free port, and that port should be passed to all parts of the unit tests that need to talk to the mini cluster. In fact, randomizing the port exposes a lot of places in the code where a new configuration is instantiated, and as a result the client tries to talk to the default ZK client port and times out. The initial fix is for 0.89-fb, where it already allows running the unit tests in parallel in 10 minutes. A fix for the trunk will follow.
[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
[ https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145161#comment-13145161 ] Hadoop QA commented on HBASE-4746: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12502700/D255.2.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 77 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/201//console This message is automatically generated. Use a random ZK client port in unit tests so we can run them in parallel Key: HBASE-4746 URL: https://issues.apache.org/jira/browse/HBASE-4746 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, D279.1.patch, D279.2.patch, D279.3.patch The hard-coded ZK client port has long been a problem for running HBase test suite in parallel. The mini ZK cluster should run on a random free port, and that port should be passed to all parts of the unit tests that need to talk to the mini cluster. In fact, randomizing the port exposes a lot of places in the code where a new configuration is instantiated, and as a result the client tries to talk to the default ZK client port and times out. The initial fix is for 0.89-fb, where it already allows to run unit tests in parallel in 10 minutes. A fix for the trunk will follow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
[ https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145162#comment-13145162 ] Phabricator commented on HBASE-4746: mbautin has commented on the revision [jira] [HBASE-4746] Use a random ZK client port in unit tests so we can run them in parallel. Actually, I just found a potential bug that affects command-line invocations of HMaster -- will upload a new diff shortly. Sorry for so many emails. REVISION DETAIL https://reviews.facebook.net/D279 Use a random ZK client port in unit tests so we can run them in parallel Key: HBASE-4746 URL: https://issues.apache.org/jira/browse/HBASE-4746 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, D279.1.patch, D279.2.patch, D279.3.patch The hard-coded ZK client port has long been a problem for running HBase test suite in parallel. The mini ZK cluster should run on a random free port, and that port should be passed to all parts of the unit tests that need to talk to the mini cluster. In fact, randomizing the port exposes a lot of places in the code where a new configuration is instantiated, and as a result the client tries to talk to the default ZK client port and times out. The initial fix is for 0.89-fb, where it already allows to run unit tests in parallel in 10 minutes. A fix for the trunk will follow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3939) Some crossports of Hadoop IPC fixes
[ https://issues.apache.org/jira/browse/HBASE-3939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145167#comment-13145167 ]

Hudson commented on HBASE-3939:

Integrated in HBase-0.92 #115 (See [https://builds.apache.org/job/HBase-0.92/115/])
HBASE-3939 Some crossports of Hadoop IPC fixes
HBASE-3939 Some crossports of Hadoop IPC fixes

stack :
Files :
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ipc/ProtocolSignature.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ipc/Status.java

stack :
Files :
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/coprocessor/BaseEndpointCoprocessor.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ipc/CoprocessorProtocol.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ipc/Invocation.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ipc/VersionedProtocol.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ipc/WritableRpcEngine.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/ipc/TestDelayedRpc.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/regionserver/TestServerCustomProtocol.java

Some crossports of Hadoop IPC fixes
Key: HBASE-3939
URL: https://issues.apache.org/jira/browse/HBASE-3939
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Priority: Critical
Fix For: 0.92.0
Attachments: 3939-v2.txt, 3939-v3.txt, 3939-v4.txt, 3939-v5.txt, 3939-v6.txt, 3939-v7.txt, 3939-v8.txt, 3939-v9.txt, 3939.txt

A few fixes from Hadoop IPC that we should probably cross-port into our copy:
- HADOOP-7227: remove the protocol version check at call time
- HADOOP-7146: fix a socket leak in server
- HADOOP-7121: fix behavior when response serialization throws an exception
- HADOOP-7346: send back nicer error response when client is using an out of date IPC version
[jira] [Updated] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
[ https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-4746: --- Attachment: D279.4.patch mbautin updated the revision [jira] [HBASE-4746] Use a random ZK client port in unit tests so we can run them in parallel. Reviewers: tedyu, JIRA Addressing the problems with running HMaster from the command line. Also fixing intermittent issues with TestRegionServerCoprocessorExceptionWithAbort and TestRegionServerCoprocessorExceptionWithRemove. 1096 tests in 239 classes succeeded. REVISION DETAIL https://reviews.facebook.net/D279 AFFECTED FILES src/main/java/org/apache/hadoop/hbase/HConstants.java src/main/java/org/apache/hadoop/hbase/client/HTable.java src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java src/main/java/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.java src/main/java/org/apache/hadoop/hbase/master/HMasterCommandLine.java src/main/java/org/apache/hadoop/hbase/thrift2/ThriftHBaseServiceHandler.java src/main/java/org/apache/hadoop/hbase/thrift2/ThriftServer.java src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java src/main/java/org/apache/hadoop/hbase/util/RegionSplitter.java src/main/java/org/apache/hadoop/hbase/zookeeper/MiniZooKeeperCluster.java src/main/java/org/apache/hadoop/hbase/zookeeper/ZKConfig.java src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java src/test/java/org/apache/hadoop/hbase/PerformanceEvaluation.java src/test/java/org/apache/hadoop/hbase/TestHBaseTestingUtility.java src/test/java/org/apache/hadoop/hbase/TestZooKeeper.java src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorExceptionWithAbort.java src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorExceptionWithRemove.java src/test/java/org/apache/hadoop/hbase/replication/TestMasterReplication.java 
src/test/java/org/apache/hadoop/hbase/replication/TestMultiSlaveReplication.java src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java src/test/java/org/apache/hadoop/hbase/thrift2/TestThriftHBaseServiceHandler.java src/test/java/org/apache/hadoop/hbase/util/TestMergeTool.java src/test/java/org/apache/hadoop/hbase/util/TestRegionSplitter.java src/test/java/org/apache/hadoop/hbase/zookeeper/TestHQuorumPeer.java src/test/java/org/apache/hadoop/hbase/zookeeper/TestZooKeeperMainServerArg.java Use a random ZK client port in unit tests so we can run them in parallel Key: HBASE-4746 URL: https://issues.apache.org/jira/browse/HBASE-4746 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, D279.1.patch, D279.2.patch, D279.3.patch, D279.4.patch The hard-coded ZK client port has long been a problem for running HBase test suite in parallel. The mini ZK cluster should run on a random free port, and that port should be passed to all parts of the unit tests that need to talk to the mini cluster. In fact, randomizing the port exposes a lot of places in the code where a new configuration is instantiated, and as a result the client tries to talk to the default ZK client port and times out. The initial fix is for 0.89-fb, where it already allows to run unit tests in parallel in 10 minutes. A fix for the trunk will follow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
[ https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145177#comment-13145177 ] Hadoop QA commented on HBASE-4746: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12502707/D279.4.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 45 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/202//console This message is automatically generated. Use a random ZK client port in unit tests so we can run them in parallel Key: HBASE-4746 URL: https://issues.apache.org/jira/browse/HBASE-4746 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, D279.1.patch, D279.2.patch, D279.3.patch, D279.4.patch The hard-coded ZK client port has long been a problem for running HBase test suite in parallel. The mini ZK cluster should run on a random free port, and that port should be passed to all parts of the unit tests that need to talk to the mini cluster. In fact, randomizing the port exposes a lot of places in the code where a new configuration is instantiated, and as a result the client tries to talk to the default ZK client port and times out. The initial fix is for 0.89-fb, where it already allows to run unit tests in parallel in 10 minutes. A fix for the trunk will follow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
[ https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-4746: --- Attachment: D279.5.patch mbautin updated the revision [jira] [HBASE-4746] Use a random ZK client port in unit tests so we can run them in parallel. Reviewers: tedyu, JIRA Follow-up cleanup to TestRegionServerCoprocessorExceptionWithRemove. REVISION DETAIL https://reviews.facebook.net/D279 AFFECTED FILES src/main/java/org/apache/hadoop/hbase/HConstants.java src/main/java/org/apache/hadoop/hbase/client/HTable.java src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java src/main/java/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.java src/main/java/org/apache/hadoop/hbase/master/HMasterCommandLine.java src/main/java/org/apache/hadoop/hbase/thrift2/ThriftHBaseServiceHandler.java src/main/java/org/apache/hadoop/hbase/thrift2/ThriftServer.java src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java src/main/java/org/apache/hadoop/hbase/util/RegionSplitter.java src/main/java/org/apache/hadoop/hbase/zookeeper/MiniZooKeeperCluster.java src/main/java/org/apache/hadoop/hbase/zookeeper/ZKConfig.java src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java src/test/java/org/apache/hadoop/hbase/PerformanceEvaluation.java src/test/java/org/apache/hadoop/hbase/TestHBaseTestingUtility.java src/test/java/org/apache/hadoop/hbase/TestZooKeeper.java src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorExceptionWithAbort.java src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorExceptionWithRemove.java src/test/java/org/apache/hadoop/hbase/replication/TestMasterReplication.java src/test/java/org/apache/hadoop/hbase/replication/TestMultiSlaveReplication.java src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java 
src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java src/test/java/org/apache/hadoop/hbase/thrift2/TestThriftHBaseServiceHandler.java src/test/java/org/apache/hadoop/hbase/util/TestMergeTool.java src/test/java/org/apache/hadoop/hbase/util/TestRegionSplitter.java src/test/java/org/apache/hadoop/hbase/zookeeper/TestHQuorumPeer.java src/test/java/org/apache/hadoop/hbase/zookeeper/TestZooKeeperMainServerArg.java Use a random ZK client port in unit tests so we can run them in parallel Key: HBASE-4746 URL: https://issues.apache.org/jira/browse/HBASE-4746 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, D279.1.patch, D279.2.patch, D279.3.patch, D279.4.patch, D279.5.patch The hard-coded ZK client port has long been a problem for running HBase test suite in parallel. The mini ZK cluster should run on a random free port, and that port should be passed to all parts of the unit tests that need to talk to the mini cluster. In fact, randomizing the port exposes a lot of places in the code where a new configuration is instantiated, and as a result the client tries to talk to the default ZK client port and times out. The initial fix is for 0.89-fb, where it already allows to run unit tests in parallel in 10 minutes. A fix for the trunk will follow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
[ https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145180#comment-13145180 ] Hadoop QA commented on HBASE-4746: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12502708/D279.5.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 45 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/203//console This message is automatically generated. Use a random ZK client port in unit tests so we can run them in parallel Key: HBASE-4746 URL: https://issues.apache.org/jira/browse/HBASE-4746 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, D279.1.patch, D279.2.patch, D279.3.patch, D279.4.patch, D279.5.patch The hard-coded ZK client port has long been a problem for running HBase test suite in parallel. The mini ZK cluster should run on a random free port, and that port should be passed to all parts of the unit tests that need to talk to the mini cluster. In fact, randomizing the port exposes a lot of places in the code where a new configuration is instantiated, and as a result the client tries to talk to the default ZK client port and times out. The initial fix is for 0.89-fb, where it already allows to run unit tests in parallel in 10 minutes. A fix for the trunk will follow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
[ https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4746: -- Status: Open (was: Patch Available) Use a random ZK client port in unit tests so we can run them in parallel Key: HBASE-4746 URL: https://issues.apache.org/jira/browse/HBASE-4746 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, D279-trunk-v5.txt, D279.1.patch, D279.2.patch, D279.3.patch, D279.4.patch, D279.5.patch The hard-coded ZK client port has long been a problem for running HBase test suite in parallel. The mini ZK cluster should run on a random free port, and that port should be passed to all parts of the unit tests that need to talk to the mini cluster. In fact, randomizing the port exposes a lot of places in the code where a new configuration is instantiated, and as a result the client tries to talk to the default ZK client port and times out. The initial fix is for 0.89-fb, where it already allows to run unit tests in parallel in 10 minutes. A fix for the trunk will follow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
[ https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4746: -- Status: Patch Available (was: Open) Patch testing v5. Use a random ZK client port in unit tests so we can run them in parallel Key: HBASE-4746 URL: https://issues.apache.org/jira/browse/HBASE-4746 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, D279-trunk-v5.txt, D279.1.patch, D279.2.patch, D279.3.patch, D279.4.patch, D279.5.patch The hard-coded ZK client port has long been a problem for running HBase test suite in parallel. The mini ZK cluster should run on a random free port, and that port should be passed to all parts of the unit tests that need to talk to the mini cluster. In fact, randomizing the port exposes a lot of places in the code where a new configuration is instantiated, and as a result the client tries to talk to the default ZK client port and times out. The initial fix is for 0.89-fb, where it already allows to run unit tests in parallel in 10 minutes. A fix for the trunk will follow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
[ https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4746: -- Attachment: D279-trunk-v5.txt Changes to FSTableDescriptors.java are already in TRUNK. Use a random ZK client port in unit tests so we can run them in parallel Key: HBASE-4746 URL: https://issues.apache.org/jira/browse/HBASE-4746 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, D279-trunk-v5.txt, D279.1.patch, D279.2.patch, D279.3.patch, D279.4.patch, D279.5.patch The hard-coded ZK client port has long been a problem for running HBase test suite in parallel. The mini ZK cluster should run on a random free port, and that port should be passed to all parts of the unit tests that need to talk to the mini cluster. In fact, randomizing the port exposes a lot of places in the code where a new configuration is instantiated, and as a result the client tries to talk to the default ZK client port and times out. The initial fix is for 0.89-fb, where it already allows to run unit tests in parallel in 10 minutes. A fix for the trunk will follow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4740) [bulk load] the HBASE-4552 API can't tell if errors on region server are recoverable
[ https://issues.apache.org/jira/browse/HBASE-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4740: -- Summary: [bulk load] the HBASE-4552 API can't tell if errors on region server are recoverable (was: [bulk load] the HBASE-4552 API can't tell if errors on region server is recoverable or unrecoverable error.) [bulk load] the HBASE-4552 API can't tell if errors on region server are recoverable - Key: HBASE-4740 URL: https://issues.apache.org/jira/browse/HBASE-4740 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Priority: Blocker Fix For: 0.92.0 Attachments: 0001-HBASE-4740-bulkload-HBASE-4552-API-can-t-tell-if-err.patch, 4740.v3, hbase-4740.v2.patch Running TestHFileOutputFormat more frequently seems to show that it has become flaky. It is difficult to tell whether this is because of an unrecoverable failure or a recoverable failure. To make this visible to tests and to users, we need to change the bulkload call's interface on HRegionServer. The change should make successful RPCs return true, recoverable failures return false, and unrecoverable failures throw an IOException. This is an RPC change, so it would be really good to get this API right before the final 0.92 goes out.
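The three-way contract described above (true for success, false for a recoverable failure, IOException for an unrecoverable one) could be consumed by a caller roughly as follows. This is only a sketch: BulkLoader and loadWithRetries are hypothetical names for illustration and are not the HRegionServer interface.

```java
import java.io.IOException;

public class BulkLoadContractSketch {
    /** Hypothetical stand-in for the region server bulk load RPC. */
    public interface BulkLoader {
        /** Returns true on success and false on a recoverable failure. */
        boolean bulkLoad(String hfilePath) throws IOException;
    }

    /**
     * Retries while the call reports a recoverable failure (false).
     * An IOException means unrecoverable, so it propagates immediately.
     */
    public static boolean loadWithRetries(BulkLoader loader, String hfilePath, int maxAttempts)
            throws IOException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (loader.bulkLoad(hfilePath)) {
                return true;
            }
        }
        return false; // still failing recoverably after maxAttempts tries
    }

    public static void main(String[] args) throws IOException {
        final int[] calls = {0};
        // Simulated server: fails recoverably twice, then succeeds.
        boolean loaded = loadWithRetries(path -> ++calls[0] >= 3, "/tmp/example-hfile", 5);
        System.out.println("loaded=" + loaded + " after " + calls[0] + " attempts");
    }
}
```

With the return value and the exception kept distinct, a test can tell a retryable condition apart from a hard failure instead of treating both as one opaque error.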
[jira] [Commented] (HBASE-4753) org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws NPE on trunk
[ https://issues.apache.org/jira/browse/HBASE-4753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145188#comment-13145188 ] Hudson commented on HBASE-4753: --- Integrated in HBase-TRUNK #2416 (See [https://builds.apache.org/job/HBase-TRUNK/2416/]) HBASE-4753 org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws NPE on trunk stack : Files : * /hbase/trunk/CHANGES.txt * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws NPE on trunk Key: HBASE-4753 URL: https://issues.apache.org/jira/browse/HBASE-4753 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: nkeywal Assignee: nkeywal Fix For: 0.92.0 Attachments: 4753_FSTableDescriptors.patch testGetSetOfHTD(org.apache.hadoop.hbase.regionserver.TestHRegionInfo) Time elapsed: 0.011 sec ERROR! java.lang.NullPointerException at org.apache.hadoop.hbase.util.FSTableDescriptors.deleteTableDescriptorIfExists(FSTableDescriptors.java:433) at org.apache.hadoop.hbase.regionserver.TestHRegionInfo.testGetSetOfHTD(TestHRegionInfo.java:72) Because the 'getTableInfoPath' can return null and it's not tested. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
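The failure mode in the stack trace, a lookup that may return null being used without a check, can be illustrated with plain java.io.File. This is a sketch under assumptions, not the committed fix: findTableInfo is a hypothetical analogue of getTableInfoPath, and the ".tableinfo" prefix is used here only for illustration.

```java
import java.io.File;

public class NullSafeDeleteSketch {
    /** Hypothetical analogue of getTableInfoPath: null when no descriptor file is found. */
    public static File findTableInfo(File tableDir) {
        File[] matches = tableDir.listFiles((dir, name) -> name.startsWith(".tableinfo"));
        if (matches == null || matches.length == 0) {
            return null; // directory missing or no descriptor present
        }
        return matches[0];
    }

    /** Deletes the descriptor if one exists; returns false instead of throwing an NPE. */
    public static boolean deleteTableInfoIfExists(File tableDir) {
        File info = findTableInfo(tableDir);
        if (info == null) {
            return false; // the null check whose absence the issue reports
        }
        return info.delete();
    }

    public static void main(String[] args) {
        File missing = new File(System.getProperty("java.io.tmpdir"), "no-such-table-" + System.nanoTime());
        System.out.println("deleted=" + deleteTableInfoIfExists(missing));
    }
}
```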
[jira] [Commented] (HBASE-3939) Some crossports of Hadoop IPC fixes
[ https://issues.apache.org/jira/browse/HBASE-3939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145189#comment-13145189 ] Hudson commented on HBASE-3939: --- Integrated in HBase-TRUNK #2416 (See [https://builds.apache.org/job/HBase-TRUNK/2416/]) HBASE-3939 Some crossports of Hadoop IPC fixes stack : Files : * /hbase/trunk/CHANGES.txt * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/coprocessor/BaseEndpointCoprocessor.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/CoprocessorProtocol.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/Invocation.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/ProtocolSignature.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/Status.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/VersionedProtocol.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/WritableRpcEngine.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * /hbase/trunk/src/test/java/org/apache/hadoop/hbase/ipc/TestDelayedRpc.java * /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestServerCustomProtocol.java Some crossports of Hadoop IPC fixes --- Key: HBASE-3939 URL: https://issues.apache.org/jira/browse/HBASE-3939 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Todd Lipcon Priority: Critical Fix For: 0.92.0 Attachments: 3939-v2.txt, 3939-v3.txt, 3939-v4.txt, 3939-v5.txt, 3939-v6.txt, 3939-v7.txt, 3939-v8.txt, 3939-v9.txt, 3939.txt A few fixes from Hadoop IPC that we should probably 
cross-port into our copy:
- HADOOP-7227: remove the protocol version check at call time
- HADOOP-7146: fix a socket leak in server
- HADOOP-7121: fix behavior when response serialization throws an exception
- HADOOP-7346: send back nicer error response when client is using an out of date IPC version
[jira] [Updated] (HBASE-3792) TableInputFormat leaks ZK connections
[ https://issues.apache.org/jira/browse/HBASE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Keller updated HBASE-3792: Attachment: tableinput.patch Here's a patch demonstrating the changes I have implemented in my system, as described above. The patch is for trunk, so the changes are slightly different from what I am using for 0.90.4. TableInputFormat leaks ZK connections - Key: HBASE-3792 URL: https://issues.apache.org/jira/browse/HBASE-3792 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.90.1 Environment: Java 1.6.0_24, Mac OS X 10.6.7 Reporter: Bryan Keller Attachments: tableinput.patch The TableInputFormat creates an HTable using a new Configuration object, and it never cleans it up. When running a Mapper, the TableInputFormat is instantiated and the ZK connection is created. While this connection is not explicitly cleaned up, the Mapper process eventually exits and thus the connection is closed. Ideally the TableRecordReader would close the connection in its close() method rather than relying on the process to die for connection cleanup. This is fairly easy to implement by overriding TableRecordReader, and also overriding TableInputFormat to specify the new record reader. The leak occurs when the JobClient is initializing and needs to retrieve the splits. To get the splits, it instantiates a TableInputFormat. Doing so creates a ZK connection that is never cleaned up. Unlike the mapper, however, my job client process does not die. Thus the ZK connections accumulate. I was able to fix the problem by writing my own TableInputFormat that does not initialize the HTable in the getConf() method and does not have an HTable member variable. Rather, it has a variable for the table name. The HTable is instantiated where needed and then cleaned up. For example, in the getSplits() method, I create the HTable, then close the connection once the splits are retrieved.
I also create the HTable when creating the record reader, and I have a record reader that closes the connection when done.
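The approach described in the issue (store only the table name, open the table where it is needed, and close it afterwards) can be sketched without HBase on the classpath. TableHandle below is a hypothetical stand-in for HTable, and the OPEN counter exists only to make the absence of a leak observable; neither is part of the attached patch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class ScopedTableSketch {
    /** Counts handles that are currently open, standing in for live ZK connections. */
    public static final AtomicInteger OPEN = new AtomicInteger();

    /** Hypothetical stand-in for HTable: holds a connection until closed. */
    public static class TableHandle implements AutoCloseable {
        private final String name;

        public TableHandle(String name) {
            this.name = name;
            OPEN.incrementAndGet();
        }

        public List<String> regionStartKeys() {
            return Arrays.asList("", "m"); // made-up split points for illustration
        }

        @Override
        public void close() {
            OPEN.decrementAndGet();
        }
    }

    /** Keeps only the table name; never holds a long-lived handle as a member. */
    public static class InputFormat {
        private final String tableName;

        public InputFormat(String tableName) {
            this.tableName = tableName;
        }

        public List<String> getSplits() {
            // Open the table only for the duration of the call, then release it.
            try (TableHandle table = new TableHandle(tableName)) {
                return table.regionStartKeys();
            }
        }
    }

    public static void main(String[] args) {
        List<String> splits = new InputFormat("example_table").getSplits();
        System.out.println(splits.size() + " splits, open handles afterwards: " + OPEN.get());
    }
}
```

Because the handle is scoped to getSplits(), a long-running job client that computes splits repeatedly would not accumulate connections in this sketch.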
[jira] [Commented] (HBASE-3792) TableInputFormat leaks ZK connections
[ https://issues.apache.org/jira/browse/HBASE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145191#comment-13145191 ] Bryan Keller commented on HBASE-3792: - BTW, I am having trouble running the tests on trunk, so I wasn't able to verify this patch. I'll work on getting my dev environment more functional. TableInputFormat leaks ZK connections - Key: HBASE-3792 URL: https://issues.apache.org/jira/browse/HBASE-3792 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.90.1 Environment: Java 1.6.0_24, Mac OS X 10.6.7 Reporter: Bryan Keller Attachments: tableinput.patch The TableInputFormat creates an HTable using a new Configuration object, and it never cleans it up. When running a Mapper, the TableInputFormat is instantiated and the ZK connection is created. While this connection is not explicitly cleaned up, the Mapper process eventually exits and thus the connection is closed. Ideally the TableRecordReader would close the connection in its close() method rather than relying on the process to die for connection cleanup. This is fairly easy to implement by overriding TableRecordReader, and also overriding TableInputFormat to specify the new record reader. The leak occurs when the JobClient is initializing and needs to retrieve the splits. To get the splits, it instantiates a TableInputFormat. Doing so creates a ZK connection that is never cleaned up. Unlike the mapper, however, my job client process does not die. Thus the ZK connections accumulate. I was able to fix the problem by writing my own TableInputFormat that does not initialize the HTable in the getConf() method and does not have an HTable member variable. Rather, it has a variable for the table name. The HTable is instantiated where needed and then cleaned up. For example, in the getSplits() method, I create the HTable, then close the connection once the splits are retrieved.
I also create the HTable when creating the record reader, and I have a record reader that closes the connection when done.
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145192#comment-13145192 ] mingjian commented on HBASE-4377: - @Jonathan If a region is splitting how do we fix it without onlined parent and daughters? [hbck] Offline rebuild .META. from fs data only. Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch, EXT_AC.regioninfo, EXT_ATU_05f84d32cbc0bdabf00e00bc2f3570f0.regioninfo, hbase-4377-trunk.v2.patch, hbase-4377.0.90.v6.patch, hbase-4377.trunk.v3.txt, hbase-4377.trunk.v4.txt, hbase-4377.trunk.v5.txt, hbase-4377.trunk.v6.patch In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. Follow on work could given options to merge or select regions that overlap, or do online rebuilds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
[ https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145195#comment-13145195 ] Hadoop QA commented on HBASE-4746: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12502710/D279-trunk-v5.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 45 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -164 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 51 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/204//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/204//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/204//console This message is automatically generated. Use a random ZK client port in unit tests so we can run them in parallel Key: HBASE-4746 URL: https://issues.apache.org/jira/browse/HBASE-4746 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, D279-trunk-v5.txt, D279.1.patch, D279.2.patch, D279.3.patch, D279.4.patch, D279.5.patch The hard-coded ZK client port has long been a problem for running HBase test suite in parallel. The mini ZK cluster should run on a random free port, and that port should be passed to all parts of the unit tests that need to talk to the mini cluster. 
In fact, randomizing the port exposes a lot of places in the code where a new configuration is instantiated, and as a result the client tries to talk to the default ZK client port and times out. The initial fix is for 0.89-fb, where it already allows running the unit tests in parallel in 10 minutes. A fix for the trunk will follow.
[jira] [Commented] (HBASE-4377) [hbck] Offline rebuild .META. from fs data only.
[ https://issues.apache.org/jira/browse/HBASE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145197#comment-13145197 ] Jonathan Hsieh commented on HBASE-4377: --- @mingjian If there was a split that didn't complete cleanly, a parent region with daughters should look like an overlap. The tool will tell you where these overlaps are. One way to fix the problem is to keep the parent region and then move or remove the daughter regions from hdfs. Since it is in the middle of a split, the parent should have all the data. Alternately, you could copy the store files from the daughters into the dir of the parent and then run the offline rebuilder. I plan on writing a blog post and hopefully adding to the book on how to fix these problems. [hbck] Offline rebuild .META. from fs data only. Key: HBASE-4377 URL: https://issues.apache.org/jira/browse/HBASE-4377 Project: HBase Issue Type: New Feature Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90-v4.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.0.90.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v1.patch, 0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data.0.92.v2.patch, EXT_AC.regioninfo, EXT_ATU_05f84d32cbc0bdabf00e00bc2f3570f0.regioninfo, hbase-4377-trunk.v2.patch, hbase-4377.0.90.v6.patch, hbase-4377.trunk.v3.txt, hbase-4377.trunk.v4.txt, hbase-4377.trunk.v5.txt, hbase-4377.trunk.v6.patch In a worst case situation, it may be helpful to have an offline .META. rebuilder that just looks at the file system's .regioninfos and rebuilds meta from scratch. Users could move bad regions out until there is a clean rebuild. It would likely fill in region split holes. 
Follow-on work could give options to merge or select regions that overlap, or do online rebuilds.
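The comment above notes that a parent region left behind by an incomplete split looks like an overlap with its daughters. A minimal sketch of that kind of key-range check follows. It is not the hbck implementation: the Region class, the string keys, and the pairwise comparison are assumptions made for illustration, with an empty end key meaning "to the end of the table".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class RegionOverlapSketch {
    /** Hypothetical region descriptor covering the key range [start, end). */
    public static final class Region {
        final String name;
        final String start;
        final String end; // empty string means unbounded

        public Region(String name, String start, String end) {
            this.name = name;
            this.start = start;
            this.end = end;
        }
    }

    /** Assumes first.start <= second.start; overlap means second starts before first ends. */
    static boolean overlaps(Region first, Region second) {
        return first.end.isEmpty() || second.start.compareTo(first.end) < 0;
    }

    /** Reports every overlapping pair after sorting the regions by start key. */
    public static List<String> findOverlaps(List<Region> regions) {
        List<Region> sorted = new ArrayList<>(regions);
        sorted.sort(Comparator.comparing((Region r) -> r.start));
        List<String> found = new ArrayList<>();
        for (int i = 0; i < sorted.size(); i++) {
            for (int j = i + 1; j < sorted.size(); j++) {
                if (overlaps(sorted.get(i), sorted.get(j))) {
                    found.add(sorted.get(i).name + " overlaps " + sorted.get(j).name);
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<Region> regions = Arrays.asList(
            new Region("parent", "a", "z"),
            new Region("daughterA", "a", "m"),
            new Region("daughterB", "m", "z"));
        for (String line : findOverlaps(regions)) {
            System.out.println(line);
        }
    }
}
```

In this example the parent overlaps both daughters, while the daughters do not overlap each other, which matches the situation the comment describes: removing either the parent or the daughters leaves a clean, non-overlapping set of regions.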
[jira] [Updated] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
[ https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-4746: --- Attachment: D279.6.patch mbautin updated the revision [jira] [HBASE-4746] Use a random ZK client port in unit tests so we can run them in parallel. Reviewers: tedyu, JIRA Rebasing on recent fixes in the trunk. REVISION DETAIL https://reviews.facebook.net/D279 AFFECTED FILES CHANGES.txt src/main/java/org/apache/hadoop/hbase/HConstants.java src/main/java/org/apache/hadoop/hbase/client/HTable.java src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java src/main/java/org/apache/hadoop/hbase/coprocessor/BaseEndpointCoprocessor.java src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java src/main/java/org/apache/hadoop/hbase/ipc/CoprocessorProtocol.java src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java src/main/java/org/apache/hadoop/hbase/ipc/Invocation.java src/main/java/org/apache/hadoop/hbase/ipc/ProtocolSignature.java src/main/java/org/apache/hadoop/hbase/ipc/Status.java src/main/java/org/apache/hadoop/hbase/ipc/VersionedProtocol.java src/main/java/org/apache/hadoop/hbase/ipc/WritableRpcEngine.java src/main/java/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.java src/main/java/org/apache/hadoop/hbase/master/HMaster.java src/main/java/org/apache/hadoop/hbase/master/HMasterCommandLine.java src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java src/main/java/org/apache/hadoop/hbase/thrift2/ThriftHBaseServiceHandler.java src/main/java/org/apache/hadoop/hbase/thrift2/ThriftServer.java src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java src/main/java/org/apache/hadoop/hbase/util/RegionSplitter.java src/main/java/org/apache/hadoop/hbase/zookeeper/MiniZooKeeperCluster.java 
src/main/java/org/apache/hadoop/hbase/zookeeper/ZKConfig.java src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java src/test/java/org/apache/hadoop/hbase/PerformanceEvaluation.java src/test/java/org/apache/hadoop/hbase/TestHBaseTestingUtility.java src/test/java/org/apache/hadoop/hbase/TestZooKeeper.java src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorExceptionWithAbort.java src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorExceptionWithRemove.java src/test/java/org/apache/hadoop/hbase/ipc/TestDelayedRpc.java src/test/java/org/apache/hadoop/hbase/regionserver/TestServerCustomProtocol.java src/test/java/org/apache/hadoop/hbase/replication/TestMasterReplication.java src/test/java/org/apache/hadoop/hbase/replication/TestMultiSlaveReplication.java src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java src/test/java/org/apache/hadoop/hbase/thrift2/TestThriftHBaseServiceHandler.java src/test/java/org/apache/hadoop/hbase/util/TestMergeTool.java src/test/java/org/apache/hadoop/hbase/util/TestRegionSplitter.java src/test/java/org/apache/hadoop/hbase/zookeeper/TestHQuorumPeer.java src/test/java/org/apache/hadoop/hbase/zookeeper/TestZooKeeperMainServerArg.java Use a random ZK client port in unit tests so we can run them in parallel Key: HBASE-4746 URL: https://issues.apache.org/jira/browse/HBASE-4746 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, D279-trunk-v5.txt, D279.1.patch, D279.2.patch, D279.3.patch, D279.4.patch, D279.5.patch, D279.6.patch The hard-coded ZK client port has long been a problem for running HBase test suite in parallel. 
The mini ZK cluster should run on a random free port, and that port should be passed to all parts of the unit tests that need to talk to the mini cluster. In fact, randomizing the port exposes many places in the code where a new configuration is instantiated; in those places the client tries to talk to the default ZK client port and times out. The initial fix is for 0.89-fb, where it already makes it possible to run the unit tests in parallel in 10 minutes. A fix for the trunk will follow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
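The idea in the HBASE-4746 description can be sketched roughly as follows. This is not code from D279; the class and method names are made up for illustration, and java.util.Properties only stands in for the test configuration object. The key hbase.zookeeper.property.clientPort is, to my understanding, the HBase setting for the ZK client port.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.Properties;

final class FreePortSketch {
  /** Asks the OS for a currently unused TCP port by binding to port 0. */
  static int findFreePort() throws IOException {
    try (ServerSocket socket = new ServerSocket(0)) {
      return socket.getLocalPort();
    }
  }

  /**
   * Records the chosen port in one shared configuration object.
   * Properties is only a stand-in for the real test configuration.
   */
  static Properties configFor(int zkClientPort) {
    Properties conf = new Properties();
    conf.setProperty("hbase.zookeeper.property.clientPort",
        Integer.toString(zkClientPort));
    return conf;
  }
}
```

Two caveats with this approach: another process can grab the port between the probe socket closing and the mini cluster binding to it, so a real harness would probably retry; and the port only helps if every component reuses the same configuration object instead of instantiating a fresh one, which is the failure mode the description mentions.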
[jira] [Commented] (HBASE-4742) Split dead server's log in parallel
[ https://issues.apache.org/jira/browse/HBASE-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145200#comment-13145200 ] Phabricator commented on HBASE-4742: khemani has commented on the revision [jira] [HBASE-4742] Split dead server's log in parallel. INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/master/HMaster.java:306 Why use a SynchronousQueue ... won't that cause the task-submitting thread to block if the thread pool is at capacity? (Not that we will hit this issue with the default capacity set at 1000.) src/main/java/org/apache/hadoop/hbase/master/HMaster.java:313 I am not very sure about the ThreadPoolExecutor's behavior ... is it possible that, while waiting for a task, the thread pool keeps its threads in daemon mode, and that before a thread is chosen to execute a task it is made into a non-daemon thread? REVISION DETAIL https://reviews.facebook.net/D237 Split dead server's log in parallel --- Key: HBASE-4742 URL: https://issues.apache.org/jira/browse/HBASE-4742 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang Attachments: D237.1.patch, D237.2.patch, D237.3.patch, D237.4.patch, D237.5.patch, D237.6.patch When one region server goes down, the master will shut down the region server and split its log. However, splitting the log is a blocking call and takes some time. If more than one region server goes down, the master will split their logs one by one, which is not efficient. Since we have distributed log splitting, we could split the logs from the dead servers in parallel. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
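On the two inline questions above, going by the java.util.concurrent documentation: a ThreadPoolExecutor backed by a SynchronousQueue does not block the submitting thread when every worker is busy; with the default handler the submission is rejected with a RejectedExecutionException. A thread's daemon flag is set by the ThreadFactory before the thread is started, and the pool does not toggle it afterwards. The sketch below is not the code from D237; the class name, thread names and pool size are assumptions chosen for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class LogSplitPoolSketch {
  /** Builds a pool of daemon threads handing tasks over via a SynchronousQueue. */
  static ThreadPoolExecutor newPool(int maxThreads) {
    AtomicInteger index = new AtomicInteger();
    ThreadFactory factory = r -> {
      // Hypothetical thread name; the index starts at 1.
      Thread t = new Thread(r, "log-split-sketch-" + index.incrementAndGet());
      t.setDaemon(true); // decided once, before the thread is started
      return t;
    };
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        maxThreads, maxThreads, 60L, TimeUnit.SECONDS,
        new SynchronousQueue<Runnable>(), factory);
    // Requires a non-zero keep-alive time; lets idle core threads exit.
    pool.allowCoreThreadTimeOut(true);
    return pool;
  }

  /** Returns true if submitting to a fully busy pool is rejected, not blocked. */
  static boolean saturatedPoolRejects() {
    ThreadPoolExecutor pool = newPool(1);
    CountDownLatch release = new CountDownLatch(1);
    try {
      // Occupy the only worker until the latch is released.
      pool.execute(() -> {
        try {
          release.await();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      });
      try {
        pool.execute(() -> { });
        return false; // the second task was accepted
      } catch (RejectedExecutionException expected) {
        return true; // default AbortPolicy rejected the task
      }
    } finally {
      release.countDown();
      pool.shutdown();
    }
  }
}
```

If blocking or queueing on saturation is the desired behavior, a bounded or unbounded BlockingQueue or a different RejectedExecutionHandler would be the usual knobs; which one fits the master is a design question for the patch.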
[jira] [Updated] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
[ https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-4746: --- Attachment: D279.7.patch mbautin updated the revision [jira] [HBASE-4746] Use a random ZK client port in unit tests so we can run them in parallel. Reviewers: tedyu, JIRA Please ignore the previous update, some unrelated changes got into the patch. Creating a new patch that applies on top of the trunk. Sorry for spam. REVISION DETAIL https://reviews.facebook.net/D279 AFFECTED FILES src/main/java/org/apache/hadoop/hbase/HConstants.java src/main/java/org/apache/hadoop/hbase/client/HTable.java src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java src/main/java/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.java src/main/java/org/apache/hadoop/hbase/master/HMasterCommandLine.java src/main/java/org/apache/hadoop/hbase/thrift2/ThriftHBaseServiceHandler.java src/main/java/org/apache/hadoop/hbase/thrift2/ThriftServer.java src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java src/main/java/org/apache/hadoop/hbase/util/RegionSplitter.java src/main/java/org/apache/hadoop/hbase/zookeeper/MiniZooKeeperCluster.java src/main/java/org/apache/hadoop/hbase/zookeeper/ZKConfig.java src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java src/test/java/org/apache/hadoop/hbase/PerformanceEvaluation.java src/test/java/org/apache/hadoop/hbase/TestHBaseTestingUtility.java src/test/java/org/apache/hadoop/hbase/TestZooKeeper.java src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorExceptionWithAbort.java src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorExceptionWithRemove.java src/test/java/org/apache/hadoop/hbase/replication/TestMasterReplication.java src/test/java/org/apache/hadoop/hbase/replication/TestMultiSlaveReplication.java 
src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java src/test/java/org/apache/hadoop/hbase/thrift2/TestThriftHBaseServiceHandler.java src/test/java/org/apache/hadoop/hbase/util/TestMergeTool.java src/test/java/org/apache/hadoop/hbase/util/TestRegionSplitter.java src/test/java/org/apache/hadoop/hbase/zookeeper/TestHQuorumPeer.java src/test/java/org/apache/hadoop/hbase/zookeeper/TestZooKeeperMainServerArg.java Use a random ZK client port in unit tests so we can run them in parallel Key: HBASE-4746 URL: https://issues.apache.org/jira/browse/HBASE-4746 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, D279-trunk-v5.txt, D279.1.patch, D279.2.patch, D279.3.patch, D279.4.patch, D279.5.patch, D279.6.patch, D279.7.patch
[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
[ https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145204#comment-13145204 ] Hadoop QA commented on HBASE-4746: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12502713/D279.7.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 45 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/205//console This message is automatically generated. Use a random ZK client port in unit tests so we can run them in parallel Key: HBASE-4746 URL: https://issues.apache.org/jira/browse/HBASE-4746 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, D279-trunk-v5.txt, D279.1.patch, D279.2.patch, D279.3.patch, D279.4.patch, D279.5.patch, D279.6.patch, D279.7.patch
[jira] [Commented] (HBASE-4511) There is data loss when master failovers
[ https://issues.apache.org/jira/browse/HBASE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145227#comment-13145227 ] ramkrishna.s.vasudevan commented on HBASE-4511: --- @Stack Patch looks fine. I have one suggestion.
{code}
+ if (isExpiring(expiredServer, currentMetaServer) ||
+     expireIfOnline(currentMetaServer)) {
+   // We are expiring the server that is carrying meta because unreachable
+   // The expiration processing will take care of reassigning meta.
+ }
{code}
As you clearly explained, if we are already expiring a server while assigning meta, then we will not expire it once again. So can we rename isExpiring to isAlreadyExpiring()? Also, can we split the condition, because currently the if block is empty? So we can add isAlreadyExpiring() and if true we can go with expireIfOnline. Just a thought. You can decide, Stack. :) There is data loss when master failovers Key: HBASE-4511 URL: https://issues.apache.org/jira/browse/HBASE-4511 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.0 Reporter: gaojinchao Assignee: stack Priority: Minor Fix For: 0.92.0 Attachments: 4511-v2.txt, 4511.txt, org.apache.hadoop.hbase.master.TestMasterFailover-output.rar, sketch.txt It goes like this: The master crashes while the RS carrying meta is also crashing, but that RS doesn't exit. The master starts up again and finds all living RSs. The master's verification of meta fails because this RS is crashing. The master reassigns meta but doesn't split the HLog, so some meta data is lost. Logs from a failing failover test case follow.
// It said that we want to kill an RS
2011-09-28 19:54:45,694 INFO [Thread-988] regionserver.HRegionServer(1443): STOPPED: Killing for unit test
2011-09-28 19:54:45,694 INFO [Thread-988] master.TestMasterFailover(1007): RS 192.168.2.102,54385,1317264874629 killed
// The RS didn't crash.
2011-09-28 19:54:51,763 INFO [Master:0;192.168.2.102,54557,1317264885720] master.HMaster(458): Registering server found up in zk: 192.168.2.102,54385,1317264874629
2011-09-28 19:54:51,763 INFO [Master:0;192.168.2.102,54557,1317264885720] master.ServerManager(232): Registering server=192.168.2.102,54385,1317264874629
2011-09-28 19:54:51,770 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(491): master:54557-0x132b31adbb30005 Unable to get data of znode /hbase/unassigned/1028785192 because node does not exist (not an error)
2011-09-28 19:54:51,771 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) of data from znode /hbase/root-region-server and set watcher; 192.168.2.102,54383,131726487...
// Meta verification failed and the master reassigned meta, so all the regions in meta are lost.
2011-09-28 19:54:51,773 INFO [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(476): Failed verification of .META.,,1 at address=192.168.2.102,54385,1317264874629; org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 192.168.2.102,54385,1317264874629 not running, aborting
2011-09-28 19:54:51,773 DEBUG [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(316): new .META. server: 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null
2011-09-28 19:54:52,274 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) of data from znode /hbase/root-region-server and set watcher; 192.168.2.102,54383,131726487...
2011-09-28 19:54:52,277 INFO [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(476): Failed verification of .META.,,1 at address=192.168.2.102,54385,1317264874629; org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 192.168.2.102,54385,1317264874629 not running, aborting
2011-09-28 19:54:52,277 DEBUG [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(316): new .META. server: 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null
2011-09-28 19:54:52,778 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) of data from znode /hbase/root-region-server and set watcher; 192.168.2.102,54383,131726487...
2011-09-28 19:54:52,782 INFO [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(476): Failed verification of .META.,,1 at address=192.168.2.102,54385,1317264874629;
[jira] [Created] (HBASE-4754) FSTableDescriptors.getTableInfoPath() should be able to handle FileNotFoundException
FSTableDescriptors.getTableInfoPath() should be able to handle FileNotFoundException Key: HBASE-4754 URL: https://issues.apache.org/jira/browse/HBASE-4754 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.92.0 As reported by Roman in the thread entitled 'HBase 0.92/Hadoop 0.22 test results', table creation would result in the following if hadoop 0.22 is the underlying platform: {code} 11/11/05 19:08:48 INFO handler.CreateTableHandler: Attemping to create the table b 11/11/05 19:08:48 ERROR handler.CreateTableHandler: Error trying to create the table b java.io.FileNotFoundException: File hdfs://ip-10-110-254-200.ec2.internal:17020/hbase/b does not exist. at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:387) at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1085) at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1110) at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:257) at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:243) at org.apache.hadoop.hbase.util.FSTableDescriptors.createTableDescriptor(FSTableDescriptors.java:566) at org.apache.hadoop.hbase.util.FSTableDescriptors.createTableDescriptor(FSTableDescriptors.java:535) at org.apache.hadoop.hbase.util.FSTableDescriptors.createTableDescriptor(FSTableDescriptors.java:519) at org.apache.hadoop.hbase.master.handler.CreateTableHandler.handleCreateTable(CreateTableHandler.java:140) at org.apache.hadoop.hbase.master.handler.CreateTableHandler.process(CreateTableHandler.java:126) at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:168) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619) {code} This was due to how DistributedFileSystem.listStatus() in 0.22 handles non-existent 
directory:
{code}
@Override
public FileStatus[] listStatus(Path p) throws IOException {
  String src = getPathName(p);
  // fetch the first batch of entries in the directory
  DirectoryListing thisListing = dfs.listPaths(
      src, HdfsFileStatus.EMPTY_NAME);
  if (thisListing == null) { // the directory does not exist
    throw new FileNotFoundException("File " + p + " does not exist.");
  }
{code}
So in FSTableDescriptors.getTableInfoPath(), we should catch FileNotFoundException and treat it the same way as status being null. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
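The handling proposed in HBASE-4754 could look roughly like the sketch below. It is not the content of 4754.txt: the Lister interface is a hypothetical stand-in for the FileSystem.listStatus(Path) call, and plain strings replace Path and FileStatus so the example runs without Hadoop on the classpath.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

final class ListStatusSketch {
  /** Hypothetical stand-in for a FileSystem.listStatus(Path) call. */
  interface Lister {
    String[] list(String dir) throws IOException;
  }

  /**
   * Returns the directory listing, or null when the directory is missing.
   * A FileNotFoundException is mapped to null so that callers see the same
   * result whether the file system returns null or throws.
   */
  static String[] listOrNull(Lister fs, String dir) throws IOException {
    try {
      return fs.list(dir);
    } catch (FileNotFoundException e) {
      // Missing directory reported by exception: treat like a null status.
      return null;
    }
  }
}
```

Only FileNotFoundException is swallowed here; any other IOException still propagates, which keeps genuine file system failures visible to the caller.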
[jira] [Updated] (HBASE-4754) FSTableDescriptors.getTableInfoPath() should be able to handle FileNotFoundException
[ https://issues.apache.org/jira/browse/HBASE-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4754: -- Attachment: 4754.txt FSTableDescriptors.getTableInfoPath() should be able to handle FileNotFoundException Key: HBASE-4754 URL: https://issues.apache.org/jira/browse/HBASE-4754 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.92.0 Attachments: 4754.txt
[jira] [Updated] (HBASE-4754) FSTableDescriptors.getTableInfoPath() should be able to handle FileNotFoundException
[ https://issues.apache.org/jira/browse/HBASE-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4754: -- Status: Patch Available (was: Open) FSTableDescriptors.getTableInfoPath() should be able to handle FileNotFoundException Key: HBASE-4754 URL: https://issues.apache.org/jira/browse/HBASE-4754 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.92.0 Attachments: 4754.txt
[jira] [Updated] (HBASE-4748) Restarting the cluster after alter table(online) completely losses the table information
[ https://issues.apache.org/jira/browse/HBASE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-4748: -- Summary: Restarting the cluster after alter table(online) completely losses the table information (was: Restart the cluster after alter table(online) completely losses the table information) Restarting the cluster after alter table(online) completely losses the table information Key: HBASE-4748 URL: https://issues.apache.org/jira/browse/HBASE-4748 Project: HBase Issue Type: Bug Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan 1. Start a cluster. 2. Alter a table 3. Restart the master using ./hbase-daemon.sh restart master 4. Kill the RS after master restarts. 5. Start RS again. 6. No table operations can be performed on the table that was altered but admin.listTables() is able to list the altered table. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
[ https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4746: -- Status: Open (was: Patch Available) Use a random ZK client port in unit tests so we can run them in parallel Key: HBASE-4746 URL: https://issues.apache.org/jira/browse/HBASE-4746 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, D279-trunk-v5.txt, D279.1.patch, D279.2.patch, D279.3.patch, D279.4.patch, D279.5.patch, D279.6.patch, D279.7.patch
[jira] [Updated] (HBASE-4748) Restarting the cluster after alter table (online) completely loses the table information
[ https://issues.apache.org/jira/browse/HBASE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4748: -- Summary: Restarting the cluster after alter table (online) completely loses the table information (was: Restarting the cluster after alter table(online) completely losses the table information) Restarting the cluster after alter table (online) completely loses the table information Key: HBASE-4748 URL: https://issues.apache.org/jira/browse/HBASE-4748 Project: HBase Issue Type: Bug Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan
[jira] [Updated] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
[ https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4746: -- Attachment: D279-trunk-v7.txt Same as D279.v7 Use a random ZK client port in unit tests so we can run them in parallel Key: HBASE-4746 URL: https://issues.apache.org/jira/browse/HBASE-4746 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, D279-trunk-v5.txt, D279-trunk-v7.txt, D279.1.patch, D279.2.patch, D279.3.patch, D279.4.patch, D279.5.patch, D279.6.patch, D279.7.patch
[jira] [Updated] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
[ https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4746: -- Status: Patch Available (was: Open) Use a random ZK client port in unit tests so we can run them in parallel Key: HBASE-4746 URL: https://issues.apache.org/jira/browse/HBASE-4746 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, D279-trunk-v5.txt, D279-trunk-v7.txt, D279.1.patch, D279.2.patch, D279.3.patch, D279.4.patch, D279.5.patch, D279.6.patch, D279.7.patch
[jira] [Updated] (HBASE-4754) FSTableDescriptors.getTableInfoPath() should be able to handle FileNotFoundException
[ https://issues.apache.org/jira/browse/HBASE-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4754: -- Priority: Blocker (was: Major) Affects Version/s: 0.92.0 FSTableDescriptors.getTableInfoPath() should be able to handle FileNotFoundException Key: HBASE-4754 URL: https://issues.apache.org/jira/browse/HBASE-4754 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Ted Yu Assignee: Ted Yu Priority: Blocker Fix For: 0.92.0 Attachments: 4754.txt
[jira] [Commented] (HBASE-4511) There is data loss when master failovers
[ https://issues.apache.org/jira/browse/HBASE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145239#comment-13145239 ]
ramkrishna.s.vasudevan commented on HBASE-4511:
---
bq. So we can add isAlreadyExpiring() and if true
Sorry, it should be: So we can add isAlreadyExpiring() and, if false, we can go with expireIfOnline.

There is data loss when master failovers
Key: HBASE-4511 URL: https://issues.apache.org/jira/browse/HBASE-4511 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.0 Reporter: gaojinchao Assignee: stack Priority: Minor Fix For: 0.92.0 Attachments: 4511-v2.txt, 4511.txt, org.apache.hadoop.hbase.master.TestMasterFailover-output.rar, sketch.txt

It goes like this: the master crashed and, at the same time, the RS hosting meta is crashing, but the RS doesn't exit. The master starts up again and finds all living RSs. The master's verification of meta fails because this RS is crashing. The master reassigns meta, but it doesn't split the HLog, so some meta data is lost. Below are the logs of a failing failover test case.
//It said that we want to kill an RS
2011-09-28 19:54:45,694 INFO [Thread-988] regionserver.HRegionServer(1443): STOPPED: Killing for unit test
2011-09-28 19:54:45,694 INFO [Thread-988] master.TestMasterFailover(1007): RS 192.168.2.102,54385,1317264874629 killed
//RS didn't crash.
2011-09-28 19:54:51,763 INFO [Master:0;192.168.2.102,54557,1317264885720] master.HMaster(458): Registering server found up in zk: 192.168.2.102,54385,1317264874629 2011-09-28 19:54:51,763 INFO [Master:0;192.168.2.102,54557,1317264885720] master.ServerManager(232): Registering server=192.168.2.102,54385,1317264874629 2011-09-28 19:54:51,770 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(491): master:54557-0x132b31adbb30005 Unable to get data of znode /hbase/unassigned/1028785192 because node does not exist (not an error) 2011-09-28 19:54:51,771 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) of data from znode /hbase/root-region-server and set watcher; 192.168.2.102,54383,131726487... //Meta verification failed and ressigned the meta. So all the regions in the meta is loss. 2011-09-28 19:54:51,773 INFO [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(476): Failed verification of .META.,,1 at address=192.168.2.102,54385,1317264874629; org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 192.168.2.102,54385,1317264874629 not running, aborting 2011-09-28 19:54:51,773 DEBUG [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(316): new .META. server: 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null 2011-09-28 19:54:52,274 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) of data from znode /hbase/root-region-server and set watcher; 192.168.2.102,54383,131726487... 
2011-09-28 19:54:52,277 INFO [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(476): Failed verification of .META.,,1 at address=192.168.2.102,54385,1317264874629; org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 192.168.2.102,54385,1317264874629 not running, aborting 2011-09-28 19:54:52,277 DEBUG [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(316): new .META. server: 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null 2011-09-28 19:54:52,778 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) of data from znode /hbase/root-region-server and set watcher; 192.168.2.102,54383,131726487... 2011-09-28 19:54:52,782 INFO [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(476): Failed verification of .META.,,1 at address=192.168.2.102,54385,1317264874629; org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 192.168.2.102,54385,1317264874629 not running, aborting 2011-09-28 19:54:52,782 DEBUG [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(316): new .META. server: 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null 2011-09-28 19:54:52,782 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKAssign(264): master:54557-0x132b31adbb30005 Creating (or updating) unassigned node for
[jira] [Commented] (HBASE-4511) There is data loss when master failovers
[ https://issues.apache.org/jira/browse/HBASE-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145242#comment-13145242 ]
stack commented on HBASE-4511:
--
I like your suggestion @Ram. New patch to follow.

There is data loss when master failovers
Key: HBASE-4511 URL: https://issues.apache.org/jira/browse/HBASE-4511 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.0 Reporter: gaojinchao Assignee: stack Priority: Minor Fix For: 0.92.0 Attachments: 4511-v2.txt, 4511.txt, org.apache.hadoop.hbase.master.TestMasterFailover-output.rar, sketch.txt

It goes like this: the master crashed and, at the same time, the RS hosting meta is crashing, but the RS doesn't exit. The master starts up again and finds all living RSs. The master's verification of meta fails because this RS is crashing. The master reassigns meta, but it doesn't split the HLog, so some meta data is lost. Below are the logs of a failing failover test case.
//It said that we want to kill an RS
2011-09-28 19:54:45,694 INFO [Thread-988] regionserver.HRegionServer(1443): STOPPED: Killing for unit test
2011-09-28 19:54:45,694 INFO [Thread-988] master.TestMasterFailover(1007): RS 192.168.2.102,54385,1317264874629 killed
//RS didn't crash.
2011-09-28 19:54:51,763 INFO [Master:0;192.168.2.102,54557,1317264885720] master.HMaster(458): Registering server found up in zk: 192.168.2.102,54385,1317264874629 2011-09-28 19:54:51,763 INFO [Master:0;192.168.2.102,54557,1317264885720] master.ServerManager(232): Registering server=192.168.2.102,54385,1317264874629 2011-09-28 19:54:51,770 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(491): master:54557-0x132b31adbb30005 Unable to get data of znode /hbase/unassigned/1028785192 because node does not exist (not an error) 2011-09-28 19:54:51,771 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) of data from znode /hbase/root-region-server and set watcher; 192.168.2.102,54383,131726487... //Meta verification failed and ressigned the meta. So all the regions in the meta is loss. 2011-09-28 19:54:51,773 INFO [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(476): Failed verification of .META.,,1 at address=192.168.2.102,54385,1317264874629; org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 192.168.2.102,54385,1317264874629 not running, aborting 2011-09-28 19:54:51,773 DEBUG [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(316): new .META. server: 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null 2011-09-28 19:54:52,274 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) of data from znode /hbase/root-region-server and set watcher; 192.168.2.102,54383,131726487... 
2011-09-28 19:54:52,277 INFO [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(476): Failed verification of .META.,,1 at address=192.168.2.102,54385,1317264874629; org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 192.168.2.102,54385,1317264874629 not running, aborting 2011-09-28 19:54:52,277 DEBUG [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(316): new .META. server: 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null 2011-09-28 19:54:52,778 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKUtil(1003): master:54557-0x132b31adbb30005 Retrieved 33 byte(s) of data from znode /hbase/root-region-server and set watcher; 192.168.2.102,54383,131726487... 2011-09-28 19:54:52,782 INFO [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(476): Failed verification of .META.,,1 at address=192.168.2.102,54385,1317264874629; org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: org.apache.hadoop.hbase.regionserver.RegionServerStoppedException: Server 192.168.2.102,54385,1317264874629 not running, aborting 2011-09-28 19:54:52,782 DEBUG [Master:0;192.168.2.102,54557,1317264885720] catalog.CatalogTracker(316): new .META. server: 192.168.2.102,54385,1317264874629 isn't valid. Cached .META. server: null 2011-09-28 19:54:52,782 DEBUG [Master:0;192.168.2.102,54557,1317264885720] zookeeper.ZKAssign(264): master:54557-0x132b31adbb30005 Creating (or updating) unassigned node for 1028785192 with OFFLINE state 2011-09-28 19:54:52,825 DEBUG [Thread-988-EventThread] zookeeper.ZooKeeperWatcher(233):
[jira] [Commented] (HBASE-4754) FSTableDescriptors.getTableInfoPath() should be able to handle FileNotFoundException
[ https://issues.apache.org/jira/browse/HBASE-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145243#comment-13145243 ] stack commented on HBASE-4754: -- We should check all of hbase codebase Ted I'd say. We'll see other forms of this issue I'd say. FSTableDescriptors.getTableInfoPath() should be able to handle FileNotFoundException Key: HBASE-4754 URL: https://issues.apache.org/jira/browse/HBASE-4754 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Ted Yu Assignee: Ted Yu Priority: Blocker Fix For: 0.92.0 Attachments: 4754.txt As reported by Roman in the thread entitled 'HBase 0.92/Hadoop 0.22 test results', table creation would result in the following if hadoop 0.22 is the underlying platform: {code} 11/11/05 19:08:48 INFO handler.CreateTableHandler: Attemping to create the table b 11/11/05 19:08:48 ERROR handler.CreateTableHandler: Error trying to create the table b java.io.FileNotFoundException: File hdfs://ip-10-110-254-200.ec2.internal:17020/hbase/b does not exist. 
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:387)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1085)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1110)
at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:257)
at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:243)
at org.apache.hadoop.hbase.util.FSTableDescriptors.createTableDescriptor(FSTableDescriptors.java:566)
at org.apache.hadoop.hbase.util.FSTableDescriptors.createTableDescriptor(FSTableDescriptors.java:535)
at org.apache.hadoop.hbase.util.FSTableDescriptors.createTableDescriptor(FSTableDescriptors.java:519)
at org.apache.hadoop.hbase.master.handler.CreateTableHandler.handleCreateTable(CreateTableHandler.java:140)
at org.apache.hadoop.hbase.master.handler.CreateTableHandler.process(CreateTableHandler.java:126)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:168)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
{code}
This was due to how DistributedFileSystem.listStatus() in 0.22 handles a non-existent directory:
{code}
@Override
public FileStatus[] listStatus(Path p) throws IOException {
  String src = getPathName(p);
  // fetch the first batch of entries in the directory
  DirectoryListing thisListing = dfs.listPaths(src, HdfsFileStatus.EMPTY_NAME);
  if (thisListing == null) { // the directory does not exist
    throw new FileNotFoundException("File " + p + " does not exist.");
  }
{code}
So in FSTableDescriptors.getTableInfoPath(), we should catch FileNotFoundException and treat it the same way as status being null.
-- This message is automatically generated by JIRA.
[jira] [Commented] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache
[ https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145244#comment-13145244 ] Li Pi commented on HBASE-4752: -- This should work. I agree we should drain until we're below minsize, this seems to align with the way the code has been written. +1 from me. Don't create an unnecessary LinkedList when evicting from the BlockCache Key: HBASE-4752 URL: https://issues.apache.org/jira/browse/HBASE-4752 Project: HBase Issue Type: Improvement Components: performance, regionserver Affects Versions: 0.90.4 Reporter: Benoit Sigoure Assignee: Ted Yu Priority: Minor Fix For: 0.94.0 Attachments: 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 4752-trunk-v2.txt, 4752-trunk.txt When evicting from the BlockCache, the code creates a LinkedList containing every single block sorted by access time. This list is created from a PriorityQueue. I don't believe it is necessary, as the PriorityQueue can be used directly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
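As a rough illustration of the idea discussed in HBASE-4752 (draining the PriorityQueue directly rather than copying it into a LinkedList first), here is a minimal sketch. The `Block` class, its fields, and the oldest-first comparator are assumptions made for the example; they are not the BlockCache or CachedBlockQueue code, and the ordering used by the real queue may differ. One fact worth keeping in mind is that `PriorityQueue.poll()` returns elements in comparator order, while iterating over a PriorityQueue gives no ordering guarantee.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

class EvictionSketch {
    // Hypothetical cached block; names and fields are illustrative only.
    static final class Block {
        final String name;
        final long accessTime;
        final long size;

        Block(String name, long accessTime, long size) {
            this.name = name;
            this.accessTime = accessTime;
            this.size = size;
        }
    }

    // Assumed ordering for this sketch: least recently accessed block at the head.
    static PriorityQueue<Block> oldestFirst() {
        return new PriorityQueue<>(Comparator.comparingLong((Block b) -> b.accessTime));
    }

    // Polls blocks straight from the queue until at least bytesToFree bytes
    // have been released or the queue is empty; no intermediate list is built.
    static long evict(PriorityQueue<Block> queue, long bytesToFree) {
        long freed = 0;
        Block block;
        while (freed < bytesToFree && (block = queue.poll()) != null) {
            freed += block.size;
        }
        return freed;
    }
}
```

Whether eviction should proceed from the head or the tail of the real queue is exactly the ordering question raised earlier in the thread, so the comparator direction here should be read as a placeholder.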
[jira] [Commented] (HBASE-4742) Split dead server's log in parallel
[ https://issues.apache.org/jira/browse/HBASE-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145251#comment-13145251 ]
Phabricator commented on HBASE-4742:
Liyin has commented on the revision [jira] [HBASE-4742] Split dead server's log in parallel.

INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java:304 I have made these threads daemon threads.
src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java:285 Yes, stale code. I shall remove this function.
src/main/java/org/apache/hadoop/hbase/master/HMaster.java:306 Good point, Prakash.
src/main/java/org/apache/hadoop/hbase/master/HMaster.java:313 I think all the threads created by this thread pool executor will run in daemon mode, which means each thread will run either until it completes or until all user threads have completed.

REVISION DETAIL https://reviews.facebook.net/D237

Split dead server's log in parallel
---
Key: HBASE-4742 URL: https://issues.apache.org/jira/browse/HBASE-4742 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang Attachments: D237.1.patch, D237.2.patch, D237.3.patch, D237.4.patch, D237.5.patch, D237.6.patch

When one region server goes down, the master will shut down the region server and split its log. However, splitting the log is a blocking call and takes some time. If more than one region server goes down, the master will split their logs one by one, which is not efficient. Since we have distributed log splitting, we could split the logs from the dead servers in parallel.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
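To make the daemon-thread discussion concrete, here is a minimal sketch of a thread pool whose workers are daemon threads. It is not the code from D237: the class name, the "log-split-" thread name prefix, and the 60 second keep-alive are assumptions chosen for the example. Numbering threads from 1 and calling `allowCoreThreadTimeOut(true)` follow earlier review comments on this revision.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class LogSplitPoolSketch {
    // Builds a fixed-size pool whose workers are daemon threads named
    // log-split-1, log-split-2, ... (the prefix is an assumed name).
    static ThreadPoolExecutor newPool(int threads) {
        final AtomicInteger index = new AtomicInteger(0);
        ThreadFactory factory = runnable -> {
            // incrementAndGet() makes the first thread index 1 rather than 0.
            Thread t = new Thread(runnable, "log-split-" + index.incrementAndGet());
            // Daemon threads do not keep the JVM alive once all user threads exit.
            t.setDaemon(true);
            return t;
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads,
                60L, TimeUnit.SECONDS,   // assumed keep-alive; must be non-zero for the call below
                new LinkedBlockingQueue<Runnable>(),
                factory);
        // Lets idle core threads terminate instead of lingering forever.
        pool.allowCoreThreadTimeOut(true);
        return pool;
    }
}
```

A daemon worker can be stopped abruptly when the JVM exits, so work that must finish (such as a log split in progress) would still need an explicit shutdown and wait in real code.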
[jira] [Updated] (HBASE-4742) Split dead server's log in parallel
[ https://issues.apache.org/jira/browse/HBASE-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-4742: --- Attachment: D237.7.patch Liyin updated the revision [jira] [HBASE-4742] Split dead server's log in parallel. Reviewers: Kannan, khemani, Karthik, mbautin, JIRA Address Prakash and lhofhansl 's comments. REVISION DETAIL https://reviews.facebook.net/D237 AFFECTED FILES src/main/java/org/apache/hadoop/hbase/master/HMaster.java src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java src/main/java/org/apache/hadoop/hbase/master/RegionServerOperationQueue.java src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java src/test/java/org/apache/hadoop/hbase/master/TestMultiRegionServerShutDown.java Split dead server's log in parallel --- Key: HBASE-4742 URL: https://issues.apache.org/jira/browse/HBASE-4742 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang Attachments: D237.1.patch, D237.2.patch, D237.3.patch, D237.4.patch, D237.5.patch, D237.6.patch, D237.7.patch When one region server goes down, the master will shutdown the region server and split its log. However, splitting log is a blocking call and it would take some time. If more than one region server go down, the master will split its log one by one, which is not efficient. Since we have the distributed log split, we could split these logs from the dead servers in parallel. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-2742) Provide strong authentication with a secure RPC engine
[ https://issues.apache.org/jira/browse/HBASE-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145254#comment-13145254 ]
jirapos...@reviews.apache.org commented on HBASE-2742:
--
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1991/#review3072
---
Ship it! +1 on getting it in.

security/src/main/java/org/apache/hadoop/hbase/ipc/SecureClient.java
https://reviews.apache.org/r/1991/#comment6843
Good. We're pretty far from hadoop rpc now, huh..

security/src/main/java/org/apache/hadoop/hbase/ipc/SecureClient.java
https://reviews.apache.org/r/1991/#comment6844
Does this happen on every rpc? It won't show in hbase logs because our logger is up in the hadoop ipc package, so that should be fine.

security/src/main/java/org/apache/hadoop/hbase/ipc/SecureClient.java
https://reviews.apache.org/r/1991/#comment6845
This if/else seems a little fuzzy. Server-side, this is how it chooses an auth method?

security/src/main/java/org/apache/hadoop/hbase/ipc/SecureClient.java
https://reviews.apache.org/r/1991/#comment6846
Should we set saslRpcClient to null after dispose? Is it good to just swallow the exception? Should log at least?

security/src/main/java/org/apache/hadoop/hbase/ipc/SecureClient.java
https://reviews.apache.org/r/1991/#comment6847
Should this be configurable?

security/src/main/java/org/apache/hadoop/hbase/ipc/SecureClient.java
https://reviews.apache.org/r/1991/#comment6848
Formatting. Is this copied from hadoop?

security/src/main/java/org/apache/hadoop/hbase/ipc/SecureClient.java
https://reviews.apache.org/r/1991/#comment6849
What about the 'length' thing that we added to hbase? I don't see you processing it here. I suppose it's ok because these are different clients and servers and they just don't do it (and things like asynchbase are not going to do secure hbase any time soon).

security/src/main/java/org/apache/hadoop/hbase/ipc/SecureRpcEngine.java
https://reviews.apache.org/r/1991/#comment6850
In secure hbase we don't do that delayed response stuff that was recently added to insecure rpc?

security/src/main/java/org/apache/hadoop/hbase/ipc/SecureRpcEngine.java
https://reviews.apache.org/r/1991/#comment6851
Why do we need this? Is this in all our rpc'ing?

security/src/main/java/org/apache/hadoop/hbase/ipc/SecureRpcEngine.java
https://reviews.apache.org/r/1991/#comment6852
Where does TRACELOG come from? Why not LOG.trace? Is it from the parent? Should logging be from this class?

security/src/main/java/org/apache/hadoop/hbase/ipc/SecureServer.java
https://reviews.apache.org/r/1991/#comment6853
An insecure client cannot talk to a secure server (which makes 'sense')

security/src/main/java/org/apache/hadoop/hbase/ipc/SecureServer.java
https://reviews.apache.org/r/1991/#comment6854
If result is null, will this work?

security/src/main/java/org/apache/hadoop/hbase/ipc/SecureServer.java
https://reviews.apache.org/r/1991/#comment6855
Is this copy/pasted from hadoop? Else it's a style violation (minor nit)

security/src/main/java/org/apache/hadoop/hbase/ipc/Status.java
https://reviews.apache.org/r/1991/#comment6856
This has been added already.

security/src/main/java/org/apache/hadoop/hbase/security/token/AuthenticationTokenIdentifier.java
https://reviews.apache.org/r/1991/#comment6857
Are these copied from hadoop? They don't look like Gary's style.

security/src/main/java/org/apache/hadoop/hbase/security/token/AuthenticationTokenSecretManager.java
https://reviews.apache.org/r/1991/#comment6858
Should a 'secret manager' have a public constructor?

security/src/main/java/org/apache/hadoop/hbase/security/token/AuthenticationTokenSecretManager.java
https://reviews.apache.org/r/1991/#comment6859
ditto

security/src/main/java/org/apache/hadoop/hbase/security/token/ZKLeaderManager.java
https://reviews.apache.org/r/1991/#comment6860
Is this a 'generic' zk facility? Put it in the zk package?
- Michael On 2011-10-26 20:23:19, Gary Helmling wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/1991/ bq. --- bq. bq. (Updated 2011-10-26 20:23:19) bq. bq. bq. Review request for hbase. bq. bq. bq. Summary bq. --- bq. bq. This patch creates a new secure RPC engine for HBase, which provides Kerberos based authentication of clients, and a token-based authentication mechanism for mapreduce jobs. Primary components of the patch are: bq. bq. - a new maven profile for secure Hadoop/HBase: hadoop-0.20S bq.- Secure Hadoop dependent classes are separated under a
[jira] [Commented] (HBASE-4753) org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws NPE on trunk
[ https://issues.apache.org/jira/browse/HBASE-4753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145259#comment-13145259 ] Hudson commented on HBASE-4753: --- Integrated in HBase-0.92 #116 (See [https://builds.apache.org/job/HBase-0.92/116/]) HBASE-4753 org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws NPE on trunk stack : Files : * /hbase/branches/0.92/CHANGES.txt * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java org.apache.hadoop.hbase.regionserver.TestHRegionInfo#testGetSetOfHTD throws NPE on trunk Key: HBASE-4753 URL: https://issues.apache.org/jira/browse/HBASE-4753 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: nkeywal Assignee: nkeywal Fix For: 0.92.0 Attachments: 4753_FSTableDescriptors.patch testGetSetOfHTD(org.apache.hadoop.hbase.regionserver.TestHRegionInfo) Time elapsed: 0.011 sec ERROR! java.lang.NullPointerException at org.apache.hadoop.hbase.util.FSTableDescriptors.deleteTableDescriptorIfExists(FSTableDescriptors.java:433) at org.apache.hadoop.hbase.regionserver.TestHRegionInfo.testGetSetOfHTD(TestHRegionInfo.java:72) Because the 'getTableInfoPath' can return null and it's not tested. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4754) FSTableDescriptors.getTableInfoPath() should be able to handle FileNotFoundException
[ https://issues.apache.org/jira/browse/HBASE-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145262#comment-13145262 ] Hadoop QA commented on HBASE-4754: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12502723/4754.txt against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 javadoc. The javadoc tool appears to have generated -164 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 51 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandler org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithRemove Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/206//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/206//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/206//console This message is automatically generated. 
FSTableDescriptors.getTableInfoPath() should be able to handle FileNotFoundException Key: HBASE-4754 URL: https://issues.apache.org/jira/browse/HBASE-4754 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Ted Yu Assignee: Ted Yu Priority: Blocker Fix For: 0.92.0 Attachments: 4754.txt As reported by Roman in the thread entitled 'HBase 0.92/Hadoop 0.22 test results', table creation would result in the following if hadoop 0.22 is the underlying platform: {code} 11/11/05 19:08:48 INFO handler.CreateTableHandler: Attemping to create the table b 11/11/05 19:08:48 ERROR handler.CreateTableHandler: Error trying to create the table b java.io.FileNotFoundException: File hdfs://ip-10-110-254-200.ec2.internal:17020/hbase/b does not exist. at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:387) at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1085) at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1110) at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:257) at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:243) at org.apache.hadoop.hbase.util.FSTableDescriptors.createTableDescriptor(FSTableDescriptors.java:566) at org.apache.hadoop.hbase.util.FSTableDescriptors.createTableDescriptor(FSTableDescriptors.java:535) at org.apache.hadoop.hbase.util.FSTableDescriptors.createTableDescriptor(FSTableDescriptors.java:519) at org.apache.hadoop.hbase.master.handler.CreateTableHandler.handleCreateTable(CreateTableHandler.java:140) at org.apache.hadoop.hbase.master.handler.CreateTableHandler.process(CreateTableHandler.java:126) at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:168) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619) {code} This was due to how 
DistributedFileSystem.listStatus() in 0.22 handles a non-existent directory:
{code}
@Override
public FileStatus[] listStatus(Path p) throws IOException {
  String src = getPathName(p);
  // fetch the first batch of entries in the directory
  DirectoryListing thisListing = dfs.listPaths(src, HdfsFileStatus.EMPTY_NAME);
  if (thisListing == null) { // the directory does not exist
    throw new FileNotFoundException("File " + p + " does not exist.");
  }
{code}
So in FSTableDescriptors.getTableInfoPath(), we should catch FileNotFoundException and treat it the same way as status being null.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4605) Constraints
[ https://issues.apache.org/jira/browse/HBASE-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145266#comment-13145266 ]
Jesse Yates commented on HBASE-4605:

@Ted:
{quote} I only see synchronized keyword on Constraints.add(). Have you tried using synchronization on other methods ? {quote}
No, I haven't, but this was the beginnings of making the setting synchronized. However, after thinking about it, that doesn't really make all that much sense for a bunch of static setting methods. I was going to push this off onto the user.
{quote} Also, HTableDescriptor.values is protected field. We can change its actual implementation to ConcurrentHashMap, etc to accommodate for the concurrency you described. {quote}
Seems a little excessive, especially since all the synchronization on HTD has been done by users when necessary in the past. My hunch is that this is rarely an issue.
{quote} If we store metadata about constraints in the Configuration object as I described @ 29/Oct/11 04:20, we utilize the available serialization mechanism. The current approach deals with serialization itself. This is not as flexible as the above approach. {quote}
+1 Making the changes. My initial thought was to mimic how CPs are added to the table, which handle writing their own bytes. But, yeah, your recommendation would make it much easier (and consistent).

@Gary:
{quote} Some runtime exceptions could indicate programming errors or bugs, in which case I think it's best to handle them the same way we handle unexpected errors in coprocessors – by either unloading or aborting, depending on configuration. We need to be careful for what we allow from user code running in process on region servers. {quote}
AND
{quote} I disagree here. Treating all possible exceptions the same seems wrong. An OutOfMemoryError is not equivalent to a NumberFormatException. We need some structure in place for what types of errors are allowed/expected, instead of swallowing everything and assuming it's just a constraint violation. I don't think asking the Constraint implementations to think about what errors they want to throw is much of a burden. {quote}
+1 on having the user propagate a DoNotRetryIOException subclass (e.g. ConstraintException) - you make a good point. Working on the best way to handle unload/total failure.
{quote}
Sometimes people do weird things (I've seen people doing exactly what I was testing against), so while ugly, it is not unheard of. An examples module would be great - I'll comment on 4336 about that. In the meantime, maybe just having an examples package (main/java/…hbase.constraint.examples) that can be updated as people find more uses for constraints? Though I'm not averse to the book either.

Constraints
---
Key: HBASE-4605 URL: https://issues.apache.org/jira/browse/HBASE-4605 Project: HBase Issue Type: Improvement Components: client, coprocessors Affects Versions: 0.94.0 Reporter: Jesse Yates Assignee: Jesse Yates Attachments: constraint_as_cp.txt, java_Constraint_v2.patch

From Jesse's comment on dev:
{quote} What I would like to propose is a simple interface that people can use to implement a 'constraint' (matching the classic database definition). This would help ease of adoption by helping HBase more easily check that box, help minimize code duplication across organizations, and lead to easier adoption. Essentially, people would implement a 'Constraint' interface for checking keys before they are put into a table. Puts that are valid get written to the table, but if not, people can throw an exception that gets propagated back to the client explaining why the put was invalid. Constraints would be set on a per-table basis and the user would be expected to ensure the jars containing the constraint are present on the machines serving that table.
Yes, people could roll their own mechanism for doing this via coprocessors each time, but this would make it easier to do so, so you only have to implement a very minimal interface and not worry about the specifics. {quote}
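A minimal sketch of the idea discussed above, in plain Java with no HBase dependency. Everything here is hypothetical and for illustration only: the `Constraint` interface shape, the `ConstraintException` stand-in (it extends `IOException` here, not the real HBase exception hierarchy), the toy `Table`, and the example constraint are assumptions, not the API from the attached patches. It shows a constraint rejecting an invalid put with a dedicated exception type, while any other runtime exception is left to propagate instead of being treated as a violation.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class ConstraintSketch {

    /** Hypothetical stand-in for a dedicated "constraint violated" exception. */
    static class ConstraintException extends IOException {
        ConstraintException(String message) {
            super(message);
        }
    }

    /** Hypothetical minimal interface: check a value before it is written. */
    interface Constraint {
        void check(String rowKey, String value) throws ConstraintException;
    }

    /** Example constraint: the value must parse as a non-negative integer. */
    static class NonNegativeIntConstraint implements Constraint {
        @Override
        public void check(String rowKey, String value) throws ConstraintException {
            try {
                if (Integer.parseInt(value) < 0) {
                    throw new ConstraintException("negative value for row " + rowKey);
                }
            } catch (NumberFormatException e) {
                // An expected failure mode: report it as a constraint violation.
                throw new ConstraintException("not an integer for row " + rowKey);
            }
        }
    }

    /** Toy in-memory table that runs every constraint before storing a put. */
    static class Table {
        private final List<Constraint> constraints = new ArrayList<>();
        private final List<String> rows = new ArrayList<>();

        void addConstraint(Constraint constraint) {
            constraints.add(constraint);
        }

        void put(String rowKey, String value) throws ConstraintException {
            // Only ConstraintException signals an invalid put; other runtime
            // exceptions are not caught here and propagate to the caller.
            for (Constraint constraint : constraints) {
                constraint.check(rowKey, value);
            }
            rows.add(rowKey + "=" + value);
        }

        int size() {
            return rows.size();
        }
    }

    public static void main(String[] args) {
        Table table = new Table();
        table.addConstraint(new NonNegativeIntConstraint());
        try {
            table.put("row1", "42"); // accepted
            table.put("row2", "-7"); // rejected by the constraint
        } catch (ConstraintException e) {
            System.out.println("put rejected: " + e.getMessage());
        }
        System.out.println("rows stored: " + table.size());
    }
}
```

Keeping the violation type separate from unexpected errors is the point raised in the review: the caller can report a rejected put to the client without masking bugs in the constraint code itself.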
[jira] [Commented] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel
[ https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145270#comment-13145270 ] Hadoop QA commented on HBASE-4746: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12502724/D279-trunk-v7.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 45 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -164 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 51 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/207//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/207//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/207//console This message is automatically generated. Use a random ZK client port in unit tests so we can run them in parallel Key: HBASE-4746 URL: https://issues.apache.org/jira/browse/HBASE-4746 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: 4746-trunk-v2.txt, D255.1.patch, D255.2.patch, D279-trunk-v5.txt, D279-trunk-v7.txt, D279.1.patch, D279.2.patch, D279.3.patch, D279.4.patch, D279.5.patch, D279.6.patch, D279.7.patch The hard-coded ZK client port has long been a problem for running HBase test suite in parallel. The mini ZK cluster should run on a random free port, and that port should be passed to all parts of the unit tests that need to talk to the mini cluster. 
In fact, randomizing the port exposes a lot of places in the code where a new configuration is instantiated, and as a result the client tries to talk to the default ZK client port and times out. The initial fix is for 0.89-fb, where it already allows the unit tests to be run in parallel in 10 minutes. A fix for the trunk will follow.
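A minimal sketch of the technique described in this issue, assuming plain Java and not the code from the attached patches. The class and method names are hypothetical, and `java.util.Properties` stands in for the HBase `Configuration` object; `hbase.zookeeper.property.clientPort` is the usual HBase key for the ZK client port. Binding a `ServerSocket` to port 0 asks the OS for a free port, which is then passed along explicitly rather than relying on the default.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.Properties;

public class FreePortSketch {

    /** Asks the OS for a currently unused port by binding to port 0. */
    static int findFreePort() throws IOException {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        }
    }

    /**
     * Builds a test configuration carrying the chosen port. Properties is a
     * stand-in for the real configuration class (assumption for illustration).
     */
    static Properties testConfig(int zkClientPort) {
        Properties conf = new Properties();
        conf.setProperty("hbase.zookeeper.property.clientPort",
                Integer.toString(zkClientPort));
        return conf;
    }

    public static void main(String[] args) throws IOException {
        int port = findFreePort();
        Properties conf = testConfig(port);
        // Every component of the test must reuse this conf; creating a fresh
        // default configuration would silently fall back to the default port.
        System.out.println("mini ZK cluster would listen on port "
                + conf.getProperty("hbase.zookeeper.property.clientPort"));
    }
}
```

Note that the port is released before the server under test binds to it, so another process could grab it in between; a real test harness may need to retry on bind failure.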
[jira] [Commented] (HBASE-4737) Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM
[ https://issues.apache.org/jira/browse/HBASE-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145275#comment-13145275 ] nkeywal commented on HBASE-4737: The -1 on @author is a bug in the validation tool: I actually removed an author tag, not added one. Other errors are the usual ones (this patch does not modify the behavior; it just adds tags). So I think it's ok for integration. Parallel builds can be tested with the shell script. Split the tests in small/medium/large; allow small tests to be ran in parallel within a single JVM -- Key: HBASE-4737 URL: https://issues.apache.org/jira/browse/HBASE-4737 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 2003_4737_pom.dummy.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.patch, 2003_4737_pom.v2.patch, 2003_4737_pom.v2.patch, 4737_all.patch, hbasetests.sh, hbasetests.sh 1) Split the tests into 3 categories - small: no cluster, less than 15s, can be run in parallel with other tests in a JVM - medium: 45s, not flaky, useful to detect bugs immediately - large: remaining 2) Allow running a subset: developers should need to run only small and medium before submitting a patch - will need a surefire patch, see http://jira.codehaus.org/browse/SUREFIRE-329 Small is the default. All other tests will have to be marked Medium or Large with a JUnit category. Proposed split: Small: 122 classes, 479 methods, ~3 minutes (when not run in parallel) Medium: 78 classes, 373 methods, ~23 minutes Large: 34 classes, 221 methods, ~60 minutes I will have to extract the methods that are today in large or medium but could be in small (typically io.hfile.TestHFileBlock#testBlockHeapSize); this will be done in a second step (and another JIRA). 
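The proposed split can be sketched as a small classification rule. This is an illustration in plain Java, not the JUnit category mechanism the patch uses: the `classify` helper and its parameters are hypothetical, and reading "medium: 45s" as "at most 45 seconds" is an assumption.

```java
public class TestSizeSketch {

    enum Size { SMALL, MEDIUM, LARGE }

    /**
     * Hypothetical helper applying the thresholds from the proposal:
     * small = no cluster and under 15s; medium = not flaky and (assumed)
     * at most 45s; everything else is large.
     */
    static Size classify(boolean needsCluster, boolean flaky, double seconds) {
        if (!needsCluster && seconds < 15.0) {
            return Size.SMALL;
        }
        if (!flaky && seconds <= 45.0) {
            return Size.MEDIUM;
        }
        return Size.LARGE;
    }

    public static void main(String[] args) {
        System.out.println(classify(false, false, 2.0));  // no cluster, fast
        System.out.println(classify(true, false, 30.0));  // cluster, stable
        System.out.println(classify(true, true, 30.0));   // flaky test
        System.out.println(classify(true, false, 60.0));  // too slow for medium
    }
}
```

In the actual proposal the category is assigned by hand with a JUnit category marker on each test class, with Small as the default; a rule like this only captures the stated criteria for choosing that marker.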
MEDIUM LIST (number of methods, time, name)
Methods    Time  Test class
      3  31.468  org.apache.hadoop.hbase.avro.TestAvroServer
      8   4.174  org.apache.hadoop.hbase.catalog.TestCatalogTracker
      1   3.888  org.apache.hadoop.hbase.catalog.TestMetaReaderEditorNoCluster
      5  30.157  org.apache.hadoop.hbase.catalog.TestMetaReaderEditor
      1   0.762  org.apache.hadoop.hbase.client.replication.TestReplicationAdmin
      3  21.961  org.apache.hadoop.hbase.client.TestHCM
     18  26.274  org.apache.hadoop.hbase.client.TestHTablePool
      2  16.997  org.apache.hadoop.hbase.client.TestHTableUtil
      3  24.629  org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD
      1  16.365  org.apache.hadoop.hbase.client.TestMetaScanner
     10  34.077  org.apache.hadoop.hbase.client.TestMultiParallel
      3  27.547  org.apache.hadoop.hbase.client.TestTimestampsFilter
     44  16.834  org.apache.hadoop.hbase.coprocessor.TestAggregateProtocol
      5  31.346  org.apache.hadoop.hbase.coprocessor.TestClassLoading
      2  32.736  org.apache.hadoop.hbase.coprocessor.TestCoprocessorEndpoint
      1  13.874  org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithAbort
      1  16.923  org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithRemove
      3   29.97  org.apache.hadoop.hbase.coprocessor.TestMasterObserver
      2  14.976  org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass
      5  33.353  org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface
      1  16.596  org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort
      1  18.183  org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithRemove
      3  19.373  org.apache.hadoop.hbase.coprocessor.TestWALObserver
      1  19.045  org.apache.hadoop.hbase.filter.TestColumnRangeFilter
      5  24.294  org.apache.hadoop.hbase.io.hfile.slab.TestSingleSizeCache
      7  19.818  org.apache.hadoop.hbase.io.hfile.slab.TestSlabCache
      7  25.226  org.apache.hadoop.hbase.io.hfile.TestHFileBlock
      7   0.343  org.apache.hadoop.hbase.io.hfile.TestLruBlockCache
      8  40.391  org.apache.hadoop.hbase.mapreduce.TestImportTsv
      2   0.724  org.apache.hadoop.hbase.master.TestActiveMasterManager
      1    1.17  org.apache.hadoop.hbase.master.TestHMasterRPCException
      1   2.953  org.apache.hadoop.hbase.master.TestLogsCleaner
      1  18.918  org.apache.hadoop.hbase.master.TestMaster
      2   20.57  org.apache.hadoop.hbase.master.TestOpenedRegionHandler
     10  13.979  org.apache.hadoop.hbase.master.TestSplitLogManager
      3  21.675  org.apache.hadoop.hbase.master.TestZKBasedOpenCloseRegion