[jira] [Commented] (HBASE-6466) Enable multi-thread for memstore flush
[ https://issues.apache.org/jira/browse/HBASE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530777#comment-13530777 ]

Hadoop QA commented on HBASE-6466:
----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12560308/HBASE-6466-v4.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile.

{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 105 warning messages.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:red}-1 findbugs{color}. The patch appears to introduce 23 new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}. The patch passed unit tests in .
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3522//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3522//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3522//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3522//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3522//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3522//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3522//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3522//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3522//console

This message is automatically generated.

Enable multi-thread for memstore flush
--------------------------------------

        Key: HBASE-6466
        URL: https://issues.apache.org/jira/browse/HBASE-6466
    Project: HBase
 Issue Type: Improvement
   Reporter: chunhui shen
   Assignee: chunhui shen
Attachments: HBASE-6466.patch, HBASE-6466v2.patch, HBASE-6466v3.1.patch, HBASE-6466v3.patch, HBASE-6466-v4.patch

If the KV is large or the HLog is closed under high write pressure, we found the memstore is often above the high water mark, blocking puts. So should we enable multi-threaded memstore flush?
Some performance test data for reference:

1. Test environment: random writing; upper memstore limit 5.6GB; lower memstore limit 4.8GB; 400 regions per regionserver; row len = 50 bytes, value len = 1024 bytes; 5 regionservers with 300 ipc handlers each; 5 clients with 50 writer threads each.

2. Test results:
- one cacheFlush handler: tps 7.8k/s per regionserver, flush 10.1MB/s per regionserver; many aboveGlobalMemstoreLimit blocks appear
- two cacheFlush handlers: tps 10.7k/s per regionserver, flush 12.46MB/s per regionserver
- 200 writer threads per client, two cacheFlush handlers: tps 16.1k/s per regionserver, flush 18.6MB/s per regionserver

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
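The proposal under test can be sketched, outside HBase, as a small pool of flush handler threads draining a shared queue, which is why adding a second handler raises drain throughput. All names here (MultiThreadedFlusher, requestFlush, globalMemstoreSize) are illustrative stand-ins, not HBase's actual flush code:

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicLong;

/**
 * Sketch: several cacheFlush handler threads pull flush requests off a
 * shared queue instead of a single handler draining the global memstore.
 */
public class MultiThreadedFlusher {
    private final BlockingQueue<Long> flushQueue = new LinkedBlockingQueue<>();
    private final AtomicLong globalMemstoreSize = new AtomicLong();
    private final ExecutorService handlers;

    public MultiThreadedFlusher(int handlerCount) {
        handlers = Executors.newFixedThreadPool(handlerCount);
        for (int i = 0; i < handlerCount; i++) {
            handlers.submit(() -> {
                try {
                    while (true) {
                        long bytes = flushQueue.take();       // next region memstore to flush
                        if (bytes < 0) return;                // poison pill: shut down
                        globalMemstoreSize.addAndGet(-bytes); // "flush" releases the memory
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
    }

    /** The put path would call this when a region's memstore crosses its flush size. */
    public void requestFlush(long memstoreBytes) {
        globalMemstoreSize.addAndGet(memstoreBytes);
        flushQueue.add(memstoreBytes);
    }

    public long pendingBytes() { return globalMemstoreSize.get(); }

    public void shutdown(int handlerCount) throws InterruptedException {
        for (int i = 0; i < handlerCount; i++) flushQueue.add(-1L); // one pill per handler
        handlers.shutdown();
        handlers.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```

Because the queue is FIFO, the poison pills are consumed only after all pending flush requests have been drained, so shutdown leaves no pending bytes behind.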
[jira] [Commented] (HBASE-4676) Prefix Compression - Trie data block encoding
[ https://issues.apache.org/jira/browse/HBASE-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530796#comment-13530796 ]

Hadoop QA commented on HBASE-4676:
----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12560147/HBASE-4676-prefix-tree-trunk-v8.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 129 new or modified tests.

{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile.

{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 116 warning messages.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:red}-1 findbugs{color}. The patch appears to introduce 62 new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}. The patch passed unit tests in .
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3525//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3525//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3525//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3525//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3525//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3525//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3525//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3525//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3525//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3525//console
Prefix Compression - Trie data block encoding
---------------------------------------------

             Key: HBASE-4676
             URL: https://issues.apache.org/jira/browse/HBASE-4676
         Project: HBase
      Issue Type: New Feature
      Components: io, Performance, regionserver
Affects Versions: 0.96.0
        Reporter: Matt Corgan
        Assignee: Matt Corgan
     Attachments: HBASE-4676-0.94-v1.patch, HBASE-4676-common-and-server-v8.patch, HBASE-4676-prefix-tree-trunk-v1.patch, HBASE-4676-prefix-tree-trunk-v2.patch, HBASE-4676-prefix-tree-trunk-v3.patch, HBASE-4676-prefix-tree-trunk-v4.patch, HBASE-4676-prefix-tree-trunk-v5.patch, HBASE-4676-prefix-tree-trunk-v6.patch, HBASE-4676-prefix-tree-trunk-v7.patch, HBASE-4676-prefix-tree-trunk-v8.patch, hbase-prefix-trie-0.1.jar, PrefixTrie_Format_v1.pdf, PrefixTrie_Performance_v1.pdf, SeeksPerSec by blockSize.png

The HBase data block format has room for two significant improvements for applications with high block cache hit ratios.

First, there is no prefix compression, and the current KeyValue format is somewhat metadata heavy, so there can be tremendous memory bloat for many common data layouts, specifically those with long keys and short values.

Second, there is no random access to KeyValues inside data blocks. This means that every time you double the data block size, the average seek time (or average cpu consumption) doubles. The standard 64KB block size is ~10x slower for random seeks than a 4KB block size, but block sizes as small as 4KB cause problems elsewhere. Block sizes of 256KB, 1MB, or more may be more efficient from a disk access and block-cache perspective in many big-data applications, but are infeasible from a random seek perspective.

The PrefixTrie block encoding format attempts to solve both of these problems. Some features:
* the trie format for row key encoding completely eliminates duplicate row keys and encodes similar row keys into a standard trie structure, which also saves a lot of space
* the column family is currently stored once at the beginning of each block; this could easily be modified to allow multiple family names per block
* all
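A toy illustration of the first bullet: a trie stores each shared row-key prefix exactly once, so similar keys cost only their distinct suffixes. This is a character-level sketch for intuition, not the actual PrefixTrie block format:

```java
import java.util.Map;
import java.util.TreeMap;

/**
 * Counts characters materialized by a trie versus the flat KeyValue layout,
 * where every row key is stored in full.
 */
public class PrefixTrieSketch {
    static class Node {
        Map<Character, Node> children = new TreeMap<>();
    }

    private final Node root = new Node();
    private int storedChars = 0;   // characters materialized in the trie
    private int rawChars = 0;      // characters in the flat layout

    public void add(String rowKey) {
        rawChars += rowKey.length();
        Node cur = root;
        for (char c : rowKey.toCharArray()) {
            Node next = cur.children.get(c);
            if (next == null) {
                next = new Node();
                cur.children.put(c, next);
                storedChars++;     // each edge character is stored exactly once
            }
            cur = next;
        }
    }

    public int storedChars() { return storedChars; }
    public int rawChars() { return rawChars; }
}
```

Three 14-character keys sharing the prefix "user123" flatten to 42 raw characters but only 22 trie characters; the savings grow with key similarity and block size.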
[jira] [Commented] (HBASE-1212) merge tool expects regions all have different sequence ids
[ https://issues.apache.org/jira/browse/HBASE-1212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530802#comment-13530802 ]

Hadoop QA commented on HBASE-1212:
----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12556386/HBASE-1212-v2.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile.

{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 104 warning messages.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:red}-1 findbugs{color}. The patch appears to introduce 23 new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:red}-1 core tests{color}.
The patch failed these unit tests:

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3526//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3526//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3526//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3526//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3526//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3526//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3526//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3526//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3526//console

merge tool expects regions all have different sequence ids
----------------------------------------------------------

        Key: HBASE-1212
        URL: https://issues.apache.org/jira/browse/HBASE-1212
    Project: HBase
 Issue Type: Bug
   Reporter: stack
   Assignee: Jean-Marc Spaggiari
Attachments: failure.log, HBASE-1212.patch, HBASE-1212-v2.patch

Currently, when merging two regions, the merge tool compares their sequence ids; if they are the same, it decrements one. It needs to do this because on region open, files are keyed by their sequence id; if two are the same, one will erase the other. With the move to the aggregating hfile format, though, the sequence id is written when the file is created, and it is no longer kept in a side file but as metadata at the end of the file. Changing the sequence id is no longer an option. This issue is about figuring out a solution for the rare case where two store files have the same sequence id AND we want to merge the two regions.
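The erase-on-open hazard described above can be reproduced with a plain map; this is an illustration of the collision, not the merge tool's actual code:

```java
import java.util.Map;
import java.util.TreeMap;

/**
 * If store files are keyed by sequence id on region open, two files with the
 * same id cannot coexist: the second insert silently replaces the first.
 */
public class SeqIdCollision {
    public static Map<Long, String> openFiles(long[] seqIds, String[] fileNames) {
        Map<Long, String> bySeqId = new TreeMap<>();
        for (int i = 0; i < seqIds.length; i++) {
            bySeqId.put(seqIds[i], fileNames[i]); // same key => earlier file erased
        }
        return bySeqId;
    }
}
```

Opening two files that both carry sequence id 7 leaves only one entry in the map, which is exactly why the merge tool historically decremented one of the ids.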
[jira] [Commented] (HBASE-7205) Coprocessor classloader is replicated for all regions in the HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-7205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530904#comment-13530904 ]

Adrian Muraru commented on HBASE-7205:
--------------------------------------

I'm 100% sure it works - but I wouldn't ever swear for a piece of software :D

Coprocessor classloader is replicated for all regions in the HRegionServer
--------------------------------------------------------------------------

             Key: HBASE-7205
             URL: https://issues.apache.org/jira/browse/HBASE-7205
         Project: HBase
      Issue Type: Bug
      Components: Coprocessors
Affects Versions: 0.92.2, 0.94.2
        Reporter: Adrian Muraru
        Assignee: Ted Yu
        Priority: Critical
         Fix For: 0.96.0, 0.94.4
     Attachments: 7205-0.94.txt, 7205-v10.txt, 7205-v1.txt, 7205-v3.txt, 7205-v4.txt, 7205-v5.txt, 7205-v6.txt, 7205-v7.txt, 7205-v8.txt, 7205-v9.txt, HBASE-7205_v2.patch

HBASE-6308 introduced a new custom CoprocessorClassLoader to load the coprocessor classes, and a new instance of this CL is created for each single HRegion opened. This leads to PermGen OOMEs when the number of regions goes above hundreds per region server. Having the table coprocessor jailed in a separate classloader is good; however, we should create only one for all regions of a table in each HRS.
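The fix direction ("only one classloader for all regions of a table in each HRS") amounts to caching loaders in a shared map keyed by the coprocessor jar. A minimal sketch, with ClassLoaderStub standing in for the real CoprocessorClassLoader and the jar path purely illustrative:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Shares one coprocessor classloader per jar path per region server instead
 * of creating a fresh one for every opened region.
 */
public class ClassLoaderCache {
    static class ClassLoaderStub {
        static final AtomicInteger created = new AtomicInteger();
        ClassLoaderStub(String jarPath) { created.incrementAndGet(); }
    }

    private final ConcurrentMap<String, ClassLoaderStub> cache = new ConcurrentHashMap<>();

    /** Every region opening the same jar gets the same loader instance. */
    public ClassLoaderStub loaderFor(String jarPath) {
        ClassLoaderStub loader = cache.get(jarPath);
        if (loader == null) {
            ClassLoaderStub fresh = new ClassLoaderStub(jarPath);
            loader = cache.putIfAbsent(jarPath, fresh); // atomic: first writer wins
            if (loader == null) loader = fresh;
        }
        return loader;
    }
}
```

With 400 regions of one table opening the same jar, this creates one loader instead of 400, which is what keeps PermGen usage flat.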
[jira] [Commented] (HBASE-7243) Test for creating a large number of regions
[ https://issues.apache.org/jira/browse/HBASE-7243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530933#comment-13530933 ]

Hudson commented on HBASE-7243:
-------------------------------

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #294 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/294/])
HBASE-7243 Test for creating a large number of regions (Revision 1421039)

Result = FAILURE
stack :
Files :
* /hbase/trunk/hbase-it/src/test/java/org/apache/hadoop/hbase/IntegrationTestManyRegions.java

Test for creating a large number of regions
-------------------------------------------

        Key: HBASE-7243
        URL: https://issues.apache.org/jira/browse/HBASE-7243
    Project: HBase
 Issue Type: Bug
 Components: Region Assignment, regionserver, test
   Reporter: Enis Soztutar
   Assignee: Nick Dimiduk
     Labels: noob
    Fix For: 0.96.0
Attachments: 7243-integration-test-many-splits.diff, 7243-integration-test-many-splits.diff, 7243-integration-test-many-splits.diff

After HBASE-7220, I think it will be good to write a unit test/IT to create a large number of regions. We can put a reasonable timeout on the test.
[jira] [Commented] (HBASE-7211) Improve hbase ref guide for the testing part.
[ https://issues.apache.org/jira/browse/HBASE-7211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531017#comment-13531017 ]

nkeywal commented on HBASE-7211:
--------------------------------

Committed. Thanks, Jeffrey! I removed as well a reference to localTests in zookeeper.xml. Our last issue (SUREFIRE-800) is now tagged as fixed in the next (2.13) surefire release, so I'm going to be optimistic and wait for it before updating the remaining part...

Improve hbase ref guide for the testing part.
---------------------------------------------

        Key: HBASE-7211
        URL: https://issues.apache.org/jira/browse/HBASE-7211
    Project: HBase
 Issue Type: Bug
 Components: documentation
Affects Versions: 0.96.0
   Reporter: nkeywal
   Assignee: nkeywal
   Priority: Minor
Attachments: hbase-7211-partial.patch

Here is some stuff I saw. I will propose a fix in a week or so; please add the comments or issues you have in mind.

??15.6.1. Apache HBase Modules??
= We should be able to use categories in all modules. The default should be small; but any test manipulating the time needs to be in a specific jvm (hence medium), so it's not always related to the minicluster.

??15.6.3.6. hbasetests.sh??
= We can remove this chapter, and the script. The script is not totally useless, but I think nobody actually uses it.

= Add a chapter on flakiness. Some tests are, unfortunately, flaky. While their number decreases, we still have some. Rules are:
- don't write flaky tests! :-)
- small tests cannot be flaky, as that blocks other test execution. Corollary: if you have an issue with a small test, it's either your environment or a severe issue.
- rerun the test a few times to validate; check the ports and file descriptors used.

??mvn test -P localTests -Dtest=MyTest??
= We could actually activate the localTests profile whenever -Dtest is used. If we do that, we can remove the reference to localTests from the doc.

??mvn test -P runSmallTests??
??mvn test -P runMediumTests??
= I'm not sure these are actually used. We could remove them from the pom.xml (and the doc).

??The HBase build uses a patched version of the maven surefire plugin??
= Hopefully, we will be able to remove this soon :-)

??Integration tests are described TODO: POINTER_TO_INTEGRATION_TEST_SECTION??
= Should be documented
[jira] [Commented] (HBASE-6651) Improve thread safety of HTablePool
[ https://issues.apache.org/jira/browse/HBASE-6651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531019#comment-13531019 ]

Hadoop QA commented on HBASE-6651:
----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12560716/HBASE-6651-V10.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 19 new or modified tests.

{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile.

{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 114 warning messages.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:red}-1 findbugs{color}. The patch appears to introduce 20 new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}. The patch passed unit tests in .
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3527//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3527//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3527//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3527//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3527//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3527//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3527//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3527//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3527//console

Improve thread safety of HTablePool
-----------------------------------

        Key: HBASE-6651
        URL: https://issues.apache.org/jira/browse/HBASE-6651
    Project: HBase
 Issue Type: Bug
 Components: Client
Affects Versions: 0.94.1
   Reporter: Hiroshi Ikeda
   Assignee: Hiroshi Ikeda
    Fix For: 0.96.0
Attachments: HBASE-6651.patch, HBASE-6651-V10.patch, HBASE-6651-V2.patch, HBASE-6651-V3.patch, HBASE-6651-V4.patch, HBASE-6651-V5.patch, HBASE-6651-V6.patch, HBASE-6651-V7.patch, HBASE-6651-V8.patch, HBASE-6651-V9.patch, sample.zip, sample.zip, sharedmap_for_hbaseclient.zip

Some operations in HTablePool access PoolMap in multiple places without any explicit synchronization. For example, HTablePool.closeTablePool() calls PoolMap.values() and then calls PoolMap.remove(). If other threads add new instances to the pool in the middle of the calls, the newly added instances might be dropped. (HTablePool.closeTablePool() also has another problem: calling it from multiple threads causes the same HTable to be accessed by multiple threads.) Moreover, PoolMap is not thread safe for the same reason. For example, PoolMap.put() calls ConcurrentMap.get() and then calls ConcurrentMap.put(). If another thread adds a new instance to the concurrent map in the middle of the calls, the new instance might be dropped. The implementations of Pool have the same problems.
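The get()-then-put() window described above is a classic check-then-act race; the sketch below mirrors the unsafe pattern and the atomic putIfAbsent() alternative. Method and class names are illustrative, not the real PoolMap API:

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CopyOnWriteArrayList;

/**
 * unsafePut() performs the check and the act as two separate steps; if two
 * threads interleave between them, one thread's freshly created pool (and any
 * instance already added to it) is silently replaced. atomicPut() closes the
 * window with a single atomic putIfAbsent().
 */
public class PoolMapRace {
    private final ConcurrentMap<String, List<String>> pools = new ConcurrentHashMap<>();

    public void unsafePut(String key, String resource) {
        List<String> pool = pools.get(key);   // step 1: check
        if (pool == null) {
            pool = new CopyOnWriteArrayList<>();
            pools.put(key, pool);             // step 2: act -- may overwrite another
        }                                     // thread's pool created in between
        pool.add(resource);
    }

    public void atomicPut(String key, String resource) {
        List<String> fresh = new CopyOnWriteArrayList<>();
        List<String> pool = pools.putIfAbsent(key, fresh); // null if we won the race
        (pool != null ? pool : fresh).add(resource);
    }

    public int poolSize(String key) {
        List<String> pool = pools.get(key);
        return pool == null ? 0 : pool.size();
    }
}
```

Using a ConcurrentMap alone is not enough: the individual operations are atomic, but the compound get-then-put sequence is not, which is exactly the point of the issue.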
[jira] [Commented] (HBASE-6466) Enable multi-thread for memstore flush
[ https://issues.apache.org/jira/browse/HBASE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531051#comment-13531051 ]

Ted Yu commented on HBASE-6466:
-------------------------------

+1 on patch v4.
[jira] [Commented] (HBASE-7340) Allow user-specified actions following region movement
[ https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531052#comment-13531052 ]

Ted Yu commented on HBASE-7340:
-------------------------------

For HMaster.balance(), we first unassign the regions. By the time of the postBalance() call, the regions may not have had time to reach their destinations. I think in this case postBalance() just provides a hint as to the final destination.

Allow user-specified actions following region movement
------------------------------------------------------

        Key: HBASE-7340
        URL: https://issues.apache.org/jira/browse/HBASE-7340
    Project: HBase
 Issue Type: Bug
   Reporter: Ted Yu

Sometimes a user performs compaction after a region is moved (by the balancer). We should provide a 'hook' which lets the user specify what follow-on actions to take after region movement. See the discussion on the user mailing list under the thread 'How to know it's time for a major compaction?' for background information: http://search-hadoop.com/m/BDx4S1jMjF92subj=How+to+know+it+s+time+for+a+major+compaction+
[jira] [Commented] (HBASE-6651) Improve thread safety of HTablePool
[ https://issues.apache.org/jira/browse/HBASE-6651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531053#comment-13531053 ]

Ted Yu commented on HBASE-6651:
-------------------------------

bq. HTablePool is annotated as stable.

+1 on keeping HTablePool in 0.96
[jira] [Created] (HBASE-7344) subprocedure initialization fails with invalid znode data.
Jonathan Hsieh created HBASE-7344:
-------------------------------------

    Summary: subprocedure initialization fails with invalid znode data.
        Key: HBASE-7344
        URL: https://issues.apache.org/jira/browse/HBASE-7344
    Project: HBase
 Issue Type: Sub-task
   Reporter: Jonathan Hsieh

Sometimes snapshot subprocedures fail to start on a RS because the data read from ZK is bad.

{code}
2012-12-13 07:22:55,238 ERROR org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs: Illegal argument exception
java.lang.IllegalArgumentException: Could not read snapshot information from request.
        at org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager$SnapshotSubprocedureBuilder.buildSubprocedure(RegionServerSnapshotManager.java:284)
        at org.apache.hadoop.hbase.procedure.ProcedureMember.createSubprocedure(ProcedureMember.java:98)
        at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.startNewSubprocedure(ZKProcedureMemberRpcs.java:199)
        at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.waitForNewProcedures(ZKProcedureMemberRpcs.java:167)
        at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.access$1(ZKProcedureMemberRpcs.java:150)
        at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs$1.nodeChildrenChanged(ZKProcedureMemberRpcs.java:106)
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:303)
        at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
        at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
2012-12-13 07:22:55,239 ERROR org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs: Failed due to null subprocedure
Local ForeignThreadException from null
        at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.startNewSubprocedure(ZKProcedureMemberRpcs.java:203)
        at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.waitForNewProcedures(ZKProcedureMemberRpcs.java:167)
        at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.access$1(ZKProcedureMemberRpcs.java:150)
        at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs$1.nodeChildrenChanged(ZKProcedureMemberRpcs.java:106)
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:303)
        at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
        at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
Caused by: java.lang.IllegalArgumentException: Could not read snapshot information from request.
        at org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager$SnapshotSubprocedureBuilder.buildSubprocedure(RegionServerSnapshotManager.java:284)
        at org.apache.hadoop.hbase.procedure.ProcedureMember.createSubprocedure(ProcedureMember.java:98)
        at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.startNewSubprocedure(ZKProcedureMemberRpcs.java:199)
        ... 6 more
2012-12-13 07:22:55,239 ERROR org.apache.zookeeper.ClientCnxn: Error while calling watcher
java.lang.NullPointerException
        at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.sendMemberAborted(ZKProcedureMemberRpcs.java:266)
        at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.startNewSubprocedure(ZKProcedureMemberRpcs.java:203)
        at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.waitForNewProcedures(ZKProcedureMemberRpcs.java:167)
        at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.access$1(ZKProcedureMemberRpcs.java:150)
        at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs$1.nodeChildrenChanged(ZKProcedureMemberRpcs.java:106)
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:303)
        at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
        at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
{code}
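The NullPointerException at the bottom of the trace arises because the error path dereferences the null subprocedure when reporting the abort. A minimal, HBase-free sketch of the defensive guard (all class and method names here are hypothetical stand-ins for the real ones in the trace):

```java
/**
 * When the znode payload cannot be parsed, createSubprocedure() returns null;
 * the abort must then be reported by procedure name, never via the null object.
 */
public class SubprocedureStart {
    static class Subprocedure {
        final String name;
        Subprocedure(String name) { this.name = name; }
    }

    /** Returns null when the znode payload is unreadable, as in the log above. */
    static Subprocedure createSubprocedure(byte[] znodeData) {
        if (znodeData == null || znodeData.length == 0) return null;
        return new Subprocedure(new String(znodeData));
    }

    /** Guarded start: the failure path never touches the null subprocedure. */
    static String startNewSubprocedure(String procName, byte[] znodeData) {
        Subprocedure sub = createSubprocedure(znodeData);
        if (sub == null) {
            return "aborted:" + procName; // abort by name, avoiding the NPE
        }
        return "started:" + sub.name;
    }
}
```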
[jira] [Commented] (HBASE-7211) Improve hbase ref guide for the testing part.
[ https://issues.apache.org/jira/browse/HBASE-7211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531107#comment-13531107 ] Hudson commented on HBASE-7211: --- Integrated in HBase-TRUNK #3617 (See [https://builds.apache.org/job/HBase-TRUNK/3617/]) HBASE-7211 Improve hbase ref guide for the testing part. - 1st part (Jeffrey Zhong) (Revision 1421295) Result = FAILURE nkeywal : Files : * /hbase/trunk/pom.xml * /hbase/trunk/src/docbkx/developer.xml * /hbase/trunk/src/docbkx/zookeeper.xml Improve hbase ref guide for the testing part. - Key: HBASE-7211 URL: https://issues.apache.org/jira/browse/HBASE-7211 Project: HBase Issue Type: Bug Components: documentation Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: hbase-7211-partial.patch Here is some stuff I saw. I will propose a fix in a week or so, please add the comment or issues you have in mind. ??15.6.1. Apache HBase Modules?? = We should be able to use categories in all modules. The default should be small; but any test manipulating the time needs to be in a specific jvm (hence medium), so it's not always related to minicluster. ??15.6.3.6. hbasetests.sh?? = We can remove this chapter, and the script The script is not totally useless, but I think nobody actually uses it. = Add a chapter on flakiness. Some tests are, unfortunately, flaky. While there number decreases, we still have some. Rules are: - don't write flaky tests! :-) - small tests cannot be flaky, as it blocks other test execution. Corollary: if you have an issue with a small test, it's either your environment either a severe issue. - rerun the test a few time to validate, check the ports and file descriptors used. ??mvn test -P localTests -Dtest=MyTest?? = We could actually activate the localTests profile whenever -Dtest is used. If we do that, we can remove the reference from localTests in the doc. ??mvn test -P runSmallTests?? ??mvn test -P runMediumTests?? 
= I'm not sure they're actually used. We could remove them from the pom.xml (and the doc). ??The HBase build uses a patched version of the maven surefire plugin?? = Hopefully, we will be able to remove this soon :-) ??Integration tests are described TODO: POINTER_TO_INTEGRATION_TEST_SECTION?? = Should be documented -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7344) subprocedure initialization fails with invalid znode data.
[ https://issues.apache.org/jira/browse/HBASE-7344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531109#comment-13531109 ] Jonathan Hsieh commented on HBASE-7344: --- This makes taking a snapshot impossible until the timeout occurs. subprocedure initialization fails with invalid znode data. -- Key: HBASE-7344 URL: https://issues.apache.org/jira/browse/HBASE-7344 Project: HBase Issue Type: Sub-task Reporter: Jonathan Hsieh Sometimes snapshot subprocedures fail to start on a RS because the data read from ZK is bad. {code}
2012-12-13 07:22:55,238 ERROR org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs: Illegal argument exception
java.lang.IllegalArgumentException: Could not read snapshot information from request.
	at org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager$SnapshotSubprocedureBuilder.buildSubprocedure(RegionServerSnapshotManager.java:284)
	at org.apache.hadoop.hbase.procedure.ProcedureMember.createSubprocedure(ProcedureMember.java:98)
	at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.startNewSubprocedure(ZKProcedureMemberRpcs.java:199)
	at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.waitForNewProcedures(ZKProcedureMemberRpcs.java:167)
	at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.access$1(ZKProcedureMemberRpcs.java:150)
	at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs$1.nodeChildrenChanged(ZKProcedureMemberRpcs.java:106)
	at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:303)
	at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
	at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
2012-12-13 07:22:55,239 ERROR org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs: Failed due to null subprocedure
Local ForeignThreadException from null
	at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.startNewSubprocedure(ZKProcedureMemberRpcs.java:203)
	at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.waitForNewProcedures(ZKProcedureMemberRpcs.java:167)
	at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.access$1(ZKProcedureMemberRpcs.java:150)
	at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs$1.nodeChildrenChanged(ZKProcedureMemberRpcs.java:106)
	at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:303)
	at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
	at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
Caused by: java.lang.IllegalArgumentException: Could not read snapshot information from request.
	at org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager$SnapshotSubprocedureBuilder.buildSubprocedure(RegionServerSnapshotManager.java:284)
	at org.apache.hadoop.hbase.procedure.ProcedureMember.createSubprocedure(ProcedureMember.java:98)
	at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.startNewSubprocedure(ZKProcedureMemberRpcs.java:199)
	... 6 more
2012-12-13 07:22:55,239 ERROR org.apache.zookeeper.ClientCnxn: Error while calling watcher
java.lang.NullPointerException
	at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.sendMemberAborted(ZKProcedureMemberRpcs.java:266)
	at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.startNewSubprocedure(ZKProcedureMemberRpcs.java:203)
	at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.waitForNewProcedures(ZKProcedureMemberRpcs.java:167)
	at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.access$1(ZKProcedureMemberRpcs.java:150)
	at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs$1.nodeChildrenChanged(ZKProcedureMemberRpcs.java:106)
	at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:303)
	at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
	at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
{code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
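The final NullPointerException in the trace above comes from sendMemberAborted being reached after the subprocedure builder has already failed. A minimal sketch of guarding that dispatch path might look like this (class, interface, and method names here are invented stand-ins for illustration, not the actual ZKProcedureMemberRpcs API):

```java
// Sketch: report a builder failure explicitly instead of letting the
// abort path dereference state that was never initialized.
public class DispatchDemo {
    interface Subprocedure { void start(); }

    static String startNewSubprocedure(Subprocedure sub, String opName) {
        if (sub == null) {
            // Builder failed (e.g. bad znode data): surface the error for
            // the named operation rather than hitting an NPE later.
            return "abort:" + opName;
        }
        sub.start();
        return "started:" + opName;
    }

    public static void main(String[] args) {
        System.out.println(startNewSubprocedure(null, "snapshot"));
    }
}
```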
[jira] [Created] (HBASE-7345) subprocedure zk info should be dumpable from the shell
Jonathan Hsieh created HBASE-7345: - Summary: subprocedure zk info should be dumpable from the shell Key: HBASE-7345 URL: https://issues.apache.org/jira/browse/HBASE-7345 Project: HBase Issue Type: Sub-task Affects Versions: hbase-6055 Reporter: Jonathan Hsieh For debugging by admins, we should include the ability to dump subprocedure information either as part of the hbase shell's zk_dump or via some new command. It should include all the status of the different procedure portions and include timestamp information. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7290) Online snapshots
[ https://issues.apache.org/jira/browse/HBASE-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531133#comment-13531133 ] Jonathan Hsieh commented on HBASE-7290: --- When I say 'restore' in step 4, I am specifically referring to a 'snapshot restore' operation. It doesn't necessarily have to happen after a crash. Suppose a user creates data, snapshots it, then puts bad data in. The user wants the old data back, so he restores the snapshot (there is no crash anywhere here). While what you suggest is true (they could restore again), this is unexpected, even in our weaker consistency model. If we have a table, have a crash, and come back, we expect the table to be essentially the same as before (without tons of new data potentially appearing). Online snapshots - Key: HBASE-7290 URL: https://issues.apache.org/jira/browse/HBASE-7290 Project: HBase Issue Type: Bug Reporter: Jonathan Hsieh HBASE-6055 will be closed when the offline snapshots pieces get merged with trunk. This umbrella issue has all the online snapshot specific patches. This will get merged once one of the implementations makes it into trunk. Other flavors of online snapshots can then be done as normal patches instead of on a development branch. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7290) Online snapshots
[ https://issues.apache.org/jira/browse/HBASE-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531135#comment-13531135 ] Jonathan Hsieh commented on HBASE-7290: --- Let's call this problem the 'snapshot replay problem', and move discussion to a new issue. Online snapshots - Key: HBASE-7290 URL: https://issues.apache.org/jira/browse/HBASE-7290 Project: HBase Issue Type: Bug Reporter: Jonathan Hsieh
[jira] [Work started] (HBASE-7339) Splitting a hfilelink causes region servers to go down.
[ https://issues.apache.org/jira/browse/HBASE-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HBASE-7339 started by Jonathan Hsieh. Splitting a hfilelink causes region servers to go down. --- Key: HBASE-7339 URL: https://issues.apache.org/jira/browse/HBASE-7339 Project: HBase Issue Type: Sub-task Components: snapshots Affects Versions: hbase-6055 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Priority: Blocker Fix For: hbase-6055 Steps: - Have a single-region table t with 15 hfiles in it. - Snapshot it. (was done using online snapshot from HBASE-7321) - Clone the snapshot to table t'. - t' has its region do a post-open task that attempts to compact the region; the policy does not compact all files (the default seems to be 10). - After compaction we have hfile links and real hfiles mixed in the region. - t' starts splitting - creating split references and opening daughters fails. - hfile links are split, creating hfile link daughter refs: {{hfile\-region\-table.parentregion}} - these split hfile links are interpreted as hfile links with table {{table.parentregion}} - {{hfile\-region\-table.parentregion}} (groupings interpreted incorrectly) - Since this is after the splitting PONR, this aborts the server. It then spreads to the next server. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-7346) Restored snapshot replay problem
Jonathan Hsieh created HBASE-7346: - Summary: Restored snapshot replay problem Key: HBASE-7346 URL: https://issues.apache.org/jira/browse/HBASE-7346 Project: HBase Issue Type: Sub-task Affects Versions: hbase-6055 Reporter: Jonathan Hsieh Priority: Critical Fix For: hbase-6055 The situation is a coarse-grained problem. The key problem is that writes that shouldn't be replayed (since they don't belong to the restored image) would not normally get replayed, but would potentially get replayed if recovery was triggered. Previously, without restore, we could depend on the timestamps – if something was replayed but there was newer data, the newer data would win. In a restore situation, the current data has the old timestamps from before the restore, so data that shouldn't get replayed could be. ex: 1) write 100 rows 2) ss1 (with logs) 3) write 50 rows 4) restore ss1 5) crash 6) writes from 1 and 3 both get replayed in log splitting recovery. Oops. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
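The six-step example above can be sketched as a toy "newest timestamp wins" model (all names are invented for illustration; this is not HBase code):

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of timestamp-based conflict resolution during replay:
// each row maps to {value, timestamp}, and a write only lands if its
// timestamp is at least as new as the current cell's.
public class ReplayDemo {
    static void apply(Map<String, long[]> store, String row, long value, long ts) {
        long[] cur = store.get(row);
        if (cur == null || ts >= cur[1]) {
            store.put(row, new long[] { value, ts });
        }
    }

    public static void main(String[] args) {
        Map<String, long[]> store = new HashMap<>();
        apply(store, "r1", 100, 10);   // step 1: original write, ts=10
        // step 2: snapshot ss1 taken here (captures value 100, ts=10)
        apply(store, "r1", 50, 20);    // step 3: later (bad) write, ts=20
        // step 4: restore ss1 -- the table now holds the old cell again
        store.clear();
        apply(store, "r1", 100, 10);
        // steps 5-6: crash + log splitting replays the step-3 edit; its
        // timestamp (20) is newer than the restored cell's (10), so the
        // write the restore was meant to discard wins again.
        apply(store, "r1", 50, 20);
        System.out.println(store.get("r1")[0]); // the restored 100 is lost
    }
}
```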
[jira] [Commented] (HBASE-7290) Online snapshots
[ https://issues.apache.org/jira/browse/HBASE-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531137#comment-13531137 ] Jonathan Hsieh commented on HBASE-7290: --- Restored snapshot replay problem: HBASE-7346. I've filed it as part of offline snapshots since it affects that case as well. Online snapshots - Key: HBASE-7290 URL: https://issues.apache.org/jira/browse/HBASE-7290 Project: HBase Issue Type: Bug Reporter: Jonathan Hsieh
[jira] [Commented] (HBASE-7346) Restored snapshot replay problem
[ https://issues.apache.org/jira/browse/HBASE-7346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531139#comment-13531139 ] Jonathan Hsieh commented on HBASE-7346: --- It was suggested by [~mbertozzi] that we could probably add a restore timestamp somewhere to a restored snapshot and have log splitting not replay data from before that timestamp in that case. Since we require a disable before restore, this may avoid ragged restore edges due to having many RSs with different timestamps. Restored snapshot replay problem Key: HBASE-7346 URL: https://issues.apache.org/jira/browse/HBASE-7346 Project: HBase Issue Type: Sub-task Components: Client, master, regionserver, snapshots, Zookeeper Affects Versions: hbase-6055 Reporter: Jonathan Hsieh Priority: Critical Fix For: hbase-6055
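The restore-timestamp idea in the comment above could be sketched as a simple filter applied while log splitting decides whether to replay an edit, assuming each edit carries a write timestamp (the names below are invented for illustration, not an actual HBase API):

```java
// Sketch: record the time at which the snapshot was restored, and skip
// any log edit written before it -- those edits belong to the history
// the restore discarded.
public class RestoreFilterDemo {
    static boolean shouldReplay(long editTs, long restoreTs) {
        // Edits from before the restore must not be replayed onto the
        // restored image; edits after it are legitimate new writes.
        return editTs >= restoreTs;
    }

    public static void main(String[] args) {
        long restoreTs = 30;
        System.out.println(shouldReplay(20, restoreTs)); // pre-restore edit: skip
        System.out.println(shouldReplay(40, restoreTs)); // post-restore edit: replay
    }
}
```

Since restore requires disabling the table, a single restore timestamp per table may be enough, which is what makes the ragged-edge concern about many RSs with different clocks tractable.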
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531162#comment-13531162 ] ramkrishna.s.vasudevan commented on HBASE-7342: --- It tries to find the mid from the blockkey indices, right? As the offset starts from 0, this should be fine, right? Split operation without split key incorrectly finds the middle key in off-by-one error -- Key: HBASE-7342 URL: https://issues.apache.org/jira/browse/HBASE-7342 Project: HBase Issue Type: Bug Components: HFile, io Affects Versions: 0.94.1, 0.94.2, 0.94.3, 0.96.0 Reporter: Aleksandr Shulman Assignee: Aleksandr Shulman Priority: Minor Fix For: 0.96.0, 0.94.4 Attachments: HBASE-7342-v1.patch, HBASE-7342-v2.patch I took a deeper look into the issues I was having using region splitting when specifying a region (but not a key for splitting). The midkey calculation is off by one and, when there are 2 rows, will pick the 0th one. This causes the firstkey to be the same as midkey and the split will fail. Removing the -1 causes it to work correctly, as per the test I've added. Looking into the code, here is what goes on: 1. Split takes the largest storefile 2. It puts all the keys into a 2-dimensional array called blockKeys[][]. Key i resides as blockKeys[i] 3. Getting the middle root-level index should yield the key in the middle of the storefile 4. In step 3, we see that there is a possibly erroneous (-1) to adjust for the 0-offset indexing. 5. In a case where there are only 2 blockKeys, this yields the 0th block key. 6. Unfortunately, this is the same block key that 'firstKey' will be. 7. This yields the result in HStore.java:1873 (cannot split because midkey is the same as first or last row) 8. Removing the -1 solves the problem (in this case). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
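The off-by-one in steps 4-8 can be sketched in isolation (the method names are invented for illustration; this is not the actual HFile block index code):

```java
// Sketch of the midkey index calculation described in the report above.
public class MidKeyDemo {
    // Buggy variant: subtracts 1 to "adjust" for zero-based indexing.
    static int buggyMidIndex(int numBlockKeys) {
        return (numBlockKeys - 1) / 2;
    }

    // Fixed variant: plain midpoint. With 2 block keys this picks index 1,
    // which differs from the first key, so the split can proceed.
    static int fixedMidIndex(int numBlockKeys) {
        return numBlockKeys / 2;
    }

    public static void main(String[] args) {
        // With only 2 block keys the buggy version returns index 0 -- the
        // same index as the first key, so midkey == firstKey and the split
        // is rejected (the HStore.java:1873 check in the report).
        System.out.println(buggyMidIndex(2)); // 0
        System.out.println(fixedMidIndex(2)); // 1
    }
}
```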
[jira] [Updated] (HBASE-7119) org.apache.hadoop.hbase.io.TestHeapSize failed with testNativeSizes unit test
[ https://issues.apache.org/jira/browse/HBASE-7119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Ping Zhang updated HBASE-7119: - Status: Patch Available (was: In Progress) submit patch org.apache.hadoop.hbase.io.TestHeapSize failed with testNativeSizes unit test - Key: HBASE-7119 URL: https://issues.apache.org/jira/browse/HBASE-7119 Project: HBase Issue Type: Sub-task Affects Versions: 0.94.0, 0.92.0, 0.90.5, 0.90.4 Environment: RHEL 5.3, open JDK 1.6 Reporter: Li Ping Zhang Assignee: Li Ping Zhang Labels: patch Attachments: HBASE-7119-0.94.patch, HBASE-7119-0.94-v1.patch, HBASE-7119-trunk.patch Original Estimate: 24h Remaining Estimate: 24h Running org.apache.hadoop.hbase.io.TestHeapSize Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.068 sec FAILURE! testNativeSizes(org.apache.hadoop.hbase.io.TestHeapSize) Time elapsed: 0.01 sec FAILURE! junit.framework.AssertionFailedError: expected:64 but was:56 at junit.framework.Assert.fail(Assert.java:47) at junit.framework.Assert.failNotEquals(Assert.java:283) at junit.framework.Assert.assertEquals(Assert.java:64) at junit.framework.Assert.assertEquals(Assert.java:130) at junit.framework.Assert.assertEquals(Assert.java:136) at org.apache.hadoop.hbase.io.TestHeapSize.testNativeSizes(TestHeapSize.java:75) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7116) org.apache.hadoop.hbase.TestZooKeeper occasionally failed with testClientSessionExpired unit test
[ https://issues.apache.org/jira/browse/HBASE-7116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Ping Zhang updated HBASE-7116: - Attachment: HBASE-7116-0.94.patch This patch HBASE-7116-0.94.patch is for the 0.94 branch. org.apache.hadoop.hbase.TestZooKeeper occasionally failed with testClientSessionExpired unit test - Key: HBASE-7116 URL: https://issues.apache.org/jira/browse/HBASE-7116 Project: HBase Issue Type: Bug Affects Versions: 0.90.5, 0.92.0 Environment: Linux RHEL 5.3, open JDK 1.6 Reporter: Li Ping Zhang Assignee: Li Ping Zhang Labels: patch Attachments: HBASE-7116-0.90.patch, HBASE-7116-0.94.patch Original Estimate: 72h Remaining Estimate: 72h org.apache.hadoop.hbase.TestZooKeeper unit tests failed in the testClientSessionExpired method as follows: - Tests run: 5, Failures: 1, Errors: 0, Skipped: 0 testClientSessionExpired(org.apache.hadoop.hbase.TestZooKeeper) java.lang.AssertionError: at org.apache.hadoop.hbase.TestZooKeeper.testClientSessionExpired(TestZooKeeper.java:114) impl.MetricsSystemImpl(137): Metrics system not started: Cannot locate configuration: tried hadoop-metrics2-datanode.properties, hadoop-metrics2.properties 2012-01-22 01:44:45,764 WARN [main] util.MBeans(59): Hadoop:service=DataNode,name=MetricsSystem,sub=Control javax.management.InstanceAlreadyExistsException: MXBean already registered with name Hadoop:service=NameNode,name=MetricsSystem,sub=Control 2012-01-22 01:46:07,153 DEBUG [main-EventThread] zookeeper.ZooKeeperWatcher(336): org.apache.hadoop.hbase.TestZooKeeper-0x135015e9f9e000d Received Disconnected from ZooKeeper, ignoring 2012-01-22 01:46:07,157 WARN [org.apache.hadoop.hdfs.server.datanode.DataXceiverServer@55565556] datanode.DataXceiverServer(138): DatanodeRegistration(127.0.0.1:41145, storageID=DS-1858213200-9.123.196.159-41145-1327167899969, infoPort=36485, ipcPort=55138):DataXceiveServer:java.nio.channels.AsynchronousCloseException Shutting down DataNode 0 2012-01-22 01:46:08,160 WARN [main]
util.MBeans(73): Hadoop:service=DataNode,name=FSDatasetState-UndefinedStorageId-1051654067 javax.management.InstanceNotFoundException: Hadoop:service=DataNode,name=FSDatasetState-UndefinedStorageId-1051654067 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7116) org.apache.hadoop.hbase.TestZooKeeper occasionally failed with testClientSessionExpired unit test
[ https://issues.apache.org/jira/browse/HBASE-7116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Ping Zhang updated HBASE-7116: - Attachment: HBASE-7116-0.90.patch The HBASE-7116-0.90.patch is for the 0.90 branch.
[jira] [Updated] (HBASE-7116) org.apache.hadoop.hbase.TestZooKeeper occasionally failed with testClientSessionExpired unit test
[ https://issues.apache.org/jira/browse/HBASE-7116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Ping Zhang updated HBASE-7116: - Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-7119) org.apache.hadoop.hbase.io.TestHeapSize failed with testNativeSizes unit test
[ https://issues.apache.org/jira/browse/HBASE-7119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531183#comment-13531183 ] Hadoop QA commented on HBASE-7119: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12555307/HBASE-7119-0.94-v1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3528//console This message is automatically generated. org.apache.hadoop.hbase.io.TestHeapSize failed with testNativeSizes unit test - Key: HBASE-7119 URL: https://issues.apache.org/jira/browse/HBASE-7119 Project: HBase Issue Type: Sub-task
[jira] [Updated] (HBASE-7118) org.apache.hadoop.hbase.replication.TestReplicationPeer failed with testResetZooKeeperSession unit test
[ https://issues.apache.org/jira/browse/HBASE-7118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Ping Zhang updated HBASE-7118: - Attachment: HBASE-7118-0.94.0.patch Thanks Ramkrishna.s. I attached HBASE-7118-0.94.0.patch specifically for release 0.94.0. org.apache.hadoop.hbase.replication.TestReplicationPeer failed with testResetZooKeeperSession unit test --- Key: HBASE-7118 URL: https://issues.apache.org/jira/browse/HBASE-7118 Project: HBase Issue Type: Bug Affects Versions: 0.92.0, 0.94.0 Environment: RHEL 5.3, open JDK 1.6 Reporter: Li Ping Zhang Assignee: Li Ping Zhang Labels: patch Attachments: HBASE-7118-0.94.0.patch Original Estimate: 24h Remaining Estimate: 24h org.apache.hadoop.hbase.replication.TestReplicationPeer Running org.apache.hadoop.hbase.replication.TestReplicationPeer Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 25.89 sec FAILURE! --- stable failures, new for hbase 0.92.0, need to be fixed first. target/surefire-reports/org.apache.hadoop.hbase.replication.TestReplicationPeer.txt output: Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 28.245 sec FAILURE! testResetZooKeeperSession(org.apache.hadoop.hbase.replication.TestReplicationPeer) Time elapsed: 25.247 sec FAILURE! junit.framework.AssertionFailedError: ReplicationPeer ZooKeeper session was not properly expired.
at junit.framework.Assert.fail(Assert.java:50) at org.apache.hadoop.hbase.replication.TestReplicationPeer.testResetZooKeeperSession(TestReplicationPeer.java:73) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37) at java.lang.reflect.Method.invoke(Method.java:611) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20) at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:62) target/surefire-reports/org.apache.hadoop.hbase.replication.TestReplicationPeer-output.txt content: 2012-03-25 20:52:42,979 INFO [main] zookeeper.MiniZooKeeperCluster(174): Started MiniZK Cluster and connect 1 ZK server on client port: 21818 2012-03-25 20:52:43,023 DEBUG [main] zookeeper.ZKUtil(96): connection to cluster: clusterId opening connection to ZooKeeper with ensemble (localhost:21818) 2012-03-25 20:52:43,082 INFO [main] zookeeper.RecoverableZooKeeper(89): The identifier of this process is 4...@svltest116.svl.ibm.com 2012-03-25 20:52:43,166 DEBUG [main-EventThread] zookeeper.ZooKeeperWatcher(257): connection to cluster: clusterId Received ZooKeeper Event, type=None, state=SyncConnected, path=null 2012-03-25 20:52:43,175 INFO [Thread-9] replication.TestReplicationPeer(53): Expiring ReplicationPeer ZooKeeper session. 
2012-03-25 20:52:43,196 DEBUG [main-EventThread] zookeeper.ZooKeeperWatcher(334): connection to cluster: clusterId-0x1364d226a3d connected 2012-03-25 20:52:43,308 INFO [Thread-9] hbase.HBaseTestingUtility(1234): ZK Closed Session 0x1364d226a3d; sleeping=25000 2012-03-25 20:53:08,323 INFO [Thread-9] replication.TestReplicationPeer(57): Attempting to use expired ReplicationPeer ZooKeeper session. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7118) org.apache.hadoop.hbase.replication.TestReplicationPeer failed with testResetZooKeeperSession unit test
[ https://issues.apache.org/jira/browse/HBASE-7118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Ping Zhang updated HBASE-7118: - Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-7340) Allow user-specified actions following region movement
[ https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-7340: -- Attachment: 7340-v1.txt Patch v1. TestMasterObserver passes. Allow user-specified actions following region movement -- Key: HBASE-7340 URL: https://issues.apache.org/jira/browse/HBASE-7340 Project: HBase Issue Type: Bug Reporter: Ted Yu Attachments: 7340-v1.txt Sometimes a user performs a compaction after a region is moved (by the balancer). We should provide a 'hook' which lets the user specify what follow-on actions to take after region movement. See the discussion on the user mailing list under the thread 'How to know it's time for a major compaction?' for background information: http://search-hadoop.com/m/BDx4S1jMjF92subj=How+to+know+it+s+time+for+a+major+compaction+ -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7340) Allow user-specified actions following region movement
[ https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-7340: -- Assignee: Ted Yu Issue Type: Task (was: Bug) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7340) Allow user-specified actions following region movement
[ https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-7340: -- Status: Patch Available (was: Open) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7118) org.apache.hadoop.hbase.replication.TestReplicationPeer failed with testResetZooKeeperSession unit test
[ https://issues.apache.org/jira/browse/HBASE-7118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531196#comment-13531196 ] Hadoop QA commented on HBASE-7118: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12560807/HBASE-7118-0.94.0.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3529//console This message is automatically generated. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7116) org.apache.hadoop.hbase.TestZooKeeper occasionally failed with testClientSessionExpired unit test
[ https://issues.apache.org/jira/browse/HBASE-7116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531197#comment-13531197 ] Hadoop QA commented on HBASE-7116: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12560805/HBASE-7116-0.90.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3530//console This message is automatically generated. org.apache.hadoop.hbase.TestZooKeeper occasionally failed with testClientSessionExpired unit test - Key: HBASE-7116 URL: https://issues.apache.org/jira/browse/HBASE-7116 Project: HBase Issue Type: Bug Affects Versions: 0.90.5, 0.92.0 Environment: Linux RHEL 5.3, open JDK 1.6 Reporter: Li Ping Zhang Assignee: Li Ping Zhang Labels: patch Attachments: HBASE-7116-0.90.patch, HBASE-7116-0.94.patch Original Estimate: 72h Remaining Estimate: 72h The org.apache.hadoop.hbase.TestZooKeeper unit tests failed in the testClientSessionExpired method as follows: - Tests run: 5, Failures: 1, Errors: 0, Skipped: 0 testClientSessionExpired(org.apache.hadoop.hbase.TestZooKeeper) java.lang.AssertionError: at org.apache.hadoop.hbase.TestZooKeeper.testClientSessionExpired(TestZooKeeper.java:114) impl.MetricsSystemImpl(137): Metrics system not started: Cannot locate configuration: tried hadoop-metrics2-datanode.properties, hadoop-metrics2.properties 2012-01-22 01:44:45,764 WARN [main] util.MBeans(59): Hadoop:service=DataNode,name=MetricsSystem,sub=Control javax.management.InstanceAlreadyExistsException: MXBean already registered with name Hadoop:service=NameNode,name=MetricsSystem,sub=Control 2012-01-22 01:46:07,153 DEBUG [main-EventThread] 
zookeeper.ZooKeeperWatcher(336): org.apache.hadoop.hbase.TestZooKeeper-0x135015e9f9e000d Received Disconnected from ZooKeeper, ignoring 2012-01-22 01:46:07,157 WARN [org.apache.hadoop.hdfs.server.datanode.DataXceiverServer@55565556] datanode.DataXceiverServer(138): DatanodeRegistration(127.0.0.1:41145, storageID=DS-1858213200-9.123.196.159-41145-1327167899969, infoPort=36485, ipcPort=55138):DataXceiveServer:java.nio.channels.AsynchronousCloseException Shutting down DataNode 0 2012-01-22 01:46:08,160 WARN [main] util.MBeans(73): Hadoop:service=DataNode,name=FSDatasetState-UndefinedStorageId-1051654067 javax.management.InstanceNotFoundException: Hadoop:service=DataNode,name=FSDatasetState-UndefinedStorageId-1051654067 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7120) hbase-daemon.sh (start) missing necessary check when writing pid and log files
[ https://issues.apache.org/jira/browse/HBASE-7120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531212#comment-13531212 ] Li Ping Zhang commented on HBASE-7120: -- Stack, agree. I will modify it and regenerate new patches with your good suggestion. hbase-daemon.sh (start) missing necessary check when writing pid and log files -- Key: HBASE-7120 URL: https://issues.apache.org/jira/browse/HBASE-7120 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Environment: RHEL 5.3, open JDK 1.6 Reporter: Li Ping Zhang Assignee: Li Ping Zhang Labels: patch Attachments: HBASE-7120-0.94.patch, HBASE-7120-trunk.patch Original Estimate: 48h Remaining Estimate: 48h $HBASE_HOME/bin/hbase-daemon.sh exits with code zero even when 'hbase-daemon.sh start' fails, because it does not check the exit codes of the commands it runs; it should perform the necessary checks when writing the pid and log files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7211) Improve hbase ref guide for the testing part.
[ https://issues.apache.org/jira/browse/HBASE-7211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531228#comment-13531228 ] Jeffrey Zhong commented on HBASE-7211: -- Thanks Nicolas! Improve hbase ref guide for the testing part. - Key: HBASE-7211 URL: https://issues.apache.org/jira/browse/HBASE-7211 Project: HBase Issue Type: Bug Components: documentation Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: hbase-7211-partial.patch Here is some stuff I saw. I will propose a fix in a week or so; please add the comments or issues you have in mind. ??15.6.1. Apache HBase Modules?? = We should be able to use categories in all modules. The default should be small; but any test manipulating the time needs to be in a specific jvm (hence medium), so it's not always related to the minicluster. ??15.6.3.6. hbasetests.sh?? = We can remove this chapter, and the script. The script is not totally useless, but I think nobody actually uses it. = Add a chapter on flakiness. Some tests are, unfortunately, flaky. While their number decreases, we still have some. Rules are: - don't write flaky tests! :-) - small tests cannot be flaky, as they block other test execution. Corollary: if you have an issue with a small test, it's either your environment or a severe issue. - rerun the test a few times to validate, and check the ports and file descriptors used. ??mvn test -P localTests -Dtest=MyTest?? = We could actually activate the localTests profile whenever -Dtest is used. If we do that, we can remove the reference to localTests in the doc. ??mvn test -P runSmallTests?? ??mvn test -P runMediumTests?? = I'm not sure they're actually used. We could remove them from the pom.xml (and the doc). ??The HBase build uses a patched version of the maven surefire plugin?? = Hopefully, we will be able to remove this soon :-) ??Integration tests are described TODO: POINTER_TO_INTEGRATION_TEST_SECTION?? 
= Should be documented -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7338) Fix flaky condition for org.apache.hadoop.hbase.TestRegionRebalancing.testRebalanceOnRegionServerNumberChange
[ https://issues.apache.org/jira/browse/HBASE-7338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-7338: --- Assignee: Himanshu Vashishtha Fix flaky condition for org.apache.hadoop.hbase.TestRegionRebalancing.testRebalanceOnRegionServerNumberChange - Key: HBASE-7338 URL: https://issues.apache.org/jira/browse/HBASE-7338 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.94.3, 0.96.0 Reporter: Himanshu Vashishtha Assignee: Himanshu Vashishtha Priority: Minor Attachments: HBASE-7338.patch The balancer doesn't run in case a region is in transition. The check that confirms whether all regions are assigned looks for a region count of 22, whereas the total number of regions is 27. This may result in a failure: {code} java.lang.AssertionError: After 5 attempts, region assignments were not balanced. at org.junit.Assert.fail(Assert.java:93) at org.apache.hadoop.hbase.TestRegionRebalancing.assertRegionsAreBalanced(TestRegionRebalancing.java:203) at org.apache.hadoop.hbase.TestRegionRebalancing.testRebalanceOnRegionServerNumberChange(TestRegionRebalancing.java:123) . 2012-12-11 13:47:02,231 INFO [pool-1-thread-1] hbase.TestRegionRebalancing(120): Added fourth server=p0118.mtv.cloudera.com,44414,1355262422083 2012-12-11 13:47:02,231 INFO [RegionServer:3;p0118.mtv.cloudera.com,44414,1355262422083] regionserver.HRegionServer(3769): Registered RegionServer MXBean 2012-12-11 13:47:02,231 DEBUG [pool-1-thread-1] master.HMaster(987): Not running balancer because 1 region(s) in transition: {c786446fb2542f190e937057cdc79d9d=test,kkk,1355262401365.c786446fb2542f190e937057cdc79d9d. state=OPENING, ts=1355262421037, server=p0118.mtv.cloudera.com,54281,1355262419765} 2012-12-11 13:47:02,232 DEBUG [pool-1-thread-1] hbase.TestRegionRebalancing(165): There are 4 servers and 26 regions. 
Load Average: 13.0 low border: 9, up border: 16; attempt: 0 2012-12-11 13:47:02,232 DEBUG [pool-1-thread-1] hbase.TestRegionRebalancing(171): p0118.mtv.cloudera.com,51590,1355262395329 Avg: 13.0 actual: 11 2012-12-11 13:47:02,232 DEBUG [pool-1-thread-1] hbase.TestRegionRebalancing(171): p0118.mtv.cloudera.com,52987,1355262407916 Avg: 13.0 actual: 15 2012-12-11 13:47:02,233 DEBUG [pool-1-thread-1] hbase.TestRegionRebalancing(171): p0118.mtv.cloudera.com,48044,1355262421787 Avg: 13.0 actual: 0 2012-12-11 13:47:02,233 DEBUG [pool-1-thread-1] hbase.TestRegionRebalancing(179): p0118.mtv.cloudera.com,48044,1355262421787 Isn't balanced!!! Avg: 13.0 actual: 0 slop: 0.2 2012-12-11 13:47:12,233 DEBUG [pool-1-thread-1] master.HMaster(987): Not running balancer because 1 region(s) in transition: {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
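The balance check in the log above uses a slop band around the average load. The sketch below is a hedged illustration only: the `isBalanced` helper and its rounding are assumptions, not the actual TestRegionRebalancing code (the log's low border of 9 suggests the real rounding differs slightly).

```java
// Illustrative slop-based balance check: a server counts as balanced
// when its region count lies within [avg*(1-slop), avg*(1+slop)].
// Hypothetical helper, not HBase's actual implementation.
public class SlopCheckSketch {
    static boolean isBalanced(int regionCount, double avgLoad, double slop) {
        int low = (int) Math.floor(avgLoad * (1 - slop));
        int high = (int) Math.ceil(avgLoad * (1 + slop));
        return regionCount >= low && regionCount <= high;
    }

    public static void main(String[] args) {
        // Values from the log: avg 13.0, slop 0.2. A freshly added server
        // still holding 0 regions is flagged as not balanced.
        System.out.println(isBalanced(0, 13.0, 0.2));  // false
        System.out.println(isBalanced(11, 13.0, 0.2)); // true
    }
}
```

Because the balancer never runs while a region is in transition, the new server stays at 0 regions and the balance assertion keeps failing until the attempts are exhausted.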
[jira] [Commented] (HBASE-7338) Fix flaky condition for org.apache.hadoop.hbase.TestRegionRebalancing.testRebalanceOnRegionServerNumberChange
[ https://issues.apache.org/jira/browse/HBASE-7338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531241#comment-13531241 ] Jimmy Xiang commented on HBASE-7338: +1. Looks good. On commit, we need to add a timeout for the test in case a region is stuck in transition. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6651) Improve thread safety of HTablePool
[ https://issues.apache.org/jira/browse/HBASE-6651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531242#comment-13531242 ] Lars Hofhansl commented on HBASE-6651: -- -1 on keeping it. It seems this needs an official vote. Improve thread safety of HTablePool --- Key: HBASE-6651 URL: https://issues.apache.org/jira/browse/HBASE-6651 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.94.1 Reporter: Hiroshi Ikeda Assignee: Hiroshi Ikeda Fix For: 0.96.0 Attachments: HBASE-6651.patch, HBASE-6651-V10.patch, HBASE-6651-V2.patch, HBASE-6651-V3.patch, HBASE-6651-V4.patch, HBASE-6651-V5.patch, HBASE-6651-V6.patch, HBASE-6651-V7.patch, HBASE-6651-V8.patch, HBASE-6651-V9.patch, sample.zip, sample.zip, sharedmap_for_hbaseclient.zip There are some operations in HTablePool accessing PoolMap in multiple places without any explicit synchronization. For example, HTablePool.closeTablePool() calls PoolMap.values(), and then calls PoolMap.remove(). If other threads add new instances to the pool in the middle of the calls, the newly added instances might be dropped. (HTablePool.closeTablePool() also has another problem: calling it from multiple threads causes HTable to be accessed by multiple threads.) Moreover, PoolMap is not thread safe for the same reason. For example, PoolMap.put() calls ConcurrentMap.get() and then calls ConcurrentMap.put(). If other threads add a new instance to the concurrent map in the middle of the calls, the new instance might be dropped. The implementations of Pool also have the same problems. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
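The get-then-put race described above can be shown in a few lines. This is a simplified, hypothetical pool keyed by table name, not the actual PoolMap code; the usual fix is to make the check-and-insert atomic with `computeIfAbsent` (or `putIfAbsent`):

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentMap;

// Simplified, hypothetical pool illustrating only the check-then-act
// pattern described in the issue; not the actual PoolMap code.
public class PoolSketch {
    private final ConcurrentMap<String, Queue<Object>> pools =
        new ConcurrentHashMap<String, Queue<Object>>();

    // Racy: between get() and put(), another thread can observe the same
    // null, install its own queue, and have it overwritten here -- any
    // instance already added to the losing queue is silently dropped.
    public void putRacy(String key, Object value) {
        Queue<Object> q = pools.get(key);
        if (q == null) {
            q = new ConcurrentLinkedQueue<Object>();
            pools.put(key, q); // may clobber a queue installed concurrently
        }
        q.add(value);
    }

    // Atomic: computeIfAbsent makes the check-and-insert a single step,
    // so every thread ends up sharing the one queue per key.
    public void putSafe(String key, Object value) {
        pools.computeIfAbsent(key, k -> new ConcurrentLinkedQueue<>()).add(value);
    }

    public int size(String key) {
        Queue<Object> q = pools.get(key);
        return q == null ? 0 : q.size();
    }
}
```

Both variants behave identically under a single thread; the difference only appears when two threads race on a missing key, which is exactly the window the issue describes.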
[jira] [Commented] (HBASE-7118) org.apache.hadoop.hbase.replication.TestReplicationPeer failed with testResetZooKeeperSession unit test
[ https://issues.apache.org/jira/browse/HBASE-7118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531244#comment-13531244 ] Ted Yu commented on HBASE-7118: --- @Liping: Can you tell us which JVM you were using ? Looking at recent Jenkins builds for 0.94, I don't see this test failing. Thanks -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531247#comment-13531247 ] Aleksandr Shulman commented on HBASE-7342: -- Hi Ramkrishna, The logic for the change is as follows: With the existing implementation (using -1), when there are two items in the array, it returns the 0th item ( (2 - 1) / 2 = 0 ), which is equal to the index of the firstKey. This is a problem during splits because a split is invalid if the midkey is equal to the firstKey. What we really want here is for the index to be 1. This is because the lastKey is going to be the first key in the next block, so there won't be a collision with it, and the midkey will really represent the middle of first and last. Split operation without split key incorrectly finds the middle key in off-by-one error -- Key: HBASE-7342 URL: https://issues.apache.org/jira/browse/HBASE-7342 Project: HBase Issue Type: Bug Components: HFile, io Affects Versions: 0.94.1, 0.94.2, 0.94.3, 0.96.0 Reporter: Aleksandr Shulman Assignee: Aleksandr Shulman Priority: Minor Fix For: 0.96.0, 0.94.4 Attachments: HBASE-7342-v1.patch, HBASE-7342-v2.patch I took a deeper look into the issues I was having using region splitting when specifying a region (but not a key for splitting). The midkey calculation is off by one and, when there are 2 rows, will pick the 0th one. This causes the firstkey to be the same as the midkey, and the split will fail. Removing the -1 causes it to work correctly, as per the test I've added. Looking into the code, here is what goes on: 1. Split takes the largest storefile 2. It puts all the keys into a 2-dimensional array called blockKeys[][]. Key i resides as blockKeys[i] 3. Getting the middle root-level index should yield the key in the middle of the storefile 4. In step 3, we see that there is a possibly erroneous (-1) to adjust for the 0-offset indexing. 5. In a case where there are only 2 blockKeys, this yields the 0th block key. 6. 
Unfortunately, this is the same block key that 'firstKey' will be. 7. This yields the result in HStore.java:1873 (cannot split because midkey is the same as first or last row) 8. Removing the -1 solves the problem (in this case). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
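The off-by-one reduces to the index arithmetic alone. The helpers below are hypothetical, not the actual HFile block-index code; they only reproduce the calculation described in the report:

```java
// Hypothetical helpers reproducing only the midkey index arithmetic
// from the report above; not the actual HFile block-index code.
public class MidKeySketch {
    // With the -1 adjustment: for 2 block keys, (2 - 1) / 2 == 0, so the
    // "middle" key is the first key and the split is rejected as invalid.
    static int midIndexWithMinusOne(int blockKeyCount) {
        return (blockKeyCount - 1) / 2;
    }

    // Without the -1: for 2 block keys, 2 / 2 == 1. The last key is also
    // the first key of the next block, so it cannot equal firstKey and
    // the midkey genuinely sits between first and last.
    static int midIndexFixed(int blockKeyCount) {
        return blockKeyCount / 2;
    }
}
```

For three or more block keys the two formulas rarely differ in a way that matters; the two-key case is exactly the boundary where the -1 pins the midkey onto the firstKey.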
[jira] [Commented] (HBASE-7343) Fix flaky condition for TestDrainingServer
[ https://issues.apache.org/jira/browse/HBASE-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531246#comment-13531246 ] Jimmy Xiang commented on HBASE-7343: You need to rebase your branch and post a new patch so that hadoopqa can take it. Can you make sure all tests in this class have proper timeouts? +1. Looks good to me. Will commit if hadoopqa looks good. Fix flaky condition for TestDrainingServer -- Key: HBASE-7343 URL: https://issues.apache.org/jira/browse/HBASE-7343 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.94.3 Reporter: Himanshu Vashishtha Assignee: Himanshu Vashishtha Priority: Minor Attachments: HBASE-7343.patch The assert statement in setUpBeforeClass() may fail in case the region distribution is not even (i.e., a particular region server has 0 regions). {code} junit.framework.AssertionFailedError at junit.framework.Assert.fail(Assert.java:48) at junit.framework.Assert.assertTrue(Assert.java:20) at junit.framework.Assert.assertFalse(Assert.java:34) at junit.framework.Assert.assertFalse(Assert.java:41) at org.apache.hadoop.hbase.TestDrainingServer.setUpBeforeClass(TestDrainingServer.java:83) {code} This is already fixed in trunk with HBASE-5992, but as that is a bigger change and uses HBASE-5877, this jira fixes the issue instead of backporting 5992. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-7347) Allow multiple readers per storefile
Lars Hofhansl created HBASE-7347:
Summary: Allow multiple readers per storefile
Key: HBASE-7347 URL: https://issues.apache.org/jira/browse/HBASE-7347 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl

Currently each store file is read through a single reader, regardless of how many concurrent read requests access that file. This issue is to explore alternate designs.
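One alternate design hinted at here is a small pool of independent readers per store file, so concurrent scans need not funnel through one reader. The sketch below is purely illustrative (the `ReaderPool` name and shape are invented, not HBase's API).

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

// Hypothetical fixed-size pool: each borrowed reader is used by exactly one
// thread at a time, so readers need no internal synchronization.
public class ReaderPool<R> {
    private final BlockingQueue<R> idle;

    ReaderPool(int size, Supplier<R> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) idle.add(factory.get());
    }

    // Borrow a reader; blocks if all readers are in use.
    R borrow() throws InterruptedException { return idle.take(); }

    void release(R reader) { idle.add(reader); }

    int available() { return idle.size(); }
}
```

A real design would also have to handle swapping in new files on compaction (see the comment on HBASE-7336 below), which a naive pool like this ignores.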
[jira] [Updated] (HBASE-7336) HFileBlock.readAtOffset does not work well with multiple threads
[ https://issues.apache.org/jira/browse/HBASE-7336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-7336:
Issue Type: Sub-task (was: Bug) Parent: HBASE-7347

HFileBlock.readAtOffset does not work well with multiple threads
Key: HBASE-7336 URL: https://issues.apache.org/jira/browse/HBASE-7336 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Critical Fix For: 0.96.0, 0.94.4 Attachments: 7336-0.94.txt, 7336-0.96.txt

HBase grinds to a halt when many threads scan along the same set of blocks and neither read short-circuit nor block caching is enabled for the dfs client; disabling the block cache makes sense on very large scans. It turns out that synchronizing on istream in HFileBlock.readAtOffset is the culprit.
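The contention described above comes from serializing every reader on one stateful seek+read stream. The general direction such a fix takes is a positional ("pread") API that carries the offset in the call itself, so concurrent readers need no shared lock. To stay runnable without HDFS, this sketch illustrates the contrast with the JDK's `FileChannel.read(ByteBuffer, long position)`, which, like a positional read on an HDFS input stream, does not move any shared cursor.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class PreadDemo {
    // Contended variant: every caller must serialize on the one stream,
    // because seek() mutates shared state.
    static synchronized byte[] readAtOffsetLocked(RandomAccessFile raf, long off, int len)
            throws IOException {
        raf.seek(off);
        byte[] b = new byte[len];
        raf.readFully(b);
        return b;
    }

    // Positional variant: the offset travels with the call, the channel's
    // own position is untouched, so no shared lock is needed.
    static byte[] readAtOffsetPread(FileChannel ch, long off, int len) throws IOException {
        ByteBuffer b = ByteBuffer.allocate(len);
        while (b.hasRemaining()) {
            int n = ch.read(b, off + b.position());
            if (n < 0) throw new EOFException("unexpected end of file");
        }
        return b.array();
    }
}
```

With the positional form, many scanner threads can read different blocks of the same file concurrently instead of queueing on one monitor.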
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531254#comment-13531254 ]

Ted Yu commented on HBASE-7342:
Since the fix is for the case of the array size being 2, maybe add a check for that case and don't subtract 1; otherwise keep the current logic.

Split operation without split key incorrectly finds the middle key in off-by-one error
Key: HBASE-7342 URL: https://issues.apache.org/jira/browse/HBASE-7342 Project: HBase Issue Type: Bug Components: HFile, io Affects Versions: 0.94.1, 0.94.2, 0.94.3, 0.96.0 Reporter: Aleksandr Shulman Assignee: Aleksandr Shulman Priority: Minor Fix For: 0.96.0, 0.94.4 Attachments: HBASE-7342-v1.patch, HBASE-7342-v2.patch

I took a deeper look into the issues I was having with region splitting when specifying a region (but not a key for splitting). The midkey calculation is off by one, and when there are 2 rows it will pick the 0th one. This causes the firstKey to be the same as the midkey, and the split will fail. Removing the -1 causes it to work correctly, as per the test I've added. Looking into the code, here is what goes on:
1. Split takes the largest storefile.
2. It puts all the keys into a 2-dimensional array called blockKeys[][]; key i resides at blockKeys[i].
3. Getting the middle root-level index should yield the key in the middle of the storefile.
4. In step 3, we see that there is a possibly erroneous (-1) to adjust for the 0-offset indexing.
5. In a case where there are only 2 blockKeys, this yields the 0th block key.
6. Unfortunately, this is the same block key that 'firstKey' will be.
7. This yields the result in HStore.java:1873 (cannot split because midkey is the same as first or last row).
8. Removing the -1 solves the problem (in this case).
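The arithmetic in steps 3-5 can be shown in isolation. This is an illustration only, with invented names (`midIndexWithMinusOne`, `midIndexFixed`), not the actual HBase code:

```java
public class MidKeyDemo {
    // Variant with the suspect (-1): for 2 block keys this picks index 0,
    // the same entry 'firstKey' comes from, so the split aborts.
    static int midIndexWithMinusOne(int numBlockKeys) {
        return Math.max(0, numBlockKeys / 2 - 1);
    }

    // Variant without the (-1): for 2 block keys this picks index 1,
    // so midkey differs from the first key and the split can proceed.
    static int midIndexFixed(int numBlockKeys) {
        return numBlockKeys / 2;
    }
}
```

With two block keys the buggy variant returns index 0 and the fixed variant returns index 1, which is exactly the two-element case stack describes below.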
[jira] [Updated] (HBASE-7343) Fix flaky condition for TestDrainingServer
[ https://issues.apache.org/jira/browse/HBASE-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Himanshu Vashishtha updated HBASE-7343:
Fix Version/s: 0.94.4
[jira] [Commented] (HBASE-7343) Fix flaky condition for TestDrainingServer
[ https://issues.apache.org/jira/browse/HBASE-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531264#comment-13531264 ]

Himanshu Vashishtha commented on HBASE-7343:
This is only for 0.94.4, as trunk is already taken care of by HBASE-5992.
[jira] [Commented] (HBASE-7336) HFileBlock.readAtOffset does not work well with multiple threads
[ https://issues.apache.org/jira/browse/HBASE-7336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531265#comment-13531265 ]

stack commented on HBASE-7336:
+1 on committing this for now. On a Reader per long-running scanner: it will complicate the swapping in of new files on compaction, but it is probably worth figuring out. Let's file issues for further improvement. Good stuff Lars.
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531272#comment-13531272 ]

Ted Yu commented on HBASE-7342:
I extracted testBasicSplit from patch v2 and it passed.
{code}
Running org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
2012-12-13 10:17:06.866 java[69043:1903] Unable to load realm mapping info from SCDynamicStore
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 29.711 sec
{code}
@Aleksandr: Can you refine your test case to show us the problem? Thanks.
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531275#comment-13531275 ]

stack commented on HBASE-7342:
+1 on fix. I'd not add the test. It's overkill running a cluster to test an array math problem.
[jira] [Commented] (HBASE-7340) Allow user-specified actions following region movement
[ https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531282#comment-13531282 ] Hadoop QA commented on HBASE-7340: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12560808/7340-v1.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 104 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 23 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.client.TestMultiParallel

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3531//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3531//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3531//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3531//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3531//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3531//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3531//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3531//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3531//console
This message is automatically generated.

Allow user-specified actions following region movement
Key: HBASE-7340 URL: https://issues.apache.org/jira/browse/HBASE-7340 Project: HBase Issue Type: Task Reporter: Ted Yu Assignee: Ted Yu Attachments: 7340-v1.txt

Sometimes a user performs a compaction after a region is moved (by the balancer). We should provide a 'hook' which lets the user specify what follow-on actions to take after region movement. See the discussion on the user mailing list under the thread 'How to know it's time for a major compaction?' for background information: http://search-hadoop.com/m/BDx4S1jMjF92subj=How+to+know+it+s+time+for+a+major+compaction+
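The 'hook' idea above amounts to a listener registry that fires after a region move completes. This is a generic sketch of the shape of that mechanism, with invented names; it is not HBase's coprocessor API (which the later comment about pre-made coprocessors alludes to).

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class RegionMoveHooks {
    // User-supplied follow-on action, e.g. triggering a compaction.
    public interface MoveListener {
        void onMoved(String region, String fromServer, String toServer);
    }

    private final List<MoveListener> listeners = new CopyOnWriteArrayList<>();

    void register(MoveListener l) { listeners.add(l); }

    // Called by the balancer/assignment code once the move is done.
    void fireMoved(String region, String fromServer, String toServer) {
        for (MoveListener l : listeners) l.onMoved(region, fromServer, toServer);
    }
}
```

In HBase itself this role is naturally played by a coprocessor observer rather than a bespoke registry, which is the direction the discussion below takes.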
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531284#comment-13531284 ]

stack commented on HBASE-7342:
[~ted_yu] The problem is plain. Look at it. Imagine an array with only two elements in it.
[jira] [Commented] (HBASE-7317) server-side request problems are hard to debug
[ https://issues.apache.org/jira/browse/HBASE-7317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531286#comment-13531286 ]

Sergey Shelukhin commented on HBASE-7317:
How do you define progress on tracing? Improvements in the library, or adoption within the HBase codebase? Also, w.r.t. Apache, as far as I can see, to go into the commons sandbox you don't need much of a separate project.

server-side request problems are hard to debug
Key: HBASE-7317 URL: https://issues.apache.org/jira/browse/HBASE-7317 Project: HBase Issue Type: Brainstorming Components: IPC/RPC, regionserver Reporter: Sergey Shelukhin Priority: Minor

I've seen cases during integration tests where a write or read request took an unexpectedly large amount of time (that, after the client went to the region server that is reported alive and well, which I know from temporary debug logging :)), and it's impossible to understand what is going on on the server side, short of catching the moment with jstack. Some solutions (off by default) could be:
- a facility for tests (especially integration tests) that would trace Server/Master calls into some log or file (it won't help with internals, but at least one could see what was actually received);
- logging the progress of requests between components inside master/server (e.g. "request id=N received", "request id=N is being processed in MyClass", with N drawn on the client from a local sequence; no guarantees of uniqueness are necessary).
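The second idea above (client-drawn request ids logged at each hop) is simple enough to sketch. Names here are invented for illustration; this is not an HBase or tracing-library API.

```java
import java.util.concurrent.atomic.AtomicLong;

public class RequestTrace {
    private static final AtomicLong SEQ = new AtomicLong();

    // Drawn on the client from a local sequence; as the issue notes,
    // uniqueness across clients is not required, only ordering per client.
    static long newRequestId() {
        return SEQ.incrementAndGet();
    }

    // Format a progress line that each component would emit to its log.
    static String progress(long id, String component, String stage) {
        return "request id=" + id + " " + stage + " in " + component;
    }
}
```

Grepping a server log for `request id=N` would then reconstruct where a slow request spent its time, without jstack.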
[jira] [Updated] (HBASE-6466) Enable multi-thread for memstore flush
[ https://issues.apache.org/jira/browse/HBASE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HBASE-6466:
Component/s: regionserver

Enable multi-thread for memstore flush
Key: HBASE-6466 URL: https://issues.apache.org/jira/browse/HBASE-6466 Project: HBase Issue Type: Improvement Components: regionserver Reporter: chunhui shen Assignee: chunhui shen Attachments: HBASE-6466.patch, HBASE-6466v2.patch, HBASE-6466v3.1.patch, HBASE-6466v3.patch, HBASE-6466-v4.patch

If the KV is large, or the HLog is closed under high write pressure, we found the memstore is often above the high water mark and blocks the puts. So should we enable multi-thread for memstore flush? Some performance test data for reference:
1. test environment: random writing; upper memstore limit 5.6GB; lower memstore limit 4.8GB; 400 regions per regionserver; row len=50 bytes, value len=1024 bytes; 5 regionservers, 300 ipc handlers per regionserver; 5 clients, 50 writer threads per client
2. test results:
- one cacheFlush handler: tps 7.8k/s per regionserver, flush 10.1MB/s per regionserver; many aboveGlobalMemstoreLimit blocking events appear
- two cacheFlush handlers: tps 10.7k/s per regionserver, flush 12.46MB/s per regionserver
- with 200 writer threads per client, two cacheFlush handlers: tps 16.1k/s per regionserver, flush 18.6MB/s per regionserver
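The "cacheFlush handlers" above are flusher threads; the change amounts to dispatching per-region flushes to a small worker pool instead of a single flusher. This minimal sketch uses assumed names and is not the patch's code:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FlushDispatcher {
    private final ExecutorService flushers;

    // handlerCount corresponds to the number of cacheFlush handlers tested above.
    FlushDispatcher(int handlerCount) {
        flushers = Executors.newFixedThreadPool(handlerCount);
    }

    // Returns a future so writers blocked on the high water mark can wait
    // for memory to be released.
    Future<Long> requestFlush(String regionName, long memstoreSize) {
        return flushers.submit(() -> {
            // Real code would snapshot the region's memstore and write an
            // HFile here; we just report the bytes "flushed".
            return memstoreSize;
        });
    }

    void shutdown() { flushers.shutdown(); }
}
```

With more than one handler, a single slow flush (e.g. a region with large KVs) no longer stalls every other region's flush queue, which matches the tps gains reported.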
[jira] [Updated] (HBASE-6466) Enable multi-thread for memstore flush
[ https://issues.apache.org/jira/browse/HBASE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HBASE-6466:
Affects Version/s: 0.96.0
[jira] [Updated] (HBASE-6466) Enable multi-thread for memstore flush
[ https://issues.apache.org/jira/browse/HBASE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HBASE-6466:
Fix Version/s: 0.96.0
[jira] [Commented] (HBASE-6466) Enable multi-thread for memstore flush
[ https://issues.apache.org/jira/browse/HBASE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531292#comment-13531292 ]

stack commented on HBASE-6466:
Let's get some more review on this patch before it goes in. It's no fun debugging intermittent hung flushing or closing on a distributed cluster.
[jira] [Updated] (HBASE-6466) Enable multi-thread for memstore flush
[ https://issues.apache.org/jira/browse/HBASE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-6466:
Priority: Critical (was: Major)
Marking critical so it does not slip through the cracks. It is patch available and just in need of additional review.
[jira] [Commented] (HBASE-7340) Allow user-specified actions following region movement
[ https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531296#comment-13531296 ]

Nick Dimiduk commented on HBASE-7340:
Is this a case for shipping some pre-made coprocessors, such as an AutoCompactionObserver?
[jira] [Commented] (HBASE-7340) Allow user-specified actions following region movement
[ https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531298#comment-13531298 ]

Ted Yu commented on HBASE-7340:
The test failure in TestMultiParallel is unrelated to the patch.
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531300#comment-13531300 ]

Aleksandr Shulman commented on HBASE-7342:
I'll go back and refine the test case. We can add it later. The hard part is getting blockKeys to be a size-2 array. Any suggestions?
[jira] [Commented] (HBASE-7268) correct local region location cache information can be overwritten w/stale information from an old server
[ https://issues.apache.org/jira/browse/HBASE-7268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531302#comment-13531302 ] Sergey Shelukhin commented on HBASE-7268: - bq. How would you use it Sergey? Would it be ok if on cluster restart, the sequence restarted at zero? The single-source, increasing timer would be useful for any coordination tasks... e.g. you'd always know which events happened earlier, across master restarts/etc. It should only reset when there's a singularity, e.g. if you wipe the cluster. It's overkill to do it just for this issue though... I think I saw it discussed somewhere, maybe in a JIRA related to snapshots. bq. Yes. But we can't have client register to get callbacks when regions moves. What you thinking? First sleep then get location? :) correct local region location cache information can be overwritten w/stale information from an old server - Key: HBASE-7268 URL: https://issues.apache.org/jira/browse/HBASE-7268 Project: HBase Issue Type: Bug Affects Versions: 0.96.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Minor Fix For: 0.96.0 Attachments: HBASE-7268-v0.patch, HBASE-7268-v0.patch, HBASE-7268-v1.patch, HBASE-7268-v2.patch Discovered via HBASE-7250; related to HBASE-5877. Test is writing from multiple threads. Server A has region R; client knows that. R gets moved from A to server B. B gets killed. R gets moved by master to server C. ~15 seconds later, client tries to write to it (on A?). Multiple client threads report, from the RegionMoved exception processing logic, "R moved from C to B", even though such a transition never happened (neither in nor before the sequence described above). Not quite sure how the client learned of the transition to C; I assume it's from meta, from some other thread... Then the put fails (it may fail due to accumulated errors that are not logged, which I am investigating... but the bogus cache update is there notwithstanding). I have a patch but am not sure if it works; the test still fails locally for a yet unknown reason.
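One way to keep a late report from a dead server from clobbering fresher location data is to stamp each cached entry and reject updates carrying an older stamp. This is only an illustrative sketch under that assumption (hypothetical class and method names, not the actual HBase client cache or the committed patch):

```java
import java.util.concurrent.ConcurrentHashMap;

// Sketch of stale-update protection: each cached location carries a
// monotonically increasing stamp, and an update with an older stamp is
// ignored, so a stale report cannot overwrite fresher information.
class LocationCache {
    static final class Loc {
        final String server;
        final long stamp;
        Loc(String server, long stamp) { this.server = server; this.stamp = stamp; }
    }

    private final ConcurrentHashMap<String, Loc> cache = new ConcurrentHashMap<>();

    /** Returns true if the update was accepted (newer than what we had). */
    boolean update(String region, String server, long stamp) {
        Loc merged = cache.merge(region, new Loc(server, stamp),
            (oldLoc, newLoc) -> newLoc.stamp > oldLoc.stamp ? newLoc : oldLoc);
        return merged.server.equals(server) && merged.stamp == stamp;
    }

    String get(String region) {
        Loc l = cache.get(region);
        return l == null ? null : l.server;
    }
}
```

In the scenario above, the fresh "R is on C" entry would survive a later-arriving but older "R moved to B" report.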
[jira] [Commented] (HBASE-7317) server-side request problems are hard to debug
[ https://issues.apache.org/jira/browse/HBASE-7317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531305#comment-13531305 ] stack commented on HBASE-7317: -- bq. How do you define progress on tracing? I'd define it as us making use of the lib. Examples: + Flip a switch in shell and it 'enables tracing'. Make a request and when the request completes, the trace info is dumped out in shell + Flip a switch on a regionserver and it will start dumping trace into the .log file or into the UI server-side request problems are hard to debug -- Key: HBASE-7317 URL: https://issues.apache.org/jira/browse/HBASE-7317 Project: HBase Issue Type: Brainstorming Components: IPC/RPC, regionserver Reporter: Sergey Shelukhin Priority: Minor I've seen cases during integration tests where the write or read request took an unexpectedly large amount of time (that, after the client went to the region server that is reported alive and well, which I know from temporary debug logging :)), and it's impossible to understand what is going on on the server side, short of catching the moment with jstack. Some solutions (off by default) could be - a facility for tests (especially integration tests) that would trace Server/Master calls into some log or file (won't help with internals but at least one could see what was actually received); - logging the progress of requests between components inside master/server (e.g. request id=N received, request id=N is being processed in MyClass, N being drawn on client from local sequence - no guarantees of uniqueness are necessary).
[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error
[ https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531309#comment-13531309 ] stack commented on HBASE-7342: -- bq. I'll go back and refine the test case. We can add it later. The hard part is getting blockKeys to be a size-2 array. Any suggestions? You could just make a test apart from splits. Conjure an array of 0, 1, 2, and 10 elements. Verify that when you divide by two you get a 'midpoint' that makes sense for hbase. But such a test verges on the silly, I would argue. Let's just commit your fix unless someone comes up w/ a reason for why the -1 was there in the first place.
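The index arithmetic stack suggests testing can be checked in isolation. A minimal sketch (hypothetical helper names, not the actual HFile block-index code), assuming the root-level index is an array of n block keys:

```java
// Hypothetical helpers mirroring the midkey index calculation described in
// the issue. The buggy form subtracts 1 "to adjust for 0-offset indexing";
// with 2 block keys that yields index 0, i.e. the same key as firstKey,
// which makes the split fail. Dropping the -1 yields a usable midpoint.
class MidKeyIndex {
    static int buggy(int numKeys) {
        return (numKeys - 1) / 2;   // off by one: with 2 keys, picks key 0
    }
    static int fixed(int numKeys) {
        return numKeys / 2;         // with 2 keys, picks key 1
    }
}
```

For numKeys = 2 the buggy form returns 0 and the fixed form returns 1, which is exactly the difference the description walks through in steps 4 through 8.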
[jira] [Commented] (HBASE-7340) Allow user-specified actions following region movement
[ https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531317#comment-13531317 ] stack commented on HBASE-7340: -- I read the thread. How does it justify this feature? Nor does this feature strike me as a 'major' priority.
[jira] [Commented] (HBASE-7340) Allow user-specified actions following region movement
[ https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531318#comment-13531318 ] Ted Yu commented on HBASE-7340: --- @Nick: I plan to put AutoCompactionObserver in another issue. This issue exposes region movement to custom coprocessor logic. However, the notification is from the master side, and the decision for compaction should be made on the region servers. There is more work to do before we can come up with a good design for auto compaction. If my patch passes review, I plan to modify the subject of this JIRA and open a new one to continue with auto compaction.
[jira] [Commented] (HBASE-7268) correct local region location cache information can be overwritten w/stale information from an old server
[ https://issues.apache.org/jira/browse/HBASE-7268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531323#comment-13531323 ] nkeywal commented on HBASE-7268: Something that could be done here is to give back to the client the time when the move started: then the client could have another set of heuristics, like: the region move started less than 10 seconds ago, let's wait a little before asking for the location.
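nkeywal's heuristic is simple to sketch. A minimal illustration (hypothetical class and method names, not an actual HBase client API), assuming the server reports the wall-clock time at which the move began:

```java
// Sketch of the proposed client-side heuristic: if the region move started
// less than a grace period ago, wait out the remainder before re-fetching
// the region location from meta.
class MoveBackoff {
    static final long GRACE_MS = 10_000L;  // "less than 10 seconds ago"

    /** Milliseconds to sleep before asking meta for the new location. */
    static long backoffMs(long moveStartMs, long nowMs) {
        long elapsed = nowMs - moveStartMs;
        return elapsed < GRACE_MS ? GRACE_MS - elapsed : 0L;
    }
}
```

A client that caught a RegionMovedException would sleep for backoffMs(...) before the next location lookup, rather than immediately hammering meta while the move is still in flight.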
[jira] [Updated] (HBASE-4791) Allow Secure Zookeeper JAAS configuration to be programmatically set (rather than only by reading JAAS configuration file)
[ https://issues.apache.org/jira/browse/HBASE-4791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-4791: - Attachment: HBASE-4791-v4.patch Reapplying Matteo's patch so hadoopqa finds the trunk version Allow Secure Zookeeper JAAS configuration to be programmatically set (rather than only by reading JAAS configuration file) -- Key: HBASE-4791 URL: https://issues.apache.org/jira/browse/HBASE-4791 Project: HBase Issue Type: Improvement Components: security, Zookeeper Reporter: Eugene Koontz Assignee: Matteo Bertozzi Labels: security, zookeeper Attachments: DemoConfig.java, HBASE-4791-v1.patch, HBASE-4791-v2.patch, HBASE-4791-v3.patch, HBASE-4791-v4-0.94.patch, HBASE-4791-v4.patch, HBASE-4791-v4.patch In the currently proposed fix for HBASE-2418, there must be a JAAS file specified in System.setProperty(java.security.auth.login.config). However, it might be preferable to construct a JAAS configuration programmatically, as is done with secure Hadoop (see https://github.com/apache/hadoop-common/blob/a48eceb62c9b5c1a5d71ee2945d9eea2ed62527b/src/java/org/apache/hadoop/security/UserGroupInformation.java#L175). This would have the benefit of avoiding the use of a system property and would allow an HBase-local configuration setting instead.
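A programmatic JAAS configuration of the kind the issue proposes can be built by subclassing the standard javax.security.auth.login.Configuration. A minimal sketch follows; the principal, keytab path, and entry options are illustrative placeholders, not values from the patch:

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.Configuration;

// Sketch of an in-memory JAAS Configuration: supplies a Krb5 login entry
// directly, with no java.security.auth.login.config file involved.
class InMemoryJaasConfig extends Configuration {
    private final String principal;  // illustrative, e.g. "hbase/host@REALM"
    private final String keytab;     // illustrative, e.g. "/etc/hbase.keytab"

    InMemoryJaasConfig(String principal, String keytab) {
        this.principal = principal;
        this.keytab = keytab;
    }

    @Override
    public AppConfigurationEntry[] getAppConfigurationEntry(String appName) {
        Map<String, String> opts = new HashMap<>();
        opts.put("principal", principal);
        opts.put("keyTab", keytab);
        opts.put("useKeyTab", "true");
        opts.put("storeKey", "true");
        return new AppConfigurationEntry[] {
            new AppConfigurationEntry(
                "com.sun.security.auth.module.Krb5LoginModule",
                AppConfigurationEntry.LoginModuleControlFlag.REQUIRED,
                opts)
        };
    }
}
```

Installing it via Configuration.setConfiguration(new InMemoryJaasConfig(...)) would make the entry visible to any JAAS login in the process, which is the HBase-local alternative to the system-property approach.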
[jira] [Updated] (HBASE-4791) Allow Secure Zookeeper JAAS configuration to be programmatically set (rather than only by reading JAAS configuration file)
[ https://issues.apache.org/jira/browse/HBASE-4791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-4791: - Status: Open (was: Patch Available)
[jira] [Updated] (HBASE-4791) Allow Secure Zookeeper JAAS configuration to be programmatically set (rather than only by reading JAAS configuration file)
[ https://issues.apache.org/jira/browse/HBASE-4791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-4791: - Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-7340) Master coprocessor notification for assignmentManager.balance() is inconsistent
[ https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-7340: -- Description: I found this issue when reading user discussion which is quoted below. In HMaster.moveRegion(), we have: {code} this.assignmentManager.balance(rp); if (this.cpHost != null) { this.cpHost.postMove(hri, rp.getSource(), rp.getDestination()); } {code} Meaning, user can register master coprocessor which would receive region movement notification. The assignmentManager.balance(plan) call in HMaster.balance() doesn't send out such notification. I think we should enhance the following hook (at line 1335) with list of regions moved so that notification from master is consistent: {code} this.cpHost.postBalance(); {code} Here is excerpt for user discussion: Sometimes user performs compaction after a region is moved (by balancer). We should provide 'hook' which lets user specify what follow-on actions to take after region movement. See discussion on user mailing list under the thread 'How to know it's time for a major compaction?' for background information: http://search-hadoop.com/m/BDx4S1jMjF92subj=How+to+know+it+s+time+for+a+major+compaction+ was: Sometimes user performs compaction after a region is moved (by balancer). We should provide 'hook' which lets user specify what follow-on actions to take after region movement. See discussion on user mailing list under the thread 'How to know it's time for a major compaction?' for background information: http://search-hadoop.com/m/BDx4S1jMjF92subj=How+to+know+it+s+time+for+a+major+compaction+ Issue Type: Bug (was: Task) Summary: Master coprocessor notification for assignmentManager.balance() is inconsistent (was: Allow user-specified actions following region movement) @Stack: As mentioned above, I have modified the subject of this JIRA to reflect latest progress. 
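The inconsistency Ted describes is about information: postMove() tells a coprocessor which region went where, while postBalance() tells it nothing. A toy illustration of what a plan-aware notification would convey (hypothetical helper, not the MasterObserver API):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: formats the per-plan details an enhanced postBalance(List) hook
// would pass along, matching what postMove() already reports for a single
// explicit region move. Each plan is {regionName, sourceServer, destServer}.
class BalanceNotifier {
    static List<String> notifyPostBalance(List<String[]> plans) {
        List<String> seen = new ArrayList<>();
        for (String[] p : plans) {
            seen.add(p[0] + ":" + p[1] + "->" + p[2]);
        }
        return seen;
    }
}
```

With the parameterless postBalance(), a coprocessor cannot recover this list; with the proposed signature, balancer-driven moves and moveRegion() calls would look alike to observers.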
[jira] [Commented] (HBASE-7346) Restored snapshot replay problem
[ https://issues.apache.org/jira/browse/HBASE-7346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531349#comment-13531349 ] Jesse Yates commented on HBASE-7346: When we restore a table, it needs to be disabled first (no in-flight restores!), which flushes everything to disk and rolls the WAL. This seems as simple as just removing the existing recovered.edits directory under a table before we restore the table. As an aside, offline snapshots don't actually reference any of the WALs and just copy the recovered.edits directory over, just in case something is in there. This should only apply to the online snapshot cases. Restored snapshot replay problem Key: HBASE-7346 URL: https://issues.apache.org/jira/browse/HBASE-7346 Project: HBase Issue Type: Sub-task Components: Client, master, regionserver, snapshots, Zookeeper Affects Versions: hbase-6055 Reporter: Jonathan Hsieh Priority: Critical Fix For: hbase-6055 The situation is a coarse-grained problem. The key problem is that writes that shouldn't be replayed (since they don't belong to the restored image) would not normally get replayed, but would potentially get replayed if recovery was triggered. Previously, without restore, we could depend on the timestamps: if something was replayed but there was newer data, the newer data would win. In a restore situation, the newer data has the old timestamps from before recovery, and new data that shouldn't get replayed could be. ex: 1) write 100 rows 2) ss1 (with logs) 3) write 50 rows 4) restore ss1 5) crash 6) writes from 1 and 3 both get replayed in log splitting recovery. Oops.
[jira] [Commented] (HBASE-7347) Allow multiple readers per storefile
[ https://issues.apache.org/jira/browse/HBASE-7347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531364#comment-13531364 ] Lars Hofhansl commented on HBASE-7347: -- Options might include:
* The ad hoc fix in HBASE-7336 is good enough (using seek + read when possible without contention, fall back to p-read when needed)
* A separate reader for compactions (Stack suggested that)
* A reader for each scanner that has cacheBlocks set to false (presumably because it is expected to be a large scan; that would also cover compactions)
* Multiple readers per store file, round robin between them
** based on size (i.e. a reader per each n GB)
** based on access pattern (i.e. create readers, cache them for a bit, expire them)
What probably does not work:
* Use p-read everywhere (does not perform well)
* A reader per scanner (probably not efficient; open is a name node operation)
Allow multiple readers per storefile Key: HBASE-7347 URL: https://issues.apache.org/jira/browse/HBASE-7347 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Currently each store file is read only through a single reader regardless of how many concurrent read requests access that file. This issue is to explore alternate designs. 
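The round-robin option in the list above amounts to a few lines of selection logic. This is only an illustration of the policy (hypothetical class, readers abstracted to indices), not HBase code:

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of round-robin selection over N readers for one store file.
// Concurrent scanners calling pick() are spread across the readers instead
// of all contending on a single shared input stream.
class ReaderPool {
    private final int readers;
    private final AtomicLong next = new AtomicLong();

    ReaderPool(int readers) { this.readers = readers; }

    /** Pick the index of the next reader to use. */
    int pick() {
        return (int) (next.getAndIncrement() % readers);
    }
}
```

The trade-off Lars notes still applies: each extra reader is an extra open, and open is a name node operation, so N has to stay small (e.g. tied to file size or access pattern).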
[jira] [Commented] (HBASE-7205) Coprocessor classloader is replicated for all regions in the HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-7205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531366#comment-13531366 ] Andrew Purtell commented on HBASE-7205: --- Thanks [~amuraru] for pitching in on this. Thanks Ted for the reviews. Coprocessor classloader is replicated for all regions in the HRegionServer -- Key: HBASE-7205 URL: https://issues.apache.org/jira/browse/HBASE-7205 Project: HBase Issue Type: Bug Components: Coprocessors Affects Versions: 0.92.2, 0.94.2 Reporter: Adrian Muraru Assignee: Ted Yu Priority: Critical Fix For: 0.96.0, 0.94.4 Attachments: 7205-0.94.txt, 7205-v10.txt, 7205-v1.txt, 7205-v3.txt, 7205-v4.txt, 7205-v5.txt, 7205-v6.txt, 7205-v7.txt, 7205-v8.txt, 7205-v9.txt, HBASE-7205_v2.patch HBASE-6308 introduced a new custom CoprocessorClassLoader to load the coprocessor classes, and a new instance of this CL is created for each single HRegion opened. This leads to OOME-PermGen when the number of regions goes above hundreds per region server. Having the table coprocessor jailed in a separate classloader is good; however, we should create only one for all regions of a table in each HRS.
[jira] [Created] (HBASE-7348) [89-fb] Add some statistics from DFSClient to RegionServerMetrics
Liyin Tang created HBASE-7348: - Summary: [89-fb] Add some statistics from DFSClient to RegionServerMetrics Key: HBASE-7348 URL: https://issues.apache.org/jira/browse/HBASE-7348 Project: HBase Issue Type: Improvement Reporter: Liyin Tang DFSClient actually collects a number of useful statistics such as bytesLocalRead, bytesLocalRackRead and so on. So this diff is going to merge these metrics into RegionServerMetrics.
[jira] [Commented] (HBASE-7340) Master coprocessor notification for assignmentManager.balance() is inconsistent
[ https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531367#comment-13531367 ] Andrew Purtell commented on HBASE-7340: --- [~saint@gmail.com] I think this is some minor missing coverage in the CP API, minor because there isn't an existing user (yet) for it, so it's reasonable on those grounds. Agree the priority should be minor if not trivial.
[jira] [Updated] (HBASE-7340) Master coprocessor notification for assignmentManager.balance() is inconsistent
[ https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-7340: -- Priority: Trivial (was: Major) Issue Type: Improvement (was: Bug)
[jira] [Updated] (HBASE-7336) HFileBlock.readAtOffset does not work well with multiple threads
[ https://issues.apache.org/jira/browse/HBASE-7336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-7336: - Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks for the reviews. HFileBlock.readAtOffset does not work well with multiple threads Key: HBASE-7336 URL: https://issues.apache.org/jira/browse/HBASE-7336 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Critical Fix For: 0.96.0, 0.94.4 Attachments: 7336-0.94.txt, 7336-0.96.txt HBase grinds to a halt when many threads scan along the same set of blocks and neither short-circuit read nor block caching is enabled for the DFS client; disabling the block cache makes sense on very large scans. It turns out that synchronizing on istream in HFileBlock.readAtOffset is the culprit.
[jira] [Updated] (HBASE-7338) Fix flaky condition for org.apache.hadoop.hbase.TestRegionRebalancing.testRebalanceOnRegionServerNumberChange
[ https://issues.apache.org/jira/browse/HBASE-7338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-7338: --- Resolution: Fixed Fix Version/s: 0.96.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Fix flaky condition for org.apache.hadoop.hbase.TestRegionRebalancing.testRebalanceOnRegionServerNumberChange - Key: HBASE-7338 URL: https://issues.apache.org/jira/browse/HBASE-7338 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.94.3, 0.96.0 Reporter: Himanshu Vashishtha Assignee: Himanshu Vashishtha Priority: Minor Fix For: 0.96.0 Attachments: HBASE-7338.patch The balancer doesn't run in case a region is in transition. The check to confirm whether all regions are assigned looks for a region count of 22, whereas the total number of regions is 27. This may result in a failure: {code} java.lang.AssertionError: After 5 attempts, region assignments were not balanced. at org.junit.Assert.fail(Assert.java:93) at org.apache.hadoop.hbase.TestRegionRebalancing.assertRegionsAreBalanced(TestRegionRebalancing.java:203) at org.apache.hadoop.hbase.TestRegionRebalancing.testRebalanceOnRegionServerNumberChange(TestRegionRebalancing.java:123) . 2012-12-11 13:47:02,231 INFO [pool-1-thread-1] hbase.TestRegionRebalancing(120): Added fourth server=p0118.mtv.cloudera.com,44414,1355262422083 2012-12-11 13:47:02,231 INFO [RegionServer:3;p0118.mtv.cloudera.com,44414,1355262422083] regionserver.HRegionServer(3769): Registered RegionServer MXBean 2012-12-11 13:47:02,231 DEBUG [pool-1-thread-1] master.HMaster(987): Not running balancer because 1 region(s) in transition: {c786446fb2542f190e937057cdc79d9d=test,kkk,1355262401365.c786446fb2542f190e937057cdc79d9d. state=OPENING, ts=1355262421037, server=p0118.mtv.cloudera.com,54281,1355262419765} 2012-12-11 13:47:02,232 DEBUG [pool-1-thread-1] hbase.TestRegionRebalancing(165): There are 4 servers and 26 regions. 
Load Average: 13.0 low border: 9, up border: 16; attempt: 0 2012-12-11 13:47:02,232 DEBUG [pool-1-thread-1] hbase.TestRegionRebalancing(171): p0118.mtv.cloudera.com,51590,1355262395329 Avg: 13.0 actual: 11 2012-12-11 13:47:02,232 DEBUG [pool-1-thread-1] hbase.TestRegionRebalancing(171): p0118.mtv.cloudera.com,52987,1355262407916 Avg: 13.0 actual: 15 2012-12-11 13:47:02,233 DEBUG [pool-1-thread-1] hbase.TestRegionRebalancing(171): p0118.mtv.cloudera.com,48044,1355262421787 Avg: 13.0 actual: 0 2012-12-11 13:47:02,233 DEBUG [pool-1-thread-1] hbase.TestRegionRebalancing(179): p0118.mtv.cloudera.com,48044,1355262421787 Isn't balanced!!! Avg: 13.0 actual: 0 slop: 0.2 2012-12-11 13:47:12,233 DEBUG [pool-1-thread-1] master.HMaster(987): Not running balancer because 1 region(s) in transition: {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7340) Master coprocessor notification for assignmentManager.balance() is inconsistent
[ https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-7340: -- Attachment: 7340-v2.txt Patch v2 addresses Andy's comments. Master coprocessor notification for assignmentManager.balance() is inconsistent --- Key: HBASE-7340 URL: https://issues.apache.org/jira/browse/HBASE-7340 Project: HBase Issue Type: Improvement Reporter: Ted Yu Assignee: Ted Yu Priority: Trivial Attachments: 7340-v1.txt, 7340-v2.txt I found this issue when reading user discussion which is quoted below. In HMaster.moveRegion(), we have: {code} this.assignmentManager.balance(rp); if (this.cpHost != null) { this.cpHost.postMove(hri, rp.getSource(), rp.getDestination()); } {code} Meaning, user can register master coprocessor which would receive region movement notification. The assignmentManager.balance(plan) call in HMaster.balance() doesn't send out such notification. I think we should enhance the following hook (at line 1335) with list of regions moved so that notification from master is consistent: {code} this.cpHost.postBalance(); {code} Here is excerpt for user discussion: Sometimes user performs compaction after a region is moved (by balancer). We should provide 'hook' which lets user specify what follow-on actions to take after region movement. See discussion on user mailing list under the thread 'How to know it's time for a major compaction?' for background information: http://search-hadoop.com/m/BDx4S1jMjF92subj=How+to+know+it+s+time+for+a+major+compaction+ -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
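The inconsistency described above (postMove() fires per region moved, while the balancer path calls a no-argument postBalance()) can be sketched with a minimal, self-contained simulation. The Plan and Observer types below are hypothetical stand-ins for HBase's RegionPlan and MasterObserver, not the actual API; the point is that passing the executed plans to postBalance() lets an observer see every region movement regardless of which path triggered it:

```java
import java.util.ArrayList;
import java.util.List;

public class BalanceNotification {
    // Hypothetical stand-in for HBase's RegionPlan: a region plus its source
    // and destination servers.
    public record Plan(String region, String source, String destination) {}

    public interface Observer {
        void postMove(Plan plan);             // fired for an explicit moveRegion()
        void postBalance(List<Plan> plans);   // fired once per balancer run, with all plans
    }

    public static class RecordingObserver implements Observer {
        public final List<Plan> seen = new ArrayList<>();
        public void postMove(Plan plan) { seen.add(plan); }
        public void postBalance(List<Plan> plans) { seen.addAll(plans); }
    }

    // moveRegion path: notify the coprocessor host after the plan runs,
    // mirroring the HMaster.moveRegion() snippet quoted in the issue.
    public static void moveRegion(Observer cp, Plan plan) {
        // ... assignmentManager.balance(plan) would execute here ...
        if (cp != null) cp.postMove(plan);
    }

    // balance path: hand the executed plans to the hook instead of calling a
    // no-argument postBalance(), so the observer learns which regions moved.
    public static void balance(Observer cp, List<Plan> plans) {
        // ... each plan executed via assignmentManager.balance(plan) ...
        if (cp != null) cp.postBalance(plans);
    }

    public static void main(String[] args) {
        RecordingObserver cp = new RecordingObserver();
        moveRegion(cp, new Plan("r1", "a", "b"));
        balance(cp, List.of(new Plan("r2", "b", "c"), new Plan("r3", "c", "a")));
        System.out.println("observer saw " + cp.seen.size() + " region movements");
    }
}
```

With this shape, a user who wants to trigger a compaction after balancer-initiated moves (the mailing-list use case) gets the same information from both notification paths.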
[jira] [Updated] (HBASE-7341) Deprecate RowLocks in 0.94
[ https://issues.apache.org/jira/browse/HBASE-7341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gregory Chanan updated HBASE-7341: -- Resolution: Fixed Status: Resolved (was: Patch Available) Thanks for the review, Stack. Committed to 0.94. Deprecate RowLocks in 0.94 -- Key: HBASE-7341 URL: https://issues.apache.org/jira/browse/HBASE-7341 Project: HBase Issue Type: Task Reporter: Gregory Chanan Assignee: Gregory Chanan Priority: Minor Fix For: 0.94.4 Attachments: HBASE-7341.patch Since we are removing support in 0.96 (see HBASE-7315), we should deprecate in 0.94. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7340) Master coprocessor notification for assignmentManager.balance() is inconsistent
[ https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531394#comment-13531394 ] Andrew Purtell commented on HBASE-7340: --- +1 patch v2 Master coprocessor notification for assignmentManager.balance() is inconsistent --- Key: HBASE-7340 URL: https://issues.apache.org/jira/browse/HBASE-7340 Project: HBase Issue Type: Improvement Reporter: Ted Yu Assignee: Ted Yu Priority: Trivial Attachments: 7340-v1.txt, 7340-v2.txt I found this issue when reading user discussion which is quoted below. In HMaster.moveRegion(), we have: {code} this.assignmentManager.balance(rp); if (this.cpHost != null) { this.cpHost.postMove(hri, rp.getSource(), rp.getDestination()); } {code} Meaning, user can register master coprocessor which would receive region movement notification. The assignmentManager.balance(plan) call in HMaster.balance() doesn't send out such notification. I think we should enhance the following hook (at line 1335) with list of regions moved so that notification from master is consistent: {code} this.cpHost.postBalance(); {code} Here is excerpt for user discussion: Sometimes user performs compaction after a region is moved (by balancer). We should provide 'hook' which lets user specify what follow-on actions to take after region movement. See discussion on user mailing list under the thread 'How to know it's time for a major compaction?' for background information: http://search-hadoop.com/m/BDx4S1jMjF92subj=How+to+know+it+s+time+for+a+major+compaction+ -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7236) add per-table/per-cf configuration via metadata
[ https://issues.apache.org/jira/browse/HBASE-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531396#comment-13531396 ] Sergey Shelukhin commented on HBASE-7236: - bq. Attributes as in TTL, MIN_VERSIONS, which are part of the hbase data model Actually, these are not part of the data model; this is config. For many existing metadata settings, it makes perfect sense to have a cluster-wide value (e.g. TTL, versions, compression parameters, etc.). IMHO metadata is stuff like IS_ROOT, IS_READONLY, etc. So we have 3 types of values, excluding schema: 1) Metadata specific to a column/table, e.g. IS_ROOT. 2) Config: a) Some config is xml-file-only and has names like hbase.foo.bar.baz. b) Some config is descriptor-only and has names like MIN_VERSIONS. c) Some config is both xml and column-specific, and has names like hbase.region.max.filesize (or something) *and* MAX_FILESIZE. 3) User-specific parameters, set via CONFIG in the shell, and used for mysterious user-specific purposes (right now, CF user parameters can also override xml config implicitly, which I am not removing with this patch to keep backward compat). I don't see a reason not to separate all of this. This patch cleans up 2c and changes it into a general mechanism for doing such things, instead of on a case-by-case basis, which as shown above leads to more code and potential bugs. Ideally, I'd say we should: * remove 2b, in the sense that all config values should be specifiable in hbase.foo.bar.baz form, both in the XML file (e.g. hbase.cf.min.versions) and on the table/cf (where it makes sense; e.g. we can whitelist overridable keys). * Separate (1) from (3) to prevent potential conflicts if we add new reserved keywords that users have set on their descriptors for their own purposes, but that's a separate issue. bq. Is putting arbitrary byte[]'s into the table/column descriptors a valid use case for the clients. 
Yes, there were JIRAs about improving it so I assume it's used :) add per-table/per-cf configuration via metadata --- Key: HBASE-7236 URL: https://issues.apache.org/jira/browse/HBASE-7236 Project: HBase Issue Type: New Feature Components: Compaction Affects Versions: 0.96.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-7236-PROTOTYPE.patch, HBASE-7236-PROTOTYPE.patch, HBASE-7236-PROTOTYPE-v1.patch, HBASE-7236-v0.patch, HBASE-7236-v1.patch, HBASE-7236-v2.patch, HBASE-7236-v3.patch Regardless of the compaction policy, it makes sense to have separate compaction configuration for different tables and column families, as their access patterns and workloads can be different. In particular, for the tiered compactions being ported from the 0.89-fb branch, it is necessary in order to use them properly. We might want to add support for compaction configuration via metadata on the table/cf. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
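The lookup order implied by the discussion above (a setting resolved from the column family first, then the table, then the cluster-wide XML config, then a default) can be sketched as follows. This is an illustrative simplification, not the patch's actual code, and the key name "hbase.hstore.compaction.min" is used only as an example:

```java
import java.util.Map;

public class ScopedConfig {
    // Resolve a config key with CF > table > cluster > default precedence.
    public static String get(String key, Map<String, String> cf,
                             Map<String, String> table, Map<String, String> cluster,
                             String dflt) {
        if (cf.containsKey(key)) return cf.get(key);
        if (table.containsKey(key)) return table.get(key);
        if (cluster.containsKey(key)) return cluster.get(key);
        return dflt;
    }

    public static void main(String[] args) {
        Map<String, String> cluster = Map.of("hbase.hstore.compaction.min", "3");
        Map<String, String> table = Map.of();
        Map<String, String> cf = Map.of("hbase.hstore.compaction.min", "5");
        // The CF-level override wins over the cluster-wide value.
        System.out.println(get("hbase.hstore.compaction.min", cf, table, cluster, "2"));
        // With no override anywhere, the default applies.
        System.out.println(get("hbase.hstore.compaction.max", cf, table, cluster, "10"));
    }
}
```

A single generic resolution mechanism like this is what replaces the case-by-case plumbing the comment complains about: each new tunable only needs a key name, not new code paths.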
[jira] [Commented] (HBASE-4791) Allow Secure Zookeeper JAAS configuration to be programmatically set (rather than only by reading JAAS configuration file)
[ https://issues.apache.org/jira/browse/HBASE-4791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531402#comment-13531402 ] Hadoop QA commented on HBASE-4791: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12560824/HBASE-4791-v4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 104 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 23 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.client.TestFromClientSide org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3532//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3532//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3532//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3532//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3532//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3532//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3532//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3532//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3532//console This message is automatically generated. 
Allow Secure Zookeeper JAAS configuration to be programmatically set (rather than only by reading JAAS configuration file) -- Key: HBASE-4791 URL: https://issues.apache.org/jira/browse/HBASE-4791 Project: HBase Issue Type: Improvement Components: security, Zookeeper Reporter: Eugene Koontz Assignee: Matteo Bertozzi Labels: security, zookeeper Attachments: DemoConfig.java, HBASE-4791-v1.patch, HBASE-4791-v2.patch, HBASE-4791-v3.patch, HBASE-4791-v4-0.94.patch, HBASE-4791-v4.patch, HBASE-4791-v4.patch In the currently proposed fix for HBASE-2418, there must be a JAAS file specified in System.setProperty(java.security.auth.login.config). However, it might be preferable to construct a JAAS configuration programmatically, as is done with secure Hadoop (see https://github.com/apache/hadoop-common/blob/a48eceb62c9b5c1a5d71ee2945d9eea2ed62527b/src/java/org/apache/hadoop/security/UserGroupInformation.java#L175). This would have the benefit of avoiding a usage of a system property setting, and allow instead an HBase-local configuration setting. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7343) Fix flaky condition for TestDrainingServer
[ https://issues.apache.org/jira/browse/HBASE-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-7343: --- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Fix flaky condition for TestDrainingServer -- Key: HBASE-7343 URL: https://issues.apache.org/jira/browse/HBASE-7343 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.94.3 Reporter: Himanshu Vashishtha Assignee: Himanshu Vashishtha Priority: Minor Fix For: 0.94.4 Attachments: HBASE-7343.patch The assert statement in setUpBeforeClass() may fail in case the region distribution is not even (a particular rs has 0 regions). {code} junit.framework.AssertionFailedError at junit.framework.Assert.fail(Assert.java:48) at junit.framework.Assert.assertTrue(Assert.java:20) at junit.framework.Assert.assertFalse(Assert.java:34) at junit.framework.Assert.assertFalse(Assert.java:41) at org.apache.hadoop.hbase.TestDrainingServer.setUpBeforeClass(TestDrainingServer.java:83) {code} This is already fixed in trunk with HBASE-5992, but as that's a bigger change and uses 5877, this jira fixes that issue instead of backporting 5992. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7346) Restored snapshot replay problem
[ https://issues.apache.org/jira/browse/HBASE-7346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531404#comment-13531404 ] Jonathan Hsieh commented on HBASE-7346: --- The crash I'm talking about occurs after the restore has successfully completed (after step 4). Does the disable-table flushing + log rolling guarantee that logs are moved out of the way so that they won't be replayed? Let's double check and add a test case for this. That is the easiest way to convince ourselves that this is or isn't a problem in the offline or online cases. Restored snapshot replay problem Key: HBASE-7346 URL: https://issues.apache.org/jira/browse/HBASE-7346 Project: HBase Issue Type: Sub-task Components: Client, master, regionserver, snapshots, Zookeeper Affects Versions: hbase-6055 Reporter: Jonathan Hsieh Priority: Critical Fix For: hbase-6055 The situation is a coarse-grained problem. The key problem is that writes that shouldn't be replayed (since they don't belong to the restored image) would not normally get replayed, but would potentially get replayed if recovery was triggered. Previously, without restore, we could depend on the timestamps: if something was replayed but there was newer data, the newer data would win. In a restore situation, the newer data has the old timestamps from before recovery, and new data that shouldn't get replayed could be. Example: 1) write 100 rows 2) ss1 (with logs) 3) write 50 rows 4) restore ss1 5) crash 6) writes from 1 and 3 both get replayed in log splitting recovery. Oops. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
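The six-step scenario above can be made concrete with a toy model (illustrative only, not HBase code). Cells resolve replay conflicts by timestamp, so the post-snapshot write from step 3 carries a timestamp newer than anything in the restored image and is wrongly reapplied during recovery:

```java
import java.util.HashMap;
import java.util.Map;

public class SnapshotReplay {
    public record Cell(String value, long ts) {}

    // Timestamp-based conflict resolution: a replayed edit is applied only
    // if its timestamp is at least as new as what the store currently holds.
    public static void replay(Map<String, Cell> store, String row, Cell edit) {
        Cell cur = store.get(row);
        if (cur == null || edit.ts() >= cur.ts()) store.put(row, edit);
    }

    public static void main(String[] args) {
        Map<String, Cell> store = new HashMap<>();
        store.put("row1", new Cell("v1", 100));            // step 1: initial write
        Map<String, Cell> snapshot = new HashMap<>(store); // step 2: snapshot ss1
        store.put("row2", new Cell("v2", 200));            // step 3: post-snapshot write
        store = new HashMap<>(snapshot);                   // step 4: restore ss1
        // steps 5-6: crash; log splitting replays BOTH writes. Timestamp
        // resolution cannot reject the step-3 edit: its ts (200) is newer
        // than anything in the restored image, so it is wrongly reapplied.
        replay(store, "row1", new Cell("v1", 100));
        replay(store, "row2", new Cell("v2", 200));
        System.out.println("row2 present after restore+recovery: " + store.containsKey("row2"));
    }
}
```

Without a restore, the step-3 edit replaying would be harmless (the live data would be at least as new); after a restore, timestamps alone can no longer distinguish edits that belong to the restored image from those that don't, which is why the logs must be moved out of the way.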
[jira] [Updated] (HBASE-7055) port HBASE-6371 tier-based compaction from 0.89-fb to trunk - first slice (not configurable by cf or dynamically)
[ https://issues.apache.org/jira/browse/HBASE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-7055: Attachment: HBASE-7055-v4.patch Addressed Ted's feedback; that includes moving some code into base class port HBASE-6371 tier-based compaction from 0.89-fb to trunk - first slice (not configurable by cf or dynamically) - Key: HBASE-7055 URL: https://issues.apache.org/jira/browse/HBASE-7055 Project: HBase Issue Type: Task Components: Compaction Affects Versions: 0.96.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.96.0 Attachments: HBASE-6371-squashed.patch, HBASE-6371-v2-squashed.patch, HBASE-6371-v3-refactor-only-squashed.patch, HBASE-6371-v4-refactor-only-squashed.patch, HBASE-6371-v5-refactor-only-squashed.patch, HBASE-7055-v0.patch, HBASE-7055-v1.patch, HBASE-7055-v2.patch, HBASE-7055-v3.patch, HBASE-7055-v4.patch There's divergence in the code :( See HBASE-6371 for details. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7233) Serializing KeyValues
[ https://issues.apache.org/jira/browse/HBASE-7233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531437#comment-13531437 ] Andrew Purtell commented on HBASE-7233: --- {quote} bq. I would really prefer not to double the number of kV types just to say foo with tags. And then double again for foo with tags and bar. That would be ugly, but at the same time it's difficult and maybe wasteful to future-proof it from every angle. Tags are already sort of a flexible future-proofing mechanism. Maybe tags can be added in a backwards compatible way to the existing encoders. I'd have to think about it for PrefixTree, probably punting them to a PREFIX_TREE2 encoder with some other additions/improvements. {quote} The use case I'm looking at is adding security policy information to KVs (HBASE-6222), could be either ACLs or visibility labels, both can be handled the same way. There's a 1:1 mapping, so it makes sense to store the policy information in the KV. This also has the nice property of reading in the ACL for free in the same op that reads in the KV. I'm not asking for specifically more than tagging KVs with this specific metadata but, given that tags could be easily made generic enough to support a number of other cases, I think it makes sense to do that. Then security is just one user of something more generally useful, we haven't done something fixed for security's sake only. Adding tag support to the encoders might be the right answer. Would we still have the trouble of teaching KeyValue about where in the bytebuffers coming out of the encoder the tag data resides? Any thoughts on how we might distinguish a KV with tags from one without? Maybe we don't, we just have the encoder add the discovered tag data to the KV by way of an API that adds out of band metadata to the KV's in memory representation? And likewise add tags to the blocks beyond the KV itself if they are present? 
Serializing KeyValues - Key: HBASE-7233 URL: https://issues.apache.org/jira/browse/HBASE-7233 Project: HBase Issue Type: Bug Reporter: stack Assignee: stack Priority: Blocker Fix For: 0.96.0 Attachments: 7233sketch.txt, 7233.txt, 7233-v2.txt, 7233v3_encoders.txt, 7233v4_encoders.txt, 7233v5_encoders.txt, 7233v6_encoder.txt Undo KeyValue being a Writable. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7340) Master coprocessor notification for assignmentManager.balance() is inconsistent
[ https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531438#comment-13531438 ] stack commented on HBASE-7340: -- As the issue is cast now, it's fine w/ appropriate priority. The original justification was way wonky, citing a mail exchange though therein there was no call for such a 'task/feature' (and then the 'task/feature' becomes a 'bug' on questioning). All this churn and triviality distracts. There are bigger fish to fry. Master coprocessor notification for assignmentManager.balance() is inconsistent --- Key: HBASE-7340 URL: https://issues.apache.org/jira/browse/HBASE-7340 Project: HBase Issue Type: Improvement Reporter: Ted Yu Assignee: Ted Yu Priority: Trivial Attachments: 7340-v1.txt, 7340-v2.txt I found this issue when reading user discussion which is quoted below. In HMaster.moveRegion(), we have: {code} this.assignmentManager.balance(rp); if (this.cpHost != null) { this.cpHost.postMove(hri, rp.getSource(), rp.getDestination()); } {code} Meaning, user can register master coprocessor which would receive region movement notification. The assignmentManager.balance(plan) call in HMaster.balance() doesn't send out such notification. I think we should enhance the following hook (at line 1335) with list of regions moved so that notification from master is consistent: {code} this.cpHost.postBalance(); {code} Here is excerpt for user discussion: Sometimes user performs compaction after a region is moved (by balancer). We should provide 'hook' which lets user specify what follow-on actions to take after region movement. See discussion on user mailing list under the thread 'How to know it's time for a major compaction?' for background information: http://search-hadoop.com/m/BDx4S1jMjF92subj=How+to+know+it+s+time+for+a+major+compaction+ -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.
[ https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531492#comment-13531492 ] Sergey Shelukhin commented on HBASE-5416: - Hmm, no... let me try with it increased. Should this setting be set in the patch too? This test is going to run with other tests every time, I assume. Improve performance of scans with some kind of filters. --- Key: HBASE-5416 URL: https://issues.apache.org/jira/browse/HBASE-5416 Project: HBase Issue Type: Improvement Components: Filters, Performance, regionserver Affects Versions: 0.90.4 Reporter: Max Lapan Assignee: Max Lapan Fix For: 0.96.0 Attachments: 5416-Filtered_scans_v6.patch, 5416-v5.txt, 5416-v6.txt, Filtered_scans.patch, Filtered_scans_v2.patch, Filtered_scans_v3.patch, Filtered_scans_v4.patch, Filtered_scans_v5.1.patch, Filtered_scans_v5.patch, Filtered_scans_v7.patch When a scan is performed, the whole row is loaded into the result list, after which the filter (if one exists) is applied to decide whether the row is needed. But when the scan covers several CFs and the filter checks only data from a subset of them, data from the CFs not checked by the filter is not needed at the filter stage, only once we have decided to include the current row. In such cases we can significantly reduce the amount of IO performed by the scan by loading only the values actually checked by the filter. For example, we have two CFs: flags and snap. Flags is quite small (a bunch of megabytes) and is used to filter large entries from snap. Snap is very large (10s of GB) and is quite costly to scan. If we need only the rows with some flag specified, we use SingleColumnValueFilter to limit the result to a small subset of the region. But the current implementation loads both CFs to perform the scan, when only a small subset is needed. The attached patch adds one routine to the Filter interface to allow a filter to specify which CFs are needed for its operation. 
In HRegion, we separate all scanners into two groups: those needed by the filter and the rest (joined). When a new row is considered, only the needed data is loaded and the filter applied; only if the filter accepts the row is the rest of the data loaded. On our data, this speeds up such scans 30-50 times. Also, this gives us a way to better normalize the data into separate columns by optimizing the scans performed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
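The two-phase scan described in this issue can be sketched with a self-contained toy (illustrative only; the loadRow/scan helpers and the flags/snap maps are hypothetical stand-ins for HRegion's scanners, not the patch's code). The essential "flags" family is consulted first, and the expensive "snap" family is fetched only for rows the filter accepts:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class EssentialFamilyScan {
    public static int heavyLoads = 0; // counts loads of the large "snap" CF

    static Map<String, String> loadRow(Map<String, Map<String, String>> region,
                                       String row, List<String> families) {
        Map<String, String> out = new HashMap<>();
        for (String cf : families) {
            if (cf.equals("snap")) heavyLoads++;   // track expensive IO
            String v = region.get(row).get(cf);
            if (v != null) out.put(cf, v);
        }
        return out;
    }

    public static List<String> scan(Map<String, Map<String, String>> region,
                                    Predicate<String> flagFilter) {
        List<String> results = new ArrayList<>();
        for (String row : region.keySet()) {
            // Phase 1: load only the small, essential "flags" family.
            Map<String, String> essential = loadRow(region, row, List.of("flags"));
            if (!flagFilter.test(essential.get("flags"))) continue;
            // Phase 2: the row passed the filter, so load the large family too.
            results.add(loadRow(region, row, List.of("snap")).get("snap"));
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> region = Map.of(
            "r1", Map.of("flags", "keep", "snap", "big1"),
            "r2", Map.of("flags", "skip", "snap", "big2"));
        List<String> hits = scan(region, "keep"::equals);
        System.out.println(hits.size() + " row(s), " + heavyLoads + " heavy load(s)");
    }
}
```

Only accepted rows ever touch the large family, which is the source of the reported 30-50x speedup when the filter rejects most rows.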
[jira] [Commented] (HBASE-7236) add per-table/per-cf configuration via metadata
[ https://issues.apache.org/jira/browse/HBASE-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531494#comment-13531494 ] Enis Soztutar commented on HBASE-7236: -- bq. Actually, these are not part of data model, this is config. Well, TTL and MIN_VERSIONS are actually part of the data model in that, together with the timestamp in the KV, they define the semantics of the data lifetime. Compactions / compression / data block encoding are pure configuration that doesn't affect the semantics, but rather performance/data size. bq. I don't see a reason not to separate all of this. Separating makes sense. add per-table/per-cf configuration via metadata --- Key: HBASE-7236 URL: https://issues.apache.org/jira/browse/HBASE-7236 Project: HBase Issue Type: New Feature Components: Compaction Affects Versions: 0.96.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-7236-PROTOTYPE.patch, HBASE-7236-PROTOTYPE.patch, HBASE-7236-PROTOTYPE-v1.patch, HBASE-7236-v0.patch, HBASE-7236-v1.patch, HBASE-7236-v2.patch, HBASE-7236-v3.patch Regardless of the compaction policy, it makes sense to have separate compaction configuration for different tables and column families, as their access patterns and workloads can be different. In particular, for the tiered compactions being ported from the 0.89-fb branch, it is necessary in order to use them properly. We might want to add support for compaction configuration via metadata on the table/cf. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7340) Master coprocessor notification for assignmentManager.balance() is inconsistent
[ https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531495#comment-13531495 ] Hadoop QA commented on HBASE-7340: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12560837/7340-v2.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 104 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 23 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3533//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3533//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3533//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3533//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3533//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3533//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3533//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3533//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3533//console This message is automatically generated. Master coprocessor notification for assignmentManager.balance() is inconsistent --- Key: HBASE-7340 URL: https://issues.apache.org/jira/browse/HBASE-7340 Project: HBase Issue Type: Improvement Reporter: Ted Yu Assignee: Ted Yu Priority: Trivial Attachments: 7340-v1.txt, 7340-v2.txt I found this issue when reading user discussion which is quoted below. In HMaster.moveRegion(), we have: {code} this.assignmentManager.balance(rp); if (this.cpHost != null) { this.cpHost.postMove(hri, rp.getSource(), rp.getDestination()); } {code} Meaning, user can register master coprocessor which would receive region movement notification. The assignmentManager.balance(plan) call in HMaster.balance() doesn't send out such notification. 
I think we should enhance the following hook (at line 1335) with list of regions moved so that notification from master is consistent: {code} this.cpHost.postBalance(); {code} Here is excerpt for user discussion: Sometimes user performs compaction after a region is moved (by balancer). We should provide 'hook' which lets user specify what follow-on actions to take after region movement. See discussion on user mailing list under the thread 'How to know it's time for a major compaction?' for background information: http://search-hadoop.com/m/BDx4S1jMjF92subj=How+to+know+it+s+time+for+a+major+compaction+ -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7336) HFileBlock.readAtOffset does not work well with multiple threads
[ https://issues.apache.org/jira/browse/HBASE-7336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531500#comment-13531500 ] Hudson commented on HBASE-7336: --- Integrated in HBase-0.94 #625 (See [https://builds.apache.org/job/HBase-0.94/625/]) HBASE-7336 HFileBlock.readAtOffset does not work well with multiple threads (Revision 1421439) Result = SUCCESS larsh : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java HFileBlock.readAtOffset does not work well with multiple threads Key: HBASE-7336 URL: https://issues.apache.org/jira/browse/HBASE-7336 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Critical Fix For: 0.96.0, 0.94.4 Attachments: 7336-0.94.txt, 7336-0.96.txt HBase grinds to a halt when many threads scan along the same set of blocks and neither read short circuit is nor block caching is enabled for the dfs client ... disabling the block cache makes sense on very large scans. It turns out that synchronizing in istream in HFileBlock.readAtOffset is the culprit. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
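The contention described in this issue comes from many reader threads serializing on one shared stream cursor: seek() followed by read() is not atomic, so both calls must happen under a lock. A positional read carries its offset with the call and needs no shared cursor. The sketch below is an illustrative model over an in-memory byte array, not HBase or HDFS code (HDFS exposes the positional form via FSDataInputStream's read(position, buf, offset, length)):

```java
public class PositionalRead {
    static final byte[] FILE = new byte[1024];
    static { for (int i = 0; i < FILE.length; i++) FILE[i] = (byte) i; }

    // Stateful reader: seek and read share one cursor, so concurrent callers
    // must hold a lock across both steps -- the contention point the issue
    // found in HFileBlock.readAtOffset.
    public static class StatefulStream {
        private long pos;
        public synchronized int readAt(long offset, byte[] buf) {
            pos = offset;                                    // seek
            int n = Math.min(buf.length, FILE.length - (int) pos);
            System.arraycopy(FILE, (int) pos, buf, 0, n);    // read
            return n;
        }
    }

    // Positional reader: the offset travels with the call and no mutable
    // cursor is shared, so threads reading different blocks never serialize.
    public static int pread(long offset, byte[] buf) {
        int n = Math.min(buf.length, FILE.length - (int) offset);
        System.arraycopy(FILE, (int) offset, buf, 0, n);
        return n;
    }

    public static void main(String[] args) {
        byte[] buf = new byte[4];
        int n = pread(10, buf);
        System.out.println("read " + n + " bytes at offset 10 without a shared lock");
    }
}
```

When the OS block cache or short-circuit reads are unavailable, every scan thread pays the full read latency inside that lock, which is why the synchronized stateful path grinds concurrent scans to a halt.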
[jira] [Commented] (HBASE-7341) Deprecate RowLocks in 0.94
[ https://issues.apache.org/jira/browse/HBASE-7341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531501#comment-13531501 ]

Hudson commented on HBASE-7341:
-------------------------------

Integrated in HBase-0.94 #625 (See [https://builds.apache.org/job/HBase-0.94/625/])
HBASE-7341 Deprecate RowLocks in 0.94 (Revision 1421447)

Result = SUCCESS
gchanan :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/UnknownRowLockException.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/Delete.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/Get.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/HTablePool.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/Increment.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/Mutation.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/Put.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/RowLock.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/rest/client/RemoteHTable.java


Deprecate RowLocks in 0.94
--------------------------

Key: HBASE-7341
URL: https://issues.apache.org/jira/browse/HBASE-7341
Project: HBase
Issue Type: Task
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Minor
Fix For: 0.94.4
Attachments: HBASE-7341.patch

Since we are removing support in 0.96 (see HBASE-7315), we should deprecate in 0.94.
[jira] [Commented] (HBASE-7343) Fix flaky condition for TestDrainingServer
[ https://issues.apache.org/jira/browse/HBASE-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531502#comment-13531502 ]

Hudson commented on HBASE-7343:
-------------------------------

Integrated in HBase-0.94 #625 (See [https://builds.apache.org/job/HBase-0.94/625/])
HBASE-7343 Fix flaky condition for TestDrainingServer (Himanshu) (Revision 1421455)

Result = SUCCESS
jxiang :
Files :
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/TestDrainingServer.java


Fix flaky condition for TestDrainingServer
------------------------------------------

Key: HBASE-7343
URL: https://issues.apache.org/jira/browse/HBASE-7343
Project: HBase
Issue Type: Bug
Components: test
Affects Versions: 0.94.3
Reporter: Himanshu Vashishtha
Assignee: Himanshu Vashishtha
Priority: Minor
Fix For: 0.94.4
Attachments: HBASE-7343.patch

The assert statement in setUpBeforeClass() may fail in case the region distribution is not even (a particular region server has 0 regions).

{code}
junit.framework.AssertionFailedError
 at junit.framework.Assert.fail(Assert.java:48)
 at junit.framework.Assert.assertTrue(Assert.java:20)
 at junit.framework.Assert.assertFalse(Assert.java:34)
 at junit.framework.Assert.assertFalse(Assert.java:41)
 at org.apache.hadoop.hbase.TestDrainingServer.setUpBeforeClass(TestDrainingServer.java:83)
{code}

This is already fixed in trunk by HBASE-5992, but since that is a bigger change and depends on HBASE-5877, this jira fixes the issue directly instead of backporting 5992.
[jira] [Commented] (HBASE-7336) HFileBlock.readAtOffset does not work well with multiple threads
[ https://issues.apache.org/jira/browse/HBASE-7336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531506#comment-13531506 ]

Hudson commented on HBASE-7336:
-------------------------------

Integrated in HBase-TRUNK #3618 (See [https://builds.apache.org/job/HBase-TRUNK/3618/])
HBASE-7336 HFileBlock.readAtOffset does not work well with multiple threads (Revision 1421440)

Result = FAILURE
larsh :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java


HFileBlock.readAtOffset does not work well with multiple threads
----------------------------------------------------------------

Key: HBASE-7336
URL: https://issues.apache.org/jira/browse/HBASE-7336
Project: HBase
Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Critical
Fix For: 0.96.0, 0.94.4
Attachments: 7336-0.94.txt, 7336-0.96.txt

HBase grinds to a halt when many threads scan along the same set of blocks and neither read short circuit nor block caching is enabled for the dfs client ... disabling the block cache makes sense on very large scans. It turns out that synchronizing on istream in HFileBlock.readAtOffset is the culprit.
[jira] [Commented] (HBASE-7338) Fix flaky condition for org.apache.hadoop.hbase.TestRegionRebalancing.testRebalanceOnRegionServerNumberChange
[ https://issues.apache.org/jira/browse/HBASE-7338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531507#comment-13531507 ]

Hudson commented on HBASE-7338:
-------------------------------

Integrated in HBase-TRUNK #3618 (See [https://builds.apache.org/job/HBase-TRUNK/3618/])
HBASE-7338 Fix flaky condition for org.apache.hadoop.hbase.TestRegionRebalancing.testRebalanceOnRegionServerNumberChange (Himanshu) (Revision 1421444)

Result = FAILURE
jxiang :
Files :
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/TestRegionRebalancing.java


Fix flaky condition for org.apache.hadoop.hbase.TestRegionRebalancing.testRebalanceOnRegionServerNumberChange
--------------------------------------------------------------------------------------------------------------

Key: HBASE-7338
URL: https://issues.apache.org/jira/browse/HBASE-7338
Project: HBase
Issue Type: Bug
Components: test
Affects Versions: 0.94.3, 0.96.0
Reporter: Himanshu Vashishtha
Assignee: Himanshu Vashishtha
Priority: Minor
Fix For: 0.96.0
Attachments: HBASE-7338.patch

The balancer doesn't run when a region is in transition. The check that confirms all regions are assigned looks for a region count of 22, whereas the total number of regions is 27. This may result in a failure:

{code}
java.lang.AssertionError: After 5 attempts, region assignments were not balanced.
 at org.junit.Assert.fail(Assert.java:93)
 at org.apache.hadoop.hbase.TestRegionRebalancing.assertRegionsAreBalanced(TestRegionRebalancing.java:203)
 at org.apache.hadoop.hbase.TestRegionRebalancing.testRebalanceOnRegionServerNumberChange(TestRegionRebalancing.java:123)
...
2012-12-11 13:47:02,231 INFO [pool-1-thread-1] hbase.TestRegionRebalancing(120): Added fourth server=p0118.mtv.cloudera.com,44414,1355262422083
2012-12-11 13:47:02,231 INFO [RegionServer:3;p0118.mtv.cloudera.com,44414,1355262422083] regionserver.HRegionServer(3769): Registered RegionServer MXBean
2012-12-11 13:47:02,231 DEBUG [pool-1-thread-1] master.HMaster(987): Not running balancer because 1 region(s) in transition: {c786446fb2542f190e937057cdc79d9d=test,kkk,1355262401365.c786446fb2542f190e937057cdc79d9d. state=OPENING, ts=1355262421037, server=p0118.mtv.cloudera.com,54281,1355262419765}
2012-12-11 13:47:02,232 DEBUG [pool-1-thread-1] hbase.TestRegionRebalancing(165): There are 4 servers and 26 regions. Load Average: 13.0 low border: 9, up border: 16; attempt: 0
2012-12-11 13:47:02,232 DEBUG [pool-1-thread-1] hbase.TestRegionRebalancing(171): p0118.mtv.cloudera.com,51590,1355262395329 Avg: 13.0 actual: 11
2012-12-11 13:47:02,232 DEBUG [pool-1-thread-1] hbase.TestRegionRebalancing(171): p0118.mtv.cloudera.com,52987,1355262407916 Avg: 13.0 actual: 15
2012-12-11 13:47:02,233 DEBUG [pool-1-thread-1] hbase.TestRegionRebalancing(171): p0118.mtv.cloudera.com,48044,1355262421787 Avg: 13.0 actual: 0
2012-12-11 13:47:02,233 DEBUG [pool-1-thread-1] hbase.TestRegionRebalancing(179): p0118.mtv.cloudera.com,48044,1355262421787 Isn't balanced!!! Avg: 13.0 actual: 0 slop: 0.2
2012-12-11 13:47:12,233 DEBUG [pool-1-thread-1] master.HMaster(987): Not running balancer because 1 region(s) in transition:
{code}
[jira] [Updated] (HBASE-7340) Master coprocessor notification for assignmentManager.balance() is inconsistent
[ https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-7340:
--------------------------

Resolution: Fixed
Fix Version/s: 0.96.0
Hadoop Flags: Reviewed
Status: Resolved (was: Patch Available)

Integrated to trunk. Thanks for the review, Andy.

Master coprocessor notification for assignmentManager.balance() is inconsistent
--------------------------------------------------------------------------------

Key: HBASE-7340
URL: https://issues.apache.org/jira/browse/HBASE-7340
Project: HBase
Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Trivial
Fix For: 0.96.0
Attachments: 7340-v1.txt, 7340-v2.txt

I found this issue when reading the user discussion quoted below. In HMaster.moveRegion(), we have:

{code}
this.assignmentManager.balance(rp);
if (this.cpHost != null) {
  this.cpHost.postMove(hri, rp.getSource(), rp.getDestination());
}
{code}

Meaning, a user can register a master coprocessor which receives region movement notifications. The assignmentManager.balance(plan) call in HMaster.balance() doesn't send out such a notification. I think we should enhance the following hook (at line 1335) with the list of regions moved so that notification from the master is consistent:

{code}
this.cpHost.postBalance();
{code}

Here is an excerpt from the user discussion: sometimes a user performs compaction after a region is moved (by the balancer). We should provide a 'hook' which lets the user specify what follow-on actions to take after region movement. See the discussion on the user mailing list under the thread 'How to know it's time for a major compaction?' for background information: http://search-hadoop.com/m/BDx4S1jMjF92subj=How+to+know+it+s+time+for+a+major+compaction+
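As a hypothetical sketch of the proposal (the names `RegionPlan`, `MasterObserver`, `postMove`, and `postBalance` mirror HBase's coprocessor API, but this is not the actual patch), passing the executed moves to the post-balance hook lets an observer react identically whether a region moved via moveRegion() or via the balancer:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Illustrative stand-in for the proposed change: postBalance() carries
 * the list of region moves the balancer executed, so an observer can
 * take the same follow-on action (e.g. trigger a major compaction)
 * that it already takes in postMove().
 */
public class PostBalanceSketch {
  // Stand-in for HBase's RegionPlan.
  static final class RegionPlan {
    final String region, source, destination;
    RegionPlan(String r, String s, String d) { region = r; source = s; destination = d; }
  }

  // Stand-in for HBase's MasterObserver interface.
  interface MasterObserver {
    void postMove(String region, String source, String destination);
    void postBalance(List<RegionPlan> plans); // proposed: carries the moves
  }

  static class CompactAfterMove implements MasterObserver {
    final List<String> compacted = new ArrayList<>();
    public void postMove(String region, String src, String dst) {
      compacted.add(region); // placeholder for "request major compaction"
    }
    public void postBalance(List<RegionPlan> plans) {
      // Same notification path for balancer-driven moves.
      for (RegionPlan p : plans) postMove(p.region, p.source, p.destination);
    }
  }

  public static void main(String[] args) {
    CompactAfterMove cp = new CompactAfterMove();
    // Suppose the balancer executed two moves; the hook now sees both.
    cp.postBalance(Arrays.asList(
        new RegionPlan("region-a", "rs1", "rs2"),
        new RegionPlan("region-b", "rs2", "rs3")));
    System.out.println(cp.compacted); // [region-a, region-b]
  }
}
```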
[jira] [Commented] (HBASE-5898) Consider double-checked locking for block cache lock
[ https://issues.apache.org/jira/browse/HBASE-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531533#comment-13531533 ]

Shrijeet Paliwal commented on HBASE-5898:
-----------------------------------------

After putting UseMembar in GC opts we have not seen the previously reported deadlock. Just wanted to update.

Consider double-checked locking for block cache lock
-----------------------------------------------------

Key: HBASE-5898
URL: https://issues.apache.org/jira/browse/HBASE-5898
Project: HBase
Issue Type: Improvement
Components: Performance
Affects Versions: 0.94.1
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Critical
Fix For: 0.94.3, 0.96.0
Attachments: 5898-0.94.txt, 5898-TestBlocksRead.txt, 5898-v2.txt, 5898-v3.txt, 5898-v4.txt, 5898-v4.txt, HBASE-5898-0.patch, HBASE-5898-1.patch, HBASE-5898-1.patch, hbase-5898.txt

Running a workload with a high query rate against a dataset that fits in cache, I saw a lot of CPU being used in IdLock.getLockEntry, being called by HFileReaderV2.readBlock. Even though it was all cache hits, it was wasting a lot of CPU doing lock management here. I wrote a quick patch to switch to a double-checked locking and it improved throughput substantially for this workload.
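A minimal sketch of the double-checked pattern described above (not the actual HBASE-5898 patch, which works against IdLock and the block cache): probe the cache first without taking the per-key lock, and only lock on a miss to prevent duplicate loads. Cache hits then skip lock management entirely, which is where the wasted CPU was observed.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Double-checked cache lookup: the fast path is a lock-free read, and
 * the per-key lock is only taken on a miss. The second check inside
 * the lock handles the race where another thread loaded the value
 * while we were waiting.
 */
public class DoubleCheckedCache<K, V> {
  private final Map<K, V> cache = new ConcurrentHashMap<>();
  private final Map<K, Object> locks = new ConcurrentHashMap<>();

  interface Loader<K, V> { V load(K key); }

  V get(K key, Loader<K, V> loader) {
    V v = cache.get(key);               // first check: lock-free fast path
    if (v != null) return v;
    Object lock = locks.computeIfAbsent(key, k -> new Object());
    synchronized (lock) {               // only misses pay for the lock
      v = cache.get(key);               // second check: may have been
      if (v == null) {                  // loaded while we waited
        v = loader.load(key);
        cache.put(key, v);
      }
    }
    return v;
  }

  public static void main(String[] args) {
    DoubleCheckedCache<String, String> c = new DoubleCheckedCache<>();
    int[] loads = {0};
    Loader<String, String> loader = k -> { loads[0]++; return "block:" + k; };
    c.get("b1", loader);
    c.get("b1", loader);                // cache hit: loader not invoked again
    System.out.println(c.get("b1", loader) + " loads=" + loads[0]);
  }
}
```

The caveat discussed in the comments (UseMembar) is why double-checked locking needs care: the pattern is only safe when the cache itself provides the necessary memory-visibility guarantees, as ConcurrentHashMap does here.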
[jira] [Commented] (HBASE-7299) TestMultiParallel fails intermittently in trunk builds
[ https://issues.apache.org/jira/browse/HBASE-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531543#comment-13531543 ]

Ted Yu commented on HBASE-7299:
-------------------------------

I looped TestMultiParallel 9 times in the 0.94 branch and didn't see a test failure. Looks like the failure is unique to trunk.

TestMultiParallel fails intermittently in trunk builds
------------------------------------------------------

Key: HBASE-7299
URL: https://issues.apache.org/jira/browse/HBASE-7299
Project: HBase
Issue Type: Bug
Reporter: Ted Yu
Priority: Critical
Fix For: 0.96.0

From trunk build #3598:

{code}
testFlushCommitsNoAbort(org.apache.hadoop.hbase.client.TestMultiParallel): Count of regions=8
{code}

It failed in 3595 as well:

{code}
java.lang.AssertionError: Server count=2, abort=true expected:<1> but was:<2>
 at org.junit.Assert.fail(Assert.java:93)
 at org.junit.Assert.failNotEquals(Assert.java:647)
 at org.junit.Assert.assertEquals(Assert.java:128)
 at org.junit.Assert.assertEquals(Assert.java:472)
 at org.apache.hadoop.hbase.client.TestMultiParallel.doTestFlushCommits(TestMultiParallel.java:267)
 at org.apache.hadoop.hbase.client.TestMultiParallel.testFlushCommitsWithAbort(TestMultiParallel.java:226)
{code}
[jira] [Commented] (HBASE-5898) Consider double-checked locking for block cache lock
[ https://issues.apache.org/jira/browse/HBASE-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531555#comment-13531555 ]

Lars Hofhansl commented on HBASE-5898:
--------------------------------------

Thanks Shrijeet. How reliably had you seen this before (once a day, once a month, etc.)? If this really causes issues we should:
# ship with -XX:+UseMembar by default in hbase-env.sh
# document that this must be set

Are we confident enough in this to do that?

Consider double-checked locking for block cache lock
-----------------------------------------------------

Key: HBASE-5898
URL: https://issues.apache.org/jira/browse/HBASE-5898
Project: HBase
Issue Type: Improvement
Components: Performance
Affects Versions: 0.94.1
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Critical
Fix For: 0.94.3, 0.96.0
Attachments: 5898-0.94.txt, 5898-TestBlocksRead.txt, 5898-v2.txt, 5898-v3.txt, 5898-v4.txt, 5898-v4.txt, HBASE-5898-0.patch, HBASE-5898-1.patch, HBASE-5898-1.patch, hbase-5898.txt

Running a workload with a high query rate against a dataset that fits in cache, I saw a lot of CPU being used in IdLock.getLockEntry, being called by HFileReaderV2.readBlock. Even though it was all cache hits, it was wasting a lot of CPU doing lock management here. I wrote a quick patch to switch to a double-checked locking and it improved throughput substantially for this workload.