[jira] [Created] (HBASE-13693) [HBase MOB] Mob files are not encrypting.
Y. SREENIVASULU REDDY created HBASE-13693: - Summary: [HBase MOB] Mob files are not encrypting. Key: HBASE-13693 URL: https://issues.apache.org/jira/browse/HBASE-13693 Project: HBase Issue Type: Bug Components: mob Affects Versions: hbase-11339 Reporter: Y. SREENIVASULU REDDY Fix For: hbase-11339
Mob HFiles are not encrypted. Steps to reproduce:
1. Create a table with a MOB-enabled column family and enable AES encryption for that column family.
2. Insert MOB data into the table.
3. Flush the table.
4. Check that HFiles for the MOB data are created.
5. Check whether the HFiles in HDFS are encrypted, using the hfile tool.
hfile tool output for the mob reference hfile meta:
{code}
Block index size as per heapsize: 392
reader=/hbase/data/default/mobTest/1587e00c3e257969c48d9872994ce57c/mobcf/8c33ab9e8201449e9ac709eb9e4263d6, Trailer: fileinfoOffset=527, loadOnOpenDataOffset=353, dataIndexCount=1, metaIndexCount=0, totalUncomressedBytes=5941, entryCount=9, compressionCodec=GZ, uncompressedDataIndexSize=34, numDataIndexLevels=1, firstDataBlockOffset=0, lastDataBlockOffset=0, comparatorClassName=org.apache.hadoop.hbase.KeyValue$KeyComparator, encryptionKey=PRESENT, majorVersion=3, minorVersion=0
{code}
hfile tool output for the mob hfile meta:
{code}
Block index size as per heapsize: 872
reader=/hbase/mobdir/data/default/mobTest/46844d8b9f699e175a4d7bd57848c576/mobcf/d41d8cd98f00b204e9800998ecf8427e20150512bf18fa62a98c40d7bd6e810f790c6291, Trailer: fileinfoOffset=1018180, loadOnOpenDataOffset=1017959, dataIndexCount=9, metaIndexCount=0, totalUncomressedBytes=1552619, entryCount=9, compressionCodec=GZ, uncompressedDataIndexSize=266, numDataIndexLevels=1, firstDataBlockOffset=0, lastDataBlockOffset=904852, comparatorClassName=org.apache.hadoop.hbase.KeyValue$KeyComparator, encryptionKey=NONE, majorVersion=3, minorVersion=0
{code}
Note that the reference hfile reports encryptionKey=PRESENT while the mob hfile reports encryptionKey=NONE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14545026#comment-14545026 ] Hadoop QA commented on HBASE-11927: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12733061/HBASE-11927-v8.patch against master branch at commit 9ba7337ac82d13b22a1b0c40edaba7873c0bd795. ATTACHMENT ID: 12733061 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 12 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: {color:red}-1 core zombie tests{color}. 
There are 1 zombie test(s): at org.apache.activemq.transport.mqtt.MQTTTest.testPacketIdGeneratorNonCleanSession(MQTTTest.java:859) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14055//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14055//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14055//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14055//console This message is automatically generated. Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C) Key: HBASE-11927 URL: https://issues.apache.org/jira/browse/HBASE-11927 Project: HBase Issue Type: Bug Reporter: stack Assignee: Apekshit Sharma Attachments: HBASE-11927-v1.patch, HBASE-11927-v2.patch, HBASE-11927-v4.patch, HBASE-11927-v5.patch, HBASE-11927-v6.patch, HBASE-11927-v7.patch, HBASE-11927-v8.patch, HBASE-11927.patch, after-compact-2%.svg, after-randomWrite1M-0.5%.svg, before-compact-22%.svg, before-randomWrite1M-5%.svg, c2021.crc2.svg, c2021.write.2.svg, c2021.zip.svg, crc32ct.svg Up in hadoop they have this change. Let me publish some graphs to show that it makes a difference (CRC is a massive amount of our CPU usage in my profiling of an upload because of compacting, flushing, etc.). We should also make use of native CRCings -- especially the 2.6 HDFS-6865 and ilk -- in hbase but that is another issue for now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HBASE-13693) [HBase MOB] Mob files are not encrypting.
[ https://issues.apache.org/jira/browse/HBASE-13693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Jindal reassigned HBASE-13693: --- Assignee: Ashutosh Jindal [HBase MOB] Mob files are not encrypting. - Key: HBASE-13693 URL: https://issues.apache.org/jira/browse/HBASE-13693 Project: HBase Issue Type: Bug Components: mob Affects Versions: hbase-11339 Reporter: Y. SREENIVASULU REDDY Assignee: Ashutosh Jindal Fix For: hbase-11339
Mob HFiles are not encrypted. Steps to reproduce:
1. Create a table with a MOB-enabled column family and enable AES encryption for that column family.
2. Insert MOB data into the table.
3. Flush the table.
4. Check that HFiles for the MOB data are created.
5. Check whether the HFiles in HDFS are encrypted, using the hfile tool.
hfile tool output for the mob reference hfile meta:
{code}
Block index size as per heapsize: 392
reader=/hbase/data/default/mobTest/1587e00c3e257969c48d9872994ce57c/mobcf/8c33ab9e8201449e9ac709eb9e4263d6, Trailer: fileinfoOffset=527, loadOnOpenDataOffset=353, dataIndexCount=1, metaIndexCount=0, totalUncomressedBytes=5941, entryCount=9, compressionCodec=GZ, uncompressedDataIndexSize=34, numDataIndexLevels=1, firstDataBlockOffset=0, lastDataBlockOffset=0, comparatorClassName=org.apache.hadoop.hbase.KeyValue$KeyComparator, encryptionKey=PRESENT, majorVersion=3, minorVersion=0
{code}
hfile tool output for the mob hfile meta:
{code}
Block index size as per heapsize: 872
reader=/hbase/mobdir/data/default/mobTest/46844d8b9f699e175a4d7bd57848c576/mobcf/d41d8cd98f00b204e9800998ecf8427e20150512bf18fa62a98c40d7bd6e810f790c6291, Trailer: fileinfoOffset=1018180, loadOnOpenDataOffset=1017959, dataIndexCount=9, metaIndexCount=0, totalUncomressedBytes=1552619, entryCount=9, compressionCodec=GZ, uncompressedDataIndexSize=266, numDataIndexLevels=1, firstDataBlockOffset=0, lastDataBlockOffset=904852, comparatorClassName=org.apache.hadoop.hbase.KeyValue$KeyComparator, encryptionKey=NONE, majorVersion=3, minorVersion=0
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13694) CallQueueSize is incorrectly decremented until the response is sent
[ https://issues.apache.org/jira/browse/HBASE-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Esteban Gutierrez updated HBASE-13694: -- Summary: CallQueueSize is incorrectly decremented until the response is sent (was: CallQueueSize is incorrectly decremented after the response is sent) CallQueueSize is incorrectly decremented until the response is sent --- Key: HBASE-13694 URL: https://issues.apache.org/jira/browse/HBASE-13694 Project: HBase Issue Type: Bug Components: master, regionserver, rpc Affects Versions: 2.0.0, 1.1.0, 1.0.2, 1.2.0 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez We should decrement the CallQueueSize as soon as we no longer need the call around, e.g. after {{RpcServer.CurCall.set(null)}}; otherwise we will only be pushing back other client requests while we send the response back to the client that originated the call. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13562) Rejigger AC tests and expand testing coverage for master/RS ops to include all scope and permission combinations.
[ https://issues.apache.org/jira/browse/HBASE-13562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish Singhi updated HBASE-13562: -- Assignee: Srikanth Srungarapu (was: Ashish Singhi) Rejigger AC tests and expand testing coverage for master/RS ops to include all scope and permission combinations. - Key: HBASE-13562 URL: https://issues.apache.org/jira/browse/HBASE-13562 Project: HBase Issue Type: Improvement Reporter: Srikanth Srungarapu Assignee: Srikanth Srungarapu Attachments: HBASE-13562-v1.patch, HBASE-13562-v2.patch, HBASE-13562.patch, HBASE-13562_v2.patch, sample.patch As of now, the tests in TestAccessController and TestAccessController2 don't cover all the combinations of scope and permissions. Ideally, we should have testing coverage for the entire [ACL matrix|https://hbase.apache.org/book/appendix_acl_matrix.html]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13531) After 4/18/15 merge, flakey failures of TestAcidGuarantees#testMobScanAtomicity
[ https://issues.apache.org/jira/browse/HBASE-13531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14545050#comment-14545050 ] ramkrishna.s.vasudevan commented on HBASE-13531: [~jingcheng...@intel.com] So before the patch, what value were we getting - value4? After 4/18/15 merge, flakey failures of TestAcidGuarantees#testMobScanAtomicity --- Key: HBASE-13531 URL: https://issues.apache.org/jira/browse/HBASE-13531 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Affects Versions: hbase-11339 Reporter: Jonathan Hsieh Fix For: hbase-11339 Attachments: HBASE-13531.diff After the merge of master from 4/18/15 with hbase-11339 branch, we encountered some atomicity violations. We want to fix them before calling for a merge to trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13694) CallQueueSize is incorrectly decremented until the response is sent
[ https://issues.apache.org/jira/browse/HBASE-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Esteban Gutierrez updated HBASE-13694: -- Description: We should decrement the CallQueueSize as soon as we no longer need the call around, e.g. after {{RpcServer.CurCall.set(null)}} otherwise we will be only pushing back other client requests while we send the response back to the client that originated the call. was: We should decrement the CallQueueSize as soon as we no longer need the call around, e.g. after {{RpcServer.CurCall.set(null)}} otherwise we will be only pushing back to other client requests while we send the response back to the client that original caller. CallQueueSize is incorrectly decremented until the response is sent --- Key: HBASE-13694 URL: https://issues.apache.org/jira/browse/HBASE-13694 Project: HBase Issue Type: Bug Components: master, regionserver, rpc Affects Versions: 2.0.0, 1.1.0, 1.0.2, 1.2.0 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Attachments: 0001-HBASE-13694-CallQueueSize-is-incorrectly-decremented.patch We should decrement the CallQueueSize as soon as we no longer need the call around, e.g. after {{RpcServer.CurCall.set(null)}} otherwise we will be only pushing back other client requests while we send the response back to the client that originated the call. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13694) CallQueueSize is incorrectly decremented until the response is sent
[ https://issues.apache.org/jira/browse/HBASE-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Esteban Gutierrez updated HBASE-13694: -- Attachment: 0001-HBASE-13694-CallQueueSize-is-incorrectly-decremented.patch CallQueueSize is incorrectly decremented until the response is sent --- Key: HBASE-13694 URL: https://issues.apache.org/jira/browse/HBASE-13694 Project: HBase Issue Type: Bug Components: master, regionserver, rpc Affects Versions: 2.0.0, 1.1.0, 1.0.2, 1.2.0 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Attachments: 0001-HBASE-13694-CallQueueSize-is-incorrectly-decremented.patch We should decrement the CallQueueSize as soon as we no longer need the call around, e.g. after {{RpcServer.CurCall.set(null)}}; otherwise we will only be pushing back other client requests while we send the response back to the client that originated the call. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13658) Improve the test run time for TestAccessController class
[ https://issues.apache.org/jira/browse/HBASE-13658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14545184#comment-14545184 ] Ashish Singhi commented on HBASE-13658: --- [~andrew.purt...@gmail.com] can you review the v1 patch? Improve the test run time for TestAccessController class Key: HBASE-13658 URL: https://issues.apache.org/jira/browse/HBASE-13658 Project: HBase Issue Type: Sub-task Components: test Reporter: Ashish Singhi Assignee: Ashish Singhi Attachments: 13658.patch, HBASE-13658-v1.patch, HBASE-13658.patch Improve the test run time for TestAccessController class -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13686) Fail to limit rate in RateLimiter
[ https://issues.apache.org/jira/browse/HBASE-13686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14544999#comment-14544999 ] Ashish Singhi commented on HBASE-13686: --- [~zghaobac] Thanks for the test case. As per the current code implementation of {{RateLimiter}}, what I have understood is: if your limit is set to 10 resources/sec, then you will be able to consume one resource every 0.1 sec. So if we modify your test case as below, it will always pass.
{code}
@Test
public void testLimiterBySmallerRate() throws InterruptedException {
  RateLimiter limiter = new RateLimiter();
  // set limit to 10 resources per second
  limiter.set(10, TimeUnit.SECONDS);
  long lastTs = System.currentTimeMillis();
  int count = 0;
  // control the test count
  while ((count++) < 100) {
    // test will get 3 resources per 0.5 sec. so it will get 6 resources per sec.
    Thread.sleep(125);
    // for (int i = 0; i < 1; i++) {
    long nowTs = System.currentTimeMillis();
    // 6 resources/sec limit, so limiter.canExecute(nowTs, lastTs) should be true
    assertEquals(true, limiter.canExecute(nowTs, lastTs));
    limiter.consume();
    lastTs = nowTs;
    // }
  }
}
{code}
Fail to limit rate in RateLimiter - Key: HBASE-13686 URL: https://issues.apache.org/jira/browse/HBASE-13686 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 1.1.0 Reporter: Guanghao Zhang Priority: Minor While using the patch in HBASE-11598, I found that RateLimiter can't limit the rate correctly.
{code}
/**
 * given the time interval, are there enough available resources to allow execution?
 * @param now the current timestamp
 * @param lastTs the timestamp of the last update
 * @param amount the number of required resources
 * @return true if there are enough available resources, otherwise false
 */
public synchronized boolean canExecute(final long now, final long lastTs, final long amount) {
  return avail >= amount ? true : refill(now, lastTs) >= amount;
}
{code}
When avail >= amount, avail isn't refilled. But by the next call to canExecute, lastTs may have been updated, so avail loses some of the time it could have used to refill. Even if we request at a smaller rate than the limit, canExecute can return false. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
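The failure mode described in this issue can be reproduced with deterministic timestamps. The sketch below is a simplified stand-in for the {{RateLimiter}} from HBASE-11598, not the actual HBase class; the class name, constructor, and refill formula are assumptions for illustration. It shows how skipping the refill when avail >= amount discards elapsed-time credit once the caller advances lastTs, so a client requesting only 6 resources/sec can still be rejected by a 10 resources/sec limit.

```java
// Simplified, illustrative stand-in for the RateLimiter discussed above
// (not the actual HBase class; names and refill formula are assumptions).
public class RateLimiterSketch {
    private final long limit;      // resources per interval
    private final long intervalMs; // refill interval in milliseconds
    private long avail;            // currently available resources

    public RateLimiterSketch(long limit, long intervalMs) {
        this.limit = limit;
        this.intervalMs = intervalMs;
        this.avail = limit;
    }

    // Refill proportionally to the time elapsed since lastTs, capped at limit.
    private long refill(long now, long lastTs) {
        avail = Math.min(limit, avail + limit * (now - lastTs) / intervalMs);
        return avail;
    }

    // The pattern under discussion: when avail >= amount we return true
    // WITHOUT refilling, so the elapsed time is wasted once the caller
    // updates lastTs after a successful consume.
    public synchronized boolean canExecute(long now, long lastTs, long amount) {
        return avail >= amount ? true : refill(now, lastTs) >= amount;
    }

    public synchronized void consume(long amount) {
        avail -= amount;
    }

    // Consume 3 resources every 500 ms (6/sec) against a 10/sec limit,
    // updating lastTs after each successful consume as a caller would.
    // Returns true if any request was (incorrectly) rejected.
    public static boolean demonstrateBug() {
        RateLimiterSketch rl = new RateLimiterSketch(10, 1000);
        long lastTs = 0;
        boolean rejected = false;
        for (long t = 500; t <= 2000; t += 500) {
            for (int i = 0; i < 3; i++) {
                if (rl.canExecute(t, lastTs, 1)) {
                    rl.consume(1);
                    lastTs = t;
                } else {
                    rejected = true;
                }
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        // prints "request rejected below limit: true"
        System.out.println("request rejected below limit: " + demonstrateBug());
    }
}
```

Tracing the scenario: avail drains from 10 to 0 without ever refilling (each check sees avail >= 1), and at t=2000 the refill sees now - lastTs = 0, so a request well under the limit is rejected.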
[jira] [Updated] (HBASE-13375) Provide HBase superuser higher priority over other users in the RPC handling
[ https://issues.apache.org/jira/browse/HBASE-13375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Antonov updated HBASE-13375: Status: Open (was: Patch Available) Provide HBase superuser higher priority over other users in the RPC handling Key: HBASE-13375 URL: https://issues.apache.org/jira/browse/HBASE-13375 Project: HBase Issue Type: Improvement Components: rpc Reporter: Devaraj Das Assignee: Mikhail Antonov Fix For: 1.1.1 Attachments: HBASE-13375-v0.patch, HBASE-13375-v1.patch, HBASE-13375-v1.patch, HBASE-13375-v1.patch, HBASE-13375-v2.patch, HBASE-13375-v3.patch, HBASE-13375-v4.patch, HBASE-13375-v5.patch, HBASE-13375-v6.patch HBASE-13351 annotates Master RPCs so that RegionServer RPCs are treated with a higher priority compared to user RPCs (and they are handled by a separate set of handlers, etc.). It may be good to stretch this to users too - hbase superuser (configured via hbase.superuser) gets higher priority over other users in the RPC handling. That way the superuser can always perform administrative operations on the cluster even if all the normal priority handlers are occupied (for example, we had a situation where all the master's handlers were tied up with many simultaneous createTable RPC calls from multiple users and the master wasn't able to perform any operations initiated by the admin). (Discussed this some with [~enis] and [~elserj]). Does this make sense to others? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-13694) CallQueueSize is incorrectly decremented after the response is sent
Esteban Gutierrez created HBASE-13694: - Summary: CallQueueSize is incorrectly decremented after the response is sent Key: HBASE-13694 URL: https://issues.apache.org/jira/browse/HBASE-13694 Project: HBase Issue Type: Bug Components: master, regionserver, rpc Affects Versions: 2.0.0, 1.1.0, 1.0.2, 1.2.0 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez We should decrement the CallQueueSize as soon as we no longer need the call around, e.g. after {{RpcServer.CurCall.set(null)}}; otherwise we will only be pushing back other client requests while we send the response back to the client that originated the call. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13531) After 4/18/15 merge, flakey failures of TestAcidGuarantees#testMobScanAtomicity
[ https://issues.apache.org/jira/browse/HBASE-13531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14545205#comment-14545205 ] Jingcheng Du commented on HBASE-13531: -- bq. May be we can follow the same logic like when the cell seqId is less than smallest readpoint, we can reset it to 0. The smallest readPt is region-related, and mob file compaction cannot work with it, so it's hard to reset the seqId to 0 in mob files. I think there are still issues in mob file compactions if we enable mvcc in mob files. It's hard to maintain the readPt consistency between the ref cell and the mob cell. I have to think about this again. After 4/18/15 merge, flakey failures of TestAcidGuarantees#testMobScanAtomicity --- Key: HBASE-13531 URL: https://issues.apache.org/jira/browse/HBASE-13531 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Affects Versions: hbase-11339 Reporter: Jonathan Hsieh Fix For: hbase-11339 Attachments: HBASE-13531-V2.diff, HBASE-13531.diff After the merge of master from 4/18/15 with hbase-11339 branch, we encounter some atomicity violations. We want to fix before calling merge to trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13531) After 4/18/15 merge, flakey failures of TestAcidGuarantees#testMobScanAtomicity
[ https://issues.apache.org/jira/browse/HBASE-13531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14545057#comment-14545057 ] Jingcheng Du commented on HBASE-13531: -- Yes, it's value4. After 4/18/15 merge, flakey failures of TestAcidGuarantees#testMobScanAtomicity --- Key: HBASE-13531 URL: https://issues.apache.org/jira/browse/HBASE-13531 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Affects Versions: hbase-11339 Reporter: Jonathan Hsieh Fix For: hbase-11339 Attachments: HBASE-13531-V2.diff, HBASE-13531.diff After the merge of master from 4/18/15 with hbase-11339 branch, we encounter some atomicity violations. We want to fix before calling merge to trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13531) After 4/18/15 merge, flakey failures of TestAcidGuarantees#testMobScanAtomicity
[ https://issues.apache.org/jira/browse/HBASE-13531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingcheng Du updated HBASE-13531: - Attachment: HBASE-13531-V2.diff Upload the patch V2 to fix the readPt issue. # If there is a matched cell (seqId=readPt), this cell is returned. # If there is no such cell, use the latest matched cell whose seqId is larger than readPt, because we don't reset the seqId in compactions. After 4/18/15 merge, flakey failures of TestAcidGuarantees#testMobScanAtomicity --- Key: HBASE-13531 URL: https://issues.apache.org/jira/browse/HBASE-13531 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Affects Versions: hbase-11339 Reporter: Jonathan Hsieh Fix For: hbase-11339 Attachments: HBASE-13531-V2.diff, HBASE-13531.diff After the merge of master from 4/18/15 with hbase-11339 branch, we encounter some atomicity violations. We want to fix before calling merge to trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
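The selection rule stated in the patch description above can be written as a small function. The sketch below operates over bare sequence ids rather than real cells; it is an illustration of the stated rule only, and the class name, method name, and -1 sentinel are assumptions, not the actual scanner code.

```java
import java.util.List;

// Illustrative version of the rule from the V2 patch description:
// prefer the cell whose seqId equals the read point; otherwise fall back
// to the matched cell with the largest seqId above the read point
// (seqIds are not reset during mob compactions). Returns -1 when no
// candidate matches.
public class MobCellPicker {
    static long pick(List<Long> seqIds, long readPt) {
        long fallback = -1;
        for (long id : seqIds) {
            if (id == readPt) {
                return id; // exact match wins immediately
            }
            if (id > readPt && id > fallback) {
                fallback = id; // remember the latest cell newer than readPt
            }
        }
        return fallback;
    }
}
```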
[jira] [Updated] (HBASE-13694) CallQueueSize is incorrectly decremented until the response is sent
[ https://issues.apache.org/jira/browse/HBASE-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Esteban Gutierrez updated HBASE-13694: -- Status: Patch Available (was: Open) CallQueueSize is incorrectly decremented until the response is sent --- Key: HBASE-13694 URL: https://issues.apache.org/jira/browse/HBASE-13694 Project: HBase Issue Type: Bug Components: master, regionserver, rpc Affects Versions: 2.0.0, 1.1.0, 1.0.2, 1.2.0 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Attachments: 0001-HBASE-13694-CallQueueSize-is-incorrectly-decremented.patch We should decrement the CallQueueSize as soon as we no longer need the call around, e.g. after {{RpcServer.CurCall.set(null)}}; otherwise we will only be pushing back other client requests while we send the response back to the client that originated the call. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13531) After 4/18/15 merge, flakey failures of TestAcidGuarantees#testMobScanAtomicity
[ https://issues.apache.org/jira/browse/HBASE-13531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14545174#comment-14545174 ] Anoop Sam John commented on HBASE-13531: The logic is fine.
{code}
 result = scanner.peek();
+if (result.getSequenceId() > readPt) {
+  // current cell is invisible, look for the next.
+  Cell candidate = scanner.next();
+  while (candidate != null) {
+    if (CellComparator.compare(search, candidate, true) != 0) {
+      break;
+    }
{code}
We return result, right? So if there is no match and the break happens, we will return the 1st cell. Ideally this will never happen; still, for correctness we should not return the 1st cell (is null ok?).
bq. Because we don't reset the seqId in compactions.
Maybe we can follow the same logic: when the cell seqId is less than the smallest readpoint, we can reset it to 0. After 4/18/15 merge, flakey failures of TestAcidGuarantees#testMobScanAtomicity --- Key: HBASE-13531 URL: https://issues.apache.org/jira/browse/HBASE-13531 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Affects Versions: hbase-11339 Reporter: Jonathan Hsieh Fix For: hbase-11339 Attachments: HBASE-13531-V2.diff, HBASE-13531.diff After the merge of master from 4/18/15 with hbase-11339 branch, we encounter some atomicity violations. We want to fix before calling merge to trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13375) Provide HBase superuser higher priority over other users in the RPC handling
[ https://issues.apache.org/jira/browse/HBASE-13375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Antonov updated HBASE-13375: Status: Patch Available (was: Open) Provide HBase superuser higher priority over other users in the RPC handling Key: HBASE-13375 URL: https://issues.apache.org/jira/browse/HBASE-13375 Project: HBase Issue Type: Improvement Components: rpc Reporter: Devaraj Das Assignee: Mikhail Antonov Fix For: 1.1.1 Attachments: HBASE-13375-v0.patch, HBASE-13375-v1.patch, HBASE-13375-v1.patch, HBASE-13375-v1.patch, HBASE-13375-v2.patch, HBASE-13375-v3.patch, HBASE-13375-v4.patch, HBASE-13375-v5.patch, HBASE-13375-v6.patch, HBASE-13375-v7.patch HBASE-13351 annotates Master RPCs so that RegionServer RPCs are treated with a higher priority compared to user RPCs (and they are handled by a separate set of handlers, etc.). It may be good to stretch this to users too - hbase superuser (configured via hbase.superuser) gets higher priority over other users in the RPC handling. That way the superuser can always perform administrative operations on the cluster even if all the normal priority handlers are occupied (for example, we had a situation where all the master's handlers were tied up with many simultaneous createTable RPC calls from multiple users and the master wasn't able to perform any operations initiated by the admin). (Discussed this some with [~enis] and [~elserj]). Does this make sense to others? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13375) Provide HBase superuser higher priority over other users in the RPC handling
[ https://issues.apache.org/jira/browse/HBASE-13375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Antonov updated HBASE-13375: Attachment: HBASE-13375-v7.patch v7, fixed broken test. Let's see how it goes. Provide HBase superuser higher priority over other users in the RPC handling Key: HBASE-13375 URL: https://issues.apache.org/jira/browse/HBASE-13375 Project: HBase Issue Type: Improvement Components: rpc Reporter: Devaraj Das Assignee: Mikhail Antonov Fix For: 1.1.1 Attachments: HBASE-13375-v0.patch, HBASE-13375-v1.patch, HBASE-13375-v1.patch, HBASE-13375-v1.patch, HBASE-13375-v2.patch, HBASE-13375-v3.patch, HBASE-13375-v4.patch, HBASE-13375-v5.patch, HBASE-13375-v6.patch, HBASE-13375-v7.patch HBASE-13351 annotates Master RPCs so that RegionServer RPCs are treated with a higher priority compared to user RPCs (and they are handled by a separate set of handlers, etc.). It may be good to stretch this to users too - hbase superuser (configured via hbase.superuser) gets higher priority over other users in the RPC handling. That way the superuser can always perform administrative operations on the cluster even if all the normal priority handlers are occupied (for example, we had a situation where all the master's handlers were tied up with many simultaneous createTable RPC calls from multiple users and the master wasn't able to perform any operations initiated by the admin). (Discussed this some with [~enis] and [~elserj]). Does this make sense to others? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13686) Fail to limit rate in RateLimiter
[ https://issues.apache.org/jira/browse/HBASE-13686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14545275#comment-14545275 ] Guanghao Zhang commented on HBASE-13686: I don't think so. By that understanding, 60 resources per minute would equal 1 resource per sec, and then the RateLimiter wouldn't need so many TimeUnits. If a user sets a quota of 60 resources per minute, the RateLimiter should not be affected by the distribution of requests: the user may get all 60 resources in the first sec of a minute or in the last sec of a minute. As long as the user's request rate is smaller than 60 per minute, the RateLimiter should guarantee that the requests can execute. Fail to limit rate in RateLimiter - Key: HBASE-13686 URL: https://issues.apache.org/jira/browse/HBASE-13686 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 1.1.0 Reporter: Guanghao Zhang Priority: Minor While using the patch in HBASE-11598, I found that RateLimiter can't limit the rate correctly.
{code}
/**
 * given the time interval, are there enough available resources to allow execution?
 * @param now the current timestamp
 * @param lastTs the timestamp of the last update
 * @param amount the number of required resources
 * @return true if there are enough available resources, otherwise false
 */
public synchronized boolean canExecute(final long now, final long lastTs, final long amount) {
  return avail >= amount ? true : refill(now, lastTs) >= amount;
}
{code}
When avail >= amount, avail isn't refilled. But by the next call to canExecute, lastTs may have been updated, so avail loses some of the time it could have used to refill. Even if we request at a smaller rate than the limit, canExecute can return false. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13531) After 4/18/15 merge, flakey failures of TestAcidGuarantees#testMobScanAtomicity
[ https://issues.apache.org/jira/browse/HBASE-13531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14545351#comment-14545351 ] Jingcheng Du commented on HBASE-13531: -- bq. I think there're still issues in mob file compactions if we enable mvcc in mob files. It's hard to maintain the readPt consistency between the ref cell and mob cell. I have to think about this again. I thought about this, there are no issues in the compactions. A new patch had been uploaded. After 4/18/15 merge, flakey failures of TestAcidGuarantees#testMobScanAtomicity --- Key: HBASE-13531 URL: https://issues.apache.org/jira/browse/HBASE-13531 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Affects Versions: hbase-11339 Reporter: Jonathan Hsieh Fix For: hbase-11339 Attachments: HBASE-13531-V2.diff, HBASE-13531-V3.diff, HBASE-13531.diff After the merge of master from 4/18/15 with hbase-11339 branch, we encounter some atomicity violations. We want to fix before calling merge to trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13686) Fail to limit rate in RateLimiter
[ https://issues.apache.org/jira/browse/HBASE-13686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14545327#comment-14545327 ] Guanghao Zhang commented on HBASE-13686: The RateLimiter can be considered a leaky bucket (see http://en.wikipedia.org/wiki/Leaky_bucket): it refills itself at the limit rate. So lastTs should be a property of the RateLimiter, initialized in new RateLimiter(). On refill, the RateLimiter updates lastTs to now by itself, and avail is refilled up to limit at most. Fail to limit rate in RateLimiter - Key: HBASE-13686 URL: https://issues.apache.org/jira/browse/HBASE-13686 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 1.1.0 Reporter: Guanghao Zhang Priority: Minor While using the patch in HBASE-11598, I found that RateLimiter can't limit the rate correctly.
{code}
/**
 * given the time interval, are there enough available resources to allow execution?
 * @param now the current timestamp
 * @param lastTs the timestamp of the last update
 * @param amount the number of required resources
 * @return true if there are enough available resources, otherwise false
 */
public synchronized boolean canExecute(final long now, final long lastTs, final long amount) {
  return avail >= amount ? true : refill(now, lastTs) >= amount;
}
{code}
When avail >= amount, avail isn't refilled. But by the next call to canExecute, lastTs may have been updated, so avail loses some of the time it could have used to refill. Even if we request at a smaller rate than the limit, canExecute can return false. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
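The leaky-bucket shape proposed in the comment above can be sketched as follows. This is an illustrative proposal matching the comment's description (the limiter owns lastTs and refills on every check), not the actual HBase implementation; the class name and method signatures are assumptions.

```java
// Illustrative leaky-bucket limiter per the comment above: lastTs is a
// field of the limiter, initialized at construction, and every canExecute
// refills avail (capped at limit) before checking availability.
// Not the actual HBase API; names are assumptions.
public class LeakyBucketLimiter {
    private final long limit;      // resources per interval
    private final long intervalMs; // interval in milliseconds
    private long avail;            // currently available resources
    private long lastTs;           // owned and advanced by the limiter itself

    public LeakyBucketLimiter(long limit, long intervalMs, long nowMs) {
        this.limit = limit;
        this.intervalMs = intervalMs;
        this.avail = limit;
        this.lastTs = nowMs;
    }

    public synchronized boolean canExecute(long now, long amount) {
        long delta = limit * (now - lastTs) / intervalMs;
        if (delta > 0) {
            avail = Math.min(limit, avail + delta); // refill at the limit rate
            lastTs = now;                           // advance our own clock
        }
        return avail >= amount;
    }

    public synchronized void consume(long amount) {
        avail -= amount;
    }
}
```

With this shape, a caller consuming 3 resources every 500 ms against a 10/sec limit is never rejected, because elapsed time is credited on every check instead of being discarded when avail happens to be sufficient.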
[jira] [Updated] (HBASE-13531) After 4/18/15 merge, flakey failures of TestAcidGuarantees#testMobScanAtomicity
[ https://issues.apache.org/jira/browse/HBASE-13531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingcheng Du updated HBASE-13531: - Attachment: HBASE-13531-V3.diff Uploaded patch V3 to add a null check to avoid an NPE which is supposed to never happen in reading. After 4/18/15 merge, flakey failures of TestAcidGuarantees#testMobScanAtomicity --- Key: HBASE-13531 URL: https://issues.apache.org/jira/browse/HBASE-13531 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Affects Versions: hbase-11339 Reporter: Jonathan Hsieh Fix For: hbase-11339 Attachments: HBASE-13531-V2.diff, HBASE-13531-V3.diff, HBASE-13531.diff After the merge of master from 4/18/15 with hbase-11339 branch, we encounter some atomicity violations. We want to fix before calling merge to trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13694) CallQueueSize is incorrectly decremented until the response is sent
[ https://issues.apache.org/jira/browse/HBASE-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14545295#comment-14545295 ] Anoop Sam John commented on HBASE-13694: {code} /** * On construction, adds the size of this call to the running count of outstanding call sizes. * Presumption is that we are put on a queue while we wait on an executor to run us. During this * time we occupy heap. */ // The constructor is shutdown so only RpcServer in this class can make one of these. CallRunner(final RpcServerInterface rpcServer, final Call call) { this.call = call; this.rpcServer = rpcServer; // Add size of the call to queue size. this.rpcServer.addCallSize(call.getSize()); this.status = getStatus(); } {code} So after the RpcServer.CurCall.set(null) call, we are cleared of the call heap? No right? Am I missing any? CallQueueSize is incorrectly decremented until the response is sent --- Key: HBASE-13694 URL: https://issues.apache.org/jira/browse/HBASE-13694 Project: HBase Issue Type: Bug Components: master, regionserver, rpc Affects Versions: 2.0.0, 1.1.0, 1.0.2, 1.2.0 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Attachments: 0001-HBASE-13694-CallQueueSize-is-incorrectly-decremented.patch We should decrement the CallQueueSize as soon as we no longer need the call around, e.g. after {{RpcServer.CurCall.set(null)}} otherwise we will be only pushing back other client requests while we send the response back to the client that originated the call. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
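The accounting being debated can be sketched generically. The class and method names below are hypothetical stand-ins, with an AtomicLong in place of the RPC server's call-queue-size counter:

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of the call-size accounting under discussion: the size is added when
// the call is queued (as in the CallRunner constructor quoted above) and, per
// the proposed change, released once the handler no longer needs the call's
// heap -- before the response bytes are written back to the client.
public class CallAccounting {
    private final AtomicLong callQueueSize = new AtomicLong();

    /** Mirrors addCallSize() at enqueue time; returns the new total. */
    public long enqueue(long callSize) {
        return callQueueSize.addAndGet(callSize);
    }

    /** Runs one call; the decrement sits in finally so exceptions still release it. */
    public long runCall(long callSize) {
        try {
            // ... handler executes the call; the call object occupies heap here ...
        } finally {
            // Proposed: decrement as soon as the call is no longer needed,
            // rather than only after the response has been sent.
            callQueueSize.addAndGet(-callSize);
        }
        return callQueueSize.get();
    }
}
```

The finally placement answers Anoop's exception question in this sketch: the size is released on both the success and failure paths; the open question in the thread is only how early within that window the release may happen.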
[jira] [Commented] (HBASE-13375) Provide HBase superuser higher priority over other users in the RPC handling
[ https://issues.apache.org/jira/browse/HBASE-13375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14545405#comment-14545405 ] Hadoop QA commented on HBASE-13375: --- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12733095/HBASE-13375-v7.patch against master branch at commit 9ba7337ac82d13b22a1b0c40edaba7873c0bd795. ATTACHMENT ID: 12733095 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 32 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14057//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14057//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14057//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14057//console This message is automatically generated. Provide HBase superuser higher priority over other users in the RPC handling Key: HBASE-13375 URL: https://issues.apache.org/jira/browse/HBASE-13375 Project: HBase Issue Type: Improvement Components: rpc Reporter: Devaraj Das Assignee: Mikhail Antonov Fix For: 1.1.1 Attachments: HBASE-13375-v0.patch, HBASE-13375-v1.patch, HBASE-13375-v1.patch, HBASE-13375-v1.patch, HBASE-13375-v2.patch, HBASE-13375-v3.patch, HBASE-13375-v4.patch, HBASE-13375-v5.patch, HBASE-13375-v6.patch, HBASE-13375-v7.patch HBASE-13351 annotates Master RPCs so that RegionServer RPCs are treated with a higher priority compared to user RPCs (and they are handled by a separate set of handlers, etc.). It may be good to stretch this to users too - hbase superuser (configured via hbase.superuser) gets higher priority over other users in the RPC handling. That way the superuser can always perform administrative operations on the cluster even if all the normal priority handlers are occupied (for example, we had a situation where all the master's handlers were tied up with many simultaneous createTable RPC calls from multiple users and the master wasn't able to perform any operations initiated by the admin). (Discussed this some with [~enis] and [~elserj]). Does this make sense to others? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13686) Fail to limit rate in RateLimiter
[ https://issues.apache.org/jira/browse/HBASE-13686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14545289#comment-14545289 ] Ashish Singhi commented on HBASE-13686: --- bq. As long as user's request rate is smaller than 60 in one minute, the RateLimiter should guarantee the request canExecute. Yes, I totally understand that, and we have thought about it internally as well. We are discussing this internally to reach a conclusion. Fail to limit rate in RateLimiter - Key: HBASE-13686 URL: https://issues.apache.org/jira/browse/HBASE-13686 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 1.1.0 Reporter: Guanghao Zhang Priority: Minor While using the patch in HBASE-11598, I found that RateLimiter can't limit the rate correctly. {code} /** * given the time interval, are there enough available resources to allow execution? * @param now the current timestamp * @param lastTs the timestamp of the last update * @param amount the number of required resources * @return true if there are enough available resources, otherwise false */ public synchronized boolean canExecute(final long now, final long lastTs, final long amount) { return avail >= amount ? true : refill(now, lastTs) >= amount; } {code} When avail >= amount, avail isn't refilled. But by the next call to canExecute, lastTs may have been updated, so avail loses some of the time it should have spent refilling. Even if we use a smaller rate than the limit, canExecute can return false. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13694) CallQueueSize is incorrectly decremented until the response is sent
[ https://issues.apache.org/jira/browse/HBASE-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14545286#comment-14545286 ] Hadoop QA commented on HBASE-13694: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12733078/0001-HBASE-13694-CallQueueSize-is-incorrectly-decremented.patch against master branch at commit 9ba7337ac82d13b22a1b0c40edaba7873c0bd795. ATTACHMENT ID: 12733078 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14056//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14056//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14056//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14056//console This message is automatically generated. CallQueueSize is incorrectly decremented until the response is sent --- Key: HBASE-13694 URL: https://issues.apache.org/jira/browse/HBASE-13694 Project: HBase Issue Type: Bug Components: master, regionserver, rpc Affects Versions: 2.0.0, 1.1.0, 1.0.2, 1.2.0 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Attachments: 0001-HBASE-13694-CallQueueSize-is-incorrectly-decremented.patch We should decrement the CallQueueSize as soon as we no longer need the call around, e.g. after {{RpcServer.CurCall.set(null)}} otherwise we will be only pushing back other client requests while we send the response back to the client that originated the call. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13531) After 4/18/15 merge, flakey failures of TestAcidGuarantees#testMobScanAtomicity
[ https://issues.apache.org/jira/browse/HBASE-13531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingcheng Du updated HBASE-13531: - Attachment: HBASE-13531-V4.diff Refined the patch(V4) based on V1, and upload it. After 4/18/15 merge, flakey failures of TestAcidGuarantees#testMobScanAtomicity --- Key: HBASE-13531 URL: https://issues.apache.org/jira/browse/HBASE-13531 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Affects Versions: hbase-11339 Reporter: Jonathan Hsieh Fix For: hbase-11339 Attachments: HBASE-13531-V2.diff, HBASE-13531-V3.diff, HBASE-13531-V4.diff, HBASE-13531.diff After the merge of master from 4/18/15 with hbase-11339 branch, we encounter some atomicity violations. We want to fix before calling merge to trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13337) Table regions are not assigning back, after restarting all regionservers at once.
[ https://issues.apache.org/jira/browse/HBASE-13337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14545307#comment-14545307 ] Samir Ahmic commented on HBASE-13337: - I can confirm that this is still an issue on the master branch, and here is what I have concluded from testing (distributed cluster): 1. An RS restart is enough to reproduce this issue. 2. As [~jxiang] suggested, this is not a problem with the AM. The issue is probably related to RSRpcServices on the regionserver side. 3. In the assign-operation case, I was able to trace this issue to ServerManager.java: {code} public RegionOpeningState sendRegionOpen(final ServerName server, HRegionInfo region, List<ServerName> favoredNodes) {code} The master never gets a response from the regionserver because of java.nio.channels.ClosedChannelException. 4. From the logs I have noticed that when the RS is in this state, ServerManager#sendRegionWarmup is also failing with java.nio.channels.ClosedChannelException. 5. When the issue is present, nothing about these requests is logged on the RS side, so my assumption is that the RPC server on the RS side is in some strange state. I will try to dig some more info from the RS side and post it here. Table regions are not assigning back, after restarting all regionservers at once. - Key: HBASE-13337 URL: https://issues.apache.org/jira/browse/HBASE-13337 Project: HBase Issue Type: Bug Components: Region Assignment Affects Versions: 2.0.0 Reporter: Y. SREENIVASULU REDDY Priority: Blocker Fix For: 2.0.0 Attachments: HBASE-13337.patch Regions of the table are continuously in state=FAILED_CLOSE. {noformat} RegionState RIT time (ms) 8f62e819b356736053e06240f7f7c6fd t1,,1427362431330.8f62e819b356736053e06240f7f7c6fd. state=FAILED_CLOSE, ts=Thu Mar 26 15:05:36 IST 2015 (113s ago), server=VM1,16040,1427362531818 113929 caf59209ae65ea80fca6bdc6996a7d68 t1,,1427362431330.caf59209ae65ea80fca6bdc6996a7d68.
state=FAILED_CLOSE, ts=Thu Mar 26 15:05:36 IST 2015 (113s ago), server=VM2,16040,1427362533691 113929 db52a74988f71e5cf257bbabf31f26f3 t1,,1427362431330.db52a74988f71e5cf257bbabf31f26f3. state=FAILED_CLOSE, ts=Thu Mar 26 15:05:36 IST 2015 (113s ago), server=VM3,16040,1427362533691 113920 43f3a65b9f9ff283f598c5450feab1f8 t1,,1427362431330.43f3a65b9f9ff283f598c5450feab1f8. state=FAILED_CLOSE, ts=Thu Mar 26 15:05:36 IST 2015 (113s ago), server=VM1,16040,1427362531818 113920 {noformat} *Steps to reproduce:* 1. Start HBase cluster with more than one regionserver. 2. Create a table with precreated regions. (lets say 15 regions) 3. Make sure the regions are well balanced. 4. Restart all the Regionservers process at once across the cluster, except HMaster process 5. After restarting the Regionservers, successfully will connect to the HMaster. *Bug:* But no regions are assigning back to the Regionservers. *Master log shows as follows:* {noformat} 2015-03-26 15:05:36,201 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.RegionStates: Transition {8f62e819b356736053e06240f7f7c6fd state=OFFLINE, ts=1427362536106, server=VM2,16040,1427362242602} to {8f62e819b356736053e06240f7f7c6fd state=PENDING_OPEN, ts=1427362536201, server=VM1,16040,1427362531818} 2015-03-26 15:05:36,202 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.RegionStateStore: Updating row t1,,1427362431330.8f62e819b356736053e06240f7f7c6fd. 
with state=PENDING_OPENsn=VM1,16040,1427362531818 2015-03-26 15:05:36,244 DEBUG [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.AssignmentManager: Force region state offline {8f62e819b356736053e06240f7f7c6fd state=PENDING_OPEN, ts=1427362536201, server=VM1,16040,1427362531818} 2015-03-26 15:05:36,244 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.RegionStates: Transition {8f62e819b356736053e06240f7f7c6fd state=PENDING_OPEN, ts=1427362536201, server=VM1,16040,1427362531818} to {8f62e819b356736053e06240f7f7c6fd state=PENDING_CLOSE, ts=1427362536244, server=VM1,16040,1427362531818} 2015-03-26 15:05:36,244 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.RegionStateStore: Updating row t1,,1427362431330.8f62e819b356736053e06240f7f7c6fd. with state=PENDING_CLOSE 2015-03-26 15:05:36,248 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.AssignmentManager: Server VM1,16040,1427362531818 returned java.nio.channels.ClosedChannelException for
[jira] [Commented] (HBASE-13531) After 4/18/15 merge, flakey failures of TestAcidGuarantees#testMobScanAtomicity
[ https://issues.apache.org/jira/browse/HBASE-13531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14545447#comment-14545447 ] Jingcheng Du commented on HBASE-13531: -- bq. We will read cells from MOB files only on demand right. First we read from ref cf files and there any way the readPoint check is there. Only if it passes, we will call get cell on MOB. Or the case is like with same key (rk,cf,q,ts,type) we have 2 cells and old cell in ref cf got cleared by readPoint check but we read the new cell from MOB file? Hi Anoop [~anoopsamjohn], I think we could use the readPt directly to seek the cell in the mob file. I have refined the patch based on V1, and will upload it later. After 4/18/15 merge, flakey failures of TestAcidGuarantees#testMobScanAtomicity --- Key: HBASE-13531 URL: https://issues.apache.org/jira/browse/HBASE-13531 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Affects Versions: hbase-11339 Reporter: Jonathan Hsieh Fix For: hbase-11339 Attachments: HBASE-13531-V2.diff, HBASE-13531-V3.diff, HBASE-13531.diff After the merge of master from 4/18/15 with hbase-11339 branch, we encounter some atomicity violations. We want to fix before calling merge to trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13531) After 4/18/15 merge, flakey failures of TestAcidGuarantees#testMobScanAtomicity
[ https://issues.apache.org/jira/browse/HBASE-13531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14545449#comment-14545449 ] Jingcheng Du commented on HBASE-13531: -- Wrong comment format, will send it again. bq. We will read cells from MOB files only on demand right. First we read from ref cf files and there any way the readPoint check is there. Only if it passes, we will call get cell on MOB. Or the case is like with same key (rk,cf,q,ts,type) we have 2 cells and old cell in ref cf got cleared by readPoint check but we read the new cell from MOB file? Hi Anoop Sam John, I think we could use the readPt directly to seek the cell in the mob file. I have refined the patch based on V1, and will upload it later. After 4/18/15 merge, flakey failures of TestAcidGuarantees#testMobScanAtomicity --- Key: HBASE-13531 URL: https://issues.apache.org/jira/browse/HBASE-13531 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Affects Versions: hbase-11339 Reporter: Jonathan Hsieh Fix For: hbase-11339 Attachments: HBASE-13531-V2.diff, HBASE-13531-V3.diff, HBASE-13531.diff After the merge of master from 4/18/15 with hbase-11339 branch, we encounter some atomicity violations. We want to fix before calling merge to trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
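The readPoint rule this thread relies on can be sketched abstractly. This is a generic MVCC visibility illustration, not HBase's scanner code; the class and method names are hypothetical:

```java
import java.util.List;

// Generic MVCC visibility sketch: a cell written with sequence id s is visible
// to a scan opened at readPoint r iff s <= r. Seeking the mob cell with the
// same readPoint as the ref cell keeps the two reads consistent.
public class MvccVisibility {
    /** Returns the newest sequence id <= readPoint, or -1 if none is visible. */
    public static long newestVisible(List<Long> cellSeqIds, long readPoint) {
        long best = -1;
        for (long s : cellSeqIds) {
            if (s <= readPoint && s > best) {
                best = s;
            }
        }
        return best;
    }
}
```

The atomicity hazard in the thread is exactly the case where the ref-cell lookup and the mob-cell lookup evaluate this rule against different readPoints, so one side sees a newer version than the other.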
[jira] [Created] (HBASE-13695) Cannot timeout Hbase bulk operations
Dev Lakhani created HBASE-13695: --- Summary: Cannot timeout Hbase bulk operations Key: HBASE-13695 URL: https://issues.apache.org/jira/browse/HBASE-13695 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Dev Lakhani Using the Hbase 1.0.0 client. In HTable there is a batch() operation which calls AsyncRequest ars ... ars.waitUntilDone() This invokes waitUntilDone with Long.MAX_VALUE. Does this mean batch operations cannot be interrupted or invoked with a timeout? We are seeing some batch operations taking so long that our client hangs forever in Waiting for.. actions to finish. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
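As a generic client-side workaround sketch (not an HBase API), a blocking call such as batch() can be bounded by running it on a worker thread and waiting with a timeout. The class name is hypothetical; on expiry this interrupts the worker rather than cancelling any server-side work:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Bound a potentially unbounded blocking call with Future.get(timeout).
public class BoundedCall {
    public static <T> T callWithTimeout(Callable<T> task, long timeoutMs)
            throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<T> f = pool.submit(task);
            // Throws java.util.concurrent.TimeoutException when the deadline passes.
            return f.get(timeoutMs, TimeUnit.MILLISECONDS);
        } finally {
            pool.shutdownNow();  // interrupt the worker if it is still blocked
        }
    }
}
```

Whether the interrupted worker actually unblocks depends on the wrapped call honoring interruption, which is part of what this issue is asking about.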
[jira] [Updated] (HBASE-13695) Cannot timeout or interrupt Hbase bulk/batch operations
[ https://issues.apache.org/jira/browse/HBASE-13695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dev Lakhani updated HBASE-13695: Summary: Cannot timeout or interrupt Hbase bulk/batch operations (was: Cannot timeout Hbase bulk operations) Cannot timeout or interrupt Hbase bulk/batch operations --- Key: HBASE-13695 URL: https://issues.apache.org/jira/browse/HBASE-13695 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Dev Lakhani Using the Hbase 1.0.0 client. In HTable there is a batch() operation which calls AsyncRequest ars ... ars.waitUntilDone() This invokes waitUntilDone with Long.MAX_VALUE. Does this mean batch operations cannot be interrupted or invoked with a timeout? We are seeing some batch operations taking so long that our client hangs forever in Waiting for.. actions to finish. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14545634#comment-14545634 ] stack commented on HBASE-11927: --- That failure is not yours [~appy] ... not sure why complaining no test when you've added some. Let me rerun to be sure. Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C) Key: HBASE-11927 URL: https://issues.apache.org/jira/browse/HBASE-11927 Project: HBase Issue Type: Bug Reporter: stack Assignee: Apekshit Sharma Attachments: HBASE-11927-v1.patch, HBASE-11927-v2.patch, HBASE-11927-v4.patch, HBASE-11927-v5.patch, HBASE-11927-v6.patch, HBASE-11927-v7.patch, HBASE-11927-v8.patch, HBASE-11927.patch, after-compact-2%.svg, after-randomWrite1M-0.5%.svg, before-compact-22%.svg, before-randomWrite1M-5%.svg, c2021.crc2.svg, c2021.write.2.svg, c2021.zip.svg, crc32ct.svg Up in hadoop they have this change. Let me publish some graphs to show that it makes a difference (CRC is a massive amount of our CPU usage in my profiling of an upload because of compacting, flushing, etc.). We should also make use of native CRCings -- especially the 2.6 HDFS-6865 and ilk -- in hbase but that is another issue for now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
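For reference, the JDK has shipped both checksum flavors since Java 9. The snippet below only illustrates that CRC32 and CRC32C are different polynomials producing different values over the same bytes; it is unrelated to HBase's actual HFile checksum path, and the class name is illustrative:

```java
import java.util.zip.CRC32;
import java.util.zip.CRC32C;

// CRC32C (Castagnoli) is the variant with widespread hardware support
// (e.g. the SSE4.2 crc32 instruction), which is what makes flipping the
// default from CRC32 to CRC32C worthwhile on the write/read hot path.
public class CrcDemo {
    public static long crc32(byte[] data) {
        CRC32 c = new CRC32();
        c.update(data);
        return c.getValue();
    }

    public static long crc32c(byte[] data) {
        CRC32C c = new CRC32C();
        c.update(data);
        return c.getValue();
    }
}
```

Both classes implement java.util.zip.Checksum, so code written against the interface can swap algorithms without other changes.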
[jira] [Commented] (HBASE-13656) Rename getDeadServers to getDeadServersSize in Admin
[ https://issues.apache.org/jira/browse/HBASE-13656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14545705#comment-14545705 ] Lars Francke commented on HBASE-13656: -- This patch does not have the proper subject line - be careful when committing. Sorry! Next time... Rename getDeadServers to getDeadServersSize in Admin Key: HBASE-13656 URL: https://issues.apache.org/jira/browse/HBASE-13656 Project: HBase Issue Type: Improvement Reporter: Lars Francke Assignee: Lars Francke Priority: Minor Attachments: HBASE-13656.patch The name is inconsistent with the other methods (e.g. {{getServersSize}}, {{getBackupMastersSize}}). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13531) After 4/18/15 merge, flakey failures of TestAcidGuarantees#testMobScanAtomicity
[ https://issues.apache.org/jira/browse/HBASE-13531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14545520#comment-14545520 ] Jonathan Hsieh commented on HBASE-13531: I ran the initial version of the patch about 80 times and it passed. Good stuff. This isn't my area of expertise so I need an extra day to understand it completely. [~anoopsamjohn] I won't block committing this if you are satisfied. After 4/18/15 merge, flakey failures of TestAcidGuarantees#testMobScanAtomicity --- Key: HBASE-13531 URL: https://issues.apache.org/jira/browse/HBASE-13531 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Affects Versions: hbase-11339 Reporter: Jonathan Hsieh Fix For: hbase-11339 Attachments: HBASE-13531-V2.diff, HBASE-13531-V3.diff, HBASE-13531-V4.diff, HBASE-13531.diff After the merge of master from 4/18/15 with hbase-11339 branch, we encounter some atomicity violations. We want to fix before calling merge to trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13694) CallQueueSize is incorrectly decremented until the response is sent
[ https://issues.apache.org/jira/browse/HBASE-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14545631#comment-14545631 ] stack commented on HBASE-13694: --- If exception, we don't decrement? How much diff you seeing [~esteban] The response is not sent when you do the decrement, that is ok? CallQueueSize is incorrectly decremented until the response is sent --- Key: HBASE-13694 URL: https://issues.apache.org/jira/browse/HBASE-13694 Project: HBase Issue Type: Bug Components: master, regionserver, rpc Affects Versions: 2.0.0, 1.1.0, 1.0.2, 1.2.0 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Attachments: 0001-HBASE-13694-CallQueueSize-is-incorrectly-decremented.patch We should decrement the CallQueueSize as soon as we no longer need the call around, e.g. after {{RpcServer.CurCall.set(null)}} otherwise we will be only pushing back other client requests while we send the response back to the client that originated the call. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-11927: -- Attachment: HBASE-11927-v8.patch Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C) Key: HBASE-11927 URL: https://issues.apache.org/jira/browse/HBASE-11927 Project: HBase Issue Type: Bug Reporter: stack Assignee: Apekshit Sharma Attachments: HBASE-11927-v1.patch, HBASE-11927-v2.patch, HBASE-11927-v4.patch, HBASE-11927-v5.patch, HBASE-11927-v6.patch, HBASE-11927-v7.patch, HBASE-11927-v8.patch, HBASE-11927-v8.patch, HBASE-11927.patch, after-compact-2%.svg, after-randomWrite1M-0.5%.svg, before-compact-22%.svg, before-randomWrite1M-5%.svg, c2021.crc2.svg, c2021.write.2.svg, c2021.zip.svg, crc32ct.svg Up in hadoop they have this change. Let me publish some graphs to show that it makes a difference (CRC is a massive amount of our CPU usage in my profiling of an upload because of compacting, flushing, etc.). We should also make use of native CRCings -- especially the 2.6 HDFS-6865 and ilk -- in hbase but that is another issue for now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13694) CallQueueSize is incorrectly decremented until the response is sent
[ https://issues.apache.org/jira/browse/HBASE-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14545660#comment-14545660 ] Anoop Sam John commented on HBASE-13694: bq. If exception, we don't decrement? No. We were doing the decrement in finally. The patch moves the decrement a bit earlier. Is that correct? I have a doubt. CallQueueSize is incorrectly decremented until the response is sent --- Key: HBASE-13694 URL: https://issues.apache.org/jira/browse/HBASE-13694 Project: HBase Issue Type: Bug Components: master, regionserver, rpc Affects Versions: 2.0.0, 1.1.0, 1.0.2, 1.2.0 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Attachments: 0001-HBASE-13694-CallQueueSize-is-incorrectly-decremented.patch We should decrement the CallQueueSize as soon as we no longer need the call around, e.g. after {{RpcServer.CurCall.set(null)}} otherwise we will be only pushing back other client requests while we send the response back to the client that originated the call. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13694) CallQueueSize is incorrectly decremented until the response is sent
[ https://issues.apache.org/jira/browse/HBASE-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14545669#comment-14545669 ] stack commented on HBASE-13694: --- bq ...otherwise we will be only pushing back other client requests while we send the response back to the client that originated the call. Tell us more [~esteban] For sure we are bottlenecking on returning results... CallQueueSize is incorrectly decremented until the response is sent --- Key: HBASE-13694 URL: https://issues.apache.org/jira/browse/HBASE-13694 Project: HBase Issue Type: Bug Components: master, regionserver, rpc Affects Versions: 2.0.0, 1.1.0, 1.0.2, 1.2.0 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Attachments: 0001-HBASE-13694-CallQueueSize-is-incorrectly-decremented.patch We should decrement the CallQueueSize as soon as we no longer need the call around, e.g. after {{RpcServer.CurCall.set(null)}} otherwise we will be only pushing back other client requests while we send the response back to the client that originated the call. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14545655#comment-14545655 ] Anoop Sam John commented on HBASE-11927: +1. Good work [~appy]. Good release notes too. Well summarized. Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C) Key: HBASE-11927 URL: https://issues.apache.org/jira/browse/HBASE-11927 Project: HBase Issue Type: Bug Reporter: stack Assignee: Apekshit Sharma Attachments: HBASE-11927-v1.patch, HBASE-11927-v2.patch, HBASE-11927-v4.patch, HBASE-11927-v5.patch, HBASE-11927-v6.patch, HBASE-11927-v7.patch, HBASE-11927-v8.patch, HBASE-11927-v8.patch, HBASE-11927.patch, after-compact-2%.svg, after-randomWrite1M-0.5%.svg, before-compact-22%.svg, before-randomWrite1M-5%.svg, c2021.crc2.svg, c2021.write.2.svg, c2021.zip.svg, crc32ct.svg Up in hadoop they have this change. Let me publish some graphs to show that it makes a difference (CRC is a massive amount of our CPU usage in my profiling of an upload because of compacting, flushing, etc.). We should also make use of native CRCings -- especially the 2.6 HDFS-6865 and ilk -- in hbase but that is another issue for now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13531) After 4/18/15 merge, flakey failures of TestAcidGuarantees#testMobScanAtomicity
[ https://issues.apache.org/jira/browse/HBASE-13531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14545668#comment-14545668 ] ramkrishna.s.vasudevan commented on HBASE-13531: I will check this once by tomorrow. After 4/18/15 merge, flakey failures of TestAcidGuarantees#testMobScanAtomicity --- Key: HBASE-13531 URL: https://issues.apache.org/jira/browse/HBASE-13531 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Affects Versions: hbase-11339 Reporter: Jonathan Hsieh Fix For: hbase-11339 Attachments: HBASE-13531-V2.diff, HBASE-13531-V3.diff, HBASE-13531-V4.diff, HBASE-13531.diff After the merge of master from 4/18/15 with hbase-11339 branch, we encounter some atomicity violations. We want to fix before calling merge to trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HBASE-13695) Cannot timeout or interrupt Hbase bulk/batch operations
[ https://issues.apache.org/jira/browse/HBASE-13695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-13695. --- Resolution: Incomplete Cannot timeout or interrupt Hbase bulk/batch operations --- Key: HBASE-13695 URL: https://issues.apache.org/jira/browse/HBASE-13695 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Dev Lakhani Using the Hbase 1.0.0 client. In HTable there is a batch() operation which calls AsyncRequest ars ... ars.waitUntilDone() This invokes waitUntilDone with Long.MAX_VALUE. Does this mean batch operations cannot be interrupted or invoked with a timeout? We are seeing some batch operations taking so long that our client hangs forever in Waiting for.. actions to finish. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13531) After 4/18/15 merge, flakey failures of TestAcidGuarantees#testMobScanAtomicity
[ https://issues.apache.org/jira/browse/HBASE-13531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14545642#comment-14545642 ] Anoop Sam John commented on HBASE-13531: I am +1 The old methods readCell(Cell search, boolean cacheMobBlocks) resolve(Cell reference, boolean cacheBlocks) are not used by any code path and may be we can just replace them with new ones with readPoint param? These are private classes so no issue to drop old methods. After 4/18/15 merge, flakey failures of TestAcidGuarantees#testMobScanAtomicity --- Key: HBASE-13531 URL: https://issues.apache.org/jira/browse/HBASE-13531 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Affects Versions: hbase-11339 Reporter: Jonathan Hsieh Fix For: hbase-11339 Attachments: HBASE-13531-V2.diff, HBASE-13531-V3.diff, HBASE-13531-V4.diff, HBASE-13531.diff After the merge of master from 4/18/15 with hbase-11339 branch, we encounter some atomicity violations. We want to fix before calling merge to trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13695) Cannot timeout or interrupt Hbase bulk/batch operations
[ https://issues.apache.org/jira/browse/HBASE-13695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14545676#comment-14545676 ] stack commented on HBASE-13695: --- This one is probably better up on the list than an issue [~devl.development] until we figure what is going on. Suggest you add a bit more info too.. Below waitUntilDone we are running timers and if we timeout we'll return errors which should bubble up as exceptions... Cannot timeout or interrupt Hbase bulk/batch operations --- Key: HBASE-13695 URL: https://issues.apache.org/jira/browse/HBASE-13695 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Dev Lakhani Using the Hbase 1.0.0 client. In HTable there is a batch() operation which calls AyncRequest ars ... ars.waitUntilDone() This invokes waitUntilDone with Long.Max. Does this mean batch operations cannot be interrupted or invoked with a timeout? We are seeing some batch operations taking so long that our client hangs forever in Waiting for.. actions to finish. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
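The waitUntilDone(Long.MAX_VALUE) call under discussion amounts to a deadline loop with an effectively infinite cutoff. A hypothetical simplification (illustrative only, not the actual AsyncProcess implementation) showing why passing Long.MAX_VALUE means "wait until the work finishes", and how an interrupt or finite cutoff would end the wait early:

```java
import java.util.function.BooleanSupplier;

public class DeadlineWait {
    // Poll `done` until it reports true or `timeoutMillis` elapses.
    // With Long.MAX_VALUE the deadline computation overflows, so we
    // treat it as "no deadline": only completion (or an interrupt)
    // ends the loop.
    public static boolean waitUntilDone(BooleanSupplier done, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        if (deadline < 0) deadline = Long.MAX_VALUE; // overflow => wait forever
        while (!done.getAsBoolean()) {
            if (System.currentTimeMillis() >= deadline) {
                return false; // timed out before the work finished
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false; // interrupted: report "not done"
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(waitUntilDone(() -> true, 1000)); // true: already done
        System.out.println(waitUntilDone(() -> false, 50));  // false: times out
    }
}
```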
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546237#comment-14546237 ] Hudson commented on HBASE-11927: SUCCESS: Integrated in HBase-TRUNK #6484 (See [https://builds.apache.org/job/HBase-TRUNK/6484/]) HBASE-11927 Use Native Hadoop Library for HFile checksum. (Apekshit) (stack: rev 988593857f5150e5d337ad6b8bf3ba0479441f3e) * hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java * hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java * hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java * hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java * hbase-common/src/main/resources/hbase-default.xml * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java * hbase-common/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileContextBuilder.java * hbase-common/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileContext.java * hbase-common/src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C) Key: HBASE-11927 URL: https://issues.apache.org/jira/browse/HBASE-11927 Project: HBase Issue Type: Improvement Components: Performance Reporter: stack Assignee: Apekshit Sharma Fix For: 2.0.0, 1.2.0 Attachments: HBASE-11927-v1.patch, HBASE-11927-v2.patch, HBASE-11927-v4.patch, HBASE-11927-v5.patch, HBASE-11927-v6.patch, HBASE-11927-v7.patch, HBASE-11927-v8.patch, HBASE-11927-v8.patch, HBASE-11927.patch, after-compact-2%.svg, after-randomWrite1M-0.5%.svg, before-compact-22%.svg, before-randomWrite1M-5%.svg, c2021.crc2.svg, c2021.write.2.svg, c2021.zip.svg, crc32ct.svg Up in hadoop they have this change. 
Let me publish some graphs to show that it makes a difference (CRC is a massive amount of our CPU usage in my profiling of an upload because of compacting, flushing, etc.). We should also make use of native CRCings -- especially the 2.6 HDFS-6865 and ilk -- in hbase but that is another issue for now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13651) Handle StoreFileScanner FileNotFoundException
[ https://issues.apache.org/jira/browse/HBASE-13651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546238#comment-14546238 ] Hudson commented on HBASE-13651: SUCCESS: Integrated in HBase-TRUNK #6484 (See [https://builds.apache.org/job/HBase-TRUNK/6484/]) HBASE-13651 Handle StoreFileScanner FileNotFoundException (matteo.bertozzi: rev fec091a8073ab000e47120c244238f1b3642d560) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCorruptedRegionStoreFile.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java Handle StoreFileScanner FileNotFoundException - Key: HBASE-13651 URL: https://issues.apache.org/jira/browse/HBASE-13651 Project: HBase Issue Type: Bug Affects Versions: 0.94.27, 0.98.10.1 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Minor Fix For: 2.0.0, 0.94.28, 0.98.13, 1.2.0 Attachments: HBASE-13651-0.94-draft.patch, HBASE-13651-draft.patch, HBASE-13651-v0-0.94.patch, HBASE-13651-v0-0.98.patch, HBASE-13651-v0-branch-1.patch, HBASE-13651-v0.patch Example: * Machine-1 is serving Region-X and start compaction * Machine-1 goes in GC pause * Region-X gets reassigned to Machine-2 * Machine-1 exit from the GC pause * Machine-1 (re)moves the compacted files * Machine-1 get the lease expired and shutdown Machine-2 has now tons of FileNotFoundException on scan. If we reassign the region everything is ok, because we pickup the files compacted by Machine-1. This problem doesn't happen in the new code 1.0+ (i think but I haven't checked, it may be 1.1) where we write on the WAL the compaction event before (re)moving the files. A workaround is handling FileNotFoundException and refresh the store files, or shutdown the region and reassign. 
The first one is easy in 1.0+; the second one requires more work because at the moment we don't have the code to notify the master that the RS is closing the region. Alternatively, we can shut down the entire RS (it is not a good solution, but the case is rare enough). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
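The first workaround described above, catching the FileNotFoundException and refreshing the store files before retrying, can be sketched as a small retry wrapper. The names here are illustrative stand-ins, not the actual Store/StoreFileScanner API:

```java
import java.io.FileNotFoundException;
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

public class RefreshOnMissingFile {
    // Run `scan`; if the underlying file vanished (e.g. removed by the
    // old region server after its GC pause), run `refreshStoreFiles`
    // once and retry. A second failure propagates to the caller.
    public static <T> T scanWithRefresh(Callable<T> scan, Runnable refreshStoreFiles)
            throws Exception {
        try {
            return scan.call();
        } catch (FileNotFoundException e) {
            refreshStoreFiles.run(); // re-list the store directory
            return scan.call();
        }
    }

    // Deterministic demo: the first attempt fails, the retry succeeds.
    public static String demo() throws Exception {
        AtomicInteger calls = new AtomicInteger();
        String result = scanWithRefresh(() -> {
            if (calls.incrementAndGet() == 1) {
                throw new FileNotFoundException("compacted file was removed");
            }
            return "rows";
        }, () -> { /* refresh the cached store file list here */ });
        return result + ":" + calls.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo()); // rows:2 (one failure, one successful retry)
    }
}
```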
[jira] [Updated] (HBASE-13700) Allow Thrift2 HSHA server to have configurable threads
[ https://issues.apache.org/jira/browse/HBASE-13700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-13700: -- Status: Patch Available (was: Open) Allow Thrift2 HSHA server to have configurable threads -- Key: HBASE-13700 URL: https://issues.apache.org/jira/browse/HBASE-13700 Project: HBase Issue Type: Bug Components: Thrift Reporter: Elliott Clark Assignee: Elliott Clark Attachments: HBASE-13700.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13476) Procedure v2 - Add Replay Order logic for child procedures
[ https://issues.apache.org/jira/browse/HBASE-13476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546414#comment-14546414 ] Hadoop QA commented on HBASE-13476: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12733251/HBASE-13476-v0.patch against master branch at commit 988593857f5150e5d337ad6b8bf3ba0479441f3e. ATTACHMENT ID: 12733251 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 6 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14060//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14060//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14060//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14060//console This message is automatically generated. Procedure v2 - Add Replay Order logic for child procedures -- Key: HBASE-13476 URL: https://issues.apache.org/jira/browse/HBASE-13476 Project: HBase Issue Type: Sub-task Components: proc-v2 Affects Versions: 2.0.0, 1.1.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13476-v0.patch The current replay order logic is only for single-level procedures (which is what we are using today for master operations). To complete the implementation for the notification-bus we need to be able to replay in correct order child procs too. this will not impact the the current procs implementation (create/delete/modify/...) it is just a change at the framework level. https://reviews.apache.org/r/34289/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13616) Move ServerShutdownHandler to Pv2
[ https://issues.apache.org/jira/browse/HBASE-13616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-13616: -- Attachment: 13616.wip.v3.branch-1.txt Latest version of the WIP. It is for branch-1. Missing are some tests verifying that my assumptions about how the queues work bear out for the server shutdown handler. Will add those next. Attaching here so I can get a test run to see what else needs fixing. Move ServerShutdownHandler to Pv2 - Key: HBASE-13616 URL: https://issues.apache.org/jira/browse/HBASE-13616 Project: HBase Issue Type: Sub-task Components: proc-v2 Affects Versions: 1.1.0 Reporter: stack Assignee: stack Attachments: 13616.wip.txt, 13616.wip.v3.branch-1.txt, 13616wip.v2.txt Move ServerShutdownHandler to run on ProcedureV2. Need this for DLR to work. See HBASE-13567. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-13701) Consolidate SecureBulkLoadEndpoint into HBase core as default for bulk load
Jerry He created HBASE-13701: Summary: Consolidate SecureBulkLoadEndpoint into HBase core as default for bulk load Key: HBASE-13701 URL: https://issues.apache.org/jira/browse/HBASE-13701 Project: HBase Issue Type: Improvement Reporter: Jerry He HBASE-12052 makes SecureBulkLoadEndpoint work in a non-secure env to solve HDFS permission issues. We have encountered some of the permission issues and have to use this SecureBulkLoadEndpoint to workaround issues. We should probably consolidate SecureBulkLoadEndpoint into HBase core as default for bulk load since it is able to handle both secure Kerberos and non-secure cases. Maintaining two versions of bulk load implementation is also a cause of confusion, and having to explicitly set it is also inconvenient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13510) Purge ByteBloomFilter
[ https://issues.apache.org/jira/browse/HBASE-13510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546317#comment-14546317 ] stack commented on HBASE-13510: --- Fix this formatting... 501 if (qualifier == null) 502 qualifier = DUMMY; and what is DUMMY? You mean empty? (Anoop said use EMPTY_BYTES from HConstants) Yeah, I don't get why we can't go to Cell since blooms are hashes... but Anoop does above so that is enough for me (for now -- smile) The javadoc on BloomFilterChunk is about BloomFilters. Is BFC a BF or utility a BF could use to make chunks? In javadoc, we don't say what a BFC is. If it is a BF, then why not call it so? We have a BF in our code base already and it has javadoc on the class that is similar to what is here. How does a BFC relate to a BF. Man, BloomFilterBase is and Interface? That'll throw folks off. Having a bit of a hard time navigating the hierarchy here with BloomFilter and BloomFilterBase and BloomFilterChunk. ByteBloomFilter seems like a better name than BFC yet we are removing it and putting in place a new class named BFC that has a good bit of BBF. You don't want to just purge the unused bits from BBF? Purge ByteBloomFilter - Key: HBASE-13510 URL: https://issues.apache.org/jira/browse/HBASE-13510 Project: HBase Issue Type: Sub-task Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 2.0.0 Attachments: HBASE-13510_1.patch, HBASE-13510_2.patch, HBASE-13510_3.patch In order to address the comments over in HBASE-10800 related to comparing Cell with a serialized KV's key we had some need for that in Bloom filters. After discussing with Anoop, we found that it may be possible to remove/modify some of the APIs in the BloomFilter interfaces and for doing that we can purge ByteBloomFilter. I read the code and found that ByteBloomFilter was getting used in V1 version only. 
Now as it is obsolete we can remove this code and move some of the static APIs in ByteBloomFilter to some other util class or bloom related classes which will help us in refactoring the code too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
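The review note above about DUMMY versus an empty qualifier boils down to normalizing a null qualifier to a shared empty array before composing a ROWCOL bloom key, so that row-only and row-plus-empty-qualifier lookups hash identically. A conceptual sketch (the real HBase bloom key layout and the HConstants.EMPTY_BYTE_ARRAY constant differ in detail; these names are illustrative):

```java
import java.util.Arrays;

public class BloomKeyUtil {
    // Stand-in for the shared empty-array constant suggested in the review.
    public static final byte[] EMPTY_BYTE_ARRAY = new byte[0];

    // Braced, formatted version of the flagged two-line null check.
    public static byte[] normalizeQualifier(byte[] qualifier) {
        if (qualifier == null) {
            qualifier = EMPTY_BYTE_ARRAY;
        }
        return qualifier;
    }

    // A ROWCOL bloom key is, conceptually, the row bytes followed by the
    // qualifier bytes; with a null qualifier it degenerates to the row alone.
    public static byte[] rowColKey(byte[] row, byte[] qualifier) {
        byte[] q = normalizeQualifier(qualifier);
        byte[] key = Arrays.copyOf(row, row.length + q.length);
        System.arraycopy(q, 0, key, row.length, q.length);
        return key;
    }

    public static void main(String[] args) {
        byte[] row = "r1".getBytes();
        System.out.println(rowColKey(row, null).length);            // 2
        System.out.println(rowColKey(row, "cq".getBytes()).length); // 4
    }
}
```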
[jira] [Updated] (HBASE-13699) Expand information about HBase quotas
[ https://issues.apache.org/jira/browse/HBASE-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misty Stanley-Jones updated HBASE-13699: Status: Patch Available (was: Open) Expand information about HBase quotas - Key: HBASE-13699 URL: https://issues.apache.org/jira/browse/HBASE-13699 Project: HBase Issue Type: Bug Components: documentation Reporter: Misty Stanley-Jones Assignee: Misty Stanley-Jones Attachments: HBASE-13699.patch See HBASE-13398 and http://blog.cloudera.com/blog/2014/12/new-in-cdh-5-2-improvements-for-running-multiple-workloads-on-a-single-hbase-cluster/. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13699) Expand information about HBase quotas
[ https://issues.apache.org/jira/browse/HBASE-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misty Stanley-Jones updated HBASE-13699: Attachment: HBASE-13699.patch First pass. Please review, [~mbertozzi] Expand information about HBase quotas - Key: HBASE-13699 URL: https://issues.apache.org/jira/browse/HBASE-13699 Project: HBase Issue Type: Bug Components: documentation Reporter: Misty Stanley-Jones Assignee: Misty Stanley-Jones Attachments: HBASE-13699.patch See HBASE-13398 and http://blog.cloudera.com/blog/2014/12/new-in-cdh-5-2-improvements-for-running-multiple-workloads-on-a-single-hbase-cluster/. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13668) [0.98] TestFlushRegonEntry is flaky
[ https://issues.apache.org/jira/browse/HBASE-13668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546334#comment-14546334 ] Andrew Purtell commented on HBASE-13668: Related, and what I've seen up on Jenkins before: {noformat} test(org.apache.hadoop.hbase.regionserver.TestFlushRegionEntry) Time elapsed: 1.452 sec FAILURE! java.lang.AssertionError: expected:369241501 but was:-369241502 at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:555) at org.junit.Assert.assertEquals(Assert.java:542) at org.apache.hadoop.hbase.regionserver.TestFlushRegionEntry.test(TestFlushRegionEntry.java:40) {noformat} FlushRegionEntry includes the amount of remaining delay, a _wall clock time dependent value_, in its calculation of hashCode. The test injects a ManualEnvironmentEdge with a fixed value for currentTime, with the expectation that this will produce objects that are equal to each other, but TestFlushRegionEntry is a small test, so it isn't the only test running in the JVM, or using EnvironmentEdgeManager. Rather than messing with EnvironmentEdgeManager this test should probably spy on the FlushRegionEntry objects and substitute a fixed value for the potentially variable return value from FlushRegionEntry#getDelay(). 
[0.98] TestFlushRegonEntry is flaky --- Key: HBASE-13668 URL: https://issues.apache.org/jira/browse/HBASE-13668 Project: HBase Issue Type: Bug Affects Versions: 0.98.13 Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor {noformat} Flaked tests: org.apache.hadoop.hbase.regionserver.TestFlushRegionEntry.test (org.apache.hadoop.hbase.regionserver.TestFlushRegionEntry) Run 1: TestFlushRegionEntry.test:41 expected: org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushRegionEntry[flush region null] but was: org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushRegionEntry[flush region null] Run 2: PASS {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
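The wall-clock dependence described above can be reproduced with a stripped-down stand-in for FlushRegionEntry (a sketch, not the real class): hashing the *remaining* delay means the same object hashes differently once time moves, whether advanced by the real clock or by another test poking the shared environment edge.

```java
import java.util.function.LongSupplier;

public class FlushEntryHashDemo {
    public static final class Entry {
        final long whenToExpire;
        final LongSupplier clock; // injectable, like ManualEnvironmentEdge

        Entry(long delayMillis, LongSupplier clock) {
            this.clock = clock;
            this.whenToExpire = clock.getAsLong() + delayMillis;
        }

        long getDelay() {
            return whenToExpire - clock.getAsLong(); // wall-clock dependent
        }

        @Override
        public int hashCode() {
            return Long.hashCode(getDelay()); // the flaky pattern
        }
    }

    // Returns true only if hashCode stays stable after the clock moves.
    public static boolean hashStableUnderClockChange() {
        long[] now = {0L};
        Entry e = new Entry(100, () -> now[0]);
        int h1 = e.hashCode();
        now[0] = 30; // another test advances the shared "edge"
        return h1 == e.hashCode();
    }

    public static void main(String[] args) {
        // Same object, two different hash codes => false.
        System.out.println(hashStableUnderClockChange()); // false
    }
}
```

The suggested fix is to keep hashCode over stable fields only (or stub getDelay() in the test), so equality does not depend on when it is evaluated.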
[jira] [Updated] (HBASE-13700) Allow Thrift2 HSHA server to have configurable threads
[ https://issues.apache.org/jira/browse/HBASE-13700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-13700: -- Attachment: HBASE-13700.patch Allow Thrift2 HSHA server to have configurable threads -- Key: HBASE-13700 URL: https://issues.apache.org/jira/browse/HBASE-13700 Project: HBase Issue Type: Bug Components: Thrift Reporter: Elliott Clark Assignee: Elliott Clark Attachments: HBASE-13700.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13699) Expand information about HBase quotas
[ https://issues.apache.org/jira/browse/HBASE-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misty Stanley-Jones updated HBASE-13699: Status: Open (was: Patch Available) Expand information about HBase quotas - Key: HBASE-13699 URL: https://issues.apache.org/jira/browse/HBASE-13699 Project: HBase Issue Type: Bug Components: documentation Reporter: Misty Stanley-Jones Assignee: Misty Stanley-Jones Attachments: HBASE-13699-1.patch, HBASE-13699.patch See HBASE-13398 and http://blog.cloudera.com/blog/2014/12/new-in-cdh-5-2-improvements-for-running-multiple-workloads-on-a-single-hbase-cluster/. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13699) Expand information about HBase quotas
[ https://issues.apache.org/jira/browse/HBASE-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misty Stanley-Jones updated HBASE-13699: Attachment: HBASE-13699-1.patch Second try. Expand information about HBase quotas - Key: HBASE-13699 URL: https://issues.apache.org/jira/browse/HBASE-13699 Project: HBase Issue Type: Bug Components: documentation Reporter: Misty Stanley-Jones Assignee: Misty Stanley-Jones Attachments: HBASE-13699-1.patch, HBASE-13699.patch See HBASE-13398 and http://blog.cloudera.com/blog/2014/12/new-in-cdh-5-2-improvements-for-running-multiple-workloads-on-a-single-hbase-cluster/. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13699) Expand information about HBase quotas
[ https://issues.apache.org/jira/browse/HBASE-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misty Stanley-Jones updated HBASE-13699: Status: Patch Available (was: Open) Expand information about HBase quotas - Key: HBASE-13699 URL: https://issues.apache.org/jira/browse/HBASE-13699 Project: HBase Issue Type: Bug Components: documentation Reporter: Misty Stanley-Jones Assignee: Misty Stanley-Jones Attachments: HBASE-13699-1.patch, HBASE-13699.patch See HBASE-13398 and http://blog.cloudera.com/blog/2014/12/new-in-cdh-5-2-improvements-for-running-multiple-workloads-on-a-single-hbase-cluster/. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13701) Consolidate SecureBulkLoadEndpoint into HBase core as default for bulk load
[ https://issues.apache.org/jira/browse/HBASE-13701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546441#comment-14546441 ] Jerry He commented on HBASE-13701: -- Going through the comments in HBASE-12052, [~mbertozzi] and [~apurtell] both had intention for a consolidation. Consolidate SecureBulkLoadEndpoint into HBase core as default for bulk load --- Key: HBASE-13701 URL: https://issues.apache.org/jira/browse/HBASE-13701 Project: HBase Issue Type: Improvement Reporter: Jerry He HBASE-12052 makes SecureBulkLoadEndpoint work in a non-secure env to solve HDFS permission issues. We have encountered some of the permission issues and have to use this SecureBulkLoadEndpoint to workaround issues. We should probably consolidate SecureBulkLoadEndpoint into HBase core as default for bulk load since it is able to handle both secure Kerberos and non-secure cases. Maintaining two versions of bulk load implementation is also a cause of confusion, and having to explicitly set it is also inconvenient. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13702) ImportTsv: Add dry-run functionality and log bad rows
[ https://issues.apache.org/jira/browse/HBASE-13702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apekshit Sharma updated HBASE-13702: Description: ImportTSV job skips bad records by default (keeps a count though). -Dimporttsv.skip.bad.lines=false can be used to fail if a bad row is encountered. To be easily able to determine which rows are corrupted in an input, rather than failing on one row at a time seems like a good feature to have. Moreover, there should be 'dry-run' functionality in such kinds of tools, which can essentially does a quick run of tool without making any changes but reporting any errors/warnings and success/failure. To identify corrupted rows, simply logging them should be enough. In worst case, all rows will be logged and size of logs will be same as input size, which seems fine. However, user might have to do some work figuring out where the logs. Is there some link we can show to the user when the tool starts which can help them with that? For the dry run, we can simply use if-else to skip over creating table, writing out KVs, and other mutations. was: ImportTSV job skips bad records by default (keeps a count though). -Dimporttsv.skip.bad.lines=false can be used to fail if a bad row is encountered. To be easily able to determine which rows are corrupted in an input, rather than failing on one row at a time seems like a good feature to have. Moreover, there should be 'dry-run' functionality in such kinds of tools, which can essentially does a quick run of tool without making any changes but reporting any errors/warnings and success/failure. To identify corrupted rows, simply logging them should be enough. In worst case, all rows will be logged and size of logs will be same as input size, which seems fine. However, user might have to do some work figuring out where the logs. If there some link we can show the user in the starting which can help them with that? 
For the dry run, we can simply use if-else to skip over creating table, writing out KVs, etc. ImportTsv: Add dry-run functionality and log bad rows - Key: HBASE-13702 URL: https://issues.apache.org/jira/browse/HBASE-13702 Project: HBase Issue Type: New Feature Reporter: Apekshit Sharma ImportTSV job skips bad records by default (keeps a count though). -Dimporttsv.skip.bad.lines=false can be used to fail if a bad row is encountered. To be easily able to determine which rows are corrupted in an input, rather than failing on one row at a time seems like a good feature to have. Moreover, there should be 'dry-run' functionality in such kinds of tools, which can essentially does a quick run of tool without making any changes but reporting any errors/warnings and success/failure. To identify corrupted rows, simply logging them should be enough. In worst case, all rows will be logged and size of logs will be same as input size, which seems fine. However, user might have to do some work figuring out where the logs. Is there some link we can show to the user when the tool starts which can help them with that? For the dry run, we can simply use if-else to skip over creating table, writing out KVs, and other mutations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
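The proposed dry-run switch is essentially an if-else around the write path while bad-row logging stays active. A self-contained sketch of that shape (TsvDryRun and its fields are illustrative names, not part of the actual ImportTsv tool):

```java
import java.util.ArrayList;
import java.util.List;

public class TsvDryRun {
    public final List<String> badRows = new ArrayList<>();    // logged, not fatal
    public final List<String[]> written = new ArrayList<>();  // stand-in for KV writes
    private final int expectedColumns;
    private final boolean dryRun;

    public TsvDryRun(int expectedColumns, boolean dryRun) {
        this.expectedColumns = expectedColumns;
        this.dryRun = dryRun;
    }

    public void processLine(String line) {
        String[] cols = line.split("\t", -1);
        if (cols.length != expectedColumns) {
            badRows.add(line); // record every corrupt row instead of failing on the first
            return;
        }
        if (!dryRun) {
            written.add(cols); // skipped entirely in dry-run mode
        }
    }

    public static void main(String[] args) {
        TsvDryRun job = new TsvDryRun(3, true); // dry run: validate only
        job.processLine("r1\tcf:a\tv1");
        job.processLine("corrupt-row");
        System.out.println(job.badRows.size()); // 1
        System.out.println(job.written.size()); // 0: nothing written in dry run
    }
}
```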
[jira] [Commented] (HBASE-13699) Expand information about HBase quotas
[ https://issues.apache.org/jira/browse/HBASE-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546349#comment-14546349 ] Matteo Bertozzi commented on HBASE-13699: - The Quotas section looks good to me, but you probably removed too much stuff from HBASE-13398. The Namespace Quota is a different beast from Quotas and that stuff is removed; grep for hbase.namespace.quota.maxtables to see that those settings are gone. Also, some of the examples for quotas from the previous patch look good and are clearer than the list we have; your call on what to keep or not. Expand information about HBase quotas - Key: HBASE-13699 URL: https://issues.apache.org/jira/browse/HBASE-13699 Project: HBase Issue Type: Bug Components: documentation Reporter: Misty Stanley-Jones Assignee: Misty Stanley-Jones Attachments: HBASE-13699.patch See HBASE-13398 and http://blog.cloudera.com/blog/2014/12/new-in-cdh-5-2-improvements-for-running-multiple-workloads-on-a-single-hbase-cluster/. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13476) Procedure v2 - Add Replay Order logic for child procedures
[ https://issues.apache.org/jira/browse/HBASE-13476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-13476: Fix Version/s: 1.2.0 2.0.0 Procedure v2 - Add Replay Order logic for child procedures -- Key: HBASE-13476 URL: https://issues.apache.org/jira/browse/HBASE-13476 Project: HBase Issue Type: Sub-task Components: proc-v2 Affects Versions: 2.0.0, 1.1.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13476-v0.patch The current replay order logic is only for single-level procedures (which is what we are using today for master operations). To complete the implementation for the notification-bus we need to be able to replay in correct order child procs too. this will not impact the the current procs implementation (create/delete/modify/...) it is just a change at the framework level. https://reviews.apache.org/r/34289/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13398) Document HBase Quota
[ https://issues.apache.org/jira/browse/HBASE-13398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546457#comment-14546457 ] Hudson commented on HBASE-13398: SUCCESS: Integrated in HBase-TRUNK #6485 (See [https://builds.apache.org/job/HBase-TRUNK/6485/]) HBASE-13398 Document HBase Quota (mstanleyjones: rev 88f0f421c3330f4ba914ecf89d8d2afe78cacbc4) * src/main/asciidoc/_chapters/ops_mgt.adoc Document HBase Quota Key: HBASE-13398 URL: https://issues.apache.org/jira/browse/HBASE-13398 Project: HBase Issue Type: Sub-task Components: documentation Reporter: Ashish Singhi Assignee: Gururaj Shetty Fix For: 2.0.0, 1.1.0 Attachments: HBASE-13398.01.patch, HBASE-13398.02.patch As part of this we should document HBASE-11598 and HBASE-8410 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-13702) ImportTsv: Add dry-run functionality and log bad rows
Apekshit Sharma created HBASE-13702: --- Summary: ImportTsv: Add dry-run functionality and log bad rows Key: HBASE-13702 URL: https://issues.apache.org/jira/browse/HBASE-13702 Project: HBase Issue Type: New Feature Reporter: Apekshit Sharma ImportTSV job skips bad records by default (keeps a count though). -Dimporttsv.skip.bad.lines=false can be used to fail if a bad row is encountered. To be easily able to determine which rows are corrupted in an input, rather than failing on one row at a time seems like a good feature to have. Moreover, there should be 'dry-run' functionality in such kinds of tools, which can essentially does a quick run of tool without making any changes but reporting any errors/warnings and success/failure. To identify corrupted rows, simply logging them should be enough. In worst case, all rows will be logged and size of logs will be same as input size, which seems fine. However, user might have to do some work figuring out where the logs. If there some link we can show the user in the starting which can help them with that? For the dry run, we can simply use if-else to skip over creating table, writing out KVs, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13336) Consistent rules for security meta table protections
[ https://issues.apache.org/jira/browse/HBASE-13336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546304#comment-14546304 ] Mikhail Antonov commented on HBASE-13336: - Looks good! A few nits: - getReservedColumnIfMeta - the naming suggested we expect to chech meta table, but we actually check ACL and labels tables? Would it be more consistent to name like getReservedColumnForSystemTable or so? - just to note that in HBASE-13375 it's proposed to eliminate multiple isSystemOrSuperUser() calls scattered over the codebase and move it to User class instead. Separate checkSystemUser() is proposed as the one throwing an exception vs. returning boolean. Just a nit though. - in preDisableTable do we need both log and ACE to be thrown? ACE doesn't get logged? Consistent rules for security meta table protections Key: HBASE-13336 URL: https://issues.apache.org/jira/browse/HBASE-13336 Project: HBase Issue Type: Improvement Reporter: Andrew Purtell Assignee: Srikanth Srungarapu Fix For: 2.0.0, 0.98.13, 1.2.0 Attachments: HBASE-13336.patch The AccessController and VisibilityController do different things regarding protecting their meta tables. The AC allows schema changes and disable/enable if the user has permission. The VC unconditionally disallows all admin actions. Generally, bad things will happen if these meta tables are damaged, disabled, or dropped. The likely outcome is random frequent (or constant) server side op failures with nasty stack traces. On the other hand some things like column family and table attribute changes can have valid use cases. We should have consistent and sensible rules for protecting security meta tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13700) Allow Thrift2 HSHA server to have configurable threads
[ https://issues.apache.org/jira/browse/HBASE-13700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-13700: -- Description: The half sync half async server by default starts 5 worker threads. For busy servers that might not be enough. That should be configurable. For the threadpool there should be a way to set the max number of threads so that creating threads doesn't run away. That should be configurable. Allow Thrift2 HSHA server to have configurable threads -- Key: HBASE-13700 URL: https://issues.apache.org/jira/browse/HBASE-13700 Project: HBase Issue Type: Bug Components: Thrift Reporter: Elliott Clark Assignee: Elliott Clark Attachments: HBASE-13700.patch The half sync half async server by default starts 5 worker threads. For busy servers that might not be enough. That should be configurable. For the threadpool there should be a way to set the max number of threads so that creating threads doesn't run away. That should be configurable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13700) Allow Thrift2 HSHA server to have configurable threads
[ https://issues.apache.org/jira/browse/HBASE-13700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-13700: -- Attachment: HBASE-13700-v1.patch Zoom Zoom. Pre-make the threads to be faster when the process starts accepting requests. Allow Thrift2 HSHA server to have configurable threads -- Key: HBASE-13700 URL: https://issues.apache.org/jira/browse/HBASE-13700 Project: HBase Issue Type: Bug Components: Thrift Reporter: Elliott Clark Assignee: Elliott Clark Attachments: HBASE-13700-v1.patch, HBASE-13700.patch The half sync half async server by default starts 5 worker threads. For busy servers that might not be enough. That should be configurable. For the threadpool there should be a way to set the max number of threads so that creating threads doesn't run away. That should be configurable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
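The patches themselves aren't shown here, so as a rough sketch only (the class name and values are hypothetical, not the actual Thrift2 server code), the two requested knobs — a configurable worker count and a hard cap so thread creation can't run away, with the workers pre-made before the server accepts requests — map naturally onto a plain java.util.concurrent pool:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the two knobs the issue asks for: a configurable
// worker-thread count, plus a hard upper bound so that creating threads
// doesn't run away under load.
public class BoundedWorkerPool {
    public static ThreadPoolExecutor build(int workerThreads, int maxThreads) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            workerThreads,                        // e.g. read from a config key
            maxThreads,                           // hard cap on thread creation
            60L, TimeUnit.SECONDS,                // extra threads idle out
            // The queue must be bounded, otherwise maxThreads never kicks in:
            // a ThreadPoolExecutor only grows past the core size when an
            // offer to the work queue fails.
            new ArrayBlockingQueue<Runnable>(1024));
        // Pre-make the threads so the process is ready as soon as it starts
        // accepting requests, instead of creating workers lazily.
        pool.prestartAllCoreThreads();
        return pool;
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = build(5, 20);
        System.out.println(pool.getPoolSize()); // 5: workers already started
        pool.shutdown();
    }
}
```

The bounded-queue detail is the easy thing to get wrong when making a pool's maximum configurable: with an unbounded LinkedBlockingQueue the pool never grows beyond its core size at all.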
[jira] [Commented] (HBASE-13651) Handle StoreFileScanner FileNotFoundException
[ https://issues.apache.org/jira/browse/HBASE-13651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546264#comment-14546264 ] Hudson commented on HBASE-13651:

SUCCESS: Integrated in HBase-0.98 #989 (See [https://builds.apache.org/job/HBase-0.98/989/])
HBASE-13651 Handle StoreFileScanner FileNotFoundExceptin (matteo.bertozzi: rev 42e3e37ee3b3d3fc3d348e3888f07237b680e594)
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCorruptedRegionStoreFile.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java

Handle StoreFileScanner FileNotFoundException
Key: HBASE-13651
URL: https://issues.apache.org/jira/browse/HBASE-13651
Project: HBase
Issue Type: Bug
Affects Versions: 0.94.27, 0.98.10.1
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
Fix For: 2.0.0, 0.94.28, 0.98.13, 1.2.0
Attachments: HBASE-13651-0.94-draft.patch, HBASE-13651-draft.patch, HBASE-13651-v0-0.94.patch, HBASE-13651-v0-0.98.patch, HBASE-13651-v0-branch-1.patch, HBASE-13651-v0.patch

Example:
* Machine-1 is serving Region-X and starts a compaction
* Machine-1 goes into a GC pause
* Region-X gets reassigned to Machine-2
* Machine-1 exits from the GC pause
* Machine-1 (re)moves the compacted files
* Machine-1 gets its lease expired and shuts down

Machine-2 now gets tons of FileNotFoundExceptions on scan. If we reassign the region everything is ok, because we pick up the files compacted by Machine-1. This problem doesn't happen in the new code, 1.0+ (I think, but I haven't checked; it may be 1.1), where we write the compaction event to the WAL before (re)moving the files. A workaround is handling FileNotFoundException and refreshing the store files, or shutting down the region and reassigning it.
The first one is easy in 1.0+; the second one requires more work because at the moment we don't have the code to notify the master that the RS is closing the region. Alternatively we can shut down the entire RS (not a good solution, but the case is rare enough).

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
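The first workaround can be modeled with a toy sketch (all names here are invented; the real fix lives in StoreFileScanner/Store, which aren't shown): catch the missing-file error, refresh the view of the store files, and retry instead of failing the scan outright.

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.function.Supplier;

// Toy model of "handle FileNotFoundException and refresh the store files".
// Uses local files and java.nio (whose NoSuchFileException plays the role
// of HDFS's FileNotFoundException); not the actual HBase code.
public class RefreshOnMissingFile {
    public static String readWithRefresh(Path file, Supplier<Path> refresh) throws IOException {
        try {
            return new String(Files.readAllBytes(file));
        } catch (NoSuchFileException e) {
            // The file was (re)moved underneath us, e.g. by a compaction
            // finishing on another server: refresh and retry once against
            // the current set of files instead of propagating the error.
            Path current = refresh.get();
            return new String(Files.readAllBytes(current));
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("store");
        Path compacted = Files.write(dir.resolve("compacted"), "kv-data".getBytes());
        Path gone = dir.resolve("already-removed");   // never created on disk
        System.out.println(readWithRefresh(gone, () -> compacted)); // kv-data
    }
}
```

In the real system the retry would re-open the scanner over the refreshed store-file list rather than re-read a single file, but the control flow is the same shape.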
[jira] [Commented] (HBASE-13651) Handle StoreFileScanner FileNotFoundException
[ https://issues.apache.org/jira/browse/HBASE-13651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546363#comment-14546363 ] Hudson commented on HBASE-13651:

FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #940 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/940/])
HBASE-13651 Handle StoreFileScanner FileNotFoundExceptin (matteo.bertozzi: rev 42e3e37ee3b3d3fc3d348e3888f07237b680e594)
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCorruptedRegionStoreFile.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java

Handle StoreFileScanner FileNotFoundException
Key: HBASE-13651
URL: https://issues.apache.org/jira/browse/HBASE-13651
Project: HBase
Issue Type: Bug
Affects Versions: 0.94.27, 0.98.10.1
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
Fix For: 2.0.0, 0.94.28, 0.98.13, 1.2.0
Attachments: HBASE-13651-0.94-draft.patch, HBASE-13651-draft.patch, HBASE-13651-v0-0.94.patch, HBASE-13651-v0-0.98.patch, HBASE-13651-v0-branch-1.patch, HBASE-13651-v0.patch

Example:
* Machine-1 is serving Region-X and starts a compaction
* Machine-1 goes into a GC pause
* Region-X gets reassigned to Machine-2
* Machine-1 exits from the GC pause
* Machine-1 (re)moves the compacted files
* Machine-1 gets its lease expired and shuts down

Machine-2 now gets tons of FileNotFoundExceptions on scan. If we reassign the region everything is ok, because we pick up the files compacted by Machine-1. This problem doesn't happen in the new code, 1.0+ (I think, but I haven't checked; it may be 1.1), where we write the compaction event to the WAL before (re)moving the files. A workaround is handling FileNotFoundException and refreshing the store files, or shutting down the region and reassigning it.
The first one is easy in 1.0+; the second one requires more work because at the moment we don't have the code to notify the master that the RS is closing the region. Alternatively we can shut down the entire RS (not a good solution, but the case is rare enough).

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13699) Expand information about HBase quotas
[ https://issues.apache.org/jira/browse/HBASE-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546404#comment-14546404 ] Matteo Bertozzi commented on HBASE-13699: - looks ok to me, +1 Expand information about HBase quotas - Key: HBASE-13699 URL: https://issues.apache.org/jira/browse/HBASE-13699 Project: HBase Issue Type: Bug Components: documentation Reporter: Misty Stanley-Jones Assignee: Misty Stanley-Jones Attachments: HBASE-13699-1.patch, HBASE-13699.patch See HBASE-13398 and http://blog.cloudera.com/blog/2014/12/new-in-cdh-5-2-improvements-for-running-multiple-workloads-on-a-single-hbase-cluster/. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13699) Expand information about HBase quotas
[ https://issues.apache.org/jira/browse/HBASE-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misty Stanley-Jones updated HBASE-13699: Resolution: Fixed Fix Version/s: 2.0.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to master. Expand information about HBase quotas - Key: HBASE-13699 URL: https://issues.apache.org/jira/browse/HBASE-13699 Project: HBase Issue Type: Bug Components: documentation Reporter: Misty Stanley-Jones Assignee: Misty Stanley-Jones Fix For: 2.0.0 Attachments: HBASE-13699-1.patch, HBASE-13699.patch See HBASE-13398 and http://blog.cloudera.com/blog/2014/12/new-in-cdh-5-2-improvements-for-running-multiple-workloads-on-a-single-hbase-cluster/. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13699) Expand information about HBase quotas
[ https://issues.apache.org/jira/browse/HBASE-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546484#comment-14546484 ] Hadoop QA commented on HBASE-13699: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12733270/HBASE-13699.patch against master branch at commit 88f0f421c3330f4ba914ecf89d8d2afe78cacbc4. ATTACHMENT ID: 12733270 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+0 tests included{color}. The patch appears to be a documentation patch that doesn't require tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100: + Disable a replication relationship. HBase will no longer send edits to that peer cluster, but it still keeps track of all the new WALs that it will need to replicate if and when it is re-enabled. + Enable the table replication switch for all it's column families. If the table is not found in the destination cluster then it will create one with the same name and column families. 
+hbase set_quota TYPE = THROTTLE, USER = 'u1', LIMIT = '10req/sec' # a per-user request limit +hbase set_quota TYPE = THROTTLE, USER = 'u1', LIMIT = '10M/sec' # a per-user size limit +hbase set_quota TYPE = THROTTLE, USER = 'u1', TABLE = 't2', LIMIT = '5K/min' # a per-user size limit on a table +hbase set_quota TYPE = THROTTLE, USER = 'u1', NAMESPACE = 'ns2', LIMIT = NONE # removing a per-namespace request limit +hbase set_quota TYPE = THROTTLE, NAMESPACE = 'ns1', LIMIT = '10req/sec' # a per-namespace request limit +hbase set_quota TYPE = THROTTLE, TABLE = 't1', LIMIT = '10M/sec' # a per-table size limit +hbase set_quota TYPE = THROTTLE, USER = 'u1', LIMIT = NONE # removing a per-user limit +hbase list_quotas USER = 'u1, NAMESPACE = 'ns2' # list quotas that apply to a given user on a namespace {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.util.TestProcessBasedCluster org.apache.hadoop.hbase.mapreduce.TestImportExport Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14061//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14061//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14061//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14061//console This message is automatically generated. Expand information about HBase quotas - Key: HBASE-13699 URL: https://issues.apache.org/jira/browse/HBASE-13699 Project: HBase Issue Type: Bug Components: documentation Reporter: Misty Stanley-Jones Assignee: Misty Stanley-Jones Fix For: 2.0.0 Attachments: HBASE-13699-1.patch, HBASE-13699.patch See HBASE-13398 and http://blog.cloudera.com/blog/2014/12/new-in-cdh-5-2-improvements-for-running-multiple-workloads-on-a-single-hbase-cluster/. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12867) Shell does not support custom replication endpoint specification
[ https://issues.apache.org/jira/browse/HBASE-12867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546490#comment-14546490 ] Hudson commented on HBASE-12867:

FAILURE: Integrated in HBase-0.98 #990 (See [https://builds.apache.org/job/HBase-0.98/990/])
HBASE-13035 Backport HBASE-12867 Shell does not support custom replication endpoint specification (apurtell: rev 4ff7797f3a754b1cbb7e6de6c78c356321ba2396)
* hbase-shell/src/main/ruby/shell/commands/add_peer.rb
* hbase-shell/src/test/ruby/test_helper.rb
* hbase-shell/src/main/ruby/hbase.rb
* hbase-shell/src/main/ruby/hbase/replication_admin.rb
* hbase-shell/src/test/ruby/hbase/replication_admin_test.rb

Shell does not support custom replication endpoint specification
Key: HBASE-12867
URL: https://issues.apache.org/jira/browse/HBASE-12867
Project: HBase
Issue Type: Bug
Reporter: Andrew Purtell
Assignee: Kevin Risden
Labels: beginner, beginners
Fix For: 2.0.0, 1.1.0
Attachments: HBASE-12867-v1.patch, HBASE-12867-v2.patch, HBASE-12867-v3.patch, HBASE-12867-v3.patch, HBASE-12867-v3.patch, HBASE-12867.patch

On HBASE-12254 and also at https://github.com/risdenk/hbase-custom-replication-endpoint-example [~risdenk] made the following observations and suggestions regarding custom replication endpoints that I think are a reasonable blueprint for improvement:
{quote}
I was trying out the pluggable replication endpoint feature and found the following:
- you must use the ReplicationAdmin to add the new ReplicationEndpoint
- the hbase shell add_peer command doesn't support specifying a custom class
- hbase shell add_peer relies on the newly deprecated ReplicationAdmin addPeer methods
- ReplicationAdmin addPeer tableCfs is now a Map<TableName, ? extends Collection<String>> instead of a string
{quote}
We should fix the add_peer command in the shell at least.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13035) [0.98] Backport HBASE-12867 - Shell does not support custom replication endpoint specification
[ https://issues.apache.org/jira/browse/HBASE-13035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546489#comment-14546489 ] Hudson commented on HBASE-13035: FAILURE: Integrated in HBase-0.98 #990 (See [https://builds.apache.org/job/HBase-0.98/990/]) HBASE-13035 Backport HBASE-12867 Shell does not support custom replication endpoint specification (apurtell: rev 4ff7797f3a754b1cbb7e6de6c78c356321ba2396) * hbase-shell/src/main/ruby/shell/commands/add_peer.rb * hbase-shell/src/test/ruby/test_helper.rb * hbase-shell/src/main/ruby/hbase.rb * hbase-shell/src/main/ruby/hbase/replication_admin.rb * hbase-shell/src/test/ruby/hbase/replication_admin_test.rb [0.98] Backport HBASE-12867 - Shell does not support custom replication endpoint specification -- Key: HBASE-13035 URL: https://issues.apache.org/jira/browse/HBASE-13035 Project: HBase Issue Type: Sub-task Reporter: Enis Soztutar Assignee: Andrew Purtell Fix For: 0.98.13, 1.0.2 Attachments: HBASE-13035-0.98.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13700) Allow Thrift2 HSHA server to have configurable threads
[ https://issues.apache.org/jira/browse/HBASE-13700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546515#comment-14546515 ] Hadoop QA commented on HBASE-13700: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12733276/HBASE-13700.patch against master branch at commit 88f0f421c3330f4ba914ecf89d8d2afe78cacbc4. ATTACHMENT ID: 12733276 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100: + InetSocketAddress inetSocketAddress) throws TTransportException { + server = getTHsHaServer(protocolFactory, processor, transportFactory, workerThreads, inetSocketAddress, metrics); + server = getTThreadPoolServer(protocolFactory, processor, transportFactory, workerThreads, inetSocketAddress); {color:green}+1 site{color}. The mvn site goal succeeds with this patch. 
{color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14062//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14062//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14062//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14062//console This message is automatically generated. Allow Thrift2 HSHA server to have configurable threads -- Key: HBASE-13700 URL: https://issues.apache.org/jira/browse/HBASE-13700 Project: HBase Issue Type: Bug Components: Thrift Reporter: Elliott Clark Assignee: Elliott Clark Attachments: HBASE-13700-v1.patch, HBASE-13700.patch The half sync half async server by default starts 5 worker threads. For busy servers that might not be enough. That should be configurable. For the threadpool there should be a way to set the max number of threads so that creating threads doesn't run away. That should be configurable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13700) Allow Thrift2 HSHA server to have configurable threads
[ https://issues.apache.org/jira/browse/HBASE-13700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546534#comment-14546534 ] Hadoop QA commented on HBASE-13700: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12733280/HBASE-13700-v1.patch against master branch at commit a93353e83ce514b48700b3f5ba16f8a41204e1fa. ATTACHMENT ID: 12733280 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100: + InetSocketAddress inetSocketAddress) throws TTransportException { + server = getTHsHaServer(protocolFactory, processor, transportFactory, workerThreads, inetSocketAddress, metrics); + server = getTThreadPoolServer(protocolFactory, processor, transportFactory, workerThreads, inetSocketAddress); {color:green}+1 site{color}. The mvn site goal succeeds with this patch. 
{color:red}-1 core tests{color}. The patch failed these unit tests: {color:red}-1 core zombie tests{color}. There are 4 zombie test(s): at org.apache.hadoop.hbase.io.hfile.TestCacheOnWrite.testNotCachingDataBlocksDuringCompactionInternals(TestCacheOnWrite.java:454) at org.apache.hadoop.hbase.io.hfile.TestCacheOnWrite.testNotCachingDataBlocksDuringCompaction(TestCacheOnWrite.java:479) at org.apache.hadoop.hbase.wal.TestWALSplit.testLogDirectoryShouldBeDeletedAfterSuccessfulSplit(TestWALSplit.java:671) at org.apache.hadoop.hbase.TestIOFencing.testFencingAroundCompaction(TestIOFencing.java:228) at org.apache.hadoop.hbase.wal.TestWALSplit.testEmptyLogFiles(TestWALSplit.java:477) at org.apache.hadoop.hbase.wal.TestWALSplit.testEmptyOpenLogFiles(TestWALSplit.java:470) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14063//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14063//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14063//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14063//console This message is automatically generated. Allow Thrift2 HSHA server to have configurable threads -- Key: HBASE-13700 URL: https://issues.apache.org/jira/browse/HBASE-13700 Project: HBase Issue Type: Bug Components: Thrift Reporter: Elliott Clark Assignee: Elliott Clark Attachments: HBASE-13700-v1.patch, HBASE-13700.patch The half sync half async server by default starts 5 worker threads. For busy servers that might not be enough. That should be configurable. For the threadpool there should be a way to set the max number of threads so that creating threads doesn't run away. That should be configurable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13702) ImportTsv: Add dry-run functionality and log bad rows
[ https://issues.apache.org/jira/browse/HBASE-13702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apekshit Sharma updated HBASE-13702:

Description:
The ImportTSV job skips bad records by default (it keeps a count, though). -Dimporttsv.skip.bad.lines=false can be used to fail if a bad row is encountered. Being able to easily determine which rows of an input are corrupted, rather than failing on one row at a time, seems like a good feature to have. Moreover, there should be 'dry-run' functionality in these kinds of tools, which essentially does a quick run of the tool without making any changes, reporting any errors/warnings and success/failure. To identify corrupted rows, simply logging them should be enough. In the worst case, all rows will be logged and the size of the logs will be the same as the input size, which seems fine. However, the user might have to do some work figuring out where the logs are. Is there some link we can show to the user when the tool starts which can help them with that? For the dry run, we can simply use an if-else to skip over writing out KVs, and any other mutations, if present.

was:
The ImportTSV job skips bad records by default (it keeps a count, though). -Dimporttsv.skip.bad.lines=false can be used to fail if a bad row is encountered. Being able to easily determine which rows of an input are corrupted, rather than failing on one row at a time, seems like a good feature to have. Moreover, there should be 'dry-run' functionality in these kinds of tools, which essentially does a quick run of the tool without making any changes, reporting any errors/warnings and success/failure. To identify corrupted rows, simply logging them should be enough. In the worst case, all rows will be logged and the size of the logs will be the same as the input size, which seems fine. However, the user might have to do some work figuring out where the logs are. Is there some link we can show to the user when the tool starts which can help them with that? For the dry run, we can simply use an if-else to skip over creating the table, writing out KVs, and other mutations.

ImportTsv: Add dry-run functionality and log bad rows
Key: HBASE-13702
URL: https://issues.apache.org/jira/browse/HBASE-13702
Project: HBase
Issue Type: New Feature
Reporter: Apekshit Sharma

The ImportTSV job skips bad records by default (it keeps a count, though). -Dimporttsv.skip.bad.lines=false can be used to fail if a bad row is encountered. Being able to easily determine which rows of an input are corrupted, rather than failing on one row at a time, seems like a good feature to have. Moreover, there should be 'dry-run' functionality in these kinds of tools, which essentially does a quick run of the tool without making any changes, reporting any errors/warnings and success/failure. To identify corrupted rows, simply logging them should be enough. In the worst case, all rows will be logged and the size of the logs will be the same as the input size, which seems fine. However, the user might have to do some work figuring out where the logs are. Is there some link we can show to the user when the tool starts which can help them with that? For the dry run, we can simply use an if-else to skip over writing out KVs, and any other mutations, if present.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
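As a hedged illustration of the proposal (the class and method names below are invented, not the actual ImportTsv mapper), the two behaviours — log and count every bad row instead of stopping at the first one, and an if-else guarding all writes behind a dry-run flag — look roughly like:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the proposed dry-run behaviour; names are illustrative only.
public class DryRunTsvLoader {
    public int badLines = 0;
    public final List<String> written = new ArrayList<>();

    public void load(List<String> lines, int expectedColumns, boolean dryRun) {
        for (String line : lines) {
            String[] cols = line.split("\t", -1);
            if (cols.length != expectedColumns) {
                badLines++;                        // keep going; just log it
                System.err.println("Bad line: " + line);
                continue;
            }
            if (!dryRun) {                         // the if-else the issue suggests
                written.add(cols[0]);              // stand-in for emitting the KVs
            }
        }
    }
}
```

A dry run over the whole input then reports every corrupted row in one pass while leaving the table untouched; the same code path with dryRun=false performs the actual writes.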
[jira] [Commented] (HBASE-13616) Move ServerShutdownHandler to Pv2
[ https://issues.apache.org/jira/browse/HBASE-13616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546530#comment-14546530 ] Hadoop QA commented on HBASE-13616: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12733283/13616.wip.v3.branch-1.txt against branch-1 branch at commit a93353e83ce514b48700b3f5ba16f8a41204e1fa. ATTACHMENT ID: 12733283 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 12 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100: +regionsOnCrashedServer_ = java.util.Collections.unmodifiableList(regionsOnCrashedServer_); + new java.lang.String[] { UserInfo, UnmodifiedTableSchema, ModifiedTableSchema, DeleteColumnFamilyInModify, }); + new java.lang.String[] { ServerName, DistributedLogReplay, RegionsOnCrashedServer, RegionsToAssign, CarryingMeta, CarryingSystem, ShouldSplitWal, }); {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.master.TestZKLessAMOnCluster org.apache.hadoop.hbase.regionserver.TestHRegionOnCluster org.apache.hadoop.hbase.master.TestAssignmentManager org.apache.hadoop.hbase.master.TestMasterFailover org.apache.hadoop.hbase.master.TestDistributedLogSplitting {color:red}-1 core zombie tests{color}. There are 3 zombie test(s): at org.apache.hadoop.hbase.client.TestMetaWithReplicas.testShutdownHandling(TestMetaWithReplicas.java:140) at org.apache.hadoop.hbase.io.hfile.TestCacheOnWrite.testNotCachingDataBlocksDuringCompactionInternals(TestCacheOnWrite.java:468) at org.apache.hadoop.hbase.io.hfile.TestCacheOnWrite.testNotCachingDataBlocksDuringCompaction(TestCacheOnWrite.java:493) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14064//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14064//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14064//artifact/patchprocess/checkstyle-aggregate.html Javadoc warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14064//artifact/patchprocess/patchJavadocWarnings.txt Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14064//console This message is automatically generated. Move ServerShutdownHandler to Pv2 - Key: HBASE-13616 URL: https://issues.apache.org/jira/browse/HBASE-13616 Project: HBase Issue Type: Sub-task Components: proc-v2 Affects Versions: 1.1.0 Reporter: stack Assignee: stack Attachments: 13616.wip.txt, 13616.wip.v3.branch-1.txt, 13616wip.v2.txt Move ServerShutdownHandler to run on ProcedureV2. Need this for DLR to work. See HBASE-13567. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12867) Shell does not support custom replication endpoint specification
[ https://issues.apache.org/jira/browse/HBASE-12867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546544#comment-14546544 ] Hudson commented on HBASE-12867:

FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #941 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/941/])
HBASE-13035 Backport HBASE-12867 Shell does not support custom replication endpoint specification (apurtell: rev 4ff7797f3a754b1cbb7e6de6c78c356321ba2396)
* hbase-shell/src/main/ruby/hbase/replication_admin.rb
* hbase-shell/src/test/ruby/hbase/replication_admin_test.rb
* hbase-shell/src/test/ruby/test_helper.rb
* hbase-shell/src/main/ruby/hbase.rb
* hbase-shell/src/main/ruby/shell/commands/add_peer.rb

Shell does not support custom replication endpoint specification
Key: HBASE-12867
URL: https://issues.apache.org/jira/browse/HBASE-12867
Project: HBase
Issue Type: Bug
Reporter: Andrew Purtell
Assignee: Kevin Risden
Labels: beginner, beginners
Fix For: 2.0.0, 1.1.0
Attachments: HBASE-12867-v1.patch, HBASE-12867-v2.patch, HBASE-12867-v3.patch, HBASE-12867-v3.patch, HBASE-12867-v3.patch, HBASE-12867.patch

On HBASE-12254 and also at https://github.com/risdenk/hbase-custom-replication-endpoint-example [~risdenk] made the following observations and suggestions regarding custom replication endpoints that I think are a reasonable blueprint for improvement:
{quote}
I was trying out the pluggable replication endpoint feature and found the following:
- you must use the ReplicationAdmin to add the new ReplicationEndpoint
- the hbase shell add_peer command doesn't support specifying a custom class
- hbase shell add_peer relies on the newly deprecated ReplicationAdmin addPeer methods
- ReplicationAdmin addPeer tableCfs is now a Map<TableName, ? extends Collection<String>> instead of a string
{quote}
We should fix the add_peer command in the shell at least.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13035) [0.98] Backport HBASE-12867 - Shell does not support custom replication endpoint specification
[ https://issues.apache.org/jira/browse/HBASE-13035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546543#comment-14546543 ] Hudson commented on HBASE-13035: FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #941 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/941/]) HBASE-13035 Backport HBASE-12867 Shell does not support custom replication endpoint specification (apurtell: rev 4ff7797f3a754b1cbb7e6de6c78c356321ba2396) * hbase-shell/src/test/ruby/test_helper.rb * hbase-shell/src/main/ruby/hbase.rb * hbase-shell/src/main/ruby/hbase/replication_admin.rb * hbase-shell/src/test/ruby/hbase/replication_admin_test.rb * hbase-shell/src/main/ruby/shell/commands/add_peer.rb [0.98] Backport HBASE-12867 - Shell does not support custom replication endpoint specification -- Key: HBASE-13035 URL: https://issues.apache.org/jira/browse/HBASE-13035 Project: HBase Issue Type: Sub-task Reporter: Enis Soztutar Assignee: Andrew Purtell Fix For: 0.98.13, 1.0.2 Attachments: HBASE-13035-0.98.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13699) Expand information about HBase quotas
[ https://issues.apache.org/jira/browse/HBASE-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546532#comment-14546532 ] Hudson commented on HBASE-13699: FAILURE: Integrated in HBase-TRUNK #6486 (See [https://builds.apache.org/job/HBase-TRUNK/6486/]) HBASE-13699 Expand documentation about quotas and other load balancing mechanisms (mstanleyjones: rev a93353e83ce514b48700b3f5ba16f8a41204e1fa) * src/main/asciidoc/_chapters/ops_mgt.adoc Expand information about HBase quotas - Key: HBASE-13699 URL: https://issues.apache.org/jira/browse/HBASE-13699 Project: HBase Issue Type: Bug Components: documentation Reporter: Misty Stanley-Jones Assignee: Misty Stanley-Jones Fix For: 2.0.0 Attachments: HBASE-13699-1.patch, HBASE-13699.patch See HBASE-13398 and http://blog.cloudera.com/blog/2014/12/new-in-cdh-5-2-improvements-for-running-multiple-workloads-on-a-single-hbase-cluster/. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546052#comment-14546052 ] Apekshit Sharma commented on HBASE-11927: - Changed InterfaceAudience of DataChecksum to include HBase. Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C) Key: HBASE-11927 URL: https://issues.apache.org/jira/browse/HBASE-11927 Project: HBase Issue Type: Bug Reporter: stack Assignee: Apekshit Sharma Fix For: 2.0.0, 1.2.0 Attachments: HBASE-11927-v1.patch, HBASE-11927-v2.patch, HBASE-11927-v4.patch, HBASE-11927-v5.patch, HBASE-11927-v6.patch, HBASE-11927-v7.patch, HBASE-11927-v8.patch, HBASE-11927-v8.patch, HBASE-11927.patch, after-compact-2%.svg, after-randomWrite1M-0.5%.svg, before-compact-22%.svg, before-randomWrite1M-5%.svg, c2021.crc2.svg, c2021.write.2.svg, c2021.zip.svg, crc32ct.svg Up in hadoop they have this change. Let me publish some graphs to show that it makes a difference (CRC is a massive amount of our CPU usage in my profiling of an upload because of compacting, flushing, etc.). We should also make use of native CRCings -- especially the 2.6 HDFS-6865 and ilk -- in hbase but that is another issue for now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13647) Default value for hbase.client.operation.timeout is too high
[ https://issues.apache.org/jira/browse/HBASE-13647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546058#comment-14546058 ] Nick Dimiduk commented on HBASE-13647: -- Is this something we want to advertise in hbase-default.xml as well? Definitely for branch-1, it's not a breaking change (and I'm no RM for branch-1 in general, just the 1.1 line). An unbounded timeout seems like a bug to me, and your proposed value is consistent with our other timeouts, so +1 for a bug-fix on branch-1.1. [~enis] and [~apurtell] probably want it for branch-1.0 and 0.98, if they agree with my reasoning anyway :) Default value for hbase.client.operation.timeout is too high Key: HBASE-13647 URL: https://issues.apache.org/jira/browse/HBASE-13647 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 1.0.1, 1.1.1 Reporter: Andrey Stepachev Assignee: Andrey Stepachev Priority: Critical Fix For: 2.0.0 Attachments: HBASE-13647.patch The default value for hbase.client.operation.timeout is too high: it is Long.MAX_VALUE. That value will block any service calls to coprocessor endpoints indefinitely. Should we introduce a better default value for it? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apekshit Sharma updated HBASE-11927: Issue Type: Improvement (was: Bug) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C) Key: HBASE-11927 URL: https://issues.apache.org/jira/browse/HBASE-11927 Project: HBase Issue Type: Improvement Reporter: stack Assignee: Apekshit Sharma Fix For: 2.0.0, 1.2.0 Attachments: HBASE-11927-v1.patch, HBASE-11927-v2.patch, HBASE-11927-v4.patch, HBASE-11927-v5.patch, HBASE-11927-v6.patch, HBASE-11927-v7.patch, HBASE-11927-v8.patch, HBASE-11927-v8.patch, HBASE-11927.patch, after-compact-2%.svg, after-randomWrite1M-0.5%.svg, before-compact-22%.svg, before-randomWrite1M-5%.svg, c2021.crc2.svg, c2021.write.2.svg, c2021.zip.svg, crc32ct.svg Up in hadoop they have this change. Let me publish some graphs to show that it makes a difference (CRC is a massive amount of our CPU usage in my profiling of an upload because of compacting, flushing, etc.). We should also make use of native CRCings -- especially the 2.6 HDFS-6865 and ilk -- in hbase but that is another issue for now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HBASE-8323) Low hanging checksum improvements
[ https://issues.apache.org/jira/browse/HBASE-8323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apekshit Sharma resolved HBASE-8323. Resolution: Duplicate Low hanging checksum improvements - Key: HBASE-8323 URL: https://issues.apache.org/jira/browse/HBASE-8323 Project: HBase Issue Type: Improvement Components: Performance Reporter: Enis Soztutar Over in Hadoop land, [~tlipcon] had done some improvements for checksums: a native implementation for CRC32C (HADOOP-7445) and bulk verification of checksums (HADOOP-7444). In HBase, we can do the same: - Develop a bulk verify API. Regardless of hbase.hstore.bytes.per.checksum, we always want to verify the whole checksum for the hfile block. - Enable NativeCrc32 to be used as a checksum algorithm. It is not clear how much gain we can expect over pure Java CRC32. Longer term, though, we should focus on convincing the HDFS folks to do inline checksums (HDFS-2699). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apekshit Sharma updated HBASE-11927: Component/s: Performance Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C) Key: HBASE-11927 URL: https://issues.apache.org/jira/browse/HBASE-11927 Project: HBase Issue Type: Improvement Components: Performance Reporter: stack Assignee: Apekshit Sharma Fix For: 2.0.0, 1.2.0 Attachments: HBASE-11927-v1.patch, HBASE-11927-v2.patch, HBASE-11927-v4.patch, HBASE-11927-v5.patch, HBASE-11927-v6.patch, HBASE-11927-v7.patch, HBASE-11927-v8.patch, HBASE-11927-v8.patch, HBASE-11927.patch, after-compact-2%.svg, after-randomWrite1M-0.5%.svg, before-compact-22%.svg, before-randomWrite1M-5%.svg, c2021.crc2.svg, c2021.write.2.svg, c2021.zip.svg, crc32ct.svg Up in hadoop they have this change. Let me publish some graphs to show that it makes a difference (CRC is a massive amount of our CPU usage in my profiling of an upload because of compacting, flushing, etc.). We should also make use of native CRCings -- especially the 2.6 HDFS-6865 and ilk -- in hbase but that is another issue for now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13035) [0.98] Backport HBASE-12867 - Shell does not support custom replication endpoint specification
[ https://issues.apache.org/jira/browse/HBASE-13035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546085#comment-14546085 ] Hudson commented on HBASE-13035: FAILURE: Integrated in HBase-1.0 #916 (See [https://builds.apache.org/job/HBase-1.0/916/]) HBASE-13035 Backport HBASE-12867 Shell does not support custom replication endpoint specification (apurtell: rev 31fb058394f70f1742827ce3bdc97695fb3eae81) * hbase-shell/src/main/ruby/hbase/replication_admin.rb * hbase-shell/src/test/ruby/hbase/replication_admin_test.rb * hbase-shell/src/test/ruby/test_helper.rb * hbase-shell/src/main/ruby/shell/commands/add_peer.rb * hbase-shell/src/main/ruby/hbase.rb [0.98] Backport HBASE-12867 - Shell does not support custom replication endpoint specification -- Key: HBASE-13035 URL: https://issues.apache.org/jira/browse/HBASE-13035 Project: HBase Issue Type: Sub-task Reporter: Enis Soztutar Assignee: Andrew Purtell Fix For: 0.98.13, 1.0.2 Attachments: HBASE-13035-0.98.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13646) HRegion#execService should not try to build incomplete messages
[ https://issues.apache.org/jira/browse/HBASE-13646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546084#comment-14546084 ] Nick Dimiduk commented on HBASE-13646: -- What [~stack] said :) This would mean a coprocessor running on 1.1.x would experience a different environment from one running on 1.1.(x+1), so I think it's not acceptable as a semver patch-level change. Thus please no for branch-1.0 or branch-1.1. HRegion#execService should not try to build incomplete messages --- Key: HBASE-13646 URL: https://issues.apache.org/jira/browse/HBASE-13646 Project: HBase Issue Type: Bug Components: Coprocessors, regionserver Affects Versions: 2.0.0, 1.2.0, 1.1.1 Reporter: Andrey Stepachev Assignee: Andrey Stepachev Fix For: 2.0.0 Attachments: HBASE-13646-branch-1.patch, HBASE-13646.patch, HBASE-13646.v2.patch, HBASE-13646.v2.patch If an RPC service called on a region throws an exception, execService still tries to build a Message. In the case of complex messages with required fields, this complicates service code, because the service needs to pass fake protobuf objects just so they are barely buildable. To mitigate this, I propose checking whether the controller failed and returning null from the call instead of failing with an exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12867) Shell does not support custom replication endpoint specification
[ https://issues.apache.org/jira/browse/HBASE-12867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546087#comment-14546087 ] Hudson commented on HBASE-12867: FAILURE: Integrated in HBase-1.0 #916 (See [https://builds.apache.org/job/HBase-1.0/916/]) HBASE-13035 Backport HBASE-12867 Shell does not support custom replication endpoint specification (apurtell: rev 31fb058394f70f1742827ce3bdc97695fb3eae81) * hbase-shell/src/main/ruby/hbase/replication_admin.rb * hbase-shell/src/test/ruby/hbase/replication_admin_test.rb * hbase-shell/src/test/ruby/test_helper.rb * hbase-shell/src/main/ruby/shell/commands/add_peer.rb * hbase-shell/src/main/ruby/hbase.rb Shell does not support custom replication endpoint specification Key: HBASE-12867 URL: https://issues.apache.org/jira/browse/HBASE-12867 Project: HBase Issue Type: Bug Reporter: Andrew Purtell Assignee: Kevin Risden Labels: beginner, beginners Fix For: 2.0.0, 1.1.0 Attachments: HBASE-12867-v1.patch, HBASE-12867-v2.patch, HBASE-12867-v3.patch, HBASE-12867-v3.patch, HBASE-12867-v3.patch, HBASE-12867.patch On HBASE-12254 and also at https://github.com/risdenk/hbase-custom-replication-endpoint-example [~risdenk] made the following observations and suggestions regarding custom replication endpoints that I think are a reasonable blueprint for improvement: {quote} I was trying out the pluggable replication endpoint feature and found the following: - you must use the ReplicationAdmin to add the new ReplicationEndpoint - hbase shell add_peer command doesn't support specifying a custom class - hbase shell add_peer relies on the newly deprecated ReplicationAdmin addPeer methods - ReplicationAdmin addPeer tableCfs is now a Map<TableName, ? extends Collection<String>> instead of a string {quote} We should fix the add_peer command in the shell at least. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13084) Add labels to VisibilityLabelsCache asynchronously causes TestShell flakey
[ https://issues.apache.org/jira/browse/HBASE-13084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546086#comment-14546086 ] Hudson commented on HBASE-13084: FAILURE: Integrated in HBase-1.0 #916 (See [https://builds.apache.org/job/HBase-1.0/916/]) HBASE-13084 Add labels to VisibilityLabelsCache asynchronously causes TestShell flakey (apurtell: rev e36876b75ef72e4149705c1819e5a945f90946ec) * hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java * hbase-shell/src/test/ruby/hbase/visibility_labels_admin_test.rb Add labels to VisibilityLabelsCache asynchronously causes TestShell flakey -- Key: HBASE-13084 URL: https://issues.apache.org/jira/browse/HBASE-13084 Project: HBase Issue Type: Bug Components: test Reporter: zhangduo Assignee: zhangduo Fix For: 2.0.0, 1.1.0, 0.98.13 Attachments: HBASE-13084-addendum.patch, HBASE-13084-addendum2.patch, HBASE-13084.patch, HBASE-13084_1.patch, HBASE-13084_2.patch, HBASE-13084_2.patch, HBASE-13084_2.patch, HBASE-13084_2.patch, HBASE-13084_2_disable_test.patch As discussed in HBASE-12953, we found this error in PreCommit log https://builds.apache.org/job/PreCommit-HBASE-Build/12918/artifact/hbase-shell/target/surefire-reports/org.apache.hadoop.hbase.client.TestShell-output.txt {noformat} 1) Error: test_The_get/put_methods_should_work_for_data_written_with_Visibility(Hbase::VisibilityLabelsAdminMethodsTest): ArgumentError: org.apache.hadoop.hbase.DoNotRetryIOException: org.apache.hadoop.hbase.security.visibility.InvalidLabelException: Label 'TEST_VISIBILITY' doesn't exists at org.apache.hadoop.hbase.security.visibility.VisibilityController.setAuths(VisibilityController.java:808) at org.apache.hadoop.hbase.protobuf.generated.VisibilityLabelsProtos$VisibilityLabelsService$1.setAuths(VisibilityLabelsProtos.java:6036) at org.apache.hadoop.hbase.protobuf.generated.VisibilityLabelsProtos$VisibilityLabelsService.callMethod(VisibilityLabelsProtos.java:6219) at 
org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:6867) at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:1707) at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:1689) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31309) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2038) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107) at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130) at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107) at java.lang.Thread.run(Thread.java:744) /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-shell/src/main/ruby/hbase/visibility_labels.rb:84:in `set_auths' ./src/test/ruby/hbase/visibility_labels_admin_test.rb:77:in `test_The_get/put_methods_should_work_for_data_written_with_Visibility' org/jruby/RubyProc.java:270:in `call' org/jruby/RubyKernel.java:2105:in `send' org/jruby/RubyArray.java:1620:in `each' org/jruby/RubyArray.java:1620:in `each' 2) Error: test_The_set/clear_methods_should_work_with_authorizations(Hbase::VisibilityLabelsAdminMethodsTest): ArgumentError: No authentication set for the given user jenkins /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-shell/src/main/ruby/hbase/visibility_labels.rb:97:in `get_auths' ./src/test/ruby/hbase/visibility_labels_admin_test.rb:57:in `test_The_set/clear_methods_should_work_with_authorizations' org/jruby/RubyProc.java:270:in `call' org/jruby/RubyKernel.java:2105:in `send' org/jruby/RubyArray.java:1620:in `each' org/jruby/RubyArray.java:1620:in `each' {noformat} This is the test code {code:title=visibility_labels_admin_test.rb} label = 'TEST_VISIBILITY' user = org.apache.hadoop.hbase.security.User.getCurrent().getName(); visibility_admin.add_labels(label) visibility_admin.set_auths(user, label) {code} It says 'label does not 
exists' when calling set_auths. Then I add some ugly logs in DefaultVisibilityLabelServiceImpl and VisibilityLabelsCache. {code:title=DefaultVisibilityLabelServiceImpl.java} public OperationStatus[] addLabels(List<byte[]> labels) throws IOException { ... if (mutateLabelsRegion(puts, finalOpStatus)) { updateZk(true); } for (byte[] label : labels) { String labelStr = Bytes.toString(label); LOG.info(labelStr + "=" + this.labelsCache.getLabelOrdinal(labelStr)); } ... } {code}
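The flake described above boils down to a write-then-read race against an asynchronously refreshed cache: add_labels persists the label, but the read side (VisibilityLabelsCache, updated via ZooKeeper in HBase) may not have seen it yet when set_auths runs. A minimal model of the race and its fix, using hypothetical names rather than the actual HBase classes:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class AsyncCacheRace {
    // Hypothetical model: the read-side label cache is refreshed on a
    // separate thread, standing in for the ZK-driven cache update.
    static final Set<String> cache = ConcurrentHashMap.newKeySet();
    static final ExecutorService notifier = Executors.newSingleThreadExecutor();

    // addLabel "persists" the label and schedules the async cache update;
    // callers that read the cache immediately may still miss the label.
    static Future<?> addLabel(String label) {
        return notifier.submit(() -> cache.add(label));
    }

    public static void main(String[] args) throws Exception {
        Future<?> f = addLabel("TEST_VISIBILITY");
        // The fix amounts to waiting until the cache reflects the write
        // before performing a dependent read (here: blocking on the future).
        f.get();
        System.out.println(cache.contains("TEST_VISIBILITY"));
        notifier.shutdown();
    }
}
```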
[jira] [Updated] (HBASE-13651) Handle StoreFileScanner FileNotFoundException
[ https://issues.apache.org/jira/browse/HBASE-13651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-13651: Resolution: Fixed Fix Version/s: 1.2.0 0.98.13 0.94.28 2.0.0 Status: Resolved (was: Patch Available) Handle StoreFileScanner FileNotFoundException - Key: HBASE-13651 URL: https://issues.apache.org/jira/browse/HBASE-13651 Project: HBase Issue Type: Bug Affects Versions: 0.94.27, 0.98.10.1 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Minor Fix For: 2.0.0, 0.94.28, 0.98.13, 1.2.0 Attachments: HBASE-13651-0.94-draft.patch, HBASE-13651-draft.patch, HBASE-13651-v0-0.94.patch, HBASE-13651-v0-0.98.patch, HBASE-13651-v0-branch-1.patch, HBASE-13651-v0.patch Example: * Machine-1 is serving Region-X and starts a compaction * Machine-1 goes into a GC pause * Region-X gets reassigned to Machine-2 * Machine-1 exits the GC pause * Machine-1 (re)moves the compacted files * Machine-1 has its lease expire and shuts down Machine-2 now sees tons of FileNotFoundExceptions on scan. If we reassign the region, everything is OK, because we pick up the files compacted by Machine-1. This problem doesn't happen in the new code, 1.0+ (I think, but I haven't checked; it may be 1.1), where we write the compaction event to the WAL before (re)moving the files. A workaround is handling FileNotFoundException and refreshing the store files, or shutting down the region and reassigning it. The first is easy in 1.0+; the second requires more work, because at the moment we don't have the code to notify the master that the RS is closing the region. Alternatively we can shut down the entire RS (it is not a good solution, but the case is rare enough) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13694) CallQueueSize is incorrectly decremented until the response is sent
[ https://issues.apache.org/jira/browse/HBASE-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14545785#comment-14545785 ] stack commented on HBASE-13694: --- But if there is an error, we need to decrement the callQueueSize, and with your patch we will not, [~esteban]. If we do this, we'll accumulate more responses in the server, which we will be working on returning to clients while letting more requests in the front door. You see CPU use go up, [~esteban]? What is the 'other issue'? CallQueueSize is incorrectly decremented until the response is sent --- Key: HBASE-13694 URL: https://issues.apache.org/jira/browse/HBASE-13694 Project: HBase Issue Type: Bug Components: master, regionserver, rpc Affects Versions: 2.0.0, 1.1.0, 1.0.2, 1.2.0 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Attachments: 0001-HBASE-13694-CallQueueSize-is-incorrectly-decremented.patch We should decrement the CallQueueSize as soon as we no longer need the call around, e.g. after {{RpcServer.CurCall.set(null)}}; otherwise we will only be pushing back other client requests while we send the response back to the client that originated the call. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
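The debate above is about when to release a call's contribution to the queue-size counter that drives server push-back. A toy sketch of that accounting, with hypothetical names rather than HBase's actual RpcServer internals: releasing early (before the response is written) admits new calls sooner, at the cost of accumulating pending responses; releasing late keeps push-back honest about memory still held.

```java
import java.util.concurrent.atomic.AtomicLong;

public class CallQueueAccounting {
    // Hypothetical counter standing in for the server's call-queue size,
    // which backs the decision to reject new calls when the server is full.
    static final AtomicLong callQueueSizeInBytes = new AtomicLong();

    static String handle(byte[] request, boolean decrementBeforeResponse) {
        callQueueSizeInBytes.addAndGet(request.length);       // call admitted
        String response = "ok";                               // handler runs
        if (decrementBeforeResponse) {
            // Early release: new calls are admitted while this response
            // is still buffered, waiting to be written to the socket.
            callQueueSizeInBytes.addAndGet(-request.length);
        }
        // ... response bytes would be written to the client here ...
        if (!decrementBeforeResponse) {
            // Conservative release: the counter reflects the call until
            // the response has actually left the server.
            callQueueSizeInBytes.addAndGet(-request.length);
        }
        return response;
    }

    public static void main(String[] args) {
        handle(new byte[128], true);
        System.out.println(callQueueSizeInBytes.get());
    }
}
```

Either way the counter must return to zero on every path, including errors; the patch discussion is about which point on the success path is correct.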
[jira] [Commented] (HBASE-13697) update ref guide prereq support tables for 1.1 release train
[ https://issues.apache.org/jira/browse/HBASE-13697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546138#comment-14546138 ] Nick Dimiduk commented on HBASE-13697: -- Yeah, could be site hasn't been pushed lately. update ref guide prereq support tables for 1.1 release train Key: HBASE-13697 URL: https://issues.apache.org/jira/browse/HBASE-13697 Project: HBase Issue Type: Task Affects Versions: 1.1.0 Reporter: Sean Busbey Priority: Blocker Fix For: 1.1.1 the ref guide doesn't have a listing for Java or Hadoop versions needed / supported for the 1.1 release series. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11927) Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C)
[ https://issues.apache.org/jira/browse/HBASE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546143#comment-14546143 ] Hudson commented on HBASE-11927: SUCCESS: Integrated in HBase-1.2 #79 (See [https://builds.apache.org/job/HBase-1.2/79/]) HBASE-11927 Use Native Hadoop Library for HFile checksum. (Apekshit) (stack: rev 1cf85b3f7fd7a7d48894dc7d42dcf6978197f2f7) * hbase-common/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileContextBuilder.java * hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java * hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java * hbase-common/src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java * hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java * hbase-common/src/main/resources/hbase-default.xml * hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java * hbase-common/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileContext.java Use Native Hadoop Library for HFile checksum (And flip default from CRC32 to CRC32C) Key: HBASE-11927 URL: https://issues.apache.org/jira/browse/HBASE-11927 Project: HBase Issue Type: Improvement Components: Performance Reporter: stack Assignee: Apekshit Sharma Fix For: 2.0.0, 1.2.0 Attachments: HBASE-11927-v1.patch, HBASE-11927-v2.patch, HBASE-11927-v4.patch, HBASE-11927-v5.patch, HBASE-11927-v6.patch, HBASE-11927-v7.patch, HBASE-11927-v8.patch, HBASE-11927-v8.patch, HBASE-11927.patch, after-compact-2%.svg, after-randomWrite1M-0.5%.svg, before-compact-22%.svg, before-randomWrite1M-5%.svg, c2021.crc2.svg, c2021.write.2.svg, c2021.zip.svg, crc32ct.svg Up in hadoop they have this change. 
Let me publish some graphs to show that it makes a difference (CRC is a massive amount of our CPU usage in my profiling of an upload because of compacting, flushing, etc.). We should also make use of native CRCings -- especially the 2.6 HDFS-6865 and ilk -- in hbase but that is another issue for now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
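The CPU savings claimed here come from CRC32C mapping onto a dedicated hardware instruction (SSE4.2's crc32) where plain CRC32 cannot. As an illustration of the two algorithms, not HBase code: the JDK ships both (java.util.zip.CRC32C since Java 9), and they use different polynomials, so they produce different checksums for the same bytes. Over the standard check input "123456789", CRC32 yields CBF43926 and CRC32C yields E3069283.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.CRC32C;

public class ChecksumCompare {
    public static void main(String[] args) {
        // Standard check input used to validate CRC implementations.
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);

        CRC32 crc32 = new CRC32();      // zlib/gzip polynomial 0x04C11DB7
        crc32.update(data);

        CRC32C crc32c = new CRC32C();   // Castagnoli polynomial 0x1EDC6F41,
        crc32c.update(data);            // hardware-accelerated on SSE4.2 CPUs

        System.out.printf("CRC32  = %08X%n", crc32.getValue());
        System.out.printf("CRC32C = %08X%n", crc32c.getValue());
    }
}
```

Because the polynomials differ, checksums written as CRC32 cannot be verified as CRC32C, which is why flipping the default is a file-format-level decision, not just a performance knob.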
[jira] [Commented] (HBASE-13651) Handle StoreFileScanner FileNotFoundException
[ https://issues.apache.org/jira/browse/HBASE-13651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546144#comment-14546144 ] Hudson commented on HBASE-13651: SUCCESS: Integrated in HBase-1.2 #79 (See [https://builds.apache.org/job/HBase-1.2/79/]) HBASE-13651 Handle StoreFileScanner FileNotFoundException (matteo.bertozzi: rev 6968834c9c96c103e3a87f1be0dace49f2c9461e) * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCorruptedRegionStoreFile.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java Handle StoreFileScanner FileNotFoundException - Key: HBASE-13651 URL: https://issues.apache.org/jira/browse/HBASE-13651 Project: HBase Issue Type: Bug Affects Versions: 0.94.27, 0.98.10.1 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Minor Fix For: 2.0.0, 0.94.28, 0.98.13, 1.2.0 Attachments: HBASE-13651-0.94-draft.patch, HBASE-13651-draft.patch, HBASE-13651-v0-0.94.patch, HBASE-13651-v0-0.98.patch, HBASE-13651-v0-branch-1.patch, HBASE-13651-v0.patch Example: * Machine-1 is serving Region-X and starts a compaction * Machine-1 goes into a GC pause * Region-X gets reassigned to Machine-2 * Machine-1 exits the GC pause * Machine-1 (re)moves the compacted files * Machine-1 has its lease expire and shuts down Machine-2 now sees tons of FileNotFoundExceptions on scan. If we reassign the region, everything is OK, because we pick up the files compacted by Machine-1. This problem doesn't happen in the new code, 1.0+ (I think, but I haven't checked; it may be 1.1), where we write the compaction event to the WAL before (re)moving the files. A workaround is handling FileNotFoundException and refreshing the store files, or shutting down the region and reassigning it.
The first is easy in 1.0+; the second requires more work, because at the moment we don't have the code to notify the master that the RS is closing the region. Alternatively we can shut down the entire RS (it is not a good solution, but the case is rare enough) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
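The "handle FileNotFoundException and refresh the store files" workaround described above can be sketched as a catch-refresh-retry wrapper. This is an illustration with hypothetical names, not the actual StoreFileScanner/HRegion change from the patch:

```java
import java.io.FileNotFoundException;
import java.util.concurrent.Callable;

public class RefreshOnFnfe {
    // Hypothetical helper: if the scan hits a FileNotFoundException
    // (e.g. the hfile was removed by a compaction finishing on another
    // server), refresh the view of the store files and retry once.
    static <T> T callWithRefresh(Callable<T> scan, Runnable refreshStoreFiles) throws Exception {
        try {
            return scan.call();
        } catch (FileNotFoundException e) {
            refreshStoreFiles.run(); // re-list store files, drop stale readers
            return scan.call();      // retry against the refreshed view
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulate a scan that fails until the store files are refreshed.
        final boolean[] refreshed = {false};
        String result = callWithRefresh(
            () -> {
                if (!refreshed[0]) throw new FileNotFoundException("hfile gone");
                return "row";
            },
            () -> refreshed[0] = true);
        System.out.println(result);
    }
}
```

A single retry is enough here because the refreshed file list reflects the completed compaction; repeated failures after a refresh would indicate real corruption rather than this race.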
[jira] [Commented] (HBASE-13336) Consistent rules for security meta table protections
[ https://issues.apache.org/jira/browse/HBASE-13336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546145#comment-14546145 ] Hadoop QA commented on HBASE-13336: --- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12733196/HBASE-13336.patch against master branch at commit 9ba7337ac82d13b22a1b0c40edaba7873c0bd795. ATTACHMENT ID: 12733196 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14059//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14059//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14059//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14059//console This message is automatically generated. Consistent rules for security meta table protections Key: HBASE-13336 URL: https://issues.apache.org/jira/browse/HBASE-13336 Project: HBase Issue Type: Improvement Reporter: Andrew Purtell Assignee: Srikanth Srungarapu Fix For: 2.0.0, 0.98.13, 1.2.0 Attachments: HBASE-13336.patch The AccessController and VisibilityController do different things regarding protecting their meta tables. The AC allows schema changes and disable/enable if the user has permission. The VC unconditionally disallows all admin actions. Generally, bad things will happen if these meta tables are damaged, disabled, or dropped. The likely outcome is random frequent (or constant) server side op failures with nasty stack traces. On the other hand some things like column family and table attribute changes can have valid use cases. We should have consistent and sensible rules for protecting security meta tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-13698) Add RegionLocator methods to Thrift2 proxy.
Elliott Clark created HBASE-13698: - Summary: Add RegionLocator methods to Thrift2 proxy. Key: HBASE-13698 URL: https://issues.apache.org/jira/browse/HBASE-13698 Project: HBase Issue Type: Bug Reporter: Elliott Clark Assignee: Elliott Clark Thrift2 doesn't provide the same functionality as the java client for getting region locations. We should change that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HBASE-13681) Refactor Scan section in refguide to take account of scanner chunking, heartbeating, prefetch
[ https://issues.apache.org/jira/browse/HBASE-13681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misty Stanley-Jones reassigned HBASE-13681: --- Assignee: Misty Stanley-Jones Refactor Scan section in refguide to take account of scanner chunking, heartbeating, prefetch - Key: HBASE-13681 URL: https://issues.apache.org/jira/browse/HBASE-13681 Project: HBase Issue Type: Task Components: documentation Affects Versions: 1.1.0 Reporter: stack Assignee: Misty Stanley-Jones Scanners got a revamp courtesy of [~jonathan.lawlor] Our Scan section in perf section, http://hbase.apache.org/book.html#perf.reading, doesn't jibe with his redo. Fix. His blog post is good source material: https://blogs.apache.org/hbase/entry/scan_improvements_in_hbase_1 While at it, include note on HBASE-13071. It has a fat release note to use as input. [~misty] This one for you? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13697) update ref guide prereq support tables for 1.1 release train
[ https://issues.apache.org/jira/browse/HBASE-13697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14546125#comment-14546125 ] Sean Busbey commented on HBASE-13697: - Huh. Maybe the site version hasn't been updated? update ref guide prereq support tables for 1.1 release train Key: HBASE-13697 URL: https://issues.apache.org/jira/browse/HBASE-13697 Project: HBase Issue Type: Task Affects Versions: 1.1.0 Reporter: Sean Busbey Priority: Blocker Fix For: 1.1.1 the ref guide doesn't have a listing for Java or Hadoop versions needed / supported for the 1.1 release series. -- This message was sent by Atlassian JIRA (v6.3.4#6332)