[jira] [Commented] (HBASE-13729) Old hbase.regionserver.global.memstore.upperLimit and lowerLimit properties are ignored if present
[ https://issues.apache.org/jira/browse/HBASE-13729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574089#comment-14574089 ] stack commented on HBASE-13729: --- +1 on this patch. Will apply in morning unless objection. Old hbase.regionserver.global.memstore.upperLimit and lowerLimit properties are ignored if present -- Key: HBASE-13729 URL: https://issues.apache.org/jira/browse/HBASE-13729 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 2.0.0, 1.0.1, 1.1.0 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Priority: Critical Attachments: 0001-HBASE-13729-Old-hbase.regionserver.global.memstore.u.patch, HBASE-13729.2.patch, HBASE-13729.3.patch, HBASE-13729.4.patch If hbase.regionserver.global.memstore.upperLimit or lowerLimit is present, we should use it instead of hbase.regionserver.global.memstore.size or hbase.regionserver.global.memstore.size.lower.limit, respectively. The current implementation of HeapMemorySizeUtil.getGlobalMemStorePercent() and getGlobalMemStoreLowerMark() assumes that the old configurations should be used only if the new properties are not defined. However, the new properties are defined in hbase-default.xml, which makes the old configuration names useless. This has a direct impact during a rolling upgrade when hbase-site.xml has not yet been changed to use the new property names exclusively. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
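The precedence requested in HBASE-13729 can be sketched as follows. This is only an illustration, not the attached patch: a plain Map stands in for Hadoop's Configuration, the class and method names are hypothetical, and the 0.4 fallback value is an assumption made for the example. The point is that the old key has to be checked first, because the new key always has a value once hbase-default.xml is loaded.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not the code from the HBASE-13729 patch.
final class MemStoreLimitSketch {
    static final String OLD_KEY = "hbase.regionserver.global.memstore.upperLimit";
    static final String NEW_KEY = "hbase.regionserver.global.memstore.size";
    // Assumed fallback, used only when neither key is set.
    static final float ASSUMED_DEFAULT = 0.4f;

    // Returns the old property when it is present, otherwise the new one.
    static float globalMemStorePercent(Map<String, String> conf) {
        String old = conf.get(OLD_KEY);
        if (old != null) {
            return Float.parseFloat(old);
        }
        String current = conf.get(NEW_KEY);
        return current != null ? Float.parseFloat(current) : ASSUMED_DEFAULT;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(NEW_KEY, "0.4"); // as if loaded from hbase-default.xml
        conf.put(OLD_KEY, "0.3"); // as if left over in an operator's hbase-site.xml
        System.out.println(globalMemStorePercent(conf));
    }
}
```

Checking the new key first and falling back to the old one would never reach the old key in this setup, which is the behavior the issue reports.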
[jira] [Updated] (HBASE-13828) Add group permissions testing coverage to AC.
[ https://issues.apache.org/jira/browse/HBASE-13828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish Singhi updated HBASE-13828: -- Attachment: HBASE-13828-v1.patch Add group permissions testing coverage to AC. - Key: HBASE-13828 URL: https://issues.apache.org/jira/browse/HBASE-13828 Project: HBase Issue Type: Improvement Reporter: Srikanth Srungarapu Assignee: Ashish Singhi Attachments: HBASE-13828-v1.patch, HBASE-13828.patch We suffered a regression HBASE-13826 recently due to lack of testing coverage for group permissions for AC. With the recent perf boost provided by HBASE-13658, it wouldn't be a bad idea to add checks for group level users to applicable unit tests in TestAccessController. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13451) Make the HFileBlockIndex blockKeys to Cells so that it could be easy to use in the CellComparators
[ https://issues.apache.org/jira/browse/HBASE-13451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574095#comment-14574095 ] stack commented on HBASE-13451: --- Carrying over +1 from RB though before commit, please review we have test coverage for the methods in new index readers... would be good to get bugs out in unit tests rather than out in production context. If sufficient coverage, +1 on commit. Make the HFileBlockIndex blockKeys to Cells so that it could be easy to use in the CellComparators -- Key: HBASE-13451 URL: https://issues.apache.org/jira/browse/HBASE-13451 Project: HBase Issue Type: Sub-task Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 2.0.0 Attachments: HBASE-13451.patch, HBASE-13451_1.patch, HBASE-13451_2.patch After HBASE-10800 we could ensure that all the blockKeys in the BlockReader are converted to Cells (KeyOnlyKeyValue) so that we could use CellComparators. Note that this can be done only for the keys that are written using CellComparators and not for the ones using RawBytesComparator. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13811) Splitting WALs, we are filtering out too many edits - DATALOSS
[ https://issues.apache.org/jira/browse/HBASE-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574087#comment-14574087 ] stack commented on HBASE-13811: --- [~Apache9] JIRA was down so it took a while to respond bq. Fine, I think it will work. But I still feel a little nervous to have two methods which have same name but different behaviors... Makes sense. In this v7 patch, I made the two overloaded methods work the same and changed what happens in HRegion when we prepare to flush. bq. And I remember that, when implementing HBASE-10201 and HBASE-12405, actually I wanted to return the flushedSeqId when calling startCacheFlush first. But there are two problems. First is getNextSequenceId method is in HRegion, not in FSHLog, so a simple solution is return NO_SEQ_NUM when flushing all stores and let HRegion call getNextSequenceId. Yes. That is how it 'works' in patch v6 but it is hard to read. We can actually tell when we are flushing if we should do all of the region, right? If the passed in families are null or equal in number to region stores, we are doing a full region flush so we should use the flush sequence id, the result of the getNextSequenceId call. Otherwise, we want the getEarliest for the region because we are doing a column family only flush... bq. But here comes the second problem, startCacheFlush may fail which means we can not start a flush, so there are three types of return values, 'sequenceId', 'choose a sequenceId by yourself', 'give up flushing!'. I think it is ugly to have a '-2' or a null java.lang.Long to indicate a 'give up flushing' at that time so I gave up... Pardon me, I don't see the problem here? Your nice TestSplitWalDataLoss test was failing for me earlier because I was not doing the abort accounting properly; the 'restore' of old sequenceids. Abort of the flush will 'restore' the old sequenceids. The region flush id won't be updated. This is ok? bq. Maybe we could consider this solution again? 
getEarliestMemstoreSeqNum can be used everywhere but startCacheFlush is restricted in the flushing scope I think. I'd like to purge getEarliestMemstoreSeqNum or narrow its usage if possible. What do you mean by 'startCacheFlush is restricted'. Thanks Duo Splitting WALs, we are filtering out too many edits - DATALOSS --- Key: HBASE-13811 URL: https://issues.apache.org/jira/browse/HBASE-13811 Project: HBase Issue Type: Bug Components: wal Affects Versions: 2.0.0, 1.2.0 Reporter: stack Assignee: stack Priority: Critical Fix For: 2.0.0, 1.2.0 Attachments: 13811.branch-1.txt, 13811.branch-1.txt, 13811.txt, 13811.v2.branch-1.txt, 13811.v3.branch-1.txt, 13811.v3.branch-1.txt, 13811.v4.branch-1.txt, 13811.v5.branch-1.txt, 13811.v6.branch-1.txt, 13811.v6.branch-1.txt, 13811.v7.branch-1.txt, HBASE-13811-v1.testcase.patch, HBASE-13811.testcase.patch I've been running ITBLLs against branch-1 around HBASE-13616 (move of ServerShutdownHandler to pv2). I have come across an instance of dataloss. My patch for HBASE-13616 was in place so can only think it the cause (but cannot see how). When we split the logs, we are skipping legit edits. Digging. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13811) Splitting WALs, we are filtering out too many edits - DATALOSS
[ https://issues.apache.org/jira/browse/HBASE-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-13811: -- Attachment: 13811.v7.branch-1.txt Fix the failing unit tests. Splitting WALs, we are filtering out too many edits - DATALOSS --- Key: HBASE-13811 URL: https://issues.apache.org/jira/browse/HBASE-13811 Project: HBase Issue Type: Bug Components: wal Affects Versions: 2.0.0, 1.2.0 Reporter: stack Assignee: stack Priority: Critical Fix For: 2.0.0, 1.2.0 Attachments: 13811.branch-1.txt, 13811.branch-1.txt, 13811.txt, 13811.v2.branch-1.txt, 13811.v3.branch-1.txt, 13811.v3.branch-1.txt, 13811.v4.branch-1.txt, 13811.v5.branch-1.txt, 13811.v6.branch-1.txt, 13811.v6.branch-1.txt, 13811.v7.branch-1.txt, HBASE-13811-v1.testcase.patch, HBASE-13811.testcase.patch I've been running ITBLLs against branch-1 around HBASE-13616 (move of ServerShutdownHandler to pv2). I have come across an instance of dataloss. My patch for HBASE-13616 was in place so can only think it the cause (but cannot see how). When we split the logs, we are skipping legit edits. Digging. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13337) Table regions are not assigning back, after restarting all regionservers at once.
[ https://issues.apache.org/jira/browse/HBASE-13337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574205#comment-14574205 ] Samir Ahmic commented on HBASE-13337: - Thanks for the review [~stack]. As far as I can see, we have two options for fixing this issue and handling the connection correctly: 1. Make the default connection non-final and, instead of creating a new connection object, recreate the existing connection in ServerManager#getRsAdmin(): {code} 156- private final ClusterConnection connection; 156+ private ClusterConnection connection; {code} {code} + Configuration conf = master.getConfiguration(); + this.connection = (ClusterConnection) ConnectionFactory.createConnection(conf); {code} This connection will be closed when the master is shut down. 2. Implement additional logic in ServerManager that takes care of creating a new connection when an rs is restarted and closing/removing it when it becomes stale. I have tested the first option and the issue is fixed. Which method do we prefer? Table regions are not assigning back, after restarting all regionservers at once. - Key: HBASE-13337 URL: https://issues.apache.org/jira/browse/HBASE-13337 Project: HBase Issue Type: Bug Components: Region Assignment Affects Versions: 2.0.0 Reporter: Y. SREENIVASULU REDDY Priority: Blocker Fix For: 2.0.0 Attachments: HBASE-13337-v2.patch, HBASE-13337.patch Regions of the table are continuously in state=FAILED_CLOSE. {noformat} RegionState RIT time (ms) 8f62e819b356736053e06240f7f7c6fd t1,,1427362431330.8f62e819b356736053e06240f7f7c6fd. state=FAILED_CLOSE, ts=Thu Mar 26 15:05:36 IST 2015 (113s ago), server=VM1,16040,1427362531818 113929 caf59209ae65ea80fca6bdc6996a7d68 t1,,1427362431330.caf59209ae65ea80fca6bdc6996a7d68. state=FAILED_CLOSE, ts=Thu Mar 26 15:05:36 IST 2015 (113s ago), server=VM2,16040,1427362533691 113929 db52a74988f71e5cf257bbabf31f26f3 t1,,1427362431330.db52a74988f71e5cf257bbabf31f26f3. 
state=FAILED_CLOSE, ts=Thu Mar 26 15:05:36 IST 2015 (113s ago), server=VM3,16040,1427362533691 113920 43f3a65b9f9ff283f598c5450feab1f8 t1,,1427362431330.43f3a65b9f9ff283f598c5450feab1f8. state=FAILED_CLOSE, ts=Thu Mar 26 15:05:36 IST 2015 (113s ago), server=VM1,16040,1427362531818 113920 {noformat} *Steps to reproduce:* 1. Start an HBase cluster with more than one regionserver. 2. Create a table with pre-created regions (let's say 15 regions). 3. Make sure the regions are well balanced. 4. Restart all the Regionserver processes at once across the cluster, except the HMaster process. 5. After the restart, the Regionservers successfully connect to the HMaster. *Bug:* But no regions are assigned back to the Regionservers. *Master log shows as follows:* {noformat} 2015-03-26 15:05:36,201 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.RegionStates: Transition {8f62e819b356736053e06240f7f7c6fd state=OFFLINE, ts=1427362536106, server=VM2,16040,1427362242602} to {8f62e819b356736053e06240f7f7c6fd state=PENDING_OPEN, ts=1427362536201, server=VM1,16040,1427362531818} 2015-03-26 15:05:36,202 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.RegionStateStore: Updating row t1,,1427362431330.8f62e819b356736053e06240f7f7c6fd. 
with state=PENDING_OPENsn=VM1,16040,1427362531818 2015-03-26 15:05:36,244 DEBUG [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.AssignmentManager: Force region state offline {8f62e819b356736053e06240f7f7c6fd state=PENDING_OPEN, ts=1427362536201, server=VM1,16040,1427362531818} 2015-03-26 15:05:36,244 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.RegionStates: Transition {8f62e819b356736053e06240f7f7c6fd state=PENDING_OPEN, ts=1427362536201, server=VM1,16040,1427362531818} to {8f62e819b356736053e06240f7f7c6fd state=PENDING_CLOSE, ts=1427362536244, server=VM1,16040,1427362531818} 2015-03-26 15:05:36,244 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.RegionStateStore: Updating row t1,,1427362431330.8f62e819b356736053e06240f7f7c6fd. with state=PENDING_CLOSE 2015-03-26 15:05:36,248 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0] master.AssignmentManager: Server VM1,16040,1427362531818 returned java.nio.channels.ClosedChannelException for t1,,1427362431330.8f62e819b356736053e06240f7f7c6fd., try=1 of 10 2015-03-26 15:05:36,248 INFO [VM2,16020,1427362216887-GeneralBulkAssigner-0]
[jira] [Commented] (HBASE-13828) Add group permissions testing coverage to AC.
[ https://issues.apache.org/jira/browse/HBASE-13828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574261#comment-14574261 ] Ashish Singhi commented on HBASE-13828: --- Attached v1 patch for master branch to fix long lines. Attached all other version patches also. Add group permissions testing coverage to AC. - Key: HBASE-13828 URL: https://issues.apache.org/jira/browse/HBASE-13828 Project: HBase Issue Type: Improvement Reporter: Srikanth Srungarapu Assignee: Ashish Singhi Attachments: HBASE-13828-0.98.patch, HBASE-13828-branch-1.0.patch, HBASE-13828-branch-1.1.patch, HBASE-13828-branch-1.patch, HBASE-13828-v1.patch, HBASE-13828.patch We suffered a regression HBASE-13826 recently due to lack of testing coverage for group permissions for AC. With the recent perf boost provided by HBASE-13658, it wouldn't be a bad idea to add checks for group level users to applicable unit tests in TestAccessController. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13811) Splitting WALs, we are filtering out too many edits - DATALOSS
[ https://issues.apache.org/jira/browse/HBASE-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574274#comment-14574274 ] Hadoop QA commented on HBASE-13811: --- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12737860/13811.v7.branch-1.txt against branch-1 branch at commit 67c463f63e3c48efd9e1281166b707a3a806a2ad. ATTACHMENT ID: 12737860 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 13 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14299//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14299//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14299//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14299//console This message is automatically generated. Splitting WALs, we are filtering out too many edits - DATALOSS --- Key: HBASE-13811 URL: https://issues.apache.org/jira/browse/HBASE-13811 Project: HBase Issue Type: Bug Components: wal Affects Versions: 2.0.0, 1.2.0 Reporter: stack Assignee: stack Priority: Critical Fix For: 2.0.0, 1.2.0 Attachments: 13811.branch-1.txt, 13811.branch-1.txt, 13811.txt, 13811.v2.branch-1.txt, 13811.v3.branch-1.txt, 13811.v3.branch-1.txt, 13811.v4.branch-1.txt, 13811.v5.branch-1.txt, 13811.v6.branch-1.txt, 13811.v6.branch-1.txt, 13811.v7.branch-1.txt, HBASE-13811-v1.testcase.patch, HBASE-13811.testcase.patch I've been running ITBLLs against branch-1 around HBASE-13616 (move of ServerShutdownHandler to pv2). I have come across an instance of dataloss. My patch for HBASE-13616 was in place so can only think it the cause (but cannot see how). When we split the logs, we are skipping legit edits. Digging. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13811) Splitting WALs, we are filtering out too many edits - DATALOSS
[ https://issues.apache.org/jira/browse/HBASE-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574311#comment-14574311 ] Duo Zhang commented on HBASE-13811: --- {quote} We can actually tell when we are flushing if we should do all of the region, right? If the passed in families are null or equal in number to region stores, we are doing a full region flush so we should use the flush sequence id, the result of the getNextSequenceId call. Otherwise, we want the getEarliest for the region because we are doing a column family only flush... {quote} If the stores which we do not flush are all empty, then we should still use the flush sequence id returned by getNextSequenceId (getEarliest should return NO_SEQ_NUM in this case). Splitting WALs, we are filtering out too many edits - DATALOSS --- Key: HBASE-13811 URL: https://issues.apache.org/jira/browse/HBASE-13811 Project: HBase Issue Type: Bug Components: wal Affects Versions: 2.0.0, 1.2.0 Reporter: stack Assignee: stack Priority: Critical Fix For: 2.0.0, 1.2.0 Attachments: 13811.branch-1.txt, 13811.branch-1.txt, 13811.txt, 13811.v2.branch-1.txt, 13811.v3.branch-1.txt, 13811.v3.branch-1.txt, 13811.v4.branch-1.txt, 13811.v5.branch-1.txt, 13811.v6.branch-1.txt, 13811.v6.branch-1.txt, 13811.v7.branch-1.txt, HBASE-13811-v1.testcase.patch, HBASE-13811.testcase.patch, startCacheFlush.diff I've been running ITBLLs against branch-1 around HBASE-13616 (move of ServerShutdownHandler to pv2). I have come across an instance of dataloss. My patch for HBASE-13616 was in place so can only think it the cause (but cannot see how). When we split the logs, we are skipping legit edits. Digging. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
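The selection rule quoted in this comment, together with the empty-stores refinement, can be sketched as below. This is a hedged illustration only and is not taken from any of the attached patches: the class and method names are hypothetical, sequence ids are passed in as plain longs rather than read from HRegion or the WAL, and the value -1 for NO_SEQ_NUM is an assumption.

```java
import java.util.Collection;

// Hypothetical sketch of the rule discussed in the thread; not HBase code.
final class FlushSeqIdSketch {
    // Placeholder for "no sequence id"; the concrete value is assumed.
    static final long NO_SEQ_NUM = -1L;

    // A flush covers the whole region when no families are named
    // or when every store of the region is named.
    static boolean isFullRegionFlush(Collection<String> families, int storeCount) {
        return families == null || families.size() == storeCount;
    }

    static long chooseFlushedSeqId(Collection<String> families, int storeCount,
                                   long nextSequenceId, long earliestUnflushedSeqId) {
        if (isFullRegionFlush(families, storeCount)) {
            return nextSequenceId;
        }
        // Refinement from the comment above: if the stores left unflushed
        // hold no edits, the flush still covers everything in the region.
        if (earliestUnflushedSeqId == NO_SEQ_NUM) {
            return nextSequenceId;
        }
        return earliestUnflushedSeqId;
    }
}
```

Passing the two candidate ids in as arguments keeps the sketch independent of where getNextSequenceId and the earliest-sequence lookup actually live, which is one of the points under discussion.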
[jira] [Commented] (HBASE-13828) Add group permissions testing coverage to AC.
[ https://issues.apache.org/jira/browse/HBASE-13828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574336#comment-14574336 ] Hadoop QA commented on HBASE-13828: --- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12737876/HBASE-13828-v1.patch against master branch at commit 67c463f63e3c48efd9e1281166b707a3a806a2ad. ATTACHMENT ID: 12737876 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 12 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14300//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14300//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14300//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14300//console This message is automatically generated. Add group permissions testing coverage to AC. - Key: HBASE-13828 URL: https://issues.apache.org/jira/browse/HBASE-13828 Project: HBase Issue Type: Improvement Reporter: Srikanth Srungarapu Assignee: Ashish Singhi Attachments: HBASE-13828-0.98.patch, HBASE-13828-branch-1.0.patch, HBASE-13828-branch-1.1.patch, HBASE-13828-branch-1.patch, HBASE-13828-v1.patch, HBASE-13828.patch We suffered a regression HBASE-13826 recently due to lack of testing coverage for group permissions for AC. With the recent perf boost provided by HBASE-13658, it wouldn't be a bad idea to add checks for group level users to applicable unit tests in TestAccessController. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13811) Splitting WALs, we are filtering out too many edits - DATALOSS
[ https://issues.apache.org/jira/browse/HBASE-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574227#comment-14574227 ] Duo Zhang commented on HBASE-13811: --- {quote} Pardon me, I don't see the problem here? {quote} Not a big problem... The only thing is I do not like java.lang.Long... I can prepare a patch to explain how to implement it. {quote} What do you mean by 'startCacheFlush is restricted'. {quote} 'startCacheFlush' is an exact name that tells us what it will do, and I think people will only call it when they want to flush a region, so changing its behavior does less harm. 'getEarliestMemstoreSeqNum' is a more general name; people can use it anywhere they want to get the value. Splitting WALs, we are filtering out too many edits - DATALOSS --- Key: HBASE-13811 URL: https://issues.apache.org/jira/browse/HBASE-13811 Project: HBase Issue Type: Bug Components: wal Affects Versions: 2.0.0, 1.2.0 Reporter: stack Assignee: stack Priority: Critical Fix For: 2.0.0, 1.2.0 Attachments: 13811.branch-1.txt, 13811.branch-1.txt, 13811.txt, 13811.v2.branch-1.txt, 13811.v3.branch-1.txt, 13811.v3.branch-1.txt, 13811.v4.branch-1.txt, 13811.v5.branch-1.txt, 13811.v6.branch-1.txt, 13811.v6.branch-1.txt, 13811.v7.branch-1.txt, HBASE-13811-v1.testcase.patch, HBASE-13811.testcase.patch I've been running ITBLLs against branch-1 around HBASE-13616 (move of ServerShutdownHandler to pv2). I have come across an instance of dataloss. My patch for HBASE-13616 was in place so can only think it the cause (but cannot see how). When we split the logs, we are skipping legit edits. Digging. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
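One way to express the three outcomes mentioned in the thread ('sequenceId', 'choose a sequenceId by yourself', 'give up flushing') without a magic '-2' or a null java.lang.Long is a small result type. The sketch below is only an illustration of that idea; the type and its names are hypothetical and are not claimed to match startCacheFlush.diff or any HBase API.

```java
// Hypothetical result type; not part of HBase.
final class StartFlushResult {
    enum Kind { SEQUENCE_ID, CALLER_CHOOSES, GIVE_UP }

    private final Kind kind;
    private final long sequenceId;

    private StartFlushResult(Kind kind, long sequenceId) {
        this.kind = kind;
        this.sequenceId = sequenceId;
    }

    // The WAL picked a flushed sequence id.
    static StartFlushResult of(long sequenceId) {
        return new StartFlushResult(Kind.SEQUENCE_ID, sequenceId);
    }

    // The caller should obtain a sequence id itself.
    static StartFlushResult callerChooses() {
        return new StartFlushResult(Kind.CALLER_CHOOSES, 0L);
    }

    // The flush cannot be started.
    static StartFlushResult giveUp() {
        return new StartFlushResult(Kind.GIVE_UP, 0L);
    }

    Kind kind() {
        return kind;
    }

    // Only meaningful for SEQUENCE_ID; fails loudly otherwise.
    long sequenceId() {
        if (kind != Kind.SEQUENCE_ID) {
            throw new IllegalStateException("no sequence id for " + kind);
        }
        return sequenceId;
    }
}
```

A caller would switch on kind() and only read sequenceId() in the SEQUENCE_ID branch, so a forgotten check surfaces as an exception instead of a silently wrong id.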
[jira] [Updated] (HBASE-13828) Add group permissions testing coverage to AC.
[ https://issues.apache.org/jira/browse/HBASE-13828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish Singhi updated HBASE-13828: -- Attachment: HBASE-13828-branch-1.patch Add group permissions testing coverage to AC. - Key: HBASE-13828 URL: https://issues.apache.org/jira/browse/HBASE-13828 Project: HBase Issue Type: Improvement Reporter: Srikanth Srungarapu Assignee: Ashish Singhi Attachments: HBASE-13828-0.98.patch, HBASE-13828-branch-1.0.patch, HBASE-13828-branch-1.1.patch, HBASE-13828-branch-1.patch, HBASE-13828-v1.patch, HBASE-13828.patch We suffered a regression HBASE-13826 recently due to lack of testing coverage for group permissions for AC. With the recent perf boost provided by HBASE-13658, it wouldn't be a bad idea to add checks for group level users to applicable unit tests in TestAccessController. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13828) Add group permissions testing coverage to AC.
[ https://issues.apache.org/jira/browse/HBASE-13828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish Singhi updated HBASE-13828: -- Attachment: HBASE-13828-0.98.patch Add group permissions testing coverage to AC. - Key: HBASE-13828 URL: https://issues.apache.org/jira/browse/HBASE-13828 Project: HBase Issue Type: Improvement Reporter: Srikanth Srungarapu Assignee: Ashish Singhi Attachments: HBASE-13828-0.98.patch, HBASE-13828-branch-1.0.patch, HBASE-13828-branch-1.1.patch, HBASE-13828-v1.patch, HBASE-13828.patch We suffered a regression HBASE-13826 recently due to lack of testing coverage for group permissions for AC. With the recent perf boost provided by HBASE-13658, it wouldn't be a bad idea to add checks for group level users to applicable unit tests in TestAccessController. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13828) Add group permissions testing coverage to AC.
[ https://issues.apache.org/jira/browse/HBASE-13828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish Singhi updated HBASE-13828: -- Attachment: HBASE-13828-branch-1.1.patch HBASE-13828-branch-1.0.patch Add group permissions testing coverage to AC. - Key: HBASE-13828 URL: https://issues.apache.org/jira/browse/HBASE-13828 Project: HBase Issue Type: Improvement Reporter: Srikanth Srungarapu Assignee: Ashish Singhi Attachments: HBASE-13828-0.98.patch, HBASE-13828-branch-1.0.patch, HBASE-13828-branch-1.1.patch, HBASE-13828-v1.patch, HBASE-13828.patch We suffered a regression HBASE-13826 recently due to lack of testing coverage for group permissions for AC. With the recent perf boost provided by HBASE-13658, it wouldn't be a bad idea to add checks for group level users to applicable unit tests in TestAccessController. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13811) Splitting WALs, we are filtering out too many edits - DATALOSS
[ https://issues.apache.org/jira/browse/HBASE-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-13811: -- Attachment: startCacheFlush.diff Implement flushedSeqId calculation inside startCacheFlush. Splitting WALs, we are filtering out too many edits - DATALOSS --- Key: HBASE-13811 URL: https://issues.apache.org/jira/browse/HBASE-13811 Project: HBase Issue Type: Bug Components: wal Affects Versions: 2.0.0, 1.2.0 Reporter: stack Assignee: stack Priority: Critical Fix For: 2.0.0, 1.2.0 Attachments: 13811.branch-1.txt, 13811.branch-1.txt, 13811.txt, 13811.v2.branch-1.txt, 13811.v3.branch-1.txt, 13811.v3.branch-1.txt, 13811.v4.branch-1.txt, 13811.v5.branch-1.txt, 13811.v6.branch-1.txt, 13811.v6.branch-1.txt, 13811.v7.branch-1.txt, HBASE-13811-v1.testcase.patch, HBASE-13811.testcase.patch, startCacheFlush.diff I've been running ITBLLs against branch-1 around HBASE-13616 (move of ServerShutdownHandler to pv2). I have come across an instance of dataloss. My patch for HBASE-13616 was in place so can only think it the cause (but cannot see how). When we split the logs, we are skipping legit edits. Digging. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13835) KeyValueHeap.current might be in heap when exception happens in pollRealKV
[ https://issues.apache.org/jira/browse/HBASE-13835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574380#comment-14574380 ] zhouyingchao commented on HBASE-13835: -- Thanks for the comment. Will try to add a unit test some time next week. KeyValueHeap.current might be in heap when exception happens in pollRealKV -- Key: HBASE-13835 URL: https://issues.apache.org/jira/browse/HBASE-13835 Project: HBase Issue Type: Bug Components: Scanners Reporter: zhouyingchao Assignee: zhouyingchao Attachments: HBASE-13835-001.patch In a 0.94 HBase cluster, we found an NPE with the following stack: {code} Exception in thread regionserver21600.leaseChecker java.lang.NullPointerException at org.apache.hadoop.hbase.KeyValue$KVComparator.compare(KeyValue.java:1530) at org.apache.hadoop.hbase.regionserver.KeyValueHeap$KVScannerComparator.compare(KeyValueHeap.java:225) at org.apache.hadoop.hbase.regionserver.KeyValueHeap$KVScannerComparator.compare(KeyValueHeap.java:201) at org.apache.hadoop.hbase.regionserver.KeyValueHeap$KVScannerComparator.compare(KeyValueHeap.java:191) at java.util.PriorityQueue.siftDownUsingComparator(PriorityQueue.java:641) at java.util.PriorityQueue.siftDown(PriorityQueue.java:612) at java.util.PriorityQueue.poll(PriorityQueue.java:523) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.close(KeyValueHeap.java:241) at org.apache.hadoop.hbase.regionserver.StoreScanner.close(StoreScanner.java:355) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.close(KeyValueHeap.java:237) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.close(HRegion.java:4302) at org.apache.hadoop.hbase.regionserver.HRegionServer$ScannerListener.leaseExpired(HRegionServer.java:3033) at org.apache.hadoop.hbase.regionserver.Leases.run(Leases.java:119) at java.lang.Thread.run(Thread.java:662) {code} Before this NPE, an exception was thrown in pollRealKV, which we think is the root cause of the NPE. 
{code} ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: java.io.IOException: Could not reseek StoreFileScanner[HFileScanner for reader reader= at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:180) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.enforceSeek(StoreFileScanner.java:371) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.pollRealKV(KeyValueHeap.java:366) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:116) at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:455) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:154) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:4124) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:4196) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:4067) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:4057) at org.apache.hadoop.hbase.regionserver.HRegionServer.internalNext(HRegionServer.java:2898) at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2833) at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2815) at sun.reflect.GeneratedMethodAccessor38.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.ipc.SecureRpcEngine$Server.call(SecureRpcEngine.java:337) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1583) {code} Simply put, if an exception is thrown in pollRealKV(), KeyValueHeap.current might still be in the heap. Later on, when KeyValueHeap.close() is called, current is closed first. 
However, since it might still be in the heap, it would either be closed again or have its peek() (which returns null after it is closed) called by the heap's poll(). Neither case is expected. Although we caught this on 0.94, judging from the code it is still present in trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
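The failure mode described in this report can be reproduced in miniature with java.util.PriorityQueue. The sketch below is an illustration under stated assumptions, not KeyValueHeap code and not the attached patch: FakeScanner, naiveClose and safeClose are hypothetical names, and removing current from the heap before closing it is just one possible defensive approach.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Hypothetical miniature of the scenario; not HBase code.
final class HeapCloseSketch {
    static final class FakeScanner {
        private String next;
        int closeCount;

        FakeScanner(String first) { next = first; }

        // Returns null once the scanner has been closed.
        String peek() { return next; }

        void close() { closeCount++; next = null; }
    }

    // Orders scanners by their next key, like a scanner comparator would.
    static final Comparator<FakeScanner> BY_PEEK =
        (a, b) -> a.peek().compareTo(b.peek());

    static PriorityQueue<FakeScanner> heapOf(FakeScanner... scanners) {
        PriorityQueue<FakeScanner> heap = new PriorityQueue<>(BY_PEEK);
        for (FakeScanner s : scanners) {
            heap.add(s);
        }
        return heap;
    }

    // Closes current first and then drains the heap. If current is still
    // in the heap, poll() may compare against its null peek().
    static void naiveClose(FakeScanner current, PriorityQueue<FakeScanner> heap) {
        if (current != null) {
            current.close();
        }
        FakeScanner s;
        while ((s = heap.poll()) != null) {
            s.close();
        }
    }

    // Takes current out of the heap while its key is still readable,
    // so it is closed exactly once and never compared after closing.
    static void safeClose(FakeScanner current, PriorityQueue<FakeScanner> heap) {
        if (current != null) {
            heap.remove(current); // no-op when current is not in the heap
            current.close();
        }
        FakeScanner s;
        while ((s = heap.poll()) != null) {
            s.close();
        }
    }
}
```

With three scanners keyed "a", "b", "c" and current pointing at the "b" scanner that is also in the heap, naiveClose fails inside poll() with a NullPointerException from the comparator, which mirrors the siftDownUsingComparator frames in the stack trace above. If current happens to be the head of the heap instead, no exception is thrown but the scanner is closed twice, which is the other case the report mentions.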
[jira] [Updated] (HBASE-13451) Make the HFileBlockIndex blockKeys to Cells so that it could be easy to use in the CellComparators
[ https://issues.apache.org/jira/browse/HBASE-13451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-13451: --- Status: Patch Available (was: Open) Make the HFileBlockIndex blockKeys to Cells so that it could be easy to use in the CellComparators -- Key: HBASE-13451 URL: https://issues.apache.org/jira/browse/HBASE-13451 Project: HBase Issue Type: Sub-task Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 2.0.0 Attachments: HBASE-13451.patch, HBASE-13451_1.patch, HBASE-13451_2.patch, HBASE-13451_3.patch After HBASE-10800 we could ensure that all the blockKeys in the BlockReader are converted to Cells (KeyOnlyKeyValue) so that we could use CellComparators. Note that this can be done only for the keys that are written using CellComparators and not for the ones using RawBytesComparator. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13451) Make the HFileBlockIndex blockKeys to Cells so that it could be easy to use in the CellComparators
[ https://issues.apache.org/jira/browse/HBASE-13451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-13451: --- Status: Open (was: Patch Available) Make the HFileBlockIndex blockKeys to Cells so that it could be easy to use in the CellComparators -- Key: HBASE-13451 URL: https://issues.apache.org/jira/browse/HBASE-13451 Project: HBase Issue Type: Sub-task Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 2.0.0 Attachments: HBASE-13451.patch, HBASE-13451_1.patch, HBASE-13451_2.patch, HBASE-13451_3.patch After HBASE-10800 we could ensure that all the blockKeys in the BlockReader are converted to Cells (KeyOnlyKeyValue) so that we could use CellComparators. Note that this can be done only for the keys that are written using CellComparators and not for the ones using RawBytesComparator. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13828) Add group permissions testing coverage to AC.
[ https://issues.apache.org/jira/browse/HBASE-13828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574447#comment-14574447 ] Hadoop QA commented on HBASE-13828: --- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12737901/HBASE-13828-branch-1.patch against branch-1 branch at commit 67c463f63e3c48efd9e1281166b707a3a806a2ad. ATTACHMENT ID: 12737901 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 12 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14301//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14301//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14301//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14301//console This message is automatically generated. Add group permissions testing coverage to AC. - Key: HBASE-13828 URL: https://issues.apache.org/jira/browse/HBASE-13828 Project: HBase Issue Type: Improvement Reporter: Srikanth Srungarapu Assignee: Ashish Singhi Attachments: HBASE-13828-0.98.patch, HBASE-13828-branch-1.0.patch, HBASE-13828-branch-1.1.patch, HBASE-13828-branch-1.patch, HBASE-13828-v1.patch, HBASE-13828.patch We suffered a regression HBASE-13826 recently due to lack of testing coverage for group permissions for AC. With the recent perf boost provided by HBASE-13658, it wouldn't be a bad idea to add checks for group level users to applicable unit tests in TestAccessController. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13811) Splitting WALs, we are filtering out too many edits - DATALOSS
[ https://issues.apache.org/jira/browse/HBASE-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574483#comment-14574483 ] Hadoop QA commented on HBASE-13811: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12737911/startCacheFlush.diff against master branch at commit 67c463f63e3c48efd9e1281166b707a3a806a2ad. ATTACHMENT ID: 12737911 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:red}-1 checkstyle{color}. The applied patch generated 1921 checkstyle errors (more than the master's current 1920 errors). {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14302//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14302//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14302//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14302//console This message is automatically generated. Splitting WALs, we are filtering out too many edits - DATALOSS --- Key: HBASE-13811 URL: https://issues.apache.org/jira/browse/HBASE-13811 Project: HBase Issue Type: Bug Components: wal Affects Versions: 2.0.0, 1.2.0 Reporter: stack Assignee: stack Priority: Critical Fix For: 2.0.0, 1.2.0 Attachments: 13811.branch-1.txt, 13811.branch-1.txt, 13811.txt, 13811.v2.branch-1.txt, 13811.v3.branch-1.txt, 13811.v3.branch-1.txt, 13811.v4.branch-1.txt, 13811.v5.branch-1.txt, 13811.v6.branch-1.txt, 13811.v6.branch-1.txt, 13811.v7.branch-1.txt, HBASE-13811-v1.testcase.patch, HBASE-13811.testcase.patch, startCacheFlush.diff I've been running ITBLLs against branch-1 around HBASE-13616 (move of ServerShutdownHandler to pv2). I have come across an instance of dataloss. My patch for HBASE-13616 was in place so can only think it the cause (but cannot see how). When we split the logs, we are skipping legit edits. Digging. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13451) Make the HFileBlockIndex blockKeys to Cells so that it could be easy to use in the CellComparators
[ https://issues.apache.org/jira/browse/HBASE-13451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-13451: --- Attachment: HBASE-13451_3.patch Addresses review comments from Stack. Minor changes from the previous patches. [~saint@gmail.com] Thanks for the review. I checked every method in BlockIndexReader and they all have sufficient coverage: TestHFile, TestHFileWriterV2 and V3, TestSeekTo and TestCompoundBloomFilter exercise them. As for changing the scope to package private, I doubt it can be done, because BlockIndexReader is used by CompoundBloomFilter in the util package. Make the HFileBlockIndex blockKeys to Cells so that it could be easy to use in the CellComparators -- Key: HBASE-13451 URL: https://issues.apache.org/jira/browse/HBASE-13451 Project: HBase Issue Type: Sub-task Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 2.0.0 Attachments: HBASE-13451.patch, HBASE-13451_1.patch, HBASE-13451_2.patch, HBASE-13451_3.patch After HBASE-10800 we could ensure that all the blockKeys in the BlockReader are converted to Cells (KeyOnlyKeyValue) so that we could use CellComparators. Note that this can be done only for the keys that are written using CellComparators and not for the ones using RawBytesComparator. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12295) Prevent block eviction under us if reads are in progress from the BBs
[ https://issues.apache.org/jira/browse/HBASE-12295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574445#comment-14574445 ] ramkrishna.s.vasudevan commented on HBASE-12295: Ping for reviews! Prevent block eviction under us if reads are in progress from the BBs - Key: HBASE-12295 URL: https://issues.apache.org/jira/browse/HBASE-12295 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 2.0.0 Attachments: HBASE-12295.pdf, HBASE-12295_1.patch, HBASE-12295_1.pdf, HBASE-12295_2.patch, HBASE-12295_trunk.patch While we try to serve the reads from the BBs directly from the block cache, we need to ensure that the blocks do not get evicted under us while reading. This JIRA is to discuss and implement a strategy for that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13755) Provide single super user check implementation
[ https://issues.apache.org/jira/browse/HBASE-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Antonov updated HBASE-13755: Status: Open (was: Patch Available) Provide single super user check implementation -- Key: HBASE-13755 URL: https://issues.apache.org/jira/browse/HBASE-13755 Project: HBase Issue Type: Improvement Reporter: Anoop Sam John Assignee: Mikhail Antonov Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13755-v1.patch, HBASE-13755-v2.patch Followup for HBASE-13375. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13755) Provide single super user check implementation
[ https://issues.apache.org/jira/browse/HBASE-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Antonov updated HBASE-13755: Attachment: HBASE-13755-v2.patch here's v2 with notes addressed, how's it? Provide single super user check implementation -- Key: HBASE-13755 URL: https://issues.apache.org/jira/browse/HBASE-13755 Project: HBase Issue Type: Improvement Reporter: Anoop Sam John Assignee: Mikhail Antonov Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13755-v1.patch, HBASE-13755-v2.patch Followup for HBASE-13375. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13755) Provide single super user check implementation
[ https://issues.apache.org/jira/browse/HBASE-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Antonov updated HBASE-13755: Status: Patch Available (was: Open) Provide single super user check implementation -- Key: HBASE-13755 URL: https://issues.apache.org/jira/browse/HBASE-13755 Project: HBase Issue Type: Improvement Reporter: Anoop Sam John Assignee: Mikhail Antonov Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13755-v1.patch, HBASE-13755-v2.patch Followup for HBASE-13375. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13850) Check for dead server on CallTimeoutException
[ https://issues.apache.org/jira/browse/HBASE-13850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-13850: Attachment: TestGetPerf.java HBASE-13850-v0.patch Check for dead server on CallTimeoutException - Key: HBASE-13850 URL: https://issues.apache.org/jira/browse/HBASE-13850 Project: HBase Issue Type: Improvement Components: Client, MTTR Affects Versions: 2.0.0, 1.2.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Minor Attachments: HBASE-13850-v0.patch, TestGetPerf.java WARN: this may be a misconfiguration, so let me know if there is a timeout param to set.
{noformat}
hbase-site.xml
zookeeper.session.timeout 1
hbase.regionserver.storefile.refresh.period 1
hbase.client.operation.timeout 5000
hbase.client.meta.operation.timeout 5000
hbase.client.scanner.timeout.period 1
hbase.regionserver.lease.period 1
{noformat}
I have a test that does a kill STOP on a RS and tries to query it. From the conf the ZK lease is 10sec, and the master correctly does the reassign after 10sec and meta is updated. The client keeps trying to query the RS for a specific row until it gets a response. The table.get(row) in the loop throws a CallTimeoutException every 5sec (which is the configured setting), but instead of succeeding after 2/3 retries (10sec, where the master reassigns) it keeps retrying for up to 60sec (I don't know what that 60sec is, maybe a conf param that I'm not able to find). One simple fix in the code is to handle the CallTimeoutException in RegionServerCallable and clear the meta cache for the RS that is not responding (but maybe there is already a conf to set to reduce that 60sec period). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13848) Access InfoServer SSL passwords through Credential Provder API
[ https://issues.apache.org/jira/browse/HBASE-13848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574932#comment-14574932 ] Hadoop QA commented on HBASE-13848: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12737967/HBASE-13848.1.patch against master branch at commit 67c463f63e3c48efd9e1281166b707a3a806a2ad. ATTACHMENT ID: 12737967 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.master.TestDistributedLogSplitting Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14305//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14305//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14305//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14305//console This message is automatically generated. Access InfoServer SSL passwords through Credential Provder API -- Key: HBASE-13848 URL: https://issues.apache.org/jira/browse/HBASE-13848 Project: HBase Issue Type: Improvement Components: security Reporter: Sean Busbey Assignee: Sean Busbey Attachments: HBASE-13848.1.patch HBASE-11810 took care of getting our SSL passwords out of the Hadoop Credential Provider API, but we also get several out of clear text configuration for the InfoServer class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13849) Remove restore and clone snapshot from the WebUI
[ https://issues.apache.org/jira/browse/HBASE-13849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574930#comment-14574930 ] Sean Busbey commented on HBASE-13849: - I'm conflicted. This sounds like operational compatibility to me, which would mean no-go for 1.0 and 1.1. But I agree with the security concern, especially since restore is allowed and will be destructive. Remove restore and clone snapshot from the WebUI Key: HBASE-13849 URL: https://issues.apache.org/jira/browse/HBASE-13849 Project: HBase Issue Type: Bug Components: snapshots Affects Versions: 1.0.1, 1.1.0, 0.98.13, 1.2.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Attachments: HBASE-13849-v0.patch Remove the clone and restore snapshot buttons from the WebUI. The first reason is that the operation may take too long to have the user wait on the WebUI. The second reason is that an action from the WebUI does not play well with security, since it is going to be executed by the hbase user. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13849) Remove restore and clone snapshot from the WebUI
[ https://issues.apache.org/jira/browse/HBASE-13849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574915#comment-14574915 ] Sean Busbey commented on HBASE-13849: - what's the target version for this change? Remove restore and clone snapshot from the WebUI Key: HBASE-13849 URL: https://issues.apache.org/jira/browse/HBASE-13849 Project: HBase Issue Type: Bug Components: snapshots Affects Versions: 1.0.1, 1.1.0, 0.98.13, 1.2.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Attachments: HBASE-13849-v0.patch Remove the clone and restore snapshot buttons from the WebUI. The first reason is that the operation may take too long to have the user wait on the WebUI. The second reason is that an action from the WebUI does not play well with security, since it is going to be executed by the hbase user. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13849) Remove restore and clone snapshot from the WebUI
[ https://issues.apache.org/jira/browse/HBASE-13849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574921#comment-14574921 ] Matteo Bertozzi commented on HBASE-13849: - everything Remove restore and clone snapshot from the WebUI Key: HBASE-13849 URL: https://issues.apache.org/jira/browse/HBASE-13849 Project: HBase Issue Type: Bug Components: snapshots Affects Versions: 1.0.1, 1.1.0, 0.98.13, 1.2.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Attachments: HBASE-13849-v0.patch Remove the clone and restore snapshot buttons from the WebUI. The first reason is that the operation may take too long to have the user wait on the WebUI. The second reason is that an action from the WebUI does not play well with security, since it is going to be executed by the hbase user. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13848) Access InfoServer SSL passwords through Credential Provder API
[ https://issues.apache.org/jira/browse/HBASE-13848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574974#comment-14574974 ] Sean Busbey commented on HBASE-13848: - ran TestDistributedLogSplitting locally and it failed once and succeeded twice. I don't think it's related, but checking a bit more. Access InfoServer SSL passwords through Credential Provder API -- Key: HBASE-13848 URL: https://issues.apache.org/jira/browse/HBASE-13848 Project: HBase Issue Type: Improvement Components: security Reporter: Sean Busbey Assignee: Sean Busbey Attachments: HBASE-13848.1.patch HBASE-11810 took care of getting our SSL passwords out of the Hadoop Credential Provider API, but we also get several out of clear text configuration for the InfoServer class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13834) Evict count not properly passed to HeapMemoryTuner.
[ https://issues.apache.org/jira/browse/HBASE-13834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13834: - Attachment: (was: EvictCountBug.patch) Evict count not properly passed to HeapMemoryTuner. --- Key: HBASE-13834 URL: https://issues.apache.org/jira/browse/HBASE-13834 Project: HBase Issue Type: Bug Components: hbase, regionserver Affects Versions: 1.0.0 Reporter: Abhilash Assignee: Abhilash Labels: easyfix Fix For: 2.0.0, 1.0.2, 1.1.1 Attachments: HBASE-13834-v1.patch, HBASE-13834.patch The evict count that is calculated in the tune function of the HeapMemoryManager class and passed to HeapMemoryTuner via TunerContext is miscalculated. It is supposed to be the evict count between two intervals, but it is not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13834) Evict count not properly passed to HeapMemoryTuner.
[ https://issues.apache.org/jira/browse/HBASE-13834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13834: - Attachment: HBASE-13834-v1.patch Evict count not properly passed to HeapMemoryTuner. --- Key: HBASE-13834 URL: https://issues.apache.org/jira/browse/HBASE-13834 Project: HBase Issue Type: Bug Components: hbase, regionserver Affects Versions: 1.0.0 Reporter: Abhilash Assignee: Abhilash Labels: easyfix Fix For: 2.0.0, 1.0.2, 1.1.1 Attachments: HBASE-13834-v1.patch, HBASE-13834.patch The evict count that is calculated in the tune function of the HeapMemoryManager class and passed to HeapMemoryTuner via TunerContext is miscalculated. It is supposed to be the evict count between two intervals, but it is not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
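To illustrate what "evict count between two intervals" means in the HBASE-13834 description, here is a minimal sketch. It assumes the block cache exposes a cumulative eviction counter, which the issue text does not state, and EvictDeltaTracker is a hypothetical helper rather than the actual HeapMemoryManager code.

```java
// Hypothetical helper; not the actual HeapMemoryManager implementation.
// Assumption: the caller passes in a cumulative (ever-growing) eviction total.
class EvictDeltaTracker {
    // Cumulative total seen at the previous tuner run.
    private long lastCumulative = 0L;

    // Returns the number of evictions since the previous call,
    // then remembers the new total for the next interval.
    long evictionsSinceLastTune(long cumulativeEvictCount) {
        long delta = cumulativeEvictCount - lastCumulative;
        lastCumulative = cumulativeEvictCount;
        return delta;
    }
}
```

Passing the cumulative total straight through instead of the delta would make every interval look busier than the last, which is one way a tuner could be misled.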
[jira] [Updated] (HBASE-13847) getWriteRequestCount function in HRegionServer uses int variable to return the count.
[ https://issues.apache.org/jira/browse/HBASE-13847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhilash updated HBASE-13847: - Labels: easyfix (was: ) getWriteRequestCount function in HRegionServer uses int variable to return the count. - Key: HBASE-13847 URL: https://issues.apache.org/jira/browse/HBASE-13847 Project: HBase Issue Type: Bug Components: hbase, regionserver Affects Versions: 1.0.0 Reporter: Abhilash Assignee: Abhilash Labels: easyfix Fix For: 2.0.0, 1.0.1, 1.2.0, 1.1.1 Attachments: HBASE-13847.patch, screenshot-1.png The variable used to return the value of getWriteRequestCount is an int; it must be a long. I think this causes the cluster UI to show a negative Write Request Count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
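The negative count suspected in HBASE-13847 is consistent with how Java narrows a long into an int. The sketch below is a standalone illustration with hypothetical names (WriteCountSketch, sumAsInt, sumAsLong); it assumes the total is built by summing per-region long counters, which may or may not match the actual HRegionServer code.

```java
// Hypothetical illustration; not the HRegionServer source.
class WriteCountSketch {
    // Buggy shape: accumulating long values into an int.
    // The compound assignment silently narrows, keeping only the low 32 bits.
    static int sumAsInt(long[] perRegionCounts) {
        int total = 0;
        for (long c : perRegionCounts) {
            total += c;
        }
        return total;
    }

    // Fixed shape: accumulate and return a long.
    static long sumAsLong(long[] perRegionCounts) {
        long total = 0L;
        for (long c : perRegionCounts) {
            total += c;
        }
        return total;
    }
}
```

For two regions with 2,000,000,000 writes each, sumAsInt returns -294967296 while sumAsLong returns 4000000000.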
[jira] [Created] (HBASE-13850) Check for dead server on CallTimeoutException
Matteo Bertozzi created HBASE-13850: --- Summary: Check for dead server on CallTimeoutException Key: HBASE-13850 URL: https://issues.apache.org/jira/browse/HBASE-13850 Project: HBase Issue Type: Improvement Components: Client, MTTR Affects Versions: 2.0.0, 1.2.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Minor WARN: this may be a misconfiguration, so let me know if there is a timeout param to set.
{noformat}
hbase-site.xml
zookeeper.session.timeout 1
hbase.regionserver.storefile.refresh.period 1
hbase.client.operation.timeout 5000
hbase.client.meta.operation.timeout 5000
hbase.client.scanner.timeout.period 1
hbase.regionserver.lease.period 1
{noformat}
I have a test that does a kill STOP on a RS and tries to query it. From the conf the ZK lease is 10sec, and the master correctly does the reassign after 10sec and meta is updated. The client keeps trying to query the RS for a specific row until it gets a response. The table.get(row) in the loop throws a CallTimeoutException every 5sec (which is the configured setting), but instead of succeeding after 2/3 retries (10sec, where the master reassigns) it keeps retrying for up to 60sec (I don't know what that 60sec is, maybe a conf param that I'm not able to find). One simple fix in the code is to handle the CallTimeoutException in RegionServerCallable and clear the meta cache for the RS that is not responding (but maybe there is already a conf to set to reduce that 60sec period). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
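The fix proposed in HBASE-13850 (on a call timeout, drop the cached location of the unresponsive server so the next attempt re-reads meta) can be sketched as follows. Everything here is a hypothetical stand-in: LocationCacheSketch, Meta and Rpc are not HBase client classes, and java.util.concurrent.TimeoutException merely plays the role of CallTimeoutException.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeoutException;

// Hypothetical client-side sketch; not RegionServerCallable.
class LocationCacheSketch {
    interface Meta { String locate(String row); }
    interface Rpc { String get(String server, String row) throws TimeoutException; }

    private final Map<String, String> cache = new HashMap<>(); // row -> server
    private final Meta meta;
    private final Rpc rpc;
    int metaLookups = 0;

    LocationCacheSketch(Meta meta, Rpc rpc) {
        this.meta = meta;
        this.rpc = rpc;
    }

    String getWithRetries(String row, int maxAttempts) throws TimeoutException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        TimeoutException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            String server = cache.get(row);
            if (server == null) {
                server = meta.locate(row); // cache miss: ask meta again
                metaLookups++;
                cache.put(row, server);
            }
            final String target = server;
            try {
                return rpc.get(target, row);
            } catch (TimeoutException e) {
                // Drop every cached entry that points at the silent server,
                // so the next attempt picks up a reassigned location.
                cache.values().removeIf(s -> s.equals(target));
                last = e;
            }
        }
        throw last;
    }
}
```

Without the removeIf line, every retry would go back to the same cached server until some other mechanism invalidated the entry, which resembles the long retry period described in the report.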
[jira] [Updated] (HBASE-13848) Access InfoServer SSL passwords through Credential Provder API
[ https://issues.apache.org/jira/browse/HBASE-13848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13848: Attachment: HBASE-13848.1.patch Access InfoServer SSL passwords through Credential Provder API -- Key: HBASE-13848 URL: https://issues.apache.org/jira/browse/HBASE-13848 Project: HBase Issue Type: Improvement Components: security Reporter: Sean Busbey Assignee: Sean Busbey Attachments: HBASE-13848.1.patch HBASE-11810 took care of getting our SSL passwords out of the Hadoop Credential Provider API, but we also get several out of clear text configuration for the InfoServer class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13378) RegionScannerImpl synchronized for READ_UNCOMMITTED Isolation Levels
[ https://issues.apache.org/jira/browse/HBASE-13378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574735#comment-14574735 ] John Leach commented on HBASE-13378: I will take a closer look this weekend and see if I can get it to not change the guarantee while removing the synchronization. RegionScannerImpl synchronized for READ_UNCOMMITTED Isolation Levels Key: HBASE-13378 URL: https://issues.apache.org/jira/browse/HBASE-13378 Project: HBase Issue Type: New Feature Reporter: John Leach Assignee: John Leach Priority: Minor Attachments: HBASE-13378.patch, HBASE-13378.txt Original Estimate: 2h Time Spent: 2h Remaining Estimate: 0h This block of code below coupled with the close method could be changed so that READ_UNCOMMITTED does not synchronize.
{CODE:JAVA}
// synchronize on scannerReadPoints so that nobody calculates
// getSmallestReadPoint, before scannerReadPoints is updated.
IsolationLevel isolationLevel = scan.getIsolationLevel();
synchronized(scannerReadPoints) {
  this.readPt = getReadpoint(isolationLevel);
  scannerReadPoints.put(this, this.readPt);
}
{CODE}
This is a hotspot for me under heavy get requests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
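One shape the HBASE-13378 proposal could take is sketched below with hypothetical names (ReadPointSketch, Isolation, open, smallestReadPoint); it is not the RegionScannerImpl code. It assumes a READ_UNCOMMITTED scanner reads at Long.MAX_VALUE and therefore never lowers the smallest read point, so it skips both the lock and the registration. Whether skipping them keeps the existing guarantee is the open question in the comment above, so treat this as an illustration of the idea only.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; not HRegion or RegionScannerImpl.
class ReadPointSketch {
    enum Isolation { READ_COMMITTED, READ_UNCOMMITTED }

    final ConcurrentHashMap<Object, Long> scannerReadPoints = new ConcurrentHashMap<>();
    volatile long committedReadPoint = 0L;

    // Returns the read point for a newly opened scanner.
    long open(Object scanner, Isolation level) {
        if (level == Isolation.READ_UNCOMMITTED) {
            // Assumption: such a scanner sees everything and is not registered,
            // so it takes no lock here.
            return Long.MAX_VALUE;
        }
        // Same pattern as the quoted block: compute and publish under one lock.
        synchronized (scannerReadPoints) {
            long readPt = committedReadPoint;
            scannerReadPoints.put(scanner, readPt);
            return readPt;
        }
    }

    // Smallest read point over the committed point and all registered scanners.
    long smallestReadPoint() {
        synchronized (scannerReadPoints) {
            long min = committedReadPoint;
            for (long p : scannerReadPoints.values()) {
                min = Math.min(min, p);
            }
            return min;
        }
    }
}
```

A matching close method would need the same branch, removing the scanner from the map only if it was registered in the first place.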
[jira] [Created] (HBASE-13849) Remove restore and clone snapshot from the WebUI
Matteo Bertozzi created HBASE-13849: --- Summary: Remove restore and clone snapshot from the WebUI Key: HBASE-13849 URL: https://issues.apache.org/jira/browse/HBASE-13849 Project: HBase Issue Type: Bug Components: snapshots Affects Versions: 0.98.13, 1.1.0, 1.0.1, 1.2.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Remove the clone and restore snapshot buttons from the WebUI. The first reason is that the operation may take too long to have the user wait on the WebUI. The second reason is that an action from the WebUI does not play well with security, since it is going to be executed by the hbase user. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13735) race condition for web interface during master start up
[ https://issues.apache.org/jira/browse/HBASE-13735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574725#comment-14574725 ] Sean Busbey commented on HBASE-13735: - {code} - final ZooKeeperProtos.Master master = parse(this.getData(false)); + byte[] data = this.getData(false); {code} the variable data should be final. {code} assert active_master != null : Failed to retrieve master's ServerName!; +if (active_master == null) { +/%java +div class=row inner_header +div class=page-header +h1smallMaster address temporarily unavailable/small/h1 +/div +/div +%java + return; +} int infoPort = (masterAddressTracker == null) ? 0 : masterAddressTracker.getMasterInfoPort(); /%java {code} * Please remove the assert, since we are now accounting for when the answer is null. * Please refactor so that we either include the master is unavailable bit or the other information about the master, rather than adding a return statement. race condition for web interface during master start up --- Key: HBASE-13735 URL: https://issues.apache.org/jira/browse/HBASE-13735 Project: HBase Issue Type: Bug Components: master Affects Versions: 1.0.1 Reporter: Sean Busbey Assignee: Pankaj Kumar Priority: Minor Attachments: HBASE-13735.patch loaded the master web page while the master was starting up and managed to hit a HTTP 500 with a NPE. 
{code} java.lang.NullPointerException at org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.parse(MasterAddressTracker.java:236) at org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.getMasterInfoPort(MasterAddressTracker.java:86) at org.apache.hadoop.hbase.tmpl.master.BackupMasterStatusTmplImpl.renderNoFlush(BackupMasterStatusTmplImpl.java:53) at org.apache.hadoop.hbase.tmpl.master.BackupMasterStatusTmpl.renderNoFlush(BackupMasterStatusTmpl.java:113) at org.apache.hadoop.hbase.tmpl.master.MasterStatusTmplImpl.renderNoFlush(MasterStatusTmplImpl.java:309) at org.apache.hadoop.hbase.tmpl.master.MasterStatusTmpl.renderNoFlush(MasterStatusTmpl.java:373) at org.apache.hadoop.hbase.tmpl.master.MasterStatusTmpl.render(MasterStatusTmpl.java:364) at org.apache.hadoop.hbase.master.MasterStatusServlet.doGet(MasterStatusServlet.java:81) at javax.servlet.http.HttpServlet.service(HttpServlet.java:707) at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221) at org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:113) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.hbase.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:1351) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.hbase.http.NoCacheFilter.doFilter(NoCacheFilter.java:49) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.hbase.http.NoCacheFilter.doFilter(NoCacheFilter.java:49) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399) at 
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182) at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766) at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450) at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230) at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) at org.mortbay.jetty.Server.handle(Server.java:326) at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542) at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928) at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549) at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212) at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404) at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) {code} -- This
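The NullPointerException in the trace above comes from parsing master data that is not yet available during startup. A minimal sketch of the defensive shape discussed in the review follows, using a hypothetical class and a made-up two-byte encoding purely for illustration. It is not MasterAddressTracker and not the protobuf format HBase uses; the fallback of 0 mirrors the `? 0 :` default visible in the quoted template.

```java
// Hypothetical sketch; the two-byte big-endian port encoding is invented
// for this example and is not HBase's ZooKeeperProtos.Master format.
class MasterInfoPortSketch {
    // Returns the info port, or 0 when the master data is not available yet.
    static int infoPortOrZero(final byte[] data) {
        if (data == null || data.length < 2) {
            return 0; // master address temporarily unavailable
        }
        return ((data[0] & 0xFF) << 8) | (data[1] & 0xFF);
    }
}
```

Declaring the parameter (or local) final, as the review asks for `data`, makes it explicit that the bytes checked for null are the same bytes that get parsed.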
[jira] [Updated] (HBASE-13849) Remove restore and clone snapshot from the WebUI
[ https://issues.apache.org/jira/browse/HBASE-13849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-13849: Status: Patch Available (was: Open) Remove restore and clone snapshot from the WebUI Key: HBASE-13849 URL: https://issues.apache.org/jira/browse/HBASE-13849 Project: HBase Issue Type: Bug Components: snapshots Affects Versions: 0.98.13, 1.1.0, 1.0.1, 1.2.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Attachments: HBASE-13849-v0.patch Remove the clone and restore snapshot buttons from the WebUI. The first reason is that the operation may take too long to keep the user waiting on the WebUI. The second reason is that an action from the WebUI does not play well with security, since it is going to be executed by the hbase user. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13849) Remove restore and clone snapshot from the WebUI
[ https://issues.apache.org/jira/browse/HBASE-13849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-13849: Attachment: HBASE-13849-v0.patch Remove restore and clone snapshot from the WebUI Key: HBASE-13849 URL: https://issues.apache.org/jira/browse/HBASE-13849 Project: HBase Issue Type: Bug Components: snapshots Affects Versions: 1.0.1, 1.1.0, 0.98.13, 1.2.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Attachments: HBASE-13849-v0.patch Remove the clone and restore snapshot buttons from the WebUI. The first reason is that the operation may take too long to keep the user waiting on the WebUI. The second reason is that an action from the WebUI does not play well with security, since it is going to be executed by the hbase user. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13848) Access InfoServer SSL passwords through Credential Provder API
[ https://issues.apache.org/jira/browse/HBASE-13848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-13848: Status: Patch Available (was: Open) Access InfoServer SSL passwords through Credential Provder API -- Key: HBASE-13848 URL: https://issues.apache.org/jira/browse/HBASE-13848 Project: HBase Issue Type: Improvement Components: security Reporter: Sean Busbey Assignee: Sean Busbey Attachments: HBASE-13848.1.patch HBASE-11810 took care of getting our SSL passwords out of the Hadoop Credential Provider API, but we also get several out of clear text configuration for the InfoServer class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13378) RegionScannerImpl synchronized for READ_UNCOMMITTED Isolation Levels
[ https://issues.apache.org/jira/browse/HBASE-13378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574733#comment-14574733 ] Sean Busbey commented on HBASE-13378: - I agree that it should not go into the 1.0 and 1.1 lines. I'd be fine with a release noted change in behavior on 1.2. (though perhaps we should ask on user@ how surprising this will be?) RegionScannerImpl synchronized for READ_UNCOMMITTED Isolation Levels Key: HBASE-13378 URL: https://issues.apache.org/jira/browse/HBASE-13378 Project: HBase Issue Type: New Feature Reporter: John Leach Assignee: John Leach Priority: Minor Attachments: HBASE-13378.patch, HBASE-13378.txt Original Estimate: 2h Time Spent: 2h Remaining Estimate: 0h This block of code below coupled with the close method could be changed so that READ_UNCOMMITTED does not synchronize.
{CODE:JAVA}
// synchronize on scannerReadPoints so that nobody calculates
// getSmallestReadPoint, before scannerReadPoints is updated.
IsolationLevel isolationLevel = scan.getIsolationLevel();
synchronized(scannerReadPoints) {
  this.readPt = getReadpoint(isolationLevel);
  scannerReadPoints.put(this, this.readPt);
}
{CODE}
This hotspots for me under heavy get requests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
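To make the HBASE-13378 proposal above concrete, here is a minimal sketch of a read-point registry that takes the lock only for committed reads. Everything in it (SketchIsolation, ReadPointRegistry, advanceTo, register, unregister) is a hypothetical stand-in, not the RegionScannerImpl code. It assumes an uncommitted reader uses the maximum possible read point, so it can never lower the smallest read point and does not need to be registered at all.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the isolation levels discussed in the issue.
enum SketchIsolation { READ_COMMITTED, READ_UNCOMMITTED }

// Hypothetical registry; not the actual HBase implementation.
class ReadPointRegistry {
    private final ConcurrentHashMap<Object, Long> scannerReadPoints = new ConcurrentHashMap<>();
    private long committedReadPoint = 0L; // guarded by scannerReadPoints

    void advanceTo(long readPoint) {
        synchronized (scannerReadPoints) {
            committedReadPoint = readPoint;
        }
    }

    long register(Object scanner, SketchIsolation level) {
        if (level == SketchIsolation.READ_UNCOMMITTED) {
            // Assumption: an uncommitted reader sees everything, so its read
            // point cannot lower the minimum and no lock or map entry is needed.
            return Long.MAX_VALUE;
        }
        // Committed readers still publish their read point under the lock so
        // that smallestReadPoint() never runs between the read and the put.
        synchronized (scannerReadPoints) {
            long readPt = committedReadPoint;
            scannerReadPoints.put(scanner, readPt);
            return readPt;
        }
    }

    void unregister(Object scanner) {
        scannerReadPoints.remove(scanner);
    }

    long smallestReadPoint() {
        synchronized (scannerReadPoints) {
            long min = committedReadPoint;
            for (long pt : scannerReadPoints.values()) {
                min = Math.min(min, pt);
            }
            return min;
        }
    }
}
```

Whether skipping the registration is acceptable in the real scanner depends on what close() expects to find in the map, which is why the issue mentions the close method as well.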
[jira] [Commented] (HBASE-13755) Provide single super user check implementation
[ https://issues.apache.org/jira/browse/HBASE-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14574859#comment-14574859 ] Hadoop QA commented on HBASE-13755: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12737956/HBASE-13755-v2.patch against master branch at commit 67c463f63e3c48efd9e1281166b707a3a806a2ad. ATTACHMENT ID: 12737956 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 16 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:red}-1 checkstyle{color}. The applied patch generated 1921 checkstyle errors (more than the master's current 1920 errors). {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14304//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14304//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14304//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14304//console This message is automatically generated. Provide single super user check implementation -- Key: HBASE-13755 URL: https://issues.apache.org/jira/browse/HBASE-13755 Project: HBase Issue Type: Improvement Reporter: Anoop Sam John Assignee: Mikhail Antonov Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13755-v1.patch, HBASE-13755-v2.patch Followup for HBASE-13375. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-13848) Access InfoServer SSL passwords through Credential Provder API
Sean Busbey created HBASE-13848: --- Summary: Access InfoServer SSL passwords through Credential Provder API Key: HBASE-13848 URL: https://issues.apache.org/jira/browse/HBASE-13848 Project: HBase Issue Type: Improvement Components: security Reporter: Sean Busbey Assignee: Sean Busbey HBASE-11810 took care of getting our SSL passwords out of the Hadoop Credential Provider API, but we also get several out of clear text configuration for the InfoServer class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HBASE-13779) Calling table.exists() before table.get() end up with an empty Result
[ https://issues.apache.org/jira/browse/HBASE-13779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi resolved HBASE-13779. - Resolution: Fixed Calling table.exists() before table.get() end up with an empty Result - Key: HBASE-13779 URL: https://issues.apache.org/jira/browse/HBASE-13779 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 1.1.0, 1.2.0, 0.98.12.1 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Fix For: 2.0.0, 1.0.2, 1.2.0, 1.1.1, 0.98.13 Attachments: 13779-addendum.txt, HBASE-13779-addendum.patch, HBASE-13779-test.patch, HBASE-13779-v0.patch, HBASE-13779-v0.patch If we call exists() before a get() the result returned by the get() will be empty. simple test to verify it:
{code}
Put put = new Put(ROW);
put.add(FAMILY, QUALIFIER, VALUE);
table.put(put);
Get get = new Get(ROW);
boolean exist = table.exists(get);
exist = table.exists(get);
assertEquals(true, exist);
Result result = table.get(get);
// this will fail saying that the Result is empty
// if we remove the exist everything is fine
assertEquals(false, result.isEmpty());
assertTrue(Bytes.equals(VALUE, result.getValue(FAMILY, QUALIFIER)));
{code}
if we use a different Get instance for the get everything works
{code}
...
get = new Get(ROW);
Result result = table.get(get);
assertEquals(false, result.isEmpty());
{code}
HTable.exists() set the checkExistenceOnly flag in the Get so that object is not reusable by a table.get()
{code}
public boolean exists(final Get get) throws IOException {
  get.setCheckExistenceOnly(true);
  Result r = get(get);
  assert r.getExists() != null;
  return r.getExists();
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
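The HBASE-13779 report above comes down to exists() mutating the caller's Get. As a rough illustration of one way to avoid that, the sketch below sets the flag on a private copy instead. The classes SketchGet, SketchResult and SketchTable are hypothetical stand-ins written for this example; they are not the HBase client classes, and this is not necessarily how the committed patch fixes the bug.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a Get request with an existence-only flag.
final class SketchGet {
    private final String row;
    private boolean checkExistenceOnly;

    SketchGet(String row) { this.row = row; }

    // Copy constructor, so callers' objects are never modified.
    SketchGet(SketchGet other) {
        this.row = other.row;
        this.checkExistenceOnly = other.checkExistenceOnly;
    }

    String row() { return row; }
    boolean isCheckExistenceOnly() { return checkExistenceOnly; }
    void setCheckExistenceOnly(boolean value) { checkExistenceOnly = value; }
}

// Hypothetical stand-in for a Result: either a value or an existence answer.
final class SketchResult {
    private final String value;
    private final Boolean exists;

    SketchResult(String value, Boolean exists) {
        this.value = value;
        this.exists = exists;
    }

    boolean isEmpty() { return value == null; }
    Boolean getExists() { return exists; }
}

// Hypothetical in-memory table.
final class SketchTable {
    private final Map<String, String> rows = new HashMap<>();

    void put(String row, String value) { rows.put(row, value); }

    SketchResult get(SketchGet get) {
        String value = rows.get(get.row());
        if (get.isCheckExistenceOnly()) {
            // Existence-only requests carry no cell data, as in the bug report.
            return new SketchResult(null, value != null);
        }
        return new SketchResult(value, null);
    }

    boolean exists(SketchGet get) {
        // Set the flag on a copy; the caller's Get stays reusable.
        SketchGet copy = new SketchGet(get);
        copy.setCheckExistenceOnly(true);
        return get(copy).getExists();
    }
}
```

With this shape, reusing the same request object for exists() and then get() returns a non-empty result, which is the behaviour the test in the issue expects.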
[jira] [Updated] (HBASE-13729) Old hbase.regionserver.global.memstore.upperLimit and lowerLimit properties are ignored if present
[ https://issues.apache.org/jira/browse/HBASE-13729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-13729: -- Resolution: Fixed Fix Version/s: 1.1.1 1.2.0 1.0.2 2.0.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Pushed to branch-1.0+ (FYI [~enis] and [~ndimiduk]). Thanks for the patch [~esteban] Old hbase.regionserver.global.memstore.upperLimit and lowerLimit properties are ignored if present -- Key: HBASE-13729 URL: https://issues.apache.org/jira/browse/HBASE-13729 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 2.0.0, 1.0.1, 1.1.0 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Priority: Critical Fix For: 2.0.0, 1.0.2, 1.2.0, 1.1.1 Attachments: 0001-HBASE-13729-Old-hbase.regionserver.global.memstore.u.patch, HBASE-13729.2.patch, HBASE-13729.3.patch, HBASE-13729.4.patch If hbase.regionserver.global.memstore.upperLimit or lowerLimit are present we should use them instead of hbase.regionserver.global.memstore.size or hbase.regionserver.global.memstore.size.lower.limit respectively. The current implementation of HeapMemorySizeUtil.getGlobalMemStorePercent() and getGlobalMemStoreLowerMark() assumes that if the new properties are not defined then we should use the old configurations. However, the new properties are defined in hbase-default.xml, which makes the old configuration names useless. This has a direct impact when doing a rolling upgrade where hbase-site.xml hasn't been changed to use the new property names exclusively. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
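The lookup order that HBASE-13729 asks for can be sketched as follows: consult the old property name first, and fall back to the new one only when the old one is absent. This is a simplified illustration over a plain Map, not the HeapMemorySizeUtil code; the class name MemStoreLimitSketch and the FALLBACK value are assumptions made for the example, while the two property names are the ones quoted in the issue.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; not the actual HeapMemorySizeUtil implementation.
class MemStoreLimitSketch {
    static final String OLD_KEY = "hbase.regionserver.global.memstore.upperLimit";
    static final String NEW_KEY = "hbase.regionserver.global.memstore.size";
    static final float FALLBACK = 0.4f; // assumed value, for illustration only

    static float globalMemStorePercent(Map<String, String> conf) {
        // The new key is always present once defaults are loaded, so checking
        // it first would hide any old-style setting from hbase-site.xml.
        String oldValue = conf.get(OLD_KEY);
        if (oldValue != null) {
            return Float.parseFloat(oldValue);
        }
        String newValue = conf.get(NEW_KEY);
        return newValue != null ? Float.parseFloat(newValue) : FALLBACK;
    }
}
```

For example, a configuration holding both NEW_KEY=0.4 (from defaults) and OLD_KEY=0.25 (from an unmigrated site file) resolves to 0.25 here, which is the behaviour wanted during a rolling upgrade.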
[jira] [Commented] (HBASE-13729) Old hbase.regionserver.global.memstore.upperLimit and lowerLimit properties are ignored if present
[ https://issues.apache.org/jira/browse/HBASE-13729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575245#comment-14575245 ] Hudson commented on HBASE-13729: FAILURE: Integrated in HBase-TRUNK #6547 (See [https://builds.apache.org/job/HBase-TRUNK/6547/]) HBASE-13729 Old hbase.regionserver.global.memstore.upperLimit and lowerLimit properties are ignored if present (Esteban Guitierrez) (stack: rev c1be65ecf095157dc4112429af23916b96aafb95) * hbase-common/src/main/resources/hbase-default.xml Old hbase.regionserver.global.memstore.upperLimit and lowerLimit properties are ignored if present -- Key: HBASE-13729 URL: https://issues.apache.org/jira/browse/HBASE-13729 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 2.0.0, 1.0.1, 1.1.0 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Priority: Critical Fix For: 2.0.0, 1.0.2, 1.2.0, 1.1.1 Attachments: 0001-HBASE-13729-Old-hbase.regionserver.global.memstore.u.patch, HBASE-13729.2.patch, HBASE-13729.3.patch, HBASE-13729.4.patch If hbase.regionserver.global.memstore.upperLimit or lowerLimit are present we should use them instead of hbase.regionserver.global.memstore.size or hbase.regionserver.global.memstore.size.lower.limit respectively. The current implementation of HeapMemorySizeUtil.getGlobalMemStorePercent() and getGlobalMemStoreLowerMark() assumes that if the new properties are not defined then we should use the old configurations. However, the new properties are defined in hbase-default.xml, which makes the old configuration names useless. This has a direct impact when doing a rolling upgrade where hbase-site.xml hasn't been changed to use the new property names exclusively. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13666) book.pdf is not renamed during site build
[ https://issues.apache.org/jira/browse/HBASE-13666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575272#comment-14575272 ] Gabor Liptak commented on HBASE-13666: -- As per MNG-5839, we are to call post-site (instead of site). I'm uploading an updated patch. book.pdf is not renamed during site build - Key: HBASE-13666 URL: https://issues.apache.org/jira/browse/HBASE-13666 Project: HBase Issue Type: Task Components: site Reporter: Nick Dimiduk Attachments: HBASE-13666.1.patch, HBASE-13666.2.patch Noticed this while testing HBASE-13665. Looks like the post-site hook is broken or not executed, so the file {{book.pdf}} is not copied over to the expected name {{apache_hbase_reference_guide.pdf}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13811) Splitting WALs, we are filtering out too many edits - DATALOSS
[ https://issues.apache.org/jira/browse/HBASE-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-13811: -- Attachment: 13811.v8.branch-1.txt Integrated your suggestion [~Apache9] of having the startCacheFlush do the sequence id calculation. I then went further and removed the last use of getEarliest for region (in close -- seemed like we were going long way around getting closed region sequence id) and then deprecated the method altogether; its operation is subtle and shouldn't be exposed as public method. Added tests for startCacheFlush's new operation. Let me try this on cluster. Splitting WALs, we are filtering out too many edits - DATALOSS --- Key: HBASE-13811 URL: https://issues.apache.org/jira/browse/HBASE-13811 Project: HBase Issue Type: Bug Components: wal Affects Versions: 2.0.0, 1.2.0 Reporter: stack Assignee: stack Priority: Critical Fix For: 2.0.0, 1.2.0 Attachments: 13811.branch-1.txt, 13811.branch-1.txt, 13811.txt, 13811.v2.branch-1.txt, 13811.v3.branch-1.txt, 13811.v3.branch-1.txt, 13811.v4.branch-1.txt, 13811.v5.branch-1.txt, 13811.v6.branch-1.txt, 13811.v6.branch-1.txt, 13811.v7.branch-1.txt, 13811.v8.branch-1.txt, HBASE-13811-v1.testcase.patch, HBASE-13811.testcase.patch, startCacheFlush.diff I've been running ITBLLs against branch-1 around HBASE-13616 (move of ServerShutdownHandler to pv2). I have come across an instance of dataloss. My patch for HBASE-13616 was in place so can only think it the cause (but cannot see how). When we split the logs, we are skipping legit edits. Digging. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13834) Evict count not properly passed to HeapMemoryTuner.
[ https://issues.apache.org/jira/browse/HBASE-13834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575248#comment-14575248 ] Hadoop QA commented on HBASE-13834: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12738010/HBASE-13834-v1.patch against master branch at commit 67c463f63e3c48efd9e1281166b707a3a806a2ad. ATTACHMENT ID: 12738010 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . {color:red}-1 core zombie tests{color}. 
There are 1 zombie test(s): Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14307//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14307//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14307//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14307//console This message is automatically generated. Evict count not properly passed to HeapMemoryTuner. --- Key: HBASE-13834 URL: https://issues.apache.org/jira/browse/HBASE-13834 Project: HBase Issue Type: Bug Components: hbase, regionserver Affects Versions: 1.0.0 Reporter: Abhilash Assignee: Abhilash Labels: easyfix Fix For: 2.0.0, 1.0.2, 1.1.1 Attachments: HBASE-13834-v1.patch, HBASE-13834.patch The evict count that is calculated in the tune function of the HeapMemoryManager class and passed to HeapMemoryTuner via TunerContext is miscalculated. It is supposed to be the evict count between two intervals, but it's not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
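For HBASE-13834 above, "evict count between two intervals" means a delta of a running total, not the total itself. A minimal sketch of that bookkeeping, assuming the cache exposes a monotonically increasing evicted-count total; EvictDeltaTracker and deltaSince are hypothetical names for this example and not part of HeapMemoryManager.

```java
// Hypothetical helper; not the actual HeapMemoryManager code.
class EvictDeltaTracker {
    private long lastTotal; // running total seen at the previous tuner run

    // Returns the evictions since the previous call and remembers the new total.
    long deltaSince(long currentTotal) {
        long delta = currentTotal - lastTotal;
        lastTotal = currentTotal;
        return delta;
    }
}
```

The important detail is that the previous total is updated on every call, so each tuner period sees only the evictions that happened within it.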
[jira] [Commented] (HBASE-13848) Access InfoServer SSL passwords through Credential Provder API
[ https://issues.apache.org/jira/browse/HBASE-13848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575328#comment-14575328 ] Sean Busbey commented on HBASE-13848: - test failed 3/10 runs without the patch, so I'm pretty sure it isn't related. I'll try to git-bisect tonight. Access InfoServer SSL passwords through Credential Provder API -- Key: HBASE-13848 URL: https://issues.apache.org/jira/browse/HBASE-13848 Project: HBase Issue Type: Improvement Components: security Reporter: Sean Busbey Assignee: Sean Busbey Attachments: HBASE-13848.1.patch HBASE-11810 took care of getting our SSL passwords out of the Hadoop Credential Provider API, but we also get several out of clear text configuration for the InfoServer class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13789) ForeignException should not be sent to the client
[ https://issues.apache.org/jira/browse/HBASE-13789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-13789: Resolution: Fixed Fix Version/s: 1.1.1 1.2.0 1.0.2 0.98.14 2.0.0 Status: Resolved (was: Patch Available) ForeignException should not be sent to the client - Key: HBASE-13789 URL: https://issues.apache.org/jira/browse/HBASE-13789 Project: HBase Issue Type: Bug Components: Client, master Affects Versions: 2.0.0, 0.98.13, 1.0.1.1, 1.1.0.1 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Minor Fix For: 2.0.0, 0.98.14, 1.0.2, 1.2.0, 1.1.1 Attachments: HBASE-13789-v0.patch ForeignException is in hbase-server, so the client will not be able to deserialize it, and it will also hide the DoNotRetryException of the cause. I haven't found an easy way to test it, aside from manually looking at the logs, and this stuff will go away with proc-v2. So for now the easy workaround is to catch the ForeignException in the master, in the few places related to proc-v1, and throw the cause to the client. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
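The workaround described in HBASE-13789 above (catch the server-only wrapper and hand its cause to the client) might look roughly like the sketch below. ServerOnlyException and SketchDoNotRetryException are hypothetical stand-ins for ForeignException and DoNotRetryIOException, and toClientException is an invented helper name; the actual patch may differ.

```java
import java.io.IOException;

// Hypothetical stand-in for an exception type that only exists on the server.
class ServerOnlyException extends Exception {
    ServerOnlyException(Throwable cause) { super(cause); }
}

// Hypothetical stand-in for a "do not retry" exception the client understands.
class SketchDoNotRetryException extends IOException {
    SketchDoNotRetryException(String message) { super(message); }
}

class ForeignExceptionUnwrapSketch {
    // Convert the server-only wrapper into something safe to send to a client.
    static IOException toClientException(ServerOnlyException e) {
        Throwable cause = e.getCause();
        if (cause instanceof IOException) {
            // Keep the original type so the client can still see that the
            // failure must not be retried.
            return (IOException) cause;
        }
        // Do not chain the wrapper itself: the client cannot deserialize it.
        String message = cause != null ? String.valueOf(cause.getMessage()) : "unknown failure";
        return new IOException(message);
    }
}
```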
[jira] [Updated] (HBASE-13811) Splitting WALs, we are filtering out too many edits - DATALOSS
[ https://issues.apache.org/jira/browse/HBASE-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-13811: -- Attachment: 13811.v9.branch-1.txt Missed a change. Splitting WALs, we are filtering out too many edits - DATALOSS --- Key: HBASE-13811 URL: https://issues.apache.org/jira/browse/HBASE-13811 Project: HBase Issue Type: Bug Components: wal Affects Versions: 2.0.0, 1.2.0 Reporter: stack Assignee: stack Priority: Critical Fix For: 2.0.0, 1.2.0 Attachments: 13811.branch-1.txt, 13811.branch-1.txt, 13811.txt, 13811.v2.branch-1.txt, 13811.v3.branch-1.txt, 13811.v3.branch-1.txt, 13811.v4.branch-1.txt, 13811.v5.branch-1.txt, 13811.v6.branch-1.txt, 13811.v6.branch-1.txt, 13811.v7.branch-1.txt, 13811.v8.branch-1.txt, 13811.v9.branch-1.txt, HBASE-13811-v1.testcase.patch, HBASE-13811.testcase.patch, startCacheFlush.diff I've been running ITBLLs against branch-1 around HBASE-13616 (move of ServerShutdownHandler to pv2). I have come across an instance of dataloss. My patch for HBASE-13616 was in place so can only think it the cause (but cannot see how). When we split the logs, we are skipping legit edits. Digging. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13451) Make the HFileBlockIndex blockKeys to Cells so that it could be easy to use in the CellComparators
[ https://issues.apache.org/jira/browse/HBASE-13451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575161#comment-14575161 ] stack commented on HBASE-13451: --- bq. Regarding changing the scope to package private, I doubt it cannot be done because the BlockIndexReader is used in CompoundBloomFilter used in util package. Sounds like CompoundBloomFilters should be moved then. It doesn't seem like a 'util' type thing.. more io. Make the HFileBlockIndex blockKeys to Cells so that it could be easy to use in the CellComparators -- Key: HBASE-13451 URL: https://issues.apache.org/jira/browse/HBASE-13451 Project: HBase Issue Type: Sub-task Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 2.0.0 Attachments: HBASE-13451.patch, HBASE-13451_1.patch, HBASE-13451_2.patch, HBASE-13451_3.patch After HBASE-10800 we could ensure that all the blockKeys in the BlockReader are converted to Cells (KeyOnlyKeyValue) so that we could use CellComparators. Note that this can be done only for the keys that are written using CellComparators and not for the ones using RawBytesComparator. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13811) Splitting WALs, we are filtering out too many edits - DATALOSS
[ https://issues.apache.org/jira/browse/HBASE-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575109#comment-14575109 ] Hadoop QA commented on HBASE-13811: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12738020/13811.v8.branch-1.txt against branch-1 branch at commit 67c463f63e3c48efd9e1281166b707a3a806a2ad. ATTACHMENT ID: 12738020 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 16 new or modified tests. {color:red}-1 javac{color}. The patch appears to cause mvn compile goal to fail with Hadoop version 2.4.1. Compilation errors resume:
[ERROR] COMPILATION ERROR :
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/DisabledWALProvider.java:[85,17] error: DisabledWAL is not abstract and does not override abstract method startCacheFlush(byte[],Set<byte[]>) in WAL
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/DisabledWALProvider.java:[190,19] error: startCacheFlush(byte[],Set<byte[]>) in DisabledWAL cannot implement startCacheFlush(byte[],Set<byte[]>) in WAL
[ERROR] return type boolean is not compatible with Long
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.5.1:compile (default-compile) on project hbase-server: Compilation failure: Compilation failure:
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/DisabledWALProvider.java:[85,17] error: DisabledWAL is not abstract and does not override abstract method startCacheFlush(byte[],Set<byte[]>) in WAL
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/DisabledWALProvider.java:[190,19] error: startCacheFlush(byte[],Set<byte[]>) in DisabledWAL cannot implement startCacheFlush(byte[],Set<byte[]>) in WAL
[ERROR] return type boolean is not compatible with Long
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/DisabledWALProvider.java:[189,4] error: method does not override or implement a method from a supertype
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn <goals> -rf :hbase-server
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14308//console This message is automatically generated. Splitting WALs, we are filtering out too many edits - DATALOSS --- Key: HBASE-13811 URL: https://issues.apache.org/jira/browse/HBASE-13811 Project: HBase Issue Type: Bug Components: wal Affects Versions: 2.0.0, 1.2.0 Reporter: stack Assignee: stack Priority: Critical Fix For: 2.0.0, 1.2.0 Attachments: 13811.branch-1.txt, 13811.branch-1.txt, 13811.txt, 13811.v2.branch-1.txt, 13811.v3.branch-1.txt, 13811.v3.branch-1.txt, 13811.v4.branch-1.txt, 13811.v5.branch-1.txt, 13811.v6.branch-1.txt, 13811.v6.branch-1.txt, 13811.v7.branch-1.txt, 13811.v8.branch-1.txt, HBASE-13811-v1.testcase.patch, HBASE-13811.testcase.patch, startCacheFlush.diff I've been running ITBLLs against branch-1 around HBASE-13616 (move of ServerShutdownHandler to pv2). I have come across an instance of dataloss.
My patch for HBASE-13616 was in place so can only think it the cause (but cannot see how). When we split the logs, we are skipping legit edits. Digging. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13755) Provide single super user check implementation
[ https://issues.apache.org/jira/browse/HBASE-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Antonov updated HBASE-13755: Status: Patch Available (was: Open) Provide single super user check implementation -- Key: HBASE-13755 URL: https://issues.apache.org/jira/browse/HBASE-13755 Project: HBase Issue Type: Improvement Reporter: Anoop Sam John Assignee: Mikhail Antonov Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13755-v1.patch, HBASE-13755-v2.patch, HBASE-13755-v3.patch Followup for HBASE-13375. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13755) Provide single super user check implementation
[ https://issues.apache.org/jira/browse/HBASE-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Antonov updated HBASE-13755: Attachment: HBASE-13755-v3.patch fixed checkstyle (made Superusers class final, though it's a nit really) Provide single super user check implementation -- Key: HBASE-13755 URL: https://issues.apache.org/jira/browse/HBASE-13755 Project: HBase Issue Type: Improvement Reporter: Anoop Sam John Assignee: Mikhail Antonov Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13755-v1.patch, HBASE-13755-v2.patch, HBASE-13755-v3.patch Followup for HBASE-13375. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13755) Provide single super user check implementation
[ https://issues.apache.org/jira/browse/HBASE-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Antonov updated HBASE-13755: Status: Open (was: Patch Available) Provide single super user check implementation -- Key: HBASE-13755 URL: https://issues.apache.org/jira/browse/HBASE-13755 Project: HBase Issue Type: Improvement Reporter: Anoop Sam John Assignee: Mikhail Antonov Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13755-v1.patch, HBASE-13755-v2.patch, HBASE-13755-v3.patch Followup for HBASE-13375. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13827) Delayed scanner close in KeyValueHeap and StoreScanner
[ https://issues.apache.org/jira/browse/HBASE-13827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-13827: -- Attachment: HBASE-13827.patch Retry before [~anoop.hbase] wakes up Delayed scanner close in KeyValueHeap and StoreScanner -- Key: HBASE-13827 URL: https://issues.apache.org/jira/browse/HBASE-13827 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Reporter: Anoop Sam John Assignee: Anoop Sam John Fix For: 2.0.0 Attachments: HBASE-13827.patch, HBASE-13827.patch, HBASE-13827.patch This is to support the work in HBASE-12295. We have to return the blocks when close() happens on the HFileScanner. Right now HFileScanner has no close() at all; this will add one. The StoreFileScanner will call it in its own close(). In KVHeap, when we see that one of the child scanners has run out of cells, we remove it from the PriorityQueue and close it. StoreScanner does the same kind of thing. But once close() returns the blocks, this kind of early close is not correct, because there might still be cells created out of these cached blocks. This Jira aims at changing these container scanners not to close early. When a child scanner is no longer required, the container will stop using it but won't call close(). Instead the scanner will be added to another list for a delayed close, and it will be closed when the container scanner's close() happens. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
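The delayed-close idea in HBASE-13827 above can be sketched with a toy container scanner. For brevity this version drains its children one after another instead of merging them through a priority queue as KeyValueHeap does, and ChildScannerSketch and DelayedCloseContainer are hypothetical names, not HBase classes. The point it shows is only the lifecycle: an exhausted child is parked on a pending list and closed when the container itself is closed.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical child scanner over an in-memory list of cells.
class ChildScannerSketch {
    private final Iterator<String> cells;
    boolean closed;

    ChildScannerSketch(List<String> cells) { this.cells = cells.iterator(); }

    boolean hasNext() { return cells.hasNext(); }
    String next() { return cells.next(); }
    void close() { closed = true; } // a real scanner would return its blocks here
}

// Hypothetical container; reads children sequentially rather than via a heap.
class DelayedCloseContainer {
    private final List<ChildScannerSketch> active = new ArrayList<>();
    private final List<ChildScannerSketch> pendingClose = new ArrayList<>();

    DelayedCloseContainer(List<ChildScannerSketch> children) { active.addAll(children); }

    // Returns the next cell, or null when every child is exhausted.
    String next() {
        while (!active.isEmpty()) {
            ChildScannerSketch head = active.get(0);
            if (head.hasNext()) {
                return head.next();
            }
            // Exhausted: stop using it, but do not close it yet, because cells
            // already handed out may still refer to its blocks.
            active.remove(0);
            pendingClose.add(head);
        }
        return null;
    }

    void close() {
        for (ChildScannerSketch scanner : pendingClose) scanner.close();
        for (ChildScannerSketch scanner : active) scanner.close();
        pendingClose.clear();
        active.clear();
    }
}
```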
[jira] [Updated] (HBASE-13666) book.pdf is not renamed during site build
[ https://issues.apache.org/jira/browse/HBASE-13666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Liptak updated HBASE-13666: - Attachment: HBASE-13666.2.patch book.pdf is not renamed during site build - Key: HBASE-13666 URL: https://issues.apache.org/jira/browse/HBASE-13666 Project: HBase Issue Type: Task Components: site Reporter: Nick Dimiduk Attachments: HBASE-13666.1.patch, HBASE-13666.2.patch Noticed this while testing HBASE-13665. Looks like the post-site hook is broken or not executed, so the file {{book.pdf}} is not copied over to the expected name {{apache_hbase_reference_guide.pdf}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13729) Old hbase.regionserver.global.memstore.upperLimit and lowerLimit properties are ignored if present
[ https://issues.apache.org/jira/browse/HBASE-13729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575294#comment-14575294 ] Hudson commented on HBASE-13729: FAILURE: Integrated in HBase-1.0 #947 (See [https://builds.apache.org/job/HBase-1.0/947/]) HBASE-13729 Old hbase.regionserver.global.memstore.upperLimit and lowerLimit properties are ignored if present (Esteban Guitierrez) (stack: rev 9dd5e0ce984f16ca0616ed4b36e6f3aae35e912c) * hbase-common/src/main/resources/hbase-default.xml Old hbase.regionserver.global.memstore.upperLimit and lowerLimit properties are ignored if present -- Key: HBASE-13729 URL: https://issues.apache.org/jira/browse/HBASE-13729 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 2.0.0, 1.0.1, 1.1.0 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Priority: Critical Fix For: 2.0.0, 1.0.2, 1.2.0, 1.1.1 Attachments: 0001-HBASE-13729-Old-hbase.regionserver.global.memstore.u.patch, HBASE-13729.2.patch, HBASE-13729.3.patch, HBASE-13729.4.patch If hbase.regionserver.global.memstore.upperLimit or lowerLimit are present we should use them instead of hbase.regionserver.global.memstore.size or hbase.regionserver.global.memstore.size.lower.limit respectively. The current implementation of HeapMemorySizeUtil.getGlobalMemStorePercent() and getGlobalMemStoreLowerMark() assumes that if the new properties are not defined then we should use the old configurations. However, the new properties are defined in hbase-default.xml, which makes the old configuration names useless. This has a direct impact when doing a rolling upgrade where hbase-site.xml hasn't been changed to use the new property names exclusively. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13702) ImportTsv: Add dry-run functionality and log bad rows
[ https://issues.apache.org/jira/browse/HBASE-13702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575315#comment-14575315 ] Apekshit Sharma commented on HBASE-13702: - Does the patch look good? I feel like it's ready for commit. ImportTsv: Add dry-run functionality and log bad rows - Key: HBASE-13702 URL: https://issues.apache.org/jira/browse/HBASE-13702 Project: HBase Issue Type: New Feature Reporter: Apekshit Sharma Assignee: Apekshit Sharma Attachments: HBASE-13702.patch The ImportTSV job skips bad records by default (though it keeps a count). -Dimporttsv.skip.bad.lines=false can be used to fail if a bad row is encountered. Being able to easily determine which rows in an input are corrupted, rather than failing on one row at a time, seems like a good feature to have. Moreover, tools of this kind should have 'dry-run' functionality, which essentially does a quick run of the tool without making any changes while reporting any errors/warnings and success/failure. To identify corrupted rows, simply logging them should be enough. In the worst case, all rows will be logged and the size of the logs will be the same as the input size, which seems fine. However, the user might have to do some work figuring out where the logs are. Is there some link we can show to the user when the tool starts which can help them with that? For the dry run, we can simply use if-else to skip over writing out KVs, and any other mutations, if present. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
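The if-else approach to the dry run proposed in HBASE-13702 could look roughly like the sketch below. It is an illustration only, not the ImportTsv mapper: the class, its fields and the tab-count validity check are assumptions, and in-memory lists stand in for the log and for the KVs that would be written.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of "log bad rows, skip writes on dry run".
final class DryRunImportSketch {
    final List<String> written = new ArrayList<>();  // stands in for emitted KVs
    final List<String> badRows = new ArrayList<>();  // stands in for the bad-row log
    private final boolean dryRun;
    private final int expectedColumns;

    DryRunImportSketch(boolean dryRun, int expectedColumns) {
        this.dryRun = dryRun;
        this.expectedColumns = expectedColumns;
    }

    void process(String line) {
        String[] fields = line.split("\t", -1);
        if (fields.length != expectedColumns) {
            // Record the bad row and keep going instead of failing on the first one.
            badRows.add(line);
            return;
        }
        if (!dryRun) {
            // Only a real run produces output; a dry run just validates.
            written.add(line);
        }
    }
}
```

A dry run over an input therefore reports every bad row in one pass while leaving the target untouched.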
[jira] [Commented] (HBASE-13848) Access InfoServer SSL passwords through Credential Provder API
[ https://issues.apache.org/jira/browse/HBASE-13848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575100#comment-14575100 ] Sean Busbey commented on HBASE-13848: - test failed 7/10 runs. checking without patch. Access InfoServer SSL passwords through Credential Provder API -- Key: HBASE-13848 URL: https://issues.apache.org/jira/browse/HBASE-13848 Project: HBase Issue Type: Improvement Components: security Reporter: Sean Busbey Assignee: Sean Busbey Attachments: HBASE-13848.1.patch HBASE-11810 took care of getting our SSL passwords out of the Hadoop Credential Provider API, but we also get several out of clear text configuration for the InfoServer class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13849) Remove restore and clone snapshot from the WebUI
[ https://issues.apache.org/jira/browse/HBASE-13849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575105#comment-14575105 ] Hadoop QA commented on HBASE-13849: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12737990/HBASE-13849-v0.patch against master branch at commit 67c463f63e3c48efd9e1281166b707a3a806a2ad. ATTACHMENT ID: 12737990 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14306//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14306//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14306//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14306//console This message is automatically generated. Remove restore and clone snapshot from the WebUI Key: HBASE-13849 URL: https://issues.apache.org/jira/browse/HBASE-13849 Project: HBase Issue Type: Bug Components: snapshots Affects Versions: 1.0.1, 1.1.0, 0.98.13, 1.2.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Attachments: HBASE-13849-v0.patch Remove the clone and restore snapshot buttons from the WebUI. The first reason is that the operation may take too long to have the user wait on the WebUI. The second reason is that an action from the WebUI does not play well with security, since it is going to be executed by the hbase user. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13827) Delayed scanner close in KeyValueHeap and StoreScanner
[ https://issues.apache.org/jira/browse/HBASE-13827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575229#comment-14575229 ] stack commented on HBASE-13827: --- LGTM +1 Delayed scanner close in KeyValueHeap and StoreScanner -- Key: HBASE-13827 URL: https://issues.apache.org/jira/browse/HBASE-13827 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Reporter: Anoop Sam John Assignee: Anoop Sam John Fix For: 2.0.0 Attachments: HBASE-13827.patch, HBASE-13827.patch This is to support the work in HBASE-12295. We have to return the blocks when close() happens on the HFileScanner. Right now there is no close() at all; we will add it. The StoreFileScanner will call it in its own close(). In KVHeap, when we see that one of the child scanners has run out of cells, we remove it from the PriorityQueue and close it. StoreScanner does the same kind of thing. But when we want to return the blocks in close(), this kind of early close is not correct: there might still be cells created out of these cached blocks. This Jira aims at changing these container scanners not to do an early close. When a child scanner seems to be no longer required, the container will avoid using it completely but just won't call close(). Instead, the child will be added to another list for a delayed close, and it will be closed when the container scanner's close() happens. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
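The delayed-close idea in HBASE-13827 can be sketched as follows. This is a simplified illustration under stated assumptions, not KeyValueHeap or StoreScanner: the container, the ChildScanner stand-in and the retire() method name are all hypothetical, and plain lists replace the PriorityQueue.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical container that defers closing exhausted children.
final class DelayedCloseSketch {
    /** Stand-in for a child scanner; it only records whether close() ran. */
    static final class ChildScanner {
        boolean closed = false;
        void close() { closed = true; }
    }

    private final List<ChildScanner> active = new ArrayList<>();
    private final List<ChildScanner> delayedClose = new ArrayList<>();

    void add(ChildScanner scanner) {
        active.add(scanner);
    }

    /** The child ran out of cells: stop using it, but do not close it yet. */
    void retire(ChildScanner scanner) {
        if (active.remove(scanner)) {
            delayedClose.add(scanner);
        }
    }

    /** Closing the container closes retired children first, then the rest. */
    void close() {
        for (ChildScanner scanner : delayedClose) {
            scanner.close();
        }
        for (ChildScanner scanner : active) {
            scanner.close();
        }
        delayedClose.clear();
        active.clear();
    }
}
```

Because a retired child stays open until the container closes, any cells still backed by its blocks remain valid for the lifetime of the container scanner.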
[jira] [Updated] (HBASE-13755) Provide single super user check implementation
[ https://issues.apache.org/jira/browse/HBASE-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Antonov updated HBASE-13755: Status: Patch Available (was: Open) Provide single super user check implementation -- Key: HBASE-13755 URL: https://issues.apache.org/jira/browse/HBASE-13755 Project: HBase Issue Type: Improvement Reporter: Anoop Sam John Assignee: Mikhail Antonov Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13755-v1.patch, HBASE-13755-v2.patch, HBASE-13755-v3.patch Followup for HBASE-13375. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13755) Provide single super user check implementation
[ https://issues.apache.org/jira/browse/HBASE-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Antonov updated HBASE-13755: Status: Open (was: Patch Available) Provide single super user check implementation -- Key: HBASE-13755 URL: https://issues.apache.org/jira/browse/HBASE-13755 Project: HBase Issue Type: Improvement Reporter: Anoop Sam John Assignee: Mikhail Antonov Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13755-v1.patch, HBASE-13755-v2.patch, HBASE-13755-v3.patch Followup for HBASE-13375. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12295) Prevent block eviction under us if reads are in progress from the BBs
[ https://issues.apache.org/jira/browse/HBASE-12295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575331#comment-14575331 ] stack commented on HBASE-12295: --- bq. Then thinking more and more I landed in all these kind of practical issues... bq. Am I explaining it clearly now? Ok [~anoop.hbase] A close on Cell won't fly. Agree. Looking at last patch now... Prevent block eviction under us if reads are in progress from the BBs - Key: HBASE-12295 URL: https://issues.apache.org/jira/browse/HBASE-12295 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 2.0.0 Attachments: HBASE-12295.pdf, HBASE-12295_1.patch, HBASE-12295_1.pdf, HBASE-12295_2.patch, HBASE-12295_trunk.patch While we try to serve the reads from the BBs directly from the block cache, we need to ensure that the blocks do not get evicted under us while reading. This JIRA is to discuss and implement a strategy for the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13666) book.pdf is not renamed during site build
[ https://issues.apache.org/jira/browse/HBASE-13666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Liptak updated HBASE-13666: - Release Note: Correct PDF renaming and bump version of maven-antrun-plugin Status: Patch Available (was: Open) book.pdf is not renamed during site build - Key: HBASE-13666 URL: https://issues.apache.org/jira/browse/HBASE-13666 Project: HBase Issue Type: Task Components: site Reporter: Nick Dimiduk Attachments: HBASE-13666.1.patch, HBASE-13666.2.patch Noticed this while testing HBASE-13665. Looks like the post-site hook is broken or not executed, so the file {{book.pdf}} is not copied over to the expected name {{apache_hbase_reference_guide.pdf}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13827) Delayed scanner close in KeyValueHeap and StoreScanner
[ https://issues.apache.org/jira/browse/HBASE-13827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575457#comment-14575457 ] Hadoop QA commented on HBASE-13827: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12738053/HBASE-13827.patch against master branch at commit c1be65ecf095157dc4112429af23916b96aafb95. ATTACHMENT ID: 12738053 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.TestRegionRebalancing Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14311//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14311//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14311//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14311//console This message is automatically generated. Delayed scanner close in KeyValueHeap and StoreScanner -- Key: HBASE-13827 URL: https://issues.apache.org/jira/browse/HBASE-13827 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Reporter: Anoop Sam John Assignee: Anoop Sam John Fix For: 2.0.0 Attachments: HBASE-13827.patch, HBASE-13827.patch, HBASE-13827.patch This is to support the work in HBASE-12295. We have to return the blocks when close() happens on the HFileScanner. Right now there is no close() at all; we will add it. The StoreFileScanner will call it in its own close(). In KVHeap, when we see that one of the child scanners has run out of cells, we remove it from the PriorityQueue and close it. StoreScanner does the same kind of thing. But when we want to return the blocks in close(), this kind of early close is not correct: there might still be cells created out of these cached blocks. This Jira aims at changing these container scanners not to do an early close. When a child scanner seems to be no longer required, the container will avoid using it completely but just won't call close(). Instead, the child will be added to another list for a delayed close, and it will be closed when the container scanner's close() happens. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13702) ImportTsv: Add dry-run functionality and log bad rows
[ https://issues.apache.org/jira/browse/HBASE-13702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575335#comment-14575335 ] Srikanth Srungarapu commented on HBASE-13702: - Let's see how much overhead adding the dry-run functionality will add to the original code. Can you please also come up with timings without the patch for the experiments you have mentioned in [this comment|https://issues.apache.org/jira/browse/HBASE-13702?focusedCommentId=14568310page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14568310]? ImportTsv: Add dry-run functionality and log bad rows - Key: HBASE-13702 URL: https://issues.apache.org/jira/browse/HBASE-13702 Project: HBase Issue Type: New Feature Reporter: Apekshit Sharma Assignee: Apekshit Sharma Attachments: HBASE-13702.patch The ImportTSV job skips bad records by default (though it keeps a count). -Dimporttsv.skip.bad.lines=false can be used to fail if a bad row is encountered. Being able to easily determine which rows in an input are corrupted, rather than failing on one row at a time, seems like a good feature to have. Moreover, tools of this kind should have 'dry-run' functionality, which essentially does a quick run of the tool without making any changes while reporting any errors/warnings and success/failure. To identify corrupted rows, simply logging them should be enough. In the worst case, all rows will be logged and the size of the logs will be the same as the input size, which seems fine. However, the user might have to do some work figuring out where the logs are. Is there some link we can show to the user when the tool starts which can help them with that? For the dry run, we can simply use if-else to skip over writing out KVs, and any other mutations, if present. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13811) Splitting WALs, we are filtering out too many edits - DATALOSS
[ https://issues.apache.org/jira/browse/HBASE-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575373#comment-14575373 ] Hadoop QA commented on HBASE-13811: --- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12738033/13811.v9.branch-1.txt against branch-1 branch at commit 67c463f63e3c48efd9e1281166b707a3a806a2ad. ATTACHMENT ID: 12738033 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 16 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14309//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14309//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14309//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14309//console This message is automatically generated. Splitting WALs, we are filtering out too many edits - DATALOSS --- Key: HBASE-13811 URL: https://issues.apache.org/jira/browse/HBASE-13811 Project: HBase Issue Type: Bug Components: wal Affects Versions: 2.0.0, 1.2.0 Reporter: stack Assignee: stack Priority: Critical Fix For: 2.0.0, 1.2.0 Attachments: 13811.branch-1.txt, 13811.branch-1.txt, 13811.txt, 13811.v2.branch-1.txt, 13811.v3.branch-1.txt, 13811.v3.branch-1.txt, 13811.v4.branch-1.txt, 13811.v5.branch-1.txt, 13811.v6.branch-1.txt, 13811.v6.branch-1.txt, 13811.v7.branch-1.txt, 13811.v8.branch-1.txt, 13811.v9.branch-1.txt, HBASE-13811-v1.testcase.patch, HBASE-13811.testcase.patch, startCacheFlush.diff I've been running ITBLLs against branch-1 around HBASE-13616 (move of ServerShutdownHandler to pv2). I have come across an instance of dataloss. My patch for HBASE-13616 was in place so can only think it the cause (but cannot see how). When we split the logs, we are skipping legit edits. Digging. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13811) Splitting WALs, we are filtering out too many edits - DATALOSS
[ https://issues.apache.org/jira/browse/HBASE-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575398#comment-14575398 ] Duo Zhang commented on HBASE-13811: --- +1 on the latest version. Hope it can pass ITBLL. Splitting WALs, we are filtering out too many edits - DATALOSS --- Key: HBASE-13811 URL: https://issues.apache.org/jira/browse/HBASE-13811 Project: HBase Issue Type: Bug Components: wal Affects Versions: 2.0.0, 1.2.0 Reporter: stack Assignee: stack Priority: Critical Fix For: 2.0.0, 1.2.0 Attachments: 13811.branch-1.txt, 13811.branch-1.txt, 13811.txt, 13811.v2.branch-1.txt, 13811.v3.branch-1.txt, 13811.v3.branch-1.txt, 13811.v4.branch-1.txt, 13811.v5.branch-1.txt, 13811.v6.branch-1.txt, 13811.v6.branch-1.txt, 13811.v7.branch-1.txt, 13811.v8.branch-1.txt, 13811.v9.branch-1.txt, HBASE-13811-v1.testcase.patch, HBASE-13811.testcase.patch, startCacheFlush.diff I've been running ITBLLs against branch-1 around HBASE-13616 (move of ServerShutdownHandler to pv2). I have come across an instance of dataloss. My patch for HBASE-13616 was in place so can only think it the cause (but cannot see how). When we split the logs, we are skipping legit edits. Digging. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13811) Splitting WALs, we are filtering out too many edits - DATALOSS
[ https://issues.apache.org/jira/browse/HBASE-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575438#comment-14575438 ] stack commented on HBASE-13811: --- Thanks [~Apache9] It passed runs that failed before. Let me do an overnighter before commit. Splitting WALs, we are filtering out too many edits - DATALOSS --- Key: HBASE-13811 URL: https://issues.apache.org/jira/browse/HBASE-13811 Project: HBase Issue Type: Bug Components: wal Affects Versions: 2.0.0, 1.2.0 Reporter: stack Assignee: stack Priority: Critical Fix For: 2.0.0, 1.2.0 Attachments: 13811.branch-1.txt, 13811.branch-1.txt, 13811.txt, 13811.v2.branch-1.txt, 13811.v3.branch-1.txt, 13811.v3.branch-1.txt, 13811.v4.branch-1.txt, 13811.v5.branch-1.txt, 13811.v6.branch-1.txt, 13811.v6.branch-1.txt, 13811.v7.branch-1.txt, 13811.v8.branch-1.txt, 13811.v9.branch-1.txt, HBASE-13811-v1.testcase.patch, HBASE-13811.testcase.patch, startCacheFlush.diff I've been running ITBLLs against branch-1 around HBASE-13616 (move of ServerShutdownHandler to pv2). I have come across an instance of dataloss. My patch for HBASE-13616 was in place so can only think it the cause (but cannot see how). When we split the logs, we are skipping legit edits. Digging. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13666) book.pdf is not renamed during site build
[ https://issues.apache.org/jira/browse/HBASE-13666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575514#comment-14575514 ] Gabor Liptak commented on HBASE-13666: -- Yes, brought the plugin to current version. book.pdf is not renamed during site build - Key: HBASE-13666 URL: https://issues.apache.org/jira/browse/HBASE-13666 Project: HBase Issue Type: Task Components: site Reporter: Nick Dimiduk Attachments: HBASE-13666.1.patch, HBASE-13666.2.patch Noticed this while testing HBASE-13665. Looks like the post-site hook is broken or not executed, so the file {{book.pdf}} is not copied over to the expected name {{apache_hbase_reference_guide.pdf}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13851) RpcClientImpl.close() can hang with cancelled replica RPCs
[ https://issues.apache.org/jira/browse/HBASE-13851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-13851: -- Attachment: hbase-13851_v1.patch Attaching a patch which fixes the hang issue in my tests. Ran the IT 20 times without hanging. The patch interrupts the CallSender thread as well, and in case Connection is not started, it calls close() which will remove it from the connections list. RpcClientImpl.close() can hang with cancelled replica RPCs -- Key: HBASE-13851 URL: https://issues.apache.org/jira/browse/HBASE-13851 Project: HBase Issue Type: Bug Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 2.0.0, 1.2.0, 1.1.1 Attachments: hbase-13851_v1.patch We have seen the clients hanging in running the test {{IntegrationTestRegionReplicaPerf}} in 1.1 code base during the test. The jstack gives:
{code}
IPC Client (1344340481) connection to os-enis-dal-test-jun-4-1.openstacklocal/172.22.80.25:16020 from root - writer daemon prio=10 tid=0x7f3891b29800 nid=0x7345 waiting on condition [0x7f3865647000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for 0x00070d54a240 (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
    at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:374)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$CallSender.run(RpcClientImpl.java:253)

TestClient-3 prio=10 tid=0x7f3892660800 nid=0x63b0 waiting on condition [0x7f386ecdd000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl.close(RpcClientImpl.java:1139)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.internalClose(ConnectionManager.java:2371)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.close(ConnectionManager.java:2384)
    at org.apache.hadoop.hbase.PerformanceEvaluation$Test.testTakedown(PerformanceEvaluation.java:1036)
    at org.apache.hadoop.hbase.PerformanceEvaluation$RandomReadTest.testTakedown(PerformanceEvaluation.java:1351)
    at org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:1055)
    at org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:1612)
    at org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:410)
    at org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:405)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
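The jstack above shows the writer thread parked in ArrayBlockingQueue.take(), which never returns if no further call is queued. The sketch below illustrates, in isolation, why interrupting such a thread lets it exit. It is not the RpcClientImpl code or the attached patch; the class and method names are hypothetical, and only standard java.util.concurrent behaviour is relied on (take() throws InterruptedException when the thread is interrupted).

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical demo of stopping a sender thread that blocks on take().
final class InterruptSenderSketch {
    /** Returns true if the sender thread terminated after being interrupted. */
    static boolean stopsWhenInterrupted() {
        BlockingQueue<String> calls = new ArrayBlockingQueue<>(4);
        Thread sender = new Thread(() -> {
            try {
                while (true) {
                    calls.take(); // blocks while the queue is empty
                }
            } catch (InterruptedException e) {
                // Interrupted while waiting: fall through and let the thread end.
            }
        }, "call-sender-sketch");
        sender.setDaemon(true);
        sender.start();

        sender.interrupt(); // without this, the thread would wait indefinitely
        try {
            sender.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !sender.isAlive();
    }
}
```

A close() that only polls for the thread to finish, without interrupting it, would keep sleeping, which matches the TIMED_WAITING frame in RpcClientImpl.close() in the dump.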
[jira] [Updated] (HBASE-13851) RpcClientImpl.close() can hang with cancelled replica RPCs
[ https://issues.apache.org/jira/browse/HBASE-13851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-13851: -- Status: Patch Available (was: Open) RpcClientImpl.close() can hang with cancelled replica RPCs -- Key: HBASE-13851 URL: https://issues.apache.org/jira/browse/HBASE-13851 Project: HBase Issue Type: Bug Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 2.0.0, 1.2.0, 1.1.1 Attachments: hbase-13851_v1.patch We have seen the clients hanging in running the test {{IntegrationTestRegionReplicaPerf}} in 1.1 code base during the test. The jstack gives:
{code}
IPC Client (1344340481) connection to os-enis-dal-test-jun-4-1.openstacklocal/172.22.80.25:16020 from root - writer daemon prio=10 tid=0x7f3891b29800 nid=0x7345 waiting on condition [0x7f3865647000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for 0x00070d54a240 (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
    at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:374)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$CallSender.run(RpcClientImpl.java:253)

TestClient-3 prio=10 tid=0x7f3892660800 nid=0x63b0 waiting on condition [0x7f386ecdd000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl.close(RpcClientImpl.java:1139)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.internalClose(ConnectionManager.java:2371)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.close(ConnectionManager.java:2384)
    at org.apache.hadoop.hbase.PerformanceEvaluation$Test.testTakedown(PerformanceEvaluation.java:1036)
    at org.apache.hadoop.hbase.PerformanceEvaluation$RandomReadTest.testTakedown(PerformanceEvaluation.java:1351)
    at org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:1055)
    at org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:1612)
    at org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:410)
    at org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:405)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13834) Evict count not properly passed to HeapMemoryTuner.
[ https://issues.apache.org/jira/browse/HBASE-13834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575549#comment-14575549 ] Abhilash commented on HBASE-13834: -- Thanks a lot [~eclark] for introducing me to HBase. Thanks to [~ted_yu] and [~anoop.hbase] for your reviews. Really excited to contribute to HBase further ^_^ . Evict count not properly passed to HeapMemoryTuner. --- Key: HBASE-13834 URL: https://issues.apache.org/jira/browse/HBASE-13834 Project: HBase Issue Type: Bug Components: hbase, regionserver Affects Versions: 1.0.0 Reporter: Abhilash Assignee: Abhilash Labels: easyfix Fix For: 2.0.0, 1.0.2, 1.2.0, 1.1.1 Attachments: HBASE-13834-v1.patch, HBASE-13834.patch The evict count calculated in the tune function of the HeapMemoryManager class, which is passed to HeapMemoryTuner via TunerContext, is miscalculated. It is supposed to be the evict count between two intervals, but it is not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
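A minimal sketch of the "evict count between two intervals" that HBASE-13834 asks for, assuming the block cache only exposes a cumulative eviction counter. EvictDeltaTracker and evictionsSinceLastTune are hypothetical names chosen for illustration; this is not the HeapMemoryManager code changed by the patch.

```java
/** Hypothetical helper, not part of HBase: turns a cumulative counter into per-interval deltas. */
class EvictDeltaTracker {
    // Cumulative eviction count observed at the end of the previous tuning interval.
    private long lastCumulativeEvictions = 0;

    /**
     * @param cumulativeEvictions total evictions since the cache was created
     * @return evictions that happened since the previous call
     */
    long evictionsSinceLastTune(long cumulativeEvictions) {
        long delta = cumulativeEvictions - lastCumulativeEvictions;
        lastCumulativeEvictions = cumulativeEvictions; // remember the baseline for the next interval
        return delta;
    }

    public static void main(String[] args) {
        EvictDeltaTracker tracker = new EvictDeltaTracker();
        System.out.println(tracker.evictionsSinceLastTune(100)); // prints 100
        System.out.println(tracker.evictionsSinceLastTune(130)); // prints 30
        System.out.println(tracker.evictionsSinceLastTune(130)); // prints 0
    }
}
```

Passing the raw cumulative value to a tuner on every interval would make a quiet period look as busy as the whole lifetime of the cache, which is presumably why the delta matters here.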
[jira] [Updated] (HBASE-13755) Provide single super user check implementation
[ https://issues.apache.org/jira/browse/HBASE-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Antonov updated HBASE-13755: Attachment: HBASE-13755-v3.patch Looks like the patch got stuck after a weird protoc error; re-attaching the exact same patch file. Provide single super user check implementation -- Key: HBASE-13755 URL: https://issues.apache.org/jira/browse/HBASE-13755 Project: HBase Issue Type: Improvement Reporter: Anoop Sam John Assignee: Mikhail Antonov Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13755-v1.patch, HBASE-13755-v2.patch, HBASE-13755-v3.patch Followup for HBASE-13375. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13755) Provide single super user check implementation
[ https://issues.apache.org/jira/browse/HBASE-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Antonov updated HBASE-13755: Status: Open (was: Patch Available) Provide single super user check implementation -- Key: HBASE-13755 URL: https://issues.apache.org/jira/browse/HBASE-13755 Project: HBase Issue Type: Improvement Reporter: Anoop Sam John Assignee: Mikhail Antonov Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13755-v1.patch, HBASE-13755-v2.patch, HBASE-13755-v3.patch Followup for HBASE-13375. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13755) Provide single super user check implementation
[ https://issues.apache.org/jira/browse/HBASE-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Antonov updated HBASE-13755: Attachment: (was: HBASE-13755-v3.patch) Provide single super user check implementation -- Key: HBASE-13755 URL: https://issues.apache.org/jira/browse/HBASE-13755 Project: HBase Issue Type: Improvement Reporter: Anoop Sam John Assignee: Mikhail Antonov Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13755-v1.patch, HBASE-13755-v2.patch, HBASE-13755-v3.patch Followup for HBASE-13375. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13755) Provide single super user check implementation
[ https://issues.apache.org/jira/browse/HBASE-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Antonov updated HBASE-13755: Status: Open (was: Patch Available) Provide single super user check implementation -- Key: HBASE-13755 URL: https://issues.apache.org/jira/browse/HBASE-13755 Project: HBase Issue Type: Improvement Reporter: Anoop Sam John Assignee: Mikhail Antonov Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13755-v1.patch, HBASE-13755-v2.patch, HBASE-13755-v3.patch, HBASE-13755-v3.patch Followup for HBASE-13375. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13834) Evict count not properly passed to HeapMemoryTuner.
[ https://issues.apache.org/jira/browse/HBASE-13834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575526#comment-14575526 ] Anoop Sam John commented on HBASE-13834: The zombie does not seem related to HBase tests at all. In the jstack output at https://builds.apache.org/job/PreCommit-HBASE-Build/14307/console, there are threads related to Derby. +1 for the patch. Will commit. Evict count not properly passed to HeapMemoryTuner. --- Key: HBASE-13834 URL: https://issues.apache.org/jira/browse/HBASE-13834 Project: HBase Issue Type: Bug Components: hbase, regionserver Affects Versions: 1.0.0 Reporter: Abhilash Assignee: Abhilash Labels: easyfix Fix For: 2.0.0, 1.0.2, 1.1.1 Attachments: HBASE-13834-v1.patch, HBASE-13834.patch The evict count calculated in the tune function of the HeapMemoryManager class, which is passed to HeapMemoryTuner via TunerContext, is miscalculated. It is supposed to be the evict count between two intervals, but it is not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13851) RpcClientImpl.close() can hang with cancelled replica RPCs
[ https://issues.apache.org/jira/browse/HBASE-13851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575524#comment-14575524 ] Enis Soztutar commented on HBASE-13851: ---
Here is an explanation of what is happening, for the brave souls:

RpcClientImpl just hangs in close() after interrupting every Connection thread that is running. We have a Connection thread and a CallSender thread per RS. CallSender is only started if we are using specificThreadForWriting (enabled for replica reads). The CallSender thread is started in the Connection constructor, while the Connection thread itself is started only after setupIOStreams() succeeds. setupIOStreams() is called only when a call is being written.

RpcClientImpl keeps a map of Connection objects. A new RPC call will create a new Connection object and add it to the map if needed. RpcClient.close() interrupts all Connections and waits until every Connection has been removed from the map. Normally, the Connection thread, after being interrupted, will call markClosed(), and then the thread's run loop will end and, as its last operation, call close(). Connection.close() removes the Connection from the RpcClient's connections map.

If a replica RPC is performed, a Connection object is constructed and added to the map. Normally the RPC call is handled by the CallSender thread, which is already running; it will call setupIOStreams() and, depending on whether an exception occurs, either start the Connection thread or call Connection.close(), which removes the Connection from the map.

In a rare case, a new Connection can be created, but before CallSender sends the RPC call (and, for that, sets up the IO streams and starts the Connection thread), the RPC may be cancelled because another replica responded first. Previously we were not cancelling the RPC, but after HBASE-12668 the cancellation happens, which causes the Connection thread not to start at all if no more RPCs are coming. In this case, since there is no Connection thread running, RpcClientImpl.close() cannot interrupt it, and Connection.close() will never be called.

RpcClientImpl.close() can hang with cancelled replica RPCs -- Key: HBASE-13851 URL: https://issues.apache.org/jira/browse/HBASE-13851 Project: HBase Issue Type: Bug Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 2.0.0, 1.2.0, 1.1.1
We have seen clients hang while running the test {{IntegrationTestRegionReplicaPerf}} on the 1.1 code base. The jstack gives:
{code}
IPC Client (1344340481) connection to os-enis-dal-test-jun-4-1.openstacklocal/172.22.80.25:16020 from root - writer daemon prio=10 tid=0x7f3891b29800 nid=0x7345 waiting on condition [0x7f3865647000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for 0x00070d54a240 (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
    at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:374)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$CallSender.run(RpcClientImpl.java:253)

TestClient-3 prio=10 tid=0x7f3892660800 nid=0x63b0 waiting on condition [0x7f386ecdd000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl.close(RpcClientImpl.java:1139)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.internalClose(ConnectionManager.java:2371)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.close(ConnectionManager.java:2384)
    at org.apache.hadoop.hbase.PerformanceEvaluation$Test.testTakedown(PerformanceEvaluation.java:1036)
    at org.apache.hadoop.hbase.PerformanceEvaluation$RandomReadTest.testTakedown(PerformanceEvaluation.java:1351)
    at org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:1055)
    at org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:1612)
    at org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:410)
    at org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:405)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at
[jira] [Comment Edited] (HBASE-13851) RpcClientImpl.close() can hang with cancelled replica RPCs
[ https://issues.apache.org/jira/browse/HBASE-13851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575524#comment-14575524 ] Enis Soztutar edited comment on HBASE-13851 at 6/6/15 2:24 AM: ---
Here is an explanation of what is happening, for the brave souls:

RpcClientImpl just hangs in close() after interrupting every Connection thread that is running. We have a Connection thread and a CallSender thread per RS. CallSender is only started if we are using specificThreadForWriting (enabled for replica reads). The CallSender thread is started in the Connection constructor, while the Connection thread itself is started only after setupIOStreams() succeeds. setupIOStreams() is called only when a call is being written.

RpcClientImpl keeps a map of Connection objects. A new RPC call will create a new Connection object and add it to the map if needed. RpcClient.close() interrupts all Connections and waits until every Connection has been removed from the map. Normally, the Connection thread, after being interrupted, will call markClosed(), and then the thread's run loop will end and, as its last operation, call close(). Connection.close() removes the Connection from the RpcClient's connections map.

If a replica RPC is performed, a Connection object is constructed and added to the map. Normally the RPC call is handled by the CallSender thread, which is already running; it will call setupIOStreams() and, depending on whether an exception occurs, either start the Connection thread or call Connection.close(), which removes the Connection from the map.

In a rare case, a new Connection can be created, but before CallSender sends the RPC call (and, for that, sets up the IO streams and starts the Connection thread), the RPC may be cancelled because another replica responded first. Previously we were not cancelling the RPC, but after HBASE-12668 the cancellation happens, which causes the Connection thread not to start at all if no more RPCs are coming. In this case, since there is no Connection thread running, RpcClientImpl.close() cannot interrupt it, and Connection.close() will never be called.

was (Author: enis):
Here is an explanation of what is happening, for the brave souls:

RpcClientImpl just hangs in close() after interrupting every Connection thread that is running. We have a Connection thread and a CallSender thread per RS. CallSender is only started if we are using specificThreadForWriting (enabled for replica reads). The CallSender thread is started in the Connection constructor, while the Connection thread itself is started only after setupIOStreams() succeeds. setupIOStreams() is called only when a call is being written.

RpcClientImpl keeps a map of Connection objects. A new RPC call will create a new Connection object and add it to the map if needed. RpcClient.close() interrupts all Connections and waits until every Connection has been removed from the map. Normally, the Connection thread, after being interrupted, will call markClosed(), and then the thread's run loop will end and, as its last operation, call close(). Connection.close() removes the Connection from the RpcClient's connections map.

If a replica RPC is performed, a Connection object is constructed and added to the map. Normally the RPC call is handled by the CallSender thread, which is already running; it will call setupIOStreams() and, depending on whether an exception occurs, either start the Connection thread or call Connection.close(), which removes the Connection from the map.

In a rare case, a new Connection can be created, but before CallSender sends the RPC call (and, for that, sets up the IO streams and starts the Connection thread), the RPC may be cancelled because another replica responded first. Previously we were not cancelling the RPC, but after HBASE-12668 the cancellation happens, which causes the Connection thread not to start at all if no more RPCs are coming. In this case, since there is no Connection thread running, RpcClientImpl.close() cannot interrupt it, and Connection.close() will never be called.

RpcClientImpl.close() can hang with cancelled replica RPCs -- Key: HBASE-13851 URL: https://issues.apache.org/jira/browse/HBASE-13851 Project: HBase Issue Type: Bug Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 2.0.0, 1.2.0, 1.1.1
We have seen clients hang while running the test {{IntegrationTestRegionReplicaPerf}} on the 1.1 code base. The jstack gives:
{code}
IPC Client (1344340481) connection to
[jira] [Updated] (HBASE-13755) Provide single super user check implementation
[ https://issues.apache.org/jira/browse/HBASE-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Antonov updated HBASE-13755: Status: Patch Available (was: Open) Provide single super user check implementation -- Key: HBASE-13755 URL: https://issues.apache.org/jira/browse/HBASE-13755 Project: HBase Issue Type: Improvement Reporter: Anoop Sam John Assignee: Mikhail Antonov Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13755-v1.patch, HBASE-13755-v2.patch, HBASE-13755-v3.patch, HBASE-13755-v3.patch Followup for HBASE-13375. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13827) Delayed scanner close in KeyValueHeap and StoreScanner
[ https://issues.apache.org/jira/browse/HBASE-13827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-13827: --- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Pushed to master. Delayed scanner close in KeyValueHeap and StoreScanner -- Key: HBASE-13827 URL: https://issues.apache.org/jira/browse/HBASE-13827 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Reporter: Anoop Sam John Assignee: Anoop Sam John Fix For: 2.0.0 Attachments: HBASE-13827.patch, HBASE-13827.patch, HBASE-13827.patch This is to support the work in HBASE-12295. We have to return the blocks when close() happens on the HFileScanner. Right now there is no close() at all. Will add. The StoreFileScanner will call it on its close(). In KVHeap, when one of the child scanners runs out of cells, we remove it from the PriorityQueue and close it. StoreScanner does the same kind of thing. But when we want to return blocks in close(), this kind of early close is not correct: there might still be cells created out of these cached blocks. This Jira aims at changing these container scanners not to do an early close. When a child scanner appears to be no longer required, the container will stop using it completely but won't call close(). Instead, the scanner will be added to another list for delayed close, and it will be closed when the container scanner's close() happens. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
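A minimal sketch of the delayed-close idea described in HBASE-13827, assuming child scanners that iterate over sorted ints. DelayedCloseHeap, ChildScanner and the delayedClose list are hypothetical stand-ins written for illustration; they are not the KeyValueHeap or StoreScanner code from the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical container scanner that defers closing exhausted children until its own close(). */
class DelayedCloseHeap implements AutoCloseable {
    /** Hypothetical child scanner over a sorted int array. */
    static final class ChildScanner implements AutoCloseable {
        private final int[] cells;
        private int pos = 0;
        boolean closed = false;

        ChildScanner(int... cells) { this.cells = cells; }
        boolean hasNext() { return pos < cells.length; }
        int peek() { return cells[pos]; }
        int next() { return cells[pos++]; }
        @Override public void close() { closed = true; } // a real scanner would return its blocks here
    }

    // Children that still have cells, ordered by their current head cell.
    private final PriorityQueue<ChildScanner> heap =
        new PriorityQueue<ChildScanner>((a, b) -> Integer.compare(a.peek(), b.peek()));
    // Children that are exhausted but must stay open until the container closes.
    private final List<ChildScanner> delayedClose = new ArrayList<>();

    DelayedCloseHeap(List<ChildScanner> scanners) {
        for (ChildScanner s : scanners) {
            if (s.hasNext()) heap.add(s); else delayedClose.add(s);
        }
    }

    /** Returns the next smallest cell, or null once every child is exhausted. */
    Integer next() {
        ChildScanner top = heap.poll();
        if (top == null) return null;
        int cell = top.next();
        if (top.hasNext()) {
            heap.add(top);         // still has cells: re-insert under its new head
        } else {
            delayedClose.add(top); // exhausted: stop using it, but do not close it yet
        }
        return cell;
    }

    @Override public void close() {
        for (ChildScanner s : heap) s.close();
        for (ChildScanner s : delayedClose) s.close();
        heap.clear();
        delayedClose.clear();
    }
}
```

The point of the extra list is that a cell handed out earlier may still refer to data owned by an exhausted child, so the child is only released when the caller is done with the whole container.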
[jira] [Commented] (HBASE-13834) Evict count not properly passed to HeapMemoryTuner.
[ https://issues.apache.org/jira/browse/HBASE-13834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575545#comment-14575545 ] Elliott Clark commented on HBASE-13834: --- Thanks [~abhilak] for your first patch. Thanks [~yuzhih...@gmail.com] and [~anoop.hbase] for the reviews. Evict count not properly passed to HeapMemoryTuner. --- Key: HBASE-13834 URL: https://issues.apache.org/jira/browse/HBASE-13834 Project: HBase Issue Type: Bug Components: hbase, regionserver Affects Versions: 1.0.0 Reporter: Abhilash Assignee: Abhilash Labels: easyfix Fix For: 2.0.0, 1.0.2, 1.2.0, 1.1.1 Attachments: HBASE-13834-v1.patch, HBASE-13834.patch The evict count calculated in the tune function of the HeapMemoryManager class, which is passed to HeapMemoryTuner via TunerContext, is miscalculated. It is supposed to be the evict count between two intervals, but it is not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-13851) RpcClientImpl.close() can hang with cancelled replica RPCs
Enis Soztutar created HBASE-13851: - Summary: RpcClientImpl.close() can hang with cancelled replica RPCs Key: HBASE-13851 URL: https://issues.apache.org/jira/browse/HBASE-13851 Project: HBase Issue Type: Bug Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 2.0.0, 1.2.0, 1.1.1
We have seen clients hang while running the test {{IntegrationTestRegionReplicaPerf}} on the 1.1 code base. The jstack gives:
{code}
IPC Client (1344340481) connection to os-enis-dal-test-jun-4-1.openstacklocal/172.22.80.25:16020 from root - writer daemon prio=10 tid=0x7f3891b29800 nid=0x7345 waiting on condition [0x7f3865647000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for 0x00070d54a240 (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
    at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:374)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$CallSender.run(RpcClientImpl.java:253)

TestClient-3 prio=10 tid=0x7f3892660800 nid=0x63b0 waiting on condition [0x7f386ecdd000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl.close(RpcClientImpl.java:1139)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.internalClose(ConnectionManager.java:2371)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.close(ConnectionManager.java:2384)
    at org.apache.hadoop.hbase.PerformanceEvaluation$Test.testTakedown(PerformanceEvaluation.java:1036)
    at org.apache.hadoop.hbase.PerformanceEvaluation$RandomReadTest.testTakedown(PerformanceEvaluation.java:1351)
    at org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:1055)
    at org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:1612)
    at org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:410)
    at org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:405)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
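A minimal sketch of the hang in HBASE-13851, following the explanation in Enis Soztutar's comment on the issue: close() polls until the connection map is empty, but only a connection's own reader thread ever removes it from the map. CloseHangSketch, its Connection class and the timeoutMs parameter are hypothetical and exist only so the example terminates; the close() in the jstack above sleeps in a loop, and nothing here is the real RpcClientImpl code or its fix.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class CloseHangSketch {
    /** Hypothetical miniature of a client connection; registers itself on construction. */
    static final class Connection {
        private final Thread reader;
        private volatile boolean started = false;

        Connection(String id, Map<String, Connection> registry) {
            this.reader = new Thread(() -> {
                try {
                    Thread.sleep(Long.MAX_VALUE); // pretend to wait for responses
                } catch (InterruptedException e) {
                    // interrupted by close(): fall through to cleanup
                } finally {
                    registry.remove(id);          // only the reader's exit path deregisters
                }
            });
            this.reader.setDaemon(true);
            registry.put(id, this);
        }

        /** Stands in for "IO streams were set up because a call was actually written". */
        void startReader() { started = true; reader.start(); }

        void interrupt() { if (started) reader.interrupt(); }
    }

    /** Interrupts every connection, then polls until the map drains or the deadline passes. */
    static boolean close(Map<String, Connection> registry, long timeoutMs) {
        for (Connection c : registry.values()) c.interrupt();
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!registry.isEmpty()) {
            if (System.currentTimeMillis() >= deadline) return false; // sketch-only escape hatch
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Connection> healthy = new ConcurrentHashMap<>();
        new Connection("rs1", healthy).startReader();
        System.out.println("started reader drained: " + close(healthy, 2000));

        Map<String, Connection> stuck = new ConcurrentHashMap<>();
        new Connection("rs2", stuck); // call cancelled before the reader was ever started
        System.out.println("unstarted reader drained: " + close(stuck, 200));
    }
}
```

In the second case nothing ever removes the entry, so without the deadline the polling loop would spin forever, which matches the TestClient-3 thread sleeping inside RpcClientImpl.close() in the trace.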
[jira] [Commented] (HBASE-13834) Evict count not properly passed to HeapMemoryTuner.
[ https://issues.apache.org/jira/browse/HBASE-13834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575570#comment-14575570 ] Hudson commented on HBASE-13834: FAILURE: Integrated in HBase-1.2 #134 (See [https://builds.apache.org/job/HBase-1.2/134/]) HBASE-13834 Evict count not properly passed to HeapMemoryTuner. (Abhilash) (anoopsamjohn: rev a18397e0ee40cb0a76f03537e5e856d45c8a6bea) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HeapMemoryManager.java Evict count not properly passed to HeapMemoryTuner. --- Key: HBASE-13834 URL: https://issues.apache.org/jira/browse/HBASE-13834 Project: HBase Issue Type: Bug Components: hbase, regionserver Affects Versions: 1.0.0 Reporter: Abhilash Assignee: Abhilash Labels: easyfix Fix For: 2.0.0, 1.0.2, 1.2.0, 1.1.1 Attachments: HBASE-13834-v1.patch, HBASE-13834.patch The evict count calculated in the tune function of the HeapMemoryManager class, which is passed to HeapMemoryTuner via TunerContext, is miscalculated. It is supposed to be the evict count between two intervals, but it is not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13829) Add more ThrottleType
[ https://issues.apache.org/jira/browse/HBASE-13829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang updated HBASE-13829: --- Attachment: HBASE-13829-v1.patch Add more ThrottleType - Key: HBASE-13829 URL: https://issues.apache.org/jira/browse/HBASE-13829 Project: HBase Issue Type: Improvement Components: Client Reporter: Guanghao Zhang Assignee: Guanghao Zhang Fix For: 2.0.0 Attachments: HBASE-13829-v1.patch, HBASE-13829.patch HBASE-11598 added simple throttling for HBase, but the client does not let users set a ThrottleType such as WRITE_NUM, WRITE_SIZE, READ_NUM, or READ_SIZE. REVIEW BOARD: https://reviews.apache.org/r/34989/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13829) Add more ThrottleType
[ https://issues.apache.org/jira/browse/HBASE-13829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575581#comment-14575581 ] Guanghao Zhang commented on HBASE-13829: OK. It is public now. Add more ThrottleType - Key: HBASE-13829 URL: https://issues.apache.org/jira/browse/HBASE-13829 Project: HBase Issue Type: Improvement Components: Client Reporter: Guanghao Zhang Assignee: Guanghao Zhang Fix For: 2.0.0 Attachments: HBASE-13829-v1.patch, HBASE-13829.patch HBASE-11598 added simple throttling for HBase, but the client does not let users set a ThrottleType such as WRITE_NUM, WRITE_SIZE, READ_NUM, or READ_SIZE. REVIEW BOARD: https://reviews.apache.org/r/34989/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13834) Evict count not properly passed to HeapMemoryTuner.
[ https://issues.apache.org/jira/browse/HBASE-13834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575599#comment-14575599 ] Hudson commented on HBASE-13834: FAILURE: Integrated in HBase-TRUNK #6548 (See [https://builds.apache.org/job/HBase-TRUNK/6548/]) HBASE-13834 Evict count not properly passed to HeapMemoryTuner. (Abhilash) (anoopsamjohn: rev c1d970b04d27f4b34a5d4ccd981b9fe8fc326148) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HeapMemoryManager.java Evict count not properly passed to HeapMemoryTuner. --- Key: HBASE-13834 URL: https://issues.apache.org/jira/browse/HBASE-13834 Project: HBase Issue Type: Bug Components: hbase, regionserver Affects Versions: 1.0.0 Reporter: Abhilash Assignee: Abhilash Labels: easyfix Fix For: 2.0.0, 1.0.2, 1.2.0, 1.1.1 Attachments: HBASE-13834-v1.patch, HBASE-13834.patch The evict count calculated in the tune function of the HeapMemoryManager class, which is passed to HeapMemoryTuner via TunerContext, is miscalculated. It is supposed to be the evict count between two intervals, but it is not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13827) Delayed scanner close in KeyValueHeap and StoreScanner
[ https://issues.apache.org/jira/browse/HBASE-13827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575598#comment-14575598 ] Hudson commented on HBASE-13827: FAILURE: Integrated in HBase-TRUNK #6548 (See [https://builds.apache.org/job/HBase-TRUNK/6548/]) HBASE-13827 Delayed scanner close in KeyValueHeap and StoreScanner. (anoopsamjohn: rev fef6d7f48c81d63b12be4f53031bdbf208635cac) * hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileScanner.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeyValueHeap.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java * hbase-server/src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueHeap.java * hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderImpl.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ReversedKeyValueHeap.java Delayed scanner close in KeyValueHeap and StoreScanner -- Key: HBASE-13827 URL: https://issues.apache.org/jira/browse/HBASE-13827 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Reporter: Anoop Sam John Assignee: Anoop Sam John Fix For: 2.0.0 Attachments: HBASE-13827.patch, HBASE-13827.patch, HBASE-13827.patch This is to support the work in HBASE-12295. We have to return the blocks when close() happens on the HFileScanner. Right now there is no close() at all. Will add. The StoreFileScanner will call it on its close(). In KVHeap, when one of the child scanners runs out of cells, we remove it from the PriorityQueue and close it. StoreScanner does the same kind of thing. But when we want to return blocks in close(), this kind of early close is not correct: there might still be cells created out of these cached blocks. This Jira aims at changing these container scanners not to do an early close. When a child scanner appears to be no longer required, the container will stop using it completely but won't call close(). Instead, the scanner will be added to another list for delayed close, and it will be closed when the container scanner's close() happens. -- This message was sent by Atlassian JIRA (v6.3.4#6332)