[jira] [Commented] (HBASE-5908) TestHLogSplit.testTralingGarbageCorruptionFileSkipErrorsPasses should not use append to corrupt the HLog
[ https://issues.apache.org/jira/browse/HBASE-5908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13265527#comment-13265527 ]

Zhihong Yu commented on HBASE-5908:
---
@Gregory: I would suggest referencing other JIRAs by their names only, such as HADOOP-8230. This way we would easily see whether the JIRA has been resolved.

TestHLogSplit.testTralingGarbageCorruptionFileSkipErrorsPasses should not use append to corrupt the HLog

Key: HBASE-5908
URL: https://issues.apache.org/jira/browse/HBASE-5908
Project: HBase
Issue Type: Bug
Components: test, wal
Affects Versions: 0.96.0
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Minor
Attachments: HBASE-5908-trunk.patch

TestHLogSplit.testTralingGarbageCorruptionFileSkipErrorsPasses fails against a version of hadoop with https://issues.apache.org/jira/browse/HADOOP-8230

The failure:
java.io.IOException: Append is not supported. Please see the dfs.support.append configuration parameter.

Instead of using append, we can probably just:
- copy over the contents to a new file
- append the garbage to the new file
- copy back to the old file

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
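The three steps proposed in the description can be sketched as follows. This is only an illustration on the local file system with java.nio; the real test works against the Hadoop FileSystem API, and the class name TrailingGarbage and the method corruptWithTrailingGarbage are hypothetical, not taken from the patch.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of HBase: corrupts a log file without append.
public class TrailingGarbage {

    /** Rewrites {@code log} so that it ends with {@code garbage}, never opening it in append mode. */
    public static void corruptWithTrailingGarbage(Path log, byte[] garbage) throws IOException {
        Path tmp = Files.createTempFile(log.toAbsolutePath().getParent(), "corrupt", ".tmp");
        try (OutputStream out = Files.newOutputStream(tmp)) {
            // Step 1: copy the existing contents to a new file.
            Files.copy(log, out);
            // Step 2: write the garbage right after the copied contents.
            out.write(garbage);
        }
        // Step 3: put the new file back in place of the old one.
        Files.move(tmp, log, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("hlog", ".log");
        Files.write(log, "edits".getBytes(StandardCharsets.UTF_8));
        corruptWithTrailingGarbage(log, "GARBAGE".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(Files.readAllBytes(log), StandardCharsets.UTF_8));
    }
}
```

Writing the old contents and the garbage through one freshly created output stream means no append call is needed at any point, which is the property the test needs when dfs.support.append is off.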
[jira] [Commented] (HBASE-5548) Add ability to get a table in the shell
[ https://issues.apache.org/jira/browse/HBASE-5548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13265529#comment-13265529 ]

Zhihong Yu commented on HBASE-5548:
---
@Jesse: I checked the two outstanding Hadoop QA jobs around 23:27 - they were not for this JIRA.

Add ability to get a table in the shell
---
Key: HBASE-5548
URL: https://issues.apache.org/jira/browse/HBASE-5548
Project: HBase
Issue Type: Improvement
Components: shell
Reporter: Jesse Yates
Assignee: Jesse Yates
Fix For: 0.96.0
Attachments: ruby_HBASE-5528-v0.patch, ruby_HBASE-5548-v1.patch, ruby_HBASE-5548-v2.patch, ruby_HBASE-5548-v3.patch, ruby_HBASE-5548-v5.patch

Currently, all the commands that operate on a table in the shell first have to take the table name as input. There are two main considerations:
* It is annoying to have to write the table name every time, when you should just be able to get a reference to a table
* The current implementation is very wasteful - it creates a new HTable for each call (but reuses the connection since it uses the same configuration)

We should be able to get a handle to a single HTable and then operate on that.
[jira] [Commented] (HBASE-5699) Run with 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13265606#comment-13265606 ]

Zhihong Yu commented on HBASE-5699:
---
bq. to one HLog object, which might have more than one underlying stream.

The above can be a (sub-)task by itself.

Run with 1 WAL in HRegionServer
-
Key: HBASE-5699
URL: https://issues.apache.org/jira/browse/HBASE-5699
Project: HBase
Issue Type: Improvement
Reporter: binlijin
Assignee: Li Pi
[jira] [Commented] (HBASE-5699) Run with 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13265620#comment-13265620 ]

Zhihong Yu commented on HBASE-5699:
---
Currently we maintain one sequence number per region per HLog. From append():
{code}
this.lastSeqWritten.putIfAbsent(regionInfo.getEncodedNameAsBytes(), Long.valueOf(seqNum));
{code}
If WALEdits from a particular region can spread across multiple streams, accounting would be more complex.

Run with 1 WAL in HRegionServer
-
Key: HBASE-5699
URL: https://issues.apache.org/jira/browse/HBASE-5699
Project: HBase
Issue Type: Improvement
Reporter: binlijin
Assignee: Li Pi
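To make the accounting concern concrete, here is a minimal sketch of the single-stream case, where putIfAbsent keeps only the first sequence number seen for a region until that region is flushed. The class LastSeqWritten and its methods are hypothetical, and the key is simplified to a String (the quoted HLog code keys on the encoded region name as bytes); the reading of the map as "oldest unflushed edit per region" is an assumption drawn from the quoted line.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for the per-region bookkeeping quoted from HLog.append().
public class LastSeqWritten {

    // encoded region name -> sequence number of the first edit recorded since the last flush
    private final ConcurrentMap<String, Long> lastSeqWritten = new ConcurrentHashMap<>();

    /** Records an edit; only the first sequence number per region is retained. */
    public void append(String encodedRegionName, long seqNum) {
        lastSeqWritten.putIfAbsent(encodedRegionName, Long.valueOf(seqNum));
    }

    /** Returns the retained sequence number, or null if the region has nothing recorded. */
    public Long oldestUnflushed(String encodedRegionName) {
        return lastSeqWritten.get(encodedRegionName);
    }

    /** Clears the entry once the region's edits have been flushed. */
    public void flushed(String encodedRegionName) {
        lastSeqWritten.remove(encodedRegionName);
    }

    public static void main(String[] args) {
        LastSeqWritten log = new LastSeqWritten();
        log.append("region-a", 5L);
        log.append("region-a", 9L); // ignored: an earlier edit is already recorded
        System.out.println(log.oldestUnflushed("region-a"));
    }
}
```

With several underlying streams, one such entry per region would no longer be enough on its own, since each stream could hold a different oldest edit for the same region; that is the extra complexity the comment points at.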
[jira] [Commented] (HBASE-5547) Don't delete HFiles when in backup mode
[ https://issues.apache.org/jira/browse/HBASE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13265627#comment-13265627 ]

Zhihong Yu commented on HBASE-5547:
---
@Jesse: Do you want to attach a patch so that Hadoop QA can run the test suite?

Don't delete HFiles when in backup mode
-
Key: HBASE-5547
URL: https://issues.apache.org/jira/browse/HBASE-5547
Project: HBase
Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Jesse Yates

This came up in a discussion I had with Stack. It would be nice if HBase could be notified that a backup is in progress (via a znode for example) and in that case either:
1. rename HFiles to be deleted to file.bck
2. rename the HFiles into a special directory
3. rename them to a general trash directory (which would not need to be tied to backup mode).

That way it should be able to get a consistent backup based on HFiles (HDFS snapshots or hard links would be better options here, but we do not have those).
#1 makes cleanup a bit harder.
[jira] [Commented] (HBASE-5897) prePut coprocessor hook causing substantial CPU usage
[ https://issues.apache.org/jira/browse/HBASE-5897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264538#comment-13264538 ]

Zhihong Yu commented on HBASE-5897:
---
+1 on Todd's patch.

prePut coprocessor hook causing substantial CPU usage
-
Key: HBASE-5897
URL: https://issues.apache.org/jira/browse/HBASE-5897
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Fix For: 0.94.0, 0.96.0
Attachments: 5897-simple.txt, hbase-5897.txt

I was running an insert workload against trunk under oprofile and saw that a significant portion of CPU usage was going to calling the prePut coprocessor hook inside doMiniBatchPut, even though I don't have any coprocessors installed. I ran a million-row insert and collected CPU time spent in the RS after commenting out the preput hook, and found CPU usage reduced by 33%.
[jira] [Updated] (HBASE-5342) Grant/Revoke global permissions
[ https://issues.apache.org/jira/browse/HBASE-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-5342:
--
Comment: was deleted

(was: -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12525018/HBASE-5342-v0.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 9 new or modified tests.

-1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1684//console

This message is automatically generated.)

Grant/Revoke global permissions
---
Key: HBASE-5342
URL: https://issues.apache.org/jira/browse/HBASE-5342
Project: HBase
Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Matteo Bertozzi
Attachments: HBASE-5342-draft.patch, HBASE-5342-v0.patch

HBASE-3025 introduced simple ACLs based on coprocessors. It defines global/table/cf/cq level permissions. However, there is no way to grant/revoke global level permissions, other than the hbase.superuser conf setting.
[jira] [Commented] (HBASE-5342) Grant/Revoke global permissions
[ https://issues.apache.org/jira/browse/HBASE-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264539#comment-13264539 ]

Zhihong Yu commented on HBASE-5342:
---
@Matteo: Can you run the new patch with the security profile and let us know the result? Thanks

Grant/Revoke global permissions
---
Key: HBASE-5342
URL: https://issues.apache.org/jira/browse/HBASE-5342
Project: HBase
Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Matteo Bertozzi
Attachments: HBASE-5342-draft.patch, HBASE-5342-v0.patch

HBASE-3025 introduced simple ACLs based on coprocessors. It defines global/table/cf/cq level permissions. However, there is no way to grant/revoke global level permissions, other than the hbase.superuser conf setting.
[jira] [Commented] (HBASE-5894) Delete table failed but HBaseAdmin#deletetable report it as success
[ https://issues.apache.org/jira/browse/HBASE-5894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264290#comment-13264290 ]

Zhihong Yu commented on HBASE-5894:
---
{code}
+ throw new RegionException("Retries exhausted, it took too long to wait"+
{code}
I think IOException should be thrown above.

If a test can be added to verify the fix, that would be nice.

Delete table failed but HBaseAdmin#deletetable report it as success
---
Key: HBASE-5894
URL: https://issues.apache.org/jira/browse/HBASE-5894
Project: HBase
Issue Type: Bug
Affects Versions: 0.90.7, 0.92.2, 0.94.0
Environment: all versions
Reporter: xufeng
Assignee: xufeng
Priority: Minor
Attachments: HBASE-5894_trunk_patch_v1.patch, HBASE-5894_trunk_patch_v1_surefire-report.html

Reproduce this issue with the following steps. To reproduce it, I added this code in DeleteTableHandler#handleTableOperation():
{noformat}
 LOG.debug("Deleting region " + region.getRegionNameAsString() + " from META and FS");
+if (true) {
+  throw new IOException("ERROR");
+}
 // Remove region from META
 MetaEditor.deleteRegion(this.server.getCatalogTracker(), region);
{noformat}
step1: create a table and disable it.
step2: delete it with the HBaseAdmin#deleteTable() API.
result: after a long time, the log says the table has been deleted, but if we run list in the shell, the table still exists.
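The behavior asked for in the comment (fail loudly with an IOException once retries run out, instead of reporting success) could look roughly like the sketch below. It is not the patch itself: WaitForDelete, waitUntilDeleted, and the BooleanSupplier standing in for the "does the table still exist" check are all hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.util.function.BooleanSupplier;

// Hypothetical sketch of a bounded wait that never reports success on timeout.
public class WaitForDelete {

    /**
     * Polls {@code tableExists} up to {@code maxRetries} times and returns as soon as it
     * reports false. If the table is still there after the last check, an IOException is thrown.
     */
    public static void waitUntilDeleted(BooleanSupplier tableExists, int maxRetries, long pauseMs)
            throws IOException, InterruptedException {
        for (int tries = 0; tries < maxRetries; tries++) {
            if (!tableExists.getAsBoolean()) {
                return; // the table is really gone
            }
            if (tries < maxRetries - 1) {
                Thread.sleep(pauseMs); // pause only if another check will follow
            }
        }
        throw new IOException("Retries exhausted, it took too long to wait for the table to be deleted");
    }

    public static void main(String[] args) throws Exception {
        waitUntilDeleted(() -> false, 3, 10L); // returns immediately
        try {
            waitUntilDeleted(() -> true, 3, 10L);
        } catch (IOException e) {
            System.out.println("caller sees the failure: " + e.getMessage());
        }
    }
}
```

A unit test for the fix could then assert that the exception is raised when the existence check keeps returning true.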
[jira] [Updated] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size
[ https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-5611:
--
Attachment: 5611-94.addendum

Addendum for 0.94

Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size

Key: HBASE-5611
URL: https://issues.apache.org/jira/browse/HBASE-5611
Project: HBase
Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Jean-Daniel Cryans
Assignee: Jieshan Bean
Priority: Critical
Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
Attachments: 5611-94.addendum, HBASE-5611-92.patch, HBASE-5611-94-minorchange.patch, HBASE-5611-trunk-v2-minorchange.patch

This bug is rather easy to get if the {{TimeoutMonitor}} is on, else I think it's still possible to hit it if a region fails to open for more obscure reasons like HDFS errors.
Consider a region that just went through distributed splitting and that's now being opened by a new RS. The first thing it does is to read the recovery files and put the edits in the {{MemStores}}. If this process takes a long time, the master will move that region away. At that point the edits are still accounted for in the global {{MemStore}} size but they are dropped when the {{HRegion}} gets cleaned up.
It's completely invisible until the {{MemStoreFlusher}} needs to force flush a region and none of them have edits:
{noformat}
2012-03-21 00:33:39,303 DEBUG org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up because memory above low water=5.9g
2012-03-21 00:33:39,303 ERROR org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Cache flusher failed for entry null
java.lang.IllegalStateException
        at com.google.common.base.Preconditions.checkState(Preconditions.java:129)
        at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:199)
        at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:223)
        at java.lang.Thread.run(Thread.java:662)
{noformat}
The {{null}} here is a region. In my case I had so many edits in the {{MemStore}} during recovery that I'm over the low barrier although in fact I'm at 0. It happened yesterday and it is still printing this out.
To fix this we need to be able to decrease the global {{MemStore}} size when the region can't open.
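The fix direction named in the description (decrease the global MemStore size when the region cannot open) can be illustrated with a toy counter. This is an assumption-laden sketch, not the attached patch: GlobalMemStoreAccounting and openRegion are hypothetical, and the AtomicLong merely stands in for whatever the region server uses to track the global size.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical model of global MemStore size bookkeeping during region open.
public class GlobalMemStoreAccounting {

    private final AtomicLong globalMemStoreSize = new AtomicLong();

    /**
     * Simulates opening a region that replays {@code replayedBytes} of recovered edits.
     * Returns the global size after the attempt.
     */
    public long openRegion(long replayedBytes, boolean openSucceeds) {
        // Replayed edits are counted against the global size as they enter the MemStores.
        globalMemStoreSize.addAndGet(replayedBytes);
        if (!openSucceeds) {
            // The region is cleaned up and its edits are dropped, so the bytes it
            // contributed are given back; without this the counter stays inflated.
            globalMemStoreSize.addAndGet(-replayedBytes);
        }
        return globalMemStoreSize.get();
    }

    public static void main(String[] args) {
        GlobalMemStoreAccounting rs = new GlobalMemStoreAccounting();
        System.out.println(rs.openRegion(100L, false)); // failed open leaves nothing behind
        System.out.println(rs.openRegion(50L, true));   // successful open keeps its edits counted
    }
}
```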
[jira] [Commented] (HBASE-5894) Delete table failed but HBaseAdmin#deletetable report it as success
[ https://issues.apache.org/jira/browse/HBASE-5894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264343#comment-13264343 ]

Zhihong Yu commented on HBASE-5894:
---
You can use the following annotation to limit the duration for a particular test:
{code}
@Test(timeout = 12)
{code}

Delete table failed but HBaseAdmin#deletetable report it as success
---
Key: HBASE-5894
URL: https://issues.apache.org/jira/browse/HBASE-5894
Project: HBase
Issue Type: Bug
Affects Versions: 0.90.7, 0.92.2, 0.94.0
Environment: all versions
Reporter: xufeng
Assignee: xufeng
Priority: Minor
Attachments: HBASE-5894_trunk_patch_v1.patch, HBASE-5894_trunk_patch_v1_surefire-report.html

Reproduce this issue with the following steps. To reproduce it, I added this code in DeleteTableHandler#handleTableOperation():
{noformat}
 LOG.debug("Deleting region " + region.getRegionNameAsString() + " from META and FS");
+if (true) {
+  throw new IOException("ERROR");
+}
 // Remove region from META
 MetaEditor.deleteRegion(this.server.getCatalogTracker(), region);
{noformat}
step1: create a table and disable it.
step2: delete it with the HBaseAdmin#deleteTable() API.
result: after a long time, the log says the table has been deleted, but if we run list in the shell, the table still exists.
[jira] [Reopened] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size
[ https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu reopened HBASE-5611:
---
Changes for 0.94 reverted due to TestChangingEncoding failure.

Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size

Key: HBASE-5611
URL: https://issues.apache.org/jira/browse/HBASE-5611
Project: HBase
Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Jean-Daniel Cryans
Assignee: Jieshan Bean
Priority: Critical
Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
Attachments: 5611-94.addendum, HBASE-5611-92.patch, HBASE-5611-94-minorchange.patch, HBASE-5611-trunk-v2-minorchange.patch

This bug is rather easy to get if the {{TimeoutMonitor}} is on, else I think it's still possible to hit it if a region fails to open for more obscure reasons like HDFS errors.
Consider a region that just went through distributed splitting and that's now being opened by a new RS. The first thing it does is to read the recovery files and put the edits in the {{MemStores}}. If this process takes a long time, the master will move that region away. At that point the edits are still accounted for in the global {{MemStore}} size but they are dropped when the {{HRegion}} gets cleaned up.
It's completely invisible until the {{MemStoreFlusher}} needs to force flush a region and none of them have edits:
{noformat}
2012-03-21 00:33:39,303 DEBUG org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up because memory above low water=5.9g
2012-03-21 00:33:39,303 ERROR org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Cache flusher failed for entry null
java.lang.IllegalStateException
        at com.google.common.base.Preconditions.checkState(Preconditions.java:129)
        at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:199)
        at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:223)
        at java.lang.Thread.run(Thread.java:662)
{noformat}
The {{null}} here is a region. In my case I had so many edits in the {{MemStore}} during recovery that I'm over the low barrier although in fact I'm at 0. It happened yesterday and it is still printing this out.
To fix this we need to be able to decrease the global {{MemStore}} size when the region can't open.
[jira] [Commented] (HBASE-5342) Grant/Revoke global permissions
[ https://issues.apache.org/jira/browse/HBASE-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264415#comment-13264415 ]

Zhihong Yu commented on HBASE-5342:
---
I got the following when applying the draft patch:
{code}
2 out of 9 hunks FAILED -- saving rejects to file security/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java.rej
{code}
{code}
+ private void updateGlobalCache(ListMultimap<String,TablePermission> userPerms) {
{code}
I would expect Permission in the method signature above. Can the following method be changed to return ListMultimap<String, Permission>?
{code}
ListMultimap<String,TablePermission> perms = AccessControlLists.readPermissions(in, conf);
{code}
{code}
-Set<String> tableSet = new HashSet<String>();
+Set<byte[]> tableSet = new HashSet<byte[]>();
{code}
HashSet is backed by HashMap: see line 93 of http://www.docjar.com/html/api/java/util/HashSet.java.html
I think a proper comparator should be used above.
{code}
+ * Returns true if this permission describe a user global permission.
{code}
Should read 'describes a global user permission'
{code}
+ raise(ArgumentError, "Can't find a family: #{family}") unless htd.hasFamily(family.to_java_bytes)
{code}
Line exceeds 100 chars. Remove the 'a' before 'family' or replace it with 'the'.
{code}
+user_permission = org.apache.hadoop.hbase.security.access.UserPermission.new(user.to_java_bytes, table_name.to_java_bytes, fambytes, qualbytes, .to_java_bytes)
{code}
Above line is too long. Length should be no longer than 100 chars. Same with the assignment in the else block below.

Grant/Revoke global permissions
---
Key: HBASE-5342
URL: https://issues.apache.org/jira/browse/HBASE-5342
Project: HBase
Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Matteo Bertozzi
Attachments: HBASE-5342-draft.patch

HBASE-3025 introduced simple ACLs based on coprocessors. It defines global/table/cf/cq level permissions. However, there is no way to grant/revoke global level permissions, other than the hbase.superuser conf setting.
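The remark about byte[] elements in a HashSet can be demonstrated with a small sketch. Java arrays inherit identity-based hashCode() and equals(), so two arrays with the same contents count as different elements in a HashSet, whereas a sorted set built with a content comparator treats them as equal. The class ByteArraySets and the LEXICOGRAPHIC comparator are hypothetical illustrations; in HBase code, Bytes.BYTES_COMPARATOR would be the natural choice.

```java
import java.nio.charset.StandardCharsets;
import java.util.Comparator;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo: identity semantics of byte[] in HashSet vs. a comparator-backed TreeSet.
public class ByteArraySets {

    /** Compares byte arrays by content, treating each byte as unsigned. */
    static final Comparator<byte[]> LEXICOGRAPHIC = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    };

    /** Stores one array in a HashSet and looks up another array instance. */
    public static boolean hashSetContains(byte[] stored, byte[] probe) {
        Set<byte[]> set = new HashSet<byte[]>();
        set.add(stored);
        return set.contains(probe);
    }

    /** Same lookup, but against a TreeSet ordered by array content. */
    public static boolean treeSetContains(byte[] stored, byte[] probe) {
        Set<byte[]> set = new TreeSet<byte[]>(LEXICOGRAPHIC);
        set.add(stored);
        return set.contains(probe);
    }

    public static void main(String[] args) {
        byte[] stored = "t1".getBytes(StandardCharsets.UTF_8);
        byte[] probe = "t1".getBytes(StandardCharsets.UTF_8);
        System.out.println("HashSet finds equal contents: " + hashSetContains(stored, probe));
        System.out.println("TreeSet finds equal contents: " + treeSetContains(stored, probe));
    }
}
```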
[jira] [Commented] (HBASE-5885) Invalid HFile block magic on Local file System
[ https://issues.apache.org/jira/browse/HBASE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13264441#comment-13264441 ] Zhihong Yu commented on HBASE-5885: --- Now that HBASE-5611 was taken out of 0.94, yet we still see the following test failure: {code} https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.94/159/testReport/org.apache.hadoop.hbase.io.encoding/TestChangingEncoding/testFlippingEncodeOnDisk/ {code} I start to think that this JIRA is related to the failure above. Invalid HFile block magic on Local file System -- Key: HBASE-5885 URL: https://issues.apache.org/jira/browse/HBASE-5885 Project: HBase Issue Type: Bug Affects Versions: 0.94.0, 0.96.0 Reporter: Elliott Clark Assignee: Elliott Clark Priority: Blocker Fix For: 0.94.0, 0.96.0 Attachments: 5885-trunk-v2.txt, HBASE-5885-94-0.patch, HBASE-5885-94-1.patch, HBASE-5885-trunk-0.patch, HBASE-5885-trunk-1.patch ERROR: java.lang.RuntimeException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=7, exceptions: Thu Apr 26 11:19:18 PDT 2012, org.apache.hadoop.hbase.client.ScannerCallable@190a621a, java.io.IOException: java.io.IOException: Could not iterate StoreFileScanner[HFileScanner for reader reader=file:/tmp/hbase-eclark/hbase/TestTable/e2d1c846363c75262cbfd85ea278b342/info/bae2681d63734066957b58fe791a0268, compression=none, cacheConf=CacheConfig:enabled [cacheDataOnRead=true] [cacheDataOnWrite=false] [cacheIndexesOnWrite=false] [cacheBloomsOnWrite=false] [cacheEvictOnClose=false] [cacheCompressed=false], firstKey=01/info:data/1335463981520/Put, lastKey=0002588100/info:data/1335463902296/Put, avgKeyLen=30, avgValueLen=1000, entries=1215085, length=1264354417, cur=000248/info:data/1335463994457/Put/vlen=1000/ts=0] at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:135) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:95) at 
org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:368) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:127) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3323) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3279) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3296) at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2393) at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1376) Caused by: java.io.IOException: Invalid HFile block magic: \xEC\xD5\x9D\xB4\xC2bfo at org.apache.hadoop.hbase.io.hfile.BlockType.parse(BlockType.java:153) at org.apache.hadoop.hbase.io.hfile.BlockType.read(BlockType.java:164) at org.apache.hadoop.hbase.io.hfile.HFileBlock.init(HFileBlock.java:254) at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1779) at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1637) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:327) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.readNextDataBlock(HFileReaderV2.java:555) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:651) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:130) ... 
12 more Thu Apr 26 11:19:19 PDT 2012, org.apache.hadoop.hbase.client.ScannerCallable@190a621a, java.io.IOException: java.io.IOException: java.lang.IllegalArgumentException at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1132) at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1121) at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2420) at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at
[jira] [Commented] (HBASE-5885) Invalid HFile block magic on Local file System
[ https://issues.apache.org/jira/browse/HBASE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13264452#comment-13264452 ] Zhihong Yu commented on HBASE-5885: --- Here is another one: https://builds.apache.org/job/HBase-0.94/157/testReport/org.apache.hadoop.hbase.io.encoding/TestChangingEncoding/testFlippingEncodeOnDisk/ Invalid HFile block magic on Local file System -- Key: HBASE-5885 URL: https://issues.apache.org/jira/browse/HBASE-5885 Project: HBase Issue Type: Bug Affects Versions: 0.94.0, 0.96.0 Reporter: Elliott Clark Assignee: Elliott Clark Priority: Blocker Fix For: 0.94.0, 0.96.0 Attachments: 5885-trunk-v2.txt, HBASE-5885-94-0.patch, HBASE-5885-94-1.patch, HBASE-5885-trunk-0.patch, HBASE-5885-trunk-1.patch ERROR: java.lang.RuntimeException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=7, exceptions: Thu Apr 26 11:19:18 PDT 2012, org.apache.hadoop.hbase.client.ScannerCallable@190a621a, java.io.IOException: java.io.IOException: Could not iterate StoreFileScanner[HFileScanner for reader reader=file:/tmp/hbase-eclark/hbase/TestTable/e2d1c846363c75262cbfd85ea278b342/info/bae2681d63734066957b58fe791a0268, compression=none, cacheConf=CacheConfig:enabled [cacheDataOnRead=true] [cacheDataOnWrite=false] [cacheIndexesOnWrite=false] [cacheBloomsOnWrite=false] [cacheEvictOnClose=false] [cacheCompressed=false], firstKey=01/info:data/1335463981520/Put, lastKey=0002588100/info:data/1335463902296/Put, avgKeyLen=30, avgValueLen=1000, entries=1215085, length=1264354417, cur=000248/info:data/1335463994457/Put/vlen=1000/ts=0] at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:135) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:95) at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:368) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:127) at 
org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3323) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3279) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3296) at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2393) at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1376) Caused by: java.io.IOException: Invalid HFile block magic: \xEC\xD5\x9D\xB4\xC2bfo at org.apache.hadoop.hbase.io.hfile.BlockType.parse(BlockType.java:153) at org.apache.hadoop.hbase.io.hfile.BlockType.read(BlockType.java:164) at org.apache.hadoop.hbase.io.hfile.HFileBlock.init(HFileBlock.java:254) at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1779) at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1637) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:327) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.readNextDataBlock(HFileReaderV2.java:555) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:651) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:130) ... 
12 more Thu Apr 26 11:19:19 PDT 2012, org.apache.hadoop.hbase.client.ScannerCallable@190a621a, java.io.IOException: java.io.IOException: java.lang.IllegalArgumentException at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1132) at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1121) at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2420) at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1376) Caused by: java.lang.IllegalArgumentException at
[jira] [Commented] (HBASE-5898) Consider double-checked locking for block cache lock
[ https://issues.apache.org/jira/browse/HBASE-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264456#comment-13264456 ]

Zhihong Yu commented on HBASE-5898:
---
Interesting idea.

Minor comments:
The indentation for while (true) loop is off.
Changes to conf/hbase-site.xml belong to another JIRA.

Consider double-checked locking for block cache lock

Key: HBASE-5898
URL: https://issues.apache.org/jira/browse/HBASE-5898
Project: HBase
Issue Type: Improvement
Components: performance
Affects Versions: 0.94.1
Reporter: Todd Lipcon
Attachments: hbase-5898.txt

Running a workload with a high query rate against a dataset that fits in cache, I saw a lot of CPU being used in IdLock.getLockEntry, being called by HFileReaderV2.readBlock. Even though it was all cache hits, it was wasting a lot of CPU doing lock management here. I wrote a quick patch to switch to a double-checked locking and it improved throughput substantially for this workload.
[jira] [Commented] (HBASE-5898) Consider double-checked locking for block cache lock
[ https://issues.apache.org/jira/browse/HBASE-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13264470#comment-13264470 ]

Zhihong Yu commented on HBASE-5898:
---

Consider the case where off heap cache is enabled. From DoubleBlockCache:

Suppose getBlock() is executed without the lock (the first pass in the new loop of readBlock) and doesn't find cacheKey in onHeapCache but finds it in offHeapCache. It will then call onHeapCache.cacheBlock():
{code}
  public Cacheable getBlock(BlockCacheKey cacheKey, boolean caching) {
    Cacheable cachedBlock;
    if ((cachedBlock = onHeapCache.getBlock(cacheKey, caching)) != null) {
      stats.hit(caching);
      return cachedBlock;
    } else if ((cachedBlock = offHeapCache.getBlock(cacheKey, caching)) != null) {
      if (caching) {
        onHeapCache.cacheBlock(cacheKey, cachedBlock);
      }
{code}
Another thread calls cacheBlock() around the same time and executes onHeapCache.cacheBlock() for the same cacheKey:
{code}
  public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf) {
    onHeapCache.cacheBlock(cacheKey, buf);
    offHeapCache.cacheBlock(cacheKey, buf);
  }
{code}
I think there is a race condition which didn't exist before the proposed change: the entries for the same cacheKey in onHeapCache and offHeapCache would diverge.

If off heap cache is disabled, I don't see a problem with the proposed optimization.
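For readers following the HBASE-5898 thread, the double-checked pattern under discussion can be sketched roughly as below. This is only an illustration under stated assumptions: the class DoubleCheckedBlockReader, its String keys and its loader function are hypothetical stand-ins, not HFileReaderV2, IdLock or the attached patch. The point it shows is that a cache hit returns before any lock is touched, and that the cache is checked a second time once the per-key lock is held.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical block reader used only to illustrate double-checked locking. */
final class DoubleCheckedBlockReader {
  private final ConcurrentMap<String, byte[]> cache = new ConcurrentHashMap<>();
  // Per-key lock objects. For brevity they are never evicted in this sketch.
  private final ConcurrentMap<String, Object> locks = new ConcurrentHashMap<>();
  private final AtomicInteger loads = new AtomicInteger();
  private final Function<String, byte[]> loader;

  DoubleCheckedBlockReader(Function<String, byte[]> loader) {
    this.loader = loader;
  }

  byte[] readBlock(String cacheKey) {
    // First check, without any lock: cache hits pay no lock-management cost.
    byte[] block = cache.get(cacheKey);
    if (block != null) {
      return block;
    }
    // Miss: take the per-key lock so that only one thread loads this block.
    Object lock = locks.computeIfAbsent(cacheKey, k -> new Object());
    synchronized (lock) {
      // Second check: another thread may have cached the block while we waited.
      block = cache.get(cacheKey);
      if (block == null) {
        block = loader.apply(cacheKey);
        loads.incrementAndGet();
        cache.put(cacheKey, block);
      }
      return block;
    }
  }

  /** Number of times the loader actually ran. */
  int loadCount() {
    return loads.get();
  }
}
```

The sketch relies on a single concurrent map being safe to read without the lock. It does not model the two-level DoubleBlockCache case raised in the comment above, where an unlocked read can itself trigger a write to the on-heap cache.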
[jira] [Commented] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size
[ https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262648#comment-13262648 ]

Zhihong Yu commented on HBASE-5611:
---

@Jieshan:
When you have multiple patches for different branches, attach the patch for trunk separately from the other patches. Otherwise Hadoop QA may pick up the wrong patch.

Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size

Key: HBASE-5611
URL: https://issues.apache.org/jira/browse/HBASE-5611
Project: HBase
Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Jean-Daniel Cryans
Assignee: Jieshan Bean
Priority: Critical
Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1
Attachments: HBASE-5611-94.patch, HBASE-5611-trunk-v2.patch, HBASE-5611-trunk.patch

This bug is rather easy to get if the {{TimeoutMonitor}} is on, else I think it's still possible to hit it if a region fails to open for more obscure reasons like HDFS errors.

Consider a region that just went through distributed splitting and that's now being opened by a new RS. The first thing it does is to read the recovery files and put the edits in the {{MemStores}}. If this process takes a long time, the master will move that region away. At that point the edits are still accounted for in the global {{MemStore}} size but they are dropped when the {{HRegion}} gets cleaned up. It's completely invisible until the {{MemStoreFlusher}} needs to force flush a region and none of them have edits:
{noformat}
2012-03-21 00:33:39,303 DEBUG org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up because memory above low water=5.9g
2012-03-21 00:33:39,303 ERROR org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Cache flusher failed for entry null
java.lang.IllegalStateException
        at com.google.common.base.Preconditions.checkState(Preconditions.java:129)
        at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:199)
        at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:223)
        at java.lang.Thread.run(Thread.java:662)
{noformat}
The {{null}} here is a region. In my case I had so many edits in the {{MemStore}} during recovery that I'm over the low barrier although in fact I'm at 0. It happened yesterday and it is still printing this out.

To fix this we need to be able to decrease the global {{MemStore}} size when the region can't open.
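The fix direction stated in HBASE-5611 (decrease the global MemStore size when the region can't open) can be illustrated with a small sketch. Everything here is assumed for illustration: GlobalMemStoreAccounting, replayEdits and the failAfter parameter are hypothetical names, not the code in the attached patches. The sketch tracks how many bytes one region added during replay and subtracts them again if the open is abandoned.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical accounting helper; not the HBase region server's actual classes. */
final class GlobalMemStoreAccounting {
  private final AtomicLong globalSize = new AtomicLong();

  long globalSize() {
    return globalSize.get();
  }

  /**
   * Replays edits for one region. If the open is abandoned partway through
   * (simulated by failAfter, use -1 for "never"), the bytes this region added
   * are subtracted from the global counter before the failure propagates.
   */
  void replayEdits(long[] editSizes, int failAfter) {
    long addedByThisRegion = 0;
    try {
      for (int i = 0; i < editSizes.length; i++) {
        if (i == failAfter) {
          throw new IllegalStateException("region open abandoned");
        }
        globalSize.addAndGet(editSizes[i]);
        addedByThisRegion += editSizes[i];
      }
    } catch (RuntimeException e) {
      // The edits are dropped together with the region, so their size must go too.
      globalSize.addAndGet(-addedByThisRegion);
      throw e;
    }
  }
}
```

Without the rollback in the catch block, the counter would stay inflated after a failed open, which is the state the log excerpt above shows: the flusher believes memory is above the low-water mark but finds no region to flush.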
[jira] [Updated] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size
[ https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-5611:
--
Hadoop Flags: Reviewed
Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size
[ https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262652#comment-13262652 ]

Zhihong Yu commented on HBASE-5611:
---

Patch v2 looks good in general.
Comment on formatting:
{code}
+   * @param regionName
+   *          region name.
{code}
The line length is 100 chars. Please put javadoc for param on the same line as param name.

You can wait for Hadoop QA result to come back before attaching new patches.
[jira] [Issue Comment Edited] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size
[ https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262652#comment-13262652 ]

Zhihong Yu edited comment on HBASE-5611 at 4/26/12 2:58 PM:

Patch v2 looks good in general.
Comment on formatting:
{code}
+   * @param regionName
+   *          region name.
{code}
The line length limit is 100 chars. Please put javadoc for param on the same line as param name.

You can wait for Hadoop QA result to come back before attaching new patches.

was (Author: zhi...@ebaysf.com):
Patch v2 looks good in general.
Comment on formatting:
{code}
+   * @param regionName
+   *          region name.
{code}
The line length is 100 chars. Please put javadoc for param on the same line as param name.

You can wait for Hadoop QA result to come back before attaching new patches.
[jira] [Commented] (HBASE-5875) Process RIT and Master restart may remove an online server considering it as a dead server
[ https://issues.apache.org/jira/browse/HBASE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262698#comment-13262698 ]

Zhihong Yu commented on HBASE-5875:
---

bq. Or can we update the root region node in the RS side after updating the online server list?

Let's try this approach first.
The other approach would involve retry count, sleep interval, etc.

Process RIT and Master restart may remove an online server considering it as a dead server

Key: HBASE-5875
URL: https://issues.apache.org/jira/browse/HBASE-5875
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.1
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Fix For: 0.94.1

If, on restart, the master finds ROOT/META in RIT state, it tries to assign the ROOT region through ProcessRIT. The master triggers the assignment and then tries to verify the root region location. The verification checks whether the RS has the region in its online list. If the assignment triggered by the master has not yet completed on the RS, the root region location verification fails. Because it failed, in
{code}
splitLogAndExpireIfOnline(currentRootServer);
{code}
we split the log and also remove the server from the online server list. Ideally there is nothing to do in splitlog here, as no region server was restarted. So, although the server is online, the master invalidates the region server.
In a special case, if I have only one RS, my cluster becomes non-operative.
[jira] [Commented] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size
[ https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262702#comment-13262702 ]

Zhihong Yu commented on HBASE-5611:
---

Tests were clear.

@Jieshan:
Please address formatting and prepare patches for each branch.
We should also run test suite for 0.90 and 0.92 once patches are available.

Good job.
[jira] [Commented] (HBASE-5877) When a query fails because the region has moved, let the regionserver return the new address to the client
[ https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262710#comment-13262710 ]

Zhihong Yu commented on HBASE-5877:
---

@N:
I think the following validation in a real cluster would illustrate the benefit of this feature:
For a given table, select a region server and note the row key ranges hosted by the region server.
Direct client load to this server.
Kill the server at time T.

The difference in client response to region migration around time T with and without the patch would be interesting.

When a query fails because the region has moved, let the regionserver return the new address to the client

Key: HBASE-5877
URL: https://issues.apache.org/jira/browse/HBASE-5877
Project: HBase
Issue Type: Improvement
Components: client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
Fix For: 0.96.0
Attachments: 5877.v1.patch

This is mainly useful when we do a rolling restart. This will decrease the load on the master and the network load.
Note that a region is not immediately opened after a close. So:
- it seems preferable to wait before retrying on the other server. An optimisation would be to have a heuristic depending on when the region was closed.
- during a rolling restart, the server moves the regions then stops. So we may have failures when the server is stopped, and this patch won't help.

The implementation in the first patch does:
- on the region move, there is an added parameter on the regionserver#close to say where we are sending the region
- the regionserver keeps a list of what was moved. Each entry is kept 100 seconds.
- the regionserver sends a specific exception when it receives a query on a moved region. This exception contains the new address.
- the client analyses the exceptions and updates its cache accordingly...
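The mechanism listed in the HBASE-5877 description (record the destination on close, keep the entry for 100 seconds, answer queries with an exception carrying the new address, let the client update its cache) could look roughly like the sketch below. It is a sketch under assumptions only: MovedRegionsSketch, RegionMovedException and the method names are invented for illustration and are not taken from 5877.v1.patch. Time is passed in explicitly so that the retention rule is easy to follow.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical illustration of the moved-region list; not the patch's code. */
final class MovedRegionsSketch {
  /** Hypothetical exception that carries the region's new location. */
  static final class RegionMovedException extends RuntimeException {
    final String newServer;

    RegionMovedException(String region, String newServer) {
      super("Region " + region + " moved to " + newServer);
      this.newServer = newServer;
    }
  }

  /** Retention taken from the description: each entry is kept 100 seconds. */
  static final long RETENTION_MS = 100_000L;

  private static final class Moved {
    final String destination;
    final long closedAtMs;

    Moved(String destination, long closedAtMs) {
      this.destination = destination;
      this.closedAtMs = closedAtMs;
    }
  }

  private final Map<String, Moved> moved = new ConcurrentHashMap<>();

  /** Server side: called on close when the destination server is known. */
  void recordMove(String region, String destination, long nowMs) {
    moved.put(region, new Moved(destination, nowMs));
  }

  /** Server side: rejects a query for a region that moved recently. */
  void checkNotMoved(String region, long nowMs) {
    Moved m = moved.get(region);
    if (m == null) {
      return;
    }
    if (nowMs - m.closedAtMs > RETENTION_MS) {
      moved.remove(region); // entry expired, forget it
      return;
    }
    throw new RegionMovedException(region, m.destination);
  }

  /** Client side: point the location cache at the server named in the exception. */
  static void updateCache(RegionMovedException e, String region, Map<String, String> cache) {
    cache.put(region, e.newServer);
  }
}
```

As the description notes, the region is not open on the destination immediately after the close, so a real client would presumably still wait before retrying against the new address.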
[jira] [Updated] (HBASE-5877) When a query fails because the region has moved, let the regionserver return the new address to the client
[ https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-5877:
--
Comment: was deleted

(was: @N:
I think the following validation in real cluster would illustrate the benefit of this feature:
For given table, select a region server and note the row key ranges hosted by the region server.
Direct client load to this server.
Kill the server at time T.

Difference in client response to region migration around time T with and without the patch would be interesting.)
[jira] [Commented] (HBASE-5877) When a query fails because the region has moved, let the regionserver return the new address to the client
[ https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262714#comment-13262714 ]

Zhihong Yu commented on HBASE-5877:
---

@N:
I think the following validation in a real cluster would illustrate the benefit of this feature:
For a given table, select a region server and note the row key ranges hosted by one region on the region server.
Direct client load to this region.
Issue the following command in the shell at time T:
{code}
hbase> move 'ENCODED_REGIONNAME', 'SERVER_NAME'
{code}

The difference in client response to region migration around time T with and without the patch would be interesting.
[jira] [Commented] (HBASE-5877) When a query fails because the region has moved, let the regionserver return the new address to the client
[ https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262731#comment-13262731 ]

Zhihong Yu commented on HBASE-5877:
---

@N:
If the testing result is favorable, I think Lars may want it in 0.94 as well.
I think making this feature functional in 0.94 cluster would be a good start.

bq. We could have this by adding the info in zk

A separate discussion should be started w.r.t. the above. This would shift load imposed by clients from master to zk quorum.
[jira] [Commented] (HBASE-5620) Convert the client protocol of HRegionInterface to PB
[ https://issues.apache.org/jira/browse/HBASE-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262742#comment-13262742 ]

Zhihong Yu commented on HBASE-5620:
---

Is there a plan to adopt measures to counter this 8% performance dip?

Convert the client protocol of HRegionInterface to PB

Key: HBASE-5620
URL: https://issues.apache.org/jira/browse/HBASE-5620
Project: HBase
Issue Type: Sub-task
Components: ipc, master, migration, regionserver
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Fix For: 0.96.0
Attachments: hbase-5620-sec.patch, hbase-5620_v3.patch, hbase-5620_v4.patch, hbase-5620_v4.patch
[jira] [Commented] (HBASE-5862) After Region Close remove the Operation Metrics.
[ https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262748#comment-13262748 ]

Zhihong Yu commented on HBASE-5862:
---

@Stack:
Do you have suggestions on further improvement for the latest patch?

After Region Close remove the Operation Metrics.

Key: HBASE-5862
URL: https://issues.apache.org/jira/browse/HBASE-5862
Project: HBase
Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Minor
Fix For: 0.96.0
Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch, HBASE-5862-2.patch, HBASE-5862-3.patch, HBASE-5862-94-3.patch

If a region is closed then Hadoop metrics shouldn't still be reporting about that region.
[jira] [Updated] (HBASE-5862) After Region Close remove the Operation Metrics.
[ https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-5862:
--
Comment: was deleted

(was: -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12524468/TSD.png
against trunk revision .

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

-1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1657//console

This message is automatically generated.)

After Region Close remove the Operation Metrics.

Key: HBASE-5862
URL: https://issues.apache.org/jira/browse/HBASE-5862
Project: HBase
Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Critical
Fix For: 0.94.0, 0.96.0
Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch, HBASE-5862-2.patch, HBASE-5862-3.patch, HBASE-5862-94-3.patch, TSD.png

If a region is closed then Hadoop metrics shouldn't still be reporting about that region.
[jira] [Commented] (HBASE-5829) Inconsistency between the regions map and the servers map in AssignmentManager
[ https://issues.apache.org/jira/browse/HBASE-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262798#comment-13262798 ]

Zhihong Yu commented on HBASE-5829:
---

The latest patch is good to go.

The useless statement can be addressed elsewhere.

Inconsistency between the regions map and the servers map in AssignmentManager

Key: HBASE-5829
URL: https://issues.apache.org/jira/browse/HBASE-5829
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.90.6, 0.92.1
Reporter: Maryann Xue
Attachments: HBASE-5829-0.90.patch, HBASE-5829-trunk.patch

There are occurrences in AM where this.servers is not kept consistent with this.regions. This might cause the balancer to offline a region from the RS that already returned NotServingRegionException at a previous offline attempt.

In AssignmentManager.unassign(HRegionInfo, boolean):

    try {
      // TODO: We should consider making this look more like it does for the
      // region open where we catch all throwables and never abort
      if (serverManager.sendRegionClose(server, state.getRegion(),
        versionOfClosingNode)) {
        LOG.debug("Sent CLOSE to " + server + " for region " +
          region.getRegionNameAsString());
        return;
      }
      // This never happens. Currently regionserver close always return true.
      LOG.warn("Server " + server + " region CLOSE RPC returned false for " +
        region.getRegionNameAsString());
    } catch (NotServingRegionException nsre) {
      LOG.info("Server " + server + " returned " + nsre + " for " +
        region.getRegionNameAsString());
      // Presume that master has stale data. Presume remote side just split.
      // Presume that the split message when it comes in will fix up the master's
      // in memory cluster state.
    } catch (Throwable t) {
      if (t instanceof RemoteException) {
        t = ((RemoteException)t).unwrapRemoteException();
        if (t instanceof NotServingRegionException) {
          if (checkIfRegionBelongsToDisabling(region)) {
            // Remove from the regionsinTransition map
            LOG.info("While trying to recover the table "
                + region.getTableNameAsString()
                + " to DISABLED state the region " + region
                + " was offlined but the table was in DISABLING state");
            synchronized (this.regionsInTransition) {
              this.regionsInTransition.remove(region.getEncodedName());
            }
            // Remove from the regionsMap
            synchronized (this.regions) {
              this.regions.remove(region);
            }
            deleteClosingOrClosedNode(region);
          }
        }
        // RS is already processing this region, only need to update the timestamp
        if (t instanceof RegionAlreadyInTransitionException) {
          LOG.debug("update " + state + " the timestamp.");
          state.update(state.getState());
        }
      }

In AssignmentManager.assign(HRegionInfo, RegionState, boolean, boolean, boolean):

    synchronized (this.regions) {
      this.regions.put(plan.getRegionInfo(), plan.getDestination());
    }
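The inconsistency reported in HBASE-5829 comes from updating this.regions without a matching update to this.servers. One generic way to rule that out is to route every change through methods that touch both maps under one lock, as in the sketch below. This is an assumption-laden illustration: RegionAssignments and its String keys are hypothetical, and it is not what the attached patches do.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical pair of maps that are only ever updated together. */
final class RegionAssignments {
  private final Map<String, String> regions = new HashMap<>();      // region -> server
  private final Map<String, Set<String>> servers = new HashMap<>(); // server -> regions

  /** Assigns a region, first dropping any stale entry in both maps. */
  synchronized void assign(String region, String server) {
    unassign(region);
    regions.put(region, server);
    servers.computeIfAbsent(server, s -> new HashSet<>()).add(region);
  }

  /** Removes a region from both maps in one step. */
  synchronized void unassign(String region) {
    String server = regions.remove(region);
    if (server != null) {
      Set<String> hosted = servers.get(server);
      if (hosted != null) {
        hosted.remove(region);
      }
    }
  }

  synchronized String serverOf(String region) {
    return regions.get(region);
  }

  /** Returns a copy so callers cannot modify the internal set. */
  synchronized Set<String> regionsOn(String server) {
    Set<String> hosted = servers.get(server);
    return hosted == null ? new HashSet<String>() : new HashSet<String>(hosted);
  }
}
```

With this shape, a code path such as the NotServingRegionException handling quoted above has no way to remove a region from one map while leaving it in the other.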
[jira] [Commented] (HBASE-5862) After Region Close remove the Operation Metrics.
[ https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262808#comment-13262808 ]

Zhihong Yu commented on HBASE-5862:
---
{code}
+//per hfile. Figuring out which cfs, hfiles, ...
{code}
Should cfs be in expanded form (column families)?
{code}
+//and on the next tick of the metrics everything that is still relevant will be
+//re-added.
{code}
're-added' -> 'added' or 'added again'

The initialization work in clear() should be moved to the RegionServerDynamicMetrics ctor because it is a one-time operation.

After Region Close remove the Operation Metrics.
Key: HBASE-5862
URL: https://issues.apache.org/jira/browse/HBASE-5862
Project: HBase
Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Critical
Fix For: 0.94.0, 0.96.0
Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch, HBASE-5862-2.patch, HBASE-5862-3.patch, HBASE-5862-4.patch, HBASE-5862-94-3.patch, TSD.png

If a region is closed then Hadoop metrics shouldn't still be reporting about that region.
[jira] [Commented] (HBASE-5887) Make TestAcidGuarantees usable for system testing.
[ https://issues.apache.org/jira/browse/HBASE-5887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13263261#comment-13263261 ]

Zhihong Yu commented on HBASE-5887:
---
{code}
+int millis = c.getInt("millis", 5000);
+int numWriters = c.getInt("numWriters", 50);
+int numGetters = c.getInt("numGetters", 2);
+int numScanners = c.getInt("numScanners", 2);
+int numUniqueRows = c.getInt("numUniqueRows", 3);
{code}
Can the user specify these config parameters from the command line?

Make TestAcidGuarantees usable for system testing.
--
Key: HBASE-5887
URL: https://issues.apache.org/jira/browse/HBASE-5887
Project: HBase
Issue Type: Improvement
Affects Versions: 0.92.0, 0.92.1, 0.94.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Attachments: hbase-5887.patch

Currently, TestAcidGuarantees run via main() will always abort with an NPE because it digs into a nonexistent HBaseTestingUtility for a flusher thread. We should tool this up so that it works properly from the command line. This would be a very useful long-running test when used in conjunction with fault injection to verify row ACID properties.
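As an illustration of the question in the HBASE-5887 comment above, here is a minimal sketch of how parameters such as millis and numWriters could be overridden with -Dname=value arguments while keeping the defaults from the patch. CliConfig is a hypothetical helper written in plain Java; it is not Hadoop's Configuration class. Hadoop's GenericOptionsParser offers similar -D handling, but whether the patch relies on it is not stated in this thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: collects -Dname=value arguments and exposes getInt
// with a default, mirroring the c.getInt(name, default) calls in the patch.
final class CliConfig {
  private final Map<String, String> values = new HashMap<>();

  CliConfig(String[] args) {
    for (String arg : args) {
      int eq = arg.indexOf('=');
      // Accept only arguments of the form -D<name>=<value> with a non-empty name.
      if (arg.startsWith("-D") && eq > 2) {
        values.put(arg.substring(2, eq), arg.substring(eq + 1));
      }
    }
  }

  // Returns the parsed value, or the default when the key is absent or not a number.
  int getInt(String name, int defaultValue) {
    String raw = values.get(name);
    if (raw == null) {
      return defaultValue;
    }
    try {
      return Integer.parseInt(raw.trim());
    } catch (NumberFormatException e) {
      return defaultValue;
    }
  }

  public static void main(String[] args) {
    CliConfig c = new CliConfig(args);
    System.out.println("millis=" + c.getInt("millis", 5000));
    System.out.println("numWriters=" + c.getInt("numWriters", 50));
  }
}
```

Running the sketch with arguments such as -Dmillis=60000 overrides that one value and leaves the other parameters at their defaults.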
[jira] [Updated] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size
[ https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5611: -- Comment: was deleted (was: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12524808/HBASE-5611-94.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1663//console This message is automatically generated.) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size Key: HBASE-5611 URL: https://issues.apache.org/jira/browse/HBASE-5611 Project: HBase Issue Type: Bug Affects Versions: 0.90.6 Reporter: Jean-Daniel Cryans Assignee: Jieshan Bean Priority: Critical Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1 Attachments: HBASE-5611-94.patch, HBASE-5611-trunk-v2.patch, HBASE-5611-trunk.patch This bug is rather easy to get if the {{TimeoutMonitor}} is on, else I think it's still possible to hit it if a region fails to open for more obscure reasons like HDFS errors. Consider a region that just went through distributed splitting and that's now being opened by a new RS. The first thing it does is to read the recovery files and put the edits in the {{MemStores}}. If this process takes a long time, the master will move that region away. At that point the edits are still accounted for in the global {{MemStore}} size but they are dropped when the {{HRegion}} gets cleaned up. 
It's completely invisible until the {{MemStoreFlusher}} needs to force flush a region and none of them have edits:
{noformat}
2012-03-21 00:33:39,303 DEBUG org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up because memory above low water=5.9g
2012-03-21 00:33:39,303 ERROR org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Cache flusher failed for entry null
java.lang.IllegalStateException
        at com.google.common.base.Preconditions.checkState(Preconditions.java:129)
        at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:199)
        at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:223)
        at java.lang.Thread.run(Thread.java:662)
{noformat}
The {{null}} here is a region. In my case I had so many edits in the {{MemStore}} during recovery that I'm over the low barrier although in fact I'm at 0. It happened yesterday and it is still printing this out.

To fix this we need to be able to decrease the global {{MemStore}} size when the region can't open.
[jira] [Commented] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size
[ https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13263327#comment-13263327 ]

Zhihong Yu commented on HBASE-5611:
---
{code}
+ * Roll back the global MemStore size when a region can't open.
{code}
The above is not accurate: we're only rolling back the replay edits size for the specified region from the global MemStore size.

Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size
Key: HBASE-5611
URL: https://issues.apache.org/jira/browse/HBASE-5611
Project: HBase
Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Jean-Daniel Cryans
Assignee: Jieshan Bean
Priority: Critical
Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1
Attachments: HBASE-5611-94.patch, HBASE-5611-trunk-v2.patch, HBASE-5611-trunk.patch
[jira] [Commented] (HBASE-5874) The HBase do not configure the 'fs.default.name' attribute, the hbck tool and Merge tool throw IllegalArgumentException.
[ https://issues.apache.org/jira/browse/HBASE-5874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13263335#comment-13263335 ] Zhihong Yu commented on HBASE-5874: --- +1 on patch. The HBase do not configure the 'fs.default.name' attribute, the hbck tool and Merge tool throw IllegalArgumentException. Key: HBASE-5874 URL: https://issues.apache.org/jira/browse/HBASE-5874 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 0.90.6 Reporter: fulin wang Attachments: HBASE-5874-0.90.patch, HBASE-5874-trunk.patch The HBase do not configure the 'fs.default.name' attribute, the hbck tool and Merge tool throw IllegalArgumentException. the hbck tool and Merge tool, we should add 'fs.default.name' attriubte to the code. hbck exception: Exception in thread main java.lang.IllegalArgumentException: Wrong FS: hdfs://160.176.0.101:9000/hbase/.META./1028785192/.regioninfo, expected: file:/// at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:412) at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:59) at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:382) at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:285) at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.init(ChecksumFileSystem.java:128) at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:301) at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:489) at org.apache.hadoop.hbase.util.HBaseFsck.loadHdfsRegioninfo(HBaseFsck.java:565) at org.apache.hadoop.hbase.util.HBaseFsck.loadHdfsRegionInfos(HBaseFsck.java:596) at org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:332) at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:360) at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:2907) Merge exception: [2012-05-05 10:48:24,830] [ERROR] [main] [org.apache.hadoop.hbase.util.Merge 381] exiting due to 
error java.lang.IllegalArgumentException: Wrong FS: hdfs://160.176.0.101:9000/hbase/.META./1028785192/.regioninfo, expected: file:/// at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:412) at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:59) at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:382) at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:285) at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:823) at org.apache.hadoop.hbase.regionserver.HRegion.checkRegioninfoOnFilesystem(HRegion.java:415) at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:340) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2679) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2665) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2634) at org.apache.hadoop.hbase.util.MetaUtils.openMetaRegion(MetaUtils.java:276) at org.apache.hadoop.hbase.util.MetaUtils.scanMetaRegion(MetaUtils.java:261) at org.apache.hadoop.hbase.util.Merge.run(Merge.java:115) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.hbase.util.Merge.main(Merge.java:379) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5848) Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort
[ https://issues.apache.org/jira/browse/HBASE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261648#comment-13261648 ]

Zhihong Yu commented on HBASE-5848:
---
Looks like the addendum wasn't applied to trunk.

Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort
Key: HBASE-5848
URL: https://issues.apache.org/jira/browse/HBASE-5848
Project: HBase
Issue Type: Bug
Components: client
Reporter: Lars Hofhansl
Assignee: ramkrishna.s.vasudevan
Priority: Minor
Fix For: 0.94.0, 0.96.0
Attachments: 5848-addendum-v2.txt, 5848-addendum-v3.txt, 5848-addendum-v4.txt, 5848-addendum-v5.txt, 5848-addendum-v6.txt, 5848-addendum-v7.txt, 5848-addendum-v7.txt, HBASE-5848.patch, HBASE-5848.patch, HBASE-5848_0.94.patch, HBASE-5848_addendum.patch

A coworker of mine just had this scenario. It does not make sense to pass EMPTY_START_ROW as a splitKey (since the region with the empty start key is implicit), but it should not cause the HMaster to abort. The abort happens because it tries to bulk assign the same region twice and then runs into race conditions with ZK. The same would (presumably) happen when two identical split keys are passed, but the client blocks that. The simplest solution here is to have the client also block null or EMPTY_START_ROW passed as a split key.
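The client-side check proposed in the HBASE-5848 description above could look roughly like the following. This is a sketch under stated assumptions, not the attached patch: SplitKeyValidator is a hypothetical class, and EMPTY_START_ROW is assumed to be a zero-length byte array, so an empty key is rejected together with null and duplicate keys.

```java
import java.util.Arrays;

// Hypothetical client-side check: reject null, empty, and duplicate split keys
// before a create-table request is sent to the master.
final class SplitKeyValidator {
  static void validate(byte[][] splitKeys) {
    if (splitKeys == null) {
      return; // no pre-splitting requested
    }
    for (byte[] key : splitKeys) {
      // A zero-length key is assumed to be equivalent to EMPTY_START_ROW.
      if (key == null || key.length == 0) {
        throw new IllegalArgumentException("Split key must not be null or empty");
      }
    }
    // Sort a copy so that duplicates end up next to each other.
    byte[][] sorted = splitKeys.clone();
    Arrays.sort(sorted, SplitKeyValidator::compare);
    for (int i = 1; i < sorted.length; i++) {
      if (Arrays.equals(sorted[i - 1], sorted[i])) {
        throw new IllegalArgumentException("Duplicate split key at sorted index " + i);
      }
    }
  }

  // Unsigned lexicographic comparison of two byte arrays.
  private static int compare(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int diff = (a[i] & 0xff) - (b[i] & 0xff);
      if (diff != 0) {
        return diff;
      }
    }
    return a.length - b.length;
  }
}
```

Failing fast on the client keeps the invalid request from ever reaching the master, so the double bulk assignment described in the issue cannot be triggered this way.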
[jira] [Commented] (HBASE-5829) Inconsistency between the regions map and the servers map in AssignmentManager
[ https://issues.apache.org/jira/browse/HBASE-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261655#comment-13261655 ]

Zhihong Yu commented on HBASE-5829:
---
Patch makes sense. With respect to this.servers, I found a useless statement (at least in trunk) that should be removed:
{code}
void unassignCatalogRegions() {
  this.servers.entrySet();
{code}

Inconsistency between the regions map and the servers map in AssignmentManager
--
Key: HBASE-5829
URL: https://issues.apache.org/jira/browse/HBASE-5829
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.90.6, 0.92.1
Reporter: Maryann Xue
Attachments: HBASE-5829-0.90.patch, HBASE-5829-trunk.patch
[jira] [Commented] (HBASE-5732) Remove the SecureRPCEngine and merge the security-related logic in the core engine
[ https://issues.apache.org/jira/browse/HBASE-5732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261738#comment-13261738 ] Zhihong Yu commented on HBASE-5732: --- Since AccessController and TokenProvider coprocessors remain after this merge, my point was that we need to keep security profile for running the unit tests related to these coprocessors. Remove the SecureRPCEngine and merge the security-related logic in the core engine -- Key: HBASE-5732 URL: https://issues.apache.org/jira/browse/HBASE-5732 Project: HBase Issue Type: Improvement Reporter: Devaraj Das Assignee: Devaraj Das Attachments: rpcengine-merge.3.patch, rpcengine-merge.patch Remove the SecureRPCEngine and merge the security-related logic in the core engine. Follow up to HBASE-5727. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found
[ https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu reassigned HBASE-5870: - Assignee: Zhihong Yu Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found - Key: HBASE-5870 URL: https://issues.apache.org/jira/browse/HBASE-5870 Project: HBase Issue Type: Bug Affects Versions: 0.96.0 Reporter: Jonathan Hsieh Assignee: Zhihong Yu Priority: Blocker Fix For: 0.96.0 Attachments: 5870.txt After HBASE-5861 on 0.94 we are left with this issue on trunk. {code} $ mvn clean test -PlocalTests -DskipTests -Dhadoop.profile=23 ... [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile (default-testCompile) on project hbase: Compilation failure [ERROR] /home/jon/proj/hbase-svn/hbase/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35] cannot find symbol [ERROR] symbol : method getJobTracker() [ERROR] location: class org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner [ERROR] - [Help 1] {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found
[ https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261747#comment-13261747 ] Zhihong Yu commented on HBASE-5870: --- I ran the patch against 0.23 profile. I got one test failure: {code} testSimpleCase(org.apache.hadoop.hbase.mapreduce.TestImportExport) Time elapsed: 2.583 sec ERROR! java.io.FileNotFoundException: File does not exist: /Users/zhihyu/.m2/repository/org/apache/hadoop/hadoop-common/0.23.2-SNAPSHOT/hadoop-common-0.23.2-SNAPSHOT.jar {code} But the jar was there: {code} -rw-r--r-- 1 zhihyu 110088321 1768854 Apr 24 11:23 /Users/zhihyu/.m2/repository/org/apache/hadoop/hadoop-common/0.23.2-SNAPSHOT/hadoop-common-0.23.2-SNAPSHOT.jar {code} Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found - Key: HBASE-5870 URL: https://issues.apache.org/jira/browse/HBASE-5870 Project: HBase Issue Type: Bug Affects Versions: 0.96.0 Reporter: Jonathan Hsieh Assignee: Zhihong Yu Priority: Blocker Fix For: 0.96.0 Attachments: 5870.txt After HBASE-5861 on 0.94 we are left with this issue on trunk. {code} $ mvn clean test -PlocalTests -DskipTests -Dhadoop.profile=23 ... [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile (default-testCompile) on project hbase: Compilation failure [ERROR] /home/jon/proj/hbase-svn/hbase/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35] cannot find symbol [ERROR] symbol : method getJobTracker() [ERROR] location: class org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner [ERROR] - [Help 1] {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5873) TimeOut Monitor thread should be started after atleast one region server registers.
[ https://issues.apache.org/jira/browse/HBASE-5873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261763#comment-13261763 ]

Zhihong Yu commented on HBASE-5873:
---
+1 if tests pass.

TimeOut Monitor thread should be started after atleast one region server registers.
---
Key: HBASE-5873
URL: https://issues.apache.org/jira/browse/HBASE-5873
Project: HBase
Issue Type: Bug
Affects Versions: 0.90.6
Reporter: ramkrishna.s.vasudevan
Assignee: rajeshbabu
Priority: Minor
Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
Attachments: 5873-trunk.txt, HBASE-5873.patch

Currently the timeout monitor thread is started even before any region server has registered with the master. In the timeout monitor we depend on the region servers being online:
{code}
boolean allRSsOffline = this.serverManager.getOnlineServersList().isEmpty();
{code}
When the master starts up it sees that there are no online servers and hence sets allRSsOffline to true.
{code}
setAllRegionServersOffline(allRSsOffline);
{code}
So this.allRegionServersOffline is also true. By the time the timeout monitor runs again in its next cycle (after 10 secs), an RS has come up, so it sees allRSsOffline as false. Hence
{code}
else if (this.allRegionServersOffline && !allRSsOffline) {
  // if some RSs just came back online, we can start the
  // the assignment right away
  actOnTimeOut(regionState);
{code}
This condition makes it act on the timeout. Because of this, even if an assignment of the ROOT region is already in progress, this piece of code triggers another assignment and thus we get a RegionAlreadyInTransitionException. We then need to wait 30 mins for ROOT itself to be assigned.
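The sequence in the HBASE-5873 description above can be replayed with a small model. This is a simplified sketch, not the TimeoutMonitor code: TimeoutMonitorModel and chore are hypothetical names, and the model keeps only the allRegionServersOffline flag and the branch that calls actOnTimeOut.

```java
// Hypothetical, simplified model of the flag logic described in the issue.
final class TimeoutMonitorModel {
  private boolean allRegionServersOffline = false;

  // One monitor cycle. Returns true when the "some RSs just came back online"
  // branch would fire and act on regions in transition right away.
  boolean chore(int onlineServers) {
    boolean allRSsOffline = onlineServers == 0;
    boolean actNow = this.allRegionServersOffline && !allRSsOffline;
    // Remember what this cycle saw, for comparison in the next cycle.
    this.allRegionServersOffline = allRSsOffline;
    return actNow;
  }

  public static void main(String[] args) {
    // Monitor started before any region server registered.
    TimeoutMonitorModel early = new TimeoutMonitorModel();
    System.out.println("cycle 1 acts: " + early.chore(0));
    System.out.println("cycle 2 acts: " + early.chore(1));
  }
}
```

In this model a monitor whose first cycle runs with zero servers fires the branch on the next cycle, while one whose first cycle already sees a registered server never does, which is the behaviour the issue title asks for.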
[jira] [Commented] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found
[ https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261773#comment-13261773 ] Zhihong Yu commented on HBASE-5870: --- The failure is consistent: {code} testSimpleCase(org.apache.hadoop.hbase.mapreduce.TestImportExport) Time elapsed: 2.552 sec ERROR! java.io.FileNotFoundException: File does not exist: /Users/zhihyu/.m2/repository/org/apache/hadoop/hadoop-common/0.23.2-SNAPSHOT/hadoop-common-0.23.2-SNAPSHOT.jar at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:729) at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208) at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71) at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:246) at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:284) at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:355) at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1221) at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177) at org.apache.hadoop.mapreduce.Job.submit(Job.java:1218) at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1239) at org.apache.hadoop.hbase.mapreduce.TestImportExport.testSimpleCase(TestImportExport.java:114) {code} Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found - Key: HBASE-5870 URL: https://issues.apache.org/jira/browse/HBASE-5870 Project: HBase Issue Type: Bug Affects Versions: 0.96.0 Reporter: Jonathan Hsieh Assignee: Zhihong Yu Priority: Blocker Fix For: 0.96.0 Attachments: 5870.txt After HBASE-5861 on 
0.94 we are left with this issue on trunk. {code} $ mvn clean test -PlocalTests -DskipTests -Dhadoop.profile=23 ... [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile (default-testCompile) on project hbase: Compilation failure [ERROR] /home/jon/proj/hbase-svn/hbase/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35] cannot find symbol [ERROR] symbol : method getJobTracker() [ERROR] location: class org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner [ERROR] - [Help 1] {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size
[ https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261799#comment-13261799 ]

Zhihong Yu commented on HBASE-5611:
---
{code}
+ // global memstore size once a region opening failed.
{code}
'region opening failed' -> 'region failed opening'.
{code}
+ private final ConcurrentMap<HRegionInfo, AtomicLong> replayEditsPerRegion =
{code}
Do we need HRegionInfo as the key to the Map? Can we use the region name?

For rollbackRegionReplayEditsSize():
{code}
+ addAndGetGlobalMemstoreSize(-replayEdistsSize.get());
+ clearRegionReplayEditsSize(hri);
{code}
I suggest remembering the value of -replayEdistsSize.get() in a variable so that we can exchange the order of the two statements above and return directly from the if block.
If replayEdistsSize is null, would that indicate a race condition?

Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size
Key: HBASE-5611
URL: https://issues.apache.org/jira/browse/HBASE-5611
Project: HBase
Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Jean-Daniel Cryans
Assignee: Jieshan Bean
Priority: Critical
Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1
Attachments: HBASE-5611-trunk.patch
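The review suggestions for rollbackRegionReplayEditsSize() in the HBASE-5611 comment above can be sketched as follows. This is an illustration under stated assumptions, not the patch: ReplayEditsAccounting and addReplayEdits are hypothetical names, the region name (a String) is used as the map key as the comment proposes, and the per-region counter is removed first so that its value can be subtracted afterwards and a missing entry is handled by returning early.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical accounting class; the region name is the map key.
final class ReplayEditsAccounting {
  private final AtomicLong globalMemstoreSize = new AtomicLong();
  private final ConcurrentMap<String, AtomicLong> replayEditsPerRegion =
      new ConcurrentHashMap<>();

  // Count replayed edits against both the region and the global size.
  long addReplayEdits(String regionName, long size) {
    replayEditsPerRegion.computeIfAbsent(regionName, k -> new AtomicLong()).addAndGet(size);
    return globalMemstoreSize.addAndGet(size);
  }

  // Subtract only this region's replayed edits from the global size.
  long rollbackRegionReplayEditsSize(String regionName) {
    // remove() is atomic, so at most one caller obtains the counter.
    AtomicLong replayEditsSize = replayEditsPerRegion.remove(regionName);
    if (replayEditsSize == null) {
      return globalMemstoreSize.get(); // nothing recorded for this region
    }
    return globalMemstoreSize.addAndGet(-replayEditsSize.get());
  }

  long getGlobalMemstoreSize() {
    return globalMemstoreSize.get();
  }
}
```

Removing the entry before adjusting the global size means a second rollback for the same region becomes a no-op instead of subtracting the same edits twice.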
[jira] [Updated] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found
[ https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5870: -- Attachment: 5870-v2.txt Patch v2 fills in obtainJobConf() for MapreduceV2Shim. getJobTrackerConf() creates a new JobConf. So setting config param in the returned JobConf is not effective. Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found - Key: HBASE-5870 URL: https://issues.apache.org/jira/browse/HBASE-5870 Project: HBase Issue Type: Bug Affects Versions: 0.96.0 Reporter: Jonathan Hsieh Assignee: Zhihong Yu Priority: Blocker Fix For: 0.96.0 Attachments: 5870-v2.txt, 5870.txt After HBASE-5861 on 0.94 we are left with this issue on trunk. {code} $ mvn clean test -PlocalTests -DskipTests -Dhadoop.profile=23 ... [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile (default-testCompile) on project hbase: Compilation failure [ERROR] /home/jon/proj/hbase-svn/hbase/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35] cannot find symbol [ERROR] symbol : method getJobTracker() [ERROR] location: class org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner [ERROR] - [Help 1] {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
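The remark in the HBASE-5870 update above, that setting a parameter on the JobConf returned by getJobTrackerConf() has no effect because a new JobConf is created, describes a general copy-on-return pitfall. The sketch below illustrates only that pitfall with a plain map. MiniClusterModel, getConfCopy and liveConf are hypothetical names, and nothing here is a claim about what obtainJobConf() in the patch returns.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a cluster object that owns a configuration.
final class MiniClusterModel {
  private final Map<String, String> conf = new HashMap<>();

  // Builds a new object on every call, like the getter described in the comment.
  // Changes made to the returned map never reach the cluster.
  Map<String, String> getConfCopy() {
    return new HashMap<>(conf);
  }

  // Hands out the configuration the cluster itself reads from.
  Map<String, String> liveConf() {
    return conf;
  }

  String get(String key) {
    return conf.get(key);
  }

  public static void main(String[] args) {
    MiniClusterModel cluster = new MiniClusterModel();
    cluster.getConfCopy().put("some.param", "value"); // lost: written to a copy
    System.out.println("after copy write: " + cluster.get("some.param"));
    cluster.liveConf().put("some.param", "value");    // visible to the cluster
    System.out.println("after live write: " + cluster.get("some.param"));
  }
}
```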
[jira] [Commented] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found
[ https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261845#comment-13261845 ] Zhihong Yu commented on HBASE-5870: --- Even in build #136 TestImportExport failed, due to a different exception: https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK-on-Hadoop-23/136/testReport/org.apache.hadoop.hbase.mapreduce/TestImportExport/org_apache_hadoop_hbase_mapreduce_TestImportExport/ I suggest checking in patch v2 and investigating TestImportExport in another JIRA.
[jira] [Commented] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found
[ https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261934#comment-13261934 ] Zhihong Yu commented on HBASE-5870: --- The two failed tests passed locally.
[jira] [Updated] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found
[ https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5870: -- Attachment: 5870-v2.txt
[jira] [Updated] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found
[ https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5870: -- Attachment: (was: 5870-v2.txt)
[jira] [Commented] (HBASE-5862) After Region Close remove the Operation Metrics.
[ https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261948#comment-13261948 ] Zhihong Yu commented on HBASE-5862: --- +1 on patch v3.
After Region Close remove the Operation Metrics. Key: HBASE-5862 URL: https://issues.apache.org/jira/browse/HBASE-5862 Project: HBase Issue Type: Improvement Reporter: Elliott Clark Assignee: Elliott Clark Priority: Minor Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch, HBASE-5862-2.patch, HBASE-5862-3.patch
If a region is closed then Hadoop metrics shouldn't still be reporting about that region.
[jira] [Updated] (HBASE-5862) After Region Close remove the Operation Metrics.
[ https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5862: -- Fix Version/s: 0.96.0 Hadoop Flags: Reviewed
[jira] [Created] (HBASE-5876) TestImportExport has been failing against hadoop 0.23 profile
Zhihong Yu created HBASE-5876: - Summary: TestImportExport has been failing against hadoop 0.23 profile Key: HBASE-5876 URL: https://issues.apache.org/jira/browse/HBASE-5876 Project: HBase Issue Type: Bug Reporter: Zhihong Yu TestImportExport has been failing against hadoop 0.23 profile
[jira] [Commented] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found
[ https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261998#comment-13261998 ] Zhihong Yu commented on HBASE-5870: --- Will integrate later this afternoon if there is no objection.
[jira] [Commented] (HBASE-5876) TestImportExport has been failing against hadoop 0.23 profile
[ https://issues.apache.org/jira/browse/HBASE-5876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262024#comment-13262024 ] Zhihong Yu commented on HBASE-5876: --- I face a different issue on a MacBook. See: https://issues.apache.org/jira/browse/HBASE-5870?focusedCommentId=13261773&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13261773
TestImportExport has been failing against hadoop 0.23 profile - Key: HBASE-5876 URL: https://issues.apache.org/jira/browse/HBASE-5876 Project: HBase Issue Type: Bug Reporter: Zhihong Yu Assignee: Uma Maheswara Rao G
[jira] [Commented] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found
[ https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262104#comment-13262104 ] Zhihong Yu commented on HBASE-5870: --- I ran the test suite and TestSplitTransactionOnCluster passed:
{code}
Running org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 50.737 sec
{code}
I ran it again standalone and it passed.
[jira] [Commented] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found
[ https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262118#comment-13262118 ] Zhihong Yu commented on HBASE-5870: --- Integrated to trunk. Thanks for the review, Lars and Jon.
[jira] [Updated] (HBASE-5877) When a query fails because the region has moved, let the regionserver return the new address to the client
[ https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5877: -- Fix Version/s: 0.96.0 Summary: When a query fails because the region has moved, let the regionserver return the new address to the client (was: When a query fails because the region has moved, let the regionserver returns the new address to the client)
When a query fails because the region has moved, let the regionserver return the new address to the client -- Key: HBASE-5877 URL: https://issues.apache.org/jira/browse/HBASE-5877 Project: HBase Issue Type: Improvement Components: client, master, regionserver Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Fix For: 0.96.0 Attachments: 5877.v1.patch
This is mainly useful when we do a rolling restart. It will decrease the load on the master and the network load.
Note that a region is not immediately opened after a close. So:
- it seems preferable to wait before retrying on the other server. An optimisation would be to have a heuristic depending on when the region was closed.
- during a rolling restart, the server moves the regions then stops. So we may have failures when the server is stopped, and this patch won't help.
The implementation in the first patch does the following:
- on the region move, there is an added parameter on the regionserver#close to say where we are sending the region
- the regionserver keeps a list of what was moved. Each entry is kept for 100 seconds.
- the regionserver sends a specific exception when it receives a query on a moved region. This exception contains the new address.
- the client analyses the exceptions and updates its cache accordingly...
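The regionserver-side bookkeeping described above (a list of moved regions, each entry kept for 100 seconds, consulted when a query arrives for a region that is no longer served) could look roughly like the sketch below. It is an illustration only, not the code from 5877.v1.patch: the class name {{MovedRegionsCache}}, its method names and the use of plain strings for server addresses are all assumptions, and the clock is passed in explicitly to keep the example deterministic.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of a "where did this region go" map with a fixed time-to-live. */
final class MovedRegionsCache {
    /** Entries are kept for 100 seconds, as stated in the issue description. */
    static final long TTL_MILLIS = 100_000L;

    private static final class Entry {
        final String destination;
        final long movedAtMillis;

        Entry(String destination, long movedAtMillis) {
            this.destination = destination;
            this.movedAtMillis = movedAtMillis;
        }
    }

    private final Map<String, Entry> moved = new ConcurrentHashMap<>();

    /** Called when a region is closed because it is being moved to another server. */
    void recordMove(String encodedRegionName, String destination, long nowMillis) {
        moved.put(encodedRegionName, new Entry(destination, nowMillis));
    }

    /** Returns the new address, or null if the region is unknown or the entry has expired. */
    String lookup(String encodedRegionName, long nowMillis) {
        Entry entry = moved.get(encodedRegionName);
        if (entry == null) {
            return null;
        }
        if (nowMillis - entry.movedAtMillis > TTL_MILLIS) {
            // Expired: drop it so the map does not grow without bound.
            moved.remove(encodedRegionName, entry);
            return null;
        }
        return entry.destination;
    }
}
```

A server using such a map would answer a query on a moved region with an exception carrying the value returned by {{lookup}}, and fall back to its usual "not serving" answer when {{lookup}} returns null.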
[jira] [Commented] (HBASE-5877) When a query fails because the region has moved, let the regionserver return the new address to the client
[ https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262199#comment-13262199 ] Zhihong Yu commented on HBASE-5877: ---
For RegionMovedException.java:
{code}
+String tmpHostname = "nohostname";
{code}
I think the above could potentially be a host name :-)
{code}
+} catch (Exception ignored) {
+  LOG.warn("Can't parse the hostname and the port from this string: " + s + ", " +
+    "Continuing");
+}
{code}
Can we mark the failure and make this RegionMovedException behave the same as NotServingRegionException?
For updateCachedLocations(), please put the explanation for each parameter on the same line as the parameter:
{code}
+ * @param row - row and tableName can be null id hrl is not null.
{code}
{code}
+LOG.warn("Failed all from " + loc, e);
{code}
'Failed all' -> 'Failed call'
{code}
+  if (resp == null) {
+    // Entire server failed
+    LOG.fatal("Failed all for server: " + loc.getHostnamePort() +
+      ", removing from cache");
+    continue;
+  }
{code}
How is the server removed from the cache, since I see 'continue' above?
{code}
+  } else {
+    if (numRetries == 1)
+      LOG.fatal("step 4 got result " + regionResult.getFirst() + + regionResult.getSecond());
{code}
Why is the above fatal (regionResult != null)? Step 4 appears in a comment below the above code. Should the above say step 3?
Please increase the VERSION of HRegionInterface.
{code}
+ * @param destServerName: server name on which the server will be moved
{code}
'which the server' -> 'which the region'
For ServerManager.sendRegionClose(), please add javadoc for the destServerName param.
For HRegionServer.java:
{code}
+LOG.info("Closing region "+region.getRegionName()+", moving to "+sn.getServerName() );
{code}
Is it possible that destServerName is null?
{code}
+  private ServerName getMovedRegion(String encodedRegionName) {
+    LOG.fatal("Called getMovedRegion for "+encodedRegionName+ + movedRegions.size()+ +movedRegions);
{code}
Please change the above to debug log.
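The review point about the "nohostname" placeholder (a sentinel string could itself be a valid host name, so a parse failure is better marked explicitly) can be illustrated with a small sketch. It assumes the address travels as text of the form host:port, which the patch excerpt does not confirm; the class {{HostAndPort}} and its {{parse}} method are hypothetical and not part of HBase.

```java
/** Hypothetical value type; a null result from parse() marks "could not parse". */
final class HostAndPort {
    final String host;
    final int port;

    private HostAndPort(String host, int port) {
        this.host = host;
        this.port = port;
    }

    /**
     * Parses text of the assumed form "host:port". Returns null on any failure
     * instead of substituting a placeholder host name, so callers can tell a
     * real address from a failed parse and fall back to their generic handling.
     */
    static HostAndPort parse(String text) {
        if (text == null) {
            return null;
        }
        int colon = text.lastIndexOf(':');
        if (colon <= 0 || colon == text.length() - 1) {
            return null;
        }
        try {
            int port = Integer.parseInt(text.substring(colon + 1));
            if (port < 0 || port > 65535) {
                return null;
            }
            return new HostAndPort(text.substring(0, colon), port);
        } catch (NumberFormatException e) {
            return null;
        }
    }
}
```

A client that gets null back would treat the exception like any other "region not served here" failure, which is the behaviour the comment asks for.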
[jira] [Updated] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found
[ https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5870: -- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available)
[jira] [Commented] (HBASE-5880) TestImportExport fails on trunk when build/running against hadoop 23.
[ https://issues.apache.org/jira/browse/HBASE-5880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262250#comment-13262250 ] Zhihong Yu commented on HBASE-5880: --- I logged HBASE-5876 for this issue already.
TestImportExport fails on trunk when build/running against hadoop 23. - Key: HBASE-5880 URL: https://issues.apache.org/jira/browse/HBASE-5880 Project: HBase Issue Type: Bug Components: mapreduce, test Affects Versions: 0.96.0 Reporter: Jonathan Hsieh
After fixing trunk against hadoop 23 compilation problems with HBASE-5870 and HBASE-5861, we have one remaining problem -- TestImportExport consistently fails unit test run.
[jira] [Commented] (HBASE-5864) Error while reading from hfile in 0.94
[ https://issues.apache.org/jira/browse/HBASE-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262266#comment-13262266 ] Zhihong Yu commented on HBASE-5864: --- Latest patch looks good. Minor comments:
{code}
+ // after reading the root index the check sum bytes has to
{code}
'check sum bytes has to' -> 'checksum bytes have to'
{code}
+ // be subracted to know if the mid key exists.
{code}
'subracted' -> 'subtracted'
Error while reading from hfile in 0.94 -- Key: HBASE-5864 URL: https://issues.apache.org/jira/browse/HBASE-5864 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.0 Reporter: Gopinathan A Assignee: ramkrishna.s.vasudevan Priority: Blocker Fix For: 0.94.0 Attachments: HBASE-5864_1.patch, HBASE-5864_2.patch, HBASE-5864_3.patch, HBASE-5864_test.patch
Got the following stacktrace during region split.
{noformat}
2012-04-24 16:05:42,168 WARN org.apache.hadoop.hbase.regionserver.Store: Failed getting store size for value
java.io.IOException: Requested block is out of range: 2906737606134037404, lastDataBlockOffset: 84764558
        at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:278)
        at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.midkey(HFileBlockIndex.java:285)
        at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.midkey(HFileReaderV2.java:402)
        at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1638)
        at org.apache.hadoop.hbase.regionserver.Store.getSplitPoint(Store.java:1943)
        at org.apache.hadoop.hbase.regionserver.RegionSplitPolicy.getSplitPoint(RegionSplitPolicy.java:77)
        at org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:4921)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.splitRegion(HRegionServer.java:2901)
{noformat}
[jira] [Commented] (HBASE-5860) splitlogmanager should not unnecessarily resubmit tasks when zk unavailable
[ https://issues.apache.org/jira/browse/HBASE-5860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262318#comment-13262318 ] Zhihong Yu commented on HBASE-5860: --- Patch makes sense.
{code}
+ static boolean isAnyCreateZNodePending() {
{code}
This method can be made private, right? Would isAnyZNodeCreationPending be a better name?
splitlogmanager should not unnecessarily resubmit tasks when zk unavailable --- Key: HBASE-5860 URL: https://issues.apache.org/jira/browse/HBASE-5860 Project: HBase Issue Type: Improvement Reporter: Prakash Khemani Assignee: Prakash Khemani Attachments: 0001-HBASE-5860-splitlogmanager-should-not-unnecessarily-.patch
(Doesn't really impact the run time or correctness of log splitting.)
Say the master has lost its connection to zk. splitlogmanager's timeoutmanager will realize that all the tasks that were submitted are still unassigned. It will resubmit those tasks (i.e. create dummy znodes). splitlogmanager should realize that the tasks are unassigned but their znodes have not been created.
2012-04-20 13:11:20,516 INFO org.apache.hadoop.hbase.master.SplitLogManager: dead splitlog worker msgstore295.snc4.facebook.com,60020,1334948757026
2012-04-20 13:11:20,517 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: Scheduling batch of logs to split
2012-04-20 13:11:20,517 INFO org.apache.hadoop.hbase.master.SplitLogManager: started splitting logs in [hdfs://msgstore215.snc4.facebook.com:9000/MSGSTORE215-SNC4-HBASE/.logs/msgstore295.snc4.facebook.com,60020,1334948757026-splitting]
2012-04-20 13:11:20,565 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server msgstore235.snc4.facebook.com/10.30.222.186:2181
2012-04-20 13:11:20,566 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to msgstore235.snc4.facebook.com/10.30.222.186:2181, initiating session
2012-04-20 13:11:20,575 INFO org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 4 unassigned = 4
2012-04-20 13:11:20,576 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: resubmitting unassigned task(s) after timeout
2012-04-20 13:11:21,577 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: resubmitting unassigned task(s) after timeout
2012-04-20 13:11:21,683 INFO org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x36ccb0f8010002, likely server has closed socket, closing socket connection and attempting reconnect
2012-04-20 13:11:21,683 INFO org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x136ccb0f489, likely server has closed socket, closing socket connection and attempting reconnect
2012-04-20 13:11:21,786 WARN org.apache.hadoop.hbase.master.SplitLogManager$CreateAsyncCallback: create rc =CONNECTIONLOSS for /hbase/splitlog/hdfs%3A%2F%2Fmsgstore215.snc4.facebook.com%3A9000%2FMSGSTORE215-SNC4-HBASE%2F.logs%2Fmsgstore295.snc4.facebook.com%2C60020%2C1334948757026-splitting%2F10.30.251.186%253A60020.1334951586677 retry=3
2012-04-20 13:11:21,786 WARN org.apache.hadoop.hbase.master.SplitLogManager$CreateAsyncCallback: create rc =CONNECTIONLOSS for /hbase/splitlog/hdfs%3A%2F%2Fmsgstore215.snc4.facebook.com%3A9000%2FMSGSTORE215-SNC4-HBASE%2F.logs%2Fmsgstore295.snc4.facebook.com%2C60020%2C1334948757026-splitting%2F10.30.251.186%253A60020.1334951920332 retry=3
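The behaviour asked for in HBASE-5860 (do not resubmit unassigned tasks while their znode creations are still outstanding) can be sketched with a simple in-flight counter. This is a guess at the general shape, not the attached patch: the class {{PendingCreateTracker}} and the {{createRequested}}, {{createCompleted}} and {{shouldResubmit}} methods are hypothetical, and only the name {{isAnyZNodeCreationPending}} comes from the review comment.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical tracker for asynchronous znode creations that have not yet completed. */
final class PendingCreateTracker {
    private final AtomicInteger pendingCreates = new AtomicInteger();

    /** Call when an asynchronous create is issued. */
    void createRequested() {
        pendingCreates.incrementAndGet();
    }

    /** Call from the create callback once the create has finally succeeded or been given up. */
    void createCompleted() {
        pendingCreates.decrementAndGet();
    }

    boolean isAnyZNodeCreationPending() {
        return pendingCreates.get() > 0;
    }

    /**
     * The timeout check resubmits only when there are unassigned tasks and no
     * creation is still in flight; otherwise "unassigned" just means that the
     * znode does not exist yet, for example because the connection was lost.
     */
    boolean shouldResubmit(int unassignedTasks) {
        return unassignedTasks > 0 && !isAnyZNodeCreationPending();
    }
}
```

In the log above this would suppress the two "resubmitting unassigned task(s) after timeout" lines, because the creates were still being retried after CONNECTIONLOSS.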
[jira] [Commented] (HBASE-4393) Implement a canary monitoring program
[ https://issues.apache.org/jira/browse/HBASE-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13260557#comment-13260557 ] Zhihong Yu commented on HBASE-4393: --- This checkin might be related to:
{code}
[ERROR] Failed to execute goal org.apache.rat:apache-rat-plugin:0.8:check (default) on project hbase: Too many unapproved licenses: 1 -> [Help 1]
{code}
Implement a canary monitoring program - Key: HBASE-4393 URL: https://issues.apache.org/jira/browse/HBASE-4393 Project: HBase Issue Type: New Feature Components: monitoring Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: Matteo Bertozzi Fix For: 0.94.0, 0.96.0 Attachments: Canary-v0.java, HBASE-4393-v0.patch, HBaseCanary.java
This JIRA is to implement a standalone program that can be used to do canary monitoring of a running HBase cluster. This program would gather a list of the regions in the cluster, then iterate over them doing lightweight operations (e.g. short scans) to provide metrics about latency as well as alert on availability issues.
[jira] [Commented] (HBASE-5864) Error while reading from hfile in 0.94
[ https://issues.apache.org/jira/browse/HBASE-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13260697#comment-13260697 ]

Zhihong Yu commented on HBASE-5864:
---

{code}
-DataInputStream nextBlockAsStream(BlockType blockType) throws IOException;
+HFileBlock nextBlockAsStream(BlockType blockType) throws IOException;
{code}
The method should be named nextBlock() because stream isn't returned.
{code}
+ * Read in the root-level index from the given input stream. Must match
{code}
'input stream' is no longer the input. HFileBlock is.
Please add @return to the javadoc.

For TestHFileWriterV2.java:
{code}
-final Compression.Algorithm COMPRESS_ALGO = Compression.Algorithm.GZ;
+final Compression.Algorithm COMPRESS_ALGO = Compression.Algorithm.NONE;
{code}
We should exercise both compression algorithms. Refactoring is needed.

Error while reading from hfile in 0.94
--
Key: HBASE-5864
URL: https://issues.apache.org/jira/browse/HBASE-5864
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
Priority: Critical
Fix For: 0.94.0
Attachments: HBASE-5864_1.patch, HBASE-5864_2.patch, HBASE-5864_test.patch

Got the following stacktrace during region split.
{noformat}
2012-04-24 16:05:42,168 WARN org.apache.hadoop.hbase.regionserver.Store: Failed getting store size for value
java.io.IOException: Requested block is out of range: 2906737606134037404, lastDataBlockOffset: 84764558
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:278)
    at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.midkey(HFileBlockIndex.java:285)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.midkey(HFileReaderV2.java:402)
    at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1638)
    at org.apache.hadoop.hbase.regionserver.Store.getSplitPoint(Store.java:1943)
    at org.apache.hadoop.hbase.regionserver.RegionSplitPolicy.getSplitPoint(RegionSplitPolicy.java:77)
    at org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:4921)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.splitRegion(HRegionServer.java:2901)
{noformat}
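The request to exercise both compression algorithms could be met by one helper that takes the algorithm as a parameter and is called once per value. The following is only a sketch of that shape: `RoundTripCheck` and its `Algo` enum are hypothetical stand-ins built on `java.util.zip`, not HBase's `Compression.Algorithm` or the HFile writer and reader used in TestHFileWriterV2.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class RoundTripCheck {
  // Hypothetical stand-in for Compression.Algorithm; not the HBase enum.
  enum Algo { NONE, GZ }

  // Writes the payload with the given algorithm, then reads it back.
  static byte[] writeAndRead(Algo algo, byte[] payload) {
    try {
      ByteArrayOutputStream sink = new ByteArrayOutputStream();
      try (OutputStream out = (algo == Algo.GZ) ? new GZIPOutputStream(sink) : sink) {
        out.write(payload);
      }
      InputStream raw = new ByteArrayInputStream(sink.toByteArray());
      ByteArrayOutputStream result = new ByteArrayOutputStream();
      try (InputStream in = (algo == Algo.GZ) ? new GZIPInputStream(raw) : raw) {
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
          result.write(buf, 0, n);
        }
      }
      return result.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Runs the same check once per algorithm instead of hard-coding one.
  static boolean roundTripsForAll(byte[] payload) {
    for (Algo algo : Algo.values()) {
      if (!Arrays.equals(payload, writeAndRead(algo, payload))) {
        return false;
      }
    }
    return true;
  }
}
```

Looping over the enum values means a newly added algorithm is covered without touching the test body, which is the point of the requested refactoring.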
[jira] [Commented] (HBASE-5864) Error while reading from hfile in 0.94
[ https://issues.apache.org/jira/browse/HBASE-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13260709#comment-13260709 ]

Zhihong Yu commented on HBASE-5864:
---

Patch v2 passes the new test.
{code}
+ private void writeDataAndReadFromHFile(Path hfilePath,
+ Algorithm COMPRESS_ALGO, int ENTRY_COUNT, boolean findMidKey) throws IOException {
{code}
Please don't use all upper case parameter names.
Please refactor the new readRootIndex() to re-use the existing method.

Error while reading from hfile in 0.94
--
Key: HBASE-5864
URL: https://issues.apache.org/jira/browse/HBASE-5864
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
Priority: Critical
Fix For: 0.94.0
Attachments: HBASE-5864_1.patch, HBASE-5864_2.patch, HBASE-5864_test.patch

Got the following stacktrace during region split.
{noformat}
2012-04-24 16:05:42,168 WARN org.apache.hadoop.hbase.regionserver.Store: Failed getting store size for value
java.io.IOException: Requested block is out of range: 2906737606134037404, lastDataBlockOffset: 84764558
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:278)
    at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.midkey(HFileBlockIndex.java:285)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.midkey(HFileReaderV2.java:402)
    at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1638)
    at org.apache.hadoop.hbase.regionserver.Store.getSplitPoint(Store.java:1943)
    at org.apache.hadoop.hbase.regionserver.RegionSplitPolicy.getSplitPoint(RegionSplitPolicy.java:77)
    at org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:4921)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.splitRegion(HRegionServer.java:2901)
{noformat}
[jira] [Commented] (HBASE-5864) Error while reading from hfile in 0.94
[ https://issues.apache.org/jira/browse/HBASE-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13260743#comment-13260743 ]

Zhihong Yu commented on HBASE-5864:
---

The following computation assumes checksum is on:
{code}
+ int numBytes = (int) ChecksumUtil.numBytes(blk
+ .getOnDiskDataSizeWithHeader(), blk.getBytesPerChecksum());
{code}
If checksum is off, we would get 'divide by 0' exception.
I suggest using HFileBlock.totalChecksumBytes() in place of the above.

Error while reading from hfile in 0.94
--
Key: HBASE-5864
URL: https://issues.apache.org/jira/browse/HBASE-5864
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
Priority: Critical
Fix For: 0.94.0
Attachments: HBASE-5864_1.patch, HBASE-5864_2.patch, HBASE-5864_test.patch

Got the following stacktrace during region split.
{noformat}
2012-04-24 16:05:42,168 WARN org.apache.hadoop.hbase.regionserver.Store: Failed getting store size for value
java.io.IOException: Requested block is out of range: 2906737606134037404, lastDataBlockOffset: 84764558
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:278)
    at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.midkey(HFileBlockIndex.java:285)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.midkey(HFileReaderV2.java:402)
    at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1638)
    at org.apache.hadoop.hbase.regionserver.Store.getSplitPoint(Store.java:1943)
    at org.apache.hadoop.hbase.regionserver.RegionSplitPolicy.getSplitPoint(RegionSplitPolicy.java:77)
    at org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:4921)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.splitRegion(HRegionServer.java:2901)
{noformat}
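The divide-by-zero concern raised in the comment above can be shown in isolation. This is a minimal sketch, assuming checksum bytes are computed as one fixed-size checksum per chunk of `bytesPerChecksum` bytes and that a value of 0 means checksums are off. `ChecksumBytesSketch`, `CHECKSUM_SIZE`, and `unguardedNumBytes` are hypothetical names, and the guarded method only borrows the name `totalChecksumBytes`; none of this is the actual `ChecksumUtil` or `HFileBlock` code.

```java
class ChecksumBytesSketch {
  // Assumed size of one checksum value in bytes (hypothetical constant).
  static final int CHECKSUM_SIZE = 4;

  // Unguarded form: throws ArithmeticException when bytesPerChecksum is 0.
  static long unguardedNumBytes(long dataSize, int bytesPerChecksum) {
    long chunks = dataSize / bytesPerChecksum;
    if (dataSize % bytesPerChecksum != 0) {
      chunks++; // a partial trailing chunk still needs its own checksum
    }
    return chunks * CHECKSUM_SIZE;
  }

  // Guarded form: checksums disabled means no checksum bytes at all.
  static long totalChecksumBytes(long dataSize, int bytesPerChecksum) {
    if (bytesPerChecksum == 0) {
      return 0;
    }
    return unguardedNumBytes(dataSize, bytesPerChecksum);
  }
}
```

Putting the zero check inside the helper, rather than at each call site, is what makes a single accessor a safer replacement for the inline computation.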
[jira] [Commented] (HBASE-5861) Hadoop 23 compile broken due to tests introduced in HBASE-5064
[ https://issues.apache.org/jira/browse/HBASE-5861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260785#comment-13260785 ] Zhihong Yu commented on HBASE-5861: --- I got the following with v3 using 0.23 profile: {code} [ERROR] /Users/zhihyu/trunk-hbase/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35] cannot find symbol [ERROR] symbol : method getJobTracker() [ERROR] location: class org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner {code} Hadoop 23 compile broken due to tests introduced in HBASE-5064 --- Key: HBASE-5861 URL: https://issues.apache.org/jira/browse/HBASE-5861 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.94.0, 0.96.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Priority: Blocker Fix For: 0.94.0, 0.96.0 Attachments: 5861.txt, hbase-5861-jon.patch, hbase-5861-v2.patch, hbase-5861-v3.patch When attempting to compile HBase 0.94rc1 against hadoop 23, I got this set of compilation error messages: {code} jon@swoop:~/proj/hbase-0.94$ mvn clean test -Dhadoop.profile=23 -DskipTests ... 
[INFO] [INFO] BUILD FAILURE [INFO] [INFO] Total time: 18.926s [INFO] Finished at: Mon Apr 23 10:38:47 PDT 2012 [INFO] Final Memory: 55M/555M [INFO] [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile (default-testCompile) on project hbase: Compilation failure: Compilation failure: [ERROR] /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[147,46] org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated [ERROR] [ERROR] /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[153,29] org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated [ERROR] [ERROR] /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[194,46] org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated [ERROR] [ERROR] /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[206,29] org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated [ERROR] [ERROR] /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[213,29] org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated [ERROR] [ERROR] /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[226,29] org.apache.hadoop.mapreduce.TaskAttemptContext is abstract; cannot be instantiated [ERROR] - [Help 1] {code} Upon further investigation this issue is due to code introduced in HBASE-5064 and is also present in trunk. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5848) Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort
[ https://issues.apache.org/jira/browse/HBASE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260809#comment-13260809 ] Zhihong Yu commented on HBASE-5848: --- TestFullLogReconstruction#testReconstruction gave the following based on addendum: {code} 012-04-24 11:35:48,409 WARN [Thread-189] client.HConnectionManager$HConnectionImplementation(1020): Encountered problems when prefetch META table: org.apache.hadoop.hbase.TableNotFoundException: Cannot find row in .META. for table: tabletest, row=tabletest,aaa,99 at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:158) at org.apache.hadoop.hbase.client.MetaScanner.access$000(MetaScanner.java:52) at org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:130) at org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:127) at org.apache.hadoop.hbase.client.HConnectionManager.execute(HConnectionManager.java:385) at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:127) at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:103) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:1017) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1071) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:959) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1849) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1733) at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1020) at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:832) at org.apache.hadoop.hbase.client.HTable.put(HTable.java:807) at 
org.apache.hadoop.hbase.HBaseTestingUtility.loadTable(HBaseTestingUtility.java:992) at org.apache.hadoop.hbase.TestFullLogReconstruction.testReconstruction(TestFullLogReconstruction.java:102) {code} Will try to come up with new addendum. Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort Key: HBASE-5848 URL: https://issues.apache.org/jira/browse/HBASE-5848 Project: HBase Issue Type: Bug Components: client Reporter: Lars Hofhansl Assignee: ramkrishna.s.vasudevan Priority: Minor Fix For: 0.94.0, 0.96.0 Attachments: HBASE-5848.patch, HBASE-5848.patch, HBASE-5848_0.94.patch, HBASE-5848_addendum.patch A coworker of mine just had this scenario. It does not make sense the EMPTY_START_ROW as splitKey (since the region with the empty start key is implicit), but it should not cause the HMaster to abort. The abort happens because it tries to bulk assign the same region twice and then runs into race conditions with ZK. The same would (presumably) happen when two identical split keys are passed, but the client blocks that. The simplest solution here is to also block passed null or EMPTY_START_ROW as split key by the client. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
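The client-side check proposed in the issue description (reject a null or empty split key, alongside the rejection of identical keys that the description says already exists) might look like the sketch below. `SplitKeyCheck` and `validate` are hypothetical names; this is not the HBaseAdmin code, and it does not cover callers that bypass the client.

```java
import java.util.Arrays;

class SplitKeyCheck {
  // Throws IllegalArgumentException if any split key is null, empty, or repeated.
  static void validate(byte[][] splitKeys) {
    if (splitKeys == null) {
      return; // no pre-splitting requested
    }
    for (int i = 0; i < splitKeys.length; i++) {
      byte[] key = splitKeys[i];
      if (key == null || key.length == 0) {
        // The region with the empty start key is implicit, so it cannot be a split point.
        throw new IllegalArgumentException("Empty split key at index " + i);
      }
      // Quadratic duplicate scan; acceptable for a sketch with few keys.
      for (int j = 0; j < i; j++) {
        if (Arrays.equals(splitKeys[j], key)) {
          throw new IllegalArgumentException("Duplicate split key at index " + i);
        }
      }
    }
  }
}
```

A check like this only protects well-behaved clients, so it complements rather than replaces handling on the master side.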
[jira] [Updated] (HBASE-5848) Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort
[ https://issues.apache.org/jira/browse/HBASE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5848: -- Attachment: 5848-addendum-v2.txt Addendum v2 passes the following tests: {code} 889 mt -Dtest=TestFullLogReconstruction#testReconstruction 890 mt -Dtest=TestRegionRebalancing#testRebalanceOnRegionServerNumberChange {code} Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort Key: HBASE-5848 URL: https://issues.apache.org/jira/browse/HBASE-5848 Project: HBase Issue Type: Bug Components: client Reporter: Lars Hofhansl Assignee: ramkrishna.s.vasudevan Priority: Minor Fix For: 0.94.0, 0.96.0 Attachments: 5848-addendum-v2.txt, HBASE-5848.patch, HBASE-5848.patch, HBASE-5848_0.94.patch, HBASE-5848_addendum.patch A coworker of mine just had this scenario. It does not make sense the EMPTY_START_ROW as splitKey (since the region with the empty start key is implicit), but it should not cause the HMaster to abort. The abort happens because it tries to bulk assign the same region twice and then runs into race conditions with ZK. The same would (presumably) happen when two identical split keys are passed, but the client blocks that. The simplest solution here is to also block passed null or EMPTY_START_ROW as split key by the client. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5848) Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort
[ https://issues.apache.org/jira/browse/HBASE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5848: -- Attachment: 5848-addendum-v3.txt Addendum v2 didn't address the root cause of this issue. Addendum v3 treats NodeExistsException specially in asyncSetOfflineInZooKeeper(). Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort Key: HBASE-5848 URL: https://issues.apache.org/jira/browse/HBASE-5848 Project: HBase Issue Type: Bug Components: client Reporter: Lars Hofhansl Assignee: ramkrishna.s.vasudevan Priority: Minor Fix For: 0.94.0, 0.96.0 Attachments: 5848-addendum-v2.txt, 5848-addendum-v3.txt, HBASE-5848.patch, HBASE-5848.patch, HBASE-5848_0.94.patch, HBASE-5848_addendum.patch A coworker of mine just had this scenario. It does not make sense the EMPTY_START_ROW as splitKey (since the region with the empty start key is implicit), but it should not cause the HMaster to abort. The abort happens because it tries to bulk assign the same region twice and then runs into race conditions with ZK. The same would (presumably) happen when two identical split keys are passed, but the client blocks that. The simplest solution here is to also block passed null or EMPTY_START_ROW as split key by the client. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5715) Revert 'Instant schema alter' for now, HBASE-4213
[ https://issues.apache.org/jira/browse/HBASE-5715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261026#comment-13261026 ] Zhihong Yu commented on HBASE-5715: --- @Subbu: Thanks for following up. I think this work should be discussed under HBASE-5713. I will provide review comments there. Have you run all unit tests under instant_schema_alter branch ? Revert 'Instant schema alter' for now, HBASE-4213 - Key: HBASE-5715 URL: https://issues.apache.org/jira/browse/HBASE-5715 Project: HBase Issue Type: Task Reporter: stack Assignee: stack Fix For: 0.94.0 Attachments: patch1.patch, revert.txt, revert.v2.txt, revert.v3.txt, revert.v4.txt, revert094.v4.txt See this discussion: http://search-hadoop.com/m/NxCQh1KlSxR1/Pull+instant+schema+updating+out%253Fsubj=Pull+instant+schema+updating+out+ Pull out hbase-4213 for now. Can add it back later. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5848) Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort
[ https://issues.apache.org/jira/browse/HBASE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261068#comment-13261068 ]

Zhihong Yu commented on HBASE-5848:
---

I modified testCreateTableWithEmptyRowInTheSplitKeys using the above pattern and master didn't crash (with addendum):
{code}
Index: src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
===
--- src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java (revision 1330037)
+++ src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java (working copy)
@@ -733,15 +733,13 @@
 @Test
 public void testCreateTableWithEmptyRowInTheSplitKeys() throws IOException{
 byte[] tableName = Bytes.toBytes("testCreateTableWithEmptyRowInTheSplitKeys");
-byte[][] splitKeys = new byte[3][];
-splitKeys[0] = region1.getBytes();
-splitKeys[1] = HConstants.EMPTY_BYTE_ARRAY;
-splitKeys[2] = region2.getBytes();
+byte[][] splitKeys = new byte[2][];
+splitKeys[0] = HConstants.EMPTY_BYTE_ARRAY;
+splitKeys[1] = region2.getBytes();
 HTableDescriptor desc = new HTableDescriptor(tableName);
 desc.addFamily(new HColumnDescriptor(col));
 try {
 admin.createTable(desc, splitKeys);
- fail("Test case should fail as empty split key is passed.");
 } catch (IllegalArgumentException e) {
 }
 }
{code}

Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort
Key: HBASE-5848
URL: https://issues.apache.org/jira/browse/HBASE-5848
Project: HBase
Issue Type: Bug
Components: client
Reporter: Lars Hofhansl
Assignee: ramkrishna.s.vasudevan
Priority: Minor
Fix For: 0.94.0, 0.96.0
Attachments: 5848-addendum-v2.txt, 5848-addendum-v3.txt, HBASE-5848.patch, HBASE-5848.patch, HBASE-5848_0.94.patch, HBASE-5848_addendum.patch

A coworker of mine just had this scenario. It does not make sense the EMPTY_START_ROW as splitKey (since the region with the empty start key is implicit), but it should not cause the HMaster to abort.
The abort happens because it tries to bulk assign the same region twice and then runs into race conditions with ZK.
The same would (presumably) happen when two identical split keys are passed, but the client blocks that.
The simplest solution here is to also block passed null or EMPTY_START_ROW as split key by the client.
[jira] [Updated] (HBASE-5848) Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort
[ https://issues.apache.org/jira/browse/HBASE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5848: -- Attachment: 5848-addendum-v4.txt Addendum v4 passes TestAdmin testCreateTableWithEmptyRowInTheSplitKeys is removed since IllegalArgumentException wouldn't be thrown. Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort Key: HBASE-5848 URL: https://issues.apache.org/jira/browse/HBASE-5848 Project: HBase Issue Type: Bug Components: client Reporter: Lars Hofhansl Assignee: ramkrishna.s.vasudevan Priority: Minor Fix For: 0.94.0, 0.96.0 Attachments: 5848-addendum-v2.txt, 5848-addendum-v3.txt, 5848-addendum-v4.txt, HBASE-5848.patch, HBASE-5848.patch, HBASE-5848_0.94.patch, HBASE-5848_addendum.patch A coworker of mine just had this scenario. It does not make sense the EMPTY_START_ROW as splitKey (since the region with the empty start key is implicit), but it should not cause the HMaster to abort. The abort happens because it tries to bulk assign the same region twice and then runs into race conditions with ZK. The same would (presumably) happen when two identical split keys are passed, but the client blocks that. The simplest solution here is to also block passed null or EMPTY_START_ROW as split key by the client. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5713) Introduce throttling during Instant schema change process to throttle opening/closing regions.
[ https://issues.apache.org/jira/browse/HBASE-5713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5713: -- Attachment: 5713.txt Subbu's patch. Introduce throttling during Instant schema change process to throttle opening/closing regions. --- Key: HBASE-5713 URL: https://issues.apache.org/jira/browse/HBASE-5713 Project: HBase Issue Type: Bug Components: client, master, regionserver, shell Reporter: Subbu M Iyer Assignee: Subbu M Iyer Priority: Minor Attachments: 5713.txt There is a potential for region open/close stampede during instant schema change process as the process attempts to close/open impacted regions in rapid succession. We need to introduce some kind of throttling to eliminate the race condition. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5713) Introduce throttling during Instant schema change process to throttle opening/closing regions.
[ https://issues.apache.org/jira/browse/HBASE-5713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261097#comment-13261097 ]

Zhihong Yu commented on HBASE-5713:
---

For SchemaChangeTracker.java:
{code}
+ Throwable exception
+ ) {
{code}
Please move the second line to the end of the first line.

For CompactSplitThread.java:
{code}
+import java.util.concurrent.*;
{code}
Please restore the individual imports from java.util.concurrent.
{code}
 while (this.server.getSchemaChangeTracker()
 .isSchemaChangeInProgress(tableName)) {
 try {
-Thread.sleep(100);
+Thread.sleep(500);
{code}
Why is the sleep interval longer?
{code}
+ <name>hbase.instant.schema.throttle.time</name>
+ <value>500</value>
+ <description>Throttle time in millis while closing/re opening impacted regions
{code}
're opening' -> 're-opening'
Since the user may choose a longer throttle interval, 'hbase.instant.schema.alter.timeout' should be made longer.

Introduce throttling during Instant schema change process to throttle opening/closing regions.
---
Key: HBASE-5713
URL: https://issues.apache.org/jira/browse/HBASE-5713
Project: HBase
Issue Type: Bug
Components: client, master, regionserver, shell
Reporter: Subbu M Iyer
Assignee: Subbu M Iyer
Priority: Minor
Attachments: 5713.txt

There is a potential for region open/close stampede during instant schema change process as the process attempts to close/open impacted regions in rapid succession. We need to introduce some kind of throttling to eliminate the race condition.
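The last point in the review above, that a longer throttle interval calls for a longer alter timeout, comes down to simple arithmetic. The sketch below assumes regions are reopened one at a time with one throttle pause per region, which is an assumption made here for illustration. `ThrottleBudget` and its methods are hypothetical and say nothing about how HBase actually reads `hbase.instant.schema.throttle.time` or `hbase.instant.schema.alter.timeout`.

```java
class ThrottleBudget {
  // Lower bound on time spent pausing: one throttle pause per region (assumed model).
  static long minPauseMillis(int regionCount, long throttleMillis) {
    return Math.multiplyExact((long) regionCount, throttleMillis);
  }

  // True if the timeout leaves room for at least the throttle pauses themselves.
  static boolean timeoutCoversThrottle(int regionCount, long throttleMillis, long timeoutMillis) {
    return timeoutMillis >= minPauseMillis(regionCount, throttleMillis);
  }
}
```

Under this model, 100 regions at a 500 ms throttle already spend 50 seconds pausing, so a timeout chosen for a 100 ms throttle would no longer be sufficient.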
[jira] [Updated] (HBASE-5861) Hadoop 23 compile broken due to tests introduced in HBASE-5064
[ https://issues.apache.org/jira/browse/HBASE-5861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5861: -- Attachment: 5861-v4.patch Patch v4 adds support for obtaining JobConf. TestHLogRecordReader passes using either hadoop 1.0 or 0.23 profile. Hadoop 23 compile broken due to tests introduced in HBASE-5064 --- Key: HBASE-5861 URL: https://issues.apache.org/jira/browse/HBASE-5861 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.94.0, 0.96.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Priority: Blocker Fix For: 0.94.0, 0.96.0 Attachments: 5861-v4.patch, 5861.txt, hbase-5861-jon.patch, hbase-5861-v2.patch, hbase-5861-v3.patch When attempting to compile HBase 0.94rc1 against hadoop 23, I got this set of compilation error messages: {code} jon@swoop:~/proj/hbase-0.94$ mvn clean test -Dhadoop.profile=23 -DskipTests ... [INFO] [INFO] BUILD FAILURE [INFO] [INFO] Total time: 18.926s [INFO] Finished at: Mon Apr 23 10:38:47 PDT 2012 [INFO] Final Memory: 55M/555M [INFO] [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile (default-testCompile) on project hbase: Compilation failure: Compilation failure: [ERROR] /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[147,46] org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated [ERROR] [ERROR] /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[153,29] org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated [ERROR] [ERROR] /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[194,46] org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated [ERROR] [ERROR] /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[206,29] org.apache.hadoop.mapreduce.JobContext is abstract; cannot 
be instantiated [ERROR] [ERROR] /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[213,29] org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated [ERROR] [ERROR] /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[226,29] org.apache.hadoop.mapreduce.TaskAttemptContext is abstract; cannot be instantiated [ERROR] - [Help 1] {code} Upon further investigation this issue is due to code introduced in HBASE-5064 and is also present in trunk. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5848) Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort
[ https://issues.apache.org/jira/browse/HBASE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261153#comment-13261153 ]

Zhihong Yu commented on HBASE-5848:
---

What about the root cause Ram identified: https://issues.apache.org/jira/browse/HBASE-5848?focusedCommentId=13259411&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13259411 ?
A hacker can call master.createTable(desc, splitKeys) directly, bypassing HBaseAdmin. Right?

Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort
Key: HBASE-5848
URL: https://issues.apache.org/jira/browse/HBASE-5848
Project: HBase
Issue Type: Bug
Components: client
Reporter: Lars Hofhansl
Assignee: ramkrishna.s.vasudevan
Priority: Minor
Fix For: 0.94.0, 0.96.0
Attachments: 5848-addendum-v2.txt, 5848-addendum-v3.txt, 5848-addendum-v4.txt, 5848-addendum-v5.txt, HBASE-5848.patch, HBASE-5848.patch, HBASE-5848_0.94.patch, HBASE-5848_addendum.patch

A coworker of mine just had this scenario. It does not make sense the EMPTY_START_ROW as splitKey (since the region with the empty start key is implicit), but it should not cause the HMaster to abort.
The abort happens because it tries to bulk assign the same region twice and then runs into race conditions with ZK.
The same would (presumably) happen when two identical split keys are passed, but the client blocks that.
The simplest solution here is to also block passed null or EMPTY_START_ROW as split key by the client.
[jira] [Commented] (HBASE-5862) After Region Close remove the Operation Metrics.
[ https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261177#comment-13261177 ]

Zhihong Yu commented on HBASE-5862:
---

{code}
@SuppressWarnings("unused")
private RegionServerDynamicMetrics dynamicMetrics;
{code}
I tried to find out how HRegionServer.dynamicMetrics is used but wasn't able to.
{code}
+//Clear all of the dynamic metrics as they are now probably useless
+this.dynamicMetrics.clear();
{code}
Only encodedName is removed. Why do we clear dynamicMetrics?
{code}
+ } catch (SecurityException e) {
+LOG.debug("Unable to clear metricsRecord");
{code}
We don't need to stumble over the same exception(s) again and again. Why not set a boolean to indicate that reflection shouldn't be used in the future?
{code}
+if (this.recordMetricMapField != null || this.registryMetricMapField != null) {
+ try {
{code}
Please separate the above two conditions into two if blocks.
{code}
+import com.google.common.collect.Multiset.Entry;
{code}
Is the above import used?

It's nice to have a test.

After Region Close remove the Operation Metrics.
Key: HBASE-5862
URL: https://issues.apache.org/jira/browse/HBASE-5862
Project: HBase
Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Minor
Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch

If a region is closed then Hadoop metrics shouldn't still be reporting about that region.
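The suggestion above to set a boolean once reflection has failed, so the same exception is not hit on every call, can be sketched as follows. `MetricsHolder` and `ReflectiveCleaner` are hypothetical stand-ins written for this illustration; they are not the classes touched by the patch, and the field name passed in is whatever the caller assumes exists.

```java
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an object whose private map is only reachable via reflection.
class MetricsHolder {
  private final Map<String, Long> metricMap = new HashMap<>();

  void put(String name, long value) { metricMap.put(name, value); }

  int size() { return metricMap.size(); }
}

class ReflectiveCleaner {
  private final String fieldName;
  private boolean reflectionDisabled = false; // set after the first failure
  private int attempts = 0;                   // counts actual reflective lookups

  ReflectiveCleaner(String fieldName) { this.fieldName = fieldName; }

  // Removes key from the target's map field; returns true if the map was reached.
  boolean remove(Object target, String key) {
    if (reflectionDisabled) {
      return false; // do not stumble over the same failure again
    }
    attempts++;
    try {
      Field field = target.getClass().getDeclaredField(fieldName);
      field.setAccessible(true);
      Object value = field.get(target);
      if (value instanceof Map) {
        ((Map<?, ?>) value).remove(key);
        return true;
      }
      reflectionDisabled = true;
      return false;
    } catch (ReflectiveOperationException | RuntimeException e) {
      // Covers NoSuchFieldException, IllegalAccessException, SecurityException.
      reflectionDisabled = true;
      return false;
    }
  }

  int attempts() { return attempts; }
}
```

Catching `RuntimeException` this broadly is a simplification for the sketch; a real change would likely log the first failure once and narrow the catch.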
[jira] [Commented] (HBASE-5848) Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort
[ https://issues.apache.org/jira/browse/HBASE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261182#comment-13261182 ] Zhihong Yu commented on HBASE-5848: --- Addendum v6 looks good. BTW this is the highest numbered addendum I have ever worked with :-) Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort Key: HBASE-5848 URL: https://issues.apache.org/jira/browse/HBASE-5848 Project: HBase Issue Type: Bug Components: client Reporter: Lars Hofhansl Assignee: ramkrishna.s.vasudevan Priority: Minor Fix For: 0.94.0, 0.96.0 Attachments: 5848-addendum-v2.txt, 5848-addendum-v3.txt, 5848-addendum-v4.txt, 5848-addendum-v5.txt, 5848-addendum-v6.txt, HBASE-5848.patch, HBASE-5848.patch, HBASE-5848_0.94.patch, HBASE-5848_addendum.patch A coworker of mine just had this scenario. It does not make sense the EMPTY_START_ROW as splitKey (since the region with the empty start key is implicit), but it should not cause the HMaster to abort. The abort happens because it tries to bulk assign the same region twice and then runs into race conditions with ZK. The same would (presumably) happen when two identical split keys are passed, but the client blocks that. The simplest solution here is to also block passed null or EMPTY_START_ROW as split key by the client. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5872) Improve hadoopqa script to include checks for hadoop 0.23 build
[ https://issues.apache.org/jira/browse/HBASE-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261210#comment-13261210 ] Zhihong Yu commented on HBASE-5872: --- {code} + echo "$MVN clean test -DskipTests -D${PROJECT_NAME}PatchProcess > $PATCH_DIR/trunkJavacWarnings.txt 2>&1" {code} I am confused by the above: if tests are skipped, why is the test target specified? Improve hadoopqa script to include checks for hadoop 0.23 build --- Key: HBASE-5872 URL: https://issues.apache.org/jira/browse/HBASE-5872 Project: HBase Issue Type: New Feature Affects Versions: 0.96.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: hbase-5872.patch There have been a few patches that have made it into hbase trunk that have broken the compile of hbase against hadoop 0.23.x, without being known for a few days. We could have the bot do a few things: 1) verify that patch compiles against hadoop 23 2) verify that unit tests pass against hadoop 23 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5872) Improve hadoopqa script to include checks for hadoop 0.23 build
[ https://issues.apache.org/jira/browse/HBASE-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261213#comment-13261213 ] Zhihong Yu commented on HBASE-5872: --- Actually https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK-on-Hadoop-23 is built daily. We just need to observe the compilation error there. Improve hadoopqa script to include checks for hadoop 0.23 build --- Key: HBASE-5872 URL: https://issues.apache.org/jira/browse/HBASE-5872 Project: HBase Issue Type: New Feature Affects Versions: 0.96.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: hbase-5872.patch There have been a few patches that have made it into hbase trunk that have broken the compile of hbase against hadoop 0.23.x, without being known for a few days. We could have the bot do a few things: 1) verify that patch compiles against hadoop 23 2) verify that unit tests pass against hadoop 23 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5862) After Region Close remove the Operation Metrics.
[ https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261238#comment-13261238 ] Zhihong Yu commented on HBASE-5862: --- MetricsRecord has become an interface in MRv2. Please introduce Shim to make the solution work for both hadoop 1.0 and 2.0 After Region Close remove the Operation Metrics. Key: HBASE-5862 URL: https://issues.apache.org/jira/browse/HBASE-5862 Project: HBase Issue Type: Improvement Reporter: Elliott Clark Assignee: Elliott Clark Priority: Minor Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch, HBASE-5862-2.patch, HBASE-5862-3.patch If a region is closed then Hadoop metrics shouldn't still be reporting about that region. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5861) Hadoop 23 compilation broken due to tests introduced in HBASE-5604
[ https://issues.apache.org/jira/browse/HBASE-5861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5861: -- Summary: Hadoop 23 compilation broken due to tests introduced in HBASE-5604 (was: Hadoop 23 compile broken due to tests introduced in HBASE-5064 ) Hadoop 23 compilation broken due to tests introduced in HBASE-5604 -- Key: HBASE-5861 URL: https://issues.apache.org/jira/browse/HBASE-5861 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.94.0, 0.96.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Priority: Blocker Fix For: 0.94.0, 0.96.0 Attachments: 5861-v4.patch, 5861.txt, hbase-5861-jon.patch, hbase-5861-v2.patch, hbase-5861-v3.patch When attempting to compile HBase 0.94rc1 against hadoop 23, I got this set of compilation error messages: {code} jon@swoop:~/proj/hbase-0.94$ mvn clean test -Dhadoop.profile=23 -DskipTests ... [INFO] [INFO] BUILD FAILURE [INFO] [INFO] Total time: 18.926s [INFO] Finished at: Mon Apr 23 10:38:47 PDT 2012 [INFO] Final Memory: 55M/555M [INFO] [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile (default-testCompile) on project hbase: Compilation failure: Compilation failure: [ERROR] /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[147,46] org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated [ERROR] [ERROR] /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[153,29] org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated [ERROR] [ERROR] /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[194,46] org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated [ERROR] [ERROR] /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[206,29] org.apache.hadoop.mapreduce.JobContext is 
abstract; cannot be instantiated [ERROR] [ERROR] /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[213,29] org.apache.hadoop.mapreduce.JobContext is abstract; cannot be instantiated [ERROR] [ERROR] /home/jon/proj/hbase-0.94/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java:[226,29] org.apache.hadoop.mapreduce.TaskAttemptContext is abstract; cannot be instantiated [ERROR] - [Help 1] {code} Upon further investigation this issue is due to code introduced in HBASE-5604 and is also present in trunk. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found
[ https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5870: -- Summary: Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found (was: Hadoop 23 compile broken because can't find HBaseTestingUtility#getJobTracker() method) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found - Key: HBASE-5870 URL: https://issues.apache.org/jira/browse/HBASE-5870 Project: HBase Issue Type: Bug Affects Versions: 0.96.0 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.96.0 After HBASE-5861 on 0.94 we are left with this issue on trunk. {code} $ mvn clean test -PlocalTests -DskipTests -Dhadoop.profile=23 ... [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile (default-testCompile) on project hbase: Compilation failure [ERROR] /home/jon/proj/hbase-svn/hbase/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35] cannot find symbol [ERROR] symbol : method getJobTracker() [ERROR] location: class org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner [ERROR] - [Help 1] {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5870) Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found
[ https://issues.apache.org/jira/browse/HBASE-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261254#comment-13261254 ] Zhihong Yu commented on HBASE-5870: --- https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK-on-Hadoop-23/136/console was the last build which didn't show this compilation error. Hadoop 23 compilation broken because JobTrackerRunner#getJobTracker() method is not found - Key: HBASE-5870 URL: https://issues.apache.org/jira/browse/HBASE-5870 Project: HBase Issue Type: Bug Affects Versions: 0.96.0 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.96.0 After HBASE-5861 on 0.94 we are left with this issue on trunk. {code} $ mvn clean test -PlocalTests -DskipTests -Dhadoop.profile=23 ... [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile (default-testCompile) on project hbase: Compilation failure [ERROR] /home/jon/proj/hbase-svn/hbase/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1333,35] cannot find symbol [ERROR] symbol : method getJobTracker() [ERROR] location: class org.apache.hadoop.mapred.MiniMRCluster.JobTrackerRunner [ERROR] - [Help 1] {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5872) Improve hadoopqa script to include checks for hadoop 0.23 build
[ https://issues.apache.org/jira/browse/HBASE-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261291#comment-13261291 ] Zhihong Yu commented on HBASE-5872: --- Thanks for the explanation, Jon. I think the patch cannot be tested by Hadoop QA. Otherwise the QA report should have contained compilation errors instead of failed tests. I am fine with checking in the patch after HBASE-5870 is fixed - otherwise all Hadoop QA reports would only contain compilation errors :-) Improve hadoopqa script to include checks for hadoop 0.23 build --- Key: HBASE-5872 URL: https://issues.apache.org/jira/browse/HBASE-5872 Project: HBase Issue Type: New Feature Affects Versions: 0.96.0 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh Attachments: hbase-5872.patch There have been a few patches that have made it into hbase trunk that have broken the compile of hbase against hadoop 0.23.x, without being known for a few days. We could have the bot do a few things: 1) verify that patch compiles against hadoop 23 2) verify that unit tests pass against hadoop 23 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261308#comment-13261308 ] Zhihong Yu commented on HBASE-5849: --- I looped TestClusterBootOrder five times with patch v4 and didn't see the test hang. On first cluster startup, RS aborts if root znode is not available -- Key: HBASE-5849 URL: https://issues.apache.org/jira/browse/HBASE-5849 Project: HBase Issue Type: Bug Components: master, regionserver, zookeeper Affects Versions: 0.92.2, 0.96.0, 0.94.1 Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 0.92.2, 0.94.0 Attachments: 5849v3.txt, HBASE-5849_v1.patch, HBASE-5849_v2.patch, HBASE-5849_v4-0.92.patch, HBASE-5849_v4.patch, HBASE-5849_v4.patch, HBASE-5849_v4.patch When launching a fresh new cluster, the master has to be started first, which might create race conditions when starting master and rs at the same time. Master startup code is something like this: - establish zk connection - create root znodes in zk (/hbase) - create ephemeral node for master /hbase/master. Region server startup code is something like this: - establish zk connection - check whether the root znode (/hbase) is there. If not, shut down. - wait for the master to create znodes /hbase/master. So, the problem is that on the very first launch of the cluster, the RS aborts at startup since the /hbase znode might not have been created yet (only the master creates it if needed). Since /hbase/ is not deleted on cluster shutdown, on subsequent cluster starts, it does not matter in which order the servers are started. So this affects only the first launch. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
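One way to picture a fix for the startup race described above is to have the region server poll for the root znode for a bounded number of attempts instead of shutting down on the first miss. The sketch below is an assumption about the general shape of such a wait loop, not the content of the HBASE-5849 patches; ZnodeWait, waitForZnode, and the BooleanSupplier that stands in for a ZooKeeper existence check are all hypothetical.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical sketch: wait for a znode to appear instead of aborting at once. */
public class ZnodeWait {

  /**
   * Polls the supplied check until it succeeds or maxAttempts is used up.
   * The supplier stands in for a real "does /hbase exist?" ZooKeeper call.
   * Returns true if the znode was seen, false if we gave up or were interrupted.
   */
  public static boolean waitForZnode(BooleanSupplier znodeExists,
                                     int maxAttempts,
                                     long sleepMillis) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      if (znodeExists.getAsBoolean()) {
        return true;
      }
      if (attempt < maxAttempts) {
        try {
          Thread.sleep(sleepMillis);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt(); // preserve the interrupt status
          return false;
        }
      }
    }
    return false;
  }
}
```

A caller would abort only when waitForZnode returns false, so a region server started slightly ahead of the master would get a chance to see /hbase once the master creates it.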
[jira] [Commented] (HBASE-5732) Remove the SecureRPCEngine and merge the security-related logic in the core engine
[ https://issues.apache.org/jira/browse/HBASE-5732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13261318#comment-13261318 ] Zhihong Yu commented on HBASE-5732: --- Now that security profile is gone in patch v2, the build would be intrinsically secure HBase ? Remove the SecureRPCEngine and merge the security-related logic in the core engine -- Key: HBASE-5732 URL: https://issues.apache.org/jira/browse/HBASE-5732 Project: HBase Issue Type: Improvement Reporter: Devaraj Das Assignee: Devaraj Das Attachments: rpcengine-merge.3.patch, rpcengine-merge.patch Remove the SecureRPCEngine and merge the security-related logic in the core engine. Follow up to HBASE-5727. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-5787) Table owner can't disable/delete its own table
[ https://issues.apache.org/jira/browse/HBASE-5787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu resolved HBASE-5787. --- Resolution: Fixed Table owner can't disable/delete its own table -- Key: HBASE-5787 URL: https://issues.apache.org/jira/browse/HBASE-5787 Project: HBase Issue Type: Bug Components: security Affects Versions: 0.92.1, 0.94.0, 0.96.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Minor Labels: acl, security Fix For: 0.92.2, 0.96.0, 0.94.1 Attachments: HBASE-5787-tests-wrong-names.patch, HBASE-5787-v0.patch, HBASE-5787-v1.patch A user with CREATE privileges can create a table, but cannot disable it, because the disable operation requires ADMIN privileges. Also, if a table is already disabled, anyone can remove it. {code} public void preDeleteTable(ObserverContext<MasterCoprocessorEnvironment> c, byte[] tableName) throws IOException { requirePermission(Permission.Action.CREATE); } public void preDisableTable(ObserverContext<MasterCoprocessorEnvironment> c, byte[] tableName) throws IOException { /* TODO: Allow for users with global CREATE permission and the table owner */ requirePermission(Permission.Action.ADMIN); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5787) Table owner can't disable/delete his/her own table
[ https://issues.apache.org/jira/browse/HBASE-5787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5787: -- Summary: Table owner can't disable/delete his/her own table (was: Table owner can't disable/delete its own table) Table owner can't disable/delete his/her own table -- Key: HBASE-5787 URL: https://issues.apache.org/jira/browse/HBASE-5787 Project: HBase Issue Type: Bug Components: security Affects Versions: 0.92.1, 0.94.0, 0.96.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Minor Labels: acl, security Fix For: 0.92.2, 0.96.0, 0.94.1 Attachments: HBASE-5787-tests-wrong-names.patch, HBASE-5787-v0.patch, HBASE-5787-v1.patch A user with CREATE privileges can create a table, but cannot disable it, because the disable operation requires ADMIN privileges. Also, if a table is already disabled, anyone can remove it. {code} public void preDeleteTable(ObserverContext<MasterCoprocessorEnvironment> c, byte[] tableName) throws IOException { requirePermission(Permission.Action.CREATE); } public void preDisableTable(ObserverContext<MasterCoprocessorEnvironment> c, byte[] tableName) throws IOException { /* TODO: Allow for users with global CREATE permission and the table owner */ requirePermission(Permission.Action.ADMIN); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5621) Convert admin protocol of HRegionInterface to PB
[ https://issues.apache.org/jira/browse/HBASE-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13259709#comment-13259709 ] Zhihong Yu commented on HBASE-5621: --- Can you list the tests that failed on your machine ? For security profile, TestProcessBasedCluster.testProcessBasedCluster fails consistently and is tracked by HBASE-5851. Other than that test, we should be careful. Convert admin protocol of HRegionInterface to PB Key: HBASE-5621 URL: https://issues.apache.org/jira/browse/HBASE-5621 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 0.96.0 Attachments: hbase-5621_v3.patch, hbase_5621_v4.patch, hbase_5621_v4.patch, hbase_5621_v5.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5848) Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort
[ https://issues.apache.org/jira/browse/HBASE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13259740#comment-13259740 ] Zhihong Yu commented on HBASE-5848: --- According to http://docs.oracle.com/javase/6/docs/api/java/lang/System.html#nanoTime%28%29: "This method can only be used to measure elapsed time and is not related to any other notion of system or wall-clock time." So we shouldn't use nanoTime(). Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort Key: HBASE-5848 URL: https://issues.apache.org/jira/browse/HBASE-5848 Project: HBase Issue Type: Bug Components: client Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Minor A coworker of mine just had this scenario. It does not make sense to pass EMPTY_START_ROW as a splitKey (since the region with the empty start key is implicit), but it should not cause the HMaster to abort. The abort happens because it tries to bulk assign the same region twice and then runs into race conditions with ZK. The same would (presumably) happen when two identical split keys are passed, but the client blocks that. The simplest solution here is for the client to also block null or EMPTY_START_ROW passed as a split key. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
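The distinction drawn in the comment above, between elapsed time and wall-clock time, can be shown in a few lines. The comment does not say where the patch called nanoTime(), so this is only a generic sketch; the class and method names (ClockChoice, elapsedMillis, wallClockMillis) are hypothetical.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: which JDK clock to use for which purpose. */
public class ClockChoice {

  /**
   * Elapsed time between two System.nanoTime() readings taken in the same JVM.
   * Only the difference is meaningful; a single reading is not a date.
   */
  public static long elapsedMillis(long startNanos, long endNanos) {
    return TimeUnit.NANOSECONDS.toMillis(endNanos - startNanos);
  }

  /** Wall-clock timestamp: milliseconds since the Unix epoch. */
  public static long wallClockMillis() {
    return System.currentTimeMillis();
  }
}
```

Anything that must be comparable across processes or stored as a timestamp would use the wall-clock value, while the nanoTime() difference is suited to timing a local operation.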
[jira] [Updated] (HBASE-5826) Improve sync of HLog edits
[ https://issues.apache.org/jira/browse/HBASE-5826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5826: -- Attachment: 5826.txt Todd's patch, for trunk. Improve sync of HLog edits -- Key: HBASE-5826 URL: https://issues.apache.org/jira/browse/HBASE-5826 Project: HBase Issue Type: Improvement Reporter: Zhihong Yu Attachments: 5826.txt HBASE-5782 solved the correctness issue for the sync of HLog edits. Todd provided a patch that would achieve higher throughput. This JIRA is a continuation of Todd's work submitted there. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5826) Improve sync of HLog edits
[ https://issues.apache.org/jira/browse/HBASE-5826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5826: -- Status: Patch Available (was: Open) Improve sync of HLog edits -- Key: HBASE-5826 URL: https://issues.apache.org/jira/browse/HBASE-5826 Project: HBase Issue Type: Improvement Reporter: Zhihong Yu Attachments: 5826.txt HBASE-5782 solved the correctness issue for the sync of HLog edits. Todd provided a patch that would achieve higher throughput. This JIRA is a continuation of Todd's work submitted there. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5851) TestProcessBasedCluster sometimes fails
[ https://issues.apache.org/jira/browse/HBASE-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13259810#comment-13259810 ] Zhihong Yu commented on HBASE-5851: --- The test sometime failed in trunk build as well: https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2793/ TestProcessBasedCluster sometimes fails --- Key: HBASE-5851 URL: https://issues.apache.org/jira/browse/HBASE-5851 Project: HBase Issue Type: Test Reporter: Zhihong Yu Assignee: Jimmy Xiang TestProcessBasedCluster failed in https://builds.apache.org/job/HBase-TRUNK-security/178 Looks like cluster failed to start: {code} 2012-04-21 14:22:32,666 INFO [Thread-1] util.ProcessBasedLocalHBaseCluster(176): Waiting for HBase to startup. Retries left: 2 java.io.IOException: Giving up trying to location region in meta: thread is interrupted. at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1173) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:956) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:917) at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:252) at org.apache.hadoop.hbase.client.HTable.init(HTable.java:192) at org.apache.hadoop.hbase.util.ProcessBasedLocalHBaseCluster.startHBase(ProcessBasedLocalHBaseCluster.java:174) at org.apache.hadoop.hbase.util.TestProcessBasedCluster.testProcessBasedCluster(TestProcessBasedCluster.java:56) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20) at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:62) java.lang.InterruptedException: sleep interrupted at java.lang.Thread.sleep(Native Method) at org.apache.hadoop.hbase.util.Threads.sleep(Threads.java:134) at org.apache.hadoop.hbase.util.ProcessBasedLocalHBaseCluster.startHBase(ProcessBasedLocalHBaseCluster.java:178) at org.apache.hadoop.hbase.util.TestProcessBasedCluster.testProcessBasedCluster(TestProcessBasedCluster.java:56) {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5848) Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort
[ https://issues.apache.org/jira/browse/HBASE-5848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13259829#comment-13259829 ] Zhihong Yu commented on HBASE-5848: --- +1 on patch. Create table with EMPTY_START_ROW passed as splitKey causes the HMaster to abort Key: HBASE-5848 URL: https://issues.apache.org/jira/browse/HBASE-5848 Project: HBase Issue Type: Bug Components: client Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Minor Attachments: HBASE-5848.patch A coworker of mine just had this scenario. It does not make sense to pass EMPTY_START_ROW as a splitKey (since the region with the empty start key is implicit), but it should not cause the HMaster to abort. The abort happens because it tries to bulk assign the same region twice and then runs into race conditions with ZK. The same would (presumably) happen when two identical split keys are passed, but the client blocks that. The simplest solution here is for the client to also block null or EMPTY_START_ROW passed as a split key. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
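The client-side check proposed in the description above (reject null or empty split keys, as well as duplicates) might look roughly like this. It is a sketch rather than the attached HBASE-5848.patch: SplitKeyCheck and its methods are hypothetical names, and it assumes EMPTY_START_ROW is a zero-length byte array.

```java
import java.util.Arrays;

/** Hypothetical sketch of a client-side split key sanity check. */
public class SplitKeyCheck {

  /** Unsigned lexicographic comparison of two byte arrays. */
  static int compare(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int diff = (a[i] & 0xff) - (b[i] & 0xff);
      if (diff != 0) {
        return diff;
      }
    }
    return a.length - b.length;
  }

  /**
   * Throws IllegalArgumentException if any split key is null, empty
   * (assumed equivalent to EMPTY_START_ROW), or repeated.
   */
  public static void validate(byte[][] splitKeys) {
    if (splitKeys == null) {
      return; // no split keys means a single-region table
    }
    for (byte[] key : splitKeys) {
      if (key == null || key.length == 0) {
        throw new IllegalArgumentException("Split key must not be null or empty");
      }
    }
    byte[][] sorted = splitKeys.clone(); // leave the caller's array order alone
    Arrays.sort(sorted, SplitKeyCheck::compare);
    for (int i = 1; i < sorted.length; i++) {
      if (Arrays.equals(sorted[i - 1], sorted[i])) {
        throw new IllegalArgumentException("Duplicate split key");
      }
    }
  }
}
```

Failing fast in the client keeps the request from reaching the master, which is where the double assignment and the resulting abort were reported.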
[jira] [Commented] (HBASE-5844) Delete the region servers znode after a regions server crash
[ https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13259914#comment-13259914 ] Zhihong Yu commented on HBASE-5844: --- {code} + " (Environment variable HBASE_ZNODE_FILE is no set)."); {code} 'is no set' should be 'is not set'. Delete the region servers znode after a regions server crash Key: HBASE-5844 URL: https://issues.apache.org/jira/browse/HBASE-5844 Project: HBase Issue Type: Improvement Components: regionserver, scripts Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Attachments: 5844.v1.patch, 5844.v2.patch Today, if the region server crashes, its znode is not deleted in ZooKeeper. So the recovery process will start only after a timeout, usually 30s. By deleting the znode in the start script, we remove this delay and the recovery starts immediately. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5699) Run with 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5699: -- Comment: was deleted (was: This seems interesting. I'll take a look at doing this.) Run with 1 WAL in HRegionServer - Key: HBASE-5699 URL: https://issues.apache.org/jira/browse/HBASE-5699 Project: HBase Issue Type: Improvement Reporter: binlijin Assignee: Li Pi -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5699) Run with 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260008#comment-13260008 ] Zhihong Yu commented on HBASE-5699: --- It was a duplicate message. Run with 1 WAL in HRegionServer - Key: HBASE-5699 URL: https://issues.apache.org/jira/browse/HBASE-5699 Project: HBase Issue Type: Improvement Reporter: binlijin Assignee: Li Pi -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5862) After Region Close remove the Operation Metrics.
[ https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260011#comment-13260011 ] Zhihong Yu commented on HBASE-5862: --- {code} + public void closeMetrics() { +for (String m:metricsPut) { {code} Please insert spaces around the colon above. {code} + RegionMetricsStorage.deleteTimeVaryingMetric(m); +} +this.metricsPut = new TreeSet<String>(); {code} Calling this.metricsPut.clear() should be enough. After Region Close remove the Operation Metrics. Key: HBASE-5862 URL: https://issues.apache.org/jira/browse/HBASE-5862 Project: HBase Issue Type: Improvement Reporter: Elliott Clark Assignee: Elliott Clark Priority: Minor Attachments: HBASE-5862-0.patch If a region is closed then Hadoop metrics shouldn't still be reporting about that region. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
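The two review points above (spaces around the colon in the for-each loop, and clear() instead of allocating a new TreeSet) could be applied along these lines. This is a simplified, hypothetical stand-in for the patched class: OperationMetricsSketch, recordPut, and the removed list (which replaces the real RegionMetricsStorage.deleteTimeVaryingMetric call) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of closeMetrics() using clear() on a final set. */
public class OperationMetricsSketch {
  // Because the set is cleared in place, the field can be final.
  private final Set<String> metricsPut = new TreeSet<String>();
  // Stand-in (assumption) for the metrics storage that entries are deleted from.
  private final List<String> removed = new ArrayList<String>();

  public void recordPut(String metricName) {
    metricsPut.add(metricName);
  }

  public void closeMetrics() {
    for (String m : metricsPut) {
      removed.add(m); // the real code would delete the time-varying metric here
    }
    metricsPut.clear(); // reuse the existing set instead of allocating a new one
  }

  public int trackedCount() {
    return metricsPut.size();
  }

  public List<String> getRemoved() {
    return removed;
  }
}
```

Clearing in place also means any other reference to the same set sees it emptied, which reassigning the field would not guarantee.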