[jira] [Commented] (HBASE-5635) If getTaskList() returns null splitlogWorker is down. It wont serve any requests.
[ https://issues.apache.org/jira/browse/HBASE-5635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13258414#comment-13258414 ] Lars Hofhansl commented on HBASE-5635: -- +1 on patch. I wonder how many more problems like this we have lurking in the worker threads. Can you do a 0.94.1 patch as well? If getTaskList() returns null splitlogWorker is down. It wont serve any requests. -- Key: HBASE-5635 URL: https://issues.apache.org/jira/browse/HBASE-5635 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.92.1 Reporter: Kristam Subba Swathi Attachments: HBASE-5635.1.patch, HBASE-5635.2.patch, HBASE-5635.patch During the hlog split operation if all the zookeepers are down ,then the paths will be returned as null and the splitworker thread wil be exited Now this regionserver wil not be able to acquire any other tasks since the splitworker thread is exited Please find the attached code for more details {code} private ListString getTaskList() { for (int i = 0; i zkretries; i++) { try { return (ZKUtil.listChildrenAndWatchForNewChildren(this.watcher, this.watcher.splitLogZNode)); } catch (KeeperException e) { LOG.warn(Could not get children of znode + this.watcher.splitLogZNode, e); try { Thread.sleep(1000); } catch (InterruptedException e1) { LOG.warn(Interrupted while trying to get task list ..., e1); Thread.currentThread().interrupt(); return null; } } } {code} in the org.apache.hadoop.hbase.regionserver.SplitLogWorker -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5547) Don't delete HFiles when in backup mode
[ https://issues.apache.org/jira/browse/HBASE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13257818#comment-13257818 ] Lars Hofhansl commented on HBASE-5547: -- For this I was actually thinking a single cluster-wide setting (This HBase cluster is now in backup mode), which would affect all tables would be sufficient. But could do per table as well as Jesse has done and it definitely does not need to be a custom znode, checking a value is fine. Don't delete HFiles when in backup mode - Key: HBASE-5547 URL: https://issues.apache.org/jira/browse/HBASE-5547 Project: HBase Issue Type: New Feature Reporter: Lars Hofhansl Assignee: Jesse Yates This came up in a discussion I had with Stack. It would be nice if HBase could be notified that a backup is in progress (via a znode for example) and in that case either: 1. rename HFiles to be delete to file.bck 2. rename the HFiles into a special directory 3. rename them to a general trash directory (which would not need to be tied to backup mode). That way it should be able to get a consistent backup based on HFiles (HDFS snapshots or hard links would be better options here, but we do not have those). #1 makes cleanup a bit harder. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487
[ https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256756#comment-13256756 ] Lars Hofhansl commented on HBASE-5782: -- For some reason I cannot run TestHLog locally. I always get: {code} Tests run: 0, Failures: 0, Errors: 0, Skipped: 0 {code} I can run all other tests that I tried (including some others marked with @LargeTests). Verified again with HLogPerformanceEvaluation manually. Should we put HBASE-5792 in 0.94 as well, so that we use this test? Edits can be appended out of seqid order since HBASE-4487 - Key: HBASE-5782 URL: https://issues.apache.org/jira/browse/HBASE-5782 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.94.0 Reporter: Gopinathan A Assignee: Lars Hofhansl Priority: Blocker Fix For: 0.94.0 Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782-v3.txt, 5782.txt, 5782.unfinished-stack.txt, 5782.unittest.txt, HBASE-5782.patch, hbase-5782.txt Create a table with 1000 splits, after the region assignemnt, kill the regionserver wich contains META table. Here few regions are missing after the log splitting and region assigment. HBCK report shows multiple region holes are got created. Same scenario was verified mulitple times in 0.92.1, no issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.
[ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256757#comment-13256757 ] Lars Hofhansl commented on HBASE-5545: -- @Ram: yes that is what I meant to say (recursive is not needed, and somebody might misunderstand later and use this to delete tmp directory). Not important. +1 on commit as is. region can't be opened for a long time. Because the creating File failed. - Key: HBASE-5545 URL: https://issues.apache.org/jira/browse/HBASE-5545 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.6 Reporter: gaojinchao Assignee: ramkrishna.s.vasudevan Fix For: 0.90.7, 0.92.2, 0.94.0 Attachments: HBASE-5545.patch, HBASE-5545.patch Scenario: 1. File is created 2. But while writing data, all datanodes might have crashed. So writing data will fail. 3. Now even if close is called in finally block, close also will fail and throw the Exception because writing data failed. 4. After this if RS try to create the same file again, then AlreadyBeingCreatedException will come. Suggestion to handle this scenario. --- 1. Check for the existence of the file, if exists delete the file and create new file. Here delete call for the file will not check whether the file is open or closed. Overwrite Option: 1. Overwrite option will be applicable if you are trying to overwrite a closed file. 2. If the file is not closed, then even with overwrite option Same AlreadyBeingCreatedException will be thrown. This is the expected behaviour to avoid the Multiple clients writing to same file. Region server logs: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /hbase/test1/12c01902324218d14b17a5880f24f64b/.tmp/.regioninfo for DFSClient_hb_rs_158-1-131-48,20020,1331107668635_1331107669061_-252463556_25 on client 158.1.132.19 because current leaseholder is trying to recreate file. at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1570) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1440) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1382) at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:658) at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:547) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1137) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1133) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1131) at org.apache.hadoop.ipc.Client.call(Client.java:961) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:245) at $Proxy6.create(Unknown Source) at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at $Proxy6.create(Unknown Source) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.init(DFSClient.java:3643) at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:778) at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:364) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:630) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:611) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:518) at org.apache.hadoop.hbase.regionserver.HRegion.checkRegioninfoOnFilesystem(HRegion.java:424) at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:340) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2672) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2658) at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:330) at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:116) at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:158) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) [2012-03-07 20:51:45,858] [WARN ]
[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487
[ https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256933#comment-13256933 ] Lars Hofhansl commented on HBASE-5782: -- Now note, that TestHLog still does not run anything locally (neither in 0.94 or trunk), there is something with that specific test it seems. Edits can be appended out of seqid order since HBASE-4487 - Key: HBASE-5782 URL: https://issues.apache.org/jira/browse/HBASE-5782 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.94.0 Reporter: Gopinathan A Assignee: Lars Hofhansl Priority: Blocker Fix For: 0.94.0 Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782-v3.txt, 5782.txt, 5782.unfinished-stack.txt, 5782.unittest.txt, HBASE-5782.patch, hbase-5782.txt Create a table with 1000 splits, after the region assignemnt, kill the regionserver wich contains META table. Here few regions are missing after the log splitting and region assigment. HBCK report shows multiple region holes are got created. Same scenario was verified mulitple times in 0.92.1, no issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.
[ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256938#comment-13256938 ] Lars Hofhansl commented on HBASE-5545: -- I'm going to commit this 0.94 and 0.96 in the next few minutes. region can't be opened for a long time. Because the creating File failed. - Key: HBASE-5545 URL: https://issues.apache.org/jira/browse/HBASE-5545 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.6 Reporter: gaojinchao Assignee: ramkrishna.s.vasudevan Fix For: 0.90.7, 0.92.2, 0.94.0 Attachments: HBASE-5545.patch, HBASE-5545.patch Scenario: 1. File is created 2. But while writing data, all datanodes might have crashed. So writing data will fail. 3. Now even if close is called in finally block, close also will fail and throw the Exception because writing data failed. 4. After this if RS try to create the same file again, then AlreadyBeingCreatedException will come. Suggestion to handle this scenario. --- 1. Check for the existence of the file, if exists delete the file and create new file. Here delete call for the file will not check whether the file is open or closed. Overwrite Option: 1. Overwrite option will be applicable if you are trying to overwrite a closed file. 2. If the file is not closed, then even with overwrite option Same AlreadyBeingCreatedException will be thrown. This is the expected behaviour to avoid the Multiple clients writing to same file. Region server logs: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /hbase/test1/12c01902324218d14b17a5880f24f64b/.tmp/.regioninfo for DFSClient_hb_rs_158-1-131-48,20020,1331107668635_1331107669061_-252463556_25 on client 158.1.132.19 because current leaseholder is trying to recreate file. at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1570) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1440) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1382) at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:658) at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:547) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1137) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1133) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1131) at org.apache.hadoop.ipc.Client.call(Client.java:961) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:245) at $Proxy6.create(Unknown Source) at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at $Proxy6.create(Unknown Source) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.init(DFSClient.java:3643) at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:778) at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:364) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:630) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:611) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:518) at org.apache.hadoop.hbase.regionserver.HRegion.checkRegioninfoOnFilesystem(HRegion.java:424) at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:340) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2672) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2658) at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:330) at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:116) at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:158) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) [2012-03-07 20:51:45,858] [WARN ] [RS_OPEN_REGION-158-1-131-48,20020,1331107668635-23] [com.huawei.isap.ump.ha.client.RPCRetryAndSwitchInvoker 131] Retrying the method call:
[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487
[ https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13257105#comment-13257105 ] Lars Hofhansl commented on HBASE-5782: -- Yes. Edits can be appended out of seqid order since HBASE-4487 - Key: HBASE-5782 URL: https://issues.apache.org/jira/browse/HBASE-5782 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.94.0 Reporter: Gopinathan A Assignee: Lars Hofhansl Priority: Blocker Fix For: 0.94.0 Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782-v3.txt, 5782.txt, 5782.unfinished-stack.txt, 5782.unittest.txt, HBASE-5782.patch, hbase-5782.txt Create a table with 1000 splits, after the region assignemnt, kill the regionserver wich contains META table. Here few regions are missing after the log splitting and region assigment. HBCK report shows multiple region holes are got created. Same scenario was verified mulitple times in 0.92.1, no issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5547) Don't delete HFiles when in backup mode
[ https://issues.apache.org/jira/browse/HBASE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13257114#comment-13257114 ] Lars Hofhansl commented on HBASE-5547: -- What I had in mind when I filed this was something quite simple: A single znode that all RegionServers would check when deleting any HFile and then based on existence or value of that znode either delete the HFile or rename it. Of things are never quite as simple... To get a guaranteed consistent snapshot the RegionServers need to check for the znode's value synchronously in the delete path (or at least I see no other way). Otherwise there are times when the RegionServers do not agree and some files will be deleted and some will be backed up with no possibility for the client to know exactly as of when the backup would be consistent. Since HFiles are deleted as result of a compaction in an asynchronous thread, synchronously checking the znode should not cause performance issues, unless we fear we'll overload ZK. This simple solution would add special code for just this scenario, which is bad. At the same time it would be relatively simple (famous last words), so that's something to weigh. Don't delete HFiles when in backup mode - Key: HBASE-5547 URL: https://issues.apache.org/jira/browse/HBASE-5547 Project: HBase Issue Type: New Feature Reporter: Lars Hofhansl Assignee: Jesse Yates This came up in a discussion I had with Stack. It would be nice if HBase could be notified that a backup is in progress (via a znode for example) and in that case either: 1. rename HFiles to be delete to file.bck 2. rename the HFiles into a special directory 3. rename them to a general trash directory (which would not need to be tied to backup mode). That way it should be able to get a consistent backup based on HFiles (HDFS snapshots or hard links would be better options here, but we do not have those). #1 makes cleanup a bit harder. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.
[ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255714#comment-13255714 ] Lars Hofhansl commented on HBASE-5545: -- If this gets done before HBASE-5782 I'll pull it in, otherwise I don't think I should hold up the release for this. Sounds good? region can't be opened for a long time. Because the creating File failed. - Key: HBASE-5545 URL: https://issues.apache.org/jira/browse/HBASE-5545 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.6 Reporter: gaojinchao Assignee: gaojinchao Fix For: 0.90.7, 0.92.2, 0.94.1 Scenario: 1. File is created 2. But while writing data, all datanodes might have crashed. So writing data will fail. 3. Now even if close is called in finally block, close also will fail and throw the Exception because writing data failed. 4. After this if RS try to create the same file again, then AlreadyBeingCreatedException will come. Suggestion to handle this scenario. --- 1. Check for the existence of the file, if exists delete the file and create new file. Here delete call for the file will not check whether the file is open or closed. Overwrite Option: 1. Overwrite option will be applicable if you are trying to overwrite a closed file. 2. If the file is not closed, then even with overwrite option Same AlreadyBeingCreatedException will be thrown. This is the expected behaviour to avoid the Multiple clients writing to same file. Region server logs: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /hbase/test1/12c01902324218d14b17a5880f24f64b/.tmp/.regioninfo for DFSClient_hb_rs_158-1-131-48,20020,1331107668635_1331107669061_-252463556_25 on client 158.1.132.19 because current leaseholder is trying to recreate file. at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1570) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1440) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1382) at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:658) at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:547) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1137) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1133) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1131) at org.apache.hadoop.ipc.Client.call(Client.java:961) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:245) at $Proxy6.create(Unknown Source) at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at $Proxy6.create(Unknown Source) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.init(DFSClient.java:3643) at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:778) at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:364) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:630) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:611) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:518) at org.apache.hadoop.hbase.regionserver.HRegion.checkRegioninfoOnFilesystem(HRegion.java:424) at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:340) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2672) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2658) at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:330) at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:116) at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:158) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) [2012-03-07 20:51:45,858] [WARN ] [RS_OPEN_REGION-158-1-131-48,20020,1331107668635-23] [com.huawei.isap.ump.ha.client.RPCRetryAndSwitchInvoker 131] Retrying the method call: public
[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487
[ https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255770#comment-13255770 ] Lars Hofhansl commented on HBASE-5782: -- Yep. It's crucial that syncTillHere is updated last. The pendingWrites are appended strictly in order, so there is a very short race that might lead to sync be issued multiple time when only one was necessary (it seems the same race condition existed before). So I think it is safe and low risk. The question now is: Do this for 0.94 and then a more elaborate rewrite in trunk, or do a more risky rewrite for 0.94? Edits can be appended out of seqid order since HBASE-4487 - Key: HBASE-5782 URL: https://issues.apache.org/jira/browse/HBASE-5782 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.94.0 Reporter: Gopinathan A Assignee: ramkrishna.s.vasudevan Priority: Blocker Fix For: 0.94.0 Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782.txt, 5782.unfinished-stack.txt, HBASE-5782.patch, hbase-5782.txt Create a table with 1000 splits, after the region assignemnt, kill the regionserver wich contains META table. Here few regions are missing after the log splitting and region assigment. HBCK report shows multiple region holes are got created. Same scenario was verified mulitple times in 0.92.1, no issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.
[ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255779#comment-13255779 ] Lars Hofhansl commented on HBASE-5545: -- +1 on patch. I assume there no strange race conditions in this part of the code. region can't be opened for a long time. Because the creating File failed. - Key: HBASE-5545 URL: https://issues.apache.org/jira/browse/HBASE-5545 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.6 Reporter: gaojinchao Assignee: gaojinchao Fix For: 0.90.7, 0.92.2, 0.94.0 Attachments: HBASE-5545.patch Scenario: 1. File is created 2. But while writing data, all datanodes might have crashed. So writing data will fail. 3. Now even if close is called in finally block, close also will fail and throw the Exception because writing data failed. 4. After this if RS try to create the same file again, then AlreadyBeingCreatedException will come. Suggestion to handle this scenario. --- 1. Check for the existence of the file, if exists delete the file and create new file. Here delete call for the file will not check whether the file is open or closed. Overwrite Option: 1. Overwrite option will be applicable if you are trying to overwrite a closed file. 2. If the file is not closed, then even with overwrite option Same AlreadyBeingCreatedException will be thrown. This is the expected behaviour to avoid the Multiple clients writing to same file. Region server logs: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /hbase/test1/12c01902324218d14b17a5880f24f64b/.tmp/.regioninfo for DFSClient_hb_rs_158-1-131-48,20020,1331107668635_1331107669061_-252463556_25 on client 158.1.132.19 because current leaseholder is trying to recreate file. at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1570) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1440) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1382) at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:658) at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:547) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1137) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1133) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1131) at org.apache.hadoop.ipc.Client.call(Client.java:961) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:245) at $Proxy6.create(Unknown Source) at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at $Proxy6.create(Unknown Source) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.init(DFSClient.java:3643) at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:778) at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:364) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:630) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:611) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:518) at org.apache.hadoop.hbase.regionserver.HRegion.checkRegioninfoOnFilesystem(HRegion.java:424) at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:340) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2672) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2658) at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:330) at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:116) at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:158) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) [2012-03-07 20:51:45,858] [WARN ] [RS_OPEN_REGION-158-1-131-48,20020,1331107668635-23] [com.huawei.isap.ump.ha.client.RPCRetryAndSwitchInvoker 131] Retrying the method call: public abstract
[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487
[ https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255828#comment-13255828 ] Lars Hofhansl commented on HBASE-5782: -- Anybody opposed to do my patch for 0.94 and do a rewrite in trunk? Todd? Stack? Ted? Edits can be appended out of seqid order since HBASE-4487 - Key: HBASE-5782 URL: https://issues.apache.org/jira/browse/HBASE-5782 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.94.0 Reporter: Gopinathan A Assignee: ramkrishna.s.vasudevan Priority: Blocker Fix For: 0.94.0 Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782.txt, 5782.unfinished-stack.txt, HBASE-5782.patch, hbase-5782.txt Create a table with 1000 splits, after the region assignemnt, kill the regionserver wich contains META table. Here few regions are missing after the log splitting and region assigment. HBCK report shows multiple region holes are got created. Same scenario was verified mulitple times in 0.92.1, no issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487
[ https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255911#comment-13255911 ] Lars Hofhansl commented on HBASE-5782: -- @Stack: the doneUpTo is fine, because it is only used to set syncedTillHere. We won't write more into the log (once we take the pendingWrites they are gone), but we might sync too much unnecessarily. Will do the (performance) testing now. I don't have a cluster at my disposal atm, so I'll do it with a local HDFS instance. If we'd rather finish Todd's for 0.94 that'd be nice. Edits can be appended out of seqid order since HBASE-4487 - Key: HBASE-5782 URL: https://issues.apache.org/jira/browse/HBASE-5782 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.94.0 Reporter: Gopinathan A Assignee: ramkrishna.s.vasudevan Priority: Blocker Fix For: 0.94.0 Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782.txt, 5782.unfinished-stack.txt, HBASE-5782.patch, hbase-5782.txt Create a table with 1000 splits, after the region assignemnt, kill the regionserver wich contains META table. Here few regions are missing after the log splitting and region assigment. HBCK report shows multiple region holes are got created. Same scenario was verified mulitple times in 0.92.1, no issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487
[ https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255939#comment-13255939 ] Lars Hofhansl commented on HBASE-5782: -- Ah, I see. Yes, doneUpto line should be pulled into the synchronized block. Edits can be appended out of seqid order since HBASE-4487 - Key: HBASE-5782 URL: https://issues.apache.org/jira/browse/HBASE-5782 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.94.0 Reporter: Gopinathan A Assignee: ramkrishna.s.vasudevan Priority: Blocker Fix For: 0.94.0 Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782.txt, 5782.unfinished-stack.txt, HBASE-5782.patch, hbase-5782.txt Create a table with 1000 splits, after the region assignemnt, kill the regionserver wich contains META table. Here few regions are missing after the log splitting and region assigment. HBCK report shows multiple region holes are got created. Same scenario was verified mulitple times in 0.92.1, no issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487
[ https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255974#comment-13255974 ] Lars Hofhansl commented on HBASE-5782: -- @Stack: I ran it too. It works fine. The interesting thing I find that it is faster *with* the patch! And that scares me. I ran with 100 threads, 1 iterations. With the patch it took 29s, without it took 43s. This is on a 6 core machine with hyper threading. Now, the HLogPerformanceEvaluation does not work with -nosync and -verify, right? (presumably because no final sync is issued). Wouldn't mind to get some other numbers from you as well if you had some time. Edits can be appended out of seqid order since HBASE-4487 - Key: HBASE-5782 URL: https://issues.apache.org/jira/browse/HBASE-5782 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.94.0 Reporter: Gopinathan A Assignee: ramkrishna.s.vasudevan Priority: Blocker Fix For: 0.94.0 Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782.txt, 5782.unfinished-stack.txt, HBASE-5782.patch, hbase-5782.txt Create a table with 1000 splits, after the region assignemnt, kill the regionserver wich contains META table. Here few regions are missing after the log splitting and region assigment. HBCK report shows multiple region holes are got created. Same scenario was verified mulitple times in 0.92.1, no issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487
[ https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256058#comment-13256058 ] Lars Hofhansl commented on HBASE-5782: -- Thanks Stack! I'll run a test with fewer threads too, just to make sure, the fact that both of our patches are faster probably means that we did a lot of unnecessary sync'ing before...? 20% performance increase is pretty damn awesome. Edits can be appended out of seqid order since HBASE-4487 - Key: HBASE-5782 URL: https://issues.apache.org/jira/browse/HBASE-5782 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.94.0 Reporter: Gopinathan A Assignee: ramkrishna.s.vasudevan Priority: Blocker Fix For: 0.94.0 Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782.txt, 5782.unfinished-stack.txt, HBASE-5782.patch, hbase-5782.txt Create a table with 1000 splits, after the region assignemnt, kill the regionserver wich contains META table. Here few regions are missing after the log splitting and region assigment. HBCK report shows multiple region holes are got created. Same scenario was verified mulitple times in 0.92.1, no issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487
[ https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256072#comment-13256072 ] Lars Hofhansl commented on HBASE-5782: -- Did some more tests with HLogPerformanceEvaluation (and my patch): 10 threads, 10 iterations: with patch: 42s, without patch: 41s 5 threads, 20 iterations: with patch: 46s, without patch: 44s 2 threads, 20 iterations: with patch: 46s, without patch: 44s So for fewer threads my patch is slightly slower. So... What do we do? :) Edits can be appended out of seqid order since HBASE-4487 - Key: HBASE-5782 URL: https://issues.apache.org/jira/browse/HBASE-5782 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.94.0 Reporter: Gopinathan A Assignee: ramkrishna.s.vasudevan Priority: Blocker Fix For: 0.94.0 Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782.txt, 5782.unfinished-stack.txt, HBASE-5782.patch, hbase-5782.txt Create a table with 1000 splits, after the region assignemnt, kill the regionserver wich contains META table. Here few regions are missing after the log splitting and region assigment. HBCK report shows multiple region holes are got created. Same scenario was verified mulitple times in 0.92.1, no issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5741) ImportTsv does not check for table existence
[ https://issues.apache.org/jira/browse/HBASE-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256083#comment-13256083 ] Lars Hofhansl commented on HBASE-5741: -- Thanks Clint. No harm, I just think it will hide complexity that should not be hidden. I also don't think it's too much to ask a user to create the table first (but we should definitely fix the Javadoc in that case). That all said, I am +-0 on this. It's a simple change, and v3 looks good. ImportTsv does not check for table existence - Key: HBASE-5741 URL: https://issues.apache.org/jira/browse/HBASE-5741 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.90.4 Reporter: Clint Heath Assignee: Himanshu Vashishtha Fix For: 0.96.0, 0.94.1 Attachments: 5741-94.txt, 5741-v3.txt, HBase-5741-v2.patch, HBase-5741.patch The usage statement for the importtsv command to hbase claims this: Note: if you do not use this option, then the target table must already exist in HBase (in reference to the importtsv.bulk.output command-line option) The truth is, the table must exist no matter what, importtsv cannot and will not create it for you. This is the case because the createSubmittableJob method of ImportTsv does not even attempt to check if the table exists already, much less create it: (From org.apache.hadoop.hbase.mapreduce.ImportTsv.java) 305 HTable table = new HTable(conf, tableName); The HTable method signature in use there assumes the table exists and runs a meta scan on it: (From org.apache.hadoop.hbase.client.HTable.java) 142 * Creates an object to access a HBase table. ... 151 public HTable(Configuration conf, final String tableName) What we should do inside of createSubmittableJob is something similar to what the completebulkloads command would do: (Taken from org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.java) 690 boolean tableExists = this.doesTableExist(tableName); 691 if (!tableExists) this.createTable(tableName,dirPath); Currently the docs are misleading, the table in fact must exist prior to running importtsv. We should check if it exists rather than assume it's already there and throw the below exception: 12/03/14 17:15:42 WARN client.HConnectionManager$HConnectionImplementation: Encountered problems when prefetch META table: org.apache.hadoop.hbase.TableNotFoundException: Cannot find row in .META. for table: myTable2, row=myTable2,,99 at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:150) ... -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487
[ https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256094#comment-13256094 ] Lars Hofhansl commented on HBASE-5782: -- I'm +1 on committing on my patch for 0.94.0. We can then either revisit for 0.94.1+ or 0.96. Edits can be appended out of seqid order since HBASE-4487 - Key: HBASE-5782 URL: https://issues.apache.org/jira/browse/HBASE-5782 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.94.0 Reporter: Gopinathan A Assignee: ramkrishna.s.vasudevan Priority: Blocker Fix For: 0.94.0 Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782.txt, 5782.unfinished-stack.txt, HBASE-5782.patch, hbase-5782.txt Create a table with 1000 splits, after the region assignemnt, kill the regionserver wich contains META table. Here few regions are missing after the log splitting and region assigment. HBCK report shows multiple region holes are got created. Same scenario was verified mulitple times in 0.92.1, no issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.
[ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256198#comment-13256198 ] Lars Hofhansl commented on HBASE-5545: -- @Ram: Isn't this a pretty remote corner case? If we feel strongly that this should go into 0.94.0, let's get it in. region can't be opened for a long time. Because the creating File failed. - Key: HBASE-5545 URL: https://issues.apache.org/jira/browse/HBASE-5545 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.6 Reporter: gaojinchao Assignee: gaojinchao Fix For: 0.90.7, 0.92.2, 0.94.0 Attachments: HBASE-5545.patch Scenario: 1. File is created 2. But while writing data, all datanodes might have crashed. So writing data will fail. 3. Now even if close is called in finally block, close also will fail and throw the Exception because writing data failed. 4. After this if RS try to create the same file again, then AlreadyBeingCreatedException will come. Suggestion to handle this scenario. --- 1. Check for the existence of the file, if exists delete the file and create new file. Here delete call for the file will not check whether the file is open or closed. Overwrite Option: 1. Overwrite option will be applicable if you are trying to overwrite a closed file. 2. If the file is not closed, then even with overwrite option Same AlreadyBeingCreatedException will be thrown. This is the expected behaviour to avoid the Multiple clients writing to same file. Region server logs: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /hbase/test1/12c01902324218d14b17a5880f24f64b/.tmp/.regioninfo for DFSClient_hb_rs_158-1-131-48,20020,1331107668635_1331107669061_-252463556_25 on client 158.1.132.19 because current leaseholder is trying to recreate file. at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1570) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1440) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1382) at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:658) at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:547) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1137) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1133) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1131) at org.apache.hadoop.ipc.Client.call(Client.java:961) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:245) at $Proxy6.create(Unknown Source) at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at $Proxy6.create(Unknown Source) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.init(DFSClient.java:3643) at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:778) at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:364) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:630) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:611) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:518) at org.apache.hadoop.hbase.regionserver.HRegion.checkRegioninfoOnFilesystem(HRegion.java:424) at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:340) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2672) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2658) at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:330) at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:116) at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:158) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) [2012-03-07 20:51:45,858] [WARN ] [RS_OPEN_REGION-158-1-131-48,20020,1331107668635-23] [com.huawei.isap.ump.ha.client.RPCRetryAndSwitchInvoker 131] Retrying
[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487
[ https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256200#comment-13256200 ] Lars Hofhansl commented on HBASE-5782: -- Yes, we need a test for this. What scared me most about this bug was that no test caught it, and this one of *the* core parts of HBase. This test would basically just call code from HLogPerformanceEvaluation, right? Edits can be appended out of seqid order since HBASE-4487 - Key: HBASE-5782 URL: https://issues.apache.org/jira/browse/HBASE-5782 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.94.0 Reporter: Gopinathan A Assignee: ramkrishna.s.vasudevan Priority: Blocker Fix For: 0.94.0 Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782.txt, 5782.unfinished-stack.txt, HBASE-5782.patch, hbase-5782.txt Create a table with 1000 splits, after the region assignemnt, kill the regionserver wich contains META table. Here few regions are missing after the log splitting and region assigment. HBCK report shows multiple region holes are got created. Same scenario was verified mulitple times in 0.92.1, no issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487
[ https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256202#comment-13256202 ] Lars Hofhansl commented on HBASE-5782: -- I can work on a test, unless you like to Stack, or maybe you have started on it anyway. Edits can be appended out of seqid order since HBASE-4487 - Key: HBASE-5782 URL: https://issues.apache.org/jira/browse/HBASE-5782 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.94.0 Reporter: Gopinathan A Assignee: ramkrishna.s.vasudevan Priority: Blocker Fix For: 0.94.0 Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782.txt, 5782.unfinished-stack.txt, HBASE-5782.patch, hbase-5782.txt Create a table with 1000 splits, after the region assignemnt, kill the regionserver wich contains META table. Here few regions are missing after the log splitting and region assigment. HBCK report shows multiple region holes are got created. Same scenario was verified mulitple times in 0.92.1, no issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.
[ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256206#comment-13256206 ] Lars Hofhansl commented on HBASE-5545: -- I see. Yep, that is more likely to happen than a DN crash (I think). No hurry Ram :) region can't be opened for a long time. Because the creating File failed. - Key: HBASE-5545 URL: https://issues.apache.org/jira/browse/HBASE-5545 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.6 Reporter: gaojinchao Assignee: gaojinchao Fix For: 0.90.7, 0.92.2, 0.94.0 Attachments: HBASE-5545.patch Scenario: 1. File is created 2. But while writing data, all datanodes might have crashed. So writing data will fail. 3. Now even if close is called in finally block, close also will fail and throw the Exception because writing data failed. 4. After this if RS try to create the same file again, then AlreadyBeingCreatedException will come. Suggestion to handle this scenario. --- 1. Check for the existence of the file, if exists delete the file and create new file. Here delete call for the file will not check whether the file is open or closed. Overwrite Option: 1. Overwrite option will be applicable if you are trying to overwrite a closed file. 2. If the file is not closed, then even with overwrite option Same AlreadyBeingCreatedException will be thrown. This is the expected behaviour to avoid the Multiple clients writing to same file. Region server logs: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /hbase/test1/12c01902324218d14b17a5880f24f64b/.tmp/.regioninfo for DFSClient_hb_rs_158-1-131-48,20020,1331107668635_1331107669061_-252463556_25 on client 158.1.132.19 because current leaseholder is trying to recreate file. at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1570) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1440) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1382) at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:658) at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:547) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1137) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1133) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1131) at org.apache.hadoop.ipc.Client.call(Client.java:961) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:245) at $Proxy6.create(Unknown Source) at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at $Proxy6.create(Unknown Source) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.init(DFSClient.java:3643) at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:778) at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:364) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:630) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:611) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:518) at org.apache.hadoop.hbase.regionserver.HRegion.checkRegioninfoOnFilesystem(HRegion.java:424) at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:340) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2672) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2658) at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:330) at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:116) at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:158) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) [2012-03-07 20:51:45,858] [WARN ] [RS_OPEN_REGION-158-1-131-48,20020,1331107668635-23] [com.huawei.isap.ump.ha.client.RPCRetryAndSwitchInvoker 131] Retrying the method call: public
[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.
[ https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256210#comment-13256210 ] Lars Hofhansl commented on HBASE-5545: -- 2nd patch looks good. I don't think you needed to the add the FSUtil code, but it cannot hurt. fs.delete does not throw if the file does not exist, right? Do you want to set recursive to false (just in case somebody changes this around ends up pointing to a directory)... This is a super minor nit. +1 on commit if delete does not throw on non-existant file. region can't be opened for a long time. Because the creating File failed. - Key: HBASE-5545 URL: https://issues.apache.org/jira/browse/HBASE-5545 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.6 Reporter: gaojinchao Assignee: gaojinchao Fix For: 0.90.7, 0.92.2, 0.94.0 Attachments: HBASE-5545.patch, HBASE-5545.patch Scenario: 1. File is created 2. But while writing data, all datanodes might have crashed. So writing data will fail. 3. Now even if close is called in finally block, close also will fail and throw the Exception because writing data failed. 4. After this if RS try to create the same file again, then AlreadyBeingCreatedException will come. Suggestion to handle this scenario. --- 1. Check for the existence of the file, if exists delete the file and create new file. Here delete call for the file will not check whether the file is open or closed. Overwrite Option: 1. Overwrite option will be applicable if you are trying to overwrite a closed file. 2. If the file is not closed, then even with overwrite option Same AlreadyBeingCreatedException will be thrown. This is the expected behaviour to avoid the Multiple clients writing to same file. Region server logs: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /hbase/test1/12c01902324218d14b17a5880f24f64b/.tmp/.regioninfo for DFSClient_hb_rs_158-1-131-48,20020,1331107668635_1331107669061_-252463556_25 on client 158.1.132.19 because current leaseholder is trying to recreate file. at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1570) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1440) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1382) at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:658) at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:547) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1137) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1133) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1131) at org.apache.hadoop.ipc.Client.call(Client.java:961) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:245) at $Proxy6.create(Unknown Source) at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at $Proxy6.create(Unknown Source) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.init(DFSClient.java:3643) at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:778) at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:364) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:630) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:611) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:518) at org.apache.hadoop.hbase.regionserver.HRegion.checkRegioninfoOnFilesystem(HRegion.java:424) at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:340) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2672) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2658) at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:330) at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:116) at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:158) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at
[jira] [Commented] (HBASE-5795) hbase-3927 breaks 0.92-0.94 compatibility
[ https://issues.apache.org/jira/browse/HBASE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254864#comment-13254864 ] Lars Hofhansl commented on HBASE-5795: -- +1 on v2, are you integrating v2 with Stacks test? hbase-3927 breaks 0.92-0.94 compatibility --- Key: HBASE-5795 URL: https://issues.apache.org/jira/browse/HBASE-5795 Project: HBase Issue Type: Bug Reporter: stack Assignee: stack Fix For: 0.94.0, 0.96.0 Attachments: 5795-v2.txt, 5795.unittest.txt This commit broke our 0.92/0.94 compatibility: {code} r1136686 | stack | 2011-06-16 14:18:08 -0700 (Thu, 16 Jun 2011) | 1 line HBASE-3927 display total uncompressed byte size of a region in web UI {code} I just tried the new RC for 0.94. I brought up a 0.94 master on a 0.92 cluster and rather than just digest version 1 of the HServerLoad, I get this: {code} 2012-04-14 22:47:59,752 WARN org.apache.hadoop.ipc.HBaseServer: Unable to read call parameters for client 10.4.14.38 java.io.IOException: Error in readFields at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:684) at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:125) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1269) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1184) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:722) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:513) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:488) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: A record version mismatch occured. Expecting v2, found v1 at org.apache.hadoop.io.VersionedWritable.readFields(VersionedWritable.java:46) at org.apache.hadoop.hbase.HServerLoad$RegionLoad.readFields(HServerLoad.java:379) at org.apache.hadoop.hbase.HServerLoad.readFields(HServerLoad.java:686) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:681) ... 9 more {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5782) Not all the regions are getting assigned after the log splitting.
[ https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254925#comment-13254925 ] Lars Hofhansl commented on HBASE-5782: -- So the problem is that logSyncerThread keeps the edit in order but the syncer then applies the pending batches potentially out of order? We might just need a sync lock to prevent two threads syncing at the same time. This is different from the update lock, which also prevents writing any edits. Not all the regions are getting assigned after the log splitting. - Key: HBASE-5782 URL: https://issues.apache.org/jira/browse/HBASE-5782 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.94.0 Reporter: Gopinathan A Assignee: ramkrishna.s.vasudevan Priority: Blocker Fix For: 0.94.0 Attachments: HBASE-5782.patch Create a table with 1000 splits, after the region assignemnt, kill the regionserver wich contains META table. Here few regions are missing after the log splitting and region assigment. HBCK report shows multiple region holes are got created. Same scenario was verified mulitple times in 0.92.1, no issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5782) Not all the regions are getting assigned after the log splitting.
[ https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255094#comment-13255094 ] Lars Hofhansl commented on HBASE-5782: -- The short term choices we have are: # revert HBASE-4528, HBASE-4487, and HBASE-5541 (are there others?) # Partially revert HBASE-2467 (or devise other ways to have strictly one thread flushing an HLog). Maybe Todd as the author of HBASE-2467 could chime in... Todd? Not all the regions are getting assigned after the log splitting. - Key: HBASE-5782 URL: https://issues.apache.org/jira/browse/HBASE-5782 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.94.0 Reporter: Gopinathan A Assignee: ramkrishna.s.vasudevan Priority: Blocker Fix For: 0.94.0 Attachments: HBASE-5782.patch Create a table with 1000 splits, after the region assignemnt, kill the regionserver wich contains META table. Here few regions are missing after the log splitting and region assigment. HBCK report shows multiple region holes are got created. Same scenario was verified mulitple times in 0.92.1, no issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487
[ https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255168#comment-13255168 ] Lars Hofhansl commented on HBASE-5782: -- Is this really only a problem because of HBASE-4528 and HBASE-4487? It seems even without HLog.appendNoSync() is is possible that one threads flushes an entire batch of pending write before an thread that started earlier can get to it. Edits can be appended out of seqid order since HBASE-4487 - Key: HBASE-5782 URL: https://issues.apache.org/jira/browse/HBASE-5782 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.94.0 Reporter: Gopinathan A Assignee: ramkrishna.s.vasudevan Priority: Blocker Fix For: 0.94.0 Attachments: 5782.txt, HBASE-5782.patch Create a table with 1000 splits, after the region assignemnt, kill the regionserver wich contains META table. Here few regions are missing after the log splitting and region assigment. HBCK report shows multiple region holes are got created. Same scenario was verified mulitple times in 0.92.1, no issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5780) Fix race in HBase regionserver startup vs ZK SASL authentication
[ https://issues.apache.org/jira/browse/HBASE-5780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255170#comment-13255170 ] Lars Hofhansl commented on HBASE-5780: -- testResetZooKeeperSession failed in 0.94 again. :( Fix race in HBase regionserver startup vs ZK SASL authentication Key: HBASE-5780 URL: https://issues.apache.org/jira/browse/HBASE-5780 Project: HBase Issue Type: Bug Components: security Affects Versions: 0.92.1, 0.94.0 Reporter: Shaneal Manek Assignee: Shaneal Manek Fix For: 0.92.2, 0.94.0, 0.96.0 Attachments: HBASE-5780-v2.patch, HBASE-5780.patch, TestReplicationPeer-Security-output.log, TestReplicationPeer-output.log, testoutput.tar.gz Secure RegionServers sometimes fail to start with the following backtrace: 2012-03-22 17:20:16,737 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server centos60-20.ent.cloudera.com,60020,1332462015929: Unexpected exception during initialization, aborting org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth for /hbase/shutdown at org.apache.zookeeper.KeeperException.create(KeeperException.java:113) at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1131) at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:295) at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataInternal(ZKUtil.java:518) at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:494) at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:77) at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:569) at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:532) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:634) at java.lang.Thread.run(Thread.java:662) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487
[ https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255191#comment-13255191 ] Lars Hofhansl commented on HBASE-5782: -- Sketched patch looks good, making sure that we do sync' a batch of write before the previous batch is done. I was thinking about something similar, but still allowing multiple threads to write, just that a thread cannot start writing before the previous batch is confirmed sync'ed. I guess we actually wouldn't get more parallelism out that than with your approach. Edits can be appended out of seqid order since HBASE-4487 - Key: HBASE-5782 URL: https://issues.apache.org/jira/browse/HBASE-5782 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.94.0 Reporter: Gopinathan A Assignee: ramkrishna.s.vasudevan Priority: Blocker Fix For: 0.94.0 Attachments: 5782-sketch.txt, 5782.txt, HBASE-5782.patch Create a table with 1000 splits, after the region assignemnt, kill the regionserver wich contains META table. Here few regions are missing after the log splitting and region assigment. HBCK report shows multiple region holes are got created. Same scenario was verified mulitple times in 0.92.1, no issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5778) Turn on WAL compression by default
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255214#comment-13255214 ] Lars Hofhansl commented on HBASE-5778: -- If we could tail the logs it would work. We just cannot seek into an HLog in the middle and start reading from it. Turn on WAL compression by default -- Key: HBASE-5778 URL: https://issues.apache.org/jira/browse/HBASE-5778 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Assignee: Lars Hofhansl Priority: Blocker Fix For: 0.96.0, 0.94.1 Attachments: 5778-addendum.txt, 5778.addendum, HBASE-5778.patch I ran some tests to verify if WAL compression should be turned on by default. For a use case where it's not very useful (values two order of magnitude bigger than the keys), the insert time wasn't different and the CPU usage 15% higher (150% CPU usage VS 130% when not compressing the WAL). When values are smaller than the keys, I saw a 38% improvement for the insert run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure WAL compression accounts for all the additional CPU usage, it might just be that we're able to insert faster and we spend more time in the MemStore per second (because our MemStores are bad when they contain tens of thousands of values). Those are two extremes, but it shows that for the price of some CPU we can save a lot. My machines have 2 quads with HT, so I still had a lot of idle CPUs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5778) Turn on WAL compression by default
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255230#comment-13255230 ] Lars Hofhansl commented on HBASE-5778: -- Hmm... Then that does not explain what I saw. I saw the ReplicationSource trying to read from a position in the file (indicated by ZK) and then the read failing because the dictionary was not built up. Turn on WAL compression by default -- Key: HBASE-5778 URL: https://issues.apache.org/jira/browse/HBASE-5778 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Assignee: Lars Hofhansl Priority: Blocker Fix For: 0.96.0, 0.94.1 Attachments: 5778-addendum.txt, 5778.addendum, HBASE-5778.patch I ran some tests to verify if WAL compression should be turned on by default. For a use case where it's not very useful (values two order of magnitude bigger than the keys), the insert time wasn't different and the CPU usage 15% higher (150% CPU usage VS 130% when not compressing the WAL). When values are smaller than the keys, I saw a 38% improvement for the insert run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure WAL compression accounts for all the additional CPU usage, it might just be that we're able to insert faster and we spend more time in the MemStore per second (because our MemStores are bad when they contain tens of thousands of values). Those are two extremes, but it shows that for the price of some CPU we can save a lot. My machines have 2 quads with HT, so I still had a lot of idle CPUs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487
[ https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255308#comment-13255308 ] Lars Hofhansl commented on HBASE-5782: -- @Ram: It looks like your patch should work for this scenario, but I'd be generally wary about edits not being in order in the WAL. Edits can be appended out of seqid order since HBASE-4487 - Key: HBASE-5782 URL: https://issues.apache.org/jira/browse/HBASE-5782 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.94.0 Reporter: Gopinathan A Assignee: ramkrishna.s.vasudevan Priority: Blocker Fix For: 0.94.0 Attachments: 5782-sketch.txt, 5782.txt, 5782.unfinished-stack.txt, HBASE-5782.patch Create a table with 1000 splits, after the region assignemnt, kill the regionserver wich contains META table. Here few regions are missing after the log splitting and region assigment. HBCK report shows multiple region holes are got created. Same scenario was verified mulitple times in 0.92.1, no issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5795) hbase-3927 breaks 0.92-0.94 compatibility
[ https://issues.apache.org/jira/browse/HBASE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254414#comment-13254414 ] Lars Hofhansl commented on HBASE-5795: -- +1 om patch. hbase-3927 breaks 0.92-0.94 compatibility --- Key: HBASE-5795 URL: https://issues.apache.org/jira/browse/HBASE-5795 Project: HBase Issue Type: Bug Reporter: stack Assignee: stack Fix For: 0.94.0 Attachments: 5794.txt, 5795-v1.txt This commit broke our 0.92/0.94 compatibility: {code} r1136686 | stack | 2011-06-16 14:18:08 -0700 (Thu, 16 Jun 2011) | 1 line HBASE-3927 display total uncompressed byte size of a region in web UI {code} I just tried the new RC for 0.94. I brought up a 0.94 master on a 0.92 cluster and rather than just digest version 1 of the HServerLoad, I get this: {code} 2012-04-14 22:47:59,752 WARN org.apache.hadoop.ipc.HBaseServer: Unable to read call parameters for client 10.4.14.38 java.io.IOException: Error in readFields at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:684) at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:125) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1269) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1184) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:722) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:513) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:488) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: A record version mismatch occured. Expecting v2, found v1 at org.apache.hadoop.io.VersionedWritable.readFields(VersionedWritable.java:46) at org.apache.hadoop.hbase.HServerLoad$RegionLoad.readFields(HServerLoad.java:379) at org.apache.hadoop.hbase.HServerLoad.readFields(HServerLoad.java:686) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:681) ... 9 more {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5795) hbase-3927 breaks 0.92-0.94 compatibility
[ https://issues.apache.org/jira/browse/HBASE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254429#comment-13254429 ] Lars Hofhansl commented on HBASE-5795: -- @Stack: I think you forgot to include the actual test in the patch :) hbase-3927 breaks 0.92-0.94 compatibility --- Key: HBASE-5795 URL: https://issues.apache.org/jira/browse/HBASE-5795 Project: HBase Issue Type: Bug Reporter: stack Assignee: stack Fix For: 0.94.0 Attachments: 5795-v1.txt This commit broke our 0.92/0.94 compatibility: {code} r1136686 | stack | 2011-06-16 14:18:08 -0700 (Thu, 16 Jun 2011) | 1 line HBASE-3927 display total uncompressed byte size of a region in web UI {code} I just tried the new RC for 0.94. I brought up a 0.94 master on a 0.92 cluster and rather than just digest version 1 of the HServerLoad, I get this: {code} 2012-04-14 22:47:59,752 WARN org.apache.hadoop.ipc.HBaseServer: Unable to read call parameters for client 10.4.14.38 java.io.IOException: Error in readFields at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:684) at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:125) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1269) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1184) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:722) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:513) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:488) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: A record version mismatch occured. Expecting v2, found v1 at org.apache.hadoop.io.VersionedWritable.readFields(VersionedWritable.java:46) at org.apache.hadoop.hbase.HServerLoad$RegionLoad.readFields(HServerLoad.java:379) at org.apache.hadoop.hbase.HServerLoad.readFields(HServerLoad.java:686) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:681) ... 9 more {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5795) hbase-3927 breaks 0.92-0.94 compatibility
[ https://issues.apache.org/jira/browse/HBASE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254430#comment-13254430 ] Lars Hofhansl commented on HBASE-5795: -- Oh, you attached your 5794 patch to this issue. I removed to avoid confusing, when you get a chance could you attach your patch here? Then we can make a combined patch with that and Ted's fix. hbase-3927 breaks 0.92-0.94 compatibility --- Key: HBASE-5795 URL: https://issues.apache.org/jira/browse/HBASE-5795 Project: HBase Issue Type: Bug Reporter: stack Assignee: stack Fix For: 0.94.0 Attachments: 5795-v1.txt This commit broke our 0.92/0.94 compatibility: {code} r1136686 | stack | 2011-06-16 14:18:08 -0700 (Thu, 16 Jun 2011) | 1 line HBASE-3927 display total uncompressed byte size of a region in web UI {code} I just tried the new RC for 0.94. I brought up a 0.94 master on a 0.92 cluster and rather than just digest version 1 of the HServerLoad, I get this: {code} 2012-04-14 22:47:59,752 WARN org.apache.hadoop.ipc.HBaseServer: Unable to read call parameters for client 10.4.14.38 java.io.IOException: Error in readFields at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:684) at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:125) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1269) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1184) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:722) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:513) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:488) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: A record version mismatch occured. Expecting v2, found v1 at org.apache.hadoop.io.VersionedWritable.readFields(VersionedWritable.java:46) at org.apache.hadoop.hbase.HServerLoad$RegionLoad.readFields(HServerLoad.java:379) at org.apache.hadoop.hbase.HServerLoad.readFields(HServerLoad.java:686) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:681) ... 9 more {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5782) Not all the regions are getting assigned after the log splitting.
[ https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254433#comment-13254433 ] Lars Hofhansl commented on HBASE-5782: -- Thanks Ram! Not all the regions are getting assigned after the log splitting. - Key: HBASE-5782 URL: https://issues.apache.org/jira/browse/HBASE-5782 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.94.0 Reporter: Gopinathan A Assignee: ramkrishna.s.vasudevan Priority: Critical Fix For: 0.94.0 Create a table with 1000 splits, after the region assignemnt, kill the regionserver wich contains META table. Here few regions are missing after the log splitting and region assigment. HBCK report shows multiple region holes are got created. Same scenario was verified mulitple times in 0.92.1, no issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5796) Fix our abuse of IOE: see http://blog.tsunanet.net/2012/04/apache-hadoop-abuse-ioexception.html
[ https://issues.apache.org/jira/browse/HBASE-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254470#comment-13254470 ] Lars Hofhansl commented on HBASE-5796: -- Could be as simple as a better hierarchy of classes extending IOException (like DoNotRetryException). Fix our abuse of IOE: see http://blog.tsunanet.net/2012/04/apache-hadoop-abuse-ioexception.html --- Key: HBASE-5796 URL: https://issues.apache.org/jira/browse/HBASE-5796 Project: HBase Issue Type: Task Reporter: stack Lets make more context particular exceptions rather than throw IOEs everywhere. See Benoît's rant: http://blog.tsunanet.net/2012/04/apache-hadoop-abuse-ioexception.html -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5795) hbase-3927 breaks 0.92-0.94 compatibility
[ https://issues.apache.org/jira/browse/HBASE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254510#comment-13254510 ] Lars Hofhansl commented on HBASE-5795: -- @Ted: I don't understand why that redundant write in 0.94 causes any problem. Can you elaborate? Was there an other problem in v1? hbase-3927 breaks 0.92-0.94 compatibility --- Key: HBASE-5795 URL: https://issues.apache.org/jira/browse/HBASE-5795 Project: HBase Issue Type: Bug Reporter: stack Assignee: stack Fix For: 0.94.0 Attachments: 5795-v2.txt, 5795.unittest.txt This commit broke our 0.92/0.94 compatibility: {code} r1136686 | stack | 2011-06-16 14:18:08 -0700 (Thu, 16 Jun 2011) | 1 line HBASE-3927 display total uncompressed byte size of a region in web UI {code} I just tried the new RC for 0.94. I brought up a 0.94 master on a 0.92 cluster and rather than just digest version 1 of the HServerLoad, I get this: {code} 2012-04-14 22:47:59,752 WARN org.apache.hadoop.ipc.HBaseServer: Unable to read call parameters for client 10.4.14.38 java.io.IOException: Error in readFields at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:684) at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:125) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1269) at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1184) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:722) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:513) at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:488) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: A record version mismatch occured. Expecting v2, found v1 at org.apache.hadoop.io.VersionedWritable.readFields(VersionedWritable.java:46) at org.apache.hadoop.hbase.HServerLoad$RegionLoad.readFields(HServerLoad.java:379) at org.apache.hadoop.hbase.HServerLoad.readFields(HServerLoad.java:686) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:681) ... 9 more {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5778) Turn on WAL compression by default
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254204#comment-13254204 ] Lars Hofhansl commented on HBASE-5778: -- @Ted: Unfortunately it is not as simple as that. As I tried to explain above, the ReplicationSource reads from the WAL files at offsets that are stored in ZK. This does not work any longer, as you can no longer start reading the WAL at an offset. The files need to be read from the beginning to build up the dictionary. Turn on WAL compression by default -- Key: HBASE-5778 URL: https://issues.apache.org/jira/browse/HBASE-5778 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Assignee: Lars Hofhansl Priority: Blocker Fix For: 0.96.0, 0.94.1 Attachments: 5778-addendum.txt, 5778.addendum, HBASE-5778.patch I ran some tests to verify if WAL compression should be turned on by default. For a use case where it's not very useful (values two order of magnitude bigger than the keys), the insert time wasn't different and the CPU usage 15% higher (150% CPU usage VS 130% when not compressing the WAL). When values are smaller than the keys, I saw a 38% improvement for the insert run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure WAL compression accounts for all the additional CPU usage, it might just be that we're able to insert faster and we spend more time in the MemStore per second (because our MemStores are bad when they contain tens of thousands of values). Those are two extremes, but it shows that for the price of some CPU we can save a lot. My machines have 2 quads with HT, so I still had a lot of idle CPUs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5781) Zookeeper session got closed while trying to assign the region to RS using hbck -fix
[ https://issues.apache.org/jira/browse/HBASE-5781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254205#comment-13254205 ] Lars Hofhansl commented on HBASE-5781: -- @Jon: Are you saying sink rc1 for this? Zookeeper session got closed while trying to assign the region to RS using hbck -fix Key: HBASE-5781 URL: https://issues.apache.org/jira/browse/HBASE-5781 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 0.94.0, 0.96.0 Reporter: Kristam Subba Swathi Assignee: Jonathan Hsieh Priority: Critical Fix For: 0.94.0 After running the hbck in the cluster ,it is found that one region is not assigned So the hbck -fix is used to fix this But the assignment didnt happen since the zookeeper session is closed Please find the attached trace for more details - Trying to fix unassigned region... 12/04/03 11:02:57 INFO util.HBaseFsckRepair: Region still in transition, waiting for it to become assigned: {NAME = 'ufdr,002300,179123498.00871fbd7583512e12c4eb38e900be8d.', STARTKEY = '002300', ENDKEY = '002311', ENCODED = 00871fbd7583512e12c4eb38e900be8d,} 12/04/03 11:02:58 INFO client.HConnectionManager$HConnectionImplementation: Closed zookeeper sessionid=0x236738a263a 12/04/03 11:02:58 INFO zookeeper.ZooKeeper: Session: 0x236738a263a closed ERROR: Region { meta = ufdr,010444,179123857.01594219211d0035b9586f98954462e1., hdfs = hdfs://10.18.40.25:9000/hbase/ufdr/01594219211d0035b9586f98954462e1, deployed = } not deployed on any region server. Trying to fix unassigned region... 12/04/03 11:02:58 INFO zookeeper.ClientCnxn: EventThread shut down 12/04/03 11:02:58 WARN zookeeper.ZKUtil: hconnection-0x236738a263a Unable to set watcher on znode (/hbase) org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase at org.apache.zookeeper.KeeperException.create(KeeperException.java:127) at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021) at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:150) at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:263) at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.checkIfBaseNodeAvailable(ZooKeeperNodeTracker.java:208) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(HConnectionManager.java:695) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:626) at org.apache.hadoop.hbase.client.HBaseAdmin.getMaster(HBaseAdmin.java:211) at org.apache.hadoop.hbase.client.HBaseAdmin.assign(HBaseAdmin.java:1325) at org.apache.hadoop.hbase.util.HBaseFsckRepair.forceOfflineInZK(HBaseFsckRepair.java:109) at org.apache.hadoop.hbase.util.HBaseFsckRepair.fixUnassigned(HBaseFsckRepair.java:92) at org.apache.hadoop.hbase.util.HBaseFsck.tryAssignmentRepair(HBaseFsck.java:1235) at org.apache.hadoop.hbase.util.HBaseFsck.checkRegionConsistency(HBaseFsck.java:1351) at org.apache.hadoop.hbase.util.HBaseFsck.checkAndFixConsistency(HBaseFsck.java:1114) at org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:356) at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:375) at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:2894) 12/04/03 11:02:58 ERROR zookeeper.ZooKeeperWatcher: hconnection-0x236738a263a Received unexpected KeeperException, re-throwing exception org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase at org.apache.zookeeper.KeeperException.create(KeeperException.java:127) at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021) at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:150) at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:263) at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.checkIfBaseNodeAvailable(ZooKeeperNodeTracker.java:208) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(HConnectionManager.java:695) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:626) at org.apache.hadoop.hbase.client.HBaseAdmin.getMaster(HBaseAdmin.java:211) at org.apache.hadoop.hbase.client.HBaseAdmin.assign(HBaseAdmin.java:1325) at
[jira] [Commented] (HBASE-5790) ZKUtil deleteRecursively should be a recoverable operation
[ https://issues.apache.org/jira/browse/HBASE-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254207#comment-13254207 ] Lars Hofhansl commented on HBASE-5790: -- @Jesse: Was more referring to the request size that is shipped to ZK. Probably not a problem. ZKUtil deleteRecursively should be a recoverable operation -- Key: HBASE-5790 URL: https://issues.apache.org/jira/browse/HBASE-5790 Project: HBase Issue Type: Improvement Reporter: Jesse Yates Assignee: Jesse Yates Labels: zookeeper Fix For: 0.96.0, 0.94.1 Attachments: java_HBASE-5790-v1.patch, java_HBASE-5790.patch As of 3.4.3 Zookeeper now has full, multi-operation transaction. This means we can wholesale delete chunks of the zk tree and ensure that we don't have any pesky recursive delete issues where we delete the children of a node, but then a child joins before deletion of the parent. Even without transactions, this should be the behavior, but it is possible to make it much cleaner now that we have this new feature in zk. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5790) ZKUtil deleteRecursively should be a recoverable operation
[ https://issues.apache.org/jira/browse/HBASE-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254208#comment-13254208 ] Lars Hofhansl commented on HBASE-5790: -- +1 on V2 ZKUtil deleteRecursively should be a recoverable operation -- Key: HBASE-5790 URL: https://issues.apache.org/jira/browse/HBASE-5790 Project: HBase Issue Type: Improvement Reporter: Jesse Yates Assignee: Jesse Yates Labels: zookeeper Fix For: 0.96.0, 0.94.1 Attachments: java_HBASE-5790-v1.patch, java_HBASE-5790.patch As of 3.4.3 Zookeeper now has full, multi-operation transaction. This means we can wholesale delete chunks of the zk tree and ensure that we don't have any pesky recursive delete issues where we delete the children of a node, but then a child joins before deletion of the parent. Even without transactions, this should be the behavior, but it is possible to make it much cleaner now that we have this new feature in zk. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5781) Zookeeper session got closed while trying to assign the region to RS using hbck -fix
[ https://issues.apache.org/jira/browse/HBASE-5781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254211#comment-13254211 ] Lars Hofhansl commented on HBASE-5781: -- Dang :( Zookeeper session got closed while trying to assign the region to RS using hbck -fix Key: HBASE-5781 URL: https://issues.apache.org/jira/browse/HBASE-5781 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 0.94.0, 0.96.0 Reporter: Kristam Subba Swathi Assignee: Jonathan Hsieh Priority: Critical Fix For: 0.94.0 After running the hbck in the cluster ,it is found that one region is not assigned So the hbck -fix is used to fix this But the assignment didnt happen since the zookeeper session is closed Please find the attached trace for more details - Trying to fix unassigned region... 12/04/03 11:02:57 INFO util.HBaseFsckRepair: Region still in transition, waiting for it to become assigned: {NAME = 'ufdr,002300,179123498.00871fbd7583512e12c4eb38e900be8d.', STARTKEY = '002300', ENDKEY = '002311', ENCODED = 00871fbd7583512e12c4eb38e900be8d,} 12/04/03 11:02:58 INFO client.HConnectionManager$HConnectionImplementation: Closed zookeeper sessionid=0x236738a263a 12/04/03 11:02:58 INFO zookeeper.ZooKeeper: Session: 0x236738a263a closed ERROR: Region { meta = ufdr,010444,179123857.01594219211d0035b9586f98954462e1., hdfs = hdfs://10.18.40.25:9000/hbase/ufdr/01594219211d0035b9586f98954462e1, deployed = } not deployed on any region server. Trying to fix unassigned region... 12/04/03 11:02:58 INFO zookeeper.ClientCnxn: EventThread shut down 12/04/03 11:02:58 WARN zookeeper.ZKUtil: hconnection-0x236738a263a Unable to set watcher on znode (/hbase) org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase at org.apache.zookeeper.KeeperException.create(KeeperException.java:127) at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021) at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:150) at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:263) at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.checkIfBaseNodeAvailable(ZooKeeperNodeTracker.java:208) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(HConnectionManager.java:695) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:626) at org.apache.hadoop.hbase.client.HBaseAdmin.getMaster(HBaseAdmin.java:211) at org.apache.hadoop.hbase.client.HBaseAdmin.assign(HBaseAdmin.java:1325) at org.apache.hadoop.hbase.util.HBaseFsckRepair.forceOfflineInZK(HBaseFsckRepair.java:109) at org.apache.hadoop.hbase.util.HBaseFsckRepair.fixUnassigned(HBaseFsckRepair.java:92) at org.apache.hadoop.hbase.util.HBaseFsck.tryAssignmentRepair(HBaseFsck.java:1235) at org.apache.hadoop.hbase.util.HBaseFsck.checkRegionConsistency(HBaseFsck.java:1351) at org.apache.hadoop.hbase.util.HBaseFsck.checkAndFixConsistency(HBaseFsck.java:1114) at org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:356) at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:375) at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:2894) 12/04/03 11:02:58 ERROR zookeeper.ZooKeeperWatcher: hconnection-0x236738a263a Received unexpected KeeperException, re-throwing exception org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase at org.apache.zookeeper.KeeperException.create(KeeperException.java:127) at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021) at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:150) at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:263) at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.checkIfBaseNodeAvailable(ZooKeeperNodeTracker.java:208) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(HConnectionManager.java:695) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:626) at org.apache.hadoop.hbase.client.HBaseAdmin.getMaster(HBaseAdmin.java:211) at org.apache.hadoop.hbase.client.HBaseAdmin.assign(HBaseAdmin.java:1325) at
[jira] [Commented] (HBASE-5781) Zookeeper session got closed while trying to assign the region to RS using hbck -fix
[ https://issues.apache.org/jira/browse/HBASE-5781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254260#comment-13254260 ] Lars Hofhansl commented on HBASE-5781: -- +1 on patch. Zookeeper session got closed while trying to assign the region to RS using hbck -fix Key: HBASE-5781 URL: https://issues.apache.org/jira/browse/HBASE-5781 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 0.90.7, 0.92.1, 0.94.0, 0.96.0 Reporter: Kristam Subba Swathi Assignee: Jonathan Hsieh Priority: Critical Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0 Attachments: hbase-5781.patch After running the hbck in the cluster ,it is found that one region is not assigned So the hbck -fix is used to fix this But the assignment didnt happen since the zookeeper session is closed Please find the attached trace for more details - Trying to fix unassigned region... 12/04/03 11:02:57 INFO util.HBaseFsckRepair: Region still in transition, waiting for it to become assigned: {NAME = 'ufdr,002300,179123498.00871fbd7583512e12c4eb38e900be8d.', STARTKEY = '002300', ENDKEY = '002311', ENCODED = 00871fbd7583512e12c4eb38e900be8d,} 12/04/03 11:02:58 INFO client.HConnectionManager$HConnectionImplementation: Closed zookeeper sessionid=0x236738a263a 12/04/03 11:02:58 INFO zookeeper.ZooKeeper: Session: 0x236738a263a closed ERROR: Region { meta = ufdr,010444,179123857.01594219211d0035b9586f98954462e1., hdfs = hdfs://10.18.40.25:9000/hbase/ufdr/01594219211d0035b9586f98954462e1, deployed = } not deployed on any region server. Trying to fix unassigned region... 12/04/03 11:02:58 INFO zookeeper.ClientCnxn: EventThread shut down 12/04/03 11:02:58 WARN zookeeper.ZKUtil: hconnection-0x236738a263a Unable to set watcher on znode (/hbase) org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase at org.apache.zookeeper.KeeperException.create(KeeperException.java:127) at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021) at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:150) at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:263) at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.checkIfBaseNodeAvailable(ZooKeeperNodeTracker.java:208) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(HConnectionManager.java:695) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:626) at org.apache.hadoop.hbase.client.HBaseAdmin.getMaster(HBaseAdmin.java:211) at org.apache.hadoop.hbase.client.HBaseAdmin.assign(HBaseAdmin.java:1325) at org.apache.hadoop.hbase.util.HBaseFsckRepair.forceOfflineInZK(HBaseFsckRepair.java:109) at org.apache.hadoop.hbase.util.HBaseFsckRepair.fixUnassigned(HBaseFsckRepair.java:92) at org.apache.hadoop.hbase.util.HBaseFsck.tryAssignmentRepair(HBaseFsck.java:1235) at org.apache.hadoop.hbase.util.HBaseFsck.checkRegionConsistency(HBaseFsck.java:1351) at org.apache.hadoop.hbase.util.HBaseFsck.checkAndFixConsistency(HBaseFsck.java:1114) at org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:356) at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:375) at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:2894) 12/04/03 11:02:58 ERROR zookeeper.ZooKeeperWatcher: hconnection-0x236738a263a Received unexpected KeeperException, re-throwing exception org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase at org.apache.zookeeper.KeeperException.create(KeeperException.java:127) at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021) at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:150) at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:263) at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.checkIfBaseNodeAvailable(ZooKeeperNodeTracker.java:208) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(HConnectionManager.java:695) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:626) at org.apache.hadoop.hbase.client.HBaseAdmin.getMaster(HBaseAdmin.java:211) at org.apache.hadoop.hbase.client.HBaseAdmin.assign(HBaseAdmin.java:1325)
[jira] [Commented] (HBASE-5256) Use WritableUtils.readVInt() in RegionLoad.readFields()
[ https://issues.apache.org/jira/browse/HBASE-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254262#comment-13254262 ] Lars Hofhansl commented on HBASE-5256: -- I see. This needs to be reverted from the 0.94 branch, otherwise it breaks compatibility with 0.92. Use WritableUtils.readVInt() in RegionLoad.readFields() --- Key: HBASE-5256 URL: https://issues.apache.org/jira/browse/HBASE-5256 Project: HBase Issue Type: Task Reporter: Zhihong Yu Assignee: Mubarak Seyed Fix For: 0.94.0 Attachments: HBASE-5256.trunk.v1.patch Currently in.readInt() is used in RegionLoad.readFields() More metrics would be added to RegionLoad in the future, we should utilize WritableUtils.readVInt() to reduce the amount of data exchanged between Master and region servers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5256) Use WritableUtils.readVInt() in RegionLoad.readFields()
[ https://issues.apache.org/jira/browse/HBASE-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254263#comment-13254263 ] Lars Hofhansl commented on HBASE-5256: -- Or does it now? I think that affects the public interface. Use WritableUtils.readVInt() in RegionLoad.readFields() --- Key: HBASE-5256 URL: https://issues.apache.org/jira/browse/HBASE-5256 Project: HBase Issue Type: Task Reporter: Zhihong Yu Assignee: Mubarak Seyed Fix For: 0.94.0 Attachments: HBASE-5256.trunk.v1.patch Currently in.readInt() is used in RegionLoad.readFields() More metrics would be added to RegionLoad in the future, we should utilize WritableUtils.readVInt() to reduce the amount of data exchanged between Master and region servers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region
[ https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253549#comment-13253549 ] Lars Hofhansl commented on HBASE-5677: -- Arghh... OK. So: * in 0.94+ this is fixed, correct? * you like to backport HBASE-5454 to 0.90 and 0.92, right? So let's close this one then? The master never does balance because duplicate openhandled the one region -- Key: HBASE-5677 URL: https://issues.apache.org/jira/browse/HBASE-5677 Project: HBase Issue Type: Bug Affects Versions: 0.90.6 Environment: 0.90 Reporter: xufeng Assignee: xufeng Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0 Attachments: 5677-proposal.txt, 5677-proposal.txt, 5677-proposal.txt, HBASE-5677-90-v1.patch, surefire-report_no_patched_v1.html, surefire-report_patched_v1.html If region be assigned When the master is doing initialization(before do processFailover),the region will be duplicate openhandled. because the unassigned node in zookeeper will be handled again in AssignmentManager#processFailover() it cause the region in RIT,thus the master never does balance. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5778) Turn on WAL compression by default
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253588#comment-13253588 ] Lars Hofhansl commented on HBASE-5778: -- I still don't understand why this is a problem with replication. J-D do you have any insights? Turn on WAL compression by default -- Key: HBASE-5778 URL: https://issues.apache.org/jira/browse/HBASE-5778 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Assignee: Lars Hofhansl Priority: Blocker Fix For: 0.96.0, 0.94.1 Attachments: 5778-addendum.txt, 5778.addendum, HBASE-5778.patch I ran some tests to verify if WAL compression should be turned on by default. For a use case where it's not very useful (values two order of magnitude bigger than the keys), the insert time wasn't different and the CPU usage 15% higher (150% CPU usage VS 130% when not compressing the WAL). When values are smaller than the keys, I saw a 38% improvement for the insert run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure WAL compression accounts for all the additional CPU usage, it might just be that we're able to insert faster and we spend more time in the MemStore per second (because our MemStores are bad when they contain tens of thousands of values). Those are two extremes, but it shows that for the price of some CPU we can save a lot. My machines have 2 quads with HT, so I still had a lot of idle CPUs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5778) Turn on WAL compression by default
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253614#comment-13253614 ] Lars Hofhansl commented on HBASE-5778: -- Oh I see. The KVs are only decompressed when read. Turn on WAL compression by default -- Key: HBASE-5778 URL: https://issues.apache.org/jira/browse/HBASE-5778 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Assignee: Lars Hofhansl Priority: Blocker Fix For: 0.96.0, 0.94.1 Attachments: 5778-addendum.txt, 5778.addendum, HBASE-5778.patch I ran some tests to verify if WAL compression should be turned on by default. For a use case where it's not very useful (values two order of magnitude bigger than the keys), the insert time wasn't different and the CPU usage 15% higher (150% CPU usage VS 130% when not compressing the WAL). When values are smaller than the keys, I saw a 38% improvement for the insert run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure WAL compression accounts for all the additional CPU usage, it might just be that we're able to insert faster and we spend more time in the MemStore per second (because our MemStores are bad when they contain tens of thousands of values). Those are two extremes, but it shows that for the price of some CPU we can save a lot. My machines have 2 quads with HT, so I still had a lot of idle CPUs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files
[ https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253765#comment-13253765 ] Lars Hofhansl commented on HBASE-5604: -- Sigh... That's exactly what I was afraid about when adding data parsing. What timezone is used to store the WAL edit? Is it GMT? If so, the date should be interpreted as GMT. If the data in the WAL is always the local TZ, I should remove the date verification test (which in a sense is pointless anyway, because it just tests whether SimpleDateFormat works). M/R tool to replay WAL files Key: HBASE-5604 URL: https://issues.apache.org/jira/browse/HBASE-5604 Project: HBase Issue Type: New Feature Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.94.0, 0.96.0 Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt Just an idea I had. Might be useful for restore of a backup using the HLogs. This could an M/R (with a mapper per HLog file). The tool would get a timerange and a (set of) table(s). We'd pick the right HLogs based on time before the M/R job is started and then have a mapper per HLog file. The mapper would then go through the HLog, filter all WALEdits that didn't fit into the time range or are not any of the tables and then uses HFileOutputFormat to generate HFiles. Would need to indicate the splits we want, probably from a live table. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files
[ https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253824#comment-13253824 ] Lars Hofhansl commented on HBASE-5604: -- @Stack and Ted: I'll just remove that test. The HFile pretty printer will show time in the same TZ as WALPlayer would parse it, which is still useful. I'll not file an extra jira but post an addendum to this one. M/R tool to replay WAL files Key: HBASE-5604 URL: https://issues.apache.org/jira/browse/HBASE-5604 Project: HBase Issue Type: New Feature Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.94.0, 0.96.0 Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt Just an idea I had. Might be useful for restore of a backup using the HLogs. This could an M/R (with a mapper per HLog file). The tool would get a timerange and a (set of) table(s). We'd pick the right HLogs based on time before the M/R job is started and then have a mapper per HLog file. The mapper would then go through the HLog, filter all WALEdits that didn't fit into the time range or are not any of the tables and then uses HFileOutputFormat to generate HFiles. Would need to indicate the splits we want, probably from a live table. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files
[ https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253839#comment-13253839 ] Lars Hofhansl commented on HBASE-5604: -- OK. Done. I'll use that build as RC1. M/R tool to replay WAL files Key: HBASE-5604 URL: https://issues.apache.org/jira/browse/HBASE-5604 Project: HBase Issue Type: New Feature Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.94.0, 0.96.0 Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt Just an idea I had. Might be useful for restore of a backup using the HLogs. This could an M/R (with a mapper per HLog file). The tool would get a timerange and a (set of) table(s). We'd pick the right HLogs based on time before the M/R job is started and then have a mapper per HLog file. The mapper would then go through the HLog, filter all WALEdits that didn't fit into the time range or are not any of the tables and then uses HFileOutputFormat to generate HFiles. Would need to indicate the splits we want, probably from a live table. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5782) Not all the regions are getting assigned after the log splitting.
[ https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253886#comment-13253886 ] Lars Hofhansl commented on HBASE-5782: -- Did you have HBASE-5778 enabled? Not all the regions are getting assigned after the log splitting. - Key: HBASE-5782 URL: https://issues.apache.org/jira/browse/HBASE-5782 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.94.0 Reporter: Gopinathan A Priority: Critical Create a table with 1000 splits, after the region assignemnt, kill the regionserver wich contains META table. Here few regions are missing after the log splitting and region assigment. HBCK report shows multiple region holes are got created. Same scenario was verified mulitple times in 0.92.1, no issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5790) ZKUtil deleteRecurisively should be a recoverable operation
[ https://issues.apache.org/jira/browse/HBASE-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253986#comment-13253986 ] Lars Hofhansl commented on HBASE-5790: -- Patch looks good. Will there be any request size problem if the subtree to delete is large? Super minor nit: Should point out in Javadoc that deleteRecursively and an idempotent operation. ZKUtil deleteRecurisively should be a recoverable operation --- Key: HBASE-5790 URL: https://issues.apache.org/jira/browse/HBASE-5790 Project: HBase Issue Type: Improvement Reporter: Jesse Yates Assignee: Jesse Yates Labels: zookeeper Fix For: 0.96.0, 0.94.1 Attachments: java_HBASE-5790.patch As of 3.4.3 Zookeeper now has full, multi-operation transaction. This means we can wholesale delete chunks of the zk tree and ensure that we don't have any pesky recursive delete issues where we delete the children of a node, but then a child joins before deletion of the parent. Even without transactions, this should be the behavior, but it is possible to make it much cleaner now that we have this new feature in zk. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5782) Not all the regions are getting assigned after the log splitting.
[ https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253990#comment-13253990 ] Lars Hofhansl commented on HBASE-5782: -- You tried multiple times in 0.92.1, and no problem? Any chance to capture this in a test? (Although that might be difficult) Not all the regions are getting assigned after the log splitting. - Key: HBASE-5782 URL: https://issues.apache.org/jira/browse/HBASE-5782 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.94.0 Reporter: Gopinathan A Priority: Critical Create a table with 1000 splits, after the region assignemnt, kill the regionserver wich contains META table. Here few regions are missing after the log splitting and region assigment. HBCK report shows multiple region holes are got created. Same scenario was verified mulitple times in 0.92.1, no issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5778) Turn on WAL compression by default
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253995#comment-13253995 ] Lars Hofhansl commented on HBASE-5778: -- This fundamentally break replication. The problem above is actually that the HLogKey and WALEdit after being read from a compressed HLog have the compression context set, and hence this will be used to compress them when sent over the wire to the sink. Of course the sink does not know how to uncompress. So I just set the compression context to null in ReplicationSource. With that hurdle out of the way, I find that seeking to a specific position in the HLog (the position stored in ZK) does not work, because the dictionary is not build up (compressed HLogs always need to read from the beginning). Not sure how to fix the 2nd part. Turn on WAL compression by default -- Key: HBASE-5778 URL: https://issues.apache.org/jira/browse/HBASE-5778 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Assignee: Lars Hofhansl Priority: Blocker Fix For: 0.96.0, 0.94.1 Attachments: 5778-addendum.txt, 5778.addendum, HBASE-5778.patch I ran some tests to verify if WAL compression should be turned on by default. For a use case where it's not very useful (values two order of magnitude bigger than the keys), the insert time wasn't different and the CPU usage 15% higher (150% CPU usage VS 130% when not compressing the WAL). When values are smaller than the keys, I saw a 38% improvement for the insert run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure WAL compression accounts for all the additional CPU usage, it might just be that we're able to insert faster and we spend more time in the MemStore per second (because our MemStores are bad when they contain tens of thousands of values). Those are two extremes, but it shows that for the price of some CPU we can save a lot. My machines have 2 quads with HT, so I still had a lot of idle CPUs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5782) Not all the regions are getting assigned after the log splitting.
[ https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253996#comment-13253996 ] Lars Hofhansl commented on HBASE-5782: -- And is there anything interesting in the log files? Not all the regions are getting assigned after the log splitting. - Key: HBASE-5782 URL: https://issues.apache.org/jira/browse/HBASE-5782 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.94.0 Reporter: Gopinathan A Priority: Critical Create a table with 1000 splits, after the region assignemnt, kill the regionserver wich contains META table. Here few regions are missing after the log splitting and region assigment. HBCK report shows multiple region holes are got created. Same scenario was verified mulitple times in 0.92.1, no issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files
[ https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252530#comment-13252530 ] Lars Hofhansl commented on HBASE-5604: -- Since this is only new code I propose this for 0.94.0 as well. M/R tool to replay WAL files Key: HBASE-5604 URL: https://issues.apache.org/jira/browse/HBASE-5604 Project: HBase Issue Type: New Feature Reporter: Lars Hofhansl Assignee: Lars Hofhansl Attachments: 5604-v4.txt, 5604-v6.txt, 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt Just an idea I had. Might be useful for restore of a backup using the HLogs. This could an M/R (with a mapper per HLog file). The tool would get a timerange and a (set of) table(s). We'd pick the right HLogs based on time before the M/R job is started and then have a mapper per HLog file. The mapper would then go through the HLog, filter all WALEdits that didn't fit into the time range or are not any of the tables and then uses HFileOutputFormat to generate HFiles. Would need to indicate the splits we want, probably from a live table. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region
[ https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252533#comment-13252533 ] Lars Hofhansl commented on HBASE-5677: -- @Stack: This seems like a good addition to 0.94. Let's say if we can track down the test failures today we'll put this in 0.94.0 otherwise 0.94.1 The master never does balance because duplicate openhandled the one region -- Key: HBASE-5677 URL: https://issues.apache.org/jira/browse/HBASE-5677 Project: HBase Issue Type: Bug Affects Versions: 0.90.6 Environment: 0.90 Reporter: xufeng Assignee: xufeng Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0 Attachments: 5677-proposal.txt, HBASE-5677-90-v1.patch, surefire-report_no_patched_v1.html, surefire-report_patched_v1.html If region be assigned When the master is doing initialization(before do processFailover),the region will be duplicate openhandled. because the unassigned node in zookeeper will be handled again in AssignmentManager#processFailover() it cause the region in RIT,thus the master never does balance. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3443) ICV optimization to look in memstore first and then store files (HBASE-3082) does not work when deletes are in the mix
[ https://issues.apache.org/jira/browse/HBASE-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252726#comment-13252726 ] Lars Hofhansl commented on HBASE-3443: -- @Stack: Wasn't sure about this one. Didn't feel right to add this to a performance release :) You think this should go into 0.94? Might be good to have this correctness fix early. ICV optimization to look in memstore first and then store files (HBASE-3082) does not work when deletes are in the mix -- Key: HBASE-3443 URL: https://issues.apache.org/jira/browse/HBASE-3443 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.0, 0.90.1, 0.90.2, 0.90.3, 0.90.4, 0.90.5, 0.90.6, 0.92.0, 0.92.1 Reporter: Kannan Muthukkaruppan Assignee: Lars Hofhansl Priority: Critical Labels: corruption Fix For: 0.96.0 Attachments: 3443.txt For incrementColumnValue() HBASE-3082 adds an optimization to check memstores first, and only if not present in the memstore then check the store files. In the presence of deletes, the above optimization is not reliable. If the column is marked as deleted in the memstore, one should not look further into the store files. But currently, the code does so. Sample test code outline: {code} admin.createTable(desc) table = HTable.new(conf, tableName) table.incrementColumnValue(Bytes.toBytes(row), cf1name, Bytes.toBytes(column), 5); admin.flush(tableName) sleep(2) del = Delete.new(Bytes.toBytes(row)) table.delete(del) table.incrementColumnValue(Bytes.toBytes(row), cf1name, Bytes.toBytes(column), 5); get = Get.new(Bytes.toBytes(row)) keyValues = table.get(get).raw() keyValues.each do |keyValue| puts Expect 5; Got Value=#{Bytes.toLong(keyValue.getValue())}; end {code} The above prints: {code} Expect 5; Got Value=10 {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5774) Add documentation for WALPlayer to HBase reference guide.
[ https://issues.apache.org/jira/browse/HBASE-5774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252755#comment-13252755 ] Lars Hofhansl commented on HBASE-5774: -- Thanks Doug. I think I'll just roll this up into parent. Add documentation for WALPlayer to HBase reference guide. - Key: HBASE-5774 URL: https://issues.apache.org/jira/browse/HBASE-5774 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Lars Hofhansl Attachments: 5774.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5775) ZKUtil doesn't handle deleteRecurisively cleanly
[ https://issues.apache.org/jira/browse/HBASE-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252831#comment-13252831 ] Lars Hofhansl commented on HBASE-5775: -- Will commit when HadoopQA is done. ZKUtil doesn't handle deleteRecurisively cleanly Key: HBASE-5775 URL: https://issues.apache.org/jira/browse/HBASE-5775 Project: HBase Issue Type: Improvement Affects Versions: 0.94.0 Reporter: Jesse Yates Assignee: Jesse Yates Fix For: 0.94.0, 0.96.0 Attachments: java_HBASE-5775.patch ZKUtil.deleteNodeRecursively()'s contract says that it handles deletion of the node and all its children. However, nothing is mentioned as to what happens if the node you are attempting to delete doesn't actually exist. Turns out, it throws a null pointer exception. I 'm proposing that we change the code s.t. it handles the case where the parent is already gone and exits cleanly, rather than failing horribly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5775) ZKUtil doesn't handle deleteRecurisively cleanly
[ https://issues.apache.org/jira/browse/HBASE-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252889#comment-13252889 ] Lars Hofhansl commented on HBASE-5775: -- Thanks for the patch. ZKUtil doesn't handle deleteRecurisively cleanly Key: HBASE-5775 URL: https://issues.apache.org/jira/browse/HBASE-5775 Project: HBase Issue Type: Improvement Affects Versions: 0.94.0 Reporter: Jesse Yates Assignee: Jesse Yates Fix For: 0.94.0, 0.96.0 Attachments: java_HBASE-5775.patch ZKUtil.deleteNodeRecursively()'s contract says that it handles deletion of the node and all its children. However, nothing is mentioned as to what happens if the node you are attempting to delete doesn't actually exist. Turns out, it throws a null pointer exception. I 'm proposing that we change the code s.t. it handles the case where the parent is already gone and exits cleanly, rather than failing horribly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files
[ https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252892#comment-13252892 ] Lars Hofhansl commented on HBASE-5604: -- @Ted: Are you OK with latest patch? If so, I'll commit today. M/R tool to replay WAL files Key: HBASE-5604 URL: https://issues.apache.org/jira/browse/HBASE-5604 Project: HBase Issue Type: New Feature Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.94.0, 0.96.0 Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt Just an idea I had. Might be useful for restore of a backup using the HLogs. This could an M/R (with a mapper per HLog file). The tool would get a timerange and a (set of) table(s). We'd pick the right HLogs based on time before the M/R job is started and then have a mapper per HLog file. The mapper would then go through the HLog, filter all WALEdits that didn't fit into the time range or are not any of the tables and then uses HFileOutputFormat to generate HFiles. Would need to indicate the splits we want, probably from a live table. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files
[ https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252934#comment-13252934 ] Lars Hofhansl commented on HBASE-5604: -- And I did insert the spaces before commit :) M/R tool to replay WAL files Key: HBASE-5604 URL: https://issues.apache.org/jira/browse/HBASE-5604 Project: HBase Issue Type: New Feature Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.94.0, 0.96.0 Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt Just an idea I had. Might be useful for restore of a backup using the HLogs. This could an M/R (with a mapper per HLog file). The tool would get a timerange and a (set of) table(s). We'd pick the right HLogs based on time before the M/R job is started and then have a mapper per HLog file. The mapper would then go through the HLog, filter all WALEdits that didn't fit into the time range or are not any of the tables and then uses HFileOutputFormat to generate HFiles. Would need to indicate the splits we want, probably from a live table. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5778) Turn on WAL compression by default
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252939#comment-13252939 ] Lars Hofhansl commented on HBASE-5778: -- +1 on patch Turn on WAL compression by default -- Key: HBASE-5778 URL: https://issues.apache.org/jira/browse/HBASE-5778 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Priority: Blocker Fix For: 0.94.0, 0.96.0 Attachments: HBASE-5778.patch I ran some tests to verify if WAL compression should be turned on by default. For a use case where it's not very useful (values two order of magnitude bigger than the keys), the insert time wasn't different and the CPU usage 15% higher (150% CPU usage VS 130% when not compressing the WAL). When values are smaller than the keys, I saw a 38% improvement for the insert run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure WAL compression accounts for all the additional CPU usage, it might just be that we're able to insert faster and we spend more time in the MemStore per second (because our MemStores are bad when they contain tens of thousands of values). Those are two extremes, but it shows that for the price of some CPU we can save a lot. My machines have 2 quads with HT, so I still had a lot of idle CPUs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region
[ https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252950#comment-13252950 ] Lars Hofhansl commented on HBASE-5677: -- I ran TestReplication with this patch (5577-proposal.txt) and it passed fine. The master never does balance because duplicate openhandled the one region -- Key: HBASE-5677 URL: https://issues.apache.org/jira/browse/HBASE-5677 Project: HBase Issue Type: Bug Affects Versions: 0.90.6 Environment: 0.90 Reporter: xufeng Assignee: xufeng Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0 Attachments: 5677-proposal.txt, HBASE-5677-90-v1.patch, surefire-report_no_patched_v1.html, surefire-report_patched_v1.html If region be assigned When the master is doing initialization(before do processFailover),the region will be duplicate openhandled. because the unassigned node in zookeeper will be handled again in AssignmentManager#processFailover() it cause the region in RIT,thus the master never does balance. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region
[ https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252953#comment-13252953 ] Lars Hofhansl commented on HBASE-5677: -- TestImportTsv passes as well. The master never does balance because duplicate openhandled the one region -- Key: HBASE-5677 URL: https://issues.apache.org/jira/browse/HBASE-5677 Project: HBase Issue Type: Bug Affects Versions: 0.90.6 Environment: 0.90 Reporter: xufeng Assignee: xufeng Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0 Attachments: 5677-proposal.txt, 5677-proposal.txt, HBASE-5677-90-v1.patch, surefire-report_no_patched_v1.html, surefire-report_patched_v1.html If region be assigned When the master is doing initialization(before do processFailover),the region will be duplicate openhandled. because the unassigned node in zookeeper will be handled again in AssignmentManager#processFailover() it cause the region in RIT,thus the master never does balance. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region
[ https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253046#comment-13253046 ] Lars Hofhansl commented on HBASE-5677: -- @xufeng: I made a trunk patch, so that I can get a HadoopQA test run. All the failures are unrelated, though. In fact they are all because of negative array sizes if WALEdit de-serialization, which is suspicious. The master never does balance because duplicate openhandled the one region -- Key: HBASE-5677 URL: https://issues.apache.org/jira/browse/HBASE-5677 Project: HBase Issue Type: Bug Affects Versions: 0.90.6 Environment: 0.90 Reporter: xufeng Assignee: xufeng Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0 Attachments: 5677-proposal.txt, 5677-proposal.txt, 5677-proposal.txt, HBASE-5677-90-v1.patch, surefire-report_no_patched_v1.html, surefire-report_patched_v1.html If region be assigned When the master is doing initialization(before do processFailover),the region will be duplicate openhandled. because the unassigned node in zookeeper will be handled again in AssignmentManager#processFailover() it cause the region in RIT,thus the master never does balance. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5778) Turn on WAL compression by default
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253053#comment-13253053 ] Lars Hofhansl commented on HBASE-5778: -- I see a bunch of suspicious test failures now: {code} java.lang.NegativeArraySizeException at org.apache.hadoop.hbase.regionserver.wal.HLogKey.readFields(HLogKey.java:305) {code} Turn on WAL compression by default -- Key: HBASE-5778 URL: https://issues.apache.org/jira/browse/HBASE-5778 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Blocker Fix For: 0.94.0, 0.96.0 Attachments: HBASE-5778.patch I ran some tests to verify if WAL compression should be turned on by default. For a use case where it's not very useful (values two order of magnitude bigger than the keys), the insert time wasn't different and the CPU usage 15% higher (150% CPU usage VS 130% when not compressing the WAL). When values are smaller than the keys, I saw a 38% improvement for the insert run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure WAL compression accounts for all the additional CPU usage, it might just be that we're able to insert faster and we spend more time in the MemStore per second (because our MemStores are bad when they contain tens of thousands of values). Those are two extremes, but it shows that for the price of some CPU we can save a lot. My machines have 2 quads with HT, so I still had a lot of idle CPUs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5778) Turn on WAL compression by default
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253058#comment-13253058 ] Lars Hofhansl commented on HBASE-5778: -- Yeah... The failures in TestHLog are because of this. Need to rollback or figure out what the problem is. Probably test related. Turn on WAL compression by default -- Key: HBASE-5778 URL: https://issues.apache.org/jira/browse/HBASE-5778 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Blocker Fix For: 0.94.0, 0.96.0 Attachments: HBASE-5778.patch I ran some tests to verify if WAL compression should be turned on by default. For a use case where it's not very useful (values two order of magnitude bigger than the keys), the insert time wasn't different and the CPU usage 15% higher (150% CPU usage VS 130% when not compressing the WAL). When values are smaller than the keys, I saw a 38% improvement for the insert run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure WAL compression accounts for all the additional CPU usage, it might just be that we're able to insert faster and we spend more time in the MemStore per second (because our MemStores are bad when they contain tens of thousands of values). Those are two extremes, but it shows that for the price of some CPU we can save a lot. My machines have 2 quads with HT, so I still had a lot of idle CPUs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5778) Turn on WAL compression by default
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253061#comment-13253061 ] Lars Hofhansl commented on HBASE-5778: -- TestHLog.testAppendClose() uses this to read back the WALEdits: {code} // Make sure you can read all the content SequenceFile.Reader reader = new SequenceFile.Reader(this.fs, walPath, this.conf); {code} Well, dah, that does not work. Turn on WAL compression by default -- Key: HBASE-5778 URL: https://issues.apache.org/jira/browse/HBASE-5778 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Blocker Fix For: 0.94.0, 0.96.0 Attachments: HBASE-5778.patch I ran some tests to verify if WAL compression should be turned on by default. For a use case where it's not very useful (values two order of magnitude bigger than the keys), the insert time wasn't different and the CPU usage 15% higher (150% CPU usage VS 130% when not compressing the WAL). When values are smaller than the keys, I saw a 38% improvement for the insert run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure WAL compression accounts for all the additional CPU usage, it might just be that we're able to insert faster and we spend more time in the MemStore per second (because our MemStores are bad when they contain tens of thousands of values). Those are two extremes, but it shows that for the price of some CPU we can save a lot. My machines have 2 quads with HT, so I still had a lot of idle CPUs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5778) Turn on WAL compression by default
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253063#comment-13253063 ] Lars Hofhansl commented on HBASE-5778: -- Then there's FaultySequenceFileLogReader, which does not do the right thing. Turn on WAL compression by default -- Key: HBASE-5778 URL: https://issues.apache.org/jira/browse/HBASE-5778 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Blocker Fix For: 0.94.0, 0.96.0 Attachments: HBASE-5778.patch I ran some tests to verify if WAL compression should be turned on by default. For a use case where it's not very useful (values two order of magnitude bigger than the keys), the insert time wasn't different and the CPU usage 15% higher (150% CPU usage VS 130% when not compressing the WAL). When values are smaller than the keys, I saw a 38% improvement for the insert run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure WAL compression accounts for all the additional CPU usage, it might just be that we're able to insert faster and we spend more time in the MemStore per second (because our MemStores are bad when they contain tens of thousands of values). Those are two extremes, but it shows that for the price of some CPU we can save a lot. My machines have 2 quads with HT, so I still had a lot of idle CPUs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5778) Turn on WAL compression by default
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253065#comment-13253065 ] Lars Hofhansl commented on HBASE-5778: -- Have a fix for TestHLog, working on TestHLogSplit Turn on WAL compression by default -- Key: HBASE-5778 URL: https://issues.apache.org/jira/browse/HBASE-5778 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Blocker Fix For: 0.94.0, 0.96.0 Attachments: HBASE-5778.patch I ran some tests to verify if WAL compression should be turned on by default. For a use case where it's not very useful (values two order of magnitude bigger than the keys), the insert time wasn't different and the CPU usage 15% higher (150% CPU usage VS 130% when not compressing the WAL). When values are smaller than the keys, I saw a 38% improvement for the insert run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure WAL compression accounts for all the additional CPU usage, it might just be that we're able to insert faster and we spend more time in the MemStore per second (because our MemStores are bad when they contain tens of thousands of values). Those are two extremes, but it shows that for the price of some CPU we can save a lot. My machines have 2 quads with HT, so I still had a lot of idle CPUs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5778) Turn on WAL compression by default
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253091#comment-13253091 ] Lars Hofhansl commented on HBASE-5778: -- Similar... There're some other issues. I'll have a patch soon. Turn on WAL compression by default -- Key: HBASE-5778 URL: https://issues.apache.org/jira/browse/HBASE-5778 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Assignee: Lars Hofhansl Priority: Blocker Fix For: 0.94.0, 0.96.0 Attachments: 5778.addendum, HBASE-5778.patch I ran some tests to verify if WAL compression should be turned on by default. For a use case where it's not very useful (values two order of magnitude bigger than the keys), the insert time wasn't different and the CPU usage 15% higher (150% CPU usage VS 130% when not compressing the WAL). When values are smaller than the keys, I saw a 38% improvement for the insert run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure WAL compression accounts for all the additional CPU usage, it might just be that we're able to insert faster and we spend more time in the MemStore per second (because our MemStores are bad when they contain tens of thousands of values). Those are two extremes, but it shows that for the price of some CPU we can save a lot. My machines have 2 quads with HT, so I still had a lot of idle CPUs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files
[ https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253094#comment-13253094 ] Lars Hofhansl commented on HBASE-5604: -- Yes, should do an addendum. Will mark as large. M/R tool to replay WAL files Key: HBASE-5604 URL: https://issues.apache.org/jira/browse/HBASE-5604 Project: HBase Issue Type: New Feature Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.94.0, 0.96.0 Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt Just an idea I had. Might be useful for restore of a backup using the HLogs. This could an M/R (with a mapper per HLog file). The tool would get a timerange and a (set of) table(s). We'd pick the right HLogs based on time before the M/R job is started and then have a mapper per HLog file. The mapper would then go through the HLog, filter all WALEdits that didn't fit into the time range or are not any of the tables and then uses HFileOutputFormat to generate HFiles. Would need to indicate the splits we want, probably from a live table. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files
[ https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253095#comment-13253095 ] Lars Hofhansl commented on HBASE-5604: -- Done. Thanks Ted. M/R tool to replay WAL files Key: HBASE-5604 URL: https://issues.apache.org/jira/browse/HBASE-5604 Project: HBase Issue Type: New Feature Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.94.0, 0.96.0 Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt Just an idea I had. Might be useful for restore of a backup using the HLogs. This could an M/R (with a mapper per HLog file). The tool would get a timerange and a (set of) table(s). We'd pick the right HLogs based on time before the M/R job is started and then have a mapper per HLog file. The mapper would then go through the HLog, filter all WALEdits that didn't fit into the time range or are not any of the tables and then uses HFileOutputFormat to generate HFiles. Would need to indicate the splits we want, probably from a live table. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5778) Turn on WAL compression by default
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253118#comment-13253118 ] Lars Hofhansl commented on HBASE-5778: -- mvn failed with an OOME. Let's revert this change, until we track these issues down. Turn on WAL compression by default -- Key: HBASE-5778 URL: https://issues.apache.org/jira/browse/HBASE-5778 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Assignee: Lars Hofhansl Priority: Blocker Fix For: 0.94.0, 0.96.0 Attachments: 5778-addendum.txt, 5778.addendum, HBASE-5778.patch I ran some tests to verify if WAL compression should be turned on by default. For a use case where it's not very useful (values two order of magnitude bigger than the keys), the insert time wasn't different and the CPU usage 15% higher (150% CPU usage VS 130% when not compressing the WAL). When values are smaller than the keys, I saw a 38% improvement for the insert run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure WAL compression accounts for all the additional CPU usage, it might just be that we're able to insert faster and we spend more time in the MemStore per second (because our MemStores are bad when they contain tens of thousands of values). Those are two extremes, but it shows that for the price of some CPU we can save a lot. My machines have 2 quads with HT, so I still had a lot of idle CPUs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region
[ https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252057#comment-13252057 ] Lars Hofhansl commented on HBASE-5677: -- Sigh... I think these are test-related issues. In a real cluster the Master is useless unless initialized (although I wouldn't know if that holds for a backup master as well). I'll have a look at these when I get a chance. The master never does balance because duplicate openhandled the one region -- Key: HBASE-5677 URL: https://issues.apache.org/jira/browse/HBASE-5677 Project: HBase Issue Type: Bug Affects Versions: 0.90.6 Environment: 0.90 Reporter: xufeng Assignee: xufeng Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0 Attachments: 5677-proposal.txt, HBASE-5677-90-v1.patch, surefire-report_no_patched_v1.html, surefire-report_patched_v1.html If region be assigned When the master is doing initialization(before do processFailover),the region will be duplicate openhandled. because the unassigned node in zookeeper will be handled again in AssignmentManager#processFailover() it cause the region in RIT,thus the master never does balance. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5741) ImportTsv does not check for table existence
[ https://issues.apache.org/jira/browse/HBASE-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252064#comment-13252064 ] Lars Hofhansl commented on HBASE-5741: -- Sorry for chiming in late here, but do we actually want ImportTsv to create the table for us? Personally I'd rather have it fail, which forces me to think about how I want create the table (pre split, compression, etc). ImportTsv does not check for table existence - Key: HBASE-5741 URL: https://issues.apache.org/jira/browse/HBASE-5741 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.90.4 Reporter: Clint Heath Assignee: Himanshu Vashishtha Fix For: 0.94.0, 0.96.0 Attachments: 5741-94.txt, 5741-v3.txt, HBase-5741-v2.patch, HBase-5741.patch The usage statement for the importtsv command to hbase claims this: Note: if you do not use this option, then the target table must already exist in HBase (in reference to the importtsv.bulk.output command-line option) The truth is, the table must exist no matter what, importtsv cannot and will not create it for you. This is the case because the createSubmittableJob method of ImportTsv does not even attempt to check if the table exists already, much less create it: (From org.apache.hadoop.hbase.mapreduce.ImportTsv.java) 305 HTable table = new HTable(conf, tableName); The HTable method signature in use there assumes the table exists and runs a meta scan on it: (From org.apache.hadoop.hbase.client.HTable.java) 142 * Creates an object to access a HBase table. ... 151 public HTable(Configuration conf, final String tableName) What we should do inside of createSubmittableJob is something similar to what the completebulkloads command would do: (Taken from org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.java) 690 boolean tableExists = this.doesTableExist(tableName); 691 if (!tableExists) this.createTable(tableName,dirPath); Currently the docs are misleading, the table in fact must exist prior to running importtsv. We should check if it exists rather than assume it's already there and throw the below exception: 12/03/14 17:15:42 WARN client.HConnectionManager$HConnectionImplementation: Encountered problems when prefetch META table: org.apache.hadoop.hbase.TableNotFoundException: Cannot find row in .META. for table: myTable2, row=myTable2,,99 at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:150) ... -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5741) ImportTsv does not check for table existence
[ https://issues.apache.org/jira/browse/HBASE-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252076#comment-13252076 ] Lars Hofhansl commented on HBASE-5741: -- Import does not do that either. At least we should be consistent between the various importing tools. It seems overkill to burden these tools with that. That said, I am not opposed, just questioning the usefulness. ImportTsv does not check for table existence - Key: HBASE-5741 URL: https://issues.apache.org/jira/browse/HBASE-5741 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.90.4 Reporter: Clint Heath Assignee: Himanshu Vashishtha Fix For: 0.94.0, 0.96.0 Attachments: 5741-94.txt, 5741-v3.txt, HBase-5741-v2.patch, HBase-5741.patch The usage statement for the importtsv command to hbase claims this: Note: if you do not use this option, then the target table must already exist in HBase (in reference to the importtsv.bulk.output command-line option) The truth is, the table must exist no matter what, importtsv cannot and will not create it for you. This is the case because the createSubmittableJob method of ImportTsv does not even attempt to check if the table exists already, much less create it: (From org.apache.hadoop.hbase.mapreduce.ImportTsv.java) 305 HTable table = new HTable(conf, tableName); The HTable method signature in use there assumes the table exists and runs a meta scan on it: (From org.apache.hadoop.hbase.client.HTable.java) 142 * Creates an object to access a HBase table. ... 151 public HTable(Configuration conf, final String tableName) What we should do inside of createSubmittableJob is something similar to what the completebulkloads command would do: (Taken from org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.java) 690 boolean tableExists = this.doesTableExist(tableName); 691 if (!tableExists) this.createTable(tableName,dirPath); Currently the docs are misleading, the table in fact must exist prior to running importtsv. We should check if it exists rather than assume it's already there and throw the below exception: 12/03/14 17:15:42 WARN client.HConnectionManager$HConnectionImplementation: Encountered problems when prefetch META table: org.apache.hadoop.hbase.TableNotFoundException: Cannot find row in .META. for table: myTable2, row=myTable2,,99 at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:150) ... -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5741) ImportTsv does not check for table existence
[ https://issues.apache.org/jira/browse/HBASE-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252095#comment-13252095 ] Lars Hofhansl commented on HBASE-5741: -- Even the created HFileOutputFormat use the table in HBase to find out about splits. We will rarely bulk import into an empty table that is not pre-split...? (I'm assuming here, you folks at Cloudera see many more use cases that I do) If you believe strongly that we should add this, let's do it :) We can do Import in a separate patch, or not do that as we see fit. ImportTsv does not check for table existence - Key: HBASE-5741 URL: https://issues.apache.org/jira/browse/HBASE-5741 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.90.4 Reporter: Clint Heath Assignee: Himanshu Vashishtha Fix For: 0.94.0, 0.96.0 Attachments: 5741-94.txt, 5741-v3.txt, HBase-5741-v2.patch, HBase-5741.patch The usage statement for the importtsv command to hbase claims this: Note: if you do not use this option, then the target table must already exist in HBase (in reference to the importtsv.bulk.output command-line option) The truth is, the table must exist no matter what, importtsv cannot and will not create it for you. This is the case because the createSubmittableJob method of ImportTsv does not even attempt to check if the table exists already, much less create it: (From org.apache.hadoop.hbase.mapreduce.ImportTsv.java) 305 HTable table = new HTable(conf, tableName); The HTable method signature in use there assumes the table exists and runs a meta scan on it: (From org.apache.hadoop.hbase.client.HTable.java) 142 * Creates an object to access a HBase table. ... 151 public HTable(Configuration conf, final String tableName) What we should do inside of createSubmittableJob is something similar to what the completebulkloads command would do: (Taken from org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.java) 690 boolean tableExists = this.doesTableExist(tableName); 691 if (!tableExists) this.createTable(tableName,dirPath); Currently the docs are misleading, the table in fact must exist prior to running importtsv. We should check if it exists rather than assume it's already there and throw the below exception: 12/03/14 17:15:42 WARN client.HConnectionManager$HConnectionImplementation: Encountered problems when prefetch META table: org.apache.hadoop.hbase.TableNotFoundException: Cannot find row in .META. for table: myTable2, row=myTable2,,99 at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:150) ... -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5604) M/R tools to replay WAL files
[ https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13250959#comment-13250959 ] Lars Hofhansl commented on HBASE-5604: -- * Let's not overkill on the exceptions. This is an inner class of WALPlayer, WALPlayer will always pass the correct arguments. Maybe I'll make the mapper classed private and remove all exception handling. * Thought about using SimpleDateFormat, then punted. I guess you're right, should make it bit more user friendly. * The TODO was left over. Not sure that even makes sense. M/R tools to replay WAL files - Key: HBASE-5604 URL: https://issues.apache.org/jira/browse/HBASE-5604 Project: HBase Issue Type: New Feature Reporter: Lars Hofhansl Assignee: Lars Hofhansl Attachments: 5604-v4.txt, 5604-v6.txt, 5604-v7.txt, 5604-v8.txt, HLog-5604-v3.txt Just an idea I had. Might be useful for restore of a backup using the HLogs. This could an M/R (with a mapper per HLog file). The tool would get a timerange and a (set of) table(s). We'd pick the right HLogs based on time before the M/R job is started and then have a mapper per HLog file. The mapper would then go through the HLog, filter all WALEdits that didn't fit into the time range or are not any of the tables and then uses HFileOutputFormat to generate HFiles. Would need to indicate the splits we want, probably from a live table. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3443) ICV optimization to look in memstore first and then store files (HBASE-3082) does not work when deletes are in the mix
[ https://issues.apache.org/jira/browse/HBASE-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13250968#comment-13250968 ] Lars Hofhansl commented on HBASE-3443: -- For 0.96, though. I can take this. Can't promise to work on it today, though. Had checked out the code a while ago to make ICV correct w.r.t. to WAL updates, this involves mostly removing some crufty code. ICV optimization to look in memstore first and then store files (HBASE-3082) does not work when deletes are in the mix -- Key: HBASE-3443 URL: https://issues.apache.org/jira/browse/HBASE-3443 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.0, 0.90.1, 0.90.2, 0.90.3, 0.90.4, 0.90.5, 0.90.6, 0.92.0, 0.92.1 Reporter: Kannan Muthukkaruppan Priority: Critical Labels: corruption For incrementColumnValue() HBASE-3082 adds an optimization to check memstores first, and only if not present in the memstore then check the store files. In the presence of deletes, the above optimization is not reliable. If the column is marked as deleted in the memstore, one should not look further into the store files. But currently, the code does so. Sample test code outline: {code} admin.createTable(desc) table = HTable.new(conf, tableName) table.incrementColumnValue(Bytes.toBytes(row), cf1name, Bytes.toBytes(column), 5); admin.flush(tableName) sleep(2) del = Delete.new(Bytes.toBytes(row)) table.delete(del) table.incrementColumnValue(Bytes.toBytes(row), cf1name, Bytes.toBytes(column), 5); get = Get.new(Bytes.toBytes(row)) keyValues = table.get(get).raw() keyValues.each do |keyValue| puts Expect 5; Got Value=#{Bytes.toLong(keyValue.getValue())}; end {code} The above prints: {code} Expect 5; Got Value=10 {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5728) Methods Missing in HTableInterface
[ https://issues.apache.org/jira/browse/HBASE-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13250985#comment-13250985 ] Lars Hofhansl commented on HBASE-5728: -- {code} public int getScannerCaching(); public void setScannerCaching(int scannerCaching); {code} These two are deprecated (in trunk at least), let's not add them to the interface. Methods Missing in HTableInterface -- Key: HBASE-5728 URL: https://issues.apache.org/jira/browse/HBASE-5728 Project: HBase Issue Type: Improvement Components: client Reporter: Bing Li Dear all, I found some methods existed in HTable were not in HTableInterface. setAutoFlush setWriteBufferSize ... In most cases, I manipulate HBase through HTableInterface from HTablePool. If I need to use the above methods, how to do that? I am considering writing my own table pool if no proper ways. Is it fine? Thanks so much! Best regards, Bing -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region
[ https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13251006#comment-13251006 ] Lars Hofhansl commented on HBASE-5677: -- Is this at all similar to HBASE-5615? Maybe here too, there is no point holding up 0.94.0 for this. The master never does balance because duplicate openhandled the one region -- Key: HBASE-5677 URL: https://issues.apache.org/jira/browse/HBASE-5677 Project: HBase Issue Type: Bug Affects Versions: 0.90.6 Environment: 0.90 Reporter: xufeng Assignee: xufeng Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0 Attachments: HBASE-5677-90-v1.patch, surefire-report_no_patched_v1.html, surefire-report_patched_v1.html If region be assigned When the master is doing initialization(before do processFailover),the region will be duplicate openhandled. because the unassigned node in zookeeper will be handled again in AssignmentManager#processFailover() it cause the region in RIT,thus the master never does balance. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5604) M/R tools to replay WAL files
[ https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13251053#comment-13251053 ] Lars Hofhansl commented on HBASE-5604: -- @Stack: Yeah, probably. You'd vote for leaving it the way it is? M/R tools to replay WAL files - Key: HBASE-5604 URL: https://issues.apache.org/jira/browse/HBASE-5604 Project: HBase Issue Type: New Feature Reporter: Lars Hofhansl Assignee: Lars Hofhansl Attachments: 5604-v4.txt, 5604-v6.txt, 5604-v7.txt, 5604-v8.txt, HLog-5604-v3.txt Just an idea I had. Might be useful for restore of a backup using the HLogs. This could an M/R (with a mapper per HLog file). The tool would get a timerange and a (set of) table(s). We'd pick the right HLogs based on time before the M/R job is started and then have a mapper per HLog file. The mapper would then go through the HLog, filter all WALEdits that didn't fit into the time range or are not any of the tables and then uses HFileOutputFormat to generate HFiles. Would need to indicate the splits we want, probably from a live table. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region
[ https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13251189#comment-13251189 ] Lars Hofhansl commented on HBASE-5677: -- bq. public boolean isMasterRunning() { return !isStopped() isInitialized(); } But Some class like HMerge,this tool just care the master is running or not. Is that actually a problem? HMerge would need to wait until the master is initialized. Seems generally a better a better condition for running than just having the process up. The master never does balance because duplicate openhandled the one region -- Key: HBASE-5677 URL: https://issues.apache.org/jira/browse/HBASE-5677 Project: HBase Issue Type: Bug Affects Versions: 0.90.6 Environment: 0.90 Reporter: xufeng Assignee: xufeng Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0 Attachments: HBASE-5677-90-v1.patch, surefire-report_no_patched_v1.html, surefire-report_patched_v1.html If region be assigned When the master is doing initialization(before do processFailover),the region will be duplicate openhandled. because the unassigned node in zookeeper will be handled again in AssignmentManager#processFailover() it cause the region in RIT,thus the master never does balance. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5756) we can change defalult File Appender to RFA instead of DRFA.
[ https://issues.apache.org/jira/browse/HBASE-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13251315#comment-13251315 ] Lars Hofhansl commented on HBASE-5756: -- Anybody can still change the log4j.properties to get the desired behavior. I see no need to put into 0.94. That said, the change in HBASE-5655 is simple, if you believe strongly I'll put it into 0.94. we can change defalult File Appender to RFA instead of DRFA. Key: HBASE-5756 URL: https://issues.apache.org/jira/browse/HBASE-5756 Project: HBase Issue Type: Bug Reporter: rohithsharma Priority: Minor This can be a point of concern when on a certain day the logging happens more because of more and more activity. In that case the log file for that day can grow huge. These logs can not be opened for analysis since size is more. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5755) Region sever looking for master forever with cached stale data.
[ https://issues.apache.org/jira/browse/HBASE-5755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13251327#comment-13251327 ] Lars Hofhansl commented on HBASE-5755: -- Looks like MasterAddressTracker.java was moved from /hbase/trunk/src/main/java/org/apache/hadoop/hbase to /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper in 0.96 Region sever looking for master forever with cached stale data. --- Key: HBASE-5755 URL: https://issues.apache.org/jira/browse/HBASE-5755 Project: HBase Issue Type: Bug Affects Versions: 0.96.0 Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 0.96.0 Attachments: hbase-5755.patch When the master address tracker doesn't have the master address ZK data, or the cached data is wrong, region server should not use the cached data. It should pull the data from ZK directly again. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5751) hbase master stop does not bring down backup masters
[ https://issues.apache.org/jira/browse/HBASE-5751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13250056#comment-13250056 ] Lars Hofhansl commented on HBASE-5751: -- Tail of discussion from parent: stack added a comment - 06/Apr/12 15:03 This seems to have broken 0.90 builds. Can we revert from 0.90 trunk till fixed. See here: https://builds.apache.org/view/G-L/view/HBase/job/hbase-0.90/ Build #458 is where it got committed and thereafter the build fails rolling the wal test. David S. Wang added a comment - 06/Apr/12 15:18 Greg is out of town for the next week or so ... IMHO if it's breaking 0.90, we should revert for now and Greg can look at it when he gets back. Zhihong Yu added a comment - 06/Apr/12 16:11 TestLogRolling hangs in 0.90 Jonathan Hsieh added a comment - 06/Apr/12 16:21 I'll revert the 0.90 version. Sorry about this fellas. Lars Hofhansl added a comment - 09/Apr/12 03:31 So from the discussion the test is only a problem in 0.90? Do we know why this is? Asking because we're close to another RC attempt for 0.94.0. Jonathan Hsieh added a comment - 09/Apr/12 03:37 The issue is apparently only in 0.90. How about we close this issue for 0.92/0.94/trunk and create a follow on issue for 0.90? This will unblock this for 0.94 and Greg can address this when he gets back. hbase master stop does not bring down backup masters -- Key: HBASE-5751 URL: https://issues.apache.org/jira/browse/HBASE-5751 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Gregory Chanan Fix For: 0.90.7 Carry forward the discussion from parent for 0.90 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5615) the master never does balance because of balancing the parent region
[ https://issues.apache.org/jira/browse/HBASE-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13250069#comment-13250069 ] Lars Hofhansl commented on HBASE-5615: -- Thanks Ram... I'll hold up 0.94 for this (unless you think that unnecessary). the master never does balance because of balancing the parent region Key: HBASE-5615 URL: https://issues.apache.org/jira/browse/HBASE-5615 Project: HBase Issue Type: Bug Affects Versions: 0.90.7 Reporter: xufeng Assignee: xufeng Priority: Critical Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0 Attachments: 5615-trunk.txt, HBASE-5615-90.patch, HBASE-5615.patch, NoPatched-surefire-report-5615-90.html, Patched_surefire-report-5615-90.html the master never do balance becauseof when master do rebuildUserRegions(),it will add the parent region into AssignmentManager#servers, if balancer let the parent region to move,the parent will in RIT forever.thus balance will never be executed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5604) HLog replay tool that generates HFiles for use by LoadIncrementalHFiles.
[ https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13250336#comment-13250336 ] Lars Hofhansl commented on HBASE-5604: -- TestForceCacheImportantBlocks passes locally. I'm ready to commit. HLog replay tool that generates HFiles for use by LoadIncrementalHFiles. Key: HBASE-5604 URL: https://issues.apache.org/jira/browse/HBASE-5604 Project: HBase Issue Type: New Feature Reporter: Lars Hofhansl Assignee: Lars Hofhansl Attachments: 5604-v4.txt, 5604-v6.txt, 5604-v7.txt, HLog-5604-v3.txt Just an idea I had. Might be useful for restore of a backup using the HLogs. This could an M/R (with a mapper per HLog file). The tool would get a timerange and a (set of) table(s). We'd pick the right HLogs based on time before the M/R job is started and then have a mapper per HLog file. The mapper would then go through the HLog, filter all WALEdits that didn't fit into the time range or are not any of the tables and then uses HFileOutputFormat to generate HFiles. Would need to indicate the splits we want, probably from a live table. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5604) M/R tools to replay WAL files
[ https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13250439#comment-13250439 ] Lars Hofhansl commented on HBASE-5604: -- Thanks Ted. * I'll add Javadoc. * index of 0 is used here, since when creating HFiles for bulk import only a single table is currently allowed (that is also documented in usage(), but perhaps not clearly enough...?). I'll add a comment to that extent. * That the two arrays are of the same size if guaranteed in WALPlayer.createSubmittableJob(), but perhaps it is better to double check here. * Checking heapSize() seems unnecessary. After all, this is single WALEdit. M/R tools to replay WAL files - Key: HBASE-5604 URL: https://issues.apache.org/jira/browse/HBASE-5604 Project: HBase Issue Type: New Feature Reporter: Lars Hofhansl Assignee: Lars Hofhansl Attachments: 5604-v4.txt, 5604-v6.txt, 5604-v7.txt, HLog-5604-v3.txt Just an idea I had. Might be useful for restore of a backup using the HLogs. This could an M/R (with a mapper per HLog file). The tool would get a timerange and a (set of) table(s). We'd pick the right HLogs based on time before the M/R job is started and then have a mapper per HLog file. The mapper would then go through the HLog, filter all WALEdits that didn't fit into the time range or are not any of the tables and then uses HFileOutputFormat to generate HFiles. Would need to indicate the splits we want, probably from a live table. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region
[ https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13250445#comment-13250445 ] Lars Hofhansl commented on HBASE-5677: -- Seems this should in 0.94.0. Agreed? The master never does balance because duplicate openhandled the one region -- Key: HBASE-5677 URL: https://issues.apache.org/jira/browse/HBASE-5677 Project: HBase Issue Type: Bug Affects Versions: 0.90.6 Environment: 0.90 Reporter: xufeng Assignee: xufeng Attachments: HBASE-5677-90-v1.patch, surefire-report_no_patched_v1.html, surefire-report_patched_v1.html If region be assigned When the master is doing initialization(before do processFailover),the region will be duplicate openhandled. because the unassigned node in zookeeper will be handled again in AssignmentManager#processFailover() it cause the region in RIT,thus the master never does balance. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5656) LoadIncrementalHFiles createTable should detect and set compression algorithm
[ https://issues.apache.org/jira/browse/HBASE-5656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13249567#comment-13249567 ] Lars Hofhansl commented on HBASE-5656: -- +1 on patch. Failures seems unrelated. LoadIncrementalHFiles createTable should detect and set compression algorithm - Key: HBASE-5656 URL: https://issues.apache.org/jira/browse/HBASE-5656 Project: HBase Issue Type: Bug Components: util Affects Versions: 0.92.1 Reporter: Cosmin Lehene Assignee: Cosmin Lehene Fix For: 0.92.2, 0.94.0, 0.96.0 Attachments: 5656-simple.txt, HBASE-5656-0.92.patch, HBASE-5656-0.92.patch, HBASE-5656-0.92.patch, HBASE-5656-0.92.patch Original Estimate: 1h Remaining Estimate: 1h LoadIncrementalHFiles doesn't set compression when creating the the table. This can be detected from the files within each family dir. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-1936) ClassLoader that loads from hdfs; useful adding filters to classpath without having to restart services
[ https://issues.apache.org/jira/browse/HBASE-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13249656#comment-13249656 ] Lars Hofhansl commented on HBASE-1936: -- We have since solved this problem with coprocessors, right? It might make sense to look at how coprocessors handle loading from HDFS and (hopefully) reuse the bits. Thanks for picking this up again Jieshan. ClassLoader that loads from hdfs; useful adding filters to classpath without having to restart services --- Key: HBASE-1936 URL: https://issues.apache.org/jira/browse/HBASE-1936 Project: HBase Issue Type: New Feature Reporter: stack Assignee: Daniel Ploeg Labels: noob Attachments: cp_from_hdfs.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5213) hbase master stop does not bring down backup masters
[ https://issues.apache.org/jira/browse/HBASE-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13249658#comment-13249658 ] Lars Hofhansl commented on HBASE-5213: -- So from the discussion the test is only a problem in 0.90? Do we know why this is? Asking because we're close to another RC attempt for 0.94.0. hbase master stop does not bring down backup masters -- Key: HBASE-5213 URL: https://issues.apache.org/jira/browse/HBASE-5213 Project: HBase Issue Type: Bug Affects Versions: 0.90.5, 0.92.0, 0.94.0, 0.96.0 Reporter: Gregory Chanan Assignee: Gregory Chanan Priority: Minor Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0 Attachments: 5213.jstack, HBASE-5213-v0-trunk.patch, HBASE-5213-v1-trunk.patch, HBASE-5213-v2-90.patch, HBASE-5213-v2-92.patch, HBASE-5213-v2-trunk.patch Typing hbase master stop produces the following message: stop Start cluster shutdown; Master signals RegionServer shutdown It seems like backup masters should be considered part of the cluster, but they are not brought down by hbase master stop. stop-hbase.sh does correctly bring down the backup masters. The same behavior is observed when a client app makes use of the client API HBaseAdmin.shutdown() http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HBaseAdmin.html#shutdown() -- this isn't too surprising since I think hbase master stop just calls this API. It seems like HBASE-1448 address this; perhaps there was a regression? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5720) HFileDataBlockEncoderImpl uses wrong header size when reading HFiles with no checksums
[ https://issues.apache.org/jira/browse/HBASE-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13249328#comment-13249328 ] Lars Hofhansl commented on HBASE-5720: -- That patch looks like it should work. As I mention in an earlier comment, though, this strategy is brittle especially considering future changes. At the same time I cannot think of anything better offhand. Let me move the trunk discussion into another jira and commit the 0.94 fix via this one. Everybody OK with that? HFileDataBlockEncoderImpl uses wrong header size when reading HFiles with no checksums -- Key: HBASE-5720 URL: https://issues.apache.org/jira/browse/HBASE-5720 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.94.0 Reporter: Matt Corgan Priority: Blocker Fix For: 0.94.0 Attachments: 5720-trunk-v2.txt, 5720-trunk.txt, 5720v4.txt, 5720v4.txt, 5720v4.txt, HBASE-5720-v1.patch, HBASE-5720-v2.patch, HBASE-5720-v3.patch When reading a .92 HFile without checksums, encoding it, and storing in the block cache, the HFileDataBlockEncoderImpl always allocates a dummy header appropriate for checksums even though there are none. This corrupts the byte[]. Attaching a patch that allocates a DUMMY_HEADER_NO_CHECKSUM in that case which I think is the desired behavior. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira