[jira] [Created] (HBASE-5089) NIO based REST server
NIO based REST server

Key: HBASE-5089
URL: https://issues.apache.org/jira/browse/HBASE-5089
Project: HBase
Issue Type: Improvement
Components: rest
Affects Versions: 0.94.0
Reporter: Mubarak Seyed
Fix For: 0.94.0

The REST server code uses Jetty 6.1.26, which uses old blocking IO (org.mortbay.jetty.Server). It would be good if the REST server used an NIO based servlet server. The options are as follows:

# [Jetty 7 or above (NIO based)|http://download.eclipse.org/jetty/stable-8/xref/]
# [Netty-based Rest server|https://github.com/netty/netty/wiki/Related-projects]
# [Grizzly|http://grizzly.java.net]

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
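As a stdlib-only sketch (not Jetty, Netty, or HBase code), the core difference NIO brings is that a single selector thread multiplexes many non-blocking channels, instead of dedicating one blocking thread per connection the way old-IO Jetty 6 does:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

// Illustrative only: the accept socket is registered non-blocking with a
// selector, so one thread can service many connections.
public class NioSketch {
    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(0)); // ephemeral port for the sketch
        server.configureBlocking(false);       // the key NIO property
        server.register(selector, SelectionKey.OP_ACCEPT);
        System.out.println("blocking=" + server.isBlocking());
        server.close();
        selector.close();
    }
}
```

An NIO-based servlet container (Jetty 7+'s selector-based connector, Netty, or Grizzly) builds its accept/read loop on exactly this primitive.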
[jira] [Commented] (HBASE-5087) Up the 0.92RC zk to 3.4.1RC0
[ https://issues.apache.org/jira/browse/HBASE-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174691#comment-13174691 ]

Hudson commented on HBASE-5087:
---

Integrated in HBase-0.92-security #48 (See [https://builds.apache.org/job/HBase-0.92-security/48/])
HBASE-5087 Up the 0.92RC zk to 3.4.1RC0
stack: Files:
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/pom.xml

Up the 0.92RC zk to 3.4.1RC0

Key: HBASE-5087
URL: https://issues.apache.org/jira/browse/HBASE-5087
Project: HBase
Issue Type: Task
Reporter: stack
Assignee: stack
Fix For: 0.92.0
Attachments: 5087.txt

ZK just found a bad bug in 3.4.1 (ZOOKEEPER-1333). They put up a fix and a new RC, 3.4.1RC0. (Andrew, you saw Todd's query asking if it'd be possible to hold to zk 3.3.4 and just have 3.4.1 for secure installs?)
[jira] [Commented] (HBASE-2182) NIO based RPC
[ https://issues.apache.org/jira/browse/HBASE-2182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174701#comment-13174701 ]

Mubarak Seyed commented on HBASE-2182:
---

Netty 3.2 ([NETTY-293|https://issues.jboss.org/browse/NETTY-293]) supports the sendfile() system call.

NIO based RPC

Key: HBASE-2182
URL: https://issues.apache.org/jira/browse/HBASE-2182
Project: HBase
Issue Type: Improvement
Components: ipc, performance
Reporter: Andrew Purtell
Assignee: ryan rawson

Reimplement HBase RPC with an NIO framework.
[jira] [Commented] (HBASE-5081) Distributed log splitting deleteNode races against splitLog retry
[ https://issues.apache.org/jira/browse/HBASE-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174761#comment-13174761 ]

Hudson commented on HBASE-5081:
---

Integrated in HBase-0.92 #209 (See [https://builds.apache.org/job/HBase-0.92/209/])
HBASE-5081 Distributed log splitting deleteNode races against splitLog retry; REVERT -- COMMITTED BEFORE REVIEW FINISHED -- AGAIN
HBASE-5081 Distributed log splitting deleteNode races against splitLog retry; REAPPLY
stack: Files:
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/master/TestSplitLogManager.java

Distributed log splitting deleteNode races against splitLog retry

Key: HBASE-5081
URL: https://issues.apache.org/jira/browse/HBASE-5081
Project: HBase
Issue Type: Bug
Components: wal
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Fix For: 0.92.0
Attachments: distributed-log-splitting-screenshot.png, hbase-5081-patch-v6.txt, hbase-5081-patch-v7.txt, hbase-5081_patch_for_92_v4.txt, hbase-5081_patch_v5.txt, patch_for_92.txt, patch_for_92_v2.txt, patch_for_92_v3.txt

Recently, during 0.92 RC testing, we found distributed log splitting hangs there forever. Please see the attached screenshot. I looked into it, and here is what I think happened:
1. One regionserver died; the ServerShutdownHandler found it and started distributed log splitting;
2. All three tasks failed, so the three tasks were deleted, asynchronously;
3. ServerShutdownHandler retried the log splitting;
4. During the retry, it created these three tasks again and put them in a hashmap (tasks);
5. The asynchronous deletion from step 2 finally happened for one task; in the callback, it removed one task from the hashmap;
6. One of the newly submitted tasks' ZooKeeper watchers found that the task is unassigned and not in the hashmap, so it created a new orphan task;
7. All three tasks failed, but the task created in step 6 is an orphan, so the batch.err counter was one short, and the log splitting hangs there forever, waiting for the last task to finish, which is never going to happen.

So I think the problem is step 2. The fix is to make the deletion synchronous instead of asynchronous, so that the retry has a clean start. Async deleteNode messes up the split log retry: in an extreme situation, if the async deleteNode doesn't happen soon enough, a node created during the retry could be deleted. deleteNode should be synchronous.
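The race in steps 2-5 can be modeled with plain java.util.concurrent (names like `/splitlog/wal-1` are hypothetical, not SplitLogManager's actual API): a stale async-delete callback removes a task the retry has just re-created under the same path.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Toy model of the HBASE-5081 race: async deleteNode's callback fires AFTER
// the retry re-inserted the same task, and removes the retry's entry.
public class DeleteRaceSketch {
    public static void main(String[] args) throws Exception {
        ConcurrentHashMap<String, String> tasks = new ConcurrentHashMap<>();
        ExecutorService zkCallbacks = Executors.newSingleThreadExecutor();

        tasks.put("/splitlog/wal-1", "attempt-1");          // step 1-2: task fails
        CountDownLatch retryDone = new CountDownLatch(1);
        Future<?> asyncDelete = zkCallbacks.submit(() -> {  // async deleteNode
            try { retryDone.await(); } catch (InterruptedException ignored) {}
            tasks.remove("/splitlog/wal-1");                // step 5: stale callback
        });

        tasks.put("/splitlog/wal-1", "attempt-2");          // step 3-4: retry re-creates task
        retryDone.countDown();
        asyncDelete.get();

        // The retry's task vanished - the lost task behind the reported hang.
        System.out.println("task present after race: " + tasks.containsKey("/splitlog/wal-1"));
        zkCallbacks.shutdown();
    }
}
```

The fix described above corresponds to waiting for the delete (here, `asyncDelete.get()`) before re-creating the task, so the retry starts from a clean state.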
[jira] [Commented] (HBASE-5081) Distributed log splitting deleteNode races against splitLog retry
[ https://issues.apache.org/jira/browse/HBASE-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174821#comment-13174821 ]

Hudson commented on HBASE-5081:
---

Integrated in HBase-TRUNK #2568 (See [https://builds.apache.org/job/HBase-TRUNK/2568/])
HBASE-5081 Distributed log splitting deleteNode races against splitLog retry; REVERT -- COMMITTED BEFORE REVIEW FINISHED -- AGAIN (patch -p0 -R x.txt)
HBASE-5081 Distributed log splitting deleteNode races against splitLog retry; REAPPLY
stack: Files:
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestSplitLogManager.java
[jira] [Commented] (HBASE-5085) fix test-patch script from setting the ulimit
[ https://issues.apache.org/jira/browse/HBASE-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174822#comment-13174822 ]

Hudson commented on HBASE-5085:
---

Integrated in HBase-TRUNK #2568 (See [https://builds.apache.org/job/HBase-TRUNK/2568/])
HBASE-5085 fix test-patch script from setting the ulimit -- EXPERIMENT
stack: Files:
* /hbase/trunk/dev-support/test-patch.sh

fix test-patch script from setting the ulimit

Key: HBASE-5085
URL: https://issues.apache.org/jira/browse/HBASE-5085
Project: HBase
Issue Type: Bug
Reporter: Giridharan Kesavan
Assignee: Giridharan Kesavan
Attachments: 5085-v2-experiment.txt, 5085-v3-experiment.txt, hbase-5085.patch

The test-patch.sh script sets ulimit -n 1024 just after triggering the patch build; this setting overrides the underlying system's ulimit and hence fails the hbase tests.
[jira] [Commented] (HBASE-5085) fix test-patch script from setting the ulimit
[ https://issues.apache.org/jira/browse/HBASE-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174828#comment-13174828 ]

stack commented on HBASE-5085:
---

So, my script changes have still not been noticed. Maybe the script has to be checked out by Giri for the changes to take effect.
[jira] [Commented] (HBASE-5081) Distributed log splitting deleteNode races against splitLog retry
[ https://issues.apache.org/jira/browse/HBASE-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174840#comment-13174840 ]

jirapos...@reviews.apache.org commented on HBASE-5081:
---

This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3292/#review4076

src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java
https://reviews.apache.org/r/3292/#comment9178
Since the value of status won't change, I think it is better to call deleteNodeNow() here. If we call deleteNodeNow() at line 360, we hold the lock much longer.

- Ted

On 2011-12-22 00:31:23, Jimmy Xiang wrote:
bq. (Updated 2011-12-22 00:31:23)
bq.
bq. Review request for hbase, Ted Yu, Michael Stack, and Lars Hofhansl.
bq.
bq. Summary
bq. ---
bq. In this patch, after a task is done, we don't delete the node if the task failed, so that when it's retried later on, there won't be a race problem. It used to always delete the node.
bq.
bq. This addresses bug HBASE-5081.
bq. https://issues.apache.org/jira/browse/HBASE-5081
bq.
bq. Diffs
bq. ---
bq. src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java 667a8b1
bq. src/test/java/org/apache/hadoop/hbase/master/TestSplitLogManager.java 32ad7e8
bq.
bq. Diff: https://reviews.apache.org/r/3292/diff
bq.
bq. Testing
bq. ---
bq. mvn -Dtest=TestDistributedLogSplitting clean test
bq.
bq. Thanks,
bq. Jimmy
[jira] [Commented] (HBASE-5081) Distributed log splitting deleteNode races against splitLog retry
[ https://issues.apache.org/jira/browse/HBASE-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174847#comment-13174847 ]

jirapos...@reviews.apache.org commented on HBASE-5081:
---

This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3292/#review4077

src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java
https://reviews.apache.org/r/3292/#comment9179
DeleteAsyncCallback is only used by deleteNode(). I think we should simplify the logic by removing deleteNode() and DeleteAsyncCallback - deleteNodeNow() uses RecoverableZooKeeper, which has the retry logic.

- Ted
[jira] [Commented] (HBASE-5089) NIO based REST server
[ https://issues.apache.org/jira/browse/HBASE-5089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174850#comment-13174850 ]

stack commented on HBASE-5089:
---

Sounds good.
[jira] [Commented] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server
[ https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174856#comment-13174856 ]

stack commented on HBASE-4720:
---

@Mubarak Understood (re: formatter). I'd suggest you not run it on the whole file when changing only a few lines; we've not been doing a good job enforcing formatting across the code base, so forcing formatting on all of a file will usually turn up loads of changes. When there are loads of changes, your patch gets big, and big patches are harder to get reviews on. Just some advice for next time. On the test URL making, sounds good.

Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server

Key: HBASE-4720
URL: https://issues.apache.org/jira/browse/HBASE-4720
Project: HBase
Issue Type: Improvement
Reporter: Daniel Lord
Assignee: Mubarak Seyed
Fix For: 0.94.0
Attachments: HBASE-4720.trunk.v1.patch, HBASE-4720.v1.patch, HBASE-4720.v3.patch

I have several large application/HBase clusters where an application node will occasionally need to talk to HBase from a different cluster. In order to help ensure some of my consistency guarantees, I have a sentinel table that is updated atomically as users interact with the system. This works quite well for the regular HBase client, but the REST client does not implement the checkAndPut and checkAndDelete operations. This exposes the application to some race conditions that have to be worked around. It would be ideal if the same checkAndPut/checkAndDelete operations could be supported by the REST client.
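The semantics the sentinel table relies on is compare-then-mutate, applied atomically. A stand-alone illustration with a plain concurrent map (not the HBase or REST API, whose checkAndPut also takes row/family/qualifier coordinates):

```java
import java.util.concurrent.ConcurrentHashMap;

// checkAndPut semantics: write the new value only if the current value
// matches the expected one, as a single atomic step.
public class CheckAndPutSketch {
    static final ConcurrentHashMap<String, String> row = new ConcurrentHashMap<>();

    // expected == null means "succeed only if the cell is absent".
    static boolean checkAndPut(String qualifier, String expected, String update) {
        return expected == null
            ? row.putIfAbsent(qualifier, update) == null
            : row.replace(qualifier, expected, update);
    }

    public static void main(String[] args) {
        System.out.println(checkAndPut("state", null, "LOCKED"));    // cell absent: succeeds
        System.out.println(checkAndPut("state", "FREE", "LOCKED"));  // expectation fails
        System.out.println(checkAndPut("state", "LOCKED", "FREE"));  // matches: succeeds
    }
}
```

Without this atomicity the REST client must do a read followed by a put, which is exactly the race window the issue describes.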
[jira] [Commented] (HBASE-5084) Allow different HTable instances to share one ExecutorService
[ https://issues.apache.org/jira/browse/HBASE-5084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174874#comment-13174874 ]

Bruno Dumon commented on HBASE-5084:
---

(I'm a Lily developer.) The main reason I'm not using that HConnection/ExecutorService constructor is that we are using CDH's HBase, which does not have that HTable constructor yet. Otherwise, I like the idea of managing both the HConnection and the ExecutorService myself, though those seem like two different issues. Rather than requiring the connection not to be null, you'd need a constructor HTable(Configuration, tableName, ExecutorService). Not sure if it's worth adding another constructor; you can as well call HConnectionManager.getConnection().

Allow different HTable instances to share one ExecutorService

Key: HBASE-5084
URL: https://issues.apache.org/jira/browse/HBASE-5084
Project: HBase
Issue Type: Task
Reporter: Zhihong Yu

This came out of the Lily 1.1.1 release: Use a shared ExecutorService for all HTable instances, leading to better (or actual) thread reuse.
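The thread-reuse payoff of the shared-pool idea can be shown with stdlib types alone (the HTable(Configuration, tableName, ExecutorService) constructor named above is the proposed API; plain Runnables stand in for table operations here):

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// One ExecutorService shared by all "table" clients: however many operations
// are submitted, they all run on the pool's few threads.
public class SharedPoolSketch {
    public static void main(String[] args) throws Exception {
        ExecutorService shared = Executors.newFixedThreadPool(2);
        Set<String> threadsUsed = ConcurrentHashMap.newKeySet();

        CountDownLatch done = new CountDownLatch(20);
        for (int i = 0; i < 20; i++) {          // 20 operations from several clients
            shared.submit(() -> {
                threadsUsed.add(Thread.currentThread().getName());
                done.countDown();
            });
        }
        done.await();
        shared.shutdown();
        // All work ran on at most 2 threads - the reuse Lily is after.
        System.out.println("bounded: " + (threadsUsed.size() <= 2));
    }
}
```

Per-HTable pools, by contrast, multiply thread counts by the number of table instances, which is the Lily complaint that motivated the issue.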
[jira] [Commented] (HBASE-5083) Backup HMaster should have http infoport open with link to the active master
[ https://issues.apache.org/jira/browse/HBASE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174881#comment-13174881 ]

stack commented on HBASE-5083:
---

Related: we should make sure the master shows a webpage even if meta is not deployed. Currently meta has to be deployed for the master page to come up. I'd think the backup master webpage would also point to the actual master, if there is one.

Backup HMaster should have http infoport open with link to the active master

Key: HBASE-5083
URL: https://issues.apache.org/jira/browse/HBASE-5083
Project: HBase
Issue Type: Improvement
Components: master
Affects Versions: 0.92.0
Reporter: Jonathan Hsieh

Without ssh'ing and jps/ps'ing, it is difficult to see if a backup HMaster is up. It seems like it would be good for a backup HMaster to have a basic web page up on the info port so that users can see that it is up. It should probably also provide a link to the active master.
[jira] [Commented] (HBASE-4956) Control direct memory buffer consumption by HBaseClient
[ https://issues.apache.org/jira/browse/HBASE-4956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174884#comment-13174884 ]

stack commented on HBASE-4956:
---

@Lars So the supposition is that we bound the number of direct buffers the client allocates by bounding the number of threads in the thread pool (and have all accesses go via the HTable thread pool)? If so, sounds good. What happens if there are as many HTable instances as application-level threads and, say, the application spawns lots of threads? Should the application then go via HTablePool? Or rather, are you suggesting that the application use the new HBASE-4805 API and pass a common executor to all HTable instances?

Control direct memory buffer consumption by HBaseClient

Key: HBASE-4956
URL: https://issues.apache.org/jira/browse/HBASE-4956
Project: HBase
Issue Type: New Feature
Reporter: Ted Yu

As Jonathan explained here https://groups.google.com/group/asynchbase/browse_thread/thread/c45bc7ba788b2357?pli=1 , the standard HBase client inadvertently consumes a large amount of direct memory. We should consider using Netty for NIO-related tasks.
[jira] [Commented] (HBASE-4956) Control direct memory buffer consumption by HBaseClient
[ https://issues.apache.org/jira/browse/HBASE-4956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174892#comment-13174892 ]

Zhihong Yu commented on HBASE-4956:
---

Currently the Lily project uses a hack to share an ExecutorService among all HTable instances. Once HBASE-5084 is integrated, client applications will have full control over reusing an ExecutorService across all HTable instances.
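Why bounding threads bounds direct memory: the JDK's channel IO layer keeps a per-thread cache of temporary direct buffers, so N writer threads can pin roughly N buffers. A stdlib sketch (the ThreadLocal here explicitly models that per-thread cache; it is an illustration, not the JDK's internal mechanism):

```java
import java.nio.ByteBuffer;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// With a fixed pool of 4 threads, 100 "IO operations" still touch at most
// 4 distinct per-thread direct buffers.
public class DirectBufferSketch {
    static final ThreadLocal<ByteBuffer> perThread =
        ThreadLocal.withInitial(() -> ByteBuffer.allocateDirect(1 << 20)); // 1 MB each

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        Set<Integer> bufferIds = ConcurrentHashMap.newKeySet(); // identity, not content
        CountDownLatch done = new CountDownLatch(100);
        for (int i = 0; i < 100; i++) {
            pool.submit(() -> {
                bufferIds.add(System.identityHashCode(perThread.get()));
                done.countDown();
            });
        }
        done.await();
        pool.shutdown();
        System.out.println("direct buffers bounded by pool size: " + (bufferIds.size() <= 4));
    }
}
```

An unbounded number of application threads doing IO directly, by contrast, grows that buffer population without limit, which is the consumption problem the issue describes.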
[jira] [Commented] (HBASE-4439) Move ClientScanner out of HTable
[ https://issues.apache.org/jira/browse/HBASE-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174906#comment-13174906 ]

Lars Hofhansl commented on HBASE-4439:
---

I think the preferred/default route would still be to go through HTable.getScanner(...), so for standard clients nothing would change. More advanced clients now have an example of how to do a stand-alone client-side scanner. In the end this is mostly a refactoring to make the code cleaner.

Move ClientScanner out of HTable

Key: HBASE-4439
URL: https://issues.apache.org/jira/browse/HBASE-4439
Project: HBase
Issue Type: Improvement
Affects Versions: 0.94.0
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Minor
Fix For: 0.94.0
Attachments: 4439-v1.txt, 4439-v2.txt, 4439.txt

See HBASE-1935 for motivation. ClientScanner should be able to exist outside of HTable. While we're at it, we can also add an abstract client scanner to ease development of new client-side scanners (such as parallel scanners, or per-region scanners).
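The shape of the proposed refactor, sketched with illustrative names (not HBase's actual classes): an abstract client-side scanner that concrete scanners (stand-alone, parallel, per-region) can extend outside of HTable.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ScannerSketch {
    // The reusable base: subclasses only supply next(); iteration comes free.
    static abstract class AbstractClientScanner implements Iterable<String> {
        public abstract String next(); // null when exhausted
        public Iterator<String> iterator() {
            return new Iterator<String>() {
                String cur = AbstractClientScanner.this.next();
                public boolean hasNext() { return cur != null; }
                public String next() {
                    String r = cur;
                    cur = AbstractClientScanner.this.next();
                    return r;
                }
            };
        }
    }

    // A trivially concrete scanner over in-memory "rows"; a parallel or
    // per-region scanner would differ only in where next() gets its data.
    static class ListScanner extends AbstractClientScanner {
        private final Iterator<String> it;
        ListScanner(List<String> rows) { this.it = rows.iterator(); }
        public String next() { return it.hasNext() ? it.next() : null; }
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (String row : new ListScanner(Arrays.asList("r1", "r2", "r3"))) sb.append(row);
        System.out.println(sb);
    }
}
```

Standard clients would keep calling HTable.getScanner(...), which would simply return one such subclass.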
[jira] [Commented] (HBASE-5081) Distributed log splitting deleteNode races against splitLog retry
[ https://issues.apache.org/jira/browse/HBASE-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174919#comment-13174919 ]

jirapos...@reviews.apache.org commented on HBASE-5081:
---

bq. On 2011-12-22 15:34:13, Ted Yu wrote:
bq. src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java, line 373
bq. https://reviews.apache.org/r/3292/diff/8/?file=65682#file65682line373
bq.
bq. DeleteAsyncCallback is only used by deleteNode(). I think we should simplify the logic by removing deleteNode() and DeleteAsyncCallback - deleteNodeNow() uses RecoverableZooKeeper which has the retry logic.

The difference is that deleteNode has unlimited retries. RecoverableZooKeeper doesn't; it has only 3 retries by default.

- Jimmy

This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3292/#review4077
Distributed log splitting deleteNode races againsth splitLog retry --- Key: HBASE-5081 URL: https://issues.apache.org/jira/browse/HBASE-5081 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.92.0, 0.94.0 Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 0.92.0 Attachments: distributed-log-splitting-screenshot.png, hbase-5081-patch-v6.txt, hbase-5081-patch-v7.txt, hbase-5081_patch_for_92_v4.txt, hbase-5081_patch_v5.txt, patch_for_92.txt, patch_for_92_v2.txt, patch_for_92_v3.txt Recently, during 0.92 rc testing, we found distributed log splitting hangs there forever. Please see attached screen shot. I looked into it and here is what happened I think: 1. One rs died, the servershutdownhandler found it out and started the distributed log splitting; 2. All three tasks failed, so the three tasks were deleted, asynchronously; 3. Servershutdownhandler retried the log splitting; 4. During the retrial, it created these three tasks again, and put them in a hashmap (tasks); 5. The asynchronously deletion in step 2 finally happened for one task, in the callback, it removed one task in the hashmap; 6. One of the newly submitted tasks' zookeeper watcher found out that task is unassigned, and it is not in the hashmap, so it created a new orphan task. 7. All three tasks failed, but that task created in step 6 is an orphan so the batch.err counter was one short, so the log splitting hangs there and keeps waiting for the last task to finish which is never going to happen. So I think the problem is step 2. The fix is to make deletion sync, instead of async, so that the retry will have a clean start. Async deleteNode will mess up with split log retrial. In extreme situation, if async deleteNode doesn't happen soon enough, some node created during the retrial could be deleted. deleteNode should be sync. -- This message is automatically generated by JIRA. 
[jira] [Commented] (HBASE-5081) Distributed log splitting deleteNode races against splitLog retry
[ https://issues.apache.org/jira/browse/HBASE-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13174926#comment-13174926 ] jirapos...@reviews.apache.org commented on HBASE-5081: -- bq. On 2011-12-22 15:08:38, Michael Stack wrote: bq. This patch seems to be raising too many questions. Should we try going the other route of ensuring the async delete removes the 'right' task? Due to the race issue, we would have to put more than the filename in the node and the hashmap so as to remove the right task. That's a much bigger change and will raise more questions. bq. On 2011-12-22 15:08:38, Michael Stack wrote: bq. src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java, line 349 bq. https://reviews.apache.org/r/3292/diff/8/?file=65682#file65682line349 bq. bq. Why check for success up here rather than down inside the synchronize on task.batch? Why not do this safeToDeleteNodeAsync in there in the else clause where we up the count of errors? Is it not safe to do the delete of the zk node NOW under the synchronize block? It is safe to do the delete under the synchronize block. The reason is that I don't want to hold the lock on task.batch while deleting the node synchronously. bq. On 2011-12-22 15:08:38, Michael Stack wrote: bq. src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java, line 378 bq. https://reviews.apache.org/r/3292/diff/8/?file=65682#file65682line378 bq. bq. Are we duplicating the code from deleteNode here? Should we have sync/async versions? deleteNode is the async version. deleteNodeNow is the sync version. The async version can have unlimited retries. The sync version can retry only up to a certain configured number of times (3 by default), so the sync version doesn't guarantee the node will be deleted. Code-wise, it's hard to reuse. - Jimmy --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3292/#review4073 --- On 2011-12-22 00:31:23, Jimmy Xiang wrote: bq. bq. --- bq. 
This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/3292/ bq. --- bq. bq. (Updated 2011-12-22 00:31:23) bq. bq. bq. Review request for hbase, Ted Yu, Michael Stack, and Lars Hofhansl. bq. bq. bq. Summary bq. --- bq. bq. In this patch, after a task is done, we don't delete the node if the task failed, so that when it's retried later on there won't be a race problem. bq. bq. It used to delete the node always. bq. bq. bq. This addresses bug HBASE-5081. bq. https://issues.apache.org/jira/browse/HBASE-5081 bq. bq. bq. Diffs bq. - bq. bq.src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java 667a8b1 bq.src/test/java/org/apache/hadoop/hbase/master/TestSplitLogManager.java 32ad7e8 bq. bq. Diff: https://reviews.apache.org/r/3292/diff bq. bq. bq. Testing bq. --- bq. bq. mvn -Dtest=TestDistributedLogSplitting clean test bq. bq. bq. Thanks, bq. bq. Jimmy bq. bq. Distributed log splitting deleteNode races against splitLog retry --- Key: HBASE-5081 URL: https://issues.apache.org/jira/browse/HBASE-5081 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.92.0, 0.94.0 Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 0.92.0 Attachments: distributed-log-splitting-screenshot.png, hbase-5081-patch-v6.txt, hbase-5081-patch-v7.txt, hbase-5081_patch_for_92_v4.txt, hbase-5081_patch_v5.txt, patch_for_92.txt, patch_for_92_v2.txt, patch_for_92_v3.txt
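The sync-vs-async deleteNode trade-off discussed in this thread (unlimited retries for the async path vs. a bounded number of retries, 3 by default, for the RecoverableZooKeeper-style sync path) can be sketched with a generic bounded-retry helper. The class and method names below are illustrative only, not HBase's actual API:

```java
import java.util.concurrent.Callable;

// Illustrative sketch (not HBase's actual code): a bounded-retry wrapper in
// the spirit of RecoverableZooKeeper, which retries an operation a limited
// number of times rather than forever. If retries are exhausted, the failure
// propagates -- so a sync delete built on this does NOT guarantee deletion.
public class BoundedRetry {
    public static <T> T withRetries(Callable<T> op, int maxRetries) throws Exception {
        Exception last = null;
        // One initial attempt plus up to maxRetries retries.
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                last = e; // remember the failure and retry
            }
        }
        throw last; // bounded retries exhausted
    }

    public static void main(String[] args) throws Exception {
        final int[] transientFailures = {2}; // fail twice, then succeed
        String result = withRetries(() -> {
            if (transientFailures[0]-- > 0) {
                throw new RuntimeException("transient ZK error");
            }
            return "node deleted";
        }, 3);
        System.out.println(result); // node deleted
    }
}
```

This mirrors why Jimmy's deleteNodeNow (sync, bounded retries) behaves differently from deleteNode (async, unlimited retries): the bounded version can give up.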
[jira] [Commented] (HBASE-5085) fix test-patch script from setting the ulimit
[ https://issues.apache.org/jira/browse/HBASE-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13174932#comment-13174932 ] Giridharan Kesavan commented on HBASE-5085: --- This patch cannot be submitted to test-patch, as we are patching the test-patch script itself. The only option is to review and commit, or to manually run test-patch after applying the patch. fix test-patch script from setting the ulimit - Key: HBASE-5085 URL: https://issues.apache.org/jira/browse/HBASE-5085 Project: HBase Issue Type: Bug Reporter: Giridharan Kesavan Assignee: Giridharan Kesavan Attachments: 5085-v2-experiment.txt, 5085-v2-experiment.txt, 5085-v3-experiment.txt, 5085-v3-experiment.txt, hbase-5085.patch The test-patch.sh script sets ulimit -n 1024 just after triggering the patch. This setting overrides the underlying system's ulimit and hence fails the HBase tests. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4616) Update hregion encoded name to reduce logic and prevent region collisions in META
[ https://issues.apache.org/jira/browse/HBASE-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13174935#comment-13174935 ] Alex Newman commented on HBASE-4616: On the moving of directories: we could easily remove the patch which changes the format of the id if we feel that moving directories is a bad idea. Update hregion encoded name to reduce logic and prevent region collisions in META - Key: HBASE-4616 URL: https://issues.apache.org/jira/browse/HBASE-4616 Project: HBase Issue Type: Umbrella Reporter: Alex Newman Assignee: Alex Newman Attachments: HBASE-4616-v2.patch, HBASE-4616-v3.patch, HBASE-4616.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4616) Update hregion encoded name to reduce logic and prevent region collisions in META
[ https://issues.apache.org/jira/browse/HBASE-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13174934#comment-13174934 ] Alex Newman commented on HBASE-4616: So, on the first part, about using printable chars for the last region: consider that ! is a valid table char as far as I know, and it comes before the other printable chars in the ASCII table, so that's out. I am not a big fan of the table-to-int mapping, as I think it adds another level of complexity and its own synchronization issues. Also, having to do a direct (HDFS) filesystem operation for adding or removing regions seems like a bad idea; after all, opening files in HDFS takes something like 200ms. It'd probably be better to have another table, like meta/root, for something like this. If you're worried about using 29 chars, consider that it's most likely small compared to the size of the HRI/row key. If we really want to cut down on space, we can just use the 128 bits of the MD5 directly; it will make it even more difficult to read, but it will be compact.
Pros to pulling the tablename out: constant overhead for large tablenames; more flexibility on what can be a tablename; more deterministic overhead.
Cons: more difficult to debug; not what we do now.
Pros to MD5 vs lookup table: one less piece of table state; easier to debug; tables from one cluster won't clobber tables from another.
Cons to the MD5 approach: more key-space overhead than a lookup table (although not with the binary approach).
Update hregion encoded name to reduce logic and prevent region collisions in META - Key: HBASE-4616 URL: https://issues.apache.org/jira/browse/HBASE-4616 Project: HBase Issue Type: Umbrella Reporter: Alex Newman Assignee: Alex Newman Attachments: HBASE-4616-v2.patch, HBASE-4616-v3.patch, HBASE-4616.patch
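The MD5-based naming Alex discusses can be sketched as follows: hashing the full region name to a fixed-width hex digest gives constant, deterministic key-space overhead regardless of table-name length. The class and method names here are hypothetical illustrations, not HBase's actual implementation:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative sketch: derive a fixed-width encoded region name by MD5-hashing
// the full region name. A 32-char hex digest (128 bits) costs the same no
// matter how long the table name is, at the price of being harder to read.
public class EncodedName {
    public static String encodeRegionName(byte[] regionName) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(regionName);
            StringBuilder sb = new StringBuilder(32);
            for (byte b : digest) {
                sb.append(String.format("%02x", b)); // two hex chars per byte
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new AssertionError("MD5 is always available in the JDK", e);
        }
    }

    public static void main(String[] args) {
        String enc = encodeRegionName("mytable,startkey,1324567890".getBytes());
        System.out.println(enc + " (" + enc.length() + " chars)");
    }
}
```

Emitting the raw 16 digest bytes instead of hex would halve the overhead, matching the "binary approach" mentioned above.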
[jira] [Commented] (HBASE-5085) fix test-patch script from setting the ulimit
[ https://issues.apache.org/jira/browse/HBASE-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13174941#comment-13174941 ] Zhihong Yu commented on HBASE-5085: --- I suggest adding a call to whoami or id in test-patch.sh so that we can know the effective user running the tests.
[jira] [Commented] (HBASE-5085) fix test-patch script from setting the ulimit
[ https://issues.apache.org/jira/browse/HBASE-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13174960#comment-13174960 ] Giridharan Kesavan commented on HBASE-5085: --- The effective user is jenkins, as the jenkins slave is launched to run as the jenkins user. This should not matter much, because the ulimit is set system-wide for all users except root. Also, I'm very sure that the system-wide ulimit would come into effect when we remove the override set in test-patch.sh (which is my initial patch).
[jira] [Commented] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)
[ https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13174961#comment-13174961 ] Phabricator commented on HBASE-4218: mbautin has commented on the revision [jira] [HBASE-4218] HFile data block encoding (delta encoding). Replying to Matt's comments. A new version of the diff will follow. @mcorgan: thanks for reviewing! INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java:137 Done. src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java:157 Done. src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java:161 Done. src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java:171 Done. src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java:175 Done. src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java:850 Done. src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java:162 Done. REVISION DETAIL https://reviews.facebook.net/D447 Delta Encoding of KeyValues (aka prefix compression) - Key: HBASE-4218 URL: https://issues.apache.org/jira/browse/HBASE-4218 Project: HBase Issue Type: Improvement Components: io Affects Versions: 0.94.0 Reporter: Jacek Migdal Assignee: Mikhail Bautin Labels: compression Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 0001-Delta-encoding.patch, D447.1.patch, D447.10.patch, D447.11.patch, D447.2.patch, D447.3.patch, D447.4.patch, D447.5.patch, D447.6.patch, D447.7.patch, D447.8.patch, D447.9.patch, Delta_encoding_with_memstore_TS.patch, open-source.diff A compression for keys. Keys are sorted in HFile and they are usually very similar. Because of that, it is possible to design better compression than general-purpose algorithms. It is an additional step designed to be used in memory. It aims to save memory in cache as well as speeding up seeks within HFileBlocks. It should improve performance a lot if key lengths are larger than value lengths. For example, it makes a lot of sense to use it when the value is a counter. Initial tests on real data (key length ~ 90 bytes, value length = 8 bytes) show that I could achieve a decent level of compression: key compression ratio: 92% total compression ratio: 85% LZO on the same data: 85% LZO after delta encoding: 91% while having much better performance (20-80% faster decompression than LZO). Moreover, it should allow far more efficient seeking, which should improve performance a bit. It seems that simple compression algorithms are good enough. Most of the savings are due to prefix compression, int128 encoding, timestamp diffs and bitfields to avoid duplication. That way, comparisons of compressed data can be much faster than a byte comparator (thanks to prefix compression and bitfields). In order to implement it in HBase, two important changes in design will be needed: -solidify the interface to HFileBlock / HFileReader Scanner to provide seeking and iterating; access to the uncompressed buffer in HFileBlock will have bad performance -extend comparators to support comparison assuming that the first N bytes are equal (or some fields are equal) Link to a discussion about something similar: http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windowssubj=Re+prefix+compression -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
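The prefix-compression idea behind the issue description can be illustrated with a toy encoder: since keys in an HFile are sorted, each key shares a long prefix with its predecessor, so it suffices to store (shared-prefix length, suffix). This is a minimal sketch of the concept only, not the actual HBASE-4218 DataBlockEncoder implementations:

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration of prefix (delta) encoding over sorted keys:
// each entry stores how many leading chars to reuse from the previous key,
// plus only the bytes that differ.
public class PrefixEncode {
    record Entry(int shared, String suffix) {}

    static int commonPrefix(String a, String b) {
        int n = Math.min(a.length(), b.length()), i = 0;
        while (i < n && a.charAt(i) == b.charAt(i)) i++;
        return i;
    }

    static List<Entry> encode(List<String> sortedKeys) {
        List<Entry> out = new ArrayList<>();
        String prev = "";
        for (String k : sortedKeys) {
            int p = commonPrefix(prev, k);
            out.add(new Entry(p, k.substring(p)));
            prev = k;
        }
        return out;
    }

    static List<String> decode(List<Entry> entries) {
        List<String> out = new ArrayList<>();
        String prev = "";
        for (Entry e : entries) {
            String k = prev.substring(0, e.shared) + e.suffix;
            out.add(k);
            prev = k;
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("row0001/cf:a", "row0001/cf:b", "row0002/cf:a");
        List<Entry> enc = encode(keys);
        System.out.println(enc);
        System.out.println(decode(enc).equals(keys)); // true
    }
}
```

Note how the second key stores only its one differing character; this is also why comparisons can short-circuit on the shared-prefix length, as the description points out.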
[jira] [Updated] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)
[ https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-4218: --- Attachment: D447.12.patch mbautin updated the revision [jira] [HBASE-4218] HFile data block encoding (delta encoding). Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan Addressing Matt's comments. Also, renaming DataBlockEncodingAlgorithms to DataBlockEncodings for brevity, and adding a private constructor to that class. All unit tests pass, continuing cluster testing. REVISION DETAIL https://reviews.facebook.net/D447 AFFECTED FILES src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java src/main/java/org/apache/hadoop/hbase/KeyValue.java src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java src/main/java/org/apache/hadoop/hbase/io/encoding/BitsetKeyDeltaEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncodings.java src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java 
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java src/main/java/org/apache/hadoop/hbase/regionserver/Store.java src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java src/main/ruby/hbase/admin.rb src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java 
src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
[jira] [Updated] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)
[ https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-4218: -- Comment: was deleted (was: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12508415/D447.12.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 68 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/580//console This message is automatically generated.) Delta Encoding of KeyValues (aka prefix compression) - Key: HBASE-4218 URL: https://issues.apache.org/jira/browse/HBASE-4218 Project: HBase Issue Type: Improvement Components: io Affects Versions: 0.94.0 Reporter: Jacek Migdal Assignee: Mikhail Bautin Labels: compression Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 0001-Delta-encoding.patch, D447.1.patch, D447.10.patch, D447.11.patch, D447.12.patch, D447.2.patch, D447.3.patch, D447.4.patch, D447.5.patch, D447.6.patch, D447.7.patch, D447.8.patch, D447.9.patch, Delta_encoding_with_memstore_TS.patch, open-source.diff
[jira] [Updated] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)
[ https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-4218: -- Status: Open (was: Patch Available) Delta Encoding of KeyValues (aka prefix compression) - Key: HBASE-4218 URL: https://issues.apache.org/jira/browse/HBASE-4218 Project: HBase Issue Type: Improvement Components: io Affects Versions: 0.94.0 Reporter: Jacek Migdal Assignee: Mikhail Bautin Labels: compression Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 0001-Delta-encoding.patch, D447.1.patch, D447.10.patch, D447.11.patch, D447.12.patch, D447.2.patch, D447.3.patch, D447.4.patch, D447.5.patch, D447.6.patch, D447.7.patch, D447.8.patch, D447.9.patch, Delta_encoding_with_memstore_TS.patch, open-source.diff
[jira] [Commented] (HBASE-5033) Opening/Closing store in parallel to reduce region open/close time
[ https://issues.apache.org/jira/browse/HBASE-5033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13174984#comment-13174984 ] Phabricator commented on HBASE-5033: Kannan has accepted the revision [jira][HBASE-5033][[89-fb]]Opening/Closing store in parallel to reduce region open/close time. Nice work! One pending comment from the previous iteration, inlined, regarding the default. INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/HConstants.java:202 previous comment: And I would set the default much lower... to say 8 or something. Not sure how many disks is typical. REVISION DETAIL https://reviews.facebook.net/D933 Opening/Closing store in parallel to reduce region open/close time -- Key: HBASE-5033 URL: https://issues.apache.org/jira/browse/HBASE-5033 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang Attachments: D933.1.patch, D933.2.patch, D933.3.patch, D933.4.patch, D933.5.patch Region servers open/close each store, and each store file for every store, in a sequential fashion, which can make opening/closing regions inefficient. This diff opens/closes each store in parallel in order to reduce region open/close time; it would also help reduce cluster restart time. 1) Opening each store in parallel 2) Loading each store file for every store in parallel 3) Closing each store in parallel 4) Closing each store file for every store in parallel. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5089) NIO based REST server
[ https://issues.apache.org/jira/browse/HBASE-5089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13174983#comment-13174983 ] Andrew Purtell commented on HBASE-5089: --- Cloudera ships a Jetty 6 version with NIO, a drop-in replacement, open source: https://github.com/toddlipcon/jetty-hadoop-fix I don't have an opinion about which servlet server to use. Once upon a time we had a WAR target. We dropped it due to build difficulties when moving to Maven initially. We could bring it back and leave it up to the user what servlet container to use. NIO based REST server - Key: HBASE-5089 URL: https://issues.apache.org/jira/browse/HBASE-5089 Project: HBase Issue Type: Improvement Components: rest Affects Versions: 0.94.0 Reporter: Mubarak Seyed Labels: noob Fix For: 0.94.0
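For comparison with the embedded-container options under discussion, the JDK itself ships a small selector-backed HTTP server (com.sun.net.httpserver) that shows the general shape of a servlet-free embedded REST endpoint. This is illustrative only; it is not one of the candidates named in the issue (Jetty 7+, Netty, Grizzly), and the class names here are hypothetical:

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;

// Illustrative sketch: a tiny embedded HTTP endpoint using the JDK's built-in
// com.sun.net.httpserver. Port 0 asks the OS for any free ephemeral port.
public class TinyRestServer {
    public static HttpServer start() throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/status", exchange -> {
            byte[] body = "{\"status\":\"ok\"}".getBytes();
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws Exception {
        HttpServer s = start();
        System.out.println("listening on port " + s.getAddress().getPort());
        s.stop(0); // stop immediately; a real server would keep running
    }
}
```

A production REST gateway would of course need the full routing, serialization, and security stack that the Jetty/Netty/Grizzly options provide.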
[jira] [Commented] (HBASE-5033) Opening/Closing store in parallel to reduce region open/close time
[ https://issues.apache.org/jira/browse/HBASE-5033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13174998#comment-13174998 ] Phabricator commented on HBASE-5033: Liyin has commented on the revision [jira][HBASE-5033][[89-fb]]Opening/Closing store in parallel to reduce region open/close time. Thanks Kannan for reviewing it. 1) I will change the default number to a lower number, 8. 2) I will create another jira about avoiding evictOnClose if the region server is going to stop. REVISION DETAIL https://reviews.facebook.net/D933
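The parallel store-open idea above, with the bounded default pool size of 8 suggested in review, can be sketched as follows. All names here (openStore, openAll) are hypothetical illustrations, not HBase's actual code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative sketch of the HBASE-5033 idea: open each store on a bounded
// thread pool instead of sequentially, so region open time approaches the
// slowest single store rather than the sum of all stores.
public class ParallelStoreOpen {
    static String openStore(String storeName) {
        // Stand-in for the real work: opening store files, reading metadata, etc.
        return storeName + ":open";
    }

    static List<String> openAll(List<String> stores, int poolSize) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String s : stores) {
                futures.add(pool.submit(() -> openStore(s)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get()); // propagates any store-open failure
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // The review above suggests a modest default pool size such as 8.
        System.out.println(openAll(List.of("cf1", "cf2", "cf3"), 8));
    }
}
```

Bounding the pool matters because each store open fans out into disk and HDFS operations; an unbounded pool would just move the contention elsewhere.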
[jira] [Commented] (HBASE-5085) fix test-patch script from setting the ulimit
[ https://issues.apache.org/jira/browse/HBASE-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175005#comment-13175005 ] stack commented on HBASE-5085: -- I applied the patch last night, but it seems to have had no effect when this issue was submitted to hadoop-qa again (I'll retry just to be sure to be sure).
[jira] [Updated] (HBASE-5085) fix test-patch script from setting the ulimit
[ https://issues.apache.org/jira/browse/HBASE-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-5085:
-------------------------

    Attachment: 5085-v4-experiment.txt

v4 is the same as v3.
[jira] [Updated] (HBASE-5085) fix test-patch script from setting the ulimit
[ https://issues.apache.org/jira/browse/HBASE-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-5085:
-------------------------

    Status: Patch Available  (was: Open)
[jira] [Created] (HBASE-5091) [replication] Update replication doc to reflect current directory structure
[replication] Update replication doc to reflect current directory structure
---------------------------------------------------------------------------

                Key: HBASE-5091
                URL: https://issues.apache.org/jira/browse/HBASE-5091
            Project: HBase
         Issue Type: Bug
         Components: documentation, replication
           Reporter: Chris Trezzo
           Assignee: Chris Trezzo
           Priority: Trivial

A small nit: the zookeeper node structure in the region server failover section of the replication document is slightly different from the actual structure. The doc shows this:

/hbase/replication/rs/
  1.1.1.1,60020,123456780/
    peers/
      2/
        1.1.1.1,60020.1234 (Contains a position)
        1.1.1.1,60020.1265

when in actuality it should be this:

/hbase/replication/rs/
  1.1.1.1,60020,123456780/
    2/
      1.1.1.1,60020.1234 (Contains a position)
      1.1.1.1,60020.1265

Not a big deal, but it gets confusing when you are going through the code and using the doc as a reference.
[jira] [Updated] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)
[ https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HBASE-4218:
-------------------------------

    Attachment: D447.13.patch

mbautin updated the revision "[jira] [HBASE-4218] HFile data block encoding (delta encoding)".
Reviewers: JIRA, tedyu, stack, nspiegelberg, Kannan

Fixing a compile error that Ted saw, and TestHeapSize on a 32-bit JVM (failure seen on Jenkins).

REVISION DETAIL
  https://reviews.facebook.net/D447

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
  src/main/java/org/apache/hadoop/hbase/KeyValue.java
  src/main/java/org/apache/hadoop/hbase/io/HalfStoreFileReader.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/BitsetKeyDeltaEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/CompressionState.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncodings.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/EncoderBufferTooSmallException.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/BlockType.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFilePrettyPrinter.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
  src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
  src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
  src/main/java/org/apache/hadoop/hbase/util/ByteBufferUtils.java
  src/main/ruby/hbase/admin.rb
  src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
  src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
  src/test/java/org/apache/hadoop/hbase/io/TestHalfStoreFileReader.java
  src/test/java/org/apache/hadoop/hbase/io/TestHeapSize.java
  src/test/java/org/apache/hadoop/hbase/io/encoding/RedundantKVGenerator.java
  src/test/java/org/apache/hadoop/hbase/io/encoding/TestBufferedDataBlockEncoder.java
  src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
  src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
  src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
  src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
  src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java
  src/test/java/org/apache/hadoop/hbase/util/LoadTestTool.java
[jira] [Updated] (HBASE-5091) [replication] Update replication doc to reflect current directory structure
[ https://issues.apache.org/jira/browse/HBASE-5091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Trezzo updated HBASE-5091:
--------------------------------

    Description:

A small nit: the zookeeper node structure in the region server failover section of the replication document is slightly different from the actual structure. The doc shows this:

{noformat}
/hbase/replication/rs/
  1.1.1.1,60020,123456780/
    peers/
      2/
        1.1.1.1,60020.1234 (Contains a position)
        1.1.1.1,60020.1265
{noformat}

when in actuality it should be this:

{noformat}
/hbase/replication/rs/
  1.1.1.1,60020,123456780/
    2/
      1.1.1.1,60020.1234 (Contains a position)
      1.1.1.1,60020.1265
{noformat}

Not a big deal, but it gets confusing when you are going through the code and using the doc as a reference.

    (was: the same description without the {noformat} markup)
[jira] [Updated] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)
[ https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikhail Bautin updated HBASE-4218:
----------------------------------

    Attachment: Delta-encoding.patch-2011-12-22_11_52_07.patch

Attaching a new version of the patch that should apply with the patch command, compile, and pass TestHeapSize on Jenkins.

Delta Encoding of KeyValues (aka prefix compression)
----------------------------------------------------

                Key: HBASE-4218
                URL: https://issues.apache.org/jira/browse/HBASE-4218
            Project: HBase
         Issue Type: Improvement
         Components: io
   Affects Versions: 0.94.0
           Reporter: Jacek Migdal
           Assignee: Mikhail Bautin
             Labels: compression
        Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 0001-Delta-encoding.patch, D447.1.patch, D447.10.patch, D447.11.patch, D447.12.patch, D447.13.patch, D447.2.patch, D447.3.patch, D447.4.patch, D447.5.patch, D447.6.patch, D447.7.patch, D447.8.patch, D447.9.patch, Delta-encoding.patch-2011-12-22_11_52_07.patch, Delta_encoding_with_memstore_TS.patch, open-source.diff

A compression for keys. Keys are sorted in HFile and they are usually very similar, so it is possible to design better compression than general-purpose algorithms achieve. It is an additional step designed to be used in memory: it aims to save memory in the cache as well as speed up seeks within HFileBlocks. It should improve performance a lot if key lengths are larger than value lengths; for example, it makes a lot of sense when the value is a counter.

Initial tests on real data (key length ~90 bytes, value length 8 bytes) show a decent level of compression:
  key compression ratio: 92%
  total compression ratio: 85%
  LZO on the same data: 85%
  LZO after delta encoding: 91%
while decompressing 20-80% faster than LZO. Moreover, it should allow far more efficient seeking, which should improve performance a bit.

It seems that simple compression algorithms are good enough. Most of the savings are due to prefix compression, int128 encoding, timestamp diffs, and bitfields to avoid duplication. That way, comparisons of compressed data can be much faster than a byte comparator (thanks to prefix compression and bitfields).

In order to implement it in HBase, two important changes in design will be needed:
- solidify the interface to HFileBlock / HFileReader Scanner to provide seeking and iterating; access to the uncompressed buffer in HFileBlock will have bad performance
- extend comparators to support comparison assuming that the first N bytes are equal (or some fields are equal)

Link to a discussion about something similar: http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windows&subj=Re+prefix+compression
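The prefix-compression idea described above can be sketched in a few lines: since keys are sorted, each key is stored as the length of the prefix it shares with the previous key plus its remaining suffix. This is an illustrative sketch of the concept only, not the wire format of HBASE-4218's PrefixKeyDeltaEncoder; the `"n|suffix"` string encoding is a hypothetical simplification.

```java
import java.util.ArrayList;
import java.util.List;

public class PrefixEncoder {
    // Encode each key as "<sharedPrefixLen>|<suffix>" relative to the
    // previous key. Sorted, similar keys share long prefixes, so the
    // suffixes are short.
    static List<String> encode(List<String> sortedKeys) {
        List<String> out = new ArrayList<>();
        String prev = "";
        for (String k : sortedKeys) {
            int n = 0;
            int max = Math.min(prev.length(), k.length());
            while (n < max && prev.charAt(n) == k.charAt(n)) n++;
            out.add(n + "|" + k.substring(n));
            prev = k;
        }
        return out;
    }

    // Rebuild full keys by reusing the first n chars of the previous key.
    static List<String> decode(List<String> encoded) {
        List<String> out = new ArrayList<>();
        String prev = "";
        for (String e : encoded) {
            int sep = e.indexOf('|');
            int n = Integer.parseInt(e.substring(0, sep));
            String k = prev.substring(0, n) + e.substring(sep + 1);
            out.add(k);
            prev = k;
        }
        return out;
    }

    public static void main(String[] args) {
        // 90-byte keys differing in the tail compress very well this way.
        System.out.println(encode(List.of("row001", "row002", "row013")));
    }
}
```

The comparator point in the description follows from the same structure: if two encoded keys record a shared prefix length against a common predecessor, comparison can skip those bytes entirely.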
[jira] [Updated] (HBASE-5091) [replication] Update replication doc to reflect current znode structure
[ https://issues.apache.org/jira/browse/HBASE-5091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Trezzo updated HBASE-5091:
--------------------------------

    Summary: [replication] Update replication doc to reflect current znode structure  (was: [replication] Update replication doc to reflect current directory structure)
[jira] [Commented] (HBASE-5085) fix test-patch script from setting the ulimit
[ https://issues.apache.org/jira/browse/HBASE-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175011#comment-13175011 ]

Hadoop QA commented on HBASE-5085:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12508423/5085-v4-experiment.txt
against trunk revision.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 3 new or modified tests.
    -1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/581//console

This message is automatically generated.
[jira] [Updated] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)
[ https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikhail Bautin updated HBASE-4218:
----------------------------------

    Status: Patch Available  (was: Open)
[jira] [Commented] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)
[ https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175013#comment-13175013 ]

Zhihong Yu commented on HBASE-4218:
-----------------------------------

Please remove the last hunk from HFilePerformanceEvaluation.java, which led to:
{code}
1 out of 2 hunks FAILED -- saving rejects to file src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java.rej
{code}
[jira] [Commented] (HBASE-5070) Constraints implementation and javadoc changes
[ https://issues.apache.org/jira/browse/HBASE-5070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175016#comment-13175016 ]

jirapos...@reviews.apache.org commented on HBASE-5070:
------------------------------------------------------

This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3273/
(Updated 2011-12-22 19:59:47.762581)

Review request for hbase, Gary Helmling, Ted Yu, and Michael Stack.

Changes
-------

Fixing documentation as per comments. Remaining concerns:
1) handling configuration storage in HTD (seems ok, but see comments on stack's comments on the previous patch)
2) Configured has a default constructor which does setConf(null). Do we want to enforce this behavior? Or just let it be null? Or create our own BaseConfigured class that extends from Configurable?

Summary
-------

Follow-up on changes to constraint as per stack's comments on HBASE-4605.

This addresses bug HBASE-5070.
    https://issues.apache.org/jira/browse/HBASE-5070

Diffs (updated)
---------------

  src/main/java/org/apache/hadoop/hbase/constraint/BaseConstraint.java 7ce6d45
  src/main/java/org/apache/hadoop/hbase/constraint/Constraint.java 2d8b4d7
  src/main/java/org/apache/hadoop/hbase/constraint/Constraints.java 7825466
  src/docbkx/book.xml bd3f881
  src/main/java/org/apache/hadoop/hbase/constraint/package-info.java 6145ed5
  src/test/java/org/apache/hadoop/hbase/constraint/CheckConfigurationConstraint.java c49098d

Diff: https://reviews.apache.org/r/3273/diff

Testing
-------

mvn clean test -P localTests -Dtest=*Constraint* - all tests pass.

Thanks,
Jesse

Constraints implementation and javadoc changes
----------------------------------------------

                Key: HBASE-5070
                URL: https://issues.apache.org/jira/browse/HBASE-5070
            Project: HBase
         Issue Type: Task
           Reporter: Zhihong Yu

This is a continuation of HBASE-4605. See Stack's comments: https://reviews.apache.org/r/2579/#review3980
[jira] [Commented] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)
[ https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175021#comment-13175021 ]

Hadoop QA commented on HBASE-4218:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12508425/Delta-encoding.patch-2011-12-22_11_52_07.patch
against trunk revision.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 92 new or modified tests.
    -1 javadoc. The javadoc tool appears to have generated -142 warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    -1 findbugs. The patch appears to introduce 80 new Findbugs (version 1.3.9) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.replication.TestReplication

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/582//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/582//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/582//console

This message is automatically generated.
[jira] [Commented] (HBASE-5070) Constraints implementation and javadoc changes
[ https://issues.apache.org/jira/browse/HBASE-5070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175024#comment-13175024 ]

Zhihong Yu commented on HBASE-5070:
-----------------------------------

For concern #1, how about creating a helper method to remove entries from the Configuration which have the same values as those in hbase-default.xml? This should be done in another JIRA.
[jira] [Updated] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)
[ https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-4218: -- Comment: was deleted (was: Please remove the last hunk from HFilePerformanceEvaluation.java which led to: {code} 1 out of 2 hunks FAILED -- saving rejects to file src/test/java/org/apache/hadoop/hbase/HFilePerformanceEvaluation.java.rej {code}) Delta Encoding of KeyValues (aka prefix compression) - Key: HBASE-4218 URL: https://issues.apache.org/jira/browse/HBASE-4218 Project: HBase Issue Type: Improvement Components: io Affects Versions: 0.94.0 Reporter: Jacek Migdal Assignee: Mikhail Bautin Labels: compression Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 0001-Delta-encoding.patch, D447.1.patch, D447.10.patch, D447.11.patch, D447.12.patch, D447.13.patch, D447.2.patch, D447.3.patch, D447.4.patch, D447.5.patch, D447.6.patch, D447.7.patch, D447.8.patch, D447.9.patch, Delta-encoding.patch-2011-12-22_11_52_07.patch, Delta_encoding_with_memstore_TS.patch, open-source.diff A compression for keys. Keys are sorted in HFile and they are usually very similar. Because of that, it is possible to design better compression than general purpose algorithms, It is an additional step designed to be used in memory. It aims to save memory in cache as well as speeding seeks within HFileBlocks. It should improve performance a lot, if key lengths are larger than value lengths. For example, it makes a lot of sense to use it when value is a counter. Initial tests on real data (key length = ~ 90 bytes , value length = 8 bytes) shows that I could achieve decent level of compression: key compression ratio: 92% total compression ratio: 85% LZO on the same data: 85% LZO after delta encoding: 91% While having much better performance (20-80% faster decompression ratio than LZO). Moreover, it should allow far more efficient seeking which should improve performance a bit. 
It seems that simple compression algorithms are good enough. Most of the savings are due to prefix compression, int128 encoding, timestamp diffs and bitfields to avoid duplication. That way, comparisons of compressed data can be much faster than a byte comparator (thanks to prefix compression and bitfields). In order to implement it in HBase, two important changes in design will be needed:
- solidify the interface to HFileBlock / HFileReader Scanner to provide seeking and iterating; access to the uncompressed buffer in HFileBlock will have bad performance
- extend comparators to support comparison assuming that the first N bytes are equal (or some fields are equal)
Link to a discussion about something similar: http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windowssubj=Re+prefix+compression
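As an illustration of the prefix-compression idea described above, here is a minimal, self-contained sketch (not the actual HBASE-4218 encoding): since keys are sorted and usually share long prefixes, each key can be stored as the length of the prefix it shares with the previous key plus only the differing suffix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Minimal sketch of prefix compression for sorted byte[] keys.
// Illustration only; the real HBase delta encodings are richer
// (timestamp diffs, bitfields, etc.).
public class PrefixEncodeSketch {

    // One encoded entry: how many leading bytes to copy from the
    // previous key, plus the bytes that differ.
    record Entry(int sharedLen, byte[] suffix) {}

    static List<Entry> encode(List<byte[]> sortedKeys) {
        List<Entry> out = new ArrayList<>();
        byte[] prev = new byte[0];
        for (byte[] key : sortedKeys) {
            int n = Math.min(prev.length, key.length);
            int shared = 0;
            while (shared < n && prev[shared] == key[shared]) shared++;
            out.add(new Entry(shared, Arrays.copyOfRange(key, shared, key.length)));
            prev = key;
        }
        return out;
    }

    static List<byte[]> decode(List<Entry> entries) {
        List<byte[]> out = new ArrayList<>();
        byte[] prev = new byte[0];
        for (Entry e : entries) {
            byte[] key = new byte[e.sharedLen() + e.suffix().length];
            // Rebuild the key: shared prefix from the previous key + stored suffix
            System.arraycopy(prev, 0, key, 0, e.sharedLen());
            System.arraycopy(e.suffix(), 0, key, e.sharedLen(), e.suffix().length);
            out.add(key);
            prev = key;
        }
        return out;
    }
}
```

Note that decoding is sequential from the start of a block, which is why the issue stresses solidifying the HFileBlock scanner interface: random access into the compressed buffer is no longer cheap.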
[jira] [Commented] (HBASE-5070) Constraints implementation and javadoc changes
[ https://issues.apache.org/jira/browse/HBASE-5070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175032#comment-13175032 ] Jesse Yates commented on HBASE-5070: bq. For concern #1, how about creating helper method to remove entries from Configuration which have same values as those in hbase-default.xml ? I think that is a little bit overkill - what if you happen to want just a couple of those values? Creating a starting configuration feels cleaner than a remove util. Why not just add a comment that you should use new Configuration(false) to avoid adding a ton of values? (I know, I'm kinda flip-flopping here, but trying to reason out the best soln.) Constraints implementation and javadoc changes -- Key: HBASE-5070 URL: https://issues.apache.org/jira/browse/HBASE-5070 Project: HBase Issue Type: Task Reporter: Zhihong Yu This is a continuation of HBASE-4605. See Stack's comments: https://reviews.apache.org/r/2579/#review3980
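The `new Configuration(false)` suggestion can be illustrated with plain `java.util.Properties` standing in for Hadoop's `Configuration` (whose boolean constructor argument controls whether the default resources are loaded). This is only an analogy, not HBase code; the key names below are made up.

```java
import java.util.Properties;

// Analogy for the Configuration-vs-Configuration(false) choice discussed above:
// a defaults-backed Properties drags every default key along, while an empty
// one carries only what was explicitly set.
public class EmptyConfigSketch {

    // Stand-in for the many key/values shipped in hbase-default.xml
    static Properties siteDefaults() {
        Properties d = new Properties();
        d.setProperty("hbase.rootdir", "/hbase");
        d.setProperty("hbase.cluster.distributed", "false");
        return d;
    }

    // Analogue of `new Configuration()`: every default rides along
    static Properties withDefaults() {
        Properties p = new Properties(siteDefaults());
        p.setProperty("constraint.arg", "42"); // hypothetical constraint setting
        return p;
    }

    // Analogue of `new Configuration(false)`: starts empty, so only the
    // values the Constraint actually needs are visible/serialized
    static Properties withoutDefaults() {
        Properties p = new Properties();
        p.setProperty("constraint.arg", "42");
        return p;
    }
}
```

The difference matters here because the constraint's configuration gets serialized into the table descriptor, so starting from an empty configuration keeps that payload down to the keys the constraint actually set.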
[jira] [Commented] (HBASE-5081) Distributed log splitting deleteNode races against splitLog retry
[ https://issues.apache.org/jira/browse/HBASE-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175038#comment-13175038 ] stack commented on HBASE-5081: -- @Jimmy Another tack would be ensuring splitLogDistributed has cleaned up after itself before it returns, including clean-up on early-out because of exception. It seems like we will rerun the split if we early-out ONLY when OrphanHLogAfterSplitException is thrown (Is this what happened in your scenario? You say three log splits failed? Was it because a new log file showed up, i.e. OrphanHLogAfterSplitException? Or for some other reason? If for some other reason, the split should have failed?). I'd think that if a new file shows up while we were splitting, it's fine to redo the split, but I'd think that splitLogDistributed would make sure it had cleaned up after itself before it returned... that it had completed the batch it had been asked to do. I was waiting on this issue to be done before cutting the RC, but after looking at the pieces, I think that while this is an important issue, it is rare, so I won't hold up the RC. Good stuff. Distributed log splitting deleteNode races against splitLog retry --- Key: HBASE-5081 URL: https://issues.apache.org/jira/browse/HBASE-5081 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.92.0, 0.94.0 Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 0.92.0 Attachments: distributed-log-splitting-screenshot.png, hbase-5081-patch-v6.txt, hbase-5081-patch-v7.txt, hbase-5081_patch_for_92_v4.txt, hbase-5081_patch_v5.txt, patch_for_92.txt, patch_for_92_v2.txt, patch_for_92_v3.txt Recently, during 0.92 rc testing, we found distributed log splitting hangs there forever. Please see the attached screen shot. I looked into it and here is what happened, I think:
1. One rs died; the servershutdownhandler found it out and started the distributed log splitting;
2.
All three tasks failed, so the three tasks were deleted, asynchronously;
3. Servershutdownhandler retried the log splitting;
4. During the retry, it created these three tasks again, and put them in a hashmap (tasks);
5. The asynchronous deletion in step 2 finally happened for one task; in the callback, it removed one task in the hashmap;
6. One of the newly submitted tasks' zookeeper watchers found that the task is unassigned, and it is not in the hashmap, so it created a new orphan task;
7. All three tasks failed, but the task created in step 6 is an orphan, so the batch.err counter was one short, and the log splitting hangs there, waiting for the last task to finish, which is never going to happen.
So I think the problem is step 2. The fix is to make the deletion sync, instead of async, so that the retry will have a clean start. Async deleteNode will mess up the split log retry. In an extreme situation, if the async deleteNode doesn't happen soon enough, some node created during the retry could be deleted. deleteNode should be sync.
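The race in steps 2-6 can be reproduced with a self-contained sketch: plain Java collections standing in for the tasks map and a queue standing in for ZooKeeper's deferred async callbacks. None of this is the actual SplitLogManager code; it only demonstrates why the stale delete callback wipes the retried task.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Sketch of the async-deleteNode race: a delete callback that fires AFTER the
// retry has re-created the task removes the NEW task, orphaning it.
public class DeleteRaceSketch {
    final Map<String, Integer> tasks = new HashMap<>();
    // Simulated ZooKeeper async callbacks, delivered some time later
    final Queue<Runnable> pendingCallbacks = new ArrayDeque<>();

    void asyncDelete(String path) {
        // Callback is only queued here; it runs at an arbitrary later point
        pendingCallbacks.add(() -> tasks.remove(path));
    }

    void syncDelete(String path) {
        tasks.remove(path); // completes before returning
    }

    void deliverCallbacks() {
        while (!pendingCallbacks.isEmpty()) pendingCallbacks.poll().run();
    }

    // Returns true if the retried task survives in the tasks map
    static boolean retryAfterAsyncDelete() {
        DeleteRaceSketch m = new DeleteRaceSketch();
        m.tasks.put("/splitlog/task1", 1);
        m.asyncDelete("/splitlog/task1");  // step 2: delete merely scheduled
        m.tasks.put("/splitlog/task1", 2); // step 4: retry re-creates the task
        m.deliverCallbacks();              // step 5: stale callback removes the NEW task
        return m.tasks.containsKey("/splitlog/task1");
    }

    static boolean retryAfterSyncDelete() {
        DeleteRaceSketch m = new DeleteRaceSketch();
        m.tasks.put("/splitlog/task1", 1);
        m.syncDelete("/splitlog/task1");   // delete finishes first
        m.tasks.put("/splitlog/task1", 2); // retry starts from a clean slate
        return m.tasks.containsKey("/splitlog/task1");
    }
}
```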
[jira] [Commented] (HBASE-5083) Backup HMaster should have http infoport open with link to the active master
[ https://issues.apache.org/jira/browse/HBASE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175061#comment-13175061 ] Jonathan Hsieh commented on HBASE-5083: --- Agreed and can probably be handled in the same patch. (updated description to add the missing other option). Backup HMaster should have http infoport open with link to the active master Key: HBASE-5083 URL: https://issues.apache.org/jira/browse/HBASE-5083 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Without ssh'ing and jps/ps'ing, it is difficult to see if a backup hmaster is up. It seems like it would be good for a backup hmaster to have a basic web page up on the info port so that users could see that it is up. Also it should probably either provide a link to the active master. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5083) Backup HMaster should have http infoport open with link to the active master
[ https://issues.apache.org/jira/browse/HBASE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-5083: -- Description: Without ssh'ing and jps/ps'ing, it is difficult to see if a backup hmaster is up. It seems like it would be good for a backup hmaster to have a basic web page up on the info port so that users could see that it is up. Also it should probably either provide a link to the active master or automatically forward to the active master. (was: Without ssh'ing and jps/ps'ing, it is difficult to see if a backup hmaster is up. It seems like it would be good for a backup hmaster to have a basic web page up on the info port so that users could see that it is up. Also it should probably either provide a link to the active master.) Backup HMaster should have http infoport open with link to the active master Key: HBASE-5083 URL: https://issues.apache.org/jira/browse/HBASE-5083 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Without ssh'ing and jps/ps'ing, it is difficult to see if a backup hmaster is up. It seems like it would be good for a backup hmaster to have a basic web page up on the info port so that users could see that it is up. Also it should probably either provide a link to the active master or automatically forward to the active master. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4616) Update hregion encoded name to reduce logic and prevent region collisions in META
[ https://issues.apache.org/jira/browse/HBASE-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175067#comment-13175067 ] Alex Newman commented on HBASE-4616: I really think these functions should be removed. It locks us into either having a metadata hit, or having regionNames in which the tablename is embedded (reversible). After all, there already is a getTableName; this static one just doesn't seem useful.
===
+ public static byte [] getTableName(byte[] regionName, Configuration conf)
This is ugly, but maybe it's ok? You've deprecated it (but you need to point the user elsewhere when you deprecate something). What are you thinking? That it should just not be possible to get the tablename from an HRI going forward? This is a kinda ugly method anyway... no longer used, I'd say, so fine if it's going away (I seem to remember it was written for some strange context where we needed to find the tablename walking the filesystem with no cluster up). Ditto for the version that doesn't take a Configuration... but man, that's ugly, creating a Configuration inside the method. Needs WARNINGS in the javadoc: DON'T DO IT! Follow the formatting of the method opening as is done elsewhere in this file; the below is different:
+ private static boolean isLHSSplit(final byte[] regionName,
+     final int offset) throws IOException {
The above method needs a bit of doc on what it's about. And there should be a blank line after each method... there is none here. The below method does not look right and should be deprecated anyway (It's returning the region name? At least warn that it doesn't work any more)?
/**
 * Get current table name of the region
+ * Get the tablename of the region from the regionKey
 * @return byte array of table name
 */
public byte[] getTableName() {
  if (tableName == null || tableName.length == 0) {
    tableName = getTableName(getRegionName());
+   tableName = getRegionName();
  }
  return tableName;
}
Update hregion encoded name to reduce logic and prevent region collisions in META - Key: HBASE-4616 URL: https://issues.apache.org/jira/browse/HBASE-4616 Project: HBase Issue Type: Umbrella Reporter: Alex Newman Assignee: Alex Newman Attachments: HBASE-4616-v2.patch, HBASE-4616-v3.patch, HBASE-4616.patch
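For context on the embedded-table-name point being debated: region names currently follow a `tableName,startKey,regionId` layout, so a static getTableName can recover the table name by scanning for the first comma. A hypothetical sketch of that parsing (not the real HRegionInfo code; the region-name layout is the assumption the whole thread is arguing about):

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Sketch of "table name embedded in the region name": with the
// `tableName,startKey,regionId` layout, the table name is everything
// before the first ',' (which is not a legal table-name character).
public class RegionNameSketch {
    static byte[] tableNameFromRegionName(byte[] regionName) {
        for (int i = 0; i < regionName.length; i++) {
            if (regionName[i] == ',') {
                return Arrays.copyOfRange(regionName, 0, i);
            }
        }
        throw new IllegalArgumentException("not a region name: "
            + new String(regionName, StandardCharsets.UTF_8));
    }
}
```

Alex's objection above is exactly that keeping such helpers locks the project into a reversible region-name encoding (or a metadata lookup) forever.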
[jira] [Updated] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server
[ https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mubarak Seyed updated HBASE-4720: - Attachment: HBASE-4720.trunk.v2.patch The attached file (HBASE-4720.trunk.v2.patch) addresses most of the code review comments. Thanks. Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server Key: HBASE-4720 URL: https://issues.apache.org/jira/browse/HBASE-4720 Project: HBase Issue Type: Improvement Reporter: Daniel Lord Assignee: Mubarak Seyed Fix For: 0.94.0 Attachments: HBASE-4720.trunk.v1.patch, HBASE-4720.trunk.v2.patch, HBASE-4720.v1.patch, HBASE-4720.v3.patch I have several large application/HBase clusters where an application node will occasionally need to talk to HBase from a different cluster. In order to help ensure some of my consistency guarantees I have a sentinel table that is updated atomically as users interact with the system. This works quite well for the regular hbase client but the REST client does not implement the checkAndPut and checkAndDelete operations. This exposes the application to some race conditions that have to be worked around. It would be ideal if the same checkAndPut/checkAndDelete operations could be supported by the REST client. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5085) fix test-patch script from setting the ulimit
[ https://issues.apache.org/jira/browse/HBASE-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5085: - Resolution: Fixed Fix Version/s: 0.94.0 Status: Resolved (was: Patch Available) Closing. In the console for hadoop-qa I can see that file descriptors are indeed now at 60k rather than at 1024. If max processes is not north of 2k, I will go bug Giri offline again. fix test-patch script from setting the ulimit - Key: HBASE-5085 URL: https://issues.apache.org/jira/browse/HBASE-5085 Project: HBase Issue Type: Bug Reporter: Giridharan Kesavan Assignee: Giridharan Kesavan Fix For: 0.94.0 Attachments: 5085-v2-experiment.txt, 5085-v2-experiment.txt, 5085-v3-experiment.txt, 5085-v3-experiment.txt, 5085-v4-experiment.txt, 5085-v5.txt, hbase-5085.patch The test-patch.sh script sets ulimit -n 1024 just after triggering the patch; this overrides the underlying system's ulimit and hence fails the hbase tests.
[jira] [Updated] (HBASE-5085) fix test-patch script from setting the ulimit
[ https://issues.apache.org/jira/browse/HBASE-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5085: - Attachment: 5085-v5.txt It looks like committing the patch last night did the trick. My patch had Giri's removal of the line setting the open-file ulimit down to 1024. Giri says he just upped the max processes per user to 10240 on all hadoop-qa machines. This patch cleans up the 'experimental' part of last night's patch. I committed it to trunk. fix test-patch script from setting the ulimit - Key: HBASE-5085 URL: https://issues.apache.org/jira/browse/HBASE-5085 Project: HBase Issue Type: Bug Reporter: Giridharan Kesavan Assignee: Giridharan Kesavan Fix For: 0.94.0 Attachments: 5085-v2-experiment.txt, 5085-v2-experiment.txt, 5085-v3-experiment.txt, 5085-v3-experiment.txt, 5085-v4-experiment.txt, 5085-v5.txt, hbase-5085.patch The test-patch.sh script sets ulimit -n 1024 just after triggering the patch; this overrides the underlying system's ulimit and hence fails the hbase tests.
[jira] [Updated] (HBASE-5091) [replication] Update replication doc to reflect current znode structure
[ https://issues.apache.org/jira/browse/HBASE-5091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Trezzo updated HBASE-5091: Attachment: HBASE-5091.patch Here is a simple edit to replication.xml with the correction. [replication] Update replication doc to reflect current znode structure --- Key: HBASE-5091 URL: https://issues.apache.org/jira/browse/HBASE-5091 Project: HBase Issue Type: Bug Components: documentation, replication Reporter: Chris Trezzo Assignee: Chris Trezzo Priority: Trivial Attachments: HBASE-5091.patch A small nit: The zookeeper node structure in the region server fail-over section of the replication document is slightly different from the actual structure. The doc shows this:
{noformat}
/hbase/replication/rs/
  1.1.1.1,60020,123456780/
    peers/
      2/
        1.1.1.1,60020.1234 (Contains a position)
        1.1.1.1,60020.1265
{noformat}
when in actuality it should be this:
{noformat}
/hbase/replication/rs/
  1.1.1.1,60020,123456780/
    2/
      1.1.1.1,60020.1234 (Contains a position)
      1.1.1.1,60020.1265
{noformat}
Not a big deal, but it gets confusing when you are going through the code and using the doc as a reference.
[jira] [Commented] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server
[ https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175105#comment-13175105 ] stack commented on HBASE-4720: -- lgtm on cursory review. I'll commit in a few days. Maybe Andrew will take a look meantime. Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server Key: HBASE-4720 URL: https://issues.apache.org/jira/browse/HBASE-4720 Project: HBase Issue Type: Improvement Reporter: Daniel Lord Assignee: Mubarak Seyed Fix For: 0.94.0 Attachments: HBASE-4720.trunk.v1.patch, HBASE-4720.trunk.v2.patch, HBASE-4720.v1.patch, HBASE-4720.v3.patch I have several large application/HBase clusters where an application node will occasionally need to talk to HBase from a different cluster. In order to help ensure some of my consistency guarantees I have a sentinel table that is updated atomically as users interact with the system. This works quite well for the regular hbase client but the REST client does not implement the checkAndPut and checkAndDelete operations. This exposes the application to some race conditions that have to be worked around. It would be ideal if the same checkAndPut/checkAndDelete operations could be supported by the REST client. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-5091) [replication] Update replication doc to reflect current znode structure
[ https://issues.apache.org/jira/browse/HBASE-5091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-5091. -- Resolution: Fixed Fix Version/s: 0.94.0 Hadoop Flags: Reviewed Committed to trunk. Thanks for the patch Chris. [replication] Update replication doc to reflect current znode structure --- Key: HBASE-5091 URL: https://issues.apache.org/jira/browse/HBASE-5091 Project: HBase Issue Type: Bug Components: documentation, replication Reporter: Chris Trezzo Assignee: Chris Trezzo Priority: Trivial Fix For: 0.94.0 Attachments: HBASE-5091.patch A small nit: The zookeeper node structure in the region server fail-over section of the replication document is slightly different from the actual structure. The doc shows this:
{noformat}
/hbase/replication/rs/
  1.1.1.1,60020,123456780/
    peers/
      2/
        1.1.1.1,60020.1234 (Contains a position)
        1.1.1.1,60020.1265
{noformat}
when in actuality it should be this:
{noformat}
/hbase/replication/rs/
  1.1.1.1,60020,123456780/
    2/
      1.1.1.1,60020.1234 (Contains a position)
      1.1.1.1,60020.1265
{noformat}
Not a big deal, but it gets confusing when you are going through the code and using the doc as a reference.
[jira] [Updated] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server
[ https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-4720: -- Status: Patch Available (was: Open) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server Key: HBASE-4720 URL: https://issues.apache.org/jira/browse/HBASE-4720 Project: HBase Issue Type: Improvement Reporter: Daniel Lord Assignee: Mubarak Seyed Fix For: 0.94.0 Attachments: HBASE-4720.trunk.v1.patch, HBASE-4720.trunk.v2.patch, HBASE-4720.v1.patch, HBASE-4720.v3.patch I have several large application/HBase clusters where an application node will occasionally need to talk to HBase from a different cluster. In order to help ensure some of my consistency guarantees I have a sentinel table that is updated atomically as users interact with the system. This works quite well for the regular hbase client but the REST client does not implement the checkAndPut and checkAndDelete operations. This exposes the application to some race conditions that have to be worked around. It would be ideal if the same checkAndPut/checkAndDelete operations could be supported by the REST client. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4616) Update hregion encoded name to reduce logic and prevent region collisions in META
[ https://issues.apache.org/jira/browse/HBASE-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175109#comment-13175109 ] stack commented on HBASE-4616: -- You can't remove them w/o deprecating them first. On your other comments: What about my proposal that you narrow the scope of this patch and attack the priority issue, removing getClosestRowBefore, and do the uuid/md5'ing in another issue? '!' is not a legal table char. Here is the function that tests legal characters: http://hbase.apache.org/xref/org/apache/hadoop/hbase/HTableDescriptor.html#278 Anything less than '-', the lowest allowable table character, should do. ',' qualifies (just about), though it would have to be the character to designate the last region and we'd need another delimiter for non-end-regions... most chars are problematic, but we could use a '+' (I think) or a '#' -- but this'd be too ugly... need to think of something better. Agree on not doing an int for the table name; problematic when replicating, not to mind the complexity it adds (as you outline above). I like your pros and cons table. Update hregion encoded name to reduce logic and prevent region collisions in META - Key: HBASE-4616 URL: https://issues.apache.org/jira/browse/HBASE-4616 Project: HBase Issue Type: Umbrella Reporter: Alex Newman Assignee: Alex Newman Attachments: HBASE-4616-v2.patch, HBASE-4616-v3.patch, HBASE-4616.patch
[jira] [Commented] (HBASE-5081) Distributed log splitting deleteNode races against splitLog retry
[ https://issues.apache.org/jira/browse/HBASE-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175128#comment-13175128 ] jirapos...@reviews.apache.org commented on HBASE-5081: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3292/#review4089 --- I feel that the proper fix should go in the method createTaskIfAbsent(): before attempting to delete a task in zk, task.deleted is set to true, and the task is not removed from the tasks array until it is successfully removed from zk. In createTaskIfAbsent(), when you find a deleted task, we should do the following:
* If the task had completed successfully, then return null (it is as if the task completed right away).
* If the task had completed unsuccessfully, then block (with timeouts) until the task is removed from the tasks array.
Without fixing anything, the problem, I think, is present only in the following scenario: at startup the master acquires orphan tasks listed in zookeeper. One of these orphan tasks fails. Before that orphan task could be deleted, some master thread asks for that task to be completed. As things currently stand, the SplitLogManager will reply with SUCCESS immediately. (This is because of the logic in createTaskIfAbsent().) The common case where this race happens should work... a master thread asks for a log dir to be split. That task fails, but it has not been deleted from zk yet nor removed from tasks yet. The log-dir split is retried and the retry finds the old, soon-to-be-deleted task. But the retry will also see that task.batch is set and it will immediately throw an error saying 'someone else is waiting for this task'. And the next time the log-dir split is retried, the tasks map might have been cleared and things will work.
src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java https://reviews.apache.org/r/3292/#comment9193 The task corresponding to this path has to be removed from the tasks map (as in deleteNodeSuccess()) src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java https://reviews.apache.org/r/3292/#comment9194 I guess this should be considered an error that the delete did not go through? - Prakash On 2011-12-22 00:31:23, Jimmy Xiang wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/3292/ bq. --- bq. bq. (Updated 2011-12-22 00:31:23) bq. bq. bq. Review request for hbase, Ted Yu, Michael Stack, and Lars Hofhansl. bq. bq. bq. Summary bq. --- bq. bq. In this patch, after a task is done, we don't delete the node if the task is failed. So that when it's retried later on, there won't be race problem. bq. bq. It used to delete the node always. bq. bq. bq. This addresses bug HBASE-5081. bq. https://issues.apache.org/jira/browse/HBASE-5081 bq. bq. bq. Diffs bq. - bq. bq.src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java 667a8b1 bq.src/test/java/org/apache/hadoop/hbase/master/TestSplitLogManager.java 32ad7e8 bq. bq. Diff: https://reviews.apache.org/r/3292/diff bq. bq. bq. Testing bq. --- bq. bq. mvn -Dtest=TestDistributedLogSplitting clean test bq. bq. bq. Thanks, bq. bq. Jimmy bq. bq. Distributed log splitting deleteNode races againsth splitLog retry --- Key: HBASE-5081 URL: https://issues.apache.org/jira/browse/HBASE-5081 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.92.0, 0.94.0 Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 0.92.0 Attachments: distributed-log-splitting-screenshot.png, hbase-5081-patch-v6.txt, hbase-5081-patch-v7.txt, hbase-5081_patch_for_92_v4.txt, hbase-5081_patch_v5.txt, patch_for_92.txt, patch_for_92_v2.txt, patch_for_92_v3.txt Recently, during 0.92 rc testing, we found distributed log splitting hangs there forever. 
Please see attached screen shot. I looked into it and here is what happened I think: 1. One rs died, the servershutdownhandler found it out and started the distributed log splitting; 2. All three tasks failed, so the three tasks were deleted, asynchronously; 3. Servershutdownhandler retried the log splitting; 4. During the retrial, it created these three tasks again, and put them in a hashmap (tasks); 5. The asynchronously deletion in step 2 finally happened for one task, in the callback, it removed one task in the hashmap; 6. One of the newly
[jira] [Commented] (HBASE-5033) Opening/Closing store in parallel to reduce region open/close time
[ https://issues.apache.org/jira/browse/HBASE-5033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175130#comment-13175130 ] Phabricator commented on HBASE-5033: Kannan has commented on the revision [jira][HBASE-5033][[89-fb]]Opening/Closing store in parallel to reduce region open/close time. Liyin - sounds good. REVISION DETAIL https://reviews.facebook.net/D933 Opening/Closing store in parallel to reduce region open/close time -- Key: HBASE-5033 URL: https://issues.apache.org/jira/browse/HBASE-5033 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang Attachments: D933.1.patch, D933.2.patch, D933.3.patch, D933.4.patch, D933.5.patch Region servers open/close each store, and each store file for every store, in sequential fashion, which makes region open/close inefficient. This diff opens/closes each store in parallel in order to reduce region open/close time. It would also help to reduce cluster restart time. 1) Opening each store in parallel 2) Loading each store file for every store in parallel 3) Closing each store in parallel 4) Closing each store file for every store in parallel.
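The parallelization described in HBASE-5033 can be sketched with a plain ExecutorService (illustration only: the store names, thread count, and "open" placeholder are made up, and the real patch works on HStore objects inside HRegion). The point is that total open time approaches the slowest store rather than the sum over all stores.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch: submit each store open as a task and wait for all of them,
// instead of opening stores one after another.
public class ParallelStoreOpenSketch {
    static List<String> openStores(List<String> storeNames, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String name : storeNames) {
                // Placeholder for opening the store and its store files
                futures.add(pool.submit(() -> name + ":open"));
            }
            List<String> opened = new ArrayList<>();
            for (Future<String> f : futures) {
                opened.add(f.get()); // get() rethrows any failure from an open
            }
            return opened;
        } finally {
            pool.shutdown();
        }
    }
}
```

Waiting on every future before returning matters: a region must not be considered open until all of its stores (and their store files) are, and failures must propagate.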
[jira] [Commented] (HBASE-5041) Major compaction on non existing table does not throw error
[ https://issues.apache.org/jira/browse/HBASE-5041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175132#comment-13175132 ] Shrijeet Paliwal commented on HBASE-5041: - Our logic to check if the name is a regionname or tablename is designed as follows; tl;dr: if it is not an existing table, it is assumed to be a region.
{noformat}
/**
 * @param tableNameOrRegionName Name of a table or name of a region.
 * @return True if <code>tableNameOrRegionName</code> is *possibly* a region
 * name else false if a verified tablename (we call {@link #tableExists(byte[])});
 * else we throw an exception.
 * @throws IOException
 */
private boolean isRegionName(final byte [] tableNameOrRegionName)
throws IOException {
  if (tableNameOrRegionName == null) {
    throw new IllegalArgumentException("Pass a table name or region name");
  }
  return !tableExists(tableNameOrRegionName);
}
{noformat}
My plan was to modify the majorCompact function's else block to check if the table exists and throw TableNotFoundException if it does not. But because of the name logic above, one will never reach the 'else' part, and a compaction request will be registered assuming the name must be a region. Major compaction on non existing table does not throw error Key: HBASE-5041 URL: https://issues.apache.org/jira/browse/HBASE-5041 Project: HBase Issue Type: Bug Components: regionserver, shell Affects Versions: 0.90.3 Reporter: Shrijeet Paliwal Following will not complain even if fubar does not exist
{code}
echo "major_compact 'fubar'" | $HBASE_HOME/bin/hbase shell
{code}
The downside of this defect is that major compaction may be skipped due to a typo by Ops.
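The fix direction discussed here can be sketched as follows (a self-contained illustration with stubbed table/region lookups, not the actual HBaseAdmin code): instead of assuming "not a table, therefore a region", verify that the name resolves to something, and throw when it resolves to nothing.

```java
import java.io.IOException;
import java.util.Set;

// Sketch: reject a major-compaction request whose name is neither a known
// table nor a known region, instead of silently registering it as a region.
public class MajorCompactCheckSketch {

    // Stub standing in for org.apache.hadoop.hbase.TableNotFoundException
    static class TableNotFoundException extends IOException {
        TableNotFoundException(String name) { super(name); }
    }

    final Set<String> tables;   // stub for tableExists()
    final Set<String> regions;  // stub for a region-name lookup

    MajorCompactCheckSketch(Set<String> tables, Set<String> regions) {
        this.tables = tables;
        this.regions = regions;
    }

    void majorCompact(String tableNameOrRegionName) throws IOException {
        if (tables.contains(tableNameOrRegionName)) {
            // compact every region of the table ...
        } else if (regions.contains(tableNameOrRegionName)) {
            // compact the single region ...
        } else {
            // the typo'd name resolves to nothing: fail loudly
            throw new TableNotFoundException(tableNameOrRegionName);
        }
    }
}
```

With this shape, `major_compact 'fubar'` from the shell would surface an error instead of silently queueing a no-op request.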
[jira] [Commented] (HBASE-5088) A concurrency issue on SoftValueSortedMap
[ https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175142#comment-13175142 ] Lars Hofhansl commented on HBASE-5088: -- If the observed behavior is for the calling thread to enter an infinite loop (as Jieshan seems to suggest) I'd say this is a critical issue (imagine this happening in a long-lived application server). Jieshan, can you confirm that what you saw was a thread looping indefinitely? As I said above, I think using SoftReferences for caching is generally misguided. As far as I am concerned we can just replace SoftValueSortedMap with ConcurrentSkipListMap. A concurrency issue on SoftValueSortedMap - Key: HBASE-5088 URL: https://issues.apache.org/jira/browse/HBASE-5088 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.4, 0.94.0 Reporter: Jieshan Bean Assignee: Jieshan Bean SoftValueSortedMap is backed by a TreeMap. All the methods in this class are synchronized. If we use these methods to add/delete elements, it's ok. But HConnectionManager#getCachedLocation uses headMap to get a view of SoftValueSortedMap#internalMap. Once we operate on this view map (like add/delete) in other threads, a concurrency issue may occur.
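The view-escapes-the-lock problem can be shown with a stripped-down stand-in (not the real SoftValueSortedMap, and with no SoftReferences involved): every method is synchronized, yet a headMap() that returns the raw TreeMap view hands callers live state they can then read and write without the lock. Returning a snapshot taken under the lock is one way out.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Sketch of the SoftValueSortedMap pitfall: synchronized methods do not help
// if one of them leaks an unsynchronized live view of the backing TreeMap.
class LeakySortedMap<K, V> {
    private final TreeMap<K, V> internal = new TreeMap<>();

    public synchronized void put(K k, V v) { internal.put(k, v); }

    // BUG analogue: the returned view shares state with `internal`, and all
    // later operations on it bypass this object's synchronization entirely
    public synchronized SortedMap<K, V> headMapUnsafe(K toKey) {
        return internal.headMap(toKey);
    }

    // Fix sketch: copy under the lock, so the caller's map never races with
    // concurrent writers (at the cost of a copy per call)
    public synchronized SortedMap<K, V> headMapSnapshot(K toKey) {
        return new TreeMap<>(internal.headMap(toKey));
    }
}
```

The test below demonstrates the shared state deterministically (the live view observes a later put, the snapshot does not); under real concurrency that shared state is what corrupts the TreeMap or sends a reader into an infinite loop.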
[jira] [Commented] (HBASE-5081) Distributed log splitting deleteNode races against splitLog retry
[ https://issues.apache.org/jira/browse/HBASE-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175146#comment-13175146 ] Lars Hofhansl commented on HBASE-5081: -- @Prakash: I like that idea. Keeps the logic where it belongs. Distributed log splitting deleteNode races against splitLog retry --- Key: HBASE-5081 URL: https://issues.apache.org/jira/browse/HBASE-5081 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.92.0, 0.94.0 Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 0.92.0 Attachments: distributed-log-splitting-screenshot.png, hbase-5081-patch-v6.txt, hbase-5081-patch-v7.txt, hbase-5081_patch_for_92_v4.txt, hbase-5081_patch_v5.txt, patch_for_92.txt, patch_for_92_v2.txt, patch_for_92_v3.txt Recently, during 0.92 RC testing, we found distributed log splitting hanging forever. Please see the attached screenshot. I looked into it and here is what I think happened:
1. One RS died; the ServerShutdownHandler found it and started the distributed log splitting.
2. All three tasks failed, so the three tasks were deleted, asynchronously.
3. ServerShutdownHandler retried the log splitting.
4. During the retry, it created these three tasks again and put them in a hashmap (tasks).
5. The asynchronous deletion from step 2 finally happened for one task; in the callback, it removed that task from the hashmap.
6. One of the newly submitted tasks' ZooKeeper watchers found that the task was unassigned and not in the hashmap, so it created a new orphan task.
7. All three tasks failed, but the task created in step 6 is an orphan, so the batch.err counter was one short; the log splitting hangs there, waiting for the last task to finish, which is never going to happen.
So I think the problem is step 2. The fix is to make the deletion synchronous, instead of asynchronous, so that the retry has a clean start. Async deleteNode will mess up the split log retry.
In extreme situations, if the async deleteNode doesn't happen soon enough, a node created during the retry could be deleted. deleteNode should be synchronous.
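The "make deleteNode synchronous" idea can be sketched generically: block on the asynchronous delete's completion callback before letting the retry proceed, so a re-created task znode can never race with a stale delete. All names here (`AsyncDeleter`, `deleteSync`) are illustrative assumptions, not the actual SplitLogManager/ZooKeeper API.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class SyncDelete {
    // Hypothetical async-delete facade: fires onDone when the delete completes.
    interface AsyncDeleter {
        void deleteAsync(String path, Runnable onDone);
    }

    // Wrap the async delete so the caller blocks until the callback fires.
    static boolean deleteSync(AsyncDeleter zk, String path, long timeoutMs)
            throws InterruptedException {
        CountDownLatch done = new CountDownLatch(1);
        zk.deleteAsync(path, done::countDown);
        // Only after this returns may the retry re-create the node, so the
        // stale delete can no longer remove a freshly created task.
        return done.await(timeoutMs, TimeUnit.MILLISECONDS);
    }
}
```

The timeout keeps a lost callback from hanging the shutdown handler forever; the caller must decide what to do if `deleteSync` returns false.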
[jira] [Updated] (HBASE-5081) Distributed log splitting deleteNode races against splitLog retry
[ https://issues.apache.org/jira/browse/HBASE-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5081: -- Comment: was deleted (was: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12508297/hbase-5081-patch-v6.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -152 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 76 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.replication.TestReplication Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/573//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/573//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/573//console This message is automatically generated.) Distributed log splitting deleteNode races against splitLog retry --- Key: HBASE-5081 URL: https://issues.apache.org/jira/browse/HBASE-5081
[jira] [Updated] (HBASE-5081) Distributed log splitting deleteNode races against splitLog retry
[ https://issues.apache.org/jira/browse/HBASE-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5081: -- Comment: was deleted (was: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12508328/hbase-5081-patch-v7.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -152 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 76 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.replication.TestReplication Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/575//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/575//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/575//console This message is automatically generated.) Distributed log splitting deleteNode races against splitLog retry --- Key: HBASE-5081 URL: https://issues.apache.org/jira/browse/HBASE-5081
[jira] [Commented] (HBASE-5081) Distributed log splitting deleteNode races against splitLog retry
[ https://issues.apache.org/jira/browse/HBASE-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175149#comment-13175149 ] jirapos...@reviews.apache.org commented on HBASE-5081: --
bq. On 2011-12-22 22:55:55, Prakash Khemani wrote:
bq. I feel that the proper fix should go in the method createTaskIfAbsent().
bq.
bq. Before attempting to delete a task in zk, task.deleted is set to true. The task is not removed from the tasks array until the task is successfully removed from zk.
bq.
bq. In createTaskIfAbsent(), when you find a deleted task, we should do the following:
bq. * If the task had completed successfully, return null. (It is as if the task completed right away.)
bq. * If the task had completed unsuccessfully, block (with timeouts) until the task is removed from the tasks array.
bq.
bq. Without fixing anything, the problem, I think, is present only in the following scenario:
bq. - At startup the master acquires the orphan tasks listed in zookeeper. One of these orphan tasks fails. Before that orphan task can be deleted, some master thread asks for that task to be completed. As things currently stand, the SplitLogManager will reply with SUCCESS immediately. (This is because of the logic in createTaskIfAbsent().)
bq.
bq. The common case where this race happens should work:
bq. - A master thread asks for a log dir to be split. That task fails, but it has not been deleted from zk yet nor removed from tasks yet. The log-dir split is retried, and the retry finds the old, soon-to-be-deleted task. But the retry will also see that task.batch is set, and it will immediately throw an error saying 'someone else is waiting for this task'. And the next time the log-dir split is retried, the tasks map might have been cleared and things will work.
"The task is not removed from the tasks array until the task is successfully removed from zk." That does not seem correct: stopTrackingTasks() will remove all tasks even if a task has not been removed from zk. That's why createTaskIfAbsent() can put a new task in the set. If we remove stopTrackingTasks(), then the task should still be in tasks, and this alternative will work. Will removing stopTrackingTasks() cause other issues? For the second bullet, how long should we block? If the task is still not removed from the tasks array after the timeout, what should we do? Can you come up with a patch? I am very open to any fix.
bq. On 2011-12-22 22:55:55, Prakash Khemani wrote:
bq. src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java, line 382
bq. https://reviews.apache.org/r/3292/diff/8/?file=65682#file65682line382
bq.
bq. The task corresponding to this path has to be removed from the tasks map (as in deleteNodeSuccess()).
It is removed in the stopTrackingTasks() method; since this one failed, batch.installed != batch.done. - Jimmy
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3292/#review4089 --- On 2011-12-22 00:31:23, Jimmy Xiang wrote:
bq. Review request for hbase, Ted Yu, Michael Stack, and Lars Hofhansl.
bq.
bq. Summary: In this patch, after a task is done, we don't delete the node if the task failed, so that when it's retried later on there won't be a race problem. It used to delete the node always.
bq.
bq. This addresses bug HBASE-5081. https://issues.apache.org/jira/browse/HBASE-5081
bq.
bq. Diffs: src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java 667a8b1, src/test/java/org/apache/hadoop/hbase/master/TestSplitLogManager.java 32ad7e8
bq. Diff: https://reviews.apache.org/r/3292/diff
bq.
bq. Testing: mvn -Dtest=TestDistributedLogSplitting clean test
bq.
bq. Thanks, Jimmy
Distributed log splitting deleteNode races against splitLog retry --- Key: HBASE-5081 URL: https://issues.apache.org/jira/browse/HBASE-5081
[jira] [Commented] (HBASE-4916) LoadTest MR Job
[ https://issues.apache.org/jira/browse/HBASE-4916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175152#comment-13175152 ] Phabricator commented on HBASE-4916: jdcryans has commented on the revision HBASE-4916 [jira] LoadTest MR Job. Another day of testing later, here's what I think of the current patch: - I like being able to drive a lot of load from just a few mappers; this is definitely something YCSB and PE can't do. - Being able to mix things is fun too; with PE I'd have to start 2 tests. YCSB is more versatile though, since you can tweak more than just random reads and writes. - Regarding the workloads, it's also much easier in YCSB, as you don't have to write code. My other comments are in the code below. INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/mapreduce/LoadTest.java:197 HBC.create() ?? src/main/java/org/apache/hadoop/hbase/loadtest/Workload.java:100 I understand why the table is recreated every time (you need to know the existing keys), but this is also a big flaw for this load tester. If you have a respectable number of nodes, it takes a lot of time to generate enough data that your block cache/OS cache cannot hold it all. Being able to restart using an existing table is a must-have IMO. How are you guys using it? src/main/java/org/apache/hadoop/hbase/loadtest/Workload.java:136 I think this should be configurable. src/main/java/org/apache/hadoop/hbase/loadtest/GetGenerator.java:97 This is the other big problem with this load generator: the only pattern is random. YCSB lets you do that, but also offers zipfian or latest distributions.
REVISION DETAIL https://reviews.facebook.net/D741 LoadTest MR Job --- Key: HBASE-4916 URL: https://issues.apache.org/jira/browse/HBASE-4916 Project: HBase Issue Type: Sub-task Components: client, regionserver Reporter: Nicolas Spiegelberg Assignee: Christopher Gist Fix For: 0.94.0 Attachments: HBASE-4916.D741.1.patch Add a script to start a streaming map-reduce job where each map task runs an instance of the load tester for a partition of the key space. Ensure that the load tester takes a parameter indicating the start key for write operations.
[jira] [Commented] (HBASE-5081) Distributed log splitting deleteNode races against splitLog retry
[ https://issues.apache.org/jira/browse/HBASE-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175155#comment-13175155 ] Jimmy Xiang commented on HBASE-5081: @Stack, it is not an orphan task. It happens in ServerShutdownHandler. It retries the log splitting if the previous attempt failed for any reason: line 178: this.services.getExecutorService().submit(this); It keeps retrying. Should we have a limit here? Distributed log splitting deleteNode races against splitLog retry --- Key: HBASE-5081 URL: https://issues.apache.org/jira/browse/HBASE-5081
[jira] [Commented] (HBASE-5081) Distributed log splitting deleteNode races against splitLog retry
[ https://issues.apache.org/jira/browse/HBASE-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175158#comment-13175158 ] Jimmy Xiang commented on HBASE-5081: @Prakash, this one didn't happen when the master started up. It happened when one region server died. Distributed log splitting deleteNode races against splitLog retry --- Key: HBASE-5081 URL: https://issues.apache.org/jira/browse/HBASE-5081
[jira] [Commented] (HBASE-3565) Add a metric to keep track of slow HLog appends
[ https://issues.apache.org/jira/browse/HBASE-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175163#comment-13175163 ] Mubarak Seyed commented on HBASE-3565: -- IMO we need 4 metrics: 1. slowHLogAppendsCount (AtomicLong, incremented by 1 whenever it takes more than 1 sec to append to the HLog; never reset, lives as long as the RS runs) 2. minHLogAppendTime (when we log slow appends over 1 sec) 3. avgHLogAppendTime 4. maxHLogAppendTime. 2, 3, and 4 need to be reset on every doUpdate(), every 5 minutes or so. Every 5 minutes we would then get the min/avg/max HLog append time. Add a metric to keep track of slow HLog appends --- Key: HBASE-3565 URL: https://issues.apache.org/jira/browse/HBASE-3565 Project: HBase Issue Type: Improvement Components: metrics, regionserver Reporter: Benoit Sigoure Labels: monitoring Whenever an edit takes too long to be written to an HLog, HBase logs a warning such as this one: {code} 2011-02-23 20:03:14,703 WARN org.apache.hadoop.hbase.regionserver.wal.HLog: IPC Server handler 21 on 60020 took 15065ms appending an edit to hlog; editcount=126050 {code} I would like to have a counter incremented each time this happens and this counter exposed via the metrics stuff in HBase so I can collect it in my monitoring system.
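The four proposed metrics can be sketched as a small, self-contained class. This is a hedged illustration of the scheme described above (cumulative count; min/avg/max reset on each `doUpdate()` push), not HBase's actual metrics code; the class and method names besides `doUpdate` are invented for the sketch.

```java
public class SlowAppendMetrics {
    private long count;                       // slowHLogAppendsCount: never reset
    private long periodCount;                 // slow appends in this period
    private long periodSum;                   // sum of slow-append times (ms)
    private long periodMin = Long.MAX_VALUE;  // minHLogAppendTime
    private long periodMax;                   // maxHLogAppendTime

    /** Record one slow HLog append of the given duration. */
    public synchronized void recordSlowAppend(long millis) {
        count++;
        periodCount++;
        periodSum += millis;
        periodMin = Math.min(periodMin, millis);
        periodMax = Math.max(periodMax, millis);
    }

    /** Push [count, min, avg, max] and reset the per-period stats. */
    public synchronized long[] doUpdate() {
        long min = periodCount == 0 ? 0 : periodMin;
        long avg = periodCount == 0 ? 0 : periodSum / periodCount;
        long[] snapshot = {count, min, avg, periodMax};
        periodCount = 0;
        periodSum = 0;
        periodMax = 0;
        periodMin = Long.MAX_VALUE;
        return snapshot;
    }
}
```

After two slow appends of 1000 ms and 3000 ms, one `doUpdate()` reports count 2, min 1000, avg 2000, max 3000; the next period starts from zero while the count stays at 2.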
[jira] [Commented] (HBASE-3565) Add a metric to keep track of slow HLog appends
[ https://issues.apache.org/jira/browse/HBASE-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175169#comment-13175169 ] Mubarak Seyed commented on HBASE-3565: -- I guess the metric names are confusing. How about: slowHLogAppendCount, slowHLogAppendMinTime, slowHLogAppendAvgTime, slowHLogAppendMaxTime? Add a metric to keep track of slow HLog appends --- Key: HBASE-3565 URL: https://issues.apache.org/jira/browse/HBASE-3565
[jira] [Commented] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server
[ https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175174#comment-13175174 ] Zhihong Yu commented on HBASE-4720: --- @Mubarak: Please resubmit the patch for Hadoop QA - the build machine was just restarted. Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server Key: HBASE-4720 URL: https://issues.apache.org/jira/browse/HBASE-4720 Project: HBase Issue Type: Improvement Reporter: Daniel Lord Assignee: Mubarak Seyed Fix For: 0.94.0 Attachments: HBASE-4720.trunk.v1.patch, HBASE-4720.trunk.v2.patch, HBASE-4720.v1.patch, HBASE-4720.v3.patch I have several large application/HBase clusters where an application node will occasionally need to talk to HBase from a different cluster. In order to help ensure some of my consistency guarantees, I have a sentinel table that is updated atomically as users interact with the system. This works quite well with the regular HBase client, but the REST client does not implement the checkAndPut and checkAndDelete operations. This exposes the application to some race conditions that have to be worked around. It would be ideal if the same checkAndPut/checkAndDelete operations could be supported by the REST client.
[jira] [Commented] (HBASE-5081) Distributed log splitting deleteNode races against splitLog retry
[ https://issues.apache.org/jira/browse/HBASE-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175172#comment-13175172 ] Zhihong Yu commented on HBASE-5081: --- Currently SplitLogManager.Task doesn't have a flag indicating whether the deletion was caused by successful splitting. I think we should introduce such a flag, as a boolean or enum, so that stopTrackingTasks() can make a better decision. Looking at the code from 0.89-fb, I can see some subtle differences between 0.89-fb and TRUNK. E.g., task.batch.notify() in 0.89-fb is guarded by a condition:
{code}
if (!task.isOrphan()) {
  synchronized (task.batch) {
    if (status == SUCCESS) {
      task.batch.done++;
    } else {
      task.batch.error++;
    }
    if ((task.batch.done + task.batch.error) == task.batch.installed) {
      task.batch.notify();
    }
  }
}
{code}
I think we should unify the two codebases so that our observations have common ground. Distributed log splitting deleteNode races against splitLog retry --- Key: HBASE-5081 URL: https://issues.apache.org/jira/browse/HBASE-5081
[jira] [Commented] (HBASE-5081) Distributed log splitting deleteNode races against splitLog retry
[ https://issues.apache.org/jira/browse/HBASE-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175176#comment-13175176 ] stack commented on HBASE-5081: -- @Jimmy I see. It's recursion on an IOE; any IOE could get us stuck here retrying the split logs of a downed server. I suppose there should be a bound on how many times we try to split the logs of a particular server. Distributed log splitting deleteNode races against splitLog retry --- Key: HBASE-5081 URL: https://issues.apache.org/jira/browse/HBASE-5081
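The bounded-retry suggestion can be sketched as a small helper. The names (`runWithLimit`), the `Callable` shape, and the attempt cap are illustrative assumptions, not ServerShutdownHandler's actual API; the real handler resubmits itself to an executor rather than looping in place.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

public class BoundedRetry {
    /**
     * Retry a task that may fail with IOException, up to maxAttempts times.
     * After the cap is reached, the last IOException is rethrown instead of
     * retrying forever, which is what gets the handler stuck today.
     */
    static <T> T runWithLimit(Callable<T> task, int maxAttempts) throws Exception {
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (IOException ioe) {
                last = ioe;  // would resubmit here, but only up to the cap
            }
        }
        throw last;  // give up after maxAttempts failures
    }
}
```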
[jira] [Updated] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server
[ https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mubarak Seyed updated HBASE-4720: - Attachment: HBASE-4720.trunk.v2.patch Per Ted's request, resubmitting the patch. Thanks. Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server Key: HBASE-4720 URL: https://issues.apache.org/jira/browse/HBASE-4720 Project: HBase Issue Type: Improvement Reporter: Daniel Lord Assignee: Mubarak Seyed Fix For: 0.94.0 Attachments: HBASE-4720.trunk.v1.patch, HBASE-4720.trunk.v2.patch, HBASE-4720.trunk.v2.patch, HBASE-4720.v1.patch, HBASE-4720.v3.patch I have several large application/HBase clusters where an application node will occasionally need to talk to HBase from a different cluster. In order to help ensure some of my consistency guarantees I have a sentinel table that is updated atomically as users interact with the system. This works quite well for the regular hbase client but the REST client does not implement the checkAndPut and checkAndDelete operations. This exposes the application to some race conditions that have to be worked around. It would be ideal if the same checkAndPut/checkAndDelete operations could be supported by the REST client. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3565) Add a metric to keep track of slow HLog appends
[ https://issues.apache.org/jira/browse/HBASE-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175200#comment-13175200 ] Mubarak Seyed commented on HBASE-3565: -- slowHLogAppendCount slowHLogAppendTime -slowHLogAppendMinTime- -slowHLogAppendAvgTime- -slowHLogAppendMaxTime- Waiting for corporate approval to contribute this patch. Thanks. Add a metric to keep track of slow HLog appends --- Key: HBASE-3565 URL: https://issues.apache.org/jira/browse/HBASE-3565 Project: HBase Issue Type: Improvement Components: metrics, regionserver Reporter: Benoit Sigoure Labels: monitoring Whenever an edit takes too long to be written to an HLog, HBase logs a warning such as this one: {code} 2011-02-23 20:03:14,703 WARN org.apache.hadoop.hbase.regionserver.wal.HLog: IPC Server handler 21 on 60020 took 15065ms appending an edit to hlog; editcount=126050 {code} I would like to have a counter incremented each time this happens and this counter exposed via the metrics stuff in HBase so I can collect it in my monitoring system. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
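A minimal sketch of the kind of counter being requested, using the metric names from the comment above; the 1-second slowness threshold and the `record` helper are assumptions for illustration, not the actual patch:

```java
import java.util.concurrent.atomic.AtomicLong;

/** Sketch: count HLog appends that exceed a slowness threshold. */
class SlowAppendMetrics {
    static final long SLOW_APPEND_THRESHOLD_MS = 1000L; // assumed threshold

    final AtomicLong slowHLogAppendCount = new AtomicLong();
    final AtomicLong slowHLogAppendTime = new AtomicLong();

    /** Call after each HLog append with its wall-clock duration. */
    void record(long tookMs) {
        if (tookMs > SLOW_APPEND_THRESHOLD_MS) {
            slowHLogAppendCount.incrementAndGet(); // how many slow appends
            slowHLogAppendTime.addAndGet(tookMs);  // total time spent in them
        }
    }
}
```

The 15065 ms append from the warning in the description would bump the count by one; a monitoring system can then scrape both values periodically.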
[jira] [Commented] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server
[ https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175226#comment-13175226 ] Hadoop QA commented on HBASE-4720: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12508447/HBASE-4720.trunk.v2.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -151 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 77 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/583//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/583//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/583//console This message is automatically generated.
[jira] [Commented] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server
[ https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175229#comment-13175229 ] Zhihong Yu commented on HBASE-4720: --- The failed tests were due to a NumberFormatException (see MAPREDUCE-3583). TestMasterReplication hung, but that shouldn't have been caused by this JIRA. The latest patch should be good to go.
[jira] [Updated] (HBASE-5041) Major compaction on non existing table does not throw error
[ https://issues.apache.org/jira/browse/HBASE-5041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shrijeet Paliwal updated HBASE-5041: Attachment: 0001-HBASE-5041-Throw-error-if-table-does-not-exist.patch Attaching first patch. Major compaction on non existing table does not throw error Key: HBASE-5041 URL: https://issues.apache.org/jira/browse/HBASE-5041 Project: HBase Issue Type: Bug Components: regionserver, shell Affects Versions: 0.90.3 Reporter: Shrijeet Paliwal Attachments: 0001-HBASE-5041-Throw-error-if-table-does-not-exist.patch Following will not complain even if fubar does not exist {code} echo major_compact 'fubar' | $HBASE_HOME/bin/hbase shell {code} The downside for this defect is that major compaction may be skipped due to a typo by Ops. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server
[ https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175231#comment-13175231 ] Hadoop QA commented on HBASE-4720: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12508468/HBASE-4720.trunk.v2.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -151 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 77 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.coprocessor.TestCoprocessorEndpoint Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/584//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/584//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/584//console This message is automatically generated.
[jira] [Commented] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server
[ https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175233#comment-13175233 ] Mubarak Seyed commented on HBASE-4720: -- Thanks, Ted and Stack.
[jira] [Commented] (HBASE-5041) Major compaction on non existing table does not throw error
[ https://issues.apache.org/jira/browse/HBASE-5041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175234#comment-13175234 ] Zhihong Yu commented on HBASE-5041: --- Patch (for 0.90) looks good. Please attach a patch for TRUNK so that Hadoop QA can do its job.
[jira] [Assigned] (HBASE-5041) Major compaction on non existing table does not throw error
[ https://issues.apache.org/jira/browse/HBASE-5041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu reassigned HBASE-5041: - Assignee: Shrijeet Paliwal Fix For: 0.92.0, 0.94.0, 0.90.6
[jira] [Updated] (HBASE-5010) Filter HFiles based on TTL
[ https://issues.apache.org/jira/browse/HBASE-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-5010: --- Attachment: D909.2.patch mbautin updated the revision [jira] [HBASE-5010] [89-fb] Filter HFiles based on TTL. Reviewers: Kannan, Liyin, JIRA Addressing Kannan's and Ted's comments. REVISION DETAIL https://reviews.facebook.net/D909 AFFECTED FILES src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueScanner.java src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java src/main/java/org/apache/hadoop/hbase/regionserver/NonLazyKeyValueScanner.java src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java src/main/java/org/apache/hadoop/hbase/regionserver/TimeRangeTracker.java src/main/java/org/apache/hadoop/hbase/util/Threads.java src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java src/test/java/org/apache/hadoop/hbase/regionserver/TestSelectScannersUsingTTL.java src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java Filter HFiles based on TTL -- Key: HBASE-5010 URL: https://issues.apache.org/jira/browse/HBASE-5010 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: D909.1.patch, D909.2.patch In ScanWildcardColumnTracker we have {code:java} this.oldestStamp = EnvironmentEdgeManager.currentTimeMillis() - ttl; ... 
private boolean isExpired(long timestamp) { return timestamp < oldestStamp; } {code} but this time range filtering does not participate in HFile selection. In one real case, this caused next() calls to time out because all KVs in a table had expired, but next() had to iterate over the whole table to find that out. We should be able to filter out those HFiles right away. I think a reasonable approach is to add a default timerange filter to every scan for a CF with a finite TTL and utilize the existing filtering in StoreFile.Reader.passesTimerangeFilter.
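A sketch of the proposed check under the issue's own definition (`oldestStamp = now - ttl`); the helper names here are hypothetical, not the patch's actual API. The point is that the same per-KV expiry test, applied to a file's maximum timestamp, lets a whole HFile be skipped when even its newest KV has expired:

```java
/** Sketch: skip whole HFiles whose newest KV is already past the TTL. */
class TtlFileFilter {
    /** A KV is expired when it is older than now - ttl (the oldestStamp). */
    static boolean isExpired(long timestamp, long oldestStamp) {
        return timestamp < oldestStamp;
    }

    /** A file can be skipped when even its newest KV is expired. */
    static boolean canSkipFile(long maxTimestampInFile, long now, long ttlMs) {
        return isExpired(maxTimestampInFile, now - ttlMs);
    }
}
```

For example, with a 5-second TTL at time 10000 ms, a file whose newest KV has timestamp 1000 ms can be skipped outright, while one with a KV at 9000 ms must still be scanned.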
[jira] [Commented] (HBASE-5041) Major compaction on non existing table does not throw error
[ https://issues.apache.org/jira/browse/HBASE-5041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175247#comment-13175247 ] ramkrishna.s.vasudevan commented on HBASE-5041: --- @Shrijeet {code} if (isRegionName(tableNameOrRegionName)) { Pair<HRegionInfo, HServerAddress> pair = MetaReader.getRegion(ct, tableNameOrRegionName); if (pair == null || pair.getSecond() == null) { LOG.info("No server in .META. for " + Bytes.toStringBinary(tableNameOrRegionName) + "; pair=" + pair); {code} MetaReader.getRegion should return a null pair if a wrong table name is provided. Can't we throw the exception there? Maybe I am missing something.
[jira] [Commented] (HBASE-5088) A concurrency issue on SoftValueSortedMap
[ https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175253#comment-13175253 ] Jieshan Bean commented on HBASE-5088: - Looping in the method TreeMap#fixAfterDeletion(Entry<K,V> x): once x is null (which may be caused by a concurrency issue), it can't come out of that loop. At that time, CPU usage is high. That's what we saw: all the time, the thread was blocked in that method. Currently, the patch just replaces TreeMap with ConcurrentSkipListMap, per Anoop's and Lars's suggestion. We're verifying the patch. @Ted, actually, the headMap method of ConcurrentSkipListMap is different from TreeMap's: the view it returns is also backed by the original thread-safe map. What do you think? A concurrency issue on SoftValueSortedMap - Key: HBASE-5088 URL: https://issues.apache.org/jira/browse/HBASE-5088 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.4, 0.94.0 Reporter: Jieshan Bean Assignee: Jieshan Bean SoftValueSortedMap is backed by a TreeMap. All the methods in this class are synchronized. If we use these methods to add/delete elements, it's OK. But HConnectionManager#getCachedLocation uses headMap to get a view of SoftValueSortedMap#internalMap. Once we operate on this view map (like add/delete) in other threads, a concurrency issue may occur.
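A small illustration of the suggested replacement (this is a standalone demo, not the HBase patch): unlike TreeMap, whose headMap view is unsafe when the backing map is mutated concurrently, ConcurrentSkipListMap.headMap returns a live view backed by the same thread-safe structure, so mutations of the base map are simply reflected in the view:

```java
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

/** Demo: a ConcurrentSkipListMap headMap view survives base-map mutation. */
class HeadMapDemo {
    static int headSizeAfterBaseMutation() {
        ConcurrentSkipListMap<Integer, String> base = new ConcurrentSkipListMap<>();
        base.put(10, "region-a");
        base.put(20, "region-b");
        // The view is backed by the same concurrent skip list, so mutating
        // the base map (from any thread) cannot corrupt it; a TreeMap
        // headMap view offers no such guarantee under concurrency.
        ConcurrentNavigableMap<Integer, String> head = base.headMap(15, true);
        base.put(5, "region-c"); // mutation of the base map after view creation
        return head.size();      // view reflects the change: keys 5 and 10
    }
}
```

This mirrors the getCachedLocation pattern from the description: the row-key lookup takes a headMap view while other threads add and evict cached region locations.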
[jira] [Updated] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)
[ https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Bautin updated HBASE-4218: -- Status: Open (was: Patch Available) Delta Encoding of KeyValues (aka prefix compression) - Key: HBASE-4218 URL: https://issues.apache.org/jira/browse/HBASE-4218 Project: HBase Issue Type: Improvement Components: io Affects Versions: 0.94.0 Reporter: Jacek Migdal Assignee: Mikhail Bautin Labels: compression Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 0001-Delta-encoding.patch, D447.1.patch, D447.10.patch, D447.11.patch, D447.12.patch, D447.13.patch, D447.2.patch, D447.3.patch, D447.4.patch, D447.5.patch, D447.6.patch, D447.7.patch, D447.8.patch, D447.9.patch, Delta-encoding.patch-2011-12-22_11_52_07.patch, Delta_encoding_with_memstore_TS.patch, open-source.diff A compression for keys. Keys are sorted in an HFile and are usually very similar. Because of that, it is possible to design better compression than general-purpose algorithms. It is an additional step designed to be used in memory. It aims to save memory in the cache as well as to speed up seeks within HFileBlocks. It should improve performance a lot if key lengths are larger than value lengths; for example, it makes a lot of sense to use it when the value is a counter. Initial tests on real data (key length ~ 90 bytes, value length = 8 bytes) show that I could achieve a decent level of compression: key compression ratio: 92%; total compression ratio: 85%; LZO on the same data: 85%; LZO after delta encoding: 91%. All while having much better performance (20-80% faster decompression than LZO). Moreover, it should allow far more efficient seeking, which should improve performance a bit. It seems that simple compression algorithms are good enough. Most of the savings are due to prefix compression, int128 encoding, timestamp diffs, and bitfields to avoid duplication.
That way, comparisons of compressed data can be much faster than a byte comparator (thanks to prefix compression and bitfields). In order to implement it in HBase, two important changes in design will be needed:
- solidify the interface to HFileBlock / HFileReader Scanner to provide seeking and iterating; access to the uncompressed buffer in HFileBlock will have bad performance
- extend comparators to support comparison assuming that the first N bytes are equal (or some fields are equal)
Link to a discussion about something similar: http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windowssubj=Re+prefix+compression
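A minimal sketch of the core prefix-compression idea for sorted keys (illustrative only; this is not the patch's actual encoding, which also handles timestamp diffs and bitfields): since adjacent keys in an HFile share long prefixes, each key can be stored as the length of the prefix it shares with the previous key plus the differing suffix.

```java
import java.util.Arrays;

/** Sketch: delta-encode a key relative to its predecessor in sorted order. */
class PrefixDelta {
    /** Number of leading bytes two keys have in common. */
    static int commonPrefixLength(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length), i = 0;
        while (i < n && a[i] == b[i]) i++;
        return i;
    }

    /** The bytes of key that are not shared with prev; storing only
     *  (sharedPrefixLength, suffix) is the essence of prefix compression. */
    static byte[] suffix(byte[] prev, byte[] key) {
        return Arrays.copyOfRange(key, commonPrefixLength(prev, key), key.length);
    }
}
```

For the ~90-byte keys mentioned above, consecutive keys often differ only in their last few bytes, which is where the reported 92% key compression ratio comes from; the shared-prefix length also lets a comparator skip the bytes already known to be equal.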
[jira] [Commented] (HBASE-5088) A concurrency issue on SoftValueSortedMap
[ https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175256#comment-13175256 ] Zhihong Yu commented on HBASE-5088: --- I think Anoop and Lars' suggestion is good. Let us know the result of your tests.
[jira] [Commented] (HBASE-5088) A concurrency issue on SoftValueSortedMap
[ https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175257#comment-13175257 ] Lars Hofhansl commented on HBASE-5088: -- @Stack... Do you think this is critical? (It has been like this forever, so maybe it's not; it seems bad enough, though.)
[jira] [Commented] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)
[ https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175258#comment-13175258 ] Zhihong Yu commented on HBASE-4218: --- Hadoop QA remembers the attachment Id and wouldn't retest the same attachment. Please attach the patch again.
[jira] [Commented] (HBASE-5041) Major compaction on non existing table does not throw error
[ https://issues.apache.org/jira/browse/HBASE-5041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175264#comment-13175264 ] Shrijeet Paliwal commented on HBASE-5041: - @Ted, will add a unit test and upload a new one on top of trunk. @Ram, thanks for commenting. Do you mean to say isRegionName should throw an exception? I wanted to keep the semantics the same as before: it tells whether the name argument 'appears' to be a region name or not. When MetaReader.getRegion returns null, we know one thing for sure: it is not a region. Determining if it's a valid table is left to the caller, depending on need. Did you mean something else?
[jira] [Updated] (HBASE-4439) Move ClientScanner out of HTable
[ https://issues.apache.org/jira/browse/HBASE-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-4439: - Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk. Move ClientScanner out of HTable Key: HBASE-4439 URL: https://issues.apache.org/jira/browse/HBASE-4439 Project: HBase Issue Type: Improvement Affects Versions: 0.94.0 Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Minor Fix For: 0.94.0 Attachments: 4439-v1.txt, 4439-v2.txt, 4439.txt See HBASE-1935 for motivation. ClientScanner should be able to exist outside of HTable. While we're at it, we can also add an abstract client scanner to ease development of new client-side scanners (such as parallel scanners or per-region scanners).
[jira] [Commented] (HBASE-5088) A concurrency issue on SoftValueSortedMap
[ https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175270#comment-13175270 ] stack commented on HBASE-5088: -- @Lars Sounds bad. Not enough to kill the RC IMO since it's been this way so long, but it should make the next RC if there is one, and be backported to 0.90.6 I'd say.
A concurrency issue on SoftValueSortedMap - Key: HBASE-5088 URL: https://issues.apache.org/jira/browse/HBASE-5088 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.4, 0.94.0 Reporter: Jieshan Bean Assignee: Jieshan Bean SoftValueSortedMap is backed by a TreeMap. All the methods in this class are synchronized. If we use these methods to add/delete elements, it's OK. But HConnectionManager#getCachedLocation uses headMap to get a view of SoftValueSortedMap#internalMap. If we operate on this view map (add/delete) from other threads, a concurrency issue may occur.
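The report above hinges on a subtlety of java.util.TreeMap: headMap does not return a copy, it returns a live view backed by the original map. A minimal sketch (not HBase code; class and lock names are illustrative) of why synchronizing only the map wrapper's own methods is not enough:

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Demonstrates that TreeMap.headMap returns a *live view*: mutations made to
// the backing map (possibly from another thread) show through the view, even
// if every method on an owning wrapper class is itself synchronized.
public class HeadMapViewDemo {
    public static void main(String[] args) {
        SortedMap<Integer, String> map = new TreeMap<>();
        map.put(1, "a");
        map.put(5, "b");

        // The view escapes whatever internal synchronization the owner does.
        SortedMap<Integer, String> view = map.headMap(10);

        map.put(3, "c"); // imagine this put happening on another thread
        System.out.println(view.size()); // the mutation is visible in the view

        // One common fix: take a defensive copy while holding a shared lock,
        // then read the snapshot freely without the lock.
        Object lock = new Object();
        SortedMap<Integer, String> snapshot;
        synchronized (lock) {
            snapshot = new TreeMap<>(map.headMap(10));
        }
        System.out.println(snapshot.size());
    }
}
```

Iterating or reading such a view concurrently with writes to the backing map can throw ConcurrentModificationException or return inconsistent results, which is the race described in the issue.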
[jira] [Commented] (HBASE-5088) A concurrency issue on SoftValueSortedMap
[ https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175274#comment-13175274 ] Lars Hofhansl commented on HBASE-5088: -- Agree... Shouldn't kill the RC. One thing I noticed in the code: we always use the concrete Map type everywhere. This is in general bad design. We should use only the interface that is needed. Most methods only use the Map interface, so they shouldn't be passed anything more than that. I can attach a patch that reduces the Map interface used to what is actually needed (that may or may not make this problem simpler, but it'll be nicer ;) ).
A concurrency issue on SoftValueSortedMap - Key: HBASE-5088 URL: https://issues.apache.org/jira/browse/HBASE-5088 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.4, 0.94.0 Reporter: Jieshan Bean Assignee: Jieshan Bean SoftValueSortedMap is backed by a TreeMap. All the methods in this class are synchronized. If we use these methods to add/delete elements, it's OK. But HConnectionManager#getCachedLocation uses headMap to get a view of SoftValueSortedMap#internalMap. If we operate on this view map (add/delete) from other threads, a concurrency issue may occur.
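Lars's design point above ("use only the interface that is needed") can be illustrated with a small standalone sketch (not HBase code; the method names are hypothetical):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Declaring parameters against the narrowest interface actually used, rather
// than a concrete type, keeps callers free to pass any conforming Map.
public class InterfaceTypingDemo {
    // Too restrictive: only a TreeMap can ever be passed, even though nothing
    // here needs sorted-map behavior.
    static int countBad(TreeMap<String, Integer> m) { return m.size(); }

    // Better: only Map operations are used, so only Map is required.
    static int countGood(Map<String, Integer> m) { return m.size(); }

    public static void main(String[] args) {
        Map<String, Integer> hash = new HashMap<>();
        hash.put("x", 1);
        // countBad(hash);  // would not compile: HashMap is not a TreeMap
        System.out.println(countGood(hash));                 // works with any Map
        System.out.println(countGood(new TreeMap<>(hash)));  // still works with TreeMap
    }
}
```

Narrowing the declared types also makes it obvious at a glance which call sites actually rely on sorted-map operations like headMap, which is where the concurrency hazard in this issue lives.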
[jira] [Commented] (HBASE-5052) The path where a dynamically loaded coprocessor jar is copied on the local file system depends on the region name (and implicitly, the start key)
[ https://issues.apache.org/jira/browse/HBASE-5052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175275#comment-13175275 ] Zhihong Yu commented on HBASE-5052: --- @Andrei: Have you considered possible collisions introduced by the proposed change?
The path where a dynamically loaded coprocessor jar is copied on the local file system depends on the region name (and implicitly, the start key) - Key: HBASE-5052 URL: https://issues.apache.org/jira/browse/HBASE-5052 Project: HBase Issue Type: Bug Components: coprocessors Affects Versions: 0.92.0 Reporter: Andrei Dragomir Attachments: HBASE-5052.patch When loading a coprocessor from HDFS, the jar file gets copied to a path on the local filesystem which depends on the region name and the region start key. The name is cleaned, but not enough: when it contains filesystem-unfriendly characters (/?:, etc.), the coprocessor is not loaded and an error is thrown.
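Ted's collision concern is real: if two distinct region names sanitize to the same cleaned string, their jar copies land on the same local path. A hypothetical sketch (not the attached patch; all names here are illustrative) of one way to keep paths both filesystem-safe and unique is to append a digest of the original, uncleaned name:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Sanitizes a region name for use as a local file name, then appends a short
// digest of the *original* name so distinct inputs cannot collide after cleaning.
public class JarPathDemo {
    static String safeLocalName(String regionName) throws NoSuchAlgorithmException {
        String cleaned = regionName.replaceAll("[^A-Za-z0-9._-]", "_");
        byte[] d = MessageDigest.getInstance("MD5")
                .digest(regionName.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : d) hex.append(String.format("%02x", b));
        return cleaned + "." + hex.substring(0, 8);
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        // These two distinct names both clean to "t1_r_"...
        String a = safeLocalName("t1,r/");
        String b = safeLocalName("t1,r:");
        System.out.println(a.startsWith("t1_r_") && b.startsWith("t1_r_"));
        System.out.println(a.equals(b)); // ...but the digest suffix keeps them apart
    }
}
```

The digest makes the mapping from region name to path injective in practice, so aggressive character stripping no longer risks two regions clobbering each other's jar copies.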
[jira] [Commented] (HBASE-5088) A concurrency issue on SoftValueSortedMap
[ https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175283#comment-13175283 ] Jieshan Bean commented on HBASE-5088: - +1. Using the concrete Map type is generally not recommended.
A concurrency issue on SoftValueSortedMap - Key: HBASE-5088 URL: https://issues.apache.org/jira/browse/HBASE-5088 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.4, 0.94.0 Reporter: Jieshan Bean Assignee: Jieshan Bean SoftValueSortedMap is backed by a TreeMap. All the methods in this class are synchronized. If we use these methods to add/delete elements, it's OK. But HConnectionManager#getCachedLocation uses headMap to get a view of SoftValueSortedMap#internalMap. If we operate on this view map (add/delete) from other threads, a concurrency issue may occur.
[jira] [Commented] (HBASE-5041) Major compaction on non existing table does not throw error
[ https://issues.apache.org/jira/browse/HBASE-5041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175284#comment-13175284 ] stack commented on HBASE-5041: -- I think the patch is doing the right thing. It's changing the contract for isRegionName, but this is a private method and you are tightening what was previously a sloppy contract; it looks too like all instances of isRegionName can benefit from this tightening (is this your thought Shrijeet?). Here's a few comments on the patch: This bit of code is repeated three times:
{code}
+final String tableName = Bytes.toString(tableNameOrRegionName);
+if (!MetaReader.tableExists(ct, tableName)) {
+  throw new TableNotFoundException(tableName);
+}
{code}
... which isn't the end of the world, but if you are going to cut a new patch, you might make a method that returns a String tablename for a table you know exists (else it throws the TNFE). Then, given who the author of this patch is, I wonder about the below:
{code}
+CatalogTracker ct = getCatalogTracker();
+try {
+  return (MetaReader.getRegion(ct, tableNameOrRegionName) != null);
+} finally {
+  cleanupCatalogTracker(ct);
+}
{code}
We are creating a new CatalogTracker instance. No one seems to be shutting it down? Is that a prob? Otherwise, the patch is good by me.
Major compaction on non existing table does not throw error Key: HBASE-5041 URL: https://issues.apache.org/jira/browse/HBASE-5041 Project: HBase Issue Type: Bug Components: regionserver, shell Affects Versions: 0.90.3 Reporter: Shrijeet Paliwal Assignee: Shrijeet Paliwal Fix For: 0.92.0, 0.94.0, 0.90.6 Attachments: 0001-HBASE-5041-Throw-error-if-table-does-not-exist.patch The following will not complain even if fubar does not exist:
{code}
echo major_compact 'fubar' | $HBASE_HOME/bin/hbase shell
{code}
The downside of this defect is that a major compaction may be skipped due to a typo by Ops.
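The helper stack suggests ("a method that returns a String tablename for a table you know exists, else it throws the TNFE") can be sketched in a self-contained form. This is hypothetical code, not the committed patch: a plain Set stands in for the MetaReader/CatalogTracker lookup, and the method and exception names are illustrative.

```java
import java.util.Set;
import java.util.TreeSet;

// Sketch of the suggested "resolve-or-throw" helper: collapse the three
// repeated existence checks into one method that either returns the table
// name as a String or throws if the table is unknown.
public class TableNameCheckDemo {
    static class TableNotFoundException extends RuntimeException {
        TableNotFoundException(String name) { super(name); }
    }

    // In the real code, the Set lookup would be MetaReader.tableExists(ct, ...)
    // and the byte[]-to-String conversion would be Bytes.toString(...).
    static String checkTableExists(Set<String> knownTables, byte[] tableNameOrRegionName) {
        String tableName = new String(tableNameOrRegionName);
        if (!knownTables.contains(tableName)) {
            throw new TableNotFoundException(tableName);
        }
        return tableName;
    }

    public static void main(String[] args) {
        Set<String> tables = new TreeSet<>();
        tables.add("usertable");
        System.out.println(checkTableExists(tables, "usertable".getBytes()));
        try {
            checkTableExists(tables, "fubar".getBytes());
        } catch (TableNotFoundException e) {
            System.out.println("TableNotFoundException: " + e.getMessage());
        }
    }
}
```

Centralizing the check means a typo like major_compact 'fubar' fails fast with an exception at one well-defined place, rather than being silently ignored at three separate call sites.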
[jira] [Commented] (HBASE-5041) Major compaction on non existing table does not throw error
[ https://issues.apache.org/jira/browse/HBASE-5041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175287#comment-13175287 ] Shrijeet Paliwal commented on HBASE-5041: - @Stack
{quote}
I think the patch is doing the right thing. It's changing the contract for isRegionName, but this is a private method and you are tightening what was previously a sloppy contract; it looks too like all instances of isRegionName can benefit from this tightening (is this your thought Shrijeet?).
{quote}
Yes, that is the idea.
{quote}
You might make a method that returns a String tablename for a table you know exists (else it throws the TNFE).
{quote}
Makes sense, will do.
{quote}
We are creating a new CatalogTracker instance. No one seems to be shutting it down? Is that a prob?
{quote}
Did not understand this one, Stack. cleanupCatalogTracker called in finally will stop the CatalogTracker, no?
Major compaction on non existing table does not throw error Key: HBASE-5041 URL: https://issues.apache.org/jira/browse/HBASE-5041 Project: HBase Issue Type: Bug Components: regionserver, shell Affects Versions: 0.90.3 Reporter: Shrijeet Paliwal Assignee: Shrijeet Paliwal Fix For: 0.92.0, 0.94.0, 0.90.6 Attachments: 0001-HBASE-5041-Throw-error-if-table-does-not-exist.patch The following will not complain even if fubar does not exist:
{code}
echo major_compact 'fubar' | $HBASE_HOME/bin/hbase shell
{code}
The downside of this defect is that a major compaction may be skipped due to a typo by Ops.
[jira] [Commented] (HBASE-5041) Major compaction on non existing table does not throw error
[ https://issues.apache.org/jira/browse/HBASE-5041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175288#comment-13175288 ] Zhihong Yu commented on HBASE-5041: --- Shrijeet is using this method, which was already in HBaseAdmin.java:
{code}
private void cleanupCatalogTracker(final CatalogTracker ct) {
  ct.stop();
}
{code}
Major compaction on non existing table does not throw error Key: HBASE-5041 URL: https://issues.apache.org/jira/browse/HBASE-5041 Project: HBase Issue Type: Bug Components: regionserver, shell Affects Versions: 0.90.3 Reporter: Shrijeet Paliwal Assignee: Shrijeet Paliwal Fix For: 0.92.0, 0.94.0, 0.90.6 Attachments: 0001-HBASE-5041-Throw-error-if-table-does-not-exist.patch The following will not complain even if fubar does not exist:
{code}
echo major_compact 'fubar' | $HBASE_HOME/bin/hbase shell
{code}
The downside of this defect is that a major compaction may be skipped due to a typo by Ops.
[jira] [Assigned] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server
[ https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu reassigned HBASE-4720: - Assignee: Mubarak Seyed (was: Seyed)
Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server Key: HBASE-4720 URL: https://issues.apache.org/jira/browse/HBASE-4720 Project: HBase Issue Type: Improvement Reporter: Daniel Lord Assignee: Mubarak Seyed Fix For: 0.94.0 Attachments: HBASE-4720.trunk.v1.patch, HBASE-4720.trunk.v2.patch, HBASE-4720.trunk.v2.patch, HBASE-4720.v1.patch, HBASE-4720.v3.patch I have several large application/HBase clusters where an application node will occasionally need to talk to HBase from a different cluster. In order to help ensure some of my consistency guarantees I have a sentinel table that is updated atomically as users interact with the system. This works quite well for the regular hbase client but the REST client does not implement the checkAndPut and checkAndDelete operations. This exposes the application to some race conditions that have to be worked around. It would be ideal if the same checkAndPut/checkAndDelete operations could be supported by the REST client.