[jira] [Commented] (HBASE-5203) Group atomic put/delete operation into a single WALEdit to handle region server failures.
[ https://issues.apache.org/jira/browse/HBASE-5203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186760#comment-13186760 ]

jirapos...@reviews.apache.org commented on HBASE-5203:
------------------------------------------------------

This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3510/

(Updated 2012-01-16 07:58:33.454515)

Review request for hbase, Ted Yu and Michael Stack.

Changes
-------

Fixed comments (before Ted points them out to me :) )

Summary
-------

Basically a rewrite (sorry about that) of HBASE-3485 (Allow atomic put/delete in one call). This makes the operation correct in the case of RegionServer failures as well (HBASE-3485 was correct for all scenarios except RegionServer failures).

HRegion.mutateRow(...) now groups all edits into a single WALEdit and appends them to the WAL in one call. Only then are the memstore edits applied.

This is the first time that a WALEdit can contain KVs from different types of operations, so I also had to fix the replication code to understand that. WAL recovery already handles this case.

This addresses bug HBASE-5203.
    https://issues.apache.org/jira/browse/HBASE-5203

Diffs (updated)
---------------

  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Delete.java 1231744
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1231744
  http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java 1231744
  http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1231744

Diff: https://reviews.apache.org/r/3510/diff

Testing
-------

* Tests added in HBASE-3485
* Manual testing
* A full test run is in progress right now

Thanks,

Lars

Group atomic put/delete operation into a single WALEdit to handle region server failures.
------------------------------------------------------------------------------------------

Key: HBASE-5203
URL: https://issues.apache.org/jira/browse/HBASE-5203
Project: HBase
Issue Type: Sub-task
Components: client, coprocessors, regionserver
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Fix For: 0.94.0
Attachments: 5203.txt

HBASE-3584 does not provide fully atomic operation in case of region server failures (see explanation there).

What should happen is that either (1) all edits are applied via a single WALEdit, or (2) the WALEdits are applied in async mode and then sync'ed together.

For #1 it is not clear whether it is advisable to manage multiple *different* operations (Put/Delete) via a single WAL edit. A quick check reveals that WAL replay on region startup would work, but that replication would need to be adapted. The refactoring needed would be non-trivial.

#2 might actually not work, as another operation could request sync'ing a later edit and hence flush these entries out as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
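The ordering described above (collect every edit of the mutation into one log entry, append once, and only then touch the memstore) can be pictured with a toy model. This is a minimal sketch under stated assumptions: `Wal`, `Row`, `Edit` and `replay` are hypothetical names invented for illustration and are not the HBase `HRegion` or `WALEdit` classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

/** Toy model only: every name below is hypothetical, none is an HBase class. */
final class SingleWalEntrySketch {

    /** One cell-level change: a put when value is non-null, a delete when it is null. */
    static final class Edit {
        final String column;
        final String value;
        Edit(String column, String value) { this.column = column; this.value = value; }
        boolean isDelete() { return value == null; }
    }

    /** A log that stores each appended batch as one indivisible entry. */
    static final class Wal {
        final List<List<Edit>> entries = new ArrayList<>();
        void append(List<Edit> batch) { entries.add(new ArrayList<>(batch)); }
    }

    static final class Row {
        final TreeMap<String, String> memstore = new TreeMap<>();

        /** Log all edits as a single entry first; only then update in-memory state. */
        void mutate(Wal wal, List<Edit> edits) {
            wal.append(edits); // step 1: one append covering puts and deletes alike
            apply(edits);      // step 2: memstore update
        }

        void apply(List<Edit> edits) {
            for (Edit e : edits) {
                if (e.isDelete()) memstore.remove(e.column);
                else memstore.put(e.column, e.value);
            }
        }

        /** Replay works on whole entries, so a batch is recovered entirely or not at all. */
        static Row replay(Wal wal) {
            Row row = new Row();
            for (List<Edit> entry : wal.entries) row.apply(entry);
            return row;
        }
    }

    public static void main(String[] args) {
        Wal wal = new Wal();
        Row row = new Row();
        row.mutate(wal, Arrays.asList(new Edit("a", "1"), new Edit("b", "2")));
        row.mutate(wal, Arrays.asList(new Edit("a", null), new Edit("c", "3")));
        System.out.println(Row.replay(wal).memstore);
    }
}
```

Because a crash can only fall between whole log entries in this model, replay never observes the put without the delete that was submitted with it, which is the property option (1) in the description is after.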
[jira] [Updated] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region to be assigned before log splitting is completed, causing data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-5179:
------------------------------------------

Comment: was deleted

(was: -1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12510674/5179-90v7.patch
  against trunk revision .

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

-1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/773//console

This message is automatically generated.)

Concurrent processing of processFaileOver and ServerShutdownHandler may cause region to be assigned before log splitting is completed, causing data loss
----------------------------------------------------------------------------------------------------------------------------------------------------------

Key: HBASE-5179
URL: https://issues.apache.org/jira/browse/HBASE-5179
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.90.2
Reporter: chunhui shen
Assignee: chunhui shen
Priority: Critical
Fix For: 0.92.0, 0.94.0, 0.90.6
Attachments: 5179-90.txt, 5179-90v2.patch, 5179-90v3.patch, 5179-90v4.patch, 5179-90v5.patch, 5179-90v6.patch, 5179-90v7.patch, 5179-v2.txt, 5179-v3.txt, 5179-v4.txt, hbase-5179.patch, hbase-5179v5.patch, hbase-5179v6.patch, hbase-5179v7.patch

If the master's failover processing and ServerShutdownHandler's processing happen concurrently, the following case may occur:

1. The master completes splitLogAfterStartup().
2. RegionserverA restarts, and ServerShutdownHandler starts processing it.
3. The master starts rebuildUserRegions, and RegionserverA is considered a dead server.
4. The master starts to assign the regions of RegionserverA because step 3 marked it as a dead server.

However, while step 4 (assigning regions) is running, ServerShutdownHandler may still be splitting the log, so data loss may result.
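The race in steps 1 to 4 comes down to assignment not knowing that a log split for the same server is still running. One way to picture a guard against it is sketched below. It is an illustration under assumptions, not the attached patch: `LogSplitGuardSketch` and its methods are hypothetical names that do not exist in the HBase master.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical guard; none of these names come from the HBase master code. */
final class LogSplitGuardSketch {

    /** Servers whose write-ahead logs are still being split. */
    private final Set<String> splitting = ConcurrentHashMap.newKeySet();

    void beginLogSplit(String server)  { splitting.add(server); }

    void finishLogSplit(String server) { splitting.remove(server); }

    /** Regions of a dead server may be assigned only once its log split has finished. */
    boolean mayAssignRegionsOf(String server) {
        return !splitting.contains(server);
    }

    public static void main(String[] args) {
        LogSplitGuardSketch guard = new LogSplitGuardSketch();
        guard.beginLogSplit("RegionserverA");   // shutdown handler starts splitting
        System.out.println(guard.mayAssignRegionsOf("RegionserverA")); // assignment must wait
        guard.finishLogSplit("RegionserverA");  // edits are recoverable from here on
        System.out.println(guard.mayAssignRegionsOf("RegionserverA")); // assignment may proceed
    }
}
```

In this sketch the failover path would consult `mayAssignRegionsOf` before step 4 and defer assignment while it returns false.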
[jira] [Created] (HBASE-5207) Apply HBASE-5155 to trunk
Apply HBASE-5155 to trunk
-------------------------

Key: HBASE-5207
URL: https://issues.apache.org/jira/browse/HBASE-5207
Project: HBase
Issue Type: Bug
Reporter: ramkrishna.s.vasudevan

HBASE-5155 has been fixed on the 0.90 branch. The same fix has to be applied to trunk as well.
[jira] [Commented] (HBASE-5203) Group atomic put/delete operation into a single WALEdit to handle region server failures.
[ https://issues.apache.org/jira/browse/HBASE-5203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186768#comment-13186768 ]

Hadoop QA commented on HBASE-5203:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12510675/5203.txt
  against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 3 new or modified tests.

-1 javadoc. The javadoc tool appears to have generated -145 warning messages.

+1 javac. The applied patch does not increase the total number of javac compiler warnings.

-1 findbugs. The patch appears to introduce 82 new Findbugs (version 1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of release audit warnings.

-1 core tests. The patch failed these unit tests:
  org.apache.hadoop.hbase.master.TestDistributedLogSplitting
  org.apache.hadoop.hbase.mapreduce.TestImportTsv
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/774//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/774//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/774//console

This message is automatically generated.
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region to be assigned before log splitting is completed, causing data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186770#comment-13186770 ]

gaojinchao commented on HBASE-5179:
-----------------------------------

@chunhui
There may be a problem here. The shutdown-handler thread pool has 3 threads by default. If more than 3 dead servers are being processed, we will wait forever.
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region to be assigned before log splitting is completed, causing data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186781#comment-13186781 ]

chunhui shen commented on HBASE-5179:
-------------------------------------

@Jinchao
I think that is a separate problem.
If we restart three general RSs and then kill the META server, the first three ServerShutdownHandlers will wait for the meta region. However, METAServerShutdownHandler cannot be processed until one ServerShutdownHandler finishes, because the shutdown-handler thread pool is full.
So a wait-forever situation exists.
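The wait described in this comment is ordinary thread-pool starvation: every worker is blocked on an event that only a still-queued task can produce. The sketch below reproduces it in miniature with plain `java.util.concurrent` classes. All names (`HandlerPoolSketch`, `run`, `metaAssigned`) are hypothetical, the waiters use a timeout so the demonstration terminates, and the dedicated pool is shown only as one possible way out, not as what the patch does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical demonstration; these are not HBase executor classes. */
final class HandlerPoolSketch {

    /**
     * Submits `waiters` handlers that block until "meta" is available, then one
     * handler that makes it available. Returns how many waiters saw meta in time.
     */
    static int run(int poolSize, int waiters, boolean dedicatedMetaPool, long timeoutMs) {
        ExecutorService general = Executors.newFixedThreadPool(poolSize);
        ExecutorService meta = dedicatedMetaPool ? Executors.newSingleThreadExecutor() : general;
        CountDownLatch metaAssigned = new CountDownLatch(1);
        // A waiter returns true if meta became available before its timeout expired.
        Callable<Boolean> waiter = () -> metaAssigned.await(timeoutMs, TimeUnit.MILLISECONDS);
        Runnable metaHandler = metaAssigned::countDown;
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (int i = 0; i < waiters; i++) {
                results.add(general.submit(waiter));
            }
            meta.submit(metaHandler); // queued behind the waiters when the pool is shared
            int finished = 0;
            for (Future<Boolean> f : results) {
                if (f.get()) finished++;
            }
            return finished;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            general.shutdownNow();
            meta.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Shared pool of 3: the meta handler cannot start until a waiter gives up.
        System.out.println("shared pool:    " + run(3, 3, false, 200));
        // Dedicated meta pool: the meta handler runs immediately.
        System.out.println("dedicated pool: " + run(3, 3, true, 200));
    }
}
```

With a shared pool, at least one waiter has to time out before the meta handler gets a thread, so fewer than three waiters succeed. Without the timeout they would block indefinitely, which is the forever wait under discussion.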
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region to be assigned before log splitting is completed, causing data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186786#comment-13186786 ]

gaojinchao commented on HBASE-5179:
-----------------------------------

@chunhui
In the normal flow, METAServerShutdownHandler uses a different thread pool. Only in the init flow are there some cases where we can't tell which server is the meta region server.
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region to be assigned before log splitting is completed, causing data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186804#comment-13186804 ]

chunhui shen commented on HBASE-5179:
-------------------------------------

@Jinchao
So we need to ensure that a dead meta server (which is treated as a general server) is processed by SSH while the master is initializing?
Otherwise, either meta data loss or waiting forever must happen?
[jira] [Updated] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
[ https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jieshan Bean updated HBASE-5153:
--------------------------------

Attachment: HBASE-5153-V6-90-minorchange.patch

Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
-------------------------------------------------------------------

Key: HBASE-5153
URL: https://issues.apache.org/jira/browse/HBASE-5153
Project: HBase
Issue Type: Bug
Components: client
Affects Versions: 0.90.4
Reporter: Jieshan Bean
Assignee: Jieshan Bean
Fix For: 0.90.6
Attachments: 5153-trunk.txt, HBASE-5153-V2.patch, HBASE-5153-V3.patch, HBASE-5153-V4-90.patch, HBASE-5153-V5-90.patch, HBASE-5153-V6-90-minorchange.patch, HBASE-5153-V6-90.txt, HBASE-5153-trunk-v2.patch, HBASE-5153-trunk.patch, HBASE-5153.patch

HBASE-4893 is related to this issue. From that issue we know that if multiple threads share the same connection, then once the connection is aborted in one thread, the other threads get a "HConnectionManager$HConnectionImplementation@18fb1f7 closed" exception.

That change solves the problem of stale connections that could not be removed, but the original HTable instance can no longer be used. The connection in HTable should be recreated.

There are two approaches to solving this:
1. In user code, on catching an IOE, close the connection and re-create the HTable instance. This can be used as a workaround.
2. On the HBase client side, catch this exception and re-create the connection.
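Approach 1, the user-code workaround, amounts to a retry loop that discards the handle after an IOException and builds a fresh one. The following is a minimal sketch of that pattern only. `Table`, `FakeTable` and `getWithReconnect` are hypothetical stand-ins invented for this example and deliberately avoid the real HBase client API.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical retry helper; it does not use or mirror the HBase client API. */
final class ReconnectSketch {

    /** Stand-in for a table handle bound to a connection. */
    interface Table {
        String get(String row) throws IOException;
        void close();
    }

    /** A handle whose underlying connection may already have been closed elsewhere. */
    static final class FakeTable implements Table {
        private final boolean stale;
        FakeTable(boolean stale) { this.stale = stale; }
        @Override public String get(String row) throws IOException {
            if (stale) throw new IOException("connection closed");
            return "value-of-" + row;
        }
        @Override public void close() { /* nothing to release in this toy */ }
    }

    static String getWithReconnect(Supplier<Table> factory, String row, int maxAttempts)
            throws IOException {
        if (maxAttempts < 1) throw new IllegalArgumentException("maxAttempts must be >= 1");
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Table table = factory.get();   // fresh handle for every attempt
            try {
                return table.get(row);
            } catch (IOException e) {
                last = e;                  // remember the failure and try again
            } finally {
                table.close();             // always drop the handle, stale or not
            }
        }
        throw last;                        // every attempt failed
    }

    public static void main(String[] args) throws IOException {
        AtomicInteger created = new AtomicInteger();
        // Only the first handle is stale, as if another thread had aborted its connection.
        Supplier<Table> factory = () -> new FakeTable(created.incrementAndGet() == 1);
        System.out.println(getWithReconnect(factory, "row1", 3));
        System.out.println("handles created: " + created.get());
    }
}
```

Bounding the attempts keeps a permanently broken cluster from turning the workaround into an endless loop. Approach 2 would move the same loop inside the client so callers do not have to write it.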
[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
[ https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186809#comment-13186809 ]

Hadoop QA commented on HBASE-5153:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12510683/HBASE-5153-V6-90-minorchange.patch
  against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 9 new or modified tests.

-1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/775//console

This message is automatically generated.
[jira] [Updated] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
[ https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-5153:
------------------------------------------

Comment: was deleted

(was: -1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12510683/HBASE-5153-V6-90-minorchange.patch
  against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 9 new or modified tests.

-1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/775//console

This message is automatically generated.)
[jira] [Commented] (HBASE-5203) Group atomic put/delete operation into a single WALEdit to handle region server failures.
[ https://issues.apache.org/jira/browse/HBASE-5203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186819#comment-13186819 ]

ramkrishna.s.vasudevan commented on HBASE-5203:
-----------------------------------------------

@Lars
About the coprocessor postPut and postDelete hooks:
{code}
if (m instanceof Put) {
  coprocessorHost.postPut((Put) m, walEdits.get(i), m.getWriteToWAL());
} else if (m instanceof Delete) {
  coprocessorHost.postDelete((Delete) m, walEdits.get(i), m.getWriteToWAL());
}
{code}
Are these still called even if there is a failure in internalPut() or internalDelete()? Please correct me if I am wrong.
[jira] [Updated] (HBASE-5204) Backward compatibility fixes for 0.92
[ https://issues.apache.org/jira/browse/HBASE-5204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-5204:
-------------------------

Priority: Blocker (was: Major)
Fix Version/s: 0.92.0

Backward compatibility fixes for 0.92
-------------------------------------

Key: HBASE-5204
URL: https://issues.apache.org/jira/browse/HBASE-5204
Project: HBase
Issue Type: Bug
Components: ipc
Affects Versions: 0.92.0
Reporter: Benoit Sigoure
Assignee: Benoit Sigoure
Priority: Blocker
Labels: backwards-compatibility
Fix For: 0.92.0
Attachments: 0001-Add-some-backward-compatible-support-for-reading-old.patch, 0002-Make-sure-that-a-connection-always-uses-a-protocol.patch, 0003-Change-the-code-used-when-serializing-HTableDescript.patch, 5204-92.txt, 5204-trunk.txt

Attached are 3 patches that are necessary to allow compatibility between HBase 0.90.x (and previous releases) and HBase 0.92.0.

First of all, I'm well aware that 0.92.0 RC4 has been thumbed up by a lot of people and would probably wind up being released as 0.92.0 tomorrow, so I sincerely apologize for creating this issue so late in the process. I spent a lot of time trying to work around the quirks of 0.92 but once I realized that with a few very quasi-trivial changes compatibility would be made significantly easier, I immediately sent these 3 patches to Stack, who suggested I create this issue.

The first patch is required because without it a client sending a 0.90-style RPC to a 0.92-style server causes the server to die uncleanly. It seems that 0.92 ships with {{\-XX:OnOutOfMemoryError=kill \-9 %p}}, and when a 0.92 server fails to deserialize a 0.90-style RPC, it attempts to allocate a large buffer because it doesn't read fields of 0.90-style RPCs properly. This allocation attempt immediately triggers an OOME, which causes the JVM to die abruptly of a {{SIGKILL}}. So whenever a 0.90.x client attempts to connect to HBase, it kills whichever RS is hosting the {{\-ROOT-}} region.

The second patch fixes a bug introduced by HBASE-2002, which added support for letting clients specify what protocol they want to speak. If a client doesn't properly specify what protocol to use, the connection's {{protocol}} field will be left {{null}}, which causes any subsequent RPC on that connection to trigger an NPE in the server, even though the connection was successfully established from the client's point of view. The fix is to simply give the connection a default protocol, by assuming the client meant to speak to a RegionServer.

The third patch fixes an oversight that slipped in with HBASE-451, where a change to {{HbaseObjectWritable}} caused all the codes used to serialize {{Writables}} to shift by one. This was carefully avoided in other changes such as HBASE-1502, which cleanly removed entries for {{HMsg}} and {{HMsg[]}}, so I don't think this breakage in HBASE-451 was intended.
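The third patch's problem, codes that shift by one, is easy to reproduce in a toy registry. The sketch below assumes, purely for illustration, that codes are handed out by a running counter in registration order. The type names and `assignCodes` are hypothetical and are not taken from `HbaseObjectWritable`.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical registry illustrating why wire codes must stay stable across releases. */
final class WritableCodeSketch {

    /** Assigns codes 1, 2, 3, ... in registration order (an assumed scheme). */
    static Map<String, Integer> assignCodes(List<String> typeNames) {
        Map<String, Integer> codes = new LinkedHashMap<>();
        int next = 1;
        for (String name : typeNames) {
            codes.put(name, next++);
        }
        return codes;
    }

    public static void main(String[] args) {
        Map<String, Integer> oldRelease = assignCodes(Arrays.asList("TypeA", "TypeB", "TypeC"));
        // Registering a new type in the middle renumbers everything after it.
        Map<String, Integer> shifted = assignCodes(Arrays.asList("TypeA", "NewType", "TypeB", "TypeC"));
        // Registering it at the end leaves the existing codes untouched.
        Map<String, Integer> appended = assignCodes(Arrays.asList("TypeA", "TypeB", "TypeC", "NewType"));

        System.out.println("old:      " + oldRelease);
        System.out.println("shifted:  " + shifted);
        System.out.println("appended: " + appended);
    }
}
```

An old client and a new server that disagree on the code for `TypeB` will decode each other's bytes as the wrong type, which is why appending new entries (or pinning explicit codes) is the compatible choice in a scheme like this.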
[jira] [Commented] (HBASE-5204) Backward compatibility fixes for 0.92
[ https://issues.apache.org/jira/browse/HBASE-5204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186906#comment-13186906 ]

stack commented on HBASE-5204:
------------------------------

+1 on patch. TestFromClientSide works locally for me. I'm going to commit.
[jira] [Commented] (HBASE-5204) Backward compatibility fixes for 0.92
[ https://issues.apache.org/jira/browse/HBASE-5204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186907#comment-13186907 ]

stack commented on HBASE-5204:
------------------------------

Committed branch and trunk.
[jira] [Created] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat
Allow setting Scan start/stop row individually in TableInputFormat
------------------------------------------------------------------

Key: HBASE-5208
URL: https://issues.apache.org/jira/browse/HBASE-5208
Project: HBase
Issue Type: Improvement
Components: mapreduce
Reporter: Nicholas Telford
Priority: Minor

Currently, TableInputFormat initializes a serialized Scan from hbase.mapreduce.scan. Alternatively, it will instantiate a new Scan using properties defined in hbase.mapreduce.scan.*. However, of these properties the start row and stop row (arguably the most pertinent) are missing.

TableInputFormat should permit the specification of a start/stop row as with the other fields using a new pair of properties: hbase.mapreduce.scan.row.start and hbase.mapreduce.scan.row.end

The primary use-case for this is to permit Oozie and other job management tools that can't call TableMapReduceUtil.initTableMapperJob() to operate on a contiguous subset of rows.
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region to be assigned before log splitting is completed, causing data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186928#comment-13186928 ]

gaojinchao commented on HBASE-5179:
-----------------------------------

In patch v7, can we replace the processing of the expired server with public void splitLog(final String serverName)?

Concurrent processing of processFaileOver and ServerShutdownHandler may cause region to be assigned before log splitting is completed, causing data loss
------------------------------------------------------------------

Key: HBASE-5179
URL: https://issues.apache.org/jira/browse/HBASE-5179
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.90.2
Reporter: chunhui shen
Assignee: chunhui shen
Priority: Critical
Fix For: 0.92.0, 0.94.0, 0.90.6
Attachments: 5179-90.txt, 5179-90v2.patch, 5179-90v3.patch, 5179-90v4.patch, 5179-90v5.patch, 5179-90v6.patch, 5179-90v7.patch, 5179-v2.txt, 5179-v3.txt, 5179-v4.txt, hbase-5179.patch, hbase-5179v5.patch, hbase-5179v6.patch, hbase-5179v7.patch

If the master's failover processing and ServerShutdownHandler's processing happen concurrently, the following case may occur:
1. The master completes splitLogAfterStartup().
2. RegionserverA restarts, and ServerShutdownHandler starts processing it.
3. The master starts rebuildUserRegions, and RegionserverA is considered a dead server.
4. The master starts to assign the regions of RegionserverA because step 3 marked it as a dead server.
However, while step 4 (assigning regions) is running, ServerShutdownHandler may still be splitting the log, so data loss may result.
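The ordering constraint in this scenario (do not assign a dead server's regions while its log is still being split) can be pictured with a small guard. This is only a sketch under assumptions: `AssignmentGuard` and its methods are hypothetical names, and this is not claimed to be the approach taken in any of the attached patches.

```python
class AssignmentGuard:
    """Refuse to assign a dead server's regions while its log is being split."""

    def __init__(self):
        self._splitting = set()  # dead servers with log splitting in progress
        self.assigned = {}       # region name -> target server

    def log_split_started(self, dead_server):
        self._splitting.add(dead_server)

    def log_split_finished(self, dead_server):
        self._splitting.discard(dead_server)

    def try_assign(self, dead_server, regions, target):
        """Assign the regions only if no log split is pending; report success."""
        if dead_server in self._splitting:
            return False  # caller should retry once splitting has finished
        for region in regions:
            self.assigned[region] = target
        return True


guard = AssignmentGuard()
guard.log_split_started("RegionserverA")
print(guard.try_assign("RegionserverA", ["region-1"], "RegionserverB"))
guard.log_split_finished("RegionserverA")
print(guard.try_assign("RegionserverA", ["region-1"], "RegionserverB"))
```

In a real master the two code paths run on different threads, so the check and the assignment would also need to happen under a common lock; the sketch leaves that out to keep the ordering rule visible.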
[jira] [Updated] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat
[ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicholas Telford updated HBASE-5208:
------------------------------------

Attachment: HBASE-5208-001.txt

Adds hbase.mapreduce.scan.row.start and hbase.mapreduce.scan.row.stop options to TableInputFormat to permit defining start/stop row separately.
[jira] [Updated] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat
[ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicholas Telford updated HBASE-5208:
------------------------------------

Release Note: Added hbase.mapreduce.scan.row.start and hbase.mapreduce.scan.row.stop for defining start and stop rows for a MapReduce job without having to serialize a Scan object.
Status: Patch Available (was: Open)
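As a rough illustration of what the two properties in the release note express, the sketch below selects a contiguous subset of row keys from a plain dictionary of settings. It does not use the HBase API: `select_rows` is a hypothetical helper, and the sketch assumes the usual scan convention that the start row is inclusive, the stop row is exclusive, and an empty value means unbounded.

```python
START_KEY = "hbase.mapreduce.scan.row.start"
STOP_KEY = "hbase.mapreduce.scan.row.stop"


def select_rows(row_keys, conf):
    """Return the sorted row keys in [start, stop), comparing raw bytes."""
    start = conf.get(START_KEY, "").encode("utf-8")
    stop = conf.get(STOP_KEY, "").encode("utf-8")
    return [
        key for key in sorted(row_keys)
        if key >= start and (not stop or key < stop)
    ]


rows = [b"row-20", b"row-01", b"row-10", b"row-05"]
conf = {START_KEY: "row-05", STOP_KEY: "row-20"}
print(select_rows(rows, conf))
```

Because both values are plain strings, a tool that can only set job properties (the Oozie case in the description) can narrow the row range without building and serializing a Scan.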
[jira] [Commented] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat
[ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186936#comment-13186936 ]

Hadoop QA commented on HBASE-5208:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12510704/HBASE-5208-001.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

-1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/776//console

This message is automatically generated.
[jira] [Commented] (HBASE-4191) hbase load balancer needs locality awareness
[ https://issues.apache.org/jira/browse/HBASE-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186941#comment-13186941 ]

gaojinchao commented on HBASE-4191:
-----------------------------------

@Liyin This is a good feature. How is it progressing now?

hbase load balancer needs locality awareness
--------------------------------------------

Key: HBASE-4191
URL: https://issues.apache.org/jira/browse/HBASE-4191
Project: HBase
Issue Type: New Feature
Reporter: Ted Yu
Assignee: Liyin Tang

Previously, HBASE-4114 implemented the metrics for HFile HDFS block locality, which provide locality information at the HFile level. But in order to work with the load balancer and region assignment, we need locality information at the region level.

Let's define the region locality information first, which is almost the same as the HFile locality index.

HRegion locality index (HRegion A, RegionServer B) =
(Total number of HDFS blocks that can be retrieved locally by the RegionServer B for the HRegion A) / (Total number of the HDFS blocks for the Region A)

So the HRegion locality index tells us how much locality we can get if the HMaster assigns HRegion A to RegionServer B.

So there will be 2 steps involved in assigning regions based on locality.

1) During cluster start-up, the master will scan HDFS to calculate the HRegion locality index for each pair of HRegion and Region Server. It is pretty expensive to scan the DFS, so we only need to do this once, at start-up.

2) During cluster run time, each region server will periodically update the HRegion locality index as metrics, as HBASE-4114 did. The Region Server can expose them to the Master through ZK, the meta table, or just RPC messages.

Based on the HRegion locality index, the assignment manager in the master would have global knowledge of the region locality distribution and could run the MIN COST MAXIMUM FLOW solver to reach the global optimum.
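The locality index defined above is a simple ratio, which the following sketch computes. The input shape is an assumption: each HDFS block of a region is represented as the set of hosts holding a replica, and `locality_index` is a hypothetical helper, not an HBase function.

```python
def locality_index(block_hosts, region_server):
    """Fraction of a region's HDFS blocks that have a replica on region_server.

    block_hosts: one set of host names per HDFS block of the region.
    """
    if not block_hosts:
        return 0.0
    local = sum(1 for hosts in block_hosts if region_server in hosts)
    return local / len(block_hosts)


# Hypothetical region with four blocks, three replicas each.
region_a_blocks = [
    {"rs1", "rs2", "rs3"},
    {"rs2", "rs3", "rs4"},
    {"rs1", "rs4", "rs5"},
    {"rs2", "rs5", "rs6"},
]
for server in ("rs1", "rs2", "rs7"):
    print(server, locality_index(region_a_blocks, server))
```

Evaluating this for every (region, region server) pair gives the matrix that the flow-based assignment described next would consume.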
Let's construct the graph first:

[Graph]
Imagine a bipartite graph in which the left side is the set of regions and the right side is the set of region servers. There is a source node which links itself to each node in the region set. There is a sink node which is linked from each node in the region server set.

[Capacity]
The capacity between the source node and the region nodes is 1. The capacity between the region nodes and the region server nodes is also 1. (The purpose is that each region can ONLY be assigned to one region server at a time.) The capacity between the region server nodes and the sink node is the average number of regions that should be assigned to each region server. (The purpose is to balance the load across region servers.)

[Cost]
The cost between each region and region server is the opposite of the locality index, which means that the higher the locality when region A is assigned to region server B, the lower the cost. The cost function could be more sophisticated when we take more metrics into account.

So after running the min-cost max-flow solver, the master could assign the regions based on the global locality optimization. The master should also share this global view with the secondary master in case a master failover happens.

In addition, HBASE-4491 (Locality Checker) is a tool, based on the same metrics, that proactively scans the DFS to calculate the global locality information in the cluster. It will help us verify data locality information at run time.
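The graph construction above can be sketched end to end with a small min-cost max-flow solver. Several things here are assumptions for illustration: locality is given as an integer percentage, the cost of a region-to-server edge is `100 - locality` (one way to encode "the opposite of the locality index"), the per-server capacity is the average rounded up, and the solver is a plain successive-shortest-path implementation using Bellman-Ford-style relaxation rather than whatever solver the feature would actually use.

```python
import math
from collections import deque


class MinCostFlow:
    def __init__(self, n):
        self.n = n
        self.graph = [[] for _ in range(n)]  # edge: [to, cap, cost, rev index]

    def add_edge(self, u, v, cap, cost):
        self.graph[u].append([v, cap, cost, len(self.graph[v])])
        self.graph[v].append([u, 0, -cost, len(self.graph[u]) - 1])

    def solve(self, s, t):
        """Push max flow from s to t at min total cost; return (flow, cost)."""
        flow = cost = 0
        while True:
            dist = [math.inf] * self.n
            prev = [None] * self.n  # (previous node, edge index) on the path
            queued = [False] * self.n
            dist[s] = 0
            queue = deque([s])
            while queue:  # queue-based Bellman-Ford on the residual graph
                u = queue.popleft()
                queued[u] = False
                for i, (v, cap, c, _) in enumerate(self.graph[u]):
                    if cap > 0 and dist[u] + c < dist[v]:
                        dist[v] = dist[u] + c
                        prev[v] = (u, i)
                        if not queued[v]:
                            queued[v] = True
                            queue.append(v)
            if dist[t] == math.inf:
                return flow, cost
            push, v = math.inf, t
            while v != s:  # bottleneck capacity along the path
                u, i = prev[v]
                push = min(push, self.graph[u][i][1])
                v = u
            v = t
            while v != s:  # apply the augmentation
                u, i = prev[v]
                edge = self.graph[u][i]
                edge[1] -= push
                self.graph[v][edge[3]][1] += push
                v = u
            flow += push
            cost += push * dist[t]


def assign_regions(locality):
    """locality[region][server] = percent of the region's blocks local to server."""
    regions = sorted(locality)
    servers = sorted({s for row in locality.values() for s in row})
    source, sink, first_server = 0, 1, 2 + len(regions)
    mcf = MinCostFlow(first_server + len(servers))
    per_server = math.ceil(len(regions) / len(servers))
    for i, region in enumerate(regions):
        mcf.add_edge(source, 2 + i, 1, 0)
        for j, server in enumerate(servers):
            mcf.add_edge(2 + i, first_server + j, 1,
                         100 - locality[region].get(server, 0))
    for j in range(len(servers)):
        mcf.add_edge(first_server + j, sink, per_server, 0)
    mcf.solve(source, sink)
    plan = {}
    for i, region in enumerate(regions):
        for to, cap, _cost, _rev in mcf.graph[2 + i]:
            if to >= first_server and cap == 0:  # saturated region->server edge
                plan[region] = servers[to - first_server]
    return plan


print(assign_regions({"A": {"rs1": 90, "rs2": 80}, "B": {"rs1": 85, "rs2": 10}}))
```

In the usage example a greedy choice would give region A its best server (rs1) and leave B with almost no locality; the flow formulation instead trades a little locality on A for a large gain on B, which is the "global optimization" the description refers to.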
[jira] [Commented] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat
[ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186945#comment-13186945 ]

Zhihong Yu commented on HBASE-5208:
-----------------------------------

@Nicholas:
You should use --no-prefix to generate your patch so that Hadoop QA can run it.

This is a useful feature. Can you add a unit test for it?

Thanks
[jira] [Updated] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat
[ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicholas Telford updated HBASE-5208:
------------------------------------

Attachment: HBASE-5208-002.txt

Git patches seem to break the QA bot. Manually edited to remove the a/ prefixes.
[jira] [Updated] (HBASE-5204) Backward compatibility fixes for 0.92
[ https://issues.apache.org/jira/browse/HBASE-5204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-5204:
------------------------------

Fix Version/s: 0.94.0
Hadoop Flags: Incompatible change,Reviewed (was: Incompatible change)
[jira] [Updated] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
[ https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-5153:
------------------------------

Comment: was deleted

(was: -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12510383/HBASE-5153-V4-90.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 3 new or modified tests.

-1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/747//console

This message is automatically generated.)

Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
-------------------------------------------------------------------

Key: HBASE-5153
URL: https://issues.apache.org/jira/browse/HBASE-5153
Project: HBase
Issue Type: Bug
Components: client
Affects Versions: 0.90.4
Reporter: Jieshan Bean
Assignee: Jieshan Bean
Fix For: 0.90.6
Attachments: 5153-trunk.txt, HBASE-5153-V2.patch, HBASE-5153-V3.patch, HBASE-5153-V4-90.patch, HBASE-5153-V5-90.patch, HBASE-5153-V6-90-minorchange.patch, HBASE-5153-V6-90.txt, HBASE-5153-trunk-v2.patch, HBASE-5153-trunk.patch, HBASE-5153.patch

HBASE-4893 is related to this issue. From that issue we know that if multiple threads share the same connection and the connection is aborted in one thread, the other threads will get a "HConnectionManager$HConnectionImplementation@18fb1f7 closed" exception.

That issue solved the problem that a stale connection could not be removed, but the original HTable instance still cannot continue to be used. The connection in the HTable should be recreated.

There are two approaches to solve this:
1. In user code, once an IOE is caught, close the connection and re-create the HTable instance. We can use this as a workaround.
2. On the HBase client side, catch this exception and re-create the connection.
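The first approach listed in HBASE-5153 (close and re-create the table handle in user code after an I/O error) can be sketched generically. Nothing below is the HBase client API: `FakeTable` is a stand-in that fails a configurable number of times, and `call_with_reconnect` is a hypothetical helper that opens a fresh handle for every attempt.

```python
class FakeTable:
    """Stand-in for a table handle whose shared connection may be closed."""

    failures_left = 1  # how many upcoming calls should fail
    opened = 0         # how many handles have been created

    def __init__(self):
        FakeTable.opened += 1
        self.closed = False

    def get(self, row):
        if FakeTable.failures_left > 0:
            FakeTable.failures_left -= 1
            raise IOError("connection closed")
        return f"value-of-{row}"

    def close(self):
        self.closed = True


def call_with_reconnect(open_table, operation, max_attempts=3):
    """Run operation(table), re-creating the table after each IOError."""
    last_error = None
    for _ in range(max_attempts):
        table = open_table()
        try:
            return operation(table)
        except IOError as err:
            last_error = err  # stale handle: drop it and try a new one
        finally:
            table.close()
    raise last_error


print(call_with_reconnect(FakeTable, lambda t: t.get("r1")))
print("handles opened:", FakeTable.opened)
```

The second approach in the description would move the same loop inside the client library, so that callers never see the closed-connection error unless every retry fails.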
[jira] [Updated] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
[ https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-5153:
------------------------------

Comment: was deleted

(was: -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12510179/HBASE-5153-V3.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 3 new or modified tests.

-1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/727//console

This message is automatically generated.)
[jira] [Commented] (HBASE-5204) Backward compatibility fixes for 0.92
[ https://issues.apache.org/jira/browse/HBASE-5204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186957#comment-13186957 ]

Hudson commented on HBASE-5204:
-------------------------------

Integrated in HBase-0.92 #245 (See [https://builds.apache.org/job/HBase-0.92/245/])
HBASE-5204 Backward compatibility fixes for 0.92

stack :
Files :
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ipc/Invocation.java
[jira] [Commented] (HBASE-3724) Load balancer improvements
[ https://issues.apache.org/jira/browse/HBASE-3724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186959#comment-13186959 ]

gaojinchao commented on HBASE-3724:
-----------------------------------

I found that the balance algorithm in the 0.92 branch does not work for our scenario, so I am using this issue to collect all issues related to balancing. That will make them easy to find for anyone who wants to look.

Load balancer improvements
--------------------------

Key: HBASE-3724
URL: https://issues.apache.org/jira/browse/HBASE-3724
Project: HBase
Issue Type: Umbrella
Reporter: stack

Umbrella issue under which we hang all regions related to balancer
[jira] [Commented] (HBASE-3724) Load balancer improvements
[ https://issues.apache.org/jira/browse/HBASE-3724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186963#comment-13186963 ]

Zhihong Yu commented on HBASE-3724:
-----------------------------------

@Jinchao:
Can you describe your scenario? Then we will see which task can best accommodate your requirement.
[jira] [Commented] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat
[ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13186964#comment-13186964 ] Hadoop QA commented on HBASE-5208: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12510705/HBASE-5208-002.txt against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 javadoc. The javadoc tool appears to have generated -145 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 82 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.mapreduce.TestImportTsv org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/777//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/777//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/777//console This message is automatically generated. Allow setting Scan start/stop row individually in TableInputFormat -- Key: HBASE-5208 URL: https://issues.apache.org/jira/browse/HBASE-5208 Project: HBase Issue Type: Improvement Components: mapreduce Reporter: Nicholas Telford Priority: Minor Attachments: HBASE-5208-001.txt, HBASE-5208-002.txt Currently, TableInputFormat initializes a serialized Scan from hbase.mapreduce.scan. 
Alternatively, it will instantiate a new Scan using properties defined in hbase.mapreduce.scan.*. However, of these properties the start row and stop row (arguably the most pertinent) are missing. TableInputFormat should permit the specification of a start/stop row as with the other fields using a new pair of properties: hbase.mapreduce.scan.row.start and hbase.mapreduce.scan.row.end The primary use-case for this is to permit Oozie and other job management tools that can't call TableMapReduceUtil.initTableMapperJob() to operate on a contiguous subset of rows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
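A rough illustration of the range semantics requested in HBASE-5208, written as a minimal sketch rather than the actual patch. It assumes the two proposed property names and models the table with a sorted map; `ScanRangeSketch`, `selectRows` and `sampleTable` are hypothetical names, and plain strings stand in for HBase's byte-array row keys. The one HBase behaviour it relies on is that a Scan's start row is inclusive and its stop row is exclusive.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustrative stand-in: plain Java maps replace HBase's Configuration and table.
class ScanRangeSketch {
    // Property names as proposed in HBASE-5208.
    static final String SCAN_ROW_START = "hbase.mapreduce.scan.row.start";
    static final String SCAN_ROW_END = "hbase.mapreduce.scan.row.end";

    // Returns the rows in [start, end). A missing property leaves that side
    // unbounded. Assumes start <= end when both are set.
    static NavigableMap<String, String> selectRows(
            NavigableMap<String, String> table, Map<String, String> conf) {
        String start = conf.get(SCAN_ROW_START);
        String end = conf.get(SCAN_ROW_END);
        NavigableMap<String, String> view = table;
        if (start != null) {
            view = view.tailMap(start, true);  // start row is inclusive
        }
        if (end != null) {
            view = view.headMap(end, false);   // stop row is exclusive
        }
        return view;
    }

    // Hypothetical table contents, keyed by row.
    static NavigableMap<String, String> sampleTable() {
        NavigableMap<String, String> table = new TreeMap<>();
        for (int i = 1; i <= 5; i++) {
            table.put("row" + i, "value" + i);
        }
        return table;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(SCAN_ROW_START, "row2");
        conf.put(SCAN_ROW_END, "row4");
        System.out.println(selectRows(sampleTable(), conf).keySet());
    }
}
```

With the sample data, setting only these two properties selects row2 and row3, which is the kind of contiguous subset a tool such as Oozie could request through configuration alone.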
[jira] [Commented] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat
[ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13186962#comment-13186962 ] Nicholas Telford commented on HBASE-5208: - Tests were excluded from the patch as for now I'm unable to get the large tests to run in my environment, even from a clean trunk. I do have a patch with tests, but I'm not happy submitting them until I can get it working.
[jira] [Commented] (HBASE-5203) Group atomic put/delete operation into a single WALEdit to handle region server failures.
[ https://issues.apache.org/jira/browse/HBASE-5203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13186967#comment-13186967 ] jirapos...@reviews.apache.org commented on HBASE-5203: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3510/#review4394 --- Nice work, Lars. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Delete.java https://reviews.apache.org/r/3510/#comment9902 I think DoNotRetryIOException may be more appropriate here. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java https://reviews.apache.org/r/3510/#comment9897 White space. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java https://reviews.apache.org/r/3510/#comment9898 Please replace this parameter with clusterId. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java https://reviews.apache.org/r/3510/#comment9899 Please add clusterId parameter here. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java https://reviews.apache.org/r/3510/#comment9903 Should we allow caller to pass clusterId ? That parameter would be used at line 4213. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java https://reviews.apache.org/r/3510/#comment9900 The original intent of this check being inside for loop was to populate walEdits. Now we can lift this check to after line 4157. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java https://reviews.apache.org/r/3510/#comment9901 There is only one WALEdit now, right ? 
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java https://reviews.apache.org/r/3510/#comment9904 I think the original javadoc should be modified to indicate the support of Put and Delete. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java https://reviews.apache.org/r/3510/#comment9906 Good. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java https://reviews.apache.org/r/3510/#comment9907 I think 'to a map from key to values' may be clearer. Otherwise people have to read the method body to fully understand. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java https://reviews.apache.org/r/3510/#comment9908 I don't see InterruptedException declared to be thrown by this method. IE is caught at line 171. - Ted On 2012-01-16 07:58:33, Lars Hofhansl wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/3510/ bq. --- bq. bq. (Updated 2012-01-16 07:58:33) bq. bq. bq. Review request for hbase, Ted Yu and Michael Stack. bq. bq. bq. Summary bq. --- bq. bq. Basically a rewrite (sorry about that) of HBASE-3485 Allow atomic put/delete in one call. bq. This makes this actually correct in the case of RegionServer failures (HBASE-3485 was correct for all scenarios but RegionServer failures). bq. HRegion.mutateRow(...) now groups all edits into a single WALEdit and appends all edits in one call. Only then are the memstore edits applied. bq. This is the first time that WALEdits can contain KVs from different types of operations. So I also had to fix the replication code to understand that. bq. WAL recovery already handles this case. bq. bq. bq. This addresses bug HBASE-5203. bq. 
https://issues.apache.org/jira/browse/HBASE-5203 bq. bq. bq. Diffs bq. - bq. bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Delete.java 1231744 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1231744 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java 1231744 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1231744 bq. bq. Diff: https://reviews.apache.org/r/3510/diff bq. bq. bq. Testing bq. --- bq. bq. * Tests added in HBASE-3485 bq. * manual
[jira] [Commented] (HBASE-5204) Backward compatibility fixes for 0.92
[ https://issues.apache.org/jira/browse/HBASE-5204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13186977#comment-13186977 ] Hudson commented on HBASE-5204: --- Integrated in HBase-TRUNK #2634 (See [https://builds.apache.org/job/HBase-TRUNK/2634/]) HBASE-5204 Backward compatibility fixes for 0.92 stack : Files : * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/Invocation.java Backward compatibility fixes for 0.92 - Key: HBASE-5204 URL: https://issues.apache.org/jira/browse/HBASE-5204 Project: HBase Issue Type: Bug Components: ipc Affects Versions: 0.92.0 Reporter: Benoit Sigoure Assignee: Benoit Sigoure Priority: Blocker Labels: backwards-compatibility Fix For: 0.92.0, 0.94.0 Attachments: 0001-Add-some-backward-compatible-support-for-reading-old.patch, 0002-Make-sure-that-a-connection-always-uses-a-protocol.patch, 0003-Change-the-code-used-when-serializing-HTableDescript.patch, 5204-92.txt, 5204-trunk.txt Attached are 3 patches that are necessary to allow compatibility between HBase 0.90.x (and previous releases) and HBase 0.92.0. First of all, I'm well aware that 0.92.0 RC4 has been thumbed up by a lot of people and would probably wind up being released as 0.92.0 tomorrow, so I sincerely apologize for creating this issue so late in the process. I spent a lot of time trying to work around the quirks of 0.92 but once I realized that with a few very quasi-trivial changes compatibility would be made significantly easier, I immediately sent these 3 patches to Stack, who suggested I create this issue. The first patch is required as without it clients sending a 0.90-style RPC to a 0.92-style server causes the server to die uncleanly. 
It seems that 0.92 ships with {{\-XX:OnOutOfMemoryError=kill \-9 %p}}, and when a 0.92 server fails to deserialize a 0.90-style RPC, it attempts to allocate a large buffer because it doesn't read fields of 0.90-style RPCs properly. This allocation attempt immediately triggers an OOME, which causes the JVM to die abruptly of a {{SIGKILL}}. So whenever a 0.90.x client attempts to connect to HBase, it kills whichever RS is hosting the {{\-ROOT-}} region. The second patch fixes a bug introduced by HBASE-2002, which added support for letting clients specify what protocol they want to speak. If a client doesn't properly specify what protocol to use, the connection's {{protocol}} field will be left {{null}}, which causes any subsequent RPC on that connection to trigger an NPE in the server, even though the connection was successfully established from the client's point of view. The fix is to simply give the connection a default protocol, by assuming the client meant to speak to a RegionServer. The third patch fixes an oversight that slipped in HBASE-451, where a change to {{HbaseObjectWritable}} caused all the codes used to serialize {{Writables}} to shift by one. This was carefully avoided in other changes such as HBASE-1502, which cleanly removed entries for {{HMsg}} and {{HMsg[]}}, so I don't think this breakage in HBASE-451 was intended. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
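To make the third patch in HBASE-5204 concrete, here is a small sketch of one way such a shift can arise when serialization codes are handed out in registration order. It is not the HbaseObjectWritable code; `CodeShiftSketch`, `assignCodes`, `shiftedTypes` and the type names are hypothetical, and whether HBASE-451 actually inserted or removed an entry is not stated above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: codes are assigned by position, as in a "code++" registry.
class CodeShiftSketch {
    // Assigns codes 1, 2, 3, ... in registration order.
    static Map<String, Integer> assignCodes(List<String> typeNames) {
        Map<String, Integer> codes = new LinkedHashMap<>();
        int code = 1;
        for (String name : typeNames) {
            codes.put(name, code++);
        }
        return codes;
    }

    // Lists the types present in both registries whose code changed.
    static List<String> shiftedTypes(List<String> oldOrder, List<String> newOrder) {
        Map<String, Integer> oldCodes = assignCodes(oldOrder);
        Map<String, Integer> newCodes = assignCodes(newOrder);
        List<String> shifted = new ArrayList<>();
        for (Map.Entry<String, Integer> e : oldCodes.entrySet()) {
            Integer now = newCodes.get(e.getKey());
            if (now != null && !now.equals(e.getValue())) {
                shifted.add(e.getKey());
            }
        }
        return shifted;
    }

    public static void main(String[] args) {
        List<String> oldOrder = Arrays.asList("TypeA", "TypeB", "TypeC");
        // A new entry registered in the middle moves every later code by one.
        List<String> inserted = Arrays.asList("TypeA", "TypeNew", "TypeB", "TypeC");
        // A new entry registered at the end leaves existing codes alone.
        List<String> appended = Arrays.asList("TypeA", "TypeB", "TypeC", "TypeNew");
        System.out.println(shiftedTypes(oldOrder, inserted));
        System.out.println(shiftedTypes(oldOrder, appended));
    }
}
```

A peer built against the old ordering would decode a shifted code as the wrong type, which is why position-based registries are usually only appended to.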
[jira] [Updated] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
[ https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jieshan Bean updated HBASE-5153: Attachment: TestResults-hbase5153.out Ran the tests again and got the same results: 5 tests failed due to the hostName problem. The results are in the attachment TestResults-hbase5153.out. Add retry logic in HConnectionImplementation#resetZooKeeperTrackers --- Key: HBASE-5153 URL: https://issues.apache.org/jira/browse/HBASE-5153 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.4 Reporter: Jieshan Bean Assignee: Jieshan Bean Fix For: 0.90.6 Attachments: 5153-trunk.txt, HBASE-5153-V2.patch, HBASE-5153-V3.patch, HBASE-5153-V4-90.patch, HBASE-5153-V5-90.patch, HBASE-5153-V6-90-minorchange.patch, HBASE-5153-V6-90.txt, HBASE-5153-trunk-v2.patch, HBASE-5153-trunk.patch, HBASE-5153.patch, TestResults-hbase5153.out HBASE-4893 is related to this issue. As described there, if multiple threads share the same connection and that connection is aborted in one thread, the other threads get a "HConnectionManager$HConnectionImplementation@18fb1f7 closed" exception. That issue solved the problem that a stale connection could not be removed, but the original HTable instance still cannot continue to be used; the connection in HTable should be recreated. There are two approaches to solve this: 1. In user code, on catching an IOE, close the connection and re-create the HTable instance. This can be used as a workaround. 2. On the HBase client side, catch this exception and re-create the connection. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
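The first approach listed in HBASE-5153 (handle the failure in user code by closing and re-creating the table handle) might look roughly like the sketch below. It deliberately avoids the real HBase client classes: `TableHandle` and `TableFactory` are hypothetical interfaces standing in for HTable and whatever code constructs it, and `flakyFactory` exists only to simulate a connection that has been closed underneath the caller.

```java
import java.io.Closeable;
import java.io.IOException;

// Illustrative workaround: on IOException, discard the handle and build a new one.
class RecreateOnFailureSketch {
    // Hypothetical stand-ins for HTable and the code that creates it.
    interface TableHandle extends Closeable {
        String get(String row) throws IOException;
    }

    interface TableFactory {
        TableHandle create() throws IOException;
    }

    private final TableFactory factory;
    private TableHandle current;

    RecreateOnFailureSketch(TableFactory factory) {
        this.factory = factory;
    }

    // Tries the read up to maxAttempts times, re-creating the handle after each failure.
    String get(String row, int maxAttempts) throws IOException {
        IOException last = new IOException("maxAttempts must be positive");
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                if (current == null) {
                    current = factory.create();
                }
                return current.get(row);
            } catch (IOException e) {
                last = e;
                closeQuietly();  // drop the stale handle so the next attempt starts fresh
            }
        }
        throw last;
    }

    private void closeQuietly() {
        if (current != null) {
            try {
                current.close();
            } catch (IOException ignored) {
                // The handle is being discarded anyway.
            }
            current = null;
        }
    }

    // Demo factory: the first handle it creates always fails, later ones work.
    static TableFactory flakyFactory() {
        final int[] created = {0};
        return () -> {
            final int id = ++created[0];
            return new TableHandle() {
                @Override
                public String get(String row) throws IOException {
                    if (id == 1) {
                        throw new IOException("connection closed");
                    }
                    return row + "@handle" + id;
                }

                @Override
                public void close() {
                }
            };
        };
    }

    public static void main(String[] args) throws IOException {
        RecreateOnFailureSketch client = new RecreateOnFailureSketch(flakyFactory());
        System.out.println(client.get("row1", 3));
    }
}
```

The second approach in the issue moves the same catch-and-recreate step inside the HBase client so that callers do not need a wrapper like this.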
[jira] [Commented] (HBASE-5203) Group atomic put/delete operation into a single WALEdit to handle region server failures.
[ https://issues.apache.org/jira/browse/HBASE-5203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13186988#comment-13186988 ] Lars Hofhansl commented on HBASE-5203: -- @Ram: are you looking at the right patch? The part you quote was removed with this. Or are you saying it should be possible to do that? I think that would not be correct, as I want an atomic operation and I realized in HBASE-3584 that I need to write a single WALEdit. Group atomic put/delete operation into a single WALEdit to handle region server failures. - Key: HBASE-5203 URL: https://issues.apache.org/jira/browse/HBASE-5203 Project: HBase Issue Type: Sub-task Components: client, coprocessors, regionserver Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.94.0 Attachments: 5203.txt HBASE-3584 does not provide fully atomic operations in case of region server failures (see explanation there). What should happen is that either (1) all edits are applied via a single WALEdit, or (2) the WALEdits are applied in async mode and then sync'ed together. For #1 it is not clear whether it is advisable to manage multiple *different* operations (Put/Delete) via a single WAL edit. A quick check reveals that WAL replay on region startup would work, but that replication would need to be adapted. The refactoring needed would be non-trivial. #2 might actually not work, as another operation could request sync'ing a later edit and hence flush these entries out as well. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
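Option (1) from the HBASE-5203 description, a single log entry carrying both puts and deletes that is appended before the in-memory state is touched, can be sketched as follows. This is a toy model and not HRegion.mutateRow: `SingleWalEntrySketch`, `Edit`, `mutateRow` and `replay` are hypothetical names, a list stands in for the WAL, a sorted map stands in for the memstore, and a delete simply removes the column instead of writing a tombstone as HBase does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model: one log entry per mutateRow call, holding mixed PUT and DELETE edits.
class SingleWalEntrySketch {
    enum Type { PUT, DELETE }

    static final class Edit {
        final Type type;
        final String column;
        final String value;  // null for DELETE

        Edit(Type type, String column, String value) {
            this.type = type;
            this.column = column;
            this.value = value;
        }
    }

    final List<List<Edit>> wal = new ArrayList<>();         // each element is one log entry
    final Map<String, String> memstore = new TreeMap<>();   // in-memory state of one row

    // Appends all edits as ONE log entry, and only then applies them in memory.
    void mutateRow(List<Edit> edits) {
        wal.add(Collections.unmodifiableList(new ArrayList<>(edits)));
        apply(edits, memstore);
    }

    static void apply(List<Edit> entry, Map<String, String> store) {
        for (Edit e : entry) {
            if (e.type == Type.PUT) {
                store.put(e.column, e.value);
            } else {
                store.remove(e.column);
            }
        }
    }

    // Rebuilds the row from the log alone, as recovery after a crash would.
    Map<String, String> replay() {
        Map<String, String> rebuilt = new TreeMap<>();
        for (List<Edit> entry : wal) {
            apply(entry, rebuilt);
        }
        return rebuilt;
    }

    public static void main(String[] args) {
        SingleWalEntrySketch region = new SingleWalEntrySketch();
        region.mutateRow(Arrays.asList(
                new Edit(Type.PUT, "a", "1"), new Edit(Type.PUT, "b", "2")));
        region.mutateRow(Arrays.asList(
                new Edit(Type.PUT, "a", "3"), new Edit(Type.DELETE, "b", null)));
        System.out.println(region.wal.size() + " log entries, row = " + region.replay());
    }
}
```

Because a whole entry is either in the log or not, replay can never observe the put of one mutateRow call without its accompanying delete, which is the property that separate per-operation entries cannot give when a server dies between appends.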
[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
[ https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13186989#comment-13186989 ] Hadoop QA commented on HBASE-5153: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12510715/TestResults-hbase5153.out against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 1 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/778//console This message is automatically generated.
[jira] [Commented] (HBASE-5203) Group atomic put/delete operation into a single WALEdit to handle region server failures.
[ https://issues.apache.org/jira/browse/HBASE-5203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13186993#comment-13186993 ] jirapos...@reviews.apache.org commented on HBASE-5203: -- bq. On 2012-01-16 15:27:14, Ted Yu wrote: bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Delete.java, line 149 bq. https://reviews.apache.org/r/3510/diff/2/?file=68986#file68986line149 bq. bq. I think DoNotRetryIOException may be more appropriate here. Sure. Although this is client side code, so there is no notion of retry. (Put does the same) bq. On 2012-01-16 15:27:14, Ted Yu wrote: bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 4170 bq. https://reviews.apache.org/r/3510/diff/2/?file=68987#file68987line4170 bq. bq. Should we allow caller to pass clusterId ? bq. That parameter would be used at line 4213. The clusterID is only used for replication. Only plain Puts and Deletes need to use an optional clusterId (when executed from the ReplicationSink). All other operations do (and should) use the local clusterID. bq. On 2012-01-16 15:27:14, Ted Yu wrote: bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 4184 bq. https://reviews.apache.org/r/3510/diff/2/?file=68987#file68987line4184 bq. bq. The original intent of this check being inside for loop was to populate walEdits. bq. Now we can lift this check to after line 4157. Correct. But I still need to execute and check all preHooks before the 1st WALEdit is written. bq. On 2012-01-16 15:27:14, Ted Yu wrote: bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 4214 bq. https://reviews.apache.org/r/3510/diff/2/?file=68987#file68987line4214 bq. bq. There is only one WALEdit now, right ? Correct. 
Should read and apply edits (there are many edits in the one WALEdit) bq. On 2012-01-16 15:27:14, Ted Yu wrote: bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java, line 98 bq. https://reviews.apache.org/r/3510/diff/2/?file=68988#file68988line98 bq. bq. I think the original javadoc should be modified to indicate the support of Put and Delete. Agreed. bq. On 2012-01-16 15:27:14, Ted Yu wrote: bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java, line 175 bq. https://reviews.apache.org/r/3510/diff/2/?file=68988#file68988line175 bq. bq. I think 'to a map from key to values' may be clearer. bq. Otherwise people have to read the method body to fully understand. I think this should be a static util method somewhere(?) bq. On 2012-01-16 15:27:14, Ted Yu wrote: bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java, line 202 bq. https://reviews.apache.org/r/3510/diff/2/?file=68988#file68988line202 bq. bq. I don't see InterruptedException declared to be thrown by this method. bq. IE is caught at line 171. Argghh... HTable.batch() throws it, and my first attempt was to pass it on. This is a leftover and will be removed. Thanks for the keen eyes. bq. On 2012-01-16 15:27:14, Ted Yu wrote: bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 1788 bq. https://reviews.apache.org/r/3510/diff/2/?file=68987#file68987line1788 bq. bq. Please replace this parameter with clusterId. I knew you would find some Javadoc I missed :) Will fix. - Lars --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3510/#review4394 --- On 2012-01-16 07:58:33, Lars Hofhansl wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq.
https://reviews.apache.org/r/3510/ bq. --- bq. bq. (Updated 2012-01-16 07:58:33) bq. bq. bq. Review request for hbase, Ted Yu and Michael Stack. bq. bq. bq. Summary bq. --- bq. bq. Basically a rewrite (sorry about that) of HBASE-3485 Allow atomic put/delete in one call. bq. This makes this actually correct in the case of RegionServer failures (HBASE-3485 was correct for all scenarios but RegionServer failures). bq. HRegion.mutateRow(...) now groups all edits into a single WALEdit and appends all edits in one call. Only then are the memstore edits applied. bq. This is the
[jira] [Commented] (HBASE-5203) Group atomic put/delete operation into a single WALEdit to handle region server failures.
[ https://issues.apache.org/jira/browse/HBASE-5203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187001#comment-13187001 ] jirapos...@reviews.apache.org commented on HBASE-5203: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3510/#review4397 --- http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java https://reviews.apache.org/r/3510/#comment9917 What I meant was that if coprocessorHost == null, the for loop can be skipped. - Ted On 2012-01-16 07:58:33, Lars Hofhansl wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/3510/ bq. --- bq. bq. (Updated 2012-01-16 07:58:33) bq. bq. bq. Review request for hbase, Ted Yu and Michael Stack. bq. bq. bq. Summary bq. --- bq. bq. Basically a rewrite (sorry about that) of HBASE-3485 Allow atomic put/delete in one call. bq. This makes this actually correct in the case of RegionServer failures (HBASE-3485 was correct for all scenarios but RegionServer failures). bq. HRegion.mutateRow(...) now groups all edits into a single WALEdit and appends all edits in one call. Only then are the memstore edits applied. bq. This is the first time that WALEdits can contain KVs from different types of operations. So I also had to fix the replication code to understand that. bq. WAL recovery already handles this case. bq. bq. bq. This addresses bug HBASE-5203. bq. https://issues.apache.org/jira/browse/HBASE-5203 bq. bq. bq. Diffs bq. - bq. bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Delete.java 1231744 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1231744 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java 1231744 bq. 
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1231744 bq. bq. Diff: https://reviews.apache.org/r/3510/diff bq. bq. bq. Testing bq. --- bq. bq. * Tests added in HBASE-3485 bq. * manual testing. bq. * getting a full test run right now bq. bq. bq. Thanks, bq. bq. Lars bq. bq.
[jira] [Updated] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
[ https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5153: -- Comment: was deleted (was: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12510715/TestResults-hbase5153.out against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 1 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/778//console This message is automatically generated.)
[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
[ https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187011#comment-13187011 ] Zhihong Yu commented on HBASE-5153: --- @Jieshan: Please find a machine that has Internet access to run the test suite; Maven needs to download artifacts. I ran the 5 tests on a MacBook and they passed.
{code}
mt -Dtest=TestClockSkewDetection
mt -Dtest=TestScanner
mt -Dtest=TestCatalogTrackerOnCluster
mt -Dtest=TestCatalogTracker
{code}
+1 on latest patch.
[jira] [Commented] (HBASE-5203) Group atomic put/delete operation into a single WALEdit to handle region server failures.
[ https://issues.apache.org/jira/browse/HBASE-5203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187023#comment-13187023 ] jirapos...@reviews.apache.org commented on HBASE-5203: -- bq. On 2012-01-16 16:25:31, Ted Yu wrote: bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 4184 bq. https://reviews.apache.org/r/3510/diff/2/?file=68987#file68987line4184 bq. bq. What I meant was that if coprocessorHost == null, the for loop can be skipped. Oh I see. You're right. - Lars --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3510/#review4397 --- On 2012-01-16 07:58:33, Lars Hofhansl wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/3510/ bq. --- bq. bq. (Updated 2012-01-16 07:58:33) bq. bq. bq. Review request for hbase, Ted Yu and Michael Stack. bq. bq. bq. Summary bq. --- bq. bq. Basically a rewrite (sorry about that) of HBASE-3485 Allow atomic put/delete in one call. bq. This makes this actually correct in the case of RegionServer failures (HBASE-3485 was correct for all scenarios but RegionServer failures). bq. HRegion.mutateRow(...) now groups all edits into a single WALEdit and appends all edits in one call. Only then are the memstore edits applied. bq. This is the first time that WALEdits can contain KVs from different types of operations. So I also had to fix the replication code to understand that. bq. WAL recovery already handles this case. bq. bq. bq. This addresses bug HBASE-5203. bq. https://issues.apache.org/jira/browse/HBASE-5203 bq. bq. bq. Diffs bq. - bq. bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Delete.java 1231744 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1231744 bq. 
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java 1231744 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1231744 bq. bq. Diff: https://reviews.apache.org/r/3510/diff bq. bq. bq. Testing bq. --- bq. bq. * Tests added in HBASE-3485 bq. * manual testing. bq. * getting a full test run right now bq. bq. bq. Thanks, bq. bq. Lars bq. bq.
[jira] [Commented] (HBASE-5203) Group atomic put/delete operation into a single WALEdit to handle region server failures.
[ https://issues.apache.org/jira/browse/HBASE-5203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187037#comment-13187037 ] jirapos...@reviews.apache.org commented on HBASE-5203: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3510/ --- (Updated 2012-01-16 17:28:09.639619) Review request for hbase, Ted Yu and Michael Stack. Changes --- * Addresses Ted's comments. * Passes all tests. Summary --- Basically a rewrite (sorry about that) of HBASE-3485 Allow atomic put/delete in one call. This makes this actually correct in the case of RegionServer failures (HBASE-3485 was correct for all scenarios but RegionServer failures). HRegion.mutateRow(...) now groups all edits into a single WALEdit and appends all edits in one call. Only then are the memstore edits applied. This is the first time that WALEdits can contain KVs from different types of operations. So I also had to fix the replication code to understand that. WAL recovery already handles this case. This addresses bug HBASE-5203. https://issues.apache.org/jira/browse/HBASE-5203 Diffs (updated) - http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Delete.java 1231744 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1231744 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java 1231744 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1231744 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestAtomicOperation.java 1231744 Diff: https://reviews.apache.org/r/3510/diff Testing --- * Tests added in HBASE-3485 * manual testing. 
* getting a full test run right now

Thanks,
Lars

Group atomic put/delete operation into a single WALEdit to handle region server failures.
Key: HBASE-5203
URL: https://issues.apache.org/jira/browse/HBASE-5203
Project: HBase
Issue Type: Sub-task
Components: client, coprocessors, regionserver
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Fix For: 0.94.0
Attachments: 5203.txt

HBASE-3584 does not provide fully atomic operation in case of region server failures (see explanation there). What should happen is that either (1) all edits are applied via a single WALEdit, or (2) the WALEdits are applied in async mode and then sync'ed together.
For #1 it is not clear whether it is advisable to manage multiple *different* operations (Put/Delete) via a single WAL edit. A quick check reveals that WAL replay on region startup would work, but that replication would need to be adapted. The refactoring needed would be non-trivial.
#2 might actually not work, as another operation could request sync'ing a later edit and hence flush these entries out as well.
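The ordering described in the review summary (collect every edit into one WALEdit, append it in one call, and only then apply the memstore edits) can be sketched roughly as follows. This is a toy model in plain Java and not the HBase implementation: the class `AtomicRowSketch`, its nested `Edit` type, and the in-memory `wal` and `memstore` collections are all hypothetical stand-ins chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of "one log entry per atomic row mutation"; not HBase code. */
final class AtomicRowSketch {
    /** One cell-level change; a null value stands for a delete. */
    static final class Edit {
        final String column;
        final String value;

        Edit(String column, String value) {
            this.column = column;
            this.value = value;
        }
    }

    // Each element is one log entry; a mutation of N edits adds exactly one entry.
    private final List<List<Edit>> wal = new ArrayList<>();
    private final Map<String, String> memstore = new TreeMap<>();
    private boolean failNextAppend = false;

    /** Makes the next append throw, to imitate a failure before the log write. */
    void failNextAppend() {
        failNextAppend = true;
    }

    void mutateRow(List<Edit> edits) {
        // Step 1: group all puts and deletes into a single entry and append it once.
        List<Edit> entry = new ArrayList<>(edits);
        append(entry);
        // Step 2: only after the append returned normally, apply the edits in memory.
        for (Edit e : entry) {
            if (e.value == null) {
                memstore.remove(e.column);
            } else {
                memstore.put(e.column, e.value);
            }
        }
    }

    private void append(List<Edit> entry) {
        if (failNextAppend) {
            failNextAppend = false;
            throw new IllegalStateException("simulated log append failure");
        }
        wal.add(entry);
    }

    int walEntries() {
        return wal.size();
    }

    Map<String, String> memstore() {
        return memstore;
    }
}
```

Because the log holds either the whole group or nothing, a replay after a crash cannot observe the put without the delete (or vice versa), which is the property the patch is after.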
[jira] [Updated] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat
[ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicholas Telford updated HBASE-5208:
Attachment: HBASE-5208-003.txt

Adds tests for Scans defined by a Configuration. Getting the largeTests suite running proved difficult, and I think this change actually makes the test run too long: I had to comment out the old testScan() tests to get it to complete in a reasonable time (i.e. without being killed for taking too long). Should I have split this out into a separate test file?

Allow setting Scan start/stop row individually in TableInputFormat
Key: HBASE-5208
URL: https://issues.apache.org/jira/browse/HBASE-5208
Project: HBase
Issue Type: Improvement
Components: mapreduce
Reporter: Nicholas Telford
Priority: Minor
Attachments: HBASE-5208-001.txt, HBASE-5208-002.txt, HBASE-5208-003.txt

Currently, TableInputFormat initializes a serialized Scan from hbase.mapreduce.scan. Alternatively, it will instantiate a new Scan using properties defined in hbase.mapreduce.scan.*. However, of these properties the start row and stop row (arguably the most pertinent) are missing. TableInputFormat should permit the specification of a start/stop row as with the other fields using a new pair of properties: hbase.mapreduce.scan.row.start and hbase.mapreduce.scan.row.end

The primary use-case for this is to permit Oozie and other job management tools that can't call TableMapReduceUtil.initTableMapperJob() to operate on a contiguous subset of rows.
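To illustrate the proposal, here is a minimal sketch of how a row range might be derived from the two property names given in the issue text. It is plain Java with no HBase dependency: `java.util.Properties` stands in for the job Configuration, `ScanRangeSketch` is a hypothetical class, and comparing rows as Java strings is a simplifying assumption (HBase compares row keys as bytes). The start row is treated as inclusive and the stop row as exclusive, as with an HBase Scan.

```java
import java.util.Properties;

/** Toy stand-in for reading a scan range from job configuration; not HBase code. */
final class ScanRangeSketch {
    // Property names as proposed in the issue text.
    static final String ROW_START = "hbase.mapreduce.scan.row.start";
    static final String ROW_END = "hbase.mapreduce.scan.row.end";

    final String startRow; // inclusive; empty means "from the first row"
    final String stopRow;  // exclusive; empty means "to the last row"

    ScanRangeSketch(String startRow, String stopRow) {
        this.startRow = startRow;
        this.stopRow = stopRow;
    }

    /** Builds a range from the two properties; a missing property leaves that side open. */
    static ScanRangeSketch fromProperties(Properties conf) {
        return new ScanRangeSketch(
                conf.getProperty(ROW_START, ""),
                conf.getProperty(ROW_END, ""));
    }

    /** True if the row falls in [startRow, stopRow), with empty bounds treated as open. */
    boolean includes(String row) {
        boolean afterStart = startRow.isEmpty() || row.compareTo(startRow) >= 0;
        boolean beforeStop = stopRow.isEmpty() || row.compareTo(stopRow) < 0;
        return afterStart && beforeStop;
    }
}
```

A job launcher that can only pass key/value properties (the Oozie case mentioned in the issue) would then set the two keys and never need to construct a Scan object itself.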
[jira] [Created] (HBASE-5209) HConnection/HMasterInterface should allow for way to get hostname of currently active master in multi-master HBase setup
HConnection/HMasterInterface should allow for way to get hostname of currently active master in multi-master HBase setup Key: HBASE-5209 URL: https://issues.apache.org/jira/browse/HBASE-5209 Project: HBase Issue Type: Improvement Components: master Reporter: Aditya Acharya I have a multi-master HBase set up, and I'm trying to programmatically determine which of the masters is currently active. But the API does not allow me to do this. There is a getMaster() method in the HConnection class, but it returns an HMasterInterface, whose methods do not allow me to find out which master won the last race. The API should have a getActiveMasterHostname() or something to that effect. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5209) HConnection/HMasterInterface should allow for way to get hostname of currently active master in multi-master HBase setup
[ https://issues.apache.org/jira/browse/HBASE-5209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-5209: -- Affects Version/s: 0.94.0 0.92.0 0.90.5 HConnection/HMasterInterface should allow for way to get hostname of currently active master in multi-master HBase setup Key: HBASE-5209 URL: https://issues.apache.org/jira/browse/HBASE-5209 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.92.0, 0.94.0, 0.90.5 Reporter: Aditya Acharya I have a multi-master HBase set up, and I'm trying to programmatically determine which of the masters is currently active. But the API does not allow me to do this. There is a getMaster() method in the HConnection class, but it returns an HMasterInterface, whose methods do not allow me to find out which master won the last race. The API should have a getActiveMasterHostname() or something to that effect. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid
[ https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187059#comment-13187059 ]

jirapos...@reviews.apache.org commented on HBASE-2600:

This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3466/

(Updated 2012-01-16 18:26:39.949854)

Review request for hbase and Michael Stack.

Changes
-------
Updating the patch so that src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java uses the endkey instead of the startkey, as it is populated more often. This fixes the occasional test breakage of org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster#testShutdownSimpleFixup

Summary
-------
This is an idea that Ryan and I have been kicking around on and off for a while now. If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row). If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region. This issue is about changing the way we name regions. If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward). Converting to the new method, we'd have to run a migration on startup changing the content in meta. Up to this, the randomid component of a region name has been the timestamp of region creation.
HBASE-2531 32-bit encoding of regionnames waaay too susceptible to hash clashes proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change. This addresses bug HBASE-2600. https://issues.apache.org/jira/browse/HBASE-2600 Diffs (updated) - src/main/java/org/apache/hadoop/hbase/HConstants.java 904e2d2 src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 74cb821 src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java 133759d src/main/java/org/apache/hadoop/hbase/KeyValue.java be7e2d8 src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java e5e60a8 src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 88c381f src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 99f90b2 src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java f0c6828 src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java 8f4f4b8 src/main/java/org/apache/hadoop/hbase/rest/RegionsResource.java bf85bc1 src/main/java/org/apache/hadoop/hbase/rest/model/TableRegionModel.java 67e7a04 src/test/java/org/apache/hadoop/hbase/TestKeyValue.java dc4ee8d src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java 5f97167 src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java 6e1211b src/test/java/org/apache/hadoop/hbase/rest/TestStatusResource.java cffdcb6 src/test/java/org/apache/hadoop/hbase/rest/model/TestTableRegionModel.java b6f0ab5 Diff: https://reviews.apache.org/r/3466/diff Testing --- Unit tests started table. 
Tests in error: org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD: Table 'TestTable we searched for the StartKey: TestTable ,, startKey lastChar's int value: 32 with the stopKey: TestTable#,, stopRow lastChar's int value: 35 with parentTable:.META. I need to know how to update/recreate the tar ball which is the source for that test. Thanks, Alex Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid Key: HBASE-2600 URL: https://issues.apache.org/jira/browse/HBASE-2600 Project: HBase Issue Type: Bug Reporter: stack Assignee: Alex Newman Attachments: 0001-Changed-regioninfo-format-to-use-endKey-instead-of-s.patch,
[jira] [Updated] (HBASE-5203) Group atomic put/delete operation into a single WALEdit to handle region server failures.
[ https://issues.apache.org/jira/browse/HBASE-5203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-5203:
Status: Open (was: Patch Available)

Group atomic put/delete operation into a single WALEdit to handle region server failures.
Key: HBASE-5203
URL: https://issues.apache.org/jira/browse/HBASE-5203
Project: HBase
Issue Type: Sub-task
Components: client, coprocessors, regionserver
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Fix For: 0.94.0
Attachments: 5203.txt

HBASE-3584 does not provide fully atomic operation in case of region server failures (see explanation there). What should happen is that either (1) all edits are applied via a single WALEdit, or (2) the WALEdits are applied in async mode and then sync'ed together.
For #1 it is not clear whether it is advisable to manage multiple *different* operations (Put/Delete) via a single WAL edit. A quick check reveals that WAL replay on region startup would work, but that replication would need to be adapted. The refactoring needed would be non-trivial.
#2 might actually not work, as another operation could request sync'ing a later edit and hence flush these entries out as well.
[jira] [Commented] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat
[ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187060#comment-13187060 ] Hadoop QA commented on HBASE-5208: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12510724/HBASE-5208-003.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -145 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 82 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.replication.TestReplicationPeer org.apache.hadoop.hbase.regionserver.TestSplitLogWorker org.apache.hadoop.hbase.mapreduce.TestImportTsv org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/779//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/779//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/779//console This message is automatically generated. Allow setting Scan start/stop row individually in TableInputFormat -- Key: HBASE-5208 URL: https://issues.apache.org/jira/browse/HBASE-5208 Project: HBase Issue Type: Improvement Components: mapreduce Reporter: Nicholas Telford Priority: Minor Attachments: HBASE-5208-001.txt, HBASE-5208-002.txt, HBASE-5208-003.txt Currently, TableInputFormat initializes a serialized Scan from hbase.mapreduce.scan. 
Alternatively, it will instantiate a new Scan using properties defined in hbase.mapreduce.scan.*. However, of these properties the start row and stop row (arguably the most pertinent) are missing. TableInputFormat should permit the specification of a start/stop row as with the other fields using a new pair of properties: hbase.mapreduce.scan.row.start and hbase.mapreduce.scan.row.end The primary use-case for this is to permit Oozie and other job management tools that can't call TableMapReduceUtil.initTableMapperJob() to operate on a contiguous subset of rows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid
[ https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Newman updated HBASE-2600: --- Attachment: 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v2.patch Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid Key: HBASE-2600 URL: https://issues.apache.org/jira/browse/HBASE-2600 Project: HBase Issue Type: Bug Reporter: stack Assignee: Alex Newman Attachments: 0001-Changed-regioninfo-format-to-use-endKey-instead-of-s.patch, 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v2.patch, 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen.patch This is an idea that Ryan and I have been kicking around on and off for a while now. If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row). If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region. This issue is about changing the way we name regions. If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward). Converting to the new method, we'd have to run a migration on startup changing the content in meta. Up to this, the randomid component of a region name has been the timestamp of region creation. 
HBASE-2531 32-bit encoding of regionnames waaay too susceptible to hash clashes proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
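The lookup difference described in the HBASE-2600 summary can be sketched with a sorted map standing in for the meta table of a single table. Everything here is a hypothetical illustration, not HBase code: `RegionLookupSketch`, the region names, and the `NO_END` sentinel used for the last region (which has no end key) are assumptions made for the sketch. With start-key naming the containing region is the closest entry at or before the row; with end-key naming it is simply the first entry after the row, which is what a forward scan returns first.

```java
import java.util.TreeMap;

/** Toy model of a meta table for one table; not HBase code. */
final class RegionLookupSketch {
    // Assumed sentinel for "no end key": sorts after every ordinary row key.
    static final String NO_END = String.valueOf(Character.MAX_VALUE);

    /** Meta keyed by region start key (the current naming scheme). */
    static TreeMap<String, String> byStartKey() {
        TreeMap<String, String> meta = new TreeMap<>();
        meta.put("", "region1");  // rows ["", "g")
        meta.put("g", "region2"); // rows ["g", "p")
        meta.put("p", "region3"); // rows ["p", end of table)
        return meta;
    }

    /** Meta keyed by region end key (the proposed naming scheme). */
    static TreeMap<String, String> byEndKey() {
        TreeMap<String, String> meta = new TreeMap<>();
        meta.put("g", "region1");
        meta.put("p", "region2");
        meta.put(NO_END, "region3");
        return meta;
    }

    /** Closest entry at or before the row: the backward-looking lookup. */
    static String lookupByStartKey(TreeMap<String, String> meta, String row) {
        return meta.floorEntry(row).getValue();
    }

    /** First entry strictly after the row: what a forward scan would hit first. */
    static String lookupByEndKey(TreeMap<String, String> meta, String row) {
        return meta.higherEntry(row).getValue();
    }
}
```

Both lookups agree on every row, including the boundary row "g", which belongs to the region that starts at "g" because end keys are exclusive. The sketch ignores offlined parents and the tablename and id components of real region names.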
[jira] [Commented] (HBASE-5203) Group atomic put/delete operation into a single WALEdit to handle region server failures.
[ https://issues.apache.org/jira/browse/HBASE-5203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187064#comment-13187064 ]

ramkrishna.s.vasudevan commented on HBASE-5203:

@Lars Sorry, I pasted the snippet from the code. In doMiniBatchPuts, postPut() is called only if the put is successful. Here in mutateRow() we call postPut() in the finally block. So I just wanted to know: if mutateRow()'s log.append() fails, do we still execute postPut()? Please correct me if I am wrong. I get the intent behind the patch, but I am not sure about this part.

Group atomic put/delete operation into a single WALEdit to handle region server failures.
Key: HBASE-5203
URL: https://issues.apache.org/jira/browse/HBASE-5203
Project: HBase
Issue Type: Sub-task
Components: client, coprocessors, regionserver
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Fix For: 0.94.0
Attachments: 5203.txt

HBASE-3584 does not provide fully atomic operation in case of region server failures (see explanation there). What should happen is that either (1) all edits are applied via a single WALEdit, or (2) the WALEdits are applied in async mode and then sync'ed together.
For #1 it is not clear whether it is advisable to manage multiple *different* operations (Put/Delete) via a single WAL edit. A quick check reveals that WAL replay on region startup would work, but that replication would need to be adapted. The refactoring needed would be non-trivial.
#2 might actually not work, as another operation could request sync'ing a later edit and hence flush these entries out as well.
[jira] [Commented] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat
[ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187065#comment-13187065 ]

Zhihong Yu commented on HBASE-5208:

Looks like testScan() is always followed by testScanFromConfiguration() with the same parameters:
{code}
testScan(null, app, apo);
+testScanFromConfiguration(null, app, apo);
{code}
I suggest adding an intermediary method that calls both testScanFromConfiguration() and testScan(). So using the existing TestTableInputFormatScan should be fine.

Allow setting Scan start/stop row individually in TableInputFormat
Key: HBASE-5208
URL: https://issues.apache.org/jira/browse/HBASE-5208
Project: HBase
Issue Type: Improvement
Components: mapreduce
Reporter: Nicholas Telford
Priority: Minor
Attachments: HBASE-5208-001.txt, HBASE-5208-002.txt, HBASE-5208-003.txt

Currently, TableInputFormat initializes a serialized Scan from hbase.mapreduce.scan. Alternatively, it will instantiate a new Scan using properties defined in hbase.mapreduce.scan.*. However, of these properties the start row and stop row (arguably the most pertinent) are missing. TableInputFormat should permit the specification of a start/stop row as with the other fields using a new pair of properties: hbase.mapreduce.scan.row.start and hbase.mapreduce.scan.row.end

The primary use-case for this is to permit Oozie and other job management tools that can't call TableMapReduceUtil.initTableMapperJob() to operate on a contiguous subset of rows.
[jira] [Commented] (HBASE-5203) Group atomic put/delete operation into a single WALEdit to handle region server failures.
[ https://issues.apache.org/jira/browse/HBASE-5203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187067#comment-13187067 ]

jirapos...@reviews.apache.org commented on HBASE-5203:

This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3510/#review4400

Ship it!

- Ted

Group atomic put/delete operation into a single WALEdit to handle region server failures.
Key: HBASE-5203
URL: https://issues.apache.org/jira/browse/HBASE-5203
Project: HBase
Issue Type: Sub-task
Components: client, coprocessors, regionserver
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Fix For: 0.94.0
Attachments: 5203.txt

HBASE-3584 does not provide fully atomic operation in case of region server failures (see explanation there). What should happen is that either (1) all edits are applied via a single WALEdit, or (2) the WALEdits are applied in async mode and then sync'ed together.
For #1 it is not clear whether it is advisable to manage multiple *different* operations (Put/Delete) via a single WAL edit. A quick check reveals that WAL replay on region startup would work, but that replication would need to be adapted. The refactoring needed would be non-trivial.
#2 might actually not work, as another operation could request sync'ing a later edit and hence flush these entries out as well.
[jira] [Commented] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat
[ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187071#comment-13187071 ]

Nicholas Telford commented on HBASE-5208:

That was my intention. I can extract that out to an intermediary method if that's preferable; however, that doesn't really solve the problem that doubling the number of MR jobs spun up causes the test to time out. Any ideas on that one?

Allow setting Scan start/stop row individually in TableInputFormat
Key: HBASE-5208
URL: https://issues.apache.org/jira/browse/HBASE-5208
Project: HBase
Issue Type: Improvement
Components: mapreduce
Reporter: Nicholas Telford
Priority: Minor
Attachments: HBASE-5208-001.txt, HBASE-5208-002.txt, HBASE-5208-003.txt

Currently, TableInputFormat initializes a serialized Scan from hbase.mapreduce.scan. Alternatively, it will instantiate a new Scan using properties defined in hbase.mapreduce.scan.*. However, of these properties the start row and stop row (arguably the most pertinent) are missing. TableInputFormat should permit the specification of a start/stop row as with the other fields using a new pair of properties: hbase.mapreduce.scan.row.start and hbase.mapreduce.scan.row.end

The primary use-case for this is to permit Oozie and other job management tools that can't call TableMapReduceUtil.initTableMapperJob() to operate on a contiguous subset of rows.
[jira] [Commented] (HBASE-5203) Group atomic put/delete operation into a single WALEdit to handle region server failures.
[ https://issues.apache.org/jira/browse/HBASE-5203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187074#comment-13187074 ]

Lars Hofhansl commented on HBASE-5203:

@Ram: I see what you mean. Good point. Unlike doMiniBatchPut there is no partial completion here, but the postHooks should indeed only be run if the (entire) operation was successful. I'll have a change soon.

Group atomic put/delete operation into a single WALEdit to handle region server failures.
Key: HBASE-5203
URL: https://issues.apache.org/jira/browse/HBASE-5203
Project: HBase
Issue Type: Sub-task
Components: client, coprocessors, regionserver
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Fix For: 0.94.0
Attachments: 5203.txt

HBASE-3584 does not provide fully atomic operation in case of region server failures (see explanation there). What should happen is that either (1) all edits are applied via a single WALEdit, or (2) the WALEdits are applied in async mode and then sync'ed together.
For #1 it is not clear whether it is advisable to manage multiple *different* operations (Put/Delete) via a single WAL edit. A quick check reveals that WAL replay on region startup would work, but that replication would need to be adapted. The refactoring needed would be non-trivial.
#2 might actually not work, as another operation could request sync'ing a later edit and hence flush these entries out as well.
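The point raised by Ram and acknowledged by Lars comes down to where the post hooks sit relative to the log append. The sketch below contrasts the two placements in plain Java. It is not the patch itself: `PostHookSketch`, its methods, and the recorded call names are hypothetical, and the append failure is simulated with a flag.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy contrast of two hook placements; not HBase code. */
final class PostHookSketch {
    // Records which steps actually ran, in order.
    final List<String> calls = new ArrayList<>();

    /** Hook in a finally block: it runs even when the append throws. */
    void mutateWithFinally(boolean appendFails) {
        try {
            append(appendFails);
        } finally {
            calls.add("postPut");
        }
    }

    /** Hook after the append: it runs only if the append returned normally. */
    void mutateOnSuccess(boolean appendFails) {
        append(appendFails);
        calls.add("postPut");
    }

    private void append(boolean fail) {
        if (fail) {
            throw new IllegalStateException("simulated append failure");
        }
        calls.add("append");
    }
}
```

With the first placement a coprocessor would be told a put happened although nothing reached the log; with the second the hook is skipped whenever the operation as a whole failed.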
[jira] [Commented] (HBASE-5120) Timeout monitor races with table disable handler
[ https://issues.apache.org/jira/browse/HBASE-5120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187081#comment-13187081 ]

ramkrishna.s.vasudevan commented on HBASE-5120:

I was just browsing through HBASE-4015, where the Timeout monitor was refactored. With 5 secs as the timeout period it was tested by balancing, killing and bringing up RS. Things came out fine. But this disable scenario was missed. Another change I can see: when HBASE-4015 was done, for forceful unassign() we checked whether the node was present in CLOSING state and, if so, did not proceed with it. In recent code that check has been removed. Maybe that exposed the problem. Thanks to JD for pointing this out. As per JD, if we don't run into this type of issue after reducing the timeout period, then we can say TM is really fixed.

Timeout monitor races with table disable handler
Key: HBASE-5120
URL: https://issues.apache.org/jira/browse/HBASE-5120
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Zhihong Yu
Assignee: ramkrishna.s.vasudevan
Priority: Blocker
Fix For: 0.94.0, 0.92.1
Attachments: HBASE-5120.patch, HBASE-5120_1.patch, HBASE-5120_2.patch, HBASE-5120_3.patch, HBASE-5120_4.patch, HBASE-5120_5.patch, HBASE-5120_5.patch

Here is what J-D described here: https://issues.apache.org/jira/browse/HBASE-5119?focusedCommentId=13179176page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13179176

I think I will retract from my statement that it used to be extremely racy and caused more troubles than it fixed, on my first test I got a stuck region in transition instead of being able to recover. The timeout was set to 2 minutes to be sure I hit it. First the region gets closed
{quote}
2012-01-04 00:16:25,811 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to sv4r5s38,62023,1325635980913 for region test1,089cd0c9,1325635015491.1a4b111bcc228043e89f59c4c3f6a791.
{quote} 2 minutes later it times out: {quote} 2012-01-04 00:18:30,026 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out: test1,089cd0c9,1325635015491.1a4b111bcc228043e89f59c4c3f6a791. state=PENDING_CLOSE, ts=1325636185810, server=null 2012-01-04 00:18:30,026 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=test1,089cd0c9,1325635015491.1a4b111bcc228043e89f59c4c3f6a791. 2012-01-04 00:18:30,027 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region test1,089cd0c9,1325635015491.1a4b111bcc228043e89f59c4c3f6a791. (offlining) {quote} 100ms later the master finally gets the event: {quote} 2012-01-04 00:18:30,129 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_CLOSED, server=sv4r5s38,62023,1325635980913, region=1a4b111bcc228043e89f59c4c3f6a791, which is more than 15 seconds late 2012-01-04 00:18:30,129 DEBUG org.apache.hadoop.hbase.master.handler.ClosedRegionHandler: Handling CLOSED event for 1a4b111bcc228043e89f59c4c3f6a791 2012-01-04 00:18:30,129 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Table being disabled so deleting ZK node and removing from regions in transition, skipping assignment of region test1,089cd0c9,1325635015491.1a4b111bcc228043e89f59c4c3f6a791. 2012-01-04 00:18:30,129 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:62003-0x134589d3db03587 Deleting existing unassigned node for 1a4b111bcc228043e89f59c4c3f6a791 that is in expected state RS_ZK_REGION_CLOSED 2012-01-04 00:18:30,166 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:62003-0x134589d3db03587 Successfully deleted unassigned node for region 1a4b111bcc228043e89f59c4c3f6a791 in expected state RS_ZK_REGION_CLOSED {quote} At this point everything is fine, the region was processed as closed. But wait, remember that line where it said it was going to force an unassign? 
{quote} 2012-01-04 00:18:30,322 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:62003-0x134589d3db03587 Creating unassigned node for 1a4b111bcc228043e89f59c4c3f6a791 in a CLOSING state 2012-01-04 00:18:30,328 INFO org.apache.hadoop.hbase.master.AssignmentManager: Server null returned java.lang.NullPointerException: Passed server is null for 1a4b111bcc228043e89f59c4c3f6a791 {quote} Now the master is confused, it recreated the RIT znode but the region doesn't even exist anymore. It even tries to shut it down but is blocked by NPEs. Now this is what's going on. The late ZK notification that the znode was deleted (but it got recreated after): {quote} 2012-01-04 00:19:33,285 DEBUG
[jira] [Created] (HBASE-5210) HFiles are missing from an incremental load
HFiles are missing from an incremental load --- Key: HBASE-5210 URL: https://issues.apache.org/jira/browse/HBASE-5210 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.90.2 Environment: HBase 0.90.2 with Hadoop-0.20.2 (with durable sync). RHEL 2.6.18-164.15.1.el5. 4 node cluster (1 master, 3 slaves) Reporter: Lawrence Simpson

We run an overnight map/reduce job that loads data from an external source and adds that data to an existing HBase table. The input files have been loaded into hdfs. The map/reduce job uses the HFileOutputFormat (and the TotalOrderPartitioner) to create HFiles which are subsequently added to the HBase table. On at least two separate occasions (that we know of), a range of output was missing for a given day. The range of keys for the missing values corresponded to those of a particular region. This implied that a complete HFile somehow went missing from the job.

Further investigation revealed the following: two different reducers (running in separate JVMs and thus separate class loaders) on the same server can end up using the same file names for their HFiles. The scenario is as follows:

1. Both reducers start near the same time.
2. The first reducer reaches the point where it wants to write its first file.
3. It uses the StoreFile class, which contains a static Random object that is initialized by default using a timestamp.
4. The file name is generated using the random number generator.
5. The file name is checked against other existing files.
6. The file is written into temporary files in a directory named after the reducer attempt.
7. The second reduce task reaches the same point, but its StoreFile class (which is now in the file system's cache) gets loaded within the time resolution of the OS and thus initializes its Random() object with the same seed as the first task.
8. The second task also checks for an existing file with the name generated by the random number generator and finds no conflict, because each task is writing files in its own temporary folder.
9. The first task finishes and gets its temporary files committed to the real folder specified for output of the HFiles.
10. The second task then reaches its own conclusion and commits its files (moveTaskOutputs). The released Hadoop code just overwrites any files with the same name. No warning messages or anything. The first task's HFiles just go missing.

Note: The reducers here are NOT different attempts at the same reduce task. They are different reduce tasks, so data is really lost.

I am currently testing a fix in which I have added code to the Hadoop FileOutputCommitter.moveTaskOutputs method to check for a conflict with an existing file in the final output folder and to rename the HFile if needed. This may not be appropriate for all uses of FileOutputFormat, so I have put this into a new class which is then used by a subclass of HFileOutputFormat. Subclassing FileOutputCommitter itself was a bit more of a problem due to private declarations. I don't know if my approach is the best fix for the problem. If someone more knowledgeable than myself deems that it is, I will be happy to share what I have done, and by that time I may have some information on the results.

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
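The collision described in steps 3 to 8 can be reproduced in isolation. The following is a minimal sketch, not the StoreFile code: the class and method names are hypothetical, and the shared seed is passed in explicitly to stand in for two JVMs that happened to seed their generators from the same timestamp. It also shows one possible way to avoid the clash (a UUID-based name), which is an assumption and not the fix the reporter is testing.

```java
import java.util.Random;
import java.util.UUID;

// Hypothetical demo class; not part of HBase or Hadoop.
class SeedCollisionDemo {

    // Generates a file name from a Random created with the given seed.
    // The explicit seed simulates two JVMs whose generators were
    // initialized from the same timestamp.
    static String randomFileName(long seed) {
        return Long.toHexString(new Random(seed).nextLong());
    }

    // One possible alternative: a name that does not depend on a
    // timestamp-derived seed (an assumption, not the reporter's fix).
    static String uniqueFileName() {
        return UUID.randomUUID().toString().replace("-", "");
    }

    public static void main(String[] args) {
        long sharedSeed = 1326700000000L; // arbitrary example timestamp

        // Two "reducers" with the same seed produce the same first name,
        // because java.util.Random is deterministic for a given seed.
        String first = randomFileName(sharedSeed);
        String second = randomFileName(sharedSeed);
        System.out.println("same-seed names collide: " + first.equals(second));

        // UUID-based names are generated independently of the clock seed.
        System.out.println("uuid name: " + uniqueFileName());
    }
}
```

Because each task only checks for conflicts inside its own temporary folder, identical names like these go unnoticed until the commit step overwrites the earlier file.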
[jira] [Commented] (HBASE-5203) Group atomic put/delete operation into a single WALEdit to handle region server failures.
[ https://issues.apache.org/jira/browse/HBASE-5203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187089#comment-13187089 ] jirapos...@reviews.apache.org commented on HBASE-5203: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3510/ --- (Updated 2012-01-16 19:05:01.756466) Review request for hbase, Ted Yu and Michael Stack. Changes --- * Addressing Ram's comments. Coprocessor postHooks are now only executed if all operations were successful (threw no exceptions). Summary --- Basically a rewrite (sorry about that) of HBASE-3485 Allow atomic put/delete in one call. This makes this actually correct in the case of RegionServer failures (HBASE-3485 was correct for all scenarios but RegionServer failures). HRegion.mutateRow(...) now groups all edits into a single WALEdit and appends all edits in one call. Only then are the memstore edits applied. This is the first time that WALEdits can contain KVs from different types of operations. So I also had to fix the replication code to understand that. WAL recovery already handles this case. This addresses bug HBASE-5203. https://issues.apache.org/jira/browse/HBASE-5203 Diffs (updated) - http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Delete.java 1232110 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1232110 http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java 1232110 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1232110 http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestAtomicOperation.java 1232110 Diff: https://reviews.apache.org/r/3510/diff Testing (updated) --- * Tests added in HBASE-3485 * manual testing. * passes all tests. 
Thanks, Lars Group atomic put/delete operation into a single WALEdit to handle region server failures. - Key: HBASE-5203 URL: https://issues.apache.org/jira/browse/HBASE-5203 Project: HBase Issue Type: Sub-task Components: client, coprocessors, regionserver Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.94.0 Attachments: 5203.txt HBASE-3584 does not provide fully atomic operations in case of region server failures (see explanation there). What should happen is that either (1) all edits are applied via a single WALEdit, or (2) the WALEdits are applied in async mode and then sync'ed together. For #1 it is not clear whether it is advisable to manage multiple *different* operations (Put/Delete) via a single WAL edit. A quick check reveals that WAL replay on region startup would work, but that replication would need to be adapted. The refactoring needed would be non-trivial. #2 might not actually work, as another operation could request sync'ing a later edit and hence flush these entries out as well.
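The ordering described in the review summary (group all edits into one WALEdit, append it in one call, and only then apply the memstore edits) can be illustrated with a toy model. This is a sketch under stated assumptions, not the HRegion.mutateRow(...) implementation: the class, the string-encoded edits, and the in-memory "log" are all hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical toy model; the real logic lives in HRegion and the WAL.
class AtomicRowMutationSketch {

    // Toy write-ahead log: each element is one record holding all edits
    // of a single row mutation, so a record is either present or absent.
    final List<List<String>> wal = new ArrayList<>();

    // Toy in-memory store that serves reads.
    final List<String> memstore = new ArrayList<>();

    void mutateRow(List<String> edits) {
        // 1. Group puts and deletes into ONE log record and append it once.
        wal.add(new ArrayList<>(edits));
        // 2. Only after the append succeeded, apply the edits in memory.
        memstore.addAll(edits);
    }

    // Recovery after a crash: rebuild the in-memory state from the log.
    // Because a mutation is a single record, replay can never observe
    // only part of it.
    static List<String> replay(List<List<String>> log) {
        List<String> rebuilt = new ArrayList<>();
        for (List<String> record : log) {
            rebuilt.addAll(record);
        }
        return rebuilt;
    }

    public static void main(String[] args) {
        AtomicRowMutationSketch region = new AtomicRowMutationSketch();
        region.mutateRow(Arrays.asList("put:row1/cf:a=1", "delete:row1/cf:b"));
        System.out.println("log records: " + region.wal.size());
        System.out.println("replayed:    " + replay(region.wal));
    }
}
```

If each edit were appended as its own record instead, a crash between two appends would leave a log that replays only part of the mutation, which is the failure mode this issue addresses.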
[jira] [Commented] (HBASE-5203) Group atomic put/delete operation into a single WALEdit to handle region server failures.
[ https://issues.apache.org/jira/browse/HBASE-5203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187091#comment-13187091 ] Lars Hofhansl commented on HBASE-5203: -- @Ram: I added a new patch to RB. The only change is wrapping the whole operation in try/finally and running the postHooks outside the finally blocks (but still after the mvcc is rolled forward and the region lock is released). Otherwise the patch is identical. Please have a look. Thanks. Group atomic put/delete operation into a single WALEdit to handle region server failures. - Key: HBASE-5203 URL: https://issues.apache.org/jira/browse/HBASE-5203 Project: HBase Issue Type: Sub-task Components: client, coprocessors, regionserver Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.94.0 Attachments: 5203.txt HBASE-3584 does not provide fully atomic operations in case of region server failures (see explanation there). What should happen is that either (1) all edits are applied via a single WALEdit, or (2) the WALEdits are applied in async mode and then sync'ed together. For #1 it is not clear whether it is advisable to manage multiple *different* operations (Put/Delete) via a single WAL edit. A quick check reveals that WAL replay on region startup would work, but that replication would need to be adapted. The refactoring needed would be non-trivial. #2 might not actually work, as another operation could request sync'ing a later edit and hence flush these entries out as well.
[jira] [Commented] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid
[ https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187093#comment-13187093 ] Hadoop QA commented on HBASE-2600: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12510727/0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v2.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 20 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -145 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 83 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD org.apache.hadoop.hbase.replication.TestReplicationPeer org.apache.hadoop.hbase.replication.TestReplication org.apache.hadoop.hbase.mapreduce.TestImportTsv org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/780//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/780//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/780//console This message is automatically generated. 
Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid Key: HBASE-2600 URL: https://issues.apache.org/jira/browse/HBASE-2600 Project: HBase Issue Type: Bug Reporter: stack Assignee: Alex Newman Attachments: 0001-Changed-regioninfo-format-to-use-endKey-instead-of-s.patch, 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v2.patch, 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen.patch This is an idea that Ryan and I have been kicking around on and off for a while now. If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row). If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region. This issue is about changing the way we name regions. If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward). Converting to the new method, we'd have to run a migration on startup changing the content in meta. Up to this, the randomid component of a region name has been the timestamp of region creation. HBASE-2531 32-bit encoding of regionnames waaay too susceptible to hash clashes proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... 
changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
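The difference between the two lookup styles in this proposal can be sketched with a sorted map standing in for the meta table. This is only an illustration under stated assumptions: the class and region names are hypothetical, keys are plain row strings rather than full region names, end keys are treated as exclusive, and the last region's open end is modeled with the sentinel "\uffff" (real HBase uses an empty end key, which needs separate handling).

```java
import java.util.TreeMap;

// Hypothetical stand-in for a meta table; keys are plain row strings.
class MetaLookupSketch {

    // Start-key layout: the containing region is the closest key at or
    // BEFORE the row, i.e. a backward search (like getClosestRowBefore).
    static String findByStartKey(TreeMap<String, String> meta, String row) {
        return meta.floorEntry(row).getValue();
    }

    // End-key layout: the containing region is the first key AFTER the
    // row, i.e. the first entry a forward scan would return.
    static String findByEndKey(TreeMap<String, String> meta, String row) {
        return meta.higherEntry(row).getValue();
    }

    // Regions: A = ["", "g"), B = ["g", "p"), C = ["p", end of table).
    static TreeMap<String, String> startKeyed() {
        TreeMap<String, String> meta = new TreeMap<>();
        meta.put("", "regionA");
        meta.put("g", "regionB");
        meta.put("p", "regionC");
        return meta;
    }

    static TreeMap<String, String> endKeyed() {
        TreeMap<String, String> meta = new TreeMap<>();
        meta.put("g", "regionA");
        meta.put("p", "regionB");
        meta.put("\uffff", "regionC"); // sentinel for the open-ended last region
        return meta;
    }

    public static void main(String[] args) {
        for (String row : new String[] {"apple", "g", "hello", "zebra"}) {
            System.out.println(row + " -> " + findByStartKey(startKeyed(), row)
                + " / " + findByEndKey(endKeyed(), row));
        }
    }
}
```

Both layouts resolve a row to the same region; the end-key layout gets there with a forward search, which is the access pattern the description calls natural to HBase.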
[jira] [Commented] (HBASE-5166) MultiThreaded Table Mapper analogous to MultiThreaded Mapper in hadoop
[ https://issues.apache.org/jira/browse/HBASE-5166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187094#comment-13187094 ] Jai Kumar Singh commented on HBASE-5166: Hi stack, Thanks for the comment. I've modified the patch accordingly. Added Executors.newFixedThreadPool(numberOfThreads) for the executor part. -- JK MultiThreaded Table Mapper analogous to MultiThreaded Mapper in hadoop -- Key: HBASE-5166 URL: https://issues.apache.org/jira/browse/HBASE-5166 Project: HBase Issue Type: Improvement Reporter: Jai Kumar Singh Priority: Minor Labels: multithreaded, tablemapper Attachments: 0001-Added-MultithreadedTableMapper-HBASE-5166.patch Original Estimate: 0.5h Remaining Estimate: 0.5h HBase currently has no MultiThreadedTableMapper analogous to the MultiThreadedMapper that Hadoop provides for I/O-bound jobs. Use case, web crawler: take input (URLs) from an HBase table and put the content (URLs, content) back into HBase. Running this kind of HBase MapReduce job with the normal table mapper is quite slow because we are not fully utilizing the CPU (the job is network I/O bound). Moreover, I want to know whether it would be a good or bad idea to use HBase for this kind of use case.
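The executor approach mentioned in the comment can be sketched outside of MapReduce. The following is a minimal illustration, not the attached patch: the class, the `fetch` placeholder, and the row values are hypothetical, and only `Executors.newFixedThreadPool(numberOfThreads)` is taken from the comment. It runs one task per input row on a fixed pool so that several I/O-bound calls can wait in parallel.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; not the MultithreadedTableMapper from the patch.
class MultithreadedMapSketch {

    // Placeholder for an I/O-bound map step such as fetching a URL.
    static String fetch(String url) {
        return "content-of:" + url;
    }

    // Maps every row on a fixed-size pool and returns results in input order.
    static List<String> mapAll(List<String> rows, int numberOfThreads) {
        ExecutorService executor = Executors.newFixedThreadPool(numberOfThreads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String row : rows) {
                Callable<String> task = () -> fetch(row);
                futures.add(executor.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // blocks until that task is done
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while mapping", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("map task failed", e.getCause());
        } finally {
            executor.shutdown(); // stop accepting tasks; lets the JVM exit
        }
    }

    public static void main(String[] args) {
        System.out.println(mapAll(Arrays.asList("url1", "url2", "url3"), 2));
    }
}
```

A fixed pool bounds the number of concurrent fetches per map task, which matters when many map tasks share the same network and the same remote hosts.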
[jira] [Commented] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid
[ https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187095#comment-13187095 ] Alex Newman commented on HBASE-2600: I'll take a look at these broken tests. Weird that these didn't break on my jenkins. Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid Key: HBASE-2600 URL: https://issues.apache.org/jira/browse/HBASE-2600 Project: HBase Issue Type: Bug Reporter: stack Assignee: Alex Newman Attachments: 0001-Changed-regioninfo-format-to-use-endKey-instead-of-s.patch, 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v2.patch, 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen.patch This is an idea that Ryan and I have been kicking around on and off for a while now. If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row). If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region. This issue is about changing the way we name regions. If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward). Converting to the new method, we'd have to run a migration on startup changing the content in meta. Up to this, the randomid component of a region name has been the timestamp of region creation. 
HBASE-2531 32-bit encoding of regionnames waaay too susceptible to hash clashes proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5166) MultiThreaded Table Mapper analogous to MultiThreaded Mapper in hadoop
[ https://issues.apache.org/jira/browse/HBASE-5166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jai Kumar Singh updated HBASE-5166: --- Attachment: 0003-Added-MultithreadedTableMapper-HBASE-5166.patch Modified patch MultiThreaded Table Mapper analogous to MultiThreaded Mapper in hadoop -- Key: HBASE-5166 URL: https://issues.apache.org/jira/browse/HBASE-5166 Project: HBase Issue Type: Improvement Reporter: Jai Kumar Singh Priority: Minor Labels: multithreaded, tablemapper Attachments: 0001-Added-MultithreadedTableMapper-HBASE-5166.patch, 0003-Added-MultithreadedTableMapper-HBASE-5166.patch Original Estimate: 0.5h Remaining Estimate: 0.5h HBase currently has no MultiThreadedTableMapper analogous to the MultiThreadedMapper that Hadoop provides for I/O-bound jobs. Use case, web crawler: take input (URLs) from an HBase table and put the content (URLs, content) back into HBase. Running this kind of HBase MapReduce job with the normal table mapper is quite slow because we are not fully utilizing the CPU (the job is network I/O bound). Moreover, I want to know whether it would be a good or bad idea to use HBase for this kind of use case.
[jira] [Commented] (HBASE-5203) Group atomic put/delete operation into a single WALEdit to handle region server failures.
[ https://issues.apache.org/jira/browse/HBASE-5203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187100#comment-13187100 ] ramkrishna.s.vasudevan commented on HBASE-5203: --- +1. :) Thanks Lars. Group atomic put/delete operation into a single WALEdit to handle region server failures. - Key: HBASE-5203 URL: https://issues.apache.org/jira/browse/HBASE-5203 Project: HBase Issue Type: Sub-task Components: client, coprocessors, regionserver Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.94.0 Attachments: 5203.txt HBASE-3584 does not provide fully atomic operations in case of region server failures (see explanation there). What should happen is that either (1) all edits are applied via a single WALEdit, or (2) the WALEdits are applied in async mode and then sync'ed together. For #1 it is not clear whether it is advisable to manage multiple *different* operations (Put/Delete) via a single WAL edit. A quick check reveals that WAL replay on region startup would work, but that replication would need to be adapted. The refactoring needed would be non-trivial. #2 might not actually work, as another operation could request sync'ing a later edit and hence flush these entries out as well.
[jira] [Updated] (HBASE-5204) Backward compatibility fixes for 0.92
[ https://issues.apache.org/jira/browse/HBASE-5204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benoit Sigoure updated HBASE-5204: -- Affects Version/s: (was: 0.92.0) Fix Version/s: (was: 0.94.0) Backward compatibility fixes for 0.92 - Key: HBASE-5204 URL: https://issues.apache.org/jira/browse/HBASE-5204 Project: HBase Issue Type: Bug Components: ipc Reporter: Benoit Sigoure Assignee: Benoit Sigoure Priority: Blocker Labels: backwards-compatibility Fix For: 0.92.0 Attachments: 0001-Add-some-backward-compatible-support-for-reading-old.patch, 0002-Make-sure-that-a-connection-always-uses-a-protocol.patch, 0003-Change-the-code-used-when-serializing-HTableDescript.patch, 5204-92.txt, 5204-trunk.txt Attached are 3 patches that are necessary to allow compatibility between HBase 0.90.x (and previous releases) and HBase 0.92.0. First of all, I'm well aware that 0.92.0 RC4 has been thumbed up by a lot of people and would probably wind up being released as 0.92.0 tomorrow, so I sincerely apologize for creating this issue so late in the process. I spent a lot of time trying to work around the quirks of 0.92 but once I realized that with a few very quasi-trivial changes compatibility would be made significantly easier, I immediately sent these 3 patches to Stack, who suggested I create this issue. The first patch is required as without it clients sending a 0.90-style RPC to a 0.92-style server causes the server to die uncleanly. It seems that 0.92 ships with {{\-XX:OnOutOfMemoryError=kill \-9 %p}}, and when a 0.92 server fails to deserialize a 0.90-style RPC, it attempts to allocate a large buffer because it doesn't read fields of 0.90-style RPCs properly. This allocation attempt immediately triggers an OOME, which causes the JVM to die abruptly of a {{SIGKILL}}. So whenever a 0.90.x client attempts to connect to HBase, it kills whichever RS is hosting the {{\-ROOT-}} region. 
The second patch fixes a bug introduced by HBASE-2002, which added support for letting clients specify what protocol they want to speak. If a client doesn't properly specify what protocol to use, the connection's {{protocol}} field will be left {{null}}, which causes any subsequent RPC on that connection to trigger an NPE in the server, even though the connection was successfully established from the client's point of view. The fix is to simply give the connection a default protocol, by assuming the client meant to speak to a RegionServer. The third patch fixes an oversight that slipped in HBASE-451, where a change to {{HbaseObjectWritable}} caused all the codes used to serialize {{Writables}} to shift by one. This was carefully avoided in other changes such as HBASE-1502, which cleanly removed entries for {{HMsg}} and {{HMsg[]}}, so I don't think this breakage in HBASE-451 was intended. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
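The third patch's problem, serialization codes shifting by one, can be shown with a toy code table. This is a sketch under assumptions, not {{HbaseObjectWritable}} itself: the class, the entry names, and the position-based code assignment are hypothetical and only illustrate why inserting an entry in the middle of such a table breaks older peers while appending does not.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical toy registry; codes are assigned by position in the list.
class CodeTableSketch {

    // Code sent on the wire for a type name (1-based position).
    static int encode(List<String> table, String name) {
        int index = table.indexOf(name);
        if (index < 0) {
            throw new IllegalArgumentException("unknown type: " + name);
        }
        return index + 1;
    }

    // Type name the receiver associates with a code.
    static String decode(List<String> table, int code) {
        return table.get(code - 1);
    }

    // Table used by an older client.
    static final List<String> OLD = Arrays.asList("Put", "Delete", "Scan");

    // New entry inserted in the middle: every later code shifts by one.
    static final List<String> SHIFTED = Arrays.asList("Put", "NewType", "Delete", "Scan");

    // New entry appended at the end: existing codes keep their meaning.
    static final List<String> APPENDED = Arrays.asList("Put", "Delete", "Scan", "NewType");

    public static void main(String[] args) {
        int code = encode(OLD, "Delete"); // what the old client sends
        System.out.println("shifted server reads:  " + decode(SHIFTED, code));
        System.out.println("appended server reads: " + decode(APPENDED, code));
    }
}
```

With the shifted table the server silently decodes a different type than the client sent, which is why the description stresses that earlier changes removed entries without renumbering the rest.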
[jira] [Updated] (HBASE-5204) Backward compatibility fixes for 0.92
[ https://issues.apache.org/jira/browse/HBASE-5204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benoit Sigoure updated HBASE-5204: -- Resolution: Fixed Status: Resolved (was: Patch Available) Thanks for the quick turnaround. And once again, sorry for submitting this so late in the release process. Backward compatibility fixes for 0.92 - Key: HBASE-5204 URL: https://issues.apache.org/jira/browse/HBASE-5204 Project: HBase Issue Type: Bug Components: ipc Reporter: Benoit Sigoure Assignee: Benoit Sigoure Priority: Blocker Labels: backwards-compatibility Fix For: 0.92.0 Attachments: 0001-Add-some-backward-compatible-support-for-reading-old.patch, 0002-Make-sure-that-a-connection-always-uses-a-protocol.patch, 0003-Change-the-code-used-when-serializing-HTableDescript.patch, 5204-92.txt, 5204-trunk.txt Attached are 3 patches that are necessary to allow compatibility between HBase 0.90.x (and previous releases) and HBase 0.92.0. First of all, I'm well aware that 0.92.0 RC4 has been thumbed up by a lot of people and would probably wind up being released as 0.92.0 tomorrow, so I sincerely apologize for creating this issue so late in the process. I spent a lot of time trying to work around the quirks of 0.92 but once I realized that with a few very quasi-trivial changes compatibility would be made significantly easier, I immediately sent these 3 patches to Stack, who suggested I create this issue. The first patch is required as without it clients sending a 0.90-style RPC to a 0.92-style server causes the server to die uncleanly. It seems that 0.92 ships with {{\-XX:OnOutOfMemoryError=kill \-9 %p}}, and when a 0.92 server fails to deserialize a 0.90-style RPC, it attempts to allocate a large buffer because it doesn't read fields of 0.90-style RPCs properly. This allocation attempt immediately triggers an OOME, which causes the JVM to die abruptly of a {{SIGKILL}}. 
So whenever a 0.90.x client attempts to connect to HBase, it kills whichever RS is hosting the {{\-ROOT-}} region. The second patch fixes a bug introduced by HBASE-2002, which added support for letting clients specify what protocol they want to speak. If a client doesn't properly specify what protocol to use, the connection's {{protocol}} field will be left {{null}}, which causes any subsequent RPC on that connection to trigger an NPE in the server, even though the connection was successfully established from the client's point of view. The fix is to simply give the connection a default protocol, by assuming the client meant to speak to a RegionServer. The third patch fixes an oversight that slipped in HBASE-451, where a change to {{HbaseObjectWritable}} caused all the codes used to serialize {{Writables}} to shift by one. This was carefully avoided in other changes such as HBASE-1502, which cleanly removed entries for {{HMsg}} and {{HMsg[]}}, so I don't think this breakage in HBASE-451 was intended. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.
[ https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187127#comment-13187127 ] Jonathan Hsieh commented on HBASE-5128: --- @Ted sounds good. [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online. - Key: HBASE-5128 URL: https://issues.apache.org/jira/browse/HBASE-5128 Project: HBase Issue Type: New Feature Components: hbck Affects Versions: 0.92.0, 0.90.5 Reporter: Jonathan Hsieh Assignee: Jonathan Hsieh

The current (0.90.5, 0.92.0rc2) versions of hbck detect most region consistency and table integrity invariant violations. However, with '-fix' it can only automatically repair region consistency cases having to do with deployment problems. This updated version should be able to handle all cases (including a new orphan regiondir case). When complete, it will likely deprecate the OfflineMetaRepair tool and subsume several open META-hole related issues. Here's the approach (from the comment at the top of the new version of the file).

{code}
/**
 * HBaseFsck (hbck) is a tool for checking and repairing region consistency and
 * table integrity.
 *
 * Region consistency checks verify that META, region deployment on
 * region servers and the state of data in HDFS (.regioninfo files) all are in
 * accordance.
 *
 * Table integrity checks verify that all possible row keys can resolve to
 * exactly one region of a table. This means there are no individual degenerate
 * or backwards regions; no holes between regions; and that there are no
 * overlapping regions.
 *
 * The general repair strategy works in these steps.
 * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
 * 2) Repair Region Consistency with META and assignments
 *
 * For table integrity repairs, the tables' region directories are scanned
 * for .regioninfo files. Each table's integrity is then verified. If there
 * are any orphan regions (regions with no .regioninfo files), or holes, new
 * regions are fabricated. Backwards regions are sidelined as well as empty
 * degenerate (endkey==startkey) regions. If there are any overlapping regions,
 * a new region is created and all data is merged into the new region.
 *
 * Table integrity repairs deal solely with HDFS and can be done offline -- the
 * hbase region servers or master do not need to be running. This phase can be
 * used to completely reconstruct the META table in an offline fashion.
 *
 * Region consistency requires three conditions -- 1) valid .regioninfo file
 * present in an hdfs region dir, 2) valid row with .regioninfo data in META,
 * and 3) a region is deployed only at the regionserver that it was assigned to.
 *
 * Region consistency requires hbck to contact the HBase master and region
 * servers, so connect() must first be called successfully. Much of the
 * region consistency information is transient and less risky to repair.
 */
{code}
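The table integrity invariants listed in the comment (no degenerate or backwards regions, no holes, no overlaps) can be checked with a short pass over the region boundaries. The following is a simplified sketch, not HBaseFsck: the class and its `Region` type are hypothetical, keys are plain non-empty strings, and the empty start and end keys that real tables use at their edges are not modeled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of the integrity checks; not the hbck implementation.
class TableIntegritySketch {

    static final class Region {
        final String startKey; // inclusive
        final String endKey;   // exclusive
        Region(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    // Returns one message per violated invariant; empty means the key
    // range from the first start key to the last end key is covered once.
    static List<String> check(List<Region> regions) {
        List<String> problems = new ArrayList<>();
        List<Region> valid = new ArrayList<>();
        for (Region r : regions) {
            int cmp = r.startKey.compareTo(r.endKey);
            if (cmp == 0) {
                problems.add("degenerate region at " + r.startKey);
            } else if (cmp > 0) {
                problems.add("backwards region " + r.startKey + ".." + r.endKey);
            } else {
                valid.add(r);
            }
        }
        valid.sort(Comparator.comparing((Region r) -> r.startKey));
        String coveredUpTo = null; // largest end key seen so far
        for (Region r : valid) {
            if (coveredUpTo != null) {
                int cmp = coveredUpTo.compareTo(r.startKey);
                if (cmp < 0) {
                    problems.add("hole between " + coveredUpTo + " and " + r.startKey);
                } else if (cmp > 0) {
                    problems.add("overlap before " + coveredUpTo + " at " + r.startKey);
                }
            }
            if (coveredUpTo == null || r.endKey.compareTo(coveredUpTo) > 0) {
                coveredUpTo = r.endKey;
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        List<Region> regions = Arrays.asList(
            new Region("a", "g"), new Region("h", "p"), new Region("m", "z"));
        System.out.println(check(regions));
    }
}
```

Tracking the largest end key seen so far, rather than only comparing neighbours, lets the pass also report a region that is fully contained in an earlier one.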
[jira] [Commented] (HBASE-5204) Backward compatibility fixes for 0.92
[ https://issues.apache.org/jira/browse/HBASE-5204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187129#comment-13187129 ] Hudson commented on HBASE-5204: --- Integrated in HBase-0.92-security #77 (See [https://builds.apache.org/job/HBase-0.92-security/77/]) HBASE-5204 Backward compatibility fixes for 0.92 stack : Files : * /hbase/branches/0.92/CHANGES.txt * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/ipc/Invocation.java Backward compatibility fixes for 0.92 - Key: HBASE-5204 URL: https://issues.apache.org/jira/browse/HBASE-5204 Project: HBase Issue Type: Bug Components: ipc Reporter: Benoit Sigoure Assignee: Benoit Sigoure Priority: Blocker Labels: backwards-compatibility Fix For: 0.92.0 Attachments: 0001-Add-some-backward-compatible-support-for-reading-old.patch, 0002-Make-sure-that-a-connection-always-uses-a-protocol.patch, 0003-Change-the-code-used-when-serializing-HTableDescript.patch, 5204-92.txt, 5204-trunk.txt Attached are 3 patches that are necessary to allow compatibility between HBase 0.90.x (and previous releases) and HBase 0.92.0. First of all, I'm well aware that 0.92.0 RC4 has been thumbed up by a lot of people and would probably wind up being released as 0.92.0 tomorrow, so I sincerely apologize for creating this issue so late in the process. I spent a lot of time trying to work around the quirks of 0.92 but once I realized that with a few very quasi-trivial changes compatibility would be made significantly easier, I immediately sent these 3 patches to Stack, who suggested I create this issue. The first patch is required as without it clients sending a 0.90-style RPC to a 0.92-style server causes the server to die uncleanly. 
It seems that 0.92 ships with {{\-XX:OnOutOfMemoryError=kill \-9 %p}}, and when a 0.92 server fails to deserialize a 0.90-style RPC, it attempts to allocate a large buffer because it doesn't read fields of 0.90-style RPCs properly. This allocation attempt immediately triggers an OOME, which causes the JVM to die abruptly of a {{SIGKILL}}. So whenever a 0.90.x client attempts to connect to HBase, it kills whichever RS is hosting the {{\-ROOT-}} region. The second patch fixes a bug introduced by HBASE-2002, which added support for letting clients specify what protocol they want to speak. If a client doesn't properly specify what protocol to use, the connection's {{protocol}} field will be left {{null}}, which causes any subsequent RPC on that connection to trigger an NPE in the server, even though the connection was successfully established from the client's point of view. The fix is to simply give the connection a default protocol, by assuming the client meant to speak to a RegionServer. The third patch fixes an oversight that slipped in HBASE-451, where a change to {{HbaseObjectWritable}} caused all the codes used to serialize {{Writables}} to shift by one. This was carefully avoided in other changes such as HBASE-1502, which cleanly removed entries for {{HMsg}} and {{HMsg[]}}, so I don't think this breakage in HBASE-451 was intended. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
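The third patch is easier to follow with a toy model of a positional code table. The sketch below assumes that codes are handed out in registration order, which is what makes an entry added in the middle shift every later code. `CodeTableSketch` and the names "A", "B", "C", "X" are placeholders, not the actual entries of {{HbaseObjectWritable}}.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a positional code table; the class names used with it are placeholders. */
final class CodeTableSketch {

    /** Assigns codes 1, 2, 3, ... in registration order. */
    static Map<String, Integer> assignCodes(List<String> classNames) {
        Map<String, Integer> codes = new LinkedHashMap<>();
        int next = 1;
        for (String name : classNames) {
            codes.put(name, next++);
        }
        return codes;
    }

    /** Names present in both tables whose code changed, i.e. entries that break old peers. */
    static List<String> shifted(Map<String, Integer> oldCodes, Map<String, Integer> newCodes) {
        List<String> changed = new ArrayList<>();
        for (Map.Entry<String, Integer> e : oldCodes.entrySet()) {
            Integer now = newCodes.get(e.getKey());
            if (now != null && !now.equals(e.getValue())) {
                changed.add(e.getKey());
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        Map<String, Integer> released = assignCodes(Arrays.asList("A", "B", "C"));
        Map<String, Integer> inserted = assignCodes(Arrays.asList("A", "X", "B", "C"));
        Map<String, Integer> appended = assignCodes(Arrays.asList("A", "B", "C", "X"));
        System.out.println("inserting in the middle shifts: " + shifted(released, inserted));
        System.out.println("appending at the end shifts: " + shifted(released, appended));
    }
}
```

In this model, appending new entries at the end leaves every previously released code untouched, which is the property an old client relies on when it deserializes by code.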
[jira] [Commented] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat
[ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187132#comment-13187132 ] Zhihong Yu commented on HBASE-5208: --- From https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2634/console: {code} Running org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan Tests run: 11, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 454.79 sec {code} It was indeed long - without the new test cases. Can you pick only a few of the test cases from TestTableInputFormatScan for your new tests ? Allow setting Scan start/stop row individually in TableInputFormat -- Key: HBASE-5208 URL: https://issues.apache.org/jira/browse/HBASE-5208 Project: HBase Issue Type: Improvement Components: mapreduce Reporter: Nicholas Telford Priority: Minor Attachments: HBASE-5208-001.txt, HBASE-5208-002.txt, HBASE-5208-003.txt Currently, TableInputFormat initializes a serialized Scan from hbase.mapreduce.scan. Alternatively, it will instantiate a new Scan using properties defined in hbase.mapreduce.scan.*. However, of these properties the start row and stop row (arguably the most pertinent) are missing. TableInputFormat should permit the specification of a start/stop row as with the other fields using a new pair of properties: hbase.mapreduce.scan.row.start and hbase.mapreduce.scan.row.end The primary use-case for this is to permit Oozie and other job management tools that can't call TableMapReduceUtil.initTableMapperJob() to operate on a contiguous subset of rows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.
[ https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187134#comment-13187134 ]

Jonathan Hsieh commented on HBASE-5128:
---

I've been testing using failed splits generated by cycling the hbase master while doing a heavy write load with a high split frequency, prior to the HBASE-5196 patch. A subset of the problems has been fixed automatically, but there seems to be a class of problems with splitting regions that isn't being handled properly. This is probably the case we are most likely to encounter.

[uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.
-
Key: HBASE-5128
URL: https://issues.apache.org/jira/browse/HBASE-5128
Project: HBase
Issue Type: New Feature
Components: hbck
Affects Versions: 0.92.0, 0.90.5
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh

The current (0.90.5, 0.92.0rc2) versions of hbck detect most region consistency and table integrity invariant violations. However, with '-fix' it can only automatically repair region consistency cases having to do with deployment problems. This updated version should be able to handle all cases (including a new orphan regiondir case). When complete, it will likely deprecate the OfflineMetaRepair tool and subsume several open META-hole related issues.

Here's the approach (from the comment at the top of the new version of the file).

{code}
/**
 * HBaseFsck (hbck) is a tool for checking and repairing region consistency and
 * table integrity.
 *
 * Region consistency checks verify that META, region deployment on
 * region servers and the state of data in HDFS (.regioninfo files) all are in
 * accordance.
 *
 * Table integrity checks verify that all possible row keys can resolve to
 * exactly one region of a table. This means there are no individual degenerate
 * or backwards regions; no holes between regions; and that there are no
 * overlapping regions.
 *
 * The general repair strategy works in these steps.
 * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
 * 2) Repair Region Consistency with META and assignments
 *
 * For table integrity repairs, the tables' region directories are scanned
 * for .regioninfo files. Each table's integrity is then verified. If there
 * are any orphan regions (regions with no .regioninfo files), or holes, new
 * regions are fabricated. Backwards regions are sidelined as well as empty
 * degenerate (endkey==startkey) regions. If there are any overlapping regions,
 * a new region is created and all data is merged into the new region.
 *
 * Table integrity repairs deal solely with HDFS and can be done offline -- the
 * hbase region servers or master do not need to be running. This phase can be
 * used to completely reconstruct the META table in an offline fashion.
 *
 * Region consistency requires three conditions -- 1) valid .regioninfo file
 * present in an hdfs region dir, 2) valid row with .regioninfo data in META,
 * and 3) a region is deployed only at the regionserver that it was assigned to.
 *
 * Region consistency requires hbck to contact the HBase master and region
 * servers, so the connect() must first be called successfully. Much of the
 * region consistency information is transient and less risky to repair.
 */
{code}
[jira] [Commented] (HBASE-5208) Allow setting Scan start/stop row individually in TableInputFormat
[ https://issues.apache.org/jira/browse/HBASE-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187143#comment-13187143 ]

Zhihong Yu commented on HBASE-5208:
---

Running the test based on patch v3 timed out. Here is the stack trace:
{code}
main prio=5 tid=101801000 nid=0x100601000 waiting on condition [1005fe000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at org.apache.hadoop.mapred.JobClient.monitorAndPrintJob(JobClient.java:1295)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:498)
    at org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan.testScanFromConfiguration(TestTableInputFormatScan.java:355)
    at org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan.testScanYZYToEmpty(TestTableInputFormatScan.java:319)
{code}
You can use the following command to verify that the new test case passes (just an example):
{code}
mvn test -P localTests -Dtest=TestTableInputFormatScan#testScanEmptyToEmpty
{code}

Allow setting Scan start/stop row individually in TableInputFormat
--
Key: HBASE-5208
URL: https://issues.apache.org/jira/browse/HBASE-5208
Project: HBase
Issue Type: Improvement
Components: mapreduce
Reporter: Nicholas Telford
Priority: Minor
Attachments: HBASE-5208-001.txt, HBASE-5208-002.txt, HBASE-5208-003.txt

Currently, TableInputFormat initializes a serialized Scan from hbase.mapreduce.scan. Alternatively, it will instantiate a new Scan using properties defined in hbase.mapreduce.scan.*. However, of these properties the start row and stop row (arguably the most pertinent) are missing. TableInputFormat should permit the specification of a start/stop row, as with the other fields, using a new pair of properties: hbase.mapreduce.scan.row.start and hbase.mapreduce.scan.row.end

The primary use-case for this is to permit Oozie and other job management tools that can't call TableMapReduceUtil.initTableMapperJob() to operate on a contiguous subset of rows.
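To make the intended semantics of the two proposed properties concrete, here is a minimal sketch. It assumes the property names exactly as proposed in this issue (the committed names may differ), uses a plain `Map` as a stand-in for the Hadoop `Configuration`, and compares rows as Java strings instead of byte arrays. As with `Scan`, the start row is treated as inclusive and the stop row as exclusive, and a missing property means "unbounded". `ScanRangeSketch` is a hypothetical name.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of start/stop row selection driven by job properties; not TableInputFormat. */
final class ScanRangeSketch {

    // Property names as proposed in this issue.
    static final String START = "hbase.mapreduce.scan.row.start";
    static final String END = "hbase.mapreduce.scan.row.end";

    /** True if the row falls in [start, stop); an absent or empty bound is unbounded. */
    static boolean inRange(Map<String, String> conf, String row) {
        String start = conf.getOrDefault(START, "");
        String stop = conf.getOrDefault(END, "");
        boolean atOrAfterStart = row.compareTo(start) >= 0;
        boolean beforeStop = stop.isEmpty() || row.compareTo(stop) < 0;
        return atOrAfterStart && beforeStop;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(START, "bbb");
        conf.put(END, "ddd");
        for (String row : new String[] {"aaa", "bbb", "ccc", "ddd"}) {
            System.out.println(row + " selected: " + inRange(conf, row));
        }
    }
}
```

A tool such as Oozie, which can only set string properties on a job, would supply just these two values to restrict the job to a contiguous subset of rows.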
[jira] [Assigned] (HBASE-5155) ServerShutDownHandler And Disable/Delete should not happen parallely leading to recreation of regions that were deleted
[ https://issues.apache.org/jira/browse/HBASE-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans reassigned HBASE-5155:
-
Assignee: ramkrishna.s.vasudevan

Please don't forget to set the assignee.

ServerShutDownHandler And Disable/Delete should not happen parallely leading to recreation of regions that were deleted
---
Key: HBASE-5155
URL: https://issues.apache.org/jira/browse/HBASE-5155
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.90.4
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Priority: Blocker
Fix For: 0.90.6
Attachments: HBASE-5155_1.patch, HBASE-5155_2.patch, HBASE-5155_3.patch, HBASE-5155_latest.patch, hbase-5155_6.patch

ServerShutDownHandler and the disable/delete table handler race. This is not an issue due to TM.
- A regionserver goes down. In our cluster the regionserver holds a lot of regions.
- A region R1 has two daughters D1 and D2.
- The ServerShutdownHandler gets called, scans META and gets all the user regions.
- In parallel, a table is disabled. (No problem in this step.)
- Delete table is done.
- The table and its regions are deleted, including R1, D1 and D2. (So META is cleaned.)
- Now ServerShutdownHandler starts to processDeadRegion:
{code}
if (hri.isOffline() && hri.isSplit()) {
  LOG.debug("Offlined and split region " + hri.getRegionNameAsString() +
    "; checking daughter presence");
  fixupDaughters(result, assignmentManager, catalogTracker);
{code}
As part of fixupDaughters, since the daughters D1 and D2 are missing for R1
{code}
if (isDaughterMissing(catalogTracker, daughter)) {
  LOG.info("Fixup; missing daughter " + daughter.getRegionNameAsString());
  MetaEditor.addDaughter(catalogTracker, daughter, null);
  // TODO: Log WARN if the regiondir does not exist in the fs. If its not
  // there then something wonky about the split -- things will keep going
  // but could be missing references to parent region.
  // And assign it.
  assignmentManager.assign(daughter, true);
{code}
we call assign on the daughters. After this we continue with the code below.
{code}
if (processDeadRegion(e.getKey(), e.getValue(),
    this.services.getAssignmentManager(),
    this.server.getCatalogTracker())) {
  this.services.getAssignmentManager().assign(e.getKey(), true);
{code}
When the SSH scanned META it had R1, D1 and D2. So, as part of the above code, D1 and D2, which were already assigned by fixupDaughters, are assigned again by
{code}
this.services.getAssignmentManager().assign(e.getKey(), true);
{code}
This leads to a zookeeper issue due to a bad version and kills the master. The important part here is that the regions that were deleted are recreated, which I think is more critical.
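The failure sequence above boils down to acting on a stale snapshot of META. The sketch below is not the HBASE-5155 patch; it is a hypothetical illustration of the two guards the description implies: re-check that a region still exists before assigning it, and never assign a region twice in one pass. `AssignGuardSketch` and its parameters are made-up names, and regions are modeled as plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical guard against assigning from a stale META snapshot; not the actual fix. */
final class AssignGuardSketch {

    /**
     * @param staleSnapshot   regions seen when META was scanned at the start of handling
     * @param currentMeta     regions present in META right now
     * @param alreadyAssigned regions assigned earlier in this pass (e.g. by a daughter fixup)
     * @return regions that are still safe to assign, in snapshot order
     */
    static List<String> regionsToAssign(List<String> staleSnapshot,
                                        Set<String> currentMeta,
                                        Set<String> alreadyAssigned) {
        List<String> toAssign = new ArrayList<>();
        Set<String> seen = new HashSet<>(alreadyAssigned);
        for (String region : staleSnapshot) {
            if (!currentMeta.contains(region)) {
                continue; // deleted since the scan: do not recreate it
            }
            if (!seen.add(region)) {
                continue; // already assigned once in this pass
            }
            toAssign.add(region);
        }
        return toAssign;
    }

    public static void main(String[] args) {
        List<String> snapshot = Arrays.asList("R1", "D1", "D2");
        Set<String> metaAfterDelete = new HashSet<>();
        System.out.println(regionsToAssign(snapshot, metaAfterDelete, new HashSet<>()));
    }
}
```

In the scenario from the description, the table was deleted after the scan, so nothing from the snapshot would be assigned and the deleted regions would not come back.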
[jira] [Commented] (HBASE-5207) Apply HBASE-5155 to trunk
[ https://issues.apache.org/jira/browse/HBASE-5207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187156#comment-13187156 ] Jean-Daniel Cryans commented on HBASE-5207: --- Collision with HBASE-5206? Apply HBASE-5155 to trunk -- Key: HBASE-5207 URL: https://issues.apache.org/jira/browse/HBASE-5207 Project: HBase Issue Type: Bug Reporter: ramkrishna.s.vasudevan The issue HBASE-5155 has been fixed on branch(0.90). The same has to be applied on trunk also. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5203) Group atomic put/delete operation into a single WALEdit to handle region server failures.
[ https://issues.apache.org/jira/browse/HBASE-5203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187163#comment-13187163 ]

Lars Hofhansl commented on HBASE-5203:
--

Ok... Will commit later today. @Stack: Wanna have a quick look?

Group atomic put/delete operation into a single WALEdit to handle region server failures.
-
Key: HBASE-5203
URL: https://issues.apache.org/jira/browse/HBASE-5203
Project: HBase
Issue Type: Sub-task
Components: client, coprocessors, regionserver
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Fix For: 0.94.0
Attachments: 5203.txt

HBASE-3584 does not provide fully atomic operations in case of region server failures (see explanation there). What should happen is that either (1) all edits are applied via a single WALEdit, or (2) the WALEdits are applied in async mode and then sync'ed together. For #1 it is not clear whether it is advisable to manage multiple *different* operations (Put/Delete) via a single WAL edit. A quick check reveals that WAL replay on region startup would work, but that replication would need to be adapted. The refactoring needed would be non-trivial. #2 might actually not work, as another operation could request sync'ing a later edit and hence flush these entries out as well.
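Option (1) from the description can be modeled in a few lines: the whole put/delete batch becomes one log entry, and the in-memory state is only touched after that entry is appended. This is a toy sketch, not `HRegion.mutateRow(...)`: `AtomicMutateSketch` and `Edit` are hypothetical names, the "WAL" is an in-memory list, and row locking, syncing and MVCC are left out entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

/** Toy model of "one WAL entry per atomic batch"; not HRegion.mutateRow itself. */
final class AtomicMutateSketch {

    /** A single put or delete on one column of the row (hypothetical type). */
    static final class Edit {
        final String column;
        final String value; // null marks a delete

        Edit(String column, String value) {
            this.column = column;
            this.value = value;
        }
    }

    final List<List<Edit>> wal = new ArrayList<>();            // one element per WAL entry
    final TreeMap<String, String> memstore = new TreeMap<>();  // column -> current value

    /** Logs the whole batch as ONE entry, and only then applies it to the memstore. */
    void mutateRow(List<Edit> edits) {
        wal.add(new ArrayList<>(edits));
        apply(memstore, edits);
    }

    /** Rebuilds state from the log, as recovery after a region server crash would. */
    static TreeMap<String, String> replay(List<List<Edit>> wal) {
        TreeMap<String, String> state = new TreeMap<>();
        for (List<Edit> entry : wal) {
            apply(state, entry);
        }
        return state;
    }

    private static void apply(TreeMap<String, String> state, List<Edit> edits) {
        for (Edit e : edits) {
            if (e.value == null) {
                state.remove(e.column);
            } else {
                state.put(e.column, e.value);
            }
        }
    }

    public static void main(String[] args) {
        AtomicMutateSketch region = new AtomicMutateSketch();
        region.mutateRow(Arrays.asList(new Edit("a", "1"), new Edit("b", "2")));
        region.mutateRow(Arrays.asList(new Edit("a", "3"), new Edit("b", null)));
        System.out.println("memstore: " + region.memstore);
        System.out.println("replayed: " + replay(region.wal));
    }
}
```

Because a batch is a single log entry in this model, replay either sees all of its edits or none of them, which is the property that separate per-operation entries cannot give when the server dies between appends.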
[jira] [Commented] (HBASE-3489) .oldlogs not being cleaned out
[ https://issues.apache.org/jira/browse/HBASE-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187172#comment-13187172 ] Josh Wymer commented on HBASE-3489: --- We are seeing this on our replication cluster using 0.90.4. The /hbase/.oldlogs is filled with logs that are ~ 1 month old. .oldlogs not being cleaned out -- Key: HBASE-3489 URL: https://issues.apache.org/jira/browse/HBASE-3489 Project: HBase Issue Type: Bug Affects Versions: 0.90.0 Environment: 10 Nodes Write Heavy Cluster Reporter: Wayne Attachments: oldlog.txt The .oldlogs folder is never being cleaned up. The hbase.master.logcleaner.ttl has been set to clean up the old logs but the clean up is never kicking in. The limit of 10 files is not the problem. After running for 5 days not a single log file has ever been deleted and the logcleaner is set to 2 days (from the default of 7 days). It is assumed that the replication changes that want to be sure to keep these logs around if needed have caused the cleanup to be blocked. There is no replication defined (knowingly). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5196) Failure in region split after PONR could cause region hole
[ https://issues.apache.org/jira/browse/HBASE-5196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187217#comment-13187217 ]

Todd Lipcon commented on HBASE-5196:

Should this also be committed to the 0.90 branch?

Failure in region split after PONR could cause region hole
--
Key: HBASE-5196
URL: https://issues.apache.org/jira/browse/HBASE-5196
Project: HBase
Issue Type: Bug
Components: master, regionserver
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Fix For: 0.92.0, 0.94.0
Attachments: 5196-v2.txt

If a region split fails after the PONR, it relies on the master's ServerShutdown handler to fix it. However, if the master doesn't get a chance to fix it, there will be a hole in the region chain.
[jira] [Created] (HBASE-5211) org.apache.hadoop.hbase.replication.TestMultiSlaveReplication#testMultiSlaveReplication is flakey
org.apache.hadoop.hbase.replication.TestMultiSlaveReplication#testMultiSlaveReplication is flakey
-
Key: HBASE-5211
URL: https://issues.apache.org/jira/browse/HBASE-5211
Project: HBase
Issue Type: Bug
Reporter: Alex Newman
Attachments: trunk.txt

I can't seem to get this test to pass consistently on my laptop. My Hudson also occasionally trips up on it.
[jira] [Updated] (HBASE-5211) org.apache.hadoop.hbase.replication.TestMultiSlaveReplication#testMultiSlaveReplication is flakey
[ https://issues.apache.org/jira/browse/HBASE-5211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Newman updated HBASE-5211:
---
Attachment: trunk.txt

I attached a log file.

org.apache.hadoop.hbase.replication.TestMultiSlaveReplication#testMultiSlaveReplication is flakey
-
Key: HBASE-5211
URL: https://issues.apache.org/jira/browse/HBASE-5211
Project: HBase
Issue Type: Bug
Reporter: Alex Newman
Attachments: trunk.txt

I can't seem to get this test to pass consistently on my laptop. My Hudson also occasionally trips up on it.
[jira] [Commented] (HBASE-5153) Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
[ https://issues.apache.org/jira/browse/HBASE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187277#comment-13187277 ]

Lars Hofhansl commented on HBASE-5153:
--

+1 on latest patch

Add retry logic in HConnectionImplementation#resetZooKeeperTrackers
---
Key: HBASE-5153
URL: https://issues.apache.org/jira/browse/HBASE-5153
Project: HBase
Issue Type: Bug
Components: client
Affects Versions: 0.90.4
Reporter: Jieshan Bean
Assignee: Jieshan Bean
Fix For: 0.90.6
Attachments: 5153-trunk.txt, HBASE-5153-V2.patch, HBASE-5153-V3.patch, HBASE-5153-V4-90.patch, HBASE-5153-V5-90.patch, HBASE-5153-V6-90-minorchange.patch, HBASE-5153-V6-90.txt, HBASE-5153-trunk-v2.patch, HBASE-5153-trunk.patch, HBASE-5153.patch, TestResults-hbase5153.out

HBASE-4893 is related to this issue. From that issue we know that if multiple threads share the same connection, then once the connection is aborted in one thread, the other threads will get a "HConnectionManager$HConnectionImplementation@18fb1f7 closed" exception. That fix solved the problem of the stale connection not being removed, but the original HTable instance cannot continue to be used. The connection in HTable should be recreated.

There are two approaches to solve this:
1. In user code, once an IOE is caught, close the connection and re-create the HTable instance. We can use this as a workaround.
2. On the HBase client side, catch this exception and re-create the connection.
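Both approaches listed in the description share the same shape: catch the IOException, rebuild the connection, and try again a bounded number of times. The sketch below shows that shape only. It does not use the HBase client API; `ReconnectRetrySketch`, `IoCall` and `Reconnect` are hypothetical names, and the `reconnect` callback stands in for "close the connection and re-create the HTable instance".

```java
import java.io.IOException;
import java.io.UncheckedIOException;

/** Generic retry-with-reconnect wrapper; the HBase-specific parts are left to the callbacks. */
final class ReconnectRetrySketch {

    /** An operation that may fail with an IOException (e.g. a call on a closed connection). */
    interface IoCall<T> {
        T call() throws IOException;
    }

    /** Hypothetical hook that discards the stale connection and builds a fresh one. */
    interface Reconnect {
        void run();
    }

    /** Runs op, reconnecting between failed attempts, up to maxAttempts tries in total. */
    static <T> T callWithRetry(IoCall<T> op, Reconnect reconnect, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (IOException e) {
                last = e;
                if (attempt < maxAttempts) {
                    reconnect.run(); // only reconnect if another attempt will follow
                }
            }
        }
        throw new UncheckedIOException(last);
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String result = callWithRetry(() -> {
            if (calls[0]++ < 1) {
                throw new IOException("connection closed");
            }
            return "row data";
        }, () -> System.out.println("re-creating connection"), 3);
        System.out.println(result);
    }
}
```

Wrapping the final failure in an unchecked exception is a choice made for brevity here. Code that wants callers to handle the failure explicitly could rethrow the original IOException instead.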
[jira] [Updated] (HBASE-5203) Group atomic put/delete operation into a single WALEdit to handle region server failures.
[ https://issues.apache.org/jira/browse/HBASE-5203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-5203:
-
Attachment: 5203-v3.txt

Double checking latest patch (same as on RB)

Group atomic put/delete operation into a single WALEdit to handle region server failures.
-
Key: HBASE-5203
URL: https://issues.apache.org/jira/browse/HBASE-5203
Project: HBase
Issue Type: Sub-task
Components: client, coprocessors, regionserver
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Fix For: 0.94.0
Attachments: 5203-v3.txt, 5203.txt

HBASE-3584 does not provide fully atomic operations in case of region server failures (see explanation there). What should happen is that either (1) all edits are applied via a single WALEdit, or (2) the WALEdits are applied in async mode and then sync'ed together. For #1 it is not clear whether it is advisable to manage multiple *different* operations (Put/Delete) via a single WAL edit. A quick check reveals that WAL replay on region startup would work, but that replication would need to be adapted. The refactoring needed would be non-trivial. #2 might actually not work, as another operation could request sync'ing a later edit and hence flush these entries out as well.
[jira] [Updated] (HBASE-5203) Group atomic put/delete operation into a single WALEdit to handle region server failures.
[ https://issues.apache.org/jira/browse/HBASE-5203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-5203:
-
Status: Patch Available (was: Open)

Group atomic put/delete operation into a single WALEdit to handle region server failures.
-
Key: HBASE-5203
URL: https://issues.apache.org/jira/browse/HBASE-5203
Project: HBase
Issue Type: Sub-task
Components: client, coprocessors, regionserver
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Fix For: 0.94.0
Attachments: 5203-v3.txt, 5203.txt

HBASE-3584 does not provide fully atomic operations in case of region server failures (see explanation there). What should happen is that either (1) all edits are applied via a single WALEdit, or (2) the WALEdits are applied in async mode and then sync'ed together. For #1 it is not clear whether it is advisable to manage multiple *different* operations (Put/Delete) via a single WAL edit. A quick check reveals that WAL replay on region startup would work, but that replication would need to be adapted. The refactoring needed would be non-trivial. #2 might actually not work, as another operation could request sync'ing a later edit and hence flush these entries out as well.
[jira] [Created] (HBASE-5212) Fix test TestTableMapReduce against 0.23.
Fix test TestTableMapReduce against 0.23. - Key: HBASE-5212 URL: https://issues.apache.org/jira/browse/HBASE-5212 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Mahadev konar Fix For: 0.92.1 As reported by Andrew on the hadoop mailing list, mvn -Dhadoop.profile=23 clean test -Dtest=org.apache.hadoop.hbase.mapreduce.TestTableMapReduce fails on 0.92 branch. There are minor changes to HBase poms required to fix that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5191) Fix compilation error against hadoop 0.23.1
[ https://issues.apache.org/jira/browse/HBASE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187298#comment-13187298 ] Mahadev konar commented on HBASE-5191: -- @Ted, Thanks for running through this. Any further update? Can I help in any way? Fix compilation error against hadoop 0.23.1 --- Key: HBASE-5191 URL: https://issues.apache.org/jira/browse/HBASE-5191 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Zhihong Yu Fix For: 0.92.0, 0.94.0 Attachments: 5191.txt From Mahadev: I just checked out 0.92 branch and tried running: mvn -Dhadoop.profile=23 clean test -Dtest=org.apache.hadoop.hbase.mapreduce.TestTableMapReduce Looks like a compilation issue: [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile (default-testCompile) on project hbase: Compilation failure [ERROR] /Users/mahadev/workspace/hbase-workspace/hbase-git/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java:[341,33] cannot find symbol [ERROR] symbol : variable dnRegistration [ERROR] location: class org.apache.hadoop.hdfs.server.datanode.DataNode [ERROR] - [Help 1] [ERROR] -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5191) Fix compilation error against hadoop 0.23.1
[ https://issues.apache.org/jira/browse/HBASE-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187300#comment-13187300 ]

Zhihong Yu commented on HBASE-5191:

I haven't found out why the assertion doesn't fail in HBase trunk - I basically used an equivalent dfsCluster.stopDataNode() call. Since any patch has to make TestLogRolling pass for hadoop 1.0, I am still searching for the transformation. This effort was partially sidetracked by work on 0.92.

Fix compilation error against hadoop 0.23.1
Key: HBASE-5191
URL: https://issues.apache.org/jira/browse/HBASE-5191
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Zhihong Yu
Fix For: 0.92.0, 0.94.0
Attachments: 5191.txt

From Mahadev: I just checked out 0.92 branch and tried running:

mvn -Dhadoop.profile=23 clean test -Dtest=org.apache.hadoop.hbase.mapreduce.TestTableMapReduce

Looks like a compilation issue:

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile (default-testCompile) on project hbase: Compilation failure
[ERROR] /Users/mahadev/workspace/hbase-workspace/hbase-git/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java:[341,33] cannot find symbol
[ERROR] symbol : variable dnRegistration
[ERROR] location: class org.apache.hadoop.hdfs.server.datanode.DataNode
[ERROR] - [Help 1]
[ERROR]
[jira] [Commented] (HBASE-5211) org.apache.hadoop.hbase.replication.TestMultiSlaveReplication#testMultiSlaveReplication is flakey
[ https://issues.apache.org/jira/browse/HBASE-5211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187308#comment-13187308 ]

Lars Hofhansl commented on HBASE-5211:

Hmmm... This seems to be the crux...?

{code}
2012-01-16 13:15:39,965 ERROR [Thread-3] hbase.MiniHBaseCluster(201): Error starting cluster
java.lang.RuntimeException: Master not initialized after 200 seconds
    at org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:206)
    at org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:420)
    at org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:196)
    at org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:76)
    at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:627)
    at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:601)
    at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:549)
    at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:518)
    at org.apache.hadoop.hbase.replication.TestMultiSlaveReplication.testMultiSlaveReplication(TestMultiSlaveReplication.java:121)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
    at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:62)
{code}
org.apache.hadoop.hbase.replication.TestMultiSlaveReplication#testMultiSlaveReplication is flakey
Key: HBASE-5211
URL: https://issues.apache.org/jira/browse/HBASE-5211
Project: HBase
Issue Type: Bug
Reporter: Alex Newman
Attachments: trunk.txt

I can't seem to get this test to pass consistently on my laptop. Also, my Hudson occasionally trips up on it.
[jira] [Commented] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid
[ https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187309#comment-13187309 ] Lars Hofhansl commented on HBASE-2600: -- These three always fail it seems: org.apache.hadoop.hbase.mapreduce.TestImportTsv org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid Key: HBASE-2600 URL: https://issues.apache.org/jira/browse/HBASE-2600 Project: HBase Issue Type: Bug Reporter: stack Assignee: Alex Newman Attachments: 0001-Changed-regioninfo-format-to-use-endKey-instead-of-s.patch, 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v2.patch, 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen.patch This is an idea that Ryan and I have been kicking around on and off for a while now. If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row). If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region. This issue is about changing the way we name regions. If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward). Converting to the new method, we'd have to run a migration on startup changing the content in meta. Up to this, the randomid component of a region name has been the timestamp of region creation. 
HBASE-2531 32-bit encoding of regionnames waaay too susceptible to hash clashes proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change.
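The lookup argument in the HBASE-2600 description can be illustrated with a small sketch. This is not HBase code: a `TreeMap` stands in for the sorted meta table, the class and method names are hypothetical, region names are reduced to bare keys (no table name or randomid), and the `LAST` sentinel for the open-ended final region is an assumption made only for this example. With start-key naming the containing region is the closest entry at or before the row (a backward lookup, in the spirit of getClosestRowBefore); with end-key naming it is the first entry found by moving forward from the row.

```java
import java.util.Map;
import java.util.TreeMap;

public class RegionLookupSketch {
    // Hypothetical sentinel for the last region's open end key; sorts after every row used here.
    static final String LAST = "\uffff";

    /** Meta keyed by start key: backward lookup, largest start key <= row. */
    public static String findByStartKey(TreeMap<String, String> metaByStart, String row) {
        Map.Entry<String, String> e = metaByStart.floorEntry(row);
        return e == null ? null : e.getValue();
    }

    /** Meta keyed by end key: forward lookup, first end key > row (end keys treated as exclusive). */
    public static String findByEndKey(TreeMap<String, String> metaByEnd, String row) {
        Map.Entry<String, String> e = metaByEnd.higherEntry(row);
        return e == null ? null : e.getValue();
    }

    public static void main(String[] args) {
        // Three regions of one table: ["", "g"), ["g", "p"), ["p", end)
        TreeMap<String, String> byStart = new TreeMap<>();
        byStart.put("", "region-1");
        byStart.put("g", "region-2");
        byStart.put("p", "region-3");

        TreeMap<String, String> byEnd = new TreeMap<>();
        byEnd.put("g", "region-1");
        byEnd.put("p", "region-2");
        byEnd.put(LAST, "region-3");

        // Both layouts must agree on the containing region for every row.
        for (String row : new String[] {"apple", "g", "kiwi", "zebra"}) {
            System.out.println(row + " -> " + findByStartKey(byStart, row)
                + " / " + findByEndKey(byEnd, row));
        }
    }
}
```

Both methods return the same region for each row; the difference is only the direction of the lookup, which is the point the description makes about scans being the natural access pattern.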
[jira] [Commented] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid
[ https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187311#comment-13187311 ] Alex Newman commented on HBASE-2600: On all jenkins? Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid Key: HBASE-2600 URL: https://issues.apache.org/jira/browse/HBASE-2600 Project: HBase Issue Type: Bug Reporter: stack Assignee: Alex Newman Attachments: 0001-Changed-regioninfo-format-to-use-endKey-instead-of-s.patch, 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v2.patch, 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen.patch This is an idea that Ryan and I have been kicking around on and off for a while now. If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row). If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region. This issue is about changing the way we name regions. If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward). Converting to the new method, we'd have to run a migration on startup changing the content in meta. Up to this, the randomid component of a region name has been the timestamp of region creation. HBASE-2531 32-bit encoding of regionnames waaay too susceptible to hash clashes proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. 
If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change.
[jira] [Updated] (HBASE-5211) org.apache.hadoop.hbase.replication.TestMultiSlaveReplication#testMultiSlaveReplication is flakey
[ https://issues.apache.org/jira/browse/HBASE-5211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Newman updated HBASE-5211:

Attachment: log2.txt

Lars, you're right, although this is the other error I get.

org.apache.hadoop.hbase.replication.TestMultiSlaveReplication#testMultiSlaveReplication is flakey
Key: HBASE-5211
URL: https://issues.apache.org/jira/browse/HBASE-5211
Project: HBase
Issue Type: Bug
Reporter: Alex Newman
Attachments: log2.txt, trunk.txt

I can't seem to get this test to pass consistently on my laptop. Also, my Hudson occasionally trips up on it.
[jira] [Commented] (HBASE-5203) Group atomic put/delete operation into a single WALEdit to handle region server failures.
[ https://issues.apache.org/jira/browse/HBASE-5203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187315#comment-13187315 ]

Hadoop QA commented on HBASE-5203:

-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12510770/5203-v3.txt against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 6 new or modified tests.
-1 javadoc. The javadoc tool appears to have generated -145 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 82 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.mapreduce.TestImportTsv
org.apache.hadoop.hbase.mapred.TestTableMapReduce
org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/781//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/781//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/781//console

This message is automatically generated.

Group atomic put/delete operation into a single WALEdit to handle region server failures.
Key: HBASE-5203
URL: https://issues.apache.org/jira/browse/HBASE-5203
Project: HBase
Issue Type: Sub-task
Components: client, coprocessors, regionserver
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Fix For: 0.94.0
Attachments: 5203-v3.txt, 5203.txt

HBASE-3584 does not provide fully atomic operation in case of region server failures (see explanation there). What should happen is that either (1) all edits are applied via a single WALEdit, or (2) the WALEdits are applied in async mode and then sync'ed together.

For #1 it is not clear whether it is advisable to manage multiple *different* operations (Put/Delete) via a single WAL edit. A quick check reveals that WAL replay on region startup would work, but that replication would need to be adapted. The refactoring needed would be non-trivial.

#2 might actually not work, as another operation could request sync'ing a later edit and hence flush these entries out as well.
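Option (1) above can be sketched in a few lines. This is a toy model and not the HRegion.mutateRow(...) implementation: the `Edit` class, the in-memory `wal` list and the `memstore` map are hypothetical stand-ins, and a real log append involves syncing to the filesystem. The sketch only shows the ordering: every put and delete for the row goes into one log entry, the entry is appended in one call, and the in-memory state is changed only afterwards, so replaying the log after a crash sees either the whole group or none of it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

public class AtomicMutationSketch {
    enum Type { PUT, DELETE }

    /** One cell-level edit; a hypothetical stand-in for a KeyValue. */
    static final class Edit {
        final Type type;
        final String column;
        final String value;

        Edit(Type type, String column, String value) {
            this.type = type;
            this.column = column;
            this.value = value;
        }
    }

    /** The log: each element is one group of edits appended together (one "WALEdit"). */
    final List<List<Edit>> wal = new ArrayList<>();
    /** In-memory state of a single row: column -> value. */
    final TreeMap<String, String> memstore = new TreeMap<>();

    /** Append all edits as ONE log entry, and only then apply them in memory. */
    void mutateRow(List<Edit> edits) {
        wal.add(new ArrayList<>(edits)); // single append covering puts and deletes alike
        apply(memstore, edits);
    }

    static void apply(TreeMap<String, String> store, List<Edit> edits) {
        for (Edit e : edits) {
            if (e.type == Type.PUT) {
                store.put(e.column, e.value);
            } else {
                store.remove(e.column);
            }
        }
    }

    /** Recovery: rebuild the row by replaying whole log entries in order. */
    static TreeMap<String, String> replay(List<List<Edit>> wal) {
        TreeMap<String, String> store = new TreeMap<>();
        for (List<Edit> entry : wal) {
            apply(store, entry);
        }
        return store;
    }

    public static void main(String[] args) {
        AtomicMutationSketch region = new AtomicMutationSketch();
        region.mutateRow(Arrays.asList(
            new Edit(Type.PUT, "a", "1"), new Edit(Type.PUT, "b", "2")));
        // A mixed group: one put and one delete in the same log entry.
        region.mutateRow(Arrays.asList(
            new Edit(Type.PUT, "c", "3"), new Edit(Type.DELETE, "a", null)));
        System.out.println(region.memstore);
        System.out.println(replay(region.wal).equals(region.memstore));
    }
}
```

Because a log entry may now hold edits of different types, any consumer that walks log entries (replay here, replication in the issue text) has to dispatch on the type of each edit rather than on the type of the entry.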
[jira] [Updated] (HBASE-5196) Failure in region split after PONR could cause region hole
[ https://issues.apache.org/jira/browse/HBASE-5196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jimmy Xiang updated HBASE-5196:

Attachment: hbase-5196_0.90.txt

Failure in region split after PONR could cause region hole
Key: HBASE-5196
URL: https://issues.apache.org/jira/browse/HBASE-5196
Project: HBase
Issue Type: Bug
Components: master, regionserver
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Fix For: 0.92.0, 0.94.0
Attachments: 5196-v2.txt, hbase-5196_0.90.txt

If region split fails after PONR, it relies on the master ServerShutdown handler to fix it. However, if the master doesn't get a chance to fix it, there will be a hole in the region chain.
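To make "a hole in the region chain" concrete, here is a minimal sketch of how such a gap could be detected. It is not taken from the attached patches or from any HBase tool: the `Region` class and `findHoles` method are hypothetical, keys are plain strings, and the open-ended end key of the last region is simply never compared. The assumed scenario is a parent region that has gone offline after a split while only one of its daughters is listed.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class RegionChainSketch {
    /** A region covering [startKey, endKey); hypothetical stand-in for region metadata. */
    static final class Region {
        final String startKey;
        final String endKey;

        Region(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    /** Returns every gap between consecutive regions, formatted as "[end, nextStart)". */
    static List<String> findHoles(List<Region> regions) {
        List<Region> sorted = new ArrayList<>(regions);
        sorted.sort(Comparator.comparing((Region r) -> r.startKey));
        List<String> holes = new ArrayList<>();
        for (int i = 0; i + 1 < sorted.size(); i++) {
            String end = sorted.get(i).endKey;
            String nextStart = sorted.get(i + 1).startKey;
            // In an unbroken chain each region ends exactly where the next one starts.
            if (end.compareTo(nextStart) < 0) {
                holes.add("[" + end + ", " + nextStart + ")");
            }
        }
        return holes;
    }

    public static void main(String[] args) {
        List<Region> regions = new ArrayList<>();
        regions.add(new Region("", "g"));
        regions.add(new Region("g", "k")); // first daughter of the split parent ["g", "p")
        // the second daughter ["k", "p") is missing from the list
        regions.add(new Region("p", ""));
        System.out.println(findHoles(regions));
    }
}
```

Rows that fall inside a reported gap have no region to serve them, which is why the issue treats a split that fails after PONR and is never repaired as a correctness problem.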
[jira] [Commented] (HBASE-5196) Failure in region split after PONR could cause region hole
[ https://issues.apache.org/jira/browse/HBASE-5196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187318#comment-13187318 ]

Jimmy Xiang commented on HBASE-5196:

I attached a patch for the 0.90 branch: hbase-5196_0.90.txt

Could anyone please check it in?

Failure in region split after PONR could cause region hole
Key: HBASE-5196
URL: https://issues.apache.org/jira/browse/HBASE-5196
Project: HBase
Issue Type: Bug
Components: master, regionserver
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Fix For: 0.92.0, 0.94.0
Attachments: 5196-v2.txt, hbase-5196_0.90.txt

If region split fails after PONR, it relies on the master ServerShutdown handler to fix it. However, if the master doesn't get a chance to fix it, there will be a hole in the region chain.
[jira] [Commented] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid
[ https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187320#comment-13187320 ] Lars Hofhansl commented on HBASE-2600: -- Something to do with the Hadoop version on the jenkins machines. Ted might know the details. Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid Key: HBASE-2600 URL: https://issues.apache.org/jira/browse/HBASE-2600 Project: HBase Issue Type: Bug Reporter: stack Assignee: Alex Newman Attachments: 0001-Changed-regioninfo-format-to-use-endKey-instead-of-s.patch, 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v2.patch, 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen.patch This is an idea that Ryan and I have been kicking around on and off for a while now. If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row). If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region. This issue is about changing the way we name regions. If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward). Converting to the new method, we'd have to run a migration on startup changing the content in meta. Up to this, the randomid component of a region name has been the timestamp of region creation. 
HBASE-2531 32-bit encoding of regionnames waaay too susceptible to hash clashes proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change.
[jira] [Commented] (HBASE-5196) Failure in region split after PONR could cause region hole
[ https://issues.apache.org/jira/browse/HBASE-5196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187322#comment-13187322 ]

Zhihong Yu commented on HBASE-5196:

@Jimmy: Have you run the 0.90 test suite over the new patch?

Failure in region split after PONR could cause region hole
Key: HBASE-5196
URL: https://issues.apache.org/jira/browse/HBASE-5196
Project: HBase
Issue Type: Bug
Components: master, regionserver
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Fix For: 0.92.0, 0.94.0
Attachments: 5196-v2.txt, hbase-5196_0.90.txt

If region split fails after PONR, it relies on the master ServerShutdown handler to fix it. However, if the master doesn't get a chance to fix it, there will be a hole in the region chain.
[jira] [Updated] (HBASE-5212) Fix test TestTableMapReduce against 0.23.
[ https://issues.apache.org/jira/browse/HBASE-5212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated HBASE-5212:

Attachment: HBASE-5212.patch

Thanks to Hitesh for helping me out on this. This should fix most of the issues with 0.23 tests.

Fix test TestTableMapReduce against 0.23.
Key: HBASE-5212
URL: https://issues.apache.org/jira/browse/HBASE-5212
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Mahadev konar
Fix For: 0.92.1
Attachments: HBASE-5212.patch

As reported by Andrew on the hadoop mailing list, mvn -Dhadoop.profile=23 clean test -Dtest=org.apache.hadoop.hbase.mapreduce.TestTableMapReduce fails on 0.92 branch. There are minor changes to HBase poms required to fix that.
[jira] [Updated] (HBASE-5212) Fix test TestTableMapReduce against 0.23.
[ https://issues.apache.org/jira/browse/HBASE-5212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated HBASE-5212:

Status: Patch Available (was: Open)

Fix test TestTableMapReduce against 0.23.
Key: HBASE-5212
URL: https://issues.apache.org/jira/browse/HBASE-5212
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Mahadev konar
Fix For: 0.92.1
Attachments: HBASE-5212.patch

As reported by Andrew on the hadoop mailing list, mvn -Dhadoop.profile=23 clean test -Dtest=org.apache.hadoop.hbase.mapreduce.TestTableMapReduce fails on 0.92 branch. There are minor changes to HBase poms required to fix that.
[jira] [Commented] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid
[ https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187330#comment-13187330 ] jirapos...@reviews.apache.org commented on HBASE-2600: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3466/#review4418 --- This looks pretty good. Thanks for being persistent and patient Alex! Devil is probably still in the details. All the getClosestBefore huh hah can now be removed from HTable/Region[Server]/Store, right? src/main/java/org/apache/hadoop/hbase/HRegionInfo.java https://reviews.apache.org/r/3466/#comment9924 ! and Although it's not very intuitive. So the encoded region is now? tableName!,endKey,... tableName,endKey,... Is that simpler than replacing the separator? That could look like this: tableName,endKey,... tableName/endKey,... src/main/java/org/apache/hadoop/hbase/HRegionInfo.java https://reviews.apache.org/r/3466/#comment9923 addEncoding does not use the startKey. Could just remove it from there, and hence from here as well so that this method just needs to know the endKey. src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java https://reviews.apache.org/r/3466/#comment9925 I like this. Captures what it is doing without being too complicated. src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java https://reviews.apache.org/r/3466/#comment9926 Why is this needed? src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java https://reviews.apache.org/r/3466/#comment9927 Yeah... Be gone! - Lars On 2012-01-16 18:26:39, Alex Newman wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/3466/ bq. --- bq. bq. (Updated 2012-01-16 18:26:39) bq. bq. bq. Review request for hbase and Michael Stack. bq. bq. bq. Summary bq. --- bq. bq. This is an idea that Ryan and I have been kicking around on and off for a while now. bq. bq. 
If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row). bq. bq. If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region. bq. bq. This issue is about changing the way we name regions. bq. bq. If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward). bq. bq. Converting to the new method, we'd have to run a migration on startup changing the content in meta. bq. bq. Up to this, the randomid component of a region name has been the timestamp of region creation. HBASE-2531 32-bit encoding of regionnames waaay too susceptible to hash clashes proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname... changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change. bq. bq. bq. This addresses bug HBASE-2600. bq. https://issues.apache.org/jira/browse/HBASE-2600 bq. bq. bq. Diffs bq. - bq. 
bq. src/main/java/org/apache/hadoop/hbase/HConstants.java 904e2d2
bq. src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 74cb821
bq. src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java 133759d
bq. src/main/java/org/apache/hadoop/hbase/KeyValue.java be7e2d8
bq. src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java e5e60a8
bq. src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 88c381f
bq. src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 99f90b2
bq. src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java f0c6828
bq. src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java 8f4f4b8
bq. src/main/java/org/apache/hadoop/hbase/rest/RegionsResource.java bf85bc1
bq.
[jira] [Commented] (HBASE-5212) Fix test TestTableMapReduce against 0.23.
[ https://issues.apache.org/jira/browse/HBASE-5212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187331#comment-13187331 ]

Hadoop QA commented on HBASE-5212:

-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12510776/HBASE-5212.patch against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
-1 javadoc. The javadoc tool appears to have generated -145 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 82 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/782//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/782//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/782//console

This message is automatically generated.

Fix test TestTableMapReduce against 0.23.
Key: HBASE-5212
URL: https://issues.apache.org/jira/browse/HBASE-5212
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Mahadev konar
Fix For: 0.92.1
Attachments: HBASE-5212.patch

As reported by Andrew on the hadoop mailing list, mvn -Dhadoop.profile=23 clean test -Dtest=org.apache.hadoop.hbase.mapreduce.TestTableMapReduce fails on 0.92 branch. There are minor changes to HBase poms required to fix that.
[jira] [Commented] (HBASE-5196) Failure in region split after PONR could cause region hole
[ https://issues.apache.org/jira/browse/HBASE-5196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187332#comment-13187332 ]

Jimmy Xiang commented on HBASE-5196:

@Ted, I ran the test suite, and verified the fix on CDH3u3. Let me run the test suite on 0.90 now.

Failure in region split after PONR could cause region hole
Key: HBASE-5196
URL: https://issues.apache.org/jira/browse/HBASE-5196
Project: HBase
Issue Type: Bug
Components: master, regionserver
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Fix For: 0.92.0, 0.94.0
Attachments: 5196-v2.txt, hbase-5196_0.90.txt

If region split fails after PONR, it relies on the master ServerShutdown handler to fix it. However, if the master doesn't get a chance to fix it, there will be a hole in the region chain.
[jira] [Commented] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid
[ https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187333#comment-13187333 ] Zhihong Yu commented on HBASE-2600: --- See MAPREDUCE-3583 for background on test failures for: {code} org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat {code} TestMetaMigrationRemovingHTD needs attention for this feature. Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid Key: HBASE-2600 URL: https://issues.apache.org/jira/browse/HBASE-2600 Project: HBase Issue Type: Bug Reporter: stack Assignee: Alex Newman Attachments: 0001-Changed-regioninfo-format-to-use-endKey-instead-of-s.patch, 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v2.patch, 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen.patch This is an idea that Ryan and I have been kicking around on and off for a while now. If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row). If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region. This issue is about changing the way we name regions. If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward). Converting to the new method, we'd have to run a migration on startup changing the content in meta. 
Up to now, the randomid component of a region name has been the timestamp of region creation. HBASE-2531 ("32-bit encoding of regionnames waaay too susceptible to hash clashes") proposes changing the randomid so that it contains the actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta, because as is, the region name in the filesystem is a hash of the regionname. Changing the format of the regionname would mean we generate a different hash, so we'd need HBASE-2531 to be in place before we could do this change.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5212) Fix test TestTableMapReduce against 0.23.
[ https://issues.apache.org/jira/browse/HBASE-5212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187337#comment-13187337 ]

Zhihong Yu commented on HBASE-5212:
-----------------------------------

{code}
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile (default-testCompile) on project hbase: Compilation failure
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java:[1279,4] cannot find symbol
[ERROR] symbol : variable c
[ERROR] location: class org.apache.hadoop.hbase.HBaseTestingUtility
{code}

Fix test TestTableMapReduce against 0.23.

Key: HBASE-5212
URL: https://issues.apache.org/jira/browse/HBASE-5212
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Mahadev konar
Fix For: 0.92.1
Attachments: HBASE-5212.patch

As reported by Andrew on the hadoop mailing list, mvn -Dhadoop.profile=23 clean test -Dtest=org.apache.hadoop.hbase.mapreduce.TestTableMapReduce fails on 0.92 branch. There are minor changes to HBase poms required to fix that.