[jira] [Created] (HBASE-6489) incorrect TaskTracker logfile name
liang xie created HBASE-6489:
--------------------------------

             Summary: incorrect TaskTracker logfile name
                 Key: HBASE-6489
                 URL: https://issues.apache.org/jira/browse/HBASE-6489
             Project: HBase
          Issue Type: Bug
          Components: documentation
    Affects Versions: 0.94.0
            Reporter: liang xie
            Priority: Minor

In http://hbase.apache.org/book/trouble.log.html,

    TaskTracker: $HADOOP_HOME/logs/hadoop-user-jobtracker-hostname.log

should be

    TaskTracker: $HADOOP_HOME/logs/hadoop-user-tasktracker-hostname.log

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6435) Reading WAL files after a recovery leads to time lost in HDFS timeouts when using dead datanodes
[ https://issues.apache.org/jira/browse/HBASE-6435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-6435:
---------------------------
    Status: Open  (was: Patch Available)

Reading WAL files after a recovery leads to time lost in HDFS timeouts when using dead datanodes
------------------------------------------------------------------------------------------------
                 Key: HBASE-6435
                 URL: https://issues.apache.org/jira/browse/HBASE-6435
             Project: HBase
          Issue Type: Improvement
          Components: master, regionserver
    Affects Versions: 0.96.0
            Reporter: nkeywal
            Assignee: nkeywal
         Attachments: 6435.unfinished.patch, 6435.v2.patch, 6435.v7.patch, 6435.v8.patch, 6435.v9.patch

HBase writes a Write-Ahead Log (WAL) so that it can recover from hardware failure. This log is written with 'append' on HDFS. Through ZooKeeper, HBase usually learns within 30s that it should start the recovery process, which means reading the WAL to replay the edits on the other servers.

In standard deployments, the HBase processes (regionservers) run on the same boxes as the datanodes. This means that when a box dies, we have actually lost one replica of the edits, as we lost both the regionserver and the datanode. Since HDFS only marks a node as dead after ~10 minutes, the dead datanode still appears available when we try to read the blocks during recovery. As a result, the recovery process is delayed by 60 seconds, as the read will usually fail with a socket timeout. If the file is still open for writing, this adds an extra 20s, plus a risk of losing edits if we connect over IPC to the dead DN.

Possible solutions are:
- shorter dead-datanode detection by the NN. Requires an NN code change.
- better dead-datanode management in the DFSClient. Requires a DFS code change.
- NN customisation to write the WAL files to another DN instead of the local one.
- reordering the blocks returned by the NN on the client side, to put the blocks on the same DN as the dead RS at the end of the priority queue. Requires a DFS code change or a workaround of some kind.

The solution retained is the last one.

Compared to what was discussed on the mailing list, the proposed patch does not modify the HDFS source code but adds a proxy instead, for two reasons:
- Some HDFS functions managing block order are static (MD5MD5CRC32FileChecksum). Implementing the hook in the DFSClient would require implementing the fix only partially, changing the DFS interface to make this function non-static, or making the hook static. None of these solutions is very clean.
- Adding a proxy keeps all the code in HBase, simplifying dependency management.

Nevertheless, it would be better to have this in HDFS. An HDFS-side fix could target only the latest version, which would allow minimal interface changes such as making those methods non-static. Moreover, writing the blocks to a non-local DN would be an even better long-term solution.
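The retained solution — reordering the block locations returned by the NameNode so that replicas on the dead regionserver's host are tried last — can be sketched roughly as follows. This is an illustrative sketch only, not the attached patch: the class and method names (BlockLocationSorter, deprioritizeDeadHost) are made up, and locations are modelled as plain hostnames rather than HDFS LocatedBlock objects.

```java
import java.util.ArrayList;
import java.util.List;

public class BlockLocationSorter {
  /**
   * Returns the block locations with any entry on deadHost moved to the
   * end, so the client tries the (likely dead) co-located datanode last
   * instead of waiting out a socket timeout on it first.
   */
  static List<String> deprioritizeDeadHost(List<String> locations, String deadHost) {
    List<String> live = new ArrayList<>();
    List<String> suspect = new ArrayList<>();
    for (String host : locations) {
      if (host.equals(deadHost)) {
        suspect.add(host);  // datanode co-located with the dead RS: try last
      } else {
        live.add(host);
      }
    }
    live.addAll(suspect);   // keep the suspect replicas, just demoted
    return live;
  }
}
```

In the actual patch, as described above, the reordering is done behind a proxy around the NameNode calls, so it happens transparently inside the HBase client without changing HDFS itself.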
[jira] [Updated] (HBASE-6435) Reading WAL files after a recovery leads to time lost in HDFS timeouts when using dead datanodes
[ https://issues.apache.org/jira/browse/HBASE-6435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-6435:
---------------------------
    Attachment: 6435.v9.patch
[jira] [Updated] (HBASE-6435) Reading WAL files after a recovery leads to time lost in HDFS timeouts when using dead datanodes
[ https://issues.apache.org/jira/browse/HBASE-6435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-6435:
---------------------------
    Status: Patch Available  (was: Open)
[jira] [Commented] (HBASE-6435) Reading WAL files after a recovery leads to time lost in HDFS timeouts when using dead datanodes
[ https://issues.apache.org/jira/browse/HBASE-6435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426371#comment-13426371 ]

nkeywal commented on HBASE-6435:
--------------------------------
Thanks for the review and the test failure analysis, Ted. v9 takes the comments into account.
[jira] [Commented] (HBASE-6052) Convert .META. and -ROOT- content to pb
[ https://issues.apache.org/jira/browse/HBASE-6052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426380#comment-13426380 ]

stack commented on HBASE-6052:
------------------------------
Thanks Enis. I can't get to this for a while yet. Getting rid of -ROOT- would mean we then had two rather than three comparators for KVs.

Convert .META. and -ROOT- content to pb
---------------------------------------
                 Key: HBASE-6052
                 URL: https://issues.apache.org/jira/browse/HBASE-6052
             Project: HBase
          Issue Type: Sub-task
            Reporter: stack
            Assignee: stack
            Priority: Blocker
             Fix For: 0.96.0
[jira] [Updated] (HBASE-6489) incorrect TaskTracker logfile name
[ https://issues.apache.org/jira/browse/HBASE-6489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

liang xie updated HBASE-6489:
-----------------------------
    Attachment: 6489.txt
[jira] [Updated] (HBASE-6489) incorrect TaskTracker logfile name
[ https://issues.apache.org/jira/browse/HBASE-6489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

liang xie updated HBASE-6489:
-----------------------------
    Status: Patch Available  (was: Open)
[jira] [Commented] (HBASE-6454) Write PB definitions for filters
[ https://issues.apache.org/jira/browse/HBASE-6454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426486#comment-13426486 ]

Zhihong Ted Yu commented on HBASE-6454:
---------------------------------------
Patch integrated to trunk. Thanks for the patch, Gregory.

Write PB definitions for filters
--------------------------------
                 Key: HBASE-6454
                 URL: https://issues.apache.org/jira/browse/HBASE-6454
             Project: HBase
          Issue Type: Task
          Components: ipc, migration
            Reporter: Gregory Chanan
            Assignee: Gregory Chanan
             Fix For: 0.96.0
         Attachments: HBASE-6454.patch

See HBASE-5447. Conversion to protobuf requires writing protobuf definitions.
[jira] [Commented] (HBASE-6489) incorrect TaskTracker logfile name
[ https://issues.apache.org/jira/browse/HBASE-6489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426503#comment-13426503 ]

Hadoop QA commented on HBASE-6489:
----------------------------------
-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12538760/6489.txt
against trunk revision .

    +1 @author. The patch does not contain any @author tags.
    +0 tests included. The patch appears to be a documentation patch that doesn't require tests.
    +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    -1 javac. The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings).
    -1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    +1 core tests. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2471//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2471//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2471//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2471//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2471//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2471//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2471//console

This message is automatically generated.
[jira] [Commented] (HBASE-6478) TestClassLoading.testClassLoadingFromLibDirInJar in coprocessor may appear to fail due to waitTableAvailable
[ https://issues.apache.org/jira/browse/HBASE-6478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426505#comment-13426505 ]

zhou wenjian commented on HBASE-6478:
-------------------------------------
@Andrew Purtell Maybe we can use waitTableEnabled instead of waitTableAvailable.

TestClassLoading.testClassLoadingFromLibDirInJar in coprocessor may appear to fail due to waitTableAvailable
------------------------------------------------------------------------------------------------------------
                 Key: HBASE-6478
                 URL: https://issues.apache.org/jira/browse/HBASE-6478
             Project: HBase
          Issue Type: Bug
          Components: test
    Affects Versions: 0.94.0
            Reporter: zhou wenjian
             Fix For: 0.96.0, 0.94.2
         Attachments: HBASE-6478-trunk.patch

When hudson ran for HBASE-6459, it encountered a failed testcase in org.apache.hadoop.hbase.coprocessor.TestClassLoading.testClassLoadingFromLibDirInJar. The link is https://builds.apache.org/job/PreCommit-HBASE-Build/2455/testReport/org.apache.hadoop.hbase.coprocessor/TestClassLoading/testClassLoadingFromLibDirInJar/

I checked the log and found that the function waitTableAvailable only checks the meta table: when the rs opens the region and updates the meta location in meta, the region may not yet have been added to the online regions in the rs. So in

{code}
for (HRegion region: hbase.getRegionServer(0).getOnlineRegionsLocalContext()) {
{code}

this loop is skipped, and found1 stays false altogether. That's why the testcase failed. So maybe we should have some stricter check for when the table is created.
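The stricter wait suggested above — blocking until the table's regions are actually present in the regionserver's online set, not merely recorded in .META. — could look roughly like this. This is a hedged sketch: TableWait, the waitTableEnabled signature, and the isRegionOnline callback (standing in for a check against getOnlineRegionsLocalContext()) are all illustrative names, not HBase API.

```java
import java.util.List;
import java.util.function.Predicate;

public class TableWait {
  /**
   * Polls until every region of the table passes the online check, so the
   * caller does not proceed on the strength of .META. alone.
   * Returns false if the timeout elapses first.
   */
  static boolean waitTableEnabled(List<String> regions,
                                  Predicate<String> isRegionOnline,
                                  long timeoutMs) {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (System.currentTimeMillis() < deadline) {
      if (regions.stream().allMatch(isRegionOnline)) {
        return true;  // every region is really online, not just in .META.
      }
      try {
        Thread.sleep(50);  // back off before re-checking
      } catch (InterruptedException ie) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return false;
  }
}
```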
[jira] [Commented] (HBASE-6454) Write PB definitions for filters
[ https://issues.apache.org/jira/browse/HBASE-6454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426523#comment-13426523 ]

Hudson commented on HBASE-6454:
-------------------------------
Integrated in HBase-TRUNK #3187 (See [https://builds.apache.org/job/HBase-TRUNK/3187/])
HBASE-6454 Write PB definitions for filters (Gregory Chanan) (Revision 1367904)

Result = SUCCESS
tedyu :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/HTable.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/protobuf/RequestConverter.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ClientProtos.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/protobuf/generated/HBaseProtos.java
* /hbase/trunk/hbase-server/src/main/protobuf/Client.proto
* /hbase/trunk/hbase-server/src/main/protobuf/Filter.proto
* /hbase/trunk/hbase-server/src/main/protobuf/hbase.proto
[jira] [Commented] (HBASE-6454) Write PB definitions for filters
[ https://issues.apache.org/jira/browse/HBASE-6454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426538#comment-13426538 ]

Hudson commented on HBASE-6454:
-------------------------------
Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #117 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/117/])
HBASE-6454 Write PB definitions for filters (Gregory Chanan) (Revision 1367904)

Result = FAILURE
tedyu :
Files : same file list as the HBase-TRUNK #3187 build notification above.
[jira] [Commented] (HBASE-6435) Reading WAL files after a recovery leads to time lost in HDFS timeouts when using dead datanodes
[ https://issues.apache.org/jira/browse/HBASE-6435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426595#comment-13426595 ]

Zhihong Ted Yu commented on HBASE-6435:
---------------------------------------
From PreCommit build #2470, it looks like compilation against Hadoop 2.0 failed.
[jira] [Commented] (HBASE-6460) hbck -repairHoles shortcut doesn't enable -fixHdfsOrphans
[ https://issues.apache.org/jira/browse/HBASE-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426596#comment-13426596 ]

Jie Huang commented on HBASE-6460:
----------------------------------
bq. When you say ignore, do you mean treat it as a warning as opposed to an error?

Without that *-fixHdfsOrphans* option, hbck won't try to fix the problem as you'd like. Regarding the shortcut, from my perspective we can let -repairHoles fix both items at the same time, which is quite acceptable literally. I also wonder if it would be OK to have another shortcut for -fixAssignments -fixMeta, which aims to fix the meta-related parts. We could then combine any single option (like -fixHdfsHoles) with those shortcuts. What do you think?

hbck -repairHoles shortcut doesn't enable -fixHdfsOrphans
---------------------------------------------------------
                 Key: HBASE-6460
                 URL: https://issues.apache.org/jira/browse/HBASE-6460
             Project: HBase
          Issue Type: Bug
          Components: hbck
    Affects Versions: 0.94.0, 0.96.0
            Reporter: Jie Huang
            Priority: Minor
         Attachments: hbase-6460.patch

According to hbck's help info, the -repairHoles shortcut enables -fixHdfsOrphans, as below:

{noformat}
-repairHoles  Shortcut for -fixAssignments -fixMeta -fixHdfsHoles -fixHdfsOrphans
{noformat}

However, in the implementation, fsck.setFixHdfsOrphans(false); is called for -repairHoles. This is not consistent with the usage information.
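The reported inconsistency is easy to see in miniature: the shortcut sets fixHdfsOrphans to false even though the help text says it enables it. Below is a minimal sketch of the behaviour the help text promises; the class and field names are illustrative, not the actual HBaseFsck code.

```java
// Illustrative model of the -repairHoles shortcut (not HBaseFsck itself).
public class RepairHolesShortcut {
  boolean fixAssignments;
  boolean fixMeta;
  boolean fixHdfsHoles;
  boolean fixHdfsOrphans;

  /** -repairHoles: shortcut for -fixAssignments -fixMeta -fixHdfsHoles -fixHdfsOrphans */
  void setRepairHoles() {
    fixAssignments = true;
    fixMeta = true;
    fixHdfsHoles = true;
    fixHdfsOrphans = true;  // the reported bug effectively set this to false
  }
}
```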
[jira] [Commented] (HBASE-6435) Reading WAL files after a recovery leads to time lost in HDFS timeouts when using dead datanodes
[ https://issues.apache.org/jira/browse/HBASE-6435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426656#comment-13426656 ] nkeywal commented on HBASE-6435: Is there a way to get more info on the failure? Locally {noformat} mvn test -Dhadoop.profile=2.0 {noformat} says {noformat} Tests in error: testSimpleCase(org.apache.hadoop.hbase.mapreduce.TestImportExport) testWithDeletes(org.apache.hadoop.hbase.mapreduce.TestImportExport) Tests run: 719, Failures: 0, Errors: 2, Skipped: 2 {noformat} and .TestBlockReorder is ok (executed 5 times)

Reading WAL files after a recovery leads to time lost in HDFS timeouts when using dead datanodes Key: HBASE-6435 URL: https://issues.apache.org/jira/browse/HBASE-6435 Project: HBase Issue Type: Improvement Components: master, regionserver Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Attachments: 6435.unfinished.patch, 6435.v2.patch, 6435.v7.patch, 6435.v8.patch, 6435.v9.patch

HBase writes a Write-Ahead-Log to recover from hardware failure. This log is written with 'append' on HDFS. Through ZooKeeper, HBase is usually informed within 30s that it should start the recovery process, which means reading the Write-Ahead-Log to replay the edits on the other servers. In standard deployments, HBase processes (regionservers) are deployed on the same boxes as the datanodes, so when a box stops we have actually lost one replica of the edits: both the regionserver and the datanode are gone. As HDFS only marks a node as dead after ~10 minutes, the dead datanode still appears available when we try to read the blocks to recover. As a result, the recovery process is delayed by 60 seconds, as the read will usually fail with a socket timeout. If the file is still open for writing, this adds an extra 20s, plus a risk of losing edits if we connect over IPC to the dead DN. Possible solutions are: - shorter dead-datanode detection by the NN (requires a NN code change); - better dead-datanode management in the DFSClient (requires a DFS code change); - NN customisation to write the WAL files on another DN instead of the local one; - reordering the blocks returned by the NN on the client side, to put the blocks on the same DN as the dead RS at the end of the priority queue (requires a DFS code change or a kind of workaround). The solution retained is the last one.

Compared to what was discussed on the mailing list, the proposed patch does not modify HDFS source code but adds a proxy, for two reasons: - Some HDFS functions managing block order are static (MD5MD5CRC32FileChecksum). Implementing the hook in the DFSClient would require implementing the fix only partially, changing the DFS interface to make this function non-static, or making the hook static; none of these solutions is very clean. - Adding a proxy lets all the code live in HBase, simplifying dependency management. Nevertheless, it would be better to have this in HDFS; a fix there could target only the latest version, which would allow minimal interface changes such as non-static methods. Moreover, writing the blocks to a non-local DN would be an even better long-term solution.

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
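The retained approach, reordering the located blocks on the client side so that the replica hosted on the dead regionserver's datanode is tried last, can be reduced to a small sketch. The class and method names below are illustrative only (the actual patch hooks the DFSClient through a proxy); block locations are modeled as plain hostnames:

```java
import java.util.ArrayList;
import java.util.List;

public class BlockReorderSketch {
    // Move replicas located on the dead regionserver's host to the end of the
    // candidate list, so the client tries live datanodes first and only falls
    // back to the suspect host if everything else fails. The relative order
    // of the other replicas is preserved.
    public static List<String> reorder(List<String> replicaHosts, String deadHost) {
        List<String> live = new ArrayList<>();
        List<String> suspect = new ArrayList<>();
        for (String host : replicaHosts) {
            (host.equals(deadHost) ? suspect : live).add(host);
        }
        live.addAll(suspect);
        return live;
    }
}
```

With replicas on dn1, dn2, dn3 and dn1 co-located with the dead regionserver, reorder returns [dn2, dn3, dn1]: the socket timeout against the dead datanode is only paid if both live replicas fail.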
[jira] [Commented] (HBASE-6479) HFileReaderV1 caching the same parent META block could cause server abort when splitting
[ https://issues.apache.org/jira/browse/HBASE-6479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426668#comment-13426668 ] Zhihong Ted Yu commented on HBASE-6479: --- @Chunhui: Can you include testWholesomeSplitWithHFileV1 in your patch to show that the problem is fixed? Thanks

HFileReaderV1 caching the same parent META block could cause server abort when splitting --- Key: HBASE-6479 URL: https://issues.apache.org/jira/browse/HBASE-6479 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: chunhui shen Assignee: chunhui shen Attachments: HBASE-6479.patch, test.patch

If the hfile's version is 1, the two daughter regions load their bloom filters concurrently during the open step of a split. Because their META block is the same one (the parent's META block), the following exception is thrown in HFileReaderV1#getMetaBlock: {code} java.io.IOException: Failed null-daughterOpener=af73f8c9a9b409531ac211a9a7f92eba at org.apache.hadoop.hbase.regionserver.SplitTransaction.openDaughters(SplitTransaction.java:367) at org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:453) at org.apache.hadoop.hbase.regionserver.TestSplitTransaction.testWholesomeSplit(TestSplitTransaction.java:225) at org.apache.hadoop.hbase.regionserver.TestSplitTransaction.testWholesomeSplitWithHFileV1(TestSplitTransaction.java:203) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) at
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30) at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:47) at org.junit.rules.RunRules.evaluate(RunRules.java:18) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222) at org.junit.runners.ParentRunner.run(ParentRunner.java:300) at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:49) at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38) at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467) at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683) at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390) at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197) Caused by: java.io.IOException: java.io.IOException: java.lang.RuntimeException: Cached an already cached block at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:540) at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:463) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3784) at org.apache.hadoop.hbase.regionserver.SplitTransaction.openDaughterRegion(SplitTransaction.java:506) at 
org.apache.hadoop.hbase.regionserver.SplitTransaction$DaughterOpener.run(SplitTransaction.java:486) at java.lang.Thread.run(Thread.java:662) Caused by: java.io.IOException: java.lang.RuntimeException: Cached an already cached block at org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:424) at org.apache.hadoop.hbase.regionserver.Store.init(Store.java:271) at
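The underlying race can be reduced to a toy model: a cache that rejects duplicate inserts, fed the same parent META block by two concurrent daughter openers. The class below is a stand-in, not HBase's actual BlockCache; it only illustrates why a tolerant, putIfAbsent-style insert avoids the "Cached an already cached block" abort:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class MetaBlockCacheSketch {
    private final ConcurrentMap<String, byte[]> cache = new ConcurrentHashMap<>();

    // Strict insert, mirroring the failing path: the second daughter to cache
    // the shared parent META block hits this exception.
    public void cacheStrict(String key, byte[] block) {
        if (cache.putIfAbsent(key, block) != null) {
            throw new RuntimeException("Cached an already cached block");
        }
    }

    // Tolerant insert: caching a block that is already present is a no-op,
    // which is what two daughters sharing one parent META block need.
    public void cacheTolerant(String key, byte[] block) {
        cache.putIfAbsent(key, block);
    }
}
```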
[jira] [Commented] (HBASE-6358) Bulkloading from remote filesystem is problematic
[ https://issues.apache.org/jira/browse/HBASE-6358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426680#comment-13426680 ] Harsh J commented on HBASE-6358: Hey Dave, Will you be doing the patch? We can probably mark-deprecate this feature in 0.96, and remove it in the release after that?

Bulkloading from remote filesystem is problematic - Key: HBASE-6358 URL: https://issues.apache.org/jira/browse/HBASE-6358 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.0 Reporter: Dave Revell Assignee: Dave Revell

Bulk loading hfiles that don't live on the same filesystem as HBase can cause problems for subtle reasons. In Store.bulkLoadHFile(), the regionserver will copy the source hfile to its own filesystem if it's not already there. Since this can take a long time for large hfiles, it's likely that the client will time out and retry. When the client retries repeatedly, there may be several bulkload operations in flight for the same hfile, causing lots of unnecessary IO and tying up handler threads. This can seriously impact performance. In my case, the cluster became unusable and the regionservers had to be kill -9'ed. Possible solutions: # Require that hfiles already be on the same filesystem as HBase in order for bulkloading to succeed. The copy could be handled by LoadIncrementalHFiles before the regionserver is called. # Others? I'm not familiar with Hadoop IPC, so there may be tricks to extend the timeout or something else. I'm willing to write a patch but I'd appreciate recommendations on how to proceed.
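Solution 1 amounts to a precondition on the hfile's location. A filesystem in Hadoop is identified by the scheme and authority of its URI, so the check can be sketched with plain java.net.URI rather than Hadoop's FileSystem/Path types (the class and method names here are hypothetical, not from the patch):

```java
import java.net.URI;
import java.util.Objects;

public class BulkloadFsCheckSketch {
    // Reject hfiles that do not live on the same filesystem as the HBase
    // root directory: same scheme (e.g. hdfs) and same authority
    // (namenode host:port). The path component is irrelevant.
    public static boolean sameFileSystem(URI hfile, URI hbaseRoot) {
        return Objects.equals(hfile.getScheme(), hbaseRoot.getScheme())
            && Objects.equals(hfile.getAuthority(), hbaseRoot.getAuthority());
    }
}
```

A regionserver enforcing this would fail fast instead of silently copying a large remote hfile inside a handler thread; the copy would then be the client-side job of LoadIncrementalHFiles.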
[jira] [Commented] (HBASE-6435) Reading WAL files after a recovery leads to time lost in HDFS timeouts when using dead datanodes
[ https://issues.apache.org/jira/browse/HBASE-6435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426682#comment-13426682 ] Zhihong Ted Yu commented on HBASE-6435: --- The compilation in PreCommit build was aborted. I couldn't reproduce the issue. Suggest re-attaching patch v9.
[jira] [Updated] (HBASE-6435) Reading WAL files after a recovery leads to time lost in HDFS timeouts when using dead datanodes
[ https://issues.apache.org/jira/browse/HBASE-6435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-6435: --- Attachment: 6435.v9.patch
[jira] [Updated] (HBASE-6435) Reading WAL files after a recovery leads to time lost in HDFS timeouts when using dead datanodes
[ https://issues.apache.org/jira/browse/HBASE-6435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-6435: --- Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-6435) Reading WAL files after a recovery leads to time lost in HDFS timeouts when using dead datanodes
[ https://issues.apache.org/jira/browse/HBASE-6435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426683#comment-13426683 ] nkeywal commented on HBASE-6435: done :-)
[jira] [Commented] (HBASE-6473) Deleted table is not deleted completely, some regions may still be online
[ https://issues.apache.org/jira/browse/HBASE-6473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426701#comment-13426701 ] Jimmy Xiang commented on HBASE-6473: Not sure how the patch can fix the issue. It seems to me we should fix this in the unassignment handler: if the table is disabled, don't assign it again, right?

Deleted table is not deleted completely, some regions may still be online --- Key: HBASE-6473 URL: https://issues.apache.org/jira/browse/HBASE-6473 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: zhou wenjian Fix For: 0.96.0, 0.94.2 Attachments: HBASE-6473-trunk.patch

Consider this scenario: we have a table called T1 with one region, A. 1. Move A from rs1 to rs2; A is now closed. 2. Disable T1. 3. Delete T1. When we disable T1, the disable handler just sets the ZK state to disabled, and A will still be assigned. When A is opened, A's entry in regions-in-transition is cleaned out. At that point DeleteTable finds it safe to delete all regions and the table in meta and on the filesystem, and it also deletes the ZK node of T1. {code} while (System.currentTimeMillis() < done) { AssignmentManager.RegionState rs = am.isRegionInTransition(region); if (rs == null) break; Threads.sleep(waitingTimeForEvents); LOG.debug("Waiting on region to clear regions in transition; " + rs); } if (am.isRegionInTransition(region) != null) { throw new IOException("Waited hbase.master.wait.on.region (" + waitTime + "ms) for region to leave region " + region.getRegionNameAsString() + " in transitions"); } {code} However, A is still being unassigned; when the close of A finishes, the regionserver finds that the disabled state in ZK has been deleted, and A is assigned again.
[jira] [Commented] (HBASE-6460) hbck -repairHoles shortcut doesn't enable -fixHdfsOrphans
[ https://issues.apache.org/jira/browse/HBASE-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426709#comment-13426709 ] Jimmy Xiang commented on HBASE-6460: I agree with Jon. It's better to fix the documentation, because we don't usually touch orphans when fixing holes. You can always combine different options for your use case. If it makes sense, we can have other shortcuts too; however, we don't want too many shortcuts, to avoid confusion.

hbck -repairHoles shortcut doesn't enable -fixHdfsOrphans - Key: HBASE-6460 URL: https://issues.apache.org/jira/browse/HBASE-6460 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 0.94.0, 0.96.0 Reporter: Jie Huang Priority: Minor Attachments: hbase-6460.patch

According to hbck's help info, the -repairHoles shortcut enables -fixHdfsOrphans, as below: {noformat} -repairHoles Shortcut for -fixAssignments -fixMeta -fixHdfsHoles -fixHdfsOrphans {noformat} However, in the implementation, fsck.setFixHdfsOrphans(false); is called for -repairHoles. This is not consistent with the usage information.
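The inconsistency is easy to state as code. The flag names below paraphrase HBaseFsck's setters; treat this as a model of the reported behaviour, not the actual class:

```java
public class RepairHolesSketch {
    boolean fixAssignments, fixMeta, fixHdfsHoles, fixHdfsOrphans;

    // What the usage text claims -repairHoles expands to.
    void repairHolesAsDocumented() {
        fixAssignments = true;
        fixMeta = true;
        fixHdfsHoles = true;
        fixHdfsOrphans = true;
    }

    // What the implementation actually does: orphans are explicitly left
    // alone, which is the mismatch this issue reports. The direction agreed
    // above is to fix the help text rather than the behaviour.
    void repairHolesAsImplemented() {
        fixAssignments = true;
        fixMeta = true;
        fixHdfsHoles = true;
        fixHdfsOrphans = false;
    }
}
```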
[jira] [Commented] (HBASE-6435) Reading WAL files after a recovery leads to time lost in HDFS timeouts when using dead datanodes
[ https://issues.apache.org/jira/browse/HBASE-6435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426714#comment-13426714 ] Hadoop QA commented on HBASE-6435: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12538789/6435.v9.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 8 new or modified tests. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings). -1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.client.TestAdmin org.apache.hadoop.hbase.fs.TestBlockReorder Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2472//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2472//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2472//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2472//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2472//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2472//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2472//console This message is 
automatically generated.
[jira] [Created] (HBASE-6490) 'dfs.client.block.write.retries' value could be increased in HBase
nkeywal created HBASE-6490: -- Summary: 'dfs.client.block.write.retries' value could be increased in HBase Key: HBASE-6490 URL: https://issues.apache.org/jira/browse/HBASE-6490 Project: HBase Issue Type: Improvement Components: master, regionserver Affects Versions: 0.96.0 Environment: When allocating a new node during writing, hdfs tries 'dfs.client.block.write.retries' times (default 3) to write the block. When it fails, it goes back to the namenode for a new list, and raises an error if the number of retries is reached. In HBase, if the error occurs while we're writing a hlog file, it will trigger a region server abort (as hbase does not trust the log anymore). For the simple case (a new, and as such empty, log file), this seems to be ok, and we don't lose data. There could be some complex cases if the error occurs on a hlog file with multiple blocks already written. Log lines are: Exception in createBlockOutputStream, then Abandoning block followed by Excluding datanode for a retry. IOException: Unable to create new block., when the number of retries is reached. The probability of occurrence seems quite low, (number of bad nodes / number of nodes)^(number of retries), and it implies that you have a region server without its datanode. But it's per new block. Increasing the default value of 'dfs.client.block.write.retries' could make sense, to be better covered in chaotic conditions. Reporter: nkeywal Priority: Minor -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
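As a sketch of the proposed mitigation (no concrete value has been settled on in this issue; 6 below is illustrative only), the client-side retry count can be raised in the HBase site configuration, which is passed through to the embedded DFSClient:

```xml
<!-- Hypothetical example for hbase-site.xml: raise the HDFS client's
     block-write retries above the default of 3. The value 6 is
     illustrative, not a recommendation from this issue. -->
<property>
  <name>dfs.client.block.write.retries</name>
  <value>6</value>
</property>
```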
[jira] [Commented] (HBASE-6435) Reading WAL files after a recovery leads to time lost in HDFS timeouts when using dead datanodes
[ https://issues.apache.org/jira/browse/HBASE-6435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426738#comment-13426738 ] nkeywal commented on HBASE-6435: I was expecting the name to be localhost, but it's not the case on the hadoop-qa env: {noformat} /asf011.sp2.ygridcore.net,43631,1343836299404/asf011.sp2.ygridcore.net%2C43631%2C1343836299404.1343836318993 is an HLog file, so reordering blocks, last hostname will be:asf011.sp2.ygridcore.net {noformat} So the trick used to check location ordering on a mini cluster does not work. I will find another way... Reading WAL files after a recovery leads to time lost in HDFS timeouts when using dead datanodes Key: HBASE-6435 URL: https://issues.apache.org/jira/browse/HBASE-6435 Project: HBase Issue Type: Improvement Components: master, regionserver Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Attachments: 6435.unfinished.patch, 6435.v2.patch, 6435.v7.patch, 6435.v8.patch, 6435.v9.patch HBase writes a Write-Ahead-Log to recover from hardware failure. This log is written with 'append' on hdfs. Through ZooKeeper, HBase is informed, usually within 30s, that it should start the recovery process. This means reading the Write-Ahead-Log to replay the edits on the other servers. In standard deployments, HBase processes (regionservers) are deployed on the same boxes as the datanodes. This means that when a box stops, we've actually lost one copy of the edits, as we lost both the regionserver and the datanode. As HDFS marks a node as dead only after ~10 minutes, the dead node still appears available when we try to read the blocks to recover. As a result, we delay the recovery process by 60 seconds, as the read will usually fail with a socket timeout. If the file is still open for writing, it adds an extra 20s plus a risk of losing edits if we connect with ipc to the dead DN. Possible solutions are: - shorter dead-datanode detection by the NN. Requires an NN code change. 
- better dead-datanode management in the DFSClient. Requires a DFS code change. - NN customisation to write the WAL files on another DN instead of the local one. - reordering the blocks returned by the NN on the client side to put the blocks on the same DN as the dead RS at the end of the priority queue. Requires a DFS code change or a kind of workaround. The solution retained is the last one. Compared to what was discussed on the mailing list, the proposed patch does not modify HDFS source code but adds a proxy. This is for two reasons: - Some HDFS functions managing block ordering are static (MD5MD5CRC32FileChecksum). Implementing the hook in the DFSClient would require implementing the fix only partially, changing the DFS interface to make this function non-static, or making the hook static. None of these solutions is very clean. - Adding a proxy allows putting all the code in HBase, simplifying dependency management. Nevertheless, it would be better to have this in HDFS. But this solution allows targeting the latest version only, and it could allow minimal interface changes such as non-static methods. Moreover, writing the blocks to a non-local DN would be an even better solution long term. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
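The retained approach (reordering block locations on the client side) can be illustrated with a plain-Java sketch. The helper below is hypothetical, not the actual patch (which hooks this logic into the DFSClient through a proxy); it simply moves replicas hosted on the dead regionserver's datanode to the end of the priority list:

```java
import java.util.ArrayList;
import java.util.List;

public class BlockReorder {
    // Hypothetical sketch: given the replica locations returned by the NN
    // and the hostname of the dead regionserver, move replicas on that host
    // to the end so the client tries live datanodes first instead of
    // waiting for a socket timeout on the dead one.
    static List<String> reorder(List<String> locations, String deadHost) {
        List<String> live = new ArrayList<>();
        List<String> suspect = new ArrayList<>();
        for (String host : locations) {
            (host.equals(deadHost) ? suspect : live).add(host);
        }
        live.addAll(suspect);
        return live;
    }
}
```

With locations ["dn1", "dead-rs", "dn2"] and dead host "dead-rs", this yields ["dn1", "dn2", "dead-rs"]: the recovery read only hits the dead node as a last resort.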
[jira] [Commented] (HBASE-6411) Move Master Metrics to metrics 2
[ https://issues.apache.org/jira/browse/HBASE-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426742#comment-13426742 ] Zhihong Ted Yu commented on HBASE-6411: --- For findbugs: {code} [ERROR] Failed to execute goal org.apache.maven.plugins:maven-remote-resources-plugin:1.1:process (default) on project hbase-hadoop1-compat: Failed to resolve dependencies for one or more projects in the reactor. Reason: Missing: [ERROR] -- [ERROR] 1) org.apache.hbase:hbase-hadoop-compat:test-jar:tests:0.95-SNAPSHOT [ERROR] [ERROR] Try downloading the file manually from the project website. [ERROR] [ERROR] Then, install it using the command: [ERROR] mvn install:install-file -DgroupId=org.apache.hbase -DartifactId=hbase-hadoop-compat -Dversion=0.95-SNAPSHOT -Dclassifier=tests -Dpackaging=test-jar -Dfile=/path/to/file [ERROR] [ERROR] Alternatively, if you host your own repository you can deploy the file there: [ERROR] mvn deploy:deploy-file -DgroupId=org.apache.hbase -DartifactId=hbase-hadoop-compat -Dversion=0.95-SNAPSHOT -Dclassifier=tests -Dpackaging=test-jar -Dfile=/path/to/file -Durl=[url] -DrepositoryId=[id] [ERROR] [ERROR] Path to dependency: [ERROR] 1) org.apache.hbase:hbase-hadoop1-compat:jar:0.95-SNAPSHOT [ERROR] 2) org.apache.hbase:hbase-hadoop-compat:test-jar:tests:0.95-SNAPSHOT [ERROR] [ERROR] -- [ERROR] 1 required artifact is missing. [ERROR] {code} Move Master Metrics to metrics 2 Key: HBASE-6411 URL: https://issues.apache.org/jira/browse/HBASE-6411 Project: HBase Issue Type: Sub-task Reporter: Elliott Clark Assignee: Alex Baranau Attachments: HBASE-6411-0.patch, HBASE-6411-1.patch, HBASE-6411-2.patch, HBASE-6411-3.patch, HBASE-6411-4.patch, HBASE-6411-4_2.patch, HBASE-6411_concept.patch Move Master Metrics to metrics 2 -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6358) Bulkloading from remote filesystem is problematic
[ https://issues.apache.org/jira/browse/HBASE-6358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426747#comment-13426747 ] Dave Revell commented on HBASE-6358: @Harsh, Yes I will, sorry for the delay. I can have it within a week. Bulkloading from remote filesystem is problematic - Key: HBASE-6358 URL: https://issues.apache.org/jira/browse/HBASE-6358 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.0 Reporter: Dave Revell Assignee: Dave Revell Bulk loading hfiles that don't live on the same filesystem as HBase can cause problems for subtle reasons. In Store.bulkLoadHFile(), the regionserver will copy the source hfile to its own filesystem if it's not already there. Since this can take a long time for large hfiles, it's likely that the client will time out and retry. When the client retries repeatedly, there may be several bulkload operations in flight for the same hfile, causing lots of unnecessary IO and tying up handler threads. This can seriously impact performance. In my case, the cluster became unusable and the regionservers had to be kill -9'ed. Possible solutions: # Require that hfiles already be on the same filesystem as HBase in order for bulkloading to succeed. The copy could be handled by LoadIncrementalHFiles before the regionserver is called. # Others? I'm not familiar with Hadoop IPC so there may be tricks to extend the timeout or something else. I'm willing to write a patch but I'd appreciate recommendations on how to proceed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6444) Expose the ability to set custom HTTP Request Headers for the REST client used by RemoteHTable
[ https://issues.apache.org/jira/browse/HBASE-6444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-6444: --- Attachment: trunk-6444.patch Expose the ability to set custom HTTP Request Headers for the REST client used by RemoteHTable -- Key: HBASE-6444 URL: https://issues.apache.org/jira/browse/HBASE-6444 Project: HBase Issue Type: Improvement Components: rest Reporter: Erich Hochmuth Attachments: HBASE-6444-0.94.patch, HBASE-6444.patch, trunk-6444.patch Original Estimate: 48h Remaining Estimate: 48h My corporate security office (ISO) requires that all http traffic get routed through a Web Access Management layer (http://en.wikipedia.org/wiki/Web_access_management) Our Hadoop cluster has been segmented by a virtual network, with all access to HBase from outside clients being managed through the HBase Stargate rest server. The corporate WAM system requires that all http clients authenticate with it first before making any http request to any http service in the corporate network. After the http client authenticates with the WAM system, the WAM system returns to the client a set of values that must be inserted into an http cookie and a request header on all future http requests to other http services. This would mean that all requests through the RemoteHTable interface would require that this cookie and request header be set as part of the http request. org.apache.hadoop.hbase.rest.client.Client looks like the appropriate place that this functionality would need to be plugged into. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6444) Expose the ability to set custom HTTP Request Headers for the REST client used by RemoteHTable
[ https://issues.apache.org/jira/browse/HBASE-6444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426763#comment-13426763 ] Jimmy Xiang commented on HBASE-6444: I attached patch trunk-6444.patch. Does it solve the problem? Expose the ability to set custom HTTP Request Headers for the REST client used by RemoteHTable -- Key: HBASE-6444 URL: https://issues.apache.org/jira/browse/HBASE-6444 Project: HBase Issue Type: Improvement Components: rest Reporter: Erich Hochmuth Attachments: HBASE-6444-0.94.patch, HBASE-6444.patch, trunk-6444.patch Original Estimate: 48h Remaining Estimate: 48h My corporate security office (ISO) requires that all http traffic get routed through a Web Access Management layer (http://en.wikipedia.org/wiki/Web_access_management) Our Hadoop cluster has been segmented by a virtual network, with all access to HBase from outside clients being managed through the HBase Stargate rest server. The corporate WAM system requires that all http clients authenticate with it first before making any http request to any http service in the corporate network. After the http client authenticates with the WAM system, the WAM system returns to the client a set of values that must be inserted into an http cookie and a request header on all future http requests to other http services. This would mean that all requests through the RemoteHTable interface would require that this cookie and request header be set as part of the http request. org.apache.hadoop.hbase.rest.client.Client looks like the appropriate place that this functionality would need to be plugged into. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6487) assign region doesn't check if the region is already assigned
[ https://issues.apache.org/jira/browse/HBASE-6487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-6487: --- Priority: Minor (was: Major) Assignee: Jimmy Xiang assign region doesn't check if the region is already assigned - Key: HBASE-6487 URL: https://issues.apache.org/jira/browse/HBASE-6487 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Minor Tried to assign a region already assigned somewhere from hbase shell, the region is assigned to a different place but the previous assignment is not closed. So it causes double assignments. In such a case, it's better to issue a warning instead. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6487) assign region doesn't check if the region is already assigned
[ https://issues.apache.org/jira/browse/HBASE-6487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426776#comment-13426776 ] Jimmy Xiang commented on HBASE-6487: Changed to Minor since it is a known issue: Assign a region. Use with caution. If region already assigned, this command will just go ahead and reassign the region anyways. For experts only. assign region doesn't check if the region is already assigned - Key: HBASE-6487 URL: https://issues.apache.org/jira/browse/HBASE-6487 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Minor Tried to assign a region already assigned somewhere from hbase shell, the region is assigned to a different place but the previous assignment is not closed. So it causes double assignments. In such a case, it's better to issue a warning instead. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6487) assign region doesn't check if the region is already assigned
[ https://issues.apache.org/jira/browse/HBASE-6487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-6487: --- Attachment: trunk-6487.patch assign region doesn't check if the region is already assigned - Key: HBASE-6487 URL: https://issues.apache.org/jira/browse/HBASE-6487 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Minor Attachments: trunk-6487.patch Tried to assign a region already assigned somewhere from hbase shell, the region is assigned to a different place but the previous assignment is not closed. So it causes double assignments. In such a case, it's better to issue a warning instead. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6487) assign region doesn't check if the region is already assigned
[ https://issues.apache.org/jira/browse/HBASE-6487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-6487: --- Status: Patch Available (was: Open) assign region doesn't check if the region is already assigned - Key: HBASE-6487 URL: https://issues.apache.org/jira/browse/HBASE-6487 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Minor Attachments: trunk-6487.patch Tried to assign a region already assigned somewhere from hbase shell, the region is assigned to a different place but the previous assignment is not closed. So it causes double assignments. In such a case, it's better to issue a warning instead. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6487) assign region doesn't check if the region is already assigned
[ https://issues.apache.org/jira/browse/HBASE-6487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-6487: --- Fix Version/s: 0.96.0 assign region doesn't check if the region is already assigned - Key: HBASE-6487 URL: https://issues.apache.org/jira/browse/HBASE-6487 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Minor Fix For: 0.96.0 Attachments: trunk-6487.patch Tried to assign a region already assigned somewhere from hbase shell, the region is assigned to a different place but the previous assignment is not closed. So it causes double assignments. In such a case, it's better to issue a warning instead. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HBASE-6052) Convert .META. and -ROOT- content to pb
[ https://issues.apache.org/jira/browse/HBASE-6052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar reassigned HBASE-6052: Assignee: Enis Soztutar (was: stack) Convert .META. and -ROOT- content to pb --- Key: HBASE-6052 URL: https://issues.apache.org/jira/browse/HBASE-6052 Project: HBase Issue Type: Sub-task Reporter: stack Assignee: Enis Soztutar Priority: Blocker Fix For: 0.96.0 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6052) Convert .META. and -ROOT- content to pb
[ https://issues.apache.org/jira/browse/HBASE-6052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426815#comment-13426815 ] Enis Soztutar commented on HBASE-6052: -- @Lars, are you referring to HBASE-3171? It seems it is not going to make it into 0.96, wdyt? BTW, I am all in favor of HBASE-3171, since it will make zk the absolute authority about the state of the regions, and simplify split/merge. Should we focus on that instead? Convert .META. and -ROOT- content to pb --- Key: HBASE-6052 URL: https://issues.apache.org/jira/browse/HBASE-6052 Project: HBase Issue Type: Sub-task Reporter: stack Assignee: Enis Soztutar Priority: Blocker Fix For: 0.96.0 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6491) add limit function at ClientScanner
[ https://issues.apache.org/jira/browse/HBASE-6491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ronghai.ma updated HBASE-6491: -- Attachment: ClientScanner.txt add limit function at ClientScanner --- Key: HBASE-6491 URL: https://issues.apache.org/jira/browse/HBASE-6491 Project: HBase Issue Type: Improvement Reporter: ronghai.ma Attachments: ClientScanner.txt Get N rows from START row number. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
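The attached patch itself is not shown in this thread, but the requested semantics ("Get N rows from START row number") can be sketched with a plain iterator wrapper. The class and method names below are hypothetical illustrations, not the API the patch actually adds to ClientScanner:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class LimitScan {
    // Hypothetical sketch of "get N rows starting from row number START":
    // skip the first `start` results from a scanner-like iterator, then
    // return at most `limit` of the remaining ones. Stops pulling from the
    // iterator as soon as the limit is reached.
    static <T> List<T> scanWithLimit(Iterator<T> results, int start, int limit) {
        List<T> out = new ArrayList<>();
        int seen = 0;
        while (results.hasNext() && out.size() < limit) {
            T row = results.next();
            if (seen++ >= start) {
                out.add(row);
            }
        }
        return out;
    }
}
```

Doing this inside the scanner (as the patch proposes) rather than in the caller means the client can stop fetching batches from the regionserver once the limit is reached, instead of discarding surplus rows.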
[jira] [Created] (HBASE-6491) add limit function at ClientScanner
ronghai.ma created HBASE-6491: - Summary: add limit function at ClientScanner Key: HBASE-6491 URL: https://issues.apache.org/jira/browse/HBASE-6491 Project: HBase Issue Type: Improvement Reporter: ronghai.ma Attachments: ClientScanner.txt Get N rows from START row number. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HBASE-6491) add limit function at ClientScanner
[ https://issues.apache.org/jira/browse/HBASE-6491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ronghai.ma reassigned HBASE-6491: - Assignee: ronghai.ma add limit function at ClientScanner --- Key: HBASE-6491 URL: https://issues.apache.org/jira/browse/HBASE-6491 Project: HBase Issue Type: Improvement Reporter: ronghai.ma Assignee: ronghai.ma Attachments: ClientScanner.txt Get N rows from START row number. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6491) add limit function at ClientScanner
[ https://issues.apache.org/jira/browse/HBASE-6491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ronghai.ma updated HBASE-6491: -- Attachment: HBASE-6491.patch add limit function at ClientScanner --- Key: HBASE-6491 URL: https://issues.apache.org/jira/browse/HBASE-6491 Project: HBase Issue Type: Improvement Reporter: ronghai.ma Assignee: ronghai.ma Attachments: HBASE-6491.patch Get N rows from START row number. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6052) Convert .META. and -ROOT- content to pb
[ https://issues.apache.org/jira/browse/HBASE-6052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426866#comment-13426866 ] Lars Hofhansl commented on HBASE-6052: -- @Enis: Yep. If we continue to assume a single .META. region for now, HBASE-3171 should not even be that hard (famous last words). .META. content would still need to be protobuffed. Convert .META. and -ROOT- content to pb --- Key: HBASE-6052 URL: https://issues.apache.org/jira/browse/HBASE-6052 Project: HBase Issue Type: Sub-task Reporter: stack Assignee: Enis Soztutar Priority: Blocker Fix For: 0.96.0 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6491) add limit function at ClientScanner
[ https://issues.apache.org/jira/browse/HBASE-6491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ronghai.ma updated HBASE-6491: -- Component/s: client add limit function at ClientScanner --- Key: HBASE-6491 URL: https://issues.apache.org/jira/browse/HBASE-6491 Project: HBase Issue Type: Improvement Components: client Affects Versions: 0.94.0 Reporter: ronghai.ma Assignee: ronghai.ma Labels: patch Fix For: 0.94.0 Attachments: HBASE-6491.patch, HBASE-6491.patch.1.jpg Get N rows from START row number. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6491) add limit function at ClientScanner
[ https://issues.apache.org/jira/browse/HBASE-6491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426893#comment-13426893 ] Hadoop QA commented on HBASE-6491: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12538831/HBASE-6491.patch.1.jpg against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2474//console This message is automatically generated. add limit function at ClientScanner --- Key: HBASE-6491 URL: https://issues.apache.org/jira/browse/HBASE-6491 Project: HBase Issue Type: Improvement Components: client Affects Versions: 0.94.0 Reporter: ronghai.ma Assignee: ronghai.ma Labels: patch Fix For: 0.94.0 Attachments: HBASE-6491.patch, HBASE-6491.patch.1.jpg Get N rows from START row number. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6474) Separate exception types for recoverable and non-recoverable errors
[ https://issues.apache.org/jira/browse/HBASE-6474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426900#comment-13426900 ] Eli Collins commented on HBASE-6474: Thanks, would be good to doc that (eg in javadoc and/or explicit class names eg UnrecoverableException) and articulate the remaining work (perhaps as sub-tasks of this jira if it's a lot). Separate exception types for recoverable and non-recoverable errors Key: HBASE-6474 URL: https://issues.apache.org/jira/browse/HBASE-6474 Project: HBase Issue Type: Improvement Components: client Reporter: Eli Collins While reading [about asynchbase|http://www.quora.com/HBase/What-are-the-best-HBase-client-libraries-in-each-language?srid=hqw] I noticed that HBase exposes all errors to clients as IOEs rather than using the exception hierarchy to separate recoverable and non-recoverable exceptions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
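The kind of hierarchy being asked for can be sketched as follows. The class names (for example UnrecoverableException, echoing the comment above) are illustrative only and are not types HBase currently ships:

```java
import java.io.IOException;

// Hypothetical split of client-facing errors into recoverable and
// non-recoverable branches, so callers can catch by intent instead of
// inspecting a bare IOException.
class RecoverableException extends IOException {
    RecoverableException(String msg) { super(msg); }
}

class UnrecoverableException extends IOException {
    UnrecoverableException(String msg) { super(msg); }
}

public class ExceptionDemo {
    // A client retry loop can then branch on the exception type rather
    // than parsing messages out of a generic IOException.
    static boolean shouldRetry(IOException e) {
        return e instanceof RecoverableException;
    }
}
```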
[jira] [Updated] (HBASE-6491) add limit function at ClientScanner
[ https://issues.apache.org/jira/browse/HBASE-6491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ronghai.ma updated HBASE-6491: -- Attachment: (was: HBASE-6491.patch.1.jpg) add limit function at ClientScanner --- Key: HBASE-6491 URL: https://issues.apache.org/jira/browse/HBASE-6491 Project: HBase Issue Type: Improvement Components: client Affects Versions: 0.94.0 Reporter: ronghai.ma Assignee: ronghai.ma Labels: patch Fix For: 0.94.0 Attachments: HBASE-6491.patch Get N rows from START row number. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6491) add limit function at ClientScanner
[ https://issues.apache.org/jira/browse/HBASE-6491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ronghai.ma updated HBASE-6491: -- Affects Version/s: (was: 0.94.0) 0.96.0 Fix Version/s: (was: 0.94.0) 0.96.0 add limit function at ClientScanner --- Key: HBASE-6491 URL: https://issues.apache.org/jira/browse/HBASE-6491 Project: HBase Issue Type: Improvement Components: client Affects Versions: 0.96.0 Reporter: ronghai.ma Assignee: ronghai.ma Labels: patch Fix For: 0.96.0 Attachments: HBASE-6491.patch Get N rows from START row number. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5954) Allow proper fsync support for HBase
[ https://issues.apache.org/jira/browse/HBASE-5954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426949#comment-13426949 ] Lars Hofhansl commented on HBASE-5954: -- Created HBASE-6492. Since I am interested in having this in 0.94 I'll start with the reflection based approach (but still in trunk for HadoopQA). This is what I am going to do: # global HFiles hsync option upon block close (this will also apply to sync'ing the WAL on close) # global WAL edit hsync option # hsync CF's HFiles # hsync CF's WAL edits # WAL hsync per Put/Delete/Append/Increment/etc Allow proper fsync support for HBase Key: HBASE-5954 URL: https://issues.apache.org/jira/browse/HBASE-5954 Project: HBase Issue Type: Improvement Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.96.0, 0.94.2 Attachments: 5954-trunk-hdfs-trunk-v2.txt, 5954-trunk-hdfs-trunk-v3.txt, 5954-trunk-hdfs-trunk-v4.txt, 5954-trunk-hdfs-trunk-v5.txt, 5954-trunk-hdfs-trunk-v6.txt, 5954-trunk-hdfs-trunk.txt, hbase-hdfs-744.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5954) Allow proper fsync support for HBase
[ https://issues.apache.org/jira/browse/HBASE-5954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426966#comment-13426966 ] Lars Hofhansl commented on HBASE-5954: -- Sigh... The WAL files would still need to be sync'ed upon block close. Since they mix data from different stores, there's no telling ahead of time. Which leads to: # global hsync option upon block close for HFiles and HLogs (makes no sense to sync HLogs but not HFiles or vice versa) # global WAL edit hsync option # hsync CF's WAL edits # WAL hsync per Put/Delete/Append/Increment/etc Allow proper fsync support for HBase Key: HBASE-5954 URL: https://issues.apache.org/jira/browse/HBASE-5954 Project: HBase Issue Type: Improvement Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.96.0, 0.94.2 Attachments: 5954-trunk-hdfs-trunk-v2.txt, 5954-trunk-hdfs-trunk-v3.txt, 5954-trunk-hdfs-trunk-v4.txt, 5954-trunk-hdfs-trunk-v5.txt, 5954-trunk-hdfs-trunk-v6.txt, 5954-trunk-hdfs-trunk.txt, hbase-hdfs-744.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-6493) HashSet of byte array is being used in couple of places
Shrijeet Paliwal created HBASE-6493: --- Summary: HashSet of byte array is being used in couple of places Key: HBASE-6493 URL: https://issues.apache.org/jira/browse/HBASE-6493 Project: HBase Issue Type: Bug Reporter: Shrijeet Paliwal Priority: Minor While working on a jira I realized I had made a mistake of making a HashSet of byte array. Then out of curiosity I checked if we do the same anywhere else in the code base. I came up with the following files. # /src/main/java/org/apache/hadoop/hbase/mapreduce/RowCounter.java: Set<byte[]> qualifiers = new HashSet<byte[]>(); # /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java: columnFamilies = new HashSet<byte[]>(); # /src/test/java/org/apache/hadoop/hbase/filter/TestFirstKeyValueMatchingQualifiersFilter.java: Set<byte[]> quals = new HashSet<byte[]>(); # /src/test/java/org/apache/hadoop/hbase/regionserver/metrics/TestSchemaMetrics.java: Set<byte[]> families = new HashSet<byte[]>(); (1) and (3) are mine and I will fix them (not yet committed). Quoting the exact reference from (2) below: {code} @Override public GetStoreFileResponse getStoreFile(final RpcController controller, final GetStoreFileRequest request) throws ServiceException { try { HRegion region = getRegion(request.getRegion()); requestCount.incrementAndGet(); Set<byte[]> columnFamilies = null; if (request.getFamilyCount() == 0) { columnFamilies = region.getStores().keySet(); } else { columnFamilies = new HashSet<byte[]>(); for (ByteString cf: request.getFamilyList()) { columnFamilies.add(cf.toByteArray()); } } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
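To see why a `HashSet<byte[]>` is a bug, here is a minimal, self-contained sketch (plain Java, no HBase dependency): `byte[]` inherits identity-based `hashCode()`/`equals()` from `Object`, so a lookup with an equal-content but distinct array fails; a `TreeSet` with a content comparator (HBase's own idiom is `TreeSet` with `Bytes.BYTES_COMPARATOR`, which the stand-in comparator below imitates) compares by value.

```java
import java.util.Comparator;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

public class ByteArraySetDemo {

    // Unsigned lexicographic comparison, a stand-in for HBase's Bytes.BYTES_COMPARATOR.
    static final Comparator<byte[]> BY_CONTENT = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length;
    };

    static boolean containsInHashSet() {
        // byte[] uses identity hashCode()/equals(), so a second array with the
        // same contents is not found.
        Set<byte[]> set = new HashSet<>();
        set.add("cf1".getBytes());
        return set.contains("cf1".getBytes()); // false: different array instance
    }

    static boolean containsInTreeSet() {
        // The comparator makes membership depend on array contents, not identity.
        Set<byte[]> set = new TreeSet<>(BY_CONTENT);
        set.add("cf1".getBytes());
        return set.contains("cf1".getBytes()); // true: compared by contents
    }

    public static void main(String[] args) {
        System.out.println("HashSet<byte[]> contains: " + containsInHashSet());
        System.out.println("TreeSet<byte[]> contains: " + containsInTreeSet());
    }
}
```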
[jira] [Created] (HBASE-6494) Add a delete option to Performance Evaluation
Amitanand Aiyer created HBASE-6494: -- Summary: Add a delete option to Performance Evaluation Key: HBASE-6494 URL: https://issues.apache.org/jira/browse/HBASE-6494 Project: HBase Issue Type: Improvement Affects Versions: 0.89-fb Reporter: Amitanand Aiyer Priority: Minor Extend Performance Evaluation to do deletes on the table. This will allow us to benchmark/test regression in performance for changes to the delete path. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6468) RowCounter may return incorrect result if column name is specified in command line
[ https://issues.apache.org/jira/browse/HBASE-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shrijeet Paliwal updated HBASE-6468: Attachment: 0005-HBASE-6468-RowCounter-may-return-incorrect-results.patch Last comments + change in Filter.proto RowCounter may return incorrect result if column name is specified in command line -- Key: HBASE-6468 URL: https://issues.apache.org/jira/browse/HBASE-6468 Project: HBase Issue Type: Bug Affects Versions: 0.90.5 Reporter: Shrijeet Paliwal Attachments: 0001-HBASE-6468-RowCounter-may-return-incorrect-result.patch, 0002-HBASE-6468-RowCounter-may-return-incorrect-result.patch, 0004-HBASE-6468-RowCounter-may-return-incorrect-result.patch, 0005-HBASE-6468-RowCounter-may-return-incorrect-results.patch The RowCounter uses FirstKeyOnlyFilter regardless of whether or not the command line argument specified a column family (or family:qualifier). In the case when no qualifier was specified as an argument, the scan will give the correct result. However, in the other case the scan instance may have been set with columns other than the very first column in the row, causing the scan to get nothing as the FirstKeyOnlyFilter removes everything else. https://issues.apache.org/jira/browse/HBASE-6042 is related. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
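The interaction described in HBASE-6468 can be sketched as a toy model (plain Java, not HBase internals; `countRow` and its two-step filtering are illustrative only): the column restriction keeps only the requested qualifier, while FirstKeyOnlyFilter keeps only the row's first KeyValue, so when the requested qualifier is not the first column, nothing survives and the row goes uncounted.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RowCounterBugDemo {

    /** row: qualifier -> value, in sorted column order. Returns 1 if the row is counted. */
    static int countRow(Map<String, String> row, String requestedQualifier) {
        // Step 1: the scan's column restriction keeps only the requested qualifier (if any).
        List<String> selected = new ArrayList<>();
        for (String q : row.keySet()) {
            if (requestedQualifier == null || q.equals(requestedQualifier)) {
                selected.add(q);
            }
        }
        // Step 2: FirstKeyOnlyFilter keeps only the first KeyValue of the whole row;
        // every other column is dropped.
        String firstOfRow = row.keySet().iterator().next();
        selected.removeIf(q -> !q.equals(firstOfRow));
        return selected.isEmpty() ? 0 : 1; // row counted only if some KeyValue survived
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("a", "1"); // first column of the row
        row.put("b", "2");
        System.out.println(countRow(row, null)); // no qualifier: row counted
        System.out.println(countRow(row, "b"));  // qualifier "b": row wrongly missed
    }
}
```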
[jira] [Updated] (HBASE-6468) RowCounter may return incorrect result if column name is specified in command line
[ https://issues.apache.org/jira/browse/HBASE-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shrijeet Paliwal updated HBASE-6468: Status: Patch Available (was: Open) RowCounter may return incorrect result if column name is specified in command line -- Key: HBASE-6468 URL: https://issues.apache.org/jira/browse/HBASE-6468 Project: HBase Issue Type: Bug Affects Versions: 0.90.5 Reporter: Shrijeet Paliwal Attachments: 0001-HBASE-6468-RowCounter-may-return-incorrect-result.patch, 0002-HBASE-6468-RowCounter-may-return-incorrect-result.patch, 0004-HBASE-6468-RowCounter-may-return-incorrect-result.patch, 0005-HBASE-6468-RowCounter-may-return-incorrect-results.patch The RowCounter uses FirstKeyOnlyFilter regardless of whether or not the command line argument specified a column family (or family:qualifier). In the case when no qualifier was specified as an argument, the scan will give the correct result. However, in the other case the scan instance may have been set with columns other than the very first column in the row, causing the scan to get nothing as the FirstKeyOnlyFilter removes everything else. https://issues.apache.org/jira/browse/HBASE-6042 is related. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6493) HashSet of byte array is being used in couple of places
[ https://issues.apache.org/jira/browse/HBASE-6493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427022#comment-13427022 ] Lars Hofhansl commented on HBASE-6493: -- The one in RegionServer is curious. We should add a static check for this. It is not necessarily incorrect if the byte[] is an identity. HashSet of byte array is being used in couple of places --- Key: HBASE-6493 URL: https://issues.apache.org/jira/browse/HBASE-6493 Project: HBase Issue Type: Bug Reporter: Shrijeet Paliwal Priority: Minor While working on a jira I realized I had made a mistake of making a HashSet of byte array. Then out of curiosity I checked if we do the same anywhere else in the code base. I came up with the following files. # /src/main/java/org/apache/hadoop/hbase/mapreduce/RowCounter.java: Set<byte[]> qualifiers = new HashSet<byte[]>(); # /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java: columnFamilies = new HashSet<byte[]>(); # /src/test/java/org/apache/hadoop/hbase/filter/TestFirstKeyValueMatchingQualifiersFilter.java: Set<byte[]> quals = new HashSet<byte[]>(); # /src/test/java/org/apache/hadoop/hbase/regionserver/metrics/TestSchemaMetrics.java: Set<byte[]> families = new HashSet<byte[]>(); (1) and (3) are mine and I will fix them (not yet committed). Quoting the exact reference from (2) below: {code} @Override public GetStoreFileResponse getStoreFile(final RpcController controller, final GetStoreFileRequest request) throws ServiceException { try { HRegion region = getRegion(request.getRegion()); requestCount.incrementAndGet(); Set<byte[]> columnFamilies = null; if (request.getFamilyCount() == 0) { columnFamilies = region.getStores().keySet(); } else { columnFamilies = new HashSet<byte[]>(); for (ByteString cf: request.getFamilyList()) { columnFamilies.add(cf.toByteArray()); } } {code} -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-6495) HBaseAdmin shouldn't expect HConnection to be an HConnectionImplementation
Jesse Yates created HBASE-6495: -- Summary: HBaseAdmin shouldn't expect HConnection to be an HConnectionImplementation Key: HBASE-6495 URL: https://issues.apache.org/jira/browse/HBASE-6495 Project: HBase Issue Type: Bug Affects Versions: 0.96.0, 0.94.1 Reporter: Jesse Yates Fix For: 0.96.0, 0.94.1 Currently, the HBaseAdmin has a constructor that takes an HConnection, but then immediately casts it to an HConnectionManager.HConnectionImplementation: {code} public HBaseAdmin(HConnection connection) throws MasterNotRunningException, ZooKeeperConnectionException { this.conf = connection.getConfiguration(); // We want the real class, without showing it our public interface, // hence the cast. this.connection = (HConnectionManager.HConnectionImplementation)connection; {code} However, this breaks the explicit contract in the javadocs and makes it basically impossible to mock out the HBaseAdmin. We need to either make the HBaseAdmin use a basic HConnection and optimize for cases where it's smarter, or bring up the couple of methods in HConnectionManager.HConnectionImplementation to the HConnection interface. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
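The mocking problem described above can be reproduced in miniature. This is a hedged sketch with hypothetical names (`Connection`, `ConnectionImpl`, `Admin`), not HBase's actual classes: the downcast compiles, but any implementation other than the one concrete class dies with a ClassCastException at construction time, which is exactly what happens to a mocked HConnection.

```java
// Hypothetical, simplified stand-ins for HConnection / HConnectionImplementation / HBaseAdmin.
interface Connection {
    String getConfiguration();
}

class ConnectionImpl implements Connection {
    public String getConfiguration() { return "real-conf"; }
}

class Admin {
    private final ConnectionImpl connection;

    Admin(Connection c) {
        // Mirrors the cast in the HBaseAdmin constructor: compiles fine,
        // but rejects anything that is not the concrete implementation.
        this.connection = (ConnectionImpl) c;
    }
}

public class CastDemo {
    static boolean acceptsMock() {
        // A stand-in for a mocked connection in a unit test.
        Connection mock = new Connection() {
            public String getConfiguration() { return "mock-conf"; }
        };
        try {
            new Admin(mock);
            return true;
        } catch (ClassCastException e) {
            return false; // the cast rejects the mock at runtime
        }
    }

    public static void main(String[] args) {
        new Admin(new ConnectionImpl()); // the concrete class works
        System.out.println("mock accepted: " + acceptsMock());
    }
}
```

Coding against the interface (or promoting the needed methods onto it, as the issue suggests) removes the runtime failure without changing callers.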
[jira] [Commented] (HBASE-6493) HashSet of byte array is being used in couple of places
[ https://issues.apache.org/jira/browse/HBASE-6493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427031#comment-13427031 ] Shrijeet Paliwal commented on HBASE-6493: - Lars, {quote} The one in RegionServer is curious. We should add a static check for this. {quote} Did not understand. Do you mean a check during compile time? {quote} It is not necessarily incorrect if the byte[] is an identity. {quote} Agree. HashSet of byte array is being used in couple of places --- Key: HBASE-6493 URL: https://issues.apache.org/jira/browse/HBASE-6493 Project: HBase Issue Type: Bug Reporter: Shrijeet Paliwal Priority: Minor While working on a jira I realized I had made a mistake of making a HashSet of byte array. Then out of curiosity I checked if we do the same anywhere else in the code base. I came up with the following files. # /src/main/java/org/apache/hadoop/hbase/mapreduce/RowCounter.java: Set<byte[]> qualifiers = new HashSet<byte[]>(); # /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java: columnFamilies = new HashSet<byte[]>(); # /src/test/java/org/apache/hadoop/hbase/filter/TestFirstKeyValueMatchingQualifiersFilter.java: Set<byte[]> quals = new HashSet<byte[]>(); # /src/test/java/org/apache/hadoop/hbase/regionserver/metrics/TestSchemaMetrics.java: Set<byte[]> families = new HashSet<byte[]>(); (1) and (3) are mine and I will fix them (not yet committed). Quoting the exact reference from (2) below: {code} @Override public GetStoreFileResponse getStoreFile(final RpcController controller, final GetStoreFileRequest request) throws ServiceException { try { HRegion region = getRegion(request.getRegion()); requestCount.incrementAndGet(); Set<byte[]> columnFamilies = null; if (request.getFamilyCount() == 0) { columnFamilies = region.getStores().keySet(); } else { columnFamilies = new HashSet<byte[]>(); for (ByteString cf: request.getFamilyList()) { columnFamilies.add(cf.toByteArray()); } } {code} -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6468) RowCounter may return incorrect result if column name is specified in command line
[ https://issues.apache.org/jira/browse/HBASE-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13427047#comment-13427047 ] Hadoop QA commented on HBASE-6468: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12538850/0005-HBASE-6468-RowCounter-may-return-incorrect-results.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 8 new or modified tests. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings). -1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2476//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2476//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2476//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2476//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2476//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2476//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2476//console This message is automatically generated. 
RowCounter may return incorrect result if column name is specified in command line -- Key: HBASE-6468 URL: https://issues.apache.org/jira/browse/HBASE-6468 Project: HBase Issue Type: Bug Affects Versions: 0.90.5 Reporter: Shrijeet Paliwal Attachments: 0001-HBASE-6468-RowCounter-may-return-incorrect-result.patch, 0002-HBASE-6468-RowCounter-may-return-incorrect-result.patch, 0004-HBASE-6468-RowCounter-may-return-incorrect-result.patch, 0005-HBASE-6468-RowCounter-may-return-incorrect-results.patch The RowCounter uses FirstKeyOnlyFilter regardless of whether or not the command line argument specified a column family (or family:qualifier). In the case when no qualifier was specified as an argument, the scan will give the correct result. However, in the other case the scan instance may have been set with columns other than the very first column in the row, causing the scan to get nothing as the FirstKeyOnlyFilter removes everything else. https://issues.apache.org/jira/browse/HBASE-6042 is related. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HBASE-6484) Make AssignmentManger#enablingTables and disablintTables local variables
[ https://issues.apache.org/jira/browse/HBASE-6484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang reassigned HBASE-6484: -- Assignee: Jimmy Xiang Make AssignmentManger#enablingTables and disablintTables local variables Key: HBASE-6484 URL: https://issues.apache.org/jira/browse/HBASE-6484 Project: HBase Issue Type: Sub-task Reporter: Jimmy Xiang Assignee: Jimmy Xiang Those enablingTables and disablingTables are used only during startup; they should be local variables. We can load them from ZKTable at the beginning instead of handling them per table. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HBASE-6483) Fully enable ServerShutdownHandler after master joins the cluster
[ https://issues.apache.org/jira/browse/HBASE-6483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang reassigned HBASE-6483: -- Assignee: Jimmy Xiang Fully enable ServerShutdownHandler after master joins the cluster - Key: HBASE-6483 URL: https://issues.apache.org/jira/browse/HBASE-6483 Project: HBase Issue Type: Sub-task Reporter: Jimmy Xiang Assignee: Jimmy Xiang Once ROOT and META are assigned, ServerShutdownHandler is enabled, so that we can handle meta/root region server failure before joinCluster is completed. However, we can hold ServerShutdownHandler back a little bit more for the user region assignments, i.e. not assign user regions before joinCluster has returned. If so, we can avoid some region assignment races where the same regions are being assigned in both joinCluster and ServerShutdownHandler. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HBASE-6468) RowCounter may return incorrect result if column name is specified in command line
[ https://issues.apache.org/jira/browse/HBASE-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Ted Yu reassigned HBASE-6468: - Assignee: Shrijeet Paliwal RowCounter may return incorrect result if column name is specified in command line -- Key: HBASE-6468 URL: https://issues.apache.org/jira/browse/HBASE-6468 Project: HBase Issue Type: Bug Affects Versions: 0.90.5 Reporter: Shrijeet Paliwal Assignee: Shrijeet Paliwal Fix For: 0.96.0 Attachments: 0001-HBASE-6468-RowCounter-may-return-incorrect-result.patch, 0002-HBASE-6468-RowCounter-may-return-incorrect-result.patch, 0004-HBASE-6468-RowCounter-may-return-incorrect-result.patch, 0005-HBASE-6468-RowCounter-may-return-incorrect-results.patch The RowCounter uses FirstKeyOnlyFilter regardless of whether or not the command line argument specified a column family (or family:qualifier). In the case when no qualifier was specified as an argument, the scan will give the correct result. However, in the other case the scan instance may have been set with columns other than the very first column in the row, causing the scan to get nothing as the FirstKeyOnlyFilter removes everything else. https://issues.apache.org/jira/browse/HBASE-6042 is related. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6468) RowCounter may return incorrect result if column name is specified in command line
[ https://issues.apache.org/jira/browse/HBASE-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Ted Yu updated HBASE-6468: -- Fix Version/s: 0.96.0 Hadoop Flags: Reviewed RowCounter may return incorrect result if column name is specified in command line -- Key: HBASE-6468 URL: https://issues.apache.org/jira/browse/HBASE-6468 Project: HBase Issue Type: Bug Affects Versions: 0.90.5 Reporter: Shrijeet Paliwal Assignee: Shrijeet Paliwal Fix For: 0.96.0 Attachments: 0001-HBASE-6468-RowCounter-may-return-incorrect-result.patch, 0002-HBASE-6468-RowCounter-may-return-incorrect-result.patch, 0004-HBASE-6468-RowCounter-may-return-incorrect-result.patch, 0005-HBASE-6468-RowCounter-may-return-incorrect-results.patch The RowCounter uses FirstKeyOnlyFilter regardless of whether or not the command line argument specified a column family (or family:qualifier). In the case when no qualifier was specified as an argument, the scan will give the correct result. However, in the other case the scan instance may have been set with columns other than the very first column in the row, causing the scan to get nothing as the FirstKeyOnlyFilter removes everything else. https://issues.apache.org/jira/browse/HBASE-6042 is related. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6468) RowCounter may return incorrect result if column name is specified in command line
[ https://issues.apache.org/jira/browse/HBASE-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427054#comment-13427054 ] Zhihong Ted Yu commented on HBASE-6468: --- Latest patch looks good. Will integrate tomorrow if there is no objection. RowCounter may return incorrect result if column name is specified in command line -- Key: HBASE-6468 URL: https://issues.apache.org/jira/browse/HBASE-6468 Project: HBase Issue Type: Bug Affects Versions: 0.90.5 Reporter: Shrijeet Paliwal Fix For: 0.96.0 Attachments: 0001-HBASE-6468-RowCounter-may-return-incorrect-result.patch, 0002-HBASE-6468-RowCounter-may-return-incorrect-result.patch, 0004-HBASE-6468-RowCounter-may-return-incorrect-result.patch, 0005-HBASE-6468-RowCounter-may-return-incorrect-results.patch The RowCounter uses FirstKeyOnlyFilter regardless of whether or not the command line argument specified a column family (or family:qualifier). In the case when no qualifier was specified as an argument, the scan will give the correct result. However, in the other case the scan instance may have been set with columns other than the very first column in the row, causing the scan to get nothing as the FirstKeyOnlyFilter removes everything else. https://issues.apache.org/jira/browse/HBASE-6042 is related. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6429) Filter with filterRow() returning true is incompatible with scan with limit
[ https://issues.apache.org/jira/browse/HBASE-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427061#comment-13427061 ] Zhihong Ted Yu commented on HBASE-6429: --- Almost there. {code} +public class FilterWrapper implements Filter { {code} Suggest adding the following to above class: {code} @InterfaceAudience.Private @InterfaceStability.Evolving {code} Filter with filterRow() returning true is incompatible with scan with limit --- Key: HBASE-6429 URL: https://issues.apache.org/jira/browse/HBASE-6429 Project: HBase Issue Type: Bug Components: filters Affects Versions: 0.96.0 Reporter: Jason Dai Attachments: hbase-6429-trunk-v2.patch, hbase-6429-trunk-v3.patch, hbase-6429-trunk.patch, hbase-6429_0_94_0.patch Currently if we scan with both a limit and a Filter with filterRow(List<KeyValue>) implemented, an IncompatibleFilterException will be thrown. The same exception should also be thrown if the filter has its filterRow() implemented. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6493) HashSet of byte array is being used in couple of places
[ https://issues.apache.org/jira/browse/HBASE-6493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427064#comment-13427064 ] Lars Hofhansl commented on HBASE-6493: -- I meant the one in RegionServer is definitely a bug. And generally we could add a build check for that (i.e. just grep for HashSet<byte[]> and HashMap<byte[]>). HashSet of byte array is being used in couple of places --- Key: HBASE-6493 URL: https://issues.apache.org/jira/browse/HBASE-6493 Project: HBase Issue Type: Bug Reporter: Shrijeet Paliwal Priority: Minor While working on a jira I realized I had made a mistake of making a HashSet of byte array. Then out of curiosity I checked if we do the same anywhere else in the code base. I came up with the following files. # /src/main/java/org/apache/hadoop/hbase/mapreduce/RowCounter.java: Set<byte[]> qualifiers = new HashSet<byte[]>(); # /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java: columnFamilies = new HashSet<byte[]>(); # /src/test/java/org/apache/hadoop/hbase/filter/TestFirstKeyValueMatchingQualifiersFilter.java: Set<byte[]> quals = new HashSet<byte[]>(); # /src/test/java/org/apache/hadoop/hbase/regionserver/metrics/TestSchemaMetrics.java: Set<byte[]> families = new HashSet<byte[]>(); (1) and (3) are mine and I will fix them (not yet committed). Quoting the exact reference from (2) below: {code} @Override public GetStoreFileResponse getStoreFile(final RpcController controller, final GetStoreFileRequest request) throws ServiceException { try { HRegion region = getRegion(request.getRegion()); requestCount.incrementAndGet(); Set<byte[]> columnFamilies = null; if (request.getFamilyCount() == 0) { columnFamilies = region.getStores().keySet(); } else { columnFamilies = new HashSet<byte[]>(); for (ByteString cf: request.getFamilyList()) { columnFamilies.add(cf.toByteArray()); } } {code} -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
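The build check Lars suggests (grepping the tree for `HashSet<byte[]>` / `HashMap<byte[]`) could be sketched as a small Java source scanner run during the build. This is a hypothetical sketch, not HBase's actual precommit tooling; class and method names are illustrative.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class ByteArrayCollectionCheck {
    // Textual patterns that almost always indicate the identity-hashing bug.
    static final String[] BAD = { "HashSet<byte[]>", "HashMap<byte[]" };

    /** Walks root and returns "file:line: pattern" for every offending line. */
    public static List<String> scan(Path root) throws IOException {
        List<String> hits = new ArrayList<>();
        try (var files = Files.walk(root)) {
            files.filter(p -> p.toString().endsWith(".java")).forEach(p -> {
                try {
                    List<String> lines = Files.readAllLines(p);
                    for (int i = 0; i < lines.size(); i++) {
                        for (String bad : BAD) {
                            if (lines.get(i).contains(bad)) {
                                hits.add(p + ":" + (i + 1) + ": " + bad);
                            }
                        }
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        List<String> hits = scan(Paths.get(args.length > 0 ? args[0] : "src"));
        hits.forEach(System.out::println);
        if (!hits.isEmpty()) System.exit(1); // non-zero exit fails the build
    }
}
```

A Findbugs/checkstyle rule would catch renamed or aliased types too; the text scan is just the cheapest version of the idea.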
[jira] [Commented] (HBASE-6473) deletedtable is not deleted completely, some region may be still online
[ https://issues.apache.org/jira/browse/HBASE-6473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427065#comment-13427065 ] Zhou wenjian commented on HBASE-6473: - @Jimmy Xiang the issue appears due to the check condition of deletetable. When we delete a table, we first check if the zk state is disabled, and then check that the regions online are not in rit. If for some reason the disabled state is set but there is a region still being unassigned, the delete table check skips it, and it will delete the table and the zk state; that will cause the region being unassigned to be assigned again. In the patch I add a stricter check to deleting a table. If there is a region disabled but still online, we will wait for the region to go offline; after that we can delete the table, and this case will never appear again deletedtable is not deleted completely, some region may be still online --- Key: HBASE-6473 URL: https://issues.apache.org/jira/browse/HBASE-6473 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: zhou wenjian Fix For: 0.96.0, 0.94.2 Attachments: HBASE-6473-trunk.patch consider such Scenario: we have a table called T1, which has 1 region: A 1. move A from rs1 to rs2, and A is now closed 2. disable T1, 3. delete T1. When we disable T1, the disable handler will just set the zk state to disabled and A will still be assigned. When A is opened, A in transition will be cleaned out. At that time, DeleteTable finds it is safe to delete all regions and the table in meta and fs; it will also delete the zk node of T1. 
{code} while (System.currentTimeMillis() < done) { AssignmentManager.RegionState rs = am.isRegionInTransition(region); if (rs == null) break; Threads.sleep(waitingTimeForEvents); LOG.debug("Waiting on region to clear regions in transition; " + rs); } if (am.isRegionInTransition(region) != null) { throw new IOException("Waited hbase.master.wait.on.region (" + waitTime + "ms) for region to leave region " + region.getRegionNameAsString() + " in transitions"); } {code} However A is still being unassigned; when it finishes closing A, it finds that the disabled state in zk is deleted, and then A will be assigned again. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6493) HashSet of byte array is being used in couple of places
[ https://issues.apache.org/jira/browse/HBASE-6493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427072#comment-13427072 ] Lars Hofhansl commented on HBASE-6493: -- Interestingly there is just one such occurrence in 0.94 (TestSchemaMetrics.java) HashSet of byte array is being used in couple of places --- Key: HBASE-6493 URL: https://issues.apache.org/jira/browse/HBASE-6493 Project: HBase Issue Type: Bug Reporter: Shrijeet Paliwal Priority: Minor While working on a jira I realized I had made a mistake of making a HashSet of byte array. Then out of curiosity I checked if we do the same anywhere else in the code base. I came up with the following files. # /src/main/java/org/apache/hadoop/hbase/mapreduce/RowCounter.java: Set<byte[]> qualifiers = new HashSet<byte[]>(); # /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java: columnFamilies = new HashSet<byte[]>(); # /src/test/java/org/apache/hadoop/hbase/filter/TestFirstKeyValueMatchingQualifiersFilter.java: Set<byte[]> quals = new HashSet<byte[]>(); # /src/test/java/org/apache/hadoop/hbase/regionserver/metrics/TestSchemaMetrics.java: Set<byte[]> families = new HashSet<byte[]>(); (1) and (3) are mine and I will fix them (not yet committed). Quoting the exact reference from (2) below: {code} @Override public GetStoreFileResponse getStoreFile(final RpcController controller, final GetStoreFileRequest request) throws ServiceException { try { HRegion region = getRegion(request.getRegion()); requestCount.incrementAndGet(); Set<byte[]> columnFamilies = null; if (request.getFamilyCount() == 0) { columnFamilies = region.getStores().keySet(); } else { columnFamilies = new HashSet<byte[]>(); for (ByteString cf: request.getFamilyList()) { columnFamilies.add(cf.toByteArray()); } } {code} -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6460) hbck -repairHoles shortcut doesn't enable -fixHdfsOrphans
[ https://issues.apache.org/jira/browse/HBASE-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Huang updated HBASE-6460: - Attachment: (was: hbase-6460.patch) hbck -repairHoles shortcut doesn't enable -fixHdfsOrphans - Key: HBASE-6460 URL: https://issues.apache.org/jira/browse/HBASE-6460 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 0.94.0, 0.96.0 Reporter: Jie Huang Priority: Minor Attachments: hbase-6460.patch According to hbck's help info, the shortcut -repairHoles will enable -fixHdfsOrphans, as below. {noformat} -repairHoles Shortcut for -fixAssignments -fixMeta -fixHdfsHoles -fixHdfsOrphans {noformat} However, in the implementation, fsck.setFixHdfsOrphans(false); is called for -repairHoles. This is not consistent with the usage information. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6460) hbck -repairHoles shortcut doesn't enable -fixHdfsOrphans
[ https://issues.apache.org/jira/browse/HBASE-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Huang updated HBASE-6460: - Attachment: hbase-6460.patch hbck -repairHoles shortcut doesn't enable -fixHdfsOrphans - Key: HBASE-6460 URL: https://issues.apache.org/jira/browse/HBASE-6460 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 0.94.0, 0.96.0 Reporter: Jie Huang Priority: Minor Attachments: hbase-6460.patch According to hbck's help info, the shortcut -repairHoles will enable -fixHdfsOrphans, as below. {noformat} -repairHoles Shortcut for -fixAssignments -fixMeta -fixHdfsHoles -fixHdfsOrphans {noformat} However, in the implementation, fsck.setFixHdfsOrphans(false); is called for -repairHoles. This is not consistent with the usage information. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6479) HFileReaderV1 caching the same parent META block could cause server abort when splitting
[ https://issues.apache.org/jira/browse/HBASE-6479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chunhui shen updated HBASE-6479: Attachment: HBASE-6479v2.patch Included a test case in patch v2. HFileReaderV1 caching the same parent META block could cause server abort when splitting --- Key: HBASE-6479 URL: https://issues.apache.org/jira/browse/HBASE-6479 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: chunhui shen Assignee: chunhui shen Attachments: HBASE-6479.patch, HBASE-6479v2.patch, test.patch If the hfile is version 1, then when splitting, the two daughter regions load bloom filters concurrently while opening. Because their META block is the same one (the parent's META block), the following exception is thrown from HFileReaderV1#getMetaBlock:
{code}
java.io.IOException: Failed null-daughterOpener=af73f8c9a9b409531ac211a9a7f92eba
    at org.apache.hadoop.hbase.regionserver.SplitTransaction.openDaughters(SplitTransaction.java:367)
    at org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:453)
    at org.apache.hadoop.hbase.regionserver.TestSplitTransaction.testWholesomeSplit(TestSplitTransaction.java:225)
    at org.apache.hadoop.hbase.regionserver.TestSplitTransaction.testWholesomeSplitWithHFileV1(TestSplitTransaction.java:203)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
    at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:47)
    at org.junit.rules.RunRules.evaluate(RunRules.java:18)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
    at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:49)
    at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
Caused by: java.io.IOException: java.io.IOException: java.lang.RuntimeException: Cached an already cached block
    at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:540)
    at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:463)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3784)
    at org.apache.hadoop.hbase.regionserver.SplitTransaction.openDaughterRegion(SplitTransaction.java:506)
    at org.apache.hadoop.hbase.regionserver.SplitTransaction$DaughterOpener.run(SplitTransaction.java:486)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: java.lang.RuntimeException: Cached an already cached block
    at org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:424)
    at org.apache.hadoop.hbase.regionserver.Store.init(Store.java:271)
    at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2918) at
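A sketch of one way to tolerate the double caching described above (hypothetical cache class and method, not the actual HBase BlockCache or HFileReaderV1 API): when both daughters try to cache the parent's META block under the same key, putIfAbsent makes the second insert a harmless no-op instead of a "Cached an already cached block" failure:

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, not the real HBase code: a cache keyed by block
// name where a concurrent duplicate insert wins nothing and loses nothing.
public class MetaBlockCacheSketch {
    private final ConcurrentHashMap<String, byte[]> cache = new ConcurrentHashMap<>();

    // Returns the cached block for blockKey; the second concurrent caller
    // for the same key gets the first caller's block back instead of
    // triggering an "already cached" error.
    public byte[] cacheOrGet(String blockKey, byte[] block) {
        byte[] prev = cache.putIfAbsent(blockKey, block);
        return prev != null ? prev : block;
    }

    public static void main(String[] args) {
        MetaBlockCacheSketch c = new MetaBlockCacheSketch();
        byte[] meta = {42};
        System.out.println(c.cacheOrGet("parent-meta", meta) == meta);
    }
}
```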
[jira] [Commented] (HBASE-6460) hbck -repairHoles shortcut doesn't enable -fixHdfsOrphans
[ https://issues.apache.org/jira/browse/HBASE-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13427077#comment-13427077 ] Jie Huang commented on HBASE-6460: -- @Jimmy @Jonathan, Thanks for your comments. I have fixed the usage info instead of changing the code. Any more comments? hbck -repairHoles shortcut doesn't enable -fixHdfsOrphans - Key: HBASE-6460 URL: https://issues.apache.org/jira/browse/HBASE-6460 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 0.94.0, 0.96.0 Reporter: Jie Huang Priority: Minor Attachments: hbase-6460.patch According to hbck's help info, the -repairHoles shortcut will enable -fixHdfsOrphans, as below. {noformat} -repairHoles Shortcut for -fixAssignments -fixMeta -fixHdfsHoles -fixHdfsOrphans {noformat} However, in the implementation, fsck.setFixHdfsOrphans(false) is called for -repairHoles, which is not consistent with the usage information.
[jira] [Updated] (HBASE-6479) HFileReaderV1 caching the same parent META block could cause server abort when splitting
[ https://issues.apache.org/jira/browse/HBASE-6479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Ted Yu updated HBASE-6479: -- Fix Version/s: 0.96.0 Hadoop Flags: Reviewed Status: Patch Available (was: Open) HFileReaderV1 caching the same parent META block could cause server abort when splitting --- Key: HBASE-6479 URL: https://issues.apache.org/jira/browse/HBASE-6479 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: chunhui shen Assignee: chunhui shen Fix For: 0.96.0 Attachments: HBASE-6479.patch, HBASE-6479v2.patch, test.patch If the hfile is version 1, then when splitting, the two daughter regions load bloom filters concurrently while opening. Because their META block is the same one (the parent's META block), the following exception is thrown from HFileReaderV1#getMetaBlock:
{code}
java.io.IOException: Failed null-daughterOpener=af73f8c9a9b409531ac211a9a7f92eba
    at org.apache.hadoop.hbase.regionserver.SplitTransaction.openDaughters(SplitTransaction.java:367)
    at org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:453)
    at org.apache.hadoop.hbase.regionserver.TestSplitTransaction.testWholesomeSplit(TestSplitTransaction.java:225)
    at org.apache.hadoop.hbase.regionserver.TestSplitTransaction.testWholesomeSplitWithHFileV1(TestSplitTransaction.java:203)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
    at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:47)
    at org.junit.rules.RunRules.evaluate(RunRules.java:18)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
    at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:49)
    at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
Caused by: java.io.IOException: java.io.IOException: java.lang.RuntimeException: Cached an already cached block
    at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:540)
    at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:463)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3784)
    at org.apache.hadoop.hbase.regionserver.SplitTransaction.openDaughterRegion(SplitTransaction.java:506)
    at org.apache.hadoop.hbase.regionserver.SplitTransaction$DaughterOpener.run(SplitTransaction.java:486)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: java.lang.RuntimeException: Cached an already cached block
    at org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:424)
    at org.apache.hadoop.hbase.regionserver.Store.init(Store.java:271)
    at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2918) at
[jira] [Commented] (HBASE-6464) reportTableInFlux() throws NPE when no table is returned.
[ https://issues.apache.org/jira/browse/HBASE-6464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13427079#comment-13427079 ] Jie Huang commented on HBASE-6464: -- It seems that the problem has already been solved in the 0.94 branch. We can close this bug entry. Thanks. reportTableInFlux() throws NPE when no table is returned. - Key: HBASE-6464 URL: https://issues.apache.org/jira/browse/HBASE-6464 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 0.94.0 Reporter: Jie Huang Priority: Critical Fix For: 0.94.2 Attachments: hbase-6464.patch reportTableInFlux() gets all tables not in flux. However, when no table is found in the getTable(numSkipped) function, it returns null. Then errorReporter attempts to print the table number, which causes an NPE here.
[jira] [Updated] (HBASE-6429) Filter with filterRow() returning true is incompatible with scan with limit
[ https://issues.apache.org/jira/browse/HBASE-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Huang updated HBASE-6429: - Attachment: hbase-6429-trunk-v4.patch Thanks. Quite reasonable. Done in the updated version (v4). Filter with filterRow() returning true is incompatible with scan with limit --- Key: HBASE-6429 URL: https://issues.apache.org/jira/browse/HBASE-6429 Project: HBase Issue Type: Bug Components: filters Affects Versions: 0.96.0 Reporter: Jason Dai Attachments: hbase-6429-trunk-v2.patch, hbase-6429-trunk-v3.patch, hbase-6429-trunk-v4.patch, hbase-6429-trunk.patch, hbase-6429_0_94_0.patch Currently, if we scan with both a limit and a Filter with filterRow(List<KeyValue>) implemented, an IncompatibleFilterException will be thrown. The same exception should also be thrown if the filter has its filterRow() implemented.
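The rule being discussed can be sketched as a simple compatibility check (hypothetical class and method names, not the real Scan/Filter API): a scan with a batch limit must reject any filter that implements row-level filtering, whether via filterRow(List<KeyValue>) or the no-argument filterRow().

```java
// Hypothetical sketch of the compatibility check described in this issue.
// The real HBase code inspects the Filter subclass; here the two conditions
// are passed in as booleans for illustration.
public class FilterLimitCheckSketch {
    public static class IncompatibleFilterException extends RuntimeException {
        public IncompatibleFilterException(String msg) { super(msg); }
    }

    public static void checkScan(boolean hasBatchLimit, boolean filterHasRowLogic) {
        // Row-level filters see a whole row at a time; a batch limit can
        // split a row across calls, so the combination is rejected up front.
        if (hasBatchLimit && filterHasRowLogic) {
            throw new IncompatibleFilterException(
                "Filters with filterRow() cannot be used with a scan batch limit");
        }
    }

    public static void main(String[] args) {
        checkScan(true, false); // fine: limit without a row-level filter
        System.out.println("ok");
    }
}
```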
[jira] [Commented] (HBASE-6473) deletedtable is not deleted completely, some region may be still online
[ https://issues.apache.org/jira/browse/HBASE-6473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13427082#comment-13427082 ] Jimmy Xiang commented on HBASE-6473: I see. In your patch, can you get the region state and make sure it is OFFLINE instead of checking if it is not assigned? deletedtable is not deleted completely, some region may be still online --- Key: HBASE-6473 URL: https://issues.apache.org/jira/browse/HBASE-6473 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: zhou wenjian Fix For: 0.96.0, 0.94.2 Attachments: HBASE-6473-trunk.patch Consider this scenario: we have a table T1 with one region, A. 1. Move A from rs1 to rs2; A is now closed. 2. Disable T1. 3. Delete T1. When we disable T1, the disable handler just sets the zk state to disabled, and A will still be assigned. When A is opened, its in-transition entry is cleaned out. At that point, DeleteTable finds it safe to delete all regions and the table in meta and on the filesystem, and it also deletes the zk node of T1.
{code}
while (System.currentTimeMillis() < done) {
  AssignmentManager.RegionState rs = am.isRegionInTransition(region);
  if (rs == null) break;
  Threads.sleep(waitingTimeForEvents);
  LOG.debug("Waiting on region to clear regions in transition; " + rs);
}
if (am.isRegionInTransition(region) != null) {
  throw new IOException("Waited hbase.master.wait.on.region (" + waitTime
      + "ms) for region to leave region " + region.getRegionNameAsString()
      + " in transitions");
}
{code}
However, A is still being unassigned; when it finishes closing A, it finds that the disabled state in zk has been deleted, and A is assigned again.
[jira] [Commented] (HBASE-6473) deletedtable is not deleted completely, some region may be still online
[ https://issues.apache.org/jira/browse/HBASE-6473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13427083#comment-13427083 ] Zhou wenjian commented on HBASE-6473: - @Jimmy Xiang. Do you mean checking the region state in RIT to see whether it is offline? The region may not be in RIT when the table deletion occurs, and I do not quite follow why we would check that a region in RIT is offline. Do you have an email address? I suggest we discuss this directly. deletedtable is not deleted completely, some region may be still online --- Key: HBASE-6473 URL: https://issues.apache.org/jira/browse/HBASE-6473 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: zhou wenjian Fix For: 0.96.0, 0.94.2 Attachments: HBASE-6473-trunk.patch Consider this scenario: we have a table T1 with one region, A. 1. Move A from rs1 to rs2; A is now closed. 2. Disable T1. 3. Delete T1. When we disable T1, the disable handler just sets the zk state to disabled, and A will still be assigned. When A is opened, its in-transition entry is cleaned out. At that point, DeleteTable finds it safe to delete all regions and the table in meta and on the filesystem, and it also deletes the zk node of T1.
{code}
while (System.currentTimeMillis() < done) {
  AssignmentManager.RegionState rs = am.isRegionInTransition(region);
  if (rs == null) break;
  Threads.sleep(waitingTimeForEvents);
  LOG.debug("Waiting on region to clear regions in transition; " + rs);
}
if (am.isRegionInTransition(region) != null) {
  throw new IOException("Waited hbase.master.wait.on.region (" + waitTime
      + "ms) for region to leave region " + region.getRegionNameAsString()
      + " in transitions");
}
{code}
However, A is still being unassigned; when it finishes closing A, it finds that the disabled state in zk has been deleted, and A is assigned again.
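Jimmy's suggestion above can be sketched as a stricter pre-delete check (hypothetical class and names, not the real AssignmentManager API): before deleting table artifacts, require the region's recorded state to be OFFLINE rather than merely "not in the regions-in-transition map".

```java
// Hypothetical sketch of the safety condition discussed in this thread.
// The real fix works against the 0.96 RegionStates structure; here the
// state and in-transition flag are passed in directly for illustration.
public class DeleteTableWaitSketch {
    public enum RegionState { OPEN, CLOSING, OFFLINE }

    public static boolean safeToDelete(RegionState state, boolean inTransition) {
        // "Not in transition" alone is not enough: a region that is still
        // CLOSING can be re-assigned after the disabled znode is removed.
        return !inTransition && state == RegionState.OFFLINE;
    }

    public static void main(String[] args) {
        System.out.println(safeToDelete(RegionState.OFFLINE, false));
    }
}
```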
[jira] [Commented] (HBASE-6399) MetricsContext should be different between RegionServerMetrics and RegionServerDynamicMetrics
[ https://issues.apache.org/jira/browse/HBASE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13427084#comment-13427084 ] chunhui shen commented on HBASE-6399: - I think this issue is ok now; could anyone take a look? MetricsContext should be different between RegionServerMetrics and RegionServerDynamicMetrics - Key: HBASE-6399 URL: https://issues.apache.org/jira/browse/HBASE-6399 Project: HBase Issue Type: Bug Components: metrics Affects Versions: 0.94.0 Reporter: chunhui shen Assignee: chunhui shen Priority: Critical Fix For: 0.96.0 Attachments: HBASE-6399.patch, HBASE-6399v2.patch, HBASE-6399v3.patch In hadoop-metrics.properties, GangliaContext is an optional metrics context, and I think ganglia is commonly used to monitor hbase clusters. However, I found a serious problem: RegionServerDynamicMetrics generates lots of rrd files because regions are moved and tables are created/deleted. Especially in applications where a table is created every day, the number of rrd files on the Gmetad server keeps growing, which can corrupt the Gmetad server. IMO, the MetricsContext should be different between RegionServerMetrics and RegionServerDynamicMetrics.
[jira] [Updated] (HBASE-6399) MetricsContext should be different between RegionServerMetrics and RegionServerDynamicMetrics
[ https://issues.apache.org/jira/browse/HBASE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chunhui shen updated HBASE-6399: Issue Type: Improvement (was: Bug) MetricsContext should be different between RegionServerMetrics and RegionServerDynamicMetrics - Key: HBASE-6399 URL: https://issues.apache.org/jira/browse/HBASE-6399 Project: HBase Issue Type: Improvement Components: metrics Affects Versions: 0.94.0 Reporter: chunhui shen Assignee: chunhui shen Priority: Critical Fix For: 0.96.0 Attachments: HBASE-6399.patch, HBASE-6399v2.patch, HBASE-6399v3.patch In hadoop-metrics.properties, GangliaContext is an optional metrics context, and I think ganglia is commonly used to monitor hbase clusters. However, I found a serious problem: RegionServerDynamicMetrics generates lots of rrd files because regions are moved and tables are created/deleted. Especially in applications where a table is created every day, the number of rrd files on the Gmetad server keeps growing, which can corrupt the Gmetad server. IMO, the MetricsContext should be different between RegionServerMetrics and RegionServerDynamicMetrics.
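For illustration only, the separation proposed above could look like giving the dynamic metrics their own context in hadoop-metrics.properties, routed to NullContext so region/table churn cannot create unbounded rrd files. The `hbasedynamic` context name and the host name below are hypothetical; the actual context name depends on the patch.

{noformat}
# Regular regionserver metrics -> ganglia
hbase.class=org.apache.hadoop.metrics.ganglia.GangliaContext
hbase.period=10
hbase.servers=gmetad.example.com:8649

# Hypothetical separate context for per-region/per-table dynamic metrics,
# discarded instead of being written to rrd files on the Gmetad server.
hbasedynamic.class=org.apache.hadoop.metrics.spi.NullContext
{noformat}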
[jira] [Commented] (HBASE-6479) HFileReaderV1 caching the same parent META block could cause server abort when splitting
[ https://issues.apache.org/jira/browse/HBASE-6479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13427085#comment-13427085 ] Hadoop QA commented on HBASE-6479: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12538861/HBASE-6479v2.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings). -1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.client.TestFromClientSide org.apache.hadoop.hbase.client.TestFromClientSideWithCoprocessor Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2477//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2477//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2477//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2477//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2477//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2477//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2477//console This message is automatically generated. HFileReaderV1 caching the same parent META block could cause server abort when splitting --- Key: HBASE-6479 URL: https://issues.apache.org/jira/browse/HBASE-6479 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: chunhui shen Assignee: chunhui shen Fix For: 0.96.0 Attachments: HBASE-6479.patch, HBASE-6479v2.patch, test.patch If the hfile is version 1, then when splitting, the two daughter regions load bloom filters concurrently while opening. Because their META block is the same one (the parent's META block), the following exception is thrown from HFileReaderV1#getMetaBlock:
{code}
java.io.IOException: Failed null-daughterOpener=af73f8c9a9b409531ac211a9a7f92eba
    at org.apache.hadoop.hbase.regionserver.SplitTransaction.openDaughters(SplitTransaction.java:367)
    at org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:453)
    at org.apache.hadoop.hbase.regionserver.TestSplitTransaction.testWholesomeSplit(TestSplitTransaction.java:225)
    at org.apache.hadoop.hbase.regionserver.TestSplitTransaction.testWholesomeSplitWithHFileV1(TestSplitTransaction.java:203)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
    at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:47)
    at org.junit.rules.RunRules.evaluate(RunRules.java:18)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
[jira] [Commented] (HBASE-6473) deletedtable is not deleted completely, some region may be still online
[ https://issues.apache.org/jira/browse/HBASE-6473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13427092#comment-13427092 ] Jimmy Xiang commented on HBASE-6473: RegionStates now has the state of a region whether it is in transition or not. deletedtable is not deleted completely, some region may be still online --- Key: HBASE-6473 URL: https://issues.apache.org/jira/browse/HBASE-6473 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: zhou wenjian Fix For: 0.96.0, 0.94.2 Attachments: HBASE-6473-trunk.patch Consider this scenario: we have a table T1 with one region, A. 1. Move A from rs1 to rs2; A is now closed. 2. Disable T1. 3. Delete T1. When we disable T1, the disable handler just sets the zk state to disabled, and A will still be assigned. When A is opened, its in-transition entry is cleaned out. At that point, DeleteTable finds it safe to delete all regions and the table in meta and on the filesystem, and it also deletes the zk node of T1.
{code}
while (System.currentTimeMillis() < done) {
  AssignmentManager.RegionState rs = am.isRegionInTransition(region);
  if (rs == null) break;
  Threads.sleep(waitingTimeForEvents);
  LOG.debug("Waiting on region to clear regions in transition; " + rs);
}
if (am.isRegionInTransition(region) != null) {
  throw new IOException("Waited hbase.master.wait.on.region (" + waitTime
      + "ms) for region to leave region " + region.getRegionNameAsString()
      + " in transitions");
}
{code}
However, A is still being unassigned; when it finishes closing A, it finds that the disabled state in zk has been deleted, and A is assigned again.
[jira] [Commented] (HBASE-6429) Filter with filterRow() returning true is incompatible with scan with limit
[ https://issues.apache.org/jira/browse/HBASE-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13427093#comment-13427093 ] Hadoop QA commented on HBASE-6429: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12538863/hbase-6429-trunk-v4.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings). -1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.client.TestFromClientSide Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2478//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2478//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2478//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2478//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2478//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2478//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2478//console This message is automatically generated. 
Filter with filterRow() returning true is incompatible with scan with limit --- Key: HBASE-6429 URL: https://issues.apache.org/jira/browse/HBASE-6429 Project: HBase Issue Type: Bug Components: filters Affects Versions: 0.96.0 Reporter: Jason Dai Attachments: hbase-6429-trunk-v2.patch, hbase-6429-trunk-v3.patch, hbase-6429-trunk-v4.patch, hbase-6429-trunk.patch, hbase-6429_0_94_0.patch Currently, if we scan with both a limit and a Filter with filterRow(List<KeyValue>) implemented, an IncompatibleFilterException will be thrown. The same exception should also be thrown if the filter has its filterRow() implemented.
[jira] [Commented] (HBASE-6473) deletedtable is not deleted completely, some region may be still online
[ https://issues.apache.org/jira/browse/HBASE-6473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13427098#comment-13427098 ] Zhou wenjian commented on HBASE-6473: - @Jimmy Xiang Got it. RegionStates is a new structure introduced in 0.96; I misunderstood it as an RIT-style state as in 0.94. Thanks for your reply. deletedtable is not deleted completely, some region may be still online --- Key: HBASE-6473 URL: https://issues.apache.org/jira/browse/HBASE-6473 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: zhou wenjian Fix For: 0.96.0, 0.94.2 Attachments: HBASE-6473-trunk.patch Consider this scenario: we have a table T1 with one region, A. 1. Move A from rs1 to rs2; A is now closed. 2. Disable T1. 3. Delete T1. When we disable T1, the disable handler just sets the zk state to disabled, and A will still be assigned. When A is opened, its in-transition entry is cleaned out. At that point, DeleteTable finds it safe to delete all regions and the table in meta and on the filesystem, and it also deletes the zk node of T1.
{code}
while (System.currentTimeMillis() < done) {
  AssignmentManager.RegionState rs = am.isRegionInTransition(region);
  if (rs == null) break;
  Threads.sleep(waitingTimeForEvents);
  LOG.debug("Waiting on region to clear regions in transition; " + rs);
}
if (am.isRegionInTransition(region) != null) {
  throw new IOException("Waited hbase.master.wait.on.region (" + waitTime
      + "ms) for region to leave region " + region.getRegionNameAsString()
      + " in transitions");
}
{code}
However, A is still being unassigned; when it finishes closing A, it finds that the disabled state in zk has been deleted, and A is assigned again.
[jira] [Updated] (HBASE-6429) Filter with filterRow() returning true is incompatible with scan with limit
[ https://issues.apache.org/jira/browse/HBASE-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Ted Yu updated HBASE-6429: -- Fix Version/s: 0.96.0 Hadoop Flags: Reviewed Filter with filterRow() returning true is incompatible with scan with limit --- Key: HBASE-6429 URL: https://issues.apache.org/jira/browse/HBASE-6429 Project: HBase Issue Type: Bug Components: filters Affects Versions: 0.96.0 Reporter: Jason Dai Assignee: Jie Huang Fix For: 0.96.0 Attachments: hbase-6429-trunk-v2.patch, hbase-6429-trunk-v3.patch, hbase-6429-trunk-v4.patch, hbase-6429-trunk.patch, hbase-6429_0_94_0.patch Currently, if we scan with both a limit and a Filter with filterRow(List<KeyValue>) implemented, an IncompatibleFilterException will be thrown. The same exception should also be thrown if the filter has its filterRow() implemented.
[jira] [Commented] (HBASE-6429) Filter with filterRow() returning true is incompatible with scan with limit
[ https://issues.apache.org/jira/browse/HBASE-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13427100#comment-13427100 ] Zhihong Ted Yu commented on HBASE-6429: --- TestFromClientSide#testPoolBehavior is not related to filters. I ran the test individually and it passed. Will integrate tomorrow if there is no objection. Filter with filterRow() returning true is incompatible with scan with limit --- Key: HBASE-6429 URL: https://issues.apache.org/jira/browse/HBASE-6429 Project: HBase Issue Type: Bug Components: filters Affects Versions: 0.96.0 Reporter: Jason Dai Attachments: hbase-6429-trunk-v2.patch, hbase-6429-trunk-v3.patch, hbase-6429-trunk-v4.patch, hbase-6429-trunk.patch, hbase-6429_0_94_0.patch Currently if we scan with both a limit and a Filter with filterRow(List<KeyValue>) implemented, an IncompatibleFilterException will be thrown. The same exception should also be thrown if the filter has its filterRow() implemented.
[jira] [Assigned] (HBASE-6429) Filter with filterRow() returning true is incompatible with scan with limit
[ https://issues.apache.org/jira/browse/HBASE-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Ted Yu reassigned HBASE-6429: - Assignee: Jie Huang Filter with filterRow() returning true is incompatible with scan with limit --- Key: HBASE-6429 URL: https://issues.apache.org/jira/browse/HBASE-6429 Project: HBase Issue Type: Bug Components: filters Affects Versions: 0.96.0 Reporter: Jason Dai Assignee: Jie Huang Attachments: hbase-6429-trunk-v2.patch, hbase-6429-trunk-v3.patch, hbase-6429-trunk-v4.patch, hbase-6429-trunk.patch, hbase-6429_0_94_0.patch Currently if we scan with both a limit and a Filter with filterRow(List<KeyValue>) implemented, an IncompatibleFilterException will be thrown. The same exception should also be thrown if the filter has its filterRow() implemented.
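The incompatibility described in HBASE-6429 can be modeled with a small, self-contained sketch (plain Java stand-ins, not the real HBase `Scan`/`Filter` classes): a scan with a batch limit may hand back a partial row, so a filter that decides on whole rows cannot be applied consistently, and the combination should be rejected up front.

```java
// Toy model of the check under discussion; class and method names are
// stand-ins, not HBase's actual API.
public class FilterCompatSketch {
    // Stand-in for HBase's Filter: hasFilterRow() reports whether the filter
    // overrides filterRow() / filterRow(List<KeyValue>).
    interface Filter {
        boolean hasFilterRow();
    }

    static class WholeRowFilter implements Filter {
        public boolean hasFilterRow() { return true; }
    }

    // Stand-in for the validation the patch is about: reject combining a
    // batch limit with a whole-row filter, mirroring the intent of
    // IncompatibleFilterException.
    static void validateScan(int batch, Filter f) {
        if (batch > 0 && f != null && f.hasFilterRow()) {
            throw new IllegalStateException(
                "Cannot set batch on a scan whose filter implements filterRow");
        }
    }

    public static void main(String[] args) {
        try {
            validateScan(5, new WholeRowFilter());
            System.out.println("accepted");
        } catch (IllegalStateException e) {
            System.out.println("incompatible");
        }
    }
}
```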
[jira] [Commented] (HBASE-6487) assign region doesn't check if the region is already assigned
[ https://issues.apache.org/jira/browse/HBASE-6487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13427104#comment-13427104 ] ramkrishna.s.vasudevan commented on HBASE-6487: --- @Jimmy Can we also check here if something is in transition? Because internally, in one of our code paths, we are trying to use RIT as a check as well. assign region doesn't check if the region is already assigned - Key: HBASE-6487 URL: https://issues.apache.org/jira/browse/HBASE-6487 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Minor Fix For: 0.96.0 Attachments: trunk-6487.patch Tried to assign, from the hbase shell, a region that was already assigned somewhere; the region is assigned to a different place but the previous assignment is not closed. So it causes double assignments. In such a case, it's better to issue a warning instead.
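The check suggested in the comment can be sketched in self-contained form (hypothetical names, not HBase's actual `AssignmentManager`): refuse to assign a region that is already open or currently in transition, and warn instead of causing a double assignment.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the proposed guard; the State enum and method names are
// illustrative stand-ins, not HBase's RegionStates API.
public class AssignCheckSketch {
    enum State { OFFLINE, IN_TRANSITION, OPEN }

    private final Map<String, State> regionStates = new HashMap<>();

    void setState(String region, State s) { regionStates.put(region, s); }

    // Returns true if the assign proceeds, false if it is rejected with a
    // warning because the region is already assigned or in transition.
    boolean assign(String region) {
        State s = regionStates.getOrDefault(region, State.OFFLINE);
        if (s == State.OPEN || s == State.IN_TRANSITION) {
            System.out.println("WARN: region " + region + " is " + s
                + "; skipping assign to avoid double assignment");
            return false;
        }
        regionStates.put(region, State.IN_TRANSITION);
        return true;
    }
}
```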
[jira] [Commented] (HBASE-6491) add limit function at ClientScanner
[ https://issues.apache.org/jira/browse/HBASE-6491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13427112#comment-13427112 ] bill commented on HBASE-6491: - 1. how about changing the method's name to next? 2. why not add the method to ResultScanner? add limit function at ClientScanner --- Key: HBASE-6491 URL: https://issues.apache.org/jira/browse/HBASE-6491 Project: HBase Issue Type: Improvement Components: client Affects Versions: 0.96.0 Reporter: ronghai.ma Assignee: ronghai.ma Labels: patch Fix For: 0.96.0 Attachments: HBASE-6491.patch Get N rows from the START row number.
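The behaviour being requested, "get N rows starting from a given row number", can be illustrated with a self-contained sketch shaped like `ResultScanner.next(int nbRows)` (plain Java over an in-memory list; the class and constructor are hypothetical, not the actual ClientScanner patch):

```java
import java.util.ArrayList;
import java.util.List;

// Toy scanner modeling the proposed limit semantics: start at a row offset
// and hand back at most nbRows rows per call, like ResultScanner.next(int).
public class LimitScannerSketch {
    private final List<String> rows;
    private int cursor;

    LimitScannerSketch(List<String> rows, int startRow) {
        this.rows = rows;
        this.cursor = startRow;
    }

    // Return up to nbRows rows from the current position; an empty list
    // means the scan is exhausted.
    List<String> next(int nbRows) {
        List<String> out = new ArrayList<>();
        while (out.size() < nbRows && cursor < rows.size()) {
            out.add(rows.get(cursor++));
        }
        return out;
    }
}
```

Note that `ResultScanner` already exposes `next(int nbRows)`; the comment's two questions amount to asking whether the new limit method should reuse that name and live on the interface rather than only on ClientScanner.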
[jira] [Commented] (HBASE-6399) MetricsContext should be different between RegionServerMetrics and RegionServerDynamicMetrics
[ https://issues.apache.org/jira/browse/HBASE-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13427123#comment-13427123 ] Zhihong Ted Yu commented on HBASE-6399: --- I ran the two failed tests locally and they passed. Integrated to trunk. Thanks for the patch, Chunhui. Thanks for the review, Elliot and Stack. MetricsContext should be different between RegionServerMetrics and RegionServerDynamicMetrics - Key: HBASE-6399 URL: https://issues.apache.org/jira/browse/HBASE-6399 Project: HBase Issue Type: Improvement Components: metrics Affects Versions: 0.94.0 Reporter: chunhui shen Assignee: chunhui shen Priority: Critical Fix For: 0.96.0 Attachments: HBASE-6399.patch, HBASE-6399v2.patch, HBASE-6399v3.patch In hadoop-metrics.properties, GangliaContext is an optional metrics context, and I think we will generally use ganglia to monitor an hbase cluster. However, I found a serious problem: RegionServerDynamicMetrics will generate lots of rrd files because we move regions or create/delete tables. Especially if tables are created every day in some applications, more and more rrd files accumulate on the Gmetad server. This can corrupt the Gmetad server. IMO, the MetricsContext should be different between RegionServerMetrics and RegionServerDynamicMetrics.
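The separation being proposed can be pictured with a hadoop-metrics.properties sketch. The `hbase-dynamic` context name below is an assumption for illustration only, not necessarily the name the patch uses: the regular regionserver metrics keep going to Ganglia, while the dynamic per-table/per-region metrics are routed to a NullContext so they create no rrd files.

```properties
# Regular RegionServerMetrics keep going to Ganglia as before
hbase.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
hbase.period=10
hbase.servers=gmetad-host:8649

# Hypothetical separate context for RegionServerDynamicMetrics, routed to
# NullContext so per-table/per-region metrics create no rrd files on Gmetad
hbase-dynamic.class=org.apache.hadoop.metrics.spi.NullContext
```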
[jira] [Created] (HBASE-6496) Example ZK based scan policy
Lars Hofhansl created HBASE-6496: Summary: Example ZK based scan policy Key: HBASE-6496 URL: https://issues.apache.org/jira/browse/HBASE-6496 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.96.0, 0.94.2 Provide an example of a RegionServer that listens to a ZK node to learn about what set of KVs can safely be deleted during a compaction.
[jira] [Created] (HBASE-6497) Revisit HLog sizing and roll parameters
Lars George created HBASE-6497: -- Summary: Revisit HLog sizing and roll parameters Key: HBASE-6497 URL: https://issues.apache.org/jira/browse/HBASE-6497 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Lars George The last major update to the HLog sizing and roll features was done in HBASE-1394. I am proposing to revisit these settings to overcome recent issues where the HLog becomes a major bottleneck.
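For reference, the sizing and roll knobs in question can be sketched as an hbase-site.xml fragment. The values shown are the commonly cited defaults of that era; treat them as illustrative and verify against your HBase version:

```xml
<!-- Sketch of the HLog sizing/roll parameters under discussion -->
<property>
  <name>hbase.regionserver.maxlogs</name>
  <value>32</value> <!-- flushes are forced once this many WAL files exist -->
</property>
<property>
  <name>hbase.regionserver.logroll.period</name>
  <value>3600000</value> <!-- time-based roll interval, in ms -->
</property>
<property>
  <name>hbase.regionserver.logroll.multiplier</name>
  <value>0.95</value> <!-- roll when a log reaches this fraction of the HDFS block size -->
</property>
```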