[jira] [Created] (HBASE-6877) Coprocessor exec result is incorrect when region is in splitting
chunhui shen created HBASE-6877:
---
Summary: Coprocessor exec result is incorrect when region is in splitting
Key: HBASE-6877
URL: https://issues.apache.org/jira/browse/HBASE-6877
Project: HBase
Issue Type: Bug
Components: Coprocessors
Affects Versions: 0.94.1
Reporter: chunhui shen
Assignee: chunhui shen
Priority: Critical

When we execute a coprocessor, we first call HTable#getStartKeysInRange to get the keys on which to exec the coprocessor. If some regions are split before the execCoprocessor RPC, those keys are stale and the result we get is incomplete. For example: a parent region is split into daughter region A and daughter region B; we execute the coprocessor on the parent region, but the result contains data from only daughter region A or daughter region B.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6877) Coprocessor exec result is incorrect when region is in splitting
[ https://issues.apache.org/jira/browse/HBASE-6877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-6877:
Attachment: HBASE-6877.patch

There is a test case in the patch to show this bug.
[jira] [Commented] (HBASE-6870) HTable#coprocessorExec always scan the whole table
[ https://issues.apache.org/jira/browse/HBASE-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462475#comment-13462475 ]

chunhui shen commented on HBASE-6870:
-
Coprocessor exec result is incorrect if cached region location is wrong, see HBASE-6877

HTable#coprocessorExec always scan the whole table
---
Key: HBASE-6870
URL: https://issues.apache.org/jira/browse/HBASE-6870
Project: HBase
Issue Type: Improvement
Components: Coprocessors
Affects Versions: 0.94.1
Reporter: chunhui shen
Assignee: chunhui shen
Attachments: HBASE-6870.patch, HBASE-6870-testPerformance.patch, HBASE-6870v2.patch, HBASE-6870v3.patch

In the current logic, HTable#coprocessorExec always scans the whole table. This is inefficient and, under a large volume of coprocessorExec requests, affects the RegionServer carrying .META.
[jira] [Commented] (HBASE-6870) HTable#coprocessorExec always scan the whole table
[ https://issues.apache.org/jira/browse/HBASE-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462479#comment-13462479 ]

Andrew Purtell commented on HBASE-6870:
---
Thanks [~zjushch].
[jira] [Commented] (HBASE-6877) Coprocessor exec result is incorrect when region is in splitting
[ https://issues.apache.org/jira/browse/HBASE-6877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462482#comment-13462482 ]

Andrew Purtell commented on HBASE-6877:
---
Reissuing requests to the other daughter when a split is detected makes sense. A minor issue with the patch: if you drop "actual" from method names and variables, the result seems to read better.
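The race described in HBASE-6877, and the "reissue to the other daughter" idea from the comment above, can be illustrated with a small self-contained model. This is only a sketch under stated assumptions, not the code in HBASE-6877.patch: the `Region` and `Cluster` classes, the row counts, and both client methods are hypothetical stand-ins, and the "coprocessor" is reduced to returning a region's row count.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class SplitAwareExec {
    /** A region covering [startKey, endKey); an empty endKey means "end of table". */
    static final class Region {
        final String startKey;
        final String endKey;
        final int rowCount;

        Region(String startKey, String endKey, int rowCount) {
            this.startKey = startKey;
            this.endKey = endKey;
            this.rowCount = rowCount;
        }
    }

    /** Hypothetical stand-in for the live cluster: regions keyed by start key. */
    static final class Cluster {
        private final TreeMap<String, Region> regions = new TreeMap<>();

        void add(Region r) {
            regions.put(r.startKey, r);
        }

        /** Returns the live region that currently contains the given key. */
        Region locate(String key) {
            return regions.floorEntry(key).getValue();
        }

        /** What a client would cache before issuing its calls. */
        List<Region> snapshot() {
            return new ArrayList<>(regions.values());
        }

        /** Replaces a parent region with two daughters split at splitKey. */
        void split(String parentStart, String splitKey, int rowsA, int rowsB) {
            Region parent = regions.remove(parentStart);
            add(new Region(parent.startKey, splitKey, rowsA));
            add(new Region(splitKey, parent.endKey, rowsB));
        }
    }

    /** Naive client: one call per cached start key, never checks which region answered. */
    static int naiveRowCount(Cluster cluster, List<Region> cached) {
        int total = 0;
        for (Region expected : cached) {
            total += cluster.locate(expected.startKey).rowCount;
        }
        return total;
    }

    /** Split-aware client: if the answering region ends before the cached range, reissue from its end key. */
    static int splitAwareRowCount(Cluster cluster, List<Region> cached) {
        int total = 0;
        for (Region expected : cached) {
            String key = expected.startKey;
            while (true) {
                Region answered = cluster.locate(key);
                total += answered.rowCount;
                boolean reachedTableEnd = answered.endKey.isEmpty();
                boolean coveredRange = !expected.endKey.isEmpty()
                        && answered.endKey.compareTo(expected.endKey) >= 0;
                if (reachedTableEnd || coveredRange) {
                    break;
                }
                key = answered.endKey; // split detected: continue with the next daughter
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Cluster cluster = new Cluster();
        cluster.add(new Region("", "m", 10));
        cluster.add(new Region("m", "", 20));
        List<Region> cached = cluster.snapshot(); // start keys fetched before the split
        cluster.split("", "g", 4, 6);             // parent ["", "m") splits into A and B
        System.out.println("naive: " + naiveRowCount(cluster, cached));
        System.out.println("split-aware: " + splitAwareRowCount(cluster, cached));
    }
}
```

In this model the naive client reaches only daughter A of the split parent, which matches the incomplete result reported in the issue, while the split-aware client compares the end key of the region that answered with the end key it had cached and keeps going until the cached range is covered.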
[jira] [Commented] (HBASE-6875) Remove commons-httpclient, -component, and up versions on other jars (remove unused repository)
[ https://issues.apache.org/jira/browse/HBASE-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462495#comment-13462495 ]

Hadoop QA commented on HBASE-6875:
--
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12546445/pom.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
-1 javadoc. The javadoc tool appears to have generated 140 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.client.TestHCM

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2928//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2928//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2928//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2928//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2928//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2928//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2928//console

This message is automatically generated.

Remove commons-httpclient, -component, and up versions on other jars (remove unused repository)
---
Key: HBASE-6875
URL: https://issues.apache.org/jira/browse/HBASE-6875
Project: HBase
Issue Type: Improvement
Components: build
Affects Versions: 0.96.0
Reporter: stack
Assignee: stack
Attachments: pom.txt
[jira] [Commented] (HBASE-6702) ResourceChecker refinement
[ https://issues.apache.org/jira/browse/HBASE-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462526#comment-13462526 ]

nkeywal commented on HBASE-6702:

bq. What is this change?
I've changed the interface of the resource checker, but not yet removed ResourceCheckerJUnitRule, so I've just commented out the removed methods.

bq. Whats this mean 'migrate the localTests to a newer version of surefire'?
The log lines don't show up with surefire 2.10. It works with my patched version. But the localTests profile uses 2.10. It's historical: I did it this way because we use neither categories nor parallelization for localTests.

The v2 should be ready for commit and will include your comments. Thanks for the review!

ResourceChecker refinement
--
Key: HBASE-6702
URL: https://issues.apache.org/jira/browse/HBASE-6702
Project: HBase
Issue Type: Improvement
Components: test
Affects Versions: 0.96.0
Reporter: Jesse Yates
Assignee: nkeywal
Priority: Critical
Fix For: 0.96.0
Attachments: 6702.v1.patch

This was based on some discussion from HBASE-6234. The ResourceChecker was added by N. Keywal to help resolve some hadoop qa issues, but has since not been widely utilized. Further, with modularization we have had to drop the ResourceChecker from the tests that are moved into the hbase-common module, because bringing the ResourceChecker up to hbase-common would involve bringing all its dependencies (which are quite far reaching). The question then is, what should we do with it? Get rid of it? Refactor and reuse?
[jira] [Commented] (HBASE-5954) Allow proper fsync support for HBase
[ https://issues.apache.org/jira/browse/HBASE-5954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462584#comment-13462584 ]

Luke Lu commented on HBASE-5954:

Hi Lars,

We just noticed that HDFS-744 did not implement the correct hsync semantics (mostly due to HDFS-265), so hsync is slower AND (arguably) less durable than hflush in Hadoop 1.x.

Allow proper fsync support for HBase
Key: HBASE-5954
URL: https://issues.apache.org/jira/browse/HBASE-5954
Project: HBase
Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Fix For: 0.94.3, 0.96.0
Attachments: 5954-trunk-hdfs-trunk.txt, 5954-trunk-hdfs-trunk-v2.txt, 5954-trunk-hdfs-trunk-v3.txt, 5954-trunk-hdfs-trunk-v4.txt, 5954-trunk-hdfs-trunk-v5.txt, 5954-trunk-hdfs-trunk-v6.txt, hbase-hdfs-744.txt
[jira] [Updated] (HBASE-5835) [hbck] Catch and handle NotServingRegionException when close region attempt fails
[ https://issues.apache.org/jira/browse/HBASE-5835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

liang xie updated HBASE-5835:
-
Status: Patch Available (was: Open)

Seems I forgot to click submit patch...

[hbck] Catch and handle NotServingRegionException when close region attempt fails
-
Key: HBASE-5835
URL: https://issues.apache.org/jira/browse/HBASE-5835
Project: HBase
Issue Type: Bug
Components: hbck
Affects Versions: 0.94.0, 0.90.7, 0.92.2, 0.96.0
Reporter: Jonathan Hsieh
Attachments: HBASE-5835.patch

Currently, if hbck attempts to close a region and catches a NotServingRegionException, hbck may hang, outputting a stack trace. Since the goal is to close the region at a particular server, and since that server is not serving the region, the region is already closed there, and we should just warn and eat this exception.

{code}
Exception in thread "main" org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hbase.NotServingRegionException: Received close for regionid but we are not serving it
 at org.apache.hadoop.hbase.regionserver.HRegionServer.closeRegion(HRegionServer.java:2162)
 at sun.reflect.GeneratedMethodAccessor36.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
 at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
 at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:771)
 at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
 at $Proxy5.closeRegion(Unknown Source)
 at org.apache.hadoop.hbase.util.HBaseFsckRepair.closeRegionSilentlyAndWait(HBaseFsckRepair.java:165)
 at org.apache.hadoop.hbase.util.HBaseFsck.closeRegion(HBaseFsck.java:1185)
 at org.apache.hadoop.hbase.util.HBaseFsck.checkRegionConsistency(HBaseFsck.java:1302)
 at org.apache.hadoop.hbase.util.HBaseFsck.checkAndFixConsistency(HBaseFsck.java:1065)
 at org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:351)
 at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:370)
 at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:3001)
{code}
[jira] [Commented] (HBASE-5835) [hbck] Catch and handle NotServingRegionException when close region attempt fails
[ https://issues.apache.org/jira/browse/HBASE-5835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462665#comment-13462665 ]

Hadoop QA commented on HBASE-5835:
--
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12542730/HBASE-5835.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
-1 javadoc. The javadoc tool appears to have generated 140 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.regionserver.wal.TestLogRolling

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2929//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2929//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2929//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2929//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2929//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2929//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2929//console
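The "warn and eat this exception" behaviour requested in HBASE-5835 could look roughly like the following. It is a minimal sketch, not the attached patch: `RegionCloser` and `closeRegionTolerantly` are hypothetical names, and the nested `NotServingRegionException` is a local stand-in for the HBase class so that the example compiles on its own. The trace in the issue shows the exception arriving wrapped in a RemoteException; unwrapping that is outside this sketch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.logging.Logger;

public class CloseRegionSketch {
    private static final Logger LOG = Logger.getLogger("CloseRegionSketch");

    /** Local stand-in for HBase's NotServingRegionException (assumption: it is an IOException). */
    static class NotServingRegionException extends IOException {
        NotServingRegionException(String message) {
            super(message);
        }
    }

    /** Hypothetical abstraction over the close-region RPC issued by hbck. */
    interface RegionCloser {
        void closeRegion(String regionName) throws IOException;
    }

    /**
     * Returns true if the server closed the region, false if it was not serving it.
     * Any other I/O failure is rethrown unchecked so the caller still sees it.
     */
    static boolean closeRegionTolerantly(RegionCloser server, String regionName) {
        try {
            server.closeRegion(regionName);
            return true;
        } catch (NotServingRegionException e) {
            // The goal was "region not open on this server", which already holds.
            LOG.warning("Region " + regionName + " is not served there; treating it as closed: "
                    + e.getMessage());
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        RegionCloser notServing = name -> {
            throw new NotServingRegionException("Received close for " + name + " but we are not serving it");
        };
        System.out.println(closeRegionTolerantly(notServing, "hypothetical-region"));
    }
}
```

Only the not-serving case is swallowed; other failures keep propagating, so a genuine RPC problem is not silently hidden.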
[jira] [Updated] (HBASE-6702) ResourceChecker refinement
[ https://issues.apache.org/jira/browse/HBASE-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-6702:
---
Attachment: 6702.v4.patch
[jira] [Updated] (HBASE-6702) ResourceChecker refinement
[ https://issues.apache.org/jira/browse/HBASE-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-6702:
---
Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-6737) NullPointerException at regionserver.wal.SequenceFileLogWriter.append
[ https://issues.apache.org/jira/browse/HBASE-6737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462701#comment-13462701 ]

nkeywal commented on HBASE-6737:

Stack 1: It seems to be an expected case, from the code:
{code}
@Override
public void append(HLog.Entry entry) throws IOException {
  entry.setCompressionContext(compressionContext);
  try {
    this.writer.append(entry.getKey(), entry.getEdit());
  } catch (NullPointerException npe) {
    // Concurrent close...
    throw new IOException(npe);
  }
}
{code}

NullPointerException at regionserver.wal.SequenceFileLogWriter.append
-
Key: HBASE-6737
URL: https://issues.apache.org/jira/browse/HBASE-6737
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.96.0
Reporter: nkeywal
Priority: Critical

Real cluster, scenario in HBASE-5843. There are two exceptions, I create a single JIRA with both of them.

2012-09-04 18:14:49,264 FATAL org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: WriterThread-1 Got while writing log entry to log
java.io.IOException: java.lang.NullPointerException
 at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:229)
 at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:949)
 at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:919)
 at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run(HLogSplitter.java:891)
Caused by: java.lang.NullPointerException
 at org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
 at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
 at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
 at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:226)
 ... 3 more

2012-09-04 18:15:52,546 ERROR org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Error in log splitting write thread
java.lang.reflect.UndeclaredThrowableException
 at $Proxy7.getFileInfo(Unknown Source)
 at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:875)
 at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:513)
 at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:768)
 at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.getRegionSplitEditsPath(HLogSplitter.java:559)
 at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.createWAP(HLogSplitter.java:974)
 at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.access$800(HLogSplitter.java:82)
 at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$OutputSink.getWriterAndPath(HLogSplitter.java:1309)
 at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:942)
 at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:919)
 at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run(HLogSplitter.java:891)
Caused by: java.lang.reflect.InvocationTargetException
 at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:261)
 ... 11 more
Caused by: java.io.IOException: Call to BOX1/192.168.15.5:9000 failed on local exception: java.nio.channels.ClosedByInterruptException
 at org.apache.hadoop.ipc.Client.wrapException(Client.java:1107)
 at org.apache.hadoop.ipc.Client.call(Client.java:1075)
 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
 at $Proxy7.getFileInfo(Unknown Source)
 at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
 at $Proxy7.getFileInfo(Unknown Source)
 ... 15 more
Caused by: java.nio.channels.ClosedByInterruptException
 at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
 at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
 at
[jira] [Commented] (HBASE-6702) ResourceChecker refinement
[ https://issues.apache.org/jira/browse/HBASE-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462724#comment-13462724 ]

Hadoop QA commented on HBASE-6702:
--
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12546497/6702.v4.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 858 new or modified tests.
+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
-1 javadoc. The javadoc tool appears to have generated 140 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.coprocessor.TestRowProcessorEndpoint

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2930//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2930//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2930//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2930//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2930//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2930//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2930//console
[jira] [Commented] (HBASE-6702) ResourceChecker refinement
[ https://issues.apache.org/jira/browse/HBASE-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462726#comment-13462726 ]

nkeywal commented on HBASE-6702:

Seems ok...
[jira] [Commented] (HBASE-6309) [MTTR] Do NN operations outside of the ZK EventThread in SplitLogManager
[ https://issues.apache.org/jira/browse/HBASE-6309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13462745#comment-13462745 ] nkeywal commented on HBASE-6309: I'm was having a look at this. Could we have the log archiving done by the regionserver instead of the master? This would lower the work done in the event thread? The only remaining stuff would be the renaming of the region log dir at the end. I see one impact: if the same log was processed simultaneously by multiple region server, this archiving could occur in parallel on two different region server. Manageable I think... [MTTR] Do NN operations outside of the ZK EventThread in SplitLogManager Key: HBASE-6309 URL: https://issues.apache.org/jira/browse/HBASE-6309 Project: HBase Issue Type: Improvement Affects Versions: 0.92.1, 0.94.0, 0.96.0 Reporter: Jean-Daniel Cryans Priority: Critical Fix For: 0.96.0 We found this issue during the leap second cataclysm which prompted a distributed splitting of all our logs. I saw that none of the RS were splitting after some time while the master was showing that it wasn't even 30% done. 
jstack'ing I saw this: {noformat} main-EventThread daemon prio=10 tid=0x7f6ce46d8800 nid=0x5376 in Object.wait() [0x7f6ce2ecb000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) at java.lang.Object.wait(Object.java:485) at org.apache.hadoop.ipc.Client.call(Client.java:1093) - locked 0x0005fdd661a0 (a org.apache.hadoop.ipc.Client$Call) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226) at $Proxy9.rename(Unknown Source) at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59) at $Proxy9.rename(Unknown Source) at org.apache.hadoop.hdfs.DFSClient.rename(DFSClient.java:759) at org.apache.hadoop.hdfs.DistributedFileSystem.rename(DistributedFileSystem.java:253) at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.moveRecoveredEditsFromTemp(HLogSplitter.java:553) at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.moveRecoveredEditsFromTemp(HLogSplitter.java:519) at org.apache.hadoop.hbase.master.SplitLogManager$1.finish(SplitLogManager.java:138) at org.apache.hadoop.hbase.master.SplitLogManager.getDataSetWatchSuccess(SplitLogManager.java:431) at org.apache.hadoop.hbase.master.SplitLogManager.access$1200(SplitLogManager.java:95) at org.apache.hadoop.hbase.master.SplitLogManager$GetDataAsyncCallback.processResult(SplitLogManager.java:1011) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:571) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:497) {noformat} We are effectively bottlenecking on doing NN operations and whatever else is happening in GetDataAsyncCallback. 
It was so bad that on our 100 offline cluster it took a few hours for the master to process all the incoming ZK events while the actual splitting took a fraction of that time. I'm marking this as critical and against 0.96 but depending on how involved the fix is we might want to backport.
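To illustrate the idea in the issue title, here is a minimal sketch of handing a blocking call to a worker thread so that the callback thread returns immediately. It is not the SplitLogManager code: the class name `OffloadingCallback`, the thread name `fs-ops-worker`, the path, and the simulated `slowRename` are all hypothetical stand-ins for the NN rename seen in the stack trace.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch: keep a callback thread free by handing slow work to a worker. */
final class OffloadingCallback {
  // Hypothetical single worker standing in for "somewhere other than the ZK EventThread".
  private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
    Thread t = new Thread(r, "fs-ops-worker");
    t.setDaemon(true);
    return t;
  });

  /** Called on the event thread; queues the slow work and returns at once. */
  Future<String> onTaskDone(String logPath) {
    Callable<String> job = () -> slowRename(logPath);
    return worker.submit(job);
  }

  /** Stand-in for a blocking NameNode call; returns the name of the thread that ran it. */
  private static String slowRename(String logPath) throws InterruptedException {
    Thread.sleep(50); // simulated RPC latency
    return Thread.currentThread().getName();
  }

  void shutdown() {
    worker.shutdown();
  }

  /** Runs one task and reports which thread executed the slow part. */
  static String runOnce() {
    OffloadingCallback cb = new OffloadingCallback();
    try {
      return cb.onTaskDone("/hbase/.logs/example").get(); // hypothetical path
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      cb.shutdown();
    }
  }

  public static void main(String[] args) {
    System.out.println("caller thread: " + Thread.currentThread().getName());
    System.out.println("rename ran on: " + runOnce());
  }
}
```

A single worker keeps the renames ordered, which may or may not matter for the real fix; a larger pool would trade ordering for throughput.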
[jira] [Created] (HBASE-6878) DistributerLogSplit can fail to resubmit a task done if there is an exception during the log archiving
nkeywal created HBASE-6878: -- Summary: DistributerLogSplit can fail to resubmit a task done if there is an exception during the log archiving Key: HBASE-6878 URL: https://issues.apache.org/jira/browse/HBASE-6878 Project: HBase Issue Type: Bug Components: master Reporter: nkeywal Priority: Minor
[jira] [Updated] (HBASE-6878) DistributerLogSplit can fail to resubmit a task done if there is an exception during the log archiving
[ https://issues.apache.org/jira/browse/HBASE-6878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-6878:

Description: The code in SplitLogManager#getDataSetWatchSuccess is:
{code}
if (slt.isDone()) {
  LOG.info("task " + path + " entered state: " + slt.toString());
  if (taskFinisher != null && !ZKSplitLog.isRescanNode(watcher, path)) {
    if (taskFinisher.finish(slt.getServerName(), ZKSplitLog.getFileName(path)) == Status.DONE) {
      setDone(path, SUCCESS);
    } else {
      resubmitOrFail(path, CHECK);
    }
  } else {
    setDone(path, SUCCESS);
  }
{code}
resubmitOrFail(path, CHECK); should be resubmitOrFail(path, FORCE); Without it, the task won't be resubmitted if the delay is not reached, and the task will be marked as failed.

DistributerLogSplit can fail to resubmit a task done if there is an exception during the log archiving Key: HBASE-6878 URL: https://issues.apache.org/jira/browse/HBASE-6878 Project: HBase Issue Type: Bug Components: master Reporter: nkeywal Priority: Minor
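The difference between the two directives can be sketched with a toy model. This is only an illustration of the behaviour described above, assuming CHECK refuses to resubmit until a delay has elapsed and FORCE skips that test; `ResubmitSketch`, `shouldResubmit` and the numbers are hypothetical and are not taken from SplitLogManager.

```java
/** Hypothetical model of a resubmit decision; not the actual SplitLogManager code. */
final class ResubmitSketch {
  enum Directive { CHECK, FORCE }

  /**
   * Returns true if the task would be resubmitted.
   * Assumption: CHECK only resubmits once the task has been idle for at least
   * timeoutMs, while FORCE skips that test.
   */
  static boolean shouldResubmit(Directive directive, long idleMs, long timeoutMs) {
    if (directive == Directive.FORCE) {
      return true;
    }
    return idleMs >= timeoutMs;
  }

  public static void main(String[] args) {
    long timeoutMs = 25_000; // hypothetical resubmit delay
    long idleMs = 100;       // the worker reported DONE a moment ago
    // With CHECK the fresh task is not resubmitted, so the caller falls through to "fail".
    System.out.println("CHECK: " + shouldResubmit(Directive.CHECK, idleMs, timeoutMs));
    // With FORCE the task is resubmitted regardless of the delay.
    System.out.println("FORCE: " + shouldResubmit(Directive.FORCE, idleMs, timeoutMs));
  }
}
```

In this model a task whose archiving just failed is always "fresh", which is why CHECK ends in a failure rather than a retry.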
[jira] [Commented] (HBASE-4955) Use the official versions of surefire junit
[ https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462757#comment-13462757 ]

nkeywal commented on HBASE-4955: Monthly update...
Surefire: the regression on elapsed time is fixed in 2.12.4 (not tested). Still waiting for #800. Maybe it will make it into 2.13. No date.
JUnit: no life there. Still, a release this quarter is likely...

Use the official versions of surefire junit Key: HBASE-4955 URL: https://issues.apache.org/jira/browse/HBASE-4955 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.0 Environment: all Reporter: nkeywal Assignee: nkeywal Priority: Minor

We currently use private versions of Surefire & JUnit since HBASE-4763. This JIRA tracks what we need to move to official versions. Surefire 2.11 is just out but, after some tests, it does not contain everything we need.

JUnit: could be for JUnit 4.11. Issue to monitor: https://github.com/KentBeck/junit/issues/359: fixed in our version, no feedback on an integration on trunk.

Surefire: could be for Surefire 2.12. Issues to monitor are:
329 (category support): fixed, we use the official implementation from the trunk
786 (@Category with forkMode=always): fixed, we use the official implementation from the trunk
791 (incorrect elapsed time on test failure): fixed, we use the official implementation from the trunk
793 (incorrect time in the XML report): not fixed (reopened) on trunk, fixed in our version
760 (does not take into account the test method): fixed in trunk, not fixed in our version
798 (print immediately the test class name): not fixed in trunk, not fixed in our version
799 (Allow test parallelization when forkMode=always): not fixed in trunk, not fixed in our version
800 (redirectTestOutputToFile not taken into account): not yet fixed on trunk, fixed in our version

800 & 793 are the most important to monitor: they are the only ones that are fixed in our version but not on trunk.
[jira] [Commented] (HBASE-5961) New standard HBase code formatter
[ https://issues.apache.org/jira/browse/HBASE-5961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462851#comment-13462851 ]

Cody Marcel commented on HBASE-5961: Nice!

New standard HBase code formatter Key: HBASE-5961 URL: https://issues.apache.org/jira/browse/HBASE-5961 Project: HBase Issue Type: Improvement Components: build Affects Versions: 0.96.0 Reporter: Jesse Yates Assignee: Jesse Yates Priority: Minor Attachments: HBase-Formmatter.xml

There is currently no good way of distributing the formatter that is currently the 'standard' in HBase. The standard Apache formatter is actually not very close to what we consider 'good'/'pretty' code. Further, it's not trivial to get a good formatter set up. Proposing two things: 1) Adding a formatter to the dev tools and calling out the formatter usage in the docs 2) Moving to a 'better' formatter that is not the standard Apache formatter.
[jira] [Updated] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?
[ https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-6868: Status: Patch Available (was: Open)

Skip checksum is broke; are we double-checksumming by default? Key: HBASE-6868 URL: https://issues.apache.org/jira/browse/HBASE-6868 Project: HBase Issue Type: Bug Components: HFile, wal Affects Versions: 0.94.1, 0.94.0 Reporter: LiuLei Priority: Blocker Fix For: 0.94.3, 0.96.0 Attachments: 6868-0.96-idea.txt, 6868-0.96-v2.txt, 6868-0.96-v3.txt

The HFile contains checksums to reduce IOPS, so when HBase reads an HFile it doesn't need to read the checksum from the HDFS meta file. But the HBase HLog file doesn't contain checksums, so when HBase reads the HLog it must read the checksum from the HDFS meta file. We could add setSkipChecksum per file to HDFS, or we could write checksums into the WAL if this skip-checksum facility is enabled.
[jira] [Commented] (HBASE-6736) Distributed Split: a split tasks can be mark as DONE but keep unassigned
[ https://issues.apache.org/jira/browse/HBASE-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462897#comment-13462897 ]

nkeywal commented on HBASE-6736: There are multiple synchronization issues. One of them is
{code}
@Override
protected void chore() {
  // [...]
  for (Map.Entry<String, Task> e : tasks.entrySet()) {
{code}
As we're iterating over a set that can be modified, we can have reliability issues, cf. javadoc: "If the map is modified while an iteration over the set is in progress (except through the iterator's own remove operation, or through the setValue operation on a map entry returned by the iterator) the results of the iteration are undefined."

Distributed Split: a split tasks can be mark as DONE but keep unassigned Key: HBASE-6736 URL: https://issues.apache.org/jira/browse/HBASE-6736 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.96.0 Reporter: nkeywal

Real cluster, scenario mentioned on HBASE-5843. Got it once out of 5 tests on 0.96. Didn't get it on 0.94 after 3 tests. It seems we have a race condition on split logs: the task was nearly simultaneously marked as done and resubmitted. Then it remained in the unassigned state.

2012-09-04 17:27:06,237 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0
2012-09-04 17:27:06,237 INFO org.apache.hadoop.hbase.master.SplitLogManager: resubmitted 1 out of 1 tasks
2012-09-04 17:27:06,237 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2FBOX1%3A9000%2Fhbase%2F.logs%2FBOX0%2C60020%2C1346772046399-splitting%2FBOX0%252C60020%252C1346772046399.1346772046609 ver = 7
2012-09-04 17:27:06,314 INFO org.apache.hadoop.hbase.master.SplitLogManager: task /hbase/splitlog/RESCAN02 entered state: DONE BOX1,6,1346771990737
2012-09-04 17:27:06,337 DEBUG org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback: deleted /hbase/splitlog/RESCAN02
2012-09-04 17:27:06,337 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: deleted task without in memory state /hbase/splitlog/RESCAN02
2012-09-04 17:27:07,226 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1
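The hazard quoted from the Map javadoc can be shown with a small stand-alone sketch. It deliberately uses a plain `HashMap`, whose iterators are fail-fast, so the problem is visible in a single thread; whatever map type the real `tasks` field uses may behave differently, and the class and task names here are hypothetical. The second method shows one possible workaround, iterating over a snapshot of the entries.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of modifying a map while iterating over its entry set. */
final class IterationSketch {
  static Map<String, String> sampleTasks() {
    Map<String, String> tasks = new HashMap<>();
    tasks.put("task-a", "IN_PROGRESS");
    tasks.put("task-b", "IN_PROGRESS");
    tasks.put("task-c", "IN_PROGRESS");
    return tasks;
  }

  /** Adds an entry while iterating the live entry set; returns true if the iterator noticed. */
  static boolean liveIterationFails() {
    Map<String, String> tasks = sampleTasks();
    try {
      for (Map.Entry<String, String> e : tasks.entrySet()) {
        // Structural change while the entry-set iterator is still in use.
        tasks.put("task-resubmitted", "UNASSIGNED");
      }
      return false;
    } catch (ConcurrentModificationException expected) {
      return true;
    }
  }

  /** Iterates over a copy of the entries, so changes to the map do not disturb the loop. */
  static int snapshotIterationVisits() {
    Map<String, String> tasks = sampleTasks();
    int visited = 0;
    for (Map.Entry<String, String> e : new ArrayList<>(tasks.entrySet())) {
      tasks.put("resubmitted-" + e.getKey(), "UNASSIGNED");
      visited++;
    }
    return visited;
  }

  public static void main(String[] args) {
    System.out.println("live iteration failed: " + liveIterationFails());
    System.out.println("snapshot iteration visited: " + snapshotIterationVisits());
  }
}
```

A snapshot trades freshness for safety: entries added during the loop are simply not visited until the next pass.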
[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?
[ https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462923#comment-13462923 ]

Hadoop QA commented on HBASE-6868: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12546437/6868-0.96-v3.txt against trunk revision .

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
-1 javadoc. The javadoc tool appears to have generated 140 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2931//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2931//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2931//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2931//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2931//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2931//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2931//console

This message is automatically generated.

Skip checksum is broke; are we double-checksumming by default? Key: HBASE-6868 URL: https://issues.apache.org/jira/browse/HBASE-6868 Project: HBase Issue Type: Bug Components: HFile, wal Affects Versions: 0.94.0, 0.94.1 Reporter: LiuLei Priority: Blocker Fix For: 0.94.3, 0.96.0 Attachments: 6868-0.96-idea.txt, 6868-0.96-v2.txt, 6868-0.96-v3.txt
[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?
[ https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462925#comment-13462925 ]

Lars Hofhansl commented on HBASE-6868: I looked through the run, nothing stuck out... All the tests passed. I'll do some manual testing today and then commit.

Skip checksum is broke; are we double-checksumming by default? Key: HBASE-6868 URL: https://issues.apache.org/jira/browse/HBASE-6868 Project: HBase Issue Type: Bug Components: HFile, wal Affects Versions: 0.94.0, 0.94.1 Reporter: LiuLei Priority: Blocker Fix For: 0.94.3, 0.96.0 Attachments: 6868-0.96-idea.txt, 6868-0.96-v2.txt, 6868-0.96-v3.txt
[jira] [Updated] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?
[ https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-6868: - Status: Open (was: Patch Available) Skip checksum is broke; are we double-checksumming by default? -- Key: HBASE-6868 URL: https://issues.apache.org/jira/browse/HBASE-6868 Project: HBase Issue Type: Bug Components: HFile, wal Affects Versions: 0.94.1, 0.94.0 Reporter: LiuLei Priority: Blocker Fix For: 0.94.3, 0.96.0 Attachments: 6868-0.96-idea.txt, 6868-0.96-v2.txt, 6868-0.96-v3.txt
[jira] [Updated] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?
[ https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-6868: - Attachment: 6868-0.94.txt Skip checksum is broke; are we double-checksumming by default? -- Key: HBASE-6868 URL: https://issues.apache.org/jira/browse/HBASE-6868 Project: HBase Issue Type: Bug Components: HFile, wal Affects Versions: 0.94.0, 0.94.1 Reporter: LiuLei Priority: Blocker Fix For: 0.94.2, 0.96.0 Attachments: 6868-0.94.txt, 6868-0.96-idea.txt, 6868-0.96-v2.txt, 6868-0.96-v3.txt
[jira] [Updated] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?
[ https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-6868: - Fix Version/s: (was: 0.94.3) 0.94.2 Skip checksum is broke; are we double-checksumming by default? -- Key: HBASE-6868 URL: https://issues.apache.org/jira/browse/HBASE-6868 Project: HBase Issue Type: Bug Components: HFile, wal Affects Versions: 0.94.0, 0.94.1 Reporter: LiuLei Priority: Blocker Fix For: 0.94.2, 0.96.0 Attachments: 6868-0.94.txt, 6868-0.96-idea.txt, 6868-0.96-v2.txt, 6868-0.96-v3.txt
[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?
[ https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463017#comment-13463017 ]

Lars Hofhansl commented on HBASE-6868: I manually did these tests (0.94 patch):
* started HBase with HBase checksums off, inserted some data, flushed, compacted, scanned
* restarted HBase with HBase checksums on, inserted some more data, flush/compacted, scanned
* restarted HBase again with HBase checksums off, inserted some more data, flush/compacted, scanned
Checked the logs for anything weird. Looks good. Going to commit to 0.94 and 0.96.

Skip checksum is broke; are we double-checksumming by default? Key: HBASE-6868 URL: https://issues.apache.org/jira/browse/HBASE-6868 Project: HBase Issue Type: Bug Components: HFile, wal Affects Versions: 0.94.0, 0.94.1 Reporter: LiuLei Priority: Blocker Fix For: 0.94.2, 0.96.0 Attachments: 6868-0.94.txt, 6868-0.96-idea.txt, 6868-0.96-v2.txt, 6868-0.96-v3.txt
[jira] [Resolved] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?
[ https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl resolved HBASE-6868. -- Resolution: Fixed Assignee: Lars Hofhansl Hadoop Flags: Reviewed Committed to 0.94 and 0.96. Skip checksum is broke; are we double-checksumming by default? -- Key: HBASE-6868 URL: https://issues.apache.org/jira/browse/HBASE-6868 Project: HBase Issue Type: Bug Components: HFile, wal Affects Versions: 0.94.0, 0.94.1 Reporter: LiuLei Assignee: Lars Hofhansl Priority: Blocker Fix For: 0.94.2, 0.96.0 Attachments: 6868-0.94.txt, 6868-0.96-idea.txt, 6868-0.96-v2.txt, 6868-0.96-v3.txt
[jira] [Updated] (HBASE-6851) Race condition in TableAuthManager.updateGlobalCache()
[ https://issues.apache.org/jira/browse/HBASE-6851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-6851: Fix Version/s: (was: 0.94.3) 0.94.2

Race condition in TableAuthManager.updateGlobalCache() Key: HBASE-6851 URL: https://issues.apache.org/jira/browse/HBASE-6851 Project: HBase Issue Type: Bug Components: security Affects Versions: 0.94.1, 0.96.0 Reporter: Gary Helmling Assignee: Gary Helmling Priority: Critical Fix For: 0.94.2, 0.96.0 Attachments: HBASE-6851_2.patch, HBASE-6851_3.patch, HBASE-6851.patch

When new global permissions are assigned, there is a race condition, during which further authorization checks relying on global permissions may fail. In TableAuthManager.updateGlobalCache(), we have:
{code:java}
USER_CACHE.clear();
GROUP_CACHE.clear();
try {
  initGlobal(conf);
} catch (IOException e) {
  // Never happens
  LOG.error("Error occured while updating the user cache", e);
}
for (Map.Entry<String,TablePermission> entry : userPerms.entries()) {
  if (AccessControlLists.isGroupPrincipal(entry.getKey())) {
    GROUP_CACHE.put(AccessControlLists.getGroupName(entry.getKey()),
        new Permission(entry.getValue().getActions()));
  } else {
    USER_CACHE.put(entry.getKey(), new Permission(entry.getValue().getActions()));
  }
}
{code}
If authorization checks come in after the .clear() but before the caches are repopulated, they will fail. We should have some synchronization here to serialize multiple updates, and use a COW-type rebuild and reassignment of the new maps. This particular issue crept in with the fix in HBASE-6157, so I'm flagging for 0.94 and 0.96.
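The "COW type rebuild and reassign" idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the attached patch: `CopyOnWriteCacheSketch`, its string-encoded permissions and the user names are hypothetical. The point is that the replacement map is built off to the side and published with a single volatile write, so a reader never observes the cleared, half-populated state.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a copy-on-write permission cache. */
final class CopyOnWriteCacheSketch {
  // Readers see either the previous complete map or the new complete map.
  private volatile Map<String, String> userCache = Collections.emptyMap();

  /** Serializes writers; rebuilds a fresh map, then publishes it in one assignment. */
  synchronized void update(Map<String, String> newPerms) {
    Map<String, String> rebuilt = new HashMap<>(newPerms);
    userCache = Collections.unmodifiableMap(rebuilt);
  }

  /** Lock-free read against whichever map is currently published. */
  boolean authorize(String user, String action) {
    String actions = userCache.get(user);
    return actions != null && actions.contains(action);
  }

  public static void main(String[] args) {
    CopyOnWriteCacheSketch cache = new CopyOnWriteCacheSketch();
    cache.update(Collections.singletonMap("alice", "RW"));
    System.out.println("alice may write: " + cache.authorize("alice", "W"));

    Map<String, String> next = new HashMap<>();
    next.put("alice", "RWA");
    next.put("bob", "R");
    cache.update(next); // alice stays authorized throughout the rebuild
    System.out.println("bob may read: " + cache.authorize("bob", "R"));
    System.out.println("bob may write: " + cache.authorize("bob", "W"));
  }
}
```

The cost is one extra map allocation per update, which is usually acceptable when updates are rare compared to authorization checks.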
[jira] [Updated] (HBASE-6784) TestCoprocessorScanPolicy is sometimes flaky when run locally
[ https://issues.apache.org/jira/browse/HBASE-6784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-6784: Fix Version/s: (was: 0.94.3) 0.94.2

TestCoprocessorScanPolicy is sometimes flaky when run locally Key: HBASE-6784 URL: https://issues.apache.org/jira/browse/HBASE-6784 Project: HBase Issue Type: Bug Reporter: ramkrishna.s.vasudevan Assignee: Lars Hofhansl Priority: Minor Fix For: 0.94.2, 0.96.0 Attachments: 6784.txt

The problem is not seen in the jenkins build. When we run TestCoprocessorScanPolicy.testBaseCases locally or in our internal jenkins, we tend to get random failures. The reason is that the two puts we do here sometimes get the same timestamp. This leads to improper scan results, as the version check tends to skip one of the rows when it sees the same timestamp. Marking this as minor. As we are trying to solve testcase-related failures, I am raising this in case we need to resolve it as well. For example, both puts get the time:
{code}
time 1347635287360
time 1347635287360
{code}
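Why identical timestamps hide a write can be illustrated with a toy model of one cell. This is an assumption-laden sketch, not HBase code: it models versions as a map keyed by timestamp, so two puts with the same timestamp collapse into one version. `VersionCollisionSketch` and its methods are hypothetical names.

```java
import java.util.Comparator;
import java.util.TreeMap;

/** Hypothetical model of one cell whose versions are keyed by timestamp. */
final class VersionCollisionSketch {
  // Newest timestamp first. Assumption: same timestamp means same version.
  private final TreeMap<Long, String> versions = new TreeMap<>(Comparator.reverseOrder());

  void put(long timestamp, String value) {
    versions.put(timestamp, value);
  }

  int versionCount() {
    return versions.size();
  }

  /** Writes two values with the given timestamps and reports how many versions survive. */
  static int countAfterTwoPuts(long ts1, long ts2) {
    VersionCollisionSketch cell = new VersionCollisionSketch();
    cell.put(ts1, "v1");
    cell.put(ts2, "v2");
    return cell.versionCount();
  }

  public static void main(String[] args) {
    long now = 1347635287360L; // the timestamp reported in the issue
    System.out.println("same timestamp:     " + countAfterTwoPuts(now, now));
    System.out.println("distinct timestamp: " + countAfterTwoPuts(now, now + 1));
  }
}
```

In a test, one way to avoid the collision is to assign explicit, strictly increasing timestamps instead of relying on the clock.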
[jira] [Commented] (HBASE-6870) HTable#coprocessorExec always scan the whole table
[ https://issues.apache.org/jira/browse/HBASE-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463031#comment-13463031 ]

Himanshu Vashishtha commented on HBASE-6870: Looked at the patch. Can you combine these two if statements into one?
{code}
+ if (Bytes.compareTo(start, startKeys[i]) >= 0) {
+   if (Bytes.equals(endKeys[i], HConstants.EMPTY_END_ROW)
+       || Bytes.compareTo(start, endKeys[i]) < 0) {
+     rangeKeys.add(start);
+   }
{code}
Can it be private?
{code}
+ public LinkedHashMap<byte[], HRegionLocation> getKeysToRegionsInRange(
{code}
Re: Andrew's concern regarding cache use: will 6877 take care of region moves too? The cache may become stale for reasons other than splits. Will look at 6877.

HTable#coprocessorExec always scan the whole table Key: HBASE-6870 URL: https://issues.apache.org/jira/browse/HBASE-6870 Project: HBase Issue Type: Improvement Components: Coprocessors Affects Versions: 0.94.1 Reporter: chunhui shen Assignee: chunhui shen Attachments: HBASE-6870.patch, HBASE-6870-testPerformance.patch, HBASE-6870v2.patch, HBASE-6870v3.patch

In the current logic, HTable#coprocessorExec always scans the whole table. This is inefficient and will affect the regionserver carrying .META. under a large number of coprocessorExec requests.
[jira] [Commented] (HBASE-6854) Deletion of SPLITTING node on split rollback should clear the region from RIT
[ https://issues.apache.org/jira/browse/HBASE-6854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463037#comment-13463037 ]

ramkrishna.s.vasudevan commented on HBASE-6854: I found that the testcase added with this is sometimes failing. It seems there is something in the AM and the way the watcher is set. I will debug it and then commit the patch, though it is only a testcase change.

Deletion of SPLITTING node on split rollback should clear the region from RIT Key: HBASE-6854 URL: https://issues.apache.org/jira/browse/HBASE-6854 Project: HBase Issue Type: Bug Reporter: ramkrishna.s.vasudevan Fix For: 0.94.3 Attachments: HBASE-6854.patch

If a failure happens in split before OFFLINING_PARENT, we tend to roll back the split, including deleting the znodes created. On deletion of the RS_ZK_SPLITTING node we get a callback but do not remove the region from RIT. We need to remove it from RIT; the SSH logic is well guarded anyway in case the delete event comes from an RS-down scenario.
[jira] [Commented] (HBASE-6853) IllegalArgument Exception is thrown when an empty region is spliitted.
[ https://issues.apache.org/jira/browse/HBASE-6853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463039#comment-13463039 ]

ramkrishna.s.vasudevan commented on HBASE-6853: @Stack Can we commit patch 1?

IllegalArgument Exception is thrown when an empty region is spliitted. Key: HBASE-6853 URL: https://issues.apache.org/jira/browse/HBASE-6853 Project: HBase Issue Type: Bug Affects Versions: 0.92.1, 0.94.1 Reporter: ramkrishna.s.vasudevan Attachments: HBASE-6853_2_splitsuccess.patch, HBASE-6853_splitfailure.patch

This is w.r.t. a mail sent to the dev mailing list. An empty region split should be handled gracefully. Either we should not allow the split to happen if we know that the region is empty, or we should allow the split to happen by setting the number of threads for the thread pool executor to 1.
{code}
int nbFiles = hstoreFilesToSplit.size();
ThreadFactoryBuilder builder = new ThreadFactoryBuilder();
builder.setNameFormat("StoreFileSplitter-%1$d");
ThreadFactory factory = builder.build();
ThreadPoolExecutor threadPool =
    (ThreadPoolExecutor) Executors.newFixedThreadPool(nbFiles, factory);
List<Future<Void>> futures = new ArrayList<Future<Void>>(nbFiles);
{code}
Here nbFiles needs to be a positive, non-zero value.
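The failure mode can be reproduced with the JDK alone: `Executors.newFixedThreadPool` rejects a pool size of zero with an `IllegalArgumentException`. The sketch below shows that, plus one possible guard (clamping the size to at least 1). The class and method names are hypothetical, and the guard is only one of the two options mentioned in the description, not necessarily what either attached patch does.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch of sizing a fixed pool from a possibly empty file list. */
final class EmptySplitSketch {
  /** Returns true if creating a fixed pool of this size is rejected. */
  static boolean poolCreationRejected(int nbFiles) {
    try {
      ExecutorService pool = Executors.newFixedThreadPool(nbFiles);
      pool.shutdown();
      return false;
    } catch (IllegalArgumentException e) {
      // Thrown by the JDK when the requested pool size is not positive.
      return true;
    }
  }

  /** One possible guard: never ask for fewer than one thread. */
  static int safePoolSize(int nbFiles) {
    return Math.max(1, nbFiles);
  }

  public static void main(String[] args) {
    int nbFiles = 0; // an empty region has no store files to split
    System.out.println("size 0 rejected: " + poolCreationRejected(nbFiles));
    System.out.println("guarded size rejected: " + poolCreationRejected(safePoolSize(nbFiles)));
  }
}
```

The alternative, refusing to split an empty region up front, avoids creating a pool at all.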
[jira] [Commented] (HBASE-5961) New standard HBase code formatter
[ https://issues.apache.org/jira/browse/HBASE-5961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463047#comment-13463047 ]

stack commented on HBASE-5961: I committed this formatter under dev-support and I added how to install doc from HBASE-3678.

New standard HBase code formatter Key: HBASE-5961 URL: https://issues.apache.org/jira/browse/HBASE-5961 Project: HBase Issue Type: Improvement Components: build Affects Versions: 0.96.0 Reporter: Jesse Yates Assignee: Jesse Yates Priority: Minor Fix For: 0.96.0 Attachments: HBase-Formmatter.xml
[jira] [Resolved] (HBASE-5961) New standard HBase code formatter
[ https://issues.apache.org/jira/browse/HBASE-5961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-5961. -- Resolution: Fixed Fix Version/s: 0.96.0 Hadoop Flags: Reviewed Committed to trunk. Thanks for the patch Jesse. New standard HBase code formatter - Key: HBASE-5961 URL: https://issues.apache.org/jira/browse/HBASE-5961 Project: HBase Issue Type: Improvement Components: build Affects Versions: 0.96.0 Reporter: Jesse Yates Assignee: Jesse Yates Priority: Minor Fix For: 0.96.0 Attachments: HBase-Formmatter.xml
[jira] [Commented] (HBASE-6702) ResourceChecker refinement
[ https://issues.apache.org/jira/browse/HBASE-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463063#comment-13463063 ]

Jesse Yates commented on HBASE-6702: Good stuff keywal! Just a couple of comments:
{code}
+      <artifactId>hbase-common</artifactId>
+      <version>${project.version}</version>
+      <type>test-jar</type>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
{code}
To keep DRY, the above should go into hbase/pom.xml's dependencyManagement section, and then the child projects should just use:
{code}
+      <artifactId>hbase-common</artifactId>
+      <type>test-jar</type>
+    </dependency>
+    <dependency>
{code}
Also, any chance for some javadocs on things like:
{code}
+  public ResourceChecker(String tagLine) {
+    this.tagLine = tagLine;
+  }
{code}
Otherwise, this is a really sweet add.

ResourceChecker refinement Key: HBASE-6702 URL: https://issues.apache.org/jira/browse/HBASE-6702 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.96.0 Reporter: Jesse Yates Assignee: nkeywal Priority: Critical Fix For: 0.96.0 Attachments: 6702.v1.patch, 6702.v4.patch

This was based on some discussion from HBASE-6234. The ResourceChecker was added by N. Keywal to help resolve some hadoop qa issues, but has since not been widely utilized. Further, with modularization we have had to drop the ResourceChecker from the tests that are moved into the hbase-common module, because bringing the ResourceChecker up to hbase-common would involve bringing all its dependencies (which are quite far reaching). The question then is, what should we do with it? Get rid of it? Refactor and reuse?
[jira] [Commented] (HBASE-6637) Move DaemonThreadFactory into Threads and Threads to hbase-common
[ https://issues.apache.org/jira/browse/HBASE-6637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13463064#comment-13463064 ] Jesse Yates commented on HBASE-6637: As mentioned, failing tests passed locally... Move DaemonThreadFactory into Threads and Threads to hbase-common - Key: HBASE-6637 URL: https://issues.apache.org/jira/browse/HBASE-6637 Project: HBase Issue Type: Bug Affects Versions: 0.96.0 Reporter: Jesse Yates Assignee: Jesse Yates Priority: Minor Fix For: 0.96.0 Attachments: hbase-6637-r1.patch, hbase-6637-r1.patch, hbase-6637-v0.patch, hbase-6637-v2.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?
[ https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463079#comment-13463079 ] Hudson commented on HBASE-6868:

Integrated in HBase-0.94-security #57 (See [https://builds.apache.org/job/HBase-0.94-security/57/])
HBASE-6868 Skip checksum is broke; are we double-checksumming by default? (Revision 1390012)
Result = SUCCESS
larsh : Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java

Skip checksum is broke; are we double-checksumming by default? -- Key: HBASE-6868 URL: https://issues.apache.org/jira/browse/HBASE-6868 Project: HBase Issue Type: Bug Components: HFile, wal Affects Versions: 0.94.0, 0.94.1 Reporter: LiuLei Assignee: Lars Hofhansl Priority: Blocker Fix For: 0.94.2, 0.96.0 Attachments: 6868-0.94.txt, 6868-0.96-idea.txt, 6868-0.96-v2.txt, 6868-0.96-v3.txt

The HFile contains checksums to decrease IOPS, so when HBase reads an HFile it doesn't need to read the checksum from the HDFS meta file. But the HLog file doesn't contain checksums, so when HBase reads the HLog it must read the checksum from the HDFS meta file. We could add setSkipChecksum per file to HDFS, or we could write checksums into the WAL if this skip-checksum facility is enabled.
[jira] [Commented] (HBASE-6851) Race condition in TableAuthManager.updateGlobalCache()
[ https://issues.apache.org/jira/browse/HBASE-6851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463080#comment-13463080 ] Hudson commented on HBASE-6851:

Integrated in HBase-0.94-security #57 (See [https://builds.apache.org/job/HBase-0.94-security/57/])
HBASE-6851 Fix race condition in TableAuthManager.updateGlobalCache() (Revision 1388898)
Result = SUCCESS
garyh : Files :
* /hbase/branches/0.94/security/src/main/java/org/apache/hadoop/hbase/security/access/TableAuthManager.java
* /hbase/branches/0.94/security/src/test/java/org/apache/hadoop/hbase/security/access/TestTablePermissions.java

Race condition in TableAuthManager.updateGlobalCache() -- Key: HBASE-6851 URL: https://issues.apache.org/jira/browse/HBASE-6851 Project: HBase Issue Type: Bug Components: security Affects Versions: 0.94.1, 0.96.0 Reporter: Gary Helmling Assignee: Gary Helmling Priority: Critical Fix For: 0.94.2, 0.96.0 Attachments: HBASE-6851_2.patch, HBASE-6851_3.patch, HBASE-6851.patch

When new global permissions are assigned, there is a race condition, during which further authorization checks relying on global permissions may fail. In TableAuthManager.updateGlobalCache(), we have:
{code:java}
USER_CACHE.clear();
GROUP_CACHE.clear();
try {
  initGlobal(conf);
} catch (IOException e) {
  // Never happens
  LOG.error("Error occured while updating the user cache", e);
}
for (Map.Entry<String,TablePermission> entry : userPerms.entries()) {
  if (AccessControlLists.isGroupPrincipal(entry.getKey())) {
    GROUP_CACHE.put(AccessControlLists.getGroupName(entry.getKey()),
        new Permission(entry.getValue().getActions()));
  } else {
    USER_CACHE.put(entry.getKey(), new Permission(entry.getValue().getActions()));
  }
}
{code}
If authorization checks come in following the .clear() but before repopulating, they will fail. We should have some synchronization here to serialize multiple updates and use a COW type rebuild and reassign of the new maps.
This particular issue crept in with the fix in HBASE-6157, so I'm flagging for 0.94 and 0.96.
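The "COW type rebuild and reassign" suggested in the description could look roughly like the sketch below. This is not the committed patch: `CopyOnWritePermissionCache`, its `String`-valued map, and the `isAuthorized` check are hypothetical stand-ins for `TableAuthManager` and its caches, chosen only to show the pattern of building the new map off to the side and publishing it with a single reference swap.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the global permission cache; not the real TableAuthManager. */
class CopyOnWritePermissionCache {
  // Readers always see a complete map: either the old snapshot or the new one,
  // never a half-populated one as with clear() followed by put().
  private volatile Map<String, String> userCache = Collections.emptyMap();

  /** Serializes writers; the replacement is fully built before it becomes visible. */
  synchronized void update(Map<String, String> newPerms) {
    Map<String, String> rebuilt = new HashMap<>(newPerms);
    userCache = Collections.unmodifiableMap(rebuilt);
  }

  /** Lock-free read against whichever snapshot is current. */
  boolean isAuthorized(String user, String action) {
    String actions = userCache.get(user);
    return actions != null && actions.contains(action);
  }

  public static void main(String[] args) {
    CopyOnWritePermissionCache cache = new CopyOnWritePermissionCache();
    Map<String, String> perms = new HashMap<>();
    perms.put("alice", "RW");
    cache.update(perms);
    System.out.println(cache.isAuthorized("alice", "W"));
  }
}
```

The point of the design is that an authorization check arriving during an update sees the previous permissions instead of an empty cache.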
[jira] [Commented] (HBASE-6784) TestCoprocessorScanPolicy is sometimes flaky when run locally
[ https://issues.apache.org/jira/browse/HBASE-6784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463081#comment-13463081 ] Hudson commented on HBASE-6784:

Integrated in HBase-0.94-security #57 (See [https://builds.apache.org/job/HBase-0.94-security/57/])
HBASE-6784 TestCoprocessorScanPolicy is sometimes flaky when run locally (Revision 1389619)
Result = SUCCESS
larsh : Files :
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/util/TestCoprocessorScanPolicy.java

TestCoprocessorScanPolicy is sometimes flaky when run locally - Key: HBASE-6784 URL: https://issues.apache.org/jira/browse/HBASE-6784 Project: HBase Issue Type: Bug Reporter: ramkrishna.s.vasudevan Assignee: Lars Hofhansl Priority: Minor Fix For: 0.94.2, 0.96.0 Attachments: 6784.txt

The problem is not seen in the jenkins build. When we run TestCoprocessorScanPolicy.testBaseCases locally or in our internal jenkins we tend to get random failures. The reason is that the 2 puts we do here sometimes get the same timestamp. This leads to improper scan results, as the version check tends to skip one of the rows when it sees the same timestamp. Marking this as minor. As we are trying to solve testcase-related failures, just raising this in case we need to resolve this also. For example, both puts are getting the time:
{code}
time 1347635287360
time 1347635287360
{code}
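One way a test could avoid two puts landing on the same millisecond is to hand out timestamps explicitly instead of relying on the wall clock. The helper below is a hypothetical sketch, not what 6784.txt does (the attachment is not shown here); `StrictlyIncreasingClock` is an invented name, and a test would pass the returned value as the explicit timestamp of each put.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical test helper: hands out strictly increasing millisecond timestamps. */
class StrictlyIncreasingClock {
  private final AtomicLong last = new AtomicLong(Long.MIN_VALUE);

  /** Next timestamp based on the current wall clock. */
  long next() {
    return next(System.currentTimeMillis());
  }

  /** Same logic with an injected clock reading, so the behavior can be checked. */
  long next(long now) {
    // Never return a value <= the previous one, even if the clock has not advanced.
    return last.updateAndGet(prev -> Math.max(prev + 1, now));
  }

  public static void main(String[] args) {
    StrictlyIncreasingClock clock = new StrictlyIncreasingClock();
    long first = clock.next(1347635287360L);
    long second = clock.next(1347635287360L); // same wall-clock reading as in the report
    System.out.println(first + " " + second);
  }
}
```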
[jira] [Updated] (HBASE-6637) Move DaemonThreadFactory into Threads and Threads to hbase-common
[ https://issues.apache.org/jira/browse/HBASE-6637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-6637: - Resolution: Fixed Status: Resolved (was: Patch Available) Move DaemonThreadFactory into Threads and Threads to hbase-common - Key: HBASE-6637 URL: https://issues.apache.org/jira/browse/HBASE-6637 Project: HBase Issue Type: Bug Affects Versions: 0.96.0 Reporter: Jesse Yates Assignee: Jesse Yates Priority: Minor Fix For: 0.96.0 Attachments: hbase-6637-r1.patch, hbase-6637-r1.patch, hbase-6637-v0.patch, hbase-6637-v2.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6637) Move DaemonThreadFactory into Threads and Threads to hbase-common
[ https://issues.apache.org/jira/browse/HBASE-6637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-6637: - Committed to 0.96 (for the new files first, added those in a 2nd commit). Move DaemonThreadFactory into Threads and Threads to hbase-common - Key: HBASE-6637 URL: https://issues.apache.org/jira/browse/HBASE-6637 Project: HBase Issue Type: Bug Affects Versions: 0.96.0 Reporter: Jesse Yates Assignee: Jesse Yates Priority: Minor Fix For: 0.96.0 Attachments: hbase-6637-r1.patch, hbase-6637-r1.patch, hbase-6637-v0.patch, hbase-6637-v2.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?
[ https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463090#comment-13463090 ] Hudson commented on HBASE-6868:

Integrated in HBase-0.94 #488 (See [https://builds.apache.org/job/HBase-0.94/488/])
HBASE-6868 Skip checksum is broke; are we double-checksumming by default? (Revision 1390012)
Result = FAILURE
larsh : Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java

Skip checksum is broke; are we double-checksumming by default? -- Key: HBASE-6868 URL: https://issues.apache.org/jira/browse/HBASE-6868 Project: HBase Issue Type: Bug Components: HFile, wal Affects Versions: 0.94.0, 0.94.1 Reporter: LiuLei Assignee: Lars Hofhansl Priority: Blocker Fix For: 0.94.2, 0.96.0 Attachments: 6868-0.94.txt, 6868-0.96-idea.txt, 6868-0.96-v2.txt, 6868-0.96-v3.txt

The HFile contains checksums to decrease IOPS, so when HBase reads an HFile it doesn't need to read the checksum from the HDFS meta file. But the HLog file doesn't contain checksums, so when HBase reads the HLog it must read the checksum from the HDFS meta file. We could add setSkipChecksum per file to HDFS, or we could write checksums into the WAL if this skip-checksum facility is enabled.
[jira] [Assigned] (HBASE-6879) Add HBase Code Template
[ https://issues.apache.org/jira/browse/HBASE-6879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesse Yates reassigned HBASE-6879: -- Assignee: Jesse Yates Add HBase Code Template --- Key: HBASE-6879 URL: https://issues.apache.org/jira/browse/HBASE-6879 Project: HBase Issue Type: Bug Components: build, documentation Reporter: Jesse Yates Assignee: Jesse Yates Add a standard code template to go along with the code formatter for HBase. This helps make sure people have the correct license and general commenting for auto-generated elements.
[jira] [Created] (HBASE-6879) Add HBase Code Template
Jesse Yates created HBASE-6879: -- Summary: Add HBase Code Template Key: HBASE-6879 URL: https://issues.apache.org/jira/browse/HBASE-6879 Project: HBase Issue Type: Bug Components: build, documentation Reporter: Jesse Yates Add a standard code template to go along with the code formatter for HBase. This helps make sure people have the correct license and general commenting for auto-generated elements.
[jira] [Updated] (HBASE-6879) Add HBase Code Template
[ https://issues.apache.org/jira/browse/HBASE-6879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesse Yates updated HBASE-6879: --- Attachment: HBase Code Template.xml Attaching template to go into hbase/dev-support. Easier to see this way than as an actual patch. Add HBase Code Template --- Key: HBASE-6879 URL: https://issues.apache.org/jira/browse/HBASE-6879 Project: HBase Issue Type: Bug Components: build, documentation Reporter: Jesse Yates Assignee: Jesse Yates Attachments: HBase Code Template.xml Add a standard code template to go along with the code formatter for HBase. This helps make sure people have the correct license and general commenting for auto-generated elements.
[jira] [Commented] (HBASE-6879) Add HBase Code Template
[ https://issues.apache.org/jira/browse/HBASE-6879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463098#comment-13463098 ] Jesse Yates commented on HBASE-6879: [~saint@gmail.com] here's a stab at a code template to go with the formatter from HBASE-5961 Add HBase Code Template --- Key: HBASE-6879 URL: https://issues.apache.org/jira/browse/HBASE-6879 Project: HBase Issue Type: Bug Components: build, documentation Reporter: Jesse Yates Assignee: Jesse Yates Attachments: HBase Code Template.xml Add a standard code template to go along with the code formatter for HBase. This helps make sure people have the correct license and general commenting for auto-generated elements.
[jira] [Commented] (HBASE-6637) Move DaemonThreadFactory into Threads and Threads to hbase-common
[ https://issues.apache.org/jira/browse/HBASE-6637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13463103#comment-13463103 ] Hudson commented on HBASE-6637: --- Integrated in HBase-TRUNK #3377 (See [https://builds.apache.org/job/HBase-TRUNK/3377/]) HBASE-6637 Argghh... Missed deleted files too (Revision 1390040) HBASE-6637 Missed new files (Revision 1390035) HBASE-6637 Move DaemonThreadFactory into Threads and Threads to hbase-common (Jesse Yates) (Revision 1390034) Result = FAILURE larsh : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/Threads.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestThreads.java larsh : Files : * /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/util/Threads.java * /hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/util/TestThreads.java larsh : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/HTable.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestHCM.java Move DaemonThreadFactory into Threads and Threads to hbase-common - Key: HBASE-6637 URL: https://issues.apache.org/jira/browse/HBASE-6637 Project: HBase Issue Type: Bug Affects Versions: 0.96.0 Reporter: Jesse Yates Assignee: Jesse Yates Priority: Minor Fix For: 0.96.0 Attachments: hbase-6637-r1.patch, hbase-6637-r1.patch, hbase-6637-v0.patch, hbase-6637-v2.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3678) Add Eclipse-based Apache Formatter to HBase Wiki
[ https://issues.apache.org/jira/browse/HBASE-3678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13463104#comment-13463104 ] Hudson commented on HBASE-3678: --- Integrated in HBase-TRUNK #3377 (See [https://builds.apache.org/job/HBase-TRUNK/3377/]) HBASE-5691 and HBASE-3678 New standard HBase code formatter AND Add Eclipse-based Apache Formatter to HBase Wiki (Revision 1390028) HBASE-5691 and HBASE-3678 New standard HBase code formatter AND Add Eclipse-based Apache Formatter to HBase Wiki (Revision 1390026) Result = FAILURE stack : Files : * /hbase/trunk/src/docbkx/developer.xml stack : Files : * /hbase/trunk/dev-support/hbase_eclipse_formatter.xml * /hbase/trunk/src/docbkx/developer.xml * /hbase/trunk/src/docbkx/troubleshooting.xml Add Eclipse-based Apache Formatter to HBase Wiki Key: HBASE-3678 URL: https://issues.apache.org/jira/browse/HBASE-3678 Project: HBase Issue Type: Improvement Reporter: Nicolas Spiegelberg Assignee: Nicolas Spiegelberg Priority: Trivial Fix For: 0.92.0 Attachments: eclipse_formatter_apache.xml Currently, on http://wiki.apache.org/hadoop/Hbase/HowToContribute , we tell the user to follow Sun's code conventions and then add a couple things. For lazy people like myself, it would be much easier to just tell us to import an Apache formatter into your Eclipse project and not worry about it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?
[ https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463105#comment-13463105 ] Hudson commented on HBASE-6868:

Integrated in HBase-TRUNK #3377 (See [https://builds.apache.org/job/HBase-TRUNK/3377/])
HBASE-6868 Skip checksum is broke; are we double-checksumming by default? (Revision 1390013)
Result = FAILURE
larsh : Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java

Skip checksum is broke; are we double-checksumming by default? -- Key: HBASE-6868 URL: https://issues.apache.org/jira/browse/HBASE-6868 Project: HBase Issue Type: Bug Components: HFile, wal Affects Versions: 0.94.0, 0.94.1 Reporter: LiuLei Assignee: Lars Hofhansl Priority: Blocker Fix For: 0.94.2, 0.96.0 Attachments: 6868-0.94.txt, 6868-0.96-idea.txt, 6868-0.96-v2.txt, 6868-0.96-v3.txt

The HFile contains checksums to decrease IOPS, so when HBase reads an HFile it doesn't need to read the checksum from the HDFS meta file. But the HLog file doesn't contain checksums, so when HBase reads the HLog it must read the checksum from the HDFS meta file. We could add setSkipChecksum per file to HDFS, or we could write checksums into the WAL if this skip-checksum facility is enabled.
[jira] [Commented] (HBASE-5691) Importtsv stops the webservice from which it is evoked
[ https://issues.apache.org/jira/browse/HBASE-5691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13463106#comment-13463106 ] Hudson commented on HBASE-5691: --- Integrated in HBase-TRUNK #3377 (See [https://builds.apache.org/job/HBase-TRUNK/3377/]) HBASE-5691 and HBASE-3678 New standard HBase code formatter AND Add Eclipse-based Apache Formatter to HBase Wiki (Revision 1390028) HBASE-5691 and HBASE-3678 New standard HBase code formatter AND Add Eclipse-based Apache Formatter to HBase Wiki (Revision 1390026) Result = FAILURE stack : Files : * /hbase/trunk/src/docbkx/developer.xml stack : Files : * /hbase/trunk/dev-support/hbase_eclipse_formatter.xml * /hbase/trunk/src/docbkx/developer.xml * /hbase/trunk/src/docbkx/troubleshooting.xml Importtsv stops the webservice from which it is evoked -- Key: HBASE-5691 URL: https://issues.apache.org/jira/browse/HBASE-5691 Project: HBase Issue Type: Bug Affects Versions: 0.90.4 Reporter: debarshi basak Priority: Minor I was trying to run importtsv from a servlet. Everytime after the completion of job, the tomcat server was shutdown. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5961) New standard HBase code formatter
[ https://issues.apache.org/jira/browse/HBASE-5961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463109#comment-13463109 ] Jesse Yates commented on HBASE-5961: hmmm, looks like we might need to add this to the rat excludes file too. New standard HBase code formatter - Key: HBASE-5961 URL: https://issues.apache.org/jira/browse/HBASE-5961 Project: HBase Issue Type: Improvement Components: build Affects Versions: 0.96.0 Reporter: Jesse Yates Assignee: Jesse Yates Priority: Minor Fix For: 0.96.0 Attachments: HBase-Formmatter.xml There is currently no good way of distributing the formatter that is currently the 'standard' in HBase. The standard Apache formatter is actually not very close to what we consider 'good'/'pretty' code. Further, it's not trivial to get a good formatter set up. Proposing two things: 1) Add a formatter to the dev tools and call out the formatter usage in the docs 2) Move to a 'better' formatter that is not the standard Apache formatter.
[jira] [Commented] (HBASE-6401) HBase may lose edits after a crash if used with HDFS 1.0.3 or older
[ https://issues.apache.org/jira/browse/HBASE-6401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463115#comment-13463115 ] Lars Hofhansl commented on HBASE-6401:

Hadoop-2 has other issues, though (see last few comments on HDFS-744).

HBase may lose edits after a crash if used with HDFS 1.0.3 or older --- Key: HBASE-6401 URL: https://issues.apache.org/jira/browse/HBASE-6401 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.0 Environment: all Reporter: nkeywal Assignee: nkeywal Priority: Critical Attachments: TestReadAppendWithDeadDN.java

This comes from an hdfs bug, fixed in some hdfs versions. I haven't found the hdfs jira for this. Context: HBase Write Ahead Log features. This is using hdfs append. If the node crashes, the file that was written is read by other processes to replay the action.
- So we have in hdfs one (dead) process writing with another process reading.
- But, despite the call to syncFs, we don't always see the data when we have a dead node. It seems to be because the call in DFSClient#updateBlockInfo ignores the ipc errors and sets the length to 0.
- So we may miss all the writes to the last block if we try to connect to the dead DN.
hdfs 1.0.3, branch-1 or branch-1-win: we have the issue
http://svn.apache.org/viewvc/hadoop/common/branches/branch-1/src/hdfs/org/apache/hadoop/hdfs/DFSClient.java?revision=1359853&view=markup
hdfs branch-2 or trunk: we should not have the issue (but not tested)
http://svn.apache.org/viewvc/hadoop/common/branches/branch-2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java?view=markup
The attached test will fail ~50% of the time.
[jira] [Created] (HBASE-6880) Failure in assigning root causes system hang
Jimmy Xiang created HBASE-6880: -- Summary: Failure in assigning root causes system hang Key: HBASE-6880 URL: https://issues.apache.org/jira/browse/HBASE-6880 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang While looking into a TestReplication failure, I found that assignRoot can sometimes fail, for example when the RS is not serving traffic yet. In this case, the master keeps waiting for root to become available, which may never happen. We need to gracefully terminate the master if root is not assigned properly.
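The hang described above is an unbounded wait on an event that may never arrive. A bounded wait that lets the caller shut down cleanly could be sketched as follows. Everything here is hypothetical (`BoundedRootWait`, the latch standing in for "root is assigned", the timeout value); it illustrates the general pattern only and is not the master's actual startup code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: wait for root assignment with a deadline instead of forever. */
class BoundedRootWait {
  /**
   * Returns true if the latch opened within the timeout. Returns false on timeout
   * or interruption, in which case the caller can terminate gracefully.
   */
  static boolean awaitRoot(CountDownLatch rootAssigned, long timeoutMs) {
    try {
      return rootAssigned.await(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt for the caller
      return false;
    }
  }

  public static void main(String[] args) {
    CountDownLatch rootAssigned = new CountDownLatch(1); // never counted down here
    if (!awaitRoot(rootAssigned, 50)) {
      System.out.println("root not assigned in time; shutting down");
    }
  }
}
```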
[jira] [Commented] (HBASE-6401) HBase may lose edits after a crash if used with HDFS 1.0.3 or older
[ https://issues.apache.org/jira/browse/HBASE-6401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463144#comment-13463144 ] nkeywal commented on HBASE-6401:

HDFS-3701 has just been fixed, so we may have a reasonable hdfs 1.1 version as HDFS-3703 made it as well. We need HDFS-3912 to be complete from a failure management point of view. Then there is the question of durability...

HBase may lose edits after a crash if used with HDFS 1.0.3 or older --- Key: HBASE-6401 URL: https://issues.apache.org/jira/browse/HBASE-6401 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.96.0 Environment: all Reporter: nkeywal Assignee: nkeywal Priority: Critical Attachments: TestReadAppendWithDeadDN.java

This comes from an hdfs bug, fixed in some hdfs versions. I haven't found the hdfs jira for this. Context: HBase Write Ahead Log features. This is using hdfs append. If the node crashes, the file that was written is read by other processes to replay the action.
- So we have in hdfs one (dead) process writing with another process reading.
- But, despite the call to syncFs, we don't always see the data when we have a dead node. It seems to be because the call in DFSClient#updateBlockInfo ignores the ipc errors and sets the length to 0.
- So we may miss all the writes to the last block if we try to connect to the dead DN.
hdfs 1.0.3, branch-1 or branch-1-win: we have the issue
http://svn.apache.org/viewvc/hadoop/common/branches/branch-1/src/hdfs/org/apache/hadoop/hdfs/DFSClient.java?revision=1359853&view=markup
hdfs branch-2 or trunk: we should not have the issue (but not tested)
http://svn.apache.org/viewvc/hadoop/common/branches/branch-2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java?view=markup
The attached test will fail ~50% of the time.
[jira] [Created] (HBASE-6881) All regionservers are marked offline even there is still one up
Jimmy Xiang created HBASE-6881: -- Summary: All regionservers are marked offline even there is still one up Key: HBASE-6881 URL: https://issues.apache.org/jira/browse/HBASE-6881 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang

This is an issue caused by HBASE-6438:
{noformat}
+        RegionPlan newPlan = plan;
+        if (!regionAlreadyInTransitionException) {
+          // Force a new plan and reassign. Will return null if no servers.
+          newPlan = getRegionPlan(state, plan.getDestination(), true);
+        }
+        if (newPlan == null) {
           this.timeoutMonitor.setAllRegionServersOffline(true);
           LOG.warn("Unable to find a viable location to assign region " +
             state.getRegion().getRegionNameAsString());
{noformat}
Here, when newPlan is null, plan.getDestination() could actually be up.
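A minimal sketch of the situation, assuming the second argument to getRegionPlan acts as a server to avoid: with a single region server, asking for "any server other than the current destination" yields nothing, yet the cluster is not offline. `AssignmentFallback`, `pickOther`, and `allServersOffline` are hypothetical names for illustration; this is not the contents of the attached patch.

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch: "no alternative server" is not the same as "no server at all". */
class AssignmentFallback {
  /** First online server other than the excluded one, or null if there is none. */
  static String pickOther(List<String> online, String excluded) {
    for (String server : online) {
      if (!server.equals(excluded)) {
        return server;
      }
    }
    return null;
  }

  /** Only an empty online list justifies flagging every region server as offline. */
  static boolean allServersOffline(List<String> online) {
    return online.isEmpty();
  }

  public static void main(String[] args) {
    List<String> online = Collections.singletonList("rs1"); // e.g. a one-server unit test
    String other = pickOther(online, "rs1");
    System.out.println("alternative=" + other + " allOffline=" + allServersOffline(online));
  }
}
```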
[jira] [Updated] (HBASE-6881) All regionservers are marked offline even there is still one up
[ https://issues.apache.org/jira/browse/HBASE-6881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-6881: ---

Description:
{noformat}
+        RegionPlan newPlan = plan;
+        if (!regionAlreadyInTransitionException) {
+          // Force a new plan and reassign. Will return null if no servers.
+          newPlan = getRegionPlan(state, plan.getDestination(), true);
+        }
+        if (newPlan == null) {
           this.timeoutMonitor.setAllRegionServersOffline(true);
           LOG.warn("Unable to find a viable location to assign region " +
             state.getRegion().getRegionNameAsString());
{noformat}
Here, when newPlan is null, plan.getDestination() could actually be up.

was: This is an issue caused by HBASE-6438:
{noformat}
+        RegionPlan newPlan = plan;
+        if (!regionAlreadyInTransitionException) {
+          // Force a new plan and reassign. Will return null if no servers.
+          newPlan = getRegionPlan(state, plan.getDestination(), true);
+        }
+        if (newPlan == null) {
           this.timeoutMonitor.setAllRegionServersOffline(true);
           LOG.warn("Unable to find a viable location to assign region " +
             state.getRegion().getRegionNameAsString());
{noformat}
Here, when newPlan is null, plan.getDestination() could be up actually.

All regionservers are marked offline even there is still one up --- Key: HBASE-6881 URL: https://issues.apache.org/jira/browse/HBASE-6881 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang

{noformat}
+        RegionPlan newPlan = plan;
+        if (!regionAlreadyInTransitionException) {
+          // Force a new plan and reassign. Will return null if no servers.
+          newPlan = getRegionPlan(state, plan.getDestination(), true);
+        }
+        if (newPlan == null) {
           this.timeoutMonitor.setAllRegionServersOffline(true);
           LOG.warn("Unable to find a viable location to assign region " +
             state.getRegion().getRegionNameAsString());
{noformat}
Here, when newPlan is null, plan.getDestination() could actually be up.
[jira] [Commented] (HBASE-6881) All regionservers are marked offline even there is still one up
[ https://issues.apache.org/jira/browse/HBASE-6881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463162#comment-13463162 ] Jimmy Xiang commented on HBASE-6881:

This is actually NOT an issue caused by HBASE-6438. I fixed the description. It is an existing issue. During a unit test, there could be just one region server. This can lead to HBASE-6880, and hanging tests.

All regionservers are marked offline even there is still one up --- Key: HBASE-6881 URL: https://issues.apache.org/jira/browse/HBASE-6881 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang

{noformat}
+        RegionPlan newPlan = plan;
+        if (!regionAlreadyInTransitionException) {
+          // Force a new plan and reassign. Will return null if no servers.
+          newPlan = getRegionPlan(state, plan.getDestination(), true);
+        }
+        if (newPlan == null) {
           this.timeoutMonitor.setAllRegionServersOffline(true);
           LOG.warn("Unable to find a viable location to assign region " +
             state.getRegion().getRegionNameAsString());
{noformat}
Here, when newPlan is null, plan.getDestination() could actually be up.
[jira] [Created] (HBASE-6882) Thrift IOError should include exception class
Mikhail Bautin created HBASE-6882: - Summary: Thrift IOError should include exception class Key: HBASE-6882 URL: https://issues.apache.org/jira/browse/HBASE-6882 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Return exception class as part of IOError thrown from the Thrift proxy or the embedded Thrift server in the regionserver. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
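The idea of carrying the exception class alongside the message can be sketched in plain Java as below. `ThriftStyleIOError` and its `exceptionClass` field are hypothetical stand-ins; the real IOError is generated from Hbase.thrift, and the field name used in the actual patch is not shown in this message.

```java
import java.io.FileNotFoundException;

/** Hypothetical stand-in for a Thrift-generated IOError that also records the cause's class. */
class ThriftStyleIOError extends Exception {
  private static final long serialVersionUID = 1L;

  /** Fully qualified class name of the server-side exception. */
  final String exceptionClass;

  ThriftStyleIOError(String message, String exceptionClass) {
    super(message);
    this.exceptionClass = exceptionClass;
  }

  /** Wraps any server-side failure so the client can branch on its type, not just its text. */
  static ThriftStyleIOError wrap(Throwable cause) {
    return new ThriftStyleIOError(cause.getMessage(), cause.getClass().getName());
  }

  public static void main(String[] args) {
    ThriftStyleIOError err = wrap(new FileNotFoundException("no such region file"));
    System.out.println(err.exceptionClass + ": " + err.getMessage());
  }
}
```

A non-Java client (the test plan mentions a C++ HBase client) could then compare the class name string to decide whether a failure is retryable.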
[jira] [Updated] (HBASE-6882) Thrift IOError should include exception class
[ https://issues.apache.org/jira/browse/HBASE-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-6882: --- Attachment: D5679.1.patch mbautin requested code review of [jira] [HBASE-6882] [89-fb] Thrift IOError should include exception class. Reviewers: Liyin, Karthik, aaiyer, chip, JIRA Return exception class as part of IOError thrown from the Thrift proxy or the embedded Thrift server in the regionserver. TEST PLAN Unit tests Test through C++ HBase client REVISION DETAIL https://reviews.facebook.net/D5679 AFFECTED FILES src/main/java/org/apache/hadoop/hbase/RegionException.java src/main/java/org/apache/hadoop/hbase/regionserver/HRegionThriftServer.java src/main/java/org/apache/hadoop/hbase/thrift/ThriftServerRunner.java src/main/java/org/apache/hadoop/hbase/thrift/generated/IOError.java src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift MANAGE HERALD DIFFERENTIAL RULES https://reviews.facebook.net/herald/view/differential/ WHY DID I GET THIS EMAIL? https://reviews.facebook.net/herald/transcript/13341/ To: Liyin, Karthik, aaiyer, chip, JIRA, mbautin Thrift IOError should include exception class - Key: HBASE-6882 URL: https://issues.apache.org/jira/browse/HBASE-6882 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: D5679.1.patch Return exception class as part of IOError thrown from the Thrift proxy or the embedded Thrift server in the regionserver. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6881) All regionservers are marked offline even there is still one up
[ https://issues.apache.org/jira/browse/HBASE-6881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-6881: --- Attachment: trunk-6881.patch
All regionservers are marked offline even there is still one up --- Key: HBASE-6881 URL: https://issues.apache.org/jira/browse/HBASE-6881 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang Attachments: trunk-6881.patch
{noformat}
+RegionPlan newPlan = plan;
+if (!regionAlreadyInTransitionException) {
+  // Force a new plan and reassign. Will return null if no servers.
+  newPlan = getRegionPlan(state, plan.getDestination(), true);
+}
+if (newPlan == null) {
   this.timeoutMonitor.setAllRegionServersOffline(true);
   LOG.warn("Unable to find a viable location to assign region " +
     state.getRegion().getRegionNameAsString());
{noformat}
Here, when newPlan is null, plan.getDestination() could be up actually.
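The condition described above can be sketched in isolation. This is only an illustration under assumptions, not the contents of trunk-6881.patch: `AssignmentSketch`, `allServersOffline`, and the use of plain strings for server names are hypothetical, and the sketch assumes that the second argument of `getRegionPlan` excludes that server from the candidates, so a null result says nothing about whether the original destination is still up.

```java
import java.util.Collections;
import java.util.Set;

class AssignmentSketch {
    /**
     * Returns true only when no server at all can host the region: no alternative
     * plan was found AND the original destination is not online either.
     */
    static boolean allServersOffline(String newPlanDestination,
                                     String originalDestination,
                                     Set<String> onlineServers) {
        if (newPlanDestination != null) {
            return false; // an alternative server exists
        }
        // The original destination was excluded from planning, so check it separately.
        return !onlineServers.contains(originalDestination);
    }

    public static void main(String[] args) {
        // Only rs1 is up, and rs1 was the excluded destination: not "all offline".
        System.out.println(allServersOffline(null, "rs1", Collections.singleton("rs1")));
    }
}
```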
[jira] [Updated] (HBASE-6881) All regionservers are marked offline even there is still one up
[ https://issues.apache.org/jira/browse/HBASE-6881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-6881: --- Status: Patch Available (was: Open)
All regionservers are marked offline even there is still one up --- Key: HBASE-6881 URL: https://issues.apache.org/jira/browse/HBASE-6881 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang Attachments: trunk-6881.patch
{noformat}
+RegionPlan newPlan = plan;
+if (!regionAlreadyInTransitionException) {
+  // Force a new plan and reassign. Will return null if no servers.
+  newPlan = getRegionPlan(state, plan.getDestination(), true);
+}
+if (newPlan == null) {
   this.timeoutMonitor.setAllRegionServersOffline(true);
   LOG.warn("Unable to find a viable location to assign region " +
     state.getRegion().getRegionNameAsString());
{noformat}
Here, when newPlan is null, plan.getDestination() could be up actually.
[jira] [Commented] (HBASE-6882) Thrift IOError should include exception class
[ https://issues.apache.org/jira/browse/HBASE-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13463218#comment-13463218 ] Mikhail Bautin commented on HBASE-6882: --- Phabricator diff for 0.89-fb: https://reviews.facebook.net/D5679 Thrift IOError should include exception class - Key: HBASE-6882 URL: https://issues.apache.org/jira/browse/HBASE-6882 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: D5679.1.patch Return exception class as part of IOError thrown from the Thrift proxy or the embedded Thrift server in the regionserver. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5456) Introduce PowerMock into our unit tests to reduce unnecessary method exposure
[ https://issues.apache.org/jira/browse/HBASE-5456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesse Yates updated HBASE-5456: --- Attachment: hbase-5456-v0.patch Attaching patch to add jmockit and powermock to the test dependencies. For more discussion and examples of why it's the right way to go, see http://search-hadoop.com/m/HbsjjRSKLc2 Introduce PowerMock into our unit tests to reduce unnecessary method exposure - Key: HBASE-5456 URL: https://issues.apache.org/jira/browse/HBASE-5456 Project: HBase Issue Type: Task Reporter: Ted Yu Attachments: hbase-5456-v0.patch We should introduce PowerMock into our unit tests so that we don't have to expose methods intended to be used by unit tests. Here was Benoit's reply to a user of asynchbase about testability: OpenTSDB has unit tests that are mocking out HBaseClient just fine [1]. You can mock out pretty much anything on the JVM: final, private, JDK stuff, etc. All you need is the right tools. I've been very happy with PowerMock. It supports Mockito and EasyMock. I've never been keen on mutilating public interfaces for the sake of testing. With tools like PowerMock, we can keep the public APIs tidy while mocking and overriding anything, even in the most private guts of the classes. [1] https://github.com/stumbleupon/opentsdb/blob/master/src/uid/TestUniqueId.java#L66
[jira] [Created] (HBASE-6883) CleanerChore treats .archive as a table and throws TableInfoMissingException
Jimmy Xiang created HBASE-6883: -- Summary: CleanerChore treats .archive as a table and throws TableInfoMissingException Key: HBASE-6883 URL: https://issues.apache.org/jira/browse/HBASE-6883 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang
{noformat}
2012-09-25 14:52:21,902 DEBUG org.apache.hadoop.hbase.util.FSTableDescriptors: Exception during readTableDecriptor. Current table name = .archive
org.apache.hadoop.hbase.TableInfoMissingException: No .tableinfo file under hdfs://c0322.hal.cloudera.com:56020/hbase/.archive
  at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptor(FSTableDescriptors.java:417)
  at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptor(FSTableDescriptors.java:408)
  at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:170)
  at org.apache.hadoop.hbase.util.FSTableDescriptors.getAll(FSTableDescriptors.java:201)
  at org.apache.hadoop.hbase.master.HMaster.getTableDescriptors(HMaster.java:2205)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at org.apache.hadoop.hbase.ipc.ProtobufRpcEngine$Server.call(ProtobufRpcEngine.java:357)
  at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1816)
{noformat}
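The trace above suggests that `.archive` is picked up while table directories under the HBase root are enumerated. One possible remedy, sketched here under assumptions, is to skip known non-table directories before reading descriptors. `TableDirFilterSketch`, `NON_TABLE_DIRS`, and its contents are hypothetical and chosen for illustration; this is not the fix that was eventually committed. An explicit list is used rather than a "starts with a dot" test because a catalog table directory such as `.META.` also begins with a dot.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class TableDirFilterSketch {
    // Assumed set of non-table directories under the HBase root; illustrative only.
    static final Set<String> NON_TABLE_DIRS =
        new HashSet<>(Arrays.asList(".archive", ".logs", ".oldlogs", ".corrupt"));

    /** Returns the entries that should be treated as table directories, in input order. */
    static List<String> tableDirs(List<String> rootDirEntries) {
        List<String> tables = new ArrayList<>();
        for (String name : rootDirEntries) {
            if (NON_TABLE_DIRS.contains(name)) {
                continue; // not a table, so it has no .tableinfo file to read
            }
            tables.add(name);
        }
        return tables;
    }

    public static void main(String[] args) {
        System.out.println(tableDirs(Arrays.asList(".archive", "usertable", ".META.")));
    }
}
```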
[jira] [Commented] (HBASE-6882) Thrift IOError should include exception class
[ https://issues.apache.org/jira/browse/HBASE-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13463294#comment-13463294 ] Phabricator commented on HBASE-6882: Liyin has accepted the revision [jira] [HBASE-6882] [89-fb] Thrift IOError should include exception class. LGTM ! REVISION DETAIL https://reviews.facebook.net/D5679 BRANCH ioerror_class_name To: Liyin, Karthik, aaiyer, chip, JIRA, mbautin Thrift IOError should include exception class - Key: HBASE-6882 URL: https://issues.apache.org/jira/browse/HBASE-6882 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: D5679.1.patch Return exception class as part of IOError thrown from the Thrift proxy or the embedded Thrift server in the regionserver. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6881) All regionservers are marked offline even there is still one up
[ https://issues.apache.org/jira/browse/HBASE-6881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463307#comment-13463307 ] Hadoop QA commented on HBASE-6881: --
-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12546581/trunk-6881.patch against trunk revision .
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
-1 javadoc. The javadoc tool appears to have generated 140 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in .
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2932//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2932//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2932//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2932//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2932//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2932//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2932//console
This message is automatically generated.
All regionservers are marked offline even there is still one up --- Key: HBASE-6881 URL: https://issues.apache.org/jira/browse/HBASE-6881 Project: HBase Issue Type: Bug Reporter: Jimmy Xiang Assignee: Jimmy Xiang Attachments: trunk-6881.patch
{noformat}
+RegionPlan newPlan = plan;
+if (!regionAlreadyInTransitionException) {
+  // Force a new plan and reassign. Will return null if no servers.
+  newPlan = getRegionPlan(state, plan.getDestination(), true);
+}
+if (newPlan == null) {
   this.timeoutMonitor.setAllRegionServersOffline(true);
   LOG.warn("Unable to find a viable location to assign region " +
     state.getRegion().getRegionNameAsString());
{noformat}
Here, when newPlan is null, plan.getDestination() could be up actually.
[jira] [Commented] (HBASE-6424) TestReplication frequently hangs
[ https://issues.apache.org/jira/browse/HBASE-6424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13463308#comment-13463308 ] Jimmy Xiang commented on HBASE-6424: May relate to HBASE-6880 TestReplication frequently hangs Key: HBASE-6424 URL: https://issues.apache.org/jira/browse/HBASE-6424 Project: HBase Issue Type: Bug Components: Replication, test Affects Versions: 0.94.0 Reporter: Andrew Purtell Attachments: testReplication.jstack TestReplication frequently hangs. Separated out from HBASE-6406. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6572) Tiered HFile storage
[ https://issues.apache.org/jira/browse/HBASE-6572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-6572: -- Description: Consider how we might enable tiered HFile storage. If HDFS has the capability, we could create certain files on solid state devices where they might be frequently accessed, especially for random reads; and others (and by default) on spinning media as before. We could support the move of frequently read HFiles from spinning media to solid state. We already have CF statistics for this, would only need to add requisite admin interface; could even consider an autotiering option. Dhruba Borthakur did some early work in this area and wrote up his findings: http://hadoopblog.blogspot.com/2012/05/hadoop-and-solid-state-drives.html . It is important to note the findings but I suggest most of the recommendations are out of scope of this JIRA. This JIRA seeks to find an initial use case that produces a reasonable benefit, and serves as a testbed for further improvements. If I may paraphrase Dhruba's findings (any misstatements and errors are mine): First, the DFSClient code paths introduce significant latency, so the HDFS client (and presumably the DataNode, as the next bottleneck) will need significant work to knock that down. Need to investigate optimized (perhaps read-only) DFS clients, server side read and caching strategies. Second, RegionServers are heavily threaded and this imposes a lot of monitor contention and context switching cost. Need to investigate reducing the number of threads in a RegionServer, nonblocking IO and RPC. was:Consider how we might enable tiered HFile storage. If HDFS has the capability, we could create certain files on solid state devices where they might be frequently accessed, especially for random reads; and others (and by default) on spinning media as before. We could support the move of frequently read HFiles from spinning media to solid state. 
We already have CF statistics for this, would only need to add requisite admin interface; could even consider an autotiering option. Tiered HFile storage Key: HBASE-6572 URL: https://issues.apache.org/jira/browse/HBASE-6572 Project: HBase Issue Type: Brainstorming Reporter: Andrew Purtell Assignee: Andrew Purtell Consider how we might enable tiered HFile storage. If HDFS has the capability, we could create certain files on solid state devices where they might be frequently accessed, especially for random reads; and others (and by default) on spinning media as before. We could support the move of frequently read HFiles from spinning media to solid state. We already have CF statistics for this, would only need to add requisite admin interface; could even consider an autotiering option. Dhruba Borthakur did some early work in this area and wrote up his findings: http://hadoopblog.blogspot.com/2012/05/hadoop-and-solid-state-drives.html . It is important to note the findings but I suggest most of the recommendations are out of scope of this JIRA. This JIRA seeks to find an initial use case that produces a reasonable benefit, and serves as a testbed for further improvements. If I may paraphrase Dhruba's findings (any misstatements and errors are mine): First, the DFSClient code paths introduce significant latency, so the HDFS client (and presumably the DataNode, as the next bottleneck) will need significant work to knock that down. Need to investigate optimized (perhaps read-only) DFS clients, server side read and caching strategies. Second, RegionServers are heavily threaded and this imposes a lot of monitor contention and context switching cost. Need to investigate reducing the number of threads in a RegionServer, nonblocking IO and RPC. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6637) Move DaemonThreadFactory into Threads and Threads to hbase-common
[ https://issues.apache.org/jira/browse/HBASE-6637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13463344#comment-13463344 ] Hudson commented on HBASE-6637: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #192 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/192/]) HBASE-6637 Argghh... Missed deleted files too (Revision 1390040) HBASE-6637 Missed new files (Revision 1390035) HBASE-6637 Move DaemonThreadFactory into Threads and Threads to hbase-common (Jesse Yates) (Revision 1390034) Result = FAILURE larsh : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/Threads.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestThreads.java larsh : Files : * /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/util/Threads.java * /hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/util/TestThreads.java larsh : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/HTable.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestHCM.java Move DaemonThreadFactory into Threads and Threads to hbase-common - Key: HBASE-6637 URL: https://issues.apache.org/jira/browse/HBASE-6637 Project: HBase Issue Type: Bug Affects Versions: 0.96.0 Reporter: Jesse Yates Assignee: Jesse Yates Priority: Minor Fix For: 0.96.0 Attachments: hbase-6637-r1.patch, hbase-6637-r1.patch, hbase-6637-v0.patch, hbase-6637-v2.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3678) Add Eclipse-based Apache Formatter to HBase Wiki
[ https://issues.apache.org/jira/browse/HBASE-3678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13463345#comment-13463345 ] Hudson commented on HBASE-3678: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #192 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/192/]) HBASE-5691 and HBASE-3678 New standard HBase code formatter AND Add Eclipse-based Apache Formatter to HBase Wiki (Revision 1390028) HBASE-5691 and HBASE-3678 New standard HBase code formatter AND Add Eclipse-based Apache Formatter to HBase Wiki (Revision 1390026) Result = FAILURE stack : Files : * /hbase/trunk/src/docbkx/developer.xml stack : Files : * /hbase/trunk/dev-support/hbase_eclipse_formatter.xml * /hbase/trunk/src/docbkx/developer.xml * /hbase/trunk/src/docbkx/troubleshooting.xml Add Eclipse-based Apache Formatter to HBase Wiki Key: HBASE-3678 URL: https://issues.apache.org/jira/browse/HBASE-3678 Project: HBase Issue Type: Improvement Reporter: Nicolas Spiegelberg Assignee: Nicolas Spiegelberg Priority: Trivial Fix For: 0.92.0 Attachments: eclipse_formatter_apache.xml Currently, on http://wiki.apache.org/hadoop/Hbase/HowToContribute , we tell the user to follow Sun's code conventions and then add a couple things. For lazy people like myself, it would be much easier to just tell us to import an Apache formatter into your Eclipse project and not worry about it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5691) Importtsv stops the webservice from which it is evoked
[ https://issues.apache.org/jira/browse/HBASE-5691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13463347#comment-13463347 ] Hudson commented on HBASE-5691: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #192 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/192/]) HBASE-5691 and HBASE-3678 New standard HBase code formatter AND Add Eclipse-based Apache Formatter to HBase Wiki (Revision 1390028) HBASE-5691 and HBASE-3678 New standard HBase code formatter AND Add Eclipse-based Apache Formatter to HBase Wiki (Revision 1390026) Result = FAILURE stack : Files : * /hbase/trunk/src/docbkx/developer.xml stack : Files : * /hbase/trunk/dev-support/hbase_eclipse_formatter.xml * /hbase/trunk/src/docbkx/developer.xml * /hbase/trunk/src/docbkx/troubleshooting.xml Importtsv stops the webservice from which it is evoked -- Key: HBASE-5691 URL: https://issues.apache.org/jira/browse/HBASE-5691 Project: HBase Issue Type: Bug Affects Versions: 0.90.4 Reporter: debarshi basak Priority: Minor I was trying to run importtsv from a servlet. Everytime after the completion of job, the tomcat server was shutdown. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6868) Skip checksum is broke; are we double-checksumming by default?
[ https://issues.apache.org/jira/browse/HBASE-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463346#comment-13463346 ] Hudson commented on HBASE-6868: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #192 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/192/]) HBASE-6868 Skip checksum is broke; are we double-checksumming by default? (Revision 1390013) Result = FAILURE larsh : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java Skip checksum is broke; are we double-checksumming by default? -- Key: HBASE-6868 URL: https://issues.apache.org/jira/browse/HBASE-6868 Project: HBase Issue Type: Bug Components: HFile, wal Affects Versions: 0.94.0, 0.94.1 Reporter: LiuLei Assignee: Lars Hofhansl Priority: Blocker Fix For: 0.94.2, 0.96.0 Attachments: 6868-0.94.txt, 6868-0.96-idea.txt, 6868-0.96-v2.txt, 6868-0.96-v3.txt HFiles contain their own checksums in order to reduce IOPS, so when HBase reads an HFile it does not need to read the checksum from the HDFS meta file. HLog files, however, do not contain checksums, so when HBase reads an HLog it must read the checksum from the HDFS meta file. We could add setSkipChecksum per file to HDFS, or we could write checksums into the WAL if this skip checksum facility is enabled.
[jira] [Updated] (HBASE-6353) Snapshots shell
[ https://issues.apache.org/jira/browse/HBASE-6353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesse Yates updated HBASE-6353: --- Issue Type: Sub-task (was: New Feature) Parent: HBASE-6055
Snapshots shell --- Key: HBASE-6353 URL: https://issues.apache.org/jira/browse/HBASE-6353 Project: HBase Issue Type: Sub-task Components: shell Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Attachments: HBASE-6353-v0.patch
h6. hbase shell with snapshot commands
* snapshot <snapshot name> <table name>
** Take a snapshot of the specified table with the specified name
* restore_snapshot <snapshot name>
** Restore the specified snapshot on the original table
* mount_snapshot <snapshot name> <table name> [readonly]
** Load the snapshot data as the specified table (optional readonly flag)
* list_snapshots [filter]
** Show a list of snapshots
* delete_snapshot <snapshot name>
** Remove the specified snapshot
h6. Restore Table
Given a snapshot name, restore overrides the original table with the snapshot content. Before restoring, a new snapshot of the table is taken, just to avoid bad situations. (If the table is not disabled we can keep serving reads.) This allows a full and quick rollback to a previous snapshot.
h6. Mount Table (Aka Clone Table)
Given a snapshot name, a new table is created with the content of the specified snapshot. This operation allows:
* To have an old version of the table in parallel with the current one.
** Look at the snapshot side-by-side with the current table before deciding whether to roll back or not
* To restore only individual items (only some small range of data was lost from the current table)
** An MR job that scans the cloned table and updates the data in the original one. (Partial restore of the data)
* If the table is not marked as read-only
** To add/remove data from this table without affecting the original one or the snapshot.
h6. Open points
* Add a snapshot type option on the take snapshot command (global, timestamp)?
* Keep the restore separate from mount?
[jira] [Updated] (HBASE-6025) Expose Hadoop Dynamic Metrics through JSON Rest interface
[ https://issues.apache.org/jira/browse/HBASE-6025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-6025: - Resolution: Fixed Fix Version/s: 0.96.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Applied to trunk. Thanks for the patch Elliott. Expose Hadoop Dynamic Metrics through JSON Rest interface - Key: HBASE-6025 URL: https://issues.apache.org/jira/browse/HBASE-6025 Project: HBase Issue Type: Improvement Affects Versions: 0.96.0 Reporter: Elliott Clark Assignee: Elliott Clark Fix For: 0.96.0 Attachments: HBASE-6025-0.patch, HBASE-6025-1.patch, HBASE-6025-2.patch, HBASE-6025-3.patch, HBASE-6025-4.patch, hbase-jmx2.patch, hbase-jmx.patch, hbase-jmx.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5844) Delete the region servers znode after a regions server crash
[ https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463401#comment-13463401 ] Jean-Daniel Cryans commented on HBASE-5844: ---
One thing that worries me about this patch is the situation where the pid file is gone and someone tries to start the region server. It happened to me a bunch of times. I tried it with your patch, and since it removes the ephemeral znode it _kills_ the region server that's already running and doesn't start a new one because the ports are already occupied.
I'm not sure if this is related to this patch, but we're now missing info when using the scripts. We used to have:
{noformat}
su-jdcryans-2:0.94 jdcryans$ ./bin/start-hbase.sh
localhost: starting zookeeper, logging to /Users/jdcryans/Work/HBase/0.94/bin/../logs/hbase-jdcryans-zookeeper-h-25-185.sfo.stumble.net.out
starting master, logging to /Users/jdcryans/Work/HBase/0.94/bin/../logs/hbase-jdcryans-master-h-25-185.sfo.stumble.net.out
localhost: starting regionserver, logging to /Users/jdcryans/Work/HBase/0.94/bin/../logs/hbase-jdcryans-regionserver-h-25-185.sfo.stumble.net.out
{noformat}
Now we have:
{noformat}
su-jdcryans-2:trunk-commit jdcryans$ ./bin/start-hbase.sh
su-jdcryans-2:trunk-commit jdcryans$
{noformat}
Delete the region servers znode after a regions server crash Key: HBASE-5844 URL: https://issues.apache.org/jira/browse/HBASE-5844 Project: HBase Issue Type: Improvement Components: regionserver, scripts Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Fix For: 0.96.0 Attachments: 5844.v1.patch, 5844.v2.patch, 5844.v3.patch, 5844.v3.patch, 5844.v4.patch
Today, if the region server crashes, its znode is not deleted in ZooKeeper, so the recovery process will start only after a timeout, usually 30s. By deleting the znode in the start script, we remove this delay and the recovery starts immediately.
[jira] [Commented] (HBASE-5844) Delete the region servers znode after a regions server crash
[ https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463409#comment-13463409 ] stack commented on HBASE-5844: -- Looking at this w/ j-d, we no longer do nohup, so that the parent process can stick around to watch out for the server crash. This makes it so that there are now two hbase processes listed per launched daemon. This is kinda ugly. When we have this bash script watching the running java process, we verge into the territory normally occupied by babysitters like supervise. Our parent bash script will always be less than a real babysitter -- supervise, god, etc. -- so maybe we should just have this kill-znode step as an optional script w/ a prescription for how to set it up -- e.g. run the znode remover on daemon crash before starting a new one (if we want supervise to start a new one). I'm thinking we should back this out since there are open questions still. Delete the region servers znode after a regions server crash Key: HBASE-5844 URL: https://issues.apache.org/jira/browse/HBASE-5844 Project: HBase Issue Type: Improvement Components: regionserver, scripts Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Fix For: 0.96.0 Attachments: 5844.v1.patch, 5844.v2.patch, 5844.v3.patch, 5844.v3.patch, 5844.v4.patch Today, if the region server crashes, its znode is not deleted in ZooKeeper, so the recovery process will start only after a timeout, usually 30s. By deleting the znode in the start script, we remove this delay and the recovery starts immediately.
[jira] [Commented] (HBASE-6055) Snapshots in HBase 0.96
[ https://issues.apache.org/jira/browse/HBASE-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463433#comment-13463433 ] Jesse Yates commented on HBASE-6055: 
I was going through the offline snapshot code (https://github.com/jyates/hbase/tree/offline-snapshots) and noticed that apparently I wrote the following:
{code}
Path editsdir = HLog.getRegionDirRecoveredEditsDir(HRegion.getRegionDir(tdir, regionInfo.getEncodedName()));
WALReferenceTask op = new WALReferenceTask(snapshot, this.monitor, editsdir, conf, fs, disabledTableSnapshot);
{code}
For referencing the current hfiles for a disabled table, this makes no sense. However, it got me thinking about dealing with recovered edits for a table. Even if a table is disabled, it may have recovered edits that haven't been applied to the table (a RS comes up, splits the logs, but then dies again before replaying the split log). If I'm reading the log-splitting code correctly, I think it archives the original HLog after splitting, but not before the edits are applied to the region. This would mean we also need to reference the recovered.edits directory under each region, if we keep the current implementation... right?
I was thinking that instead we can keep the hfiles around in the .logs directory until the recovered.edits files for that log file have been replayed. This way we can avoid another task for snapshotting (referencing all the recovered edits) and keep everything fairly simple. There would need to be some extra work to keep track of the source hlog: either an 'info' file for the source hlog that lists the written recovered.edits files, or special naming of the recovered.edits files that points back to the source file. Thoughts?
Snapshots in HBase 0.96 --- Key: HBASE-6055 URL: https://issues.apache.org/jira/browse/HBASE-6055 Project: HBase Issue Type: New Feature Components: Client, master, regionserver, snapshots, Zookeeper Reporter: Jesse Yates Assignee: Jesse Yates Fix For: hbase-6055, 0.96.0 Attachments: Snapshots in HBase.docx
Continuation of HBASE-50 for the current trunk. Since the implementation has drastically changed, opening as a new ticket.
[jira] [Commented] (HBASE-6870) HTable#coprocessorExec always scan the whole table
[ https://issues.apache.org/jira/browse/HBASE-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463466#comment-13463466 ]

chunhui shen commented on HBASE-6870:

[~v.himanshu]

These two if statements were not introduced by this patch, so I just kept them as they were.

{code}
public LinkedHashMap<byte[], HRegionLocation> getKeysToRegionsInRange(
{code}

Yes, it could be private.

Thanks for the review. I will rework the patch to address the other comments later.

HTable#coprocessorExec always scan the whole table

Key: HBASE-6870
URL: https://issues.apache.org/jira/browse/HBASE-6870
Project: HBase
Issue Type: Improvement
Components: Coprocessors
Affects Versions: 0.94.1
Reporter: chunhui shen
Assignee: chunhui shen
Attachments: HBASE-6870.patch, HBASE-6870-testPerformance.patch, HBASE-6870v2.patch, HBASE-6870v3.patch

In the current logic, HTable#coprocessorExec always scans the whole table. This is inefficient and, under heavy coprocessorExec load, affects the RegionServer carrying .META.
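
To illustrate the idea behind looking up only the regions that overlap a requested key range instead of walking every region, here is a small sketch. It is not the patch: `RegionRangeLookup` and `locationsInRange` are hypothetical names, keys are plain strings rather than `byte[]`, region locations are simple labels, and the sketch assumes `startKey <= endKey` and does not model HBase's convention of an empty end key meaning "end of table".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;

// Hypothetical illustration only; not the HTable implementation.
class RegionRangeLookup {
    // regionsByStartKey maps each region's start key to a location label.
    // Returns the locations of regions that may hold keys in [startKey, endKey].
    static List<String> locationsInRange(NavigableMap<String, String> regionsByStartKey,
                                         String startKey, String endKey) {
        // The region containing startKey is the one with the greatest start key <= startKey.
        String floor = regionsByStartKey.floorKey(startKey);
        String from = (floor != null) ? floor : startKey;
        // Every region whose start key lies in [from, endKey] overlaps the requested range.
        return new ArrayList<>(regionsByStartKey.subMap(from, true, endKey, true).values());
    }
}
```

For a table with regions starting at "", "g" and "p", a request for keys "h" through "k" touches only the region starting at "g", so the other two regions are never contacted.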
[jira] [Commented] (HBASE-6025) Expose Hadoop Dynamic Metrics through JSON Rest interface
[ https://issues.apache.org/jira/browse/HBASE-6025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463471#comment-13463471 ]

Hudson commented on HBASE-6025:

Integrated in HBase-TRUNK #3379 (See [https://builds.apache.org/job/HBase-TRUNK/3379/])
HBASE-6025 Expose Hadoop Dynamic Metrics through JSON Rest interface; REAPPLY (Revision 1390240)
HBASE-6025 Expose Hadoop Dynamic Metrics through JSON Rest interface; REVERT -- OVERCOMMIT (Revision 1390239)
HBASE-6025 Expose Hadoop Dynamic Metrics through JSON Rest interface (Revision 1390238)

Result = FAILURE

stack :
Files :
* /hbase/trunk/hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/master/MasterStatusTmpl.jamon
* /hbase/trunk/hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/regionserver/RSStatusTmpl.jamon
* /hbase/trunk/hbase-server/src/main/resources/hbase-webapps/master/table.jsp
* /hbase/trunk/hbase-server/src/main/resources/hbase-webapps/master/tablesDetailed.jsp
* /hbase/trunk/hbase-server/src/main/resources/hbase-webapps/master/zk.jsp

stack :
Files :
* /hbase/trunk/hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/master/MasterStatusTmpl.jamon
* /hbase/trunk/hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/regionserver/RSStatusTmpl.jamon
* /hbase/trunk/hbase-server/src/main/resources/hbase-webapps/master/table.jsp
* /hbase/trunk/hbase-server/src/main/resources/hbase-webapps/master/tablesDetailed.jsp
* /hbase/trunk/hbase-server/src/main/resources/hbase-webapps/master/zk.jsp
* /hbase/trunk/hbase-server/src/main/ruby/hbase/admin.rb
* /hbase/trunk/hbase-server/src/main/ruby/hbase/hbase.rb
* /hbase/trunk/hbase-server/src/main/ruby/hbase/table.rb
* /hbase/trunk/hbase-server/src/main/ruby/shell.rb
* /hbase/trunk/hbase-server/src/main/ruby/shell/commands.rb
* /hbase/trunk/hbase-server/src/main/ruby/shell/formatter.rb

stack :
Files :
* /hbase/trunk/hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/master/MasterStatusTmpl.jamon
* /hbase/trunk/hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/regionserver/RSStatusTmpl.jamon
* /hbase/trunk/hbase-server/src/main/resources/hbase-webapps/master/table.jsp
* /hbase/trunk/hbase-server/src/main/resources/hbase-webapps/master/tablesDetailed.jsp
* /hbase/trunk/hbase-server/src/main/resources/hbase-webapps/master/zk.jsp
* /hbase/trunk/hbase-server/src/main/ruby/hbase/admin.rb
* /hbase/trunk/hbase-server/src/main/ruby/hbase/hbase.rb
* /hbase/trunk/hbase-server/src/main/ruby/hbase/table.rb
* /hbase/trunk/hbase-server/src/main/ruby/shell.rb
* /hbase/trunk/hbase-server/src/main/ruby/shell/commands.rb
* /hbase/trunk/hbase-server/src/main/ruby/shell/formatter.rb

Expose Hadoop Dynamic Metrics through JSON Rest interface

Key: HBASE-6025
URL: https://issues.apache.org/jira/browse/HBASE-6025
Project: HBase
Issue Type: Improvement
Affects Versions: 0.96.0
Reporter: Elliott Clark
Assignee: Elliott Clark
Fix For: 0.96.0
Attachments: HBASE-6025-0.patch, HBASE-6025-1.patch, HBASE-6025-2.patch, HBASE-6025-3.patch, HBASE-6025-4.patch, hbase-jmx2.patch, hbase-jmx.patch, hbase-jmx.patch
[jira] [Commented] (HBASE-6679) RegionServer aborts due to race between compaction and split
[ https://issues.apache.org/jira/browse/HBASE-6679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463494#comment-13463494 ]

Devaraj Das commented on HBASE-6679:

Okay, I did some digging into the logs (that were attached to the jira earlier) and the code. It doesn't seem like a race between compaction and split (apologies for the confusion I might have created). The two are sequential (at the end of a compaction, a split is requested). But I'll note that the split happens in a separate thread. The problem is that the daughter tries to open a reader to a file that doesn't exist.

{noformat}
java.io.IOException: Failed ip-10-4-197-133.ec2.internal,60020,1346119706203-daughterOpener=4efb1c92918bbf3c54d0ead3345bb735
        at org.apache.hadoop.hbase.regionserver.SplitTransaction.openDaughters(SplitTransaction.java:368)
        at org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:456)
        at org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.FileNotFoundException: File does not exist: /apps/hbase/data/TestLoadAndVerify_1346120615716/5689a8785bbc9a8aa8e526cd7ef1542a/f1/5a55df83829f401993d95ecf2e539ba1
{noformat}

The method SplitTransaction.createDaughters creates the reference files (via a call to the method SplitTransaction.splitStoreFiles) that the daughter then tries to open. The list of files to create references to is the set of entries in the storeFiles field in Store.java (obtained via the call to this.parent.close in createDaughters). The storeFiles field is last updated (in the thread doing the compaction) in the method Store.completeCompaction.

My suspicion is that the problem is due to the fact that accesses to storeFiles are not synchronized, and the field is not volatile either. This leads to inconsistencies between the compaction thread and the split thread, and the split thread doesn't see the last updated value of the field.

If the above theory is right (and it is only a theory), then the solution could be to make the storeFiles field volatile. Thoughts?

RegionServer aborts due to race between compaction and split

Key: HBASE-6679
URL: https://issues.apache.org/jira/browse/HBASE-6679
Project: HBase
Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
Fix For: 0.92.3
Attachments: rs-crash-parallel-compact-split.log

In our nightlies, we have seen RS aborts due to compaction and split racing. Original parent file gets deleted after the compaction, and hence, the daughters don't find the parent data file. The RS kills itself when this happens. Will attach a snippet of the relevant RS logs.
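
The remedy suggested above (publishing the store file list through a volatile field) can be sketched as follows. This is a minimal illustration of the Java memory model guarantee, not the Store.java code: `StoreFilesHolder` and its methods are hypothetical, file names are plain strings, and a single compaction thread is assumed to be the only writer. Whether this actually explains or fixes the reported abort is not confirmed in the thread.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the store file bookkeeping; not HBase code.
class StoreFilesHolder {
    // volatile: a write by the compaction thread happens-before any later read
    // of this field by the split thread, so the reader cannot see a stale reference.
    private volatile List<String> storeFiles = Collections.emptyList();

    // Called by the (single) compaction thread once the compacted file is in place.
    void completeCompaction(List<String> compactedAway, String result) {
        List<String> next = new ArrayList<>(storeFiles);
        next.removeAll(compactedAway);
        next.add(result);
        // One volatile write publishes the fully built, unmodifiable list.
        storeFiles = Collections.unmodifiableList(next);
    }

    // Called by the split thread to decide which files to create references to.
    List<String> currentFiles() {
        return storeFiles;
    }
}
```

An immutable list alone only guarantees that a given list instance never changes; it is the volatile (or otherwise synchronized) field that guarantees another thread observes the newest instance.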
[jira] [Commented] (HBASE-6882) Thrift IOError should include exception class
[ https://issues.apache.org/jira/browse/HBASE-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463503#comment-13463503 ]

liang xie commented on HBASE-6882:

Hi Mikhail, it seems the attached file is not against the current community trunk version? I ask because I saw:

{code:title=Hbase.thrift|borderStyle=solid}
 exception IOError {
   1: string message,
-  2: i64 backoffTimeMillis
+  2: i64 backoffTimeMillis,
+  3: string exceptionClass
 }
{code}

There is no backoffTimeMillis parameter in struct IOError in the current trunk code.

Another thing: do we encourage using thrift2 more than thrift right now? If so, maybe changing thrift2's TIOError would be good?

Thrift IOError should include exception class

Key: HBASE-6882
URL: https://issues.apache.org/jira/browse/HBASE-6882
Project: HBase
Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Attachments: D5679.1.patch

Return exception class as part of IOError thrown from the Thrift proxy or the embedded Thrift server in the regionserver.
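
As a rough illustration of what carrying the exception class alongside the message means on the server side, the sketch below fills two fields from a caught exception. It is an assumption-laden stand-in: `IOErrorInfo` is a hypothetical plain Java class, not the Thrift-generated `IOError`, and it omits `backoffTimeMillis` since that field is reported above as absent from trunk.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

// Hypothetical plain-Java mirror of the proposed fields; not Thrift-generated code.
class IOErrorInfo {
    final String message;        // corresponds to "1: string message"
    final String exceptionClass; // corresponds to the proposed "string exceptionClass"

    IOErrorInfo(String message, String exceptionClass) {
        this.message = message;
        this.exceptionClass = exceptionClass;
    }

    // Capture both the message and the fully qualified class name of the cause.
    static IOErrorInfo from(Throwable t) {
        return new IOErrorInfo(t.getMessage(), t.getClass().getName());
    }

    public static void main(String[] args) {
        IOException cause = new FileNotFoundException("File does not exist");
        IOErrorInfo info = IOErrorInfo.from(cause);
        System.out.println(info.exceptionClass + ": " + info.message);
    }
}
```

A client that receives the class name can tell, for example, a missing file from a generic I/O failure without parsing the message text.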
[jira] [Commented] (HBASE-6882) Thrift IOError should include exception class
[ https://issues.apache.org/jira/browse/HBASE-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463507#comment-13463507 ]

stack commented on HBASE-6882:

@Liang thrift2 tries to make the thrift apis more aligned w/ current trunk. thrift1 has the most usage and hence more trust. What is lacking is an owner for either package. Without this, folks show up and fix their particular issue in whatever package they are using and then move on. It would be grand if someone could drive thrift2 so it had all of thrift1 and was better aligned w/ the native apis.

Thrift IOError should include exception class

Key: HBASE-6882
URL: https://issues.apache.org/jira/browse/HBASE-6882
Project: HBase
Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Attachments: D5679.1.patch

Return exception class as part of IOError thrown from the Thrift proxy or the embedded Thrift server in the regionserver.
[jira] [Commented] (HBASE-6882) Thrift IOError should include exception class
[ https://issues.apache.org/jira/browse/HBASE-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463511#comment-13463511 ]

liang xie commented on HBASE-6882:

Got it, [~saint@gmail.com]. I'd like to have a try :)

Thrift IOError should include exception class

Key: HBASE-6882
URL: https://issues.apache.org/jira/browse/HBASE-6882
Project: HBase
Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Attachments: D5679.1.patch

Return exception class as part of IOError thrown from the Thrift proxy or the embedded Thrift server in the regionserver.
[jira] [Commented] (HBASE-6882) Thrift IOError should include exception class
[ https://issues.apache.org/jira/browse/HBASE-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463514#comment-13463514 ]

stack commented on HBASE-6882:

@liang That'd be great. I would suggest first a survey of thrift1 and thrift2. Figure out what the difference is. Do you want to have the two packages achieve parity? Or do you want to add what is in thrift2 to thrift1 and keep up thrift1? The examples package has stuff to exercise the thrift stuff. A few more unit tests would probably not go amiss. Good on you.

Thrift IOError should include exception class

Key: HBASE-6882
URL: https://issues.apache.org/jira/browse/HBASE-6882
Project: HBase
Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Attachments: D5679.1.patch

Return exception class as part of IOError thrown from the Thrift proxy or the embedded Thrift server in the regionserver.
[jira] [Commented] (HBASE-4565) Maven HBase build broken on cygwin with copynativelib.sh call.
[ https://issues.apache.org/jira/browse/HBASE-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463518#comment-13463518 ]

Suraj Varma commented on HBASE-4565:

This is no longer an issue on trunk, it appears. The build script modularization changes have completely done away with the copynativelibs.sh which caused the original issue.

I am able to build from trunk successfully via cygwin now.

Maven HBase build broken on cygwin with copynativelib.sh call.

Key: HBASE-4565
URL: https://issues.apache.org/jira/browse/HBASE-4565
Project: HBase
Issue Type: Bug
Components: build
Affects Versions: 0.92.0
Environment: cygwin (on xp and win7)
Reporter: Suraj Varma
Assignee: Suraj Varma
Labels: build, maven
Fix For: 0.96.0
Attachments: HBASE-4565-0.92.patch, HBASE-4565.patch, HBASE-4565-v2.patch, HBASE-4565-v3-0.92.patch, HBASE-4565-v3.patch

This is broken in both 0.92 as well as trunk pom.xml.

Here's a sample maven log snippet from trunk (from Mayuresh on user mailing list):

[INFO] [antrun:run {execution: package}]
[INFO] Executing tasks
main:
    [mkdir] Created dir: D:\workspace\mkshirsa\hbase-trunk\target\hbase-0.93-SNAPSHOT\hbase-0.93-SNAPSHOT\lib\native\${build.platform}
     [exec] ls: cannot access D:workspacemkshirsahbase-trunktarget/nativelib: No such file or directory
     [exec] tar (child): Cannot connect to D: resolve failed
[INFO]
[ERROR] BUILD ERROR
[INFO]
[INFO] An Ant BuildException has occured: exec returned: 3328

There are two issues:

1) The ant run task below doesn't resolve the windows file separator returned by the project.build.directory - this causes the above resolve failed.

    <!-- Using Unix cp to preserve symlinks, using script to handle wildcards -->
    <echo file="${project.build.directory}/copynativelibs.sh">
        if [ `ls ${project.build.directory}/nativelib | wc -l` -ne 0]; then

2) The tar argument value below also has a similar issue in that the path arg doesn't resolve right.

    <!-- Using Unix tar to preserve symlinks -->
    <exec executable="tar" failonerror="yes" dir="${project.build.directory}/${project.artifactId}-${project.version}">
        <arg value="czf"/>
        <arg value="/cygdrive/c/workspaces/hbase-0.92-svn/target/${project.artifactId}-${project.version}.tar.gz"/>
        <arg value="."/>
    </exec>

In both cases, the fix would probably be to use a cross-platform way to handle the directory locations.
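
The failure in the log above comes from a Windows path such as `D:\workspace\...` reaching a Unix shell, where the backslashes are consumed and `D:` is taken for a remote host by tar. One way to picture "a cross-platform way to handle the directory locations" is to normalise the path before handing it to the shell. The helper below is a sketch only: `CygwinPaths.toCygwin` is a hypothetical name, it is not what the attached patches do, and it assumes the default `/cygdrive` mount prefix that appears in the description.

```java
// Hypothetical helper; not part of the HBase build or of the attached patches.
class CygwinPaths {
    // Convert a Windows path like D:\a\b into the cygwin form /cygdrive/d/a/b.
    // Paths without a drive letter only have their separators normalised.
    static String toCygwin(String windowsPath) {
        String p = windowsPath.replace('\\', '/');
        if (p.length() >= 2 && Character.isLetter(p.charAt(0)) && p.charAt(1) == ':') {
            p = "/cygdrive/" + Character.toLowerCase(p.charAt(0)) + p.substring(2);
        }
        return p;
    }

    public static void main(String[] args) {
        System.out.println(toCygwin("D:\\workspace\\mkshirsa\\hbase-trunk\\target"));
    }
}
```

On a real cygwin installation the `cygpath` utility performs this conversion and also honours non-default mount points, so it would be the safer choice inside a build script.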
[jira] [Commented] (HBASE-6679) RegionServer aborts due to race between compaction and split
[ https://issues.apache.org/jira/browse/HBASE-6679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463523#comment-13463523 ]

stack commented on HBASE-6679:

For sure the region was not doubly-assigned? Split happened of the region on one server but on another server, the same region was being compacted? You'd need the master logs to figure out whether it was a dbl-assign.

Storefiles are an ImmutableList.

Can you figure a place where we'd be running compactions on a region concurrent w/ our splitting it? Compacting, we take out a write lock. Doesn't look like any locks are taken while SplitTransaction is running (closing the parent, it'll need the write lock... that's after daughters open though).

RegionServer aborts due to race between compaction and split

Key: HBASE-6679
URL: https://issues.apache.org/jira/browse/HBASE-6679
Project: HBase
Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
Fix For: 0.92.3
Attachments: rs-crash-parallel-compact-split.log

In our nightlies, we have seen RS aborts due to compaction and split racing. Original parent file gets deleted after the compaction, and hence, the daughters don't find the parent data file. The RS kills itself when this happens. Will attach a snippet of the relevant RS logs.
[jira] [Commented] (HBASE-6882) Thrift IOError should include exception class
[ https://issues.apache.org/jira/browse/HBASE-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463525#comment-13463525 ]

stack commented on HBASE-6882:

@Liang ... or just pick up any outstanding thrift issues and take a look at resolving them?

Thrift IOError should include exception class

Key: HBASE-6882
URL: https://issues.apache.org/jira/browse/HBASE-6882
Project: HBase
Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Attachments: D5679.1.patch

Return exception class as part of IOError thrown from the Thrift proxy or the embedded Thrift server in the regionserver.
[jira] [Commented] (HBASE-6702) ResourceChecker refinement
[ https://issues.apache.org/jira/browse/HBASE-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463528#comment-13463528 ]

stack commented on HBASE-6702:

+1 on commit after addressing Jesse's comments. The rest of the conversion work would be done in another issue? Good stuff N.

ResourceChecker refinement

Key: HBASE-6702
URL: https://issues.apache.org/jira/browse/HBASE-6702
Project: HBase
Issue Type: Improvement
Components: test
Affects Versions: 0.96.0
Reporter: Jesse Yates
Assignee: nkeywal
Priority: Critical
Fix For: 0.96.0
Attachments: 6702.v1.patch, 6702.v4.patch

This was based on some discussion from HBASE-6234. The ResourceChecker was added by N. Keywal to help resolve some hadoop qa issues, but has since not been widely utilized. Further, with modularization we have had to drop the ResourceChecker from the tests that are moved into the hbase-common module, because bringing the ResourceChecker up to hbase-common would involve bringing all its dependencies (which are quite far reaching). The question then is, what should we do with it? Get rid of it? Refactor and reuse?
[jira] [Commented] (HBASE-6882) Thrift IOError should include exception class
[ https://issues.apache.org/jira/browse/HBASE-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463532#comment-13463532 ]

liang xie commented on HBASE-6882:

Thanks [~saint@gmail.com] for the nice guidance! My plan is to resolve some outstanding thrift-related issues first; afterwards I will know more of the details, and then maybe I'll have a good feeling for how to fuse thrift and thrift2. Don't worry, I'll send a design note before making any big change :)

Thrift IOError should include exception class

Key: HBASE-6882
URL: https://issues.apache.org/jira/browse/HBASE-6882
Project: HBase
Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Attachments: D5679.1.patch

Return exception class as part of IOError thrown from the Thrift proxy or the embedded Thrift server in the regionserver.
[jira] [Commented] (HBASE-6679) RegionServer aborts due to race between compaction and split
[ https://issues.apache.org/jira/browse/HBASE-6679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463535#comment-13463535 ]

ramkrishna.s.vasudevan commented on HBASE-6679:

@Deva

I am not able to tell clearly what the problem is. I too went through those logs and found that the region 5689a8785bbc9a8aa8e526cd7ef1542a has completed the compaction.

{code}
2012-08-28 06:15:34,107 INFO org.apache.hadoop.hbase.regionserver.compactions.CompactionRequest: completed compaction: regionName=TestLoadAndVerify_1346120615716,\xD8\x0D\x03\x00\x00\x00\x00\x00/07_0,1346125261573.5689a8785bbc9a8aa8e526cd7ef1542a., storeName=f1, fileCount=3, fileSize=27.3m, priority=3, time=14360293782301; duration=4sec
{code}

and later the split has started for the region (after 2 ms)

{code}
2012-08-28 06:15:34,109 INFO org.apache.hadoop.hbase.regionserver.SplitTransaction: Starting split of region TestLoadAndVerify_1346120615716,\xD8\x0D\x03\x00\x00\x00\x00\x00/07_0,1346125261573.5689a8785bbc9a8aa8e526cd7ef1542a.
{code}

The offlining of the region is done here

{code}
2012-08-28 06:15:34,788 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Offlined parent region TestLoadAndVerify_1346120615716,\xD8\x0D\x03\x00\x00\x00\x00\x00/07_0,1346125261573.5689a8785bbc9a8aa8e526cd7ef1542a. in META
{code}

So the region got closed before this. I feel the store file list should have been updated by that time. No?

RegionServer aborts due to race between compaction and split

Key: HBASE-6679
URL: https://issues.apache.org/jira/browse/HBASE-6679
Project: HBase
Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
Fix For: 0.92.3
Attachments: rs-crash-parallel-compact-split.log

In our nightlies, we have seen RS aborts due to compaction and split racing. Original parent file gets deleted after the compaction, and hence, the daughters don't find the parent data file. The RS kills itself when this happens. Will attach a snippet of the relevant RS logs.
[jira] [Commented] (HBASE-4565) Maven HBase build broken on cygwin with copynativelib.sh call.
[ https://issues.apache.org/jira/browse/HBASE-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463543#comment-13463543 ]

stack commented on HBASE-4565:

[~svarma] So we should apply the patch to 0.92 and 0.94? The v3 patch still works on windows? Thanks for checking trunk.

Maven HBase build broken on cygwin with copynativelib.sh call.

Key: HBASE-4565
URL: https://issues.apache.org/jira/browse/HBASE-4565
Project: HBase
Issue Type: Bug
Components: build
Affects Versions: 0.92.0
Environment: cygwin (on xp and win7)
Reporter: Suraj Varma
Assignee: Suraj Varma
Labels: build, maven
Fix For: 0.96.0
Attachments: HBASE-4565-0.92.patch, HBASE-4565.patch, HBASE-4565-v2.patch, HBASE-4565-v3-0.92.patch, HBASE-4565-v3.patch

This is broken in both 0.92 as well as trunk pom.xml.

Here's a sample maven log snippet from trunk (from Mayuresh on user mailing list):

[INFO] [antrun:run {execution: package}]
[INFO] Executing tasks
main:
    [mkdir] Created dir: D:\workspace\mkshirsa\hbase-trunk\target\hbase-0.93-SNAPSHOT\hbase-0.93-SNAPSHOT\lib\native\${build.platform}
     [exec] ls: cannot access D:workspacemkshirsahbase-trunktarget/nativelib: No such file or directory
     [exec] tar (child): Cannot connect to D: resolve failed
[INFO]
[ERROR] BUILD ERROR
[INFO]
[INFO] An Ant BuildException has occured: exec returned: 3328

There are two issues:

1) The ant run task below doesn't resolve the windows file separator returned by the project.build.directory - this causes the above resolve failed.

    <!-- Using Unix cp to preserve symlinks, using script to handle wildcards -->
    <echo file="${project.build.directory}/copynativelibs.sh">
        if [ `ls ${project.build.directory}/nativelib | wc -l` -ne 0]; then

2) The tar argument value below also has a similar issue in that the path arg doesn't resolve right.

    <!-- Using Unix tar to preserve symlinks -->
    <exec executable="tar" failonerror="yes" dir="${project.build.directory}/${project.artifactId}-${project.version}">
        <arg value="czf"/>
        <arg value="/cygdrive/c/workspaces/hbase-0.92-svn/target/${project.artifactId}-${project.version}.tar.gz"/>
        <arg value="."/>
    </exec>

In both cases, the fix would probably be to use a cross-platform way to handle the directory locations.
[jira] [Commented] (HBASE-6679) RegionServer aborts due to race between compaction and split
[ https://issues.apache.org/jira/browse/HBASE-6679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463548#comment-13463548 ]

Devaraj Das commented on HBASE-6679:

bq. For sure the regions was not doubly-assigned? Split happened of the region on one server but on another server, the same region was being compacted? You'd need the master logs to figure it a dbl-assign

Unfortunately, I didn't save the master logs when the failure happened.

bq. Can you figure a place where we'd be running compactions on a region concurrent w/ our splitting it? Compacting we take out write lock. Doesnt look like any locks while SplitTransaction is running (closing parent, it'll need write lock... thats after daughters open though).

I can't figure out a place where this could happen in the natural execution of the regionserver.

bq. Storefiles are an ImmutableList.

Yes, but that could still be exposed to the problems of memory inconsistencies when multiple threads are accessing the object in unsynchronized/non-volatile ways, no?

bq. @Deva

After a long time, someone addressed me by that name :-)

bq. So before this itself the region got closed. I feel the store file list should have been updated by the time. No ?

Can't say for sure, Ram. There is no guarantee unless the accesses (read/write) are synchronized or the field is declared volatile.

RegionServer aborts due to race between compaction and split

Key: HBASE-6679
URL: https://issues.apache.org/jira/browse/HBASE-6679
Project: HBase
Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
Fix For: 0.92.3
Attachments: rs-crash-parallel-compact-split.log

In our nightlies, we have seen RS aborts due to compaction and split racing. Original parent file gets deleted after the compaction, and hence, the daughters don't find the parent data file. The RS kills itself when this happens. Will attach a snippet of the relevant RS logs.