[jira] [Commented] (HBASE-11542) Unit Test KeyStoreTestUtil.java compilation failure in IBM JDK
[ https://issues.apache.org/jira/browse/HBASE-11542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078963#comment-14078963 ] Hadoop QA commented on HBASE-11542:
---
{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12658588/hbase11542-0.99-v3.patch
against trunk revision .
ATTACHMENT ID: 12658588

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 21 new or modified tests.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100.
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:
  org.apache.hadoop.hbase.regionserver.TestRegionReplicas
  org.apache.hadoop.hbase.master.TestRestartCluster
  org.apache.hadoop.hbase.client.TestReplicasClient
  org.apache.hadoop.hbase.TestIOFencing
  org.apache.hadoop.hbase.migration.TestNamespaceUpgrade
  org.apache.hadoop.hbase.master.TestMasterOperationsForRegionReplicas
  org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10223//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10223//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10223//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10223//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10223//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10223//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10223//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10223//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10223//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10223//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10223//console

This message is automatically generated.
Unit Test KeyStoreTestUtil.java compilation failure in IBM JDK
Key: HBASE-11542
URL: https://issues.apache.org/jira/browse/HBASE-11542
Project: HBase
Issue Type: Improvement
Components: build, test
Affects Versions: 0.99.0
Environment: RHEL 6.3, IBM JDK 6
Reporter: LinseyPang
Priority: Minor
Fix For: 2.0.0
Attachments: HBASE_11542-1.patch, hbase11542-0.99-v3.patch, hbase11542-0.99-v3.patch, hbase11542-0.99-v3.patch, hbase_11542-v2.patch

In trunk, HBASE-10336 added a test utility, KeyStoreTestUtil.java, which imports the following Sun classes:

{code}
import sun.security.x509.AlgorithmId;
import sun.security.x509.CertificateAlgorithmId;
{code}

This causes an HBase compile failure when building with the IBM JDK, which ships similar classes under a different package:

{code}
import com.ibm.security.x509.AlgorithmId;
import com.ibm.security.x509.CertificateAlgorithmId;
{code}

This jira is to add handling of the x509 references.

-- This message was sent by Atlassian JIRA (v6.2#6252)
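One common way to avoid a hard compile-time dependency on vendor-specific classes is to resolve whichever class is present via reflection. This is only an illustrative sketch of that pattern, not the actual HBASE-11542 patch; the class and method names below are invented for the example.

```java
/**
 * Illustrative sketch (not the HBASE-11542 patch): pick whichever vendor's
 * x509 class is loadable at runtime, instead of importing sun.security.x509.*
 * at compile time, which breaks the build on the IBM JDK.
 */
public class VendorClassResolver {
  /** Returns the first loadable class among the candidates, or null if none load. */
  public static Class<?> resolveFirstAvailable(String... candidates) {
    for (String name : candidates) {
      try {
        return Class.forName(name);
      } catch (ClassNotFoundException e) {
        // Not on this JDK; try the next vendor's class name.
      }
    }
    return null;
  }

  public static void main(String[] args) {
    Class<?> algId = resolveFirstAvailable(
        "sun.security.x509.AlgorithmId",       // Oracle/OpenJDK
        "com.ibm.security.x509.AlgorithmId");  // IBM JDK
    System.out.println(algId == null ? "no vendor x509 class found" : algId.getName());
  }
}
```

The trade-off is that reflective access to internal classes is fragile across JDK releases, which is why the later comments on this issue explore BouncyCastle or a pre-generated keystore instead.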
[jira] [Commented] (HBASE-11333) Remove deprecated class MetaMigrationConvertingToPB
[ https://issues.apache.org/jira/browse/HBASE-11333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078970#comment-14078970 ] Virag Kothari commented on HBASE-11333:
---
That would be great.

Remove deprecated class MetaMigrationConvertingToPB
---
Key: HBASE-11333
URL: https://issues.apache.org/jira/browse/HBASE-11333
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.99.0
Reporter: Mikhail Antonov
Assignee: Mikhail Antonov
Priority: Trivial
Fix For: 0.99.0
Attachments: HBASE-11333.patch

MetaMigrationConvertingToPB is marked deprecated and to be deleted in the next major release after 0.96. Is that the time?

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11558) Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+
[ https://issues.apache.org/jira/browse/HBASE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078995#comment-14078995 ] Ishan Chhabra commented on HBASE-11558:
---
[~ndimiduk], can you +1 and commit?

Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+
---
Key: HBASE-11558
URL: https://issues.apache.org/jira/browse/HBASE-11558
Project: HBase
Issue Type: Bug
Components: mapreduce, Scanners
Reporter: Ishan Chhabra
Assignee: Ishan Chhabra
Fix For: 0.99.0, 0.96.3, 0.98.5, 2.0.0
Attachments: HBASE_11558-0.96.patch, HBASE_11558-0.96_v2.patch, HBASE_11558-0.98.patch, HBASE_11558-0.98_v2.patch, HBASE_11558.patch, HBASE_11558_v2.patch, HBASE_11558_v2.patch

In 0.94 and before, if one sets caching on the Scan object in the job by calling scan.setCaching(int) and passes it to TableMapReduceUtil, it is correctly read and used by the mappers during a mapreduce job. This is because Scan.write respects and serializes caching, and TableMapReduceUtil uses that serialization internally to transfer the Scan object to the mappers.

In 0.95+, after the move to protobuf, ProtobufUtil.toScan does not respect caching anymore, as ClientProtos.Scan does not have a caching field. Caching is passed to the server via the ScanRequest object instead, so it is not needed in the Scan object. However, this breaks application code that relies on the earlier behavior, and will lead to a sudden degradation in Scan performance in 0.96+ for users relying on the old behavior.

There are 2 options here:
1. Add caching to the Scan object, adding an extra int to the Scan payload that is really not needed in the general case.
2. Document and preach that TableMapReduceUtil.setScannerCaching must be called by the client.

-- This message was sent by Atlassian JIRA (v6.2#6252)
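The failure mode described above can be sketched with a toy model: a serialize/deserialize round trip through a message type that simply has no field for the value. This is not HBase code; the classes below are stand-ins for Scan and ClientProtos.Scan, with invented names, just to show how the client's setting is silently dropped.

```java
/**
 * Toy model of the HBASE-11558 round trip: the protobuf Scan message has no
 * caching field, so serializing and deserializing the Scan silently drops the
 * value the client set. Illustrative only; not HBase code.
 */
public class ScanCachingDemo {
  static class Scan {
    int caching = -1; // -1 stands for "use the configured default"
  }

  static class ProtoScan {
    // Stand-in for ClientProtos.Scan: start row, stop row, filters...
    // but no caching field.
  }

  static ProtoScan toProto(Scan s) { return new ProtoScan(); }

  static Scan fromProto(ProtoScan p) { return new Scan(); }

  /** Round-trips a Scan the way it is shipped to the mappers. */
  static Scan roundTrip(Scan s) { return fromProto(toProto(s)); }

  public static void main(String[] args) {
    Scan s = new Scan();
    s.caching = 500; // what the job author intended
    Scan onMapper = roundTrip(s);
    // The mapper sees the default again, not 500.
    System.out.println("caching after round trip: " + onMapper.caching);
  }
}
```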
[jira] [Updated] (HBASE-11609) LoadIncrementalHFiles fails if the namespace is specified
[ https://issues.apache.org/jira/browse/HBASE-11609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-11609:
Resolution: Fixed
Status: Resolved (was: Patch Available)

LoadIncrementalHFiles fails if the namespace is specified
---
Key: HBASE-11609
URL: https://issues.apache.org/jira/browse/HBASE-11609
Project: HBase
Issue Type: Bug
Components: Client
Affects Versions: 1.0.0, 0.98.4, 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Fix For: 1.0.0, 0.98.5, 2.0.0
Attachments: HBASE-11609-v0.patch, HBASE-11609-v1.patch

From Jianshi Huang on the user list: trying to bulk load a table in a namespace, like:

$ hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles test/ foo:testtb

we get an exception:

{code}
2014-07-29 19:59:53,373 ERROR [main] mapreduce.LoadIncrementalHFiles: Unexpected execution exception during splitting
java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: foo:testtb,1.bottom
	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
	at java.util.concurrent.FutureTask.get(FutureTask.java:188)
	at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.groupOrSplitPhase(LoadIncrementalHFiles.java:449)
	at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:304)
	at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.run(LoadIncrementalHFiles.java:899)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
{code}

The problem is related to the ':' symbol ending up in the file path. The simple fix is to replace the current LoadIncrementalHFiles.getUniqueName().

-- This message was sent by Atlassian JIRA (v6.2#6252)
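The mechanism behind the error can be shown in a few lines: a file name derived from "namespace:table" contains ':', which URI parsing treats as a scheme separator, and that is what trips the path validation. The sketch below is an assumed illustration of the fix (stripping the ':' from the generated name), with invented method names, not the actual patch.

```java
import java.net.URI;

/**
 * Illustrative sketch of the HBASE-11609 failure mode: a split-file name
 * built from "namespace:table" contains ':', which URI parsing reads as a
 * scheme separator. Keeping ':' out of the name avoids the problem.
 * Method names are hypothetical, not the actual patch.
 */
public class UniqueNameDemo {
  /** Mimics deriving a split-file name from a table name plus a suffix. */
  static String uniqueName(String tableName, String suffix) {
    // The essential fix: replace the namespace separator before building a path.
    return tableName.replace(':', '_') + "," + suffix;
  }

  public static void main(String[] args) {
    // Unsanitized: URI parsing mistakes "foo" for a URI scheme.
    System.out.println(URI.create("foo:testtb,1.bottom").getScheme()); // foo
    // Sanitized: safe to use as a relative file name.
    System.out.println(uniqueName("foo:testtb", "1.bottom")); // foo_testtb,1.bottom
  }
}
```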
[jira] [Commented] (HBASE-11609) LoadIncrementalHFiles fails if the namespace is specified
[ https://issues.apache.org/jira/browse/HBASE-11609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14079143#comment-14079143 ] Hudson commented on HBASE-11609:

SUCCESS: Integrated in HBase-0.98 #423 (See [https://builds.apache.org/job/HBase-0.98/423/])
HBASE-11609 LoadIncrementalHFiles fails if the namespace is specified (matteo.bertozzi: rev e426d43e8c21073204459a3c090601efc233a0c6)
* hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11609) LoadIncrementalHFiles fails if the namespace is specified
[ https://issues.apache.org/jira/browse/HBASE-11609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14079154#comment-14079154 ] Hudson commented on HBASE-11609:

FAILURE: Integrated in HBase-TRUNK #5354 (See [https://builds.apache.org/job/HBase-TRUNK/5354/])
HBASE-11609 LoadIncrementalHFiles fails if the namespace is specified (matteo.bertozzi: rev fa160bd124136069a6c4440b2b3807c7ed608ff7)
* hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HBASE-11613) get_counter shell command is not displaying the result for counter columns.
Y. SREENIVASULU REDDY created HBASE-11613:
---
Summary: get_counter shell command is not displaying the result for counter columns.
Key: HBASE-11613
URL: https://issues.apache.org/jira/browse/HBASE-11613
Project: HBase
Issue Type: Bug
Components: shell
Affects Versions: 0.98.3
Reporter: Y. SREENIVASULU REDDY
Priority: Minor

Perform the following operations in the HBase shell prompt:
1. Create a table with one column family.
2. Insert some amount of data into the table.
3. Then perform an increment operation on any column qualifier, e.g.: incr 't', 'r1', 'f:c1'
4. Then query with get_counter; it reports a "no counter found" message to the user:

{code}
hbase(main):010:0> get_counter 't', 'r1', 'f', 'c1'
No counter found at specified coordinates
{code}

And a wrong message is shown to the user when executing the get_counter query with three arguments:

{code}
hbase(main):009:0> get_counter 't', 'r1', 'f'
ERROR: wrong number of arguments (3 for 4)

Here is some help for this command:
Return a counter cell value at specified table/row/column coordinates. A cell
cell should be managed with atomic increment function oh HBase and the data
should be binary encoded. Example:

  hbase> get_counter 'ns1:t1', 'r1', 'c1'
  hbase> get_counter 't1', 'r1', 'c1'

The same commands also can be run on a table reference. Suppose you had a
reference t to table 't1', the corresponding command would be:

  hbase> t.get_counter 'r1', 'c1'
{code}

Problem: the help examples pass 3 arguments, but the command asks for 4. If run with 3 arguments it throws an error; if run with 4 arguments, the "No counter found at specified coordinates" message is shown even though a counter is specified.

-- This message was sent by Atlassian JIRA (v6.2#6252)
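The help text's note that "the data should be binary encoded" refers to the counter cell format: an increment stores the counter as an 8-byte big-endian long (what HBase's Bytes.toBytes(long) produces), and a cell whose value is not 8 bytes cannot be read back as a counter. A minimal sketch of that encoding, independent of HBase:

```java
import java.nio.ByteBuffer;

/**
 * Minimal sketch of the counter cell encoding: an 8-byte big-endian long.
 * A cell value of any other length is not a valid counter.
 */
public class CounterEncodingDemo {
  static byte[] encode(long v) {
    return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
  }

  static long decode(byte[] b) {
    if (b.length != Long.BYTES) {
      throw new IllegalArgumentException("not a counter cell: " + b.length + " bytes");
    }
    return ByteBuffer.wrap(b).getLong();
  }

  public static void main(String[] args) {
    System.out.println(decode(encode(42L))); // round-trips the counter value
  }
}
```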
[jira] [Commented] (HBASE-11542) Unit Test KeyStoreTestUtil.java compilation failure in IBM JDK
[ https://issues.apache.org/jira/browse/HBASE-11542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14079166#comment-14079166 ] LinseyPang commented on HBASE-11542:
---
Hi stack, thanks a lot.
1. I will find time to work on a Linux environment.
2. For the failed tests (TestRegionReplicas, TestRestartCluster, etc.), I currently don't think the failures are caused by my newly added classes. I will re-test.
3. As for "Instead of programmatically creating a keystore containing a self-signed certificate and keypair, perhaps we can generate one once by hand, stringify it, and just use that": I will think about it. Thanks.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11613) get_counter shell command is not displaying the result for counter columns.
[ https://issues.apache.org/jira/browse/HBASE-11613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14079181#comment-14079181 ] bharath v commented on HBASE-11613:
---
Duplicate of HBASE-10728.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-10728) get_counter value is never used.
[ https://issues.apache.org/jira/browse/HBASE-10728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14079274#comment-14079274 ] Jean-Marc Spaggiari commented on HBASE-10728:
---
Seems that someone else also needs this fix, in HBASE-11613. Let me rebase it...

get_counter value is never used.
---
Key: HBASE-10728
URL: https://issues.apache.org/jira/browse/HBASE-10728
Project: HBase
Issue Type: Bug
Affects Versions: 0.96.2, 0.98.1, 0.99.0
Reporter: Jean-Marc Spaggiari
Assignee: Jean-Marc Spaggiari
Attachments: HBASE-10728-v0-0.96.patch, HBASE-10728-v0-0.98.patch, HBASE-10728-v0-trunk.patch, HBASE-10728-v1-0.96.patch, HBASE-10728-v1-0.98.patch, HBASE-10728-v1-trunk.patch

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11613) get_counter shell command is not displaying the result for counter columns.
[ https://issues.apache.org/jira/browse/HBASE-11613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14079273#comment-14079273 ] Jean-Marc Spaggiari commented on HBASE-11613:
---
Hey, I think it is. Can you try this:

get_counter 't', 'r1', 'f:c1', 'dummy'

If it works for you, I will close this JIRA and rebase the other patch to get it committed...

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11542) Unit Test KeyStoreTestUtil.java compilation failure in IBM JDK
[ https://issues.apache.org/jira/browse/HBASE-11542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14079276#comment-14079276 ] pascal oliva commented on HBASE-11542:
---
I am working on the KeyStoreTestUtil.java file from the hadoop-common component, about the break with IBM Java. I changed the generateCertificate function in KeyStoreTestUtil to use the BouncyCastle library dependency:

{code}
<dependency>
  <groupId>org.bouncycastle</groupId>
  <artifactId>bcprov-jdk16</artifactId>
  <version>1.46</version>
</dependency>
{code}

Attached is the patch I used in hadoop-common: sslkeystore.patch. I tested that patch successfully with OpenJDK 1.7 and IBM Java J9 VM (build 2.7, JRE 1.7.0) in a hadoop environment.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-10728) get_counter value is never used.
[ https://issues.apache.org/jira/browse/HBASE-10728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Marc Spaggiari updated HBASE-10728:
Status: Open (was: Patch Available)

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11542) Unit Test KeyStoreTestUtil.java compilation failure in IBM JDK
[ https://issues.apache.org/jira/browse/HBASE-11542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] pascal oliva updated HBASE-11542:
Attachment: sslkeystore.patch

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-10728) get_counter value is never used.
[ https://issues.apache.org/jira/browse/HBASE-10728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Marc Spaggiari updated HBASE-10728:
Status: Patch Available (was: Open)

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-10728) get_counter value is never used.
[ https://issues.apache.org/jira/browse/HBASE-10728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Marc Spaggiari updated HBASE-10728:
Attachment: HBASE-10728-v2-trunk.patch

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HBASE-11614) Table create should failed when user provised MAX_FILESIZE = 0.
Jean-Marc Spaggiari created HBASE-11614:
---
Summary: Table create should failed when user provised MAX_FILESIZE = 0.
Key: HBASE-11614
URL: https://issues.apache.org/jira/browse/HBASE-11614
Project: HBase
Issue Type: Bug
Reporter: Jean-Marc Spaggiari
Assignee: Jean-Marc Spaggiari

MAX_FILESIZE = 0 doesn't make sense. We should avoid creating such tables.

-- This message was sent by Atlassian JIRA (v6.2#6252)
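The check the issue asks for amounts to rejecting a table descriptor whose MAX_FILESIZE is below some sane floor. The sketch below is an assumed illustration only: the constant, its 2 MB value, and the method name are all hypothetical, since the JIRA itself only states that 0 makes no sense.

```java
/**
 * Illustrative sketch of the validation HBASE-11614 asks for: reject a
 * MAX_FILESIZE below a minimum at table-create time. The constant and its
 * value are hypothetical, not from the actual patch.
 */
public class MaxFileSizeCheck {
  // Hypothetical floor; the JIRA only says that 0 doesn't make sense.
  static final long MIN_MAX_FILESIZE = 2L * 1024 * 1024; // 2 MB, illustrative

  static void validateMaxFileSize(long maxFileSize) {
    if (maxFileSize < MIN_MAX_FILESIZE) {
      throw new IllegalArgumentException(
          "MAX_FILESIZE " + maxFileSize + " is below the minimum " + MIN_MAX_FILESIZE);
    }
  }

  public static void main(String[] args) {
    validateMaxFileSize(10L * 1024 * 1024 * 1024); // 10 GB: accepted
    try {
      validateMaxFileSize(0); // the case the JIRA wants rejected
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```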
[jira] [Updated] (HBASE-11614) Table create should failed when user provides MAX_FILESIZE = 0.
[ https://issues.apache.org/jira/browse/HBASE-11614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Marc Spaggiari updated HBASE-11614:
Summary: Table create should failed when user provides MAX_FILESIZE = 0. (was: Table create should failed when user provised MAX_FILESIZE = 0.)

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11614) Table create should failed when user provides MAX_FILESIZE lower to minimum
[ https://issues.apache.org/jira/browse/HBASE-11614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Marc Spaggiari updated HBASE-11614:
Summary: Table create should failed when user provides MAX_FILESIZE lower to minimum (was: Table create should failed when user provides MAX_FILESIZE = 0.)

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11614) Table create should failed when user provides MAX_FILESIZE lower to minimum
[ https://issues.apache.org/jira/browse/HBASE-11614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Marc Spaggiari updated HBASE-11614:
Attachment: HBASE-11614-v0-trunk.patch

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11614) Table create should failed when user provides MAX_FILESIZE lower to minimum
[ https://issues.apache.org/jira/browse/HBASE-11614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Marc Spaggiari updated HBASE-11614:
Status: Patch Available (was: Open)

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11542) Unit Test KeyStoreTestUtil.java compilation failure in IBM JDK
[ https://issues.apache.org/jira/browse/HBASE-11542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14079342#comment-14079342 ] Hadoop QA commented on HBASE-11542:
---
{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12658645/sslkeystore.patch
against trunk revision .
ATTACHMENT ID: 12658645

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests.
{color:red}-1 patch{color}. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10225//console

This message is automatically generated.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11609) LoadIncrementalHFiles fails if the namespace is specified
[ https://issues.apache.org/jira/browse/HBASE-11609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14079344#comment-14079344 ] Hudson commented on HBASE-11609: SUCCESS: Integrated in HBase-0.98-on-Hadoop-1.1 #402 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/402/]) HBASE-11609 LoadIncrementalHFiles fails if the namespace is specified (matteo.bertozzi: rev e426d43e8c21073204459a3c090601efc233a0c6) * hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java LoadIncrementalHFiles fails if the namespace is specified - Key: HBASE-11609 URL: https://issues.apache.org/jira/browse/HBASE-11609 Project: HBase Issue Type: Bug Components: Client Affects Versions: 1.0.0, 0.98.4, 2.0.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Fix For: 1.0.0, 0.98.5, 2.0.0 Attachments: HBASE-11609-v0.patch, HBASE-11609-v1.patch from Jianshi Huang on the user list trying to bulk load a table in a namespace, like: $ hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles test/ foo:testtb we get an exception {code} 2014-07-29 19:59:53,373 ERROR [main] mapreduce.LoadIncrementalHFiles: Unexpected execution exception during splitting java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: foo:testtb,1.bottom at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:188) at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.groupOrSplitPhase(LoadIncrementalHFiles.java:449) at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:304) at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.run(LoadIncrementalHFiles.java:899) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) {code} The problem is related to the ':' symbol going to the file path. 
the simple fix is to replace the current LoadIncrementalHFiles.getUniqueName() -- This message was sent by Atlassian JIRA (v6.2#6252)
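For context, the failure occurs because the namespaced table name ("foo:testtb") ends up embedded in the temporary split-file path, and ':' is not a legal character there. A minimal sketch of the kind of unique-name generator that avoids the problem (the method shape here is an assumption, not the committed patch) derives the name from a random token so the table name never reaches the path:

```java
import java.util.UUID;

// Sketch: build the temporary split-file name from a random token only, so
// path-unsafe characters from a namespaced table name (e.g. "foo:testtb")
// can never leak into the file path. Method shape is hypothetical.
public class UniqueName {
    static String getUniqueName() {
        // 32 hex chars, no separators, no characters illegal in a path
        return UUID.randomUUID().toString().replaceAll("-", "");
    }

    public static void main(String[] args) {
        String name = getUniqueName();
        System.out.println(name + " contains ':' = " + name.contains(":"));
    }
}
```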
[jira] [Commented] (HBASE-11609) LoadIncrementalHFiles fails if the namespace is specified
[ https://issues.apache.org/jira/browse/HBASE-11609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14079367#comment-14079367 ] Hudson commented on HBASE-11609: FAILURE: Integrated in HBase-1.0 #75 (See [https://builds.apache.org/job/HBase-1.0/75/]) HBASE-11609 LoadIncrementalHFiles fails if the namespace is specified (matteo.bertozzi: rev a149219707c2810c4887568313db10cdad4d9391) * hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java LoadIncrementalHFiles fails if the namespace is specified - Key: HBASE-11609 URL: https://issues.apache.org/jira/browse/HBASE-11609 Project: HBase Issue Type: Bug Components: Client Affects Versions: 1.0.0, 0.98.4, 2.0.0 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Fix For: 1.0.0, 0.98.5, 2.0.0 Attachments: HBASE-11609-v0.patch, HBASE-11609-v1.patch from Jianshi Huang on the user list trying to bulk load a table in a namespace, like: $ hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles test/ foo:testtb we get an exception {code} 2014-07-29 19:59:53,373 ERROR [main] mapreduce.LoadIncrementalHFiles: Unexpected execution exception during splitting java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: foo:testtb,1.bottom at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:188) at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.groupOrSplitPhase(LoadIncrementalHFiles.java:449) at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:304) at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.run(LoadIncrementalHFiles.java:899) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) {code} The problem is related to the ':' symbol going to the file path. 
the simple fix is to replace the current LoadIncrementalHFiles.getUniqueName() -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-10728) get_counter value is never used.
[ https://issues.apache.org/jira/browse/HBASE-10728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14079379#comment-14079379 ] Hadoop QA commented on HBASE-10728: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12658646/HBASE-10728-v2-trunk.patch against trunk revision . ATTACHMENT ID: 12658646 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.TestIOFencing org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction org.apache.hadoop.hbase.regionserver.TestRegionReplicas org.apache.hadoop.hbase.master.TestMasterOperationsForRegionReplicas org.apache.hadoop.hbase.master.TestAssignmentManager org.apache.hadoop.hbase.migration.TestNamespaceUpgrade org.apache.hadoop.hbase.client.TestReplicasClient org.apache.hadoop.hbase.master.TestRestartCluster org.apache.hadoop.hbase.master.TestRollingRestart org.apache.hadoop.hbase.TestRegionRebalancing {color:red}-1 core zombie tests{color}. There are 1 zombie test(s): at org.apache.hadoop.hbase.master.TestRollingRestart.testBasicRollingRestart(TestRollingRestart.java:140) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10224//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10224//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10224//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10224//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10224//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10224//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10224//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10224//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10224//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs 
warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10224//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10224//console This message is automatically generated. get_counter value is never used. Key: HBASE-10728 URL: https://issues.apache.org/jira/browse/HBASE-10728 Project: HBase Issue Type: Bug Affects Versions: 0.96.2, 0.98.1, 0.99.0 Reporter: Jean-Marc Spaggiari Assignee: Jean-Marc Spaggiari Attachments: HBASE-10728-v0-0.96.patch, HBASE-10728-v0-0.98.patch, HBASE-10728-v0-trunk.patch, HBASE-10728-v1-0.96.patch, HBASE-10728-v1-0.98.patch, HBASE-10728-v1-trunk.patch, HBASE-10728-v2-trunk.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HBASE-11615) TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins
Mike Drob created HBASE-11615: - Summary: TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins Key: HBASE-11615 URL: https://issues.apache.org/jira/browse/HBASE-11615 Project: HBase Issue Type: Bug Components: master Reporter: Mike Drob Failed on branch-1. Example Failure: https://builds.apache.org/job/HBase-1.0/75/testReport/org.apache.hadoop.hbase.master/TestZKLessAMOnCluster/testForceAssignWhileClosing/ -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11615) TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins
[ https://issues.apache.org/jira/browse/HBASE-11615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14079384#comment-14079384 ] Mike Drob commented on HBASE-11615: --- Potentially related to HBASE-8899? It's the same test, but that was many moons ago. TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins --- Key: HBASE-11615 URL: https://issues.apache.org/jira/browse/HBASE-11615 Project: HBase Issue Type: Bug Components: master Reporter: Mike Drob Failed on branch-1. Example Failure: https://builds.apache.org/job/HBase-1.0/75/testReport/org.apache.hadoop.hbase.master/TestZKLessAMOnCluster/testForceAssignWhileClosing/ -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11615) TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins
[ https://issues.apache.org/jira/browse/HBASE-11615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14079389#comment-14079389 ] Jimmy Xiang commented on HBASE-11615: - I can take a look when I get a chance. TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins --- Key: HBASE-11615 URL: https://issues.apache.org/jira/browse/HBASE-11615 Project: HBase Issue Type: Bug Components: master Reporter: Mike Drob Failed on branch-1. Example Failure: https://builds.apache.org/job/HBase-1.0/75/testReport/org.apache.hadoop.hbase.master/TestZKLessAMOnCluster/testForceAssignWhileClosing/ -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11527) Cluster free memory limit check should consider L2 block cache size also when L2 cache is onheap.
[ https://issues.apache.org/jira/browse/HBASE-11527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-11527: --- Attachment: HBASE-11527.patch Cluster free memory limit check should consider L2 block cache size also when L2 cache is onheap. - Key: HBASE-11527 URL: https://issues.apache.org/jira/browse/HBASE-11527 Project: HBase Issue Type: Bug Reporter: Anoop Sam John Assignee: Anoop Sam John Fix For: 2.0.0 Attachments: HBASE-11527.patch, HBASE-11527.patch, HBASE-11527.patch, HBASE-11527.patch, HBASE-11527.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11527) Cluster free memory limit check should consider L2 block cache size also when L2 cache is onheap.
[ https://issues.apache.org/jira/browse/HBASE-11527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14079464#comment-14079464 ] Anoop Sam John commented on HBASE-11527: bq. Only nit is your moving of CacheConfig constants up to HConstants. Do you have to? Yes, because those constants are referred to in HeapMemorySizeUtil as well, and that class needs to be in hbase-common, so we cannot refer to CacheConfig, which is in hbase-server. Cluster free memory limit check should consider L2 block cache size also when L2 cache is onheap. - Key: HBASE-11527 URL: https://issues.apache.org/jira/browse/HBASE-11527 Project: HBase Issue Type: Bug Reporter: Anoop Sam John Assignee: Anoop Sam John Fix For: 2.0.0 Attachments: HBASE-11527.patch, HBASE-11527.patch, HBASE-11527.patch, HBASE-11527.patch, HBASE-11527.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
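To illustrate the check under discussion: when the L2 cache is on-heap, its share of the heap has to be counted alongside the memstore and L1 block cache shares before deciding whether enough free heap remains. A rough sketch follows (the 0.2 minimum-free fraction and all names are assumptions for illustration, not the actual HeapMemorySizeUtil code):

```java
// Sketch of a cluster free-heap limit check that includes an on-heap L2 cache.
// Threshold and names are assumptions for illustration.
public class HeapCheck {
    static final float MIN_FREE_HEAP_FRACTION = 0.2f;

    static void checkFreeHeap(float memstoreFraction, float l1CacheFraction,
                              float onHeapL2Fraction) {
        float used = memstoreFraction + l1CacheFraction + onHeapL2Fraction;
        if (used > 1.0f - MIN_FREE_HEAP_FRACTION) {
            throw new RuntimeException("Combined memstore and block cache (L1 + on-heap L2) "
                + "fraction " + used + " leaves less than " + MIN_FREE_HEAP_FRACTION
                + " of the heap free");
        }
    }

    public static void main(String[] args) {
        checkFreeHeap(0.4f, 0.3f, 0.0f); // passes: 0.3 of the heap stays free
        System.out.println("ok");
    }
}
```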
[jira] [Commented] (HBASE-11614) Table create should failed when user provides MAX_FILESIZE lower to minimum
[ https://issues.apache.org/jira/browse/HBASE-11614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14079466#comment-14079466 ] Hadoop QA commented on HBASE-11614: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12658652/HBASE-11614-v0-trunk.patch against trunk revision . ATTACHMENT ID: 12658652 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestRegionReplicas org.apache.hadoop.hbase.master.TestRestartCluster org.apache.hadoop.hbase.client.TestFromClientSide org.apache.hadoop.hbase.client.TestReplicasClient org.apache.hadoop.hbase.TestIOFencing org.apache.hadoop.hbase.client.TestFromClientSideWithCoprocessor org.apache.hadoop.hbase.migration.TestNamespaceUpgrade org.apache.hadoop.hbase.master.TestMasterOperationsForRegionReplicas org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10226//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10226//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10226//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10226//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10226//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10226//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10226//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10226//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10226//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10226//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10226//console This message is automatically generated. Table create should failed when user provides MAX_FILESIZE lower to minimum --- Key: HBASE-11614 URL: https://issues.apache.org/jira/browse/HBASE-11614 Project: HBase Issue Type: Bug Reporter: Jean-Marc Spaggiari Assignee: Jean-Marc Spaggiari Attachments: HBASE-11614-v0-trunk.patch MAX_FILESIZE = 0 doesn't make sense. We should avoid creating such tables. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11614) Table create should failed when user provides MAX_FILESIZE lower to minimum
[ https://issues.apache.org/jira/browse/HBASE-11614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Marc Spaggiari updated HBASE-11614: Status: Open (was: Patch Available) Need to update related tests. Table create should failed when user provides MAX_FILESIZE lower to minimum --- Key: HBASE-11614 URL: https://issues.apache.org/jira/browse/HBASE-11614 Project: HBase Issue Type: Bug Reporter: Jean-Marc Spaggiari Assignee: Jean-Marc Spaggiari Attachments: HBASE-11614-v0-trunk.patch MAX_FILESIZE = 0 doesn't make sense. We should avoid creating such tables. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11614) Table create should failed when user provides MAX_FILESIZE lower to minimum
[ https://issues.apache.org/jira/browse/HBASE-11614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14079560#comment-14079560 ] Jean-Marc Spaggiari commented on HBASE-11614: - This is already covered by HMaster.sanityCheckTableDescriptor. However, I have been able to create a table with MAX_FILESIZE = 0, which should not have been possible. I will continue to investigate. Table create should failed when user provides MAX_FILESIZE lower to minimum --- Key: HBASE-11614 URL: https://issues.apache.org/jira/browse/HBASE-11614 Project: HBase Issue Type: Bug Reporter: Jean-Marc Spaggiari Assignee: Jean-Marc Spaggiari Attachments: HBASE-11614-v0-trunk.patch MAX_FILESIZE = 0 doesn't make sense. We should avoid creating such tables. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HBASE-11614) Table create should failed when user provides MAX_FILESIZE lower to minimum
[ https://issues.apache.org/jira/browse/HBASE-11614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Marc Spaggiari resolved HBASE-11614. - Resolution: Not a Problem Ok, this is already fixed in trunk. HMaster.sanityCheckTableDescriptor is not there in 0.98; that's why I was able to create a table with 0 as the MAX_FILESIZE. We can re-open a JIRA if we want this to be backported. Table create should failed when user provides MAX_FILESIZE lower to minimum --- Key: HBASE-11614 URL: https://issues.apache.org/jira/browse/HBASE-11614 Project: HBase Issue Type: Bug Reporter: Jean-Marc Spaggiari Assignee: Jean-Marc Spaggiari Attachments: HBASE-11614-v0-trunk.patch MAX_FILESIZE = 0 doesn't make sense. We should avoid creating such tables. -- This message was sent by Atlassian JIRA (v6.2#6252)
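For reference, the trunk-side guard that makes this a non-problem works roughly like the sketch below (the 2 MB floor and the method name are assumptions for illustration, not the actual HMaster.sanityCheckTableDescriptor code):

```java
// Sketch: reject table descriptors whose MAX_FILESIZE is set below a sane
// minimum (tiny values, including 0, would cause pathological region splitting).
// The 2 MB floor is an assumed illustration value.
public class TableSanity {
    static final long MIN_MAX_FILESIZE = 2L * 1024 * 1024;

    static void sanityCheckMaxFileSize(long maxFileSize) {
        // a negative value means "unset, use the cluster default", so only
        // explicit small values are rejected
        if (maxFileSize >= 0 && maxFileSize < MIN_MAX_FILESIZE) {
            throw new IllegalArgumentException("MAX_FILESIZE " + maxFileSize
                + " is below the allowed minimum " + MIN_MAX_FILESIZE);
        }
    }

    public static void main(String[] args) {
        sanityCheckMaxFileSize(10L * 1024 * 1024 * 1024); // 10 GB: fine
        System.out.println("ok");
    }
}
```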
[jira] [Created] (HBASE-11616) TestNamespaceUpgrade fails in trunk
Ted Yu created HBASE-11616: -- Summary: TestNamespaceUpgrade fails in trunk Key: HBASE-11616 URL: https://issues.apache.org/jira/browse/HBASE-11616 Project: HBase Issue Type: Bug Reporter: Ted Yu I see the following in test output:
{code}
error message="Can't get the location" type="org.apache.hadoop.hbase.client.RetriesExhaustedException"
org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location
  at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:287)
  at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:132)
  at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:1)
  at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:179)
  at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:287)
  at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:267)
  at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:139)
  at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:134)
  at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:814)
  at org.apache.hadoop.hbase.migration.TestNamespaceUpgrade.setUpBeforeClass(TestNamespaceUpgrade.java:147)
  ...
Caused by: org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in hbase:meta for region hbase:acl,,1376029204842.06dfcfc239196403c5f1135b91dedc64. containing row
  at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1233)
  at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1099)
  at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:279)
  ...
  ... 31 more
{code}
The cause of the above error is that the _acl_ table contained in the image (w.r.t. the hbase:meta table) doesn't have a server address. [~jxiang]: What do you think would be the proper fix? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11527) Cluster free memory limit check should consider L2 block cache size also when L2 cache is onheap.
[ https://issues.apache.org/jira/browse/HBASE-11527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14079637#comment-14079637 ] Hadoop QA commented on HBASE-11527: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12658668/HBASE-11527.patch against trunk revision . ATTACHMENT ID: 12658668 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 6 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.TestIOFencing org.apache.hadoop.hbase.master.TestRestartCluster org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction org.apache.hadoop.hbase.regionserver.TestRegionReplicas org.apache.hadoop.hbase.client.TestReplicasClient org.apache.hadoop.hbase.migration.TestNamespaceUpgrade org.apache.hadoop.hbase.master.TestMasterOperationsForRegionReplicas Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10227//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10227//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10227//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10227//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10227//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10227//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10227//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10227//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10227//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10227//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10227//console This message is automatically generated. 
Cluster free memory limit check should consider L2 block cache size also when L2 cache is onheap. - Key: HBASE-11527 URL: https://issues.apache.org/jira/browse/HBASE-11527 Project: HBase Issue Type: Bug Reporter: Anoop Sam John Assignee: Anoop Sam John Fix For: 2.0.0 Attachments: HBASE-11527.patch, HBASE-11527.patch, HBASE-11527.patch, HBASE-11527.patch, HBASE-11527.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11551) BucketCache$WriterThread.run() doesn't handle exceptions correctly
[ https://issues.apache.org/jira/browse/HBASE-11551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14079675#comment-14079675 ] Ted Yu commented on HBASE-11551: [~apurtell]: do you want this in 0.98? BucketCache$WriterThread.run() doesn't handle exceptions correctly -- Key: HBASE-11551 URL: https://issues.apache.org/jira/browse/HBASE-11551 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.99.0, 2.0.0 Attachments: 11551-v1.txt Currently the catch is outside the while loop:
{code}
try {
  while (cacheEnabled && writerEnabled) {
    ...
  }
} catch (Throwable t) {
  LOG.warn("Failed doing drain", t);
}
{code}
When an exception (e.g. BucketAllocatorException) is thrown, the run() method terminates silently. -- This message was sent by Atlassian JIRA (v6.2#6252)
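The fix amounts to moving the catch inside the loop, so an exception from one drain iteration is logged and the thread keeps running. A minimal self-contained sketch of that control flow (the drain body is simulated; this is not the BucketCache code itself):

```java
// Sketch: catch per iteration instead of around the whole loop. With the
// original placement, the first throw would silently end run(); here the
// loop records the failure and continues.
public class WriterLoop {
    int attempts = 0;
    int failures = 0;

    void drainOnce(boolean fail) throws Exception {
        attempts++;
        if (fail) {
            throw new Exception("simulated BucketAllocatorException");
        }
    }

    void run(boolean[] failPlan) {
        for (boolean fail : failPlan) { // stands in for while (cacheEnabled && writerEnabled)
            try {
                drainOnce(fail);
            } catch (Throwable t) {
                failures++; // log-and-continue instead of thread death
            }
        }
    }

    public static void main(String[] args) {
        WriterLoop w = new WriterLoop();
        w.run(new boolean[] {true, false, true});
        System.out.println(w.attempts + " attempts, " + w.failures + " failures"); // 3 attempts, 2 failures
    }
}
```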
[jira] [Updated] (HBASE-11617) AgeOfLastAppliedOp in MetricsSink got increased when no new replication sink OP
[ https://issues.apache.org/jira/browse/HBASE-11617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Demai Ni updated HBASE-11617: - Description: AgeOfLastAppliedOp in MetricsSink.java is to indicate the time an edit sat in the 'replication queue' before it got replicated(aka applied) {code} /** * Set the age of the last applied operation * * @param timestamp The timestamp of the last operation applied. * @return the age that was set */ public long setAgeOfLastAppliedOp(long timestamp) { lastTimestampForAge = timestamp; long age = System.currentTimeMillis() - lastTimestampForAge; rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age); return age; } {code} In the following scenario: 1) at 7:00am a sink op is applied, and the SINK_AGE_OF_LAST_APPLIED_OP is set for example 100ms; 2) and then NO new Sink op occur. 3) when a refreshAgeOfLastAppliedOp() is invoked at 8:00am. Instead of return the 100ms, the AgeOfLastAppliedOp become 1hour + 100ms, It was because that refreshAgeOfLastAppliedOp() get invoked periodically by getStats(). 
proposed fix: {code} // a new value + private long age; public long setAgeOfLastAppliedOp(long timestamp) { + if (lastTimestampForAge != timestamp) { lastTimestampForAge = timestamp; - long age = System.currentTimeMillis() - lastTimestampForAge; +this.age = System.currentTimeMillis() - lastTimestampForAge; rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age); } return age; } {code} detail discussion in [dev@hbase | http://mail-archives.apache.org/mod_mbox/hbase-dev/201407.mbox/%3CCAOEq2C5BKMXAM2Fv4LGVb_Ktek-Pm%3DhjOi33gSHX-2qHqAou6w%40mail.gmail.com%3E ] AgeOfLastAppliedOp in MetricsSink got increased when no new replication sink OP Key: HBASE-11617 URL: https://issues.apache.org/jira/browse/HBASE-11617 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.98.2 Reporter: Demai Ni Assignee: Demai Ni Priority: Minor Fix For: 0.99.0, 0.98.5, 2.0.0 AgeOfLastAppliedOp in MetricsSink.java is to indicate the time an edit sat in the 'replication queue' before it got replicated(aka applied) {code} /** * Set the age of the last applied operation * * @param timestamp The timestamp of the last operation applied. * @return the age that was set */ public long setAgeOfLastAppliedOp(long timestamp) { lastTimestampForAge = timestamp; long age = System.currentTimeMillis() - lastTimestampForAge; rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age); return age; } {code} In the following scenario: 1) at 7:00am a sink op is applied, and the SINK_AGE_OF_LAST_APPLIED_OP is set for example 100ms; 2) and then NO new Sink op occur. 3) when a refreshAgeOfLastAppliedOp() is invoked at 8:00am. Instead of return the 100ms, the AgeOfLastAppliedOp become 1hour + 100ms, It was because that refreshAgeOfLastAppliedOp() get invoked periodically by getStats(). 
proposed fix: {code} // a new value + private long age; public long setAgeOfLastAppliedOp(long timestamp) { + if (lastTimestampForAge != timestamp) { lastTimestampForAge = timestamp; - long age = System.currentTimeMillis() - lastTimestampForAge; +this.age = System.currentTimeMillis() - lastTimestampForAge; rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age); } return age; } {code} detail discussion in [dev@hbase | http://mail-archives.apache.org/mod_mbox/hbase-dev/201407.mbox/%3CCAOEq2C5BKMXAM2Fv4LGVb_Ktek-Pm%3DhjOi33gSHX-2qHqAou6w%40mail.gmail.com%3E ] -- This message was sent by Atlassian JIRA (v6.2#6252)
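The behaviour the proposed fix aims for can be demonstrated with a small self-contained sketch: the age is cached and recomputed only when a new timestamp arrives, so a periodic getStats() refresh no longer inflates it. The clock is injected here for testability; the real MetricsSink uses System.currentTimeMillis() and a metrics gauge:

```java
// Sketch of the cached-age semantics from the proposed fix: the same
// timestamp seen again returns the previously computed age instead of an
// age that keeps growing with wall-clock time.
public class SinkAge {
    private long lastTimestampForAge;
    private long age;

    long setAgeOfLastAppliedOp(long timestamp, long now) {
        if (lastTimestampForAge != timestamp) {
            // a genuinely new sink op: recompute and cache the age
            lastTimestampForAge = timestamp;
            age = now - lastTimestampForAge;
        }
        return age;
    }

    public static void main(String[] args) {
        SinkAge sink = new SinkAge();
        long first = sink.setAgeOfLastAppliedOp(7_000_000L, 7_000_100L);  // op applied, 100 ms lag
        long later = sink.setAgeOfLastAppliedOp(7_000_000L, 10_600_000L); // refresh an hour later
        System.out.println(first + " " + later); // 100 100
    }
}
```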
[jira] [Created] (HBASE-11617) AgeOfLastAppliedOp in MetricsSink got increased when no new replication sink OP
Demai Ni created HBASE-11617: Summary: AgeOfLastAppliedOp in MetricsSink got increased when no new replication sink OP Key: HBASE-11617 URL: https://issues.apache.org/jira/browse/HBASE-11617 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.98.2 Environment: AgeOfLastAppliedOp in MetricsSink.java is to indicate the time an edit sat in the 'replication queue' before it got replicated(aka applied) {code} /** * Set the age of the last applied operation * * @param timestamp The timestamp of the last operation applied. * @return the age that was set */ public long setAgeOfLastAppliedOp(long timestamp) { lastTimestampForAge = timestamp; long age = System.currentTimeMillis() - lastTimestampForAge; rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age); return age; } {code} In the following scenario: 1) at 7:00am a sink op is applied, and the SINK_AGE_OF_LAST_APPLIED_OP is set for example 100ms; 2) and then NO new Sink op occur. 3) when a refreshAgeOfLastAppliedOp() is invoked at 8:00am. Instead of return the 100ms, the AgeOfLastAppliedOp become 1hour + 100ms, It was because that refreshAgeOfLastAppliedOp() get invoked periodically by getStats(). proposed fix: {code} // a new value + private long age; public long setAgeOfLastAppliedOp(long timestamp) { + if (lastTimestampForAge != timestamp) { lastTimestampForAge = timestamp; - long age = System.currentTimeMillis() - lastTimestampForAge; +this.age = System.currentTimeMillis() - lastTimestampForAge; rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age); } return age; } {code} detail discussion in [dev@hbase | http://mail-archives.apache.org/mod_mbox/hbase-dev/201407.mbox/%3CCAOEq2C5BKMXAM2Fv4LGVb_Ktek-Pm%3DhjOi33gSHX-2qHqAou6w%40mail.gmail.com%3E ] Reporter: Demai Ni Assignee: Demai Ni Priority: Minor Fix For: 0.99.0, 0.98.5, 2.0.0 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11617) AgeOfLastAppliedOp in MetricsSink got increased when no new replication sink OP
[ https://issues.apache.org/jira/browse/HBASE-11617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Demai Ni updated HBASE-11617: - Environment: (was: AgeOfLastAppliedOp in MetricsSink.java is to indicate the time an edit sat in the 'replication queue' before it got replicated(aka applied) {code} /** * Set the age of the last applied operation * * @param timestamp The timestamp of the last operation applied. * @return the age that was set */ public long setAgeOfLastAppliedOp(long timestamp) { lastTimestampForAge = timestamp; long age = System.currentTimeMillis() - lastTimestampForAge; rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age); return age; } {code} In the following scenario: 1) at 7:00am a sink op is applied, and the SINK_AGE_OF_LAST_APPLIED_OP is set for example 100ms; 2) and then NO new Sink op occur. 3) when a refreshAgeOfLastAppliedOp() is invoked at 8:00am. Instead of return the 100ms, the AgeOfLastAppliedOp become 1hour + 100ms, It was because that refreshAgeOfLastAppliedOp() get invoked periodically by getStats(). proposed fix: {code} // a new value + private long age; public long setAgeOfLastAppliedOp(long timestamp) { + if (lastTimestampForAge != timestamp) { lastTimestampForAge = timestamp; - long age = System.currentTimeMillis() - lastTimestampForAge; +this.age = System.currentTimeMillis() - lastTimestampForAge; rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age); } return age; } {code} detail discussion in [dev@hbase | http://mail-archives.apache.org/mod_mbox/hbase-dev/201407.mbox/%3CCAOEq2C5BKMXAM2Fv4LGVb_Ktek-Pm%3DhjOi33gSHX-2qHqAou6w%40mail.gmail.com%3E ]) AgeOfLastAppliedOp in MetricsSink got increased when no new replication sink OP Key: HBASE-11617 URL: https://issues.apache.org/jira/browse/HBASE-11617 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.98.2 Reporter: Demai Ni Assignee: Demai Ni Priority: Minor Fix For: 0.99.0, 0.98.5, 2.0.0 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11617) AgeOfLastAppliedOp in MetricsSink got increased when no new replication sink OP
[ https://issues.apache.org/jira/browse/HBASE-11617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Demai Ni updated HBASE-11617: - Description: AgeOfLastAppliedOp in MetricsSink.java is to indicate the time an edit sat in the 'replication queue' before it got replicated(aka applied) {code} /** * Set the age of the last applied operation * * @param timestamp The timestamp of the last operation applied. * @return the age that was set */ public long setAgeOfLastAppliedOp(long timestamp) { lastTimestampForAge = timestamp; long age = System.currentTimeMillis() - lastTimestampForAge; rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age); return age; } {code} In the following scenario: 1) at 7:00am a sink op is applied, and the SINK_AGE_OF_LAST_APPLIED_OP is set for example 100ms; 2) and then NO new Sink op occur. 3) when a refreshAgeOfLastAppliedOp() is invoked at 8:00am. Instead of return the 100ms, the AgeOfLastAppliedOp become 1hour + 100ms, It was because that refreshAgeOfLastAppliedOp() get invoked periodically by getStats(). proposed fix: {code} // a new value + private long age; public long setAgeOfLastAppliedOp(long timestamp) { + if (lastTimestampForAge != timestamp) { lastTimestampForAge = timestamp; - long age = System.currentTimeMillis() - lastTimestampForAge; +this.age = System.currentTimeMillis() - lastTimestampForAge; rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age); + } else { + this.age = 0; // no new Sink OP coming. the last one already applied + } return age; } {code} detail discussion in [dev@hbase | http://mail-archives.apache.org/mod_mbox/hbase-dev/201407.mbox/%3CCAOEq2C5BKMXAM2Fv4LGVb_Ktek-Pm%3DhjOi33gSHX-2qHqAou6w%40mail.gmail.com%3E ] was: AgeOfLastAppliedOp in MetricsSink.java is to indicate the time an edit sat in the 'replication queue' before it got replicated(aka applied) {code} /** * Set the age of the last applied operation * * @param timestamp The timestamp of the last operation applied. 
* @return the age that was set */ public long setAgeOfLastAppliedOp(long timestamp) { lastTimestampForAge = timestamp; long age = System.currentTimeMillis() - lastTimestampForAge; rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age); return age; } {code} In the following scenario: 1) at 7:00am a sink op is applied, and the SINK_AGE_OF_LAST_APPLIED_OP is set for example 100ms; 2) and then NO new Sink op occur. 3) when a refreshAgeOfLastAppliedOp() is invoked at 8:00am. Instead of return the 100ms, the AgeOfLastAppliedOp become 1hour + 100ms, It was because that refreshAgeOfLastAppliedOp() get invoked periodically by getStats(). proposed fix: {code} // a new value + private long age; public long setAgeOfLastAppliedOp(long timestamp) { + if (lastTimestampForAge != timestamp) { lastTimestampForAge = timestamp; - long age = System.currentTimeMillis() - lastTimestampForAge; +this.age = System.currentTimeMillis() - lastTimestampForAge; rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age); } return age; } {code} detail discussion in [dev@hbase | http://mail-archives.apache.org/mod_mbox/hbase-dev/201407.mbox/%3CCAOEq2C5BKMXAM2Fv4LGVb_Ktek-Pm%3DhjOi33gSHX-2qHqAou6w%40mail.gmail.com%3E ] AgeOfLastAppliedOp in MetricsSink got increased when no new replication sink OP Key: HBASE-11617 URL: https://issues.apache.org/jira/browse/HBASE-11617 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.98.2 Reporter: Demai Ni Assignee: Demai Ni Priority: Minor Fix For: 0.99.0, 0.98.5, 2.0.0 AgeOfLastAppliedOp in MetricsSink.java is to indicate the time an edit sat in the 'replication queue' before it got replicated(aka applied) {code} /** * Set the age of the last applied operation * * @param timestamp The timestamp of the last operation applied. 
* @return the age that was set */ public long setAgeOfLastAppliedOp(long timestamp) { lastTimestampForAge = timestamp; long age = System.currentTimeMillis() - lastTimestampForAge; rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age); return age; } {code} In the following scenario: 1) at 7:00am a sink op is applied, and the SINK_AGE_OF_LAST_APPLIED_OP is set for example 100ms; 2) and then NO new Sink op occur. 3) when a refreshAgeOfLastAppliedOp() is invoked at 8:00am. Instead of return the 100ms, the AgeOfLastAppliedOp become 1hour + 100ms, It was because that refreshAgeOfLastAppliedOp() get invoked periodically by getStats(). proposed fix: {code} // a new value + private long age; public long setAgeOfLastAppliedOp(long timestamp) { +
[jira] [Created] (HBASE-11618) CorruptedSnapshotException when running ExportSnapshot
William Watson created HBASE-11618: -- Summary: CorruptedSnapshotException when running ExportSnapshot Key: HBASE-11618 URL: https://issues.apache.org/jira/browse/HBASE-11618 Project: HBase Issue Type: Bug Environment: hadoop jar 2.4.0.2.1.2.1-471 hbase jar hbase-common-0.98.0.2.1.2.1-471-hadoop2 yarn CentOS release 6.5 (Final) Reporter: William Watson After much digging, I finally figured out how to get the classpaths just right to run the ExportSnapshot command: {code} /usr/bin/hbase -classpath '/usr/lib/hbase/lib/*:/usr/lib/hadoop/client/*' org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot 'myTable_hbaseSnapshot_20140730' -copy-to hdfs://10.0.1.21:8020/hbase -mappers 4 {code} only to run into the following error: {code} Exception in thread main org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Couldn't read snapshot info from:hdfs://ip-10-0-1-31.ec2.internal:8020/tmp/hbase-hbase/hbase/.hbase-snapshot/myTable_hbaseSnapshot_20140730/.snapshotinfo {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11143) Improve replication metrics
[ https://issues.apache.org/jira/browse/HBASE-11143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14079734#comment-14079734 ] Lars Hofhansl commented on HBASE-11143: --- Turns out there are more problems (in 0.98 at least): # ageOfLastShippedOp will not increase when there is nothing to ship, but it will be stuck at whatever the age of the last shipped edit was. If there is nothing to ship we are (by definition) current, so I should do the same as in 0.94: set the ageOfLastShippedEdit to 0. # ageOfLastAppliedOp is ever increasing even when there is nothing to replicate. 0.94 does not have this, only 0.98 (brought up on the mailing list by [~nidmhbase]). I'll file a new issue to fix these. Improve replication metrics --- Key: HBASE-11143 URL: https://issues.apache.org/jira/browse/HBASE-11143 Project: HBase Issue Type: Bug Components: Replication Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.99.0, 0.94.20, 0.98.3 Attachments: 11143-0.94-v2.txt, 11143-0.94-v3.txt, 11143-0.94.txt, 11143-trunk.txt We are trying to report on replication lag and find that there is no good single metric for it. ageOfLastShippedOp is close, but unfortunately it is increased even when there is nothing to ship on a particular RegionServer. I would like to discuss a few options here: Add a new metric: replicationQueueTime (or something) with the above meaning. I.e. if we have something to ship we set the age of that last shipped edit; if we fail we increment that last time (just like we do now). But if there is nothing to replicate we set it to the current time (and hence that metric is reported close to 0). Alternatively we could change the meaning of ageOfLastShippedOp to do that. That might lead to surprises, but the current behavior is clearly weird when there is nothing to replicate. Comments? [~jdcryans], [~stack]. If this approach sounds good, I'll make a patch for all branches. 
Edit: Also adds a new shippedKBs metric to track the amount of data that is shipped via replication. -- This message was sent by Atlassian JIRA (v6.2#6252)
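The "report 0 when there is nothing to ship" convention discussed above can be sketched as a self-contained Java snippet. This is an illustrative stand-in, not the actual MetricsSource code: the class name, `refresh`, `enqueue`, and `shipAll` are all hypothetical, and the replication queue is modeled as a simple deque of edit timestamps.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of the proposed semantics: an empty replication queue means we are
// (by definition) caught up, so the lag gauge reports 0 instead of being
// stuck at the age of the last shipped edit.
public class ShippedAgeSketch {
    private final Deque<Long> queue = new ArrayDeque<>(); // timestamps of edits waiting to ship
    private long ageOfLastShippedOp = 0;

    void enqueue(long editTimestamp) {
        queue.add(editTimestamp);
    }

    void shipAll() {
        queue.clear();
    }

    // Called periodically (as getStats() would); returns the reported lag.
    long refresh(long now) {
        if (queue.isEmpty()) {
            ageOfLastShippedOp = 0; // nothing to ship => current
        } else {
            ageOfLastShippedOp = now - queue.peek(); // age of oldest unshipped edit
        }
        return ageOfLastShippedOp;
    }

    public static void main(String[] args) {
        ShippedAgeSketch m = new ShippedAgeSketch();
        m.enqueue(1000L);
        System.out.println(m.refresh(1500L)); // prints 500: oldest edit queued 500ms ago
        m.shipAll();
        System.out.println(m.refresh(5000L)); // prints 0: queue empty, we are current
    }
}
```

With the old behavior, the second refresh would instead keep reporting the last computed age even though replication is fully caught up.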
[jira] [Resolved] (HBASE-11618) CorruptedSnapshotException when running ExportSnapshot
[ https://issues.apache.org/jira/browse/HBASE-11618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi resolved HBASE-11618. - Resolution: Invalid Please post on the user@ list. The first problem is a setup issue: you should be able to run $ hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot without specifying any extra classpath. Otherwise you end up, as in this case, with the configuration missing and the default /tmp/hbase location being used instead of your hbase.rootdir. CorruptedSnapshotException when running ExportSnapshot -- Key: HBASE-11618 URL: https://issues.apache.org/jira/browse/HBASE-11618 Project: HBase Issue Type: Bug Environment: hadoop jar 2.4.0.2.1.2.1-471 hbase jar hbase-common-0.98.0.2.1.2.1-471-hadoop2 yarn CentOS release 6.5 (Final) Reporter: William Watson After much digging, I finally figured out how to get the classpaths just right to run the ExportSnapshot command: {code} /usr/bin/hbase -classpath '/usr/lib/hbase/lib/*:/usr/lib/hadoop/client/*' org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot 'myTable_hbaseSnapshot_20140730' -copy-to hdfs://10.0.1.21:8020/hbase -mappers 4 {code} only to run into the following error: {code} Exception in thread "main" org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Couldn't read snapshot info from: hdfs://ip-10-0-1-31.ec2.internal:8020/tmp/hbase-hbase/hbase/.hbase-snapshot/myTable_hbaseSnapshot_20140730/.snapshotinfo {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11617) AgeOfLastAppliedOp in MetricsSink got increased when no new replication sink OP
[ https://issues.apache.org/jira/browse/HBASE-11617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Demai Ni updated HBASE-11617: - Description: AgeOfLastAppliedOp in MetricsSink.java is to indicate the time an edit sat in the 'replication queue' before it got replicated(aka applied) {code} /** * Set the age of the last applied operation * * @param timestamp The timestamp of the last operation applied. * @return the age that was set */ public long setAgeOfLastAppliedOp(long timestamp) { lastTimestampForAge = timestamp; long age = System.currentTimeMillis() - lastTimestampForAge; rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age); return age; } {code} In the following scenario: 1) at 7:00am a sink op is applied, and the SINK_AGE_OF_LAST_APPLIED_OP is set for example 100ms; 2) and then NO new Sink op occur. 3) when a refreshAgeOfLastAppliedOp() is invoked at 8:00am. Instead of return the 100ms, the AgeOfLastAppliedOp become 1hour + 100ms, It was because that refreshAgeOfLastAppliedOp() get invoked periodically by getStats(). 
proposed fix:
{code}
--- hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsSink.java
+++ hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsSink.java
@@ -35,6 +35,7 @@ public class MetricsSink {
   private MetricsReplicationSource rms;
   private long lastTimestampForAge = System.currentTimeMillis();
+  private long age = 0;

   public MetricsSink() {
     rms = CompatibilitySingletonFactory.getInstance(MetricsReplicationSource.class);
@@ -47,8 +48,12 @@ public class MetricsSink {
    * @return the age that was set
    */
   public long setAgeOfLastAppliedOp(long timestamp) {
-    lastTimestampForAge = timestamp;
-    long age = System.currentTimeMillis() - lastTimestampForAge;
+    if (lastTimestampForAge != timestamp) {
+      lastTimestampForAge = timestamp;
+      this.age = System.currentTimeMillis() - lastTimestampForAge;
+    } else {
+      this.age = 0;
+    }
     rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age);
     return age;
   }
{code}
detailed discussion in [dev@hbase | http://mail-archives.apache.org/mod_mbox/hbase-dev/201407.mbox/%3CCAOEq2C5BKMXAM2Fv4LGVb_Ktek-Pm%3DhjOi33gSHX-2qHqAou6w%40mail.gmail.com%3E ] was: AgeOfLastAppliedOp in MetricsSink.java is to indicate the time an edit sat in the 'replication queue' before it got replicated(aka applied) {code} /** * Set the age of the last applied operation * * @param timestamp The timestamp of the last operation applied. * @return the age that was set */ public long setAgeOfLastAppliedOp(long timestamp) { lastTimestampForAge = timestamp; long age = System.currentTimeMillis() - lastTimestampForAge; rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age); return age; } {code} In the following scenario: 1) at 7:00am a sink op is applied, and the SINK_AGE_OF_LAST_APPLIED_OP is set for example 100ms; 2) and then NO new Sink op occur. 3) when a refreshAgeOfLastAppliedOp() is invoked at 8:00am. 
Instead of return the 100ms, the AgeOfLastAppliedOp become 1hour + 100ms, It was because that refreshAgeOfLastAppliedOp() get invoked periodically by getStats(). proposed fix: {code} // a new value + private long age; public long setAgeOfLastAppliedOp(long timestamp) { + if (lastTimestampForAge != timestamp) { lastTimestampForAge = timestamp; - long age = System.currentTimeMillis() - lastTimestampForAge; +this.age = System.currentTimeMillis() - lastTimestampForAge; rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age); + } else { + this.age = 0; // no new Sink OP coming. the last one already applied + } return age; } {code} detail discussion in [dev@hbase | http://mail-archives.apache.org/mod_mbox/hbase-dev/201407.mbox/%3CCAOEq2C5BKMXAM2Fv4LGVb_Ktek-Pm%3DhjOi33gSHX-2qHqAou6w%40mail.gmail.com%3E ] AgeOfLastAppliedOp in MetricsSink got increased when no new replication sink OP Key: HBASE-11617 URL: https://issues.apache.org/jira/browse/HBASE-11617 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.98.2 Reporter: Demai Ni Assignee: Demai Ni Priority: Minor Fix For: 0.99.0, 0.98.5, 2.0.0 AgeOfLastAppliedOp in MetricsSink.java is to indicate the time an edit sat in the 'replication queue' before it got replicated(aka applied) {code} /** * Set the age of the last applied operation * * @param timestamp The timestamp of the last operation applied. * @return the age that was set */ public long setAgeOfLastAppliedOp(long timestamp) { lastTimestampForAge = timestamp; long age = System.currentTimeMillis() -
[jira] [Commented] (HBASE-11617) AgeOfLastAppliedOp in MetricsSink got increased when no new replication sink OP
[ https://issues.apache.org/jira/browse/HBASE-11617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14079748#comment-14079748 ] Lars Hofhansl commented on HBASE-11617: --- Also note HBASE-11143 (my last comment); there's also a quick fix for ageOfLastShippedEdit that we should do. AgeOfLastAppliedOp in MetricsSink got increased when no new replication sink OP Key: HBASE-11617 URL: https://issues.apache.org/jira/browse/HBASE-11617 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.98.2 Reporter: Demai Ni Assignee: Demai Ni Priority: Minor Fix For: 0.99.0, 0.98.5, 2.0.0 AgeOfLastAppliedOp in MetricsSink.java is to indicate the time an edit sat in the 'replication queue' before it got replicated(aka applied) {code} /** * Set the age of the last applied operation * * @param timestamp The timestamp of the last operation applied. * @return the age that was set */ public long setAgeOfLastAppliedOp(long timestamp) { lastTimestampForAge = timestamp; long age = System.currentTimeMillis() - lastTimestampForAge; rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age); return age; } {code} In the following scenario: 1) at 7:00am a sink op is applied, and the SINK_AGE_OF_LAST_APPLIED_OP is set for example 100ms; 2) and then NO new Sink op occur. 3) when a refreshAgeOfLastAppliedOp() is invoked at 8:00am. Instead of return the 100ms, the AgeOfLastAppliedOp become 1hour + 100ms, It was because that refreshAgeOfLastAppliedOp() get invoked periodically by getStats(). 
proposed fix: {code} --- hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsSink.java +++ hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsSink.java @@ -35,6 +35,7 @@ public class MetricsSink { private MetricsReplicationSource rms; private long lastTimestampForAge = System.currentTimeMillis(); + private long age = 0; public MetricsSink() { rms = CompatibilitySingletonFactory.getInstance(MetricsReplicationSource.class); @@ -47,8 +48,12 @@ public class MetricsSink { * @return the age that was set */ public long setAgeOfLastAppliedOp(long timestamp) { -lastTimestampForAge = timestamp; -long age = System.currentTimeMillis() - lastTimestampForAge; +if (lastTimestampForAge != timestamp) { + lastTimestampForAge = timestamp; + this.age = System.currentTimeMillis() - lastTimestampForAge; +} else { + this.age = 0; +} rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age); return age; } {code} detail discussion in [dev@hbase | http://mail-archives.apache.org/mod_mbox/hbase-dev/201407.mbox/%3CCAOEq2C5BKMXAM2Fv4LGVb_Ktek-Pm%3DhjOi33gSHX-2qHqAou6w%40mail.gmail.com%3E ] -- This message was sent by Atlassian JIRA (v6.2#6252)
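The guarded setter in the proposed diff can be exercised as a self-contained Java sketch. This is an illustration of the fix's behavior, not the real MetricsSink: the class name is hypothetical and the metrics gauge (rms.setGauge in the patch) is stubbed with a plain field.

```java
// Minimal stand-in for MetricsSink#setAgeOfLastAppliedOp with the proposed
// guard. The method name, field lastTimestampForAge, and the age semantics
// match the patch; the MetricsReplicationSource gauge is stubbed out.
public class MetricsSinkSketch {
    private long lastTimestampForAge = System.currentTimeMillis();
    private long age = 0;
    private long gauge; // stands in for rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age)

    public long setAgeOfLastAppliedOp(long timestamp) {
        if (lastTimestampForAge != timestamp) {
            // A new sink op was applied: record how long it sat in the queue.
            lastTimestampForAge = timestamp;
            this.age = System.currentTimeMillis() - lastTimestampForAge;
        } else {
            // No new sink op since the last call: report zero lag instead of
            // letting the age grow with wall-clock time on every refresh.
            this.age = 0;
        }
        gauge = age;
        return age;
    }

    public static void main(String[] args) throws InterruptedException {
        MetricsSinkSketch sink = new MetricsSinkSketch();
        long ts = System.currentTimeMillis() - 100; // op queued ~100ms ago
        long first = sink.setAgeOfLastAppliedOp(ts);
        Thread.sleep(50); // time passes with no new sink op
        long refreshed = sink.setAgeOfLastAppliedOp(ts); // periodic refresh, same timestamp
        System.out.println(first >= 100 && refreshed == 0); // prints true
    }
}
```

Without the guard, the second call would recompute the age from the wall clock and the gauge would keep growing even though replication is idle, which is exactly the bug described above.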
[jira] [Commented] (HBASE-11617) AgeOfLastAppliedOp in MetricsSink got increased when no new replication sink OP
[ https://issues.apache.org/jira/browse/HBASE-11617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14079747#comment-14079747 ] Lars Hofhansl commented on HBASE-11617: --- Can we just not refresh from getStats? That way the metric retains the value it was last set to by ReplicationSink. AgeOfLastAppliedOp in MetricsSink got increased when no new replication sink OP Key: HBASE-11617 URL: https://issues.apache.org/jira/browse/HBASE-11617 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.98.2 Reporter: Demai Ni Assignee: Demai Ni Priority: Minor Fix For: 0.99.0, 0.98.5, 2.0.0 AgeOfLastAppliedOp in MetricsSink.java is to indicate the time an edit sat in the 'replication queue' before it got replicated(aka applied) {code} /** * Set the age of the last applied operation * * @param timestamp The timestamp of the last operation applied. * @return the age that was set */ public long setAgeOfLastAppliedOp(long timestamp) { lastTimestampForAge = timestamp; long age = System.currentTimeMillis() - lastTimestampForAge; rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age); return age; } {code} In the following scenario: 1) at 7:00am a sink op is applied, and the SINK_AGE_OF_LAST_APPLIED_OP is set for example 100ms; 2) and then NO new Sink op occur. 3) when a refreshAgeOfLastAppliedOp() is invoked at 8:00am. Instead of return the 100ms, the AgeOfLastAppliedOp become 1hour + 100ms, It was because that refreshAgeOfLastAppliedOp() get invoked periodically by getStats(). 
proposed fix: {code} --- hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsSink.java +++ hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsSink.java @@ -35,6 +35,7 @@ public class MetricsSink { private MetricsReplicationSource rms; private long lastTimestampForAge = System.currentTimeMillis(); + private long age = 0; public MetricsSink() { rms = CompatibilitySingletonFactory.getInstance(MetricsReplicationSource.class); @@ -47,8 +48,12 @@ public class MetricsSink { * @return the age that was set */ public long setAgeOfLastAppliedOp(long timestamp) { -lastTimestampForAge = timestamp; -long age = System.currentTimeMillis() - lastTimestampForAge; +if (lastTimestampForAge != timestamp) { + lastTimestampForAge = timestamp; + this.age = System.currentTimeMillis() - lastTimestampForAge; +} else { + this.age = 0; +} rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age); return age; } {code} detail discussion in [dev@hbase | http://mail-archives.apache.org/mod_mbox/hbase-dev/201407.mbox/%3CCAOEq2C5BKMXAM2Fv4LGVb_Ktek-Pm%3DhjOi33gSHX-2qHqAou6w%40mail.gmail.com%3E ] -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HBASE-11619) Remove unused test class from TestHLogSplit
Sean Busbey created HBASE-11619: --- Summary: Remove unused test class from TestHLogSplit Key: HBASE-11619 URL: https://issues.apache.org/jira/browse/HBASE-11619 Project: HBase Issue Type: Task Components: wal Reporter: Sean Busbey Priority: Trivial With the changes introduced by HBASE-8962, we no longer need the class ZombieNewLogWriterRegionServer in TestHLogSplit. It should be removed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11617) AgeOfLastAppliedOp in MetricsSink got increased when no new replication sink OP
[ https://issues.apache.org/jira/browse/HBASE-11617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14079784#comment-14079784 ] Demai Ni commented on HBASE-11617: -- [~lhofhansl], thanks for confirming the problem. bq. Can we just not refresh from getStats? That way the metric retains the value it was last set to by ReplicationSink. I am not sure how to stop the refresh in getStats(); it is a public method, which can be invoked by other applications, and it is also invoked by ReplicationStatisticsThread. Also, the invocation won't pass in a parameter to indicate whether a refresh is needed. Suggestions? Demai AgeOfLastAppliedOp in MetricsSink got increased when no new replication sink OP Key: HBASE-11617 URL: https://issues.apache.org/jira/browse/HBASE-11617 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.98.2 Reporter: Demai Ni Assignee: Demai Ni Priority: Minor Fix For: 0.99.0, 0.98.5, 2.0.0 AgeOfLastAppliedOp in MetricsSink.java is to indicate the time an edit sat in the 'replication queue' before it got replicated(aka applied) {code} /** * Set the age of the last applied operation * * @param timestamp The timestamp of the last operation applied. * @return the age that was set */ public long setAgeOfLastAppliedOp(long timestamp) { lastTimestampForAge = timestamp; long age = System.currentTimeMillis() - lastTimestampForAge; rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age); return age; } {code} In the following scenario: 1) at 7:00am a sink op is applied, and the SINK_AGE_OF_LAST_APPLIED_OP is set for example 100ms; 2) and then NO new Sink op occur. 3) when a refreshAgeOfLastAppliedOp() is invoked at 8:00am. Instead of return the 100ms, the AgeOfLastAppliedOp become 1hour + 100ms, It was because that refreshAgeOfLastAppliedOp() get invoked periodically by getStats(). 
proposed fix: {code} --- hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsSink.java +++ hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsSink.java @@ -35,6 +35,7 @@ public class MetricsSink { private MetricsReplicationSource rms; private long lastTimestampForAge = System.currentTimeMillis(); + private long age = 0; public MetricsSink() { rms = CompatibilitySingletonFactory.getInstance(MetricsReplicationSource.class); @@ -47,8 +48,12 @@ public class MetricsSink { * @return the age that was set */ public long setAgeOfLastAppliedOp(long timestamp) { -lastTimestampForAge = timestamp; -long age = System.currentTimeMillis() - lastTimestampForAge; +if (lastTimestampForAge != timestamp) { + lastTimestampForAge = timestamp; + this.age = System.currentTimeMillis() - lastTimestampForAge; +} else { + this.age = 0; +} rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age); return age; } {code} detail discussion in [dev@hbase | http://mail-archives.apache.org/mod_mbox/hbase-dev/201407.mbox/%3CCAOEq2C5BKMXAM2Fv4LGVb_Ktek-Pm%3DhjOi33gSHX-2qHqAou6w%40mail.gmail.com%3E ] -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11551) BucketCache$WriterThread.run() doesn't handle exceptions correctly
[ https://issues.apache.org/jira/browse/HBASE-11551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14079801#comment-14079801 ] Andrew Purtell commented on HBASE-11551: +1 for 0.98, thanks Ted BucketCache$WriterThread.run() doesn't handle exceptions correctly -- Key: HBASE-11551 URL: https://issues.apache.org/jira/browse/HBASE-11551 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.99.0, 2.0.0 Attachments: 11551-v1.txt Currently the catch is outside the while loop:
{code}
try {
  while (cacheEnabled && writerEnabled) {
    ...
  }
} catch (Throwable t) {
  LOG.warn("Failed doing drain", t);
}
{code}
When an exception (e.g. BucketAllocatorException) is thrown, the run() method terminates silently. -- This message was sent by Atlassian JIRA (v6.2#6252)
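The shape of the fix implied by the description is to move the try/catch inside the loop so a single failed drain does not end the writer thread. The sketch below is a hypothetical illustration, not the actual BucketCache code: `drainOnce`, `attempts`, and `failures` are made-up names, and the simulated exception stands in for BucketAllocatorException.

```java
// Sketch of a worker loop that survives per-iteration failures: because the
// try/catch sits inside the while, one bad iteration is logged and the loop
// continues, instead of run() terminating silently.
public class ResilientWriter {
    volatile boolean cacheEnabled = true;
    int attempts = 0;
    int failures = 0;

    void drainOnce() {
        attempts++;
        if (attempts == 2) {
            throw new RuntimeException("simulated BucketAllocatorException");
        }
        if (attempts >= 4) {
            cacheEnabled = false; // stop after a few iterations for the demo
        }
    }

    public void run() {
        while (cacheEnabled) {
            try {
                drainOnce();
            } catch (Throwable t) {
                failures++; // in the real fix: LOG.warn and keep draining
            }
        }
    }

    public static void main(String[] args) {
        ResilientWriter w = new ResilientWriter();
        w.run();
        System.out.println(w.attempts + " " + w.failures); // prints "4 1"
    }
}
```

With the original structure (catch outside the while), the simulated exception on the second iteration would have ended the loop after two attempts.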
[jira] [Updated] (HBASE-5826) Improve sync of HLog edits
[ https://issues.apache.org/jira/browse/HBASE-5826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-5826: --- Component/s: wal Improve sync of HLog edits -- Key: HBASE-5826 URL: https://issues.apache.org/jira/browse/HBASE-5826 Project: HBase Issue Type: Improvement Components: wal Reporter: Ted Yu Assignee: Todd Lipcon Attachments: 5826-v2.txt, 5826-v3.txt, 5826-v4.txt, 5826-v5.txt, 5826.txt HBASE-5782 solved the correctness issue for the sync of HLog edits. Todd provided a patch that would achieve higher throughput. This JIRA is a continuation of Todd's work submitted there. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-6931) Refine WAL interface
[ https://issues.apache.org/jira/browse/HBASE-6931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-6931: --- Component/s: wal Refine WAL interface Key: HBASE-6931 URL: https://issues.apache.org/jira/browse/HBASE-6931 Project: HBase Issue Type: Improvement Components: wal Reporter: Flavio Junqueira We have transformed HLog into an interface and created FSHLog to contain the current implementation of HLog in HBASE-5937. In that patch, we have essentially exposed the public methods, moved method implementations to FSHLog, created a factory for HLog, and moved static methods to HLogUtil. In this umbrella jira, the idea is to refine the WAL interface, making it not dependent upon a file system as it is currently. The high-level idea is to revisit the methods in HLog and HLogUtil and come up an interface that can accommodate other backends, such as BookKeeper. Another major task here is to decide what to do with the splitter. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-10278) Provide better write predictability
[ https://issues.apache.org/jira/browse/HBASE-10278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-10278: Component/s: wal Provide better write predictability --- Key: HBASE-10278 URL: https://issues.apache.org/jira/browse/HBASE-10278 Project: HBase Issue Type: New Feature Components: wal Reporter: Himanshu Vashishtha Assignee: Himanshu Vashishtha Attachments: 10278-trunk-v2.1.patch, 10278-trunk-v2.1.patch, 10278-wip-1.1.patch, Multiwaldesigndoc.pdf, SwitchWriterFlow.pptx Currently, HBase has one WAL per region server. Whenever there is any latency in the write pipeline (due to whatever reasons such as n/w blip, a node in the pipeline having a bad disk, etc), the overall write latency suffers. Jonathan Hsieh and I analyzed various approaches to tackle this issue. We also looked at HBASE-5699, which talks about adding concurrent multi WALs. Along with performance numbers, we also focussed on design simplicity, minimum impact on MTTR Replication, and compatibility with 0.96 and 0.98. Considering all these parameters, we propose a new HLog implementation with WAL Switching functionality. Please find attached the design doc for the same. It introduces the WAL Switching feature, and experiments/results of a prototype implementation, showing the benefits of this feature. The second goal of this work is to serve as a building block for concurrent multiple WALs feature. Please review the doc. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-5826) Improve sync of HLog edits
[ https://issues.apache.org/jira/browse/HBASE-5826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14079813#comment-14079813 ] Hadoop QA commented on HBASE-5826: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12542625/5826-v5.txt against trunk revision . ATTACHMENT ID: 12542625 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10228//console This message is automatically generated. Improve sync of HLog edits -- Key: HBASE-5826 URL: https://issues.apache.org/jira/browse/HBASE-5826 Project: HBase Issue Type: Improvement Components: wal Reporter: Ted Yu Assignee: Todd Lipcon Attachments: 5826-v2.txt, 5826-v3.txt, 5826-v4.txt, 5826-v5.txt, 5826.txt HBASE-5782 solved the correctness issue for the sync of HLog edits. Todd provided a patch that would achieve higher throughput. This JIRA is a continuation of Todd's work submitted there. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11617) AgeOfLastAppliedOp in MetricsSink got increased when no new replication sink OP
[ https://issues.apache.org/jira/browse/HBASE-11617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14079818#comment-14079818 ] Demai Ni commented on HBASE-11617: -- actually, putting the checking in MetricsSink.refreshAgeOfLastAppliedOp() may be better? AgeOfLastAppliedOp in MetricsSink got increased when no new replication sink OP Key: HBASE-11617 URL: https://issues.apache.org/jira/browse/HBASE-11617 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.98.2 Reporter: Demai Ni Assignee: Demai Ni Priority: Minor Fix For: 0.99.0, 0.98.5, 2.0.0 AgeOfLastAppliedOp in MetricsSink.java is to indicate the time an edit sat in the 'replication queue' before it got replicated(aka applied) {code} /** * Set the age of the last applied operation * * @param timestamp The timestamp of the last operation applied. * @return the age that was set */ public long setAgeOfLastAppliedOp(long timestamp) { lastTimestampForAge = timestamp; long age = System.currentTimeMillis() - lastTimestampForAge; rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age); return age; } {code} In the following scenario: 1) at 7:00am a sink op is applied, and the SINK_AGE_OF_LAST_APPLIED_OP is set for example 100ms; 2) and then NO new Sink op occur. 3) when a refreshAgeOfLastAppliedOp() is invoked at 8:00am. Instead of return the 100ms, the AgeOfLastAppliedOp become 1hour + 100ms, It was because that refreshAgeOfLastAppliedOp() get invoked periodically by getStats(). 
proposed fix: {code} --- hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsSink.java +++ hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsSink.java @@ -35,6 +35,7 @@ public class MetricsSink { private MetricsReplicationSource rms; private long lastTimestampForAge = System.currentTimeMillis(); + private long age = 0; public MetricsSink() { rms = CompatibilitySingletonFactory.getInstance(MetricsReplicationSource.class); @@ -47,8 +48,12 @@ public class MetricsSink { * @return the age that was set */ public long setAgeOfLastAppliedOp(long timestamp) { -lastTimestampForAge = timestamp; -long age = System.currentTimeMillis() - lastTimestampForAge; +if (lastTimestampForAge != timestamp) { + lastTimestampForAge = timestamp; + this.age = System.currentTimeMillis() - lastTimestampForAge; +} else { + this.age = 0; +} rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age); return age; } {code} detail discussion in [dev@hbase | http://mail-archives.apache.org/mod_mbox/hbase-dev/201407.mbox/%3CCAOEq2C5BKMXAM2Fv4LGVb_Ktek-Pm%3DhjOi33gSHX-2qHqAou6w%40mail.gmail.com%3E ] -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11551) BucketCache$WriterThread.run() doesn't handle exceptions correctly
[ https://issues.apache.org/jira/browse/HBASE-11551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-11551: --- Fix Version/s: 0.98.5 Hadoop Flags: Reviewed Integrated to 0.98 as well. BucketCache$WriterThread.run() doesn't handle exceptions correctly -- Key: HBASE-11551 URL: https://issues.apache.org/jira/browse/HBASE-11551 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.99.0, 0.98.5, 2.0.0 Attachments: 11551-v1.txt Currently the catch is outside the while loop: {code} try { while (cacheEnabled && writerEnabled) { ... } } catch (Throwable t) { LOG.warn("Failed doing drain", t); } {code} When exception (e.g. BucketAllocatorException) is thrown, run() method would terminate, silently. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-10674) HBCK should be updated to do replica related checks
[ https://issues.apache.org/jira/browse/HBASE-10674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj Das updated HBASE-10674: Assignee: Devaraj Das Status: Patch Available (was: Open) HBCK should be updated to do replica related checks --- Key: HBASE-10674 URL: https://issues.apache.org/jira/browse/HBASE-10674 Project: HBase Issue Type: Sub-task Reporter: Devaraj Das Assignee: Devaraj Das Attachments: 10674-1.2.txt, 10674-1.txt HBCK should be updated to have a check for whether the replicas are assigned to the right machines (default and non-default replicas ideally should not be in the same server if there is more than one server in the cluster and such scenarios). [~jmhsieh] suggested this in HBASE-10362. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-10674) HBCK should be updated to do replica related checks
[ https://issues.apache.org/jira/browse/HBASE-10674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj Das updated HBASE-10674: Attachment: 10674-1.2.txt This patch addresses the cases where some replicas are not deployed in the cluster, or excess replicas are deployed in the cluster. It also deals with plugging meta holes with the appropriate replica qualifiers. HBCK should be updated to do replica related checks --- Key: HBASE-10674 URL: https://issues.apache.org/jira/browse/HBASE-10674 Project: HBase Issue Type: Sub-task Reporter: Devaraj Das Attachments: 10674-1.2.txt, 10674-1.txt HBCK should be updated to have a check for whether the replicas are assigned to the right machines (default and non-default replicas ideally should not be in the same server if there is more than one server in the cluster and such scenarios). [~jmhsieh] suggested this in HBASE-10362. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11617) AgeOfLastAppliedOp in MetricsSink got increased when no new replication sink OP
[ https://issues.apache.org/jira/browse/HBASE-11617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14079867#comment-14079867 ] Lars Hofhansl commented on HBASE-11617: --- The problem is that getStats() is called periodically to dump the replication metrics to the logs, and it calls refreshAgeOfLastAppliedOp() (because there is no other way to get the metric's value). Your solution will work and is maybe the easiest to do. (You do not need to turn age into a member, though.) AgeOfLastAppliedOp in MetricsSink got increased when no new replication sink OP Key: HBASE-11617 URL: https://issues.apache.org/jira/browse/HBASE-11617 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.98.2 Reporter: Demai Ni Assignee: Demai Ni Priority: Minor Fix For: 0.99.0, 0.98.5, 2.0.0 AgeOfLastAppliedOp in MetricsSink.java is meant to indicate the time an edit sat in the 'replication queue' before it got replicated (aka applied):
{code}
  /**
   * Set the age of the last applied operation
   *
   * @param timestamp The timestamp of the last operation applied.
   * @return the age that was set
   */
  public long setAgeOfLastAppliedOp(long timestamp) {
    lastTimestampForAge = timestamp;
    long age = System.currentTimeMillis() - lastTimestampForAge;
    rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age);
    return age;
  }
{code}
Consider the following scenario: 1) at 7:00am a sink op is applied, and SINK_AGE_OF_LAST_APPLIED_OP is set to, for example, 100ms; 2) then NO new sink op occurs; 3) when refreshAgeOfLastAppliedOp() is invoked at 8:00am, instead of returning the 100ms, the AgeOfLastAppliedOp becomes 1 hour + 100ms. This is because refreshAgeOfLastAppliedOp() gets invoked periodically by getStats(). 
proposed fix:
{code}
--- hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsSink.java
+++ hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsSink.java
@@ -35,6 +35,7 @@ public class MetricsSink {
   private MetricsReplicationSource rms;
   private long lastTimestampForAge = System.currentTimeMillis();
+  private long age = 0;

   public MetricsSink() {
     rms = CompatibilitySingletonFactory.getInstance(MetricsReplicationSource.class);
@@ -47,8 +48,12 @@ public class MetricsSink {
    * @return the age that was set
    */
   public long setAgeOfLastAppliedOp(long timestamp) {
-    lastTimestampForAge = timestamp;
-    long age = System.currentTimeMillis() - lastTimestampForAge;
+    if (lastTimestampForAge != timestamp) {
+      lastTimestampForAge = timestamp;
+      this.age = System.currentTimeMillis() - lastTimestampForAge;
+    } else {
+      this.age = 0;
+    }
     rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age);
     return age;
   }
{code}
detail discussion in [dev@hbase | http://mail-archives.apache.org/mod_mbox/hbase-dev/201407.mbox/%3CCAOEq2C5BKMXAM2Fv4LGVb_Ktek-Pm%3DhjOi33gSHX-2qHqAou6w%40mail.gmail.com%3E ] -- This message was sent by Atlassian JIRA (v6.2#6252)
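The behavior of the proposed fix can be checked with a dependency-free simulation. This sketch mirrors the patched setAgeOfLastAppliedOp, but it is not the HBase source: the rms.setGauge call is dropped and only the returned age is checked.

```java
public class AgeGaugeSketch {
    private long lastTimestampForAge = System.currentTimeMillis();
    private long age = 0;

    // Mirrors the patched method: a repeated refresh that passes the same
    // timestamp reports age 0 instead of a value that grows with wall time.
    public long setAgeOfLastAppliedOp(long timestamp) {
        if (lastTimestampForAge != timestamp) {
            lastTimestampForAge = timestamp;
            age = System.currentTimeMillis() - lastTimestampForAge;
        } else {
            age = 0;
        }
        return age;
    }

    public static void main(String[] args) {
        AgeGaugeSketch sink = new AgeGaugeSketch();
        long ts = System.currentTimeMillis() - 100; // op applied ~100 ms ago
        System.out.println(sink.setAgeOfLastAppliedOp(ts) >= 100); // first call: real age
        System.out.println(sink.setAgeOfLastAppliedOp(ts) == 0);   // periodic refresh: 0
    }
}
```

In the unpatched version, the second call would recompute `now - timestamp` and the gauge would keep climbing even though no new op was applied — exactly the 7:00am/8:00am scenario in the report.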
[jira] [Updated] (HBASE-5826) Improve sync of HLog edits
[ https://issues.apache.org/jira/browse/HBASE-5826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5826: - Resolution: Won't Fix Status: Resolved (was: Patch Available) Resolving. Patch no longer applies (sequenceid and batching is different in trunk now). Improve sync of HLog edits has been done over in JIRAs such as HBASE-8755. Improve sync of HLog edits -- Key: HBASE-5826 URL: https://issues.apache.org/jira/browse/HBASE-5826 Project: HBase Issue Type: Improvement Components: wal Reporter: Ted Yu Assignee: Todd Lipcon Attachments: 5826-v2.txt, 5826-v3.txt, 5826-v4.txt, 5826-v5.txt, 5826.txt HBASE-5782 solved the correctness issue for the sync of HLog edits. Todd provided a patch that would achieve higher throughput. This JIRA is a continuation of Todd's work submitted there. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-10728) get_counter value is never used.
[ https://issues.apache.org/jira/browse/HBASE-10728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14079869#comment-14079869 ] Andrew Purtell commented on HBASE-10728: Those precommit failures are unrelated. Our precommit builds are broken. Let me see if there's a JIRA open for that or not.. +1 for commit get_counter value is never used. Key: HBASE-10728 URL: https://issues.apache.org/jira/browse/HBASE-10728 Project: HBase Issue Type: Bug Affects Versions: 0.96.2, 0.98.1, 0.99.0 Reporter: Jean-Marc Spaggiari Assignee: Jean-Marc Spaggiari Attachments: HBASE-10728-v0-0.96.patch, HBASE-10728-v0-0.98.patch, HBASE-10728-v0-trunk.patch, HBASE-10728-v1-0.96.patch, HBASE-10728-v1-0.98.patch, HBASE-10728-v1-trunk.patch, HBASE-10728-v2-trunk.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HBASE-11620) Propagate decoder exception to HLogSplitter so that loss of data is avoided
Ted Yu created HBASE-11620: -- Summary: Propagate decoder exception to HLogSplitter so that loss of data is avoided Key: HBASE-11620 URL: https://issues.apache.org/jira/browse/HBASE-11620 Project: HBase Issue Type: Bug Reporter: Ted Yu Priority: Critical Reported by Kiran in this thread: HBase file encryption, inconsistencies observed and data loss. After step 4 (i.e. disabling of WAL encryption, removing SecureProtobufReader/Writer and restart), read of encrypted WAL fails mainly due to an EOF exception at BaseDecoder. This is not considered an error and these WALs are being moved to /oldWALs. The following is observed in the log files:
{code}
2014-07-30 19:44:29,254 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: Splitting hlog: hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017, length=172
2014-07-30 19:44:29,254 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: DistributedLogReplay = false
2014-07-30 19:44:29,313 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] util.FSHDFSUtils: Recovering lease on dfs file hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017
2014-07-30 19:44:29,315 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] util.FSHDFSUtils: recoverLease=true, attempt=0 on file=hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017 after 1ms
2014-07-30 19:44:29,429 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-0] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-0,5,main]: starting
2014-07-30 19:44:29,429 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-1] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-1,5,main]: starting
2014-07-30 19:44:29,430 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-2] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-2,5,main]: starting
2014-07-30 19:44:29,591 ERROR [RS_LOG_REPLAY_OPS-HOST-16:15264-1] codec.BaseDecoder: Partial cell read caused by EOF: java.io.IOException: Premature EOF from inputStream
2014-07-30 19:44:29,592 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: Finishing writing output logs and closing down.
2014-07-30 19:44:29,592 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: Waiting for split writer threads to finish
2014-07-30 19:44:29,592 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: Split writers finished
2014-07-30 19:44:29,592 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: Processed 0 edits across 0 regions; log file=hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017 is corrupted = false progress failed = false
{code}
To fix this, we need to propagate the EOF exception to HLogSplitter. Any suggestions on the fix? (end of quote from Kiran) In BaseDecoder#rethrowEofException():
{code}
    if (!isEof) throw ioEx;
    LOG.error("Partial cell read caused by EOF: " + ioEx);
    EOFException eofEx = new EOFException("Partial cell read");
    eofEx.initCause(ioEx);
    throw eofEx;
{code}
Throwing EOFException does not propagate the partial-cell-read condition to HLogSplitter, which doesn't treat EOFException as an error. I think a new exception type (DecoderException, e.g.) should be used above. -- This message was sent by Atlassian JIRA (v6.2#6252)
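The suggestion above — a distinct exception type so HLogSplitter can tell a partial cell read from a clean end-of-log — could look roughly like this. Note that DecoderException is hypothetical here: it is the reporter's proposed name, not an existing HBase class, and this sketch omits the LOG.error call.

```java
import java.io.IOException;

public class RethrowSketch {
    // Hypothetical exception type: it extends IOException (not EOFException),
    // so callers that treat EOFException as "normal end of log" cannot
    // silently swallow a partial cell read.
    static class DecoderException extends IOException {
        DecoderException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Sketch of BaseDecoder#rethrowEofException with the new type.
    static void rethrowEofException(IOException ioEx, boolean isEof) throws IOException {
        if (!isEof) {
            throw ioEx;
        }
        throw new DecoderException("Partial cell read", ioEx);
    }

    public static void main(String[] args) {
        try {
            rethrowEofException(new IOException("Premature EOF from inputStream"), true);
        } catch (java.io.EOFException e) {
            System.out.println("treated as clean end-of-log"); // no longer reached
        } catch (IOException e) {
            System.out.println(e.getClass().getSimpleName()); // prints DecoderException
        }
    }
}
```

The original cause is preserved via the constructor (equivalent to initCause in the quoted snippet), so the "Premature EOF from inputStream" detail still surfaces when the splitter logs the failure.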
[jira] [Commented] (HBASE-11616) TestNamespaceUpgrade fails in trunk
[ https://issues.apache.org/jira/browse/HBASE-11616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14079900#comment-14079900 ] Jimmy Xiang commented on HBASE-11616: - This is related to HBASE-11606. Attached a patch to fix it. TestNamespaceUpgrade fails in trunk --- Key: HBASE-11616 URL: https://issues.apache.org/jira/browse/HBASE-11616 Project: HBase Issue Type: Bug Reporter: Ted Yu Fix For: 2.0.0 Attachments: hbase-11616.patch I see the following in test output:
{code}
error message="Can't get the location" type="org.apache.hadoop.hbase.client.RetriesExhaustedException"
org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location
  at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:287)
  at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:132)
  at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:1)
  at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:179)
  at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:287)
  at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:267)
  at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:139)
  at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:134)
  at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:814)
  at org.apache.hadoop.hbase.migration.TestNamespaceUpgrade.setUpBeforeClass(TestNamespaceUpgrade.java:147)
  ...
Caused by: org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in hbase:meta for region hbase:acl,,1376029204842.06dfcfc239196403c5f1135b91dedc64. containing row
  at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1233)
  at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1099)
  at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:279)
  ... 31 more
{code}
The cause for the above error is that the _acl_ table contained in the image (w.r.t. the hbase:meta table) doesn't have a server address. [~jxiang]: What do you think would be the proper fix? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11616) TestNamespaceUpgrade fails in trunk
[ https://issues.apache.org/jira/browse/HBASE-11616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-11616: Attachment: hbase-11616.patch TestNamespaceUpgrade fails in trunk --- Key: HBASE-11616 URL: https://issues.apache.org/jira/browse/HBASE-11616 Project: HBase Issue Type: Bug Reporter: Ted Yu Fix For: 2.0.0 Attachments: hbase-11616.patch I see the following in test output:
{code}
error message="Can't get the location" type="org.apache.hadoop.hbase.client.RetriesExhaustedException"
org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location
  at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:287)
  at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:132)
  at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:1)
  at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:179)
  at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:287)
  at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:267)
  at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:139)
  at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:134)
  at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:814)
  at org.apache.hadoop.hbase.migration.TestNamespaceUpgrade.setUpBeforeClass(TestNamespaceUpgrade.java:147)
  ...
Caused by: org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in hbase:meta for region hbase:acl,,1376029204842.06dfcfc239196403c5f1135b91dedc64. containing row
  at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1233)
  at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1099)
  at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:279)
  ... 31 more
{code}
The cause for the above error is that the _acl_ table contained in the image (w.r.t. the hbase:meta table) doesn't have a server address. [~jxiang]: What do you think would be the proper fix? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11616) TestNamespaceUpgrade fails in trunk
[ https://issues.apache.org/jira/browse/HBASE-11616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-11616: Fix Version/s: 2.0.0 Assignee: Jimmy Xiang Status: Patch Available (was: Open) TestNamespaceUpgrade fails in trunk --- Key: HBASE-11616 URL: https://issues.apache.org/jira/browse/HBASE-11616 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Jimmy Xiang Fix For: 2.0.0 Attachments: hbase-11616.patch I see the following in test output:
{code}
error message="Can't get the location" type="org.apache.hadoop.hbase.client.RetriesExhaustedException"
org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location
  at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:287)
  at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:132)
  at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:1)
  at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:179)
  at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:287)
  at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:267)
  at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:139)
  at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:134)
  at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:814)
  at org.apache.hadoop.hbase.migration.TestNamespaceUpgrade.setUpBeforeClass(TestNamespaceUpgrade.java:147)
  ...
Caused by: org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in hbase:meta for region hbase:acl,,1376029204842.06dfcfc239196403c5f1135b91dedc64. containing row
  at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1233)
  at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1099)
  at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:279)
  ... 31 more
{code}
The cause for the above error is that the _acl_ table contained in the image (w.r.t. the hbase:meta table) doesn't have a server address. [~jxiang]: What do you think would be the proper fix? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11558) Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+
[ https://issues.apache.org/jira/browse/HBASE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14079920#comment-14079920 ] Nick Dimiduk commented on HBASE-11558: -- +1 for patch v2. TestStripeCompactionPolicy and TestDefaultCompactSelection both pass for me locally. Will commit. Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ --- Key: HBASE-11558 URL: https://issues.apache.org/jira/browse/HBASE-11558 Project: HBase Issue Type: Bug Components: mapreduce, Scanners Reporter: Ishan Chhabra Assignee: Ishan Chhabra Fix For: 0.99.0, 0.96.3, 0.98.5, 2.0.0 Attachments: HBASE_11558-0.96.patch, HBASE_11558-0.96_v2.patch, HBASE_11558-0.98.patch, HBASE_11558-0.98_v2.patch, HBASE_11558.patch, HBASE_11558_v2.patch, HBASE_11558_v2.patch 0.94 and before, if one sets caching on the Scan object in the Job by calling scan.setCaching(int) and passes it to TableMapReduceUtil, it is correctly read and used by the mappers during a mapreduce job. This is because Scan.write respects and serializes caching, which is used internally by TableMapReduceUtil to serialize and transfer the scan object to the mappers. 0.95+, after the move to protobuf, ProtobufUtil.toScan does not respect caching anymore as ClientProtos.Scan does not have the field caching. Caching is passed via the ScanRequest object to the server and so is not needed in the Scan object. However, this breaks application code that relies on the earlier behavior. This will lead to sudden degradation in Scan performance 0.96+ for users relying on the old behavior. There are 2 options here: 1. Add caching to Scan object, adding an extra int to the payload for the Scan object which is really not needed in the general case. 2. Document and preach that TableMapReduceUtil.setScannerCaching must be called by the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (HBASE-11615) TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins
[ https://issues.apache.org/jira/browse/HBASE-11615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang reassigned HBASE-11615: --- Assignee: Jimmy Xiang TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins --- Key: HBASE-11615 URL: https://issues.apache.org/jira/browse/HBASE-11615 Project: HBase Issue Type: Bug Components: master Reporter: Mike Drob Assignee: Jimmy Xiang Failed on branch-1. Example Failure: https://builds.apache.org/job/HBase-1.0/75/testReport/org.apache.hadoop.hbase.master/TestZKLessAMOnCluster/testForceAssignWhileClosing/ -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11616) TestNamespaceUpgrade fails in trunk
[ https://issues.apache.org/jira/browse/HBASE-11616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-11616: --- Issue Type: Test (was: Bug) TestNamespaceUpgrade fails in trunk --- Key: HBASE-11616 URL: https://issues.apache.org/jira/browse/HBASE-11616 Project: HBase Issue Type: Test Reporter: Ted Yu Assignee: Jimmy Xiang Fix For: 2.0.0 Attachments: hbase-11616.patch I see the following in test output:
{code}
error message="Can't get the location" type="org.apache.hadoop.hbase.client.RetriesExhaustedException"
org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location
  at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:287)
  at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:132)
  at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:1)
  at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:179)
  at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:287)
  at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:267)
  at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:139)
  at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:134)
  at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:814)
  at org.apache.hadoop.hbase.migration.TestNamespaceUpgrade.setUpBeforeClass(TestNamespaceUpgrade.java:147)
  ...
Caused by: org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in hbase:meta for region hbase:acl,,1376029204842.06dfcfc239196403c5f1135b91dedc64. containing row
  at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1233)
  at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1099)
  at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:279)
  ... 31 more
{code}
The cause for the above error is that the _acl_ table contained in the image (w.r.t. the hbase:meta table) doesn't have a server address. [~jxiang]: What do you think would be the proper fix? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-10834) Better error messaging on issuing grant commands in non-authz mode
[ https://issues.apache.org/jira/browse/HBASE-10834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Srikanth Srungarapu updated HBASE-10834: Description: Running the below sequence of steps should give better error messaging rather than a table-not-found error.
{code}
hbase(main):009:0> grant 'test', 'RWCXA'

ERROR: Unknown table _acl_!

Here is some help for this command:
Grant users specific rights.
Syntax : grant <user> <permissions> [<table> [<column family> [<column qualifier>]]]
permissions is either zero or more letters from the set "RWXCA".
READ('R'), WRITE('W'), EXEC('X'), CREATE('C'), ADMIN('A')

For example:

    hbase> grant 'bobsmith', 'RWXCA'
    hbase> grant 'bobsmith', 'RW', 't1', 'f1', 'col1'
{code}
Instead of "ERROR: Unknown table _acl_!", hbase should give out a warning like "Command not supported in non-authz mode" (as the acl table is only created if authz is turned on). was: Running the below sequence of steps should give better error messaging rather than a table-not-found error.
{code}
hbase(main):013:0> create 'test', {NAME => 'f1'}
0 row(s) in 6.1320 seconds
hbase(main):014:0> disable 'test'
0 row(s) in 10.2100 seconds
hbase(main):015:0> drop 'test'
0 row(s) in 1.0500 seconds
hbase(main):016:0> create 'test', {NAME => 'f1'}
0 row(s) in 1.0510 seconds
hbase(main):017:0> grant 'systest', 'RWXCA', 'test'
ERROR: Unknown table systest!
{code}
Instead of "ERROR: Unknown table systest!", hbase should give out a warning like "Command not supported in non-authz mode" (as the acl table is only created if authz is turned on). Better error messaging on issuing grant commands in non-authz mode -- Key: HBASE-10834 URL: https://issues.apache.org/jira/browse/HBASE-10834 Project: HBase Issue Type: Bug Components: shell Affects Versions: 0.94.17 Reporter: Srikanth Srungarapu Assignee: Srikanth Srungarapu Priority: Trivial Attachments: HBASE-10834.patch, HBASE-10834_v2.patch Running the below sequence of steps should give better error messaging rather than a table-not-found error.
{code}
hbase(main):009:0> grant 'test', 'RWCXA'

ERROR: Unknown table _acl_!

Here is some help for this command:
Grant users specific rights.
Syntax : grant <user> <permissions> [<table> [<column family> [<column qualifier>]]]
permissions is either zero or more letters from the set "RWXCA".
READ('R'), WRITE('W'), EXEC('X'), CREATE('C'), ADMIN('A')

For example:

    hbase> grant 'bobsmith', 'RWXCA'
    hbase> grant 'bobsmith', 'RW', 't1', 'f1', 'col1'
{code}
Instead of "ERROR: Unknown table _acl_!", hbase should give out a warning like "Command not supported in non-authz mode" (as the acl table is only created if authz is turned on). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-10834) Better error messaging on issuing grant commands in non-authz mode
[ https://issues.apache.org/jira/browse/HBASE-10834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14079931#comment-14079931 ] Srikanth Srungarapu commented on HBASE-10834: - [~jxiang] The table _acl_ gets accessed as part of the authorization coprocessor classes, so I couldn't really see any other reason for _acl_ to get accessed when authorization is turned off. Better error messaging on issuing grant commands in non-authz mode -- Key: HBASE-10834 URL: https://issues.apache.org/jira/browse/HBASE-10834 Project: HBase Issue Type: Bug Components: shell Affects Versions: 0.94.17 Reporter: Srikanth Srungarapu Assignee: Srikanth Srungarapu Priority: Trivial Attachments: HBASE-10834.patch, HBASE-10834_v2.patch Running the below sequence of steps should give better error messaging rather than a table-not-found error.
{code}
hbase(main):009:0> grant 'test', 'RWCXA'

ERROR: Unknown table _acl_!

Here is some help for this command:
Grant users specific rights.
Syntax : grant <user> <permissions> [<table> [<column family> [<column qualifier>]]]
permissions is either zero or more letters from the set "RWXCA".
READ('R'), WRITE('W'), EXEC('X'), CREATE('C'), ADMIN('A')

For example:

    hbase> grant 'bobsmith', 'RWXCA'
    hbase> grant 'bobsmith', 'RW', 't1', 'f1', 'col1'
{code}
Instead of "ERROR: Unknown table _acl_!", hbase should give out a warning like "Command not supported in non-authz mode" (as the acl table is only created if authz is turned on). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11551) BucketCache$WriterThread.run() doesn't handle exceptions correctly
[ https://issues.apache.org/jira/browse/HBASE-11551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14079932#comment-14079932 ] Hudson commented on HBASE-11551: FAILURE: Integrated in HBase-0.98 #424 (See [https://builds.apache.org/job/HBase-0.98/424/]) HBASE-11551 BucketCache.run() doesn't handle exceptions correctly (Ted Yu) (tedyu: rev 76e89cb7fface4d91b8c62192832be581bf67a3b) * hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCache.java BucketCache$WriterThread.run() doesn't handle exceptions correctly -- Key: HBASE-11551 URL: https://issues.apache.org/jira/browse/HBASE-11551 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.99.0, 0.98.5, 2.0.0 Attachments: 11551-v1.txt Currently the catch is outside the while loop:
{code}
try {
  while (cacheEnabled && writerEnabled) {
    ...
  }
} catch (Throwable t) {
  LOG.warn("Failed doing drain", t);
}
{code}
When an exception (e.g. BucketAllocatorException) is thrown, the run() method terminates silently. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11558) Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+
[ https://issues.apache.org/jira/browse/HBASE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-11558: - Resolution: Fixed Status: Resolved (was: Patch Available) Pushed to 4 branches. Thanks for the patch, [~ishanc]. Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ --- Key: HBASE-11558 URL: https://issues.apache.org/jira/browse/HBASE-11558 Project: HBase Issue Type: Bug Components: mapreduce, Scanners Reporter: Ishan Chhabra Assignee: Ishan Chhabra Fix For: 0.99.0, 0.96.3, 0.98.5, 2.0.0 Attachments: HBASE_11558-0.96.patch, HBASE_11558-0.96_v2.patch, HBASE_11558-0.98.patch, HBASE_11558-0.98_v2.patch, HBASE_11558.patch, HBASE_11558_v2.patch, HBASE_11558_v2.patch 0.94 and before, if one sets caching on the Scan object in the Job by calling scan.setCaching(int) and passes it to TableMapReduceUtil, it is correctly read and used by the mappers during a mapreduce job. This is because Scan.write respects and serializes caching, which is used internally by TableMapReduceUtil to serialize and transfer the scan object to the mappers. 0.95+, after the move to protobuf, ProtobufUtil.toScan does not respect caching anymore as ClientProtos.Scan does not have the field caching. Caching is passed via the ScanRequest object to the server and so is not needed in the Scan object. However, this breaks application code that relies on the earlier behavior. This will lead to sudden degradation in Scan performance 0.96+ for users relying on the old behavior. There are 2 options here: 1. Add caching to Scan object, adding an extra int to the payload for the Scan object which is really not needed in the general case. 2. Document and preach that TableMapReduceUtil.setScannerCaching must be called by the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11558) Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+
[ https://issues.apache.org/jira/browse/HBASE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14079944#comment-14079944 ] Nick Dimiduk commented on HBASE-11558: -- [~ishanc] as a follow-on, what do you think about deprecating TableMapReduceUtil.setScannerCaching in favor of setScanner? Is there any sense in having two ways to specify this? We should also look at what happens when a user specifies both. What's the effective behavior? Mind updating the release note appropriately? Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ --- Key: HBASE-11558 URL: https://issues.apache.org/jira/browse/HBASE-11558 Project: HBase Issue Type: Bug Components: mapreduce, Scanners Reporter: Ishan Chhabra Assignee: Ishan Chhabra Fix For: 0.99.0, 0.96.3, 0.98.5, 2.0.0 Attachments: HBASE_11558-0.96.patch, HBASE_11558-0.96_v2.patch, HBASE_11558-0.98.patch, HBASE_11558-0.98_v2.patch, HBASE_11558.patch, HBASE_11558_v2.patch, HBASE_11558_v2.patch 0.94 and before, if one sets caching on the Scan object in the Job by calling scan.setCaching(int) and passes it to TableMapReduceUtil, it is correctly read and used by the mappers during a mapreduce job. This is because Scan.write respects and serializes caching, which is used internally by TableMapReduceUtil to serialize and transfer the scan object to the mappers. 0.95+, after the move to protobuf, ProtobufUtil.toScan does not respect caching anymore as ClientProtos.Scan does not have the field caching. Caching is passed via the ScanRequest object to the server and so is not needed in the Scan object. However, this breaks application code that relies on the earlier behavior. This will lead to sudden degradation in Scan performance 0.96+ for users relying on the old behavior. There are 2 options here: 1. Add caching to Scan object, adding an extra int to the payload for the Scan object which is really not needed in the general case. 2. 
Document and preach that TableMapReduceUtil.setScannerCaching must be called by the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
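The round trip that loses the caching value can be modeled without HBase on the classpath. The sketch below is a toy stand-in, not the real API: the Scan and ProtoScan classes here are hypothetical mirrors of org.apache.hadoop.hbase.client.Scan and ClientProtos.Scan, whose protobuf message simply has no caching field, so the value must be carried out of band (which is what TableMapReduceUtil.setScannerCaching does via the job configuration).

```java
// Toy model of the 0.95+ round trip: the protobuf form of a Scan has no
// caching field, so scan.setCaching() is silently lost. All class names
// are illustrative stand-ins, not the real HBase API.
public class ScanCachingRoundTrip {
    static class Scan {
        int caching = -1; // -1 means "not set"
    }

    // Mirrors ClientProtos.Scan: note there is no caching field to copy.
    static class ProtoScan {
        static ProtoScan fromScan(Scan s) { return new ProtoScan(); }
        Scan toScan() { return new Scan(); }
    }

    public static void main(String[] args) {
        Scan scan = new Scan();
        scan.caching = 500; // what the user set before submitting the job

        Scan onMapper = ProtoScan.fromScan(scan).toScan();
        System.out.println("caching after round trip: " + onMapper.caching); // -1

        // Workaround: ship the value out of band (e.g. in the job config,
        // as TableMapReduceUtil.setScannerCaching does) and re-apply it.
        int configCaching = 500; // stand-in for hbase.client.scanner.caching
        if (onMapper.caching == -1) {
            onMapper.caching = configCaching;
        }
        System.out.println("effective caching: " + onMapper.caching); // 500
    }
}
```

The toy also illustrates Nick's question above: if both the Scan field and the config value were honored, the code would need a defined precedence between them.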
[jira] [Updated] (HBASE-11615) TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins
[ https://issues.apache.org/jira/browse/HBASE-11615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-11615: Issue Type: Test (was: Bug) TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins --- Key: HBASE-11615 URL: https://issues.apache.org/jira/browse/HBASE-11615 Project: HBase Issue Type: Test Components: master Reporter: Mike Drob Assignee: Jimmy Xiang Failed on branch-1. Example Failure: https://builds.apache.org/job/HBase-1.0/75/testReport/org.apache.hadoop.hbase.master/TestZKLessAMOnCluster/testForceAssignWhileClosing/ -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-4744) Remove @Ignore for testLogRollAfterSplitStart
[ https://issues.apache.org/jira/browse/HBASE-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-4744: --- Attachment: HBASE_4744.patch First pass at re-enabling this test. Had to clean up some null handling in logging. Reading through things, would testLogRollAfterSplitStart be better placed in TestLogRollAbort rather than TestHLogSplit? The test is related to splitting, but AFAICT the desired end result is that the RS should abort, which seems to be the intention of the TestLogRollAbort class. Remove @Ignore for testLogRollAfterSplitStart - Key: HBASE-4744 URL: https://issues.apache.org/jira/browse/HBASE-4744 Project: HBase Issue Type: Test Affects Versions: 0.94.0 Reporter: Nicolas Spiegelberg Priority: Critical Labels: newbie Attachments: HBASE_4744.patch We fixed a data loss bug in HBASE-2312 by adding non-recursive creates to HDFS. Although a number of HDFS versions have this fix, the official HDFS 0.20.205 branch currently doesn't, so we needed to mark the test as ignored. Please revisit before the RC of 0.94, which should have 0.20.205.1 or later with the necessary HDFS patches. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11620) Propagate decoder exception to HLogSplitter so that loss of data is avoided
[ https://issues.apache.org/jira/browse/HBASE-11620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-11620: --- Description: Reported by Kiran in this thread: HBase file encryption, inconsistencies observed and data loss After step 4 ( i.e disabling of WAL encryption, removing SecureProtobufReader/Writer and restart), read of encrypted WAL fails mainly due to EOF exception at Basedecoder. This is not considered as error and these WAL are being moved to /oldWALs. Following is observed in log files: {code} 2014-07-30 19:44:29,254 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: Splitting hlog: hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017, length=172 2014-07-30 19:44:29,254 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: DistributedLogReplay = false 2014-07-30 19:44:29,313 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] util.FSHDFSUtils: Recovering lease on dfs file hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017 2014-07-30 19:44:29,315 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] util.FSHDFSUtils: recoverLease=true, attempt=0 on file=hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017 after 1ms 2014-07-30 19:44:29,429 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-0] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-0,5,main]: starting 2014-07-30 19:44:29,429 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-1] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-1,5,main]: starting 2014-07-30 19:44:29,430 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-2] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-2,5,main]: starting 2014-07-30 19:44:29,591 ERROR [RS_LOG_REPLAY_OPS-HOST-16:15264-1] codec.BaseDecoder: Partial cell read caused 
by EOF: java.io.IOException: Premature EOF from inputStream 2014-07-30 19:44:29,592 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: Finishing writing output logs and closing down. 2014-07-30 19:44:29,592 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: Waiting for split writer threads to finish 2014-07-30 19:44:29,592 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: Split writers finished 2014-07-30 19:44:29,592 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: Processed 0 edits across 0 regions; log file=hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017 is corrupted = false progress failed = false {code} To fix this, we need to propagate the EOF exception to HLogSplitter. Any suggestions on the fix? (end of quote from Kiran) In BaseDecoder#rethrowEofException():
{code}
if (!isEof) throw ioEx;
LOG.error("Partial cell read caused by EOF: " + ioEx);
EOFException eofEx = new EOFException("Partial cell read");
eofEx.initCause(ioEx);
throw eofEx;
{code}
Throwing EOFException would not propagate the partial cell read condition to HLogSplitter, which doesn't treat EOFException as an error. I think an IOException should be thrown above - HLogSplitter#getNextLogLine() would translate the IOEx to CorruptedLogFileException. was: Reported by Kiran in this thread: HBase file encryption, inconsistencies observed and data loss After step 4 (i.e. disabling of WAL encryption, removing SecureProtobufReader/Writer and restarting), reads of encrypted WALs fail mainly due to an EOF exception at BaseDecoder. This is not considered an error and these WALs are being moved to /oldWALs.
Following is observed in log files: {code} 2014-07-30 19:44:29,254 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: Splitting hlog: hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017, length=172 2014-07-30 19:44:29,254 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: DistributedLogReplay = false 2014-07-30 19:44:29,313 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] util.FSHDFSUtils: Recovering lease on dfs file hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017 2014-07-30 19:44:29,315 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] util.FSHDFSUtils: recoverLease=true, attempt=0 on file=hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017 after 1ms 2014-07-30 19:44:29,429 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-0] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-0,5,main]: starting 2014-07-30 19:44:29,429 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-1] wal.HLogSplitter: Writer thread
[jira] [Updated] (HBASE-11620) Propagate decoder exception to HLogSplitter so that loss of data is avoided
[ https://issues.apache.org/jira/browse/HBASE-11620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-11620: --- Attachment: 11620-v1.txt Tentative patch. Propagate decoder exception to HLogSplitter so that loss of data is avoided --- Key: HBASE-11620 URL: https://issues.apache.org/jira/browse/HBASE-11620 Project: HBase Issue Type: Bug Reporter: Ted Yu Priority: Critical Attachments: 11620-v1.txt Reported by Kiran in this thread: HBase file encryption, inconsistencies observed and data loss After step 4 ( i.e disabling of WAL encryption, removing SecureProtobufReader/Writer and restart), read of encrypted WAL fails mainly due to EOF exception at Basedecoder. This is not considered as error and these WAL are being moved to /oldWALs. Following is observed in log files: {code} 2014-07-30 19:44:29,254 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: Splitting hlog: hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017, length=172 2014-07-30 19:44:29,254 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: DistributedLogReplay = false 2014-07-30 19:44:29,313 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] util.FSHDFSUtils: Recovering lease on dfs file hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017 2014-07-30 19:44:29,315 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] util.FSHDFSUtils: recoverLease=true, attempt=0 on file=hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017 after 1ms 2014-07-30 19:44:29,429 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-0] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-0,5,main]: starting 2014-07-30 19:44:29,429 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-1] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-1,5,main]: starting 2014-07-30 
19:44:29,430 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-2] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-2,5,main]: starting 2014-07-30 19:44:29,591 ERROR [RS_LOG_REPLAY_OPS-HOST-16:15264-1] codec.BaseDecoder: Partial cell read caused by EOF: java.io.IOException: Premature EOF from inputStream 2014-07-30 19:44:29,592 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: Finishing writing output logs and closing down. 2014-07-30 19:44:29,592 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: Waiting for split writer threads to finish 2014-07-30 19:44:29,592 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: Split writers finished 2014-07-30 19:44:29,592 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: Processed 0 edits across 0 regions; log file=hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017 is corrupted = false progress failed = false {code} To fix this, we need to propagate the EOF exception to HLogSplitter. Any suggestions on the fix? (end of quote from Kiran) In BaseDecoder#rethrowEofException():
{code}
if (!isEof) throw ioEx;
LOG.error("Partial cell read caused by EOF: " + ioEx);
EOFException eofEx = new EOFException("Partial cell read");
eofEx.initCause(ioEx);
throw eofEx;
{code}
Throwing EOFException would not propagate the partial cell read condition to HLogSplitter, which doesn't treat EOFException as an error. I think an IOException should be thrown above - HLogSplitter#getNextLogLine() would translate the IOEx to CorruptedLogFileException. -- This message was sent by Atlassian JIRA (v6.2#6252)
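The difference between the two rethrow strategies discussed above can be contrasted in isolation. This is a simplified sketch, not the actual BaseDecoder/HLogSplitter code: since EOFException extends IOException, a caller that catches EOFException first models a splitter that treats EOF as a clean end of log, while a plain IOException falls through to the corruption path.

```java
import java.io.EOFException;
import java.io.IOException;

// Contrast of the current and proposed rethrow behavior, modeled in
// isolation (not the real BaseDecoder/HLogSplitter classes).
public class RethrowSketch {

    interface Thrower {
        void run(IOException ioEx) throws IOException;
    }

    // Current behavior, simplified from BaseDecoder#rethrowEofException:
    // the partial-read IOException is wrapped in an EOFException.
    static void rethrowAsEof(IOException ioEx) throws IOException {
        EOFException eofEx = new EOFException("Partial cell read");
        eofEx.initCause(ioEx);
        throw eofEx;
    }

    // Proposed behavior: rethrow the original IOException so the caller
    // can translate it into a CorruptedLogFileException.
    static void rethrowAsIoe(IOException ioEx) throws IOException {
        throw ioEx;
    }

    // Models the splitter's decision: EOFException means "clean end of
    // log", any other IOException means "corrupt file".
    static String classify(Thrower t) {
        try {
            t.run(new IOException("Premature EOF from inputStream"));
            return "no error";
        } catch (EOFException e) {
            return "clean EOF, edits silently dropped";
        } catch (IOException e) {
            return "corruption, file quarantined";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(RethrowSketch::rethrowAsEof));
        System.out.println(classify(RethrowSketch::rethrowAsIoe));
    }
}
```

The sketch shows why the choice matters: with the EOFException wrapper, the splitter finishes "successfully" after 0 edits, exactly as in the log above.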
[jira] [Commented] (HBASE-10674) HBCK should be updated to do replica related checks
[ https://issues.apache.org/jira/browse/HBASE-10674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14079994#comment-14079994 ] Hadoop QA commented on HBASE-10674: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12658715/10674-1.2.txt against trunk revision . ATTACHMENT ID: 12658715 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction org.apache.hadoop.hbase.migration.TestNamespaceUpgrade org.apache.hadoop.hbase.master.TestMasterOperationsForRegionReplicas org.apache.hadoop.hbase.regionserver.TestRegionReplicas org.apache.hadoop.hbase.client.TestReplicasClient org.apache.hadoop.hbase.master.TestRestartCluster org.apache.hadoop.hbase.TestRegionRebalancing org.apache.hadoop.hbase.TestIOFencing Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10229//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10229//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10229//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10229//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10229//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10229//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10229//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10229//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10229//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10229//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10229//console This message is automatically generated. 
HBCK should be updated to do replica related checks --- Key: HBASE-10674 URL: https://issues.apache.org/jira/browse/HBASE-10674 Project: HBase Issue Type: Sub-task Reporter: Devaraj Das Assignee: Devaraj Das Attachments: 10674-1.2.txt, 10674-1.txt HBCK should be updated to have a check for whether the replicas are assigned to the right machines (ideally, default and non-default replicas should not be on the same server if there is more than one server in the cluster, and similar scenarios). [~jmhsieh] suggested this in HBASE-10362. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11615) TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins
[ https://issues.apache.org/jira/browse/HBASE-11615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-11615: Fix Version/s: 2.0.0 1.0.0 Status: Patch Available (was: Open) TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins --- Key: HBASE-11615 URL: https://issues.apache.org/jira/browse/HBASE-11615 Project: HBase Issue Type: Test Components: master Reporter: Mike Drob Assignee: Jimmy Xiang Fix For: 1.0.0, 2.0.0 Attachments: hbase-11615.patch Failed on branch-1. Example Failure: https://builds.apache.org/job/HBase-1.0/75/testReport/org.apache.hadoop.hbase.master/TestZKLessAMOnCluster/testForceAssignWhileClosing/ -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-4744) Remove @Ignore for testLogRollAfterSplitStart
[ https://issues.apache.org/jira/browse/HBASE-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-4744: --- Status: Patch Available (was: Open) Remove @Ignore for testLogRollAfterSplitStart - Key: HBASE-4744 URL: https://issues.apache.org/jira/browse/HBASE-4744 Project: HBase Issue Type: Test Affects Versions: 0.94.0 Reporter: Nicolas Spiegelberg Priority: Critical Labels: newbie Attachments: HBASE_4744.patch We fixed a data loss bug in HBASE-2312 by adding non-recursive creates to HDFS. Although a number of HDFS versions have this fix, the official HDFS 0.20.205 branch currently doesn't, so we needed to mark the test as ignored. Please revisit before the RC of 0.94, which should have 0.20.205.1 or later with the necessary HDFS patches. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11615) TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins
[ https://issues.apache.org/jira/browse/HBASE-11615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-11615: Attachment: hbase-11615.patch TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins --- Key: HBASE-11615 URL: https://issues.apache.org/jira/browse/HBASE-11615 Project: HBase Issue Type: Test Components: master Reporter: Mike Drob Assignee: Jimmy Xiang Fix For: 1.0.0, 2.0.0 Attachments: hbase-11615.patch Failed on branch-1. Example Failure: https://builds.apache.org/job/HBase-1.0/75/testReport/org.apache.hadoop.hbase.master/TestZKLessAMOnCluster/testForceAssignWhileClosing/ -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-4744) Remove @Ignore for testLogRollAfterSplitStart
[ https://issues.apache.org/jira/browse/HBASE-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-4744: --- Attachment: HBASE_4744-v2.patch Missed an instance of trailing whitespace in the first patch. This one should be ready for review. Remove @Ignore for testLogRollAfterSplitStart - Key: HBASE-4744 URL: https://issues.apache.org/jira/browse/HBASE-4744 Project: HBase Issue Type: Test Affects Versions: 0.94.0 Reporter: Nicolas Spiegelberg Priority: Critical Labels: newbie Attachments: HBASE_4744-v2.patch, HBASE_4744.patch We fixed a data loss bug in HBASE-2312 by adding non-recursive creates to HDFS. Although a number of HDFS versions have this fix, the official HDFS 0.20.205 branch currently doesn't, so we needed to mark the test as ignored. Please revisit before the RC of 0.94, which should have 0.20.205.1 or later with the necessary HDFS patches. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11617) AgeOfLastAppliedOp in MetricsSink got increased when no new replication sink OP
[ https://issues.apache.org/jira/browse/HBASE-11617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Demai Ni updated HBASE-11617: - Attachment: HBASE-11617-master-v1.patch Uploaded the patch for both AgeOfLastAppliedOp and AgeOfLastShippedOp (from [HBASE-11143 | https://issues.apache.org/jira/browse/HBASE-11143] ) AgeOfLastAppliedOp in MetricsSink got increased when no new replication sink OP Key: HBASE-11617 URL: https://issues.apache.org/jira/browse/HBASE-11617 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.98.2 Reporter: Demai Ni Assignee: Demai Ni Priority: Minor Fix For: 0.99.0, 0.98.5, 2.0.0 Attachments: HBASE-11617-master-v1.patch AgeOfLastAppliedOp in MetricsSink.java is meant to indicate the time an edit sat in the 'replication queue' before it got replicated (aka applied):
{code}
  /**
   * Set the age of the last applied operation
   *
   * @param timestamp The timestamp of the last operation applied.
   * @return the age that was set
   */
  public long setAgeOfLastAppliedOp(long timestamp) {
    lastTimestampForAge = timestamp;
    long age = System.currentTimeMillis() - lastTimestampForAge;
    rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age);
    return age;
  }
{code}
Consider the following scenario: 1) at 7:00am a sink op is applied, and SINK_AGE_OF_LAST_APPLIED_OP is set to, for example, 100ms; 2) then NO new sink op occurs; 3) refreshAgeOfLastAppliedOp() is invoked at 8:00am. Instead of returning the 100ms, the AgeOfLastAppliedOp becomes 1 hour + 100ms. This is because refreshAgeOfLastAppliedOp() gets invoked periodically by getStats().
proposed fix:
{code}
--- hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsSink.java
+++ hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsSink.java
@@ -35,6 +35,7 @@ public class MetricsSink {
   private MetricsReplicationSource rms;
   private long lastTimestampForAge = System.currentTimeMillis();
+  private long age = 0;

   public MetricsSink() {
     rms = CompatibilitySingletonFactory.getInstance(MetricsReplicationSource.class);
@@ -47,8 +48,12 @@ public class MetricsSink {
    * @return the age that was set
    */
   public long setAgeOfLastAppliedOp(long timestamp) {
-    lastTimestampForAge = timestamp;
-    long age = System.currentTimeMillis() - lastTimestampForAge;
+    if (lastTimestampForAge != timestamp) {
+      lastTimestampForAge = timestamp;
+      this.age = System.currentTimeMillis() - lastTimestampForAge;
+    } else {
+      this.age = 0;
+    }
     rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age);
     return age;
   }
{code}
detail discussion in [dev@hbase | http://mail-archives.apache.org/mod_mbox/hbase-dev/201407.mbox/%3CCAOEq2C5BKMXAM2Fv4LGVb_Ktek-Pm%3DhjOi33gSHX-2qHqAou6w%40mail.gmail.com%3E ] -- This message was sent by Atlassian JIRA (v6.2#6252)
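The effect of the proposed fix can be exercised in isolation. The sketch below is a simplified stand-in for MetricsSink, assuming an injected clock for determinism; the rms.setGauge call and the CompatibilitySingletonFactory wiring are omitted.

```java
import java.util.function.LongSupplier;

// Minimal model of the patched setAgeOfLastAppliedOp: only the age
// bookkeeping is kept, and the clock is injectable so the behavior
// can be demonstrated without real wall-clock time.
public class SinkAgeSketch {
    private long lastTimestampForAge;
    private long age = 0;
    private final LongSupplier clock; // stand-in for System.currentTimeMillis()

    public SinkAgeSketch(long startTimestamp, LongSupplier clock) {
        this.lastTimestampForAge = startTimestamp;
        this.clock = clock;
    }

    // Recompute the age only when a genuinely new timestamp arrives;
    // a periodic refresh that replays the same timestamp reports 0
    // instead of letting the gauge grow with wall-clock time.
    public long setAgeOfLastAppliedOp(long timestamp) {
        if (lastTimestampForAge != timestamp) {
            lastTimestampForAge = timestamp;
            age = clock.getAsLong() - timestamp;
        } else {
            age = 0;
        }
        return age;
    }

    public static void main(String[] args) {
        long[] now = {7_000_100L}; // fake clock, milliseconds
        SinkAgeSketch sink = new SinkAgeSketch(0L, () -> now[0]);
        // 7:00am: a new sink op with timestamp 7_000_000 -> age is 100ms
        System.out.println(sink.setAgeOfLastAppliedOp(7_000_000L)); // 100
        now[0] += 3_600_000L; // an hour passes with no new sink ops
        // periodic refresh replays the same timestamp -> 0, not 1h + 100ms
        System.out.println(sink.setAgeOfLastAppliedOp(7_000_000L)); // 0
    }
}
```

This matches the scenario in the description: the 8:00am refresh now reports 0 instead of 1 hour + 100ms, at the cost that the original 100ms reading is no longer repeated.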
[jira] [Updated] (HBASE-11617) incorrect AgeOfLastAppliedOp and AgeOfLastShippedOp in replication Metrics when no new replication OP
[ https://issues.apache.org/jira/browse/HBASE-11617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Demai Ni updated HBASE-11617: - Summary: incorrect AgeOfLastAppliedOp and AgeOfLastShippedOp in replication Metrics when no new replication OP (was: AgeOfLastAppliedOp in MetricsSink got increased when no new replication sink OP ) incorrect AgeOfLastAppliedOp and AgeOfLastShippedOp in replication Metrics when no new replication OP -- Key: HBASE-11617 URL: https://issues.apache.org/jira/browse/HBASE-11617 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.98.2 Reporter: Demai Ni Assignee: Demai Ni Priority: Minor Fix For: 0.99.0, 0.98.5, 2.0.0 Attachments: HBASE-11617-master-v1.patch AgeOfLastAppliedOp in MetricsSink.java is meant to indicate the time an edit sat in the 'replication queue' before it got replicated (aka applied):
{code}
  /**
   * Set the age of the last applied operation
   *
   * @param timestamp The timestamp of the last operation applied.
   * @return the age that was set
   */
  public long setAgeOfLastAppliedOp(long timestamp) {
    lastTimestampForAge = timestamp;
    long age = System.currentTimeMillis() - lastTimestampForAge;
    rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age);
    return age;
  }
{code}
Consider the following scenario: 1) at 7:00am a sink op is applied, and SINK_AGE_OF_LAST_APPLIED_OP is set to, for example, 100ms; 2) then NO new sink op occurs; 3) refreshAgeOfLastAppliedOp() is invoked at 8:00am. Instead of returning the 100ms, the AgeOfLastAppliedOp becomes 1 hour + 100ms. This is because refreshAgeOfLastAppliedOp() gets invoked periodically by getStats().
proposed fix:
{code}
--- hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsSink.java
+++ hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsSink.java
@@ -35,6 +35,7 @@ public class MetricsSink {
   private MetricsReplicationSource rms;
   private long lastTimestampForAge = System.currentTimeMillis();
+  private long age = 0;

   public MetricsSink() {
     rms = CompatibilitySingletonFactory.getInstance(MetricsReplicationSource.class);
@@ -47,8 +48,12 @@ public class MetricsSink {
    * @return the age that was set
    */
   public long setAgeOfLastAppliedOp(long timestamp) {
-    lastTimestampForAge = timestamp;
-    long age = System.currentTimeMillis() - lastTimestampForAge;
+    if (lastTimestampForAge != timestamp) {
+      lastTimestampForAge = timestamp;
+      this.age = System.currentTimeMillis() - lastTimestampForAge;
+    } else {
+      this.age = 0;
+    }
     rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age);
     return age;
   }
{code}
detail discussion in [dev@hbase | http://mail-archives.apache.org/mod_mbox/hbase-dev/201407.mbox/%3CCAOEq2C5BKMXAM2Fv4LGVb_Ktek-Pm%3DhjOi33gSHX-2qHqAou6w%40mail.gmail.com%3E ] -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11620) Propagate decoder exception to HLogSplitter so that loss of data is avoided
[ https://issues.apache.org/jira/browse/HBASE-11620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-11620: --- Affects Version/s: 0.98.4 Propagate decoder exception to HLogSplitter so that loss of data is avoided --- Key: HBASE-11620 URL: https://issues.apache.org/jira/browse/HBASE-11620 Project: HBase Issue Type: Bug Affects Versions: 0.98.4 Reporter: Ted Yu Priority: Critical Attachments: 11620-v1.txt Reported by Kiran in this thread: HBase file encryption, inconsistencies observed and data loss After step 4 ( i.e disabling of WAL encryption, removing SecureProtobufReader/Writer and restart), read of encrypted WAL fails mainly due to EOF exception at Basedecoder. This is not considered as error and these WAL are being moved to /oldWALs. Following is observed in log files: {code} 2014-07-30 19:44:29,254 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: Splitting hlog: hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017, length=172 2014-07-30 19:44:29,254 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: DistributedLogReplay = false 2014-07-30 19:44:29,313 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] util.FSHDFSUtils: Recovering lease on dfs file hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017 2014-07-30 19:44:29,315 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] util.FSHDFSUtils: recoverLease=true, attempt=0 on file=hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017 after 1ms 2014-07-30 19:44:29,429 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-0] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-0,5,main]: starting 2014-07-30 19:44:29,429 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-1] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-1,5,main]: starting 
2014-07-30 19:44:29,430 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-2] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-2,5,main]: starting 2014-07-30 19:44:29,591 ERROR [RS_LOG_REPLAY_OPS-HOST-16:15264-1] codec.BaseDecoder: Partial cell read caused by EOF: java.io.IOException: Premature EOF from inputStream 2014-07-30 19:44:29,592 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: Finishing writing output logs and closing down. 2014-07-30 19:44:29,592 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: Waiting for split writer threads to finish 2014-07-30 19:44:29,592 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: Split writers finished 2014-07-30 19:44:29,592 INFO [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: Processed 0 edits across 0 regions; log file=hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017 is corrupted = false progress failed = false {code} To fix this, we need to propagate the EOF exception to HLogSplitter. Any suggestions on the fix? (end of quote from Kiran) In BaseDecoder#rethrowEofException():
{code}
if (!isEof) throw ioEx;
LOG.error("Partial cell read caused by EOF: " + ioEx);
EOFException eofEx = new EOFException("Partial cell read");
eofEx.initCause(ioEx);
throw eofEx;
{code}
Throwing EOFException would not propagate the partial cell read condition to HLogSplitter, which doesn't treat EOFException as an error. I think an IOException should be thrown above - HLogSplitter#getNextLogLine() would translate the IOEx to CorruptedLogFileException. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11617) incorrect AgeOfLastAppliedOp and AgeOfLastShippedOp in replication Metrics when no new replication OP
[ https://issues.apache.org/jira/browse/HBASE-11617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Demai Ni updated HBASE-11617: - Status: Patch Available (was: In Progress) [~lhofhansl], would you please take a look at the patch and see whether it matches your take? Thanks. incorrect AgeOfLastAppliedOp and AgeOfLastShippedOp in replication Metrics when no new replication OP -- Key: HBASE-11617 URL: https://issues.apache.org/jira/browse/HBASE-11617 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.98.2 Reporter: Demai Ni Assignee: Demai Ni Priority: Minor Fix For: 0.99.0, 0.98.5, 2.0.0 Attachments: HBASE-11617-master-v1.patch AgeOfLastAppliedOp in MetricsSink.java is meant to indicate the time an edit sat in the 'replication queue' before it got replicated (aka applied):
{code}
  /**
   * Set the age of the last applied operation
   *
   * @param timestamp The timestamp of the last operation applied.
   * @return the age that was set
   */
  public long setAgeOfLastAppliedOp(long timestamp) {
    lastTimestampForAge = timestamp;
    long age = System.currentTimeMillis() - lastTimestampForAge;
    rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age);
    return age;
  }
{code}
Consider the following scenario: 1) at 7:00am a sink op is applied, and SINK_AGE_OF_LAST_APPLIED_OP is set to, for example, 100ms; 2) then NO new sink op occurs; 3) refreshAgeOfLastAppliedOp() is invoked at 8:00am. Instead of returning the 100ms, the AgeOfLastAppliedOp becomes 1 hour + 100ms. This is because refreshAgeOfLastAppliedOp() gets invoked periodically by getStats().
proposed fix:
{code}
--- hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsSink.java
+++ hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsSink.java
@@ -35,6 +35,7 @@ public class MetricsSink {
   private MetricsReplicationSource rms;
   private long lastTimestampForAge = System.currentTimeMillis();
+  private long age = 0;

   public MetricsSink() {
     rms = CompatibilitySingletonFactory.getInstance(MetricsReplicationSource.class);
@@ -47,8 +48,12 @@ public class MetricsSink {
    * @return the age that was set
    */
   public long setAgeOfLastAppliedOp(long timestamp) {
-    lastTimestampForAge = timestamp;
-    long age = System.currentTimeMillis() - lastTimestampForAge;
+    if (lastTimestampForAge != timestamp) {
+      lastTimestampForAge = timestamp;
+      this.age = System.currentTimeMillis() - lastTimestampForAge;
+    } else {
+      this.age = 0;
+    }
     rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age);
     return age;
   }
{code}
detail discussion in [dev@hbase | http://mail-archives.apache.org/mod_mbox/hbase-dev/201407.mbox/%3CCAOEq2C5BKMXAM2Fv4LGVb_Ktek-Pm%3DhjOi33gSHX-2qHqAou6w%40mail.gmail.com%3E ] -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11143) Improve replication metrics
[ https://issues.apache.org/jira/browse/HBASE-11143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14080018#comment-14080018 ] Demai Ni commented on HBASE-11143: -- Thanks to [~lhofhansl]'s suggestion, the patch is uploaded in [HBASE-11617 | https://issues.apache.org/jira/browse/HBASE-11617] Improve replication metrics --- Key: HBASE-11143 URL: https://issues.apache.org/jira/browse/HBASE-11143 Project: HBase Issue Type: Bug Components: Replication Reporter: Lars Hofhansl Assignee: Lars Hofhansl Fix For: 0.99.0, 0.94.20, 0.98.3 Attachments: 11143-0.94-v2.txt, 11143-0.94-v3.txt, 11143-0.94.txt, 11143-trunk.txt We are trying to report on replication lag and find that there is no good single metric to do that. ageOfLastShippedOp is close, but unfortunately it is increased even when there is nothing to ship on a particular RegionServer. I would like to discuss a few options here: Add a new metric: replicationQueueTime (or something) with the above meaning. I.e. if we have something to ship we set the age of that last shipped edit, and if we fail we increment that last time (just like we do now). But if there is nothing to replicate we set it to the current time (and hence that metric is reported as close to 0). Alternatively we could change the meaning of ageOfLastShippedOp to do that. That might lead to surprises, but the current behavior is clearly weird when there is nothing to replicate. Comments? [~jdcryans], [~stack]. If the approach sounds good, I'll make a patch for all branches. Edit: Also adds a new shippedKBs metric to track the amount of data that is shipped via replication. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11617) incorrect AgeOfLastAppliedOp and AgeOfLastShippedOp in replication Metrics when no new replication OP
[ https://issues.apache.org/jira/browse/HBASE-11617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14080026#comment-14080026 ] Demai Ni commented on HBASE-11617: -- btw, with this patch, I am not sure what the purpose of MetricsSink.refreshAgeOfLastAppliedOp() is, as it will be ignored and always return age = 0.
incorrect AgeOfLastAppliedOp and AgeOfLastShippedOp in replication Metrics when no new replication OP -- Key: HBASE-11617 URL: https://issues.apache.org/jira/browse/HBASE-11617 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.98.2 Reporter: Demai Ni Assignee: Demai Ni Priority: Minor Fix For: 0.99.0, 0.98.5, 2.0.0 Attachments: HBASE-11617-master-v1.patch
AgeOfLastAppliedOp in MetricsSink.java is meant to indicate how long an edit sat in the 'replication queue' before it got replicated (aka applied):
{code}
  /**
   * Set the age of the last applied operation
   *
   * @param timestamp The timestamp of the last operation applied.
   * @return the age that was set
   */
  public long setAgeOfLastAppliedOp(long timestamp) {
    lastTimestampForAge = timestamp;
    long age = System.currentTimeMillis() - lastTimestampForAge;
    rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age);
    return age;
  }
{code}
Consider the following scenario:
1) at 7:00am a sink op is applied, and SINK_AGE_OF_LAST_APPLIED_OP is set to, for example, 100ms;
2) then NO new sink op occurs;
3) when refreshAgeOfLastAppliedOp() is invoked at 8:00am, instead of returning the 100ms, the AgeOfLastAppliedOp becomes 1 hour + 100ms. This is because refreshAgeOfLastAppliedOp() is invoked periodically by getStats().
proposed fix:
{code}
--- hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsSink.java
+++ hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsSink.java
@@ -35,6 +35,7 @@ public class MetricsSink {
   private MetricsReplicationSource rms;
   private long lastTimestampForAge = System.currentTimeMillis();
+  private long age = 0;

   public MetricsSink() {
     rms = CompatibilitySingletonFactory.getInstance(MetricsReplicationSource.class);
@@ -47,8 +48,12 @@ public class MetricsSink {
    * @return the age that was set
    */
   public long setAgeOfLastAppliedOp(long timestamp) {
-    lastTimestampForAge = timestamp;
-    long age = System.currentTimeMillis() - lastTimestampForAge;
+    if (lastTimestampForAge != timestamp) {
+      lastTimestampForAge = timestamp;
+      this.age = System.currentTimeMillis() - lastTimestampForAge;
+    } else {
+      this.age = 0;
+    }
     rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age);
     return age;
   }
{code}
detailed discussion in [dev@hbase | http://mail-archives.apache.org/mod_mbox/hbase-dev/201407.mbox/%3CCAOEq2C5BKMXAM2Fv4LGVb_Ktek-Pm%3DhjOi33gSHX-2qHqAou6w%40mail.gmail.com%3E ] -- This message was sent by Atlassian JIRA (v6.2#6252)
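Stripped of the HBase metrics plumbing, the patched logic behaves as below. This is a standalone sketch with an injected clock; the class name and second parameter are illustrative, not the actual MetricsSink API:

```java
// Standalone model of the patched setAgeOfLastAppliedOp(): a repeated call with
// the same timestamp (e.g. from the periodic refresh in getStats()) now reports
// age 0 instead of re-deriving an ever-growing age from the stale timestamp.
final class MetricsSinkSketch {
    private long lastTimestampForAge;
    private long age = 0;

    MetricsSinkSketch(long now) {
        lastTimestampForAge = now;
    }

    long setAgeOfLastAppliedOp(long timestamp, long now) {
        if (lastTimestampForAge != timestamp) {
            // A genuinely new operation: record its age.
            lastTimestampForAge = timestamp;
            age = now - lastTimestampForAge;
        } else {
            // Same timestamp as last time: no new ops have been applied.
            age = 0;
        }
        return age;
    }
}
```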
[jira] [Commented] (HBASE-11616) TestNamespaceUpgrade fails in trunk
[ https://issues.apache.org/jira/browse/HBASE-11616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14080028#comment-14080028 ] Hadoop QA commented on HBASE-11616: ---
{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12658727/hbase-11616.patch against trunk revision .
ATTACHMENT ID: 12658727
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:
org.apache.hadoop.hbase.TestIOFencing
org.apache.hadoop.hbase.master.TestRestartCluster
org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction
org.apache.hadoop.hbase.regionserver.TestRegionReplicas
org.apache.hadoop.hbase.client.TestReplicasClient
org.apache.hadoop.hbase.master.TestMasterOperationsForRegionReplicas
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10230//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10230//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10230//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10230//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10230//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10230//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10230//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10230//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10230//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10230//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10230//console
This message is automatically generated.
TestNamespaceUpgrade fails in trunk --- Key: HBASE-11616 URL: https://issues.apache.org/jira/browse/HBASE-11616 Project: HBase Issue Type: Test Reporter: Ted Yu Assignee: Jimmy Xiang Fix For: 2.0.0 Attachments: hbase-11616.patch
I see the following in test output:
{code}
error message="Can't get the location" type="org.apache.hadoop.hbase.client.RetriesExhaustedException"
org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location
  at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:287)
  at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:132)
  at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:1)
  at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:179)
  at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:287)
  at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:267)
  at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:139)
  at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:134)
  at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:814)
  at
[jira] [Updated] (HBASE-3270) When we create the .version file, we should create it in a tmp location and then move it into place
[ https://issues.apache.org/jira/browse/HBASE-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-3270: --- Labels: newbie (was: ) When we create the .version file, we should create it in a tmp location and then move it into place --- Key: HBASE-3270 URL: https://issues.apache.org/jira/browse/HBASE-3270 Project: HBase Issue Type: Improvement Components: master Reporter: stack Priority: Minor Labels: newbie Fix For: 0.99.0, 0.98.5, 2.0.0 Todd suggests over in HBASE-3258 that when writing hbase.version, we should write it to a tmp location and then move it into place after writing, to protect against the case where the file writer crashes between creation and write. -- This message was sent by Atlassian JIRA (v6.2#6252)
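Against the local filesystem, the suggested write-then-rename pattern looks roughly like this (HBase itself would go through Hadoop's FileSystem API against HDFS; the class, method, and file names here are illustrative):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Sketch of crash-safe version-file creation: write the full contents to a
// temporary file first, then atomically move it into place, so a crash between
// file creation and write can never leave a truncated .version file behind.
final class VersionFileSketch {
    static void writeVersionFile(Path dir, String version) throws IOException {
        Path tmp = dir.resolve(".version.tmp");
        Path dst = dir.resolve(".version");
        Files.write(tmp, version.getBytes(StandardCharsets.UTF_8));
        Files.move(tmp, dst, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

Readers either see no .version at all (recoverable: rewrite it) or a complete one, never a partial write.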
[jira] [Commented] (HBASE-11615) TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins
[ https://issues.apache.org/jira/browse/HBASE-11615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14080050#comment-14080050 ] Hadoop QA commented on HBASE-11615: ---
{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12658742/hbase-11615.patch against trunk revision .
ATTACHMENT ID: 12658742
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:
org.apache.hadoop.hbase.procedure.TestProcedureManager
org.apache.hadoop.hbase.ipc.TestIPC
org.apache.hadoop.hbase.master.TestClockSkewDetection
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10232//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10232//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10232//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10232//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10232//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10232//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10232//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10232//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10232//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10232//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10232//console
This message is automatically generated.
TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins --- Key: HBASE-11615 URL: https://issues.apache.org/jira/browse/HBASE-11615 Project: HBase Issue Type: Test Components: master Reporter: Mike Drob Assignee: Jimmy Xiang Fix For: 1.0.0, 2.0.0 Attachments: hbase-11615.patch Failed on branch-1. Example Failure: https://builds.apache.org/job/HBase-1.0/75/testReport/org.apache.hadoop.hbase.master/TestZKLessAMOnCluster/testForceAssignWhileClosing/ -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11621) Make MiniDFSCluster run faster
[ https://issues.apache.org/jira/browse/HBASE-11621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-11621: --- Attachment: 11621-v1.txt Tentative patch. Running test suite to see which test(s) break. Make MiniDFSCluster run faster -- Key: HBASE-11621 URL: https://issues.apache.org/jira/browse/HBASE-11621 Project: HBase Issue Type: Task Reporter: Ted Yu Attachments: 11621-v1.txt Daryn proposed the following change in HDFS-6773: {code} EditLogFileOutputStream.setShouldSkipFsyncForTesting(true); {code} With this change in HBaseTestingUtility#startMiniDFSCluster(), runtime for TestAdmin went from 8:35 min to 7 min -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HBASE-11621) Make MiniDFSCluster run faster
Ted Yu created HBASE-11621: -- Summary: Make MiniDFSCluster run faster Key: HBASE-11621 URL: https://issues.apache.org/jira/browse/HBASE-11621 Project: HBase Issue Type: Task Reporter: Ted Yu Attachments: 11621-v1.txt Daryn proposed the following change in HDFS-6773: {code} EditLogFileOutputStream.setShouldSkipFsyncForTesting(true); {code} With this change in HBaseTestingUtility#startMiniDFSCluster(), runtime for TestAdmin went from 8:35 min to 7 min -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11615) TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins
[ https://issues.apache.org/jira/browse/HBASE-11615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14080095#comment-14080095 ] stack commented on HBASE-11615: --- +1 Failures are apache infra related I believe. TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins --- Key: HBASE-11615 URL: https://issues.apache.org/jira/browse/HBASE-11615 Project: HBase Issue Type: Test Components: master Reporter: Mike Drob Assignee: Jimmy Xiang Fix For: 1.0.0, 2.0.0 Attachments: hbase-11615.patch Failed on branch-1. Example Failure: https://builds.apache.org/job/HBase-1.0/75/testReport/org.apache.hadoop.hbase.master/TestZKLessAMOnCluster/testForceAssignWhileClosing/ -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11616) TestNamespaceUpgrade fails in trunk
[ https://issues.apache.org/jira/browse/HBASE-11616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14080103#comment-14080103 ] Ted Yu commented on HBASE-11616: lgtm
TestNamespaceUpgrade fails in trunk --- Key: HBASE-11616 URL: https://issues.apache.org/jira/browse/HBASE-11616 Project: HBase Issue Type: Test Reporter: Ted Yu Assignee: Jimmy Xiang Fix For: 2.0.0 Attachments: hbase-11616.patch
I see the following in test output:
{code}
error message="Can't get the location" type="org.apache.hadoop.hbase.client.RetriesExhaustedException"
org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location
  at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:287)
  at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:132)
  at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:1)
  at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:179)
  at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:287)
  at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:267)
  at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:139)
  at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:134)
  at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:814)
  at org.apache.hadoop.hbase.migration.TestNamespaceUpgrade.setUpBeforeClass(TestNamespaceUpgrade.java:147)
  ...
Caused by: org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in hbase:meta for region hbase:acl,,1376029204842.06dfcfc239196403c5f1135b91dedc64. containing row
  at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1233)
  at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1099)
  at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:279)
  ... 31 more
{code}
The cause of the above error is that the _acl_ table contained in the image doesn't have a server address in hbase:meta. [~jxiang]: What do you think would be the proper fix? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11616) TestNamespaceUpgrade fails in trunk
[ https://issues.apache.org/jira/browse/HBASE-11616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14080107#comment-14080107 ] stack commented on HBASE-11616: --- Why not remove TestNamespaceUpgrade in trunk (and all code associated with namespace upgrades)? We don't need it anymore. Otherwise +1 on the patch for now.
TestNamespaceUpgrade fails in trunk --- Key: HBASE-11616 URL: https://issues.apache.org/jira/browse/HBASE-11616 Project: HBase Issue Type: Test Reporter: Ted Yu Assignee: Jimmy Xiang Fix For: 2.0.0 Attachments: hbase-11616.patch
I see the following in test output:
{code}
error message="Can't get the location" type="org.apache.hadoop.hbase.client.RetriesExhaustedException"
org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location
  at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:287)
  at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:132)
  at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:1)
  at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:179)
  at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:287)
  at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:267)
  at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:139)
  at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:134)
  at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:814)
  at org.apache.hadoop.hbase.migration.TestNamespaceUpgrade.setUpBeforeClass(TestNamespaceUpgrade.java:147)
  ...
Caused by: org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in hbase:meta for region hbase:acl,,1376029204842.06dfcfc239196403c5f1135b91dedc64. containing row
  at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1233)
  at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1099)
  at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:279)
  ... 31 more
{code}
The cause of the above error is that the _acl_ table contained in the image doesn't have a server address in hbase:meta. [~jxiang]: What do you think would be the proper fix? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-4744) Remove @Ignore for testLogRollAfterSplitStart
[ https://issues.apache.org/jira/browse/HBASE-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14080113#comment-14080113 ] stack commented on HBASE-4744: -- Patch looks good to me. Will wait on hadoopqa. Get your reason for moving the test. Makes sense. Up to you. I could commit v2 or wait on a v3 where you move it. Good on you Sean. Remove @Ignore for testLogRollAfterSplitStart - Key: HBASE-4744 URL: https://issues.apache.org/jira/browse/HBASE-4744 Project: HBase Issue Type: Test Affects Versions: 0.94.0 Reporter: Nicolas Spiegelberg Priority: Critical Labels: newbie Attachments: HBASE_4744-v2.patch, HBASE_4744.patch We fixed a data loss bug in HBASE-2312 by adding non-recursive creates to HDFS. Although a number of HDFS versions have this fix, the official HDFS 0.20.205 branch currently doesn't, so we needed to mark the test as ignored. Please revisit before the RC of 0.94, which should have 0.20.205.1 or later with the necessary HDFS patches. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11616) TestNamespaceUpgrade fails in trunk
[ https://issues.apache.org/jira/browse/HBASE-11616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14080119#comment-14080119 ] Jimmy Xiang commented on HBASE-11616: - Thought about removing it. Probably will do it in HBASE-11611.
TestNamespaceUpgrade fails in trunk --- Key: HBASE-11616 URL: https://issues.apache.org/jira/browse/HBASE-11616 Project: HBase Issue Type: Test Reporter: Ted Yu Assignee: Jimmy Xiang Fix For: 2.0.0 Attachments: hbase-11616.patch
I see the following in test output:
{code}
error message="Can't get the location" type="org.apache.hadoop.hbase.client.RetriesExhaustedException"
org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location
  at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:287)
  at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:132)
  at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:1)
  at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:179)
  at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:287)
  at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:267)
  at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:139)
  at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:134)
  at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:814)
  at org.apache.hadoop.hbase.migration.TestNamespaceUpgrade.setUpBeforeClass(TestNamespaceUpgrade.java:147)
  ...
Caused by: org.apache.hadoop.hbase.client.NoServerForRegionException: No server address listed in hbase:meta for region hbase:acl,,1376029204842.06dfcfc239196403c5f1135b91dedc64. containing row
  at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1233)
  at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1099)
  at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:279)
  ... 31 more
{code}
The cause of the above error is that the _acl_ table contained in the image doesn't have a server address in hbase:meta. [~jxiang]: What do you think would be the proper fix? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11615) TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins
[ https://issues.apache.org/jira/browse/HBASE-11615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-11615: Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Integrated into master and branch-1. TestZKLessAMOnCluster.testForceAssignWhileClosing failed on Jenkins --- Key: HBASE-11615 URL: https://issues.apache.org/jira/browse/HBASE-11615 Project: HBase Issue Type: Test Components: master Reporter: Mike Drob Assignee: Jimmy Xiang Fix For: 1.0.0, 2.0.0 Attachments: hbase-11615.patch Failed on branch-1. Example Failure: https://builds.apache.org/job/HBase-1.0/75/testReport/org.apache.hadoop.hbase.master/TestZKLessAMOnCluster/testForceAssignWhileClosing/ -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-4744) Remove @Ignore for testLogRollAfterSplitStart
[ https://issues.apache.org/jira/browse/HBASE-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14080129#comment-14080129 ] Hadoop QA commented on HBASE-4744: --
{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12658744/HBASE_4744-v2.patch against trunk revision .
ATTACHMENT ID: 12658744
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 4 new or modified tests.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:
org.apache.hadoop.hbase.master.TestRegionPlacement
org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction
org.apache.hadoop.hbase.migration.TestNamespaceUpgrade
org.apache.hadoop.hbase.regionserver.TestRegionReplicas
org.apache.hadoop.hbase.client.TestReplicasClient
org.apache.hadoop.hbase.master.TestRestartCluster
org.apache.hadoop.hbase.TestRegionRebalancing
org.apache.hadoop.hbase.TestIOFencing
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10231//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10231//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10231//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10231//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10231//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10231//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10231//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10231//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10231//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10231//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10231//console
This message is automatically generated.
Remove @Ignore for testLogRollAfterSplitStart - Key: HBASE-4744 URL: https://issues.apache.org/jira/browse/HBASE-4744 Project: HBase Issue Type: Test Affects Versions: 0.94.0 Reporter: Nicolas Spiegelberg Priority: Critical Labels: newbie Attachments: HBASE_4744-v2.patch, HBASE_4744.patch We fixed a data loss bug in HBASE-2312 by adding non-recursive creates to HDFS. Although a number of HDFS versions have this fix, the official HDFS 0.20.205 branch currently doesn't, so we needed to mark the test as ignored. Please revisit before the RC of 0.94, which should have 0.20.205.1 or later with the necessary HDFS patches. -- This message was sent by Atlassian JIRA (v6.2#6252)