[jira] [Commented] (HBASE-10097) Remove a region name string creation in HRegion#nextInternal
[ https://issues.apache.org/jira/browse/HBASE-10097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842154#comment-13842154 ] Hudson commented on HBASE-10097: SUCCESS: Integrated in HBase-TRUNK #4715 (See [https://builds.apache.org/job/HBase-TRUNK/4715/]) HBASE-10097 Remove a region name string creation in HRegion#nextInternal (nkeywal: rev 1548711) * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcCallContext.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java Remove a region name string creation in HRegion#nextInternal Key: HBASE-10097 URL: https://issues.apache.org/jira/browse/HBASE-10097 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.98.0, 0.96.1, 0.99.0 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Priority: Critical Fix For: 0.96.1, 0.98.1, 0.99.0 Attachments: 10097.v1.patch We're creating a String in each nextInternal. Before HBASE-9983 this was cached, but it's not the case anymore... -- This message was sent by Atlassian JIRA (v6.1#6144)
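The per-call String creation described above can be illustrated with a minimal Java sketch. It assumes the cost comes from converting a byte-array name to a String on every call; `CachedRegionNameSketch` and its methods are hypothetical names, not HBase classes, and this is not necessarily what the attached patch does.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a region object; not HBase's HRegion.
final class CachedRegionNameSketch {
    private final byte[] regionName;
    private String regionNameAsString; // built on first use, then reused
    private int conversions;           // counts conversions, for demonstration only

    CachedRegionNameSketch(byte[] regionName) {
        this.regionName = regionName.clone();
    }

    // Returns the same String instance on every call after the first,
    // so a hot path does not allocate a new String each time.
    synchronized String getRegionNameAsString() {
        if (regionNameAsString == null) {
            regionNameAsString = new String(regionName, StandardCharsets.UTF_8);
            conversions++;
        }
        return regionNameAsString;
    }

    synchronized int conversionCount() {
        return conversions;
    }
}
```

Calling `getRegionNameAsString()` repeatedly performs one conversion in this sketch, however many times the scanner path asks for the name.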
[jira] [Commented] (HBASE-10061) TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in thrown NPE
[ https://issues.apache.org/jira/browse/HBASE-10061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842159#comment-13842159 ] Hudson commented on HBASE-10061: SUCCESS: Integrated in HBase-TRUNK #4715 (See [https://builds.apache.org/job/HBase-TRUNK/4715/]) HBASE-10061 TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in thrown NPE (Amit Sela) (ndimiduk: rev 1548747) * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TableMapReduceUtil.java TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in thrown NPE -- Key: HBASE-10061 URL: https://issues.apache.org/jira/browse/HBASE-10061 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.94.12 Reporter: Amit Sela Assignee: Amit Sela Priority: Minor Fix For: 0.98.0, 0.96.1, 0.94.15, 0.99.0 Attachments: 10061-trunk.txt, 10061-trunk.txt, HBASE-10061.patch TableMapReduceUtil.findOrCreateJar line 596: jar = getJar(my_class); updateMap(jar, packagedClasses); In case getJar returns null, updateMap will throw NPE. Should check null==jar before calling updateMap. -- This message was sent by Atlassian JIRA (v6.1#6144)
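The null check suggested in the description can be sketched as follows. The methods below are simplified, hypothetical stand-ins for `getJar` and `updateMap` (their real signatures in TableMapReduceUtil differ), so this shows only the guard, not the committed fix.

```java
import java.util.Map;
import java.util.Objects;

// Simplified stand-ins; not the TableMapReduceUtil implementation.
final class JarLookupSketch {
    // Stand-in for getJar: returns null when no jar is known for the class.
    static String getJar(String className, Map<String, String> classToJar) {
        return classToJar.get(className);
    }

    // Stand-in for updateMap: fails on a null jar, like the reported NPE.
    static void updateMap(String jar, Map<String, String> packagedClasses) {
        Objects.requireNonNull(jar, "jar");
        packagedClasses.put(jar, jar);
    }

    // Guarded version: updateMap is only called when a jar was found.
    static String findJar(String className,
                          Map<String, String> classToJar,
                          Map<String, String> packagedClasses) {
        String jar = getJar(className, classToJar);
        if (jar != null) {
            updateMap(jar, packagedClasses);
        }
        return jar;
    }
}
```

With the guard in place, a class that is not packaged in any jar yields a null result instead of an exception, and the caller decides what to do next.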
[jira] [Commented] (HBASE-9955) Make hadoop2 the default and deprecate hadoop1
[ https://issues.apache.org/jira/browse/HBASE-9955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842152#comment-13842152 ] Hudson commented on HBASE-9955: --- SUCCESS: Integrated in HBase-TRUNK #4715 (See [https://builds.apache.org/job/HBase-TRUNK/4715/]) HBASE-9955 Make hadoop2 the default and deprecate hadoop1 (stack: rev 1548735) * /hbase/trunk/pom.xml Make hadoop2 the default and deprecate hadoop1 -- Key: HBASE-9955 URL: https://issues.apache.org/jira/browse/HBASE-9955 Project: HBase Issue Type: Task Reporter: stack Assignee: stack Fix For: 0.98.0 Attachments: 9955.txt, 9955v2.txt, 9955v3.txt, 9955v4.txt, 9955v5.098.txt, 9955v5.txt, addendum.txt, addendum.txt See the "Hadoop version trunk dependency?" thread on the dev mailing list. Consensus seems to be forming to do what the subject line says (recheck the mail thread before going ahead). -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-4163) Create Split Strategy for YCSB Benchmark
[ https://issues.apache.org/jira/browse/HBASE-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842156#comment-13842156 ] Hudson commented on HBASE-4163: --- SUCCESS: Integrated in HBase-TRUNK #4715 (See [https://builds.apache.org/job/HBase-TRUNK/4715/]) Add note on how to presplit for ycsb from HBASE-4163 (stack: rev 1548760) * /hbase/trunk/src/main/docbkx/book.xml Create Split Strategy for YCSB Benchmark Key: HBASE-4163 URL: https://issues.apache.org/jira/browse/HBASE-4163 Project: HBase Issue Type: Improvement Components: util Affects Versions: 0.90.3, 0.92.0 Reporter: Nicolas Spiegelberg Assignee: Luke Lu Priority: Minor Labels: benchmark Fix For: 0.99.0 Talked with Lars about how we can make it easier for users to run the YCSB benchmarks against HBase and get realistic results. Currently, HBase is optimized for the random/uniform read/write case, which is the YCSB load. The initial reason why we perform badly when users test against us is that they do not presplit regions and have the split ratio set really low. We need a one-line way for a user to create a table that is pre-split to 200 regions (or some decent number) and, by default, has splitting disabled. Realistically, this is how a uniform load cluster should scale, so it's not a hack. This will also give us a good use case to point to for how users should pre-split regions. -- This message was sent by Atlassian JIRA (v6.1#6144)
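A rough Java sketch of computing pre-split points for a uniformly loaded table follows. It assumes row keys are spread evenly over their first two bytes, which may not match the actual YCSB key format; `uniformSplits` is a hypothetical helper, not an HBase API, and the resulting keys would still have to be passed to whatever table-creation call is in use.

```java
// Hypothetical helper; computes split keys only and does not talk to HBase.
final class UniformSplitSketch {
    // Returns numRegions - 1 two-byte split keys that divide the key space
    // 0x0000..0xFFFF into numRegions ranges of (nearly) equal width.
    static byte[][] uniformSplits(int numRegions) {
        if (numRegions < 2 || numRegions > 65536) {
            throw new IllegalArgumentException("numRegions must be in [2, 65536]");
        }
        byte[][] splits = new byte[numRegions - 1][];
        for (int i = 1; i < numRegions; i++) {
            // Boundary of the i-th region within the 16-bit key space.
            int boundary = (int) ((long) i * 65536 / numRegions);
            splits[i - 1] = new byte[] {(byte) (boundary >>> 8), (byte) boundary};
        }
        return splits;
    }
}
```

For the 200 regions mentioned above this produces 199 split keys. Keys that are not uniformly distributed over their leading bytes would need split points derived from the real key distribution instead.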
[jira] [Commented] (HBASE-10099) javadoc warning introduced by LabelExpander 188: warning - @return tag has no arguments
[ https://issues.apache.org/jira/browse/HBASE-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842153#comment-13842153 ] Hudson commented on HBASE-10099: SUCCESS: Integrated in HBase-TRUNK #4715 (See [https://builds.apache.org/job/HBase-TRUNK/4715/]) HBASE-10099 javadoc warning introduced by LabelExpander 188: warning - @return tag has no arguments (tedyu: rev 1548725) * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/LabelExpander.java javadoc warning introduced by LabelExpander 188: warning - @return tag has no arguments Key: HBASE-10099 URL: https://issues.apache.org/jira/browse/HBASE-10099 Project: HBase Issue Type: Bug Affects Versions: 0.98.0 Reporter: Demai Ni Assignee: Demai Ni Priority: Trivial Fix For: 0.98.0 Attachments: HBASE-10099-trunk-v0.patch, HBASE-10099-trunk-v1.patch, HBASE-10099-trunk-v2.patch src/main/java/org/apache/hadoop/hbase/mapreduce/LabelExpander.java:188: warning - @return tag has no arguments -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10048) Add hlog number metric in regionserver
[ https://issues.apache.org/jira/browse/HBASE-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842155#comment-13842155 ] Hudson commented on HBASE-10048: SUCCESS: Integrated in HBase-TRUNK #4715 (See [https://builds.apache.org/job/HBase-TRUNK/4715/]) HBASE-10048 Add hlog number metric in regionserver (stack: rev 1548768) * /hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSource.java * /hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerWrapper.java * /hbase/trunk/hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSourceImpl.java * /hbase/trunk/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSourceImpl.java * /hbase/trunk/hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/regionserver/ServerMetricsTmpl.jamon * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerWrapperImpl.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerWrapperStub.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestMetricsRegionServer.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/HLogUtilsForTests.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLog.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java Add hlog number metric in regionserver -- Key: HBASE-10048 URL: https://issues.apache.org/jira/browse/HBASE-10048 Project: 
HBase Issue Type: Improvement Components: metrics Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 0.98.0, 0.96.1, 0.99.0 Attachments: HBASE-10048-0.94-v1.diff, HBASE-10048-0.94-v2.diff, HBASE-10048-trunk-v1.diff, HBASE-10048-trunk-v2.diff, HBASE-10048-trunk-v3.diff, HBASE-10048-trunk-v4.diff Add hlog number metric in regionserver. We can use this metric to alert about memstore flush because of too many hlogs. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10085) Some regions aren't re-assigned after a cluster restarts
[ https://issues.apache.org/jira/browse/HBASE-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842158#comment-13842158 ] Hudson commented on HBASE-10085: SUCCESS: Integrated in HBase-TRUNK #4715 (See [https://builds.apache.org/job/HBase-TRUNK/4715/]) HBASE-10085: Some regions aren't re-assigned after a master restarts (jeffreyz: rev 1548726) * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStates.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManagerOnCluster.java Some regions aren't re-assigned after a cluster restarts Key: HBASE-10085 URL: https://issues.apache.org/jira/browse/HBASE-10085 Project: HBase Issue Type: Bug Components: Region Assignment Affects Versions: 0.96.1 Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.98.0, 0.96.1 Attachments: hbase-10085.patch We saw this issue happen during a cluster restart: 1) When shutting down a cluster, some regions are in the offline state because no region servers are available (stop RS and then Master). 2) When the cluster restarts, the offlined regions are forced offline again and SSH skips re-assigning them in the function AM.processServerShutdown, as shown below. {code} 2013-12-03 10:41:56,686 INFO [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: Processing 873dbd8c269f44d0aefb0f66c5b53537 in state: M_ZK_REGION_OFFLINE 2013-12-03 10:41:56,686 DEBUG [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: RIT 873dbd8c269f44d0aefb0f66c5b53537 in state=M_ZK_REGION_OFFLINE was on deadserver; forcing offline ... 2013-12-03 10:41:56,739 DEBUG [AM.-pool1-t8] master.AssignmentManager: Force region state offline {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, ts=1386067316737, server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696} ... 
2013-12-03 10:41:57,223 WARN [MASTER_SERVER_OPERATIONS-h2-ubuntu12-sec-1386048659-hbase-8:6-3] master.RegionStates: THIS SHOULD NOT HAPPEN: unexpected {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, ts=1386067316737, server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696} {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10094) Add batching to HLogPerformanceEvaluation
[ https://issues.apache.org/jira/browse/HBASE-10094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842157#comment-13842157 ] Hudson commented on HBASE-10094: SUCCESS: Integrated in HBase-TRUNK #4715 (See [https://builds.apache.org/job/HBase-TRUNK/4715/]) HBASE-10094 Add batching to HLogPerformanceEvaluation (stack: rev 1548752) * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/HLogPerformanceEvaluation.java Add batching to HLogPerformanceEvaluation - Key: HBASE-10094 URL: https://issues.apache.org/jira/browse/HBASE-10094 Project: HBase Issue Type: Sub-task Components: Performance, wal Reporter: stack Assignee: Himanshu Vashishtha Fix For: 0.98.0, 0.96.1, 0.99.0 Attachments: 10094v2.txt As Himanshu points out in the parent issue, HLogPE is using an unorthodox API for appending edits to the WAL; it is using an API that is meant for tests only and that does an append immediately followed by a sync call. In a normal deployment, WAL appends are done as a bunch of appends followed by a sync on the tail of the transaction -- not a sync per append. This issue is about changing HLogPE to use append and then sync. It also adds an argument so you can specify batching of a set of appends before the sync is called. The latter lets HLogPE mimic multi puts that use the minibatch... which appends, appends, appends.. and then syncs. Assigning to Himanshu for review. -- This message was sent by Atlassian JIRA (v6.1#6144)
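The difference between a sync per append and a sync per batch can be sketched in Java as below. The `Log` interface and `CountingLog` class are hypothetical and only count calls; they are not the HLog or FSHLog API, and `batchSize` stands in for the batching argument the description mentions.

```java
// Hypothetical sketch; counts calls instead of writing a real WAL.
final class BatchedAppendSketch {
    // Minimal log interface assumed for illustration.
    interface Log {
        void append(String edit);
        void sync();
    }

    // Records how many appends and syncs were issued.
    static final class CountingLog implements Log {
        int appends;
        int syncs;
        public void append(String edit) { appends++; }
        public void sync() { syncs++; }
    }

    // Appends totalEdits edits, syncing once after every batchSize appends
    // and once more at the end if a partial batch is still pending.
    static void write(Log log, int totalEdits, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        int pending = 0;
        for (int i = 0; i < totalEdits; i++) {
            log.append("edit-" + i);
            if (++pending == batchSize) {
                log.sync();
                pending = 0;
            }
        }
        if (pending > 0) {
            log.sync();
        }
    }
}
```

A `batchSize` of 1 reproduces the test-only behavior of one sync per append, while larger values approximate the append, append, append, then sync pattern of a minibatch.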
[jira] [Commented] (HBASE-10101) testOfflineRegionReAssginedAfterMasterRestart times out sometimes.
[ https://issues.apache.org/jira/browse/HBASE-10101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842165#comment-13842165 ] Jeffrey Zhong commented on HBASE-10101: --- The test case did uncover a race condition that appears to be pre-existing. Basically, it relies on the old source server being processed by SSH first; otherwise, even the region assignment (the second one) triggered by SSH is skipped. SSH region assignment should not be skipped at all. The issue should also have occurred previously for RITs in the failed open state. Below are related log lines: {code} 2013-12-06 20:47:45,903 INFO [AM.-pool62-t1] master.AssignmentManager(1764): Skip assigning testOfflineRegionReAssginedAfterMasterRestart,I,1386391663080.be7906f27d850789818867916aa08c93., it is on a dead but not processed yet server ... 2013-12-06 20:47:45,926 INFO [localhost.localdomain,59276,1386391665426-GeneralBulkAssigner-2] master.AssignmentManager(1447): Skip assigning testOfflineRegionReAssginedAfterMasterRestart,I,1386391663080.be7906f27d850789818867916aa08c93., it's host localhost.localdomain,47661,1386391655958 is dead but not processed yet {code} testOfflineRegionReAssginedAfterMasterRestart times out sometimes. -- Key: HBASE-10101 URL: https://issues.apache.org/jira/browse/HBASE-10101 Project: HBase Issue Type: Test Reporter: Jimmy Xiang Priority: Minor Attachments: hbase-10101.patch, test.log Sometimes, I got this test timed out. The log is attached. It could be because the new cluster takes a while to process the dead server, or assign meta. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-10101) testOfflineRegionReAssginedAfterMasterRestart times out sometimes.
[ https://issues.apache.org/jira/browse/HBASE-10101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-10101: -- Attachment: hbase-10101.patch testOfflineRegionReAssginedAfterMasterRestart times out sometimes. -- Key: HBASE-10101 URL: https://issues.apache.org/jira/browse/HBASE-10101 Project: HBase Issue Type: Test Reporter: Jimmy Xiang Priority: Minor Attachments: hbase-10101.patch, test.log Sometimes, I got this test timed out. The log is attached. It could be because the new cluster takes a while to process the dead server, or assign meta. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-10101) testOfflineRegionReAssginedAfterMasterRestart times out sometimes.
[ https://issues.apache.org/jira/browse/HBASE-10101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong updated HBASE-10101: -- Status: Patch Available (was: Open) testOfflineRegionReAssginedAfterMasterRestart times out sometimes. -- Key: HBASE-10101 URL: https://issues.apache.org/jira/browse/HBASE-10101 Project: HBase Issue Type: Test Reporter: Jimmy Xiang Priority: Minor Attachments: hbase-10101.patch, test.log Sometimes, I got this test timed out. The log is attached. It could be because the new cluster takes a while to process the dead server, or assign meta. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10048) Add hlog number metric in regionserver
[ https://issues.apache.org/jira/browse/HBASE-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842167#comment-13842167 ] Liu Shaohui commented on HBASE-10048: - [~lhofhansl] The total hlog size metric, which is in the trunk patch, is not contained in the 0.94 patch. Please wait and I will update the 0.94 patch. Add hlog number metric in regionserver -- Key: HBASE-10048 URL: https://issues.apache.org/jira/browse/HBASE-10048 Project: HBase Issue Type: Improvement Components: metrics Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 0.98.0, 0.96.1, 0.99.0 Attachments: HBASE-10048-0.94-v1.diff, HBASE-10048-0.94-v2.diff, HBASE-10048-trunk-v1.diff, HBASE-10048-trunk-v2.diff, HBASE-10048-trunk-v3.diff, HBASE-10048-trunk-v4.diff Add hlog number metric in regionserver. We can use this metric to alert about memstore flush because of too many hlogs. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9892) Add info port to ServerName to support multi instances in a node
[ https://issues.apache.org/jira/browse/HBASE-9892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842168#comment-13842168 ] Liu Shaohui commented on HBASE-9892: [~stack] Sorry that the trunk patch makes the test TestMetricsRegionServerSourceImpl fail. I will find the reason and update the trunk patch later. Add info port to ServerName to support multi instances in a node Key: HBASE-9892 URL: https://issues.apache.org/jira/browse/HBASE-9892 Project: HBase Issue Type: Improvement Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 0.98.0, 0.96.1, 0.99.0 Attachments: HBASE-9892-0.94-v1.diff, HBASE-9892-0.94-v2.diff, HBASE-9892-0.94-v3.diff, HBASE-9892-0.94-v4.diff, HBASE-9892-0.94-v5.diff, HBASE-9892-trunk-v1.diff, HBASE-9892-trunk-v1.patch, HBASE-9892-trunk-v1.patch, HBASE-9892-v5.txt The full GC time of a regionserver with a big heap ( 30G ) usually cannot be kept within 30s. At the same time, servers with 64G memory are normal. So we try to deploy multiple rs instances (2-3) on a single node, with the heap of each rs at about 20G ~ 24G. Most things work fine, except the hbase web ui. The master gets the RS info port from conf, which is not suitable for this situation of multiple rs instances on a node. So we add the info port to ServerName. a. At startup, the rs reports its info port to Hmaster. b. For the root region, the rs writes the servername with info port to the zookeeper root-region-server node. c. For meta regions, the rs writes the servername with info port to the root region. d. For user regions, the rs writes the servername with info port to the meta regions. So hmaster and client can get the info port from the servername. To test this feature, I changed the rs num from 1 to 3 in standalone mode, so we can test it in standalone mode. I think Hoya (hbase on yarn) will encounter the same problem. Does anyone know how Hoya handles this problem? 
PS: There are different formats for servername in the zk node and the meta table; I think we need to unify them and refactor the code. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10097) Remove a region name string creation in HRegion#nextInternal
[ https://issues.apache.org/jira/browse/HBASE-10097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842172#comment-13842172 ] Hudson commented on HBASE-10097: FAILURE: Integrated in hbase-0.96-hadoop2 #143 (See [https://builds.apache.org/job/hbase-0.96-hadoop2/143/]) HBASE-10097 Remove a region name string creation in HRegion#nextInternal (nkeywal: rev 1548712) * /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcCallContext.java * /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java * /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java Remove a region name string creation in HRegion#nextInternal Key: HBASE-10097 URL: https://issues.apache.org/jira/browse/HBASE-10097 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.98.0, 0.96.1, 0.99.0 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Priority: Critical Fix For: 0.96.1, 0.98.1, 0.99.0 Attachments: 10097.v1.patch We're creating a String in each nextInternal. Before HBASE-9983 this was cached, but it's not the case anymore... -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10085) Some regions aren't re-assigned after a cluster restarts
[ https://issues.apache.org/jira/browse/HBASE-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842174#comment-13842174 ] Hudson commented on HBASE-10085: FAILURE: Integrated in hbase-0.96-hadoop2 #143 (See [https://builds.apache.org/job/hbase-0.96-hadoop2/143/]) HBASE-10085: Some regions aren't re-assigned after a master restarts (jeffreyz: rev 1548728) * /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java * /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStates.java * /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManagerOnCluster.java Some regions aren't re-assigned after a cluster restarts Key: HBASE-10085 URL: https://issues.apache.org/jira/browse/HBASE-10085 Project: HBase Issue Type: Bug Components: Region Assignment Affects Versions: 0.96.1 Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong Fix For: 0.98.0, 0.96.1 Attachments: hbase-10085.patch We saw this issue happen during a cluster restart: 1) When shutting down a cluster, some regions are in the offline state because no region servers are available (stop RS and then Master). 2) When the cluster restarts, the offlined regions are forced offline again and SSH skips re-assigning them in the function AM.processServerShutdown, as shown below. {code} 2013-12-03 10:41:56,686 INFO [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: Processing 873dbd8c269f44d0aefb0f66c5b53537 in state: M_ZK_REGION_OFFLINE 2013-12-03 10:41:56,686 DEBUG [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: RIT 873dbd8c269f44d0aefb0f66c5b53537 in state=M_ZK_REGION_OFFLINE was on deadserver; forcing offline ... 
2013-12-03 10:41:56,739 DEBUG [AM.-pool1-t8] master.AssignmentManager: Force region state offline {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, ts=1386067316737, server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696} ... 2013-12-03 10:41:57,223 WARN [MASTER_SERVER_OPERATIONS-h2-ubuntu12-sec-1386048659-hbase-8:6-3] master.RegionStates: THIS SHOULD NOT HAPPEN: unexpected {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, ts=1386067316737, server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696} {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10094) Add batching to HLogPerformanceEvaluation
[ https://issues.apache.org/jira/browse/HBASE-10094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842173#comment-13842173 ] Hudson commented on HBASE-10094: FAILURE: Integrated in hbase-0.96-hadoop2 #143 (See [https://builds.apache.org/job/hbase-0.96-hadoop2/143/]) HBASE-10094 Add batching to HLogPerformanceEvaluation (stack: rev 1548754) * /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java * /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java * /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/HLogPerformanceEvaluation.java Add batching to HLogPerformanceEvaluation - Key: HBASE-10094 URL: https://issues.apache.org/jira/browse/HBASE-10094 Project: HBase Issue Type: Sub-task Components: Performance, wal Reporter: stack Assignee: Himanshu Vashishtha Fix For: 0.98.0, 0.96.1, 0.99.0 Attachments: 10094v2.txt As Himanshu points out in the parent issue, HLogPE is using an unorthodox API for appending edits to the WAL; it is using an API that is meant for tests only and that does an append immediately followed by a sync call. In a normal deployment, WAL appends are done as a bunch of appends followed by a sync on the tail of the transaction -- not a sync per append. This issue is about changing HLogPE to use append and then sync. It also adds an argument so you can specify batching of a set of appends before the sync is called. The latter lets HLogPE mimic multi puts that use the minibatch... which appends, appends, appends.. and then syncs. Assigning to Himanshu for review. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10093) Unregister ReplicationSource metric bean when the replication source thread is terminated
[ https://issues.apache.org/jira/browse/HBASE-10093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842176#comment-13842176 ] Hudson commented on HBASE-10093: SUCCESS: Integrated in HBase-0.94-security #354 (See [https://builds.apache.org/job/HBase-0.94-security/354/]) HBASE-10093 Unregister ReplicationSource metric bean when the replication source thread is terminated (cuijianwei) (larsh: rev 1548802) * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceMetrics.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationStatistics.java Unregister ReplicationSource metric bean when the replication source thread is terminated -- Key: HBASE-10093 URL: https://issues.apache.org/jira/browse/HBASE-10093 Project: HBase Issue Type: Improvement Components: Replication Affects Versions: 0.94.14 Reporter: cuijianwei Assignee: cuijianwei Fix For: 0.94.15 Attachments: HBASE-10093-0.94-v1.patch Each replication source thread will register a metric bean to show its statistics. The source threads will be terminated when the region server exits, and the metric beans will be removed. However, a replication source thread may also be terminated when the user removes the peer explicitly, or when it has just taken over a recovered queue and finished replicating the queued HLogs. In these situations, the metric bean won't be unregistered, and the user may be confused to keep seeing statistics from terminated replication source threads. Maybe it is clearer to remove the metric bean after the replication source thread has terminated? Then the statistics will only come from active replication sources. -- This message was sent by Atlassian JIRA (v6.1#6144)
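Unregistering a metric bean when its owning thread finishes can be sketched with the standard `javax.management` API, as below. The class names, the `sketch.replication` domain, and the `peer` key are assumptions made for illustration; this is not the ReplicationSourceMetrics code from the patch.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical sketch; names and the JMX domain are illustrative only.
final class SourceMetricsSketch {
    // Standard MBean interface: must be public and named <Impl>MBean.
    public interface SourceStatsMBean {
        long getShippedEdits();
    }

    public static final class SourceStats implements SourceStatsMBean {
        private final AtomicLong shippedEdits = new AtomicLong();
        public long getShippedEdits() { return shippedEdits.get(); }
    }

    private static ObjectName nameFor(String peerId) throws JMException {
        return new ObjectName("sketch.replication:type=Source,peer=" + peerId);
    }

    static boolean isRegistered(String peerId) {
        try {
            return ManagementFactory.getPlatformMBeanServer().isRegistered(nameFor(peerId));
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }

    // Registers the bean, runs the source's work, and always unregisters the
    // bean afterwards, whatever the reason the work ended.
    static void runSource(String peerId, Runnable work) {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        try {
            ObjectName name = nameFor(peerId);
            server.registerMBean(new SourceStats(), name);
            try {
                work.run();
            } finally {
                server.unregisterMBean(name);
            }
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Putting the unregister call in a `finally` block covers the cases listed above (peer removal, a drained recovered queue) as well as region server shutdown, so only active sources remain visible.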
[jira] [Commented] (HBASE-10061) TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in thrown NPE
[ https://issues.apache.org/jira/browse/HBASE-10061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842177#comment-13842177 ] Hudson commented on HBASE-10061: SUCCESS: Integrated in HBase-0.94-security #354 (See [https://builds.apache.org/job/HBase-0.94-security/354/]) HBASE-10061 TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in thrown NPE (Amit Sela) (ndimiduk: rev 1548750) * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/TableMapReduceUtil.java TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in thrown NPE -- Key: HBASE-10061 URL: https://issues.apache.org/jira/browse/HBASE-10061 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.94.12 Reporter: Amit Sela Assignee: Amit Sela Priority: Minor Fix For: 0.98.0, 0.96.1, 0.94.15, 0.99.0 Attachments: 10061-trunk.txt, 10061-trunk.txt, HBASE-10061.patch TableMapReduceUtil.findOrCreateJar line 596: jar = getJar(my_class); updateMap(jar, packagedClasses); In case getJar returns null, updateMap will throw NPE. Should check null==jar before calling updateMap. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10061) TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in thrown NPE
[ https://issues.apache.org/jira/browse/HBASE-10061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842175#comment-13842175 ] Hudson commented on HBASE-10061: FAILURE: Integrated in hbase-0.96-hadoop2 #143 (See [https://builds.apache.org/job/hbase-0.96-hadoop2/143/]) HBASE-10061 TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in thrown NPE (Amit Sela) (ndimiduk: rev 1548749) * /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TableMapReduceUtil.java TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in thrown NPE -- Key: HBASE-10061 URL: https://issues.apache.org/jira/browse/HBASE-10061 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.94.12 Reporter: Amit Sela Assignee: Amit Sela Priority: Minor Fix For: 0.98.0, 0.96.1, 0.94.15, 0.99.0 Attachments: 10061-trunk.txt, 10061-trunk.txt, HBASE-10061.patch TableMapReduceUtil.findOrCreateJar line 596: jar = getJar(my_class); updateMap(jar, packagedClasses); In case getJar returns null, updateMap will throw NPE. Should check null==jar before calling updateMap. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10094) Add batching to HLogPerformanceEvaluation
[ https://issues.apache.org/jira/browse/HBASE-10094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842180#comment-13842180 ] Hudson commented on HBASE-10094: FAILURE: Integrated in hbase-0.98-hadoop2 #2 (See [https://builds.apache.org/job/hbase-0.98-hadoop2/2/]) HBASE-10094 Add batching to HLogPerformanceEvaluation (stack: rev 1548753) * /hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java * /hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java * /hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/HLogPerformanceEvaluation.java Add batching to HLogPerformanceEvaluation - Key: HBASE-10094 URL: https://issues.apache.org/jira/browse/HBASE-10094 Project: HBase Issue Type: Sub-task Components: Performance, wal Reporter: stack Assignee: Himanshu Vashishtha Fix For: 0.98.0, 0.96.1, 0.99.0 Attachments: 10094v2.txt As Himanshu points out in the parent issue, HLogPE is using an unorthodox API for appending edits to the WAL; it is using an API that is meant for tests only and that does an append immediately followed by a sync call. In a normal deployment, WAL appends are done as a bunch of appends followed by a sync on the tail of the transaction -- not a sync per append. This issue is about changing HLogPE to use append and then sync. It also adds an argument so you can specify batching of a set of appends before the sync is called. The latter lets HLogPE mimic multi puts that use the minibatch... which appends, appends, appends.. and then syncs. Assigning to Himanshu for review. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10061) TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in thrown NPE
[ https://issues.apache.org/jira/browse/HBASE-10061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842181#comment-13842181 ] Hudson commented on HBASE-10061: FAILURE: Integrated in hbase-0.98-hadoop2 #2 (See [https://builds.apache.org/job/hbase-0.98-hadoop2/2/]) HBASE-10061 TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in thrown NPE (Amit Sela) (ndimiduk: rev 1548748) * /hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TableMapReduceUtil.java TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in thrown NPE -- Key: HBASE-10061 URL: https://issues.apache.org/jira/browse/HBASE-10061 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.94.12 Reporter: Amit Sela Assignee: Amit Sela Priority: Minor Fix For: 0.98.0, 0.96.1, 0.94.15, 0.99.0 Attachments: 10061-trunk.txt, 10061-trunk.txt, HBASE-10061.patch TableMapReduceUtil.findOrCreateJar line 596: jar = getJar(my_class); updateMap(jar, packagedClasses); In case getJar returns null, updateMap will throw NPE. Should check null==jar before calling updateMap. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10048) Add hlog number metric in regionserver
[ https://issues.apache.org/jira/browse/HBASE-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842179#comment-13842179 ] Hudson commented on HBASE-10048: FAILURE: Integrated in hbase-0.98-hadoop2 #2 (See [https://builds.apache.org/job/hbase-0.98-hadoop2/2/]) HBASE-10048 Add hlog number metric in regionserver (stack: rev 1548769) * /hbase/branches/0.98/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSource.java * /hbase/branches/0.98/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerWrapper.java * /hbase/branches/0.98/hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSourceImpl.java * /hbase/branches/0.98/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSourceImpl.java * /hbase/branches/0.98/hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/regionserver/ServerMetricsTmpl.jamon * /hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerWrapperImpl.java * /hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java * /hbase/branches/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java * /hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java * /hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerWrapperStub.java * /hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestMetricsRegionServer.java * /hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/HLogUtilsForTests.java * /hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLog.java * /hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java Add 
hlog number metric in regionserver -- Key: HBASE-10048 URL: https://issues.apache.org/jira/browse/HBASE-10048 Project: HBase Issue Type: Improvement Components: metrics Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 0.98.0, 0.96.1, 0.99.0 Attachments: HBASE-10048-0.94-v1.diff, HBASE-10048-0.94-v2.diff, HBASE-10048-trunk-v1.diff, HBASE-10048-trunk-v2.diff, HBASE-10048-trunk-v3.diff, HBASE-10048-trunk-v4.diff Add hlog number metric in regionserver. We can use this metric to alert about memstore flush because of too many hlogs. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9955) Make hadoop2 the default and deprecate hadoop1
[ https://issues.apache.org/jira/browse/HBASE-9955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842178#comment-13842178 ] Hudson commented on HBASE-9955: --- FAILURE: Integrated in hbase-0.98-hadoop2 #2 (See [https://builds.apache.org/job/hbase-0.98-hadoop2/2/]) HBASE-9955 Make hadoop2 the default and deprecate hadoop1 (stack: rev 1548736) * /hbase/branches/0.98/dev-support/test-patch.sh * /hbase/branches/0.98/hbase-client/pom.xml * /hbase/branches/0.98/hbase-common/pom.xml * /hbase/branches/0.98/hbase-examples/pom.xml * /hbase/branches/0.98/hbase-it/pom.xml * /hbase/branches/0.98/hbase-prefix-tree/pom.xml * /hbase/branches/0.98/hbase-server/pom.xml * /hbase/branches/0.98/hbase-shell/pom.xml * /hbase/branches/0.98/hbase-testing-util/pom.xml * /hbase/branches/0.98/hbase-thrift/pom.xml * /hbase/branches/0.98/pom.xml Make hadoop2 the default and deprecate hadoop1 -- Key: HBASE-9955 URL: https://issues.apache.org/jira/browse/HBASE-9955 Project: HBase Issue Type: Task Reporter: stack Assignee: stack Fix For: 0.98.0 Attachments: 9955.txt, 9955v2.txt, 9955v3.txt, 9955v4.txt, 9955v5.098.txt, 9955v5.txt, addendum.txt, addendum.txt See "Hadoop version trunk dependency?" on the dev mailing list. Consensus seems to be forming to do the subject line (Recheck the mail thread before going ahead). -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10093) Unregister ReplicationSource metric bean when the replication source thread is terminated
[ https://issues.apache.org/jira/browse/HBASE-10093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842188#comment-13842188 ] Hudson commented on HBASE-10093: SUCCESS: Integrated in HBase-0.94 #1220 (See [https://builds.apache.org/job/HBase-0.94/1220/]) HBASE-10093 Unregister ReplicationSource metric bean when the replication source thread is terminated (cuijianwei) (larsh: rev 1548802) * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceMetrics.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationStatistics.java Unregister ReplicationSource metric bean when the replication source thread is terminated -- Key: HBASE-10093 URL: https://issues.apache.org/jira/browse/HBASE-10093 Project: HBase Issue Type: Improvement Components: Replication Affects Versions: 0.94.14 Reporter: cuijianwei Assignee: cuijianwei Fix For: 0.94.15 Attachments: HBASE-10093-0.94-v1.patch Each replication source thread registers a metric bean to show its statistics. The source threads are terminated when the region server exits, and the metric beans are removed. However, a replication source thread may also be terminated when the user removes the peer explicitly, or when it takes a recovered queue and finishes replicating the queued HLogs. In these situations the metric bean is not unregistered, and the user may be confused by statistics that still appear for terminated replication source threads. It may be clearer to remove the metric bean after the replication source thread terminates, so that the statistics come only from active replication sources. -- This message was sent by Atlassian JIRA (v6.1#6144)
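As a rough illustration of the behaviour HBASE-10093 asks for, the sketch below registers a standard JMX MBean when a source starts and unregisters it when the source terminates. It uses only the JDK's javax.management API. The class names, the "sketch.replication" domain, and the getShippedOps attribute are assumptions made for this example; the actual patch touches the 0.94 ReplicationSourceMetrics and ReplicationStatistics classes, which are not reproduced here.

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class ReplicationBeanSketch {
    /** Standard MBean interface; its name must be the implementing class name plus "MBean". */
    public interface SourceStatsMBean {
        long getShippedOps();
    }

    /** Hypothetical per-source statistics holder. */
    public static class SourceStats implements SourceStatsMBean {
        private volatile long shippedOps;

        @Override
        public long getShippedOps() {
            return shippedOps;
        }
    }

    private static final MBeanServer SERVER = ManagementFactory.getPlatformMBeanServer();

    /** Register a bean for the source replicating to `peerId` (called when the thread starts). */
    static ObjectName register(String peerId) {
        try {
            ObjectName name = new ObjectName("sketch.replication:type=Source,peer=" + peerId);
            SERVER.registerMBean(new SourceStats(), name);
            return name;
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Unregister the bean on every termination path, not only at region server exit. */
    static void terminate(ObjectName name) {
        try {
            if (SERVER.isRegistered(name)) {
                SERVER.unregisterMBean(name);
            }
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }

    /** True while monitoring tools can still see statistics for this source. */
    static boolean isVisible(ObjectName name) {
        return SERVER.isRegistered(name);
    }
}
```

Calling terminate from the thread's shutdown path covers the peer-removal and recovered-queue cases the issue describes, since both end with the source thread exiting.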
[jira] [Commented] (HBASE-10061) TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in thrown NPE
[ https://issues.apache.org/jira/browse/HBASE-10061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842189#comment-13842189 ] Hudson commented on HBASE-10061: SUCCESS: Integrated in HBase-0.94 #1220 (See [https://builds.apache.org/job/HBase-0.94/1220/]) HBASE-10061 TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in thrown NPE (Amit Sela) (ndimiduk: rev 1548750) * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/TableMapReduceUtil.java TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in thrown NPE -- Key: HBASE-10061 URL: https://issues.apache.org/jira/browse/HBASE-10061 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.94.12 Reporter: Amit Sela Assignee: Amit Sela Priority: Minor Fix For: 0.98.0, 0.96.1, 0.94.15, 0.99.0 Attachments: 10061-trunk.txt, 10061-trunk.txt, HBASE-10061.patch TableMapReduceUtil.findOrCreateJar line 596: jar = getJar(my_class); updateMap(jar, packagedClasses); In case getJar returns null, updateMap will throw NPE. Should check null==jar before calling updateMap. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10010) eliminate the put latency spike on the new log file beginning
[ https://issues.apache.org/jira/browse/HBASE-10010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842191#comment-13842191 ] Liang Xie commented on HBASE-10010: --- The test change is not a mistake. After the sync, we can open an empty new file without failure, so we need to remove the original assert statement. eliminate the put latency spike on the new log file beginning - Key: HBASE-10010 URL: https://issues.apache.org/jira/browse/HBASE-10010 Project: HBase Issue Type: Improvement Components: regionserver Affects Versions: 0.94.13 Reporter: Liang Xie Assignee: Liang Xie Attachments: HBase-10010-0.94-v2.txt, HBase-10010-0.94-v3.txt, HBase-10010-0.94.txt, HBase-10010-trunk-v2.txt, HBase-10010-trunk.txt Indeed, the original finding came from fb; see HBASE-6813 for detailed discussion. Though this improvement is not expected to bring an obvious gain in 95th or 99th percentile latency, it could still make the response time more stable, in my view. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10089) Metrics intern table names cause eventual permgen OOM in 0.94
[ https://issues.apache.org/jira/browse/HBASE-10089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842192#comment-13842192 ] Jean-Marc Spaggiari commented on HBASE-10089: - I have some free time this weekend to test it if you want, but I don't think we will see much difference. One option is to script a loop that creates and drops tables in the two versions and compare; for perf, maybe something that does random table gets with existing and non-existing names, and compare? Metrics intern table names cause eventual permgen OOM in 0.94 - Key: HBASE-10089 URL: https://issues.apache.org/jira/browse/HBASE-10089 Project: HBase Issue Type: Bug Affects Versions: 0.94.0, 0.94.14 Reporter: Dave Latham Assignee: Ted Yu Priority: Minor Fix For: 0.94.15 Attachments: 10089-0.94.txt As part of the metrics system introduced in HBASE-4768 there are two places where hbase uses String interning ( SchemaConfigured and SchemaMetrics ). This includes interning table names. We have a long-running environment where we run regular integration tests on our application using hbase. Those tests create and drop tables with new names regularly. This leads to filling up the permgen with interned table names. The workaround is to periodically restart the region servers. -- This message was sent by Atlassian JIRA (v6.1#6144)
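For context on the interning problem in HBASE-10089, here is a minimal sketch of one common alternative to String.intern(): a heap-based canonicalizing pool built on weak references, so that names of dropped tables can be garbage collected. This is an illustration under stated assumptions, not the contents of 10089-0.94.txt; the class and method names are invented for the example.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

class WeakNamePool {
    // Keys and values are both weakly held, so an entry disappears once no caller
    // keeps a strong reference to the canonical String.
    private static final Map<String, WeakReference<String>> POOL = new WeakHashMap<>();

    /** Return one shared instance per distinct name without using String.intern(). */
    static synchronized String canonical(String name) {
        WeakReference<String> ref = POOL.get(name);
        String cached = (ref == null) ? null : ref.get();
        if (cached != null) {
            return cached;
        }
        POOL.put(name, new WeakReference<>(name));
        return name;
    }

    public static void main(String[] args) {
        String a = canonical(new String("table_1"));
        String b = canonical(new String("table_1"));
        System.out.println(a == b); // same instance while `a` is strongly reachable
    }
}
```

The trade-off is a synchronized map lookup on each call, which is the kind of cost the create/drop and random-get comparison suggested in the comment above would measure.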
[jira] [Commented] (HBASE-10101) testOfflineRegionReAssginedAfterMasterRestart times out sometimes.
[ https://issues.apache.org/jira/browse/HBASE-10101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842193#comment-13842193 ] Hadoop QA commented on HBASE-10101: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12617549/hbase-10101.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8085//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8085//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8085//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8085//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8085//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8085//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8085//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8085//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8085//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8085//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8085//console This message is automatically generated. testOfflineRegionReAssginedAfterMasterRestart times out sometimes. -- Key: HBASE-10101 URL: https://issues.apache.org/jira/browse/HBASE-10101 Project: HBase Issue Type: Test Reporter: Jimmy Xiang Priority: Minor Attachments: hbase-10101.patch, test.log Sometimes, I got this test timed out. The log is attached. It could be because the new cluster takes a while to process the dead server, or assign meta. 
[jira] [Commented] (HBASE-10100) Hbase replication cluster can have varying peers under certain conditions
[ https://issues.apache.org/jira/browse/HBASE-10100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842197#comment-13842197 ] Jean-Marc Spaggiari commented on HBASE-10100: - [~jdcryans] probably something you want to look at... Hbase replication cluster can have varying peers under certain conditions - Key: HBASE-10100 URL: https://issues.apache.org/jira/browse/HBASE-10100 Project: HBase Issue Type: Bug Affects Versions: 0.94.5, 0.95.0, 0.96.0 Reporter: churro morales We were trying to replicate hbase data over to a new datacenter recently. After we turned on replication and then did our copy tables, we noticed that verify replication had discrepancies. We ran a list_peers and it returned both peers, the original datacenter we were replicating to and the new datacenter (this was correct). When grepping through the logs for a few regionservers we noticed that a few regionservers had the following entry in their logs: 2013-09-26 10:55:46,907 ERROR org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager: Error while adding a new peer java.net.UnknownHostException: xxx.xxx.flurry.com (this was due to a transient dns issue) Thus a very small subset of our regionservers were not replicating to this new cluster while most were. We probably don't want to abort if this type of issue comes up; it could potentially be fatal if someone does an add_peer operation with a typo. This could potentially shut down the cluster. One solution I can think of is keeping some flag in ReplicationSourceManager which is a boolean that keeps track of whether there was an errorAddingPeer. Then in the logPositionAndCleanOldLogs we can do something like: {code} if (errorAddingPeer) { LOG.error("There was an error adding a peer, logs will not be marked for deletion"); return; } {code} thus we are not deleting these logs from the queue. 
You will notice your replicating queue rising on certain machines and you can still replay the logs, thus avoiding a lengthy copy table. I have a patch (with unit test) for the above proposal, if everyone thinks that is an okay solution. An additional idea would be to add some retry logic inside the PeersWatcher class for the nodeChildrenChanged method. Thus if there happens to be some issue we could sort it out without having to bounce that particular regionserver. Would love to hear everyone's thoughts. -- This message was sent by Atlassian JIRA (v6.1#6144)
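The flag-based proposal in HBASE-10100 can be sketched in isolation as follows. PeerTrackingSketch is a hypothetical toy class, not ReplicationSourceManager; the `resolvable` parameter simply models whether the peer's host name could be resolved, and the method names are assumptions for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class PeerTrackingSketch {
    private final Deque<String> queuedLogs = new ArrayDeque<>();
    private volatile boolean errorAddingPeer = false;

    /** Hypothetical peer registration; a failure is recorded instead of aborting. */
    void addPeer(String peerId, boolean resolvable) {
        if (!resolvable) {
            errorAddingPeer = true; // remember the failure so queued logs are kept
        }
    }

    void enqueueLog(String logName) {
        queuedLogs.add(logName);
    }

    /** Drop replicated logs unless a peer failed to register; returns how many were removed. */
    int cleanOldLogs() {
        if (errorAddingPeer) {
            // Keep the logs so they can still be replayed to the missing peer.
            return 0;
        }
        int removed = queuedLogs.size();
        queuedLogs.clear();
        return removed;
    }

    int queueSize() {
        return queuedLogs.size();
    }
}
```

A single boolean cannot tell which peer failed, and this sketch never clears it, so a real change would need some way to reset the flag once the peer is added successfully, for example through the retry logic the issue also suggests.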
[jira] [Commented] (HBASE-9648) collection one expired storefile causes it to be replaced by another expired storefile
[ https://issues.apache.org/jira/browse/HBASE-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842202#comment-13842202 ] Jean-Marc Spaggiari commented on HBASE-9648: So. Back on this JIRA ;) I think patch HBASE-9648-v1-trunk.patch can fix the issue the user faced on the mailing list. It basically does what needs to be done to avoid this at the first level. It passed Hadoop QA, but I can also port it to 0.94 and give it a bigger try on my own cluster... collection one expired storefile causes it to be replaced by another expired storefile -- Key: HBASE-9648 URL: https://issues.apache.org/jira/browse/HBASE-9648 Project: HBase Issue Type: Bug Components: Compaction Reporter: Sergey Shelukhin Assignee: Jean-Marc Spaggiari Attachments: HBASE-9648-v0-0.94.patch, HBASE-9648-v0-trunk.patch, HBASE-9648-v1-trunk.patch, HBASE-9648.patch There's a shortcut in compaction selection that selects expired store files so they can be deleted quickly. However, there's also the code that ensures we write at least one file to preserve seqnum. This new empty file is expired, because it has no data, presumably. So it's collected again, etc. This affects 94, probably also 96. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9806) Add PerfEval tool for BlockCache
[ https://issues.apache.org/jira/browse/HBASE-9806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842207#comment-13842207 ] Jean-Marc Spaggiari commented on HBASE-9806: Working on that right now... Add PerfEval tool for BlockCache Key: HBASE-9806 URL: https://issues.apache.org/jira/browse/HBASE-9806 Project: HBase Issue Type: Test Components: Performance, test Reporter: Nick Dimiduk Assignee: Nick Dimiduk Attachments: HBASE-9806.00.patch, HBASE-9806.01.patch, conf_20g.patch, conf_3g.patch, test1_run1_20g.pdf, test1_run1_3g.pdf, test1_run2_20g.pdf, test1_run2_3g.pdf We have at least three different block caching layers with myriad configuration settings. Let's add a tool for evaluating memory allocations and configuration combinations with different access patterns. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-10094) Add batching to HLogPerformanceEvaluation
[ https://issues.apache.org/jira/browse/HBASE-10094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-10094: -- Attachment: addendum.0.96.txt My backport broke the 0.96 build. Here is the fix I applied. Add batching to HLogPerformanceEvaluation - Key: HBASE-10094 URL: https://issues.apache.org/jira/browse/HBASE-10094 Project: HBase Issue Type: Sub-task Components: Performance, wal Reporter: stack Assignee: Himanshu Vashishtha Fix For: 0.98.0, 0.96.1, 0.99.0 Attachments: 10094v2.txt, addendum.0.96.txt As Himanshu points out in the parent issue, HLogPE is using an unorthodox API to append edits to the WAL; it is using an API that is meant for tests only that does an append immediately followed by a sync call. In a normal deploy, WAL appends are done as a bunch of appends followed by a sync on the tail of the transaction -- not a sync per append. This issue is about changing HLogPE to use append and then sync. It also adds an argument so you can specify batching of a set of appends before the sync is called. The latter lets HLogPE mimic multi puts that use the minibatch... which appends, appends, appends.. and then syncs. Assigning to Himanshu for review. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-10048) Add hlog number metric in regionserver
[ https://issues.apache.org/jira/browse/HBASE-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-10048: -- Attachment: 10048.096.v4.txt Patch for 0.96 is a bit different. Uploading what I applied. Add hlog number metric in regionserver -- Key: HBASE-10048 URL: https://issues.apache.org/jira/browse/HBASE-10048 Project: HBase Issue Type: Improvement Components: metrics Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 0.98.0, 0.96.1, 0.99.0 Attachments: 10048.096.v4.txt, HBASE-10048-0.94-v1.diff, HBASE-10048-0.94-v2.diff, HBASE-10048-trunk-v1.diff, HBASE-10048-trunk-v2.diff, HBASE-10048-trunk-v3.diff, HBASE-10048-trunk-v4.diff Add hlog number metric in regionserver. We can use this metric to alert about memstore flush because of too many hlogs. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10010) eliminate the put latency spike on the new log file beginning
[ https://issues.apache.org/jira/browse/HBASE-10010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842223#comment-13842223 ] stack commented on HBASE-10010: --- OK. Thanks. Will commit as soon as the trunk build restabilizes. eliminate the put latency spike on the new log file beginning - Key: HBASE-10010 URL: https://issues.apache.org/jira/browse/HBASE-10010 Project: HBase Issue Type: Improvement Components: regionserver Affects Versions: 0.94.13 Reporter: Liang Xie Assignee: Liang Xie Attachments: HBase-10010-0.94-v2.txt, HBase-10010-0.94-v3.txt, HBase-10010-0.94.txt, HBase-10010-trunk-v2.txt, HBase-10010-trunk.txt Indeed, the original finding came from fb; see HBASE-6813 for detailed discussion. Though this improvement is not expected to bring an obvious gain in 95th or 99th percentile latency, it could still make the response time more stable, in my view. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10048) Add hlog number metric in regionserver
[ https://issues.apache.org/jira/browse/HBASE-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842225#comment-13842225 ] Hadoop QA commented on HBASE-10048: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12617555/10048.096.v4.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 18 new or modified tests. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8086//console This message is automatically generated. Add hlog number metric in regionserver -- Key: HBASE-10048 URL: https://issues.apache.org/jira/browse/HBASE-10048 Project: HBase Issue Type: Improvement Components: metrics Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 0.98.0, 0.96.1, 0.99.0 Attachments: 10048.096.v4.txt, HBASE-10048-0.94-v1.diff, HBASE-10048-0.94-v2.diff, HBASE-10048-trunk-v1.diff, HBASE-10048-trunk-v2.diff, HBASE-10048-trunk-v3.diff, HBASE-10048-trunk-v4.diff Add hlog number metric in regionserver. We can use this metric to alert about memstore flush because of too many hlogs. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10000) Initiate lease recovery for outstanding WAL files at the very beginning of recovery
[ https://issues.apache.org/jira/browse/HBASE-10000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842227#comment-13842227 ] stack commented on HBASE-10000: --- Needs evidence that it works on a real cluster and that edits are properly recovered before commit. The last time there was messing in this area it was a bit of a disaster: see HBASE-8389 and its subsequent fixup issues. Initiate lease recovery for outstanding WAL files at the very beginning of recovery --- Key: HBASE-10000 URL: https://issues.apache.org/jira/browse/HBASE-10000 Project: HBase Issue Type: Improvement Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.98.1 Attachments: 1-recover-ts-with-pb-2.txt, 1-recover-ts-with-pb-3.txt, 1-recover-ts-with-pb-4.txt, 1-v1.txt, 1-v4.txt, 1-v5.txt, 1-v6.txt At the beginning of recovery, master can send lease recovery requests concurrently for outstanding WAL files using a thread pool. Each split worker would first check whether the WAL file it processes is closed. Thanks to Nicolas Liochon and Jeffery, discussion with whom gave rise to this idea. -- This message was sent by Atlassian JIRA (v6.1#6144)
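The idea in this issue, sending lease recovery requests for all outstanding WAL files concurrently from a thread pool, can be sketched with the JDK's ExecutorService. The recoverLease method below is a hypothetical placeholder that only pretends to contact the filesystem; the class name, the path strings, and the pool size are assumptions for illustration and are not taken from the attached patches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class LeaseRecoverySketch {
    /** Hypothetical stand-in for one lease recovery request; true means the file is now closed. */
    static boolean recoverLease(String walPath) {
        return !walPath.isEmpty();
    }

    /** Issue all recovery requests up front and return the paths reported as closed. */
    static List<String> recoverAll(List<String> walPaths, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Boolean>> tasks = new ArrayList<>();
            for (String path : walPaths) {
                tasks.add(() -> recoverLease(path));
            }
            // invokeAll blocks until every task finishes; futures come back in task order.
            List<Future<Boolean>> results = pool.invokeAll(tasks);
            List<String> closed = new ArrayList<>();
            for (int i = 0; i < results.size(); i++) {
                if (results.get(i).get()) {
                    closed.add(walPaths.get(i));
                }
            }
            return closed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

A split worker would then check whether its file is already closed before processing it, as the description says, rather than starting lease recovery itself.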
[jira] [Commented] (HBASE-10048) Add hlog number metric in regionserver
[ https://issues.apache.org/jira/browse/HBASE-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842230#comment-13842230 ] stack commented on HBASE-10048: --- Committed the 0.96 patch. Leaving open for 0.94 patch. Thanks [~liushaohui] Add hlog number metric in regionserver -- Key: HBASE-10048 URL: https://issues.apache.org/jira/browse/HBASE-10048 Project: HBase Issue Type: Improvement Components: metrics Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 0.98.0, 0.96.1, 0.99.0 Attachments: 10048.096.v4.txt, HBASE-10048-0.94-v1.diff, HBASE-10048-0.94-v2.diff, HBASE-10048-trunk-v1.diff, HBASE-10048-trunk-v2.diff, HBASE-10048-trunk-v3.diff, HBASE-10048-trunk-v4.diff Add hlog number metric in regionserver. We can use this metric to alert about memstore flush because of too many hlogs. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9806) Add PerfEval tool for BlockCache
[ https://issues.apache.org/jira/browse/HBASE-9806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842269#comment-13842269 ] Jean-Marc Spaggiari commented on HBASE-9806: Hum. Seems to not compile in Trunk: [ERROR] COMPILATION ERROR : [INFO] - [ERROR] /home/jmspaggi/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/BlockCachePerformanceEvaluation.java:[327,20] error: no suitable method found for createReader(FileSystem,Path,CacheConfig) I will take a look Add PerfEval tool for BlockCache Key: HBASE-9806 URL: https://issues.apache.org/jira/browse/HBASE-9806 Project: HBase Issue Type: Test Components: Performance, test Reporter: Nick Dimiduk Assignee: Nick Dimiduk Attachments: HBASE-9806.00.patch, HBASE-9806.01.patch, conf_20g.patch, conf_3g.patch, test1_run1_20g.pdf, test1_run1_3g.pdf, test1_run2_20g.pdf, test1_run2_3g.pdf We have at least three different block caching layers with myriad configuration settings. Let's add a tool for evaluating memory allocations and configuration combinations with different access patterns. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9806) Add PerfEval tool for BlockCache
[ https://issues.apache.org/jira/browse/HBASE-9806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842272#comment-13842272 ] Jean-Marc Spaggiari commented on HBASE-9806: {code} reader = HFile.createReader(fs, mf, cacheConfig); {code} Needs to be {code} reader = HFile.createReader(fs, mf, cacheConfig, conf); {code} Add PerfEval tool for BlockCache Key: HBASE-9806 URL: https://issues.apache.org/jira/browse/HBASE-9806 Project: HBase Issue Type: Test Components: Performance, test Reporter: Nick Dimiduk Assignee: Nick Dimiduk Attachments: HBASE-9806.00.patch, HBASE-9806.01.patch, conf_20g.patch, conf_3g.patch, test1_run1_20g.pdf, test1_run1_3g.pdf, test1_run2_20g.pdf, test1_run2_3g.pdf We have at least three different block caching layers with myriad configuration settings. Let's add a tool for evaluating memory allocations and configuration combinations with different access patterns. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10010) eliminate the put latency spike on the new log file beginning
[ https://issues.apache.org/jira/browse/HBASE-10010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842317#comment-13842317 ] Himanshu Vashishtha commented on HBASE-10010: - Got it. +1. eliminate the put latency spike on the new log file beginning - Key: HBASE-10010 URL: https://issues.apache.org/jira/browse/HBASE-10010 Project: HBase Issue Type: Improvement Components: regionserver Affects Versions: 0.94.13 Reporter: Liang Xie Assignee: Liang Xie Attachments: HBase-10010-0.94-v2.txt, HBase-10010-0.94-v3.txt, HBase-10010-0.94.txt, HBase-10010-trunk-v2.txt, HBase-10010-trunk.txt Indeed, the original finding came from fb; see HBASE-6813 for detailed discussion. Though this improvement is not expected to bring an obvious gain in 95th or 99th percentile latency, it could still make the response time more stable, in my view. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10094) Add batching to HLogPerformanceEvaluation
[ https://issues.apache.org/jira/browse/HBASE-10094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842354#comment-13842354 ] Hudson commented on HBASE-10094: SUCCESS: Integrated in hbase-0.96 #218 (See [https://builds.apache.org/job/hbase-0.96/218/]) HBASE-10094 Add batching to HLogPerformanceEvaluation (stack: rev 1548913) * /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/HLogPerformanceEvaluation.java Add batching to HLogPerformanceEvaluation - Key: HBASE-10094 URL: https://issues.apache.org/jira/browse/HBASE-10094 Project: HBase Issue Type: Sub-task Components: Performance, wal Reporter: stack Assignee: Himanshu Vashishtha Fix For: 0.98.0, 0.96.1, 0.99.0 Attachments: 10094v2.txt, addendum.0.96.txt As Himanshu points out in the parent issue, HLogPE is using an unorthodox API to append edits to the WAL; it is using an API that is meant for tests only that does an append immediately followed by a sync call. In a normal deploy, WAL appends are done as a bunch of appends followed by a sync on the tail of the transaction -- not a sync per append. This issue is about changing HLogPE to use append and then sync. It also adds an argument so you can specify batching of a set of appends before the sync is called. The latter lets HLogPE mimic multi puts that use the minibatch... which appends, appends, appends.. and then syncs. Assigning to Himanshu for review. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10048) Add hlog number metric in regionserver
[ https://issues.apache.org/jira/browse/HBASE-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842353#comment-13842353 ] Hudson commented on HBASE-10048: SUCCESS: Integrated in hbase-0.96 #218 (See [https://builds.apache.org/job/hbase-0.96/218/]) HBASE-10048 Add hlog number metric in regionserver (stack: rev 1548917) * /hbase/branches/0.96/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSource.java * /hbase/branches/0.96/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerWrapper.java * /hbase/branches/0.96/hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSourceImpl.java * /hbase/branches/0.96/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSourceImpl.java * /hbase/branches/0.96/hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/regionserver/ServerMetricsTmpl.jamon * /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerWrapperImpl.java * /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java * /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java * /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java * /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerWrapperStub.java * /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestMetricsRegionServer.java * /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/HLogUtilsForTests.java * /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLog.java * /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java Add hlog number 
metric in regionserver -- Key: HBASE-10048 URL: https://issues.apache.org/jira/browse/HBASE-10048 Project: HBase Issue Type: Improvement Components: metrics Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 0.98.0, 0.96.1, 0.99.0 Attachments: 10048.096.v4.txt, HBASE-10048-0.94-v1.diff, HBASE-10048-0.94-v2.diff, HBASE-10048-trunk-v1.diff, HBASE-10048-trunk-v2.diff, HBASE-10048-trunk-v3.diff, HBASE-10048-trunk-v4.diff Add hlog number metric in regionserver. We can use this metric to alert about memstore flush because of too many hlogs. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10000) Initiate lease recovery for outstanding WAL files at the very beginning of recovery
[ https://issues.apache.org/jira/browse/HBASE-10000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842359#comment-13842359 ] Ted Yu commented on HBASE-10000: Cluster testing would be done next week. Initiate lease recovery for outstanding WAL files at the very beginning of recovery --- Key: HBASE-10000 URL: https://issues.apache.org/jira/browse/HBASE-10000 Project: HBase Issue Type: Improvement Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.98.1 Attachments: 1-recover-ts-with-pb-2.txt, 1-recover-ts-with-pb-3.txt, 1-recover-ts-with-pb-4.txt, 1-v1.txt, 1-v4.txt, 1-v5.txt, 1-v6.txt At the beginning of recovery, master can send lease recovery requests concurrently for outstanding WAL files using a thread pool. Each split worker would first check whether the WAL file it processes is closed. Thanks to Nicolas Liochon and Jeffery, discussion with whom gave rise to this idea. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10010) eliminate the put latency spike on the new log file beginning
[ https://issues.apache.org/jira/browse/HBASE-10010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842361#comment-13842361 ] Lars Hofhansl commented on HBASE-10010: --- Seems that the new proposed change would have the exact same effect as the patch on HBASE-7216. The change on HBASE-7216 seems cleaner. eliminate the put latency spike on the new log file beginning - Key: HBASE-10010 URL: https://issues.apache.org/jira/browse/HBASE-10010 Project: HBase Issue Type: Improvement Components: regionserver Affects Versions: 0.94.13 Reporter: Liang Xie Assignee: Liang Xie Attachments: HBase-10010-0.94-v2.txt, HBase-10010-0.94-v3.txt, HBase-10010-0.94.txt, HBase-10010-trunk-v2.txt, HBase-10010-trunk.txt Indeed, the original finding came from fb, see HBASE-6813 for detailed discussion. Though this improvement is not expected to bring an obvious gain on 95th or 99th percentile latency, it could still make the response time more stable, in my view. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9966) Create IntegrationTest for Online Bloom Filter and Compression Algorithm Change
[ https://issues.apache.org/jira/browse/HBASE-9966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Shulman updated HBASE-9966: - Fix Version/s: (was: 0.95.2) 0.99.0 0.98.1 0.96.1 Create IntegrationTest for Online Bloom Filter and Compression Algorithm Change --- Key: HBASE-9966 URL: https://issues.apache.org/jira/browse/HBASE-9966 Project: HBase Issue Type: Sub-task Components: HFile, test Affects Versions: 0.98.0, 0.96.1 Reporter: Aleksandr Shulman Assignee: Aleksandr Shulman Fix For: 0.96.1, 0.98.1, 0.99.0 For online schema change, a user is perfectly within her rights to modify the compression algorithm used, or the bloom filter. Therefore, we should add these actions to our ChaosMonkey tests to ensure that they do not introduce instability. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9966) Create IntegrationTest for Online Bloom Filter and Compression Algorithm Change
[ https://issues.apache.org/jira/browse/HBASE-9966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Shulman updated HBASE-9966: - Attachment: HBASE-9966-96.patch HBASE-9966-98.patch HBASE-9966-trunk.patch The patch ends up being the same for Trunk, 98, and 0.96. Create IntegrationTest for Online Bloom Filter and Compression Algorithm Change --- Key: HBASE-9966 URL: https://issues.apache.org/jira/browse/HBASE-9966 Project: HBase Issue Type: Sub-task Components: HFile, test Affects Versions: 0.98.0, 0.96.1 Reporter: Aleksandr Shulman Assignee: Aleksandr Shulman Fix For: 0.96.1, 0.98.1, 0.99.0 Attachments: HBASE-9966-96.patch, HBASE-9966-98.patch, HBASE-9966-trunk.patch For online schema change, a user is perfectly within her rights to modify the compression algorithm used, or the bloom filter. Therefore, we should add these actions to our ChaosMonkey tests to ensure that they do not introduce instability. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-9966) Create IntegrationTest for Online Bloom Filter and Compression Algorithm Change
[ https://issues.apache.org/jira/browse/HBASE-9966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Shulman updated HBASE-9966: - Labels: online_schema_change (was: ) Create IntegrationTest for Online Bloom Filter and Compression Algorithm Change --- Key: HBASE-9966 URL: https://issues.apache.org/jira/browse/HBASE-9966 Project: HBase Issue Type: Sub-task Components: HFile, test Affects Versions: 0.98.0, 0.96.1 Reporter: Aleksandr Shulman Assignee: Aleksandr Shulman Labels: online_schema_change Fix For: 0.96.1, 0.98.1, 0.99.0 Attachments: HBASE-9966-96.patch, HBASE-9966-98.patch, HBASE-9966-trunk.patch For online schema change, a user is perfectly within her rights to modify the compression algorithm used, or the bloom filter. Therefore, we should add these actions to our ChaosMonkey tests to ensure that they do not introduce instability. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (HBASE-9553) Pad HFile blocks to a fixed size before placing them into the blockcache
[ https://issues.apache.org/jira/browse/HBASE-9553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl resolved HBASE-9553. -- Resolution: Invalid Pad HFile blocks to a fixed size before placing them into the blockcache Key: HBASE-9553 URL: https://issues.apache.org/jira/browse/HBASE-9553 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl In order to make it easy on the garbage collector and to avoid full compaction phases we should make sure that all (or at least a large percentage) of the HFile blocks as cached in the block cache are exactly the same size. Currently an HFile block is typically slightly larger than the declared block size, as the block will accommodate that last KV on the block. The padding would be a ColumnFamily option. In many cases 100 bytes would probably be a good value to make all blocks exactly the same size (but of course it depends on the max size of the KVs). This does not have to be perfect. The more blocks evicted and replaced in the block cache are of the exact same size the easier it should be on the GC. Thoughts? -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (HBASE-6476) Replace all occurrances of System.currentTimeMillis() with EnvironmentEdge equivalent
[ https://issues.apache.org/jira/browse/HBASE-6476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl resolved HBASE-6476. -- Resolution: Won't Fix Replace all occurrances of System.currentTimeMillis() with EnvironmentEdge equivalent - Key: HBASE-6476 URL: https://issues.apache.org/jira/browse/HBASE-6476 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Priority: Minor Attachments: 6476-v2.txt, 6476-v2.txt, 6476.txt, 6476v3.txt There are still some areas where System.currentTimeMillis() is used in HBase. In order to make all parts of the code base testable and (potentially) to be able to configure HBase's notion of time, this should generally be replaced with EnvironmentEdgeManager.currentTimeMillis(). How hard would it be to add a maven task that checks for that, so we do not introduce System.currentTimeMillis back in the future? -- This message was sent by Atlassian JIRA (v6.1#6144)
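The motivation in HBASE-6476 can be illustrated with a minimal sketch. It assumes a hypothetical `Clock` interface and `ClockHolder` class; these names are illustrative only and are not HBase's `EnvironmentEdge` API. Code asks an injectable clock for the time instead of calling `System.currentTimeMillis()` directly, so a test can substitute a fixed value.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical time source; stands in for the role EnvironmentEdge plays. */
interface Clock {
  long currentTimeMillis();
}

/** Hypothetical holder; production code would read the time only through it. */
final class ClockHolder {
  private static final Clock SYSTEM = System::currentTimeMillis;
  private static final AtomicReference<Clock> CURRENT = new AtomicReference<>(SYSTEM);

  private ClockHolder() {}

  /** Returns the time reported by whichever clock is currently installed. */
  static long currentTimeMillis() {
    return CURRENT.get().currentTimeMillis();
  }

  /** Installs a replacement clock, typically from a test. */
  static void inject(Clock clock) {
    CURRENT.set(clock);
  }

  /** Restores the default clock backed by System.currentTimeMillis(). */
  static void reset() {
    CURRENT.set(SYSTEM);
  }
}
```

A test would call `ClockHolder.inject(() -> 42L)` before exercising time-dependent code and `ClockHolder.reset()` afterwards.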
[jira] [Commented] (HBASE-10101) testOfflineRegionReAssginedAfterMasterRestart times out sometimes.
[ https://issues.apache.org/jira/browse/HBASE-10101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842367#comment-13842367 ] Jeffrey Zhong commented on HBASE-10101: --- [~jxiang] Could you please take a quick look at the fix which is trivial? Thanks. testOfflineRegionReAssginedAfterMasterRestart times out sometimes. -- Key: HBASE-10101 URL: https://issues.apache.org/jira/browse/HBASE-10101 Project: HBase Issue Type: Test Reporter: Jimmy Xiang Priority: Minor Attachments: hbase-10101.patch, test.log Sometimes, I got this test timed out. The log is attached. It could be because the new cluster takes a while to process the dead server, or assign meta. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (HBASE-7347) Allow multiple readers per storefile
[ https://issues.apache.org/jira/browse/HBASE-7347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl resolved HBASE-7347. -- Resolution: Won't Fix Closing. Any further discussion should be had in HBASE-5979. Allow multiple readers per storefile Key: HBASE-7347 URL: https://issues.apache.org/jira/browse/HBASE-7347 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Currently each store file is read only through the single reader regardless of how many concurrent read requests access that file. This issue is to explore alternate designs. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10094) Add batching to HLogPerformanceEvaluation
[ https://issues.apache.org/jira/browse/HBASE-10094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842373#comment-13842373 ] Hudson commented on HBASE-10094: SUCCESS: Integrated in hbase-0.96-hadoop2 #144 (See [https://builds.apache.org/job/hbase-0.96-hadoop2/144/]) HBASE-10094 Add batching to HLogPerformanceEvaluation (stack: rev 1548913) * /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/HLogPerformanceEvaluation.java Add batching to HLogPerformanceEvaluation - Key: HBASE-10094 URL: https://issues.apache.org/jira/browse/HBASE-10094 Project: HBase Issue Type: Sub-task Components: Performance, wal Reporter: stack Assignee: Himanshu Vashishtha Fix For: 0.98.0, 0.96.1, 0.99.0 Attachments: 10094v2.txt, addendum.0.96.txt As Himanshu points out in the parent issue, HLogPE is using an unorthodox API appending edits to the WAL; it is using an API that is meant for tests only that does an append immediately followed by a sync call. In normal deploy, WAL appends are done as a bunch of appends followed by a sync on the tail of the transaction -- not a sync per append. This issue is about changing HLogPE to use append and then sync. It also adds an argument so you can specify batching of a set of appends before the sync is called. The latter lets HLogPE mimic multi puts that use the minibatch... which appends, appends, appends.. and then syncs. Assigning to Himanshu for review. -- This message was sent by Atlassian JIRA (v6.1#6144)
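The append-then-sync batching described in HBASE-10094 can be sketched roughly as follows. `CountingLog` and `BatchedAppendSketch` are hypothetical stand-ins that only count operations; they are not the HLog or HLogPerformanceEvaluation API, and the real patch may be structured differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory log that only records operations; not the HLog API. */
final class CountingLog {
  final List<String> edits = new ArrayList<>();
  int syncs = 0;

  void append(String edit) {
    edits.add(edit);
  }

  void sync() {
    syncs++;
  }
}

final class BatchedAppendSketch {
  private BatchedAppendSketch() {}

  /** Appends numEdits edits, calling sync once after every batchSize appends. */
  static void run(CountingLog log, int numEdits, int batchSize) {
    if (batchSize < 1) {
      throw new IllegalArgumentException("batchSize must be >= 1");
    }
    int pending = 0;
    for (int i = 0; i < numEdits; i++) {
      log.append("edit-" + i);
      pending++;
      if (pending == batchSize) {
        log.sync();
        pending = 0;
      }
    }
    if (pending > 0) {
      log.sync(); // flush the final partial batch
    }
  }
}
```

With a batch size of 1 this degenerates to the sync-per-append pattern; larger batch sizes issue proportionally fewer syncs for the same number of edits.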
[jira] [Commented] (HBASE-10048) Add hlog number metric in regionserver
[ https://issues.apache.org/jira/browse/HBASE-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842372#comment-13842372 ] Hudson commented on HBASE-10048: SUCCESS: Integrated in hbase-0.96-hadoop2 #144 (See [https://builds.apache.org/job/hbase-0.96-hadoop2/144/]) HBASE-10048 Add hlog number metric in regionserver (stack: rev 1548917) * /hbase/branches/0.96/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSource.java * /hbase/branches/0.96/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerWrapper.java * /hbase/branches/0.96/hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSourceImpl.java * /hbase/branches/0.96/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSourceImpl.java * /hbase/branches/0.96/hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/regionserver/ServerMetricsTmpl.jamon * /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerWrapperImpl.java * /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java * /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java * /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java * /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerWrapperStub.java * /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestMetricsRegionServer.java * /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/HLogUtilsForTests.java * /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLog.java * /hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java 
Add hlog number metric in regionserver -- Key: HBASE-10048 URL: https://issues.apache.org/jira/browse/HBASE-10048 Project: HBase Issue Type: Improvement Components: metrics Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 0.98.0, 0.96.1, 0.99.0 Attachments: 10048.096.v4.txt, HBASE-10048-0.94-v1.diff, HBASE-10048-0.94-v2.diff, HBASE-10048-trunk-v1.diff, HBASE-10048-trunk-v2.diff, HBASE-10048-trunk-v3.diff, HBASE-10048-trunk-v4.diff Add hlog number metric in regionserver. We can use this metric to alert about memstore flush because of too many hlogs. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (HBASE-5164) Better HTable resource consumption in CoprocessorHost
[ https://issues.apache.org/jira/browse/HBASE-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl resolved HBASE-5164. -- Resolution: Duplicate Largely fixed with HBASE-9534 Better HTable resource consumption in CoprocessorHost - Key: HBASE-5164 URL: https://issues.apache.org/jira/browse/HBASE-5164 Project: HBase Issue Type: Sub-task Components: Coprocessors Reporter: Lars Hofhansl Priority: Minor HBASE-4805 allows for more control over HTable's resource consumption. This is currently not used by CoprocessorHost (even though it would be even more critical to control this inside the RegionServer). It's not immediately obvious how to do that. Maybe CoprocessorHost should maintain a lazy ExecutorService and HConnection and reuse both for all HTables retrieved via CoprocessorEnvironment.getTable(...). Not sure how critical this is, but I feel without this it is dangerous to use getTable, as it would lead to all the resource consumption problems we find in the client, but inside a crucial part of the HBase servers. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (HBASE-5002) Control replication peer per column family.
[ https://issues.apache.org/jira/browse/HBASE-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl resolved HBASE-5002. -- Resolution: Later Control replication peer per column family. --- Key: HBASE-5002 URL: https://issues.apache.org/jira/browse/HBASE-5002 Project: HBase Issue Type: Sub-task Components: Replication Reporter: Lars Hofhansl Priority: Minor With HBASE-2196 in place, it would be nice if we could control per CF to which peer(s) it is replicated. Could have a new option REPLICATION_PEERS, which holds a comma-separated list of replication peers that this CF should be replicated to. If not given, replicate to all slaves. This list would need to be written to each log entry (could reuse WALEdit.scopes for this), so we can decide in ReplicationSource who this entry should go to. Let's discuss... -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (HBASE-9025) Potential Thread safety issue in MetaScanner
[ https://issues.apache.org/jira/browse/HBASE-9025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl resolved HBASE-9025. -- Resolution: Cannot Reproduce Potential Thread safety issue in MetaScanner Key: HBASE-9025 URL: https://issues.apache.org/jira/browse/HBASE-9025 Project: HBase Issue Type: Bug Affects Versions: 0.94.9 Reporter: Lars Hofhansl I just saw this in a test run in 0.94: {code} Stacktrace java.lang.NullPointerException at java.util.TreeMap.getEntry(TreeMap.java:324) at java.util.TreeMap.get(TreeMap.java:255) at org.apache.hadoop.hbase.util.TestHBaseFsck.testSplitDaughtersNotInMeta(TestHBaseFsck.java:1346) ... {code} The TreeMap in question here is actually returned from {{HTable.getRegionLocations()}}, which in turns calls {{MetaScanner.allTableRegions(getConfiguration(), getTableName(), false);}} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-10000) Initiate lease recovery for outstanding WAL files at the very beginning of recovery
[ https://issues.apache.org/jira/browse/HBASE-10000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-10000: --- Attachment: (was: 1-v1.txt) Initiate lease recovery for outstanding WAL files at the very beginning of recovery --- Key: HBASE-10000 URL: https://issues.apache.org/jira/browse/HBASE-10000 Project: HBase Issue Type: Improvement Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.98.1 Attachments: 1-recover-ts-with-pb-2.txt, 1-recover-ts-with-pb-3.txt, 1-recover-ts-with-pb-4.txt, 1-recover-ts-with-pb-5.txt, 1-v4.txt, 1-v5.txt, 1-v6.txt At the beginning of recovery, master can send lease recovery requests concurrently for outstanding WAL files using a thread pool. Each split worker would first check whether the WAL file it processes is closed. Thanks to Nicolas Liochon and Jeffery, discussion with whom gave rise to this idea. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-10000) Initiate lease recovery for outstanding WAL files at the very beginning of recovery
[ https://issues.apache.org/jira/browse/HBASE-10000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-10000: --- Attachment: 1-recover-ts-with-pb-5.txt Patch v5 changed parameter ts to leaseRecoveryReqTS so that it is more readable. Also modified the computation of firstPause in recoverDFSFileLease() to account for the time elapsed since the lease recovery request. All \*Log\* tests passed. This would be the patch to be tested next week. Initiate lease recovery for outstanding WAL files at the very beginning of recovery --- Key: HBASE-10000 URL: https://issues.apache.org/jira/browse/HBASE-10000 Project: HBase Issue Type: Improvement Reporter: Ted Yu Assignee: Ted Yu Fix For: 0.98.1 Attachments: 1-recover-ts-with-pb-2.txt, 1-recover-ts-with-pb-3.txt, 1-recover-ts-with-pb-4.txt, 1-recover-ts-with-pb-5.txt, 1-v4.txt, 1-v5.txt, 1-v6.txt At the beginning of recovery, master can send lease recovery requests concurrently for outstanding WAL files using a thread pool. Each split worker would first check whether the WAL file it processes is closed. Thanks to Nicolas Liochon and Jeffery, discussion with whom gave rise to this idea. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HBASE-10104) test-patch.sh doesn't need to test compilation against hadoop 1.0
Ted Yu created HBASE-10104: -- Summary: test-patch.sh doesn't need to test compilation against hadoop 1.0 Key: HBASE-10104 URL: https://issues.apache.org/jira/browse/HBASE-10104 Project: HBase Issue Type: Test Reporter: Ted Yu Priority: Minor test-patch.sh performs compilation check against hadoop 1.0 and 1.1 The compilation against hadoop 1.0 can be skipped. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HBASE-10105) SaslUtil#encodeIdentifier may throw NullPointerException
Ted Yu created HBASE-10105: -- Summary: SaslUtil#encodeIdentifier may throw NullPointerException Key: HBASE-10105 URL: https://issues.apache.org/jira/browse/HBASE-10105 Project: HBase Issue Type: Bug Reporter: Ted Yu Encountered the following exception when running TestHBaseSaslRpcClient on Mac: {code} 2013-12-08 14:18:24,754 ERROR [main] security.TestHBaseSaslRpcClient(243): java.lang.NullPointerException at java.lang.String.<init>(String.java:593) at org.apache.hadoop.hbase.security.SaslUtil.encodeIdentifier(SaslUtil.java:38) at org.apache.hadoop.hbase.security.HBaseSaslRpcClient$SaslClientCallbackHandler.<init>(HBaseSaslRpcClient.java:259) at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.<init>(HBaseSaslRpcClient.java:78) at org.apache.hadoop.hbase.security.TestHBaseSaslRpcClient.assertSuccessCreationDigestPrincipal(TestHBaseSaslRpcClient.java:240) at org.apache.hadoop.hbase.security.TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation(TestHBaseSaslRpcClient.java:122) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) at
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.runners.ParentRunner.run(ParentRunner.java:309) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:234) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:133) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:114) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:188) at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:166) at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:86) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:101) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:74) 2013-12-08 14:18:24,755 DEBUG [main] security.HBaseSaslRpcClient(76): Creating SASL DIGEST-MD5 client to authenticate to service at null {code} Here is related code: {code} return new String(Base64.encodeBase64(identifier)); {code} Looks like Base64.encodeBase64() returned a null. -- This message was sent by Atlassian JIRA (v6.1#6144)
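One way to read the trace above: `new String(byte[])` throws a NullPointerException when handed null, so a null reaching the constructor in `encodeIdentifier` would explain the failure. The following is a minimal sketch of a null guard, under stated assumptions: it uses the JDK's `java.util.Base64` rather than the Commons Codec class in the quoted code, the class name `EncodeIdentifierSketch` is hypothetical, and returning an empty string for a null identifier is an illustrative choice, not necessarily the fix adopted for this issue.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class EncodeIdentifierSketch {
  private EncodeIdentifierSketch() {}

  /** Returns the Base64 text of identifier, or "" when identifier is null. */
  static String encodeIdentifier(byte[] identifier) {
    if (identifier == null) {
      // Assumption: an empty string is an acceptable fallback for callers.
      return "";
    }
    byte[] encoded = Base64.getEncoder().encode(identifier);
    // Base64 output is plain ASCII, so the charset is stated explicitly.
    return new String(encoded, StandardCharsets.US_ASCII);
  }
}
```

Whether a null identifier should instead be rejected with an explicit exception depends on what the callers expect; the report does not say.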
[jira] [Updated] (HBASE-10105) SaslUtil#encodeIdentifier may throw NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-10105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-10105: --- Attachment: org.apache.hadoop.hbase.security.TestHBaseSaslRpcClient-output.txt Here is test output. SaslUtil#encodeIdentifier may throw NullPointerException Key: HBASE-10105 URL: https://issues.apache.org/jira/browse/HBASE-10105 Project: HBase Issue Type: Bug Reporter: Ted Yu Attachments: org.apache.hadoop.hbase.security.TestHBaseSaslRpcClient-output.txt Encountered the following exception when running TestHBaseSaslRpcClient on Mac: {code} 2013-12-08 14:18:24,754 ERROR [main] security.TestHBaseSaslRpcClient(243): java.lang.NullPointerException at java.lang.String.<init>(String.java:593) at org.apache.hadoop.hbase.security.SaslUtil.encodeIdentifier(SaslUtil.java:38) at org.apache.hadoop.hbase.security.HBaseSaslRpcClient$SaslClientCallbackHandler.<init>(HBaseSaslRpcClient.java:259) at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.<init>(HBaseSaslRpcClient.java:78) at org.apache.hadoop.hbase.security.TestHBaseSaslRpcClient.assertSuccessCreationDigestPrincipal(TestHBaseSaslRpcClient.java:240) at org.apache.hadoop.hbase.security.TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation(TestHBaseSaslRpcClient.java:122) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) at
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.runners.ParentRunner.run(ParentRunner.java:309) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:234) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:133) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:114) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:188) at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:166) at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:86) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:101) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:74) 2013-12-08 14:18:24,755 DEBUG [main] security.HBaseSaslRpcClient(76): Creating SASL DIGEST-MD5 client to authenticate to service at null {code} Here is related code: {code} return new String(Base64.encodeBase64(identifier)); {code} Looks like Base64.encodeBase64() returned a null. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10100) Hbase replication cluster can have varying peers under certain conditions
[ https://issues.apache.org/jira/browse/HBASE-10100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842424#comment-13842424 ] Lars Hofhansl commented on HBASE-10100: --- Can do that + have ReplicationSource recheck periodically (it can interrogate the state from ZK). If we keep track of when replication should have started and all logs are still present, we can automatically restart replication when the transient condition has been fixed. Also check out HBASE-9746, the issue there is that a RegionServer won't start up if it has outstanding edits for a host that it can no longer resolve. Hbase replication cluster can have varying peers under certain conditions - Key: HBASE-10100 URL: https://issues.apache.org/jira/browse/HBASE-10100 Project: HBase Issue Type: Bug Affects Versions: 0.94.5, 0.95.0, 0.96.0 Reporter: churro morales We were trying to replicate hbase data over to a new datacenter recently. After we turned on replication and then did our copy tables, we noticed that verify replication had discrepancies. We ran a list_peers and it returned both peers, the original datacenter we were replicating to and the new datacenter (this was correct). When grepping through the logs for a few regionservers we noticed that a few regionservers had the following entry in their logs: 2013-09-26 10:55:46,907 ERROR org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager: Error while adding a new peer java.net.UnknownHostException: xxx.xxx.flurry.com (this was due to a transient dns issue) Thus a very small subset of our regionservers were not replicating to this new cluster while most were. We probably don't want to abort if this type of issue comes up, it could potentially be fatal if someone does an add_peer operation with a typo. This could potentially shut down the cluster.
One solution I can think of is keeping a boolean flag in ReplicationSourceManager that keeps track of whether there was an errorAddingPeer. Then in the logPositionAndCleanOldLogs we can do something like: {code} if (errorAddingPeer) { LOG.error("There was an error adding a peer, logs will not be marked for deletion"); return; } {code} Thus we are not deleting these logs from the queue. You will notice your replication queue rising on certain machines and you can still replay the logs, thus avoiding a lengthy copy table. I have a patch (with unit test) for the above proposal, if everyone thinks that is an okay solution. An additional idea would be to add some retry logic inside the PeersWatcher class for the nodeChildrenChanged method. Thus if there happens to be some issue we could sort it out without having to bounce that particular regionserver. Would love to hear everyone's thoughts. -- This message was sent by Atlassian JIRA (v6.1#6144)
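The flag-based proposal above can be sketched in isolation as follows. `LogCleanerSketch` and its method names are hypothetical and only model the bookkeeping; this is not ReplicationSourceManager, and the actual patch mentioned in the report may differ.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical model of the proposal; not the ReplicationSourceManager class. */
final class LogCleanerSketch {
  private final Deque<String> queuedLogs = new ArrayDeque<>();
  private volatile boolean errorAddingPeer = false;

  void enqueue(String logName) {
    queuedLogs.addLast(logName);
  }

  void markPeerAddFailed() {
    errorAddingPeer = true;
  }

  void markPeerAddRecovered() {
    errorAddingPeer = false;
  }

  int queuedCount() {
    return queuedLogs.size();
  }

  /** Drops logs queued before currentLog unless a peer failed to register. */
  void cleanOldLogs(String currentLog) {
    if (errorAddingPeer) {
      return; // keep every log so it can still be replayed to the missing peer
    }
    while (!queuedLogs.isEmpty() && !queuedLogs.peekFirst().equals(currentLog)) {
      queuedLogs.removeFirst();
    }
  }
}
```

While the flag is set the queue only grows, which matches the rising replication queue the reporter describes as the visible symptom.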
[jira] [Updated] (HBASE-10105) SaslUtil#encodeIdentifier may throw NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-10105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-10105: --- Affects Version/s: 0.98.0 SaslUtil#encodeIdentifier may throw NullPointerException Key: HBASE-10105 URL: https://issues.apache.org/jira/browse/HBASE-10105 Project: HBase Issue Type: Bug Affects Versions: 0.98.0 Reporter: Ted Yu Attachments: org.apache.hadoop.hbase.security.TestHBaseSaslRpcClient-output.txt Encountered the following exception when running TestHBaseSaslRpcClient on Mac: {code} 2013-12-08 14:18:24,754 ERROR [main] security.TestHBaseSaslRpcClient(243): java.lang.NullPointerException at java.lang.String.<init>(String.java:593) at org.apache.hadoop.hbase.security.SaslUtil.encodeIdentifier(SaslUtil.java:38) at org.apache.hadoop.hbase.security.HBaseSaslRpcClient$SaslClientCallbackHandler.<init>(HBaseSaslRpcClient.java:259) at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.<init>(HBaseSaslRpcClient.java:78) at org.apache.hadoop.hbase.security.TestHBaseSaslRpcClient.assertSuccessCreationDigestPrincipal(TestHBaseSaslRpcClient.java:240) at org.apache.hadoop.hbase.security.TestHBaseSaslRpcClient.testHBaseSaslRpcClientCreation(TestHBaseSaslRpcClient.java:122) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) at
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.runners.ParentRunner.run(ParentRunner.java:309) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:234) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:133) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:114) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:188) at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:166) at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:86) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:101) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:74) 2013-12-08 14:18:24,755 DEBUG [main] security.HBaseSaslRpcClient(76): Creating SASL DIGEST-MD5 client to authenticate to service at null {code} Here is related code: {code} return new String(Base64.encodeBase64(identifier)); {code} Looks like Base64.encodeBase64() returned a null. -- This message was sent by Atlassian JIRA (v6.1#6144)
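The failure mode reported in HBASE-10105 can be illustrated with a small sketch. This is not the HBase code and not a proposed patch. It assumes the null originates from a null identifier (the report does not confirm this), uses the JDK's java.util.Base64 as a stand-in for the Base64 class in the snippet above, and the class name EncodeIdentifierSketch is hypothetical. Whether returning null or rejecting the input is the right behaviour is not settled here.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical illustration only; not the actual SaslUtil implementation.
class EncodeIdentifierSketch {

    // Guard against a null identifier before building the String, so the
    // String constructor is never handed a null byte array.
    static String encodeIdentifier(byte[] identifier) {
        if (identifier == null) {
            // Assumption: callers can cope with a null result.
            return null;
        }
        byte[] encoded = Base64.getEncoder().encode(identifier);
        return new String(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] id = "user".getBytes(StandardCharsets.UTF_8);
        System.out.println(encodeIdentifier(id));   // Base64 of "user"
        System.out.println(encodeIdentifier(null)); // no exception thrown
    }
}
```

The guard sits in front of the String constructor because that is where the stack trace shows the NullPointerException being raised.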
[jira] [Updated] (HBASE-9068) Make hadoop 2 the default precommit for trunk
[ https://issues.apache.org/jira/browse/HBASE-9068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-9068:
--------------------------
    Resolution: Duplicate
    Status: Resolved (was: Patch Available)

Make hadoop 2 the default precommit for trunk
    Key: HBASE-9068
    URL: https://issues.apache.org/jira/browse/HBASE-9068
    Project: HBase
    Issue Type: Test
    Reporter: Ted Yu
    Assignee: Ted Yu
    Fix For: 0.98.0
    Attachments: 9068-v1.txt

Here is discussion thread:
http://search-hadoop.com/m/ggc1019WdVA/Making+hadoop+2+the+default+precommitsubj=Re+DISCUSS+Making+hadoop+2+the+default+precommit+for+trunk+ones+we+get+green+builds

Jenkins builds have been stable recently:
https://builds.apache.org/job/HBase-TRUNK/
https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/
http://54.241.6.143/job/HBase-TRUNK-Hadoop-2/

We should run test suite against hadoop 2 in PreCommit build
[jira] [Updated] (HBASE-10105) SaslUtil#encodeIdentifier may throw NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-10105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-10105:
-----------------------------------
    Affects Version/s: 0.99.0
    Fix Version/s: 0.98.0

SaslUtil#encodeIdentifier may throw NullPointerException
    Key: HBASE-10105
    URL: https://issues.apache.org/jira/browse/HBASE-10105
    Project: HBase
    Issue Type: Bug
    Affects Versions: 0.98.0, 0.99.0
    Reporter: Ted Yu
    Fix For: 0.98.0
    Attachments: org.apache.hadoop.hbase.security.TestHBaseSaslRpcClient-output.txt
[jira] [Updated] (HBASE-10103) TestNodeHealthCheckChore#testRSHealthChore: Stoppable must have been stopped
[ https://issues.apache.org/jira/browse/HBASE-10103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-10103:
-----------------------------------
    Affects Version/s: 0.99.0
    Fix Version/s: (was: 0.99.0)

TestNodeHealthCheckChore#testRSHealthChore: Stoppable must have been stopped
    Key: HBASE-10103
    URL: https://issues.apache.org/jira/browse/HBASE-10103
    Project: HBase
    Issue Type: Bug
    Affects Versions: 0.98.0, 0.99.0
    Reporter: Andrew Purtell
    Fix For: 0.98.0

{noformat}
Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 623.639 sec FAILURE!
testRSHealthChore(org.apache.hadoop.hbase.TestNodeHealthCheckChore) Time elapsed: 0.001 sec FAILURE!
java.lang.AssertionError: Stoppable must have been stopped.
    at org.junit.Assert.fail(Assert.java:88)
    at org.junit.Assert.assertTrue(Assert.java:41)
    at org.apache.hadoop.hbase.TestNodeHealthCheckChore.testRSHealthChore(TestNodeHealthCheckChore.java:108)
{noformat}
[jira] [Commented] (HBASE-9966) Create IntegrationTest for Online Bloom Filter and Compression Algorithm Change
[ https://issues.apache.org/jira/browse/HBASE-9966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842441#comment-13842441 ]

Andrew Purtell commented on HBASE-9966:
---------------------------------------

+1

Create IntegrationTest for Online Bloom Filter and Compression Algorithm Change
    Key: HBASE-9966
    URL: https://issues.apache.org/jira/browse/HBASE-9966
    Project: HBase
    Issue Type: Sub-task
    Components: HFile, test
    Affects Versions: 0.98.0, 0.96.1
    Reporter: Aleksandr Shulman
    Assignee: Aleksandr Shulman
    Labels: online_schema_change
    Fix For: 0.96.1, 0.98.1, 0.99.0
    Attachments: HBASE-9966-96.patch, HBASE-9966-98.patch, HBASE-9966-trunk.patch

For online schema change, a user is perfectly within her rights to modify the compression algorithm used, or the bloom filter. Therefore, we should add these actions to our ChaosMonkey tests to ensure that they do not introduce instability.
[jira] [Commented] (HBASE-10000) Initiate lease recovery for outstanding WAL files at the very beginning of recovery
[ https://issues.apache.org/jira/browse/HBASE-10000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842443#comment-13842443 ]

Hadoop QA commented on HBASE-10000:
------------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12617616/1-recover-ts-with-pb-5.txt
  against trunk revision .

    {color:green}+1 @author{color}. The patch does not contain any @author tags.
    {color:green}+1 tests included{color}. The patch appears to include 25 new or modified tests.
    {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile.
    {color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile.
    {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
    {color:red}-1 findbugs{color}. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.
    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
    {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
    {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail.
    {color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8087//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8087//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8087//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8087//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8087//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8087//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8087//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8087//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8087//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8087//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8087//console

This message is automatically generated.

Initiate lease recovery for outstanding WAL files at the very beginning of recovery
    Key: HBASE-10000
    URL: https://issues.apache.org/jira/browse/HBASE-10000
    Project: HBase
    Issue Type: Improvement
    Reporter: Ted Yu
    Assignee: Ted Yu
    Fix For: 0.98.1
    Attachments: 1-recover-ts-with-pb-2.txt, 1-recover-ts-with-pb-3.txt, 1-recover-ts-with-pb-4.txt, 1-recover-ts-with-pb-5.txt, 1-v4.txt, 1-v5.txt, 1-v6.txt

At the beginning of recovery, master can send lease recovery requests concurrently for outstanding WAL files using a thread pool. Each split worker would first check whether the WAL file it processes is closed.
Thanks to Nicolas Liochon and Jeffery, discussion with whom gave rise to this idea.
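The idea in the description (the master fans out lease recovery requests through a thread pool, and a split worker checks that a WAL file is closed before processing it) could look roughly like the sketch below. This is only an illustration and does not reflect the attached patches. LeaseRecoverySketch, LeaseRecoverer, recoverAll and isClosed are hypothetical names, and an in-memory set stands in for asking the file system whether a file is closed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; not taken from the HBASE-10000 patches.
class LeaseRecoverySketch {

    // Stand-in for the file system call that recovers the lease on one WAL file.
    interface LeaseRecoverer {
        boolean recoverLease(String walPath) throws Exception;
    }

    // WAL files whose lease recovery completed, i.e. files treated as closed.
    private final Set<String> closed = ConcurrentHashMap.newKeySet();

    // Master side: submit one recovery task per outstanding WAL file and wait for all.
    void recoverAll(List<String> walPaths, LeaseRecoverer recoverer, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (String wal : walPaths) {
                pending.add(pool.submit(() -> {
                    if (recoverer.recoverLease(wal)) {
                        closed.add(wal);
                    }
                    return null;
                }));
            }
            for (Future<?> f : pending) {
                try {
                    f.get();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted during lease recovery", e);
                } catch (ExecutionException e) {
                    throw new IllegalStateException("lease recovery task failed", e.getCause());
                }
            }
        } finally {
            pool.shutdown();
        }
    }

    // Split worker side: check that the WAL file is closed before splitting it.
    boolean isClosed(String walPath) {
        return closed.contains(walPath);
    }
}
```

A worker that finds isClosed returning false would presumably have to wait or trigger recovery itself; the description does not say which, so the sketch leaves that out.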
[jira] [Commented] (HBASE-10105) SaslUtil#encodeIdentifier may throw NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-10105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842444#comment-13842444 ]

Andrew Purtell commented on HBASE-10105:
-----------------------------------------

The test is not failing for me locally. Did the test actually fail for you, Ted? Some of the test cases deliberately use nulls.

SaslUtil#encodeIdentifier may throw NullPointerException
    Key: HBASE-10105
    URL: https://issues.apache.org/jira/browse/HBASE-10105
    Project: HBase
    Issue Type: Bug
    Affects Versions: 0.98.0, 0.99.0
    Reporter: Ted Yu
    Fix For: 0.98.0
    Attachments: org.apache.hadoop.hbase.security.TestHBaseSaslRpcClient-output.txt
[jira] [Updated] (HBASE-10105) SaslUtil#encodeIdentifier may throw NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-10105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-10105:
-----------------------------------
    Affects Version/s: (was: 0.98.0)
    Fix Version/s: (was: 0.98.0)

SaslUtil#encodeIdentifier may throw NullPointerException
    Key: HBASE-10105
    URL: https://issues.apache.org/jira/browse/HBASE-10105
    Project: HBase
    Issue Type: Bug
    Affects Versions: 0.99.0
    Reporter: Ted Yu
    Attachments: org.apache.hadoop.hbase.security.TestHBaseSaslRpcClient-output.txt