[jira] [Updated] (HBASE-14178) regionserver blocks because of waiting for offsetLock
[ https://issues.apache.org/jira/browse/HBASE-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Heng Chen updated HBASE-14178:
Attachment: HBASE-14178_v5.patch

Uploaded a new patch. Changes are below:
1. Add a function to check all situations in which we should read from the block cache (BC).
2. Add a function to check whether we should acquire the lock.

regionserver blocks because of waiting for offsetLock

Key: HBASE-14178
URL: https://issues.apache.org/jira/browse/HBASE-14178
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.98.6
Reporter: Heng Chen
Priority: Critical
Fix For: 0.98.6
Attachments: HBASE-14178-0.98.patch, HBASE-14178.patch, HBASE-14178_v1.patch, HBASE-14178_v2.patch, HBASE-14178_v3.patch, HBASE-14178_v4.patch, HBASE-14178_v5.patch, jstack

My regionserver blocks, and all client RPCs time out. I printed the regionserver's jstack; it seems that a lot of threads are blocked waiting for the offsetLock. Detailed information is below.
PS: my table's block cache is off.
{code}
B.DefaultRpcServer.handler=2,queue=2,port=60020 #82 daemon prio=5 os_prio=0 tid=0x01827000 nid=0x2cdc in Object.wait() [0x7f3831b72000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:502)
    at org.apache.hadoop.hbase.util.IdLock.getLockEntry(IdLock.java:79)
    - locked 0x000773af7c18 (a org.apache.hadoop.hbase.util.IdLock$Entry)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:352)
    at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:253)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:524)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:572)
    at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:257)
    at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:173)
    at org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:55)
    at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:313)
    at org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:269)
    at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:695)
    at org.apache.hadoop.hbase.regionserver.StoreScanner.seekAsDirection(StoreScanner.java:683)
    at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:533)
    at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:140)
    at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:3889)
    at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3969)
    at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3847)
    at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3820)
    - locked 0x0005e5c55ad0 (a org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl)
    at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3807)
    at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4779)
    at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4753)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:2916)
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29583)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2027)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
    at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
    at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
    - 0x0005e5c55c08 (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
{code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
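The patch notes above mention a function that decides whether the lock should be acquired at all. The following is only a rough sketch of that idea, under the assumption that the per-offset lock exists to stop several readers from loading the same block into the cache at once. The class `OffsetLockSketch` and all of its members are hypothetical stand-ins, not the code of `IdLock`, `HFileReaderV2`, or the attached patches.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch; none of these names come from HBase. */
public class OffsetLockSketch {
    // One lock per block offset (never cleaned up here, for brevity).
    private final ConcurrentHashMap<Long, ReentrantLock> locks = new ConcurrentHashMap<>();
    // Stand-in for a block cache keyed by offset.
    private final ConcurrentHashMap<Long, byte[]> cache = new ConcurrentHashMap<>();
    private final boolean cacheEnabled;

    // Counters so the behaviour can be observed in a test.
    public final AtomicInteger lockAcquisitions = new AtomicInteger();
    public final AtomicInteger diskReads = new AtomicInteger();

    public OffsetLockSketch(boolean cacheEnabled) {
        this.cacheEnabled = cacheEnabled;
    }

    // Stand-in for the real read from the file system.
    private byte[] readFromDisk(long offset) {
        diskReads.incrementAndGet();
        return new byte[] {(byte) offset};
    }

    public byte[] readBlock(long offset) {
        if (!cacheEnabled) {
            // Assumption: with no cache, nothing is shared between readers,
            // so queueing them behind one lock only adds waiting.
            return readFromDisk(offset);
        }
        byte[] cached = cache.get(offset);
        if (cached != null) {
            return cached;
        }
        ReentrantLock lock = locks.computeIfAbsent(offset, k -> new ReentrantLock());
        lock.lock();
        lockAcquisitions.incrementAndGet();
        try {
            // Re-check: another thread may have loaded the block while we waited.
            cached = cache.get(offset);
            if (cached != null) {
                return cached;
            }
            byte[] block = readFromDisk(offset);
            cache.put(offset, block);
            return block;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        OffsetLockSketch reader = new OffsetLockSketch(false);
        reader.readBlock(7L);
        reader.readBlock(7L);
        System.out.println("locks taken with cache off: " + reader.lockAcquisitions.get());
    }
}
```

With the cache disabled, every call goes straight to the read path and never touches the lock map, which is the behaviour the reporter's jstack suggests was missing.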
[jira] [Created] (HBASE-14182) My regionserver change ip. But hmaster still connect to old ip after the rs restart
Heng Chen created HBASE-14182:

Summary: My regionserver change ip. But hmaster still connect to old ip after the rs restart
Key: HBASE-14182
URL: https://issues.apache.org/jira/browse/HBASE-14182
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.98.6
Reporter: Heng Chen

I use Docker to deploy my HBase cluster, and the IP of one RS changed. After I restarted this RS, the HMaster web UI showed that it had connected to the HMaster, but its region count was still zero after a long time. I checked the HMaster log and found that the master still uses the old IP to connect to this RS. The HMaster log is below.
PS: 10.11.21.140 is the old IP of RS dx-ape-regionserver1-online.
{code}
2015-08-04 17:24:00,081 INFO [AM.ZK.Worker-pool2-t14141] master.AssignmentManager: Assigning solar_image,\x01Y\x8E\xA3y,1434968237206.4a1bdeec85b9f55b962596f9fb2cd07f. to dx-ape-regionserver1-online,60020,1438679950072
2015-08-04 17:24:06,800 WARN [AM.ZK.Worker-pool2-t14133] master.AssignmentManager: Failed assignment of solar_image,\x00\x94\x09\x8D\x95,1430991781025.b0f5b755f443d41cf306026a60675020. to dx-ape-regionserver1-online,60020,1438679950072, trying to assign elsewhere instead; try=3 of 10
java.net.ConnectException: Connection timed out
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
    at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:578)
    at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:868)
    at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1543)
    at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1442)
    at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661)
    at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
    at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:20964)
    at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:671)
    at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2097)
    at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1577)
    at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1550)
    at org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:104)
    at org.apache.hadoop.hbase.master.AssignmentManager.handleRegion(AssignmentManager.java:999)
    at org.apache.hadoop.hbase.master.AssignmentManager$6.run(AssignmentManager.java:1447)
    at org.apache.hadoop.hbase.master.AssignmentManager$3.run(AssignmentManager.java:1260)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
2015-08-04 17:24:06,801 WARN [AM.ZK.Worker-pool2-t14140] master.AssignmentManager: Failed assignment of solar_image,\x00(.\xE7\xB1L,1430024620929.534025fcf4cae5516513b9c9a4cf73dc. to dx-ape-regionserver1-online,60020,1438679950072, trying to assign elsewhere instead; try=2 of 10
java.net.ConnectException: Call to dx-ape-regionserver1-online/10.11.21.140:60020 failed on connection exception: java.net.ConnectException: Connection timed out
    at org.apache.hadoop.hbase.ipc.RpcClient.wrapException(RpcClient.java:1483)
    at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1461)
    at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661)
    at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719)
    at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:20964)
    at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:671)
    at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2097)
    at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1577)
    at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1550)
    at
[jira] [Updated] (HBASE-14178) regionserver blocks because of waiting for offsetLock
[ https://issues.apache.org/jira/browse/HBASE-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Heng Chen updated HBASE-14178:
Attachment: HBASE-14178_v6.patch

changes:
1. modify some comments
[jira] [Commented] (HBASE-14178) regionserver blocks because of waiting for offsetLock
[ https://issues.apache.org/jira/browse/HBASE-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653551#comment-14653551 ]

Hadoop QA commented on HBASE-14178:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12748640/HBASE-14178_v5.patch
against master branch at commit 931e77d4507e1650c452cefadda450e0bf3f0528.
ATTACHMENT ID: 12748640

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:green}+1 site{color}. The mvn post-site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:
    org.apache.hadoop.hbase.client.TestMultiParallel
    org.apache.hadoop.hbase.trace.TestHTraceHooks
    org.apache.hadoop.hbase.client.TestScannersFromClientSide
    org.apache.hadoop.hbase.TestLocalHBaseCluster
    org.apache.hadoop.hbase.TestMetaTableAccessor
    org.apache.hadoop.hbase.snapshot.TestRestoreFlushSnapshotFromClient
    org.apache.hadoop.hbase.client.TestScannerTimeout
    org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClientWithRegionReplicas
    org.apache.hadoop.hbase.client.TestMetaWithReplicas
    org.apache.hadoop.hbase.namespace.TestNamespaceAuditor
    org.apache.hadoop.hbase.client.TestHCM
    org.apache.hadoop.hbase.snapshot.TestMobRestoreFlushSnapshotFromClient
    org.apache.hadoop.hbase.backup.TestHFileArchiving
    org.apache.hadoop.hbase.client.TestSnapshotFromClientWithRegionReplicas
    org.apache.hadoop.hbase.client.TestClientPushback
    org.apache.hadoop.hbase.TestIOFencing
    org.apache.hadoop.hbase.client.TestClientTimeouts
    org.apache.hadoop.hbase.client.TestMobSnapshotFromClient
    org.apache.hadoop.hbase.snapshot.TestFlushSnapshotFromClient
    org.apache.hadoop.hbase.client.TestCloneSnapshotFromClient
    org.apache.hadoop.hbase.TestMultiVersions
{color:red}-1 core zombie tests{color}. There are 7 zombie test(s):
    at org.apache.hadoop.hbase.namespace.TestNamespaceAuditor.testRegionMerge(TestNamespaceAuditor.java:316)

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14966//testReport/
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14966//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14966//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14966//console

This message is automatically generated.
[jira] [Commented] (HBASE-12865) WALs may be deleted before they are replicated to peers
[ https://issues.apache.org/jira/browse/HBASE-12865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653394#comment-14653394 ]

Lars Hofhansl commented on HBASE-12865:

Yeah. Apologies from me as well... This went under the radar for some reason.

WALs may be deleted before they are replicated to peers

Key: HBASE-12865
URL: https://issues.apache.org/jira/browse/HBASE-12865
Project: HBase
Issue Type: Bug
Components: Replication
Reporter: Liu Shaohui
Assignee: He Liangliang
Priority: Critical
Attachments: HBASE-12865-V1.diff, HBASE-12865-V2.diff

By design, the ReplicationLogCleaner guarantees that WALs in a replication queue cannot be deleted by the HMaster. The ReplicationLogCleaner gets the WAL set from ZooKeeper by scanning the replication zk node. But it may get an incomplete WAL set during replication failover, because the scan operation is not atomic.

For example: there are three region servers, rs1, rs2, and rs3, and peer id 10. The layout of the replication ZooKeeper nodes is:
{code}
/hbase/replication/rs/rs1/10/wals
                     /rs2/10/wals
                     /rs3/10/wals
{code}
- t1: the ReplicationLogCleaner finishes scanning the replication queue of rs1 and starts to scan the queue of rs2.
- t2: region server rs3 goes down, and rs1 takes over rs3's replication queue. The new layout is:
{code}
/hbase/replication/rs/rs1/10/wals
                     /rs1/10-rs3/wals
                     /rs2/10/wals
                     /rs3
{code}
- t3: the ReplicationLogCleaner finishes scanning the queue of rs2 and starts to scan the node of rs3. But the queue has been moved to replication/rs1/10-rs3/WALS.

So the ReplicationLogCleaner will miss the WALs of rs3 in peer 10, and the HMaster may delete these WALs before they are replicated to peer clusters.

We encountered this problem in our cluster, and I think it is a serious bug for replication. Suggestions for fixing this bug are welcome. Thanks.
[jira] [Commented] (HBASE-12865) WALs may be deleted before they are replicated to peers
[ https://issues.apache.org/jira/browse/HBASE-12865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653406#comment-14653406 ]

Lars Hofhansl commented on HBASE-12865:

Patch looks good. I find it hard to convince myself that the cversion would change in all cases that we care about... I'll trust you on this.

Minor nit: {{int retry = 0; do {...; retry++;} while (true)}} can perhaps be expressed more nicely as {{for (int retry=0; ; retry++) {...}}}
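The race in the issue description and the cversion check that Lars refers to can be illustrated with a small in-memory simulation. This is a sketch under stated assumptions, not the HBASE-12865 patch: `FakeRsNode` is a hypothetical stand-in for the replication znode whose counter mimics ZooKeeper's child version, the queue layout is flattened to one level, and the WAL names are made up. The loop uses the `for (int retry = 0; ; retry++)` form suggested in the review, with a retry bound added for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical simulation; no ZooKeeper or HBase classes are used. */
public class CversionScanSketch {

    /** Stand-in for the "rs" znode: queues plus a counter bumped on child changes. */
    public static class FakeRsNode {
        private final Map<String, Set<String>> queues = new TreeMap<String, Set<String>>();
        private int cversion = 0;

        public synchronized int getCversion() {
            return cversion;
        }

        public synchronized List<String> listQueues() {
            return new ArrayList<String>(queues.keySet());
        }

        public synchronized Set<String> walsOf(String queue) {
            Set<String> wals = queues.get(queue);
            if (wals == null) {
                return Collections.<String>emptySet(); // queue was moved away
            }
            return new TreeSet<String>(wals);
        }

        public synchronized void putQueue(String queue, String... wals) {
            queues.put(queue, new TreeSet<String>(Arrays.asList(wals)));
            cversion++;
        }

        public synchronized void moveQueue(String from, String to) {
            Set<String> wals = queues.remove(from);
            if (wals != null) {
                queues.put(to, wals);
                cversion++;
            }
        }
    }

    /** Scans all queues and rescans if the child version changed during the scan. */
    public static Set<String> scanAllWals(FakeRsNode rs, Runnable afterEachQueue, int maxRetries) {
        for (int retry = 0; ; retry++) {
            int before = rs.getCversion();
            Set<String> wals = new TreeSet<String>();
            for (String queue : rs.listQueues()) {
                wals.addAll(rs.walsOf(queue));
                afterEachQueue.run(); // hook used to simulate a concurrent failover
            }
            if (rs.getCversion() == before) {
                return wals; // layout was stable for the whole scan
            }
            if (retry >= maxRetries) {
                throw new IllegalStateException("queue layout kept changing");
            }
        }
    }

    /** Replays t1..t3 from the description: rs3's queue moves to rs1 mid-scan. */
    public static Set<String> demo() {
        final FakeRsNode rs = new FakeRsNode();
        rs.putQueue("rs1/10", "wal-a");
        rs.putQueue("rs2/10", "wal-b");
        rs.putQueue("rs3/10", "wal-c");
        final AtomicBoolean moved = new AtomicBoolean(false);
        return scanAllWals(rs, () -> {
            if (moved.compareAndSet(false, true)) {
                rs.moveQueue("rs3/10", "rs1/10-rs3");
            }
        }, 3);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In the first pass the scan of `rs3/10` comes back empty because the queue has already moved, so a single unchecked pass would miss `wal-c`. The version comparison detects the change and the second pass picks the WAL up under `rs1/10-rs3`. Whether the real cversion changes in every failover case is exactly the doubt raised in the comment above, and this simulation does not settle it.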
[jira] [Commented] (HBASE-14184) Fix indention and type-o in JavaHBaseContext
[ https://issues.apache.org/jira/browse/HBASE-14184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653679#comment-14653679 ]

Ted Malaska commented on HBASE-14184:

Also fixed some JavaDoc stuff. Nothing in the code should have changed in this patch. Simple cleaning effort. Should be a simple review and commit.

Fix indention and type-o in JavaHBaseContext

Key: HBASE-14184
URL: https://issues.apache.org/jira/browse/HBASE-14184
Project: HBase
Issue Type: Wish
Components: spark
Reporter: Ted Malaska
Assignee: Ted Malaska
Priority: Minor
Attachments: HBASE-14184.3.patch

Looks like there is a Ddd that should be Rdd. Also looks like everything is indented one space too far.
[jira] [Updated] (HBASE-14184) Fix indention and type-o in JavaHBaseContext
[ https://issues.apache.org/jira/browse/HBASE-14184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Malaska updated HBASE-14184:
Attachment: HBASE-14184.3.patch
[jira] [Commented] (HBASE-14178) regionserver blocks because of waiting for offsetLock
[ https://issues.apache.org/jira/browse/HBASE-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653201#comment-14653201 ]

Duo Zhang commented on HBASE-14178:

[~anoopsamjohn] {{CacheConfig}} is a bit confusing, I think. {{family.isBlockCacheEnabled}} only corresponds to {{cacheDataOnRead}}, so we still have a chance to put data into the {{BlockCache}} if we set {{cacheDataOnWrite}} or {{prefetchOnOpen}} to {{true}}, even if we set {{cacheDataOnRead}} to {{false}}?

So I suggest that we add a new method here called {{shouldReadBlockFromCache}} that checks all the cases in which we may put a block into the {{BlockCache}}. Thanks.
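Duo Zhang's suggestion amounts to one predicate over the flags he lists. A minimal sketch of that predicate follows, assuming the three flags are plain booleans. `CacheDecisionSketch` is a hypothetical class written for illustration; it is not HBase's {{CacheConfig}}, and the method body is a guess at the intent of the comment rather than the code from any attached patch.

```java
/** Hypothetical sketch; not HBase's CacheConfig. */
public class CacheDecisionSketch {
    private final boolean cacheDataOnRead;
    private final boolean cacheDataOnWrite;
    private final boolean prefetchOnOpen;

    public CacheDecisionSketch(boolean cacheDataOnRead,
                               boolean cacheDataOnWrite,
                               boolean prefetchOnOpen) {
        this.cacheDataOnRead = cacheDataOnRead;
        this.cacheDataOnWrite = cacheDataOnWrite;
        this.prefetchOnOpen = prefetchOnOpen;
    }

    /**
     * A block can only be found in the cache if some path may have put it there,
     * so the cache is worth consulting when any of the three flags is set.
     */
    public boolean shouldReadBlockFromCache() {
        return cacheDataOnRead || cacheDataOnWrite || prefetchOnOpen;
    }

    public static void main(String[] args) {
        // cacheDataOnRead is off, but prefetchOnOpen may still have cached the block.
        CacheDecisionSketch config = new CacheDecisionSketch(false, false, true);
        System.out.println(config.shouldReadBlockFromCache());
    }
}
```

Checking only `cacheDataOnRead` would skip the cache in the case shown in `main`, which is the gap the comment points out.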
[jira] [Commented] (HBASE-14178) regionserver blocks because of waiting for offsetLock
[ https://issues.apache.org/jira/browse/HBASE-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653184#comment-14653184 ]

Hadoop QA commented on HBASE-14178:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12748601/HBASE-14178-0.98.patch
against 0.98 branch at commit 931e77d4507e1650c452cefadda450e0bf3f0528.
ATTACHMENT ID: 12748601

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings.
{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 21 warning messages.
{color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:green}+1 site{color}. The mvn post-site goal succeeds with this patch.
{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14965//testReport/
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14965//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14965//artifact/patchprocess/checkstyle-aggregate.html
Javadoc warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14965//artifact/patchprocess/patchJavadocWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14965//console

This message is automatically generated.
[jira] [Commented] (HBASE-14178) regionserver blocks because of waiting for offsetLock
[ https://issues.apache.org/jira/browse/HBASE-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653334#comment-14653334 ] Anoop Sam John commented on HBASE-14178: I see.. I didnt not check much like how this variable is getting initialized in CacheConfig.. Ya we better do some cleanup there. So much confusing stuff. bq.and we still have chance to put data into BlockCache if we set cacheDataOnWrite or prefetchOnOpen to true even if we set cacheDataOnRead to false? I did not test it. Nice to test with some UTs. If at CF level we set like never cache the data from this CF into BC, we should NOT cache it at all. Whatever be value of cacheDataOnWrite or prefetchOnOpen. If we are not doing so, then those are bugs to be addressed. regionserver blocks because of waiting for offsetLock - Key: HBASE-14178 URL: https://issues.apache.org/jira/browse/HBASE-14178 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.98.6 Reporter: Heng Chen Priority: Critical Fix For: 0.98.6 Attachments: HBASE-14178-0.98.patch, HBASE-14178.patch, HBASE-14178_v1.patch, HBASE-14178_v2.patch, HBASE-14178_v3.patch, HBASE-14178_v4.patch, jstack My regionserver blocks, and all client rpc timeout. 
[jira] [Updated] (HBASE-14183) Scanning hbase meta table is failing in master branch
[ https://issues.apache.org/jira/browse/HBASE-14183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish Singhi updated HBASE-14183: -- Attachment: HBASE-14183.patch Scanning hbase meta table is failing in master branch - Key: HBASE-14183 URL: https://issues.apache.org/jira/browse/HBASE-14183 Project: HBase Issue Type: Bug Affects Versions: 2.0.0 Reporter: Ashish Singhi Assignee: Ashish Singhi Fix For: 2.0.0 Attachments: HBASE-14183.patch As part of HBASE-14047 cleanup this issue has been introduced. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14183) Scanning hbase meta table is failing in master branch
[ https://issues.apache.org/jira/browse/HBASE-14183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish Singhi updated HBASE-14183: -- Status: Patch Available (was: Open) Scanning hbase meta table is failing in master branch - Key: HBASE-14183 URL: https://issues.apache.org/jira/browse/HBASE-14183 Project: HBase Issue Type: Bug Affects Versions: 2.0.0 Reporter: Ashish Singhi Assignee: Ashish Singhi Fix For: 2.0.0 Attachments: HBASE-14183.patch As part of HBASE-14047 cleanup this issue has been introduced. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14183) Scanning hbase meta table is failing in master branch
[ https://issues.apache.org/jira/browse/HBASE-14183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653645#comment-14653645 ] Ashish Singhi commented on HBASE-14183: --- Checked no other place is missed. Please review. Scanning hbase meta table is failing in master branch - Key: HBASE-14183 URL: https://issues.apache.org/jira/browse/HBASE-14183 Project: HBase Issue Type: Bug Affects Versions: 2.0.0 Reporter: Ashish Singhi Assignee: Ashish Singhi Fix For: 2.0.0 Attachments: HBASE-14183.patch As part of HBASE-14047 cleanup this issue has been introduced. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-14184) Fix indention and type-o in JavaHBaseContext
Ted Malaska created HBASE-14184: --- Summary: Fix indention and type-o in JavaHBaseContext Key: HBASE-14184 URL: https://issues.apache.org/jira/browse/HBASE-14184 Project: HBase Issue Type: Wish Components: spark Reporter: Ted Malaska Assignee: Ted Malaska Priority: Minor Looks like there is a Ddd that should be Rdd. Also, everything appears to be indented one space too far. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14150) Add BulkLoad functionality to HBase-Spark Module
[ https://issues.apache.org/jira/browse/HBASE-14150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Malaska updated HBASE-14150: Attachment: HBASE-14150.2.patch Did the following: 1. Added a test for the RDD implicit function 2. Applied some of Ted Y's comments Add BulkLoad functionality to HBase-Spark Module Key: HBASE-14150 URL: https://issues.apache.org/jira/browse/HBASE-14150 Project: HBase Issue Type: New Feature Components: spark Reporter: Ted Malaska Assignee: Ted Malaska Attachments: HBASE-14150.1.patch, HBASE-14150.2.patch Add on to the work done in HBASE-13992 to add functionality to do a bulk load from a given RDD. This will do the following: 1. Figure out the number of regions, then sort and partition the data correctly to be written out to HFiles. 2. Unlike the MR bulk load, I would like the columns to be sorted in the shuffle stage and not in the memory of the reducer. This will allow this design to support super wide records without running out of memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
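The partitioning step described in HBASE-14150 (route each row to the region that will own its HFile) can be illustrated with a small sketch. This is not the code from the attached patches: the class `RegionPartitionSketch` and its methods are hypothetical, and it assumes the region start keys are already known, sorted, and begin with the empty key. It relies only on the fact that HBase orders row keys as unsigned bytes, lexicographically.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch, not the HBASE-14150 patch: maps a row key to its region index. */
final class RegionPartitionSketch {
  private RegionPartitionSketch() {}

  /** Convenience for the demo: UTF-8 bytes of a string. */
  static byte[] bytes(String s) {
    return s.getBytes(StandardCharsets.UTF_8);
  }

  /** Unsigned lexicographic byte comparison, the order HBase uses for row keys. */
  static int compare(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int x = a[i] & 0xff;
      int y = b[i] & 0xff;
      if (x != y) {
        return x - y;
      }
    }
    return a.length - b.length;
  }

  /**
   * Returns the index of the last start key that is less than or equal to rowKey.
   * Assumes startKeys is sorted and startKeys[0] is the empty key.
   */
  static int regionFor(byte[][] startKeys, byte[] rowKey) {
    int lo = 0;
    int hi = startKeys.length - 1;
    int found = 0;
    while (lo <= hi) {
      int mid = (lo + hi) >>> 1;
      if (compare(startKeys[mid], rowKey) <= 0) {
        found = mid;      // candidate region; look for a later one
        lo = mid + 1;
      } else {
        hi = mid - 1;
      }
    }
    return found;
  }

  public static void main(String[] args) {
    byte[][] startKeys = {bytes(""), bytes("g"), bytes("n")};
    for (String row : new String[] {"apple", "g", "mango", "zebra"}) {
      System.out.println(row + " -> region " + regionFor(startKeys, bytes(row)));
    }
  }
}
```

A Spark partitioner could presumably wrap a lookup like `regionFor`, with the column qualifier added to the shuffle key so that the sort within each partition also orders the columns, which is the second goal listed in the issue.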
[jira] [Commented] (HBASE-14183) Scanning hbase meta table is failing in master branch
[ https://issues.apache.org/jira/browse/HBASE-14183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653722#comment-14653722 ] Anoop Sam John commented on HBASE-14183: Why not use kv.getValueLength? Scanning hbase meta table is failing in master branch - Key: HBASE-14183 URL: https://issues.apache.org/jira/browse/HBASE-14183 Project: HBase Issue Type: Bug Affects Versions: 2.0.0 Reporter: Ashish Singhi Assignee: Ashish Singhi Fix For: 2.0.0 Attachments: HBASE-14183.patch As part of HBASE-14047 cleanup this issue has been introduced. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14182) My regionserver change ip. But hmaster still connect to old ip after the rs restart
[ https://issues.apache.org/jira/browse/HBASE-14182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653730#comment-14653730 ] Heng Chen commented on HBASE-14182: --- I think I found the answer! RpcClient uses Java's InetAddress class, and InetAddress keeps a cache of (host, IP) pairs. getAllByName0 is called when the IP for a host is requested; the source code in JDK 1.8 is below:
{code}
private static InetAddress[] getAllByName0 (String host, InetAddress reqAddr, boolean check)
    throws UnknownHostException {

    /* If it gets here it is presumed to be a hostname */
    /* Cache.get can return: null, unknownAddress, or InetAddress[] */

    /* make sure the connection to the host is allowed, before we
     * give out a hostname
     */
    if (check) {
        SecurityManager security = System.getSecurityManager();
        if (security != null) {
            security.checkConnect(host, -1);
        }
    }

    InetAddress[] addresses = getCachedAddresses(host);

    /* If no entry in cache, then do the host lookup */
    if (addresses == null) {
        addresses = getAddressesFromNameService(host, reqAddr);
    }

    if (addresses == unknown_array)
        throw new UnknownHostException(host);

    return addresses.clone();
}
{code}
It consults the cache first, so we can't change the RS IP without restarting hmaster. One solution is to store the IP information in ZK and pass it to InetAddress when creating a new instance, which would solve the problem. My regionserver change ip. But hmaster still connect to old ip after the rs restart --- Key: HBASE-14182 URL: https://issues.apache.org/jira/browse/HBASE-14182 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.98.6 Reporter: Heng Chen I use Docker to deploy my HBase cluster, and the RS IP changed. After restarting this RS, the hmaster web UI shows it connected to hmaster, but its region count is still zero after a long time. I checked the hmaster log and found that the master still uses the old IP to connect to this RS.
This is hmaster's log below: PS: 10.11.21.140 is old ip of rs dx-ape-regionserver1-online {code} 2015-08-04 17:24:00,081 INFO [AM.ZK.Worker-pool2-t14141] master.AssignmentManager: Assigning solar_image,\x01Y\x8E\xA3y,1434968237206.4a1bdeec85b9f55b962596f9fb2cd07f. to dx-ape-regionserver1-online,60020,1438679950072 2015-08-04 17:24:06,800 WARN [AM.ZK.Worker-pool2-t14133] master.AssignmentManager: Failed assignment of solar_image,\x00\x94\x09\x8D\x95,1430991781025.b0f5b755f443d41cf306026a60675020. to dx-ape-regionserver1-online,60020,1438679950072, trying to assign elsewhere instead; try=3 of 10 java.net.ConnectException: Connection timed out at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493) at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:578) at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:868) at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1543) at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1442) at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661) at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719) at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:20964) at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:671) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2097) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1577) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1550) at 
org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:104) at org.apache.hadoop.hbase.master.AssignmentManager.handleRegion(AssignmentManager.java:999) at org.apache.hadoop.hbase.master.AssignmentManager$6.run(AssignmentManager.java:1447) at org.apache.hadoop.hbase.master.AssignmentManager$3.run(AssignmentManager.java:1260) at
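The ZK-based idea from the comment above (hand InetAddress an IP that was obtained elsewhere instead of resolving the hostname) can be sketched with the public factory `InetAddress.getByAddress(String, byte[])`, which performs no name-service lookup. This is only an illustration under assumptions: the class `PinnedAddress` is hypothetical, `10.11.21.141` is a made-up "new" IP, and how the IP would actually be published in and read from ZooKeeper is not shown.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical helper, not part of HBase: builds an address from an already-known IP. */
final class PinnedAddress {
  private PinnedAddress() {}

  /** Parses a dotted-quad IPv4 literal into its four raw bytes. */
  static byte[] parseIpv4(String ip) {
    String[] parts = ip.split("\\.");
    if (parts.length != 4) {
      throw new IllegalArgumentException("not an IPv4 literal: " + ip);
    }
    byte[] raw = new byte[4];
    for (int i = 0; i < 4; i++) {
      int octet = Integer.parseInt(parts[i]);
      if (octet < 0 || octet > 255) {
        throw new IllegalArgumentException("octet out of range in: " + ip);
      }
      raw[i] = (byte) octet;
    }
    return raw;
  }

  /**
   * Pairs the hostname with the supplied IP. getByAddress does not query the
   * name service, so a stale resolver cache entry cannot influence the result.
   */
  static InetAddress of(String host, String ip) {
    try {
      return InetAddress.getByAddress(host, parseIpv4(ip));
    } catch (UnknownHostException e) {
      // Only thrown for byte arrays of illegal length, which parseIpv4 rules out.
      throw new IllegalArgumentException(e);
    }
  }

  public static void main(String[] args) {
    // Assumed values: the hostname comes from the log above, the IP is invented.
    InetAddress addr = of("dx-ape-regionserver1-online", "10.11.21.141");
    System.out.println(addr.getHostName() + " -> " + addr.getHostAddress());
  }
}
```

The trade-off is that the caller becomes responsible for keeping the IP current, which is why a TTL on the JVM's own cache may be the simpler route.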
[jira] [Commented] (HBASE-14182) My regionserver change ip. But hmaster still connect to old ip after the rs restart
[ https://issues.apache.org/jira/browse/HBASE-14182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653748#comment-14653748 ] Heng Chen commented on HBASE-14182: --- There seems to be a better solution. As the JDK docs say: {quote} InetAddress Caching The InetAddress class has a cache to store successful as well as unsuccessful host name resolutions. By default, when a security manager is installed, in order to protect against DNS spoofing attacks, the result of positive host name resolutions are cached forever. When a security manager is not installed, the default behavior is to cache entries for a finite (implementation dependent) period of time. The result of unsuccessful host name resolution is cached for a very short period of time (10 seconds) to improve performance. If the default behavior is not desired, then a Java security property can be set to a different Time-to-live (TTL) value for positive caching. Likewise, a system admin can configure a different negative caching TTL value when needed. Two Java security properties control the TTL values used for positive and negative host name resolution caching: networkaddress.cache.ttl Indicates the caching policy for successful name lookups from the name service. The value is specified as as integer to indicate the number of seconds to cache the successful lookup. The default setting is to cache for an implementation specific period of time. A value of -1 indicates cache forever. networkaddress.cache.negative.ttl (default: 10) Indicates the caching policy for un-successful name lookups from the name service. The value is specified as as integer to indicate the number of seconds to cache the failure for un-successful lookups. A value of 0 indicates never cache. A value of -1 indicates cache forever. {quote} We can set networkaddress.cache.ttl to a finite value. My regionserver change ip. 
But hmaster still connect to old ip after the rs restart --- Key: HBASE-14182 URL: https://issues.apache.org/jira/browse/HBASE-14182 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.98.6 Reporter: Heng Chen I use docker to deploy my hbase cluster, and the RS ip changed. When restart this RS, hmaster webUI shows it connect to hmaster, but regions num. is zero after a long time. I check the hmaster log and found that master still use old ip to connect this rs. This is hmaster's log below: PS: 10.11.21.140 is old ip of rs dx-ape-regionserver1-online {code} 2015-08-04 17:24:00,081 INFO [AM.ZK.Worker-pool2-t14141] master.AssignmentManager: Assigning solar_image,\x01Y\x8E\xA3y,1434968237206.4a1bdeec85b9f55b962596f9fb2cd07f. to dx-ape-regionserver1-online,60020,1438679950072 2015-08-04 17:24:06,800 WARN [AM.ZK.Worker-pool2-t14133] master.AssignmentManager: Failed assignment of solar_image,\x00\x94\x09\x8D\x95,1430991781025.b0f5b755f443d41cf306026a60675020. to dx-ape-regionserver1-online,60020,1438679950072, trying to assign elsewhere instead; try=3 of 10 java.net.ConnectException: Connection timed out at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493) at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:578) at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:868) at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1543) at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1442) at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661) at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719) at 
org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:20964) at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:671) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2097) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1577) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1550) at org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:104) at
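The TTL suggestion above can be tried programmatically with `java.security.Security.setProperty`, using the two property names quoted from the JDK docs. The following is a minimal sketch, not an HBase change: the class `DnsCacheTtl` is hypothetical and the 60-second value is an arbitrary assumption. As far as I know the JVM reads these properties once, when the address cache policy is first initialized, so the call would have to run before the process performs its first host lookup; setting the same properties in the JRE's `java.security` file should be equivalent.

```java
import java.security.Security;

/** Hypothetical startup hook, not part of HBase: bounds the JVM's DNS cache lifetime. */
final class DnsCacheTtl {
  static final String POSITIVE_TTL = "networkaddress.cache.ttl";
  static final String NEGATIVE_TTL = "networkaddress.cache.negative.ttl";

  private DnsCacheTtl() {}

  /** Sets both security properties; values are in seconds. */
  static void apply(int positiveSeconds, int negativeSeconds) {
    Security.setProperty(POSITIVE_TTL, Integer.toString(positiveSeconds));
    Security.setProperty(NEGATIVE_TTL, Integer.toString(negativeSeconds));
  }

  public static void main(String[] args) {
    // 60 is an assumed value; 10 matches the documented negative-cache default.
    apply(60, 10);
    System.out.println(POSITIVE_TTL + "=" + Security.getProperty(POSITIVE_TTL));
    System.out.println(NEGATIVE_TTL + "=" + Security.getProperty(NEGATIVE_TTL));
  }
}
```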
[jira] [Commented] (HBASE-14178) regionserver blocks because of waiting for offsetLock
[ https://issues.apache.org/jira/browse/HBASE-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653391#comment-14653391 ] Heng Chen commented on HBASE-14178: --- {quote} Ideally, when the BC is enabled and CF level there is no setting like NOT to cache data into BC, we should try read it from the BC. Also even if the CF level setting is there and we are not reading back Data blocks, then also we have to consult BC. Still it will be much cleaner to do ur suggestion of adding the new method to CacheConfig. It will look much cleaner. {quote} I agree with both of you. I will add a function named shouldReadBlockFromCache to CacheConfig that checks all the situations in which we should read from the BC. But there is one problem: we acquire the lock to ensure that the next request can read the block from the BC. If cacheDataOnRead is false but cacheDataOnWrite is true, then, as we discussed, we still read from the BC and acquire the lock. But after reading the block from HDFS, we use another condition to decide whether to cache the block, and it will not cache the block when cacheDataOnRead is false and cacheDataOnWrite is true. In this situation the lock is useless, so I think we should use a separate 'if' to check whether we need to acquire the lock. What do you think? regionserver blocks because of waiting for offsetLock - Key: HBASE-14178 URL: https://issues.apache.org/jira/browse/HBASE-14178 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.98.6 Reporter: Heng Chen Priority: Critical Fix For: 0.98.6 Attachments: HBASE-14178-0.98.patch, HBASE-14178.patch, HBASE-14178_v1.patch, HBASE-14178_v2.patch, HBASE-14178_v3.patch, HBASE-14178_v4.patch, jstack My regionserver blocks, and all client rpc timeout. 
[jira] [Commented] (HBASE-14178) regionserver blocks because of waiting for offsetLock
[ https://issues.apache.org/jira/browse/HBASE-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653415#comment-14653415 ] Duo Zhang commented on HBASE-14178: --- Yes, the problem here is the lock, not when to read from cache...So if we can make sure the block will not be put into cache after we fetch it from HDFS, then we can bypass the locking step. regionserver blocks because of waiting for offsetLock - Key: HBASE-14178 URL: https://issues.apache.org/jira/browse/HBASE-14178 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.98.6 Reporter: Heng Chen Priority: Critical Fix For: 0.98.6 Attachments: HBASE-14178-0.98.patch, HBASE-14178.patch, HBASE-14178_v1.patch, HBASE-14178_v2.patch, HBASE-14178_v3.patch, HBASE-14178_v4.patch, jstack My regionserver blocks, and all client rpc timeout. I print the regionserver's jstack, it seems a lot of threads were blocked for waiting offsetLock, detail infomation belows: PS: my table's block cache is off {code} B.DefaultRpcServer.handler=2,queue=2,port=60020 #82 daemon prio=5 os_prio=0 tid=0x01827000 nid=0x2cdc in Object.wait() [0x7f3831b72000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) at java.lang.Object.wait(Object.java:502) at org.apache.hadoop.hbase.util.IdLock.getLockEntry(IdLock.java:79) - locked 0x000773af7c18 (a org.apache.hadoop.hbase.util.IdLock$Entry) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:352) at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:253) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:524) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:572) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:257) at 
org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:173) at org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:55) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:313) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:269) at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:695) at org.apache.hadoop.hbase.regionserver.StoreScanner.seekAsDirection(StoreScanner.java:683) at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:533) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:140) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:3889) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3969) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3847) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3820) - locked 0x0005e5c55ad0 (a org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3807) at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4779) at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4753) at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:2916) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29583) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2027) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108) at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114) at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94) at java.lang.Thread.run(Thread.java:745) Locked ownable synchronizers: - 0x0005e5c55c08 (a 
java.util.concurrent.locks.ReentrantLock$NonfairSync) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
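The two separate checks discussed in this thread (when to consult the block cache, and when taking the per-offset lock is worthwhile) can be written down as a small sketch. This is not the code from any of the attached patches and not HBase's real CacheConfig: the class `CacheFlagsSketch` and the method `shouldLockOnCacheMiss` are hypothetical, only `shouldReadBlockFromCache` is a name taken from the comments, and the sketch covers data blocks only.

```java
/** Hypothetical stand-in for the per-CF cache flags discussed above; not HBase's CacheConfig. */
final class CacheFlagsSketch {
  final boolean blockCacheEnabled;
  final boolean cacheDataOnRead;
  final boolean cacheDataOnWrite;
  final boolean prefetchOnOpen;

  CacheFlagsSketch(boolean blockCacheEnabled, boolean cacheDataOnRead,
      boolean cacheDataOnWrite, boolean prefetchOnOpen) {
    this.blockCacheEnabled = blockCacheEnabled;
    this.cacheDataOnRead = cacheDataOnRead;
    this.cacheDataOnWrite = cacheDataOnWrite;
    this.prefetchOnOpen = prefetchOnOpen;
  }

  /** A data block may already be cached if any of the three paths could have put it there. */
  boolean shouldReadBlockFromCache() {
    return blockCacheEnabled && (cacheDataOnRead || cacheDataOnWrite || prefetchOnOpen);
  }

  /**
   * The lock exists so that waiting readers find the block in the cache afterwards.
   * That only happens if a read from HDFS is followed by a cache insert, so the
   * lock is skipped whenever the read path will not cache the block.
   */
  boolean shouldLockOnCacheMiss() {
    return blockCacheEnabled && cacheDataOnRead;
  }

  public static void main(String[] args) {
    // The case raised in the thread: cacheDataOnRead=false, cacheDataOnWrite=true.
    CacheFlagsSketch cf = new CacheFlagsSketch(true, false, true, false);
    System.out.println("consult cache: " + cf.shouldReadBlockFromCache());
    System.out.println("take lock:     " + cf.shouldLockOnCacheMiss());
  }
}
```

With the block cache off, as in the reporter's table, both checks return false, so under these assumptions readers would go straight to HDFS without queuing on the lock.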
[jira] [Commented] (HBASE-14178) regionserver blocks because of waiting for offsetLock
[ https://issues.apache.org/jira/browse/HBASE-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653341#comment-14653341 ] Anoop Sam John commented on HBASE-14178: bq. So I suggest here we make a new method called shouldReadBlockFromCache, and check all the possibility that we may put a block into BlockCache Ideally, when the BC is enabled and there is no CF-level setting that says NOT to cache data in the BC, we should try to read it from the BC. Even if the CF-level setting is there and we are not reading back data blocks, we still have to consult the BC. Still, your suggestion of adding the new method to CacheConfig would be much cleaner. regionserver blocks because of waiting for offsetLock - Key: HBASE-14178 URL: https://issues.apache.org/jira/browse/HBASE-14178 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.98.6 Reporter: Heng Chen Priority: Critical Fix For: 0.98.6 Attachments: HBASE-14178-0.98.patch, HBASE-14178.patch, HBASE-14178_v1.patch, HBASE-14178_v2.patch, HBASE-14178_v3.patch, HBASE-14178_v4.patch, jstack My regionserver blocks, and all client rpc timeout. 
[jira] [Created] (HBASE-14183) Scanning hbase meta table is failing in master branch
Ashish Singhi created HBASE-14183: - Summary: Scanning hbase meta table is failing in master branch Key: HBASE-14183 URL: https://issues.apache.org/jira/browse/HBASE-14183 Project: HBase Issue Type: Bug Affects Versions: 2.0.0 Reporter: Ashish Singhi Assignee: Ashish Singhi Fix For: 2.0.0 As part of HBASE-14047 cleanup this issue has been introduced. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14178) regionserver blocks because of waiting for offsetLock
[ https://issues.apache.org/jira/browse/HBASE-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653814#comment-14653814 ] Hadoop QA commented on HBASE-14178: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12748653/HBASE-14178_v6.patch against master branch at commit 931e77d4507e1650c452cefadda450e0bf3f0528. ATTACHMENT ID: 12748653 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: {color:red}-1 core zombie tests{color}. 
There are 3 zombie test(s): at org.apache.hadoop.hbase.client.TestReplicasClient.testSmallScanWithReplicas(TestReplicasClient.java:606) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14967//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14967//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14967//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14967//console This message is automatically generated. regionserver blocks because of waiting for offsetLock - Key: HBASE-14178 URL: https://issues.apache.org/jira/browse/HBASE-14178 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.98.6 Reporter: Heng Chen Priority: Critical Fix For: 0.98.6 Attachments: HBASE-14178-0.98.patch, HBASE-14178.patch, HBASE-14178_v1.patch, HBASE-14178_v2.patch, HBASE-14178_v3.patch, HBASE-14178_v4.patch, HBASE-14178_v5.patch, HBASE-14178_v6.patch, jstack My regionserver blocks, and all client rpc timeout. 
I printed the regionserver's jstack; it seems a lot of threads were blocked waiting for offsetLock. Details are below. PS: my table's block cache is off. {code} B.DefaultRpcServer.handler=2,queue=2,port=60020 #82 daemon prio=5 os_prio=0 tid=0x01827000 nid=0x2cdc in Object.wait() [0x7f3831b72000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) at java.lang.Object.wait(Object.java:502) at org.apache.hadoop.hbase.util.IdLock.getLockEntry(IdLock.java:79) - locked 0x000773af7c18 (a org.apache.hadoop.hbase.util.IdLock$Entry) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:352) at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:253) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:524) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:572) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:257) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:173) at org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:55) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:313) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:269) at
[jira] [Commented] (HBASE-14185) Incorrect region names logged by MemStoreFlusher
[ https://issues.apache.org/jira/browse/HBASE-14185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654790#comment-14654790 ] Hudson commented on HBASE-14185: FAILURE: Integrated in HBase-1.2 #90 (See [https://builds.apache.org/job/HBase-1.2/90/]) HBASE-14185 Incorrect region names logged by MemStoreFlusher (Biju Nair) (tedyu: rev 2906b44c5f49c5ccadf9f40e4342ae41dc463d48) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java Incorrect region names logged by MemStoreFlusher Key: HBASE-14185 URL: https://issues.apache.org/jira/browse/HBASE-14185 Project: HBase Issue Type: Bug Components: regionserver Reporter: Biju Nair Assignee: Biju Nair Priority: Minor Fix For: 2.0.0, 1.2.0, 1.1.2, 1.3.0 Attachments: HBASE-14185.patch In MemstoreFlusher the method [flushOneForGlobalPressure|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java#L142] logs incorrect region names which makes debugging issues a bit difficult. Instead of logging the secondary replica region names in [these|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java#L200] [locations|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java#L205], the code logs the primary replica region names. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14183) Scanning hbase meta table is failing in master branch
[ https://issues.apache.org/jira/browse/HBASE-14183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-14183: --- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Scanning hbase meta table is failing in master branch - Key: HBASE-14183 URL: https://issues.apache.org/jira/browse/HBASE-14183 Project: HBase Issue Type: Bug Affects Versions: 2.0.0 Reporter: Ashish Singhi Assignee: Ashish Singhi Fix For: 2.0.0 Attachments: HBASE-14183-v1.patch, HBASE-14183.patch As part of HBASE-14047 cleanup this issue has been introduced. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14183) Scanning hbase meta table is failing in master branch
[ https://issues.apache.org/jira/browse/HBASE-14183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-14183: --- Component/s: shell Scanning hbase meta table is failing in master branch - Key: HBASE-14183 URL: https://issues.apache.org/jira/browse/HBASE-14183 Project: HBase Issue Type: Bug Components: shell Affects Versions: 2.0.0 Reporter: Ashish Singhi Assignee: Ashish Singhi Fix For: 2.0.0 Attachments: HBASE-14183-v1.patch, HBASE-14183.patch As part of HBASE-14047 cleanup this issue has been introduced. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14183) [Shell] Scanning hbase meta table is failing in master branch
[ https://issues.apache.org/jira/browse/HBASE-14183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-14183: --- Summary: [Shell] Scanning hbase meta table is failing in master branch (was: Scanning hbase meta table is failing in master branch) [Shell] Scanning hbase meta table is failing in master branch - Key: HBASE-14183 URL: https://issues.apache.org/jira/browse/HBASE-14183 Project: HBase Issue Type: Bug Components: shell Affects Versions: 2.0.0 Reporter: Ashish Singhi Assignee: Ashish Singhi Fix For: 2.0.0 Attachments: HBASE-14183-v1.patch, HBASE-14183.patch As part of HBASE-14047 cleanup this issue has been introduced. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14185) Incorrect region names logged by MemStoreFlusher
[ https://issues.apache.org/jira/browse/HBASE-14185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654821#comment-14654821 ] Hudson commented on HBASE-14185: FAILURE: Integrated in HBase-TRUNK #6697 (See [https://builds.apache.org/job/HBase-TRUNK/6697/]) HBASE-14185 Incorrect region names logged by MemStoreFlusher (Biju Nair) (tedyu: rev a0d72051dbace9dc4ec6ab288f2f6553e2ee7307) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java Incorrect region names logged by MemStoreFlusher Key: HBASE-14185 URL: https://issues.apache.org/jira/browse/HBASE-14185 Project: HBase Issue Type: Bug Components: regionserver Reporter: Biju Nair Assignee: Biju Nair Priority: Minor Fix For: 2.0.0, 1.2.0, 1.1.2, 1.3.0 Attachments: HBASE-14185.patch In MemstoreFlusher the method [flushOneForGlobalPressure|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java#L142] logs incorrect region names which makes debugging issues a bit difficult. Instead of logging the secondary replica region names in [these|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java#L200] [locations|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java#L205], the code logs the primary replica region names. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14021) Quota table has a wrong description on the UI
[ https://issues.apache.org/jira/browse/HBASE-14021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654822#comment-14654822 ] Hudson commented on HBASE-14021: FAILURE: Integrated in HBase-TRUNK #6697 (See [https://builds.apache.org/job/HBase-TRUNK/6697/]) HBASE-14021 Quota table has a wrong description on the UI (Ashish Singhi) (tedyu: rev 5f6632f80159f283125a7a826d5f8ef76dbe1caa) * hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/master/MasterStatusTmpl.jamon Quota table has a wrong description on the UI - Key: HBASE-14021 URL: https://issues.apache.org/jira/browse/HBASE-14021 Project: HBase Issue Type: Bug Components: UI Affects Versions: 1.1.0 Reporter: Ashish Singhi Assignee: Ashish Singhi Priority: Minor Fix For: 2.0.0, 1.3.0, 1.2.1 Attachments: HBASE-14021.patch, HBASE-14021.patch, error.png, fix.png !error.png! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-5878) Use getVisibleLength public api from HdfsDataInputStream from Hadoop-2.
[ https://issues.apache.org/jira/browse/HBASE-5878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654844#comment-14654844 ] Hadoop QA commented on HBASE-5878: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12748763/HBASE-5878-v5.patch against master branch at commit 5f6632f80159f283125a7a826d5f8ef76dbe1caa. ATTACHMENT ID: 12748763 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14978//testReport/ Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14978//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14978//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14978//console This message is automatically generated. Use getVisibleLength public api from HdfsDataInputStream from Hadoop-2. --- Key: HBASE-5878 URL: https://issues.apache.org/jira/browse/HBASE-5878 Project: HBase Issue Type: Bug Components: wal Reporter: Uma Maheswara Rao G Assignee: Ashish Singhi Fix For: 2.0.0, 1.1.2, 1.3.0, 1.2.1, 1.0.3 Attachments: HBASE-5878-v2.patch, HBASE-5878-v3.patch, HBASE-5878-v4.patch, HBASE-5878-v5.patch, HBASE-5878-v5.patch, HBASE-5878.patch SequenceFileLogReader: Currently HBase calls the getFileLength API of the DFSInputStream class via reflection. DFSInputStream is not exposed as public, so this may change in the future. HDFS now exposes HdfsDataInputStream as a public API. We can use it as an else branch when we cannot find the getFileLength API on DFSInputStream, so that we avoid sudden surprises like the one we are facing today. Also, the code only logs one warning and proceeds if any exception is thrown while getting the length. I think we can re-throw the exception, because there is no point in continuing with data loss.
{code}
long adjust = 0;
try {
  Field fIn = FilterInputStream.class.getDeclaredField("in");
  fIn.setAccessible(true);
  Object realIn = fIn.get(this.in);
  // In hadoop 0.22, DFSInputStream is a standalone class. Before this,
  // it was an inner class of DFSClient.
  if (realIn.getClass().getName().endsWith("DFSInputStream")) {
    Method getFileLength = realIn.getClass().
      getDeclaredMethod("getFileLength", new Class<?> []{});
    getFileLength.setAccessible(true);
    long realLength = ((Long)getFileLength.
      invoke(realIn, new Object []{})).longValue();
    assert(realLength >= this.length);
    adjust = realLength - this.length;
  } else {
    LOG.info("Input stream class: " + realIn.getClass().getName() +
        ", not adjusting length");
  }
} catch(Exception e) {
  SequenceFileLogReader.LOG.warn(
    "Error while trying to get accurate file length. " +
    "Truncation / data loss may occur if RegionServers die.", e);
}
return adjust + super.getPos();
{code}
-- This message was sent by
[jira] [Commented] (HBASE-14021) Quota table has a wrong description on the UI
[ https://issues.apache.org/jira/browse/HBASE-14021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654849#comment-14654849 ] Ashish Singhi commented on HBASE-14021: --- Thanks Ted and Nick. Quota table has a wrong description on the UI - Key: HBASE-14021 URL: https://issues.apache.org/jira/browse/HBASE-14021 Project: HBase Issue Type: Bug Components: UI Affects Versions: 1.1.0 Reporter: Ashish Singhi Assignee: Ashish Singhi Priority: Minor Fix For: 2.0.0, 1.3.0, 1.2.1 Attachments: HBASE-14021.patch, HBASE-14021.patch, error.png, fix.png !error.png! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14183) [Shell] Scanning hbase meta table is failing in master branch
[ https://issues.apache.org/jira/browse/HBASE-14183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654850#comment-14654850 ] Ashish Singhi commented on HBASE-14183: --- Thanks Anoop and Ted. [Shell] Scanning hbase meta table is failing in master branch - Key: HBASE-14183 URL: https://issues.apache.org/jira/browse/HBASE-14183 Project: HBase Issue Type: Bug Components: shell Affects Versions: 2.0.0 Reporter: Ashish Singhi Assignee: Ashish Singhi Fix For: 2.0.0 Attachments: HBASE-14183-v1.patch, HBASE-14183.patch As part of HBASE-14047 cleanup this issue has been introduced. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13865) Default value of hbase.hregion.memstore.block.multiplier in HBase book is wrong
[ https://issues.apache.org/jira/browse/HBASE-13865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654870#comment-14654870 ] Hadoop QA commented on HBASE-13865: --- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12748770/HBASE-13865.2.patch against master branch at commit 5f6632f80159f283125a7a826d5f8ef76dbe1caa. ATTACHMENT ID: 12748770 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 16 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14979//testReport/ Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14979//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14979//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14979//console This message is automatically generated. Default value of hbase.hregion.memstore.block.multiplier in HBase book is wrong --- Key: HBASE-13865 URL: https://issues.apache.org/jira/browse/HBASE-13865 Project: HBase Issue Type: Bug Components: documentation Affects Versions: 2.0.0 Reporter: Vladimir Rodionov Assignee: Gabor Liptak Priority: Trivial Attachments: HBASE-13865.1.patch, HBASE-13865.2.patch, HBASE-13865.2.patch It's 4 in the book and 2 in the current master. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14178) regionserver blocks because of waiting for offsetLock
[ https://issues.apache.org/jira/browse/HBASE-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654613#comment-14654613 ] Duo Zhang commented on HBASE-14178: --- Oh, I looked at the wrong line number... It is the same reason: if {{blockType}} is {{null}}, the safe way is to read it from {{BlockCache}} first. Maybe it is an {{INDEX}} or {{BLOOM}} block. Thanks. regionserver blocks because of waiting for offsetLock - Key: HBASE-14178 URL: https://issues.apache.org/jira/browse/HBASE-14178 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.98.6 Reporter: Heng Chen Priority: Critical Fix For: 0.98.6 Attachments: HBASE-14178-0.98.patch, HBASE-14178.patch, HBASE-14178_v1.patch, HBASE-14178_v2.patch, HBASE-14178_v3.patch, HBASE-14178_v4.patch, HBASE-14178_v5.patch, HBASE-14178_v6.patch, jstack My regionserver blocks, and all client RPCs time out. I printed the regionserver's jstack; it seems a lot of threads were blocked waiting for offsetLock. Details are below. PS: my table's block cache is off. {code} B.DefaultRpcServer.handler=2,queue=2,port=60020 #82 daemon prio=5 os_prio=0 tid=0x01827000 nid=0x2cdc in Object.wait() [0x7f3831b72000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) at java.lang.Object.wait(Object.java:502) at org.apache.hadoop.hbase.util.IdLock.getLockEntry(IdLock.java:79) - locked 0x000773af7c18 (a org.apache.hadoop.hbase.util.IdLock$Entry) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:352) at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:253) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:524) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:572) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:257) at
org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:173) at org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:55) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:313) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:269) at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:695) at org.apache.hadoop.hbase.regionserver.StoreScanner.seekAsDirection(StoreScanner.java:683) at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:533) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:140) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:3889) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3969) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3847) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3820) - locked 0x0005e5c55ad0 (a org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3807) at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4779) at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4753) at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:2916) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29583) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2027) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108) at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114) at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94) at java.lang.Thread.run(Thread.java:745) Locked ownable synchronizers: - 0x0005e5c55c08 (a 
java.util.concurrent.locks.ReentrantLock$NonfairSync) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
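The rule discussed in this thread (consult the block cache, and take the per-offset lock on a miss, only when caching can actually help, and treat an unknown block type conservatively) can be sketched roughly as below. This is a minimal illustration, not the attached patch: the class `BlockReadPolicy`, its `BlockType` enum, and both method names are hypothetical stand-ins, and the assumption that non-data blocks are always cached is made only for the example.

```java
/** Hypothetical sketch; not the HBase implementation or the HBASE-14178 patch. */
final class BlockReadPolicy {
    /** Simplified stand-in for the block types mentioned in the thread. */
    enum BlockType { DATA, INDEX, BLOOM }

    // Whether data blocks are cached on read (false models "block cache is off").
    private final boolean cacheDataOnRead;

    BlockReadPolicy(boolean cacheDataOnRead) {
        this.cacheDataOnRead = cacheDataOnRead;
    }

    /** Should the reader look in the block cache before reading from disk? */
    boolean shouldReadFromCache(BlockType type) {
        if (type == null) {
            // Unknown type: it may be an INDEX or BLOOM block, so check the cache first.
            return true;
        }
        // Assumption for this sketch: non-data blocks are always cached,
        // data blocks only when caching on read is enabled.
        return type != BlockType.DATA || cacheDataOnRead;
    }

    /** Should a cache miss take the per-offset lock before loading the block? */
    boolean shouldLockOnCacheMiss(BlockType type) {
        // Waiting on the lock only pays off if the waiter can then find the
        // block in the cache; otherwise every reader would serialize for nothing.
        return shouldReadFromCache(type);
    }

    public static void main(String[] args) {
        BlockReadPolicy cacheOff = new BlockReadPolicy(false);
        System.out.println(cacheOff.shouldLockOnCacheMiss(BlockType.DATA));  // false
        System.out.println(cacheOff.shouldLockOnCacheMiss(BlockType.INDEX)); // true
        System.out.println(cacheOff.shouldLockOnCacheMiss(null));            // true
    }
}
```

With data-block caching off, readers of the same data block skip the lock instead of queueing behind one another, which is the blocking pattern the jstack above shows.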
[jira] [Updated] (HBASE-14021) Quota table has a wrong description on the UI
[ https://issues.apache.org/jira/browse/HBASE-14021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14021: --- Hadoop Flags: Reviewed Fix Version/s: (was: 1.1.2) Quota table has a wrong description on the UI - Key: HBASE-14021 URL: https://issues.apache.org/jira/browse/HBASE-14021 Project: HBase Issue Type: Bug Components: UI Affects Versions: 1.1.0 Reporter: Ashish Singhi Assignee: Ashish Singhi Priority: Minor Fix For: 2.0.0, 1.3.0, 1.2.1 Attachments: HBASE-14021.patch, HBASE-14021.patch, error.png, fix.png !error.png! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-5878) Use getVisibleLength public api from HdfsDataInputStream from Hadoop-2.
[ https://issues.apache.org/jira/browse/HBASE-5878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654642#comment-14654642 ] Nick Dimiduk commented on HBASE-5878: - Buildbot flakiness aside, are reviewers good with this patch? Ping [~eclark], [~stack]. Use getVisibleLength public api from HdfsDataInputStream from Hadoop-2. --- Key: HBASE-5878 URL: https://issues.apache.org/jira/browse/HBASE-5878 Project: HBase Issue Type: Bug Components: wal Reporter: Uma Maheswara Rao G Assignee: Ashish Singhi Fix For: 2.0.0, 1.1.2, 1.3.0, 1.2.1, 1.0.3 Attachments: HBASE-5878-v2.patch, HBASE-5878-v3.patch, HBASE-5878-v4.patch, HBASE-5878-v5.patch, HBASE-5878-v5.patch, HBASE-5878.patch SequenceFileLogReader: Currently HBase calls the getFileLength API of the DFSInputStream class via reflection. DFSInputStream is not exposed as public, so this may change in the future. HDFS now exposes HdfsDataInputStream as a public API. We can use it as an else branch when we cannot find the getFileLength API on DFSInputStream, so that we avoid sudden surprises like the one we are facing today. Also, the code only logs one warning and proceeds if any exception is thrown while getting the length. I think we can re-throw the exception, because there is no point in continuing with data loss.
{code}
long adjust = 0;
try {
  Field fIn = FilterInputStream.class.getDeclaredField("in");
  fIn.setAccessible(true);
  Object realIn = fIn.get(this.in);
  // In hadoop 0.22, DFSInputStream is a standalone class. Before this,
  // it was an inner class of DFSClient.
  if (realIn.getClass().getName().endsWith("DFSInputStream")) {
    Method getFileLength = realIn.getClass().
      getDeclaredMethod("getFileLength", new Class<?> []{});
    getFileLength.setAccessible(true);
    long realLength = ((Long)getFileLength.
      invoke(realIn, new Object []{})).longValue();
    assert(realLength >= this.length);
    adjust = realLength - this.length;
  } else {
    LOG.info("Input stream class: " + realIn.getClass().getName() +
        ", not adjusting length");
  }
} catch(Exception e) {
  SequenceFileLogReader.LOG.warn(
    "Error while trying to get accurate file length. " +
    "Truncation / data loss may occur if RegionServers die.", e);
}
return adjust + super.getPos();
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
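The two changes proposed in the description (use a public length API when one is available, and re-throw instead of logging and continuing) might look roughly like the sketch below. It is self-contained and does not depend on Hadoop: the `VisibleLength` interface is a hypothetical stand-in for a public API such as `getVisibleLength` on HdfsDataInputStream, `LegacyStream` stands in for a stream that only has a non-public `getFileLength`, and the method name `adjustment` is invented for the example. Unlike the description, which suggests the public API as the fallback, this sketch tries it first; that ordering is an assumption.

```java
import java.io.IOException;
import java.lang.reflect.Method;

/** Hypothetical sketch; not the HBASE-5878 patch. */
final class LengthAdjustSketch {
    /** Stand-in for a stream type with a public "visible length" API. */
    interface VisibleLength {
        long getVisibleLength() throws IOException;
    }

    /** Stand-in for a stream that only offers a non-public getFileLength(). */
    static final class LegacyStream {
        private final long length;
        LegacyStream(long length) { this.length = length; }
        private long getFileLength() { return length; }
    }

    /** Returns how far the real length exceeds the length known so far. */
    static long adjustment(Object realIn, long knownLength) throws IOException {
        long realLength;
        try {
            if (realIn instanceof VisibleLength) {
                // Preferred path: public API, no reflection needed.
                realLength = ((VisibleLength) realIn).getVisibleLength();
            } else {
                // Fallback path: reflective call to a non-public method.
                Method m = realIn.getClass().getDeclaredMethod("getFileLength");
                m.setAccessible(true);
                realLength = ((Long) m.invoke(realIn)).longValue();
            }
        } catch (ReflectiveOperationException e) {
            // Re-throw rather than warn and continue with a possibly short length.
            throw new IOException("Cannot determine accurate file length", e);
        }
        if (realLength < knownLength) {
            throw new IOException("Real length " + realLength
                + " is smaller than known length " + knownLength);
        }
        return realLength - knownLength;
    }

    public static void main(String[] args) throws IOException {
        VisibleLength publicApi = () -> 120L;
        System.out.println(adjustment(publicApi, 100L));              // 20
        System.out.println(adjustment(new LegacyStream(150L), 100L)); // 50
    }
}
```

Failing with an `IOException` surfaces the problem to the caller, which matches the description's argument that continuing after a failed length lookup risks silent truncation.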
[jira] [Commented] (HBASE-13857) Slow WAL Append count in ServerMetricsTmpl.jamon is hardcoded to zero
[ https://issues.apache.org/jira/browse/HBASE-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654645#comment-14654645 ] Nick Dimiduk commented on HBASE-13857: -- Seems we don't provide the RS with direct access to the {{MetricsWAL}} instance. This will take some wiring. Slow WAL Append count in ServerMetricsTmpl.jamon is hardcoded to zero - Key: HBASE-13857 URL: https://issues.apache.org/jira/browse/HBASE-13857 Project: HBase Issue Type: Bug Components: regionserver, UI Affects Versions: 0.98.0 Reporter: Lars George Labels: beginner Fix For: 2.0.0, 0.98.14, 1.1.2, 1.3.0, 1.2.1, 1.0.3 The template has this:
{noformat}
<tr>
...
<th>Slow WAL Append Count</th>
</tr>
<tr>
<td><% 0 %></td>
</tr>
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14184) Fix indention and type-o in JavaHBaseContext
[ https://issues.apache.org/jira/browse/HBASE-14184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654659#comment-14654659 ] Hadoop QA commented on HBASE-14184: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12748668/HBASE-14184.3.patch against master branch at commit a0d72051dbace9dc4ec6ab288f2f6553e2ee7307. ATTACHMENT ID: 12748668 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14976//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14976//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14976//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14976//console This message is automatically generated. Fix indention and type-o in JavaHBaseContext Key: HBASE-14184 URL: https://issues.apache.org/jira/browse/HBASE-14184 Project: HBase Issue Type: Wish Components: spark Reporter: Ted Malaska Assignee: Ted Malaska Priority: Minor Attachments: HBASE-14184.3.patch Looks like there is a Ddd that should be Rdd. Also looks like everything is one space over too much -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13865) Default value of hbase.hregion.memstore.block.multiplier in HBase book is wrong
[ https://issues.apache.org/jira/browse/HBASE-13865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-13865: - Attachment: HBASE-13865.2.patch Reattaching patch. Default value of hbase.hregion.memstore.block.multiplier in HBase book is wrong --- Key: HBASE-13865 URL: https://issues.apache.org/jira/browse/HBASE-13865 Project: HBase Issue Type: Bug Components: documentation Affects Versions: 2.0.0 Reporter: Vladimir Rodionov Assignee: Gabor Liptak Priority: Trivial Attachments: HBASE-13865.1.patch, HBASE-13865.2.patch, HBASE-13865.2.patch It's 4 in the book and 2 in the current master. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13865) Default value of hbase.hregion.memstore.block.multiplier in HBase book is wrong
[ https://issues.apache.org/jira/browse/HBASE-13865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654672#comment-14654672 ] Nick Dimiduk commented on HBASE-13865: --
{noformat}
--- a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
+++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
@@ -744,7 +744,8 @@ public class HRegion implements HeapSize, PropagatingConfigurationObserver, Regi
     }
     this.memstoreFlushSize = flushSize;
     this.blockingMemStoreSize = this.memstoreFlushSize *
-        conf.getLong("hbase.hregion.memstore.block.multiplier", 2);
+        conf.getLong(HConstants.HREGION_MEMSTORE_BLOCK_MULTIPLIER,
+            HConstants.DEFAULT_HREGION_MEMSTORE_BLOCK_MULTIPLIER);
   }

   /**
{noformat}
It looks like HBASE-11209 didn't get everywhere, or this change to HRegion was intentionally omitted. Assuming that's a bug, +1. Default value of hbase.hregion.memstore.block.multiplier in HBase book is wrong --- Key: HBASE-13865 URL: https://issues.apache.org/jira/browse/HBASE-13865 Project: HBase Issue Type: Bug Components: documentation Affects Versions: 2.0.0 Reporter: Vladimir Rodionov Assignee: Gabor Liptak Priority: Trivial Attachments: HBASE-13865.1.patch, HBASE-13865.2.patch, HBASE-13865.2.patch It's 4 in the book and 2 in the current master. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
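To make the effect of the default concrete, the sketch below reproduces the arithmetic from the diff: the blocking memstore size is the flush size times the configured multiplier, falling back to a default when the key is unset. It uses a plain `Map` instead of the HBase `Configuration` class, and the helper name `blockingMemStoreSize`, the 128 MB flush size, and the default of 4 (the value the issue says the book documents) are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the multiplier arithmetic; not HRegion itself. */
final class BlockingSizeSketch {
    static final String MULTIPLIER_KEY = "hbase.hregion.memstore.block.multiplier";
    // Assumed default for this sketch: 4, the value the issue says the book documents.
    static final long DEFAULT_MULTIPLIER = 4L;

    /** Blocking size = flush size * multiplier (default used when the key is unset). */
    static long blockingMemStoreSize(Map<String, String> conf, long flushSize) {
        String raw = conf.get(MULTIPLIER_KEY);
        long multiplier = (raw == null) ? DEFAULT_MULTIPLIER : Long.parseLong(raw);
        return flushSize * multiplier;
    }

    public static void main(String[] args) {
        long flushSize = 128L * 1024 * 1024; // assumed 128 MB flush size

        Map<String, String> unset = new HashMap<>();
        System.out.println(blockingMemStoreSize(unset, flushSize));    // 536870912 (512 MB)

        Map<String, String> explicit = new HashMap<>();
        explicit.put(MULTIPLIER_KEY, "2");
        System.out.println(blockingMemStoreSize(explicit, flushSize)); // 268435456 (256 MB)
    }
}
```

A hard-coded fallback of 2 at one call site and 4 elsewhere would therefore halve the point at which updates are blocked, which is why the diff routes both through the shared constant.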
[jira] [Commented] (HBASE-14185) Incorrect region names logged by MemStoreFlusher
[ https://issues.apache.org/jira/browse/HBASE-14185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654683#comment-14654683 ] Hudson commented on HBASE-14185: SUCCESS: Integrated in HBase-1.2-IT #73 (See [https://builds.apache.org/job/HBase-1.2-IT/73/]) HBASE-14185 Incorrect region names logged by MemStoreFlusher (Biju Nair) (tedyu: rev 2906b44c5f49c5ccadf9f40e4342ae41dc463d48) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java Incorrect region names logged by MemStoreFlusher Key: HBASE-14185 URL: https://issues.apache.org/jira/browse/HBASE-14185 Project: HBase Issue Type: Bug Components: regionserver Reporter: Biju Nair Assignee: Biju Nair Priority: Minor Fix For: 2.0.0, 1.2.0, 1.1.2, 1.3.0 Attachments: HBASE-14185.patch In MemstoreFlusher the method [flushOneForGlobalPressure|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java#L142] logs incorrect region names which makes debugging issues a bit difficult. Instead of logging the secondary replica region names in [these|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java#L200] [locations|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java#L205], the code logs the primary replica region names. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13706) CoprocessorClassLoader should not exempt Hive classes
[ https://issues.apache.org/jira/browse/HBASE-13706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654712#comment-14654712 ] Jerry He commented on HBASE-13706: -- bq. If not going through a facade in org.apache.hadoop.hbase.* whatever objects the coprocessor instantiates and interacts with will not have access to static shared state like UGI, the metrics subsystem registry, the FileSystem instance cache, etc. Working with HDFS, metrics, and security APIs would be interesting. Good points. Maybe that is the right way? The current way is ambiguous and unintended? Coprocessors should share with the host env only via clearly defined interfaces. Yes, we can make the change on the master branch only. CoprocessorClassLoader should not exempt Hive classes - Key: HBASE-13706 URL: https://issues.apache.org/jira/browse/HBASE-13706 Project: HBase Issue Type: Bug Components: Coprocessors Affects Versions: 2.0.0, 1.0.1, 1.1.0, 0.98.12 Reporter: Jerry He Assignee: Jerry He Priority: Minor Fix For: 2.0.0, 0.98.14, 1.0.2, 1.1.3 Attachments: HBASE-13706.patch CoprocessorClassLoader is used to load classes from the coprocessor jar. Certain classes are exempt from being loaded by this ClassLoader, which means they will be ignored in the coprocessor jar and loaded from the parent classpath instead. One problem is that we categorically exempt org.apache.hadoop, but Hive packages also start with org.apache.hadoop. There is no reason to exclude Hive classes from the CoprocessorClassLoader. HBase does not even include Hive jars. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
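The prefix problem described above can be illustrated with a small sketch. It is not the CoprocessorClassLoader code and not the attached patch: the prefix list, the class `ExemptionSketch`, and both method names are hypothetical, and carving out `org.apache.hadoop.hive.` explicitly is just one possible way to narrow the exemption.

```java
/** Hypothetical sketch of prefix-based class exemption; not the HBase class loader. */
final class ExemptionSketch {
    // Assumed, shortened prefix list: classes matching these are loaded from
    // the parent classpath rather than from the coprocessor jar.
    static final String[] EXEMPT_PREFIXES = { "java.", "org.apache.hadoop." };

    /** Broad rule: any class under an exempt prefix is left to the parent loader. */
    static boolean exemptByPrefix(String className) {
        for (String prefix : EXEMPT_PREFIXES) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    /** Narrowed rule: Hive classes are not exempt even though they share the prefix. */
    static boolean exemptExcludingHive(String className) {
        if (className.startsWith("org.apache.hadoop.hive.")) {
            return false;
        }
        return exemptByPrefix(className);
    }

    public static void main(String[] args) {
        String hiveClass = "org.apache.hadoop.hive.ql.exec.UDF";
        String hbaseClass = "org.apache.hadoop.hbase.client.Put";
        System.out.println(exemptByPrefix(hiveClass));       // true: Hive class is skipped in the jar
        System.out.println(exemptExcludingHive(hiveClass));  // false: loaded from the coprocessor jar
        System.out.println(exemptExcludingHive(hbaseClass)); // true: still left to the parent loader
    }
}
```

Under the broad rule a Hive class bundled in the coprocessor jar is ignored, and because HBase does not ship Hive jars the parent loader cannot supply it either.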
[jira] [Commented] (HBASE-14178) regionserver blocks because of waiting for offsetLock
[ https://issues.apache.org/jira/browse/HBASE-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654771#comment-14654771 ] Ted Yu commented on HBASE-14178: Can we get a successful QA run? Thanks regionserver blocks because of waiting for offsetLock - Key: HBASE-14178 URL: https://issues.apache.org/jira/browse/HBASE-14178 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.98.6 Reporter: Heng Chen Priority: Critical Fix For: 0.98.6 Attachments: HBASE-14178-0.98.patch, HBASE-14178.patch, HBASE-14178_v1.patch, HBASE-14178_v2.patch, HBASE-14178_v3.patch, HBASE-14178_v4.patch, HBASE-14178_v5.patch, HBASE-14178_v6.patch, jstack My regionserver blocks, and all client RPCs time out. I printed the regionserver's jstack; it seems a lot of threads were blocked waiting for offsetLock. Detailed information is below. PS: my table's block cache is off {code} B.DefaultRpcServer.handler=2,queue=2,port=60020 #82 daemon prio=5 os_prio=0 tid=0x01827000 nid=0x2cdc in Object.wait() [0x7f3831b72000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) at java.lang.Object.wait(Object.java:502) at org.apache.hadoop.hbase.util.IdLock.getLockEntry(IdLock.java:79) - locked 0x000773af7c18 (a org.apache.hadoop.hbase.util.IdLock$Entry) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:352) at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:253) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:524) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:572) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:257) at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:173) at
org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:55) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:313) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:269) at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:695) at org.apache.hadoop.hbase.regionserver.StoreScanner.seekAsDirection(StoreScanner.java:683) at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:533) at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:140) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:3889) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3969) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3847) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3820) - locked 0x0005e5c55ad0 (a org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl) at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3807) at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4779) at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4753) at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:2916) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29583) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2027) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108) at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114) at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94) at java.lang.Thread.run(Thread.java:745) Locked ownable synchronizers: - 0x0005e5c55c08 (a java.util.concurrent.locks.ReentrantLock$NonfairSync) {code} -- This message was sent by Atlassian 
JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14178) regionserver blocks because of waiting for offsetLock
[ https://issues.apache.org/jira/browse/HBASE-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-14178: -- Attachment: HBASE-14178_v6.patch Retry. regionserver blocks because of waiting for offsetLock - Key: HBASE-14178 URL: https://issues.apache.org/jira/browse/HBASE-14178 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.98.6 Reporter: Heng Chen Priority: Critical Fix For: 0.98.6 Attachments: HBASE-14178-0.98.patch, HBASE-14178.patch, HBASE-14178_v1.patch, HBASE-14178_v2.patch, HBASE-14178_v3.patch, HBASE-14178_v4.patch, HBASE-14178_v5.patch, HBASE-14178_v6.patch, jstack
[jira] [Commented] (HBASE-14186) Read mvcc vlong optimization
[ https://issues.apache.org/jira/browse/HBASE-14186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654779#comment-14654779 ] Anoop Sam John commented on HBASE-14186: bq.Is it possible, that we could come in here and there'd only be a short amount to read so we'd skip the SIZEOF_INT parens? if so, the shift by 16 bits in the second paren would be not needed (might not be a problem if left shifting 0) Yes, shifting 0 is not a problem, apart from being an unneeded op. But I think it is ok. The mvcc will be either 0 or some larger value with more than 4 bytes. Read mvcc vlong optimization Key: HBASE-14186 URL: https://issues.apache.org/jira/browse/HBASE-14186 Project: HBase Issue Type: Sub-task Components: Scanners Reporter: Anoop Sam John Assignee: Anoop Sam John Fix For: 2.0.0 Attachments: HBASE-14186.patch
{code}
for (int idx = 0; idx < remaining; idx++) {
  byte b = blockBuffer.getByteAfterPosition(offsetFromPos + idx);
  i = i << 8;
  i = i | (b & 0xFF);
}
{code}
This does the read as in the BIG_ENDIAN case. After HBASE-12600, we tend to keep the mvcc, so the byte-by-byte read appears to eat up a lot of CPU time. (In my test HFileReaderImpl#_readMvccVersion comes out on top in terms of hot methods.) We can optimize here by reading 4 or 2 bytes in one shot when the length of the vlong is more than 4 bytes. We will in turn use UnsafeAccess methods, which handle endianness. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
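The idea in HBASE-14186 can be sketched with plain `java.nio.ByteBuffer` rather than the HBase buffer and UnsafeAccess classes. The sketch below is an assumption-laden illustration, not the attached patch: the class `VlongPayloadSketch` and both method names are hypothetical, and it only decodes the big-endian magnitude bytes of a vlong (sign and length-prefix handling are left out). It compares the byte-by-byte loop with a version that reads 4 bytes, then 2 bytes, then any leftover single bytes.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical sketch; uses java.nio instead of HBase's own buffer classes. */
final class VlongPayloadSketch {

    /** Byte-by-byte big-endian accumulation, mirroring the quoted loop. */
    static long readByteByByte(ByteBuffer buf, int offset, int remaining) {
        long i = 0;
        for (int idx = 0; idx < remaining; idx++) {
            byte b = buf.get(offset + idx);
            i = i << 8;
            i = i | (b & 0xFF);
        }
        return i;
    }

    /** Same value, but consuming an int, then a short, then single bytes. */
    static long readChunked(ByteBuffer buf, int offset, int remaining) {
        // A duplicate shares content but has its own byte order setting.
        ByteBuffer be = buf.duplicate().order(ByteOrder.BIG_ENDIAN);
        long i = 0;
        int pos = offset;
        if (remaining >= Integer.BYTES) {
            i = be.getInt(pos) & 0xFFFFFFFFL; // mask to avoid sign extension
            pos += Integer.BYTES;
            remaining -= Integer.BYTES;
        }
        if (remaining >= Short.BYTES) {
            // When the int branch was skipped this shifts 0, which is harmless.
            i = (i << 16) | (be.getShort(pos) & 0xFFFF);
            pos += Short.BYTES;
            remaining -= Short.BYTES;
        }
        for (int idx = 0; idx < remaining; idx++) {
            i = (i << 8) | (be.get(pos + idx) & 0xFF);
        }
        return i;
    }
}
```

For payload lengths from 0 to 8 bytes both methods should return the same value; the chunked version simply issues fewer buffer reads for the longer payloads that the comment says are the common case.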
[jira] [Updated] (HBASE-13867) Add endpoint coprocessor guide to HBase book
[ https://issues.apache.org/jira/browse/HBASE-13867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-13867: - Attachment: HBASE-13867.2.patch No rat violations when running patch locally. Reattaching for buildbot. Add endpoint coprocessor guide to HBase book Key: HBASE-13867 URL: https://issues.apache.org/jira/browse/HBASE-13867 Project: HBase Issue Type: Task Components: Coprocessors, documentation Reporter: Vladimir Rodionov Assignee: Gaurav Bhardwaj Fix For: 2.0.0, 1.0.2, 1.2.0, 1.1.2 Attachments: HBASE-13867.1.patch, HBASE-13867.2.patch, HBASE-13867.2.patch Endpoint coprocessors are very poorly documented. Coprocessor section of HBase book must be updated either with its own endpoint coprocessors HOW-TO guide or, at least, with the link(s) to some other guides. There is good description here: http://www.3pillarglobal.com/insights/hbase-coprocessors -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-5878) Use getVisibleLength public api from HdfsDataInputStream from Hadoop-2.
[ https://issues.apache.org/jira/browse/HBASE-5878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-5878: Attachment: HBASE-5878-v5.patch Reattaching patch for buildbot. Use getVisibleLength public api from HdfsDataInputStream from Hadoop-2. --- Key: HBASE-5878 URL: https://issues.apache.org/jira/browse/HBASE-5878 Project: HBase Issue Type: Bug Components: wal Reporter: Uma Maheswara Rao G Assignee: Ashish Singhi Fix For: 2.0.0, 1.1.2, 1.3.0, 1.2.1, 1.0.3 Attachments: HBASE-5878-v2.patch, HBASE-5878-v3.patch, HBASE-5878-v4.patch, HBASE-5878-v5.patch, HBASE-5878-v5.patch, HBASE-5878.patch SequenceFileLogReader: Currently HBase uses the getFileLength API from the DFSInputStream class by reflection. DFSInputStream is not exposed as public, so this may change in the future. HDFS now exposes HdfsDataInputStream as a public API. We can use it as an else branch when we are not able to find the getFileLength API on DFSInputStream, so that we do not get a sudden surprise like the one we are facing today. Also, the code just logs one warning and proceeds if any exception is thrown while getting the length. I think we can re-throw the exception, because there is no point in continuing with data loss.
{code}
long adjust = 0;
try {
  Field fIn = FilterInputStream.class.getDeclaredField("in");
  fIn.setAccessible(true);
  Object realIn = fIn.get(this.in);
  // In hadoop 0.22, DFSInputStream is a standalone class. Before this,
  // it was an inner class of DFSClient.
  if (realIn.getClass().getName().endsWith("DFSInputStream")) {
    Method getFileLength = realIn.getClass().
      getDeclaredMethod("getFileLength", new Class<?> []{});
    getFileLength.setAccessible(true);
    long realLength = ((Long)getFileLength.
      invoke(realIn, new Object []{})).longValue();
    assert(realLength >= this.length);
    adjust = realLength - this.length;
  } else {
    LOG.info("Input stream class: " + realIn.getClass().getName() +
        ", not adjusting length");
  }
} catch(Exception e) {
  SequenceFileLogReader.LOG.warn(
    "Error while trying to get accurate file length. " +
    "Truncation / data loss may occur if RegionServers die.", e);
}
return adjust + super.getPos();
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14178) regionserver blocks because of waiting for offsetLock
[ https://issues.apache.org/jira/browse/HBASE-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-14178: -- Attachment: (was: HBASE-14178_v6.patch) regionserver blocks because of waiting for offsetLock - Key: HBASE-14178 URL: https://issues.apache.org/jira/browse/HBASE-14178 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.98.6 Reporter: Heng Chen Priority: Critical Fix For: 0.98.6 Attachments: HBASE-14178-0.98.patch, HBASE-14178.patch, HBASE-14178_v1.patch, HBASE-14178_v2.patch, HBASE-14178_v3.patch, HBASE-14178_v4.patch, HBASE-14178_v5.patch, HBASE-14178_v6.patch, jstack
[jira] [Updated] (HBASE-13160) SplitLogWorker does not pick up the task immediately
[ https://issues.apache.org/jira/browse/HBASE-13160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-13160: - Fix Version/s: (was: 1.1.2) 1.1.3 SplitLogWorker does not pick up the task immediately Key: HBASE-13160 URL: https://issues.apache.org/jira/browse/HBASE-13160 Project: HBase Issue Type: Bug Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 2.0.0, 1.3.0, 1.2.1, 1.1.3 Attachments: hbase-13160_v1.patch We were reading some code with Jeffrey, and we realized that the SplitLogWorker's internal task loop is weird. It does an {{ls}} every second and sleeps; it has another mechanism to learn about new tasks, but does not make effective use of the zk notification. I have a simple patch which might improve this area. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-7105) RS throws NPE on forcing compaction from HBase shell on a single bulk imported file.
[ https://issues.apache.org/jira/browse/HBASE-7105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-7105: Fix Version/s: (was: 1.1.2) 1.1.3 RS throws NPE on forcing compaction from HBase shell on a single bulk imported file. Key: HBASE-7105 URL: https://issues.apache.org/jira/browse/HBASE-7105 Project: HBase Issue Type: Bug Components: regionserver Reporter: Karthik Ranganathan Assignee: Cosmin Lehene Fix For: 2.0.0, 1.3.0, 1.2.1, 1.0.3, 1.1.3 Attachments: 0001-HBASE-7105-RS-throws-NPE-on-forcing-compaction-from-.patch, 0001-HBASE-7105-RS-throws-NPE-on-forcing-compaction-from-.patch In StoreFile, we have: private AtomicBoolean majorCompaction = null; In StoreFile.open(), we do: b = metadataMap.get(MAJOR_COMPACTION_KEY); if (b != null) { // init majorCompaction variable } Because the file was bulk imported, this is never initialized. Any subsequent call to isMajorCompaction() NPE's. Fix is to initialize it to false. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
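The fix proposed in HBASE-7105 (initialize the flag to false instead of null) can be sketched as follows. This is a minimal stand-in, not the real StoreFile: the class `StoreFileSketch`, the string value of `MAJOR_COMPACTION_KEY`, and the one-byte boolean encoding of the metadata value are all assumptions made for the example.

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for StoreFile, reduced to the majorCompaction flag. */
final class StoreFileSketch {
    // Assumed key value, for illustration only.
    static final String MAJOR_COMPACTION_KEY = "MAJOR_COMPACTION_KEY";

    // Initialized to false rather than null, so readers never see a null field.
    private final AtomicBoolean majorCompaction = new AtomicBoolean(false);

    /** Mimics open(): only overrides the default when the metadata entry exists. */
    void open(Map<String, byte[]> metadataMap) {
        byte[] b = metadataMap.get(MAJOR_COMPACTION_KEY);
        if (b != null) {
            // Assumed encoding: a single non-zero byte means true.
            majorCompaction.set(b.length > 0 && b[0] != 0);
        }
    }

    /** Safe even for bulk-imported files that carry no metadata entry. */
    boolean isMajorCompaction() {
        return majorCompaction.get();
    }
}
```

A bulk-imported file corresponds to calling `open` with a map that lacks the key; with the non-null default, `isMajorCompaction()` then returns false instead of throwing a NullPointerException.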
[jira] [Updated] (HBASE-12912) StoreScanner calls Configuration for Boolean Check on each initialization
[ https://issues.apache.org/jira/browse/HBASE-12912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-12912: - Fix Version/s: (was: 1.1.2) 1.1.3 StoreScanner calls Configuration for Boolean Check on each initialization - Key: HBASE-12912 URL: https://issues.apache.org/jira/browse/HBASE-12912 Project: HBase Issue Type: Bug Reporter: John Leach Assignee: John Leach Fix For: 2.0.0, 0.98.14, 1.3.0, 1.2.1, 1.0.3, 1.1.3 Attachments: StoreScannerStall.tiff Original Estimate: 1h Remaining Estimate: 1h There is a clear CPU drain and iterator creation when creating store scanners under high load. Splice was running a TPCC test of our database, and we are seeing object creation and CPU waste on the boolean check. Code snippet:
if (store != null && ((HStore)store).getHRegion() != null
    && store.getStorefilesCount() > 1) {
  RegionServerServices rsService = ((HStore)store).getHRegion().getRegionServerServices();
  if (rsService == null || !rsService.getConfiguration().getBoolean(
      STORESCANNER_PARALLEL_SEEK_ENABLE, false)) return;
  isParallelSeekEnabled = true;
  executor = rsService.getExecutorService();
}
Will attach profile... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
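One possible remedy for the per-scanner lookup reported in HBASE-12912 is to read the boolean once in a longer-lived object and let each scanner copy the cached value. The sketch below is not the committed fix and does not use Hadoop's Configuration class: `CountingConf`, `Store`, `Scanner`, and the key string are hypothetical names introduced only to make the lookup count observable.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of caching a config flag outside the hot path. */
final class ConfigCachingSketch {
    // Assumed key name, for illustration only.
    static final String PARALLEL_SEEK_KEY = "example.storescanner.parallel.seek.enable";

    /** Minimal stand-in for a configuration object that counts lookups. */
    static final class CountingConf {
        private final Map<String, String> props = new HashMap<String, String>();
        final AtomicInteger lookups = new AtomicInteger();

        void set(String key, String value) {
            props.put(key, value);
        }

        boolean getBoolean(String key, boolean defaultValue) {
            lookups.incrementAndGet();
            String v = props.get(key);
            return v == null ? defaultValue : Boolean.parseBoolean(v);
        }
    }

    /** Long-lived object: reads the flag exactly once, at construction. */
    static final class Store {
        final boolean parallelSeekEnabled;

        Store(CountingConf conf) {
            this.parallelSeekEnabled = conf.getBoolean(PARALLEL_SEEK_KEY, false);
        }
    }

    /** Short-lived object: created per request, copies the cached flag. */
    static final class Scanner {
        final boolean isParallelSeekEnabled;

        Scanner(Store store) {
            this.isParallelSeekEnabled = store.parallelSeekEnabled;
        }
    }
}
```

The trade-off of this design is that a configuration change made after the store is constructed is not picked up until the store is recreated.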
[jira] [Commented] (HBASE-14182) My regionserver change ip. But hmaster still connect to old ip after the rs restart
[ https://issues.apache.org/jira/browse/HBASE-14182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654609#comment-14654609 ] Heng Chen commented on HBASE-14182: --- You mean I should just write an email to u...@hbase.apache.org and ask for help with this issue? My regionserver change ip. But hmaster still connect to old ip after the rs restart --- Key: HBASE-14182 URL: https://issues.apache.org/jira/browse/HBASE-14182 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.98.6 Reporter: Heng Chen I use Docker to deploy my HBase cluster, and the RS IP changed. After restarting this RS, the HMaster web UI shows it connected to HMaster, but its region count is still zero after a long time. I checked the HMaster log and found that the master still uses the old IP to connect to this RS. The HMaster log is below. PS: 10.11.21.140 is the old IP of RS dx-ape-regionserver1-online {code} 2015-08-04 17:24:00,081 INFO [AM.ZK.Worker-pool2-t14141] master.AssignmentManager: Assigning solar_image,\x01Y\x8E\xA3y,1434968237206.4a1bdeec85b9f55b962596f9fb2cd07f. to dx-ape-regionserver1-online,60020,1438679950072 2015-08-04 17:24:06,800 WARN [AM.ZK.Worker-pool2-t14133] master.AssignmentManager: Failed assignment of solar_image,\x00\x94\x09\x8D\x95,1430991781025.b0f5b755f443d41cf306026a60675020. 
to dx-ape-regionserver1-online,60020,1438679950072, trying to assign elsewhere instead; try=3 of 10 java.net.ConnectException: Connection timed out at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493) at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:578) at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:868) at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1543) at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1442) at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661) at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719) at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:20964) at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:671) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2097) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1577) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1550) at org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:104) at org.apache.hadoop.hbase.master.AssignmentManager.handleRegion(AssignmentManager.java:999) at org.apache.hadoop.hbase.master.AssignmentManager$6.run(AssignmentManager.java:1447) at org.apache.hadoop.hbase.master.AssignmentManager$3.run(AssignmentManager.java:1260) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) 2015-08-04 17:24:06,801 WARN [AM.ZK.Worker-pool2-t14140] master.AssignmentManager: Failed assignment of solar_image,\x00(.\xE7\xB1L,1430024620929.534025fcf4cae5516513b9c9a4cf73dc. to dx-ape-regionserver1-online,60020,1438679950072, trying to assign elsewhere instead; try=2 of 10 java.net.ConnectException: Call to dx-ape-regionserver1-online/10.11.21.140:60020 failed on connection exception: java.net.ConnectException: Connection timed out at org.apache.hadoop.hbase.ipc.RpcClient.wrapException(RpcClient.java:1483) at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1461) at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661) at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719) at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:20964) at
[jira] [Commented] (HBASE-14178) regionserver blocks because of waiting for offsetLock
[ https://issues.apache.org/jira/browse/HBASE-14178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654607#comment-14654607 ] Duo Zhang commented on HBASE-14178: --- [~tedyu] If {{blockType}} is {{null}} then we can not determine if we could cache the block until we actually read it from HDFS. So I think it is appropriate to return false here? We should always acquire the lock unless we can make sure the block will not be cached. Thanks. regionserver blocks because of waiting for offsetLock - Key: HBASE-14178 URL: https://issues.apache.org/jira/browse/HBASE-14178 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.98.6 Reporter: Heng Chen Priority: Critical Fix For: 0.98.6 Attachments: HBASE-14178-0.98.patch, HBASE-14178.patch, HBASE-14178_v1.patch, HBASE-14178_v2.patch, HBASE-14178_v3.patch, HBASE-14178_v4.patch, HBASE-14178_v5.patch, HBASE-14178_v6.patch, jstack
[jira] [Commented] (HBASE-13354) Add documentation and tests for external block cache.
[ https://issues.apache.org/jira/browse/HBASE-13354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654618#comment-14654618 ] Nick Dimiduk commented on HBASE-13354: -- Ping. Add documentation and tests for external block cache. - Key: HBASE-13354 URL: https://issues.apache.org/jira/browse/HBASE-13354 Project: HBase Issue Type: Bug Components: BlockCache, documentation, test Affects Versions: 2.0.0, 1.1.0 Reporter: Elliott Clark Assignee: Elliott Clark Fix For: 2.0.0, 1.3.0, 1.2.1, 1.1.3 The new memcached integration needs some documentation and some testing showing how it works and what can go wrong. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12952) Seek with prefixtree may hang
[ https://issues.apache.org/jira/browse/HBASE-12952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-12952: - Assignee: (was: sinfox) Seek with prefixtree may hang - Key: HBASE-12952 URL: https://issues.apache.org/jira/browse/HBASE-12952 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 1.0.0, 0.98.7, 0.98.8, 0.98.6.1, 0.98.9, 0.98.10 Reporter: sinfox Fix For: 2.0.0, 0.98.14, 1.3.0, 1.2.1, 1.0.3, 1.1.3 Attachments: hbase_0.98.6.1.patch I upgraded my HBase cluster from hbase-0.96 to hbase-0.98.6.1, then found some compactions hung on many regionservers, with CPU usage at 100%. It looks like there is an infinite loop somewhere. From the log, I found that StoreFileScanner.java : reseekAtOrAfter(HFileScanner s, KeyValue k) entered an infinite loop. Reading the source code, I found an error in PrefixTreeArrayReversibleScanner.java : previousRowInternal() eg: A:fan:12, numCell:1 A : 1 - B A : 2 - C C: 3 - D C: 4 - E A: fan:12, numCell:1 B: fan,numCell:1 C: fan:34,numCell: 0 D: fan,numCell:1 E: fan,numCell:1 When currentNode is D, its previous node is B, but this function will return A. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12952) Seek with prefixtree may hang
[ https://issues.apache.org/jira/browse/HBASE-12952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-12952: - Fix Version/s: (was: 1.1.2) 1.1.3 Seek with prefixtree may hang - Key: HBASE-12952 URL: https://issues.apache.org/jira/browse/HBASE-12952 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 1.0.0, 0.98.7, 0.98.8, 0.98.6.1, 0.98.9, 0.98.10 Reporter: sinfox Assignee: sinfox Fix For: 2.0.0, 0.98.14, 1.3.0, 1.2.1, 1.0.3, 1.1.3 Attachments: hbase_0.98.6.1.patch
[jira] [Commented] (HBASE-13318) RpcServer.Listener.getAddress should be synchronized
[ https://issues.apache.org/jira/browse/HBASE-13318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654663#comment-14654663 ] Nick Dimiduk commented on HBASE-13318: -- I guess we can protect the {{listener}} access in {{getListenerAddress}} with a null-check. Alternatively, we can grab the address as a string once {{listener}} has been assigned, and make that available for exception logging. I'm not a fan of increasing the surface area of {{RpcServerInterface}} with the new API though. RpcServer.Listener.getAddress should be synchronized Key: HBASE-13318 URL: https://issues.apache.org/jira/browse/HBASE-13318 Project: HBase Issue Type: Bug Affects Versions: 0.98.10.1 Reporter: Lars Hofhansl Priority: Minor Labels: thread-safety We just saw exceptions like these: {noformat} Exception in thread B.DefaultRpcServer.handler=45,queue=0,port=60020 java.lang.NullPointerException at org.apache.hadoop.hbase.ipc.RpcServer$Listener.getAddress(RpcServer.java:753) at org.apache.hadoop.hbase.ipc.RpcServer.getListenerAddress(RpcServer.java:2157) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:146) at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130) at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107) at java.lang.Thread.run(Thread.java:745) {noformat} Looks like RpcServer$Listener.getAddress should be synchronized (acceptChannel is set to null in a synchronized block when the thread exits). This should happen only very rarely. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
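The two ideas in this thread (a synchronized getAddress and a null-check in getListenerAddress) can be sketched together. This is only an illustration under assumptions: ListenerAddressSketch, its inner Listener, and the bound field are hypothetical stand-ins, not the actual RpcServer code, and the real getAddress reads the address from acceptChannel rather than a plain field.

```java
import java.net.InetSocketAddress;

// Hypothetical stand-in for RpcServer; all names here are illustrative only.
class ListenerAddressSketch {

    // Hypothetical stand-in for RpcServer.Listener.
    static final class Listener {
        // Plays the role of acceptChannel: set to null when the listener shuts down.
        private InetSocketAddress bound;

        Listener(InetSocketAddress bound) {
            this.bound = bound;
        }

        // synchronized, so it cannot interleave with close() below.
        synchronized InetSocketAddress getAddress() {
            return bound;
        }

        // Shutdown path: clears the field under the same monitor.
        synchronized void close() {
            bound = null;
        }
    }

    // volatile so handler threads see the assignment made by start().
    private volatile Listener listener;

    void start(int port) {
        listener = new Listener(InetSocketAddress.createUnresolved("localhost", port));
    }

    void stop() {
        Listener l = listener;
        if (l != null) {
            l.close();
        }
    }

    // Null-checked accessor: returns null instead of throwing NullPointerException
    // when the listener has not been assigned yet or has already been closed.
    InetSocketAddress getListenerAddress() {
        Listener l = listener;
        return l == null ? null : l.getAddress();
    }
}
```

Callers that only need the address for logging would then have to tolerate a null result, which is the trade-off of the null-check approach.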
[jira] [Commented] (HBASE-14021) Quota table has a wrong description on the UI
[ https://issues.apache.org/jira/browse/HBASE-14021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654692#comment-14654692 ] Ted Yu commented on HBASE-14021: Checkstyle warning was from generated file: ./dev-support/checkstyle_report.py trunkCheckstyle.xml patchCheckstyle.xml hbase-server/target/generated-jamon/org/apache/hadoop/hbase/tmpl/master/MasterStatusTmpl.java 28 29 Quota table has a wrong description on the UI - Key: HBASE-14021 URL: https://issues.apache.org/jira/browse/HBASE-14021 Project: HBase Issue Type: Bug Components: UI Affects Versions: 1.1.0 Reporter: Ashish Singhi Assignee: Ashish Singhi Priority: Minor Fix For: 2.0.0, 1.3.0, 1.2.1 Attachments: HBASE-14021.patch, HBASE-14021.patch, error.png, fix.png !error.png! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14021) Quota table has a wrong description on the UI
[ https://issues.apache.org/jira/browse/HBASE-14021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14021: --- Resolution: Fixed Status: Resolved (was: Patch Available) Thanks for the patch, Ashish. Quota table has a wrong description on the UI - Key: HBASE-14021 URL: https://issues.apache.org/jira/browse/HBASE-14021 Project: HBase Issue Type: Bug Components: UI Affects Versions: 1.1.0 Reporter: Ashish Singhi Assignee: Ashish Singhi Priority: Minor Fix For: 2.0.0, 1.3.0, 1.2.1 Attachments: HBASE-14021.patch, HBASE-14021.patch, error.png, fix.png !error.png! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14185) Incorrect region names logged by MemStoreFlusher
[ https://issues.apache.org/jira/browse/HBASE-14185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654743#comment-14654743 ] Hudson commented on HBASE-14185: FAILURE: Integrated in HBase-1.3 #87 (See [https://builds.apache.org/job/HBase-1.3/87/]) HBASE-14185 Incorrect region names logged by MemStoreFlusher (Biju Nair) (tedyu: rev 03c6c532bf5d22576efdabf7e670ea1366d6a461) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java Incorrect region names logged by MemStoreFlusher Key: HBASE-14185 URL: https://issues.apache.org/jira/browse/HBASE-14185 Project: HBase Issue Type: Bug Components: regionserver Reporter: Biju Nair Assignee: Biju Nair Priority: Minor Fix For: 2.0.0, 1.2.0, 1.1.2, 1.3.0 Attachments: HBASE-14185.patch In MemstoreFlusher the method [flushOneForGlobalPressure|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java#L142] logs incorrect region names which makes debugging issues a bit difficult. Instead of logging the secondary replica region names in [these|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java#L200] [locations|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java#L205], the code logs the primary replica region names. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13825) Get operations on large objects fail with protocol errors
[ https://issues.apache.org/jira/browse/HBASE-13825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654372#comment-14654372 ] Esteban Gutierrez commented on HBASE-13825: --- +1 [~apurtell] also I think you addressed some of the comments from [~anoopsamjohn] from HBASE-14076. I'm going to open a JIRA to port the changes to master as well. Thanks! Get operations on large objects fail with protocol errors - Key: HBASE-13825 URL: https://issues.apache.org/jira/browse/HBASE-13825 Project: HBase Issue Type: Bug Affects Versions: 1.0.0, 1.0.1 Reporter: Dev Lakhani Assignee: Andrew Purtell Fix For: 2.0.0, 0.98.14, 1.0.2, 1.2.0, 1.1.2, 1.3.0 Attachments: HBASE-13825-0.98.patch, HBASE-13825-0.98.patch, HBASE-13825-branch-1.patch, HBASE-13825-branch-1.patch, HBASE-13825.patch When performing a get operation on a column family with more than 64MB of data, the operation fails with: Caused by: Portable(java.io.IOException): Call to host:port failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. 
at org.apache.hadoop.hbase.ipc.RpcClient.wrapException(RpcClient.java:1481) at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1453) at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1653) at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1711) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.get(ClientProtos.java:27308) at org.apache.hadoop.hbase.protobuf.ProtobufUtil.get(ProtobufUtil.java:1381) at org.apache.hadoop.hbase.client.HTable$3.call(HTable.java:753) at org.apache.hadoop.hbase.client.HTable$3.call(HTable.java:751) at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:120) at org.apache.hadoop.hbase.client.HTable.get(HTable.java:756) at org.apache.hadoop.hbase.client.HTable.get(HTable.java:765) at org.apache.hadoop.hbase.client.HTablePool$PooledHTable.get(HTablePool.java:395) This may be related to https://issues.apache.org/jira/browse/HBASE-11747 but that issue is related to cluster status. Scan and put operations on the same data work fine Tested on a 1.0.0 cluster with both 1.0.1 and 1.0.0 clients. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
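The failure mode in this report, a parser refusing input above a fixed size limit, can be illustrated without protobuf. The sketch below is an assumption-laden analogue: SizeLimitSketch and readAll are hypothetical names, and this is not how CodedInputStream is implemented. It only shows why the same payload fails under a small limit and succeeds once the limit is raised.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

// Hypothetical size-limited reader; an analogue of a parser size limit,
// not protobuf's CodedInputStream.
class SizeLimitSketch {

    // Reads the whole stream, failing as soon as the running total would exceed sizeLimit.
    static byte[] readAll(ByteArrayInputStream in, int sizeLimit) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4];
        int n;
        while ((n = in.read(chunk, 0, chunk.length)) != -1) {
            if (out.size() + n > sizeLimit) {
                throw new IllegalStateException(
                    "message was too large (limit " + sizeLimit + " bytes)");
            }
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }
}
```

With a 10-byte payload, a limit of 8 throws, while a limit of Integer.MAX_VALUE reads all 10 bytes.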
[jira] [Commented] (HBASE-13271) Table#puts(List<Put>) operation is indeterminate; needs fixing
[ https://issues.apache.org/jira/browse/HBASE-13271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654523#comment-14654523 ] Nick Dimiduk commented on HBASE-13271: -- Any movement here? Bumping out of 1.1.2 but bring it on back if you can -- I'm waiting on a resolution to HBASE-14085. Table#puts(List<Put>) operation is indeterminate; needs fixing -- Key: HBASE-13271 URL: https://issues.apache.org/jira/browse/HBASE-13271 Project: HBase Issue Type: Improvement Components: API Affects Versions: 1.0.0 Reporter: stack Priority: Critical Fix For: 2.0.0, 1.1.2, 1.3.0, 1.2.1, 1.0.3 Another API issue found by [~larsgeorge]: Table.put(List<Put>) is questionable after the API change. {code} [Mar-17 9:21 AM] Lars George: Table.put(List<Put>) is weird since you cannot flush partial lists [Mar-17 9:21 AM] Lars George: Say out of 5 the third is broken, then the put() call returns with a local exception (say empty Put) and then you have 2 that are in the buffer [Mar-17 9:21 AM] Lars George: but how do you force commit them? [Mar-17 9:22 AM] Lars George: In the past you would call flushCache(), but that is gone now [Mar-17 9:22 AM] Lars George: and flush() is not available on a Table [Mar-17 9:22 AM] Lars George: And you cannot access the underlying BufferedMutation neither [Mar-17 9:23 AM] Lars George: You can *only* add more Puts if you can, or call close() [Mar-17 9:23 AM] Lars George: that is just weird to explain {code} So, Table needs to get flush back, or we deprecate this method, or it flushes immediately and does not return until complete in the implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
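The scenario Lars describes (five puts, the third one invalid, two left stranded in the buffer) can be modelled in a few lines. This is a toy model under assumptions: BufferedPutSketch is hypothetical, strings stand in for Put objects, and it is not the HBase Table or BufferedMutator API. It only shows the state after the local exception and what an explicit flush would buy.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a client-side write buffer; not the HBase Table API.
class BufferedPutSketch {
    private final List<String> buffer = new ArrayList<>();    // validated but not yet sent
    private final List<String> committed = new ArrayList<>(); // stands in for the server side

    // Mirrors the reported behavior: validation fails partway through the list,
    // and the puts accepted before the failure stay in the buffer.
    void put(List<String> puts) {
        for (String p : puts) {
            if (p.isEmpty()) {
                throw new IllegalArgumentException("empty Put");
            }
            buffer.add(p);
        }
    }

    // One of the options raised in the issue: give callers an explicit flush.
    void flush() {
        committed.addAll(buffer);
        buffer.clear();
    }

    int buffered() {
        return buffer.size();
    }

    int committedCount() {
        return committed.size();
    }
}
```

In this model, calling put with the list "a", "b", "", "d", "e" throws on the third element and leaves two entries buffered and none committed; without flush() the caller has no way to push those two out short of adding more puts or closing.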
[jira] [Updated] (HBASE-13221) HDFS Transparent Encryption breaks WAL writing
[ https://issues.apache.org/jira/browse/HBASE-13221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-13221: - Fix Version/s: (was: 1.1.2) 1.1.3 HDFS Transparent Encryption breaks WAL writing -- Key: HBASE-13221 URL: https://issues.apache.org/jira/browse/HBASE-13221 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.98.0, 1.0.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Critical Fix For: 2.0.0, 0.98.14, 1.0.3, 1.1.3 We need to detect when HDFS Transparent Encryption (Hadoop 2.6.0+) is enabled and fall back to more synchronization in the WAL to prevent catastrophic failure under load. See HADOOP-11708 for more details. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13267) Deprecate or remove isFileDeletable from SnapshotHFileCleaner
[ https://issues.apache.org/jira/browse/HBASE-13267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-13267: - Fix Version/s: (was: 1.1.2) 1.1.3 Deferring from 1.1.2. Deprecate or remove isFileDeletable from SnapshotHFileCleaner - Key: HBASE-13267 URL: https://issues.apache.org/jira/browse/HBASE-13267 Project: HBase Issue Type: Task Reporter: Andrew Purtell Priority: Minor Fix For: 2.0.0, 0.98.14, 1.3.0, 1.2.1, 1.0.3, 1.1.3 The isFileDeletable method in SnapshotHFileCleaner became vestigial after HBASE-12627; let's remove it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14181) Add Spark DataFrame DataSource to HBase-Spark Module
[ https://issues.apache.org/jira/browse/HBASE-14181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-14181: Component/s: spark Add Spark DataFrame DataSource to HBase-Spark Module Key: HBASE-14181 URL: https://issues.apache.org/jira/browse/HBASE-14181 Project: HBase Issue Type: New Feature Components: spark Reporter: Ted Malaska Assignee: Ted Malaska Priority: Minor Build a RelationProvider for HBase-Spark Module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14167) hbase-spark integration tests do not respect -DskipITs
[ https://issues.apache.org/jira/browse/HBASE-14167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-14167: Component/s: spark hbase-spark integration tests do not respect -DskipITs -- Key: HBASE-14167 URL: https://issues.apache.org/jira/browse/HBASE-14167 Project: HBase Issue Type: Bug Components: spark Affects Versions: 2.0.0 Reporter: Andrew Purtell Priority: Minor When running a build with {{mvn ... -DskipITs}}, the hbase-spark module's integration tests do not respect the flag and run anyway. Fix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14146) Once replication sees an error it slows down forever
[ https://issues.apache.org/jira/browse/HBASE-14146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654575#comment-14654575 ] Hudson commented on HBASE-14146: FAILURE: Integrated in HBase-0.98 #1066 (See [https://builds.apache.org/job/HBase-0.98/1066/]) HBASE-14146 Fix Once replication sees an error it slows down forever (apurtell: rev a53b8e0d4c4c6f60ea3ad60ef34be96cc0810368) * hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java Once replication sees an error it slows down forever Key: HBASE-14146 URL: https://issues.apache.org/jira/browse/HBASE-14146 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 1.2.0 Reporter: Elliott Clark Assignee: Elliott Clark Fix For: 2.0.0, 0.98.14, 1.0.2, 1.2.0, 1.1.2, 1.3.0 Attachments: HBASE-14146.patch sleepMultiplier inside of HBaseInterClusterReplicationEndpoint and ReplicationSource never gets reset to zero. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
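The bug described here, a backoff multiplier that only ever grows, is easy to see in a small model. The sketch below is hypothetical: BackoffSketch is not ReplicationSource, and the starting value of 1 and the cap are assumptions chosen for illustration (the issue text speaks of resetting to zero). The point is only that a success path must restore the multiplier, or one error slows every later attempt.

```java
// Hypothetical retry helper; the field name echoes the issue, but this is not ReplicationSource.
class BackoffSketch {
    private final long baseSleepMs;
    private final int maxMultiplier;
    private int sleepMultiplier = 1; // assumed starting value

    BackoffSketch(long baseSleepMs, int maxMultiplier) {
        this.baseSleepMs = baseSleepMs;
        this.maxMultiplier = maxMultiplier;
    }

    // Called after a failed attempt: returns how long to sleep now,
    // then raises the multiplier (up to the cap) for the next failure.
    long onFailure() {
        long sleep = baseSleepMs * sleepMultiplier;
        if (sleepMultiplier < maxMultiplier) {
            sleepMultiplier++;
        }
        return sleep;
    }

    // Called after a successful attempt. Without this reset the delay
    // would stay elevated forever after the first error.
    void onSuccess() {
        sleepMultiplier = 1;
    }

    // The delay the next failure would incur.
    long nextSleepMs() {
        return baseSleepMs * sleepMultiplier;
    }
}
```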
[jira] [Commented] (HBASE-14085) Correct LICENSE and NOTICE files in artifacts
[ https://issues.apache.org/jira/browse/HBASE-14085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654414#comment-14654414 ] Andrew Purtell commented on HBASE-14085: bq. For me personally, the way I read my responsibility to the foundation as a PMC member means I'll be voting -1 until legal rules on the license or we replace it. The way I read LEGAL-222 is it is not as clear cut as I think is implied here. The question is on a small number of files bundled in the JRuby jar, which is only included in binary convenience release artifacts. We have larger issues than just the slate of release candidates on deck if there is a problem, every available release in the archive and on the mirrors is affected. That said I buy the argument that we can't release something that has a question mark. I believe it is immediately possible to resume releases, as long as we stick to source only releases until the matter is cleared up. It would also be possible to resume releases including binary artifacts as long as we exclude the binary artifacts for the hbase-shell module. 
Correct LICENSE and NOTICE files in artifacts - Key: HBASE-14085 URL: https://issues.apache.org/jira/browse/HBASE-14085 Project: HBase Issue Type: Task Components: build Affects Versions: 2.0.0, 0.94.28, 0.98.14, 1.0.2, 1.2.0, 1.1.2, 1.3.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Blocker Fix For: 2.0.0, 0.94.28, 0.98.14, 1.0.2, 1.2.0, 1.1.2 Attachments: HBASE-14085.1.patch, HBASE-14085.2.patch, HBASE-14085.3.patch +Problems: * checked LICENSE/NOTICE on binary ** binary artifact LICENSE file has not been updated to include the additional license terms for contained third party dependencies ** binary artifact NOTICE file does not include a copyright line ** binary artifact NOTICE file does not appear to propagate appropriate info from the NOTICE files from bundled dependencies * checked NOTICE on source ** source artifact NOTICE file does not include a copyright line ** source NOTICE file includes notices for third party dependencies not included in the artifact * checked NOTICE files shipped in maven jars ** copyright line only says 2015 when it's very likely the contents are under copyright prior to this year * nit: NOTICE file on jars in maven say HBase - ${module} rather than Apache HBase - ${module} as required refs: http://www.apache.org/dev/licensing-howto.html#bundled-vs-non-bundled http://www.apache.org/dev/licensing-howto.html#binary http://www.apache.org/dev/licensing-howto.html#simple -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13825) Use ProtobufUtil#mergeFrom and ProtobufUtil#mergeDelimitedFrom in place of builder methods of same name
[ https://issues.apache.org/jira/browse/HBASE-13825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654455#comment-14654455 ] Hadoop QA commented on HBASE-13825: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12748724/HBASE-13825-branch-1.patch against branch-1 branch at commit 931e77d4507e1650c452cefadda450e0bf3f0528. ATTACHMENT ID: 12748724 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:red}-1 checkstyle{color}. The applied patch generated 3826 checkstyle errors (more than the master's current 3825 errors). {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14973//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14973//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14973//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14973//console This message is automatically generated. Use ProtobufUtil#mergeFrom and ProtobufUtil#mergeDelimitedFrom in place of builder methods of same name --- Key: HBASE-13825 URL: https://issues.apache.org/jira/browse/HBASE-13825 Project: HBase Issue Type: Bug Affects Versions: 1.0.0, 1.0.1 Reporter: Dev Lakhani Assignee: Andrew Purtell Fix For: 2.0.0, 0.98.14, 1.0.2, 1.2.0, 1.1.2, 1.3.0 Attachments: HBASE-13825-0.98.patch, HBASE-13825-0.98.patch, HBASE-13825-branch-1.patch, HBASE-13825-branch-1.patch, HBASE-13825.patch When performing a get operation on a column family with more than 64MB of data, the operation fails with: Caused by: Portable(java.io.IOException): Call to host:port failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. 
at org.apache.hadoop.hbase.ipc.RpcClient.wrapException(RpcClient.java:1481) at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1453) at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1653) at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1711) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.get(ClientProtos.java:27308) at org.apache.hadoop.hbase.protobuf.ProtobufUtil.get(ProtobufUtil.java:1381) at org.apache.hadoop.hbase.client.HTable$3.call(HTable.java:753) at org.apache.hadoop.hbase.client.HTable$3.call(HTable.java:751) at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:120) at org.apache.hadoop.hbase.client.HTable.get(HTable.java:756) at org.apache.hadoop.hbase.client.HTable.get(HTable.java:765) at org.apache.hadoop.hbase.client.HTablePool$PooledHTable.get(HTablePool.java:395) This may be related to https://issues.apache.org/jira/browse/HBASE-11747 but that issue is related to cluster status. Scan and put operations on the same data work fine Tested on a 1.0.0 cluster with both 1.0.1 and 1.0.0 clients. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14085) Correct LICENSE and NOTICE files in artifacts
[ https://issues.apache.org/jira/browse/HBASE-14085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654526#comment-14654526 ] Andrew Purtell commented on HBASE-14085: I pulled down the master patch. Let me make sure it tests out ok locally. If so I'll start porting it back, starting with branch-1. Correct LICENSE and NOTICE files in artifacts - Key: HBASE-14085 URL: https://issues.apache.org/jira/browse/HBASE-14085 Project: HBase Issue Type: Task Components: build Affects Versions: 2.0.0, 0.94.28, 0.98.14, 1.0.2, 1.2.0, 1.1.2, 1.3.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Blocker Fix For: 2.0.0, 0.94.28, 0.98.14, 1.0.2, 1.2.0, 1.1.2 Attachments: HBASE-14085.1.patch, HBASE-14085.2.patch, HBASE-14085.3.patch +Problems: * checked LICENSE/NOTICE on binary ** binary artifact LICENSE file has not been updated to include the additional license terms for contained third party dependencies ** binary artifact NOTICE file does not include a copyright line ** binary artifact NOTICE file does not appear to propagate appropriate info from the NOTICE files from bundled dependencies * checked NOTICE on source ** source artifact NOTICE file does not include a copyright line ** source NOTICE file includes notices for third party dependencies not included in the artifact * checked NOTICE files shipped in maven jars ** copyright line only says 2015 when it's very likely the contents are under copyright prior to this year * nit: NOTICE file on jars in maven say HBase - ${module} rather than Apache HBase - ${module} as required refs: http://www.apache.org/dev/licensing-howto.html#bundled-vs-non-bundled http://www.apache.org/dev/licensing-howto.html#binary http://www.apache.org/dev/licensing-howto.html#simple -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12816) GC logs are lost upon Region Server restart if GCLogFileRotation is enabled
[ https://issues.apache.org/jira/browse/HBASE-12816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-12816: - Fix Version/s: (was: 1.1.2) 1.1.3 GC logs are lost upon Region Server restart if GCLogFileRotation is enabled --- Key: HBASE-12816 URL: https://issues.apache.org/jira/browse/HBASE-12816 Project: HBase Issue Type: Bug Components: scripts Reporter: Abhishek Singh Chouhan Assignee: Abhishek Singh Chouhan Priority: Minor Fix For: 2.0.0, 0.98.14, 1.2.1, 1.1.3 Attachments: HBASE-12816.patch When -XX:+UseGCLogFileRotation is used, GC log files end with .gc.0 instead of .gc. hbase_rotate_log () in hbase-daemon.sh does not handle this correctly, so when an RS is restarted the old GC logs are lost (overwritten). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13706) CoprocessorClassLoader should not exempt Hive classes
[ https://issues.apache.org/jira/browse/HBASE-13706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-13706: - Fix Version/s: (was: 1.1.2) 1.1.3 Any update here? CoprocessorClassLoader should not exempt Hive classes - Key: HBASE-13706 URL: https://issues.apache.org/jira/browse/HBASE-13706 Project: HBase Issue Type: Bug Components: Coprocessors Affects Versions: 2.0.0, 1.0.1, 1.1.0, 0.98.12 Reporter: Jerry He Assignee: Jerry He Priority: Minor Fix For: 2.0.0, 0.98.14, 1.0.2, 1.1.3 Attachments: HBASE-13706.patch CoprocessorClassLoader is used to load classes from the coprocessor jar. Certain classes are exempt from being loaded by this ClassLoader, which means they will be ignored in the coprocessor jar and loaded from the parent classpath instead. One problem is that we categorically exempt org.apache.hadoop. But it happens that Hive packages start with org.apache.hadoop. There is no reason to exclude Hive classes from the CoprocessorClassLoader. HBase does not even include Hive jars. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
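The prefix-based exemption described in this issue, and one way to carve Hive back out of it, might look roughly like the following. Treat it as a sketch under assumptions: ExemptionSketch, its two prefix arrays, and the specific prefixes listed are illustrative and are not CoprocessorClassLoader's actual tables or the contents of the attached patch.

```java
// Hypothetical exemption check; the prefix lists are illustrative only.
class ExemptionSketch {
    // Prefixes that are delegated to the parent class loader.
    private static final String[] EXEMPT = { "java.", "org.apache.hadoop." };

    // Carve-outs: prefixes that match EXEMPT but should still be loaded
    // from the coprocessor jar (assumed here to be the Hive packages).
    private static final String[] NOT_EXEMPT = { "org.apache.hadoop.hive." };

    // Returns true if the class should be loaded by the parent loader
    // rather than from the coprocessor jar.
    static boolean isExempt(String className) {
        for (String prefix : NOT_EXEMPT) {
            if (className.startsWith(prefix)) {
                return false;
            }
        }
        for (String prefix : EXEMPT) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }
}
```

Checking the carve-out list first keeps the broad org.apache.hadoop. exemption intact for everything else, which sidesteps the compatibility concerns raised about narrowing the exemption to org.apache.hadoop.hbase.* alone.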
[jira] [Commented] (HBASE-14085) Correct LICENSE and NOTICE files in artifacts
[ https://issues.apache.org/jira/browse/HBASE-14085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654422#comment-14654422 ] Sean Busbey commented on HBASE-14085: - I agree with that analysis. My prior statement presumed we continued our pattern of doing source + binary artifacts for votes. Correct LICENSE and NOTICE files in artifacts - Key: HBASE-14085 URL: https://issues.apache.org/jira/browse/HBASE-14085 Project: HBase Issue Type: Task Components: build Affects Versions: 2.0.0, 0.94.28, 0.98.14, 1.0.2, 1.2.0, 1.1.2, 1.3.0 Reporter: Sean Busbey Assignee: Sean Busbey Priority: Blocker Fix For: 2.0.0, 0.94.28, 0.98.14, 1.0.2, 1.2.0, 1.1.2 Attachments: HBASE-14085.1.patch, HBASE-14085.2.patch, HBASE-14085.3.patch +Problems: * checked LICENSE/NOTICE on binary ** binary artifact LICENSE file has not been updated to include the additional license terms for contained third party dependencies ** binary artifact NOTICE file does not include a copyright line ** binary artifact NOTICE file does not appear to propagate appropriate info from the NOTICE files from bundled dependencies * checked NOTICE on source ** source artifact NOTICE file does not include a copyright line ** source NOTICE file includes notices for third party dependencies not included in the artifact * checked NOTICE files shipped in maven jars ** copyright line only says 2015 when it's very likely the contents are under copyright prior to this year * nit: NOTICE file on jars in maven say HBase - ${module} rather than Apache HBase - ${module} as required refs: http://www.apache.org/dev/licensing-howto.html#bundled-vs-non-bundled http://www.apache.org/dev/licensing-howto.html#binary http://www.apache.org/dev/licensing-howto.html#simple -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HBASE-13825) Use ProtobufUtil#mergeFrom and ProtobufUtil#mergeDelimitedFrom in place of builder methods of same name
[ https://issues.apache.org/jira/browse/HBASE-13825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654430#comment-14654430 ] Andrew Purtell edited comment on HBASE-13825 at 8/4/15 10:03 PM: - Thanks [~esteban]. Or, if you like, I can add what you've identified as missing to the patch for master here. If so, what would that be was (Author: apurtell): Thanks [~esteban]. Or, if you like, I can add what you've identified as missing to the patch for master here. Use ProtobufUtil#mergeFrom and ProtobufUtil#mergeDelimitedFrom in place of builder methods of same name --- Key: HBASE-13825 URL: https://issues.apache.org/jira/browse/HBASE-13825 Project: HBase Issue Type: Bug Affects Versions: 1.0.0, 1.0.1 Reporter: Dev Lakhani Assignee: Andrew Purtell Fix For: 2.0.0, 0.98.14, 1.0.2, 1.2.0, 1.1.2, 1.3.0 Attachments: HBASE-13825-0.98.patch, HBASE-13825-0.98.patch, HBASE-13825-branch-1.patch, HBASE-13825-branch-1.patch, HBASE-13825.patch When performing a get operation on a column family with more than 64MB of data, the operation fails with: Caused by: Portable(java.io.IOException): Call to host:port failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. 
at org.apache.hadoop.hbase.ipc.RpcClient.wrapException(RpcClient.java:1481) at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1453) at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1653) at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1711) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.get(ClientProtos.java:27308) at org.apache.hadoop.hbase.protobuf.ProtobufUtil.get(ProtobufUtil.java:1381) at org.apache.hadoop.hbase.client.HTable$3.call(HTable.java:753) at org.apache.hadoop.hbase.client.HTable$3.call(HTable.java:751) at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:120) at org.apache.hadoop.hbase.client.HTable.get(HTable.java:756) at org.apache.hadoop.hbase.client.HTable.get(HTable.java:765) at org.apache.hadoop.hbase.client.HTablePool$PooledHTable.get(HTablePool.java:395) This may be related to https://issues.apache.org/jira/browse/HBASE-11747 but that issue is related to cluster status. Scan and put operations on the same data work fine Tested on a 1.0.0 cluster with both 1.0.1 and 1.0.0 clients. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13825) Use ProtobufUtil#mergeFrom and ProtobufUtil#mergeDelimitedFrom in place of builder methods of same name
[ https://issues.apache.org/jira/browse/HBASE-13825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13825: --- Summary: Use ProtobufUtil#mergeFrom and ProtobufUtil#mergeDelimitedFrom in place of builder methods of same name (was: Get operations on large objects fail with protocol errors) Use ProtobufUtil#mergeFrom and ProtobufUtil#mergeDelimitedFrom in place of builder methods of same name --- Key: HBASE-13825 URL: https://issues.apache.org/jira/browse/HBASE-13825 Project: HBase Issue Type: Bug Affects Versions: 1.0.0, 1.0.1 Reporter: Dev Lakhani Assignee: Andrew Purtell Fix For: 2.0.0, 0.98.14, 1.0.2, 1.2.0, 1.1.2, 1.3.0 Attachments: HBASE-13825-0.98.patch, HBASE-13825-0.98.patch, HBASE-13825-branch-1.patch, HBASE-13825-branch-1.patch, HBASE-13825.patch When performing a get operation on a column family with more than 64MB of data, the operation fails with: Caused by: Portable(java.io.IOException): Call to host:port failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit. 
at org.apache.hadoop.hbase.ipc.RpcClient.wrapException(RpcClient.java:1481) at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1453) at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1653) at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1711) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.get(ClientProtos.java:27308) at org.apache.hadoop.hbase.protobuf.ProtobufUtil.get(ProtobufUtil.java:1381) at org.apache.hadoop.hbase.client.HTable$3.call(HTable.java:753) at org.apache.hadoop.hbase.client.HTable$3.call(HTable.java:751) at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:120) at org.apache.hadoop.hbase.client.HTable.get(HTable.java:756) at org.apache.hadoop.hbase.client.HTable.get(HTable.java:765) at org.apache.hadoop.hbase.client.HTablePool$PooledHTable.get(HTablePool.java:395) This may be related to https://issues.apache.org/jira/browse/HBASE-11747 but that issue is related to cluster status. Scan and put operations on the same data work fine Tested on a 1.0.0 cluster with both 1.0.1 and 1.0.0 clients. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HBASE-13706) CoprocessorClassLoader should not exempt Hive classes
[ https://issues.apache.org/jira/browse/HBASE-13706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654548#comment-14654548 ]
Andrew Purtell edited comment on HBASE-13706 at 8/4/15 11:55 PM:
bq. no good reason [to exempt the Hadoop classes]
bq. The logic will be that all HBase classes and all their dependencies will be loaded by native/parent loader. All co-processor implementation classes and their dependencies will be loaded by the CoprocessorClassLoader, unless they spill over.
We could try only org.apache.hadoop.hbase.*. The subset of the Hadoop APIs relevant and useful for coprocessors is pretty big, so there could be unexpected/unintended consequences. If not going through a facade in org.apache.hadoop.hbase.*, whatever objects the coprocessor instantiates and interacts with will not have access to static shared state like UGI, the metrics subsystem registry, the FileSystem instance cache, etc. Working with HDFS, metrics, and security APIs would be interesting. We could try it only in master for a while. We could claim such things out of scope for coprocessors, but because we haven't up to now, it's a hairy backwards compatibility problem.
CoprocessorClassLoader should not exempt Hive classes
Key: HBASE-13706
URL: https://issues.apache.org/jira/browse/HBASE-13706
Project: HBase
Issue Type: Bug
Components: Coprocessors
Affects Versions: 2.0.0, 1.0.1, 1.1.0, 0.98.12
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
Fix For: 2.0.0, 0.98.14, 1.0.2, 1.1.3
Attachments: HBASE-13706.patch
CoprocessorClassLoader is used to load classes from the coprocessor jar. Certain classes are exempt from being loaded by this ClassLoader, which means they will be ignored in the coprocessor jar, but loaded from parent classpath instead. One problem is that we categorically exempt org.apache.hadoop. But it happens that Hive packages start with org.apache.hadoop. There is no reason to exclude Hive classes from the CoprocessorClassLoader. HBase does not even include Hive jars.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
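To make the prefix problem in HBASE-13706 concrete, here is a minimal sketch of a prefix-based exemption check. It is not the actual CoprocessorClassLoader code: the class name ExemptionSketch, the two prefix lists, and the isExempt helper are all hypothetical, chosen only to show why a blanket org.apache.hadoop. exemption also catches Hive classes while a narrower org.apache.hadoop.hbase. prefix would not.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; not the real CoprocessorClassLoader logic. */
class ExemptionSketch {
    // Assumed prefix lists, invented for this example.
    static final List<String> BROAD = Arrays.asList("java.", "org.apache.hadoop.");
    static final List<String> NARROW = Arrays.asList("java.", "org.apache.hadoop.hbase.");

    /** Returns true if the class would be delegated to the parent loader. */
    static boolean isExempt(String className, List<String> prefixes) {
        for (String prefix : prefixes) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String hiveClass = "org.apache.hadoop.hive.ql.exec.UDF";
        // With the broad prefix the Hive class is exempt, so the copy in the coprocessor jar is ignored.
        System.out.println(isExempt(hiveClass, BROAD));
        // With the narrower prefix the Hive class is loaded from the coprocessor jar.
        System.out.println(isExempt(hiveClass, NARROW));
    }
}
```

Whether narrowing the prefix is safe is exactly the compatibility question raised in the comment above; the sketch only shows the matching behaviour, not the consequences for shared static state.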
[jira] [Updated] (HBASE-14184) Fix indention and type-o in JavaHBaseContext
[ https://issues.apache.org/jira/browse/HBASE-14184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-14184: Status: Patch Available (was: Open) marking as patch available so QABot will run Fix indention and type-o in JavaHBaseContext Key: HBASE-14184 URL: https://issues.apache.org/jira/browse/HBASE-14184 Project: HBase Issue Type: Wish Components: spark Reporter: Ted Malaska Assignee: Ted Malaska Priority: Minor Attachments: HBASE-14184.3.patch Looks like there is a Ddd that should be Rdd. Also looks like everything is one space over too much -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14185) Incorrect region names logged by MemStoreFlusher.java
[ https://issues.apache.org/jira/browse/HBASE-14185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654566#comment-14654566 ]
Hadoop QA commented on HBASE-14185:
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12748709/HBASE-14185.patch against master branch at commit 931e77d4507e1650c452cefadda450e0bf3f0528.
ATTACHMENT ID: 12748709
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:green}+1 site{color}. The mvn post-site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:
{color:red}-1 core zombie tests{color}. There are 2 zombie test(s):
at org.apache.hadoop.hbase.regionserver.wal.TestLogRolling.testLogRollOnDatanodeDeath(TestLogRolling.java:393)
at org.apache.hadoop.hbase.regionserver.TestCorruptedRegionStoreFile.testLosingFileAfterScannerInit(TestCorruptedRegionStoreFile.java:167)
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14975//testReport/
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14975//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14975//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14975//console
This message is automatically generated.
Incorrect region names logged by MemStoreFlusher.java
Key: HBASE-14185
URL: https://issues.apache.org/jira/browse/HBASE-14185
Project: HBase
Issue Type: Bug
Components: regionserver
Reporter: Biju Nair
Assignee: Biju Nair
Priority: Minor
Attachments: HBASE-14185.patch
In MemstoreFlusher the method [flushOneForGlobalPressure|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java#L142] logs incorrect region names which makes debugging issues a bit difficult. Instead of logging the secondary replica region names in [these|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java#L200] [locations|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java#L205], the code logs the primary replica region names.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14146) Once replication sees an error it slows down forever
[ https://issues.apache.org/jira/browse/HBASE-14146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654365#comment-14654365 ] Hudson commented on HBASE-14146: FAILURE: Integrated in HBase-1.1 #596 (See [https://builds.apache.org/job/HBase-1.1/596/]) HBASE-14146 Fix Once replication sees an error it slows down forever (apurtell: rev 7dcd3c0bdfa8246cb7ca04bd1e21bae37776130b) * hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java Once replication sees an error it slows down forever Key: HBASE-14146 URL: https://issues.apache.org/jira/browse/HBASE-14146 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 1.2.0 Reporter: Elliott Clark Assignee: Elliott Clark Fix For: 2.0.0, 0.98.14, 1.0.2, 1.2.0, 1.1.2, 1.3.0 Attachments: HBASE-14146.patch sleepMultiplier inside of HBaseInterClusterReplicationEndpoint and ReplicationSource never gets reset to zero. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
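The HBASE-14146 description above is terse, so here is a minimal sketch of the kind of backoff it describes. This is not the ReplicationSource or HBaseInterClusterReplicationEndpoint code: the class BackoffSketch, the cap MAX_MULTIPLIER, and the method names are hypothetical, and resetting the multiplier to its starting value of 1 is an assumption made for the example.

```java
/** Hypothetical retry backoff; illustrates the missing reset, not the real HBase code. */
class BackoffSketch {
    static final int MAX_MULTIPLIER = 300; // assumed cap, for illustration
    private final long baseSleepMs;
    private int sleepMultiplier = 1;

    BackoffSketch(long baseSleepMs) {
        this.baseSleepMs = baseSleepMs;
    }

    /** After a failed attempt: returns how long to sleep, then grows the multiplier. */
    long onFailure() {
        long sleepMs = baseSleepMs * sleepMultiplier;
        if (sleepMultiplier < MAX_MULTIPLIER) {
            sleepMultiplier++;
        }
        return sleepMs;
    }

    /** After a successful attempt: without this reset, one error slows every later retry. */
    void onSuccess() {
        sleepMultiplier = 1;
    }

    public static void main(String[] args) {
        BackoffSketch backoff = new BackoffSketch(100L);
        System.out.println(backoff.onFailure()); // first failure sleeps for the base interval
        System.out.println(backoff.onFailure()); // second failure sleeps longer
        backoff.onSuccess();                     // reset after a successful attempt
        System.out.println(backoff.onFailure()); // back to the base interval
    }
}
```

If onSuccess() is never called, the multiplier only ever grows, which matches the "slows down forever" symptom in the issue title.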
[jira] [Commented] (HBASE-14085) Correct LICENSE and NOTICE files in artifacts
[ https://issues.apache.org/jira/browse/HBASE-14085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654510#comment-14654510 ]
Hadoop QA commented on HBASE-14085:
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12748736/HBASE-14085.3.patch against master branch at commit 931e77d4507e1650c452cefadda450e0bf3f0528.
ATTACHMENT ID: 12748736
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 30 new or modified tests.
{color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100:
+ <debug-print-included-work-info>${license.debug.print.included}</debug-print-included-work-info>
+ <resourceBundle>${project.groupId}:hbase-resource-bundle:${project.version}</resourceBundle>
+ <supplementalModelArtifact>${project.groupId}:hbase-resource-bundle:${project.version}</supplementalModelArtifact>
+ Build an aggregation of our templated NOTICE file and the NOTICE files in our dependencies.
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+1.1. “Contributor” means each individual or entity that creates or contributes to the creation of
+1.2. “Contributor Version” means the combination of the Original Software, prior Modifications used
+1.5. “Initial Developer” means the individual or entity that first makes Original Software available
+1.6. “Larger Work” means a work which combines Covered Software or portions thereof with code not
+1.8. “Licensable” means having the right to grant, to the maximum extent possible, whether at the
{color:green}+1 site{color}. The mvn post-site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:
org.apache.hadoop.hbase.client.TestMultiParallel
org.apache.hadoop.hbase.client.TestScannersFromClientSide
org.apache.hadoop.hbase.client.TestFromClientSideNoCodec
org.apache.hadoop.hbase.client.TestScannerTimeout
org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClientWithRegionReplicas
org.apache.hadoop.hbase.client.TestTableSnapshotScanner
org.apache.hadoop.hbase.client.TestMetaWithReplicas
org.apache.hadoop.hbase.namespace.TestNamespaceAuditor
org.apache.hadoop.hbase.client.TestHCM
org.apache.hadoop.hbase.client.TestSnapshotFromClientWithRegionReplicas
org.apache.hadoop.hbase.TestIOFencing
org.apache.hadoop.hbase.client.replication.TestReplicationAdminWithClusters
org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClient
org.apache.hadoop.hbase.client.TestClientTimeouts
org.apache.hadoop.hbase.client.TestMobSnapshotFromClient
org.apache.hadoop.hbase.client.TestCloneSnapshotFromClient
{color:red}-1 core zombie tests{color}. There are 4 zombie test(s):
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14974//testReport/
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14974//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14974//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14974//console
This message is automatically generated.
Correct LICENSE and NOTICE files in artifacts
Key: HBASE-14085
URL: https://issues.apache.org/jira/browse/HBASE-14085
Project: HBase
Issue
[jira] [Updated] (HBASE-13627) Terminating RS results in redundant CLOSE RPC
[ https://issues.apache.org/jira/browse/HBASE-13627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nick Dimiduk updated HBASE-13627:
Fix Version/s: (was: 1.1.2) 1.1.3
Terminating RS results in redundant CLOSE RPC
Key: HBASE-13627
URL: https://issues.apache.org/jira/browse/HBASE-13627
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 1.1.0
Reporter: Nick Dimiduk
Priority: Minor
Labels: beginner, beginners
Fix For: 2.0.0, 1.3.0, 1.2.1, 1.1.3
Noticed while testing the 1.1.0RC0 bits. It seems we're issuing a redundant close RPC during shutdown. This results in a logging warning for each region.
{noformat}
2015-05-06 00:07:19,214 INFO [RS:0;ndimiduk-apache-1-1-dist-6:56371] regionserver.HRegionServer: Received CLOSE for the region: 19cbe4fe2fe5335e7aace05e10e36ede, which we are already trying to CLOSE, but not completed yet
2015-05-06 00:07:19,214 WARN [RS:0;ndimiduk-apache-1-1-dist-6:56371] regionserver.HRegionServer: Failed to close cluster_test,,1430869443384.19cbe4fe2fe5335e7aace05e10e36ede. - ignoring and continuing
org.apache.hadoop.hbase.regionserver.RegionAlreadyInTransitionException: The region 19cbe4fe2fe5335e7aace05e10e36ede was already closing. New CLOSE request is ignored.
    at org.apache.hadoop.hbase.regionserver.HRegionServer.closeRegion(HRegionServer.java:2769)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.closeRegionIgnoreErrors(HRegionServer.java:2695)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.closeUserRegions(HRegionServer.java:2327)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:937)
    at java.lang.Thread.run(Thread.java:745)
{noformat}
1. launch a standalone cluster from tgz (./bin/start-hbase.sh)
2. load some data (ie, run bin/hbase ltt)
3. terminate cluster (./bin/stop-hbase.sh)
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
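For readers unfamiliar with why the second CLOSE in HBASE-13627 is rejected, the following is a rough sketch of an "already closing" guard. It is a simplified stand-in, not the HRegionServer implementation: the class CloseGuardSketch, its map, and both method names are hypothetical. It only shows how an atomic putIfAbsent lets the first close request proceed and lets a redundant one be detected.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical bookkeeping for in-flight region closes; not the real HRegionServer code. */
class CloseGuardSketch {
    // Encoded region name -> marker while a close is in progress.
    private final ConcurrentMap<String, Boolean> closing = new ConcurrentHashMap<>();

    /** Returns true if this call started the close, false if a close was already in progress. */
    boolean requestClose(String encodedRegionName) {
        return closing.putIfAbsent(encodedRegionName, Boolean.TRUE) == null;
    }

    /** Clears the marker once the close has completed. */
    void closeFinished(String encodedRegionName) {
        closing.remove(encodedRegionName);
    }

    public static void main(String[] args) {
        CloseGuardSketch guard = new CloseGuardSketch();
        String region = "19cbe4fe2fe5335e7aace05e10e36ede";
        System.out.println(guard.requestClose(region)); // first CLOSE proceeds
        System.out.println(guard.requestClose(region)); // redundant CLOSE is detected
    }
}
```

The guard makes the duplicate request harmless, but the issue is about not sending it in the first place, since each rejected request produces a WARN with a stack trace per region.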
[jira] [Updated] (HBASE-13603) Write test asserting desired priority of RS-Master RPCs
[ https://issues.apache.org/jira/browse/HBASE-13603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-13603: - Fix Version/s: (was: 1.1.2) 1.1.3 Write test asserting desired priority of RS-Master RPCs Key: HBASE-13603 URL: https://issues.apache.org/jira/browse/HBASE-13603 Project: HBase Issue Type: Test Components: rpc, test Reporter: Josh Elser Assignee: Josh Elser Priority: Minor Fix For: 2.0.0, 1.3.0, 1.2.1, 1.1.3 From HBASE-13351: {quote} Any way we can write a FT test to assert that the RS-Master APIs are treated with higher priority. I see your UT for asserting the annotation. {quote} Write a test that verifies expected RPCs are run on the correct pools in as real of an environment possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13627) Terminating RS results in redundant CLOSE RPC
[ https://issues.apache.org/jira/browse/HBASE-13627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-13627: - Labels: beginner beginners (was: ) Terminating RS results in redundant CLOSE RPC - Key: HBASE-13627 URL: https://issues.apache.org/jira/browse/HBASE-13627 Project: HBase Issue Type: Bug Components: master Affects Versions: 1.1.0 Reporter: Nick Dimiduk Priority: Minor Labels: beginner, beginners Fix For: 2.0.0, 1.3.0, 1.2.1, 1.1.3 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14185) Incorrect region names logged by MemStoreFlusher
[ https://issues.apache.org/jira/browse/HBASE-14185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14185: --- Summary: Incorrect region names logged by MemStoreFlusher (was: Incorrect region names logged by MemStoreFlusher.java) Incorrect region names logged by MemStoreFlusher Key: HBASE-14185 URL: https://issues.apache.org/jira/browse/HBASE-14185 Project: HBase Issue Type: Bug Components: regionserver Reporter: Biju Nair Assignee: Biju Nair Priority: Minor Fix For: 2.0.0, 1.2.0, 1.1.2, 1.3.0 Attachments: HBASE-14185.patch In MemstoreFlusher the method [flushOneForGlobalPressure|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java#L142] logs incorrect region names which makes debugging issues a bit difficult. Instead of logging the secondary replica region names in [these|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java#L200] [locations|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java#L205], the code logs the primary replica region names. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14185) Incorrect region names logged by MemStoreFlusher.java
[ https://issues.apache.org/jira/browse/HBASE-14185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654573#comment-14654573 ] Ted Yu commented on HBASE-14185: Ran TestCorruptedRegionStoreFile with patch locally and it passed. Incorrect region names logged by MemStoreFlusher.java - Key: HBASE-14185 URL: https://issues.apache.org/jira/browse/HBASE-14185 Project: HBase Issue Type: Bug Components: regionserver Reporter: Biju Nair Assignee: Biju Nair Priority: Minor Fix For: 2.0.0, 1.2.0, 1.1.2, 1.3.0 Attachments: HBASE-14185.patch In MemstoreFlusher the method [flushOneForGlobalPressure|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java#L142] logs incorrect region names which makes debugging issues a bit difficult. Instead of logging the secondary replica region names in [these|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java#L200] [locations|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java#L205], the code logs the primary replica region names. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14085) Correct LICENSE and NOTICE files in artifacts
[ https://issues.apache.org/jira/browse/HBASE-14085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654424#comment-14654424 ]
Andrew Purtell commented on HBASE-14085:
bq. I agree with that analysis. My prior statement presumed we continued our pattern of doing source + binary artifacts for votes. Which was correct, because I gave nothing indicating otherwise. My bad.
Correct LICENSE and NOTICE files in artifacts
Key: HBASE-14085
URL: https://issues.apache.org/jira/browse/HBASE-14085
Project: HBase
Issue Type: Task
Components: build
Affects Versions: 2.0.0, 0.94.28, 0.98.14, 1.0.2, 1.2.0, 1.1.2, 1.3.0
Reporter: Sean Busbey
Assignee: Sean Busbey
Priority: Blocker
Fix For: 2.0.0, 0.94.28, 0.98.14, 1.0.2, 1.2.0, 1.1.2
Attachments: HBASE-14085.1.patch, HBASE-14085.2.patch, HBASE-14085.3.patch
+Problems:
* checked LICENSE/NOTICE on binary
** binary artifact LICENSE file has not been updated to include the additional license terms for contained third party dependencies
** binary artifact NOTICE file does not include a copyright line
** binary artifact NOTICE file does not appear to propagate appropriate info from the NOTICE files from bundled dependencies
* checked NOTICE on source
** source artifact NOTICE file does not include a copyright line
** source NOTICE file includes notices for third party dependencies not included in the artifact
* checked NOTICE files shipped in maven jars
** copyright line only says 2015 when it's very likely the contents are under copyright prior to this year
* nit: NOTICE file on jars in maven say HBase - ${module} rather than Apache HBase - ${module} as required
refs:
http://www.apache.org/dev/licensing-howto.html#bundled-vs-non-bundled
http://www.apache.org/dev/licensing-howto.html#binary
http://www.apache.org/dev/licensing-howto.html#simple
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HBASE-14182) My regionserver change ip. But hmaster still connect to old ip after the rs restart
[ https://issues.apache.org/jira/browse/HBASE-14182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Purtell resolved HBASE-14182.
Resolution: Invalid
Please write in to u...@hbase.apache.org for help troubleshooting issues. This is the project dev tracker. Thanks!
My regionserver change ip. But hmaster still connect to old ip after the rs restart
Key: HBASE-14182
URL: https://issues.apache.org/jira/browse/HBASE-14182
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.98.6
Reporter: Heng Chen
I use Docker to deploy my HBase cluster, and the RS IP changed. After restarting this RS, the HMaster web UI shows that it has connected to the HMaster, but its region count is still zero after a long time. I checked the HMaster log and found that the master still uses the old IP to connect to this RS. The HMaster log is below. PS: 10.11.21.140 is the old IP of RS dx-ape-regionserver1-online
{code}
2015-08-04 17:24:00,081 INFO [AM.ZK.Worker-pool2-t14141] master.AssignmentManager: Assigning solar_image,\x01Y\x8E\xA3y,1434968237206.4a1bdeec85b9f55b962596f9fb2cd07f. to dx-ape-regionserver1-online,60020,1438679950072
2015-08-04 17:24:06,800 WARN [AM.ZK.Worker-pool2-t14133] master.AssignmentManager: Failed assignment of solar_image,\x00\x94\x09\x8D\x95,1430991781025.b0f5b755f443d41cf306026a60675020.
to dx-ape-regionserver1-online,60020,1438679950072, trying to assign elsewhere instead; try=3 of 10 java.net.ConnectException: Connection timed out at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493) at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupConnection(RpcClient.java:578) at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:868) at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1543) at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1442) at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661) at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719) at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:20964) at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:671) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2097) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1577) at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1550) at org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:104) at org.apache.hadoop.hbase.master.AssignmentManager.handleRegion(AssignmentManager.java:999) at org.apache.hadoop.hbase.master.AssignmentManager$6.run(AssignmentManager.java:1447) at org.apache.hadoop.hbase.master.AssignmentManager$3.run(AssignmentManager.java:1260) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) 2015-08-04 17:24:06,801 WARN [AM.ZK.Worker-pool2-t14140] master.AssignmentManager: Failed assignment of solar_image,\x00(.\xE7\xB1L,1430024620929.534025fcf4cae5516513b9c9a4cf73dc. to dx-ape-regionserver1-online,60020,1438679950072, trying to assign elsewhere instead; try=2 of 10 java.net.ConnectException: Call to dx-ape-regionserver1-online/10.11.21.140:60020 failed on connection exception: java.net.ConnectException: Connection timed out at org.apache.hadoop.hbase.ipc.RpcClient.wrapException(RpcClient.java:1483) at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1461) at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1661) at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1719) at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.openRegion(AdminProtos.java:20964) at
[jira] [Commented] (HBASE-13889) hbase-shaded-client artifact is missing dependency (therefore, does not work)
[ https://issues.apache.org/jira/browse/HBASE-13889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654515#comment-14654515 ] Nick Dimiduk commented on HBASE-13889: -- [~busbey] I haven't held up 1.1.x releases for a fix here, though it's disappointing we didn't get it tested properly the first time. [~dminkovsky] have you had any luck with this one? hbase-shaded-client artifact is missing dependency (therefore, does not work) - Key: HBASE-13889 URL: https://issues.apache.org/jira/browse/HBASE-13889 Project: HBase Issue Type: Bug Components: Client Affects Versions: 1.1.0, 1.1.0.1 Environment: N/A? Reporter: Dmitry Minkovsky Priority: Blocker Fix For: 2.0.0, 1.1.2, 1.3.0, 1.2.1 Attachments: 13889.wip.patch, Screen Shot 2015-06-11 at 10.59.55 AM.png The {{hbase-shaded-client}} artifact was introduced in [HBASE-13517|https://issues.apache.org/jira/browse/HBASE-13517]. Thank you very much for this, as I am new to Java building and was having a very slow-moving time resolving conflicts. However, the shaded client artifact seems to be missing {{javax.xml.transform.TransformerException}}. I examined the JAR, which does not have this package/class. 
Steps to reproduce:
Java:
{code}
package com.mycompany.app;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class App {
    public static void main( String[] args ) throws java.io.IOException {
        Configuration config = HBaseConfiguration.create();
        Connection connection = ConnectionFactory.createConnection(config);
    }
}
{code}
POM:
{code}
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.mycompany.app</groupId>
  <artifactId>my-app</artifactId>
  <version>1.0-SNAPSHOT</version>
[jira] [Updated] (HBASE-13143) TestCacheOnWrite is flaky and needs a diet
[ https://issues.apache.org/jira/browse/HBASE-13143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nick Dimiduk updated HBASE-13143:
Fix Version/s: (was: 1.1.2) 1.1.3
TestCacheOnWrite is flaky and needs a diet
Key: HBASE-13143
URL: https://issues.apache.org/jira/browse/HBASE-13143
Project: HBase
Issue Type: Bug
Affects Versions: 0.98.11
Reporter: Andrew Purtell
Assignee: Esteban Gutierrez
Priority: Critical
Fix For: 2.0.0, 0.98.14, 1.3.0, 1.2.1, 1.0.3, 1.1.3
TestCacheOnWrite passes locally but has been flaking in 0.98 builds on Jenkins, most recently https://builds.apache.org/job/HBase-0.98/878/ The test takes a long time to execute (338.492 sec) and is resource intensive (216 tests). Neither of these characteristics endears it to Jenkins. When I ran this unit test on a MacBook, after a minute the fan was running so fast I thought it would take flight.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13996) Add write sniffing in canary
[ https://issues.apache.org/jira/browse/HBASE-13996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nick Dimiduk updated HBASE-13996:
Fix Version/s: (was: 1.1.2)
This looks like a feature enhancement, not a bug fix. Hence dropping from 1.1.x line.
Add write sniffing in canary
Key: HBASE-13996
URL: https://issues.apache.org/jira/browse/HBASE-13996
Project: HBase
Issue Type: New Feature
Components: canary
Affects Versions: 0.98.13, 1.1.0.1
Reporter: Liu Shaohui
Assignee: Liu Shaohui
Fix For: 2.0.0, 0.98.14
Attachments: HBASE-13996-v001.diff
Currently the canary tool only sniffs read operations, so it is hard to find problems in the write path. To support write sniffing, we create a system table named '_canary_' in the canary tool. The tool makes sure that the number of regions is larger than the number of regionservers and that the regions are distributed across all regionservers. Periodically, the tool puts data to these regions to calculate the write availability of HBase and sends alerts if needed.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
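As a rough illustration of the write-sniffing idea in HBASE-13996, the sketch below computes a write availability figure from one test put per region. It does not use the HBase client API and is not taken from the attached patch: WriteCanarySketch, the RegionWriter interface, and both helper methods are hypothetical stand-ins, and defining availability as the fraction of regions that accepted the put is an assumption made for this example.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical write-canary logic; the real tool would issue puts through the HBase client. */
class WriteCanarySketch {
    /** Stand-in for "put one test row into this region"; throws on failure. */
    interface RegionWriter {
        void put(String region) throws Exception;
    }

    /** The description asks for more regions than regionservers so every server can host one. */
    static boolean hasEnoughRegions(int regionCount, int serverCount) {
        return regionCount > serverCount;
    }

    /** Fraction of regions that accepted a write; 0.0 when there are no regions. */
    static double writeAvailability(Iterable<String> regions, RegionWriter writer) {
        int succeeded = 0;
        int total = 0;
        for (String region : regions) {
            total++;
            try {
                writer.put(region);
                succeeded++;
            } catch (Exception e) {
                // A failed put counts against availability; a real tool might also alert here.
            }
        }
        return total == 0 ? 0.0 : (double) succeeded / total;
    }

    public static void main(String[] args) {
        List<String> regions = Arrays.asList("r1", "r2", "r3", "r4");
        // Simulate one unavailable region.
        double availability = writeAvailability(regions, region -> {
            if (region.equals("r3")) {
                throw new java.io.IOException("region not writable");
            }
        });
        System.out.println(availability);
    }
}
```

An alert threshold on the returned value would be the natural next step, but the issue does not say what threshold the tool uses, so none is assumed here.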
[jira] [Updated] (HBASE-13605) RegionStates should not keep its list of dead servers
[ https://issues.apache.org/jira/browse/HBASE-13605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nick Dimiduk updated HBASE-13605:
Fix Version/s: (was: 1.1.2) 1.1.3
Deferring from 1.1.2.
RegionStates should not keep its list of dead servers
Key: HBASE-13605
URL: https://issues.apache.org/jira/browse/HBASE-13605
Project: HBase
Issue Type: Bug
Components: Region Assignment
Reporter: Enis Soztutar
Assignee: Enis Soztutar
Priority: Critical
Fix For: 2.0.0, 1.1.3
Attachments: hbase-13605_v1.patch, hbase-13605_v3-branch-1.1.patch, hbase-13605_v4-branch-1.1.patch, hbase-13605_v4-master.patch
As mentioned in https://issues.apache.org/jira/browse/HBASE-9514?focusedCommentId=13769761&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13769761 and HBASE-12844, we should have only 1 source of cluster membership. The list of dead servers, and RegionStates doing its own liveness check (ServerManager.isServerReachable()), has caused an assignment problem again in a test cluster, where RegionStates thinks that the server is dead and that SSH will handle the region assignment. However, the RS is not dead at all, living happily, and never gets zk expiry or YouAreDeadException or anything. This leaves the list of regions unassigned in OFFLINE state.
Master assigning the region:
{code}
15-04-20 09:02:25,780 DEBUG [AM.ZK.Worker-pool3-t330] master.RegionStates: Onlined 77dddcd50c22e56bfff133c0e1f9165b on os-amb-r6-us-1429512014-hbase4-6.novalocal,16020,1429520535268 {ENCODED = 77dddcd50c
{code}
Master then disabled the table, and unassigned the region:
{code}
2015-04-20 09:02:27,158 WARN [ProcedureExecutorThread-1] zookeeper.ZKTableStateManager: Moving table loadtest_d1 state from DISABLING to DISABLING
Starting unassign of loadtest_d1,,1429520544378.77dddcd50c22e56bfff133c0e1f9165b.
(offlining), current state: {77dddcd50c22e56bfff133c0e1f9165b state=OPEN, ts=1429520545780, server=os-amb-r6-us-1429512014-hbase4-6.novalocal,16020,1429520535268} bleProcedure$BulkDisabler-0] master.AssignmentManager: Sent CLOSE to os-amb-r6-us-1429512014-hbase4-6.novalocal,16020,1429520535268 for region loadtest_d1,,1429520544378.77dddcd50c22e56bfff133c0e1f9165b. 2015-04-20 09:02:27,414 INFO [AM.ZK.Worker-pool3-t316] master.RegionStates: Offlined 77dddcd50c22e56bfff133c0e1f9165b from os-amb-r6-us-1429512014-hbase4-6.novalocal,16020,1429520535268 {code} On table re-enable, AM does not assign the region: {code} 2015-04-20 09:02:30,415 INFO [ProcedureExecutorThread-3] balancer.BaseLoadBalancer: Reassigned 25 regions. 25 retained the pre-restart assignment.· 2015-04-20 09:02:30,415 INFO [ProcedureExecutorThread-3] procedure.EnableTableProcedure: Bulk assigning 25 region(s) across 5 server(s), retainAssignment=true l,16000,1429515659726-GeneralBulkAssigner-4] master.RegionStates: Couldn't reach online server os-amb-r6-us-1429512014-hbase4-6.novalocal,16020,1429520535268 l,16000,1429515659726-GeneralBulkAssigner-4] master.AssignmentManager: Updating the state to OFFLINE to allow to be reassigned by SSH nmentManager: Skip assigning loadtest_d1,,1429520544378.77dddcd50c22e56bfff133c0e1f9165b., it is on a dead but not processed yet server: os-amb-r6-us-1429512014-hbase4-6.novalocal,16020,1429520535268 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11819) Unit test for CoprocessorHConnection
[ https://issues.apache.org/jira/browse/HBASE-11819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nick Dimiduk updated HBASE-11819:
Fix Version/s: (was: 1.1.2) 1.1.3

What happened with the commit here, folks? Has this been released on any branch?

Unit test for CoprocessorHConnection
Key: HBASE-11819
URL: https://issues.apache.org/jira/browse/HBASE-11819
Project: HBase
Issue Type: Test
Reporter: Andrew Purtell
Assignee: Talat UYARER
Priority: Minor
Labels: beginner
Fix For: 2.0.0, 0.98.14, 1.3.0, 1.2.1, 1.1.3
Attachments: HBASE-11819v4-master.patch, HBASE-11819v5-0.98 (1).patch, HBASE-11819v5-0.98.patch, HBASE-11819v5-master (1).patch, HBASE-11819v5-master.patch, HBASE-11819v5-master.patch, HBASE-11819v5-v0.98.patch, HBASE-11819v5-v1.0.patch

Add a unit test to hbase-server that exercises CoprocessorHConnection.
[jira] [Commented] (HBASE-13706) CoprocessorClassLoader should not exempt Hive classes
[ https://issues.apache.org/jira/browse/HBASE-13706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654548#comment-14654548 ]

Andrew Purtell commented on HBASE-13706:

bq. no good reason [to exempt the Hadoop classes]
bq. The logic will be that all HBase classes and all their dependencies will be loaded by native/parent loader. All co-processor implementation classes and their dependencies will be loaded by the CoprocessorClassLoader, unless they spill over.

We could try only org.apache.hadoop.hbase.*. The subset of the Hadoop APIs relevant and useful for coprocessors is pretty big, so there could be unexpected/unintended consequences. If not going through a facade in org.apache.hadoop.hbase.*, whatever objects the coprocessor instantiates and interacts with will not have access to static shared state like UGI, the metrics subsystem registry, the FileSystem instance cache, etc. Working with HDFS, metrics, and security APIs would be interesting. We could try it only in master for a while. We could claim such things out of scope for coprocessors, but because we haven't up to now, it's a hairy backwards compatibility problem.

CoprocessorClassLoader should not exempt Hive classes
Key: HBASE-13706
URL: https://issues.apache.org/jira/browse/HBASE-13706
Project: HBase
Issue Type: Bug
Components: Coprocessors
Affects Versions: 2.0.0, 1.0.1, 1.1.0, 0.98.12
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor
Fix For: 2.0.0, 0.98.14, 1.0.2, 1.1.3
Attachments: HBASE-13706.patch

CoprocessorClassLoader is used to load classes from the coprocessor jar. Certain classes are exempt from being loaded by this ClassLoader, which means they will be ignored in the coprocessor jar and loaded from the parent classpath instead. One problem is that we categorically exempt org.apache.hadoop. But it happens that Hive packages start with org.apache.hadoop. There is no reason to exclude Hive classes from the CoprocessorClassLoader. HBase does not even include Hive jars.
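To make the prefix problem in HBASE-13706 concrete, here is a minimal sketch of a prefix-based exemption check. It is an illustration only, not the actual CoprocessorClassLoader code: the class name `ExemptionSketch`, the two prefix lists, and the `isExempt` helper are all hypothetical. The `BROAD` list models the blanket org.apache.hadoop. exemption described in the issue, and `NARROW` models the "only org.apache.hadoop.hbase.*" option raised in the comment.

```java
import java.util.Arrays;
import java.util.List;

/** Illustration only: a prefix-based exemption check, not HBase's CoprocessorClassLoader. */
class ExemptionSketch {
    // Hypothetical list modelling the blanket "org.apache.hadoop." exemption from the issue.
    static final List<String> BROAD = Arrays.asList("java.", "org.apache.hadoop.");
    // Hypothetical list modelling the narrower "only org.apache.hadoop.hbase.*" option.
    static final List<String> NARROW = Arrays.asList("java.", "org.apache.hadoop.hbase.");

    /** Returns true if the class would be delegated to the parent loader under these prefixes. */
    static boolean isExempt(String className, List<String> prefixes) {
        for (String prefix : prefixes) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String hiveClass = "org.apache.hadoop.hive.ql.exec.UDF";
        // Under the blanket rule the Hive class is exempt, so the copy in the coprocessor jar is ignored.
        System.out.println("broad:  " + isExempt(hiveClass, BROAD));
        // Under the narrower rule it is no longer exempt and can be loaded from the coprocessor jar.
        System.out.println("narrow: " + isExempt(hiveClass, NARROW));
    }
}
```

The narrower list also stops exempting HDFS, metrics, and security classes, which is the compatibility concern raised in the comment above.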
[jira] [Updated] (HBASE-14185) Incorrect region names logged by MemStoreFlusher.java
[ https://issues.apache.org/jira/browse/HBASE-14185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-14185:
Fix Version/s: 1.3.0 1.1.2 1.2.0 2.0.0

Incorrect region names logged by MemStoreFlusher.java
Key: HBASE-14185
URL: https://issues.apache.org/jira/browse/HBASE-14185
Project: HBase
Issue Type: Bug
Components: regionserver
Reporter: Biju Nair
Assignee: Biju Nair
Priority: Minor
Fix For: 2.0.0, 1.2.0, 1.1.2, 1.3.0
Attachments: HBASE-14185.patch

In MemStoreFlusher, the method [flushOneForGlobalPressure|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java#L142] logs incorrect region names, which makes debugging issues a bit difficult. Instead of logging the secondary replica region names in [these|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java#L200] [locations|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java#L205], the code logs the primary replica region names.
[jira] [Commented] (HBASE-13825) Use ProtobufUtil#mergeFrom and ProtobufUtil#mergeDelimitedFrom in place of builder methods of same name
[ https://issues.apache.org/jira/browse/HBASE-13825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654430#comment-14654430 ]

Andrew Purtell commented on HBASE-13825:

Thanks [~esteban]. Or, if you like, I can add what you've identified as missing to the patch for master here.

Use ProtobufUtil#mergeFrom and ProtobufUtil#mergeDelimitedFrom in place of builder methods of same name
Key: HBASE-13825
URL: https://issues.apache.org/jira/browse/HBASE-13825
Project: HBase
Issue Type: Bug
Affects Versions: 1.0.0, 1.0.1
Reporter: Dev Lakhani
Assignee: Andrew Purtell
Fix For: 2.0.0, 0.98.14, 1.0.2, 1.2.0, 1.1.2, 1.3.0
Attachments: HBASE-13825-0.98.patch, HBASE-13825-0.98.patch, HBASE-13825-branch-1.patch, HBASE-13825-branch-1.patch, HBASE-13825.patch

When performing a get operation on a column family with more than 64MB of data, the operation fails with:

Caused by: Portable(java.io.IOException): Call to host:port failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit.
at org.apache.hadoop.hbase.ipc.RpcClient.wrapException(RpcClient.java:1481)
at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1453)
at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1653)
at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1711)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.get(ClientProtos.java:27308)
at org.apache.hadoop.hbase.protobuf.ProtobufUtil.get(ProtobufUtil.java:1381)
at org.apache.hadoop.hbase.client.HTable$3.call(HTable.java:753)
at org.apache.hadoop.hbase.client.HTable$3.call(HTable.java:751)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:120)
at org.apache.hadoop.hbase.client.HTable.get(HTable.java:756)
at org.apache.hadoop.hbase.client.HTable.get(HTable.java:765)
at org.apache.hadoop.hbase.client.HTablePool$PooledHTable.get(HTablePool.java:395)

This may be related to https://issues.apache.org/jira/browse/HBASE-11747, but that issue concerns cluster status. Scan and put operations on the same data work fine. Tested on a 1.0.0 cluster with both 1.0.1 and 1.0.0 clients.
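The failure mode in HBASE-13825 is a reader that refuses input past a fixed size limit. The sketch below imitates that behaviour with plain java.io so it runs without protobuf on the classpath. It is a stand-in, not protobuf's CodedInputStream and not the ProtobufUtil helpers named in the issue title: `SizeLimitSketch`, `readWithLimit`, and `parses` are hypothetical names, and the 64MB constant is taken from the threshold reported in the issue description.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Stand-in for a size-limited message reader; this is NOT protobuf's CodedInputStream. */
class SizeLimitSketch {
    // 64MB, the threshold reported in the issue description.
    static final long REPORTED_LIMIT = 64L * 1024 * 1024;

    /** Consumes the stream and fails as soon as more than 'limit' bytes have been read. */
    static long readWithLimit(InputStream in, long limit) throws IOException {
        byte[] chunk = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(chunk)) != -1) {
            total += n;
            if (total > limit) {
                throw new IOException("Message was too large: more than " + limit + " bytes");
            }
        }
        return total;
    }

    /** Returns true if the payload can be read in full under the given limit. */
    static boolean parses(byte[] payload, long limit) {
        try {
            readWithLimit(new ByteArrayInputStream(payload), limit);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Small sizes keep the demo cheap; the same logic applies at REPORTED_LIMIT.
        byte[] payload = new byte[2048];
        System.out.println("fixed limit of 1024 bytes: " + parses(payload, 1024));
        System.out.println("limit sized to the payload: " + parses(payload, payload.length));
    }
}
```

A reader with a fixed limit rejects any response above it regardless of whether the data is valid, while a limit derived from the actual payload size does not. How the real ProtobufUtil#mergeFrom and ProtobufUtil#mergeDelimitedFrom configure the limit is not shown in this message, so treat the second call as an assumption about the general approach.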
[jira] [Updated] (HBASE-13271) Table#puts(List<Put>) operation is indeterminate; needs fixing
[ https://issues.apache.org/jira/browse/HBASE-13271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nick Dimiduk updated HBASE-13271:
Fix Version/s: (was: 1.1.2) 1.1.3

Table#puts(List<Put>) operation is indeterminate; needs fixing
Key: HBASE-13271
URL: https://issues.apache.org/jira/browse/HBASE-13271
Project: HBase
Issue Type: Improvement
Components: API
Affects Versions: 1.0.0
Reporter: stack
Priority: Critical
Fix For: 2.0.0, 1.3.0, 1.2.1, 1.0.3, 1.1.3

Another API issue found by [~larsgeorge]: Table.put(List<Put>) is questionable after the API change.
{code}
[Mar-17 9:21 AM] Lars George: Table.put(List<Put>) is weird since you cannot flush partial lists
[Mar-17 9:21 AM] Lars George: Say out of 5 the third is broken, then the put() call returns with a local exception (say empty Put) and then you have 2 that are in the buffer
[Mar-17 9:21 AM] Lars George: but how to you force commit them?
[Mar-17 9:22 AM] Lars George: In the past you would call flushCache(), but that is gone now
[Mar-17 9:22 AM] Lars George: and flush() is not available on a Table
[Mar-17 9:22 AM] Lars George: And you cannot access the underlying BufferedMutation neither
[Mar-17 9:23 AM] Lars George: You can *only* add more Puts if you can, or call close()
[Mar-17 9:23 AM] Lars George: that is just weird to explain
{code}
So, either Table needs to get flush back, or we deprecate this method, or the implementation flushes immediately and does not return until complete.
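The scenario from the chat log in HBASE-13271 can be reproduced with a toy model. The sketch below is not the HBase client: `BufferedTableSketch` and its fields are hypothetical, puts are modelled as plain strings, and an empty string stands in for the "empty Put" that fails local validation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of a buffering table; hypothetical names, not the HBase client API. */
class BufferedTableSketch {
    final List<String> buffer = new ArrayList<>();    // accepted locally, not yet sent
    final List<String> committed = new ArrayList<>(); // stands in for "written to the server"

    /** Validates and buffers puts one by one; an empty put fails locally, mid-list. */
    void put(List<String> puts) {
        for (String p : puts) {
            if (p.isEmpty()) {
                throw new IllegalArgumentException("empty Put");
            }
            buffer.add(p);
        }
    }

    /** In this model, close() is the only call that drains the buffer, as in the chat above. */
    void close() {
        committed.addAll(buffer);
        buffer.clear();
    }

    public static void main(String[] args) {
        BufferedTableSketch table = new BufferedTableSketch();
        try {
            // Five puts, the third one broken.
            table.put(Arrays.asList("row1", "row2", "", "row4", "row5"));
        } catch (IllegalArgumentException e) {
            // The first two puts are now stranded in the buffer; nothing has been committed.
            System.out.println("buffered=" + table.buffer + " committed=" + table.committed);
        }
        table.close();
        System.out.println("after close: committed=" + table.committed);
    }
}
```

After the exception the caller cannot tell, from the exception alone, which puts were buffered, and without a flush the two stranded puts stay uncommitted until more puts arrive or the table is closed. That is the indeterminacy the issue asks to fix.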
[jira] [Updated] (HBASE-13452) HRegion warning about memstore size miscalculation is not actionable
[ https://issues.apache.org/jira/browse/HBASE-13452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nick Dimiduk updated HBASE-13452:
Fix Version/s: (was: 1.1.2) 1.1.3

Bumping from 1.1.2.

HRegion warning about memstore size miscalculation is not actionable
Key: HBASE-13452
URL: https://issues.apache.org/jira/browse/HBASE-13452
Project: HBase
Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Dev Lakhani
Assignee: Mikhail Antonov
Priority: Critical
Fix For: 2.0.0, 1.2.1, 1.0.3, 1.1.3

During normal operation the HRegion class reports a message related to memstore flushing in HRegion.class:

if (!canFlush) {
  addAndGetGlobalMemstoreSize(-memstoreSize.get());
} else if (memstoreSize.get() != 0) {
  LOG.error("Memstore size is " + memstoreSize.get());
}

The log file is filled with lots of:

Memstore size is 558744
Memstore size is 4390632
Memstore size is 558744
...

These messages are uninformative, clog up the logs, and offer neither a root cause nor a solution. Maybe the message needs to be more informative, be changed to WARN, or be accompanied by further information.
[jira] [Updated] (HBASE-14185) Incorrect region names logged by MemStoreFlusher
[ https://issues.apache.org/jira/browse/HBASE-14185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-14185:
Resolution: Fixed
Status: Resolved (was: Patch Available)

Thanks for the patch, Biju.

Incorrect region names logged by MemStoreFlusher
Key: HBASE-14185
URL: https://issues.apache.org/jira/browse/HBASE-14185
Project: HBase
Issue Type: Bug
Components: regionserver
Reporter: Biju Nair
Assignee: Biju Nair
Priority: Minor
Fix For: 2.0.0, 1.2.0, 1.1.2, 1.3.0
Attachments: HBASE-14185.patch

In MemStoreFlusher, the method [flushOneForGlobalPressure|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java#L142] logs incorrect region names, which makes debugging issues a bit difficult. Instead of logging the secondary replica region names in [these|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java#L200] [locations|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java#L205], the code logs the primary replica region names.
[jira] [Commented] (HBASE-13613) Add test to verify DDL operations from admin when the master is restarted
[ https://issues.apache.org/jira/browse/HBASE-13613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654591#comment-14654591 ]

Nick Dimiduk commented on HBASE-13613:

Now when I run, I see
{noformat}
$ mvn clean test -Dtest=TestMasterRestartWithProcedures
...
Running org.apache.hadoop.hbase.master.procedure.TestMasterRestartWithProcedures
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 188.423 sec FAILURE! - in org.apache.hadoop.hbase.master.procedure.TestMasterRestartWithProcedures
testMasterProcsWithMasterRestarting(org.apache.hadoop.hbase.master.procedure.TestMasterRestartWithProcedures) Time elapsed: 187.858 sec ERROR!
java.lang.Exception: test timed out after 18 milliseconds
at java.lang.Object.wait(Native Method)
at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:160)
at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3989)
at org.apache.hadoop.hbase.client.HBaseAdmin.access$500(HBaseAdmin.java:186)
at org.apache.hadoop.hbase.client.HBaseAdmin$ProcedureFuture.getProcedureResult(HBaseAdmin.java:4364)
at org.apache.hadoop.hbase.client.HBaseAdmin$ProcedureFuture.waitProcedureResult(HBaseAdmin.java:4316)
at org.apache.hadoop.hbase.client.HBaseAdmin$ProcedureFuture.get(HBaseAdmin.java:4272)
at org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:567)
at org.apache.hadoop.hbase.master.procedure.TestMasterRestartWithProcedures$1.call(TestMasterRestartWithProcedures.java:106)
at org.apache.hadoop.hbase.master.procedure.TestMasterRestartWithProcedures$1.call(TestMasterRestartWithProcedures.java:101)
at org.apache.hadoop.hbase.master.procedure.TestMasterRestartWithProcedures.runRollbackableOperation(TestMasterRestartWithProcedures.java:171)
at org.apache.hadoop.hbase.master.procedure.TestMasterRestartWithProcedures.testMasterProcsWithMasterRestarting(TestMasterRestartWithProcedures.java:101)
{noformat}

Add test to verify DDL operations from admin when the master is restarted
Key: HBASE-13613
URL: https://issues.apache.org/jira/browse/HBASE-13613
Project: HBase
Issue Type: Sub-task
Components: proc-v2, test
Affects Versions: 2.0.0, 1.1.0, 1.2.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
Fix For: 2.0.0, 1.1.2, 1.3.0, 1.2.1
Attachments: HBASE-13613-v0.patch, org.apache.hadoop.hbase.master.procedure.TestMasterRestartWithProcedures-output.txt

Following an offline conversation with [~syuanjiang], we should add back TestMasterRestartWithProcedures. It is more or less similar to HBASE-13470, but runs in a controlled unit-test form. It exercises a kill at a random point of execution plus WAL replay of the procedure.