[jira] [Commented] (HBASE-10436) hbase 0.96+ jmx does not have regionserver info any more.
[ https://issues.apache.org/jira/browse/HBASE-10436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906736#comment-13906736 ]

Elliott Clark commented on HBASE-10436:
---------------------------------------

+1 hbase-10436.notbean.patch

hbase 0.96+ jmx does not have regionserver info any more.
---------------------------------------------------------

Key: HBASE-10436
URL: https://issues.apache.org/jira/browse/HBASE-10436
Project: HBase
Issue Type: Bug
Affects Versions: 0.98.0, 0.96.0, 0.99.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Priority: Critical
Attachments: hbase-10436.notbean.patch, hbase-10436.patch, hbase-10436.v2.patch

HBase 0.96's refactored JMX beans do not contain the master's list of dead region servers and live region servers with load info. HBase 0.94 did (though in a single monolithic blob). This JMX interface should be considered as much of an API as the normal wire or Java API. Dropping values from it was done without deprecation, and the removal of this information is a functional regression. We should provide the information in the 0.96+ JMX.

HBase 0.94 had a monolithic JMX blob (hadoop:service=Master,name=Master) that contained a lot of information, including the regionserver list and the cached regionserver load for each region found on the master webpage. 0.96+ refactored this into several JMX beans which can be selectively retrieved. These include:
* hadoop:service=HBase,name=Master,sub=AssignmentManager
* hadoop:service=HBase,name=Master,sub=Balancer
* hadoop:service=HBase,name=Master,sub=Server
* hadoop:service=HBase,name=Master,sub=FileSystem

Specifically, the (Hadoop:service=HBase,name=Master,sub=Server) listing that used to contain regionservers and deadregionservers in JMX was replaced with numRegionServers and numDeadRegionservers, which only contain counts.

I propose just adding another mbean called RegionServers under the bean: hadoop:service=HBase,name=Master,sub=RegionServers

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
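To illustrate the bean-per-concern model the refactoring introduced, here is a minimal, self-contained sketch of registering and reading an MBean by ObjectName. The RegionServers bean and its NumRegionServers attribute are hypothetical stand-ins, not the actual contents of the patch.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class JmxBeanSketch {
    // Standard MBean convention: interface name = implementation class name + "MBean".
    public interface RegionServersMBean {
        int getNumRegionServers();
    }

    public static class RegionServers implements RegionServersMBean {
        public int getNumRegionServers() { return 3; } // stand-in value
    }

    // Registers the (hypothetical) bean and reads one attribute back,
    // the same way a monitoring tool would query the Master sub-beans above.
    public static Object readAttribute() throws Exception {
        MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
        ObjectName name =
            new ObjectName("Hadoop:service=HBase,name=Master,sub=RegionServers");
        if (!mbs.isRegistered(name)) {
            mbs.registerMBean(new RegionServers(), name);
        }
        // The getter getNumRegionServers() surfaces as attribute "NumRegionServers".
        return mbs.getAttribute(name, "NumRegionServers");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readAttribute());
    }
}
```

A client-side reader only needs the ObjectName string, which is why keeping these names stable matters as much as any Java API.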
[jira] [Commented] (HBASE-10525) Allow the client to use a different thread for writing to ease interrupt
[ https://issues.apache.org/jira/browse/HBASE-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906749#comment-13906749 ]

Enis Soztutar commented on HBASE-10525:
---------------------------------------

I was testing this, and it seems that there is an issue when the connection is closed (RS killed). It may be that the CallSender can still accept more calls even after the Connection is closed and shouldCloseConnection is set. In case CallSender.cleanUp() runs first and clears the queue, any call added to the CallSender queue afterwards will not get a notification, and thus will hang. I think we may need to sync CallSender's accepting of Calls with the Connection.out. Other ways might also be possible. Here is a dump where the RPC callee threads are just hanging forever:
{code}
htable-pool15-t2 daemon prio=10 tid=0x01f2e800 nid=0x1bc3 in Object.wait() [0x7fca48f72000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1435)
	- locked <0xbd6a0230> (a org.apache.hadoop.hbase.ipc.RpcClient$Call)
	at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1655)
	at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1713)
	at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.multi(ClientProtos.java:29300)
	at org.apache.hadoop.hbase.client.MultiServerCallable.call(MultiServerCallable.java:125)
	at org.apache.hadoop.hbase.client.MultiServerCallable.call(MultiServerCallable.java:53)
	at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:186)
	at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl$SingleServerRequestRunnable.run(AsyncProcess.java:658)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
	at java.lang.Thread.run(Thread.java:722)

htable-pool23-t2 daemon prio=10 tid=0x7fca4c5db800 nid=0x1bbb in Object.wait() [0x7fca49779000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1435)
	- locked <0xbd6a0578> (a org.apache.hadoop.hbase.ipc.RpcClient$Call)
	at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1655)
	at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1713)
	at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.multi(ClientProtos.java:29300)
	at org.apache.hadoop.hbase.client.MultiServerCallable.call(MultiServerCallable.java:125)
	at org.apache.hadoop.hbase.client.MultiServerCallable.call(MultiServerCallable.java:53)
	at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:186)
	at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl$SingleServerRequestRunnable.run(AsyncProcess.java:658)
	at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.sendMultiAction(AsyncProcess.java:850)
	at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.groupAndSendMultiAction(AsyncProcess.java:824)
	at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.logAndResubmit(AsyncProcess.java:998)
	at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.receiveGlobalFailure(AsyncProcess.java:952)
	at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.access$1000(AsyncProcess.java:546)
	at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl$SingleServerRequestRunnable.run(AsyncProcess.java:662)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
	at java.lang.Thread.run(Thread.java:722)

HBaseWriterThread_13 prio=10 tid=0x7fca5cf68000 nid=0x1b95 in Object.wait() [0x7fca4bda]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
	at java.lang.Object.wait(Native
{code}
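The race described above can be modeled in a few lines. This standalone sketch is an assumption about the shape of a fix, not the RpcClient code itself: it shows why queue admission and cleanup must share a lock, so that once cleanUp() has run, offer() fails fast instead of parking a call that nobody will ever notify.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Standalone model of the CallSender/Connection race; names and structure
// are illustrative, not taken from RpcClient.
class ConnectionModel {
    private final Queue<Runnable> pending = new ArrayDeque<>();
    private boolean closed = false;

    // Admission: refuses new calls once the connection is closed, so a
    // call can never be queued after cleanUp() has drained the queue.
    synchronized boolean offer(Runnable onFailure) {
        if (closed) {
            return false; // caller sees the failure immediately instead of hanging
        }
        return pending.add(onFailure);
    }

    // Cleanup: marks the connection closed under the same lock, then
    // notifies every call that was queued before the close.
    synchronized void cleanUp() {
        closed = true;
        Runnable r;
        while ((r = pending.poll()) != null) {
            r.run();
        }
    }
}
```

Without the shared lock, a call admitted between the close and the drain is exactly the TIMED_WAITING thread seen in the dump.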
[jira] [Created] (HBASE-10575) ReplicationSource thread can't be terminated if it runs into the loop and fails to contact peer's zk ensemble continuously
Feng Honghua created HBASE-10575:
---------------------------------

Summary: ReplicationSource thread can't be terminated if it runs into the loop and fails to contact peer's zk ensemble continuously
Key: HBASE-10575
URL: https://issues.apache.org/jira/browse/HBASE-10575
Project: HBase
Issue Type: Bug
Components: Replication
Affects Versions: 0.98.1, 0.99.0, 0.94.17
Reporter: Feng Honghua
Assignee: Feng Honghua
Fix For: 0.99.0

When the ReplicationSource thread runs into the loop to contact the peer's zk ensemble, it doesn't check isActive() before each retry, so if the given peer's zk ensemble is unreachable for some reason, the ReplicationSource thread can't be terminated externally (e.g., by removePeer).
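A minimal model of the missing check (names below are illustrative; the real loop lives in ReplicationSource): testing isActive() before every retry and sleep lets an external caller such as removePeer stop the thread even while the peer stays unreachable.

```java
import java.util.function.BooleanSupplier;

// Standalone sketch of the retry loop, not ReplicationSource itself.
class RetrySketch {
    private volatile boolean active = true;
    int attempts = 0;

    // Called externally (e.g., by removePeer in the real code).
    void terminate() { active = false; }

    boolean isActive() { return active; }

    // Returns true once tryConnect succeeds, false if terminated or out of tries.
    boolean connectWithRetries(BooleanSupplier tryConnect, int maxAttempts) {
        while (isActive() && attempts < maxAttempts) {
            attempts++;
            if (tryConnect.getAsBoolean()) {
                return true;
            }
            // The fix: re-check before sleeping again, so termination takes
            // effect even when the peer never answers.
            if (!isActive()) {
                break;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return false;
    }
}
```

With the isActive() checks omitted, the loop spins (and sleeps) until maxAttempts regardless of terminate(), which is the hang reported here.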
[jira] [Updated] (HBASE-10575) ReplicationSource thread can't be terminated if it runs into the loop and fails to contact peer's zk ensemble continuously
[ https://issues.apache.org/jira/browse/HBASE-10575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Feng Honghua updated HBASE-10575:
---------------------------------
Priority: Critical (was: Major)

ReplicationSource thread can't be terminated if it runs into the loop and fails to contact peer's zk ensemble continuously
---------------------------------------------------------------------------------------------------------------------------

Key: HBASE-10575
URL: https://issues.apache.org/jira/browse/HBASE-10575
Project: HBase
Issue Type: Bug
Components: Replication
Affects Versions: 0.98.1, 0.99.0, 0.94.17
Reporter: Feng Honghua
Assignee: Feng Honghua
Priority: Critical
Fix For: 0.99.0

When the ReplicationSource thread runs into the loop to contact the peer's zk ensemble, it doesn't check isActive() before each retry, so if the given peer's zk ensemble is unreachable for some reason, the ReplicationSource thread can't be terminated externally (e.g., by removePeer).
[jira] [Updated] (HBASE-10575) ReplicationSource thread can't be terminated if it runs into the loop to contact peer's zk ensemble and fails continuously
[ https://issues.apache.org/jira/browse/HBASE-10575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Feng Honghua updated HBASE-10575:
---------------------------------
Summary: ReplicationSource thread can't be terminated if it runs into the loop to contact peer's zk ensemble and fails continuously (was: ReplicationSource thread can't be terminated if it runs into the loop and fails to contact peer's zk ensemble continuously)

ReplicationSource thread can't be terminated if it runs into the loop to contact peer's zk ensemble and fails continuously
---------------------------------------------------------------------------------------------------------------------------

Key: HBASE-10575
URL: https://issues.apache.org/jira/browse/HBASE-10575
Project: HBase
Issue Type: Bug
Components: Replication
Affects Versions: 0.98.1, 0.99.0, 0.94.17
Reporter: Feng Honghua
Assignee: Feng Honghua
Priority: Critical
Fix For: 0.99.0

When the ReplicationSource thread runs into the loop to contact the peer's zk ensemble, it doesn't check isActive() before each retry, so if the given peer's zk ensemble is unreachable for some reason, the ReplicationSource thread can't be terminated externally (e.g., by removePeer).
[jira] [Updated] (HBASE-10575) ReplicationSource thread can't be terminated if it runs into the loop to contact peer's zk ensemble and fails continuously
[ https://issues.apache.org/jira/browse/HBASE-10575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Feng Honghua updated HBASE-10575:
---------------------------------
Attachment: HBASE-10575-trunk_v1.patch

Patch attached for the fix, plus two minor changes:
# exit immediately without sleeping if isActive()==false after each failed try
# close this.conn and print the ReplicationSource exiting log for premature thread-exit as well

ReplicationSource thread can't be terminated if it runs into the loop to contact peer's zk ensemble and fails continuously
---------------------------------------------------------------------------------------------------------------------------

Key: HBASE-10575
URL: https://issues.apache.org/jira/browse/HBASE-10575
Project: HBase
Issue Type: Bug
Components: Replication
Affects Versions: 0.98.1, 0.99.0, 0.94.17
Reporter: Feng Honghua
Assignee: Feng Honghua
Priority: Critical
Fix For: 0.99.0
Attachments: HBASE-10575-trunk_v1.patch

When the ReplicationSource thread runs into the loop to contact the peer's zk ensemble, it doesn't check isActive() before each retry, so if the given peer's zk ensemble is unreachable for some reason, the ReplicationSource thread can't be terminated externally (e.g., by removePeer).
[jira] [Commented] (HBASE-10575) ReplicationSource thread can't be terminated if it runs into the loop to contact peer's zk ensemble and fails continuously
[ https://issues.apache.org/jira/browse/HBASE-10575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906766#comment-13906766 ]

Feng Honghua commented on HBASE-10575:
--------------------------------------

Looks like all branches have this same bug. I have checked 0.94, 0.98 and 0.99...

ReplicationSource thread can't be terminated if it runs into the loop to contact peer's zk ensemble and fails continuously
---------------------------------------------------------------------------------------------------------------------------

Key: HBASE-10575
URL: https://issues.apache.org/jira/browse/HBASE-10575
Project: HBase
Issue Type: Bug
Components: Replication
Affects Versions: 0.98.1, 0.99.0, 0.94.17
Reporter: Feng Honghua
Assignee: Feng Honghua
Priority: Critical
Fix For: 0.99.0
Attachments: HBASE-10575-trunk_v1.patch

When the ReplicationSource thread runs into the loop to contact the peer's zk ensemble, it doesn't check isActive() before each retry, so if the given peer's zk ensemble is unreachable for some reason, the ReplicationSource thread can't be terminated externally (e.g., by removePeer).
[jira] [Updated] (HBASE-6642) enable_all,disable_all,drop_all can call list command with regex directly.
[ https://issues.apache.org/jira/browse/HBASE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matteo Bertozzi updated HBASE-6642:
-----------------------------------
Resolution: Fixed
Fix Version/s: 0.99.0
Status: Resolved (was: Patch Available)

Committed to trunk. [~lhofhansl] [~stack] [~apurtell] your decision on backporting it to 94, 96, and 98. This may be considered an incompatible change if people rely on globs for the list commands, but the other side is the problem with the enable_all/delete_all operations, which operate on a set other than the one the user requested.

enable_all,disable_all,drop_all can call list command with regex directly.
---------------------------------------------------------------------------

Key: HBASE-6642
URL: https://issues.apache.org/jira/browse/HBASE-6642
Project: HBase
Issue Type: Bug
Components: shell
Affects Versions: 0.94.6, 0.95.0
Reporter: Y. SREENIVASULU REDDY
Assignee: Matteo Bertozzi
Fix For: 0.99.0
Attachments: HBASE-6642-v1.patch, HBASE-6642_trunk.patch

Created a few tables, then performed the disable_all operation at the shell prompt, but it does not perform the operation successfully.
{noformat}
hbase(main):043:0> disable_all '*'
table12
zk0113
zk0114
Disable the above 3 tables (y/n)? y
3 tables successfully disabled
{noformat}
It just shows the message, but the operation does not succeed. Only the following form performs successfully:
{noformat}
hbase(main):043:0> disable_all '*.*'
table12
zk0113
zk0114
Disable the above 3 tables (y/n)? y
3 tables successfully disabled
{noformat}
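The behavior difference boils down to regex vs. glob semantics. As a quick illustration (table names taken from the report above): '.*' is the Java regex that matches every name, while a bare '*' is not even a valid pattern.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class ListRegexDemo {
    static final List<String> TABLES = Arrays.asList("table12", "zk0113", "zk0114");

    // Count tables whose full name matches the given Java regex.
    static long countMatches(String regex) {
        Pattern p = Pattern.compile(regex);
        return TABLES.stream().filter(t -> p.matcher(t).matches()).count();
    }

    public static void main(String[] args) {
        System.out.println(countMatches(".*")); // regex form: matches all 3 tables
        try {
            countMatches("*"); // glob-style '*' is a dangling quantifier in regex
        } catch (PatternSyntaxException e) {
            System.out.println("'*' is not a valid regex");
        }
    }
}
```

Routing the *_all commands through the same regex matching as 'list' makes the confirmed table set and the operated-on set identical.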
[jira] [Commented] (HBASE-9203) Secondary index support through coprocessors
[ https://issues.apache.org/jira/browse/HBASE-9203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906872#comment-13906872 ]

hoar commented on HBASE-9203:
-----------------------------

Hi rajeshbabu, how do I add a new index on an existing table? I get the HTableDescriptor using HBaseAdmin.getTableDescriptor(tableName); how do I convert it to an IndexedHTableDescriptor?

Secondary index support through coprocessors
--------------------------------------------

Key: HBASE-9203
URL: https://issues.apache.org/jira/browse/HBASE-9203
Project: HBase
Issue Type: New Feature
Reporter: rajeshbabu
Assignee: rajeshbabu
Attachments: SecondaryIndex Design.pdf, SecondaryIndex Design_Updated.pdf, SecondaryIndex Design_Updated_2.pdf

We have been working on implementing secondary index in HBase and open sourced it on HBase version 0.94.8. The project is available on github: https://github.com/Huawei-Hadoop/hindex

This JIRA is to support secondary index on trunk (0.98). The following features will be supported:
- multiple indexes on a table,
- multi-column index,
- index based on part of a column value,
- equals and range condition scans using index, and
- bulk loading data to an indexed table (indexing done with bulk load)

Most of the kernel changes needed for secondary index are available in trunk. Very minimal changes are needed for it.
[jira] [Updated] (HBASE-10575) ReplicationSource thread can't be terminated if it runs into the loop to contact peer's zk ensemble and fails continuously
[ https://issues.apache.org/jira/browse/HBASE-10575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Feng Honghua updated HBASE-10575:
---------------------------------
Status: Patch Available (was: Open)

ReplicationSource thread can't be terminated if it runs into the loop to contact peer's zk ensemble and fails continuously
---------------------------------------------------------------------------------------------------------------------------

Key: HBASE-10575
URL: https://issues.apache.org/jira/browse/HBASE-10575
Project: HBase
Issue Type: Bug
Components: Replication
Affects Versions: 0.98.1, 0.99.0, 0.94.17
Reporter: Feng Honghua
Assignee: Feng Honghua
Priority: Critical
Fix For: 0.99.0
Attachments: HBASE-10575-trunk_v1.patch

When the ReplicationSource thread runs into the loop to contact the peer's zk ensemble, it doesn't check isActive() before each retry, so if the given peer's zk ensemble is unreachable for some reason, the ReplicationSource thread can't be terminated externally (e.g., by removePeer).
[jira] [Commented] (HBASE-9203) Secondary index support through coprocessors
[ https://issues.apache.org/jira/browse/HBASE-9203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906880#comment-13906880 ]

rajeshbabu commented on HBASE-9203:
-----------------------------------

If you have some data then you can use the org.apache.hadoop.hbase.index.mapreduce.TableIndexer mapreduce job to create the index for existing data. Are you referring to IndexedHTableDescriptor from https://github.com/Huawei-Hadoop/hindex/blob/master/secondaryindex/src/main/java/org/apache/hadoop/hbase/index/IndexedHTableDescriptor.java? If yes, currently you need to create an IndexedHTableDescriptor, set the columns and properties of the HTableDescriptor on the IndexedHTableDescriptor object, and add the new indices (it's a little ugly). We plan to support adding and dropping indices dynamically. In the current work we are setting the index details as metadata on the HTableDescriptor (see HTableDescriptor.java#setValue(byte[], byte[])), so we need not have a special table descriptor (IndexedHTableDescriptor). For example:
{code}
TableIndices tableIndices = new TableIndices();
tableIndices.addIndex(iSpec);
htd.setValue(Constants.INDEX_SPEC_KEY, tableIndices.toByteArray());
{code}

Secondary index support through coprocessors
--------------------------------------------

Key: HBASE-9203
URL: https://issues.apache.org/jira/browse/HBASE-9203
Project: HBase
Issue Type: New Feature
Reporter: rajeshbabu
Assignee: rajeshbabu
Attachments: SecondaryIndex Design.pdf, SecondaryIndex Design_Updated.pdf, SecondaryIndex Design_Updated_2.pdf

We have been working on implementing secondary index in HBase and open sourced it on HBase version 0.94.8. The project is available on github: https://github.com/Huawei-Hadoop/hindex

This JIRA is to support secondary index on trunk (0.98). The following features will be supported:
- multiple indexes on a table,
- multi-column index,
- index based on part of a column value,
- equals and range condition scans using index, and
- bulk loading data to an indexed table (indexing done with bulk load)

Most of the kernel changes needed for secondary index are available in trunk. Very minimal changes are needed for it.
[jira] [Commented] (HBASE-10531) Revisit how the key byte[] is passed to HFileScanner.seekTo and reseekTo
[ https://issues.apache.org/jira/browse/HBASE-10531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906902#comment-13906902 ]

ramkrishna.s.vasudevan commented on HBASE-10531:
------------------------------------------------

The reason I did not add the new API implementation is that we may have to change the comparators and related code to go with this mode of working. So I thought that, instead of giving an empty impl for the new API, we would continue with the old one for now. Okay, so let me change the required things to get this in as part of this JIRA only.

Revisit how the key byte[] is passed to HFileScanner.seekTo and reseekTo
-------------------------------------------------------------------------

Key: HBASE-10531
URL: https://issues.apache.org/jira/browse/HBASE-10531
Project: HBase
Issue Type: Sub-task
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Fix For: 0.99.0
Attachments: HBASE-10531.patch

Currently the byte[] key passed to HFileScanner.seekTo and HFileScanner.reseekTo is a combination of row, cf, qual, type and ts. The caller forms this using kv.getBuffer, which is deprecated. So we should see how this can be achieved once kv.getBuffer is removed.
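For context, the flattened key the callers currently assemble follows the classic KeyValue key layout. This sketch is an illustration of that serialization, not HBase code:

```java
import java.nio.ByteBuffer;

public class SeekKeySketch {
    // Serializes (row, cf, qualifier, ts, type) into one byte[] in the
    // classic KeyValue key layout: 2-byte row length, row, 1-byte family
    // length, family, qualifier, 8-byte timestamp, 1-byte type.
    static byte[] makeKey(byte[] row, byte[] family, byte[] qualifier,
                          long timestamp, byte type) {
        int len = 2 + row.length + 1 + family.length + qualifier.length + 8 + 1;
        ByteBuffer buf = ByteBuffer.allocate(len);
        buf.putShort((short) row.length);
        buf.put(row);
        buf.put((byte) family.length);
        buf.put(family);
        buf.put(qualifier);
        buf.putLong(timestamp);
        buf.put(type);
        return buf.array();
    }
}
```

Moving to a Cell-based seekTo avoids materializing this intermediate byte[] at all, which is what makes the comparator changes necessary.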
[jira] [Commented] (HBASE-10575) ReplicationSource thread can't be terminated if it runs into the loop to contact peer's zk ensemble and fails continuously
[ https://issues.apache.org/jira/browse/HBASE-10575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906933#comment-13906933 ]

Hadoop QA commented on HBASE-10575:
-----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12630020/HBASE-10575-trunk_v1.patch
against trunk revision .

ATTACHMENT ID: 12630020

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

{color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile.

{color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile.

{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100

{color:green}+1 site{color}. The mvn site goal succeeds with this patch.

{color:red}-1 core tests{color}. The patch failed these unit tests:
org.apache.hadoop.hbase.snapshot.TestFlushSnapshotFromClient

{color:red}-1 core zombie tests{color}. There are 1 zombie test(s):
at org.apache.hadoop.hbase.regionserver.wal.TestLogRolling.testLogRollOnDatanodeDeath(TestLogRolling.java:354)

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8755//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8755//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8755//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8755//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8755//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8755//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8755//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8755//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8755//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8755//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8755//console

This message is automatically generated.
ReplicationSource thread can't be terminated if it runs into the loop to contact peer's zk ensemble and fails continuously
---------------------------------------------------------------------------------------------------------------------------

Key: HBASE-10575
URL: https://issues.apache.org/jira/browse/HBASE-10575
Project: HBase
Issue Type: Bug
Components: Replication
Affects Versions: 0.98.1, 0.99.0, 0.94.17
Reporter: Feng Honghua
Assignee: Feng Honghua
Priority: Critical
Fix For: 0.99.0
Attachments: HBASE-10575-trunk_v1.patch

When the ReplicationSource thread runs into the loop to contact the peer's zk ensemble, it doesn't check isActive() before each retry, so if the given peer's zk ensemble is unreachable for some reason, the ReplicationSource thread can't be terminated externally (e.g., by removePeer).
[jira] [Commented] (HBASE-6642) enable_all,disable_all,drop_all can call list command with regex directly.
[ https://issues.apache.org/jira/browse/HBASE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906945#comment-13906945 ]

Hudson commented on HBASE-6642:
-------------------------------

SUCCESS: Integrated in HBase-TRUNK #4935 (See [https://builds.apache.org/job/HBase-TRUNK/4935/])
HBASE-6642 enable_all,disable_all,drop_all can call 'list' command with regex directly (mbertozzi: rev 1570128)
* /hbase/trunk/hbase-shell/src/main/ruby/hbase/admin.rb
* /hbase/trunk/hbase-shell/src/main/ruby/hbase/replication_admin.rb
* /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/disable_all.rb
* /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/drop_all.rb
* /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/enable_all.rb
* /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/list_namespace.rb
* /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/list_replicated_tables.rb
* /hbase/trunk/hbase-shell/src/main/ruby/shell/commands/list_snapshots.rb

enable_all,disable_all,drop_all can call list command with regex directly.
---------------------------------------------------------------------------

Key: HBASE-6642
URL: https://issues.apache.org/jira/browse/HBASE-6642
Project: HBase
Issue Type: Bug
Components: shell
Affects Versions: 0.94.6, 0.95.0
Reporter: Y. SREENIVASULU REDDY
Assignee: Matteo Bertozzi
Fix For: 0.99.0
Attachments: HBASE-6642-v1.patch, HBASE-6642_trunk.patch

Created a few tables, then performed the disable_all operation at the shell prompt, but it does not perform the operation successfully.
{noformat}
hbase(main):043:0> disable_all '*'
table12
zk0113
zk0114
Disable the above 3 tables (y/n)? y
3 tables successfully disabled
{noformat}
It just shows the message, but the operation does not succeed. Only the following form performs successfully:
{noformat}
hbase(main):043:0> disable_all '*.*'
table12
zk0113
zk0114
Disable the above 3 tables (y/n)? y
3 tables successfully disabled
{noformat}
[jira] [Commented] (HBASE-10575) ReplicationSource thread can't be terminated if it runs into the loop to contact peer's zk ensemble and fails continuously
[ https://issues.apache.org/jira/browse/HBASE-10575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906946#comment-13906946 ]

Feng Honghua commented on HBASE-10575:
--------------------------------------

Unit tests pass in my local run, and the failed cases look like they have nothing to do with the patch... weird.

ReplicationSource thread can't be terminated if it runs into the loop to contact peer's zk ensemble and fails continuously
---------------------------------------------------------------------------------------------------------------------------

Key: HBASE-10575
URL: https://issues.apache.org/jira/browse/HBASE-10575
Project: HBase
Issue Type: Bug
Components: Replication
Affects Versions: 0.98.1, 0.99.0, 0.94.17
Reporter: Feng Honghua
Assignee: Feng Honghua
Priority: Critical
Fix For: 0.99.0
Attachments: HBASE-10575-trunk_v1.patch

When the ReplicationSource thread runs into the loop to contact the peer's zk ensemble, it doesn't check isActive() before each retry, so if the given peer's zk ensemble is unreachable for some reason, the ReplicationSource thread can't be terminated externally (e.g., by removePeer).
[jira] [Created] (HBASE-10576) Custom load balancer to co-locate the regions of two tables which are having same split keys
rajeshbabu created HBASE-10576:
----------------------------------

Summary: Custom load balancer to co-locate the regions of two tables which are having same split keys
Key: HBASE-10576
URL: https://issues.apache.org/jira/browse/HBASE-10576
Project: HBase
Issue Type: Sub-task
Components: Balancer
Reporter: rajeshbabu
Assignee: rajeshbabu

To support local indexing, both the user table and the index table should have the same split keys. This issue is to provide a custom balancer to co-locate the regions of two tables which have the same split keys. This helps in Phoenix as well.
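The invariant such a balancer must maintain can be sketched without the HBase LoadBalancer API (the class and method names below are made up for illustration): because both tables share split keys, regions can be grouped by start key and each group pinned to a single server.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ColocationSketch {
    // regionsByStartKey: start key -> the regions (user table + index table)
    // that begin at that key. Returns region -> server, assigning every
    // region in a group to the same server, round-robin across servers.
    static Map<String, String> colocate(Map<String, List<String>> regionsByStartKey,
                                        List<String> servers) {
        Map<String, String> assignment = new LinkedHashMap<>();
        int i = 0;
        for (Map.Entry<String, List<String>> group : regionsByStartKey.entrySet()) {
            String server = servers.get(i++ % servers.size());
            for (String region : group.getValue()) {
                assignment.put(region, server); // user and index regions land together
            }
        }
        return assignment;
    }
}
```

A real balancer must also re-establish this grouping after splits, merges, and server failures, which is where most of the implementation effort goes.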
[jira] [Commented] (HBASE-10574) IllegalArgumentException Hadoop Hbase
[ https://issues.apache.org/jira/browse/HBASE-10574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906980#comment-13906980 ]

Jean-Marc Spaggiari commented on HBASE-10574:
---------------------------------------------

Hi, I think it's more a usage concern than a defect. Would you mind asking on the mailing list instead of in a JIRA? JM

IllegalArgumentException Hadoop Hbase
-------------------------------------

Key: HBASE-10574
URL: https://issues.apache.org/jira/browse/HBASE-10574
Project: HBase
Issue Type: Test
Components: hadoop2
Affects Versions: 0.96.0
Environment: Windows
Reporter: SSR
Priority: Critical
Original Estimate: 96h
Remaining Estimate: 96h

Hi All, we are trying to load data into HBase. We are able to connect to HBase from Eclipse. We are following the tutorial at: http://courses.coreservlets.com/Course-Materials/pdf/hadoop/04-MapRed-4-InputAndOutput.pdf

When we run the program we get the exception below.

2014-02-20 10:28:04,099 INFO [main] mapreduce.JobSubmitter (JobSubmitter.java:submitJobInternal(439)) - Cleaning up the staging area file:/tmp/hadoop-yarakanaboinas/mapred/staging/yarakanaboinas1524547448/.staging/job_local1524547448_0001
Exception in thread "main" java.lang.IllegalArgumentException: Pathname /C:/hdp/hbase-0.96.0.2.0.6.0-0009-hadoop2/lib/hbase-client-0.96.0.2.0.6.0-0009-hadoop2.jar from hdfs://HBADGX7900016:8020/C:/hdp/hbase-0.96.0.2.0.6.0-0009-hadoop2/lib/hbase-client-0.96.0.2.0.6.0-0009-hadoop2.jar is not a valid DFS filename.
	at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:184)
	at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:92)
	at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1106)
	at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1102)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1102)
	at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
	at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
	at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:93)
	at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
	at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:264)
	at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:300)
	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:387)
	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
	at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1286)
	at WordCountMapper.StartWithCountJob_HBase.run(StartWithCountJob_HBase.java:41)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
	at WordCountMapper.StartWithCountJob_HBase.main(StartWithCountJob_HBase.java:44)
[jira] [Updated] (HBASE-10525) Allow the client to use a different thread for writing to ease interrupt
[ https://issues.apache.org/jira/browse/HBASE-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Liochon updated HBASE-10525: Status: Open (was: Patch Available) Allow the client to use a different thread for writing to ease interrupt Key: HBASE-10525 URL: https://issues.apache.org/jira/browse/HBASE-10525 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.99.0 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Fix For: 0.99.0 Attachments: 10525.v1.patch, 10525.v2.patch, 10525.v3.patch, HBaseclient-EventualConsistency.pdf This is an issue in the HBASE-10070 context, but also more generally whenever you want to interrupt an operation at a limited cost. I will attach a doc with a more detailed explanation. This adds a thread per region server, so it's optional. The first patch activates it by default to see how it behaves on a full hadoop-qa run. The target is for it to be unset by default. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10525) Allow the client to use a different thread for writing to ease interrupt
[ https://issues.apache.org/jira/browse/HBASE-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Liochon updated HBASE-10525: Attachment: 10525.v4.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10525) Allow the client to use a different thread for writing to ease interrupt
[ https://issues.apache.org/jira/browse/HBASE-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907002#comment-13907002 ] Nicolas Liochon commented on HBASE-10525: - Thanks a lot for the detailed analysis, Enis. v4 fixes it and adds a test.
This scenario was already ok:
(connection marked as closed)
reader close - cleans up expected reads
writer close - cleans up calls to write
There is no possible race condition between the two, because we check the connection status in writeRequest. If you did the cleanup, it means that connectionClosed was true, so the writeRequest will fail.
Another scenario was NOT ok:
new client gets a connection - connection is opened
(connection marked as closed)
reader close - cleans up expected reads
writer close - cleans up calls to write
client thread adds a call to calls to write
The solution is to add a check after we add the call, exactly as in the first scenario. This does not require another synchronization point. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
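The fix described above is an add-then-recheck pattern. A minimal sketch, with hypothetical names (ConnectionSketch, shouldCloseConnection) rather than the actual RpcClient code:

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the add-then-recheck pattern: names are
// illustrative, not the real HBase RpcClient fields.
class ConnectionSketch {
    static class Call {
        final int id;
        volatile boolean failed;
        Call(int id) { this.id = id; }
    }

    private final ConcurrentHashMap<Integer, Call> calls = new ConcurrentHashMap<>();
    private volatile boolean shouldCloseConnection;

    // Cleanup path: mark the connection closed, then fail pending calls.
    void markClosedAndCleanup() {
        shouldCloseConnection = true;
        for (Call c : calls.values()) { c.failed = true; }
        calls.clear();
    }

    // Client thread: enqueue the call, then re-check the closed flag.
    // If cleanup ran between the put and the check, we fail the call
    // ourselves instead of leaving it stranded in the map forever.
    boolean writeRequest(Call c) {
        calls.put(c.id, c);
        if (shouldCloseConnection) {
            calls.remove(c.id);
            c.failed = true;
            return false;
        }
        return true;
    }
}
```

Because the recheck happens after the put, either the cleanup thread sees the call in the map, or the client thread sees the closed flag; no extra lock is needed.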
[jira] [Updated] (HBASE-10525) Allow the client to use a different thread for writing to ease interrupt
[ https://issues.apache.org/jira/browse/HBASE-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Liochon updated HBASE-10525: Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10566) cleanup rpcTimeout in the client
[ https://issues.apache.org/jira/browse/HBASE-10566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907013#comment-13907013 ] Nicolas Liochon commented on HBASE-10566: - Any feedback? Is the approach for the rpcController ok? cleanup rpcTimeout in the client Key: HBASE-10566 URL: https://issues.apache.org/jira/browse/HBASE-10566 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.99.0 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Fix For: 0.99.0 Attachments: 10566.sample.patch There are two issues: 1) A confusion between the socket timeout and the call timeout. Socket timeouts should be minimal: a default like 20 seconds, which could be lowered to single-digit timeouts for some apps: if we cannot write to the socket in 10 seconds, we have an issue. This is different from the total duration (send query + do query + receive query), which can be longer, as it can include remote calls on the server and so on. Today we have a single value, so it does not allow us to have low socket read timeouts. 2) The timeout can differ between the calls. Typically, if the total time, retries included, is 60 seconds but the first try failed after 2 seconds, then the remaining budget is 58s. HBase does this today, but by hacking with a thread-local storage variable. It's a hack (it should have been a parameter of the methods; the TLS allowed bypassing all the layers. Maybe protobuf makes this complicated, to be confirmed), but it also does not really work, because we can have multithreading issues (we use the updated rpc timeout of someone else, or we create a new BlockingRpcChannelImplementation with a random default timeout). Ideally, we could send the call timeout to the server as well: it would then be able to dismiss on its own the calls that it received but that got stuck in the request queue or in the internal retries (on hdfs for example). This would make the system more reactive to failure. I think we can solve this now, especially after 10525. The main issue is finding something that fits well with protobuf... Then it should be easy to have a pool of threads for writers and readers, without a single thread per region server as today. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
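The per-call budget described in point 2 is essentially deadline bookkeeping: fix a short socket timeout, and recompute the remaining operation time before each retry. An illustrative sketch (hypothetical names, not the actual HBase client code):

```java
// Hypothetical sketch of the two-timeout split discussed above: a short,
// fixed socket timeout plus a per-operation deadline from which the
// remaining budget is recomputed before every retry.
class TimeoutSketch {
    // Wire-level timeout: if we cannot write/read the socket in this time,
    // something is wrong, regardless of how long the whole operation may take.
    static final long SOCKET_TIMEOUT_MS = 20_000;

    private final long operationDeadlineNanos;

    TimeoutSketch(long operationTimeoutMs) {
        this.operationDeadlineNanos = System.nanoTime() + operationTimeoutMs * 1_000_000L;
    }

    // Remaining budget for the next retry, e.g. 60s total - 2s spent = 58s.
    long remainingMs() {
        long leftNanos = operationDeadlineNanos - System.nanoTime();
        return Math.max(0, leftNanos / 1_000_000L);
    }

    boolean expired() { return remainingMs() == 0; }
}
```

Carrying the deadline in the call object (and possibly shipping it to the server) avoids the thread-local hack, since each call owns its own budget.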
[jira] [Updated] (HBASE-10520) Add handling for swallowed InterruptedException thrown by Thread.sleep in MiniZooKeeperCluster
[ https://issues.apache.org/jira/browse/HBASE-10520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Liochon updated HBASE-10520: Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk, thanks for the patch! Add handling for swallowed InterruptedException thrown by Thread.sleep in MiniZooKeeperCluster -- Key: HBASE-10520 URL: https://issues.apache.org/jira/browse/HBASE-10520 Project: HBase Issue Type: Sub-task Components: Zookeeper Affects Versions: 0.98.0, 0.99.0 Reporter: Feng Honghua Assignee: Feng Honghua Priority: Minor Fix For: 0.99.0 Attachments: HBASE-10520-trunk_v1.patch, HBASE-10520-trunk_v2.patch, HBASE-10520-trunk_v3.patch A sub-task of HBASE-10497 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10521) Add handling for swallowed InterruptedException thrown by Thread.sleep in RpcServer
[ https://issues.apache.org/jira/browse/HBASE-10521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Liochon updated HBASE-10521: Summary: Add handling for swallowed InterruptedException thrown by Thread.sleep in RpcServer (was: Add handling for swallowed InterruptedException thrown by Thread.sleep in RpcServer and RpcClient) Add handling for swallowed InterruptedException thrown by Thread.sleep in RpcServer --- Key: HBASE-10521 URL: https://issues.apache.org/jira/browse/HBASE-10521 Project: HBase Issue Type: Sub-task Components: IPC/RPC Reporter: Feng Honghua Assignee: Feng Honghua Priority: Minor Fix For: 0.99.0 Attachments: HBASE-10521-trunk_v1.patch, HBASE-10521-trunk_v2.patch, HBASE-10521-trunk_v3.patch, HBASE-10521-trunk_v3.patch A sub-task of HBASE-10497 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10521) Add handling for swallowed InterruptedException thrown by Thread.sleep in RpcServer
[ https://issues.apache.org/jira/browse/HBASE-10521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Liochon updated HBASE-10521: Resolution: Fixed Fix Version/s: 0.99.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10521) Add handling for swallowed InterruptedException thrown by Thread.sleep in RpcServer
[ https://issues.apache.org/jira/browse/HBASE-10521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907040#comment-13907040 ] Nicolas Liochon commented on HBASE-10521: - Thanks for the check, I committed it to trunk. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10521) Add handling for swallowed InterruptedException thrown by Thread.sleep in RpcServer
[ https://issues.apache.org/jira/browse/HBASE-10521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Liochon updated HBASE-10521: Affects Version/s: 0.99.0 0.98.0 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10522) Correct wrong handling and add proper handling for swallowed InterruptedException thrown by Thread.sleep in client
[ https://issues.apache.org/jira/browse/HBASE-10522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Liochon updated HBASE-10522: Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk! Correct wrong handling and add proper handling for swallowed InterruptedException thrown by Thread.sleep in client -- Key: HBASE-10522 URL: https://issues.apache.org/jira/browse/HBASE-10522 Project: HBase Issue Type: Sub-task Components: Client Affects Versions: 0.98.0, 0.99.0 Reporter: Feng Honghua Assignee: Feng Honghua Fix For: 0.99.0 Attachments: HBASE-10522-trunk_v1.patch, HBASE-10522-trunk_v2.patch, HBASE-10522-trunk_v3.patch A sub-task of HBASE-10497 # rethrow rather than ignore InterruptedException thrown in deleteTable, this behavior is to align with other similar methods such as createTable/enableTable/disableTable # correct some wrong handling of InterruptedException where Thread.currentThread.interrupt() is called within while loops -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10519) Add handling for swallowed InterruptedException thrown by Thread.sleep in rest related files
[ https://issues.apache.org/jira/browse/HBASE-10519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Liochon updated HBASE-10519: Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk, nearly done for this set. Thanks again for the patch, Feng. Add handling for swallowed InterruptedException thrown by Thread.sleep in rest related files Key: HBASE-10519 URL: https://issues.apache.org/jira/browse/HBASE-10519 Project: HBase Issue Type: Sub-task Components: REST Affects Versions: 0.98.0, 0.99.0 Reporter: Feng Honghua Assignee: Feng Honghua Priority: Minor Fix For: 0.99.0 Attachments: HBASE-10519-trunk_v1.patch A sub-task of HBASE-10497 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10524) Correct wrong handling and add proper handling for swallowed InterruptedException thrown by Thread.sleep in regionserver
[ https://issues.apache.org/jira/browse/HBASE-10524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Liochon updated HBASE-10524: Resolution: Fixed Fix Version/s: 0.99.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk. Correct wrong handling and add proper handling for swallowed InterruptedException thrown by Thread.sleep in regionserver Key: HBASE-10524 URL: https://issues.apache.org/jira/browse/HBASE-10524 Project: HBase Issue Type: Sub-task Components: regionserver Reporter: Feng Honghua Assignee: Feng Honghua Fix For: 0.99.0 Attachments: HBASE-10524-trunk_v1.patch, HBASE-10524-trunk_v2.patch, split.patch A sub-task of HBASE-10497 # correct wrong handling of InterruptedException where Thread.currentThread.interrupt() is called within while loops # add proper handling for swallowed InterruptedException -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10524) Correct wrong handling and add proper handling for swallowed InterruptedException thrown by Thread.sleep in regionserver
[ https://issues.apache.org/jira/browse/HBASE-10524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Liochon updated HBASE-10524: Affects Version/s: 0.99.0 0.98.0 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10497) Correct wrong handling and add proper handling for swallowed InterruptedException thrown by Thread.sleep under HBase-Client/HBase-Server folders systematically
[ https://issues.apache.org/jira/browse/HBASE-10497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907062#comment-13907062 ] Nicolas Liochon commented on HBASE-10497: - They are all committed to trunk now. Thanks for your time! Correct wrong handling and add proper handling for swallowed InterruptedException thrown by Thread.sleep under HBase-Client/HBase-Server folders systematically --- Key: HBASE-10497 URL: https://issues.apache.org/jira/browse/HBASE-10497 Project: HBase Issue Type: Bug Components: Client, regionserver Reporter: Feng Honghua Assignee: Feng Honghua Attachments: HBASE-10497-trunk_v1.patch, HBASE-10497-trunk_v2.patch There are two kinds of handling problems for InterruptedException thrown by Thread.sleep in many places under HBase-Client/HBase-Server folders: # Thread.currentThread.interrupt() is called within 'while' loops which can result in buggy behaviors such as expected sleep doesn't occur due to restored interrupt status by former loop # InterruptedException thrown by Thread.sleep are swallowed silently (which are neither declared in the caller method's throws clause nor rethrown immediately) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
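The two handling problems these sub-tasks fix can be sketched side by side (illustrative code, not taken from the patches):

```java
// Illustrative sketch of the two InterruptedException problems described
// in HBASE-10497; this is not code from the actual patches.
class SleepLoops {
    // Buggy pattern: restoring the interrupt status inside the loop means
    // the next Thread.sleep throws immediately, so the expected sleep never
    // happens and the loop spins through its remaining iterations.
    static void buggyRetryLoop(int retries) {
        for (int i = 0; i < retries; i++) {
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // next sleep fails at once
            }
        }
    }

    // Proper pattern: restore the interrupt status once so the caller can
    // observe it, and leave the loop instead of swallowing the interrupt.
    static void saneRetryLoop(int retries) {
        for (int i = 0; i < retries; i++) {
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return; // stop retrying
            }
        }
    }
}
```

In the buggy version an interrupted thread burns through all its retries without ever sleeping; the sane version exits promptly with the interrupt status intact.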
[jira] [Commented] (HBASE-10573) Use Netty 4
[ https://issues.apache.org/jira/browse/HBASE-10573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907083#comment-13907083 ] Nicolas Liochon commented on HBASE-10573: - bq. ClusterStatusPublisher The test is TestHCM#testClusterStatus; it seems to fail under some IDEs. Use Netty 4 --- Key: HBASE-10573 URL: https://issues.apache.org/jira/browse/HBASE-10573 Project: HBase Issue Type: Sub-task Affects Versions: hbase-10191 Reporter: Andrew Purtell Assignee: Andrew Purtell Attachments: 10573.patch Pull in Netty 4 and sort out the consequences. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10573) Use Netty 4
[ https://issues.apache.org/jira/browse/HBASE-10573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907118#comment-13907118 ] Nick Dimiduk commented on HBASE-10573: -- I'm curious about the decision to use Netty's classes here. In particular, I believe the reason [~mcorgan] introduced the ByteRange interface was because none of the existing byte buffer concepts allowed for buffer re-use. The concern being for excessive GC during compactions. I looked at Netty's ByteBuf a while back and found it didn't support instance reuse, thus was insufficient for his requirement. Did I miss something? -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10573) Use Netty 4
[ https://issues.apache.org/jira/browse/HBASE-10573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907126#comment-13907126 ] Nicolas Liochon commented on HBASE-10573: - I actually have the same question about using the Netty ByteBuf (not Netty itself). I wonder whether we won't have an issue, for example when we want to pass the buffer from the HBase socket to HDFS. The Java API is very, very bad, and does not even support extension, so I understand why Netty had to rewrite it. But I'm not sure about the interoperability then. Note that I'm not a Netty expert, especially not a Netty 4 one, so my concerns may just be off. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10525) Allow the client to use a different thread for writing to ease interrupt
[ https://issues.apache.org/jira/browse/HBASE-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907140#comment-13907140 ] Hadoop QA commented on HBASE-10525: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12630060/10525.v4.patch against trunk revision . ATTACHMENT ID: 12630060
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests.
{color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile.
{color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100.
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:
org.apache.hadoop.hbase.client.TestHCM
org.apache.hadoop.hbase.util.TestHBaseFsck
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8756//console
This message is automatically generated.
-- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Comment Edited] (HBASE-10191) Move large arena storage off heap
[ https://issues.apache.org/jira/browse/HBASE-10191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906599#comment-13906599 ] Lars Hofhansl edited comment on HBASE-10191 at 2/20/14 4:33 PM: My office neighbor used to work on a proprietary Java database, and he says they used 128GB or even 192GB Java heaps and larger all the time without any significant GC impact. (Non-moving) collection times are not a function of the heap size but rather of heap complexity, i.e. the number of objects to track (HBase also produces a lot of garbage, but that is short lived and can be quickly collected by a moving collector for the young gen). With memstoreLAB and the block cache HBase already does a good job on this. Even as is, currently, if we fill an entire 128GB of heap with 64k blocks from the blockcache that would only be about 2m objects. Now, if we want to be in the 100ms latency area we need to rethink things; that will generally be very difficult in current Java. Before we move everything out of the Java heap all-or-nothing, we should also investigate whether we can make the GC's life easier yet. Edit: Edited for clarity. was (Author: lhofhansl): This might not be a very popular viewpoint these days, but anyway. My office neighbor used to work on a proprietary Java database, and he says they used 128GB or even 192GB Java heaps and larger all the time without any significant GC impact. (Non-moving) collection times are not a function of the heap size but rather of heap complexity, i.e. the number of objects to track (HBase also produces a lot of garbage, but that is short lived and can be quickly collected by a moving collector for the young gen). With memstoreLAB and the block cache HBase already does a good job on this. Even as is, currently, if we fill an entire 128GB of heap with 64k blocks from the blockcache that would only be about 2m objects. 
Now, if we want to forage into the 100ms latency area we need to rethink things, but then Java might just not be the right choice. Before we embark on an all-or-nothing adventure and move everything out of the Java heap, we should also investigate whether we can make the GC's life easier, yet. Move large arena storage off heap - Key: HBASE-10191 URL: https://issues.apache.org/jira/browse/HBASE-10191 Project: HBase Issue Type: Umbrella Reporter: Andrew Purtell Even with the improved G1 GC in Java 7, Java processes that want to address large regions of memory while also providing low high-percentile latencies continue to be challenged. Fundamentally, a Java server process that has high data throughput and also tight latency SLAs will be stymied by the fact that the JVM does not provide a fully concurrent collector. There is simply not enough throughput to copy data during GC under safepoint (all application threads suspended) within available time bounds. This is increasingly an issue for HBase users operating under dual pressures: 1. tight response SLAs, 2. the increasing amount of RAM available in commodity server configurations, because GC load is roughly proportional to heap size. We can address this using parallel strategies. We should talk with the Java platform developer community about the possibility of a fully concurrent collector appearing in OpenJDK somehow. Set aside the question of if this is too little too late, if one becomes available the benefit will be immediate though subject to qualification for production, and transparent in terms of code changes. However in the meantime we need an answer for Java versions already in production. This requires we move the large arena allocations off heap, those being the blockcache and memstore. On other JIRAs recently there has been related discussion about combining the blockcache and memstore (HBASE-9399) and on flushing memstore into blockcache (HBASE-5311), which is related work. 
We should build off heap allocation for memstore and blockcache, perhaps a unified pool for both, and plumb through zero copy direct access to these allocations (via direct buffers) through the read and write I/O paths. This may require the construction of classes that provide object views over data contained within direct buffers. This is something else we could talk with the Java platform developer community about - it could be possible to provide language level object views over off heap memory, on heap objects could hold references to objects backed by off heap memory but not vice versa, maybe facilitated by new intrinsics in Unsafe. Again we need an answer for today also. We should investigate what existing libraries may be available in this regard. Key will be avoiding marshalling/unmarshalling costs. At most we should be copying
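Lars's block-count estimate above is easy to verify: a 128GiB heap filled entirely with 64KiB block-cache blocks holds about 2m block objects.

```java
// Back-of-the-envelope check of the heap-complexity claim above:
// 128 GiB of heap divided into 64 KiB blockcache blocks.
class HeapMath {
    static long blockCount(long heapBytes, long blockBytes) {
        return heapBytes / blockBytes;
    }
    // blockCount(128L << 30, 64L << 10) = 2_097_152, i.e. about 2m objects
    // for a non-moving collector to track.
}
```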
[jira] [Commented] (HBASE-10525) Allow the client to use a different thread for writing to ease interrupt
[ https://issues.apache.org/jira/browse/HBASE-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907151#comment-13907151 ] Nicolas Liochon commented on HBASE-10525: - Conflict on the name for TestHCM; will fix. For Fsck, it seems flaky these days. The message is: ??java.lang.AssertionError: expected: EXPIRED_TABLE_LOCK but was: UNKNOWN, NO_META_REGION, RS_CONNECT_FAILURE, RS_CONNECT_FAILURE, RS_CONNECT_FAILURE, EXPIRED_TABLE_LOCK at org.apache.hadoop.hbase.util.TestHBaseFsck.testCheckTableLocks(TestHBaseFsck.java:2072)?? -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10529) Make Cell extend Cloneable
[ https://issues.apache.org/jira/browse/HBASE-10529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907153#comment-13907153 ] ramkrishna.s.vasudevan commented on HBASE-10529: The intention here: in the memstore we currently do a deepClone. I think what's in your mind is that the memstore will still deal with KeyValue, but what I thought is that we may use Cell in the memstore also. That is why I felt we should let the clone() method be implemented in the Cell implementations, making the interface itself Cloneable. Make Cell extend Cloneable -- Key: HBASE-10529 URL: https://issues.apache.org/jira/browse/HBASE-10529 Project: HBase Issue Type: Sub-task Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 0.99.0 Attachments: HBSE-10529.patch Refer to the parent JIRA for the discussion on making Cell extend Cloneable. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
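The proposed shape can be sketched as follows (illustrative only: CloneableCell and SimpleCell are hypothetical names, not the real HBase Cell API):

```java
// Illustrative sketch of the proposal: the interface extends Cloneable and
// exposes a deep-copy method, so the memstore can clone any Cell
// implementation without knowing whether it is a KeyValue. Hypothetical
// names, not the actual HBase interfaces.
interface CloneableCell extends Cloneable {
    byte[] getValue();
    CloneableCell deepClone();
}

class SimpleCell implements CloneableCell {
    private final byte[] value;
    SimpleCell(byte[] value) { this.value = value; }

    @Override public byte[] getValue() { return value; }

    // Deep copy: the clone owns its own byte array, so the memstore's copy
    // is independent of the caller's buffer.
    @Override public SimpleCell deepClone() {
        return new SimpleCell(value.clone());
    }
}
```

The covariant return lets each implementation return its own type while callers program against the interface.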
[jira] [Updated] (HBASE-10525) Allow the client to use a different thread for writing to ease interrupt
[ https://issues.apache.org/jira/browse/HBASE-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Liochon updated HBASE-10525: Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10525) Allow the client to use a different thread for writing to ease interrupt
[ https://issues.apache.org/jira/browse/HBASE-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Liochon updated HBASE-10525: Attachment: 10525.v5.patch
[jira] [Updated] (HBASE-10525) Allow the client to use a different thread for writing to ease interrupt
[ https://issues.apache.org/jira/browse/HBASE-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Liochon updated HBASE-10525: Status: Open (was: Patch Available)
[jira] [Commented] (HBASE-10526) Using Cell instead of KeyValue in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907176#comment-13907176 ] Jimmy Xiang commented on HBASE-10526: - Agree. The class name is not perfect. It seems to be the best we can have for now. If no objection, I will commit the patch tomorrow. Thanks. Using Cell instead of KeyValue in HFileOutputFormat --- Key: HBASE-10526 URL: https://issues.apache.org/jira/browse/HBASE-10526 Project: HBase Issue Type: Sub-task Components: mapreduce Reporter: Jimmy Xiang Assignee: Jimmy Xiang Attachments: hbase-10526.patch, hbase-10526_v1.1.patch, hbase-10526_v2.patch, hbase-10526_v3.patch HFileOutputFormat/KeyValueSortReducer use KeyValue. We should deprecate them and use Cell instead. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10526) Using Cell instead of KeyValue in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907182#comment-13907182 ] Nick Dimiduk commented on HBASE-10526: -- No further objection.
[jira] [Commented] (HBASE-10355) Failover RPC's from client using region replicas
[ https://issues.apache.org/jira/browse/HBASE-10355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907197#comment-13907197 ] Nicolas Liochon commented on HBASE-10355: - The v3 takes the comments above into account. Failover RPC's from client using region replicas Key: HBASE-10355 URL: https://issues.apache.org/jira/browse/HBASE-10355 Project: HBase Issue Type: Sub-task Components: Client Reporter: Enis Soztutar Assignee: Nicolas Liochon Fix For: 0.99.0 Attachments: 10355.v1.patch, 10355.v2.patch, 10355.v3.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10355) Failover RPC's from client using region replicas
[ https://issues.apache.org/jira/browse/HBASE-10355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Liochon updated HBASE-10355: Attachment: 10355.v3.patch
[jira] [Commented] (HBASE-10521) Add handling for swallowed InterruptedException thrown by Thread.sleep in RpcServer
[ https://issues.apache.org/jira/browse/HBASE-10521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907209#comment-13907209 ] Hudson commented on HBASE-10521: FAILURE: Integrated in HBase-TRUNK #4936 (See [https://builds.apache.org/job/HBase-TRUNK/4936/]) HBASE-10521 Add handling for swallowed InterruptedException thrown by Thread.sleep in RpcServer (Feng Honghua) (nkeywal: rev 1570212) * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java Add handling for swallowed InterruptedException thrown by Thread.sleep in RpcServer --- Key: HBASE-10521 URL: https://issues.apache.org/jira/browse/HBASE-10521 Project: HBase Issue Type: Sub-task Components: IPC/RPC Affects Versions: 0.98.0, 0.99.0 Reporter: Feng Honghua Assignee: Feng Honghua Priority: Minor Fix For: 0.99.0 Attachments: HBASE-10521-trunk_v1.patch, HBASE-10521-trunk_v2.patch, HBASE-10521-trunk_v3.patch, HBASE-10521-trunk_v3.patch A sub-task of HBASE-10497 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10520) Add handling for swallowed InterruptedException thrown by Thread.sleep in MiniZooKeeperCluster
[ https://issues.apache.org/jira/browse/HBASE-10520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907210#comment-13907210 ] Hudson commented on HBASE-10520: FAILURE: Integrated in HBase-TRUNK #4936 (See [https://builds.apache.org/job/HBase-TRUNK/4936/]) HBASE-10520 Add handling for swallowed InterruptedException thrown by Thread.sleep in MiniZooKeeperCluster (Feng Honghua) (nkeywal: rev 1570211) * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/zookeeper/MiniZooKeeperCluster.java Add handling for swallowed InterruptedException thrown by Thread.sleep in MiniZooKeeperCluster -- Key: HBASE-10520 URL: https://issues.apache.org/jira/browse/HBASE-10520 Project: HBase Issue Type: Sub-task Components: Zookeeper Affects Versions: 0.98.0, 0.99.0 Reporter: Feng Honghua Assignee: Feng Honghua Priority: Minor Fix For: 0.99.0 Attachments: HBASE-10520-trunk_v1.patch, HBASE-10520-trunk_v2.patch, HBASE-10520-trunk_v3.patch A sub-task of HBASE-10497 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HBASE-10577) Remove unnecessary looping in FSHLog
Himanshu Vashishtha created HBASE-10577: --- Summary: Remove unnecessary looping in FSHLog Key: HBASE-10577 URL: https://issues.apache.org/jira/browse/HBASE-10577 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.99.0 Reporter: Himanshu Vashishtha In the new disruptor-based FSHLog, the Syncer threads are handed a batch of SyncFuture objects from the RingBufferHandler. The Syncer then invokes a sync call on the current writer instance. This handing off of batches is done serially in the RingBufferHandler; that is, every Syncer receives a non-overlapping batch of SyncFutures. Once synced, the Syncer thread updates highestSyncedSequence. In the run method of Syncer, we have:
{code}
long currentHighestSyncedSequence = highestSyncedSequence.get();
if (currentSequence < currentHighestSyncedSequence) {
  syncCount += releaseSyncFuture(takeSyncFuture, currentHighestSyncedSequence, null);
  // Done with the 'take'. Go around again and do a new 'take'.
  continue;
}
{code}
I find this logic of polling the BlockingQueue again in this condition unnecessary. When currentHighestSyncedSequence is already greater than currentSequence, doesn't that mean some other Syncer has already synced the SyncFutures of these ops? We should just go ahead and release all the SyncFutures for this batch to unblock the handlers. That would avoid polling the BlockingQueue for every SyncFuture object in this case. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
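The proposed short-circuit, releasing the whole batch once highestSyncedSequence has passed it instead of re-polling once per SyncFuture, can be sketched like this; the names are illustrative and the structures are simplified stand-ins for the FSHLog internals:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Simplified sketch (not the real FSHLog code) of the proposal: everything at
// or below highestSyncedSequence was already synced by some syncer, so the
// whole batch is drained and released in one pass, with no further blocking
// takes from the queue.
public class SyncerSketch {
  static final class SyncFuture {
    final long sequence;
    boolean released;
    SyncFuture(long sequence) { this.sequence = sequence; }
  }

  static int releaseBatch(BlockingQueue<SyncFuture> queue, AtomicLong highestSynced) {
    int released = 0;
    SyncFuture f;
    // Release every future the sync already covered, in one pass.
    while ((f = queue.peek()) != null && f.sequence <= highestSynced.get()) {
      queue.poll();
      f.released = true;  // a real release would unblock the waiting handler
      released++;
    }
    return released;
  }

  static int demo() {
    BlockingQueue<SyncFuture> q = new LinkedBlockingQueue<>();
    q.add(new SyncFuture(1));
    q.add(new SyncFuture(2));
    q.add(new SyncFuture(5));
    // highestSyncedSequence = 3: sequences 1 and 2 are already durable
    return releaseBatch(q, new AtomicLong(3));
  }
}
```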
[jira] [Commented] (HBASE-10577) Remove unnecessary looping in FSHLog
[ https://issues.apache.org/jira/browse/HBASE-10577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907212#comment-13907212 ] Himanshu Vashishtha commented on HBASE-10577: - What do you think [~stack] ? Thanks.
[jira] [Commented] (HBASE-10573) Use Netty 4
[ https://issues.apache.org/jira/browse/HBASE-10573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907232#comment-13907232 ] Andrew Purtell commented on HBASE-10573:
bq. I'm curious about the decision to use Netty's classes here
Call it an investigation.
- We use Netty already
- Composite buffers
- Arena allocation
- Dynamic buffer resizing
- Reference counting
- Dev and testing by another community
bq. I looked at Netty's ByteBuf a while back and found it didn't support instance reuse, thus was insufficient for his requirement. Did I miss something?
If that is the most important consideration above all else, outweighing all positives, then you did not miss something. What specifically would you suggest in the alternative?
Use Netty 4 --- Key: HBASE-10573 URL: https://issues.apache.org/jira/browse/HBASE-10573 Project: HBase Issue Type: Sub-task Affects Versions: hbase-10191 Reporter: Andrew Purtell Assignee: Andrew Purtell Attachments: 10573.patch Pull in Netty 4 and sort out the consequences. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
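Of the features listed, reference counting is the one that changes buffer lifecycle most visibly. A toy illustration of the semantics (this is not Netty's API, though Netty 4's ByteBuf exposes retain()/release() with similar behaviour): the buffer returns to its pool only when the last holder releases it, so it can be shared across pipeline stages without copies.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Minimal reference-counted buffer: created with a count of one, retained by
// each additional holder, and recycled only when the count drops to zero.
public class RefCountedBuffer {
  private final AtomicInteger refCnt = new AtomicInteger(1);
  private boolean recycled;

  public RefCountedBuffer retain() {
    refCnt.incrementAndGet();
    return this;
  }

  /** Returns true when this release recycled the buffer (count hit zero). */
  public boolean release() {
    if (refCnt.decrementAndGet() == 0) {
      recycled = true;  // a real allocator would hand the memory back to its arena
      return true;
    }
    return false;
  }

  static boolean demo() {
    RefCountedBuffer b = new RefCountedBuffer().retain(); // two holders now
    boolean first = b.release();   // 2 -> 1: still live
    boolean second = b.release();  // 1 -> 0: recycled
    return !first && second && b.recycled;
  }
}
```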
[jira] [Commented] (HBASE-10525) Allow the client to use a different thread for writing to ease interrupt
[ https://issues.apache.org/jira/browse/HBASE-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907245#comment-13907245 ] Nicolas Liochon commented on HBASE-10525: - And the issue with TestHBaseFsck comes from the EnvironmentEdgeManager: the test changes the time, which impacts the RpcClient.
[jira] [Commented] (HBASE-10573) Use Netty 4
[ https://issues.apache.org/jira/browse/HBASE-10573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907250#comment-13907250 ] Nick Dimiduk commented on HBASE-10573: -- bq. What specifically would you suggest in the alternative? I don't have anything else at this time, except perhaps ByteBuffers + some reflection voodoo. This has the benefit of sticking with the JVM native APIs, but has the downside of reflection voodoo. I wrote up a little benchmark a while back, comparing allocating new DirectByteBuffers vs. reusing a single instance and re-assigning it with reflection. Reflection was slower than a single allocation, but it didn't account for the collection afterwards. I also don't think the synthetic microbenchmark is indicative of use in the real system. In any case, even if dbb + reflection proves viable, we don't get the many other benefits you itemize above. DirectByteBuffers + reflection does open up the possibility of using Unsafe directly to manage memory, which may be desirable. As you say, more investigation is necessary :)
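The reuse half of that comparison, minus the reflection, looks roughly like this: one direct buffer is cleared and refilled each round instead of allocating a fresh one per round. This is a sketch for illustration only, not the benchmark from the gist; the reflection trick generalizes the same reuse to arbitrary memory addresses.

```java
import java.nio.ByteBuffer;

// Reuse a single DirectByteBuffer across rounds: clear() resets only
// position/limit, so no new off-heap allocation (and no later collection of
// the buffer object) happens inside the loop.
public class DirectReuse {
  static long fillReusing(int rounds, int size) {
    ByteBuffer buf = ByteBuffer.allocateDirect(size);  // one allocation, up front
    long total = 0;
    for (int r = 0; r < rounds; r++) {
      buf.clear();                                 // reuse the same memory
      while (buf.hasRemaining()) buf.put((byte) 1);
      buf.flip();
      while (buf.hasRemaining()) total += buf.get();
    }
    return total;
  }
}
```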
[jira] [Assigned] (HBASE-10476) HBase Master log grows very fast after stopped hadoop (due to connection exception)
[ https://issues.apache.org/jira/browse/HBASE-10476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Demai Ni reassigned HBASE-10476: Assignee: Demai Ni HBase Master log grows very fast after stopped hadoop (due to connection exception) --- Key: HBASE-10476 URL: https://issues.apache.org/jira/browse/HBASE-10476 Project: HBase Issue Type: Bug Reporter: Demai Ni Assignee: Demai Ni HBase 0.96.0 (probably the same issue on 0.94.x) on a single-node cluster. At some point we stopped Hadoop but kept HBase running. As expected, HBase began to throw connection errors. For the first hour, the regionserver log grew by ~10MB and the master log didn't grow much, which is ok.
{code:title=log size after one hour}
-rw-rw-r-- 1 biadmin biadmin 497959 Feb 5 10:36 hbase-biadmin-master-hdtest014.svl.ibm.com.log
...
-rw-rw-r-- 1 biadmin biadmin 8865371 Feb 5 10:37 hbase-biadmin-regionserver-hdtest014.svl.ibm.com.log
{code}
However, within 4 hours the master log grew to 13GB, and it only stopped because the disk ran out of space.
{code:title=log size after 4 hours}
-rw-rw-r-- 1 biadmin biadmin 3521880064 Feb 5 14:10 hbase-biadmin-master-hdtest014.svl.ibm.com.log
-rw-rw-r-- 1 biadmin biadmin 10737418582 Feb 5 11:25 hbase-biadmin-master-hdtest014.svl.ibm.com.log.1
...
-rw-rw-r-- 1 biadmin biadmin 11222365 Feb 5 10:49 hbase-biadmin-regionserver-hdtest014.svl.ibm.com.log
{code}
The error message filling up the master log is:
{code:title=Error message filling up Master log}
2014-02-05 11:37:48,688 INFO org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler: Splitting hbase:meta logs for hdtest014.svl.ibm.com,60020,1391622549030
2014-02-05 11:37:48,689 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event M_META_SERVER_SHUTDOWN
java.io.IOException: failed log splitting for hdtest014.svl.ibm.com,60020,1391622549030, will retry
 at org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler.process(MetaServerShutdownHandler.java:70)
 at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:906)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:929)
 at java.lang.Thread.run(Thread.java:738)
Caused by: java.net.ConnectException: Call From hdtest014.svl.ibm.com/9.30.194.23 to hdtest014.svl.ibm.com:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
 at sun.reflect.GeneratedConstructorAccessor5.newInstance(Unknown Source)
 at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:39)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:527)
 at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
 at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
 at org.apache.hadoop.ipc.Client.call(Client.java:1351)
 at org.apache.hadoop.ipc.Client.call(Client.java:1300)
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
 at com.sun.proxy.$Proxy8.getFileInfo(Unknown Source)
 at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
 at java.lang.reflect.Method.invoke(Method.java:611)
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
 at com.sun.proxy.$Proxy8.getFileInfo(Unknown Source)
 at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:651)
 at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
 at java.lang.reflect.Method.invoke(Method.java:611)
 at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:266)
 at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
 at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1679)
 at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1106)
 at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1102)
 at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
 at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1102)
 at
[jira] [Commented] (HBASE-10573) Use Netty 4
[ https://issues.apache.org/jira/browse/HBASE-10573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907272#comment-13907272 ] Andrew Purtell commented on HBASE-10573:
bq. perhaps except for ByteBuffers + some reflection voodoo
Voodoo is kind of nonspecific as far as implementation strategy descriptions go. :-) I guess we need to reuse such a wrapper object if we must take a reflection-based instantiation hit each time.
bq. DirectByteBuffers + reflection does open up the possibility of using the unsafe directly to manage memory, which may be desirable.
Isn't this undesirable? Unsafe is a vendor-specific extension. Even so, Oracle recently ran a public survey asking what uses of Unsafe are common and what would happen if it went away. We do use Unsafe and have some exposure to this, but we do have an un-Unsafe fallback in those places.
bq. This has the benefit of sticking with the JVM native APIs,
Don't the native ByteBuffer method calls tend not to be inlined? I have heard the complaint but have not personally examined JIT disassembly (yet). Aren't there boundary checks and index compensations sprinkled throughout? (which Netty does away with in the simple ByteBuf types)
bq. As you say, more investigation is necessary
Great, let's proceed. At the moment this issue is about what happens if you even try to bring in 4. On that, N pointed me to TestHCM#testClusterStatus, which tests the multicast status publisher he implemented with Netty 3 channels. My port of that to Netty 4 APIs fails if I remove the @Ignore decoration, so I don't have it right yet.
[jira] [Commented] (HBASE-10573) Use Netty 4
[ https://issues.apache.org/jira/browse/HBASE-10573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907284#comment-13907284 ] Andrew Purtell commented on HBASE-10573: bq. I wonder if we won't have an issue, for example when we will want to pass the buffer from the hbase socket to hdfs. Yes [~nkeywal], I worry about this also.
[jira] [Commented] (HBASE-10525) Allow the client to use a different thread for writing to ease interrupt
[ https://issues.apache.org/jira/browse/HBASE-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907286#comment-13907286 ] Hadoop QA commented on HBASE-10525: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12630090/10525.v5.patch against trunk revision . ATTACHMENT ID: 12630090 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.util.TestHBaseFsck Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8757//console This message is automatically generated. 
[jira] [Commented] (HBASE-10529) Make Cell extend Cloneable
[ https://issues.apache.org/jira/browse/HBASE-10529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907281#comment-13907281 ] Matt Corgan commented on HBASE-10529: - Sorry, I'm not familiar enough with the code - I'm not saying for certain that it shouldn't be done. Ideally, cells would be moved around not by cloning them individually, but by passing them from an InputStream to an OutputStream. The implementation of the output stream can read each field of the cell passed to it, but ideally it won't need to fully materialize a deep copy.
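The streaming alternative Matt describes could look something like this; the interface and method names are hypothetical, not HBase's actual Cell API. The sink consumes each field straight off the cell's backing arrays, so no deep copy is ever materialized:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical sketch: a cell exposes (array, offset, length) views of its
// fields, and the writer streams those slices to the sink directly instead
// of cloning the cell.
public class CellStreaming {
  interface StreamableCell {
    byte[] rowArray(); int rowOffset(); int rowLength();
    byte[] valueArray(); int valueOffset(); int valueLength();
  }

  /** Streams the cell's fields to the sink; returns bytes written. */
  static int writeTo(StreamableCell cell, OutputStream out) throws IOException {
    out.write(cell.rowArray(), cell.rowOffset(), cell.rowLength());
    out.write(cell.valueArray(), cell.valueOffset(), cell.valueLength());
    return cell.rowLength() + cell.valueLength();
  }

  static int demo() {
    final byte[] backing = "rowval".getBytes(); // one shared backing array
    StreamableCell cell = new StreamableCell() {
      public byte[] rowArray() { return backing; }
      public int rowOffset() { return 0; }
      public int rowLength() { return 3; }
      public byte[] valueArray() { return backing; }
      public int valueOffset() { return 3; }
      public int valueLength() { return 3; }
    };
    try {
      return writeTo(cell, new ByteArrayOutputStream());
    } catch (IOException e) {
      return -1;
    }
  }
}
```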
[jira] [Commented] (HBASE-10191) Move large arena storage off heap
[ https://issues.apache.org/jira/browse/HBASE-10191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907299#comment-13907299 ] Vladimir Rodionov commented on HBASE-10191: --- {quote} We also want to consider addressing up 1 TB of usable memory without loading up cores with redundant work / multiple processes. {quote} 6TB of RAM. http://www.supermicro.nl/newsroom/pressreleases/2014/press140218_4U_4-Way.cfm {quote} Collection times are not a function of the heap size but rather of heap complexity, i.e. the number of objects to track {quote} Heap compaction is a function of the heap size (at least in CMS). Move large arena storage off heap - Key: HBASE-10191 URL: https://issues.apache.org/jira/browse/HBASE-10191 Project: HBase Issue Type: Umbrella Reporter: Andrew Purtell Even with the improved G1 GC in Java 7, Java processes that want to address large regions of memory while also providing low high-percentile latencies continue to be challenged. Fundamentally, a Java server process that has high data throughput and also tight latency SLAs will be stymied by the fact that the JVM does not provide a fully concurrent collector. There is simply not enough throughput to copy data during GC under safepoint (all application threads suspended) within available time bounds. This is increasingly an issue for HBase users operating under dual pressures: 1. tight response SLAs, 2. the increasing amount of RAM available in commodity server configurations, because GC load is roughly proportional to heap size. We can address this using parallel strategies. We should talk with the Java platform developer community about the possibility of a fully concurrent collector appearing in OpenJDK somehow. Set aside the question of whether this is too little, too late; if one becomes available the benefit will be immediate though subject to qualification for production, and transparent in terms of code changes.
However in the meantime we need an answer for Java versions already in production. This requires we move the large arena allocations off heap, those being the blockcache and memstore. On other JIRAs recently there has been related discussion about combining the blockcache and memstore (HBASE-9399) and on flushing memstore into blockcache (HBASE-5311), which is related work. We should build off heap allocation for memstore and blockcache, perhaps a unified pool for both, and plumb through zero copy direct access to these allocations (via direct buffers) through the read and write I/O paths. This may require the construction of classes that provide object views over data contained within direct buffers. This is something else we could talk with the Java platform developer community about - it could be possible to provide language level object views over off heap memory, on heap objects could hold references to objects backed by off heap memory but not vice versa, maybe facilitated by new intrinsics in Unsafe. Again we need an answer for today also. We should investigate what existing libraries may be available in this regard. Key will be avoiding marshalling/unmarshalling costs. At most we should be copying primitives out of the direct buffers to register or stack locations until finally copying data to construct protobuf Messages. A related issue there is HBASE-9794, which proposes scatter-gather access to KeyValues when constructing RPC messages. We should see how far we can get with that and also zero copy construction of protobuf Messages backed by direct buffer allocations. Some amount of native code may be required. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
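A minimal sketch of the arena idea from the description (illustrative only, assuming fixed-size slices): one large direct allocation off the Java heap is carved into zero-copy views, so the collector tracks a single object instead of millions of individual cells.

```java
import java.nio.ByteBuffer;

// One big off-heap arena carved into fixed-size slices. Each slice shares the
// arena's memory (duplicate()/slice() copy no data), which is the zero-copy
// direct access the description asks for.
public class OffHeapArena {
  private final ByteBuffer arena;
  private final int sliceSize;
  private int next;

  OffHeapArena(int totalSize, int sliceSize) {
    this.arena = ByteBuffer.allocateDirect(totalSize); // single GC-visible object
    this.sliceSize = sliceSize;
  }

  /** Hands out a zero-copy view over the next free slice, or null if full. */
  ByteBuffer allocate() {
    if (next + sliceSize > arena.capacity()) return null;
    ByteBuffer slice = arena.duplicate(); // shares the off-heap memory
    slice.position(next);
    slice.limit(next + sliceSize);
    next += sliceSize;
    return slice.slice();
  }

  static int demo() {
    OffHeapArena a = new OffHeapArena(64, 16);
    int handedOut = 0;
    while (a.allocate() != null) handedOut++;
    return handedOut; // 64 / 16 slices
  }
}
```

A real allocator would also need to free and coalesce slices; this bump-pointer version only illustrates the zero-copy carving.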
[jira] [Commented] (HBASE-10573) Use Netty 4
[ https://issues.apache.org/jira/browse/HBASE-10573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907298#comment-13907298 ] Nick Dimiduk commented on HBASE-10573: -- bq. Voodoo is kind of nonspecific as far as implementation strategy descriptions go. Agreed. I'm happy to share my experiment (in a more appropriate place than this Netty ticket, perhaps), but here's a [gist|https://gist.github.com/ndimiduk/9120342]. bq. Isn't this undesirable? Unsafe is a vendor specific extension. Yes, in the long term. My understanding is that the OpenJDK is moving to make some of these facilities more accessible. The timeline appears to be Java 9 or later, as far as I can tell. I think it would be reasonable to consider use of Unsafe for any short- to medium-term implementation. bq. Don't the native ByteBuffer method calls tend not to be inlined? Unknown. bq. Great, let's proceed. Yes, please proceed! I'll discontinue my conjecture and instead have a look at your work ;)
[jira] [Commented] (HBASE-6642) enable_all,disable_all,drop_all can call list command with regex directly.
[ https://issues.apache.org/jira/browse/HBASE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907322#comment-13907322 ] Andrew Purtell commented on HBASE-6642: --- +1 for 0.98. Please update this JIRA with a release note. I was in this part of the code when working on HBASE-9182. I would not claim our shell stuff in this area is the best it can be. enable_all,disable_all,drop_all can call list command with regex directly. Key: HBASE-6642 URL: https://issues.apache.org/jira/browse/HBASE-6642 Project: HBase Issue Type: Bug Components: shell Affects Versions: 0.94.6, 0.95.0 Reporter: Y. SREENIVASULU REDDY Assignee: Matteo Bertozzi Fix For: 0.99.0 Attachments: HBASE-6642-v1.patch, HBASE-6642_trunk.patch created few tables. then performing disable_all operation in shell prompt. but it is not performing operation successfully. {noformat} hbase(main):043:0 disable_all '*' table12 zk0113 zk0114 Disable the above 3 tables (y/n)? y/ 3 tables successfully disabled just it is showing the message but operation is not success. but the following way only performing successfully hbase(main):043:0 disable_all '*.*' table12 zk0113 zk0114 Disable the above 3 tables (y/n)? y 3 tables successfully disabled {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10351) LoadBalancer changes for supporting region replicas
[ https://issues.apache.org/jira/browse/HBASE-10351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907334#comment-13907334 ] Sergey Shelukhin commented on HBASE-10351: -- I am reviewing, btw LoadBalancer changes for supporting region replicas --- Key: HBASE-10351 URL: https://issues.apache.org/jira/browse/HBASE-10351 Project: HBase Issue Type: Sub-task Components: master Affects Versions: 0.99.0 Reporter: Enis Soztutar Assignee: Enis Soztutar Attachments: hbase-10351_v0.patch, hbase-10351_v1.patch, hbase-10351_v3.patch LoadBalancer has to be aware of and enforce placement of region replicas so that the replicas are not co-hosted in the same server, host or rack. This will ensure that the region is highly available during process / host / rack failover. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10351) LoadBalancer changes for supporting region replicas
[ https://issues.apache.org/jira/browse/HBASE-10351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907343#comment-13907343 ] Sergey Shelukhin commented on HBASE-10351: -- Left some comments yesterday and today... I didn't review tests very carefully. The rest seems ok (except where noted) although the array manipulations are rather complex and meaning is not always clear, so I might have missed something. In general more comments would help :) LoadBalancer changes for supporting region replicas --- Key: HBASE-10351 URL: https://issues.apache.org/jira/browse/HBASE-10351 Project: HBase Issue Type: Sub-task Components: master Affects Versions: 0.99.0 Reporter: Enis Soztutar Assignee: Enis Soztutar Attachments: hbase-10351_v0.patch, hbase-10351_v1.patch, hbase-10351_v3.patch LoadBalancer has to be aware of and enforce placement of region replicas so that the replicas are not co-hosted in the same server, host or rack. This will ensure that the region is highly available during process / host / rack failover. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10432) Rpc retries non-recoverable error
[ https://issues.apache.org/jira/browse/HBASE-10432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-10432: - Attachment: HBASE-10432.02.patch Rebased to trunk. This patch corrects RpcRetryingCaller#translateException to wrap an Error in a DoNotRetryIOException, so perhaps the case in AsyncProcess#sendMultiAction is not relevant. [~nkeywal] any thoughts on a test for this? Rpc retries non-recoverable error - Key: HBASE-10432 URL: https://issues.apache.org/jira/browse/HBASE-10432 Project: HBase Issue Type: Bug Components: IPC/RPC Affects Versions: 0.98.0, 0.96.2, 0.99.0 Reporter: Nick Dimiduk Assignee: Nick Dimiduk Priority: Minor Attachments: HBASE-10432.00.patch, HBASE-10432.01.patch, HBASE-10432.02.patch, exception.txt I'm recently working with hbase/trunk + hive/trunk. I had a hive command eventually timeout with the following exception (stacktrace truncated). {noformat} Caused by: java.io.IOException: Could not set up IO Streams at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:922) at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1536) at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1425) at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1654) at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1712) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:28857) at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:302) at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:157) at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:57) at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:120) ... 
43 more Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.net.NetUtils.getInputStream(Ljava/net/Socket;)Lorg/apache/hadoop/net/SocketInputWrapper; at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:861) ... 52 more {noformat} The root cause looks like a dependency version mismatch (Hive compiled vs hadoop1, HBase vs hadoop2). However, we still retry this exception, even though it'll never actually complete. We should be more careful where we blindly catch Throwables. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
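The approach the patch describes — RpcRetryingCaller#translateException wrapping an Error in a DoNotRetryIOException — follows a pattern that can be sketched as below. The class and the nested stand-in exception are simplified for illustration, not HBase's actual code: a linkage failure like NoSuchMethodError is a java.lang.Error, so retrying can never succeed and the caller should fail fast instead.

```java
import java.io.IOException;

// Illustrative sketch of translating a non-recoverable Error into a
// "do not retry" exception so the retry loop gives up immediately.
public class ExceptionTranslator {

    // Stand-in for org.apache.hadoop.hbase.DoNotRetryIOException.
    public static class DoNotRetryIOException extends IOException {
        public DoNotRetryIOException(Throwable cause) { super(cause); }
    }

    public static Throwable translateException(Throwable t) {
        if (t instanceof Error) {
            // e.g. NoSuchMethodError from a hadoop1/hadoop2 jar mismatch
            return new DoNotRetryIOException(t);
        }
        return t;  // other throwables keep their normal retry semantics
    }

    public static void main(String[] args) {
        Throwable out = translateException(
                new NoSuchMethodError("NetUtils.getInputStream"));
        System.out.println(out.getClass().getSimpleName());
    }
}
```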
[jira] [Updated] (HBASE-10504) Define Replication Interface
[ https://issues.apache.org/jira/browse/HBASE-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-10504: -- Priority: Blocker (was: Major) Making blocker so we deal w/ it now. Define Replication Interface Key: HBASE-10504 URL: https://issues.apache.org/jira/browse/HBASE-10504 Project: HBase Issue Type: Task Reporter: stack Assignee: stack Priority: Blocker Fix For: 0.99.0 HBase has replication. Fellas have been hijacking the replication apis to do all kinds of perverse stuff like indexing hbase content (hbase-indexer https://github.com/NGDATA/hbase-indexer) and our [~toffer] just showed up w/ overrides that replicate via an alternate channel (over a secure thrift channel between dcs over on HBASE-9360). This issue is about surfacing these APIs as public with guarantees to downstreamers similar to those we have on our public client-facing APIs (and so we don't break them for downstreamers). Any input [~phunt] or [~gabriel.reid] or [~toffer]? Thanks. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10524) Correct wrong handling and add proper handling for swallowed InterruptedException thrown by Thread.sleep in regionserver
[ https://issues.apache.org/jira/browse/HBASE-10524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907402#comment-13907402 ] Hudson commented on HBASE-10524: SUCCESS: Integrated in HBase-TRUNK #4937 (See [https://builds.apache.org/job/HBase-TRUNK/4937/]) HBASE-10524 Correct wrong handling and add proper handling for swallowed InterruptedException thrown by Thread.sleep in regionserver (Feng Honghua) (nkeywal: rev 1570219) * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/SplitLogWorker.java Correct wrong handling and add proper handling for swallowed InterruptedException thrown by Thread.sleep in regionserver Key: HBASE-10524 URL: https://issues.apache.org/jira/browse/HBASE-10524 Project: HBase Issue Type: Sub-task Components: regionserver Affects Versions: 0.98.0, 0.99.0 Reporter: Feng Honghua Assignee: Feng Honghua Fix For: 0.99.0 Attachments: HBASE-10524-trunk_v1.patch, HBASE-10524-trunk_v2.patch, split.patch A sub-task of HBASE-10497 # correct wrong handling of InterruptedException where Thread.currentThread.interrupt() is called within while loops # add proper handling for swallowed InterruptedException -- This message was sent by Atlassian JIRA (v6.1.5#6160)
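The "wrong handling within while loops" that this family of patches corrects can be shown with a small sketch (names are illustrative, not the patched HRegionServer/SplitLogWorker code). Calling Thread.currentThread().interrupt() inside a sleep loop makes every subsequent Thread.sleep throw immediately, turning the wait into a busy spin; the proper handling is to exit the loop and restore the interrupt status once, on the way out.

```java
// Correct handling: stop waiting when interrupted, then re-assert the
// interrupt flag for callers after leaving the loop.
public class SleepLoop {

    public static int sleepInLoop(int iterations, long millis) {
        int completed = 0;
        boolean interrupted = false;
        for (int i = 0; i < iterations; i++) {
            try {
                Thread.sleep(millis);
                completed++;
            } catch (InterruptedException e) {
                interrupted = true;
                break;  // don't keep looping with the interrupt pending
            }
        }
        if (interrupted) {
            Thread.currentThread().interrupt();  // restore for callers
        }
        return completed;
    }

    public static void main(String[] args) {
        Thread.currentThread().interrupt();      // simulate an interrupt request
        int done = sleepInLoop(5, 10);           // first sleep throws immediately
        System.out.println("completed=" + done);
    }
}
```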
[jira] [Commented] (HBASE-10519) Add handling for swallowed InterruptedException thrown by Thread.sleep in rest related files
[ https://issues.apache.org/jira/browse/HBASE-10519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907401#comment-13907401 ] Hudson commented on HBASE-10519: SUCCESS: Integrated in HBase-TRUNK #4937 (See [https://builds.apache.org/job/HBase-TRUNK/4937/]) HBASE-10519 Add handling for swallowed InterruptedException thrown by Thread.sleep in rest related files (Feng Honghua) (nkeywal: rev 1570217) * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/rest/client/RemoteAdmin.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/rest/client/RemoteHTable.java Add handling for swallowed InterruptedException thrown by Thread.sleep in rest related files Key: HBASE-10519 URL: https://issues.apache.org/jira/browse/HBASE-10519 Project: HBase Issue Type: Sub-task Components: REST Affects Versions: 0.98.0, 0.99.0 Reporter: Feng Honghua Assignee: Feng Honghua Priority: Minor Fix For: 0.99.0 Attachments: HBASE-10519-trunk_v1.patch A sub-task of HBASE-10497 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10522) Correct wrong handling and add proper handling for swallowed InterruptedException thrown by Thread.sleep in client
[ https://issues.apache.org/jira/browse/HBASE-10522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907400#comment-13907400 ] Hudson commented on HBASE-10522: SUCCESS: Integrated in HBase-TRUNK #4937 (See [https://builds.apache.org/job/HBase-TRUNK/4937/]) HBASE-10522 Correct wrong handling and add proper handling for swallowed InterruptedException thrown by Thread.sleep in client (Feng Honghua) (nkeywal: rev 1570216) * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HTableMultiplexer.java Correct wrong handling and add proper handling for swallowed InterruptedException thrown by Thread.sleep in client -- Key: HBASE-10522 URL: https://issues.apache.org/jira/browse/HBASE-10522 Project: HBase Issue Type: Sub-task Components: Client Affects Versions: 0.98.0, 0.99.0 Reporter: Feng Honghua Assignee: Feng Honghua Fix For: 0.99.0 Attachments: HBASE-10522-trunk_v1.patch, HBASE-10522-trunk_v2.patch, HBASE-10522-trunk_v3.patch A sub-task of HBASE-10497 # rethrow rather than ignore InterruptedException thrown in deleteTable, this behavior is to align with other similar methods such as createTable/enableTable/disableTable # correct some wrong handling of InterruptedException where Thread.currentThread.interrupt() is called within while loops -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-6642) enable_all,disable_all,drop_all can call list command with regex directly.
[ https://issues.apache.org/jira/browse/HBASE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-6642: --- Fix Version/s: 0.98.1 Release Note: Starting with 0.98.1, every hbase shell command that supports regex filtering uses the java Pattern class, which does not support glob patterns. http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html enable_all,disable_all,drop_all can call list command with regex directly. Key: HBASE-6642 URL: https://issues.apache.org/jira/browse/HBASE-6642 Project: HBase Issue Type: Bug Components: shell Affects Versions: 0.94.6, 0.95.0 Reporter: Y. SREENIVASULU REDDY Assignee: Matteo Bertozzi Fix For: 0.98.1, 0.99.0 Attachments: HBASE-6642-v1.patch, HBASE-6642_trunk.patch created few tables. then performing disable_all operation in shell prompt. but it is not performing operation successfully. {noformat} hbase(main):043:0 disable_all '*' table12 zk0113 zk0114 Disable the above 3 tables (y/n)? y/ 3 tables successfully disabled just it is showing the message but operation is not success. but the following way only performing successfully hbase(main):043:0 disable_all '*.*' table12 zk0113 zk0114 Disable the above 3 tables (y/n)? y 3 tables successfully disabled {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
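The glob-vs-regex distinction behind this bug can be seen directly with java.util.regex.Pattern (the helper class below is just an illustration): a bare glob '*' is not valid regex syntax at all, while the regex '.*' matches every table name.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Shows why disable_all '*' fails while disable_all '.*' works once the
// shell hands the argument straight to java.util.regex.Pattern.
public class TableNameMatching {

    public static boolean matchesAll(String expr, String tableName) {
        return Pattern.matches(expr, tableName);
    }

    public static boolean isValidRegex(String expr) {
        try {
            Pattern.compile(expr);
            return true;
        } catch (PatternSyntaxException e) {
            return false;  // e.g. a bare '*' has nothing to repeat
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidRegex("*"));            // glob, not regex
        System.out.println(matchesAll(".*", "table12"));
        System.out.println(matchesAll(".*", "zk0113"));
    }
}
```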
[jira] [Commented] (HBASE-10527) TestTokenAuthentication fails with the IBM JDK
[ https://issues.apache.org/jira/browse/HBASE-10527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907418#comment-13907418 ] Gary Helmling commented on HBASE-10527: --- So, from what I can tell, this appears to be due to some different behavior in ByteArrayOutputStream between IBM JDK7 and others, which is ultimately exposed through a bug in the Hadoop {{WritableUtils}} class. In {{WritableUtils}}, we have... {code} public static byte[] toByteArray(Writable... writables) { final DataOutputBuffer out = new DataOutputBuffer(); try { for(Writable w : writables) { w.write(out); } out.close(); } catch (IOException e) { throw new RuntimeException("Fail to convert writables to a byte array", e); } return out.getData(); } {code} The problem is in {{return out.getData();}}, which returns the full current byte[] for output, not just the part that's been written. {{TokenIdentifier.getBytes()}} does similar processing, but handles this correctly... {code} /** * Get the bytes for the token identifier * @return the bytes of the identifier */ public byte[] getBytes() { DataOutputBuffer buf = new DataOutputBuffer(4096); try { this.write(buf); } catch (IOException ie) { throw new RuntimeException("i/o error in getBytes", ie); } return Arrays.copyOf(buf.getData(), buf.getLength()); } {code} ... returning a copy of just the array portion that has been written. Now, why does this work in JDKs other than IBM JDK7? The Oracle JDK implementation of BAOS zero initializes the underlying byte[]. However, IBM JDK does not seem to. This seems to matter in computing the SHA1 for the token identifier password, where the trailing {{\x00}} bytes do not change the SHA1 in the Oracle JDK, but the additional uninitialized junk does change the resulting SHA1 with IBM JDK. The solution is simply to use {{TokenIdentifier.getBytes()}} in {{AuthenticationTokenSecretManager}}, where we had been using {{WritableUtils.toByteArray()}}. 
This returns a correctly truncated byte array and matches the value returned by {{Token.getIdentifier()}}. TestTokenAuthentication fails with the IBM JDK -- Key: HBASE-10527 URL: https://issues.apache.org/jira/browse/HBASE-10527 Project: HBase Issue Type: Bug Affects Versions: 0.98.0 Environment: IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 20131114_175264 (JIT enabled, AOT enabled) Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Attachments: org.apache.hadoop.hbase.security.token.TestTokenAuthentication-output.txt.gz DIGEST-MD5: digest response format violation. Mismatched response. The failure trace: {noformat} 2014-02-13 15:41:00,449 WARN [RpcServer.reader=1,port=54751] ipc.RpcServer$Listener(794): RpcServer.listener,port=54751: count of bytes read: 0 javax.security.sasl.SaslException: DIGEST-MD5: digest response format violation. Mismatched response. at com.ibm.security.sasl.digest.DigestMD5Server.validateClientResponse(DigestMD5Server.java:614) at com.ibm.security.sasl.digest.DigestMD5Server.evaluateResponse(DigestMD5Server.java:234) at org.apache.hadoop.hbase.ipc.RpcServer$Connection.saslReadAndProcess(RpcServer.java:1315) at org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1501) at org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:790) at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:581) at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:556) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1170) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:640) at java.lang.Thread.run(Thread.java:853) {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
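The mechanism of the bug can be demonstrated without Hadoop classes. The sketch below simulates DataOutputBuffer's backing array under stated assumptions (the filler byte stands in for whatever the JDK leaves in unzeroed memory): hashing the full backing array makes the digest depend on the padding, while truncating to the written length first — the {{Arrays.copyOf(buf.getData(), buf.getLength())}} idiom — makes the digests agree.

```java
import java.security.MessageDigest;
import java.util.Arrays;

// Simulates the getData()-vs-getLength() bug: the backing byte[] is larger
// than what was written, and bytes past the write length are arbitrary.
public class BufferTruncation {

    // Mimics a DataOutputBuffer backing array: pre-sized, filler past the data.
    public static byte[] backingArray(byte[] written, int capacity, byte filler) {
        byte[] data = new byte[capacity];
        Arrays.fill(data, filler);                        // simulate non-zeroed memory
        System.arraycopy(written, 0, data, 0, written.length);
        return data;                                      // "getData()": full array
    }

    public static byte[] sha1(byte[] input) {
        try {
            return MessageDigest.getInstance("SHA-1").digest(input);
        } catch (java.security.NoSuchAlgorithmException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        byte[] written = {1, 2, 3};
        byte[] zeroPadded = backingArray(written, 16, (byte) 0);
        byte[] junkPadded = backingArray(written, 16, (byte) 0x7f);
        // Padding leaks into the digest, so the "same" identifiers disagree.
        System.out.println(Arrays.equals(sha1(zeroPadded), sha1(junkPadded)));
        // Truncating to the written length first makes them agree.
        System.out.println(Arrays.equals(
                sha1(Arrays.copyOf(zeroPadded, written.length)),
                sha1(Arrays.copyOf(junkPadded, written.length))));
    }
}
```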
[jira] [Commented] (HBASE-10525) Allow the client to use a different thread for writing to ease interrupt
[ https://issues.apache.org/jira/browse/HBASE-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907417#comment-13907417 ] Enis Soztutar commented on HBASE-10525: --- Thanks Nicolas. bq. The solution is to add a check after we add the call, exactly as in the first scenario. This does not require another synchronization point. Agreed. The patch seems to be fixing this. Is this sleep(1) intended, or left from debugging? {code} -connection.writeRequest(call, priority); +Thread.sleep(1); {code} Allow the client to use a different thread for writing to ease interrupt Key: HBASE-10525 URL: https://issues.apache.org/jira/browse/HBASE-10525 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.99.0 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Fix For: 0.99.0 Attachments: 10525.v1.patch, 10525.v2.patch, 10525.v3.patch, 10525.v4.patch, 10525.v5.patch, HBaseclient-EventualConsistency.pdf This is an issue in the HBASE-10070 context, but also more generally if you want to interrupt an operation with a limited cost. I will attach a doc with a more detailed explanation. This adds a thread per region server, so it's optional. The first patch activates it by default to see how it behaves on a full hadoop-qa run. The target is to be unset by default. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10527) TestTokenAuthentication fails with the IBM JDK
[ https://issues.apache.org/jira/browse/HBASE-10527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gary Helmling updated HBASE-10527: -- Attachment: HBASE-10527.patch Attaching a trivial patch that switches AuthenticationTokenSecretManager to use TokenIdentifier.getBytes(). TestTokenAuthentication fails with the IBM JDK -- Key: HBASE-10527 URL: https://issues.apache.org/jira/browse/HBASE-10527 Project: HBase Issue Type: Bug Affects Versions: 0.98.0 Environment: IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 20131114_175264 (JIT enabled, AOT enabled) Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Attachments: HBASE-10527.patch, org.apache.hadoop.hbase.security.token.TestTokenAuthentication-output.txt.gz DIGEST-MD5: digest response format violation. Mismatched response. The failure trace: {noformat} 2014-02-13 15:41:00,449 WARN [RpcServer.reader=1,port=54751] ipc.RpcServer$Listener(794): RpcServer.listener,port=54751: count of bytes read: 0 javax.security.sasl.SaslException: DIGEST-MD5: digest response format violation. Mismatched response. at com.ibm.security.sasl.digest.DigestMD5Server.validateClientResponse(DigestMD5Server.java:614) at com.ibm.security.sasl.digest.DigestMD5Server.evaluateResponse(DigestMD5Server.java:234) at org.apache.hadoop.hbase.ipc.RpcServer$Connection.saslReadAndProcess(RpcServer.java:1315) at org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1501) at org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:790) at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:581) at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:556) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1170) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:640) at java.lang.Thread.run(Thread.java:853) {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Assigned] (HBASE-10527) TestTokenAuthentication fails with the IBM JDK
[ https://issues.apache.org/jira/browse/HBASE-10527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gary Helmling reassigned HBASE-10527: - Assignee: Gary Helmling (was: Andrew Purtell) TestTokenAuthentication fails with the IBM JDK -- Key: HBASE-10527 URL: https://issues.apache.org/jira/browse/HBASE-10527 Project: HBase Issue Type: Bug Affects Versions: 0.98.0 Environment: IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 20131114_175264 (JIT enabled, AOT enabled) Reporter: Andrew Purtell Assignee: Gary Helmling Priority: Minor Attachments: HBASE-10527.patch, org.apache.hadoop.hbase.security.token.TestTokenAuthentication-output.txt.gz DIGEST-MD5: digest response format violation. Mismatched response. The failure trace: {noformat} 2014-02-13 15:41:00,449 WARN [RpcServer.reader=1,port=54751] ipc.RpcServer$Listener(794): RpcServer.listener,port=54751: count of bytes read: 0 javax.security.sasl.SaslException: DIGEST-MD5: digest response format violation. Mismatched response. at com.ibm.security.sasl.digest.DigestMD5Server.validateClientResponse(DigestMD5Server.java:614) at com.ibm.security.sasl.digest.DigestMD5Server.evaluateResponse(DigestMD5Server.java:234) at org.apache.hadoop.hbase.ipc.RpcServer$Connection.saslReadAndProcess(RpcServer.java:1315) at org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1501) at org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:790) at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:581) at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:556) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1170) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:640) at java.lang.Thread.run(Thread.java:853) {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10432) Rpc retries non-recoverable error
[ https://issues.apache.org/jira/browse/HBASE-10432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907515#comment-13907515 ] Hadoop QA commented on HBASE-10432: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12630134/HBASE-10432.02.patch against trunk revision . ATTACHMENT ID: 12630134 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.snapshot.TestExportSnapshot Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8758//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8758//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8758//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8758//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8758//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8758//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8758//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8758//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8758//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8758//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8758//console This message is automatically generated. 
Rpc retries non-recoverable error - Key: HBASE-10432 URL: https://issues.apache.org/jira/browse/HBASE-10432 Project: HBase Issue Type: Bug Components: IPC/RPC Affects Versions: 0.98.0, 0.96.2, 0.99.0 Reporter: Nick Dimiduk Assignee: Nick Dimiduk Priority: Minor Attachments: HBASE-10432.00.patch, HBASE-10432.01.patch, HBASE-10432.02.patch, exception.txt I'm recently working with hbase/trunk + hive/trunk. I had a hive command eventually timeout with the following exception (stacktrace truncated). {noformat} Caused by: java.io.IOException: Could not set up IO Streams at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:922) at org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1536) at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1425) at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1654) at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1712) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:28857) at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:302) at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:157) at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:57) at
[jira] [Created] (HBASE-10578) For the same row key, the KV in the newest StoreFile should be returned
Amitanand Aiyer created HBASE-10578: --- Summary: For the same row key, the KV in the newest StoreFile should be returned Key: HBASE-10578 URL: https://issues.apache.org/jira/browse/HBASE-10578 Project: HBase Issue Type: Bug Components: Scanners Affects Versions: 0.89-fb, 0.98.1 Reporter: Amitanand Aiyer Priority: Minor -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10578) For the same row key, the KV in the newest StoreFile should be returned
[ https://issues.apache.org/jira/browse/HBASE-10578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amitanand Aiyer updated HBASE-10578: Description: When multiple scanners have the same KV, HBase should pick the newest one. i.e. pick the KV from the store file with the largest seq id. In the KeyValueHeap generalizedSeek implementation, we seem to prefer the current scanner over the scanners in the heap -- THIS IS WRONG. The diff adds a unit test to make sure that bulk loads work correctly, and fixes the issue. For the same row key, the KV in the newest StoreFile should be returned --- Key: HBASE-10578 URL: https://issues.apache.org/jira/browse/HBASE-10578 Project: HBase Issue Type: Bug Components: Scanners Affects Versions: 0.89-fb, 0.98.1 Reporter: Amitanand Aiyer Priority: Minor When multiple scanners have the same KV, HBase should pick the newest one. i.e. pick the KV from the store file with the largest seq id. In the KeyValueHeap generalizedSeek implementation, we seem to prefer the current scanner over the scanners in the heap -- THIS IS WRONG. The diff adds a unit test to make sure that bulk loads work correctly, and fixes the issue. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10578) For the same row key, the KV in the newest StoreFile should be returned
[ https://issues.apache.org/jira/browse/HBASE-10578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amitanand Aiyer updated HBASE-10578: Fix Version/s: 0.89-fb Assignee: Amitanand Aiyer Status: Patch Available (was: Open) For the same row key, the KV in the newest StoreFile should be returned --- Key: HBASE-10578 URL: https://issues.apache.org/jira/browse/HBASE-10578 Project: HBase Issue Type: Bug Components: Scanners Affects Versions: 0.89-fb, 0.98.1 Reporter: Amitanand Aiyer Assignee: Amitanand Aiyer Priority: Minor Fix For: 0.89-fb When multiple scanners have the same KV, HBase should pick the newest one. i.e. pick the KV from the store file with the largest seq id. In the KeyValueHeap generalizedSeek implementation, we seem to prefer the current scanner over the scanners in the heap -- THIS IS WRONG. The diff adds a unit test to make sure that bulk loads correctly. And fixes the issue. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10578) For the same row key, the KV in the newest StoreFile should be returned
[ https://issues.apache.org/jira/browse/HBASE-10578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amitanand Aiyer updated HBASE-10578: Attachment: HBASE-10578.patch For the same row key, the KV in the newest StoreFile should be returned --- Key: HBASE-10578 URL: https://issues.apache.org/jira/browse/HBASE-10578 Project: HBase Issue Type: Bug Components: Scanners Affects Versions: 0.89-fb, 0.98.1 Reporter: Amitanand Aiyer Assignee: Amitanand Aiyer Priority: Minor Fix For: 0.89-fb Attachments: HBASE-10578.patch When multiple scanners have the same KV, HBase should pick the newest one. i.e. pick the KV from the store file with the largest seq id. In the KeyValueHeap generalizedSeek implementation, we seem to prefer the current scanner over the scanners in the heap -- THIS IS WRONG. The diff adds a unit test to make sure that bulk loads correctly. And fixes the issue. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
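The tie-break the bug report calls for can be sketched with a comparator over a scanner heap. The names below are illustrative, not HBase's actual KeyValueHeap internals: when two scanners are positioned on the same key, the one backed by the newer store file (larger sequence id) must sort first.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Sketch: order scanners by key, and on equal keys prefer the larger
// sequence id, so a KV from a newer store file (e.g. a later bulk load)
// shadows the same KV in an older file.
public class ScannerHeap {

    public static class Scanner {
        public final String currentKey;
        public final long sequenceId;   // newer store files have larger seq ids
        public Scanner(String key, long seqId) {
            this.currentKey = key;
            this.sequenceId = seqId;
        }
    }

    static final Comparator<Scanner> NEWEST_FIRST =
        Comparator.comparing((Scanner s) -> s.currentKey)
                  .thenComparing(Comparator
                      .comparingLong((Scanner s) -> s.sequenceId).reversed());

    public static Scanner pickNext(Scanner... scanners) {
        PriorityQueue<Scanner> heap = new PriorityQueue<>(NEWEST_FIRST);
        for (Scanner s : scanners) heap.add(s);
        return heap.peek();   // the scanner whose KV should be returned
    }

    public static void main(String[] args) {
        Scanner oldFile = new Scanner("row1", 5);
        Scanner newFile = new Scanner("row1", 9);
        System.out.println(pickNext(oldFile, newFile).sequenceId);
    }
}
```

The key point is that the winner must come from the heap's ordering, regardless of which scanner happened to be "current" before the seek.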
[jira] [Commented] (HBASE-10525) Allow the client to use a different thread for writing to ease interrupt
[ https://issues.apache.org/jira/browse/HBASE-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907544#comment-13907544 ] Enis Soztutar commented on HBASE-10525: --- This change might have some consequences, because the connection threads will be kept for much longer. Can we instead do this change only on the test? {code} -conf.getInt("hbase.ipc.client.connection.minIdleTimeBeforeClose", 120000); // 2 minutes +conf.getInt("hbase.ipc.client.connection.minIdleTimeBeforeClose", 1200000); // 20 minutes {code} Allow the client to use a different thread for writing to ease interrupt Key: HBASE-10525 URL: https://issues.apache.org/jira/browse/HBASE-10525 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.99.0 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Fix For: 0.99.0 Attachments: 10525.v1.patch, 10525.v2.patch, 10525.v3.patch, 10525.v4.patch, 10525.v5.patch, HBaseclient-EventualConsistency.pdf This is an issue in the HBASE-10070 context, but also more generally if you want to interrupt an operation with a limited cost. I will attach a doc with a more detailed explanation. This adds a thread per region server, so it's optional. The first patch activates it by default to see how it behaves on a full hadoop-qa run. The target is to be unset by default. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10392) Correct references to hbase.regionserver.global.memstore.upperLimit
[ https://issues.apache.org/jira/browse/HBASE-10392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-10392: - Attachment: HBASE-10392.5.patch bq. This addition may be not needed. Very well. Rebased patch and removed this change. Will commit on successful QABot run. Correct references to hbase.regionserver.global.memstore.upperLimit --- Key: HBASE-10392 URL: https://issues.apache.org/jira/browse/HBASE-10392 Project: HBase Issue Type: Bug Reporter: Nick Dimiduk Assignee: Nick Dimiduk Fix For: 0.99.0 Attachments: HBASE-10392.0.patch, HBASE-10392.1.patch, HBASE-10392.2.patch, HBASE-10392.3.patch, HBASE-10392.4.patch, HBASE-10392.5.patch As part of the awesome new HBASE-5349, a couple of references to {{hbase.regionserver.global.memstore.upperLimit}} were missed. Clean those up to use the new {{hbase.regionserver.global.memstore.size}} instead. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
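A hedged sketch of the rename being cleaned up here: code should read the new key and fall back to the deprecated one so old site files keep working. The helper and the 0.4 default are illustrative stand-ins, not HBase's actual Configuration machinery.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative config-rename pattern: prefer the new key, honor the old one.
class MemstoreSizeSketch {
    static final String NEW_KEY = "hbase.regionserver.global.memstore.size";
    static final String OLD_KEY = "hbase.regionserver.global.memstore.upperLimit";

    static float globalMemstoreSize(Map<String, String> conf) {
        String v = conf.get(NEW_KEY);
        if (v == null) v = conf.get(OLD_KEY);           // deprecated fallback
        return v == null ? 0.4f : Float.parseFloat(v);  // assumed default
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(globalMemstoreSize(conf));   // default when unset
        conf.put(OLD_KEY, "0.35");                      // old-style site config
        System.out.println(globalMemstoreSize(conf));   // old key still honored
    }
}
```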
[jira] [Commented] (HBASE-10578) For the same row key, the KV in the newest StoreFile should be returned
[ https://issues.apache.org/jira/browse/HBASE-10578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907555#comment-13907555 ] Liyin Tang commented on HBASE-10578: Nice finding! For the same row key, the KV in the newest StoreFile should be returned --- Key: HBASE-10578 URL: https://issues.apache.org/jira/browse/HBASE-10578 Project: HBase Issue Type: Bug Components: Scanners Affects Versions: 0.89-fb, 0.98.1 Reporter: Amitanand Aiyer Assignee: Amitanand Aiyer Priority: Minor Fix For: 0.89-fb Attachments: HBASE-10578.patch When multiple scanners have the same KV, HBase should pick the newest one, i.e. the KV from the store file with the largest seq id. In the KeyValueHeap generalizedSeek implementation, we seem to prefer the current scanner over the scanners in the heap -- THIS IS WRONG. The diff adds a unit test to make sure that bulk loads work correctly, and fixes the issue. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10578) For the same row key, the KV in the newest StoreFile should be returned
[ https://issues.apache.org/jira/browse/HBASE-10578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907567#comment-13907567 ] Hadoop QA commented on HBASE-10578: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12630159/HBASE-10578.patch against trunk revision . ATTACHMENT ID: 12630159 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified tests. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8759//console This message is automatically generated. For the same row key, the KV in the newest StoreFile should be returned --- Key: HBASE-10578 URL: https://issues.apache.org/jira/browse/HBASE-10578 Project: HBase Issue Type: Bug Components: Scanners Affects Versions: 0.89-fb, 0.98.1 Reporter: Amitanand Aiyer Assignee: Amitanand Aiyer Priority: Minor Fix For: 0.89-fb Attachments: HBASE-10578.patch When multiple scanners have the same KV, HBase should pick the newest one, i.e. the KV from the store file with the largest seq id. In the KeyValueHeap generalizedSeek implementation, we seem to prefer the current scanner over the scanners in the heap -- THIS IS WRONG. The diff adds a unit test to make sure that bulk loads work correctly, and fixes the issue. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10525) Allow the client to use a different thread for writing to ease interrupt
[ https://issues.apache.org/jira/browse/HBASE-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907577#comment-13907577 ] Nicolas Liochon commented on HBASE-10525: - bq. Is this sleep(1) intended, or left from debugging? Oops, left from debugging. bq. minIdleTimeBeforeClose, 120000 Yes, I've changed it to see if it was the issue with TestHBaseFsck: if the connection becomes idle, TestHBaseFsck fails. I will write a test on the RpcClient alone: there is no test for this feature. I can update it for the test itself, but I also need to check what it means for Fsck itself: maybe it relies on having the same TCP connection. Lastly, the control of the idle connection is new (previously we were mixing bugs and client pings to keep the connection alive); I don't know what the best value is (but 2 minutes seems much better than 20, for sure). Allow the client to use a different thread for writing to ease interrupt Key: HBASE-10525 URL: https://issues.apache.org/jira/browse/HBASE-10525 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.99.0 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Fix For: 0.99.0 Attachments: 10525.v1.patch, 10525.v2.patch, 10525.v3.patch, 10525.v4.patch, 10525.v5.patch, HBaseclient-EventualConsistency.pdf This is an issue in the HBASE-10070 context, but also, more generally, if you want to interrupt an operation with a limited cost. I will attach a doc with a more detailed explanation. This adds a thread per region server, so it's optional. The first patch activates it by default to see how it behaves on a full hadoop-qa run. The target is for it to be unset by default. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10392) Correct references to hbase.regionserver.global.memstore.upperLimit
[ https://issues.apache.org/jira/browse/HBASE-10392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907701#comment-13907701 ] Hadoop QA commented on HBASE-10392: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12630162/HBASE-10392.5.patch against trunk revision . ATTACHMENT ID: 12630162 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: {color:red}-1 core zombie tests{color}. 
There are 1 zombie test(s): Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8760//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8760//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8760//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8760//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8760//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8760//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8760//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8760//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8760//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8760//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8760//console This message is automatically generated. 
Correct references to hbase.regionserver.global.memstore.upperLimit --- Key: HBASE-10392 URL: https://issues.apache.org/jira/browse/HBASE-10392 Project: HBase Issue Type: Bug Reporter: Nick Dimiduk Assignee: Nick Dimiduk Fix For: 0.99.0 Attachments: HBASE-10392.0.patch, HBASE-10392.1.patch, HBASE-10392.2.patch, HBASE-10392.3.patch, HBASE-10392.4.patch, HBASE-10392.5.patch As part of the awesome new HBASE-5349, a couple of references to {{hbase.regionserver.global.memstore.upperLimit}} were missed. Clean those up to use the new {{hbase.regionserver.global.memstore.size}} instead. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10578) For the same row key, the KV in the newest StoreFile should be returned
[ https://issues.apache.org/jira/browse/HBASE-10578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amitanand Aiyer updated HBASE-10578: Attachment: HBASE-10578-v2.patch This file is intended to be applied over 0.89-fb. Not sure why HadoopQA tries to apply this over the open-source trunk and complains about it. For the same row key, the KV in the newest StoreFile should be returned --- Key: HBASE-10578 URL: https://issues.apache.org/jira/browse/HBASE-10578 Project: HBase Issue Type: Bug Components: Scanners Affects Versions: 0.89-fb, 0.98.1 Reporter: Amitanand Aiyer Assignee: Amitanand Aiyer Priority: Minor Fix For: 0.89-fb Attachments: HBASE-10578-v2.patch, HBASE-10578.patch When multiple scanners have the same KV, HBase should pick the newest one, i.e. the KV from the store file with the largest seq id. In the KeyValueHeap generalizedSeek implementation, we seem to prefer the current scanner over the scanners in the heap -- THIS IS WRONG. The diff adds a unit test to make sure that bulk loads work correctly, and fixes the issue. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Comment Edited] (HBASE-10527) TestTokenAuthentication fails with the IBM JDK
[ https://issues.apache.org/jira/browse/HBASE-10527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907711#comment-13907711 ] Andrew Purtell edited comment on HBASE-10527 at 2/20/14 11:41 PM: -- bq. The Oracle JDK implementation of BAOS zero initializes the underlying byte[]. However, IBM JDK does not seem to. This seems to matter in computing the SHA1 for the token identifier password, where the trailing \x00 bytes do not change the SHA1 in the Oracle JDK, but the additional uninitialized junk does change the resulting SHA1 with IBM JDK. That makes complete sense. All that digging for such a simple fix. Thanks so much Gary, +1 Edit: Fix formatting was (Author: apurtell): bq, The Oracle JDK implementation of BAOS zero initializes the underlying byte[]. However, IBM JDK does not seem to. This seems to matter in computing the SHA1 for the token identifier password, where the trailing \x00 bytes do not change the SHA1 in the Oracle JDK, but the additional uninitialized junk does change the resulting SHA1 with IBM JDK. That makes complete sense. All that digging for such a simple fix. Thanks so much Gary, +1 TestTokenAuthentication fails with the IBM JDK -- Key: HBASE-10527 URL: https://issues.apache.org/jira/browse/HBASE-10527 Project: HBase Issue Type: Bug Affects Versions: 0.98.0 Environment: IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 20131114_175264 (JIT enabled, AOT enabled) Reporter: Andrew Purtell Assignee: Gary Helmling Priority: Minor Attachments: HBASE-10527.patch, org.apache.hadoop.hbase.security.token.TestTokenAuthentication-output.txt.gz DIGEST-MD5: digest response format violation. Mismatched response. The failure trace: {noformat} 2014-02-13 15:41:00,449 WARN [RpcServer.reader=1,port=54751] ipc.RpcServer$Listener(794): RpcServer.listener,port=54751: count of bytes read: 0 javax.security.sasl.SaslException: DIGEST-MD5: digest response format violation. Mismatched response. 
at com.ibm.security.sasl.digest.DigestMD5Server.validateClientResponse(DigestMD5Server.java:614) at com.ibm.security.sasl.digest.DigestMD5Server.evaluateResponse(DigestMD5Server.java:234) at org.apache.hadoop.hbase.ipc.RpcServer$Connection.saslReadAndProcess(RpcServer.java:1315) at org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1501) at org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:790) at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:581) at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:556) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1170) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:640) at java.lang.Thread.run(Thread.java:853) {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10527) TestTokenAuthentication fails with the IBM JDK
[ https://issues.apache.org/jira/browse/HBASE-10527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907711#comment-13907711 ] Andrew Purtell commented on HBASE-10527: bq, The Oracle JDK implementation of BAOS zero initializes the underlying byte[]. However, IBM JDK does not seem to. This seems to matter in computing the SHA1 for the token identifier password, where the trailing \x00 bytes do not change the SHA1 in the Oracle JDK, but the additional uninitialized junk does change the resulting SHA1 with IBM JDK. That makes complete sense. All that digging for such a simple fix. Thanks so much Gary, +1 TestTokenAuthentication fails with the IBM JDK -- Key: HBASE-10527 URL: https://issues.apache.org/jira/browse/HBASE-10527 Project: HBase Issue Type: Bug Affects Versions: 0.98.0 Environment: IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 20131114_175264 (JIT enabled, AOT enabled) Reporter: Andrew Purtell Assignee: Gary Helmling Priority: Minor Attachments: HBASE-10527.patch, org.apache.hadoop.hbase.security.token.TestTokenAuthentication-output.txt.gz DIGEST-MD5: digest response format violation. Mismatched response. The failure trace: {noformat} 2014-02-13 15:41:00,449 WARN [RpcServer.reader=1,port=54751] ipc.RpcServer$Listener(794): RpcServer.listener,port=54751: count of bytes read: 0 javax.security.sasl.SaslException: DIGEST-MD5: digest response format violation. Mismatched response. 
at com.ibm.security.sasl.digest.DigestMD5Server.validateClientResponse(DigestMD5Server.java:614) at com.ibm.security.sasl.digest.DigestMD5Server.evaluateResponse(DigestMD5Server.java:234) at org.apache.hadoop.hbase.ipc.RpcServer$Connection.saslReadAndProcess(RpcServer.java:1315) at org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1501) at org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:790) at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:581) at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:556) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1170) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:640) at java.lang.Thread.run(Thread.java:853) {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
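The root cause discussed above can be demonstrated in isolation. This is a hypothetical usage sketch, not the actual HBase token code: hashing a ByteArrayOutputStream's internal buffer includes capacity padding whose contents are JDK-dependent, whereas toByteArray() trims to size() and is safe on any JDK.

```java
import java.io.ByteArrayOutputStream;
import java.security.MessageDigest;
import java.util.Arrays;

// Sketch of the bug pattern: the padding beyond size() changes the digest.
class BaosDigestSketch {
    static byte[] sha1(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-1").digest(data);
        } catch (Exception e) {
            throw new RuntimeException(e);  // SHA-1 is always available
        }
    }

    public static void main(String[] args) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream(32); // capacity 32
        baos.write(new byte[] {1, 2, 3}, 0, 3);                     // size() == 3

        byte[] trimmed = baos.toByteArray();         // exactly the 3 written bytes
        byte[] padded = Arrays.copyOf(trimmed, 32);  // what hashing the raw buffer sees

        System.out.println(trimmed.length);                              // 3
        // The padding changes the digest, so a peer hashing only the written
        // bytes computes a different password hash -- hence the SASL mismatch.
        System.out.println(Arrays.equals(sha1(trimmed), sha1(padded))); // false
    }
}
```

On the Oracle JDK both sides happened to see consistent zero padding, so the mismatch only surfaced on the IBM JDK, where the padding bytes are not guaranteed to be zero.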
[jira] [Commented] (HBASE-10529) Make Cell extend Cloneable
[ https://issues.apache.org/jira/browse/HBASE-10529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907714#comment-13907714 ] Enis Soztutar commented on HBASE-10529: --- Agreed that ideally we should not be cloning cells. Since Cell is a public interface, I am afraid that this will also enable clients to freely depend on Cells being cloneable. It seems that we only need Cells to be cloneable for the memstore implementation, right? Shall we do something like a CellCloneable and use it internally? Make Cell extend Cloneable -- Key: HBASE-10529 URL: https://issues.apache.org/jira/browse/HBASE-10529 Project: HBase Issue Type: Sub-task Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 0.99.0 Attachments: HBSE-10529.patch Refer to the parent JIRA for discussion on making Cell extend Cloneable. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10529) Make Cell extend Cloneable
[ https://issues.apache.org/jira/browse/HBASE-10529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907723#comment-13907723 ] stack commented on HBASE-10529: --- Let's try to undo the need to clone to add to the MemStore. I like the [~mcorgan] pushback and the [~enis] suggestion that if we need to clone in one place, let's just implement a copy there and not taint all Cells with Clone until we absolutely need it. Make Cell extend Cloneable -- Key: HBASE-10529 URL: https://issues.apache.org/jira/browse/HBASE-10529 Project: HBase Issue Type: Sub-task Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 0.99.0 Attachments: HBSE-10529.patch Refer to the parent JIRA for discussion on making Cell extend Cloneable. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
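The alternative suggested here, a local copy helper instead of a public Cloneable contract, can be sketched with simplified, hypothetical types (MiniCell is a stand-in for the real Cell interface, which has many more accessors):

```java
// Sketch of the suggestion: the one place that needs a copy (the memstore)
// uses a deep-copy helper; the public interface never extends Cloneable.
class CellCopySketch {
    interface MiniCell {               // hypothetical stand-in for Cell
        byte[] rowArray();
        int rowOffset();
        int rowLength();
    }

    // Deep copy into a fresh backing array; no Cloneable needed on MiniCell.
    static byte[] copyRow(MiniCell c) {
        byte[] out = new byte[c.rowLength()];
        System.arraycopy(c.rowArray(), c.rowOffset(), out, 0, c.rowLength());
        return out;
    }

    public static void main(String[] args) {
        byte[] backing = {9, 9, 1, 2, 3};         // row bytes live at offset 2
        MiniCell cell = new MiniCell() {
            public byte[] rowArray() { return backing; }
            public int rowOffset() { return 2; }
            public int rowLength() { return 3; }
        };
        byte[] copy = copyRow(cell);
        backing[2] = 42;                          // mutate the original buffer
        System.out.println(copy[0]);              // the copy is independent
    }
}
```

This keeps clients from ever being able to depend on Cells being cloneable, which is the compatibility concern raised above.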
[jira] [Commented] (HBASE-10578) For the same row key, the KV in the newest StoreFile should be returned
[ https://issues.apache.org/jira/browse/HBASE-10578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907725#comment-13907725 ] stack commented on HBASE-10578: --- [~amitanand] Yeah, our little hadoopqa bot just runs whatever is posted against trunk... just ignore. For the same row key, the KV in the newest StoreFile should be returned --- Key: HBASE-10578 URL: https://issues.apache.org/jira/browse/HBASE-10578 Project: HBase Issue Type: Bug Components: Scanners Affects Versions: 0.89-fb, 0.98.1 Reporter: Amitanand Aiyer Assignee: Amitanand Aiyer Priority: Minor Fix For: 0.89-fb Attachments: HBASE-10578-v2.patch, HBASE-10578.patch When multiple scanners have the same KV, HBase should pick the newest one, i.e. the KV from the store file with the largest seq id. In the KeyValueHeap generalizedSeek implementation, we seem to prefer the current scanner over the scanners in the heap -- THIS IS WRONG. The diff adds a unit test to make sure that bulk loads work correctly, and fixes the issue. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10578) For the same row key, the KV in the newest StoreFile should be returned
[ https://issues.apache.org/jira/browse/HBASE-10578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907732#comment-13907732 ] stack commented on HBASE-10578: --- We have this issue in trunk too. The patch looks like it'll apply too... For the same row key, the KV in the newest StoreFile should be returned --- Key: HBASE-10578 URL: https://issues.apache.org/jira/browse/HBASE-10578 Project: HBase Issue Type: Bug Components: Scanners Affects Versions: 0.89-fb, 0.98.1 Reporter: Amitanand Aiyer Assignee: Amitanand Aiyer Priority: Minor Fix For: 0.89-fb Attachments: HBASE-10578-v2.patch, HBASE-10578.patch When multiple scanners have the same KV, HBase should pick the newest one, i.e. the KV from the store file with the largest seq id. In the KeyValueHeap generalizedSeek implementation, we seem to prefer the current scanner over the scanners in the heap -- THIS IS WRONG. The diff adds a unit test to make sure that bulk loads work correctly, and fixes the issue. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10578) For the same row key, the KV in the newest StoreFile should be returned
[ https://issues.apache.org/jira/browse/HBASE-10578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907743#comment-13907743 ] Hadoop QA commented on HBASE-10578: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12630192/HBASE-10578-v2.patch against trunk revision . ATTACHMENT ID: 12630192 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified tests. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8761//console This message is automatically generated. For the same row key, the KV in the newest StoreFile should be returned --- Key: HBASE-10578 URL: https://issues.apache.org/jira/browse/HBASE-10578 Project: HBase Issue Type: Bug Components: Scanners Affects Versions: 0.89-fb, 0.98.1 Reporter: Amitanand Aiyer Assignee: Amitanand Aiyer Priority: Minor Fix For: 0.89-fb Attachments: HBASE-10578-v2.patch, HBASE-10578.patch When multiple scanners have the same KV, HBase should pick the newest one, i.e. the KV from the store file with the largest seq id. In the KeyValueHeap generalizedSeek implementation, we seem to prefer the current scanner over the scanners in the heap -- THIS IS WRONG. The diff adds a unit test to make sure that bulk loads work correctly, and fixes the issue. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HBASE-10579) [Documentation]: ExportSnapshot tool package incorrectly documented
Aleksandr Shulman created HBASE-10579: - Summary: [Documentation]: ExportSnapshot tool package incorrectly documented Key: HBASE-10579 URL: https://issues.apache.org/jira/browse/HBASE-10579 Project: HBase Issue Type: Bug Components: documentation, snapshots Affects Versions: 0.98.0 Reporter: Aleksandr Shulman Priority: Minor Fix For: 0.96.2, 0.98.1 Documentation Page: http://hbase.apache.org/book/ops.snapshots.html Expected documentation: The class should be specified as org.apache.hadoop.hbase.snapshot.ExportSnapshot Current documentation: Specified as: org.apache.hadoop.hbase.snapshot.tool.ExportSnapshot This makes sense because the class is located in the org.apache.hadoop.hbase.snapshot package: https://github.com/apache/hbase/blob/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/ExportSnapshot.java#19 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10527) TestTokenAuthentication fails with the IBM JDK
[ https://issues.apache.org/jira/browse/HBASE-10527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gary Helmling updated HBASE-10527: -- Status: Patch Available (was: Open) TestTokenAuthentication fails with the IBM JDK -- Key: HBASE-10527 URL: https://issues.apache.org/jira/browse/HBASE-10527 Project: HBase Issue Type: Bug Affects Versions: 0.98.0 Environment: IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 20131114_175264 (JIT enabled, AOT enabled) Reporter: Andrew Purtell Assignee: Gary Helmling Priority: Minor Attachments: HBASE-10527.patch, org.apache.hadoop.hbase.security.token.TestTokenAuthentication-output.txt.gz DIGEST-MD5: digest response format violation. Mismatched response. The failure trace: {noformat} 2014-02-13 15:41:00,449 WARN [RpcServer.reader=1,port=54751] ipc.RpcServer$Listener(794): RpcServer.listener,port=54751: count of bytes read: 0 javax.security.sasl.SaslException: DIGEST-MD5: digest response format violation. Mismatched response. at com.ibm.security.sasl.digest.DigestMD5Server.validateClientResponse(DigestMD5Server.java:614) at com.ibm.security.sasl.digest.DigestMD5Server.evaluateResponse(DigestMD5Server.java:234) at org.apache.hadoop.hbase.ipc.RpcServer$Connection.saslReadAndProcess(RpcServer.java:1315) at org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1501) at org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:790) at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:581) at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:556) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1170) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:640) at java.lang.Thread.run(Thread.java:853) {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-6642) enable_all,disable_all,drop_all can call list command with regex directly.
[ https://issues.apache.org/jira/browse/HBASE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1390#comment-1390 ] Hudson commented on HBASE-6642: --- FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #160 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/160/]) HBASE-6642 enable_all,disable_all,drop_all can call 'list' command with regex directly (mbertozzi: rev 1570328) * /hbase/branches/0.98/hbase-shell/src/main/ruby/hbase/admin.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/hbase/replication_admin.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/disable_all.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/drop_all.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/enable_all.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/list_namespace.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/list_replicated_tables.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/list_snapshots.rb enable_all,disable_all,drop_all can call list command with regex directly. Key: HBASE-6642 URL: https://issues.apache.org/jira/browse/HBASE-6642 Project: HBase Issue Type: Bug Components: shell Affects Versions: 0.94.6, 0.95.0 Reporter: Y. SREENIVASULU REDDY Assignee: Matteo Bertozzi Fix For: 0.98.1, 0.99.0 Attachments: HBASE-6642-v1.patch, HBASE-6642_trunk.patch Created a few tables, then performed the disable_all operation at the shell prompt, but it does not perform the operation successfully. {noformat} hbase(main):043:0 disable_all '*' table12 zk0113 zk0114 Disable the above 3 tables (y/n)? y 3 tables successfully disabled {noformat} It just shows the message, but the operation does not succeed. Only the following way performs successfully: {noformat} hbase(main):043:0 disable_all '*.*' table12 zk0113 zk0114 Disable the above 3 tables (y/n)? y 3 tables successfully disabled {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
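A hint at why the bare '*' misbehaves: when the pattern reaches Java's regex engine, '*' on its own is not a valid expression at all, while '.*' matches every table name. This sketch only models that regex-level behavior with hypothetical helper code; the shell's exact handling (and what '*.*' did before the fix) may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Sketch: filter table names with a Java regex, as the fixed shell commands
// do by delegating to the 'list' command.
class TableRegexSketch {
    static List<String> matching(String regex, List<String> tables) {
        Pattern p = Pattern.compile(regex);   // throws on an invalid pattern
        List<String> out = new ArrayList<>();
        for (String t : tables) {
            if (p.matcher(t).matches()) out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tables = List.of("table12", "zk0113", "zk0114");
        System.out.println(matching(".*", tables).size());   // all three match
        try {
            matching("*", tables);            // '*' is a dangling quantifier
        } catch (PatternSyntaxException e) {
            System.out.println("'*' is not a valid Java regex");
        }
    }
}
```

Surfacing the PatternSyntaxException to the user (instead of silently claiming success) is the behavior a user would want from disable_all '*'.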
[jira] [Commented] (HBASE-6642) enable_all,disable_all,drop_all can call list command with regex directly.
[ https://issues.apache.org/jira/browse/HBASE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907778#comment-13907778 ] Hudson commented on HBASE-6642: --- FAILURE: Integrated in HBase-0.98 #172 (See [https://builds.apache.org/job/HBase-0.98/172/]) HBASE-6642 enable_all,disable_all,drop_all can call 'list' command with regex directly (mbertozzi: rev 1570328) * /hbase/branches/0.98/hbase-shell/src/main/ruby/hbase/admin.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/hbase/replication_admin.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/disable_all.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/drop_all.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/enable_all.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/list_namespace.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/list_replicated_tables.rb * /hbase/branches/0.98/hbase-shell/src/main/ruby/shell/commands/list_snapshots.rb enable_all,disable_all,drop_all can call list command with regex directly. Key: HBASE-6642 URL: https://issues.apache.org/jira/browse/HBASE-6642 Project: HBase Issue Type: Bug Components: shell Affects Versions: 0.94.6, 0.95.0 Reporter: Y. SREENIVASULU REDDY Assignee: Matteo Bertozzi Fix For: 0.98.1, 0.99.0 Attachments: HBASE-6642-v1.patch, HBASE-6642_trunk.patch Created a few tables, then performed the disable_all operation at the shell prompt, but it does not perform the operation successfully. {noformat} hbase(main):043:0 disable_all '*' table12 zk0113 zk0114 Disable the above 3 tables (y/n)? y 3 tables successfully disabled {noformat} It just shows the message, but the operation does not succeed. Only the following way performs successfully: {noformat} hbase(main):043:0 disable_all '*.*' table12 zk0113 zk0114 Disable the above 3 tables (y/n)? y 3 tables successfully disabled {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10525) Allow the client to use a different thread for writing to ease interrupt
[ https://issues.apache.org/jira/browse/HBASE-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907785#comment-13907785 ] Enis Soztutar commented on HBASE-10525: --- v5 lgtm, except for the two issues above. Allow the client to use a different thread for writing to ease interrupt Key: HBASE-10525 URL: https://issues.apache.org/jira/browse/HBASE-10525 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.99.0 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Fix For: 0.99.0 Attachments: 10525.v1.patch, 10525.v2.patch, 10525.v3.patch, 10525.v4.patch, 10525.v5.patch, HBaseclient-EventualConsistency.pdf This is an issue in the HBASE-10070 context, but also, more generally, if you want to interrupt an operation with a limited cost. I will attach a doc with a more detailed explanation. This adds a thread per region server, so it's optional. The first patch activates it by default to see how it behaves on a full hadoop-qa run. The target is for it to be unset by default. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HBASE-10580) IntegrationTestingUtility#restoreCluster leak resource when running in a mini cluster mode
Jeffrey Zhong created HBASE-10580: - Summary: IntegrationTestingUtility#restoreCluster leak resource when running in a mini cluster mode Key: HBASE-10580 URL: https://issues.apache.org/jira/browse/HBASE-10580 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.96.1, 0.96.0, 0.98.0, 0.99.0 Reporter: Jeffrey Zhong When the utility isn't run in distributed cluster mode, the restore only shuts down the MiniHBaseCluster, not the MiniDFSCluster or MiniZKCluster. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Assigned] (HBASE-10580) IntegrationTestingUtility#restoreCluster leak resource when running in a mini cluster mode
[ https://issues.apache.org/jira/browse/HBASE-10580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeffrey Zhong reassigned HBASE-10580: - Assignee: Jeffrey Zhong IntegrationTestingUtility#restoreCluster leak resource when running in a mini cluster mode -- Key: HBASE-10580 URL: https://issues.apache.org/jira/browse/HBASE-10580 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.98.0, 0.96.0, 0.96.1, 0.99.0 Reporter: Jeffrey Zhong Assignee: Jeffrey Zhong When the utility isn't run in distributed cluster mode, the restore only shuts down the MiniHBaseCluster, not the MiniDFSCluster or MiniZKCluster. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
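The shape of the leak and the fix idea can be sketched with hypothetical stand-ins (this is not the real IntegrationTestingUtility): a restore must stop every mini cluster that was started, HBase, DFS, and ZK alike, not just the HBase one.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch: track started mini clusters on a stack and drain it on restore.
class RestoreClusterSketch {
    static int running = 0;  // how many mini clusters are still up

    static class MiniCluster implements AutoCloseable {
        final String name;
        MiniCluster(String name) { this.name = name; running++; }
        public void close() { running--; }
    }

    public static void main(String[] args) {
        // Start order: ZK, then DFS, then HBase (mirrors a mini cluster boot).
        Deque<MiniCluster> started = new ArrayDeque<>();
        started.push(new MiniCluster("zk"));
        started.push(new MiniCluster("dfs"));
        started.push(new MiniCluster("hbase"));

        // Buggy restore: only the HBase cluster is shut down.
        started.pop().close();
        System.out.println(running); // dfs and zk are still running (leak)

        // Fixed restore: drain the stack so everything stops, newest first.
        while (!started.isEmpty()) started.pop().close();
        System.out.println(running); // everything stopped
    }
}
```

Shutting down in reverse start order matters because the HBase cluster depends on DFS and ZK being up while it stops.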
[jira] [Commented] (HBASE-10497) Correct wrong handling and add proper handling for swallowed InterruptedException thrown by Thread.sleep under HBase-Client/HBase-Server folders systematically
[ https://issues.apache.org/jira/browse/HBASE-10497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13907792#comment-13907792 ] Feng Honghua commented on HBASE-10497: -- Thank you [~nkeywal] :-) Correct wrong handling and add proper handling for swallowed InterruptedException thrown by Thread.sleep under HBase-Client/HBase-Server folders systematically --- Key: HBASE-10497 URL: https://issues.apache.org/jira/browse/HBASE-10497 Project: HBase Issue Type: Bug Components: Client, regionserver Reporter: Feng Honghua Assignee: Feng Honghua Attachments: HBASE-10497-trunk_v1.patch, HBASE-10497-trunk_v2.patch There are two kinds of handling problems for InterruptedException thrown by Thread.sleep in many places under the HBase-Client/HBase-Server folders: # Thread.currentThread().interrupt() is called within 'while' loops, which can result in buggy behaviors such as an expected sleep not occurring because the interrupt status was restored by an earlier iteration # InterruptedException thrown by Thread.sleep is swallowed silently (neither declared in the caller method's throws clause nor rethrown immediately) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
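The first bug class described above can be demonstrated directly. This is a generic sketch of the pattern, not code from the patch: restoring the interrupt status inside a retry loop makes every subsequent sleep throw immediately, so the expected backoff never happens; the usual remedy is to remember the interrupt and restore it once, after the loop.

```java
// Sketch: why Thread.currentThread().interrupt() inside a loop is buggy.
class InterruptHandlingSketch {
    // Buggy shape: interrupt restored inside the loop.
    static int buggySleeps(int rounds) {
        int completed = 0;
        for (int i = 0; i < rounds; i++) {
            try {
                Thread.sleep(1);
                completed++;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // next sleep throws at once
            }
        }
        return completed;
    }

    // Proper shape: remember the interrupt, restore it once when done.
    static int properSleeps(int rounds) {
        int completed = 0;
        boolean interrupted = false;
        for (int i = 0; i < rounds; i++) {
            try {
                Thread.sleep(1);
                completed++;
            } catch (InterruptedException e) {
                interrupted = true;                 // note it, keep looping
            }
        }
        if (interrupted) Thread.currentThread().interrupt(); // not silently lost
        return completed;
    }

    public static void main(String[] args) {
        Thread.currentThread().interrupt();   // simulate a pending interrupt
        System.out.println(buggySleeps(3));   // no sleep ever completes
        Thread.interrupted();                 // clear the status
        Thread.currentThread().interrupt();
        System.out.println(properSleeps(3));  // later sleeps still run
        Thread.interrupted();                 // clear before exiting
    }
}
```

The second bug class (swallowing the exception entirely) loses the interrupt forever; the proper shape above avoids that by re-asserting the status before returning.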