[jira] [Updated] (HBASE-7210) Backport HBASE-6059 to 0.94
[ https://issues.apache.org/jira/browse/HBASE-7210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-7210:
    Fix Version/s: (was: 0.94.5) 0.94.6

Moving to 0.94.6, I don't think we're ready here.

Backport HBASE-6059 to 0.94
    Key: HBASE-7210
    URL: https://issues.apache.org/jira/browse/HBASE-7210
    Project: HBase
    Issue Type: Bug
    Affects Versions: 0.94.2
    Reporter: ramkrishna.s.vasudevan
    Fix For: 0.94.6
    Attachments: 6059-94.patch, 7120.txt

HBASE-6059 seems to be an important issue. Chunhui has already given a patch for 94. Need to rebase if it does not apply cleanly. Raising a new one as the old issue is already closed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-7280) TableNotFoundException thrown in peer cluster will incur endless retry for shipEdits, which in turn block following normal replication
[ https://issues.apache.org/jira/browse/HBASE-7280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl resolved HBASE-7280.
    Resolution: Won't Fix
    Fix Version/s: (was: 0.94.5)

Closing as Won't Fix. This is working as designed. We can of course discuss another approach that can also ship meta edits.

TableNotFoundException thrown in peer cluster will incur endless retry for shipEdits, which in turn block following normal replication
    Key: HBASE-7280
    URL: https://issues.apache.org/jira/browse/HBASE-7280
    Project: HBase
    Issue Type: Bug
    Components: Replication
    Affects Versions: 0.94.2
    Reporter: Feng Honghua
    Original Estimate: 0.5h
    Remaining Estimate: 0.5h

In cluster replication, suppose the master cluster has two tables whose column families are declared with replication scope = 1, and a peer cluster is added that has only one table with the same name as in the master cluster. The ReplicationSource (a thread in the master cluster) for this peer ships edits (logs) for both tables to the peer. The peer fails to apply the edits because of a TableNotFoundException, and this exception is returned to the original shipper (the ReplicationSource in the master cluster). The shipper then falls into an endless retry of the failed edits, without proceeding to read the remaining (newer) log files or to ship the following edits (which may be normal, expected edits for the registered table).

The symptom is that the TableNotFoundException causes an endless retry and blocks replication of the normal table.
[jira] [Resolved] (HBASE-7101) HBase stuck in Region SPLIT
[ https://issues.apache.org/jira/browse/HBASE-7101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl resolved HBASE-7101.
    Resolution: Fixed
    Fix Version/s: (was: 0.94.5) (was: 0.96.0)

I think this is a dup.

HBase stuck in Region SPLIT
    Key: HBASE-7101
    URL: https://issues.apache.org/jira/browse/HBASE-7101
    Project: HBase
    Issue Type: Bug
    Affects Versions: 0.92.1
    Reporter: Bing Jiang

I found this issue from a znode that has existed for a long time under the unassigned parent. The HMaster keeps writing warnings to its log. The looping log is below.

WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 1a1c950ad45812d7b4b9b90ebf268468 not found on server sev0040,60020,1350378314041; failed processing
WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region 1a1c950ad45812d7b4b9b90ebf268468 from server sev0040,60020,1350378314041 but it doesn't exist anymore, probably already processed its split
WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 1a1c950ad45812d7b4b9b90ebf268468 not found on server gs-dpo-sev0040,60020,1350378314041; failed processing
WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region 1a1c950ad45812d7b4b9b90ebf268468 from server sev0040,60020,1350378314041 but it doesn't exist anymore, probably already processed its split

We use HBase 0.92.1, and I traced back through the source code. The HMaster AssignmentManager has already deleted the SPLIT region from its in-memory structure, but the HRegionServer SplitTransaction finds that the unassigned/parent-node still exists in a transient state. Precisely, SplitTransaction executes tickleNodeSplit to write a new version of the znode slightly later than AssignmentManager deletes the unassigned/parent-znode. Updating the znode version triggers the handleRegion operation again; however, AssignmentManager finds that the RegionState in memory has already been deleted, and the transaction goes into a retry loop.

In SplitTransaction, transitionZKNode retries tickleNodeSplit after sleeping 100 ms. In my opinion, if the sleep were much longer than 100 ms, all the operations from AssignmentManager would finish completely.
[jira] [Reopened] (HBASE-7101) HBase stuck in Region SPLIT
[ https://issues.apache.org/jira/browse/HBASE-7101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl reopened HBASE-7101:
[jira] [Resolved] (HBASE-7101) HBase stuck in Region SPLIT
[ https://issues.apache.org/jira/browse/HBASE-7101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl resolved HBASE-7101.
    Resolution: Duplicate
[jira] [Updated] (HBASE-6335) Switching log-splitting policy after last failure master start may cause data loss
[ https://issues.apache.org/jira/browse/HBASE-6335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-6335:
    Fix Version/s: (was: 0.94.5) 0.94.6

Is this an issue in 0.94 only (or in 0.96 as well)?

Switching log-splitting policy after last failure master start may cause data loss
    Key: HBASE-6335
    URL: https://issues.apache.org/jira/browse/HBASE-6335
    Project: HBase
    Issue Type: Bug
    Components: master
    Affects Versions: 0.92.1, 0.94.0
    Reporter: chunhui shen
    Assignee: chunhui shen
    Fix For: 0.94.6

How does it happen?

Suppose server A is down and it has three log files, with all the data coming from one region.
File 1: kv01 kv02 kv03
File 2: kv04 kv05 kv06
File 3: kv07 kv08 kv09
Here, kv01 means that its log seqID is 01.

Case: switch to master-local-log-splitting from distributed-log-splitting
1. Master finds that server A is down and starts to split its log files using distributed-log-splitting.
2. It successfully splits log file 2, moves it to oldLogs, and generates one edit file named 06 in the region's recovered.edits dir.
3. Master restarts, changes the log-splitting policy to master-local-log-splitting, and starts to split file 1 and file 3.
4. It successfully splits log file 1 and file 3, and generates one edit file named 09 in the region's recovered.edits dir.
5. The region replays edits from edit files 06 and 09. The region's seqID is 06 after it replays the edits from 06, so when replaying the edits from 09 it skips kv01, kv02, and kv03, and this data is lost.

As in the case above, switching to distributed-log-splitting from master-local-log-splitting could also cause data loss.

Should we fix this bug or avoid the case? I'm not sure...
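The skip in step 5 can be reproduced with a toy model. The sketch below is not HBase code: the class name `ReplaySkipDemo` and the method `replay` are hypothetical, and it assumes only what the report states, namely that edit files are replayed in name order and that an edit is skipped when its seqID is not greater than the region's current seqID.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of replaying recovered-edits files in name order (hypothetical, not HBase code). */
class ReplaySkipDemo {
    /**
     * Replays each edits file in order and skips any edit whose sequence id
     * is not greater than the region's sequence id, which is advanced after
     * each file has been replayed.
     */
    static List<Integer> replay(int[][] editFiles) {
        List<Integer> applied = new ArrayList<>();
        long regionSeqId = 0; // highest sequence id applied by earlier files
        for (int[] file : editFiles) {
            long maxInFile = regionSeqId;
            for (int seqId : file) {
                if (seqId <= regionSeqId) {
                    continue; // assumed to be persisted already, so skipped
                }
                applied.add(seqId);
                maxInFile = Math.max(maxInFile, seqId);
            }
            regionSeqId = maxInFile;
        }
        return applied;
    }

    public static void main(String[] args) {
        int[] file06 = {4, 5, 6};          // produced from log file 2
        int[] file09 = {1, 2, 3, 7, 8, 9}; // produced from log files 1 and 3
        System.out.println(replay(new int[][] {file06, file09})); // prints [4, 5, 6, 7, 8, 9]
    }
}
```

kv01 to kv03 never appear in the result, which matches the data loss described in the report.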
[jira] [Updated] (HBASE-6469) Failure on enable/disable table will cause table state in zk to be left as enabling/disabling until master is restart
[ https://issues.apache.org/jira/browse/HBASE-6469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-6469:
    Fix Version/s: (was: 0.94.5) 0.94.6

Any update? Moving out to 0.94.6

Failure on enable/disable table will cause table state in zk to be left as enabling/disabling until master is restart
    Key: HBASE-6469
    URL: https://issues.apache.org/jira/browse/HBASE-6469
    Project: HBase
    Issue Type: Bug
    Affects Versions: 0.94.2, 0.96.0
    Reporter: Enis Soztutar
    Assignee: Nick Dimiduk
    Fix For: 0.96.0, 0.94.6
    Attachments: 6469-expose-force-r3.patch

In the Enable/DisableTableHandler code, if something goes wrong in handling, the table state in zk is left as ENABLING / DISABLING. After that we cannot force any more action from the API or CLI, and the only recovery path is restarting the master.
{code}
if (done) {
  // Flip the table to enabled.
  this.assignmentManager.getZKTable().setEnabledTable(
    this.tableNameStr);
  LOG.info("Table '" + this.tableNameStr
    + "' was successfully enabled. Status: done=" + done);
} else {
  LOG.warn("Table '" + this.tableNameStr
    + "' wasn't successfully enabled. Status: done=" + done);
}
{code}
Here, if done is false, the table state is not changed. There is also no way to set skipTableStateCheck from the CLI / API. We have run into this issue a couple of times before.
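The stuck state can be illustrated in isolation. This is a minimal sketch, not the handler code: `TableStateDemo`, its `State` enum, and both methods are hypothetical names, and the rollback variant is only one possible behaviour, assumed here for contrast; it is not the fix attached to the issue.

```java
/** Toy model of the enable-table outcome (hypothetical, not HBase code). */
class TableStateDemo {
    enum State { DISABLED, ENABLING, ENABLED }

    /** Mirrors the quoted branch: the state only changes when done is true. */
    static State finishEnable(boolean done) {
        State state = State.ENABLING; // set before regions are assigned
        if (done) {
            state = State.ENABLED;
        }
        // When done is false nothing resets the state, so it stays ENABLING.
        return state;
    }

    /** Assumed alternative for contrast: fall back to DISABLED on failure. */
    static State finishEnableWithRollback(boolean done) {
        return done ? State.ENABLED : State.DISABLED;
    }

    public static void main(String[] args) {
        System.out.println(finishEnable(false));             // prints ENABLING
        System.out.println(finishEnableWithRollback(false)); // prints DISABLED
    }
}
```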
[jira] [Commented] (HBASE-5664) CP hooks in Scan flow for fast forward when filter filters out a row
[ https://issues.apache.org/jira/browse/HBASE-5664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563763#comment-13563763 ]

Lars Hofhansl commented on HBASE-5664:

Thanks Anoop. Will do as soon as I get to it (the next two days or so)

CP hooks in Scan flow for fast forward when filter filters out a row
    Key: HBASE-5664
    URL: https://issues.apache.org/jira/browse/HBASE-5664
    Project: HBase
    Issue Type: Improvement
    Components: Coprocessors, Filters
    Affects Versions: 0.92.1
    Reporter: Anoop Sam John
    Assignee: Anoop Sam John
    Fix For: 0.96.0, 0.94.5
    Attachments: HBASE-5664_94.patch, HBASE-5664_94_V2.patch, HBASE-5664_94_V3.patch, HBASE-5664_Trunk.patch, HBASE-5664_Trunk_V2.patch

In HRegion.nextInternal(int limit, String metric) we have a while(true) loop that fetches the next result satisfying the filter condition. When the Filter filters out the currently fetched row, we call nextRow(byte [] currentRow) before going on to the next row.
{code}
if (results.isEmpty() || filterRow()) {
  // this seems like a redundant step - we already consumed the row
  // there're no left overs.
  // the reasons for calling this method are:
  // 1. reset the filters.
  // 2. provide a hook to fast forward the row (used by subclasses)
  nextRow(currentRow);
{code}
// 2. provide a hook to fast forward the row (used by subclasses)
We can provide the same fast forward support for the CP as well.
[jira] [Commented] (HBASE-7545) [replication] Break out TestReplication into manageable classes
[ https://issues.apache.org/jira/browse/HBASE-7545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563764#comment-13563764 ]

Lars Hofhansl commented on HBASE-7545:

Still wanna make a 0.94 patch, [~jdcryans]?

[replication] Break out TestReplication into manageable classes
    Key: HBASE-7545
    URL: https://issues.apache.org/jira/browse/HBASE-7545
    Project: HBase
    Issue Type: Bug
    Reporter: Jean-Daniel Cryans
    Assignee: Jean-Daniel Cryans
    Fix For: 0.96.0
    Attachments: HBASE-7545.patch

This has been discussed before, but after trying to debug the last failure on Jenkins where I saw the time go back and forth (if you don't care about your own sanity do check out https://builds.apache.org/job/HBase-TRUNK/3726/testReport/junit/org.apache.hadoop.hbase.replication/TestReplicationWithCompression/testDeleteTypes/) I think it is time to break out TestReplication. The difficulty is that the setup for the 2 clusters is a lot of code I don't want to duplicate. I'm thinking that we can keep {{setUpBeforeClass}} there and have the other classes extend TestReplication (which should also change name). I'm thinking of the following new classes:
- TestReplicationSmallTests, contains the easy methods that don't mess around too much.
- TestReplicationQueueFailover, contains one test of the same name
- TestReplicationDisableInactivePeer, contains one test of the same name
- Rename TestReplicationWithCompression to TestReplicationQueueFailoverWithCompression and make it extend TestReplicationQueueFailover.
[jira] [Updated] (HBASE-7111) hbase zkcli will not start if the zookeeper server choosen to connectted to is not available
[ https://issues.apache.org/jira/browse/HBASE-7111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-7111:
    Fix Version/s: (was: 0.94.5) 0.94.6

hbase zkcli will not start if the zookeeper server choosen to connectted to is not available
    Key: HBASE-7111
    URL: https://issues.apache.org/jira/browse/HBASE-7111
    Project: HBase
    Issue Type: Bug
    Components: Zookeeper
    Affects Versions: 0.94.2
    Reporter: Zhou wenjian
    Assignee: Zhou wenjian
    Fix For: 0.94.6
    Attachments: HBASE-7111-trunk.patch, HBASE-7111-trunk-v2.patch

There are 3 ZooKeeper servers in my cluster: s1, s2, s3. After killing s3, I found that hbase zkcli will not start again. It tries to connect to s3 continually.

12/11/07 11:01:01 INFO zookeeper.ClientCnxn: Opening socket connection to server s3
12/11/07 11:01:01 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused

From the code:
{code}
public String parse(final Configuration c) {
  // Note that we do not simply grab the property
  // HConstants.ZOOKEEPER_QUORUM from the HBaseConfiguration because the
  // user may be using a zoo.cfg file.
  Properties zkProps = ZKConfig.makeZKProps(c);
  String host = null;
  String clientPort = null;
  for (Entry<Object, Object> entry : zkProps.entrySet()) {
    String key = entry.getKey().toString().trim();
    String value = entry.getValue().toString().trim();
    if (key.startsWith("server.") && host == null) {
      String[] parts = value.split(":");
      host = parts[0];
    } else if (key.endsWith("clientPort")) {
      clientPort = value;
    }
    if (host != null && clientPort != null) break;
  }
  return host != null && clientPort != null ? host + ":" + clientPort : null;
}
{code}
The code always chooses one fixed ZooKeeper server (here the unavailable s3), which causes the script to fail.
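One way to avoid depending on a single server is to hand the client every quorum member, since a ZooKeeper connect string may list several comma-separated host:port pairs. The sketch below is an illustration under that assumption and is not the attached patch: `QuorumStringDemo` and `connectString` are hypothetical names, and the properties are built by hand instead of through ZKConfig.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Builds a connect string from all server.N entries (hypothetical, not the HBase patch). */
class QuorumStringDemo {
    static String connectString(Properties zkProps) {
        String clientPort = null;
        // TreeMap keeps the server.N keys in a stable, sorted order.
        Map<String, String> servers = new TreeMap<>();
        for (Map.Entry<Object, Object> entry : zkProps.entrySet()) {
            String key = entry.getKey().toString().trim();
            String value = entry.getValue().toString().trim();
            if (key.startsWith("server.")) {
                servers.put(key, value.split(":")[0]); // keep the host part only
            } else if (key.endsWith("clientPort")) {
                clientPort = value;
            }
        }
        if (servers.isEmpty() || clientPort == null) {
            return null;
        }
        List<String> hostPorts = new ArrayList<>();
        for (String host : servers.values()) {
            hostPorts.add(host + ":" + clientPort);
        }
        return String.join(",", hostPorts);
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        p.setProperty("clientPort", "2181");
        p.setProperty("server.1", "s1:2888:3888");
        p.setProperty("server.2", "s2:2888:3888");
        p.setProperty("server.3", "s3:2888:3888");
        System.out.println(connectString(p)); // prints s1:2181,s2:2181,s3:2181
    }
}
```

With all three members in the string, the client can fall back to s1 or s2 when s3 is down.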
[jira] [Updated] (HBASE-5323) Need to handle assertion error while splitting log through ServerShutDownHandler by shutting down the master
[ https://issues.apache.org/jira/browse/HBASE-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-5323:
    Fix Version/s: (was: 0.94.5)

Removing from 0.94 since there is no movement.

Need to handle assertion error while splitting log through ServerShutDownHandler by shutting down the master
    Key: HBASE-5323
    URL: https://issues.apache.org/jira/browse/HBASE-5323
    Project: HBase
    Issue Type: Bug
    Affects Versions: 0.90.5
    Reporter: ramkrishna.s.vasudevan
    Assignee: ramkrishna.s.vasudevan
    Fix For: 0.90.8
    Attachments: HBASE-5323.patch, HBASE-5323.patch

We know that while parsing the HLog we expect the proper length from HDFS. In WALReaderFSDataInputStream:
{code}
assert(realLength >= this.length);
{code}
We bail out if the above condition is not satisfied. But if SSH.splitLog() hits this problem, the error lands in the run method of EventHandler. This kills the SSH thread, so further assignment does not happen. If ROOT and META need to be assigned, they cannot be. I think in this condition we should abort the master by catching such exceptions. Please do suggest.
[jira] [Created] (HBASE-7684) NullPointerException in SecureClient when Call is cleaned up due to RPC timeout
cuijianwei created HBASE-7684:
    Summary: NullPointerException in SecureClient when Call is cleaned up due to RPC timeout
    Key: HBASE-7684
    URL: https://issues.apache.org/jira/browse/HBASE-7684
    Project: HBase
    Issue Type: Bug
    Components: security
    Affects Versions: 0.94.3
    Reporter: cuijianwei

We see a NullPointerException when using SecureClient to access HBase. From the source code, we find that receiveResponse() does not check call != null before invoking call.setValue(...) (lines 378 and 380 in SecureClient.java). However, as explained in receiveResponse() of HBaseClient, the call may have been cleaned up due to an RPC timeout; therefore, it would be better to check call != null before invoking call.setValue(...) in SecureClient.
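The proposed guard can be sketched independently of the RPC classes. Everything below is hypothetical (`NullSafeResponseDemo`, its nested `Call`, and the `calls` map are illustrative stand-ins, not the SecureClient source); it only shows the shape of a null check before setValue when a timeout may already have removed the call.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model of a response handler that tolerates cleaned-up calls (hypothetical). */
class NullSafeResponseDemo {
    static class Call {
        Object value;
        void setValue(Object v) { value = v; }
    }

    // Pending calls by id; a timeout handler is assumed to remove entries.
    final Map<Integer, Call> calls = new ConcurrentHashMap<>();

    /** Returns true if the response was delivered, false if the call was already gone. */
    boolean receiveResponse(int id, Object value) {
        Call call = calls.remove(id); // null when the call was cleaned up earlier
        if (call == null) {
            return false; // nothing to complete, and no NullPointerException
        }
        call.setValue(value);
        return true;
    }

    public static void main(String[] args) {
        NullSafeResponseDemo client = new NullSafeResponseDemo();
        client.calls.put(1, new Call());
        System.out.println(client.receiveResponse(1, "ok"));   // prints true
        System.out.println(client.receiveResponse(2, "late")); // prints false
    }
}
```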
[jira] [Updated] (HBASE-7684) NullPointerException in SecureClient when Call is cleaned up due to RPC timeout
[ https://issues.apache.org/jira/browse/HBASE-7684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

cuijianwei updated HBASE-7684:
    Attachment: check_call_null_before_set_in_secure_client.diff
[jira] [Created] (HBASE-7685) Closed socket connection can't be remove from SecureClient
cuijianwei created HBASE-7685:
    Summary: Closed socket connection can't be remove from SecureClient
    Key: HBASE-7685
    URL: https://issues.apache.org/jira/browse/HBASE-7685
    Project: HBase
    Issue Type: Bug
    Components: security
    Affects Versions: 0.94.3
    Reporter: cuijianwei

We found many IOExceptions with the message "Call # not added as the connection # is closing" when using SecureClient to access HBase. From the source code of SecureClient, we found there may be a bug in its close() method. The following is the current logic in the close() method of SecureClient:

// release the resources
// first thing to do;take the connection out of the connection list
synchronized (connections) {
  if (connections.get(remoteId) == this) {
    connections.remove(remoteId);
  }
}

However, connections are managed by PoolMap; therefore, if more than one connection is created for the same remoteId, the 'if' condition may not be satisfied, because the default PoolType is RoundRobin. As a result, a connection in the closing state may never be removed from connections, and new calls that use such a connection will always throw an IOException saying the connection is closing. We can use the logic from the close() method of HBaseClient to resolve the problem:

// release the resources
// first thing to do;take the connection out of the connection list
synchronized (connections) {
  connections.remove(remoteId, this);
}
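Why the identity check can miss is easier to see with a small round-robin pool. The class below is a simplified stand-in written for this illustration; it is not HBase's PoolMap, and its names and methods are assumptions. With two pooled values under one key, get() returns them alternately, so a `get(key) == this` test succeeds only when it happens to be that value's turn, while remove(key, value) always targets the right entry.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified round-robin pool (hypothetical stand-in, not HBase's PoolMap). */
class RoundRobinPoolDemo<K, V> {
    private final Map<K, List<V>> pools = new HashMap<>();
    private final Map<K, Integer> turns = new HashMap<>();

    void put(K key, V value) {
        pools.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    /** Returns the pooled values for key in rotation, or null if there are none. */
    V get(K key) {
        List<V> pool = pools.get(key);
        if (pool == null || pool.isEmpty()) {
            return null;
        }
        int turn = turns.merge(key, 1, Integer::sum) - 1;
        return pool.get(turn % pool.size());
    }

    /** Removes one specific value, regardless of whose turn it is. */
    boolean remove(K key, V value) {
        List<V> pool = pools.get(key);
        return pool != null && pool.remove(value);
    }

    int size(K key) {
        List<V> pool = pools.get(key);
        return pool == null ? 0 : pool.size();
    }

    public static void main(String[] args) {
        RoundRobinPoolDemo<String, String> connections = new RoundRobinPoolDemo<>();
        connections.put("remote", "conn1");
        connections.put("remote", "conn2");
        connections.get("remote"); // conn1's turn is used up here
        // conn1 is closing, but the lookup now yields conn2, so a get-based check misses it.
        System.out.println(connections.get("remote").equals("conn1")); // prints false
        System.out.println(connections.remove("remote", "conn1"));     // prints true
    }
}
```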
[jira] [Updated] (HBASE-7685) Closed socket connection can't be remove from SecureClient
[ https://issues.apache.org/jira/browse/HBASE-7685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

cuijianwei updated HBASE-7685:
    Attachment: closing_socket_connection_not_removed_in_secure_client.diff
[jira] [Updated] (HBASE-7685) Closing socket connection can't be remove from SecureClient
[ https://issues.apache.org/jira/browse/HBASE-7685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

cuijianwei updated HBASE-7685:
    Summary: Closing socket connection can't be remove from SecureClient (was: Closed socket connection can't be remove from SecureClient)
[jira] [Updated] (HBASE-7685) Closing socket connection can't be remove from SecureClient
[ https://issues.apache.org/jira/browse/HBASE-7685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cuijianwei updated HBASE-7685: -- Description: We found many IOExceptions naming Call # not added as the connection # is closing when using SecureClient to access hbase. From the source code of SecureClient, we found there may be bug in close() method of SecureClient. The following is the current logic in close() method of SecureClient: /**close logic in SecureClient*/ // release the resources // first thing to do;take the connection out of the connection list synchronized (connections) { if (connections.get(remoteId) == this) { connections.remove(remoteId); } } /**close logic in SecureClient*/ However, connections are managed by PoolMap; therefore, if more than one connection are created for the same remoteId, the 'if condition' may not be satisfied because the default PoolType is RoundRobin. This could be cause a closing state connection can't be removed from connections, making new calls which use such connections will always throws IOException naming connection is closing. We can use logic from close() method of HBaseClient to resolve the problem: /**close logic in HBaseClient*/ // release the resources // first thing to do;take the connection out of the connection list synchronized (connections) { connections.remove(remoteId, this); } /**close logic in HBaseClient*/ was: We found many IOExceptions naming Call # not added as the connection # is closing when using SecureClient to access hbase. From the source code of SecureClient, we found there may be bug in close() method of SecureClient. 
The following is the current logic in close() method of SecureClient: // release the resources // first thing to do;take the connection out of the connection list synchronized (connections) { if (connections.get(remoteId) == this) { connections.remove(remoteId); } } However, connections are managed by PoolMap; therefore, if more than one connection are created for the same remoteId, the 'if condition' may not be satisfied because the default PoolType is RoundRobin. This could be cause a closing state connection can't be removed from connections, making new calls which use such connections will always throws IOException naming connection is closing. We can use logic from close() method of HBaseClient to resolve the problem: // release the resources // first thing to do;take the connection out of the connection list synchronized (connections) { connections.remove(remoteId, this); } Closing socket connection can't be remove from SecureClient --- Key: HBASE-7685 URL: https://issues.apache.org/jira/browse/HBASE-7685 Project: HBase Issue Type: Bug Components: security Affects Versions: 0.94.3 Reporter: cuijianwei Attachments: closing_socket_connection_not_removed_in_secure_client.diff We found many IOExceptions naming Call # not added as the connection # is closing when using SecureClient to access hbase. From the source code of SecureClient, we found there may be bug in close() method of SecureClient. The following is the current logic in close() method of SecureClient: /**close logic in SecureClient*/ // release the resources // first thing to do;take the connection out of the connection list synchronized (connections) { if (connections.get(remoteId) == this) { connections.remove(remoteId); } } /**close logic in SecureClient*/ However, connections are managed by PoolMap; therefore, if more than one connection are created for the same remoteId, the 'if condition' may not be satisfied because the default PoolType is RoundRobin. 
This could be cause a closing state connection can't be removed from connections, making new calls which use such connections will always throws IOException naming connection is closing. We can use logic from close() method of HBaseClient to resolve the problem: /**close logic in HBaseClient*/ // release the resources // first thing to do;take the connection out of the connection list synchronized (connections) { connections.remove(remoteId, this); } /**close logic in HBaseClient*/ -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
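The difference between the two close() variants quoted above can be sketched as follows. This is a minimal illustration only: `RoundRobinPool`, `guardedClose` and `directClose` are hypothetical names, and the pool is a simplified stand-in for a round-robin pool, not HBase's actual PoolMap. The guarded variant here removes only the matching value, which is an assumption made to keep the sketch small.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CloseLogicDemo {
    // Hypothetical round-robin pool keyed by remote id (NOT HBase's PoolMap).
    static final class RoundRobinPool<K, V> {
        private final Map<K, List<V>> pools = new HashMap<>();
        private final Map<K, Integer> cursors = new HashMap<>();

        void put(K key, V value) {
            pools.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        }

        // Returns the next pooled value for the key and advances the cursor.
        V get(K key) {
            List<V> pool = pools.get(key);
            if (pool == null || pool.isEmpty()) {
                return null;
            }
            int index = cursors.getOrDefault(key, 0) % pool.size();
            cursors.put(key, index + 1);
            return pool.get(index);
        }

        // Removes exactly the given instance, wherever it sits in the pool.
        boolean remove(K key, V value) {
            List<V> pool = pools.get(key);
            if (pool == null) {
                return false;
            }
            for (int i = 0; i < pool.size(); i++) {
                if (pool.get(i) == value) {
                    pool.remove(i);
                    return true;
                }
            }
            return false;
        }

        int size(K key) {
            List<V> pool = pools.get(key);
            return pool == null ? 0 : pool.size();
        }
    }

    // Mirrors the guarded pattern: remove only if get() happens to return us.
    static boolean guardedClose(RoundRobinPool<String, Object> connections,
                                String remoteId, Object self) {
        if (connections.get(remoteId) == self) {
            return connections.remove(remoteId, self);
        }
        return false;
    }

    // Mirrors the two-argument pattern: remove this instance unconditionally.
    static boolean directClose(RoundRobinPool<String, Object> connections,
                               String remoteId, Object self) {
        return connections.remove(remoteId, self);
    }

    public static void main(String[] args) {
        RoundRobinPool<String, Object> connections = new RoundRobinPool<>();
        Object first = new Object();
        Object second = new Object();
        connections.put("rs1", first);
        connections.put("rs1", second);

        // get() hands back 'first', so the guard misses 'second'.
        System.out.println("guarded removed second: "
            + guardedClose(connections, "rs1", second));
        System.out.println("direct removed second: "
            + directClose(connections, "rs1", second));
    }
}
```

With two pooled connections for the same id, whether the guard matches depends on where the round-robin cursor happens to be, which is why a closing connection can be left behind.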
[jira] [Updated] (HBASE-7685) Closing socket connection can't be remove from SecureClient
[ https://issues.apache.org/jira/browse/HBASE-7685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cuijianwei updated HBASE-7685: -- Description: We found many IOExceptions naming Call # not added as the connection # is closing when using SecureClient to access hbase. From the source code of SecureClient, we found there may be bug in close() method of SecureClient. The following is the current logic in close() method of SecureClient: /**close logic in SecureClient*/ // release the resources // first thing to do;take the connection out of the connection list synchronized (connections) { if (connections.get(remoteId) == this) { connections.remove(remoteId); } } However, connections are managed by PoolMap; therefore, if more than one connection are created for the same remoteId, the 'if condition' may not be satisfied because the default PoolType is RoundRobin. This could be cause a closing state connection can't be removed from connections, making new calls which use such connections will always throws IOException naming connection is closing. We can use logic from close() method of HBaseClient to resolve the problem: /**close logic in HBaseClient*/ // release the resources // first thing to do;take the connection out of the connection list synchronized (connections) { connections.remove(remoteId, this); } was: We found many IOExceptions naming Call # not added as the connection # is closing when using SecureClient to access hbase. From the source code of SecureClient, we found there may be bug in close() method of SecureClient. 
The following is the current logic in close() method of SecureClient: /**close logic in SecureClient*/ // release the resources // first thing to do;take the connection out of the connection list synchronized (connections) { if (connections.get(remoteId) == this) { connections.remove(remoteId); } } /**close logic in SecureClient*/ However, connections are managed by PoolMap; therefore, if more than one connection are created for the same remoteId, the 'if condition' may not be satisfied because the default PoolType is RoundRobin. This could be cause a closing state connection can't be removed from connections, making new calls which use such connections will always throws IOException naming connection is closing. We can use logic from close() method of HBaseClient to resolve the problem: /**close logic in HBaseClient*/ // release the resources // first thing to do;take the connection out of the connection list synchronized (connections) { connections.remove(remoteId, this); } /**close logic in HBaseClient*/ Closing socket connection can't be remove from SecureClient --- Key: HBASE-7685 URL: https://issues.apache.org/jira/browse/HBASE-7685 Project: HBase Issue Type: Bug Components: security Affects Versions: 0.94.3 Reporter: cuijianwei Attachments: closing_socket_connection_not_removed_in_secure_client.diff We found many IOExceptions naming Call # not added as the connection # is closing when using SecureClient to access hbase. From the source code of SecureClient, we found there may be bug in close() method of SecureClient. 
The following is the current logic in close() method of SecureClient: /**close logic in SecureClient*/ // release the resources // first thing to do;take the connection out of the connection list synchronized (connections) { if (connections.get(remoteId) == this) { connections.remove(remoteId); } } However, connections are managed by PoolMap; therefore, if more than one connection are created for the same remoteId, the 'if condition' may not be satisfied because the default PoolType is RoundRobin. This could be cause a closing state connection can't be removed from connections, making new calls which use such connections will always throws IOException naming connection is closing. We can use logic from close() method of HBaseClient to resolve the problem: /**close logic in HBaseClient*/ // release the resources // first thing to do;take the connection out of the connection list synchronized (connections) { connections.remove(remoteId, this); } -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7681) Address some recent random test failures
[ https://issues.apache.org/jira/browse/HBASE-7681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563795#comment-13563795 ]

Hudson commented on HBASE-7681:
---
Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #378 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/378/])
HBASE-7681 Addendum, close tables in TestRegionServerMetrics (Revision 1439026)
HBASE-7681 Address some recent random test failures (Revision 1439003)

Result = FAILURE

larsh :
Files :
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServerMetrics.java

larsh :
Files :
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/TestNodeHealthCheckChore.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestLruBlockCache.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportTsv.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestSplitLogManager.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitLogWorker.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/thrift/TestThriftServerCmdLine.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestMiniClusterLoadSequential.java

Address some recent random test failures
Key: HBASE-7681
URL: https://issues.apache.org/jira/browse/HBASE-7681
Project: HBase
Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Fix For: 0.96.0, 0.94.5
Attachments: 7681-0.94-addendum.txt, 7681-0.94-combined.txt, 7681-0.94-combined_v2.txt, 7681-0.94.txt, 7681-0.96-addendum.txt, 7681-0.96-combined.txt, 7681-0.96-combined.txt, 7681-94-v1.txt, 7681-94-v2.txt, 7681-94-v3.txt, 7681-trunk-v1.txt

I've seen many unspecific test failures recently that cannot be reproduced locally, even when running these tests in a loop for a very long time. Many of these tests, one way or another, make assumptions w.r.t. wall clock time. While I cannot fix that, one option is to increase some of these timeouts a bit. This issue is to remind me to do that.

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7684) NullPointerException in SecureClient when Call is cleaned up due to RPC timeout
[ https://issues.apache.org/jira/browse/HBASE-7684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563820#comment-13563820 ]

Ted Yu commented on HBASE-7684:
---
Patch looks good. If you can show us the NullPointerException stack trace, that would be nice.
{code}
- call.setValue(value);
+ if (call != null) {
{code}
You can see that the start of the if statement doesn't align with the call.setValue() statement. Please use dev-support/hbase_eclipse_formatter.xml from trunk so that the code style aligns with the guidelines.

NullPointerException in SecureClient when Call is cleaned up due to RPC timeout
Key: HBASE-7684
URL: https://issues.apache.org/jira/browse/HBASE-7684
Project: HBase
Issue Type: Bug
Components: security
Affects Versions: 0.94.3
Reporter: cuijianwei
Attachments: check_call_null_before_set_in_secure_client.diff

We find a NullPointerException when using SecureClient to access hbase. From the source code, we find that receiveResponse() does not check call != null before invoking call.setValue(...) (lines 378 and 380 in SecureClient.java). However, as explained in receiveResponse() of HBaseClient, the call may have been cleaned up due to RPC timeout; therefore, it would be better to check call != null before invoking call.setValue(...) in SecureClient.

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
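The null check discussed above can be illustrated with a small sketch. The class and method names below (`CallTableSketch`, `register`, `cleanupTimedOut`) are hypothetical and only loosely modeled on the description; this is not the SecureClient source. It assumes pending calls are tracked in a map and that a timeout path may remove a call before its response arrives.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class CallTableSketch {
    // Hypothetical pending call; holds the response once it arrives.
    static final class Call {
        private Object value;
        private boolean done;

        synchronized void setValue(Object newValue) {
            value = newValue;
            done = true;
            notifyAll(); // wake any caller blocked waiting for the response
        }

        synchronized boolean isDone() {
            return done;
        }
    }

    private final Map<Integer, Call> calls = new ConcurrentHashMap<>();

    Call register(int id) {
        Call call = new Call();
        calls.put(id, call);
        return call;
    }

    // Simulates the timeout path that cleans up a pending call.
    void cleanupTimedOut(int id) {
        calls.remove(id);
    }

    // Returns true if the response was delivered, false if the call was
    // already cleaned up. Without the null check, the late-response case
    // would throw a NullPointerException.
    boolean receiveResponse(int id, Object value) {
        Call call = calls.remove(id);
        if (call != null) {
            call.setValue(value);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        CallTableSketch table = new CallTableSketch();
        table.register(1);
        System.out.println("on-time response delivered: " + table.receiveResponse(1, "ok"));

        table.register(2);
        table.cleanupTimedOut(2);
        System.out.println("late response delivered: " + table.receiveResponse(2, "late"));
    }
}
```

Dropping a late response silently is one possible policy; logging it would be another reasonable choice.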
[jira] [Created] (HBASE-7686) TestSplitTransactionOnCluster fails occasionally in trunk builds
Ted Yu created HBASE-7686:
-
Summary: TestSplitTransactionOnCluster fails occasionally in trunk builds
Key: HBASE-7686
URL: https://issues.apache.org/jira/browse/HBASE-7686
Project: HBase
Issue Type: Bug
Reporter: Ted Yu
Priority: Critical
Fix For: 0.96.0

From trunk build #3808:
{code}
testShouldFailSplitIfZNodeDoesNotExistDueToPrevRollBack(org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster): test timed out after 2 milliseconds
testMasterRestartWhenSplittingIsPartial(org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster): test timed out after 30 milliseconds
testExistingZnodeBlocksSplitAndWeRollback(org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster): test timed out after 30 milliseconds
{code}
From HBase-TRUNK-on-Hadoop-2.0.0 #378 :
{code}
testShutdownSimpleFixup(org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster): Region not moved off .META. server
testShouldFailSplitIfZNodeDoesNotExistDueToPrevRollBack(org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster): test timed out after 2 milliseconds
{code}

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7392) Disable failing example unit tests TestZooKeeperScanPolicyObserver and TestBulkDeleteProtocol
[ https://issues.apache.org/jira/browse/HBASE-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-7392:
--
Priority: Critical (was: Major)

This issue prevents trunk artifacts from being generated.

Disable failing example unit tests TestZooKeeperScanPolicyObserver and TestBulkDeleteProtocol
-
Key: HBASE-7392
URL: https://issues.apache.org/jira/browse/HBASE-7392
Project: HBase
Issue Type: Task
Reporter: stack
Priority: Critical
Fix For: 0.96.0
Attachments: 7392.txt

Jenkins run https://builds.apache.org/job/HBase-TRUNK/3638/ turned up two broken example tests. They pass on a jdk6 machine locally but not on my jdk7 laptop. Something's up. My guess is that these failures have been there a while but only surfaced because we got further than we normally do on a jenkins run. Tests have no output on jenkins. If I run w/
$ MAVEN_OPTS= -Xmx3g mvn test -PlocalTests -Dtest=TestBulkDeleteProtocol -Dtest.output.tofile=false
I get:
{code}
--- T E S T S ---
Running org.apache.hadoop.hbase.coprocessor.example.TestBulkDeleteProtocol
2012-12-19 09:34:36,340 INFO [main] hbase.HBaseTestingUtility(713): Starting up minicluster with 1 master(s) and 2 regionserver(s) and 2 datanode(s)
2012-12-19 09:34:46,362 INFO [main] hbase.HBaseTestingUtility(343): Created new mini-cluster data directory: /Users/stack/checkouts/trunk/hbase-examples/target/test-data/cbfdeaed-c701-4a96-9277-0d9b9615a06e/dfscluster_1c4634ed-2333-48ee-807d-c56f8c4ff20f
2012-12-19 09:34:46,362 INFO [main] hbase.HBaseTestingUtility(553): Setting test.cache.data to /Users/stack/checkouts/trunk/hbase-examples/target/test-data/cbfdeaed-c701-4a96-9277-0d9b9615a06e/cache_data in system properties and HBase conf
2012-12-19 09:34:46,363 INFO [main] hbase.HBaseTestingUtility(553): Setting hadoop.tmp.dir to /Users/stack/checkouts/trunk/hbase-examples/target/test-data/cbfdeaed-c701-4a96-9277-0d9b9615a06e/hadoop_tmp in system properties and HBase conf
2012-12-19 09:34:46,363 INFO [main] hbase.HBaseTestingUtility(553): Setting hadoop.log.dir to /Users/stack/checkouts/trunk/hbase-examples/target/test-data/cbfdeaed-c701-4a96-9277-0d9b9615a06e/hadoop_logs in system properties and HBase conf
2012-12-19 09:34:46,364 INFO [main] hbase.HBaseTestingUtility(553): Setting mapred.local.dir to /Users/stack/checkouts/trunk/hbase-examples/target/test-data/cbfdeaed-c701-4a96-9277-0d9b9615a06e/mapred_local in system properties and HBase conf
2012-12-19 09:34:46,364 INFO [main] hbase.HBaseTestingUtility(553): Setting mapred.temp.dir to /Users/stack/checkouts/trunk/hbase-examples/target/test-data/cbfdeaed-c701-4a96-9277-0d9b9615a06e/mapred_temp in system properties and HBase conf
2012-12-19 09:34:46,365 INFO [main] hbase.HBaseTestingUtility(536): read short circuit is ON for user stack
2012-12-19 09:34:46.438 java[16837:1703] Unable to load realm info from SCDynamicStore
2012-12-19 09:34:56,540 DEBUG [main] fs.HFileSystem(199): Starting addLocationsOrderInterceptor with class class org.apache.hadoop.hbase.fs.HFileSystem$ReorderWALBlocks
2012-12-19 09:34:56,541 WARN [main] fs.HFileSystem(215): The file system is not a DistributedFileSystem.Not adding block location reordering
2012-12-19 09:34:56,669 WARN [main] namenode.FSNamesystem(564): The dfs.support.append option is in your configuration, however append is not supported. This configuration option is no longer required to enable sync.
2012-12-19 09:35:06,962 WARN [main] namenode.FSNamesystem(564): The dfs.support.append option is in your configuration, however append is not supported. This configuration option is no longer required to enable sync.
2012-12-19 09:35:07,076 INFO [main] log.Slf4jLog(67): Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2012-12-19 09:35:07,151 INFO [main] log.Slf4jLog(67): jetty-6.1.26
2012-12-19 09:35:07,184 INFO [main] log.Slf4jLog(67): Extract jar:file:/Users/stack/.m2/repository/org/apache/hadoop/hadoop-core/1.1.1/hadoop-core-1.1.1.jar!/webapps/hdfs to /var/folders/bp/2z1cykc92rs6j24251cg__phgp/T/Jetty_localhost_57924_hdfsg9mqyr/webapp
2012-12-19 09:35:07,402 INFO [main] log.Slf4jLog(67): Started SelectChannelConnector@localhost:57924
Starting DataNode 0 with dfs.data.dir:
[jira] [Updated] (HBASE-7685) Closing socket connection can't be removed from SecureClient
[ https://issues.apache.org/jira/browse/HBASE-7685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-7685: -- Description: We found many IOExceptions naming Call # not added as the connection # is closing when using SecureClient to access hbase. From the source code of SecureClient, we found there may be bug in close() method of SecureClient. The following is the current logic in close() method of SecureClient: {code} // release the resources // first thing to do;take the connection out of the connection list synchronized (connections) { if (connections.get(remoteId) == this) { connections.remove(remoteId); } } {code} However, connections are managed by PoolMap; therefore, if more than one connection are created for the same remoteId, the 'if condition' may not be satisfied because the default PoolType is RoundRobin. This could be cause a closing state connection can't be removed from connections, making new calls which use such connections will always throws IOException naming connection is closing. We can use logic from close() method of HBaseClient to resolve the problem: {code} // release the resources // first thing to do;take the connection out of the connection list synchronized (connections) { connections.remove(remoteId, this); } {code} was: We found many IOExceptions naming Call # not added as the connection # is closing when using SecureClient to access hbase. From the source code of SecureClient, we found there may be bug in close() method of SecureClient. 
The following is the current logic in close() method of SecureClient: /**close logic in SecureClient*/ // release the resources // first thing to do;take the connection out of the connection list synchronized (connections) { if (connections.get(remoteId) == this) { connections.remove(remoteId); } } However, connections are managed by PoolMap; therefore, if more than one connection are created for the same remoteId, the 'if condition' may not be satisfied because the default PoolType is RoundRobin. This could be cause a closing state connection can't be removed from connections, making new calls which use such connections will always throws IOException naming connection is closing. We can use logic from close() method of HBaseClient to resolve the problem: /**close logic in HBaseClient*/ // release the resources // first thing to do;take the connection out of the connection list synchronized (connections) { connections.remove(remoteId, this); } Summary: Closing socket connection can't be removed from SecureClient (was: Closing socket connection can't be remove from SecureClient) I think the root cause is in PoolMap. Here is some analysis from Hiroshi-san: https://issues.apache.org/jira/browse/HBASE-6651?focusedCommentId=13448770page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13448770 Closing socket connection can't be removed from SecureClient Key: HBASE-7685 URL: https://issues.apache.org/jira/browse/HBASE-7685 Project: HBase Issue Type: Bug Components: security Affects Versions: 0.94.3 Reporter: cuijianwei Attachments: closing_socket_connection_not_removed_in_secure_client.diff We found many IOExceptions naming Call # not added as the connection # is closing when using SecureClient to access hbase. From the source code of SecureClient, we found there may be bug in close() method of SecureClient. 
The following is the current logic in close() method of SecureClient: {code} // release the resources // first thing to do;take the connection out of the connection list synchronized (connections) { if (connections.get(remoteId) == this) { connections.remove(remoteId); } } {code} However, connections are managed by PoolMap; therefore, if more than one connection are created for the same remoteId, the 'if condition' may not be satisfied because the default PoolType is RoundRobin. This could be cause a closing state connection can't be removed from connections, making new calls which use such connections will always throws IOException naming connection is closing. We can use logic from close() method of HBaseClient to resolve the problem: {code} // release the resources // first thing to do;take the connection out of the connection list synchronized (connections) { connections.remove(remoteId, this); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA
[jira] [Created] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
Lars Hofhansl created HBASE-7687: Summary: TestCatalogTracker.testServerNotRunningIOException fails occasionally Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl I would like to remove this method. It micro-tests a single method and poorly at that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-7687: -- Attachment: 7687-94.txt TestCatalogTracker.testServerNotRunningIOException fails occasionally - Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Fix For: 0.96.0, 0.94.5 Attachments: 7687-94.txt I would like to remove this method. It micro-tests a single method and poorly at that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-7687: -- Attachment: 7687-trunk.txt I looped TestCatalogTracker with 0.94 patch 40 times locally and they passed. TestCatalogTracker.testServerNotRunningIOException fails occasionally - Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Fix For: 0.96.0, 0.94.5 Attachments: 7687-94.txt, 7687-trunk.txt I would like to remove this method. It micro-tests a single method and poorly at that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu reassigned HBASE-7687: - Assignee: Ted Yu TestCatalogTracker.testServerNotRunningIOException fails occasionally - Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Ted Yu Fix For: 0.96.0, 0.94.5 Attachments: 7687-94.txt, 7687-trunk.txt I would like to remove this method. It micro-tests a single method and poorly at that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-7687: -- Status: Patch Available (was: Open) TestCatalogTracker.testServerNotRunningIOException fails occasionally - Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Ted Yu Fix For: 0.96.0, 0.94.5 Attachments: 7687-94.txt, 7687-trunk.txt I would like to remove this method. It micro-tests a single method and poorly at that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563841#comment-13563841 ]

Lars Hofhansl commented on HBASE-7687:
--
This method and code is fraught with race conditions. I reproduced it locally. What happened is this:
# waitForMetaServerConnection started to wait
# the test thread sets a new setMetaLocation
# CatalogTracker.getMetaServerConnection does not find a valid connection and resets the meta location flag
# since the test thread only sets it once, waitForMetaServerConnection will wait forever (until timeout).

Let's just remove this stuff.

TestCatalogTracker.testServerNotRunningIOException fails occasionally
-
Key: HBASE-7687
URL: https://issues.apache.org/jira/browse/HBASE-7687
Project: HBase
Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Ted Yu
Fix For: 0.96.0, 0.94.5
Attachments: 7687-94.txt, 7687-trunk.txt

I would like to remove this method. It micro-tests a single method and poorly at that.

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
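The sequence described in the comment above can be reduced to a small sketch of a signal that is set once and then reset before the waiter observes it. Everything here is hypothetical (`LocationFlag`, `runScenario`, the 200 ms timeout) and it is not the CatalogTracker code; in particular, the reset is performed by the main thread here for determinism, whereas the comment attributes it to getMetaServerConnection.

```java
class MetaLocationRaceSketch {
    // Hypothetical flag standing in for "meta location is available".
    static final class LocationFlag {
        private boolean available;

        synchronized void setLocation() {
            available = true;
            notifyAll();
        }

        synchronized void resetLocation() {
            available = false;
        }

        // Waits until the flag is set or the timeout elapses.
        synchronized boolean waitForLocation(long timeoutMs) throws InterruptedException {
            long deadline = System.currentTimeMillis() + timeoutMs;
            while (!available) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    return false; // timed out without ever seeing the flag set
                }
                wait(remaining);
            }
            return true;
        }
    }

    // Returns whether the waiter saw the location before its timeout.
    static boolean runScenario(boolean resetAfterSet) {
        LocationFlag flag = new LocationFlag();
        boolean[] seen = new boolean[1];
        Thread waiter = new Thread(() -> {
            try {
                seen[0] = flag.waitForLocation(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        waiter.start();
        // Holding the monitor makes set+reset atomic from the waiter's view,
        // so the outcome does not depend on thread scheduling.
        synchronized (flag) {
            flag.setLocation();
            if (resetAfterSet) {
                flag.resetLocation();
            }
        }
        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining waiter", e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("set only, waiter saw location: " + runScenario(false));
        System.out.println("set then reset, waiter saw location: " + runScenario(true));
    }
}
```

Because the location is set only once, nothing ever wakes the waiter with the flag set after the reset, so it can only exit through its timeout.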
[jira] [Commented] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563842#comment-13563842 ] Lars Hofhansl commented on HBASE-7687: -- I looked it as well... After 30 round (or so) it failed. TestCatalogTracker.testServerNotRunningIOException fails occasionally - Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Ted Yu Fix For: 0.96.0, 0.94.5 Attachments: 7687-94.txt, 7687-trunk.txt I would like to remove this method. It micro-tests a single method and poorly at that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Comment Edited] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563842#comment-13563842 ] Lars Hofhansl edited comment on HBASE-7687 at 1/27/13 5:17 PM: --- I looped it as well... After 30 round (or so) it failed. was (Author: lhofhansl): I looked it as well... After 30 round (or so) it failed. TestCatalogTracker.testServerNotRunningIOException fails occasionally - Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Ted Yu Fix For: 0.96.0, 0.94.5 Attachments: 7687-94.txt, 7687-trunk.txt I would like to remove this method. It micro-tests a single method and poorly at that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563843#comment-13563843 ] Lars Hofhansl commented on HBASE-7687: -- Why would the 0.94 patch make it better? waitForMetaServerConnection just called waitForMeta. TestCatalogTracker.testServerNotRunningIOException fails occasionally - Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Ted Yu Fix For: 0.96.0, 0.94.5 Attachments: 7687-94.txt, 7687-trunk.txt I would like to remove this method. It micro-tests a single method and poorly at that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563852#comment-13563852 ]

Ted Yu commented on HBASE-7687:
---
bq. waitForMetaServerConnection just called waitForMeta.

getCachedConnection() comes into play:
{code}
 * @return The HRegionInterface we got when we connected to <code>sn</code>
 * May have come from cache, may not be good, may have been setup by this
 * invocation, or may be null.
...
try {
  protocol = connection.getHRegionConnection(sn.getHostname(), sn.getPort());
{code}
If the test fails with my patch, that means the patch didn't fix the problem :-(

TestCatalogTracker.testServerNotRunningIOException fails occasionally
-
Key: HBASE-7687
URL: https://issues.apache.org/jira/browse/HBASE-7687
Project: HBase
Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Ted Yu
Fix For: 0.96.0, 0.94.5
Attachments: 7687-94.txt, 7687-trunk.txt

I would like to remove this method. It micro-tests a single method and poorly at that.

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7607) Fix TestRegionServerCoprocessorExceptionWithAbort flakiness
[ https://issues.apache.org/jira/browse/HBASE-7607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563858#comment-13563858 ] Himanshu Vashishtha commented on HBASE-7607: I don't think we fixed that in 7681. The 4382 patch doesn't correctly solve the issue raised in 0.94, where the description says that the client went into retrying mode. In trunk, all the sleep/interrupt logic is gone, which is a step in the right direction, but I wonder why the client didn't get an IOException. Fix TestRegionServerCoprocessorExceptionWithAbort flakiness Key: HBASE-7607 URL: https://issues.apache.org/jira/browse/HBASE-7607 Project: HBase Issue Type: Bug Components: Client, test Affects Versions: 0.94.4 Reporter: Himanshu Vashishtha Assignee: Himanshu Vashishtha TestRegionServerCoprocessorExceptionWithAbort sometimes fails on both trunk and 0.94.X. The codebase is different in both. In trunk, table.put() passes even though BuggyRegionCoprocessor failed the transaction: The put should have failed, as the coprocessor is buggy. In 0.94.x, the client retries looking up the root region while the cluster is down and the /hbase znode is no longer present. Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the one configured in the master. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-7687: -- Attachment: (was: 7687-trunk.txt) TestCatalogTracker.testServerNotRunningIOException fails occasionally - Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Ted Yu Fix For: 0.96.0, 0.94.5 Attachments: 7687-94.txt I would like to remove this method. It micro-tests a single method and poorly at that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563860#comment-13563860 ]

Hadoop QA commented on HBASE-7687:
--
{color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12566672/7687-trunk.txt against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 24 new or modified tests.
{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/4205//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4205//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4205//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4205//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4205//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4205//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4205//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/4205//console

This message is automatically generated.

TestCatalogTracker.testServerNotRunningIOException fails occasionally
-
Key: HBASE-7687
URL: https://issues.apache.org/jira/browse/HBASE-7687
Project: HBase
Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Ted Yu
Fix For: 0.96.0, 0.94.5
Attachments: 7687-94.txt

I would like to remove this method. It micro-tests a single method and poorly at that.

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-7687: -- Attachment: 7687-94-v2.txt TestCatalogTracker.testServerNotRunningIOException fails occasionally - Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Ted Yu Fix For: 0.96.0, 0.94.5 Attachments: 7687-94.txt, 7687-94-v2.txt I would like to remove this method. It micro-tests a single method and poorly at that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-7687: -- Attachment: 7687-trunk.txt See if patch v2 is better. I removed the second Threads.sleep(1) call. TestCatalogTracker.testServerNotRunningIOException fails occasionally - Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Ted Yu Fix For: 0.96.0, 0.94.5 Attachments: 7687-94.txt, 7687-94-v2.txt, 7687-trunk.txt I would like to remove this method. It micro-tests a single method and poorly at that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563862#comment-13563862 ] Lars Hofhansl commented on HBASE-7687: -- v1 failed after 39 iterations for me. Trying v2 now. TestCatalogTracker.testServerNotRunningIOException fails occasionally - Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Ted Yu Fix For: 0.96.0, 0.94.5 Attachments: 7687-94.txt, 7687-94-v2.txt, 7687-trunk.txt I would like to remove this method. It micro-tests a single method and poorly at that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563864#comment-13563864 ] Lars Hofhansl commented on HBASE-7687: -- v2 fails always for me. TestCatalogTracker.testServerNotRunningIOException fails occasionally - Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Ted Yu Fix For: 0.96.0, 0.94.5 Attachments: 7687-94.txt, 7687-94-v2.txt, 7687-trunk.txt I would like to remove this method. It micro-tests a single method and poorly at that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7607) Fix TestRegionServerCoprocessorExceptionWithAbort flakiness
[ https://issues.apache.org/jira/browse/HBASE-7607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563865#comment-13563865 ] Lars Hofhansl commented on HBASE-7607: -- Feel free to reopen :) Fix TestRegionServerCoprocessorExceptionWithAbort flakiness Key: HBASE-7607 URL: https://issues.apache.org/jira/browse/HBASE-7607 Project: HBase Issue Type: Bug Components: Client, test Affects Versions: 0.94.4 Reporter: Himanshu Vashishtha Assignee: Himanshu Vashishtha TestRegionServerCoprocessorExceptionWithAbort sometimes fails on both trunk and 0.94.x. The codebase differs between the two. In trunk, table.put() succeeds even though BuggyRegionCoprocessor failed the transaction: "The put should have failed, as the coprocessor is buggy". In 0.94.x, the client retries looking up the root region while the cluster is down and the /hbase znode is no longer present: "Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the one configured in the master." -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563867#comment-13563867 ] Lars Hofhansl commented on HBASE-7687: -- My vote is still to simply remove this test. What it tests is not useful (IMHO) TestCatalogTracker.testServerNotRunningIOException fails occasionally - Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Ted Yu Fix For: 0.96.0, 0.94.5 Attachments: 7687-94.txt, 7687-94-v2.txt, 7687-trunk.txt I would like to remove this method. It micro-tests a single method and poorly at that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-7687: -- Status: Open (was: Patch Available) TestCatalogTracker.testServerNotRunningIOException fails occasionally - Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Ted Yu Fix For: 0.96.0, 0.94.5 Attachments: 7687-94.txt, 7687-94-v2.txt, 7687-trunk.txt I would like to remove this method. It micro-tests a single method and poorly at that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-7687: -- Attachment: 7687-94-v3.txt I stopped the loop for patch v2 after 63 successful iterations. Patch v3 removes the while loop which waits for Thread to come alive because CountDownLatch serves the same purpose. Hopefully the Thread would see the second ct.setMetaLocation(SN) call. TestCatalogTracker.testServerNotRunningIOException fails occasionally - Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Ted Yu Fix For: 0.96.0, 0.94.5 Attachments: 7687-94.txt, 7687-94-v2.txt, 7687-94-v3.txt, 7687-trunk.txt I would like to remove this method. It micro-tests a single method and poorly at that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
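The v3 change described in the message above swaps a polling loop on thread liveness for a CountDownLatch. The following is a minimal, self-contained sketch of that general pattern only; it is not the TestCatalogTracker code, and the class name LatchHandshake, the method startAndAwait, and the workDone flag are hypothetical names chosen for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration of the latch handshake; not taken from the HBase patch.
class LatchHandshake {

    /**
     * Starts a worker thread and blocks until the worker signals that its body is running.
     * Returns true if the signal arrived and the worker finished within the timeout.
     */
    static boolean startAndAwait(long timeoutMillis) {
        final CountDownLatch started = new CountDownLatch(1);
        final AtomicBoolean workDone = new AtomicBoolean(false);

        Thread worker = new Thread(() -> {
            started.countDown();   // signal: the worker body has begun executing
            workDone.set(true);    // stand-in for the real (blocking) call under test
        });
        worker.start();

        try {
            // Takes the place of a loop such as: while (!worker.isAlive()) Thread.sleep(1);
            boolean signalled = started.await(timeoutMillis, TimeUnit.MILLISECONDS);
            worker.join(timeoutMillis);
            return signalled && workDone.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("worker signalled and finished: " + startAndAwait(1000));
    }
}
```

Note that the latch only proves the worker has started running. It says nothing about how far the worker has progressed afterwards, so a test that needs the worker to be inside a particular wait loop still needs a separate signal or a conservative delay.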
[jira] [Commented] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563873#comment-13563873 ] Lars Hofhansl commented on HBASE-7687: -- V2 fails every single time for me, on the very first iteration. TestCatalogTracker.testServerNotRunningIOException fails occasionally - Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Ted Yu Fix For: 0.96.0, 0.94.5 Attachments: 7687-94.txt, 7687-94-v2.txt, 7687-94-v3.txt, 7687-trunk.txt I would like to remove this method. It micro-tests a single method and poorly at that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563874#comment-13563874 ] Lars Hofhansl commented on HBASE-7687: -- I'm trying my bandaid patch from parent now. TestCatalogTracker.testServerNotRunningIOException fails occasionally - Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Ted Yu Fix For: 0.96.0, 0.94.5 Attachments: 7687-94.txt, 7687-94-v2.txt, 7687-94-v3.txt, 7687-trunk.txt I would like to remove this method. It micro-tests a single method and poorly at that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563880#comment-13563880 ] Ted Yu commented on HBASE-7687: --- @Lars: If you can try patch v3, that would be nice. The test, started @ 10:33 this morning, is still looping. TestCatalogTracker.testServerNotRunningIOException fails occasionally - Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Ted Yu Fix For: 0.96.0, 0.94.5 Attachments: 7687-94.txt, 7687-94-v2.txt, 7687-94-v3.txt, 7687-trunk.txt I would like to remove this method. It micro-tests a single method and poorly at that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563883#comment-13563883 ] Hadoop QA commented on HBASE-7687: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12566677/7687-trunk.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/4206//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4206//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4206//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4206//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4206//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4206//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4206//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/4206//console This message is automatically generated. TestCatalogTracker.testServerNotRunningIOException fails occasionally - Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Ted Yu Fix For: 0.96.0, 0.94.5 Attachments: 7687-94.txt, 7687-94-v2.txt, 7687-94-v3.txt, 7687-trunk.txt I would like to remove this method. It micro-tests a single method and poorly at that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563884#comment-13563884 ] Lars Hofhansl commented on HBASE-7687: -- I looped my bandaid patch from parent. Looped it for 100 iterations. Will try v3. TestCatalogTracker.testServerNotRunningIOException fails occasionally - Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Ted Yu Fix For: 0.96.0, 0.94.5 Attachments: 7687-94.txt, 7687-94-v2.txt, 7687-94-v3.txt, 7687-trunk.txt I would like to remove this method. It micro-tests a single method and poorly at that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563885#comment-13563885 ] Lars Hofhansl commented on HBASE-7687: -- Your v3 looks good. Let's change the Thread.sleep(1) to Thread.sleep(100) just to be sure. TestCatalogTracker.testServerNotRunningIOException fails occasionally - Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Ted Yu Fix For: 0.96.0, 0.94.5 Attachments: 7687-94.txt, 7687-94-v2.txt, 7687-94-v3.txt, 7687-trunk.txt I would like to remove this method. It micro-tests a single method and poorly at that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-7687: -- Attachment: 7687-trunk-v3.txt Let's get Hadoop QA result. TestCatalogTracker.testServerNotRunningIOException fails occasionally - Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Ted Yu Fix For: 0.96.0, 0.94.5 Attachments: 7687-94.txt, 7687-94-v2.txt, 7687-94-v3.txt, 7687-trunk.txt, 7687-trunk-v3.txt I would like to remove this method. It micro-tests a single method and poorly at that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563895#comment-13563895 ] Ted Yu commented on HBASE-7687: --- Patch v3 for 0.94 made TestCatalogTracker pass 100 times. TestCatalogTracker.testServerNotRunningIOException fails occasionally - Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Ted Yu Fix For: 0.96.0, 0.94.5 Attachments: 7687-94.txt, 7687-94-v2.txt, 7687-94-v3.txt, 7687-trunk.txt, 7687-trunk-v3.txt I would like to remove this method. It micro-tests a single method and poorly at that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563896#comment-13563896 ] Lars Hofhansl commented on HBASE-7687: -- Looped your v3 100 times as well. Seems fine. I do not think you need to change the call to waitForMetaServerConnection, and it would still be nice to increase the sleep time to 100ms (just to be safe; otherwise waitForRoot could enter the wait loop after the main thread sets the meta location). TestCatalogTracker.testServerNotRunningIOException fails occasionally - Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Ted Yu Fix For: 0.96.0, 0.94.5 Attachments: 7687-94.txt, 7687-94-v2.txt, 7687-94-v3.txt, 7687-trunk.txt, 7687-trunk-v3.txt I would like to remove this method. It micro-tests a single method and poorly at that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563900#comment-13563900 ] Lars Hofhansl commented on HBASE-7687: -- Hmm... When I put waitForMetaServerConnection, I do get it to fail pretty quickly. TestCatalogTracker.testServerNotRunningIOException fails occasionally - Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Ted Yu Fix For: 0.96.0, 0.94.5 Attachments: 7687-94.txt, 7687-94-v2.txt, 7687-94-v3.txt, 7687-trunk.txt, 7687-trunk-v3.txt I would like to remove this method. It micro-tests a single method and poorly at that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563901#comment-13563901 ] Lars Hofhansl commented on HBASE-7687: -- +1 on your v3. TestCatalogTracker.testServerNotRunningIOException fails occasionally - Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Ted Yu Fix For: 0.96.0, 0.94.5 Attachments: 7687-94.txt, 7687-94-v2.txt, 7687-94-v3.txt, 7687-trunk.txt, 7687-trunk-v3.txt I would like to remove this method. It micro-tests a single method and poorly at that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563933#comment-13563933 ] Ted Yu commented on HBASE-7687: --- Integrated to 0.94 and trunk. Thanks for the review, Lars TestCatalogTracker.testServerNotRunningIOException fails occasionally - Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Ted Yu Fix For: 0.96.0, 0.94.5 Attachments: 7687-94.txt, 7687-94-v2.txt, 7687-94-v3.txt, 7687-trunk.txt, 7687-trunk-v3.txt I would like to remove this method. It micro-tests a single method and poorly at that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563938#comment-13563938 ] Hudson commented on HBASE-7687: --- Integrated in HBase-0.94 #791 (See [https://builds.apache.org/job/HBase-0.94/791/]) HBASE-7687 TestCatalogTracker.testServerNotRunningIOException fails occasionally (Ted Yu) (Revision 1439179) Result = ABORTED tedyu : Files : * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java TestCatalogTracker.testServerNotRunningIOException fails occasionally - Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Ted Yu Fix For: 0.96.0, 0.94.5 Attachments: 7687-94.txt, 7687-94-v2.txt, 7687-94-v3.txt, 7687-trunk.txt, 7687-trunk-v3.txt I would like to remove this method. It micro-tests a single method and poorly at that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563941#comment-13563941 ] Hudson commented on HBASE-7687: --- Integrated in HBase-TRUNK #3811 (See [https://builds.apache.org/job/HBase-TRUNK/3811/]) HBASE-7687 TestCatalogTracker.testServerNotRunningIOException fails occasionally (Ted Yu) (Revision 1439180) Result = FAILURE tedyu : Files : * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java TestCatalogTracker.testServerNotRunningIOException fails occasionally - Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Ted Yu Fix For: 0.96.0, 0.94.5 Attachments: 7687-94.txt, 7687-94-v2.txt, 7687-94-v3.txt, 7687-trunk.txt, 7687-trunk-v3.txt I would like to remove this method. It micro-tests a single method and poorly at that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563942#comment-13563942 ] Hudson commented on HBASE-7687: --- Integrated in HBase-0.94-security #101 (See [https://builds.apache.org/job/HBase-0.94-security/101/]) HBASE-7687 TestCatalogTracker.testServerNotRunningIOException fails occasionally (Ted Yu) (Revision 1439179) Result = SUCCESS tedyu : Files : * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java TestCatalogTracker.testServerNotRunningIOException fails occasionally - Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Ted Yu Fix For: 0.96.0, 0.94.5 Attachments: 7687-94.txt, 7687-94-v2.txt, 7687-94-v3.txt, 7687-trunk.txt, 7687-trunk-v3.txt I would like to remove this method. It micro-tests a single method and poorly at that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-7688) Master coprocessor RPCs don't propagate exceptions
Mike Lewis created HBASE-7688: - Summary: Master coprocessor RPCs don't propagate exceptions Key: HBASE-7688 URL: https://issues.apache.org/jira/browse/HBASE-7688 Project: HBase Issue Type: Bug Components: Protobufs Affects Versions: 0.96.0 Reporter: Mike Lewis I was working on some coprocessor tests, and was wondering why exceptions weren't bubbling up while calling checkFailed on my RpcController. It turns out that HMaster's execMasterService seems to be sending the wrong RpcController (which happens to be null in my case) to the methods. I have a fix that appears to work, but I am not familiar enough with the code to know whether it's the right one. The version I am working on has the git sha 9a322b0 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
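The failure mode reported above (an error recorded on a controller other than the caller's, so the caller never sees it) can be illustrated in isolation. The sketch below is an assumption-laden toy, not HBase or protobuf code: Controller, callService, and both dispatch methods are hypothetical stand-ins, and the null check inside callService is assumed purely for illustration.

```java
// Hypothetical illustration only; none of these types come from HBase or protobuf.
class ControllerPropagation {

    /** Stand-in for an RPC controller that records whether a call failed. */
    static final class Controller {
        private String error;
        void setFailed(String reason) { error = reason; }
        boolean failed() { return error != null; }
        String errorText() { return error; }
    }

    /** Stand-in service method: it can only report failure through the controller it receives. */
    static void callService(Controller controller) {
        if (controller != null) {
            controller.setFailed("simulated coprocessor failure");
        }
        // With a null controller the failure has nowhere to go and is silently dropped.
    }

    /** Dispatcher that forwards null instead of the caller's controller. */
    static boolean dispatchWithNullController() {
        Controller callers = new Controller();
        callService(null);            // the failure is lost here
        return callers.failed();      // the caller's controller was never touched
    }

    /** Dispatcher that forwards the caller's own controller. */
    static boolean dispatchWithCallersController() {
        Controller callers = new Controller();
        callService(callers);
        return callers.failed();      // the failure is visible to the caller
    }

    public static void main(String[] args) {
        System.out.println("null controller, failure seen: " + dispatchWithNullController());
        System.out.println("caller's controller, failure seen: " + dispatchWithCallersController());
    }
}
```

A unit test along these lines, asserting that the caller's controller reports the failure, is the kind of check that would confirm whether a fix actually propagates the exception.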
[jira] [Updated] (HBASE-7688) Master coprocessor RPCs don't propagate exceptions
[ https://issues.apache.org/jira/browse/HBASE-7688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Lewis updated HBASE-7688: -- Fix Version/s: 0.96.0 Status: Patch Available (was: Open) Master coprocessor RPCs don't propagate exceptions -- Key: HBASE-7688 URL: https://issues.apache.org/jira/browse/HBASE-7688 Project: HBase Issue Type: Bug Components: Protobufs Affects Versions: 0.96.0 Reporter: Mike Lewis Fix For: 0.96.0 Attachments: fix-hmaster-rpc-exceptions.patch I was working on some coprocessor tests, and was wondering why exceptions weren't bubbling up while calling checkFailed on my RpcController. It turns out that HMaster's execMasterService seems to be sending the wrong RpcController (which happens to be null in my case) to the methods. I have a fix that appears to work, but I am not familiar enough with the code to know whether it's the right one. The version I am working on has the git sha 9a322b0 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7688) Master coprocessor RPCs don't propagate exceptions
[ https://issues.apache.org/jira/browse/HBASE-7688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Lewis updated HBASE-7688: -- Status: Open (was: Patch Available) Master coprocessor RPCs don't propagate exceptions -- Key: HBASE-7688 URL: https://issues.apache.org/jira/browse/HBASE-7688 Project: HBase Issue Type: Bug Components: Protobufs Affects Versions: 0.96.0 Reporter: Mike Lewis Fix For: 0.96.0 Attachments: fix-hmaster-rpc-exceptions.patch I was working on some coprocessor tests, and was wondering why exceptions weren't bubbling up while calling checkFailed on my RpcController. It turns out that HMaster's execMasterService seems to be sending the wrong RpcController (which happens to be null in my case) to the methods. I have a fix that appears to work, but I am not familiar enough with the code to know whether it's the right one. The version I am working on has the git sha 9a322b0 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7688) Master coprocessor RPCs don't propagate exceptions
[ https://issues.apache.org/jira/browse/HBASE-7688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Lewis updated HBASE-7688: -- Attachment: fix-hmaster-rpc-exceptions.patch Potential fix Master coprocessor RPCs don't propagate exceptions -- Key: HBASE-7688 URL: https://issues.apache.org/jira/browse/HBASE-7688 Project: HBase Issue Type: Bug Components: Protobufs Affects Versions: 0.96.0 Reporter: Mike Lewis Fix For: 0.96.0 Attachments: fix-hmaster-rpc-exceptions.patch I was working on some coprocessor tests, and was wondering why exceptions weren't bubbling up while calling checkFailed on my RpcController. It turns out that HMaster's execMasterService seems to be sending the wrong RpcController (which happens to be null in my case) to the methods. I have a fix that appears to work, but I am not familiar enough with the code to know whether it's the right one. The version I am working on has the git sha 9a322b0 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563958#comment-13563958 ] Hudson commented on HBASE-7687: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #379 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/379/]) HBASE-7687 TestCatalogTracker.testServerNotRunningIOException fails occasionally (Ted Yu) (Revision 1439180) Result = FAILURE tedyu : Files : * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java TestCatalogTracker.testServerNotRunningIOException fails occasionally - Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Ted Yu Fix For: 0.96.0, 0.94.5 Attachments: 7687-94.txt, 7687-94-v2.txt, 7687-94-v3.txt, 7687-trunk.txt, 7687-trunk-v3.txt I would like to remove this method. It micro-tests a single method and poorly at that.
[jira] [Commented] (HBASE-7688) Master coprocessor RPCs don't propagate exceptions
[ https://issues.apache.org/jira/browse/HBASE-7688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563960#comment-13563960 ] Ted Yu commented on HBASE-7688: --- Interesting. Are you able to add a unit test that shows the exception being propagated? Thanks Master coprocessor RPCs don't propagate exceptions -- Key: HBASE-7688 URL: https://issues.apache.org/jira/browse/HBASE-7688 Project: HBase Issue Type: Bug Components: Protobufs Affects Versions: 0.96.0 Reporter: Mike Lewis Fix For: 0.96.0 Attachments: fix-hmaster-rpc-exceptions.patch I was working on some coprocessor tests and was wondering why exceptions weren't bubbling up when calling checkFailed on my RpcController. It turns out that HMaster's execMasterService seems to be sending the wrong RpcController (which happens to be null in my case) to the methods. I have a fix that appears to work, but I am not familiar enough with the code to know whether it's the right one. The version I am working on has the git SHA 9a322b0.
[jira] [Commented] (HBASE-7688) Master coprocessor RPCs don't propagate exceptions
[ https://issues.apache.org/jira/browse/HBASE-7688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563970#comment-13563970 ] Ted Yu commented on HBASE-7688: ---

{code}
public ClientProtos.CoprocessorServiceResponse execMasterService(final RpcController controller,
    final ClientProtos.CoprocessorServiceRequest request) throws ServiceException {
  try {
    ServerRpcController execController = new ServerRpcController();
{code}
If controller is an instance of ServerRpcController, do we need to create a new instance?

Master coprocessor RPCs don't propagate exceptions -- Key: HBASE-7688 URL: https://issues.apache.org/jira/browse/HBASE-7688 Project: HBase Issue Type: Bug Components: Protobufs Affects Versions: 0.96.0 Reporter: Mike Lewis Fix For: 0.96.0 Attachments: fix-hmaster-rpc-exceptions.patch

I was working on some coprocessor tests and was wondering why exceptions weren't bubbling up when calling checkFailed on my RpcController. It turns out that HMaster's execMasterService seems to be sending the wrong RpcController (which happens to be null in my case) to the methods. I have a fix that appears to work, but I am not familiar enough with the code to know whether it's the right one. The version I am working on has the git SHA 9a322b0.
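The question in the comment above (reuse the incoming controller when it already has the server-side type, otherwise create one) can be sketched as follows. This is only an illustration under stated assumptions: `Controller`, `ServerController`, `controllerFor` and `failingCall` are hypothetical stand-ins written for this sketch, not the protobuf `RpcController`, the HBase `ServerRpcController`, or anything from the attached patch.

```java
final class ControllerReuseSketch {
    // Hypothetical stand-in for an RPC controller; NOT the protobuf interface.
    interface Controller {
        void setFailed(String reason);
        boolean failed();
    }

    // Hypothetical stand-in for the server-side controller type.
    static final class ServerController implements Controller {
        private String error;
        @Override public void setFailed(String reason) { this.error = reason; }
        @Override public boolean failed() { return error != null; }
    }

    /** Reuse the caller's controller when it already has the server-side type. */
    static ServerController controllerFor(Controller incoming) {
        if (incoming instanceof ServerController) {
            return (ServerController) incoming;
        }
        // Covers a null controller as well as any other implementation.
        return new ServerController();
    }

    /** Hypothetical service call that reports its failure through the controller. */
    static void failingCall(Controller c) {
        c.setFailed("simulated failure");
    }

    public static void main(String[] args) {
        ServerController callers = new ServerController();
        failingCall(controllerFor(callers));
        System.out.println("caller's controller failed: " + callers.failed());

        ServerController fresh = controllerFor(null);
        failingCall(fresh);
        System.out.println("fresh controller failed: " + fresh.failed());
    }
}
```

With reuse, a failure recorded during the call is visible on the object the caller already holds. When a fresh controller is created, the failure still has to be copied back or rethrown by the surrounding code for the caller to see it.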
[jira] [Updated] (HBASE-7688) Master coprocessor RPCs don't propagate exceptions
[ https://issues.apache.org/jira/browse/HBASE-7688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Lewis updated HBASE-7688: -- Attachment: fix-hmaster-rpc-exceptions-2.patch Same patch, with a test case for Master that fails before the fix and passes after it. Also added a test showing that region RPC error handling currently works correctly. Master coprocessor RPCs don't propagate exceptions -- Key: HBASE-7688 URL: https://issues.apache.org/jira/browse/HBASE-7688 Project: HBase Issue Type: Bug Components: Protobufs Affects Versions: 0.96.0 Reporter: Mike Lewis Fix For: 0.96.0 Attachments: fix-hmaster-rpc-exceptions-2.patch, fix-hmaster-rpc-exceptions.patch I was working on some coprocessor tests and was wondering why exceptions weren't bubbling up when calling checkFailed on my RpcController. It turns out that HMaster's execMasterService seems to be sending the wrong RpcController (which happens to be null in my case) to the methods. I have a fix that appears to work, but I am not familiar enough with the code to know whether it's the right one. The version I am working on has the git SHA 9a322b0.
[jira] [Created] (HBASE-7689) CloneTableHandler notify completion too early
Matteo Bertozzi created HBASE-7689: -- Summary: CloneTableHandler notify completion too early Key: HBASE-7689 URL: https://issues.apache.org/jira/browse/HBASE-7689 Project: HBase Issue Type: Sub-task Components: snapshots Affects Versions: hbase-6055 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Fix For: hbase-6055, 0.96.0

With HBASE-7365, the clone table handler notifies that the clone is finished before the base create table handler has moved the table folder into place and added the regions to .META. If the client calls isRestoreDone() before the META+disk update, it gets an exception like this:
{code}
org.apache.hadoop.hbase.TableNotFoundException: clonedtb-1359268409309
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:991)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:856)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:813)
    at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:234)
    at org.apache.hadoop.hbase.client.HTable.init(HTable.java:174)
    at org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClient.verifyRowCount(TestRestoreSnapshotFromClient.java:387)
    at org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClient.testCloneSnapshot(TestRestoreSnapshotFromClient.java:259)
    at org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClient.testCloneSnapshot(TestRestoreSnapshotFromClient.java:251)
{code}
[jira] [Updated] (HBASE-7689) CloneTableHandler notify completion too early
[ https://issues.apache.org/jira/browse/HBASE-7689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-7689: --- Attachment: HBASE-7689-v0.patch

Added two methods, completed() and failed(), to the CreateTableHandler. I think that in the long term we should add these two methods to the base EventHandler and find a way to notify the client based on them.

CloneTableHandler notify completion too early - Key: HBASE-7689 URL: https://issues.apache.org/jira/browse/HBASE-7689 Project: HBase Issue Type: Sub-task Components: snapshots Affects Versions: hbase-6055 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Fix For: hbase-6055, 0.96.0 Attachments: HBASE-7689-v0.patch

With HBASE-7365, the clone table handler notifies that the clone is finished before the base create table handler has moved the table folder into place and added the regions to .META. If the client calls isRestoreDone() before the META+disk update, it gets an exception like this:
{code}
org.apache.hadoop.hbase.TableNotFoundException: clonedtb-1359268409309
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:991)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:856)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:813)
    at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:234)
    at org.apache.hadoop.hbase.client.HTable.init(HTable.java:174)
    at org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClient.verifyRowCount(TestRestoreSnapshotFromClient.java:387)
    at org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClient.testCloneSnapshot(TestRestoreSnapshotFromClient.java:259)
    at org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClient.testCloneSnapshot(TestRestoreSnapshotFromClient.java:251)
{code}
[jira] [Commented] (HBASE-7689) CloneTableHandler notify completion too early
[ https://issues.apache.org/jira/browse/HBASE-7689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563989#comment-13563989 ] Ted Yu commented on HBASE-7689: ---

{code}
+    completed();
     if (cpHost != null) {
       cpHost.postCreateTableHandler(this.hTableDescriptor, this.newRegions);
     }
{code}
What if the call to postCreateTableHandler() results in an exception? I think the call to completed() should be moved after the call to postCreateTableHandler(). Otherwise the patch looks good.

CloneTableHandler notify completion too early - Key: HBASE-7689 URL: https://issues.apache.org/jira/browse/HBASE-7689 Project: HBase Issue Type: Sub-task Components: snapshots Affects Versions: hbase-6055 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Fix For: hbase-6055, 0.96.0 Attachments: HBASE-7689-v0.patch

With HBASE-7365, the clone table handler notifies that the clone is finished before the base create table handler has moved the table folder into place and added the regions to .META. If the client calls isRestoreDone() before the META+disk update, it gets an exception like this:
{code}
org.apache.hadoop.hbase.TableNotFoundException: clonedtb-1359268409309
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:991)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:856)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:813)
    at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:234)
    at org.apache.hadoop.hbase.client.HTable.init(HTable.java:174)
    at org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClient.verifyRowCount(TestRestoreSnapshotFromClient.java:387)
    at org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClient.testCloneSnapshot(TestRestoreSnapshotFromClient.java:259)
    at org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClient.testCloneSnapshot(TestRestoreSnapshotFromClient.java:251)
{code}
[jira] [Commented] (HBASE-7689) CloneTableHandler notify completion too early
[ https://issues.apache.org/jira/browse/HBASE-7689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563990#comment-13563990 ] Matteo Bertozzi commented on HBASE-7689:

Nothing removes the table on failure, so if the coprocessor fails the table is still there. That's why the call to completed() comes before the coprocessor.

CloneTableHandler notify completion too early - Key: HBASE-7689 URL: https://issues.apache.org/jira/browse/HBASE-7689 Project: HBase Issue Type: Sub-task Components: snapshots Affects Versions: hbase-6055 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Fix For: hbase-6055, 0.96.0 Attachments: HBASE-7689-v0.patch

With HBASE-7365, the clone table handler notifies that the clone is finished before the base create table handler has moved the table folder into place and added the regions to .META. If the client calls isRestoreDone() before the META+disk update, it gets an exception like this:
{code}
org.apache.hadoop.hbase.TableNotFoundException: clonedtb-1359268409309
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:991)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:856)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:813)
    at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:234)
    at org.apache.hadoop.hbase.client.HTable.init(HTable.java:174)
    at org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClient.verifyRowCount(TestRestoreSnapshotFromClient.java:387)
    at org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClient.testCloneSnapshot(TestRestoreSnapshotFromClient.java:259)
    at org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClient.testCloneSnapshot(TestRestoreSnapshotFromClient.java:251)
{code}
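The two orderings discussed in this thread (signal completion before the post-create hook, or after it) behave differently when the hook throws. A minimal sketch, assuming a hook that always fails; `Handler`, `throwingHook`, `completeThenHook` and `hookThenComplete` are hypothetical names for illustration and are not taken from the patch or from CreateTableHandler.

```java
final class CompletionOrderSketch {
    /** Hypothetical stand-in for a handler that records what was signalled. */
    static final class Handler {
        boolean completed;
        boolean hookError;
        void completed() { completed = true; }
    }

    /** Hypothetical post-create hook that always throws, like a failing coprocessor. */
    static void throwingHook() {
        throw new IllegalStateException("hook failed");
    }

    /** Ordering as quoted from the v0 patch: signal completion, then run the hook. */
    static Handler completeThenHook() {
        Handler h = new Handler();
        try {
            h.completed();
            throwingHook();
        } catch (RuntimeException e) {
            h.hookError = true; // completion was already signalled
        }
        return h;
    }

    /** Ordering suggested in review: run the hook, then signal completion. */
    static Handler hookThenComplete() {
        Handler h = new Handler();
        try {
            throwingHook();
            h.completed(); // never reached when the hook throws
        } catch (RuntimeException e) {
            h.hookError = true;
        }
        return h;
    }

    public static void main(String[] args) {
        Handler a = completeThenHook();
        Handler b = hookThenComplete();
        System.out.println("complete-then-hook: completed=" + a.completed + " hookError=" + a.hookError);
        System.out.println("hook-then-complete: completed=" + b.completed + " hookError=" + b.hookError);
    }
}
```

Which ordering fits depends on what "completed" is meant to tell the client. If it means "the table exists and is usable", signalling before the hook matches the reply above, since nothing removes the table when the hook fails.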
[jira] [Updated] (HBASE-7624) Backport HBASE-5359 and HBASE-7596 to 0.94
[ https://issues.apache.org/jira/browse/HBASE-7624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-7624: - Fix Version/s: (was: 0.94.5) 0.94.6 HBASE-5359 needs a lot of hand merging. Moving out unless somebody volunteers. Backport HBASE-5359 and HBASE-7596 to 0.94 -- Key: HBASE-7624 URL: https://issues.apache.org/jira/browse/HBASE-7624 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Fix For: 0.94.6 Both HBASE-5359 and HBASE-7596 are useful and should be added to 0.94. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-7687) TestCatalogTracker.testServerNotRunningIOException fails occasionally
[ https://issues.apache.org/jira/browse/HBASE-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl resolved HBASE-7687. -- Resolution: Fixed Hadoop Flags: Reviewed TestCatalogTracker.testServerNotRunningIOException fails occasionally - Key: HBASE-7687 URL: https://issues.apache.org/jira/browse/HBASE-7687 Project: HBase Issue Type: Sub-task Reporter: Lars Hofhansl Assignee: Ted Yu Fix For: 0.96.0, 0.94.5 Attachments: 7687-94.txt, 7687-94-v2.txt, 7687-94-v3.txt, 7687-trunk.txt, 7687-trunk-v3.txt I would like to remove this method. It micro-tests a single method and poorly at that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6347) -ROOT- and .META. are stale in table.jsp if they moved
[ https://issues.apache.org/jira/browse/HBASE-6347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564014#comment-13564014 ] Lars Hofhansl commented on HBASE-6347: -- Looks like ROOT should be handled correctly. META needs to be fixed, though. -ROOT- and .META. are stale in table.jsp if they moved -- Key: HBASE-6347 URL: https://issues.apache.org/jira/browse/HBASE-6347 Project: HBase Issue Type: Bug Affects Versions: 0.90.6, 0.92.1, 0.94.0 Reporter: Jean-Daniel Cryans Labels: noob Fix For: 0.90.8, 0.92.3, 0.94.5 table.jsp uses a lookup method on {{CatalogTracker}} that does not force a refresh of the cache, so it can get a stale location if -ROOT- or .META. moved and the master hasn't tried to access them yet. Should just be a matter of using waitForRoot/Meta.
[jira] [Commented] (HBASE-2611) Handle RS that fails while processing the failure of another one
[ https://issues.apache.org/jira/browse/HBASE-2611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564016#comment-13564016 ] Lars Hofhansl commented on HBASE-2611: -- [~jdcryans]? Handle RS that fails while processing the failure of another one Key: HBASE-2611 URL: https://issues.apache.org/jira/browse/HBASE-2611 Project: HBase Issue Type: Sub-task Components: Replication Reporter: Jean-Daniel Cryans Assignee: Himanshu Vashishtha Priority: Critical Fix For: 0.96.0, 0.94.5 Attachments: 2611-trunk-v3.patch, 2611-v3.patch, HBASE-2611-trunk-v2.patch, HBase-2611-upstream-v1.patch, HBASE-2611-v2.patch HBASE-2223 doesn't manage region servers that fail while doing the transfer of HLogs queues from other region servers that failed. Devise a reliable way to do it.
[jira] [Updated] (HBASE-7684) NullPointerException in SecureClient when Call is cleaned up due to RPC timeout
[ https://issues.apache.org/jira/browse/HBASE-7684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cuijianwei updated HBASE-7684: -- Description:

We hit a NullPointerException when using SecureClient to access HBase. From the source code, we find that receiveResponse() does not check call != null before invoking call.setValue(...) (lines 378 and 380 in SecureClient.java). However, as explained in receiveResponse() of HBaseClient, the call may have been cleaned up due to an RPC timeout; therefore, it would be better to check call != null before invoking call.setValue(...) in SecureClient. The following is the exception trace:

2013-01-28 12:11:20,060 [request-queuepool-10-thread-1] WARN org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Failed all from region=sms_message,98,1358995149987.0c18f72df3f3f398f9f6dc83fe65afad., hostname=10.20.2.72, port=11600
java.util.concurrent.ExecutionException: java.io.IOException: Call to 10.20.2.72/10.20.2.72:11600 failed on local exception: java.io.IOException: Unexpected exception receiving call responses
    at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
    at java.util.concurrent.FutureTask.get(FutureTask.java:83)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1544)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1396)
    at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:937)
    at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:777)
    at org.apache.hadoop.hbase.client.HTable.put(HTable.java:752)
    at org.apache.hadoop.hbase.client.HTablePool$PooledHTable.put(HTablePool.java:397)
    at com.xiaomi.infra.hbase.business.dao.sms.SmsMessage.updateStatus(SmsMessage.java:245)
    at com.xiaomi.infra.hbase.business.client.sms.HMessageDao.doUpdateMessage(HMessageDao.java:165)
    at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at com.xiaomi.infra.hbase.business.client.AsyncDaoWrapper.wrapperRunMethod(AsyncDaoWrapper.java:97)
    at com.xiaomi.infra.hbase.business.client.HBaseRequest.run(AsyncDaoWrapper.java:135)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Call to 10.20.2.72/10.20.2.72:11600 failed on local exception: java.io.IOException: Unexpected exception receiving call responses
    at org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:1056)
    at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:1025)
    at org.apache.hadoop.hbase.ipc.SecureRpcEngine$Invoker.invoke(SecureRpcEngine.java:165)
    at $Proxy7.multi(Unknown Source)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1373)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1)
    at org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:210)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1380)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    ... 3 more
Caused by: java.io.IOException: Unexpected exception receiving call responses
    at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:610)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection.receiveResponse(SecureClient.java:378)
    at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:606)

was: We hit a NullPointerException when using SecureClient to access HBase. From the source code, we find that receiveResponse() does not check call != null before invoking call.setValue(...) (lines 378 and 380 in SecureClient.java). However, as explained in receiveResponse() of HBaseClient, the call may have been cleaned up due to an RPC timeout; therefore, it would be better to check call != null before invoking call.setValue(...) in SecureClient.

NullPointerException in SecureClient when Call is cleaned up due to RPC timeout Key: HBASE-7684 URL: https://issues.apache.org/jira/browse/HBASE-7684 Project: HBase Issue Type: Bug Components: security
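The guard proposed in the description (check that the pending call still exists before setting its value) can be sketched in isolation. This is a simplified model under stated assumptions: `PendingCallsSketch`, `Call`, `register` and `cleanup` are hypothetical names, and the map-based bookkeeping only approximates how a client might track outstanding calls. It is not the SecureClient or HBaseClient code.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class PendingCallsSketch {
    /** Hypothetical pending call that receives its response value. */
    static final class Call {
        volatile Object value;
        void setValue(Object v) { value = v; }
    }

    private final ConcurrentMap<Integer, Call> calls = new ConcurrentHashMap<>();

    Call register(int id) {
        Call c = new Call();
        calls.put(id, c);
        return c;
    }

    /** Simulates the timeout path removing a call that waited too long. */
    void cleanup(int id) {
        calls.remove(id);
    }

    /** Delivers a response; returns false when the call was already cleaned up. */
    boolean receiveResponse(int id, Object value) {
        Call call = calls.remove(id);
        if (call == null) {
            // The call timed out and was removed; drop the late response
            // instead of dereferencing null.
            return false;
        }
        call.setValue(value);
        return true;
    }

    public static void main(String[] args) {
        PendingCallsSketch client = new PendingCallsSketch();
        Call live = client.register(1);
        System.out.println("delivered: " + client.receiveResponse(1, "ok") + ", value=" + live.value);

        client.register(2);
        client.cleanup(2); // timeout fires before the response arrives
        System.out.println("delivered: " + client.receiveResponse(2, "late"));
    }
}
```

Dropping the late response is one possible policy; the sketch only shows that the null check keeps the receiving thread from failing with a NullPointerException.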
[jira] [Updated] (HBASE-7684) NullPointerException in SecureClient when Call is cleaned up due to RPC timeout
[ https://issues.apache.org/jira/browse/HBASE-7684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cuijianwei updated HBASE-7684: -- Description: We find NullPointException when using SecureClient to access hbase. From the source code, we find that receiveResponse() will not check call != null before invoke call.setValue(...)(line 378 and 380 in SecureClient.java). However, as explained in receiveResponse() of HBaseClient, the call may have been cleaned up due to RPC timeout; therefore, it should be better to check call != null before invoke call.setValue(...) in SecureClient. (was: We find NullPointException when using SecureClient to access hbase. From the source code, we find that receiveResponse() will not check call != null before invoke call.setValue(...)(line 378 and 380 in SecureClient.java). However, as explained in receiveResponse() of HBaseClient, the call may have been cleaned up due to RPC timeout; therefore, it should be better to check call != null before invoke call.setValue(...) in SecureClient. 
The following is the Exception trace: 2013-01-28 12:11:20,060 [request-queuepool-10-thread-1] WARN org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Failed all from region=sms_message,98,1358995149987.0c18f72df3f3f398f9f6dc83fe65afad., hostname=10.20.2.72, port=11600 java.util.concurrent.ExecutionException: java.io.IOException: Call to 10.20.2.72/10.20.2.72:11600 failed on local exception: java.io.IOException: Unexpected exception receiving call responses at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222) at java.util.concurrent.FutureTask.get(FutureTask.java:83) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1544) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1396) at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:937) at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:777) at org.apache.hadoop.hbase.client.HTable.put(HTable.java:752) at org.apache.hadoop.hbase.client.HTablePool$PooledHTable.put(HTablePool.java:397) at com.xiaomi.infra.hbase.business.dao.sms.SmsMessage.updateStatus(SmsMessage.java:245) at com.xiaomi.infra.hbase.business.client.sms.HMessageDao.doUpdateMessage(HMessageDao.java:165) at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at com.xiaomi.infra.hbase.business.client.AsyncDaoWrapper.wrapperRunMethod(AsyncDaoWrapper.java:97) at com.xiaomi.infra.hbase.business.client.HBaseRequest.run(AsyncDaoWrapper.java:135) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.io.IOException: Call to 
10.20.2.72/10.20.2.72:11600 failed on local exception: java.io.IOException: Unexpected exception receiving call responses at org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:1056) at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:1025) at org.apache.hadoop.hbase.ipc.SecureRpcEngine$Invoker.invoke(SecureRpcEngine.java:165) at $Proxy7.multi(Unknown Source) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1373) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1) at org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:210) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1380) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) ... 3 more Caused by: java.io.IOException: Unexpected exception receiving call responses at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:610) Caused by: java.lang.NullPointerException at org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection.receiveResponse(SecureClient.java:378) at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:606)) NullPointerException in SecureClient when Call is cleaned up due to RPC timeout Key: HBASE-7684 URL: https://issues.apache.org/jira/browse/HBASE-7684 Project: HBase Issue Type: Bug Components: security
[jira] [Commented] (HBASE-7337) SingleColumnValueFilter seems to get unavailble data
[ https://issues.apache.org/jira/browse/HBASE-7337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564032#comment-13564032 ] Anoop Sam John commented on HBASE-7337: ---

[~zhou wen jian] When you scan a table that has multiple versions of a cell, the versions the scan should return are specified on the Scan object. By default HBase returns only the latest version, but the Scan object has setters that let you ask for more versions.

SCVF is used to specify a column value check. If the condition is not satisfied for a row, that row is filtered out of the returned results entirely. With latestVersionOnly = true, you specify that the condition is checked only against the latest version (*not* that only the latest version is returned in the result). If it is set to false, all versions are checked against the condition, and if the value of any version satisfies it, the row is included.

But remember that SCVF cannot specify whether only the latest version of a cell is returned. It only specifies the condition, and the filter sees all versions of the cells. Which versions are returned is decided further down the line, in another part of the code that runs after this Filter#filterKeyValue(KeyValue).

SingleColumnValueFilter seems to get unavailble data - Your heading says it is getting unavailable data. Can you tell us more? Or is your problem that you are getting the older versions? Based on your reply we can check whether there is a real bug or not. If there is no issue we can close this.

SingleColumnValueFilter seems to get unavailble data Key: HBASE-7337 URL: https://issues.apache.org/jira/browse/HBASE-7337 Project: HBase Issue Type: Bug Components: Filters Affects Versions: 0.94.3, 0.96.0 Environment: 0.94 Reporter: Zhou wenjian Assignee: Zhou wenjian Fix For: 0.96.0, 0.94.6

Put multiple versions of a row:
r1 cf:q version:1 value:1
r1 cf:q version:2 value:3
r1 cf:q version:3 value:2
The filter in the scan is set as below:
SingleColumnValueFilter valueF = new SingleColumnValueFilter(family, qualifier, CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes(2)));
Then I found that all three versions are emitted. Then I set latestVersionOnly to false, and the result does not change.
{code}
public ReturnCode filterKeyValue(KeyValue keyValue) {
  // System.out.println("REMOVE KEY=" + keyValue.toString() + ", value=" + Bytes.toString(keyValue.getValue()));
  if (this.matchedColumn) {
    // We already found and matched the single column, all keys now pass
    return ReturnCode.INCLUDE;
  } else if (this.latestVersionOnly && this.foundColumn) {
    // We found but did not match the single column, skip to next row
    return ReturnCode.NEXT_ROW;
  }
  if (!keyValue.matchingColumn(this.columnFamily, this.columnQualifier)) {
    return ReturnCode.INCLUDE;
  }
  foundColumn = true;
  if (filterColumnValue(keyValue.getBuffer(),
      keyValue.getValueOffset(), keyValue.getValueLength())) {
    return this.latestVersionOnly ? ReturnCode.NEXT_ROW : ReturnCode.INCLUDE;
  }
  this.matchedColumn = true;
  return ReturnCode.INCLUDE;
}
{code}
From the code above, it seems that version 3 is emitted first and sets matchedColumn to true, which leads to the following versions 2 and 1 being emitted too.
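The behaviour described in the report can be traced with a small stand-alone model of the per-row state machine in the quoted filterKeyValue(). This is a sketch under stated assumptions: `ScvfSketch`, `State`, `filterCell` and `run` are hypothetical names, cell values are plain ints, every cell is assumed to belong to the target column, and the loop keeps feeding cells after NEXT_ROW, which a real scan would not do. The row-level decision made after all cells are seen is not modelled.

```java
import java.util.ArrayList;
import java.util.List;

final class ScvfSketch {
    enum ReturnCode { INCLUDE, NEXT_ROW }

    /** Hypothetical model of the filter's per-row state; not the HBase class. */
    static final class State {
        final int expected;
        final boolean latestVersionOnly;
        boolean foundColumn;
        boolean matchedColumn;

        State(int expected, boolean latestVersionOnly) {
            this.expected = expected;
            this.latestVersionOnly = latestVersionOnly;
        }

        /** Mirrors the branch order of the quoted method for one cell. */
        ReturnCode filterCell(int value) {
            if (matchedColumn) {
                return ReturnCode.INCLUDE;      // already matched: every later cell passes
            } else if (latestVersionOnly && foundColumn) {
                return ReturnCode.NEXT_ROW;     // latest version seen and did not match
            }
            foundColumn = true;
            if (value != expected) {            // plays the role of filterColumnValue(...) == true
                return latestVersionOnly ? ReturnCode.NEXT_ROW : ReturnCode.INCLUDE;
            }
            matchedColumn = true;
            return ReturnCode.INCLUDE;
        }
    }

    /** Feeds the versions of one column, newest first, through the model. */
    static List<ReturnCode> run(boolean latestVersionOnly, int expected, int... valuesNewestFirst) {
        State s = new State(expected, latestVersionOnly);
        List<ReturnCode> out = new ArrayList<>();
        for (int v : valuesNewestFirst) {
            out.add(s.filterCell(v));
        }
        return out;
    }

    public static void main(String[] args) {
        // Versions 3, 2, 1 carry values 2, 3, 1; the filter checks for value 2.
        System.out.println("latestVersionOnly=true:  " + run(true, 2, 2, 3, 1));
        System.out.println("latestVersionOnly=false: " + run(false, 2, 2, 3, 1));
    }
}
```

In this model both settings report INCLUDE for all three cells, because the newest version matches and sets matchedColumn. That is consistent with the comment above: the flag controls which versions are checked against the condition, not which versions come back.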
[jira] [Commented] (HBASE-7638) [0.94] region cache entry should only be removed on error if the error is from the server currently in cache
[ https://issues.apache.org/jira/browse/HBASE-7638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564036#comment-13564036 ] Lars Hofhansl commented on HBASE-7638: -- What if the region has moved, the cache was not updated, and now the new RS returns an error? [0.94] region cache entry should only be removed on error if the error is from the server currently in cache Key: HBASE-7638 URL: https://issues.apache.org/jira/browse/HBASE-7638 Project: HBase Issue Type: Bug Affects Versions: 0.94.4 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Minor Fix For: 0.94.5 Attachments: HBASE-7638-v0.patch See HBASE-7268.
[jira] [Updated] (HBASE-7638) [0.94] region cache entry should only be removed on error if the error is from the server currently in cache
[ https://issues.apache.org/jira/browse/HBASE-7638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-7638: Fix Version/s: 0.94.6 (was: 0.94.5) [0.94] region cache entry should only be removed on error if the error is from the server currently in cache Key: HBASE-7638 URL: https://issues.apache.org/jira/browse/HBASE-7638 Project: HBase Issue Type: Bug Affects Versions: 0.94.4 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Minor Fix For: 0.94.6 Attachments: HBASE-7638-v0.patch See HBASE-7268.
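The rule in the HBASE-7638 title (drop a cached region location only when the error came from the server that is currently cached) can be expressed as a conditional remove. This is a rough sketch under assumptions, not the HBase client code: `RegionCacheSketch` and its methods are hypothetical, and regions and servers are plain strings.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class RegionCacheSketch {
    // region name -> server currently believed to host it (hypothetical model)
    private final ConcurrentMap<String, String> regionToServer = new ConcurrentHashMap<>();

    public void cache(String region, String server) {
        regionToServer.put(region, server);
    }

    public String lookup(String region) {
        return regionToServer.get(region);
    }

    // Removes the entry only if it still points at the server that reported the error.
    // Returns true if an entry was removed.
    public boolean onError(String region, String failingServer) {
        return regionToServer.remove(region, failingServer);
    }

    public static void main(String[] args) {
        RegionCacheSketch cache = new RegionCacheSketch();
        cache.cache("r1", "rs-B");
        // A late error from the old server rs-A must not evict the fresher rs-B entry.
        System.out.println(cache.onError("r1", "rs-A")); // false
        System.out.println(cache.lookup("r1"));          // rs-B
        // An error from the cached server does evict it.
        System.out.println(cache.onError("r1", "rs-B")); // true
        System.out.println(cache.lookup("r1"));          // null
    }
}
```

The two-argument `remove(key, value)` makes the check and the removal a single atomic step. The scenario raised in the comment above (the region moved, the cache is stale, and the new server returns the error) is exactly the case where this rule keeps the stale entry, so it would need separate handling.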
[jira] [Created] (HBASE-7690) Improve metadata printing in HFilePrettyPrinter
Nick Dimiduk created HBASE-7690: Summary: Improve metadata printing in HFilePrettyPrinter Key: HBASE-7690 URL: https://issues.apache.org/jira/browse/HBASE-7690 Project: HBase Issue Type: Improvement Components: HFile Reporter: Nick Dimiduk Assignee: Nick Dimiduk Priority: Minor
The pretty printer could do a better job with metadata. For example:
{pre}
...
Fileinfo:
    BULKLOAD_SOURCE_TASK = attempt_201301272014_0001_r_00_0
    BULKLOAD_TIMESTAMP = \x00\x00\x01\x7FcG\x8E
    DELETE_FAMILY_COUNT = \x00\x00\x00\x00\x00\x00\x00\x00
    EARLIEST_PUT_TS = \x00\x00\x01\x7FcF
    EXCLUDE_FROM_MINOR_COMPACTION = \x00
    KEY_VALUE_VERSION = \x00\x00\x00\x01
    MAJOR_COMPACTION_KEY = \xFF
    MAX_MEMSTORE_TS_KEY = \x00\x00\x00\x00\x00\x00\x00\x00
    TIMERANGE = 13593468698301359346869830
    hfile.AVG_KEY_LEN = 19
    hfile.AVG_VALUE_LEN = 2
    hfile.LASTKEY = \x00\x04row9\x01dc2\x00\x00\x01\x7FcF\x04
...
{pre}
Many of these fields could be cleaned up to print in human-readable values.
[jira] [Updated] (HBASE-7690) Improve metadata printing in HFilePrettyPrinter
[ https://issues.apache.org/jira/browse/HBASE-7690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-7690:
Description: The pretty printer could do a better job with metadata. For example:
{noformat}
...
Fileinfo:
    BULKLOAD_SOURCE_TASK = attempt_201301272014_0001_r_00_0
    BULKLOAD_TIMESTAMP = \x00\x00\x01\x7FcG\x8E
    DELETE_FAMILY_COUNT = \x00\x00\x00\x00\x00\x00\x00\x00
    EARLIEST_PUT_TS = \x00\x00\x01\x7FcF
    EXCLUDE_FROM_MINOR_COMPACTION = \x00
    KEY_VALUE_VERSION = \x00\x00\x00\x01
    MAJOR_COMPACTION_KEY = \xFF
    MAX_MEMSTORE_TS_KEY = \x00\x00\x00\x00\x00\x00\x00\x00
    TIMERANGE = 13593468698301359346869830
    hfile.AVG_KEY_LEN = 19
    hfile.AVG_VALUE_LEN = 2
    hfile.LASTKEY = \x00\x04row9\x01dc2\x00\x00\x01\x7FcF\x04
...
{noformat}
Many of these fields could be cleaned up to print in human-readable values.
was: The pretty printer could do a better job with metadata. For example:
{pre}
...
Fileinfo:
    BULKLOAD_SOURCE_TASK = attempt_201301272014_0001_r_00_0
    BULKLOAD_TIMESTAMP = \x00\x00\x01\x7FcG\x8E
    DELETE_FAMILY_COUNT = \x00\x00\x00\x00\x00\x00\x00\x00
    EARLIEST_PUT_TS = \x00\x00\x01\x7FcF
    EXCLUDE_FROM_MINOR_COMPACTION = \x00
    KEY_VALUE_VERSION = \x00\x00\x00\x01
    MAJOR_COMPACTION_KEY = \xFF
    MAX_MEMSTORE_TS_KEY = \x00\x00\x00\x00\x00\x00\x00\x00
    TIMERANGE = 13593468698301359346869830
    hfile.AVG_KEY_LEN = 19
    hfile.AVG_VALUE_LEN = 2
    hfile.LASTKEY = \x00\x04row9\x01dc2\x00\x00\x01\x7FcF\x04
...
{pre}
Many of these fields could be cleaned up to print in human-readable values.
Improve metadata printing in HFilePrettyPrinter Key: HBASE-7690 URL: https://issues.apache.org/jira/browse/HBASE-7690 Project: HBase Issue Type: Improvement Components: HFile Reporter: Nick Dimiduk Assignee: Nick Dimiduk Priority: Minor
The pretty printer could do a better job with metadata. For example:
{noformat}
...
Fileinfo:
    BULKLOAD_SOURCE_TASK = attempt_201301272014_0001_r_00_0
    BULKLOAD_TIMESTAMP = \x00\x00\x01\x7FcG\x8E
    DELETE_FAMILY_COUNT = \x00\x00\x00\x00\x00\x00\x00\x00
    EARLIEST_PUT_TS = \x00\x00\x01\x7FcF
    EXCLUDE_FROM_MINOR_COMPACTION = \x00
    KEY_VALUE_VERSION = \x00\x00\x00\x01
    MAJOR_COMPACTION_KEY = \xFF
    MAX_MEMSTORE_TS_KEY = \x00\x00\x00\x00\x00\x00\x00\x00
    TIMERANGE = 13593468698301359346869830
    hfile.AVG_KEY_LEN = 19
    hfile.AVG_VALUE_LEN = 2
    hfile.LASTKEY = \x00\x04row9\x01dc2\x00\x00\x01\x7FcF\x04
...
{noformat}
Many of these fields could be cleaned up to print in human-readable values.
[jira] [Updated] (HBASE-7690) Improve metadata printing in HFilePrettyPrinter
[ https://issues.apache.org/jira/browse/HBASE-7690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-7690: Attachment: 0001-HBASE-7690-Improve-printing-of-HFile-metadata.patch
This improves the output a bit:
{noformat}
Fileinfo:
    BULKLOAD_SOURCE_TASK = attempt_201301272014_0001_r_00_0
    BULKLOAD_TIMESTAMP = 1359346878350
    DELETE_FAMILY_COUNT = 0
    EARLIEST_PUT_TS = 1359346869830
    EXCLUDE_FROM_MINOR_COMPACTION = false
    KEY_VALUE_VERSION = 1
    MAJOR_COMPACTION_KEY = true
    MAX_MEMSTORE_TS_KEY = 0
    TIMERANGE = 13593468698301359346869830
    hfile.AVG_KEY_LEN = 19
    hfile.AVG_VALUE_LEN = 2
    hfile.LASTKEY = \x00\x04row9\x01dc2\x00\x00\x01\x7FcF\x04
{noformat}
Improve metadata printing in HFilePrettyPrinter Key: HBASE-7690 URL: https://issues.apache.org/jira/browse/HBASE-7690 Project: HBase Issue Type: Improvement Components: HFile Reporter: Nick Dimiduk Assignee: Nick Dimiduk Priority: Minor Attachments: 0001-HBASE-7690-Improve-printing-of-HFile-metadata.patch
The pretty printer could do a better job with metadata. For example:
{noformat}
...
Fileinfo:
    BULKLOAD_SOURCE_TASK = attempt_201301272014_0001_r_00_0
    BULKLOAD_TIMESTAMP = \x00\x00\x01\x7FcG\x8E
    DELETE_FAMILY_COUNT = \x00\x00\x00\x00\x00\x00\x00\x00
    EARLIEST_PUT_TS = \x00\x00\x01\x7FcF
    EXCLUDE_FROM_MINOR_COMPACTION = \x00
    KEY_VALUE_VERSION = \x00\x00\x00\x01
    MAJOR_COMPACTION_KEY = \xFF
    MAX_MEMSTORE_TS_KEY = \x00\x00\x00\x00\x00\x00\x00\x00
    TIMERANGE = 13593468698301359346869830
    hfile.AVG_KEY_LEN = 19
    hfile.AVG_VALUE_LEN = 2
    hfile.LASTKEY = \x00\x04row9\x01dc2\x00\x00\x01\x7FcF\x04
...
{noformat}
Many of these fields could be cleaned up to print in human-readable values.
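For readers wondering what "human-readable" means for these fields, here is a small sketch of decoding fixed-width big-endian values with the JDK only. It is not the attached patch: `FileInfoDecodeSketch` and its helpers are hypothetical, and the eight timestamp bytes are an assumption, chosen so that they decode to the BULKLOAD_TIMESTAMP value printed in the improved output.

```java
import java.nio.ByteBuffer;

public class FileInfoDecodeSketch {
    // Decodes an 8-byte big-endian value (ByteBuffer is big-endian by default).
    public static long toLong(byte[] raw) {
        if (raw.length != 8) {
            throw new IllegalArgumentException("expected 8 bytes, got " + raw.length);
        }
        return ByteBuffer.wrap(raw).getLong();
    }

    // Decodes a 4-byte big-endian value.
    public static int toInt(byte[] raw) {
        if (raw.length != 4) {
            throw new IllegalArgumentException("expected 4 bytes, got " + raw.length);
        }
        return ByteBuffer.wrap(raw).getInt();
    }

    // Treats any non-zero single byte as true.
    public static boolean toBoolean(byte[] raw) {
        if (raw.length != 1) {
            throw new IllegalArgumentException("expected 1 byte, got " + raw.length);
        }
        return raw[0] != 0;
    }

    public static void main(String[] args) {
        // Illustrative bytes (assumption): 0x0000013C7F63478E
        byte[] bulkloadTimestamp = {0x00, 0x00, 0x01, 0x3C, 0x7F, 0x63, 0x47, (byte) 0x8E};
        System.out.println(toLong(bulkloadTimestamp));              // 1359346878350
        System.out.println(toInt(new byte[] {0, 0, 0, 1}));         // 1
        System.out.println(toBoolean(new byte[] {(byte) 0xFF}));    // true
        System.out.println(toBoolean(new byte[] {0x00}));           // false
    }
}
```

The decoder has to know the type of each key in advance, which is the point raised later in the thread about which component actually knows the meaning of each field.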
[jira] [Updated] (HBASE-7690) Improve metadata printing in HFilePrettyPrinter
[ https://issues.apache.org/jira/browse/HBASE-7690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-7690: Status: Patch Available (was: Open) Improve metadata printing in HFilePrettyPrinter Key: HBASE-7690 URL: https://issues.apache.org/jira/browse/HBASE-7690 Project: HBase Issue Type: Improvement Components: HFile Reporter: Nick Dimiduk Assignee: Nick Dimiduk Priority: Minor Attachments: 0001-HBASE-7690-Improve-printing-of-HFile-metadata.patch
The pretty printer could do a better job with metadata. For example:
{noformat}
...
Fileinfo:
    BULKLOAD_SOURCE_TASK = attempt_201301272014_0001_r_00_0
    BULKLOAD_TIMESTAMP = \x00\x00\x01\x7FcG\x8E
    DELETE_FAMILY_COUNT = \x00\x00\x00\x00\x00\x00\x00\x00
    EARLIEST_PUT_TS = \x00\x00\x01\x7FcF
    EXCLUDE_FROM_MINOR_COMPACTION = \x00
    KEY_VALUE_VERSION = \x00\x00\x00\x01
    MAJOR_COMPACTION_KEY = \xFF
    MAX_MEMSTORE_TS_KEY = \x00\x00\x00\x00\x00\x00\x00\x00
    TIMERANGE = 13593468698301359346869830
    hfile.AVG_KEY_LEN = 19
    hfile.AVG_VALUE_LEN = 2
    hfile.LASTKEY = \x00\x04row9\x01dc2\x00\x00\x01\x7FcF\x04
...
{noformat}
Many of these fields could be cleaned up to print in human-readable values.
[jira] [Commented] (HBASE-7690) Improve metadata printing in HFilePrettyPrinter
[ https://issues.apache.org/jira/browse/HBASE-7690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564051#comment-13564051 ] Matteo Bertozzi commented on HBASE-7690: The HFile doesn't know anything about its content. Most of the fields are written by the store file. If we want to keep that separation, only the store file knows the meaning of each field. (Also, with HBASE-7603, probably only the right StoreFileManager will be able to dump metadata in a human-readable way.) Improve metadata printing in HFilePrettyPrinter Key: HBASE-7690 URL: https://issues.apache.org/jira/browse/HBASE-7690 Project: HBase Issue Type: Improvement Components: HFile Reporter: Nick Dimiduk Assignee: Nick Dimiduk Priority: Minor Attachments: 0001-HBASE-7690-Improve-printing-of-HFile-metadata.patch
The pretty printer could do a better job with metadata. For example:
{noformat}
...
Fileinfo:
    BULKLOAD_SOURCE_TASK = attempt_201301272014_0001_r_00_0
    BULKLOAD_TIMESTAMP = \x00\x00\x01\x7FcG\x8E
    DELETE_FAMILY_COUNT = \x00\x00\x00\x00\x00\x00\x00\x00
    EARLIEST_PUT_TS = \x00\x00\x01\x7FcF
    EXCLUDE_FROM_MINOR_COMPACTION = \x00
    KEY_VALUE_VERSION = \x00\x00\x00\x01
    MAJOR_COMPACTION_KEY = \xFF
    MAX_MEMSTORE_TS_KEY = \x00\x00\x00\x00\x00\x00\x00\x00
    TIMERANGE = 13593468698301359346869830
    hfile.AVG_KEY_LEN = 19
    hfile.AVG_VALUE_LEN = 2
    hfile.LASTKEY = \x00\x04row9\x01dc2\x00\x00\x01\x7FcF\x04
...
{noformat}
Many of these fields could be cleaned up to print in human-readable values.
[jira] [Created] (HBASE-7691) A couple docbook updates
Nick Dimiduk created HBASE-7691: Summary: A couple docbook updates Key: HBASE-7691 URL: https://issues.apache.org/jira/browse/HBASE-7691 Project: HBase Issue Type: Improvement Reporter: Nick Dimiduk Assignee: Nick Dimiduk Priority: Trivial
I've noticed a couple mistakes in the book.
- link to external (hadoop) documentation which is no longer available on {{hadoop.apache.org/common/docs/current}}.
- small mistake in instructions for generating a release tarball.
- mistake in descriptions of test profiles.
[jira] [Updated] (HBASE-7691) A couple docbook updates
[ https://issues.apache.org/jira/browse/HBASE-7691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-7691: Attachment: 0001-HBASE-7691-A-couple-docbook-updates.patch A couple docbook updates Key: HBASE-7691 URL: https://issues.apache.org/jira/browse/HBASE-7691 Project: HBase Issue Type: Improvement Reporter: Nick Dimiduk Assignee: Nick Dimiduk Priority: Trivial Attachments: 0001-HBASE-7691-A-couple-docbook-updates.patch
I've noticed a couple mistakes in the book.
- link to external (hadoop) documentation which is no longer available on {{hadoop.apache.org/common/docs/current}}.
- small mistake in instructions for generating a release tarball.
- mistake in descriptions of test profiles.
[jira] [Updated] (HBASE-7691) A couple docbook updates
[ https://issues.apache.org/jira/browse/HBASE-7691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-7691: Status: Patch Available (was: Open) A couple docbook updates Key: HBASE-7691 URL: https://issues.apache.org/jira/browse/HBASE-7691 Project: HBase Issue Type: Improvement Reporter: Nick Dimiduk Assignee: Nick Dimiduk Priority: Trivial Attachments: 0001-HBASE-7691-A-couple-docbook-updates.patch
I've noticed a couple mistakes in the book.
- link to external (hadoop) documentation which is no longer available on {{hadoop.apache.org/common/docs/current}}.
- small mistake in instructions for generating a release tarball.
- mistake in descriptions of test profiles.
[jira] [Commented] (HBASE-7671) Flushing memstore again after last failure could cause data loss
[ https://issues.apache.org/jira/browse/HBASE-7671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564052#comment-13564052 ] chunhui shen commented on HBASE-7671: Each flushed file has a sequence id, and all HLog entries whose sequence id is smaller than the max sequence id will be skipped when replaying edit logs. This assumes that all data in the memstore with a smaller sequence id has been flushed to a store file, but that assumption is broken when snapshot is called again without clearing the previous snapshot because of the last failure. bq.Could somehow store the correct sequenceId with the snapshot? I have considered this solution, but it is not convenient with multiple stores: we would have to maintain a sequence id for each memstore rather than one per region. Snapshot means creating a snapshot of the current memstore. If the data belonged to the memstore before the successful flush, then IMO copying any new KVs is reasonable. Flushing memstore again after last failure could cause data loss Key: HBASE-7671 URL: https://issues.apache.org/jira/browse/HBASE-7671 Project: HBase Issue Type: Bug Affects Versions: 0.94.4 Reporter: chunhui shen Assignee: chunhui shen Priority: Critical Fix For: 0.96.0, 0.94.5 Attachments: HBASE-7671.patch, HBASE-7671v2.patch, HBASE-7671v3.patch
See the following logs first:
{code}
2013-01-23 18:58:38,801 INFO org.apache.hadoop.hbase.regionserver.Store: Flushed , sequenceid=9746535080, memsize=101.8m, into tmp file hdfs://dw77.kgb.sqa.cm4:9900/hbase-test3/writetest1/8dc14e35b4d7c0e481e0bb30849cff7d/.tmp/bebeeecc56364b6c8126cf1dc6782a25
2013-01-23 18:58:41,982 WARN org.apache.hadoop.hbase.regionserver.MemStore: Snapshot called again without clearing previous. Doing nothing. Another ongoing flush or did we fail last attempt?
2013-01-23 18:58:43,274 INFO org.apache.hadoop.hbase.regionserver.Store: Flushed , sequenceid=9746599334, memsize=101.8m, into tmp file hdfs://dw77.kgb.sqa.cm4:9900/hbase-test3/writetest1/8dc14e35b4d7c0e481e0bb30849cff7d/.tmp/4eede32dc469480bb3d469aaff332313
{code}
The first memstore flush failed in commitFile() (the first log line above), which triggered a server abort, but another flush came in immediately (possibly caused by a move/split; the third log line above) and succeeded. For the same memstore snapshot we got two different sequence ids, which causes data loss when replaying log edits. See the unit test case in the patch for details.
[jira] [Commented] (HBASE-6335) Switching log-splitting policy after last failure master start may cause data loss
[ https://issues.apache.org/jira/browse/HBASE-6335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564056#comment-13564056 ] chunhui shen commented on HBASE-6335: In 0.96 as well. I don't have any good idea for this problem... Switching log-splitting policy after last failure master start may cause data loss Key: HBASE-6335 URL: https://issues.apache.org/jira/browse/HBASE-6335 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.92.1, 0.94.0 Reporter: chunhui shen Assignee: chunhui shen Fix For: 0.94.6
How does it happen? Suppose server A is down and has three log files, with all the data from one region.
File 1: kv01 kv02 kv03
File 2: kv04 kv05 kv06
File 3: kv07 kv08 kv09
Here kv01 means its log seqID is 01.
Case: switching from distributed-log-splitting to master-local-log-splitting
1. The master finds that server A is down and starts to split its log files using distributed-log-splitting.
2. It successfully splits log file 2, moves it to oldLogs, and generates one edit file named 06 in the region's recovered.edits dir.
3. The master restarts, the log-splitting policy is changed to master-local-log-splitting, and it starts to split file 1 and file 3.
4. It successfully splits log file 1 and file 3 and generates one edit file named 09 in the region's recovered.edits dir.
5. The region replays edits from edit files 06 and 09. The region's seqID is 06 after it replays the edits from 06, so when replaying the edits from 09 it skips kv01, kv02 and kv03, and that data is lost.
Likewise, switching from master-local-log-splitting to distributed-log-splitting could also cause data loss. Should we fix this bug or just avoid the case? I'm not sure...
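The five steps in the HBASE-6335 description can be replayed in a few lines. This is a simplified model, not the region server's replay code: `ReplaySketch` and `replay` are hypothetical, edits are reduced to their sequence ids, and the only rule modeled is "skip any edit whose sequence id is not greater than the region's current sequence id".

```java
import java.util.ArrayList;
import java.util.List;

public class ReplaySketch {
    // Replays recovered-edits files in the given order and returns the sequence ids applied.
    public static List<Integer> replay(int[][] editFiles) {
        long regionSeqId = 0;
        List<Integer> applied = new ArrayList<>();
        for (int[] file : editFiles) {
            long maxSeen = regionSeqId;
            for (int seqId : file) {
                if (seqId <= regionSeqId) {
                    continue; // treated as already persisted, so it is skipped
                }
                applied.add(seqId);
                maxSeen = Math.max(maxSeen, seqId);
            }
            regionSeqId = maxSeen; // the region advances to the highest id replayed so far
        }
        return applied;
    }

    public static void main(String[] args) {
        int[][] files = {
            {4, 5, 6},          // edit file "06", produced from log file 2
            {1, 2, 3, 7, 8, 9}  // edit file "09", produced from log files 1 and 3
        };
        System.out.println(replay(files)); // [4, 5, 6, 7, 8, 9]
    }
}
```

kv01 through kv03 never appear in the result: by the time file 09 is replayed the region's sequence id is already 06, which is the data loss the issue describes.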
[jira] [Commented] (HBASE-7507) Make memstore flush be able to retry after exception
[ https://issues.apache.org/jira/browse/HBASE-7507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564061#comment-13564061 ] chunhui shen commented on HBASE-7507: I have no objection to the revert. Make memstore flush be able to retry after exception Key: HBASE-7507 URL: https://issues.apache.org/jira/browse/HBASE-7507 Project: HBase Issue Type: Bug Affects Versions: 0.94.3 Reporter: chunhui shen Assignee: chunhui shen Fix For: 0.96.0, 0.94.5 Attachments: 7507-94.patch, 7507-trunk v1.patch, 7507-trunk v2.patch, 7507-trunkv3.patch
We abort the regionserver if a memstore flush throws an exception. I think we could retry to make the regionserver more stable, because the file system may be unavailable for a transient period, e.g. when switching namenodes in a NameNode HA environment.
{code}
HRegion#internalFlushcache() {
  ...
  try {
    ...
  } catch (Throwable t) {
    DroppedSnapshotException dse = new DroppedSnapshotException("region: " + Bytes.toStringBinary(getRegionName()));
    dse.initCause(t);
    throw dse;
  }
  ...
}
MemStoreFlusher#flushRegion() {
  ...
  try {
    region.flushcache();
    ...
  } catch (DroppedSnapshotException ex) {
    server.abort("Replay of HLog required. Forcing server shutdown", ex);
  }
  ...
}
{code}
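To illustrate the retry idea proposed in HBASE-7507, here is a bounded retry loop in plain Java. It is only a sketch under assumptions and not the attached patch: `FlushRetrySketch`, `FlushAction` and `flushWithRetries` are hypothetical, and the comment above agrees to reverting the change, so this shows the shape of the idea rather than a recommendation.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

public class FlushRetrySketch {
    // Hypothetical stand-in for a flush that may hit a transient file system error.
    public interface FlushAction {
        void run() throws IOException;
    }

    // Runs the action up to maxAttempts times and returns the number of attempts used.
    // Throws UncheckedIOException wrapping the last failure if every attempt fails.
    public static int flushWithRetries(FlushAction action, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                action.run();
                return attempt;
            } catch (IOException e) {
                last = e; // a real implementation would likely pause or back off here
            }
        }
        throw new UncheckedIOException("flush failed after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) {
        int[] calls = {0};
        int used = flushWithRetries(() -> {
            if (++calls[0] < 3) {
                throw new IOException("transient file system error");
            }
        }, 5);
        System.out.println("succeeded on attempt " + used); // succeeded on attempt 3
    }
}
```

Only when the attempts are exhausted does the caller see a failure, which is the point at which an abort would still be appropriate.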
[jira] [Updated] (HBASE-7684) NullPointerException in SecureClient when Call is cleaned up due to RPC timeout
[ https://issues.apache.org/jira/browse/HBASE-7684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cuijianwei updated HBASE-7684: -- Description: We find NullPointException when using SecureClient to access hbase. From the source code, we find that receiveResponse() will not check call != null before invoke call.setValue(...)(line 378 and 380 in SecureClient.java). However, as explained in receiveResponse() of HBaseClient, the call may have been cleaned up due to RPC timeout; therefore, it should be better to check call != null before invoke call.setValue(...) in SecureClient. The exception stack trace is : 2013-01-28 12:11:20,060 [request-queuepool-10-thread-1] WARN org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Failed all from region=sms_message,98,1358995149987.0c18f72df3f3f398f9f6dc83fe65afad., hostname=10.20.2.72, port=11600 java.util.concurrent.ExecutionException: java.io.IOException: Call to 10.20.2.72/10.20.2.72:11600 failed on local exception: java.io.IOException: Unexpected exception receiving call responses at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222) at java.util.concurrent.FutureTask.get(FutureTask.java:83) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1544) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1396) at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:937) at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:777) at org.apache.hadoop.hbase.client.HTable.put(HTable.java:752) at org.apache.hadoop.hbase.client.HTablePool$PooledHTable.put(HTablePool.java:397) at com.xiaomi.infra.hbase.business.dao.sms.SmsMessage.updateStatus(SmsMessage.java:245) at com.xiaomi.infra.hbase.business.client.sms.HMessageDao.doUpdateMessage(HMessageDao.java:165) at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source) at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at com.xiaomi.infra.hbase.business.client.AsyncDaoWrapper.wrapperRunMethod(AsyncDaoWrapper.java:97) at com.xiaomi.infra.hbase.business.client.HBaseRequest.run(AsyncDaoWrapper.java:135) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.io.IOException: Call to 10.20.2.72/10.20.2.72:11600 failed on local exception: java.io.IOException: Unexpected exception receiving call responses at org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:1056) at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:1025) at org.apache.hadoop.hbase.ipc.SecureRpcEngine$Invoker.invoke(SecureRpcEngine.java:165) at $Proxy7.multi(Unknown Source) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1373) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1) at org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:210) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1380) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) ... 
3 more Caused by: java.io.IOException: Unexpected exception receiving call responses at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:610) Caused by: java.lang.NullPointerException at org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection.receiveResponse(SecureClient.java:378) at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:606) was:We find NullPointException when using SecureClient to access hbase. From the source code, we find that receiveResponse() will not check call != null before invoke call.setValue(...)(line 378 and 380 in SecureClient.java). However, as explained in receiveResponse() of HBaseClient, the call may have been cleaned up due to RPC timeout; therefore, it should be better to check call != null before invoke call.setValue(...) in SecureClient. NullPointerException in SecureClient when Call is cleaned up due to RPC timeout Key: HBASE-7684 URL: https://issues.apache.org/jira/browse/HBASE-7684
[jira] [Updated] (HBASE-7684) NullPointerException in SecureClient when Call is cleaned up due to RPC timeout
[ https://issues.apache.org/jira/browse/HBASE-7684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cuijianwei updated HBASE-7684: -- Description: We find NullPointException when using SecureClient to access hbase. From the source code, we find that receiveResponse() will not check call != null before invoke call.setValue(...)(line 378 and 380 in SecureClient.java). However, as explained in receiveResponse() of HBaseClient, the call may have been cleaned up due to RPC timeout; therefore, it should be better to check call != null before invoke call.setValue(...) in SecureClient. The exception stack trace is : 2013-01-28 12:11:20,060 [request-queuepool-10-thread-1] WARN org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Failed all from region=sms_message,98,1358995149987.0c18f72df3f3f398f9f6dc83fe65afad., hostname=10.20.2.72, port=11600 2013-01-28 12:11:20,060 [request-queuepool-10-thread-1] WARN org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Failed all from region=sms_message,98,1358995149987.0c18f72df3f3f398f9f6dc83fe65afad., hostname=10.20.2.72, port=11600 java.util.concurrent.ExecutionException: java.io.IOException: Call to 10.20.2.72/10.20.2.72:11600 failed on local exception: java.io.IOException: Unexpected exception receiving call responses at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222) at java.util.concurrent.FutureTask.get(FutureTask.java:83) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1544) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1396) at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:937) at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:777) at org.apache.hadoop.hbase.client.HTable.put(HTable.java:752) at 
org.apache.hadoop.hbase.client.HTablePool$PooledHTable.put(HTablePool.java:397) at com.xiaomi.infra.hbase.business.dao.sms.SmsMessage.updateStatus(SmsMessage.java:245) at com.xiaomi.infra.hbase.business.client.sms.HMessageDao.doUpdateMessage(HMessageDao.java:165) at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at com.xiaomi.infra.hbase.business.client.AsyncDaoWrapper.wrapperRunMethod(AsyncDaoWrapper.java:97) at com.xiaomi.infra.hbase.business.client.HBaseRequest.run(AsyncDaoWrapper.java:135) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.io.IOException: Call to 10.20.2.72/10.20.2.72:11600 failed on local exception: java.io.IOException: Unexpected exception receiving call responses at org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:1056) at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:1025) at org.apache.hadoop.hbase.ipc.SecureRpcEngine$Invoker.invoke(SecureRpcEngine.java:165) at $Proxy7.multi(Unknown Source) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1373) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1) at org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:210) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1380) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) ... 
3 more Caused by: java.io.IOException: Unexpected exception receiving call responses at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:610) Caused by: java.lang.NullPointerException at org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection.receiveResponse(SecureClient.java:378) at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:606) was: We find NullPointException when using SecureClient to access hbase. From the source code, we find that receiveResponse() will not check call != null before invoke call.setValue(...)(line 378 and 380 in SecureClient.java). However, as explained in receiveResponse() of HBaseClient, the call may have been cleaned up due to RPC timeout; therefore, it should be better to check call != null
[jira] [Updated] (HBASE-7684) NullPointerException in SecureClient when Call is cleaned up due to RPC timeout
[ https://issues.apache.org/jira/browse/HBASE-7684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cuijianwei updated HBASE-7684:
Description: We find a NullPointerException when using SecureClient to access HBase. From the source code, we find that receiveResponse() does not check call != null before invoking call.setValue(...) (lines 378 and 380 in SecureClient.java). However, as explained in receiveResponse() of HBaseClient, the call may have been cleaned up due to RPC timeout; therefore, it would be better to check call != null before invoking call.setValue(...) in SecureClient. The exception stack trace is:
{code}
2013-01-28 12:11:20,060 [request-queuepool-10-thread-1] WARN org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Failed all from region=sms_message,98,1358995149987.0c18f72df3f3f398f9f6dc83fe65afad., hostname=10.20.2.72, port=11600
2013-01-28 12:11:20,060 [request-queuepool-10-thread-1] WARN org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation - Failed all from region=sms_message,98,1358995149987.0c18f72df3f3f398f9f6dc83fe65afad., hostname=10.20.2.72, port=11600
java.util.concurrent.ExecutionException: java.io.IOException: Call to 10.20.2.72/10.20.2.72:11600 failed on local exception: java.io.IOException: Unexpected exception receiving call responses
    at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
    at java.util.concurrent.FutureTask.get(FutureTask.java:83)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1544)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1396)
    at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:937)
    at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:777)
    at org.apache.hadoop.hbase.client.HTable.put(HTable.java:752)
    at org.apache.hadoop.hbase.client.HTablePool$PooledHTable.put(HTablePool.java:397)
    at com.xiaomi.infra.hbase.business.dao.sms.SmsMessage.updateStatus(SmsMessage.java:245)
    at com.xiaomi.infra.hbase.business.client.sms.HMessageDao.doUpdateMessage(HMessageDao.java:165)
    at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at com.xiaomi.infra.hbase.business.client.AsyncDaoWrapper.wrapperRunMethod(AsyncDaoWrapper.java:97)
    at com.xiaomi.infra.hbase.business.client.HBaseRequest.run(AsyncDaoWrapper.java:135)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Call to 10.20.2.72/10.20.2.72:11600 failed on local exception: java.io.IOException: Unexpected exception receiving call responses
    at org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:1056)
    at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:1025)
    at org.apache.hadoop.hbase.ipc.SecureRpcEngine$Invoker.invoke(SecureRpcEngine.java:165)
    at $Proxy7.multi(Unknown Source)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1373)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1)
    at org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:210)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1380)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    ... 3 more
Caused by: java.io.IOException: Unexpected exception receiving call responses
    at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:610)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection.receiveResponse(SecureClient.java:378)
    at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:606)
{code}
was: We find a NullPointerException when using SecureClient to access HBase. From the source code, we find that receiveResponse() does not check call != null before invoking call.setValue(...) (lines 378 and 380 in SecureClient.java). However, as explained in receiveResponse() of HBaseClient, the call may have been cleaned up due to RPC timeout; therefore, it would be better to check
[jira] [Created] (HBASE-7692) Add utility class to generate ordered byte[] serialization
Nick Dimiduk created HBASE-7692:
---
Summary: Add utility class to generate ordered byte[] serialization
Key: HBASE-7692
URL: https://issues.apache.org/jira/browse/HBASE-7692
Project: HBase
Issue Type: Improvement
Reporter: Nick Dimiduk
Assignee: Nick Dimiduk

The current Bytes utility class works, but it produces output that does not maintain the native sort order of the input value. As a result, for example, an encoded negative value does not necessarily sort before an encoded positive value. HBase should provide a canonical implementation of an order-preserving serialization format so that third parties can reliably build on top of HBase. This will allow an implementation for HIVE-3634, HIVE-2599, or HIVE-2903 that is compatible with similar features in Pig.
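The ordering problem described in HBASE-7692 can be illustrated with a minimal sketch. This is not the API proposed in the issue: the class and method names below (OrderedIntSketch, plain, ordered, decodeOrdered, compareUnsigned) are hypothetical, and the sketch assumes that keys are compared byte by byte as unsigned values. It shows that a big-endian two's-complement int sorts negative values after positive ones, and that flipping the sign bit before writing is one common way to restore numeric order.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; not the utility class proposed in HBASE-7692.
final class OrderedIntSketch {

    // Plain big-endian two's-complement encoding of an int.
    static byte[] plain(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    // Order-preserving variant: flip the sign bit so that negative values
    // get a leading byte below 0x80 and non-negative values at or above 0x80.
    static byte[] ordered(int v) {
        return ByteBuffer.allocate(4).putInt(v ^ Integer.MIN_VALUE).array();
    }

    // Inverse of ordered(): flip the sign bit back.
    static int decodeOrdered(byte[] b) {
        return ByteBuffer.wrap(b).getInt() ^ Integer.MIN_VALUE;
    }

    // Lexicographic comparison treating each byte as unsigned (assumed key order).
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        // Plain encoding: -1 (FF FF FF FF) sorts after 1 (00 00 00 01).
        System.out.println(compareUnsigned(plain(-1), plain(1)) > 0);
        // Sign-flipped encoding: -1 (7F FF FF FF) sorts before 1 (80 00 00 01).
        System.out.println(compareUnsigned(ordered(-1), ordered(1)) < 0);
    }
}
```

The same idea extends to long values; floating-point and variable-length types need additional handling that this sketch does not cover.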
[jira] [Updated] (HBASE-7684) NullPointerException in SecureClient when Call is cleaned up due to RPC timeout
[ https://issues.apache.org/jira/browse/HBASE-7684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

cuijianwei updated HBASE-7684:
--
Attachment: check_call_null_before_set_value_in_secure_client.txt

Formatted with dev-support/hbase_eclipse_formatter.xml.

NullPointerException in SecureClient when Call is cleaned up due to RPC timeout
Key: HBASE-7684
URL: https://issues.apache.org/jira/browse/HBASE-7684
Project: HBase
Issue Type: Bug
Components: security
Affects Versions: 0.94.3
Reporter: cuijianwei
Attachments: check_call_null_before_set_in_secure_client.diff, check_call_null_before_set_value_in_secure_client.txt

We found a NullPointerException when using SecureClient to access HBase. From the source code, we see that receiveResponse() does not check call != null before invoking call.setValue(...) (lines 378 and 380 in SecureClient.java). However, as explained in receiveResponse() of HBaseClient, the call may have been cleaned up due to an RPC timeout; therefore, it would be better to check call != null before invoking call.setValue(...) in SecureClient.
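The race described above can be sketched as follows. This is not the SecureClient code or the attached patch: PendingCallsSketch, Call, register, cleanupOnTimeout, and the receiveResponse signature are hypothetical stand-ins. The sketch assumes pending calls are tracked in a map keyed by call id, and that a timeout handler may remove an entry before its response arrives.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration of the null check; not the actual HBase client code.
final class PendingCallsSketch {

    // Stand-in for the client's Call object.
    static final class Call {
        volatile Object value;

        void setValue(Object v) {
            value = v;
        }
    }

    private final Map<Integer, Call> calls = new ConcurrentHashMap<>();

    // Registers a pending call under the given id.
    Call register(int id) {
        Call call = new Call();
        calls.put(id, call);
        return call;
    }

    // Simulates the RPC timeout path removing a call that took too long.
    void cleanupOnTimeout(int id) {
        calls.remove(id);
    }

    // Delivers a response. Returns false when the call was already cleaned up,
    // instead of dereferencing null and throwing a NullPointerException.
    boolean receiveResponse(int id, Object value) {
        Call call = calls.remove(id);
        if (call == null) {
            return false; // late response for a timed-out call: drop it
        }
        call.setValue(value);
        return true;
    }
}
```

Without the null check, a response that arrives after cleanupOnTimeout would fail on call.setValue(...), which matches the NullPointerException reported in receiveResponse().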
[jira] [Updated] (HBASE-7691) A couple docbook updates
[ https://issues.apache.org/jira/browse/HBASE-7691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-7691:
-
Resolution: Fixed
Fix Version/s: 0.96.0
Hadoop Flags: Reviewed
Status: Resolved (was: Patch Available)

Committed to trunk. Will show on site next time we push. Thanks Nick

A couple docbook updates
Key: HBASE-7691
URL: https://issues.apache.org/jira/browse/HBASE-7691
Project: HBase
Issue Type: Improvement
Reporter: Nick Dimiduk
Assignee: Nick Dimiduk
Priority: Trivial
Fix For: 0.96.0
Attachments: 0001-HBASE-7691-A-couple-docbook-updates.patch

I've noticed a couple mistakes in the book.
- link to external (hadoop) documentation which is no longer available on {{hadoop.apache.org/common/docs/current}}.
- small mistake in instructions for generating a release tarball.
- mistake in descriptions of test profiles.
[jira] [Commented] (HBASE-7684) NullPointerException in SecureClient when Call is cleaned up due to RPC timeout
[ https://issues.apache.org/jira/browse/HBASE-7684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564069#comment-13564069 ]

cuijianwei commented on HBASE-7684:
---
Thanks for your concern. I used the SecureClient source code from 0.94.3 and got the exception stack trace shown above. The diff patch has been updated using dev-support/hbase_eclipse_formatter.xml.

NullPointerException in SecureClient when Call is cleaned up due to RPC timeout
Key: HBASE-7684
URL: https://issues.apache.org/jira/browse/HBASE-7684
Project: HBase
Issue Type: Bug
Components: security
Affects Versions: 0.94.3
Reporter: cuijianwei
Attachments: check_call_null_before_set_in_secure_client.diff, check_call_null_before_set_value_in_secure_client.txt

We found a NullPointerException when using SecureClient to access HBase. From the source code, we see that receiveResponse() does not check call != null before invoking call.setValue(...) (lines 378 and 380 in SecureClient.java). However, as explained in receiveResponse() of HBaseClient, the call may have been cleaned up due to an RPC timeout; therefore, it would be better to check call != null before invoking call.setValue(...) in SecureClient.
[jira] [Commented] (HBASE-7692) Add utility class to generate ordered byte[] serialization
[ https://issues.apache.org/jira/browse/HBASE-7692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564070#comment-13564070 ]

Anoop Sam John commented on HBASE-7692:
---
So here the utility will take some type information about the value, right?

Add utility class to generate ordered byte[] serialization
--
Key: HBASE-7692
URL: https://issues.apache.org/jira/browse/HBASE-7692
Project: HBase
Issue Type: Improvement
Reporter: Nick Dimiduk
Assignee: Nick Dimiduk

The current Bytes utility class works, but it produces output that does not maintain the native sort order of the input value. As a result, for example, an encoded negative value does not necessarily sort before an encoded positive value. HBase should provide a canonical implementation of an order-preserving serialization format so that third parties can reliably build on top of HBase. This will allow an implementation for HIVE-3634, HIVE-2599, or HIVE-2903 that is compatible with similar features in Pig.
[jira] [Commented] (HBASE-1212) merge tool expects regions all have different sequence ids
[ https://issues.apache.org/jira/browse/HBASE-1212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564071#comment-13564071 ]

chunhui shen commented on HBASE-1212:
-
We will use sequenceId to sort HFiles, as in compaction. When merging, there is no overlap between the files from the two different regions, so I think it's OK even if two files have the same sequence ID, since we no longer use it to distinguish the files in a Store.

Patch looks good to me.

merge tool expects regions all have different sequence ids
--
Key: HBASE-1212
URL: https://issues.apache.org/jira/browse/HBASE-1212
Project: HBase
Issue Type: Bug
Reporter: stack
Assignee: Jean-Marc Spaggiari
Attachments: failure.log, HBASE-1212.patch, HBASE-1212-v2.patch, HBASE-1212-v3-trunk.patch

Currently, when merging two regions, the merge tool compares their sequence ids. If they are the same, it decrements one. It needs to do this because on region open, files are keyed by their sequenceid; if two are the same, one will erase the other. With the move to the aggregating hfile format, the sequenceid is written when the file is created, and it is no longer written into a side file but as metadata at the end of the file. Changing the sequenceid is no longer an option. This issue is about figuring out a solution for the rare case where two store files have the same sequence id AND we want to merge the two regions.
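The difference between keying files by sequence id and merely sorting them by it can be shown with a small sketch. It does not reproduce the HBase Store or merge tool code: SequenceIdSketch, FileStub, keyedBySequenceId, and sortedBySequenceId are hypothetical names, and the file names in the usage note are made up. The sketch assumes only that each store file carries a sequence id.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Hypothetical illustration only; not the HBase Store implementation.
final class SequenceIdSketch {

    // Stand-in for a store file with its sequence id.
    static final class FileStub {
        final String name;
        final long sequenceId;

        FileStub(String name, long sequenceId) {
            this.name = name;
            this.sequenceId = sequenceId;
        }
    }

    // Keying by sequence id: a second file with the same id replaces the first.
    static TreeMap<Long, FileStub> keyedBySequenceId(List<FileStub> files) {
        TreeMap<Long, FileStub> byId = new TreeMap<>();
        for (FileStub f : files) {
            byId.put(f.sequenceId, f);
        }
        return byId;
    }

    // Sorting by sequence id: files with equal ids are both kept.
    static List<FileStub> sortedBySequenceId(List<FileStub> files) {
        List<FileStub> sorted = new ArrayList<>(files);
        sorted.sort(Comparator.comparingLong((FileStub f) -> f.sequenceId));
        return sorted;
    }
}
```

With two files that share sequence id 42, keyedBySequenceId returns a single entry while sortedBySequenceId returns both files, which is the distinction the comment above relies on.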