[jira] [Commented] (HBASE-8630) Share Socket Connections for different HConnectionImplementations
[ https://issues.apache.org/jira/browse/HBASE-8630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669036#comment-13669036 ]

Hadoop QA commented on HBASE-8630:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12585159/8630-trunk-v2.txt
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile.
{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:
  org.apache.hadoop.hbase.client.TestFromClientSideWithCoprocessor
  org.apache.hadoop.hbase.master.TestTableLockManager

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/5862//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5862//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5862//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5862//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5862//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5862//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5862//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5862//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5862//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/5862//console

This message is automatically generated.
Share Socket Connections for different HConnectionImplementations

Key: HBASE-8630
URL: https://issues.apache.org/jira/browse/HBASE-8630
Project: HBase
Issue Type: Improvement
Components: Client
Affects Versions: 0.94.3
Reporter: cuijianwei
Attachments: 8630-trunk-v1.txt, 8630-trunk-v2.txt

In org.apache.hadoop.hbase.ipc.HBaseClient.java, socket connections are pooled in a map:
{code}
protected final PoolMap<ConnectionId, Connection> connections;
{code}
The hashCode of ConnectionId is defined as:
{code}
public int hashCode() {
  return (address.hashCode() + PRIME * (
      PRIME * System.identityHashCode(protocol) ^
      (ticket == null ? 0 : ticket.hashCode()) )) ^ rpcTimeout;
}
{code}
As we can see, ticket.hashCode() contributes to the hashCode of ConnectionId. For HBase without authentication, the ticket should be a HadoopUser; for HBase with authentication, it should be a SecureHadoopUser. Neither HadoopUser nor SecureHadoopUser overrides hashCode(), so two tickets have the same hashCode only when they refer to the same object. On the other hand, when we use HTable to access HBase, we first invoke HBaseRPC.waitForProxy(...) to create a proxy for the region server as follows:
{code}
server = (HRegionInterface) HBaseRPC.waitForProxy(
    serverInterfaceClass, HRegionInterface.VERSION,
    address, this.conf,
    this.maxRPCAttempts, this.rpcTimeout, this.rpcTimeout);
{code}
[jira] [Commented] (HBASE-8625) Dependency version upgrade
[ https://issues.apache.org/jira/browse/HBASE-8625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669041#comment-13669041 ]

Nicolas Liochon commented on HBASE-8625:

I can do that. I also plan to propose an upgrade to hdfs.

Dependency version upgrade

Key: HBASE-8625
URL: https://issues.apache.org/jira/browse/HBASE-8625
Project: HBase
Issue Type: Improvement
Affects Versions: 0.98.0, 0.95.0
Reporter: Nicolas Liochon
Assignee: Nicolas Liochon
Priority: Minor
Fix For: 0.98.0
Attachments: 8625.v1.patch, 8625.v2.patch

The junit dependency should be scoped test.
We should use a newer version of jaxb-api. One of our 3rd-party dependencies would prefer a newer one:
  javax.xml.bind:jaxb-api:jar:2.1:compile (version managed from 2.2.2)
The latest is 2.2.4.
Not mandatory, but should be done:
  guava 14.0.1
  netty 3.6.6.Final
  commons-codec.version 1.8
  jackson.version 1.9.3

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8610) Introduce interfaces to support MultiWAL
[ https://issues.apache.org/jira/browse/HBASE-8610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669072#comment-13669072 ]

ramkrishna.s.vasudevan commented on HBASE-8610:

To start this off, what are the different scenarios for MultiWAL? Some obvious ones are:
- WAL per table
- WAL per group of regions (the type of group can be decided by the user)
- WAL per CF? (I think this is difficult to implement with respect to per-row atomicity across CFs)
Any other use cases? The default case will be one WAL per RS and one for META.

Introduce interfaces to support MultiWAL

Key: HBASE-8610
URL: https://issues.apache.org/jira/browse/HBASE-8610
Project: HBase
Issue Type: Improvement
Components: wal
Affects Versions: 0.98.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Fix For: 0.98.0

As the heading says, this JIRA is specific to adding interfaces to support MultiWAL.
[jira] [Commented] (HBASE-8610) Introduce interfaces to support MultiWAL
[ https://issues.apache.org/jira/browse/HBASE-8610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669088#comment-13669088 ]

Anoop Sam John commented on HBASE-8610:

bq. WAL per group of regions(what type of group ) can be decided by the user

Yes, some sort of pluggable (balancer-like) component that the user can implement would be necessary. Different implementations of it can decide whether one WAL serves a whole table, a group of regions, or a group of regions of a particular table.
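The pluggable grouping discussed in this thread could look roughly like the sketch below. It is only an illustration under stated assumptions: the names WalGroupingDemo, WalGroupingStrategy, walGroupFor and the three implementations are hypothetical and are not part of HBase, and tables and regions are identified by plain strings.

```java
import java.util.HashMap;
import java.util.Map;

final class WalGroupingDemo {

  /** Hypothetical plug-in point: maps a region to the name of the WAL it writes to. */
  interface WalGroupingStrategy {
    String walGroupFor(String tableName, String regionName);
  }

  /** Default case from the thread: one WAL per RS and a separate one for META. */
  static final class SingleWalStrategy implements WalGroupingStrategy {
    @Override
    public String walGroupFor(String tableName, String regionName) {
      return ".META.".equals(tableName) ? "wal-meta" : "wal-default";
    }
  }

  /** One WAL per table: every region of a table shares the same WAL. */
  static final class PerTableWalStrategy implements WalGroupingStrategy {
    @Override
    public String walGroupFor(String tableName, String regionName) {
      return "wal-" + tableName;
    }
  }

  /** User-defined groups of regions; regions not listed fall back to the default WAL. */
  static final class RegionGroupWalStrategy implements WalGroupingStrategy {
    private final Map<String, String> groupByRegion;

    RegionGroupWalStrategy(Map<String, String> groupByRegion) {
      // Defensive copy so later changes to the caller's map do not affect routing.
      this.groupByRegion = new HashMap<>(groupByRegion);
    }

    @Override
    public String walGroupFor(String tableName, String regionName) {
      return groupByRegion.getOrDefault(regionName, "wal-default");
    }
  }

  public static void main(String[] args) {
    Map<String, String> groups = new HashMap<>();
    groups.put("region-a", "wal-hot");
    groups.put("region-b", "wal-hot");

    WalGroupingStrategy[] strategies = {
      new SingleWalStrategy(), new PerTableWalStrategy(), new RegionGroupWalStrategy(groups)
    };
    for (WalGroupingStrategy s : strategies) {
      System.out.println(s.getClass().getSimpleName() + " -> "
          + s.walGroupFor("usertable", "region-a"));
    }
  }
}
```

A per-CF strategy is left out on purpose, since the thread notes that per-row atomicity across CFs makes it hard to implement.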
[jira] [Commented] (HBASE-8496) Implement tags and the internals of how a tag should look like
[ https://issues.apache.org/jira/browse/HBASE-8496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669096#comment-13669096 ]

ramkrishna.s.vasudevan commented on HBASE-8496:

I have some patches ready for this. Before I bring them up for further discussion:
- From the client perspective, will tags now be added as part of Puts? Put.add() would have an option to pass a tag array. Another option we thought of is to use OperationAttributes and set the tags there, but then we need some CPs to make sure that the tags set in OperationAttributes are added to the KVs of the Put.
- Will a tag be an integral part of the KVs?
- A new kind of HFile reader and writer is needed to read tags from, and write them to, the block byte buffer.
What are your suggestions on the above? I will come up with patches soon. Thanks all.

Implement tags and the internals of how a tag should look like

Key: HBASE-8496
URL: https://issues.apache.org/jira/browse/HBASE-8496
Project: HBase
Issue Type: New Feature
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Fix For: 0.98.0

The intent of this JIRA comes from HBASE-7897. This would help us decide on the structure and format of the tags.
[jira] [Commented] (HBASE-8626) RowMutations fail when Delete and Put on same columnFamily/column/row
[ https://issues.apache.org/jira/browse/HBASE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669113#comment-13669113 ]

Vinod commented on HBASE-8626:

I think this becomes relevant when I want to remove all columns in a column family and then add some new columns to the same column family, atomically. Here is the original use case that led to this; please suggest how else I can achieve the same.

I have an HBase (v0.94.7) table with a single column family, and columns are added to it over time. The columns are named after the timestamp at which they were created, so unless I query the row I do not know which columns it has. Now, given a row, I want to _atomically_ remove all the existing columns of this column family and add a new set of columns and values. So I thought of using HBase's RowMutations like this:
--
RowMutations mutations = new RowMutations(row);
// delete the column family
Delete delete = new Delete(row);
delete.deleteFamily(cf);
// add new columns
Put put = new Put(row);
put.add(cf, col1, v1);
put.add(cf, col2, v2);
// delete column family and add new columns to same family
mutations.add(delete);
mutations.add(put);
table.mutateRow(mutations);
--
But this code ends up only deleting the column family; it does not add the new columns to the column family.

RowMutations fail when Delete and Put on same columnFamily/column/row

Key: HBASE-8626
URL: https://issues.apache.org/jira/browse/HBASE-8626
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.94.7, 0.95.0
Environment: Ubuntu 12.04, HBase 0.94.7
Reporter: Vinod
Assignee: Ted Yu
Fix For: 0.94.7, 0.95.1
Attachments: 8626-v1.txt, TestRowMutations.java, tests_for_row_mutations1.patch

When a RowMutations contains a Delete followed by a Put to the same column family, column, or row, only the Delete takes effect and the Put is ignored, so the atomicity of RowMutations is broken in such cases.
Attached is a unit test where the following tests are failing:
- testDeleteCFThenPutInSameCF: Delete a column family and then Put to same column family.
- testDeleteColumnThenPutSameColumn: Delete a column and then Put to same column.
- testDeleteRowThenPutSameRow: Delete a row and then Put to same row
[jira] [Commented] (HBASE-8626) RowMutations fail when Delete and Put on same columnFamily/column/row
[ https://issues.apache.org/jira/browse/HBASE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669124#comment-13669124 ]

Liang Xie commented on HBASE-8626:

[~vinod], we (XiaoMi) had a scenario similar to yours, and we introduced a new DeleteFamilyVersion KV type and modified Delete/DeleteTracker/ScanDeleteTracker/ScanQueryMatcher, [~fenghh].
[jira] [Updated] (HBASE-8627) HBCK can not fix meta not assigned issue
[ https://issues.apache.org/jira/browse/HBASE-8627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anoop Sam John updated HBASE-8627:

Attachment: HBASE-8627_Trunk-V2.patch

HBCK can not fix meta not assigned issue

Key: HBASE-8627
URL: https://issues.apache.org/jira/browse/HBASE-8627
Project: HBase
Issue Type: Bug
Components: hbck
Affects Versions: 0.95.0
Reporter: Anoop Sam John
Assignee: Anoop Sam John
Attachments: HBASE-8627_Trunk.patch, HBASE-8627_Trunk-V2.patch

When the meta table region is not assigned to any RS, an HBCK run fails with an exception. I can see code added in checkMetaRegion() to solve this issue, but it won't work: it still refers to the ROOT region!
[jira] [Updated] (HBASE-8627) HBCK can not fix meta not assigned issue
[ https://issues.apache.org/jira/browse/HBASE-8627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-8627: -- Status: Open (was: Patch Available)
[jira] [Updated] (HBASE-8627) HBCK can not fix meta not assigned issue
[ https://issues.apache.org/jira/browse/HBASE-8627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-8627: -- Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-8627) HBCK can not fix meta not assigned issue
[ https://issues.apache.org/jira/browse/HBASE-8627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669129#comment-13669129 ]

Anoop Sam John commented on HBASE-8627:

Thanks for the review, Jon and Sergey.

{quote}
What are we planning to remove? the meta entry? this message?
{code}
+    if (!recordMetaRegion()) {
+      // Will remove later if we can fix it
+      errors.reportError("Fatal error: unable to get .META. region location. Exiting...");
{code}
{quote}
This is an existing piece of code. It handles the case where HBCK is not able to get the META location from ZooKeeper. Right now we have no way to fix that. The comment means we can remove this error reporting once we are able to fix the issue. I am not changing this comment. Please note that this code was moved from another method:
{code}
-    if (!recordMetaRegion()) {
-      // Will remove later if we can fix it
-      errors.reportError("Fatal error: unable to get root region location. Exiting...");
-      return false;
-    }
{code}

bq. We need to fix it twice? why? (explain with comment please)

No need to fix twice. After the fix, hbck is called again only to assert that there are no errors, so I can pass fix as false for that call. A copy-paste left this param as true; corrected now.

bq. Null is for a coder. Maybe turn this message into something more admin friendly?

Actually we will never get this message with the new code. Before checkMetaRegion() we read the meta location from ZooKeeper and add an entry to regionInfoMap. We also record each RS's online regions in regionInfoMap. In both places I have checks to make sure that only one entry, the one corresponding to meta, gets added to regionInfoMap. If we cannot get the meta location even from ZooKeeper, we exit with a fatal error message. What we need to check is whether the number of deployed servers is 0 or 1. I am just deleting this if check. This will address Sergey's comment as well.
[jira] [Commented] (HBASE-8626) RowMutations fail when Delete and Put on same columnFamily/column/row
[ https://issues.apache.org/jira/browse/HBASE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669131#comment-13669131 ]

Vinod commented on HBASE-8626:

In the use case above, the client does not know up front which columns the row/column family has. I guess this is a common use case in schema-free data stores like HBase. The client gets a new copy of the entire row, essentially new data for that row, which might not contain all the columns the row currently has. So it needs to atomically _replace_ the entire row with the new data.

One way I could think of is to use RowMutations to first delete the entire column family and then Put the new columns to the same column family.

Another way would be to read the row first to figure out the current columns, then create a non-overlapping set of Puts and Deletes and add those to the RowMutations. But this is a check-then-act scenario, which can cause inconsistency. It also requires multiple round trips to the server.

Are there any other ways to address this use case?
[jira] [Commented] (HBASE-8630) Share Socket Connections for different HConnectionImplementations
[ https://issues.apache.org/jira/browse/HBASE-8630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669142#comment-13669142 ]

Anoop Sam John commented on HBASE-8630:

bq. Neither HadoopUser nor SecureHadoopUser override hashCode() method, therefore, two tickets have the same hashCode only when they refer to the same object.

In the 0.94 code base I can see a hashCode() implementation in the abstract class User, which HadoopUser and SecureHadoopUser both extend:
{code}
public int hashCode() {
  return ugi.hashCode();
}
{code}

Share Socket Connections for different HConnectionImplementations

Key: HBASE-8630
URL: https://issues.apache.org/jira/browse/HBASE-8630
Project: HBase
Issue Type: Improvement
Components: Client
Affects Versions: 0.94.3
Reporter: cuijianwei
Attachments: 8630-trunk-v1.txt, 8630-trunk-v2.txt

In org.apache.hadoop.hbase.ipc.HBaseClient.java, socket connections are pooled in a map:
{code}
protected final PoolMap<ConnectionId, Connection> connections;
{code}
The hashCode of ConnectionId is defined as:
{code}
public int hashCode() {
  return (address.hashCode() + PRIME * (
      PRIME * System.identityHashCode(protocol) ^
      (ticket == null ? 0 : ticket.hashCode()) )) ^ rpcTimeout;
}
{code}
As we can see, ticket.hashCode() contributes to the hashCode of ConnectionId. For HBase without authentication, the ticket should be a HadoopUser; for HBase with authentication, it should be a SecureHadoopUser. Neither HadoopUser nor SecureHadoopUser overrides hashCode(), so two tickets have the same hashCode only when they refer to the same object. On the other hand, when we use HTable to access HBase, we first invoke HBaseRPC.waitForProxy(...) to create a proxy for the region server as follows:
{code}
server = (HRegionInterface) HBaseRPC.waitForProxy(
    serverInterfaceClass, HRegionInterface.VERSION,
    address, this.conf,
    this.maxRPCAttempts, this.rpcTimeout, this.rpcTimeout);
{code}
Then HBaseRPC.getProxy(...) is called as follows:
{code}
public static VersionedProtocol getProxy(
    Class<? extends VersionedProtocol> protocol,
    long clientVersion, InetSocketAddress addr, Configuration conf,
    SocketFactory factory, int rpcTimeout) throws IOException {
  return getProxy(protocol, clientVersion, addr,
      User.getCurrent(), conf, factory, rpcTimeout);
}
{code}
As we can see, User.getCurrent() is invoked to generate the ticket used to build the socket connection. User.getCurrent() is defined as:
{code}
public static User getCurrent() throws IOException {
  User user;
  if (IS_SECURE_HADOOP) {
    user = new SecureHadoopUser();
  } else {
    user = new HadoopUser();
  }
  if (user.getUGI() == null) {
    return null;
  }
  return user;
}
{code}
Therefore, we get a different ticket each time we create a proxy for the same region server, so these proxies cannot share the existing socket connections and will create new ones even if they have the same HBaseConfiguration. We can use the following case to validate the description above:
{code}
public static void main(String args[]) throws Exception {
  Configuration conf = HBaseConfiguration.create();
  for (int i = 0;; ++i) {
    HTable table = new HTable(conf, TestTable.testTableName);
    table.close();
  }
}
{code}
Each time we close the HTable, the region server proxies it created are closed, because the underlying HConnectionImplementation is closed. However, the socket connections are not closed; they wait to be shared in the future. When we create an HTable in the next iteration, we create the server proxy again, get a new ticket, and consequently create new socket connections. The socket connections created in the previous iteration can no longer be used. As the loop goes on, thousands of sockets are created to connect to region servers until we get an exception indicating that no more sockets can be created.
To fix the problem, maybe we can use ticket.getName().hashCode() instead of ticket.hashCode()?
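The pooling behaviour described in the report can be illustrated with a small self-contained model. This is a sketch under assumptions, not HBase code: Ticket, TicketKey, NameKey and pooledConnections are hypothetical stand-ins, and Ticket deliberately does not override hashCode()/equals(), which is the premise of the report (the comment above points out that User defines hashCode() as ugi.hashCode(), so the real behaviour depends on the UGI). Note that a hash map lookup relies on equals() as well as hashCode(), so a name-based key has to define both.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class ConnectionPoolKeyDemo {

  /** Hypothetical ticket that inherits identity-based hashCode()/equals() from Object. */
  static final class Ticket {
    final String name;
    Ticket(String name) { this.name = name; }
  }

  /** Pool key that hashes and compares the ticket object itself. */
  static final class TicketKey {
    final String address;
    final Ticket ticket;
    TicketKey(String address, Ticket ticket) { this.address = address; this.ticket = ticket; }
    @Override public int hashCode() { return 31 * address.hashCode() + ticket.hashCode(); }
    @Override public boolean equals(Object o) {
      if (!(o instanceof TicketKey)) return false;
      TicketKey k = (TicketKey) o;
      return address.equals(k.address) && ticket.equals(k.ticket);
    }
  }

  /** Pool key that hashes and compares the ticket's name instead. */
  static final class NameKey {
    final String address;
    final String userName;
    NameKey(String address, Ticket ticket) { this.address = address; this.userName = ticket.name; }
    @Override public int hashCode() { return Objects.hash(address, userName); }
    @Override public boolean equals(Object o) {
      if (!(o instanceof NameKey)) return false;
      NameKey k = (NameKey) o;
      return address.equals(k.address) && userName.equals(k.userName);
    }
  }

  /** Simulates `rounds` proxies to one server, each created with a fresh ticket. */
  static int pooledConnections(boolean keyByName, int rounds) {
    Map<Object, Object> pool = new HashMap<>();
    for (int i = 0; i < rounds; i++) {
      Ticket ticket = new Ticket("alice"); // new object every round, same user name
      Object key = keyByName
          ? new NameKey("rs1:60020", ticket)
          : new TicketKey("rs1:60020", ticket);
      // A new "socket" is opened only when no pooled entry matches the key.
      pool.computeIfAbsent(key, k -> new Object());
    }
    return pool.size();
  }

  public static void main(String[] args) {
    System.out.println("keyed by ticket object: " + pooledConnections(false, 100));
    System.out.println("keyed by ticket name:   " + pooledConnections(true, 100));
  }
}
```

In this model the pool keyed by the ticket object gains one entry per round, while the pool keyed by the user name keeps a single entry, which is the effect the proposed ticket.getName() change aims for.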
[jira] [Comment Edited] (HBASE-8630) Share Socket Connections for different HConnectionImplementations
[ https://issues.apache.org/jira/browse/HBASE-8630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669142#comment-13669142 ]

Anoop Sam John edited comment on HBASE-8630 at 5/29/13 10:51 AM:

bq. Neither HadoopUser nor SecureHadoopUser override hashCode() method, therefore, two tickets have the same hashCode only when they refer to the same object.

In the 0.94 code base I can see a hashCode() implementation in the abstract class User, which HadoopUser and SecureHadoopUser both extend:
{code}
public int hashCode() {
  return ugi.hashCode();
}
{code}
In trunk:
{code}
public static User getCurrent() throws IOException {
  User user = new SecureHadoopUser();
  if (user.getUGI() == null) {
    return null;
  }
  return user;
}
{code}
SecureHadoopUser extends User, and I can see a hashCode() implementation in User.java.
[jira] [Commented] (HBASE-8626) RowMutations fail when Delete and Put on same columnFamily/column/row
[ https://issues.apache.org/jira/browse/HBASE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669152#comment-13669152 ]

Andrew Purtell commented on HBASE-8626:

I read this as a request to change RowMutation semantics from "a bundle of ops to apply atomically at the exact same time" to "a bundle of ops to apply atomically, with each op applied at a monotonically increasing time, with row locks providing mutual exclusion". It's logical enough: a client could then structure the RM with a DeleteColumn first and Puts to the same row+column after, as in the use case described here. I think that _could_ be reasonable, but we would have to take care that no ops outside the RowMutation bundle can have interleaving timestamps unless the client is providing them, so is that complication worth it?
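The difference between the two semantics can be shown with a toy model. This is not HBase code: Family, Cell and deleteThenPut are hypothetical names, and the model assumes only the HBase rule that a family delete marker at timestamp T hides every cell of that family whose timestamp is less than or equal to T.

```java
import java.util.ArrayList;
import java.util.List;

final class DeleteMaskingDemo {

  /** Simplified cell: a column written at a timestamp. */
  static final class Cell {
    final String column;
    final long ts;
    Cell(String column, long ts) { this.column = column; this.ts = ts; }
  }

  /** Simplified column family: its cells plus the newest family delete marker. */
  static final class Family {
    private final List<Cell> cells = new ArrayList<>();
    private long familyDeleteTs = Long.MIN_VALUE; // no delete marker yet

    void put(String column, long ts) { cells.add(new Cell(column, ts)); }

    void deleteFamily(long ts) { familyDeleteTs = Math.max(familyDeleteTs, ts); }

    /** A cell is visible only if it is strictly newer than the delete marker. */
    List<String> visibleColumns() {
      List<String> visible = new ArrayList<>();
      for (Cell c : cells) {
        if (c.ts > familyDeleteTs) {
          visible.add(c.column);
        }
      }
      return visible;
    }
  }

  /** Models a Delete followed by a Put, applied at the given timestamps. */
  static List<String> deleteThenPut(long deleteTs, long putTs) {
    Family family = new Family();
    family.put("old", deleteTs - 10); // pre-existing column, written earlier
    family.deleteFamily(deleteTs);
    family.put("col1", putTs);
    family.put("col2", putTs);
    return family.visibleColumns();
  }

  public static void main(String[] args) {
    // Same timestamp for every op in the bundle: the marker hides the new columns.
    System.out.println("same timestamp:       " + deleteThenPut(100L, 100L));
    // Monotonically increasing timestamps: the new columns survive.
    System.out.println("increasing timestamp: " + deleteThenPut(100L, 101L));
  }
}
```

Under these assumptions the same-timestamp case leaves no visible columns, matching the reported behaviour, while giving the Put a later timestamp than the Delete keeps col1 and col2 and still removes the old column.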
[jira] [Commented] (HBASE-8627) HBCK can not fix meta not assigned issue
[ https://issues.apache.org/jira/browse/HBASE-8627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669153#comment-13669153 ] Hadoop QA commented on HBASE-8627: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12585178/HBASE-8627_Trunk-V2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.master.TestAssignmentManagerOnCluster Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/5863//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5863//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5863//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5863//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5863//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5863//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5863//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5863//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5863//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/5863//console This message is automatically generated. HBCK can not fix meta not assigned issue Key: HBASE-8627 URL: https://issues.apache.org/jira/browse/HBASE-8627 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 0.95.0 Reporter: Anoop Sam John Assignee: Anoop Sam John Attachments: HBASE-8627_Trunk.patch, HBASE-8627_Trunk-V2.patch When meta table region is not assigned to any RS, HBCK run will get exception. I can see code added in checkMetaRegion() to solve this issue but it wont work. It still refers to ROOT region! 
[jira] [Commented] (HBASE-8627) HBCK can not fix meta not assigned issue
[ https://issues.apache.org/jira/browse/HBASE-8627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669175#comment-13669175 ] Anoop Sam John commented on HBASE-8627: --- Test failure seems not related. HBCK can not fix meta not assigned issue Key: HBASE-8627 URL: https://issues.apache.org/jira/browse/HBASE-8627 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 0.95.0 Reporter: Anoop Sam John Assignee: Anoop Sam John Attachments: HBASE-8627_Trunk.patch, HBASE-8627_Trunk-V2.patch When meta table region is not assigned to any RS, HBCK run will get exception. I can see code added in checkMetaRegion() to solve this issue but it wont work. It still refers to ROOT region! -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-8640) ServerName in master may not initialize with the configured ipc address of hbase.master.ipc.address
rajeshbabu created HBASE-8640: - Summary: ServerName in master may not initialize with the configured ipc address of hbase.master.ipc.address Key: HBASE-8640 URL: https://issues.apache.org/jira/browse/HBASE-8640 Project: HBase Issue Type: Bug Components: master Reporter: rajeshbabu Assignee: rajeshbabu Fix For: 0.98.0, 0.95.2, 0.94.9 We start the RPC server with either the default interface hostname or the configured ipc address:
{code}
this.rpcServer = HBaseRPC.getServer(this,
    new Class<?>[]{HMasterInterface.class, HMasterRegionInterface.class},
    initialIsa.getHostName(), // This is bindAddress if set else it's hostname
    initialIsa.getPort(),
    numHandlers,
    0, // we dont use high priority handlers in master
    conf.getBoolean("hbase.rpc.verbose", false), conf,
    0); // this is a DNC w/o high priority handlers
{code}
But we always initialize the ServerName with the default hostname, and the master znode also carries this hostname.
{code}
String hostname = Strings.domainNamePointerToHostName(DNS.getDefaultHost(
    conf.get("hbase.master.dns.interface", "default"),
    conf.get("hbase.master.dns.nameserver", "default")));
...
this.serverName = new ServerName(hostname, this.isa.getPort(), System.currentTimeMillis());
{code}
If the default interface hostname and the configured ipc address are not the same, clients will get MasterNotRunningException.
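One way to picture the mismatch reported in HBASE-8640 is a helper that picks the advertised hostname from the same source as the bind address. The following is a minimal sketch, not the actual patch: java.util.Properties stands in for the Hadoop Configuration object, and the class and method names (MasterHostnameSketch, advertisedHostname) are hypothetical. Only the key hbase.master.ipc.address comes from the issue summary.

```java
import java.util.Properties;

public class MasterHostnameSketch {

    /**
     * Returns the hostname the master would advertise in this model:
     * the configured ipc address when one is set, otherwise the default hostname.
     */
    public static String advertisedHostname(Properties conf, String defaultHostname) {
        String ipcAddress = conf.getProperty("hbase.master.ipc.address");
        if (ipcAddress != null && !ipcAddress.isEmpty()) {
            return ipcAddress; // keep the advertised name in line with the bind address
        }
        return defaultHostname;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(advertisedHostname(conf, "master-default"));
        conf.setProperty("hbase.master.ipc.address", "10.0.0.5");
        System.out.println(advertisedHostname(conf, "master-default"));
    }
}
```

With this choice the name stored in the znode and the address the RPC server listens on cannot diverge, which is the condition the issue says leads to MasterNotRunningException.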
[jira] [Created] (HBASE-8641) IndexBuilder example : CF name of the src table is hard coded
Anoop Sam John created HBASE-8641: - Summary: IndexBuilder example : CF name of the src table is hard coded Key: HBASE-8641 URL: https://issues.apache.org/jira/browse/HBASE-8641 Project: HBase Issue Type: Bug Affects Versions: 0.95.0 Reporter: Anoop Sam John Assignee: Anoop Sam John Priority: Minor When running the IndexBuilder example we can pass the table name, family name and qualifier name for the data to be indexed. But in the code the family name is hard coded to "attributes", so the example works only when the family name of the src table is "attributes".
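The effect of the hard-coded family name can be shown with a small model. This is a sketch under stated assumptions: a plain Map stands in for the job configuration, the class and method names are hypothetical, and the exact line removed by the patch is not shown in this thread. The key index.familyname is the one referenced in the related HBASE-8534 comment.

```java
import java.util.HashMap;
import java.util.Map;

public class IndexFamilyConfigSketch {
    static final String FAMILY_KEY = "index.familyname";

    /** Models the reported bug: the user-supplied family is overwritten by a constant. */
    public static Map<String, String> configureHardCoded(String family) {
        Map<String, String> conf = new HashMap<>();
        conf.put(FAMILY_KEY, family);
        conf.put(FAMILY_KEY, "attributes"); // hard-coded override, the kind of line a fix would remove
        return conf;
    }

    /** Models the intended behaviour: the family passed on the command line is kept. */
    public static Map<String, String> configureFromArgs(String family) {
        Map<String, String> conf = new HashMap<>();
        conf.put(FAMILY_KEY, family);
        return conf;
    }

    public static void main(String[] args) {
        System.out.println(configureHardCoded("info").get(FAMILY_KEY));
        System.out.println(configureFromArgs("info").get(FAMILY_KEY));
    }
}
```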
[jira] [Updated] (HBASE-8641) IndexBuilder example : CF name of the src table is hard coded
[ https://issues.apache.org/jira/browse/HBASE-8641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-8641: -- Attachment: HBASE-8641.patch Very small patch that removes one line. IndexBuilder example : CF name of the src table is hard coded - Key: HBASE-8641 URL: https://issues.apache.org/jira/browse/HBASE-8641 Project: HBase Issue Type: Bug Affects Versions: 0.95.0 Reporter: Anoop Sam John Assignee: Anoop Sam John Priority: Minor Attachments: HBASE-8641.patch When running the IndexBuilder example we can pass the table name, family name and qualifier name for the data to be indexed. But in the code the family name is hard coded to "attributes", so the example works only when the family name of the src table is "attributes".
[jira] [Commented] (HBASE-8534) fix coverage org.apache.hadoop.hbase.mapreduce
[ https://issues.apache.org/jira/browse/HBASE-8534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669224#comment-13669224 ] Anoop Sam John commented on HBASE-8534: --- 1. All the tests are in the large test category. Would it be better to have functional tests that start a mini cluster? Have a look at TestCopyTable. Right now everything is mocked. 2. Yes, as Nick said, TestCopyTable needs a change. Now we have a basic test for this class. 3. Just see HBASE-8641. Because of that you will need to change an assertion in TestMapReduceExamples#testIndexBuilder(): +assertEquals("attributes", configuration.get("index.familyname")); fix coverage org.apache.hadoop.hbase.mapreduce -- Key: HBASE-8534 URL: https://issues.apache.org/jira/browse/HBASE-8534 Project: HBase Issue Type: Test Affects Versions: 0.94.8, 0.95.2 Reporter: Aleksey Gorshkov Attachments: HBASE-8534-0.94-d.patch, HBASE-8534-0.94.patch, HBASE-8534-trunk-a.patch, HBASE-8534-trunk-b.patch, HBASE-8534-trunk-c.patch, HBASE-8534-trunk-d.patch, HBASE-8534-trunk.patch fix coverage org.apache.hadoop.hbase.mapreduce patch HBASE-8534-0.94.patch for branch-0.94 patch HBASE-8534-trunk.patch for branch-0.95 and trunk
[jira] [Commented] (HBASE-8630) Share Socket Connections for different HConnectionImplementations
[ https://issues.apache.org/jira/browse/HBASE-8630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669257#comment-13669257 ] Ted Yu commented on HBASE-8630: --- Looks like the issue has been resolved in HBASE-8222 (User class should implement equals() and hashCode()). Share Socket Connections for different HConnectionImplementations - Key: HBASE-8630 URL: https://issues.apache.org/jira/browse/HBASE-8630 Project: HBase Issue Type: Improvement Components: Client Affects Versions: 0.94.3 Reporter: cuijianwei Attachments: 8630-trunk-v1.txt, 8630-trunk-v2.txt In org.apache.hadoop.hbase.ipc.HBaseClient.java, socket connections are pooled in a map:
{code}
protected final PoolMap<ConnectionId, Connection> connections;
{code}
The hashCode of ConnectionId is defined as:
{code}
public int hashCode() {
  return (address.hashCode() + PRIME * (
            PRIME * System.identityHashCode(protocol) ^
            (ticket == null ? 0 : ticket.hashCode()) )) ^ rpcTimeout;
}
{code}
As we can see, ticket.hashCode() contributes to the hashCode of ConnectionId. For hbase without authentication, the ticket should be a HadoopUser; for hbase with authentication, the ticket should be a SecureHadoopUser. Neither HadoopUser nor SecureHadoopUser overrides the hashCode() method; therefore, two tickets have the same hashCode only when they refer to the same object. On the other hand, when we use HTable to access hbase, we first invoke HBaseRPC.waitForProxy(...) to create a proxy for the region server as follows:
{code}
server = (HRegionInterface) HBaseRPC.waitForProxy(
    serverInterfaceClass, HRegionInterface.VERSION,
    address, this.conf,
    this.maxRPCAttempts, this.rpcTimeout, this.rpcTimeout);
{code}
Then HBaseRPC.getProxy(...) will be called as follows:
{code}
public static VersionedProtocol getProxy(Class<? extends VersionedProtocol> protocol,
    long clientVersion, InetSocketAddress addr, Configuration conf,
    SocketFactory factory, int rpcTimeout) throws IOException {
  return getProxy(protocol, clientVersion, addr,
      User.getCurrent(), conf, factory, rpcTimeout);
}
{code}
As we can see, User.getCurrent() is invoked to generate the ticket used to build the socket connection. User.getCurrent() is defined as:
{code}
public static User getCurrent() throws IOException {
  User user;
  if (IS_SECURE_HADOOP) {
    user = new SecureHadoopUser();
  } else {
    user = new HadoopUser();
  }
  if (user.getUGI() == null) {
    return null;
  }
  return user;
}
{code}
Therefore, we get different tickets when we create different proxies for the same region server, so these proxies can't share the existing socket connections and will create new ones even if they have the same HBaseConfiguration. We can use the following case to validate the description above:
{code}
public static void main(String args[]) throws Exception {
  Configuration conf = HBaseConfiguration.create();
  for (int i = 0;; ++i) {
    HTable table = new HTable(conf, TestTable.testTableName);
    table.close();
  }
}
{code}
Each time we close the HTable, the created region server proxies are closed because the underlying HConnectionImplementation is closed. However, the created socket connections are not closed and wait to be shared in the future. Then, when we create an HTable in the next iteration, we create the server proxy again, get a new ticket and consequently create new socket connections. The socket connections created in the previous iteration can no longer be used. As the loop goes on, thousands of sockets are created to connect to region servers until we get an exception indicating that no more sockets can be created. To fix the problem, maybe we can use ticket.getName().hashCode() instead of ticket.hashCode()?
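The pooling problem described in HBASE-8630 can be reproduced in miniature with a plain HashMap. The sketch below is an illustration only: the ticket classes and the string "socket" values are hypothetical stand-ins, and the key is the ticket itself rather than a full ConnectionId. It contrasts a key type that inherits identity-based equals()/hashCode() from Object with one that compares by user name, which is the direction named in the title of HBASE-8222.

```java
import java.util.HashMap;
import java.util.Map;

public class ConnectionKeySketch {

    /** Hypothetical ticket that does not override equals()/hashCode(): identity semantics. */
    static final class IdentityTicket {
        final String name;
        IdentityTicket(String name) { this.name = name; }
    }

    /** Hypothetical ticket that compares by user name. */
    static final class NamedTicket {
        final String name;
        NamedTicket(String name) { this.name = name; }

        @Override
        public boolean equals(Object o) {
            return o instanceof NamedTicket && ((NamedTicket) o).name.equals(name);
        }

        @Override
        public int hashCode() {
            return name.hashCode();
        }
    }

    /** A fresh ticket per round never matches an existing key, so the pool grows every round. */
    public static int pooledWithIdentityTickets(int rounds) {
        Map<Object, String> pool = new HashMap<>();
        for (int i = 0; i < rounds; i++) {
            pool.computeIfAbsent(new IdentityTicket("alice"), k -> "socket");
        }
        return pool.size();
    }

    /** Tickets for the same user are equal, so the first pooled entry is reused. */
    public static int pooledWithNamedTickets(int rounds) {
        Map<Object, String> pool = new HashMap<>();
        for (int i = 0; i < rounds; i++) {
            pool.computeIfAbsent(new NamedTicket("alice"), k -> "socket");
        }
        return pool.size();
    }

    public static void main(String[] args) {
        System.out.println("identity tickets: " + pooledWithIdentityTickets(5));
        System.out.println("named tickets: " + pooledWithNamedTickets(5));
    }
}
```

Note that changing only hashCode(), as the issue text suggests, would not be enough for a hash map lookup to succeed in this model; equals() has to agree with it as well.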
[jira] [Commented] (HBASE-7839) Add a script to simulate a dead machine during the integration tests
[ https://issues.apache.org/jira/browse/HBASE-7839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669258#comment-13669258 ] Nicolas Liochon commented on HBASE-7839: Unixes don't like the sticky bit with shell tools. It's not possible (anymore) for perl, for example. I don't remember if it works with ruby or python, but the Unixes aim to make it impossible. Add a script to simulate a dead machine during the integration tests Key: HBASE-7839 URL: https://issues.apache.org/jira/browse/HBASE-7839 Project: HBase Issue Type: New Feature Components: test Affects Versions: 0.95.2 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Priority: Minor Fix For: 0.95.1 Attachments: 7839.v1.patch A kill -9 cannot be used to simulate a dead box. With a kill -9: 1) the sockets are closed by the OS 2) the next calls immediately get a connection refused. Physically unplugging the cable does work. To automate this, the best solution seems to be a firewall config. It's a little bit complicated, however, because we need to refuse the incoming packets. If we block the outgoing connection, it's detected by the OS and there is no wait on the socket. Moreover, you need to be root, or use a program with a sticky bit to launch the script that changes the firewall config. This will be usable from the integration tests.
[jira] [Commented] (HBASE-7840) Enhance the java it framework to start stop a distributed hbase hadoop cluster
[ https://issues.apache.org/jira/browse/HBASE-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669260#comment-13669260 ] Nicolas Liochon commented on HBASE-7840: My local version has evolved since my initial submission. I need to work on it a little to make it interesting for everybody. Enhance the java it framework to start stop a distributed hbase hadoop cluster --- Key: HBASE-7840 URL: https://issues.apache.org/jira/browse/HBASE-7840 Project: HBase Issue Type: New Feature Components: test Affects Versions: 0.95.2 Reporter: Nicolas Liochon Assignee: Nicolas Liochon Priority: Minor Fix For: 0.95.1 Attachments: 7840.v1.patch, 7840.v3.patch Needs are to use a development version of HBase HDFS 1 2. Ideally, should be nicely backportable to 0.94 to allow comparisons and regression tests between versions.
[jira] [Commented] (HBASE-8630) Share Socket Connections for different HConnectionImplementations
[ https://issues.apache.org/jira/browse/HBASE-8630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669261#comment-13669261 ] cuijianwei commented on HBASE-8630: --- Thanks to [~ted_yu] and [~anoopsamjohn]. HBASE-8222 has indeed resolved the problem, and we can apply that patch. Share Socket Connections for different HConnectionImplementations - Key: HBASE-8630 URL: https://issues.apache.org/jira/browse/HBASE-8630 Project: HBase Issue Type: Improvement Components: Client Affects Versions: 0.94.3 Reporter: cuijianwei Attachments: 8630-trunk-v1.txt, 8630-trunk-v2.txt
[jira] [Updated] (HBASE-8630) Share Socket Connections for different HConnectionImplementations
[ https://issues.apache.org/jira/browse/HBASE-8630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-8630: -- Resolution: Duplicate Status: Resolved (was: Patch Available) Share Socket Connections for different HConnectionImplementations - Key: HBASE-8630 URL: https://issues.apache.org/jira/browse/HBASE-8630 Project: HBase Issue Type: Improvement Components: Client Affects Versions: 0.94.3 Reporter: cuijianwei Attachments: 8630-trunk-v1.txt, 8630-trunk-v2.txt
[jira] [Updated] (HBASE-8534) fix coverage org.apache.hadoop.hbase.mapreduce
[ https://issues.apache.org/jira/browse/HBASE-8534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Gorshkov updated HBASE-8534: Attachment: HBASE-8534-trunk-e.patch HBASE-8534-0.94-e.patch fix coverage org.apache.hadoop.hbase.mapreduce -- Key: HBASE-8534 URL: https://issues.apache.org/jira/browse/HBASE-8534 Project: HBase Issue Type: Test Affects Versions: 0.94.8, 0.95.2 Reporter: Aleksey Gorshkov Attachments: HBASE-8534-0.94-d.patch, HBASE-8534-0.94-e.patch, HBASE-8534-0.94.patch, HBASE-8534-trunk-a.patch, HBASE-8534-trunk-b.patch, HBASE-8534-trunk-c.patch, HBASE-8534-trunk-d.patch, HBASE-8534-trunk-e.patch, HBASE-8534-trunk.patch fix coverage org.apache.hadoop.hbase.mapreduce patch HBASE-8534-0.94.patch for branch-0.94 patch HBASE-8534-trunk.patch for branch-0.95 and trunk -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8534) fix coverage org.apache.hadoop.hbase.mapreduce
[ https://issues.apache.org/jira/browse/HBASE-8534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Gorshkov updated HBASE-8534: Attachment: (was: HBASE-8534-0.94-e.patch) fix coverage org.apache.hadoop.hbase.mapreduce -- Key: HBASE-8534 URL: https://issues.apache.org/jira/browse/HBASE-8534 Project: HBase Issue Type: Test Affects Versions: 0.94.8, 0.95.2 Reporter: Aleksey Gorshkov Attachments: HBASE-8534-0.94-d.patch, HBASE-8534-0.94.patch, HBASE-8534-trunk-a.patch, HBASE-8534-trunk-b.patch, HBASE-8534-trunk-c.patch, HBASE-8534-trunk-d.patch, HBASE-8534-trunk.patch fix coverage org.apache.hadoop.hbase.mapreduce patch HBASE-8534-0.94.patch for branch-0.94 patch HBASE-8534-trunk.patch for branch-0.95 and trunk -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8534) fix coverage org.apache.hadoop.hbase.mapreduce
[ https://issues.apache.org/jira/browse/HBASE-8534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Gorshkov updated HBASE-8534: Attachment: (was: HBASE-8534-trunk-e.patch) fix coverage org.apache.hadoop.hbase.mapreduce -- Key: HBASE-8534 URL: https://issues.apache.org/jira/browse/HBASE-8534 Project: HBase Issue Type: Test Affects Versions: 0.94.8, 0.95.2 Reporter: Aleksey Gorshkov Attachments: HBASE-8534-0.94-d.patch, HBASE-8534-0.94.patch, HBASE-8534-trunk-a.patch, HBASE-8534-trunk-b.patch, HBASE-8534-trunk-c.patch, HBASE-8534-trunk-d.patch, HBASE-8534-trunk.patch fix coverage org.apache.hadoop.hbase.mapreduce patch HBASE-8534-0.94.patch for branch-0.94 patch HBASE-8534-trunk.patch for branch-0.95 and trunk -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8534) fix coverage org.apache.hadoop.hbase.mapreduce
[ https://issues.apache.org/jira/browse/HBASE-8534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Gorshkov updated HBASE-8534: Attachment: HBASE-8534-trunk-e.patch HBASE-8534-0.94-e.patch fix coverage org.apache.hadoop.hbase.mapreduce -- Key: HBASE-8534 URL: https://issues.apache.org/jira/browse/HBASE-8534 Project: HBase Issue Type: Test Affects Versions: 0.94.8, 0.95.2 Reporter: Aleksey Gorshkov Attachments: HBASE-8534-0.94-d.patch, HBASE-8534-0.94-e.patch, HBASE-8534-0.94.patch, HBASE-8534-trunk-a.patch, HBASE-8534-trunk-b.patch, HBASE-8534-trunk-c.patch, HBASE-8534-trunk-d.patch, HBASE-8534-trunk-e.patch, HBASE-8534-trunk.patch fix coverage org.apache.hadoop.hbase.mapreduce patch HBASE-8534-0.94.patch for branch-0.94 patch HBASE-8534-trunk.patch for branch-0.95 and trunk -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8534) fix coverage org.apache.hadoop.hbase.mapreduce
[ https://issues.apache.org/jira/browse/HBASE-8534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669301#comment-13669301 ] Aleksey Gorshkov commented on HBASE-8534: - Fixed. This patch takes HBASE-8609 and HBASE-8641 into account. patch HBASE-8534-0.94-e.patch for branch-0.94 patch HBASE-8534-trunk-e.patch for branch-0.95 and trunk fix coverage org.apache.hadoop.hbase.mapreduce -- Key: HBASE-8534 URL: https://issues.apache.org/jira/browse/HBASE-8534 Project: HBase Issue Type: Test Affects Versions: 0.94.8, 0.95.2 Reporter: Aleksey Gorshkov Attachments: HBASE-8534-0.94-d.patch, HBASE-8534-0.94-e.patch, HBASE-8534-0.94.patch, HBASE-8534-trunk-a.patch, HBASE-8534-trunk-b.patch, HBASE-8534-trunk-c.patch, HBASE-8534-trunk-d.patch, HBASE-8534-trunk-e.patch, HBASE-8534-trunk.patch fix coverage org.apache.hadoop.hbase.mapreduce patch HBASE-8534-0.94.patch for branch-0.94 patch HBASE-8534-trunk.patch for branch-0.95 and trunk
[jira] [Commented] (HBASE-8534) fix coverage org.apache.hadoop.hbase.mapreduce
[ https://issues.apache.org/jira/browse/HBASE-8534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669335#comment-13669335 ] Hadoop QA commented on HBASE-8534: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12585213/HBASE-8534-trunk-e.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 29 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/5864//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5864//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5864//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5864//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5864//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5864//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5864//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5864//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5864//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/5864//console This message is automatically generated. 
fix coverage org.apache.hadoop.hbase.mapreduce -- Key: HBASE-8534 URL: https://issues.apache.org/jira/browse/HBASE-8534 Project: HBase Issue Type: Test Affects Versions: 0.94.8, 0.95.2 Reporter: Aleksey Gorshkov Attachments: HBASE-8534-0.94-d.patch, HBASE-8534-0.94-e.patch, HBASE-8534-0.94.patch, HBASE-8534-trunk-a.patch, HBASE-8534-trunk-b.patch, HBASE-8534-trunk-c.patch, HBASE-8534-trunk-d.patch, HBASE-8534-trunk-e.patch, HBASE-8534-trunk.patch fix coverage org.apache.hadoop.hbase.mapreduce patch HBASE-8534-0.94.patch for branch-0.94 patch HBASE-8534-trunk.patch for branch-0.95 and trunk -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8632) TestAccessController#testGlobalAuthorizationForNewRegisteredRS fails intermittently due to prolonged region movement
[ https://issues.apache.org/jira/browse/HBASE-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669337#comment-13669337 ] Andrew Purtell commented on HBASE-8632: --- Are you planning to put up a patch Ted? TestAccessController#testGlobalAuthorizationForNewRegisteredRS fails intermittently due to prolonged region movement Key: HBASE-8632 URL: https://issues.apache.org/jira/browse/HBASE-8632 Project: HBase Issue Type: Test Reporter: Ted Yu Here is test output from https://builds.apache.org/job/PreCommit-HBASE-Build/5852/testReport/org.apache.hadoop.hbase.security.access/TestAccessController/testGlobalAuthorizationForNewRegisteredRS/: {code} 2013-05-28 19:54:07,360 INFO [PRI IPC Server handler 0 on 58791] regionserver.HRegionServer(3535): Received close region: 4700bdd89a351b506da910dd6a268d44Transitioning in ZK: yes. Version of ZK closing node:0. Destination server:asf001.sp2.ygridcore.net,36189,1369770847288 2013-05-28 19:54:07,360 DEBUG [RS_CLOSE_REGION-asf001.sp2.ygridcore.net,58791,136977068-1] handler.CloseRegionHandler(125): Processing close of testtable2,,1369770846203.4700bdd89a351b506da910dd6a268d44. 2013-05-28 19:54:07,360 DEBUG [IPC Server handler 2 on 55436] master.AssignmentManager(1672): Sent CLOSE to asf001.sp2.ygridcore.net,58791,136977068 for region testtable2,,1369770846203.4700bdd89a351b506da910dd6a268d44. 2013-05-28 19:54:07,360 DEBUG [pool-1-thread-1] access.TestAccessController(1955): Waiting for region to be opened. Already retried 0 times. 2013-05-28 19:54:07,361 DEBUG [RS_CLOSE_REGION-asf001.sp2.ygridcore.net,58791,136977068-1] regionserver.HRegion(928): Closing testtable2,,1369770846203.4700bdd89a351b506da910dd6a268d44.: disabling compactions flushes 2013-05-28 19:54:07,361 DEBUG [RS_CLOSE_REGION-asf001.sp2.ygridcore.net,58791,136977068-1] regionserver.HRegion(950): Updates disabled for region testtable2,,1369770846203.4700bdd89a351b506da910dd6a268d44. 
2013-05-28 19:54:07,362 INFO [StoreCloserThread-testtable2,,1369770846203.4700bdd89a351b506da910dd6a268d44.-1] regionserver.HStore(663): Closed f1 2013-05-28 19:54:07,362 INFO [RS_CLOSE_REGION-asf001.sp2.ygridcore.net,58791,136977068-1] regionserver.HRegion(1007): Closed testtable2,,1369770846203.4700bdd89a351b506da910dd6a268d44. 2013-05-28 19:54:07,362 INFO [RS_CLOSE_REGION-asf001.sp2.ygridcore.net,58791,136977068-1] regionserver.HRegionServer(4060): Adding moved region record: 4700bdd89a351b506da910dd6a268d44 to asf001.sp2.ygridcore.net,36189,1369770847288:36189 as of 1 2013-05-28 19:54:07,363 DEBUG [RS_CLOSE_REGION-asf001.sp2.ygridcore.net,58791,136977068-1] zookeeper.ZKAssign(786): regionserver:58791-0x13eecb1db960001 Attempting to transition node 4700bdd89a351b506da910dd6a268d44 from M_ZK_REGION_CLOSING to RS_ZK_REGION_CLOSED 2013-05-28 19:54:07,561 DEBUG [pool-1-thread-1] access.TestAccessController(1955): Waiting for region to be opened. Already retried 1 times. 2013-05-28 19:54:07,761 DEBUG [pool-1-thread-1] access.TestAccessController(1955): Waiting for region to be opened. Already retried 2 times. 2013-05-28 19:54:07,961 DEBUG [pool-1-thread-1] access.TestAccessController(1955): Waiting for region to be opened. Already retried 3 times. 2013-05-28 19:54:08,162 DEBUG [pool-1-thread-1] access.TestAccessController(1955): Waiting for region to be opened. Already retried 4 times. 2013-05-28 19:54:08,362 DEBUG [pool-1-thread-1] access.TestAccessController(1955): Waiting for region to be opened. Already retried 5 times. 2013-05-28 19:54:08,562 DEBUG [pool-1-thread-1] access.TestAccessController(1955): Waiting for region to be opened. Already retried 6 times. 2013-05-28 19:54:08,763 DEBUG [pool-1-thread-1] access.TestAccessController(1955): Waiting for region to be opened. Already retried 7 times. 2013-05-28 19:54:08,963 DEBUG [pool-1-thread-1] access.TestAccessController(1955): Waiting for region to be opened. Already retried 8 times. 
2013-05-28 19:54:09,164 INFO [pool-1-thread-1] client.HBaseAdmin$5(834): Started disable of testtable 2013-05-28 19:54:09,320 DEBUG [pool-1-thread-1-EventThread] zookeeper.ZooKeeperWatcher(307): master:55436-0x13eecb1db96 Received ZooKeeper Event, type=NodeDataChanged, state=SyncConnected, path=/hbase/region-in-transition/4700bdd89a351b506da910dd6a268d44 2013-05-28 19:54:09,320 DEBUG [RS_CLOSE_REGION-asf001.sp2.ygridcore.net,58791,136977068-1] zookeeper.ZKAssign(862): regionserver:58791-0x13eecb1db960001 Successfully transitioned node 4700bdd89a351b506da910dd6a268d44 from M_ZK_REGION_CLOSING to RS_ZK_REGION_CLOSED 2013-05-28
[jira] [Commented] (HBASE-8639) very poor performance of htable.getscanner in multithread env.
[ https://issues.apache.org/jira/browse/HBASE-8639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669341#comment-13669341 ] Jean-Marc Spaggiari commented on HBASE-8639: Is there a way to remove the synchronization part, or is it really required? If the access is read-only, it might not be fully required. very poor performance of htable.getscanner in multithread env. -- Key: HBASE-8639 URL: https://issues.apache.org/jira/browse/HBASE-8639 Project: HBase Issue Type: Bug Affects Versions: 0.94.7 Reporter: Raymond Liu Hi, I am running an app on top of Phoenix that forks around 100+ threads to call htable.getscanner(scan) for parallel scans (each scan targets a single region). Each scan matches only a few results and returns, so it should be very fast. In this case, I found that the htable.getscanner(scan) operation itself runs quite slowly. Profiling with jvisualvm showed that 90% of the app's time is spent in org.apache.hadoop.net.DNS.getDefaultHost, which is invoked by ScannerCallable.checkIfRegionServerIsRemote. The root cause is that DNS.getDefaultHost involves synchronized methods in java.net.Inet4AddressImpl, which makes the 100+ threads lock and wait on each other. Each call to DNS.getDefaultHost costs around 30 ms, whereas when a single thread calls DNS.getDefaultHost 100K times, each call costs less than 0.06 ms. After hacking the code to remove the call to checkIfRegionServerIsRemote, my app runs 5 times faster: 50K operations take 200+ seconds instead of 1000+ seconds. Checking the code further, checkIfRegionServerIsRemote seems to be used only for metrics collection (or maybe retry logic?). Could this call be removed or switched to some other implementation, so that cases like mine, which run a large number of small scans from multiple threads, perform much better?
[jira] [Created] (HBASE-8642) [Snapshot] List and delete snapshot by table
Julian Zhou created HBASE-8642: -- Summary: [Snapshot] List and delete snapshot by table Key: HBASE-8642 URL: https://issues.apache.org/jira/browse/HBASE-8642 Project: HBase Issue Type: Improvement Components: snapshots Affects Versions: 0.95.0, 0.98.0, 0.95.1, 0.95.2 Reporter: Julian Zhou Assignee: Julian Zhou Priority: Minor Fix For: 0.98.0, 0.95.1, 0.95.2, 0.95.0 Support list and delete snapshot by table name.
[jira] [Updated] (HBASE-8642) [Snapshot] List and delete snapshot by table
[ https://issues.apache.org/jira/browse/HBASE-8642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julian Zhou updated HBASE-8642: --- Attachment: 8642-trunk-0.95-v0.patch [Snapshot] List and delete snapshot by table Key: HBASE-8642 URL: https://issues.apache.org/jira/browse/HBASE-8642 Project: HBase Issue Type: Improvement Components: snapshots Affects Versions: 0.98.0, 0.95.0, 0.95.1, 0.95.2 Reporter: Julian Zhou Assignee: Julian Zhou Priority: Minor Fix For: 0.98.0, 0.95.0, 0.95.1, 0.95.2 Attachments: 8642-trunk-0.95-v0.patch Support list and delete snapshot by table name.
[jira] [Updated] (HBASE-8642) [Snapshot] List and delete snapshot by table
[ https://issues.apache.org/jira/browse/HBASE-8642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-8642: --- Status: Patch Available (was: Open) [Snapshot] List and delete snapshot by table Key: HBASE-8642 URL: https://issues.apache.org/jira/browse/HBASE-8642 Project: HBase Issue Type: Improvement Components: snapshots Affects Versions: 0.95.0, 0.98.0, 0.95.1, 0.95.2 Reporter: Julian Zhou Assignee: Julian Zhou Priority: Minor Fix For: 0.98.0, 0.95.1, 0.95.2, 0.95.0 Attachments: 8642-trunk-0.95-v0.patch Support list and delete snapshot by table name.
[jira] [Commented] (HBASE-8642) [Snapshot] List and delete snapshot by table
[ https://issues.apache.org/jira/browse/HBASE-8642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669349#comment-13669349 ] Matteo Bertozzi commented on HBASE-8642: Overall looks good to me. Inside deleteSnapshotsByTable() you can use the SnapshotDescription from the for loop instead of building a new one. Do you also want to add shell support? [Snapshot] List and delete snapshot by table Key: HBASE-8642 URL: https://issues.apache.org/jira/browse/HBASE-8642 Project: HBase Issue Type: Improvement Components: snapshots Affects Versions: 0.98.0, 0.95.0, 0.95.1, 0.95.2 Reporter: Julian Zhou Assignee: Julian Zhou Priority: Minor Fix For: 0.98.0, 0.95.0, 0.95.1, 0.95.2 Attachments: 8642-trunk-0.95-v0.patch Support list and delete snapshot by table name.
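The review suggestion on HBASE-8642 (reuse the descriptor that the for loop already holds rather than building a new one) might look roughly like the sketch below. This is only an illustration under stated assumptions, not the code in 8642-trunk-0.95-v0.patch: the `Snapshot` class, the in-memory `snapshots` list, and the signatures of `listSnapshotsByTable` and `deleteSnapshot` are hypothetical stand-ins for `SnapshotDescription` and the real admin API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory stand-in for an admin client; not the HBase API. */
final class SnapshotByTableSketch {

    /** Hypothetical stand-in for SnapshotDescription: a snapshot name plus its table. */
    static final class Snapshot {
        final String name;
        final String table;

        Snapshot(String name, String table) {
            this.name = name;
            this.table = table;
        }
    }

    private final List<Snapshot> snapshots = new ArrayList<Snapshot>();

    void addSnapshot(String name, String table) {
        snapshots.add(new Snapshot(name, table));
    }

    int count() {
        return snapshots.size();
    }

    /** Returns a fresh list of the snapshots that were taken of the given table. */
    List<Snapshot> listSnapshotsByTable(String table) {
        List<Snapshot> matches = new ArrayList<Snapshot>();
        for (Snapshot s : snapshots) {
            if (s.table.equals(table)) {
                matches.add(s);
            }
        }
        return matches;
    }

    /** Deletes one snapshot, identified by the descriptor the caller already holds. */
    void deleteSnapshot(Snapshot snapshot) {
        snapshots.remove(snapshot);
    }

    /** Deletes every snapshot of the table and returns how many were removed. */
    int deleteSnapshotsByTable(String table) {
        int deleted = 0;
        for (Snapshot s : listSnapshotsByTable(table)) {
            // Pass the loop's descriptor straight through instead of
            // rebuilding an equivalent one from s.name and s.table.
            deleteSnapshot(s);
            deleted++;
        }
        return deleted;
    }

    public static void main(String[] args) {
        SnapshotByTableSketch admin = new SnapshotByTableSketch();
        admin.addSnapshot("snap1", "tableA");
        admin.addSnapshot("snap2", "tableA");
        admin.addSnapshot("snap3", "tableB");
        System.out.println("deleted: " + admin.deleteSnapshotsByTable("tableA"));
        System.out.println("remaining: " + admin.count());
    }
}
```

Iterating over the copy returned by `listSnapshotsByTable` keeps the deletion loop from modifying the list it is walking.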
[jira] [Commented] (HBASE-8534) fix coverage org.apache.hadoop.hbase.mapreduce
[ https://issues.apache.org/jira/browse/HBASE-8534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669352#comment-13669352 ] Hadoop QA commented on HBASE-8534: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12585218/HBASE-8534-trunk-e.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 29 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/5865//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5865//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5865//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5865//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5865//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5865//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5865//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5865//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5865//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/5865//console This message is automatically generated. 
[jira] [Assigned] (HBASE-8639) very poor performance of htable.getscanner in multithread env.
[ https://issues.apache.org/jira/browse/HBASE-8639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu reassigned HBASE-8639: - Assignee: Ted Yu very poor performance of htable.getscanner in multithread env. -- Key: HBASE-8639 URL: https://issues.apache.org/jira/browse/HBASE-8639 Project: HBase Issue Type: Bug Affects Versions: 0.94.7 Reporter: Raymond Liu Assignee: Ted Yu
[jira] [Updated] (HBASE-8639) very poor performance of htable.getscanner in multithread env.
[ https://issues.apache.org/jira/browse/HBASE-8639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-8639: -- Status: Patch Available (was: Open) very poor performance of htable.getscanner in multithread env. -- Key: HBASE-8639 URL: https://issues.apache.org/jira/browse/HBASE-8639 Project: HBase Issue Type: Bug Affects Versions: 0.94.7 Reporter: Raymond Liu Assignee: Ted Yu Attachments: 8639-v1.txt
[jira] [Updated] (HBASE-8639) very poor performance of htable.getscanner in multithread env.
[ https://issues.apache.org/jira/browse/HBASE-8639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-8639: -- Attachment: 8639-v1.txt myAddress should be initialized only once. Please comment on patch v1. very poor performance of htable.getscanner in multithread env. -- Key: HBASE-8639 URL: https://issues.apache.org/jira/browse/HBASE-8639 Project: HBase Issue Type: Bug Affects Versions: 0.94.7 Reporter: Raymond Liu Assignee: Ted Yu Attachments: 8639-v1.txt
[jira] [Commented] (HBASE-8634) Fix potential null pointer dereference in HRegionServer and TableLockChecker
[ https://issues.apache.org/jira/browse/HBASE-8634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669389#comment-13669389 ] Ted Yu commented on HBASE-8634: --- Integrated patch v2 to trunk. Thanks for the review, Chunhui. [~stack]: Do you want this in 0.95? Fix potential null pointer dereference in HRegionServer and TableLockChecker Key: HBASE-8634 URL: https://issues.apache.org/jira/browse/HBASE-8634 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Attachments: 8634-v1.txt, 8634-v2.txt From https://builds.apache.org/job/PreCommit-HBASE-Build/5852//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html :
{code}
Possible null pointer dereference of rpcc in org.apache.hadoop.hbase.regionserver.HRegionServer.multi(RpcController, ClientProtos$MultiRequest)
  Bug type NP_NULL_ON_SOME_PATH
  In class org.apache.hadoop.hbase.regionserver.HRegionServer
  In method org.apache.hadoop.hbase.regionserver.HRegionServer.multi(RpcController, ClientProtos$MultiRequest)
  Value loaded from rpcc
  Dereferenced at HRegionServer.java:[line 3176]
  Known null at HRegionServer.java:[line 3174]
...
Possible null pointer dereference of rpcc in org.apache.hadoop.hbase.regionserver.HRegionServer.mutate(RpcController, ClientProtos$MutateRequest)
  Bug type NP_NULL_ON_SOME_PATH
  In class org.apache.hadoop.hbase.regionserver.HRegionServer
  In method org.apache.hadoop.hbase.regionserver.HRegionServer.mutate(RpcController, ClientProtos$MutateRequest)
  Value loaded from rpcc
  Dereferenced at HRegionServer.java:[line 2832]
  Known null at HRegionServer.java:[line 2830]
...
Possible null pointer dereference of rpcc in org.apache.hadoop.hbase.regionserver.HRegionServer.replay(RpcController, ClientProtos$MultiRequest)
  Bug type NP_NULL_ON_SOME_PATH
  In class org.apache.hadoop.hbase.regionserver.HRegionServer
  In method org.apache.hadoop.hbase.regionserver.HRegionServer.replay(RpcController, ClientProtos$MultiRequest)
  Value loaded from rpcc
  Dereferenced at HRegionServer.java:[line 3742]
  Known null at HRegionServer.java:[line 3740]
...
Possible null pointer dereference of data in org.apache.hadoop.hbase.util.hbck.TableLockChecker$1.handleMetadata(byte[])
  Bug type NP_NULL_ON_SOME_PATH
  In class org.apache.hadoop.hbase.util.hbck.TableLockChecker$1
  In method org.apache.hadoop.hbase.util.hbck.TableLockChecker$1.handleMetadata(byte[])
  Value loaded from data
  Dereferenced at TableLockChecker.java:[line 68]
  Known null at TableLockChecker.java:[line 60]
{code}
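For readers unfamiliar with the NP_NULL_ON_SOME_PATH warnings quoted in HBASE-8634, the sketch below shows the general shape of code that tends to trigger it, next to a guarded variant. It is a generic illustration, not the HRegionServer or TableLockChecker source and not the 8634 patch; the `Controller` class and both methods are hypothetical.

```java
/** Generic illustration of the NP_NULL_ON_SOME_PATH shape; not HBase code. */
final class NullOnSomePathSketch {

    /** Hypothetical stand-in for an optional controller object. */
    static final class Controller {
        private final int priority;

        Controller(int priority) {
            this.priority = priority;
        }

        int getPriority() {
            return priority;
        }
    }

    /**
     * Flagged shape: the null check proves that 'c' can be null on one path,
     * yet the dereference below runs on that same path.
     */
    static int flagged(Controller c) {
        if (c == null) {
            System.out.println("no controller supplied");
        }
        return c.getPriority(); // throws NullPointerException when c is null
    }

    /** Guarded shape: the null path returns before any dereference. */
    static int guarded(Controller c, int defaultPriority) {
        if (c == null) {
            return defaultPriority;
        }
        return c.getPriority();
    }

    public static void main(String[] args) {
        System.out.println(guarded(new Controller(3), 0));
        System.out.println(guarded(null, 0));
    }
}
```

In the report, the "Known null at" and "Dereferenced at" line numbers are two lines apart in each HRegionServer case, which matches this check-then-use shape.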
[jira] [Commented] (HBASE-8639) very poor performance of htable.getscanner in multithread env.
[ https://issues.apache.org/jira/browse/HBASE-8639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669394#comment-13669394 ] Jean-Marc Spaggiari commented on HBASE-8639: Sounds good to me. Let's see what Hadoop QA will say. very poor performance of htable.getscanner in multithread env. -- Key: HBASE-8639 URL: https://issues.apache.org/jira/browse/HBASE-8639 Project: HBase Issue Type: Bug Affects Versions: 0.94.7 Reporter: Raymond Liu Assignee: Ted Yu Attachments: 8639-v1.txt
[jira] [Commented] (HBASE-8626) RowMutations fail when Delete and Put on same columnFamily/column/row
[ https://issues.apache.org/jira/browse/HBASE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669400#comment-13669400 ] Ted Yu commented on HBASE-8626: --- In Vinod's example, only one column family was involved. However, we should consider multiple column families if the feature is supported.
{code}
Delete delete1 = new Delete(row);
delete1.deleteFamily(cf1);
// add new columns
Put put1 = new Put(row);
put1.add(cf1, col1, v1);
put1.add(cf1, col2, v2);
Delete delete2 = new Delete(row);
delete2.deleteFamily(cf2);
// add new columns
Put put2 = new Put(row);
put2.add(cf2, col3, v3);
put2.add(cf2, col4, v4);
{code}
In the above case, only two distinct timestamps are needed, t and t+1: the Deletes carry t and the Puts carry t+1.
RowMutations fail when Delete and Put on same columnFamily/column/row - Key: HBASE-8626 URL: https://issues.apache.org/jira/browse/HBASE-8626 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.7, 0.95.0 Environment: Ubuntu 12.04, HBase 0.94.7 Reporter: Vinod Assignee: Ted Yu Fix For: 0.94.7, 0.95.1 Attachments: 8626-v1.txt, TestRowMutations.java, tests_for_row_mutations1.patch When a RowMutations contains a Delete followed by a Put to the same column family, column, or row, only the Delete takes effect and the Put is ignored, so the atomicity of RowMutations is broken in such cases. Attached is a unit test in which the following tests fail: - testDeleteCFThenPutInSameCF: Delete a column family and then Put to same column family. - testDeleteColumnThenPutSameColumn: Delete a column and then Put to same column. - testDeleteRowThenPutSameRow: Delete a row and then Put to same row
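The remark on HBASE-8626 that the Deletes should carry timestamp t and the Puts t+1 can be illustrated with a toy model. The sketch assumes the usual masking rule, namely that a family delete marker hides every cell whose timestamp is less than or equal to the marker's own; the `Cell` class and `visibleAfterDelete` are hypothetical names for illustration, not HBase types and not the 8626 patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of timestamp masking; these classes are hypothetical, not HBase types. */
final class DeleteThenPutSketch {

    /** A value written by a Put, tagged with the timestamp it carries. */
    static final class Cell {
        final String column;
        final String value;
        final long timestamp;

        Cell(String column, String value, long timestamp) {
            this.column = column;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    /**
     * Returns the cells that stay visible after a family-wide delete marker.
     * Assumed rule: the marker hides every cell whose timestamp is less than
     * or equal to the marker's own timestamp.
     */
    static List<Cell> visibleAfterDelete(List<Cell> cells, long deleteTimestamp) {
        List<Cell> visible = new ArrayList<Cell>();
        for (Cell cell : cells) {
            if (cell.timestamp > deleteTimestamp) {
                visible.add(cell);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        long t = 1000L;
        // Puts stamped with the same timestamp as the Delete are masked.
        List<Cell> putsAtT = Arrays.asList(
                new Cell("col1", "v1", t), new Cell("col2", "v2", t));
        // Puts stamped one tick later survive the Delete.
        List<Cell> putsAtTPlus1 = Arrays.asList(
                new Cell("col1", "v1", t + 1), new Cell("col2", "v2", t + 1));
        System.out.println("visible with Puts at t:   " + visibleAfterDelete(putsAtT, t).size());
        System.out.println("visible with Puts at t+1: " + visibleAfterDelete(putsAtTPlus1, t).size());
    }
}
```

Under this model, both Puts vanish when they share the Delete's timestamp and both survive when they carry t+1, which is why two distinct timestamps suffice no matter how many column families are involved.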
[jira] [Commented] (HBASE-8639) very poor performance of htable.getscanner in multithread env.
[ https://issues.apache.org/jira/browse/HBASE-8639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669410#comment-13669410 ] Anoop Sam John commented on HBASE-8639: --- The change makes sense. I checked the class DNS.java in hadoop 2.0.3-alpha, in which getDefaultHost() won't have this overhead for us. JFYI:
{code}
public static String getDefaultHost(String strInterface, String nameserver)
    throws UnknownHostException {
  if ("default".equals(strInterface)) {
    return cachedHostname;
  }
{code}
very poor performance of htable.getscanner in multithread env. -- Key: HBASE-8639 URL: https://issues.apache.org/jira/browse/HBASE-8639 Project: HBase Issue Type: Bug Affects Versions: 0.94.7 Reporter: Raymond Liu Assignee: Ted Yu Attachments: 8639-v1.txt
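The direction discussed for HBASE-8639 (resolve the local address once rather than on every scanner call, much as Hadoop 2.0.3-alpha returns a cached hostname) can be sketched as below. This is a minimal illustration under stated assumptions, not the contents of 8639-v1.txt and not Hadoop's DNS class: the class name, the "localhost" fallback, and the lookup counter are hypothetical. Only `InetAddress.getLocalHost()` and `getHostName()` are real JDK calls.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.concurrent.atomic.AtomicInteger;

/** Resolves the local hostname at most once, however many threads ask for it. */
final class CachedLocalHostSketch {

    /** Counts real resolutions, only to make the caching observable. */
    private static final AtomicInteger LOOKUPS = new AtomicInteger();

    /** Holder idiom: the JVM runs this initializer exactly once, thread-safely. */
    private static final class Holder {
        static final String HOSTNAME = resolve();
    }

    private static String resolve() {
        LOOKUPS.incrementAndGet();
        try {
            return InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            return "localhost"; // assumed fallback, for illustration only
        }
    }

    /** Later calls read a static final field and take no lock. */
    static String get() {
        return Holder.HOSTNAME;
    }

    static int lookupCount() {
        return LOOKUPS.get();
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] workers = new Thread[100];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(new Runnable() {
                @Override
                public void run() {
                    get();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        System.out.println(get() + " resolved " + lookupCount() + " time(s)");
    }
}
```

The trade-off is that a hostname change after startup goes unnoticed, which is usually acceptable when the value only feeds a local-versus-remote check for metrics.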
[jira] [Commented] (HBASE-8439) [replication] Remove ReplicationZookeeper class
[ https://issues.apache.org/jira/browse/HBASE-8439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669428#comment-13669428 ] stack commented on HBASE-8439: -- Is this a blocker on 0.95.x or just 0.98? Thanks. [replication] Remove ReplicationZookeeper class --- Key: HBASE-8439 URL: https://issues.apache.org/jira/browse/HBASE-8439 Project: HBase Issue Type: Improvement Components: Replication Reporter: Chris Trezzo Assignee: Chris Trezzo Priority: Blocker Once all of the logic in ReplicationZookeeper has been refactored into three interfaces (for status, queues, and peers), there is almost no logic in ReplicationZookeeper. It can now be removed and classes that call it should be refactored to call the state interfaces directly.
[jira] [Commented] (HBASE-8642) [Snapshot] List and delete snapshot by table
[ https://issues.apache.org/jira/browse/HBASE-8642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669438#comment-13669438 ] Hadoop QA commented on HBASE-8642: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12585227/8642-trunk-0.95-v0.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/5866//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5866//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5866//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5866//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5866//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5866//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5866//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5866//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5866//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/5866//console This message is automatically generated. [Snapshot] List and delete snapshot by table Key: HBASE-8642 URL: https://issues.apache.org/jira/browse/HBASE-8642 Project: HBase Issue Type: Improvement Components: snapshots Affects Versions: 0.98.0, 0.95.0, 0.95.1, 0.95.2 Reporter: Julian Zhou Assignee: Julian Zhou Priority: Minor Fix For: 0.98.0, 0.95.0, 0.95.1, 0.95.2 Attachments: 8642-trunk-0.95-v0.patch Support list and delete snapshot by table name. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8626) RowMutations fail when Delete and Put on same columnFamily/column/row
[ https://issues.apache.org/jira/browse/HBASE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669455#comment-13669455 ]

Lars Hofhansl commented on HBASE-8626:
--
The semantics of RowMutation are that all edits are applied in one MVCC snapshot and written to a single WALEdit. There are no assumptions about Put/Delete timestamps whatsoever. The client is free to set timestamps as desired. In Vinod's example above, the Puts just need to have a timestamp higher than the Delete. The client can make that so.
{code}
long now = System.currentTimeMillis();
Delete delete = new Delete(row);
delete.deleteFamily(cf1, now);
Put put1 = new Put(row);
put1.add(cf1,col1,now+1);
{code}
Let's not make this more complicated than it has to be. I maintain my -1 on changing this.

RowMutations fail when Delete and Put on same columnFamily/column/row
-
Key: HBASE-8626
URL: https://issues.apache.org/jira/browse/HBASE-8626
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.94.7, 0.95.0
Environment: Ubuntu 12.04, HBase 0.94.7
Reporter: Vinod
Assignee: Ted Yu
Fix For: 0.94.7, 0.95.1
Attachments: 8626-v1.txt, TestRowMutations.java, tests_for_row_mutations1.patch

When RowMutations have a Delete followed by a Put to the same column family, column, or row, only the Delete happens while the Put is ignored, so the atomicity of RowMutations is broken for such cases. Attached is a unit test where the following tests are failing:
- testDeleteCFThenPutInSameCF: Delete a column family and then Put to the same column family.
- testDeleteColumnThenPutSameColumn: Delete a column and then Put to the same column.
- testDeleteRowThenPutSameRow: Delete a row and then Put to the same row.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
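To make the timestamp argument in the comment above concrete, here is a toy model. It is not HBase code, and TimestampMaskingModel and isVisible are hypothetical names. It assumes only the rule that a family delete marker written at timestamp T hides cells in that family whose timestamp is less than or equal to T, which is why a Put at now+1 survives while a Put at now does not.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not HBase code) of how a family delete marker masks cells by timestamp. */
final class TimestampMaskingModel {

    /** A cell is visible unless some family delete marker has a timestamp >= the cell's. */
    static boolean isVisible(long cellTs, List<Long> familyDeleteMarkers) {
        for (long markerTs : familyDeleteMarkers) {
            if (cellTs <= markerTs) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        List<Long> markers = new ArrayList<>();
        markers.add(now); // corresponds to delete.deleteFamily(cf1, now)

        // A Put stamped with the same timestamp as the marker is masked ...
        System.out.println("put at now:     visible = " + isVisible(now, markers));
        // ... while a Put stamped one millisecond later is not.
        System.out.println("put at now + 1: visible = " + isVisible(now + 1, markers));
    }
}
```

The model deliberately ignores MVCC and write ordering; it only shows the timestamp comparison that the client-side workaround relies on.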
[jira] [Updated] (HBASE-8626) RowMutations fail when Delete and Put on same columnFamily/column/row
[ https://issues.apache.org/jira/browse/HBASE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-8626: - Fix Version/s: (was: 0.94.7)
[jira] [Commented] (HBASE-8626) RowMutations fail when Delete and Put on same columnFamily/column/row
[ https://issues.apache.org/jira/browse/HBASE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669464#comment-13669464 ] Ted Yu commented on HBASE-8626: --- What if another client uses the following code where the value of now is the same as the value of now obtained above ? {code} Delete delete = new Delete(row); delete.deleteFamily(cf1, now); Put put2 = new Put(row); put2.add(cf1, col2, now+1); {code} Both put1 and put2 would go through, right ?
[jira] [Commented] (HBASE-8639) very poor performance of htable.getscanner in multithread env.
[ https://issues.apache.org/jira/browse/HBASE-8639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669465#comment-13669465 ] Hadoop QA commented on HBASE-8639: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12585234/8639-v1.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/5867//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5867//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5867//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5867//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5867//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5867//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5867//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5867//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5867//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/5867//console This message is automatically generated.
[jira] [Updated] (HBASE-8629) More log edits: we log too much
[ https://issues.apache.org/jira/browse/HBASE-8629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-8629:
-
Attachment: 8629-shorter-classnames.txt

More log edits: we log too much
---
Key: HBASE-8629
URL: https://issues.apache.org/jira/browse/HBASE-8629
Project: HBase
Issue Type: Umbrella
Reporter: stack
Attachments: 8629-shorter-classnames.txt

+ Our thread names are nice and descriptive and are useful, particularly when running in standalone mode, but it gets silly printing out full thread names on each log line when distributed (we could turn off printing the thread name, but it can be helpful).
+ Do we have to print the fully qualified path for all files every time? Lines get really long and hard to read. Ditto for region names.
+ Can we print out just the class name rather than the full package-qualified class name? For example:
{code}
2013-05-25 12:21:01,912 DEBUG [RS_OPEN_REGION-sss-2.ent.cloudera.com,60020,1369507494038-2] org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: region transitioned to opened in zookeeper: {NAME => 'IntegrationTestDataIngestWithChaosMonkey,c28f5c19,1369509660096.4e91d414f14a75cf367609ce9c4377c7.', STARTKEY => 'c28f5c19', ENDKEY => 'ccbc', ENCODED => 4e91d414f14a75cf367609ce9c4377c7,}, server: sss-2.ent.cloudera.com,60020,1369507494038
{code}
Should the above just be:
{code}
2013-05-25 12:21:01,912 DEBUG [RS_OPEN_REGION-sss-2.ent.cloudera.com,60020,1369507494038-2] OpenRegionHandler: region 4e91d414f14a75cf367609ce9c4377c7 transitioned to opened
{code}
+ Some logging is bound to freak operators. We print out the full stack trace when we are logging failed assignment because of connection refused.
+ Should make sure we tell a decent story when only INFO level is enabled. At the moment it is scattershot.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
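The two shortenings proposed in the issue (class name without package, encoded region name instead of the full region name) can be sketched as plain string operations. This is an illustration only, not the content of 8629-shorter-classnames.txt; LogNames and its methods are hypothetical, and the region-name parsing assumes the "table,startkey,timestamp.encoded." shape seen in the sample log line. In log4j 1.x the class-name part is usually handled in the layout instead, where a PatternLayout conversion of %c{1} prints only the rightmost component of the logger name.

```java
/** Hypothetical helpers that mirror the shortened log output proposed in the issue. */
final class LogNames {

    /** "org.apache...handler.OpenRegionHandler" becomes "OpenRegionHandler". */
    static String simpleName(String fullyQualifiedClassName) {
        int dot = fullyQualifiedClassName.lastIndexOf('.');
        return dot < 0 ? fullyQualifiedClassName : fullyQualifiedClassName.substring(dot + 1);
    }

    /**
     * Extracts the encoded name from a full region name.
     * Assumes the shape "table,startkey,timestamp.encoded." shown in the sample log line.
     */
    static String encodedRegionName(String fullRegionName) {
        String trimmed = fullRegionName.endsWith(".")
                ? fullRegionName.substring(0, fullRegionName.length() - 1)
                : fullRegionName;
        int dot = trimmed.lastIndexOf('.');
        return dot < 0 ? trimmed : trimmed.substring(dot + 1);
    }

    public static void main(String[] args) {
        String cls = "org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler";
        String region = "IntegrationTestDataIngestWithChaosMonkey,c28f5c19,1369509660096."
                + "4e91d414f14a75cf367609ce9c4377c7.";
        System.out.println(simpleName(cls) + ": region " + encodedRegionName(region)
                + " transitioned to opened");
    }
}
```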
[jira] [Commented] (HBASE-8629) More log edits: we log too much
[ https://issues.apache.org/jira/browse/HBASE-8629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669473#comment-13669473 ] stack commented on HBASE-8629: -- I like your suggestions Sergey. Let me take looksee. I could see that we'd log full region name on first opening but then all subsequent logs would be encoded name only? Let me add small patch to do as [~jmspaggi] suggests.
[jira] [Commented] (HBASE-8639) very poor performance of htable.getscanner in multithread env.
[ https://issues.apache.org/jira/browse/HBASE-8639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669480#comment-13669480 ] Lars Hofhansl commented on HBASE-8639: -- Patch looks good to me. We're not supporting movable devices anyway (like a Laptop that change networks, etc). +1 After this one is committed we can investigate Anoop's suggestion further.
[jira] [Commented] (HBASE-8439) [replication] Remove ReplicationZookeeper class
[ https://issues.apache.org/jira/browse/HBASE-8439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669478#comment-13669478 ] Chris Trezzo commented on HBASE-8439: - Just 0.98. Will specify. Thanks!
[jira] [Updated] (HBASE-8439) [replication] Remove ReplicationZookeeper class
[ https://issues.apache.org/jira/browse/HBASE-8439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Trezzo updated HBASE-8439: Fix Version/s: 0.98.0
[jira] [Commented] (HBASE-8639) very poor performance of htable.getscanner in multithread env.
[ https://issues.apache.org/jira/browse/HBASE-8639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669483#comment-13669483 ] Ted Yu commented on HBASE-8639: --- [~stack]: Do you want this fix in 0.95 ?
[jira] [Commented] (HBASE-8629) More log edits: we log too much
[ https://issues.apache.org/jira/browse/HBASE-8629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669485#comment-13669485 ] stack commented on HBASE-8629: -- This is a useful formatting page: http://www.tutorialspoint.com/log4j/log4j_patternlayout.htm While we are in here, we should upgrade? Anyone w/ experience on log4j 2? I do not see it in a repo.
[jira] [Updated] (HBASE-8639) very poor performance of htable.getscanner in multithread env.
[ https://issues.apache.org/jira/browse/HBASE-8639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-8639: -- Attachment: 8639-0.94.txt
[jira] [Updated] (HBASE-8639) very poor performance of htable.getscanner in multithread env.
[ https://issues.apache.org/jira/browse/HBASE-8639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-8639: -- Status: Open (was: Patch Available)
[jira] [Commented] (HBASE-8534) fix coverage org.apache.hadoop.hbase.mapreduce
[ https://issues.apache.org/jira/browse/HBASE-8534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669488#comment-13669488 ] Nick Dimiduk commented on HBASE-8534: - These changes look great. A couple questions - how does the {{LauncherSecurityManager}} behave when multiple tests are run in parallel? - from {{TestLoadIncrementalHFiles}}, was {{verifyAssignedSequenceNumber}} simply never called? from LauncherSecurityManager: {noformat} public LauncherSecurityManager() { reset(); securityManager = System.getSecurityManager(); System.setSecurityManager(this); } {noformat} Construction should not modify system state. I'd rather see each test manage setting and resetting the manager rather than part of it happen in the tests and part in this constructor. From TestTableMapReduceUtil: {noformat} assertEquals( org.apache.hadoop.io.serializer.WritableSerialization, + org.apache.hadoop.hbase.mapreduce.MutationSerialization, + org.apache.hadoop.hbase.mapreduce.ResultSerialization, + org.apache.hadoop.hbase.mapreduce.KeyValueSerialization, job.getConfiguration().get(io.serializations)); {noformat} What happens if the user's environment has other serializers installed, won't this test always fail for them? Instead, these assertEquals should be replaced by asserts over string containment. I'm open to suggestions as for how to assert order of containment. TestHRegionPartitioner doesn't run a job, it only spins up a cluster. I think it can be marked a MediumTest. TestGroupingTableMapper can be marked a SmallTest. TestDriver can be marked a SmallTest. 
fix coverage org.apache.hadoop.hbase.mapreduce -- Key: HBASE-8534 URL: https://issues.apache.org/jira/browse/HBASE-8534 Project: HBase Issue Type: Test Affects Versions: 0.94.8, 0.95.2 Reporter: Aleksey Gorshkov Attachments: HBASE-8534-0.94-d.patch, HBASE-8534-0.94-e.patch, HBASE-8534-0.94.patch, HBASE-8534-trunk-a.patch, HBASE-8534-trunk-b.patch, HBASE-8534-trunk-c.patch, HBASE-8534-trunk-d.patch, HBASE-8534-trunk-e.patch, HBASE-8534-trunk.patch fix coverage org.apache.hadoop.hbase.mapreduce patch HBASE-8534-0.94.patch for branch-0.94 patch HBASE-8534-trunk.patch for branch-0.95 and trunk -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
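On the open question in the HBASE-8534 review about asserting order of containment: one possible approach is to split the comma-separated io.serializations value and check that the expected entries appear as an ordered subsequence, ignoring any extra serializers the environment adds. This is a sketch only; the helper name `OrderedContainment.containsInOrder` is hypothetical and does not come from the patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical test helper: ordered-subsequence check over a comma-separated list. */
final class OrderedContainment {
    private OrderedContainment() {}

    /**
     * Returns true if every expected entry occurs in the comma-separated value,
     * in the same relative order; unrelated entries in between are allowed.
     */
    static boolean containsInOrder(String csv, String... expected) {
        List<String> actual = new ArrayList<>();
        for (String entry : csv.split(",")) {
            actual.add(entry.trim());
        }
        int from = 0; // search position: only look to the right of the last match
        for (String want : expected) {
            int idx = -1;
            for (int i = from; i < actual.size(); i++) {
                if (actual.get(i).equals(want)) {
                    idx = i;
                    break;
                }
            }
            if (idx < 0) {
                return false; // missing, or present only before an earlier match
            }
            from = idx + 1;
        }
        return true;
    }
}
```

A test could then use something like `assertTrue(OrderedContainment.containsInOrder(job.getConfiguration().get("io.serializations"), ...))` with the four serializer class names, which would keep passing when other serializers are installed.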
[jira] [Commented] (HBASE-8639) very poor performance of htable.getscanner in multithread env.
[ https://issues.apache.org/jira/browse/HBASE-8639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669492#comment-13669492 ] stack commented on HBASE-8639: -- [~ted_yu] Yes please. Nice fix. very poor performance of htable.getscanner in multithread env. -- Key: HBASE-8639 URL: https://issues.apache.org/jira/browse/HBASE-8639 Project: HBase Issue Type: Bug Affects Versions: 0.94.7 Reporter: Raymond Liu Assignee: Ted Yu Attachments: 8639-0.94.txt, 8639-v1.txt
[jira] [Commented] (HBASE-8629) More log edits: we log too much
[ https://issues.apache.org/jira/browse/HBASE-8629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669489#comment-13669489 ]

Sergey Shelukhin commented on HBASE-8629:
-
+1 on current patch assuming it works; full classnames are not useful.

More log edits: we log too much
---
Key: HBASE-8629
URL: https://issues.apache.org/jira/browse/HBASE-8629
Project: HBase
Issue Type: Umbrella
Reporter: stack
Attachments: 8629-shorter-classnames.txt

+ Our thread names are nice and descriptive and are useful, particularly when running standalone mode, but it gets silly printing out full thread names on each log line when distributed (we could turn off printing the thread name, but it can be helpful).
+ Do we have to print the fully qualified path for all files every time? Lines get really long and hard to read. Ditto for region names.
+ Can we print out just the class name rather than the full package-qualified class name? For example:
{code}
2013-05-25 12:21:01,912 DEBUG [RS_OPEN_REGION-sss-2.ent.cloudera.com,60020,1369507494038-2] org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: region transitioned to opened in zookeeper: {NAME => 'IntegrationTestDataIngestWithChaosMonkey,c28f5c19,1369509660096.4e91d414f14a75cf367609ce9c4377c7.', STARTKEY => 'c28f5c19', ENDKEY => 'ccbc', ENCODED => 4e91d414f14a75cf367609ce9c4377c7,}, server: sss-2.ent.cloudera.com,60020,1369507494038
{code}
Should the above just be:
{code}
2013-05-25 12:21:01,912 DEBUG [RS_OPEN_REGION-sss-2.ent.cloudera.com,60020,1369507494038-2] OpenRegionHandler: region 4e91d414f14a75cf367609ce9c4377c7 transitioned to opened
{code}
+ Some logging is bound to freak out operators. We print out the full stack trace when we are logging a failed assignment because of connection refused.
+ Should make sure we tell a decent story when only INFO level is enabled. At the moment it is scattershot.
[jira] [Commented] (HBASE-8639) very poor performance of htable.getscanner in multithread env.
[ https://issues.apache.org/jira/browse/HBASE-8639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669495#comment-13669495 ] Sergey Shelukhin commented on HBASE-8639: - +1 very poor performance of htable.getscanner in multithread env. -- Key: HBASE-8639 URL: https://issues.apache.org/jira/browse/HBASE-8639 Project: HBase Issue Type: Bug Affects Versions: 0.94.7 Reporter: Raymond Liu Assignee: Ted Yu Attachments: 8639-0.94.txt, 8639-v1.txt
[jira] [Updated] (HBASE-5083) Backport from trunk: Backup HMaster should have http infoport open with link to the active master
[ https://issues.apache.org/jira/browse/HBASE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Marcel updated HBASE-5083: --- Attachment: HBASE-5083.patch Added a test fix to the patch. Backport from trunk: Backup HMaster should have http infoport open with link to the active master - Key: HBASE-5083 URL: https://issues.apache.org/jira/browse/HBASE-5083 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Cody Marcel Fix For: 0.94.9 Attachments: backup_master.png, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, master.png Without ssh'ing and jps/ps'ing, it is difficult to see if a backup hmaster is up. It seems like it would be good for a backup hmaster to have a basic web page up on the info port so that users could see that it is up. Also it should probably either provide a link to the active master or automatically forward to the active master. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8643) Do not log full classnames in logs, just the last two levels
[ https://issues.apache.org/jira/browse/HBASE-8643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-8643: - Attachment: 8629-shorter-classnames.txt Here is patch that Sergey +1's up on parent issue. Do not log full classnames in logs, just the last two levels Key: HBASE-8643 URL: https://issues.apache.org/jira/browse/HBASE-8643 Project: HBase Issue Type: Sub-task Reporter: stack Assignee: stack Attachments: 8629-shorter-classnames.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8629) More log edits: we log too much
[ https://issues.apache.org/jira/browse/HBASE-8629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669498#comment-13669498 ]

Matteo Bertozzi commented on HBASE-8629:

Will the patch print out something like regionserver.HRegion, wal.FSHLog and similar? I'd like to have a half-full path like hbase.xyz.className, to be able to grep just for hbase. logs... No one else has mixed logs?

More log edits: we log too much
---
Key: HBASE-8629 URL: https://issues.apache.org/jira/browse/HBASE-8629 Project: HBase Issue Type: Umbrella Reporter: stack Attachments: 8629-shorter-classnames.txt
[jira] [Created] (HBASE-8643) Do not log full classnames in logs, just the last two levels
stack created HBASE-8643: Summary: Do not log full classnames in logs, just the last two levels Key: HBASE-8643 URL: https://issues.apache.org/jira/browse/HBASE-8643 Project: HBase Issue Type: Improvement Reporter: stack Assignee: stack -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8638) add logging to compaction policy
[ https://issues.apache.org/jira/browse/HBASE-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669499#comment-13669499 ] Sergey Shelukhin commented on HBASE-8638: - it is an error condition, so should be ok to log info... add logging to compaction policy Key: HBASE-8638 URL: https://issues.apache.org/jira/browse/HBASE-8638 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Trivial Attachments: HBASE-8638-v0.patch We are seeing some strange patterns with current compaction policy in some contexts (with normal writes, no bulk load). It seems like some logging is needed to understand what is going on, similar to old default policy -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8643) Do not log full classnames in logs, just the last two levels
[ https://issues.apache.org/jira/browse/HBASE-8643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-8643: - Issue Type: Sub-task (was: Improvement) Parent: HBASE-8629 Do not log full classnames in logs, just the last two levels Key: HBASE-8643 URL: https://issues.apache.org/jira/browse/HBASE-8643 Project: HBase Issue Type: Sub-task Reporter: stack Assignee: stack -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-8643) Do not log full classnames in logs, just the last two levels
[ https://issues.apache.org/jira/browse/HBASE-8643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-8643. -- Resolution: Fixed Fix Version/s: 0.95.1 0.98.0 Hadoop Flags: Reviewed Committed to trunk and 0.95. Do not log full classnames in logs, just the last two levels Key: HBASE-8643 URL: https://issues.apache.org/jira/browse/HBASE-8643 Project: HBase Issue Type: Sub-task Reporter: stack Assignee: stack Fix For: 0.98.0, 0.95.1 Attachments: 8629-shorter-classnames.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-8639) very poor performance of htable#getscanner in multithreaded environment
[ https://issues.apache.org/jira/browse/HBASE-8639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-8639: -- Summary: very poor performance of htable#getscanner in multithreaded environment (was: very poor performance of htable.getscanner in multithread env.) very poor performance of htable#getscanner in multithreaded environment --- Key: HBASE-8639 URL: https://issues.apache.org/jira/browse/HBASE-8639 Project: HBase Issue Type: Bug Affects Versions: 0.94.7 Reporter: Raymond Liu Assignee: Ted Yu Attachments: 8639-0.94.txt, 8639-v1.txt
[jira] [Commented] (HBASE-8629) More log edits: we log too much
[ https://issues.apache.org/jira/browse/HBASE-8629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669508#comment-13669508 ] stack commented on HBASE-8629: -- Ooops. I committed patch over on hbase-8643. Do we have much of anything but hbase in the logs matteo? Let me try w/ level 3 (I can make an amendment to what I committed). More log edits: we log too much --- Key: HBASE-8629 URL: https://issues.apache.org/jira/browse/HBASE-8629 Project: HBase Issue Type: Umbrella Reporter: stack Attachments: 8629-shorter-classnames.txt
[jira] [Commented] (HBASE-8639) very poor performance of htable#getscanner in multithreaded environment
[ https://issues.apache.org/jira/browse/HBASE-8639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669506#comment-13669506 ] Ted Yu commented on HBASE-8639: --- Integrated to 0.95 and trunk. Thanks for the reviews, Stack, Jean-Marc, Anoop, Lars and Sergey. Waiting for 0.94 test suite to complete locally. very poor performance of htable#getscanner in multithreaded environment --- Key: HBASE-8639 URL: https://issues.apache.org/jira/browse/HBASE-8639 Project: HBase Issue Type: Bug Affects Versions: 0.94.7 Reporter: Raymond Liu Assignee: Ted Yu Attachments: 8639-0.94.txt, 8639-v1.txt
[jira] [Commented] (HBASE-8629) More log edits: we log too much
[ https://issues.apache.org/jira/browse/HBASE-8629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669517#comment-13669517 ]

stack commented on HBASE-8629:
--
I tried it. Sometimes it is good as here:
{code}
2013-05-29 11:01:38,007 DEBUG [RegionServer:0;192.168.67.83,56595,1369850491424-EventThread] hbase.regionserver.SplitLogWorker: tasks arrived or departed
or
2013-05-29 11:01:30,580 INFO [main] hbase.util.VersionInfo: HBase 0.97.0-SNAPSHOT
or
2013-05-29 11:01:30,713 INFO [main] zookeeper.server.ZooKeeperServer: Server environment:java.version=1.6.0_43
or
2013-05-29 11:01:31,310 INFO [IPC Server handler 29 on 56593] hadoop.ipc.RpcServer: IPC Server handler 29 on 56593: starting
{code}
but it breaks down here.
{code}
2013-05-29 11:01:31,763 DEBUG [Master:0;192.168.67.83,56593,1369850491032] master.cleaner.CleanerChore: initialize cleaner=org.apache.hadoop.hbase.master.cleaner.TimeToLiveHFileCleaner
or
2013-05-29 11:01:32,259 INFO [RegionServer:0;192.168.67.83,56595,1369850491424] regionserver.wal.FSHLog: New WAL /var/folders/bp/2z1cykc92rs6j24251cg__phgp/T/hbase-stack/hbase/.logs/192.168.67.8#
{code}
when hbase classes are deep down in a package. You still want three levels [~mbertozzi]?

More log edits: we log too much
---
Key: HBASE-8629 URL: https://issues.apache.org/jira/browse/HBASE-8629 Project: HBase Issue Type: Umbrella Reporter: stack Attachments: 8629-shorter-classnames.txt
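The two-level versus three-level trade-off discussed in this thread can be reproduced with a small helper that keeps only the rightmost components of a class name. This is a sketch for illustration; `LoggerNames.lastLevels` is a hypothetical name, and it only mimics the effect of a logger-name precision setting (for example a log4j conversion pattern such as `%c{2}`, assuming that is what the committed patch changes, which the thread does not spell out).

```java
/** Hypothetical helper showing what "last N levels" of a class name looks like. */
final class LoggerNames {
    private LoggerNames() {}

    /**
     * Keeps the rightmost {@code levels} dot-separated components of a class name.
     * Names with fewer components are returned unchanged.
     */
    static String lastLevels(String className, int levels) {
        if (levels <= 0) {
            return className;
        }
        int idx = className.length();
        for (int i = 0; i < levels; i++) {
            // Walk left one '.' per requested level.
            idx = className.lastIndexOf('.', idx - 1);
            if (idx < 0) {
                return className; // fewer components than requested
            }
        }
        return className.substring(idx + 1);
    }
}
```

With two levels, `org.apache.hadoop.hbase.regionserver.HRegion` becomes `regionserver.HRegion`. With three levels, `org.apache.hadoop.hbase.regionserver.wal.FSHLog` becomes `regionserver.wal.FSHLog`, which is the case above where the `hbase.` prefix is lost for classes deeper in the package tree.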
[jira] [Commented] (HBASE-5083) Backport from trunk: Backup HMaster should have http infoport open with link to the active master
[ https://issues.apache.org/jira/browse/HBASE-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669513#comment-13669513 ] Hadoop QA commented on HBASE-5083: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12585259/HBASE-5083.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/5868//console This message is automatically generated. Backport from trunk: Backup HMaster should have http infoport open with link to the active master - Key: HBASE-5083 URL: https://issues.apache.org/jira/browse/HBASE-5083 Project: HBase Issue Type: Improvement Components: master Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Assignee: Cody Marcel Fix For: 0.94.9 Attachments: backup_master.png, HBASE-5083.patch, HBASE-5083.patch, HBASE-5083.patch, master.png Without ssh'ing and jps/ps'ing, it is difficult to see if a backup hmaster is up. It seems like it would be good for a backup hmaster to have a basic web page up on the info port so that users could see that it is up. Also it should probably either provide a link to the active master or automatically forward to the active master. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8629) More log edits: we log too much
[ https://issues.apache.org/jira/browse/HBASE-8629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669519#comment-13669519 ] Matteo Bertozzi commented on HBASE-8629: [~saint@gmail.com] I see stuff from hbase, hdfs, and zookeeper in my logs... so that's why I like to grep for hbase. More log edits: we log too much --- Key: HBASE-8629 URL: https://issues.apache.org/jira/browse/HBASE-8629 Project: HBase Issue Type: Umbrella Reporter: stack Attachments: 8629-shorter-classnames.txt
[jira] [Commented] (HBASE-8631) Meta Region First Recovery
[ https://issues.apache.org/jira/browse/HBASE-8631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669526#comment-13669526 ]

Jeffrey Zhong commented on HBASE-8631:
--
Could someone else please review the patch? The patch solves the issue, seen in integration tests on distributedLogReplay, that recovery may sometimes get stuck because meta isn't available. Thanks.

Meta Region First Recovery
--
Key: HBASE-8631
URL: https://issues.apache.org/jira/browse/HBASE-8631
Project: HBase
Issue Type: Bug
Components: MTTR
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
Attachments: hbase-8631.patch, hbase-8631-v2.patch

We have a separate WAL for the meta region, but the log splitting logic doesn't take advantage of this, and SplitLogWorker still picks a WAL file randomly. Imagine that multiple region servers, including the meta RS, fail at about the same time and the meta WAL is recovered last: all failed regions have to wait until meta is recovered before they can come online again. This JIRA is to let SplitLogWorker pick a meta WAL file first and then the others.
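The meta-first idea in HBASE-8631 amounts to ordering the pending split tasks so that meta WAL files are picked before any others. The sketch below shows that ordering in isolation and is not the attached patch: the class `MetaFirstOrdering` is hypothetical, and identifying a meta WAL by a `.meta` file-name suffix is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch: order WAL split tasks so meta WAL files come first. */
final class MetaFirstOrdering {
    // Assumption for this example: meta WAL file names end with ".meta".
    static final String META_SUFFIX = ".meta";

    private MetaFirstOrdering() {}

    /**
     * Returns a copy of the task list with meta WAL files moved to the front.
     * The sort is stable, so the relative order within each group is preserved.
     */
    static List<String> metaFirst(List<String> walTasks) {
        List<String> ordered = new ArrayList<>(walTasks);
        // Key is false for meta files and true for the rest; false sorts first.
        ordered.sort(Comparator.comparing((String name) -> !name.endsWith(META_SUFFIX)));
        return ordered;
    }
}
```

A worker that takes the first entry of `metaFirst(tasks)` instead of a random one would recover meta before the regions that depend on it.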
[jira] [Commented] (HBASE-8642) [Snapshot] List and delete snapshot by table
[ https://issues.apache.org/jira/browse/HBASE-8642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669532#comment-13669532 ]

Jesse Yates commented on HBASE-8642:

nit:
bq. Delete existing snapshots of the given table.
to
bq. Delete all existing snapshots of a table
Just want to call out more pointedly that it deletes all of them. +1, even without the above.

[Snapshot] List and delete snapshot by table

Key: HBASE-8642
URL: https://issues.apache.org/jira/browse/HBASE-8642
Project: HBase
Issue Type: Improvement
Components: snapshots
Affects Versions: 0.98.0, 0.95.0, 0.95.1, 0.95.2
Reporter: Julian Zhou
Assignee: Julian Zhou
Priority: Minor
Fix For: 0.98.0, 0.95.0, 0.95.1, 0.95.2
Attachments: 8642-trunk-0.95-v0.patch

Support list and delete snapshot by table name.
[jira] [Commented] (HBASE-8636) Backport KeyValue Codec to 0.94 (HBASE-7413)
[ https://issues.apache.org/jira/browse/HBASE-8636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669556#comment-13669556 ]

Jesse Yates commented on HBASE-8636:

[~sershe] I had the same concerns while working through the patch. It seems to work fine, except for managing replication logs, which set the compression context on the HLog.Entry after it has been parsed. I wanted to avoid having to create an extra WALEditCodec in the ReplicationManger or modifying objects too many references deep... But then again, it might make it a little cleaner. Let me give it a shot.

Don't think we need a test here: we have a pretty comprehensive suite around the WAL reading/writing, so if that all works, we should be fine.

Backport KeyValue Codec to 0.94 (HBASE-7413)

Key: HBASE-8636
URL: https://issues.apache.org/jira/browse/HBASE-8636
Project: HBase
Issue Type: Improvement
Components: wal
Affects Versions: 0.94.9
Reporter: Jesse Yates
Assignee: Jesse Yates
Priority: Minor
Fix For: 0.94.9
Attachments: HBASE-8636-0.94.patch

Backport the KeyValueCodec/WALEditCodec stuff from trunk to 0.94 to support a more pluggable KeyValue serialization. This doesn't include all the Cell code added to trunk, but rather just provides a small layer of indirection for the serialization of a KeyValue in the WAL.
[jira] [Updated] (HBASE-8636) Backport KeyValue Codec to 0.94 (HBASE-7413)
[ https://issues.apache.org/jira/browse/HBASE-8636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesse Yates updated HBASE-8636: Attachment: HBASE-8636-0.94-v2.patch Attaching updated diff - dropping Entry.setCompressionContext and just managing the compression for the WALEdit in the codec. Backport KeyValue Codec to 0.94 (HBASE-7413) Key: HBASE-8636 URL: https://issues.apache.org/jira/browse/HBASE-8636 Project: HBase Issue Type: Improvement Components: wal Affects Versions: 0.94.9 Reporter: Jesse Yates Assignee: Jesse Yates Priority: Minor Fix For: 0.94.9 Attachments: HBASE-8636-0.94.patch, HBASE-8636-0.94-v2.patch
[jira] [Commented] (HBASE-8634) Fix potential null pointer dereference in HRegionServer and TableLockChecker
[ https://issues.apache.org/jira/browse/HBASE-8634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669572#comment-13669572 ] Hudson commented on HBASE-8634: Integrated in HBase-TRUNK #4147 (See [https://builds.apache.org/job/HBase-TRUNK/4147/]) HBASE-8634 Fix potential null pointer dereference in HRegionServer and TableLockChecker (Ted Yu) (Revision 1487533) Result = SUCCESS tedyu : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/hbck/TableLockChecker.java Fix potential null pointer dereference in HRegionServer and TableLockChecker Key: HBASE-8634 URL: https://issues.apache.org/jira/browse/HBASE-8634 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Attachments: 8634-v1.txt, 8634-v2.txt From https://builds.apache.org/job/PreCommit-HBASE-Build/5852//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html :
{code}
Possible null pointer dereference of rpcc in org.apache.hadoop.hbase.regionserver.HRegionServer.multi(RpcController, ClientProtos$MultiRequest)
Bug type NP_NULL_ON_SOME_PATH
In class org.apache.hadoop.hbase.regionserver.HRegionServer
In method org.apache.hadoop.hbase.regionserver.HRegionServer.multi(RpcController, ClientProtos$MultiRequest)
Value loaded from rpcc
Dereferenced at HRegionServer.java:[line 3176]
Known null at HRegionServer.java:[line 3174]
...
Possible null pointer dereference of rpcc in org.apache.hadoop.hbase.regionserver.HRegionServer.mutate(RpcController, ClientProtos$MutateRequest)
Bug type NP_NULL_ON_SOME_PATH
In class org.apache.hadoop.hbase.regionserver.HRegionServer
In method org.apache.hadoop.hbase.regionserver.HRegionServer.mutate(RpcController, ClientProtos$MutateRequest)
Value loaded from rpcc
Dereferenced at HRegionServer.java:[line 2832]
Known null at HRegionServer.java:[line 2830]
...
Possible null pointer dereference of rpcc in org.apache.hadoop.hbase.regionserver.HRegionServer.replay(RpcController, ClientProtos$MultiRequest)
Bug type NP_NULL_ON_SOME_PATH
In class org.apache.hadoop.hbase.regionserver.HRegionServer
In method org.apache.hadoop.hbase.regionserver.HRegionServer.replay(RpcController, ClientProtos$MultiRequest)
Value loaded from rpcc
Dereferenced at HRegionServer.java:[line 3742]
Known null at HRegionServer.java:[line 3740]
...
Possible null pointer dereference of data in org.apache.hadoop.hbase.util.hbck.TableLockChecker$1.handleMetadata(byte[])
Bug type NP_NULL_ON_SOME_PATH
In class org.apache.hadoop.hbase.util.hbck.TableLockChecker$1
In method org.apache.hadoop.hbase.util.hbck.TableLockChecker$1.handleMetadata(byte[])
Value loaded from data
Dereferenced at TableLockChecker.java:[line 68]
Known null at TableLockChecker.java:[line 60]
{code}
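For readers unfamiliar with the NP_NULL_ON_SOME_PATH warnings quoted above: Findbugs raises this warning when a reference is known to be null on some branch and is then dereferenced without a guard a few lines later. The sketch below shows that general shape and one way to guard it. The Controller class and the takePayload method are hypothetical stand-ins made up for illustration; they are not the HRegionServer code, and the attached 8634 patches may well fix the warnings differently.

```java
// Hypothetical illustration of the NP_NULL_ON_SOME_PATH shape; not HBase code.
final class NullGuard {
    private NullGuard() {}

    // Stand-in for an RPC controller that may carry a payload.
    static final class Controller {
        private String payload;

        Controller(String payload) {
            this.payload = payload;
        }

        String payload() {
            return payload;
        }

        void clearPayload() {
            payload = null;
        }
    }

    // The flagged shape would be:
    //   String p = (c != null) ? c.payload() : null;  // c known null on one path
    //   c.clearPayload();                             // dereferenced unconditionally
    // The guarded version below only touches c when it is non-null.
    static String takePayload(Controller c) {
        String p = (c != null) ? c.payload() : null;
        if (c != null) {
            c.clearPayload();
        }
        return p;
    }
}
```

Guarding the second use keeps the behaviour for non-null controllers unchanged and turns the null case into a plain null result rather than an exception.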
[jira] [Updated] (HBASE-8634) Fix potential null pointer dereference in HRegionServer and TableLockChecker
[ https://issues.apache.org/jira/browse/HBASE-8634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-8634: Resolution: Fixed Fix Version/s: 0.98.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Fix potential null pointer dereference in HRegionServer and TableLockChecker Key: HBASE-8634 URL: https://issues.apache.org/jira/browse/HBASE-8634 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Fix For: 0.98.0 Attachments: 8634-v1.txt, 8634-v2.txt
[jira] [Commented] (HBASE-8629) More log edits: we log too much
[ https://issues.apache.org/jira/browse/HBASE-8629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669589#comment-13669589 ] stack commented on HBASE-8629: Grepping for 'hbase.*' won't get you all of the hbase logs if we log only the last three subparts of the class full package name. If we go to four parts, we might as well just keep the full class package name -- there will be little savings. Here is sample w/ three parts of the fully-qualified package name showing:
{code}
2013-05-29 11:01:31,752 DEBUG [Master:0;192.168.67.83,56593,1369850491032] hbase.executor.ExecutorService: Starting executor service name=MASTER_OPEN_REGION-192.168.67.83,56593,1369850491032, corePoo#
2013-05-29 11:01:31,752 DEBUG [Master:0;192.168.67.83,56593,1369850491032] hbase.executor.ExecutorService: Starting executor service name=MASTER_CLOSE_REGION-192.168.67.83,56593,1369850491032, corePo#
2013-05-29 11:01:31,752 DEBUG [Master:0;192.168.67.83,56593,1369850491032] hbase.executor.ExecutorService: Starting executor service name=MASTER_SERVER_OPERATIONS-192.168.67.83,56593,1369850491032, c#
2013-05-29 11:01:31,752 DEBUG [Master:0;192.168.67.83,56593,1369850491032] hbase.executor.ExecutorService: Starting executor service name=MASTER_META_SERVER_OPERATIONS-192.168.67.83,56593,13698504910#
2013-05-29 11:01:31,752 DEBUG [Master:0;192.168.67.83,56593,1369850491032] hbase.executor.ExecutorService: Starting executor service name=MASTER_TABLE_OPERATIONS-192.168.67.83,56593,1369850491032, co#
2013-05-29 11:01:31,754 DEBUG [Master:0;192.168.67.83,56593,1369850491032] master.cleaner.CleanerChore: initialize cleaner=org.apache.hadoop.hbase.master.cleaner.TimeToLiveLogCleaner
2013-05-29 11:01:31,758 DEBUG [Master:0;192.168.67.83,56593,1369850491032] master.cleaner.CleanerChore: initialize cleaner=org.apache.hadoop.hbase.master.snapshot.SnapshotLogCleaner
2013-05-29 11:01:31,761 DEBUG [Master:0;192.168.67.83,56593,1369850491032] master.cleaner.CleanerChore: initialize cleaner=org.apache.hadoop.hbase.master.cleaner.HFileLinkCleaner
2013-05-29 11:01:31,762 DEBUG [Master:0;192.168.67.83,56593,1369850491032] master.cleaner.CleanerChore: initialize cleaner=org.apache.hadoop.hbase.master.snapshot.SnapshotHFileCleaner
2013-05-29 11:01:31,763 DEBUG [Master:0;192.168.67.83,56593,1369850491032] master.cleaner.CleanerChore: initialize cleaner=org.apache.hadoop.hbase.master.cleaner.TimeToLiveHFileCleaner
2013-05-29 11:01:31,810 INFO [Master:0;192.168.67.83,56593,1369850491032] org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2013-05-29 11:01:31,867 INFO [Master:0;192.168.67.83,56593,1369850491032] hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
2013-05-29 11:01:31,875 INFO [Master:0;192.168.67.83,56593,1369850491032] hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listen#
2013-05-29 11:01:31,877 INFO [Master:0;192.168.67.83,56593,1369850491032] hadoop.http.HttpServer: listener.getLocalPort() returned 60010 webServer.getConnectors()[0].getLocalPort() returned 60010
2013-05-29 11:01:31,877 INFO [Master:0;192.168.67.83,56593,1369850491032] hadoop.http.HttpServer: Jetty bound to port 60010
2013-05-29 11:01:31,877 INFO [Master:0;192.168.67.83,56593,1369850491032] org.mortbay.log: jetty-6.1.26
2013-05-29 11:01:32,203 INFO [Master:0;192.168.67.83,56593,1369850491032] org.mortbay.log: Started SelectChannelConnector@0.0.0.0:60010
2013-05-29 11:01:32,203 INFO [Master:0;192.168.67.83,56593,1369850491032] hbase.master.ServerManager: Waiting for region servers count to settle; currently checked in 0, slept for 0 ms, expecting mi#
2013-05-29 11:01:32,228 INFO [IPC Server handler 0 on 56593] hbase.master.ServerManager: Registering server=192.168.67.83,56595,1369850491424
2013-05-29 11:01:32,234 DEBUG [RegionServer:0;192.168.67.83,56595,1369850491424] hbase.regionserver.HRegionServer: Config from master: hbase.rootdir=file:///var/folders/bp/2z1cykc92rs6j24251cg__ph000#
2013-05-29 11:01:32,234 DEBUG [RegionServer:0;192.168.67.83,56595,1369850491424] hbase.regionserver.HRegionServer: Config from master: fs.default.name=file:/
2013-05-29 11:01:32,234 INFO [RegionServer:0;192.168.67.83,56595,1369850491424] hbase.regionserver.HRegionServer: Master passed us a different hostname to use; was=192.168.67.83, but now=192.168.67.#
2013-05-29 11:01:32,239 DEBUG [RegionServer:0;192.168.67.83,56595,1369850491424] hbase.fs.HFileSystem: The file system is not a DistributedFileSystem. Skipping on block location reordering
2013-05-29 11:01:32,240 DEBUG [RegionServer:0;192.168.67.83,56595,1369850491424] hbase.regionserver.HRegionServer:
[jira] [Updated] (HBASE-8639) very poor performance of htable#getscanner in multithreaded environment
[ https://issues.apache.org/jira/browse/HBASE-8639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-8639: Fix Version/s: 0.94.9 0.95.2 0.98.0 Hadoop Flags: Reviewed Integrated to 0.94 as well. very poor performance of htable#getscanner in multithreaded environment Key: HBASE-8639 URL: https://issues.apache.org/jira/browse/HBASE-8639 Project: HBase Issue Type: Bug Affects Versions: 0.94.7 Reporter: Raymond Liu Assignee: Ted Yu Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: 8639-0.94.txt, 8639-v1.txt Hi, I am running an app on top of Phoenix which forks around 100+ threads to call htable.getscanner(scan) to do parallel scans (each scan actually targets one region). Each scan matches only a few results and returns, so it should be very fast. In this case, I found that the htable.getscanner(scan) op itself runs pretty slowly. Profiling with jvisualvm, I found that 90% of app time is spent in org.apache.hadoop.net.DNS.getDefaultHost, which is invoked by ScannerCallable.checkIfRegionServerIsRemote. The root cause is that DNS.getDefaultHost involves synchronized methods in java.net.Inet4AddressImpl, which makes the 100+ threads lock and wait on each other. Each call to DNS.getDefaultHost costs around 30 ms, while in another case, where I run a single thread that calls DNS.getDefaultHost 100K times, each call costs less than 0.06 ms. By hacking the code to remove the call to checkIfRegionServerIsRemote, my app runs 5 times faster: 50K ops in my app cost 200+ seconds instead of 1000+ seconds. Checking the code further, I found that checkIfRegionServerIsRemote seems to be used only for metrics collection (or maybe retry logic?). I am wondering whether this could be removed or switched to some other implementation, so that cases like mine, which run a large number of small scans with multiple threads, could perform much better?
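One common way to avoid the kind of contention described in this report is to resolve the local hostname once and reuse the cached value, so that concurrent callers no longer queue up on the synchronized resolver. The following is a minimal sketch of that idea under stated assumptions: CachedLookup and localHostName are hypothetical names, the JDK call InetAddress.getLocalHost() stands in for Hadoop's DNS.getDefaultHost, and this is not necessarily what the attached 8639 patches do.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.function.Supplier;

// Hypothetical helper: runs the (possibly slow, lock-heavy) delegate at most once
// and serves the remembered value to every later caller.
final class CachedLookup<T> implements Supplier<T> {
    private final Supplier<T> delegate;
    private volatile T value;

    CachedLookup(Supplier<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public T get() {
        T v = value;
        if (v == null) {
            synchronized (this) {
                v = value;
                if (v == null) {
                    v = delegate.get();
                    value = v;
                }
            }
        }
        return v;
    }

    // Assumption: the JDK lookup is used here in place of DNS.getDefaultHost.
    static CachedLookup<String> localHostName() {
        return new CachedLookup<>(() -> {
            try {
                return InetAddress.getLocalHost().getHostName();
            } catch (UnknownHostException e) {
                return "localhost";
            }
        });
    }
}
```

After the first call, readers only touch a volatile field, so 100+ scanner threads would share one lookup instead of each paying for their own. The trade-off is that a hostname change while the process is running would not be noticed.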
[jira] [Created] (HBASE-8644) hbck -fixReferenceFiles checks for write permissions in / even if sidelineDir is specified
Clint Heath created HBASE-8644: Summary: hbck -fixReferenceFiles checks for write permissions in / even if sidelineDir is specified Key: HBASE-8644 URL: https://issues.apache.org/jira/browse/HBASE-8644 Project: HBase Issue Type: Bug Components: hbck Affects Versions: 0.94.2 Reporter: Clint Heath Priority: Minor
We found a situation where a temporary reference file got left in META's recovered.edits directory, and this was causing hbck to report an inconsistency with HBase. When trying to repair it with the hbck -fixReferenceFiles command, an access control exception was thrown, because hbck was trying to put the sidelined reference file in the root of HDFS and the hbase user did not have permissions. But even after specifying a sidelineDir (hbck -fixReferenceFiles -sidelineDir hdfs://namespace/tmp), the same exception was thrown (logs obfuscated below):
INFO util.HBaseFsck: Trying to sildeline reference filehdfs://namespace/hbase/.META./1028685194/recovered.edits/016.temp to hdfs://namespace/016.temp
Exception in thread "main" org.apache.hadoop.security.AccessControlException: Permission denied: user=hbase, access=WRITE, inode=/:hdfs:hadoop:drwxr-xr-x
Since the operation requires moving data around in HDFS, we tried the same command as the hdfs user, which should have no permission problems, but an exception was thrown nonetheless:
...
WARN util.HBaseFsck: Got AccessControlException when preCheckPermission org.apache.hadoop.security.AccessControlException: Permission denied: action=WRITE path=hdfs://namespace/hbase/-ROOT- user=hdfs
Ultimately, in order to get this working, we ran this command to tell hbck not to run its prechecks:
hbase hbck -fixReferenceFiles -sidelineDir hdfs://namespace/tmp -ignorePreCheckPermission
A few thoughts here:
1) It seems that / is a bad default location to attempt to sideline files.
2) Shouldn't hbck be run as the hbase user? If so, it should sideline files to a directory it has write permissions to.
3) If we expect hbck to be run as the hdfs user, permissions should not be denied to that user when HDFS itself will not deny them.
[jira] [Resolved] (HBASE-8639) very poor performance of htable#getscanner in multithreaded environment
[ https://issues.apache.org/jira/browse/HBASE-8639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu resolved HBASE-8639. Resolution: Fixed very poor performance of htable#getscanner in multithreaded environment Key: HBASE-8639 URL: https://issues.apache.org/jira/browse/HBASE-8639 Project: HBase Issue Type: Bug Affects Versions: 0.94.7 Reporter: Raymond Liu Assignee: Ted Yu Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: 8639-0.94.txt, 8639-v1.txt
[jira] [Created] (HBASE-8645) Change ServerName so it uses hostname only, not FQDN hostnames
stack created HBASE-8645: Summary: Change ServerName so it uses hostname only, not FQDN hostnames Key: HBASE-8645 URL: https://issues.apache.org/jira/browse/HBASE-8645 Project: HBase Issue Type: Sub-task Reporter: stack No need to drag around the domain part of a hostname when we make ServerNames. Rather than do a.example.org,6,12345 as we currently do, just output: 1,6,12345. This will make names displayed in the UI and/or shell smaller. Will also tighten up our logs a little, especially where ServerName is part of a thread name.
[jira] [Updated] (HBASE-8645) Change ServerName so it uses hostname only, not FQDN hostnames
[ https://issues.apache.org/jira/browse/HBASE-8645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-8645: Attachment: 8645.txt On construction of a ServerName, if a domain part, drop it before creating ServerName. Small change in JVMClusterUtil to shorten Master to M and RegionServer to RS when used as part of thread names. Change ServerName so it uses hostname only, not FQDN hostnames Key: HBASE-8645 URL: https://issues.apache.org/jira/browse/HBASE-8645 Project: HBase Issue Type: Sub-task Reporter: stack Attachments: 8645.txt
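As an illustration of dropping the domain part before a ServerName is built, here is a minimal sketch. HostNames and stripDomain are hypothetical names, not the code in 8645.txt. The guard that leaves dotted-decimal IPv4 addresses untouched is an assumption added here, since cutting an address at its first dot would leave only the first octet.

```java
// Hypothetical helper; not the code from the attached patch.
final class HostNames {
    private HostNames() {}

    // Returns the host part of a fully-qualified name, e.g. "a.example.org" -> "a".
    // Names without a dot, and strings made only of digits and dots, are returned as-is.
    static String stripDomain(String hostname) {
        if (hostname == null || hostname.isEmpty() || looksLikeIpv4(hostname)) {
            return hostname;
        }
        int dot = hostname.indexOf('.');
        return dot > 0 ? hostname.substring(0, dot) : hostname;
    }

    // Crude check (an assumption, not a full validator): only digits and dots.
    static boolean looksLikeIpv4(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c != '.' && (c < '0' || c > '9')) {
                return false;
            }
        }
        return true;
    }
}
```

One design point to weigh with this approach: two machines that differ only in their domain would map to the same short name, so it assumes short hostnames are unique within a cluster.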
[jira] [Updated] (HBASE-8645) Change ServerName so it uses hostname only, not FQDN hostnames
[ https://issues.apache.org/jira/browse/HBASE-8645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-8645: Assignee: stack Status: Patch Available (was: Open) Change ServerName so it uses hostname only, not FQDN hostnames Key: HBASE-8645 URL: https://issues.apache.org/jira/browse/HBASE-8645 Project: HBase Issue Type: Sub-task Reporter: stack Assignee: stack Attachments: 8645.txt
[jira] [Commented] (HBASE-8639) very poor performance of htable#getscanner in multithreaded environment
[ https://issues.apache.org/jira/browse/HBASE-8639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669711#comment-13669711 ] Jean-Daniel Cryans commented on HBASE-8639: [~yuzhih...@gmail.com] can you set a more descriptive title? very poor performance of htable#getscanner in multithreaded environment Key: HBASE-8639 URL: https://issues.apache.org/jira/browse/HBASE-8639 Project: HBase Issue Type: Bug Affects Versions: 0.94.7 Reporter: Raymond Liu Assignee: Ted Yu Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: 8639-0.94.txt, 8639-v1.txt
[jira] [Updated] (HBASE-8639) Poor performance of htable#getscanner in multithreaded environment due to DNS.getDefaultHost() being called in ScannerCallable#prepare()
[ https://issues.apache.org/jira/browse/HBASE-8639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-8639: Summary: Poor performance of htable#getscanner in multithreaded environment due to DNS.getDefaultHost() being called in ScannerCallable#prepare() (was: very poor performance of htable#getscanner in multithreaded environment) Poor performance of htable#getscanner in multithreaded environment due to DNS.getDefaultHost() being called in ScannerCallable#prepare() Key: HBASE-8639 URL: https://issues.apache.org/jira/browse/HBASE-8639 Project: HBase Issue Type: Bug Affects Versions: 0.94.7 Reporter: Raymond Liu Assignee: Ted Yu Fix For: 0.98.0, 0.95.2, 0.94.9 Attachments: 8639-0.94.txt, 8639-v1.txt
[jira] [Commented] (HBASE-7680) implement compaction policy for stripe compactions
[ https://issues.apache.org/jira/browse/HBASE-7680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669725#comment-13669725 ] Sergey Shelukhin commented on HBASE-7680: rb feedback implement compaction policy for stripe compactions Key: HBASE-7680 URL: https://issues.apache.org/jira/browse/HBASE-7680 Project: HBase Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-7680-latest-with-dependencies.patch, HBASE-7680-latest-with-dependencies.patch, HBASE-7680-latest-with-dependencies.patch, HBASE-7680-v0.patch, HBASE-7680-v10.patch, HBASE-7680-v10.patch, HBASE-7680-v11.patch, HBASE-7680-v1.patch, HBASE-7680-v2.patch, HBASE-7680-v3.patch, HBASE-7680-v4.patch, HBASE-7680-v5.patch, HBASE-7680-v6.patch, HBASE-7680-v7.patch, HBASE-7680-v8.patch, HBASE-7680-v9.patch
[jira] [Updated] (HBASE-7680) implement compaction policy for stripe compactions
[ https://issues.apache.org/jira/browse/HBASE-7680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-7680: Attachment: HBASE-7680-latest-with-dependencies.patch HBASE-7680-v11.patch implement compaction policy for stripe compactions Key: HBASE-7680 URL: https://issues.apache.org/jira/browse/HBASE-7680 Project: HBase Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-7680-latest-with-dependencies.patch, HBASE-7680-latest-with-dependencies.patch, HBASE-7680-latest-with-dependencies.patch, HBASE-7680-v0.patch, HBASE-7680-v10.patch, HBASE-7680-v10.patch, HBASE-7680-v11.patch, HBASE-7680-v1.patch, HBASE-7680-v2.patch, HBASE-7680-v3.patch, HBASE-7680-v4.patch, HBASE-7680-v5.patch, HBASE-7680-v6.patch, HBASE-7680-v7.patch, HBASE-7680-v8.patch, HBASE-7680-v9.patch
[jira] [Created] (HBASE-8646) Intermittent TestIOFencing#testFencingAroundCompaction failure due to region getting stuck in compaction
Ted Yu created HBASE-8646: Summary: Intermittent TestIOFencing#testFencingAroundCompaction failure due to region getting stuck in compaction Key: HBASE-8646 URL: https://issues.apache.org/jira/browse/HBASE-8646 Project: HBase Issue Type: Test Reporter: Ted Yu From http://54.241.6.143/job/HBase-TRUNK/org.apache.hbase$hbase-server/348/testReport/junit/org.apache.hadoop.hbase/TestIOFencing/testFencingAroundCompaction/ (the underlying region is tabletest,,1369855507443.c251a1d71e75fed8e490db63419edcf1.):
{code}
2013-05-29 19:25:20,363 DEBUG [pool-1-thread-1] catalog.CatalogTracker(208): Stopping catalog tracker org.apache.hadoop.hbase.catalog.CatalogTracker@6280d069
2013-05-29 19:25:20,366 INFO [pool-1-thread-1] hbase.TestIOFencing(255): Waiting for compaction to be about to start
2013-05-29 19:25:20,367 DEBUG [pool-1-thread-1] hbase.TestIOFencing$CompactionBlockerRegion(107): waiting for compaction to block
2013-05-29 19:25:20,367 DEBUG [pool-1-thread-1] hbase.TestIOFencing$CompactionBlockerRegion(109): compaction block reached
2013-05-29 19:25:20,367 INFO [pool-1-thread-1] hbase.TestIOFencing(257): Starting a new server
2013-05-29 19:25:20,424 DEBUG [pool-1-thread-1] client.HConnectionManager(2811): regionserver/ip-10-197-74-184.us-west-1.compute.internal/10.197.74.184:0 HConnection server-to-server retries=100
...
2013-05-29 19:25:20,861 INFO [pool-1-thread-1] hbase.TestIOFencing(260): Killing region server ZK lease
...
2013-05-29 19:25:21,030 DEBUG [RS_CLOSE_REGION-ip-10-197-74-184.us-west-1.compute.internal,37836,1369855503920-0] handler.CloseRegionHandler(125): Processing close of tabletest,,1369855507443.c251a1d71e75fed8e490db63419edcf1.
2013-05-29 19:25:21,031 DEBUG [RS_CLOSE_REGION-ip-10-197-74-184.us-west-1.compute.internal,37836,1369855503920-0] regionserver.HRegion(928): Closing tabletest,,1369855507443.c251a1d71e75fed8e490db63419edcf1.: disabling compactions & flushes
2013-05-29 19:25:21,031 DEBUG [RS_CLOSE_REGION-ip-10-197-74-184.us-west-1.compute.internal,37836,1369855503920-0] regionserver.HRegion(1022): waiting for 1 compactions to complete for region tabletest,,1369855507443.c251a1d71e75fed8e490db63419edcf1.
...
2013-05-29 19:25:27,037 INFO [pool-1-thread-1] hbase.TestIOFencing(265): Waiting for the new server to pick up the region tabletest,,1369855507443.c251a1d71e75fed8e490db63419edcf1.
{code}
The test started a new region server. However, the region got stuck in:
{code}
public void waitForFlushesAndCompactions() {
  synchronized (writestate) {
    while (writestate.compacting > 0 || writestate.flushing) {
      LOG.debug("waiting for " + writestate.compacting + " compactions"
          + (writestate.flushing ? " & cache flush" : "") + " to complete for region " + this);
      try {
        writestate.wait();
{code}
This led to the timeout:
{code}
assertTrue(Timed out waiting for new server to open region, System.currentTimeMillis() - startWaitTime 6);
{code}
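The hang described above is a property of the wait/notify pattern: a waiter that loops on a counter with an untimed wait() blocks forever if the counter never returns to zero or the notification never arrives. The sketch below is a standalone model of that pattern with a bounded wait added, so that a stuck compaction surfaces as a false return value rather than a hang. WriteState, compactionStarted, compactionFinished and awaitNoCompactions are hypothetical names for illustration; this is not HRegion code and is not proposed as the fix for this issue.

```java
// Hypothetical model of the waiter/notifier pair; not HBase code.
final class WriteState {
    private int compacting; // number of compactions currently in progress

    synchronized void compactionStarted() {
        compacting++;
    }

    synchronized void compactionFinished() {
        compacting--;
        notifyAll(); // wake any thread blocked in awaitNoCompactions
    }

    // Waits until no compaction is running or the timeout elapses.
    // Returns true if the counter reached zero, false on timeout or interrupt.
    synchronized boolean awaitNoCompactions(long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (compacting > 0) {
            long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMillis <= 0) {
                return false;
            }
            try {
                wait(remainingMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for callers
                return false;
            }
        }
        return true;
    }
}
```

With an untimed wait, the first call in a scenario where compactionFinished is never invoked would block indefinitely, which matches the symptom in the log: the close handler keeps waiting for 1 compaction while the test times out.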
[jira] [Created] (HBASE-8647) ChaosMonkey NPE
stack created HBASE-8647: Summary: ChaosMonkey NPE Key: HBASE-8647 URL: https://issues.apache.org/jira/browse/HBASE-8647 Project: HBase Issue Type: Bug Reporter: stack Assignee: stack Running tests I see this from time to time:
{code}
2013-05-29 13:42:37,868 INFO [Thread-476] util.ChaosMonkey$RestartRandomRs(274): Performing action: Restart random region server
2013-05-29 13:42:37,869 WARN [Thread-476] util.ChaosMonkey$PeriodicRandomActionPolicy(578): Exception occured during performing action: java.lang.NullPointerException
    at org.apache.hadoop.hbase.util.ChaosMonkey$Action.getCurrentServers(ChaosMonkey.java:160)
    at org.apache.hadoop.hbase.util.ChaosMonkey$RestartRandomRs.perform(ChaosMonkey.java:275)
    at org.apache.hadoop.hbase.util.ChaosMonkey$PeriodicRandomActionPolicy.runOneIteration(ChaosMonkey.java:576)
    at org.apache.hadoop.hbase.util.ChaosMonkey$PeriodicPolicy.run(ChaosMonkey.java:488)
    at org.apache.hadoop.hbase.util.ChaosMonkey$CompositeSequentialPolicy.run(ChaosMonkey.java:458)
    at java.lang.Thread.run(Thread.java:680)
{code}
Our monkey has killed everything.
[jira] [Updated] (HBASE-8647) ChaosMonkey NPE
[ https://issues.apache.org/jira/browse/HBASE-8647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-8647: - Attachment: 8647.txt Allow returning no found servers. (Test will just timeout now... need to look at that too)
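The idea of "allow returning no found servers" can be sketched as follows. This is not the content of 8647.txt, which is not shown in this thread; the class `RandomServerPicker` and both method names are hypothetical, and the sketch assumes the NPE came from a server lookup yielding nothing once every region server had been killed, which the stack trace alone does not confirm.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.Random;

// Hypothetical helper; names are illustrative and not taken from ChaosMonkey.
final class RandomServerPicker {

    /** Normalizes a possibly-null server collection to a never-null list. */
    static List<String> currentServers(Collection<String> reported) {
        if (reported == null || reported.isEmpty()) {
            return Collections.emptyList(); // "no servers found" is a valid answer
        }
        return new ArrayList<>(reported);
    }

    /** Picks one server at random, or returns empty when there is nothing to pick. */
    static Optional<String> pickRandom(List<String> servers, Random random) {
        if (servers.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(servers.get(random.nextInt(servers.size())));
    }

    public static void main(String[] args) {
        Random random = new Random();
        // Every server is gone: the action can be skipped instead of throwing.
        System.out.println(pickRandom(currentServers(null), random).isPresent());
        // One live server: it is the only possible pick.
        System.out.println(pickRandom(currentServers(Arrays.asList("rs1")), random).get());
    }
}
```

A caller such as a restart action would then check for the empty case and skip the iteration, rather than dereferencing a missing server list.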
[jira] [Commented] (HBASE-8645) Change ServerName so it uses hostname only, not FQDN hostnames
[ https://issues.apache.org/jira/browse/HBASE-8645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669734#comment-13669734 ] Hadoop QA commented on HBASE-8645: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12585290/8645.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.coprocessor.TestRegionObserverScannerOpenHook Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/5869//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5869//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5869//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5869//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5869//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5869//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5869//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5869//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5869//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/5869//console This message is automatically generated. Change ServerName so it uses hostname only, not FQDN hostnames --- Key: HBASE-8645 URL: https://issues.apache.org/jira/browse/HBASE-8645 Project: HBase Issue Type: Sub-task Reporter: stack Assignee: stack Attachments: 8645.txt No need to drag around the domain part of a hostname when we make ServerNames. Rather than do a.example.org,6,12345 as we currently do, just output: 1,6,12345. This will make names displayed in the UI and/or shell smaller. 
Will also tighten up our logs a little, especially where ServerName is part of a thread name.
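The shortening proposed in HBASE-8645 can be sketched as below. This is only an illustration, not the attached patch: the class `ShortHostName` and its methods are hypothetical, the `host,port,startcode` layout is taken from the example in the issue text, and leaving IP literals untouched is an assumption the issue does not discuss.

```java
// Hypothetical helper; not the HBase ServerName class.
final class ShortHostName {

    /** Returns the part of a hostname before the first dot; IP literals are returned unchanged. */
    static String shortName(String host) {
        int dot = host.indexOf('.');
        if (dot <= 0 || isIpLiteral(host)) {
            return host; // nothing to strip, or stripping would corrupt an address
        }
        return host.substring(0, dot);
    }

    /** Assumption: digits-and-dots is IPv4, anything containing ':' is IPv6. */
    static boolean isIpLiteral(String host) {
        return host.indexOf(':') >= 0 || host.matches("[0-9.]+");
    }

    /** Builds a "host,port,startcode" string using the shortened hostname. */
    static String serverName(String host, int port, long startCode) {
        return shortName(host) + "," + port + "," + startCode;
    }

    public static void main(String[] args) {
        System.out.println(serverName("a.example.org", 6, 12345L));
        System.out.println(serverName("10.197.74.184", 6, 12345L));
    }
}
```

One design consequence worth noting: two hosts that differ only in their domain part would map to the same short name, so any real change along these lines would need to consider whether such hosts can coexist in one cluster.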
[jira] [Commented] (HBASE-8647) ChaosMonkey NPE
[ https://issues.apache.org/jira/browse/HBASE-8647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669733#comment-13669733 ] Enis Soztutar commented on HBASE-8647: -- +1