[jira] [Commented] (HBASE-5543) Add a keepalive option for IPC connections
[ https://issues.apache.org/jira/browse/HBASE-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227333#comment-13227333 ]

Himanshu Vashishtha commented on HBASE-5543:

What is the scope of the uuid token in the Coprocessor context? The current approach is to subdivide the calls by region, submit a Callable object for each of these regions, obtain a Future object for each call, and block until all of them have returned a result. So should it be a uuid from the client-side server proxy object, a list of uuids from all the involved regions, or something more elegant that I am missing? Please suggest. Thanks.

Add a keepalive option for IPC connections

Key: HBASE-5543
URL: https://issues.apache.org/jira/browse/HBASE-5543
Project: HBase
Issue Type: Improvement
Components: client, coprocessors, ipc
Reporter: Andrew Purtell

On the user list someone wrote in with a connection failure due to a long-running coprocessor:
{quote}
On Wed, Mar 7, 2012 at 10:59 PM, raghavendhra rahul wrote:
2012-03-08 12:03:09,475 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call execCoprocessor([B@50cb21, getProjection(), rpc version=1, client version=0, methodsFingerPrint=0), rpc version=1, client version=29, methodsFingerPrint=54742778 from 10.184.17.26:46472: output error
2012-03-08 12:03:09,476 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler 7 on 60020 caught: java.nio.channels.ClosedChannelException
{quote}
I suggested in response that we might consider giving our RPC a keepalive option for calls that may run for a long time (like execCoprocessor). LarsH +1ed the idea:
{quote}
+1 on keepalive. It's a shame (especially for long-running server code) to do all the work, just to find out at the end that the client has given up.
Or maybe there should be a way to cancel an operation if the client decides it does not want to wait any longer (PostgreSQL does that, for example).
Here that would mean the server would need to check periodically, and coprocessors would need to be written to support that, so maybe that's a non-starter.
{quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
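The per-region fan-out described in the comment above (one Callable per region, one Future per call, block until every call has returned) might look roughly like the sketch below. This is only an illustration: `RegionFanOut`, `callRegion`, and `execOnRegions` are hypothetical names, not HBase APIs, and the per-region work is a placeholder. It does not answer where a uuid token would attach; it only shows the structure the question is about.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative only: these names are hypothetical and not part of HBase.
class RegionFanOut {

    // Placeholder for the per-region coprocessor call; returns the name length.
    static int callRegion(String regionName) {
        return regionName.length();
    }

    // Submits one Callable per region, then blocks until every Future has a result.
    static List<Integer> execOnRegions(List<String> regions) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, regions.size()));
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (String region : regions) {
                Callable<Integer> task = () -> callRegion(region);
                futures.add(pool.submit(task));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : futures) {
                try {
                    results.add(f.get()); // blocks until this region's call returns
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting for a region", e);
                } catch (ExecutionException e) {
                    throw new IllegalStateException("region call failed", e.getCause());
                }
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

In a layout like this, a single client-side token would belong to the one `execOnRegions` invocation, whereas per-region tokens would belong to the individual tasks; which of the two the keepalive needs is the open question.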
[jira] [Commented] (HBASE-5542) Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()
[ https://issues.apache.org/jira/browse/HBASE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227362#comment-13227362 ]

Phabricator commented on HBASE-5542:

sc has commented on the revision "HBASE-5542 [jira] Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()".

@Lars: Sorry, I also missed your previous reply in the JIRA. The JIRA is flooded by HadoopQA and Phabricator comments, so it has become harder to find things.
For doMiniBatchPut, I think it will be better to do it in another patch. It will be easier to review and will actually move faster.
About the time bound, I think it makes sense to leave it to the application to decide. What do you think about this, Ted?

REVISION DETAIL
https://reviews.facebook.net/D2217

Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()

Key: HBASE-5542
URL: https://issues.apache.org/jira/browse/HBASE-5542
Project: HBase
Issue Type: Improvement
Reporter: Scott Chen
Assignee: Scott Chen
Fix For: 0.96.0
Attachments: HBASE-5542.D2217.1.patch, HBASE-5542.D2217.2.patch, HBASE-5542.D2217.3.patch, HBASE-5542.D2217.4.patch, HBASE-5542.D2217.5.patch, HBASE-5542.D2217.6.patch, HBASE-5542.D2217.7.patch

mutateRowsWithLocks() does atomic mutations on multiple rows.
processRow() does atomic read-modify-writes on a single row.
It would be useful to generalize both and have a processRowsWithLocks() that does atomic read-modify-writes on multiple rows.
This also helps reduce some redundancy in the code.
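To make the proposed generalization concrete, here is a minimal sketch of an atomic read-modify-write over several rows under row locks. Everything in it is an assumption for illustration: `MultiRowProcessor`, its in-memory store, and the signature of `processRowsWithLocks` are not taken from the HBASE-5542 patch. The thread also does not say exactly what the time bound covers; this sketch applies it to lock acquisition only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Consumer;

// Hypothetical sketch; none of these names come from the actual patch.
class MultiRowProcessor {
    private final Map<String, ReentrantLock> rowLocks = new ConcurrentHashMap<>();
    private final Map<String, Long> store = new ConcurrentHashMap<>();

    long get(String row) {
        return store.getOrDefault(row, 0L);
    }

    // Locks all rows in sorted order (to avoid deadlock between callers), runs the
    // read-modify-write under those locks, and gives up without modifying anything
    // if the locks cannot all be acquired within the time bound.
    boolean processRowsWithLocks(List<String> rows, long timeoutMs,
                                 Consumer<Map<String, Long>> readModifyWrite) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        List<ReentrantLock> held = new ArrayList<>();
        try {
            for (String row : new TreeSet<>(rows)) {
                ReentrantLock lock = rowLocks.computeIfAbsent(row, r -> new ReentrantLock());
                long remaining = deadline - System.nanoTime();
                if (!lock.tryLock(remaining, TimeUnit.NANOSECONDS)) {
                    return false; // time bound exceeded before all locks were held
                }
                held.add(lock);
            }
            readModifyWrite.accept(store);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            for (ReentrantLock lock : held) {
                lock.unlock();
            }
        }
    }
}
```

Passing the bound as a parameter leaves the decision to the caller, as suggested in the comment; a large default could be layered on top with an overload.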
[jira] [Updated] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-5399:

Attachment: nochange.patch

Cut the link between the client and the zookeeper ensemble

Key: HBASE-5399
URL: https://issues.apache.org/jira/browse/HBASE-5399
Project: HBase
Issue Type: Improvement
Components: client
Affects Versions: 0.94.0
Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
Fix For: 0.96.0
Attachments: 5399.v27.patch, 5399.v38.patch, 5399.v39.patch, 5399.v40.patch, 5399.v41.patch, 5399.v42.patch, 5399.v42.patch, 5399.v42.patch, 5399.v42.patch, 5399_inprogress.patch, 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 5399_inprogress.v9.patch, nochange.patch

The link is often considered an issue, for various reasons. One of them is that there is a limit on the number of connections that ZK can manage. Stack also suggested removing the link to the master from HConnection.
There are choices to be made considering the existing API (which we don't want to break).
The first patches I will submit on hadoop-qa should not be committed: they are here to show the progress on the direction taken.

ZooKeeper is used for:
- public getter, to let the client do whatever he wants, and close ZooKeeper when closing the connection => we have to deprecate this but keep it.
- read the master address to create a master => now done with a temporary zookeeper connection
- read root location => now done with a temporary zookeeper connection, but questionable. Used in public function locateRegion. To be reworked.
- read cluster id => now done once with a temporary zookeeper connection.
- check if base node is available => now done once with a zookeeper connection given as a parameter
- isTableDisabled/isTableAvailable => public functions, now done with a temporary zookeeper connection.
  - Called internally from HBaseAdmin and HTable
- getCurrentNrHRS(): public function to get the number of region servers and create a pool of threads => now done with a temporary zookeeper connection

Master is used for:
- getMaster public getter, as for ZooKeeper => we have to deprecate this but keep it.
- isMasterRunning(): public function, used internally by HMerge and HBaseAdmin
- getHTableDescriptor*: public functions offering access to the master. => we could make them use a temporary master connection as well.

Main points are:
- hbase class for ZooKeeper: ZooKeeperWatcher is really designed for a strongly coupled architecture ;-). This can be changed, but requires a lot of modifications in these classes (likely adding a class in the middle of the hierarchy, something like that). Anyway, a non-connected client will always be much slower, because it needs a TCP connection, and establishing a TCP connection is slow.
- Having a link between ZK and all the clients seems to make sense for some use cases. However, it won't scale if a TCP connection is required for every client.
- If we move the table descriptor part away from the client, we need to find a new place for it.
- We will have the same issue with HBaseAdmin (for both ZK and Master); maybe we can put a timeout on the connection. That would make the whole system less deterministic, however.
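The "temporary zookeeper connection" idea that recurs in the description can be illustrated with a small sketch: open a connection for a single read and close it before returning, so that no connection stays open per client. The classes below are stand-ins invented for this example (`FakeEnsemble`, `TempConnection`, `readOnce`); they are not HBase or ZooKeeper classes, and the path and address are illustrative values only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins; not HBase or ZooKeeper classes.
class TemporaryConnectionDemo {

    // Counts connections so the effect of closing promptly is visible.
    static class FakeEnsemble {
        final Map<String, String> znodes = new HashMap<>();
        int openConnections = 0;
        int totalConnects = 0;
    }

    // A short-lived connection: opened for one read, closed by try-with-resources.
    static class TempConnection implements AutoCloseable {
        private final FakeEnsemble ensemble;

        TempConnection(FakeEnsemble ensemble) {
            this.ensemble = ensemble;
            ensemble.openConnections++;
            ensemble.totalConnects++;
        }

        String read(String path) {
            return ensemble.znodes.get(path);
        }

        @Override
        public void close() {
            ensemble.openConnections--;
        }
    }

    // Reads one value and releases the connection before returning.
    static String readOnce(FakeEnsemble ensemble, String path) {
        try (TempConnection conn = new TempConnection(ensemble)) {
            return conn.read(path);
        }
    }
}
```

The trade-off is the one the description already names: every read pays the connection setup cost, in exchange for the ensemble holding no long-lived connection per client.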
[jira] [Updated] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-5399:

Status: Open (was: Patch Available)
[jira] [Updated] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-5399:

Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227436#comment-13227436 ]

Hadoop QA commented on HBASE-5399:

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12517999/nochange.patch
  against trunk revision .

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
-1 javadoc. The javadoc tool appears to have generated -123 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 159 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
  org.apache.hadoop.hbase.client.TestAdmin
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestImportTsv

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1162//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1162//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1162//console

This message is automatically generated.
[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227455#comment-13227455 ]

nkeywal commented on HBASE-5399:

Ok, even an empty patch gets these errors from hadoop-qa, so the v42 can be committed imho.
[jira] [Updated] (HBASE-5563) HRegionInfo#compareTo add the comparison of regionId
[ https://issues.apache.org/jira/browse/HBASE-5563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-5563:

Status: Patch Available (was: Open)

HRegionInfo#compareTo add the comparison of regionId

Key: HBASE-5563
URL: https://issues.apache.org/jira/browse/HBASE-5563
Project: HBase
Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
Attachments: HBASE-5563.patch

When one region is assigned multiple times, we can find two regions that have the same table name, the same startKey, and the same endKey, but different regionIds. These two regions are treated as the same key in a TreeMap but as different keys in a HashMap.
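The TreeMap/HashMap mismatch reported above can be reproduced with a small stand-in class. `RegionKey` below is a simplified, hypothetical substitute for HRegionInfo (its fields and methods are assumptions made for the example): its `compareTo` ignores `regionId` while `equals` and `hashCode` include it, which is the inconsistency the issue describes. The `WITH_REGION_ID` comparator shows the direction named in the issue title, not the actual patch.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

// Simplified, hypothetical stand-in for HRegionInfo.
class RegionKey implements Comparable<RegionKey> {
    final String table;
    final String startKey;
    final String endKey;
    final long regionId;

    RegionKey(String table, String startKey, String endKey, long regionId) {
        this.table = table;
        this.startKey = startKey;
        this.endKey = endKey;
        this.regionId = regionId;
    }

    // Mirrors the reported problem: regionId is ignored when ordering.
    @Override
    public int compareTo(RegionKey o) {
        int c = table.compareTo(o.table);
        if (c != 0) return c;
        c = startKey.compareTo(o.startKey);
        if (c != 0) return c;
        return endKey.compareTo(o.endKey);
    }

    // equals and hashCode do take regionId into account.
    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof RegionKey)) return false;
        RegionKey o = (RegionKey) obj;
        return table.equals(o.table) && startKey.equals(o.startKey)
                && endKey.equals(o.endKey) && regionId == o.regionId;
    }

    @Override
    public int hashCode() {
        return Objects.hash(table, startKey, endKey, regionId);
    }

    // Ordering that also compares regionId, so it agrees with equals.
    static final Comparator<RegionKey> WITH_REGION_ID = (x, y) -> {
        int c = x.compareTo(y);
        return c != 0 ? c : Long.compare(x.regionId, y.regionId);
    };

    // Puts two keys that differ only in regionId into the given map.
    static int sizeAfterTwoPuts(Map<RegionKey, String> map) {
        map.put(new RegionKey("t", "a", "m", 1L), "rs1");
        map.put(new RegionKey("t", "a", "m", 2L), "rs2");
        return map.size();
    }
}
```

With the default ordering the TreeMap ends up holding one entry and the HashMap two; with the comparator that includes `regionId`, both hold two.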
[jira] [Commented] (HBASE-5542) Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()
[ https://issues.apache.org/jira/browse/HBASE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227569#comment-13227569 ]

Ted Yu commented on HBASE-5542:

I think we should keep the time bound, with a default value that can be large.
[jira] [Updated] (HBASE-5542) Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()
[ https://issues.apache.org/jira/browse/HBASE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-5542:

Comment: was deleted

(was: -1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12517843/HBASE-5542.D2217.6.patch
  against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 4 new or modified tests.
-1 javadoc. The javadoc tool appears to have generated -120 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 158 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
  org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1156//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1156//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1156//console

This message is automatically generated.)
[jira] [Updated] (HBASE-5520) Support reseek() at RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-5520:

Attachment: HBASE-5520_3.patch

Support reseek() at RegionScanner

Key: HBASE-5520
URL: https://issues.apache.org/jira/browse/HBASE-5520
Project: HBase
Issue Type: Improvement
Components: regionserver
Affects Versions: 0.92.0
Reporter: Anoop Sam John
Assignee: ramkrishna.s.vasudevan
Fix For: 0.96.0
Attachments: HBASE-5520_1.patch, HBASE-5520_2.patch, HBASE-5520_3.patch

reseek() is not supported currently at the RegionScanner level. We can support the same.
This is created following the discussion under HBASE-2038
[jira] [Commented] (HBASE-5520) Support reseek() at RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227581#comment-13227581 ]

ramkrishna.s.vasudevan commented on HBASE-5520:

@Lars and @Stack
Updated the API to take a row instead of a KV.
@Lars
Yes, SEEK_NEXT_USING_HINT is a definite gain, but this reseek will help avoid the one problem that filters have to reseek one more KV in the case where we know the row that we need to seek to. This is just a thought from Anoop and me.
Please let us know your suggestions and comments.
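As a rough illustration of a row-based reseek, the sketch below moves a forward-only scanner directly to the first row at or after a given row, instead of stepping through the entries in between. `SimpleRowScanner` is a hypothetical class over an in-memory sorted map; it is not the HBase RegionScanner interface, and the method signature is an assumption rather than the one in the attached patch.

```java
import java.util.NavigableMap;

// Hypothetical sketch; not the HBase RegionScanner interface.
class SimpleRowScanner {
    private final NavigableMap<String, String> rows;
    private String current; // next row to return, or null when exhausted

    SimpleRowScanner(NavigableMap<String, String> rows) {
        this.rows = rows;
        this.current = rows.isEmpty() ? null : rows.firstKey();
    }

    // Returns the next row key, or null at the end of the scan.
    String next() {
        String result = current;
        if (current != null) {
            current = rows.higherKey(current);
        }
        return result;
    }

    // Moves forward to the first row >= the given row; never moves backwards.
    // Returns false when the scanner is (or becomes) exhausted.
    boolean reseek(String row) {
        if (current == null) {
            return false;
        }
        if (row.compareTo(current) > 0) {
            current = rows.ceilingKey(row);
        }
        return current != null;
    }
}
```

A caller that already knows the target row can call `reseek(row)` once and continue with `next()`, which is the case the comment above has in mind.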
[jira] [Commented] (HBASE-5520) Support reseek() at RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227582#comment-13227582 ]

Hadoop QA commented on HBASE-5520:

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12518028/HBASE-5520_3.patch
  against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 6 new or modified tests.
-1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1164//console

This message is automatically generated.
[jira] [Commented] (HBASE-5563) HRegionInfo#compareTo add the comparison of regionId
[ https://issues.apache.org/jira/browse/HBASE-5563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227587#comment-13227587 ] Hadoop QA commented on HBASE-5563: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12517985/HBASE-5563.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 javadoc. The javadoc tool appears to have generated -123 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 159 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.master.TestRollingRestart org.apache.hadoop.hbase.io.hfile.TestLruBlockCache org.apache.hadoop.hbase.coprocessor.TestClassLoading org.apache.hadoop.hbase.client.TestAdmin org.apache.hadoop.hbase.master.TestDistributedLogSplitting org.apache.hadoop.hbase.client.TestShell org.apache.hadoop.hbase.master.TestMasterRestartAfterDisablingTable org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface org.apache.hadoop.hbase.mapreduce.TestImportTsv org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1163//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1163//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1163//console This message is automatically generated. 
HRegionInfo#compareTo add the comparison of regionId Key: HBASE-5563 URL: https://issues.apache.org/jira/browse/HBASE-5563 Project: HBase Issue Type: Bug Reporter: chunhui shen Assignee: chunhui shen Attachments: HBASE-5563.patch When one region is assigned multiple times, we can end up with two regions that have the same table name, the same startKey, and the same endKey, but different regionIds. These two regions are then the same key in a TreeMap but different keys in a HashMap.
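The TreeMap/HashMap mismatch described above can be reproduced with a minimal sketch. RegionKey below is a hypothetical stand-in for HRegionInfo (it is not the HBase class), and the compareRegionId flag is an illustration-only switch that turns the proposed regionId comparison on and off.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

public class RegionKeyDemo {

    /** Hypothetical stand-in for HRegionInfo; holds only the fields named in the issue. */
    static final class RegionKey implements Comparable<RegionKey> {
        final String table;
        final String startKey;
        final String endKey;
        final long regionId;
        final boolean compareRegionId; // true = apply the proposed regionId comparison

        RegionKey(String table, String startKey, String endKey,
                  long regionId, boolean compareRegionId) {
            this.table = table;
            this.startKey = startKey;
            this.endKey = endKey;
            this.regionId = regionId;
            this.compareRegionId = compareRegionId;
        }

        @Override
        public int compareTo(RegionKey other) {
            int c = table.compareTo(other.table);
            if (c != 0) return c;
            c = startKey.compareTo(other.startKey);
            if (c != 0) return c;
            c = endKey.compareTo(other.endKey);
            if (c != 0) return c;
            // Without this last step, compareTo disagrees with equals().
            return compareRegionId ? Long.compare(regionId, other.regionId) : 0;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof RegionKey)) return false;
            RegionKey r = (RegionKey) o;
            return table.equals(r.table) && startKey.equals(r.startKey)
                && endKey.equals(r.endKey) && regionId == r.regionId;
        }

        @Override
        public int hashCode() {
            return Objects.hash(table, startKey, endKey, regionId);
        }
    }

    /** Inserts the same key range twice with different regionIds and returns the map size. */
    static int entriesAfterDoubleAssign(Map<RegionKey, String> map, boolean compareRegionId) {
        map.put(new RegionKey("t", "a", "m", 1L, compareRegionId), "server1");
        map.put(new RegionKey("t", "a", "m", 2L, compareRegionId), "server2");
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println("TreeMap, regionId ignored:  "
            + entriesAfterDoubleAssign(new TreeMap<RegionKey, String>(), false));
        System.out.println("TreeMap, regionId compared: "
            + entriesAfterDoubleAssign(new TreeMap<RegionKey, String>(), true));
        System.out.println("HashMap:                    "
            + entriesAfterDoubleAssign(new HashMap<RegionKey, String>(), false));
    }
}
```

A TreeMap decides key identity through compareTo, while a HashMap uses equals and hashCode, so the two only agree once compareTo takes regionId into account.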
[jira] [Created] (HBASE-5564) Bulkload is discarding duplicate records
Bulkload is discarding duplicate records Key: HBASE-5564 URL: https://issues.apache.org/jira/browse/HBASE-5564 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0 Environment: HBase 0.92 Reporter: Laxman Assignee: Laxman Duplicate records are discarded when they exist in the same input file, and more specifically when they exist in the same split. Duplicate records are considered if they come from different splits. Version under test: HBase 0.92
[jira] [Commented] (HBASE-5564) Bulkload is discarding duplicate records
[ https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227597#comment-13227597 ] Laxman commented on HBASE-5564: --- I think this is a bug, not intentional behavior. The use of TreeSet in the code snippet below is causing the issue.
PutSortReducer.reduce()
    TreeSet<KeyValue> map = new TreeSet<KeyValue>(KeyValue.COMPARATOR);
    long curSize = 0;
    // stop at the end or the RAM threshold
    while (iter.hasNext() && curSize < threshold) {
      Put p = iter.next();
      for (List<KeyValue> kvs : p.getFamilyMap().values()) {
        for (KeyValue kv : kvs) {
          map.add(kv);
          curSize += kv.getLength();
        }
      }
    }
Changing this back to a List and then sorting explicitly will solve the issue. Bulkload is discarding duplicate records Key: HBASE-5564 URL: https://issues.apache.org/jira/browse/HBASE-5564 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0 Environment: HBase 0.92 Reporter: Laxman Assignee: Laxman Labels: bulkloader Duplicate records are discarded when they exist in the same input file, and more specifically when they exist in the same split. Duplicate records are considered if they come from different splits. Version under test: HBase 0.92
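The TreeSet behavior described above can be shown in isolation. This is a minimal sketch that does not use HBase: Cell is a hypothetical simplified record, and KEY_ONLY is an assumed comparator that, like a key-based comparator, orders by row, qualifier, and timestamp but ignores the value.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

public class TreeSetDedupDemo {

    /** Hypothetical simplified cell; not HBase's KeyValue. */
    static final class Cell {
        final String row;
        final String qualifier;
        final long ts;
        final String value;

        Cell(String row, String qualifier, long ts, String value) {
            this.row = row;
            this.qualifier = qualifier;
            this.ts = ts;
            this.value = value;
        }
    }

    /** Assumed key-only ordering: the value never takes part in the comparison. */
    static final Comparator<Cell> KEY_ONLY = new Comparator<Cell>() {
        @Override
        public int compare(Cell a, Cell b) {
            int c = a.row.compareTo(b.row);
            if (c != 0) return c;
            c = a.qualifier.compareTo(b.qualifier);
            if (c != 0) return c;
            return Long.compare(a.ts, b.ts);
        }
    };

    /** Two cells share row, qualifier, and timestamp but carry different values. */
    static List<Cell> sample() {
        List<Cell> cells = new ArrayList<Cell>();
        cells.add(new Cell("r1", "q", 100L, "first"));
        cells.add(new Cell("r1", "q", 100L, "second"));
        cells.add(new Cell("r2", "q", 100L, "other"));
        return cells;
    }

    /** A TreeSet silently rejects any element that compares equal to one it already holds. */
    static int keptByTreeSet(List<Cell> cells) {
        TreeSet<Cell> set = new TreeSet<Cell>(KEY_ONLY);
        set.addAll(cells);
        return set.size();
    }

    /** A sorted List keeps every element, including those that compare equal. */
    static int keptBySortedList(List<Cell> cells) {
        List<Cell> sorted = new ArrayList<Cell>(cells);
        Collections.sort(sorted, KEY_ONLY);
        return sorted.size();
    }

    public static void main(String[] args) {
        System.out.println("TreeSet keeps " + keptByTreeSet(sample()) + " cells");
        System.out.println("Sorted list keeps " + keptBySortedList(sample()) + " cells");
    }
}
```

The sketch only covers what happens inside the collection; whether cells with identical keys survive later stages of a bulk load is a separate question.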
[jira] [Issue Comment Edited] (HBASE-5564) Bulkload is discarding duplicate records
[ https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227597#comment-13227597 ] ramkrishna.s.vasudevan edited comment on HBASE-5564 at 3/12/12 3:21 PM: I think this is a bug, not intentional behavior. The use of TreeSet in the code snippet below is causing the issue.
PutSortReducer.reduce()
{code}
TreeSet<KeyValue> map = new TreeSet<KeyValue>(KeyValue.COMPARATOR);
long curSize = 0;
// stop at the end or the RAM threshold
while (iter.hasNext() && curSize < threshold) {
  Put p = iter.next();
  for (List<KeyValue> kvs : p.getFamilyMap().values()) {
    for (KeyValue kv : kvs) {
      map.add(kv);
      curSize += kv.getLength();
    }
  }
}
{code}
Changing this back to a List and then sorting explicitly will solve the issue.
was (Author: lakshman): I think this is a bug, not intentional behavior. The use of TreeSet in the code snippet below is causing the issue.
PutSortReducer.reduce()
    TreeSet<KeyValue> map = new TreeSet<KeyValue>(KeyValue.COMPARATOR);
    long curSize = 0;
    // stop at the end or the RAM threshold
    while (iter.hasNext() && curSize < threshold) {
      Put p = iter.next();
      for (List<KeyValue> kvs : p.getFamilyMap().values()) {
        for (KeyValue kv : kvs) {
          map.add(kv);
          curSize += kv.getLength();
        }
      }
    }
Changing this back to a List and then sorting explicitly will solve the issue. Bulkload is discarding duplicate records Key: HBASE-5564 URL: https://issues.apache.org/jira/browse/HBASE-5564 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0 Environment: HBase 0.92 Reporter: Laxman Assignee: Laxman Labels: bulkloader Duplicate records are discarded when they exist in the same input file, and more specifically when they exist in the same split. Duplicate records are considered if they come from different splits. Version under test: HBase 0.92
[jira] [Updated] (HBASE-5206) Port HBASE-5155 to 0.92 and TRUNK
[ https://issues.apache.org/jira/browse/HBASE-5206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Jindal updated HBASE-5206: --- Attachment: 5206_trunk_latest_1.patch Port HBASE-5155 to 0.92 and TRUNK - Key: HBASE-5206 URL: https://issues.apache.org/jira/browse/HBASE-5206 Project: HBase Issue Type: Bug Affects Versions: 0.92.2, 0.96.0 Reporter: Zhihong Yu Attachments: 5206_92_1.patch, 5206_92_latest_1.patch, 5206_trunk_1.patch, 5206_trunk_latest_1.patch This JIRA ports HBASE-5155 (ServerShutDownHandler And Disable/Delete should not happen parallely leading to recreation of regions that were deleted) to 0.92 and TRUNK -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5206) Port HBASE-5155 to 0.92 and TRUNK
[ https://issues.apache.org/jira/browse/HBASE-5206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Jindal updated HBASE-5206: --- Attachment: 5206_92_latest_1.patch Port HBASE-5155 to 0.92 and TRUNK - Key: HBASE-5206 URL: https://issues.apache.org/jira/browse/HBASE-5206 Project: HBase Issue Type: Bug Affects Versions: 0.92.2, 0.96.0 Reporter: Zhihong Yu Attachments: 5206_92_1.patch, 5206_92_latest_1.patch, 5206_trunk_1.patch, 5206_trunk_latest_1.patch This JIRA ports HBASE-5155 (ServerShutDownHandler And Disable/Delete should not happen parallely leading to recreation of regions that were deleted) to 0.92 and TRUNK -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5206) Port HBASE-5155 to 0.92 and TRUNK
[ https://issues.apache.org/jira/browse/HBASE-5206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227636#comment-13227636 ] Ashutosh Jindal commented on HBASE-5206: Fixed the formatting issues. bq. Can we remember the tables we have enabled so that the number of calls to setEnabledTable() is reduced? We need to go to ZooKeeper only the first time, to enable the table; for all later calls that check the table's enabled state, the state is cached in ZKTable.java. Please correct me if I am wrong. Please provide your suggestions/comments. Port HBASE-5155 to 0.92 and TRUNK - Key: HBASE-5206 URL: https://issues.apache.org/jira/browse/HBASE-5206 Project: HBase Issue Type: Bug Affects Versions: 0.92.2, 0.96.0 Reporter: Zhihong Yu Attachments: 5206_92_1.patch, 5206_92_latest_1.patch, 5206_trunk_1.patch, 5206_trunk_latest_1.patch This JIRA ports HBASE-5155 (ServerShutDownHandler And Disable/Delete should not happen parallely leading to recreation of regions that were deleted) to 0.92 and TRUNK
[jira] [Updated] (HBASE-5563) HRegionInfo#compareTo add the comparison of regionId
[ https://issues.apache.org/jira/browse/HBASE-5563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-5563: -- Status: Open (was: Patch Available) HRegionInfo#compareTo add the comparison of regionId Key: HBASE-5563 URL: https://issues.apache.org/jira/browse/HBASE-5563 Project: HBase Issue Type: Bug Reporter: chunhui shen Assignee: chunhui shen Attachments: HBASE-5563.patch In the one region multi assigned case, we could find that two regions have the same table name, same startKey, same endKey, and different regionId, so these two regions are same in TreeMap but different in HashMap. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5206) Port HBASE-5155 to 0.92 and TRUNK
[ https://issues.apache.org/jira/browse/HBASE-5206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227661#comment-13227661 ] Ted Yu commented on HBASE-5206: --- Patch v2 looks good. Minor comments:
{code}
 // Call to undisableTable does this. TODO: Make a more formal purge table.
-am.getZKTable().setEnabledTable(Bytes.toString(tableName));
+am.getZKTable().setDeletedTable(Bytes.toString(tableName));
{code}
I don't see undisableTable. Can we remove the comment above?
{code}
+ } else if (!this.zkTable
+     .isEnabledTable(region.getTableNameAsString())) {
+   setEnabledTable(region);
{code}
setEnabledTable(HRegionInfo hri) already calls zkTable.isEnabledTable(). It seems we can call setEnabledTable(region) directly above. Port HBASE-5155 to 0.92 and TRUNK - Key: HBASE-5206 URL: https://issues.apache.org/jira/browse/HBASE-5206 Project: HBase Issue Type: Bug Affects Versions: 0.92.2, 0.96.0 Reporter: Zhihong Yu Attachments: 5206_92_1.patch, 5206_92_latest_1.patch, 5206_trunk_1.patch, 5206_trunk_latest_1.patch This JIRA ports HBASE-5155 (ServerShutDownHandler And Disable/Delete should not happen parallely leading to recreation of regions that were deleted) to 0.92 and TRUNK
[jira] [Commented] (HBASE-5542) Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()
[ https://issues.apache.org/jira/browse/HBASE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227665#comment-13227665 ] Phabricator commented on HBASE-5542: lhofhansl has commented on the revision HBASE-5542 [jira] Unify HRegion.mutateRowsWithLocks() and HRegion.processRow(). @Scott: Yes, let's use another jira for doMiniBatchPut. For the timebound logic, at the very least there has to be an option to not do that for mutateRowsWithLocks, as spawning another thread for that would be too expensive. REVISION DETAIL https://reviews.facebook.net/D2217 Unify HRegion.mutateRowsWithLocks() and HRegion.processRow() Key: HBASE-5542 URL: https://issues.apache.org/jira/browse/HBASE-5542 Project: HBase Issue Type: Improvement Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.96.0 Attachments: HBASE-5542.D2217.1.patch, HBASE-5542.D2217.2.patch, HBASE-5542.D2217.3.patch, HBASE-5542.D2217.4.patch, HBASE-5542.D2217.5.patch, HBASE-5542.D2217.6.patch, HBASE-5542.D2217.7.patch mutateRowsWithLocks() does atomic mutations on multiple rows. processRow() does atomic read-modify-writes on a single row. It will be useful to generalize both and have a processRowsWithLocks() that does atomic read-modify-writes on multiple rows. This also helps reduce some redundancy in the codes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5206) Port HBASE-5155 to 0.92 and TRUNK
[ https://issues.apache.org/jira/browse/HBASE-5206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227669#comment-13227669 ] Ted Yu commented on HBASE-5206: ---
{code}
+ String errorMsg = "Unable to ensure that the table " + tableName
+     + "will be" + " enabled because of a ZooKeeper issue";
{code}
A space should be added between the opening quote and "will". Port HBASE-5155 to 0.92 and TRUNK - Key: HBASE-5206 URL: https://issues.apache.org/jira/browse/HBASE-5206 Project: HBase Issue Type: Bug Affects Versions: 0.92.2, 0.96.0 Reporter: Zhihong Yu Attachments: 5206_92_1.patch, 5206_92_latest_1.patch, 5206_trunk_1.patch, 5206_trunk_latest_1.patch This JIRA ports HBASE-5155 (ServerShutDownHandler And Disable/Delete should not happen parallely leading to recreation of regions that were deleted) to 0.92 and TRUNK
[jira] [Commented] (HBASE-5206) Port HBASE-5155 to 0.92 and TRUNK
[ https://issues.apache.org/jira/browse/HBASE-5206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227674#comment-13227674 ] Hadoop QA commented on HBASE-5206: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12518039/5206_trunk_latest_1.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 15 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -123 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 159 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.TestDrainingServer org.apache.hadoop.hbase.mapreduce.TestImportTsv org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1165//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1165//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1165//console This message is automatically generated. Port HBASE-5155 to 0.92 and TRUNK - Key: HBASE-5206 URL: https://issues.apache.org/jira/browse/HBASE-5206 Project: HBase Issue Type: Bug Affects Versions: 0.92.2, 0.96.0 Reporter: Zhihong Yu Attachments: 5206_92_1.patch, 5206_92_latest_1.patch, 5206_trunk_1.patch, 5206_trunk_latest_1.patch This JIRA ports HBASE-5155 (ServerShutDownHandler And Disable/Delete should not happen parallely leading to recreation of regions that were deleted) to 0.92 and TRUNK -- This message is automatically generated by JIRA. 
[jira] [Commented] (HBASE-5564) Bulkload is discarding duplicate records
[ https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227678#comment-13227678 ] Laxman commented on HBASE-5564: --- I tested again with the proposed change ("Changing this back to List and then sort explicitly will solve the issue"). The same problem still persists, which makes this issue a bit more complicated. I think using the same timestamp for all records in a split is causing the issue. Currently in the code: a) if configured, we use a static timestamp for all mappers; b) if not configured, we use the current system time, generated once for each split. TsvImporterMapper.doSetup
{code}
ts = conf.getLong(ImportTsv.TIMESTAMP_CONF_KEY, System.currentTimeMillis());
{code}
Should we think of an approach that generates a unique sequence number and uses it as the timestamp? Any other thoughts? Bulkload is discarding duplicate records Key: HBASE-5564 URL: https://issues.apache.org/jira/browse/HBASE-5564 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0 Environment: HBase 0.92 Reporter: Laxman Assignee: Laxman Labels: bulkloader Duplicate records are discarded when they exist in the same input file, and more specifically when they exist in the same split. Duplicate records are considered if they come from different splits. Version under test: HBase 0.92
[jira] [Commented] (HBASE-5206) Port HBASE-5155 to 0.92 and TRUNK
[ https://issues.apache.org/jira/browse/HBASE-5206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227688#comment-13227688 ] Ted Yu commented on HBASE-5206: --- The following error is reproducible on MacBook (patch for 0.92): {code} Tests in error: org.apache.hadoop.hbase.TestDrainingServer: org.apache.hadoop.hbase.TableNotEnabledException: t {code} Port HBASE-5155 to 0.92 and TRUNK - Key: HBASE-5206 URL: https://issues.apache.org/jira/browse/HBASE-5206 Project: HBase Issue Type: Bug Affects Versions: 0.92.2, 0.96.0 Reporter: Zhihong Yu Attachments: 5206_92_1.patch, 5206_92_latest_1.patch, 5206_trunk_1.patch, 5206_trunk_latest_1.patch This JIRA ports HBASE-5155 (ServerShutDownHandler And Disable/Delete should not happen parallely leading to recreation of regions that were deleted) to 0.92 and TRUNK -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5564) Bulkload is discarding duplicate records
[ https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227696#comment-13227696 ] Jesse Yates commented on HBASE-5564: Hmm, I think you're right about this being a problem. It would be totally reasonable to change
{code}
KeyValue kv = new KeyValue(
    lineBytes, parsed.getRowKeyOffset(), parsed.getRowKeyLength(),
    parser.getFamily(i), 0, parser.getFamily(i).length,
    parser.getQualifier(i), 0, parser.getQualifier(i).length,
    ts, KeyValue.Type.Put,
    lineBytes, parsed.getColumnOffset(i), parsed.getColumnLength(i));
{code}
to use something like: {code}ts++{code} The question is, if you have a TSV file with the same row key several times, which value should be considered the most recent version? Should any of them? Maybe that is actually a problem, and we want a warning/error when it occurs. Bulkload is discarding duplicate records Key: HBASE-5564 URL: https://issues.apache.org/jira/browse/HBASE-5564 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0 Environment: HBase 0.92 Reporter: Laxman Assignee: Laxman Labels: bulkloader Duplicate records are discarded when they exist in the same input file, and more specifically when they exist in the same split. Duplicate records are considered if they come from different splits. Version under test: HBase 0.92
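The ts++ suggestion above can be sketched without HBase. Key is a hypothetical simplified cell key, and distinctKeys is an illustration-only helper: it counts how many keys survive in a sorted set when every record gets the same timestamp versus a timestamp that is incremented per record.

```java
import java.util.TreeSet;

public class IncrementingTimestampDemo {

    /** Hypothetical simplified cell key (row, qualifier, timestamp); not HBase's KeyValue. */
    static final class Key implements Comparable<Key> {
        final String row;
        final String qualifier;
        final long ts;

        Key(String row, String qualifier, long ts) {
            this.row = row;
            this.qualifier = qualifier;
            this.ts = ts;
        }

        @Override
        public int compareTo(Key other) {
            int c = row.compareTo(other.row);
            if (c != 0) return c;
            c = qualifier.compareTo(other.qualifier);
            if (c != 0) return c;
            return Long.compare(ts, other.ts);
        }
    }

    /**
     * Builds one key per input row and returns how many distinct keys remain.
     * With increment == false every key shares baseTs, as with one timestamp per split.
     * With increment == true each record gets the next timestamp (the ts++ idea).
     */
    static int distinctKeys(String[] rows, long baseTs, boolean increment) {
        TreeSet<Key> keys = new TreeSet<Key>();
        long ts = baseTs;
        for (String row : rows) {
            keys.add(new Key(row, "q", ts));
            if (increment) {
                ts++;
            }
        }
        return keys.size();
    }

    public static void main(String[] args) {
        String[] rows = {"r1", "r1", "r2"};
        System.out.println("shared timestamp:       " + distinctKeys(rows, 1000L, false));
        System.out.println("incremented timestamps: " + distinctKeys(rows, 1000L, true));
    }
}
```

Incrementing makes the later record in the file the newer version, which is one possible answer to the "which value is most recent" question; the sketch does not settle whether that is the desired semantics.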
[jira] [Created] (HBASE-5565) Refactoring doMiniBatchPut()
Refactoring doMiniBatchPut() Key: HBASE-5565 URL: https://issues.apache.org/jira/browse/HBASE-5565 Project: HBase Issue Type: Improvement Reporter: Scott Chen Assignee: Lars Hofhansl HBASE-5542 introduce a RowProcessor interface that can handle multi-row transaction. It will make the code cleaner if we can make doMiniBatchPut() to be covered by RowProcessor. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5542) Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()
[ https://issues.apache.org/jira/browse/HBASE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227724#comment-13227724 ] Phabricator commented on HBASE-5542: sc has commented on the revision HBASE-5542 [jira] Unify HRegion.mutateRowsWithLocks() and HRegion.processRow(). @lhofhansl: I have created a jira for you :) https://issues.apache.org/jira/browse/HBASE-5565 Thank you for volunteer on working on it. I will make the time bound optional. Will update the patch soon. REVISION DETAIL https://reviews.facebook.net/D2217 Unify HRegion.mutateRowsWithLocks() and HRegion.processRow() Key: HBASE-5542 URL: https://issues.apache.org/jira/browse/HBASE-5542 Project: HBase Issue Type: Improvement Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.96.0 Attachments: HBASE-5542.D2217.1.patch, HBASE-5542.D2217.2.patch, HBASE-5542.D2217.3.patch, HBASE-5542.D2217.4.patch, HBASE-5542.D2217.5.patch, HBASE-5542.D2217.6.patch, HBASE-5542.D2217.7.patch mutateRowsWithLocks() does atomic mutations on multiple rows. processRow() does atomic read-modify-writes on a single row. It will be useful to generalize both and have a processRowsWithLocks() that does atomic read-modify-writes on multiple rows. This also helps reduce some redundancy in the codes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5542) Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()
[ https://issues.apache.org/jira/browse/HBASE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227728#comment-13227728 ] Phabricator commented on HBASE-5542: sc has commented on the revision HBASE-5542 [jira] Unify HRegion.mutateRowsWithLocks() and HRegion.processRow(). @Ted: I will also make the time bound default value to be large. REVISION DETAIL https://reviews.facebook.net/D2217 Unify HRegion.mutateRowsWithLocks() and HRegion.processRow() Key: HBASE-5542 URL: https://issues.apache.org/jira/browse/HBASE-5542 Project: HBase Issue Type: Improvement Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.96.0 Attachments: HBASE-5542.D2217.1.patch, HBASE-5542.D2217.2.patch, HBASE-5542.D2217.3.patch, HBASE-5542.D2217.4.patch, HBASE-5542.D2217.5.patch, HBASE-5542.D2217.6.patch, HBASE-5542.D2217.7.patch mutateRowsWithLocks() does atomic mutations on multiple rows. processRow() does atomic read-modify-writes on a single row. It will be useful to generalize both and have a processRowsWithLocks() that does atomic read-modify-writes on multiple rows. This also helps reduce some redundancy in the codes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HBASE-5565) Refactoring doMiniBatchPut()
[ https://issues.apache.org/jira/browse/HBASE-5565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Chen reassigned HBASE-5565: - Assignee: Scott Chen (was: Lars Hofhansl) Refactoring doMiniBatchPut() Key: HBASE-5565 URL: https://issues.apache.org/jira/browse/HBASE-5565 Project: HBase Issue Type: Improvement Reporter: Scott Chen Assignee: Scott Chen HBASE-5542 introduce a RowProcessor interface that can handle multi-row transaction. It will make the code cleaner if we can make doMiniBatchPut() to be covered by RowProcessor. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5565) Refactoring doMiniBatchPut()
[ https://issues.apache.org/jira/browse/HBASE-5565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Scott Chen updated HBASE-5565: -- Assignee: Lars Hofhansl (was: Scott Chen) Refactoring doMiniBatchPut() Key: HBASE-5565 URL: https://issues.apache.org/jira/browse/HBASE-5565 Project: HBase Issue Type: Improvement Reporter: Scott Chen Assignee: Lars Hofhansl HBASE-5542 introduce a RowProcessor interface that can handle multi-row transaction. It will make the code cleaner if we can make doMiniBatchPut() to be covered by RowProcessor. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227737#comment-13227737 ] Mikhail Bautin commented on HBASE-5074: --- @Lars: what I committed was based on D1521.14.patch, but it will not be exactly the same patch, because I used arc patch to apply the patch from Differential, fixed some minor indentation problem, and committed using the git-svn bridge. I also re-ran all the unit tests before the commit. Sorry for a delay in replying. support checksums in HBase block cache -- Key: HBASE-5074 URL: https://issues.apache.org/jira/browse/HBASE-5074 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.94.0, 0.96.0 Attachments: 5074-0.94.txt, D1521.1.patch, D1521.1.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.11.patch, D1521.11.patch, D1521.12.patch, D1521.12.patch, D1521.13.patch, D1521.13.patch, D1521.14.patch, D1521.14.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch, D1521.9.patch, D1521.9.patch The current implementation of HDFS stores the data in one block file and the metadata(checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage-hardware offers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
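The two-iops problem in the issue description comes from data and checksum living in separate files. One way to avoid the second read is to store the checksum next to the data it covers, so a single read returns both. The following is only a minimal sketch of that idea using CRC32 from the JDK; the class name, the 4-byte trailer layout, and the helper names are assumptions for illustration and do not describe the actual HFile block format or the committed patch.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.CRC32;

public class InlineChecksumDemo {

    private static final int CHECKSUM_BYTES = 4; // assumed trailer size for this sketch

    /** Returns payload followed by a 4-byte CRC32 of the payload. */
    static byte[] encode(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        ByteBuffer buf = ByteBuffer.allocate(payload.length + CHECKSUM_BYTES);
        buf.put(payload);
        buf.putInt((int) crc.getValue());
        return buf.array();
    }

    /** Verifies the trailing CRC32 and returns the payload, or throws on mismatch. */
    static byte[] decode(byte[] block) {
        if (block.length < CHECKSUM_BYTES) {
            throw new IllegalArgumentException("block too short to hold a checksum");
        }
        int payloadLength = block.length - CHECKSUM_BYTES;
        CRC32 crc = new CRC32();
        crc.update(block, 0, payloadLength);
        int stored = ByteBuffer.wrap(block, payloadLength, CHECKSUM_BYTES).getInt();
        if (stored != (int) crc.getValue()) {
            throw new IllegalStateException("checksum mismatch");
        }
        return Arrays.copyOf(block, payloadLength);
    }

    public static void main(String[] args) {
        byte[] block = encode(new byte[] {1, 2, 3, 4, 5});
        System.out.println("payload bytes after verify: " + decode(block).length);
    }
}
```

Because the block is verified by whoever reads it, a reader built this way needs only the one buffer it already fetched.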
[jira] [Commented] (HBASE-5564) Bulkload is discarding duplicate records
[ https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227740#comment-13227740 ] stack commented on HBASE-5564: -- A TreeSet is what's going to be used once the edits make it into the server, so losing them in the reducer is probably optimal? Jesse's ts++, or ts--, could be an option? Bulkload is discarding duplicate records Key: HBASE-5564 URL: https://issues.apache.org/jira/browse/HBASE-5564 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0 Environment: HBase 0.92 Reporter: Laxman Assignee: Laxman Labels: bulkloader Duplicate records are discarded when they exist in the same input file, and more specifically when they exist in the same split. Duplicate records are considered if they come from different splits. Version under test: HBase 0.92
[jira] [Commented] (HBASE-5564) Bulkload is discarding duplicate records
[ https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227742#comment-13227742 ] Todd Lipcon commented on HBASE-5564: I think it's a feature, not a bug, that the timestamps are all identical. The whole point is that, in a bulk-load-only workflow, you can identify each bulk load exactly, and correlate it to the MR job that inserted it. If you want to use custom timestamps, you should specify a timestamp column in your data, or write your own MR job (ImportTsv is just an example which is useful for some cases, but for anything advanced I would expect users to write their own code) Bulkload is discarding duplicate records Key: HBASE-5564 URL: https://issues.apache.org/jira/browse/HBASE-5564 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0 Environment: HBase 0.92 Reporter: Laxman Assignee: Laxman Labels: bulkloader Duplicate records are getting discarded when duplicate records exist in the same input file, and more specifically if they exist in the same split. Duplicate records are considered if the records are from different splits. Version under test: HBase 0.92
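The interaction discussed in HBASE-5564 (a sorted set combined with identical timestamps) can be sketched in plain Java. This is only an illustration under stated assumptions: Cell, DuplicateCells, and BY_COORDINATES are hypothetical stand-ins rather than HBase classes, and the comparator is simplified to row, column, and timestamp. A TreeSet treats two elements as the same when its comparator returns 0, so a second cell with the same coordinates is silently dropped even if its value differs; bumping the timestamp, as suggested above, keeps both.

```java
import java.util.Comparator;
import java.util.TreeSet;

// Hypothetical stand-in for a key/value cell; not the HBase KeyValue class.
final class Cell {
    final String row;
    final String column;
    final long timestamp;
    final String value;

    Cell(String row, String column, long timestamp, String value) {
        this.row = row;
        this.column = column;
        this.timestamp = timestamp;
        this.value = value;
    }
}

final class DuplicateCells {
    // Assumed ordering: coordinates only, the value is not compared.
    static final Comparator<Cell> BY_COORDINATES = Comparator
            .comparing((Cell c) -> c.row)
            .thenComparing(c -> c.column)
            .thenComparingLong(c -> c.timestamp);

    // Adds two cells for the same row and column and reports how many survive.
    static int survivors(long firstTimestamp, long secondTimestamp) {
        TreeSet<Cell> sorted = new TreeSet<>(BY_COORDINATES);
        sorted.add(new Cell("row1", "cf:q", firstTimestamp, "v1"));
        // add() returns false and keeps the first cell when the comparator says "equal".
        sorted.add(new Cell("row1", "cf:q", secondTimestamp, "v2"));
        return sorted.size();
    }
}
```

Calling DuplicateCells.survivors with two equal timestamps leaves one cell in the set, while distinct timestamps leave two.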
[jira] [Commented] (HBASE-5347) GC free memory management in Level-1 Block Cache
[ https://issues.apache.org/jira/browse/HBASE-5347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227744#comment-13227744 ] stack commented on HBASE-5347: -- bq. We are not actively pursuing this approach anymore due to non-reproducibility of GC issues. That sounds interesting. Was it that there was no discernible difference seen in GC managing the allocations ourselves? Would love to hear more if there are lessons to be had Mikhail (and Prakash). Good on you lads. GC free memory management in Level-1 Block Cache Key: HBASE-5347 URL: https://issues.apache.org/jira/browse/HBASE-5347 Project: HBase Issue Type: Improvement Reporter: Prakash Khemani Assignee: Prakash Khemani Attachments: D1635.5.patch On eviction of a block from the block-cache, instead of waiting for the garbage collector to reuse its memory, reuse the block right away. This will require us to keep reference counts on the HFile blocks. Once we have the reference counts in place we can do our own simple blocks-out-of-slab allocation for the block-cache. This will help us with * reducing gc pressure, especially in the old generation * making it possible to have non-java-heap memory backing the HFile blocks
[jira] [Commented] (HBASE-5563) HRegionInfo#compareTo add the comparison of regionId
[ https://issues.apache.org/jira/browse/HBASE-5563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227748#comment-13227748 ] stack commented on HBASE-5563: -- Both regions were online? Does this patch make the newest made region -- the one with the larger regionid -- sort AFTER the region with the lesser regionid? I think it does but am not sure. +1 on this patch... HRegionInfo#compareTo add the comparison of regionId Key: HBASE-5563 URL: https://issues.apache.org/jira/browse/HBASE-5563 Project: HBase Issue Type: Bug Reporter: chunhui shen Assignee: chunhui shen Attachments: HBASE-5563.patch In the one region multi assigned case, we could find that two regions have the same table name, same startKey, same endKey, and different regionId, so these two regions are same in TreeMap but different in HashMap. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
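The TreeMap versus HashMap mismatch described in HBASE-5563 can be reproduced with a small sketch. RegionKey and RegionOrdering below are hypothetical stand-ins, not the real HRegionInfo, and the sketch assumes that equals() and hashCode() already include the region id while the old ordering does not. Appending the region id to the comparison makes the two views agree, and the region with the larger id then sorts after the one with the lesser id.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

// Hypothetical stand-in for HRegionInfo; not the HBase class.
final class RegionKey {
    final String table;
    final String startKey;
    final String endKey;
    final long regionId;

    RegionKey(String table, String startKey, String endKey, long regionId) {
        this.table = table;
        this.startKey = startKey;
        this.endKey = endKey;
        this.regionId = regionId;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof RegionKey)) {
            return false;
        }
        RegionKey other = (RegionKey) o;
        return table.equals(other.table) && startKey.equals(other.startKey)
                && endKey.equals(other.endKey) && regionId == other.regionId;
    }

    @Override
    public int hashCode() {
        return Objects.hash(table, startKey, endKey, regionId);
    }
}

final class RegionOrdering {
    // Assumed old behaviour: regionId is ignored when ordering.
    static final Comparator<RegionKey> WITHOUT_ID = Comparator
            .comparing((RegionKey r) -> r.table)
            .thenComparing(r -> r.startKey)
            .thenComparing(r -> r.endKey);

    // Sketch of the proposed behaviour: regionId breaks ties, larger id sorts later.
    static final Comparator<RegionKey> WITH_ID =
            WITHOUT_ID.thenComparingLong(r -> r.regionId);

    // Two regions that differ only in regionId (the multi-assigned case).
    private static int fill(Map<RegionKey, String> map) {
        map.put(new RegionKey("t1", "a", "m", 100L), "server-1");
        map.put(new RegionKey("t1", "a", "m", 200L), "server-2");
        return map.size();
    }

    static int treeMapSize(Comparator<RegionKey> order) {
        return fill(new TreeMap<RegionKey, String>(order));
    }

    static int hashMapSize() {
        return fill(new HashMap<RegionKey, String>());
    }
}
```

With WITHOUT_ID the TreeMap holds a single entry while the HashMap holds two; with WITH_ID both hold two.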
[jira] [Commented] (HBASE-5563) HRegionInfo#compareTo add the comparison of regionId
[ https://issues.apache.org/jira/browse/HBASE-5563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227749#comment-13227749 ] stack commented on HBASE-5563: -- Although, thats a bunch of failures. Want to retry your patch against hadoopqa again Chunhui? Or try the build locally. Do you see the same amount of failures? HRegionInfo#compareTo add the comparison of regionId Key: HBASE-5563 URL: https://issues.apache.org/jira/browse/HBASE-5563 Project: HBase Issue Type: Bug Reporter: chunhui shen Assignee: chunhui shen Attachments: HBASE-5563.patch In the one region multi assigned case, we could find that two regions have the same table name, same startKey, same endKey, and different regionId, so these two regions are same in TreeMap but different in HashMap. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5520) Support reseek() at RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227765#comment-13227765 ] stack commented on HBASE-5520: -- We are reseeking to the start of the passed 'row'? Is that what we want? I thought we were trying to go to start of next row. If reseeking the start of passed in 'row', then this patch looks fine. I defer to Lars's opinion though. I don't know this area of the code well. Support reseek() at RegionScanner - Key: HBASE-5520 URL: https://issues.apache.org/jira/browse/HBASE-5520 Project: HBase Issue Type: Improvement Components: regionserver Affects Versions: 0.92.0 Reporter: Anoop Sam John Assignee: ramkrishna.s.vasudevan Fix For: 0.96.0 Attachments: HBASE-5520_1.patch, HBASE-5520_2.patch, HBASE-5520_3.patch reseek() is not supported currently at the RegionScanner level. We can support the same. This is created following the discussion under HBASE-2038 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5542) Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()
[ https://issues.apache.org/jira/browse/HBASE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-5542: --- Attachment: HBASE-5542.D2217.8.patch sc updated the revision HBASE-5542 [jira] Unify HRegion.mutateRowsWithLocks() and HRegion.processRow(). Reviewers: tedyu, lhofhansl, JIRA Make the default timeout infinite (no extra thread in this case) REVISION DETAIL https://reviews.facebook.net/D2217 AFFECTED FILES src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRowProcessorEndpoint.java src/main/java/org/apache/hadoop/hbase/coprocessor/RowProcessor.java src/main/java/org/apache/hadoop/hbase/coprocessor/RowProcessorProtocol.java src/main/java/org/apache/hadoop/hbase/regionserver/BaseRowProcessor.java src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java src/main/java/org/apache/hadoop/hbase/regionserver/MultiRowMutationProcessor.java src/main/java/org/apache/hadoop/hbase/regionserver/RowProcessor.java src/test/java/org/apache/hadoop/hbase/coprocessor/TestProcessRowEndpoint.java src/test/java/org/apache/hadoop/hbase/coprocessor/TestRowProcessorEndpoint.java Unify HRegion.mutateRowsWithLocks() and HRegion.processRow() Key: HBASE-5542 URL: https://issues.apache.org/jira/browse/HBASE-5542 Project: HBase Issue Type: Improvement Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.96.0 Attachments: HBASE-5542.D2217.1.patch, HBASE-5542.D2217.2.patch, HBASE-5542.D2217.3.patch, HBASE-5542.D2217.4.patch, HBASE-5542.D2217.5.patch, HBASE-5542.D2217.6.patch, HBASE-5542.D2217.7.patch, HBASE-5542.D2217.8.patch mutateRowsWithLocks() does atomic mutations on multiple rows. processRow() does atomic read-modify-writes on a single row. It will be useful to generalize both and have a processRowsWithLocks() that does atomic read-modify-writes on multiple rows. This also helps reduce some redundancy in the code.
[jira] [Commented] (HBASE-5560) Avoid RegionServer GC caused by timed-out calls
[ https://issues.apache.org/jira/browse/HBASE-5560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227803#comment-13227803 ] stack commented on HBASE-5560: -- @Dhruba In trunk:
{code}
src/main/java/org/apache/hadoop/hbase/HConstants.java: public static int DEFAULT_HBASE_RPC_TIMEOUT = 6;
...
In HConnectionManager.
this.rpcTimeout = conf.getInt(
    HConstants.HBASE_RPC_TIMEOUT_KEY,
    HConstants.DEFAULT_HBASE_RPC_TIMEOUT);
{code}
Avoid RegionServer GC caused by timed-out calls --- Key: HBASE-5560 URL: https://issues.apache.org/jira/browse/HBASE-5560 Project: HBase Issue Type: Improvement Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.94.0, 0.96.0 Attachments: D2241.1.patch, D2241.2.patch, D2241.3.patch The HBaseRpcServer queues up rpc responses if the socket connection to the client is not yet ready to receive data. Calls are queued here until a 15 minute timeout occurs. I am able to generate a full GC when I artificially make a client read rpc-responses very slowly. This jira is to make this 15 minute time configurable. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5542) Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()
[ https://issues.apache.org/jira/browse/HBASE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227816#comment-13227816 ] Phabricator commented on HBASE-5542: tedyu has commented on the revision HBASE-5542 [jira] Unify HRegion.mutateRowsWithLocks() and HRegion.processRow(). INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/regionserver/RowProcessor.java:93 Should read 'switch off' src/main/java/org/apache/hadoop/hbase/regionserver/BaseRowProcessor.java:51 If positive timeout is specified, new Thread would be spawned for both cases. Is that desirable ? REVISION DETAIL https://reviews.facebook.net/D2217 Unify HRegion.mutateRowsWithLocks() and HRegion.processRow() Key: HBASE-5542 URL: https://issues.apache.org/jira/browse/HBASE-5542 Project: HBase Issue Type: Improvement Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.96.0 Attachments: HBASE-5542.D2217.1.patch, HBASE-5542.D2217.2.patch, HBASE-5542.D2217.3.patch, HBASE-5542.D2217.4.patch, HBASE-5542.D2217.5.patch, HBASE-5542.D2217.6.patch, HBASE-5542.D2217.7.patch, HBASE-5542.D2217.8.patch mutateRowsWithLocks() does atomic mutations on multiple rows. processRow() does atomic read-modify-writes on a single row. It will be useful to generalize both and have a processRowsWithLocks() that does atomic read-modify-writes on multiple rows. This also helps reduce some redundancy in the codes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5542) Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()
[ https://issues.apache.org/jira/browse/HBASE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227817#comment-13227817 ] Hadoop QA commented on HBASE-5542: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12518068/HBASE-5542.D2217.8.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -120 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 158 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.coprocessor.TestRowProcessorEndpoint Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1166//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1166//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1166//console This message is automatically generated. Unify HRegion.mutateRowsWithLocks() and HRegion.processRow() Key: HBASE-5542 URL: https://issues.apache.org/jira/browse/HBASE-5542 Project: HBase Issue Type: Improvement Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.96.0 Attachments: HBASE-5542.D2217.1.patch, HBASE-5542.D2217.2.patch, HBASE-5542.D2217.3.patch, HBASE-5542.D2217.4.patch, HBASE-5542.D2217.5.patch, HBASE-5542.D2217.6.patch, HBASE-5542.D2217.7.patch, HBASE-5542.D2217.8.patch mutateRowsWithLocks() does atomic mutations on multiple rows. processRow() does atomic read-modify-writes on a single row. 
It will be useful to generalize both and have a processRowsWithLocks() that does atomic read-modify-writes on multiple rows. This also helps reduce some redundancy in the code.
[jira] [Updated] (HBASE-5562) test-patch.sh reports a javadoc warning when there are no new javadoc warnings
[ https://issues.apache.org/jira/browse/HBASE-5562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5562: - Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk and 0.94. Thanks for the patch David. test-patch.sh reports a javadoc warning when there are no new javadoc warnings -- Key: HBASE-5562 URL: https://issues.apache.org/jira/browse/HBASE-5562 Project: HBase Issue Type: Bug Components: scripts Affects Versions: 0.94.0, 0.96.0 Reporter: David S. Wang Assignee: David S. Wang Priority: Trivial Fix For: 0.94.0, 0.96.0 Attachments: test-patch.patch test-patch.sh will report new javadoc warnings when there are not any. e.g. -1 javadoc. The javadoc tool appears to have generated -123 warning messages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5542) Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()
[ https://issues.apache.org/jira/browse/HBASE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227857#comment-13227857 ] Phabricator commented on HBASE-5542: sc has commented on the revision HBASE-5542 [jira] Unify HRegion.mutateRowsWithLocks() and HRegion.processRow(). INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/regionserver/BaseRowProcessor.java:53 Ted: there is a return statement here. I will add a comment saying short circuit for the negative timeout case. src/main/java/org/apache/hadoop/hbase/regionserver/RowProcessor.java:93 Good catch. Thanks! REVISION DETAIL https://reviews.facebook.net/D2217 Unify HRegion.mutateRowsWithLocks() and HRegion.processRow() Key: HBASE-5542 URL: https://issues.apache.org/jira/browse/HBASE-5542 Project: HBase Issue Type: Improvement Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.96.0 Attachments: HBASE-5542.D2217.1.patch, HBASE-5542.D2217.2.patch, HBASE-5542.D2217.3.patch, HBASE-5542.D2217.4.patch, HBASE-5542.D2217.5.patch, HBASE-5542.D2217.6.patch, HBASE-5542.D2217.7.patch, HBASE-5542.D2217.8.patch mutateRowsWithLocks() does atomic mutations on multiple rows. processRow() does atomic read-modify-writes on a single row. It will be useful to generalize both and have a processRowsWithLocks() that does atomic read-modify-writes on multiple rows. This also helps reduce some redundancy in the code.
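The short circuit discussed in this review thread (no extra thread unless a timeout is requested) might look roughly like the following. This is a sketch, not the code in BaseRowProcessor: TimeoutSketch and call() are hypothetical names, and treating a negative timeout as "no timeout" is an assumption drawn from the comments above.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

final class TimeoutSketch {
    // Runs the task, bounding it by timeoutMs only when timeoutMs is non-negative.
    static <T> T call(Callable<T> task, long timeoutMs) {
        try {
            if (timeoutMs < 0) {
                // Short circuit: assumed "no timeout" case, run on the caller's thread.
                return task.call();
            }
            // A timeout was requested, so a worker thread is needed to bound the wait.
            ExecutorService executor = Executors.newSingleThreadExecutor();
            try {
                Future<T> future = executor.submit(task);
                return future.get(timeoutMs, TimeUnit.MILLISECONDS);
            } finally {
                executor.shutdownNow();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the task", e);
        } catch (Exception e) {
            throw new IllegalStateException("task failed or timed out", e);
        }
    }
}
```

The trade-off raised in the review is visible here: any non-negative timeout costs one extra thread per call, which is why an infinite default avoids that overhead.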
[jira] [Updated] (HBASE-5566) [89-fb] Region server can get stuck getMaster on master failover
[ https://issues.apache.org/jira/browse/HBASE-5566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Bautin updated HBASE-5566: -- Reporter: Prakash Khemani (was: Mikhail Bautin) [89-fb] Region server can get stuck getMaster on master failover Key: HBASE-5566 URL: https://issues.apache.org/jira/browse/HBASE-5566 Project: HBase Issue Type: Bug Affects Versions: 0.89-fb Reporter: Prakash Khemani Assignee: Mikhail Bautin Reported by Prakash. We have a retry loop in HRegionServer.getMaster where we do not read the location of the master from ZK, so a region server can get stuck there on master failover. We need to add a unit test to reliably catch this, and fix the bug. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-5566) [89-fb] Region server can get stuck getMaster on master failover
[89-fb] Region server can get stuck getMaster on master failover Key: HBASE-5566 URL: https://issues.apache.org/jira/browse/HBASE-5566 Project: HBase Issue Type: Bug Affects Versions: 0.89-fb Reporter: Mikhail Bautin Assignee: Mikhail Bautin Reported by Prakash. We have a retry loop in HRegionServer.getMaster where we do not read the location of the master from ZK, so a region server can get stuck there on master failover. We need to add a unit test to reliably catch this, and fix the bug. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5566) [89-fb] Region server can get stuck getMaster on master failover
[ https://issues.apache.org/jira/browse/HBASE-5566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Bautin updated HBASE-5566: -- Description: This is specific to the 89-fb master. We have a retry loop in HRegionServer.getMaster where we do not read the location of the master from ZK, so a region server can get stuck there on master failover. We need to add a unit test to reliably catch this, and fix the bug. was: Reported by Prakash. We have a retry loop in HRegionServer.getMaster where we do not read the location of the master from ZK, so a region server can get stuck there on master failover. We need to add a unit test to reliably catch this, and fix the bug. [89-fb] Region server can get stuck getMaster on master failover Key: HBASE-5566 URL: https://issues.apache.org/jira/browse/HBASE-5566 Project: HBase Issue Type: Bug Affects Versions: 0.89-fb Reporter: Prakash Khemani Assignee: Mikhail Bautin This is specific to the 89-fb master. We have a retry loop in HRegionServer.getMaster where we do not read the location of the master from ZK, so a region server can get stuck there on master failover. We need to add a unit test to reliably catch this, and fix the bug. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5206) Port HBASE-5155 to 0.92 and TRUNK
[ https://issues.apache.org/jira/browse/HBASE-5206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-5206: -- Attachment: 5206_trunk-v2.patch With patch v2, TestDrainingServer passes. Port HBASE-5155 to 0.92 and TRUNK - Key: HBASE-5206 URL: https://issues.apache.org/jira/browse/HBASE-5206 Project: HBase Issue Type: Bug Affects Versions: 0.92.2, 0.96.0 Reporter: Zhihong Yu Attachments: 5206_92_1.patch, 5206_92_latest_1.patch, 5206_trunk-v2.patch, 5206_trunk_1.patch, 5206_trunk_latest_1.patch This JIRA ports HBASE-5155 (ServerShutDownHandler And Disable/Delete should not happen parallely leading to recreation of regions that were deleted) to 0.92 and TRUNK -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4608) HLog Compression
[ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227869#comment-13227869 ] stack commented on HBASE-4608: -- Is HLog versioned? If not, perhaps instead of a HConstants.WAL_COMPRESSION_VER, add a WAL_VERSION metadata field. Then have another for compression type (NONE or this)? bq. For TestLRUDictionary, please outline the combinations that should be added. Does it not look bare to you? I'd think that we'd try a paragraph of text going in and out... perhaps test multiple dictionaries in the one file? HLog Compression Key: HBASE-4608 URL: https://issues.apache.org/jira/browse/HBASE-4608 Project: HBase Issue Type: New Feature Reporter: Li Pi Assignee: Li Pi Fix For: 0.94.0 Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt, 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt The current bottleneck to HBase write speed is replicating the WAL appends across different datanodes. We can speed up this process by compressing the HLog. Current plan involves using a dictionary to compress table name, region id, cf name, and possibly other bits of repeated data. Also, HLog format may be changed in other ways to produce a smaller HLog. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
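As a rough illustration of the dictionary idea in HBASE-4608, the sketch below maps repeated strings (table name, column family name, and so on) to short indices and evicts the least recently used entry when full. It is not the LRUDictionary from the patch: LruDictionarySketch, findOrAdd, add, and get are hypothetical names, entries are Strings rather than byte arrays, and null entries are not handled. Writer and reader stay in step only if they apply the same sequence of operations.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

final class LruDictionarySketch {
    static final short NOT_IN_DICTIONARY = -1;

    private final String[] entryByIndex;
    // Access-ordered map: iteration starts at the least recently used entry.
    private final LinkedHashMap<String, Short> indexByEntry =
            new LinkedHashMap<>(16, 0.75f, true);

    // capacity is assumed to be between 1 and Short.MAX_VALUE.
    LruDictionarySketch(int capacity) {
        entryByIndex = new String[capacity];
    }

    // Writer side: returns the index of a known entry, or adds it and
    // returns NOT_IN_DICTIONARY so the caller writes the entry in full.
    short findOrAdd(String entry) {
        Short index = indexByEntry.get(entry); // get() also refreshes recency
        if (index != null) {
            return index;
        }
        add(entry);
        return NOT_IN_DICTIONARY;
    }

    // Reader side: called when a full entry was read, mirroring the writer's insert.
    void add(String entry) {
        short slot;
        if (indexByEntry.size() < entryByIndex.length) {
            slot = (short) indexByEntry.size();
        } else {
            // Dictionary is full: reuse the slot of the least recently used entry.
            Iterator<Map.Entry<String, Short>> eldest = indexByEntry.entrySet().iterator();
            slot = eldest.next().getValue();
            eldest.remove();
        }
        entryByIndex[slot] = entry;
        indexByEntry.put(entry, slot);
    }

    // Reader side: resolves an index and refreshes recency like a writer hit does.
    String get(short index) {
        String entry = entryByIndex[index];
        indexByEntry.get(entry);
        return entry;
    }
}
```

A test along the lines suggested in the comment above would feed many entries (long, empty, similar) through a writer and a reader instance and check that every index the writer emits resolves to the same entry on the reader.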
[jira] [Updated] (HBASE-5566) [89-fb] Region server can get stuck in getMaster on master failover
[ https://issues.apache.org/jira/browse/HBASE-5566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Bautin updated HBASE-5566: -- Summary: [89-fb] Region server can get stuck in getMaster on master failover (was: [89-fb] Region server can get stuck getMaster on master failover) [89-fb] Region server can get stuck in getMaster on master failover --- Key: HBASE-5566 URL: https://issues.apache.org/jira/browse/HBASE-5566 Project: HBase Issue Type: Bug Affects Versions: 0.89-fb Reporter: Prakash Khemani Assignee: Mikhail Bautin This is specific to the 89-fb master. We have a retry loop in HRegionServer.getMaster where we do not read the location of the master from ZK, so a region server can get stuck there on master failover. We need to add a unit test to reliably catch this, and fix the bug. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227871#comment-13227871 ] Lars Hofhansl commented on HBASE-5074: -- Thanks Mikhail just making sure :) support checksums in HBase block cache -- Key: HBASE-5074 URL: https://issues.apache.org/jira/browse/HBASE-5074 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.94.0, 0.96.0 Attachments: 5074-0.94.txt, D1521.1.patch, D1521.1.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.11.patch, D1521.11.patch, D1521.12.patch, D1521.12.patch, D1521.13.patch, D1521.13.patch, D1521.14.patch, D1521.14.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch, D1521.9.patch, D1521.9.patch The current implementation of HDFS stores the data in one block file and the metadata(checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage-hardware offers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5542) Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()
[ https://issues.apache.org/jira/browse/HBASE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227875#comment-13227875 ] Phabricator commented on HBASE-5542: tedyu has commented on the revision HBASE-5542 [jira] Unify HRegion.mutateRowsWithLocks() and HRegion.processRow(). INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/regionserver/BaseRowProcessor.java:53 If hbase.hregion.row.processor.timeout carries positive value, new Thread would be spawned for mutateRowsWithLocks(). Does this satisfy Lars' comment @ 12/Mar/12 16:33 ? REVISION DETAIL https://reviews.facebook.net/D2217 Unify HRegion.mutateRowsWithLocks() and HRegion.processRow() Key: HBASE-5542 URL: https://issues.apache.org/jira/browse/HBASE-5542 Project: HBase Issue Type: Improvement Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.96.0 Attachments: HBASE-5542.D2217.1.patch, HBASE-5542.D2217.2.patch, HBASE-5542.D2217.3.patch, HBASE-5542.D2217.4.patch, HBASE-5542.D2217.5.patch, HBASE-5542.D2217.6.patch, HBASE-5542.D2217.7.patch, HBASE-5542.D2217.8.patch mutateRowsWithLocks() does atomic mutations on multiple rows. processRow() does atomic read-modify-writes on a single row. It will be useful to generalize both and have a processRowsWithLocks() that does atomic read-modify-writes on multiple rows. This also helps reduce some redundancy in the codes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5542) Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()
[ https://issues.apache.org/jira/browse/HBASE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-5542: --- Attachment: HBASE-5542.D2217.9.patch sc updated the revision HBASE-5542 [jira] Unify HRegion.mutateRowsWithLocks() and HRegion.processRow(). Reviewers: tedyu, lhofhansl, JIRA Addressed Ted's review comments, Thanks! REVISION DETAIL https://reviews.facebook.net/D2217 AFFECTED FILES src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRowProcessorEndpoint.java src/main/java/org/apache/hadoop/hbase/coprocessor/RowProcessor.java src/main/java/org/apache/hadoop/hbase/coprocessor/RowProcessorProtocol.java src/main/java/org/apache/hadoop/hbase/regionserver/BaseRowProcessor.java src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java src/main/java/org/apache/hadoop/hbase/regionserver/MultiRowMutationProcessor.java src/main/java/org/apache/hadoop/hbase/regionserver/RowProcessor.java src/test/java/org/apache/hadoop/hbase/coprocessor/TestProcessRowEndpoint.java src/test/java/org/apache/hadoop/hbase/coprocessor/TestRowProcessorEndpoint.java Unify HRegion.mutateRowsWithLocks() and HRegion.processRow() Key: HBASE-5542 URL: https://issues.apache.org/jira/browse/HBASE-5542 Project: HBase Issue Type: Improvement Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.96.0 Attachments: HBASE-5542.D2217.1.patch, HBASE-5542.D2217.2.patch, HBASE-5542.D2217.3.patch, HBASE-5542.D2217.4.patch, HBASE-5542.D2217.5.patch, HBASE-5542.D2217.6.patch, HBASE-5542.D2217.7.patch, HBASE-5542.D2217.8.patch, HBASE-5542.D2217.9.patch mutateRowsWithLocks() does atomic mutations on multiple rows. processRow() does atomic read-modify-writes on a single row. It will be useful to generalize both and have a processRowsWithLocks() that does atomic read-modify-writes on multiple rows. This also helps reduce some redundancy in the codes. -- This message is automatically generated by JIRA. 
[jira] [Commented] (HBASE-4608) HLog Compression
[ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227892#comment-13227892 ] Ted Yu commented on HBASE-4608: --- Introducing WAL_VERSION would imply that we may change HLog aspect other than compression in the future. Is there plan for the above ? Having another compression type is nice but requires making HLogKey persistence pluggable. I think it would be better to introduce one meta entry instead of two. HLog Compression Key: HBASE-4608 URL: https://issues.apache.org/jira/browse/HBASE-4608 Project: HBase Issue Type: New Feature Reporter: Li Pi Assignee: Li Pi Fix For: 0.94.0 Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt, 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt The current bottleneck to HBase write speed is replicating the WAL appends across different datanodes. We can speed up this process by compressing the HLog. Current plan involves using a dictionary to compress table name, region id, cf name, and possibly other bits of repeated data. Also, HLog format may be changed in other ways to produce a smaller HLog. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5198) PerformanceEvaluation --presplit option should not recreate TestTable if it exists
[ https://issues.apache.org/jira/browse/HBASE-5198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujee Maniyam updated HBASE-5198: - Attachment: (was: PerformanceEvaluation_HBASE_5198.patch) PerformanceEvaluation --presplit option should not recreate TestTable if it exists -- Key: HBASE-5198 URL: https://issues.apache.org/jira/browse/HBASE-5198 Project: HBase Issue Type: Improvement Components: util Reporter: Sujee Maniyam Assignee: Sujee Maniyam Priority: Minor Labels: benchmark Attachments: PerformanceEvaluation_HBASE_5198_2.patch The --presplit option was added in https://issues.apache.org/jira/browse/HBASE-4440. It drops and re-creates TestTable if the table exists. This behavior is inconsistent with the previous behavior. If TestTable exists, it shouldn't be re-created.
[jira] [Commented] (HBASE-4608) HLog Compression
[ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227898#comment-13227898 ] stack commented on HBASE-4608: -- In TestLRUDictionary, we test a single entry in essence. We should try it w/ all kinds of rubbish... really long entries, empty entries, null entries, similar entries... a dictionary for 32k worth of stuff... as we'll do in the wild. So I'd think? A test for the new class KeyValueCompression would be good to have too. enableCompression is an odd name for this method. Should it be setCompressionContext since that is what it does (you pass null if no compression)... seems odd passing null to 'enableCompression'. Should the Compression class in wal package have more javadoc comments explaining the kind of compression it does? Otherwise, it looks like a generic compressor class when in fact it's a one-trick pony? Should this method, WALCompressionEnabled, be isWALCompressionEnabled? I like your idea of versioning the WAL. Patch is coming along nicely. Almost there. HLog Compression Key: HBASE-4608 URL: https://issues.apache.org/jira/browse/HBASE-4608 Project: HBase Issue Type: New Feature Reporter: Li Pi Assignee: Li Pi Fix For: 0.94.0 Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt, 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt The current bottleneck to HBase write speed is replicating the WAL appends across different datanodes. We can speed up this process by compressing the HLog. Current plan involves using a dictionary to compress table name, region id, cf name, and possibly other bits of repeated data. Also, HLog format may be changed in other ways to produce a smaller HLog.
[jira] [Updated] (HBASE-5198) PerformanceEvaluation --presplit option should not recreate TestTable if it exists
[ https://issues.apache.org/jira/browse/HBASE-5198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujee Maniyam updated HBASE-5198: - Attachment: PerformanceEvaluation_HBASE_5198_2.patch PerformanceEvaluation --presplit option should not recreate TestTable if it exists -- Key: HBASE-5198 URL: https://issues.apache.org/jira/browse/HBASE-5198 Project: HBase Issue Type: Improvement Components: util Reporter: Sujee Maniyam Assignee: Sujee Maniyam Priority: Minor Labels: benchmark Attachments: PerformanceEvaluation_HBASE_5198_2.patch The --presplit option was added in https://issues.apache.org/jira/browse/HBASE-4440. It drops and re-creates TestTable if the table exists. This behavior is inconsistent with the previous behavior. If TestTable exists, it shouldn't be re-created.
[jira] [Commented] (HBASE-5520) Support reseek() at RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227900#comment-13227900 ] Lars Hofhansl commented on HBASE-5520: -- Generally it would be very useful for coprocessors to have this ability! After looking at the discussion here (the restrictions imposed on what KVs one can pass: previous ones, row only, column family needs to be ignored, etc.) we just need to be careful how we add this. For this specific issue it would be nice to get some comparative performance numbers between this and filters. Maybe we're missing a level of abstraction...? Support reseek() at RegionScanner - Key: HBASE-5520 URL: https://issues.apache.org/jira/browse/HBASE-5520 Project: HBase Issue Type: Improvement Components: regionserver Affects Versions: 0.92.0 Reporter: Anoop Sam John Assignee: ramkrishna.s.vasudevan Fix For: 0.96.0 Attachments: HBASE-5520_1.patch, HBASE-5520_2.patch, HBASE-5520_3.patch reseek() is not supported currently at the RegionScanner level. We can support the same. This is created following the discussion under HBASE-2038.
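Editorial sketch for HBASE-5520: to make the reseek() idea concrete, here is a minimal forward-only scanner over sorted row keys. It is not the RegionScanner interface and the class name is hypothetical. One assumption is made purely for illustration: a reseek to a key at or before the current position leaves the scanner where it is, which is one simple way to respect the "previous ones" restriction mentioned in the comment; the actual patch may handle that case differently.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical forward-only scanner; not the HBase RegionScanner API.
class ReseekScannerSketch {
    private final List<String> sortedKeys;
    private int pos = 0;

    ReseekScannerSketch(List<String> sortedKeys) {
        this.sortedKeys = sortedKeys;
    }

    /** Returns the next key, or null when the scanner is exhausted. */
    String next() {
        return pos < sortedKeys.size() ? sortedKeys.get(pos++) : null;
    }

    /**
     * Skips forward to the first key >= target without returning the keys in between.
     * A target behind the current position is a no-op. Returns true if keys remain.
     */
    boolean reseek(String target) {
        while (pos < sortedKeys.size() && sortedKeys.get(pos).compareTo(target) < 0) {
            pos++;
        }
        return pos < sortedKeys.size();
    }

    public static void main(String[] args) {
        ReseekScannerSketch scanner = new ReseekScannerSketch(Arrays.asList("a", "c", "e", "g"));
        System.out.println(scanner.next());      // first key
        scanner.reseek("d");                     // jump past "c"
        System.out.println(scanner.next());      // first key at or after "d"
    }
}
```

A filter-based approach would instead visit every key and discard the unwanted ones, which is why comparative numbers between the two were requested.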
[jira] [Commented] (HBASE-4608) HLog Compression
[ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227901#comment-13227901 ] Ted Yu commented on HBASE-4608: --- bq. try a paragraph of text going in and out LRUDictionary deals with byte arrays: {code} public short findEntry(byte[] data, int offset, int length) { {code} In this regard, piping text into the dictionary is functionally the same as piping in the byte[] form of an integer. HLog Compression Key: HBASE-4608 URL: https://issues.apache.org/jira/browse/HBASE-4608 Project: HBase Issue Type: New Feature Reporter: Li Pi Assignee: Li Pi Fix For: 0.94.0 Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt, 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt The current bottleneck to HBase write speed is replicating the WAL appends across different datanodes. We can speed up this process by compressing the HLog. Current plan involves using a dictionary to compress table name, region id, cf name, and possibly other bits of repeated data. Also, HLog format may be changed in other ways to produce a smaller HLog.
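Editorial sketch for HBASE-4608: the thread discusses a dictionary that replaces repeated byte strings (table name, region id, cf name) with short indexes. The class below is a rough stand-in for that idea, not the LRUDictionary from the patch. Only the findEntry(byte[], int, int) signature comes from the comment above; the class name, the NOT_IN_DICTIONARY sentinel, the add-on-miss behavior, and the eviction details are assumptions made for illustration. Capacity is assumed to fit in a short.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a byte[] -> short LRU dictionary.
class LruDictionarySketch {
    static final short NOT_IN_DICTIONARY = -1;

    private final int capacity;
    private final byte[][] byIndex;
    // Access-ordered map: iteration starts at the least recently used entry.
    private final LinkedHashMap<ByteBuffer, Short> byValue = new LinkedHashMap<>(16, 0.75f, true);

    LruDictionarySketch(int capacity) {
        this.capacity = capacity;
        this.byIndex = new byte[capacity][];
    }

    /** Returns the index of the entry, or NOT_IN_DICTIONARY after adding it on a miss. */
    short findEntry(byte[] data, int offset, int length) {
        // ByteBuffer compares by content, so it works as a map key for byte ranges.
        ByteBuffer key = ByteBuffer.wrap(Arrays.copyOfRange(data, offset, offset + length));
        Short index = byValue.get(key); // get() also refreshes recency
        if (index != null) {
            return index;
        }
        addEntry(key);
        return NOT_IN_DICTIONARY;
    }

    /** Returns the bytes stored at the given index, as a reader would need when decompressing. */
    byte[] getEntry(short index) {
        return byIndex[index];
    }

    private void addEntry(ByteBuffer key) {
        short index;
        if (byValue.size() < capacity) {
            index = (short) byValue.size();
        } else {
            // Evict the least recently used entry and reuse its slot.
            Iterator<Map.Entry<ByteBuffer, Short>> it = byValue.entrySet().iterator();
            index = it.next().getValue();
            it.remove();
        }
        byIndex[index] = key.array();
        byValue.put(key, index);
    }
}
```

Because the key is just a byte range, text, integers, and empty arrays all go through the same path, which matches Ted's point; stack's request for varied tests (long, empty, similar entries, a full dictionary) targets the eviction and slot-reuse branch that a single-entry test never reaches.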
[jira] [Updated] (HBASE-5198) PerformanceEvaluation --presplit option should not recreate TestTable if it exists
[ https://issues.apache.org/jira/browse/HBASE-5198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujee Maniyam updated HBASE-5198: - Status: Patch Available (was: Open) Regenerated using the --no-prefix option for git. PerformanceEvaluation --presplit option should not recreate TestTable if it exists -- Key: HBASE-5198 URL: https://issues.apache.org/jira/browse/HBASE-5198 Project: HBase Issue Type: Improvement Components: util Reporter: Sujee Maniyam Assignee: Sujee Maniyam Priority: Minor Labels: benchmark Attachments: PerformanceEvaluation_HBASE_5198_2.patch The --presplit option was added in https://issues.apache.org/jira/browse/HBASE-4440. It drops and re-creates TestTable if the table exists. This behavior is inconsistent with the previous behavior. If TestTable exists, it shouldn't be re-created.
[jira] [Commented] (HBASE-4608) HLog Compression
[ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227908#comment-13227908 ] stack commented on HBASE-4608: -- It's the test of a single entry only, which is not really exercising much. bq. Introducing WAL_VERSION would imply that we may change HLog aspect other than compression in the future. Is there plan for the above ? I've not heard of any. Is that your argument for not adding a version? Because there has been no discussion of change up to now, we couldn't possibly need to change the format in the future? HLog Compression Key: HBASE-4608 URL: https://issues.apache.org/jira/browse/HBASE-4608 Project: HBase Issue Type: New Feature Reporter: Li Pi Assignee: Li Pi Fix For: 0.94.0 Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt, 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt The current bottleneck to HBase write speed is replicating the WAL appends across different datanodes. We can speed up this process by compressing the HLog. Current plan involves using a dictionary to compress table name, region id, cf name, and possibly other bits of repeated data. Also, HLog format may be changed in other ways to produce a smaller HLog.
[jira] [Commented] (HBASE-5206) Port HBASE-5155 to 0.92 and TRUNK
[ https://issues.apache.org/jira/browse/HBASE-5206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227915#comment-13227915 ] Hadoop QA commented on HBASE-5206: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12518079/5206_trunk-v2.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 15 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 159 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.TestZooKeeper Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1167//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1167//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1167//console This message is automatically generated. Port HBASE-5155 to 0.92 and TRUNK - Key: HBASE-5206 URL: https://issues.apache.org/jira/browse/HBASE-5206 Project: HBase Issue Type: Bug Affects Versions: 0.92.2, 0.96.0 Reporter: Zhihong Yu Attachments: 5206_92_1.patch, 5206_92_latest_1.patch, 5206_trunk-v2.patch, 5206_trunk_1.patch, 5206_trunk_latest_1.patch This JIRA ports HBASE-5155 (ServerShutDownHandler And Disable/Delete should not happen parallely leading to recreation of regions that were deleted) to 0.92 and TRUNK -- This message is automatically generated by JIRA. 
[jira] [Commented] (HBASE-5561) Create HFileSystemFactory
[ https://issues.apache.org/jira/browse/HBASE-5561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227920#comment-13227920 ] stack commented on HBASE-5561: -- +1 Create HFileSystemFactory - Key: HBASE-5561 URL: https://issues.apache.org/jira/browse/HBASE-5561 Project: HBase Issue Type: Improvement Components: master, regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur The HFileSystem object can be used to paper over differences in HDFS versions. Create it using a factory object. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5399: - Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available)
I tried it locally and saw some flakiness in the medium test runs... They don't seem to pass reliably anymore, but it's with or without this patch. I'm committing to trunk before this fat patch rots. We can come back to work on the flaky tests after it goes in. Thanks Nicolas for the nice patch. Applied to trunk.
Cut the link between the client and the zookeeper ensemble -- Key: HBASE-5399 URL: https://issues.apache.org/jira/browse/HBASE-5399 Project: HBase Issue Type: Improvement Components: client Affects Versions: 0.94.0 Environment: all Reporter: nkeywal Assignee: nkeywal Priority: Minor Fix For: 0.96.0 Attachments: 5399.v27.patch, 5399.v38.patch, 5399.v39.patch, 5399.v40.patch, 5399.v41.patch, 5399.v42.patch, 5399.v42.patch, 5399.v42.patch, 5399.v42.patch, 5399_inprogress.patch, 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 5399_inprogress.v9.patch, nochange.patch
The link is often considered an issue, for various reasons. One of them is that there is a limit on the number of connections that ZK can manage. Stack was suggesting as well to remove the link to master from HConnection. There are choices to be made considering the existing API (that we don't want to break). The first patches I will submit on hadoop-qa should not be committed: they are here to show the progress on the direction taken.
ZooKeeper is used for:
- public getter, to let the client do whatever he wants, and close ZooKeeper when closing the connection => we have to deprecate this but keep it.
- read get master address to create a master => now done with a temporary zookeeper connection
- read root location => now done with a temporary zookeeper connection, but questionable. Used in public function locateRegion. To be reworked.
- read cluster id => now done once with a temporary zookeeper connection.
- check if base node is available => now done once with a zookeeper connection given as a parameter
- isTableDisabled/isTableAvailable => public functions, now done with a temporary zookeeper connection. Called internally from HBaseAdmin and HTable.
- getCurrentNrHRS(): public function to get the number of region servers and create a pool of threads => now done with a temporary zookeeper connection
Master is used for:
- getMaster public getter, as for ZooKeeper => we have to deprecate this but keep it.
- isMasterRunning(): public function, used internally by HMerge and HBaseAdmin
- getHTableDescriptor*: public functions offering access to the master. => we could make them use a temporary master connection as well.
Main points are:
- hbase class for ZooKeeper; ZooKeeperWatcher is really designed for a strongly coupled architecture ;-). This can be changed, but requires a lot of modifications in these classes (likely adding a class in the middle of the hierarchy, something like that). Anyway, a non-connected client will always be noticeably slower, because it's a tcp connection, and establishing a tcp connection is slow.
- having a link between ZK and all the clients seems to make sense for some use cases. However, it won't scale if a TCP connection is required for every client.
- if we move the table descriptor part away from the client, we need to find a new place for it.
- we will have the same issue with HBaseAdmin (for both ZK and Master); maybe we can put a timeout on the connection. That would make the whole system less deterministic, however.
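Editorial sketch for HBASE-5399: several items in the description are "now done with a temporary zookeeper connection". The shape of that pattern, open a connection, read one value, close it immediately, can be sketched as below. This is not the HBase or ZooKeeper client API: TempConnectionSketch, readClusterId(), and the returned placeholder value are all hypothetical, and the open-connection counter exists only so the example can show that nothing stays open afterwards.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical short-lived connection; stands in for a real ZooKeeper session.
class TempConnectionSketch implements AutoCloseable {
    // Counts currently open connections so the effect of close() is observable.
    static final AtomicInteger OPEN = new AtomicInteger();

    TempConnectionSketch() {
        OPEN.incrementAndGet();
    }

    /** Placeholder read; a real client would fetch this from the ensemble. */
    String readClusterId() {
        return "cluster-id-placeholder";
    }

    @Override
    public void close() {
        OPEN.decrementAndGet();
    }

    /** Opens a connection, reads a single value, and closes the connection before returning. */
    static String fetchClusterIdOnce() {
        try (TempConnectionSketch conn = new TempConnectionSketch()) {
            return conn.readClusterId();
        }
    }

    public static void main(String[] args) {
        System.out.println(fetchClusterIdOnce());
        System.out.println("open connections: " + OPEN.get());
    }
}
```

The trade-off named in the description applies directly: each call pays the cost of establishing a connection, in exchange for not holding one per client.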
[jira] [Commented] (HBASE-4542) add filter info to slow query logging
[ https://issues.apache.org/jira/browse/HBASE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227921#comment-13227921 ] stack commented on HBASE-4542: -- +1 for 0.94. This is good stuff. add filter info to slow query logging - Key: HBASE-4542 URL: https://issues.apache.org/jira/browse/HBASE-4542 Project: HBase Issue Type: Improvement Affects Versions: 0.89.20100924 Reporter: Kannan Muthukkaruppan Assignee: Madhuwanti Vaidya Fix For: 0.94.0, 0.96.0 Attachments: 0001-jira-HBASE-4542-Add-filter-info-to-slow-query-loggin.patch, Add-filter-info-to-slow-query-logging-2012-03-06_14_28_13.patch, D1263.2.patch, D1539.1.patch Slow query log doesn't report filters in effect. For example: {code} (operationTooSlow): \ {processingtimems:3468,client:10.138.43.206:40035,timeRange: [0,9223372036854775807],\ starttimems:1317772005821,responsesize:42411, \ class:HRegionServer,table:myTable,families:{CF1:ALL]},\ row:6c3b8efa132f0219b7621ed1e5c8c70b,queuetimems:0,\ method:get,totalColumns:1,maxVersions:1,storeLimit:-1} {code} The above would suggest that all columns of myTable:CF1 are being requested for the given row. But in reality there could be filters in effect (such as ColumnPrefixFilter, ColumnRangeFilter, TimestampsFilter, etc.). We should enhance the slow query log to capture and report this information.
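Editorial sketch for HBASE-4542: the request is for the slow query log to include the filters in effect. One low-risk way to get there is for each filter to describe itself through toString() and for the log line to include that string. The classes below are hypothetical stand-ins, not HBase's filter classes or its real log format, and the field names are chosen only to echo the sample above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical filter that reports its own configuration via toString().
class ColumnPrefixFilterSketch {
    private final String prefix;

    ColumnPrefixFilterSketch(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public String toString() {
        return "ColumnPrefixFilterSketch(" + prefix + ")";
    }
}

// Hypothetical log-line builder; the real slow query log format may differ.
class SlowQueryLogSketch {
    /** Builds a log line that includes the filter description, or "none" when no filter is set. */
    static String describe(String table, String row, Object filter) {
        Map<String, Object> fields = new LinkedHashMap<>(); // keeps insertion order for readability
        fields.put("table", table);
        fields.put("row", row);
        fields.put("filter", filter == null ? "none" : filter.toString());
        return "(operationTooSlow): " + fields;
    }

    public static void main(String[] args) {
        System.out.println(describe("myTable", "6c3b8efa", new ColumnPrefixFilterSketch("abc")));
        System.out.println(describe("myTable", "6c3b8efa", null));
    }
}
```

Because only string rendering changes, no interface has to change, which is the kind of change that is comparatively safe to backport.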
[jira] [Commented] (HBASE-4542) add filter info to slow query logging
[ https://issues.apache.org/jira/browse/HBASE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227923#comment-13227923 ] stack commented on HBASE-4542: -- And Lars, I think it should be safe since it's only toStringery that is going on in this patch... no changes in Interfaces. add filter info to slow query logging - Key: HBASE-4542 URL: https://issues.apache.org/jira/browse/HBASE-4542 Project: HBase Issue Type: Improvement Affects Versions: 0.89.20100924 Reporter: Kannan Muthukkaruppan Assignee: Madhuwanti Vaidya Fix For: 0.94.0, 0.96.0 Attachments: 0001-jira-HBASE-4542-Add-filter-info-to-slow-query-loggin.patch, Add-filter-info-to-slow-query-logging-2012-03-06_14_28_13.patch, D1263.2.patch, D1539.1.patch Slow query log doesn't report filters in effect. For example: {code} (operationTooSlow): \ {processingtimems:3468,client:10.138.43.206:40035,timeRange: [0,9223372036854775807],\ starttimems:1317772005821,responsesize:42411, \ class:HRegionServer,table:myTable,families:{CF1:ALL]},\ row:6c3b8efa132f0219b7621ed1e5c8c70b,queuetimems:0,\ method:get,totalColumns:1,maxVersions:1,storeLimit:-1} {code} The above would suggest that all columns of myTable:CF1 are being requested for the given row. But in reality there could be filters in effect (such as ColumnPrefixFilter, ColumnRangeFilter, TimestampsFilter, etc.). We should enhance the slow query log to capture and report this information.
[jira] [Commented] (HBASE-5564) Bulkload is discarding duplicate records
[ https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227926#comment-13227926 ] Lars Hofhansl commented on HBASE-5564: -- So this is only about ImportTsv? Should change the title in that case. I agree with Todd, at least for ImportTsv. Import/Export should not (and hopefully do not) exhibit this behavior (since we want to be able to import/export KVs with multiple versions). Bulkload is discarding duplicate records Key: HBASE-5564 URL: https://issues.apache.org/jira/browse/HBASE-5564 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.90.7, 0.92.2, 0.94.0, 0.96.0 Environment: HBase 0.92 Reporter: Laxman Assignee: Laxman Labels: bulkloader Duplicate records are getting discarded when duplicate records exist in the same input file, and more specifically if they exist in the same split. Duplicate records are considered if the records are from different splits. Version under test: HBase 0.92
[jira] [Commented] (HBASE-4608) HLog Compression
[ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227929#comment-13227929 ] stack commented on HBASE-4608: -- Its a regular pattern only. Perhaps this does some decent testing? TestWALReplayCompressed? HLog Compression Key: HBASE-4608 URL: https://issues.apache.org/jira/browse/HBASE-4608 Project: HBase Issue Type: New Feature Reporter: Li Pi Assignee: Li Pi Fix For: 0.94.0 Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt, 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt The current bottleneck to HBase write speed is replicating the WAL appends across different datanodes. We can speed up this process by compressing the HLog. Current plan involves using a dictionary to compress table name, region id, cf name, and possibly other bits of repeated data. Also, HLog format may be changed in other ways to produce a smaller HLog. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5562) test-patch.sh reports a javadoc warning when there are no new javadoc warnings
[ https://issues.apache.org/jira/browse/HBASE-5562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227936#comment-13227936 ] Hudson commented on HBASE-5562: --- Integrated in HBase-0.94 #26 (See [https://builds.apache.org/job/HBase-0.94/26/]) HBASE-5562 test-patch.sh reports a javadoc warning when there are no new javadoc warnings (Revision 1299825) Result = FAILURE stack : Files : * /hbase/branches/0.94/dev-support/test-patch.sh test-patch.sh reports a javadoc warning when there are no new javadoc warnings -- Key: HBASE-5562 URL: https://issues.apache.org/jira/browse/HBASE-5562 Project: HBase Issue Type: Bug Components: scripts Affects Versions: 0.94.0, 0.96.0 Reporter: David S. Wang Assignee: David S. Wang Priority: Trivial Fix For: 0.94.0, 0.96.0 Attachments: test-patch.patch test-patch.sh will report new javadoc warnings when there are not any. e.g. -1 javadoc. The javadoc tool appears to have generated -123 warning messages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4608) HLog Compression
[ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227934#comment-13227934 ] stack commented on HBASE-4608: -- The tests do not have variety. I think we should add it here rather than wait for the variety to hit out in the field. bq. If only compression would evolve, I think checking against compression type metadata would be adequate. The above begins with a conditional, If HLog Compression Key: HBASE-4608 URL: https://issues.apache.org/jira/browse/HBASE-4608 Project: HBase Issue Type: New Feature Reporter: Li Pi Assignee: Li Pi Fix For: 0.94.0 Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt, 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt The current bottleneck to HBase write speed is replicating the WAL appends across different datanodes. We can speed up this process by compressing the HLog. Current plan involves using a dictionary to compress table name, region id, cf name, and possibly other bits of repeated data. Also, HLog format may be changed in other ways to produce a smaller HLog. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5567) test-patch.sh has logic error in findbugs check
[ https://issues.apache.org/jira/browse/HBASE-5567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David S. Wang updated HBASE-5567: - Attachment: HBASE_5567.patch test-patch.sh has logic error in findbugs check --- Key: HBASE-5567 URL: https://issues.apache.org/jira/browse/HBASE-5567 Project: HBase Issue Type: Bug Components: scripts Affects Versions: 0.94.0, 0.96.0 Reporter: David S. Wang Assignee: David S. Wang Priority: Trivial Labels: noob Attachments: HBASE_5567.patch Similar bug to HBASE-5562. The wrong operator is used in the check against $OK_FINDBUGS_WARNINGS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5567) test-patch.sh has logic error in findbugs check
[ https://issues.apache.org/jira/browse/HBASE-5567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David S. Wang updated HBASE-5567: - Status: Patch Available (was: Open) test-patch.sh has logic error in findbugs check --- Key: HBASE-5567 URL: https://issues.apache.org/jira/browse/HBASE-5567 Project: HBase Issue Type: Bug Components: scripts Affects Versions: 0.94.0, 0.96.0 Reporter: David S. Wang Assignee: David S. Wang Priority: Trivial Labels: noob Attachments: HBASE_5567.patch Similar bug to HBASE-5562. The wrong operator is used in the check against $OK_FINDBUGS_WARNINGS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5567) test-patch.sh has logic error in findbugs check
[ https://issues.apache.org/jira/browse/HBASE-5567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227941#comment-13227941 ] David S. Wang commented on HBASE-5567: -- Sorry I didn't catch this one earlier ... would have included it in the previous patch. test-patch.sh has logic error in findbugs check --- Key: HBASE-5567 URL: https://issues.apache.org/jira/browse/HBASE-5567 Project: HBase Issue Type: Bug Components: scripts Affects Versions: 0.94.0, 0.96.0 Reporter: David S. Wang Assignee: David S. Wang Priority: Trivial Labels: noob Attachments: HBASE_5567.patch Similar bug to HBASE-5562. The wrong operator is used in the check against $OK_FINDBUGS_WARNINGS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5548) Add ability to get a table in the shell
[ https://issues.apache.org/jira/browse/HBASE-5548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227944#comment-13227944 ] stack commented on HBASE-5548: -- Looks good. Does stuff work as it used to after this patch goes in? Can you do the old dumb: get 'tablename', 'yadda', 'yadda' still? On get_table, should it just be table? 'status' dumps out status, 'version' dumps out version. Maybe it should stay get_table, because it returns something you can catch into a variable? Oh, this is nice: {code} + hbase> t1 = create 't1', 'f1' {code} How do I add a filter? Is it the same old: t.scan, {STARTROW => 'xyz'} or something? Patch is looking good I think. Introduces a nice bit of functionality... Add ability to get a table in the shell --- Key: HBASE-5548 URL: https://issues.apache.org/jira/browse/HBASE-5548 Project: HBase Issue Type: Improvement Components: shell Reporter: Jesse Yates Assignee: Jesse Yates Fix For: 0.96.0 Attachments: ruby_HBASE-5528-v0.patch, ruby_HBASE-5548-v1.patch Currently, all the commands that operate on a table in the shell first have to take the table name as input. There are two main considerations: * It is annoying to have to write the table name every time, when you should just be able to get a reference to a table * The current implementation is very wasteful: it creates a new HTable for each call (but reuses the connection since it uses the same configuration) We should be able to get a handle to a single HTable and then operate on that.
[jira] [Commented] (HBASE-4608) HLog Compression
[ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227946#comment-13227946 ] Ted Yu commented on HBASE-4608: --- I think WAL_VERSION metadata is orthogonal to compression type metadata and I would expect both to be present in new HLog files written with this feature. Say we define WAL_VERSION as v2 which has WAL compression capability. We still need to check compression type metadata before applying dictionary compression. In this regard adding WAL_VERSION seems to be redundant. HLog Compression Key: HBASE-4608 URL: https://issues.apache.org/jira/browse/HBASE-4608 Project: HBase Issue Type: New Feature Reporter: Li Pi Assignee: Li Pi Fix For: 0.94.0 Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt, 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt The current bottleneck to HBase write speed is replicating the WAL appends across different datanodes. We can speed up this process by compressing the HLog. Current plan involves using a dictionary to compress table name, region id, cf name, and possibly other bits of repeated data. Also, HLog format may be changed in other ways to produce a smaller HLog. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
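Editorial sketch for HBASE-4608: the argument above is that even with a WAL_VERSION entry, a reader must still consult the compression type entry before applying dictionary compression. The snippet below only illustrates that control flow. The metadata key names, the value "DICTIONARY", the defaults, and the idea that metadata is a string map are all assumptions for illustration, not the file format used by the patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical reader-side check; key names and values are illustrative only.
class WalHeaderSketch {
    static final String WAL_VERSION_KEY = "WAL_VERSION";
    static final String COMPRESSION_TYPE_KEY = "COMPRESSION_TYPE";

    /** Returns true only when the version allows compression and the type selects the dictionary scheme. */
    static boolean useDictionaryCompression(Map<String, String> metadata) {
        // Files written before versioning existed carry neither key; treat them as v1, uncompressed.
        int version = Integer.parseInt(metadata.getOrDefault(WAL_VERSION_KEY, "1"));
        String type = metadata.getOrDefault(COMPRESSION_TYPE_KEY, "NONE");
        // The version says compression may be present; the type says whether and which kind.
        return version >= 2 && "DICTIONARY".equals(type);
    }

    public static void main(String[] args) {
        Map<String, String> meta = new HashMap<>();
        meta.put(WAL_VERSION_KEY, "2");
        meta.put(COMPRESSION_TYPE_KEY, "DICTIONARY");
        System.out.println(useDictionaryCompression(meta));
    }
}
```

In this shape the version check adds nothing as long as compression is the only thing that varies, which is the redundancy Ted points at; it starts to matter only if some other aspect of the format changes later, which is stack's counterpoint.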
[jira] [Updated] (HBASE-5567) test-patch.sh has logic error in findbugs check
[ https://issues.apache.org/jira/browse/HBASE-5567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5567: - Resolution: Fixed Fix Version/s: 0.96.0 0.94.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed 0.94 and trunk. Thanks for the patch David. test-patch.sh has logic error in findbugs check --- Key: HBASE-5567 URL: https://issues.apache.org/jira/browse/HBASE-5567 Project: HBase Issue Type: Bug Components: scripts Affects Versions: 0.94.0, 0.96.0 Reporter: David S. Wang Assignee: David S. Wang Priority: Trivial Labels: noob Fix For: 0.94.0, 0.96.0 Attachments: HBASE_5567.patch Similar bug to HBASE-5562. The wrong operator is used in the check against $OK_FINDBUGS_WARNINGS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4608) HLog Compression
[ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227954#comment-13227954 ] Ted Yu commented on HBASE-4608: --- bq. Should the Compression class in wal package ... I only see KeyValueCompression.java under wal package. Please elaborate which class should carry more comments. HLog Compression Key: HBASE-4608 URL: https://issues.apache.org/jira/browse/HBASE-4608 Project: HBase Issue Type: New Feature Reporter: Li Pi Assignee: Li Pi Fix For: 0.94.0 Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt, 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt The current bottleneck to HBase write speed is replicating the WAL appends across different datanodes. We can speed up this process by compressing the HLog. Current plan involves using a dictionary to compress table name, region id, cf name, and possibly other bits of repeated data. Also, HLog format may be changed in other ways to produce a smaller HLog. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4608) HLog Compression
[ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227961#comment-13227961 ] Ted Yu commented on HBASE-4608: --- Uploaded v23 onto review board. After WAL version metadata design is finalized, will add that. HLog Compression Key: HBASE-4608 URL: https://issues.apache.org/jira/browse/HBASE-4608 Project: HBase Issue Type: New Feature Reporter: Li Pi Assignee: Li Pi Fix For: 0.94.0 Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt, 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt The current bottleneck to HBase write speed is replicating the WAL appends across different datanodes. We can speed up this process by compressing the HLog. Current plan involves using a dictionary to compress table name, region id, cf name, and possibly other bits of repeated data. Also, HLog format may be changed in other ways to produce a smaller HLog. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5542) Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()
[ https://issues.apache.org/jira/browse/HBASE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227964#comment-13227964 ] Phabricator commented on HBASE-5542: lhofhansl has commented on the revision HBASE-5542 [jira] Unify HRegion.mutateRowsWithLocks() and HRegion.processRow(). In the latest patch I can't find rowProcessorTimeout being set anywhere. What I had in mind was this: We are now using this for internal operations (such as mutateRow) and external operations (those created by users). The internal operations should not be creating new threads. External operations either could always time bound or it could be up to the implementer. REVISION DETAIL https://reviews.facebook.net/D2217 Unify HRegion.mutateRowsWithLocks() and HRegion.processRow() Key: HBASE-5542 URL: https://issues.apache.org/jira/browse/HBASE-5542 Project: HBase Issue Type: Improvement Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.96.0 Attachments: HBASE-5542.D2217.1.patch, HBASE-5542.D2217.2.patch, HBASE-5542.D2217.3.patch, HBASE-5542.D2217.4.patch, HBASE-5542.D2217.5.patch, HBASE-5542.D2217.6.patch, HBASE-5542.D2217.7.patch, HBASE-5542.D2217.8.patch, HBASE-5542.D2217.9.patch mutateRowsWithLocks() does atomic mutations on multiple rows. processRow() does atomic read-modify-writes on a single row. It will be useful to generalize both and have a processRowsWithLocks() that does atomic read-modify-writes on multiple rows. This also helps reduce some redundancy in the codes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
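The split lhofhansl describes above (internal operations run without new threads, external operations are time-bounded) could be sketched roughly as follows. This is only an illustration under stated assumptions: `RowProcessorRunner`, `runInternal`, `runExternal` and `timesOut` are hypothetical names, not part of the HBASE-5542 patch, and the timeout value stands in for whatever rowProcessorTimeout would be configured to.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not HBase code: runs a row-processing task either
// inline (internal operations) or time-bounded on a pool (user-supplied ones).
class RowProcessorRunner {
  private final ExecutorService pool = Executors.newCachedThreadPool();

  /** Internal operations run on the caller's thread; no new thread is created. */
  <T> T runInternal(Callable<T> task) throws Exception {
    return task.call();
  }

  /** External operations run on the pool and are abandoned after timeoutMs. */
  <T> T runExternal(Callable<T> task, long timeoutMs) throws Exception {
    Future<T> future = pool.submit(task);
    try {
      return future.get(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      future.cancel(true); // interrupt the worker thread
      throw e;
    }
  }

  void shutdown() {
    pool.shutdownNow();
  }

  /** Usage example: true if a task sleeping sleepMs exceeds timeoutMs. */
  static boolean timesOut(long sleepMs, long timeoutMs) {
    RowProcessorRunner runner = new RowProcessorRunner();
    try {
      runner.runExternal(() -> { Thread.sleep(sleepMs); return null; }, timeoutMs);
      return false;
    } catch (TimeoutException e) {
      return true;
    } catch (Exception e) {
      throw new RuntimeException(e);
    } finally {
      runner.shutdown();
    }
  }
}
```

Whether external operations should always be bounded, or whether that is left to the implementer, is the open question in the comment; the sketch only shows the always-bounded variant.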
[jira] [Commented] (HBASE-4608) HLog Compression
[ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227969#comment-13227969 ] stack commented on HBASE-4608: -- bq. I think WAL_VERSION metadata is orthogonal to compression type metadata and I would expect both to be present in new HLog files written with this feature. How does it get in if you don't add it? If you don't want to add it, just don't. I'm not going to +1 this patch though if it adds metadata about a new compression feature w/o introducing a general versioning on the WAL. bq. Should the Compression class in wal package ... The compression class in wal is Compressor.java. I have trouble following your responses to my comments because they come in w/o context and they are also done piecemeal, which means I have to spend way more time than I should reviewing your stuff. I'd suggest you save up your comments and submit them in a lump rather than hit submit per comment; you'll use up less internet. HLog Compression Key: HBASE-4608 URL: https://issues.apache.org/jira/browse/HBASE-4608 Project: HBase Issue Type: New Feature Reporter: Li Pi Assignee: Li Pi Fix For: 0.94.0 Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt, 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt The current bottleneck to HBase write speed is replicating the WAL appends across different datanodes. We can speed up this process by compressing the HLog. Current plan involves using a dictionary to compress table name, region id, cf name, and possibly other bits of repeated data. Also, HLog format may be changed in other ways to produce a smaller HLog.
[jira] [Commented] (HBASE-5548) Add ability to get a table in the shell
[ https://issues.apache.org/jira/browse/HBASE-5548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227975#comment-13227975 ] Jesse Yates commented on HBASE-5548: bq. get 'tablename', 'yadda', 'yadda' Yeah, nothing changes there. In fact, scan (which is the only change that patch introduces) will still work exactly the same way. Same deal for doing something like t.scan bq. On get_table, should it just be table? 'status' dumps out status, 'version', dumps out 'version'. Maybe it should stay get_table, because it returns something you can catch into a variable? I was thinking it should stay get_table, since table would clutter the overall naming a little (Table is also the name of the ::Hbase::Table class) and, yeah, it implies holding a reference, which _none_ of the other commands give you. I would imagine doing {code} t = table 't1' {code} as actually creating the table if it doesn't exist, or just getting a reference to it (similar to how 'create' works in the latest patch). A follow-on patch would be nice there. I want to think about the 'right' way to do the extension of ::Hbase::Table, rather than this piecemeal, method-by-method, decentralized increase. Also, adding the same type of system for put, get, delete, etc. would be nice. But, yeah, close! Add ability to get a table in the shell --- Key: HBASE-5548 URL: https://issues.apache.org/jira/browse/HBASE-5548 Project: HBase Issue Type: Improvement Components: shell Reporter: Jesse Yates Assignee: Jesse Yates Fix For: 0.96.0 Attachments: ruby_HBASE-5528-v0.patch, ruby_HBASE-5548-v1.patch Currently, all the commands that operate on a table in the shell first have to take the table name as input. There are two main considerations: * It is annoying to have to write the table name every time, when you should just be able to get a reference to a table * The current implementation is very wasteful - it creates a new HTable for each call (but reuses the connection since it uses the same configuration) We should be able to get a handle to a single HTable and then operate on that.
[jira] [Commented] (HBASE-5292) getsize per-CF metric incorrectly counts compaction related reads as well
[ https://issues.apache.org/jira/browse/HBASE-5292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227978#comment-13227978 ] Hudson commented on HBASE-5292: --- Integrated in HBase-TRUNK #2676 (See [https://builds.apache.org/job/HBase-TRUNK/2676/]) [jira] [HBASE-5292] Prevent counting getSize on compactions Author: Zhiqiu Kong Summary: Added two separate metrics, one for get() and one for next(). This is done by refactoring the internal next() API. To be more specific, only Get.get() and ResultScanner.next() pass the metric name (getsize and nextsize respectively) to HRegion::RegionScanner::next(List<KeyValue>, String). This eventually hits StoreScanner::next(List<KeyValue>, int, String), where the metrics are counted. Their call paths are:
1) Get
HTable::get(final Get get)
=> HRegionServer::get(byte [] regionName, Get get)
=> HRegion::get(final Get get, final Integer lockid)
=> HRegion::get(final Get get) [pass METRIC_GETSIZE to the callee]
=> HRegion::RegionScanner::next(List<KeyValue> outResults, String metric)
=> HRegion::RegionScanner::next(List<KeyValue> outResults, int limit, String metric)
=> HRegion::RegionScanner::nextInternal(int limit, String metric)
=> KeyValueHeap::next(List<KeyValue> result, int limit, String metric)
=> StoreScanner::next(List<KeyValue> outResult, int limit, String metric)
2) Next
HTable::ClientScanner::next()
=> ScannerCallable::call()
=> HRegionServer::next(long scannerId)
=> HRegionServer::next(final long scannerId, int nbRows) [pass METRIC_NEXTSIZE to the callee]
=> HRegion::RegionScanner::next(List<KeyValue> outResults, String metric)
=> HRegion::RegionScanner::next(List<KeyValue> outResults, int limit, String metric)
=> HRegion::RegionScanner::nextInternal(int limit, String metric)
=> KeyValueHeap::next(List<KeyValue> result, int limit, String metric)
=> StoreScanner::next(List<KeyValue> outResult, int limit, String metric)
Test Plan: 1. Passed unit tests. 2.
Created a testcase TestRegionServerMetrics::testGetNextSize to guarantee: * Get/Next contributes to getsize/nextsize metrics * Both getsize/nextsize are per Column Family * Flush/compaction won't affect these two metrics Reviewed By: mbautin Reviewers: Kannan, mbautin, Liyin, JIRA CC: Kannan, mbautin, Liyin, zhiqiu Differential Revision: https://reviews.facebook.net/D1617 (Revision 1299147) Result = FAILURE mbautin : Files : * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/InternalScanner.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueHeap.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java * /hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java * /hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverInterface.java * /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServerMetrics.java getsize per-CF metric incorrectly counts compaction related reads as well -- Key: HBASE-5292 URL: https://issues.apache.org/jira/browse/HBASE-5292 Project: HBase Issue Type: Bug Affects Versions: 0.89.20100924 Reporter: Kannan Muthukkaruppan Fix For: 0.94.0 Attachments: 0001-jira-HBASE-5292-Prevent-counting-getSize-on-compacti.patch, D1527.1.patch, D1527.2.patch, D1527.3.patch, D1527.4.patch, D1617.1.patch, jira-HBASE-5292-Prevent-counting-getSize-on-compacti-2012-03-09_13_26_52.patch The per-CF getsize metric's intent was to track bytes returned (to HBase clients) per-CF. [Note: We already have metrics to track # of HFileBlock's read for compaction vs. non-compaction cases -- e.g., compactionblockreadcnt vs. fsblockreadcnt.] Currently, the getsize metric gets updated for both client initiated Get/Scan operations as well for compaction related reads. 
The metric is updated in StoreScanner.java:next() when the Scan query matcher returns an INCLUDE* code via a: HRegion.incrNumericMetric(this.metricNameGetsize, copyKv.getLength()); We should not do the above in case of compactions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
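The idea in the summary above, where only client-facing reads pass a metric name down to the scanner and compaction reads pass none, could be sketched like this. It is a simplified stand-in, not the HBase code: `MetricAwareScanner` and its in-memory map are hypothetical, and cells are modelled as plain byte arrays.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the scanner/metric plumbing, not HBase code.
class MetricAwareScanner {
  static final String METRIC_GETSIZE = "getsize";
  static final String METRIC_NEXTSIZE = "nextsize";

  private final Map<String, Long> metrics = new HashMap<>();
  private final List<byte[]> cells;

  MetricAwareScanner(List<byte[]> cells) {
    this.cells = cells;
  }

  /** Copies all cells into out; counts bytes only when a metric name is given. */
  void next(List<byte[]> out, String metric) {
    for (byte[] cell : cells) {
      out.add(cell);
      if (metric != null) { // compaction-style callers pass null and are not counted
        metrics.merge(metric, (long) cell.length, Long::sum);
      }
    }
  }

  /** Returns the accumulated byte count for a metric, or 0 if never updated. */
  long metric(String name) {
    return metrics.getOrDefault(name, 0L);
  }
}
```

A Get-style caller would pass `METRIC_GETSIZE`, a scan-style caller `METRIC_NEXTSIZE`, and a compaction would pass `null`, so its reads leave both counters untouched.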
[jira] [Commented] (HBASE-5535) Make the functions in task monitor synchronized
[ https://issues.apache.org/jira/browse/HBASE-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227976#comment-13227976 ] Hudson commented on HBASE-5535: --- Integrated in HBase-TRUNK #2676 (See [https://builds.apache.org/job/HBase-TRUNK/2676/]) HBASE-5535 Make the functions in task monitor synchronized (Revision 1299088) Result = FAILURE stack : Files : * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/monitoring/TaskMonitor.java Make the functions in task monitor synchronized --- Key: HBASE-5535 URL: https://issues.apache.org/jira/browse/HBASE-5535 Project: HBase Issue Type: Bug Reporter: Liyin Tang Assignee: Liyin Tang Fix For: 0.94.0 Attachments: HBASE-5535-Make-the-functions-in-task-monitor-synchr-2012-03-08_16_33_42.patch There are some potential race condition in the task monitor. So update the functions in task monitor to be synchronized. The example of the problem caused by the race condition: ERROR org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Cache flush failed for region java.lang.IndexOutOfBoundsException: Index: 1745, Size: 1744 at java.util.ArrayList.add(ArrayList.java:367) at java.util.SubList.add(AbstractList.java:633) at java.util.SubList.add(AbstractList.java:633) at java.util.SubList.add(AbstractList.java:633) at java.util.SubList.add(AbstractList.java:633) at java.util.SubList.add(AbstractList.java:633) at java.util.AbstractList.add(AbstractList.java:91) at org.apache.hadoop.hbase.monitoring.TaskMonitor.createStatus(TaskMonitor.java:74) at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1139) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:260) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:234) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:146) -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
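The kind of race described in HBASE-5535 (concurrent add() calls on an unsynchronized ArrayList) and the fix of marking the accessor methods synchronized can be illustrated with a small Java sketch. `SimpleTaskMonitor` is a hypothetical, much simplified class; the real TaskMonitor keeps richer status objects.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified monitor; the real TaskMonitor differs.
class SimpleTaskMonitor {
  private final List<String> tasks = new ArrayList<>();

  /** synchronized, so concurrent callers cannot corrupt the ArrayList. */
  synchronized void createStatus(String description) {
    tasks.add(description);
  }

  synchronized int size() {
    return tasks.size();
  }

  /** Adds perThread tasks from each of nThreads threads and returns the final count. */
  static int stress(int nThreads, int perThread) {
    SimpleTaskMonitor monitor = new SimpleTaskMonitor();
    List<Thread> threads = new ArrayList<>();
    for (int t = 0; t < nThreads; t++) {
      Thread worker = new Thread(() -> {
        for (int i = 0; i < perThread; i++) {
          monitor.createStatus("task");
        }
      });
      threads.add(worker);
      worker.start();
    }
    for (Thread worker : threads) {
      try {
        worker.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
    return monitor.size();
  }
}
```

With the synchronized keyword removed from createStatus(), the same stress run may lose entries or throw from inside ArrayList, which is the failure mode the stack trace above shows.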
[jira] [Commented] (HBASE-4542) add filter info to slow query logging
[ https://issues.apache.org/jira/browse/HBASE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227980#comment-13227980 ] Hudson commented on HBASE-4542: --- Integrated in HBase-TRUNK #2676 (See [https://builds.apache.org/job/HBase-TRUNK/2676/]) [jira] [HBASE-4542] Add filter info to slow query logging Author: Zhiqiu Kong Summary: The original 89-fb diff is: https://reviews.facebook.net/D1263 The slow operation log does not provide enough information when a filter is present. The following was done to add the filter info: 1) Added a toString() method for filters inheriting from FilterBase; this affects 22 filters and their subclasses. The info added includes the filter's name and its members. For example, for TimestampsFilter, we'll output its class name as well as the defined timestamps. 2) Added a field 'filter' in Get::toMap() and Scan::toMap() to enable the logging of filter info. Task ID: #750975 Blame Rev: Test Plan: 1. Ran and passed the unit tests to make sure nothing breaks. 2. Ran kannan's script to trigger the slow operation logging and checked for each filter that the filter info was logged.
In more detail, the logged output is as follows (only the 'filter' field is shown here for ease of reading):
filter:TimestampsFilter (3/3): [2, 3, 5]
filter:TimestampsFilter (5/6): [2, 3, 5, 7, 11]
filter:ColumnPrefixFilter col2
filter:ColumnRangeFilter [col2a, col2b]
filter:ColumnCountGetFilter 8
filter:ColumnPaginationFilter (4, 4)
filter:InclusiveStopFilter row
filter:PrefixFilter row
filter:PageFilter 1
filter:SkipFilter TimestampsFilter (1/1): [1000]
filter:WhileMatchFilter TimestampsFilter (3/3): [2, 3, 5]
filter:KeyOnlyFilter
filter:FirstKeyOnlyFilter
filter:MultipleColumnPrefixFilter (3/3): [a, b, c]
filter:DependentColumnFilter (family, qualifier, true, LESS, value)
filter:FamilyFilter (LESS, value)
filter:QualifierFilter (LESS, value)
filter:RowFilter (LESS, value)
filter:ValueFilter (LESS, value)
filter:KeyOnlyFilter
filter:FirstKeyOnlyFilter
filter:SingleColumnValueFilter (family, qualifier, EQUAL, value)
filter:SingleColumnValueExcludeFilter (family, qualifier, EQUAL, value)
filter:FilterList AND (2/2): [KeyOnlyFilter, FirstKeyOnlyFilter]
Please check ~zhiqiu/Codes/scripts/testFilter.rb for the testing script. 3. Added unit test cases to TestOperation to verify that the filters' toString() methods work well.
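A toString() that produces the TimestampsFilter lines in the log output above might look roughly like the following. This is a guess at the shape, not the committed code: the class name `SketchTimestampsFilter` is hypothetical, and the cap of five displayed timestamps is an assumption inferred from the "(5/6)" sample line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical filter used only to illustrate the log format; not HBase's class.
class SketchTimestampsFilter {
  // Assumed cap on how many timestamps are printed, inferred from "(5/6)".
  private static final int MAX_LOG_TIMESTAMPS = 5;
  private final TreeSet<Long> timestamps;

  SketchTimestampsFilter(List<Long> timestamps) {
    this.timestamps = new TreeSet<>(timestamps);
  }

  /** Formats as "TimestampsFilter (shown/total): [t1, t2, ...]". */
  @Override
  public String toString() {
    List<Long> shown = new ArrayList<>();
    for (Long ts : timestamps) {
      if (shown.size() == MAX_LOG_TIMESTAMPS) {
        break;
      }
      shown.add(ts);
    }
    return String.format("%s (%d/%d): %s", "TimestampsFilter",
        shown.size(), timestamps.size(), shown);
  }
}
```

Capping the printed members keeps a slow-query log line bounded even when a filter carries a very large set of timestamps.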
Reviewed By: mbautin Reviewers: Kannan, madhuvaidya, mbautin, JIRA CC: Kannan, madhuvaidya, mbautin, zhiqiu, stack Differential Revision: https://reviews.facebook.net/D1539 (Revision 1299019) Result = FAILURE mbautin : Files : * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Get.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/ColumnCountGetFilter.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/ColumnPaginationFilter.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/ColumnPrefixFilter.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/ColumnRangeFilter.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/CompareFilter.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/DependentColumnFilter.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/FilterBase.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/FilterList.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/InclusiveStopFilter.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/MultipleColumnPrefixFilter.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/PageFilter.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/PrefixFilter.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/SingleColumnValueFilter.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/SkipFilter.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/TimestampsFilter.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/WhileMatchFilter.java * /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestOperation.java add filter info to slow query logging - Key: HBASE-4542 URL: https://issues.apache.org/jira/browse/HBASE-4542 Project: HBase Issue Type: Improvement Affects Versions: 0.89.20100924 Reporter: Kannan Muthukkaruppan Assignee: Madhuwanti 
Vaidya Fix For: 0.94.0, 0.96.0 Attachments: 0001-jira-HBASE-4542-Add-filter-info-to-slow-query-loggin.patch, Add-filter-info-to-slow-query-logging-2012-03-06_14_28_13.patch, D1263.2.patch, D1539.1.patch Slow query log doesn't report filters in effect. For example: {code} (operationTooSlow): \
[jira] [Commented] (HBASE-5552) Clean up our jmx view; its a bit of a mess
[ https://issues.apache.org/jira/browse/HBASE-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227977#comment-13227977 ] Hudson commented on HBASE-5552: --- Integrated in HBase-TRUNK #2676 (See [https://builds.apache.org/job/HBase-TRUNK/2676/]) HBASE-5552 Clean up our jmx view; its a bit of a mess (Revision 1298923) Result = FAILURE stack : Files : * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPCStatistics.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/metrics/HBaseInfo.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java Clean up our jmx view; its a bit of a mess -- Key: HBASE-5552 URL: https://issues.apache.org/jira/browse/HBASE-5552 Project: HBase Issue Type: Bug Reporter: stack Assignee: stack Priority: Blocker Fix For: 0.92.1, 0.94.0 Attachments: 0.92.0jmx.png, 5552.txt, currentjmxview.png, patchedjmxview.png Fix before we release 0.92.1 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5555) add a pointer to a dns verification utility in hbase book/dns
[ https://issues.apache.org/jira/browse/HBASE-5555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227979#comment-13227979 ] Hudson commented on HBASE-5555: --- Integrated in HBase-TRUNK #2676 (See [https://builds.apache.org/job/HBase-TRUNK/2676/]) HBASE-5555 add a pointer to a dns verification utility in hbase book/dns (Revision 1299073) Result = FAILURE stack : Files : * /hbase/trunk/src/docbkx/configuration.xml add a pointer to a dns verification utility in hbase book/dns - Key: HBASE-5555 URL: https://issues.apache.org/jira/browse/HBASE-5555 Project: HBase Issue Type: Improvement Components: documentation Reporter: Sujee Maniyam Assignee: Sujee Maniyam Priority: Minor Fix For: 0.96.0 Attachments: .txt DNS should work correctly in an HBase cluster. I have a simple DNS checker utility that verifies DNS on all machines of the cluster: https://github.com/sujee/hadoop-dns-checker Add a pointer to the tool in the HBase book: http://hbase.apache.org/book.html#dns
[jira] [Commented] (HBASE-5562) test-patch.sh reports a javadoc warning when there are no new javadoc warnings
[ https://issues.apache.org/jira/browse/HBASE-5562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227981#comment-13227981 ] Hudson commented on HBASE-5562: --- Integrated in HBase-TRUNK #2676 (See [https://builds.apache.org/job/HBase-TRUNK/2676/]) HBASE-5562 test-patch.sh reports a javadoc warning when there are no new javadoc warnings (Revision 1299824) Result = FAILURE stack : Files : * /hbase/trunk/dev-support/test-patch.sh test-patch.sh reports a javadoc warning when there are no new javadoc warnings -- Key: HBASE-5562 URL: https://issues.apache.org/jira/browse/HBASE-5562 Project: HBase Issue Type: Bug Components: scripts Affects Versions: 0.94.0, 0.96.0 Reporter: David S. Wang Assignee: David S. Wang Priority: Trivial Fix For: 0.94.0, 0.96.0 Attachments: test-patch.patch test-patch.sh will report new javadoc warnings when there are not any. e.g. -1 javadoc. The javadoc tool appears to have generated -123 warning messages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5538) A metric to measure the size of the response queue in the hbase rpc server
[ https://issues.apache.org/jira/browse/HBASE-5538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227982#comment-13227982 ] Hudson commented on HBASE-5538: --- Integrated in HBase-TRUNK #2676 (See [https://builds.apache.org/job/HBase-TRUNK/2676/]) HBASE-5538 A metric to measure the size of the response queue in the hbase rpc server (Revision 1299084) Result = FAILURE stack : Files : * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseRpcMetrics.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java A metric to measure the size of the response queue in the hbase rpc server -- Key: HBASE-5538 URL: https://issues.apache.org/jira/browse/HBASE-5538 Project: HBase Issue Type: Improvement Components: ipc Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.96.0 Attachments: D2199.1.patch The HbaseServer queues responses to client (if the client is slow). Expose a metric that records the size of the response queue. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
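One way such a response-queue metric could be maintained is sketched below. The issue does not say whether "size" means bytes or number of queued responses; this sketch assumes bytes. `SketchResponder` and `getResponseQueueSize` are hypothetical names, not the HBaseServer or HBaseRpcMetrics API.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical responder; names are illustrative, not the HBaseServer API.
class SketchResponder {
  private final Queue<byte[]> pending = new ArrayDeque<>();
  private final AtomicLong responseQueueBytes = new AtomicLong();

  /** Queues a response for a slow client and grows the metric by its size. */
  synchronized void enqueue(byte[] response) {
    pending.add(response);
    responseQueueBytes.addAndGet(response.length);
  }

  /** Removes the oldest queued response (if any) and shrinks the metric. */
  synchronized byte[] dequeue() {
    byte[] response = pending.poll();
    if (response != null) {
      responseQueueBytes.addAndGet(-response.length);
    }
    return response;
  }

  /** Metric getter; safe to call from a metrics thread without the lock. */
  long getResponseQueueSize() {
    return responseQueueBytes.get();
  }
}
```

Keeping the counter in an AtomicLong lets a metrics reporter read it without contending with the responder's lock.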
[jira] [Commented] (HBASE-4608) HLog Compression
[ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227984#comment-13227984 ] Ted Yu commented on HBASE-4608: --- For code specific review, please use https://reviews.apache.org/r/4185/ where there would be context. I can add WAL_VERSION as v2 in the metadata. My question is: would HLog v2 be allowed not to compress Log entries ? If desirable, we can discuss in more detail, face to face, on the 27th. HLog Compression Key: HBASE-4608 URL: https://issues.apache.org/jira/browse/HBASE-4608 Project: HBase Issue Type: New Feature Reporter: Li Pi Assignee: Li Pi Fix For: 0.94.0 Attachments: 4608-v19.txt, 4608-v20.txt, 4608-v22.txt, 4608v1.txt, 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt The current bottleneck to HBase write speed is replicating the WAL appends across different datanodes. We can speed up this process by compressing the HLog. Current plan involves using a dictionary to compress table name, region id, cf name, and possibly other bits of repeated data. Also, HLog format may be changed in other ways to produce a smaller HLog. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
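The question above (may a v2 HLog leave its entries uncompressed?) comes down to treating the version and the compression type as two independent pieces of metadata. A reader-side check under that assumption could look like the sketch below. The key names, the "DICTIONARY" and "NONE" values, and the `WalHeaderSketch` class are all hypothetical; the thread had not settled the actual metadata design.

```java
import java.util.Map;

// Hypothetical metadata keys and values; the real HLog header may differ.
class WalHeaderSketch {
  static final String WAL_VERSION = "WAL_VERSION";
  static final String COMPRESSION_TYPE = "COMPRESSION_TYPE";

  /**
   * Returns true only if the file declares version 2 or later AND explicitly
   * names dictionary compression. A v2 file without the compression key is
   * read as uncompressed.
   */
  static boolean useDictionary(Map<String, String> metadata) {
    int version = Integer.parseInt(metadata.getOrDefault(WAL_VERSION, "1"));
    String compression = metadata.getOrDefault(COMPRESSION_TYPE, "NONE");
    return version >= 2 && "DICTIONARY".equals(compression);
  }
}
```

Under this reading the version gates which metadata keys a reader may expect, while the compression key alone decides whether the dictionary is applied.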
[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227983#comment-13227983 ]

Hudson commented on HBASE-5399:
---
Integrated in HBase-TRUNK #2676 (See [https://builds.apache.org/job/HBase-TRUNK/2676/])
HBASE-5399 Cut the link between the client and the zookeeper ensemble (Revision 1299872)

Result = FAILURE
stack :
Files :
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/MasterNotRunningException.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HConnection.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HTable.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/MasterKeepAliveConnection.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/ZooKeeperKeepAliveConnection.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/replication/ReplicationPeer.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/replication/ReplicationZookeeper.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/Merge.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/RecoverableZooKeeper.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/RootRegionTracker.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestZooKeeper.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestDeadServer.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestRollingRestart.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/replication/TestReplicationPeer.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/rest/client/TestRemoteTable.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/util/TestMergeTool.java

Cut the link between the client and the zookeeper ensemble
--
Key: HBASE-5399
URL: https://issues.apache.org/jira/browse/HBASE-5399
Project: HBase
Issue Type: Improvement
Components: client
Affects Versions: 0.94.0
Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
Fix For: 0.96.0
Attachments: 5399.v27.patch, 5399.v38.patch, 5399.v39.patch, 5399.v40.patch, 5399.v41.patch, 5399.v42.patch, 5399.v42.patch, 5399.v42.patch, 5399.v42.patch, 5399_inprogress.patch, 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 5399_inprogress.v9.patch, nochange.patch

The link is often considered an issue, for various reasons. One of them is that there is a limit on the number of connections that ZK can manage. Stack was also suggesting removing the link to the master from HConnection.

There are choices to be made considering the existing API (that we don't want to break).

The first patches I will submit on hadoop-qa should not be committed: they are here to show the progress on the direction taken.

ZooKeeper is used for:
- public getter, to let the client do whatever he wants, and close ZooKeeper when closing the connection => we have to deprecate this but keep it.
- read the master address to create a master => now done with a temporary zookeeper connection
- read root location => now done with a temporary zookeeper connection, but questionable. Used in public function locateRegion. To be reworked.
- read cluster id => now done once with a temporary zookeeper connection.
- check if base node is available => now done once with a zookeeper connection given as a parameter
- isTableDisabled/isTableAvailable => public functions, now done with a temporary zookeeper connection.
  - Called internally from HBaseAdmin and HTable
- getCurrentNrHRS(): public function to get the number of region servers and create a pool of threads => now done with a
[jira] [Commented] (HBASE-5542) Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()
[ https://issues.apache.org/jira/browse/HBASE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227985#comment-13227985 ]

Hadoop QA commented on HBASE-5542:
--
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12518084/HBASE-5542.D2217.9.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 4 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 159 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.regionserver.TestAtomicOperation

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1168//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1168//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1168//console

This message is automatically generated.

Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()
Key: HBASE-5542
URL: https://issues.apache.org/jira/browse/HBASE-5542
Project: HBase
Issue Type: Improvement
Reporter: Scott Chen
Assignee: Scott Chen
Fix For: 0.96.0
Attachments: HBASE-5542.D2217.1.patch, HBASE-5542.D2217.2.patch, HBASE-5542.D2217.3.patch, HBASE-5542.D2217.4.patch, HBASE-5542.D2217.5.patch, HBASE-5542.D2217.6.patch, HBASE-5542.D2217.7.patch, HBASE-5542.D2217.8.patch, HBASE-5542.D2217.9.patch

mutateRowsWithLocks() does atomic mutations on multiple rows.
processRow() does atomic read-modify-writes on a single row.
It would be useful to generalize both into a processRowsWithLocks() that does atomic read-modify-writes on multiple rows.
This also helps reduce some redundancy in the code.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
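To illustrate what an atomic read-modify-write over several rows might look like, here is a minimal in-memory sketch. It is not the HBASE-5542 patch and does not use HRegion: the class `MultiRowSketch`, its `processRowsWithLocks` signature, and the `transfer` helper are hypothetical, and acquiring row locks in sorted key order (to avoid deadlock between overlapping callers) is an assumption of this sketch rather than something stated in the issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Consumer;

/** Hypothetical in-memory stand-in for a region; not HBase's HRegion. */
final class MultiRowSketch {
    private final Map<String, Long> rows = new ConcurrentHashMap<>();
    private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    /** Locks every requested row in sorted key order, then runs the read-modify-write. */
    void processRowsWithLocks(Iterable<String> rowKeys, Consumer<Map<String, Long>> processor) {
        TreeSet<String> sorted = new TreeSet<>();
        for (String key : rowKeys) {
            sorted.add(key);
        }
        List<ReentrantLock> held = new ArrayList<>();
        try {
            for (String key : sorted) {
                ReentrantLock lock = locks.computeIfAbsent(key, k -> new ReentrantLock());
                lock.lock();
                held.add(lock);
            }
            // All row locks are held here, so the processor sees a consistent view.
            processor.accept(rows);
        } finally {
            // Release in reverse acquisition order.
            for (int i = held.size() - 1; i >= 0; i--) {
                held.get(i).unlock();
            }
        }
    }

    long get(String row) {
        return rows.getOrDefault(row, 0L);
    }

    /** Example read-modify-write touching two rows under one set of locks. */
    void transfer(String from, String to, long amount) {
        processRowsWithLocks(Arrays.asList(from, to), data -> {
            data.merge(from, -amount, Long::sum);
            data.merge(to, amount, Long::sum);
        });
    }
}
```

Two callers that touch overlapping rows through `processRowsWithLocks` serialize on the shared row locks; code that bypasses the method gets no such guarantee.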
[jira] [Updated] (HBASE-5565) Refactoring doMiniBatchPut()
[ https://issues.apache.org/jira/browse/HBASE-5565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-5565:
-
Fix Version/s: 0.96.0

Refactoring doMiniBatchPut()
Key: HBASE-5565
URL: https://issues.apache.org/jira/browse/HBASE-5565
Project: HBase
Issue Type: Improvement
Reporter: Scott Chen
Assignee: Lars Hofhansl
Fix For: 0.96.0

HBASE-5542 introduces a RowProcessor interface that can handle multi-row transactions.
It would make the code cleaner if doMiniBatchPut() could be covered by RowProcessor.
[jira] [Commented] (HBASE-5542) Unify HRegion.mutateRowsWithLocks() and HRegion.processRow()
[ https://issues.apache.org/jira/browse/HBASE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227988#comment-13227988 ]

Phabricator commented on HBASE-5542:

sc has commented on the revision HBASE-5542 [jira] Unify HRegion.mutateRowsWithLocks() and HRegion.processRow().

@Lars: I agree. The internal calls should never create threads for this. I will make the change.

INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:485 The timeout is set here.

REVISION DETAIL
https://reviews.facebook.net/D2217
[jira] [Reopened] (HBASE-4542) add filter info to slow query logging
[ https://issues.apache.org/jira/browse/HBASE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl reopened HBASE-4542:
--
Reopening so I won't forget about the 0.94 part.

add filter info to slow query logging
-
Key: HBASE-4542
URL: https://issues.apache.org/jira/browse/HBASE-4542
Project: HBase
Issue Type: Improvement
Affects Versions: 0.89.20100924
Reporter: Kannan Muthukkaruppan
Assignee: Madhuwanti Vaidya
Fix For: 0.94.0, 0.96.0
Attachments: 0001-jira-HBASE-4542-Add-filter-info-to-slow-query-loggin.patch, Add-filter-info-to-slow-query-logging-2012-03-06_14_28_13.patch, D1263.2.patch, D1539.1.patch

Slow query log doesn't report filters in effect. For example:

{code}
(operationTooSlow): \
{processingtimems:3468,client:10.138.43.206:40035,timeRange: [0,9223372036854775807],\
starttimems:1317772005821,responsesize:42411, \
class:HRegionServer,table:myTable,families:{CF1:ALL]},\
row:6c3b8efa132f0219b7621ed1e5c8c70b,queuetimems:0,\
method:get,totalColumns:1,maxVersions:1,storeLimit:-1}
{code}

The above would suggest that all columns of myTable:CF1 are being requested for the given row. But in reality there could be filters in effect (such as ColumnPrefixFilter, ColumnRangeFilter, TimestampsFilter() etc.). We should enhance the slow query log to capture and report this information.
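As a rough illustration of the request, the sketch below adds a filter description to the fields of a slow-operation log entry. Everything in it is hypothetical: `SlowQueryLogSketch`, `fingerprint`, `logLine`, and the `filter` field name are made up for this example and are not taken from the attached patches or from the HBase API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; not the HBase slow query logging code. */
final class SlowQueryLogSketch {

    /** Collects the fields to log for a Get; filter is null when no filter is set. */
    static Map<String, Object> fingerprint(String table, String family, String row, String filter) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("table", table);
        fields.put("families", family);
        fields.put("row", row);
        if (filter != null) {
            // Without this entry the log would suggest the whole family was requested.
            fields.put("filter", filter);
        }
        return fields;
    }

    /** Formats one slow-operation line from the collected fields. */
    static String logLine(Map<String, Object> fields, long processingTimeMs) {
        return "(operationTooSlow): processingtimems:" + processingTimeMs + "," + fields;
    }
}
```

With a filter description in the entry, a reader of the log can tell a full-family read apart from a filtered one.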
[jira] [Commented] (HBASE-4542) add filter info to slow query logging
[ https://issues.apache.org/jira/browse/HBASE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227991#comment-13227991 ]

Lars Hofhansl commented on HBASE-4542:
--
Please... Do not mark an issue as fixed if it is targeted to multiple versions but not all versions are committed. Pretty please? :)
[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13227997#comment-13227997 ]

Ted Yu commented on HBASE-5399:
---
TestAtomicOperation failed in the latest TRUNK build:
https://builds.apache.org/job/HBase-TRUNK/2676/testReport/org.apache.hadoop.hbase.regionserver/TestAtomicOperation/testMultiRowMutationMultiThreads/

A similar failure shows up in the latest Hadoop QA run of HBASE-5542.

Cut the link between the client and the zookeeper ensemble
--
Key: HBASE-5399
URL: https://issues.apache.org/jira/browse/HBASE-5399
Project: HBase
Issue Type: Improvement
Components: client
Affects Versions: 0.94.0
Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
Fix For: 0.96.0
Attachments: 5399.v27.patch, 5399.v38.patch, 5399.v39.patch, 5399.v40.patch, 5399.v41.patch, 5399.v42.patch, 5399.v42.patch, 5399.v42.patch, 5399.v42.patch, 5399_inprogress.patch, 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 5399_inprogress.v9.patch, nochange.patch

The link is often considered an issue, for various reasons. One of them is that there is a limit on the number of connections that ZK can manage. Stack was also suggesting removing the link to the master from HConnection.

There are choices to be made considering the existing API (that we don't want to break).

The first patches I will submit on hadoop-qa should not be committed: they are here to show the progress on the direction taken.

ZooKeeper is used for:
- public getter, to let the client do whatever he wants, and close ZooKeeper when closing the connection => we have to deprecate this but keep it.
- read the master address to create a master => now done with a temporary zookeeper connection
- read root location => now done with a temporary zookeeper connection, but questionable. Used in public function locateRegion. To be reworked.
- read cluster id => now done once with a temporary zookeeper connection.
- check if base node is available => now done once with a zookeeper connection given as a parameter
- isTableDisabled/isTableAvailable => public functions, now done with a temporary zookeeper connection.
  - Called internally from HBaseAdmin and HTable
- getCurrentNrHRS(): public function to get the number of region servers and create a pool of threads => now done with a temporary zookeeper connection

Master is used for:
- getMaster public getter, as for ZooKeeper => we have to deprecate this but keep it.
- isMasterRunning(): public function, used internally by HMerge and HBaseAdmin
- getHTableDescriptor*: public functions offering access to the master. => we could make them use a temporary master connection as well.

Main points are:
- the HBase class for ZooKeeper, ZooKeeperWatcher, is really designed for a strongly coupled architecture ;-). This can be changed, but requires a lot of modifications in these classes (likely adding a class in the middle of the hierarchy, something like that). Anyway, a non-connected client will always be much slower, because establishing a TCP connection is slow.
- having a link between ZK and all the clients seems to make sense for some use cases. However, it won't scale if a TCP connection is required for every client.
- if we move the table descriptor part away from the client, we need to find a new place for it.
- we will have the same issue with HBaseAdmin (for both ZK and Master); maybe we can put a timeout on the connection. That would make the whole system less deterministic, however.
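The trade-off between a temporary connection per call and a connection with a timeout can be sketched as a shared connection that is closed only after it has been idle for a while. This is a hypothetical illustration, not the MasterKeepAliveConnection or ZooKeeperKeepAliveConnection classes from the patch: `KeepAliveSketch`, its method names, and the explicit `nowMs` clock parameter (used so the behaviour is deterministic) are all assumptions of this example.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Supplier;

/** Hypothetical sketch of a shared connection closed after an idle period. */
final class KeepAliveSketch<C extends Closeable> {
    private final Supplier<C> factory;
    private final long keepAliveMs;
    private C connection;   // null while no connection is open
    private int users;      // callers currently holding the connection
    private long idleSince; // time of the release that left it unused

    KeepAliveSketch(Supplier<C> factory, long keepAliveMs) {
        this.factory = factory;
        this.keepAliveMs = keepAliveMs;
    }

    /** Returns the shared connection, opening it on first use. */
    synchronized C acquire() {
        if (connection == null) {
            connection = factory.get();
        }
        users++;
        return connection;
    }

    /** Marks one caller as done; the connection stays open for reuse. */
    synchronized void release(long nowMs) {
        if (users > 0 && --users == 0) {
            idleSince = nowMs;
        }
    }

    /** Closes the connection if nobody holds it and the keep-alive has expired. */
    synchronized void closeIfIdle(long nowMs) {
        if (connection == null || users > 0 || nowMs - idleSince < keepAliveMs) {
            return;
        }
        try {
            connection.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            connection = null;
        }
    }

    synchronized boolean isOpen() {
        return connection != null;
    }
}
```

Callers that arrive within the keep-alive window reuse the open connection and avoid the TCP setup cost, while an idle client eventually releases its slot on the server, at the price of the less deterministic behaviour the description mentions.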
[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13228009#comment-13228009 ]

Ted Yu commented on HBASE-5399:
---
From test output: {code} Exception in thread Thread-211 junit.framework.AssertionFailedError at junit.framework.Assert.fail(Assert.java:48) at junit.framework.Assert.fail(Assert.java:56) at org.apache.hadoop.hbase.regionserver.TestAtomicOperation$2.run(TestAtomicOperation.java:392) {code}

Here is related code in test: {code} if (r.size() != 1) { LOG.debug(r); failures.incrementAndGet(); fail(); } {code}
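The "Exception in thread Thread-211" line in the test output reflects a general Java behaviour: an error thrown by fail() inside a worker thread ends only that thread, so the thread running the test has to learn about it some other way, which is presumably why the test also increments a shared counter. The sketch below demonstrates that pattern with plain threads; `WorkerFailureSketch` and its `run` method are hypothetical and are not code from TestAtomicOperation.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical demo of reporting worker-thread failures through a shared counter. */
final class WorkerFailureSketch {

    /**
     * Starts the given number of workers; the first {@code failing} of them
     * simulate a bad result. Returns the number of recorded failures.
     */
    static int run(int workers, int failing) {
        AtomicInteger failures = new AtomicInteger();
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            final boolean bad = i < failing;
            threads[i] = new Thread(() -> {
                if (bad) {
                    failures.incrementAndGet();
                    // Ends this worker only; the calling thread is not interrupted.
                    throw new AssertionError("unexpected result size");
                }
            });
            // Keep the demo quiet; the default handler would print a stack trace.
            threads[i].setUncaughtExceptionHandler((t, e) -> { });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        // The caller decides pass or fail from the counter, not from the workers' errors.
        return failures.get();
    }
}
```

The calling thread returns normally even when workers throw, so a test built this way only fails if it checks the counter after joining the workers.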
[jira] [Commented] (HBASE-5567) test-patch.sh has logic error in findbugs check
[ https://issues.apache.org/jira/browse/HBASE-5567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13228012#comment-13228012 ]

Hudson commented on HBASE-5567:
---
Integrated in HBase-0.94 #27 (See [https://builds.apache.org/job/HBase-0.94/27/])
HBASE-5567 test-patch.sh has logic error in findbugs check (Revision 1299893)

Result = SUCCESS
stack :
Files :
* /hbase/branches/0.94/dev-support/test-patch.sh

test-patch.sh has logic error in findbugs check
---
Key: HBASE-5567
URL: https://issues.apache.org/jira/browse/HBASE-5567
Project: HBase
Issue Type: Bug
Components: scripts
Affects Versions: 0.94.0, 0.96.0
Reporter: David S. Wang
Assignee: David S. Wang
Priority: Trivial
Labels: noob
Fix For: 0.94.0, 0.96.0
Attachments: HBASE_5567.patch

Similar bug to HBASE-5562. The wrong operator is used in the check against $OK_FINDBUGS_WARNINGS.