[jira] [Commented] (HBASE-5529) MR test failures because MALLOC_ARENA_MAX is not set
[ https://issues.apache.org/jira/browse/HBASE-5529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223076#comment-13223076 ] Hadoop QA commented on HBASE-5529:
--
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12517201/HBASE-5529-trunk.patch against trunk revision .
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
-1 javadoc. The javadoc tool appears to have generated -129 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 154 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.mapreduce.TestImportTsv
org.apache.hadoop.hbase.mapred.TestTableMapReduce
org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
org.apache.hadoop.hbase.master.TestSplitLogManager
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1112//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1112//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1112//console
This message is automatically generated.
MR test failures because MALLOC_ARENA_MAX is not set
--
Key: HBASE-5529
URL: https://issues.apache.org/jira/browse/HBASE-5529
Project: HBase
Issue Type: Bug
Components: mapreduce, test
Affects Versions: 0.92.0
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Minor
Attachments: HBASE-5529-to92.patch, HBASE-5529-trunk.patch

When running unit tests on CentOS 6 I get a bunch of unit test failures in mapreduce-related tests due to:
2012-03-03 00:14:18,776 WARN [Container Monitor] monitor.ContainersMonitorImpl$MonitoringThread(436): Container [pid=21446,containerID=container_1330762435849_0002_01_01] is running beyond virtual memory limits. Current usage: 223.1mb of 2.0gb physical memory used; 6.9gb of 4.2gb virtual memory used. Killing container.
Note: this also came up in the mapreduce project. See: https://issues.apache.org/jira/browse/MAPREDUCE-3933
Patch coming shortly

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
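The workaround, as in MAPREDUCE-3933, is to cap the number of glibc malloc arenas so the container's virtual-memory footprint stays under the NodeManager's vmem limit. A minimal sketch of the environment change (the file placement and the value 4 are assumptions; adjust for your setup):

```shell
# glibc >= 2.10 (as on CentOS 6) creates up to 8 * ncores malloc arenas per
# process, which inflates *virtual* memory far beyond what the vmem check
# expects. Capping the arena count keeps the vmem footprint sane.
# Hypothetical placement: conf/hbase-env.sh or the MR task launch environment.
export MALLOC_ARENA_MAX=4
```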
[jira] [Created] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
--
Key: HBASE-5531
URL: https://issues.apache.org/jira/browse/HBASE-5531
Project: HBase
Issue Type: Bug
Components: build
Affects Versions: 0.92.2
Reporter: Laxman

The current profile still points to 0.23.1-SNAPSHOT. The build fails because 0.23.1 has already been released and its snapshot is no longer available. We can update this to 0.23.2-SNAPSHOT.
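The fix amounts to bumping the Hadoop version that the 0.23 build profile resolves. A hedged sketch of the pom.xml change (the profile id and property name are assumptions based on HBase builds of that era, not copied from the attached patch):

```xml
<!-- Sketch only: point the hadoop-0.23 profile at the still-published
     snapshot. Profile/property names here are assumed, not verified. -->
<profile>
  <id>hadoop-0.23</id>
  <properties>
    <hadoop.version>0.23.2-SNAPSHOT</hadoop.version>
  </properties>
</profile>
```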
[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223089#comment-13223089 ] nkeywal commented on HBASE-5399:
TestRegionRebalancing: seems to be a flaky test. Will retry on Hadoop-QA, but I don't reproduce it here. With the 7s sleep (i.e. the same sleep as before), I don't reproduce it. I will try to understand why this sleep changes the result, but in any case it's not a regression. So I think this patch is a good candidate for a commit. Further enhancements (clusterId, ZK watcher replacement by simple calls) could be put in another JIRA.

Cut the link between the client and the zookeeper ensemble
--
Key: HBASE-5399
URL: https://issues.apache.org/jira/browse/HBASE-5399
Project: HBase
Issue Type: Improvement
Components: client
Affects Versions: 0.94.0
Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
Attachments: 5399.v27.patch, 5399.v38.patch, 5399.v39.patch, 5399_inprogress.patch, 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 5399_inprogress.v9.patch

The link is often considered an issue, for various reasons. One of them is that there is a limit on the number of connections that ZK can manage. Stack was suggesting as well to remove the link to the master from HConnection. There are choices to be made considering the existing API (which we don't want to break). The first patches I will submit on hadoop-qa should not be committed: they are here to show the progress on the direction taken.

ZooKeeper is used for:
- public getter, to let the client do whatever he wants, and close ZooKeeper when closing the connection => we have to deprecate this but keep it.
- read the master address to create a master => now done with a temporary zookeeper connection
- read the root location => now done with a temporary zookeeper connection, but questionable. Used in the public function locateRegion. To be reworked.
- read the cluster id => now done once with a temporary zookeeper connection.
- check if the base node is available => now done once with a zookeeper connection given as a parameter
- isTableDisabled/isTableAvailable => public functions, now done with a temporary zookeeper connection. Called internally from HBaseAdmin and HTable
- getCurrentNrHRS(): public function to get the number of region servers and create a pool of threads => now done with a temporary zookeeper connection

Master is used for:
- getMaster: public getter, as for ZooKeeper => we have to deprecate this but keep it.
- isMasterRunning(): public function, used internally by HMerge and HBaseAdmin
- getHTableDescriptor*: public functions offering access to the master => we could make them use a temporary master connection as well.

Main points are:
- hbase class for ZooKeeper; ZooKeeperWatcher is really designed for a strongly coupled architecture ;-). This can be changed, but requires a lot of modifications in these classes (likely adding a class in the middle of the hierarchy, something like that). Anyway, a non-connected client will always be really slower, because it's a tcp connection, and establishing a tcp connection is slow.
- having a link between ZK and all the clients seems to make sense for some use cases. However, it won't scale if a TCP connection is required for every client
- if we move the table descriptor part away from the client, we need to find a new place for it.
- we will have the same issue with HBaseAdmin (for both ZK and Master); maybe we can put a timeout on the connection. That would make the whole system less deterministic however.
[jira] [Commented] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223092#comment-13223092 ] Hadoop QA commented on HBASE-5074:
--
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12517204/D1521.13.patch against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 55 new or modified tests.
-1 javadoc. The javadoc tool appears to have generated -125 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 158 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.coprocessor.TestMasterObserver
org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
org.apache.hadoop.hbase.mapred.TestTableMapReduce
org.apache.hadoop.hbase.mapreduce.TestImportTsv
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1113//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1113//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1113//console
This message is automatically generated.
support checksums in HBase block cache
--
Key: HBASE-5074
URL: https://issues.apache.org/jira/browse/HBASE-5074
Project: HBase
Issue Type: Improvement
Components: regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Fix For: 0.94.0
Attachments: D1521.1.patch, D1521.1.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.11.patch, D1521.11.patch, D1521.12.patch, D1521.12.patch, D1521.13.patch, D1521.13.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch, D1521.9.patch, D1521.9.patch

The current implementation of HDFS stores the data in one block file and the metadata (checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops, one to the data file and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage hardware offers.
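The idea above can be illustrated in a few lines: if the checksum is stored inline with the block, a single read fetches both data and checksum, halving the iops versus HDFS's separate checksum file. This is only a sketch of the principle, not the actual patch (which works at the HFile block level); the class and method names are invented:

```java
// Illustrative sketch of inline checksums: append a CRC32 to the payload so
// one disk read returns both data and checksum. Names here are hypothetical.
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

public class InlineChecksumBlock {

    /** Append a CRC32 of the payload so data + checksum travel in one read. */
    public static byte[] withChecksum(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        ByteBuffer out = ByteBuffer.allocate(data.length + 8);
        out.put(data);
        out.putLong(crc.getValue());
        return out.array();
    }

    /** Verify a block fetched in a single iop; no second seek needed. */
    public static boolean verify(byte[] block) {
        ByteBuffer in = ByteBuffer.wrap(block);
        byte[] data = new byte[block.length - 8];
        in.get(data);
        long stored = in.getLong();
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue() == stored;
    }

    public static void main(String[] args) {
        byte[] block = withChecksum("hbase-block".getBytes());
        System.out.println(verify(block)); // prints true
        block[0] ^= 1;                     // corrupt one byte
        System.out.println(verify(block)); // prints false
    }
}
```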
[jira] [Created] (HBASE-5532) get NPE during MajorCompactionChecker
get NPE during MajorCompactionChecker
--
Key: HBASE-5532
URL: https://issues.apache.org/jira/browse/HBASE-5532
Project: HBase
Issue Type: Bug
Components: regionserver
Reporter: terry zhang

We found the error log (NullPointerException) below on our online cluster:
2012-03-05 00:17:09,592 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer$MajorCompactionChecker: Caught exception
java.lang.NullPointerException
at org.apache.hadoop.hbase.regionserver.Store.isMajorCompaction(Store.java:878)
at org.apache.hadoop.hbase.regionserver.Store.isMajorCompaction(Store.java:857)
at org.apache.hadoop.hbase.regionserver.HRegion.isMajorCompaction(HRegion.java:3017)
at org.apache.hadoop.hbase.regionserver.HRegionServer$MajorCompactionChecker.chore(HRegionServer.java:1172)
at org.apache.hadoop.hbase.Chore.run(Chore.java:66)

After checking the code, we found that although isMajorCompaction() already checks whether the store files have a null reader at the beginning of the function, there is still some possibility that the reader is closed before it returns (e.g. by a minor compaction). So we need to check the store file reader before we use it to avoid this NPE.
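The defensive pattern the reporter proposes can be sketched as follows: take one snapshot of the reader reference and re-check it at each use, since a concurrent compaction can close the file between the initial check and the access. The classes below are invented stand-ins, not HBase's real Store/StoreFile types:

```java
// Hypothetical sketch of the NPE fix: guard every use of a store file's
// reader, not just the first check, because a concurrent compaction may
// close the file (nulling its reader) in between.
import java.util.ArrayList;
import java.util.List;

public class MajorCompactionCheck {

    /** Stand-in for StoreFile.Reader. */
    static class Reader {
        final long entries;
        Reader(long entries) { this.entries = entries; }
    }

    /** Stand-in for StoreFile; a close() elsewhere sets reader to null. */
    static class StoreFile {
        volatile Reader reader;
        StoreFile(Reader r) { this.reader = r; }
        Reader getReader() { return reader; }
    }

    /** Sum entries, skipping files whose reader was closed under us. */
    public static long countEntries(List<StoreFile> files) {
        long total = 0;
        for (StoreFile f : files) {
            Reader r = f.getReader(); // one snapshot of the volatile field
            if (r == null) {
                continue;             // closed concurrently: skip, no NPE
            }
            total += r.entries;
        }
        return total;
    }

    public static void main(String[] args) {
        List<StoreFile> files = new ArrayList<>();
        files.add(new StoreFile(new Reader(10)));
        files.add(new StoreFile(null)); // simulates a mid-check close
        System.out.println(countEntries(files)); // prints 10
    }
}
```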
[jira] [Commented] (HBASE-5010) Filter HFiles based on TTL
[ https://issues.apache.org/jira/browse/HBASE-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223114#comment-13223114 ] Ramkrishna.S.Vasudevan commented on HBASE-5010:
---
Why are HBASE-5510 updates coming in this JIRA?
Regards
Ram

Filter HFiles based on TTL
--
Key: HBASE-5010
URL: https://issues.apache.org/jira/browse/HBASE-5010
Project: HBase
Issue Type: Bug
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Fix For: 0.94.0
Attachments: 5010.patch, D1017.1.patch, D1017.2.patch, D909.1.patch, D909.2.patch, D909.3.patch, D909.4.patch, D909.5.patch, D909.6.patch

In ScanWildcardColumnTracker we have
{code:java}
this.oldestStamp = EnvironmentEdgeManager.currentTimeMillis() - ttl;
...
private boolean isExpired(long timestamp) {
  return timestamp < oldestStamp;
}
{code}
but this time range filtering does not participate in HFile selection. In one real case this caused next() calls to time out because all KVs in a table got expired, but next() had to iterate over the whole table to find that out. We should be able to filter out those HFiles right away. I think a reasonable approach is to add a default timerange filter to every scan for a CF with a finite TTL and utilize the existing filtering in StoreFile.Reader.passesTimerangeFilter.
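The proposed approach — skip whole HFiles whose newest timestamp already falls outside the TTL window — can be sketched as below. The HFileInfo type and maxTimestamp field are invented for illustration; the real change would go through StoreFile.Reader.passesTimerangeFilter:

```java
// Illustrative sketch: prune store files whose newest cell is already
// expired, so a scan never opens them. Types/fields are hypothetical.
import java.util.ArrayList;
import java.util.List;

public class TtlFileFilter {

    static class HFileInfo {
        final String name;
        final long maxTimestamp; // newest cell timestamp recorded in the file
        HFileInfo(String name, long maxTimestamp) {
            this.name = name;
            this.maxTimestamp = maxTimestamp;
        }
    }

    /** Keep only files that can still contain unexpired cells. */
    public static List<HFileInfo> selectFiles(List<HFileInfo> files, long now, long ttl) {
        // Same cutoff as ScanWildcardColumnTracker: anything older is expired.
        long oldestStamp = now - ttl;
        List<HFileInfo> live = new ArrayList<>();
        for (HFileInfo f : files) {
            if (f.maxTimestamp >= oldestStamp) {
                live.add(f); // file may still hold live data
            }
        }
        return live;
    }

    public static void main(String[] args) {
        List<HFileInfo> files = new ArrayList<>();
        files.add(new HFileInfo("old.hfile", 1_000L));
        files.add(new HFileInfo("new.hfile", 9_000L));
        // now=10000, ttl=5000 => cutoff 5000: only new.hfile survives
        System.out.println(selectFiles(files, 10_000L, 5_000L).size()); // prints 1
    }
}
```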
[jira] [Updated] (HBASE-5010) Filter HFiles based on TTL
[ https://issues.apache.org/jira/browse/HBASE-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5010:
--
Comment: was deleted (was: Integrated in HBase-TRUNK-security #129 (See [https://builds.apache.org/job/HBase-TRUNK-security/129/]) HBASE-5010 Pass region info in LoadBalancer.randomAssignment(List<ServerName> servers) (Anoop Sam John) (Revision 1297155) Result = FAILURE tedyu : Files :
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/DefaultLoadBalancer.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/LoadBalancer.java )
[jira] [Updated] (HBASE-5010) Filter HFiles based on TTL
[ https://issues.apache.org/jira/browse/HBASE-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5010:
--
Comment: was deleted (was: Integrated in HBase-TRUNK #2672 (See [https://builds.apache.org/job/HBase-TRUNK/2672/]) HBASE-5010 Pass region info in LoadBalancer.randomAssignment(List<ServerName> servers) (Anoop Sam John) (Revision 1297155) Result = SUCCESS tedyu : Files :
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/DefaultLoadBalancer.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/LoadBalancer.java )
[jira] [Updated] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-5399:
---
Status: Open (was: Patch Available)
Attachments: 5399.v27.patch, 5399.v38.patch, 5399.v39.patch, 5399.v40.patch, 5399_inprogress.patch, 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 5399_inprogress.v9.patch
[jira] [Updated] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-5399:
---
Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-5399:
---
Attachment: 5399.v40.patch
[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223138#comment-13223138 ] Hadoop QA commented on HBASE-5399: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12517212/5399.v40.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 30 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -129 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 155 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1114//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1114//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1114//console This message is automatically generated. Cut the link between the client and the zookeeper ensemble -- Key: HBASE-5399 URL: https://issues.apache.org/jira/browse/HBASE-5399 Project: HBase Issue Type: Improvement Components: client Affects Versions: 0.94.0 Environment: all Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 5399.v27.patch, 5399.v38.patch, 5399.v39.patch, 5399.v40.patch, 5399_inprogress.patch, 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 5399_inprogress.v9.patch The link is often considered as an issue, for various reasons. 
One of them being that there is a limit on the number of connection that ZK can manage. Stack was suggesting as well to remove the link to master from HConnection. There are choices to be made considering the existing API (that we don't want to break). The first patches I will submit on hadoop-qa should not be committed: they are here to show the progress on the direction taken. ZooKeeper is used for: - public getter, to let the client do whatever he wants, and close ZooKeeper when closing the connection = we have to deprecate this but keep it. - read get master address to create a master = now done with a temporary zookeeper connection - read root location = now done with a temporary zookeeper connection, but questionable. Used in public function locateRegion. To be reworked. - read cluster id = now done once with a temporary zookeeper connection. - check if base done is available = now done once with a zookeeper connection given as a parameter - isTableDisabled/isTableAvailable = public functions, now done with a temporary zookeeper connection. - Called internally from HBaseAdmin and HTable - getCurrentNrHRS(): public function to get the number of region servers and create a pool of thread = now done with a temporary zookeeper connection - Master is used for: - getMaster public getter, as for ZooKeeper = we have to deprecate this but keep it. - isMasterRunning(): public function, used internally by HMerge HBaseAdmin - getHTableDescriptor*: public functions offering access to the master. = we could make them using a temporary master connection as well. Main points are: - hbase class for ZooKeeper; ZooKeeperWatcher is really designed for a strongly coupled architecture ;-). This can be changed, but requires a lot of modifications in these classes (likely adding a class in the middle of the hierarchy, something like that). Anyway, non connected client will always be really slower, because it's a tcp connection, and establishing a tcp connection is slow. 
- having a link between ZK and all the clients seems to make sense for some use cases. However, it won't scale if a TCP connection is required for every client
- if we move the table descriptor part away from the client, we need to find a new place for it
- we will have the same issue with HBaseAdmin (for both ZK and Master); maybe we can put a timeout on the connection. That would make the whole system less deterministic, however.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see:
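The "temporary zookeeper connection" pattern that the description above keeps referring to can be sketched as follows. This is an illustration only: `TransientZk`, `readOnce`, and the returned value are invented names, not the HBase or ZooKeeper API. The point is the life cycle — connect, read the one value needed (master address, cluster id, root location), close — so no long-lived session is held against the ensemble's connection limit, at the cost of paying the TCP-connection setup on each lookup.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a short-lived ZooKeeper session; only the
// open/read/close life cycle matters for this sketch.
class TransientZk implements AutoCloseable {
    static final AtomicInteger OPEN_SESSIONS = new AtomicInteger();
    TransientZk() { OPEN_SESSIONS.incrementAndGet(); }        // "connect"
    String read(String znode) { return "value-of-" + znode; } // e.g. master address
    @Override public void close() { OPEN_SESSIONS.decrementAndGet(); }
}

class TempConnectionSketch {
    // Each lookup pays the connection cost but holds no session afterwards.
    static String readOnce(String znode) {
        try (TransientZk zk = new TransientZk()) {
            return zk.read(znode);
        }
    }
}
```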
[jira] [Updated] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laxman updated HBASE-5531: -- Attachment: HBASE-5531-trunk.patch Patch for trunk Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Labels: build Attachments: HBASE-5531-trunk.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
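The fix described above amounts to bumping the Hadoop version referenced by the 23 profile in pom.xml. A sketch of the change follows; the profile id and property name are illustrative and the real pom.xml may differ:

```xml
<!-- Illustrative only: profile/property names may differ from the actual pom.xml -->
<profile>
  <id>hadoop-0.23</id>
  <properties>
    <!-- was 0.23.1-SNAPSHOT; 0.23.1 has been released, so its snapshot is gone -->
    <hadoop.version>0.23.2-SNAPSHOT</hadoop.version>
  </properties>
</profile>
```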
[jira] [Updated] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laxman updated HBASE-5531: -- Fix Version/s: 0.96.0 0.92.2 Status: Patch Available (was: Open) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Labels: build Fix For: 0.92.2, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laxman updated HBASE-5531: -- Attachment: HBASE-5531.patch Patch for 0.92 Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Labels: build Fix For: 0.92.2, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223190#comment-13223190 ] Hadoop QA commented on HBASE-5531: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12517222/HBASE-5531.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 javadoc. The javadoc tool appears to have generated -129 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 154 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.io.hfile.TestLruBlockCache org.apache.hadoop.hbase.replication.TestReplication org.apache.hadoop.hbase.mapreduce.TestImportTsv org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1115//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1115//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1115//console This message is automatically generated. 
Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Labels: build Fix For: 0.92.2, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223223#comment-13223223 ] Laxman commented on HBASE-5531: --- This patch involves build xml (pom.xml) changes only. Above -1s are irrelevant to the changes. Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Labels: build Fix For: 0.92.2, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223284#comment-13223284 ] Zhihong Yu commented on HBASE-5531: --- +1 on patch. Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Labels: build Fix For: 0.92.2, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5516) GZip leading to memory leak in 0.90. Fix similar to HBASE-5387 needed for 0.90.
[ https://issues.apache.org/jira/browse/HBASE-5516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laxman updated HBASE-5516: -- Attachment: HBASE-5516_2_0.90.patch Please review the patch and share your comments. GZip leading to memory leak in 0.90. Fix similar to HBASE-5387 needed for 0.90. Key: HBASE-5516 URL: https://issues.apache.org/jira/browse/HBASE-5516 Project: HBase Issue Type: Bug Affects Versions: 0.90.5 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 0.90.7 Attachments: HBASE-5516_2_0.90.patch Usage of GZip is leading to resident memory leak in 0.90. We need to have something similar to HBASE-5387 in 0.90. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5532) get NPE during MajorCompactionChecker
[ https://issues.apache.org/jira/browse/HBASE-5532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223291#comment-13223291 ] Zhihong Yu commented on HBASE-5532: --- Which version of HBase were you using? get NPE during MajorCompactionChecker -- Key: HBASE-5532 URL: https://issues.apache.org/jira/browse/HBASE-5532 Project: HBase Issue Type: Bug Components: regionserver Reporter: terry zhang Attachments: HBASE-5532.patch We found the error log (NullPointerException) below on our online cluster:
2012-03-05 00:17:09,592 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer$MajorCompactionChecker: Caught exception
java.lang.NullPointerException
        at org.apache.hadoop.hbase.regionserver.Store.isMajorCompaction(Store.java:878)
        at org.apache.hadoop.hbase.regionserver.Store.isMajorCompaction(Store.java:857)
        at org.apache.hadoop.hbase.regionserver.HRegion.isMajorCompaction(HRegion.java:3017)
        at org.apache.hadoop.hbase.regionserver.HRegionServer$MajorCompactionChecker.chore(HRegionServer.java:1172)
        at org.apache.hadoop.hbase.Chore.run(Chore.java:66)
After checking the code we found that although isMajorCompaction already checks at the beginning of the function whether the store files have a null reader, there is still some possibility that the reader is closed before the function returns (e.g. by a minor compaction). So we need to check the store file reader before we use it, to avoid this NPE. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
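The race described in HBASE-5532 above (the reader is null-checked at the top of isMajorCompaction but closed by a concurrent compaction before it is used) can be sketched generically. All class and method names below are hypothetical stand-ins, not the actual Store/StoreFile code; the point is to re-fetch and re-check the reader at each use rather than trusting an earlier null check:

```java
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for a store file whose reader can vanish at any time.
class SketchStoreFile {
    private volatile SketchReader reader;
    SketchStoreFile(SketchReader r) { this.reader = r; }
    SketchReader getReader() { return reader; }
    void closeReader() { reader = null; } // what a concurrent compaction may do
}

class SketchReader {
    final long entries;
    SketchReader(long entries) { this.entries = entries; }
}

class MajorCompactionCheckSketch {
    // Re-check the reader at each use; a single up-front null check is not enough.
    static Optional<Long> totalEntries(List<SketchStoreFile> files) {
        long sum = 0;
        for (SketchStoreFile f : files) {
            SketchReader r = f.getReader(); // grab once per use
            if (r == null) {
                return Optional.empty();    // bail out instead of hitting an NPE
            }
            sum += r.entries;
        }
        return Optional.of(sum);
    }
}
```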
[jira] [Commented] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223312#comment-13223312 ] ramkrishna.s.vasudevan commented on HBASE-5531: --- @Ted If ok i can commit this today? Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Labels: build Fix For: 0.92.2, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223323#comment-13223323 ] Zhihong Yu commented on HBASE-5531: --- I think so. Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Labels: build Fix For: 0.92.2, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5494) Introduce a zk hosted table-wide read/write lock so only one table operation at a time
[ https://issues.apache.org/jira/browse/HBASE-5494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223343#comment-13223343 ] ramkrishna.s.vasudevan commented on HBASE-5494: --- Is this similar to https://issues.apache.org/jira/browse/HBASE-5373 ? Introduce a zk hosted table-wide read/write lock so only one table operation at a time -- Key: HBASE-5494 URL: https://issues.apache.org/jira/browse/HBASE-5494 Project: HBase Issue Type: Improvement Reporter: stack I saw this facility over in the accumulo code base. Currently we just try to sort out the mess when splits come in during an online schema edit; somehow we figure we can figure all possible region transition combinations and make the right call. We could try and narrow the number of combinations by taking out a zk table lock when doing table operations. For example, on split or merge, we could take a read-only lock meaning the table can't be disabled while these are running. We could then take a write only lock if we want to ensure the table doesn't change while disabling or enabling process is happening. Shouldn't be too hard to add. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
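The semantics proposed above (splits and merges share a read lock; disable/enable needs the exclusive write lock) are exactly those of a standard read/write lock. A minimal in-process sketch with java.util.concurrent follows; the actual proposal hosts the lock in ZooKeeper so it spans processes, which this sketch does not attempt:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

class TableLockSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Split/merge: many may run together, but none while the table is being disabled.
    boolean tryStartSplit() { return lock.readLock().tryLock(); }
    void finishSplit()      { lock.readLock().unlock(); }

    // Disable/enable: exclusive; fails fast here while any split/merge is running.
    boolean tryStartDisable() { return lock.writeLock().tryLock(); }
    void finishDisable()      { lock.writeLock().unlock(); }
}
```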
[jira] [Resolved] (HBASE-5373) Table level lock to prevent the race of multiple table level operation
[ https://issues.apache.org/jira/browse/HBASE-5373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-5373. -- Resolution: Duplicate Resolving as duplicate. Liyin, you did it first so I should be resolving HBASE-5494 as a duplicate of this but HBASE-5494 has a little bit more going on. Hope you don't mind. Are you working on this? Table level lock to prevent the race of multiple table level operation -- Key: HBASE-5373 URL: https://issues.apache.org/jira/browse/HBASE-5373 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang A table level lock can guarantee that only one table operation would happen at one time for each table. The master should acquire and release these table locks correctly during the failover time. One proposal is to keep track of the lock and its corresponding operation in the zookeeper. If there is a master failover, the secondary should have a way to check whether these operations succeeded or not before releasing the lock. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5494) Introduce a zk hosted table-wide read/write lock so only one table operation at a time
[ https://issues.apache.org/jira/browse/HBASE-5494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223350#comment-13223350 ] stack commented on HBASE-5494: -- @Ram Yes sir. Thanks. Resolved hbase-5373 as duplicate of this. Introduce a zk hosted table-wide read/write lock so only one table operation at a time -- Key: HBASE-5494 URL: https://issues.apache.org/jira/browse/HBASE-5494 Project: HBase Issue Type: Improvement Reporter: stack I saw this facility over in the accumulo code base. Currently we just try to sort out the mess when splits come in during an online schema edit; somehow we figure we can figure all possible region transition combinations and make the right call. We could try and narrow the number of combinations by taking out a zk table lock when doing table operations. For example, on split or merge, we could take a read-only lock meaning the table can't be disabled while these are running. We could then take a write only lock if we want to ensure the table doesn't change while disabling or enabling process is happening. Shouldn't be too hard to add. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223379#comment-13223379 ] ramkrishna.s.vasudevan commented on HBASE-5531: --- To be committed to trunk, 0.94 and 0.92? Correct me if I am wrong. Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Labels: build Fix For: 0.92.2, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223401#comment-13223401 ] stack commented on HBASE-5531: -- +1 on patch and +1 on commit to all of the branches cited above. Thanks Ram. Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Labels: build Fix For: 0.92.2, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row
[ https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223422#comment-13223422 ] Phabricator commented on HBASE-5515: sc has commented on the revision HBASE-5515 [jira] Add a processRow API that supports atomic multiple reads and writes on a row. Thanks for the review, guys. I will update this soon. INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/coprocessor/ProcessRowEndpoint.java:41 I see. Now I understand what you mean. Yes, there should really be only one class. I will make the change. Thanks! src/main/java/org/apache/hadoop/hbase/coprocessor/RowProcessor.java:44 That looks better. I will make the change. src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:4791 Sorry for the mistake. I will fix it. REVISION DETAIL https://reviews.facebook.net/D2067 Add a processRow API that supports atomic multiple reads and writes on a row Key: HBASE-5515 URL: https://issues.apache.org/jira/browse/HBASE-5515 Project: HBase Issue Type: New Feature Reporter: Scott Chen Assignee: Scott Chen Attachments: HBASE-5515.D2067.1.patch, HBASE-5515.D2067.10.patch, HBASE-5515.D2067.11.patch, HBASE-5515.D2067.12.patch, HBASE-5515.D2067.13.patch, HBASE-5515.D2067.14.patch, HBASE-5515.D2067.2.patch, HBASE-5515.D2067.3.patch, HBASE-5515.D2067.4.patch, HBASE-5515.D2067.5.patch, HBASE-5515.D2067.6.patch, HBASE-5515.D2067.7.patch, HBASE-5515.D2067.8.patch, HBASE-5515.D2067.9.patch We have modified HRegion.java internally to do some atomic row processing. It will be nice to have a plugable API for this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5436) Right-size the map when reading attributes.
[ https://issues.apache.org/jira/browse/HBASE-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223446#comment-13223446 ] stack commented on HBASE-5436: -- Please commit to both Lars. Thanks. Right-size the map when reading attributes. --- Key: HBASE-5436 URL: https://issues.apache.org/jira/browse/HBASE-5436 Project: HBase Issue Type: Improvement Affects Versions: 0.92.0 Reporter: Benoit Sigoure Assignee: Benoit Sigoure Priority: Trivial Labels: performance Fix For: 0.94.0 Attachments: 0001-Right-size-the-map-when-reading-attributes.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223468#comment-13223468 ] ramkrishna.s.vasudevan commented on HBASE-5531: --- Committed to 0.92, 0.94 and trunk. Thanks for the patch Laxman. Thanks for the review Stack and Zhihong. Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Labels: build Fix For: 0.92.2, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223469#comment-13223469 ] ramkrishna.s.vasudevan commented on HBASE-5531: --- How to change assignee to Laxman ? Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Labels: build Fix For: 0.92.2, 0.94.0, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan updated HBASE-5531: -- Resolution: Fixed Fix Version/s: 0.94.0 Status: Resolved (was: Patch Available) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Labels: build Fix For: 0.92.2, 0.94.0, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5528) Retry splitting log if failed in the process of ServerShutdownHandler, and abort master when retries exhausted
[ https://issues.apache.org/jira/browse/HBASE-5528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223475#comment-13223475 ] Prakash Khemani commented on HBASE-5528: I think the log-splitting retry logic is there in ServerShutdownHandler. In ServerShutdownHandler.process() the handler is requeued in case of error:
{code}
try {
  if (this.shouldSplitHlog) {
    LOG.info("Splitting logs for " + serverName);
    this.services.getMasterFileSystem().splitLog(serverName);
  } else {
    LOG.info("Skipping log splitting for " + serverName);
  }
} catch (IOException ioe) {
  this.services.getExecutorService().submit(this);
  this.deadServers.add(serverName);
  throw new IOException("failed log splitting for " + serverName + ", will retry", ioe);
}
{code}
Retry splitting log if failed in the process of ServerShutdownHandler, and abort master when retries exhausted -- Key: HBASE-5528 URL: https://issues.apache.org/jira/browse/HBASE-5528 Project: HBase Issue Type: Bug Reporter: chunhui shen Assignee: chunhui shen Attachments: hbase-5528.patch, hbase-5528v2.patch We will retry splitting the log if it failed in splitLogAfterStartup when the master starts. However, there is no retry for failed log splitting in the process of ServerShutdownHandler. Also, if we finally fail to split the log, we should abort the master even if the filesystem is ok, to prevent data loss. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
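A bounded-retry-then-abort shape for the log splitting discussed above might look like the following. Names and the retry policy are illustrative; the real handler resubmits itself to the executor service rather than looping in place:

```java
// Illustrative retry-then-abort shape for log splitting; not the actual
// ServerShutdownHandler code, which requeues itself on an executor instead.
class SplitRetrySketch {
    interface LogSplitter { void splitLog(String server) throws Exception; }

    // Returns true if splitting succeeded within maxRetries attempts;
    // a false return is where the master would abort to avoid data loss.
    static boolean splitWithRetries(LogSplitter fs, String server, int maxRetries) {
        for (int attempt = 1; attempt <= maxRetries; attempt++) {
            try {
                fs.splitLog(server);
                return true;
            } catch (Exception e) {
                // log and retry; the real code resubmits the handler here
            }
        }
        return false; // retries exhausted -> abort master
    }
}
```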
[jira] [Commented] (HBASE-5010) Filter HFiles based on TTL
[ https://issues.apache.org/jira/browse/HBASE-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223490#comment-13223490 ] Mikhail Bautin commented on HBASE-5010: --- @Ram: I don't see any mentions of HBASE-5510 in this JIRA, except for your comment. What updates are you referring to? Filter HFiles based on TTL -- Key: HBASE-5010 URL: https://issues.apache.org/jira/browse/HBASE-5010 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Assignee: Mikhail Bautin Fix For: 0.94.0 Attachments: 5010.patch, D1017.1.patch, D1017.2.patch, D909.1.patch, D909.2.patch, D909.3.patch, D909.4.patch, D909.5.patch, D909.6.patch In ScanWildcardColumnTracker we have
{code:java}
this.oldestStamp = EnvironmentEdgeManager.currentTimeMillis() - ttl;
...
private boolean isExpired(long timestamp) {
  return timestamp < oldestStamp;
}
{code}
but this time range filtering does not participate in HFile selection. In one real case this caused next() calls to time out because all KVs in a table got expired, but next() had to iterate over the whole table to find that out. We should be able to filter out those HFiles right away. I think a reasonable approach is to add a default timerange filter to every scan for a CF with a finite TTL and utilize existing filtering in StoreFile.Reader.passesTimerangeFilter. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223493#comment-13223493 ] stack commented on HBASE-5531: -- I added him (See 'Administration' in JIRA. You should have access. Once in administration screens, look for people along the left.. the rest should be plain... bug me if you can't figure it). Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Labels: build Fix For: 0.92.2, 0.94.0, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5010) Filter HFiles based on TTL
[ https://issues.apache.org/jira/browse/HBASE-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223492#comment-13223492 ] ramkrishna.s.vasudevan commented on HBASE-5010: --- @Mikhail The commit-related updates that usually come up once a commit is done were appearing in this JIRA. But they were for HBASE-5510. Ted removed them as they were not related to this JIRA. Maybe that deleted part is what you are not able to view now. :) Sorry if the above comment confused you. Filter HFiles based on TTL -- Key: HBASE-5010 URL: https://issues.apache.org/jira/browse/HBASE-5010 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Assignee: Mikhail Bautin Fix For: 0.94.0 Attachments: 5010.patch, D1017.1.patch, D1017.2.patch, D909.1.patch, D909.2.patch, D909.3.patch, D909.4.patch, D909.5.patch, D909.6.patch In ScanWildcardColumnTracker we have {code:java} this.oldestStamp = EnvironmentEdgeManager.currentTimeMillis() - ttl; ... private boolean isExpired(long timestamp) { return timestamp < oldestStamp; } {code} but this time range filtering does not participate in HFile selection. In one real case this caused next() calls to time out because all KVs in a table got expired, but next() had to iterate over the whole table to find that out. We should be able to filter out those HFiles right away. I think a reasonable approach is to add a default timerange filter to every scan for a CF with a finite TTL and utilize existing filtering in StoreFile.Reader.passesTimerangeFilter. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan reassigned HBASE-5531: - Assignee: ramkrishna.s.vasudevan Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Assignee: ramkrishna.s.vasudevan Labels: build Fix For: 0.92.2, 0.94.0, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223496#comment-13223496 ] ramkrishna.s.vasudevan commented on HBASE-5531: --- Got it :).. Thanks Stack. Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Assignee: Laxman Labels: build Fix For: 0.92.2, 0.94.0, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan reassigned HBASE-5531: - Assignee: Laxman (was: ramkrishna.s.vasudevan) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Assignee: Laxman Labels: build Fix For: 0.92.2, 0.94.0, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5010) Filter HFiles based on TTL
[ https://issues.apache.org/jira/browse/HBASE-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223499#comment-13223499 ] Mikhail Bautin commented on HBASE-5010: --- Actually here is the reason for those confusing updates. Ted seems to have specified HBASE-5010 instead of HBASE-5510 in the commit message. commit 5d773d9fa176cb056b993fdff8a2853f75315ec8 Author: tedyu tedyu@13f79535-47bb-0310-9956-ffa450edef68 Date: Mon Mar 5 10:41:03 2012 HBASE-5010 Pass region info in LoadBalancer.randomAssignment(List<ServerName> servers) (Anoop Sam git-svn-id: http://svn.apache.org/repos/asf/hbase/trunk@1297155 13f79535-47bb-0310-9956-ffa450edef68 Filter HFiles based on TTL -- Key: HBASE-5010 URL: https://issues.apache.org/jira/browse/HBASE-5010 Project: HBase Issue Type: Bug Reporter: Mikhail Bautin Assignee: Mikhail Bautin Fix For: 0.94.0 Attachments: 5010.patch, D1017.1.patch, D1017.2.patch, D909.1.patch, D909.2.patch, D909.3.patch, D909.4.patch, D909.5.patch, D909.6.patch In ScanWildcardColumnTracker we have {code:java} this.oldestStamp = EnvironmentEdgeManager.currentTimeMillis() - ttl; ... private boolean isExpired(long timestamp) { return timestamp < oldestStamp; } {code} but this time range filtering does not participate in HFile selection. In one real case this caused next() calls to time out because all KVs in a table got expired, but next() had to iterate over the whole table to find that out. We should be able to filter out those HFiles right away. I think a reasonable approach is to add a default timerange filter to every scan for a CF with a finite TTL and utilize existing filtering in StoreFile.Reader.passesTimerangeFilter. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-5533) Add more metrics to HBase
Add more metrics to HBase - Key: HBASE-5533 URL: https://issues.apache.org/jira/browse/HBASE-5533 Project: HBase Issue Type: Improvement Affects Versions: 0.92.0 Reporter: Shaneal Manek Assignee: Shaneal Manek Priority: Minor To debug/monitor production clusters, there are some more metrics I wish I had available. In particular: - Although the average FS latencies are useful, a 'histogram' of recent latencies (90% of reads completed in under 100ms, 99% in under 200ms, etc) would be more useful - Similar histograms of latencies on common operations (GET, PUT, DELETE) would be useful - Counting the number of accesses to each region to detect hotspotting - Exposing the current number of HLog files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
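The "histogram of recent latencies" idea above can be sketched as follows. This is a minimal illustration, assuming a bounded sample of recent latencies; LatencySample and its methods are illustrative names, not part of the proposed patch or of HBase's metrics API.

```java
import java.util.Arrays;

// Keep a bounded sample of recent latencies and report percentiles over it,
// e.g. "90% of reads completed in under 100ms".
public class LatencySample {
    private final long[] buf;
    private int count;

    LatencySample(int capacity) { buf = new long[capacity]; }

    void record(long latencyMs) {
        // Drop samples once the buffer is full, like a bounded buffer would.
        if (count < buf.length) buf[count++] = latencyMs;
    }

    long percentile(double p) {
        long[] sorted = Arrays.copyOf(buf, count);
        Arrays.sort(sorted);
        int idx = (int) Math.ceil(p * count) - 1;
        return sorted[Math.max(idx, 0)];
    }

    public static void main(String[] args) {
        LatencySample s = new LatencySample(100);
        for (long v = 1; v <= 100; v++) s.record(v);
        System.out.println(s.percentile(0.90)); // 90
        System.out.println(s.percentile(0.99)); // 99
    }
}
```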
[jira] [Updated] (HBASE-5533) Add more metrics to HBase
[ https://issues.apache.org/jira/browse/HBASE-5533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shaneal Manek updated HBASE-5533: - Attachment: hbase-5533-0.92.patch Add more metrics to HBase - Key: HBASE-5533 URL: https://issues.apache.org/jira/browse/HBASE-5533 Project: HBase Issue Type: Improvement Affects Versions: 0.92.0 Reporter: Shaneal Manek Assignee: Shaneal Manek Priority: Minor Attachments: hbase-5533-0.92.patch To debug/monitor production clusters, there are some more metrics I wish I had available. In particular: - Although the average FS latencies are useful, a 'histogram' of recent latencies (90% of reads completed in under 100ms, 99% in under 200ms, etc) would be more useful - Similar histograms of latencies on common operations (GET, PUT, DELETE) would be useful - Counting the number of accesses to each region to detect hotspotting - Exposing the current number of HLog files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5533) Add more metrics to HBase
[ https://issues.apache.org/jira/browse/HBASE-5533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shaneal Manek updated HBASE-5533: - Status: Patch Available (was: Open) Added a first pass to trunk and the 0.92 branch Add more metrics to HBase - Key: HBASE-5533 URL: https://issues.apache.org/jira/browse/HBASE-5533 Project: HBase Issue Type: Improvement Affects Versions: 0.92.0 Reporter: Shaneal Manek Assignee: Shaneal Manek Priority: Minor Attachments: hbase-5533-0.92.patch To debug/monitor production clusters, there are some more metrics I wish I had available. In particular: - Although the average FS latencies are useful, a 'histogram' of recent latencies (90% of reads completed in under 100ms, 99% in under 200ms, etc) would be more useful - Similar histograms of latencies on common operations (GET, PUT, DELETE) would be useful - Counting the number of accesses to each region to detect hotspotting - Exposing the current number of HLog files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5533) Add more metrics to HBase
[ https://issues.apache.org/jira/browse/HBASE-5533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223516#comment-13223516 ] Hadoop QA commented on HBASE-5533: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12517268/hbase-5533-0.92.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1116//console This message is automatically generated. Add more metrics to HBase - Key: HBASE-5533 URL: https://issues.apache.org/jira/browse/HBASE-5533 Project: HBase Issue Type: Improvement Affects Versions: 0.92.0 Reporter: Shaneal Manek Assignee: Shaneal Manek Priority: Minor Attachments: hbase-5533-0.92.patch To debug/monitor production clusters, there are some more metrics I wish I had available. In particular: - Although the average FS latencies are useful, a 'histogram' of recent latencies (90% of reads completed in under 100ms, 99% in under 200ms, etc) would be more useful - Similar histograms of latencies on common operations (GET, PUT, DELETE) would be useful - Counting the number of accesses to each region to detect hotspotting - Exposing the current number of HLog files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5516) GZip leading to memory leak in 0.90. Fix similar to HBASE-5387 needed for 0.90.
[ https://issues.apache.org/jira/browse/HBASE-5516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223522#comment-13223522 ] Zhihong Yu commented on HBASE-5516: --- Can test results be described here ? {code} + if (this.compressAlgo.equals(Compression.Algorithm.GZ) && blockBegin > 0) { +blockBegin -= HEADER_SIZE; + } {code} What if blockBegin is > 0 but less than HEADER_SIZE ? {code} + if (compressionBos == null) { +if (this.compressAlgo.equals(Compression.Algorithm.GZ)) { + createCompressionStream(); +} + } {code} The nested if statements can be condensed into one if statement. GZip leading to memory leak in 0.90. Fix similar to HBASE-5387 needed for 0.90. Key: HBASE-5516 URL: https://issues.apache.org/jira/browse/HBASE-5516 Project: HBase Issue Type: Bug Affects Versions: 0.90.5 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 0.90.7 Attachments: HBASE-5516_2_0.90.patch Usage of GZip is leading to a resident memory leak in 0.90. We need to have something similar to HBASE-5387 in 0.90. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
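The review suggestion to condense the nested if statements can be illustrated as follows. The enum and class scaffolding here are stand-ins so the snippet is runnable, not the real HFile writer code; only the shape of the condition mirrors the quoted patch.

```java
public class CondenseIfSketch {
    enum Algorithm { GZ, NONE }

    static int createCalls = 0;
    static Object compressionBos = null;
    static Algorithm compressAlgo = Algorithm.GZ;

    // Condensed form of the nested ifs quoted in the review:
    //   if (compressionBos == null) { if (GZ) createCompressionStream(); }
    // becomes a single condition joined with &&.
    static void maybeCreateCompressionStream() {
        if (compressionBos == null && compressAlgo == Algorithm.GZ) {
            createCompressionStream();
        }
    }

    static void createCompressionStream() {
        createCalls++;
        compressionBos = new Object();
    }

    public static void main(String[] args) {
        maybeCreateCompressionStream();
        maybeCreateCompressionStream(); // no-op: the stream already exists
        System.out.println(createCalls); // 1
    }
}
```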
[jira] [Commented] (HBASE-5533) Add more metrics to HBase
[ https://issues.apache.org/jira/browse/HBASE-5533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223523#comment-13223523 ] stack commented on HBASE-5533: -- Did you mean the below: {code} -# hbase.class=org.apache.hadoop.hbase.metrics.file.TimeStampingFileContext -# hbase.period=10 -# hbase.fileName=/tmp/metrics_hbase.log +hbase.class=org.apache.hadoop.hbase.metrics.file.TimeStampingFileContext +hbase.period=10 +hbase.fileName=/tmp/metrics_hbase.log {code} Will there be a bunch of contention on these additions: {code} + static volatile BlockingQueue<Long> fsReadLatenciesNanos = new ArrayBlockingQueue<Long>(LATENCY_BUFFER_SIZE); {code} Could this fill the logs with thousands of repeated messages: {code} + if (!stored) { +LOG.warn("Dropping fs latency stat since buffer is full"); + } {code} Could we use the Cliff Click counters instead of AtomicLong? They are on the classpath IIRC: {code} + private final Map<String, AtomicLong> counts; {code} These additions would be great to have. Add more metrics to HBase - Key: HBASE-5533 URL: https://issues.apache.org/jira/browse/HBASE-5533 Project: HBase Issue Type: Improvement Affects Versions: 0.92.0 Reporter: Shaneal Manek Assignee: Shaneal Manek Priority: Minor Attachments: hbase-5533-0.92.patch To debug/monitor production clusters, there are some more metrics I wish I had available. In particular: - Although the average FS latencies are useful, a 'histogram' of recent latencies (90% of reads completed in under 100ms, 99% in under 200ms, etc) would be more useful - Similar histograms of latencies on common operations (GET, PUT, DELETE) would be useful - Counting the number of accesses to each region to detect hotspotting - Exposing the current number of HLog files -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
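The bounded-buffer pattern under discussion can be sketched as follows. Counting drops and logging the count periodically is one way to address the concern about flooding the logs with a WARN per dropped sample; everything here besides the fsReadLatenciesNanos and LATENCY_BUFFER_SIZE names quoted from the patch is an illustrative assumption.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// A bounded queue of latency samples: offer() fails (rather than blocks)
// when the buffer is full, and the drop is counted instead of logged.
public class BoundedLatencyBuffer {
    static final int LATENCY_BUFFER_SIZE = 4;
    final BlockingQueue<Long> fsReadLatenciesNanos =
        new ArrayBlockingQueue<>(LATENCY_BUFFER_SIZE);
    final AtomicLong dropped = new AtomicLong();

    void record(long nanos) {
        if (!fsReadLatenciesNanos.offer(nanos)) {
            dropped.incrementAndGet(); // count instead of warning on every drop
        }
    }

    public static void main(String[] args) {
        BoundedLatencyBuffer b = new BoundedLatencyBuffer();
        for (long i = 0; i < 10; i++) b.record(i);
        System.out.println(b.fsReadLatenciesNanos.size()); // 4
        System.out.println(b.dropped.get());               // 6
    }
}
```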
[jira] [Commented] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223530#comment-13223530 ] Hudson commented on HBASE-5531: --- Integrated in HBase-0.94 #16 (See [https://builds.apache.org/job/HBase-0.94/16/]) HBASE-5531 Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot (Laxman) (Revision 1297595) Result = SUCCESS ramkrishna : Files : * /hbase/branches/0.94/pom.xml Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Assignee: Laxman Labels: build Fix For: 0.92.2, 0.94.0, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch Current profile is still pointing to 0.23.1-SNAPSHOT. This is failing to build as 23.1 is already released and snapshot is not available anymore. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-5399: - Fix Version/s: 0.96.0 Too big for 0.94. Agreed? Cut the link between the client and the zookeeper ensemble -- Key: HBASE-5399 URL: https://issues.apache.org/jira/browse/HBASE-5399 Project: HBase Issue Type: Improvement Components: client Affects Versions: 0.94.0 Environment: all Reporter: nkeywal Assignee: nkeywal Priority: Minor Fix For: 0.96.0 Attachments: 5399.v27.patch, 5399.v38.patch, 5399.v39.patch, 5399.v40.patch, 5399_inprogress.patch, 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 5399_inprogress.v9.patch The link is often considered as an issue, for various reasons. One of them being that there is a limit on the number of connections that ZK can manage. Stack was suggesting as well to remove the link to master from HConnection. There are choices to be made considering the existing API (that we don't want to break). The first patches I will submit on hadoop-qa should not be committed: they are here to show the progress on the direction taken. ZooKeeper is used for: - public getter, to let the client do whatever he wants, and close ZooKeeper when closing the connection => we have to deprecate this but keep it. - read get master address to create a master => now done with a temporary zookeeper connection - read root location => now done with a temporary zookeeper connection, but questionable. Used in public function locateRegion. To be reworked. - read cluster id => now done once with a temporary zookeeper connection. - check if base node is available => now done once with a zookeeper connection given as a parameter - isTableDisabled/isTableAvailable => public functions, now done with a temporary zookeeper connection. 
- Called internally from HBaseAdmin and HTable - getCurrentNrHRS(): public function to get the number of region servers and create a pool of threads => now done with a temporary zookeeper connection - Master is used for: - getMaster public getter, as for ZooKeeper => we have to deprecate this but keep it. - isMasterRunning(): public function, used internally by HMerge & HBaseAdmin - getHTableDescriptor*: public functions offering access to the master => we could make them use a temporary master connection as well. Main points are: - the hbase class for ZooKeeper, ZooKeeperWatcher, is really designed for a strongly coupled architecture ;-). This can be changed, but requires a lot of modifications in these classes (likely adding a class in the middle of the hierarchy, something like that). Anyway, a non-connected client will always be really slower, because it's a tcp connection, and establishing a tcp connection is slow. - having a link between ZK and all the clients seems to make sense for some use cases. However, it won't scale if a TCP connection is required for every client - if we move the table descriptor part away from the client, we need to find a new place for it. - we will have the same issue with HBaseAdmin (for both ZK & Master); maybe we can put a timeout on the connection. That would make the whole system less deterministic however. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5529) MR test failures becuase MALLOC_ARENA_MAX is not set
[ https://issues.apache.org/jira/browse/HBASE-5529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223534#comment-13223534 ] Gregory Chanan commented on HBASE-5529: --- The 0.94 patch would be identical to trunk. Do you want me to reattach with a 0.94 name? Regarding your questions, see here: https://issues.apache.org/jira/browse/HADOOP-7154 That is a better explanation than I gave or the linked MR jira gives. I think that addresses all your questions, but let me know if you have more. MR test failures becuase MALLOC_ARENA_MAX is not set Key: HBASE-5529 URL: https://issues.apache.org/jira/browse/HBASE-5529 Project: HBase Issue Type: Bug Components: mapreduce, test Affects Versions: 0.92.0 Reporter: Gregory Chanan Assignee: Gregory Chanan Priority: Minor Attachments: HBASE-5529-to92.patch, HBASE-5529-trunk.patch When running unit tests on CentOS 6 I get a bunch of unit test failures in mapreduce-related tests due to: 2012-03-03 00:14:18,776 WARN [Container Monitor] monitor.ContainersMonitorImpl$MonitoringThread(436): Container [pid=21446,containerID=container_1330762435849_0002_01_01] is running beyond virtual memory limits. Current usage: 223.1mb of 2.0gb physical memory used; 6.9gb of 4.2gb virtual memory used. Killing container. Note: this also came up in the mapreduce project. See: https://issues.apache.org/jira/browse/MAPREDUCE-3933 Patch coming shortly -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row
[ https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-5515: --- Attachment: HBASE-5515.D2067.15.patch sc updated the revision HBASE-5515 [jira] Add a processRow API that supports atomic multiple reads and writes on a row. Reviewers: tedyu, dhruba, JIRA Addressed Lars and Ted's comments @tedyu: I changed reference count to 31 and long count to 6. This is because we have added the following two things long rowProcessorTimeout RowProcessor.RowScanner rowScanner @lhofhansl: I have made most of the changes you suggested. Thanks! I tried to put RowProcessor into ProcessRowEndpoint. With that approach, we will need multiple CoprocessorProtocol. One for each implementation of RowProcessor. Because the implementation of the RowProcessor is now passed by the protocol class name instead of the Writable. For example, in our unit test, we have to do --- FriendsOfFriendsProtocol p = table.coprocessorProxy(FriendsOfFriendsProtocol.class, row); p.processRow(); --- And the unit test needs to provide both FriendsOfFriendsProtocol and FriendsOfFriendsEndpoint. I feel this is not very convenient. I still prefer to pass the RowProcessor information via a Writable. The other thought is that the user of RowProcessor is actually HRegion. But ProcessRowEndpoint is used by the client via coprocessor. I feel that it would be nice to have them separated instead of listing their methods together. What do you think? 
REVISION DETAIL https://reviews.facebook.net/D2067 AFFECTED FILES src/main/java/org/apache/hadoop/hbase/coprocessor/ProcessRowEndpoint.java src/main/java/org/apache/hadoop/hbase/coprocessor/ProcessRowProtocol.java src/main/java/org/apache/hadoop/hbase/coprocessor/RowProcessor.java src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java src/test/java/org/apache/hadoop/hbase/coprocessor/TestProcessRowEndpoint.java Add a processRow API that supports atomic multiple reads and writes on a row Key: HBASE-5515 URL: https://issues.apache.org/jira/browse/HBASE-5515 Project: HBase Issue Type: New Feature Reporter: Scott Chen Assignee: Scott Chen Attachments: HBASE-5515.D2067.1.patch, HBASE-5515.D2067.10.patch, HBASE-5515.D2067.11.patch, HBASE-5515.D2067.12.patch, HBASE-5515.D2067.13.patch, HBASE-5515.D2067.14.patch, HBASE-5515.D2067.15.patch, HBASE-5515.D2067.2.patch, HBASE-5515.D2067.3.patch, HBASE-5515.D2067.4.patch, HBASE-5515.D2067.5.patch, HBASE-5515.D2067.6.patch, HBASE-5515.D2067.7.patch, HBASE-5515.D2067.8.patch, HBASE-5515.D2067.9.patch We have modified HRegion.java internally to do some atomic row processing. It will be nice to have a plugable API for this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
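The processRow idea discussed above can be sketched as follows. The interface and method names here are assumptions for illustration only; the real API is whatever the patch under review defines, and a synchronized block stands in for HRegion's row lock.

```java
import java.util.HashMap;
import java.util.Map;

// A processor object encapsulates an atomic read-modify-write on one row:
// the region runs it while holding the row lock, so reads and writes in
// process() are seen as a single atomic operation.
public class ProcessRowSketch {
    interface RowProcessor<T> {
        T process(Map<String, Long> row); // runs under the row lock in the real design
    }

    static <T> T processRow(Map<String, Long> row, RowProcessor<T> p) {
        synchronized (row) { // stand-in for HRegion's per-row locking
            return p.process(row);
        }
    }

    public static void main(String[] args) {
        Map<String, Long> row = new HashMap<>();
        row.put("counter", 41L);
        // Atomic read-increment-write expressed as a RowProcessor.
        Long result = processRow(row, r -> r.merge("counter", 1L, Long::sum));
        System.out.println(result); // 42
    }
}
```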
[jira] [Commented] (HBASE-5526) Optional file permission settings
[ https://issues.apache.org/jira/browse/HBASE-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223540#comment-13223540 ] Jesse Yates commented on HBASE-5526: Side note for other interested parties - just using dfs.umaskmode from hadoop isn't sufficient here as we want to protect the data files, but still see the directory structure. In true posix style, hdfs just applies the umask to all the files _and_ directories, as opposed to differentiating between the two. Optional file permission settings - Key: HBASE-5526 URL: https://issues.apache.org/jira/browse/HBASE-5526 Project: HBase Issue Type: New Feature Components: regionserver Reporter: Jesse Yates Assignee: Jesse Yates Fix For: 0.94.0 Attachments: java_HBASE-5526.patch Currently, all the files created by the HBase user are just written using the default file permissions granted by hdfs. However, it is often times advantageous to only allow a subset of the world to view the actual data written by hbase when scanning the raw hdfs files. This ticket covers setting permissions for files written to hdfs that are storing actual user data, as opposed to _all_ files written to hdfs as many of them contain non-identifiable metadata. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
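The file-versus-directory distinction Jesse describes can be sketched as follows, using local java.nio calls rather than the HDFS API. This is a minimal sketch assuming a POSIX filesystem; the paths and modes are illustrative, not what the patch sets.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Apply a restrictive mode to data files while keeping directories
// world-traversable, instead of one umask covering both.
public class PermSketch {
    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("hbase-data");
        Path file = Files.createTempFile(dir, "hfile", ".data");

        // Directories stay listable/traversable so the layout is visible...
        Files.setPosixFilePermissions(dir, PosixFilePermissions.fromString("rwxr-xr-x"));
        // ...while data files are readable by the owning user only.
        Files.setPosixFilePermissions(file, PosixFilePermissions.fromString("rw-------"));

        Set<PosixFilePermission> perms = Files.getPosixFilePermissions(file);
        System.out.println(PosixFilePermissions.toString(perms)); // rw------- on POSIX
    }
}
```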
[jira] [Updated] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row
[ https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-5515: --- Attachment: HBASE-5515.D2067.16.patch sc updated the revision HBASE-5515 [jira] Add a processRow API that supports atomic multiple reads and writes on a row. Reviewers: tedyu, dhruba, JIRA Forgot to include BaseRowProcessor.java REVISION DETAIL https://reviews.facebook.net/D2067 AFFECTED FILES src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRowProcessor.java src/main/java/org/apache/hadoop/hbase/coprocessor/ProcessRowEndpoint.java src/main/java/org/apache/hadoop/hbase/coprocessor/ProcessRowProtocol.java src/main/java/org/apache/hadoop/hbase/coprocessor/RowProcessor.java src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java src/test/java/org/apache/hadoop/hbase/coprocessor/TestProcessRowEndpoint.java Add a processRow API that supports atomic multiple reads and writes on a row Key: HBASE-5515 URL: https://issues.apache.org/jira/browse/HBASE-5515 Project: HBase Issue Type: New Feature Reporter: Scott Chen Assignee: Scott Chen Attachments: HBASE-5515.D2067.1.patch, HBASE-5515.D2067.10.patch, HBASE-5515.D2067.11.patch, HBASE-5515.D2067.12.patch, HBASE-5515.D2067.13.patch, HBASE-5515.D2067.14.patch, HBASE-5515.D2067.15.patch, HBASE-5515.D2067.16.patch, HBASE-5515.D2067.2.patch, HBASE-5515.D2067.3.patch, HBASE-5515.D2067.4.patch, HBASE-5515.D2067.5.patch, HBASE-5515.D2067.6.patch, HBASE-5515.D2067.7.patch, HBASE-5515.D2067.8.patch, HBASE-5515.D2067.9.patch We have modified HRegion.java internally to do some atomic row processing. It will be nice to have a plugable API for this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5522) hbase 0.92 test artifacts are missing from Maven central
[ https://issues.apache.org/jira/browse/HBASE-5522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5522: - Attachment: 5522.txt hbase 0.92 test artifacts are missing from Maven central Key: HBASE-5522 URL: https://issues.apache.org/jira/browse/HBASE-5522 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.0 Reporter: Roman Shaposhnik Attachments: 5522.txt Could someone with enough karma, please, publish the test artifacts for 0.92.0? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row
[ https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223550#comment-13223550 ] Hadoop QA commented on HBASE-5515: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12517282/HBASE-5515.D2067.15.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 2 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The patch appears to cause mvn compile goal to fail. -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1117//testReport/ Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1117//console This message is automatically generated. Add a processRow API that supports atomic multiple reads and writes on a row Key: HBASE-5515 URL: https://issues.apache.org/jira/browse/HBASE-5515 Project: HBase Issue Type: New Feature Reporter: Scott Chen Assignee: Scott Chen Attachments: HBASE-5515.D2067.1.patch, HBASE-5515.D2067.10.patch, HBASE-5515.D2067.11.patch, HBASE-5515.D2067.12.patch, HBASE-5515.D2067.13.patch, HBASE-5515.D2067.14.patch, HBASE-5515.D2067.15.patch, HBASE-5515.D2067.16.patch, HBASE-5515.D2067.2.patch, HBASE-5515.D2067.3.patch, HBASE-5515.D2067.4.patch, HBASE-5515.D2067.5.patch, HBASE-5515.D2067.6.patch, HBASE-5515.D2067.7.patch, HBASE-5515.D2067.8.patch, HBASE-5515.D2067.9.patch We have modified HRegion.java internally to do some atomic row processing. It will be nice to have a plugable API for this. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
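The "processRow" idea above (a pluggable hook that reads and writes a single row atomically) can be sketched in plain Java. This is an illustrative mock, not the actual HBase API: `RowProcessor`, `MiniRegion`, and `processRow` are hypothetical names, and the row lock is simulated with `synchronized`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a pluggable "process row" hook: the processor sees a
// consistent snapshot of the row and returns mutations that are applied under
// the same row lock.
class ProcessRowSketch {
  interface RowProcessor {
    // Reads the current cells of the row and returns the cells to write back.
    Map<String, Long> process(Map<String, Long> currentCells);
  }

  static class MiniRegion {
    private final Map<String, Map<String, Long>> rows = new HashMap<>();

    // Atomic read-modify-write on a single row: lock, read, process, write.
    synchronized void processRow(String row, RowProcessor processor) {
      Map<String, Long> cells = rows.computeIfAbsent(row, r -> new HashMap<>());
      cells.putAll(processor.process(new HashMap<>(cells)));
    }

    synchronized Long get(String row, String column) {
      Map<String, Long> cells = rows.get(row);
      return cells == null ? null : cells.get(column);
    }
  }

  public static void main(String[] args) {
    MiniRegion region = new MiniRegion();
    // Example processor: increment column "a" and mirror the result into "b",
    // as one atomic step -- no other reader can see "a" updated without "b".
    region.processRow("row1", cells -> {
      long a = cells.getOrDefault("a", 0L) + 1;
      Map<String, Long> out = new HashMap<>();
      out.put("a", a);
      out.put("b", a);
      return out;
    });
    System.out.println(region.get("row1", "a") + "," + region.get("row1", "b"));
  }
}
```

The point of the pluggable interface is that the multi-read/multi-write logic lives in one callback executed under the row lock, instead of being hand-coded inside HRegion.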
[jira] [Commented] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223564#comment-13223564 ] Phabricator commented on HBASE-5074: mbautin has commented on the revision [jira] [HBASE-5074] Support checksums in HBase block cache. INLINE COMMENTS src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java:161-174 What I meant by repeated blob was everything but the last four bytes. We can create a string constant for that part in TestHFileBlock and reuse it here. REVISION DETAIL https://reviews.facebook.net/D1521 BRANCH svn support checksums in HBase block cache -- Key: HBASE-5074 URL: https://issues.apache.org/jira/browse/HBASE-5074 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.94.0 Attachments: D1521.1.patch, D1521.1.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.11.patch, D1521.11.patch, D1521.12.patch, D1521.12.patch, D1521.13.patch, D1521.13.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch, D1521.9.patch, D1521.9.patch The current implementation of HDFS stores the data in one block file and the metadata(checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage-hardware offers. -- This message is automatically generated by JIRA. 
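The fix described above is to store the checksum inline with the block payload, so a single read verifies the data without a second seek into a separate checksum file. A minimal sketch of the idea, assuming an illustrative layout (payload followed by a 4-byte CRC32) rather than the real HFile block format:

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Sketch of inline block checksums: one I/O covers both data and checksum.
class InlineChecksumSketch {

  // Append a CRC32 of the payload to the block before writing it out.
  static byte[] writeBlock(byte[] payload) {
    CRC32 crc = new CRC32();
    crc.update(payload, 0, payload.length);
    ByteBuffer buf = ByteBuffer.allocate(payload.length + 4);
    buf.put(payload).putInt((int) crc.getValue());
    return buf.array();
  }

  // Verify the trailing checksum; no separate checksum file is consulted.
  static boolean verifyBlock(byte[] block) {
    ByteBuffer buf = ByteBuffer.wrap(block);
    byte[] payload = new byte[block.length - 4];
    buf.get(payload);
    CRC32 crc = new CRC32();
    crc.update(payload, 0, payload.length);
    return buf.getInt() == (int) crc.getValue();
  }

  public static void main(String[] args) {
    byte[] block = writeBlock("hbase block payload".getBytes());
    System.out.println(verifyBlock(block)); // intact block passes
    block[0] ^= 1;                          // flip a bit
    System.out.println(verifyBlock(block)); // corruption is caught
  }
}
```

With HDFS-level checksums, each cache miss costs two disk iops (data file plus checksum file); folding the checksum into the block halves that to one.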
[jira] [Commented] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223567#comment-13223567 ] Mikhail Bautin commented on HBASE-5074: --- @Dhruba: could you please rerun the failed tests locally, as well as check the test reports? org.apache.hadoop.hbase.coprocessor.TestMasterObserver org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestImportTsv support checksums in HBase block cache -- Key: HBASE-5074 URL: https://issues.apache.org/jira/browse/HBASE-5074 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.94.0 Attachments: D1521.1.patch, D1521.1.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.11.patch, D1521.11.patch, D1521.12.patch, D1521.12.patch, D1521.13.patch, D1521.13.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch, D1521.9.patch, D1521.9.patch The current implementation of HDFS stores the data in one block file and the metadata(checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage-hardware offers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5347) GC free memory management in Level-1 Block Cache
[ https://issues.apache.org/jira/browse/HBASE-5347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223574#comment-13223574 ] Phabricator commented on HBASE-5347: mbautin has resigned from the revision [HBASE-5347] [jira] GC free memory management in Level-1 Block Cache. We are not actively pursuing this approach anymore due to non-reproducibility of GC issues. REVISION DETAIL https://reviews.facebook.net/D1635 GC free memory management in Level-1 Block Cache Key: HBASE-5347 URL: https://issues.apache.org/jira/browse/HBASE-5347 Project: HBase Issue Type: Improvement Reporter: Prakash Khemani Assignee: Prakash Khemani Attachments: D1635.5.patch On eviction of a block from the block-cache, instead of waiting for the garbage collecter to reuse its memory, reuse the block right away. This will require us to keep reference counts on the HFile blocks. Once we have the reference counts in place we can do our own simple blocks-out-of-slab allocation for the block-cache. This will help us with * reducing gc pressure, especially in the old generation * making it possible to have non-java-heap memory backing the HFile blocks -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
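The reference-counting scheme proposed above can be sketched as follows: a block's buffer returns to a free-slab pool as soon as its last reader releases it, instead of waiting for the garbage collector. Class names (`Slab`, `Block`) are illustrative, not actual HBase code.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of blocks-out-of-slab allocation with reference counts.
class RefCountedBlockSketch {
  static class Slab {
    final Deque<byte[]> free = new ArrayDeque<>();
    Slab(int blocks, int blockSize) {
      for (int i = 0; i < blocks; i++) free.push(new byte[blockSize]);
    }
  }

  static class Block {
    private final AtomicInteger refCount = new AtomicInteger(1);
    private final byte[] buf;
    private final Slab slab;
    Block(Slab slab) { this.slab = slab; this.buf = slab.free.pop(); }
    void retain() { refCount.incrementAndGet(); }
    // When the last reference drops, the buffer is reused right away --
    // no GC pressure, and the backing memory need not be on the Java heap.
    void release() {
      if (refCount.decrementAndGet() == 0) slab.free.push(buf);
    }
  }

  public static void main(String[] args) {
    Slab slab = new Slab(1, 64 * 1024);
    Block block = new Block(slab);        // cache holds one reference
    block.retain();                       // a scanner takes a second reference
    block.release();                      // scanner done
    System.out.println(slab.free.size()); // 0: cache still holds the block
    block.release();                      // block evicted from cache
    System.out.println(slab.free.size()); // 1: buffer is back in the pool
  }
}
```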
[jira] [Commented] (HBASE-5521) Move compression/decompression to an encoder specific encoding context
[ https://issues.apache.org/jira/browse/HBASE-5521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223578#comment-13223578 ] Phabricator commented on HBASE-5521: heyongqiang has commented on the revision HBASE-5521 [jira] Move compression/decompression to an encoder specific encoding context. INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java:323-327 Good suggestion. I will change it to use the class for checking. Yes, we need to enforce that constraint right now. Basically we want each encoder to use its own context object, which is returned by newDataBlockEncodingContext(). src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java:58 Will remove the original method and change all test cases to use the new API. src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java:53 Are you suggesting adding a new Encoding field to each encoder? I also think it is a good thing to do, but I am not sure why the current code does not do that. src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:693 Good catch! This is a bug. src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:811 Good catch here; will assign to null. Initially I did that because they are all defined as 'final'. Will remove the final. src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1253 The default is used for non-data blocks. I cannot make them singletons, as they need to maintain some reusable objects internally. By reusable objects I mean buffers allocated once that can be reused across multiple operations. src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1327-1328 Added back. src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java:194-195 Good catch.
Will add a class field to reuse it. REVISION DETAIL https://reviews.facebook.net/D2097 BRANCH svn Move compression/decompression to an encoder specific encoding context -- Key: HBASE-5521 URL: https://issues.apache.org/jira/browse/HBASE-5521 Project: HBase Issue Type: Improvement Reporter: He Yongqiang Assignee: He Yongqiang Attachments: HBASE-5521.1.patch, HBASE-5521.D2097.1.patch As part of working on HBASE-5313, we want to add a new columnar encoder/decoder. It makes sense to move compression to be part of the encoder/decoder: 1) a scanner for a columnar-encoded block can do lazy decompression down to a specific part of a key-value object, and 2) it avoids an extra byte copy from the encoder to the hblock-writer. If there is no encoder specified for a writer, the HBlock.Writer will use a default compression context to do something very similar to today's code. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
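The comments above circle around one design point: each encoder owns a context object, returned by its own factory method, that holds buffers allocated once and reused across blocks, and compression runs inside the encoder rather than in the block writer. A minimal sketch under those assumptions (the interface and class names here are illustrative, not the patch's actual API):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.DeflaterOutputStream;

// Sketch: encoder-specific encoding context with a reusable output buffer.
class EncodingContextSketch {
  // Per-encoder context: the buffer is allocated once and reused per block,
  // which is why these contexts cannot be shared singletons.
  static class EncodingContext {
    final ByteArrayOutputStream reusableBuffer = new ByteArrayOutputStream(64 * 1024);
  }

  interface DataBlockEncoder {
    EncodingContext newEncodingContext();
    byte[] encodeAndCompress(byte[] rawBlock, EncodingContext ctx);
  }

  // A trivial encoder: pass-through "encoding" plus DEFLATE compression,
  // done inside the encoder instead of in the block writer.
  static class PassThroughEncoder implements DataBlockEncoder {
    public EncodingContext newEncodingContext() { return new EncodingContext(); }
    public byte[] encodeAndCompress(byte[] rawBlock, EncodingContext ctx) {
      ctx.reusableBuffer.reset(); // reuse, do not reallocate
      try (DeflaterOutputStream out = new DeflaterOutputStream(ctx.reusableBuffer)) {
        out.write(rawBlock);
      } catch (IOException e) {
        throw new RuntimeException(e); // cannot happen with an in-memory buffer
      }
      return ctx.reusableBuffer.toByteArray();
    }
  }

  public static void main(String[] args) {
    DataBlockEncoder encoder = new PassThroughEncoder();
    EncodingContext ctx = encoder.newEncodingContext(); // one context per writer
    byte[] raw = new byte[8192]; // highly compressible zeros
    byte[] compressed = encoder.encodeAndCompress(raw, ctx);
    System.out.println(compressed.length < raw.length);
  }
}
```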
[jira] [Commented] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223586#comment-13223586 ] dhruba borthakur commented on HBASE-5074: - I ran all four of them individually (manually), and all four of them pass. Looking at the Hudson test results, it appears that all the failures are related to some map-reduce problem, but not really sure the precise cause. But I think that these failures are somehow related to this patch, especially because the Hudson tests for HBASE-5399 just passed successfully. Will investigate more (but if you have any clues, please do let me know). support checksums in HBase block cache -- Key: HBASE-5074 URL: https://issues.apache.org/jira/browse/HBASE-5074 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.94.0 Attachments: D1521.1.patch, D1521.1.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.11.patch, D1521.11.patch, D1521.12.patch, D1521.12.patch, D1521.13.patch, D1521.13.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch, D1521.9.patch, D1521.9.patch The current implementation of HDFS stores the data in one block file and the metadata(checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage-hardware offers. -- This message is automatically generated by JIRA. 
[jira] [Commented] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row
[ https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223590#comment-13223590 ] Phabricator commented on HBASE-5515: lhofhansl has commented on the revision HBASE-5515 [jira] Add a processRow API that supports atomic multiple reads and writes on a row. @Scott: FriendsOfFriendsProtocol would be known to the client and FriendsOfFriendsEndpoint to the server. You can always add a tiny client-side wrapper to make it more convenient. It's cool if you want to do it differently. But keep in mind that a lot of effort went into getting classloading for coprocessors right. They can be loaded dynamically from a jar on HDFS and per table, or statically and globally for all tables. For example, you could deploy new row-processing logic without touching the servers at all (by loading a jar from HDFS). That is actually something I do not like about filters: to use a new filter, it needs to be deployed at every regionserver and requires a rolling restart of the cluster. Coprocessors nicely solve these problems. With the current approach, the row-processing implementation would need to be deployed outside of the coprocessor framework and would always require a cluster restart.
REVISION DETAIL https://reviews.facebook.net/D2067 Add a processRow API that supports atomic multiple reads and writes on a row Key: HBASE-5515 URL: https://issues.apache.org/jira/browse/HBASE-5515 Project: HBase Issue Type: New Feature Reporter: Scott Chen Assignee: Scott Chen Attachments: HBASE-5515.D2067.1.patch, HBASE-5515.D2067.10.patch, HBASE-5515.D2067.11.patch, HBASE-5515.D2067.12.patch, HBASE-5515.D2067.13.patch, HBASE-5515.D2067.14.patch, HBASE-5515.D2067.15.patch, HBASE-5515.D2067.16.patch, HBASE-5515.D2067.2.patch, HBASE-5515.D2067.3.patch, HBASE-5515.D2067.4.patch, HBASE-5515.D2067.5.patch, HBASE-5515.D2067.6.patch, HBASE-5515.D2067.7.patch, HBASE-5515.D2067.8.patch, HBASE-5515.D2067.9.patch We have modified HRegion.java internally to do some atomic row processing. It will be nice to have a plugable API for this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5521) Move compression/decompression to an encoder specific encoding context
[ https://issues.apache.org/jira/browse/HBASE-5521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-5521: --- Attachment: HBASE-5521.D2097.2.patch heyongqiang updated the revision HBASE-5521 [jira] Move compression/decompression to an encoder specific encoding context. Reviewers: JIRA, dhruba, tedyu, sc, mbautin address Ted and mbautin's comments. REVISION DETAIL https://reviews.facebook.net/D2097 AFFECTED FILES src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockDefaultEncodingContext.java src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockDecodingContext.java src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockDefaultDecodingContext.java src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockEncodingContext.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/hfile/Compression.java src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java Move compression/decompression to an encoder specific encoding context -- Key: HBASE-5521 URL: https://issues.apache.org/jira/browse/HBASE-5521 Project: HBase Issue Type: Improvement 
Reporter: He Yongqiang Assignee: He Yongqiang Attachments: HBASE-5521.1.patch, HBASE-5521.D2097.1.patch, HBASE-5521.D2097.2.patch As part of working on HBASE-5313, we want to add a new columnar encoder/decoder. It makes sense to move compression to be part of encoder/decoder: 1) a scanner for a columnar encoded block can do lazy decompression to a specific part of a key value object 2) avoid an extra bytes copy from encoder to hblock-writer. If there is no encoder specified for a writer, the HBlock.Writer will use a default compression-context to do something very similar to today's code. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5509) MR based copier for copying HFiles (trunk version)
[ https://issues.apache.org/jira/browse/HBASE-5509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223598#comment-13223598 ] Karthik Ranganathan commented on HBASE-5509: @Zhihong Yu: We use this code as the primary means to back up HFiles inside FB. We have made a lot of improvements to the DFS copy underneath, and they have caused some bugs, but that's unrelated to this code. Not too many issues, besides tuning the number of mappers to use so that we don't overwhelm a running system. @Lars: You are correct about getStoreFileList() - it is passed from the command line and it is overloaded for a subset of or all CFs. Zhihong - the list versus a comma-separated string is a trivial point, since the list construction has to happen either in the RS or in the caller, so it should not make much of a difference practically. MR based copier for copying HFiles (trunk version) -- Key: HBASE-5509 URL: https://issues.apache.org/jira/browse/HBASE-5509 Project: HBase Issue Type: Sub-task Components: documentation, regionserver Reporter: Karthik Ranganathan Assignee: Lars Hofhansl Fix For: 0.94.0, 0.96.0 Attachments: 5509-v2.txt, 5509.txt This copier is a modification of the distcp tool in HDFS. It does the following: 1. List out all the regions in the HBase cluster for the required table 2. Write the above out to a file 3. Each mapper 3.1 lists all the HFiles for a given region by querying the regionserver 3.2 copies all the HFiles 3.3 outputs success if the copy succeeded, failure otherwise. Failed regions are retried in another loop 4. Mappers are placed on nodes which have maximum locality for a given region to speed up copying -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
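The copier's control flow (steps 1-4 in the description above) can be sketched as follows, with the per-region copy simulated by a callback; the names (`RegionCopier`, `copyWithRetry`) are illustrative, not the actual tool's API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of the copier's retry structure: each "mapper" copies one region's
// HFiles and reports success or failure; failed regions get a second pass.
class HFileCopierSketch {
  interface RegionCopier { boolean copyRegion(String region); }

  // Returns the regions that still failed after the retry pass.
  static List<String> copyWithRetry(List<String> regions, RegionCopier copier) {
    List<String> failed = new ArrayList<>();
    for (String region : regions) {       // first pass: one mapper per region
      if (!copier.copyRegion(region)) failed.add(region);
    }
    List<String> stillFailed = new ArrayList<>();
    for (String region : failed) {        // retry loop for failed regions
      if (!copier.copyRegion(region)) stillFailed.add(region);
    }
    return stillFailed;
  }

  public static void main(String[] args) {
    // Simulated copier: region "r2" fails once, then succeeds on retry.
    List<String> failOnce = new ArrayList<>(Arrays.asList("r2"));
    RegionCopier copier = region -> !failOnce.remove(region);
    List<String> result = copyWithRetry(Arrays.asList("r1", "r2", "r3"), copier);
    System.out.println(result.isEmpty()); // true: the retry pass recovered r2
  }
}
```

In the real tool the first pass runs as MR mappers scheduled for HDFS locality (step 4); the sketch keeps only the success/failure bookkeeping.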
[jira] [Commented] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row
[ https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223601#comment-13223601 ] Hadoop QA commented on HBASE-5515: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12517287/HBASE-5515.D2067.16.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 2 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -128 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 155 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.mapreduce.TestImportTsv org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1118//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1118//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1118//console This message is automatically generated. 
Add a processRow API that supports atomic multiple reads and writes on a row Key: HBASE-5515 URL: https://issues.apache.org/jira/browse/HBASE-5515 Project: HBase Issue Type: New Feature Reporter: Scott Chen Assignee: Scott Chen Attachments: HBASE-5515.D2067.1.patch, HBASE-5515.D2067.10.patch, HBASE-5515.D2067.11.patch, HBASE-5515.D2067.12.patch, HBASE-5515.D2067.13.patch, HBASE-5515.D2067.14.patch, HBASE-5515.D2067.15.patch, HBASE-5515.D2067.16.patch, HBASE-5515.D2067.2.patch, HBASE-5515.D2067.3.patch, HBASE-5515.D2067.4.patch, HBASE-5515.D2067.5.patch, HBASE-5515.D2067.6.patch, HBASE-5515.D2067.7.patch, HBASE-5515.D2067.8.patch, HBASE-5515.D2067.9.patch We have modified HRegion.java internally to do some atomic row processing. It will be nice to have a plugable API for this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5509) MR based copier for copying HFiles (trunk version)
[ https://issues.apache.org/jira/browse/HBASE-5509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223605#comment-13223605 ] Zhihong Yu commented on HBASE-5509: --- I agree about the point w.r.t. getStoreFileList() MR based copier for copying HFiles (trunk version) -- Key: HBASE-5509 URL: https://issues.apache.org/jira/browse/HBASE-5509 Project: HBase Issue Type: Sub-task Components: documentation, regionserver Reporter: Karthik Ranganathan Assignee: Lars Hofhansl Fix For: 0.94.0, 0.96.0 Attachments: 5509-v2.txt, 5509.txt This copier is a modification of the distcp tool in HDFS. It does the following: 1. List out all the regions in the HBase cluster for the required table 2. Write the above out to a file 3. Each mapper 3.1 lists all the HFiles for a given region by querying the regionserver 3.2 copies all the HFiles 3.3 outputs success if the copy succeeded, failure otherwise. Failed regions are retried in another loop 4. Mappers are placed on nodes which have maximum locality for a given region to speed up copying -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223606#comment-13223606 ] Zhihong Yu commented on HBASE-5399: --- {code} +// We put that all the possible setting to make it fails asap {code} should read 'put all the possible settings to make it fail'. Please remove extra empty lines in checkHBaseAvailable(). In HConnectionManager.java, can the java imports be moved back to top of file ? {code} +// We have a single lock for master zk to prevents deadlocks. Having {code} should read 'to prevent'. Please use spaces around '+' below: {code} + return hconnection +hashCode(); {code} Cut the link between the client and the zookeeper ensemble -- Key: HBASE-5399 URL: https://issues.apache.org/jira/browse/HBASE-5399 Project: HBase Issue Type: Improvement Components: client Affects Versions: 0.94.0 Environment: all Reporter: nkeywal Assignee: nkeywal Priority: Minor Fix For: 0.96.0 Attachments: 5399.v27.patch, 5399.v38.patch, 5399.v39.patch, 5399.v40.patch, 5399_inprogress.patch, 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 5399_inprogress.v9.patch The link is often considered as an issue, for various reasons. One of them being that there is a limit on the number of connection that ZK can manage. Stack was suggesting as well to remove the link to master from HConnection. There are choices to be made considering the existing API (that we don't want to break). The first patches I will submit on hadoop-qa should not be committed: they are here to show the progress on the direction taken. ZooKeeper is used for: - public getter, to let the client do whatever he wants, and close ZooKeeper when closing the connection = we have to deprecate this but keep it. 
- read get master address to create a master = now done with a temporary zookeeper connection - read root location = now done with a temporary zookeeper connection, but questionable. Used in public function locateRegion. To be reworked. - read cluster id = now done once with a temporary zookeeper connection. - check if base done is available = now done once with a zookeeper connection given as a parameter - isTableDisabled/isTableAvailable = public functions, now done with a temporary zookeeper connection. - Called internally from HBaseAdmin and HTable - getCurrentNrHRS(): public function to get the number of region servers and create a pool of thread = now done with a temporary zookeeper connection - Master is used for: - getMaster public getter, as for ZooKeeper = we have to deprecate this but keep it. - isMasterRunning(): public function, used internally by HMerge HBaseAdmin - getHTableDescriptor*: public functions offering access to the master. = we could make them using a temporary master connection as well. Main points are: - hbase class for ZooKeeper; ZooKeeperWatcher is really designed for a strongly coupled architecture ;-). This can be changed, but requires a lot of modifications in these classes (likely adding a class in the middle of the hierarchy, something like that). Anyway, non connected client will always be really slower, because it's a tcp connection, and establishing a tcp connection is slow. - having a link between ZK and all the client seems to make sense for some Use Cases. However, it won't scale if a TCP connection is required for every client - if we move the table descriptor part away from the client, we need to find a new place for it. - we will have the same issue if HBaseAdmin (for both ZK Master), may be we can put a timeout on the connection. That would make the whole system less deterministic however. -- This message is automatically generated by JIRA. 
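Several items in the list above follow the same pattern: replace a long-lived ZooKeeper session per client with a temporary connection that is opened, read, and closed in one call. A sketch of that pattern, with a stand-in `FakeZk` class instead of a real ZooKeeper client:

```java
// Sketch of the "temporary connection" pattern: short-lived reads (master
// address, cluster id, root location) hold a session only for the call.
class TemporaryConnectionSketch {
  static class FakeZk implements AutoCloseable {
    static int openSessions = 0;          // counts live sessions for the demo
    FakeZk() { openSessions++; }
    String read(String znode) { return "value-of:" + znode; }
    @Override public void close() { openSessions--; }
  }

  // One-shot read: the session exists only for the duration of the call,
  // so idle clients consume no ZooKeeper connections.
  static String readOnce(String znode) {
    try (FakeZk zk = new FakeZk()) {
      return zk.read(znode);
    }
  }

  public static void main(String[] args) {
    String masterAddr = readOnce("/hbase/master");
    System.out.println(masterAddr);          // value-of:/hbase/master
    System.out.println(FakeZk.openSessions); // 0: no lingering session
  }
}
```

The trade-off named in the description shows up here: every `readOnce` pays TCP connection setup, so an unconnected client is slower per call, in exchange for not counting against ZooKeeper's connection limit.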
[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223615#comment-13223615 ] stack commented on HBASE-5399: -- @LarsH I think its too radical a change in client behavior for 0.94. If we target it for 0.96, it'll be a ripple only compared to rpc changes; it won't be noticed. Cut the link between the client and the zookeeper ensemble -- Key: HBASE-5399 URL: https://issues.apache.org/jira/browse/HBASE-5399 Project: HBase Issue Type: Improvement Components: client Affects Versions: 0.94.0 Environment: all Reporter: nkeywal Assignee: nkeywal Priority: Minor Fix For: 0.96.0 Attachments: 5399.v27.patch, 5399.v38.patch, 5399.v39.patch, 5399.v40.patch, 5399_inprogress.patch, 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 5399_inprogress.v9.patch The link is often considered as an issue, for various reasons. One of them being that there is a limit on the number of connection that ZK can manage. Stack was suggesting as well to remove the link to master from HConnection. There are choices to be made considering the existing API (that we don't want to break). The first patches I will submit on hadoop-qa should not be committed: they are here to show the progress on the direction taken. ZooKeeper is used for: - public getter, to let the client do whatever he wants, and close ZooKeeper when closing the connection = we have to deprecate this but keep it. - read get master address to create a master = now done with a temporary zookeeper connection - read root location = now done with a temporary zookeeper connection, but questionable. Used in public function locateRegion. To be reworked. - read cluster id = now done once with a temporary zookeeper connection. 
- check if base done is available = now done once with a zookeeper connection given as a parameter - isTableDisabled/isTableAvailable = public functions, now done with a temporary zookeeper connection. - Called internally from HBaseAdmin and HTable - getCurrentNrHRS(): public function to get the number of region servers and create a pool of thread = now done with a temporary zookeeper connection - Master is used for: - getMaster public getter, as for ZooKeeper = we have to deprecate this but keep it. - isMasterRunning(): public function, used internally by HMerge HBaseAdmin - getHTableDescriptor*: public functions offering access to the master. = we could make them using a temporary master connection as well. Main points are: - hbase class for ZooKeeper; ZooKeeperWatcher is really designed for a strongly coupled architecture ;-). This can be changed, but requires a lot of modifications in these classes (likely adding a class in the middle of the hierarchy, something like that). Anyway, non connected client will always be really slower, because it's a tcp connection, and establishing a tcp connection is slow. - having a link between ZK and all the client seems to make sense for some Use Cases. However, it won't scale if a TCP connection is required for every client - if we move the table descriptor part away from the client, we need to find a new place for it. - we will have the same issue if HBaseAdmin (for both ZK Master), may be we can put a timeout on the connection. That would make the whole system less deterministic however. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5521) Move compression/decompression to an encoder specific encoding context
[ https://issues.apache.org/jira/browse/HBASE-5521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223616#comment-13223616 ] Phabricator commented on HBASE-5521: mbautin has commented on the revision HBASE-5521 [jira] Move compression/decompression to an encoder specific encoding context. Yongqiang: thanks for addressing the comments. I will take another pass through the new version of the diff. Could you please use a consistent abbreviation for context? Either Ctx or Cxt, but not both. INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java:53 HFileDataBlockEncoder is actually aware of on-disk and in-cache encoding. The on-disk encoding may be disabled when in-cache encoding is enabled, but not the other way around. The current code apparently did not need to explicitly store the encoding type in EncodedDataBlock (which is mostly used for testing and benchmarking), and I am wondering why you need to store it now. REVISION DETAIL https://reviews.facebook.net/D2097 Move compression/decompression to an encoder specific encoding context -- Key: HBASE-5521 URL: https://issues.apache.org/jira/browse/HBASE-5521 Project: HBase Issue Type: Improvement Reporter: He Yongqiang Assignee: He Yongqiang Attachments: HBASE-5521.1.patch, HBASE-5521.D2097.1.patch, HBASE-5521.D2097.2.patch As part of working on HBASE-5313, we want to add a new columnar encoder/decoder. It makes sense to move compression to be part of the encoder/decoder: 1) a scanner for a columnar encoded block can do lazy decompression down to a specific part of a key value object; 2) it avoids an extra byte-array copy from the encoder to the hblock-writer. If no encoder is specified for a writer, HBlock.Writer will use a default compression context to do something very similar to today's code. -- This message is automatically generated by JIRA.
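The idea in the description (letting the encoder own compression, so the block writer hands bytes to a context instead of driving a codec itself) can be sketched with a toy context. This is a hypothetical illustration using java.util.zip; the class and method names are invented and are not the actual HBase encoding-context API:

```java
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Toy "compression context" owned by an encoder: the block writer calls
// compress()/decompress() on the context and never touches a codec directly.
class EncodingContextSketch {

    // Compress a raw encoded block; the context owns the codec choice.
    static byte[] compress(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        byte[] buf = new byte[raw.length * 2 + 64]; // headroom for incompressible input
        int n = deflater.deflate(buf);
        deflater.end();
        return Arrays.copyOf(buf, n);
    }

    // Decompress back to the known uncompressed length.
    static byte[] decompress(byte[] compressed, int rawLen) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        byte[] out = new byte[rawLen];
        try {
            inflater.inflate(out);
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt block", e);
        } finally {
            inflater.end();
        }
        return out;
    }
}
```

A writer with no encoder configured could fall back to a default instance of such a context, which matches the "default compression-context" behavior the description mentions.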
[jira] [Resolved] (HBASE-5522) hbase 0.92 test artifacts are missing from Maven central
[ https://issues.apache.org/jira/browse/HBASE-5522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-5522. -- Resolution: Fixed Fix Version/s: 0.94.0 0.92.1 Assignee: stack Confirmed that test.jar shows in repo. A sighting happened out on the mailing list. Committed this patch to 0.92, 0.94, and trunk. hbase 0.92 test artifacts are missing from Maven central Key: HBASE-5522 URL: https://issues.apache.org/jira/browse/HBASE-5522 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.0 Reporter: Roman Shaposhnik Assignee: stack Fix For: 0.92.1, 0.94.0 Attachments: 5522.txt Could someone with enough karma, please, publish the test artifacts for 0.92.0? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5531) Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot
[ https://issues.apache.org/jira/browse/HBASE-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223631#comment-13223631 ] Hudson commented on HBASE-5531: --- Integrated in HBase-0.92 #316 (See [https://builds.apache.org/job/HBase-0.92/316/]) HBASE-5531 Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot (Laxman) (Revision 1297589) Result = FAILURE ramkrishna : Files : * /hbase/branches/0.92/CHANGES.txt * /hbase/branches/0.92/pom.xml Maven hadoop profile (version 23) needs to be updated with latest 23 snapshot - Key: HBASE-5531 URL: https://issues.apache.org/jira/browse/HBASE-5531 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.2 Reporter: Laxman Assignee: Laxman Labels: build Fix For: 0.92.2, 0.94.0, 0.96.0 Attachments: HBASE-5531-trunk.patch, HBASE-5531.patch The current profile still points to 0.23.1-SNAPSHOT. The build is failing because 0.23.1 has been released and its snapshot is no longer available. We can update this to 0.23.2-SNAPSHOT. -- This message is automatically generated by JIRA.
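The fix amounts to bumping the snapshot version inside the hadoop-0.23 build profile. A minimal sketch of the relevant pom.xml fragment follows; the profile id and property name here are illustrative, and the actual HBase pom may structure this differently:

```xml
<!-- Hypothetical sketch of the hadoop 0.23 profile bump. -->
<profile>
  <id>hadoop-0.23</id>
  <properties>
    <!-- 0.23.1-SNAPSHOT disappeared from the snapshot repo when 0.23.1
         was released; point at the next snapshot instead. -->
    <hadoop.version>0.23.2-SNAPSHOT</hadoop.version>
  </properties>
</profile>
```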
[jira] [Updated] (HBASE-5519) Incorrect warning in splitlogmanager
[ https://issues.apache.org/jira/browse/HBASE-5519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5519: - Attachment: 5519.92.txt Version for 0.92 branch Incorrect warning in splitlogmanager Key: HBASE-5519 URL: https://issues.apache.org/jira/browse/HBASE-5519 Project: HBase Issue Type: Improvement Reporter: Prakash Khemani Attachments: 0001-HBASE-5519-Incorrect-warning-in-splitlogmanager.patch, 5519.92.txt Because of recently added behavior, where the splitlogmanager timeout thread gets data from the zk node just to check that the zk node is there, we might have multiple watches firing without the task znode expiring. Remove the poor warning message. (Internally, there was an assert that failed in Mikhail's tests.) -- This message is automatically generated by JIRA.
[jira] [Resolved] (HBASE-5519) Incorrect warning in splitlogmanager
[ https://issues.apache.org/jira/browse/HBASE-5519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-5519. -- Resolution: Fixed Assignee: Prakash Khemani Hadoop Flags: Reviewed Committed to 0.92, 0.94 and to trunk. This is a usability thing. Incorrect warning in splitlogmanager Key: HBASE-5519 URL: https://issues.apache.org/jira/browse/HBASE-5519 -- This message is automatically generated by JIRA.
[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223657#comment-13223657 ] nkeywal commented on HBASE-5399: @all: Ok, I will provide an updated patch with the comments taken into account. I also understood the issue behind TestZooKeeper. My fix currently breaks other unit tests, but if I manage to make it work I will include it. For 0.94 vs. 0.96, ok as well; however, it would make sense to deprecate some methods in 0.94, especially getMaster and getZooKeeper in HConnection: people will get more time to react this way. Cut the link between the client and the zookeeper ensemble -- Key: HBASE-5399 URL: https://issues.apache.org/jira/browse/HBASE-5399 -- This message is automatically generated by JIRA.
[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223660#comment-13223660 ] stack commented on HBASE-5399: -- @nkeywal yes, agree, good to deprecate in 0.94 rather than 0.96 so more time to move off the old methods Cut the link between the client and the zookeeper ensemble -- Key: HBASE-5399 URL: https://issues.apache.org/jira/browse/HBASE-5399 -- This message is automatically generated by JIRA.
[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223662#comment-13223662 ] stack commented on HBASE-5399: -- ... so it seems like there needs to be a separate patch for 0.94? Cut the link between the client and the zookeeper ensemble -- Key: HBASE-5399 URL: https://issues.apache.org/jira/browse/HBASE-5399 -- This message is automatically generated by JIRA.
[jira] [Commented] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223664#comment-13223664 ] nkeywal commented on HBASE-5399: I can do a patch with only the deprecated stuff, or it can be added in a raw commit, as you like. Cut the link between the client and the zookeeper ensemble -- Key: HBASE-5399 URL: https://issues.apache.org/jira/browse/HBASE-5399 -- This message is automatically generated by JIRA.
[jira] [Commented] (HBASE-4542) add filter info to slow query logging
[ https://issues.apache.org/jira/browse/HBASE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223670#comment-13223670 ] Phabricator commented on HBASE-4542: mbautin has commented on the revision [jira] [HBASE-4542] Add filter info to slow query logging. It looks like this never got committed into trunk. I will rebase the patch on the trunk changes. REVISION DETAIL https://reviews.facebook.net/D1539 add filter info to slow query logging - Key: HBASE-4542 URL: https://issues.apache.org/jira/browse/HBASE-4542 Project: HBase Issue Type: Improvement Affects Versions: 0.89.20100924 Reporter: Kannan Muthukkaruppan Assignee: Madhuwanti Vaidya Attachments: 0001-jira-HBASE-4542-Add-filter-info-to-slow-query-loggin.patch, D1263.2.patch, D1539.1.patch The slow query log doesn't report the filters in effect. For example:
{code}
(operationTooSlow): \
{processingtimems:3468,client:10.138.43.206:40035,timeRange:[0,9223372036854775807],\
starttimems:1317772005821,responsesize:42411, \
class:HRegionServer,table:myTable,families:{CF1:[ALL]},\
row:6c3b8efa132f0219b7621ed1e5c8c70b,queuetimems:0,\
method:get,totalColumns:1,maxVersions:1,storeLimit:-1}
{code}
The above would suggest that all columns of myTable:CF1 are being requested for the given row. But in reality there could be filters in effect (such as ColumnPrefixFilter, ColumnRangeFilter, TimestampsFilter, etc.). We should enhance the slow query log to capture and report this information. -- This message is automatically generated by JIRA.
[jira] [Commented] (HBASE-4542) add filter info to slow query logging
[ https://issues.apache.org/jira/browse/HBASE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223685#comment-13223685 ] Phabricator commented on HBASE-4542: zhiqiu has commented on the revision [jira] [HBASE-4542] Add filter info to slow query logging. @mbautin Thanks a lot! REVISION DETAIL https://reviews.facebook.net/D1539 add filter info to slow query logging - Key: HBASE-4542 URL: https://issues.apache.org/jira/browse/HBASE-4542 -- This message is automatically generated by JIRA.
[jira] [Updated] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row
[ https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5515: -- Comment: was deleted (was: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12517282/HBASE-5515.D2067.15.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 2 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The patch appears to cause mvn compile goal to fail. -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1117//testReport/ Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1117//console This message is automatically generated.) Add a processRow API that supports atomic multiple reads and writes on a row Key: HBASE-5515 URL: https://issues.apache.org/jira/browse/HBASE-5515 Project: HBase Issue Type: New Feature Reporter: Scott Chen Assignee: Scott Chen Attachments: HBASE-5515.D2067.1.patch, HBASE-5515.D2067.10.patch, HBASE-5515.D2067.11.patch, HBASE-5515.D2067.12.patch, HBASE-5515.D2067.13.patch, HBASE-5515.D2067.14.patch, HBASE-5515.D2067.15.patch, HBASE-5515.D2067.16.patch, HBASE-5515.D2067.17.patch, HBASE-5515.D2067.2.patch, HBASE-5515.D2067.3.patch, HBASE-5515.D2067.4.patch, HBASE-5515.D2067.5.patch, HBASE-5515.D2067.6.patch, HBASE-5515.D2067.7.patch, HBASE-5515.D2067.8.patch, HBASE-5515.D2067.9.patch We have modified HRegion.java internally to do some atomic row processing. It would be nice to have a pluggable API for this. -- This message is automatically generated by JIRA.
[jira] [Updated] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row
[ https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-5515: --- Attachment: HBASE-5515.D2067.17.patch sc updated the revision HBASE-5515 [jira] Add a processRow API that supports atomic multiple reads and writes on a row. Reviewers: tedyu, dhruba, JIRA @Lars: They can be loaded dynamically from a jar on HDFS and per table, or statically and globally for all tables. This is really cool! I made a slight change to HRegion.processRow() so that it takes a RowProcessor interface (instead of the BaseRowProcessor class). Now people can still define their own CoprocessorProtocol that implements RowProcessor and have the ability to dynamically load this class as you described. I wrote a unit test, TestProcessRowWithCustomMadeProtocol.java, to verify that use case. But if this is cool with you, I would still like to keep ProcessRowProtocol and ProcessRowEndpoint because they are a convenient entry point for most users.
REVISION DETAIL https://reviews.facebook.net/D2067 AFFECTED FILES src/main/java/org/apache/hadoop/hbase/coprocessor/ProcessRowEndpoint.java src/main/java/org/apache/hadoop/hbase/coprocessor/ProcessRowProtocol.java src/main/java/org/apache/hadoop/hbase/coprocessor/RowProcessor.java src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java src/test/java/org/apache/hadoop/hbase/coprocessor/TestProcessRowEndpoint.java src/test/java/org/apache/hadoop/hbase/coprocessor/TestProcessRowWithCustomMadeProtocol.java Add a processRow API that supports atomic multiple reads and writes on a row -- Key: HBASE-5515 URL: https://issues.apache.org/jira/browse/HBASE-5515 -- This message is automatically generated by JIRA.
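The pluggable atomic read-modify-write hook being discussed can be sketched in miniature. The names here (RowProcessorSketch, ToyRegion) are invented for illustration, and the locking is collapsed to a single monitor; the real HRegion implementation is far more involved:

```java
import java.util.HashMap;
import java.util.Map;

// Toy version of a pluggable row processor: callers hand the region a
// callback that reads and writes one row, and the region runs it atomically.
interface RowProcessorSketch<T> {
    T process(Map<String, Long> row); // mutates the row in place, returns a result
}

class ToyRegion {
    private final Map<String, Map<String, Long>> rows = new HashMap<>();

    // A single monitor stands in for the per-row locking a real region
    // would use; the point is that process() runs as one atomic unit.
    public synchronized <T> T processRow(String rowKey, RowProcessorSketch<T> p) {
        Map<String, Long> row = rows.computeIfAbsent(rowKey, k -> new HashMap<>());
        return p.process(row);
    }
}
```

A caller can then express, say, an atomic counter increment as a lambda: `region.processRow("r1", row -> row.merge("cf:count", 1L, Long::sum))`, with no read-then-write race between clients.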
[jira] [Commented] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223705#comment-13223705 ] stack commented on HBASE-5074: -- @Dhruba Try resubmitting your patch too. We regularly see three of these mr tests fail. Fixed in hadoop 1.0.2 apparently. support checksums in HBase block cache -- Key: HBASE-5074 URL: https://issues.apache.org/jira/browse/HBASE-5074 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.94.0 Attachments: D1521.1.patch, D1521.1.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.10.patch, D1521.11.patch, D1521.11.patch, D1521.12.patch, D1521.12.patch, D1521.13.patch, D1521.13.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch, D1521.9.patch, D1521.9.patch The current implementation of HDFS stores the data in one block file and the metadata(checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage-hardware offers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
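The description's point, that a read should fetch data and checksum together instead of paying a second disk iop for a separate checksum file, can be illustrated with a toy block layout that inlines a CRC next to the payload. This is a sketch of the general technique only, not the actual HFile or HDFS on-disk format:

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Toy inline-checksum block: [8-byte CRC32][payload]. One read returns
// both, so verification needs no second seek to a checksum file.
class InlineChecksumSketch {

    static byte[] writeBlock(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        ByteBuffer buf = ByteBuffer.allocate(data.length + 8);
        buf.putLong(crc.getValue()); // checksum stored alongside the data
        buf.put(data);
        return buf.array();
    }

    static byte[] readBlock(byte[] block) {
        ByteBuffer buf = ByteBuffer.wrap(block);
        long stored = buf.getLong();
        byte[] data = new byte[block.length - 8];
        buf.get(data);
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        if (crc.getValue() != stored) {
            throw new IllegalStateException("checksum mismatch");
        }
        return data;
    }
}
```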
[jira] [Commented] (HBASE-5522) hbase 0.92 test artifacts are missing from Maven central
[ https://issues.apache.org/jira/browse/HBASE-5522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223710#comment-13223710 ] Hudson commented on HBASE-5522: --- Integrated in HBase-0.94 #17 (See [https://builds.apache.org/job/HBase-0.94/17/]) HBASE-5522 hbase 0.92 test artifacts are missing from Maven central (Revision 1297682) Result = FAILURE stack : Files : * /hbase/branches/0.94/pom.xml hbase 0.92 test artifacts are missing from Maven central Key: HBASE-5522 URL: https://issues.apache.org/jira/browse/HBASE-5522 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.0 Reporter: Roman Shaposhnik Assignee: stack Fix For: 0.92.1, 0.94.0 Attachments: 5522.txt Could someone with enough karma, please, publish the test artifacts for 0.92.0?
[jira] [Commented] (HBASE-5519) Incorrect warning in splitlogmanager
[ https://issues.apache.org/jira/browse/HBASE-5519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223711#comment-13223711 ] Hudson commented on HBASE-5519: --- Integrated in HBase-0.94 #17 (See [https://builds.apache.org/job/HBase-0.94/17/]) HBASE-5519 Incorrect warning in splitlogmanager (Revision 1297708) Result = FAILURE stack : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java Incorrect warning in splitlogmanager Key: HBASE-5519 URL: https://issues.apache.org/jira/browse/HBASE-5519 Project: HBase Issue Type: Improvement Reporter: Prakash Khemani Assignee: Prakash Khemani Attachments: 0001-HBASE-5519-Incorrect-warning-in-splitlogmanager.patch, 5519.92.txt Because of recently added behavior - where the splitlogmanager timeout thread gets data from the zk node just to check that the zk node is there - we might have multiple watches firing without the task znode expiring. Remove the poor warning message. (Internally, there was an assert that failed in Mikhail's tests.)
[jira] [Updated] (HBASE-5533) Add more metrics to HBase
[ https://issues.apache.org/jira/browse/HBASE-5533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shaneal Manek updated HBASE-5533: - Attachment: hbase5533-0.92-v2.patch Re hadoop-metrics.properties: You are right - I didn't mean to include that change. Re BlockingQueue contention: I don't expect much contention since there is only one (non-blocking) 'offer()' per FS I/O (figure that, at the absolute worst with a large array of fast SSDs, we're still only doing a few tens of thousands of IOs/second). I've attached a microbenchmark showing the current technique works fine for up to at least a few million offers()/second under heavy contention (with 99th percentile latencies under 0.001 milliseconds). If you still suspect this might be troublesome, I'd be happy to stripe the accesses to reduce contention (we don't need a strict ordering). I think that's probably unnecessary complexity though. Re logging: Good point - changed it to not log more than once per 30 seconds. Re AtomicLong counters: Didn't know about the Cliff Click counters - I switched to using them. Add more metrics to HBase - Key: HBASE-5533 URL: https://issues.apache.org/jira/browse/HBASE-5533 Project: HBase Issue Type: Improvement Affects Versions: 0.92.0 Reporter: Shaneal Manek Assignee: Shaneal Manek Priority: Minor Attachments: hbase-5533-0.92.patch, hbase5533-0.92-v2.patch To debug/monitor production clusters, there are some more metrics I wish I had available. In particular: - Although the average FS latencies are useful, a 'histogram' of recent latencies (90% of reads completed in under 100ms, 99% in under 200ms, etc) would be more useful - Similar histograms of latencies on common operations (GET, PUT, DELETE) would be useful - Counting the number of accesses to each region to detect hotspotting - Exposing the current number of HLog files
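The "histogram of recent latencies" metric requested in HBASE-5533 could look roughly like the sketch below: a bounded ring buffer of samples with nearest-rank percentiles computed on demand. This is only an illustration of the idea; the patch may implement it quite differently.

```java
import java.util.Arrays;

// Illustrative sketch of a recent-latency histogram: keep the last N samples
// in a ring buffer and report percentiles over them. Not the patch's code.
class LatencyHistogramSketch {
    private final long[] samples;
    private int count = 0;

    LatencyHistogramSketch(int capacity) {
        samples = new long[capacity];
    }

    // Record one latency; once full, overwrite the oldest slot (ring buffer).
    void update(long latencyMs) {
        samples[count % samples.length] = latencyMs;
        count++;
    }

    // Nearest-rank percentile (e.g. p = 90.0) over the recorded samples.
    long percentile(double p) {
        int n = Math.min(count, samples.length);
        if (n == 0) {
            return 0;
        }
        long[] sorted = Arrays.copyOf(samples, n);
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * n);
        return sorted[Math.max(0, rank - 1)];
    }
}
```

With this shape a regionserver could report "90% of reads completed in under `percentile(90)` ms" rather than only an average, which is exactly the distinction the issue draws.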
[jira] [Updated] (HBASE-5533) Add more metrics to HBase
[ https://issues.apache.org/jira/browse/HBASE-5533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shaneal Manek updated HBASE-5533: - Attachment: BlockingQueueContention.java This is just the micro-benchmark to evaluate how ArrayBlockingQueues perform under contention. It seems to be good enough for our purposes. Add more metrics to HBase - Key: HBASE-5533 URL: https://issues.apache.org/jira/browse/HBASE-5533 Project: HBase Issue Type: Improvement Affects Versions: 0.92.0 Reporter: Shaneal Manek Assignee: Shaneal Manek Priority: Minor Attachments: BlockingQueueContention.java, hbase-5533-0.92.patch, hbase5533-0.92-v2.patch To debug/monitor production clusters, there are some more metrics I wish I had available. In particular: - Although the average FS latencies are useful, a 'histogram' of recent latencies (90% of reads completed in under 100ms, 99% in under 200ms, etc) would be more useful - Similar histograms of latencies on common operations (GET, PUT, DELETE) would be useful - Counting the number of accesses to each region to detect hotspotting - Exposing the current number of HLog files
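The attachment itself is not reproduced here, but a contention microbenchmark of this shape would have several writer threads hammering a shared bounded queue with non-blocking offer() calls while a drainer empties it. Everything below, class and method names included, is a guess at the idea, not the attached BlockingQueueContention.java:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch of an offer() contention microbenchmark.
class OfferContentionSketch {
    // Returns elapsed nanoseconds for `threads` writers each performing
    // `offersPerThread` non-blocking offer() calls against a shared queue.
    static long run(int threads, int offersPerThread) {
        BlockingQueue<Long> queue = new ArrayBlockingQueue<>(1024);
        // A daemon drainer empties the queue so offer() usually finds space.
        Thread drainer = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                queue.poll();
            }
        });
        drainer.setDaemon(true);
        drainer.start();

        CountDownLatch done = new CountDownLatch(threads);
        long start = System.nanoTime();
        for (int t = 0; t < threads; t++) {
            new Thread(() -> {
                for (int i = 0; i < offersPerThread; i++) {
                    queue.offer(System.nanoTime()); // drops on a full queue
                }
                done.countDown();
            }).start();
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        drainer.interrupt();
        return System.nanoTime() - start;
    }
}
```

Because offer() never blocks, each FS I/O pays at most one CAS-style queue operation, which is why the comment above expects the cost to stay negligible even at tens of thousands of I/Os per second.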
[jira] [Commented] (HBASE-5521) Move compression/decompression to an encoder specific encoding context
[ https://issues.apache.org/jira/browse/HBASE-5521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223723#comment-13223723 ] Phabricator commented on HBASE-5521: heyongqiang has commented on the revision HBASE-5521 [jira] Move compression/decompression to an encoder specific encoding context. INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java:53 I think here it is DataBlockEncoder (prefix, diff etc), and not HFileDataBlockEncoder. From an encoding you can get the encoder, but from an encoder you cannot get the encoding. The class field encoding is not needed, will remove it. REVISION DETAIL https://reviews.facebook.net/D2097 Move compression/decompression to an encoder specific encoding context -- Key: HBASE-5521 URL: https://issues.apache.org/jira/browse/HBASE-5521 Project: HBase Issue Type: Improvement Reporter: He Yongqiang Assignee: He Yongqiang Attachments: HBASE-5521.1.patch, HBASE-5521.D2097.1.patch, HBASE-5521.D2097.2.patch As part of working on HBASE-5313, we want to add a new columnar encoder/decoder. It makes sense to move compression to be part of encoder/decoder: 1) a scanner for a columnar encoded block can do lazy decompression to a specific part of a key value object 2) avoid an extra bytes copy from encoder to hblock-writer. If there is no encoder specified for a writer, the HBlock.Writer will use a default compression-context to do something very similar to today's code.
[jira] [Commented] (HBASE-5533) Add more metrics to HBase
[ https://issues.apache.org/jira/browse/HBASE-5533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223727#comment-13223727 ] Hadoop QA commented on HBASE-5533: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12517316/BlockingQueueContention.java against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1120//console This message is automatically generated. Add more metrics to HBase - Key: HBASE-5533 URL: https://issues.apache.org/jira/browse/HBASE-5533 Project: HBase Issue Type: Improvement Affects Versions: 0.92.0 Reporter: Shaneal Manek Assignee: Shaneal Manek Priority: Minor Attachments: BlockingQueueContention.java, hbase-5533-0.92.patch, hbase5533-0.92-v2.patch To debug/monitor production clusters, there are some more metrics I wish I had available. In particular: - Although the average FS latencies are useful, a 'histogram' of recent latencies (90% of reads completed in under 100ms, 99% in under 200ms, etc) would be more useful - Similar histograms of latencies on common operations (GET, PUT, DELETE) would be useful - Counting the number of accesses to each region to detect hotspotting - Exposing the current number of HLog files
[jira] [Updated] (HBASE-4608) HLog Compression
[ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-4608: -- Comment: was deleted (was: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12517160/4608v17.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 9 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -127 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 155 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.wal.TestWALReplayCompressed org.apache.hadoop.hbase.mapreduce.TestImportTsv org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1103//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1103//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1103//console This message is automatically generated.) HLog Compression Key: HBASE-4608 URL: https://issues.apache.org/jira/browse/HBASE-4608 Project: HBase Issue Type: New Feature Reporter: Li Pi Assignee: Li Pi Fix For: 0.94.0 Attachments: 4608v1.txt, 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt The current bottleneck to HBase write speed is replicating the WAL appends across different datanodes. We can speed up this process by compressing the HLog. 
Current plan involves using a dictionary to compress table name, region id, cf name, and possibly other bits of repeated data. Also, HLog format may be changed in other ways to produce a smaller HLog.
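The dictionary idea can be illustrated with a minimal sketch: repeated strings such as table names and column family names map to small integer indices, so after the first occurrence the WAL stream can carry the index instead of the full value. This is only a toy model of the plan above, not the HBASE-4608 implementation:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of WAL dictionary compression: not the HBASE-4608 code.
class WalDictionarySketch {
    private final Map<String, Integer> toIndex = new HashMap<>();
    private final Map<Integer, String> toValue = new HashMap<>();

    // Returns the existing index for a value, or assigns the next free one.
    int encode(String value) {
        Integer idx = toIndex.get(value);
        if (idx == null) {
            idx = toIndex.size();
            toIndex.put(value, idx);
            toValue.put(idx, value);
        }
        return idx;
    }

    // Maps an index back to the original value on the read path.
    String decode(int index) {
        return toValue.get(index);
    }
}
```

Since every edit in a WAL repeats the same table name and column family, replacing those strings with a couple of bytes of index shrinks each append, which is exactly where replication across datanodes pays per byte.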
[jira] [Updated] (HBASE-5521) Move compression/decompression to an encoder specific encoding context
[ https://issues.apache.org/jira/browse/HBASE-5521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-5521: --- Attachment: HBASE-5521.D2097.3.patch heyongqiang updated the revision HBASE-5521 [jira] Move compression/decompression to an encoder specific encoding context. Reviewers: JIRA, dhruba, tedyu, sc, mbautin add mbautin's comments, and changed all Cxt to Ctx REVISION DETAIL https://reviews.facebook.net/D2097 AFFECTED FILES src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockDefaultEncodingContext.java src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockDecodingContext.java src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockDefaultDecodingContext.java src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoding.java src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockEncodingContext.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java 
src/main/java/org/apache/hadoop/hbase/io/hfile/Compression.java src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java Move compression/decompression to an encoder specific encoding context -- Key: HBASE-5521 URL: https://issues.apache.org/jira/browse/HBASE-5521 Project: HBase Issue Type: Improvement Reporter: He Yongqiang Assignee: He Yongqiang Attachments: HBASE-5521.1.patch, HBASE-5521.D2097.1.patch, HBASE-5521.D2097.2.patch, HBASE-5521.D2097.3.patch As part of working on HBASE-5313, we want to add a new columnar encoder/decoder. It makes sense to move compression to be part of encoder/decoder: 1) a scanner for a columnar encoded block can do lazy decompression to a specific part of a key value object 2) avoid an extra bytes copy from encoder to hblock-writer. If there is no encoder specified for a writer, the HBlock.Writer will use a default compression-context to do something very similar to today's code.
[jira] [Updated] (HBASE-4542) add filter info to slow query logging
[ https://issues.apache.org/jira/browse/HBASE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Bautin updated HBASE-4542: -- Attachment: Add-filter-info-to-slow-query-logging-2012-03-06_14_28_13.patch Rebasing patch on trunk add filter info to slow query logging - Key: HBASE-4542 URL: https://issues.apache.org/jira/browse/HBASE-4542 Project: HBase Issue Type: Improvement Affects Versions: 0.89.20100924 Reporter: Kannan Muthukkaruppan Assignee: Madhuwanti Vaidya Attachments: 0001-jira-HBASE-4542-Add-filter-info-to-slow-query-loggin.patch, Add-filter-info-to-slow-query-logging-2012-03-06_14_28_13.patch, D1263.2.patch, D1539.1.patch Slow query log doesn't report filters in effect. For example: {code} (operationTooSlow): \ {processingtimems:3468,client:10.138.43.206:40035,timeRange: [0,9223372036854775807],\ starttimems:1317772005821,responsesize:42411, \ class:HRegionServer,table:myTable,families:{CF1:ALL]},\ row:6c3b8efa132f0219b7621ed1e5c8c70b,queuetimems:0,\ method:get,totalColumns:1,maxVersions:1,storeLimit:-1} {code} The above would suggest that all columns of myTable:CF1 are being requested for the given row. But in reality there could be filters in effect (such as ColumnPrefixFilter, ColumnRangeFilter, TimestampsFilter() etc.). We should enhance the slow query log to capture and report this information.
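What the requested change amounts to can be sketched as follows; the method name (toSlowLogMap) and the "filter" key are assumptions for illustration, not HBase's actual logging code:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hedged sketch of the HBASE-4542 idea: alongside the existing fields in the
// (operationTooSlow) JSON map, record a human-readable filter summary.
class SlowLogSketch {
    static Map<String, Object> toSlowLogMap(String table, String row,
                                            long processingTimeMs,
                                            String filterSummary) {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put("class", "HRegionServer");
        m.put("table", table);
        m.put("row", row);
        m.put("processingtimems", processingTimeMs);
        // The new piece: without this field, the log suggests all columns were
        // read even when e.g. a ColumnPrefixFilter restricted the request.
        if (filterSummary != null) {
            m.put("filter", filterSummary);
        }
        return m;
    }
}
```

With a "filter" entry present, a reader of the slow query log can tell a full-family get from one narrowed by ColumnPrefixFilter or TimestampsFilter at a glance.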
[jira] [Updated] (HBASE-5521) Move compression/decompression to an encoder specific encoding context
[ https://issues.apache.org/jira/browse/HBASE-5521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5521: -- Hadoop Flags: Reviewed Status: Patch Available (was: Open) Move compression/decompression to an encoder specific encoding context -- Key: HBASE-5521 URL: https://issues.apache.org/jira/browse/HBASE-5521 Project: HBase Issue Type: Improvement Reporter: He Yongqiang Assignee: He Yongqiang Attachments: HBASE-5521.1.patch, HBASE-5521.D2097.1.patch, HBASE-5521.D2097.2.patch, HBASE-5521.D2097.3.patch As part of working on HBASE-5313, we want to add a new columnar encoder/decoder. It makes sense to move compression to be part of encoder/decoder: 1) a scanner for a columnar encoded block can do lazy decompression to a specific part of a key value object 2) avoid an extra bytes copy from encoder to hblock-writer. If there is no encoder specified for a writer, the HBlock.Writer will use a default compression-context to do something very similar to today's code.
[jira] [Updated] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row
[ https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-5515: --- Attachment: HBASE-5515.D2067.18.patch sc updated the revision HBASE-5515 [jira] Add a processRow API that supports atomic multiple reads and writes on a row. Reviewers: tedyu, dhruba, JIRA Remove some repeated codes in unit tests REVISION DETAIL https://reviews.facebook.net/D2067 AFFECTED FILES src/main/java/org/apache/hadoop/hbase/coprocessor/ProcessRowEndpoint.java src/main/java/org/apache/hadoop/hbase/coprocessor/ProcessRowProtocol.java src/main/java/org/apache/hadoop/hbase/coprocessor/RowProcessor.java src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java src/test/java/org/apache/hadoop/hbase/coprocessor/TestProcessRowEndpoint.java src/test/java/org/apache/hadoop/hbase/coprocessor/TestProcessRowWithCustomMadeProtocol.java Add a processRow API that supports atomic multiple reads and writes on a row Key: HBASE-5515 URL: https://issues.apache.org/jira/browse/HBASE-5515 Project: HBase Issue Type: New Feature Reporter: Scott Chen Assignee: Scott Chen Attachments: HBASE-5515.D2067.1.patch, HBASE-5515.D2067.10.patch, HBASE-5515.D2067.11.patch, HBASE-5515.D2067.12.patch, HBASE-5515.D2067.13.patch, HBASE-5515.D2067.14.patch, HBASE-5515.D2067.15.patch, HBASE-5515.D2067.16.patch, HBASE-5515.D2067.17.patch, HBASE-5515.D2067.18.patch, HBASE-5515.D2067.2.patch, HBASE-5515.D2067.3.patch, HBASE-5515.D2067.4.patch, HBASE-5515.D2067.5.patch, HBASE-5515.D2067.6.patch, HBASE-5515.D2067.7.patch, HBASE-5515.D2067.8.patch, HBASE-5515.D2067.9.patch We have modified HRegion.java internally to do some atomic row processing. It would be nice to have a pluggable API for this.
[jira] [Commented] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row
[ https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223750#comment-13223750 ] Hadoop QA commented on HBASE-5515: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12517313/HBASE-5515.D2067.17.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -129 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 155 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.mapreduce.TestImportTsv org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1119//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1119//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1119//console This message is automatically generated. 
Add a processRow API that supports atomic multiple reads and writes on a row Key: HBASE-5515 URL: https://issues.apache.org/jira/browse/HBASE-5515 Project: HBase Issue Type: New Feature Reporter: Scott Chen Assignee: Scott Chen Attachments: HBASE-5515.D2067.1.patch, HBASE-5515.D2067.10.patch, HBASE-5515.D2067.11.patch, HBASE-5515.D2067.12.patch, HBASE-5515.D2067.13.patch, HBASE-5515.D2067.14.patch, HBASE-5515.D2067.15.patch, HBASE-5515.D2067.16.patch, HBASE-5515.D2067.17.patch, HBASE-5515.D2067.18.patch, HBASE-5515.D2067.2.patch, HBASE-5515.D2067.3.patch, HBASE-5515.D2067.4.patch, HBASE-5515.D2067.5.patch, HBASE-5515.D2067.6.patch, HBASE-5515.D2067.7.patch, HBASE-5515.D2067.8.patch, HBASE-5515.D2067.9.patch We have modified HRegion.java internally to do some atomic row processing. It would be nice to have a pluggable API for this.
[jira] [Commented] (HBASE-4542) add filter info to slow query logging
[ https://issues.apache.org/jira/browse/HBASE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223755#comment-13223755 ] Mikhail Bautin commented on HBASE-4542: --- All unit tests passed locally. Waiting for Hadoop QA before committing. add filter info to slow query logging - Key: HBASE-4542 URL: https://issues.apache.org/jira/browse/HBASE-4542 Project: HBase Issue Type: Improvement Affects Versions: 0.89.20100924 Reporter: Kannan Muthukkaruppan Assignee: Madhuwanti Vaidya Attachments: 0001-jira-HBASE-4542-Add-filter-info-to-slow-query-loggin.patch, Add-filter-info-to-slow-query-logging-2012-03-06_14_28_13.patch, D1263.2.patch, D1539.1.patch Slow query log doesn't report filters in effect. For example: {code} (operationTooSlow): \ {processingtimems:3468,client:10.138.43.206:40035,timeRange: [0,9223372036854775807],\ starttimems:1317772005821,responsesize:42411, \ class:HRegionServer,table:myTable,families:{CF1:ALL]},\ row:6c3b8efa132f0219b7621ed1e5c8c70b,queuetimems:0,\ method:get,totalColumns:1,maxVersions:1,storeLimit:-1} {code} The above would suggest that all columns of myTable:CF1 are being requested for the given row. But in reality there could be filters in effect (such as ColumnPrefixFilter, ColumnRangeFilter, TimestampsFilter() etc.). We should enhance the slow query log to capture and report this information.
[jira] [Assigned] (HBASE-5213) hbase master stop does not bring down backup masters
[ https://issues.apache.org/jira/browse/HBASE-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gregory Chanan reassigned HBASE-5213: - Assignee: Gregory Chanan hbase master stop does not bring down backup masters -- Key: HBASE-5213 URL: https://issues.apache.org/jira/browse/HBASE-5213 Project: HBase Issue Type: Bug Reporter: Gregory Chanan Assignee: Gregory Chanan Priority: Minor Typing hbase master stop produces the following message: stop Start cluster shutdown; Master signals RegionServer shutdown It seems like backup masters should be considered part of the cluster, but they are not brought down by hbase master stop. stop-hbase.sh does correctly bring down the backup masters. The same behavior is observed when a client app makes use of the client API HBaseAdmin.shutdown() http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HBaseAdmin.html#shutdown() -- this isn't too surprising since I think hbase master stop just calls this API. It seems like HBASE-1448 addressed this; perhaps there was a regression?
[jira] [Commented] (HBASE-4542) add filter info to slow query logging
[ https://issues.apache.org/jira/browse/HBASE-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223785#comment-13223785 ] Hadoop QA commented on HBASE-4542: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12517322/Add-filter-info-to-slow-query-logging-2012-03-06_14_28_13.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -129 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 154 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestImportTsv Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1121//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1121//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1121//console This message is automatically generated. add filter info to slow query logging - Key: HBASE-4542 URL: https://issues.apache.org/jira/browse/HBASE-4542 Project: HBase Issue Type: Improvement Affects Versions: 0.89.20100924 Reporter: Kannan Muthukkaruppan Assignee: Madhuwanti Vaidya Attachments: 0001-jira-HBASE-4542-Add-filter-info-to-slow-query-loggin.patch, Add-filter-info-to-slow-query-logging-2012-03-06_14_28_13.patch, D1263.2.patch, D1539.1.patch Slow query log doesn't report filters in effect. 
For example: {code} (operationTooSlow): \ {processingtimems:3468,client:10.138.43.206:40035,timeRange: [0,9223372036854775807],\ starttimems:1317772005821,responsesize:42411, \ class:HRegionServer,table:myTable,families:{CF1:ALL]},\ row:6c3b8efa132f0219b7621ed1e5c8c70b,queuetimems:0,\ method:get,totalColumns:1,maxVersions:1,storeLimit:-1} {code} The above would suggest that all columns of myTable:CF1 are being requested for the given row. But in reality there could be filters in effect (such as ColumnPrefixFilter, ColumnRangeFilter, TimestampsFilter() etc.). We should enhance the slow query log to capture and report this information.
[jira] [Created] (HBASE-5534) HBase shell's return value is almost always 0
HBase shell's return value is almost always 0 - Key: HBASE-5534 URL: https://issues.apache.org/jira/browse/HBASE-5534 Project: HBase Issue Type: Improvement Reporter: Alex Newman Assignee: Alex Newman So I was trying to write some simple scripts to verify client connections to HBase using the shell and I noticed that the HBase shell always returns 0 even when it can't connect to an HBase server. I'm not sure if this is the best option. What would be neat is if you had some capability to run commands like hbase shell --command='disable table;\ndrop table;' and it would error out if any of the commands fail to succeed. echo disable table | hbase shell could continue to work as it does now.
[jira] [Commented] (HBASE-5521) Move compression/decompression to an encoder specific encoding context
[ https://issues.apache.org/jira/browse/HBASE-5521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13223795#comment-13223795 ] Hadoop QA commented on HBASE-5521: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12517319/HBASE-5521.D2097.3.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 12 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -125 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 158 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.io.encoding.TestDataBlockEncoders Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1122//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1122//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1122//console This message is automatically generated. Move compression/decompression to an encoder specific encoding context -- Key: HBASE-5521 URL: https://issues.apache.org/jira/browse/HBASE-5521 Project: HBase Issue Type: Improvement Reporter: He Yongqiang Assignee: He Yongqiang Attachments: HBASE-5521.1.patch, HBASE-5521.D2097.1.patch, HBASE-5521.D2097.2.patch, HBASE-5521.D2097.3.patch As part of working on HBASE-5313, we want to add a new columnar encoder/decoder. It makes sense to move compression to be part of encoder/decoder: 1) a scanner for a columnar encoded block can do lazy decompression to a specific part of a key value object 2) avoid an extra bytes copy from encoder to hblock-writer. 
If there is no encoder specified for a writer, the HBlock.Writer will use a default compression context to do something very similar to today's code.
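The idea in the description can be sketched roughly as follows. This is an illustrative Java sketch only, not the API from the patch: the names `EncodingContextSketch` and `EncodingContext` are hypothetical, and DEFLATE stands in for whatever codec the block is actually configured with. The point is that the encoder owns the compression state, so the block writer hands bytes to the context once instead of compressing (and copying) them itself.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;

// Hypothetical sketch: these names are not the actual HBase classes.
public class EncodingContextSketch {

    /** An encoder-owned context holding compression state, so the block
     *  writer receives the final compressed block straight from the
     *  encoder with no intermediate copy of its own. */
    static class EncodingContext {
        private final Deflater deflater = new Deflater(Deflater.BEST_SPEED);

        byte[] compress(byte[] encodedBlock) {
            deflater.reset();                 // context is reusable per block
            deflater.setInput(encodedBlock);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[1024];
            while (!deflater.finished()) {    // drain until the codec is done
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
    }

    public static void main(String[] args) {
        EncodingContext ctx = new EncodingContext();
        byte[] block = "row1/cf:qual/value ".repeat(200).getBytes();
        byte[] compressed = ctx.compress(block);
        System.out.println("raw=" + block.length + " compressed=" + compressed.length);
    }
}
```

A real decoder would hold a matching decompression context, which is what lets a columnar scanner decompress lazily down to just the slice of the block it needs.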
[jira] [Commented] (HBASE-5515) Add a processRow API that supports atomic multiple reads and writes on a row
[ https://issues.apache.org/jira/browse/HBASE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223800#comment-13223800 ]

Hadoop QA commented on HBASE-5515:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12517326/HBASE-5515.D2067.18.patch
  against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 4 new or modified tests.
-1 javadoc. The javadoc tool appears to have generated -129 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 155 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
  org.apache.hadoop.hbase.replication.TestReplication
  org.apache.hadoop.hbase.mapreduce.TestImportTsv
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1123//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1123//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1123//console

This message is automatically generated.
Add a processRow API that supports atomic multiple reads and writes on a row
----------------------------------------------------------------------------

Key: HBASE-5515
URL: https://issues.apache.org/jira/browse/HBASE-5515
Project: HBase
Issue Type: New Feature
Reporter: Scott Chen
Assignee: Scott Chen
Attachments: HBASE-5515.D2067.1.patch, HBASE-5515.D2067.10.patch, HBASE-5515.D2067.11.patch, HBASE-5515.D2067.12.patch, HBASE-5515.D2067.13.patch, HBASE-5515.D2067.14.patch, HBASE-5515.D2067.15.patch, HBASE-5515.D2067.16.patch, HBASE-5515.D2067.17.patch, HBASE-5515.D2067.18.patch, HBASE-5515.D2067.2.patch, HBASE-5515.D2067.3.patch, HBASE-5515.D2067.4.patch, HBASE-5515.D2067.5.patch, HBASE-5515.D2067.6.patch, HBASE-5515.D2067.7.patch, HBASE-5515.D2067.8.patch, HBASE-5515.D2067.9.patch

We have modified HRegion.java internally to do some atomic row processing. It would be nice to have a pluggable API for this.
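The kind of pluggable hook the issue asks for can be sketched like this. All names here (`ProcessRowSketch`, `processRow`) are hypothetical, and a single `ReentrantLock` plus an in-memory map stand in for HBase's real per-row locks and store. The idea is only that a caller-supplied processor runs its reads and writes while the row lock is held, so the whole read-modify-write step appears atomic to other callers on the same row.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

// Hypothetical sketch of a processRow-style pluggable API; not HBase code.
public class ProcessRowSketch {
    private final Map<String, Map<String, Long>> table = new HashMap<>();
    private final ReentrantLock rowLock = new ReentrantLock(); // one lock per row in a real region

    /** Runs the processor with the row lock held, so its reads and
     *  writes on the row's cells appear atomic to concurrent callers. */
    public <T> T processRow(String row, Function<Map<String, Long>, T> processor) {
        rowLock.lock();
        try {
            Map<String, Long> cells = table.computeIfAbsent(row, r -> new HashMap<>());
            return processor.apply(cells);
        } finally {
            rowLock.unlock();
        }
    }

    public static void main(String[] args) {
        ProcessRowSketch region = new ProcessRowSketch();
        // Read-modify-write two counters in one atomic step.
        long total = region.processRow("row1", cells -> {
            long a = cells.merge("cf:a", 1L, Long::sum);
            long b = cells.merge("cf:b", 2L, Long::sum);
            return a + b;
        });
        System.out.println("total=" + total); // 3 on the first call
    }
}
```

With a lock-free get API, two separate increment calls could interleave; wrapping both in one processor closure is what makes the multi-cell update atomic.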
[jira] [Updated] (HBASE-4608) HLog Compression
[ https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-4608:
------------------------------

Attachment: 4608-v19.txt

Patch v19 from review board.

HLog Compression
----------------

Key: HBASE-4608
URL: https://issues.apache.org/jira/browse/HBASE-4608
Project: HBase
Issue Type: New Feature
Reporter: Li Pi
Assignee: Li Pi
Fix For: 0.94.0
Attachments: 4608-v19.txt, 4608v1.txt, 4608v13.txt, 4608v13.txt, 4608v14.txt, 4608v15.txt, 4608v16.txt, 4608v17.txt, 4608v18.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 4608v8fixed.txt

The current bottleneck to HBase write speed is replicating the WAL appends across different datanodes. We can speed up this process by compressing the HLog. The current plan involves using a dictionary to compress the table name, region id, column family name, and possibly other bits of repeated data. The HLog format may also be changed in other ways to produce a smaller HLog.
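The dictionary idea in the description can be sketched as follows. This is an assumption-laden illustration, not the HLog code: `WalDictionarySketch` and its methods are made-up names. Repeated values such as the table name or column family are written in full only once; every later occurrence is replaced by a small integer index, and the reader rebuilds the same dictionary in the same order as it replays the log.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of dictionary compression for repeated WAL fields.
public class WalDictionarySketch {
    private final Map<String, Integer> toIndex = new HashMap<>();
    private final List<String> toValue = new ArrayList<>();

    /** First sighting stores the value and returns a fresh index; later
     *  sightings reuse the existing index, so only the small integer
     *  needs to go on the wire. */
    public int writeEntry(String value) {
        Integer idx = toIndex.get(value);
        if (idx != null) {
            return idx;               // emit just the index, not the bytes
        }
        int newIdx = toValue.size();  // emit the full value plus this index
        toIndex.put(value, newIdx);
        toValue.add(value);
        return newIdx;
    }

    /** The reader, having seen entries in the same order, resolves an
     *  index back to the original value. */
    public String readEntry(int index) {
        return toValue.get(index);
    }

    public static void main(String[] args) {
        WalDictionarySketch dict = new WalDictionarySketch();
        int t1 = dict.writeEntry("usertable"); // full bytes written once
        int t2 = dict.writeEntry("usertable"); // only the index afterwards
        System.out.println(t1 == t2);          // true: same dictionary slot
    }
}
```

Since a WAL entry repeats the table name, region id, and family on every append, replacing each with a one- or two-byte index shrinks the bytes that must be replicated across datanodes, which is the bottleneck the issue describes.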