[jira] [Commented] (HBASE-4296) Deprecate HTable[Interface].getRowOrBefore(...)
[ https://issues.apache.org/jira/browse/HBASE-4296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214435#comment-13214435 ]

dhruba borthakur commented on HBASE-4296:
-----------------------------------------

The ThriftServer uses HTable.getRowOrBefore() to find an entry in the .META. table. This used to work with hbase-0.92 but returns null for hbase-0.94. Did something change here?

Deprecate HTable[Interface].getRowOrBefore(...)
-----------------------------------------------
        Key: HBASE-4296
        URL: https://issues.apache.org/jira/browse/HBASE-4296
    Project: HBase
 Issue Type: Bug
 Components: client
 Affects Versions: 0.92.0
   Reporter: Lars Hofhansl
   Assignee: Lars Hofhansl
   Priority: Trivial
    Fix For: 0.92.0
Attachments: 4296.txt

HTable's getRowOrBefore(...) internally calls into Store.getRowKeyAtOrBefore(...). That method was created to allow scanning of .META. (see HBASE-2600). Store.getRowKeyAtOrBefore(...) lists a bunch of requirements for it to be performant that a user of HTable will not be aware of. I propose deprecating this in the public interface in 0.92 and removing it from the public interface in 0.94. If we don't get to HBASE-2600 in 0.94 it will still remain as an internal interface for scanning meta. Comments?
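For reference, a minimal sketch of the client call being deprecated, assuming the 0.92-era HTable API; the table name suffix and key below are illustrative, not taken from the issue:

{code:java}
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class RowOrBeforeExample {
  public static void main(String[] args) throws Exception {
    HTable meta = new HTable(HBaseConfiguration.create(), ".META.");
    // Returns the row whose key is at or immediately before the given key --
    // the closest-before semantics that meta lookups rely on.
    Result r = meta.getRowOrBefore(Bytes.toBytes("usertable,somekey,99999"),
        Bytes.toBytes("info"));
    System.out.println(r == null ? "no row found" : Bytes.toStringBinary(r.getRow()));
    meta.close();
  }
}
{code}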
[jira] [Commented] (HBASE-5437) HRegionThriftServer does not start because of a bug in HbaseHandlerMetricsProxy
[ https://issues.apache.org/jira/browse/HBASE-5437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214447#comment-13214447 ]

Hadoop QA commented on HBASE-5437:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12515720/HBASE-5437.D1887.2.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
-1 javadoc. The javadoc tool appears to have generated -136 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 152 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
  org.apache.hadoop.hbase.mapreduce.TestImportTsv
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1022//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1022//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1022//console

This message is automatically generated.

HRegionThriftServer does not start because of a bug in HbaseHandlerMetricsProxy
--------------------------------------------------------------------------------
        Key: HBASE-5437
        URL: https://issues.apache.org/jira/browse/HBASE-5437
    Project: HBase
 Issue Type: Bug
 Components: metrics, thrift
   Reporter: Scott Chen
   Assignee: Scott Chen
    Fix For: 0.94.0
Attachments: HBASE-5437.D1857.1.patch, HBASE-5437.D1887.1.patch, HBASE-5437.D1887.2.patch

{code}
3.facebook.com,60020,1329865516120: Initialization of RS failed. Hence aborting RS.
java.lang.ClassCastException: $Proxy9 cannot be cast to org.apache.hadoop.hbase.thrift.generated.Hbase$Iface
        at org.apache.hadoop.hbase.thrift.HbaseHandlerMetricsProxy.newInstance(HbaseHandlerMetricsProxy.java:47)
        at org.apache.hadoop.hbase.thrift.ThriftServerRunner.<init>(ThriftServerRunner.java:239)
        at org.apache.hadoop.hbase.regionserver.HRegionThriftServer.<init>(HRegionThriftServer.java:74)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeThreads(HRegionServer.java:646)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:546)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:658)
        at java.lang.Thread.run(Thread.java:662)
2012-02-21 15:05:18,749 FATAL org.apache.hadoop.h
{code}
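The "$ProxyN cannot be cast to ..." error in the trace is the standard symptom of a java.lang.reflect.Proxy being created without the target interface in its interface list. A self-contained, JDK-only reproduction of that pattern follows; the interface and handler are stand-ins, not the actual HBase code:

{code:java}
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

public class ProxyCastDemo {
  interface Iface { void ping(); }

  public static void main(String[] args) {
    InvocationHandler handler = new InvocationHandler() {
      public Object invoke(Object proxy, Method method, Object[] margs) {
        System.out.println("invoked: " + method.getName());
        return null;
      }
    };

    // Proxy created WITHOUT Iface in its interface list: the cast below
    // fails with "$ProxyN cannot be cast to ..." just like the RS abort.
    Object wrong = Proxy.newProxyInstance(ProxyCastDemo.class.getClassLoader(),
        new Class<?>[] { Runnable.class }, handler);
    try {
      Iface bad = (Iface) wrong;
    } catch (ClassCastException e) {
      System.out.println("cast failed: " + e);
    }

    // Listing the target interface when creating the proxy makes the cast legal.
    Iface ok = (Iface) Proxy.newProxyInstance(ProxyCastDemo.class.getClassLoader(),
        new Class<?>[] { Iface.class }, handler);
    ok.ping();
  }
}
{code}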
[jira] [Updated] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
[ https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-5270:
--------------------------------

Attachment: hbase-5270v5.patch

Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
------------------------------------------------------------------------------------------------------
        Key: HBASE-5270
        URL: https://issues.apache.org/jira/browse/HBASE-5270
    Project: HBase
 Issue Type: Sub-task
 Components: master
   Reporter: Zhihong Yu
   Assignee: chunhui shen
    Fix For: 0.92.1, 0.94.0
Attachments: 5270-90-testcase.patch, 5270-90-testcasev2.patch, 5270-90.patch, 5270-90v2.patch, 5270-90v3.patch, 5270-testcase.patch, 5270-testcasev2.patch, hbase-5270.patch, hbase-5270v2.patch, hbase-5270v4.patch, hbase-5270v5.patch, sampletest.txt

This JIRA continues the effort from HBASE-5179. Starting with Stack's comments about patches for 0.92 and TRUNK.

Reviewing 0.92v17:
- isDeadServerInProgress is a new public method in ServerManager, but it does not seem to be used anywhere.
- Does isDeadRootServerInProgress need to be public? Ditto for the meta version.
- The method param name 'definitiveRootServer' is not right; what is meant by "definitive"? Does it need this qualifier?
- Is there anything in place to stop us expiring a server twice if it's carrying root and meta?
- What is the difference between asking the assignment manager isCarryingRoot and this variable that is passed in? It should be doc'd at least. Ditto for meta.
- I think I've asked for this a few times: onlineServers needs to be explained, either in javadoc or in a comment. This is the param passed into joinCluster. How does it arise? I think I know but am unsure. God love the poor noob that comes a-wandering through this code trying to make sense of it all. It looks like we get the list by trawling zk for regionserver znodes that have not checked in. Don't we do this operation earlier in master setup? Are we doing it again here?
- Though distributed log splitting is configured, with this patch we will do single-process splitting in the master under some conditions. It's not explained in the code why we would do this. Why do we think master log splitting is "high priority" when it could very well be slower? Should we only go this route if distributed splitting is not going on? Do we know whether concurrent distributed log splitting and master splitting work together?
- Why would we have dead servers in progress here in master startup? Because a ServerShutdownHandler fired?

This patch is different from the patch for 0.90. It should go into trunk first with tests, then 0.92. Should it be in this issue? This issue is really hard to follow now. Maybe this issue is for 0.90.x, with a new issue for more work on this trunk patch? This patch needs to have the v18 differences applied.
[jira] [Commented] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
[ https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214475#comment-13214475 ]

chunhui shen commented on HBASE-5270:
-------------------------------------

@Ted
I have submitted patch v5.
bq. So a server could be in both deadNotExpiredServers and deadservers? I don't see a return statement in the if block.
Sorry, I made a mistake: the return statement was missing from the if block. Also, we check that we're not in safe mode in expireDelayedServers(), and the master is now in safe mode only while it is initializing.

Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
------------------------------------------------------------------------------------------------------
        Key: HBASE-5270
        URL: https://issues.apache.org/jira/browse/HBASE-5270
    Project: HBase
 Issue Type: Sub-task
 Components: master
   Reporter: Zhihong Yu
   Assignee: chunhui shen
    Fix For: 0.92.1, 0.94.0
[jira] [Commented] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
[ https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214476#comment-13214476 ]

chunhui shen commented on HBASE-5270:
-------------------------------------

I can't add a review request; it throws this error:
The file 'https://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java' (r1292711) could not be found in the repository
Why?

Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
------------------------------------------------------------------------------------------------------
        Key: HBASE-5270
        URL: https://issues.apache.org/jira/browse/HBASE-5270
    Project: HBase
 Issue Type: Sub-task
 Components: master
   Reporter: Zhihong Yu
   Assignee: chunhui shen
    Fix For: 0.92.1, 0.94.0
[jira] [Created] (HBASE-5462) [monitor] Ganglia metric hbase.master.cluster_requests should exclude the scan meta request generated by master, or create a new metric which could show the real request
[monitor] Ganglia metric hbase.master.cluster_requests should exclude the scan meta request generated by master, or create a new metric which could show the real request from client
--------------------------------------------------------------------------------------------------
        Key: HBASE-5462
        URL: https://issues.apache.org/jira/browse/HBASE-5462
    Project: HBase
 Issue Type: Bug
 Components: monitoring
Environment: hbase 0.90.5
   Reporter: johnyang

We have a big table which has 30K regions, but the request rate is not very high (about 50K per day). We use the hbase.master.cluster_requests metric to monitor cluster requests, but we find that many of the requests are generated by the master itself, which scans the meta table at regular intervals. This makes it hard for us to monitor the real requests from clients. Would it be possible to filter out the meta-table scans, or to create a new metric that shows only the real requests from clients? Thank you.
[jira] [Commented] (HBASE-4491) HBase Locality Checker
[ https://issues.apache.org/jira/browse/HBASE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214617#comment-13214617 ]

Anoop Sam John commented on HBASE-4491:
---------------------------------------

@Liyin: Looks very useful. Any update on this new feature?

HBase Locality Checker
----------------------
        Key: HBASE-4491
        URL: https://issues.apache.org/jira/browse/HBASE-4491
    Project: HBase
 Issue Type: New Feature
   Reporter: Liyin Tang
   Assignee: Liyin Tang

If we run a data node and a region server on the same physical machine, the region server will benefit if the store files for its serving regions have a local replica in the data node process. So for each region there exists a best-locality region server, which has the most local blocks for that region. The HBase Locality Checker will show how many regions are running on their best-locality region server. The higher the number, the more performance benefit HBase can get from data locality. There would also be a follow-up task to use this region locality information for region assignment: the assignment manager will prefer to assign regions to their best-locality region server.
[jira] [Commented] (HBASE-5455) Add test to avoid unintentional reordering of items in HbaseObjectWritable
[ https://issues.apache.org/jira/browse/HBASE-5455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214678#comment-13214678 ]

Michael Drzal commented on HBASE-5455:
--------------------------------------

Lars, sure, let me throw something together and I'll send it your way.

Add test to avoid unintentional reordering of items in HbaseObjectWritable
--------------------------------------------------------------------------
        Key: HBASE-5455
        URL: https://issues.apache.org/jira/browse/HBASE-5455
    Project: HBase
 Issue Type: Test
   Reporter: Michael Drzal
   Priority: Minor
    Fix For: 0.94.0

HbaseObjectWritable has a static initialization block that assigns ints to various classes. The int is assigned using a local variable that is incremented after each use. If someone adds a line in the middle of the block, this throws off everything after the change and can break client compatibility. There is already a comment saying not to add/remove lines at the beginning of this block. It might make sense to have a test against a static set of ids. If something gets changed unintentionally, it would at least fail the tests. If the change was intentional, at the very least the test would need to get updated, and it would be a conscious decision. https://issues.apache.org/jira/browse/HBASE-5204 contains the fix for one issue of this type.
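A sketch of what such a golden-ids test could look like. Note that the accessor getClassCode(...) is hypothetical -- HbaseObjectWritable keeps its CLASS_TO_CODE map private, so the real patch would need to expose the mapping in some equivalent way -- and the numeric ids below are illustrative, not the actual codes:

{code:java}
import static org.junit.Assert.assertEquals;

import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.hbase.io.HbaseObjectWritable;
import org.junit.Test;

public class TestHbaseObjectWritableCodes {
  // Golden class->code pairs, recorded once and never reordered.
  // The values here are ILLUSTRATIVE only.
  private static final Map<String, Integer> EXPECTED = new HashMap<String, Integer>();
  static {
    EXPECTED.put("org.apache.hadoop.hbase.client.Put", 35);
    EXPECTED.put("org.apache.hadoop.hbase.client.Get", 32);
  }

  @Test
  public void testCodesAreFrozen() throws Exception {
    for (Map.Entry<String, Integer> e : EXPECTED.entrySet()) {
      // HYPOTHETICAL accessor; fails loudly if the static init block was
      // reordered and a class silently picked up a new code.
      int actual = HbaseObjectWritable.getClassCode(Class.forName(e.getKey()));
      assertEquals(e.getKey(), e.getValue().intValue(), actual);
    }
  }
}
{code}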
[jira] [Commented] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
[ https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214793#comment-13214793 ]

Zhihong Yu commented on HBASE-5270:
-----------------------------------

I was able to create a new review request:
- Select hbase for Repository.
- Enter '/' for Base Directory.
- Leave the Bugs field blank.
- Enter hbase in the Groups field.

Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
------------------------------------------------------------------------------------------------------
        Key: HBASE-5270
        URL: https://issues.apache.org/jira/browse/HBASE-5270
    Project: HBase
 Issue Type: Sub-task
 Components: master
   Reporter: Zhihong Yu
   Assignee: chunhui shen
    Fix For: 0.92.1, 0.94.0
[jira] [Commented] (HBASE-5461) Set hbase.hstore.compaction.min.size way down to 4MB
[ https://issues.apache.org/jira/browse/HBASE-5461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214838#comment-13214838 ]

Nicolas Spiegelberg commented on HBASE-5461:
--------------------------------------------

bq. I wonder if a good default would be some fraction of the flushsize. Maybe 1/4*flushsize, or something.
Note that we raised the min size so that we'd aggressively compact the .META. table, which is normally pretty small.

Set hbase.hstore.compaction.min.size way down to 4MB
----------------------------------------------------
        Key: HBASE-5461
        URL: https://issues.apache.org/jira/browse/HBASE-5461
    Project: HBase
 Issue Type: Task
 Affects Versions: 0.92.1
   Reporter: stack
   Priority: Critical

See the discussion over in HBASE-3149. Nicolas suggests setting this setting way down, to the below:
{code}
<property>
  <name>hbase.hstore.compaction.min.size</name>
  <value>4194304</value>
  <description>
    The minimum compaction size. All files below this size are always
    included into a compaction, even if outside compaction ratio times
    the total size of all files added to compaction so far.
  </description>
</property>
{code}
Let's try it.
[jira] [Commented] (HBASE-5437) HRegionThriftServer does not start because of a bug in HbaseHandlerMetricsProxy
[ https://issues.apache.org/jira/browse/HBASE-5437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214857#comment-13214857 ]

stack commented on HBASE-5437:
------------------------------

+1

HRegionThriftServer does not start because of a bug in HbaseHandlerMetricsProxy
--------------------------------------------------------------------------------
        Key: HBASE-5437
        URL: https://issues.apache.org/jira/browse/HBASE-5437
    Project: HBase
 Issue Type: Bug
 Components: metrics, thrift
   Reporter: Scott Chen
   Assignee: Scott Chen
    Fix For: 0.94.0
[jira] [Commented] (HBASE-5423) Regionserver may block forever on waitOnAllRegionsToClose when aborting
[ https://issues.apache.org/jira/browse/HBASE-5423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214865#comment-13214865 ]

Zhihong Yu commented on HBASE-5423:
-----------------------------------

{code}
+    if (this.regionsInTransitionInRS.isEmpty()) {
+      if (!isOnlineRegionsEmpty()) {
+        LOG.info("We were exiting though online regions are not empty, because some regions failed closing");
{code}
I think regionsInTransitionOnRS is a better name for the new Map. The log line above exceeds 80 chars.

Regionserver may block forever on waitOnAllRegionsToClose when aborting
------------------------------------------------------------------------
        Key: HBASE-5423
        URL: https://issues.apache.org/jira/browse/HBASE-5423
    Project: HBase
 Issue Type: Bug
 Components: regionserver
   Reporter: chunhui shen
   Assignee: chunhui shen
    Fix For: 0.94.0
Attachments: hbase-5423.patch, hbase-5423v2.patch

If closeRegion throws any exception (e.g. one caused by the FS) while the RS is aborting, the RS will block forever on waitOnAllRegionsToClose().
[jira] [Updated] (HBASE-4403) Adopt interface stability/audience classifications from Hadoop
[ https://issues.apache.org/jira/browse/HBASE-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jimmy Xiang updated HBASE-4403:
-------------------------------

Status: Patch Available (was: Open)

Adopt interface stability/audience classifications from Hadoop
--------------------------------------------------------------
        Key: HBASE-4403
        URL: https://issues.apache.org/jira/browse/HBASE-4403
    Project: HBase
 Issue Type: Task
 Affects Versions: 0.92.0, 0.90.5
   Reporter: Todd Lipcon
   Assignee: Jimmy Xiang
    Fix For: 0.94.0
Attachments: hbase-4403-interface.txt, hbase-4403-interface_v2.txt, hbase-4403-interface_v3.txt, hbase-4403-nowhere-near-done.txt, hbase-4403.patch, hbase-4403.patch

As HBase gets more widely used, we need to be more explicit about which APIs are stable and not expected to break between versions, which APIs are still evolving, etc. We also have many public classes that are really internal to the RS or Master and not meant to be used by users. Hadoop has adopted a classification scheme for audience (public, private, or limited-private) as well as stability (stable, evolving, unstable). I think we should copy these annotations to HBase and start to classify our public classes.
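For reference, this is roughly what the Hadoop-style classifications look like when applied. The class below is invented for the sketch, and which annotation package HBase would ultimately use (Hadoop's org.apache.hadoop.classification or an HBase-local copy) is an assumption here:

{code:java}
import org.apache.hadoop.classification.InterfaceAudience;
import org.apache.hadoop.classification.InterfaceStability;

// Public + Evolving: clients may use this class, but the API can still
// change between minor releases.
@InterfaceAudience.Public
@InterfaceStability.Evolving
public class ExampleClientFacingClass {
  // Classes internal to the RS or Master would instead carry
  // @InterfaceAudience.Private (or LimitedPrivate for named consumers),
  // signalling that users should not depend on them.
}
{code}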
[jira] [Commented] (HBASE-5442) Use builder pattern in StoreFile and HFile
[ https://issues.apache.org/jira/browse/HBASE-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214868#comment-13214868 ]

stack commented on HBASE-5442:
------------------------------

@Mikhail That's the usual set of three that fail on hadoopqa, FYI.

Use builder pattern in StoreFile and HFile
------------------------------------------
        Key: HBASE-5442
        URL: https://issues.apache.org/jira/browse/HBASE-5442
    Project: HBase
 Issue Type: Improvement
   Reporter: Mikhail Bautin
   Assignee: Mikhail Bautin
Attachments: D1893.1.patch, D1893.2.patch, HFile-StoreFile-builder-2012-02-22_22_49_00.patch

We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a builder pattern solution, e.g.
{code:java}
HFileWriter w = HFile.getWriterBuilder(conf, <some common args>)
    .setParameter1(value1)
    .setParameter2(value2)
    ...
    .build();
{code}
Each parameter setter being on its own line will make merges/cherry-picks work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses StoreFile and HFile refactoring. For HColumnDescriptor refactoring see HBASE-5357.
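A self-contained illustration of the pattern being proposed; the class and parameter names are invented for the sketch and do not correspond to the eventual HFile/StoreFile builder API:

{code:java}
public class StoreWriterBuilder {
  private int blockSize = 65536;       // defaults live in exactly one place
  private String compression = "none";

  public StoreWriterBuilder withBlockSize(int blockSize) {
    this.blockSize = blockSize;
    return this;                       // each setter on its own line merges cleanly
  }

  public StoreWriterBuilder withCompression(String compression) {
    this.compression = compression;
    return this;
  }

  public String build() {
    return "writer(blockSize=" + blockSize + ", compression=" + compression + ")";
  }

  public static void main(String[] args) {
    String w = new StoreWriterBuilder()
        .withBlockSize(131072)
        .withCompression("lzo")
        .build();
    System.out.println(w);
  }
}
{code}

Because callers only name the parameters they override, adding a new optional parameter never breaks existing call sites, which is exactly the cross-branch porting pain the issue describes.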
[jira] [Commented] (HBASE-5454) Refuse operations from Admin before master is initialized
[ https://issues.apache.org/jira/browse/HBASE-5454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214870#comment-13214870 ]

stack commented on HBASE-5454:
------------------------------

So, you want to mash this patch into HBASE-5270? If so, close this one as Won't Fix?

Refuse operations from Admin before master is initialized
---------------------------------------------------------
        Key: HBASE-5454
        URL: https://issues.apache.org/jira/browse/HBASE-5454
    Project: HBase
 Issue Type: Improvement
   Reporter: chunhui shen
Attachments: hbase-5454.patch

In our testing environment, when the master was initializing we found conflicts between master#assignAllUserRegions and the EnableTable event, causing region assignment to throw an exception so that the master aborted itself. We think we had better refuse operations from Admin, such as CreateTable, EnableTable, etc., while initializing; it would reduce errors.
[jira] [Updated] (HBASE-5166) MultiThreaded Table Mapper analogous to MultiThreaded Mapper in hadoop
[ https://issues.apache.org/jira/browse/HBASE-5166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-5166:
-------------------------

Attachment: 5166-v9.txt

Same as 0008. Uploading again to rerun hadoopqa. It shouldn't be failing that many tests with this patch.

MultiThreaded Table Mapper analogous to MultiThreaded Mapper in hadoop
-----------------------------------------------------------------------
        Key: HBASE-5166
        URL: https://issues.apache.org/jira/browse/HBASE-5166
    Project: HBase
 Issue Type: Improvement
   Reporter: Jai Kumar Singh
   Priority: Minor
     Labels: multithreaded, tablemapper
Attachments: 0001-Added-MultithreadedTableMapper-HBASE-5166.patch, 0003-Added-MultithreadedTableMapper-HBASE-5166.patch, 0005-HBASE-5166-Added-MultithreadedTableMapper.patch, 0006-HBASE-5166-Added-MultithreadedTableMapper.patch, 0008-HBASE-5166-Added-MultithreadedTableMapper.patch, 5166-v9.txt
Original Estimate: 0.5h
Remaining Estimate: 0.5h

There is currently no MultiThreadedTableMapper in HBase, analogous to the MultiThreadedMapper Hadoop provides for IO-bound jobs. Use case, a web crawler: take input (URLs) from an HBase table and put the content (URL, content) back into HBase. Running this kind of HBase MapReduce job with the normal table mapper is quite slow, as we are not utilizing the CPU fully (the job is network-IO bound). Moreover, I would like to know whether it is a good or bad idea to use HBase for this kind of use case.
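For context, this is how the plain-Hadoop analogue (MultithreadedMapper) is wired up; the patch adds an equivalent for TableMapper. The job name and the trivial CrawlMapper below are placeholders, assuming the standard org.apache.hadoop.mapreduce API:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper;

public class MultithreadedJobSketch {
  public static class CrawlMapper
      extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void map(LongWritable key, Text value, Context ctx) {
      // Network-bound work (e.g. fetching a URL) would go here.
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = new Job(new Configuration(), "io-bound-crawl");
    // Run CrawlMapper instances on a thread pool inside each map task, so
    // the task keeps the CPU busy while individual fetches block on IO.
    job.setMapperClass(MultithreadedMapper.class);
    MultithreadedMapper.setMapperClass(job, CrawlMapper.class);
    MultithreadedMapper.setNumberOfThreads(job, 10);
    // ... input/output formats and job submission omitted.
  }
}
{code}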
[jira] [Updated] (HBASE-5166) MultiThreaded Table Mapper analogous to MultiThreaded Mapper in hadoop
[ https://issues.apache.org/jira/browse/HBASE-5166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-5166:
-------------------------

Status: Open (was: Patch Available)

MultiThreaded Table Mapper analogous to MultiThreaded Mapper in hadoop
-----------------------------------------------------------------------
        Key: HBASE-5166
        URL: https://issues.apache.org/jira/browse/HBASE-5166
    Project: HBase
 Issue Type: Improvement
   Reporter: Jai Kumar Singh
   Priority: Minor
[jira] [Updated] (HBASE-5166) MultiThreaded Table Mapper analogous to MultiThreaded Mapper in hadoop
[ https://issues.apache.org/jira/browse/HBASE-5166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-5166:
-------------------------

Status: Patch Available (was: Open)

MultiThreaded Table Mapper analogous to MultiThreaded Mapper in hadoop
-----------------------------------------------------------------------
        Key: HBASE-5166
        URL: https://issues.apache.org/jira/browse/HBASE-5166
    Project: HBase
 Issue Type: Improvement
   Reporter: Jai Kumar Singh
   Priority: Minor
[jira] [Commented] (HBASE-5166) MultiThreaded Table Mapper analogous to MultiThreaded Mapper in hadoop
[ https://issues.apache.org/jira/browse/HBASE-5166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214912#comment-13214912 ]

Hadoop QA commented on HBASE-5166:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12515764/5166-v9.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
-1 javadoc. The javadoc tool appears to have generated -134 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 153 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestImportTsv

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1024//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1024//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1024//console

This message is automatically generated.

MultiThreaded Table Mapper analogous to MultiThreaded Mapper in hadoop
-----------------------------------------------------------------------
        Key: HBASE-5166
        URL: https://issues.apache.org/jira/browse/HBASE-5166
    Project: HBase
 Issue Type: Improvement
   Reporter: Jai Kumar Singh
   Priority: Minor
[jira] [Commented] (HBASE-5457) add inline index in data block for data which are not clustered together
[ https://issues.apache.org/jira/browse/HBASE-5457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214915#comment-13214915 ]

He Yongqiang commented on HBASE-5457:
-------------------------------------

@stack, we haven't thought about that in much detail, but we can start the discussion with an example. Let's say there is one column family, and it only contains one type of column whose name is a combination of a string and a ts. So the data is sorted by the string first. But one query wants the data sorted by ts instead.

add inline index in data block for data which are not clustered together
-------------------------------------------------------------------------
        Key: HBASE-5457
        URL: https://issues.apache.org/jira/browse/HBASE-5457
    Project: HBase
 Issue Type: New Feature
   Reporter: He Yongqiang

Going through our data schema, we found we have one large column family that just duplicates data from another column family; it is a re-org of the data that clusters it in a different way than the original column family, in order to serve another type of query efficiently. Comparing this second column family with the similar situation in MySQL, it is like an index in MySQL. So if we could add an inline block index on the required columns, the second column family would no longer be needed.
[jira] [Commented] (HBASE-5457) add inline index in data block for data which are not clustered together
[ https://issues.apache.org/jira/browse/HBASE-5457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214922#comment-13214922 ]

Lars Hofhansl commented on HBASE-5457:
--------------------------------------

@He. So you find the row and then you search inside the row with a ColumnRange or ColumnPrefix filter?

add inline index in data block for data which are not clustered together
-------------------------------------------------------------------------
        Key: HBASE-5457
        URL: https://issues.apache.org/jira/browse/HBASE-5457
    Project: HBase
 Issue Type: New Feature
   Reporter: He Yongqiang
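For context, a minimal sketch of the within-row search Lars is describing, using the existing ColumnRangeFilter; the table name, family, and qualifier bounds are illustrative, assuming column names of the form "string:ts":

{code:java}
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.filter.ColumnRangeFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class ColumnRangeExample {
  public static void main(String[] args) throws Exception {
    HTable table = new HTable(HBaseConfiguration.create(), "events");
    Get get = new Get(Bytes.toBytes("user123"));
    // Only return qualifiers in ["click:1330000000", "click:1330086400"):
    // the ts part of the combined column name selects a time window, but
    // only within columns that share the same string prefix.
    get.setFilter(new ColumnRangeFilter(
        Bytes.toBytes("click:1330000000"), true,
        Bytes.toBytes("click:1330086400"), false));
    Result r = table.get(get);
    System.out.println("matching columns: " + r.size());
    table.close();
  }
}
{code}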
[jira] [Commented] (HBASE-5349) Automagically tweak global memstore and block cache sizes based on workload
[ https://issues.apache.org/jira/browse/HBASE-5349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214944#comment-13214944 ]

Jean-Daniel Cryans commented on HBASE-5349:
-------------------------------------------

@Mubarak We already have that through HeapSize; it's really just a matter of knowing what to auto-tune and when.

Automagically tweak global memstore and block cache sizes based on workload
---------------------------------------------------------------------------
        Key: HBASE-5349
        URL: https://issues.apache.org/jira/browse/HBASE-5349
    Project: HBase
 Issue Type: Improvement
 Affects Versions: 0.92.0
   Reporter: Jean-Daniel Cryans
    Fix For: 0.94.0

Hypertable does a neat thing where it changes the size given to the CellCache (our MemStores) and the block cache based on the workload. If you need an image, scroll to the bottom of this link: http://www.hypertable.com/documentation/architecture/ That'd be one less thing to configure.
[jira] [Commented] (HBASE-5434) [REST] Include more metrics in cluster status request
[ https://issues.apache.org/jira/browse/HBASE-5434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214945#comment-13214945 ]

Mubarak Seyed commented on HBASE-5434:
--------------------------------------

@Stack When I tested with -P runMediumTests it went well, but the code is annotated as smallTests; will update the patch soon. Sorry for the inconvenience. Thanks.

[REST] Include more metrics in cluster status request
-----------------------------------------------------
        Key: HBASE-5434
        URL: https://issues.apache.org/jira/browse/HBASE-5434
    Project: HBase
 Issue Type: Improvement
 Components: metrics, rest
 Affects Versions: 0.94.0
   Reporter: Mubarak Seyed
   Assignee: Mubarak Seyed
   Priority: Minor
     Labels: noob
    Fix For: 0.94.0
Attachments: HBASE-5434.trunk.v1.patch

/status/cluster shows only
{code}
stores=2
storefiless=0
storefileSizeMB=0
memstoreSizeMB=0
storefileIndexSizeMB=0
{code}
for a region, but the master web-ui shows
{code}
stores=1,
storefiles=0,
storefileUncompressedSizeMB=0
storefileSizeMB=0
memstoreSizeMB=0
storefileIndexSizeMB=0
readRequestsCount=0
writeRequestsCount=0
rootIndexSizeKB=0
totalStaticIndexSizeKB=0
totalStaticBloomSizeKB=0
totalCompactingKVs=0
currentCompactedKVs=0
compactionProgressPct=NaN
{code}
In a write-heavy, REST-gateway-based production environment, the ops team needs to verify that write counters are getting incremented per region (they run /status/cluster on each REST server). We can get the same values from *rpc.metrics.put_num_ops* and *hbase.regionserver.writeRequestsCount*, but some home-grown tools need to parse the output of /status/cluster and update the dashboard.
[jira] [Created] (HBASE-5463) Why is my upload to mvn spread across multiple repositories?
Why is my upload to mvn spread across multiple repositories?
------------------------------------------------------------
        Key: HBASE-5463
        URL: https://issues.apache.org/jira/browse/HBASE-5463
    Project: HBase
 Issue Type: Task
   Reporter: stack

I've been struggling publishing a release to repository.apache.org. It's worked for me in the past. If you look at https://repository.apache.org/index.html#stagingRepositories (you need to be logged in), you will see that I somehow made twelve repositories when I did my mvn release:perform, each artifact element in its own repo. Any idea how that happens? (I'll attach a png that shows similar.) How do I prevent it?

I have another issue where the upload to apache fails with a 400 Bad Request very frequently when uploading one of my artifact items -- usually maven-metadata.xml -- but then, just now, it went through fine. Pointers appreciated on this little nugget too.

I'm using mvn 3.0.4 and 2.2.2 of the maven-release plugin. Otherwise, my settings.xml is one that has worked for me in the past:
{code}
<settings>
  <servers>
    <!-- To publish a snapshot of some part of Maven -->
    <server>
      <id>apache.snapshots.https</id>
      <username>stack</username>
      <password> </password>
    </server>
    <!-- To publish a website using Maven -->
    <!-- To stage a release of some part of Maven -->
    <server>
      <id>apache.releases.https</id>
      <username>stack</username>
      <password> </password>
    </server>
  </servers>
  <profiles>
    <profile>
      <id>apache-release</id>
      <properties>
        <gpg.keyname>00A5F21E</gpg.keyname>
        <gpg.passphrase> </gpg.passphrase>
      </properties>
    </profile>
  </profiles>
</settings>
{code}
My pom is here: http://svn.apache.org/viewvc/hbase/tags/0.92.0mvn/pom.xml?view=markup

Thanks for any pointers.
[jira] [Updated] (HBASE-5357) Use builder pattern in HColumnDescriptor
[ https://issues.apache.org/jira/browse/HBASE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HBASE-5357:
-------------------------------

Attachment: D1851.3.patch

mbautin updated the revision "[jira] [HBASE-5357] Refactoring: use the builder pattern for HColumnDescriptor".
Reviewers: JIRA, todd, stack, tedyu, Kannan, Karthik, Liyin

Rebasing on trunk changes.

REVISION DETAIL
  https://reviews.facebook.net/D1851

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
  src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java
  src/main/java/org/apache/hadoop/hbase/client/UnmodifyableHColumnDescriptor.java
  src/main/java/org/apache/hadoop/hbase/thrift/ThriftUtilities.java
  src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
  src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
  src/test/java/org/apache/hadoop/hbase/TestSerialization.java
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
  src/test/java/org/apache/hadoop/hbase/io/encoding/TestEncodedSeekers.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestForceCacheImportantBlocks.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestScannerSelectionUsingTTL.java
  src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java
  src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestBlocksRead.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestColumnSeeking.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiColumnScanner.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestScanner.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestSeekOptimizations.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestWideScanner.java
  src/test/java/org/apache/hadoop/hbase/thrift2/TestThriftHBaseServiceHandler.java

Use builder pattern in HColumnDescriptor
----------------------------------------
        Key: HBASE-5357
        URL: https://issues.apache.org/jira/browse/HBASE-5357
    Project: HBase
 Issue Type: Improvement
   Reporter: Mikhail Bautin
   Assignee: Mikhail Bautin
Attachments: D1851.1.patch, D1851.2.patch, D1851.3.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-21_19_13_35.patch

We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a builder pattern solution, e.g.
{code:java}
HFileWriter w = HFile.getWriterBuilder(conf, <some common args>)
    .setParameter1(value1)
    .setParameter2(value2)
    ...
    .build();
{code}
Each parameter setter being on its own line will make merges/cherry-picks work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses the HColumnDescriptor refactoring. For StoreFile/HFile refactoring see HBASE-5442.
[jira] [Resolved] (HBASE-5463) Why is my upload to mvn spread across multiple repositories?
[ https://issues.apache.org/jira/browse/HBASE-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-5463.
--------------------------

Resolution: Won't Fix

Resolving. Meant to file this against infra: https://issues.apache.org/jira/browse/INFRA-4482

Why is my upload to mvn spread across multiple repositories?
------------------------------------------------------------
        Key: HBASE-5463
        URL: https://issues.apache.org/jira/browse/HBASE-5463
    Project: HBase
 Issue Type: Task
   Reporter: stack
[jira] [Commented] (HBASE-5357) Use builder pattern in HColumnDescriptor
[ https://issues.apache.org/jira/browse/HBASE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214964#comment-13214964 ]

Hadoop QA commented on HBASE-5357:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12515781/D1851.3.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 60 new or modified tests.
-1 javadoc. The javadoc tool appears to have generated -136 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 152 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
  org.apache.hadoop.hbase.regionserver.TestWideScanner

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1025//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1025//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1025//console

This message is automatically generated.

Use builder pattern in HColumnDescriptor
----------------------------------------
        Key: HBASE-5357
        URL: https://issues.apache.org/jira/browse/HBASE-5357
    Project: HBase
 Issue Type: Improvement
   Reporter: Mikhail Bautin
   Assignee: Mikhail Bautin
Attachments: D1851.1.patch, D1851.2.patch, D1851.3.patch, D1851.4.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-21_19_13_35.patch
[jira] [Updated] (HBASE-5357) Use builder pattern in HColumnDescriptor
[ https://issues.apache.org/jira/browse/HBASE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HBASE-5357:
-------------------------------

Attachment: D1851.4.patch

mbautin updated the revision "[jira] [HBASE-5357] Refactoring: use the builder pattern for HColumnDescriptor".
Reviewers: JIRA, todd, stack, tedyu, Kannan, Karthik, Liyin

Fix TestWideScanner failure.

REVISION DETAIL
  https://reviews.facebook.net/D1851

AFFECTED FILES
  Same file list as the D1851.3 update above.

Use builder pattern in HColumnDescriptor
----------------------------------------
        Key: HBASE-5357
        URL: https://issues.apache.org/jira/browse/HBASE-5357
    Project: HBase
 Issue Type: Improvement
   Reporter: Mikhail Bautin
   Assignee: Mikhail Bautin
Attachments: D1851.1.patch, D1851.2.patch, D1851.3.patch, D1851.4.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-21_19_13_35.patch
[jira] [Updated] (HBASE-5357) Use builder pattern in HColumnDescriptor
[ https://issues.apache.org/jira/browse/HBASE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Bautin updated HBASE-5357: -- Attachment: Use-builder-pattern-for-HColumnDescriptor-20120223113155-e387d251.patch Use builder pattern in HColumnDescriptor Key: HBASE-5357 URL: https://issues.apache.org/jira/browse/HBASE-5357 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: D1851.1.patch, D1851.2.patch, D1851.3.patch, D1851.4.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-21_19_13_35.patch, Use-builder-pattern-for-HColumnDescriptor-20120223113155-e387d251.patch We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a builder pattern solution, e.g. {code:java} HFileWriter w = HFile.getWriterBuilder(conf, some common args) .setParameter1(value1) .setParameter2(value2) ... .build(); {code} Each parameter setter being on its own line will make merges/cherry-pick work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses the HColumnDescriptor refactoring. For StoreFile/HFile refactoring see HBASE-5442. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5357) Use builder pattern in HColumnDescriptor
[ https://issues.apache.org/jira/browse/HBASE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13214974#comment-13214974 ] Phabricator commented on HBASE-5357: mbautin has commented on the revision [jira] [HBASE-5357] Refactoring: use the builder pattern for HColumnDescriptor. This should pass all unit tests now (re-running internally as well as on Hadoop QA). Someone please accept this patch if there are no additional comments. Thanks! REVISION DETAIL https://reviews.facebook.net/D1851 Use builder pattern in HColumnDescriptor Key: HBASE-5357 URL: https://issues.apache.org/jira/browse/HBASE-5357 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: D1851.1.patch, D1851.2.patch, D1851.3.patch, D1851.4.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-21_19_13_35.patch, Use-builder-pattern-for-HColumnDescriptor-20120223113155-e387d251.patch We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a builder pattern solution, e.g. {code:java} HFileWriter w = HFile.getWriterBuilder(conf, some common args) .setParameter1(value1) .setParameter2(value2) ... .build(); {code} Each parameter setter being on its own line will make merges/cherry-pick work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses the HColumnDescriptor refactoring. For StoreFile/HFile refactoring see HBASE-5442. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
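For readers skimming the thread: applied to HColumnDescriptor, the builder idea in the description amounts to having each setter return the descriptor itself. A minimal sketch of the resulting fluent usage, assuming the chainable setters this patch introduces (the parameter values are arbitrary examples):
{code:java}
// Sketch only, assuming HBASE-5357's chainable setters.
// Uses org.apache.hadoop.hbase.HColumnDescriptor, HConstants, and util.Bytes.
HColumnDescriptor hcd = new HColumnDescriptor(Bytes.toBytes("cf"))
    .setMaxVersions(3)                  // one parameter per line merges/cherry-picks cleanly
    .setBlocksize(64 * 1024)
    .setInMemory(false)
    .setTimeToLive(HConstants.FOREVER); // unset parameters keep their defaults
{code}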
[jira] [Updated] (HBASE-5464) Log warning message when thrift calls throw exceptions
[ https://issues.apache.org/jira/browse/HBASE-5464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-5464: --- Attachment: HBASE-5464.D1899.1.patch sc requested code review of HBASE-5464 [jira] Log warning message when thrift calls throw exceptions. Reviewers: tedyu, dhruba, JIRA Log warning when thrift calls throw exceptions Task ID: # Blame Rev: Currently there is no logging message when client calls throw exceptions. It will be easier to debug if we log them. TEST PLAN none Revert Plan: Tags: REVISION DETAIL https://reviews.facebook.net/D1899 AFFECTED FILES src/main/java/org/apache/hadoop/hbase/regionserver/HRegionThriftServer.java src/main/java/org/apache/hadoop/hbase/thrift/ThriftServerRunner.java MANAGE HERALD DIFFERENTIAL RULES https://reviews.facebook.net/herald/view/differential/ WHY DID I GET THIS EMAIL? https://reviews.facebook.net/herald/transcript/4047/ Tip: use the X-Herald-Rules header to filter Herald messages in your client. Log warning message when thrift calls throw exceptions -- Key: HBASE-5464 URL: https://issues.apache.org/jira/browse/HBASE-5464 Project: HBase Issue Type: Improvement Components: thrift Reporter: Scott Chen Assignee: Scott Chen Priority: Trivial Attachments: HBASE-5464.D1899.1.patch Currently there is no logging message when client calls throw exceptions. It will be easier to debug if we log them. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-5464) Log warning message when thrift calls throw exceptions
Log warning message when thrift calls throw exceptions -- Key: HBASE-5464 URL: https://issues.apache.org/jira/browse/HBASE-5464 Project: HBase Issue Type: Improvement Components: thrift Reporter: Scott Chen Assignee: Scott Chen Priority: Trivial Attachments: HBASE-5464.D1899.1.patch Currently there is no logging message when client calls throw exceptions. It will be easier to debug if we have them. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
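A hypothetical sketch of the kind of warning logging this issue asks for, wrapping the body of a Thrift handler call (illustrative only, not the attached patch; the actual change lives in HRegionThriftServer and ThriftServerRunner):
{code:java}
// Illustrative only -- not the attached HBASE-5464.D1899.1.patch.
// The idea: catch, log a warning with the failure, and rethrow as the
// Thrift-declared IOError so the client still sees the exception.
try {
  // ... perform the HBase operation backing the Thrift call ...
} catch (IOException e) {
  LOG.warn("Thrift call failed", e);
  throw new IOError(e.getMessage());
}
{code}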
[jira] [Updated] (HBASE-5465) [book] chaning book to reference guide (content only, not filenames)
[ https://issues.apache.org/jira/browse/HBASE-5465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-5465: - Attachment: docbkx_hbase_5465.patch [book] chaning book to reference guide (content only, not filenames) Key: HBASE-5465 URL: https://issues.apache.org/jira/browse/HBASE-5465 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: docbkx_hbase_5465.patch book.xml preface.xml Changing book to reference guide Note: the filenames are still the same. This is only a change to the way the document refers to itself. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5465) [book] chaning book to reference guide (content only, not filenames)
[ https://issues.apache.org/jira/browse/HBASE-5465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-5465: - Status: Patch Available (was: Open) [book] chaning book to reference guide (content only, not filenames) Key: HBASE-5465 URL: https://issues.apache.org/jira/browse/HBASE-5465 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: docbkx_hbase_5465.patch book.xml preface.xml Changing book to reference guide Note: the filenames are still the same. This is only a change to the way the document refers to itself. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5465) [book] changing book to reference guide (content only, not filenames)
[ https://issues.apache.org/jira/browse/HBASE-5465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-5465: - Summary: [book] changing book to reference guide (content only, not filenames) (was: [book] chaning book to reference guide (content only, not filenames)) [book] changing book to reference guide (content only, not filenames) - Key: HBASE-5465 URL: https://issues.apache.org/jira/browse/HBASE-5465 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: docbkx_hbase_5465.patch book.xml preface.xml Changing book to reference guide Note: the filenames are still the same. This is only a change to the way the document refers to itself. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5465) [book] changing book to reference guide (content only, not filenames)
[ https://issues.apache.org/jira/browse/HBASE-5465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-5465: - Resolution: Fixed Status: Resolved (was: Patch Available) [book] changing book to reference guide (content only, not filenames) - Key: HBASE-5465 URL: https://issues.apache.org/jira/browse/HBASE-5465 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: docbkx_hbase_5465.patch book.xml preface.xml Changing book to reference guide Note: the filenames are still the same. This is only a change to the way the document refers to itself. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3134) [replication] Add the ability to enable/disable streams
[ https://issues.apache.org/jira/browse/HBASE-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13214984#comment-13214984 ] Jean-Daniel Cryans commented on HBASE-3134: --- Patch looks good. Was it tested outside of unit tests? [replication] Add the ability to enable/disable streams --- Key: HBASE-3134 URL: https://issues.apache.org/jira/browse/HBASE-3134 Project: HBase Issue Type: New Feature Components: replication Reporter: Jean-Daniel Cryans Assignee: Teruyoshi Zenmyo Priority: Minor Labels: replication Fix For: 0.94.0 Attachments: 3134-v2.txt, 3134-v3.txt, 3134.txt, HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch This jira was initially in the scope of HBASE-2201, but was pushed out since it has low value compared to the required effort (and we want to ship 0.90.0 rather soonish). We need to design a way to enable/disable replication streams in a determinate fashion. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5357) Use builder pattern in HColumnDescriptor
[ https://issues.apache.org/jira/browse/HBASE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13214988#comment-13214988 ] Zhihong Yu commented on HBASE-5357: --- @Mikhail: Can you attach patch generated with --no-prefix ? Use builder pattern in HColumnDescriptor Key: HBASE-5357 URL: https://issues.apache.org/jira/browse/HBASE-5357 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: D1851.1.patch, D1851.2.patch, D1851.3.patch, D1851.4.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-21_19_13_35.patch, Use-builder-pattern-for-HColumnDescriptor-20120223113155-e387d251.patch We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a builder pattern solution, e.g. {code:java} HFileWriter w = HFile.getWriterBuilder(conf, some common args) .setParameter1(value1) .setParameter2(value2) ... .build(); {code} Each parameter setter being on its own line will make merges/cherry-pick work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses the HColumnDescriptor refactoring. For StoreFile/HFile refactoring see HBASE-5442. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
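For context, --no-prefix controls how the diff paths are emitted; assuming the patch comes from a git checkout, the request amounts to regenerating it along the lines of:
{code}
# Regenerate the diff without the a/ and b/ path prefixes (output file name illustrative):
git diff --no-prefix > HBASE-5357.patch
{code}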
[jira] [Commented] (HBASE-5357) Use builder pattern in HColumnDescriptor
[ https://issues.apache.org/jira/browse/HBASE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13214996#comment-13214996 ] Phabricator commented on HBASE-5357: Kannan has commented on the revision [jira] [HBASE-5357] Refactoring: use the builder pattern for HColumnDescriptor. nice change-- the unit tests are a lot more readable now! One inlined comment... INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:438 changing the return type can break some existing apps out there, no? REVISION DETAIL https://reviews.facebook.net/D1851 Use builder pattern in HColumnDescriptor Key: HBASE-5357 URL: https://issues.apache.org/jira/browse/HBASE-5357 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: D1851.1.patch, D1851.2.patch, D1851.3.patch, D1851.4.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-21_19_13_35.patch, Use-builder-pattern-for-HColumnDescriptor-20120223113155-e387d251.patch We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a builder pattern solution, e.g. {code:java} HFileWriter w = HFile.getWriterBuilder(conf, some common args) .setParameter1(value1) .setParameter2(value2) ... .build(); {code} Each parameter setter being on its own line will make merges/cherry-pick work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses the HColumnDescriptor refactoring. For StoreFile/HFile refactoring see HBASE-5442. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5317) Fix TestHFileOutputFormat to work against hadoop 0.23
[ https://issues.apache.org/jira/browse/HBASE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13214998#comment-13214998 ] Gregory Chanan commented on HBASE-5317: --- Okay, this looks like an mvn issue. On Linux, if JAVA_HOME is not set, mvn sets JAVA_HOME to be whatever shows up in mvn --version. On Mac, mvn appears not to set JAVA_HOME if it is not already set. So if you set JAVA_HOME to /System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home the test should pass. The new version of mapreduce requires JAVA_HOME to be set and tries to execute $JAVA_HOME + bin/java -- they should probably report an error if JAVA_HOME is not set. Let me know if you want me to do anything else wrt this issue. Fix TestHFileOutputFormat to work against hadoop 0.23 - Key: HBASE-5317 URL: https://issues.apache.org/jira/browse/HBASE-5317 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.92.0, 0.94.0 Reporter: Gregory Chanan Assignee: Gregory Chanan Attachments: HBASE-5317-v0.patch, HBASE-5317-v1.patch, HBASE-5317-v3.patch, HBASE-5317-v4.patch, HBASE-5317-v5.patch, HBASE-5317-v6.patch, TEST-org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat.xml Running mvn -Dhadoop.profile=23 test -P localTests -Dtest=org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat yields this on 0.92: Failed tests: testColumnFamilyCompression(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): HFile for column family info-A not found Tests in error: test_TIMERANGE(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): /home/gchanan/workspace/apache92/target/test-data/276cbd0c-c771-4f81-9ba8-c464c9dd7486/test_TIMERANGE_present/_temporary/0/_temporary/_attempt_200707121733_0001_m_00_0 (Is a directory) testMRIncrementalLoad(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): TestTable testMRIncrementalLoadWithSplit(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): TestTable It looks like on trunk, this also results in an error: testExcludeMinorCompaction(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): TestTable I have a patch that fixes testColumnFamilyCompression and test_TIMERANGE, but haven't fixed the other 3 yet. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
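Putting the two observations together, the suggested workaround on a Mac is to export JAVA_HOME explicitly before running the test (path as given in the comment above; adjust for other JDK installs):
{code}
export JAVA_HOME=/System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home
mvn -Dhadoop.profile=23 test -P localTests -Dtest=org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
{code}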
[jira] [Commented] (HBASE-5457) add inline index in data block for data which are not clustered together
[ https://issues.apache.org/jira/browse/HBASE-5457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215002#comment-13215002 ] He Yongqiang commented on HBASE-5457: - @lars, in today's implementation we actually create another column family and reorganize the column name to be 'ts and string', so the data is sorted by ts in this new column family, and we redirect the query to use the second column family. But this approach duplicates data. Without the second column family, we could do a search once we find the row, but that requires scanning all data under the target row key, which hurts CPU. add inline index in data block for data which are not clustered together Key: HBASE-5457 URL: https://issues.apache.org/jira/browse/HBASE-5457 Project: HBase Issue Type: New Feature Reporter: He Yongqiang As we go through our data schema, we found we have one large column family that just duplicates data from another column family, re-organizing the data to cluster it in a different way than the original column family in order to serve another type of query efficiently. If we compare this second column family with the similar situation in MySQL, it is like an index in MySQL. So if we could add an inline block index on the required columns, the second column family would not be needed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
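To make the trade-off concrete, here is a hedged sketch of the duplicate-column-family workaround described in the comment: every cell is written a second time into an index-like family whose qualifier leads with the timestamp, so cells there sort by ts (family and variable names are hypothetical; rowKey, columnName, ts, value and table are assumed in scope):
{code:java}
// Hypothetical sketch of the workaround this issue wants to make unnecessary.
byte[] dataFamily = Bytes.toBytes("data");        // hypothetical family names
byte[] indexFamily = Bytes.toBytes("data_by_ts");

Put put = new Put(rowKey);
// primary copy, keyed by column name
put.add(dataFamily, Bytes.toBytes(columnName), ts, value);
// duplicate copy, keyed by ts + column name so it clusters by timestamp
put.add(indexFamily, Bytes.add(Bytes.toBytes(ts), Bytes.toBytes(columnName)), value);
table.put(put);  // the data is stored twice -- the cost an inline index would avoid
{code}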
[jira] [Commented] (HBASE-5317) Fix TestHFileOutputFormat to work against hadoop 0.23
[ https://issues.apache.org/jira/browse/HBASE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215017#comment-13215017 ] Zhihong Yu commented on HBASE-5317: --- {code} echo ${JAVA_HOME} /System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home mvn -Dhadoop.profile=23 clean test -P localTests -Dtest=org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat {code} Same test failures, e.g.: {code} testMRIncrementalLoadWithSplit(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat) Time elapsed: 60.857 sec FAILURE! java.lang.AssertionError at org.junit.Assert.fail(Assert.java:92) at org.junit.Assert.assertTrue(Assert.java:43) at org.junit.Assert.assertTrue(Assert.java:54) at org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat.runIncrementalPELoad(TestHFileOutputFormat.java:478) at org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat.doIncrementalLoadTest(TestHFileOutputFormat.java:390) at org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat.testMRIncrementalLoadWithSplit(TestHFileOutputFormat.java:369) {code} Fix TestHFileOutputFormat to work against hadoop 0.23 - Key: HBASE-5317 URL: https://issues.apache.org/jira/browse/HBASE-5317 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.92.0, 0.94.0 Reporter: Gregory Chanan Assignee: Gregory Chanan Attachments: HBASE-5317-v0.patch, HBASE-5317-v1.patch, HBASE-5317-v3.patch, HBASE-5317-v4.patch, HBASE-5317-v5.patch, HBASE-5317-v6.patch, TEST-org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat.xml Running mvn -Dhadoop.profile=23 test -P localTests -Dtest=org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat yields this on 0.92: Failed tests: testColumnFamilyCompression(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): HFile for column family info-A not found Tests in error: test_TIMERANGE(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): /home/gchanan/workspace/apache92/target/test-data/276cbd0c-c771-4f81-9ba8-c464c9dd7486/test_TIMERANGE_present/_temporary/0/_temporary/_attempt_200707121733_0001_m_00_0 (Is a directory) testMRIncrementalLoad(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): TestTable testMRIncrementalLoadWithSplit(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): TestTable It looks like on trunk, this also results in an error: testExcludeMinorCompaction(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): TestTable I have a patch that fixes testColumnFamilyCompression and test_TIMERANGE, but haven't fixed the other 3 yet. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5317) Fix TestHFileOutputFormat to work against hadoop 0.23
[ https://issues.apache.org/jira/browse/HBASE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215024#comment-13215024 ] Zhihong Yu commented on HBASE-5317: --- {code} LM-SJN-00713032:org.apache.hadoop.mapred.MiniMRCluster zhihyu$ find . -name stderr ./org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-0_0/application_1330028372796_0001/container_1330028372796_0001_01_07/stderr ./org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-0_2/application_1330028372796_0001/container_1330028372796_0001_01_01/stderr ./org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-0_2/application_1330028372796_0001/container_1330028372796_0001_01_06/stderr ./org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-0_3/application_1330028372796_0001/container_1330028372796_0001_01_04/stderr ./org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-0_3/application_1330028372796_0001/container_1330028372796_0001_01_05/stderr ./org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-1_2/application_1330028372796_0001/container_1330028372796_0001_01_02/stderr ./org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-1_3/application_1330028372796_0001/container_1330028372796_0001_01_03/stderr LM-SJN-00713032:org.apache.hadoop.mapred.MiniMRCluster zhihyu$ cat org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-0_0/application_1330028372796_0001/container_1330028372796_0001_01_07/stderr WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files. LM-SJN-00713032:org.apache.hadoop.mapred.MiniMRCluster zhihyu$ cat org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-1_3/application_1330028372796_0001/container_1330028372796_0001_01_03/stderr WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files. {code} From org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-0_0/application_1330028372796_0001/container_1330028372796_0001_01_07/syslog: {code} 2012-02-23 12:22:34,208 INFO [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler: Assiging lm-sjn-00713032.corp.ebay.com:61263 with 1 to fetcher#2 2012-02-23 12:22:34,208 INFO [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler: assigned 1 of 1 to lm-sjn-00713032.corp.ebay.com:61263 to fetcher#2 2012-02-23 12:22:34,469 WARN [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to connect to lm-sjn-00713032.corp.ebay.com:61263 with 1 map outputs java.io.IOException: Server returned HTTP response code: 503 for URL: http://lm-sjn-00713032.corp.ebay.com:61263/mapOutput?job=job_1330028372796_0001reduce=0map=attempt_1330028372796_0001_m_00_1 at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1436) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:220) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:152) 2012-02-23 12:22:34,471 INFO [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler: Reporting fetch failure for attempt_1330028372796_0001_m_00_1 to jobtracker. 2012-02-23 12:22:34,471 FATAL [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler: Shuffle failed with too many fetch failures and insufficient progress! 
2012-02-23 12:22:34,472 INFO [fetcher#2] org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler: lm-sjn-00713032.corp.ebay.com:61263 freed by fetcher#2 in 264s 2012-02-23 12:22:34,472 ERROR [main] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:zhihyu (auth:SIMPLE) cause:org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#2 2012-02-23 12:22:34,472 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#2 at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:123) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:371) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:147) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:142) Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out. at org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler.checkReducerHealth(ShuffleScheduler.java:253) at org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler.copyFailed(ShuffleScheduler.java:187) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:247) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:152) 2012-02-23 12:22:34,476 INFO
[jira] [Updated] (HBASE-5357) Use builder pattern in HColumnDescriptor
[ https://issues.apache.org/jira/browse/HBASE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Bautin updated HBASE-5357: -- Attachment: Use-builder-pattern-for-HColumnDescriptor-2012-02-23_12_42_25.patch Use builder pattern in HColumnDescriptor Key: HBASE-5357 URL: https://issues.apache.org/jira/browse/HBASE-5357 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: D1851.1.patch, D1851.2.patch, D1851.3.patch, D1851.4.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-21_19_13_35.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-23_12_42_25.patch, Use-builder-pattern-for-HColumnDescriptor-20120223113155-e387d251.patch We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a builder pattern solution, e.g. {code:java} HFileWriter w = HFile.getWriterBuilder(conf, some common args) .setParameter1(value1) .setParameter2(value2) ... .build(); {code} Each parameter setter being on its own line will make merges/cherry-pick work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses the HColumnDescriptor refactoring. For StoreFile/HFile refactoring see HBASE-5442. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5357) Use builder pattern in HColumnDescriptor
[ https://issues.apache.org/jira/browse/HBASE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215030#comment-13215030 ] Mikhail Bautin commented on HBASE-5357: --- @Ted: I thought that was what I did. Attaching again. Use builder pattern in HColumnDescriptor Key: HBASE-5357 URL: https://issues.apache.org/jira/browse/HBASE-5357 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: D1851.1.patch, D1851.2.patch, D1851.3.patch, D1851.4.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-21_19_13_35.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-23_12_42_25.patch, Use-builder-pattern-for-HColumnDescriptor-20120223113155-e387d251.patch We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a builder pattern solution, e.g. {code:java} HFileWriter w = HFile.getWriterBuilder(conf, some common args) .setParameter1(value1) .setParameter2(value2) ... .build(); {code} Each parameter setter being on its own line will make merges/cherry-pick work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses the HColumnDescriptor refactoring. For StoreFile/HFile refactoring see HBASE-5442. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5357) Use builder pattern in HColumnDescriptor
[ https://issues.apache.org/jira/browse/HBASE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Bautin updated HBASE-5357: -- Attachment: (was: Use-builder-pattern-for-HColumnDescriptor-2012-02-23_12_42_25.patch) Use builder pattern in HColumnDescriptor Key: HBASE-5357 URL: https://issues.apache.org/jira/browse/HBASE-5357 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: D1851.1.patch, D1851.2.patch, D1851.3.patch, D1851.4.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-21_19_13_35.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-23_12_42_49.patch, Use-builder-pattern-for-HColumnDescriptor-20120223113155-e387d251.patch We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a builder pattern solution, e.g. {code:java} HFileWriter w = HFile.getWriterBuilder(conf, some common args) .setParameter1(value1) .setParameter2(value2) ... .build(); {code} Each parameter setter being on its own line will make merges/cherry-pick work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses the HColumnDescriptor refactoring. For StoreFile/HFile refactoring see HBASE-5442. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5357) Use builder pattern in HColumnDescriptor
[ https://issues.apache.org/jira/browse/HBASE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Bautin updated HBASE-5357: -- Attachment: Use-builder-pattern-for-HColumnDescriptor-2012-02-23_12_42_49.patch Use builder pattern in HColumnDescriptor Key: HBASE-5357 URL: https://issues.apache.org/jira/browse/HBASE-5357 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: D1851.1.patch, D1851.2.patch, D1851.3.patch, D1851.4.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-21_19_13_35.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-23_12_42_49.patch, Use-builder-pattern-for-HColumnDescriptor-20120223113155-e387d251.patch We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a builder pattern solution, e.g. {code:java} HFileWriter w = HFile.getWriterBuilder(conf, some common args) .setParameter1(value1) .setParameter2(value2) ... .build(); {code} Each parameter setter being on its own line will make merges/cherry-pick work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses the HColumnDescriptor refactoring. For StoreFile/HFile refactoring see HBASE-5442. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5440) Allow import to optionally use HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-5440: - Attachment: 5440.txt First cut: * a new import mapper that writes KeyValues * uses KeyValueSortReducer Only used when -Dimport.bulk.output=path/to/output is set. I did experiment with a Reducer that accepts Mutation (common super class of Put and Delete), but that caused more problems than it solved, hence the KeyValueImporter. Allow import to optionally use HFileOutputFormat Key: HBASE-5440 URL: https://issues.apache.org/jira/browse/HBASE-5440 Project: HBase Issue Type: Improvement Components: mapreduce Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Minor Fix For: 0.94.0 Attachments: 5440.txt importtsv supports importing into a live table or generating HFiles for bulk load. Import should allow the same. Could even consider merging these tools into one (in principle the only difference is the parsing part - although that is maybe for a different jira). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
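As a rough illustration of the first bullet (a sketch under the assumptions stated in the comment, not the attached 5440.txt): a mapper that re-emits every KeyValue of each imported Result unchanged, so KeyValueSortReducer and HFileOutputFormat can sort the cells and write HFiles:
{code:java}
// Sketch only -- the real KeyValueImporter is in the attached 5440.txt.
// Imports assumed: org.apache.hadoop.hbase.KeyValue, client.Result,
// io.ImmutableBytesWritable, org.apache.hadoop.mapreduce.Mapper.
public static class KeyValueImporter
    extends Mapper<ImmutableBytesWritable, Result, ImmutableBytesWritable, KeyValue> {
  @Override
  public void map(ImmutableBytesWritable row, Result value, Context context)
      throws IOException, InterruptedException {
    for (KeyValue kv : value.raw()) {  // re-emit each cell as-is
      context.write(row, kv);
    }
  }
}
{code}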
[jira] [Updated] (HBASE-5440) Allow import to optionally use HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-5440: - Status: Patch Available (was: Open) Allow import to optionally use HFileOutputFormat Key: HBASE-5440 URL: https://issues.apache.org/jira/browse/HBASE-5440 Project: HBase Issue Type: Improvement Components: mapreduce Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Minor Fix For: 0.94.0 Attachments: 5440.txt importtsv supports importing into a live table or generating HFiles for bulk load. Import should allow the same. Could even consider merging these tools into one (in principle the only difference is the parsing part - although that is maybe for a different jira). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5357) Use builder pattern in HColumnDescriptor
[ https://issues.apache.org/jira/browse/HBASE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215038#comment-13215038 ] Phabricator commented on HBASE-5357: mbautin has commented on the revision [jira] [HBASE-5357] Refactoring: use the builder pattern for HColumnDescriptor. The patch passed all unit tests in my map-reduce run, except TestAtomicOperation, which passed locally. Still waiting for the Hadoop QA run. INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java:438 I don't think this can break existing code. The return value will most likely be simply ignored. From http://docs.oracle.com/javase/tutorial/java/javaOO/methods.html: You cannot declare more than one method with the same name and the same number and type of arguments, because the compiler cannot tell them apart. The compiler does not consider return type when differentiating methods, so you cannot declare two methods with the same signature even if they have a different return type. REVISION DETAIL https://reviews.facebook.net/D1851 Use builder pattern in HColumnDescriptor Key: HBASE-5357 URL: https://issues.apache.org/jira/browse/HBASE-5357 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: D1851.1.patch, D1851.2.patch, D1851.3.patch, D1851.4.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-21_19_13_35.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-23_12_42_49.patch, Use-builder-pattern-for-HColumnDescriptor-20120223113155-e387d251.patch We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a builder pattern solution, e.g. {code:java} HFileWriter w = HFile.getWriterBuilder(conf, some common args) .setParameter1(value1) .setParameter2(value2) ... .build(); {code} Each parameter setter being on its own line will make merges/cherry-pick work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses the HColumnDescriptor refactoring. For StoreFile/HFile refactoring see HBASE-5442. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
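A small illustration of the point (the method shown is illustrative; line 438 of the patch is the real case): the return type is not part of the signature used for overload resolution, so turning a void setter into one that returns the descriptor leaves existing call sites compiling unchanged:
{code:java}
// Before (illustrative): public void setMaxVersions(int maxVersions)
// After:                  public HColumnDescriptor setMaxVersions(int maxVersions) // returns this
hcd.setMaxVersions(3);                    // old call sites simply ignore the return value
hcd.setMaxVersions(3).setInMemory(true);  // new call sites can chain
{code}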
[jira] [Created] (HBASE-5466) Opening a table also opens the metatable and never closes it.
Opening a table also opens the metatable and never closes it. - Key: HBASE-5466 URL: https://issues.apache.org/jira/browse/HBASE-5466 Project: HBase Issue Type: Bug Components: client Reporter: Ashley Taylor Having upgraded to the CDH3U3 version of HBase we found we had a ZooKeeper connection leak. Tracking it down, we found that closing the connection will only close the ZooKeeper connection if all calls to get the connection have been closed (there are incCount and decCount in the HConnection class). When a table is opened it makes a call to the MetaScanner class, which opens a connection to the meta table, and this table never gets closed. This causes the count in the HConnection class to never return to zero, meaning that the ZooKeeper connection will not close even when we close all the tables or call HConnectionManager.deleteConnection(config, true); -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
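The fix pattern (it matches the snippet quoted in the review comment further down) is to close the meta HTable in a finally block before it goes out of scope; a minimal sketch, assuming MetaScanner opens the handle itself:
{code:java}
// Minimal sketch of the leak fix: always close the meta table handle so
// HConnection's usage count can drop back to zero and ZooKeeper can disconnect.
HTable metaTable = null;
try {
  metaTable = new HTable(configuration, HConstants.META_TABLE_NAME);
  // ... scan .META. ...
} finally {
  if (metaTable != null) {
    metaTable.close();
  }
}
{code}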
[jira] [Updated] (HBASE-5466) Opening a table also opens the metatable and never closes it.
[ https://issues.apache.org/jira/browse/HBASE-5466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashley Taylor updated HBASE-5466: - Affects Version/s: 0.90.5 0.92.0 Status: Patch Available (was: Open) Patch to make sure the meta table, which is opened whenever a table is opened, gets closed before it falls out of scope Opening a table also opens the metatable and never closes it. - Key: HBASE-5466 URL: https://issues.apache.org/jira/browse/HBASE-5466 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.92.0, 0.90.5 Reporter: Ashley Taylor Attachments: MetaScanner_HBASE_5466.patch Having upgraded to the CDH3U3 version of HBase we found we had a ZooKeeper connection leak. Tracking it down, we found that closing the connection will only close the ZooKeeper connection if all calls to get the connection have been closed (there are incCount and decCount in the HConnection class). When a table is opened it makes a call to the MetaScanner class, which opens a connection to the meta table, and this table never gets closed. This causes the count in the HConnection class to never return to zero, meaning that the ZooKeeper connection will not close even when we close all the tables or call HConnectionManager.deleteConnection(config, true); -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-5467) changing homepage to refer to 'reference guide'
changing homepage to refer to 'reference guide' --- Key: HBASE-5467 URL: https://issues.apache.org/jira/browse/HBASE-5467 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: site_hbase_5467.patch index.xml site.xml Changing reference from book to reference guide on home page and left-nav. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5466) Opening a table also opens the metatable and never closes it.
[ https://issues.apache.org/jira/browse/HBASE-5466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashley Taylor updated HBASE-5466: - Attachment: MetaScanner_HBASE_5466.patch Opening a table also opens the metatable and never closes it. - Key: HBASE-5466 URL: https://issues.apache.org/jira/browse/HBASE-5466 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.5, 0.92.0 Reporter: Ashley Taylor Attachments: MetaScanner_HBASE_5466.patch Having upgraded to the CDH3U3 version of HBase we found we had a ZooKeeper connection leak. Tracking it down, we found that closing the connection will only close the ZooKeeper connection if all calls to get the connection have been closed (there are incCount and decCount in the HConnection class). When a table is opened it makes a call to the MetaScanner class, which opens a connection to the meta table, and this table never gets closed. This causes the count in the HConnection class to never return to zero, meaning that the ZooKeeper connection will not close even when we close all the tables or call HConnectionManager.deleteConnection(config, true); -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5467) changing homepage to refer to 'reference guide'
[ https://issues.apache.org/jira/browse/HBASE-5467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-5467: - Attachment: site_hbase_5467.patch changing homepage to refer to 'reference guide' --- Key: HBASE-5467 URL: https://issues.apache.org/jira/browse/HBASE-5467 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: site_hbase_5467.patch index.xml site.xml Changing reference from book to reference guide on home page and left-nav. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5467) changing homepage to refer to 'reference guide'
[ https://issues.apache.org/jira/browse/HBASE-5467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-5467: - Resolution: Fixed Status: Resolved (was: Patch Available) changing homepage to refer to 'reference guide' --- Key: HBASE-5467 URL: https://issues.apache.org/jira/browse/HBASE-5467 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: site_hbase_5467.patch index.xml site.xml Changing reference from book to reference guide on home page and left-nav. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5467) changing homepage to refer to 'reference guide'
[ https://issues.apache.org/jira/browse/HBASE-5467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-5467: - Status: Patch Available (was: Open) changing homepage to refer to 'reference guide' --- Key: HBASE-5467 URL: https://issues.apache.org/jira/browse/HBASE-5467 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: site_hbase_5467.patch index.xml site.xml Changing reference from book to reference guide on home page and left-nav. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5317) Fix TestHFileOutputFormat to work against hadoop 0.23
[ https://issues.apache.org/jira/browse/HBASE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215066#comment-13215066 ] Gregory Chanan commented on HBASE-5317: --- I am unable to reproduce either on my machine or on a coworker's Mac. Is it possible for you to try it on another machine? Perhaps some (security?) setting on your machine makes mapreduce unable to open the required port or something? Fix TestHFileOutputFormat to work against hadoop 0.23 - Key: HBASE-5317 URL: https://issues.apache.org/jira/browse/HBASE-5317 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.92.0, 0.94.0 Reporter: Gregory Chanan Assignee: Gregory Chanan Attachments: HBASE-5317-v0.patch, HBASE-5317-v1.patch, HBASE-5317-v3.patch, HBASE-5317-v4.patch, HBASE-5317-v5.patch, HBASE-5317-v6.patch, TEST-org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat.xml Running mvn -Dhadoop.profile=23 test -P localTests -Dtest=org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat yields this on 0.92: Failed tests: testColumnFamilyCompression(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): HFile for column family info-A not found Tests in error: test_TIMERANGE(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): /home/gchanan/workspace/apache92/target/test-data/276cbd0c-c771-4f81-9ba8-c464c9dd7486/test_TIMERANGE_present/_temporary/0/_temporary/_attempt_200707121733_0001_m_00_0 (Is a directory) testMRIncrementalLoad(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): TestTable testMRIncrementalLoadWithSplit(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): TestTable It looks like on trunk, this also results in an error: testExcludeMinorCompaction(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): TestTable I have a patch that fixes testColumnFamilyCompression and test_TIMERANGE, but haven't fixed the other 3 yet. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5466) Opening a table also opens the metatable and never closes it.
[ https://issues.apache.org/jira/browse/HBASE-5466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215068#comment-13215068 ] Zhihong Yu commented on HBASE-5466: --- Thanks for the finding. {code} + }finally{ + if(metaTable!=null){ + metaTable.close(); + } {code} We use two spaces for indentation. Can you regenerate the patch? Refer to HBASE-3678. Opening a table also opens the metatable and never closes it. - Key: HBASE-5466 URL: https://issues.apache.org/jira/browse/HBASE-5466 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.5, 0.92.0 Reporter: Ashley Taylor Attachments: MetaScanner_HBASE_5466.patch Having upgraded to the CDH3U3 version of HBase we found we had a ZooKeeper connection leak. Tracking it down, we found that closing the connection will only close the ZooKeeper connection if all calls to get the connection have been closed (there are incCount and decCount in the HConnection class). When a table is opened it makes a call to the MetaScanner class, which opens a connection to the meta table, and this table never gets closed. This causes the count in the HConnection class to never return to zero, meaning that the ZooKeeper connection will not close even when we close all the tables or call HConnectionManager.deleteConnection(config, true); -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
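For reference, the quoted block reformatted to the requested two-space, spaced-keyword style:
{code}
} finally {
  if (metaTable != null) {
    metaTable.close();
  }
}
{code}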
[jira] [Commented] (HBASE-5317) Fix TestHFileOutputFormat to work against hadoop 0.23
[ https://issues.apache.org/jira/browse/HBASE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215069#comment-13215069 ] Zhihong Yu commented on HBASE-5317: --- I will try again when I get home. I noticed weird connection issues at work - inability to use Colloquy, etc Fix TestHFileOutputFormat to work against hadoop 0.23 - Key: HBASE-5317 URL: https://issues.apache.org/jira/browse/HBASE-5317 Project: HBase Issue Type: Bug Components: test Affects Versions: 0.92.0, 0.94.0 Reporter: Gregory Chanan Assignee: Gregory Chanan Attachments: HBASE-5317-v0.patch, HBASE-5317-v1.patch, HBASE-5317-v3.patch, HBASE-5317-v4.patch, HBASE-5317-v5.patch, HBASE-5317-v6.patch, TEST-org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat.xml Running mvn -Dhadoop.profile=23 test -P localTests -Dtest=org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat yields this on 0.92: Failed tests: testColumnFamilyCompression(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): HFile for column family info-A not found Tests in error: test_TIMERANGE(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): /home/gchanan/workspace/apache92/target/test-data/276cbd0c-c771-4f81-9ba8-c464c9dd7486/test_TIMERANGE_present/_temporary/0/_temporary/_attempt_200707121733_0001_m_00_0 (Is a directory) testMRIncrementalLoad(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): TestTable testMRIncrementalLoadWithSplit(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): TestTable It looks like on trunk, this also results in an error: testExcludeMinorCompaction(org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat): TestTable I have a patch that fixes testColumnFamilyCompression and test_TIMERANGE, but haven't fixed the other 3 yet. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5440) Allow import to optionally use HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215071#comment-13215071 ] Hadoop QA commented on HBASE-5440: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12515813/5440.txt against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 javadoc. The javadoc tool appears to have generated -136 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 152 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestAtomicOperation org.apache.hadoop.hbase.coprocessor.TestClassLoading org.apache.hadoop.hbase.mapreduce.TestImportTsv org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1027//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1027//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1027//console This message is automatically generated. Allow import to optionally use HFileOutputFormat Key: HBASE-5440 URL: https://issues.apache.org/jira/browse/HBASE-5440 Project: HBase Issue Type: Improvement Components: mapreduce Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Minor Fix For: 0.94.0 Attachments: 5440.txt importtsv supports importing into a live table or generating HFiles for bulk load. Import should allow the same. Could even consider merging these tools into one (in principle the only difference is the parsing part - although that is maybe for a different jira). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5357) Use builder pattern in HColumnDescriptor
[ https://issues.apache.org/jira/browse/HBASE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215078#comment-13215078 ] Hadoop QA commented on HBASE-5357: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12515810/Use-builder-pattern-for-HColumnDescriptor-2012-02-23_12_42_49.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 78 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -136 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 152 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestImportTsv Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1028//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1028//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1028//console This message is automatically generated. Use builder pattern in HColumnDescriptor Key: HBASE-5357 URL: https://issues.apache.org/jira/browse/HBASE-5357 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: D1851.1.patch, D1851.2.patch, D1851.3.patch, D1851.4.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-21_19_13_35.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-23_12_42_49.patch, Use-builder-pattern-for-HColumnDescriptor-20120223113155-e387d251.patch We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a builder pattern solution, e.g. {code:java} HFileWriter w = HFile.getWriterBuilder(conf, some common args) .setParameter1(value1) .setParameter2(value2) ... .build(); {code} Each parameter setter being on its own line will make merges/cherry-pick work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses the HColumnDescriptor refactoring. For StoreFile/HFile refactoring see HBASE-5442. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
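To make the proposal concrete for this issue, a hypothetical HColumnDescriptor builder usage in the same style as the HFile sketch above; the builder entry point and setter names below are illustrative, not the committed API:

{code:java}
// Hypothetical usage of the proposed builder; names are assumptions.
HColumnDescriptor family =
    HColumnDescriptor.getBuilder("info")   // assumed entry point
        .setMaxVersions(3)
        .setBlocksize(64 * 1024)
        .setBlockCacheEnabled(true)
        .build();
{code}

As the description argues, with each setter on its own line a cherry-pick that adds one parameter touches one line instead of rewriting a many-argument constructor call.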
[jira] [Updated] (HBASE-5466) Opening a table also opens the metatable and never closes it.
[ https://issues.apache.org/jira/browse/HBASE-5466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashley Taylor updated HBASE-5466: - Attachment: MetaScanner_HBASE_5466(2).patch Patch regenerated. Thanks for the link to that JIRA task. Opening a table also opens the metatable and never closes it. - Key: HBASE-5466 URL: https://issues.apache.org/jira/browse/HBASE-5466 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.5, 0.92.0 Reporter: Ashley Taylor Attachments: MetaScanner_HBASE_5466(2).patch, MetaScanner_HBASE_5466.patch Having upgraded to the CDH3U3 version of HBase, we found we had a ZooKeeper connection leak. Tracking it down, we found that closing the connection will only close the ZooKeeper connection if all calls to get the connection have been closed (there are incCount and decCount in the HConnection class). When a table is opened, it makes a call to the MetaScanner class, which opens a connection to the meta table, and this table never gets closed. This causes the count in the HConnection class to never return to zero, meaning that the ZooKeeper connection will not close when we close all the tables or call HConnectionManager.deleteConnection(config, true); -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
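The shape of the fix, judging from the patch fragments quoted later in this thread, is to close the meta table in a finally block so the HConnection use count can drop back to zero. A sketch of that pattern (a fragment, not the exact committed code; the variable "configuration" is assumed to be the caller's Configuration):

{code:java}
HTable metaTable = null;
try {
  metaTable = new HTable(configuration, HConstants.META_TABLE_NAME);
  // ... scan .META. as MetaScanner does today ...
} finally {
  if (metaTable != null) {
    metaTable.close();  // releases the HConnection reference it took
  }
}
{code}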
[jira] [Commented] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215083#comment-13215083 ] Phabricator commented on HBASE-5074: dhruba has commented on the revision [jira] [HBASE-5074] Support checksums in HBase block cache. INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:73 Actually, a new checksum object is created by every invocation of ChecksumType.getChecksumObject(), so it should be thread-safe src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java:120 doing it src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java:164 will restructure the comment, this feature is switched on by default. REVISION DETAIL https://reviews.facebook.net/D1521 support checksums in HBase block cache -- Key: HBASE-5074 URL: https://issues.apache.org/jira/browse/HBASE-5074 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Attachments: D1521.1.patch, D1521.1.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch The current implementation of HDFS stores the data in one block file and the metadata(checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage-hardware offers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
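For readers new to this thread, the gist of inline checksums is that data and checksum travel in the same block, so a single disk read fetches and verifies both. A toy illustration with java.util.zip.CRC32 (illustrative only, not the patch's actual checksum machinery):

{code:java}
import java.util.zip.CRC32;

public class InlineChecksumExample {
  // Store a CRC32 per chunk alongside the data itself, so the block and its
  // checksums live in the same file and cost a single seek to read.
  public static long checksumOf(byte[] data, int off, int len) {
    CRC32 crc = new CRC32();
    crc.update(data, off, len);
    return crc.getValue();
  }

  public static boolean verify(byte[] data, int off, int len, long expected) {
    return checksumOf(data, off, len) == expected;
  }
}
{code}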
[jira] [Commented] (HBASE-4365) Add a decent heuristic for region size
[ https://issues.apache.org/jira/browse/HBASE-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215080#comment-13215080 ] Jean-Daniel Cryans commented on HBASE-4365: --- Conclusion for the 1TB upload: Flush size: 512MB Split size: 20GB Without patch: 18012s With patch: 12505s It's 1.44x better, so a huge improvement. The difference here is due to the fact that it takes an awfully long time to split the first few regions without the patch. In the past I was starting the test with a smaller split size and then once I got a good distribution I was doing an online alter to set it to 20GB. Not anymore with this patch :) Another observation: the upload in general is slowed down by too many store files blocking. I could trace this to compactions taking a long time to get rid of reference files (3.5GB taking more than 10 minutes) and during that time you can hit the block multiple times. We really ought to see how we can optimize the compactions, consider compacting those big files in many threads instead of only one, and enable referencing reference files to skip some compactions altogether. Add a decent heuristic for region size -- Key: HBASE-4365 URL: https://issues.apache.org/jira/browse/HBASE-4365 Project: HBase Issue Type: Improvement Affects Versions: 0.92.1, 0.94.0 Reporter: Todd Lipcon Priority: Critical Labels: usability Attachments: 4365-v2.txt, 4365.txt A few of us were brainstorming this morning about what the default region size should be. There were a few general points made: - in some ways it's better to be too-large than too-small, since you can always split a table further, but you can't merge regions currently - with HFile v2 and multithreaded compactions there are fewer reasons to avoid very-large regions (10GB+) - for small tables you may want a small region size just so you can distribute load better across a cluster - for big tables, multi-GB is probably best -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
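For reference, the per-table settings J-D describes (20GB split size, 512MB flush size) can also be set at table-creation time rather than via an online alter. A sketch against the 0.92-era client API; the table and family names are examples:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class CreateBigRegionTable {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    HTableDescriptor desc = new HTableDescriptor("TestTable");
    desc.addFamily(new HColumnDescriptor("info"));
    desc.setMaxFileSize(20L * 1024 * 1024 * 1024);   // 20GB split size
    desc.setMemStoreFlushSize(512L * 1024 * 1024);   // 512MB flush size
    admin.createTable(desc);
  }
}
{code}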
[jira] [Commented] (HBASE-5347) GC free memory management in Level-1 Block Cache
[ https://issues.apache.org/jira/browse/HBASE-5347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215085#comment-13215085 ] Prakash Khemani commented on HBASE-5347: Lars, you are right. I have been trying to induce a full GC but without any success. (I can induce a full GC if I artificially hold some key-values in a queue and force them to be tenured.) On 89-fb, my test case is doing random increments on a space of slightly more than 40GB worth of key-value data. The heap is set to 36GB. The LRU cache has high and low watermarks of .98 and .85. The region server spawns 1000 threads that continuously do the increments. The eviction thread manages to keep the block cache at about 85% always. Cache-on-write is turned on to induce more cache churn. All 12 disks are pegged at close to 100% read. GC takes 60% of the CPU (sum of user times in 1000 lines of GC log / (elapsed time * #cpus)). Compactions that get started never complete while the load is on. I guess I have to change the dynamics of the test case to induce GC pauses. GC free memory management in Level-1 Block Cache Key: HBASE-5347 URL: https://issues.apache.org/jira/browse/HBASE-5347 Project: HBase Issue Type: Improvement Reporter: Prakash Khemani Assignee: Prakash Khemani Attachments: D1635.5.patch On eviction of a block from the block cache, instead of waiting for the garbage collector to reuse its memory, reuse the block right away. This will require us to keep reference counts on the HFile blocks. Once we have the reference counts in place we can do our own simple blocks-out-of-slab allocation for the block cache. This will help us with * reducing GC pressure, especially in the old generation * making it possible to have non-Java-heap memory backing the HFile blocks -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
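To make the mechanism under discussion concrete, a minimal sketch of the reference-counting scheme the issue describes (illustrative only, not the D1635 patch): a cached block is recycled into the slab allocator only once every reader has released it.

{code:java}
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class RefCountedBlock {
  private final byte[] buf;
  private final AtomicInteger refCount = new AtomicInteger(1); // the cache's ref
  private final ConcurrentLinkedQueue<byte[]> slab;            // free list

  public RefCountedBlock(byte[] buf, ConcurrentLinkedQueue<byte[]> slab) {
    this.buf = buf;
    this.slab = slab;
  }

  public byte[] retain() {           // a reader takes a reference
    refCount.incrementAndGet();
    return buf;
  }

  public void release() {           // a reader finishes, or the block is evicted
    if (refCount.decrementAndGet() == 0) {
      slab.offer(buf);              // reuse the memory right away, no GC needed
    }
  }
}
{code}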
[jira] [Commented] (HBASE-4365) Add a decent heuristic for region size
[ https://issues.apache.org/jira/browse/HBASE-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215088#comment-13215088 ] Todd Lipcon commented on HBASE-4365: Great results! Very cool. Add a decent heuristic for region size -- Key: HBASE-4365 URL: https://issues.apache.org/jira/browse/HBASE-4365 Project: HBase Issue Type: Improvement Affects Versions: 0.92.1, 0.94.0 Reporter: Todd Lipcon Priority: Critical Labels: usability Attachments: 4365-v2.txt, 4365.txt A few of us were brainstorming this morning about what the default region size should be. There were a few general points made: - in some ways it's better to be too-large than too-small, since you can always split a table further, but you can't merge regions currently - with HFile v2 and multithreaded compactions there are fewer reasons to avoid very-large regions (10GB+) - for small tables you may want a small region size just so you can distribute load better across a cluster - for big tables, multi-GB is probably best -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5442) Use builder pattern in StoreFile and HFile
[ https://issues.apache.org/jira/browse/HBASE-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215094#comment-13215094 ] Hadoop QA commented on HBASE-5442: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12515717/HFile-StoreFile-builder-2012-02-22_22_49_00.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 73 new or modified tests. -1 javadoc. The javadoc tool appears to have generated -135 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 153 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.replication.TestReplicationPeer Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1029//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1029//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1029//console This message is automatically generated. Use builder pattern in StoreFile and HFile -- Key: HBASE-5442 URL: https://issues.apache.org/jira/browse/HBASE-5442 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: D1893.1.patch, D1893.2.patch, HFile-StoreFile-builder-2012-02-22_22_49_00.patch We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a builder pattern solution, e.g. {code:java} HFileWriter w = HFile.getWriterBuilder(conf, some common args) .setParameter1(value1) .setParameter2(value2) ... .build(); {code} Each parameter setter being on its own line will make merges/cherry-pick work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses StoreFile and HFile refactoring. For HColumnDescriptor refactoring see HBASE-5357. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5407) Show the per-region level request/sec count in the web ui
[ https://issues.apache.org/jira/browse/HBASE-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215099#comment-13215099 ] Phabricator commented on HBASE-5407: Kannan has commented on the revision [jira][HBASE-5407][89-fb] Show the per-region level request/sec count in the web ui. INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/HServerLoad.java:279 I think these fields need not/shouldn't be part of the serialize/deserialize -- otherwise, we'll break client-server interop for the getClusterStatus call. We temporarily ran into a similar problem before, and the relevant stack is: at java.io.DataInputStream.readFully(DataInputStream.java:152) at org.apache.hadoop.hbase.HServerLoad$RegionLoad.readFields(HServerLoad.java:211) at org.apache.hadoop.hbase.HServerLoad.readFields(HServerLoad.java:512) at org.apache.hadoop.hbase.HServerInfo.readFields(HServerInfo.java:228) at org.apache.hadoop.hbase.ClusterStatus.readFields(ClusterStatus.java:226) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:489) at org.apache.hadoop.hbase.io.HbaseObjectWritable.readFields(HbaseObjectWritable.java:230) at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.receiveResponse(HBaseClient.java:534) at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:459) java.lang.reflect.UndeclaredThrowableException at $Proxy0.getClusterStatus(Unknown Source) at org.apache.hadoop.hbase.client.HBaseAdmin.getClusterStatus(HBaseAdmin.java:1076) We should remove these from serialize/deserialize, but the metrics should still be available for viewing in the web UI. REVISION DETAIL https://reviews.facebook.net/D1779 BRANCH regionRequest Show the per-region level request/sec count in the web ui - Key: HBASE-5407 URL: https://issues.apache.org/jira/browse/HBASE-5407 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang Attachments: D1779.1.patch, D1779.1.patch, D1779.1.patch It would be nice to show the per-region level request/sec count in the web ui, especially when debugging the hot region problem. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
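A sketch of the interop-safe approach Kannan describes: the new counters stay out of the Writable wire format, so older clients can still deserialize the payload, but remain readable for the web UI through a plain getter. The field names here are illustrative, not the actual HServerLoad members:

{code:java}
public class RegionLoadExample implements org.apache.hadoop.io.Writable {
  private int storefiles;                 // pre-existing, serialized field
  private transient long requestsPerSec;  // new metric, NOT serialized

  public long getRequestsPerSecond() {    // still available to the web UI
    return requestsPerSec;
  }

  @Override
  public void write(java.io.DataOutput out) throws java.io.IOException {
    out.writeInt(storefiles);             // wire format unchanged
    // requestsPerSec deliberately omitted to preserve interop
  }

  @Override
  public void readFields(java.io.DataInput in) throws java.io.IOException {
    storefiles = in.readInt();
  }
}
{code}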
[jira] [Commented] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215108#comment-13215108 ] Phabricator commented on HBASE-5074: dhruba has commented on the revision [jira] [HBASE-5074] Support checksums in HBase block cache. INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:235 This constructor is used only for V2, hence the major number is not a parameter. src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1235 I think there won't be any changes to the number of threads in the datanode. A datanode thread is not tied up with a client FileSystem object. Instead, a global pool of threads in the datanode is free to serve any read requests from any client src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1244 The minor version indicates disk-format changes inside an HFileBlock. The major version indicates disk-format changes within an entire HFile. Since the AbstractFSReader only reads HFileBlocks, it is logical that it contains the minorVersion, is it not? But I can put the majorVersion in it as well, if you so desire. src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1584 Yes, the default is to enable hbase-checksum verification. And you are right that if the hfile is of the older type, then we will quickly flip this back to false (in the next line) src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1630 I think we should keep both streams active till the HFile itself is closed. src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1646 done src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java:961 Yes, precisely. Going forward, I would like to see if we can make HLogs go to a filesystem object that is different from the filesystem used for hfiles. src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java:86 I agree with you completely. This is an interface that should not change often. REVISION DETAIL https://reviews.facebook.net/D1521 support checksums in HBase block cache -- Key: HBASE-5074 URL: https://issues.apache.org/jira/browse/HBASE-5074 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Attachments: D1521.1.patch, D1521.1.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch The current implementation of HDFS stores the data in one block file and the metadata(checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage-hardware offers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
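The version-gating behavior discussed around HFileBlock.java:1584 amounts to something like the following sketch; the constant name and value are assumptions, not the patch's actual code:

{code:java}
public class ChecksumGate {
  static final int MINOR_VERSION_WITH_CHECKSUM = 1; // assumed constant

  // Checksums default to on, but blocks written before the new minor
  // version fall back to HDFS-level checksum verification.
  static boolean useHBaseChecksum(int minorVersion, boolean enabledByConf) {
    return enabledByConf && minorVersion >= MINOR_VERSION_WITH_CHECKSUM;
  }
}
{code}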
[jira] [Updated] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-5074: --- Attachment: D1521.9.patch dhruba updated the revision [jira] [HBASE-5074] Support checksums in HBase block cache. Reviewers: mbautin Pulled in review comments from Stack and Ted. REVISION DETAIL https://reviews.facebook.net/D1521 AFFECTED FILES src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestCloseRegionHandler.java src/test/java/org/apache/hadoop/hbase/regionserver/TestFSErrorsExposed.java src/test/java/org/apache/hadoop/hbase/regionserver/HFileReadWriteTest.java src/test/java/org/apache/hadoop/hbase/regionserver/TestCompoundBloomFilter.java src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreFile.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileReaderV1.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestFixedFileTrailer.java src/test/java/org/apache/hadoop/hbase/io/hfile/CacheTestUtils.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileDataBlockEncoder.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java src/test/java/org/apache/hadoop/hbase/util/MockRegionServerServices.java src/main/java/org/apache/hadoop/hbase/HConstants.java src/main/java/org/apache/hadoop/hbase/fs src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java src/main/java/org/apache/hadoop/hbase/util/CompoundBloomFilter.java src/main/java/org/apache/hadoop/hbase/util/ChecksumFactory.java src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java src/main/java/org/apache/hadoop/hbase/regionserver/Store.java src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java src/main/java/org/apache/hadoop/hbase/io/hfile/NoOpDataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java src/main/java/org/apache/hadoop/hbase/io/hfile/FixedFileTrailer.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV1.java support checksums in HBase block cache -- Key: HBASE-5074 URL: https://issues.apache.org/jira/browse/HBASE-5074 
Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Attachments: D1521.1.patch, D1521.1.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch, D1521.9.patch, D1521.9.patch The current implementation of HDFS stores the data in one block file and the metadata(checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage-hardware offers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215115#comment-13215115 ] Phabricator commented on HBASE-5074: mbautin has commented on the revision [jira] [HBASE-5074] Support checksums in HBase block cache. @dhruba: going through the diff once again. Since you've updated the revision, submitting existing comments against the previous version, and continuing with the new version. INLINE COMMENTS src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:131 Misspelling: Minimun - Minimum src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java:44-45 Can these two be made final too? src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:145 s/chuck/chunk/ src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java:48 Fix javadoc: do do - do src/main/java/org/apache/hadoop/hbase/io/hfile/ChecksumUtil.java:38 Make this final, rename to DUMMY_VALUE, because this is a constant, and make the length a factor of 16 to take advantage of alignment. src/main/java/org/apache/hadoop/hbase/io/hfile/FixedFileTrailer.java:532 s/manor/major/ src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java:157 This comment is misleading. This is not something that defaults to the 16 K, but the default value itself. I think this should say something about how a non-default value is specified. src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java:265-271 The additional constructor should not be needed when https://issues.apache.org/jira/browse/HBASE-5442 goes in. src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java:409 Is it possible to obtain the filesystem from the input stream rather than pass it as an additional parameter? Or is the underlying filesystem of the input stream a regular one, as opposed to an HFileSystem? REVISION DETAIL https://reviews.facebook.net/D1521 support checksums in HBase block cache -- Key: HBASE-5074 URL: https://issues.apache.org/jira/browse/HBASE-5074 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Attachments: D1521.1.patch, D1521.1.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch, D1521.9.patch, D1521.9.patch The current implementation of HDFS stores the data in one block file and the metadata(checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage-hardware offers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5466) Opening a table also opens the metatable and never closes it.
[ https://issues.apache.org/jira/browse/HBASE-5466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215127#comment-13215127 ] Hadoop QA commented on HBASE-5466: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12515823/MetaScanner_HBASE_5466%282%29.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 javadoc. The javadoc tool appears to have generated -136 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 152 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestImportTsv Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1030//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1030//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1030//console This message is automatically generated. Opening a table also opens the metatable and never closes it. - Key: HBASE-5466 URL: https://issues.apache.org/jira/browse/HBASE-5466 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.5, 0.92.0 Reporter: Ashley Taylor Attachments: MetaScanner_HBASE_5466(2).patch, MetaScanner_HBASE_5466.patch Having upgraded to the CDH3U3 version of HBase, we found we had a ZooKeeper connection leak. Tracking it down, we found that closing the connection will only close the ZooKeeper connection if all calls to get the connection have been closed (there are incCount and decCount in the HConnection class). When a table is opened, it makes a call to the MetaScanner class, which opens a connection to the meta table, and this table never gets closed. This causes the count in the HConnection class to never return to zero, meaning that the ZooKeeper connection will not close when we close all the tables or call HConnectionManager.deleteConnection(config, true); -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5466) Opening a table also opens the metatable and never closes it.
[ https://issues.apache.org/jira/browse/HBASE-5466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215129#comment-13215129 ] Hadoop QA commented on HBASE-5466: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12515823/MetaScanner_HBASE_5466%282%29.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 javadoc. The javadoc tool appears to have generated -136 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 152 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.io.hfile.TestForceCacheImportantBlocks org.apache.hadoop.hbase.mapreduce.TestImportTsv org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat org.apache.hadoop.hbase.TestZooKeeper Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1031//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1031//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1031//console This message is automatically generated. Opening a table also opens the metatable and never closes it. - Key: HBASE-5466 URL: https://issues.apache.org/jira/browse/HBASE-5466 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.5, 0.92.0 Reporter: Ashley Taylor Attachments: MetaScanner_HBASE_5466(2).patch, MetaScanner_HBASE_5466.patch Having upgraded to the CDH3U3 version of HBase, we found we had a ZooKeeper connection leak. Tracking it down, we found that closing the connection will only close the ZooKeeper connection if all calls to get the connection have been closed (there are incCount and decCount in the HConnection class). When a table is opened, it makes a call to the MetaScanner class, which opens a connection to the meta table, and this table never gets closed. This causes the count in the HConnection class to never return to zero, meaning that the ZooKeeper connection will not close when we close all the tables or call HConnectionManager.deleteConnection(config, true); -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-5468) [book] updating copyright in Reference Guide
[book] updating copyright in Reference Guide Key: HBASE-5468 URL: https://issues.apache.org/jira/browse/HBASE-5468 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Trivial Attachments: book_hbase_5468.xml.patch book.xml updating copyright to 2012 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5468) [book] updating copyright in Reference Guide
[ https://issues.apache.org/jira/browse/HBASE-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-5468: - Status: Patch Available (was: Open) [book] updating copyright in Reference Guide Key: HBASE-5468 URL: https://issues.apache.org/jira/browse/HBASE-5468 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Trivial Attachments: book_hbase_5468.xml.patch book.xml updating copyright to 2012 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5468) [book] updating copyright in Reference Guide
[ https://issues.apache.org/jira/browse/HBASE-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-5468: - Attachment: book_hbase_5468.xml.patch [book] updating copyright in Reference Guide Key: HBASE-5468 URL: https://issues.apache.org/jira/browse/HBASE-5468 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Trivial Attachments: book_hbase_5468.xml.patch book.xml updating copyright to 2012 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5468) [book] updating copyright in Reference Guide
[ https://issues.apache.org/jira/browse/HBASE-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-5468: - Resolution: Fixed Status: Resolved (was: Patch Available) [book] updating copyright in Reference Guide Key: HBASE-5468 URL: https://issues.apache.org/jira/browse/HBASE-5468 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Trivial Attachments: book_hbase_5468.xml.patch book.xml updating copyright to 2012 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5466) Opening a table also opens the metatable and never closes it.
[ https://issues.apache.org/jira/browse/HBASE-5466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215137#comment-13215137 ] stack commented on HBASE-5466: -- +1 on patch (except for the spacing that is not like the rest of the file) Opening a table also opens the metatable and never closes it. - Key: HBASE-5466 URL: https://issues.apache.org/jira/browse/HBASE-5466 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.5, 0.92.0 Reporter: Ashley Taylor Attachments: MetaScanner_HBASE_5466(2).patch, MetaScanner_HBASE_5466.patch Having upgraded to the CDH3U3 version of HBase, we found we had a ZooKeeper connection leak. Tracking it down, we found that closing the connection only closes the ZooKeeper connection once every caller that obtained the connection has closed it; the HConnection class tracks this with incCount and decCount. When a table is opened, it makes a call to the MetaScanner class, which opens a connection to the meta table; this table never gets closed. This causes the count in the HConnection class to never return to zero, meaning that the ZooKeeper connection will not close when we close all the tables or call HConnectionManager.deleteConnection(config, true). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5466) Opening a table also opens the metatable and never closes it.
[ https://issues.apache.org/jira/browse/HBASE-5466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215140#comment-13215140 ] Zhihong Yu commented on HBASE-5466: --- TestZooKeeper passed locally with patch v2.
{code}
+ }finally{
+if(metaTable!=null){
{code}
There should be a space between } and finally, between finally and {, between if and (, and between ) and {. Overall, +1 on patch v2. Please fix the formatting in v3. Opening a table also opens the metatable and never closes it. - Key: HBASE-5466 URL: https://issues.apache.org/jira/browse/HBASE-5466 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.5, 0.92.0 Reporter: Ashley Taylor Attachments: MetaScanner_HBASE_5466(2).patch, MetaScanner_HBASE_5466.patch Having upgraded to the CDH3U3 version of HBase, we found we had a ZooKeeper connection leak. Tracking it down, we found that closing the connection only closes the ZooKeeper connection once every caller that obtained the connection has closed it; the HConnection class tracks this with incCount and decCount. When a table is opened, it makes a call to the MetaScanner class, which opens a connection to the meta table; this table never gets closed. This causes the count in the HConnection class to never return to zero, meaning that the ZooKeeper connection will not close when we close all the tables or call HConnectionManager.deleteConnection(config, true). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
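For reference, the whitespace being asked for would make the quoted lines read as follows (formatting only; the logic from the patch excerpt is unchanged):
{code}
} finally {
  if (metaTable != null) {
{code}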
[jira] [Commented] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215149#comment-13215149 ] Phabricator commented on HBASE-5074: tedyu has commented on the revision [jira] [HBASE-5074] Support checksums in HBase block cache. INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:235 What will happen after HFileV3 is introduced? I would expect HFileV3 to start with a minorVersion of 0.
src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java:961 HLog goes to fs on SSD? Nice.
REVISION DETAIL https://reviews.facebook.net/D1521 support checksums in HBase block cache -- Key: HBASE-5074 URL: https://issues.apache.org/jira/browse/HBASE-5074 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Attachments: D1521.1.patch, D1521.1.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch, D1521.9.patch, D1521.9.patch The current implementation of HDFS stores the data in one block file and the metadata (checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage hardware offers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
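The question matters because HBASE-5074 gates checksum support on a block-level minor version rather than a new major HFile version. A hedged sketch of that gating (the constant name and value here are assumptions, not quoted from the patch):
{code:java}
// Blocks written before this minor version carry no HBase-level checksums.
static final int MINOR_VERSION_WITH_CHECKSUM = 1;  // assumed value

boolean blockHasChecksums(int minorVersion) {
  // Whether a future HFileV3 restarts minor versions at 0, as suggested
  // above, or keeps counting is exactly the ambiguity being raised.
  return minorVersion >= MINOR_VERSION_WITH_CHECKSUM;
}
{code}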
[jira] [Commented] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215150#comment-13215150 ] Phabricator commented on HBASE-5074: stack has commented on the revision [jira] [HBASE-5074] Support checksums in HBase block cache. Good w/ your comebacks Dhruba... just a minor one below for your next rev. Let us know how the cluster testing goes. This patch applies fine. Might try it out over here too... INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:235 I don't understand. I think this means the fact that we have a minor version unaccompanied by a major version needs documenting here in a comment? No hurry.
REVISION DETAIL https://reviews.facebook.net/D1521 support checksums in HBase block cache -- Key: HBASE-5074 URL: https://issues.apache.org/jira/browse/HBASE-5074 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Attachments: D1521.1.patch, D1521.1.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch, D1521.9.patch, D1521.9.patch The current implementation of HDFS stores the data in one block file and the metadata (checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage hardware offers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5074) support checksums in HBase block cache
[ https://issues.apache.org/jira/browse/HBASE-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215153#comment-13215153 ] Phabricator commented on HBASE-5074: mbautin has commented on the revision [jira] [HBASE-5074] Support checksums in HBase block cache. @dhruba: some more comments inline. INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:451-452 Assign headerSize() to a local variable instead of calling it twice.
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:529-530 Call headerSize() once and store it in a local variable.
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1232 "do do" -> "do"
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1642-1644 Store and reuse part of the previous error message.
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1636 Check if WARN level messages are enabled and only generate the message string in that case.
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java:1848 double semicolon (does not matter)
src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java:424 What if istream != istreamNoFsChecksum but istreamNoFsChecksum == null?
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:3610-3612 Not sure how this is related to HBase-level checksum checking.
src/main/java/org/apache/hadoop/hbase/regionserver/Store.java:265 Make this conf key a constant in HConstants.
src/main/java/org/apache/hadoop/hbase/regionserver/Store.java:275 conf key -> HConstants
src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java:40-43 This is unnecessary because the default toString would do the same.
src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java:57-60 This is unnecessary because the default toString would do the same.
src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java:103-106 This is unnecessary because the default toString would do the same.
src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java:143-144 It looks like toString would do this.
src/main/java/org/apache/hadoop/hbase/util/ChecksumType.java:179 Would not the built-in enum method valueOf do what this function is doing?
src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java:1 This file still seems to contain a lot of copy-and-paste from TestHFileBlock. Are you planning to address that?
REVISION DETAIL https://reviews.facebook.net/D1521 support checksums in HBase block cache -- Key: HBASE-5074 URL: https://issues.apache.org/jira/browse/HBASE-5074 Project: HBase Issue Type: Improvement Components: regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur Attachments: D1521.1.patch, D1521.1.patch, D1521.2.patch, D1521.2.patch, D1521.3.patch, D1521.3.patch, D1521.4.patch, D1521.4.patch, D1521.5.patch, D1521.5.patch, D1521.6.patch, D1521.6.patch, D1521.7.patch, D1521.7.patch, D1521.8.patch, D1521.8.patch, D1521.9.patch, D1521.9.patch The current implementation of HDFS stores the data in one block file and the metadata (checksum) in another block file. This means that every read into the HBase block cache actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage hardware offers. -- This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
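Several of these are recurring review patterns rather than HFile specifics. Hedged fragments illustrating the three most general ones (variable names are illustrative, not the actual HFileBlock/ChecksumType code):
{code:java}
// 1. Call headerSize() once and reuse the result instead of invoking it twice.
int hdrSize = headerSize();
buffer.position(hdrSize);
int dataSize = buffer.limit() - hdrSize;

// 2. Guard expensive log-message construction behind the level check.
if (LOG.isWarnEnabled()) {
  LOG.warn("Checksum failure reading " + path + " at offset " + offset);
}

// 3. Enums already provide toString() and valueOf(); hand-rolled versions
//    that just return or look up the constant's name are redundant.
ChecksumType type = ChecksumType.valueOf("CRC32");
{code}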
[jira] [Updated] (HBASE-5415) FSTableDescriptors should handle random folders in hbase.root.dir better
[ https://issues.apache.org/jira/browse/HBASE-5415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Daniel Cryans updated HBASE-5415: -- Attachment: HBASE-5415.patch This patch handles it by just printing a WARN; the side effect is that this method doesn't throw TableExistsException anymore (which didn't make sense anyway), so I cleaned up a bunch of code. FSTableDescriptors should handle random folders in hbase.root.dir better Key: HBASE-5415 URL: https://issues.apache.org/jira/browse/HBASE-5415 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Priority: Critical Fix For: 0.92.1, 0.94.0 Attachments: HBASE-5415.patch I faked an upgrade on a test cluster using our dev data, so I had to distcp the data between the two clusters, but after starting up and doing the migration and whatnot the web UI didn't show any table. The reason was in the master's log:
{quote}
org.apache.hadoop.hbase.TableExistsException: No descriptor for _distcp_logs_e0ehek
at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:164)
at org.apache.hadoop.hbase.util.FSTableDescriptors.getAll(FSTableDescriptors.java:182)
at org.apache.hadoop.hbase.master.HMaster.getHTableDescriptors(HMaster.java:1554)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1326)
{quote}
I don't think we need to show a full stack trace (just a WARN, maybe); this shouldn't kill the request (we should still see tables in the web UI); and why is that a TableExistsException? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
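A sketch of the behavior change described above, assuming the rough shape of FSTableDescriptors.getAll() (names and helpers here are illustrative, not the literal patch):
{code:java}
// Junk directories under hbase.rootdir (e.g. distcp's _distcp_logs_*) now
// produce a WARN and are skipped instead of aborting the whole listing
// with a TableExistsException.
Map<String, HTableDescriptor> htds = new TreeMap<String, HTableDescriptor>();
for (FileStatus dir : fs.listStatus(rootdir)) {
  HTableDescriptor htd = get(dir.getPath().getName());
  if (htd == null) {
    LOG.warn("Skipping " + dir.getPath() + "; no .tableinfo found, "
        + "probably not a table directory");
    continue;
  }
  htds.put(htd.getNameAsString(), htd);
}
{code}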
[jira] [Commented] (HBASE-5357) Use builder pattern in HColumnDescriptor
[ https://issues.apache.org/jira/browse/HBASE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215158#comment-13215158 ] Phabricator commented on HBASE-5357: tedyu has accepted the revision [jira] [HBASE-5357] Refactoring: use the builder pattern for HColumnDescriptor. I see 'This diff has Lint Problems.' because of Lint being skipped. REVISION DETAIL https://reviews.facebook.net/D1851 BRANCH hcd_builder2 Use builder pattern in HColumnDescriptor Key: HBASE-5357 URL: https://issues.apache.org/jira/browse/HBASE-5357 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: D1851.1.patch, D1851.2.patch, D1851.3.patch, D1851.4.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-21_19_13_35.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-23_12_42_49.patch, Use-builder-pattern-for-HColumnDescriptor-20120223113155-e387d251.patch We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a builder pattern solution, e.g.
{code:java}
HFileWriter w = HFile.getWriterBuilder(conf, some common args)
    .setParameter1(value1)
    .setParameter2(value2)
    ...
    .build();
{code}
Each parameter setter being on its own line will make merges/cherry-pick work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses the HColumnDescriptor refactoring. For StoreFile/HFile refactoring see HBASE-5442. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
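Applied to HColumnDescriptor itself, usage might look like the following; the builder entry point and setter names here are hypothetical, sketched in the same spirit as the HFile example in the description:
{code:java}
HColumnDescriptor family =
    new HColumnDescriptor.Builder(Bytes.toBytes("cf"))  // hypothetical Builder
        .setMaxVersions(3)
        .setBlockCacheEnabled(true)
        .setCompressionType(Compression.Algorithm.LZO)
        .build();
{code}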
[jira] [Commented] (HBASE-5437) HRegionThriftServer does not start because of a bug in HbaseHandlerMetricsProxy
[ https://issues.apache.org/jira/browse/HBASE-5437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215161#comment-13215161 ] Zhihong Yu commented on HBASE-5437: --- Integrated to TRUNK. Thanks for the patch, Scott. Thanks for the review, Stack. HRegionThriftServer does not start because of a bug in HbaseHandlerMetricsProxy --- Key: HBASE-5437 URL: https://issues.apache.org/jira/browse/HBASE-5437 Project: HBase Issue Type: Bug Components: metrics, thrift Reporter: Scott Chen Assignee: Scott Chen Fix For: 0.94.0 Attachments: HBASE-5437.D1857.1.patch, HBASE-5437.D1887.1.patch, HBASE-5437.D1887.2.patch 3.facebook.com,60020,1329865516120: Initialization of RS failed. Hence aborting RS.
java.lang.ClassCastException: $Proxy9 cannot be cast to org.apache.hadoop.hbase.thrift.generated.Hbase$Iface
at org.apache.hadoop.hbase.thrift.HbaseHandlerMetricsProxy.newInstance(HbaseHandlerMetricsProxy.java:47)
at org.apache.hadoop.hbase.thrift.ThriftServerRunner.init(ThriftServerRunner.java:239)
at org.apache.hadoop.hbase.regionserver.HRegionThriftServer.init(HRegionThriftServer.java:74)
at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeThreads(HRegionServer.java:646)
at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:546)
at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:658)
at java.lang.Thread.run(Thread.java:662)
2012-02-21 15:05:18,749 FATAL org.apache.hadoop.h -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
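The underlying cause of the ClassCastException is a property of java.lang.reflect.Proxy: the generated proxy implements only the interfaces passed to newProxyInstance, so the later cast to Hbase.Iface fails if that interface is not in the list. An illustrative fragment (the invocation-handler argument is assumed, not the actual HbaseHandlerMetricsProxy code):
{code:java}
// The proxy must be created against Hbase.Iface for the cast to succeed;
// a wrong or missing interface list yields exactly the CCE in the log above.
Hbase.Iface proxy = (Hbase.Iface) Proxy.newProxyInstance(
    handler.getClass().getClassLoader(),
    new Class<?>[] { Hbase.Iface.class },
    invocationHandler);
{code}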
[jira] [Commented] (HBASE-5415) FSTableDescriptors should handle random folders in hbase.root.dir better
[ https://issues.apache.org/jira/browse/HBASE-5415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215164#comment-13215164 ] stack commented on HBASE-5415: -- What's the difference between miscellaneous dirs under hbase.rootdir and an actual table directory that is missing its .tableinfo file? Are we changing our API when we remove TEE from public methods? FSTableDescriptors should handle random folders in hbase.root.dir better Key: HBASE-5415 URL: https://issues.apache.org/jira/browse/HBASE-5415 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Priority: Critical Fix For: 0.92.1, 0.94.0 Attachments: HBASE-5415.patch I faked an upgrade on a test cluster using our dev data, so I had to distcp the data between the two clusters, but after starting up and doing the migration and whatnot the web UI didn't show any table. The reason was in the master's log:
{quote}
org.apache.hadoop.hbase.TableExistsException: No descriptor for _distcp_logs_e0ehek
at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:164)
at org.apache.hadoop.hbase.util.FSTableDescriptors.getAll(FSTableDescriptors.java:182)
at org.apache.hadoop.hbase.master.HMaster.getHTableDescriptors(HMaster.java:1554)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1326)
{quote}
I don't think we need to show a full stack trace (just a WARN, maybe); this shouldn't kill the request (we should still see tables in the web UI); and why is that a TableExistsException? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5357) Use builder pattern in HColumnDescriptor
[ https://issues.apache.org/jira/browse/HBASE-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215166#comment-13215166 ] Mikhail Bautin commented on HBASE-5357: --- Re-ran failed unit tests locally:
Running org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 211.925 sec
Running org.apache.hadoop.hbase.mapreduce.TestImportTsv
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 81.352 sec
Running org.apache.hadoop.hbase.mapreduce.TestTableMapReduce
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 105.09 sec
Running org.apache.hadoop.hbase.mapred.TestTableMapReduce
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 68.055 sec
Results : Tests run: 19, Failures: 0, Errors: 0, Skipped: 0
Use builder pattern in HColumnDescriptor Key: HBASE-5357 URL: https://issues.apache.org/jira/browse/HBASE-5357 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: D1851.1.patch, D1851.2.patch, D1851.3.patch, D1851.4.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-21_19_13_35.patch, Use-builder-pattern-for-HColumnDescriptor-2012-02-23_12_42_49.patch, Use-builder-pattern-for-HColumnDescriptor-20120223113155-e387d251.patch We have five ways to create an HFile writer, two ways to create a StoreFile writer, and the sets of parameters keep changing, creating a lot of confusion, especially when porting patches across branches. The same thing is happening to HColumnDescriptor. I think we should move to a builder pattern solution, e.g.
{code:java}
HFileWriter w = HFile.getWriterBuilder(conf, some common args)
    .setParameter1(value1)
    .setParameter2(value2)
    ...
    .build();
{code}
Each parameter setter being on its own line will make merges/cherry-pick work properly, we will not have to even mention default parameters again, and we can eliminate a dozen impossible-to-remember constructors. This particular JIRA addresses the HColumnDescriptor refactoring. For StoreFile/HFile refactoring see HBASE-5442. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-5469) Add baseline compression efficiency to DataBlockEncodingTool
Add baseline compression efficiency to DataBlockEncodingTool Key: HBASE-5469 URL: https://issues.apache.org/jira/browse/HBASE-5469 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Priority: Minor DataBlockEncodingTool currently does not provide baseline compression efficiency, e.g. Hadoop compression codec applied to unencoded data. E.g. if we are using LZO to compress blocks, we would like to have the following columns in the report (possibly as percentages of raw data size). Baseline K+V in blockcache | Baseline K + V on disk (LZO compressed) | K + V DataBlockEncoded in block cache | K + V DataBlockEncoded + LZOCompressed (on disk) Background: we never store compressed blocks in cache, but we always store encoded data blocks in cache if data block encoding is enabled for the column family. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5415) FSTableDescriptors should handle random folders in hbase.root.dir better
[ https://issues.apache.org/jira/browse/HBASE-5415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215175#comment-13215175 ] Jean-Daniel Cryans commented on HBASE-5415: ---
bq. What's the difference between miscellaneous dirs under hbase.rootdir and an actual table directory that is missing its .tableinfo file?
The former's HTD is null; the latter gets an FNFE.
bq. Are we changing our API when we remove TEE from public methods?
Technically no; TEE (and FNFE, FWIW) are both IOEs, so there's no change there. I removed TEE specifically because it isn't thrown anymore.
FSTableDescriptors should handle random folders in hbase.root.dir better Key: HBASE-5415 URL: https://issues.apache.org/jira/browse/HBASE-5415 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Priority: Critical Fix For: 0.92.1, 0.94.0 Attachments: HBASE-5415.patch I faked an upgrade on a test cluster using our dev data, so I had to distcp the data between the two clusters, but after starting up and doing the migration and whatnot the web UI didn't show any table. The reason was in the master's log:
{quote}
org.apache.hadoop.hbase.TableExistsException: No descriptor for _distcp_logs_e0ehek
at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:164)
at org.apache.hadoop.hbase.util.FSTableDescriptors.getAll(FSTableDescriptors.java:182)
at org.apache.hadoop.hbase.master.HMaster.getHTableDescriptors(HMaster.java:1554)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1326)
{quote}
I don't think we need to show a full stack trace (just a WARN, maybe); this shouldn't kill the request (we should still see tables in the web UI); and why is that a TableExistsException? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4365) Add a decent heuristic for region size
[ https://issues.apache.org/jira/browse/HBASE-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215176#comment-13215176 ] stack commented on HBASE-4365: -- @Lars You want to put an upper bound on the number of regions? I think if we do power of three, we'll lose some of the benefit J-D sees above; we'll fan out the regions slower. Do you want to put an upper bound on the number of regions per regionserver for a table? Say, three? As in, when we get to three regions on a server, just scoot the split size up to the maximum. So, given a power of two, we'd split on first flush, then the next split would happen at 2*2*128M = 512M, then at 3*3*128M ≈ 1.1G, and thereafter we'd split at the max, say 10G? Or should we just commit this for now and do more in another patch? Add a decent heuristic for region size -- Key: HBASE-4365 URL: https://issues.apache.org/jira/browse/HBASE-4365 Project: HBase Issue Type: Improvement Affects Versions: 0.92.1, 0.94.0 Reporter: Todd Lipcon Priority: Critical Labels: usability Attachments: 4365-v2.txt, 4365.txt A few of us were brainstorming this morning about what the default region size should be. There were a few general points made:
- in some ways it's better to be too large than too small, since you can always split a table further, but you can't merge regions currently
- with HFile v2 and multithreaded compactions there are fewer reasons to avoid very large regions (10GB+)
- for small tables you may want a small region size just so you can distribute load better across a cluster
- for big tables, multi-GB is probably best
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
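The arithmetic being walked through is a squared-count heuristic: the split size grows with the square of the number of the table's regions on the server, times the flush size, capped at the configured maximum. A minimal sketch under those assumptions (names are illustrative, not the committed code):
{code:java}
// 1 region  -> 1*1*128M = 128M (split roughly on first flush)
// 2 regions -> 2*2*128M = 512M
// 3 regions -> 3*3*128M = 1152M (~1.1G)
// ... thereafter capped at maxFileSize, e.g. 10G.
long splitSizeFor(int regionCount, long flushSize, long maxFileSize) {
  long size = (long) regionCount * regionCount * flushSize;
  return Math.min(size, maxFileSize);
}
{code}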