[jira] [Commented] (HBASE-4336) Convert source tree into maven modules
[ https://issues.apache.org/jira/browse/HBASE-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108389#comment-13108389 ]

Jesse Yates commented on HBASE-4336:
------------------------------------

Did a bunch of the work for this already under HBASE-4438 - posted an initial patch to review board. I wasn't sure which classes we wanted to move to which package, but set up at least the top-level hierarchies and did most of the PITA work of cleaning up the top-level pom.

Up on review board: https://reviews.apache.org/r/1965/

Right now it's a skeleton, so we can easily drop in the code we want, where we want it.

Convert source tree into maven modules
--------------------------------------

            Key: HBASE-4336
            URL: https://issues.apache.org/jira/browse/HBASE-4336
        Project: HBase
     Issue Type: Task
     Components: build
       Reporter: Gary Helmling

When we originally converted the build to maven we had a single core module defined, but later reverted this to a module-less build for the sake of simplicity. It now looks like it's time to re-address this, as we have an actual need for modules to:
* provide a trimmed-down client library that applications can make use of
* more cleanly support building against different versions of Hadoop, in place of some of the reflection machinations currently required
* incorporate the secure RPC engine that depends on some secure Hadoop classes

I propose we start simply by refactoring into two initial modules:
* core - common classes and utilities, and client-side code and interfaces
* server - master and region server implementations and supporting code

This would also lay the groundwork for incorporating the HBase security features that have been developed. Once the module structure is in place, security-related features could then be incorporated into a third module -- security -- after normal review and approval. The security module could then depend on secure Hadoop, without modifying the dependencies of the rest of the HBase code.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
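The two-module split proposed in the issue maps directly onto a Maven aggregator pom. The fragment below is a hypothetical sketch (module names follow the proposal in the issue description, not necessarily the patch on review board) of what such a parent pom could look like:

```xml
<!-- Hypothetical sketch of the proposed parent pom; module names follow
     the issue description (core/server, security added later), and are
     not taken from the actual HBASE-4438 patch. -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase</artifactId>
  <version>0.92.0-SNAPSHOT</version>
  <packaging>pom</packaging>
  <modules>
    <module>hbase-core</module>   <!-- common classes, client-side code -->
    <module>hbase-server</module> <!-- master and region server -->
    <!-- <module>hbase-security</module> could be added after review,
         depending on secure Hadoop without touching the other modules -->
  </modules>
</project>
```

With this layout a client application would depend only on the core artifact, while the security module alone would carry the secure-Hadoop dependency.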
[jira] [Commented] (HBASE-4446) Rolling restart RSs scenario, regions could stay in OPENING state
[ https://issues.apache.org/jira/browse/HBASE-4446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108390#comment-13108390 ]

Todd Lipcon commented on HBASE-4446:
------------------------------------

Waiting on ServerShutdownHandler may make sense for some states - e.g. if the region was CLOSING, we need to make sure that we split logs before we reassign. But I agree that for many other states (OPENING, FAILED_OPEN, CLOSED) we can handle the region regardless of whether the RS is online or not.

Rolling restart RSs scenario, regions could stay in OPENING state
-----------------------------------------------------------------

            Key: HBASE-4446
            URL: https://issues.apache.org/jira/browse/HBASE-4446
        Project: HBase
     Issue Type: Bug
     Components: master
       Reporter: Ming Ma
       Assignee: Ming Ma
        Fix For: 0.92.0
    Attachments: HBASE-4446-trunk.patch

Keep the Master up all the time and do a rolling restart of RSs like this: stop RS1, wait 2 seconds, stop RS2, start RS1, wait 2 seconds, stop RS3, start RS2, wait 2 seconds, etc. A region can sometimes just stay in OPENING state even after the timeoutmonitor period:

{code}
2011-09-19 08:10:33,131 WARN org.apache.hadoop.hbase.master.AssignmentManager: While timing out a region in state OPENING, found ZK node in unexpected state: RS_ZK_REGION_FAILED_OPEN
{code}

The issue: the RS was shut down while a region was being opened, so the region was transitioned to RS_ZK_REGION_FAILED_OPEN in ZK, and the timeoutmonitor doesn't take care of RS_ZK_REGION_FAILED_OPEN. In processOpeningState:

{code}
...
} else if (dataInZNode.getEventType() != EventType.RS_ZK_REGION_OPENING) {
  LOG.warn("While timing out a region in state OPENING, "
      + "found ZK node in unexpected state: " + dataInZNode.getEventType());
  return;
}
{code}
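The direction of the fix can be sketched as follows. This is a hypothetical, simplified model (the enum and method are stand-ins, not the actual AssignmentManager code): instead of warning and returning for any non-OPENING state, the timeout monitor treats RS_ZK_REGION_FAILED_OPEN as a trigger to reassign.

```java
// Simplified sketch, not the real AssignmentManager: decide what the timeout
// monitor should do for a region that has been stuck in OPENING.
public class OpeningTimeoutSketch {
  enum EventType { RS_ZK_REGION_OPENING, RS_ZK_REGION_FAILED_OPEN, RS_ZK_REGION_OPENED }

  /** Action the timeout monitor should take given the region's ZK node state. */
  static String processOpeningState(EventType znodeState) {
    switch (znodeState) {
      case RS_ZK_REGION_OPENING:
        return "retry-assign";   // normal timeout handling: kick the assignment again
      case RS_ZK_REGION_FAILED_OPEN:
        return "reassign";       // the missing case: open failed, pick a new RS
      default:
        return "ignore";         // e.g. already OPENED; nothing to do
    }
  }

  public static void main(String[] args) {
    System.out.println(processOpeningState(EventType.RS_ZK_REGION_FAILED_OPEN));
  }
}
```

As Todd notes above, CLOSING would still need to wait for log splitting before any reassignment, so a real patch has to branch per state rather than treating all of them uniformly.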
[jira] [Commented] (HBASE-4446) Rolling restart RSs scenario, regions could stay in OPENING state
[ https://issues.apache.org/jira/browse/HBASE-4446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108396#comment-13108396 ]

ramkrishna.s.vasudevan commented on HBASE-4446:
-----------------------------------------------

+1. Nice analysis. We need to dig in more to find whether any other corner scenarios like this come up.
[jira] [Updated] (HBASE-4327) Compile HBase against hadoop 0.22
[ https://issues.apache.org/jira/browse/HBASE-4327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joep Rottinghuis updated HBASE-4327:
------------------------------------

    Assignee: Joep Rottinghuis

Compile HBase against hadoop 0.22
---------------------------------

            Key: HBASE-4327
            URL: https://issues.apache.org/jira/browse/HBASE-4327
        Project: HBase
     Issue Type: Bug
     Components: build
Affects Versions: 0.92.0
       Reporter: Joep Rottinghuis
       Assignee: Joep Rottinghuis
        Fix For: 0.92.0
    Attachments: HBASE-4327-Michael.patch, HBASE-4327.patch, HBASE-4327.patch, HBASE-4327.patch

The pom contains a profile for hadoop-0.20 and one for hadoop-0.23, but not one for hadoop-0.22. When overriding hadoop.version to 0.22, the (compile-time) dependency on hadoop-annotations cannot be met. That artifact exists in 0.23 and 0.24/trunk, but not in 0.22.
[jira] [Commented] (HBASE-4447) Allow hbase.version to be passed in as command-line argument
[ https://issues.apache.org/jira/browse/HBASE-4447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108439#comment-13108439 ]

Hudson commented on HBASE-4447:
-------------------------------

Integrated in HBase-TRUNK #2238 (See [https://builds.apache.org/job/HBase-TRUNK/2238/])
HBASE-4447 Allow hbase.version to be passed in as command-line argument

stack :
Files :
* /hbase/trunk/pom.xml

Allow hbase.version to be passed in as command-line argument
------------------------------------------------------------

            Key: HBASE-4447
            URL: https://issues.apache.org/jira/browse/HBASE-4447
        Project: HBase
     Issue Type: Improvement
     Components: build
Affects Versions: 0.92.0
       Reporter: Joep Rottinghuis
       Assignee: Joep Rottinghuis
        Fix For: 0.92.0
    Attachments: HBASE-4447-0.92.patch

Currently the build always produces the jars and tarball according to the version baked into the POM. If we modify this to allow the version to be passed in as a command-line argument, it can still default to the same behavior, yet give an internal build the flexibility to tack on its own version.
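One common way to get this behavior is to route the pom's version through a property with a default, so `-Dhbase.version=...` on the command line overrides it. This is a hypothetical sketch of the mechanism, not necessarily what the committed patch does:

```xml
<!-- Hypothetical sketch: version comes from a property that defaults to
     the release version but can be overridden on the mvn command line. -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase</artifactId>
  <version>${hbase.version}</version>
  <properties>
    <hbase.version>0.92.0</hbase.version>
  </properties>
</project>
```

An internal build could then run something like `mvn -Dhbase.version=0.92.0-internal-1 install` to stamp its own version, while a plain `mvn install` keeps the default. (Note that Maven warns about property-expanded `<version>` elements, which is one reason a patch might choose a different mechanism.)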
[jira] [Commented] (HBASE-4447) Allow hbase.version to be passed in as command-line argument
[ https://issues.apache.org/jira/browse/HBASE-4447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108495#comment-13108495 ]

Hudson commented on HBASE-4447:
-------------------------------

Integrated in HBase-0.92 #5 (See [https://builds.apache.org/job/HBase-0.92/5/])
HBASE-4447 Allow hbase.version to be passed in as command-line argument

stack :
Files :
* /hbase/branches/0.92/pom.xml
[jira] [Created] (HBASE-4448) HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility instances across unit tests
HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility instances across unit tests
-------------------------------------------------------------------------------------------------

            Key: HBASE-4448
            URL: https://issues.apache.org/jira/browse/HBASE-4448
        Project: HBase
     Issue Type: Improvement
       Reporter: Doug Meil
       Assignee: Doug Meil
       Priority: Minor

Setting up and tearing down HBaseTestingUtility instances in unit tests is very expensive. On my MacBook it takes about 10 seconds to set up a MiniCluster, and 7 seconds to tear it down. When multiplied by the number of test classes that use this facility, that's a lot of time in the build.

This factory assumes that the JVM is being re-used across test classes in the build; otherwise this pattern won't work. I don't think this is appropriate for every use, but I think it can be applicable in a great many cases - especially where developers just want a simple MiniCluster with 1 slave.
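The re-use pattern described above is essentially a ref-counted cache of shared cluster instances. The sketch below is hypothetical (the attached HBaseTestingUtilityFactory.java is the real prototype); `ClusterHandle` is a stand-in for HBaseTestingUtility so the idea is self-contained:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the ref-counting idea: hand out one shared cluster
// per configuration key, and only tear it down when the last test class
// using it releases its reference.
public class SharedClusterFactory {
  /** Stand-in for HBaseTestingUtility and its mini-cluster lifecycle. */
  public static class ClusterHandle {
    public boolean running = true;
    void shutdown() { running = false; }  // stands in for shutdownMiniCluster()
  }

  private final Map<String, ClusterHandle> clusters = new HashMap<>();
  private final Map<String, Integer> refCounts = new HashMap<>();

  /** Get (or lazily start) the shared cluster for a config key. */
  public synchronized ClusterHandle acquire(String key) {
    refCounts.merge(key, 1, Integer::sum);
    return clusters.computeIfAbsent(key, k -> new ClusterHandle());
  }

  /** Release one reference; tear the cluster down when the count hits zero. */
  public synchronized void release(String key) {
    int refs = refCounts.merge(key, -1, Integer::sum);
    if (refs <= 0) {
      refCounts.remove(key);
      ClusterHandle c = clusters.remove(key);
      if (c != null) c.shutdown();
    }
  }
}
```

A test class would call `acquire` in its @BeforeClass and `release` in @AfterClass; two test classes asking for the same key within one JVM get the same instance and skip the ~10-second setup.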
[jira] [Commented] (HBASE-4352) Apply version of hbase-4015 to branch
[ https://issues.apache.org/jira/browse/HBASE-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108629#comment-13108629 ]

Ted Yu commented on HBASE-4352:
-------------------------------

TestZKBasedOpenCloseRegion hangs with the patch applied.

Apply version of hbase-4015 to branch
-------------------------------------

            Key: HBASE-4352
            URL: https://issues.apache.org/jira/browse/HBASE-4352
        Project: HBase
     Issue Type: Bug
       Reporter: stack
       Assignee: ramkrishna.s.vasudevan
        Fix For: 0.90.5
    Attachments: HBASE-4352_0.90.patch

Consider adding a version of hbase-4015 to 0.90. It changes HRegionInterface, so we would need to move the change to the end of the interface and then test that it doesn't break rolling restart.
[jira] [Commented] (HBASE-4153) Handle RegionAlreadyInTransitionException in AssignmentManager
[ https://issues.apache.org/jira/browse/HBASE-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108631#comment-13108631 ]

Ted Yu commented on HBASE-4153:
-------------------------------

Test suite had a few failures:
{code}
Failed tests:
  testBasicRollingRestart(org.apache.hadoop.hbase.master.TestRollingRestart): expected:<22> but was:<6>

Tests in error:
  testRSAlreadyProcessingRegion(org.apache.hadoop.hbase.master.TestZKBasedOpenCloseRegion)
  testFailedOpenRegion(org.apache.hadoop.hbase.regionserver.handler.TestOpenRegionHandler)
  testFailedUpdateMeta(org.apache.hadoop.hbase.regionserver.handler.TestOpenRegionHandler)
{code}

Handle RegionAlreadyInTransitionException in AssignmentManager
--------------------------------------------------------------

            Key: HBASE-4153
            URL: https://issues.apache.org/jira/browse/HBASE-4153
        Project: HBase
     Issue Type: Bug
Affects Versions: 0.92.0
       Reporter: Jean-Daniel Cryans
       Assignee: ramkrishna.s.vasudevan
        Fix For: 0.92.0
    Attachments: 4153-v3.txt, HBASE-4153_1.patch, HBASE-4153_2.patch, HBASE-4153_3.patch, HBASE-4153_4.patch, HBASE-4153_5.patch

Comment from Stack over in HBASE-3741:
{quote}
Question: Looking at this patch again, if we throw a RegionAlreadyInTransitionException, won't we just assign the region elsewhere, though RegionAlreadyInTransitionException in at least one case here is saying that the region is already open on this regionserver?
{quote}
Indeed, looking at the code it's going to be handled the same way other exceptions are. Need to add special cases for assign and unassign.
[jira] [Commented] (HBASE-4153) Handle RegionAlreadyInTransitionException in AssignmentManager
[ https://issues.apache.org/jira/browse/HBASE-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108659#comment-13108659 ]

ramkrishna.s.vasudevan commented on HBASE-4153:
-----------------------------------------------

I will check once again and will let you know in some time.
[jira] [Commented] (HBASE-4153) Handle RegionAlreadyInTransitionException in AssignmentManager
[ https://issues.apache.org/jira/browse/HBASE-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108708#comment-13108708 ]

ramkrishna.s.vasudevan commented on HBASE-4153:
-----------------------------------------------

The TestOpenRegionHandler change in patch HBASE-4153_3.patch was not applied in the latest patch, but I had the changed code in my code base, hence those test cases passed. For the other two test cases I don't find any errors. Correct me if I am wrong, Ted. Thanks for your findings :)

{code}
Running org.apache.hadoop.hbase.master.TestRollingRestart
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 95.718 sec
Running org.apache.hadoop.hbase.master.TestZKBasedOpenCloseRegion
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 38.593 sec
Running org.apache.hadoop.hbase.regionserver.handler.TestOpenRegionHandler
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.825 sec
{code}
[jira] [Updated] (HBASE-4448) HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility instances across unit tests
[ https://issues.apache.org/jira/browse/HBASE-4448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Meil updated HBASE-4448:
-----------------------------

    Attachment: HBaseTestingUtilityFactory.java
[jira] [Commented] (HBASE-4448) HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility instances across unit tests
[ https://issues.apache.org/jira/browse/HBASE-4448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108735#comment-13108735 ]

Doug Meil commented on HBASE-4448:
----------------------------------

Attached a prototype of HBaseTestingUtilityFactory. This is not ready for prime time yet, but I'd like to solicit comments on the general idea.

Noted issues: there needs to be a configurable wait period for when the ref-counts get to zero. That should be set from the build, but how? A system property? The reason is that while this pattern can be useful in many cases, it won't be suitable for all, so there could be periods of non-use while another test is running, and we don't want to be too aggressive in tearing down the instances, otherwise we'll be back where we started.
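For the "how do we set the wait period from the build" question above, a system property is straightforward to sketch. The property name here is hypothetical (invented for illustration, not an actual HBase property):

```java
// Sketch: read a linger period from a system property so the build can tune
// how long an idle mini-cluster stays up after its ref-count drops to zero.
// "hbase.testing.utility.linger.ms" is a hypothetical property name.
public class TeardownDelay {
  static final long DEFAULT_LINGER_MS = 30_000L;

  /** Milliseconds to keep an unreferenced cluster alive before teardown. */
  static long lingerMillis() {
    return Long.getLong("hbase.testing.utility.linger.ms", DEFAULT_LINGER_MS);
  }

  public static void main(String[] args) {
    System.out.println(lingerMillis());
  }
}
```

The build would pass it through surefire, e.g. `mvn -Dhbase.testing.utility.linger.ms=0 test` for builds that don't want any linger at all.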
[jira] [Updated] (HBASE-4153) Handle RegionAlreadyInTransitionException in AssignmentManager
[ https://issues.apache.org/jira/browse/HBASE-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-4153:
------------------------------------------

    Attachment: HBASE-4153_6.patch

Did one small change. The return type of getRegionsInTransitionInRS() in RegionServerServices has been changed to Map instead of ConcurrentSkipListMap, because it is good practice to return the supertype in interfaces. This avoids the change in TestOpenRegionHandler.
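The return-type change described above follows a standard pattern: the interface exposes the supertype (Map) so callers don't couple to the concrete collection, while the implementation stays free to use ConcurrentSkipListMap internally. A minimal sketch with simplified stand-in names (the real interface is RegionServerServices with byte[] keys):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

// Sketch of returning the supertype from an interface; names and key types
// are simplified stand-ins for RegionServerServices/getRegionsInTransitionInRS.
public class ReturnSupertypeSketch {
  public interface RegionServices {
    // Callers see only Map, so the implementation can change its concrete
    // collection (or tests can supply a plain HashMap) without API breakage.
    Map<String, Boolean> getRegionsInTransition();
  }

  public static class RegionServer implements RegionServices {
    private final ConcurrentSkipListMap<String, Boolean> regionsInTransition =
        new ConcurrentSkipListMap<>();

    @Override
    public Map<String, Boolean> getRegionsInTransition() {
      return regionsInTransition;
    }
  }
}
```

This is why the TestOpenRegionHandler change became unnecessary: a mock service no longer has to produce a ConcurrentSkipListMap to satisfy the interface.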
[jira] [Updated] (HBASE-4153) Handle RegionAlreadyInTransitionException in AssignmentManager
[ https://issues.apache.org/jira/browse/HBASE-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-4153:
------------------------------------------

    Status: Open  (was: Patch Available)
[jira] [Commented] (HBASE-4387) Error while syncing: DFSOutputStream is closed
[ https://issues.apache.org/jira/browse/HBASE-4387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108818#comment-13108818 ]

Lars Hofhansl commented on HBASE-4387:
--------------------------------------

@Todd Do you think you could retry your test with this change? (Maybe 100m rows would do too :) )

Error while syncing: DFSOutputStream is closed
----------------------------------------------

            Key: HBASE-4387
            URL: https://issues.apache.org/jira/browse/HBASE-4387
        Project: HBase
     Issue Type: Bug
     Components: wal
Affects Versions: 0.92.0
       Reporter: Todd Lipcon
       Priority: Critical
        Fix For: 0.92.0
    Attachments: 4387.txt, errors-with-context.txt

In a billion-row load on ~25 servers, I see "error while syncing" reasonably often, with the error "DFSOutputStream is closed" around a roll. We have some race where a roll at the same time as heavy inserts causes a problem.
[jira] [Updated] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME
[ https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nate Putnam updated HBASE-3421:
-------------------------------

    Status: Patch Available  (was: Open)

Added the hbase.hstore.compaction.kv.max config option.

Very wide rows -- 30M plus -- cause us OOME
-------------------------------------------

            Key: HBASE-3421
            URL: https://issues.apache.org/jira/browse/HBASE-3421
        Project: HBase
     Issue Type: Bug
Affects Versions: 0.90.0
       Reporter: stack
    Attachments: HBASE-3421.patch, HBASE-34211-v2.patch

From the list (see 'jvm oom' in http://mail-archives.apache.org/mod_mbox/hbase-user/201101.mbox/browser), it looks like wide rows -- 30M or so -- cause OOME during compaction. We should check it out. Can the scanner used during compactions use the 'limit' when nexting? If so, this should save our OOME'ing (or we need to add to next() a max size rather than a count of KVs).
[jira] [Updated] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME
[ https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nate Putnam updated HBASE-3421:
-------------------------------

    Attachment: HBASE-34211-v2.patch
[jira] [Commented] (HBASE-4437) Update hadoop in 0.92 (0.20.205?)
[ https://issues.apache.org/jira/browse/HBASE-4437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108837#comment-13108837 ]

Jean-Daniel Cryans commented on HBASE-4437:
-------------------------------------------

It would make sense to work on 205; it can't be that far off from CDH3 anyway once it gets synced.

Update hadoop in 0.92 (0.20.205?)
---------------------------------

            Key: HBASE-4437
            URL: https://issues.apache.org/jira/browse/HBASE-4437
        Project: HBase
     Issue Type: Task
       Reporter: stack

We ship with a branch-0.20-append a few versions back from the tip. If 205 comes out and hbase works on it, we should ship 0.92 with it (while also ensuring it works w/ the 0.22 and 0.23 branches).
[jira] [Commented] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME
[ https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108844#comment-13108844 ]

Ted Yu commented on HBASE-3421:
-------------------------------

Patch v2 looks good. Minor comment: the hard-coded 10 below should be replaced with the new config:
{code}
+    // Limit this to 10 to avoid OOME
{code}
The patch doesn't apply on TRUNK. Please prepare another patch for 0.92/TRUNK. Thanks.
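The review comment above amounts to: read the per-next() KV batch limit from configuration instead of hard-coding 10. A minimal sketch, with java.util.Properties standing in for org.apache.hadoop.conf.Configuration so it stays self-contained (the key name matches the config option named earlier in this thread; the helper itself is hypothetical):

```java
import java.util.Properties;

// Sketch: pull the compaction KV batch limit from configuration, defaulting
// to the previously hard-coded 10. Properties is a stand-in for Hadoop's
// Configuration; the helper method is illustrative, not the actual patch.
public class CompactionKvMax {
  static final String COMPACTION_KV_MAX = "hbase.hstore.compaction.kv.max";
  static final int DEFAULT_COMPACTION_KV_MAX = 10;

  /**
   * Max KVs fetched per scanner.next() during compaction, so a single very
   * wide row (30M+ cells) is processed in bounded-size batches instead of
   * being materialized at once and OOME'ing the region server.
   */
  static int compactionKvMax(Properties conf) {
    return Integer.parseInt(
        conf.getProperty(COMPACTION_KV_MAX, String.valueOf(DEFAULT_COMPACTION_KV_MAX)));
  }
}
```

Operators with wide-row tables can then tune the batch size in hbase-site.xml rather than recompiling.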
[jira] [Commented] (HBASE-4437) Update hadoop in 0.92 (0.20.205?)
[ https://issues.apache.org/jira/browse/HBASE-4437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108845#comment-13108845 ]

Gary Helmling commented on HBASE-4437:
--------------------------------------

Agree with targeting 205 as well. Since it includes security, that would also mean the security build wouldn't need to override the Hadoop version. The Hadoop security code in 205 does have some changes vs. what's in the current CDH3, but that won't make a difference for current HBase.
[jira] [Updated] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME
[ https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nate Putnam updated HBASE-3421:
-------------------------------

    Attachment: (was: HBASE-34211-v3.patch)
[jira] [Updated] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME
[ https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nate Putnam updated HBASE-3421:
-------------------------------

    Attachment: HBASE-34211-v3.patch

Clarify code comment. Grant a license this time :)
[jira] [Updated] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME
[ https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nate Putnam updated HBASE-3421:
-------------------------------

    Attachment: HBASE-34211-v3.patch

Clarify code comment.
[jira] [Commented] (HBASE-4153) Handle RegionAlreadyInTransitionException in AssignmentManager
[ https://issues.apache.org/jira/browse/HBASE-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13108853#comment-13108853 ] Ted Yu commented on HBASE-4153: --- I ran test suite on 0.92 and saw two failures: {code} Failed tests: testRowRange(org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol): Results should contain region test,ccc,1316534680968.20178584e985d7c9300aa37d3fa249b9. for row 'ccc' Tests in error: testRSAlreadyProcessingRegion(org.apache.hadoop.hbase.master.TestZKBasedOpenCloseRegion) {code} Both of them passed when run standalone. +1 on patch v6. Handle RegionAlreadyInTransitionException in AssignmentManager -- Key: HBASE-4153 URL: https://issues.apache.org/jira/browse/HBASE-4153 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Assignee: ramkrishna.s.vasudevan Fix For: 0.92.0 Attachments: 4153-v3.txt, HBASE-4153_1.patch, HBASE-4153_2.patch, HBASE-4153_3.patch, HBASE-4153_4.patch, HBASE-4153_5.patch, HBASE-4153_6.patch Comment from Stack over in HBASE-3741: {quote} Question: Looking at this patch again, if we throw a RegionAlreadyInTransitionException, won't we just assign the region elsewhere though RegionAlreadyInTransitionException in at least one case here is saying that the region is already open on this regionserver? {quote} Indeed looking at the code it's going to be handled the same way other exceptions are. Need to add special cases for assign and unassign. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
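The special-casing that Stack's quote calls for can be sketched as follows. The class and method names are illustrative stand-ins, not the actual AssignmentManager code; the point is only that RegionAlreadyInTransitionException must take a different branch than a generic open failure:

```java
public class AssignRetrySketch {
    /** Stand-in for the HBase exception discussed above. */
    static class RegionAlreadyInTransitionException extends Exception {}

    /** Minimal stand-in for a region server's open call. */
    interface RegionServer {
        void openRegion(String region) throws Exception;
    }

    /**
     * Sketch of the assign path: a generic failure triggers reassignment
     * elsewhere, but RegionAlreadyInTransitionException means this server is
     * already opening (or already hosts) the region, so the right move is to
     * wait on the same server rather than assign the region to another one.
     */
    static String assign(RegionServer rs, String region) {
        try {
            rs.openRegion(region);
            return "OPENED";
        } catch (RegionAlreadyInTransitionException e) {
            return "WAIT_ON_SAME_SERVER";   // special case: do not reassign elsewhere
        } catch (Exception e) {
            return "REASSIGN_ELSEWHERE";    // generic failure: pick a new server
        }
    }
}
```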
[jira] [Commented] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME
[ https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13108854#comment-13108854 ] Ted Yu commented on HBASE-3421: --- Patch v3 included changes to HBaseConfiguration.java If I commit v3, Todd would kill me :-) Very wide rows -- 30M plus -- cause us OOME --- Key: HBASE-3421 URL: https://issues.apache.org/jira/browse/HBASE-3421 Project: HBase Issue Type: Bug Affects Versions: 0.90.0 Reporter: stack Attachments: HBASE-3421.patch, HBASE-34211-v2.patch, HBASE-34211-v3.patch From the list, see 'jvm oom' in http://mail-archives.apache.org/mod_mbox/hbase-user/201101.mbox/browser, it looks like wide rows -- 30M or so -- causes OOME during compaction. We should check it out. Can the scanner used during compactions use the 'limit' when nexting? If so, this should save our OOME'ing (or, we need to add to the next a max size rather than count of KVs). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME
[ https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13108856#comment-13108856 ] Nate Putnam commented on HBASE-3421: Sorry about that. My mistake, I should be more careful when creating patches. v4 is the winner. Thanks for your patience. Very wide rows -- 30M plus -- cause us OOME --- Key: HBASE-3421 URL: https://issues.apache.org/jira/browse/HBASE-3421 Project: HBase Issue Type: Bug Affects Versions: 0.90.0 Reporter: stack Attachments: HBASE-3421.patch, HBASE-34211-v2.patch, HBASE-34211-v3.patch From the list, see 'jvm oom' in http://mail-archives.apache.org/mod_mbox/hbase-user/201101.mbox/browser, it looks like wide rows -- 30M or so -- causes OOME during compaction. We should check it out. Can the scanner used during compactions use the 'limit' when nexting? If so, this should save our OOME'ing (or, we need to add to the next a max size rather than count of KVs). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME
[ https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nate Putnam updated HBASE-3421: --- Attachment: HBASE-34211-v4.patch Very wide rows -- 30M plus -- cause us OOME --- Key: HBASE-3421 URL: https://issues.apache.org/jira/browse/HBASE-3421 Project: HBase Issue Type: Bug Affects Versions: 0.90.0 Reporter: stack Attachments: HBASE-3421.patch, HBASE-34211-v2.patch, HBASE-34211-v3.patch, HBASE-34211-v4.patch From the list, see 'jvm oom' in http://mail-archives.apache.org/mod_mbox/hbase-user/201101.mbox/browser, it looks like wide rows -- 30M or so -- causes OOME during compaction. We should check it out. Can the scanner used during compactions use the 'limit' when nexting? If so, this should save our OOME'ing (or, we need to add to the next a max size rather than count of KVs). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME
[ https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13108864#comment-13108864 ] Ted Yu commented on HBASE-3421: --- +1 on patch v4. Very wide rows -- 30M plus -- cause us OOME --- Key: HBASE-3421 URL: https://issues.apache.org/jira/browse/HBASE-3421 Project: HBase Issue Type: Bug Affects Versions: 0.90.0 Reporter: stack Attachments: HBASE-3421.patch, HBASE-34211-v2.patch, HBASE-34211-v3.patch, HBASE-34211-v4.patch From the list, see 'jvm oom' in http://mail-archives.apache.org/mod_mbox/hbase-user/201101.mbox/browser, it looks like wide rows -- 30M or so -- causes OOME during compaction. We should check it out. Can the scanner used during compactions use the 'limit' when nexting? If so, this should save our OOME'ing (or, we need to add to the next a max size rather than count of KVs). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME
[ https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13108872#comment-13108872 ] stack commented on HBASE-3421: -- +1 on patch. Very wide rows -- 30M plus -- cause us OOME --- Key: HBASE-3421 URL: https://issues.apache.org/jira/browse/HBASE-3421 Project: HBase Issue Type: Bug Affects Versions: 0.90.0 Reporter: stack Attachments: HBASE-3421.patch, HBASE-34211-v2.patch, HBASE-34211-v3.patch, HBASE-34211-v4.patch From the list, see 'jvm oom' in http://mail-archives.apache.org/mod_mbox/hbase-user/201101.mbox/browser, it looks like wide rows -- 30M or so -- causes OOME during compaction. We should check it out. Can the scanner used during compactions use the 'limit' when nexting? If so, this should save our OOME'ing (or, we need to add to the next a max size rather than count of KVs). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4448) HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility instances across unit tests
[ https://issues.apache.org/jira/browse/HBASE-4448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13108895#comment-13108895 ] Doug Meil commented on HBASE-4448: -- The timeout was intended to be measured from when the usage count went to zero, so I think we're generally talking about the same idea. How do we pass this variable from the build? A system property? HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility instances across unit tests - Key: HBASE-4448 URL: https://issues.apache.org/jira/browse/HBASE-4448 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: HBaseTestingUtilityFactory.java Setting up and tearing down HBaseTestingUtility instances in unit tests is very expensive. On my MacBook it takes about 10 seconds to set up a MiniCluster, and 7 seconds to tear it down. When multiplied by the number of test classes that use this facility, that's a lot of time in the build. This factory assumes that the JVM is being re-used across test classes in the build, otherwise this pattern won't work. I don't think this is appropriate for every use, but I think it can be applicable in a great many cases - especially where developers just want a simple MiniCluster with 1 slave. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4448) HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility instances across unit tests
[ https://issues.apache.org/jira/browse/HBASE-4448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13108908#comment-13108908 ] Jesse Yates commented on HBASE-4448: I was more worried about people not cleaning up their tests properly and leaving the cluster hanging around. But I guess we can just assume they do it right? We could make it a system property (maybe settable via Maven at run time) or do it with a special test-config.xml. HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility instances across unit tests - Key: HBASE-4448 URL: https://issues.apache.org/jira/browse/HBASE-4448 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: HBaseTestingUtilityFactory.java Setting up and tearing down HBaseTestingUtility instances in unit tests is very expensive. On my MacBook it takes about 10 seconds to set up a MiniCluster, and 7 seconds to tear it down. When multiplied by the number of test classes that use this facility, that's a lot of time in the build. This factory assumes that the JVM is being re-used across test classes in the build, otherwise this pattern won't work. I don't think this is appropriate for every use, but I think it can be applicable in a great many cases - especially where developers just want a simple MiniCluster with 1 slave. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4352) Apply version of hbase-4015 to branch
[ https://issues.apache.org/jira/browse/HBASE-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13108906#comment-13108906 ] ramkrishna.s.vasudevan commented on HBASE-4352: --- Testing was done today. Out of 4000 regions, 5 had inconsistencies reported by HBCK. Trying to figure out the reason, but it may not be due to the timeoutmonitor changes. Of the 5, one is a double assignment and 4 are cases where the RS hosting them is actually different from the one recorded in META. Tomorrow I will dig deeper and find out whether the timeoutmonitor changes were the root cause or some existing flow is causing this inconsistency. But no regions are stuck in RIT, which is assured. :) Apply version of hbase-4015 to branch - Key: HBASE-4352 URL: https://issues.apache.org/jira/browse/HBASE-4352 Project: HBase Issue Type: Bug Reporter: stack Assignee: ramkrishna.s.vasudevan Fix For: 0.90.5 Attachments: HBASE-4352_0.90.patch Consider adding a version of hbase-4015 to 0.90. It changes HRegionInterface, so we would need to move the change to the end of the interface and then test that it doesn't break rolling restart. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-4449) LoadIncrementalHFiles can't handle CFs with blooms
LoadIncrementalHFiles can't handle CFs with blooms -- Key: HBASE-4449 URL: https://issues.apache.org/jira/browse/HBASE-4449 Project: HBase Issue Type: Bug Affects Versions: 0.90.4 Reporter: David Revell When LoadIncrementalHFiles loads a store file that crosses region boundaries, it will split the file at the boundary to create two store files. If the store file is for a column family that has a bloom filter, then a java.lang.ArithmeticException: / by zero will be raised because ByteBloomFilter() is called with maxKeys of 0. The included patch assumes that the number of keys in each split child will be equal to the number of keys in the parent's bloom filter (instead of 0). This is an overestimate, but it's safe and easy. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
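The failure mode and the fix described above can be reduced to a small sketch: sizing a bloom filter divides by the expected key count, so a split child reported as having 0 keys crashes, and the patch's remedy is to fall back to the parent's key count. These method names are illustrative, not the real `ByteBloomFilter` API:

```java
public class BloomSizingSketch {
    /**
     * Simplified bloom sizing in the spirit of ByteBloomFilter: derive the
     * per-key bit budget from a total bit allocation. With maxKeys == 0 the
     * division throws ArithmeticException - the "/ by zero" crash described
     * in the issue.
     */
    static long bitsPerKey(long totalBits, long maxKeys) {
        return totalBits / maxKeys;   // ArithmeticException when maxKeys == 0
    }

    /**
     * The patch's approach, as described: when sizing a split child's bloom,
     * use the parent file's key count instead of 0. Overestimating only
     * wastes a little space; it can never divide by zero.
     */
    static long safeMaxKeys(long childKeyEstimate, long parentKeys) {
        return childKeyEstimate > 0 ? childKeyEstimate : parentKeys;
    }
}
```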
[jira] [Updated] (HBASE-4449) LoadIncrementalHFiles can't handle CFs with blooms
[ https://issues.apache.org/jira/browse/HBASE-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Revell updated HBASE-4449: Attachment: HBASE-4449.patch LoadIncrementalHFiles can't handle CFs with blooms -- Key: HBASE-4449 URL: https://issues.apache.org/jira/browse/HBASE-4449 Project: HBase Issue Type: Bug Affects Versions: 0.90.4 Reporter: David Revell Attachments: HBASE-4449.patch When LoadIncrementalHFiles loads a store file that crosses region boundaries, it will split the file at the boundary to create two store files. If the store file is for a column family that has a bloom filter, then a java.lang.ArithmeticException: / by zero will be raised because ByteBloomFilter() is called with maxKeys of 0. The included patch assumes that the number of keys in each split child will be equal to the number of keys in the parent's bloom filter (instead of 0). This is an overestimate, but it's safe and easy. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME
[ https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13108979#comment-13108979 ] Ted Yu commented on HBASE-3421: --- Integrate to 0.90.5, 0.92 and TRUNK. Thanks for the patch Nate. Thanks for the review Michael. Very wide rows -- 30M plus -- cause us OOME --- Key: HBASE-3421 URL: https://issues.apache.org/jira/browse/HBASE-3421 Project: HBase Issue Type: Bug Affects Versions: 0.90.0 Reporter: stack Assignee: Nate Putnam Fix For: 0.90.5 Attachments: HBASE-3421.patch, HBASE-34211-v2.patch, HBASE-34211-v3.patch, HBASE-34211-v4.patch From the list, see 'jvm oom' in http://mail-archives.apache.org/mod_mbox/hbase-user/201101.mbox/browser, it looks like wide rows -- 30M or so -- causes OOME during compaction. We should check it out. Can the scanner used during compactions use the 'limit' when nexting? If so, this should save our OOME'ing (or, we need to add to the next a max size rather than count of KVs). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4449) LoadIncrementalHFiles can't handle CFs with blooms
[ https://issues.apache.org/jira/browse/HBASE-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Revell updated HBASE-4449: Attachment: HBASE-4449-v2.patch Patch v2 fixes test failures and adds new test cases. LoadIncrementalHFiles can't handle CFs with blooms -- Key: HBASE-4449 URL: https://issues.apache.org/jira/browse/HBASE-4449 Project: HBase Issue Type: Bug Affects Versions: 0.90.4 Reporter: David Revell Attachments: HBASE-4449-v2.patch, HBASE-4449.patch When LoadIncrementalHFiles loads a store file that crosses region boundaries, it will split the file at the boundary to create two store files. If the store file is for a column family that has a bloom filter, then a java.lang.ArithmeticException: / by zero will be raised because ByteBloomFilter() is called with maxKeys of 0. The included patch assumes that the number of keys in each split child will be equal to the number of keys in the parent's bloom filter (instead of 0). This is an overestimate, but it's safe and easy. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4449) LoadIncrementalHFiles can't handle CFs with blooms
[ https://issues.apache.org/jira/browse/HBASE-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13108987#comment-13108987 ] Ted Yu commented on HBASE-4449: --- @David: I tried to run the new tests without the change to LoadIncrementalHFiles and they passed. Are you able to refine the new tests so that they fail for the current codebase ? Thanks LoadIncrementalHFiles can't handle CFs with blooms -- Key: HBASE-4449 URL: https://issues.apache.org/jira/browse/HBASE-4449 Project: HBase Issue Type: Bug Affects Versions: 0.90.4 Reporter: David Revell Attachments: HBASE-4449-v2.patch, HBASE-4449.patch When LoadIncrementalHFiles loads a store file that crosses region boundaries, it will split the file at the boundary to create two store files. If the store file is for a column family that has a bloom filter, then a java.lang.ArithmeticException: / by zero will be raised because ByteBloomFilter() is called with maxKeys of 0. The included patch assumes that the number of keys in each split child will be equal to the number of keys in the parent's bloom filter (instead of 0). This is an overestimate, but it's safe and easy. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4439) Move ClientScanner out of HTable
[ https://issues.apache.org/jira/browse/HBASE-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-4439: - Attachment: 4439.txt 1st cut. Very simple change, mostly just moves some code around. * simply moves HTable.ClientScanner to its own toplevel class. ClientScanner is now useable without an instance of an HTable. * HTable.getScanner(scan) now clones the scan. Even previously the scan object was actually modified inside the ClientScanner. * Some config options (maxScannerResultSize, scannerTimeout) * deprecates HTable.{get|set}ScannerCaching, so that scannerCaching can also be removed from HTable. Caching should be set through the scan object or the cluster wide config option instead. Move ClientScanner out of HTable Key: HBASE-4439 URL: https://issues.apache.org/jira/browse/HBASE-4439 Project: HBase Issue Type: Improvement Affects Versions: 0.94.0 Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Minor Fix For: 0.94.0 Attachments: 4439.txt See HBASE-1935 for motivation. ClientScanner should be able to exist outside of HTable. While we're at it, we can also add an abstract client scanner to easy development of new client side scanners (such as parallel scanners, or per region scanners). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
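The "getScanner now clones the scan" bullet above is a defensive-copy change: previously the scanner mutated the caller's Scan object while iterating. A minimal stand-alone illustration of the idea (the `Scan` and `getScanner` shapes here are simplified stand-ins, not the real HTable API):

```java
public class ScanCloneSketch {
    /** Simplified stand-in for a Scan: a single mutable setting. */
    static class Scan {
        int caching;
        Scan(int caching) { this.caching = caching; }
        Scan(Scan other) { this.caching = other.caching; }  // copy constructor
    }

    /**
     * Before the change, internal adjustments (caching, batching, etc.) were
     * written back into the caller's Scan. Cloning up front means the scanner
     * can adjust its own copy freely while the caller's object stays untouched.
     */
    static Scan getScanner(Scan scan) {
        Scan copy = new Scan(scan);   // defensive copy, as in the patch's intent
        copy.caching = 100;           // internal tweak no longer leaks to the caller
        return copy;
    }
}
```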
[jira] [Commented] (HBASE-4439) Move ClientScanner out of HTable
[ https://issues.apache.org/jira/browse/HBASE-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13109005#comment-13109005 ] Lars Hofhansl commented on HBASE-4439: -- Hit submit too early... * Some config options (maxScannerResultSize, scannerTimeout) are moved from HTable to ClientScanner. * ClientScanner uses static logger. Could consider abstracting useful parts in a helper class if we foresee that writing new client scanners would be a common client side task. Move ClientScanner out of HTable Key: HBASE-4439 URL: https://issues.apache.org/jira/browse/HBASE-4439 Project: HBase Issue Type: Improvement Affects Versions: 0.94.0 Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Minor Fix For: 0.94.0 Attachments: 4439.txt See HBASE-1935 for motivation. ClientScanner should be able to exist outside of HTable. While we're at it, we can also add an abstract client scanner to easy development of new client side scanners (such as parallel scanners, or per region scanners). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HBASE-4387) Error while syncing: DFSOutputStream is closed
[ https://issues.apache.org/jira/browse/HBASE-4387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl reassigned HBASE-4387: Assignee: Lars Hofhansl Error while syncing: DFSOutputStream is closed -- Key: HBASE-4387 URL: https://issues.apache.org/jira/browse/HBASE-4387 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: Lars Hofhansl Priority: Critical Fix For: 0.92.0 Attachments: 4387.txt, errors-with-context.txt In a billion-row load on ~25 servers, I see error while syncing reasonable often with the error DFSOutputStream is closed around a roll. We have some race where a roll at the same time as heavy inserts causes a problem. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4387) Error while syncing: DFSOutputStream is closed
[ https://issues.apache.org/jira/browse/HBASE-4387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13109013#comment-13109013 ] Lars Hofhansl commented on HBASE-4387: -- Assigned to me so this has an owner. Error while syncing: DFSOutputStream is closed -- Key: HBASE-4387 URL: https://issues.apache.org/jira/browse/HBASE-4387 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: Lars Hofhansl Priority: Critical Fix For: 0.92.0 Attachments: 4387.txt, errors-with-context.txt In a billion-row load on ~25 servers, I see error while syncing reasonable often with the error DFSOutputStream is closed around a roll. We have some race where a roll at the same time as heavy inserts causes a problem. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4449) LoadIncrementalHFiles can't handle CFs with blooms
[ https://issues.apache.org/jira/browse/HBASE-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13109016#comment-13109016 ] Ted Yu commented on HBASE-4449: --- For HFileV2, maxBloomEntries is optional. That's why the test passed in TRUNK. LoadIncrementalHFiles can't handle CFs with blooms -- Key: HBASE-4449 URL: https://issues.apache.org/jira/browse/HBASE-4449 Project: HBase Issue Type: Bug Affects Versions: 0.90.4 Reporter: David Revell Attachments: HBASE-4449-v2.patch, HBASE-4449.patch When LoadIncrementalHFiles loads a store file that crosses region boundaries, it will split the file at the boundary to create two store files. If the store file is for a column family that has a bloom filter, then a java.lang.ArithmeticException: / by zero will be raised because ByteBloomFilter() is called with maxKeys of 0. The included patch assumes that the number of keys in each split child will be equal to the number of keys in the parent's bloom filter (instead of 0). This is an overestimate, but it's safe and easy. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4449) LoadIncrementalHFiles can't handle CFs with blooms
[ https://issues.apache.org/jira/browse/HBASE-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13109018#comment-13109018 ] Ted Yu commented on HBASE-4449: --- +1 on patch. LoadIncrementalHFiles can't handle CFs with blooms -- Key: HBASE-4449 URL: https://issues.apache.org/jira/browse/HBASE-4449 Project: HBase Issue Type: Bug Affects Versions: 0.90.4 Reporter: David Revell Attachments: HBASE-4449-v2.patch, HBASE-4449.patch When LoadIncrementalHFiles loads a store file that crosses region boundaries, it will split the file at the boundary to create two store files. If the store file is for a column family that has a bloom filter, then a java.lang.ArithmeticException: / by zero will be raised because ByteBloomFilter() is called with maxKeys of 0. The included patch assumes that the number of keys in each split child will be equal to the number of keys in the parent's bloom filter (instead of 0). This is an overestimate, but it's safe and easy. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4448) HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility instances across unit tests
[ https://issues.apache.org/jira/browse/HBASE-4448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13109022#comment-13109022 ] stack commented on HBASE-4448: -- How would we pass this factory for test to test? How is this different from a fat class of tests that has a @Before that spins up the cluster and then an @After to shut it down as TestAdmin or TestFromClientSide do currently? HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility instances across unit tests - Key: HBASE-4448 URL: https://issues.apache.org/jira/browse/HBASE-4448 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: HBaseTestingUtilityFactory.java Setting up and tearing down HBaseTestingUtility instances in unit tests is very expensive. On my MacBook it takes about 10 seconds to set up a MiniCluster, and 7 seconds to tear it down. When multiplied by the number of test classes that use this facility, that's a lot of time in the build. This factory assumes that the JVM is being re-used across test classes in the build, otherwise this pattern won't work. I don't think this is appropriate for every use, but I think it can be applicable in a great many cases - especially where developers just want a simple MiniCluster with 1 slave. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4449) LoadIncrementalHFiles can't handle CFs with blooms
[ https://issues.apache.org/jira/browse/HBASE-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4449: -- Fix Version/s: 0.90.5 LoadIncrementalHFiles can't handle CFs with blooms -- Key: HBASE-4449 URL: https://issues.apache.org/jira/browse/HBASE-4449 Project: HBase Issue Type: Bug Affects Versions: 0.90.4 Reporter: David Revell Assignee: David Revell Fix For: 0.90.5 Attachments: HBASE-4449-v2.patch, HBASE-4449.patch When LoadIncrementalHFiles loads a store file that crosses region boundaries, it will split the file at the boundary to create two store files. If the store file is for a column family that has a bloom filter, then a java.lang.ArithmeticException: / by zero will be raised because ByteBloomFilter() is called with maxKeys of 0. The included patch assumes that the number of keys in each split child will be equal to the number of keys in the parent's bloom filter (instead of 0). This is an overestimate, but it's safe and easy. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HBASE-4449) LoadIncrementalHFiles can't handle CFs with blooms
[ https://issues.apache.org/jira/browse/HBASE-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu reassigned HBASE-4449: - Assignee: David Revell LoadIncrementalHFiles can't handle CFs with blooms -- Key: HBASE-4449 URL: https://issues.apache.org/jira/browse/HBASE-4449 Project: HBase Issue Type: Bug Affects Versions: 0.90.4 Reporter: David Revell Assignee: David Revell Fix For: 0.90.5 Attachments: HBASE-4449-v2.patch, HBASE-4449.patch When LoadIncrementalHFiles loads a store file that crosses region boundaries, it will split the file at the boundary to create two store files. If the store file is for a column family that has a bloom filter, then a java.lang.ArithmeticException: / by zero will be raised because ByteBloomFilter() is called with maxKeys of 0. The included patch assumes that the number of keys in each split child will be equal to the number of keys in the parent's bloom filter (instead of 0). This is an overestimate, but it's safe and easy. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4449) LoadIncrementalHFiles can't handle CFs with blooms
[ https://issues.apache.org/jira/browse/HBASE-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Revell updated HBASE-4449: Attachment: HBASE-4449-trunk-testsonly.patch Sorry Ted, I should have realized that my patch was only against 0.90. Current state: - HBASE_4449-v2.patch applies to 0.90 branch, and was +1'ed by Ted. - HBASE-4449-trunk-testsonly.patch is just now being uploaded and includes only test changes for bloom filter CFs. It hasn't been +1'ed by anyone yet. LoadIncrementalHFiles can't handle CFs with blooms -- Key: HBASE-4449 URL: https://issues.apache.org/jira/browse/HBASE-4449 Project: HBase Issue Type: Bug Affects Versions: 0.90.4 Reporter: David Revell Assignee: David Revell Fix For: 0.90.5 Attachments: HBASE-4449-trunk-testsonly.patch, HBASE-4449-v2.patch, HBASE-4449.patch When LoadIncrementalHFiles loads a store file that crosses region boundaries, it will split the file at the boundary to create two store files. If the store file is for a column family that has a bloom filter, then a java.lang.ArithmeticException: / by zero will be raised because ByteBloomFilter() is called with maxKeys of 0. The included patch assumes that the number of keys in each split child will be equal to the number of keys in the parent's bloom filter (instead of 0). This is an overestimate, but it's safe and easy. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME
[ https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13109026#comment-13109026 ] Hudson commented on HBASE-3421: --- Integrated in HBase-TRUNK #2239 (See [https://builds.apache.org/job/HBase-TRUNK/2239/]) HBASE-3421 Very wide rows -- 30M plus -- cause us OOME (Nate Putnam) tedyu : Files : * /hbase/trunk/CHANGES.txt * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java Very wide rows -- 30M plus -- cause us OOME --- Key: HBASE-3421 URL: https://issues.apache.org/jira/browse/HBASE-3421 Project: HBase Issue Type: Bug Affects Versions: 0.90.0 Reporter: stack Assignee: Nate Putnam Fix For: 0.90.5 Attachments: HBASE-3421.patch, HBASE-34211-v2.patch, HBASE-34211-v3.patch, HBASE-34211-v4.patch From the list, see 'jvm oom' in http://mail-archives.apache.org/mod_mbox/hbase-user/201101.mbox/browser, it looks like wide rows -- 30M or so -- causes OOME during compaction. We should check it out. Can the scanner used during compactions use the 'limit' when nexting? If so, this should save our OOME'ing (or, we need to add to the next a max size rather than count of KVs). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4440) add an option to presplit table to PerformanceEvaluation
[ https://issues.apache.org/jira/browse/HBASE-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujee Maniyam updated HBASE-4440: - Component/s: (was: performance) util add an option to presplit table to PerformanceEvaluation Key: HBASE-4440 URL: https://issues.apache.org/jira/browse/HBASE-4440 Project: HBase Issue Type: Improvement Components: util Reporter: Sujee Maniyam Priority: Minor Labels: benchmark PerformanceEvaluation a quick way to 'benchmark' a HBase cluster. The current 'write*' operations do not pre-split the table. Pre splitting the table will really boost the insert performance. It would be nice to have an option to enable pre-splitting table before the inserts begin. it would look something like: (a) hbase ...PerformanceEvaluation --presplit=10 other options (b) hbase ...PerformanceEvaluation --presplit other options (b) will try to presplit the table on some default value (say number of region servers) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
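The proposed `--presplit=N` option boils down to creating the table with N-1 precomputed region boundaries so writers fan out across servers immediately, instead of funneling into a single region until natural splits catch up. A hypothetical sketch of the split-key computation over an integer row-id space (not the actual PerformanceEvaluation code, which would pass the keys to `HBaseAdmin.createTable`):

```java
public class PresplitSketch {
    /**
     * Generate numRegions - 1 split keys evenly dividing the row-id space
     * [0, totalRows). Creating a table with these boundaries yields
     * numRegions initial regions.
     */
    static long[] splitKeys(long totalRows, int numRegions) {
        long[] keys = new long[numRegions - 1];
        for (int i = 1; i < numRegions; i++) {
            keys[i - 1] = totalRows * i / numRegions;  // i-th boundary
        }
        return keys;
    }
}
```

The `--presplit` form without a value could then default numRegions to the region server count, as option (b) in the description suggests.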
[jira] [Updated] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME
[ https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-3421: -- Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) Very wide rows -- 30M plus -- cause us OOME --- Key: HBASE-3421 URL: https://issues.apache.org/jira/browse/HBASE-3421 Project: HBase Issue Type: Bug Affects Versions: 0.90.0 Reporter: stack Assignee: Nate Putnam Fix For: 0.90.5 Attachments: HBASE-3421.patch, HBASE-34211-v2.patch, HBASE-34211-v3.patch, HBASE-34211-v4.patch From the list, see 'jvm oom' in http://mail-archives.apache.org/mod_mbox/hbase-user/201101.mbox/browser, it looks like wide rows -- 30M or so -- causes OOME during compaction. We should check it out. Can the scanner used during compactions use the 'limit' when nexting? If so, this should save our OOME'ing (or, we need to add to the next a max size rather than count of KVs). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4448) HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility instances across unit tests
[ https://issues.apache.org/jira/browse/HBASE-4448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13109075#comment-13109075 ] Jesse Yates commented on HBASE-4448: I think these tests would all be run in the same JVM (non-forked mode) - that way they can all reuse the same static testing util. Running it in forked mode really wouldn't help with this issue. Not sure how running in parallel is actually managed - I'm assuming it's all out of the same JVM, just on different threads. Using the cluster the proposed way would again be a win. HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility instances across unit tests - Key: HBASE-4448 URL: https://issues.apache.org/jira/browse/HBASE-4448 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: HBaseTestingUtilityFactory.java Setting up and tearing down HBaseTestingUtility instances in unit tests is very expensive. On my MacBook it takes about 10 seconds to set up a MiniCluster, and 7 seconds to tear it down. When multiplied by the number of test classes that use this facility, that's a lot of time in the build. This factory assumes that the JVM is being re-used across test classes in the build, otherwise this pattern won't work. I don't think this is appropriate for every use, but I think it can be applicable in a great many cases - especially where developers just want a simple MiniCluster with 1 slave. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4014) Coprocessors: Flag the presence of coprocessors in logged exceptions
[ https://issues.apache.org/jira/browse/HBASE-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13109086#comment-13109086 ] jirapos...@reviews.apache.org commented on HBASE-4014: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/969/ --- (Updated 2011-09-20 23:33:04.074472) Review request for hbase, Gary Helmling and Mingjie Lai. Changes --- -Address Gary's last review: -Set hbase.coprocessor.abortonerror to default to false. -Remove separate threads in tests where possible. -Remove redundant testStarted(): it does not differ from the same test in TestMasterObserver. -Fix name of test. -Simplified patch as allowed by Gary's committal of HBASE-4420: MasterObserver's preMove() and postMove() are now declared to throw an IOException. -Split the existing two tests, TestRegionServerCoprocessorException.java and TestMasterCoprocessorException.java, each into two tests to exercise the new hbase.coprocessor.abortonerror configuration setting, so four total tests now: 1. TestMasterCoprocessorExceptionWithAbort.java (hbase.coprocessor.abortonerror=true) 2. TestMasterCoprocessorExceptionWithRemove.java (hbase.coprocessor.abortonerror=false) 3. TestRegionServerCoprocessorExceptionWithAbort.java (hbase.coprocessor.abortonerror=true) 4. TestRegionServerCoprocessorExceptionWithRemove.java (hbase.coprocessor.abortonerror=false) Summary --- https://issues.apache.org/jira/browse/HBASE-4014 Coprocessors: Flag the presence of coprocessors in logged exceptions The general gist here is to wrap each of {Master,RegionServer}CoprocessorHost's coprocessor calls inside a try { ... } catch (Throwable e) { handleCoprocessorThrowable(e) } block. handleCoprocessorThrowable() is responsible for either passing 'e' along to the client (if 'e' is an IOException) or, otherwise, aborting the service (RegionServer or Master). The abort message contains a list of the loaded coprocessors for crash analysis. 
This addresses bug HBASE-4014. https://issues.apache.org/jira/browse/HBASE-4014 Diffs (updated) - src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java 4e492e1 src/main/java/org/apache/hadoop/hbase/master/HMaster.java 06bf814 src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java 0c95017 src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java bff1f6c src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java a6cf6a8 src/main/resources/hbase-default.xml 2c8f44b src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorExceptionWithAbort.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorExceptionWithRemove.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorExceptionWithAbort.java PRE-CREATION src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorExceptionWithRemove.java PRE-CREATION Diff: https://reviews.apache.org/r/969/diff Testing --- patch includes two tests: TestMasterCoprocessorException.java TestRegionServerCoprocessorException.java both tests pass in my build environment. Thanks, Eugene Coprocessors: Flag the presence of coprocessors in logged exceptions Key: HBASE-4014 URL: https://issues.apache.org/jira/browse/HBASE-4014 Project: HBase Issue Type: Improvement Components: coprocessors Reporter: Andrew Purtell Assignee: Eugene Koontz Fix For: 0.92.0 Attachments: HBASE-4014.patch, HBASE-4014.patch, HBASE-4014.patch, HBASE-4014.patch, HBASE-4014.patch For some initial triage of bug reports for core versus for deployments with loaded coprocessors, we need something like the Linux kernel's taint flag, and list of linked in modules that show up in the output of every OOPS, to appear above or below exceptions that appear in the logs. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
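The wrap-and-handle pattern the summary describes can be sketched as follows (a simplified, self-contained model; the real CoprocessorHost, the service abort path, and the hbase.coprocessor.abortonerror flag are only mimicked here with plain fields):

```java
import java.io.IOException;

// Simplified sketch of the pattern described in the review: every coprocessor
// invocation is wrapped so that an IOException is passed through to the
// client, while any other Throwable either aborts the service (logging the
// loaded coprocessors) or removes the coprocessor, depending on the
// abort-on-error setting. Not the actual CoprocessorHost code.
public class CoprocessorHostSketch {
    final boolean abortOnError; // models hbase.coprocessor.abortonerror (default false)
    boolean aborted = false;    // models Master/RegionServer abort
    boolean removed = false;    // models unloading the offending coprocessor

    CoprocessorHostSketch(boolean abortOnError) { this.abortOnError = abortOnError; }

    void handleCoprocessorThrowable(Throwable e) throws IOException {
        if (e instanceof IOException) {
            throw (IOException) e;  // surface to the client
        }
        if (abortOnError) {
            aborted = true;         // abort, listing loaded coprocessors for crash analysis
        } else {
            removed = true;         // remove the coprocessor and keep serving
        }
    }

    void invoke(Runnable coprocessorCall) throws IOException {
        try {
            coprocessorCall.run();
        } catch (Throwable e) {
            handleCoprocessorThrowable(e);
        }
    }
}
```

The four tests above then correspond to exercising both branches of the non-IOException path, on both the master and the region server side.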
[jira] [Commented] (HBASE-4014) Coprocessors: Flag the presence of coprocessors in logged exceptions
[ https://issues.apache.org/jira/browse/HBASE-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13109087#comment-13109087 ] jirapos...@reviews.apache.org commented on HBASE-4014: -- bq. On 2011-09-08 23:46:17, Gary Helmling wrote: bq. src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java, line 638 bq. https://reviews.apache.org/r/969/diff/9/?file=38128#file38128line638 bq. bq. This should default to false. Fixed; please see latest patch. bq. On 2011-09-08 23:46:17, Gary Helmling wrote: bq. src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorException.java, line 67 bq. https://reviews.apache.org/r/969/diff/9/?file=38134#file38134line67 bq. bq. Does this need to be a separate thread? Can the contents of the run() method just be inline in testExceptionFromCoprocessorWhenCreatingTable()? In my testing, if the server (master or regionserver) aborts, it seems like the client becomes unresponsive and the test times out and fails. However, if I create a separate thread, the main thread can terminate properly and the test passes. I removed the separate Threads for the two tests where an abort is not expected (TestMasterCoprocessorExceptionWithRemove.java and TestRegionServerExceptionWithRemove.java). bq. On 2011-09-08 23:46:17, Gary Helmling wrote: bq. src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorException.java, line 156 bq. https://reviews.apache.org/r/969/diff/9/?file=38134#file38134line156 bq. bq. Do we need this test? If we're already doing the same tests in TestMasterObserver, it doesn't seem like it. Has anything been added to this method that we need? You are right; removed. bq. On 2011-09-08 23:46:17, Gary Helmling wrote: bq. src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorException.java, line 89 bq. https://reviews.apache.org/r/969/diff/9/?file=38135#file38135line89 bq. bq. Name should be something like testExceptionDuringPut? 
Renamed; thanks. - Eugene --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/969/#review1805 --- On 2011-09-06 19:08:59, Eugene Koontz wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/969/ bq. --- bq. bq. (Updated 2011-09-06 19:08:59) bq. bq. bq. Review request for hbase, Gary Helmling and Mingjie Lai. bq. bq. bq. Summary bq. --- bq. bq. https://issues.apache.org/jira/browse/HBASE-4014 Coprocessors: Flag the presence of coprocessors in logged exceptions bq. bq. The general gist here is to wrap each of {Master,RegionServer}CoprocessorHost's coprocessor call inside a bq. bq. try { ... } catch (Throwable e) { handleCoprocessorThrowable(e) } bq. bq. block. bq. bq. handleCoprocessorThrowable() is responsible for either passing 'e' along to the client (if 'e' is an IOException) or, otherwise, aborting the service (Regionserver or Master). bq. bq. The abort message contains a list of the loaded coprocessors for crash analysis. bq. bq. bq. This addresses bug HBASE-4014. bq. https://issues.apache.org/jira/browse/HBASE-4014 bq. bq. bq. Diffs bq. - bq. bq.src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java 4e492e1 bq.src/main/java/org/apache/hadoop/hbase/master/HMaster.java 3f60653 bq.src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java aa930f5 bq.src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 8ff6e62 bq. src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java 5796413 bq.src/main/resources/hbase-default.xml 2c8f44b bq. src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorException.java PRE-CREATION bq. src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorException.java PRE-CREATION bq. bq. Diff: https://reviews.apache.org/r/969/diff bq. bq. bq. Testing bq. --- bq. bq. patch includes two tests: bq. bq. TestMasterCoprocessorException.java bq. 
TestRegionServerCoprocessorException.java bq. bq. both tests pass in my build environment. bq. bq. bq. Thanks, bq. bq. Eugene bq. bq. Coprocessors: Flag the presence of coprocessors in logged exceptions Key: HBASE-4014 URL: https://issues.apache.org/jira/browse/HBASE-4014 Project: HBase Issue Type: Improvement Components: coprocessors
[jira] [Commented] (HBASE-4437) Update hadoop in 0.92 (0.20.205?)
[ https://issues.apache.org/jira/browse/HBASE-4437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13109090#comment-13109090 ] stack commented on HBASE-4437: -- Trying with 205. Looks like we lose our blue border in the UI in 0.90.x hbase. Lars Francke figured out why a while back. Need to revisit. Update hadoop in 0.92 (0.20.205?) - Key: HBASE-4437 URL: https://issues.apache.org/jira/browse/HBASE-4437 Project: HBase Issue Type: Task Reporter: stack We ship with branch-0.20-append a few versions back from the tip. If 205 comes out and hbase works on it, we should ship 0.92 with it (while also ensuring it works with the 0.22 and 0.23 branches). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4448) HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility instances across unit tests
[ https://issues.apache.org/jira/browse/HBASE-4448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13109103#comment-13109103 ] Doug Meil commented on HBASE-4448: -- As Jesse said, we have to reuse JVMs for this to work. Rather than doing this... {code} @BeforeClass public static void setUpBeforeClass() throws Exception { TEST_UTIL.startMiniCluster(1); } {code} ... you would do something like this... {code} @BeforeClass public static void setUpBeforeClass() throws Exception { TEST_UTIL = HBaseTestingUtilityFactory.get().getMiniCluster(1); } {code} ... and it would already be started. And rather than calling an explicit shutdown on the HBaseTestingUtility instance, you'd call a return on the factory... {code} HBaseTestingUtilityFactory.get().returnMiniCluster(instance); {code} ... where it would also blow away any tables that have been created so it's clean for the next person that uses it. HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility instances across unit tests - Key: HBASE-4448 URL: https://issues.apache.org/jira/browse/HBASE-4448 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: HBaseTestingUtilityFactory.java Setting up and tearing down HBaseTestingUtility instances in unit tests is very expensive. On my MacBook it takes about 10 seconds to set up a MiniCluster, and 7 seconds to tear it down. When multiplied by the number of test classes that use this facility, that's a lot of time in the build. This factory assumes that the JVM is being re-used across test classes in the build, otherwise this pattern won't work. I don't think this is appropriate for every use, but I think it can be applicable in a great many cases - especially where developers just want a simple MiniCluster with 1 slave. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
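The reuse pattern Doug describes can be modeled in miniature like this (a self-contained sketch with a stub standing in for HBaseTestingUtility; the names are illustrative and this is not the attached HBaseTestingUtilityFactory.java):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the factory idea: the expensive resource (the MiniCluster) is
// started once per JVM and handed out to successive test classes; "returning"
// it just resets state (blows away tables) instead of shutting it down.
public class TestingUtilityFactorySketch {
    // Stand-in for HBaseTestingUtility + MiniCluster.
    static class MiniClusterStub {
        static int startCount = 0;               // counts expensive startups
        List<String> tables = new ArrayList<>();
        MiniClusterStub() { startCount++; }      // ~10s startup happens here, once
        void reset() { tables.clear(); }         // cheap cleanup between tests
    }

    private static MiniClusterStub instance;

    static synchronized MiniClusterStub getMiniCluster() {
        if (instance == null) instance = new MiniClusterStub();
        return instance;
    }

    static synchronized void returnMiniCluster(MiniClusterStub c) {
        c.reset();  // clean slate for the next test class, no teardown cost
    }
}
```

As noted in the comments, this only pays off when the build reuses one JVM across test classes; in forked-per-class mode each fork would start its own instance anyway.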
[jira] [Updated] (HBASE-2742) Provide strong authentication with a secure RPC engine
[ https://issues.apache.org/jira/browse/HBASE-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gary Helmling updated HBASE-2742: - Summary: Provide strong authentication with a secure RPC engine (was: Merge secure Hadoop RPC changes into HBase RPC) Changing title for clarity. Provide strong authentication with a secure RPC engine -- Key: HBASE-2742 URL: https://issues.apache.org/jira/browse/HBASE-2742 Project: HBase Issue Type: Improvement Components: ipc Reporter: Gary Helmling The HBase RPC code (org.apache.hadoop.hbase.ipc.*) was originally forked off of the Hadoop RPC classes, with some performance tweaks added. Those optimizations have come at a cost in keeping up with Hadoop RPC changes, however -- both bug fixes and improvements/new features. In particular, this impacts how we implement security features in HBase (see HBASE-1697 and HBASE-2016). The secure Hadoop implementation (HADOOP-4487) relies heavily on RPC changes to support client authentication via Kerberos, and securing and mutual authentication of client/server connections via SASL. Making use of the built-in Hadoop RPC classes will gain us these pieces for free in a secure HBase. So, I'm proposing that we drop the HBase forked version of RPC and convert to direct use of Hadoop RPC, while working to contribute important fixes back upstream to Hadoop core. Based on a review of the HBase RPC changes, the key divergences seem to be: HBaseClient: - added use of TCP keepalive (HBASE-1754) - made connection retries and sleep configurable (HBASE-1815) - prevent NPE if socket == null due to creation failure (HBASE-2443) HBaseRPC: - mapping of method names to codes (removed in HBASE-2219) HBaseServer: - use of TCP keepalives (HBASE-1754) - OOME in server does not trigger abort (HBASE-1198) HbaseObjectWritable: - allows List serialization - includes its own class-to-code mapping (HBASE-328) Proposed process is: 1. 
open issues with patches on Hadoop core for important fixes/adjustments from HBase RPC (HBASE-1198, HBASE-1815, HBASE-1754, HBASE-2443, plus a pluggable ObjectWritable implementation in RPC.Invocation to allow use of HbaseObjectWritable). 2. ship a Hadoop version with RPC patches applied -- ideally we should avoid another copy-n-paste code fork, subject to ability to isolate changes from impacting Hadoop internal RPC wire formats 3. if all Hadoop core patches are applied we can drop back to a plain vanilla Hadoop version I realize there are many different opinions on how to proceed with HBase RPC, so I'm hoping this issue will kick off a discussion on what the best approach might be. My own motivation is maximizing re-use of the authentication and connection security work that's already gone into Hadoop core. I'll put together a set of patches around #1 and #2, but obviously we need some consensus around this to move forward. If I'm missing other differences between HBase and Hadoop RPC, please list as well. Discuss! -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME
[ https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13109109#comment-13109109 ] Hudson commented on HBASE-3421: --- Integrated in HBase-0.92 #6 (See [https://builds.apache.org/job/HBase-0.92/6/]) HBASE-3421 Very wide rows -- 30M plus -- cause us OOME (Nate Putnam) tedyu : Files : * /hbase/branches/0.92/CHANGES.txt * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java Very wide rows -- 30M plus -- cause us OOME --- Key: HBASE-3421 URL: https://issues.apache.org/jira/browse/HBASE-3421 Project: HBase Issue Type: Bug Affects Versions: 0.90.0 Reporter: stack Assignee: Nate Putnam Fix For: 0.90.5 Attachments: HBASE-3421.patch, HBASE-34211-v2.patch, HBASE-34211-v3.patch, HBASE-34211-v4.patch From the list, see 'jvm oom' in http://mail-archives.apache.org/mod_mbox/hbase-user/201101.mbox/browser, it looks like wide rows -- 30M or so -- causes OOME during compaction. We should check it out. Can the scanner used during compactions use the 'limit' when nexting? If so, this should save our OOME'ing (or, we need to add to the next a max size rather than count of KVs). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4448) HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility instances across unit tests
[ https://issues.apache.org/jira/browse/HBASE-4448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13109110#comment-13109110 ] Doug Meil commented on HBASE-4448: -- I'm also working on some analysis to show the different uses of HBaseTestingUtility - not all the tests use it the same way. Some do 1 or 3 slave MiniClusters, and some do ZkClusters. HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility instances across unit tests - Key: HBASE-4448 URL: https://issues.apache.org/jira/browse/HBASE-4448 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: HBaseTestingUtilityFactory.java Setting up and tearing down HBaseTestingUtility instances in unit tests is very expensive. On my MacBook it takes about 10 seconds to set up a MiniCluster, and 7 seconds to tear it down. When multiplied by the number of test classes that use this facility, that's a lot of time in the build. This factory assumes that the JVM is being re-used across test classes in the build, otherwise this pattern won't work. I don't think this is appropriate for every use, but I think it can be applicable in a great many cases - especially where developers just want a simple MiniCluster with 1 slave. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4387) Error while syncing: DFSOutputStream is closed
[ https://issues.apache.org/jira/browse/HBASE-4387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13109118#comment-13109118 ] Todd Lipcon commented on HBASE-4387: I'm traveling for the next week or so, so I probably won't have a chance to do so. I'll be continuing to work on 0.92 stabilization over the next couple of months, though - so I'll certainly be running this test again in the relatively near future. Error while syncing: DFSOutputStream is closed -- Key: HBASE-4387 URL: https://issues.apache.org/jira/browse/HBASE-4387 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: Lars Hofhansl Priority: Critical Fix For: 0.92.0 Attachments: 4387.txt, errors-with-context.txt In a billion-row load on ~25 servers, I see 'error while syncing' reasonably often with the error 'DFSOutputStream is closed' around a roll. We have some race where a roll at the same time as heavy inserts causes a problem. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HBASE-4070) [Coprocessors] Improve region server metrics to report loaded coprocessors to master
[ https://issues.apache.org/jira/browse/HBASE-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koontz reassigned HBASE-4070: Assignee: Eugene Koontz [Coprocessors] Improve region server metrics to report loaded coprocessors to master Key: HBASE-4070 URL: https://issues.apache.org/jira/browse/HBASE-4070 Project: HBase Issue Type: Improvement Affects Versions: 0.90.3 Reporter: Mingjie Lai Assignee: Eugene Koontz HBASE-3512 is about listing loaded cp classes at shell. To make it more generic, we need a way to report this piece of information from region to master (or just at region server level). So later on, we can display the loaded class names at shell as well as web console. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4344) Persist memstoreTS to disk
[ https://issues.apache.org/jira/browse/HBASE-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-4344: -- Attachment: 4344-v2.txt Patch version 2 is rebased for TRUNK. Running test suite now. Persist memstoreTS to disk -- Key: HBASE-4344 URL: https://issues.apache.org/jira/browse/HBASE-4344 Project: HBase Issue Type: Sub-task Reporter: Amitanand Aiyer Assignee: Amitanand Aiyer Fix For: 0.89.20100924 Attachments: 4344-v2.txt, patch-2 Atomicity can be achieved in two ways -- (i) by using a multiversion concurrency system (MVCC), or (ii) by ensuring that new writes do not complete, until the old reads complete. Currently, Memstore uses something along the lines of MVCC (called RWCC for read-write-consistency-control). But, this mechanism is not incorporated for the key-values written to the disk, as they do not include the memstore TS. Let us make the two approaches be similar, by persisting the memstoreTS along with the key-value when it is written to the disk. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4344) Persist memstoreTS to disk
[ https://issues.apache.org/jira/browse/HBASE-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13109228#comment-13109228 ] Ted Yu commented on HBASE-4344: --- Got a few test failures so far: {code} testHFileFormatV2(org.apache.hadoop.hbase.io.hfile.TestHFileWriterV2) Time elapsed: 0.704 sec FAILURE! java.lang.AssertionError: at org.junit.Assert.fail(Assert.java:91) at org.junit.Assert.assertTrue(Assert.java:43) at org.junit.Assert.assertTrue(Assert.java:54) at org.apache.hadoop.hbase.io.hfile.TestHFileWriterV2.testHFileFormatV2(TestHFileWriterV2.java:141) testCacheOnWrite[5](org.apache.hadoop.hbase.io.hfile.TestCacheOnWrite) Time elapsed: 0.741 sec FAILURE! org.junit.ComparisonFailure: expected:{DATA=1[367, LEAF_INDEX=172, BLOOM_CHUNK=9, INTERMEDIATE_INDEX=24]} but was:{DATA=1[459, LEAF_INDEX=183, BLOOM_CHUNK=9, INTERMEDIATE_INDEX=25]} at org.junit.Assert.assertEquals(Assert.java:123) at org.junit.Assert.assertEquals(Assert.java:145) at org.apache.hadoop.hbase.io.hfile.TestCacheOnWrite.readStoreFile(TestCacheOnWrite.java:180) at org.apache.hadoop.hbase.io.hfile.TestCacheOnWrite.testCacheOnWrite(TestCacheOnWrite.java:150) testCacheOnWrite[0](org.apache.hadoop.hbase.io.hfile.TestCacheOnWrite) Time elapsed: 1.129 sec FAILURE! org.junit.ComparisonFailure: expected:{DATA=1[367, LEAF_INDEX=172, BLOOM_CHUNK=9, INTERMEDIATE_INDEX=24]} but was:{DATA=1[459, LEAF_INDEX=183, BLOOM_CHUNK=9, INTERMEDIATE_INDEX=25]} at org.junit.Assert.assertEquals(Assert.java:123) at org.junit.Assert.assertEquals(Assert.java:145) at org.apache.hadoop.hbase.io.hfile.TestCacheOnWrite.readStoreFile(TestCacheOnWrite.java:180) at org.apache.hadoop.hbase.io.hfile.TestCacheOnWrite.testCacheOnWrite(TestCacheOnWrite.java:150) testSeekBefore(org.apache.hadoop.hbase.io.hfile.TestSeekTo) Time elapsed: 0.232 sec ERROR! 
java.lang.IllegalStateException: blockSeek with seekBefore at the first key of the block: key=\x00\x01c\x06familyqualifier\x7F\xFF\xFF\xFF\xFF\xFF\xFF\xFF\x04, blockOffset=0, onDiskSize=171 at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.blockSeek(HFileReaderV2.java:647) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.loadBlockAndSeekToKey(HFileReaderV2.java:577) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.seekBefore(HFileReaderV2.java:732) at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.seekBefore(HFileReaderV2.java:687) at org.apache.hadoop.hbase.io.hfile.TestSeekTo.testSeekBefore(TestSeekTo.java:70) {code} Persist memstoreTS to disk -- Key: HBASE-4344 URL: https://issues.apache.org/jira/browse/HBASE-4344 Project: HBase Issue Type: Sub-task Reporter: Amitanand Aiyer Assignee: Amitanand Aiyer Fix For: 0.89.20100924 Attachments: 4344-v2.txt, patch-2 Atomicity can be achieved in two ways -- (i) by using a multiversion concurrency system (MVCC), or (ii) by ensuring that new writes do not complete, until the old reads complete. Currently, Memstore uses something along the lines of MVCC (called RWCC for read-write-consistency-control). But, this mechanism is not incorporated for the key-values written to the disk, as they do not include the memstore TS. Let us make the two approaches be similar, by persisting the memstoreTS along with the key-value when it is written to the disk. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME
[ https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-3421: -- Attachment: 3421.addendum Addendum fixes Store.FIXED_OVERHEAD Very wide rows -- 30M plus -- cause us OOME --- Key: HBASE-3421 URL: https://issues.apache.org/jira/browse/HBASE-3421 Project: HBase Issue Type: Bug Affects Versions: 0.90.0 Reporter: stack Assignee: Nate Putnam Fix For: 0.90.5 Attachments: 3421.addendum, HBASE-3421.patch, HBASE-34211-v2.patch, HBASE-34211-v3.patch, HBASE-34211-v4.patch From the list, see 'jvm oom' in http://mail-archives.apache.org/mod_mbox/hbase-user/201101.mbox/browser, it looks like wide rows -- 30M or so -- causes OOME during compaction. We should check it out. Can the scanner used during compactions use the 'limit' when nexting? If so, this should save our OOME'ing (or, we need to add to the next a max size rather than count of KVs). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3130) [replication] ReplicationSource can't recover from session expired on remote clusters
[ https://issues.apache.org/jira/browse/HBASE-3130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13109246#comment-13109246 ] Chris Trezzo commented on HBASE-3130: - @J-D Now that I have looked at the test code a bit, I have a question: My current understanding is that to kill the master-slave connection, you need to somehow get the session id and session password for the ReplicationPeer's zookeeper session (i.e. you need the ZookeeperWatcher instance). Currently, this is not exposed. Also, this does not seem like something we would want to expose if the only motivation is for testing. Any thoughts? I could be missing something obvious. Thanks! Chris [replication] ReplicationSource can't recover from session expired on remote clusters - Key: HBASE-3130 URL: https://issues.apache.org/jira/browse/HBASE-3130 Project: HBase Issue Type: Bug Components: replication Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Assignee: Chris Trezzo Fix For: 0.92.0 Attachments: 3130-v2.txt, 3130-v3.txt, 3130.txt Currently ReplicationSource cannot recover when its zookeeper connection to its remote cluster expires. HLogs are still being tracked, but a cluster restart is required to continue replication (or a rolling restart). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4344) Persist memstoreTS to disk
[ https://issues.apache.org/jira/browse/HBASE-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13109252#comment-13109252 ] Ted Yu commented on HBASE-4344: --- Two more failures: {code} testCacheOnWriteEvictOnClose(org.apache.hadoop.hbase.regionserver.TestStoreFile) Time elapsed: 2.932 sec FAILURE! junit.framework.AssertionFailedError: expected:80 but was:81 at junit.framework.Assert.fail(Assert.java:47) at junit.framework.Assert.failNotEquals(Assert.java:283) at junit.framework.Assert.assertEquals(Assert.java:64) at junit.framework.Assert.assertEquals(Assert.java:130) at junit.framework.Assert.assertEquals(Assert.java:136) at org.apache.hadoop.hbase.regionserver.TestStoreFile.testCacheOnWriteEvictOnClose(TestStoreFile.java:672) testRollback(org.apache.hadoop.hbase.regionserver.TestSplitTransaction) Time elapsed: 7.433 sec ERROR! java.lang.RuntimeException: Already used this rwcc. Too late to initialize at org.apache.hadoop.hbase.regionserver.ReadWriteConsistencyControl.initialize(ReadWriteConsistencyControl.java:77) at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:415) at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:366) at org.apache.hadoop.hbase.regionserver.SplitTransaction.rollback(SplitTransaction.java:679) at org.apache.hadoop.hbase.regionserver.TestSplitTransaction.testRollback(TestSplitTransaction.java:234) {code} Persist memstoreTS to disk -- Key: HBASE-4344 URL: https://issues.apache.org/jira/browse/HBASE-4344 Project: HBase Issue Type: Sub-task Reporter: Amitanand Aiyer Assignee: Amitanand Aiyer Fix For: 0.89.20100924 Attachments: 4344-v2.txt, patch-2 Atomicity can be achieved in two ways -- (i) by using a multiversion concurrency system (MVCC), or (ii) by ensuring that new writes do not complete, until the old reads complete. Currently, Memstore uses something along the lines of MVCC (called RWCC for read-write-consistency-control). 
But, this mechanism is not incorporated for the key-values written to the disk, as they do not include the memstore TS. Let us make the two approaches be similar, by persisting the memstoreTS along with the key-value when it is written to the disk. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
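The MVCC/RWCC idea in the description -- each key-value carries the write number (memstoreTS) it was written under, and a reader only sees entries at or below its read point -- can be illustrated with a toy model (no HBase code; persisting the memstoreTS alongside each KV is what keeps on-disk data consistent with this scheme):

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of read-write-consistency-control: writes are stamped with an
// increasing write number (memstoreTS) and become visible by advancing the
// read point; a reader sees only entries at or below its read point.
// Purely illustrative, not the actual ReadWriteConsistencyControl class.
public class RwccSketch {
    static class KV {
        final String key;
        final long memstoreTS;
        KV(String k, long ts) { key = k; memstoreTS = ts; }
    }

    final List<KV> store = new ArrayList<>();
    long nextWriteNumber = 1;
    long readPoint = 0;

    void put(String key) {
        long ts = nextWriteNumber++;
        store.add(new KV(key, ts));
        readPoint = ts;  // write completes: becomes visible to new readers
    }

    List<String> read(long atReadPoint) {
        List<String> visible = new ArrayList<>();
        for (KV kv : store) {
            if (kv.memstoreTS <= atReadPoint) visible.add(kv.key);
        }
        return visible;
    }
}
```

A reader that captured its read point before a later put never sees that put's KV, which is the in-memory guarantee the patch extends to KVs written to disk.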
[jira] [Commented] (HBASE-3130) [replication] ReplicationSource can't recover from session expired on remote clusters
[ https://issues.apache.org/jira/browse/HBASE-3130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13109270#comment-13109270 ]

Jean-Daniel Cryans commented on HBASE-3130:
---

At the very minimum you could do a TestReplicationPeer that tests the session recovery code. An integration test might be harder since you have to fiddle with the internals; maybe explore the avenue of having a test that resides in the same package (o.a.h.h.r.replication) and exposing the methods only there.

[replication] ReplicationSource can't recover from session expired on remote clusters
-

Key: HBASE-3130
URL: https://issues.apache.org/jira/browse/HBASE-3130
Project: HBase
Issue Type: Bug
Components: replication
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: Chris Trezzo
Fix For: 0.92.0
Attachments: 3130-v2.txt, 3130-v3.txt, 3130.txt

Currently ReplicationSource cannot recover when its zookeeper connection to its remote cluster expires. HLogs are still being tracked, but a cluster restart is required to continue replication (or a rolling restart).

--
This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
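The recovery being tested here boils down to a simple pattern: on session expiry, throw away the old connection and build a fresh one, retrying until it succeeds. The sketch below is a generic, self-contained model of that pattern with a deterministic fake connection factory, in the spirit of the suggested TestReplicationPeer; all names here are stand-ins, not HBase or ZooKeeper types, and this is not the patch's actual implementation.

```java
// Generic reconnect-on-session-expiry sketch; FlakyConnectionFactory
// stands in for a ZooKeeper session to the remote cluster. Not HBase code.
public class SessionRecoverySketch {
    static class SessionExpiredException extends Exception {}

    interface Connection {
        String fetch() throws SessionExpiredException;
    }

    // Fails with session expiry a fixed number of times, then succeeds;
    // this makes the recovery loop testable without a real cluster.
    static class FlakyConnectionFactory {
        private int failuresLeft;
        int connectionsBuilt = 0;
        FlakyConnectionFactory(int failures) { this.failuresLeft = failures; }
        Connection connect() {
            connectionsBuilt++;
            return () -> {
                if (failuresLeft > 0) { failuresLeft--; throw new SessionExpiredException(); }
                return "ok";
            };
        }
    }

    // The recovery pattern: on expiry, drop the stale connection and rebuild.
    static String fetchWithRecovery(FlakyConnectionFactory factory, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            Connection conn = factory.connect(); // fresh session each attempt
            try {
                return conn.fetch();
            } catch (SessionExpiredException e) {
                // stale session: fall through and reconnect
            }
        }
        throw new IllegalStateException("could not recover session");
    }

    public static void main(String[] args) {
        FlakyConnectionFactory factory = new FlakyConnectionFactory(2);
        System.out.println(fetchWithRecovery(factory, 5)); // prints ok
        System.out.println(factory.connectionsBuilt);      // prints 3
    }
}
```

A unit test along these lines can assert both that the fetch eventually succeeds and that the expected number of reconnects happened, without fiddling with ReplicationSource internals.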
[jira] [Commented] (HBASE-3958) use Scan with setCaching() and PageFilter have a problem
[ https://issues.apache.org/jira/browse/HBASE-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13109279#comment-13109279 ]

subramanian raghunathan commented on HBASE-3958:

As per Jerry Du: "ranges" means a cross-region scan (multi-region scan). The issue is my first HBase program; the following is pseudocode:

create a table which is pre-split into 100 regions;
each region has 100 rows;
fill data with row key [0,]
scan with a startKey and stopKey which cross all regions; [0,)
scan.setCaching(3);
scan.setFilter(new PageFilter(5));

The output is:
Row key:
0 1 2 (caching border)
3 4 (region_0 with filter border) 5 (caching border)
6 7 8 (caching border)
9 (region_1 with filter border) 10 11 (caching border)
12 13 14 (caching border AND region_2 with filter border)

In another case:
scan.setCaching(2);
scan.setFilter(new PageFilter(5));

Output will be:
Row key:
0 1 (caching border)
2 3 (caching border)
4 (region_0 with filter border) 5 (caching border)
6 7 (caching border)
8 9 (caching border AND region_1 with filter border)

The scan stops where the caching border and the region border coincide. There are two reasons:
1. The Filter instance only lives within a single region's scan.
2. In the method org.apache.hadoop.hbase.client.HTable.ClientScanner.next():
{code}
do {
  ...
} while (remainingResultSize > 0 && countdown > 0 && nextScanner(countdown, values == null));
{code}
The stop condition does NOT consider a scan with a Filter. This is not only PageFilter: any filter will be a problem in a cross-region (multi-region) scan.
use Scan with setCaching() and PageFilter have a problem

Key: HBASE-3958
URL: https://issues.apache.org/jira/browse/HBASE-3958
Project: HBase
Issue Type: Bug
Components: filters, regionserver
Affects Versions: 0.90.3
Environment: Linux testbox 2.6.18-238.el5 #1 SMP Sun Dec 19 14:22:44 EST 2010 x86_64 x86_64 x86_64 GNU/Linux
java version 1.6.0_23
Java(TM) SE Runtime Environment (build 1.6.0_23-b05)
Java HotSpot(TM) 64-Bit Server VM (build 19.0-b09, mixed mode)
Reporter: Jerry Du
Priority: Minor

I have a table with 3 ranges, and I scan the table across all 3 ranges:

{code}
Scan scan = new Scan();
scan.setCaching(10);
scan.setFilter(new PageFilter(21));
// result rows count = 63
{code}

The result has 63 rows; each range has been scanned and locally limited to page_size. That is the expected result. But if page_size = N * caching_size, the result has only page_size rows, and only the first range has been scanned. When page_size is a multiple of caching_size, one range's result exactly fills the caching buffer, and the client does NOT trigger the scan of the next range. Example:

{code}
Scan scan = new Scan();
scan.setCaching(10);
scan.setFilter(new PageFilter(20));
// result rows count = 20
{code}

--
This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
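The reported row counts can be reproduced with a deliberately simplified, self-contained model of the client-side scan loop (no HBase classes; all names are illustrative, and this does not model every detail of the real ClientScanner). The server applies the PageFilter per region, so each region returns at most page_size rows; the client fetches batches of at most `caching` rows and uses a naive stop rule: an empty batch means end of scan. When page_size is a multiple of caching, the first region's last batch is exactly full, the next fetch comes back empty, and the naive client stops instead of moving to the next region:

```java
// Simplified model of a client scanning pre-split regions with a
// server-side per-region page filter. Not HBase code; it reproduces the
// row counts reported above (63 vs 20) under the stated assumptions.
public class PageFilterScanModel {

    // Naive client: treats an empty batch as end of the whole scan.
    public static int naiveScan(int regions, int rowsPerRegion, int caching, int pageSize) {
        int total = 0, region = 0, servedInRegion = 0;
        while (region < regions) {
            // the server returns at most pageSize rows per region (the
            // PageFilter) and at most `caching` rows per round trip
            int allowed = Math.min(pageSize, rowsPerRegion) - servedInRegion;
            int batch = Math.min(caching, allowed);
            if (batch == 0) {
                break; // BUG: empty batch taken as end of the whole scan
            }
            total += batch;
            servedInRegion += batch;
            if (batch < caching) { // short batch: region exhausted, move on
                region++;
                servedInRegion = 0;
            }
        }
        return total;
    }

    // Fixed client: an empty batch only ends the current region.
    public static int fixedScan(int regions, int rowsPerRegion, int caching, int pageSize) {
        int total = 0, region = 0, servedInRegion = 0;
        while (region < regions) {
            int allowed = Math.min(pageSize, rowsPerRegion) - servedInRegion;
            int batch = Math.min(caching, allowed);
            total += batch;
            servedInRegion += batch;
            if (batch < caching) { // region exhausted (possibly with batch == 0)
                region++;
                servedInRegion = 0;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(naiveScan(3, 100, 10, 21)); // prints 63
        System.out.println(naiveScan(3, 100, 10, 20)); // prints 20: stops after region 0
        System.out.println(fixedScan(3, 100, 10, 20)); // prints 60: 20 rows per region
    }
}
```

With PageFilter(21) the last batch from each region is short (10 + 10 + 1), which signals a region boundary; with PageFilter(20) the batches are 10 + 10, exactly full, so the boundary signal never fires and the naive client quits at 20 rows.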
[jira] [Commented] (HBASE-2742) Provide strong authentication with a secure RPC engine
[ https://issues.apache.org/jira/browse/HBASE-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13109289#comment-13109289 ]

jirapos...@reviews.apache.org commented on HBASE-2742:
--

---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1991/
---

Review request for hbase.

Summary
---

This patch creates a new secure RPC engine for HBase, which provides Kerberos-based authentication of clients and a token-based authentication mechanism for mapreduce jobs. Primary components of the patch are:

- a new maven profile for secure Hadoop/HBase: hadoop-0.20S
  - Secure-Hadoop-dependent classes are separated under a pseudo-module in the security/ directory. These source and test directories are only included when building with the secure Hadoop profile.
  - Currently the security classes get packaged with the regular HBase build artifacts. We need a way to at least override project.version, so we can append something like a -security suffix indicating the additional security components.
  - The pseudo-module here is really a half-step forward. It enables the security code to be optionally included in the build for now, and sets up the structure for a security module. But we will still want to pursue full modularization (see HBASE-4336), which will allow packaging the security code in a separate build artifact.
- a new RPC engine providing Kerberos and token-based authentication: org.apache.hadoop.hbase.ipc.SecureRpcEngine
  - implementation under security/src/main/java/org/apache/hadoop/hbase/ipc/
  - The implementation classes extend the existing HBaseClient and HBaseServer to share as much of the RPC code as possible.
The main override is of the connection classes, to allow control over the SASL negotiation of secure connections.

- existing RPC changes
  - The existing HBaseClient and HBaseServer have been modified to make subclassing possible.
  - All references to Hadoop UserGroupInformation have been replaced with org.apache.hadoop.hbase.security.User to insulate from future dependencies on specific Hadoop versions.
- a coprocessor endpoint for obtaining new authentication tokens: TokenProvider, and supporting classes for token generation and synchronization (incorporating HBASE-3615)
  - implementation is under security/src/main/java/org/apache/hadoop/hbase/security/token/
  - Secret keys for token generation and verification are synchronized throughout the cluster in zookeeper, under /hbase/tokenauth/keys

To enable secure RPC, add the following configuration to hbase-site.xml:

{code}
<property>
  <name>hadoop.security.authorization</name>
  <value>true</value>
</property>
<property>
  <name>hadoop.security.authentication</name>
  <value>kerberos</value>
</property>
<property>
  <name>hbase.rpc.engine</name>
  <value>org.apache.hadoop.hbase.ipc.SecureRpcEngine</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.token.TokenProvider</value>
</property>
{code}

In addition, the master and regionserver processes must be configured for kerberos authentication using the properties:

* hbase.(master|regionserver).keytab.file
* hbase.(master|regionserver).kerberos.principal
* hbase.(master|regionserver).kerberos.https.principal

This addresses bug HBASE-2742.
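For illustration only, the kerberos keytab and principal properties named above might look like the following in hbase-site.xml for the master; the keytab path and principal names here are placeholders, not values from the patch (`_HOST` is the usual Hadoop-style per-host principal substitution):

```xml
<property>
  <name>hbase.master.keytab.file</name>
  <value>/etc/hbase/conf/hbase.keytab</value>
</property>
<property>
  <name>hbase.master.kerberos.principal</name>
  <value>hbase/_HOST@EXAMPLE.COM</value>
</property>
```

The corresponding hbase.regionserver.* properties would be set the same way on each regionserver.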
https://issues.apache.org/jira/browse/HBASE-2742

Diffs
-

conf/hbase-policy.xml PRE-CREATION
pom.xml 241973c
security/src/main/java/org/apache/hadoop/hbase/ipc/SecureClient.java PRE-CREATION
security/src/main/java/org/apache/hadoop/hbase/ipc/SecureConnectionHeader.java PRE-CREATION
security/src/main/java/org/apache/hadoop/hbase/ipc/SecureRpcEngine.java PRE-CREATION
security/src/main/java/org/apache/hadoop/hbase/ipc/SecureServer.java PRE-CREATION
security/src/main/java/org/apache/hadoop/hbase/ipc/Status.java PRE-CREATION
security/src/main/java/org/apache/hadoop/hbase/security/AccessDeniedException.java PRE-CREATION
security/src/main/java/org/apache/hadoop/hbase/security/HBasePolicyProvider.java PRE-CREATION
security/src/main/java/org/apache/hadoop/hbase/security/HBaseSaslRpcClient.java PRE-CREATION
security/src/main/java/org/apache/hadoop/hbase/security/HBaseSaslRpcServer.java PRE-CREATION
security/src/main/java/org/apache/hadoop/hbase/security/token/AuthenticationKey.java PRE-CREATION
security/src/main/java/org/apache/hadoop/hbase/security/token/AuthenticationProtocol.java PRE-CREATION
security/src/main/java/org/apache/hadoop/hbase/security/token/AuthenticationTokenIdentifier.java PRE-CREATION
security/src/main/java/org/apache/hadoop/hbase/security/token/AuthenticationTokenSecretManager.java PRE-CREATION