[jira] [Commented] (HBASE-4336) Convert source tree into maven modules

2011-09-20 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108389#comment-13108389
 ] 

Jesse Yates commented on HBASE-4336:


Did a bunch of the work for this already under HBASE-4438 - posted initial 
patch to review board. I wasn't sure which classes we wanted to move to which 
package, but set up at least the top-level hierarchies and did most of the PITA 
work of cleaning up the top-level pom.

Up on review board: https://reviews.apache.org/r/1965/

Right now, it's a skeleton so we can easily drop in the code we want, where we 
want it.

 Convert source tree into maven modules
 --

 Key: HBASE-4336
 URL: https://issues.apache.org/jira/browse/HBASE-4336
 Project: HBase
  Issue Type: Task
  Components: build
Reporter: Gary Helmling

 When we originally converted the build to maven we had a single core module 
 defined, but later reverted this to a module-less build for the sake of 
 simplicity.
 It now looks like it's time to re-address this, as we have an actual need for 
 modules to:
 * provide a trimmed down client library that applications can make use of
 * more cleanly support building against different versions of Hadoop, in 
 place of some of the reflection machinations currently required
 * incorporate the secure RPC engine that depends on some secure Hadoop classes
 I propose we start simply by refactoring into two initial modules:
 * core - common classes and utilities, and client-side code and interfaces
 * server - master and region server implementations and supporting code
 This would also lay the groundwork for incorporating the HBase security 
 features that have been developed.  Once the module structure is in place, 
 security-related features could then be incorporated into a third module -- 
 security -- after normal review and approval.  The security module could 
 then depend on secure Hadoop, without modifying the dependencies of the rest 
 of the HBase code.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4446) Rolling restart RSs scenario, regions could stay in OPENING state

2011-09-20 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108390#comment-13108390
 ] 

Todd Lipcon commented on HBASE-4446:


Waiting on ServerShutdownHandler may make sense for some states - eg if the 
region was CLOSING, we need to make sure that we split logs before we reassign. 
But I agree that for many other states (OPENING, FAILED_OPEN, CLOSED) we can 
handle the region regardless of whether the RS is online or not.

 Rolling restart RSs scenario, regions could stay in OPENING state
 -

 Key: HBASE-4446
 URL: https://issues.apache.org/jira/browse/HBASE-4446
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Ming Ma
Assignee: Ming Ma
 Fix For: 0.92.0

 Attachments: HBASE-4446-trunk.patch


 Keep Master up all the time, do rolling restart of RSs like this - stop RS1, 
 wait for 2 seconds, stop RS2, start RS1, wait for 2 seconds, stop RS3, start 
 RS2, wait for 2 seconds, etc. Region sometimes can just stay in OPENING state 
 even after timeoutmonitor period.
 2011-09-19 08:10:33,131 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: While timing out a region 
 in state OPENING, found ZK node in unexpected state: RS_ZK_REGION_FAILED_OPEN
 The issue: the RS was shut down while a region was being opened, so the region 
 was transitioned to RS_ZK_REGION_FAILED_OPEN in ZK. The timeout monitor didn't 
 take care of RS_ZK_REGION_FAILED_OPEN.
 processOpeningState
 ...
else if (dataInZNode.getEventType() != EventType.RS_ZK_REGION_OPENING) {
  LOG.warn("While timing out a region in state OPENING, "
      + "found ZK node in unexpected state: "
      + dataInZNode.getEventType());
  return;
}
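The fix described above - letting the timeout monitor handle RS_ZK_REGION_FAILED_OPEN instead of bailing out with a warning - can be sketched roughly as follows. This is a simplified illustration with stand-in types, not the actual AssignmentManager code:

```java
// Simplified stand-in for the HBase executor event types.
enum EventType { RS_ZK_REGION_OPENING, RS_ZK_REGION_FAILED_OPEN, RS_ZK_REGION_OPENED }

class OpeningTimeoutSketch {
    /**
     * Decide what the timeout monitor should do with a region stuck in OPENING.
     * Returns true when the region should be reassigned.
     */
    static boolean shouldReassign(EventType znodeState) {
        switch (znodeState) {
            case RS_ZK_REGION_OPENING:
                // Still opening: the monitor times it out and reassigns as before.
                return true;
            case RS_ZK_REGION_FAILED_OPEN:
                // The RS was shut down (or failed) mid-open; without this case the
                // region stays in OPENING forever, which is the bug reported here.
                return true;
            default:
                // e.g. OPENED: nothing for the timeout monitor to do.
                return false;
        }
    }
}
```

The key point is simply that FAILED_OPEN joins OPENING as a state the monitor acts on, rather than falling through to the "unexpected state" warning.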





[jira] [Commented] (HBASE-4446) Rolling restart RSs scenario, regions could stay in OPENING state

2011-09-20 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108396#comment-13108396
 ] 

ramkrishna.s.vasudevan commented on HBASE-4446:
---

+1. Nice analysis.
We need to dig in more to find whether any other corner scenarios like this come up. 

 Rolling restart RSs scenario, regions could stay in OPENING state
 -

 Key: HBASE-4446
 URL: https://issues.apache.org/jira/browse/HBASE-4446
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Ming Ma
Assignee: Ming Ma
 Fix For: 0.92.0

 Attachments: HBASE-4446-trunk.patch


 Keep Master up all the time, do rolling restart of RSs like this - stop RS1, 
 wait for 2 seconds, stop RS2, start RS1, wait for 2 seconds, stop RS3, start 
 RS2, wait for 2 seconds, etc. Region sometimes can just stay in OPENING state 
 even after timeoutmonitor period.
 2011-09-19 08:10:33,131 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: While timing out a region 
 in state OPENING, found ZK node in unexpected state: RS_ZK_REGION_FAILED_OPEN
 The issue: the RS was shut down while a region was being opened, so the region 
 was transitioned to RS_ZK_REGION_FAILED_OPEN in ZK. The timeout monitor didn't 
 take care of RS_ZK_REGION_FAILED_OPEN.
 processOpeningState
 ...
else if (dataInZNode.getEventType() != EventType.RS_ZK_REGION_OPENING) {
  LOG.warn("While timing out a region in state OPENING, "
      + "found ZK node in unexpected state: "
      + dataInZNode.getEventType());
  return;
}





[jira] [Updated] (HBASE-4327) Compile HBase against hadoop 0.22

2011-09-20 Thread Joep Rottinghuis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joep Rottinghuis updated HBASE-4327:


Assignee: Joep Rottinghuis

 Compile HBase against hadoop 0.22
 -

 Key: HBASE-4327
 URL: https://issues.apache.org/jira/browse/HBASE-4327
 Project: HBase
  Issue Type: Bug
  Components: build
Affects Versions: 0.92.0
Reporter: Joep Rottinghuis
Assignee: Joep Rottinghuis
 Fix For: 0.92.0

 Attachments: HBASE-4327-Michael.patch, HBASE-4327.patch, 
 HBASE-4327.patch, HBASE-4327.patch


 Pom contains a profile for hadoop-0.20 and one for hadoop-0.23, but not one 
 for hadoop-0.22.
 When overriding hadoop.version to 0.22, the (compile-time) dependency on 
 hadoop-annotations cannot be met; it exists in 0.23 and 0.24/trunk, but not in 
 0.22.





[jira] [Commented] (HBASE-4447) Allow hbase.version to be passed in as command-line argument

2011-09-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108439#comment-13108439
 ] 

Hudson commented on HBASE-4447:
---

Integrated in HBase-TRUNK #2238 (See 
[https://builds.apache.org/job/HBase-TRUNK/2238/])
HBASE-4447 Allow hbase.version to be passed in as command-line argument

stack : 
Files : 
* /hbase/trunk/pom.xml


 Allow hbase.version to be passed in as command-line argument
 

 Key: HBASE-4447
 URL: https://issues.apache.org/jira/browse/HBASE-4447
 Project: HBase
  Issue Type: Improvement
  Components: build
Affects Versions: 0.92.0
Reporter: Joep Rottinghuis
Assignee: Joep Rottinghuis
 Fix For: 0.92.0

 Attachments: HBASE-4447-0.92.patch


 Currently the build always produces the jars and tarball according to the 
 version baked into the POM.
 When we modify this to allow the version to be passed in as a command-line 
 argument, it can still default to the same behavior, yet give the flexibility 
 for an internal build to tag on its own version.





[jira] [Commented] (HBASE-4447) Allow hbase.version to be passed in as command-line argument

2011-09-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108495#comment-13108495
 ] 

Hudson commented on HBASE-4447:
---

Integrated in HBase-0.92 #5 (See [https://builds.apache.org/job/HBase-0.92/5/])
HBASE-4447 Allow hbase.version to be passed in as command-line argument

stack : 
Files : 
* /hbase/branches/0.92/pom.xml


 Allow hbase.version to be passed in as command-line argument
 

 Key: HBASE-4447
 URL: https://issues.apache.org/jira/browse/HBASE-4447
 Project: HBase
  Issue Type: Improvement
  Components: build
Affects Versions: 0.92.0
Reporter: Joep Rottinghuis
Assignee: Joep Rottinghuis
 Fix For: 0.92.0

 Attachments: HBASE-4447-0.92.patch


 Currently the build always produces the jars and tarball according to the 
 version baked into the POM.
 When we modify this to allow the version to be passed in as a command-line 
 argument, it can still default to the same behavior, yet give the flexibility 
 for an internal build to tag on its own version.





[jira] [Created] (HBASE-4448) HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility instances across unit tests

2011-09-20 Thread Doug Meil (JIRA)
HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility instances 
across unit tests
-

 Key: HBASE-4448
 URL: https://issues.apache.org/jira/browse/HBASE-4448
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor


Setting up and tearing down HBaseTestingUtility instances in unit tests is very 
expensive.  On my MacBook it takes about 10 seconds to set up a MiniCluster, 
and 7 seconds to tear it down.  When multiplied by the number of test classes 
that use this facility, that's a lot of time in the build.

This factory assumes that the JVM is being re-used across test classes in the 
build, otherwise this pattern won't work. 

I don't think this is appropriate for every use, but I think it can be 
applicable in a great many cases - especially where developers just want a 
simple MiniCluster with 1 slave.





[jira] [Commented] (HBASE-4352) Apply version of hbase-4015 to branch

2011-09-20 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108629#comment-13108629
 ] 

Ted Yu commented on HBASE-4352:
---

TestZKBasedOpenCloseRegion hangs with the patch applied.

 Apply version of hbase-4015 to branch
 -

 Key: HBASE-4352
 URL: https://issues.apache.org/jira/browse/HBASE-4352
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.90.5

 Attachments: HBASE-4352_0.90.patch


 Consider adding a version of hbase-4015 to 0.90.  It changes HRegionInterface, 
 so we would need to move the change to the end of the interface and then test 
 that it doesn't break rolling restart.





[jira] [Commented] (HBASE-4153) Handle RegionAlreadyInTransitionException in AssignmentManager

2011-09-20 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108631#comment-13108631
 ] 

Ted Yu commented on HBASE-4153:
---

Test suite had a few failures:
{code}
Failed tests:   
testBasicRollingRestart(org.apache.hadoop.hbase.master.TestRollingRestart): 
expected:<22> but was:<6>

Tests in error:
  
testRSAlreadyProcessingRegion(org.apache.hadoop.hbase.master.TestZKBasedOpenCloseRegion)
  
testFailedOpenRegion(org.apache.hadoop.hbase.regionserver.handler.TestOpenRegionHandler)
  
testFailedUpdateMeta(org.apache.hadoop.hbase.regionserver.handler.TestOpenRegionHandler)
{code}

 Handle RegionAlreadyInTransitionException in AssignmentManager
 --

 Key: HBASE-4153
 URL: https://issues.apache.org/jira/browse/HBASE-4153
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.0

 Attachments: 4153-v3.txt, HBASE-4153_1.patch, HBASE-4153_2.patch, 
 HBASE-4153_3.patch, HBASE-4153_4.patch, HBASE-4153_5.patch


 Comment from Stack over in HBASE-3741:
 {quote}
 Question: Looking at this patch again, if we throw a 
 RegionAlreadyInTransitionException, won't we just assign the region elsewhere 
 though RegionAlreadyInTransitionException in at least one case here is saying 
 that the region is already open on this regionserver?
 {quote}
 Indeed looking at the code it's going to be handled the same way other 
 exceptions are. Need to add special cases for assign and unassign.
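The special-casing the description calls for can be sketched in miniature as follows. This is a hypothetical illustration with stand-in types, not the actual AssignmentManager logic, which is considerably more involved:

```java
// Simplified stand-in for the HBase exception type.
class RegionAlreadyInTransitionException extends Exception {}

class AssignRetrySketch {
    enum Action { REASSIGN_ELSEWHERE, WAIT_ON_SAME_SERVER }

    // On a generic failure the master picks a new server; but when the server
    // reports the region is already in transition (possibly already open there),
    // reassigning elsewhere could double-assign the region, so we should wait
    // for the in-flight transition instead.
    static Action onAssignFailure(Exception e) {
        if (e instanceof RegionAlreadyInTransitionException) {
            return Action.WAIT_ON_SAME_SERVER;
        }
        return Action.REASSIGN_ELSEWHERE;
    }
}
```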





[jira] [Commented] (HBASE-4153) Handle RegionAlreadyInTransitionException in AssignmentManager

2011-09-20 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108659#comment-13108659
 ] 

ramkrishna.s.vasudevan commented on HBASE-4153:
---

I will check once again and will let you know in some time. 

 Handle RegionAlreadyInTransitionException in AssignmentManager
 --

 Key: HBASE-4153
 URL: https://issues.apache.org/jira/browse/HBASE-4153
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.0

 Attachments: 4153-v3.txt, HBASE-4153_1.patch, HBASE-4153_2.patch, 
 HBASE-4153_3.patch, HBASE-4153_4.patch, HBASE-4153_5.patch


 Comment from Stack over in HBASE-3741:
 {quote}
 Question: Looking at this patch again, if we throw a 
 RegionAlreadyInTransitionException, won't we just assign the region elsewhere 
 though RegionAlreadyInTransitionException in at least one case here is saying 
 that the region is already open on this regionserver?
 {quote}
 Indeed looking at the code it's going to be handled the same way other 
 exceptions are. Need to add special cases for assign and unassign.





[jira] [Commented] (HBASE-4153) Handle RegionAlreadyInTransitionException in AssignmentManager

2011-09-20 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108708#comment-13108708
 ] 

ramkrishna.s.vasudevan commented on HBASE-4153:
---

The TestOpenRegionHandler change in patch HBASE-4153_3.patch was not applied in 
the latest patch. But I had the changed code in my code base, hence those 
testcases passed.  In the other two testcases I don't find any errors. Correct 
me if I am wrong, Ted.  Thanks for your findings :)
{code}
Running org.apache.hadoop.hbase.master.TestRollingRestart
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 95.718 sec

Running org.apache.hadoop.hbase.master.TestZKBasedOpenCloseRegion
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 38.593 sec

Running org.apache.hadoop.hbase.regionserver.handler.TestOpenRegionHandler
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.825 sec

{code}

 Handle RegionAlreadyInTransitionException in AssignmentManager
 --

 Key: HBASE-4153
 URL: https://issues.apache.org/jira/browse/HBASE-4153
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.0

 Attachments: 4153-v3.txt, HBASE-4153_1.patch, HBASE-4153_2.patch, 
 HBASE-4153_3.patch, HBASE-4153_4.patch, HBASE-4153_5.patch


 Comment from Stack over in HBASE-3741:
 {quote}
 Question: Looking at this patch again, if we throw a 
 RegionAlreadyInTransitionException, won't we just assign the region elsewhere 
 though RegionAlreadyInTransitionException in at least one case here is saying 
 that the region is already open on this regionserver?
 {quote}
 Indeed looking at the code it's going to be handled the same way other 
 exceptions are. Need to add special cases for assign and unassign.





[jira] [Updated] (HBASE-4448) HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility instances across unit tests

2011-09-20 Thread Doug Meil (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-4448:
-

Attachment: HBaseTestingUtilityFactory.java

 HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility 
 instances across unit tests
 -

 Key: HBASE-4448
 URL: https://issues.apache.org/jira/browse/HBASE-4448
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
 Attachments: HBaseTestingUtilityFactory.java


 Setting up and tearing down HBaseTestingUtility instances in unit tests is 
 very expensive.  On my MacBook it takes about 10 seconds to set up a 
 MiniCluster, and 7 seconds to tear it down.  When multiplied by the number of 
 test classes that use this facility, that's a lot of time in the build.
 This factory assumes that the JVM is being re-used across test classes in the 
 build, otherwise this pattern won't work. 
 I don't think this is appropriate for every use, but I think it can be 
 applicable in a great many cases - especially where developers just want a 
 simple MiniCluster with 1 slave.





[jira] [Commented] (HBASE-4448) HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility instances across unit tests

2011-09-20 Thread Doug Meil (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108735#comment-13108735
 ] 

Doug Meil commented on HBASE-4448:
--

Attached a prototype of HBaseTestingUtilityFactory.  This is not ready for 
prime time yet, but I'd like to solicit comments on the general idea.  

Noted issues:  there needs to be a configurable wait period when the 
ref-counts get to zero.  That should be set from the build, but how?  System 
property?  The reason is that while this pattern can be useful for many cases, 
it won't be suitable for all.  Therefore, there could be periods of non-use 
when another test is running, and we don't want to be too aggressive in tearing 
down the instances, otherwise we'll be back where we started.
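The ref-count-plus-linger idea could look roughly like this. This is a hypothetical sketch, not the attached prototype; `MiniClusterHandle` is a plain stand-in for HBaseTestingUtility, and a real implementation would delay the teardown by the configurable wait period discussed above rather than shutting down immediately:

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for HBaseTestingUtility / a running MiniCluster.
class MiniClusterHandle {
    boolean running = true;
    void shutdown() { running = false; }
}

class TestingUtilityFactorySketch {
    private final Map<String, MiniClusterHandle> instances = new HashMap<>();
    private final Map<String, Integer> refCounts = new HashMap<>();

    // Hand out a shared instance per key, bumping its ref-count.
    synchronized MiniClusterHandle acquire(String key) {
        refCounts.merge(key, 1, Integer::sum);
        return instances.computeIfAbsent(key, k -> new MiniClusterHandle());
    }

    // Decrement the count; tear down only when it reaches zero. A real
    // implementation would first wait a configurable linger period here, so
    // a test class that starts a moment later can still reuse the instance.
    synchronized void release(String key) {
        int remaining = refCounts.merge(key, -1, Integer::sum);
        if (remaining <= 0) {
            refCounts.remove(key);
            MiniClusterHandle h = instances.remove(key);
            if (h != null) h.shutdown();
        }
    }
}
```

Two test classes acquiring the same key in the same JVM would then share one MiniCluster instead of paying the ~17-second setup/teardown cost each.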

 HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility 
 instances across unit tests
 -

 Key: HBASE-4448
 URL: https://issues.apache.org/jira/browse/HBASE-4448
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
 Attachments: HBaseTestingUtilityFactory.java


 Setting up and tearing down HBaseTestingUtility instances in unit tests is 
 very expensive.  On my MacBook it takes about 10 seconds to set up a 
 MiniCluster, and 7 seconds to tear it down.  When multiplied by the number of 
 test classes that use this facility, that's a lot of time in the build.
 This factory assumes that the JVM is being re-used across test classes in the 
 build, otherwise this pattern won't work. 
 I don't think this is appropriate for every use, but I think it can be 
 applicable in a great many cases - especially where developers just want a 
 simple MiniCluster with 1 slave.





[jira] [Updated] (HBASE-4153) Handle RegionAlreadyInTransitionException in AssignmentManager

2011-09-20 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-4153:
--

Attachment: HBASE-4153_6.patch

Did one small change.  The return type of getRegionsInTransitionInRS() in 
RegionServerServices has been changed to Map instead of ConcurrentSkipListMap, 
because it is good practice to return the super type in interfaces. This 
avoids the change in TestOpenRegionHandler.
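The general practice described here - declaring the interface return type as the super type while the implementation keeps the concrete collection - looks like this in miniature. Names and type parameters are illustrative, not the actual RegionServerServices signature:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

// The interface exposes only the general contract ...
interface RegionsInTransitionView {
    Map<String, Boolean> getRegionsInTransitionInRS();
}

// ... while the implementation is free to use a concrete sorted,
// concurrent map internally. Callers and test doubles only need to
// supply some Map, which is why returning the super type avoids
// churn in tests like TestOpenRegionHandler.
class RegionServerStub implements RegionsInTransitionView {
    private final ConcurrentSkipListMap<String, Boolean> rit =
        new ConcurrentSkipListMap<>();

    @Override
    public Map<String, Boolean> getRegionsInTransitionInRS() {
        return rit;
    }
}
```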

 Handle RegionAlreadyInTransitionException in AssignmentManager
 --

 Key: HBASE-4153
 URL: https://issues.apache.org/jira/browse/HBASE-4153
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.0

 Attachments: 4153-v3.txt, HBASE-4153_1.patch, HBASE-4153_2.patch, 
 HBASE-4153_3.patch, HBASE-4153_4.patch, HBASE-4153_5.patch, HBASE-4153_6.patch


 Comment from Stack over in HBASE-3741:
 {quote}
 Question: Looking at this patch again, if we throw a 
 RegionAlreadyInTransitionException, won't we just assign the region elsewhere 
 though RegionAlreadyInTransitionException in at least one case here is saying 
 that the region is already open on this regionserver?
 {quote}
 Indeed looking at the code it's going to be handled the same way other 
 exceptions are. Need to add special cases for assign and unassign.





[jira] [Updated] (HBASE-4153) Handle RegionAlreadyInTransitionException in AssignmentManager

2011-09-20 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-4153:
--

Status: Open  (was: Patch Available)

 Handle RegionAlreadyInTransitionException in AssignmentManager
 --

 Key: HBASE-4153
 URL: https://issues.apache.org/jira/browse/HBASE-4153
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.0

 Attachments: 4153-v3.txt, HBASE-4153_1.patch, HBASE-4153_2.patch, 
 HBASE-4153_3.patch, HBASE-4153_4.patch, HBASE-4153_5.patch, HBASE-4153_6.patch


 Comment from Stack over in HBASE-3741:
 {quote}
 Question: Looking at this patch again, if we throw a 
 RegionAlreadyInTransitionException, won't we just assign the region elsewhere 
 though RegionAlreadyInTransitionException in at least one case here is saying 
 that the region is already open on this regionserver?
 {quote}
 Indeed looking at the code it's going to be handled the same way other 
 exceptions are. Need to add special cases for assign and unassign.





[jira] [Commented] (HBASE-4387) Error while syncing: DFSOutputStream is closed

2011-09-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108818#comment-13108818
 ] 

Lars Hofhansl commented on HBASE-4387:
--

@Todd Do you think you could retry your test with this change?
(Maybe 100M rows would do too :) )

 Error while syncing: DFSOutputStream is closed
 --

 Key: HBASE-4387
 URL: https://issues.apache.org/jira/browse/HBASE-4387
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Priority: Critical
 Fix For: 0.92.0

 Attachments: 4387.txt, errors-with-context.txt


 In a billion-row load on ~25 servers, I see "Error while syncing" reasonably 
 often, with the error "DFSOutputStream is closed" around a roll. We have some 
 race where a roll at the same time as heavy inserts causes a problem.





[jira] [Updated] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME

2011-09-20 Thread Nate Putnam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nate Putnam updated HBASE-3421:
---

Status: Patch Available  (was: Open)

Added hbase.hstore.compaction.kv.max config option. 

 Very wide rows -- 30M plus -- cause us OOME
 ---

 Key: HBASE-3421
 URL: https://issues.apache.org/jira/browse/HBASE-3421
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.0
Reporter: stack
 Attachments: HBASE-3421.patch, HBASE-34211-v2.patch


 From the list, see 'jvm oom' in 
 http://mail-archives.apache.org/mod_mbox/hbase-user/201101.mbox/browser, it 
 looks like wide rows -- 30M or so -- cause OOME during compaction.  We 
 should check it out. Can the scanner used during compactions use the 'limit' 
 when nexting?  If so, this should save our OOME'ing (or, we need to add to 
 the next a max size rather than count of KVs).





[jira] [Updated] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME

2011-09-20 Thread Nate Putnam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nate Putnam updated HBASE-3421:
---

Attachment: HBASE-34211-v2.patch

 Very wide rows -- 30M plus -- cause us OOME
 ---

 Key: HBASE-3421
 URL: https://issues.apache.org/jira/browse/HBASE-3421
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.0
Reporter: stack
 Attachments: HBASE-3421.patch, HBASE-34211-v2.patch


 From the list, see 'jvm oom' in 
 http://mail-archives.apache.org/mod_mbox/hbase-user/201101.mbox/browser, it 
 looks like wide rows -- 30M or so -- cause OOME during compaction.  We 
 should check it out. Can the scanner used during compactions use the 'limit' 
 when nexting?  If so, this should save our OOME'ing (or, we need to add to 
 the next a max size rather than count of KVs).





[jira] [Updated] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME

2011-09-20 Thread Nate Putnam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nate Putnam updated HBASE-3421:
---

Attachment: HBASE-34211-v2.patch

 Very wide rows -- 30M plus -- cause us OOME
 ---

 Key: HBASE-3421
 URL: https://issues.apache.org/jira/browse/HBASE-3421
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.0
Reporter: stack
 Attachments: HBASE-3421.patch, HBASE-34211-v2.patch


 From the list, see 'jvm oom' in 
 http://mail-archives.apache.org/mod_mbox/hbase-user/201101.mbox/browser, it 
 looks like wide rows -- 30M or so -- cause OOME during compaction.  We 
 should check it out. Can the scanner used during compactions use the 'limit' 
 when nexting?  If so, this should save our OOME'ing (or, we need to add to 
 the next a max size rather than count of KVs).





[jira] [Commented] (HBASE-4437) Update hadoop in 0.92 (0.20.205?)

2011-09-20 Thread Jean-Daniel Cryans (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108837#comment-13108837
 ] 

Jean-Daniel Cryans commented on HBASE-4437:
---

Would make sense to work on 205; it can't be that far off from CDH3 anyway once 
it gets synced.

 Update hadoop in 0.92 (0.20.205?)
 -

 Key: HBASE-4437
 URL: https://issues.apache.org/jira/browse/HBASE-4437
 Project: HBase
  Issue Type: Task
Reporter: stack

 We ship with branch-0.20-append a few versions back from the tip.  If 205 
 comes out and hbase works on it, we should ship 0.92 with it (while also 
 ensuring it works w/ 0.22 and 0.23 branches).





[jira] [Commented] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME

2011-09-20 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108844#comment-13108844
 ] 

Ted Yu commented on HBASE-3421:
---

Patch v2 looks good.
Minor comment: the hard-coded 10 should be replaced with the new config option below:
{code}
+// Limit this to 10 to avoid OOME
{code}
The patch doesn't apply on TRUNK.
Please prepare another patch for 0.92/TRUNK.

Thanks
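The review point - read the limit from the new hbase.hstore.compaction.kv.max option instead of hard-coding 10 - could be applied roughly like this. This is a sketch with a minimal stand-in for Hadoop's Configuration class, not the actual Store/compaction code:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal stand-in for org.apache.hadoop.conf.Configuration.
class ConfSketch {
    private final Map<String, String> props = new HashMap<>();
    void set(String key, String value) { props.put(key, value); }
    int getInt(String key, int defaultValue) {
        String v = props.get(key);
        return v == null ? defaultValue : Integer.parseInt(v);
    }
}

class CompactionLimitSketch {
    // Instead of a hard-coded 10, take the max KVs fetched per next() call
    // during compaction from configuration, defaulting to 10 to preserve
    // the current behavior.
    static int compactionKVMax(ConfSketch conf) {
        return conf.getInt("hbase.hstore.compaction.kv.max", 10);
    }
}
```

Operators with very wide rows could then lower the value, and the default keeps today's memory footprint.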

 Very wide rows -- 30M plus -- cause us OOME
 ---

 Key: HBASE-3421
 URL: https://issues.apache.org/jira/browse/HBASE-3421
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.0
Reporter: stack
 Attachments: HBASE-3421.patch, HBASE-34211-v2.patch


 From the list, see 'jvm oom' in 
 http://mail-archives.apache.org/mod_mbox/hbase-user/201101.mbox/browser, it 
 looks like wide rows -- 30M or so -- cause OOME during compaction.  We 
 should check it out. Can the scanner used during compactions use the 'limit' 
 when nexting?  If so, this should save our OOME'ing (or, we need to add to 
 the next a max size rather than count of KVs).





[jira] [Commented] (HBASE-4437) Update hadoop in 0.92 (0.20.205?)

2011-09-20 Thread Gary Helmling (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108845#comment-13108845
 ] 

Gary Helmling commented on HBASE-4437:
--

Agree with targeting 205 as well.  Since it includes security, that would also 
mean security wouldn't need to override the Hadoop version for build.  The 
Hadoop security code in 205 does have some changes vs. what's in the current 
CDH3, but that won't make a difference for current HBase.

 Update hadoop in 0.92 (0.20.205?)
 -

 Key: HBASE-4437
 URL: https://issues.apache.org/jira/browse/HBASE-4437
 Project: HBase
  Issue Type: Task
Reporter: stack

 We ship with branch-0.20-append a few versions back from the tip.  If 205 
 comes out and hbase works on it, we should ship 0.92 with it (while also 
 ensuring it works w/ 0.22 and 0.23 branches).





[jira] [Updated] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME

2011-09-20 Thread Nate Putnam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nate Putnam updated HBASE-3421:
---

Attachment: (was: HBASE-34211-v3.patch)

 Very wide rows -- 30M plus -- cause us OOME
 ---

 Key: HBASE-3421
 URL: https://issues.apache.org/jira/browse/HBASE-3421
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.0
Reporter: stack
 Attachments: HBASE-3421.patch, HBASE-34211-v2.patch, 
 HBASE-34211-v3.patch


 From the list, see 'jvm oom' in 
 http://mail-archives.apache.org/mod_mbox/hbase-user/201101.mbox/browser, it 
 looks like wide rows -- 30M or so -- cause OOME during compaction.  We 
 should check it out. Can the scanner used during compactions use the 'limit' 
 when nexting?  If so, this should save our OOME'ing (or, we need to add to 
 the next a max size rather than count of KVs).
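The size-limit idea in the description can be sketched in a few lines. This is illustrative only -- the class, the 1 KB cap, and the int-array stand-in for KVs are made up, not HBase's InternalScanner API; the point is that batches are cut by accumulated bytes, not by a count of KVs, so one very wide row cannot be buffered whole.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch, not HBase's scanner API: cut each next() batch by
// accumulated byte size instead of by KV count, so a 30M-cell row cannot
// be buffered in one go during compaction.
public class SizeBoundedBatcher {
    static final long MAX_BATCH_BYTES = 1024; // hypothetical per-batch cap

    // Split a row's cell sizes into batches, cutting when the byte cap is hit.
    public static List<List<Integer>> batchBySize(int[] cellSizes) {
        List<List<Integer>> batches = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        long bytes = 0;
        for (int size : cellSizes) {
            current.add(size);
            bytes += size;
            if (bytes >= MAX_BATCH_BYTES) { // size limit, not KV-count limit
                batches.add(current);
                current = new ArrayList<>();
                bytes = 0;
            }
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        int[] wideRow = new int[100];
        Arrays.fill(wideRow, 100); // 100 cells of 100 bytes each
        System.out.println("batches: " + batchBySize(wideRow).size());
    }
}
```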





[jira] [Updated] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME

2011-09-20 Thread Nate Putnam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nate Putnam updated HBASE-3421:
---

Attachment: HBASE-34211-v3.patch

Clarify code comment. Grant a license this time :) 

 Very wide rows -- 30M plus -- cause us OOME
 ---

 Key: HBASE-3421
 URL: https://issues.apache.org/jira/browse/HBASE-3421
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.0
Reporter: stack
 Attachments: HBASE-3421.patch, HBASE-34211-v2.patch, 
 HBASE-34211-v3.patch


 From the list, see 'jvm oom' in 
 http://mail-archives.apache.org/mod_mbox/hbase-user/201101.mbox/browser: it 
 looks like wide rows -- 30M or so -- cause OOME during compaction.  We 
 should check it out. Can the scanner used during compactions use the 'limit' 
 when nexting?  If so, this should save us from OOME'ing (or we need to add to 
 next() a max size rather than a count of KVs).





[jira] [Updated] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME

2011-09-20 Thread Nate Putnam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nate Putnam updated HBASE-3421:
---

Attachment: HBASE-34211-v3.patch

Clarify code comment. 

 Very wide rows -- 30M plus -- cause us OOME
 ---

 Key: HBASE-3421
 URL: https://issues.apache.org/jira/browse/HBASE-3421
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.0
Reporter: stack
 Attachments: HBASE-3421.patch, HBASE-34211-v2.patch, 
 HBASE-34211-v3.patch


 From the list, see 'jvm oom' in 
 http://mail-archives.apache.org/mod_mbox/hbase-user/201101.mbox/browser: it 
 looks like wide rows -- 30M or so -- cause OOME during compaction.  We 
 should check it out. Can the scanner used during compactions use the 'limit' 
 when nexting?  If so, this should save us from OOME'ing (or we need to add to 
 next() a max size rather than a count of KVs).





[jira] [Commented] (HBASE-4153) Handle RegionAlreadyInTransitionException in AssignmentManager

2011-09-20 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13108853#comment-13108853
 ] 

Ted Yu commented on HBASE-4153:
---

I ran the test suite on 0.92 and saw two failures:
{code}
Failed tests:   
testRowRange(org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol): 
Results should contain region 
test,ccc,1316534680968.20178584e985d7c9300aa37d3fa249b9. for row 'ccc'

Tests in error:
  
testRSAlreadyProcessingRegion(org.apache.hadoop.hbase.master.TestZKBasedOpenCloseRegion)
{code}
Both of them passed when run standalone.

+1 on patch v6.

 Handle RegionAlreadyInTransitionException in AssignmentManager
 --

 Key: HBASE-4153
 URL: https://issues.apache.org/jira/browse/HBASE-4153
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.0

 Attachments: 4153-v3.txt, HBASE-4153_1.patch, HBASE-4153_2.patch, 
 HBASE-4153_3.patch, HBASE-4153_4.patch, HBASE-4153_5.patch, HBASE-4153_6.patch


 Comment from Stack over in HBASE-3741:
 {quote}
 Question: Looking at this patch again, if we throw a 
 RegionAlreadyInTransitionException, won't we just assign the region elsewhere 
 though RegionAlreadyInTransitionException in at least one case here is saying 
 that the region is already open on this regionserver?
 {quote}
 Indeed looking at the code it's going to be handled the same way other 
 exceptions are. Need to add special cases for assign and unassign.





[jira] [Commented] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME

2011-09-20 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13108854#comment-13108854
 ] 

Ted Yu commented on HBASE-3421:
---

Patch v3 included changes to HBaseConfiguration.java.
If I committed v3, Todd would kill me :-)

 Very wide rows -- 30M plus -- cause us OOME
 ---

 Key: HBASE-3421
 URL: https://issues.apache.org/jira/browse/HBASE-3421
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.0
Reporter: stack
 Attachments: HBASE-3421.patch, HBASE-34211-v2.patch, 
 HBASE-34211-v3.patch


 From the list, see 'jvm oom' in 
 http://mail-archives.apache.org/mod_mbox/hbase-user/201101.mbox/browser: it 
 looks like wide rows -- 30M or so -- cause OOME during compaction.  We 
 should check it out. Can the scanner used during compactions use the 'limit' 
 when nexting?  If so, this should save us from OOME'ing (or we need to add to 
 next() a max size rather than a count of KVs).





[jira] [Commented] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME

2011-09-20 Thread Nate Putnam (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13108856#comment-13108856
 ] 

Nate Putnam commented on HBASE-3421:


Sorry about that. My mistake; I should be more careful when creating patches. 
v4 is the winner. Thanks for your patience. 

 Very wide rows -- 30M plus -- cause us OOME
 ---

 Key: HBASE-3421
 URL: https://issues.apache.org/jira/browse/HBASE-3421
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.0
Reporter: stack
 Attachments: HBASE-3421.patch, HBASE-34211-v2.patch, 
 HBASE-34211-v3.patch


 From the list, see 'jvm oom' in 
 http://mail-archives.apache.org/mod_mbox/hbase-user/201101.mbox/browser: it 
 looks like wide rows -- 30M or so -- cause OOME during compaction.  We 
 should check it out. Can the scanner used during compactions use the 'limit' 
 when nexting?  If so, this should save us from OOME'ing (or we need to add to 
 next() a max size rather than a count of KVs).





[jira] [Updated] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME

2011-09-20 Thread Nate Putnam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nate Putnam updated HBASE-3421:
---

Attachment: HBASE-34211-v4.patch

 Very wide rows -- 30M plus -- cause us OOME
 ---

 Key: HBASE-3421
 URL: https://issues.apache.org/jira/browse/HBASE-3421
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.0
Reporter: stack
 Attachments: HBASE-3421.patch, HBASE-34211-v2.patch, 
 HBASE-34211-v3.patch, HBASE-34211-v4.patch


 From the list, see 'jvm oom' in 
 http://mail-archives.apache.org/mod_mbox/hbase-user/201101.mbox/browser: it 
 looks like wide rows -- 30M or so -- cause OOME during compaction.  We 
 should check it out. Can the scanner used during compactions use the 'limit' 
 when nexting?  If so, this should save us from OOME'ing (or we need to add to 
 next() a max size rather than a count of KVs).





[jira] [Commented] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME

2011-09-20 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13108864#comment-13108864
 ] 

Ted Yu commented on HBASE-3421:
---

+1 on patch v4.

 Very wide rows -- 30M plus -- cause us OOME
 ---

 Key: HBASE-3421
 URL: https://issues.apache.org/jira/browse/HBASE-3421
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.0
Reporter: stack
 Attachments: HBASE-3421.patch, HBASE-34211-v2.patch, 
 HBASE-34211-v3.patch, HBASE-34211-v4.patch


 From the list, see 'jvm oom' in 
 http://mail-archives.apache.org/mod_mbox/hbase-user/201101.mbox/browser: it 
 looks like wide rows -- 30M or so -- cause OOME during compaction.  We 
 should check it out. Can the scanner used during compactions use the 'limit' 
 when nexting?  If so, this should save us from OOME'ing (or we need to add to 
 next() a max size rather than a count of KVs).





[jira] [Commented] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME

2011-09-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13108872#comment-13108872
 ] 

stack commented on HBASE-3421:
--

+1 on patch.

 Very wide rows -- 30M plus -- cause us OOME
 ---

 Key: HBASE-3421
 URL: https://issues.apache.org/jira/browse/HBASE-3421
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.0
Reporter: stack
 Attachments: HBASE-3421.patch, HBASE-34211-v2.patch, 
 HBASE-34211-v3.patch, HBASE-34211-v4.patch


 From the list, see 'jvm oom' in 
 http://mail-archives.apache.org/mod_mbox/hbase-user/201101.mbox/browser: it 
 looks like wide rows -- 30M or so -- cause OOME during compaction.  We 
 should check it out. Can the scanner used during compactions use the 'limit' 
 when nexting?  If so, this should save us from OOME'ing (or we need to add to 
 next() a max size rather than a count of KVs).





[jira] [Commented] (HBASE-4448) HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility instances across unit tests

2011-09-20 Thread Doug Meil (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13108895#comment-13108895
 ] 

Doug Meil commented on HBASE-4448:
--

The timeout was intended to start once the usage count went to zero, so I 
think we're generally talking about the same idea.  How do we pass this 
variable from the build?  A system property?

 HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility 
 instances across unit tests
 -

 Key: HBASE-4448
 URL: https://issues.apache.org/jira/browse/HBASE-4448
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
 Attachments: HBaseTestingUtilityFactory.java


 Setting up and tearing down HBaseTestingUtility instances in unit tests is 
 very expensive.  On my MacBook it takes about 10 seconds to set up a 
 MiniCluster, and 7 seconds to tear it down.  When multiplied by the number of 
 test classes that use this facility, that's a lot of time in the build.
 This factory assumes that the JVM is being re-used across test classes in the 
 build, otherwise this pattern won't work. 
 I don't think this is appropriate for every use, but I think it can be 
 applicable in a great many cases - especially where developers just want a 
 simple MiniCluster with 1 slave.
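The "pass it from the build" question that comes up in the comments could be answered with a plain system property set via Maven's -D flag. A minimal sketch, assuming a made-up property name (this is not an actual HBase config key):

```java
// Illustrative sketch: gate HBaseTestingUtility reuse on a system property
// passed from the build, e.g. mvn test -Dhbase.test.cluster.reuse=true.
// The property name is hypothetical, invented for this example.
public class ClusterReuseConfig {
    static final String REUSE_PROP = "hbase.test.cluster.reuse"; // hypothetical key

    // Default to the safe behavior: a fresh MiniCluster per test class.
    public static boolean reuseEnabled() {
        return Boolean.parseBoolean(System.getProperty(REUSE_PROP, "false"));
    }

    public static void main(String[] args) {
        System.out.println("reuse clusters? " + reuseEnabled());
    }
}
```

A factory could consult this flag before deciding whether to hand back a cached utility instance or spin up a new one.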





[jira] [Commented] (HBASE-4448) HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility instances across unit tests

2011-09-20 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13108908#comment-13108908
 ] 

Jesse Yates commented on HBASE-4448:


I was more worried about people not cleaning up their tests properly and 
leaving the cluster hanging around. But I guess we can just assume that they do 
it right?

We could make it a system property (maybe settable via Maven at run time) or do 
it with a special test-config.xml.

 HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility 
 instances across unit tests
 -

 Key: HBASE-4448
 URL: https://issues.apache.org/jira/browse/HBASE-4448
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
 Attachments: HBaseTestingUtilityFactory.java


 Setting up and tearing down HBaseTestingUtility instances in unit tests is 
 very expensive.  On my MacBook it takes about 10 seconds to set up a 
 MiniCluster, and 7 seconds to tear it down.  When multiplied by the number of 
 test classes that use this facility, that's a lot of time in the build.
 This factory assumes that the JVM is being re-used across test classes in the 
 build, otherwise this pattern won't work. 
 I don't think this is appropriate for every use, but I think it can be 
 applicable in a great many cases - especially where developers just want a 
 simple MiniCluster with 1 slave.





[jira] [Commented] (HBASE-4352) Apply version of hbase-4015 to branch

2011-09-20 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13108906#comment-13108906
 ] 

ramkrishna.s.vasudevan commented on HBASE-4352:
---

Testing was done today.  Out of 4000 regions, 5 had inconsistencies reported 
by HBCK.  Trying to figure out the reason, but it may not be due to the 
timeoutmonitor changes.  
Of the 5, one is a double assignment, 
and for the other 4 the RS hosting them is different from the one in META.
So tomorrow I will dig in deeper and find out whether the timeoutmonitor changes 
were the root cause or some existing flow is causing this inconsistency.  But no 
regions are stuck in RIT, which is assured. :)

 Apply version of hbase-4015 to branch
 -

 Key: HBASE-4352
 URL: https://issues.apache.org/jira/browse/HBASE-4352
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.90.5

 Attachments: HBASE-4352_0.90.patch


 Consider adding a version of hbase-4015 to 0.90.  It changes HRegionInterface, 
 so we would need to move the change to the end of the interface and then test 
 that it doesn't break rolling restart.





[jira] [Created] (HBASE-4449) LoadIncrementalHFiles can't handle CFs with blooms

2011-09-20 Thread David Revell (JIRA)
LoadIncrementalHFiles can't handle CFs with blooms
--

 Key: HBASE-4449
 URL: https://issues.apache.org/jira/browse/HBASE-4449
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.4
Reporter: David Revell


When LoadIncrementalHFiles loads a store file that crosses region boundaries, 
it will split the file at the boundary to create two store files. If the store 
file is for a column family that has a bloom filter, then a 
java.lang.ArithmeticException: / by zero will be raised because 
ByteBloomFilter() is called with maxKeys of 0.

The included patch assumes that the number of keys in each split child will be 
equal to the number of keys in the parent's bloom filter (instead of 0). This 
is an overestimate, but it's safe and easy.
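The failure mode and the fix can be sketched in a few lines. The method names below are made up for illustration and do not match HBase's actual ByteBloomFilter API:

```java
// Illustrative sketch of the bug and fix described above; names are
// hypothetical and do not match HBase's ByteBloomFilter constructor.
public class BloomSizing {
    // Bloom sizing derives bits-per-key by dividing by maxKeys, so a
    // maxKeys of 0 raises java.lang.ArithmeticException: / by zero.
    public static long bitsPerKey(long totalBits, long maxKeys) {
        return totalBits / maxKeys; // throws when maxKeys == 0
    }

    // The fix: when a split child's key count is unknown (0), fall back
    // to the parent's bloom key count -- an overestimate, but safe.
    public static long safeMaxKeys(long childKeys, long parentBloomKeys) {
        return childKeys > 0 ? childKeys : parentBloomKeys;
    }

    public static void main(String[] args) {
        // With the fallback, sizing for a split child no longer divides by zero.
        System.out.println(bitsPerKey(50000, safeMaxKeys(0, 5000)));
    }
}
```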





[jira] [Updated] (HBASE-4449) LoadIncrementalHFiles can't handle CFs with blooms

2011-09-20 Thread David Revell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Revell updated HBASE-4449:


Attachment: HBASE-4449.patch

 LoadIncrementalHFiles can't handle CFs with blooms
 --

 Key: HBASE-4449
 URL: https://issues.apache.org/jira/browse/HBASE-4449
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.4
Reporter: David Revell
 Attachments: HBASE-4449.patch


 When LoadIncrementalHFiles loads a store file that crosses region boundaries, 
 it will split the file at the boundary to create two store files. If the 
 store file is for a column family that has a bloom filter, then a 
 java.lang.ArithmeticException: / by zero will be raised because 
 ByteBloomFilter() is called with maxKeys of 0.
 The included patch assumes that the number of keys in each split child will 
 be equal to the number of keys in the parent's bloom filter (instead of 0). 
 This is an overestimate, but it's safe and easy.





[jira] [Commented] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME

2011-09-20 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13108979#comment-13108979
 ] 

Ted Yu commented on HBASE-3421:
---

Integrated into 0.90.5, 0.92 and TRUNK.

Thanks for the patch Nate.

Thanks for the review Michael.

 Very wide rows -- 30M plus -- cause us OOME
 ---

 Key: HBASE-3421
 URL: https://issues.apache.org/jira/browse/HBASE-3421
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.0
Reporter: stack
Assignee: Nate Putnam
 Fix For: 0.90.5

 Attachments: HBASE-3421.patch, HBASE-34211-v2.patch, 
 HBASE-34211-v3.patch, HBASE-34211-v4.patch


 From the list, see 'jvm oom' in 
 http://mail-archives.apache.org/mod_mbox/hbase-user/201101.mbox/browser: it 
 looks like wide rows -- 30M or so -- cause OOME during compaction.  We 
 should check it out. Can the scanner used during compactions use the 'limit' 
 when nexting?  If so, this should save us from OOME'ing (or we need to add to 
 next() a max size rather than a count of KVs).





[jira] [Updated] (HBASE-4449) LoadIncrementalHFiles can't handle CFs with blooms

2011-09-20 Thread David Revell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Revell updated HBASE-4449:


Attachment: HBASE-4449-v2.patch

Patch v2 fixes test failures and adds new test cases.

 LoadIncrementalHFiles can't handle CFs with blooms
 --

 Key: HBASE-4449
 URL: https://issues.apache.org/jira/browse/HBASE-4449
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.4
Reporter: David Revell
 Attachments: HBASE-4449-v2.patch, HBASE-4449.patch


 When LoadIncrementalHFiles loads a store file that crosses region boundaries, 
 it will split the file at the boundary to create two store files. If the 
 store file is for a column family that has a bloom filter, then a 
 java.lang.ArithmeticException: / by zero will be raised because 
 ByteBloomFilter() is called with maxKeys of 0.
 The included patch assumes that the number of keys in each split child will 
 be equal to the number of keys in the parent's bloom filter (instead of 0). 
 This is an overestimate, but it's safe and easy.





[jira] [Commented] (HBASE-4449) LoadIncrementalHFiles can't handle CFs with blooms

2011-09-20 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13108987#comment-13108987
 ] 

Ted Yu commented on HBASE-4449:
---

@David:
I tried to run the new tests without the change to LoadIncrementalHFiles and 
they passed.
Are you able to refine the new tests so that they fail on the current codebase?

Thanks

 LoadIncrementalHFiles can't handle CFs with blooms
 --

 Key: HBASE-4449
 URL: https://issues.apache.org/jira/browse/HBASE-4449
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.4
Reporter: David Revell
 Attachments: HBASE-4449-v2.patch, HBASE-4449.patch


 When LoadIncrementalHFiles loads a store file that crosses region boundaries, 
 it will split the file at the boundary to create two store files. If the 
 store file is for a column family that has a bloom filter, then a 
 java.lang.ArithmeticException: / by zero will be raised because 
 ByteBloomFilter() is called with maxKeys of 0.
 The included patch assumes that the number of keys in each split child will 
 be equal to the number of keys in the parent's bloom filter (instead of 0). 
 This is an overestimate, but it's safe and easy.





[jira] [Updated] (HBASE-4439) Move ClientScanner out of HTable

2011-09-20 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-4439:
-

Attachment: 4439.txt

1st cut. Very simple change; mostly just moves some code around.

* Simply moves HTable.ClientScanner to its own top-level class. ClientScanner is 
now usable without an instance of an HTable.
* HTable.getScanner(scan) now clones the scan; previously the scan object 
was actually modified inside the ClientScanner.
* Some config options (maxScannerResultSize, scannerTimeout) are moved from 
HTable to ClientScanner.
* Deprecates HTable.{get|set}ScannerCaching, so that scannerCaching can also be 
removed from HTable. Caching should be set through the scan object or the 
cluster-wide config option instead.
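The cloning point is worth a tiny sketch, since it changes observable behavior: the caller's Scan object is no longer mutated by the scanner. This is illustrative only; MiniScan stands in for org.apache.hadoop.hbase.client.Scan, and the field and values are invented:

```java
// Illustrative sketch of the defensive copy in getScanner(scan): the scanner
// adjusts its own clone, leaving the caller's scan object untouched.
// MiniScan is a hypothetical stand-in for org.apache.hadoop.hbase.client.Scan.
public class ScanCloneDemo {
    public static class MiniScan {
        public int caching;
        public MiniScan(int caching) { this.caching = caching; }
        public MiniScan copy() { return new MiniScan(caching); } // defensive copy
    }

    // Clone first, then apply scanner-internal adjustments to the clone only.
    public static MiniScan openScanner(MiniScan scan) {
        MiniScan clone = scan.copy();
        clone.caching = 100; // e.g. the scanner applies its own caching setting
        return clone;
    }

    public static void main(String[] args) {
        MiniScan user = new MiniScan(1);
        MiniScan internal = openScanner(user);
        System.out.println(user.caching + " vs " + internal.caching);
    }
}
```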


 Move ClientScanner out of HTable
 

 Key: HBASE-4439
 URL: https://issues.apache.org/jira/browse/HBASE-4439
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.94.0
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Minor
 Fix For: 0.94.0

 Attachments: 4439.txt


 See HBASE-1935 for motivation.
 ClientScanner should be able to exist outside of HTable.
 While we're at it, we can also add an abstract client scanner to ease 
 development of new client-side scanners (such as parallel scanners, or 
 per-region scanners).





[jira] [Commented] (HBASE-4439) Move ClientScanner out of HTable

2011-09-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13109005#comment-13109005
 ] 

Lars Hofhansl commented on HBASE-4439:
--

Hit submit too early...

* Some config options (maxScannerResultSize, scannerTimeout) are moved from 
HTable to ClientScanner.
* ClientScanner uses a static logger.

We could consider abstracting the useful parts into a helper class if we foresee 
that writing new client scanners will be a common client-side task.

 Move ClientScanner out of HTable
 

 Key: HBASE-4439
 URL: https://issues.apache.org/jira/browse/HBASE-4439
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.94.0
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Minor
 Fix For: 0.94.0

 Attachments: 4439.txt


 See HBASE-1935 for motivation.
 ClientScanner should be able to exist outside of HTable.
 While we're at it, we can also add an abstract client scanner to ease 
 development of new client-side scanners (such as parallel scanners, or 
 per-region scanners).





[jira] [Assigned] (HBASE-4387) Error while syncing: DFSOutputStream is closed

2011-09-20 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl reassigned HBASE-4387:


Assignee: Lars Hofhansl

 Error while syncing: DFSOutputStream is closed
 --

 Key: HBASE-4387
 URL: https://issues.apache.org/jira/browse/HBASE-4387
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.92.0

 Attachments: 4387.txt, errors-with-context.txt


 In a billion-row load on ~25 servers, I see Error while syncing reasonably 
 often, with the error DFSOutputStream is closed, around a log roll. We have 
 some race where a roll at the same time as heavy inserts causes a problem.





[jira] [Commented] (HBASE-4387) Error while syncing: DFSOutputStream is closed

2011-09-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13109013#comment-13109013
 ] 

Lars Hofhansl commented on HBASE-4387:
--

Assigned to me so this has an owner.

 Error while syncing: DFSOutputStream is closed
 --

 Key: HBASE-4387
 URL: https://issues.apache.org/jira/browse/HBASE-4387
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.92.0

 Attachments: 4387.txt, errors-with-context.txt


 In a billion-row load on ~25 servers, I see Error while syncing reasonably 
 often, with the error DFSOutputStream is closed, around a log roll. We have 
 some race where a roll at the same time as heavy inserts causes a problem.





[jira] [Commented] (HBASE-4449) LoadIncrementalHFiles can't handle CFs with blooms

2011-09-20 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13109016#comment-13109016
 ] 

Ted Yu commented on HBASE-4449:
---

For HFileV2, maxBloomEntries is optional. That's why the test passed in TRUNK.

 LoadIncrementalHFiles can't handle CFs with blooms
 --

 Key: HBASE-4449
 URL: https://issues.apache.org/jira/browse/HBASE-4449
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.4
Reporter: David Revell
 Attachments: HBASE-4449-v2.patch, HBASE-4449.patch


 When LoadIncrementalHFiles loads a store file that crosses region boundaries, 
 it will split the file at the boundary to create two store files. If the 
 store file is for a column family that has a bloom filter, then a 
 java.lang.ArithmeticException: / by zero will be raised because 
 ByteBloomFilter() is called with maxKeys of 0.
 The included patch assumes that the number of keys in each split child will 
 be equal to the number of keys in the parent's bloom filter (instead of 0). 
 This is an overestimate, but it's safe and easy.





[jira] [Commented] (HBASE-4449) LoadIncrementalHFiles can't handle CFs with blooms

2011-09-20 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13109018#comment-13109018
 ] 

Ted Yu commented on HBASE-4449:
---

+1 on patch.

 LoadIncrementalHFiles can't handle CFs with blooms
 --

 Key: HBASE-4449
 URL: https://issues.apache.org/jira/browse/HBASE-4449
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.4
Reporter: David Revell
 Attachments: HBASE-4449-v2.patch, HBASE-4449.patch


 When LoadIncrementalHFiles loads a store file that crosses region boundaries, 
 it will split the file at the boundary to create two store files. If the 
 store file is for a column family that has a bloom filter, then a 
 java.lang.ArithmeticException: / by zero will be raised because 
 ByteBloomFilter() is called with maxKeys of 0.
 The included patch assumes that the number of keys in each split child will 
 be equal to the number of keys in the parent's bloom filter (instead of 0). 
 This is an overestimate, but it's safe and easy.





[jira] [Commented] (HBASE-4448) HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility instances across unit tests

2011-09-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13109022#comment-13109022
 ] 

stack commented on HBASE-4448:
--

How would we pass this factory from test to test?

How is this different from a fat class of tests that has a @Before that spins 
up the cluster and then an @After to shut it down as TestAdmin or 
TestFromClientSide do currently?

 HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility 
 instances across unit tests
 -

 Key: HBASE-4448
 URL: https://issues.apache.org/jira/browse/HBASE-4448
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
 Attachments: HBaseTestingUtilityFactory.java


 Setting up and tearing down HBaseTestingUtility instances in unit tests is 
 very expensive.  On my MacBook it takes about 10 seconds to set up a 
 MiniCluster, and 7 seconds to tear it down.  When multiplied by the number of 
 test classes that use this facility, that's a lot of time in the build.
 This factory assumes that the JVM is being re-used across test classes in the 
 build, otherwise this pattern won't work. 
 I don't think this is appropriate for every use, but I think it can be 
 applicable in a great many cases - especially where developers just want a 
 simple MiniCluster with 1 slave.





[jira] [Updated] (HBASE-4449) LoadIncrementalHFiles can't handle CFs with blooms

2011-09-20 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4449:
--

Fix Version/s: 0.90.5

 LoadIncrementalHFiles can't handle CFs with blooms
 --

 Key: HBASE-4449
 URL: https://issues.apache.org/jira/browse/HBASE-4449
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.4
Reporter: David Revell
Assignee: David Revell
 Fix For: 0.90.5

 Attachments: HBASE-4449-v2.patch, HBASE-4449.patch


 When LoadIncrementalHFiles loads a store file that crosses region boundaries, 
 it will split the file at the boundary to create two store files. If the 
 store file is for a column family that has a bloom filter, then a 
 java.lang.ArithmeticException: / by zero will be raised because 
 ByteBloomFilter() is called with maxKeys of 0.
 The included patch assumes that the number of keys in each split child will 
 be equal to the number of keys in the parent's bloom filter (instead of 0). 
 This is an overestimate, but it's safe and easy.





[jira] [Assigned] (HBASE-4449) LoadIncrementalHFiles can't handle CFs with blooms

2011-09-20 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4449:
-

Assignee: David Revell

 LoadIncrementalHFiles can't handle CFs with blooms
 --

 Key: HBASE-4449
 URL: https://issues.apache.org/jira/browse/HBASE-4449
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.4
Reporter: David Revell
Assignee: David Revell
 Fix For: 0.90.5

 Attachments: HBASE-4449-v2.patch, HBASE-4449.patch


 When LoadIncrementalHFiles loads a store file that crosses region boundaries, 
 it will split the file at the boundary to create two store files. If the 
 store file is for a column family that has a bloom filter, then a 
 java.lang.ArithmeticException: / by zero will be raised because 
 ByteBloomFilter() is called with maxKeys of 0.
 The included patch assumes that the number of keys in each split child will 
 be equal to the number of keys in the parent's bloom filter (instead of 0). 
 This is an overestimate, but it's safe and easy.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4449) LoadIncrementalHFiles can't handle CFs with blooms

2011-09-20 Thread David Revell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Revell updated HBASE-4449:


Attachment: HBASE-4449-trunk-testsonly.patch

Sorry Ted, I should have realized that my patch was only against 0.90.

Current state: 
 - HBASE-4449-v2.patch applies to 0.90 branch, and was +1'ed by Ted.

 - HBASE-4449-trunk-testsonly.patch is just now being uploaded and includes 
only test changes for bloom filter CFs. It hasn't been +1'ed by anyone yet.

 LoadIncrementalHFiles can't handle CFs with blooms
 --

 Key: HBASE-4449
 URL: https://issues.apache.org/jira/browse/HBASE-4449
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.4
Reporter: David Revell
Assignee: David Revell
 Fix For: 0.90.5

 Attachments: HBASE-4449-trunk-testsonly.patch, HBASE-4449-v2.patch, 
 HBASE-4449.patch


 When LoadIncrementalHFiles loads a store file that crosses region boundaries, 
 it will split the file at the boundary to create two store files. If the 
 store file is for a column family that has a bloom filter, then a 
 java.lang.ArithmeticException: / by zero will be raised because 
 ByteBloomFilter() is called with maxKeys of 0.
 The included patch assumes that the number of keys in each split child will 
 be equal to the number of keys in the parent's bloom filter (instead of 0). 
 This is an overestimate, but it's safe and easy.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME

2011-09-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13109026#comment-13109026
 ] 

Hudson commented on HBASE-3421:
---

Integrated in HBase-TRUNK #2239 (See 
[https://builds.apache.org/job/HBase-TRUNK/2239/])
HBASE-3421  Very wide rows -- 30M plus -- cause us OOME (Nate Putnam)

tedyu : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java


 Very wide rows -- 30M plus -- cause us OOME
 ---

 Key: HBASE-3421
 URL: https://issues.apache.org/jira/browse/HBASE-3421
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.0
Reporter: stack
Assignee: Nate Putnam
 Fix For: 0.90.5

 Attachments: HBASE-3421.patch, HBASE-34211-v2.patch, 
 HBASE-34211-v3.patch, HBASE-34211-v4.patch


 From the list, see 'jvm oom' in 
 http://mail-archives.apache.org/mod_mbox/hbase-user/201101.mbox/browser, it 
 looks like wide rows -- 30M or so -- causes OOME during compaction.  We 
 should check it out. Can the scanner used during compactions use the 'limit' 
 when nexting?  If so, this should save our OOME'ing (or, we need to add to 
 the next a max size rather than count of KVs).
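The size-bounded alternative suggested at the end ("a max size rather than count of KVs") could be sketched like this; the iterator-of-byte-arrays shape is a simplified stand-in for the internal scanner interfaces:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Drains at most maxBytes worth of cells per call, so one very wide
// row is consumed in bounded chunks instead of all at once.
final class SizeBoundedBatcher {
    static List<byte[]> nextBatch(Iterator<byte[]> cells, long maxBytes) {
        List<byte[]> batch = new ArrayList<>();
        long used = 0;
        while (cells.hasNext() && used < maxBytes) {
            byte[] cell = cells.next();
            batch.add(cell);
            used += cell.length;
        }
        return batch;
    }
}
```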

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4440) add an option to presplit table to PerformanceEvaluation

2011-09-20 Thread Sujee Maniyam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sujee Maniyam updated HBASE-4440:
-

Component/s: (was: performance)
 util

 add an option to presplit table to PerformanceEvaluation
 

 Key: HBASE-4440
 URL: https://issues.apache.org/jira/browse/HBASE-4440
 Project: HBase
  Issue Type: Improvement
  Components: util
Reporter: Sujee Maniyam
Priority: Minor
  Labels: benchmark

 PerformanceEvaluation is a quick way to benchmark an HBase cluster.  The 
 current 'write*' operations do not pre-split the table.  Pre-splitting the 
 table will really boost the insert performance.
 It would be nice to have an option to enable pre-splitting the table before 
 the inserts begin.
 it would look something like:
 (a) hbase ...PerformanceEvaluation   --presplit=10 other options
 (b) hbase ...PerformanceEvaluation   --presplit other options
 (b) will try to presplit the table on some default value (say number of 
 region servers)
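One way a --presplit=N option could compute region boundaries, assuming PerformanceEvaluation's fixed-width integer row keys; this is a sketch of the idea, not the eventual patch:

```java
// Compute N - 1 evenly spaced row-key boundaries over [0, totalRows),
// formatted as zero-padded keys, so the table can be created pre-split
// and inserts spread across region servers from the start.
final class PreSplit {
    static String[] splitKeys(int numRegions, long totalRows) {
        String[] keys = new String[numRegions - 1];
        for (int i = 1; i < numRegions; i++) {
            long boundary = totalRows * i / numRegions;
            keys[i - 1] = String.format("%010d", boundary);
        }
        return keys;
    }
}
```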

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME

2011-09-20 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-3421:
--

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

 Very wide rows -- 30M plus -- cause us OOME
 ---

 Key: HBASE-3421
 URL: https://issues.apache.org/jira/browse/HBASE-3421
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.0
Reporter: stack
Assignee: Nate Putnam
 Fix For: 0.90.5

 Attachments: HBASE-3421.patch, HBASE-34211-v2.patch, 
 HBASE-34211-v3.patch, HBASE-34211-v4.patch


 From the list, see 'jvm oom' in 
 http://mail-archives.apache.org/mod_mbox/hbase-user/201101.mbox/browser, it 
 looks like wide rows -- 30M or so -- causes OOME during compaction.  We 
 should check it out. Can the scanner used during compactions use the 'limit' 
 when nexting?  If so, this should save our OOME'ing (or, we need to add to 
 the next a max size rather than count of KVs).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4448) HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility instances across unit tests

2011-09-20 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13109075#comment-13109075
 ] 

Jesse Yates commented on HBASE-4448:


I think that these tests would be run all in the same jvm (non-forked mode) - 
that way they can all reuse the same static testing util.

Running it in forked mode really wouldn't help with this issue. Not sure how 
running in parallel is actually managed - I'm assuming it's all out of the same 
JVM, just on different threads. Using the cluster the proposed way would again 
be a win.

 HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility 
 instances across unit tests
 -

 Key: HBASE-4448
 URL: https://issues.apache.org/jira/browse/HBASE-4448
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
 Attachments: HBaseTestingUtilityFactory.java


 Setting up and tearing down HBaseTestingUtility instances in unit tests is 
 very expensive.  On my MacBook it takes about 10 seconds to set up a 
 MiniCluster, and 7 seconds to tear it down.  When multiplied by the number of 
 test classes that use this facility, that's a lot of time in the build.
 This factory assumes that the JVM is being re-used across test classes in the 
 build, otherwise this pattern won't work. 
 I don't think this is appropriate for every use, but I think it can be 
 applicable in a great many cases - especially where developers just want a 
 simple MiniCluster with 1 slave.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4014) Coprocessors: Flag the presence of coprocessors in logged exceptions

2011-09-20 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13109086#comment-13109086
 ] 

jirapos...@reviews.apache.org commented on HBASE-4014:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/969/
---

(Updated 2011-09-20 23:33:04.074472)


Review request for hbase, Gary Helmling and Mingjie Lai.


Changes
---

-Address Gary's last review:
 -Set hbase.coprocessor.abortonerror defaults to false.
 -Remove separate threads in tests where possible.
 -Remove redundant testStarted() : it does not differ from the same test in 
TestMasterObserver.
 -fix name of test

-Simplified patch as allowed by Gary's committal of HBASE-4420: 
MasterObserver's preMove() and postMove() are now declared to 
throw an IOException.

-Split existing two tests: TestRegionServerCoprocessorException.java and 
TestMasterCoprocessorException.java each into two tests to test new 
configuration setting of hbase.coprocessor.abortonerror, so four total tests 
now:

1. TestRegionServerCoprocessorExceptionWithAbort.java  
(hbase.coprocessor.abortonerror=true)
2. TestRegionServerCoprocessorExceptionWithRemove.java 
(hbase.coprocessor.abortonerror=false)
3. TestMasterCoprocessorExceptionWithAbort.java  
(hbase.coprocessor.abortonerror=true)
4. TestMasterCoprocessorExceptionWithRemove.java 
(hbase.coprocessor.abortonerror=false)


Summary
---

https://issues.apache.org/jira/browse/HBASE-4014 Coprocessors: Flag the 
presence of coprocessors in logged exceptions

The general gist here is to wrap each of {Master,RegionServer}CoprocessorHost's 
coprocessor call inside a 

try { ... } catch (Throwable e) { handleCoprocessorThrowable(e) }

block. 

handleCoprocessorThrowable() is responsible for either passing 'e' along to the 
client (if 'e' is an IOException) or, otherwise, aborting the service 
(Regionserver or Master).

The abort message contains a list of the loaded coprocessors for crash analysis.
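The dispatch described above can be sketched as follows; the Abortable shape is simplified, and the method body is an assumed reading of the summary rather than the actual patch:

```java
import java.io.IOException;

// Simplified stand-in for the service being guarded (Master or
// RegionServer).
interface Abortable { void abort(String why, Throwable e); }

final class CoprocessorGuard {
    // IOExceptions propagate to the client; anything else is
    // unexpected, so abort the service, flagging the coprocessor
    // involvement for crash analysis.
    static void handleCoprocessorThrowable(Abortable service, Throwable e)
            throws IOException {
        if (e instanceof IOException) {
            throw (IOException) e;
        }
        service.abort("Coprocessor threw " + e, e);
    }
}
```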


This addresses bug HBASE-4014.
https://issues.apache.org/jira/browse/HBASE-4014


Diffs (updated)
-

  src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java 
4e492e1 
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java 06bf814 
  src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java 
0c95017 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java bff1f6c 
  src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java 
a6cf6a8 
  src/main/resources/hbase-default.xml 2c8f44b 
  
src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorExceptionWithAbort.java
 PRE-CREATION 
  
src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorExceptionWithRemove.java
 PRE-CREATION 
  
src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorExceptionWithAbort.java
 PRE-CREATION 
  
src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorExceptionWithRemove.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/969/diff


Testing
---

patch includes two tests:

TestMasterCoprocessorException.java
TestRegionServerCoprocessorException.java

both tests pass in my build environment.


Thanks,

Eugene



 Coprocessors: Flag the presence of coprocessors in logged exceptions
 

 Key: HBASE-4014
 URL: https://issues.apache.org/jira/browse/HBASE-4014
 Project: HBase
  Issue Type: Improvement
  Components: coprocessors
Reporter: Andrew Purtell
Assignee: Eugene Koontz
 Fix For: 0.92.0

 Attachments: HBASE-4014.patch, HBASE-4014.patch, HBASE-4014.patch, 
 HBASE-4014.patch, HBASE-4014.patch


 For some initial triage of bug reports for core versus for deployments with 
 loaded coprocessors, we need something like the Linux kernel's taint flag, 
 and list of linked in modules that show up in the output of every OOPS, to 
 appear above or below exceptions that appear in the logs.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4014) Coprocessors: Flag the presence of coprocessors in logged exceptions

2011-09-20 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13109087#comment-13109087
 ] 

jirapos...@reviews.apache.org commented on HBASE-4014:
--



bq.  On 2011-09-08 23:46:17, Gary Helmling wrote:
bq.   src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java, 
line 638
bq.   https://reviews.apache.org/r/969/diff/9/?file=38128#file38128line638
bq.  
bq.   This should default to false.

Fixed; please see latest patch.


bq.  On 2011-09-08 23:46:17, Gary Helmling wrote:
bq.   
src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorException.java,
 line 67
bq.   https://reviews.apache.org/r/969/diff/9/?file=38134#file38134line67
bq.  
bq.   Does this need to be a separate thread?  Can the contents of the 
run() method just be inline in testExceptionFromCoprocessorWhenCreatingTable()?

In my testing, if the server (master or regionserver) aborts, it seems like the 
client becomes unresponsive and the test times out and fails. However, if I 
create a separate thread, the main thread can terminate properly and the test 
passes. 

I removed the separate Threads for the two tests where an abort is not expected 
(TestMasterCoprocessorExceptionWithRemove.java and 
TestRegionServerExceptionWithRemove.java).


bq.  On 2011-09-08 23:46:17, Gary Helmling wrote:
bq.   
src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorException.java,
 line 156
bq.   https://reviews.apache.org/r/969/diff/9/?file=38134#file38134line156
bq.  
bq.   Do we need this test?  If we're already doing the same tests in 
TestMasterObserver, it doesn't seem like it.  Has anything been added to this 
method that we need?

You are right; removed.


bq.  On 2011-09-08 23:46:17, Gary Helmling wrote:
bq.   
src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorException.java,
 line 89
bq.   https://reviews.apache.org/r/969/diff/9/?file=38135#file38135line89
bq.  
bq.   Name should be something like testExceptionDuringPut?

Renamed; thanks.


- Eugene


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/969/#review1805
---


On 2011-09-06 19:08:59, Eugene Koontz wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/969/
bq.  ---
bq.  
bq.  (Updated 2011-09-06 19:08:59)
bq.  
bq.  
bq.  Review request for hbase, Gary Helmling and Mingjie Lai.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  https://issues.apache.org/jira/browse/HBASE-4014 Coprocessors: Flag the 
presence of coprocessors in logged exceptions
bq.  
bq.  The general gist here is to wrap each of 
{Master,RegionServer}CoprocessorHost's coprocessor call inside a 
bq.  
bq.  try { ... } catch (Throwable e) { handleCoprocessorThrowable(e) }
bq.  
bq.  block. 
bq.  
bq.  handleCoprocessorThrowable() is responsible for either passing 'e' along 
to the client (if 'e' is an IOException) or, otherwise, aborting the service 
(Regionserver or Master).
bq.  
bq.  The abort message contains a list of the loaded coprocessors for crash 
analysis.
bq.  
bq.  
bq.  This addresses bug HBASE-4014.
bq.  https://issues.apache.org/jira/browse/HBASE-4014
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java 
4e492e1 
bq.src/main/java/org/apache/hadoop/hbase/master/HMaster.java 3f60653 
bq.src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java 
aa930f5 
bq.src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 
8ff6e62 
bq.
src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java 
5796413 
bq.src/main/resources/hbase-default.xml 2c8f44b 
bq.
src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorException.java
 PRE-CREATION 
bq.
src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorException.java
 PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/969/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  patch includes two tests:
bq.  
bq.  TestMasterCoprocessorException.java
bq.  TestRegionServerCoprocessorException.java
bq.  
bq.  both tests pass in my build environment.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Eugene
bq.  
bq.



 Coprocessors: Flag the presence of coprocessors in logged exceptions
 

 Key: HBASE-4014
 URL: https://issues.apache.org/jira/browse/HBASE-4014
 Project: HBase
  Issue Type: Improvement
  Components: coprocessors
 

[jira] [Commented] (HBASE-4437) Update hadoop in 0.92 (0.20.205?)

2011-09-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13109090#comment-13109090
 ] 

stack commented on HBASE-4437:
--

Trying with 205.  Looks like we lose our blue border in UI in 0.90.x hbase.  
Lars Francke figured out why a while back.  Need to revisit.

 Update hadoop in 0.92 (0.20.205?)
 -

 Key: HBASE-4437
 URL: https://issues.apache.org/jira/browse/HBASE-4437
 Project: HBase
  Issue Type: Task
Reporter: stack

 We ship with branch-0.20-append a few versions back from the tip.  If 205 
 comes out and hbase works on it, we should ship 0.92 with it (while also 
 ensuring it works w/ 0.22 and 0.23 branches).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4448) HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility instances across unit tests

2011-09-20 Thread Doug Meil (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13109103#comment-13109103
 ] 

Doug Meil commented on HBASE-4448:
--

As Jesse said, we have to reuse JVMs for this to work.  Rather than doing 
this...

{code}
  @BeforeClass
  public static void setUpBeforeClass() throws Exception {
TEST_UTIL.startMiniCluster(1);
{code}

... you would do something like this...
{code}
  @BeforeClass
  public static void setUpBeforeClass() throws Exception {
 TEST_UTIL = HBaseTestingUtilityFactory.get().getMiniCluster(1);
{code}
... and it would already be started.

And rather than calling an explicit shutdown on the HBaseTestingUtility 
instance, you'd call a return on the factory...
{code}
HBaseTestingUtilityFactory.get().returnMiniCluster(instance);
{code}
... where it would also blow away any tables that have been created so it's 
clean for the next person that uses it.
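A minimal sketch of the shape these snippets imply, assuming the factory caches one started cluster per slave count; this is a guess at the pattern, not the attached HBaseTestingUtilityFactory.java:

```java
import java.util.HashMap;
import java.util.Map;

// Caches one started "cluster" per slave count so the JVM pays the
// startup cost once. Cluster stands in for HBaseTestingUtility.
final class MiniClusterFactory {
    static final class Cluster {
        final int slaves;
        Cluster(int slaves) { this.slaves = slaves; }
    }

    private static final MiniClusterFactory INSTANCE = new MiniClusterFactory();
    private final Map<Integer, Cluster> cache = new HashMap<>();

    static MiniClusterFactory get() { return INSTANCE; }

    // Reuses an already-started cluster with this slave count.
    synchronized Cluster getMiniCluster(int slaves) {
        return cache.computeIfAbsent(slaves, Cluster::new);
    }

    // Callers "return" the cluster instead of shutting it down; a real
    // implementation would drop any tables the test created here.
    synchronized void returnMiniCluster(Cluster c) { /* cleanup hook */ }
}
```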


 HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility 
 instances across unit tests
 -

 Key: HBASE-4448
 URL: https://issues.apache.org/jira/browse/HBASE-4448
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
 Attachments: HBaseTestingUtilityFactory.java


 Setting up and tearing down HBaseTestingUtility instances in unit tests is 
 very expensive.  On my MacBook it takes about 10 seconds to set up a 
 MiniCluster, and 7 seconds to tear it down.  When multiplied by the number of 
 test classes that use this facility, that's a lot of time in the build.
 This factory assumes that the JVM is being re-used across test classes in the 
 build, otherwise this pattern won't work. 
 I don't think this is appropriate for every use, but I think it can be 
 applicable in a great many cases - especially where developers just want a 
 simple MiniCluster with 1 slave.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-2742) Provide strong authentication with a secure RPC engine

2011-09-20 Thread Gary Helmling (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling updated HBASE-2742:
-

Summary: Provide strong authentication with a secure RPC engine  (was: 
Merge secure Hadoop RPC changes into HBase RPC)

Changing title for clarity.

 Provide strong authentication with a secure RPC engine
 --

 Key: HBASE-2742
 URL: https://issues.apache.org/jira/browse/HBASE-2742
 Project: HBase
  Issue Type: Improvement
  Components: ipc
Reporter: Gary Helmling

 The HBase RPC code (org.apache.hadoop.hbase.ipc.*) was originally forked off 
 of Hadoop RPC classes, with some performance tweaks added.  Those 
 optimizations have come at a cost in keeping up with Hadoop RPC changes 
 however, both bug fixes and improvements/new features.  
 In particular, this impacts how we implement security features in HBase (see 
 HBASE-1697 and HBASE-2016).  The secure Hadoop implementation (HADOOP-4487) 
 relies heavily on RPC changes to support client authentication via kerberos 
 and securing and mutual authentication of client/server connections via SASL. 
  Making use of the built-in Hadoop RPC classes will gain us these pieces for 
 free in a secure HBase.
 So, I'm proposing that we drop the HBase forked version of RPC and convert to 
 direct use of Hadoop RPC, while working to contribute important fixes back 
 upstream to Hadoop core.  Based on a review of the HBase RPC changes, the key 
 divergences seem to be:
 HBaseClient:
  - added use of TCP keepalive (HBASE-1754)
  - made connection retries and sleep configurable (HBASE-1815)
  - prevent NPE if socket == null due to creation failure (HBASE-2443)
 HBaseRPC:
  - mapping of method names to codes (removed in HBASE-2219)
 HBaseServer:
  - use of TCP keep alives (HBASE-1754)
  - OOME in server does not trigger abort (HBASE-1198)
 HbaseObjectWritable:
  - allows List serialization
  - includes its own class-to-code mapping (HBASE-328)
 Proposed process is:
 1. open issues with patches on Hadoop core for important fixes/adjustments 
 from HBase RPC (HBASE-1198, HBASE-1815, HBASE-1754, HBASE-2443, plus a 
 pluggable ObjectWritable implementation in RPC.Invocation to allow use of 
 HbaseObjectWritable).
 2. ship a Hadoop version with RPC patches applied -- ideally we should avoid 
 another copy-n-paste code fork, subject to ability to isolate changes from 
 impacting Hadoop internal RPC wire formats
 3. if all Hadoop core patches are applied we can drop back to a plain vanilla 
 Hadoop version
 I realize there are many different opinions on how to proceed with HBase RPC, 
 so I'm hoping this issue will kick off a discussion on what the best approach 
 might be.  My own motivation is maximizing re-use of the authentication and 
 connection security work that's already gone into Hadoop core.  I'll put 
 together a set of patches around #1 and #2, but obviously we need some 
 consensus around this to move forward.  If I'm missing other differences 
 between HBase and Hadoop RPC, please list as well.  Discuss!

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME

2011-09-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13109109#comment-13109109
 ] 

Hudson commented on HBASE-3421:
---

Integrated in HBase-0.92 #6 (See [https://builds.apache.org/job/HBase-0.92/6/])
HBASE-3421  Very wide rows -- 30M plus -- cause us OOME (Nate Putnam)

tedyu : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java


 Very wide rows -- 30M plus -- cause us OOME
 ---

 Key: HBASE-3421
 URL: https://issues.apache.org/jira/browse/HBASE-3421
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.0
Reporter: stack
Assignee: Nate Putnam
 Fix For: 0.90.5

 Attachments: HBASE-3421.patch, HBASE-34211-v2.patch, 
 HBASE-34211-v3.patch, HBASE-34211-v4.patch


 From the list, see 'jvm oom' in 
 http://mail-archives.apache.org/mod_mbox/hbase-user/201101.mbox/browser, it 
 looks like wide rows -- 30M or so -- causes OOME during compaction.  We 
 should check it out. Can the scanner used during compactions use the 'limit' 
 when nexting?  If so, this should save our OOME'ing (or, we need to add to 
 the next a max size rather than count of KVs).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4448) HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility instances across unit tests

2011-09-20 Thread Doug Meil (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13109110#comment-13109110
 ] 

Doug Meil commented on HBASE-4448:
--

I'm also working on some analysis to show the different uses of 
HBaseTestingUtility - not all the tests use it the same way.  Some do 1 or 3 
slave MiniClusters, and some do ZkClusters.

 

 HBaseTestingUtilityFactory - pattern for re-using HBaseTestingUtility 
 instances across unit tests
 -

 Key: HBASE-4448
 URL: https://issues.apache.org/jira/browse/HBASE-4448
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
 Attachments: HBaseTestingUtilityFactory.java


 Setting up and tearing down HBaseTestingUtility instances in unit tests is 
 very expensive.  On my MacBook it takes about 10 seconds to set up a 
 MiniCluster, and 7 seconds to tear it down.  When multiplied by the number of 
 test classes that use this facility, that's a lot of time in the build.
 This factory assumes that the JVM is being re-used across test classes in the 
 build, otherwise this pattern won't work. 
 I don't think this is appropriate for every use, but I think it can be 
 applicable in a great many cases - especially where developers just want a 
 simple MiniCluster with 1 slave.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4387) Error while syncing: DFSOutputStream is closed

2011-09-20 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13109118#comment-13109118
 ] 

Todd Lipcon commented on HBASE-4387:


I'm traveling for the next week or so, so probably won't have a chance to do 
so. I'll be continuing to work on 0.92 stabilization over the next couple of 
months, though - so I'll certainly be running this test again in the relatively 
near future.

 Error while syncing: DFSOutputStream is closed
 --

 Key: HBASE-4387
 URL: https://issues.apache.org/jira/browse/HBASE-4387
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.92.0
Reporter: Todd Lipcon
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.92.0

 Attachments: 4387.txt, errors-with-context.txt


 In a billion-row load on ~25 servers, I see 'error while syncing' reasonably 
 often, with the error 'DFSOutputStream is closed', around a roll. We have some 
 race where a roll at the same time as heavy inserts causes a problem.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HBASE-4070) [Coprocessors] Improve region server metrics to report loaded coprocessors to master

2011-09-20 Thread Eugene Koontz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koontz reassigned HBASE-4070:


Assignee: Eugene Koontz

 [Coprocessors] Improve region server metrics to report loaded coprocessors to 
 master
 

 Key: HBASE-4070
 URL: https://issues.apache.org/jira/browse/HBASE-4070
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.3
Reporter: Mingjie Lai
Assignee: Eugene Koontz

 HBASE-3512 is about listing loaded cp classes at shell. To make it more 
 generic, we need a way to report this piece of information from region to 
 master (or just at region server level). So later on, we can display the 
 loaded class names at shell as well as web console. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4344) Persist memstoreTS to disk

2011-09-20 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4344:
--

Attachment: 4344-v2.txt

Patch version 2 is rebased for TRUNK.
Running test suite now.

 Persist memstoreTS to disk
 --

 Key: HBASE-4344
 URL: https://issues.apache.org/jira/browse/HBASE-4344
 Project: HBase
  Issue Type: Sub-task
Reporter: Amitanand Aiyer
Assignee: Amitanand Aiyer
 Fix For: 0.89.20100924

 Attachments: 4344-v2.txt, patch-2


 Atomicity can be achieved in two ways -- (i) by using  a multiversion 
 concurrency system (MVCC), or (ii) by ensuring that new writes do not 
 complete, until the old reads complete.
 Currently, Memstore uses something along the lines of MVCC (called RWCC for 
 read-write-consistency-control). But, this mechanism is not incorporated for 
 the key-values written to the disk, as they do not include the memstore TS.
 Let us make the two approaches be similar, by persisting the memstoreTS along 
 with the key-value when it is written to the disk.
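Persisting the memstoreTS alongside each key-value amounts to widening the on-disk record; a toy encoding (illustrative framing only, not the actual HFile format) could look like:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Toy framing: length-prefixed key-value bytes followed by the
// memstoreTS, so a reader can rebuild MVCC state from disk.
final class KvWithTs {
    static byte[] encode(byte[] kv, long memstoreTS) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bos);
            out.writeInt(kv.length);
            out.write(kv);
            out.writeLong(memstoreTS); // real code would use a vlong
            out.flush();
            return bos.toByteArray();
        } catch (IOException e) {
            throw new AssertionError(e); // in-memory stream cannot fail
        }
    }
}
```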

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4344) Persist memstoreTS to disk

2011-09-20 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13109228#comment-13109228
 ] 

Ted Yu commented on HBASE-4344:
---

Got a few test failures so far:
{code}
testHFileFormatV2(org.apache.hadoop.hbase.io.hfile.TestHFileWriterV2)  Time 
elapsed: 0.704 sec   FAILURE!
java.lang.AssertionError: 
at org.junit.Assert.fail(Assert.java:91)
at org.junit.Assert.assertTrue(Assert.java:43)
at org.junit.Assert.assertTrue(Assert.java:54)
at 
org.apache.hadoop.hbase.io.hfile.TestHFileWriterV2.testHFileFormatV2(TestHFileWriterV2.java:141)

testCacheOnWrite[5](org.apache.hadoop.hbase.io.hfile.TestCacheOnWrite)  Time 
elapsed: 0.741 sec   FAILURE!
org.junit.ComparisonFailure: expected:{DATA=1[367, LEAF_INDEX=172, 
BLOOM_CHUNK=9, INTERMEDIATE_INDEX=24]} but was:{DATA=1[459, LEAF_INDEX=183, 
BLOOM_CHUNK=9, INTERMEDIATE_INDEX=25]}
at org.junit.Assert.assertEquals(Assert.java:123)
at org.junit.Assert.assertEquals(Assert.java:145)
at 
org.apache.hadoop.hbase.io.hfile.TestCacheOnWrite.readStoreFile(TestCacheOnWrite.java:180)
at 
org.apache.hadoop.hbase.io.hfile.TestCacheOnWrite.testCacheOnWrite(TestCacheOnWrite.java:150)

testCacheOnWrite[0](org.apache.hadoop.hbase.io.hfile.TestCacheOnWrite)  Time 
elapsed: 1.129 sec  <<< FAILURE!
org.junit.ComparisonFailure: expected:<{DATA=1[367, LEAF_INDEX=172, 
BLOOM_CHUNK=9, INTERMEDIATE_INDEX=24]}> but was:<{DATA=1[459, LEAF_INDEX=183, 
BLOOM_CHUNK=9, INTERMEDIATE_INDEX=25]}>
at org.junit.Assert.assertEquals(Assert.java:123)
at org.junit.Assert.assertEquals(Assert.java:145)
at 
org.apache.hadoop.hbase.io.hfile.TestCacheOnWrite.readStoreFile(TestCacheOnWrite.java:180)
at 
org.apache.hadoop.hbase.io.hfile.TestCacheOnWrite.testCacheOnWrite(TestCacheOnWrite.java:150)

testSeekBefore(org.apache.hadoop.hbase.io.hfile.TestSeekTo)  Time elapsed: 
0.232 sec  <<< ERROR!
java.lang.IllegalStateException: blockSeek with seekBefore at the first key of 
the block: 
key=\x00\x01c\x06familyqualifier\x7F\xFF\xFF\xFF\xFF\xFF\xFF\xFF\x04, 
blockOffset=0, onDiskSize=171
at 
org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.blockSeek(HFileReaderV2.java:647)
at 
org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.loadBlockAndSeekToKey(HFileReaderV2.java:577)
at 
org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.seekBefore(HFileReaderV2.java:732)
at 
org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.seekBefore(HFileReaderV2.java:687)
at 
org.apache.hadoop.hbase.io.hfile.TestSeekTo.testSeekBefore(TestSeekTo.java:70)
{code}

 Persist memstoreTS to disk
 --

 Key: HBASE-4344
 URL: https://issues.apache.org/jira/browse/HBASE-4344
 Project: HBase
  Issue Type: Sub-task
Reporter: Amitanand Aiyer
Assignee: Amitanand Aiyer
 Fix For: 0.89.20100924

 Attachments: 4344-v2.txt, patch-2


 Atomicity can be achieved in two ways: (i) by using a multiversion 
 concurrency control system (MVCC), or (ii) by ensuring that new writes do not 
 complete until the old reads complete.
 Currently, Memstore uses something along the lines of MVCC (called RWCC, for 
 read-write consistency control). But this mechanism does not cover the 
 key-values written to disk, as they do not include the memstore TS.
 Let us make the two approaches consistent by persisting the memstoreTS along 
 with the key-value when it is written to disk.
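The read-point rule described above can be sketched in a few lines. This is a minimal model of the RWCC/MVCC visibility check, not HBase code: the `KV` record and `visibleAt` method are illustrative names introduced here.

```java
import java.util.ArrayList;
import java.util.List;

public class MemstoreTsVisibility {
    // Minimal model of the RWCC/MVCC idea above: each KeyValue carries a
    // memstoreTS, and a reader only sees entries at or below its read point.
    // Persisting the memstoreTS with each on-disk KV would let the same rule
    // apply to flushed files. These types are illustrative, not HBase classes.
    record KV(String key, long memstoreTS, String value) {}

    static List<KV> visibleAt(List<KV> kvs, long readPoint) {
        List<KV> visible = new ArrayList<>();
        for (KV kv : kvs) {
            // A write whose memstoreTS is beyond the read point is not yet
            // committed from this reader's perspective, so it is skipped.
            if (kv.memstoreTS() <= readPoint) {
                visible.add(kv);
            }
        }
        return visible;
    }
}
```

Under this model, a flush that drops the memstoreTS (as HFiles do today) loses the information needed to apply the same visibility rule after the data reaches disk, which is the gap the patch addresses.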

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME

2011-09-20 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-3421:
--

Attachment: 3421.addendum

Addendum fixes Store.FIXED_OVERHEAD

 Very wide rows -- 30M plus -- cause us OOME
 ---

 Key: HBASE-3421
 URL: https://issues.apache.org/jira/browse/HBASE-3421
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.0
Reporter: stack
Assignee: Nate Putnam
 Fix For: 0.90.5

 Attachments: 3421.addendum, HBASE-3421.patch, HBASE-34211-v2.patch, 
 HBASE-34211-v3.patch, HBASE-34211-v4.patch


 From the list, see 'jvm oom' in 
 http://mail-archives.apache.org/mod_mbox/hbase-user/201101.mbox/browser, it 
 looks like wide rows -- 30M or so -- causes OOME during compaction.  We 
 should check it out. Can the scanner used during compactions use the 'limit' 
 when nexting?  If so, this should save our OOME'ing (or, we need to add to 
 the next a max size rather than count of KVs).
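The "max size rather than count of KVs" idea above can be sketched as follows. This is a hedged illustration only: `nextBatch` and its parameters are hypothetical names, not the actual HBase scanner API.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class SizeBoundedNext {
    // Sketch of the proposal above: bound a scanner batch by accumulated
    // bytes as well as by KV count, so a single 30M-wide row cannot be
    // buffered whole during compaction. Names here are hypothetical.
    static List<byte[]> nextBatch(Iterator<byte[]> kvs, int maxKvs, long maxBytes) {
        List<byte[]> batch = new ArrayList<>();
        long bytes = 0;
        while (kvs.hasNext() && batch.size() < maxKvs && bytes < maxBytes) {
            byte[] kv = kvs.next();
            batch.add(kv);
            bytes += kv.length; // stop once the byte budget is exhausted
        }
        return batch;
    }
}
```

With only a KV-count limit, one row with millions of cells is returned in a single next() call; adding the byte bound caps memory per call regardless of row width.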

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3130) [replication] ReplicationSource can't recover from session expired on remote clusters

2011-09-20 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13109246#comment-13109246
 ] 

Chris Trezzo commented on HBASE-3130:
-

@J-D

Now that I have looked at the test code a bit, I have a question:

My current understanding is that to kill the master-slave connection, you need 
to somehow get the session id and session password for the ReplicationPeer's 
zookeeper session (i.e. you need the ZookeeperWatcher instance). Currently, 
this is not exposed. Also, this does not seem like something we would want to 
expose if the only motivation is for testing. Any thoughts?

I could be missing something obvious.

Thanks!
Chris

 [replication] ReplicationSource can't recover from session expired on remote 
 clusters
 -

 Key: HBASE-3130
 URL: https://issues.apache.org/jira/browse/HBASE-3130
 Project: HBase
  Issue Type: Bug
  Components: replication
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: Chris Trezzo
 Fix For: 0.92.0

 Attachments: 3130-v2.txt, 3130-v3.txt, 3130.txt


 Currently ReplicationSource cannot recover when its zookeeper connection to 
 its remote cluster expires. HLogs are still being tracked, but a cluster 
 restart is required to continue replication (or a rolling restart).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4344) Persist memstoreTS to disk

2011-09-20 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13109252#comment-13109252
 ] 

Ted Yu commented on HBASE-4344:
---

Two more failures:
{code}
testCacheOnWriteEvictOnClose(org.apache.hadoop.hbase.regionserver.TestStoreFile)
  Time elapsed: 2.932 sec  <<< FAILURE!
junit.framework.AssertionFailedError: expected:<80> but was:<81>
at junit.framework.Assert.fail(Assert.java:47)
at junit.framework.Assert.failNotEquals(Assert.java:283)
at junit.framework.Assert.assertEquals(Assert.java:64)
at junit.framework.Assert.assertEquals(Assert.java:130)
at junit.framework.Assert.assertEquals(Assert.java:136)
at 
org.apache.hadoop.hbase.regionserver.TestStoreFile.testCacheOnWriteEvictOnClose(TestStoreFile.java:672)

testRollback(org.apache.hadoop.hbase.regionserver.TestSplitTransaction)  Time 
elapsed: 7.433 sec  <<< ERROR!
java.lang.RuntimeException: Already used this rwcc. Too late to initialize
at 
org.apache.hadoop.hbase.regionserver.ReadWriteConsistencyControl.initialize(ReadWriteConsistencyControl.java:77)
at 
org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:415)
at 
org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:366)
at 
org.apache.hadoop.hbase.regionserver.SplitTransaction.rollback(SplitTransaction.java:679)
at 
org.apache.hadoop.hbase.regionserver.TestSplitTransaction.testRollback(TestSplitTransaction.java:234)
{code}

 Persist memstoreTS to disk
 --

 Key: HBASE-4344
 URL: https://issues.apache.org/jira/browse/HBASE-4344
 Project: HBase
  Issue Type: Sub-task
Reporter: Amitanand Aiyer
Assignee: Amitanand Aiyer
 Fix For: 0.89.20100924

 Attachments: 4344-v2.txt, patch-2


 Atomicity can be achieved in two ways: (i) by using a multiversion 
 concurrency control system (MVCC), or (ii) by ensuring that new writes do not 
 complete until the old reads complete.
 Currently, Memstore uses something along the lines of MVCC (called RWCC, for 
 read-write consistency control). But this mechanism does not cover the 
 key-values written to disk, as they do not include the memstore TS.
 Let us make the two approaches consistent by persisting the memstoreTS along 
 with the key-value when it is written to disk.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3130) [replication] ReplicationSource can't recover from session expired on remote clusters

2011-09-20 Thread Jean-Daniel Cryans (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13109270#comment-13109270
 ] 

Jean-Daniel Cryans commented on HBASE-3130:
---

At the very minimum you could do a TestReplicationPeer that tests the session 
recovery code. An integration test might be harder since you have to fiddle 
with the internals, maybe explore the avenue of having a test that resides in 
the same package (o.a.h.h.r.replication) and expose the methods only there.

 [replication] ReplicationSource can't recover from session expired on remote 
 clusters
 -

 Key: HBASE-3130
 URL: https://issues.apache.org/jira/browse/HBASE-3130
 Project: HBase
  Issue Type: Bug
  Components: replication
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: Chris Trezzo
 Fix For: 0.92.0

 Attachments: 3130-v2.txt, 3130-v3.txt, 3130.txt


 Currently ReplicationSource cannot recover when its zookeeper connection to 
 its remote cluster expires. HLogs are still being tracked, but a cluster 
 restart is required to continue replication (or a rolling restart).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3958) use Scan with setCaching() and PageFilter have a problem

2011-09-20 Thread subramanian raghunathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13109279#comment-13109279
 ] 

subramanian raghunathan commented on HBASE-3958:


As per Jerry Du:
"Ranges" means a cross-region scan (multi-region scan).

The issue came up in my first HBase program; the following is pseudocode:
 
create a table which is pre-split into 100 regions;
each region has 100 rows;

fill data with row key [0,]
 
Scan with startKey and stopKey which cross all regions;[0,)
scan.setCaching(3);
scan.setFilter(new PageFilter(5));
 
the output is:
Row key:
0
1
2
caching border
3
4
region_0 with filter border
5
caching border
6
7
8
caching border
9
region_1 with filter border
10
11
caching border
12
13
14
caching border AND region_2 with filter border
 
 
 
Another case:
scan.setCaching(2);
scan.setFilter(new PageFilter(5));
The output will be:
Row key:
0
1
caching border
2
3
caching border
4
region_0 with filter border
5
caching border
6
7
caching border
8
9
caching border AND region_1 with filter border
 
the scan stops at both caching borders and region borders
 
There are two reasons:
a Filter instance only exists within the scan of one region;
and in org.apache.hadoop.hbase.client.HTable.ClientScanner.next(), the loop
do {} while (remainingResultSize > 0 && countdown > 0 && nextScanner(countdown, 
values == null));
has a stop condition that does NOT consider a scan with a Filter.
Not only PageFilter: any filter will be a problem in a cross-region 
(multi-region) scan.
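The stop-condition problem described above can be modeled in a few lines. The real loop lives in HTable.ClientScanner.next(); the helper names below (`currentCondition`, `filterAwareCondition`, `filterAllRemaining`) are illustrative, not the actual HBase API.

```java
public class ScannerStopCondition {
    // Current behavior (simplified): continue while the size and row-count
    // budgets remain, ignoring whether the filter is already exhausted.
    static boolean currentCondition(long remainingResultSize, int countdown) {
        return remainingResultSize > 0 && countdown > 0;
    }

    // Filter-aware behavior: also stop once the filter (e.g. PageFilter)
    // reports it will accept no further rows, so the client never opens a
    // scanner on the next region just to satisfy the caching count.
    static boolean filterAwareCondition(long remainingResultSize, int countdown,
                                        boolean filterAllRemaining) {
        return remainingResultSize > 0 && countdown > 0 && !filterAllRemaining;
    }
}
```

The bug report's alignment case falls out of the first condition: when the page limit is a multiple of the caching size, countdown reaches 0 exactly at a region border, so the client never probes the next region even though the filter state would allow (or forbid) it.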

 use Scan with setCaching() and PageFilter have a problem
 

 Key: HBASE-3958
 URL: https://issues.apache.org/jira/browse/HBASE-3958
 Project: HBase
  Issue Type: Bug
  Components: filters, regionserver
Affects Versions: 0.90.3
 Environment: Linux testbox 2.6.18-238.el5 #1 SMP Sun Dec 19 14:22:44 
 EST 2010 x86_64 x86_64 x86_64 GNU/Linux
 java version "1.6.0_23"
 Java(TM) SE Runtime Environment (build 1.6.0_23-b05)
 Java HotSpot(TM) 64-Bit Server VM (build 19.0-b09, mixed mode)
Reporter: Jerry Du
Priority: Minor

 I have a table with 3 ranges (regions), and I scan the table across all 3 
 ranges.
 Scan scan = new Scan();
 scan.setCaching(10);
 scan.setFilter(new PageFilter(21));
 [result rows count = 63]
 The result has 63 rows: each range was scanned and locally limited to 
 page_size. That is the expected result.
 But if page_size = N * caching_size, the result has only page_size rows; 
 only the first range was scanned.
 When page_size is a multiple of caching_size, one range's result exactly 
 fills the cache, so the client never triggers the next range's scan.
 Example:
 Scan scan = new Scan();
 scan.setCaching(10);
 scan.setFilter(new PageFilter(20));
 [result rows count = 20]

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-2742) Provide strong authentication with a secure RPC engine

2011-09-20 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13109289#comment-13109289
 ] 

jirapos...@reviews.apache.org commented on HBASE-2742:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1991/
---

Review request for hbase.


Summary
---

This patch creates a new secure RPC engine for HBase, which provides Kerberos 
based authentication of clients, and a token-based authentication mechanism for 
mapreduce jobs.  Primary components of the patch are:

- a new maven profile for secure Hadoop/HBase: hadoop-0.20S
  - Secure Hadoop dependent classes are separated under a pseudo-module in the 
security/ directory.  These source and test directories are only included when 
building with the secure Hadoop profile
  - Currently the security classes get packaged with the regular HBase build 
artifacts.  We need a way to at least override project.version, so we can 
append something like a -security suffix indicating the additional security 
components.
  - The pseudo-module here is really a half-step forward.  It enables the 
security code to be optionally included in the build for now, and sets up the 
structure for a security module.  But we still will want to pursue full 
modularization (see HBASE-4336), which will allow packing the security code in 
a separate build artifact.

- a new RPC engine providing kerberos and token-based authentication: 
org.apache.hadoop.hbase.ipc.SecureRpcEngine
  - implementation under security/src/main/java/org/apache/hadoop/hbase/ipc/
  - The implementation classes extend the existing HBaseClient and HBaseServer 
to share as much of the RPC code as possible.  The main override is of the 
connection classes to allow control over the SASL negotiation of secure 
connections

- existing RPC changes
  - The existing HBaseClient and HBaseServer have been modified to make 
subclassing possible
  - All references to Hadoop UserGroupInformation have been replaced with 
org.apache.hadoop.hbase.security.User to insulate from future dependencies on 
specific Hadoop versions

- a coprocessor endpoint for obtaining new authentication tokens: 
TokenProvider, and supporting classes for token generation and synchronization 
(incorporating HBASE-3615)
  - implementation is under 
security/src/main/java/org/apache/hadoop/hbase/security/token/
  - Secret keys for token generation and verification are synchronized 
throughout the cluster in zookeeper, under /hbase/tokenauth/keys


To enable secure RPC, add the following configuration to hbase-site.xml:

  <property>
    <name>hadoop.security.authorization</name>
    <value>true</value>
  </property>
  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>hbase.rpc.engine</name>
    <value>org.apache.hadoop.hbase.ipc.SecureRpcEngine</value>
  </property>
  <property>
    <name>hbase.coprocessor.region.classes</name>
    <value>org.apache.hadoop.hbase.security.token.TokenProvider</value>
  </property>
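For reference, the same four settings can be mirrored as plain key/value pairs. This is only a sketch (the class and method names here are made up for illustration); the canonical form is the hbase-site.xml snippet above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SecureRpcConfig {
    // The four hbase-site.xml properties above, mirrored as key/value pairs.
    // Keys and values are taken from this review summary, not invented.
    static Map<String, String> secureRpcProperties() {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("hadoop.security.authorization", "true");
        conf.put("hadoop.security.authentication", "kerberos");
        conf.put("hbase.rpc.engine",
                 "org.apache.hadoop.hbase.ipc.SecureRpcEngine");
        conf.put("hbase.coprocessor.region.classes",
                 "org.apache.hadoop.hbase.security.token.TokenProvider");
        return conf;
    }
}
```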

In addition, the master and regionserver processes must be configured for 
kerberos authentication using the properties:

 * hbase.(master|regionserver).keytab.file
 * hbase.(master|regionserver).kerberos.principal
 * hbase.(master|regionserver).kerberos.https.principal


This addresses bug HBASE-2742.
https://issues.apache.org/jira/browse/HBASE-2742


Diffs
-

  conf/hbase-policy.xml PRE-CREATION 
  pom.xml 241973c 
  security/src/main/java/org/apache/hadoop/hbase/ipc/SecureClient.java 
PRE-CREATION 
  
security/src/main/java/org/apache/hadoop/hbase/ipc/SecureConnectionHeader.java 
PRE-CREATION 
  security/src/main/java/org/apache/hadoop/hbase/ipc/SecureRpcEngine.java 
PRE-CREATION 
  security/src/main/java/org/apache/hadoop/hbase/ipc/SecureServer.java 
PRE-CREATION 
  security/src/main/java/org/apache/hadoop/hbase/ipc/Status.java PRE-CREATION 
  
security/src/main/java/org/apache/hadoop/hbase/security/AccessDeniedException.java
 PRE-CREATION 
  
security/src/main/java/org/apache/hadoop/hbase/security/HBasePolicyProvider.java
 PRE-CREATION 
  
security/src/main/java/org/apache/hadoop/hbase/security/HBaseSaslRpcClient.java 
PRE-CREATION 
  
security/src/main/java/org/apache/hadoop/hbase/security/HBaseSaslRpcServer.java 
PRE-CREATION 
  
security/src/main/java/org/apache/hadoop/hbase/security/token/AuthenticationKey.java
 PRE-CREATION 
  
security/src/main/java/org/apache/hadoop/hbase/security/token/AuthenticationProtocol.java
 PRE-CREATION 
  
security/src/main/java/org/apache/hadoop/hbase/security/token/AuthenticationTokenIdentifier.java
 PRE-CREATION 
  
security/src/main/java/org/apache/hadoop/hbase/security/token/AuthenticationTokenSecretManager.java
 PRE-CREATION