[ 
https://issues.apache.org/jira/browse/HBASE-1829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756385#action_12756385
 ] 

Lars George commented on HBASE-1829:
------------------------------------

You are right Michael, it cleans up some remnants from when we could have 
different numbers of splits. It also attempts to reduce the split count to the 
number of regions that cover the range from start row to stop row. The idea 
with the comparison is to find the start key of the region just below the start 
row and the end key of the region just after the stop row. 
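
To illustrate the comparison, here is a minimal, self-contained sketch. It is 
not the code from the attached patch. It assumes that an empty key means 
"unbounded", that the stop row is exclusive, and that keys compare as unsigned 
bytes in lexicographic order. The class name, the method names and the region 
boundaries are made up for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SplitPruningSketch {

    // True if the region [regionStart, regionEnd) overlaps the scan range
    // [startRow, stopRow). An empty array is treated as "unbounded" (assumption).
    public static boolean overlaps(byte[] regionStart, byte[] regionEnd,
                                   byte[] startRow, byte[] stopRow) {
        // The scan must start before the region ends ...
        boolean startsBeforeRegionEnd = startRow.length == 0
                || regionEnd.length == 0
                || Arrays.compareUnsigned(startRow, regionEnd) < 0;
        // ... and must stop after the region starts (stop row is exclusive).
        boolean stopsAfterRegionStart = stopRow.length == 0
                || Arrays.compareUnsigned(stopRow, regionStart) > 0;
        return startsBeforeRegionEnd && stopsAfterRegionStart;
    }

    // Returns the indexes of the regions that would get a split.
    public static List<Integer> selectRegions(byte[][] startKeys, byte[][] endKeys,
                                              byte[] startRow, byte[] stopRow) {
        List<Integer> selected = new ArrayList<>();
        for (int i = 0; i < startKeys.length; i++) {
            if (overlaps(startKeys[i], endKeys[i], startRow, stopRow)) {
                selected.add(i);
            }
        }
        return selected;
    }

    // Helper to build row keys from strings.
    public static byte[] b(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Four hypothetical regions: ["", g), [g, n), [n, t), [t, "")
        byte[][] starts = { b(""), b("g"), b("n"), b("t") };
        byte[][] ends   = { b("g"), b("n"), b("t"), b("") };

        System.out.println(selectRegions(starts, ends, b("h"), b("k"))); // [1]
        System.out.println(selectRegions(starts, ends, b("k"), b("p"))); // [1, 2]
        System.out.println(selectRegions(starts, ends, b(""), b("")));   // [0, 1, 2, 3]
    }
}
```

With the stop row exclusive, a scan from "g" to "n" selects only the second 
region. If the stop row were inclusive, the second comparison would have to be 
greater-or-equal instead.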

I am not sure about the default empty end row, nor about the comparisons, i.e. 
whether they should be strictly greater or greater-or-equal etc. I just wanted 
to get the patch up as an idea I had, but it is not yet tested. I will test it 
early next week and sort out the issues.

Question: is there a testbed that allows me to have, say, 3-4 regions so that I 
can construct various test cases (like start/stop row both in first/last 
region, spanning all regions, crossing only two regions etc.)? I am not too 
familiar with the test classes and I know you guys are changing things around. 
What would be a good sample to start with?

Otherwise I will test it on my live cluster that has more than enough to test 
with. But a unit test seems like a good idea.

> Make use of start/stop row in TableInputFormat
> ----------------------------------------------
>
>                 Key: HBASE-1829
>                 URL: https://issues.apache.org/jira/browse/HBASE-1829
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Lars George
>            Assignee: Lars George
>            Priority: Minor
>             Fix For: 0.20.1
>
>         Attachments: HBASE-1829.patch
>
>
> Since we can now specify a start and stop row with the Scan that is handed to 
> the TIF we can reduce the splits to the regions that contain these rows. That 
> allows to test large MR jobs on a single region for example.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
