[ 
https://issues.apache.org/jira/browse/PIG-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899722#action_12899722
 ] 

Dmitriy V. Ryaboy commented on PIG-1205:
----------------------------------------

bq. 1. Is it possible to specify min_row_key and max_row_key in parameters?

Even better than that -- you can specify lt, lte, gt, and gte. It's true that, 
as written, splits are created for the whole table, but the filters cause most 
of those splits to exit immediately. Skipping the unneeded splits altogether is 
on my todo list (I already do this in the elephantbird version for 0.6).
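
To illustrate what "not creating the splits" could look like, here is a rough 
Python sketch. It is not the HBaseStorage or elephantbird code: the function 
names and the region tuples are made up for illustration, and only the gte and 
lt bounds are handled (gt and lte would work the same way). It assumes each 
region covers the key range [start, end), with an empty key meaning "unbounded".

```python
from typing import List, Optional, Tuple

# Hypothetical model: a region is (start_key, end_key) and holds the keys in
# [start_key, end_key). An empty key means the range is unbounded on that side.
Region = Tuple[bytes, bytes]


def region_may_match(region: Region,
                     gte: Optional[bytes] = None,
                     lt: Optional[bytes] = None) -> bool:
    """Return False only when no key in the region can satisfy the bounds."""
    start, end = region
    # Every key in the region is >= start, so if start >= lt nothing matches.
    if lt is not None and start != b"" and start >= lt:
        return False
    # Every key in the region is < end, so if end <= gte nothing matches.
    if gte is not None and end != b"" and end <= gte:
        return False
    return True


def prune_splits(regions: List[Region],
                 gte: Optional[bytes] = None,
                 lt: Optional[bytes] = None) -> List[Region]:
    """Keep only the regions that could contain a matching row key."""
    return [r for r in regions if region_may_match(r, gte, lt)]


if __name__ == "__main__":
    table = [(b"", b"g"), (b"g", b"n"), (b"n", b"t"), (b"t", b"")]
    print(prune_splits(table, gte=b"h", lt=b"p"))
```

With the four example regions and the bounds gte=h, lt=p, only the two middle 
regions survive, so only two splits would need to be created.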

bq. 2. One small suggestion: move line 206 into the if block (setting it once 
is enough)

Good idea.

bq. 3. It's better to add a warning log in HBaseBinaryConverter when the bytes 
are cut off for type conversion 

Will do. 

bq. 4. The parameter "Per-region limit" is a bit confusing to me. I think 
users would like to set the limit on the whole table, not per region. What 
do you think?

Trouble is, you can't enforce a total limit without post-processing, because 
each region is scanned independently. In practice, I use -limit when I am 
experimenting and want to get just a few rows from HBase; if I want a specific 
number of rows, I use both -limit (to speed up the tasks, since the scanners 
will exit early) and Pig's LIMIT operator (to get the exact number of rows I 
need).
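
A small Python sketch of why the two limits are combined. This is only a 
simulation: the region names, the row lists, and both function names are 
invented for illustration, and nothing here calls HBase or Pig.

```python
from itertools import islice
from typing import Dict, List


def scan_with_region_limit(regions: Dict[str, List[str]], limit: int) -> List[str]:
    """Simulate a per-region limit: each region's scan stops after `limit` rows."""
    rows: List[str] = []
    for region_rows in regions.values():
        # Each region is cut off independently of the others.
        rows.extend(islice(region_rows, limit))
    return rows


def exact_limit(rows: List[str], n: int) -> List[str]:
    """Simulate the post-processing step that trims to exactly n rows."""
    return rows[:n]


if __name__ == "__main__":
    table = {"r1": ["a", "b", "c"], "r2": ["d"], "r3": ["e", "f", "g", "h"]}
    early = scan_with_region_limit(table, 2)
    print(len(early), exact_limit(early, 3))
```

With three regions and a per-region limit of 2, the scan can return anywhere 
up to 6 rows (5 in this example, because one region holds a single row), so 
the final trim is what pins down the exact count.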



> Enhance HBaseStorage-- Make it support loading row key and implement StoreFunc
> ------------------------------------------------------------------------------
>
>                 Key: PIG-1205
>                 URL: https://issues.apache.org/jira/browse/PIG-1205
>             Project: Pig
>          Issue Type: Sub-task
>    Affects Versions: 0.7.0
>            Reporter: Jeff Zhang
>            Assignee: Dmitriy V. Ryaboy
>             Fix For: 0.8.0
>
>         Attachments: PIG_1205.patch, PIG_1205_2.patch, PIG_1205_3.patch, 
> PIG_1205_4.patch, PIG_1205_5.path
>
>


