[ 
https://issues.apache.org/jira/browse/PIG-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Mazak updated PIG-4663:
----------------------------
    Affects Version/s: 0.12.0

> HBaseStorage should allow the MaxResultsPerColumnFamily limit to avoid memory 
> or scan timeout issues
> ----------------------------------------------------------------------------------------------------
>
>                 Key: PIG-4663
>                 URL: https://issues.apache.org/jira/browse/PIG-4663
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.12.0
>            Reporter: Paul Mazak
>            Assignee: Paul Mazak
>
> The HBase client Scan API provides setMaxResultsPerColumnFamily, which caps 
> the number of values returned per column family for each row, so a scan does 
> not have to consume every column of a wide row at once.  Without such a 
> limit, a single row with several thousand columns will likely cause Pig to 
> fail with an OutOfMemoryError or a ScannerTimeoutException.
> The suggestion is to add a '-maxResultsPerColumnFamily' option that can be 
> passed in the optString constructor parameter and that sets this value on 
> the HBase Scan.
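
A minimal sketch of how the proposed option might be read from an option string, assuming whitespace-separated tokens. The class MaxResultsOption and its parse method are hypothetical and are not part of Pig; they only illustrate the idea. The sketch uses the standard library alone, so the call that would apply the value to the HBase Scan appears only as a comment.

```java
import java.util.OptionalInt;

// Hypothetical helper, not part of Pig: extracts the proposed
// -maxResultsPerColumnFamily value from an HBaseStorage-style option string.
final class MaxResultsOption {
    static final String FLAG = "-maxResultsPerColumnFamily";

    /** Returns the limit if the flag is present, or empty if it is absent. */
    static OptionalInt parse(String optString) {
        String[] tokens = optString.trim().split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            if (FLAG.equals(tokens[i])) {
                if (i + 1 >= tokens.length) {
                    throw new IllegalArgumentException(FLAG + " requires a value");
                }
                int limit = Integer.parseInt(tokens[i + 1]);
                if (limit <= 0) {
                    throw new IllegalArgumentException(FLAG + " must be positive");
                }
                return OptionalInt.of(limit);
            }
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        OptionalInt limit = parse("-caching 100 -maxResultsPerColumnFamily 500");
        // With the HBase client on the classpath, the value would be applied as:
        //   limit.ifPresent(scan::setMaxResultsPerColumnFamily);
        System.out.println(limit.isPresent() ? "limit=" + limit.getAsInt() : "no limit");
    }
}
```

Returning an empty OptionalInt when the flag is absent leaves the Scan at its default, so existing scripts that do not pass the option would behave as before.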



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
