[ https://issues.apache.org/jira/browse/PIG-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Paul Mazak updated PIG-4663:
----------------------------
Attachment: PIG-4663.patch
> HBaseStorage should allow the MaxResultsPerColumnFamily limit to avoid memory
> or scan timeout issues
> ----------------------------------------------------------------------------------------------------
>
> Key: PIG-4663
> URL: https://issues.apache.org/jira/browse/PIG-4663
> Project: Pig
> Issue Type: Improvement
> Affects Versions: 0.12.0
> Reporter: Paul Mazak
> Assignee: Paul Mazak
> Attachments: PIG-4663.patch
>
>
> The HBase client Scan API provides setMaxResultsPerColumnFamily, which caps
> the number of values returned per column family for each row, so a scan
> does not have to consume every column of a wide row at once. Without such a
> limit, a single row with several thousand columns is likely to make Pig fail
> with an OutOfMemoryError or a ScannerTimeoutException.
> The suggestion is to add an option, '-maxResultsPerColumnFamily', that can be
> passed in the optString parameter of the HBaseStorage constructor and that
> sets this value on the HBase Scan.
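
A minimal sketch of how such an option might be read from the optString. This
is not the code in PIG-4663.patch; the class name MaxResultsOptionSketch, the
method parseMaxResultsPerColumnFamily, and the whitespace-based parsing are
assumptions made for illustration, and only the option name comes from the
issue text.

```java
import java.util.OptionalInt;

public class MaxResultsOptionSketch {
    // Option name as proposed in the issue; everything else here is hypothetical.
    static final String OPTION = "-maxResultsPerColumnFamily";

    /**
     * Scans a whitespace-separated option string for the option and returns
     * the integer that follows it, or empty if the option is absent or has
     * no value after it.
     */
    public static OptionalInt parseMaxResultsPerColumnFamily(String optString) {
        String[] tokens = optString.trim().split("\\s+");
        for (int i = 0; i < tokens.length - 1; i++) {
            if (OPTION.equals(tokens[i])) {
                int limit = Integer.parseInt(tokens[i + 1]);
                if (limit <= 0) {
                    throw new IllegalArgumentException(OPTION + " must be positive: " + limit);
                }
                return OptionalInt.of(limit);
            }
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        OptionalInt limit =
            parseMaxResultsPerColumnFamily("-loadKey true -maxResultsPerColumnFamily 1000");
        // Assumption: a loader would then apply the value to the HBase Scan,
        // for example with scan.setMaxResultsPerColumnFamily(limit.getAsInt()).
        System.out.println(limit.isPresent() ? "limit=" + limit.getAsInt() : "no limit set");
        // prints "limit=1000"
    }
}
```

Returning an empty OptionalInt when the option is missing keeps the current
behaviour (no per-family limit) as the default in this sketch.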