[ https://issues.apache.org/jira/browse/PIG-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dmitriy V. Ryaboy updated PIG-1205:
-----------------------------------

    Attachment: PIG_1205_5.path

This patch (not really review-ready yet) introduces the Elephant-Bird improvements.

You can use the -gt, -gte, -lt, and -lte flags to restrict the load to a range of row keys, specify caching and per-region row limits, and choose the caster to use (interpret Strings, as before, or use bytes directly for more efficient storage and communication).

The filtering is not ideal yet: it still spins up all the map tasks, and the ones whose keys are filtered out just finish extremely fast. The progress reporting is a bit jittery, but better than nothing.

TODO: fix up filtering, add projection pushdown, add filter pushdown, and write better tests.

> Enhance HBaseStorage-- Make it support loading row key and implement StoreFunc
> ------------------------------------------------------------------------------
>
>                 Key: PIG-1205
>                 URL: https://issues.apache.org/jira/browse/PIG-1205
>             Project: Pig
>          Issue Type: Sub-task
>    Affects Versions: 0.7.0
>            Reporter: Jeff Zhang
>            Assignee: Dmitriy V. Ryaboy
>             Fix For: 0.8.0
>
>         Attachments: PIG_1205.patch, PIG_1205_2.patch, PIG_1205_3.patch, PIG_1205_4.patch, PIG_1205_5.path
>

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
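
For readers unfamiliar with how the -gt, -gte, -lt, and -lte bounds mentioned in the comment above relate to row keys and regions, here is a minimal Java sketch. It is not the code from the attached patch: the class name RowRangeSketch and the methods accepts and mayOverlap are hypothetical, and it assumes row keys are ordered by unsigned lexicographic byte comparison, as HBase orders them. The mayOverlap check illustrates one way the "fix up filtering" TODO might be approached, namely by skipping regions that cannot contain a matching key instead of starting a map task for them.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration only; not taken from the PIG-1205 patch. */
final class RowRangeSketch {
    // Each bound is optional; null means the corresponding flag was not given.
    private final byte[] gt, gte, lt, lte;

    RowRangeSketch(byte[] gt, byte[] gte, byte[] lt, byte[] lte) {
        this.gt = gt;
        this.gte = gte;
        this.lt = lt;
        this.lte = lte;
    }

    /** Unsigned lexicographic comparison of two byte arrays. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    /** True if a single row key satisfies every bound that was specified. */
    boolean accepts(byte[] row) {
        if (gt != null && compare(row, gt) <= 0) return false;
        if (gte != null && compare(row, gte) < 0) return false;
        if (lt != null && compare(row, lt) >= 0) return false;
        if (lte != null && compare(row, lte) > 0) return false;
        return true;
    }

    /**
     * Conservative check for a region covering [start, end).
     * An empty array means "unbounded" on that side.
     * Returns false only when no key in the region can be accepted.
     */
    boolean mayOverlap(byte[] start, byte[] end) {
        if (end.length > 0) {
            // Every key in the region is < end, so it is below the lower bound.
            if (gt != null && compare(end, gt) <= 0) return false;
            if (gte != null && compare(end, gte) <= 0) return false;
        }
        if (start.length > 0) {
            // Every key in the region is >= start, so it is above the upper bound.
            if (lt != null && compare(start, lt) >= 0) return false;
            if (lte != null && compare(start, lte) > 0) return false;
        }
        return true;
    }

    private static byte[] b(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Equivalent in spirit to passing "-gte row100 -lt row200".
        RowRangeSketch range = new RowRangeSketch(null, b("row100"), b("row200"), null);
        for (String key : new String[] {"row100", "row150", "row200", "row250"}) {
            System.out.println(key + " accepted: " + range.accepts(b(key)));
        }
        System.out.println("region [row300, end of table) may overlap: "
                + range.mayOverlap(b("row300"), new byte[0]));
    }
}
```

With a check like mayOverlap applied when input splits are computed, regions entirely outside the requested range would not need a map task at all; whether the final patch takes this route is not stated in the comment.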