[ 
https://issues.apache.org/jira/browse/PIG-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Welch updated PIG-3961:
----------------------------

    Attachment:     (was: filters-patch.diff)

> Adding HBaseStorage cell value filters
> --------------------------------------
>
>                 Key: PIG-3961
>                 URL: https://issues.apache.org/jira/browse/PIG-3961
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Mike Welch
>            Assignee: Mike Welch
>            Priority: Minor
>             Fix For: 0.14.0
>
>         Attachments: filters-patch.v2.diff
>
>
> This adds three server-side filtering options for loading data with 
> HBaseStorage:
> # specified cf:col does not exist
> {{-null cf:col}}
> # specified cf:col must exist
> {{-notnull cf:col}}
> # specified cf:col contains the given value
> {{-val cf:col=value}}
> These are meant to replace the common Pig pattern of loading data and 
> immediately filtering for a specific condition, and to reduce data transfer 
> by applying the filter on the server. For example:
> data = load 'hbase://mytable' using 
> org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:*') as (cf:map[]) ;
> data_with_value = filter data by cf#'col' == 'value' ;
> can be replaced with:
> data_with_value = load 'hbase://mytable' using 
> org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:*', '-val cf:col=value') 
> as (cf:map[]) ;
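
To make the intended semantics of the three options concrete, here is a minimal Python sketch that models them over in-memory rows. It is not HBaseStorage code and not the implementation in the attached patch: the names `parse_option`, `keep_row`, and `select` are hypothetical, rows are modelled as plain dicts keyed by `cf:col`, and `-val` is assumed to mean an exact match, as in the `filter` example above.

```python
# Hypothetical in-memory model of the three proposed options.
# A row is a dict mapping "cf:col" to a cell value; a missing key means
# the cell does not exist in that row.

def parse_option(option):
    """Split an option such as '-val cf:col=value' into (flag, column, value)."""
    flag, _, arg = option.partition(" ")
    column, _, value = arg.partition("=")
    return flag, column, (value if flag == "-val" else None)

def keep_row(row, option):
    """Return True if the row passes the filter described by the option."""
    flag, column, value = parse_option(option)
    if flag == "-null":
        return column not in row        # cell must not exist
    if flag == "-notnull":
        return column in row            # cell must exist
    if flag == "-val":
        # Assumption: the cell value must equal the given value exactly.
        return row.get(column) == value
    raise ValueError("unknown option: " + flag)

def select(rows, option):
    """Keep only the rows that pass the filter."""
    return [row for row in rows if keep_row(row, option)]

rows = [
    {"cf:col": "value", "cf:other": "x"},
    {"cf:col": "different"},
    {"cf:other": "y"},
]

with_value = select(rows, "-val cf:col=value")
without_col = select(rows, "-null cf:col")
with_col = select(rows, "-notnull cf:col")
```

In this model `-null` and `-notnull` are complements of each other, and every row kept by `-val` is also kept by `-notnull`. The proposal's point is that the equivalent check would run on the HBase side, so rows that fail it are never transferred to Pig.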



--
This message was sent by Atlassian JIRA
(v6.2#6252)
