Cheolsoo Park created PIG-3269:
----------------------------------
Summary: In operator support
Key: PIG-3269
URL: https://issues.apache.org/jira/browse/PIG-3269
Project: Pig
Issue Type: New Feature
Components: internal-udfs, parser
Affects Versions: 0.11
Reporter: Cheolsoo Park
Assignee: Cheolsoo Park
Fix For: 0.12
This is another language improvement using the same approach as in PIG-3268.
Currently, Pig has no support for the IN operator. To mimic it, users often have
to chain several equality checks together with OR.
For example,
{code}
a = LOAD '1.txt' USING PigStorage(',') AS (i:int);
b = FILTER a BY
(i == 1) OR
(i == 22) OR
(i == 333) OR
(i == 4444) OR
(i == 55555);
{code}
But this can be rewritten more compactly using the IN operator as follows:
{code}
a = LOAD '1.txt' USING PigStorage(',') AS (i:int);
b = FILTER a BY i IN (1,22,333,4444,55555);
{code}
I propose that we implement the IN operator in the following manner:
* Add built-in UDFs that take expressions as args. For the example above, we
can define a UDF such as {{builtInUdf(i, 1, 22, 333, 4444, 55555)}}, where the
first arg is the tested expression and the remaining args are the values.
* Add syntactic sugar for these built-in UDFs.
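To make the intended UDF semantics concrete, here is a minimal sketch in plain Java. It is not the proposed implementation and does not use Pig's {{EvalFunc}} API; the class name {{InSketch}}, the method {{in}}, and the null handling are all assumptions made for illustration.

```java
// Hypothetical stand-in for the proposed built-in UDF; not a Pig class.
final class InSketch {
    private InSketch() {}

    // The first argument is the tested expression; the rest are the candidate values.
    // Assumption: a null tested value never matches (actual null semantics may differ).
    static boolean in(Object value, Object... candidates) {
        if (value == null) {
            return false;
        }
        for (Object candidate : candidates) {
            if (value.equals(candidate)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Mirrors FILTER a BY i IN (1,22,333,4444,55555) for i = 22 and i = 7.
        System.out.println(in(22, 1, 22, 333, 4444, 55555));
        System.out.println(in(7, 1, 22, 333, 4444, 55555));
    }
}
```

With this shape, the syntactic sugar only has to rewrite {{i IN (v1, ..., vn)}} into a call with {{i}} as the first argument.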
As in PIG-3268, this approach requires a limit on the number of values. This is
again because we need to populate the full list of possible argument schemas in
{{EvalFunc.getArgToFuncMapping}}. For now, I arbitrarily chose 50, but it can
be easily changed.
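The reason for the limit can be illustrated with a rough sketch of the enumeration. This is plain Java, not Pig's {{FuncSpec}} or {{Schema}} classes; the names {{ArgSchemaSketch}} and {{signatures}} are hypothetical, and it assumes every argument in a call has the same type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration of why a fixed upper bound is needed; not Pig code.
final class ArgSchemaSketch {
    // Limit taken from the description above.
    static final int MAX_VALUES = 50;

    private ArgSchemaSketch() {}

    // Builds one signature per (type, value count): the tested expression
    // followed by 1..maxValues candidate values, all of the same type (assumption).
    static List<List<String>> signatures(List<String> types, int maxValues) {
        List<List<String>> result = new ArrayList<>();
        for (String type : types) {
            for (int n = 1; n <= maxValues; n++) {
                result.add(Collections.nCopies(n + 1, type));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> types = Arrays.asList("int", "long", "chararray");
        // 3 types x 50 value counts = 150 signatures to register up front.
        System.out.println(signatures(types, MAX_VALUES).size());
    }
}
```

Because every signature must be listed in advance, the list grows linearly with the limit for each supported type, so the limit has to be finite.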