Re: CUBE/ROLLUP/GROUPING SETS syntax

Jonathan Coveney Tue, 29 May 2012 13:05:54 -0700

Hey Prasanth, happy hacking.

My opinion:


CUBE:

alias = CUBE rel BY (a,b,c);


I like that syntax. It's unambiguous what is going on.


ROLLUP:


alias = CUBE rel BY ROLLUP(a,b,c);


I never liked that syntax in SQL. I suggest we just do what we did with
CUBE, i.e.


alias = ROLLUP rel BY (a,b,c);


GROUPING SETS:


alias = CUBE rel BY GROUPING SETS((a,b),(b),());


I don't like this. Mixing CUBE with GROUPING SETS is confusing to me. Maybe,
following the same pattern, you could do something like:

alias = GROUPING_SET rel BY ((a,b),(b),());
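
To make the three forms concrete, here is a rough Python sketch (not Pig
code) of which grouping sets each one would expand to, assuming the usual
SQL semantics: CUBE yields every subset of the columns, ROLLUP yields every
prefix, and GROUPING SETS is just the explicit list. The function names
cube_sets and rollup_sets are made up for illustration.

```python
from itertools import combinations


def cube_sets(cols):
    """Every subset of cols, from the full tuple down to the empty grouping."""
    return [
        combo
        for size in range(len(cols), -1, -1)
        for combo in combinations(cols, size)
    ]


def rollup_sets(cols):
    """Every prefix of cols, from the full tuple down to the empty grouping."""
    return [tuple(cols[:i]) for i in range(len(cols), -1, -1)]


# The explicit list from the GROUPING SETS example above.
explicit_sets = [("a", "b"), ("b",), ()]
```

Under those assumptions, both CUBE and ROLLUP are shorthand for a particular
list of grouping sets, which is why giving each its own keyword reads more
naturally to me than nesting them under CUBE.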

As for HAVING: is there an optimization that can be done with a HAVING
clause that can't be done based on the logical plan that comes afterwards?
That seems odd to me. Since you have to materialize the result anyway,
can't the HAVING clause just be a FILTER that comes after the cube? I don't
know why we need a special syntax.
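
A small Python sketch of that argument, assuming the cube output is fully
materialized first: aggregate once per grouping set, then apply the HAVING
condition as an ordinary filter over the result. The helpers aggregate and
having_as_filter and the sample rows are hypothetical, not part of Pig.

```python
from collections import defaultdict


def aggregate(rows, grouping_sets, measure):
    """Sum `measure` for each (grouping set, key values) pair."""
    totals = defaultdict(int)
    for row in rows:
        for cols in grouping_sets:
            key = tuple(row[col] for col in cols)
            totals[(cols, key)] += row[measure]
    return dict(totals)


def having_as_filter(totals, predicate):
    """Apply a HAVING-style condition as a plain filter after aggregation."""
    return {group: total for group, total in totals.items() if predicate(total)}


# Hypothetical sample data with two dimensions (a, b) and one measure (n).
rows = [
    {"a": 1, "b": "x", "n": 5},
    {"a": 1, "b": "y", "n": 2},
    {"a": 2, "b": "x", "n": 1},
]
totals = aggregate(rows, [("a", "b"), ("b",), ()], "n")
kept = having_as_filter(totals, lambda total: total > 4)
```

Nothing in the filter step needs to know the groups came from a cube, which
is the point of the question above.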

My opinion. Forgive janky formatting, gmail + paste = pain.
Jon

2012/5/27 Prasanth J <[email protected]>

> Hello everyone
>
> I am looking for feedback from the community about the syntax for
> CUBE/ROLLUP/GROUPING SETS operations in pig.
> I am moving the discussion from JIRA to dev-list so that everyone can
> share their opinion for operator syntax. Please have a look at the syntax
> proposal at the link below and let me know your opinion
>
>
> https://issues.apache.org/jira/browse/PIG-2167?focusedCommentId=13277644&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13277644
>
> Thanks
> -- Prasanth
>
>
