[jira] Commented: (PIG-113) Make Grunt's explain output more understandable

Alan Gates (JIRA) Fri, 22 Feb 2008 10:22:04 -0800

    [ https://issues.apache.org/jira/browse/PIG-113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12571507#action_12571507 ]


Alan Gates commented on PIG-113:
--------------------------------

In general the patch looks good.  Making the explain output more readable is 
something we need.

There's one question I'd like to get input from others on.  In the 
patch you've made the arguments to EXPLAIN tokens in the language (XML, TREE).  
That's the standard SQL approach.  The pro is that it is easy for users to 
type, and SQL users probably already think about things that way.  The con is 
that it bloats the number of tokens in the language (compare the number of 
tokens in the SQL standard with the number in a language like Java), and it 
means that many changes will include changes to the parser.

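The other option is to make EXPLAIN take a string argument, so it would be 
EXPLAIN 'tree' instead of EXPLAIN TREE.  This has the reverse pros and cons: 
it is slightly more to type and less familiar to SQL users, but it adds no 
tokens and new formats need no parser changes.  Another pro is that Java (and 
similar) programmers may find this a more natural model.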
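
To make the string-argument option concrete, here is a minimal sketch of how 
the argument might be resolved once the statement has been parsed.  The class, 
enum, and method names (ExplainFormatSketch, Format, parseFormat) are 
hypothetical and are not taken from the patch or from Pig.  The point is only 
that the grammar would need a single EXPLAIN token followed by a quoted string, 
and that a new format would mean a new enum value rather than a parser change.

```java
import java.util.Locale;

/** Hypothetical sketch only; these names do not come from Pig or the patch. */
public class ExplainFormatSketch {

    /** Formats EXPLAIN could support; adding one here needs no grammar change. */
    public enum Format { DEFAULT, TREE, XML }

    /**
     * Resolves the optional string argument of EXPLAIN, e.g. the 'tree' in
     * EXPLAIN 'tree'. A null or blank argument selects the default format.
     */
    public static Format parseFormat(String arg) {
        if (arg == null || arg.trim().isEmpty()) {
            return Format.DEFAULT;
        }
        try {
            // Case-insensitive lookup so 'tree', 'Tree' and 'TREE' all work.
            return Format.valueOf(arg.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException(
                    "Unknown explain format: '" + arg + "'", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseFormat("tree")); // prints TREE
        System.out.println(parseFormat("XML"));  // prints XML
        System.out.println(parseFormat(null));   // prints DEFAULT
    }
}
```

With the token approach, TREE and XML would each be a keyword in the grammar 
instead, so a typo would be caught by the parser rather than by a lookup like 
the one above.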

Thoughts?

> Make Grunt's explain output more understandable
> -----------------------------------------------
>
>                 Key: PIG-113
>                 URL: https://issues.apache.org/jira/browse/PIG-113
>             Project: Pig
>          Issue Type: Improvement
>          Components: grunt
>    Affects Versions: 0.1.0
>            Reporter: Pi Song
>            Priority: Minor
>         Attachments: pig_printtree_1.patch
>
>
> I think it would be better if we could display the execution plan in a more 
> understandable way. One intuitive way to do this is to show the output as a 
> tree, as SQL Server does.
> Possibly we could have 'AS <format>' as an optional argument to the explain 
> command. For example:
> {noformat}
> Grunt> explain bag1 AS tree ;
> Grunt> explain bag1 AS xml ;
> {noformat}
> and 
> {noformat}
> Grunt> explain bag1   
> {noformat}
> will display the default format.
> I have included a patch that generates tree output.
> Here is a sample of the existing output format:
> {noformat}
> Logical Plan:
> Group root-Sun Feb 17 19:37:07 GMT+10:00 2008-5
> Object id: 9814147
> Inputs: 26335425 
> Schema: (group, (sum, (), (), ()))
> EvalSpecs:
>         Generate: has 2 children
>                 Project: (0)
>                 Star
> Split root-Sun Feb 17 19:37:07 GMT+10:00 2008-2
> Object id: 25199001
> Inputs: 29132923 
> Schema: (sum, (), (), ())
> EvalSpecs:
> Eval root-Sun Feb 17 19:37:07 GMT+10:00 2008-1
> Object id: 29132923
> Inputs: 10774273 
> Schema: (sum, (), (), ())
> EvalSpecs:
>         Generate: has 4 children
>                 FuncEval: name: org.apache.pig.impl.builtin.ADD args:
>                         Generate: has 2 children
>                                 Project: (0)
>                                 Project: (1)
>                 Project: (0)
>                 Project: (1)
>                 Project: (2)
> Load root-Sun Feb 17 19:37:07 GMT+10:00 2008-0
> Object id: 10774273
> Inputs: 
> Schema: ()
> EvalSpecs:
> -----------------------------------------------
> Physical Plan:
> MAPREDUCE
> Object id: 17671659
> Inputs: 682933706
> Map: 
>         Star
> Grouping Funcs: 
>         Generate: has 2 children
>                 Project: (0)
>                 Star
> Input Files: /tmp/temp678140026/tmp1867058340
> MAPREDUCE
> Object id: 17308974
> Inputs: 
> Map: 
>         Composite: has 2 children
>                 Star
>                 Generate: has 4 children
>                         FuncEval: name: org.apache.pig.impl.builtin.ADD args:
>                                 Generate: has 2 children
>                                         Project: (0)
>                                         Project: (1)
>                         Project: (0)
>                         Project: (1)
>                         Project: (2)
> Input Files: /tmp/data1.txt
> Output File: /tmp/temp678140026/tmp1613817084
> {noformat}
> Here is a sample of my tree output, which is more compact and easier to 
> understand:
> {noformat}
> grunt> explain c1 as tree ;
> Logical Plan:
> |---LOCogroup ( GENERATE {[PROJECT $0],[*]} ) 
>       |---LOSplitOutput (  ) 
>             |---LOSplit ( ([PROJECT $0] < ['5']),([PROJECT $0] >= ['5']) ) 
>                   |---LOEval ( GENERATE {[org.apache.pig.impl.builtin.ADD(GENERATE {[PROJECT $0],[PROJECT $1]})],[PROJECT $0],[PROJECT $1],[PROJECT $2]} ) 
>                         |---LOLoad ( file = /tmp/data1.txt )
> -----------------------------------------------
> Physical Plan:
> |---POMapreduce
>     Map : *
>     Grouping : Generate(Project(0),*)
>     Input File(s) : /tmp/temp678140026/tmp1867058340
>       |---POMapreduce
>           Map : Composite(*,Generate(FuncEval(org.apache.pig.impl.builtin.ADD(Generate(Project(0),Project(1)))),Project(0),Project(1),Project(2)))
>           Input File(s) : /tmp/data1.txt
> {noformat}
> I'm also thinking about producing XML output, as it might benefit people who 
> are working on displaying the execution plan in a GUI.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
