[ 
https://issues.apache.org/jira/browse/HIVE-186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Thusoo updated HIVE-186:
-------------------------------

    Attachment: patch-186.txt

This patch contains the cleanup and refactoring of the graph-walking and 
rules framework. The unified framework is in the package

org.apache.hadoop.hive.ql.lib

Node is the interface that graph nodes must implement in order to use the 
graph walkers and rule dispatchers available within this framework. There are 
currently two implementations of this interface:

1. ASTNode, in ql.parse, which is a wrapper around the CommonTree class of the 
ANTLR runtime.
2. Operator, in ql.exec, which implements the operator tree nodes.
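
To illustrate why a single Node abstraction helps, here is a minimal sketch. 
The interface shape (getChildren, getName) and the SyntaxNode and OperatorNode 
classes are assumptions made for this example and are not copied from the 
patch. The point is only that one piece of traversal code can serve both kinds 
of tree.

```java
import java.util.ArrayList;
import java.util.List;

public class NodeSketch {
  /** Hypothetical stand-in for the shared Node interface (shape assumed). */
  interface Node {
    List<? extends Node> getChildren();
    String getName();
  }

  /** Plays the role of ASTNode: a node of the parsed syntax tree. */
  static class SyntaxNode implements Node {
    private final String token;
    private final List<SyntaxNode> children = new ArrayList<>();
    SyntaxNode(String token) { this.token = token; }
    SyntaxNode add(SyntaxNode child) { children.add(child); return this; }
    @Override public List<? extends Node> getChildren() { return children; }
    @Override public String getName() { return token; }
  }

  /** Plays the role of Operator: a node of the operator tree. */
  static class OperatorNode implements Node {
    private final String opName;
    private final List<OperatorNode> children = new ArrayList<>();
    OperatorNode(String opName) { this.opName = opName; }
    OperatorNode add(OperatorNode child) { children.add(child); return this; }
    @Override public List<? extends Node> getChildren() { return children; }
    @Override public String getName() { return opName; }
  }

  /** Written once against Node, so it works for either implementation. */
  static int countNodes(Node root) {
    int count = 1;
    for (Node child : root.getChildren()) {
      count += countNodes(child);
    }
    return count;
  }

  public static void main(String[] args) {
    Node ast = new SyntaxNode("TOK_QUERY")
        .add(new SyntaxNode("TOK_FROM"))
        .add(new SyntaxNode("TOK_INSERT"));
    Node ops = new OperatorNode("TS")
        .add(new OperatorNode("FIL").add(new OperatorNode("SEL")));
    System.out.println(countNodes(ast) + " " + countNodes(ops));
  }
}
```

The node names used in main are only illustrative labels.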

I have also removed the DefaultDispatcher implementation of Dispatcher, since 
its functionality can be expressed equivalently using DefaultRuleDispatcher. 
Accordingly, I have cleaned up the GenMR* processors and the ColumnPruner to 
reflect these changes. ColumnPruner is also split up: ColumnPrunerProcFactory 
creates the processors for the various rules it needs, and 
ColumnPrunerProcCtx (an implementation of NodeProcessorCtx) carries the 
context information between rules.

I have gotten rid of all the classes related to the ASTs (ASTEvent, 
ASTDispatcher, ASTProcessor, ASTEventProcessor, etc.).

Nodes are processed by implementations of NodeProcessor. I have removed the 
reflection-based invocation that we were doing in the earlier 
DefaultDispatcher and DefaultRuleDispatcher. Now only a single process function 
is called, and the user has to implement a different processor for each 
rule (see ColumnPrunerProcFactory).
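
The pattern of one process function with one processor per rule can be 
sketched as follows. Everything here is a simplified assumption for 
illustration: the process signature, the PruneCtx and PrunerProcFactory names, 
and a "rule" that is just an exact match on the node name. The real rule 
matching in DefaultRuleDispatcher may well be richer than this.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ProcessorSketch {
  /** Hypothetical, reduced Node: only the name matters for dispatch here. */
  interface Node { String getName(); }

  /** Marker for context objects passed between processors (assumed shape). */
  interface NodeProcessorCtx {}

  /** A single process method, so no reflection is needed to pick a handler. */
  interface NodeProcessor {
    void process(Node nd, NodeProcessorCtx ctx);
  }

  /** Hypothetical context that records what each processor did. */
  static class PruneCtx implements NodeProcessorCtx {
    final List<String> log = new ArrayList<>();
  }

  /** Hypothetical factory: one processor per rule. */
  static class PrunerProcFactory {
    static NodeProcessor selectProc() {
      return (nd, ctx) -> ((PruneCtx) ctx).log.add("select:" + nd.getName());
    }
    static NodeProcessor filterProc() {
      return (nd, ctx) -> ((PruneCtx) ctx).log.add("filter:" + nd.getName());
    }
    static NodeProcessor defaultProc() {
      return (nd, ctx) -> ((PruneCtx) ctx).log.add("default:" + nd.getName());
    }
  }

  /** Simplified dispatcher: picks the processor whose rule matches the node. */
  static class RuleDispatcher {
    private final NodeProcessor defaultProc;
    private final Map<String, NodeProcessor> rules;
    private final NodeProcessorCtx ctx;

    RuleDispatcher(NodeProcessor defaultProc, Map<String, NodeProcessor> rules,
                   NodeProcessorCtx ctx) {
      this.defaultProc = defaultProc;
      this.rules = rules;
      this.ctx = ctx;
    }

    void dispatch(Node nd) {
      // Fall back to the default processor when no rule matches.
      NodeProcessor proc = rules.getOrDefault(nd.getName(), defaultProc);
      proc.process(nd, ctx);
    }
  }

  public static void main(String[] args) {
    PruneCtx ctx = new PruneCtx();
    Map<String, NodeProcessor> rules = new HashMap<>();
    rules.put("SEL", PrunerProcFactory.selectProc());
    rules.put("FIL", PrunerProcFactory.filterProc());
    RuleDispatcher dispatcher =
        new RuleDispatcher(PrunerProcFactory.defaultProc(), rules, ctx);
    dispatcher.dispatch(() -> "SEL");
    dispatcher.dispatch(() -> "TS");
    System.out.println(ctx.log);
  }
}
```

Because each rule has its own processor object, adding a rule means adding an 
entry to the map rather than adding a specially named method for reflection 
to find.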

The walker interface has been renamed to GraphWalker, and the default 
implementation is now called DefaultGraphWalker. I have also eliminated the 
TopoWalker. DefaultGraphWalker is no longer an abstract class, so clients can 
use it right out of the box. The ColumnPrunerWalker and the GenMapRedWalker are 
still subclasses of DefaultGraphWalker.
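
As a rough sketch of what a concrete default walker plus specialized 
subclasses could look like: the startWalking and walk method names, the 
children-first visiting order, and the PreOrderWalker subclass below are all 
assumptions for illustration and are not taken from the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class WalkerSketch {
  interface Node {
    List<? extends Node> getChildren();
    String getName();
  }

  interface Dispatcher { void dispatch(Node nd); }

  interface GraphWalker {
    void startWalking(Collection<? extends Node> startNodes);
  }

  /** Minimal node used only to build example graphs. */
  static class SimpleNode implements Node {
    private final String name;
    private final List<SimpleNode> children = new ArrayList<>();
    SimpleNode(String name) { this.name = name; }
    SimpleNode add(SimpleNode child) { children.add(child); return this; }
    @Override public List<? extends Node> getChildren() { return children; }
    @Override public String getName() { return name; }
  }

  /** Concrete, so it is usable as-is; dispatches each node once, children first. */
  static class DefaultGraphWalker implements GraphWalker {
    protected final Dispatcher dispatcher;
    // Nodes do not override equals here, so this set tracks object identity.
    protected final Set<Node> visited = new HashSet<>();

    DefaultGraphWalker(Dispatcher dispatcher) { this.dispatcher = dispatcher; }

    @Override public void startWalking(Collection<? extends Node> startNodes) {
      for (Node nd : startNodes) {
        walk(nd);
      }
    }

    protected void walk(Node nd) {
      if (!visited.add(nd)) {
        return; // already handled via another parent
      }
      for (Node child : nd.getChildren()) {
        walk(child);
      }
      dispatcher.dispatch(nd);
    }
  }

  /** Hypothetical subclass that changes only the visiting order. */
  static class PreOrderWalker extends DefaultGraphWalker {
    PreOrderWalker(Dispatcher dispatcher) { super(dispatcher); }

    @Override protected void walk(Node nd) {
      if (!visited.add(nd)) {
        return;
      }
      dispatcher.dispatch(nd);
      for (Node child : nd.getChildren()) {
        walk(child);
      }
    }
  }

  public static void main(String[] args) {
    // Diamond-shaped graph: a -> b, a -> c, b -> d, c -> d.
    SimpleNode d = new SimpleNode("d");
    SimpleNode a = new SimpleNode("a")
        .add(new SimpleNode("b").add(d))
        .add(new SimpleNode("c").add(d));
    List<String> order = new ArrayList<>();
    new DefaultGraphWalker(nd -> order.add(nd.getName()))
        .startWalking(Arrays.asList(a));
    System.out.println(order);
  }
}
```

A client that needs no special ordering instantiates the default walker 
directly, while a subclass overrides only the walk step.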


> Refactor code to use a single graph, nodeprocessor, dispatcher and rule 
> abstraction
> -----------------------------------------------------------------------------------
>
>                 Key: HIVE-186
>                 URL: https://issues.apache.org/jira/browse/HIVE-186
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Ashish Thusoo
>            Assignee: Ashish Thusoo
>         Attachments: patch-186.txt
>
>
> Currently, the query processor has two different tree and rule abstractions - 
> one for ASTs and one for Operator Graphs. We should clean this up so that we 
> have a single abstraction that can be reused at different stages in the query 
> compiler.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
