[ 
https://issues.apache.org/jira/browse/HIVE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12677548#action_12677548
 ] 

Namit Jain commented on HIVE-279:
---------------------------------

Some high-level comments:

1. Add more comments everywhere, specifically in joinPPD (OpProcFactory).
2. Remove the operator-specific code in 
ExprWalkerProcFactory: ColumnExprProcessor: process.
3. Use specific data structures wherever possible instead of more generic 
ones.

ExprWalkerInfo:

  private Map<String, List<Node>> pushdownPreds;
  private Map<Node, ExprInfo> exprInfoMap;

In both of them, Node means exprNodeDesc. Why don't we use that instead?

Similarly, in OpWalkerInfo:

  private Map<Node, ExprWalkerInfo> opToPrunedPredsMap;
  private Map<Operator<? extends Serializable>, OpParseContext> opToParseCtxMap;

Use Operator instead of Node in opToPrunedPredsMap.

4. Can you move OpWalker and ExprWalker into different directories?
5. Why are filters only pushed on top of TableScan? Can't it be done anywhere? 
If you want to do so in a follow-up, can you file a JIRA for that?
6. Many files in the ppd directory have no Apache header.
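
To illustrate comment 3, here is a minimal sketch of why a specific value type 
is preferable to a generic one. Node and ExprNode below are hypothetical 
stand-ins written for this example, not the Hive classes, and the method names 
are assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are not the Hive classes.
interface Node {
  String getName();
}

class ExprNode implements Node {
  private final String text;
  ExprNode(String text) { this.text = text; }
  public String getName() { return text; }
  // Expression-specific accessor that a plain Node does not expose.
  String getExprString() { return text; }
}

class TypedMapSketch {
  // Generic value type: each read needs a downcast to reach expression methods,
  // and a wrong element type is only detected at run time.
  static String firstGeneric(Map<String, List<Node>> preds, String alias) {
    Node n = preds.get(alias).get(0);
    return ((ExprNode) n).getExprString();
  }

  // Specific value type: the compiler checks what the map holds, no cast needed.
  static String firstTyped(Map<String, List<ExprNode>> preds, String alias) {
    return preds.get(alias).get(0).getExprString();
  }

  static Map<String, List<Node>> sampleGeneric() {
    Map<String, List<Node>> m = new HashMap<String, List<Node>>();
    List<Node> preds = new ArrayList<Node>();
    preds.add(new ExprNode("a.age = 20"));
    m.put("a", preds);
    return m;
  }

  static Map<String, List<ExprNode>> sampleTyped() {
    Map<String, List<ExprNode>> m = new HashMap<String, List<ExprNode>>();
    List<ExprNode> preds = new ArrayList<ExprNode>();
    preds.add(new ExprNode("a.age = 20"));
    m.put("a", preds);
    return m;
  }

  public static void main(String[] args) {
    System.out.println(firstGeneric(sampleGeneric(), "a"));
    System.out.println(firstTyped(sampleTyped(), "a"));
  }
}
```

Both methods return the same string; the typed version just moves the check 
from run time to compile time.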


SemanticAnalyzer.java:

A comment explaining why colInfoMap exists would help. Give an example: a 
group by where the table column order is different from the grouped column 
order.

The same applies to posAliasMap and nameToInputColumnInfoMap for JOIN.

genJoinOperatorChildren:


      if(aliases == null) {
        aliases = new HashSet<String>();
        posToAliasMap.put(pos, aliases);
      }

Isn't the IF redundant?
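
Whether the check is redundant depends on whether a set can already exist for 
that pos. The sketch below shows the guarded get-or-create idiom in isolation. 
It assumes that aliases is read from posToAliasMap just before the check and 
that the map is keyed by Integer with Set<String> values; neither is shown in 
the snippet above, and addAlias is a hypothetical helper.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class PosToAliasSketch {
  // Guarded get-or-create: create the set only if none exists yet for pos.
  // Key and value types here are assumptions made for this example.
  static void addAlias(Map<Integer, Set<String>> posToAliasMap, int pos, String alias) {
    Set<String> aliases = posToAliasMap.get(pos);
    if (aliases == null) {
      aliases = new HashSet<String>();
      posToAliasMap.put(pos, aliases);
    }
    aliases.add(alias);
  }

  public static void main(String[] args) {
    Map<Integer, Set<String>> posToAliasMap = new HashMap<Integer, Set<String>>();
    addAlias(posToAliasMap, 0, "a");
    addAlias(posToAliasMap, 0, "b"); // second visit to pos 0: the guard keeps "a"
    addAlias(posToAliasMap, 1, "c");
    System.out.println(posToAliasMap.get(0).size());
    System.out.println(posToAliasMap.get(1).size());
  }
}
```

If each pos is visited exactly once and the map starts empty, the condition is 
always true and the body can run unconditionally. If a pos can be visited 
twice, dropping the guard would replace the existing set and lose its entries.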




> Implement predicate push down for hive queries
> ----------------------------------------------
>
>                 Key: HIVE-279
>                 URL: https://issues.apache.org/jira/browse/HIVE-279
>             Project: Hadoop Hive
>          Issue Type: New Feature
>    Affects Versions: 0.2.0
>            Reporter: Prasad Chakka
>            Assignee: Prasad Chakka
>         Attachments: hive-279.2.patch, hive-279.patch
>
>
> Push predicates that are expressed in outer queries into inner queries where 
> possible so that rows get filtered out sooner.
> e.g.
> select a.*, b.* from a join b on (a.uid = b.uid) where a.age = 20 and 
> a.gender = 'm'
> The current compiler generates the filter predicate in the reducer after the 
> join, so all the rows have to be passed from mapper to reducer. By pushing 
> the filter predicate to the mapper, query performance should improve.
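
The idea in the issue description can be sketched on in-memory lists. This is 
only an illustration of filtering before versus after an inner join, not how 
the Hive compiler implements it; the classes A, B and Joined and all method 
names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PushDownSketch {
  // Hypothetical row types for tables a and b.
  static class A {
    final int uid, age;
    final String gender;
    A(int uid, int age, String gender) { this.uid = uid; this.age = age; this.gender = gender; }
  }
  static class B {
    final int uid;
    B(int uid) { this.uid = uid; }
  }
  static class Joined {
    final A a;
    final B b;
    Joined(A a, B b) { this.a = a; this.b = b; }
  }

  // The predicate from the example query: a.age = 20 and a.gender = 'm'.
  static boolean matches(A a) { return a.age == 20 && "m".equals(a.gender); }

  // Inner join on uid (nested loop, for brevity).
  static List<Joined> join(List<A> as, List<B> bs) {
    List<Joined> out = new ArrayList<Joined>();
    for (A a : as) for (B b : bs) if (a.uid == b.uid) out.add(new Joined(a, b));
    return out;
  }

  // Filter after the join: every row of a reaches the join.
  static List<Joined> filterAfterJoin(List<A> as, List<B> bs) {
    List<Joined> out = new ArrayList<Joined>();
    for (Joined j : join(as, bs)) if (matches(j.a)) out.add(j);
    return out;
  }

  // Filter pushed below the join: only matching rows of a reach the join.
  static List<Joined> filterBeforeJoin(List<A> as, List<B> bs) {
    List<A> kept = new ArrayList<A>();
    for (A a : as) if (matches(a)) kept.add(a);
    return join(kept, bs);
  }

  static List<A> sampleA() {
    return Arrays.asList(new A(1, 20, "m"), new A(2, 30, "m"), new A(3, 20, "f"));
  }
  static List<B> sampleB() {
    return Arrays.asList(new B(1), new B(2), new B(3));
  }

  public static void main(String[] args) {
    System.out.println(filterAfterJoin(sampleA(), sampleB()).size());
    System.out.println(filterBeforeJoin(sampleA(), sampleB()).size());
  }
}
```

For an inner join both variants return the same rows; the pushed-down variant 
simply feeds fewer rows of a into the join. Outer joins need more care and are 
not covered by this sketch.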

