[ https://issues.apache.org/jira/browse/HIVE-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12877049#action_12877049 ]
Prafulla Tekawade commented on HIVE-417:
----------------------------------------

I was thinking of adding a query rewrite module. It would be a rule-based query rewrite system that rewrites a query into a semantically equivalent query that is better optimized and/or uses indexes (not just for scans, but also for other query operators, e.g. GroupBy).

For example:

    select distinct c1 from t1;

If we have a dense index ('compact summary index' in this Hive indexing patch) on c1, this query can be replaced with a query on the index table itself:

    select idx_key from t1_cmpct_sum_idx;

Similar transformations can be applied to other queries. The module would be placed just before the optimizer and would help the optimizer. The module structure looks like this:

    [Query parser]
    [Query rewrites]   --> new phase
    [Query optimization]
    [Query execution planner]
    [Query execution engine]

The rewrite module is generic: it is not just for the indexing case above but for other cases too, e.g. OR predicates to union (for efficiency?), outer join to a union of anti and semi joins, moving 'order by' out of a union subquery, etc.

The aim is to implement very simple, lightweight rewrite support, implement the indexing-related rewrites (the rewrite above does not even need a new run-time map-reduce operator), and integrate indexing support quickly and cleanly. As noted above, this rewrite phase is rule-based (not cost-based), a sort of early optimization.

Let me know what you think. I'll start by reading your patch. This would cover most of TODO 1; TODO 2 and 3 will have to be looked into.
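To make the proposal concrete, here is a minimal sketch of a rule-based rewrite phase for the `select distinct` example above. It is illustrative only and not taken from the patch: the names `INDEXES`, `distinct_to_index_scan`, and `rewrite` are hypothetical, the catalog is a hard-coded dictionary, and the rule matches query text with a regular expression, whereas a real module would presumably operate on the parsed query tree. It also assumes the index table holds exactly one `idx_key` row per distinct value of the indexed column.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical catalog: (table, column) -> dense index table.
# Assumption: the index table has one idx_key row per distinct column value.
INDEXES: Dict[Tuple[str, str], str] = {("t1", "c1"): "t1_cmpct_sum_idx"}

# A rule returns the rewritten query, or None when it does not apply.
Rule = Callable[[str], Optional[str]]

_DISTINCT = re.compile(
    r"^\s*select\s+distinct\s+(\w+)\s+from\s+(\w+)\s*;?\s*$",
    re.IGNORECASE,
)


def distinct_to_index_scan(query: str) -> Optional[str]:
    """Replace 'select distinct c from t' with a scan of a dense index on t.c."""
    match = _DISTINCT.match(query)
    if match is None:
        return None
    column, table = match.group(1).lower(), match.group(2).lower()
    index_table = INDEXES.get((table, column))
    if index_table is None:
        return None  # no suitable index, leave the query alone
    return f"select idx_key from {index_table};"


RULES: List[Rule] = [distinct_to_index_scan]


def rewrite(query: str, rules: List[Rule] = RULES, max_passes: int = 10) -> str:
    """Apply rules until none fires (or max_passes is reached).

    Purely rule-based: the first applicable rule wins, no cost model is used.
    """
    for _ in range(max_passes):
        for rule in rules:
            rewritten = rule(query)
            if rewritten is not None:
                query = rewritten
                break  # restart from the first rule on the new query
        else:
            return query  # no rule applied, fixed point reached
    return query
```

Calling `rewrite("select distinct c1 from t1;")` yields the index-table query, while a query on a column without an index passes through unchanged. Other rewrites (e.g. OR predicates to union) would be added as further entries in `RULES`.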
> Implement Indexing in Hive
> --------------------------
>
>                 Key: HIVE-417
>                 URL: https://issues.apache.org/jira/browse/HIVE-417
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Metastore, Query Processor
>   Affects Versions: 0.3.0, 0.3.1, 0.4.0, 0.6.0
>            Reporter: Prasad Chakka
>            Assignee: He Yongqiang
>         Attachments: hive-417.proto.patch, hive-417-2009-07-18.patch, hive-indexing.3.patch
>
>
> Implement indexing on Hive so that lookup and range queries are efficient.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.