[ https://issues.apache.org/jira/browse/HIVE-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12976380#action_12976380 ]

Prajakta Kalmegh commented on HIVE-1694:
----------------------------------------

Thanks to both of you for your comments on our proposed design. Since the last post, we have been working on the code changes suggested in your comments. We have made progress in the following areas:

1) Removed the requirement that our optimizer run first. It can now be used like any other optimizer by adding it to the "transformations" list.
2) Implemented changes to restructure the operator DAG plan for group-by queries.
3) The optimization no longer reads data from the QB (query block) to check whether it can be applied before proceeding with the rewrite, as it did earlier. (See the canApply() method in the original rewrite code.)
4) Regarding issue #3 (from my original post), as per John's suggestion, the operator row schemas/resolvers are now modified wherever applicable, and these changes went smoothly.
5) Completed testing of the new implementation for simple group-by cases.

Also, the code to append a sub-query to the original DAG is implemented separately as of now. It still needs to be integrated into our optimization.

The only issue still pending after this implementation will be John's post of Nov 1st stating "...we store only the distinct block offsets, not the distinct row offsets.....". We plan to work on this once the current implementation is tested end-to-end.

You can expect an update on this in a couple of weeks.

> Accelerate query execution using indexes
> ----------------------------------------
>
>                 Key: HIVE-1694
>                 URL: https://issues.apache.org/jira/browse/HIVE-1694
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing, Query Processor
>    Affects Versions: 0.7.0
>            Reporter: Nikhil Deshpande
>            Assignee: Nikhil Deshpande
>         Attachments: demo_q1.hql, demo_q2.hql, HIVE-1694_2010-10-28.diff
>
>
> The index building patch (Hive-417) is checked into trunk, this JIRA issue
> tracks supporting indexes in Hive compiler & execution engine for SELECT
> queries.
> This is in ref. to John's comment at
> https://issues.apache.org/jira/browse/HIVE-417?focusedCommentId=12884869&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12884869
> on creating separate JIRA issue for tracking index usage in optimizer & query
> execution.
> The aim of this effort is to use indexes to accelerate query execution (for
> certain class of queries). E.g.
> - Filters and range scans (already being worked on by He Yongqiang as part of
>   HIVE-417?)
> - Joins (index based joins)
> - Group By, Order By and other misc cases
> The proposal is multi-step:
> 1. Building index based operators, compiler and execution engine changes
> 2. Optimizer enhancements (e.g. cost-based optimizer to compare and choose
>    between index scans, full table scans etc.)
> This JIRA initially focuses on the first step. This JIRA is expected to hold
> the information about index based plans & operator implementations for above
> mentioned cases.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.