[ https://issues.apache.org/jira/browse/HIVE-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13010285#comment-13010285 ]
He Yongqiang commented on HIVE-1644:
------------------------------------

Hi Russell,

FIL%SEL% may not be good enough; how about a TBL%FIL?

Also, I just had an offline talk with Namit. Namit proposed some very good ideas for this task:

1. Check whether the index exists. For a query on partitioned tables, the index optimizer should verify that indexes exist on all partitions that the original task is scanning. This information can be found in ParseContext's OpToPartList.
2. Add more config parameters to control whether to use the index (for example, do not use the index if the filter is a >, or if the size of the inputs is bigger than some value).
3. In case the index is not good (for example, even after scanning the index, the query still needs to scan the whole base table), just do not use it, and go back to scanning the whole base table. This can be done by adding a conditional task and a backup task. Whether the index is good can be detected by monitoring the index job's number of input records and number of output records and comparing them. Let's say that if the ratio is >50, do not use the index: kill the index job and go back to scanning the whole base table.

3) can be done in a followup jira if you want.

> use filter pushdown for automatically accessing indexes
> -------------------------------------------------------
>
>                 Key: HIVE-1644
>                 URL: https://issues.apache.org/jira/browse/HIVE-1644
>             Project: Hive
>          Issue Type: Improvement
>          Components: Indexing
>    Affects Versions: 0.7.0
>            Reporter: John Sichi
>            Assignee: Russell Melick
>         Attachments: HIVE-1644.1.patch, HIVE-1644.10.patch,
> HIVE-1644.2.patch, HIVE-1644.3.patch, HIVE-1644.4.patch, HIVE-1644.5.patch,
> HIVE-1644.6.patch, HIVE-1644.7.patch, HIVE-1644.8.patch, HIVE-1644.9.patch
>
>
> HIVE-1226 provides utilities for analyzing filters which have been pushed
> down to a table scan. The next step is to use these for selecting available
> indexes and generating access plans for those indexes.
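
To make points 1 and 3 of the comment above concrete, here is a minimal Java sketch of the two checks. It is only an illustration under stated assumptions: the class and method names are hypothetical and are not part of Hive or of any HIVE-1644 patch, partitions are represented as plain strings rather than Hive's partition objects, and the "ratio" in point 3 is assumed to mean output records as a percentage of input records for the index job (the comment does not define it precisely).

```java
import java.util.Set;

/**
 * Hypothetical helper for illustration only; none of these names
 * come from Hive or from the HIVE-1644 patches.
 */
final class IndexUsageSketch {

    private IndexUsageSketch() {}

    /**
     * Point 1: the index is usable only if every partition the original
     * task scans also has an index built on it.
     * Partitions are modelled as strings here, which is an assumption.
     */
    static boolean indexCoversAllPartitions(Set<String> scannedPartitions,
                                            Set<String> indexedPartitions) {
        return indexedPartitions.containsAll(scannedPartitions);
    }

    /**
     * Point 3: decide whether to abandon the index job and scan the base table.
     * Assumed reading of "ratio": output records as a percentage of input
     * records. A high percentage means the index filtered out very little.
     */
    static boolean shouldFallBackToFullScan(long inputRecords,
                                            long outputRecords,
                                            double maxOutputPercent) {
        if (inputRecords <= 0) {
            return false; // nothing read yet, so no evidence against the index
        }
        double outputPercent = 100.0 * outputRecords / inputRecords;
        return outputPercent > maxOutputPercent;
    }
}
```

Under these assumptions, a conditional task would call something like `shouldFallBackToFullScan(in, out, 50.0)` with the index job's counters and switch to the backup full-scan task when it returns true. The threshold would naturally be one of the config parameters suggested in point 2.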