[ https://issues.apache.org/jira/browse/HIVE-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12734408#action_12734408 ]
Prasad Chakka commented on HIVE-417:
------------------------------------

Well, the number of offsets can't exceed the number of SequenceFile blocks, since we can only index SequenceFile block offsets, so the problem is not as dire as it could be. Also, if that many rows share the same key (i.e. more than 10% of the rows in a traditional RDBMS, though perhaps more in the Hadoop case), an index may not be efficient after all, since it would be better to just read the whole table anyway.

> Implement Indexing in Hive
> --------------------------
>
>                 Key: HIVE-417
>                 URL: https://issues.apache.org/jira/browse/HIVE-417
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Metastore, Query Processor
>    Affects Versions: 0.2.0, 0.3.0, 0.3.1, 0.4.0
>            Reporter: Prasad Chakka
>            Assignee: He Yongqiang
>         Attachments: hive-417.proto.patch, hive-417-2009-07-18.patch
>
>
> Implement indexing on Hive so that lookup and range queries are efficient.
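For illustration, here is a minimal Java sketch of the offset-indexing idea discussed in the comment above. This is not the attached HIVE-417 patch; the class and method names are hypothetical. It records a byte offset per key during one sequential scan and seeks back to that offset for a point lookup. The per-record offsets assume an uncompressed or record-compressed SequenceFile; for a block-compressed file only block/sync boundaries are usable seek targets, which is exactly why the number of distinct index offsets is bounded by the number of SequenceFile blocks.

{code:java}
// Hypothetical sketch only -- not HIVE-417 code.
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class OffsetIndexSketch {

  /** One sequential pass over the data file, remembering key -> first offset. */
  public static Map<String, Long> buildIndex(Configuration conf, Path data)
      throws IOException {
    FileSystem fs = FileSystem.get(conf);
    Map<String, Long> index = new HashMap<String, Long>();
    SequenceFile.Reader reader = new SequenceFile.Reader(fs, data, conf);
    try {
      Text key = new Text();
      Text value = new Text();
      long offset = reader.getPosition();          // offset of the record about to be read
      while (reader.next(key, value)) {
        if (!index.containsKey(key.toString())) {  // keep only the first occurrence per key
          index.put(key.toString(), offset);
        }
        offset = reader.getPosition();
      }
    } finally {
      reader.close();
    }
    return index;
  }

  /** Point lookup: seek to the indexed offset, then scan forward for the key. */
  public static String lookup(Configuration conf, Path data,
      Map<String, Long> index, String wanted) throws IOException {
    Long offset = index.get(wanted);
    if (offset == null) {
      return null;                                 // key never indexed; a real plan would fall back to a scan
    }
    FileSystem fs = FileSystem.get(conf);
    SequenceFile.Reader reader = new SequenceFile.Reader(fs, data, conf);
    try {
      reader.seek(offset);                         // jump straight to the recorded record boundary
      Text key = new Text();
      Text value = new Text();
      while (reader.next(key, value)) {
        if (key.toString().equals(wanted)) {
          return value.toString();                 // first matching value after the seek
        }
      }
      return null;
    } finally {
      reader.close();
    }
  }
}
{code}

Note that when a single key matches a large fraction of the rows, the seek saves little over scanning the whole file, which is the efficiency concern raised in the comment.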