[ https://issues.apache.org/jira/browse/HIVE-20681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637099#comment-16637099 ]
Eugene Koifman commented on HIVE-20681:
---------------------------------------

Could you give a concrete example of some files on disk and what filter you'd like to generate?

> Support custom path filter for ORC tables
> -----------------------------------------
>
>                 Key: HIVE-20681
>                 URL: https://issues.apache.org/jira/browse/HIVE-20681
>             Project: Hive
>          Issue Type: Improvement
>          Components: Transactions
>            Reporter: Igor Kryvenko
>            Assignee: Igor Kryvenko
>            Priority: Minor
>
> Currently, the ORC file input format does not honor path filters set in the
> property "mapreduce.input.pathfilter.class" or
> "mapred.input.pathfilter.class", so we cannot use custom filters with ORC
> files.
> The AcidUtils class has a static filter called "hiddenFilters" which is used by
> ORC to filter input paths. If we can pass the custom filter classes (set in
> the property mentioned above) to AcidUtils and replace hiddenFilter with a
> filter that does an "and" operation over hiddenFilter+customFilters, the
> filters would work well.
> It would be useful to have the ability to filter out rows based on
> paths/filenames; current ORC features like bloom filters and indexes are not
> good enough to minimize the number of disk read operations.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
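
For illustration only, the "and" composition proposed in the issue description could look roughly like the sketch below. It is not Hive or Hadoop code: `NameFilter` is a hypothetical stand-in for `org.apache.hadoop.fs.PathFilter` reduced to plain file names, `HIDDEN` assumes the hidden-file rule is "skip names starting with `_` or `.`", and the file names and the `.orc` suffix rule are invented for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class PathFilterSketch {

    /** Hypothetical stand-in for org.apache.hadoop.fs.PathFilter, reduced to file names. */
    public interface NameFilter {
        boolean accept(String fileName);
    }

    /** Assumed hidden-file rule: reject names that start with '_' or '.'. */
    public static final NameFilter HIDDEN =
            name -> !name.startsWith("_") && !name.startsWith(".");

    /** Builds a filter that accepts a name only if every given filter accepts it. */
    public static NameFilter and(NameFilter... filters) {
        return name -> Arrays.stream(filters).allMatch(f -> f.accept(name));
    }

    /** Keeps only the names accepted by the filter, preserving input order. */
    public static List<String> select(List<String> names, NameFilter filter) {
        return names.stream().filter(filter::accept).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Invented custom filter: keep only names ending in ".orc".
        NameFilter custom = name -> name.endsWith(".orc");
        NameFilter combined = and(HIDDEN, custom);

        // Invented directory listing.
        List<String> names = Arrays.asList(
                "_SUCCESS", ".hidden.orc", "part-0.orc", "part-1.orc", "notes.txt");

        // ".hidden.orc" passes the custom filter but is rejected by HIDDEN.
        System.out.println(select(names, combined)); // prints [part-0.orc, part-1.orc]
    }
}
```

In a real change the combined filter would implement Hadoop's `PathFilter` and receive `Path` objects; how the classes named in the configuration property would be instantiated and handed to AcidUtils is left open here.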