Illya Yalovyy created HIVE-11882:
------------------------------------

             Summary: Fetch optimizer should stop source files traversal once 
it exceeds the hive.fetch.task.conversion.threshold
                 Key: HIVE-11882
                 URL: https://issues.apache.org/jira/browse/HIVE-11882
             Project: Hive
          Issue Type: Improvement
          Components: Physical Optimizer
    Affects Versions: 1.0.0
            Reporter: Illya Yalovyy


Hive 1.0's fetch optimizer tries to convert queries of the form "select <C> 
from <T> where <F> limit <L>" into a fetch task (see the 
hive.fetch.task.conversion property). To decide whether to use a fetch task, 
the optimizer gets the lengths of all the files in the specified partition and 
compares them against a threshold value (see the 
hive.fetch.task.conversion.threshold property). One of the main problems with 
this optimization is that the fetch optimizer doesn't seem to stop traversing 
files once hive.fetch.task.conversion.threshold is exceeded. This works fine on 
HDFS, but getting the length of every file could cause significant performance 
degradation on other supported file systems. 
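
The requested behavior could look roughly like the sketch below. It is not 
Hive's actual code: the class FetchThresholdSketch, the methods 
exceedsThreshold and counted, and the lookups counter are hypothetical names 
used only for illustration. Each file length is modeled as a LongSupplier so 
that a lookup (which may be a slow remote call on a non-HDFS file system) only 
happens when the loop actually reaches that file. The sketch also assumes the 
threshold is compared against the running total of the lengths.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.LongSupplier;

// Hypothetical illustration, not part of Hive.
class FetchThresholdSketch {

    // Counts length lookups so the early exit is observable.
    static int lookups = 0;

    // Wraps a fixed length in a supplier that records each lookup.
    // Stands in for a (possibly expensive) file system metadata call.
    static LongSupplier counted(long length) {
        return () -> {
            lookups++;
            return length;
        };
    }

    // Returns true as soon as the running total exceeds the threshold,
    // without asking for the lengths of the remaining files.
    static boolean exceedsThreshold(Iterable<LongSupplier> fileLengths, long threshold) {
        long total = 0;
        for (LongSupplier length : fileLengths) {
            total += length.getAsLong();
            if (total > threshold) {
                return true; // stop the traversal here
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<LongSupplier> files = Arrays.asList(counted(600), counted(600), counted(600));
        boolean tooBig = exceedsThreshold(files, 1000);
        System.out.println("exceeds=" + tooBig + ", lookups=" + lookups);
    }
}
```

With three 600-byte files and a threshold of 1000, the loop stops after the 
second file, so the third length is never requested.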



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
