[jira] [Created] (HIVE-9988) Evaluating UDF before query is run

JIRA Tue, 17 Mar 2015 01:16:30 -0700

Ådne Brunborg created HIVE-9988:
-----------------------------------

             Summary: Evaluating UDF before query is run
                 Key: HIVE-9988
                 URL: https://issues.apache.org/jira/browse/HIVE-9988
             Project: Hive
          Issue Type: Improvement
            Reporter: Ådne Brunborg



When a UDF is used in a filter on a partition column in Hive, all partitions 
are scanned before the UDF is resolved. 

If the UDF could be evaluated before the query is run, only the matching 
partitions would need to be read, which would greatly improve performance in 
cases like this.

Example: the table is partitioned by datestamp (bigint).

The following WHERE clause touches all 82 partitions:
{{WHERE datestamp=cast(from_unixtime(unix_timestamp(),'yyyyMMdd') as bigint)}}
{{15/03/16 09:21:53 INFO mapred.FileInputFormat: Total input paths to process : 82}}

…whereas the following touches only one partition:
{{WHERE datestamp=20150316}}
{{15/03/16 09:23:06 INFO input.FileInputFormat: Total input paths to process : 1}}
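
Until something like this is available, one possible client-side workaround is to compute the datestamp before submitting the query and embed it as a literal, so the planner sees a constant, as in the second WHERE clause above. The sketch below is only an illustration, not part of Hive: the helper names `datestamp_literal` and `build_where_clause` are hypothetical, and it assumes the client clock and time zone agree with what `unix_timestamp()` would return on the server.

```python
from datetime import datetime


def datestamp_literal(now: datetime) -> int:
    """Format a timestamp as a yyyyMMdd integer, mirroring
    cast(from_unixtime(unix_timestamp(), 'yyyyMMdd') as bigint)."""
    return int(now.strftime("%Y%m%d"))


def build_where_clause(now: datetime) -> str:
    """Build the partition filter with the value already evaluated,
    so the query text contains a plain literal."""
    return f"WHERE datestamp={datestamp_literal(now)}"


if __name__ == "__main__":
    # Uses the client's local time (an assumption, see above).
    print(build_where_clause(datetime.now()))
```

The resulting clause can then be appended to the query text that is submitted to Hive.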




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
