[ 
https://issues.apache.org/jira/browse/HIVE-2084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13014182#comment-13014182
 ] 

Ning Zhang commented on HIVE-2084:
----------------------------------

@Namit, yeah, 2.2.3 supports filter push down for non-equality predicates. Even
the older version, 2.0.3, supports it too. Mac's patch actually supports range
queries, but since range queries can get complicated with multiple partition
columns (what if the range is on a column that is not the top partition
column?), I didn't dig deep into it. The push down filtering criteria can
certainly be relaxed, though.
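
To make the multi-column concern concrete, here is a minimal sketch of a range
filter applied on the client side to partition names. It assumes names of the
form "ds=.../hr=..." with unescaped values and plain string comparison; the
class and method names are hypothetical and are not taken from Hive or from
Mac's patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PartitionNameFilter {
    // Extracts the value of one partition column from a name such as
    // "ds=2011-03-01/hr=12". Returns null if the column is not present.
    static String valueOf(String partitionName, String column) {
        for (String part : partitionName.split("/")) {
            int eq = part.indexOf('=');
            if (eq > 0 && part.substring(0, eq).equals(column)) {
                return part.substring(eq + 1);
            }
        }
        return null;
    }

    // Keeps the names whose column value lies in [low, high],
    // using string comparison (an assumption of this sketch).
    static List<String> filterRange(List<String> names, String column,
                                    String low, String high) {
        List<String> kept = new ArrayList<>();
        for (String name : names) {
            String v = valueOf(name, column);
            if (v != null && v.compareTo(low) >= 0 && v.compareTo(high) <= 0) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
            "ds=2011-03-01/hr=08", "ds=2011-03-01/hr=12",
            "ds=2011-03-02/hr=08", "ds=2011-03-02/hr=12");
        // Range on "hr", which is not the top partition column.
        System.out.println(filterRange(names, "hr", "10", "23"));
    }
}
```

With a range on hr, the matching names are not adjacent when the names are
sorted, so a simple range or prefix comparison on the whole partition name
cannot express the filter. A range on ds, the top column, does not have this
problem.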

Having said that, my test results show that JDO filter push down may not be
the dominant factor (compared to the patch in HIVE-2050). In the experiments
I've done for HIVE-2050, listing partition names and filtering partitions on
the Hive client side may take 10 sec, but retrieving all Partition objects
takes about 10 mins in total. At best, pushing down JDO filtering can only
reduce the 10 sec to 0, but the 10 mins of overhead is still there. We need to
find a way to optimize that away.
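
As a rough check of that bound, the sketch below computes the largest fraction
of total time that filter push down could save, using the approximate figures
above (10 sec of filtering, 10 mins = 600 sec of Partition object retrieval).
The class and method names are hypothetical, and the inputs are rounded
estimates rather than precise measurements.

```java
import java.util.Locale;

public class PushDownBound {
    // Best-case fraction of total time saved if the filtering step
    // dropped to zero and the fetch time stayed unchanged.
    static double bestCaseSaving(double filterSeconds, double fetchSeconds) {
        return filterSeconds / (filterSeconds + fetchSeconds);
    }

    public static void main(String[] args) {
        double filterSeconds = 10.0;   // list names + filter on the client side
        double fetchSeconds = 600.0;   // retrieve all Partition objects
        double saving = bestCaseSaving(filterSeconds, fetchSeconds);
        System.out.println(String.format(Locale.ROOT,
            "best-case saving: %.1f%% of total time", saving * 100.0));
    }
}
```

Under these numbers the saving is 10 / 610, under 2% of the total, which is
why the retrieval of Partition objects is the part worth optimizing.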

> Upgrade datanucleus from 2.0.3 to 2.2.3
> ---------------------------------------
>
>                 Key: HIVE-2084
>                 URL: https://issues.apache.org/jira/browse/HIVE-2084
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Ning Zhang
>            Assignee: Ning Zhang
>         Attachments: HIVE-2084.patch
>
>
> It seems that datanucleus 2.2.3 does a better job in caching. Getting the
> same set of partition objects a second time takes about 1/4 of the time it
> took the first time, while with 2.0.3 the second execution took almost the
> same amount of time as the first. We should retest the test cases mentioned
> in HIVE-1853 and HIVE-1862.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
