Then you should use BETWEEN, not IN. BETWEEN can be used for PPD, afaik.

2014-03-11 16:33 GMT+09:00 Petter von Dolwitz (Hem)
<petter.von.dolw...@gmail.com>:
> Hi Young,
>
> I must argue that the partition pruning do actually work if I don't use the
> IN clause. What I wanted to achieve in my original query was to specify a
> range of partitions in a simple way. The same query can be expressed as
>
> SELECT * FROM mytable WHERE partitionCol >= UDF("2014-03-10") and
> partitionCol <= UDF("2014-03-11");
>
> This UDF returns an INT (rather than an INT array). Both this UDF and the
> original one are annotated with @UDFType(deterministic = true) (if that has
> any impact) . This variant works fine and does partition pruning. Note that
> I don't have another column as input to my UDF but a static value.
>
> Thanks,
> Petter
>
>
>
>
> 2014-03-11 0:16 GMT+01:00 java8964 <java8...@hotmail.com>:
>
>> I don't know from syntax point of view, if Hive will allow to do "columnA
>> IN UDF(columnB)".
>>
>> What I do know that even let's say above work, it won't do the partition
>> pruning.
>>
>> The partition pruning in Hive is strict static, any dynamic values
>> provided to partition column won't enable partition pruning, even though it
>> is a feature I missed too.
>>
>> Yong
>>
>> ________________________________
>> Date: Mon, 10 Mar 2014 16:23:01 +0100
>> Subject: Using an UDF in the WHERE (IN) clause
>> From: petter.von.dolw...@gmail.com
>> To: user@hive.apache.org
>>
>>
>> Hi,
>>
>> I'm trying to get the following query to work. The parser don't like it.
>> Anybody aware of a workaround?
>>
>> SELECT * FROM mytable WHERE partitionCol IN my_udf("2014-03-10");
>>
>> partitionCol is my partition column of type INT and I want to achieve
>> early pruning. I've tried returning an array of INTs from my_udf and also a
>> plain string in the format (1,2,3). It seems like the parser wont allow me
>> to put an UDF in this place.
>>
>> Any help appreciated.
>>
>> Thanks,
>> Petter
>
>

Reply via email to