Then you should use BETWEEN rather than IN. As far as I know, BETWEEN can be used for predicate pushdown (PPD).
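
A rough sketch of what I mean, not tested here. mytable and partitionCol are taken from the thread below; my_udf is assumed to be the deterministic variant that returns a single INT, not the array-returning one. The second statement is only a way to check which partitions would be read, since whether pruning really kicks in may depend on the Hive version.

```sql
-- Sketch only: my_udf is assumed to return a single INT per call.
-- BETWEEN is inclusive on both ends, so this covers the same range as
-- partitionCol >= my_udf(...) AND partitionCol <= my_udf(...).
SELECT *
FROM mytable
WHERE partitionCol BETWEEN my_udf("2014-03-10") AND my_udf("2014-03-11");

-- List the input partitions the query would read, to verify pruning.
EXPLAIN DEPENDENCY
SELECT *
FROM mytable
WHERE partitionCol BETWEEN my_udf("2014-03-10") AND my_udf("2014-03-11");
```

If the EXPLAIN DEPENDENCY output lists only the partitions inside the range, pruning is taking place.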

2014-03-11 16:33 GMT+09:00 Petter von Dolwitz (Hem) <petter.von.dolw...@gmail.com>:

> Hi Young,
>
> I must argue that the partition pruning do actually work if I don't use the
> IN clause. What I wanted to achieve in my original query was to specify a
> range of partitions in a simple way. The same query can be expressed as
>
> SELECT * FROM mytable WHERE partitionCol >= UDF("2014-03-10") and
> partitionCol <= UDF("2014-03-11");
>
> This UDF returns an INT (rather than an INT array). Both this UDF and the
> original one are annotated with @UDFType(deterministic = true) (if that has
> any impact) . This variant works fine and does partition pruning. Note that
> I don't have another column as input to my UDF but a static value.
>
> Thanks,
> Petter
>
>
> 2014-03-11 0:16 GMT+01:00 java8964 <java8...@hotmail.com>:
>
>> I don't know from syntax point of view, if Hive will allow to do "columnA
>> IN UDF(columnB)".
>>
>> What I do know that even let's say above work, it won't do the partition
>> pruning.
>>
>> The partition pruning in Hive is strict static, any dynamic values
>> provided to partition column won't enable partition pruning, even though it
>> is a feature I missed too.
>>
>> Yong
>>
>> ________________________________
>> Date: Mon, 10 Mar 2014 16:23:01 +0100
>> Subject: Using an UDF in the WHERE (IN) clause
>> From: petter.von.dolw...@gmail.com
>> To: user@hive.apache.org
>>
>>
>> Hi,
>>
>> I'm trying to get the following query to work. The parser don't like it.
>> Anybody aware of a workaround?
>>
>> SELECT * FROM mytable WHERE partitionCol IN my_udf("2014-03-10");
>>
>> partitionCol is my partition column of type INT and I want to achieve
>> early pruning. I've tried returning an array of INTs from my_udf and also a
>> plain string in the format (1,2,3). It seems like the parser wont allow me
>> to put an UDF in this place.
>>
>> Any help appreciated.
>>
>> Thanks,
>> Petter