You might be using the wrong path to reference the distributed cache. I was under the impression that distributed cache files would be accessible using a local path, not something starting with '/'.
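To illustrate, here is a minimal sketch of a lookup that ignores the directory part of the path and opens the file by its base name. It is plain Java with no Hive dependency, so it is not the actual UDF. It assumes that a file added with ADD FILE ends up in the task's working directory under its base name. The names CountryLookup, baseDir and load are made up for this example, and the file is assumed to hold one "key,value" pair per line with no header.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the UDF's lookup logic (no Hive classes involved).
public class CountryLookup {
    private Map<Integer, String> countryMap = null;
    private final Path baseDir;

    // Assumption: inside a task, the cached file sits in the working directory.
    public CountryLookup() {
        this(Paths.get("."));
    }

    // Lets a caller (or a test) point the lookup at another directory.
    public CountryLookup(Path baseDir) {
        this.baseDir = baseDir;
    }

    // Reads comma-delimited "key,value" lines, using only the file's base name,
    // so "/data/MyData.txt" and "MyData.txt" resolve to the same local file.
    static Map<Integer, String> load(Path baseDir, String mapFile) throws IOException {
        Path local = baseDir.resolve(Paths.get(mapFile).getFileName());
        Map<Integer, String> map = new HashMap<>();
        try (BufferedReader reader = Files.newBufferedReader(local, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] parts = line.split(",", 2);
                if (parts.length == 2) { // skip blank or malformed lines
                    map.put(Integer.valueOf(parts[0].trim()), parts[1].trim());
                }
            }
        }
        return map;
    }

    public String evaluate(Integer keyCol, String mapFile) throws IOException {
        if (countryMap == null) {
            // Build the map once, on first use, before anything is looked up.
            countryMap = load(baseDir, mapFile);
        }
        String capital = countryMap.get(keyCol);
        return capital != null ? capital : "NA";
    }
}
```

Because a missing file throws an IOException here instead of leaving the map empty, a wrong path would show up as a failed task rather than as a column full of 'NA'.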
I suspect query 1 is working because fetch task conversion is running the select in a local task, where it can see /data/MyData.txt. Try setting hive.fetch.task.conversion=false so the select query is run in an MR task, to see if you get the same results as the other queries. Actually, fetch task conversion doesn't work well with the distributed cache, since it is not running in an MR task. For the other queries, try referencing the file as "MyData.txt".

________________________________
From: Dayong <will...@gmail.com>
Sent: Tuesday, April 05, 2016 11:49 AM
To: user@hive.apache.org
Subject: Re: Hive UDF to fetch value from distributed cache not working with outer queries

What if you extend GenericUDF?

Thanks,
Dayong

On Apr 5, 2016, at 2:11 PM, Abhishek Dubey <abhishek.du...@xoriant.com> wrote:

Hi,

We have written a Hive UDF in Java to fetch a value from a file added to the distributed cache. It works perfectly from a select query like:

Query 1.

    select country_key, MyFunction(country_key,"/data/MyData.txt") as capital from tablename;

But it does not work when we try to create a table from its output, like:

Query 2.

    create table new_table as select country_key, MyFunction(country_key,"/data/MyData.txt") as capital from tablename;

It does not even work from an outer select, like:

Query 3.

    select t.capital from ( select country_key, MyFunction(country_key,"/data/MyData.txt") as capital from tablename ) t;

Below is my UDF's evaluate function:

    public class CountryMap extends UDF {
        Map<Integer, String> countryMap = null;

        public String evaluate(Integer keyCol, String mapFile) {
            if (countryMap == null) {
                //read comma delimited data from mapFile and build a hashmap
                countryMap.put(key, value);
            }
            if (countryMap.containsKey(keyCol)) {
                return countryMap.get(keyCol);
            }
            return "NA";
        }
    }

I add the jar and the file and create the Hive temporary function like this:

    ADD JAR /data/CountryMap-with-dependencies.jar;
    ADD FILE /data/MyData.txt;
    CREATE TEMPORARY FUNCTION MyFunction as 'CountryMap';

When I run query 1 I get the expected value from the map, but when I run queries 2 and 3 I get 'NA'. When I returned Map.size() for queries 2 and 3 in place of 'NA', it was zero.

I am puzzled why the outer select or create table is not able to fetch the countryMap value, and why the size of the map becomes zero.

Thanks in advance,
Abhishek Dubey