Did you look at the stack trace in the Pig log file and Hadoop task log?
On Wed, Dec 11, 2013 at 11:12 AM, Sameer Tilak <[email protected]> wrote:

> Hi All,
> I am trying to use Distributed cache in my UDF. I have the following file
> in HDFS that I want all my map functions to have available locally:
>
> hadoop dfs -ls /scratch/
> -rw-r--r-- 1 userid supergroup size date time /scratch/id_lookup
>
> In My pig script I pass it as a parameter
>
> ProcessedUI = FOREACH A GENERATE myparser.myUDF(param1, param2, '/scratch/id_lookup');
>
> In my UDF inside exec function I do the following:
>
> lookup_file = (String)input.get(2);
>
> I have implemented the getCacheFiles as follows:
>
> public List<String> getCacheFiles() {
>     List<String> list = new ArrayList<String>(1);
>     list.add(lookup_file + "#id_lookup");
>     return list;
> }
>
> Now I try to read that file using standard io methods.
>
> public void VectorizeData (){
>     FileReader fr = new FileReader("./id_lookup");
>     BufferedReader brd = new BufferedReader(fr);
> }
>
> I think I am not using it correctly (may be paths messed up etc.). I get
> the following exception:
>
> 2013-12-11 11:09:50,821 [JobControl] ERROR org.apache.hadoop.security.UserGroupInformation - PriviledgedActionException as:userid cause:java.io.FileNotFoundException: File does not exist: null
> 2013-12-11 11:09:51,291 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
> 2013-12-11 11:09:51,301 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
>
> Any help on this would be great!
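
One thing worth checking against those logs: the "File does not exist: null" message may come from getCacheFiles() being called before exec() has assigned lookup_file. As far as I know, Pig calls getCacheFiles() on the client while it sets up the job, and exec() only runs later inside the tasks. The sketch below is plain Java with no Pig dependency, so it only illustrates the ordering problem and is not a working UDF. The class names are hypothetical; in a real UDF the class would extend org.apache.pig.EvalFunc.

```java
import java.util.ArrayList;
import java.util.List;

// Mirrors the quoted UDF: the path is only assigned inside exec().
class PathSetInExec {
    private String lookupFile; // stays null until exec() runs

    void exec(String pathArg) {
        lookupFile = pathArg;
    }

    List<String> getCacheFiles() {
        List<String> list = new ArrayList<>(1);
        // If exec() has not run yet, this concatenates the literal text "null".
        list.add(lookupFile + "#id_lookup");
        return list;
    }
}

// Alternative: the path arrives through the constructor, so it is already
// set whenever getCacheFiles() is called.
class PathSetInConstructor {
    private final String lookupFile;

    PathSetInConstructor(String lookupFile) {
        this.lookupFile = lookupFile;
    }

    List<String> getCacheFiles() {
        List<String> list = new ArrayList<>(1);
        list.add(lookupFile + "#id_lookup");
        return list;
    }
}

class CacheFileSketch {
    public static void main(String[] args) {
        // getCacheFiles() called before exec(), as during job setup.
        System.out.println(new PathSetInExec().getCacheFiles().get(0));
        // prints "null#id_lookup"

        System.out.println(
            new PathSetInConstructor("/scratch/id_lookup").getCacheFiles().get(0));
        // prints "/scratch/id_lookup#id_lookup"
    }
}
```

If that is what is happening here, one option is to hand the path to the UDF constructor in the Pig script, for example with DEFINE myUDF myparser.myUDF('/scratch/id_lookup'); and then call myUDF(param1, param2) in the FOREACH. This assumes myUDF is given a constructor that takes a single String, which the original post does not show.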
