Felix, 0.7 does not support distributed cache within Pig UDFs. Is there a reason you are using such an old version of Pig?
0.9 and later would support this for you. Alan's book has great info on doing this: http://ofps.oreilly.com/titles/9781449302641/writing_udfs.html

Thanks,
Prashant

On Fri, Mar 16, 2012 at 5:32 PM, felix gao <[email protected]> wrote:
> I need to put a small shared file on the distributed cache so I can load it in my
> udf in pig 0.7. We are using Hadoop 0.20.2+228. I tried to run it using
>
> PIG_OPTS="-Dmapred.cache.archives=hdfs://namenode.host:5001/user/gen/categories/exclude/2012-03-15/exclude-categories#excludeCategory
> -Dmapred.create.symlink=yes" runpig ~felix/testingr.pig
>
> and
>
> PIG_OPTS="-Dmapred.cache.files=hdfs://namenode.host:5001/user/gen/categories/exclude/2012-03-15/exclude-categories#excludeCategory
> -Dmapred.create.symlink=yes" runpig ~felix/testingr.pig
>
> When I do
> hadoop fs -ls hdfs://namenode.host:5001/user/gen/categories/exclude/2012-03-15/exclude-categories
> I do see the file there.
>
> However, on the UDF side I see
> java.io.FileNotFoundException: excludeCategory (No such file or directory)
>     at java.io.FileInputStream.open(Native Method)
>     at java.io.FileInputStream.<init>(FileInputStream.java:106)
>     at java.io.FileInputStream.<init>(FileInputStream.java:66)
>     at java.io.FileReader.<init>(FileReader.java:41)
>
> What did I do wrong?
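For anyone finding this thread later: in Pig 0.9 and later a UDF can request cache files itself by overriding EvalFunc.getCacheFiles(), and the file then appears as a symlink in the task's working directory. A minimal sketch along those lines — the HDFS path is borrowed from Felix's example, and the class name and filtering logic are hypothetical:

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Hypothetical UDF: returns true when the input category is NOT in the exclude list.
public class NotExcluded extends EvalFunc<Boolean> {

    private Set<String> excluded;

    // Pig 0.9+: files listed here are shipped to the distributed cache for you;
    // the part after '#' names the symlink created in the working directory.
    @Override
    public List<String> getCacheFiles() {
        return Arrays.asList(
            "/user/gen/categories/exclude/2012-03-15/exclude-categories#excludeCategory");
    }

    @Override
    public Boolean exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0) {
            return null;
        }
        if (excluded == null) {
            // Lazily load the exclude list through the symlink, one entry per line.
            excluded = new HashSet<String>();
            BufferedReader reader = new BufferedReader(new FileReader("./excludeCategory"));
            try {
                String line;
                while ((line = reader.readLine()) != null) {
                    excluded.add(line.trim());
                }
            } finally {
                reader.close();
            }
        }
        return !excluded.contains((String) input.get(0));
    }
}
```

Note there is no need to set mapred.cache.files or mapred.create.symlink by hand in this model; Pig handles the cache setup from the list returned by getCacheFiles().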
