I personally did a lot of work to migrate from Pig 0.8 to Pig 0.9. It's a nontrivial jump, and not just for the UDFs (I'd argue they changed less): mainly it's because the parser changed. My recommendation would be to wait until the 0.10 release is baked and move up to that, but at this point 0.7 is really, really old and it is worth the pain to upgrade.
Hopefully in the future we'll have ways to help facilitate that process... the e2e tests help a lot, but nothing beats running your batch jobs against the new version, catching errors, and hopefully filing JIRAs if you hit weird bugs :)

2012/3/16 felix gao <[email protected]>

> We haven't upgraded because we have a lot of UDFs written for Pig 0.7. If
> I upgrade, I am afraid I will have to re-write many of them to support
> the new version. Do you know if the upgrade from Pig 0.7 to Pig 0.9
> needs any migration work with respect to the UDFs?
>
> Thanks,
>
> Felix
>
> On Fri, Mar 16, 2012 at 5:37 PM, Prashant Kommireddi
> <[email protected]> wrote:
>
> > Felix,
> >
> > 0.7 does not support the distributed cache within Pig UDFs. Is there a
> > reason you are using such an old version of Pig?
> >
> > 0.9 and later would support this for you. Alan's book has great info
> > on doing this:
> > http://ofps.oreilly.com/titles/9781449302641/writing_udfs.html
> >
> > Thanks,
> > Prashant
> >
> > On Fri, Mar 16, 2012 at 5:32 PM, felix gao <[email protected]> wrote:
> >
> > > I need to put a small shared file on the distributed cache so I can
> > > load it in my UDF in Pig 0.7. We are using Hadoop 0.20.2+228. I
> > > tried to run it using
> > >
> > > PIG_OPTS="-Dmapred.cache.archives=hdfs://namenode.host:5001/user/gen/categories/exclude/2012-03-15/exclude-categories#excludeCategory
> > > -Dmapred.create.symlink=yes" runpig ~felix/testingr.pig
> > >
> > > and
> > >
> > > PIG_OPTS="-Dmapred.cache.files=hdfs://namenode.host:5001/user/gen/categories/exclude/2012-03-15/exclude-categories#excludeCategory
> > > -Dmapred.create.symlink=yes" runpig ~felix/testingr.pig
> > >
> > > When I do
> > >
> > > hadoop fs -ls hdfs://namenode.host:5001/user/gen/categories/exclude/2012-03-15/exclude-categories
> > >
> > > I do see the file there.
> > >
> > > However, on the UDF side I see
> > >
> > > java.io.FileNotFoundException: excludeCategory (No such file or directory)
> > >         at java.io.FileInputStream.open(Native Method)
> > >         at java.io.FileInputStream.<init>(FileInputStream.java:106)
> > >         at java.io.FileInputStream.<init>(FileInputStream.java:66)
> > >         at java.io.FileReader.<init>(FileReader.java:41)
> > >
> > > What did I do wrong?
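For anyone landing on this thread later: on Pig 0.9 and up, a UDF can register its cache files itself by overriding EvalFunc.getCacheFiles(), which is the mechanism Prashant's link describes, so no PIG_OPTS juggling is needed. Below is a minimal sketch assuming Pig 0.9+; the class, field, and helper names are hypothetical, while the HDFS path and the excludeCategory symlink name come from Felix's command.

    // Hypothetical filter UDF sketch for Pig 0.9+. Pig ships the listed
    // file to each task via the distributed cache and creates a symlink
    // named "excludeCategory" in the task's working directory.
    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    import org.apache.pig.FilterFunc;
    import org.apache.pig.data.Tuple;

    public class ExcludeCategoryFilter extends FilterFunc {

        private Set<String> excluded;  // lazily loaded on first exec() call

        // Pig 0.9+ asks each UDF for files to place on the distributed
        // cache; "path#name" makes a symlink of that name for the task.
        @Override
        public List<String> getCacheFiles() {
            return Arrays.asList(
                "/user/gen/categories/exclude/2012-03-15/exclude-categories#excludeCategory");
        }

        @Override
        public Boolean exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0) {
                return null;
            }
            if (excluded == null) {
                // Read via the symlink, as a plain local file.
                excluded = loadExcluded("./excludeCategory");
            }
            String category = (String) input.get(0);
            return !excluded.contains(category);
        }

        // Loads one excluded category per line into a set.
        private Set<String> loadExcluded(String path) throws IOException {
            Set<String> result = new HashSet<String>();
            BufferedReader reader = new BufferedReader(new FileReader(path));
            try {
                String line;
                while ((line = reader.readLine()) != null) {
                    result.add(line.trim());
                }
            } finally {
                reader.close();
            }
            return result;
        }
    }

In the script you would then just REGISTER the jar and call the UDF in a FILTER, with no cache-related properties on the command line; Pig handles staging the file and creating the symlink per task.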
