Kevin Weil
Thu, 18 Mar 2010 21:22:38 -0700
Your UDF is getting executed on an arbitrary datanode, and the Java process is trying to load the *local* file ./GeoIP.dat, which does not exist there. You could use FileSystem.open to get an input stream to the HDFS version you have, but then all datanodes will be trying to access that one file (or three copies, with replication), which may not be efficient. The way we handle this is to have our automated deploy/machine setup put GeoIP.dat in a specified location on all datanodes. That is, don't put it in HDFS; put it in a specified location on the local filesystem of every node, point your code at that path, and it will work.
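As a rough sketch of the local-file approach, the wrapper could resolve the database path once and fail loudly when a node is missing the file. The class name GeoIpLocator, the system property geoip.dat.path, and the default path /usr/local/share/GeoIP/GeoIP.dat are all made up for illustration; use whatever location your deploy scripts actually write to. Only the JDK is used here, and the returned path would then be handed to whatever GeoIP lookup class you are wrapping.

```java
import java.io.FileNotFoundException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: finds the node-local GeoIP.dat that the deploy
// process is assumed to have installed on every datanode.
public final class GeoIpLocator {
    // Assumed property name; lets one node or one test override the location.
    static final String PATH_PROPERTY = "geoip.dat.path";
    // Assumed install location; not taken from any GeoIP or Hadoop default.
    static final String DEFAULT_PATH = "/usr/local/share/GeoIP/GeoIP.dat";

    private GeoIpLocator() {}

    // Returns the absolute local path, or throws if this node lacks the file.
    static Path resolve() throws FileNotFoundException {
        String configured = System.getProperty(PATH_PROPERTY, DEFAULT_PATH);
        Path path = Paths.get(configured).toAbsolutePath();
        if (!Files.isReadable(path)) {
            throw new FileNotFoundException(
                "GeoIP database not readable at " + path + " on this node");
        }
        return path;
    }
}
```

Calling GeoIpLocator.resolve().toString() in the wrapper's constructor, instead of hard-coding './GeoIP.dat', means a node that was missed by the deploy fails with a message naming the expected path rather than a bare open error.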
Kevin

On Thu, Mar 18, 2010 at 11:58 AM, Johannes Rußek <johannes.rus...@io-consulting.net> wrote:
> Hello Everybody,
> i've written a wrapper class for the GeoIP api, but now i'm trying to
> access the GeoIP.dat file which i've added to hdfs via hadoop dfs -put
> GeoIP.dat GeoIP.dat and added to the cache in pig.properties via
> mapred.cache.files=hdfs://localhost:8020/user/root/GeoIP.dat
> however, it seems the geoip api is unable to open the file with
> './GeoIP.dat' as path. What should i use for this?
> Regards,
> Johannes