I found in one thread that we could use UDFContext, and in another that we could use FileLocalizer. I want something that works in both local and Hadoop mode (without having to change the implementation every time I switch environments).
Does somebody have some example code?

Best,
Will

-----Original Message-----
From: Lai Will [mailto:[email protected]]
Sent: Thursday, March 03, 2011 10:20 PM
To: [email protected]
Subject: can a udf access the hdfs?

Hello,

Assume the following use case, for which I use Pig:

1) Read a file with records; each record contains a filename.
2) Use an EvalFunc that takes the filename as input and outputs certain content of the file.

When I was testing locally, everything went fine: I had only one machine, so I could simply do

File data = new File(filename);

to get a handle on the file and start parsing it and extracting the data I needed.

Now, when running in Hadoop mode, this fails, even though the relative path names also exist in HDFS.

So how can I get access to HDFS from my UDF, or, if that's not possible, how do I work around that?

Best,
Will
