So in my driver code, I try to store the file in the cache with this line of
code:
job.addCacheFile(new URI("file location"));
Then in my Mapper code, I do this to try and access the cached file:
URI[] localPaths = context.getCacheFiles();
File f = new File(localPaths[0]);
However, I get a NullPointerException when I do that in the Mapper code.
Any suggestions?
Andrew
From: Shahab Yunus [mailto:[email protected]]
Sent: Wednesday, July 10, 2013 9:43 PM
To: [email protected]
Subject: Re: New Distributed Cache
Also, once you have the array of URIs from calling getCacheFiles(), you can
iterate over them and wrap each one in a File or a Path
(http://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/Path.html#Path(java.net.URI)).
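As a rough sketch of that iteration (not tested against a cluster): the helper
below takes the URI[] that getCacheFiles() would return, guards against it
being null, and derives the name each file would be expected to have in the
task's working directory. The class and method names (CacheFileNames,
localNames) are made up for illustration. It also assumes that a cached file
is linked into the working directory under its base name, or under the URI
fragment if one is given; that is worth checking on your Hadoop version.

```java
import java.io.File;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the Hadoop API.
class CacheFileNames {

    // cacheFiles is the array that context.getCacheFiles() would return.
    static List<String> localNames(URI[] cacheFiles) {
        List<String> names = new ArrayList<>();
        if (cacheFiles == null) {
            // Nothing was registered in the cache (or it was not propagated).
            return names;
        }
        for (URI uri : cacheFiles) {
            // Assumption: a fragment such as "#alias" names the local link.
            String name = uri.getFragment();
            if (name == null) {
                String path = uri.getPath();
                if (path == null) {
                    continue; // opaque URI with no path; skip it
                }
                // Otherwise fall back to the last path component.
                name = new File(path).getName();
            }
            names.add(name);
        }
        return names;
    }
}
```

In a Mapper's setup() this could be called as
CacheFileNames.localNames(context.getCacheFiles()), and each returned name
opened with new File(name).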
Regards,
Shahab
On Wed, Jul 10, 2013 at 5:08 PM, Omkar Joshi
<[email protected]> wrote:
Did you try JobContext.getCacheFiles()?
Thanks,
Omkar Joshi
Hortonworks Inc.<http://www.hortonworks.com>
On Wed, Jul 10, 2013 at 10:15 AM, Botelho, Andrew
<[email protected]> wrote:
Hi,
I am trying to store a file in the Distributed Cache during my Hadoop job.
In the driver class, I tell the job to store the file in the cache with this
code:
Job job = Job.getInstance();
job.addCacheFile(new URI("file name"));
That all compiles fine. In the Mapper code, I try to access the cached file
with this method:
Path[] localPaths = context.getLocalCacheFiles();
However, I am getting warnings that this method is deprecated.
Does anyone know the current, non-deprecated way to access cached files in the
Mapper code? (I am using Hadoop 2.0.5.)
Thanks in advance,
Andrew