Hello,

I have used getCacheFiles() instead of getLocalCacheFiles() and now it works.

Can someone please explain the difference between the two? I haven't been able to find a good explanation of how they work.
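
My tentative understanding, which may well be wrong, is that getLocalCacheFiles() returns paths on the task node's local disk, while getCacheFiles() returns the original URIs I registered. The reader in my code gets its FileSystem from FileSystem.get(conf), so a path without a scheme would be looked up on whatever the default filesystem is. Below is a minimal sketch of that idea in plain Java. It uses only java.net.URI and no Hadoop classes, so it only imitates the path resolution. The class name, the method name, the host and port, and both file paths are made up for illustration.

```java
import java.net.URI;

public class CachePathDemo {

    // Imitates (in plain Java, not Hadoop) how a path is interpreted
    // relative to a default filesystem URI.
    static URI qualify(URI defaultFs, String path) {
        URI p = URI.create(path);
        // A path that already names a scheme is left as it is.
        if (p.getScheme() != null) {
            return p;
        }
        // A scheme-less path takes the scheme and authority of the default filesystem.
        return defaultFs.resolve(p);
    }

    public static void main(String[] args) {
        // Assumed default filesystems: local in standalone mode,
        // a hypothetical HDFS address in pseudo-distributed mode.
        URI standaloneFs = URI.create("file:///");
        URI pseudoDistFs = URI.create("hdfs://localhost:9000/");

        // Hypothetical values standing in for what the two calls return.
        String localized = "/tmp/hadoop/local/medoids.seq";
        String original = "hdfs://localhost:9000/user/marko/medoids.seq";

        System.out.println(qualify(standaloneFs, localized)); // stays on the local disk
        System.out.println(qualify(pseudoDistFs, localized)); // now points into HDFS
        System.out.println(qualify(pseudoDistFs, original));  // unchanged
    }
}
```

If this is right, it would explain why standalone mode worked: there the default filesystem is the local one, so the local path resolves correctly. It also suggests that the original code might work if the local paths were opened with FileSystem.getLocal(conf) instead of FileSystem.get(conf), but I have not tested that.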

Thanks,
Marko

On 05/11/2015 11:25 PM, [email protected] wrote:

Hello,

I'm new to Hadoop and I'm having a problem reading from a sequence file that I add to the distributed cache.

It worked when I ran it in standalone mode, but it fails in pseudo-distributed and fully distributed mode.

I'm adding the file to the distributed cache like this:

|DistributedCache.addCacheFile(new URI(currentMedoids), conf);|

And I'm reading from it in the mapper's setup method:

|         Configuration conf = context.getConfiguration();
         FileSystem fs = FileSystem.get(conf);

         Path[] paths = DistributedCache.getLocalCacheFiles(conf);

         List<Element> sketch = new ArrayList<Element>();

         SequenceFile.Reader medoidsReader = new SequenceFile.Reader(fs, paths[0], conf);

         Writable medoidKey = (Writable) medoidsReader.getKeyClass().newInstance();
         Writable medoidValue = (Writable) medoidsReader.getValueClass().newInstance();

         while (medoidsReader.next(medoidKey, medoidValue)) {
             ElementWritable medoidWritable = (ElementWritable) medoidValue;
             sketch.add(medoidWritable.getElement());
         }|

This throws a FileNotFoundException.

Can anyone please help me understand what the problem is and how to do this properly?

Thanks
