Hello,

I have a MapFile as a product of MapReduce job, and what I need to do is:

1. If the MapReduce job produced multiple splits as output, merge them into a single file.

2. Copy this merged MapFile to another HDFS location and use it as a distributed cache file for another MapReduce job.

I'm wondering if it is even possible to merge MapFiles, given their nature (each is a directory holding a data file and an index file), and use the result as a distributed cache file.
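In case it helps the discussion, here is a minimal sketch of what I have in mind for the merge step, written against the Hadoop 0.20 API. The paths and the Text key/value types are placeholders for my own; the sketch also assumes the part files hold disjoint, ascending key ranges in the order listed, since MapFile.Writer rejects out-of-order keys.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.io.MapFile;
import org.apache.hadoop.io.Text;

public class MapFileMergeSketch {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Hypothetical output directories of the first job's reducers.
        String[] parts = { "out/part-00000", "out/part-00001" };

        // Single merged MapFile; key/value classes are placeholders.
        MapFile.Writer writer = new MapFile.Writer(conf, fs,
                "merged/mapfile", Text.class, Text.class);

        Text key = new Text();
        Text value = new Text();
        // Assumes parts cover disjoint, ascending key ranges in this
        // order -- MapFile.Writer requires keys in sorted order.
        for (String part : parts) {
            MapFile.Reader reader = new MapFile.Reader(fs, part, conf);
            while (reader.next(key, value)) {
                writer.append(key, value);
            }
            reader.close();
        }
        writer.close();
    }
}
```

The merged directory would then be registered with DistributedCache.addCacheFile(...) for the second job, and each mapper would open it with a MapFile.Reader and call get() for the lookups.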

What I'm trying to achieve is repeated, fast lookups in this file during another MapReduce job.
If my idea is completely wrong, can you give me a tip on how to do it?

The file is expected to be about 20 MB in size.
I'm using Hadoop 0.20.203.

Thanks for your reply :)

Ondrej Klimpera
