Hi all,

I have a question about the strategy for preparing input data for Hadoop MapReduce jobs. We have to (somehow) copy input files from our local filesystem into HDFS. How can we make sure that an input file is not processed twice by different executions of the same MapReduce job (say the job runs every 30 minutes)? I don't want to delete the input files after the MR job finishes, because I may want to reuse them later.
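One common approach (a minimal sketch, not the only way to do this) is to stage each batch into a per-run directory before the job starts, run the job against that directory, and then move the inputs into an archive directory instead of deleting them. Files that are archived are never picked up again, but they stay in HDFS for later reuse. The directory names below (/data/incoming, /data/staging, /data/processed) are hypothetical, chosen just for illustration; everything uses the standard Hadoop FileSystem API.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class StageAndArchive {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Hypothetical layout: files land in /data/incoming, each run gets
        // its own staging dir, and finished inputs are kept in /data/processed.
        Path incoming = new Path("/data/incoming");
        Path staging  = new Path("/data/staging/run-" + System.currentTimeMillis());
        Path archive  = new Path("/data/processed");

        // Move every currently visible file into this run's staging directory.
        // Files that arrive while the job runs simply wait for the next run.
        fs.mkdirs(staging);
        for (FileStatus stat : fs.listStatus(incoming)) {
            fs.rename(stat.getPath(), new Path(staging, stat.getPath().getName()));
        }

        // ... submit the MapReduce job here with `staging` as its input path ...

        // After a successful run, move (rather than delete) the inputs into the
        // archive so they are never reprocessed but remain available for reuse.
        fs.mkdirs(archive);
        for (FileStatus stat : fs.listStatus(staging)) {
            fs.rename(stat.getPath(), new Path(archive, stat.getPath().getName()));
        }
        fs.delete(staging, true); // clean up the now-empty staging directory
    }
}

Because the staging directory is created fresh for each run, a job execution only ever sees the files that were moved in at its start, so two runs can't process the same file even if the schedule overlaps slightly.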
