Hi all,

I have a question about the strategy for preparing input data for Hadoop MapReduce jobs. We have to (somehow) copy input files from our local filesystem into HDFS. How can we make sure that an input file is not processed twice by different executions of the same MapReduce job (say the job runs every 30 minutes)? I don't want to delete the input files after the MR job finishes, because I may want to reuse them later.
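One common approach (a minimal sketch, not the only way to do this) is to stage each batch into a per-run directory before the job starts, run the job against that directory, and then move the inputs into an archive directory instead of deleting them. Files that are archived are never picked up again, but they stay in HDFS for later reuse. The directory names below (/data/incoming, /data/staging, /data/processed) are hypothetical, chosen just for illustration; everything uses the standard Hadoop FileSystem API.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class StageAndArchive {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Hypothetical layout: files land in /data/incoming, each run gets
        // its own staging dir, and finished inputs are kept in /data/processed.
        Path incoming = new Path("/data/incoming");
        Path staging  = new Path("/data/staging/run-" + System.currentTimeMillis());
        Path archive  = new Path("/data/processed");

        // Move every currently visible file into this run's staging directory.
        // Files that arrive while the job runs simply wait for the next run.
        fs.mkdirs(staging);
        for (FileStatus stat : fs.listStatus(incoming)) {
            fs.rename(stat.getPath(), new Path(staging, stat.getPath().getName()));
        }

        // ... submit the MapReduce job here with `staging` as its input path ...

        // After a successful run, move (rather than delete) the inputs into the
        // archive so they are never reprocessed but remain available for reuse.
        fs.mkdirs(archive);
        for (FileStatus stat : fs.listStatus(staging)) {
            fs.rename(stat.getPath(), new Path(archive, stat.getPath().getName()));
        }
        fs.delete(staging, true); // clean up the now-empty staging directory
    }
}

Because the staging directory is created fresh for each run, a job execution only ever sees the files that were moved in at its start, so two runs can't process the same file even if the schedule overlaps slightly.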
