Hi,

Is it possible to run a MapReduce job on part of a file on HDFS? The use case
is using a single file on HDFS as a stream that stores all log events of a
particular kind. New data is appended at the end while MapReduce processes the
older data. Of course, one option would be to copy the old portion into a
separate file and hand that to MapReduce, but I was wondering whether that
extra copy can be avoided.
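
For illustration, here is a rough sketch of what I had in mind: a custom
InputFormat (written against the new org.apache.hadoop.mapreduce API) that
truncates the job's splits at a configured byte offset, so anything appended
past that offset is ignored. The class name and the "bounded.input.max.offset"
property are my own inventions for this sketch, not standard Hadoop names:

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

/**
 * A TextInputFormat that only produces splits covering the file up to a
 * configured byte offset, so records appended after that offset are not
 * seen by the job. "bounded.input.max.offset" is a made-up property name
 * for this sketch, not a standard Hadoop configuration key.
 */
public class BoundedTextInputFormat extends TextInputFormat {

    public static final String MAX_OFFSET = "bounded.input.max.offset";

    @Override
    public List<InputSplit> getSplits(JobContext job) throws IOException {
        long maxOffset =
            job.getConfiguration().getLong(MAX_OFFSET, Long.MAX_VALUE);
        List<InputSplit> bounded = new ArrayList<InputSplit>();
        for (InputSplit split : super.getSplits(job)) {
            FileSplit fs = (FileSplit) split;
            if (fs.getStart() >= maxOffset) {
                continue; // split lies entirely past the cut-off
            }
            // Shorten a split that straddles the cut-off.
            long length = Math.min(fs.getLength(),
                                   maxOffset - fs.getStart());
            bounded.add(new FileSplit(fs.getPath(), fs.getStart(), length,
                                      fs.getLocations()));
        }
        return bounded;
    }
}

If I understand LineRecordReader correctly, it already reads past the end of
a split to finish the last line, so truncating a split mid-line should still
yield whole records (at most one record would extend slightly past the
cut-off). Would something along these lines work, or is there a better way?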

Thanks,
Pankaj
