Hi,

I am trying to figure out whether Hadoop can be used for a piece of
functionality I am developing. I have large volumes of data already stored
on disks that are locally or remotely mounted on many Linux machines, and I
need to run some analysis over that data. Since the data is huge, I would
like to process it in parallel and then combine the results. Hadoop's
MapReduce functionality fits my scenario well.
However, I do not want to set up HDFS, since the data is already available
on all the machines and I do not want to transfer it into a new file
system. Is it possible to skip HDFS but still use the MapReduce
functionality? Any idea what would have to be done?
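For context, here is roughly what I was imagining: a sketch of a
core-site.xml that points Hadoop at the existing mounted file system via a
file:// URI instead of hdfs:// (the path /mnt/shared-data is just a
placeholder for my mount point, and I have not verified that running jobs
this way is actually supported):

```xml
<!-- core-site.xml sketch: use the local/NFS-mounted file system
     instead of HDFS. "file:///" is my guess at the right scheme. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>file:///</value>
  </property>
</configuration>
```

Jobs would then presumably read input from paths like
file:///mnt/shared-data/input, assuming each task node has the same mount
available at the same path.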

Thanks,
Neeraj
