Neeraj Mahajan wrote:
But I do not want to create HDFS, as I already have the data available on
all the machines and I do not want to transfer the data again to a new
file system. Is it possible to skip HDFS but still use the MapReduce
functionality? Any idea what would have to be done?
Hadoop requires that input paths be universal across nodes. So if you
have data that is accessible from all nodes through the local filesystem
(either by copying it there or via NFS mounts) then, so long as it is
accessible through the same path on all nodes, Hadoop should work fine:
the data named by file:///my_data/foo/bar must be the same on all hosts.
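
As a rough sketch (the exact calls vary by Hadoop version, and the paths
and class names here are just placeholders), a job using the classic
org.apache.hadoop.mapred API can simply be pointed at fully-qualified
file:// URIs instead of HDFS paths:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class LocalInputJob {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(LocalInputJob.class);
    conf.setJobName("local-input-example");

    // Fully-qualified file:// URIs: the same path must be visible on
    // every node, either as identical local copies or as a shared NFS
    // mount point.
    FileInputFormat.setInputPaths(conf, new Path("file:///my_data/foo/bar"));
    FileOutputFormat.setOutputPath(conf, new Path("file:///my_data/foo/output"));

    // Mapper and reducer classes would be set here as usual; only the
    // input and output paths differ from an HDFS-backed job.
    JobClient.runJob(conf);
  }
}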
That said, accessing data over NFS will probably be slower than over
HDFS. If the data resides on only a small subset of your nodes, then
those nodes could become overloaded. As a general rule, if you're going
to touch the data more than once, and have room, it would probably be a
good idea to copy it into an HDFS filesystem.
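
If you do decide to load it into HDFS, you can do that once up front,
either with the "hadoop fs -put" shell command or programmatically via
the FileSystem API. A minimal sketch (the paths here are placeholders):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CopyIntoHdfs {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Uses the default filesystem named in the cluster configuration
    // (normally HDFS).
    FileSystem fs = FileSystem.get(conf);

    // Copy the local data into HDFS once, then point jobs at the HDFS path.
    fs.copyFromLocalFile(new Path("file:///my_data/foo/bar"),
                         new Path("/my_data/foo/bar"));
  }
}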
Doug