Hi,

Thank you, Niels and Nitin, for your replies.

What I actually want is to run MR on an open-source cloud store. My plan is to 
implement a Hadoop file system for it and plug it into Hadoop, the same way the 
S3 and KFS file systems are plugged in. That would let a Hadoop client talk to 
"My cloud store". What I am still unclear about is how to run MR on the cloud 
store itself using Hadoop's JobTracker/TaskTracker framework.
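
To make the plug-in idea concrete, here is a toy Java sketch of how a URI 
scheme could be mapped to a file system implementation through a 
"fs.<scheme>.impl" style setting. It uses only the standard library and is not 
Hadoop's actual code. The "mycloud" scheme and the class name 
com.example.MyCloudFileSystem are hypothetical; the three built-in entries are 
what I believe the Hadoop 1.x defaults to be.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Toy model only: mimics the "fs.<scheme>.impl" lookup idea, not Hadoop's real code.
class SchemeResolver {
    private final Map<String, String> conf = new HashMap<>();

    SchemeResolver() {
        // Entries assumed to match the Hadoop 1.x default configuration.
        conf.put("fs.file.impl", "org.apache.hadoop.fs.LocalFileSystem");
        conf.put("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
        conf.put("fs.s3.impl", "org.apache.hadoop.fs.s3.S3FileSystem");
    }

    // Register an implementation class name for a URI scheme.
    void register(String scheme, String implClass) {
        conf.put("fs." + scheme + ".impl", implClass);
    }

    // Return the implementation class name configured for the URI's scheme.
    String resolve(String uri) {
        String scheme = URI.create(uri).getScheme();
        if (scheme == null) {
            throw new IllegalArgumentException("URI has no scheme: " + uri);
        }
        String impl = conf.get("fs." + scheme + ".impl");
        if (impl == null) {
            throw new IllegalArgumentException("No file system for scheme: " + scheme);
        }
        return impl;
    }

    public static void main(String[] args) {
        SchemeResolver resolver = new SchemeResolver();
        // Hypothetical scheme and class for "My cloud store".
        resolver.register("mycloud", "com.example.MyCloudFileSystem");
        System.out.println(resolver.resolve("mycloud://bucket/input/part-00000"));
    }
}
```

In a real deployment the equivalent registration would presumably go into the 
Hadoop configuration rather than into code like this.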

The link Niels gave shows that I can run MR on the local file system. So is 
there a way to tell the JobTracker to read data from a set of nodes (the nodes 
of "My cloud store" in this case), run TaskTracker daemons on those same nodes, 
and then fetch the result of the MR job?

Note: I do not want to fetch the data to my local computer, as is the case with 
S3. Fetching the data would defeat the purpose of using Hadoop, which is to 
move the compute to the data.
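
As a rough illustration of "moving compute to data", the toy Java sketch below 
assigns each input split to a worker that already holds the data, and falls 
back to any worker otherwise. This is a simplified model under my own 
assumptions, not Hadoop's scheduler. The class LocalityPlanner and the host 
names are made up. As far as I understand, in real Hadoop the host list for 
each split comes from the file system's getFileBlockLocations() method, which 
a custom file system would need to override to report its own nodes.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: illustrates locality-aware assignment, not Hadoop's scheduler.
class LocalityPlanner {

    // splitHosts: split name -> hosts that store that split's data.
    // trackers:   hosts on which a worker (TaskTracker) is running.
    // Returns:    split name -> host chosen to run the task.
    static Map<String, String> assign(Map<String, List<String>> splitHosts,
                                      List<String> trackers) {
        if (trackers.isEmpty()) {
            throw new IllegalArgumentException("No workers available");
        }
        Map<String, String> plan = new LinkedHashMap<>();
        int next = 0; // round-robin cursor for non-local fallbacks
        for (Map.Entry<String, List<String>> entry : splitHosts.entrySet()) {
            String chosen = null;
            // Prefer a worker that runs on a host already holding the data.
            for (String host : entry.getValue()) {
                if (trackers.contains(host)) {
                    chosen = host;
                    break;
                }
            }
            // No local worker: the data would have to travel over the network.
            if (chosen == null) {
                chosen = trackers.get(next % trackers.size());
                next++;
            }
            plan.put(entry.getKey(), chosen);
        }
        return plan;
    }

    public static void main(String[] args) {
        Map<String, List<String>> splits = new LinkedHashMap<>();
        splits.put("split-0", Arrays.asList("nodeA", "nodeB"));
        splits.put("split-1", Arrays.asList("nodeC"));
        System.out.println(assign(splits, Arrays.asList("nodeB", "nodeC")));
    }
}
```

The point of the sketch is that locality only helps if workers run on the 
storage nodes and the file system reports where each piece of data lives.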

Thanks,
Nikhil

From: Agarwal, Nikhil
Sent: Sunday, February 17, 2013 11:53 AM
To: '[email protected]'
Subject: Can I perfrom a MR on my local filesystem


Hi,

I recently followed a blog post to set up Hadoop on a single-node cluster.

In a single-node Hadoop set-up, is it necessary to copy the data into HDFS 
before running an MR job on it? Or can I run MR directly on my local file 
system, without copying the data to HDFS?

In the Hadoop source code I saw implementations of other file systems too, 
such as S3, KFS, and FTP. How exactly does an MR job run on an S3 data store? 
How do the JobTracker and TaskTracker run against S3?



I would be very thankful to get a reply to this.



Thanks & Regards,

Nikhil
