Hi Nikhil:

The FUSE mount is what allows the filesystem to access distributed files in
Gluster: that is, GlusterFS has its own FUSE mount, and GlusterFileSystem
wraps that mount in Hadoop FileSystem semantics.

Meanwhile, the MapReduce jobs are invoked using custom core-site and
mapred-site XML entries which specify GlusterFileSystem as the default filesystem.
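
For example, a minimal core-site.xml might look roughly like this. Treat it as a
sketch only: the exact property names (fs.glusterfs.impl, fs.glusterfs.mount, the
glusterfs:// URI, and so on) differ between versions of the glusterfs-hadoop
plugin, so check the README shipped with your plugin for the keys it actually
expects.

    <configuration>
      <!-- Which class implements the glusterfs:// scheme (assumed property name) -->
      <property>
        <name>fs.glusterfs.impl</name>
        <value>org.apache.hadoop.fs.glusterfs.GlusterFileSystem</value>
      </property>
      <!-- Make GlusterFS, not HDFS, the default filesystem -->
      <property>
        <name>fs.default.name</name>
        <value>glusterfs://gluster-server:9000</value>
      </property>
      <!-- Local FUSE mount point the plugin does its I/O through (assumed property name) -->
      <property>
        <name>fs.glusterfs.mount</name>
        <value>/mnt/glusterfs</value>
      </property>
    </configuration>

With something like that in place (plus the usual mapred.job.tracker setting in
mapred-site.xml), jobs submitted with the ordinary "hadoop jar ..." command do
their I/O against GlusterFS instead of HDFS.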

On Feb 22, 2013, at 3:17 AM, Nikhil Agarwal <[email protected]> wrote:

> Hi All,
> 
>  
> 
> Thanks a lot for taking the time to answer my question.
> 
>  
> 
> I am trying to implement a file system in Hadoop under the org.apache.hadoop.fs
> package, something similar to KFS, glusterfs, etc. What I wanted to ask is this:
> in the README.txt of glusterfs it is mentioned:
> 
>  
> 
> >> # ./bin/start-mapred.sh
>   If the map/reduce job/task trackers are up, all I/O will be done to 
> GlusterFS.
> 
>  
> 
> So, suppose my input files are scattered across different nodes (glusterfs
> servers); how do I (a hadoop client with glusterfs plugged in) issue a
> MapReduce command?
> 
> Moreover, after issuing a MapReduce command, would my hadoop client fetch all
> the data from the different servers to my local machine and then run the
> MapReduce locally, or would it start the TaskTracker daemons on the machine(s)
> where the input file(s) are located and perform the MapReduce there?
> 
> Please correct me if I am wrong, but I suppose that the location of the input
> files to MapReduce is being returned by the function getFileBlockLocations
> (FileStatus file, long start, long len).
> 
>  
> 
> Thank you very much for your time and helping me out.
> 
>  
> 
> Regards,
> 
> Nikhil
> 
>  
> 
> 
_______________________________________________
Gluster-users mailing list
[email protected]
http://supercolony.gluster.org/mailman/listinfo/gluster-users
