Thanks Matei. If the basic architecture is similar to the Google stuff, I can safely just work on the project using the information from the papers.
I am aware of the HADOOP-4487 JIRA and the current status of the permissions mechanism. I had a look at them earlier.

Cheers
Amandeep


Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz


On Sun, Feb 15, 2009 at 2:40 PM, Matei Zaharia <ma...@cloudera.com> wrote:

> Forgot to add, this JIRA details the latest security features that are
> being worked on in Hadoop trunk:
> https://issues.apache.org/jira/browse/HADOOP-4487.
> This document describes the current status and limitations of the
> permissions mechanism:
> http://hadoop.apache.org/core/docs/current/hdfs_permissions_guide.html.
>
> On Sun, Feb 15, 2009 at 2:35 PM, Matei Zaharia <ma...@cloudera.com> wrote:
>
> > I think it's safe to assume that Hadoop works like MapReduce/GFS at the
> > level described in those papers. In particular, in HDFS, there is a
> > master node containing metadata and a number of slave nodes (datanodes)
> > containing blocks, as in GFS. Clients start by talking to the master to
> > list directories, etc. When they want to read a region of some file,
> > they tell the master the filename and offset, and they receive a list of
> > block locations (datanodes). They then contact the individual datanodes
> > to read the blocks. When clients write a file, they first obtain a new
> > block ID and list of nodes to write it to from the master, then contact
> > the datanodes to write it (actually, the datanodes pipeline the write as
> > in GFS) and report when the write is complete. HDFS actually has some
> > security mechanisms built in, authenticating users based on their Unix
> > ID and providing Unix-like file permissions. I don't know much about how
> > these are implemented, but they would be a good place to start looking.
> >
> > On Sun, Feb 15, 2009 at 1:36 PM, Amandeep Khurana <ama...@gmail.com>
> > wrote:
> >
> >> Thanks Matei
> >>
> >> I had gone through the architecture document online. I am currently
> >> working on a project on security in Hadoop. I do know how the data
> >> moves around in GFS but wasn't sure how much of that HDFS follows and
> >> how different it is from GFS. Can you shed some light on that?
> >>
> >> Security would also involve the MapReduce jobs following the same
> >> protocols. That's why I asked how the Hadoop framework integrates with
> >> HDFS, and how different that is from MapReduce and GFS. The GFS and
> >> MapReduce papers give good information on how those systems are
> >> designed, but there is nothing that concrete for Hadoop that I have
> >> been able to find.
> >>
> >> Amandeep
> >>
> >>
> >> Amandeep Khurana
> >> Computer Science Graduate Student
> >> University of California, Santa Cruz
> >>
> >>
> >> On Sun, Feb 15, 2009 at 12:07 PM, Matei Zaharia <ma...@cloudera.com>
> >> wrote:
> >>
> >> > Hi Amandeep,
> >> > Hadoop is definitely inspired by MapReduce/GFS and aims to provide
> >> > those capabilities as an open-source project. HDFS is similar to GFS
> >> > (large blocks, replication, etc); some notable things missing are
> >> > read-write support in the middle of a file (unlikely to be provided
> >> > because few Hadoop applications require it) and multiple appenders
> >> > (the record append operation). You can read about HDFS architecture
> >> > at http://hadoop.apache.org/core/docs/current/hdfs_design.html. The
> >> > MapReduce part of Hadoop interacts with HDFS in the same way that
> >> > Google's MapReduce interacts with GFS (shipping computation to the
> >> > data), although Hadoop MapReduce also supports running over other
> >> > distributed filesystems.
> >> >
> >> > Matei
> >> >
> >> > On Sun, Feb 15, 2009 at 11:57 AM, Amandeep Khurana <ama...@gmail.com>
> >> > wrote:
> >> >
> >> > > Hi
> >> > >
> >> > > Is the HDFS architecture completely based on the Google Filesystem?
> >> > > If it isn't, what are the differences between the two?
> >> > >
> >> > > Secondly, is the coupling between Hadoop and HDFS the same as it is
> >> > > between Google's version of MapReduce and GFS?
> >> > >
> >> > > Amandeep
> >> > >
> >> > >
> >> > > Amandeep Khurana
> >> > > Computer Science Graduate Student
> >> > > University of California, Santa Cruz
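
For anyone reading this thread later, the read and write paths Matei describes (ask the master for block locations, then talk to the datanodes directly, with writes pipelined from one datanode to the next) can be sketched as a toy in-memory model. This is only an illustration, not Hadoop code: the names `Master`, `DataNode`, `allocate_block`, `get_block_locations`, `write_file` and `read_at` are hypothetical, the 4-byte block size and round-robin replica placement are assumptions chosen to keep the example small, and failure handling and permissions are left out entirely.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 4  # bytes; deliberately tiny so a short string spans several blocks


@dataclass
class DataNode:
    """Hypothetical datanode: stores block contents keyed by block ID."""
    name: str
    blocks: dict = field(default_factory=dict)

    def write(self, block_id, data, downstream):
        # Store the block locally, then forward it to the next node in the
        # pipeline (the "datanodes pipeline the write" step from the thread).
        self.blocks[block_id] = data
        if downstream:
            downstream[0].write(block_id, data, downstream[1:])

    def read(self, block_id):
        return self.blocks[block_id]


class Master:
    """Hypothetical master: holds metadata only, never the block contents."""

    def __init__(self, datanodes, replication=2):
        self.datanodes = datanodes
        self.replication = replication
        self.files = {}      # filename -> ordered list of block IDs
        self.locations = {}  # block ID -> list of DataNode replicas
        self.next_id = 0

    def allocate_block(self, filename):
        # Hand out a new block ID and the nodes to write it to.
        # Round-robin placement is an assumption made for this sketch.
        block_id = self.next_id
        self.next_id += 1
        n = len(self.datanodes)
        targets = [self.datanodes[(block_id + i) % n]
                   for i in range(self.replication)]
        self.files.setdefault(filename, []).append(block_id)
        self.locations[block_id] = targets
        return block_id, targets

    def get_block_locations(self, filename, offset):
        # Map (filename, offset) to a block ID and its replica locations.
        blocks = self.files[filename]
        index = offset // BLOCK_SIZE
        if index >= len(blocks):
            return None  # offset is past the last block
        block_id = blocks[index]
        return block_id, self.locations[block_id]


def write_file(master, filename, data):
    """Client write path: one master call per block, then a pipelined write."""
    for i in range(0, len(data), BLOCK_SIZE):
        block_id, targets = master.allocate_block(filename)
        targets[0].write(block_id, data[i:i + BLOCK_SIZE], targets[1:])


def read_at(master, filename, offset, length):
    """Client read path: ask the master where the block is, read from a replica."""
    out = b""
    while length > 0:
        located = master.get_block_locations(filename, offset)
        if located is None:
            break
        block_id, replicas = located
        block = replicas[0].read(block_id)
        start = offset % BLOCK_SIZE
        chunk = block[start:start + length]
        if not chunk:
            break  # reached end of file inside the last block
        out += chunk
        offset += len(chunk)
        length -= len(chunk)
    return out
```

Note that the master only ever sees filenames, offsets and block IDs, while the bytes move between the client and the datanodes. For a security project this suggests that both hops (client to master, client to datanode) would need to be looked at separately.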