Hi Jean, no, that is not directly possible. You have to pass your data through the DFS client for it to become part of the DFS (e.g. hadoop fs -put ..., etc., or programmatically). (Removing core-dev from this thread since this is really a core-user question.)
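A minimal sketch of what "passing the data through the DFS client" looks like on each machine; the local path /data/logs and the HDFS target /user/jp/logs are placeholders, not paths from this thread:

```shell
# Run on every machine that holds a local slice of the data.
# The files go through the DFS client, so HDFS itself decides
# block placement and replication -- the existing on-disk layout
# cannot be reused directly.
hadoop dfs -mkdir /user/jp/logs            # hypothetical target dir in HDFS
hadoop dfs -put /data/logs /user/jp/logs   # copy the local files in
hadoop dfs -ls /user/jp/logs               # verify the files arrived
```

The same thing can be done programmatically through the FileSystem API (e.g. FileSystem.copyFromLocalFile), which is what -put does under the hood.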
> -----Original Message-----
> From: Jean-Pierre [mailto:[EMAIL PROTECTED]
> Sent: Friday, March 28, 2008 8:58 PM
> To: [EMAIL PROTECTED]; core-dev
> Subject: Re: [Map/Reduce][HDFS]
>
> Hello
>
> I'm not sure I've understood... actually I've already set this
> field in the configuration file. I think this field is just
> to specify the master for the HDFS.
>
> My problem is that I have many machines with, on each one, a
> bunch of files which represent the distributed data ... and I
> want to use this distribution of data with hadoop. Maybe
> there is another configuration file which allows me to tell
> hadoop how to use my file distribution.
> Is it possible? Should I look at adapting my distribution of
> data to the hadoop one?
>
> Anyway, thanks for your answer Peeyush.
>
> On Fri, 2008-03-28 at 16:22 +0530, Peeyush Bishnoi wrote:
> > hello,
> >
> > Yes, you can do this by specifying in hadoop-site.xml the location
> > of the namenode, where your data already gets distributed.
> >
> > ---------------------------------------------------------------
> > <property>
> >   <name>fs.default.name</name>
> >   <value> <IPAddress:PortNo> </value>
> > </property>
> > ---------------------------------------------------------------
> >
> > Thanks
> >
> > ---
> > Peeyush
> >
> > On Thu, 2008-03-27 at 15:41 -0400, Jean-Pierre wrote:
> > >
> > > Hello,
> > >
> > > I'm working on a large amount of logs, and I've noticed that the
> > > distribution of data on the network (./hadoop dfs -put input input)
> > > takes a lot of time.
> > >
> > > Let's say that my data is already distributed among the
> > > network; is there any way to tell hadoop to use the already
> > > existing distribution?
> > >
> > > Thanks
