It is very possible (even easy). The data nodes run the datanode process. The task nodes run the task tracker. If the data nodes don't have a task tracker running, then they won't do any computation.
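For reference, one way to get that split with the stock scripts (a sketch, assuming a classic pre-YARN Hadoop layout where bin/hadoop-daemon.sh is present on every node and conf/ already points at your namenode and jobtracker):

```shell
# On nodes that should only store HDFS blocks ("data nodes"),
# start just the DataNode daemon:
bin/hadoop-daemon.sh start datanode

# On nodes that should only run map-reduce tasks ("task nodes"),
# start just the TaskTracker daemon:
bin/hadoop-daemon.sh start tasktracker

# A node that should do both simply runs both daemons;
# that is what bin/start-all.sh does on every host in conf/slaves.
```

Note that with this split the TaskTracker-only nodes will read every input block over the network, which is exactly the locality loss discussed below.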
On 3/13/08 8:22 AM, "Andrey Pankov" <[EMAIL PROTECTED]> wrote:

> Thanks, Ted!
>
> I also thought it is not good one to separate them out. Just was
> wondering is it possible at all. Thanks!
>
> Ted Dunning wrote:
>> It is quite possible to do this.
>>
>> It is also a bad idea.
>>
>> One of the great things about map-reduce architectures is that data is near
>> the computation so that you don't have to wait for the network. If you
>> separate data and computation, you impose additional load on the cluster.
>>
>> What this will do to your throughput is an open question and it depends a
>> lot on your programs.
>>
>> On 3/13/08 1:42 AM, "Andrey Pankov" <[EMAIL PROTECTED]> wrote:
>>
>>> Hi,
>>>
>>> Is it possible to configure hadoop cluster in such manner where there
>>> are separately data-nodes and separately worker-nodes? I.e. when nodes
>>> 1,2,3 store data in HDFS and nodes 3,4 and 5 do the map-reduce jobs and
>>> take data from HDFS?
>>>
>>> If it's possible what impact will be on performance? Any suggestions?
>>>
>>> Thanks in advance,
>>>
>>> --- Andrey Pankov
>
> ---
> Andrey Pankov
