Re: Question about Hadoop

2008-06-14 Thread Chanchal James
Thank you very much for explaining it to me, Ted. That's a great deal of info! I guess that could be how Yahoo Webmap is designed. And for anyone trying to figure out the massiveness of Hadoop computing, http://open.blogs.nytimes.com/2007/11/01/self-service-prorated-super-computing-fun/ should

Failed Reduce Task

2008-06-14 Thread chanel
Hey everyone, I'm trying to get the hang of using Hadoop and I'm using the Michael Noll Ubuntu tutorials (http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_(Single-Node_Cluster)). Using the wordcount example that comes with version 0.17.1-dev I get this error output:

Guide to running Hadoop on Windows

2008-06-14 Thread Hayes Davis
I recently got started using Hadoop and spent some time getting distributed Hadoop running on Windows. I didn't find much on Google about running on Windows and the docs are a tad vague on the subject. Anyway, for anyone that's interested, I've written up a guide based on my experience called

Ec2 and MR Job question

2008-06-14 Thread Billy Pearson
I have a question someone may have answered here before, but I cannot find the answer. Assuming I have a cluster of servers hosting a large amount of data, I want to run a large job in which the maps take a lot of CPU power and the reduces take only a small amount. I want to run

Re: Ec2 and MR Job question

2008-06-14 Thread Chris K Wensel
Well, to answer your last question first, just set the number of reducers to zero. But you can't just run reducers without mappers (as far as I know, having never tried), so your local job will need to run identity mappers in order to feed your reducers.
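The two-job split described in this reply can be illustrated with a toy, single-process model in Python. This is only a sketch of the idea, not Hadoop code: `run_job`, `expensive_mapper`, `identity_mapper`, and `sum_reducer` are hypothetical names invented for the illustration. In Hadoop itself the corresponding knobs would be `JobConf.setNumReduceTasks(0)` for the map-only job and `org.apache.hadoop.mapred.lib.IdentityMapper` for the second job.

```python
from collections import defaultdict


def run_job(records, mapper, reducer=None):
    """Toy model of one MapReduce job (hypothetical helper, not a Hadoop API).

    With reducer=None the job is map-only, the analogue of setting the
    number of reducers to zero: mapper output is the job's final output.
    """
    mapped = [pair for record in records for pair in mapper(record)]
    if reducer is None:
        return mapped
    # Shuffle step: group every value under its key before reducing.
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    return [out for key in sorted(groups) for out in reducer(key, groups[key])]


def expensive_mapper(line):
    # Stand-in for the CPU-heavy map work: emit (word, 1) per word.
    for word in line.split():
        yield word.lower(), 1


def identity_mapper(pair):
    # Passes each (key, value) pair through unchanged.
    yield pair


def sum_reducer(key, values):
    # Cheap reduce step: total the counts for one key.
    yield key, sum(values)


lines = ["Hadoop runs maps", "maps feed reduces", "Hadoop reduces"]

# Job 1 (e.g. on the rented machines): map-only, output stored as-is.
stage1 = run_job(lines, expensive_mapper)

# Job 2 (e.g. on the local cluster): identity mappers feed the reducers.
stage2 = run_job(stage1, identity_mapper, sum_reducer)
```

Under these assumptions, running the two jobs back to back gives the same result as a single job with both the real mapper and the reducer; the only difference is that the intermediate output of the first job is persisted between them.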

Re: Ec2 and MR Job question

2008-06-14 Thread Billy Pearson
I understand how to run it as two jobs. My only question is: is there a way to make the mappers store the final output in HDFS, so I can kill the EC2 machines without waiting until the reduce stage ends? Billy Chris K Wensel [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] well, to

Re: Ec2 and MR Job question

2008-06-14 Thread Billy Pearson
My second question is about the EC2 machines: has anyone solved the hostname problem in an automated way? For example, if I launch an EC2 server to run a task tracker, the hostname is reported back to my local cluster with its internal address, so the local reduce task cannot access the map files on the EC2