On 16-apr-2007, at 18:41, Doug Cutting wrote:
> Eelco Lempsink wrote:
>> Inspired by http://www.mail-archive.com/nutch-[EMAIL PROTECTED]/msg02394.html I'm trying to run Hadoop on multiple CPUs, but without using HDFS.
>
> To be clear: you need some sort of shared filesystem; if not HDFS, then NFS, S3, or something else. For example, the job client interacts with the job tracker by copying files to the shared filesystem named by fs.default.name, and job inputs and outputs are assumed to come from a shared filesystem.
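If I understand that right, the filesystem in question is simply whatever FileSystem.get() returns for the client's configuration, so something like this rough sketch should show which filesystem a job client would end up using (the FsCheck class is just illustrative scaffolding of mine, not part of Hadoop):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    // Illustrative only: prints the default filesystem a client resolves.
    public class FsCheck {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration(); // reads hadoop-site.xml from the classpath
        FileSystem fs = FileSystem.get(conf);     // the filesystem named by fs.default.name
        System.out.println(fs.getName());         // e.g. "local" for the local filesystem
      }
    }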
I'm not trying to run it on a cluster, though, only on one host with multiple CPUs. So I guess the local filesystem counts as shared, and it should be fine.
However, if I try with fs.default.name set to "file:///tmp/hadoop-test/", still nothing happens.
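For reference, the setup I'm describing amounts to a hadoop-site.xml along these lines (the fs.default.name value is the one quoted above; the mapred.job.tracker entry is only an example of pointing everything at this one host):

    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>file:///tmp/hadoop-test/</value>
      </property>
      <!-- example value only: a job tracker on the same host -->
      <property>
        <name>mapred.job.tracker</name>
        <value>localhost:9001</value>
      </property>
    </configuration>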
To provide some more info, the TaskRunner keeps repeating this:

    INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000001_0 Need 12 map output(s)
    INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000001_0 Need 12 map output location(s)
    INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000001_0 Got 0 new map outputs from jobtracker and 0 map outputs from previous failures
    INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000001_0 Got 0 known map output location(s); scheduling...
    INFO org.apache.hadoop.mapred.TaskRunner: task_0001_r_000001_0 Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
I'm not sure whether this is a bug or a misconfiguration, but I'd like to help fix it, so if you need any more information, please let me know.
-- 
Regards,

Eelco Lempsink
