I could be wrong -- I thought that also controlled what Hadoop assumes the file system to be for non-absolute paths. Though I now also see an "fs.defaultFS" parameter that sounds a little more like it.
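As a rough illustration of what "the base from which paths are resolved" means, here is a minimal sketch in plain Java. It is not Hadoop's code and calls no Hadoop API: `qualify` is a hypothetical helper that only mimics the idea. The host `namenode.example.com` is a placeholder, and the working directories are assumptions (the local current directory for `file:///`, and `/user/<username>` for HDFS).

```java
import java.net.URI;

public class DefaultFsSketch {

    // Hypothetical helper (not a Hadoop API): qualify a path against a
    // default file system URI and a working directory.
    public static String qualify(String defaultFs, String workingDir, String path) {
        // A path that already names a scheme is left alone; the default FS is ignored.
        if (URI.create(path).getScheme() != null) {
            return path;
        }
        URI fs = URI.create(defaultFs);
        // Relative paths are resolved against the working directory.
        String absolute = path.startsWith("/") ? path : workingDir + "/" + path;
        // file:/// has no authority; hdfs://host:port does.
        String authority = fs.getAuthority() == null ? "" : "//" + fs.getAuthority();
        return fs.getScheme() + ":" + authority + absolute;
    }

    public static void main(String[] args) {
        // Local default: the relative path ends up on the local disk.
        System.out.println(qualify("file:///", "/home/ubuntu/Mahout-trunk", "testdata"));
        // prints file:/home/ubuntu/Mahout-trunk/testdata

        // HDFS default (placeholder host): the same path ends up in HDFS.
        System.out.println(qualify("hdfs://namenode.example.com:8020", "/user/ubuntu", "testdata"));
        // prints hdfs://namenode.example.com:8020/user/ubuntu/testdata
    }
}
```

The first call reproduces the shape of the path in the error message, which is what you would expect to see if the job is resolving paths against the local file system rather than HDFS.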
If setting these resolves the problem, at least it's clear what's going on. Whether or not things ought to be smarter about assuming a certain file system is another question.

On Tue, Feb 15, 2011 at 5:23 PM, Jeffrey Rodgers <[email protected]> wrote:
> Hm, my understanding has always been fs.default.name should point to your
> namenode, e.g.:
>
> <property>
>   <name>fs.default.name</name>
>   <value>hdfs://ec2-50-16-170-221.compute-1.amazonaws.com:8020</value>
> </property>
>
> On Mon, Feb 14, 2011 at 5:37 PM, Sean Owen <[email protected]> wrote:
>>
>> I think you're not setting your fs.default.name appropriately in the
>> Hadoop config? This should control the base from which paths are
>> resolved, so if this is not where you think it should be looking,
>> check that setting.
>>
>> On Mon, Feb 14, 2011 at 10:34 PM, Jeffrey Rodgers <[email protected]>
>> wrote:
>> > Hello,
>> >
>> > My test environment is using Cloudera's Hadoop (CDH beta 3) using Whirr
>> > to spawn the EC2 cluster. I am spawning the cluster from another EC2
>> > instance.
>> >
>> > I'm attempting to use the Kmeans example following the instructions
>> > from the Quickstart guide. I mount my testdata on the HDFS and see:
>> >
>> > drwxr-xr-x - ubuntu supergroup 0 2011-02-14 21:48 /user/ubuntu/Mahout-trunk
>> >
>> > Within Mahout-trunk is /testdata/. Note the usage of /user/ubuntu/.
>> >
>> > When I run the examples, they seem to be looking for /home/ (see error
>> > log below). Looking through the code, it looks like there are functions
>> > for getInput, so I assume there is a configuration setting of sorts,
>> > but it is not apparent to me.
>> >
>> > no HADOOP_HOME set, running locally
>> > Feb 14, 2011 10:05:14 PM org.slf4j.impl.JCLLoggerAdapter warn
>> > WARNING: No org.apache.mahout.clustering.syntheticcontrol.canopy.Job.props found on classpath, will use command-line arguments only
>> > Feb 14, 2011 10:05:14 PM org.slf4j.impl.JCLLoggerAdapter info
>> > INFO: Running with default arguments
>> > Feb 14, 2011 10:05:14 PM org.apache.hadoop.metrics.jvm.JvmMetrics init
>> > INFO: Initializing JVM Metrics with processName=JobTracker, sessionId=
>> > Feb 14, 2011 10:05:14 PM org.apache.hadoop.mapred.JobClient configureCommandLineOptions
>> > WARNING: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>> > Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/home/ubuntu/Mahout-trunk/testdata
>> > <trimmed>
>> >
>> > Thanks in advance,
>> > Jeff
