I could be wrong -- I thought that also controlled what Hadoop assumes
the file system to be for non-absolute paths. Though I now also see an
"fs.defaultFS" parameter that sounds a little more like it.

If setting these resolves the problem, at least it's clear what's going
on. Whether or not things ought to be smarter about assuming a certain
file system is another question.
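For reference, a minimal core-site.xml sketch of what such a setting looks like; the hostname and port here are placeholders, not from this thread, and fs.defaultFS is the newer spelling of fs.default.name (the older key still works in the CDH3-era releases discussed here, just with a deprecation warning in later Hadoop versions):

```xml
<!-- core-site.xml sketch: host and port below are hypothetical placeholders -->
<configuration>
  <property>
    <!-- newer key name; older configs use fs.default.name -->
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>
```

With this set, a relative path like testdata is resolved against HDFS (typically under /user/&lt;username&gt;/) rather than the local file:/ scheme.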

On Tue, Feb 15, 2011 at 5:23 PM, Jeffrey Rodgers <[email protected]> wrote:
> Hm, my understanding has always been fs.default.name should point to your
> namenode.  e.g:
>
>   <property>
>     <name>fs.default.name</name>
>     <value>hdfs://ec2-50-16-170-221.compute-1.amazonaws.com:8020</value>
>   </property>
>
> On Mon, Feb 14, 2011 at 5:37 PM, Sean Owen <[email protected]> wrote:
>>
>> I think you're not setting your fs.default.name appropriately in the
>> Hadoop config? This should control the base from which paths are
>> resolved, so if this is not where you think it should be looking,
>> check that setting.
>>
>> On Mon, Feb 14, 2011 at 10:34 PM, Jeffrey Rodgers <[email protected]>
>> wrote:
>> > Hello,
>> >
>> > My test environment is using Cloudera's Hadoop (CDH beta 3) using Whirr
>> > to
>> > spawn the EC2 cluster.  I am spawning the cluster from another EC2
>> > instance.
>> >
>> > I'm attempting to use the Kmeans example following the instructions from
>> > the
>> > Quickstart guide.  I mount my testdata on the HDFS and see:
>> >
>> > drwxr-xr-x   - ubuntu supergroup          0 2011-02-14 21:48
>> > /user/ubuntu/Mahout-trunk
>> >
>> > Within Mahout-trunk is /testdata/.  Note the usage of /user/ubuntu/.
>> >
>> > When I run the examples, they seem to be looking for /home/ (see error
>> > log
>> > below).  Looking through the code, it looks like there are functions
>> > for getInput, so I assume there is a configuration setting of sorts,
>> > but it is not apparent to me.
>> >
>> > no HADOOP_HOME set, running locally
>> > Feb 14, 2011 10:05:14 PM org.slf4j.impl.JCLLoggerAdapter warn
>> > WARNING: No
>> > org.apache.mahout.clustering.syntheticcontrol.canopy.Job.props
>> > found on classpath, will use command-line arguments only
>> > Feb 14, 2011 10:05:14 PM org.slf4j.impl.JCLLoggerAdapter info
>> > INFO: Running with default arguments
>> > Feb 14, 2011 10:05:14 PM org.apache.hadoop.metrics.jvm.JvmMetrics init
>> > INFO: Initializing JVM Metrics with processName=JobTracker, sessionId=
>> > Feb 14, 2011 10:05:14 PM org.apache.hadoop.mapred.JobClient
>> > configureCommandLineOptions
>> > WARNING: Use GenericOptionsParser for parsing the arguments.
>> > Applications
>> > should implement Tool for the same.
>> > Exception in thread "main"
>> > org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path
>> > does
>> > not exist: file:/home/ubuntu/Mahout-trunk/testdata
>> > <trimmed>
>> >
>> > Thanks in advance,
>> > Jeff
>> >
>
>