Thanks much!!! Very fair answers. (At least now I know it is not me doing something wrong, and it works now.) Now if only someone would add a line to comment out/in at the end of the hadoop script, like so... or do I have to put that on the dev list?
#uncomment to debug processes
#export HADOOP_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=9009 $HADOOP_OPTS"

At least that is how I was debugging these things and stepping through Hadoop code (attaching to that port is sketched at the end of this thread).

Thanks,
Dean

From: Aaron Eng [mailto:a...@maprtech.com]
Sent: Wednesday, December 08, 2010 4:31 PM
To: hdfs-user@hadoop.apache.org
Subject: Re: wordcount example using local file system instead of distributed one?

>Why did that work out of curiosity?

Each of the config files can be used to set only specific configuration values. So even though you had a valid config property/value, it didn't apply because it wasn't in the right file. So, why did it work? Because that's the way it is coded. A better question: why didn't it work with your original config files? Because of poor usability. I think you can find a link to which config values go in which config files somewhere on the Apache Hadoop site (that was vague, wasn't it?).

>Am I doing something wrong

Yes, you need to run start-dfs.sh on the node that you want to become the namenode, and you need to run start-mapred.sh on the node that you want to become the jobtracker (sketched at the end of this thread). Again, the reason is very poor usability and bad scripting. Of course, someone will inevitably say that you could just write your own scripts to control the services...

On Wed, Dec 8, 2010 at 3:09 PM, Hiller, Dean (Contractor) <dean.hil...@broadridge.com> wrote:

Sweeeeeeet!!!!! That worked... didn't see that in the docs at all. Why did that work, out of curiosity?

Also, I run ./start-dfs.sh on node 1 and ./start-mapred.sh on node 2. Am I doing something wrong in that I cannot run those on any other nodes? Ie. if I run them on a different node, the NameNode process does not run on node 1 and the JobTracker does not run on node 2. It is like those are locked to those boxes because of the config, whereas the slaves file allows all the slaves to be started just fine. Am I doing something wrong, or is that just simply always the case? My mapred config is set to a jobtracker of node 2, my masters file has only node 1, and my slaves file has node 1 and node 2. Is there no masters-like file for mapred for when I run ./start-mapred? I would think ./start-dfs.sh should work from any node, since the masters and slaves files contain all the nodes it needs to start things on, but it doesn't seem to work? (I just want it to be more seamless in case I am on the wrong node; right now it seems to shut down only some things, not all, when done from the wrong node.)

Thanks,
Dean

From: Aaron Eng [mailto:a...@maprtech.com]
Sent: Wednesday, December 08, 2010 3:57 PM
To: hdfs-user@hadoop.apache.org
Subject: Re: wordcount example using local file system instead of distributed one?

You will also need to restart services after that, in case that wasn't obvious.

On Wed, Dec 8, 2010 at 2:56 PM, Aaron Eng <a...@maprtech.com> wrote:

Hi Dean,

Try removing the fs.default.name parameter from hdfs-site.xml and put it in core-site.xml.
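A minimal sketch of that change, reusing the address from Dean's hdfs-site.xml further down; the conf/ location is an assumption based on a stock 0.20.x layout, and the daemons need a restart afterwards, as Aaron notes above:

# fs.default.name belongs in core-site.xml, not hdfs-site.xml (0.20.x layout assumed)
cat > conf/core-site.xml <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://206.88.43.168:54310</value>
  </property>
</configuration>
EOF
# remember to remove the same property from conf/hdfs-site.xml, then restart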
On Wed, Dec 8, 2010 at 2:46 PM, Hiller, Dean (Contractor) <dean.hil...@broadridge.com> wrote:

I run the following wordcount example. (My hadoop shell seems to always hit the local file system first, so I had to add the hdfs:// URLs... is that normal?? I mean, I see it printing configDir= pointing at the directory I moved the config to, which is what I set the env var to, and the settings are in the config files there, but it still hits the local file system.)

[r...@localhost hadoop]# ./bin/hadoop jar hadoop-0.20.2-examples.jar wordcount hdfs://206.88.43.8:54310/wordcount hdfs://206.88.43.168:54310/wordcount-out
configDir=/mnt/mucho/hadoop-config/
classpath=/opt/hbase-install/hbase/hbase-0.20.6.jar:/opt/hbase-install/hbase/hbase-0.20.6-test.jar:/mnt/mucho/hbase-config/:/opt/hbase-install/hbase/lib/zookeeper-3.2.2.jar
10/12/08 08:42:33 INFO input.FileInputFormat: Total input paths to process : 13
org.apache.hadoop.ipc.RemoteException: java.io.FileNotFoundException: File file:/tmp/hadoop-root/mapred/system/job_201012080654_0010/job.xml does not exist.
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:192)
        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:142)
        at org.apache.hadoop.fs.LocalFileSystem.copyToLocalFile(LocalFileSystem.java:61)
        at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1197)
        at org.apache.hadoop.mapred.JobInProgress.<init>(JobInProgress.java:257)
        at org.apache.hadoop.mapred.JobInProgress.<init>(JobInProgress.java:234)
        at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:2993)

In case it helps, here is my hdfs-site.xml, which is used by both the started daemons AND the client (is that an issue... using the same one?):

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://206.88.43.168:54310</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/data/hadooptmp</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/opt/data/hadoop</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
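To tie the thread together, here is Aaron's point about where the start scripts must run, as a minimal sketch. "node 1" and "node 2" stand for the hosts from Dean's description, and the bin/ paths assume a stock 0.20.x install:

# on node 1 (the host you want to become the NameNode; also listed in conf/masters)
bin/start-dfs.sh

# on node 2 (the host named in mapred.job.tracker, which becomes the JobTracker)
bin/start-mapred.sh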
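And to close the loop on the debug line at the top of the thread: with that HADOOP_OPTS value set (suspend=y), the daemon's JVM pauses at startup and waits for a debugger on port 9009. One way to attach is jdb, which ships with the JDK; any IDE's remote-debug configuration against the same host/port works as well:

# attach to the suspended JVM; replace localhost with the daemon's host if remote
jdb -connect com.sun.jdi.SocketAttach:hostname=localhost,port=9009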