Hi list,
I downloaded Hadoop 0.20.1 and I'm now looking at the source of
MapReduce. I have the following questions:
1 - What is the difference between the classes
org.apache.hadoop.mapred.Reducer and
org.apache.hadoop.mapreduce.Reducer? In which cases is each of the two
reducers used?
2 - The same question for the two Mapper classes.
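To make the question concrete, here is how I think the two APIs differ, sketched from the 0.20.1 javadoc (untested on my side; it assumes hadoop-0.20.1-core.jar on the classpath, and the word-count-style sum is just a made-up example). Please correct me if I have this wrong:

```java
import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// Old API: org.apache.hadoop.mapred.Reducer is an *interface*;
// reduce() takes an Iterator plus an OutputCollector and a Reporter.
public class OldApiSumReducer extends MapReduceBase
    implements org.apache.hadoop.mapred.Reducer<Text, IntWritable, Text, IntWritable> {
  public void reduce(Text key, Iterator<IntWritable> values,
                     OutputCollector<Text, IntWritable> output, Reporter reporter)
      throws IOException {
    int sum = 0;
    while (values.hasNext()) {
      sum += values.next().get();
    }
    output.collect(key, new IntWritable(sum));
  }
}

// New API: org.apache.hadoop.mapreduce.Reducer is a *class* you subclass;
// reduce() takes an Iterable and a single Context object.
class NewApiSumReducer
    extends org.apache.hadoop.mapreduce.Reducer<Text, IntWritable, Text, IntWritable> {
  @Override
  protected void reduce(Text key, Iterable<IntWritable> values, Context context)
      throws IOException, InterruptedException {
    int sum = 0;
    for (IntWritable v : values) {
      sum += v.get();
    }
    context.write(key, new IntWritable(sum));
  }
}
```

So my understanding is that mapred is the old API and mapreduce is the new one introduced in 0.20 -- but I'd like to know which one is supposed to be used going forward.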
3 -
When I try to launch the dfs with the command
"/opt/hadoop/bin/start-dfs.sh", I get the following error:
172.24.110.12:40631: error: java.io.IOException: File /tmp/hadoop-pcosta/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
java.io.IOException: File /tmp/hadoop-pcosta/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1267)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
I've formatted the namenode with the command "/opt/hadoop/bin/hadoop
namenode -format", but I still get the error.
Here is my hdfs-site.xml file:
pco...@netgdx-12:/opt/hadoop/conf$ cat hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication.
The actual number of replications can be specified when the file
is created.
The default is used if replication is not specified in create time.
</description>
</property>
<property>
<name>dfs.namenode.logging.level</name>
<value>all</value>
</property>
</configuration>
What are the possible reasons for this to happen?
4 - What is the purpose of the "dfs.replication" property in
hdfs-site.xml?
I've read the definition on the Hadoop site,
"dfs.replication - Default block replication. The actual number of
replications can be specified when the file is created. The default is
used if replication is not specified in create time.", but I still
don't understand it. Does it mean on how many machines a file will be
replicated?
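If I'm reading the description right, dfs.replication is only a cluster-wide default, and each file can override it individually. Here is a sketch of how I imagine the per-file override works through the FileSystem API (untested; it assumes a running HDFS with hadoop-0.20.1-core.jar on the classpath, and the path is made up). Is this the mechanism the description refers to?

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationExample {
  public static void main(String[] args) throws Exception {
    // Picks up core-site.xml / hdfs-site.xml from the classpath,
    // including the dfs.replication default.
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    // Hypothetical path, just for illustration.
    Path p = new Path("/user/pcosta/example.txt");

    // Override the dfs.replication default for this one file:
    // ask HDFS to keep 3 copies of each of its blocks.
    fs.setReplication(p, (short) 3);
  }
}
```

So with dfs.replication set to 1 as in my config, each block would live on exactly one datanode unless a file asks for more -- is that correct?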
I hope all these questions are appropriate for this mailing list.
Thank you.
Regards,
--
Pedro