Hello Andrey 

Take a look at the -cacheFile and -cacheArchive options of streaming; they may help you out:

http://hadoop.apache.org/core/docs/current/streaming.html#Large+files+and+archives+in+Hadoop+Streaming
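
The stack trace in your message fails inside TaskTracker$Child, i.e. in the
child task JVM rather than in the client that submits the job, so the jar
being on your local $CLASSPATH would not necessarily make it visible there.
A minimal sketch for checking this is below. ClassProbe and isLoadable are
hypothetical names of mine, not part of Hadoop; the idea is only to print
what a given JVM can actually load and what its classpath is.

```java
// Hypothetical diagnostic helper, not part of Hadoop or Hadoop streaming.
class ClassProbe {

    // Returns true if className can be resolved through the given loader.
    static boolean isLoadable(String className, ClassLoader loader) {
        try {
            // initialize=false: only locate the class, do not run static initializers
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class name taken from the original message; override via the first argument.
        String name = args.length > 0 ? args[0] : "org.company.TestMapper";
        ClassLoader loader = ClassProbe.class.getClassLoader();

        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
        System.out.println(name + " loadable: " + isLoadable(name, loader));
    }
}
```

Running it once with your jar on the classpath and once without should show
the difference between the two situations. Logging the same two values from
inside the task would tell you whether the jar really reaches the child JVM.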


Thank you,

---
Peeyush 

On Tue, 2008-03-11 at 17:30 +0200, Andrey Pankov wrote:

> Hi all,
> 
> I'm still new to Hadoop. I'd like to use Hadoop streaming to combine a
> mapper written as a Java class with a reducer written as a C++ program.
> I'm just starting on this task and am having trouble with the Java class.
> It looks something like this:
> 
> 
> package org.company;
>   ...
> public class TestMapper extends MapReduceBase implements Mapper {
>   ...
>    public void map(WritableComparable key, Writable value,
>      OutputCollector output, Reporter reporter) throws IOException {
>   ...
> 
> 
> I created a jar file with my class, and it is accessible via $CLASSPATH.
> I'm running the streaming job using
> 
> $HSTREAMING -mapper org.company.TestMapper -reducer "wc -l" -input /data 
> -output /out1
> 
> Hadoop cannot find the TestMapper class. I'm using hadoop-0.16.0. The error is:
> 
> ===========================
> 2008-03-07 18:58:07,734 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: 
> Initializing JVM Metrics with processName=MAP, sessionId=
> 2008-03-07 18:58:07,833 INFO org.apache.hadoop.mapred.MapTask: 
> numReduceTasks: 1
> 2008-03-07 18:58:07,910 WARN org.apache.hadoop.mapred.TaskTracker: Error 
> running child
> java.lang.RuntimeException: java.lang.RuntimeException: 
> java.lang.ClassNotFoundException: org.company.TestMapper
>          at 
> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:639)
>          at 
> org.apache.hadoop.mapred.JobConf.getMapperClass(JobConf.java:728)
>          at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:36)
>          at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:58)
>          at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:82)
>          at org.apache.hadoop.mapred.MapTask.run(MapTask.java:204)
>          at 
> org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2071)
> Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: 
> org.company.TestMapper
>          at 
> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:607)
>          at 
> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:631)
>          ... 6 more
> Caused by: java.lang.ClassNotFoundException: org.company.TestMapper
>          at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
>          at java.security.AccessController.doPrivileged(Native Method)
>          at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
>          at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>          at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:276)
>          at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
>          at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)
>          at java.lang.Class.forName0(Native Method)
>          at java.lang.Class.forName(Class.java:247)
>          at 
> org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:587)
>          at 
> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:605)
>          ... 7 more
> ===========================
> 
> What I find interesting: I added some debugging println() calls to Hadoop
> streaming (StreamJob.java and StreamUtil.java). Streaming can see
> TestMapper at the job configuration stage (the StreamJob.setJobConf()
> routine) but not later. The following code creates a new instance of
> TestMapper and calls toString() defined in TestMapper. It works.
> 
>      if (mapCmd_ != null) {
>        c = StreamUtil.goodClassOrNull(mapCmd_, defaultPackage);
>        if (c != null) {
>          System.out.println("#######################");
>          try {
>              System.out.println(c.newInstance().toString());
>          } catch (Exception e) { }
>          System.out.println("#######################");
>          jobConf_.setMapperClass(c);
>        } else {
> ...
>        }
>      }
> 
> 
> I tried adding the jar file with TestMapper using the option
>   "-file test_mapper.jar". The result is the same.
> 
> Could anybody advise me? Thanks in advance,
> 
> ---
> Andrey Pankov.
> 
