RE: Error using MultipleInputs

Sanchita Adhya Thu, 05 Jul 2012 05:18:24 -0700

Thank you Bejoy for your prompt response! I have gotten past the error!

-----Original Message-----
From: Bejoy Ks [mailto:bejoy.had...@gmail.com] 
Sent: Thursday, July 05, 2012 5:39 PM
To: common-user@hadoop.apache.org
Subject: Re: Error using MultipleInputs


Hi Sanchita

Try your code after commenting out the following line:

//conf.setInputFormat(TextInputFormat.class);

AFAIK this explicitly sets the input format to TextInputFormat, overriding the
DelegatingInputFormat that MultipleInputs registered, and hence the job fails
at submission with 'No input paths specified in job'.
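
To illustrate why the order of the calls matters, here is a small self-contained sketch. It is a toy model, not Hadoop code: ToyConf, addInputPath, setInputFormat and outcome are hypothetical stand-ins, and the property names are the ones I believe the old mapred API uses, so treat them as assumptions. The only point is that the configuration is a key/value store where the last write wins.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the configuration override. None of this is the Hadoop API.
final class InputFormatOverrideSketch {
    // Property names assumed from the old mapred API; illustrative only.
    static final String INPUT_FORMAT = "mapred.input.format.class";
    static final String MAPPER = "mapred.mapper.class";
    static final String INPUT_DIR = "mapred.input.dir";

    // Hypothetical stand-in for JobConf: a plain key/value map.
    static final class ToyConf {
        final Map<String, String> props = new HashMap<>();
    }

    // Models MultipleInputs.addInputPath: records the per-path format and
    // mapper, then installs the delegating input format and mapper.
    static void addInputPath(ToyConf conf, String path, String format, String mapper) {
        conf.props.merge("mapred.input.dir.formats", path + ";" + format, (a, b) -> a + "," + b);
        conf.props.merge("mapred.input.dir.mappers", path + ";" + mapper, (a, b) -> a + "," + b);
        conf.props.put(INPUT_FORMAT, "DelegatingInputFormat");
        conf.props.put(MAPPER, "DelegatingMapper");
    }

    // Models conf.setInputFormat: overwrites whatever was there before.
    static void setInputFormat(ToyConf conf, String format) {
        conf.props.put(INPUT_FORMAT, format);
    }

    // Models what the job would do with the final configuration.
    static String outcome(ToyConf conf) {
        if ("DelegatingInputFormat".equals(conf.props.get(INPUT_FORMAT))) {
            return "OK: per-path formats and mappers are used";
        }
        if (!conf.props.containsKey(INPUT_DIR)) {
            // A plain file input format finds no paths of its own.
            return "IOException: No input paths specified in job";
        }
        if ("DelegatingMapper".equals(conf.props.get(MAPPER))) {
            // Plain splits reach a mapper that expects tagged splits.
            return "ClassCastException: FileSplit cannot be cast to TaggedInputSplit";
        }
        return "OK: plain single-format job";
    }

    public static void main(String[] args) {
        ToyConf fixed = new ToyConf();
        addInputPath(fixed, "/in/existing", "TextInputFormat", "MapExisting");
        System.out.println(outcome(fixed));

        ToyConf overridden = new ToyConf();
        addInputPath(overridden, "/in/existing", "TextInputFormat", "MapExisting");
        setInputFormat(overridden, "TextInputFormat"); // the line to comment out
        System.out.println(outcome(overridden));

        // Adding plain input paths on top only moves the failure into the mappers.
        overridden.props.put(INPUT_DIR, "/in/existing");
        System.out.println(outcome(overridden));
    }
}
```

The second and third cases correspond to the two errors you reported: first no input paths, then the cast failure once the FileInputFormat paths are added back.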

Regards
Bejoy KS

On Thu, Jul 5, 2012 at 5:19 PM, Sanchita Adhya <sad...@infocepts.com> wrote:
> Hi,
>
>
>
> I am using Cloudera's Hadoop distribution (Hadoop 0.20.2-cdh3u3) and trying
> to use MultipleInputs with a separate mapper class for each input path, in
> the following manner:
>
>
>
> public static void main(String[] args) throws Exception {
>     JobConf conf = new JobConf(IntegrateExisting.class);
>     conf.setJobName("IntegrateExisting");
>
>     conf.setOutputKeyClass(Text.class);
>     conf.setOutputValueClass(Text.class);
>
>     Path existingKeysInputPath = new Path(args[0]);
>     Path newKeysInputPath = new Path(args[1]);
>     Path outputPath = new Path(args[2]);
>
>     MultipleInputs.addInputPath(conf, existingKeysInputPath,
>             TextInputFormat.class, MapExisting.class);
>     MultipleInputs.addInputPath(conf, newKeysInputPath,
>             TextInputFormat.class, MapNew.class);
>
>     conf.setCombinerClass(ReduceAndFilterOut.class);
>     conf.setReducerClass(ReduceAndFilterOut.class);
>
>     conf.setInputFormat(TextInputFormat.class);
>     conf.setOutputFormat(TextOutputFormat.class);
>
>     FileOutputFormat.setOutputPath(conf, outputPath);
>
>     //FileInputFormat.addInputPath(conf, existingKeysInputPath);
>     //FileInputFormat.addInputPath(conf, newKeysInputPath);
>
>     JobClient.runJob(conf);
> }
>
>
>
> With the two FileInputFormat lines commented out as shown above, the MR job
> fails with the following error:
>
>
>
> 12/07/05 16:59:25 ERROR security.UserGroupInformation: PriviledgedActionException as:root (auth:SIMPLE) cause:java.io.IOException: No input paths specified in job
> Exception in thread "main" java.io.IOException: No input paths specified in job
>         at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:153)
>         at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:205)
>         at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:971)
>         at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:963)
>         at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
>         at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:880)
>         at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
>         at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:807)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1242)
>         at org.myorg.IntegrateExisting.main(IntegrateExisting.java:122)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
>
>
>
> Uncommenting those lines leads to the following error in the mappers:
>
>
>
> java.lang.ClassCastException: org.apache.hadoop.mapred.FileSplit cannot be cast to org.apache.hadoop.mapred.lib.TaggedInputSplit
>         at org.apache.hadoop.mapred.lib.DelegatingMapper.map(DelegatingMapper.java:48)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
>         at org.apache.hadoop.mapred.Child.main(Child.java:264)
>
>
>
> I see that MAPREDUCE-1178, which discusses the second error, is included
> in the CDH3 version. Is there any code missing from the above piece?
>
>
>
> Thanks for the help.
>
>
>
> Regards,
>
> Sanchita
>
>
>
