After specifying the NLineInputFormat option, my streaming job fails with

Error from attempt_201205171448_0092_m_000000_0: java.lang.RuntimeException: 
PipeMapRed.waitOutputThreads(): subprocess failed with code 2

It spawns two mappers, but I am not sure whether each mapper runs with one of 
the file names listed in the input file. I was expecting one mapper to run with 
/user/devi/s_input/a.txt and one mapper to run with /user/devi/s_input/b.txt. I 
dug into the task files but could not find anything.

Here is the simple mapper Perl script. All it does is read the file named on 
its STDIN and print the contents. (It needs to do much more, but I could not 
get even this basic job to run.)

 $i = 0;
   $userinput = <STDIN>;
   chomp($userinput);   # strip the trailing newline before using it as a path
   # NLineInputFormat may deliver "offset<TAB>filename"; keep the last field
   ($userinput) = (split /\t/, $userinput)[-1];
   open(INFILE, "<", $userinput) || die "could not open the file $userinput \n";
   while (<INFILE>) {
     my $line = $_;
     print $i . $line;
     $i++;
   }
   close(INFILE);
exit;

My command is:

hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2-cdh3u3.jar \
  -input /user/devi/file.txt \
  -output /user/devi/s_output \
  -mapper "/usr/bin/perl /home/devi/Perl/crash_parser.pl" \
  -inputformat org.apache.hadoop.mapred.lib.NLineInputFormat
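For what it's worth, my understanding (an assumption, not something I have 
confirmed) is that with -inputformat NLineInputFormat the streaming mapper 
receives each record on STDIN as key<TAB>value, i.e. the byte offset, a tab, 
then the line from file.txt. If so, the script has to strip the key and the 
trailing newline before treating the value as a filename. A quick local 
simulation of that parsing, using the sample path from above:

```shell
# Simulate one NLineInputFormat record arriving on the mapper's STDIN:
# "<byte offset><TAB><line from the input file>"
printf '0\t/user/devi/s_input/a.txt\n' |
  awk -F'\t' '{ sub(/\r$/, "", $2); print $2 }'
# should leave just the path: /user/devi/s_input/a.txt
```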


Really appreciate your help.

Devi


 




________________________________
From: Robert Evans <ev...@yahoo-inc.com>
To: "common-user@hadoop.apache.org" <common-user@hadoop.apache.org>
Sent: Thu, August 2, 2012 1:15:05 PM
Subject: Re: Hadoop : java.lang.ClassCastException: 
org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.Text

The default text input format has a key of a LongWritable that is the
offset into the file.  The value is the full line.
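In other words, a mapper reading the default input format should declare 
map(LongWritable key, Text value, Context context) rather than a Text key. The 
comma-splitting of the value can also be sanity-checked outside Hadoop; this 
small standalone class (the class name is mine, and it assumes plain 
String.split(",") semantics on one sample row from the data quoted below):

```java
// Standalone sanity check of the value parsing used in the quoted mapper:
// split one sample row and apply the same year/claims extraction.
public class ParseCheck {
    public static void main(String[] args) {
        String value = "3070801,1963,1096,,\"BE\",\"\",,1,,269,6,69,,1,,0,,,,,,,";
        String[] fields = value.split(",");
        String year = fields[1];    // GYEAR column
        String claims = fields[8];  // CLAIMS column (empty in this row)
        // The quoted mapper only emits when CLAIMS is non-empty and unquoted,
        // so this sample row would be filtered out.
        boolean emitted = claims.length() > 0 && !claims.startsWith("\"");
        System.out.println("year=" + year + ", emitted=" + emitted);
    }
}
```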

On 8/2/12 2:59 PM, "Harit Himanshu" <harit.himan...@gmail.com> wrote:

>StackOverflow link -
>http://stackoverflow.com/questions/11784729/hadoop-java-lang-classcastexce
>ption-org-apache-hadoop-io-longwritable-cannot
>
>----------
>
>My program looks like
>
>public class TopKRecord extends Configured implements Tool {
>
>    public static class MapClass extends Mapper<Text, Text, Text, Text> {
>
>        public void map(Text key, Text value, Context context)
>                throws IOException, InterruptedException {
>            // your map code goes here
>            String[] fields = value.toString().split(",");
>            String year = fields[1];
>            String claims = fields[8];
>
>            if (claims.length() > 0 && (!claims.startsWith("\""))) {
>                context.write(new Text(year), new Text(claims));
>            }
>        }
>    }
>  public int run(String args[]) throws Exception {
>        Job job = new Job();
>        job.setJarByClass(TopKRecord.class);
>
>        job.setMapperClass(MapClass.class);
>
>        FileInputFormat.setInputPaths(job, new Path(args[0]));
>        FileOutputFormat.setOutputPath(job, new Path(args[1]));
>
>        job.setJobName("TopKRecord");
>        job.setMapOutputValueClass(Text.class);
>        job.setNumReduceTasks(0);
>        boolean success = job.waitForCompletion(true);
>        return success ? 0 : 1;
>    }
>
>    public static void main(String args[]) throws Exception {
>        int ret = ToolRunner.run(new TopKRecord(), args);
>        System.exit(ret);
>    }
>}
>
>The data looks like
>
>"PATENT","GYEAR","GDATE","APPYEAR","COUNTRY","POSTATE","ASSIGNEE","ASSCODE
>","CLAIMS","NCLASS","CAT","SUBCAT","CMADE","CRECEIVE","RATIOCIT","GENERAL"
>,"ORIGINAL","FWDAPLAG","BCKGTLAG","SELFCTUB","SELFCTLB","SECDUPBD","SECDLW
>BD"
>3070801,1963,1096,,"BE","",,1,,269,6,69,,1,,0,,,,,,,
>3070802,1963,1096,,"US","TX",,1,,2,6,63,,0,,,,,,,,,
>3070803,1963,1096,,"US","IL",,1,,2,6,63,,9,,0.3704,,,,,,,
>3070804,1963,1096,,"US","OH",,1,,2,6,63,,3,,0.6667,,,,,,,
>
>On running this program I see the following on console
>
>12/08/02 12:43:34 INFO mapred.JobClient: Task Id :
>attempt_201208021025_0007_m_000000_0, Status : FAILED
>java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot
>be cast to org.apache.hadoop.io.Text
>    at com.hadoop.programs.TopKRecord$MapClass.map(TopKRecord.java:26)
>    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>    at java.security.AccessController.doPrivileged(Native Method)
>    at javax.security.auth.Subject.doAs(Subject.java:396)
>    at 
>org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.
>java:1121)
>    at org.apache.hadoop.mapred.Child.main(Child.java:249)
>
>I believe that the class types are mapped correctly, per the Mapper API docs
>(http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/Mapper.html).
>
>Could you please let me know what I am doing wrong here?
>
>
>Thank you
>
>+ Harit Himanshu
