I have my own class for the OutputCollector value, defined as follows inside the 
file:

public static class Turple 
{
    public int name;
    public int ID;
}

In the main method, I call conf.setOutputValueClass(Turple.class) to specify 
the output value class.

Inside the map, combiner, and reduce classes, I simply store some information in 
a new Turple and output it:

    Turple turp = new Turple();
    turp.name = ddddd;
    turp.ID = 1111;

    output.collect(key, turp);


The output value types of the map, combiner, and reduce classes are all set to Turple already.

The code compiles successfully, but when I run it I keep getting the following 
message:
-----------------------------------------------------------------------------------------
08/08/20 14:03:44 INFO mapred.JobClient: Task Id : task_200808161218_0082_m_000001_0, Status : FAILED
java.lang.NullPointerException
    at org.apache.hadoop.io.serializer.SerializationFactory.getSerializer(SerializationFactory.java:73)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:373)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:185)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124)

----------------------------------------------------------------------------------------------------

It seems there is some problem with serialization. I think it is caused by 
my misusing the Java class.
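
If the serialization guess is right, one likely cause is that Turple is a plain 
Java class. As far as I understand, Hadoop's default serialization only accepts 
value classes that implement org.apache.hadoop.io.Writable, and the factory finds 
no serializer for anything else. Below is a minimal sketch of what a Writable 
version of Turple might look like. To keep it compilable without Hadoop on the 
classpath, it uses a local stand-in interface with the same two methods as 
Writable; the names TurpleSketch and roundTrip are hypothetical and only there 
for illustration. In a real job the class would implement 
org.apache.hadoop.io.Writable directly.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class TurpleSketch {

    // Local stand-in mirroring the two methods of org.apache.hadoop.io.Writable.
    // Assumption: in a real job, implement the Hadoop interface instead.
    interface Writable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    public static class Turple implements Writable {
        public int name;
        public int ID;

        // Public no-argument constructor so instances can be created reflectively.
        public Turple() {}

        // Write the fields in a fixed order.
        public void write(DataOutput out) throws IOException {
            out.writeInt(name);
            out.writeInt(ID);
        }

        // Read the fields back in the same order they were written.
        public void readFields(DataInput in) throws IOException {
            name = in.readInt();
            ID = in.readInt();
        }
    }

    // Hypothetical helper: serialize a Turple to bytes and read it back.
    static Turple roundTrip(Turple original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            Turple copy = new Turple();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Turple turp = new Turple();
        turp.name = 42;
        turp.ID = 1111;
        Turple copy = roundTrip(turp);
        System.out.println(copy.name + " " + copy.ID);
    }
}
```

The round trip in main only checks that write and readFields agree with each 
other; it does not exercise Hadoop itself.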

I would appreciate any help. Thanks!

Kunsheng 





      
