I'm just reading data from HDFS through Spark. It throws
*java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be
cast to org.apache.hadoop.io.BytesWritable* at line 6 of my code, which is
the byte[] bytes = tuple2._2().getBytes() statement in the snippet below. I
never use LongWritable anywhere in my code, so I have no idea how the data
ended up in that format.
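
In case it helps narrow this down, here is a small check I'm planning to run
to see which key and value classes the SequenceFile header actually declares
(just a sketch, assuming the file is reachable with a plain Hadoop
Configuration from the same cluster; SequenceFileTypeCheck and checkTypes are
names I made up for this):

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;

public class SequenceFileTypeCheck
{
    // Prints the key and value class names recorded in the SequenceFile header;
    // these are the types Spark's sequenceFile() call has to match.
    public static void checkTypes(String hdfsPath) throws IOException
    {
        Configuration conf = new Configuration();
        SequenceFile.Reader reader = new SequenceFile.Reader(
                conf, SequenceFile.Reader.file(new Path(hdfsPath)));
        try
        {
            System.out.println("key class   = " + reader.getKeyClassName());
            System.out.println("value class = " + reader.getValueClassName());
        }
        finally
        {
            reader.close();
        }
    }
}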

Note: I'm not using any MapReduce concepts, and I'm not creating Jobs
explicitly, so I can't use job.setMapOutputKeyClass and
job.setMapOutputValueClass.

import com.google.protobuf.InvalidProtocolBufferException;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.Function;
import scala.Tuple2;

JavaPairRDD<IntWritable, BytesWritable> hdfsContent =
    sparkContext.sequenceFile(hdfsPath, IntWritable.class, BytesWritable.class);

JavaRDD<FileData> lines = hdfsContent.map(
    new Function<Tuple2<IntWritable, BytesWritable>, FileData>()
    {
        public FileData call(Tuple2<IntWritable, BytesWritable> tuple2)
            throws InvalidProtocolBufferException
        {
            // The ClassCastException is thrown here, when the value is cast to BytesWritable.
            byte[] bytes = tuple2._2().getBytes();
            return FileData.parseFrom(bytes);
        }
    });
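
Side note, in case it matters: I'm aware that BytesWritable.getBytes() returns
the padded backing buffer rather than just getLength() bytes, so once the cast
issue is sorted out I was planning to switch to something like this inside
call() (assuming Hadoop 2.x, where copyBytes() is available):

            // copyBytes() returns only the valid getLength() bytes, without the
            // padding that getBytes() can carry, so the protobuf parser sees
            // just the payload.
            byte[] bytes = tuple2._2().copyBytes();
            return FileData.parseFrom(bytes);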
