I was having the same problem trying to read from HCatalog with the Scala API. My way around it was to create a wrapper InputFormat in Java that uses Spark's SerializableWritable.
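
The gist linked below has the Java version; roughly, the idea translates to the following Scala sketch (untested here; SerializableCustomInputFormat is just an illustrative name, and the asInstanceOf cast is an assumption to get past the raw types):

    import org.apache.hadoop.io.Writable
    import org.apache.hadoop.mapred.{InputFormat, InputSplit, JobConf, RecordReader, Reporter}
    import org.apache.spark.SerializableWritable

    // Delegate splits and reading to the real input format, but hand Spark
    // keys/values wrapped in SerializableWritable so they can be serialized.
    // Treating the key as a plain Writable (WritableComparable extends
    // Writable) also sidesteps the raw-type generics issue discussed below.
    class SerializableCustomInputFormat
        extends InputFormat[SerializableWritable[Writable], SerializableWritable[Writable]] {

      private val underlying =
        (new CustomInputFormat).asInstanceOf[InputFormat[Writable, Writable]]

      def getSplits(job: JobConf, numSplits: Int): Array[InputSplit] =
        underlying.getSplits(job, numSplits)

      def getRecordReader(split: InputSplit, job: JobConf, reporter: Reporter)
          : RecordReader[SerializableWritable[Writable], SerializableWritable[Writable]] = {
        val reader = underlying.getRecordReader(split, job, reporter)
        new RecordReader[SerializableWritable[Writable], SerializableWritable[Writable]] {
          def createKey() = new SerializableWritable(reader.createKey())
          def createValue() = new SerializableWritable(reader.createValue())
          // the old mapred API fills key/value in place, so delegate to the
          // wrapped Writables
          def next(key: SerializableWritable[Writable],
                   value: SerializableWritable[Writable]): Boolean =
            reader.next(key.value, value.value)
          def getPos(): Long = reader.getPos()
          def getProgress(): Float = reader.getProgress()
          def close(): Unit = reader.close()
        }
      }
    }

You would then read with something like:

    val rdd = spark.hadoopRDD(job, classOf[SerializableCustomInputFormat],
      classOf[SerializableWritable[Writable]], classOf[SerializableWritable[Writable]])
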
I hacked this up on Friday afternoon, tested it a few times, and it seemed to work well. Here's an example: https://gist.github.com/granturing/7201912 I'm new to Spark and Scala, so this may not be the "right" way, but it worked for me! :)

From: Arun Kumar <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Monday, October 28, 2013 4:52 AM
To: "[email protected]" <[email protected]>
Subject: Reading custom inputformat from hadoop dfs

Hi,

I am trying to read a custom sequence file from the Hadoop file system. The CustomInputFormat class implements InputFormat<WritableComparable, Writable>. I am able to read the file into a JavaPairRDD as follows:

    JobConf job = new JobConf();
    FileInputFormat.setInputPaths(job, new Path(input));
    JavaPairRDD<WritableComparable, Writable> rdd =
        spark.hadoopRDD(job, CustomInputFormat.class, WritableComparable.class, Writable.class);

But I want to read it directly from the Scala API, which I am trying as follows:

    val job = new JobConf()
    FileInputFormat.setInputPaths(job, new Path(input))
    spark.hadoopRDD(job, classOf[CustomInputFormat], classOf[WritableComparable[Object]], classOf[Writable])

I am getting the following error:

    [error] argument expression's type is not compatible with formal parameter type;
    [error]  found   : java.lang.Class[CustomInputFormat]
    [error]  required: java.lang.Class[_ <: org.apache.hadoop.mapred.InputFormat[?K,?V]]

But my CustomInputFormat class implements InputFormat<WritableComparable, Writable>. Are the generics causing the compilation problem? WritableComparable expects a type parameter.
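
For what it's worth, the error does look like a generics problem: because CustomInputFormat implements the raw type InputFormat<WritableComparable, Writable>, scalac cannot infer the K and V type parameters of hadoopRDD. One workaround (a sketch, untested; fmtClass is just a local name) is to cast the class object to a parameterized type so that K and V line up with the key and value classes:

    import org.apache.hadoop.fs.Path
    import org.apache.hadoop.io.{Writable, WritableComparable}
    import org.apache.hadoop.mapred.{FileInputFormat, InputFormat, JobConf}

    val job = new JobConf()
    FileInputFormat.setInputPaths(job, new Path(input))

    // Cast the raw-typed class object to a parameterized InputFormat type;
    // the cast is safe at runtime because generics are erased.
    val fmtClass = classOf[CustomInputFormat]
      .asInstanceOf[Class[InputFormat[WritableComparable[Object], Writable]]]

    val rdd = spark.hadoopRDD(job, fmtClass,
      classOf[WritableComparable[Object]], classOf[Writable])

Note this only gets past the compile error: the Writables themselves are still not java.io.Serializable, which is the problem the SerializableWritable wrapper above is meant to solve.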
