Finally I made it work. The trick was the asSubclass method:

val mongoRDD = sc.newAPIHadoopFile(
  "file:///root/jobs/dump/input.bson",
  classOf[BSONFileInputFormat].asSubclass(
    classOf[org.apache.hadoop.mapreduce.lib.input.FileInputFormat[Object, BSONObject]]),
  classOf[Object],
  classOf[BSONObject],
  config)
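For anyone hitting the same compile error, here is the whole job stitched into a self-contained sketch. Only the lines quoted below are from my actual code; the object name, app name, and the final count() check are illustrative, and the comment is my understanding of why asSubclass is needed (BSONFileInputFormat declares itself against the raw FileInputFormat, so the Scala compiler cannot infer K and V from the class literal alone):

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
import org.apache.spark.{SparkConf, SparkContext}
import org.bson.BSONObject
import com.mongodb.hadoop.BSONFileInputFormat

// Hypothetical wrapper object; the body is the code from this thread.
object BsonInputJob {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("bson-input"))

    val config = new Configuration()
    config.set("mongo.job.input.format",
      "com.mongodb.hadoop.BSONFileInputFormat")

    // asSubclass re-types the class literal to the parameterized bound
    // that newAPIHadoopFile's [K, V, F <: InputFormat[K, V]] signature
    // expects, which lets type inference succeed.
    val mongoRDD = sc.newAPIHadoopFile(
      "file:///root/jobs/dump/input.bson",
      classOf[BSONFileInputFormat].asSubclass(
        classOf[FileInputFormat[Object, BSONObject]]),
      classOf[Object],
      classOf[BSONObject],
      config)

    // Sanity check: each record is an (Object, BSONObject) pair.
    println(mongoRDD.count())
  }
}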
2014-08-06 0:43 GMT+04:00 Dmitriy Selivanov <selivanov.dmit...@gmail.com>:
> Hello, I have an issue when trying to use a BSON file as Spark input. I use
> mongo-hadoop-connector 1.3.0 and Spark 1.0.0:
>
> val sparkConf = new SparkConf()
> val sc = new SparkContext(sparkConf)
> val config = new Configuration()
> config.set("mongo.job.input.format", "com.mongodb.hadoop.BSONFileInputFormat")
> config.set("mapred.input.dir", "file:///root/jobs/dump/input.bson")
> config.set("mongo.output.uri", "mongodb://" + args(0) + "/" + args(2))
> val mongoRDD = sc.newAPIHadoopFile("file:///root/jobs/dump/input.bson",
>   classOf[BSONFileInputFormat], classOf[Object], classOf[BSONObject], config)
>
> But on the last line I receive the error: "inferred type arguments
> [Object,org.bson.BSONObject,com.mongodb.hadoop.BSONFileInputFormat] do not
> conform to method newAPIHadoopFile's type parameter bounds [K,V,F <:
> org.apache.hadoop.mapreduce.InputFormat[K,V]]"
>
> This is very strange, because BSONFileInputFormat extends
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat:
> https://github.com/mongodb/mongo-hadoop/blob/master/core/src/main/java/com/mongodb/hadoop/BSONFileInputFormat.java
>
> How can I solve this issue? I have no problems with
> com.mongodb.hadoop.MongoInputFormat when using a MongoDB collection as
> input. Moreover, there seems to be no problem with the Java API:
> https://github.com/crcsmnky/mongodb-spark-demo/blob/master/src/main/java/com/mongodb/spark/demo/Recommender.java
>
> I'm not a professional Java/Scala developer, please help.
>
> --
> Regards
> Dmitriy Selivanov

--
Regards
Dmitriy Selivanov