Finally I made it work. The trick was the asSubclass method:

val mongoRDD = sc.newAPIHadoopFile(
  "file:///root/jobs/dump/input.bson",
  classOf[BSONFileInputFormat].asSubclass(
    classOf[org.apache.hadoop.mapreduce.lib.input.FileInputFormat[Object, BSONObject]]),
  classOf[Object],
  classOf[BSONObject],
  config)
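For anyone hitting the same compile error, here is the whole job stitched into a self-contained sketch. Only the lines quoted below are from my actual code; the object name, app name, and the final count() check are illustrative, and the comment is my understanding of why asSubclass is needed (BSONFileInputFormat declares itself against the raw FileInputFormat, so the Scala compiler cannot infer K and V from the class literal alone):

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
import org.apache.spark.{SparkConf, SparkContext}
import org.bson.BSONObject
import com.mongodb.hadoop.BSONFileInputFormat

// Hypothetical wrapper object; the body is the code from this thread.
object BsonInputJob {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("bson-input"))

    val config = new Configuration()
    config.set("mongo.job.input.format",
      "com.mongodb.hadoop.BSONFileInputFormat")

    // asSubclass re-types the class literal to the parameterized bound
    // that newAPIHadoopFile's [K, V, F <: InputFormat[K, V]] signature
    // expects, which lets type inference succeed.
    val mongoRDD = sc.newAPIHadoopFile(
      "file:///root/jobs/dump/input.bson",
      classOf[BSONFileInputFormat].asSubclass(
        classOf[FileInputFormat[Object, BSONObject]]),
      classOf[Object],
      classOf[BSONObject],
      config)

    // Sanity check: each record is an (Object, BSONObject) pair.
    println(mongoRDD.count())
  }
}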
2014-08-06 0:43 GMT+04:00 Dmitriy Selivanov <selivanov.dmit...@gmail.com>:
> Hello, I have an issue when trying to use a BSON file as Spark input. I use
> mongo-hadoop-connector 1.3.0 and Spark 1.0.0:
>
> val sparkConf = new SparkConf()
> val sc = new SparkContext(sparkConf)
> val config = new Configuration()
> config.set("mongo.job.input.format", "com.mongodb.hadoop.BSONFileInputFormat")
> config.set("mapred.input.dir", "file:///root/jobs/dump/input.bson")
> config.set("mongo.output.uri", "mongodb://" + args(0) + "/" + args(2))
> val mongoRDD = sc.newAPIHadoopFile("file:///root/jobs/dump/input.bson",
>   classOf[BSONFileInputFormat], classOf[Object], classOf[BSONObject], config)
>
> But on the last line I receive the error: "inferred type arguments
> [Object,org.bson.BSONObject,com.mongodb.hadoop.BSONFileInputFormat] do not
> conform to method newAPIHadoopFile's type parameter bounds [K,V,F <:
> org.apache.hadoop.mapreduce.InputFormat[K,V]]"
>
> This is very strange, because BSONFileInputFormat extends
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat:
> https://github.com/mongodb/mongo-hadoop/blob/master/core/src/main/java/com/mongodb/hadoop/BSONFileInputFormat.java
>
> How can I solve this issue? I have no problems with
> com.mongodb.hadoop.MongoInputFormat when using a MongoDB collection as
> input. Moreover, there seems to be no problem with the Java API:
> https://github.com/crcsmnky/mongodb-spark-demo/blob/master/src/main/java/com/mongodb/spark/demo/Recommender.java
>
> I'm not a professional Java/Scala developer, please help.
>
> --
> Regards
> Dmitriy Selivanov

--
Regards
Dmitriy Selivanov