Re: Spark vs MongoDB: saving DataFrame to db raises missing database name exception

2017-01-18 Thread Marco Mistroni
Thanks Palash, your suggestion put me on the right track.
Reading works fine; however, it seems that in writing, since the SparkSession is
not involved, the connector does not know where to write.
I had to replace my writing code with this:

MongoSpark.save(df.write.option("spark.mongodb.output.uri",
"mongodb://localhost:27017/test.tree"))

kr
 marco
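For anyone hitting the same error: the database (and optionally the collection) travel in the path segment of the connection string, which is why "mongodb://localhost:27017/test.tree" works while the bare "mongodb://127.0.0.1:27017/" does not. A quick sketch of the difference (plain Python with the standard urllib.parse module, no Spark required; this is only an illustration of the URI anatomy, not the connector's actual parser):

```python
from urllib.parse import urlparse

def database_of(uri: str) -> str:
    """Return the database[.collection] portion of a MongoDB connection URI."""
    # The namespace lives in the URI path, e.g.
    # mongodb://host:27017/test.tree -> path "/test.tree"
    return urlparse(uri).path.lstrip("/")

# Bare URI: empty namespace, which is what the connector complains about
print(database_of("mongodb://localhost:27017/"))
# URI with namespace: database "test", collection "tree"
print(database_of("mongodb://localhost:27017/test.tree"))
```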



Re: Spark vs MongoDB: saving DataFrame to db raises missing database name exception

2017-01-16 Thread Marco Mistroni
Ah, many thanks. Will try it out.



Re: Spark vs MongoDB: saving DataFrame to db raises missing database name exception

2017-01-16 Thread Palash Gupta
Hi,

Example:

dframe = sqlContext.read.format("com.mongodb.spark.sql.DefaultSource") \
    .option("spark.mongodb.input.uri",
            "mongodb://user:pass@172.26.7.192:27017/db_name.collection_name") \
    .load()
dframe.printSchema()

One more thing: if you create a new db in mongo, please also create a collection
with at least one record in it. Otherwise mongo may not keep that db once the
session dies.

//Palash



Re: Spark vs MongoDB: saving DataFrame to db raises missing database name exception

2017-01-16 Thread Palash Gupta
Hi Marco,

What user and password are you using for the mongodb connection? Did you enable
authorization?

It is better to include the user and password in the mongo URL.

I remember I tested this with Python successfully.

Best Regards,
Palash



Spark vs MongoDB: saving DataFrame to db raises missing database name exception

2017-01-16 Thread Marco Mistroni
Hi all,
I have the following snippet, which loads a dataframe from a csv file and tries
to save it to mongodb.
For some reason, the MongoSpark.save method raises the following exception:

Exception in thread "main" java.lang.IllegalArgumentException: Missing database name. Set via the 'spark.mongodb.output.uri' or 'spark.mongodb.output.database' property
    at com.mongodb.spark.config.MongoCompanionConfig$class.databaseName(MongoCompanionConfig.scala:260)
    at com.mongodb.spark.config.WriteConfig$.databaseName(WriteConfig.scala:36)

This is bizarre, as I'm pretty sure I am setting all the necessary properties
in the SparkConf.

Could you kindly assist?

I am running Spark 2.0.1 locally with a local mongodb instance running at
127.0.0.1:27017
I am using version 2.0.0 of mongo-spark-connector
I am running on Scala 2.11

kr

val spark = SparkSession
  .builder()
  .master("local")
  .appName("Spark Mongo Example")
  .getOrCreate()
spark.conf.set("spark.mongodb.input.uri", "mongodb://127.0.0.1:27017/")
spark.conf.set("spark.mongodb.output.uri", "mongodb://127.0.0.1:27017/")
spark.conf.set("spark.mongodb.output.database", "test")

println(s"SparkProperties:${spark.conf.getAll}")

val df = getDataFrame(spark) // Loading any dataframe from a file

df.printSchema()

println(s"Head:${df.head()}")
println(s"Count:${df.count()}")
println("##  SAVING TO MONGODB #")

import com.mongodb.spark.config._

val writeConfig = WriteConfig(Map("collection" -> "spark",
  "writeConcern.w" -> "majority"), Some(WriteConfig(spark.sparkContext)))
MongoSpark.save(df, writeConfig)