It looks like you have an RDD of JSON. Try this:
http://spark.apache.org/docs/latest/sql-programming-guide.html#json-datasets 
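
A rough sketch of that approach, assuming the spark-shell's built-in spark and sc
and the same Couchbase connector setup as in your snippet (whether _.value.toString
yields the raw JSON document depends on the connector's row type, so treat that
line as illustrative):

// keep the documents distributed: collect() pulls everything to the driver and
// foreach(println) returns Unit, which is why df ended up as Unit
val jsonRdd = sc.couchbaseQuery(test).map(_.value.toString)

// spark.read.json can infer the schema straight from an RDD of JSON strings
val df = spark.read.json(jsonRdd)
df.printSchema()
df.show()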

Mohit Jaggi
Founder,
Data Orchard LLC
www.dataorchardllc.com




> On Nov 28, 2016, at 9:49 AM, Jun Kim <i2r....@gmail.com> wrote:
> 
> Hi, Mark!
> 
> Which kind of error message do you get?
> 
> The simplest way to convert an RDD to a DataFrame is to import the implicits and call toDF:
> 
> import spark.implicits._
> val df = rdd.toDF
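> 
> Note that toDF needs the RDD elements to be case classes, tuples, or primitives
> for the implicits above to apply. A minimal sketch building on that import, with
> a purely illustrative case class modelled on your sample records:
> 
> // hypothetical case class mirroring two of the JSON fields
> case class Account(accountStatus: String, custId: String)
> 
> val accountRdd = sc.parallelize(Seq(
>   Account("AccountOpen", "140034"),
>   Account("AccountClosed", "139698")))
> 
> val accountsDF = accountRdd.toDF()
> accountsDF.show()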
> 
> :-)
> 
> On Tue, Nov 29, 2016 at 1:26 AM, Mark Mikolajczak <m...@flayranalytics.co.uk> wrote:
> 
> 
> Hi All,
> 
> Hoping you can help:
> 
> 
> I have created an RDD from a NoSQL database and I want to convert the RDD to 
> a DataFrame. I have tried many options, but they all result in errors.
> 
>     val df = sc.couchbaseQuery(test).map(_.value).collect().foreach(println)
> 
> 
> {"accountStatus":"AccountOpen","custId":"140034"}
> {"accountStatus":"AccountOpen","custId":"140385"}
> {"accountStatus":"AccountClosed","subId":"10795","custId":"139698","subStatus":"Active"}
> {"accountStatus":"AccountClosed","subId":"11364","custId":"140925","subStatus":"Paused"}
> {"accountStatus":"AccountOpen","subId":"10413","custId":"138842","subStatus":"Active"}
> {"accountStatus":"AccountOpen","subId":"10414","custId":"138842","subStatus":"Active"}
> {"accountStatus":"AccountClosed","subId":"11314","custId":"140720","subStatus":"Paused"}
> {"accountStatus":"AccountOpen","custId":"139166"}
> {"accountStatus":"AccountClosed","subId":"10735","custId":"139558","subStatus":"Paused"}
> {"accountStatus":"AccountOpen","custId":"139575"}
> df: Unit = ()
> I have tried adding .toDF() to the end of my code and also creating a schema 
> and using createDataFrame, but I receive errors. What's the best approach to 
> converting the RDD to a DataFrame?
> 
> import org.apache.spark.sql.types._
> 
> // The schema is encoded in a string
> val schemaString = "accountStatus subId custId subStatus"
> 
> // Generate the schema based on the string of schema
> val fields = schemaString.split(" ")
>   .map(fieldName => StructField(fieldName, StringType, nullable = true))
> val schema = StructType(fields)
> //
> 
> val peopleDF = spark.createDataFrame(df, schema)
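> 
> For reference, createDataFrame with an explicit schema expects an RDD[Row]
> rather than the Unit returned by foreach, so the wiring would look roughly
> like this (a sketch only; the getString accessor is an assumption about the
> connector's value type):
> 
> import org.apache.spark.sql.Row
> 
> // build one Row per document, with fields in the same order as the schema;
> // getString is assumed to exist on the query value (e.g. a Couchbase JsonObject)
> val rowRDD = sc.couchbaseQuery(test).map(_.value).map(v =>
>   Row(v.getString("accountStatus"), v.getString("subId"),
>       v.getString("custId"), v.getString("subStatus")))
> 
> val peopleDF = spark.createDataFrame(rowRDD, schema)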
> 
> -- 
> Taejun Kim
> 
> Data Mining Lab.
> School of Electrical and Computer Engineering
> University of Seoul
