Hi All,
Hoping you can help: I have created an RDD from a NOSQL database and I want to convert the RDD to a data frame. I have tried many options but all result in errors. val df = sc.couchbaseQuery(test).map(_.value).collect().foreach(println) {"accountStatus":"AccountOpen","custId":"140034"} {"accountStatus":"AccountOpen","custId":"140385"} {"accountStatus":"AccountClosed","subId":"10795","custId":"139698","subStatus":"Active"} {"accountStatus":"AccountClosed","subId":"11364","custId":"140925","subStatus":"Paused"} {"accountStatus":"AccountOpen","subId":"10413","custId":"138842","subStatus":"Active"} {"accountStatus":"AccountOpen","subId":"10414","custId":"138842","subStatus":"Active"} {"accountStatus":"AccountClosed","subId":"11314","custId":"140720","subStatus":"Paused"} {"accountStatus":"AccountOpen","custId":"139166"} {"accountStatus":"AccountClosed","subId":"10735","custId":"139558","subStatus":"Paused"} {"accountStatus":"AccountOpen","custId":"139575"} df: Unit = () I have tried adding .toDF() to the end of my code and also creating a schema and using createDataFrame but receive errors. Whats the best approach to converting the RDD to Dataframe? import org.apache.spark.sql.types._ // The schema is encoded in a string val schemaString = "accountStatus subId custId subStatus" // Generate the schema based on the string of schema val fields = schemaString.split(" ") .map(fieldName => StructField(fieldName, StringType, nullable = true)) val schema = StructType(fields) // val peopleDF = spark.createDataFrame(df,schema)