Hi,
I would like to convert my RDD to a pyspark.sql.dataframe.DataFrame. Is there
a built-in conversion that does this, or what would be the easiest way to do
it? Here is what I have tried so far:
def ConvertVal(iter):
    # some code
    return sqlContext.createDataFrame(Row("val1", "val2", "val3", "val4"))
rdd = sc.textFile("").mapPartitions(ConvertVal)
print(type(rdd)) #<class 'pyspark.rdd.PipelinedRDD'>
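
For reference, here is a minimal sketch of one approach that might work. It rests on the fact that sc and sqlContext exist only on the driver, so createDataFrame cannot be called from inside the function passed to mapPartitions. Instead, the partition function yields plain tuples and the DataFrame is built once from the resulting RDD. The names convert_partition and to_dataframe, the path argument, and the comma-separated line layout are all assumptions made for illustration; only the column names come from the code above.

```python
def convert_partition(lines):
    """Turn each line of one partition into a 4-tuple of strings.

    The comma-separated layout is assumed for illustration only; the real
    parsing (the "# some code" part) would go here.
    """
    for line in lines:
        fields = line.split(",")
        if len(fields) == 4:  # skip lines that do not have exactly four fields
            yield tuple(fields)


def to_dataframe(sc, sqlContext, path):
    """Build the DataFrame on the driver from an RDD of tuples.

    sc and sqlContext are the existing SparkContext and SQLContext;
    path is a hypothetical input location.
    """
    rdd = sc.textFile(path).mapPartitions(convert_partition)
    # Passing a list of column names lets Spark infer the column types.
    return sqlContext.createDataFrame(rdd, ["val1", "val2", "val3", "val4"])
```

With this shape, type(to_dataframe(sc, sqlContext, path)) should be a DataFrame rather than a PipelinedRDD, because the conversion happens after mapPartitions instead of inside it.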