Assuming there is a text file with an unknown number of columns, how would
one create a DataFrame? I have followed the example in the Spark docs where
one first creates an RDD of Rows, but it seems that you have to know the
exact number of columns in the file and can't just do this:

val rowRDD = sc.textFile("path/file")
  .map(_.split(" |\\,"))
  .map(org.apache.spark.sql.Row(_))

The above would work if I wrote ...Row(_(0), _(1), ...), but the number of
columns is unknown.
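A sketch of what I am after (untested; assumes comma-separated fields and uses Row.fromSeq, which accepts a whole Seq rather than fixed arguments):

```scala
import org.apache.spark.sql.Row

// Build one Row per line with however many fields the split produces;
// Row.fromSeq takes the entire sequence, so the column count need not
// be known at compile time.
val rowRDD = sc.textFile("path/file")
  .map(_.split(","))
  .map(fields => Row.fromSeq(fields))
```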

Also, given an RDD[Row], why is .toDF() not defined on this RDD type? Is
calling .createDataFrame(...) the only way to create a DataFrame out of an
RDD[Row]?
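As I understand it, the .toDF() implicit from sqlContext.implicits._ only applies to RDDs of Products (case classes or tuples), which Row is not, so createDataFrame with an explicit schema seems to be the route. A sketch of that, assuming all-String columns and hypothetical names f0, f1, ... derived from the first line:

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{StructField, StringType, StructType}

// Split each line, count the fields of the first line, and build a
// matching all-String schema dynamically (column names f0, f1, ...
// are made up here for illustration).
val fields  = sc.textFile("path/file").map(_.split(","))
val numCols = fields.first().length
val schema  = StructType(
  (0 until numCols).map(i => StructField(s"f$i", StringType, nullable = true)))

val df = sqlContext.createDataFrame(fields.map(Row.fromSeq(_)), schema)
```

This assumes every line has at least as many fields as the first one; ragged lines would need padding or filtering first.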

Thanks



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Create-DataFrame-from-textFile-with-unknown-columns-tp22386.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
