I am pleased with the release of the DataFrame API. However, when I started
playing with it, neither of the two main examples in the documentation
worked: http://spark.apache.org/docs/1.3.0/sql-programming-guide.html
Specifically:
- Inferring the Schema Using Reflection
- Programmatically Specifying the Schema
Scala 2.11.6
Spark 1.3.0 prebuilt for Hadoop 2.4 and later
*Inferring the Schema Using Reflection*
scala> people.registerTempTable("people")
<console>:31: error: value registerTempTable is not a member of org.apache.spark.rdd.RDD[Person]
people.registerTempTable("people")
^
*Programmatically Specifying the Schema*
scala> val peopleDataFrame = sqlContext.createDataFrame(people, schema)
<console>:41: error: overloaded method value createDataFrame with alternatives:
(rdd: org.apache.spark.api.java.JavaRDD[_],beanClass: Class[_])org.apache.spark.sql.DataFrame <and>
(rdd: org.apache.spark.rdd.RDD[_],beanClass: Class[_])org.apache.spark.sql.DataFrame <and>
(rowRDD: org.apache.spark.api.java.JavaRDD[org.apache.spark.sql.Row],columns: java.util.List[String])org.apache.spark.sql.DataFrame <and>
(rowRDD: org.apache.spark.api.java.JavaRDD[org.apache.spark.sql.Row],schema: org.apache.spark.sql.types.StructType)org.apache.spark.sql.DataFrame <and>
(rowRDD: org.apache.spark.rdd.RDD[org.apache.spark.sql.Row],schema: org.apache.spark.sql.types.StructType)org.apache.spark.sql.DataFrame
cannot be applied to (org.apache.spark.rdd.RDD[String], org.apache.spark.sql.types.StructType)
val df = sqlContext.createDataFrame(people, schema)
Any help would be appreciated.
David