Re: Running spark function on parquet without sql

2015-03-15 Thread Cheng Lian
That's an unfortunate documentation bug in the programming guide... We failed to update it after making the change. Cheng

On 2/28/15 8:13 AM, Deborah Siegel wrote: Hi Michael, Would you help me understand the apparent difference here? The Spark 1.2.1 programming guide indicates: "Note that ...
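In concrete terms, the fix Cheng describes means both caching routes now use the in-memory columnar format. A minimal spark-shell sketch (not code from the thread; the path and table name are hypothetical):

    import org.apache.spark.sql.SQLContext

    // Assumes an existing SparkContext `sc`, as in the spark-shell.
    val sqlContext = new SQLContext(sc)
    val events = sqlContext.parquetFile("hdfs:///data/events.parquet")
    events.registerTempTable("events")

    // Route 1: cache through the context.
    sqlContext.cacheTable("events")

    // Route 2: cache the SchemaRDD directly. Per Cheng's note, after the
    // undocumented change this also uses the in-memory columnar format,
    // so the guide's warning no longer applies.
    events.cache()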

Re: Running spark function on parquet without sql

2015-02-27 Thread Deborah Siegel
Hi Michael, Would you help me understand the apparent difference here? The Spark 1.2.1 programming guide indicates: "Note that if you call schemaRDD.cache() rather than sqlContext.cacheTable(...), tables will *not* be cached using the in-memory columnar format, and therefore sqlContext.cacheTable(...) ...

Re: Running spark function on parquet without sql

2015-02-27 Thread Michael Armbrust
> From Zhan Zhang's reply, yes I still get the parquet's advantage.

You will need to at least use SQL or the DataFrame API (coming in Spark 1.3) to specify the columns that you want in order to get the parquet benefits. The rest of your operations can be standard Spark.

> My next question is, ...
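As a rough illustration of Michael's advice (not code from the thread; it reuses the hypothetical "events" table from the sketch above): push the column selection through SQL so the Parquet scan only reads what it needs, then carry on with ordinary Spark transformations on the result:

    // Column pruning happens inside the SQL query plan.
    val names = sqlContext.sql("SELECT name FROM events WHERE year = 2015")

    // A SchemaRDD is a regular RDD[Row], so everything from here on is
    // standard Spark; the scan only materialized the selected columns.
    val upper = names.map(_.getString(0).toUpperCase)
    upper.take(10).foreach(println)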

Re: Running spark function on parquet without sql

2015-02-27 Thread tridib
...in-memory columnar store when the table is cached using cacheTable()?

Re: Running spark function on parquet without sql

2015-02-26 Thread Zhan Zhang
> ...on be as fast as it would have been if I had used SQL?
>
> Please advise.
>
> Thanks & Regards
> Tridib
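The trade-off behind the question follows from Michael's point above, sketched here with a made-up path and column names (not code from the thread): a plain map over the SchemaRDD gives Spark no column information, so Parquet must read every column, while projecting first lets the scan prune.

    val raw = sqlContext.parquetFile("hdfs:///data/events.parquet")

    // No projection: the Parquet scan has to materialize all columns.
    val allCols = raw.map(_.getString(0))

    // Projecting through SQL first prunes the scan to the one column.
    raw.registerTempTable("events")
    val oneCol = sqlContext.sql("SELECT name FROM events").map(_.getString(0))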

Running spark function on parquet without sql

2015-02-26 Thread tridib
& Regards Tridib -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Running-spark-function-on-parquet-without-sql-tp21833.html Sent from the Apache Spark User List mailing list archive at Nabble