Re: cache spark sql parquet file in memory?

2014-06-07 Thread Michael Armbrust
Not a stupid question!  I would like to be able to do this.  For now, you
might try writing the data to Tachyon (http://tachyon-project.org/) instead
of HDFS.  This is untested, though, so please report any issues you run into.

Michael
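
A minimal sketch of the suggestion above, not a tested recipe. It assumes
the Spark 1.0 SchemaRDD API, a Tachyon client on Spark's classpath, and a
Tachyon master at tachyon://localhost:19998 (a hypothetical address). The
Record class, the table name, and the output path are made up for
illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical record type, used only to give the data a schema.
case class Record(key: Int, value: String)

object ParquetOnTachyon {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("ParquetOnTachyon"))
    val sqlContext = new SQLContext(sc)
    // Implicit conversion from an RDD of case classes to a SchemaRDD.
    import sqlContext.createSchemaRDD

    val records = sc.parallelize(1 to 100).map(i => Record(i, s"val_$i"))

    // Assumed Tachyon URI; replace host, port, and path with your own.
    val path = "tachyon://localhost:19998/tmp/records.parquet"

    // Same call as for HDFS; only the URI scheme differs.
    records.saveAsParquetFile(path)

    // Read the Parquet data back and query it with Spark SQL.
    val loaded = sqlContext.parquetFile(path)
    loaded.registerAsTable("records")
    sqlContext.sql("SELECT key, value FROM records WHERE key < 10")
      .collect()
      .foreach(println)

    sc.stop()
  }
}
```

Whether this actually speeds up queries depends on the Tachyon deployment;
as noted above, the approach is untested.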


On Fri, Jun 6, 2014 at 8:13 PM, Xu (Simon) Chen xche...@gmail.com wrote:

 This might be a stupid question... but it seems that saveAsParquetFile()
 writes everything back to HDFS. I am wondering if it is possible to cache
 parquet-format intermediate results in memory, and thereby make Spark
 SQL queries faster.

 Thanks.
 -Simon



Re: cache spark sql parquet file in memory?

2014-06-07 Thread Marek Wiewiorka
I was also thinking of using Tachyon to store Parquet files - maybe
tomorrow I will give it a try as well.



Re: cache spark sql parquet file in memory?

2014-06-07 Thread Xu (Simon) Chen
Is there a way to start Tachyon on top of a YARN cluster?


cache spark sql parquet file in memory?

2014-06-06 Thread Xu (Simon) Chen
This might be a stupid question... but it seems that saveAsParquetFile()
writes everything back to HDFS. I am wondering if it is possible to cache
parquet-format intermediate results in memory, and thereby make Spark
SQL queries faster.

Thanks.
-Simon