Re: SparkSQL + Parquet performance

2015-04-13 Thread Akhil Das
That depends entirely on your disk IO and the number of CPUs you have
in the cluster. For example, if each machine has a disk IO of 100 MB/s and
you have a handful of CPUs (say 40 cores on 10 machines), then the cluster
could reach ~1 GB/s in aggregate, I believe.
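
As a rough back-of-the-envelope sketch of that estimate (assuming the 100 MB/s figure is per machine, that scans are disk-bound, and that the cores can keep up with the disks; the function names and the 100 GB dataset size are hypothetical, chosen only for illustration):

```python
def aggregate_throughput_mb_s(per_machine_mb_s, machines):
    """Upper bound on cluster read throughput if every machine's disk
    is read in parallel and nothing else (CPU, network) is the bottleneck."""
    return per_machine_mb_s * machines


def scan_time_s(dataset_gb, per_machine_mb_s, machines):
    """Rough time to scan a dataset once at the aggregate throughput.
    Uses 1 GB = 1000 MB to match the ~1 GB/s figure above."""
    total_mb_s = aggregate_throughput_mb_s(per_machine_mb_s, machines)
    return dataset_gb * 1000 / total_mb_s


if __name__ == "__main__":
    # 10 machines at 100 MB/s each, as in the example above.
    print(aggregate_throughput_mb_s(100, 10), "MB/s aggregate")
    # Hypothetical 100 GB dataset, full scan.
    print(scan_time_s(100, 100, 10), "seconds for a full scan")
```

This is only an upper bound. With Parquet, column pruning and compression usually mean far fewer bytes are read than the raw dataset size, so real query times depend heavily on the query and the schema.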

Thanks
Best Regards

On Tue, Apr 7, 2015 at 2:48 AM, Paolo Platter wrote:

>  Hi all,
>
>  Has anyone using SparkSQL + Parquet benchmarked storing Parquet files
> on HDFS versus CFS (Cassandra File System)?
>  Which storage gives better performance for SparkSQL + Parquet?
>
>  Thanks
>
>  Paolo
>
>


SparkSQL + Parquet performance

2015-04-06 Thread Paolo Platter
Hi all,

Has anyone using SparkSQL + Parquet benchmarked storing Parquet files
on HDFS versus CFS (Cassandra File System)?
Which storage gives better performance for SparkSQL + Parquet?

Thanks

Paolo