That depends entirely on your disk I/O and the number of CPU cores you have
in the cluster. For example, if each machine gives you about 100 MB/s of disk
I/O and you have a handful of CPUs (say 40 cores across 10 machines), then the
cluster could reach roughly 1 GB/s in aggregate, I believe.
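
Here is a quick back-of-the-envelope sketch of that estimate in Python. It
assumes the 100 MB/s figure is per machine, that reads are spread evenly over
all 10 machines, and that the disks (not the network or the CPUs) are the
bottleneck. The function and variable names are made up for illustration, and
this is not a measurement of HDFS or CFS.

```python
def aggregate_read_mb_s(per_machine_disk_mb_s, machines):
    """Rough upper bound on cluster scan throughput in MB/s.

    Assumption: every machine reads from its local disk in parallel
    and nothing else (network, CPU, file system overhead) limits it.
    """
    return per_machine_disk_mb_s * machines


def per_core_share_mb_s(per_machine_disk_mb_s, cores_per_machine):
    """MB/s each core must process to keep one machine's disk busy."""
    return per_machine_disk_mb_s / cores_per_machine


machines = 10
total_cores = 40
disk_mb_s = 100  # assumed per-machine disk throughput

total = aggregate_read_mb_s(disk_mb_s, machines)
share = per_core_share_mb_s(disk_mb_s, total_cores // machines)

print(f"aggregate: ~{total / 1000:.1f} GB/s")
print(f"per core:  {share:.0f} MB/s needed to keep the disks saturated")
```

If decoding Parquet on one core turns out slower than that per-core share, the
job is CPU bound and the real number will come in below the disk estimate.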
Thanks
Best Regards
On Tue, Apr 7, 2015 at 2:48 AM, Paolo Platter wrote:
> Hi all,
>
> Is there anyone using SparkSQL + Parquet who has benchmarked storing
> Parquet files on HDFS versus on CFS (Cassandra File System)?
> Which storage can improve the performance of SparkSQL + Parquet?
>
> Thanks
>
> Paolo
>
>