
Re: Spark Distribution of Small Dataset

2016-01-28 Thread Kevin Mellott

On Thu, Jan 28, 2016 at 4:41 AM, Philip Lee wrote:
> Hi,
>
> Simple question about Spark distribution of a small dataset.
>
> Let's say I have 8 machines with 48 cores and 48GB of RAM as a cluster.
> The dataset (format is ORC by Hive) is small, about 1GB, but I copied it to

Spark Distribution of Small Dataset

2016-01-28 Thread Philip Lee

Hi,

Simple question about Spark distribution of a small dataset.

Let's say I have 8 machines with 48 cores and 48GB of RAM as a cluster. The dataset (format is ORC by Hive) is small, about 1GB, but I copied it to HDFS.

1) If spark-sql runs on the dataset distributed on HDFS in each machine, what ha