Sorry, please ignore this if you like. It looks like the network throughput is very low, but every worker/executor machine is indeed working.
My current incoming network throughput on each worker machine is about 2.5 KB/s (kilobytes per second), when it needs to be somewhere around 5-6 MB/s, which means the table scan counting a billion rows in Cassandra is somehow not being done in parallel.

On Wed, Nov 23, 2016 at 12:45 PM, kant kodali <kanth...@gmail.com> wrote:

> Hi All,
>
> Spark Shell doesn't seem to use the Spark workers, but spark-submit does. I
> have the worker IPs listed in the conf/slaves file.
>
> I am trying to count the number of rows in Cassandra using spark-shell, so I
> do the following on the Spark master:
>
> val df = spark.sql("SELECT test from hello") // This has about a billion rows
>
> scala> df.count
>
> [Stage 0:=> (686 + 2) / 24686] // What are these numbers precisely?
>
> This is taking forever, so I checked the I/O, CPU, and network usage with
> dstat, iostat, and so on. It looks like nothing is going on in the worker
> machines, but on the master I can see activity.
>
> I am using Spark 2.0.2.
>
> Any ideas on what is going on, and how to fix it?
>
> Thanks,
>
> kant
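A per-worker read rate this low often means the scan produced too few input partitions for the cluster. A minimal spark-shell sketch of things worth checking, assuming the DataStax spark-cassandra-connector is on the classpath (the keyspace name "ks" here is a placeholder; "hello" is the table from the original message):

```scala
// Run inside spark-shell with the DataStax spark-cassandra-connector loaded.
import com.datastax.spark.connector._

// 1. How many partitions (= tasks) did the scan produce? If this number is
//    tiny, only a few executors can work on the scan at once.
val df = spark.sql("SELECT test from hello")
println(df.rdd.getNumPartitions)

// 2. The connector can push the count down to Cassandra per token range
//    instead of shipping every row into Spark.
val rows = sc.cassandraTable("ks", "hello").cassandraCount()
println(rows)

// 3. Smaller input splits -> more tasks -> more parallelism. The split size
//    (MB per Spark partition) can be lowered when launching the shell, e.g.:
//    spark-shell --conf spark.cassandra.input.split.size_in_mb=32
```

This is only a sketch; the exact config key and API names should be checked against the connector version in use (2.0.x here), since they have changed between releases.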