
I was writing some docs on Spark P&T and came across this.

It concerns the terminology, or rather the interpretation of it, in the Spark documentation.

This is my understanding of cores and threads.

Cores are physical cores. Threads are virtual (logical) cores. A core that
runs 2 threads is using hyper-threading technology, so 2 threads per core
lets the core work on two loads at the same time. In other words, every
thread takes care of one load.

Each core has its own resources, such as its cache. So if you have a dual
core with hyper-threading, each core works on 2 loads at the same time
because of the 2 threads per core, but these 2 threads share the resources
of that core.

Some vendors, as I am sure most of you are aware, charge licensing per core.

For example, on the same host where I have Spark, I have a SAP product that
checks the licensing and shuts the application down if the license does not
agree with the number of cores specified.

This is what it says:

License hostid:        00e04c69159a 0050b60fd1e7
Detected 12 logical processor(s), 6 core(s), in 1 chip(s)

So here I have 12 logical processors, 6 cores and 1 chip. I call
logical processors threads, so I have 12 threads?
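
To make the arithmetic explicit, here is a minimal Python sketch that parses the
line quoted above and derives the threads per core. The helper name
parse_topology is purely illustrative (it is not part of Spark or of the SAP
product), and the line format is assumed to be exactly as shown in my output.

```python
import re

# The line reported by the licence check quoted above.
line = "Detected 12 logical processor(s), 6 core(s), in 1 chip(s)"


def parse_topology(text):
    """Hypothetical helper: extract logical processors, cores and chips."""
    pattern = r"(\d+) logical processor\(s\), (\d+) core\(s\), in (\d+) chip\(s\)"
    match = re.search(pattern, text)
    if match is None:
        raise ValueError("unrecognised topology line")
    logical, cores, chips = (int(group) for group in match.groups())
    return {
        "logical": logical,
        "cores": cores,
        "chips": chips,
        # Hardware threads per physical core (2 means hyper-threading is on).
        "threads_per_core": logical // cores,
    }


topology = parse_topology(line)
print(topology)
```

With 12 logical processors on 6 cores this gives 2 threads per core, which is
what I would expect with hyper-threading enabled.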

Now if I go and start the worker processes with
${SPARK_HOME}/sbin/start-slaves.sh, I see this in the GUI page:

[image: Inline images 1]

It says 12 cores, but I gather these are threads?

The Spark documentation
<http://spark.apache.org/docs/latest/submitting-applications.html> states,
and I quote:

[image: Inline images 2]

OK, the line for local[K] adds  ..  *set this to the number of cores on your
machine*

But I know that it means threads, because if I set that to 6, I would get
only 6 threads as opposed to 12 threads.

The next line, local[*], seems to state it correctly, as it refers to
"logical cores", which in my understanding are threads.
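
To illustrate how I read those master settings, here is a small Python sketch
that maps a local master string to a number of worker threads. The function
local_threads is my own hypothetical helper, not a Spark API. It assumes that
local means one thread, local[K] means K threads and local[*] means one thread
per logical processor, and it uses os.cpu_count() as a stand-in for however
Spark itself obtains that count on the JVM.

```python
import os
import re


def local_threads(master, logical_cpus=None):
    """Hypothetical helper: worker threads implied by a local master string."""
    if logical_cpus is None:
        # os.cpu_count() reports logical processors (hardware threads),
        # not physical cores; fall back to 1 if it cannot be determined.
        logical_cpus = os.cpu_count() or 1
    if master == "local":
        return 1
    match = re.fullmatch(r"local\[(\*|\d+)\]", master)
    if match is None:
        raise ValueError("not a local master string: " + master)
    if match.group(1) == "*":
        return logical_cpus
    return int(match.group(1))


# Assume the host above: 6 physical cores, 12 logical processors.
for master in ("local", "local[6]", "local[12]", "local[*]"):
    print(master, "->", local_threads(master, logical_cpus=12))
```

On my host, setting local[6] to match the 6 physical cores would therefore give
only 6 threads, whereas local[*] would give 12.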

I trust that I am not nitpicking here!


Dr Mich Talebzadeh
