Re: Spark process locality

vinay Bajaj Thu, 20 Feb 2014 00:31:29 -0800

Hi Mayur

I am trying to analyse Apache logs that contain traffic details. Basically,
I am trying to compute statistics such as total views from each country and
the unique URLs. I have one cluster running with 4 workers and one master
(total space 240GB and 96 cores). I was trying some things to make it
faster and got stuck on the locality levels of the tasks.
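
For reference, here is a rough sketch of how I understand the fallback
between locality levels that spark.locality.wait controls. It is a
simplified model in plain Python, not Spark's actual scheduler code. The
function name allowed_locality and the list DEFAULT_WAITS_MS are made up
for illustration, and the 3000 ms per-level wait is an assumed default that
should be checked against the configuration docs for your Spark version.

```python
# Simplified model of locality fallback; NOT Spark's real implementation.
# Levels are ordered from most local to least local.
LEVELS = ["PROCESS_LOCAL", "NODE_LOCAL", "RACK_LOCAL", "ANY"]

# Assumption: 3000 ms wait at each of the first three levels.
# ANY is the last level, so it has no wait of its own.
DEFAULT_WAITS_MS = [3000, 3000, 3000]


def allowed_locality(ms_since_last_launch, waits_ms):
    """Return the least local level a task may be launched at right now.

    waits_ms[i] is how long the scheduler holds out for LEVELS[i]
    before it gives up and also accepts LEVELS[i + 1].
    """
    remaining = ms_since_last_launch
    index = 0
    # Use up the waiting time level by level until it runs out.
    while index < len(LEVELS) - 1 and remaining >= waits_ms[index]:
        remaining -= waits_ms[index]
        index += 1
    return LEVELS[index]


if __name__ == "__main__":
    for elapsed in (0, 2999, 3000, 6500, 9000):
        print(elapsed, allowed_locality(elapsed, DEFAULT_WAITS_MS))
```

In this model a lower wait makes tasks fall back to less local levels
sooner (a wait of 0 accepts any level immediately), while a higher wait
keeps slots idle longer in the hope of a more local one. The real scheduler
also resets the timer when a task launches and skips levels that have no
pending tasks, which this sketch leaves out.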


Regards
Vinay Bajaj


On Wed, Feb 19, 2014 at 11:34 PM, Mayur Rustagi <mayur.rust...@gmail.com> wrote:

> Process local implies the data is cached in the same JVM as the task; node
> local means it's cached on the same machine but not in the same JVM (in
> another executor on that node, perhaps). Tuning the wait depends on your
> system configuration (memory vs disk vs network). I frankly never had to
> modify it. Can you share the use case that requires you to do that?
>
> Mayur Rustagi
> Ph: +919632149971
> http://www.sigmoidanalytics.com
> https://twitter.com/mayur_rustagi
>
>
>
> On Wed, Feb 19, 2014 at 1:59 AM, vinay Bajaj <vbajaj2...@gmail.com> wrote:
>
>> Hi
>>
>> It would be very helpful if anyone could elaborate on
>> spark.locality.wait and the multiple locality levels (process-local,
>> node-local, rack-local and then any): what is the best configuration I
>> can achieve by modifying this wait, and what is the difference between
>> process-local and node-local?
>>
>> Thanks
>> Vinay Bajaj
>>
>>
>>
>
