Hello,
When we call cache() or persist(MEMORY_ONLY), how does the request flow to
the nodes?
I am assuming this will happen:
1. The driver knows which nodes hold the partitions of the given
RDD (where is this information stored?)
2. It sends a cache request to the executor on each of those nodes
3. The executor will
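
For reference, here is a toy model of the flow as I understand it. It is a sketch, not Spark code: Driver, Executor, ToyRDD and count are hypothetical names made up for illustration. As far as I know, persist() only marks the RDD with a storage level and sends nothing to the executors. The first action runs one task per partition, the task stores the computed partition in its executor's block manager, and the executor reports the block to the driver (the BlockManagerMaster keeps that registry in Spark). The round-robin task placement below is an assumption of the toy, not how Spark schedules tasks.

```python
# Toy model only: these classes are hypothetical and are NOT Spark's API.

class Driver:
    """Holds the block-location registry (the BlockManagerMaster's role in Spark)."""
    def __init__(self):
        self.block_locations = {}  # block id -> executor name

    def register_block(self, block_id, executor_name):
        self.block_locations[block_id] = executor_name


class Executor:
    def __init__(self, name, driver):
        self.name = name
        self.driver = driver
        self.memory_store = {}  # block id -> computed partition data

    def run_task(self, rdd, partition_index):
        block_id = (rdd.rdd_id, partition_index)
        if block_id in self.memory_store:
            # Cache hit: reuse the block stored by an earlier task.
            return self.memory_store[block_id]
        data = rdd.compute(partition_index)
        if rdd.persisted:
            # Caching happens here, as a side effect of running the task.
            self.memory_store[block_id] = data
            self.driver.register_block(block_id, self.name)
        return data


class ToyRDD:
    def __init__(self, rdd_id, partitions):
        self.rdd_id = rdd_id
        self.partitions = partitions
        self.persisted = False
        self.compute_calls = 0

    def persist(self):
        # Lazy: only sets a flag, nothing is sent to any executor.
        self.persisted = True
        return self

    def compute(self, partition_index):
        self.compute_calls += 1
        return [x * 2 for x in self.partitions[partition_index]]


def count(rdd, executors):
    """An action: one task per partition, partition i on executor i % n (toy placement)."""
    total = 0
    for i in range(len(rdd.partitions)):
        executor = executors[i % len(executors)]
        total += len(executor.run_task(rdd, i))
    return total


driver = Driver()
executors = [Executor("exec-0", driver), Executor("exec-1", driver)]
rdd = ToyRDD(rdd_id=0, partitions=[[1, 2], [3, 4], [5]]).persist()

blocks_after_persist = dict(driver.block_locations)  # nothing cached yet
first_count = count(rdd, executors)   # computes and caches every partition
second_count = count(rdd, executors)  # served from the executors' stores
```

In this model the registry is empty right after persist(), and the second count does not recompute any partition.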
Are you running standalone (local mode) or in cluster mode? Whether
separate executor and driver processes exist depends on the setup type.
A snapshot of your environment UI would help in answering.
On Thu, Jan 7, 2016 at 11:51 AM, wrote:
> Hi,
>
> After I called rdd.persist(MEMORY_ONLY_SER), I
...@gmail.com'
Cc: 'user@spark.apache.org'
Subject: Re: Question in rdd caching in memory using persist
I have a standalone cluster. The Spark version is 1.3.1.
From: Prem Sure [mailto:premsure...@gmail.com]
Sent: Thursday, January 07, 2016 12:32 PM
To: Barua, Seemanto (US)
Cc: spark users <u