The memory usage of blocks of data received through Spark Streaming is not
reflected in the Spark UI; the UI only shows the memory used by cached
RDDs.
I didn't find an existing JIRA for this, so I opened a new one.

https://issues.apache.org/jira/browse/SPARK-4072
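To make the distinction concrete, here is a rough sketch (the host/port and
the trivial per-batch job are just placeholders, not from the discussion
above): the receiver stores its blocks at MEMORY_AND_DISK_SER_2, but only the
explicitly cached RDD is counted in the UI's "Memory Used" column today.

import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}

object MemoryUiSketch {
  def main(args: Array[String]): Unit = {
    // Master normally comes from spark-submit; fall back to local for testing.
    val conf = new SparkConf()
      .setAppName("memory-ui-sketch")
      .setIfMissing("spark.master", "local[2]")
    val ssc = new StreamingContext(conf, Seconds(1))

    // Blocks stored by the receiver (MEMORY_AND_DISK_SER_2 here, which is
    // also the default) do NOT show up in the Storage tab / "Memory Used".
    val lines = ssc.socketTextStream("localhost", 9999,
      StorageLevel.MEMORY_AND_DISK_SER_2)

    lines.foreachRDD { rdd =>
      // Only RDDs explicitly cached/persisted like this one are counted
      // in the "Memory Used" column of the UI.
      val cached = rdd.map(_.length).cache()
      cached.count()
    }

    ssc.start()
    ssc.awaitTermination()
  }
}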


TD

On Thu, Oct 23, 2014 at 12:47 AM, Haopu Wang <hw...@qilinsoft.com> wrote:

>  Patrick, thanks for the response. May I ask a few more questions?
>
>
>
> I'm running a Spark Streaming application which receives data from socket
> and does some transformations.
>
>
>
> The event injection rate is so high that the processing duration is longer
> than the batch interval.
>
>
>
> So I see "Could not compute split, block input-0-1414049609200 not found"
> issue as discussed by others in this post: "
> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-Could-not-compute-split-block-not-found-td11186.html#a11237
> "
>
>
>
> If my understanding is correct, Spark runs out of storage in this case
> because of the event pile-up, so it needs to drop some blocks (splits) in
> order to free memory.
>
>
>
> However, even in this case, I still see a very small number (like 3MB) in
> the "Memory Used" column, while the total memory seems to be quite large
> (like 6GB). So I think the number shown in this column may be inaccurate.
>
>
>
> How does Spark calculate the total memory from the allocated JVM heap
> size? I guess it's related to the "spark.storage.memoryFraction"
> configuration, but I want to know the details.
>
> And why does the driver also use memory to store RDD blocks?
>
>
>
> Thanks again for the answer!
>
>
>  ------------------------------
>
> *From:* Patrick Wendell [mailto:pwend...@gmail.com]
> *Sent:* October 23, 2014 14:00
> *To:* Haopu Wang
> *Cc:* user
> *Subject:* Re: About "Memory usage" in the Spark UI
>
>
>
> It shows the amount of memory used to store RDD blocks, which are created
> when you run .cache()/.persist() on an RDD.
>
>
>
> On Wed, Oct 22, 2014 at 10:07 PM, Haopu Wang <hw...@qilinsoft.com> wrote:
>
> Hi, please take a look at the attached screenshot. I wonder what the
> "Memory Used" column means.
>
>
>
> I give 2GB memory to the driver process and 12GB memory to the executor
> process.
>
>
>
> Thank you!
>
>
>
>
>
