The memory usage of blocks of data received through Spark Streaming is not reflected in the Spark UI, which only shows the memory usage due to cached RDDs. I didn't find a JIRA for this, so I opened a new one:
https://issues.apache.org/jira/browse/SPARK-4072

TD

On Thu, Oct 23, 2014 at 12:47 AM, Haopu Wang <hw...@qilinsoft.com> wrote:
> Patrick, thanks for the response. May I ask more questions?
>
> I'm running a Spark Streaming application which receives data from a socket
> and does some transformations.
>
> The event injection rate is so high that the processing duration is larger
> than the batch interval.
>
> So I see the "Could not compute split, block input-0-1414049609200 not found"
> issue discussed by others in this post:
> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-Could-not-compute-split-block-not-found-td11186.html#a11237
>
> If my understanding is correct, Spark runs out of storage in this case
> because of event pile-up, so it needs to delete some splits in order to
> free memory.
>
> However, even in this case, I still see a very small number (like 3MB) in
> the "Memory Used" column, while the total memory seems to be quite big
> (like 6GB). So I think the number shown in this column may be wrong.
>
> How does Spark calculate the total memory from the allocated JVM heap size?
> I guess it's related to the "spark.storage.memoryFraction" configuration,
> but I want to know the details.
>
> And why does the driver also use memory to store RDD blocks?
>
> Thanks again for the answer!
>
> ------------------------------
>
> *From:* Patrick Wendell [mailto:pwend...@gmail.com]
> *Sent:* October 23, 2014 14:00
> *To:* Haopu Wang
> *Cc:* user
> *Subject:* Re: About "Memory usage" in the Spark UI
>
> It shows the amount of memory used to store RDD blocks, which are created
> when you run .cache()/.persist() on an RDD.
>
> On Wed, Oct 22, 2014 at 10:07 PM, Haopu Wang <hw...@qilinsoft.com> wrote:
>
> Hi, please take a look at the attached screenshot. I wonder what the
> "Memory Used" column means.
>
> I give 2GB memory to the driver process and 12GB memory to the executor
> process.
>
> Thank you!
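On the "how is total memory calculated" question above: a minimal sketch in plain Scala of the arithmetic Spark 1.x uses, assuming the defaults of that era (spark.storage.memoryFraction = 0.6, plus a storage safety fraction of 0.9 applied on top; check your Spark version's configuration docs, as these names and defaults changed in later releases):

```scala
// Hedged sketch of the Spark 1.x-era storage-memory calculation.
// The 0.6 and 0.9 defaults below are assumptions based on
// spark.storage.memoryFraction and its safety fraction; they are not
// read from a real SparkConf here.
object StorageMemoryEstimate {
  def maxStorageBytes(heapBytes: Long,
                      memoryFraction: Double = 0.6,
                      safetyFraction: Double = 0.9): Long =
    (heapBytes * memoryFraction * safetyFraction).toLong

  def main(args: Array[String]): Unit = {
    // 12 GB executor heap, as in the thread above
    val heap = 12L * 1024 * 1024 * 1024
    val gib = maxStorageBytes(heap).toDouble / (1024 * 1024 * 1024)
    // ~6.5 GB of storage memory, roughly the ~6GB total seen in the UI
    println(f"$gib%.2f GiB")
  }
}
```

Under these assumed defaults a 12GB heap yields about 12 * 0.6 * 0.9 ≈ 6.5GB of storage memory, which lines up with the ~6GB total Haopu reports seeing in the "Memory Used" column.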