[jira] [Comment Edited] (SPARK-25380) Generated plans occupy over 50% of Spark driver memory

2018-09-27 Thread Jungtaek Lim (JIRA)


[ https://issues.apache.org/jira/browse/SPARK-25380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631295#comment-16631295 ]

Jungtaek Lim edited comment on SPARK-25380 at 9/28/18 3:41 AM:
---

IMHO it depends on how we see the issue and how we would like to tackle this.

If we think a 200 MB plan string is normal and expected, you're right that the 
issue lies in the UI, and the UI should handle it well.
 (Even a single 200 MB plan would be beyond what end users expect, and they 
might not think to allocate enough driver memory for the UI, so purging old 
plans would work for some cases but not for others.)
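
As a concrete example of the purging knob we already have: 
{{spark.sql.ui.retainedExecutions}} (default 1000) caps how many finished SQL 
executions, including their plan descriptions, the UI retains. A minimal sketch 
of lowering it when building the session; the value 10 below is only 
illustrative, not a recommendation:

{code:scala}
// Minimal sketch: keep far fewer finished SQL executions in the UI so their
// (potentially huge) plan strings become eligible for GC sooner.
// spark.sql.ui.retainedExecutions is an existing config (default 1000);
// the value 10 is only an illustrative guess, not a recommendation.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("plan-retention-demo")
  .config("spark.sql.ui.retainedExecutions", "10")
  .getOrCreate()
{code}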

If we don't think a 200 MB plan string is normal, we need to see an actual case 
and investigate which physical nodes occupy so much space in the string 
representation, and whether that output is really needed or just too verbose. 
If the huge string comes from representing the physical node itself, which 
doesn't change between batches, we might be able to store a template of the 
message for each physical node and its variables separately, and apply them 
only when the page is requested.
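
To make the template idea concrete, here is a rough, purely hypothetical sketch 
(none of these types exist in Spark): store one shared template string per 
physical node plus only the small per-batch values, and render the full text 
only when the page is requested:

{code:scala}
// Hypothetical sketch of the idea above; NodeDescription is not a Spark type.
// The constant template is shared across batches, and only the small per-batch
// argument lists are retained, so identical plan text is not duplicated in
// driver memory for every micro-batch.
case class NodeDescription(template: String, args: Seq[String]) {
  // Interpolate lazily, only when the UI page is actually requested.
  def render(): String =
    args.zipWithIndex.foldLeft(template) { case (s, (arg, i)) =>
      s.replace(s"{$i}", arg)
    }
}

// Example: "Filter (...)" is stored once; only the predicate text varies.
val desc = NodeDescription("Filter ({0})", Seq("col_a > 10"))
println(desc.render())  // Filter (col_a > 10)
{code}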

If we knew more, we could come up with a better solution.

Since we are unlikely to get a reproducer, I don't want to block anyone from 
working on this. Anyone is welcome to tackle the UI issue.

EDIT: I might have misunderstood your previous comment, so I removed the lines 
where I mentioned it.


was (Author: kabhwan):
IMHO it depends on how we see the issue and how we would like to tackle this.

If we think a 200 MB plan string is normal and expected, you're right that the 
issue lies in the UI, and the UI should handle it well.
(Even a single 200 MB plan would be beyond what end users expect, and they 
might not think to allocate enough driver memory for the UI, so purging old 
plans would work for some cases but not for others.)

If we don't think a 200 MB plan string is normal, we need to see an actual case 
and investigate which physical nodes occupy so much space in the string 
representation, and whether that output is really needed or just too verbose. 
If the huge string comes from representing the physical node itself, which 
doesn't change between batches, we might be able to store a template of the 
message for each physical node and its variables separately, and apply them 
only when the page is requested.

If we knew more, we could come up with a better solution. Judging by your 
previous comment, I guess we're on the same page:
{quote}They seem to hold a lot more memory than just the plan graph structures 
do, it would be nice to know what exactly is holding on to that memory.
{quote}
Since we are unlikely to get a reproducer, I don't want to block anyone from 
working on this. Anyone is welcome to tackle the UI issue.

> Generated plans occupy over 50% of Spark driver memory
> --
>
> Key: SPARK-25380
> URL: https://issues.apache.org/jira/browse/SPARK-25380
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.3.1
> Environment: Spark 2.3.1 (AWS emr-5.16.0)
>  
>Reporter: Michael Spector
>Priority: Minor
> Attachments: Screen Shot 2018-09-06 at 23.19.56.png, Screen Shot 
> 2018-09-12 at 8.20.05.png, heapdump_OOM.png, image-2018-09-16-14-21-38-939.png
>
>
> When debugging an OOM exception during a long run of a Spark application (many 
> iterations of the same code), I've found that generated plans occupy most of 
> the driver memory. I'm not sure whether this is a memory leak or not, but it 
> would be helpful if old plans could be purged from memory anyway.
> Attached are screenshots of OOM heap dump opened in JVisualVM.
>  





[jira] [Comment Edited] (SPARK-25380) Generated plans occupy over 50% of Spark driver memory

2018-09-16 Thread Nir Hedvat (JIRA)


[ https://issues.apache.org/jira/browse/SPARK-25380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16616675#comment-16616675 ]

Nir Hedvat edited comment on SPARK-25380 at 9/16/18 11:21 AM:
--

Experiencing the same problem:

!image-2018-09-16-14-21-38-939.png!


was (Author: nir hedvat):
Same problem here (using Spark 2.3.1)
