hvanhovell commented on issue #23762: [SPARK-21492][SQL][WIP] Memory leak in 
SortMergeJoin
URL: https://github.com/apache/spark/pull/23762#issuecomment-464704518
 
 
   @taosaildrone I have two problems with this approach:
   
   1. This only works for a SMJ with Sorts as its direct input. That is the 
common case, but it won't work in any situation where there are operators 
in between the SMJ and the sort (for example, when the sort sits between other 
operators). 
   2. I am not sure it is safe to assume that you can close an underlying child 
like this. You are toast as soon as something used in a downstream operator 
still points to a sort buffer you are releasing. This should generally not be 
an issue, because it would probably have surfaced as a bug. However, this is a 
pretty low bar to clear, as it also depends on usage patterns. I would like to 
see a more principled approach here, with both an improved API (close on 
iterators???) and well-defined semantics.
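
To make the "close on iterators" idea in point 2 a bit more concrete, here is a 
minimal sketch of one possible shape. It is only an illustration: every name in 
it (`CloseableIterator`, `SortBuffer`, `MappedIterator`) is hypothetical and is 
not an existing Spark API, and the contract in the comments is just one 
candidate for "well-defined semantics". The intermediate operator forwards 
`close()` to its child, which is what would let a consumer release the buffer 
even when it is not directly on top of the sort.

```scala
// Illustrative sketch only; none of these types exist in Spark.
object CloseableIteratorSketch {

  /** Assumed contract: close() is idempotent, and hasNext is false after close(). */
  trait CloseableIterator[A] extends Iterator[A] with AutoCloseable {
    def close(): Unit
  }

  /** Stand-in for a sort buffer that holds memory until it is released. */
  final class SortBuffer(rows: Seq[Int]) {
    private var released = false
    def isReleased: Boolean = released

    def sortedIterator(): CloseableIterator[Int] = new CloseableIterator[Int] {
      private val underlying = rows.sorted.iterator
      private var closed = false

      def hasNext: Boolean = {
        // Release eagerly once the data is exhausted.
        if (!closed && !underlying.hasNext) close()
        !closed
      }

      def next(): Int = {
        if (!hasNext) throw new NoSuchElementException("iterator is closed or exhausted")
        underlying.next()
      }

      def close(): Unit = if (!closed) {
        closed = true
        released = true // a real buffer would free its memory here
      }
    }
  }

  /** An operator sitting between the consumer and the sort; it forwards close(). */
  final class MappedIterator[A, B](child: CloseableIterator[A], f: A => B)
      extends CloseableIterator[B] {
    def hasNext: Boolean = child.hasNext
    def next(): B = f(child.next())
    def close(): Unit = child.close()
  }

  /** A consumer that stops early and closes through the intermediate operator. */
  def demo(): (Int, Boolean, Boolean) = {
    val buffer = new SortBuffer(Seq(3, 1, 2))
    val it = new MappedIterator[Int, Int](buffer.sortedIterator(), _ * 10)
    val first = it.next() // smallest row (1) mapped to 10
    it.close()            // early termination propagates down to the buffer
    (first, buffer.isReleased, it.hasNext)
  }
}
```

Note that this sketch does nothing about the hazard described above: if a 
downstream operator still holds a reference into the buffer when `close()` is 
called, it would read released memory. Any real design would have to say who 
owns the rows after they have been handed out.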
   
   We should discuss this on the dev list.
