[GitHub] [spark] HeartSaVioR commented on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

GitBox Wed, 15 Jul 2020 14:32:35 -0700


HeartSaVioR commented on pull request #28904:
URL: https://github.com/apache/spark/pull/28904#issuecomment-659021386



   Well, I guess I already explained why `compactLogs` is the culprit of the 
memory issue, right? 
(https://github.com/apache/spark/pull/28904#discussion_r448049735)
   
   All entries are materialized while compacting, and one of the reasons for 
doing so was to pass these entries to `compactLogs`. In other words, if we 
reduce memory usage by no longer passing all entries to `compactLogs`, we are 
already breaking the existing behavior of `compactLogs`, and it would be worse 
to change the semantics without any notice (if it were a public API... I'm not 
sure this is a productive discussion, as we all know it's a private API).
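
   To make the trade-off concrete, here is a minimal Scala sketch. It is not 
Spark's actual code: `Entry`, `CompactSketch`, `compactMaterialized` and 
`compactStreaming` are hypothetical names, and the `compactLogs` body is only 
an assumed example of a hook that needs the whole set of entries at once.

```scala
// Hypothetical, simplified stand-ins for illustration; not Spark's real classes.
final case class Entry(path: String, action: String)

object CompactSketch {
  // Assumed shape of the hook: it receives ALL entries and returns the ones to keep.
  // Here it drops every path that has a "delete" marker anywhere in the input.
  def compactLogs(all: Seq[Entry]): Seq[Entry] = {
    val deleted = all.filter(_.action == "delete").map(_.path).toSet
    all.filter(e => !deleted.contains(e.path))
  }

  // Materializing variant: every entry of every batch is held in memory
  // at the same time so that the hook can see the full set.
  def compactMaterialized(batches: Iterator[Seq[Entry]]): Seq[Entry] =
    compactLogs(batches.flatMap(b => b).toList)

  // Streaming variant: entries are filtered one at a time, so memory stays flat,
  // but a per-entry predicate cannot know about a marker that arrives later.
  def compactStreaming(batches: Iterator[Seq[Entry]])(keep: Entry => Boolean): Iterator[Entry] =
    batches.flatMap(b => b).filter(keep)
}
```

   With the entries `a:add`, `b:add`, `a:delete`, the materializing variant 
keeps only `b`, while a streaming filter that merely drops delete markers still 
keeps `a`. That difference is the kind of semantic change discussed above.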


