sarutak commented on PR #54575:
URL: https://github.com/apache/spark/pull/54575#issuecomment-3998215797

   @dongjoon-hyun Thank you for your feedback.
   
   > Do you want to have per-directory configurations in the future?
   
   I considered that per-directory configurations (e.g. 
`spark.history.fs.cleaner.*` per directory) might be helpful, but such 
configurations are not supported in this PR. I'd like to start with simple 
global settings and improve them based on user feedback.
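   For context, the global settings discussed in this thread are all set once at the History Server level, e.g. in `spark-defaults.conf`. The values below are illustrative, not the defaults:

   ```
   # One scan / one cleanup pass applies across all configured log directories
   spark.history.fs.update.interval    10s
   spark.history.fs.cleaner.enabled    true
   spark.history.fs.cleaner.interval   1d
   # Cap on the total number of log entries across all directories
   spark.history.fs.cleaner.maxNum     5000
   ```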
   
   > For now, spark.history.fs.update.interval is supposed to be applied for 
one scan for all directories?
   
   Yes.
   
   > spark.history.fs.cleaner.interval is also supposed to be applied for one 
scan for all directories?
   
   Yes.
   
   > When spark.history.fs.cleaner.maxNum is applied,
   > This PR will consider the total number of files for all directories, right?
   > Which directory will be selected as a victim for the tie?
   
   Yes, the property is applied to the total number of log entries across all 
directories. As the updated documentation says, when the limit is exceeded, the 
oldest completed attempts are deleted first, regardless of which directory they 
belong to.
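   To illustrate that eviction behavior, here is a minimal sketch (not Spark's actual implementation; the names are made up for illustration) of pooling completed attempts from every directory and deleting the oldest first once the cap is exceeded:

   ```python
   from dataclasses import dataclass

   @dataclass(frozen=True)
   class Attempt:
       log_dir: str   # which event-log directory the attempt lives in
       app_id: str
       end_time: int  # completion timestamp

   def select_victims(attempts, max_num):
       """Return the attempts to delete: the oldest completed attempts
       beyond max_num, pooled across all directories."""
       excess = len(attempts) - max_num
       if excess <= 0:
           return []
       # Sort the merged pool by completion time; directory is irrelevant.
       return sorted(attempts, key=lambda a: a.end_time)[:excess]

   attempts = [
       Attempt("/logs/a", "app-1", 100),
       Attempt("/logs/b", "app-2", 50),
       Attempt("/logs/a", "app-3", 200),
   ]
   print(select_victims(attempts, max_num=2))
   # -> [Attempt(log_dir='/logs/b', app_id='app-2', end_time=50)]
   ```

   The key point is that the pool is merged before sorting, so the victim for a tie is simply whichever attempt completed earliest, not a per-directory quota.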
   
   > Since this introduces lots of ambiguity a little, could you revise the PR 
title and provide a corresponding documentation update, docs, together in this 
PR?
   
   Updated (you said `revise the PR title`, but I think that's a typo for the 
PR description, so I've updated only the description).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

