[ 
https://issues.apache.org/jira/browse/YUNIKORN-949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17479117#comment-17479117
 ] 

Anuraag Nalluri edited comment on YUNIKORN-949 at 1/20/22, 6:58 AM:
--------------------------------------------------------------------

Hi [~pbacsko] and [~wwei]. Do you have a recommendation as to which retention 
policy we should implement? As suggested by Weiwei, I'm thinking of adding 
another route that specifies the max file size in MB. If the output file 
exceeds this limit, we should remove the oldest entries first (FIFO order, 
from the top of the file). 
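
To make the FIFO idea concrete, here is a minimal Go sketch. It assumes one 
entry per line and that the whole file fits in memory; trimOldest is a 
hypothetical helper, not existing YuniKorn code, and a limit given in MB would 
first be converted to bytes (maxMB * 1024 * 1024).

```go
package main

import (
	"bytes"
	"fmt"
)

// trimOldest drops whole lines from the front of data (the oldest entries)
// until what remains fits within maxBytes.
// Assumption: each state dump entry occupies exactly one line.
func trimOldest(data []byte, maxBytes int) []byte {
	for len(data) > maxBytes {
		i := bytes.IndexByte(data, '\n')
		if i < 0 {
			// A single entry larger than the limit: nothing can be kept.
			return nil
		}
		data = data[i+1:]
	}
	return data
}

func main() {
	dump := []byte("entry-1\nentry-2\nentry-3\n")
	// 24 bytes in total; with a 16 byte limit the oldest entry is dropped.
	fmt.Printf("%q\n", trimOldest(dump, 16)) // "entry-2\nentry-3\n"
}
```

In practice the trimmed content would be written to a temporary file and 
renamed over the original, so a crash cannot leave a half-written dump.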

If this is unspecified, we should probably use a fraction of the total space 
available to us on the scheduler container as a reasonable default for the file 
size limit. However, if the container is very constrained on free space, this 
still doesn't guarantee that the log file can reach its max allowed limit 
without causing OOM. If this idea is ok, we can pass the container memory limit 
through the downward API.
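
A rough Go sketch of deriving the default from a limit passed in through the 
downward API. The environment variable name, the fallback of 100 MB and the 
fraction are all assumptions for illustration; the pod spec would have to 
populate the variable via resourceFieldRef (limits.memory), which exposes the 
value in bytes when no divisor is set.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

const (
	// limitEnvVar is a hypothetical name; nothing sets it by default.
	limitEnvVar = "SCHEDULER_MEMORY_LIMIT_BYTES"
	// fallbackMB is an arbitrary default used when no limit is exposed.
	fallbackMB = 100
)

// defaultMaxSizeMB returns the given fraction of the container limit in MB.
// raw is the value of the environment variable, expected to be in bytes.
func defaultMaxSizeMB(raw string, fraction float64) int64 {
	limit, err := strconv.ParseInt(raw, 10, 64)
	if err != nil || limit <= 0 {
		return fallbackMB
	}
	mb := int64(float64(limit) * fraction / (1024 * 1024))
	if mb < 1 {
		return 1 // never return a zero-sized limit
	}
	return mb
}

func main() {
	fmt.Println(defaultMaxSizeMB(os.Getenv(limitEnvVar), 0.25))
}
```

With a 1 GiB limit and a fraction of 0.25 this gives 256 MB; if the variable 
is missing or malformed, the fallback applies.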


was (Author: JIRAUSER283086):
Hi [~pbacsko] and [~wwei]. Do you have a recommendation as to which retention 
policy we should implement? As suggested by Weiwei, I'm thinking of adding 
another route that specifies the max file size in MB. If the output file 
exceeds this limit, we should remove the oldest entries first (FIFO order, 
from the top of the file). 

If this is unspecified, we should probably use a fraction of the total space 
available to us on the scheduler container as a reasonable default for the file 
size limit. However, if the container is very constrained on free space, this 
still doesn't guarantee that the log file can reach its max allowed limit 
without causing OOM. If this idea is ok, we can pass the container memory limit 
through the downward API.

But if we want higher confidence of not running into OOM, perhaps we can use 
OS commands to get the remaining "free" space on the container and set the 
default as a fraction of that. What do you guys think? 
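
As a sketch of the free-space variant: instead of shelling out to OS commands, 
the same number is available from Go via syscall.Statfs (Linux and macOS 
only). The helper names and the 10% fraction below are assumptions for 
illustration, not a proposal for the final values.

```go
package main

import (
	"fmt"
	"syscall"
)

// freeBytes reports the space available to unprivileged processes on the
// filesystem that contains path.
func freeBytes(path string) (uint64, error) {
	var st syscall.Statfs_t
	if err := syscall.Statfs(path, &st); err != nil {
		return 0, err
	}
	// Bsize has a different integer type per platform, hence the conversion.
	return st.Bavail * uint64(st.Bsize), nil
}

// limitFromFree takes a fraction of the free space as the file size limit.
func limitFromFree(free uint64, fraction float64) uint64 {
	return uint64(float64(free) * fraction)
}

func main() {
	free, err := freeBytes(".")
	if err != nil {
		fmt.Println("statfs failed:", err)
		return
	}
	fmt.Printf("free: %d bytes, default limit: %d bytes\n",
		free, limitFromFree(free, 0.1))
}
```

Note that this measures free space once, at startup; other writers on the same 
filesystem can still shrink it afterwards.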

> Location of the state dump file should be configurable
> ------------------------------------------------------
>
>                 Key: YUNIKORN-949
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-949
>             Project: Apache YuniKorn
>          Issue Type: Improvement
>          Components: core - scheduler
>            Reporter: Peter Bacsko
>            Assignee: Anuraag Nalluri
>            Priority: Major
>
> In YUNIKORN-940, the periodic state dump feature was introduced.
> However, the location of the file is fixed: it's the current working 
> directory of the YK scheduler binary. This can become a problem with Docker 
> containers that have little free space, or if the user wants the state to be 
> logged frequently.
> The location of the file should be configurable, so it can be written to an 
> external volume.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
