pbacsko commented on PR #502:
URL: https://github.com/apache/yunikorn-core/pull/502#issuecomment-1433407755

   > Looking through the changes I am not too worried about the new config not 
being OK to apply. The config validation looks at the new config and makes sure 
it all parses correctly and that the structure is valid. If we have a gap in 
that area we should update the config validator. 
   
   I checked the config validator. Yes, resource and ACL strings can cause 
problems and those are validated.
   The method `Queue.setTemplate()` can return an error and it's not validated, 
however if this fails we swallow it and return a nil without printing anything. 
This is suspect.
   
   > 
   > * the in memory config shows a queue `root.parent.leaf`  The newly created 
config has only got `root.parent`. Do we remove the queue `leaf`? Parent does 
not change type to a leaf and it was an auto parent before. How do we handle 
this...
   By default, `leaf` is marked for removal and goes into `Draining` state (if 
it is a managed queue). Applications in that queue keep running. In case of a 
parent, the entire hierarchy below it is marked for removal.
   
   > * the in memory config shows `root.my-queue` defined as a leaf. The newly 
created config has `root.my-queue.spark`. That as far as I can tell causes an 
automatic type change to parent for my-queue. What happens with the 
applications that run there?
   I think that is what we partly discussed in the JIRA ticket. We can create a 
new queue like `root.preserved` and move existing applications into that queue. 
Plus we have to update queue references in the `Application` objects. I can do 
a POC in this PR and let's see if it is any good.
   
   > * the in memory config shows `root.my-queue.spark` Both `my-queue` and 
`spark` are dynamic queues. The newly created config has `root.my-queue` as a 
leaf. That breaks things badly I think.
   After some investigation, it looks like the dynamic queues are automatically 
cleaned up by the partition manager as soon as they become empty. Other than 
that, I don't see special treatment.
   I suggest the same approach, running applications should be moved to a 
pre-defined queue.
   
   Another thing that is important that asks and allocations cause the 
priorities to be updated and re-calculated. Basically, we also need to do two 
things:
   1. Make sure that no preemption/scheduling cycle runs during the config 
update
   2. Recalculate all priorities if there was a change
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to