wilfred-s commented on issue #89: Core allocation/reservation logic renovation
URL: https://github.com/apache/incubator-yunikorn-core/pull/89#issuecomment-587269592

New commits pushed with the smoke tests and further clean up.

To link this back to the comments from @yangwwei:

Comment 1 (https://github.com/apache/incubator-yunikorn-core/pull/89#issuecomment-584913754) remarks:
1. internal unreserve: fixed the issue found (commit 1)
2. tryAllocate correctly unreserves (commit 2)
3. if a reserved allocation fails, try all nodes (commit 1)

Comment 2 (https://github.com/apache/incubator-yunikorn-core/pull/89#issuecomment-585361646) remarks:
1. Running predicates on every allocation attempt, for each node, in all cycles will cause a huge slowdown. Currently the predicate check is only run if there are enough resources available on the node. For example, in a 100 node cluster where only one node fits the ask, we run 1 predicate check. If we ran the predicates before the resource check and let them lead us, and 99 nodes do not fit the ask, we would have run the predicates 100 times for the same alloc. Caching the predicate result is not possible, as node usage can change and thus the predicate outcome would change. I think that point 1) is thus a no-go.
2. The score used is really basic at the moment. However, I could argue for or against all scores. A large node might have a longer average runtime per allocation (service type load) and thus release less often. Without metrics we really cannot argue for one or the other, or for a third alternative.
3. Yes, we need better metrics. I will follow up with a new jira.

For the test failures: I have seen a number of them and they are transient. The tests use a manual scheduler (steps based on a counter). I think the manual scheduling in the smoke tests is the cause of the issue. The duration of a scheduling cycle is short, and it is cut even shorter when nothing needs to be done. We probably _waste_ scheduling cycles because we have nothing to do. When events are processed later, we have no scheduling cycles left to make progress.

I am thinking about a better solution, or even using continuous scheduling in the smoke tests.
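
To illustrate the ordering argument about predicates, here is a minimal sketch in Go. It is not the YuniKorn code: `Node`, `Predicate`, `tryNodes` and `buildCluster` are hypothetical names made up for this example, and resources are reduced to a single integer. It only counts how many predicate calls are made when the cheap resource fit check runs first versus when the predicate runs first.

```go
package main

import "fmt"

// Node is a hypothetical, simplified node with one resource dimension.
type Node struct {
	Name      string
	Available int
}

// Predicate stands in for an expensive placement check (assumed signature).
type Predicate func(n Node, ask int) bool

// tryNodes returns the first node that fits the ask and passes the predicate,
// together with the number of predicate calls made. When fitFirst is true the
// cheap resource check rejects nodes before the predicate is ever called.
func tryNodes(nodes []Node, ask int, pred Predicate, fitFirst bool) (string, int) {
	calls := 0
	for _, n := range nodes {
		fits := n.Available >= ask
		if fitFirst && !fits {
			continue // cheap rejection, the predicate never runs
		}
		calls++
		if pred(n, ask) && fits {
			return n.Name, calls
		}
	}
	return "", calls
}

// buildCluster creates 99 full nodes followed by one node with free capacity.
func buildCluster() []Node {
	nodes := make([]Node, 0, 100)
	for i := 0; i < 99; i++ {
		nodes = append(nodes, Node{Name: fmt.Sprintf("node-%d", i), Available: 0})
	}
	return append(nodes, Node{Name: "node-99", Available: 8})
}

func main() {
	nodes := buildCluster()
	always := func(Node, int) bool { return true } // predicate that always passes

	_, fitFirstCalls := tryNodes(nodes, 4, always, true)
	_, predFirstCalls := tryNodes(nodes, 4, always, false)
	fmt.Println(fitFirstCalls, predFirstCalls) // prints "1 100"
}
```

In this toy cluster both orderings pick the same node, but the predicate-first ordering pays for 100 predicate calls instead of 1 for a single ask.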
