srowen commented on PR #45157: URL: https://github.com/apache/spark/pull/45157#issuecomment-1954623553
The issue is roughly: say you want to schedule a task that needs 1/9th of a GPU, so you request resource amount `1.0/9.0`. Suppose that double is just a tiny bit bigger than 1/9. After 8 of those resource requests have gone through, the amount of resource left is less than 1/9, so the 9th can't schedule. I think trying to do integer math after the floating-point conversion still has the issue; the value is already 'wrong' by the time it comes in as a double through the API.

In practice it may not come up a lot. These values are often set as a Spark config, as a string, and a user is going to write something like "0.111" -- some prefix of what is notionally a big repeating decimal -- so they will ask for slightly less than 1/9th (say) of a GPU and schedule 9 as desired. That leaves Spark thinking there's a tiny bit of GPU available when there isn't, but certainly nothing like 1/9th. It's still a bit ugly, and yeah, the API kind of needs to take a different input type to _really_ address this, but maybe that's not worth the breakage.
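A minimal sketch of the "value is already wrong when it comes in as a double" point, using Python's `fractions` module rather than Spark's actual scheduler code, and 1/10 rather than 1/9 (the double nearest 0.1 happens to land slightly *above* the true value, which is the direction that triggers the under-scheduling case):

```python
from fractions import Fraction

# A user wants 10 tasks per GPU, so each task requests "0.1" of a GPU.
# The nearest IEEE-754 double to 1/10 is slightly LARGER than 1/10:
amount = 0.1
assert Fraction(amount) > Fraction(1, 10)

# Even with exact (rational/integer) accounting after the conversion,
# the error is already baked in: ten grants of this double overshoot
# one full GPU, so an exact scheduler still fits only 9 tasks.
available = Fraction(1)
granted = 0
while available >= Fraction(amount):
    available -= Fraction(amount)
    granted += 1
print(granted)  # 9, not the 10 the user intended
```

The point of doing the accounting in `Fraction` is that no *further* rounding occurs after the API boundary, yet the 10th task still fails to schedule: the only lossy step was accepting the amount as a double in the first place, which is why switching to integer math internally can't fully fix it.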
