anuraagnalluri commented on pull request #369: URL: https://github.com/apache/yunikorn-k8shim/pull/369#issuecomment-1081728125
@yangwwei Done, and I changed the necessary imports. Thanks for getting another pair of eyes on this. I was able to reproduce the error locally a couple of times in plugin mode, but I am still unsure why the allocations list is empty. When I ran into the same failure we see in the CI checks, I verified that the applicationID of the sleepjob pod belongs to the newly added `recovery_and_restart` suite and _not_ `basic_scheduling_test`. My initial thought was that a "completed" sleepjob with 0 allocations from a previous test could have been picked up, but this is not the case (that test also tears down the namespace in cleanup). I could see the sleeppod was in the "Running" state, and ultimately I could not identify any metadata differences between the failing case and passing test runs. Is it possible that plugin-mode logic could specifically affect this behavior in a way normal mode cannot?

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
