[
https://issues.apache.org/jira/browse/YUNIKORN-2895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887160#comment-17887160
]
Qi Zhu edited comment on YUNIKORN-2895 at 10/6/24 1:04 AM:
-----------------------------------------------------------
Thank you [~corleyma] for this info.
1. "Check if a partition's allocated resource <= total resource of the
partition": this should be fixed in 1.5.2/1.6.0. Did you hit this on 1.5.1?
See the jira:
https://issues.apache.org/jira/browse/YUNIKORN-2731
It should be fixed in:
https://issues.apache.org/jira/browse/YUNIKORN-2632
2. Occasional abandoned pods (pending): this should be a very specific
situation. Can you provide more debug logs and dump files for it?
We also need to see the pod event messages.
3. Negative resources ("Check for negative resources in the nodes"): this
seems to be a known issue, and we will have a more stable solution for
non-YuniKorn allocations.
It would still help if you can provide more debug logs / dump files.
The non-YuniKorn allocation solution targeted for 1.6.0:
https://issues.apache.org/jira/browse/YUNIKORN-2791
4. Do you also see orphan allocations on 1.6.0? This Jira is meant to handle
them, following the post in the Slack channel.
> Don't add duplicated allocation to node when the allocation ask fails
> ---------------------------------------------------------------------
>
> Key: YUNIKORN-2895
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2895
> Project: Apache YuniKorn
> Issue Type: Bug
> Components: core - scheduler
> Reporter: Qi Zhu
> Assignee: Qi Zhu
> Priority: Critical
>
> While revisiting the new update-allocation logic, I found that a duplicate
> allocation can be added to a node when the allocation is already allocated:
> we try to add the allocation to the node again and don't revert it.