[ https://issues.apache.org/jira/browse/YUNIKORN-3123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Peter Bacsko resolved YUNIKORN-3123.
------------------------------------
    Fix Version/s: 1.8.0
       Resolution: Fixed

> Add retry logic to AssumePod to prevent PV races
> ------------------------------------------------
>
>                 Key: YUNIKORN-3123
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-3123
>             Project: Apache YuniKorn
>          Issue Type: Bug
>          Components: shim - kubernetes
>            Reporter: Peter Bacsko
>            Assignee: Peter Bacsko
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.8.0
>
>
> Internally we ran into a strange problem that occurs on OpenShift. It seems
> to be related to how ephemeral volumes are handled by LSO (Local Storage
> Operator).
> {noformat}
> Events:
>   Type     Reason          Age  From      Message
>   ----     ------          ---- ----      -------
>   Normal   Scheduling      22m  yunikorn  impala-1755495449-zgbl/impala-executor-000-0 is queued and waiting for allocation
>   Warning  AssumePodError  22m  yunikorn  pod impala-executor-000-0 has conflicting volume claims: node(s) didn't find available persistent volumes to bind
>   Normal   TaskFailed      22m  yunikorn  Task impala-1755495449-zgbl/impala-executor-000-0 is failed
> {noformat}
> The underlying issue is very likely a race condition between two separate
> volumeBinder instances. The one inside the {{VolumeBinding}} plugin already
> sees the volume when the predicates are evaluated, so the node is seen as fit
> for the given pod. After the core completes the scheduling,
> {{context.AssumePod()}} is called, which makes yet another call to
> {{SchedulerVolumeBinder.FindPodVolumes()}}. However, this second instance
> hasn't yet received the update about the volumes being ready, so it returns
> an error. This also means that the bug is very sensitive to network latencies.
>
> It's difficult to reproduce. Our suggestion is to add simple retry logic
> around {{AssumePod()}}.