SameerMesiah97 commented on PR #61110:
URL: https://github.com/apache/airflow/pull/61110#issuecomment-3831664445
@Nataneljpwd
I still have two blocking concerns here:
1) I don’t think the quorum-read argument holds. Not setting
`resourceVersion` may require a quorum read , but that only guarantees
freshness of what the API server returns. It doesn’t mean we’re reading
directly from etcd, and more importantly it doesn’t enforce any stronger
invariants (like uniqueness or lifecycle ordering).
2) Even if the API response reflects the latest etcd state, Kubernetes does
not guarantee that there can’t be multiple pods with the same labels in
`Pending` or `Running` at the same time. I understand the SparkApplication
controller intends there to be a single driver, but that guarantee is eventual.
During restarts or reconciliation, overlapping driver pods are possible and
expected. Your filter for deletion_timestamp partially mitigates against this
by exlcuding terminating pods but I suspect there can still be windows where
the replacement pod is created before the deletion for the existing pod is
triggered.
Because of that, removing the existing phase-based prioritization is risky.
I see that you are now prioritizing 'Succeeded' pods but since you have removed
the ordering for `Running` and `Pending` pods, this does not mitigate the
concerns I explained above.
I would advise you to implement something like this:
```
pod = max(
pod_list,
key=lambda p: (
p.metadata.deletion_timestamp is None,
p.status.phase == PodPhase.RUNNING,
p.status.phase == PodPhase.SUCCEEDED,
p.metadata.creation_timestamp or
datetime.min.replace(tzinfo=timezone.utc),
p.metadata.name or "",
),
)
```
This handles the edge cases revealed by this PR without accidentally
returning multiple pods. I would advise removing the field selector as well
because we are already filtering in the lambda expression.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]