tokoko commented on issue #24171:
URL: https://github.com/apache/airflow/issues/24171#issuecomment-1776846004

   @vchiapaikeo that's right. A full solution is really tricky here, which is 
why I chose to keep the implementation in-house; it was mostly suited to our 
environment only :) Running `client` and `local` jobs in deferrable mode 
doesn't make much sense, I think. I would just throw an error in that case. 
   
   For `cluster` mode jobs, there's no comprehensive solution AFAIK. You can 
track standalone jobs with `spark-submit` itself, but you can't do the same 
for YARN and k8s. For those you would either use the respective clients 
(`yarn` and `kubectl`) or REST API calls. A web of if-else clauses is 
probably the only way to go here.
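   As a rough sketch of what the YARN branch of that if-else web could look 
like: the Hadoop ResourceManager exposes `GET /ws/v1/cluster/apps/{appid}`, 
which returns a JSON payload with `state` and `finalStatus` fields. The 
helper below (hypothetical, not anything that exists in Airflow) just maps 
such a payload to a (done, succeeded) pair; the commented-out `urllib` call 
shows where a real poll would go, assuming the RM REST API is reachable.

   ```python
   import json
   # from urllib.request import urlopen  # for a real poll against the RM

   # Terminal YARN application states per the RM REST API docs.
   TERMINAL_STATES = {"FINISHED", "FAILED", "KILLED"}

   def interpret_yarn_app(payload: dict) -> tuple[bool, bool]:
       """Map a /ws/v1/cluster/apps/{appid} JSON payload to (done, succeeded)."""
       app = payload["app"]
       done = app["state"] in TERMINAL_STATES
       succeeded = done and app.get("finalStatus") == "SUCCEEDED"
       return done, succeeded

   # A deferrable trigger would poll something like this (rm_url and app_id
   # are placeholders for illustration):
   # payload = json.load(urlopen(f"{rm_url}/ws/v1/cluster/apps/{app_id}"))
   # done, succeeded = interpret_yarn_app(payload)
   ```

   The same shape of helper would then be repeated per cluster manager, which 
is exactly where the if-else sprawl comes from.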
   
   That's true, tracking YARN with REST calls assumes the REST API is 
exposed, which may or may not be the case depending on the environment. In 
the case of k8s, it's probably safe to assume that both kubectl and the REST 
API will be available.
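   For the k8s side, tracking usually reduces to watching the driver pod's 
phase (`Pending`/`Running`/`Succeeded`/`Failed` are standard pod phases). A 
minimal sketch, assuming you fetch the pod JSON with something like 
`kubectl get pod <driver-pod> -o json` (the helper name is made up for 
illustration):

   ```python
   # Interpret the driver pod's status from `kubectl get pod ... -o json`
   # output. Pod phases come from the standard Kubernetes pod lifecycle.
   def interpret_driver_pod(pod: dict) -> tuple[bool, bool]:
       """Map a Pod object's JSON to (done, succeeded)."""
       phase = pod.get("status", {}).get("phase", "Unknown")
       done = phase in ("Succeeded", "Failed")
       return done, phase == "Succeeded"

   # e.g. pod = json.loads(subprocess.check_output(
   #     ["kubectl", "get", "pod", driver_pod_name, "-o", "json"]))
   # done, succeeded = interpret_driver_pod(pod)
   ```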


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
