wanderijames edited a comment on pull request #17178:
URL: https://github.com/apache/airflow/pull/17178#issuecomment-885423290


   > We currently have one operator that allows us to run Spark job on 
Kubernetes. It works with both EKS and GCP as well as any other Kubernetes 
platform. - 
[SparkKubernetesOperator](https://github.com/apache/airflow/blob/d72b363929c86eb03fc9583002459bd10bc7eaeb/airflow/providers/cncf/kubernetes/operators/spark_kubernetes.py#L24).
 Why would anyone use this operator instead of the generic operator for 
Kubernetes?
   
   Hey @mik-laj, I am aware of the Apache Spark and Livy operators and also the 
EMR operator. However, EMR on EKS works differently because EMR launches a 
virtual cluster in your EKS cluster. The pods it launches (Spark master and 
executors) are ephemeral and exist only when a job run is started. For more 
information, kindly visit https://aws.amazon.com/emr/features/eks/
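
   To illustrate the start-job flow, here is a rough sketch of what such a 
request might look like. It assumes boto3's `emr-containers` client and its 
`start_job_run` call; the helper names, the release label, the Spark 
parameters, and every ID, ARN, and path are placeholders made up for 
illustration, not code from this PR.

```python
from typing import Any, Dict


def build_start_job_run_request(
    virtual_cluster_id: str,
    execution_role_arn: str,
    entry_point: str,
    release_label: str = "emr-6.3.0-latest",  # placeholder release label
    name: str = "sample-spark-job",  # placeholder job name
) -> Dict[str, Any]:
    """Build keyword arguments for an EMR on EKS start-job-run request."""
    return {
        "name": name,
        "virtualClusterId": virtual_cluster_id,
        "executionRoleArn": execution_role_arn,
        "releaseLabel": release_label,
        "jobDriver": {
            "sparkSubmitJobDriver": {
                "entryPoint": entry_point,
                # Illustrative Spark settings only.
                "sparkSubmitParameters": "--conf spark.executor.instances=2",
            }
        },
    }


def start_job_run(request: Dict[str, Any]) -> str:
    """Submit the job run and return its ID (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the request builder works without AWS access

    client = boto3.client("emr-containers")
    response = client.start_job_run(**request)
    return response["id"]
```

   No long-running Spark cluster is referenced anywhere in the request: only 
the virtual cluster ID is passed, and the pods come and go with the job run.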
   
   In addition, SparkKubernetesOperator is only suitable if a Spark cluster 
has already been set up in Kubernetes. In this case, it has not.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

