If you run in non-cluster (client) mode, the driver logs are written to stdout,
so Airflow captures them.
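To make the client-vs-cluster distinction concrete, here is a minimal sketch of how the deploy mode affects where driver logs end up (the function and job path are illustrative, not from the thread):

```python
def spark_submit_cmd(application, deploy_mode="client", master="yarn"):
    """Build a spark-submit command line.

    In client mode the driver runs where spark-submit runs, so driver
    logs stream to stdout and Airflow's task log picks them up.
    In cluster mode the driver runs inside the cluster, so the logs
    stay on the cluster and Airflow only sees the submitter output.
    """
    return ["spark-submit",
            "--master", master,
            "--deploy-mode", deploy_mode,
            application]
```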
On Fri, 22 Jun 2018, 18:04 Naik Kaxil wrote:
Thanks @Niels and @Kyle.
@Niels - I agree, I don't want to copy Hadoop configurations to my Airflow VM.
In this case (using SSHOperator), Airflow would just be receiving standard
output, right, as opposed to driver logs?
@Kyle - If you can, then it would definitely be useful to have LivyOperators to
Hi Kaxil,
I would recommend using the SSHOperator to start the Spark Job on the
master node of the HDInsight cluster.
This avoids the problems associated with Livy, and doesn't require you to
open ports or copy the Hadoop configuration to your Airflow machine.
Niels
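The SSHOperator approach above might look roughly like this; the connection id, host, and job path are made up for illustration:

```python
# Command the SSHOperator would run on the HDInsight master node.
# Client mode keeps the driver on the master, so its logs come back
# over the SSH channel and land in Airflow's task log.
spark_submit_command = (
    "spark-submit"
    " --master yarn"
    " --deploy-mode client"
    " /home/sshuser/jobs/wordcount.py"  # hypothetical job path
)

# In an Airflow DAG this would be handed to an SSHOperator, e.g.:
#   SSHOperator(task_id="submit_spark_job",
#               ssh_conn_id="hdinsight_master",  # hypothetical connection
#               command=spark_submit_command)
```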
2018-06-22 14:17 GMT+02:00 Naik
I haven’t used any Azure products but I did build a Livy hook and operator
so I could submit concurrent spark jobs to EMR clusters. I was planning on
contributing the code, but it's kinda a pain haha. If you're interested I can
take another stab at getting the Livy hook and operator contributed.
On
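For reference, a Livy hook like the one Kyle describes submits jobs through Livy's batch REST API; a minimal sketch of the request body (the file path and arguments are invented):

```python
import json

def livy_batch_payload(file, class_name=None, args=None, conf=None):
    """Build the JSON body for Livy's POST /batches endpoint,
    which is how a Livy hook/operator would submit a Spark job."""
    payload = {"file": file}
    if class_name:
        payload["className"] = class_name
    if args:
        payload["args"] = args
    if conf:
        payload["conf"] = conf
    return json.dumps(payload)

# A hook would POST this to http://<livy-host>:8998/batches and then
# poll GET /batches/<batch-id>/state until the job reaches a
# terminal state.
```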
Hi all,
Has anyone used the SparkSubmitOperator to submit Spark jobs on an Azure HDInsight
cluster? Are you using Livy or spark-submit to run remote Spark jobs?
Regards,
Kaxil
Kaxil Naik
Data Reply
2nd Floor, Nova South
160 Victoria Street, Westminster
London SW1E 5LB - UK
phone: +44 (0)20 7730