RE: spark-submit exit status on k8s

2020-04-06 Thread Marshall Markham
Thank you, that looks promising as well.


Marshall


RE: spark-submit exit status on k8s

2020-04-06 Thread Marshall Markham
This is a great idea, Masood. We are actually managing our Spark jobs with a 
Kubernetes pod operator, so we may add a check at that layer to determine 
success/failure, keeping it in the same node of the DAG.
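A rough sketch of such a layer-level check (the helper names are hypothetical; it assumes the wrapper can read the driver pod's terminal phase, e.g. via `kubectl` and cluster access):

```python
import subprocess
import sys


def driver_pod_phase(pod: str, namespace: str = "default") -> str:
    """Ask Kubernetes for the pod's phase (requires kubectl and cluster access)."""
    out = subprocess.run(
        ["kubectl", "get", "pod", pod, "-n", namespace,
         "-o", "jsonpath={.status.phase}"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.strip()


def exit_code_for_phase(phase: str) -> int:
    """Kubernetes pod phases are Pending/Running/Succeeded/Failed/Unknown;
    only a Succeeded driver pod should count as a successful Spark job."""
    return 0 if phase == "Succeeded" else 1


if __name__ == "__main__":
    # Exit with a status the DAG can branch on, instead of trusting spark-submit.
    sys.exit(exit_code_for_phase(driver_pod_phase(sys.argv[1])))
```

Wrapping this as the pod operator's command (or calling it from the operator layer) gives the DAG a real exit status to condition on.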

Thanks again.


Marshall


Re: spark-submit exit status on k8s

2020-04-05 Thread Yinan Li
Not sure if you are aware of this new feature in Airflow:
https://issues.apache.org/jira/browse/AIRFLOW-6542. It's a way to use
Airflow to orchestrate Spark applications run via the Spark K8s operator
(https://github.com/GoogleCloudPlatform/spark-on-k8s-operator).



Re: spark-submit exit status on k8s

2020-04-05 Thread Masood Krohy
Another, simpler solution that I just thought of: add an operation at the 
end of your Spark program that writes an empty file somewhere, with the 
filename SUCCESS for example. Add a stage to your Airflow graph that checks 
for this file after running spark-submit. If the file is absent, the Spark 
app must have failed.


The above should work if you want to avoid dealing with the REST API for 
monitoring.
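A minimal sketch of this marker-file approach (assuming a filesystem or path visible to both the Spark driver and the Airflow worker; the function and marker names are illustrative):

```python
import os


def write_success_marker(output_dir: str, name: str = "_SUCCESS") -> str:
    """Call at the very end of the Spark program: create an empty marker file."""
    os.makedirs(output_dir, exist_ok=True)
    path = os.path.join(output_dir, name)
    with open(path, "w"):
        pass  # empty file; only its existence is the signal
    return path


def spark_app_succeeded(output_dir: str, name: str = "_SUCCESS") -> bool:
    """Airflow-side check, run after spark-submit returns: is the marker there?"""
    return os.path.exists(os.path.join(output_dir, name))
```

In an Airflow DAG, the check could run in a task placed right after the spark-submit task, raising an exception (and thus failing the task) when the marker is absent.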


Masood

__

Masood Krohy, Ph.D.
Data Science Advisor|Platform Architect
https://www.analytical.works



Re: spark-submit exit status on k8s

2020-04-04 Thread Masood Krohy
I'm not in the Spark dev team, so I cannot tell you why that priority was 
chosen for the JIRA issue, or whether anyone is about to finish the work on 
it; I'll let others jump in if they know.


Just wanted to offer a potential solution so that you can move ahead in 
the meantime.


Masood

__

Masood Krohy, Ph.D.
Data Science Advisor|Platform Architect
https://www.analytical.works



RE: spark-submit exit status on k8s

2020-04-04 Thread Marshall Markham
Thank you very much, Masood, for your fast response. Last question: is the 
current status in Jira representative of the ticket's standing within the 
project team? This seems like a big deal for the K8s implementation, and we 
were surprised to find it marked as low priority. Is there any discussion of 
picking up this work in the near future?

Thanks,
Marshall



Re: spark-submit exit status on k8s

2020-04-03 Thread Masood Krohy
While you wait for a fix on that JIRA ticket, you may be able to add an 
intermediary step in your Airflow graph that calls Spark's REST API after 
submitting the job, digs into the actual status of the application, and 
makes a success/fail decision accordingly. You can call the REST API 
repeatedly in a loop, with a few seconds' delay between calls, while the 
execution is in progress, until the application fails or succeeds.


https://spark.apache.org/docs/latest/monitoring.html#rest-api
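A rough polling sketch against that endpoint, using only the standard library (the base URL is an assumption for your deployment, and whether success/failure needs a further look at `/applications/<id>/jobs` depends on what your app reports):

```python
import json
import time
import urllib.request


def latest_attempt(app: dict) -> dict:
    """An application object from the REST API carries a list of attempts;
    take the most recent one."""
    return app["attempts"][-1]


def attempt_finished(attempt: dict) -> bool:
    """The REST API marks finished attempts with completed=True."""
    return bool(attempt.get("completed"))


def poll_until_done(api_base: str, app_id: str, delay_s: float = 5.0) -> dict:
    """Poll /api/v1/applications/<id> every few seconds until the current
    attempt completes, then return that attempt for inspection."""
    url = f"{api_base}/api/v1/applications/{app_id}"
    while True:
        with urllib.request.urlopen(url) as resp:
            app = json.load(resp)
        attempt = latest_attempt(app)
        if attempt_finished(attempt):
            return attempt
        time.sleep(delay_s)
```

Note that `completed` only says the attempt ended; deciding success vs. failure may still require checking job statuses under `/applications/<id>/jobs`.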

Hope this helps.

Masood

__

Masood Krohy, Ph.D.
Data Science Advisor|Platform Architect
https://www.analytical.works

On 4/3/20 8:23 AM, Marshall Markham wrote:


Hi Team,

My team recently conducted a POC of Kubernetes/Airflow/Spark with 
great success. The major concern we have about this system, after the 
completion of our POC, is a behavior of spark-submit. When called with 
a Kubernetes API endpoint as master, spark-submit seems to always 
return exit status 0. This is obviously a major issue, preventing us 
from conditioning job graphs on the success or failure of our Spark 
jobs. I found Jira ticket SPARK-27697 under the Apache issues covering 
this bug. The ticket is listed as minor and does not seem to have had 
any recent activity. I would like to upvote it and ask if there is 
anything I can do to move it forward. This could be the one thing 
standing between my team and our preferred batch workload 
implementation. Thank you.


Marshall Markham

Data Engineer

PrecisionLender, a Q2 Company

NOTE: This communication and any attachments are for the sole use of 
the intended recipient(s) and may contain confidential and/or 
privileged information. Any unauthorized review, use, disclosure or 
distribution is prohibited. If you are not the intended recipient, 
please contact the sender by replying to this email, and destroy all 
copies of the original message.