Re: Does the delegator map task of SparkLauncher need to stay alive until Spark job finishes ?

2016-11-15 Thread Elkhan Dadashov
Thanks for the clarification, Marcelo.

On Tue, Nov 15, 2016 at 6:20 PM Marcelo Vanzin  wrote:

> On Tue, Nov 15, 2016 at 5:57 PM, Elkhan Dadashov wrote:
> > This is confusing in the sense that it suggests the client needs to stay
> > alive for the Spark job to finish successfully.
> >
> > Actually the client can die or finish (in yarn-cluster mode), and the
> > Spark job will still finish successfully.
>
> That's an internal class, and you're looking at an internal javadoc
> that describes how the app handle works. For the app handle to be
> updated, the "client" (i.e. the sub process) needs to stay alive. So
> the javadoc is correct. It has nothing to do with whether the
> application succeeds or not.
>
>
> --
> Marcelo
>


Re: Does the delegator map task of SparkLauncher need to stay alive until Spark job finishes ?

2016-11-15 Thread Marcelo Vanzin
On Tue, Nov 15, 2016 at 5:57 PM, Elkhan Dadashov  wrote:
> This is confusing in the sense that it suggests the client needs to stay
> alive for the Spark job to finish successfully.
>
> Actually the client can die or finish (in yarn-cluster mode), and the
> Spark job will still finish successfully.

That's an internal class, and you're looking at an internal javadoc
that describes how the app handle works. For the app handle to be
updated, the "client" (i.e. the sub process) needs to stay alive. So
the javadoc is correct. It has nothing to do with whether the
application succeeds or not.


-- 
Marcelo




Re: Does the delegator map task of SparkLauncher need to stay alive until Spark job finishes ?

2016-11-15 Thread Elkhan Dadashov
Hi Marcelo,

This part of the JavaDoc is confusing:

https://github.com/apache/spark/blob/master/launcher/src/main/java/org/apache/spark/launcher/LauncherServer.java


"
* In *cluster mode*, this means that the client that launches the
* application *must remain alive for the duration of the application* (or
until the app handle is
* disconnected).
*/
class LauncherServer implements Closeable {
"
This is confusing in the sense that it suggests the client needs to stay
alive for the Spark job to finish successfully.

Actually the client can die or finish (in yarn-cluster mode), and the
Spark job will still finish successfully.

Yes, the client needs to stay alive until the appHandle state is SUBMITTED
(or maybe RUNNING), but not until a final state, unless you want to keep
querying the state of the Spark app through the appHandle.
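
For example, the pattern I have in mind is roughly this (a minimal untested
sketch; the jar path and main class are placeholders):

import java.util.concurrent.CountDownLatch;
import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

public class LaunchAndDetach {
  public static void main(String[] args) throws Exception {
    final CountDownLatch submitted = new CountDownLatch(1);

    SparkAppHandle handle = new SparkLauncher()
        .setAppResource("/path/to/my-app.jar")      // placeholder
        .setMainClass("com.example.MySparkApp")     // placeholder
        .setMaster("yarn")
        .setDeployMode("cluster")
        .startApplication(new SparkAppHandle.Listener() {
          @Override
          public void stateChanged(SparkAppHandle h) {
            // Once the app is SUBMITTED/RUNNING (or already final), the
            // cluster manager owns it; we only need the handle for tracking.
            if (h.getState() == SparkAppHandle.State.SUBMITTED
                || h.getState() == SparkAppHandle.State.RUNNING
                || h.getState().isFinal()) {
              submitted.countDown();
            }
          }

          @Override
          public void infoChanged(SparkAppHandle h) { }
        });

    submitted.await();
    // The appId may still be null this early in the submission.
    System.out.println("appId = " + handle.getAppId() + ", state = " + handle.getState());
    handle.disconnect();   // let this JVM exit; the application itself keeps running
  }
}

That is, once the app is submitted, disconnecting the handle only means losing
the ability to track it from this process, not stopping the application.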

I still do not get the meaning of the comment above.

Thanks.


On Tue, Oct 18, 2016 at 3:07 PM Marcelo Vanzin  wrote:

> On Tue, Oct 18, 2016 at 3:01 PM, Elkhan Dadashov 
> wrote:
> > Does my map task need to wait until Spark job finishes ?
>
> No...
>
> > Or is there any way, my map task finishes after launching Spark job, and I
> > can still query and get status of Spark job outside of map task (or failure
> > reason, if it has failed) ? (maybe by querying Spark job id ?)
>
> ...but if the SparkLauncher handle goes away, then you lose the
> ability to track the app's state, unless you talk directly to the
> cluster manager.
>
> > I guess also if i want my Spark job to be killed, if corresponding delegator
> > map task is killed, that means my map task needs to stay alive, so i still
> > have SparkAppHandle reference ?
>
> Correct, unless you talk directly to the cluster manager.
>
> --
> Marcelo
>


Re: Does the delegator map task of SparkLauncher need to stay alive until Spark job finishes ?

2016-10-28 Thread Elkhan Dadashov
I figured out that the job id returned from sparkAppHandle.getAppId() is a
unique ApplicationId, which looks like this:

for local Spark mode: local-1477184581895
for distributed (YARN) Spark mode: application_1477504900821_0005

ApplicationId represents the globally unique identifier for an application.

The globally unique nature of the identifier is achieved by using the cluster
timestamp, i.e. the start time of the ResourceManager, along with a
monotonically increasing counter for the application.
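
So even after the SparkLauncher handle goes away, the application can be
looked up directly in YARN by that id. A rough sketch (assuming the Hadoop
YARN client libraries are on the classpath; the id string is just the
example above):

import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.util.ConverterUtils;

public class CheckAppStatus {
  public static void main(String[] args) throws Exception {
    // The id returned by sparkAppHandle.getAppId() in yarn mode.
    ApplicationId appId = ConverterUtils.toApplicationId("application_1477504900821_0005");

    YarnClient yarn = YarnClient.createYarnClient();
    yarn.init(new YarnConfiguration());   // picks up yarn-site.xml from the classpath
    yarn.start();
    try {
      ApplicationReport report = yarn.getApplicationReport(appId);
      System.out.println("state = " + report.getYarnApplicationState()
          + ", final status = " + report.getFinalApplicationStatus()
          + ", diagnostics = " + report.getDiagnostics());
    } finally {
      yarn.stop();
    }
  }
}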




On Sat, Oct 22, 2016 at 5:18 PM Elkhan Dadashov 
wrote:

> I found answer regarding logging in the JavaDoc of SparkLauncher:
>
> "Currently, all applications are launched as child processes. The child's
> stdout and stderr are merged and written to a logger (see
> java.util.logging)."
>
> One last question. sparkAppHandle.getAppId() - does this function
> return org.apache.hadoop.mapred.*JobID* which makes it easy tracking in
> Yarn ? Or is appId just the Spark app name we assign ?
>
> If it is JobID, then even if the SparkLauncher handle goes away, by
> talking directly to the cluster manager, i can get Job details.
>
> Thanks.
>
> On Sat, Oct 22, 2016 at 4:53 PM Elkhan Dadashov 
> wrote:
>
> Thanks, Marcelo.
>
> One more question regarding getting logs.
>
> In the previous implementation of SparkLauncher we could read logs from:
>
> sparkLauncher.getInputStream()
> sparkLauncher.getErrorStream()
>
> What is the recommended way of getting logs and logging of Spark execution
> while using sparkLauncher#startApplication()?
>
> Thanks.
>
> On Tue, Oct 18, 2016 at 3:07 PM Marcelo Vanzin 
> wrote:
>
> On Tue, Oct 18, 2016 at 3:01 PM, Elkhan Dadashov 
> wrote:
> > Does my map task need to wait until Spark job finishes ?
>
> No...
>
> > Or is there any way, my map task finishes after launching Spark job, and I
> > can still query and get status of Spark job outside of map task (or failure
> > reason, if it has failed) ? (maybe by querying Spark job id ?)
>
> ...but if the SparkLauncher handle goes away, then you lose the
> ability to track the app's state, unless you talk directly to the
> cluster manager.
>
> > I guess also if i want my Spark job to be killed, if corresponding delegator
> > map task is killed, that means my map task needs to stay alive, so i still
> > have SparkAppHandle reference ?
>
> Correct, unless you talk directly to the cluster manager.
>
> --
> Marcelo
>
>


Re: Does the delegator map task of SparkLauncher need to stay alive until Spark job finishes ?

2016-10-22 Thread Elkhan Dadashov
I found the answer regarding logging in the JavaDoc of SparkLauncher:

"Currently, all applications are launched as child processes. The child's
stdout and stderr are merged and written to a logger (see
java.util.logging)."
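
So something like the following should capture the child's output (a rough
sketch; "my.spark.child", the jar path and the main class are placeholders):

import java.util.logging.ConsoleHandler;
import java.util.logging.Level;
import java.util.logging.Logger;
import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

public class LaunchWithLogging {
  public static void main(String[] args) throws Exception {
    // The child's merged stdout/stderr is written to a java.util.logging logger.
    // Naming it via CHILD_PROCESS_LOGGER_NAME lets us attach our own handler.
    Logger childLog = Logger.getLogger("my.spark.child");   // example logger name
    childLog.setLevel(Level.INFO);
    childLog.addHandler(new ConsoleHandler());

    SparkAppHandle handle = new SparkLauncher()
        .setAppResource("/path/to/my-app.jar")               // placeholder
        .setMainClass("com.example.MySparkApp")              // placeholder
        .setMaster("yarn")
        .setDeployMode("cluster")
        .setConf(SparkLauncher.CHILD_PROCESS_LOGGER_NAME, "my.spark.child")
        .startApplication();

    // Keep this JVM alive so the child output keeps flowing to the logger.
    while (!handle.getState().isFinal()) {
      Thread.sleep(1000);
    }
  }
}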

One last question: sparkAppHandle.getAppId() - does this function return an
org.apache.hadoop.mapred.*JobID*, which would make tracking in Yarn easy? Or
is the appId just the Spark app name we assign?

If it is a JobID, then even if the SparkLauncher handle goes away, I can get
the job details by talking directly to the cluster manager.

Thanks.

On Sat, Oct 22, 2016 at 4:53 PM Elkhan Dadashov 
wrote:

> Thanks, Marcelo.
>
> One more question regarding getting logs.
>
> In the previous implementation of SparkLauncher we could read logs from:
>
> sparkLauncher.getInputStream()
> sparkLauncher.getErrorStream()
>
> What is the recommended way of getting logs and logging of Spark execution
> while using sparkLauncher#startApplication()?
>
> Thanks.
>
> On Tue, Oct 18, 2016 at 3:07 PM Marcelo Vanzin 
> wrote:
>
> On Tue, Oct 18, 2016 at 3:01 PM, Elkhan Dadashov 
> wrote:
> > Does my map task need to wait until Spark job finishes ?
>
> No...
>
> > Or is there any way, my map task finishes after launching Spark job, and I
> > can still query and get status of Spark job outside of map task (or failure
> > reason, if it has failed) ? (maybe by querying Spark job id ?)
>
> ...but if the SparkLauncher handle goes away, then you lose the
> ability to track the app's state, unless you talk directly to the
> cluster manager.
>
> > I guess also if i want my Spark job to be killed, if corresponding delegator
> > map task is killed, that means my map task needs to stay alive, so i still
> > have SparkAppHandle reference ?
>
> Correct, unless you talk directly to the cluster manager.
>
> --
> Marcelo
>
>


Re: Does the delegator map task of SparkLauncher need to stay alive until Spark job finishes ?

2016-10-22 Thread Elkhan Dadashov
Thanks, Marcelo.

One more question regarding getting logs.

In the previous implementation of SparkLauncher we could read logs from:

sparkLauncher.getInputStream()
sparkLauncher.getErrorStream()

What is the recommended way of getting logs and logging of Spark execution
while using sparkLauncher#startApplication()?
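
(To be precise about what I mean by the previous approach: with launch() the
streams come from the returned java.lang.Process, roughly like this - the jar
path and main class below are placeholders:)

import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.spark.launcher.SparkLauncher;

public class LaunchWithStreams {
  public static void main(String[] args) throws Exception {
    // launch() returns a plain java.lang.Process, so its stdout can be read
    // directly (unlike startApplication(), which routes the merged output
    // to java.util.logging).
    Process spark = new SparkLauncher()
        .setAppResource("/path/to/my-app.jar")      // placeholder
        .setMainClass("com.example.MySparkApp")     // placeholder
        .setMaster("yarn")
        .setDeployMode("cluster")
        .launch();

    try (BufferedReader out = new BufferedReader(
        new InputStreamReader(spark.getInputStream()))) {
      String line;
      while ((line = out.readLine()) != null) {
        System.out.println("[spark] " + line);
      }
    }
    // In real code stderr also needs to be drained or redirected,
    // otherwise the child process can block on a full pipe buffer.
    int exitCode = spark.waitFor();
    System.out.println("spark-submit exited with " + exitCode);
  }
}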

Thanks.

On Tue, Oct 18, 2016 at 3:07 PM Marcelo Vanzin  wrote:

> On Tue, Oct 18, 2016 at 3:01 PM, Elkhan Dadashov 
> wrote:
> > Does my map task need to wait until Spark job finishes ?
>
> No...
>
> > Or is there any way, my map task finishes after launching Spark job, and I
> > can still query and get status of Spark job outside of map task (or failure
> > reason, if it has failed) ? (maybe by querying Spark job id ?)
>
> ...but if the SparkLauncher handle goes away, then you lose the
> ability to track the app's state, unless you talk directly to the
> cluster manager.
>
> > I guess also if i want my Spark job to be killed, if corresponding delegator
> > map task is killed, that means my map task needs to stay alive, so i still
> > have SparkAppHandle reference ?
>
> Correct, unless you talk directly to the cluster manager.
>
> --
> Marcelo
>


Re: Does the delegator map task of SparkLauncher need to stay alive until Spark job finishes ?

2016-10-18 Thread Marcelo Vanzin
On Tue, Oct 18, 2016 at 3:01 PM, Elkhan Dadashov  wrote:
> Does my map task need to wait until Spark job finishes ?

No...

> Or is there any way, my map task finishes after launching Spark job, and I
> can still query and get status of Spark job outside of map task (or failure
> reason, if it has failed) ? (maybe by querying Spark job id ?)

...but if the SparkLauncher handle goes away, then you lose the
ability to track the app's state, unless you talk directly to the
cluster manager.

> I guess also if i want my Spark job to be killed, if corresponding delegator
> map task is killed, that means my map task needs to stay alive, so i still
> have SparkAppHandle reference ?

Correct, unless you talk directly to the cluster manager.

-- 
Marcelo




Does the delegator map task of SparkLauncher need to stay alive until Spark job finishes ?

2016-10-18 Thread Elkhan Dadashov
Hi,

Does the delegator map task of SparkLauncher need to stay alive until the
Spark job finishes?

1)
Currently, I have mapper tasks which launch a Spark job via
SparkLauncher#startApplication().

Does my map task need to wait until the Spark job finishes?

Or is there any way for my map task to finish after launching the Spark job,
while I can still query and get the status of the Spark job outside of the
map task (or the failure reason, if it has failed)? (Maybe by querying the
Spark job id?)

2)
I guess also that if I want my Spark job to be killed when the corresponding
delegator map task is killed, then my map task needs to stay alive so that I
still have the SparkAppHandle reference?
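
Roughly the pattern I have in mind, as an untested sketch with placeholder
jar/class names:

import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

public class DelegatorMapTaskSketch {
  public static void main(String[] args) throws Exception {
    final SparkAppHandle handle = new SparkLauncher()
        .setAppResource("/path/to/my-app.jar")      // placeholder
        .setMainClass("com.example.MySparkApp")     // placeholder
        .setMaster("yarn")
        .setDeployMode("cluster")
        .startApplication();

    // If this delegator JVM is shut down, try to take the Spark app down with it.
    // (This only works while the JVM gets a chance to run shutdown hooks,
    // i.e. not on a hard kill -9.)
    Runtime.getRuntime().addShutdownHook(new Thread(new Runnable() {
      public void run() {
        if (!handle.getState().isFinal()) {
          handle.kill();
        }
      }
    }));

    // Stay alive until a final state so the handle keeps receiving updates.
    while (!handle.getState().isFinal()) {
      Thread.sleep(1000);
    }
    System.out.println("Final state: " + handle.getState());
  }
}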

Thanks.