Re: Event logging not working when worker machine terminated

2015-09-09 Thread David Rosenstrauch

Standalone.

On 09/08/2015 11:18 PM, Jeff Zhang wrote:

What cluster mode do you use? Standalone/Yarn/Mesos?


On Wed, Sep 9, 2015 at 11:15 AM, David Rosenstrauch wrote:


Our Spark cluster is configured to write application history event logging
to a directory on HDFS.  This all works fine.  (I've tested it with Spark
shell.)

However, on a large, long-running job that we ran tonight, one of our
machines at the cloud provider had issues and had to be terminated and
replaced in the middle of the job.

The job completed correctly, and shows in state FINISHED in the "Completed
Applications" section of the Spark GUI.  However, when I try to look at the
application's history, the GUI says "Application history not found" and
"Application ... is still in progress".

The reason appears to be the machine that was terminated.  When I click on
the executor list for that job, Spark is showing the executor from the
terminated machine as still in state RUNNING.

Any solution/workaround for this?  BTW, I'm running Spark v1.3.0.

Thanks,

DR
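
For reference, this kind of event logging is controlled by the spark.eventLog.* settings. A minimal sketch in Scala, in the spark-shell style used above (the HDFS directory is illustrative, not taken from this thread):

    import org.apache.spark.{SparkConf, SparkContext}

    // Enable application event logging so the history can be rebuilt later.
    // "hdfs:///spark-events" is an illustrative path, not this cluster's.
    val conf = new SparkConf()
      .setAppName("event-logging-example")
      .set("spark.eventLog.enabled", "true")
      .set("spark.eventLog.dir", "hdfs:///spark-events")
    val sc = new SparkContext(conf)

The same two settings can also be placed in conf/spark-defaults.conf so that every submitted application logs its events.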


Re: Event logging not working when worker machine terminated

2015-09-09 Thread Charles Chao
I have encountered the same problem after migrating from 1.2.2 to 1.3.0.
After some searching, this appears to be a bug introduced in 1.3. Hopefully
it's fixed in 1.4.

Thanks, 

Charles





On 9/9/15, 7:30 AM, "David Rosenstrauch" wrote:

> Standalone.


Re: Event logging not working when worker machine terminated

2015-09-09 Thread David Rosenstrauch
Thanks for the info.  Do you know if there's a ticket already open for 
this issue?  If so, I'd like to monitor it.


Thanks,

DR

On 09/09/2015 11:50 AM, Charles Chao wrote:

I have encountered the same problem after migrating from 1.2.2 to 1.3.0.
After some searching, this appears to be a bug introduced in 1.3. Hopefully
it's fixed in 1.4.

Thanks,

Charles







Re: Event logging not working when worker machine terminated

2015-09-09 Thread Charles Chao
Fixed in 1.3.1

https://issues.apache.org/jira/browse/SPARK-6950

Thanks, 

Charles
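
For anyone stuck on 1.3.0, one background note: in Spark 1.3 the event log is written as a single file under spark.eventLog.dir carrying an ".inprogress" suffix, which is only removed on a clean shutdown, and the "still in progress" message appears to correspond to that suffix never being stripped. A possible workaround is to rename the leftover file by hand so the history loads. A sketch in Scala against the Hadoop FileSystem API, where both the directory and the application id are hypothetical:

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}

    // Hypothetical example: strip the ".inprogress" suffix from a finished
    // application's event log so it is treated as complete.
    // The directory and application id below are made up for illustration.
    val fs = FileSystem.get(new Configuration())
    val dir = "/spark-events"
    val appId = "app-20150908231500-0042"
    fs.rename(new Path(s"$dir/$appId.inprogress"), new Path(s"$dir/$appId"))

This only helps if the final application-end event actually made it into the log; otherwise the replayed history may still render incomplete.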





On 9/9/15, 8:54 AM, "David Rosenstrauch" wrote:

>Thanks for the info.  Do you know if there's a ticket already open for
>this issue?  If so, I'd like to monitor it.


Event logging not working when worker machine terminated

2015-09-08 Thread David Rosenstrauch
Our Spark cluster is configured to write application history event 
logging to a directory on HDFS.  This all works fine.  (I've tested it 
with Spark shell.)


However, on a large, long-running job that we ran tonight, one of our 
machines at the cloud provider had issues and had to be terminated and 
replaced in the middle of the job.


The job completed correctly, and shows in state FINISHED in the 
"Completed Applications" section of the Spark GUI.  However, when I try 
to look at the application's history, the GUI says "Application history 
not found" and "Application ... is still in progress".


The reason appears to be the machine that was terminated.  When I click 
on the executor list for that job, Spark is showing the executor from 
the terminated machine as still in state RUNNING.


Any solution/workaround for this?  BTW, I'm running Spark v1.3.0.

Thanks,

DR
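
As noted earlier in the thread annotations, the "Application ... is still in progress" message appears to correspond to an event log that still carries the ".inprogress" suffix Spark 1.3 appends while an application runs. A quick way to look for such leftover files, sketched in Scala against the Hadoop FileSystem API (the directory name is illustrative):

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}

    // List any event logs still marked in-progress.
    // "/spark-events" is an illustrative directory, not this cluster's.
    val fs = FileSystem.get(new Configuration())
    fs.listStatus(new Path("/spark-events"))
      .filter(_.getPath.getName.endsWith(".inprogress"))
      .foreach(status => println(s"Unfinalized event log: ${status.getPath}"))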




Re: Event logging not working when worker machine terminated

2015-09-08 Thread Jeff Zhang
What cluster mode do you use? Standalone/Yarn/Mesos?


On Wed, Sep 9, 2015 at 11:15 AM, David Rosenstrauch wrote:

> Our Spark cluster is configured to write application history event logging
> to a directory on HDFS. [...]


-- 
Best Regards

Jeff Zhang