bq.  to get the logs from the data nodes

Minor correction: the logs are collected from machines where node managers
run.
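
One more hint, offered as a guess rather than a diagnosis: the "Command exited with code 143" line in the quoted log fits the common convention where an exit code above 128 means the process was ended by signal (code - 128). Under that assumption, 143 would be signal 15 (SIGTERM), i.e. something asked the executor to stop rather than it crashing on its own. A minimal Python sketch of that decoding (the helper name `describe_exit_code` is made up for illustration):

```python
import signal


def describe_exit_code(code: int) -> str:
    """Describe an exit code, assuming the common 128 + N convention for signal N."""
    if code > 128:
        signum = code - 128
        try:
            # Map the signal number to its symbolic name on this platform.
            name = signal.Signals(signum).name
        except ValueError:
            name = f"unknown signal {signum}"
        return f"terminated by {name}"
    return f"exited with status {code}"


if __name__ == "__main__":
    print(describe_exit_code(143))
```

This only tells you how the process ended, not who sent the signal, so the earlier errors in the logs are still the place to look.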

Cheers

On Wed, Dec 3, 2014 at 3:39 PM, Ganelin, Ilya <[email protected]>
wrote:

>  You want to look further up the stack (there are almost certainly other
> errors before this happens); those earlier errors may give you a better
> idea of what is going on. Also, if you are running on YARN, you can run "yarn
> logs -applicationId <yourAppId>" to get the logs from the data nodes.
>
>
> -----Original Message-----
> From: S. Zhou [[email protected]]
> Sent: Wednesday, December 03, 2014 06:30 PM Eastern Standard Time
> To: [email protected]
> Subject: Spark executor lost
>
>  We are using Spark job server to submit Spark jobs (our Spark version is
> 0.91). After running the Spark job server for a while, we often see the
> following errors (executor lost) in the Spark job server log. As a
> consequence, the Spark driver (allocated inside the Spark job server) gradually
> loses executors, and eventually the Spark job server is no longer able to
> submit jobs. We tried searching for a solution, but so far no luck. Please
> help if you have any ideas. Thanks!
>
> [2014-11-25 01:37:36,250] INFO  parkDeploySchedulerBackend []
> [akka://JobServer/user/context-supervisor/next-staging] - Executor 6
> disconnected, so removing it
> [2014-11-25 01:37:36,252] ERROR cheduler.TaskSchedulerImpl []
> [akka://JobServer/user/context-supervisor/next-staging] - Lost executor 6
> on XXXX: remote Akka client disassociated
> [2014-11-25 01:37:36,252] INFO  ark.scheduler.DAGScheduler [] [] - *Executor
> lost*: 6 (epoch 8)
> [2014-11-25 01:37:36,252] INFO  ge.BlockManagerMasterActor [] [] - Trying
> to remove executor 6 from BlockManagerMaster.
> [2014-11-25 01:37:36,252] INFO  storage.BlockManagerMaster [] [] - Removed
> 6 successfully in removeExecutor
> [2014-11-25 01:37:36,286] INFO  ient.AppClient$ClientActor []
> [akka://JobServer/user/context-supervisor/next-staging] - Executor updated:
> app-20141125002023-0037/6 is now FAILED (Command exited with code 143)
>
