If it is due to a heartbeat problem and the driver explicitly killed the
executors, there should be driver log entries mentioning it, so you
could check the driver log. The container (executor) logs are also
useful: if the container was killed, there will be some signal-related
log lines.
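A quick way to check both is YARN's log aggregation CLI (the application
id below is a placeholder):

    yarn logs -applicationId application_1456789012345_0001 > app.log
    grep -n -i -E "sigterm|killing container" app.log

If the driver removed the executor for missed heartbeats, the driver log
usually carries a HeartbeatReceiver line along the lines of "Removing
executor ... with no recent heartbeats"; if YARN killed the container for
exceeding its memory limit, the nodemanager log typically mentions
"running beyond physical memory limits".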
There was nothing in the nodemanager logs that indicated why the
container was killed.
Here's my guess: since the killed executors were experiencing heavy GC
activity (full GCs) before dying, they most likely failed to respond to
heartbeats to the driver or nodemanager and were killed because of it.
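If that guess holds, one mitigation sketch, assuming the Spark 1.5
defaults (heartbeats every 10s, heartbeat expiry derived from
spark.network.timeout), is to give executors more slack and turn on GC
logging to confirm the full GCs:

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .set("spark.executor.heartbeatInterval", "30s") // default is 10s
      .set("spark.network.timeout", "600s")           // several timeouts, including heartbeat expiry, default to this
      .set("spark.executor.extraJavaOptions",
        "-XX:+PrintGCDetails -XX:+PrintGCTimeStamps") // surface the full GCs in executor stdout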
Using pastebin seems to be better.
The attachment may not go through.
FYI
On Tue, Mar 1, 2016 at 6:07 PM, Jeff Zhang wrote:
Do you mind attaching the whole yarn app log?
On Wed, Mar 2, 2016 at 10:03 AM, Nirav Patel wrote:
Hi Ryan,
I did search "OutOfMemoryError" earlier and just now, but it doesn't
indicate anything else.
Another thing: the job fails at the "saveAsHadoopDataset" call on a huge
RDD. Most of the executors fail at this stage. I don't understand that
either, because that should be a straight write job to
Could you search for "OutOfMemoryError" in the executor logs? It could be
"OutOfMemoryError: Direct buffer memory" or something else.
On Tue, Mar 1, 2016 at 6:23 AM, Nirav Patel wrote:
Hi,
We are using Spark 1.5.2 on YARN. We have a Spark application using
about 15 GB of executor memory and 1500 MB of overhead. However, at a
certain stage we notice high GC time (almost equal to task time). These
executors are bound to get killed at some point. However, the
nodemanager or resource