Hm. Something must have changed, as it was happening quite consistently and now
I can't get it to reproduce. Thank you for the offer, and if it happens again I
will try grabbing thread dumps and I will see if I can figure out what is going
on.
On Sunday, November 6, 2016 10:02 AM, Aniket Bhatnagar
<[email protected]> wrote:
I doubt it's GC, as you mentioned that the pause is several minutes. Since it's
reproducible in local mode, can you run the Spark application locally and, once
your job is complete (and the application appears paused), take 5 thread dumps
(using jstack or jcmd on the local Spark JVM process) with a 1-second delay
between each dump and attach them? I can take a look.
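In case it helps, here is a rough sketch of one way to collect the dumps with a
small Python script. It assumes the JDK's jstack is on your PATH and that you
already know the PID of the local Spark JVM (for example from jps). The function
name, its parameters, and the output file names are made up for illustration and
are not part of Spark or the JDK.

```python
import subprocess
import time
from pathlib import Path


def take_thread_dumps(pid, count=5, delay_s=1.0, out_dir=".", dump_cmd=("jstack",)):
    """Run `<dump_cmd> <pid>` `count` times, `delay_s` seconds apart.

    Each dump is written to its own text file; the list of files is returned.
    `dump_cmd` defaults to jstack (assumed to be on PATH).
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i in range(count):
        # check=True raises if the dump command fails (e.g. wrong PID).
        result = subprocess.run(
            [*dump_cmd, str(pid)], capture_output=True, text=True, check=True
        )
        path = out / f"threaddump-{pid}-{i + 1}.txt"
        path.write_text(result.stdout)
        paths.append(path)
        if i < count - 1:
            time.sleep(delay_s)  # wait between dumps, but not after the last one
    return paths
```

Calling take_thread_dumps(12345) once the application appears paused (12345
being a placeholder PID) should leave five files that can be attached to a reply.
Comparing them shows which threads stay in the same stack frame across dumps.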
Thanks,
Aniket
On Sun, Nov 6, 2016 at 2:21 PM Michael Johnson <[email protected]> wrote:
Thanks; I tried looking at the thread dumps for the driver and the one executor
that had that option in the UI, but I'm afraid I don't know how to interpret
what I saw... I don't think it could be my code directly, since at this point
my code has all completed? Could GC be taking that long?
(I could also try grabbing the thread dumps and pasting them here, if that
would help?)
On Sunday, November 6, 2016 8:36 AM, Aniket Bhatnagar
<[email protected]> wrote:
To find out what's going on, you can study the thread dumps, either in the
Spark UI or with any other thread dump analysis tool.
Thanks,
Aniket
On Sun, Nov 6, 2016 at 1:31 PM Michael Johnson
<[email protected]> wrote:
I'm doing some processing and then clustering of a small dataset (~150 MB).
Everything seems to work fine, until the end; the last few lines of my program
are log statements, but after printing those, nothing seems to happen for a
long time...many minutes; I'm not usually patient enough to let it go, but I
think one time when I did just wait, it took over an hour (and did eventually
exit on its own). Any ideas on what's happening, or how to troubleshoot?
(This happens both when running locally, in local mode, and on a small cluster
of four 4-processor nodes, each with 15 GB of RAM; in both cases the executors
have 2 GB+ of RAM, and none of the inputs/outputs of any of the stages is more
than 75 MB...)
Thanks,
Michael