I have the same issue with Spark 2.0.1, Java 1.8.x and pyspark. I also use SparkSQL and JDBC. My application runs locally. It happens only if I connect to the UI during Spark execution, and it still happens even if I close the browser before the execution ends. I observed this behaviour on both macOS Sierra and Red Hat 6.7.

On 16 Nov 2016 3:09 AM, "Michael Johnson" <mjjohnson....@yahoo.com.invalid> wrote:

> The extremely long hang/pause has started happening again. I've been
> running on a small remote cluster, so I used the UI to grab thread dumps
> rather than doing it from the command line. There seems to be one executor
> still alive, along with the driver; I grabbed 4 thread dumps from each, a
> couple of seconds apart. I'd greatly appreciate any help tracking down
> what's going on! (I've attached them, but I can paste them somewhere if
> that's more convenient.)
>
> Thanks,
> Michael
>
> On Sunday, November 6, 2016 10:49 PM, Michael Johnson <mjjohnson....@yahoo.com.INVALID> wrote:
>
> Hm. Something must have changed, as it was happening quite consistently
> and now I can't get it to reproduce. Thank you for the offer, and if it
> happens again I will try grabbing thread dumps and I will see if I can
> figure out what is going on.
>
> On Sunday, November 6, 2016 10:02 AM, Aniket Bhatnagar <aniket.bhatna...@gmail.com> wrote:
>
> I doubt it's GC as you mentioned that the pause is several minutes. Since
> it's reproducible in local mode, can you run the spark application locally
> and once your job is complete (and application appears paused), can you
> take 5 thread dumps (using jstack or jcmd on the local spark JVM process)
> with 1 second delay between each dump and attach them? I can take a look.
>
> Thanks,
> Aniket
>
> On Sun, Nov 6, 2016 at 2:21 PM Michael Johnson <mjjohnson....@yahoo.com> wrote:
>
> Thanks; I tried looking at the thread dumps for the driver and the one
> executor that had that option in the UI, but I'm afraid I don't know how to
> interpret what I saw... I don't think it could be my code directly, since
> at this point my code has all completed? Could GC be taking that long?
>
> (I could also try grabbing the thread dumps and pasting them here, if that
> would help?)
>
> On Sunday, November 6, 2016 8:36 AM, Aniket Bhatnagar <aniket.bhatna...@gmail.com> wrote:
>
> In order to know what's going on, you can study the thread dumps either
> from spark UI or from any other thread dump analysis tool.
>
> Thanks,
> Aniket
>
> On Sun, Nov 6, 2016 at 1:31 PM Michael Johnson <mjjohnson....@yahoo.com.invalid> wrote:
>
> I'm doing some processing and then clustering of a small dataset (~150
> MB). Everything seems to work fine, until the end; the last few lines of my
> program are log statements, but after printing those, nothing seems to
> happen for a long time...many minutes; I'm not usually patient enough to
> let it go, but I think one time when I did just wait, it took over an hour
> (and did eventually exit on its own). Any ideas on what's happening, or how
> to troubleshoot?
>
> (This happens both when running locally, using the localhost mode, as well
> as on a small cluster with four 4-processor nodes each with 15GB of RAM; in
> both cases the executors have 2GB+ of RAM, and none of the inputs/outputs
> on any of the stages is more than 75 MB...)
>
> Thanks,
> Michael
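
For anyone else hitting this: the procedure Aniket describes in the quoted thread (five thread dumps from the local Spark JVM, one second apart) could be scripted roughly as below. This is only a sketch under a few assumptions: the JDK's jstack is on the PATH, you already know the PID of the driver JVM (for example from jps), and the function name, parameters and output file names are ones I made up for illustration.

```python
import subprocess
import sys
import time
from pathlib import Path


def collect_thread_dumps(pid, count=5, delay=1.0, out_dir=".", dump_cmd=("jstack",)):
    """Run `<dump_cmd> <pid>` `count` times, `delay` seconds apart.

    Returns the list of files written. All names here are hypothetical;
    only the `jstack <pid>` invocation itself comes from the JDK.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i in range(count):
        # check=True raises CalledProcessError if the dump command fails,
        # e.g. because the PID is wrong or the JVM has already exited.
        result = subprocess.run(
            [*dump_cmd, str(pid)], capture_output=True, text=True, check=True
        )
        path = out / f"threaddump-{pid}-{i + 1}.txt"
        path.write_text(result.stdout)
        paths.append(path)
        if i < count - 1:
            time.sleep(delay)  # pause between dumps, not after the last one
    return paths


if __name__ == "__main__":
    # Usage: python collect_dumps.py <pid of the Spark driver JVM>
    for dump_file in collect_thread_dumps(int(sys.argv[1])):
        print(dump_file)
```

Run it once the job has finished and the application appears paused. Comparing the dumps shows which threads stay in the same stack frames across all five samples, and those are the likely place to look for the hang.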