On Wed, Nov 16, 2016 at 10:44 AM Aniket Bhatnagar <aniket.bhatna...@gmail.com> wrote:

> Thanks for sharing the thread dump. I had a look at them and couldn't find
> anything unusual. Is there anything in the logs (driver + executor) that
> suggests what's going on? Also, what does the Spark job do, and which
> versions of Spark and Hadoop are you using?
I haven't seen anything in the logs. When I observed the hang before, in local
mode, the last output was a log statement from my own code (I have a log4j
logger and was calling info() on it); that was also the last line of my main()
function. After that there was no more output, from either the driver or the
executors. The pause can be as short as a few minutes or close to an hour,
and as far as I can tell, when the job resumes, the log statements look more
or less normal.

Locally I'm using Spark 2.0.1 built for Hadoop 2.7, without installing Hadoop
itself. Remotely I'm running on Google Cloud Dataproc, which also uses Spark
2.0.1, along with Hadoop 2.7.3. The hang has happened in both environments.

The job loads data from a text file (using SparkContext.textFile()), then
splits each line and converts it into an array of integers. From there I do
some sketching (the data encodes either a tree, a graph, or text, and I
create a fixed-length sketch that probabilistically produces similar results
for similar nodes in the tree/graph). I then do some lightweight clustering
on the sketches and save the cluster assignments to a text file.

For what it's worth, the GC stats in the UI look a bit high (as much as 1
minute of GC over a 15-minute run), but they do not change during the pause
period.

On Wed, Nov 16, 2016 at 2:48 AM Aniket Bhatnagar <aniket.bhatna...@gmail.com> wrote:

> Also, how are you launching the application? Through spark-submit, or by
> creating a Spark context in your app?

I'm calling spark-submit, and within my app I call SparkContext.getOrCreate()
to get a context. I then call sc.textFile() to load my data into an RDD and
perform various actions on it. After seeing some discussion suggesting it
might be necessary, I tried adding a call to sc.stop() at the very end, but
it didn't seem to make a difference. The strange thing is that this behavior
comes and goes.
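For reference, the shape of the driver is roughly the following. This is a minimal PySpark-style sketch, not the actual code: parse_line, run_job, and the paths are placeholders, and the sketching/clustering steps are elided.

```python
def parse_line(line):
    """Split a whitespace-delimited line and convert each field to an int."""
    return [int(tok) for tok in line.split()]


def run_job(sc, in_path, out_path):
    """Driver outline: load text, parse to int arrays, save, then stop.

    `sc` is an already-created SparkContext (via SparkContext.getOrCreate()
    under spark-submit). The sketching and lightweight clustering steps
    described above are elided here.
    """
    rows = sc.textFile(in_path).map(parse_line)
    # ... fixed-length sketching and clustering would go here ...
    rows.map(lambda xs: " ".join(map(str, xs))).saveAsTextFile(out_path)
    sc.stop()  # explicit stop, which I tried adding at the very end
```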
I tried opening the UI, as Pietro suggested, but that didn't seem to trigger
it for me; I haven't figured out what, if anything, makes it happen every
time.

On Wednesday, November 16, 2016 4:41 AM, Pietro Pugni <pietro.pu...@gmail.com> wrote:

> I have the same issue with Spark 2.0.1, Java 1.8.x, and pyspark. I also use
> SparkSQL and JDBC. My application runs locally. It happens only if I connect
> to the UI during Spark execution, and even if I close the browser before the
> execution ends. I observed this behaviour on both macOS Sierra and Red Hat
> 6.7.

It is interesting that you are seeing this too. I can't get it to happen by
using the UI... but I'm also having difficulty making it happen at all right
now (only trying locally at the moment).
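To double-check the GC theory during one of these pauses, I may try turning on verbose GC logging. A hedged sketch of the spark-submit flags (the JVM flags are the Java 8 ones; the rest of the command line is elided):

```shell
spark-submit \
  --conf "spark.driver.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCDateStamps" \
  --conf "spark.executor.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCDateStamps" \
  ...
```

If GC were responsible, the pause should show up as long collections in those logs; if the logs are quiet during the hang, that would point elsewhere.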