Thanks Marcelo! 

The reason I was asking is that I expected my Spark job to be a "map-only"
job; in other words, it should finish once mapPartitions has run over all
partitions, since the job is just mapPartitions() plus count(), and the
mapPartitions function yields a single integer per partition. The first
stage, "count at /root/workspace/**/mapred/aerospike_calculations.py:35",
completed after a reasonably long time, and I expected the job to finish
right after it. To my surprise, a second stage, "collect at
NativeMethodAccessorImpl.java:-2", is running, and it is about as slow as
the first stage.
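
For reference, here is a minimal sketch of the shape of the job; the actual
per-partition logic in aerospike_calculations.py is elided, and
count_records / rdd are illustrative names, not the real code:

    # Hypothetical reconstruction of the job shape, not the actual
    # contents of aerospike_calculations.py.
    def count_records(partition_iter):
        # Yield exactly one integer per partition.
        yield sum(1 for _ in partition_iter)

    total = rdd.mapPartitions(count_records).count()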

I want to know what that second stage is doing.
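
For what it's worth, my reading of pyspark/rdd.py (simplified below, so
treat the exact bodies as an approximation for the Spark version we run) is
that count() itself bottoms out in a collect() of the per-partition counts:

    # Simplified excerpt of pyspark/rdd.py (Spark 1.x), for context.
    def count(self):
        return self.mapPartitions(lambda i: [sum(1 for _ in i)]).sum()

    def sum(self):
        return self.mapPartitions(lambda x: [sum(x)]).fold(0, operator.add)

    def fold(self, zeroValue, op):
        def func(iterator):
            acc = zeroValue
            for obj in iterator:
                acc = op(obj, acc)
            yield acc
        # collect() is what crosses into the JVM via Py4J.
        vals = self.mapPartitions(func).collect()
        return reduce(op, vals, zeroValue)

Since that collect() reaches the JVM through Py4J's reflective method
invocation, I assume that is why the UI reports the call site as
NativeMethodAccessorImpl.java:-2 rather than a Python line.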

================================ UI ================================
Spark Stages
Total Duration: 8.2 h
Scheduling Mode: FIFO
Active Stages: 1
Completed Stages: 2
Failed Stages: 0

Active Stages (1)

Stage Id | Description                                               | Submitted           | Duration | Tasks: Succeeded/Total | Input     | Shuffle Read | Shuffle Write
2        | collect at NativeMethodAccessorImpl.java:-2               | 2015/08/13 16:01:59 | 4.1 h    | 360/2048               | 375.1 GB  |              |

Completed Stages (2)

Stage Id | Description                                               | Submitted           | Duration | Tasks: Succeeded/Total | Input     | Shuffle Read | Shuffle Write
1        | count at /root/workspace/**/aerospike_calculations.py:35  | 2015/08/13 12:02:40 | 7.5 h    | 2048/2048              | 1785.6 GB |              |
0        | first at SerDeUtil.scala:70                                | 2015/08/13 12:02:34 | 4 s      | 1/1                    | 839.0 MB  |              |

Failed Stages (0)

(none)



