first() is allowed to "run locally": the driver executes the action itself
without launching any tasks, which is why no stage shows up in the web UI.
The same is true of take(n) for sufficiently small n.
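
If you want to compare the actions side by side, here is a minimal sketch.
The object name, the local[2] master, and the sample data are placeholders of
mine, not taken from your job, and the local-execution behaviour described in
the comments assumes a Spark version that logs allowLocal, as yours does.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object FirstVsCollect {
  // Runs the three actions on a small RDD and returns their results.
  def run(sc: SparkContext): (Int, Seq[Int], Int) = {
    // 100 integers spread over 4 partitions (illustrative data only).
    val rdd = sc.parallelize(1 to 100, 4)

    // Submitted with allowLocal=true on versions that support it, so the
    // driver may compute it without launching tasks.
    val head = rdd.first()

    // For small n, take(n) may likewise be computed on the driver.
    val few = rdd.take(3).toSeq

    // collect() reads every partition, so it launches tasks and should
    // appear as a stage in the web UI.
    val total = rdd.collect().length

    (head, few, total)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("first-vs-collect")
    val sc = new SparkContext(conf)
    try println(run(sc)) finally sc.stop()
  }
}
```

While this runs, the stages page on port 4040 should list the collect but,
under the assumption above, not the first or the take.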


On Wed, Feb 19, 2014 at 9:55 AM, David Thomas <dt5434...@gmail.com> wrote:

> If I perform a 'collect' action on the RDD, I can see a new stage getting
> created in the spark web UI (http://master:4040/stages/), but when I do a
> 'first' action, I don't see any stage getting created. However on the
> console I see these lines:
>
> 14/02/19 10:51:31 INFO SparkContext: Starting job: first at xxx.scala:110
> 14/02/19 10:51:31 INFO DAGScheduler: Got job 110 (first at xxx.scala:110) with 1 output partitions (allowLocal=true)
> 14/02/19 10:51:31 INFO DAGScheduler: Final stage: Stage 2 (first at xxx.scala:110)
> 14/02/19 10:51:31 INFO DAGScheduler: Parents of final stage: List()
> 14/02/19 10:51:31 INFO DAGScheduler: Missing parents: List()
>
> So why doesn't the webUI list the stages created when I run the 'first'
> action?
>
