Hi,
I'm building a Spark application in which I load some data from an
Elasticsearch cluster (using the latest elasticsearch-hadoop connector) and
then perform some calculations on the Spark cluster.

In one case, I use collect on the RDD as soon as it is created (loaded from
ES).
However, it sometimes hangs on one node (and sometimes on more than one) and
doesn't continue.
In the web UI, I can see that one node is stuck on scheduler delay and
prevents the job from continuing
(while the others have finished).

Do you have any idea what is going on here?

The data that is being loaded is fairly small, and only gets mapped once to
domain objects before being collected.

Thank you



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-hangs-on-collect-stuck-on-scheduler-delay-tp24283.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
