Hi, I'm building a Spark application that loads data from an Elasticsearch cluster (using the latest elasticsearch-hadoop connector) and then performs some calculations on the Spark cluster.
In one case, I call collect on the RDD as soon as it is created (loaded from ES). However, it sometimes hangs on one node (and sometimes more) and doesn't continue. In the web UI, I can see that one node is stuck on scheduler delay and prevents the job from continuing, while the others have finished. Do you have any idea what is going on here? The data being loaded is fairly small, and it is only mapped once to domain objects before being collected.

Thank you