Hi Preeze,
Is there any designed way that the client connects back to the driver
(still running in YARN) for collecting results at a later stage?
No, there is no support built into Spark for this. For this to happen
seamlessly, the driver would have to start a server (pull model) or send the
results to some other server once the jobs complete (push model), both of
which add complexity to the driver. Alternatively, you can simply poll on the
output files that your application produces; e.g. your driver can write the
result of a count to a file, and the client can poll on that file.
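As a rough sketch of that polling idea (paths and helper names here are hypothetical, and on a real cluster the result file would typically live on HDFS rather than a local path):

```python
import os
import time

def write_result(result: str, path: str) -> None:
    """Driver side: write the result atomically (temp file + rename)
    so the polling client never reads a partially written file."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(result)
    os.rename(tmp, path)  # rename is atomic on POSIX filesystems

def poll_result(path: str, timeout_s: float = 60.0, interval_s: float = 1.0) -> str:
    """Client side: poll until the result file appears, then read it."""
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        if os.path.exists(path):
            with open(path) as f:
                return f.read()
        time.sleep(interval_s)
    raise TimeoutError(f"no result at {path} after {timeout_s}s")
```

The atomic-rename trick matters because the client may poll while the driver is mid-write; writing to a temp file first guarantees the final path only ever contains a complete result.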
-Andrew
2015-01-19 5:59 GMT-08:00 Romi Kuntsman r...@totango.com:
In yarn-client mode it only controls the environment of the executor
launcher.
So you either use yarn-client mode, in which case your app keeps running and
controls the process,
or you use yarn-cluster mode, in which case you send a jar to YARN, and that
jar should contain code to report the result back to you.
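One way "report the result back to you" could look, sketched in Python for brevity (a real Spark driver would be Scala/Java, and the collector endpoint and payload shape here are entirely hypothetical): the client runs a small HTTP collector, and the driver POSTs its result there once the jobs complete (the push model).

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Client side: collected results keyed by a job id of our own choosing.
results = {}

class ResultHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON body the driver sends and stash the result.
        body = self.rfile.read(int(self.headers["Content-Length"]))
        payload = json.loads(body)
        results[payload["job_id"]] = payload["result"]
        self.send_response(200)
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the sketch quiet

def push_result(url: str, job_id: str, result) -> None:
    """Driver side: POST the final result to the client's collector."""
    data = json.dumps({"job_id": job_id, "result": result}).encode()
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    urllib.request.urlopen(req).close()
```

The trade-off Andrew notes applies here: the client must be reachable from the cluster and stay up until the driver finishes, which is extra moving machinery compared with just polling an output file.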
*Romi Kuntsman*, *Big Data Engineer*
http://www.totango.com
On Thu, Jan 15, 2015 at 1:52 PM, preeze etan...@gmail.com wrote:
From the official spark documentation
(http://spark.apache.org/docs/1.2.0/running-on-yarn.html):
In yarn-cluster mode, the Spark driver runs inside an application master
process which is managed by YARN on the cluster, and the client can go away
after initiating the application.
Is there any designed way that the client connects back to the driver
(still running in YARN) for collecting results at a later stage?
--
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/Spark-client-reconnect-to-driver-in-yarn-cluster-deployment-mode-tp10122.html
Sent from the Apache Spark Developers List mailing list archive at
Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org