Github user chesterxgchen commented on the pull request:
https://github.com/apache/spark/pull/2786#issuecomment-62450789
>>We don't officially support calling directly in to the yarn Client at
this point.
So far, we haven't done this yet. As the communication is one-way push from
Server to Application. But we won't like to do something in our next release of
application. My next PR, would setup of the communication channel to enable
this possibility.
>>I think I would like to see the second pr or an overall design before
putting this piece in to see how things really fit together
Technically, the second part PR is not directly related to the this PR,
event though we used both changes together in our Application.
In nutshell, the 2nd PR is simply the following:
1) When Submitting the Spark Job to Yarn, send the application host and
port ( akka URL) in the arguments to the Client class;
2) In the spark job, try to connect to the Application with the host and
port to established the handshake.
In our case, simply resolve the Akka actor via Akka Selection on given Akka
URL.
3) Once the connection is reestablished, we can now send the message from
a actor ( created from SparkContext's actor system) to the remote actor
listener which listen to the message.
4) The spark job can be defined in a newly defined submain () method ( new
trait) and exception throw can be directly caught and send as error message to
the remote listener before re-throw again. The exception is relayed to
Application side to stop the calling job and display on the UI
5) All logs can be redirected to both to listener and yarn container
console ( using PrintStream re-direct to overwrite println)
6) create a new SparkJob Listener to catch the same job status in the spark
UI to the listener, which then re-displace the progress and job iteration tasks
in the application UI with spark tracking URL.
7) Other application stats (such as error counter) can be send as task
message, so the application can update the application directly.
I am currently working on isolate the AKKA piece to so that the Netty can
be used as the communication layer. In this way, large data size can be
transferred. I will make it configurable for people to plugin other network
protocols.
Due to our own release schedule, I was not able to work as fast as I hoped.
But hope this give you the sense what's the overall PR is about.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]