Github user chesterxgchen commented on the pull request:

    https://github.com/apache/spark/pull/2786#issuecomment-62450789
  
    
    >>We don't officially support calling directly in to the yarn Client at 
this point.
    
    So far, we haven't done this yet. As the communication is one-way push from 
Server to Application. But we won't like to do something in our next release of 
application.  My next PR, would setup of the communication channel to enable 
this possibility. 
    
    
    >>I think I would like to see the second pr or an overall design before 
putting this piece in to see how things really fit together
    
    Technically, the second part PR is not directly related to the this PR, 
event though we used both changes together in our Application. 
    
    In nutshell, the 2nd PR is simply the following: 
    
    1) When Submitting the Spark Job to Yarn,   send the application host and 
port ( akka URL)  in the arguments to the Client class; 
    
    2) In the spark job, try to connect to the Application with the host and 
port to established the handshake. 
    In our case, simply resolve the Akka actor via Akka Selection on given Akka 
URL. 
    
    3)  Once the connection is reestablished, we can now send the message from 
a actor ( created from SparkContext's actor system) to the remote actor 
listener which listen to the message. 
    
    4) The spark job can be defined in a newly defined submain () method ( new 
trait)  and exception throw can be directly caught and send as error message to 
the remote listener before re-throw again. The exception is relayed to 
Application side to stop the calling job and display on the UI
    
    5) All logs can be redirected to both to listener and yarn container 
console ( using PrintStream re-direct to overwrite println) 
    
    6) create a new SparkJob Listener to catch the same job status in the spark 
UI to the listener, which then re-displace the progress and job iteration tasks 
in the application UI with spark tracking URL. 
    
    7) Other application stats (such as error counter) can be send as task 
message, so the application can update the application directly. 
    
    
    I am currently working on isolate the AKKA piece to so that the Netty can 
be used as the communication layer. In this way, large data size can be 
transferred. I will make it configurable for people to plugin other network 
protocols. 
    
    Due to our own release schedule, I was not able to work as fast as I hoped. 
But hope this give you the sense what's the overall PR is about. 
    
    
    
    
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to