GitHub user chesterxgchen opened a pull request:

    https://github.com/apache/spark/pull/2786

    [SPARK-3913] Spark Yarn Client API change to expose Yarn Resource Capacity, 
Yarn Application Listener and KillApplication APIs

    Spark Yarn Client API change to expose Yarn Resource Capacity, Yarn 
Application Listener and KillApplication APIs
    
    When working with Spark in yarn cluster mode, we have the following issues:
    1) We don't know yarn's maximum capacity (memory and cores) before we specify the number of executors and the memory for the Spark driver and executors. If we set a large number, the job can exceed the limit and get killed.
    It would be better to let the application know the yarn resource capacity ahead of time, so the Spark config can be adjusted dynamically.
    2) Once the job has started, we would like some feedback from the yarn application. Currently, the Spark client simply blocks the call and only returns when the job has finished, failed, or been killed.
    If the job runs for a few hours, we have no idea how far it has gone: the progress, resource usage, tracking URL, etc. This pull request will not completely solve issue 2 above, but it does expose the Yarn application status, such as when the job is started, killed, or finished, the tracking URL, and some limited progress reporting (for CDH5 we found the progress only reports 0, 10, and 100%).
    
    I will open another pull request to address Yarn application / Spark job communication; that is not covered here.
    
    3) If we decide to stop the Spark job, the Spark Yarn Client exposes a stop method. But in many cases the stop method does not actually stop the yarn application.
    
      So we need to expose the yarn client's killApplication() API through the Spark client.
    
    The proposed change is to change the Client constructor's first argument from ClientArguments to a function
     YarnResourceCapacity => ClientArguments
    
     where YarnResourceCapacity contains yarn's maximum memory and virtual cores, as well as the overheads.
    
    This allows the application to adjust the memory and core settings accordingly.
    
    Existing applications that ignore the YarnResourceCapacity can wrap their current arguments with a trivial adapter:
    
    def toArgs(capacity: YarnResourceCapacity) = new ClientArguments(...)
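    An application that does want to react to the reported capacity could clamp its executor settings before the arguments are built. The following is a minimal sketch; the field names on YarnResourceCapacity (maxMemoryMB, maxVCores, memoryOverheadMB) are assumptions for illustration, not names taken from this patch:

```scala
// Sketch only: the field names below are assumed, not taken from the patch.
case class YarnResourceCapacity(maxMemoryMB: Int, maxVCores: Int, memoryOverheadMB: Int)

// Clamp the requested executor memory so that memory plus overhead stays
// within what yarn reports as the maximum container size.
def clampExecutorMemory(requestedMB: Int, capacity: YarnResourceCapacity): Int =
  math.min(requestedMB, capacity.maxMemoryMB - capacity.memoryOverheadMB)

// A factory in the proposed shape YarnResourceCapacity => ClientArguments
// would then build the argument list from the clamped value, e.g.:
// def toArgs(capacity: YarnResourceCapacity) =
//   new ClientArguments(Array("--executor-memory",
//     s"${clampExecutorMemory(8192, capacity)}m", ...))
```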
    
     We also define a YarnApplicationListener interface that exposes some of the information from the YarnApplicationReport.
    
      Client.addYarnApplicationListener(listener)
      allows callers to receive callbacks at different states of the application, so they can react accordingly.
    
      For example, the onApplicationInit() callback is invoked when the AppId is available but the application has not yet started. One can use this AppId to kill the application if the run is no longer desired.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/AlpineNow/spark SPARK-3913

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2786.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2786
    
----
commit fd66c16a34af149e16b2af8de742044ea32dd332
Author: chesterxgchen <[email protected]>
Date:   2014-10-12T03:33:05Z

    [SPARK-3913]
    Spark Yarn Client API change to expose Yarn Resource Capacity, Yarn 
Application Listener and KillApplication APIs

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
