[GitHub] spark pull request #15009: [SPARK-17443][SPARK-11035] Stop Spark Application...

vanzin Thu, 13 Oct 2016 11:34:04 -0700

Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/15009#discussion_r83280861
  
    --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
    @@ -1189,6 +1217,10 @@ private[spark] class Client(
     private object Client extends Logging {
     
       def main(argStrings: Array[String]) {
    +    mainWithEnv(argStrings, Map() ++ sys.env)
    +  }
    +
    +  def mainWithEnv(argStrings: Array[String], env: Map[String, String]): Unit = {
    --- End diff --
    
    There's something that bothers me about this "env" argument. What is it 
supposed to be? It sounds too much like it should be a custom `sys.env`, but the 
code that calls this tells me otherwise. Reading the rest of the code, it seems 
you're using it both for `SparkConf` entries and as an override for `sys.env`; 
I think it would be better to have separate arguments.
    
    Also, I think it would be better to create an explicit interface for this. 
Like a trait that defines a `sparkMain` method that takes app args and 
Spark-specific args. Something like:
    
    ```
    trait SparkApp {
      this: Singleton =>
    
      def sparkMain(args: Array[String], conf: Map[String, String]): Int
    }
    ```
    
    Looking ahead, I think it would even be good to consider making that trait a 
public interface, and have `conf` be a `SparkConf` (although there are a few 
complications w.r.t. logging before that can happen). That would solve at least 
a couple of problems:
    
    - have an explicit interface for Spark apps instead of overloading Java's 
main()
    - have an explicit exit code for Spark apps (see how messy that is with 
yarn-cluster mode currently)
    
    For your particular change, that trait can remain `private[spark]`, and if 
the class being run does not implement it, you can throw an exception when 
launching an app in-process, kinda like your current code. But I think having 
an explicit interface for this would make your approach both easier to 
understand and easier to extend to other cluster managers / apps in the 
future.
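
    To make the in-process check concrete, here is a rough sketch of what I have in mind. `InProcessLauncher`, `asSparkApp` and `EchoApp` are made-up names for illustration only, not existing Spark code, and the real check would live wherever the app class gets loaded.

    ```scala
    // The trait from the suggestion above; the self-type limits it to Scala objects.
    trait SparkApp {
      this: Singleton =>

      def sparkMain(args: Array[String], conf: Map[String, String]): Int
    }

    // Hypothetical helper: accept only apps that implement the trait.
    object InProcessLauncher {
      def asSparkApp(candidate: AnyRef): SparkApp = candidate match {
        case app: SparkApp => app
        case other =>
          throw new IllegalArgumentException(
            s"${other.getClass.getName} does not implement SparkApp; " +
              "cannot launch it in-process")
      }
    }

    // Hypothetical app: returns an explicit exit code instead of calling System.exit.
    object EchoApp extends SparkApp {
      override def sparkMain(args: Array[String], conf: Map[String, String]): Int = {
        args.foreach(println)
        if (args.isEmpty) 1 else 0
      }
    }
    ```

    The caller would then do something like `InProcessLauncher.asSparkApp(EchoApp).sparkMain(args, conf)` and use the returned `Int` as the app's exit status.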
    
    If adding the trait, you can also make the code in `SparkSubmit` simpler by 
having an implementation of the trait that wraps a regular app that just has a 
`main(String[])` method.
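
    A minimal sketch of such a wrapper, under a few assumptions: `JavaMainApp` is a hypothetical name, the `conf` entries are handed to the wrapped app as system properties, and a `main` that returns normally counts as exit code 0. I've left out the `Singleton` self-type here because the wrapper has to be a class, so the trait would need to relax that restriction (or the wrapper would need a different base type).

    ```scala
    import java.lang.reflect.Modifier

    // Hypothetical trait mirroring the suggestion above, minus the self-type.
    trait SparkApp {
      def sparkMain(args: Array[String], conf: Map[String, String]): Int
    }

    // Hypothetical adapter for a class that only exposes a static main(String[]).
    class JavaMainApp(klass: Class[_]) extends SparkApp {
      override def sparkMain(args: Array[String], conf: Map[String, String]): Int = {
        // Look up main first so that a bad class fails before any side effects.
        val mainMethod = klass.getMethod("main", classOf[Array[String]])
        require(Modifier.isStatic(mainMethod.getModifiers),
          s"${klass.getName}.main must be static")

        // Assumption: a plain main() picks up its configuration from system properties.
        conf.foreach { case (key, value) => System.setProperty(key, value) }

        mainMethod.invoke(null, args)
        0 // a main() that returns normally is treated as success
      }
    }
    ```

    With that in place, `SparkSubmit` could always call `sparkMain`, wrapping classes that don't implement the trait with `new JavaMainApp(klass)`.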


