[jira] [Comment Edited] (SPARK-21023) Ignore to load default properties file is not a good choice from the perspective of system

2017-06-10 Thread Lantao Jin (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16045483#comment-16045483 ]

Lantao Jin edited comment on SPARK-21023 at 6/10/17 10:15 AM:
--

[~cloud_fan] Having both {{\-\-properties-file}} and {{\-\-extra-properties-file}} 
could confuse users. Actually, it already confuses me: what would 
{{\-\-extra-properties-file}} be used for? [~vanzin]'s suggestion is to not change 
the existing behavior, and based on that suggestion I propose adding an 
environment variable, {{SPARK_CONF_REPLACE_ALLOWED}}.



> Ignore to load default properties file is not a good choice from the 
> perspective of system
> --
>
> Key: SPARK-21023
> URL: https://issues.apache.org/jira/browse/SPARK-21023
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Submit
>Affects Versions: 2.1.1
>Reporter: Lantao Jin
>Priority: Minor
>
> The default properties file {{spark-defaults.conf}} shouldn't be skipped 
> even when the submit arg {{--properties-file}} is set. The reasons are 
> easy to see:
> * The infrastructure team needs to continually update {{spark-defaults.conf}} 
> when they want to set cluster-wide defaults for tuning purposes.
> * Application developers only want to override the parameters they care 
> about, not others they don't even know about (set by the infrastructure team).
> * The purpose of using {{\-\-properties-file}} for most application 
> developers is to avoid setting dozens of {{--conf k=v}} options. But if 
> {{spark-defaults.conf}} is ignored, the behaviour ends up unexpected.
> For example:
> Current implementation:
> ||Property name||Value in defaults file||Value in user file||Final value||
> |spark.A|"foo"|"bar"|"bar"|
> |spark.B|"foo"|N/A|N/A|
> |spark.C|N/A|"bar"|"bar"|
> |spark.D|"foo"|"foo"|"foo"|
> |spark.E|"foo"|N/A|N/A|
> |spark.F|"foo"|N/A|N/A|
> Expected implementation:
> ||Property name||Value in defaults file||Value in user file||Final value||
> |spark.A|"foo"|"bar"|"bar"|
> |spark.B|"foo"|N/A|"foo"|
> |spark.C|N/A|"bar"|"bar"|
> |spark.D|"foo"|"foo"|"foo"|
> |spark.E|"foo"|N/A|"foo"|
> |spark.F|"foo"|N/A|"foo"|
> I can offer a patch to fix it if you think it makes sense.






[jira] [Comment Edited] (SPARK-21023) Ignore to load default properties file is not a good choice from the perspective of system

2017-06-10 Thread Lantao Jin (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16045452#comment-16045452 ]

Lantao Jin edited comment on SPARK-21023 at 6/10/17 8:18 AM:
-

If the current behavior should be kept, I can add an environment variable, e.g. 
{{SPARK_CONF_REPLACE_ALLOWED}}, with the default value "true". It would be put 
into the {{childEnv}} map in SparkSubmitCommandBuilder at the very beginning:
{code}
  public SparkLauncher setConfReplaceBehavior(String allowed) {
    checkNotNull(allowed, "allowed");
    builder.childEnv.put(SPARK_CONF_REPLACE_ALLOWED, allowed);
    return this;
  }
{code}
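For example, a hypothetical caller (assuming the {{setConfReplaceBehavior}} 
method sketched above were added to SparkLauncher; the app resource and class 
names are illustrative):
{code}
// Hypothetical usage of the proposed API; setConfReplaceBehavior does not
// exist in SparkLauncher today, it is only the method sketched above.
SparkLauncher launcher = new SparkLauncher()
  .setAppResource("/path/to/app.jar")   // illustrative path
  .setMainClass("com.example.MyApp")    // illustrative class name
  .setConfReplaceBehavior("false");     // forbid replacing spark-defaults.conf
{code}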
Then we can export SPARK_CONF_REPLACE_ALLOWED=false in {{spark-env.sh}} to fix 
this case, while keeping the current behavior by default. Generally, 
{{spark-env.sh}} is deployed by the infra team and protected by the Linux file 
permission mechanism.

Of course, a user can export any value before submitting. But that means the 
user definitely knows what they want, instead of getting the current unexpected 
result.
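A minimal sketch of how {{loadPropertiesFile}} in AbstractCommandBuilder could 
honor the flag (assuming the proposed {{SPARK_CONF_REPLACE_ALLOWED}} variable; 
{{loadInto}} is an illustrative helper, not an existing method):
{code}
// Sketch only: SPARK_CONF_REPLACE_ALLOWED is the variable proposed above.
// propertiesFile, getConfDir() and DEFAULT_PROPERTIES_FILE exist in
// AbstractCommandBuilder; loadInto() is a hypothetical helper that reads
// one properties file into the given Properties object.
private Properties loadPropertiesFile() throws IOException {
  Properties props = new Properties();
  boolean replaceAllowed =
      !"false".equalsIgnoreCase(System.getenv("SPARK_CONF_REPLACE_ALLOWED"));
  if (propertiesFile == null || !replaceAllowed) {
    // Load spark-defaults.conf whenever replacing it is not allowed.
    loadInto(new File(getConfDir(), DEFAULT_PROPERTIES_FILE), props);
  }
  if (propertiesFile != null) {
    // User-specified values override any defaults loaded above.
    loadInto(new File(propertiesFile), props);
  }
  return props;
}
{code}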







[jira] [Comment Edited] (SPARK-21023) Ignore to load default properties file is not a good choice from the perspective of system

2017-06-08 Thread Lantao Jin (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043160#comment-16043160 ]

Lantao Jin edited comment on SPARK-21023 at 6/8/17 6:06 PM:


The purpose is to make the default configuration load every time, because the 
parameters an app developer sets are always fewer than they should be.
For example: an app dev sets spark.executor.instances=100 in their properties 
file. One month later, Spark is upgraded to a new version by the infra team and 
dynamic resource allocation is enabled. But the old job cannot pick up the new 
parameters, so the dynamic feature is not enabled for it. This makes the 
cluster harder for the infra team to control and hurts performance for the app 
team.



> Ignore to load default properties file is not a good choice from the 
> perspective of system
> --
>
> Key: SPARK-21023
> URL: https://issues.apache.org/jira/browse/SPARK-21023
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Submit
>Affects Versions: 2.1.1
>Reporter: Lantao Jin
>Priority: Minor
>
> The default properties file {{spark-defaults.conf}} shouldn't be skipped 
> even when the submit arg {{--properties-file}} is set. The reasons are 
> easy to see:
> * The infrastructure team needs to continually update {{spark-defaults.conf}} 
> when they want to set cluster-wide defaults for tuning purposes.
> * Application developers only want to override the parameters they care 
> about, not others they don't even know about (set by the infrastructure team).
> * The purpose of using {{\-\-properties-file}} for most application 
> developers is to avoid setting dozens of {{--conf k=v}} options. But if 
> {{spark-defaults.conf}} is ignored, the behaviour ends up unexpected.
> All of this is caused by the code below:
> {code}
>   private Properties loadPropertiesFile() throws IOException {
>     Properties props = new Properties();
>     File propsFile;
>     if (propertiesFile != null) {
>       // The default conf properties file is not loaded when the app
>       // developer passes --properties-file as a submit arg.
>       propsFile = new File(propertiesFile);
>       checkArgument(propsFile.isFile(), "Invalid properties file '%s'.", propertiesFile);
>     } else {
>       propsFile = new File(getConfDir(), DEFAULT_PROPERTIES_FILE);
>     }
>     //...
>     return props;
>   }
> {code}
> I can offer a patch to fix it if you think it makes sense.






[jira] [Comment Edited] (SPARK-21023) Ignore to load default properties file is not a good choice from the perspective of system

2017-06-08 Thread Lantao Jin (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043124#comment-16043124 ]

Lantao Jin edited comment on SPARK-21023 at 6/8/17 5:45 PM:


[~vanzin] I suggest changing the current behavior and providing documentation 
to illustrate it: \-\-properties-file would override the values already set in 
spark-defaults.conf. It's equivalent to setting dozens of {{--conf k=v}} 
options on the command line. Please review; I'm open to any ideas.
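As a sketch of the intended override semantics (illustrative property names; 
{{java.util.Properties}} chaining is just one way to express "user file wins, 
defaults fill the gaps"):
{code}
import java.util.Properties;

public class MergeDemo {
  public static void main(String[] args) {
    // Values from spark-defaults.conf (illustrative).
    Properties defaults = new Properties();
    defaults.setProperty("spark.A", "foo");
    defaults.setProperty("spark.B", "foo");

    // Values from the user's --properties-file; passing the defaults to the
    // constructor makes them the fallback for keys the user did not set.
    Properties merged = new Properties(defaults);
    merged.setProperty("spark.A", "bar");

    System.out.println(merged.getProperty("spark.A")); // "bar" (user wins)
    System.out.println(merged.getProperty("spark.B")); // "foo" (default kept)
  }
}
{code}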





[jira] [Comment Edited] (SPARK-21023) Ignore to load default properties file is not a good choice from the perspective of system

2017-06-08 Thread Marcelo Vanzin (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043053#comment-16043053 ]

Marcelo Vanzin edited comment on SPARK-21023 at 6/8/17 5:11 PM:


I thought we had an issue for adding a user-specific config file that is loaded 
on top of the defaults, but I can't find it. In any case, changing the current 
behavior is not really desired, but you can add this as a new feature without 
changing the current behavior.
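A minimal sketch of that feature shape, assuming a hypothetical 
{{\-\-extra-properties-file}} flag (backed here by an imagined 
{{extraPropertiesFile}} field; {{load}} is an illustrative helper) that 
overlays a user file on top of the always-loaded defaults while 
{{\-\-properties-file}} keeps its current replace semantics:
{code}
// Sketch only; extraPropertiesFile is a hypothetical field backing a new
// --extra-properties-file flag, and load() is an illustrative helper that
// reads one properties file from disk.
private Properties resolveProperties() throws IOException {
  Properties props = new Properties();
  if (propertiesFile != null) {
    // Current behavior, unchanged: the user file fully replaces the defaults.
    props.putAll(load(new File(propertiesFile)));
  } else {
    props.putAll(load(new File(getConfDir(), DEFAULT_PROPERTIES_FILE)));
    if (extraPropertiesFile != null) {
      // New feature: overlay user-specific values on top of the defaults.
      props.putAll(load(new File(extraPropertiesFile)));
    }
  }
  return props;
}
{code}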





--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org