[jira] [Commented] (SPARK-21023) Ignore to load default properties file is not a good choice from the perspective of system

2017-06-10 Thread Lantao Jin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16045495#comment-16045495
 ] 

Lantao Jin commented on SPARK-21023:


[~cloud_fan], now I see what you mean. Maybe adding a 
{{\-\-merged-properties-file}} option and explaining it in the documentation is 
good enough for this case. We don't need to spend effort ensuring the default 
properties file is always loaded; we just need to make sure Spark users know 
what they are doing.

In the documentation, we can explain the difference between the options:
{quote}
{{\-\-properties-file}}: a user-specified properties file that replaces the 
default properties file.
{{\-\-merged-properties-file}}: a user-specified properties file that is merged 
with the default properties file.
{quote}

I think I should close this JIRA, since its original purpose (ensuring the 
default properties file is loaded) is not an issue. I will file a new one to 
implement the new feature.

> Ignore to load default properties file is not a good choice from the 
> perspective of system
> --
>
> Key: SPARK-21023
> URL: https://issues.apache.org/jira/browse/SPARK-21023
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Submit
>Affects Versions: 2.1.1
>Reporter: Lantao Jin
>Priority: Minor
>
> The default properties file {{spark-defaults.conf}} shouldn't be ignored just 
> because the submit arg {{--properties-file}} is set. The reasons are easy to 
> see:
> * The infrastructure team needs to continually update {{spark-defaults.conf}} 
> when they want to set cluster-wide defaults for tuning purposes.
> * Application developers only want to override the parameters they care 
> about, not the ones they may not even know about (set by the infrastructure 
> team).
> * For most application developers, the purpose of {{\-\-properties-file}} is 
> to avoid passing dozens of {{--conf k=v}} arguments. But if 
> {{spark-defaults.conf}} is ignored, the resulting behaviour is unexpected.
> For example:
> Current implementation:
> ||Property name||Value in default||Value in user-specified||Final value||
> |spark.A|"foo"|"bar"|"bar"|
> |spark.B|"foo"|N/A|N/A|
> |spark.C|N/A|"bar"|"bar"|
> |spark.D|"foo"|"foo"|"foo"|
> |spark.E|"foo"|N/A|N/A|
> |spark.F|"foo"|N/A|N/A|
> Expected implementation:
> ||Property name||Value in default||Value in user-specified||Final value||
> |spark.A|"foo"|"bar"|"bar"|
> |spark.B|"foo"|N/A|"foo"|
> |spark.C|N/A|"bar"|"bar"|
> |spark.D|"foo"|N/A|"foo"|
> |spark.E|"foo"|N/A|"foo"|
> |spark.F|"foo"|N/A|"foo"|
> I can offer a patch to fix it if you think it makes sense.
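The expected table above amounts to a simple overlay: always load the defaults, then let the user-specified file win on matching keys. A minimal sketch with {{java.util.Properties}} (the class and method names here are illustrative, not Spark's actual implementation):

```java
import java.util.Properties;

// Illustrative sketch of the expected merge semantics: user-specified values
// override matching keys (spark.A), while keys set only in the defaults
// (spark.B) survive instead of being dropped.
public class PropertyMerge {
    static Properties merge(Properties defaults, Properties userSpecified) {
        Properties merged = new Properties();
        merged.putAll(defaults);       // spark-defaults.conf values first
        merged.putAll(userSpecified);  // --properties-file wins on conflicts
        return merged;
    }

    public static void main(String[] args) {
        Properties defaults = new Properties();
        defaults.setProperty("spark.A", "foo");
        defaults.setProperty("spark.B", "foo");
        Properties user = new Properties();
        user.setProperty("spark.A", "bar");
        user.setProperty("spark.C", "bar");
        Properties merged = merge(defaults, user);
        System.out.println(merged.getProperty("spark.A")); // bar
        System.out.println(merged.getProperty("spark.B")); // foo
        System.out.println(merged.getProperty("spark.C")); // bar
    }
}
```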



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-21023) Ignore to load default properties file is not a good choice from the perspective of system

2017-06-10 Thread Lantao Jin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16045483#comment-16045483
 ] 

Lantao Jin commented on SPARK-21023:


[~cloud_fan], having both {{--properties-file}} and {{--extra-properties-file}} 
could confuse users. Actually, it already confuses me: what is 
{{--extra-properties-file}} for? [~vanzin]'s suggestion is to not change 
existing behavior, and based on that suggestion I propose adding an environment 
variable, {{SPARK_CONF_REPLACE_ALLOWED}}.




[jira] [Commented] (SPARK-21023) Ignore to load default properties file is not a good choice from the perspective of system

2017-06-10 Thread Wenchen Fan (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16045472#comment-16045472
 ] 

Wenchen Fan commented on SPARK-21023:

Can't we just introduce something like {{--extra-properties-file}} for this new 
feature?




[jira] [Commented] (SPARK-21023) Ignore to load default properties file is not a good choice from the perspective of system

2017-06-10 Thread Lantao Jin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16045452#comment-16045452
 ] 

Lantao Jin commented on SPARK-21023:


If the current behavior should be kept, I can add an environment variable, e.g. 
{{SPARK_CONF_REPLACE_ALLOWED}}, with a default value of "true", and set it in 
the {{childEnv}} map in AbstractCommandBuilder.class:
{code}
  static final String SPARK_CONF_REPLACE_ALLOWED = "SPARK_CONF_REPLACE_ALLOWED";
{code}
Then we can export SPARK_CONF_REPLACE_ALLOWED=false in {{spark-env.sh}} to fix 
this case while keeping the current behavior by default. Generally, 
{{spark-env.sh}} is deployed by the infra team and protected by the Linux file 
permission mechanism.

Of course, a user can export it with any value before submitting. But that 
means the user definitely knows what they want, instead of getting the current 
unexpected result.
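A sketch of how such a flag could gate the behavior (hypothetical: the class, method, and plumbing shown here are illustrative, not existing Spark code):

```java
import java.util.Map;

// Hypothetical sketch: SPARK_CONF_REPLACE_ALLOWED=false (exported e.g. from
// spark-env.sh) would make a user-specified --properties-file merge with the
// defaults instead of replacing them; the default "true" keeps current behavior.
public class SparkConfReplaceCheck {
    static final String SPARK_CONF_REPLACE_ALLOWED = "SPARK_CONF_REPLACE_ALLOWED";

    static boolean replaceAllowed(Map<String, String> env) {
        // Missing variable defaults to "true", preserving today's semantics.
        return Boolean.parseBoolean(env.getOrDefault(SPARK_CONF_REPLACE_ALLOWED, "true"));
    }

    public static void main(String[] args) {
        System.out.println(replaceAllowed(Map.of())); // true
        System.out.println(replaceAllowed(Map.of(SPARK_CONF_REPLACE_ALLOWED, "false"))); // false
    }
}
```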




[jira] [Commented] (SPARK-21023) Ignore to load default properties file is not a good choice from the perspective of system

2017-06-10 Thread Lantao Jin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16045441#comment-16045441
 ] 

Lantao Jin commented on SPARK-21023:


I modified the description, adding two tables to illustrate why I consider this 
a bug, and escalated it to the dev mailing list for discussion.




[jira] [Commented] (SPARK-21023) Ignore to load default properties file is not a good choice from the perspective of system

2017-06-08 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043209#comment-16043209
 ] 

Marcelo Vanzin commented on SPARK-21023:


Then your best bet is a new command line option that implements the behavior 
you want.

> The root cause is this code in {{AbstractCommandBuilder}}:
> {code}
>   private Properties loadPropertiesFile() throws IOException {
>     Properties props = new Properties();
>     File propsFile;
>     if (propertiesFile != null) {
>       // The default conf properties file is not loaded when the app
>       // developer passes --properties-file as a submit arg.
>       propsFile = new File(propertiesFile);
>       checkArgument(propsFile.isFile(), "Invalid properties file '%s'.", propertiesFile);
>     } else {
>       propsFile = new File(getConfDir(), DEFAULT_PROPERTIES_FILE);
>     }
>     // ...
>     return props;
>   }
> {code}
> I can offer a patch to fix it if you think it makes sense.
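A minimal sketch of what such a patch could look like: load the default file first, then overlay the user-specified one (illustrative only; the surrounding class, error handling, and Spark's actual helper methods are omitted):

```java
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.util.Properties;

// Illustrative fix sketch: always load spark-defaults.conf first, then load
// the user-specified file on top so its values win on conflicting keys.
public class MergedLoader {
    static Properties loadMerged(File defaultsFile, File userFile) throws IOException {
        Properties props = new Properties();
        if (defaultsFile != null && defaultsFile.isFile()) {
            try (Reader r = new FileReader(defaultsFile)) {
                props.load(r);  // cluster-wide defaults
            }
        }
        if (userFile != null && userFile.isFile()) {
            try (Reader r = new FileReader(userFile)) {
                props.load(r);  // user values override matching keys
            }
        }
        return props;
    }
}
```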






[jira] [Commented] (SPARK-21023) Ignore to load default properties file is not a good choice from the perspective of system

2017-06-08 Thread Lantao Jin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043191#comment-16043191
 ] 

Lantao Jin commented on SPARK-21023:


I don't think {{--conf}} helps here, because from the infra team's point of 
view, they want their cluster-level configuration to take effect in all jobs 
unless a user overrides it. Does it make sense to add a switch in 
{{spark-env.sh}}?




[jira] [Commented] (SPARK-21023) Ignore to load default properties file is not a good choice from the perspective of system

2017-06-08 Thread Lantao Jin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043179#comment-16043179
 ] 

Lantao Jin commented on SPARK-21023:


{quote}
it may break existing applications
{quote}
I fully understand the risk and want to do the right thing. We need to find a 
way to keep the current behavior by default while making it easy to switch to 
the behavior we want.




[jira] [Commented] (SPARK-21023) Ignore to load default properties file is not a good choice from the perspective of system

2017-06-08 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043176#comment-16043176
 ] 

Marcelo Vanzin commented on SPARK-21023:


bq.  When and where the new config option be set? 

That's what makes that option awkward. It would have to be set in the user 
config or on the command line with {{\-\-conf}}. So it's not that much 
different from a new command line option, other than avoiding one.




[jira] [Commented] (SPARK-21023) Ignore to load default properties file is not a good choice from the perspective of system

2017-06-08 Thread Lantao Jin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043172#comment-16043172
 ] 

Lantao Jin commented on SPARK-21023:


{quote}
Another option is to have a config option
{quote}
Oh, sorry. {{--properties-file}} skips loading the default configuration file. 
When and where would the new config option be set? In spark-env?




[jira] [Commented] (SPARK-21023) Ignore to load default properties file is not a good choice from the perspective of system

2017-06-08 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043168#comment-16043168
 ] 

Marcelo Vanzin commented on SPARK-21023:


bq. The purpose is making the default configuration loaded anytime.

We all understand the purpose. But it breaks the existing behavior, so it may 
break existing applications. That makes your solution, as presented, a 
non-starter.




[jira] [Commented] (SPARK-21023) Ignore to load default properties file is not a good choice from the perspective of system

2017-06-08 Thread Lantao Jin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043164#comment-16043164
 ] 

Lantao Jin commented on SPARK-21023:


{quote}
Another option is to have a config option
{quote}
LGTM




[jira] [Commented] (SPARK-21023) Ignore to load default properties file is not a good choice from the perspective of system

2017-06-08 Thread Lantao Jin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043160#comment-16043160
 ] 

Lantao Jin commented on SPARK-21023:


*The purpose is to make the default configuration load in all cases.* The 
parameters an app developer sets are always fewer than they should be.
For example: an app developer sets spark.executor.instances=100 in their 
properties file. A month later, the infra team upgrades Spark to a new version 
and enables dynamic resource allocation. But the old job cannot pick up the new 
parameters, so dynamic allocation is never enabled for it. That makes the 
cluster harder for the infra team to control and gives the app team worse 
performance.




[jira] [Commented] (SPARK-21023) Ignore to load default properties file is not a good choice from the perspective of system

2017-06-08 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043132#comment-16043132
 ] 

Marcelo Vanzin commented on SPARK-21023:


Another option is to have a config option that controls whether the default 
file is loaded on top of {{--properties-file}}. It avoids adding a new command 
line argument, but is a little more awkward to use.
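The config-flag alternative could be sketched as follows. The key name {{spark.loadDefaults}} is invented here purely for illustration and is not an existing Spark property:

```java
import java.util.Properties;

// Sketch of the config-flag idea: the defaults file is only merged in
// when the user's properties file opts in via a boolean key. The key
// name "spark.loadDefaults" is an assumption, not a real Spark property.
public class LoadWithFlag {
    static Properties resolve(Properties userProps, Properties defaults) {
        boolean loadDefaults =
            Boolean.parseBoolean(userProps.getProperty("spark.loadDefaults", "false"));
        if (!loadDefaults) {
            return userProps;          // current behavior: defaults ignored
        }
        Properties merged = new Properties();
        merged.putAll(defaults);       // cluster defaults as the base
        merged.putAll(userProps);      // user values override on conflict
        return merged;
    }
}
```

The awkwardness Vanzin mentions is visible here: the opt-in lives inside the properties file itself rather than on the command line.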



[jira] [Commented] (SPARK-21023) Ignore to load default properties file is not a good choice from the perspective of system

2017-06-08 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043131#comment-16043131
 ] 

Marcelo Vanzin commented on SPARK-21023:


bq. I suggest to change the current behavior

Yes, and we're saying that should not be done, because it's a change in 
semantics that might cause breakages in people's workflows. Regardless of 
whether the new behavior is better or worse, implementing it is a breaking 
change.

If you want this, you need to implement it in a way that does not change the 
current behavior - e.g., as a new command line argument instead of modifying 
the behavior of the existing one.



[jira] [Commented] (SPARK-21023) Ignore to load default properties file is not a good choice from the perspective of system

2017-06-08 Thread Lantao Jin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043124#comment-16043124
 ] 

Lantao Jin commented on SPARK-21023:


[~vanzin] I suggest changing the current behavior and documenting it: 
{{--properties-file}} will override the values set in {{spark-defaults.conf}}, 
which is equivalent to setting dozens of {{--conf k=v}} flags on the command 
line. Please review; I'm open to any ideas.



[jira] [Commented] (SPARK-21023) Ignore to load default properties file is not a good choice from the perspective of system

2017-06-08 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043097#comment-16043097
 ] 

Apache Spark commented on SPARK-21023:
--

User 'LantaoJin' has created a pull request for this issue:
https://github.com/apache/spark/pull/18243



[jira] [Commented] (SPARK-21023) Ignore to load default properties file is not a good choice from the perspective of system

2017-06-08 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043053#comment-16043053
 ] 

Marcelo Vanzin commented on SPARK-21023:


I thought we had an issue for adding a user-specific config file that is 
loaded on top of the defaults, but I can't find it. In any case, changing the 
current behavior is not really desired, but you can add this as a new feature 
without changing the current behavior.



[jira] [Commented] (SPARK-21023) Ignore to load default properties file is not a good choice from the perspective of system

2017-06-08 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043036#comment-16043036
 ] 

Sean Owen commented on SPARK-21023:
---

Maybe, but it would be a behavior change now, and there are equally strong 
counter-arguments for the current behavior.
