1. I've tried with and without escaping the equals sign; it doesn't affect
the results.
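
For concreteness, the two variants I tried look roughly like this (using the
same illustrative property as the examples further down):

spark.driver.extraJavaOptions -Dfoo.bar.baz=23
spark.driver.extraJavaOptions -Dfoo.bar.baz\=23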

2. Yeah, exporting SPARK_SUBMIT_OPTS from spark-env.sh works for getting
system properties set in the local shell (although not for executors).
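
For reference, that's roughly the following (a minimal sketch; again using
the illustrative property from this thread):

== conf/spark-env.sh ==
export SPARK_SUBMIT_OPTS="-Dfoo.bar.baz=23"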

3. We're using the default fine-grained mesos mode, not setting
spark.mesos.coarse, so it doesn't seem immediately related to that ticket.
Should I file a bug report?


On Thu, Jul 31, 2014 at 1:33 AM, Patrick Wendell <pwend...@gmail.com> wrote:

> The third issue may be related to this:
> https://issues.apache.org/jira/browse/SPARK-2022
>
> We can take a look at this during the bug fix period for the 1.1
> release next week. If we come up with a fix we can backport it into
> the 1.0 branch also.
>
> On Wed, Jul 30, 2014 at 11:31 PM, Patrick Wendell <pwend...@gmail.com>
> wrote:
> > Thanks for digging around here. I think there are a few distinct issues.
> >
> > 1. Property values containing the '=' character need it escaped.
> > I was able to load such properties fine as long as I escaped the '='
> > character, but maybe we should document this:
> >
> > == spark-defaults.conf ==
> > spark.foo a\=B
> > == shell ==
> > scala> sc.getConf.get("spark.foo")
> > res2: String = a=B
> >
> > 2. spark.driver.extraJavaOptions, when set in the properties file,
> > doesn't affect the driver when running in client mode (always the case
> > for mesos). We should probably document this. In this case you need to
> > either use --driver-java-options or set SPARK_SUBMIT_OPTS.
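> >
> > For example (an illustrative sketch, reusing the property name from
> > Cody's examples below):
> >
> > $ ./bin/spark-shell --driver-java-options "-Dfoo.bar.baz=23"
> >
> > == or in conf/spark-env.sh ==
> > export SPARK_SUBMIT_OPTS="-Dfoo.bar.baz=23"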
> >
> > 3. Arguments aren't propagated on Mesos (this might be because of the
> > other issues, or a separate bug).
> >
> > - Patrick
> >
> > On Wed, Jul 30, 2014 at 3:10 PM, Cody Koeninger <c...@koeninger.org> wrote:
> >> In addition, spark.executor.extraJavaOptions does not seem to behave as I
> >> would expect; java arguments don't seem to be propagated to executors.
> >>
> >>
> >> $ cat conf/spark-defaults.conf
> >>
> >> spark.master mesos://zk://etl-01.mxstg:2181,etl-02.mxstg:2181,etl-03.mxstg:2181/masters
> >> spark.executor.extraJavaOptions -Dfoo.bar.baz=23
> >> spark.driver.extraJavaOptions -Dfoo.bar.baz=23
> >>
> >>
> >> $ ./bin/spark-shell
> >>
> >> scala> sc.getConf.get("spark.executor.extraJavaOptions")
> >> res0: String = -Dfoo.bar.baz=23
> >>
> >> scala> sc.parallelize(1 to 100).map{ i => (
> >>      |  java.net.InetAddress.getLocalHost.getHostName,
> >>      |  System.getProperty("foo.bar.baz")
> >>      | )}.collect
> >>
> >> res1: Array[(String, String)] = Array((dn-01.mxstg,null),
> >> (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null),
> >> (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null),
> >> (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null),
> >> (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-02.mxstg,null),
> >> (dn-02.mxstg,null), ...
> >>
> >>
> >>
> >> Note that this is a mesos deployment, although I wouldn't expect that to
> >> affect the availability of spark.driver.extraJavaOptions in a local spark
> >> shell.
> >>
> >>
> >> On Wed, Jul 30, 2014 at 4:18 PM, Cody Koeninger <c...@koeninger.org> wrote:
> >>
> >>> Either whitespace or an equals sign is a valid key-value separator in a
> >>> properties file. Here's an example:
> >>>
> >>> $ cat conf/spark-defaults.conf
> >>> spark.driver.extraJavaOptions -Dfoo.bar.baz=23
> >>>
> >>> $ ./bin/spark-shell -v
> >>> Using properties file: /opt/spark/conf/spark-defaults.conf
> >>> Adding default property: spark.driver.extraJavaOptions=-Dfoo.bar.baz=23
> >>>
> >>>
> >>> scala>  System.getProperty("foo.bar.baz")
> >>> res0: String = null
> >>>
> >>>
> >>> If you add double quotes, they end up included in the resulting string
> >>> value.
> >>>
> >>>
> >>> $ cat conf/spark-defaults.conf
> >>> spark.driver.extraJavaOptions "-Dfoo.bar.baz=23"
> >>>
> >>> $ ./bin/spark-shell -v
> >>> Using properties file: /opt/spark/conf/spark-defaults.conf
> >>> Adding default property: spark.driver.extraJavaOptions="-Dfoo.bar.baz=23"
> >>>
> >>> scala>  System.getProperty("foo.bar.baz")
> >>> res0: String = null
> >>>
> >>>
> >>> Neither one of those affects the issue; the underlying problem in my case
> >>> seems to be that bin/spark-class uses the SPARK_SUBMIT_OPTS and
> >>> SPARK_JAVA_OPTS environment variables, but nothing parses
> >>> spark-defaults.conf before the java process is started.
> >>>
> >>> Here's an example of the process running when only spark-defaults.conf is
> >>> being used:
> >>>
> >>> $ ps -ef | grep spark
> >>>
> >>> 514       5182  2058  0 21:05 pts/2    00:00:00 bash ./bin/spark-shell -v
> >>>
> >>> 514       5189  5182  4 21:05 pts/2    00:00:22 /usr/local/java/bin/java
> >>> -cp ::/opt/spark/conf:/opt/spark/lib/spark-assembly-1.0.1-hadoop2.3.0-mr1-cdh5.0.2.jar:/etc/hadoop/conf-mx
> >>> -XX:MaxPermSize=128m -Djava.library.path= -Xms512m -Xmx512m
> >>> org.apache.spark.deploy.SparkSubmit spark-shell -v --class
> >>> org.apache.spark.repl.Main
> >>>
> >>>
> >>> Here's an example of it when the command line --driver-java-options is
> >>> used (and thus things work):
> >>>
> >>>
> >>> $ ps -ef | grep spark
> >>> 514       5392  2058  0 21:15 pts/2    00:00:00 bash ./bin/spark-shell -v
> >>> --driver-java-options -Dfoo.bar.baz=23
> >>>
> >>> 514       5399  5392 80 21:15 pts/2    00:00:06 /usr/local/java/bin/java
> >>> -cp ::/opt/spark/conf:/opt/spark/lib/spark-assembly-1.0.1-hadoop2.3.0-mr1-cdh5.0.2.jar:/etc/hadoop/conf-mx
> >>> -XX:MaxPermSize=128m -Dfoo.bar.baz=23 -Djava.library.path= -Xms512m
> >>> -Xmx512m org.apache.spark.deploy.SparkSubmit spark-shell -v
> >>> --driver-java-options -Dfoo.bar.baz=23 --class org.apache.spark.repl.Main
> >>>
> >>>
> >>>
> >>>
> >>> On Wed, Jul 30, 2014 at 3:43 PM, Patrick Wendell <pwend...@gmail.com>
> >>> wrote:
> >>>
> >>>> Cody - in your example you are using the '=' character, but in our
> >>>> documentation and tests we use whitespace to separate the key and
> >>>> value in the defaults file.
> >>>>
> >>>> docs: http://spark.apache.org/docs/latest/configuration.html
> >>>>
> >>>> spark.driver.extraJavaOptions -Dfoo.bar.baz=23
> >>>>
> >>>> I'm not sure whether the java properties file parser will try to
> >>>> interpret the equals sign. If so, you might need to do this:
> >>>>
> >>>> spark.driver.extraJavaOptions "-Dfoo.bar.baz=23"
> >>>>
> >>>> Do those work for you?
> >>>>
> >>>> On Wed, Jul 30, 2014 at 1:32 PM, Marcelo Vanzin <van...@cloudera.com>
> >>>> wrote:
> >>>> > Hi Cody,
> >>>> >
> >>>> > Could you file a bug for this if there isn't one already?
> >>>> >
> >>>> > For system properties SparkSubmit should be able to read those
> >>>> > settings and do the right thing, but that obviously won't work for
> >>>> > other JVM options... the current code should work fine in cluster mode
> >>>> > though, since the driver is a different process. :-)
> >>>> >
> >>>> >
> >>>> > On Wed, Jul 30, 2014 at 1:12 PM, Cody Koeninger <c...@koeninger.org> wrote:
> >>>> >> We were previously using SPARK_JAVA_OPTS to set java system properties
> >>>> >> via -D.
> >>>> >>
> >>>> >> This was used for properties that varied on a per-deployment-environment
> >>>> >> basis, but needed to be available in the spark shell and workers.
> >>>> >>
> >>>> >> On upgrading to 1.0, we saw that SPARK_JAVA_OPTS had been deprecated, and
> >>>> >> replaced by spark-defaults.conf and command line arguments to spark-submit
> >>>> >> or spark-shell.
> >>>> >>
> >>>> >> However, setting spark.driver.extraJavaOptions and
> >>>> >> spark.executor.extraJavaOptions in spark-defaults.conf is not a
> >>>> >> replacement for SPARK_JAVA_OPTS:
> >>>> >>
> >>>> >>
> >>>> >> $ cat conf/spark-defaults.conf
> >>>> >> spark.driver.extraJavaOptions=-Dfoo.bar.baz=23
> >>>> >>
> >>>> >> $ ./bin/spark-shell
> >>>> >>
> >>>> >> scala> System.getProperty("foo.bar.baz")
> >>>> >> res0: String = null
> >>>> >>
> >>>> >>
> >>>> >> $ ./bin/spark-shell --driver-java-options "-Dfoo.bar.baz=23"
> >>>> >>
> >>>> >> scala> System.getProperty("foo.bar.baz")
> >>>> >> res0: String = 23
> >>>> >>
> >>>> >>
> >>>> >> Looking through the shell scripts for spark-submit and spark-class, I
> >>>> >> can see why this is; parsing spark-defaults.conf from bash could be
> >>>> >> brittle.
> >>>> >>
> >>>> >> But from an ergonomic point of view, it's a step back to go from a
> >>>> >> set-it-and-forget-it configuration in spark-env.sh, to requiring command
> >>>> >> line arguments.
> >>>> >>
> >>>> >> I can solve this with an ad-hoc script to wrap spark-shell with the
> >>>> >> appropriate arguments, but I wanted to bring the issue up to see if
> >>>> >> anyone else had run into it, or had any direction for a general solution
> >>>> >> (beyond parsing java properties files from bash).
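> >>>> >>
> >>>> >> Roughly, such a wrapper might look like this (a hypothetical,
> >>>> >> untested sketch):
> >>>> >>
> >>>> >> #!/bin/bash
> >>>> >> # Keep the per-deployment -D flags in one place and pass them on
> >>>> >> # the command line, since that path is known to work.
> >>>> >> exec /opt/spark/bin/spark-shell \
> >>>> >>   --driver-java-options "-Dfoo.bar.baz=23" "$@"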
> >>>> >
> >>>> >
> >>>> >
> >>>> > --
> >>>> > Marcelo
> >>>>
> >>>
> >>>
>
