Hi Tim,

(added dev, removed user)

I've created https://issues.apache.org/jira/browse/SPARK-8009 to track this.

-kr, Gerard.

On Sat, May 30, 2015 at 7:10 PM, Tim Chen <t...@mesosphere.io> wrote:

> So it sounds like some generic support for downloadable URIs can solve this
> problem: files that Mesos automatically places in your sandbox so you can
> refer to them.
>
> If so, please file a JIRA; this is a pretty simple fix on the Spark side.
>
> Tim
>
> On Sat, May 30, 2015 at 7:34 AM, andy petrella <andy.petre...@gmail.com>
> wrote:
>
>> Hello,
>>
>> I'm currently exploring DCOS for the spark notebook, and while looking at
>> the spark configuration I found something interesting which is actually
>> converging to what we've discovered:
>>
>> https://github.com/mesosphere/universe/blob/master/repo/packages/S/spark/0/marathon.json
>>
>> So the logging is working fine here because the spark package is using
>> the spark-class which is able to configure the log4j file. But the
>> interesting part comes with the fact that the `uris` parameter is filled in
>> with a downloadable path to the log4j file!
>>
>> However, that's not possible when creating the spark context ourselves and
>> relying on the mesos scheduler backend only, unless the spark.executor.uri
>> (or another one) can take more than one downloadable path.
>>
>> my .2¢
>>
>> andy
>>
>>
>> On Fri, May 29, 2015 at 5:09 PM Gerard Maas <gerard.m...@gmail.com>
>> wrote:
>>
>>> Hi Tim,
>>>
>>> Thanks for the info. We (Andy Petrella and myself) have been diving a
>>> bit deeper into this log config:
>>>
>>> The log line I was referring to is this one (sorry, I provided the
>>> others just for context)
>>>
>>> *Using Spark's default log4j profile:
>>> org/apache/spark/log4j-defaults.properties*
>>>
>>> That line comes from Logging.scala [1] where a default config is loaded
>>> if none is found in the classpath upon the startup of the Spark Mesos
>>> executor in the Mesos sandbox. At that point in time, none of the
>>> application-specific resources have been shipped yet as the executor JVM is
>>> just starting up.
>>> To load a custom configuration file we should have it
>>> already on the sandbox before the executor JVM starts and add it to the
>>> classpath on the startup command. Is that correct?
>>>
>>> For the classpath customization, it looks like it should be possible to
>>> pass a -Dlog4j.configuration property by using the
>>> 'spark.executor.extraClassPath' that will be picked up at [2] and that
>>> should be added to the command that starts the executor JVM, but the
>>> resource must be already on the host before we can do that. Therefore we
>>> also need some means of 'shipping' the log4j.configuration file to the
>>> allocated executor.
>>>
>>> This all boils down to your statement on the need of shipping extra
>>> files to the sandbox. Bottom line: it's currently not possible to specify a
>>> logging config file for your Mesos executor (our log grows by several
>>> GB/day).
>>>
>>> The only workaround I found so far is to open up the Spark assembly,
>>> replace the log4j-defaults.properties and pack it up again. That would
>>> work, although it is kind of rudimentary as we use the same assembly for
>>> many jobs. Probably, accessing the log4j API programmatically should also
>>> work (I didn't try that yet).
>>>
>>> Should we open a JIRA for this functionality?
>>>
>>> -kr, Gerard.
>>>
>>> [1]
>>> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/Logging.scala#L128
>>> [2]
>>> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala#L77
>>>
>>> On Thu, May 28, 2015 at 7:50 PM, Tim Chen <t...@mesosphere.io> wrote:
>>>
>>>> ---------- Forwarded message ----------
>>>> From: Tim Chen <t...@mesosphere.io>
>>>> Date: Thu, May 28, 2015 at 10:49 AM
>>>> Subject: Re: [Streaming] Configure executor logging on Mesos
>>>> To: Gerard Maas <gerard.m...@gmail.com>
>>>>
>>>> Hi Gerard,
>>>>
>>>> The log line you referred to is not Spark logging but Mesos' own
>>>> logging, which uses glog.
>>>>
>>>> Our own executor logs should only contain very few lines though.
>>>>
>>>> Most of the log lines you'll see are from Spark, and they can be
>>>> controlled by specifying a log4j.properties to be downloaded with your
>>>> Mesos task. Alternatively, if you are downloading the Spark executor via
>>>> spark.executor.uri, you can include log4j.properties in that tarball.
>>>>
>>>> I think we probably need some more configurations for the Spark scheduler
>>>> to pick up extra files to be downloaded into the sandbox.
>>>>
>>>> Tim
>>>>
>>>> On Thu, May 28, 2015 at 6:46 AM, Gerard Maas <gerard.m...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I'm trying to control the verbosity of the logs on the Mesos executors
>>>>> with no luck so far. The default behaviour is INFO on stderr dump with an
>>>>> unbounded growth that gets too big at some point.
>>>>>
>>>>> I noticed that when the executor is instantiated, it locates a default
>>>>> log configuration in the spark assembly:
>>>>>
>>>>> I0528 13:36:22.958067 26890 exec.cpp:206] Executor registered on slave
>>>>> 20150528-063307-780930314-5050-8152-S5
>>>>> Spark assembly has been built with Hive, including Datanucleus jars on
>>>>> classpath
>>>>> Using Spark's default log4j profile:
>>>>> org/apache/spark/log4j-defaults.properties
>>>>>
>>>>> So, nothing I provide in my job jar files takes effect in the
>>>>> executor's configuration (I also tried
>>>>> spark.executor.extraClassPath=log4j.properties).
>>>>>
>>>>> How should I configure the log on the executors?
>>>>>
>>>>> thanks, Gerard.
>>>>>
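
For reference, a log4j.properties of the kind Tim suggests shipping with the
executor might look like the sketch below. It assumes log4j 1.2 properties
syntax (the format of Spark's bundled log4j-defaults.properties). The appender
name `rolling`, the file name `spark-executor.log`, and the size limits are
hypothetical values chosen for illustration, not something taken from this
thread. It lowers the root level from INFO to WARN and caps on-disk growth.

```properties
# Hypothetical executor logging config (log4j 1.2 syntax).
# Root level WARN instead of the default INFO; output goes to a size-capped file.
log4j.rootCategory=WARN, rolling

log4j.appender.rolling=org.apache.log4j.RollingFileAppender
# Relative path: assumed to resolve against the executor's working directory,
# i.e. the Mesos sandbox.
log4j.appender.rolling.File=spark-executor.log
# Roll over at 50MB and keep at most 5 old files (illustrative limits).
log4j.appender.rolling.MaxFileSize=50MB
log4j.appender.rolling.MaxBackupIndex=5
log4j.appender.rolling.layout=org.apache.log4j.PatternLayout
log4j.appender.rolling.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
```

As discussed in the thread, the file still has to be in the sandbox and on the
classpath before the executor JVM starts, for example by packing it into the
tarball referenced by spark.executor.uri.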