Reviving this to see if others would like to chime in about this "expression language" for config options.
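For anyone skimming the quoted thread below: Dale notes that the proposed attributes are read through the standard JVM APIs. As a rough sketch of what those lookups could amount to (the object name is mine, and using the com.sun.management MXBean for physical memory is an assumption on my part, not necessarily what PR#4937 actually does):

    import java.lang.management.ManagementFactory

    object SystemAttributes {
      private val runtime = Runtime.getRuntime

      // NumCores: cores visible to the JVM (usually == physical machine cores)
      def numCores: Int = runtime.availableProcessors()

      // PhysicalMemoryBytes: memory size of the hosting machine.
      // Assumes the HotSpot-specific com.sun.management MXBean is available.
      def physicalMemoryBytes: Long =
        ManagementFactory.getOperatingSystemMXBean match {
          case os: com.sun.management.OperatingSystemMXBean => os.getTotalPhysicalMemorySize
          case _ => -1L  // unknown on this JVM
        }

      // JVMTotalMemoryBytes / JVMMaxMemoryBytes, straight from Runtime
      def jvmTotalMemoryBytes: Long = runtime.totalMemory()
      def jvmMaxMemoryBytes: Long = runtime.maxMemory()

      // JVMFreeMemoryBytes is defined in the proposal as max - total
      def jvmFreeMemoryBytes: Long = jvmMaxMemoryBytes - jvmTotalMemoryBytes
    }
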
On Fri, Mar 13, 2015 at 7:57 PM, Dale Richardson <dale...@hotmail.com> wrote:
> Mridul,
> I may have added some confusion by giving examples in completely different areas. For example, the number of cores available for tasking on each worker machine is a resource-controller-level configuration variable. In standalone mode (i.e. using Spark's home-grown resource manager) the configuration variable SPARK_WORKER_CORES is an item that Spark admins can set (and we can use expressions for). The equivalent variable for YARN (yarn.nodemanager.resource.cpu-vcores) is only used by YARN's node manager setup and is set by YARN administrators, outside of the control of Spark (and most users). If you are not a cluster administrator then both variables are irrelevant to you. The same goes for SPARK_WORKER_MEMORY.
>
> As for spark.executor.memory: as there is no way to know the attributes of a machine before a task is allocated to it, we cannot use any of the JVMInfo functions. For options like that, the expression parser can easily be limited to supporting only the different byte units of scale (KB/MB/GB etc.) and other configuration variables.
>
> Regards,
> Dale.
>
> > Date: Fri, 13 Mar 2015 17:30:51 -0700
> > Subject: Re: Spark config option 'expression language' feedback request
> > From: mri...@gmail.com
> > To: dale...@hotmail.com
> > CC: dev@spark.apache.org
> >
> > Let me try to rephrase my query.
> > How can a user specify, for example, what the executor memory or the number of cores should be?
> >
> > I don't want a situation where some variables can be specified using one set of idioms (from this PR, for example) and another set cannot be.
> >
> > Regards,
> > Mridul
> >
> > On Fri, Mar 13, 2015 at 4:06 PM, Dale Richardson <dale...@hotmail.com> wrote:
> > > Thanks for your questions, Mridul.
> > > I assume you are referring to how the functionality to query system state works on YARN and Mesos?
> > > The APIs used are the standard JVM APIs, so the functionality will work without change. There is no real use case for using 'physicalMemoryBytes' in these cases though, as the JVM size has already been limited by the resource manager.
> > > Regards,
> > > Dale.
> > >
> > >> Date: Fri, 13 Mar 2015 08:20:33 -0700
> > >> Subject: Re: Spark config option 'expression language' feedback request
> > >> From: mri...@gmail.com
> > >> To: dale...@hotmail.com
> > >> CC: dev@spark.apache.org
> > >>
> > >> I am curious how you are going to support these over Mesos and YARN.
> > >> Any configuration change like this should be applicable to all of them, not just local and standalone modes.
> > >>
> > >> Regards
> > >> Mridul
> > >>
> > >> On Friday, March 13, 2015, Dale Richardson <dale...@hotmail.com> wrote:
> > >> >
> > >> > PR#4937 (https://github.com/apache/spark/pull/4937) is a feature to allow Spark configuration options (whether set on the command line, in an environment variable or in a configuration file) to be specified via a simple expression language.
> > >> >
> > >> > Such a feature has the following end-user benefits:
> > >> >
> > >> > - Allows time intervals or byte quantities to be specified in appropriate and easy-to-follow units, e.g. "1 week" rather than 604800 seconds.
> > >> >
> > >> > - Allows a configuration option to be scaled in relation to system attributes, e.g.
> > >> >     SPARK_WORKER_CORES = numCores - 1
> > >> >     SPARK_WORKER_MEMORY = physicalMemoryBytes - 1.5 GB
> > >> >
> > >> > - Gives the ability to scale multiple configuration options together, e.g.:
> > >> >     spark.driver.memory = 0.75 * physicalMemoryBytes
> > >> >     spark.driver.maxResultSize = spark.driver.memory * 0.8
> > >> >
> > >> > The following functions are currently supported by this PR:
> > >> >     NumCores:            Number of cores assigned to the JVM (usually == physical machine cores)
> > >> >     PhysicalMemoryBytes: Memory size of the hosting machine
> > >> >     JVMTotalMemoryBytes: Current bytes of memory allocated to the JVM
> > >> >     JVMMaxMemoryBytes:   Maximum number of bytes of memory available to the JVM
> > >> >     JVMFreeMemoryBytes:  maxMemoryBytes - totalMemoryBytes
> > >> >
> > >> > I was wondering if anybody on the mailing list has any further ideas on other functions that could be useful when specifying Spark configuration options?
> > >> >
> > >> > Regards,
> > >> > Dale.
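
On the question of byte units and simple arithmetic over these attributes (e.g. "physicalMemoryBytes - 1.5 gb" or "0.75 * physicalMemoryBytes"), here is a minimal, purely illustrative evaluator to make the idea concrete. It is not the grammar from PR#4937; it reuses the SystemAttributes object sketched earlier in this message and only handles a single "*" or "-" between two terms:

    object ConfigExpr {
      private val units = Map("b" -> 1L, "kb" -> 1L << 10, "mb" -> 1L << 20, "gb" -> 1L << 30)

      // A term is an attribute name, a bare number, or "<number> <unit>", e.g. "1.5 gb"
      private def term(t: String): Double = t.trim.toLowerCase match {
        case "numcores"            => SystemAttributes.numCores.toDouble
        case "physicalmemorybytes" => SystemAttributes.physicalMemoryBytes.toDouble
        case "jvmmaxmemorybytes"   => SystemAttributes.jvmMaxMemoryBytes.toDouble
        case s =>
          val parts = s.split("\\s+")
          val n = parts(0).toDouble
          if (parts.length > 1) n * units.getOrElse(parts(1), 1L) else n
      }

      // Evaluates "a", "a * b" or "a - b"; returns a byte (or core) count
      def eval(expr: String): Long = {
        if (expr.contains("*")) {
          val Array(a, b) = expr.split("\\*", 2)
          (term(a) * term(b)).toLong
        } else if (expr.contains("-")) {
          val Array(a, b) = expr.split("-", 2)
          (term(a) - term(b)).toLong
        } else {
          term(expr).toLong
        }
      }
    }

    // e.g. ConfigExpr.eval("0.75 * physicalMemoryBytes")
    //      ConfigExpr.eval("physicalMemoryBytes - 1.5 gb")
    //      ConfigExpr.eval("numCores - 1")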