GitHub user tigerquoll opened a pull request:
https://github.com/apache/spark/pull/4937
[Spark-6214][CORE] - A simple expression language to specify configuration
options
This is a proposal to allow for configuration options to be specified via a
simple expression language. This language would have the following features:
* Allow for basic arithmetic (+, -, *, /) with support for bracketed expressions and
standard precedence rules.
* Support for, and normalisation of, common units of reference, e.g. MB, GB,
seconds, minutes, hours, days and weeks.
* Allow for the referencing of basic environmental information currently
defined as:
numCores: Number of cores assigned to the JVM
physicalMemoryBytes: Memory size of the hosting machine
JVMTotalMemoryBytes: Current bytes of memory allocated to the JVM
JVMMaxMemoryBytes: Maximum number of bytes of memory available to the JVM
JVMFreeMemoryBytes: JVMMaxMemoryBytes - JVMTotalMemoryBytes
* Allow for the limited referencing of other configuration values when
specifying values. (Other configuration values must be initialised and
explicitly passed into the expression evaluator for this functionality to be
enabled).
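To make the feature list above concrete, here is a minimal sketch of such an evaluator. It is written in Python purely for illustration and is not the PR's implementation. The unit tables (binary multiples for bytes, seconds as the base time unit), the function names and the way variables are passed in are all assumptions.

```python
import re

# Assumed unit tables: binary multiples for bytes, seconds as the time base.
BYTE_UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3}
TIME_UNITS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400, "week": 604800}
TIME_UNITS.update({name + "s": v for name, v in list(TIME_UNITS.items())})

# A number, a (possibly dotted) name, or any other single character.
TOKEN = re.compile(r"\s*(?:(\d+\.?\d*)|([A-Za-z_][\w.]*)|(.))")


def tokenize(text):
    tokens = []
    for number, name, op in TOKEN.findall(text):
        if number:
            tokens.append(("num", float(number)))
        elif name:
            tokens.append(("name", name))
        elif op.strip():  # skip stray whitespace
            tokens.append(("op", op))
    return tokens


def evaluate(text, units, variables=None):
    """Evaluate an expression; `variables` holds caller-supplied values."""
    variables = variables or {}
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else (None, None)

    def advance():
        nonlocal pos
        pos += 1

    def expr():  # lowest precedence: + and -
        value = term()
        while peek() in (("op", "+"), ("op", "-")):
            op = peek()[1]
            advance()
            rhs = term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term():  # higher precedence: * and /
        value = factor()
        while peek() in (("op", "*"), ("op", "/")):
            op = peek()[1]
            advance()
            rhs = factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def factor():
        kind, tok = peek()
        if kind is None:
            raise ValueError("unexpected end of expression")
        advance()
        if kind == "num":
            # A number may carry a unit suffix, which normalises it.
            unit_kind, unit = peek()
            if unit_kind == "name" and unit in units:
                advance()
                return tok * units[unit]
            return tok
        if kind == "name" and tok in variables:
            return variables[tok]
        if (kind, tok) == ("op", "("):
            value = expr()
            if peek() != ("op", ")"):
                raise ValueError("missing closing bracket")
            advance()
            return value
        raise ValueError(f"unexpected token {tok!r}")

    result = expr()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos][1]!r}")
    return result
```

For example, `evaluate("1 week", TIME_UNITS)` normalises to seconds, and `evaluate("physicalMemoryBytes - 1.5 GB", BYTE_UNITS, {"physicalMemoryBytes": 8 * 1024**3})` combines a supplied variable with a unit-bearing literal. Names that were not passed in are rejected, which mirrors the requirement that other values be explicitly handed to the evaluator.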
Such a feature would have the following end-user benefits:
* Allow for flexibility in specifying time intervals or byte quantities
in appropriate and easy to follow units, e.g. 1 week rather than 604800
seconds.
* Have a consistent means of entering configuration information regardless
of the configuration option being added. (E.g. questions such as "is the
particular option specified in ms or seconds?" become irrelevant, because the
user can pick whatever unit makes sense for the magnitude of the value they
are specifying.)
* Allow for the scaling of a configuration option in relation to system
attributes, e.g.:
SPARK_WORKER_CORES = numCores - 1
SPARK_WORKER_MEMORY = physicalMemoryBytes - 1.5 GB
* Being able to scale multiple configuration options together, e.g.:
spark.driver.memory = 0.75 * physicalMemoryBytes
spark.driver.maxResultSize = spark.driver.memory * 0.8
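The last example implies an evaluation order: spark.driver.memory must be resolved before spark.driver.maxResultSize can refer to it. A rough sketch of that idea follows, in Python for illustration only. It borrows Python's own `ast` parser rather than the PR's parser, supports arithmetic and dotted names but not unit suffixes, and the `resolve` helper and its arguments are hypothetical.

```python
import ast
import operator

# The four arithmetic operators the sketch accepts.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}


def dotted_name(node):
    """Rebuild 'a.b.c' from nested Attribute/Name nodes."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if not isinstance(node, ast.Name):
        raise ValueError("unsupported expression")
    parts.append(node.id)
    return ".".join(reversed(parts))


def evaluate(node, known):
    """Evaluate a parsed expression against already-known values."""
    if isinstance(node, ast.Expression):
        return evaluate(node.body, known)
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in OPS:
        left = evaluate(node.left, known)
        right = evaluate(node.right, known)
        return OPS[type(node.op)](left, right)
    if isinstance(node, (ast.Name, ast.Attribute)):
        name = dotted_name(node)
        if name not in known:
            raise KeyError(f"{name} is not initialised yet")
        return known[name]
    raise ValueError("unsupported expression")


def resolve(settings, environment):
    """Resolve (key, expression) pairs in order.

    Each expression may use the environment values and any option
    resolved earlier in the list, but nothing that comes later.
    """
    known = dict(environment)
    resolved = {}
    for key, expression in settings:
        value = evaluate(ast.parse(expression, mode="eval"), known)
        known[key] = resolved[key] = value
    return resolved


settings = [
    ("spark.driver.memory", "0.75 * physicalMemoryBytes"),
    ("spark.driver.maxResultSize", "spark.driver.memory * 0.8"),
]
# The physical memory figure is an assumed value for the example.
result = resolve(settings, {"physicalMemoryBytes": 16 * 1024**3})
```

Listing the two options in the opposite order raises a KeyError, because the referenced option has not yet been initialised when it is needed.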
This PR only contains the implementation of the expression language and
associated unit tests (more than 120 unit tests). If this PR is accepted, the
idea is that moving options over to use this expression language would be done
in one or more follow-up PRs.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/tigerquoll/spark SPARK-6214
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/4937.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #4937
----
commit 49e981005ae736b879f6a4330ec17d2828c74400
Author: Dale <[email protected]>
Date: 2015-02-26T22:53:46Z
Initial checkin of basic expression parser
commit c6f137a0233aace1a1b6221c9ea3b89a4c8929dc
Author: Dale <[email protected]>
Date: 2015-03-01T06:02:30Z
WIP for ByteExpressionParser
commit 07b8b00570c70b474adde81477461bd06658237d
Author: Dale <[email protected]>
Date: 2015-03-01T12:41:12Z
WIP for TimeExpressionParser
commit ae710cdb40d6e28ecee8e2add1d91748920cc182
Author: Dale <[email protected]>
Date: 2015-03-05T21:40:48Z
Code tidy up, added some comments, created factory object
commit 6a32ef9b812259ed42c047332d0f23c68d445531
Author: Dale <[email protected]>
Date: 2015-03-07T06:25:52Z
Added Physical Memory function, more unit tests for quantity objects
commit 5867704a3abcc23cd3f3c56a4de2b1df26390877
Author: Dale <[email protected]>
Date: 2015-03-07T06:26:29Z
Added more unit tests to increase code coverage
commit 5db9ef08e398c371a4dd65a0e2d6c0f5c5b4a6ef
Author: Dale <[email protected]>
Date: 2015-03-07T06:27:21Z
Refactored code
commit 6795a8d70f96c22c229447ee727787389079a474
Author: Dale <[email protected]>
Date: 2015-03-07T06:45:56Z
lint checks now pass
commit c6cb7f13eed00d702c3d29d8e6bbd53f4acf35cb
Author: Dale <[email protected]>
Date: 2015-03-07T06:49:27Z
Quick import tidyup
commit ede092d1f8b1049bb37a0aa022a24c6e0e14b39d
Author: Dale <[email protected]>
Date: 2015-03-07T09:38:44Z
Moved cpuCores to MachineInfoFunctions trait
commit 5fd80e9828a2808f73ed72580827b37ffd69dc82
Author: Dale <[email protected]>
Date: 2015-03-07T09:57:11Z
Updated comments
----