[GitHub] spark pull request #18555: [Minor]add checkValue in spark.internal.config ab...

2017-07-08 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/18555#discussion_r126293302
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -35,10 +35,21 @@ package object config {
 
ConfigBuilder(SparkLauncher.DRIVER_EXTRA_LIBRARY_PATH).stringConf.createOptional
 
   private[spark] val DRIVER_USER_CLASS_PATH_FIRST =
-
ConfigBuilder("spark.driver.userClassPathFirst").booleanConf.createWithDefault(false)
+ConfigBuilder("spark.driver.userClassPathFirst")
+  .doc("Whether to give user-added jars precedence over Spark's own 
jars " +
+"when loading classes in the driver. This feature can be used to 
mitigate " +
+"conflicts between Spark's dependencies and user dependencies. " +
+"It is currently an experimental feature. This is used in cluster 
mode only. ")
+  .booleanConf
+  .createWithDefault(false)
 
   private[spark] val DRIVER_MEMORY = ConfigBuilder("spark.driver.memory")
+.doc("Amount of memory to use for the driver process, i.e. " +
+  "where SparkContext is initialized. (e.g. 512, 1g, 2g). ")
 .bytesConf(ByteUnit.MiB)
+.checkValue(v => v > 0 && v <= Int.MaxValue / (1024 * 1024),
--- End diff --

IIUC, should it be `Int.MaxValue / 1024` here?
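
For context, the value being checked is already in MiB because the entry uses
`bytesConf(ByteUnit.MiB)`, so the two candidate bounds work out as follows (a
quick sketch only, with illustrative names):

```scala
// v is expressed in MiB; the two candidate upper bounds evaluate to:
val boundInPatch    = Int.MaxValue / (1024 * 1024)  // = 2047    -> caps the setting at about 2 GiB
val boundInQuestion = Int.MaxValue / 1024           // = 2097151 -> caps the setting at about 2 TiB
```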


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18555: [Minor]add checkValue in spark.internal.config ab...

2017-07-08 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/18555#discussion_r126293500
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -175,6 +246,8 @@ package object config {
 
ConfigBuilder("spark.scheduler.listenerbus.metrics.maxListenerClassesTimed")
   .internal()
   .intConf
+  .checkValue(v => v > 0,
--- End diff --

Please unify the style of all these checks to one form if possible, e.g. `_ > 0`.
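
A minimal, self-contained illustration of the two equivalent styles (the ask is
simply to use the shorter placeholder form everywhere):

```scala
// Both values denote the same Int => Boolean predicate.
val explicitForm: Int => Boolean = v => v > 0
val placeholderForm: Int => Boolean = _ > 0

assert(explicitForm(1) == placeholderForm(1))  // both true
assert(explicitForm(0) == placeholderForm(0))  // both false
```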





[GitHub] spark pull request #18555: [Minor]add checkValue in spark.internal.config ab...

2017-07-08 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/18555#discussion_r126293557
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -231,6 +315,9 @@ package object config {
   private[spark] val BLOCK_MANAGER_PORT = 
ConfigBuilder("spark.blockManager.port")
 .doc("Port to use for the block manager when a more specific setting 
is not provided.")
 .intConf
+.checkValue(v => v == 0 || (1024 <= v && v < 65536),
--- End diff --

Maybe we should let the user decide whether to bind to a privileged port.
Making that assumption here may break some use cases.
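
For example, a looser check that only rejects values outside the valid TCP port
range and leaves the privileged-port decision to the user could look like the
sketch below (illustrative only, reusing the builder calls shown in the diff):

```scala
// Sketch: accept 0 (pick a random port) or any valid TCP port, including
// privileged ones; whether binding below 1024 succeeds is left to the user
// and the OS.
ConfigBuilder("spark.blockManager.port")
  .doc("Port to use for the block manager when a more specific setting is not provided.")
  .intConf
  .checkValue(v => v >= 0 && v <= 65535, "The port should be in the range [0, 65535].")
  .createWithDefault(0)
```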





[GitHub] spark pull request #18555: [Minor]add checkValue in spark.internal.config ab...

2017-07-08 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/18555#discussion_r126293363
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -35,10 +35,21 @@ package object config {
 
ConfigBuilder(SparkLauncher.DRIVER_EXTRA_LIBRARY_PATH).stringConf.createOptional
 
   private[spark] val DRIVER_USER_CLASS_PATH_FIRST =
-
ConfigBuilder("spark.driver.userClassPathFirst").booleanConf.createWithDefault(false)
+ConfigBuilder("spark.driver.userClassPathFirst")
+  .doc("Whether to give user-added jars precedence over Spark's own 
jars " +
+"when loading classes in the driver. This feature can be used to 
mitigate " +
+"conflicts between Spark's dependencies and user dependencies. " +
+"It is currently an experimental feature. This is used in cluster 
mode only. ")
+  .booleanConf
+  .createWithDefault(false)
 
   private[spark] val DRIVER_MEMORY = ConfigBuilder("spark.driver.memory")
+.doc("Amount of memory to use for the driver process, i.e. " +
+  "where SparkContext is initialized. (e.g. 512, 1g, 2g). ")
 .bytesConf(ByteUnit.MiB)
+.checkValue(v => v > 0 && v <= Int.MaxValue / (1024 * 1024),
+  "The driver memory must be greater than 0MB " +
--- End diff --

Is it more accurate to use MiB instead of MB here? Also, please add whitespace,
like `0 MiB`.
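
In other words, the message might read something like this (a sketch of the
suggested unit and spacing only):

```scala
val suggestedMessage = "The driver memory must be greater than 0 MiB " +
  s"and less than ${Int.MaxValue / (1024 * 1024)} MiB."
```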





[GitHub] spark pull request #18555: [Minor]add checkValue in spark.internal.config ab...

2017-07-08 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/18555#discussion_r126293375
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -51,29 +62,63 @@ package object config {
 
ConfigBuilder(SparkLauncher.EXECUTOR_EXTRA_LIBRARY_PATH).stringConf.createOptional
 
   private[spark] val EXECUTOR_USER_CLASS_PATH_FIRST =
-
ConfigBuilder("spark.executor.userClassPathFirst").booleanConf.createWithDefault(false)
+ConfigBuilder("spark.executor.userClassPathFirst")
+  .doc("Same functionality as spark.driver.userClassPathFirst, " +
+"but applied to executor instances.")
+  .booleanConf
+  .createWithDefault(false)
 
   private[spark] val EXECUTOR_MEMORY = 
ConfigBuilder("spark.executor.memory")
+.doc("Amount of memory to use per executor process (e.g. 512, 1g, 2g, 
8g).")
 .bytesConf(ByteUnit.MiB)
+.checkValue(v => v > 0 && v <= Int.MaxValue / (1024 * 1024),
--- End diff --

Also here.





[GitHub] spark pull request #18555: [Minor]add checkValue in spark.internal.config ab...

2017-07-08 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/18555#discussion_r126292668
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -51,29 +62,63 @@ package object config {
 
ConfigBuilder(SparkLauncher.EXECUTOR_EXTRA_LIBRARY_PATH).stringConf.createOptional
 
   private[spark] val EXECUTOR_USER_CLASS_PATH_FIRST =
-
ConfigBuilder("spark.executor.userClassPathFirst").booleanConf.createWithDefault(false)
+ConfigBuilder("spark.executor.userClassPathFirst")
+  .doc("Same functionality as spark.driver.userClassPathFirst, " +
+"but applied to executor instances.")
+  .booleanConf
+  .createWithDefault(false)
 
   private[spark] val EXECUTOR_MEMORY = 
ConfigBuilder("spark.executor.memory")
+.doc("Amount of memory to use per executor process (e.g. 512, 1g, 2g, 
8g).")
 .bytesConf(ByteUnit.MiB)
+.checkValue(v => v > 0 && v <= Int.MaxValue / (1024 * 1024),
+  "The executor memory must be greater than 0M " +
+  s"and less than ${Int.MaxValue / (1024 * 1024)}M.")
 .createWithDefaultString("1g")
 
-  private[spark] val IS_PYTHON_APP = 
ConfigBuilder("spark.yarn.isPython").internal()
-.booleanConf.createWithDefault(false)
+  private[spark] val IS_PYTHON_APP = ConfigBuilder("spark.yarn.isPython")
+.internal()
+.booleanConf
+.createWithDefault(false)
 
-  private[spark] val CPUS_PER_TASK = 
ConfigBuilder("spark.task.cpus").intConf.createWithDefault(1)
+  private[spark] val CPUS_PER_TASK = ConfigBuilder("spark.task.cpus")
+.doc("Number of cores to allocate for each task. ")
+.intConf
+.checkValue(v => v > 0,
+  "Number of cores to allocate for task event queue must not be 
negative")
+.createWithDefault(1)
 
   private[spark] val DYN_ALLOCATION_MIN_EXECUTORS =
-
ConfigBuilder("spark.dynamicAllocation.minExecutors").intConf.createWithDefault(0)
+ConfigBuilder("spark.dynamicAllocation.minExecutors")
+  .doc("Lower bound for the number of executors if dynamic allocation 
is enabled.")
+  .intConf
+  .checkValue(v => v >= 0,
--- End diff --

Actually, configurations like `spark.dynamicAllocation.minExecutors` already
have validation logic in the code that honors them (ExecutorAllocationManager),
so checking here again would be duplication.

Also, one limitation of this validation is that it can only verify the validity
of a single configuration, but some confs like
`spark.dynamicAllocation.minExecutors` should be verified together with other
confs, e.g. `spark.dynamicAllocation.minExecutors <= .initialExecutors <=
xxx.maxExecutors`.
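
As a sketch of the kind of cross-conf check a per-entry `checkValue` cannot
express (illustrative names and messages, not the actual
ExecutorAllocationManager code):

```scala
// A ConfigBuilder.checkValue only ever sees one value at a time, so a
// relationship like min <= initial <= max has to be verified where all
// three values are available.
def validateDynamicAllocationBounds(min: Int, initial: Int, max: Int): Unit = {
  require(min >= 0, "spark.dynamicAllocation.minExecutors must be >= 0")
  require(min <= initial && initial <= max,
    "spark.dynamicAllocation.minExecutors <= initialExecutors <= maxExecutors must hold")
}

// e.g. validateDynamicAllocationBounds(min = 0, initial = 2, max = 10)
```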





[GitHub] spark pull request #18555: [Minor]add checkValue in spark.internal.config ab...

2017-07-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18555#discussion_r126274206
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -88,50 +137,76 @@ package object config {
 .createOptional
 
   private[spark] val PY_FILES = ConfigBuilder("spark.submit.pyFiles")
+.doc("Comma-separated list of .zip, .egg, or .py files to place on " +
+  "the PYTHONPATH for Python apps.")
 .internal()
 .stringConf
 .toSequence
 .createWithDefault(Nil)
 
   private[spark] val MAX_TASK_FAILURES =
 ConfigBuilder("spark.task.maxFailures")
+  .doc("Number of failures of any particular task before giving up on 
the job. " +
+"The total number of failures spread across different tasks " +
+"will not cause the job to fail; " +
+"a particular task has to fail this number of attempts. " +
+"Should be greater than or equal to 1. Number of allowed retries = 
this value - 1.")
   .intConf
+  .checkValue(_ > 0, "The retry times of task should be greater than 
or equal to 1")
   .createWithDefault(4)
 
   // Blacklist confs
   private[spark] val BLACKLIST_ENABLED =
 ConfigBuilder("spark.blacklist.enabled")
+  .doc("If set to 'true', prevent Spark from scheduling tasks on 
executors " +
+"that have been blacklisted due to too many task failures. " +
+"The blacklisting algorithm can be further controlled by " +
+"the other 'spark.blacklist' configuration options. ")
   .booleanConf
   .createOptional
 
   private[spark] val MAX_TASK_ATTEMPTS_PER_EXECUTOR =
 ConfigBuilder("spark.blacklist.task.maxTaskAttemptsPerExecutor")
+  .doc("For a given task, how many times it can be " +
+"retried on one executor before the executor is blacklisted for 
that task.")
   .intConf
+  .checkValue(_ > 0, "The value should be greater than or equal to 1")
   .createWithDefault(1)
 
   private[spark] val MAX_TASK_ATTEMPTS_PER_NODE =
 ConfigBuilder("spark.blacklist.task.maxTaskAttemptsPerNode")
+  .doc("For a given task, how many times it can be " +
+"retried on one node, before the entire node is blacklisted for 
that task.")
   .intConf
+  .checkValue(_ > 0, "The value should be greater than or equal to 1")
   .createWithDefault(2)
 
   private[spark] val MAX_FAILURES_PER_EXEC =
 ConfigBuilder("spark.blacklist.application.maxFailedTasksPerExecutor")
   .intConf
+  .checkValue(_ > 0, "The value should be greater than or equal to 1")
   .createWithDefault(2)
 
   private[spark] val MAX_FAILURES_PER_EXEC_STAGE =
 ConfigBuilder("spark.blacklist.stage.maxFailedTasksPerExecutor")
+  .doc("How many different tasks must fail on one executor, " +
+"within one stage, before the executor is blacklisted for that 
stage.")
   .intConf
+  .checkValue(_ > 0, "The value should be greater than or equal to 1")
   .createWithDefault(2)
 
   private[spark] val MAX_FAILED_EXEC_PER_NODE =
 ConfigBuilder("spark.blacklist.application.maxFailedExecutorsPerNode")
   .intConf
+  .checkValue(_ > 0, "The value should be greater than or equal to 1")
   .createWithDefault(2)
 
   private[spark] val MAX_FAILED_EXEC_PER_NODE_STAGE =
 ConfigBuilder("spark.blacklist.stage.maxFailedExecutorsPerNode")
+  .doc("How many different executors are marked as blacklisted for a 
given stage, " +
+"before the entire node is marked as failed for the stage.")
   .intConf
+  .checkValue(_ > 0, "The value should be greater than or equal to 1")
--- End diff --

ditto





[GitHub] spark pull request #18555: [Minor]add checkValue in spark.internal.config ab...

2017-07-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18555#discussion_r126274201
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -88,50 +137,76 @@ package object config {
 .createOptional
 
   private[spark] val PY_FILES = ConfigBuilder("spark.submit.pyFiles")
+.doc("Comma-separated list of .zip, .egg, or .py files to place on " +
+  "the PYTHONPATH for Python apps.")
 .internal()
 .stringConf
 .toSequence
 .createWithDefault(Nil)
 
   private[spark] val MAX_TASK_FAILURES =
 ConfigBuilder("spark.task.maxFailures")
+  .doc("Number of failures of any particular task before giving up on 
the job. " +
+"The total number of failures spread across different tasks " +
+"will not cause the job to fail; " +
+"a particular task has to fail this number of attempts. " +
+"Should be greater than or equal to 1. Number of allowed retries = 
this value - 1.")
   .intConf
+  .checkValue(_ > 0, "The retry times of task should be greater than 
or equal to 1")
   .createWithDefault(4)
 
   // Blacklist confs
   private[spark] val BLACKLIST_ENABLED =
 ConfigBuilder("spark.blacklist.enabled")
+  .doc("If set to 'true', prevent Spark from scheduling tasks on 
executors " +
+"that have been blacklisted due to too many task failures. " +
+"The blacklisting algorithm can be further controlled by " +
+"the other 'spark.blacklist' configuration options. ")
   .booleanConf
   .createOptional
 
   private[spark] val MAX_TASK_ATTEMPTS_PER_EXECUTOR =
 ConfigBuilder("spark.blacklist.task.maxTaskAttemptsPerExecutor")
+  .doc("For a given task, how many times it can be " +
+"retried on one executor before the executor is blacklisted for 
that task.")
   .intConf
+  .checkValue(_ > 0, "The value should be greater than or equal to 1")
   .createWithDefault(1)
 
   private[spark] val MAX_TASK_ATTEMPTS_PER_NODE =
 ConfigBuilder("spark.blacklist.task.maxTaskAttemptsPerNode")
+  .doc("For a given task, how many times it can be " +
+"retried on one node, before the entire node is blacklisted for 
that task.")
   .intConf
+  .checkValue(_ > 0, "The value should be greater than or equal to 1")
   .createWithDefault(2)
 
   private[spark] val MAX_FAILURES_PER_EXEC =
 ConfigBuilder("spark.blacklist.application.maxFailedTasksPerExecutor")
   .intConf
+  .checkValue(_ > 0, "The value should be greater than or equal to 1")
--- End diff --

ditto





[GitHub] spark pull request #18555: [Minor]add checkValue in spark.internal.config ab...

2017-07-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18555#discussion_r126274200
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -88,50 +137,76 @@ package object config {
 .createOptional
 
   private[spark] val PY_FILES = ConfigBuilder("spark.submit.pyFiles")
+.doc("Comma-separated list of .zip, .egg, or .py files to place on " +
+  "the PYTHONPATH for Python apps.")
 .internal()
 .stringConf
 .toSequence
 .createWithDefault(Nil)
 
   private[spark] val MAX_TASK_FAILURES =
 ConfigBuilder("spark.task.maxFailures")
+  .doc("Number of failures of any particular task before giving up on 
the job. " +
+"The total number of failures spread across different tasks " +
+"will not cause the job to fail; " +
+"a particular task has to fail this number of attempts. " +
+"Should be greater than or equal to 1. Number of allowed retries = 
this value - 1.")
   .intConf
+  .checkValue(_ > 0, "The retry times of task should be greater than 
or equal to 1")
   .createWithDefault(4)
 
   // Blacklist confs
   private[spark] val BLACKLIST_ENABLED =
 ConfigBuilder("spark.blacklist.enabled")
+  .doc("If set to 'true', prevent Spark from scheduling tasks on 
executors " +
+"that have been blacklisted due to too many task failures. " +
+"The blacklisting algorithm can be further controlled by " +
+"the other 'spark.blacklist' configuration options. ")
   .booleanConf
   .createOptional
 
   private[spark] val MAX_TASK_ATTEMPTS_PER_EXECUTOR =
 ConfigBuilder("spark.blacklist.task.maxTaskAttemptsPerExecutor")
+  .doc("For a given task, how many times it can be " +
+"retried on one executor before the executor is blacklisted for 
that task.")
   .intConf
+  .checkValue(_ > 0, "The value should be greater than or equal to 1")
   .createWithDefault(1)
 
   private[spark] val MAX_TASK_ATTEMPTS_PER_NODE =
 ConfigBuilder("spark.blacklist.task.maxTaskAttemptsPerNode")
+  .doc("For a given task, how many times it can be " +
+"retried on one node, before the entire node is blacklisted for 
that task.")
   .intConf
+  .checkValue(_ > 0, "The value should be greater than or equal to 1")
--- End diff --

ditto





[GitHub] spark pull request #18555: [Minor]add checkValue in spark.internal.config ab...

2017-07-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18555#discussion_r126274204
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -88,50 +137,76 @@ package object config {
 .createOptional
 
   private[spark] val PY_FILES = ConfigBuilder("spark.submit.pyFiles")
+.doc("Comma-separated list of .zip, .egg, or .py files to place on " +
+  "the PYTHONPATH for Python apps.")
 .internal()
 .stringConf
 .toSequence
 .createWithDefault(Nil)
 
   private[spark] val MAX_TASK_FAILURES =
 ConfigBuilder("spark.task.maxFailures")
+  .doc("Number of failures of any particular task before giving up on 
the job. " +
+"The total number of failures spread across different tasks " +
+"will not cause the job to fail; " +
+"a particular task has to fail this number of attempts. " +
+"Should be greater than or equal to 1. Number of allowed retries = 
this value - 1.")
   .intConf
+  .checkValue(_ > 0, "The retry times of task should be greater than 
or equal to 1")
   .createWithDefault(4)
 
   // Blacklist confs
   private[spark] val BLACKLIST_ENABLED =
 ConfigBuilder("spark.blacklist.enabled")
+  .doc("If set to 'true', prevent Spark from scheduling tasks on 
executors " +
+"that have been blacklisted due to too many task failures. " +
+"The blacklisting algorithm can be further controlled by " +
+"the other 'spark.blacklist' configuration options. ")
   .booleanConf
   .createOptional
 
   private[spark] val MAX_TASK_ATTEMPTS_PER_EXECUTOR =
 ConfigBuilder("spark.blacklist.task.maxTaskAttemptsPerExecutor")
+  .doc("For a given task, how many times it can be " +
+"retried on one executor before the executor is blacklisted for 
that task.")
   .intConf
+  .checkValue(_ > 0, "The value should be greater than or equal to 1")
   .createWithDefault(1)
 
   private[spark] val MAX_TASK_ATTEMPTS_PER_NODE =
 ConfigBuilder("spark.blacklist.task.maxTaskAttemptsPerNode")
+  .doc("For a given task, how many times it can be " +
+"retried on one node, before the entire node is blacklisted for 
that task.")
   .intConf
+  .checkValue(_ > 0, "The value should be greater than or equal to 1")
   .createWithDefault(2)
 
   private[spark] val MAX_FAILURES_PER_EXEC =
 ConfigBuilder("spark.blacklist.application.maxFailedTasksPerExecutor")
   .intConf
+  .checkValue(_ > 0, "The value should be greater than or equal to 1")
   .createWithDefault(2)
 
   private[spark] val MAX_FAILURES_PER_EXEC_STAGE =
 ConfigBuilder("spark.blacklist.stage.maxFailedTasksPerExecutor")
+  .doc("How many different tasks must fail on one executor, " +
+"within one stage, before the executor is blacklisted for that 
stage.")
   .intConf
+  .checkValue(_ > 0, "The value should be greater than or equal to 1")
   .createWithDefault(2)
 
   private[spark] val MAX_FAILED_EXEC_PER_NODE =
 ConfigBuilder("spark.blacklist.application.maxFailedExecutorsPerNode")
   .intConf
+  .checkValue(_ > 0, "The value should be greater than or equal to 1")
--- End diff --

ditto





[GitHub] spark pull request #18555: [Minor]add checkValue in spark.internal.config ab...

2017-07-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18555#discussion_r126274198
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -88,50 +137,76 @@ package object config {
 .createOptional
 
   private[spark] val PY_FILES = ConfigBuilder("spark.submit.pyFiles")
+.doc("Comma-separated list of .zip, .egg, or .py files to place on " +
+  "the PYTHONPATH for Python apps.")
 .internal()
 .stringConf
 .toSequence
 .createWithDefault(Nil)
 
   private[spark] val MAX_TASK_FAILURES =
 ConfigBuilder("spark.task.maxFailures")
+  .doc("Number of failures of any particular task before giving up on 
the job. " +
+"The total number of failures spread across different tasks " +
+"will not cause the job to fail; " +
+"a particular task has to fail this number of attempts. " +
+"Should be greater than or equal to 1. Number of allowed retries = 
this value - 1.")
   .intConf
+  .checkValue(_ > 0, "The retry times of task should be greater than 
or equal to 1")
   .createWithDefault(4)
 
   // Blacklist confs
   private[spark] val BLACKLIST_ENABLED =
 ConfigBuilder("spark.blacklist.enabled")
+  .doc("If set to 'true', prevent Spark from scheduling tasks on 
executors " +
+"that have been blacklisted due to too many task failures. " +
+"The blacklisting algorithm can be further controlled by " +
+"the other 'spark.blacklist' configuration options. ")
   .booleanConf
   .createOptional
 
   private[spark] val MAX_TASK_ATTEMPTS_PER_EXECUTOR =
 ConfigBuilder("spark.blacklist.task.maxTaskAttemptsPerExecutor")
+  .doc("For a given task, how many times it can be " +
+"retried on one executor before the executor is blacklisted for 
that task.")
   .intConf
+  .checkValue(_ > 0, "The value should be greater than or equal to 1")
--- End diff --

`should be greater than 0`





[GitHub] spark pull request #18555: [Minor]add checkValue in spark.internal.config ab...

2017-07-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18555#discussion_r126274203
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -88,50 +137,76 @@ package object config {
 .createOptional
 
   private[spark] val PY_FILES = ConfigBuilder("spark.submit.pyFiles")
+.doc("Comma-separated list of .zip, .egg, or .py files to place on " +
+  "the PYTHONPATH for Python apps.")
 .internal()
 .stringConf
 .toSequence
 .createWithDefault(Nil)
 
   private[spark] val MAX_TASK_FAILURES =
 ConfigBuilder("spark.task.maxFailures")
+  .doc("Number of failures of any particular task before giving up on 
the job. " +
+"The total number of failures spread across different tasks " +
+"will not cause the job to fail; " +
+"a particular task has to fail this number of attempts. " +
+"Should be greater than or equal to 1. Number of allowed retries = 
this value - 1.")
   .intConf
+  .checkValue(_ > 0, "The retry times of task should be greater than 
or equal to 1")
   .createWithDefault(4)
 
   // Blacklist confs
   private[spark] val BLACKLIST_ENABLED =
 ConfigBuilder("spark.blacklist.enabled")
+  .doc("If set to 'true', prevent Spark from scheduling tasks on 
executors " +
+"that have been blacklisted due to too many task failures. " +
+"The blacklisting algorithm can be further controlled by " +
+"the other 'spark.blacklist' configuration options. ")
   .booleanConf
   .createOptional
 
   private[spark] val MAX_TASK_ATTEMPTS_PER_EXECUTOR =
 ConfigBuilder("spark.blacklist.task.maxTaskAttemptsPerExecutor")
+  .doc("For a given task, how many times it can be " +
+"retried on one executor before the executor is blacklisted for 
that task.")
   .intConf
+  .checkValue(_ > 0, "The value should be greater than or equal to 1")
   .createWithDefault(1)
 
   private[spark] val MAX_TASK_ATTEMPTS_PER_NODE =
 ConfigBuilder("spark.blacklist.task.maxTaskAttemptsPerNode")
+  .doc("For a given task, how many times it can be " +
+"retried on one node, before the entire node is blacklisted for 
that task.")
   .intConf
+  .checkValue(_ > 0, "The value should be greater than or equal to 1")
   .createWithDefault(2)
 
   private[spark] val MAX_FAILURES_PER_EXEC =
 ConfigBuilder("spark.blacklist.application.maxFailedTasksPerExecutor")
   .intConf
+  .checkValue(_ > 0, "The value should be greater than or equal to 1")
   .createWithDefault(2)
 
   private[spark] val MAX_FAILURES_PER_EXEC_STAGE =
 ConfigBuilder("spark.blacklist.stage.maxFailedTasksPerExecutor")
+  .doc("How many different tasks must fail on one executor, " +
+"within one stage, before the executor is blacklisted for that 
stage.")
   .intConf
+  .checkValue(_ > 0, "The value should be greater than or equal to 1")
--- End diff --

ditto





[GitHub] spark pull request #18555: [Minor]add checkValue in spark.internal.config ab...

2017-07-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18555#discussion_r126274183
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -51,29 +63,66 @@ package object config {
 
ConfigBuilder(SparkLauncher.EXECUTOR_EXTRA_LIBRARY_PATH).stringConf.createOptional
 
   private[spark] val EXECUTOR_USER_CLASS_PATH_FIRST =
-
ConfigBuilder("spark.executor.userClassPathFirst").booleanConf.createWithDefault(false)
+ConfigBuilder("spark.executor.userClassPathFirst")
+  .doc("Same functionality as spark.driver.userClassPathFirst, " +
+"but applied to executor instances.")
+  .booleanConf
+  .checkValues(Set(false, true))
+  .createWithDefault(false)
 
   private[spark] val EXECUTOR_MEMORY = 
ConfigBuilder("spark.executor.memory")
+.doc("Amount of memory to use per executor process (e.g. 2g, 8g).")
 .bytesConf(ByteUnit.MiB)
+.checkValue(v => v > 0 && v <= Int.MaxValue / (1024 * 1024),
+  "The executor memory must be greater than 0M " +
+  s"and less than ${Int.MaxValue / (1024 * 1024)}M.")
 .createWithDefaultString("1g")
 
-  private[spark] val IS_PYTHON_APP = 
ConfigBuilder("spark.yarn.isPython").internal()
-.booleanConf.createWithDefault(false)
+  private[spark] val IS_PYTHON_APP = ConfigBuilder("spark.yarn.isPython")
+.internal()
+.booleanConf
+.checkValues(Set(false, true))
+.createWithDefault(false)
 
-  private[spark] val CPUS_PER_TASK = 
ConfigBuilder("spark.task.cpus").intConf.createWithDefault(1)
+  private[spark] val CPUS_PER_TASK = ConfigBuilder("spark.task.cpus")
+.doc("Number of cores to allocate for each task. ")
+.intConf
+.checkValue(v => v > 0,
+  "Number of cores to allocate for task event queue must not be 
negative")
+.createWithDefault(1)
 
   private[spark] val DYN_ALLOCATION_MIN_EXECUTORS =
-
ConfigBuilder("spark.dynamicAllocation.minExecutors").intConf.createWithDefault(0)
+ConfigBuilder("spark.dynamicAllocation.minExecutors")
+  .doc("Lower bound for the number of executors if dynamic allocation 
is enabled.")
+  .intConf
+  .checkValue(v => v >= 0,
+"Lower bound for the number of executors should be greater than or 
equal to 0")
+  .createWithDefault(0)
 
   private[spark] val DYN_ALLOCATION_INITIAL_EXECUTORS =
 ConfigBuilder("spark.dynamicAllocation.initialExecutors")
+  .doc("Initial number of executors to run if dynamic allocation is 
enabled. " +
+"If `--num-executors` (or `spark.executor.instances`) is set " +
+"and larger than this value, " +
+"it will be used as the initial number of executors.")
   .fallbackConf(DYN_ALLOCATION_MIN_EXECUTORS)
--- End diff --

check if >= 0?





[GitHub] spark pull request #18555: [Minor]add checkValue in spark.internal.config ab...

2017-07-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18555#discussion_r126274186
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -51,29 +63,66 @@ package object config {
 
ConfigBuilder(SparkLauncher.EXECUTOR_EXTRA_LIBRARY_PATH).stringConf.createOptional
 
   private[spark] val EXECUTOR_USER_CLASS_PATH_FIRST =
-
ConfigBuilder("spark.executor.userClassPathFirst").booleanConf.createWithDefault(false)
+ConfigBuilder("spark.executor.userClassPathFirst")
+  .doc("Same functionality as spark.driver.userClassPathFirst, " +
+"but applied to executor instances.")
+  .booleanConf
+  .checkValues(Set(false, true))
+  .createWithDefault(false)
 
   private[spark] val EXECUTOR_MEMORY = 
ConfigBuilder("spark.executor.memory")
+.doc("Amount of memory to use per executor process (e.g. 2g, 8g).")
 .bytesConf(ByteUnit.MiB)
+.checkValue(v => v > 0 && v <= Int.MaxValue / (1024 * 1024),
+  "The executor memory must be greater than 0M " +
+  s"and less than ${Int.MaxValue / (1024 * 1024)}M.")
 .createWithDefaultString("1g")
 
-  private[spark] val IS_PYTHON_APP = 
ConfigBuilder("spark.yarn.isPython").internal()
-.booleanConf.createWithDefault(false)
+  private[spark] val IS_PYTHON_APP = ConfigBuilder("spark.yarn.isPython")
+.internal()
+.booleanConf
+.checkValues(Set(false, true))
+.createWithDefault(false)
 
-  private[spark] val CPUS_PER_TASK = 
ConfigBuilder("spark.task.cpus").intConf.createWithDefault(1)
+  private[spark] val CPUS_PER_TASK = ConfigBuilder("spark.task.cpus")
+.doc("Number of cores to allocate for each task. ")
+.intConf
+.checkValue(v => v > 0,
+  "Number of cores to allocate for task event queue must not be 
negative")
+.createWithDefault(1)
 
   private[spark] val DYN_ALLOCATION_MIN_EXECUTORS =
-
ConfigBuilder("spark.dynamicAllocation.minExecutors").intConf.createWithDefault(0)
+ConfigBuilder("spark.dynamicAllocation.minExecutors")
+  .doc("Lower bound for the number of executors if dynamic allocation 
is enabled.")
+  .intConf
+  .checkValue(v => v >= 0,
+"Lower bound for the number of executors should be greater than or 
equal to 0")
+  .createWithDefault(0)
 
   private[spark] val DYN_ALLOCATION_INITIAL_EXECUTORS =
 ConfigBuilder("spark.dynamicAllocation.initialExecutors")
+  .doc("Initial number of executors to run if dynamic allocation is 
enabled. " +
+"If `--num-executors` (or `spark.executor.instances`) is set " +
+"and larger than this value, " +
+"it will be used as the initial number of executors.")
   .fallbackConf(DYN_ALLOCATION_MIN_EXECUTORS)
 
   private[spark] val DYN_ALLOCATION_MAX_EXECUTORS =
-
ConfigBuilder("spark.dynamicAllocation.maxExecutors").intConf.createWithDefault(Int.MaxValue)
+ConfigBuilder("spark.dynamicAllocation.maxExecutors")
+  .doc("Upper bound for the number of executors if dynamic allocation 
is enabled. ")
+  .intConf
+  .createWithDefault(Int.MaxValue)
 
   private[spark] val SHUFFLE_SERVICE_ENABLED =
-
ConfigBuilder("spark.shuffle.service.enabled").booleanConf.createWithDefault(false)
+ConfigBuilder("spark.shuffle.service.enabled")
+  .doc("Enables the external shuffle service. This service preserves " 
+
+"the shuffle files written by executors so the executors can be 
safely removed. " +
+"This must be enabled if spark.dynamicAllocation.enabled is 
'true'. " +
+"The external shuffle service must be set up in order to enable 
it. " +
+"See dynamic allocation configuration and setup documentation for 
more information. ")
+  .booleanConf
+  .checkValues(Set(false, true))
--- End diff --

ditto





[GitHub] spark pull request #18555: [Minor]add checkValue in spark.internal.config ab...

2017-07-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18555#discussion_r126274185
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -51,29 +63,66 @@ package object config {
 
ConfigBuilder(SparkLauncher.EXECUTOR_EXTRA_LIBRARY_PATH).stringConf.createOptional
 
   private[spark] val EXECUTOR_USER_CLASS_PATH_FIRST =
-
ConfigBuilder("spark.executor.userClassPathFirst").booleanConf.createWithDefault(false)
+ConfigBuilder("spark.executor.userClassPathFirst")
+  .doc("Same functionality as spark.driver.userClassPathFirst, " +
+"but applied to executor instances.")
+  .booleanConf
+  .checkValues(Set(false, true))
+  .createWithDefault(false)
 
   private[spark] val EXECUTOR_MEMORY = 
ConfigBuilder("spark.executor.memory")
+.doc("Amount of memory to use per executor process (e.g. 2g, 8g).")
 .bytesConf(ByteUnit.MiB)
+.checkValue(v => v > 0 && v <= Int.MaxValue / (1024 * 1024),
+  "The executor memory must be greater than 0M " +
+  s"and less than ${Int.MaxValue / (1024 * 1024)}M.")
 .createWithDefaultString("1g")
 
-  private[spark] val IS_PYTHON_APP = 
ConfigBuilder("spark.yarn.isPython").internal()
-.booleanConf.createWithDefault(false)
+  private[spark] val IS_PYTHON_APP = ConfigBuilder("spark.yarn.isPython")
+.internal()
+.booleanConf
+.checkValues(Set(false, true))
+.createWithDefault(false)
 
-  private[spark] val CPUS_PER_TASK = 
ConfigBuilder("spark.task.cpus").intConf.createWithDefault(1)
+  private[spark] val CPUS_PER_TASK = ConfigBuilder("spark.task.cpus")
+.doc("Number of cores to allocate for each task. ")
+.intConf
+.checkValue(v => v > 0,
+  "Number of cores to allocate for task event queue must not be 
negative")
+.createWithDefault(1)
 
   private[spark] val DYN_ALLOCATION_MIN_EXECUTORS =
-
ConfigBuilder("spark.dynamicAllocation.minExecutors").intConf.createWithDefault(0)
+ConfigBuilder("spark.dynamicAllocation.minExecutors")
+  .doc("Lower bound for the number of executors if dynamic allocation 
is enabled.")
+  .intConf
+  .checkValue(v => v >= 0,
+"Lower bound for the number of executors should be greater than or 
equal to 0")
+  .createWithDefault(0)
 
   private[spark] val DYN_ALLOCATION_INITIAL_EXECUTORS =
 ConfigBuilder("spark.dynamicAllocation.initialExecutors")
+  .doc("Initial number of executors to run if dynamic allocation is 
enabled. " +
+"If `--num-executors` (or `spark.executor.instances`) is set " +
+"and larger than this value, " +
+"it will be used as the initial number of executors.")
   .fallbackConf(DYN_ALLOCATION_MIN_EXECUTORS)
 
   private[spark] val DYN_ALLOCATION_MAX_EXECUTORS =
-
ConfigBuilder("spark.dynamicAllocation.maxExecutors").intConf.createWithDefault(Int.MaxValue)
+ConfigBuilder("spark.dynamicAllocation.maxExecutors")
+  .doc("Upper bound for the number of executors if dynamic allocation 
is enabled. ")
+  .intConf
+  .createWithDefault(Int.MaxValue)
--- End diff --

check if >= 0?
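
For instance, the suggested check could reuse the same pattern seen elsewhere in
this diff (a sketch of one possible wording, not the final code):

```scala
ConfigBuilder("spark.dynamicAllocation.maxExecutors")
  .doc("Upper bound for the number of executors if dynamic allocation is enabled.")
  .intConf
  .checkValue(_ >= 0, "Upper bound for the number of executors should be greater than or equal to 0")
  .createWithDefault(Int.MaxValue)
```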





[GitHub] spark pull request #18555: [Minor]add checkValue in spark.internal.config ab...

2017-07-07 Thread heary-cao
Github user heary-cao commented on a diff in the pull request:

https://github.com/apache/spark/pull/18555#discussion_r126273756
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -35,10 +35,22 @@ package object config {
 
ConfigBuilder(SparkLauncher.DRIVER_EXTRA_LIBRARY_PATH).stringConf.createOptional
 
   private[spark] val DRIVER_USER_CLASS_PATH_FIRST =
-
ConfigBuilder("spark.driver.userClassPathFirst").booleanConf.createWithDefault(false)
+ConfigBuilder("spark.driver.userClassPathFirst")
+  .doc("Whether to give user-added jars precedence over Spark's own 
jars " +
+"when loading classes in the driver. This feature can be used to 
mitigate " +
+"conflicts between Spark's dependencies and user dependencies. " +
+"It is currently an experimental feature. This is used in cluster 
mode only. ")
+  .booleanConf
+  .checkValues(Set(false, true))
+  .createWithDefault(false)
 
   private[spark] val DRIVER_MEMORY = ConfigBuilder("spark.driver.memory")
+.doc("Amount of memory to use for the driver process, i.e. " +
+  "where SparkContext is initialized. (e.g. 1g, 2g). ")
--- End diff --

OK.





[GitHub] spark pull request #18555: [Minor]add checkValue in spark.internal.config ab...

2017-07-07 Thread heary-cao
Github user heary-cao commented on a diff in the pull request:

https://github.com/apache/spark/pull/18555#discussion_r126273760
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -35,10 +35,22 @@ package object config {
 
ConfigBuilder(SparkLauncher.DRIVER_EXTRA_LIBRARY_PATH).stringConf.createOptional
 
   private[spark] val DRIVER_USER_CLASS_PATH_FIRST =
-
ConfigBuilder("spark.driver.userClassPathFirst").booleanConf.createWithDefault(false)
+ConfigBuilder("spark.driver.userClassPathFirst")
+  .doc("Whether to give user-added jars precedence over Spark's own 
jars " +
+"when loading classes in the driver. This feature can be used to 
mitigate " +
+"conflicts between Spark's dependencies and user dependencies. " +
+"It is currently an experimental feature. This is used in cluster 
mode only. ")
+  .booleanConf
+  .checkValues(Set(false, true))
--- End diff --

Thanks. Please review it again.





[GitHub] spark pull request #18555: [Minor]add checkValue in spark.internal.config ab...

2017-07-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18555#discussion_r126273432
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -51,29 +63,66 @@ package object config {
 
ConfigBuilder(SparkLauncher.EXECUTOR_EXTRA_LIBRARY_PATH).stringConf.createOptional
 
   private[spark] val EXECUTOR_USER_CLASS_PATH_FIRST =
-
ConfigBuilder("spark.executor.userClassPathFirst").booleanConf.createWithDefault(false)
+ConfigBuilder("spark.executor.userClassPathFirst")
+  .doc("Same functionality as spark.driver.userClassPathFirst, " +
+"but applied to executor instances.")
+  .booleanConf
+  .checkValues(Set(false, true))
+  .createWithDefault(false)
 
   private[spark] val EXECUTOR_MEMORY = 
ConfigBuilder("spark.executor.memory")
+.doc("Amount of memory to use per executor process (e.g. 2g, 8g).")
 .bytesConf(ByteUnit.MiB)
+.checkValue(v => v > 0 && v <= Int.MaxValue / (1024 * 1024),
+  "The executor memory must be greater than 0M " +
+  s"and less than ${Int.MaxValue / (1024 * 1024)}M.")
 .createWithDefaultString("1g")
 
-  private[spark] val IS_PYTHON_APP = 
ConfigBuilder("spark.yarn.isPython").internal()
-.booleanConf.createWithDefault(false)
+  private[spark] val IS_PYTHON_APP = ConfigBuilder("spark.yarn.isPython")
+.internal()
+.booleanConf
+.checkValues(Set(false, true))
--- End diff --

ditto





[GitHub] spark pull request #18555: [Minor]add checkValue in spark.internal.config ab...

2017-07-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18555#discussion_r126273429
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -51,29 +63,66 @@ package object config {
 
ConfigBuilder(SparkLauncher.EXECUTOR_EXTRA_LIBRARY_PATH).stringConf.createOptional
 
   private[spark] val EXECUTOR_USER_CLASS_PATH_FIRST =
-
ConfigBuilder("spark.executor.userClassPathFirst").booleanConf.createWithDefault(false)
+ConfigBuilder("spark.executor.userClassPathFirst")
+  .doc("Same functionality as spark.driver.userClassPathFirst, " +
+"but applied to executor instances.")
+  .booleanConf
+  .checkValues(Set(false, true))
--- End diff --

ditto





[GitHub] spark pull request #18555: [Minor]add checkValue in spark.internal.config ab...

2017-07-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18555#discussion_r126273417
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -35,10 +35,22 @@ package object config {
 
ConfigBuilder(SparkLauncher.DRIVER_EXTRA_LIBRARY_PATH).stringConf.createOptional
 
   private[spark] val DRIVER_USER_CLASS_PATH_FIRST =
-
ConfigBuilder("spark.driver.userClassPathFirst").booleanConf.createWithDefault(false)
+ConfigBuilder("spark.driver.userClassPathFirst")
+  .doc("Whether to give user-added jars precedence over Spark's own 
jars " +
+"when loading classes in the driver. This feature can be used to 
mitigate " +
+"conflicts between Spark's dependencies and user dependencies. " +
+"It is currently an experimental feature. This is used in cluster 
mode only. ")
+  .booleanConf
+  .checkValues(Set(false, true))
+  .createWithDefault(false)
 
   private[spark] val DRIVER_MEMORY = ConfigBuilder("spark.driver.memory")
+.doc("Amount of memory to use for the driver process, i.e. " +
+  "where SparkContext is initialized. (e.g. 1g, 2g). ")
--- End diff --

nvm, we should not mention that; we should always ask users to add a unit.





[GitHub] spark pull request #18555: [Minor]add checkValue in spark.internal.config ab...

2017-07-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18555#discussion_r126273242
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -35,10 +35,22 @@ package object config {
 
ConfigBuilder(SparkLauncher.DRIVER_EXTRA_LIBRARY_PATH).stringConf.createOptional
 
   private[spark] val DRIVER_USER_CLASS_PATH_FIRST =
-
ConfigBuilder("spark.driver.userClassPathFirst").booleanConf.createWithDefault(false)
+ConfigBuilder("spark.driver.userClassPathFirst")
+  .doc("Whether to give user-added jars precedence over Spark's own 
jars " +
+"when loading classes in the driver. This feature can be used to 
mitigate " +
+"conflicts between Spark's dependencies and user dependencies. " +
+"It is currently an experimental feature. This is used in cluster 
mode only. ")
+  .booleanConf
+  .checkValues(Set(false, true))
+  .createWithDefault(false)
 
   private[spark] val DRIVER_MEMORY = ConfigBuilder("spark.driver.memory")
+.doc("Amount of memory to use for the driver process, i.e. " +
+  "where SparkContext is initialized. (e.g. 1g, 2g). ")
--- End diff --

We should also mention that if no unit is given, e.g. `500`, it means 500 MB.
We also need to update `configuration.md` for this conf.





[GitHub] spark pull request #18555: [Minor]add checkValue in spark.internal.config ab...

2017-07-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18555#discussion_r126273196
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -35,10 +35,22 @@ package object config {
 
ConfigBuilder(SparkLauncher.DRIVER_EXTRA_LIBRARY_PATH).stringConf.createOptional
 
   private[spark] val DRIVER_USER_CLASS_PATH_FIRST =
-
ConfigBuilder("spark.driver.userClassPathFirst").booleanConf.createWithDefault(false)
+ConfigBuilder("spark.driver.userClassPathFirst")
+  .doc("Whether to give user-added jars precedence over Spark's own 
jars " +
+"when loading classes in the driver. This feature can be used to 
mitigate " +
+"conflicts between Spark's dependencies and user dependencies. " +
+"It is currently an experimental feature. This is used in cluster 
mode only. ")
+  .booleanConf
+  .checkValues(Set(false, true))
--- End diff --

boolean conf doesn't need `checkValue`...
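
Since the boolean parser itself already fails on anything that is not a boolean
string, the extra `checkValues(Set(false, true))` adds nothing; the plain form is
enough (doc text abbreviated in this sketch):

```scala
ConfigBuilder("spark.driver.userClassPathFirst")
  .doc("Whether to give user-added jars precedence over Spark's own jars " +
    "when loading classes in the driver.")
  .booleanConf
  .createWithDefault(false)
```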





[GitHub] spark pull request #18555: [Minor]add checkValue in spark.internal.config ab...

2017-07-07 Thread heary-cao
Github user heary-cao commented on a diff in the pull request:

https://github.com/apache/spark/pull/18555#discussion_r126090693
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -38,7 +38,12 @@ package object config {
 
ConfigBuilder("spark.driver.userClassPathFirst").booleanConf.createWithDefault(false)
 
   private[spark] val DRIVER_MEMORY = ConfigBuilder("spark.driver.memory")
+.doc("Amount of memory to use for the driver process, i.e. " +
--- End diff --

OK, please review it again.
Thanks.





[GitHub] spark pull request #18555: [Minor]add checkValue in spark.internal.config ab...

2017-07-07 Thread heary-cao
Github user heary-cao commented on a diff in the pull request:

https://github.com/apache/spark/pull/18555#discussion_r126090608
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -38,7 +38,12 @@ package object config {
 
ConfigBuilder("spark.driver.userClassPathFirst").booleanConf.createWithDefault(false)
 
   private[spark] val DRIVER_MEMORY = ConfigBuilder("spark.driver.memory")
+.doc("Amount of memory to use for the driver process, i.e. " +
+  "where SparkContext is initialized. (e.g. 1g, 2g). ")
 .bytesConf(ByteUnit.MiB)
+.checkValue(v => v > 0 && v <= Int.MaxValue / (1024 * 1024),
+  s"The driver memory must be greater than 0MB " +
--- End diff --

Removed it.
Thanks, please review it again.





[GitHub] spark pull request #18555: [Minor]add checkValue in spark.internal.config ab...

2017-07-06 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18555#discussion_r125938087
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -38,7 +38,12 @@ package object config {
 
ConfigBuilder("spark.driver.userClassPathFirst").booleanConf.createWithDefault(false)
 
   private[spark] val DRIVER_MEMORY = ConfigBuilder("spark.driver.memory")
+.doc("Amount of memory to use for the driver process, i.e. " +
+  "where SparkContext is initialized. (e.g. 1g, 2g). ")
 .bytesConf(ByteUnit.MiB)
+.checkValue(v => v > 0 && v <= Int.MaxValue / (1024 * 1024),
+  s"The driver memory must be greater than 0MB " +
--- End diff --

I think we can remove 's'.
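
A minimal illustration of why the `s` prefix is redundant here: without any
`${...}` interpolation the two literals are identical.

```scala
val withPrefix    = s"The driver memory must be greater than 0MB "
val withoutPrefix = "The driver memory must be greater than 0MB "
assert(withPrefix == withoutPrefix)  // the s interpolator is a no-op without ${...}
```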





[GitHub] spark pull request #18555: [Minor]add checkValue in spark.internal.config ab...

2017-07-06 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18555#discussion_r125938115
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -54,7 +59,11 @@ package object config {
 
ConfigBuilder("spark.executor.userClassPathFirst").booleanConf.createWithDefault(false)
 
   private[spark] val EXECUTOR_MEMORY = 
ConfigBuilder("spark.executor.memory")
+.doc("Amount of memory to use per executor process (e.g. 2g, 8g).")
 .bytesConf(ByteUnit.MiB)
+.checkValue(v => v > 0 && v <= Int.MaxValue / (1024 * 1024),
+  s"The executor memory must be greater than 0M " +
--- End diff --

ditto.





[GitHub] spark pull request #18555: [Minor]add checkValue in spark.internal.config ab...

2017-07-06 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18555#discussion_r125901919
  
--- Diff: 
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -38,7 +38,12 @@ package object config {
 
ConfigBuilder("spark.driver.userClassPathFirst").booleanConf.createWithDefault(false)
 
   private[spark] val DRIVER_MEMORY = ConfigBuilder("spark.driver.memory")
+.doc("Amount of memory to use for the driver process, i.e. " +
--- End diff --

Can you also check for other configs in this file and add `checkValue`s as 
much as possible? Thanks!





[GitHub] spark pull request #18555: [Minor]add checkValue in spark.internal.config ab...

2017-07-06 Thread heary-cao
GitHub user heary-cao opened a pull request:

https://github.com/apache/spark/pull/18555

[Minor]add checkValue in spark.internal.config about how to correctly set 
configurations


## What changes were proposed in this pull request?

Add checkValue for configuration `spark.driver.memory`.
Add checkValue for configuration `spark.executor.memory`.
Add checkValue for configuration `spark.blockManager.port`.
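
The general pattern applied to each of the confs above is roughly the following
(an illustrative sketch; see the diff for the exact bounds and messages used):

```scala
ConfigBuilder("spark.executor.memory")
  .doc("Amount of memory to use per executor process (e.g. 512, 1g, 2g, 8g).")
  .bytesConf(ByteUnit.MiB)
  .checkValue(v => v > 0 && v <= Int.MaxValue / (1024 * 1024),
    "The executor memory must be greater than 0 MiB " +
    s"and less than ${Int.MaxValue / (1024 * 1024)} MiB.")
  .createWithDefaultString("1g")
```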

## How was this patch tested?
Existing tests.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/heary-cao/spark driver_host

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/18555.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #18555


commit fc4f269328440c584ef5adde491463414cbf1fea
Author: caoxuewen 
Date:   2017-07-06T13:06:15Z

add checkValue in spark.internal.config about how to correctly set 
configurations



