HyukjinKwon commented on a change in pull request #27577: [DOC] add config 
naming policy
URL: https://github.com/apache/spark/pull/27577#discussion_r379322859
 
 

 ##########
 File path: 
core/src/main/scala/org/apache/spark/internal/config/ConfigEntry.scala
 ##########
 @@ -17,6 +17,34 @@
 
 package org.apache.spark.internal.config
 
+// 
======================================================================================
+//                       The naming policy of configurations
+// 
======================================================================================
+// In general, the config name should be a noun that describes its basic 
purpose. It's
+// recommended to add prefix to the config name to make the scope clearer. For 
example,
+// `spark.scheduler.mode` clearly indicates that this config is for the 
scheduler.
+//
+// A config name can have multiple prefixes that are structured, which is 
similar to a
+// qualified Java class name. Each prefix behaves like a namespace. We should 
only create
+// a namespace if it's meaningful and can be shared by multiple configs. For 
example,
+// `buffer.inMemoryThreshold` is preferred over `buffer.in.memory.threshold`.
+//
+// The followings are some best practices of naming configs for some common 
cases:
+// 1. When adding configs for a big feature, it's better to create an umbrella 
config that
+//    can turn the feature on/off, with a name like `featureName.enabled`. The 
other configs
+//    of this feature should be put under the `featureName` namespace. For 
example:
+//      - spark.sql.cbo.enabled
+//      - spark.sql.cbo.starSchemaDetection
+//      - spark.sql.cbo.joinReorder.enabled
+//      - spark.sql.cbo.joinReorder.dp.threshold
+// 2. When adding a boolean config, the name should be a verb that describes 
what
+//    happens if this config is set to true, e.g. 
`spark.shuffle.sort.useRadixSort`.
+// 3. When adding a config to specify a time duration, it's better to put the 
time unit
+//    in the config name. For example, `featureName.timeoutMs`, which clearly 
indicates
+//    that the time unit is millisecond. The config entry should be created 
with
+//    `ConfigBuilder#timeConf`, to support time strings like `2 minutes`.
+
 
 Review comment:
   No big deal but we better to use multiline comment style: `/* ... */`.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to