GitHub user mateiz commented on a diff in the pull request:
https://github.com/apache/spark/pull/880#discussion_r13159367
--- Diff: docs/configuration.md ---
@@ -705,42 +720,69 @@ Apart from these, the following properties are also available, and may be useful
</td>
</tr>
<tr>
- <td><code>spark.task.cpus</code></td>
- <td>1</td>
+ <td><code>spark.ui.filters</code></td>
+ <td>None</td>
<td>
- Number of cores to allocate for each task.
+ Comma separated list of filter class names to apply to the Spark web ui. The filter should be a
+ standard <a href="http://docs.oracle.com/javaee/6/api/javax/servlet/Filter.html">
+ javax servlet Filter</a>. Parameters to each filter can also be specified by setting a
+ java system property of spark.<class name of filter>.params='param1=value1,param2=value2'
+ (e.g. -Dspark.ui.filters=com.test.filter1
+ -Dspark.com.test.filter1.params='param1=foo,param2=testing')
</td>
</tr>
<tr>
- <td><code>spark.executor.extraJavaOptions</code></td>
- <td>(none)</td>
+ <td><code>spark.ui.acls.enable</code></td>
+ <td>false</td>
<td>
- A string of extra JVM options to pass to executors. For instance, GC settings or other
- logging. Note that it is illegal to set Spark properties or heap size settings with this
- option. Spark properties should be set using a SparkConf object or the
- spark-defaults.conf file used with the spark-submit script. Heap size settings can be set
- with spark.executor.memory.
+ Whether Spark web ui acls should be enabled. If enabled, this checks to see if the user has
+ access permissions to view the web ui. See <code>spark.ui.view.acls</code> for more details.
+ Also note that this requires the user to be known; if the user comes across as null, no checks
+ are done. Filters can be used to authenticate and set the user.
</td>
</tr>
<tr>
- <td><code>spark.executor.extraClassPath</code></td>
- <td>(none)</td>
+ <td><code>spark.ui.view.acls</code></td>
+ <td>Empty</td>
<td>
- Extra classpath entries to append to the classpath of executors. This exists primarily
- for backwards-compatibility with older versions of Spark. Users typically should not need
- to set this option.
+ Comma separated list of users that have view access to the Spark web ui. By default only the
+ user that started the Spark job has view access.
</td>
</tr>
+</table>
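
As an illustrative sketch (not part of the patch), a filter of the kind `spark.ui.filters` expects could look like the Scala class below. The class name `com.example.BasicAuthFilter` and its `user` init parameter are assumptions invented for the example, mirroring the `spark.<class name of filter>.params` mechanism described above.

```scala
package com.example

import javax.servlet.{Filter, FilterChain, FilterConfig, ServletRequest, ServletResponse}
import javax.servlet.http.{HttpServletRequest, HttpServletResponse}

// Hypothetical filter for the Spark web ui; name and parameter are
// assumptions for this sketch, not part of Spark itself.
class BasicAuthFilter extends Filter {
  private var allowedUser: String = _

  // init() receives the parameters parsed from, e.g.,
  // -Dspark.com.example.BasicAuthFilter.params='user=alice'
  override def init(conf: FilterConfig): Unit = {
    allowedUser = conf.getInitParameter("user")
  }

  override def doFilter(req: ServletRequest, res: ServletResponse,
                        chain: FilterChain): Unit = {
    val user = req.asInstanceOf[HttpServletRequest].getRemoteUser
    // Let the request through only for the configured user; a real filter
    // would authenticate here, which also establishes the non-null user
    // that the spark.ui.acls.enable check needs.
    if (user != null && user == allowedUser) {
      chain.doFilter(req, res)
    } else {
      res.asInstanceOf[HttpServletResponse].sendError(HttpServletResponse.SC_FORBIDDEN)
    }
  }

  override def destroy(): Unit = {}
}
```

It would be wired in with `-Dspark.ui.filters=com.example.BasicAuthFilter -Dspark.com.example.BasicAuthFilter.params='user=alice'`, typically together with `spark.ui.acls.enable=true`.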
+
+#### Spark Streaming
+<table class="table">
+<tr><th>Property Name</th><th>Default</th><th>Meaning</th></tr>
<tr>
- <td><code>spark.executor.extraLibraryPath</code></td>
- <td>(none)</td>
+ <td><code>spark.streaming.blockInterval</code></td>
+ <td>200</td>
<td>
- Set a special library path to use when launching executor JVM's.
+ Interval (milliseconds) at which data received by Spark Streaming receivers is coalesced
+ into blocks of data before storing them in Spark.
+ </td>
+</tr>
+<tr>
+ <td><code>spark.streaming.unpersist</code></td>
+ <td>true</td>
+ <td>
+ Force RDDs generated and persisted by Spark Streaming to be automatically unpersisted from
+ Spark's memory. The raw input data received by Spark Streaming is also automatically cleared.
+ Setting this to false will allow the raw data and persisted RDDs to be accessible outside the
+ streaming application as they will not be cleared automatically. But it comes at the cost of
+ higher memory usage in Spark.
</td>
</tr>
-
</table>
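
For a concrete (hypothetical) usage sketch, these streaming properties would typically be set on the SparkConf before the StreamingContext is created; the 50 ms block interval and the app name below are assumptions for illustration, not recommendations.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Block interval and unpersist behaviour are set on the SparkConf before
// the StreamingContext is created; the values here are assumptions.
val conf = new SparkConf()
  .setAppName("StreamingConfigSketch")
  .set("spark.streaming.blockInterval", "50")  // coalesce received data every 50 ms
  .set("spark.streaming.unpersist", "false")   // keep raw data and generated RDDs around

// With a 1 s batch interval, each batch holds roughly 1000 / 50 = 20 blocks,
// each becoming a partition of the batch's RDD.
val ssc = new StreamingContext(conf, Seconds(1))
```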
+#### Cluster Managers (YARN, Mesos, Standalone)
+Each cluster manager in Spark has additional configuration options. Configurations
+can be found on the pages for each mode:
+
+ * [Yarn](running-on-yarn.html#configuration)
--- End diff ---
Should say YARN