Github user pwendell commented on a diff in the pull request:
https://github.com/apache/spark/pull/204#discussion_r11459038
--- Diff: docs/monitoring.md ---
@@ -12,17 +12,71 @@ displays useful information about the application. This includes:
* A list of scheduler stages and tasks
* A summary of RDD sizes and memory usage
-* Information about the running executors
* Environmental information.
+* Information about the running executors
You can access this interface by simply opening `http://<driver-node>:4040` in a web browser.
-If multiple SparkContexts are running on the same host, they will bind to succesive ports
+If multiple SparkContexts are running on the same host, they will bind to successive ports
beginning with 4040 (4041, 4042, etc).
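For instance, a quick way to confirm the UI is reachable from the driver machine itself (assuming `curl` is available and this is the first SparkContext on the host, so it took the default port):

    # Fetch the web UI of the first application on this host (default port 4040).
    curl http://localhost:4040
    # A second SparkContext running concurrently on the same host would be on 4041, and so on.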
-Spark's Standalone Mode cluster manager also has its own
-[web UI](spark-standalone.html#monitoring-and-logging).
+Note that this information is only available for the duration of the application by default.
+To view the web UI after the fact, set `spark.eventLog.enabled` to true before starting the
+application. This configures Spark to log Spark events that encode the information displayed
+in the UI to persisted storage.
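As a concrete sketch, event logging could be enabled through `conf/spark-defaults.conf`; the values below are illustrative assumptions (in particular the `spark.eventLog.dir` path, the companion property that selects where the logs are written), not shipped defaults:

    # conf/spark-defaults.conf (illustrative example)
    # Log Spark events so the UI can be reconstructed after the application finishes.
    spark.eventLog.enabled  true
    # Write the event logs to persisted storage; this HDFS path is an assumption.
    spark.eventLog.dir      hdfs://namenode:8020/shared/spark-logs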
-Note that in both of these UIs, the tables are sortable by clicking their headers,
+## Viewing After the Fact
+
+Spark's Standalone Mode cluster manager also has its own
+[web UI](spark-standalone.html#monitoring-and-logging). If an application has logged events over
+the course of its lifetime, then the Standalone master's web UI will automatically re-render the
+application's UI after the application has finished.
+
+If Spark is run on Mesos or YARN, it is still possible to reconstruct the UI of a finished
+application through Spark's history server, provided that the application's event logs exist.
+You can start the history server by executing:
+
+    ./sbin/start-history-server.sh <base-logging-directory>
+
+The base logging directory must be supplied, and should contain sub-directories that each
+represent an application's event logs. This creates a web interface at
+`http://<server-url>:18080` by default, but the port can be changed by supplying an extra
+parameter to the start script. The history server depends on the following variables:
+
+<table class="table">
+ <tr><th style="width:21%">Environment Variable</th><th>Meaning</th></tr>
+ <tr>
+ <td><code>SPARK_DAEMON_MEMORY</code></td>
+ <td>Memory to allocate to the history server (default: 512m).</td>
+ </tr>
+ <tr>
+ <td><code>SPARK_DAEMON_JAVA_OPTS</code></td>
+ <td>JVM options for the history server (default: none).</td>
+ </tr>
+</table>
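Putting these together, a launch could look like the following; the memory value and the log directory are assumptions chosen for illustration:

    # Raise the history server's heap from the 512m default (illustrative value).
    export SPARK_DAEMON_MEMORY=1g
    # Start the server against a base directory whose sub-directories hold
    # per-application event logs (this HDFS path is an assumption).
    ./sbin/start-history-server.sh hdfs://namenode:8020/shared/spark-logs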
+
+Further, the history server can be configured as follows:
+
+<table class="table">
+ <tr><th>Property Name</th><th>Default</th><th>Meaning</th></tr>
+ <tr>
+ <td>spark.history.updateInterval</td>
+ <td>10</td>
+ <td>
+ The period at which information displayed by this history server is updated. Each update
--- End diff --
I'd say "The period, in seconds, at which..."