HeartSaVioR edited a comment on issue #23260: [SPARK-26311][YARN] New feature: 
custom log URL for stdout/stderr
URL: https://github.com/apache/spark/pull/23260#issuecomment-455088037
 
 
   Verified with YARN cluster as well.
   
   > Don't set `spark.history.custom.executor.log.url` (default)
   
   ```
   <a href="http://<node_host>:<node_port>/node/containerlogs/container_1547708601909_0010_01_000002/spark/stderr?start=-4096">stderr</a>
   <a href="http://<node_host>:<node_port>/node/containerlogs/container_1547708601909_0010_01_000002/spark/stdout?start=-4096">stdout</a>
   ```
   
   The default log URLs are provided.
   
   > `spark.history.custom.executor.log.url` `{{HTTP_SCHEME}}host:port/testurl/node_http_address/{{NODE_HTTP_ADDRESS}}/cluster_id/{{CLUSTER_ID}}/container_id/{{CONTAINER_ID}}/user/{{USER}}/file/{{FILE_NAME}}`
   
   ```
   <a href="http://host:port/testurl/node_http_address/<node_host>:<node_port>/cluster_id//container_id/container_1547708601909_0010_01_000002/user/spark/file/stderr">stderr</a>
   <a href="http://host:port/testurl/node_http_address/<node_host>:<node_port>/cluster_id//container_id/container_1547708601909_0010_01_000002/user/spark/file/stdout">stdout</a>
   ```
   
   (I tested this with Hadoop 2.7.3, where `clusterId` is not set in the YARN cluster, so `{{CLUSTER_ID}}` resolves to an empty string.)
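
The substitution shown above can be sketched as follows. This is a minimal illustration in Python, not the code from the PR: the `render_log_url` helper, the port `8042`, and the attribute values are hypothetical. It only assumes that each `{{NAME}}` placeholder is replaced by the matching attribute, that an unset `CLUSTER_ID` is an empty string, and (for simplicity) that `FILE_NAME` is passed like any other attribute.

```python
import re

# Matches placeholders such as {{CONTAINER_ID}}.
PLACEHOLDER = re.compile(r"\{\{([A-Z_]+)\}\}")


def render_log_url(pattern: str, attributes: dict) -> str:
    """Replace every {{NAME}} in `pattern` with attributes[NAME] (hypothetical helper)."""
    return PLACEHOLDER.sub(lambda m: attributes[m.group(1)], pattern)


# Illustrative values modelled on the output above; the host and port are made up.
attributes = {
    "HTTP_SCHEME": "http://",
    "NODE_HTTP_ADDRESS": "node_host:8042",
    "CLUSTER_ID": "",  # not set by the cluster, so it stays empty
    "CONTAINER_ID": "container_1547708601909_0010_01_000002",
    "USER": "spark",
    "FILE_NAME": "stderr",
}

pattern = (
    "{{HTTP_SCHEME}}host:port/testurl/node_http_address/{{NODE_HTTP_ADDRESS}}"
    "/cluster_id/{{CLUSTER_ID}}/container_id/{{CONTAINER_ID}}"
    "/user/{{USER}}/file/{{FILE_NAME}}"
)

url = render_log_url(pattern, attributes)
print(url)
```

The empty `CLUSTER_ID` is what produces the doubled slash after `cluster_id` in the rendered URL.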
   
   The log URLs are renewed using the custom pattern.
   
   > `spark.history.custom.executor.log.url` `{{HTTP_SCHEME}}{{NODE_HTTP_ADDRESS}}/node/containerlogs/{{CONTAINER_ID}}/{{USER}}/{{FILE_NAME}}?start=-4096`
   
   ```
   <a href="http://<node_host>:<node_port>/node/containerlogs/container_1547708601909_0010_01_000002/spark/stderr?start=-4096">stderr</a>
   <a href="http://<node_host>:<node_port>/node/containerlogs/container_1547708601909_0010_01_000002/spark/stdout?start=-4096">stdout</a>
   ```
   
   This mimics the default log URLs on YARN.
   
   > `spark.history.custom.executor.log.url` `{{HTTP_SCHEME}}{{NODE_HTTP_ADDRESS}}/node/containerlogs/{{CONTAINER_ID}}/{{USER}}/nonexisting/{{NON_EXISTING}}/{{FILE_NAME}}?start=-4096`
   
   ```
   <a href="http://<node_host>:<node_port>/node/containerlogs/container_1547708601909_0010_01_000002/spark/stdout?start=-4096">stdout</a>
   <a href="http://<node_host>:<node_port>/node/containerlogs/container_1547708601909_0010_01_000002/spark/stderr?start=-4096">stderr</a>
   ```
   
   This falls back to the default log URLs because the `{{NON_EXISTING}}` pattern is not available in the attributes. Warning messages are also logged in the history server log, as below:
   
   ```
   19/01/17 08:35:39 WARN HistoryAppStatusListener: Fail to renew executor log urls: some of required attributes are missing in app's event log.. Required: Set(NON_EXISTING, NODE_HTTP_ADDRESS, USER, HTTP_SCHEME, CONTAINER_ID) / available: Set(NODE_HTTP_ADDRESS, USER, LOG_FILES, CLUSTER_ID, HTTP_SCHEME, CONTAINER_ID). Failing back to show app's origin log urls.
   19/01/17 08:35:39 WARN HistoryAppStatusListener: Fail to renew executor log urls: some of required attributes are missing in app's event log.. Required: Set(NON_EXISTING, NODE_HTTP_ADDRESS, USER, HTTP_SCHEME, CONTAINER_ID) / available: Set(NODE_HTTP_ADDRESS, USER, LOG_FILES, CLUSTER_ID, HTTP_SCHEME, CONTAINER_ID). Failing back to show app's origin log urls.
   ```
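
The fallback behaviour in the log above can be sketched roughly as below. This is an illustration in Python under stated assumptions, not the PR's implementation: `renew_log_urls`, the logger name, the port `8042`, and the sample URLs are hypothetical. It assumes `LOG_FILES` is a comma-separated list of file names and that `{{FILE_NAME}}` is expanded once per file rather than looked up as an attribute, which is consistent with `FILE_NAME` being absent from the "Required" set in the warning. Requiring `LOG_FILES` whenever `{{FILE_NAME}}` appears is an extra guard added by this sketch.

```python
import logging
import re

PLACEHOLDER = re.compile(r"\{\{([A-Z_]+)\}\}")
log = logging.getLogger("history_sketch")  # hypothetical logger name


def renew_log_urls(pattern: str, attributes: dict, original_urls: dict) -> dict:
    """Return {file_name: url}; keep original_urls if a required attribute is missing."""
    names = set(PLACEHOLDER.findall(pattern))
    # FILE_NAME is filled in per log file, so it is not a required attribute.
    required = names - {"FILE_NAME"}
    if "FILE_NAME" in names:
        required.add("LOG_FILES")  # guard added by this sketch only
    available = set(attributes)
    if not required <= available:
        log.warning(
            "Missing attributes %s; showing the original log URLs.",
            sorted(required - available),
        )
        return dict(original_urls)

    renewed = {}
    for file_name in attributes["LOG_FILES"].split(","):
        values = {**attributes, "FILE_NAME": file_name}
        renewed[file_name] = PLACEHOLDER.sub(lambda m: values[m.group(1)], pattern)
    return renewed


# Illustrative inputs; the host, port, and original URLs are made up.
attributes = {
    "HTTP_SCHEME": "http://",
    "NODE_HTTP_ADDRESS": "node_host:8042",
    "CLUSTER_ID": "",
    "CONTAINER_ID": "container_1547708601909_0010_01_000002",
    "USER": "spark",
    "LOG_FILES": "stderr,stdout",
}
original = {
    "stderr": "http://node_host:8042/original/stderr",
    "stdout": "http://node_host:8042/original/stdout",
}
good = (
    "{{HTTP_SCHEME}}{{NODE_HTTP_ADDRESS}}/node/containerlogs/"
    "{{CONTAINER_ID}}/{{USER}}/{{FILE_NAME}}?start=-4096"
)
bad = good.replace("{{FILE_NAME}}", "nonexisting/{{NON_EXISTING}}/{{FILE_NAME}}")

renewed_urls = renew_log_urls(good, attributes, original)
fallback_urls = renew_log_urls(bad, attributes, original)
```

With the `bad` pattern the sketch returns the original URLs unchanged and logs a warning, mirroring the behaviour reported above.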

