[
https://issues.apache.org/jira/browse/SPARK-35428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rostislav Nedelchev updated SPARK-35428:
----------------------------------------
Attachment: image-2022-08-03-12-03-39-533.png
> Spark history Server to S3 doesn't show incomplete applications
> ---------------------------------------------------------------
>
> Key: SPARK-35428
> URL: https://issues.apache.org/jira/browse/SPARK-35428
> Project: Spark
> Issue Type: Bug
> Components: Structured Streaming
> Affects Versions: 2.4.5
> Environment: Jupyter Notebook sparkmagic with Spark(2.4.5) client
> mode running on Kubernetes
> Reporter: Tianbin Jiang
> Priority: Major
> Attachments: image-2022-08-03-12-03-39-533.png
>
>
> Jupyter Notebook sparkmagic with Spark (2.4.5) in client mode, running on
> Kubernetes. I am redirecting the Spark event logs to an S3 bucket with the
> following configuration:
>
> spark.eventLog.enabled = true
> spark.history.ui.port = 18080
> spark.eventLog.dir = s3://livy-spark-log/spark-history/
> spark.history.fs.logDirectory = s3://livy-spark-log/spark-history/
> spark.history.fs.update.interval = 5s
> spark.eventLog.buffer.kb = 1k
>
> spark.streaming.driver.writeAheadLog.closeFileAfterWrite = true
> spark.streaming.receiver.writeAheadLog.closeFileAfterWrite = true
>
>
> Once my application has completed, it shows up on the Spark history
> server. However, running applications don't show up under "incomplete
> applications". I have also checked the log; whenever my application ends, I
> can see these messages:
>
> {{21/05/17 06:14:18 INFO k8s.KubernetesClusterSchedulerBackend: Shutting
> down all executors}}
> {{21/05/17 06:14:18 INFO
> k8s.KubernetesClusterSchedulerBackend$KubernetesDriverEndpoint: Asking each
> executor to shut down}}
> {{21/05/17 06:14:18 WARN k8s.ExecutorPodsWatchSnapshotSource: Kubernetes
> client has been closed (this is expected if the application is shutting
> down.)}}
> *{{21/05/17 06:14:18 INFO s3n.MultipartUploadOutputStream: close
> closed:false
> s3://livy-spark-log/spark-history/spark-48c3141875fe4c67b5708400134ea3d6.inprogress}}*
> *{{21/05/17 06:14:19 INFO s3n.S3NativeFileSystem: rename
> s3://livy-spark-log/spark-history/spark-48c3141875fe4c67b5708400134ea3d6.inprogress
> s3://livy-spark-log/spark-history/spark-48c3141875fe4c67b5708400134ea3d6}}*
> {{21/05/17 06:14:19 INFO spark.MapOutputTrackerMasterEndpoint:
> MapOutputTrackerMasterEndpoint stopped!}}
> {{21/05/17 06:14:19 INFO memory.MemoryStore: MemoryStore cleared}}
> {{21/05/17 06:14:19 INFO storage.BlockManager: BlockManager stopped}}
>
>
> {{I am not able to see any xx.inprogress file on S3, though. Has anyone had
> this problem before? Otherwise, I would take it as a bug.}}
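
The log above shows the event log being closed as a ".inprogress" object and then renamed to its final name. The report turns on whether any ".inprogress" object is visible in the log directory while the application runs. One possible explanation, not confirmed in this report, is that an object written through a multipart upload only becomes visible in S3 once the stream is closed. A minimal sketch of how a listing of the log directory could be checked is below. The function name split_event_logs and the sample keys are hypothetical, and the keys are assumed to come from whatever S3 listing tool is at hand; only the ".inprogress" suffix and the object names are taken from the log.

```python
from typing import Iterable, List, Tuple

# Suffix seen in the reporter's log for an event log that is still being written.
IN_PROGRESS_SUFFIX = ".inprogress"


def split_event_logs(keys: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split object keys into (complete, in_progress) event logs.

    Hypothetical helper: `keys` is assumed to be a plain listing of the
    event log directory, e.g. the keys under spark-history/.
    """
    complete: List[str] = []
    in_progress: List[str] = []
    for key in keys:
        if key.endswith("/"):
            # Skip directory placeholder keys.
            continue
        if key.endswith(IN_PROGRESS_SUFFIX):
            in_progress.append(key)
        else:
            complete.append(key)
    return complete, in_progress


if __name__ == "__main__":
    # Sample listing built from the object name in the log above.
    listing = [
        "spark-history/",
        "spark-history/spark-48c3141875fe4c67b5708400134ea3d6",
    ]
    done, running = split_event_logs(listing)
    print("complete:", done)
    print("in progress:", running)
```

If the in-progress list stays empty while an application is still running, the history server has nothing to show under "incomplete applications", which would match the behaviour described above.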