[
https://issues.apache.org/jira/browse/KYLIN-5636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xiaoxiang Yu resolved KYLIN-5636.
---------------------------------
Resolution: Fixed
> automatically clean up dependent files after the build task
> -----------------------------------------------------------
>
> Key: KYLIN-5636
> URL: https://issues.apache.org/jira/browse/KYLIN-5636
> Project: Kylin
> Issue Type: Improvement
> Components: Tools, Build and Test
> Affects Versions: 5.0-alpha
> Reporter: Zhiting Guo
> Assignee: Zhiting Guo
> Priority: Major
> Fix For: 5.0-beta
>
>
> *Problem:*
> Files uploaded under the path configured by spark.kubernetes.file.upload.path
> are never deleted automatically.
> 1. Build tasks run in cluster mode, so Spark creates a driver pod for each
> job and uploads the job's dependencies to the configured path. Running build
> tasks repeatedly therefore accumulates a large number of files under it.
> 2. The upload.path we currently configure (s3a://kylin/spark-on-k8s) is a
> fixed path; Spark creates a spark-upload-uuid subdirectory inside it and
> stores the dependencies there, as shown in the sketch below.
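> A minimal sketch of the configuration described above (the values are the
> ones from this issue; both keys are standard Spark-on-Kubernetes settings):
> {code:java}
> import org.apache.spark.SparkConf;
>
> public class UploadPathConfig {
>     public static void main(String[] args) {
>         SparkConf conf = new SparkConf()
>                 // Build jobs run in cluster mode, so each job creates a driver pod
>                 .set("spark.submit.deployMode", "cluster")
>                 // Spark copies local dependencies into a spark-upload-<uuid>
>                 // subdirectory of this path before starting the driver pod
>                 .set("spark.kubernetes.file.upload.path", "s3a://kylin/spark-on-k8s");
>         System.out.println(conf.get("spark.kubernetes.file.upload.path"));
>     }
> }
> {code}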
> *Dev design:*
> Core idea: add a dynamic subdirectory under the original upload.path and
> delete that entire subdirectory when the task is over.
> Build task: upload.path + jobId (e.g. s3a://kylin/spark-on-k8s/uuid); the
> dependency directory is deleted when the build task finishes, as in the
> sketch below.
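> A minimal sketch of the per-job directory and its cleanup via the Hadoop
> FileSystem API (the class and method names here are illustrative, not the
> actual Kylin implementation):
> {code:java}
> import java.net.URI;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
>
> // Hypothetical helper: scope the upload path to one build job, then remove it.
> public class JobUploadDirCleaner {
>
>     private static final String UPLOAD_ROOT = "s3a://kylin/spark-on-k8s";
>
>     // upload.path + jobId, e.g. s3a://kylin/spark-on-k8s/<jobId>
>     public static String uploadPathFor(String jobId) {
>         return UPLOAD_ROOT + "/" + jobId;
>     }
>
>     // Called once the build task finishes: recursively deletes the whole
>     // per-job subdirectory, including the spark-upload-* directories in it.
>     public static void cleanup(String jobId, Configuration hadoopConf) throws Exception {
>         FileSystem fs = FileSystem.get(new URI(UPLOAD_ROOT), hadoopConf);
>         fs.delete(new Path(uploadPathFor(jobId)), true);
>     }
> }
> {code}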
>
> The automatic deletion only runs when the build task ends normally; a
> kill -9 means the deletion function is never called. A fallback garbage
> collection policy is therefore needed, e.g. subdirectories older than three
> months are deleted automatically (see the sketch below).
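> A sketch of that fallback sweep under the same assumptions (the three-month
> threshold comes from this issue; note that directory modification times on
> object stores such as S3 can be unreliable, so a real implementation might
> instead check the newest file inside each subdirectory):
> {code:java}
> import java.net.URI;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileStatus;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
>
> // Hypothetical fallback sweep: removes per-job upload directories left
> // behind by kill -9 once they are older than roughly three months.
> public class StaleUploadDirSweeper {
>
>     private static final String UPLOAD_ROOT = "s3a://kylin/spark-on-k8s";
>     private static final long MAX_AGE_MS = 90L * 24 * 60 * 60 * 1000;
>
>     public static void sweep(Configuration hadoopConf) throws Exception {
>         FileSystem fs = FileSystem.get(new URI(UPLOAD_ROOT), hadoopConf);
>         long now = System.currentTimeMillis();
>         for (FileStatus status : fs.listStatus(new Path(UPLOAD_ROOT))) {
>             if (status.isDirectory() && now - status.getModificationTime() > MAX_AGE_MS) {
>                 fs.delete(status.getPath(), true); // drop the whole stale subdirectory
>             }
>         }
>     }
> }
> {code}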
--
This message was sent by Atlassian Jira
(v8.20.10#820010)