[
https://issues.apache.org/jira/browse/BEAM-6237?focusedWorklogId=187129&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-187129
]
ASF GitHub Bot logged work on BEAM-6237:
----------------------------------------
Author: ASF GitHub Bot
Created on: 18/Jan/19 22:58
Start Date: 18/Jan/19 22:58
Worklog Time Spent: 10m
Work Description: youngoli commented on pull request #7571: [BEAM-6237]
Fix ULR not deleting artifacts after running jobs.
URL: https://github.com/apache/beam/pull/7571
This change switches the ULR from using
LocalFileSystemArtifact[Stager/Retrieval]Service to using
BeamFileSystemArtifact[Staging/Retrieval]Service, which can remove
artifacts after a job has run. With this change, ValidatesRunner tests no
longer leave behind large numbers of artifacts when run with the ULR.
Other code had to change to allow this switch. In particular, the old
code stored the path to the staged files after creating the staging
service. The new code instead stores an artifact staging session token to
keep track of a specific staging session (since the job server may have
multiple staging sessions from different jobs). The new code also
correctly passes the artifact retrieval token (passed to the ReferenceRunner
as part of a RunJobRequest) to the BeamFileSystemArtifactRetrievalService.
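
To illustrate the idea of tracking staged files by session token rather than by
path, here is a minimal sketch. It is not the Beam implementation: the class
`SessionScopedArtifactStore` and its methods are hypothetical stand-ins, and the
on-disk layout (one directory per token) is an assumption made for the example.
It only shows why a per-session token lets a job server clean up one job's
artifacts without touching another's.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical illustration only; this is not a Beam class. */
class SessionScopedArtifactStore {
  private final Path root;
  // Maps each staging session token to the directory holding its artifacts.
  private final Map<String, Path> sessions = new ConcurrentHashMap<>();

  SessionScopedArtifactStore(Path root) {
    this.root = root;
  }

  /** Opens a staging session and returns an opaque token that identifies it. */
  String openSession() throws IOException {
    String token = UUID.randomUUID().toString();
    sessions.put(token, Files.createDirectories(root.resolve(token)));
    return token;
  }

  /** Stages one artifact under the given session. */
  Path stage(String token, String name, byte[] contents) throws IOException {
    return Files.write(sessionDir(token).resolve(name), contents);
  }

  /** Retrieves an artifact previously staged under the given session. */
  byte[] retrieve(String token, String name) throws IOException {
    return Files.readAllBytes(sessionDir(token).resolve(name));
  }

  /** Deletes everything staged for one session; other sessions are untouched. */
  void removeArtifacts(String token) throws IOException {
    Path dir = sessions.remove(token);
    if (dir == null) {
      return; // Unknown or already-removed session: nothing to do.
    }
    List<Path> toDelete;
    try (Stream<Path> paths = Files.walk(dir)) {
      // Reverse order lists children before their parent directory.
      toDelete = paths.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
    }
    for (Path p : toDelete) {
      Files.delete(p);
    }
  }

  private Path sessionDir(String token) {
    Path dir = sessions.get(token);
    if (dir == null) {
      throw new IllegalArgumentException("Unknown staging session: " + token);
    }
    return dir;
  }
}
```

In this sketch a caller would hold on to the token returned by `openSession()`,
pass it along with the job, and call `removeArtifacts(token)` once the job has
finished. Because the token rather than a shared path identifies the session,
two jobs staged on the same server do not interfere with each other's cleanup.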
------------------------
Follow this checklist to help us incorporate your contribution quickly and
easily:
- [x] Format the pull request title like `[BEAM-XXX] Fixes bug in
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA
issue, if applicable. This will automatically link the pull request to the
issue.
- [ ] If this contribution is large, please file an Apache [Individual
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
It will help us expedite review of your Pull Request if you tag someone
(e.g. `@username`) to look at it.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Issue Time Tracking
-------------------
Worklog Id: (was: 187129)
Time Spent: 10m
Remaining Estimate: 0h
> ULR ValidatesRunner tests not deleting artifacts.
> -------------------------------------------------
>
> Key: BEAM-6237
> URL: https://issues.apache.org/jira/browse/BEAM-6237
> Project: Beam
> Issue Type: Bug
> Components: runner-direct
> Reporter: Daniel Oliveira
> Assignee: Daniel Oliveira
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> When running ValidatesRunner tests with the ULR, artifacts are never deleted.
> Since a new job is run per test, this quickly uses up massive amounts of disk
> storage (over 20 gigabytes per execution). As a result, the machine running
> these tests often runs out of disk space, and tests start failing.
> The ULR should be modified to delete these artifacts after they have been
> staged to avoid this issue. Flink already does this, so the infrastructure
> exists.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)