[
https://issues.apache.org/jira/browse/BEAM-5440?focusedWorklogId=270578&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-270578
]
ASF GitHub Bot logged work on BEAM-5440:
----------------------------------------
Author: ASF GitHub Bot
Created on: 02/Jul/19 03:39
Start Date: 02/Jul/19 03:39
Worklog Time Spent: 10m
Work Description: sambvfx commented on pull request #8982: [BEAM-5440]
Pass docker run options to SDK harness containers
URL: https://github.com/apache/beam/pull/8982
This adds support for passing docker run options through to the SDK harness
container. The goal was to support mounting volumes, but the change is
generalized to cover any other `docker run [OPTIONS]` the user may want
(`--user`, perhaps?).
After reviewing the related [jira
issue](https://issues.apache.org/jira/browse/BEAM-5440), I chose the route of
modifying the `DockerPayload` proto rather than adding new
`PortableOptions` flags. Instead, these options are parsed from the
existing `PortableOptions.environment_config` within the SDK. I already find
the way `PortableOptions.environment_type` and
`PortableOptions.environment_config` work fairly unintuitive, so adding this
additional esoteric behavior seemed acceptable for now.
```
--environment_config "-v /tmp/beam_test:/tmp/beam_test {container_name}"
```
With these changes, I've successfully executed a simple Beam graph with the
Flink runner (via the Python SDK) that mounts a volume and touches a file.
These changes should not break existing pipelines, which simply omit the
docker options.
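As an illustration only, the parsing described above could look roughly like the sketch below. It is not the code from the pull request: `DockerConfig` and `parse_environment_config` are hypothetical names, and the sketch assumes the container image is always the last whitespace-separated token of `environment_config`.

```python
import shlex
from typing import List, NamedTuple


class DockerConfig(NamedTuple):
    """Hypothetical holder for the parsed pieces; not a Beam type."""
    options: List[str]
    container_image: str


def parse_environment_config(environment_config: str) -> DockerConfig:
    """Split an environment_config string into docker run options and image.

    Assumption: the image name is the final token and everything before it
    is a `docker run` option. shlex handles quoted arguments.
    """
    tokens = shlex.split(environment_config)
    if not tokens:
        raise ValueError("environment_config must name a container image")
    return DockerConfig(options=tokens[:-1], container_image=tokens[-1])
```

A config that contains only an image name yields an empty options list, which matches the claim that pipelines omitting the docker options keep working.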
Assuming this is the desired approach, remaining issues would be:
- Port changes to other SDKs
- Add concept of `DockerPayload.options` to other parts of the core code,
e.g.
[Environments.createDockerEnvironment](https://github.com/apache/beam/blob/master/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/Environments.java#L117).
- Better tests? I've added a simple test in `portability_runner_test.py`
confirming the parsing/construction of the DockerPayload proto.
- I also updated `DockerSdkWorkerHandler`, but as far as I can tell this
isn't being used anywhere?
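
For the second item, a consumer of the proposed `DockerPayload.options` would presumably place the options between `docker run` and the image name. The sketch below shows that ordering only; `build_docker_run_command` and `harness_args` are hypothetical names, and the real runner code may assemble the command differently.

```python
from typing import List, Sequence


def build_docker_run_command(
    container_image: str,
    options: Sequence[str] = (),
    harness_args: Sequence[str] = (),
) -> List[str]:
    """Assemble an argv list following `docker run [OPTIONS] IMAGE [ARG...]`.

    Hypothetical helper: user-supplied options go before the image, and
    arguments for the SDK harness go after it.
    """
    return ["docker", "run", "-d", *options, container_image, *harness_args]
```

With an empty `options` sequence the command is the same as it would be without this feature.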
Any additional guidance would be appreciated!
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 270578)
Time Spent: 10m
Remaining Estimate: 0h
> Add option to mount a directory inside SDK harness containers
> -------------------------------------------------------------
>
> Key: BEAM-5440
> URL: https://issues.apache.org/jira/browse/BEAM-5440
> Project: Beam
> Issue Type: New Feature
> Components: java-fn-execution, sdk-java-core
> Reporter: Maximilian Michels
> Priority: Major
> Labels: portability, portability-flink
> Time Spent: 10m
> Remaining Estimate: 0h
>
> While experimenting with the Python SDK locally, I found it inconvenient that
> I can't mount a host directory to the Docker containers, i.e. the input must
> already be in the container and the results of a Write remain inside the
> container. For local testing, users may want to mount a host directory.
> Since BEAM-5288 the {{Environment}} carries explicit environment information,
> we could a) add volume args to the {{DockerPayload}}, or b) provide a general
> Docker arguments field.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)