[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness
[ https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=415409&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-415409 ]

ASF GitHub Bot logged work on BEAM-7945:

    Author: ASF GitHub Bot
    Created on: 03/Apr/20 09:37
    Start Date: 03/Apr/20 09:37
    Worklog Time Spent: 10m
    Work Description: mxm commented on pull request #9452: [BEAM-7945] Allow runner to configure semi_persist_dir which is used …
URL: https://github.com/apache/beam/pull/9452#discussion_r402882661

## File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/environment/DockerEnvironmentFactory.java ##
@@ -132,20 +133,30 @@ public RemoteEnvironment createEnvironment(Environment environment) throws Excep
       // host networking on Mac)
       .add("--env=DOCKER_MAC_CONTAINER=" + System.getenv("DOCKER_MAC_CONTAINER"));
-List<String> args =
-    ImmutableList.of(
-        String.format("--id=%s", workerId),
-        String.format("--logging_endpoint=%s", loggingEndpoint),
-        String.format("--artifact_endpoint=%s", artifactEndpoint),
-        String.format("--provision_endpoint=%s", provisionEndpoint),
-        String.format("--control_endpoint=%s", controlEndpoint));
+Boolean retainDockerContainer =
+    pipelineOptions.as(ManualDockerEnvironmentOptions.class).getRetainDockerContainers();
+if (!retainDockerContainer) {
+  dockerOptsBuilder.add("--rm");

Review comment:
Fix: https://github.com/apache/beam/pull/11303

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
    Worklog Id: (was: 415409)
    Time Spent: 4h 40m (was: 4.5h)

> Allow runner to configure "semi_persist_dir" which is used in the SDK harness
> -
>
>                 Key: BEAM-7945
>                 URL: https://issues.apache.org/jira/browse/BEAM-7945
>             Project: Beam
>          Issue Type: Sub-task
>          Components: java-fn-execution, sdk-go, sdk-java-core, sdk-py-core
>            Reporter: sunjincheng
>            Assignee: sunjincheng
>            Priority: Major
>             Fix For: 2.17.0
>
>          Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> Currently "semi_persist_dir" is not configurable. This may become a problem
> in certain scenarios. For example, the default value of "semi_persist_dir" is
> "/tmp"
> ([https://github.com/apache/beam/blob/master/sdks/python/container/boot.go#L48])
> in Python SDK harness. When the environment type is "PROCESS", the disk of
> "/tmp" may be filled up and unexpected issues will occur in production
> environment. We should provide a way to configure "semi_persist_dir" in
> EnvironmentFactory at the runner side.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness
[ https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=415392&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-415392 ]

ASF GitHub Bot logged work on BEAM-7945:

    Author: ASF GitHub Bot
    Created on: 03/Apr/20 08:53
    Start Date: 03/Apr/20 08:53
    Worklog Time Spent: 10m
    Work Description: mxm commented on pull request #9452: [BEAM-7945] Allow runner to configure semi_persist_dir which is used …
URL: https://github.com/apache/beam/pull/9452#discussion_r402852054

## File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/environment/DockerEnvironmentFactory.java ##
@@ -132,20 +133,30 @@ public RemoteEnvironment createEnvironment(Environment environment) throws Excep
       // host networking on Mac)
       .add("--env=DOCKER_MAC_CONTAINER=" + System.getenv("DOCKER_MAC_CONTAINER"));
-List<String> args =
-    ImmutableList.of(
-        String.format("--id=%s", workerId),
-        String.format("--logging_endpoint=%s", loggingEndpoint),
-        String.format("--artifact_endpoint=%s", artifactEndpoint),
-        String.format("--provision_endpoint=%s", provisionEndpoint),
-        String.format("--control_endpoint=%s", controlEndpoint));
+Boolean retainDockerContainer =
+    pipelineOptions.as(ManualDockerEnvironmentOptions.class).getRetainDockerContainers();
+if (!retainDockerContainer) {
+  dockerOptsBuilder.add("--rm");

Review comment:
Indeed looks like a rebasing error. A bit tricky one to get right because we changed the way the container removal worked before the rebasing was done. We were using the `--rm` initially but before the rebase we changed it to remove the container explicitly via `docker remove `.
Issue Time Tracking
---
    Worklog Id: (was: 415392)
    Time Spent: 4.5h (was: 4h 20m)
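The two container cleanup strategies mentioned in the comment above can be contrasted in a small sketch. This is only an illustration under stated assumptions, not Beam's actual code: the class `DockerCleanupSketch` and its methods are hypothetical, and the explicit removal is written with the Docker CLI's `docker rm -f`, which may differ from the command the runner really issues.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper (not part of Beam) that builds docker command lines. */
final class DockerCleanupSketch {

  /** Builds `docker run`; `--rm` is added only when Docker should delete the container on exit. */
  static List<String> runCommand(String image, boolean autoRemove) {
    List<String> cmd = new ArrayList<>();
    cmd.add("docker");
    cmd.add("run");
    cmd.add("-d");
    if (autoRemove) {
      cmd.add("--rm");
    }
    cmd.add(image);
    return cmd;
  }

  /** Builds the explicit removal command; returns an empty list when containers are retained. */
  static List<String> removeCommand(String containerId, boolean retainContainer) {
    if (retainContainer) {
      return Collections.emptyList();
    }
    return Arrays.asList("docker", "rm", "-f", containerId);
  }

  public static void main(String[] args) {
    // Strategy 1: let Docker clean up automatically.
    System.out.println(runCommand("example/image:latest", true));
    // Strategy 2: start without --rm, then remove explicitly once the worker is done.
    System.out.println(runCommand("example/image:latest", false));
    System.out.println(removeCommand("abc123", false));
  }
}
```

With explicit removal, a "retain containers" option only has to skip the removal step, which is one reason an extra `--rm` on `docker run` would look like a leftover from the rebase.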
[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness
[ https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=415099&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-415099 ]

ASF GitHub Bot logged work on BEAM-7945:

    Author: ASF GitHub Bot
    Created on: 02/Apr/20 22:40
    Start Date: 02/Apr/20 22:40
    Worklog Time Spent: 10m
    Work Description: ibzib commented on pull request #9452: [BEAM-7945] Allow runner to configure semi_persist_dir which is used …
URL: https://github.com/apache/beam/pull/9452#discussion_r402637170

## File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/environment/DockerEnvironmentFactory.java ##
@@ -132,20 +133,30 @@ public RemoteEnvironment createEnvironment(Environment environment) throws Excep
       // host networking on Mac)
       .add("--env=DOCKER_MAC_CONTAINER=" + System.getenv("DOCKER_MAC_CONTAINER"));
-List<String> args =
-    ImmutableList.of(
-        String.format("--id=%s", workerId),
-        String.format("--logging_endpoint=%s", loggingEndpoint),
-        String.format("--artifact_endpoint=%s", artifactEndpoint),
-        String.format("--provision_endpoint=%s", provisionEndpoint),
-        String.format("--control_endpoint=%s", controlEndpoint));
+Boolean retainDockerContainer =
+    pipelineOptions.as(ManualDockerEnvironmentOptions.class).getRetainDockerContainers();
+if (!retainDockerContainer) {
+  dockerOptsBuilder.add("--rm");

Review comment:
Why was this added in this PR? It seems orthogonal to `semi_persist_dir`. I believe this is a regression; perhaps a rebasing error?
Issue Time Tracking
---
    Worklog Id: (was: 415099)
    Time Spent: 4h 20m (was: 4h 10m)
[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness
[ https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=315209&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315209 ]

ASF GitHub Bot logged work on BEAM-7945:

    Author: ASF GitHub Bot
    Created on: 19/Sep/19 17:51
    Start Date: 19/Sep/19 17:51
    Worklog Time Spent: 10m
    Work Description: mxm commented on pull request #9452: [BEAM-7945] Allow runner to configure semi_persist_dir which is used …
URL: https://github.com/apache/beam/pull/9452

Issue Time Tracking
---
    Worklog Id: (was: 315209)
    Time Spent: 4h 10m (was: 4h)
[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness
[ https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=315208&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315208 ]

ASF GitHub Bot logged work on BEAM-7945:

    Author: ASF GitHub Bot
    Created on: 19/Sep/19 17:50
    Start Date: 19/Sep/19 17:50
    Worklog Time Spent: 10m
    Work Description: mxm commented on pull request #9452: [BEAM-7945] Allow runner to configure semi_persist_dir which is used …
URL: https://github.com/apache/beam/pull/9452#discussion_r326304747

## File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/environment/DockerEnvironmentFactory.java ##
@@ -132,20 +133,30 @@ public RemoteEnvironment createEnvironment(Environment environment) throws Excep
       // host networking on Mac)
       .add("--env=DOCKER_MAC_CONTAINER=" + System.getenv("DOCKER_MAC_CONTAINER"));
-List<String> args =
-    ImmutableList.of(
-        String.format("--id=%s", workerId),
-        String.format("--logging_endpoint=%s", loggingEndpoint),
-        String.format("--artifact_endpoint=%s", artifactEndpoint),
-        String.format("--provision_endpoint=%s", provisionEndpoint),
-        String.format("--control_endpoint=%s", controlEndpoint));
+Boolean retainDockerContainer =
+    pipelineOptions.as(ManualDockerEnvironmentOptions.class).getRetainDockerContainers();
+if (!retainDockerContainer) {
+  dockerOptsBuilder.add("--rm");
+}
+
+String semiPersistDir = pipelineOptions.as(RemoteEnvironmentOptions.class).getSemiPersistDir();
+ImmutableList.Builder<String> argsBuilder =
+    ImmutableList.<String>builder()
+        .add(String.format("--id=%s", workerId))
+        .add(String.format("--logging_endpoint=%s", loggingEndpoint))
+        .add(String.format("--artifact_endpoint=%s", artifactEndpoint))
+        .add(String.format("--provision_endpoint=%s", provisionEndpoint))
+        .add(String.format("--control_endpoint=%s", controlEndpoint));
+if (semiPersistDir != null) {

Review comment:
Actually, the semi_persist_dir is not inferred from the pipeline options in the bootloader code. Like you said, it has to be this way currently, but it would be nice to not duplicate this information in the future.

Issue Time Tracking
---
    Worklog Id: (was: 315208)
    Time Spent: 4h (was: 3h 50m)
[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness
[ https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=315202&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315202 ]

ASF GitHub Bot logged work on BEAM-7945:

    Author: ASF GitHub Bot
    Created on: 19/Sep/19 17:32
    Start Date: 19/Sep/19 17:32
    Worklog Time Spent: 10m
    Work Description: mxm commented on pull request #9452: [BEAM-7945] Allow runner to configure semi_persist_dir which is used …
URL: https://github.com/apache/beam/pull/9452#discussion_r326296442

## File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/environment/DockerEnvironmentFactory.java ##
@@ -132,20 +133,30 @@ public RemoteEnvironment createEnvironment(Environment environment) throws Excep
       // host networking on Mac)
       .add("--env=DOCKER_MAC_CONTAINER=" + System.getenv("DOCKER_MAC_CONTAINER"));
-List<String> args =
-    ImmutableList.of(
-        String.format("--id=%s", workerId),
-        String.format("--logging_endpoint=%s", loggingEndpoint),
-        String.format("--artifact_endpoint=%s", artifactEndpoint),
-        String.format("--provision_endpoint=%s", provisionEndpoint),
-        String.format("--control_endpoint=%s", controlEndpoint));
+Boolean retainDockerContainer =
+    pipelineOptions.as(ManualDockerEnvironmentOptions.class).getRetainDockerContainers();
+if (!retainDockerContainer) {
+  dockerOptsBuilder.add("--rm");
+}
+
+String semiPersistDir = pipelineOptions.as(RemoteEnvironmentOptions.class).getSemiPersistDir();
+ImmutableList.Builder<String> argsBuilder =
+    ImmutableList.<String>builder()
+        .add(String.format("--id=%s", workerId))
+        .add(String.format("--logging_endpoint=%s", loggingEndpoint))
+        .add(String.format("--artifact_endpoint=%s", artifactEndpoint))
+        .add(String.format("--provision_endpoint=%s", provisionEndpoint))
+        .add(String.format("--control_endpoint=%s", controlEndpoint));
+if (semiPersistDir != null) {

Review comment:
That's right. This is redundant here.
Issue Time Tracking
---
    Worklog Id: (was: 315202)
    Time Spent: 3h 50m (was: 3h 40m)
[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness
[ https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=315174&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315174 ]

ASF GitHub Bot logged work on BEAM-7945:

    Author: ASF GitHub Bot
    Created on: 19/Sep/19 16:50
    Start Date: 19/Sep/19 16:50
    Worklog Time Spent: 10m
    Work Description: tweise commented on pull request #9452: [BEAM-7945] Allow runner to configure semi_persist_dir which is used …
URL: https://github.com/apache/beam/pull/9452#discussion_r326278269

## File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/environment/DockerEnvironmentFactory.java ##
@@ -132,20 +133,30 @@ public RemoteEnvironment createEnvironment(Environment environment) throws Excep
       // host networking on Mac)
       .add("--env=DOCKER_MAC_CONTAINER=" + System.getenv("DOCKER_MAC_CONTAINER"));
-List<String> args =
-    ImmutableList.of(
-        String.format("--id=%s", workerId),
-        String.format("--logging_endpoint=%s", loggingEndpoint),
-        String.format("--artifact_endpoint=%s", artifactEndpoint),
-        String.format("--provision_endpoint=%s", provisionEndpoint),
-        String.format("--control_endpoint=%s", controlEndpoint));
+Boolean retainDockerContainer =
+    pipelineOptions.as(ManualDockerEnvironmentOptions.class).getRetainDockerContainers();
+if (!retainDockerContainer) {
+  dockerOptsBuilder.add("--rm");
+}
+
+String semiPersistDir = pipelineOptions.as(RemoteEnvironmentOptions.class).getSemiPersistDir();
+ImmutableList.Builder<String> argsBuilder =
+    ImmutableList.<String>builder()
+        .add(String.format("--id=%s", workerId))
+        .add(String.format("--logging_endpoint=%s", loggingEndpoint))
+        .add(String.format("--artifact_endpoint=%s", artifactEndpoint))
+        .add(String.format("--provision_endpoint=%s", provisionEndpoint))
+        .add(String.format("--control_endpoint=%s", controlEndpoint));
+if (semiPersistDir != null) {

Review comment:
So we essentially pass the same piece of information to the worker twice: As entry point argument and then again within the pipeline options. It needs to be done this way due to the container contract. Would be nice to revisit in the future.

Issue Time Tracking
---
    Worklog Id: (was: 315174)
    Time Spent: 3h 40m (was: 3.5h)
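The entry point argument discussed in this comment can be illustrated with a self-contained sketch that uses only the JDK instead of Guava. It assumes the flag is spelled `--semi_persist_dir=<dir>` (the quoted diff is cut off before that line, so the spelling is inferred from the option name), and `WorkerArgsSketch` with its method is a hypothetical name rather than code from the PR.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch (not Beam code): worker boot arguments with an optional flag. */
final class WorkerArgsSketch {

  /** Builds the boot arguments; semiPersistDir may be null, in which case no flag is emitted. */
  static List<String> workerArgs(
      String workerId,
      String loggingEndpoint,
      String artifactEndpoint,
      String provisionEndpoint,
      String controlEndpoint,
      String semiPersistDir) {
    List<String> args = new ArrayList<>();
    args.add(String.format("--id=%s", workerId));
    args.add(String.format("--logging_endpoint=%s", loggingEndpoint));
    args.add(String.format("--artifact_endpoint=%s", artifactEndpoint));
    args.add(String.format("--provision_endpoint=%s", provisionEndpoint));
    args.add(String.format("--control_endpoint=%s", controlEndpoint));
    if (semiPersistDir != null) {
      // Assumed flag spelling; only passed when the runner configured a directory.
      args.add(String.format("--semi_persist_dir=%s", semiPersistDir));
    }
    return Collections.unmodifiableList(args);
  }

  public static void main(String[] unused) {
    System.out.println(
        workerArgs("worker-1", "localhost:1001", "localhost:1002", "localhost:1003",
            "localhost:1004", null));
    System.out.println(
        workerArgs("worker-1", "localhost:1001", "localhost:1002", "localhost:1003",
            "localhost:1004", "/data/beam"));
  }
}
```

Omitting the flag when nothing is configured leaves the choice of directory to the boot code in the environment, while the same value may still travel inside the pipeline options, which is the duplication the comment points out.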
[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness
[ https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=315166&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315166 ]

ASF GitHub Bot logged work on BEAM-7945:

    Author: ASF GitHub Bot
    Created on: 19/Sep/19 16:37
    Start Date: 19/Sep/19 16:37
    Worklog Time Spent: 10m
    Work Description: tweise commented on pull request #9452: [BEAM-7945] Allow runner to configure semi_persist_dir which is used …
URL: https://github.com/apache/beam/pull/9452#discussion_r326273162

## File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/environment/DockerEnvironmentFactory.java ##
@@ -132,20 +133,30 @@ public RemoteEnvironment createEnvironment(Environment environment) throws Excep
       // host networking on Mac)
       .add("--env=DOCKER_MAC_CONTAINER=" + System.getenv("DOCKER_MAC_CONTAINER"));
-List<String> args =
-    ImmutableList.of(
-        String.format("--id=%s", workerId),
-        String.format("--logging_endpoint=%s", loggingEndpoint),
-        String.format("--artifact_endpoint=%s", artifactEndpoint),
-        String.format("--provision_endpoint=%s", provisionEndpoint),
-        String.format("--control_endpoint=%s", controlEndpoint));
+Boolean retainDockerContainer =
+    pipelineOptions.as(ManualDockerEnvironmentOptions.class).getRetainDockerContainers();
+if (!retainDockerContainer) {
+  dockerOptsBuilder.add("--rm");
+}
+
+String semiPersistDir = pipelineOptions.as(RemoteEnvironmentOptions.class).getSemiPersistDir();
+ImmutableList.Builder<String> argsBuilder =
+    ImmutableList.<String>builder()
+        .add(String.format("--id=%s", workerId))
+        .add(String.format("--logging_endpoint=%s", loggingEndpoint))
+        .add(String.format("--artifact_endpoint=%s", artifactEndpoint))
+        .add(String.format("--provision_endpoint=%s", provisionEndpoint))
+        .add(String.format("--control_endpoint=%s", controlEndpoint));
+if (semiPersistDir != null) {

Review comment:
Isn't this provided to the environment through the provision endpoint?
Issue Time Tracking
---
    Worklog Id: (was: 315166)
    Time Spent: 3.5h (was: 3h 20m)
[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness
[ https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=315153&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315153 ]

ASF GitHub Bot logged work on BEAM-7945:

    Author: ASF GitHub Bot
    Created on: 19/Sep/19 16:30
    Start Date: 19/Sep/19 16:30
    Worklog Time Spent: 10m
    Work Description: mxm commented on issue #9452: [BEAM-7945] Allow runner to configure semi_persist_dir which is used …
URL: https://github.com/apache/beam/pull/9452#issuecomment-533209634

I've squashed the fixup commits and updated the PR. Will merge once the tests pass again.

Issue Time Tracking
---
    Worklog Id: (was: 315153)
    Time Spent: 3h 20m (was: 3h 10m)
[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness
[ https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=314955&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-314955 ]

ASF GitHub Bot logged work on BEAM-7945:

    Author: ASF GitHub Bot
    Created on: 19/Sep/19 10:36
    Start Date: 19/Sep/19 10:36
    Worklog Time Spent: 10m
    Work Description: sunjincheng121 commented on issue #9452: [BEAM-7945] Allow runner to configure semi_persist_dir which is used …
URL: https://github.com/apache/beam/pull/9452#issuecomment-533071697

Run Java PreCommit

Issue Time Tracking
---
    Worklog Id: (was: 314955)
    Time Spent: 3h 10m (was: 3h)
[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness
[ https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=314875&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-314875 ]

ASF GitHub Bot logged work on BEAM-7945:

    Author: ASF GitHub Bot
    Created on: 19/Sep/19 07:39
    Start Date: 19/Sep/19 07:39
    Worklog Time Spent: 10m
    Work Description: sunjincheng121 commented on issue #9452: [BEAM-7945] Allow runner to configure semi_persist_dir which is used …
URL: https://github.com/apache/beam/pull/9452#issuecomment-533008240

@mxm @tweise Thanks a lot for the review. Sorry that I missed the comments from @tweise. I have updated the PR, would be great if you can take another look.

Issue Time Tracking
---
    Worklog Id: (was: 314875)
    Time Spent: 3h (was: 2h 50m)
[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness
[ https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=311277&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-311277 ]

ASF GitHub Bot logged work on BEAM-7945:

    Author: ASF GitHub Bot
    Created on: 12/Sep/19 08:58
    Start Date: 12/Sep/19 08:58
    Worklog Time Spent: 10m
    Work Description: sunjincheng121 commented on issue #9452: [BEAM-7945] Allow runner to configure semi_persist_dir which is used …
URL: https://github.com/apache/beam/pull/9452#issuecomment-530733430

R: @mxm

Issue Time Tracking
---
    Worklog Id: (was: 311277)
    Time Spent: 2h 50m (was: 2h 40m)
[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness
[ https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=308248=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308248 ] ASF GitHub Bot logged work on BEAM-7945: Author: ASF GitHub Bot Created on: 07/Sep/19 00:59 Start Date: 07/Sep/19 00:59 Worklog Time Spent: 10m Work Description: tweise commented on pull request #9452: [BEAM-7945] Allow runner to configure semi_persist_dir which is used … URL: https://github.com/apache/beam/pull/9452#discussion_r321948469 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/options/RemoteEnvironmentOptions.java ## @@ -0,0 +1,43 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.sdk.options; + +import com.google.auto.service.AutoService; +import org.apache.beam.sdk.annotations.Experimental; +import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableList; + +/** Options that are used to control configuration of the remote environment. 
*/ +@Experimental +@Hidden +public interface RemoteEnvironmentOptions extends PipelineOptions { + + @Description("Local semi-persistent directory") + @Default.String("/tmp") Review comment: I think the default should be null (no default), so that the environment can pick its suitable tmp directory when nothing is specified by the user. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 308248) Time Spent: 2h 40m (was: 2.5h) > Allow runner to configure "semi_persist_dir" which is used in the SDK harness > - > > Key: BEAM-7945 > URL: https://issues.apache.org/jira/browse/BEAM-7945 > Project: Beam > Issue Type: Sub-task > Components: java-fn-execution, sdk-go, sdk-java-core, sdk-py-core >Reporter: sunjincheng >Assignee: sunjincheng >Priority: Major > Fix For: 2.16.0 > > Time Spent: 2h 40m > Remaining Estimate: 0h > > Currently "semi_persist_dir" is not configurable. This may become a problem > in certain scenarios. For example, the default value of "semi_persist_dir" is > "/tmp" > ([https://github.com/apache/beam/blob/master/sdks/python/container/boot.go#L48]) > in Python SDK harness. When the environment type is "PROCESS", the disk of > "/tmp" may be filled up and unexpected issues will occur in production > environment. We should provide a way to configure "semi_persist_dir" in > EnvironmentFactory at the runner side. -- This message was sent by Atlassian Jira (v8.3.2#803003)
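The suggestion above (no default, so the environment picks a suitable temp directory) can be sketched as follows. This is a minimal illustration only, not code from the PR: the class `SemiPersistDirResolver` and its `resolve` method are hypothetical names, and falling back to the JVM's `java.io.tmpdir` property is an assumption about what "suitable tmp directory" could mean on the Java side.

```java
/** Hypothetical helper, not part of Beam: chooses the semi-persistent directory. */
final class SemiPersistDirResolver {

  private SemiPersistDirResolver() {}

  /**
   * Returns the user-configured directory when one is given. When the option is
   * unset (null or empty), the environment decides; here that is assumed to be
   * the JVM temp directory.
   */
  static String resolve(String configuredDir) {
    if (configuredDir != null && !configuredDir.isEmpty()) {
      return configuredDir;
    }
    return System.getProperty("java.io.tmpdir");
  }

  public static void main(String[] args) {
    // An explicit setting is returned unchanged.
    System.out.println(resolve("/ab/cd"));
    // No setting: the environment's own temp directory is used.
    System.out.println(resolve(null));
  }
}
```

With a null default the option itself carries no opinion, so each environment type (Docker, process, external) could apply its own fallback instead of inheriting a hard-coded `/tmp`.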
[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness
[ https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=308245=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308245 ] ASF GitHub Bot logged work on BEAM-7945: Author: ASF GitHub Bot Created on: 07/Sep/19 00:58 Start Date: 07/Sep/19 00:58 Worklog Time Spent: 10m Work Description: tweise commented on pull request #9452: [BEAM-7945] Allow runner to configure semi_persist_dir which is used … URL: https://github.com/apache/beam/pull/9452#discussion_r321948469 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/options/RemoteEnvironmentOptions.java ## @@ -0,0 +1,43 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.sdk.options; + +import com.google.auto.service.AutoService; +import org.apache.beam.sdk.annotations.Experimental; +import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableList; + +/** Options that are used to control configuration of the remote environment. 
*/ +@Experimental +@Hidden +public interface RemoteEnvironmentOptions extends PipelineOptions { + + @Description("Local semi-persistent directory") + @Default.String("/tmp") Review comment: I think the default should be null, so that the environment can pick its suitable tmp directory when nothing is specified by the user. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 308245) Time Spent: 2.5h (was: 2h 20m) > Allow runner to configure "semi_persist_dir" which is used in the SDK harness > - > > Key: BEAM-7945 > URL: https://issues.apache.org/jira/browse/BEAM-7945 > Project: Beam > Issue Type: Sub-task > Components: java-fn-execution, sdk-go, sdk-java-core, sdk-py-core >Reporter: sunjincheng >Assignee: sunjincheng >Priority: Major > Fix For: 2.16.0 > > Time Spent: 2.5h > Remaining Estimate: 0h > > Currently "semi_persist_dir" is not configurable. This may become a problem > in certain scenarios. For example, the default value of "semi_persist_dir" is > "/tmp" > ([https://github.com/apache/beam/blob/master/sdks/python/container/boot.go#L48]) > in Python SDK harness. When the environment type is "PROCESS", the disk of > "/tmp" may be filled up and unexpected issues will occur in production > environment. We should provide a way to configure "semi_persist_dir" in > EnvironmentFactory at the runner side. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness
[ https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=308241=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308241 ] ASF GitHub Bot logged work on BEAM-7945: Author: ASF GitHub Bot Created on: 07/Sep/19 00:49 Start Date: 07/Sep/19 00:49 Worklog Time Spent: 10m Work Description: sunjincheng121 commented on issue #9452: [BEAM-7945] Allow runner to configure semi_persist_dir which is used … URL: https://github.com/apache/beam/pull/9452#issuecomment-529056178 Thanks for the review @mxm! I have updated the PR according to your comments. I would appreciate it if you could take another look. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 308241) Time Spent: 2h 20m (was: 2h 10m) > Allow runner to configure "semi_persist_dir" which is used in the SDK harness > - > > Key: BEAM-7945 > URL: https://issues.apache.org/jira/browse/BEAM-7945 > Project: Beam > Issue Type: Sub-task > Components: java-fn-execution, sdk-go, sdk-java-core, sdk-py-core >Reporter: sunjincheng >Assignee: sunjincheng >Priority: Major > Fix For: 2.16.0 > > Time Spent: 2h 20m > Remaining Estimate: 0h > > Currently "semi_persist_dir" is not configurable. This may become a problem > in certain scenarios. For example, the default value of "semi_persist_dir" is > "/tmp" > ([https://github.com/apache/beam/blob/master/sdks/python/container/boot.go#L48]) > in Python SDK harness. When the environment type is "PROCESS", the disk of > "/tmp" may be filled up and unexpected issues will occur in production > environment. We should provide a way to configure "semi_persist_dir" in > EnvironmentFactory at the runner side. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness
[ https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=307892=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307892 ] ASF GitHub Bot logged work on BEAM-7945: Author: ASF GitHub Bot Created on: 06/Sep/19 14:24 Start Date: 06/Sep/19 14:24 Worklog Time Spent: 10m Work Description: mxm commented on pull request #9452: [BEAM-7945] Allow runner to configure semi_persist_dir which is used … URL: https://github.com/apache/beam/pull/9452#discussion_r321759869 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/options/RemoteEnvironmentOptions.java ## @@ -0,0 +1,43 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.sdk.options; + +import com.google.auto.service.AutoService; +import org.apache.beam.sdk.annotations.Experimental; +import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableList; + +/** Options that are used to control configuration of the remote environment. */ +@Experimental +@Hidden +public interface RemoteEnvironmentOptions extends PipelineOptions { + + @Description("Local semi-persistent directory") + @Default.String("/tmp") Review comment: Let's keep the existing default. 
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 307892) Time Spent: 2h 10m (was: 2h) > Allow runner to configure "semi_persist_dir" which is used in the SDK harness > - > > Key: BEAM-7945 > URL: https://issues.apache.org/jira/browse/BEAM-7945 > Project: Beam > Issue Type: Sub-task > Components: java-fn-execution, sdk-go, sdk-java-core, sdk-py-core >Reporter: sunjincheng >Assignee: sunjincheng >Priority: Major > Fix For: 2.16.0 > > Time Spent: 2h 10m > Remaining Estimate: 0h > > Currently "semi_persist_dir" is not configurable. This may become a problem > in certain scenarios. For example, the default value of "semi_persist_dir" is > "/tmp" > ([https://github.com/apache/beam/blob/master/sdks/python/container/boot.go#L48]) > in Python SDK harness. When the environment type is "PROCESS", the disk of > "/tmp" may be filled up and unexpected issues will occur in production > environment. We should provide a way to configure "semi_persist_dir" in > EnvironmentFactory at the runner side. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness
[ https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=307745=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307745 ] ASF GitHub Bot logged work on BEAM-7945: Author: ASF GitHub Bot Created on: 06/Sep/19 10:29 Start Date: 06/Sep/19 10:29 Worklog Time Spent: 10m Work Description: sunjincheng121 commented on pull request #9452: [BEAM-7945] Allow runner to configure semi_persist_dir which is used … URL: https://github.com/apache/beam/pull/9452#discussion_r321674314 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/options/RemoteEnvironmentOptions.java ## @@ -0,0 +1,43 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.sdk.options; + +import com.google.auto.service.AutoService; +import org.apache.beam.sdk.annotations.Experimental; +import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableList; + +/** Options that are used to control configuration of the remote environment. 
*/ +@Experimental +@Hidden +public interface RemoteEnvironmentOptions extends PipelineOptions { + + @Description("Local semi-persistent directory") + @Default.String("/tmp") Review comment: Currently, this default matches the default value used elsewhere, for example in `boot.go`: - https://github.com/apache/beam/blob/d21bbaf4c70986c2dbdbe8f6fce35b2b2cb4843d/sdks/go/container/boot.go#L41 - https://github.com/apache/beam/blob/d21bbaf4c70986c2dbdbe8f6fce35b2b2cb4843d/sdks/python/container/boot.go#L51 So, how about we keep using `/tmp` as the default value? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 307745) Time Spent: 2h (was: 1h 50m) > Allow runner to configure "semi_persist_dir" which is used in the SDK harness > - > > Key: BEAM-7945 > URL: https://issues.apache.org/jira/browse/BEAM-7945 > Project: Beam > Issue Type: Sub-task > Components: java-fn-execution, sdk-go, sdk-java-core, sdk-py-core >Reporter: sunjincheng >Assignee: sunjincheng >Priority: Major > Fix For: 2.16.0 > > Time Spent: 2h > Remaining Estimate: 0h > > Currently "semi_persist_dir" is not configurable. This may become a problem > in certain scenarios. For example, the default value of "semi_persist_dir" is > "/tmp" > ([https://github.com/apache/beam/blob/master/sdks/python/container/boot.go#L48]) > in Python SDK harness. When the environment type is "PROCESS", the disk of > "/tmp" may be filled up and unexpected issues will occur in production > environment. We should provide a way to configure "semi_persist_dir" in > EnvironmentFactory at the runner side. -- This message was sent by Atlassian Jira (v8.3.2#803003)
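Since the comment above points out that `boot.go` already defaults to `/tmp`, one way to stay consistent with it is to pass the directory to the boot program only when it was explicitly configured. The sketch below is illustrative and not taken from the PR: `WorkerArgsSketch` and `build` are hypothetical names, the `--semi_persist_dir` flag name is an assumption derived from the option name in the issue title, and only two of the boot arguments are shown for brevity.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not Beam code: assembles boot arguments for an SDK harness. */
final class WorkerArgsSketch {

  private WorkerArgsSketch() {}

  /**
   * Builds the argument list. The semi-persistent directory flag is appended only
   * when a value is configured, so the boot program's own default stays in effect
   * otherwise. The flag name is an assumption.
   */
  static List<String> build(String workerId, String controlEndpoint, String semiPersistDir) {
    List<String> args = new ArrayList<>();
    args.add(String.format("--id=%s", workerId));
    args.add(String.format("--control_endpoint=%s", controlEndpoint));
    if (semiPersistDir != null) {
      args.add(String.format("--semi_persist_dir=%s", semiPersistDir));
    }
    return args;
  }

  public static void main(String[] args) {
    // Without a configured directory, no flag is passed.
    System.out.println(build("worker-1", "localhost:1234", null));
    // With a configured directory, the flag is appended last.
    System.out.println(build("worker-1", "localhost:1234", "/data/beam"));
  }
}
```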
[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness
[ https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=307725=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307725 ] ASF GitHub Bot logged work on BEAM-7945: Author: ASF GitHub Bot Created on: 06/Sep/19 09:39 Start Date: 06/Sep/19 09:39 Worklog Time Spent: 10m Work Description: mxm commented on pull request #9452: [BEAM-7945] Allow runner to configure semi_persist_dir which is used … URL: https://github.com/apache/beam/pull/9452#discussion_r321655867 ## File path: sdks/java/core/src/test/java/org/apache/beam/sdk/options/RemoteEnvironmentOptionsTest.java ## @@ -0,0 +1,37 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.sdk.options; + +import static org.junit.Assert.assertEquals; + +import org.junit.Test; +import org.junit.runner.RunWith; +import org.junit.runners.JUnit4; + +/** Tests for {@link RemoteEnvironmentOptions}. 
*/ +@RunWith(JUnit4.class) +public class RemoteEnvironmentOptionsTest { + + @Test + public void testSemiDirectory() { +RemoteEnvironmentOptions options = PipelineOptionsFactory.as(RemoteEnvironmentOptions.class); +String semiDir = "/ab/cd"; +options.setSemiPersistDir(semiDir); +assertEquals(semiDir, options.getSemiPersistDir()); Review comment: This should also test the default. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 307725) Time Spent: 1h 50m (was: 1h 40m) > Allow runner to configure "semi_persist_dir" which is used in the SDK harness > - > > Key: BEAM-7945 > URL: https://issues.apache.org/jira/browse/BEAM-7945 > Project: Beam > Issue Type: Sub-task > Components: java-fn-execution, sdk-go, sdk-java-core, sdk-py-core >Reporter: sunjincheng >Assignee: sunjincheng >Priority: Major > Fix For: 2.16.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > Currently "semi_persist_dir" is not configurable. This may become a problem > in certain scenarios. For example, the default value of "semi_persist_dir" is > "/tmp" > ([https://github.com/apache/beam/blob/master/sdks/python/container/boot.go#L48]) > in Python SDK harness. When the environment type is "PROCESS", the disk of > "/tmp" may be filled up and unexpected issues will occur in production > environment. We should provide a way to configure "semi_persist_dir" in > EnvironmentFactory at the runner side. -- This message was sent by Atlassian Jira (v8.3.2#803003)
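The request above, to cover the default as well as an explicitly set value, could look roughly like the following. This is a self-contained stand-in rather than the real test: `SemiPersistOptionsSketch` is a hypothetical class that replaces the `RemoteEnvironmentOptions` interface and `PipelineOptionsFactory` so the example runs without Beam on the classpath, and the `/tmp` default is taken from the `@Default.String("/tmp")` annotation in the diff under review.

```java
/** Hypothetical stand-in for the options interface under review; not Beam code. */
final class SemiPersistOptionsSketch {

  /** Default taken from the annotation in the reviewed diff. */
  static final String DEFAULT_DIR = "/tmp";

  private String semiPersistDir = DEFAULT_DIR;

  String getSemiPersistDir() {
    return semiPersistDir;
  }

  void setSemiPersistDir(String dir) {
    this.semiPersistDir = dir;
  }

  /** Minimal assertion helper so the sketch does not depend on JUnit. */
  static void check(boolean condition, String message) {
    if (!condition) {
      throw new AssertionError(message);
    }
  }

  public static void main(String[] args) {
    SemiPersistOptionsSketch options = new SemiPersistOptionsSketch();
    // First verify the default, before any setter is called.
    check(DEFAULT_DIR.equals(options.getSemiPersistDir()), "default should be /tmp");
    // Then verify that an explicit value overrides it.
    String semiDir = "/ab/cd";
    options.setSemiPersistDir(semiDir);
    check(semiDir.equals(options.getSemiPersistDir()), "explicit value should win");
  }
}
```

In the actual test the two checks would be `assertEquals` calls against an instance obtained from `PipelineOptionsFactory`, as in the existing `testSemiDirectory` method.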
[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness
[ https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=307723=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307723 ] ASF GitHub Bot logged work on BEAM-7945: Author: ASF GitHub Bot Created on: 06/Sep/19 09:39 Start Date: 06/Sep/19 09:39 Worklog Time Spent: 10m Work Description: mxm commented on pull request #9452: [BEAM-7945] Allow runner to configure semi_persist_dir which is used … URL: https://github.com/apache/beam/pull/9452#discussion_r321655867 ## File path: sdks/java/core/src/test/java/org/apache/beam/sdk/options/RemoteEnvironmentOptionsTest.java ## @@ -0,0 +1,37 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.sdk.options; + +import static org.junit.Assert.assertEquals; + +import org.junit.Test; +import org.junit.runner.RunWith; +import org.junit.runners.JUnit4; + +/** Tests for {@link RemoteEnvironmentOptions}. 
*/ +@RunWith(JUnit4.class) +public class RemoteEnvironmentOptionsTest { + + @Test + public void testSemiDirectory() { +RemoteEnvironmentOptions options = PipelineOptionsFactory.as(RemoteEnvironmentOptions.class); +String semiDir = "/ab/cd"; +options.setSemiPersistDir(semiDir); +assertEquals(semiDir, options.getSemiPersistDir()); Review comment: This should test the default. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 307723) Time Spent: 1.5h (was: 1h 20m) > Allow runner to configure "semi_persist_dir" which is used in the SDK harness > - > > Key: BEAM-7945 > URL: https://issues.apache.org/jira/browse/BEAM-7945 > Project: Beam > Issue Type: Sub-task > Components: java-fn-execution, sdk-go, sdk-java-core, sdk-py-core >Reporter: sunjincheng >Assignee: sunjincheng >Priority: Major > Fix For: 2.16.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > Currently "semi_persist_dir" is not configurable. This may become a problem > in certain scenarios. For example, the default value of "semi_persist_dir" is > "/tmp" > ([https://github.com/apache/beam/blob/master/sdks/python/container/boot.go#L48]) > in Python SDK harness. When the environment type is "PROCESS", the disk of > "/tmp" may be filled up and unexpected issues will occur in production > environment. We should provide a way to configure "semi_persist_dir" in > EnvironmentFactory at the runner side. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness
[ https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=307724=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307724 ] ASF GitHub Bot logged work on BEAM-7945: Author: ASF GitHub Bot Created on: 06/Sep/19 09:39 Start Date: 06/Sep/19 09:39 Worklog Time Spent: 10m Work Description: mxm commented on pull request #9452: [BEAM-7945] Allow runner to configure semi_persist_dir which is used … URL: https://github.com/apache/beam/pull/9452#discussion_r321656526 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/options/RemoteEnvironmentOptions.java ## @@ -0,0 +1,43 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.sdk.options; + +import com.google.auto.service.AutoService; +import org.apache.beam.sdk.annotations.Experimental; +import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableList; + +/** Options that are used to control configuration of the remote environment. */ +@Experimental +@Hidden +public interface RemoteEnvironmentOptions extends PipelineOptions { + + @Description("Local semi-persistent directory") + @Default.String("/tmp") Review comment: Should this be `System.getProperty("java.io.tmpdir")`? 
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 307724) Time Spent: 1h 40m (was: 1.5h) > Allow runner to configure "semi_persist_dir" which is used in the SDK harness > - > > Key: BEAM-7945 > URL: https://issues.apache.org/jira/browse/BEAM-7945 > Project: Beam > Issue Type: Sub-task > Components: java-fn-execution, sdk-go, sdk-java-core, sdk-py-core >Reporter: sunjincheng >Assignee: sunjincheng >Priority: Major > Fix For: 2.16.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > Currently "semi_persist_dir" is not configurable. This may become a problem > in certain scenarios. For example, the default value of "semi_persist_dir" is > "/tmp" > ([https://github.com/apache/beam/blob/master/sdks/python/container/boot.go#L48]) > in Python SDK harness. When the environment type is "PROCESS", the disk of > "/tmp" may be filled up and unexpected issues will occur in production > environment. We should provide a way to configure "semi_persist_dir" in > EnvironmentFactory at the runner side. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness
[ https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=307599=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307599 ] ASF GitHub Bot logged work on BEAM-7945: Author: ASF GitHub Bot Created on: 06/Sep/19 03:21 Start Date: 06/Sep/19 03:21 Worklog Time Spent: 10m Work Description: sunjincheng121 commented on pull request #9452: [BEAM-7945] Allow runner to configure semi_persist_dir which is used … URL: https://github.com/apache/beam/pull/9452#discussion_r321562506 ## File path: model/fn-execution/src/main/proto/beam_fn_api.proto ## @@ -815,6 +815,7 @@ message StartWorkerRequest { org.apache.beam.model.pipeline.v1.ApiServiceDescriptor logging_endpoint = 3; org.apache.beam.model.pipeline.v1.ApiServiceDescriptor artifact_endpoint = 4; org.apache.beam.model.pipeline.v1.ApiServiceDescriptor provision_endpoint = 5; + string semi_persist_dir = 6; Review comment: Oh, yes, I see, this change is unnecessary. `pipeline_options` is already defined in `ProvisionInfo`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 307599) Time Spent: 1h 20m (was: 1h 10m) > Allow runner to configure "semi_persist_dir" which is used in the SDK harness > - > > Key: BEAM-7945 > URL: https://issues.apache.org/jira/browse/BEAM-7945 > Project: Beam > Issue Type: Sub-task > Components: java-fn-execution, sdk-go, sdk-java-core, sdk-py-core >Reporter: sunjincheng >Assignee: sunjincheng >Priority: Major > Fix For: 2.16.0 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > Currently "semi_persist_dir" is not configurable. This may become a problem > in certain scenarios. 
For example, the default value of "semi_persist_dir" is > "/tmp" > ([https://github.com/apache/beam/blob/master/sdks/python/container/boot.go#L48]) > in Python SDK harness. When the environment type is "PROCESS", the disk of > "/tmp" may be filled up and unexpected issues will occur in production > environment. We should provide a way to configure "semi_persist_dir" in > EnvironmentFactory at the runner side. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness
[ https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=307221=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307221 ] ASF GitHub Bot logged work on BEAM-7945: Author: ASF GitHub Bot Created on: 05/Sep/19 14:54 Start Date: 05/Sep/19 14:54 Worklog Time Spent: 10m Work Description: tweise commented on pull request #9452: [BEAM-7945] Allow runner to configure semi_persist_dir which is used … URL: https://github.com/apache/beam/pull/9452#discussion_r321316686 ## File path: model/fn-execution/src/main/proto/beam_fn_api.proto ## @@ -815,6 +815,7 @@ message StartWorkerRequest { org.apache.beam.model.pipeline.v1.ApiServiceDescriptor logging_endpoint = 3; org.apache.beam.model.pipeline.v1.ApiServiceDescriptor artifact_endpoint = 4; org.apache.beam.model.pipeline.v1.ApiServiceDescriptor provision_endpoint = 5; + string semi_persist_dir = 6; Review comment: This should not be added here. Pipeline options are provided through the provisioning endpoint. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 307221) Time Spent: 1h 10m (was: 1h) > Allow runner to configure "semi_persist_dir" which is used in the SDK harness > - > > Key: BEAM-7945 > URL: https://issues.apache.org/jira/browse/BEAM-7945 > Project: Beam > Issue Type: Sub-task > Components: java-fn-execution, sdk-go, sdk-java-core, sdk-py-core >Reporter: sunjincheng >Assignee: sunjincheng >Priority: Major > Fix For: 2.16.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > Currently "semi_persist_dir" is not configurable. This may become a problem > in certain scenarios. For example, the default value of "semi_persist_dir" is > "/tmp" > ([https://github.com/apache/beam/blob/master/sdks/python/container/boot.go#L48]) > in Python SDK harness. 
When the environment type is "PROCESS", the disk of > "/tmp" may be filled up and unexpected issues will occur in production > environment. We should provide a way to configure "semi_persist_dir" in > EnvironmentFactory at the runner side. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness
[ https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=307217=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307217 ]

ASF GitHub Bot logged work on BEAM-7945:
Author: ASF GitHub Bot
Created on: 05/Sep/19 14:49
Start Date: 05/Sep/19 14:49
Worklog Time Spent: 10m
Work Description: tweise commented on pull request #9452: [BEAM-7945] Allow runner to configure semi_persist_dir which is used …
URL: https://github.com/apache/beam/pull/9452#discussion_r321313810

## File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/control/DefaultJobBundleFactory.java ##
@@ -92,18 +92,18 @@ private final int environmentExpirationMillis;
 public static DefaultJobBundleFactory create(JobInfo jobInfo) {
+PipelineOptions pipelineOption =

Review comment: "pipelineOptions"

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 307217)
Time Spent: 1h (was: 50m)

> Allow runner to configure "semi_persist_dir" which is used in the SDK harness
> -
>
> Key: BEAM-7945
> URL: https://issues.apache.org/jira/browse/BEAM-7945
> Project: Beam
> Issue Type: Sub-task
> Components: java-fn-execution, sdk-go, sdk-java-core, sdk-py-core
> Reporter: sunjincheng
> Assignee: sunjincheng
> Priority: Major
> Fix For: 2.16.0
>
> Time Spent: 1h
> Remaining Estimate: 0h
>
> Currently "semi_persist_dir" is not configurable. This may become a problem
> in certain scenarios. For example, the default value of "semi_persist_dir" is
> "/tmp"
> ([https://github.com/apache/beam/blob/master/sdks/python/container/boot.go#L48])
> in Python SDK harness. When the environment type is "PROCESS", the disk of
> "/tmp" may be filled up and unexpected issues will occur in production
> environment. We should provide a way to configure "semi_persist_dir" in
> EnvironmentFactory at the runner side.

--
This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness
[ https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=307165=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307165 ]

ASF GitHub Bot logged work on BEAM-7945:
Author: ASF GitHub Bot
Created on: 05/Sep/19 14:00
Start Date: 05/Sep/19 14:00
Worklog Time Spent: 10m
Work Description: mxm commented on pull request #9452: [BEAM-7945] Allow runner to configure semi_persist_dir which is used …
URL: https://github.com/apache/beam/pull/9452#discussion_r321282379

## File path: model/fn-execution/src/main/proto/beam_fn_api.proto ##
@@ -815,6 +815,7 @@ message StartWorkerRequest {
 org.apache.beam.model.pipeline.v1.ApiServiceDescriptor logging_endpoint = 3;
 org.apache.beam.model.pipeline.v1.ApiServiceDescriptor artifact_endpoint = 4;
 org.apache.beam.model.pipeline.v1.ApiServiceDescriptor provision_endpoint = 5;
+ string semi_persist_dir = 6;

Review comment: I'm not sure whether this flexibility is desired. I could imagine that the person who starts the worker pool does not want arbitrary persist directories, but rather a fixed one.

Issue Time Tracking
---
Worklog Id: (was: 307165)
Time Spent: 50m (was: 40m)
[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness
[ https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=307135=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307135 ]

ASF GitHub Bot logged work on BEAM-7945:
Author: ASF GitHub Bot
Created on: 05/Sep/19 13:14
Start Date: 05/Sep/19 13:14
Worklog Time Spent: 10m
Work Description: sunjincheng121 commented on pull request #9452: [BEAM-7945] Allow runner to configure semi_persist_dir which is used …
URL: https://github.com/apache/beam/pull/9452#discussion_r321250359

## File path: model/fn-execution/src/main/proto/beam_fn_api.proto ##
@@ -815,6 +815,7 @@ message StartWorkerRequest {
 org.apache.beam.model.pipeline.v1.ApiServiceDescriptor logging_endpoint = 3;
 org.apache.beam.model.pipeline.v1.ApiServiceDescriptor artifact_endpoint = 4;
 org.apache.beam.model.pipeline.v1.ApiServiceDescriptor provision_endpoint = 5;
+ string semi_persist_dir = 6;

Review comment: Great to have your suggestions. :) Maybe I have not understood your idea. If we configure the dir for the whole pool, we may lose the flexibility for different jobs to set different semi_persist_dir values for workers in the same worker pool?

Issue Time Tracking
---
Worklog Id: (was: 307135)
Time Spent: 40m (was: 0.5h)
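The two positions in this review thread (a directory fixed by whoever starts the worker pool versus a per-job value carried in `StartWorkerRequest.semi_persist_dir`) can be combined in one resolution rule. The following is only a sketch of such a rule, not code from the pull request: the class `SemiPersistDirPolicy`, the method `resolve`, and the precedence order are all hypothetical, and it assumes an unset request field arrives as an empty string.

```java
import java.util.Optional;

/** Hypothetical helper for illustration only; this class is not part of Apache Beam. */
final class SemiPersistDirPolicy {
  /** Default named in the issue description for the Python SDK harness. */
  static final String DEFAULT_DIR = "/tmp";

  /**
   * Picks the directory a newly started worker should use.
   *
   * @param poolFixedDir directory pinned by the operator who started the worker pool, if any
   * @param requestedDir value taken from the request (assumed empty when the job did not set it)
   */
  static String resolve(Optional<String> poolFixedDir, String requestedDir) {
    // Assumed precedence: an operator-pinned directory always wins.
    if (poolFixedDir.isPresent()) {
      return poolFixedDir.get();
    }
    // Otherwise honor the per-job value when one was supplied.
    if (requestedDir != null && !requestedDir.isEmpty()) {
      return requestedDir;
    }
    // Fall back to the harness default.
    return DEFAULT_DIR;
  }

  public static void main(String[] args) {
    System.out.println(resolve(Optional.empty(), "/data/job-a"));
    System.out.println(resolve(Optional.of("/mnt/beam"), "/data/job-a"));
    System.out.println(resolve(Optional.empty(), ""));
  }
}
```

With this ordering, a pool started without a pinned directory keeps the per-job flexibility, while a pool started with one ignores arbitrary directories requested by jobs.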
[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness
[ https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=305641=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-305641 ]

ASF GitHub Bot logged work on BEAM-7945:
Author: ASF GitHub Bot
Created on: 03/Sep/19 15:15
Start Date: 03/Sep/19 15:15
Worklog Time Spent: 10m
Work Description: mxm commented on pull request #9452: [BEAM-7945] Allow runner to configure semi_persist_dir which is used …
URL: https://github.com/apache/beam/pull/9452#discussion_r320328068

## File path: model/fn-execution/src/main/proto/beam_fn_api.proto ##
@@ -815,6 +815,7 @@ message StartWorkerRequest {
 org.apache.beam.model.pipeline.v1.ApiServiceDescriptor logging_endpoint = 3;
 org.apache.beam.model.pipeline.v1.ApiServiceDescriptor artifact_endpoint = 4;
 org.apache.beam.model.pipeline.v1.ApiServiceDescriptor provision_endpoint = 5;
+ string semi_persist_dir = 6;

Review comment: Should this be dynamic or rather configured up front for the worker pool?

Issue Time Tracking
---
Worklog Id: (was: 305641)
Time Spent: 0.5h (was: 20m)
[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness
[ https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=303951=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303951 ]

ASF GitHub Bot logged work on BEAM-7945:
Author: ASF GitHub Bot
Created on: 29/Aug/19 23:34
Start Date: 29/Aug/19 23:34
Worklog Time Spent: 10m
Work Description: sunjincheng121 commented on issue #9452: [BEAM-7945] Allow runner to configure semi_persist_dir which is used …
URL: https://github.com/apache/beam/pull/9452#issuecomment-526399094

I would appreciate it if you have time to look at the changes, @robertwb @mxm :)

Issue Time Tracking
---
Worklog Id: (was: 303951)
Time Spent: 20m (was: 10m)
[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness
[ https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=303612=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303612 ]

ASF GitHub Bot logged work on BEAM-7945:
Author: ASF GitHub Bot
Created on: 29/Aug/19 11:45
Start Date: 29/Aug/19 11:45
Worklog Time Spent: 10m
Work Description: sunjincheng121 commented on pull request #9452: [BEAM-7945] Allow runner to configure semi_persist_dir which is used …
URL: https://github.com/apache/beam/pull/9452

Currently "semi_persist_dir" is not configurable. This may become a problem in certain scenarios. For example, the default value of "semi_persist_dir" is "/tmp" (https://github.com/apache/beam/blob/master/sdks/python/container/boot.go#L48) in the Python SDK harness. When the environment type is "PROCESS", the disk holding "/tmp" may fill up and cause unexpected issues in production environments.

This pull request therefore makes semi_persist_dir configurable by adding a new PipelineOption (RemoteEnvironmentOptions). The pipeline option is passed to the `DefaultJobBundleFactory` and then used in each EnvironmentFactory (docker, process, external and embedded).

Details of the discussion can be found in [1].

[1] https://lists.apache.org/list.html?d...@beam.apache.org:lte=1M:%5BDISCUSS%5D%20Turn%20%60WindowedValue
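To make the flow in the description concrete, here is a minimal, self-contained sketch of how an environment factory might append the configured directory to the boot arguments it already builds. It is not the code from the pull request: the class `BootArgsSketch` and its method are hypothetical, the endpoint flags mirror those in the DockerEnvironmentFactory diff quoted elsewhere in this thread, and the `--semi_persist_dir` flag name is an assumption based on the linked boot.go.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch for illustration only; this class is not part of Apache Beam. */
final class BootArgsSketch {
  /**
   * Builds the argument list handed to the SDK harness boot program.
   *
   * @param semiPersistDir value read from the pipeline option; null or empty means "not configured"
   */
  static List<String> buildArgs(
      String workerId,
      String loggingEndpoint,
      String artifactEndpoint,
      String provisionEndpoint,
      String controlEndpoint,
      String semiPersistDir) {
    List<String> args = new ArrayList<>();
    args.add(String.format("--id=%s", workerId));
    args.add(String.format("--logging_endpoint=%s", loggingEndpoint));
    args.add(String.format("--artifact_endpoint=%s", artifactEndpoint));
    args.add(String.format("--provision_endpoint=%s", provisionEndpoint));
    args.add(String.format("--control_endpoint=%s", controlEndpoint));
    // Only pass the flag when a directory was configured, so the harness
    // keeps its own default ("/tmp" per the issue description) otherwise.
    if (semiPersistDir != null && !semiPersistDir.isEmpty()) {
      args.add(String.format("--semi_persist_dir=%s", semiPersistDir)); // flag name is an assumption
    }
    return args;
  }

  public static void main(String[] unused) {
    // Endpoint values below are made-up placeholders.
    System.out.println(
        buildArgs("worker-1", "localhost:1", "localhost:2", "localhost:3", "localhost:4", "/data/beam"));
  }
}
```

Omitting the flag when nothing is configured keeps existing deployments unchanged, which is one plausible reason to make the option nullable rather than giving it a default on the runner side.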