[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=438846&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-438846 ]

ASF GitHub Bot logged work on BEAM-9964:

Author: ASF GitHub Bot
Created on: 29/May/20 17:15
Start Date: 29/May/20 17:15
Worklog Time Spent: 10m
Work Description: steveniemitz edited a comment on pull request #11849:
URL: https://github.com/apache/beam/pull/11849#issuecomment-636086069

If you look at [DataflowPipelineOptions](https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineOptions.java), it doesn't include `DataflowWorkerHarnessOptions`. In fact, it is the other way around: `DataflowWorkerHarnessOptions` extends `DataflowPipelineOptions`. The harness options are used in the harness itself, while `DataflowPipelineOptions` is what options are validated against in the Dataflow runner.

edit: Also, to clarify, users don't (in general) directly implement `DataflowPipelineOptions`; they're included implicitly when the Dataflow runner is used. One could specifically implement `DataflowWorkerHarnessOptions` (or even just define the property in any options they have; we actually used to do just that) if they wanted to.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---

Worklog Id: (was: 438846)
Time Spent: 6h (was: 5h 50m)

> Setting workerCacheMb to make its way to the WindmillStateCache Constructor
> ---
>
> Key: BEAM-9964
> URL: https://issues.apache.org/jira/browse/BEAM-9964
> Project: Beam
> Issue Type: Improvement
> Components: runner-dataflow
> Reporter: Omar Ismail
> Assignee: Omar Ismail
> Priority: P3
> Fix For: 2.22.0
>
> Time Spent: 6h
> Remaining Estimate: 0h
>
> Setting --workerCacheMb seems to affect batch pipelines only. For streaming,
> the cache seems to be hardcoded to 100 MB [1]. If possible, I would like to
> be able to change the cache value in streaming by setting
> --workerCacheMb.
> I've never made changes to the Beam SDK, so I am super excited to work on
> this!
>
> [1] https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
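The relationship described in the comment above can be modelled with plain Java interfaces. This is only a sketch under stated assumptions: `ClientOptions`, `HarnessOptions`, `MyOptions` and `OptionProperties` are hypothetical stand-ins written for illustration, not Beam classes, and the reflection helper is a simplification rather than how Beam discovers properties. It shows that a property declared on a sub-interface is not reachable through the parent interface, while a user-defined interface can declare the same property itself.

```java
import java.lang.reflect.Method;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for the client-side options (DataflowPipelineOptions).
interface ClientOptions {
  String getProject();
  void setProject(String value);
}

// Hypothetical stand-in for the harness options: it extends the client
// options, so it sees their properties plus its own.
interface HarnessOptions extends ClientOptions {
  Integer getWorkerCacheMb();
  void setWorkerCacheMb(Integer value);
}

// A user-defined options interface that declares the property itself.
interface MyOptions extends ClientOptions {
  Integer getWorkerCacheMb();
  void setWorkerCacheMb(Integer value);
}

final class OptionProperties {
  private OptionProperties() {}

  // Collects bean-style property names from the zero-argument getters that
  // are reachable through the given interface, including inherited ones.
  static Set<String> propertyNames(Class<?> optionsInterface) {
    Set<String> names = new TreeSet<>();
    for (Method m : optionsInterface.getMethods()) {
      String n = m.getName();
      if (n.startsWith("get") && n.length() > 3 && m.getParameterCount() == 0) {
        names.add(Character.toLowerCase(n.charAt(3)) + n.substring(4));
      }
    }
    return names;
  }

  public static void main(String[] args) {
    System.out.println("ClientOptions:  " + propertyNames(ClientOptions.class));
    System.out.println("HarnessOptions: " + propertyNames(HarnessOptions.class));
    System.out.println("MyOptions:      " + propertyNames(MyOptions.class));
  }
}
```

In this model, anything validated against `ClientOptions` alone never sees `workerCacheMb`, which matches the explanation that the harness options extend the pipeline options rather than the reverse.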
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=438842&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-438842 ]

ASF GitHub Bot logged work on BEAM-9964:

Author: ASF GitHub Bot
Created on: 29/May/20 17:14
Start Date: 29/May/20 17:14
Worklog Time Spent: 10m
Work Description: pabloem commented on pull request #11849:
URL: https://github.com/apache/beam/pull/11849#issuecomment-636086329

gotcha. Makes sense. LGTM.

Issue Time Tracking
---

Worklog Id: (was: 438842)
Time Spent: 5h 40m (was: 5.5h)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=438843&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-438843 ]

ASF GitHub Bot logged work on BEAM-9964:

Author: ASF GitHub Bot
Created on: 29/May/20 17:14
Start Date: 29/May/20 17:14
Worklog Time Spent: 10m
Work Description: pabloem merged pull request #11849:
URL: https://github.com/apache/beam/pull/11849

Issue Time Tracking
---

Worklog Id: (was: 438843)
Time Spent: 5h 50m (was: 5h 40m)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=438841&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-438841 ]

ASF GitHub Bot logged work on BEAM-9964:

Author: ASF GitHub Bot
Created on: 29/May/20 17:13
Start Date: 29/May/20 17:13
Worklog Time Spent: 10m
Work Description: steveniemitz commented on pull request #11849:
URL: https://github.com/apache/beam/pull/11849#issuecomment-636086069

If you look at [DataflowPipelineOptions](https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineOptions.java), it doesn't include `DataflowWorkerHarnessOptions`. In fact, it is the other way around: `DataflowWorkerHarnessOptions` extends `DataflowPipelineOptions`. The harness options are used in the harness itself, while `DataflowPipelineOptions` is what options are validated against in the Dataflow runner.

Issue Time Tracking
---

Worklog Id: (was: 438841)
Time Spent: 5.5h (was: 5h 20m)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=438838&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-438838 ]

ASF GitHub Bot logged work on BEAM-9964:

Author: ASF GitHub Bot
Created on: 29/May/20 17:10
Start Date: 29/May/20 17:10
Worklog Time Spent: 10m
Work Description: pabloem commented on pull request #11849:
URL: https://github.com/apache/beam/pull/11849#issuecomment-636084490

Just a question - I am a little confused. How come the `DataflowPipelineDebugOptions` class is visible, but `DataflowWorkerHarnessOptions` isn't? If you inherit from it, shouldn't you be able to use it? Is it that we generally encourage users to rely on `DataflowPipelineDebugOptions` for their Dataflow pipeline needs? If so, it makes sense to move the option... I'm just confused about what is affecting the visibility of the classes.

Issue Time Tracking
---

Worklog Id: (was: 438838)
Time Spent: 5h 20m (was: 5h 10m)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=438538&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-438538 ]

ASF GitHub Bot logged work on BEAM-9964:

Author: ASF GitHub Bot
Created on: 29/May/20 01:03
Start Date: 29/May/20 01:03
Worklog Time Spent: 10m
Work Description: steveniemitz commented on pull request #11849:
URL: https://github.com/apache/beam/pull/11849#issuecomment-635697264

=/ looks like the dataflow precommit succeeded but the API call to update it here failed.

Issue Time Tracking
---

Worklog Id: (was: 438538)
Time Spent: 5h 10m (was: 5h)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=438535&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-438535 ]

ASF GitHub Bot logged work on BEAM-9964:

Author: ASF GitHub Bot
Created on: 29/May/20 01:00
Start Date: 29/May/20 01:00
Worklog Time Spent: 10m
Work Description: steveniemitz commented on pull request #11849:
URL: https://github.com/apache/beam/pull/11849#issuecomment-635696194

> gahh so sorry that I missed this. I guess you did have to end up contributing this : )

heh no problem, teamwork! :highfive:

Issue Time Tracking
---

Worklog Id: (was: 438535)
Time Spent: 5h (was: 4h 50m)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=438496&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-438496 ]

ASF GitHub Bot logged work on BEAM-9964:

Author: ASF GitHub Bot
Created on: 28/May/20 23:10
Start Date: 28/May/20 23:10
Worklog Time Spent: 10m
Work Description: pabloem commented on pull request #11849:
URL: https://github.com/apache/beam/pull/11849#issuecomment-635663865

gahh so sorry that I missed this. I guess you did have to end up contributing this : )

Issue Time Tracking
---

Worklog Id: (was: 438496)
Time Spent: 4h 50m (was: 4h 40m)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=438363&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-438363 ]

ASF GitHub Bot logged work on BEAM-9964:

Author: ASF GitHub Bot
Created on: 28/May/20 17:26
Start Date: 28/May/20 17:26
Worklog Time Spent: 10m
Work Description: steveniemitz opened a new pull request #11849:
URL: https://github.com/apache/beam/pull/11849

#11710 added the plumbing to use the `workerCacheMb` parameter to size the streaming Dataflow worker state cache. However, the parameter itself is inaccessible from user jobs because it's in `DataflowWorkerHarnessOptions`, which is only exposed in the worker itself. Trying to set it produces:

```
$ java -jar myjob.jar ... --runner=DataflowRunner --workerCacheMb=400
Exception in thread "main" java.lang.IllegalArgumentException: Class interface ...MyOptions missing a property named 'workerCacheMb'.
	at org.apache.beam.sdk.options.PipelineOptionsFactory.parseObjects(PipelineOptionsFactory.java:1625)
```

This simply moves it to a user-accessible location.

R: @omarismail94 @pabloem

Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

- [x] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`).
- [x] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [x] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
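The idea of sizing the state cache from `workerCacheMb` instead of a fixed 100 MB can be illustrated with a small byte-budgeted LRU cache. This is a sketch under stated assumptions: `SizedStateCache` and everything in it are hypothetical names invented for illustration, and the eviction strategy shown here is not claimed to be what `WindmillStateCache` does internally. The only points taken from the thread are the 100 MB default and the configurable megabyte value.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical cache used only to illustrate configurable sizing.
final class SizedStateCache {
  // Fallback used when no value is supplied (the figure reported in the issue).
  static final int DEFAULT_CACHE_MB = 100;

  private final long maxBytes;
  private long usedBytes = 0;
  // Access-ordered map: iteration starts at the least recently used entry.
  private final LinkedHashMap<String, byte[]> entries =
      new LinkedHashMap<>(16, 0.75f, true);

  // workerCacheMb may be null, meaning "use the default".
  SizedStateCache(Integer workerCacheMb) {
    int mb = (workerCacheMb == null) ? DEFAULT_CACHE_MB : workerCacheMb;
    if (mb <= 0) {
      throw new IllegalArgumentException("workerCacheMb must be positive: " + mb);
    }
    this.maxBytes = mb * 1024L * 1024L;
  }

  long maxBytes() {
    return maxBytes;
  }

  long usedBytes() {
    return usedBytes;
  }

  void put(String key, byte[] value) {
    byte[] old = entries.put(key, value);
    if (old != null) {
      usedBytes -= old.length;
    }
    usedBytes += value.length;
    // Evict least recently used entries until the byte budget is respected.
    Iterator<Map.Entry<String, byte[]>> it = entries.entrySet().iterator();
    while (usedBytes > maxBytes && it.hasNext()) {
      Map.Entry<String, byte[]> eldest = it.next();
      usedBytes -= eldest.getValue().length;
      it.remove();
    }
  }

  byte[] get(String key) {
    return entries.get(key);
  }

  public static void main(String[] args) {
    System.out.println("default bytes:    " + new SizedStateCache(null).maxBytes());
    System.out.println("configured bytes: " + new SizedStateCache(400).maxBytes());
  }
}
```

Passing the value through the constructor keeps the default in one place and lets the caller decide where the number comes from, for example a parsed pipeline option.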
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=438347&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-438347 ]

ASF GitHub Bot logged work on BEAM-9964:

Author: ASF GitHub Bot
Created on: 28/May/20 16:55
Start Date: 28/May/20 16:55
Worklog Time Spent: 10m
Work Description: steveniemitz commented on pull request #11710:
URL: https://github.com/apache/beam/pull/11710#issuecomment-635468825

hm, I was just rebasing my work against this commit and realized something. I had moved the flag to `DataflowPipelineDebugOptions`. `DataflowWorkerHarnessOptions` is not included in the options set visible to the "client" side of things. This means that you can't actually set this flag from a job submission, e.g.:

```
$ java -jar myjob.jar ... --runner=DataflowRunner --workerCacheMb=400
Exception in thread "main" java.lang.IllegalArgumentException: Class interface ...MyOptions missing a property named 'workerCacheMb'.
	at org.apache.beam.sdk.options.PipelineOptionsFactory.parseObjects(PipelineOptionsFactory.java:1625)
```

Issue Time Tracking
---

Worklog Id: (was: 438347)
Time Spent: 4.5h (was: 4h 20m)
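The failure mode in the comment above, a flag rejected because no visible options interface declares it, can be reproduced in miniature. This is a simplified model and not Beam's `PipelineOptionsFactory`: `SubmitOptions`, `SubmitOptionsWithCache` and `FlagParser` are hypothetical names, and the only thing borrowed from the thread is the wording of the error message.

```java
import java.lang.reflect.Method;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical options interface visible at job submission; it does not
// declare workerCacheMb.
interface SubmitOptions {
  String getRunner();
  void setRunner(String value);
}

// Hypothetical options interface that does declare the property.
interface SubmitOptionsWithCache extends SubmitOptions {
  Integer getWorkerCacheMb();
  void setWorkerCacheMb(Integer value);
}

final class FlagParser {
  private FlagParser() {}

  // Parses --name=value flags and rejects names that have no matching getter
  // on the given interface (inherited getters count).
  static Map<String, String> parse(Class<?> iface, String... args) {
    Map<String, String> parsed = new LinkedHashMap<>();
    for (String arg : args) {
      if (!arg.startsWith("--") || !arg.contains("=")) {
        throw new IllegalArgumentException("Expected --name=value but got: " + arg);
      }
      int eq = arg.indexOf('=');
      String name = arg.substring(2, eq);
      if (!hasGetter(iface, name)) {
        throw new IllegalArgumentException(
            "Class " + iface + " missing a property named '" + name + "'.");
      }
      parsed.put(name, arg.substring(eq + 1));
    }
    return parsed;
  }

  private static boolean hasGetter(Class<?> iface, String property) {
    if (property.isEmpty()) {
      return false;
    }
    String getter =
        "get" + Character.toUpperCase(property.charAt(0)) + property.substring(1);
    for (Method m : iface.getMethods()) {
      if (m.getName().equals(getter) && m.getParameterCount() == 0) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    System.out.println(
        parse(SubmitOptionsWithCache.class, "--runner=DataflowRunner", "--workerCacheMb=400"));
    try {
      parse(SubmitOptions.class, "--runner=DataflowRunner", "--workerCacheMb=400");
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

In this model the fix is the same as in the thread: declare the property on an interface that the submitting side actually uses.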
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=434574&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-434574 ]

ASF GitHub Bot logged work on BEAM-9964:

Author: ASF GitHub Bot
Created on: 18/May/20 19:36
Start Date: 18/May/20 19:36
Worklog Time Spent: 10m
Work Description: TheNeuralBit merged pull request #11743:
URL: https://github.com/apache/beam/pull/11743

Issue Time Tracking
---

Worklog Id: (was: 434574)
Time Spent: 4h 20m (was: 4h 10m)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=434571&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-434571 ]

ASF GitHub Bot logged work on BEAM-9964:

Author: ASF GitHub Bot
Created on: 18/May/20 19:28
Start Date: 18/May/20 19:28
Worklog Time Spent: 10m
Work Description: TheNeuralBit commented on pull request #11743:
URL: https://github.com/apache/beam/pull/11743#issuecomment-630390198

retest this please

Issue Time Tracking
---

Worklog Id: (was: 434571)
Time Spent: 4h 10m (was: 4h)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=434567&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-434567 ]

ASF GitHub Bot logged work on BEAM-9964:

Author: ASF GitHub Bot
Created on: 18/May/20 19:25
Start Date: 18/May/20 19:25
Worklog Time Spent: 10m
Work Description: omarismail94 opened a new pull request #11743:
URL: https://github.com/apache/beam/pull/11743

R: @TheNeuralBit

Adding work done on BEAM-9964 to CHANGES.md for Beam 2.22 release
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=434560=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-434560 ] ASF GitHub Bot logged work on BEAM-9964: Author: ASF GitHub Bot Created on: 18/May/20 19:12 Start Date: 18/May/20 19:12 Worklog Time Spent: 10m Work Description: omarismail94 closed pull request #11741: URL: https://github.com/apache/beam/pull/11741 Issue Time Tracking --- Worklog Id: (was: 434560) Time Spent: 3h 50m (was: 3h 40m)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=434559=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-434559 ] ASF GitHub Bot logged work on BEAM-9964: Author: ASF GitHub Bot Created on: 18/May/20 19:12 Start Date: 18/May/20 19:12 Worklog Time Spent: 10m Work Description: omarismail94 commented on pull request #11741: URL: https://github.com/apache/beam/pull/11741#issuecomment-630383080 I messed this up LOL. Will create a new PR Issue Time Tracking --- Worklog Id: (was: 434559) Time Spent: 3h 40m (was: 3.5h)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=434557=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-434557 ] ASF GitHub Bot logged work on BEAM-9964: Author: ASF GitHub Bot Created on: 18/May/20 19:10 Start Date: 18/May/20 19:10 Worklog Time Spent: 10m Work Description: omarismail94 opened a new pull request #11741: URL: https://github.com/apache/beam/pull/11741 Per R: @TheNeuralBit email, adding work done on BEAM-9964 to CHANGES.md for Beam 2.22 release
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433905=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433905 ] ASF GitHub Bot logged work on BEAM-9964: Author: ASF GitHub Bot Created on: 15/May/20 20:48 Start Date: 15/May/20 20:48 Worklog Time Spent: 10m Work Description: pabloem merged pull request #11710: URL: https://github.com/apache/beam/pull/11710 Issue Time Tracking --- Worklog Id: (was: 433905) Time Spent: 3h 10m (was: 3h)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433906=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433906 ] ASF GitHub Bot logged work on BEAM-9964: Author: ASF GitHub Bot Created on: 15/May/20 20:48 Start Date: 15/May/20 20:48 Worklog Time Spent: 10m Work Description: pabloem commented on pull request #11710: URL: https://github.com/apache/beam/pull/11710#issuecomment-629476263 thanks @omarismail94 ! Issue Time Tracking --- Worklog Id: (was: 433906) Time Spent: 3h 20m (was: 3h 10m)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433875=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433875 ] ASF GitHub Bot logged work on BEAM-9964: Author: ASF GitHub Bot Created on: 15/May/20 20:01 Start Date: 15/May/20 20:01 Worklog Time Spent: 10m Work Description: pabloem commented on pull request #11710: URL: https://github.com/apache/beam/pull/11710#issuecomment-629454663 Run Java PreCommit Issue Time Tracking --- Worklog Id: (was: 433875) Time Spent: 3h (was: 2h 50m)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433742=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433742 ] ASF GitHub Bot logged work on BEAM-9964: Author: ASF GitHub Bot Created on: 15/May/20 16:06 Start Date: 15/May/20 16:06 Worklog Time Spent: 10m Work Description: ibzib commented on pull request #11710: URL: https://github.com/apache/beam/pull/11710#issuecomment-629343652 Run Java PreCommit Issue Time Tracking --- Worklog Id: (was: 433742) Time Spent: 2h 50m (was: 2h 40m)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433741=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433741 ] ASF GitHub Bot logged work on BEAM-9964: Author: ASF GitHub Bot Created on: 15/May/20 16:04 Start Date: 15/May/20 16:04 Worklog Time Spent: 10m Work Description: ibzib commented on pull request #11710: URL: https://github.com/apache/beam/pull/11710#issuecomment-629342947 Run Java PreCommit Issue Time Tracking --- Worklog Id: (was: 433741) Time Spent: 2h 40m (was: 2.5h)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433451=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433451 ] ASF GitHub Bot logged work on BEAM-9964: Author: ASF GitHub Bot Created on: 15/May/20 00:38 Start Date: 15/May/20 00:38 Worklog Time Spent: 10m Work Description: pabloem commented on pull request #11710: URL: https://github.com/apache/beam/pull/11710#issuecomment-628958485 Seems like java precommits are broken on master - but this change LGTM. I'll wait for precommits to be fixed if possible. Thanks @omarismail94 ! Issue Time Tracking --- Worklog Id: (was: 433451) Time Spent: 2.5h (was: 2h 20m)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433420=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433420 ] ASF GitHub Bot logged work on BEAM-9964: Author: ASF GitHub Bot Created on: 14/May/20 22:44 Start Date: 14/May/20 22:44 Worklog Time Spent: 10m Work Description: pabloem commented on pull request #11710: URL: https://github.com/apache/beam/pull/11710#issuecomment-628924201 Run Java PostCommit Issue Time Tracking --- Worklog Id: (was: 433420) Time Spent: 2h 20m (was: 2h 10m)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433381=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433381 ] ASF GitHub Bot logged work on BEAM-9964: Author: ASF GitHub Bot Created on: 14/May/20 21:41 Start Date: 14/May/20 21:41 Worklog Time Spent: 10m Work Description: pabloem commented on pull request #11710: URL: https://github.com/apache/beam/pull/11710#issuecomment-628901207 Run Java PostCommit Issue Time Tracking --- Worklog Id: (was: 433381) Time Spent: 2h 10m (was: 2h)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433367=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433367 ] ASF GitHub Bot logged work on BEAM-9964: Author: ASF GitHub Bot Created on: 14/May/20 21:17 Start Date: 14/May/20 21:17 Worklog Time Spent: 10m Work Description: pabloem commented on pull request #11710: URL: https://github.com/apache/beam/pull/11710#issuecomment-628891444 retest this please Issue Time Tracking --- Worklog Id: (was: 433367) Time Spent: 2h (was: 1h 50m)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433365=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433365 ] ASF GitHub Bot logged work on BEAM-9964: Author: ASF GitHub Bot Created on: 14/May/20 21:16 Start Date: 14/May/20 21:16 Worklog Time Spent: 10m Work Description: pabloem commented on pull request #11710: URL: https://github.com/apache/beam/pull/11710#issuecomment-628890978 retest this please Issue Time Tracking --- Worklog Id: (was: 433365) Time Spent: 1h 50m (was: 1h 40m)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433363=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433363 ] ASF GitHub Bot logged work on BEAM-9964: Author: ASF GitHub Bot Created on: 14/May/20 21:11 Start Date: 14/May/20 21:11 Worklog Time Spent: 10m Work Description: omarismail94 commented on pull request #11710: URL: https://github.com/apache/beam/pull/11710#issuecomment-62729 New changes passed ./gradlew -p runners/google-cloud-dataflow-java check on my computer Issue Time Tracking --- Worklog Id: (was: 433363) Time Spent: 1h 40m (was: 1.5h)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433359=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433359 ]

ASF GitHub Bot logged work on BEAM-9964:

Author: ASF GitHub Bot
Created on: 14/May/20 21:07
Start Date: 14/May/20 21:07
Worklog Time Spent: 10m
Work Description: omarismail94 commented on pull request #11710:
URL: https://github.com/apache/beam/pull/11710#issuecomment-628886984

retest this please

Issue Time Tracking
---
Worklog Id: (was: 433359)
Time Spent: 1.5h (was: 1h 20m)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433358=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433358 ]

ASF GitHub Bot logged work on BEAM-9964:

Author: ASF GitHub Bot
Created on: 14/May/20 21:05
Start Date: 14/May/20 21:05
Worklog Time Spent: 10m
Work Description: omarismail94 commented on a change in pull request #11710:
URL: https://github.com/apache/beam/pull/11710#discussion_r425430039

##
File path: runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/WindmillStateCacheTest.java
##
@@ -130,7 +133,8 @@ private static StateNamespace triggerNamespace(long start, int triggerIdx) {
   @Before
   public void setUp() {
-    cache = new WindmillStateCache();
+    options = PipelineOptionsFactory.as(DataflowWorkerHarnessOptions.class);
+    cache = new WindmillStateCache(options.getWorkerCacheMb());
     assertEquals(0, cache.getWeight());

Review comment:
Fixed this by adding a new test method to the WindmillStateCacheTest class. I created a new getter in WindmillStateCache to retrieve the maximum weight in bytes, and compared it to the initial value set.

Issue Time Tracking
---
Worklog Id: (was: 433358)
Time Spent: 1h 20m (was: 1h 10m)
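A minimal, self-contained sketch of the kind of check discussed in this review thread: a cache built from a megabyte setting exposes its maximum weight in bytes, and a test compares that value with the configured size. The class `SizedCacheSketch`, its constructor, and `getMaxWeightBytes()` are hypothetical names used only for illustration; they are not the actual `WindmillStateCache` API, and the 1024 * 1024 bytes-per-MB conversion is an assumption.

```java
// Hypothetical stand-in for a cache sized from a "worker cache MB" option.
// This is NOT the real WindmillStateCache; names and conversion are assumptions.
final class SizedCacheSketch {
  // Assumed conversion factor: 1 MB = 1024 * 1024 bytes.
  private static final long BYTES_PER_MB = 1024L * 1024L;

  private final long maxWeightBytes;

  SizedCacheSketch(int workerCacheMb) {
    if (workerCacheMb <= 0) {
      throw new IllegalArgumentException("workerCacheMb must be positive");
    }
    // Use long arithmetic so large MB values do not overflow an int.
    this.maxWeightBytes = workerCacheMb * BYTES_PER_MB;
  }

  // Getter that lets a test observe the configured maximum weight.
  long getMaxWeightBytes() {
    return maxWeightBytes;
  }

  public static void main(String[] args) {
    // Use a value other than the old hardcoded 100 so the check is meaningful.
    SizedCacheSketch cache = new SizedCacheSketch(400);
    if (cache.getMaxWeightBytes() != 400L * 1024L * 1024L) {
      throw new AssertionError("unexpected max weight: " + cache.getMaxWeightBytes());
    }
    System.out.println("max weight bytes = " + cache.getMaxWeightBytes());
  }
}
```

Picking a test value different from the previous default (400 rather than 100 here) is what shows that the setting actually reached the constructor instead of a leftover constant.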
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433281=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433281 ]

ASF GitHub Bot logged work on BEAM-9964:

Author: ASF GitHub Bot
Created on: 14/May/20 19:00
Start Date: 14/May/20 19:00
Worklog Time Spent: 10m
Work Description: steveniemitz commented on pull request #11710:
URL: https://github.com/apache/beam/pull/11710#issuecomment-628828183

> These shouldn't need to be battles but more like:
>
> ```
> you: dev@ hey we got this patch we use locally it is X
> dev@: sounds great, submit a PR or we have a larger plan around this do you want to work on it with us
> ```

heh, I feel bad clogging up this PR with unrelated conversations. If that process you described was how it worked in real life, that'd be great. Feel free to ping me on the ASF slack (at-steve) if you want to chat more about this.

Issue Time Tracking
---
Worklog Id: (was: 433281)
Time Spent: 1h 10m (was: 1h)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433279=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433279 ]

ASF GitHub Bot logged work on BEAM-9964:

Author: ASF GitHub Bot
Created on: 14/May/20 18:56
Start Date: 14/May/20 18:56
Worklog Time Spent: 10m
Work Description: lukecwik commented on pull request #11710:
URL: https://github.com/apache/beam/pull/11710#issuecomment-628826053

> > > ahh this is great. We've been running a similar patch in our fork forever.
> >
> > Feel free to submit patches upstream
>
> heh, I've been choosing my battles ;)

These shouldn't need to be battles but more like:

```
you: dev@ hey we got this patch we use locally it is X
dev@: sounds great, submit a PR or we have a larger plan around this do you want to work on it with us
```

Issue Time Tracking
---
Worklog Id: (was: 433279)
Time Spent: 1h (was: 50m)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433276=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433276 ]

ASF GitHub Bot logged work on BEAM-9964:

Author: ASF GitHub Bot
Created on: 14/May/20 18:54
Start Date: 14/May/20 18:54
Worklog Time Spent: 10m
Work Description: steveniemitz commented on pull request #11710:
URL: https://github.com/apache/beam/pull/11710#issuecomment-628824973

> > ahh this is great. We've been running a similar patch in our fork forever.
>
> Feel free to submit patches upstream

heh, I've been choosing my battles ;)

Issue Time Tracking
---
Worklog Id: (was: 433276)
Time Spent: 50m (was: 40m)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433268=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433268 ]

ASF GitHub Bot logged work on BEAM-9964:

Author: ASF GitHub Bot
Created on: 14/May/20 18:45
Start Date: 14/May/20 18:45
Worklog Time Spent: 10m
Work Description: lukecwik commented on pull request #11710:
URL: https://github.com/apache/beam/pull/11710#issuecomment-628820475

> ahh this is great. We've been running a similar patch in our fork forever.

Feel free to submit patches upstream

Issue Time Tracking
---
Worklog Id: (was: 433268)
Time Spent: 40m (was: 0.5h)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433265=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433265 ]

ASF GitHub Bot logged work on BEAM-9964:

Author: ASF GitHub Bot
Created on: 14/May/20 18:40
Start Date: 14/May/20 18:40
Worklog Time Spent: 10m
Work Description: steveniemitz commented on pull request #11710:
URL: https://github.com/apache/beam/pull/11710#issuecomment-628818041

ahh this is great. We've been running a similar patch in our fork forever.

Issue Time Tracking
---
Worklog Id: (was: 433265)
Time Spent: 0.5h (was: 20m)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433259=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433259 ]

ASF GitHub Bot logged work on BEAM-9964:

Author: ASF GitHub Bot
Created on: 14/May/20 18:27
Start Date: 14/May/20 18:27
Worklog Time Spent: 10m
Work Description: pabloem commented on a change in pull request #11710:
URL: https://github.com/apache/beam/pull/11710#discussion_r425345741

##
File path: runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/WindmillStateCacheTest.java
##
@@ -130,7 +133,8 @@ private static StateNamespace triggerNamespace(long start, int triggerIdx) {
   @Before
   public void setUp() {
-    cache = new WindmillStateCache();
+    options = PipelineOptionsFactory.as(DataflowWorkerHarnessOptions.class);
+    cache = new WindmillStateCache(options.getWorkerCacheMb());
     assertEquals(0, cache.getWeight());

Review comment:
Can you add a check to this test to make sure that the `maximumWeight` of the cache is 100 MB? (Perhaps use a number other than 100 to be sure.)

Issue Time Tracking
---
Worklog Id: (was: 433259)
Time Spent: 20m (was: 10m)
[jira] [Work logged] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?focusedWorklogId=433251=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-433251 ]

ASF GitHub Bot logged work on BEAM-9964:

Author: ASF GitHub Bot
Created on: 14/May/20 18:11
Start Date: 14/May/20 18:11
Worklog Time Spent: 10m
Work Description: omarismail94 opened a new pull request #11710:
URL: https://github.com/apache/beam/pull/11710

R: @pabloem

Setting --workerCacheMB seems to affect batch pipelines only. For Streaming, the cache seems to be hardcoded to 100Mb [1]. If possible, I would like to allow changing the cache value in Streaming by setting --workerCacheMB.

Passed ./gradlew -p runners/google-cloud-dataflow-java check on my computer

[1] https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73

Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

 - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`).
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
 - [ ] Update `CHANGES.md` with noteworthy changes.
 - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).