[jira] [Work logged] (BEAM-4031) Add missing dataflow customization options for Go SDK
[ https://issues.apache.org/jira/browse/BEAM-4031?focusedWorklogId=98052&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-98052 ]

ASF GitHub Bot logged work on BEAM-4031:

    Author: ASF GitHub Bot
    Created on: 03/May/18 21:57
    Start Date: 03/May/18 21:57
    Worklog Time Spent: 10m
    Work Description: tgroh closed pull request #5272: [BEAM-4031] Add labels flag to Go SDK Dataflow Runner.
    URL: https://github.com/apache/beam/pull/5272

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/sdks/go/pkg/beam/runners/dataflow/dataflow.go b/sdks/go/pkg/beam/runners/dataflow/dataflow.go
index 093628769f3..38d04fee143 100644
--- a/sdks/go/pkg/beam/runners/dataflow/dataflow.go
+++ b/sdks/go/pkg/beam/runners/dataflow/dataflow.go
@@ -52,6 +52,7 @@ var (
     endpoint        = flag.String("dataflow_endpoint", "", "Dataflow endpoint (optional).")
     stagingLocation = flag.String("staging_location", "", "GCS staging location (required).")
     image           = flag.String("worker_harness_container_image", "", "Worker harness container image (required).")
+    labels          = flag.String("labels", "", "JSON-formatted map[string]string of job labels (optional).")
     numWorkers      = flag.Int64("num_workers", 0, "Number of workers (optional).")
     zone            = flag.String("zone", "", "GCP zone (optional)")
     region          = flag.String("region", "us-central1", "GCP Region (optional)")
@@ -93,6 +94,12 @@ func Execute(ctx context.Context, p *beam.Pipeline) error {
     if *image == "" {
         *image = jobopts.GetContainerImage(ctx)
     }
+    var jobLabels map[string]string
+    if *labels != "" {
+        if err := json.Unmarshal([]byte(*labels), &jobLabels); err != nil {
+            return fmt.Errorf("Error reading --label flag as JSON: %v", err)
+        }
+    }
     jobName := jobopts.GetJobName()

     edges, _, err := p.Build()
@@ -201,6 +208,7 @@ func Execute(ctx context.Context, p *beam.Pipeline) error {
             TempStoragePrefix: *stagingLocation + "/tmp",
             Experiments:       append(jobopts.GetExperiments(), "beam_fn_api"),
         },
+        Labels: jobLabels,
         Steps:  steps,
     }

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id: (was: 98052)
    Time Spent: 2h (was: 1h 50m)

> Add missing dataflow customization options for Go SDK
> -----------------------------------------------------
>
>                 Key: BEAM-4031
>                 URL: https://issues.apache.org/jira/browse/BEAM-4031
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-go
>            Reporter: Henning Rohde
>            Assignee: Jason Kuster
>            Priority: Minor
>             Fix For: 2.5.0
>
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> We're missing at least:
>     zone
>     temp_location
>     worker_machine_type

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
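
For readers following the diff above, the label handling can be tried in isolation. The following is only a minimal sketch, assuming a hypothetical helper named parseLabels that is not part of the Beam Go SDK; it mirrors the logic in the patch (an empty flag value means no labels, otherwise the value must be a JSON object of string keys and string values) and wraps the decoding error with context, as requested in review.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// parseLabels is a hypothetical helper (not part of the Beam Go SDK).
// It decodes a JSON object such as {"team":"data"} into a label map.
// An empty string means that no labels were supplied and yields a nil map.
func parseLabels(raw string) (map[string]string, error) {
	if raw == "" {
		return nil, nil
	}
	var labels map[string]string
	if err := json.Unmarshal([]byte(raw), &labels); err != nil {
		// Wrap the decoding error so the user can tell which flag was malformed.
		// The message text here is illustrative, not the SDK's exact wording.
		return nil, fmt.Errorf("error reading --labels flag as JSON: %v", err)
	}
	return labels, nil
}
```

Under these assumptions, a call such as parseLabels(`{"team":"data","env":"test"}`) returns a two-entry map, while a value like team=data, or a JSON object with non-string values, returns an error instead of silently dropping the labels.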
[jira] [Work logged] (BEAM-4031) Add missing dataflow customization options for Go SDK
[ https://issues.apache.org/jira/browse/BEAM-4031?focusedWorklogId=98047&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-98047 ]

ASF GitHub Bot logged work on BEAM-4031:

    Author: ASF GitHub Bot
    Created on: 03/May/18 21:43
    Start Date: 03/May/18 21:43
    Worklog Time Spent: 10m
    Work Description: jasonkuster commented on issue #5272: [BEAM-4031] Add labels flag to Go SDK Dataflow Runner.
    URL: https://github.com/apache/beam/pull/5272#issuecomment-386446626

@tgroh can you give this a quick look?

Issue Time Tracking
-------------------

    Worklog Id: (was: 98047)
    Time Spent: 1h 50m (was: 1h 40m)
[jira] [Work logged] (BEAM-4031) Add missing dataflow customization options for Go SDK
[ https://issues.apache.org/jira/browse/BEAM-4031?focusedWorklogId=98046&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-98046 ]

ASF GitHub Bot logged work on BEAM-4031:

    Author: ASF GitHub Bot
    Created on: 03/May/18 21:42
    Start Date: 03/May/18 21:42
    Worklog Time Spent: 10m
    Work Description: jasonkuster commented on a change in pull request #5272: [BEAM-4031] Add labels flag to Go SDK Dataflow Runner.
    URL: https://github.com/apache/beam/pull/5272#discussion_r185944748

## File path: sdks/go/pkg/beam/runners/dataflow/dataflow.go ##

@@ -93,6 +94,12 @@ func Execute(ctx context.Context, p *beam.Pipeline) error {
     if *image == "" {
         *image = jobopts.GetContainerImage(ctx)
     }
+    var jobLabels map[string]string
+    if *labels != "" {
+        if err := json.Unmarshal([]byte(*labels), &jobLabels); err != nil {
+            return err

Review comment:
Done.

Issue Time Tracking
-------------------

    Worklog Id: (was: 98046)
    Time Spent: 1h 40m (was: 1.5h)
[jira] [Work logged] (BEAM-4031) Add missing dataflow customization options for Go SDK
[ https://issues.apache.org/jira/browse/BEAM-4031?focusedWorklogId=98042&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-98042 ]

ASF GitHub Bot logged work on BEAM-4031:

    Author: ASF GitHub Bot
    Created on: 03/May/18 21:26
    Start Date: 03/May/18 21:26
    Worklog Time Spent: 10m
    Work Description: herohde commented on a change in pull request #5272: [BEAM-4031] Add labels flag to Go SDK Dataflow Runner.
    URL: https://github.com/apache/beam/pull/5272#discussion_r185941021

## File path: sdks/go/pkg/beam/runners/dataflow/dataflow.go ##

@@ -93,6 +94,12 @@ func Execute(ctx context.Context, p *beam.Pipeline) error {
     if *image == "" {
         *image = jobopts.GetContainerImage(ctx)
     }
+    var jobLabels map[string]string
+    if *labels != "" {
+        if err := json.Unmarshal([]byte(*labels), &jobLabels); err != nil {
+            return err

Review comment:
return a descriptive error

Issue Time Tracking
-------------------

    Worklog Id: (was: 98042)
    Time Spent: 1.5h (was: 1h 20m)
[jira] [Work logged] (BEAM-4031) Add missing dataflow customization options for Go SDK
[ https://issues.apache.org/jira/browse/BEAM-4031?focusedWorklogId=98039&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-98039 ]

ASF GitHub Bot logged work on BEAM-4031:

    Author: ASF GitHub Bot
    Created on: 03/May/18 21:23
    Start Date: 03/May/18 21:23
    Worklog Time Spent: 10m
    Work Description: jasonkuster opened a new pull request #5272: [BEAM-4031] Add labels flag to Go SDK Dataflow Runner.
    URL: https://github.com/apache/beam/pull/5272

Signed-off-by: Jason Kuster

Add labels flag to Go SDK Dataflow Runner.

Follow this checklist to help us incorporate your contribution quickly and easily:

 - [ ] Make sure there is a [JIRA issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the change (usually before you start working on it). Trivial changes like typos do not require a JIRA issue. Your pull request should address just this issue, without pulling in other changes.
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue.
 - [ ] Write a pull request description that is detailed enough to understand:
   - [ ] What the pull request does
   - [ ] Why it does it
   - [ ] How it does it
   - [ ] Why this approach
 - [ ] Each commit in the pull request should have a meaningful subject line and body.
 - [ ] Run `./gradlew build` to make sure basic checks pass. A more thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

Issue Time Tracking
-------------------

    Worklog Id: (was: 98039)
    Time Spent: 1h 10m (was: 1h)
[jira] [Work logged] (BEAM-4031) Add missing dataflow customization options for Go SDK
[ https://issues.apache.org/jira/browse/BEAM-4031?focusedWorklogId=98040&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-98040 ]

ASF GitHub Bot logged work on BEAM-4031:

    Author: ASF GitHub Bot
    Created on: 03/May/18 21:23
    Start Date: 03/May/18 21:23
    Worklog Time Spent: 10m
    Work Description: jasonkuster commented on issue #5272: [BEAM-4031] Add labels flag to Go SDK Dataflow Runner.
    URL: https://github.com/apache/beam/pull/5272#issuecomment-386440902

R: @herohde

Issue Time Tracking
-------------------

    Worklog Id: (was: 98040)
    Time Spent: 1h 20m (was: 1h 10m)
[jira] [Work logged] (BEAM-4031) Add missing dataflow customization options for Go SDK
[ https://issues.apache.org/jira/browse/BEAM-4031?focusedWorklogId=89505&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-89505 ]

ASF GitHub Bot logged work on BEAM-4031:

    Author: ASF GitHub Bot
    Created on: 10/Apr/18 17:17
    Start Date: 10/Apr/18 17:17
    Worklog Time Spent: 10m
    Work Description: tgroh closed pull request #5070: [BEAM-4031] Add more Go SDK Dataflow options
    URL: https://github.com/apache/beam/pull/5070

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/sdks/go/pkg/beam/runners/dataflow/dataflow.go b/sdks/go/pkg/beam/runners/dataflow/dataflow.go
index 4173c9f4f5a..cdc8221b9b9 100644
--- a/sdks/go/pkg/beam/runners/dataflow/dataflow.go
+++ b/sdks/go/pkg/beam/runners/dataflow/dataflow.go
@@ -53,6 +53,10 @@ var (
     stagingLocation = flag.String("staging_location", "", "GCS staging location (required).")
     image           = flag.String("worker_harness_container_image", "", "Worker harness container image (required).")
     numWorkers      = flag.Int64("num_workers", 0, "Number of workers (optional).")
+    zone            = flag.String("zone", "", "GCP zone (optional)")
+    network         = flag.String("network", "", "GCP network (optional)")
+    tempLocation    = flag.String("temp_location", "", "Temp location (optional)")
+    machineType     = flag.String("worker_machine_type", "", "GCE machine type (optional)")

     dryRun         = flag.Bool("dry_run", false, "Dry run. Just print the job, but don't submit it.")
     teardownPolicy = flag.String("teardown_policy", "", "Job teardown policy (internal only).")
@@ -174,6 +178,9 @@ func Execute(ctx context.Context, p *beam.Pipeline) error {
                 }},
                 WorkerHarnessContainerImage: *image,
                 NumWorkers:                  1,
+                MachineType:                 *machineType,
+                Network:                     *network,
+                Zone:                        *zone,
             }},
             TempStoragePrefix: *stagingLocation + "/tmp",
             Experiments:       jobopts.GetExperiments(),
@@ -187,6 +194,9 @@ func Execute(ctx context.Context, p *beam.Pipeline) error {
     if *teardownPolicy != "" {
         job.Environment.WorkerPools[0].TeardownPolicy = *teardownPolicy
     }
+    if *tempLocation != "" {
+        job.Environment.TempStoragePrefix = *tempLocation
+    }
     printJob(ctx, job)

     if *dryRun {

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id: (was: 89505)
    Time Spent: 1h (was: 50m)

> Add missing dataflow customization options for Go SDK
> -----------------------------------------------------
>
>                 Key: BEAM-4031
>                 URL: https://issues.apache.org/jira/browse/BEAM-4031
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-go
>            Reporter: Henning Rohde
>            Assignee: Henning Rohde
>            Priority: Minor
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> We're missing at least:
>     zone
>     temp_location
>     worker_machine_type

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
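
The temp location behaviour in the diff above (default to the staging location plus "/tmp", override only when temp_location is set) can be sketched on its own. This is an illustration under stated assumptions, not the runner's code: the options type, parseOptions, and tempStoragePrefix are hypothetical names, and a private flag.FlagSet stands in for the package-level flags used by the Dataflow runner. Only the flag names and the fallback rule come from the patch.

```go
package main

import "flag"

// options is a hypothetical container for a few of the flags named in the patch.
type options struct {
	StagingLocation string
	TempLocation    string
	Zone            string
	Network         string
	MachineType     string
}

// parseOptions parses args with a private FlagSet so the sketch does not
// touch the process-wide flag.CommandLine set.
func parseOptions(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("dataflow", flag.ContinueOnError)
	fs.StringVar(&o.StagingLocation, "staging_location", "", "GCS staging location (required).")
	fs.StringVar(&o.TempLocation, "temp_location", "", "Temp location (optional)")
	fs.StringVar(&o.Zone, "zone", "", "GCP zone (optional)")
	fs.StringVar(&o.Network, "network", "", "GCP network (optional)")
	fs.StringVar(&o.MachineType, "worker_machine_type", "", "GCE machine type (optional)")
	if err := fs.Parse(args); err != nil {
		return options{}, err
	}
	return o, nil
}

// tempStoragePrefix applies the fallback rule from the patch: use
// temp_location when it is set, otherwise staging_location + "/tmp".
func (o options) tempStoragePrefix() string {
	if o.TempLocation != "" {
		return o.TempLocation
	}
	return o.StagingLocation + "/tmp"
}
```

With only --staging_location=gs://bucket/staging supplied, tempStoragePrefix yields gs://bucket/staging/tmp; adding --temp_location=gs://bucket/scratch makes it yield that value instead. The bucket names are placeholders.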
[jira] [Work logged] (BEAM-4031) Add missing dataflow customization options for Go SDK
[ https://issues.apache.org/jira/browse/BEAM-4031?focusedWorklogId=89504&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-89504 ]

ASF GitHub Bot logged work on BEAM-4031:

    Author: ASF GitHub Bot
    Created on: 10/Apr/18 17:17
    Start Date: 10/Apr/18 17:17
    Worklog Time Spent: 10m
    Work Description: jasonkuster commented on issue #5070: [BEAM-4031] Add more Go SDK Dataflow options
    URL: https://github.com/apache/beam/pull/5070#issuecomment-379963008

@tgroh senpai pls notice me

(Why hello. -Senpai)

Issue Time Tracking
-------------------

    Worklog Id: (was: 89504)
    Time Spent: 50m (was: 40m)
[jira] [Work logged] (BEAM-4031) Add missing dataflow customization options for Go SDK
[ https://issues.apache.org/jira/browse/BEAM-4031?focusedWorklogId=89234&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-89234 ]

ASF GitHub Bot logged work on BEAM-4031:

    Author: ASF GitHub Bot
    Created on: 10/Apr/18 03:31
    Start Date: 10/Apr/18 03:31
    Worklog Time Spent: 10m
    Work Description: jasonkuster commented on issue #5070: [BEAM-4031] Add more Go SDK Dataflow options
    URL: https://github.com/apache/beam/pull/5070#issuecomment-379963008

@tgroh senpai pls notice me

Issue Time Tracking
-------------------

    Worklog Id: (was: 89234)
    Time Spent: 40m (was: 0.5h)
[jira] [Work logged] (BEAM-4031) Add missing dataflow customization options for Go SDK
[ https://issues.apache.org/jira/browse/BEAM-4031?focusedWorklogId=89233&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-89233 ]

ASF GitHub Bot logged work on BEAM-4031:

    Author: ASF GitHub Bot
    Created on: 10/Apr/18 03:30
    Start Date: 10/Apr/18 03:30
    Worklog Time Spent: 10m
    Work Description: jasonkuster commented on issue #5070: [BEAM-4031] Add more Go SDK Dataflow options
    URL: https://github.com/apache/beam/pull/5070#issuecomment-379962838

LGTM

Issue Time Tracking
-------------------

    Worklog Id: (was: 89233)
    Time Spent: 0.5h (was: 20m)
[jira] [Work logged] (BEAM-4031) Add missing dataflow customization options for Go SDK
[ https://issues.apache.org/jira/browse/BEAM-4031?focusedWorklogId=89178&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-89178 ]

ASF GitHub Bot logged work on BEAM-4031:

    Author: ASF GitHub Bot
    Created on: 09/Apr/18 23:03
    Start Date: 09/Apr/18 23:03
    Worklog Time Spent: 10m
    Work Description: herohde commented on issue #5070: [BEAM-4031] Add more Go SDK Dataflow options
    URL: https://github.com/apache/beam/pull/5070#issuecomment-379920401

R: @jasonkuster

Issue Time Tracking
-------------------

    Worklog Id: (was: 89178)
    Time Spent: 20m (was: 10m)
[jira] [Work logged] (BEAM-4031) Add missing dataflow customization options for Go SDK
[ https://issues.apache.org/jira/browse/BEAM-4031?focusedWorklogId=89177&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-89177 ]

ASF GitHub Bot logged work on BEAM-4031:

    Author: ASF GitHub Bot
    Created on: 09/Apr/18 23:02
    Start Date: 09/Apr/18 23:02
    Worklog Time Spent: 10m
    Work Description: herohde opened a new pull request #5070: [BEAM-4031] Add more Go SDK Dataflow options
    URL: https://github.com/apache/beam/pull/5070

Issue Time Tracking
-------------------

    Worklog Id: (was: 89177)
    Time Spent: 10m
    Remaining Estimate: 0h