[jira] [Work logged] (BEAM-3856) Add prototype support for Go SDK streaming
[ https://issues.apache.org/jira/browse/BEAM-3856?focusedWorklogId=91922=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91922 ] ASF GitHub Bot logged work on BEAM-3856: Author: ASF GitHub Bot Created on: 17/Apr/18 21:00 Start Date: 17/Apr/18 21:00 Worklog Time Spent: 10m Work Description: robertwb closed pull request #4939: [BEAM-3856][BEAM-3854] Add prototype of Go streaming on Dataflow with PubSub URL: https://github.com/apache/beam/pull/4939 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/sdks/go/examples/streaming_wordcap/wordcap.go b/sdks/go/examples/streaming_wordcap/wordcap.go new file mode 100644 index 000..045fb375577 --- /dev/null +++ b/sdks/go/examples/streaming_wordcap/wordcap.go @@ -0,0 +1,81 @@ +// Licensed to the Apache Software Foundation (ASF) under one or more +// contributor license agreements. See the NOTICE file distributed with +// this work for additional information regarding copyright ownership. +// The ASF licenses this file to You under the Apache License, Version 2.0 +// (the "License"); you may not use this file except in compliance with +// the License. You may obtain a copy of the License at +// +//http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +// streaming_wordcap is a toy streaming pipeline that uses PubSub. It +// does the following: +//(1) create a topic and publish a few messages to it +//(2) start a streaming pipeline that converts the messages to +//upper case and logs the result. +// +// NOTE: it only runs on Dataflow and must be manually cancelled. +package main + +import ( + "context" + "flag" + "os" + "strings" + + "github.com/apache/beam/sdks/go/pkg/beam" + "github.com/apache/beam/sdks/go/pkg/beam/core/util/stringx" + "github.com/apache/beam/sdks/go/pkg/beam/io/pubsubio" + "github.com/apache/beam/sdks/go/pkg/beam/log" + "github.com/apache/beam/sdks/go/pkg/beam/options/gcpopts" + "github.com/apache/beam/sdks/go/pkg/beam/util/pubsubx" + "github.com/apache/beam/sdks/go/pkg/beam/x/beamx" + "github.com/apache/beam/sdks/go/pkg/beam/x/debug" +) + +var ( + input = flag.String("input", os.ExpandEnv("$USER-wordcap"), "Pubsub input topic.") +) + +var ( + data = []string{ + "foo", + "bar", + "baz", + } +) + +func main() { + flag.Parse() + beam.Init() + + ctx := context.Background() + project := gcpopts.GetProject(ctx) + + log.Infof(ctx, "Publishing %v messages to: %v", len(data), *input) + + defer pubsubx.CleanupTopic(ctx, project, *input) + sub, err := pubsubx.Publish(ctx, project, *input, data...) + if err != nil { + log.Fatal(ctx, err) + } + + log.Infof(ctx, "Running streaming wordcap with subscription: %v", sub.ID()) + + p := beam.NewPipeline() + s := p.Root() + + col := pubsubio.Read(s, project, *input, {Subscription: sub.ID()}) + str := beam.ParDo(s, stringx.FromBytes, col) + cap := beam.ParDo(s, strings.ToUpper, str) + debug.Print(s, cap) + + if err := beamx.Run(context.Background(), p); err != nil { + log.Exitf(ctx, "Failed to execute job: %v", err) + } +} diff --git a/sdks/go/pkg/beam/core/util/stringx/bytes.go b/sdks/go/pkg/beam/core/util/stringx/bytes.go new file mode 100644 index 000..b2110fa1072 --- /dev/null +++ b/sdks/go/pkg/beam/core/util/stringx/bytes.go @@ -0,0 +1,28 @@ +// Licensed to the Apache Software Foundation (ASF) under one or more +// contributor license agreements. See the NOTICE file distributed with +// this work for additional information regarding copyright ownership. +// The ASF licenses this file to You under the Apache License, Version 2.0 +// (the "License"); you may not use this file except in compliance with +// the License. You may obtain a copy of the License at +// +//http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. +
[jira] [Work logged] (BEAM-3856) Add prototype support for Go SDK streaming
[ https://issues.apache.org/jira/browse/BEAM-3856?focusedWorklogId=91833=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91833 ] ASF GitHub Bot logged work on BEAM-3856: Author: ASF GitHub Bot Created on: 17/Apr/18 17:09 Start Date: 17/Apr/18 17:09 Worklog Time Spent: 10m Work Description: herohde commented on issue #4939: [BEAM-3856][BEAM-3854] Add prototype of Go streaming on Dataflow with PubSub URL: https://github.com/apache/beam/pull/4939#issuecomment-382070307 Thanks @wcn3. @robertwb I think we can merge now. PTAL This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 91833) Time Spent: 2.5h (was: 2h 20m) > Add prototype support for Go SDK streaming > -- > > Key: BEAM-3856 > URL: https://issues.apache.org/jira/browse/BEAM-3856 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Willy Lulciuc >Assignee: Henning Rohde >Priority: Major > Time Spent: 2.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-3856) Add prototype support for Go SDK streaming
[ https://issues.apache.org/jira/browse/BEAM-3856?focusedWorklogId=91832=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91832 ] ASF GitHub Bot logged work on BEAM-3856: Author: ASF GitHub Bot Created on: 17/Apr/18 17:08 Start Date: 17/Apr/18 17:08 Worklog Time Spent: 10m Work Description: herohde commented on a change in pull request #4939: [BEAM-3856][BEAM-3854] Add prototype of Go streaming on Dataflow with PubSub URL: https://github.com/apache/beam/pull/4939#discussion_r182157552 ## File path: sdks/go/pkg/beam/util/pubsubx/pubsub.go ## @@ -0,0 +1,110 @@ +// Licensed to the Apache Software Foundation (ASF) under one or more +// contributor license agreements. See the NOTICE file distributed with +// this work for additional information regarding copyright ownership. +// The ASF licenses this file to You under the Apache License, Version 2.0 +// (the "License"); you may not use this file except in compliance with +// the License. You may obtain a copy of the License at +// +//http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +// Package pubsubx contains utilities for working with Google PubSub. +package pubsubx + +import ( + "context" + "fmt" + "time" + + "cloud.google.com/go/pubsub" + "github.com/apache/beam/sdks/go/pkg/beam/log" +) + +// MakeQualifiedTopicName returns a fully-qualified topic name for +// the given project and topic. +func MakeQualifiedTopicName(project, topic string) string { Review comment: Ack. We use "make" in many places in the code (incl. this specific name in biqqueryio). I would rather consider such changes in a global naming pass outside of this PR. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 91832) Time Spent: 2h 20m (was: 2h 10m) > Add prototype support for Go SDK streaming > -- > > Key: BEAM-3856 > URL: https://issues.apache.org/jira/browse/BEAM-3856 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Willy Lulciuc >Assignee: Henning Rohde >Priority: Major > Time Spent: 2h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-3856) Add prototype support for Go SDK streaming
[ https://issues.apache.org/jira/browse/BEAM-3856?focusedWorklogId=91829=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91829 ] ASF GitHub Bot logged work on BEAM-3856: Author: ASF GitHub Bot Created on: 17/Apr/18 17:05 Start Date: 17/Apr/18 17:05 Worklog Time Spent: 10m Work Description: herohde commented on a change in pull request #4939: [BEAM-3856][BEAM-3854] Add prototype of Go streaming on Dataflow with PubSub URL: https://github.com/apache/beam/pull/4939#discussion_r182156926 ## File path: sdks/go/pkg/beam/runners/dataflow/translate.go ## @@ -351,6 +363,45 @@ func translateEdge(edge *graph.MultiEdge) (string, properties, error) { UserName: buildName(edge.Scope(), "flatten"), // TODO: user-defined }, nil + case graph.External: Review comment: Ack. The Dataflow translation won't likely change further and any other interpretation of the external payloads would have to be supported by the underlying engine. Instead of adding a stateful registration to an otherwise pure translation, substitutions in the proto pipeline would be a preferable way to go long term. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 91829) Time Spent: 2h 10m (was: 2h) > Add prototype support for Go SDK streaming > -- > > Key: BEAM-3856 > URL: https://issues.apache.org/jira/browse/BEAM-3856 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Willy Lulciuc >Assignee: Henning Rohde >Priority: Major > Time Spent: 2h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-3856) Add prototype support for Go SDK streaming
[ https://issues.apache.org/jira/browse/BEAM-3856?focusedWorklogId=91015=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91015 ] ASF GitHub Bot logged work on BEAM-3856: Author: ASF GitHub Bot Created on: 13/Apr/18 23:20 Start Date: 13/Apr/18 23:20 Worklog Time Spent: 10m Work Description: wcn3 commented on a change in pull request #4939: [BEAM-3856][BEAM-3854] Add prototype of Go streaming on Dataflow with PubSub URL: https://github.com/apache/beam/pull/4939#discussion_r181528869 ## File path: sdks/go/pkg/beam/runners/dataflow/translate.go ## @@ -351,6 +363,45 @@ func translateEdge(edge *graph.MultiEdge) (string, properties, error) { UserName: buildName(edge.Scope(), "flatten"), // TODO: user-defined }, nil + case graph.External: Review comment: To support the open/closed principle, it would be nice if we could have modules register a function to be called for a certain URN. This would leave the translator intact, and new external payload types would be self-injecting. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 91015) Time Spent: 2h (was: 1h 50m) > Add prototype support for Go SDK streaming > -- > > Key: BEAM-3856 > URL: https://issues.apache.org/jira/browse/BEAM-3856 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Willy Lulciuc >Assignee: Henning Rohde >Priority: Major > Time Spent: 2h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-3856) Add prototype support for Go SDK streaming
[ https://issues.apache.org/jira/browse/BEAM-3856?focusedWorklogId=91011=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91011 ] ASF GitHub Bot logged work on BEAM-3856: Author: ASF GitHub Bot Created on: 13/Apr/18 23:12 Start Date: 13/Apr/18 23:12 Worklog Time Spent: 10m Work Description: herohde commented on issue #4939: [BEAM-3856][BEAM-3854] Add prototype of Go streaming on Dataflow with PubSub URL: https://github.com/apache/beam/pull/4939#issuecomment-381281757 Rebased. @robertwb PTAL This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 91011) Time Spent: 1h 50m (was: 1h 40m) > Add prototype support for Go SDK streaming > -- > > Key: BEAM-3856 > URL: https://issues.apache.org/jira/browse/BEAM-3856 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Willy Lulciuc >Assignee: Henning Rohde >Priority: Major > Time Spent: 1h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-3856) Add prototype support for Go SDK streaming
[ https://issues.apache.org/jira/browse/BEAM-3856?focusedWorklogId=91003=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91003 ] ASF GitHub Bot logged work on BEAM-3856: Author: ASF GitHub Bot Created on: 13/Apr/18 22:52 Start Date: 13/Apr/18 22:52 Worklog Time Spent: 10m Work Description: robertwb commented on issue #4939: [BEAM-3856][BEAM-3854] Add prototype of Go streaming on Dataflow with PubSub URL: https://github.com/apache/beam/pull/4939#issuecomment-381278950 Looks like this needs a rebase. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 91003) Time Spent: 1h 40m (was: 1.5h) > Add prototype support for Go SDK streaming > -- > > Key: BEAM-3856 > URL: https://issues.apache.org/jira/browse/BEAM-3856 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Willy Lulciuc >Assignee: Henning Rohde >Priority: Major > Time Spent: 1h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-3856) Add prototype support for Go SDK streaming
[ https://issues.apache.org/jira/browse/BEAM-3856?focusedWorklogId=89190=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-89190 ] ASF GitHub Bot logged work on BEAM-3856: Author: ASF GitHub Bot Created on: 09/Apr/18 23:22 Start Date: 09/Apr/18 23:22 Worklog Time Spent: 10m Work Description: herohde commented on issue #4939: [BEAM-3856][BEAM-3854] Add prototype of Go streaming on Dataflow with PubSub URL: https://github.com/apache/beam/pull/4939#issuecomment-379923813 R: @tweise (please feel free to merge) Go streaming w/ native Kafka I/O on Flink would take a similar approach. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 89190) Time Spent: 1.5h (was: 1h 20m) > Add prototype support for Go SDK streaming > -- > > Key: BEAM-3856 > URL: https://issues.apache.org/jira/browse/BEAM-3856 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Willy Lulciuc >Assignee: Henning Rohde >Priority: Major > Time Spent: 1.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-3856) Add prototype support for Go SDK streaming
[ https://issues.apache.org/jira/browse/BEAM-3856?focusedWorklogId=87343=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87343 ] ASF GitHub Bot logged work on BEAM-3856: Author: ASF GitHub Bot Created on: 03/Apr/18 23:45 Start Date: 03/Apr/18 23:45 Worklog Time Spent: 10m Work Description: herohde commented on issue #4939: [BEAM-3856][BEAM-3854] Add prototype of Go streaming on Dataflow with PubSub URL: https://github.com/apache/beam/pull/4939#issuecomment-378434195 R: @robertwb This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 87343) Time Spent: 1h 10m (was: 1h) > Add prototype support for Go SDK streaming > -- > > Key: BEAM-3856 > URL: https://issues.apache.org/jira/browse/BEAM-3856 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Willy Lulciuc >Assignee: Henning Rohde >Priority: Major > Time Spent: 1h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-3856) Add prototype support for Go SDK streaming
[ https://issues.apache.org/jira/browse/BEAM-3856?focusedWorklogId=87344=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87344 ] ASF GitHub Bot logged work on BEAM-3856: Author: ASF GitHub Bot Created on: 03/Apr/18 23:45 Start Date: 03/Apr/18 23:45 Worklog Time Spent: 10m Work Description: herohde commented on issue #4939: [BEAM-3856][BEAM-3854] Add prototype of Go streaming on Dataflow with PubSub URL: https://github.com/apache/beam/pull/4939#issuecomment-378434195 R: @robertwb (please merge) This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 87344) Time Spent: 1h 20m (was: 1h 10m) > Add prototype support for Go SDK streaming > -- > > Key: BEAM-3856 > URL: https://issues.apache.org/jira/browse/BEAM-3856 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Willy Lulciuc >Assignee: Henning Rohde >Priority: Major > Time Spent: 1h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-3856) Add prototype support for Go SDK streaming
[ https://issues.apache.org/jira/browse/BEAM-3856?focusedWorklogId=87342=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87342 ] ASF GitHub Bot logged work on BEAM-3856: Author: ASF GitHub Bot Created on: 03/Apr/18 23:44 Start Date: 03/Apr/18 23:44 Worklog Time Spent: 10m Work Description: herohde commented on a change in pull request #4939: [BEAM-3856][BEAM-3854] Add prototype of Go streaming on Dataflow with PubSub URL: https://github.com/apache/beam/pull/4939#discussion_r178992732 ## File path: sdks/go/examples/streaming_wordcap/wordcap.go ## @@ -0,0 +1,81 @@ +// Licensed to the Apache Software Foundation (ASF) under one or more +// contributor license agreements. See the NOTICE file distributed with +// this work for additional information regarding copyright ownership. +// The ASF licenses this file to You under the Apache License, Version 2.0 +// (the "License"); you may not use this file except in compliance with +// the License. You may obtain a copy of the License at +// +//http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +// streaming_wordcap is a toy streaming pipeline that uses PubSub. It +// does the following: +//(1) create a topic and publish a few messages to it +//(2) start a streaming pipeline that converts the messages to +//upper case and logs the result. +// +// NOTE: it only runs on Dataflow and must be manually cancelled. +package main + +import ( + "context" + "flag" + "os" + "strings" + + "github.com/apache/beam/sdks/go/pkg/beam" + "github.com/apache/beam/sdks/go/pkg/beam/core/util/stringx" + "github.com/apache/beam/sdks/go/pkg/beam/io/pubsubio" + "github.com/apache/beam/sdks/go/pkg/beam/log" + "github.com/apache/beam/sdks/go/pkg/beam/options/gcpopts" + "github.com/apache/beam/sdks/go/pkg/beam/util/pubsubx" + "github.com/apache/beam/sdks/go/pkg/beam/x/beamx" + "github.com/apache/beam/sdks/go/pkg/beam/x/debug" +) + +var ( + input = flag.String("input", os.ExpandEnv("$USER-wordcap"), "Pubsub input topic.") +) + +var ( + data = []string{ + "foo", + "bar", + "baz", + } +) + +func main() { + flag.Parse() + beam.Init() + + ctx := context.Background() + project := gcpopts.GetProject(ctx) + + log.Infof(ctx, "Publishing %v messages to: %v", len(data), *input) + + defer pubsubx.CleanupTopic(ctx, project, *input) + sub, err := pubsubx.Publish(ctx, project, *input, data...) + if err != nil { + log.Fatal(ctx, err) + } + + log.Infof(ctx, "Running streaming wordcap with subscription: %v", sub.ID()) + + p := beam.NewPipeline() + s := p.Root() + + col := pubsubio.Read(s, project, *input, {Subscription: sub.ID()}) + str := beam.ParDo(s, stringx.FromBytes, col) + cap := beam.ParDo(s, strings.ToUpper, str) Review comment: It does not. Function registration is just a tiny performance optimization to avoid hitting the symbol table to find that function on the worker. It has no semantic impact. I'd prefer to not do it for the examples to keep them simpler. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 87342) Time Spent: 1h (was: 50m) > Add prototype support for Go SDK streaming > -- > > Key: BEAM-3856 > URL: https://issues.apache.org/jira/browse/BEAM-3856 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Willy Lulciuc >Assignee: Henning Rohde >Priority: Major > Time Spent: 1h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-3856) Add prototype support for Go SDK streaming
[ https://issues.apache.org/jira/browse/BEAM-3856?focusedWorklogId=87330=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87330 ] ASF GitHub Bot logged work on BEAM-3856: Author: ASF GitHub Bot Created on: 03/Apr/18 23:14 Start Date: 03/Apr/18 23:14 Worklog Time Spent: 10m Work Description: lostluck commented on a change in pull request #4939: [BEAM-3856][BEAM-3854] Add prototype of Go streaming on Dataflow with PubSub URL: https://github.com/apache/beam/pull/4939#discussion_r178988229 ## File path: sdks/go/examples/streaming_wordcap/wordcap.go ## @@ -0,0 +1,81 @@ +// Licensed to the Apache Software Foundation (ASF) under one or more +// contributor license agreements. See the NOTICE file distributed with +// this work for additional information regarding copyright ownership. +// The ASF licenses this file to You under the Apache License, Version 2.0 +// (the "License"); you may not use this file except in compliance with +// the License. You may obtain a copy of the License at +// +//http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +// streaming_wordcap is a toy streaming pipeline that uses PubSub. It +// does the following: +//(1) create a topic and publish a few messages to it +//(2) start a streaming pipeline that converts the messages to +//upper case and logs the result. +// +// NOTE: it only runs on Dataflow and must be manually cancelled. +package main + +import ( + "context" + "flag" + "os" + "strings" + + "github.com/apache/beam/sdks/go/pkg/beam" + "github.com/apache/beam/sdks/go/pkg/beam/core/util/stringx" + "github.com/apache/beam/sdks/go/pkg/beam/io/pubsubio" + "github.com/apache/beam/sdks/go/pkg/beam/log" + "github.com/apache/beam/sdks/go/pkg/beam/options/gcpopts" + "github.com/apache/beam/sdks/go/pkg/beam/util/pubsubx" + "github.com/apache/beam/sdks/go/pkg/beam/x/beamx" + "github.com/apache/beam/sdks/go/pkg/beam/x/debug" +) + +var ( + input = flag.String("input", os.ExpandEnv("$USER-wordcap"), "Pubsub input topic.") +) + +var ( + data = []string{ + "foo", + "bar", + "baz", + } +) + +func main() { + flag.Parse() + beam.Init() + + ctx := context.Background() + project := gcpopts.GetProject(ctx) + + log.Infof(ctx, "Publishing %v messages to: %v", len(data), *input) + + defer pubsubx.CleanupTopic(ctx, project, *input) + sub, err := pubsubx.Publish(ctx, project, *input, data...) + if err != nil { + log.Fatal(ctx, err) + } + + log.Infof(ctx, "Running streaming wordcap with subscription: %v", sub.ID()) + + p := beam.NewPipeline() + s := p.Root() + + col := pubsubio.Read(s, project, *input, {Subscription: sub.ID()}) + str := beam.ParDo(s, stringx.FromBytes, col) + cap := beam.ParDo(s, strings.ToUpper, str) Review comment: In PR 5011, we're registering a ton of functions. Does that need to be done here? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 87330) Time Spent: 50m (was: 40m) > Add prototype support for Go SDK streaming > -- > > Key: BEAM-3856 > URL: https://issues.apache.org/jira/browse/BEAM-3856 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Willy Lulciuc >Assignee: Henning Rohde >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-3856) Add prototype support for Go SDK streaming
[ https://issues.apache.org/jira/browse/BEAM-3856?focusedWorklogId=87308=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87308 ] ASF GitHub Bot logged work on BEAM-3856: Author: ASF GitHub Bot Created on: 03/Apr/18 22:12 Start Date: 03/Apr/18 22:12 Worklog Time Spent: 10m Work Description: herohde commented on issue #4939: [BEAM-3856][BEAM-3854] Add prototype of Go streaming on Dataflow with PubSub URL: https://github.com/apache/beam/pull/4939#issuecomment-378416184 R: @lostluck This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 87308) Time Spent: 40m (was: 0.5h) > Add prototype support for Go SDK streaming > -- > > Key: BEAM-3856 > URL: https://issues.apache.org/jira/browse/BEAM-3856 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Willy Lulciuc >Assignee: Henning Rohde >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-3856) Add prototype support for Go SDK streaming
[ https://issues.apache.org/jira/browse/BEAM-3856?focusedWorklogId=85903=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-85903 ] ASF GitHub Bot logged work on BEAM-3856: Author: ASF GitHub Bot Created on: 30/Mar/18 05:02 Start Date: 30/Mar/18 05:02 Worklog Time Spent: 10m Work Description: herohde commented on issue #4939: [BEAM-3856][BEAM-3854] Add prototype of Go streaming on Dataflow with PubSub URL: https://github.com/apache/beam/pull/4939#issuecomment-377445766 @wslulciuc Please let me know if this works for your fanout pipeline. I think all the pieces are present with this PR. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 85903) Time Spent: 0.5h (was: 20m) > Add prototype support for Go SDK streaming > -- > > Key: BEAM-3856 > URL: https://issues.apache.org/jira/browse/BEAM-3856 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Willy Lulciuc >Assignee: Henning Rohde >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-3856) Add prototype support for Go SDK streaming
[ https://issues.apache.org/jira/browse/BEAM-3856?focusedWorklogId=83325=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83325 ] ASF GitHub Bot logged work on BEAM-3856: Author: ASF GitHub Bot Created on: 22/Mar/18 20:26 Start Date: 22/Mar/18 20:26 Worklog Time Spent: 10m Work Description: herohde commented on issue #4939: [BEAM-3856][BEAM-3854] Add prototype of Go streaming on Dataflow with PubSub URL: https://github.com/apache/beam/pull/4939#issuecomment-375446047 R: @wslulciuc @wcn3 This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 83325) Time Spent: 20m (was: 10m) > Add prototype support for Go SDK streaming > -- > > Key: BEAM-3856 > URL: https://issues.apache.org/jira/browse/BEAM-3856 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Willy Lulciuc >Assignee: Henning Rohde >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-3856) Add prototype support for Go SDK streaming
[ https://issues.apache.org/jira/browse/BEAM-3856?focusedWorklogId=83323=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83323 ] ASF GitHub Bot logged work on BEAM-3856: Author: ASF GitHub Bot Created on: 22/Mar/18 20:24 Start Date: 22/Mar/18 20:24 Worklog Time Spent: 10m Work Description: herohde opened a new pull request #4939: [BEAM-3856][BEAM-3854] Add prototype of Go streaming on Dataflow with PubSub URL: https://github.com/apache/beam/pull/4939 * Add streaming Wordcap example. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 83323) Time Spent: 10m Remaining Estimate: 0h > Add prototype support for Go SDK streaming > -- > > Key: BEAM-3856 > URL: https://issues.apache.org/jira/browse/BEAM-3856 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Willy Lulciuc >Assignee: Henning Rohde >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)