udim commented on a change in pull request #12913:
URL: https://github.com/apache/beam/pull/12913#discussion_r499050065

##########
File path: website/www/site/content/en/get-started/wordcount-example.md
##########
@@ -1424,15 +1424,15 @@ outputs.
 This example uses an unbounded `PCollection` and streams the results to
 Google Pub/Sub. The code formats the results and writes them to a Pub/Sub topic
-using [`beam.io.WriteStringsToPubSub`](https://beam.apache.org/releases/pydoc/{{< param release_latest >}}/apache_beam.io.gcp.pubsub.html#apache_beam.io.gcp.pubsub.WriteStringsToPubSub).
+using [`beam.io.WriteToPubSub`](https://beam.apache.org/releases/pydoc/{{< param release_latest >}}/apache_beam.io.gcp.pubsub.html#apache_beam.io.gcp.pubsub.WriteToPubSub).
 
 {{< highlight java >}}
 // This example is not currently available for the Beam SDK for Java.
 {{< /highlight >}}
 {{< highlight py >}}
   # Write to Pub/Sub
-  output | beam.io.WriteStringsToPubSub(known_args.output_topic)
+  output | beam.io.WriteToPubSub(known_args.output_topic)

Review comment:
Similarly here: assuming that strings are being written, they have to be converted to `bytes` first.
```
_ = (output
     | 'EncodeString' >> beam.Map(lambda s: s.encode('utf-8'))
     | beam.io.WriteToPubSub(known_args.output_topic))
```

##########
File path: website/www/site/content/en/get-started/wordcount-example.md
##########
@@ -1405,10 +1405,10 @@ messages from a Pub/Sub subscription or topic using
 {{< highlight py >}}
   # Read from Pub/Sub into a PCollection.
   if known_args.input_subscription:
-    lines = p | beam.io.ReadStringsFromPubSub(
+    lines = p | beam.io.ReadFromPubSub(

Review comment:
An important difference is that `ReadFromPubSub` returns `bytes` objects (the raw message data) rather than strings.
I would:
- rename `lines` to `data`
- add an additional step below: `lines = data | 'DecodeString' >> beam.Map(lambda d: d.decode('utf-8'))`

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: [email protected]
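
For reference, the two review suggestions above can be combined into one sketch. This is only an illustration under stated assumptions, not the code that was merged: the helper names `decode_message`, `encode_message`, and `build_pipeline` are hypothetical, the word-count transforms between reading and writing are left out, and the `subscription=` argument is assumed because the diff truncates the `ReadFromPubSub(` call. The Beam import is deferred so the two conversion helpers can be checked without Beam or Pub/Sub access.

```python
def decode_message(data: bytes) -> str:
    """Decode a raw Pub/Sub payload (bytes) into text."""
    return data.decode('utf-8')


def encode_message(text: str) -> bytes:
    """Encode text into the bytes payload that WriteToPubSub expects."""
    return text.encode('utf-8')


def build_pipeline(p, input_subscription, output_topic):
    """Hypothetical wiring of the helpers; requires apache_beam[gcp]."""
    # Imported lazily so the helpers above work without Beam installed.
    import apache_beam as beam

    # ReadFromPubSub yields bytes, so decode before any string processing.
    data = p | beam.io.ReadFromPubSub(subscription=input_subscription)
    lines = data | 'DecodeString' >> beam.Map(decode_message)

    # Placeholder: the example's counting and formatting steps would go here.
    output = lines

    # WriteToPubSub expects bytes, so encode the formatted strings first.
    _ = (output
         | 'EncodeString' >> beam.Map(encode_message)
         | beam.io.WriteToPubSub(output_topic))


if __name__ == '__main__':
    # Round-trip check that runs without Beam or Pub/Sub.
    payload = 'w\u00f6rd: 3'.encode('utf-8')
    assert encode_message(decode_message(payload)) == payload
```

Named functions are used instead of the lambdas from the review comments only so the conversions can be unit-tested; the lambdas are equivalent.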
