rosetn commented on a change in pull request #13227:
URL: https://github.com/apache/beam/pull/13227#discussion_r530678693

##########
File path: website/www/site/content/en/documentation/io/developing-io-overview.md
##########
@@ -46,33 +46,32 @@ are the recommended steps to get started:
 For **bounded (batch) sources**, there are currently two options for creating a Beam source:
+1. Use `Splittable DoFn`.
+
 1. Use `ParDo` and `GroupByKey`.
-1. Use the `Source` interface and extend the `BoundedSource` abstract subclass.
-`ParDo` is the recommended option, as implementing a `Source` can be tricky. See
-[When to use the Source interface](#when-to-use-source) for a list of some use
-cases where you might want to use a `Source` (such as
-[dynamic work rebalancing](/blog/2016/05/18/splitAtFraction-method.html)).
+`Splittable DoFn` is the recommended option, as it's the most recent source framework for both
+bounded and unbounded sources. This is meant to replace the `Source` APIs(
+[BoundedSource](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/BoundedSource.html) and
+[UnboundedSource](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/UnboundedSource.html))
+in the new system. Please read
+[Splittable DoFn Programming Guide](/learn/programming-guide/#splittable-dofns) for how to write one
+Splittable DoFn. For more information, see the
+[roadmap for multi-SDK connector efforts](/roadmap/connectors-multi-sdk/).
-(Java only) For **unbounded (streaming) sources**, you must use the `Source`
-interface and extend the `UnboundedSource` abstract subclass. `UnboundedSource`
-supports features that are useful for streaming pipelines, such as
-checkpointing.
+For Java and Python **unbounded (streaming) sources**, you must use the `Splittable DoFn`, which
+supports features that are useful for streaming pipelines, including checkpointing, controlling
+watermark, tracking backlog.

Review comment:
Missing "and":

"watermark, and tracking backlog."

##########
File path: website/www/site/content/en/documentation/io/developing-io-overview.md
##########
@@ -46,33 +46,32 @@ are the recommended steps to get started:
 For **bounded (batch) sources**, there are currently two options for creating a Beam source:
+1. Use `Splittable DoFn`.
+
 1. Use `ParDo` and `GroupByKey`.
-1. Use the `Source` interface and extend the `BoundedSource` abstract subclass.
-`ParDo` is the recommended option, as implementing a `Source` can be tricky. See
-[When to use the Source interface](#when-to-use-source) for a list of some use
-cases where you might want to use a `Source` (such as
-[dynamic work rebalancing](/blog/2016/05/18/splitAtFraction-method.html)).
+`Splittable DoFn` is the recommended option, as it's the most recent source framework for both
+bounded and unbounded sources. This is meant to replace the `Source` APIs(
+[BoundedSource](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/BoundedSource.html) and
+[UnboundedSource](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/UnboundedSource.html))
+in the new system. Please read

Review comment:
Remove instances of "please" on these pages: https://developers.google.com/style/tone#politeness
