[GitHub] [beam] robertwb commented on a diff in pull request #22020: Removes examples of unscalable sinks from documentation.

GitBox Fri, 24 Jun 2022 08:22:13 -0700


robertwb commented on code in PR #22020:
URL: https://github.com/apache/beam/pull/22020#discussion_r906168373



##########
website/www/site/content/en/documentation/io/developing-io-overview.md:
##########
@@ -180,9 +180,17 @@ To create a Beam sink, we recommend that you use a `ParDo` that writes the
 received records to the data store. To develop more complex sinks (for example,
 to support data de-duplication when failures are retried by a runner), use
 `ParDo`, `GroupByKey`, and other available Beam transforms.
+Many data services are optimized to write batches of elements at a time,
+so it may make sense to group the elements into batches before writing.
+Persistent connections can be initialized in a DoFn's `setUp` or `startBundle`
+method rather than upon the receipt of every element as well.

Review Comment:
   I don't think any sinks to date have found a use/need for shared, but I 
suppose it's possible. I don't think it's common/specific enough to mention 
here though. 
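
For readers following the thread, here is a minimal sketch of the pattern the added documentation lines describe: open the connection once and write elements in batches instead of one at a time. It is plain Python so it runs without Beam installed. The method names mirror the Python `DoFn` lifecycle hooks (`setup`, `start_bundle`, `process`, `finish_bundle`, `teardown`), and in a real pipeline the class would subclass `apache_beam.DoFn`. `FakeClient`, `BatchingWriteFn`, `client_factory`, and `max_batch_size` are hypothetical names used only for illustration, not anything from the PR.

```python
class FakeClient:
    """Hypothetical stand-in for a data-store client; records each batch written."""

    def __init__(self):
        self.batches = []
        self.closed = False

    def write_batch(self, records):
        self.batches.append(list(records))

    def close(self):
        self.closed = True


class BatchingWriteFn:
    """Sketch of a batching sink; a real version would subclass apache_beam.DoFn."""

    def __init__(self, client_factory, max_batch_size=3):
        self._client_factory = client_factory  # assumption: zero-arg callable
        self._max_batch_size = max_batch_size
        self._client = None
        self._buffer = []

    def setup(self):
        # Open the persistent connection once per instance, not once per element.
        self._client = self._client_factory()

    def start_bundle(self):
        # Start each bundle with an empty buffer.
        self._buffer = []

    def process(self, element):
        # Buffer the element and flush once the batch is full.
        self._buffer.append(element)
        if len(self._buffer) >= self._max_batch_size:
            self._flush()

    def finish_bundle(self):
        # Write whatever is left so no element outlives its bundle.
        self._flush()

    def teardown(self):
        # Release the connection when the instance is discarded.
        if self._client is not None:
            self._client.close()
            self._client = None

    def _flush(self):
        if self._buffer:
            self._client.write_batch(self._buffer)
            self._buffer = []
```

Flushing in `finish_bundle` matters here because a runner may retry a bundle as a unit, so buffered elements should not be carried across bundle boundaries.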



