olehborysevych commented on code in PR #30435:
URL: https://github.com/apache/beam/pull/30435#discussion_r1507553765
##########
learning/prompts/documentation-lookup-nolinks/47_batching_for_external_calls.md:
##########
@@ -0,0 +1,83 @@
+Prompt: How to reduce payload when calling external services from my Apache
Beam pipeline?
+
+Response:
+To reduce payload when calling external services from your Apache Beam
pipeline, you can employ batching techniques by using the `GroupIntoBatches`
transform. Batching involves aggregating multiple elements into a single
payload, reducing the number of requests sent to the external service and
minimizing overhead.
+
+Under the hood, the `GroupIntoBatches` transform utilizes state and timers to
grant users precise control over batch size and buffering duration parameters,
while abstracting away the implementation details. Key parameters include:
+* `maxBufferDuration`: controls the maximum waiting time for a batch to be
emitted.
+* `batchSize`: determines the maximum number of elements in each batch.
Elements are buffered until the specified number is reached, then emitted as a
batch.
+* `batchSizeBytes` (Java only): limits the byte size of a single batch,
determined by the input coder.
+* `elementByteSize` (Java only): computes the byte size of each element using a
user-defined function, so that batches can be limited by their total byte size.
+* `withShardedKey()`: enhances parallelism by distributing a single key across
multiple threads.
+
+The following sample code snippets illustrate the use of `GroupIntoBatches` to
batch elements in an Apache Beam pipeline in Java and Python.
+
+Java:
+
+```java
+public class GroupIntoBatchesExample {
Review Comment:
@dariabezkorovaina I would rather replace this with snippets from the
documentation or an example from
https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/GroupIntoBatches.html
or
https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.util.html
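
For reference while picking a snippet, here is a minimal plain-Python sketch of the count-based, per-key batching described in the response text above. It deliberately does not use the Beam API: `group_into_batches` is a hypothetical helper written only for illustration, and it covers only the `batchSize` behaviour, not buffering duration, byte-size limits, windows, or sharded keys.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Iterator, List, Tuple, TypeVar

K = TypeVar("K", bound=Hashable)
V = TypeVar("V")


def group_into_batches(
    pairs: Iterable[Tuple[K, V]], batch_size: int
) -> Iterator[Tuple[K, List[V]]]:
    """Yield (key, batch) pairs holding at most `batch_size` values each.

    Hypothetical illustration only; this is not Beam's implementation.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    # One buffer per key, filled as elements arrive.
    buffers: Dict[K, List[V]] = defaultdict(list)
    for key, value in pairs:
        buffers[key].append(value)
        # Emit the batch as soon as the buffer for this key is full.
        if len(buffers[key]) == batch_size:
            yield key, buffers.pop(key)

    # Assumption for this sketch: partially filled buffers are flushed
    # once the input is exhausted.
    for key, values in buffers.items():
        yield key, values


if __name__ == "__main__":
    events = [("a", 1), ("a", 2), ("b", 10), ("a", 3)]
    for key, batch in group_into_batches(events, batch_size=2):
        # Each batch would correspond to one request to the external service.
        print(key, batch)
```

Each emitted batch stands for a single call to the external service, which is the reduction in request count that the response text describes.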