This is an automated email from the ASF dual-hosted git repository.
bbejeck pushed a commit to branch 2.3
in repository https://gitbox.apache.org/repos/asf/kafka.git
The following commit(s) were added to refs/heads/2.3 by this push:
new 71f2174 port paragrpah from CP docs (#7808)
71f2174 is described below
commit 71f2174c07a08e4e4e255f837b8e65ad839d8c02
Author: A. Sophie Blee-Goldman <[email protected]>
AuthorDate: Mon Dec 9 13:35:17 2019 -0800
port paragrpah from CP docs (#7808)
The AK Streams architecture docs should explain how the maximum parallelism
is determined
Reviewers: Bill Bejeck <[email protected]>
---
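Note on the rule described in the added paragraph: the following is a minimal sketch of the simplified arithmetic only, not Kafka Streams code. The class ParallelismSketch, its methods, and the topic name "input-topic" are hypothetical names introduced for illustration. It assumes one processing thread per application instance and Java 9+ (for Map.of).

```java
import java.util.Map;

// Hypothetical helper for illustration; not part of the Kafka Streams API.
final class ParallelismSketch {

    // Simplified rule from the paragraph: the number of stream tasks is
    // bounded by the largest partition count among the input topics.
    static int maxTasks(Map<String, Integer> partitionsByTopic) {
        int max = 0;
        for (int partitions : partitionsByTopic.values()) {
            max = Math.max(max, partitions);
        }
        return max;
    }

    // Instances that can receive at least one task
    // (assumes one thread per instance).
    static int busyInstances(int instances, int maxTasks) {
        return Math.min(instances, maxTasks);
    }

    // Instances that start but stay idle until a busy instance goes down.
    static int idleInstances(int instances, int maxTasks) {
        return Math.max(0, instances - maxTasks);
    }

    public static void main(String[] args) {
        // The example from the paragraph: one input topic with 5 partitions.
        int tasks = maxTasks(Map.of("input-topic", 5));
        int instances = 7;
        System.out.println("max tasks: " + tasks);
        System.out.println("busy instances: " + busyInstances(instances, tasks));
        System.out.println("idle instances: " + idleInstances(instances, tasks));
    }
}
```

With 5 partitions and 7 instances, the sketch reports 5 busy and 2 idle instances, matching the "excess instances remain idle" behavior the paragraph describes.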
docs/streams/architecture.html | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/docs/streams/architecture.html b/docs/streams/architecture.html
index 8bc3156..7efd7ea 100644
--- a/docs/streams/architecture.html
+++ b/docs/streams/architecture.html
@@ -66,6 +66,14 @@
</p>
<p>
+ Slightly simplified, the maximum parallelism at which your application may run is bounded by the maximum number of stream tasks, which itself is determined by
+ the maximum number of partitions of the input topic(s) the application is reading from. For example, if your input topic has 5 partitions, then you can run up to 5
+ application instances. These instances will collaboratively process the topic’s data. If you run a larger number of app instances than partitions of the input
+ topic, the “excess” app instances will launch but remain idle; however, if one of the busy instances goes down, one of the idle instances will take over the former’s
+ work.
+ </p>
+
+ <p>
It is important to understand that Kafka Streams is not a resource manager, but a library that "runs" anywhere its stream processing application runs.
Multiple instances of the application are executed either on the same machine, or spread across multiple machines and tasks can be distributed automatically
by the library to those running application instances. The assignment of partitions to tasks never changes; if an application instance fails, all its assigned