Repository: samza
Updated Branches:
  refs/heads/master 83ed46616 -> 1aee39ff1


SAMZA-766: fixed broken links in samza-container.html


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/1aee39ff
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/1aee39ff
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/1aee39ff

Branch: refs/heads/master
Commit: 1aee39ff199743aaf150c3199cdbf65fb09e0dd0
Parents: 83ed466
Author: Aleksandar Pejakovic <[email protected]>
Authored: Tue Sep 8 00:32:04 2015 -0700
Committer: Yan Fang <[email protected]>
Committed: Tue Sep 8 00:32:04 2015 -0700

----------------------------------------------------------------------
 docs/learn/documentation/versioned/container/samza-container.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/samza/blob/1aee39ff/docs/learn/documentation/versioned/container/samza-container.md
----------------------------------------------------------------------
diff --git a/docs/learn/documentation/versioned/container/samza-container.md b/docs/learn/documentation/versioned/container/samza-container.md
index f97e8a3..a7236a6 100644
--- a/docs/learn/documentation/versioned/container/samza-container.md
+++ b/docs/learn/documentation/versioned/container/samza-container.md
@@ -53,9 +53,9 @@ The number of partitions in the input streams is determined by the systems from
 
 If a Samza job has more than one input stream, the number of task instances 
for the Samza job is the maximum number of partitions across all input streams. 
For example, if a Samza job is reading from PageViewEvent (12 partitions), and 
ServiceMetricEvent (14 partitions), then the Samza job would have 14 task 
instances (numbered 0 through 13). Task instances 12 and 13 only receive events 
from ServiceMetricEvent, because there is no corresponding PageViewEvent 
partition.
 
-With this default approach to assigning input streams to task instances, Samza 
is effectively performing a group-by operation on the input streams with their 
partitions as the key. Other strategies for grouping input stream partitions 
are possible by implementing a new 
[SystemStreamPartitionGrouper](../api/javadocs/org/apache/samza/container/SystemStreamPartitionGrouper.html)
 and factory, and configuring the job to use it via the 
job.systemstreampartition.grouper.factory configuration value.
+With this default approach to assigning input streams to task instances, Samza 
is effectively performing a group-by operation on the input streams with their 
partitions as the key. Other strategies for grouping input stream partitions 
are possible by implementing a new 
[SystemStreamPartitionGrouper](../api/javadocs/org/apache/samza/container/grouper/stream/SystemStreamPartitionGrouper.html)
 and factory, and configuring the job to use it via the 
job.systemstreampartition.grouper.factory configuration value.
 
-Samza provides the above-discussed per-partition grouper as well as the 
[GroupBySystemStreamPartitionGrouper](../api/javadocs/org/apache/samza/container/systemstreampartition/groupers/GroupBySystemStreamPartition),
 which provides a separate task class instance for every input stream 
partition, effectively grouping by the input stream itself. This provides 
maximum scalability in terms of how many containers can be used to process 
those input streams and is appropriate for very high volume jobs that need no 
grouping of the input streams.
+Samza provides the above-discussed per-partition grouper as well as the 
GroupBySystemStreamPartitionGrouper, which provides a separate task class 
instance for every input stream partition, effectively grouping by the input 
stream itself. This provides maximum scalability in terms of how many 
containers can be used to process those input streams and is appropriate for 
very high volume jobs that need no grouping of the input streams.
 
 Considering the above example of a PageViewEvent partitioned 12 ways and a 
ServiceMetricEvent partitioned 14 ways, the GroupBySystemStreamPartitionGrouper 
would create 12 + 14 = 26 task instances, which would then be distributed 
across the number of containers configured, as discussed below.
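
The task-instance arithmetic described in the changed paragraphs can be sketched as follows. This is an illustrative model only, not Samza code: the function names and the `inputs` dictionary are hypothetical, and the partition counts are the ones from the PageViewEvent / ServiceMetricEvent example in the text.

```python
# Illustrative model only; these helpers are hypothetical and not part of Samza.

def tasks_group_by_partition(partition_counts):
    """Default grouping: one task per partition number, so the task count
    is the maximum partition count across all input streams."""
    return max(partition_counts.values())


def tasks_group_by_system_stream_partition(partition_counts):
    """One task per (stream, partition) pair, so the task count is the sum
    of the partition counts."""
    return sum(partition_counts.values())


def streams_for_task(task_id, partition_counts):
    """Streams that have a partition numbered task_id (default grouping)."""
    return sorted(name for name, count in partition_counts.items()
                  if task_id < count)


# Partition counts taken from the example in the documentation text.
inputs = {"PageViewEvent": 12, "ServiceMetricEvent": 14}

print(tasks_group_by_partition(inputs))                # 14
print(tasks_group_by_system_stream_partition(inputs))  # 26
print(streams_for_task(13, inputs))                    # ['ServiceMetricEvent']
```

Under the default grouping, tasks 12 and 13 see only ServiceMetricEvent because PageViewEvent has no partitions with those numbers, which matches the description in the documentation.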
 
