lakshmi-manasa-g opened a new pull request #1531:
URL: https://github.com/apache/samza/pull/1531


   Feature: Elasticity for Samza jobs. Throughput via parallelism is tied to the 
number of tasks, which equals the partition count of the input streams. If a 
job is lagging and is already at the maximum container count (= number of 
tasks = number of input partitions), its only remaining option is to 
repartition the input. This PR is part of a feature that aims to increase 
throughput by scaling the task count beyond the input partition count. This PR 
introduces the config and the basic class for elasticity. 
   
   Changes: Introduce the config "task.elasticity.factor", which defaults to 1. 
If factor = X > 1, each task is split into X elastic tasks. Also introduce 
SystemStreamPartitionKeyHash, which represents the portion of an SSP 
(SystemStreamPartition) that an elastic task will consume.
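
   To illustrate the idea, here is a minimal sketch of how a factor of X could divide one partition's messages among X elastic tasks. It is not the code in this PR: the class `ElasticitySketch`, its methods, and the "hash of the message key modulo the factor" scheme are assumptions made for illustration only. The only name taken from the description above is the config key "task.elasticity.factor".

```java
import java.util.Map;

/** Hypothetical illustration only; this is NOT the class introduced by the PR. */
final class ElasticitySketch {
    // Config key named in the PR description.
    static final String FACTOR_KEY = "task.elasticity.factor";

    /** Reads the factor, falling back to 1 (elasticity disabled) when unset. */
    static int elasticityFactor(Map<String, String> config) {
        int factor = Integer.parseInt(config.getOrDefault(FACTOR_KEY, "1"));
        if (factor < 1) {
            throw new IllegalArgumentException(FACTOR_KEY + " must be >= 1, got " + factor);
        }
        return factor;
    }

    /**
     * Assumed scheme: a message belongs to bucket hash(key) mod factor.
     * floorMod keeps the result in [0, factor) even for negative hash codes.
     */
    static int keyBucket(Object messageKey, int factor) {
        int hash = (messageKey == null) ? 0 : messageKey.hashCode();
        return Math.floorMod(hash, factor);
    }

    public static void main(String[] args) {
        int factor = elasticityFactor(Map.of(FACTOR_KEY, "2"));
        for (String key : new String[] {"user-1", "user-2", "user-3"}) {
            System.out.println(key + " -> elastic task bucket " + keyBucket(key, factor));
        }
    }
}
```

   With factor = 1 every key lands in bucket 0, which matches the default behavior of one task per input partition. Under this assumed scheme, all messages with the same key stay on the same elastic task.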
   
   Tests: existing tests pass.
   
   API changes: New config "task.elasticity.factor", which enables the 
elasticity feature when set to a value > 1.
   
   Upgrade/usage instructions: add the above config with a value > 1 to enable the feature.

