[
https://issues.apache.org/jira/browse/SAMZA-863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xinyu Liu updated SAMZA-863:
----------------------------
Attachment: SAMZA-863.1.patch
> Support multi-threading in samza tasks
> --------------------------------------
>
> Key: SAMZA-863
> URL: https://issues.apache.org/jira/browse/SAMZA-863
> Project: Samza
> Issue Type: New Feature
> Affects Versions: 0.11
> Reporter: Xinyu Liu
> Assignee: Xinyu Liu
> Attachments: DESIGN-SAMZA-863-0.pdf, DESIGN-SAMZA-863-1.pdf,
> DESIGN-SAMZA-863-2.pdf, DESIGN-SAMZA-863-3.pdf, SAMZA-863.0.patch,
> SAMZA-863.1.patch, perf-test-results.pdf
>
>
> Currently a Samza container executes its tasks sequentially in a single
> thread. For example, suppose messages 1 and 2 are in the pending queue for
> task 1 and task 2, respectively. Task 1 processes message 1, and task 2
> cannot process message 2 until task 1 completes. If we want to handle more
> messages in parallel, we have to increase the container count, e.g. from 1
> to 2 in the example.
> While this solution has worked for many CPU-bound jobs, we do see its
> drawback for IO-bound jobs. In this kind of job, the task makes IO/network
> requests, e.g. DB calls, REST calls, or external service RPC calls.
> These IO calls significantly slow down task processing. We can increase the
> container count to parallelize the IO calls, but that results in low
> CPU utilization. Even if we improve CPU utilization by allocating multiple
> containers to the same CPU core, it still causes dramatic memory growth,
> because memory is allocated separately for each container.
> To better scale the performance of IO-bound jobs, we are proposing to support
> multi-threaded processing in Samza. The design proposal will come soon.
> rbs:
> https://reviews.apache.org/r/48243/: SAMZA-961: Async tasks and
> multithreading model
> https://reviews.apache.org/r/48213/: SAMZA-960: Make system producer thread
> safe
> https://reviews.apache.org/r/48182/: SAMZA-958: Make store/cache thread safe
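
The contrast drawn in the description above (tasks run one after another in a single thread, versus tasks overlapping their IO waits on several threads inside one container) can be sketched roughly as follows. This is only an illustration using plain java.util.concurrent. It is not Samza's API and not the design from the attached documents. The class name, the process() helper, and the sleep that stands in for a blocking IO call are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class TaskThreadingSketch {
    // Hypothetical stand-in for a task's blocking IO call (DB, REST, RPC).
    static String process(String task, String message, long ioMillis)
            throws InterruptedException {
        Thread.sleep(ioMillis); // simulated IO wait; the CPU is idle here
        return task + " processed " + message;
    }

    // Sequential model: one thread, so task 2 waits for task 1 to finish.
    static List<String> runSequentially(List<String[]> pending, long ioMillis)
            throws Exception {
        List<String> results = new ArrayList<>();
        for (String[] p : pending) {
            results.add(process(p[0], p[1], ioMillis));
        }
        return results;
    }

    // Multi-threaded model (illustrative): tasks share one process and its
    // memory, and their IO waits overlap on a fixed-size thread pool.
    static List<String> runInThreadPool(List<String[]> pending, long ioMillis,
                                        int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String[] p : pending) {
                futures.add(pool.submit(() -> process(p[0], p[1], ioMillis)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get()); // collect in submission order
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String[]> pending = List.of(
                new String[] {"task 1", "message 1"},
                new String[] {"task 2", "message 2"});

        long start = System.nanoTime();
        System.out.println(runSequentially(pending, 200));
        System.out.println("sequential ms: " + (System.nanoTime() - start) / 1_000_000);

        start = System.nanoTime();
        System.out.println(runInThreadPool(pending, 200, 2));
        System.out.println("thread pool ms: " + (System.nanoTime() - start) / 1_000_000);
    }
}
```

Both variants return the same results. The difference is that the thread-pool variant does not serialize the simulated IO waits, and it does so without starting a second process. The sketch leaves out the thread-safety of shared state such as producers and stores, which the linked review requests address separately.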
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)