[
https://issues.apache.org/jira/browse/SAMZA-111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13853138#comment-13853138
]
Chris Riccomini commented on SAMZA-111:
---------------------------------------
I have been doing a bit of digging into this.
I've written a MockSystem that simulates a KafkaSystem, with a configurable
number of BrokerProxy threads. When I attach a debugger, and take CPU samples,
I can see some interesting problems:
1. SystemConsumers is not filtering out SystemStreamPartitions that it's
fetching before calling poll on the underlying system (in this case,
KafkaSystemConsumer, which extends BlockingEnvelopeMap). This means every
poll() request is iterating over all SystemStreamPartitions, even though every
single one has outstanding messages to process already, and doesn't need to be
polled.
2. If you filter out the fetchMap appropriately, the next bottleneck is in
SystemConsumers.poll, which is spending all of its time iterating over the
fetchMap to do the filtering.
3. The systemFetchMap.size call in SystemConsumers.poll is taking 15% of total
CPU on the main thread.
I did a quick hack to keep the filtered systemFetchMap cached as fetchMap is
updated (rather than rebuilding every time poll is called), and temporarily
disabled the systemFetchMap.size call. These two changes boosted simulated
performance from 650 messages/sec to 70,000-100,000 messages/sec with 12
threads, 1000 streams, and 4 partitions each (4000 partitions divided evenly
between 12 simulated BrokerProxy threads).
I am convinced that we can make this go much faster, as well. My hack is not
really that efficient.
I'll post a patch with the MockSystem shortly.
> SystemConsumers is slow with large partition count
> --------------------------------------------------
>
> Key: SAMZA-111
> URL: https://issues.apache.org/jira/browse/SAMZA-111
> Project: Samza
> Issue Type: Bug
> Components: container
> Affects Versions: 0.6.0
> Reporter: Chris Riccomini
> Assignee: Chris Riccomini
>
> We have been seeing very slow processing speed when running a Samza container
> that consumes from 1000s of partitions. We don't see a corresponding slow
> speed when running the same code, but with fewer input partitions (say 8-24).
> The messages per second seems to drop off as more partitions are added to the
> Samza container. One Samza job has ~2500 partitions, and is seeing only 6000
> messages/sec. The same code running with ~9 partitions is seeing 30,000
> messages/sec.
--
This message was sent by Atlassian JIRA
(v6.1.4#6159)