[ 
https://issues.apache.org/jira/browse/SAMZA-111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Riccomini updated SAMZA-111:
----------------------------------

    Attachment: SAMZA-111.0.png
                SAMZA-111.0.patch

Attaching a patch and CPU sample. RB available at:

  https://reviews.apache.org/r/16430/

Changes:

1. Added a bunch of docs to SystemConsumers.
2. Added MockSystem to simulate consumers.
3. Wrote TestSamzaContainerPerformance, which uses MockSystem to test speed of 
framework.
4. Updated SystemConsumers to cache the systemFetchMap rather than rebuilding 
it on every poll call.
5. Removed an erroneous config from TestStatefulTask; it had no effect.
6. Made samza-core a compile time dependency for samza-test, since MockSystem 
uses it now.
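
For readers without the patch open, the idea behind change 4 can be sketched as follows. This is a minimal illustration, not the code in SAMZA-111.0.patch: the class name, the method names, and the shape of the map (system name to partition id to fetch size) are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: every name here is invented for illustration and
// does not come from the Samza code base.
class FetchMapCacheSketch {
    private final List<String> systems;
    private final int partitionsPerSystem;
    private final int fetchSize;
    // Stays null until the first cached lookup, then is reused.
    private Map<String, Map<Integer, Integer>> cachedFetchMap;
    // Counts how often the (expensive) map construction runs.
    int builds = 0;

    FetchMapCacheSketch(List<String> systems, int partitionsPerSystem, int fetchSize) {
        this.systems = systems;
        this.partitionsPerSystem = partitionsPerSystem;
        this.fetchSize = fetchSize;
    }

    // Cost grows with the total number of partitions.
    private Map<String, Map<Integer, Integer>> buildFetchMap() {
        builds++;
        Map<String, Map<Integer, Integer>> fetchMap = new HashMap<>();
        for (String system : systems) {
            Map<Integer, Integer> perPartition = new HashMap<>();
            for (int p = 0; p < partitionsPerSystem; p++) {
                perPartition.put(p, fetchSize);
            }
            fetchMap.put(system, perPartition);
        }
        return fetchMap;
    }

    // "Before" behaviour: pay the full construction cost on every poll.
    Map<String, Map<Integer, Integer>> fetchMapRebuilt() {
        return buildFetchMap();
    }

    // "After" behaviour: build once, then hand back the same map.
    Map<String, Map<Integer, Integer>> fetchMapCached() {
        if (cachedFetchMap == null) {
            cachedFetchMap = buildFetchMap();
        }
        return cachedFetchMap;
    }

    public static void main(String[] args) {
        FetchMapCacheSketch sketch =
            new FetchMapCacheSketch(Arrays.asList("mock"), 4000, 5000);
        for (int i = 0; i < 1000; i++) sketch.fetchMapRebuilt();
        System.out.println("builds when rebuilding: " + sketch.builds);
        sketch.builds = 0;
        for (int i = 0; i < 1000; i++) sketch.fetchMapCached();
        System.out.println("builds when caching: " + sketch.builds);
    }
}
```

A cache like this is only safe while the set of registered partitions is fixed; if partitions can change after startup, the cached map would need to be invalidated at that point.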

I ran this performance test with 1000 input streams, 12 consumer threads, 4 
partitions per input stream, batch size 5000, and consumer sleep of 1 ms.

Before the SystemConsumers changes:

{noformat}
./gradlew :samza-test:clean :samza-test:test -Dtest.single=TestSamzaContainerPerformance -Dsamza.task.max.messages=10000
...
31782 [ThreadJob] INFO org.apache.samza.test.performance.TestPerformanceTask - Processed 10000 messages in 25 seconds.
{noformat}

This is about 400 messages/sec.

After the SystemConsumers changes:

{noformat}
./gradlew :samza-test:clean :samza-test:test -Dtest.single=TestSamzaContainerPerformance -Dsamza.task.max.messages=1000000
...
31782 [ThreadJob] INFO org.apache.samza.test.performance.TestPerformanceTask - Processed 1000000 messages in 13 seconds.
{noformat}

This is about 77,000 messages/sec.
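
The two throughput figures follow directly from the logged message counts and elapsed times. The small sketch below reproduces the arithmetic; the class and method names are made up for the example, and because the log reports whole seconds the results are approximate.

```java
// Hypothetical helper for checking the throughput figures quoted above.
class ThroughputSketch {
    // Messages processed per second of wall-clock time.
    static double messagesPerSecond(long messages, long seconds) {
        return (double) messages / seconds;
    }

    public static void main(String[] args) {
        double before = messagesPerSecond(10_000, 25);    // run before the change
        double after = messagesPerSecond(1_000_000, 13);  // run after the change
        System.out.printf("before:  %.0f messages/sec%n", before);
        System.out.printf("after:   %.0f messages/sec%n", after);
        System.out.printf("speedup: %.0fx%n", after / before);
    }
}
```

On these numbers the change is worth roughly a 190x improvement, although the two runs processed different message counts.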

After these changes, the slowdown seems to come almost entirely from grizzled's 
SLF4J logging (as noted in my previous comment). SAMZA-111.0.png is a 
screenshot of the CPU sample; it shows that nearly all of the time is spent in 
trace and debug statements. I think we should open a separate ticket to fix 
that.
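
As a generic illustration of why trace and debug statements can show up in a CPU sample even when those levels are switched off, the sketch below compares eager and deferred message construction. It uses java.util.logging rather than grizzled or SLF4J, all names are invented, and it does not claim to reproduce the specific overhead seen in SAMZA-111.0.png.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch; not Samza code and not the grizzled-slf4j API.
class LazyLoggingSketch {
    static final Logger LOG = Logger.getLogger("LazyLoggingSketch");
    // Counts how many log messages were actually constructed.
    static int messagesBuilt = 0;

    static String describe(int partition) {
        messagesBuilt++;
        return "Polling partition " + partition;
    }

    // The message is built before fine() is called, even if FINE is disabled.
    static void hotLoopEager(int iterations) {
        for (int i = 0; i < iterations; i++) {
            LOG.fine(describe(i));
        }
    }

    // The supplier is only invoked if FINE is enabled for this logger.
    static void hotLoopLazy(int iterations) {
        for (int i = 0; i < iterations; i++) {
            final int partition = i;
            LOG.fine(() -> describe(partition));
        }
    }

    public static void main(String[] args) {
        LOG.setLevel(Level.INFO); // FINE (debug-like) messages are disabled
        hotLoopEager(1000);
        System.out.println("eager messages built: " + messagesBuilt);
        messagesBuilt = 0;
        hotLoopLazy(1000);
        System.out.println("lazy messages built: " + messagesBuilt);
    }
}
```

Even the deferred form still pays for a level check and a lambda on each call, so in a very hot loop removing the statement or hoisting the level check out of the loop may be the only way to eliminate the cost.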

> SystemConsumers is slow with large partition count
> --------------------------------------------------
>
>                 Key: SAMZA-111
>                 URL: https://issues.apache.org/jira/browse/SAMZA-111
>             Project: Samza
>          Issue Type: Bug
>          Components: container
>    Affects Versions: 0.6.0
>            Reporter: Chris Riccomini
>            Assignee: Chris Riccomini
>         Attachments: 
> 12-threads-1000-streams-4-partitions-each-with-hacky-fix.png, 
> 12-threads-1000-streams-4-partitions-each.png, 
> 12-threads-8-streams-4-partitions-each.png, SAMZA-111.0.patch, 
> SAMZA-111.0.png, samza-perf-hacks.0.diff, samza-perf-hacks.png
>
>
> We have been seeing very slow processing speed when running a Samza container 
> that consumes from 1000s of partitions. We don't see a corresponding slow 
> speed when running the same code, but with fewer input partitions (say 8-24).
> The messages per second seems to drop off as more partitions are added to the 
> Samza container. One Samza job has ~2500 partitions, and is seeing only 6000 
> messages/sec. The same code running with ~9 partitions is seeing 30,000 
> messages/sec.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
