[
https://issues.apache.org/jira/browse/SAMZA-106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jakob Homan updated SAMZA-106:
------------------------------
Attachment: SAMZA-106.patch
Thats-better.png
Attached an after shot showing usage down to about 7.5%. The two samples are
about as comparable as I can get. Also attached a patch that pre-computes the
hash code and returns the cached value. It may be worth doing the same for
StreamPartition (and shouldn't Partition's hash just be its int?), but that
can be a separate, newbie JIRA.
> Cache hashcode for SystemStreamPartition
> ----------------------------------------
>
> Key: SAMZA-106
> URL: https://issues.apache.org/jira/browse/SAMZA-106
> Project: Samza
> Issue Type: Improvement
> Components: container
> Affects Versions: 0.6.0
> Reporter: Jakob Homan
> Assignee: Jakob Homan
> Fix For: 0.7.0
>
> Attachments: Hashcode-More-Like-Slushcode.png, SAMZA-106.patch,
> Thats-better.png
>
>
> Profiling shows that in jobs that consume lots of partitions (on the order of hundreds),
> a large chunk of the main thread's time (31.5%, in my tests) is spent
> recalculating the hash codes of SystemStreamPartitions. Most of this (26.5%)
> comes from BlockingEnvelopeMetricsMap.incPoll. Since SSPs are (meant to be)
> immutable, there's no reason to recalculate the hash each time. We can just
> compute it once at object creation and return that result.
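
For illustration, the idea of computing the hash once at construction can be
sketched roughly as below. This is not the attached patch: the class name
CachedHashKey and its three fields are hypothetical stand-ins for
SystemStreamPartition, and using Objects.hash to combine the fields is an
assumption. It only works because every field is final.

```java
import java.util.Objects;

// Hypothetical stand-in for an immutable key such as SystemStreamPartition.
// Field names and the hash-combining function are assumptions.
final class CachedHashKey {
    private final String system;
    private final String stream;
    private final int partition;
    // Computed once in the constructor; safe because all fields are final.
    private final int hash;

    CachedHashKey(String system, String stream, int partition) {
        this.system = system;
        this.stream = stream;
        this.partition = partition;
        this.hash = Objects.hash(system, stream, partition);
    }

    @Override
    public int hashCode() {
        // Return the cached value instead of recomputing on every map lookup.
        return hash;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof CachedHashKey)) return false;
        CachedHashKey other = (CachedHashKey) o;
        return partition == other.partition
            && Objects.equals(system, other.system)
            && Objects.equals(stream, other.stream);
    }
}
```

Two keys built from the same values still compare equal and land in the same
hash bucket, so map lookups behave as before; only the per-call hashing work
goes away.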