[ 
https://issues.apache.org/jira/browse/SAMZA-106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated SAMZA-106:
------------------------------

    Attachment: SAMZA-106.patch
                Thats-better.png

After shot attached, showing usage down to about 7.5%.  The two samples are 
about as comparable as I can get them.  Also attached is a patch to pre-compute 
the hash code at construction and return the cached value.  It may be worth 
doing this for StreamPartition as well (and shouldn't Partition's hash just be 
its int?), but that can be a different, newbie jira.
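
For anyone skimming without opening the attachment, the general technique looks 
roughly like the sketch below.  This is not the attached patch: CachedHashKey 
and its fields are hypothetical stand-ins for an immutable key class, and the 
only assumption is that the fields never change after construction.

```java
import java.util.Objects;

// Hypothetical immutable key type used only for illustration;
// it is not the actual Samza SystemStreamPartition class.
final class CachedHashKey {
    private final String system;
    private final String stream;
    private final int partition;
    // Computed once in the constructor; safe because all fields are final.
    private final int hash;

    CachedHashKey(String system, String stream, int partition) {
        this.system = system;
        this.stream = stream;
        this.partition = partition;
        this.hash = Objects.hash(system, stream, partition);
    }

    @Override
    public int hashCode() {
        // No recomputation on the hot path, just a field read.
        return hash;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof CachedHashKey)) return false;
        CachedHashKey other = (CachedHashKey) o;
        return partition == other.partition
            && Objects.equals(system, other.system)
            && Objects.equals(stream, other.stream);
    }
}
```

The trade-off is one extra int per instance in exchange for skipping the 
field-by-field hash on every map lookup, which only holds up if the class 
really is immutable.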

> Cache hashcode for SystemStreamPartition
> ----------------------------------------
>
>                 Key: SAMZA-106
>                 URL: https://issues.apache.org/jira/browse/SAMZA-106
>             Project: Samza
>          Issue Type: Improvement
>          Components: container
>    Affects Versions: 0.6.0
>            Reporter: Jakob Homan
>            Assignee: Jakob Homan
>             Fix For: 0.7.0
>
>         Attachments: Hashcode-More-Like-Slushcode.png, SAMZA-106.patch, 
> Thats-better.png
>
>
> Profiling shows that in jobs that consume lots of partitions (on the order of 
> hundreds), a large chunk of the main thread's time (31.5%, in my tests) is 
> spent recalculating SystemStreamPartition hash codes.  Most of this (26.5%) 
> comes from BlockingEnvelopeMetricsMap.incPoll.  Since SSPs are (meant to be) 
> immutable, there's no reason to recalculate the hash each time.  We can just 
> compute it at object creation and return that result.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
