[jira] [Updated] (BEAM-11707) Optimize WindmillStateCache CPU usage

Sam Whittle (Jira) Fri, 12 Mar 2021 00:26:06 -0800


     [ 
https://issues.apache.org/jira/browse/BEAM-11707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Sam Whittle updated BEAM-11707:
-------------------------------
    Resolution: Fixed
        Status: Resolved  (was: Open)

> Optimize WindmillStateCache CPU usage
> -------------------------------------
>
>                 Key: BEAM-11707
>                 URL: https://issues.apache.org/jira/browse/BEAM-11707
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-dataflow
>            Reporter: Sam Whittle
>            Assignee: Sam Whittle
>            Priority: P2
>          Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Profiling Nexmark Query11, which has many unique tags per key, showed that
> WindmillStateCache accounted for 6% of CPU usage.
> The cost appears to come from maintaining the invalidation set as well as
> from the many reads and inserts.
> The invalidation set is maintained so that if a key encounters an error
> during processing or its cache token changes, we can invalidate all the
> entries for that key.  Currently this is done by removing all of the key's
> entries from the cache.  An alternative that appears much more CPU efficient
> is to leave the entries in the cache but make them unreachable.  This can be
> done by having a per-key object that uses object equality as part of the
> cache lookup.  To discard the entries for a key, we then start using a new
> per-key object.  Cleanup of per-key objects can be done with a weak reference
> map.
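
The per-key object idea in the paragraph above can be sketched roughly as
follows. This is a minimal illustration, not the WindmillStateCache code: the
names TokenInvalidatingCache, KeyToken, EntryId and storedEntries are
hypothetical, keys, tags and values are plain strings, and the weak reference
map used for cleaning up per-key objects (as well as any size-based eviction
of stale entries) is left out.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; names and types are assumptions, not Beam's API. */
final class TokenInvalidatingCache {
  /** Per-key token. It keeps Object's identity-based equals/hashCode on purpose. */
  private static final class KeyToken {}

  /** Cache id combining the key's current token with a state tag. */
  private static final class EntryId {
    private final KeyToken token;
    private final String tag;

    EntryId(KeyToken token, String tag) {
      this.token = token;
      this.tag = tag;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof EntryId)) {
        return false;
      }
      EntryId other = (EntryId) o;
      // Reference comparison: ids built from an older token never match.
      return token == other.token && tag.equals(other.tag);
    }

    @Override
    public int hashCode() {
      return 31 * System.identityHashCode(token) + tag.hashCode();
    }
  }

  // Assumption: plain maps stand in for the real weighted cache and for the
  // weak reference map that would clean up unused tokens.
  private final Map<String, KeyToken> tokens = new HashMap<>();
  private final Map<EntryId, String> entries = new HashMap<>();

  private KeyToken tokenFor(String key) {
    return tokens.computeIfAbsent(key, k -> new KeyToken());
  }

  void put(String key, String tag, String value) {
    entries.put(new EntryId(tokenFor(key), tag), value);
  }

  String get(String key, String tag) {
    return entries.get(new EntryId(tokenFor(key), tag));
  }

  /** Swaps the token; old entries stay stored but can no longer be looked up. */
  void invalidate(String key) {
    tokens.put(key, new KeyToken());
  }

  /** Number of stored entries, including unreachable ones. */
  int storedEntries() {
    return entries.size();
  }

  public static void main(String[] args) {
    TokenInvalidatingCache cache = new TokenInvalidatingCache();
    cache.put("key1", "tagA", "value");
    System.out.println(cache.get("key1", "tagA")); // prints "value"
    cache.invalidate("key1");
    System.out.println(cache.get("key1", "tagA")); // prints "null"
  }
}
```

In this sketch invalidation is a single map write regardless of how many
tags the key has; the stale entries only cost memory until something evicts
them.
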
> Another cost of the cache is that objects are grouped by window so that they
> are kept or evicted all at once.  Currently, when reading items into the
> cache, we fetch the window set and then look up each tag in it.  The window
> set could be cached for the key to avoid multiple cache lookups.  Similarly,
> when putting objects we look up and insert each tag separately and then update
> the cache to adjust the weight of the per-window set.  This update could be
> done once, after all updates for the window have been made.
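
The batching of per-window reads and writes suggested above might look
something like the sketch below. It is only an illustration under stated
assumptions: BatchedWindowCache, WindowEntry, WindowView, cacheOperations and
weightOf are hypothetical names, windows and tags are strings, and an entry's
weight is simply the total length of its values. The counter exists only to
show how many times the shared cache is touched.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; names and the weight model are assumptions. */
final class BatchedWindowCache {
  /** All cached tags for one window, kept or evicted together. */
  static final class WindowEntry {
    final Map<String, String> tags = new HashMap<>();
    long weight;
  }

  private final Map<String, WindowEntry> cache = new HashMap<>();

  /** Counts lookups and inserts against the shared cache (illustration only). */
  int cacheOperations;

  /** Local view of one window: one cache lookup on creation, one insert on persist. */
  final class WindowView {
    private final String window;
    private final WindowEntry entry;

    WindowView(String window) {
      this.window = window;
      cacheOperations++; // the only lookup for this window
      WindowEntry cached = cache.get(window);
      this.entry = cached != null ? cached : new WindowEntry();
    }

    /** Reads a tag from the already fetched window set; no cache lookup. */
    String get(String tag) {
      return entry.tags.get(tag);
    }

    /** Updates the local window set and its weight; no cache update yet. */
    void put(String tag, String value) {
      String old = entry.tags.put(tag, value);
      entry.weight += value.length() - (old == null ? 0 : old.length());
    }

    /** Writes the window set back once, after all updates have been made. */
    void persist() {
      cacheOperations++;
      cache.put(window, entry);
    }
  }

  WindowView forWindow(String window) {
    return new WindowView(window);
  }

  long weightOf(String window) {
    WindowEntry entry = cache.get(window);
    return entry == null ? 0 : entry.weight;
  }

  public static void main(String[] args) {
    BatchedWindowCache cache = new BatchedWindowCache();
    BatchedWindowCache.WindowView view = cache.forWindow("window1");
    view.put("tagA", "aa");
    view.put("tagB", "bbb");
    view.persist();
    System.out.println(cache.cacheOperations);     // prints 2
    System.out.println(cache.weightOf("window1")); // prints 5
  }
}
```

Here two tag writes cost two operations on the shared cache (one fetch, one
write-back) instead of a lookup and insert per tag plus a weight update.
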



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
