[
https://issues.apache.org/jira/browse/STORM-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Stig Rohde Døssing closed STORM-2540.
-------------------------------------
Resolution: Won't Do
> Get rid of window compaction in WindowManager
> ---------------------------------------------
>
> Key: STORM-2540
> URL: https://issues.apache.org/jira/browse/STORM-2540
> Project: Apache Storm
> Issue Type: Improvement
> Components: storm-client
> Affects Versions: 2.0.0
> Reporter: Stig Rohde Døssing
> Assignee: Stig Rohde Døssing
>
> Storm's windowing support uses trigger and eviction policies to control the
> size of the windows passed to WindowingBolts. The WindowManager has a
> hard-coded limit of 100 tuples, after which it starts trying to evict tuples
> from the window, probably as an attempt to avoid overly large windows when
> using time-based eviction policies. Whenever a tuple is added to the window,
> the hard cap is checked, and if the number of tuples in the window exceeds
> the cap, the WindowManager evaluates the EvictionPolicy for the tuples to
> determine whether some can be removed.
> This hard cap is ineffective in most configurations and has a surprising
> interaction with the count-based policy.
> If the windowing bolt is configured to use timestamp fields in the tuples to
> determine the current time, the WatermarkingXPolicy classes are used. In this
> configuration, the compaction has no effect, because tuples cannot be
> evicted until the WatermarkGenerator sends a new watermark, and when it does,
> the TriggerPolicy causes the WindowManager to evict any expired tuples anyway.
> If the windowing bolt is using the count-based policy, compaction has the
> unexpected effect of hard-capping the user's configured max count at 100. If
> the configured count is less than 100, the compaction again has no effect.
> When the bolt is configured to use the policy based on tuple arrival time,
> the compaction only has an effect if there are tuples older than the
> configured window duration, which only happens if the window triggers
> slightly late. This can cause tuples to be evicted from the window before the
> user's bolt sees them. Even when tuples are evicted by the compaction
> mechanism, they are kept in memory until the next time a window is presented
> to the user's bolt.
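The arrival-time case above can be illustrated with a small, self-contained model. This is only a sketch of the behavior as described in this issue, not Storm's actual WindowManager code: the class `CompactingWindow`, its fields, and the 20 ms spacing are hypothetical and chosen for illustration. It assumes the window trigger is late, so that tuples older than the window duration are still present when the 100-tuple cap is exceeded.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Simplified model of the compaction described above; NOT Storm's WindowManager.
class CompactingWindow {
    static final int HARD_CAP = 100; // the hard-coded limit from the description

    private final long windowLengthMs; // configured window duration
    // One arrival timestamp per tuple currently in the window.
    private final Deque<Long> arrivalTimes = new ArrayDeque<>();
    // Tuples dropped by compaction; still held in memory, never shown to the bolt.
    private final List<Long> evictedByCompaction = new ArrayList<>();

    CompactingWindow(long windowLengthMs) {
        this.windowLengthMs = windowLengthMs;
    }

    // Called for every incoming tuple; nowMs is the tuple's arrival time.
    void add(long nowMs) {
        arrivalTimes.addLast(nowMs);
        if (arrivalTimes.size() > HARD_CAP) {
            compact(nowMs);
        }
    }

    // Time-based eviction check, run early because the cap was exceeded.
    private void compact(long nowMs) {
        Iterator<Long> it = arrivalTimes.iterator();
        while (it.hasNext()) {
            long arrival = it.next();
            if (nowMs - arrival > windowLengthMs) {
                it.remove();
                evictedByCompaction.add(arrival);
            }
        }
    }

    int size() {
        return arrivalTimes.size();
    }

    int evictedCount() {
        return evictedByCompaction.size();
    }
}

class CompactionSketch {
    public static void main(String[] args) {
        CompactingWindow window = new CompactingWindow(1_000);
        // 101 tuples arrive 20 ms apart and no window has triggered yet.
        for (int i = 0; i <= 100; i++) {
            window.add(i * 20L);
        }
        System.out.println("in window: " + window.size()
                + ", evicted before the bolt saw them: " + window.evictedCount());
    }
}
```

In this model the 101st tuple pushes the window over the cap, and every tuple older than one second is dropped without ever having been part of a window handed to the bolt.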
> I think the compaction mechanism should be removed. The only policy that
> benefits is the time-based policy, and in that case it would be better to
> just add a configurable max tuple count to that policy.
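One possible shape for the alternative proposed above is sketched below. It is an assumption about how a configurable cap could look, not an existing Storm class: `CappedTimeWindow`, `maxTuples`, and `expire` are hypothetical names. The sketch assumes arrival times are non-decreasing and that time-based expiry runs only when a window triggers, so the only early eviction is the one the user explicitly configured.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch; names are illustrative and not part of Storm's API.
class CappedTimeWindow {
    private final long windowLengthMs; // configured window duration
    private final int maxTuples;       // user-configured cap instead of a fixed 100
    private final Deque<Long> arrivalTimes = new ArrayDeque<>();

    CappedTimeWindow(long windowLengthMs, int maxTuples) {
        if (maxTuples <= 0) {
            throw new IllegalArgumentException("maxTuples must be positive");
        }
        this.windowLengthMs = windowLengthMs;
        this.maxTuples = maxTuples;
    }

    // Called for every incoming tuple; only the explicit cap is enforced here.
    void add(long nowMs) {
        arrivalTimes.addLast(nowMs);
        while (arrivalTimes.size() > maxTuples) {
            arrivalTimes.removeFirst(); // drop the oldest tuple
        }
    }

    // Called when the window triggers; drops tuples older than the window.
    void expire(long nowMs) {
        while (!arrivalTimes.isEmpty()
                && nowMs - arrivalTimes.peekFirst() > windowLengthMs) {
            arrivalTimes.removeFirst();
        }
    }

    int size() {
        return arrivalTimes.size();
    }

    // Arrival time of the oldest tuple; throws NullPointerException if empty.
    long oldest() {
        return arrivalTimes.peekFirst();
    }
}

class CappedTimeWindowSketch {
    public static void main(String[] args) {
        CappedTimeWindow window = new CappedTimeWindow(10_000, 5);
        for (long t = 0; t < 8; t++) {
            window.add(t);
        }
        System.out.println("tuples kept: " + window.size()
                + ", oldest arrival: " + window.oldest());
    }
}
```

With an explicit setting like this, the upper bound on window size is visible in the user's configuration rather than hidden in the WindowManager.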