Github user haohui commented on the pull request:
https://github.com/apache/storm/pull/855#issuecomment-153820936
The high-level API looks good to me overall. It maintains a view of all the
events in the window, which is a powerful concept.
I have several questions on how this PR can help implement two common use
cases.
(1) Aggregation (e.g., min / max) over a sliding window
(2) Stream joins over a large amount of data
An in-memory view of every event is insufficient for both cases: an
efficient algorithm for (1) does not need every single event in the window,
and in (2) the events in the window need to be spilled to secondary storage. To
me it seems that both would still require writing a lot of custom code. The
issue might be mitigated by adding flexibility to the API on whether and where
the events in the window are kept.
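
To illustrate point (1), here is a minimal sketch of a sliding-window maximum
that keeps only the events that can still become the maximum, rather than the
whole window. It assumes a count-based window and long-valued events. The class
SlidingWindowMax and its methods are hypothetical names for illustration and
are not part of Storm or of this PR.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper for illustration only; not part of Storm or the PR.
final class SlidingWindowMax {
    private final int windowSize; // number of most recent events in the window
    // Each entry is {index, value}. Values are strictly decreasing from
    // front to back, so the front entry is always the current maximum.
    private final Deque<long[]> candidates = new ArrayDeque<>();
    private long nextIndex = 0;

    SlidingWindowMax(int windowSize) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        this.windowSize = windowSize;
    }

    // Adds one event and returns the maximum over the last windowSize events.
    long add(long value) {
        long index = nextIndex++;
        // Older events that are not larger than the new one can never be
        // the maximum again, so they are dropped immediately.
        while (!candidates.isEmpty() && candidates.peekLast()[1] <= value) {
            candidates.pollLast();
        }
        candidates.addLast(new long[] {index, value});
        // Expire the front entry once it has slid out of the window.
        while (candidates.peekFirst()[0] <= index - windowSize) {
            candidates.pollFirst();
        }
        return candidates.peekFirst()[1];
    }

    // Number of events currently retained (at most windowSize).
    int retained() {
        return candidates.size();
    }
}
```

With a window of 3 and the inputs 5, 3, 4, 1, 2, 6, the successive maxima are
5, 5, 5, 4, 4, 6, and only one event is retained after the last input. An API
that always materializes the full window cannot express this kind of pruning
without extra custom code.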