[
https://issues.apache.org/jira/browse/SAMZA-2044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16721519#comment-16721519
]
ASF GitHub Bot commented on SAMZA-2044:
---------------------------------------
GitHub user georgantasp opened a pull request:
https://github.com/apache/samza/pull/862
SAMZA-2044 - Change api to avoid OOMem Exceptions
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/georgantasp/samza SAMZA-2044
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/samza/pull/862.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #862
----
commit 0844025d7b037150d42d71c692abb9484945957b
Author: Peter Georgantas <peter@...>
Date: 2018-12-14T15:14:58Z
SAMZA-2044 - change api to avoid OOMem Exceptions
----
> EOSMessage causes out of memory Exceptions related to WindowOperatorImpl
> ------------------------------------------------------------------------
>
> Key: SAMZA-2044
> URL: https://issues.apache.org/jira/browse/SAMZA-2044
> Project: Samza
> Issue Type: Bug
> Components: container
> Affects Versions: 0.14.0, 1.0
> Reporter: Peter Georgantas
> Priority: Major
>
> The contract of the handleEndOfStream method dictates that a collection of
> results be returned. In the case of WindowOperatorImpl, which has a backing
> RocksDB store, this effectively causes the entirety of the store to be pulled
> into memory. In many cases, this will cause out-of-memory exceptions
> (otherwise, why not keep the store in memory in the first place?).
> Since this is a protected API, I propose a relatively simple change that
> would allow data to be consumed downstream as it is read out of the
> store:
> {{protected void handleEndOfStream(Consumer<WindowPane<K, Object>> consumer,
> MessageCollector collector, TaskCoordinator coordinator)}}
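
The difference between the two contracts described in the issue can be sketched in plain Java. This is a minimal illustration, not Samza code: the class StreamingDrainSketch, the methods drainToCollection and drainToConsumer, and the use of a plain Iterator to stand in for the RocksDB-backed store are all hypothetical names chosen for this example. Only the idea of passing a Consumer instead of returning a collection comes from the proposal above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch; none of these names exist in Samza.
class StreamingDrainSketch {

    // Collection-returning style: every element read from the store is
    // buffered in memory before the caller sees any of them.
    static <T> Collection<T> drainToCollection(Iterator<T> storeIterator) {
        List<T> results = new ArrayList<>();
        while (storeIterator.hasNext()) {
            results.add(storeIterator.next());
        }
        return results;
    }

    // Consumer style: each element is handed downstream as soon as it is
    // read, so this method itself never holds more than one element.
    static <T> void drainToConsumer(Iterator<T> storeIterator, Consumer<T> consumer) {
        while (storeIterator.hasNext()) {
            consumer.accept(storeIterator.next());
        }
    }

    public static void main(String[] args) {
        // A small in-memory list stands in for the store's contents (assumption).
        List<String> panes = Arrays.asList("pane-1", "pane-2", "pane-3");

        Collection<String> buffered = drainToCollection(panes.iterator());
        System.out.println("buffered " + buffered.size() + " panes before emitting");

        drainToConsumer(panes.iterator(), pane -> System.out.println("emitted " + pane));
    }
}
```

With the consumer style, peak memory depends on what the downstream consumer retains rather than on the total size of the store, which is the property the proposed signature is after.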