Dale LaBossiere created QUARKS-211:
--------------------------------------

             Summary: Topology.events() uses unbounded 
PlumbingStreams.isolate(): can cause out of memory
                 Key: QUARKS-211
                 URL: https://issues.apache.org/jira/browse/QUARKS-211
             Project: Quarks
          Issue Type: Bug
          Components: Runtime
            Reporter: Dale LaBossiere
            Priority: Critical


Using an unbounded isolate is problematic.  If the downstream processing can't 
keep up for an extended period, the application ultimately accumulates 
tuples until an "out of memory" condition occurs.

Note: the javadoc says "see ``PlumbingStreams#pressureReliever()``", but the 
code actually uses an **unbounded** ``PlumbingStreams#isolate()``.

While an app could add its own bounded isolator to an events()-generated 
stream (see the sketch below), that adds undesired per-tuple processing 
latency due to the additional isolation point.
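
For illustration, a minimal (untested) sketch of that workaround.  ``MyEvent``, 
``topology``, ``setup``, and the queue size of 1000 are placeholders, and the 
bounded ``isolate(stream, queueCapacity)`` overload is the new impl mentioned 
below:

    import quarks.topology.TStream;
    import quarks.topology.plumbing.PlumbingStreams;

    TStream<MyEvent> raw = topology.events(setup);
    // Extra isolation point: bounds the queue but adds per-tuple latency
    TStream<MyEvent> bounded = PlumbingStreams.isolate(raw, 1000);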

Changing to a bounded scheme for the default seems better.  Either:
a) a bounded isolate (there's now an impl for that), and live with blocking 
the event listener when the queue is full, or
b) a pressureReliever, and live with dropping events/tuples when the queue is 
full (sketched below).
Pick some not-unreasonable bound (e.g., 1000?) and document it.
I think "a" may be the lesser evil.

Additionally, allow (require?) the Topology.events(eventSetup) handler to 
provide an "isolator function":

    ``events(Function<Consumer<T>, UnaryOperator<TStream<T>>> eventSetup)``

Then eventSetup can return:
    null;  // use the (new) default isolator
or  stream -> PlumbingStreams.pressureReliever(stream, Functions.unpartitioned(), queueSize);
or  stream -> PlumbingStreams.isolate(stream, queueSize);
or  Functions.identity() (or ``t -> t``) for no isolation (maybe other 
    characteristics of the app eliminate the need/desire for this per-tuple 
    isolation latency);
or  ... e.g., a "time bounded" pressureReliever or isolator, or a time&count 
    bounded version, when such things exist.
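
To make that concrete, a hypothetical usage sketch.  The 
``events(Function<...>)`` overload doesn't exist yet, and ``sensor``, 
``SensorEvent``, and ``addListener`` are placeholders for some event source:

    import quarks.topology.TStream;
    import quarks.topology.plumbing.PlumbingStreams;

    // Hypothetical new-style eventSetup: wire the submitter to the event
    // source, then return the isolator to apply to the generated stream
    // (returning null would select the new default isolator).
    TStream<SensorEvent> events = topology.events(submitter -> {
        sensor.addListener(event -> submitter.accept(event));
        return stream -> PlumbingStreams.isolate(stream, 1000);
    });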

Looking for some consensus / votes on this.  I don't think we want our first 
release to ship with the current behavior.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)