[
https://issues.apache.org/jira/browse/GEARPUMP-31?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Manu Zhang updated GEARPUMP-31:
-------------------------------
Component/s: streaming
> Dynamic processor deletion of DAG
> ---------------------------------
>
> Key: GEARPUMP-31
> URL: https://issues.apache.org/jira/browse/GEARPUMP-31
> Project: Apache Gearpump
> Issue Type: New Feature
> Components: streaming
> Reporter: JongHyok Lee
>
> Dynamic deletion of processors from a DAG is required as part of Dynamic DAG.
> Here's a sample use case.
> First, let's assume that there is an application which consists of
> several processors:
> (A) : Source which reads from Kafka
> (B) : Filter input with specific value
> (C) : Filter input with another specific value
> (D) : Sink input to HDFS
> (E) : Evaluate value of f(x1, x2, ...) where x1, x2, ... are parts of the input
> (F) : Transform input to another form
> (G) : Merge two input streams into one
> (H) : Sink input to Kafka topic
> (I) : Sink input to another Kafka topic
> (J) : Sink input to a third Kafka topic
> and the graph is
> (A) ~> (D)
> (A) ~> (B) ~> (E) ~> (H)
> (E) ~> (G) ~> (I)
> (A) ~> (C) ~> (F) ~> (J)
> (F) ~> (G)
> While the application is running, let's assume that the consuming component
> of the data stream (it can be a UI or another system which consumes the Kafka
> messages) reports that it doesn't need the merged data any more. Then the
> (G) ~> (I) part of the graph should be removed from the DAG. Also, if the
> consuming component says it no longer needs to filter on the other 'value',
> then processor (C) should be removed and (F) should receive input directly
> from (A) (the new input might be explicitly described in this case).
> This kind of functionality is really important for a system dealing with
> very large data streams, which cannot afford to let even part of the DAG's
> processors use CPU power when they are not needed.
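
The two requested changes can be sketched on a plain adjacency map. This is only an illustration of the intended graph edits, not Gearpump's API: the function names (`build_graph`, `remove_processor`, `prune_dead_ends`, `bypass_processor`) are hypothetical. It also assumes that dropping the merged output means deleting sink (I) and any processor left without consumers, and that removing (C) means rewiring (F) to (A).

```python
# Hypothetical in-memory model of the DAG from the description above.
# Not Gearpump's API; it only illustrates the two requested edits.

def build_graph():
    """Edges of the example application, keyed by upstream processor."""
    return {
        "A": {"D", "B", "C"},
        "B": {"E"},
        "C": {"F"},
        "E": {"H", "G"},
        "F": {"J", "G"},
        "G": {"I"},
        "D": set(), "H": set(), "I": set(), "J": set(),
    }

def remove_processor(graph, node):
    """Delete `node` together with every edge that touches it."""
    graph.pop(node, None)
    for downstream in graph.values():
        downstream.discard(node)

def prune_dead_ends(graph, sinks):
    """Repeatedly drop non-sink processors that no longer feed anything."""
    changed = True
    while changed:
        changed = False
        for node in list(graph):
            if not graph[node] and node not in sinks:
                remove_processor(graph, node)
                changed = True

def bypass_processor(graph, node):
    """Remove `node` and connect each of its inputs to each of its outputs."""
    outputs = graph.get(node, set()) - {node}
    inputs = [n for n, outs in graph.items() if node in outs and n != node]
    remove_processor(graph, node)
    for upstream in inputs:
        graph[upstream] |= outputs

graph = build_graph()

# Case 1: merged data is no longer needed. Drop sink (I); (G) then has no
# consumers left and is pruned as well.
remove_processor(graph, "I")
prune_dead_ends(graph, sinks={"D", "H", "J"})

# Case 2: filter (C) is no longer needed, so (F) reads directly from (A).
bypass_processor(graph, "C")
```

After both edits the remaining edges are (A) ~> (D), (A) ~> (B) ~> (E) ~> (H) and (A) ~> (F) ~> (J), so (C), (G) and (I) no longer consume any resources.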
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)