Sean Owen updated SPARK-10320:
    Assignee: Cody Koeninger

> Kafka Support new topic subscriptions without requiring restart of the 
> streaming context
> ----------------------------------------------------------------------------------------
>                 Key: SPARK-10320
>                 URL: https://issues.apache.org/jira/browse/SPARK-10320
>             Project: Spark
>          Issue Type: New Feature
>          Components: Streaming
>            Reporter: Sudarshan Kadambi
>            Assignee: Cody Koeninger
>             Fix For: 2.0.0
> Spark Streaming lacks the ability to subscribe to newer topics or unsubscribe 
> to current ones once the streaming context has been started. Restarting the 
> streaming context increases the latency of update handling.
> Consider a streaming application subscribed to n topics. Let's say 1 of the 
> topics is no longer needed in streaming analytics and hence should be 
> dropped. We could do this by stopping the streaming context, removing that 
> topic from the topic list and restarting the streaming context. Since with 
> some DStreams such as DirectKafkaStream, the per-partition offsets are 
> maintained by Spark, we should be able to resume uninterrupted (I think?) 
> from where we left off with a minor delay. However, in instances where 
> expensive state initialization (from an external datastore) may be needed for 
> datasets published to all topics, before streaming updates can be applied to 
> it, it is more convenient to only subscribe or unsubcribe to the incremental 
> changes to the topic list. Without such a feature, updates go unprocessed for 
> longer than they need to be, thus affecting QoS.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to