[ 
https://issues.apache.org/jira/browse/SPARK-52330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Jerry Peng updated SPARK-52330:
--------------------------------------
    Description: 
The SPIP proposes a new execution mode called “*Real-time Mode*” in Spark 
Structured Streaming that significantly lowers end-to-end latency for 
processing streams of data.

Our goal is to make Spark capable of handling streaming jobs that need results 
*almost immediately (within O(100) milliseconds)*. We want to 
achieve this *without changing the high-level DataFrame/Dataset API* that users 
already use – so existing streaming queries can run in this new 
ultra-low-latency mode by simply turning it on, without rewriting their logic.

In short, we’re trying to enable Spark to power *real-time applications* (like 
instant anomaly alerts or live personalization) that today cannot meet their 
latency requirements with Spark’s current streaming engine.

 

SPIP doc: 
[https://docs.google.com/document/d/1CvJvtlTGP6TwQIT4kW6GFT1JbdziAYOBvt60ybb7Dw8/edit?usp=sharing]

  was:
The SPIP proposes to add a new execution mode called “*Real-time Mode*” in 
Spark Structured Streaming that significantly lowers end-to-end latency for 
processing streams of data.

Our goal is to make Spark capable of handling streaming jobs that need results 
*almost immediately (within O(100) milliseconds)*. We want to 
achieve this *without changing the high-level DataFrame/Dataset API* that users 
already use – so existing streaming queries can run in this new 
ultra-low-latency mode by simply turning it on, without rewriting their logic.

In short, we’re trying to enable Spark to power *real-time applications* (like 
instant anomaly alerts or live personalization) that today cannot meet their 
latency requirements with Spark’s current streaming engine.

 

SPIP doc: 
[https://docs.google.com/document/d/1CvJvtlTGP6TwQIT4kW6GFT1JbdziAYOBvt60ybb7Dw8/edit?usp=sharing]


> SPIP: Real-Time Mode in Apache Spark Structured Streaming
> ---------------------------------------------------------
>
>                 Key: SPARK-52330
>                 URL: https://issues.apache.org/jira/browse/SPARK-52330
>             Project: Spark
>          Issue Type: Umbrella
>          Components: Structured Streaming
>    Affects Versions: 4.1.0
>            Reporter: Boyang Jerry Peng
>            Priority: Major
>
> The SPIP proposes a new execution mode called “*Real-time Mode*” in 
> Spark Structured Streaming that significantly lowers end-to-end latency for 
> processing streams of data.
>
> Our goal is to make Spark capable of handling streaming jobs that need 
> results *almost immediately (within O(100) milliseconds)*. We want 
> to achieve this *without changing the high-level DataFrame/Dataset API* that 
> users already use – so existing streaming queries can run in this new 
> ultra-low-latency mode by simply turning it on, without rewriting their logic.
>
> In short, we’re trying to enable Spark to power *real-time applications* 
> (like instant anomaly alerts or live personalization) that today cannot meet 
> their latency requirements with Spark’s current streaming engine.
>  
> SPIP doc: 
> [https://docs.google.com/document/d/1CvJvtlTGP6TwQIT4kW6GFT1JbdziAYOBvt60ybb7Dw8/edit?usp=sharing]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
