[ https://issues.apache.org/jira/browse/SPARK-52330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Boyang Jerry Peng updated SPARK-52330:
--------------------------------------
    Description: 
The SPIP proposes a new execution mode called “*Real-time Mode*” in Spark Structured Streaming that significantly lowers end-to-end latency for processing streams of data.

Our goal is to make Spark capable of handling streaming jobs that need results *almost immediately (within O(100) milliseconds)*. We want to achieve this *without changing the high-level DataFrame/Dataset API* that users already use – so existing streaming queries can run in this new ultra-low-latency mode by simply turning it on, without rewriting their logic.

In short, we’re trying to enable Spark to power *real-time applications* (like instant anomaly alerts or live personalization) that today cannot meet their latency requirements with Spark’s current streaming engine.
SPIP doc: [https://docs.google.com/document/d/1CvJvtlTGP6TwQIT4kW6GFT1JbdziAYOBvt60ybb7Dw8/edit?usp=sharing]

  was:
The SPIP proposes to add a new execution mode called “*Real-time Mode*” in Spark Structured Streaming that significantly lowers end-to-end latency for processing streams of data.

Our goal is to make Spark capable of handling streaming jobs that need results *almost immediately (within O(100) milliseconds)*. We want to achieve this *without changing the high-level DataFrame/Dataset API* that users already use – so existing streaming queries can run in this new ultra-low-latency mode by simply turning it on, without rewriting their logic.

In short, we’re trying to enable Spark to power *real-time applications* (like instant anomaly alerts or live personalization) that today cannot meet their latency requirements with Spark’s current streaming engine.
SPIP doc: [https://docs.google.com/document/d/1CvJvtlTGP6TwQIT4kW6GFT1JbdziAYOBvt60ybb7Dw8/edit?usp=sharing]

> SPIP: Real-Time Mode in Apache Spark Structured Streaming
> ---------------------------------------------------------
>
>                 Key: SPARK-52330
>                 URL: https://issues.apache.org/jira/browse/SPARK-52330
>             Project: Spark
>          Issue Type: Umbrella
>          Components: Structured Streaming
>    Affects Versions: 4.1.0
>            Reporter: Boyang Jerry Peng
>            Priority: Major
>
> The SPIP proposes a new execution mode called “*Real-time Mode*” in
> Spark Structured Streaming that significantly lowers end-to-end latency for
> processing streams of data.
> Our goal is to make Spark capable of handling streaming jobs that need
> results *almost immediately (within O(100) milliseconds)*. We want
> to achieve this *without changing the high-level DataFrame/Dataset API* that
> users already use – so existing streaming queries can run in this new
> ultra-low-latency mode by simply turning it on, without rewriting their logic.
> In short, we’re trying to enable Spark to power *real-time applications*
> (like instant anomaly alerts or live personalization) that today cannot meet
> their latency requirements with Spark’s current streaming engine.
>
> SPIP doc:
> [https://docs.google.com/document/d/1CvJvtlTGP6TwQIT4kW6GFT1JbdziAYOBvt60ybb7Dw8/edit?usp=sharing]

--
This message was sent by Atlassian Jira
(v8.20.10#820010)