Re: [VOTE] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-06-04 Thread Mich Talebzadeh
>> On Jun 2, 2025, at 12:50 PM, DB Tsai >> wrote: >> >>>>>>>>>> >> >>>>>>>>>> +1 looking forward to seeing real-time mode. >> >>>>>>>>>> Sent from my iPhone >> >>>>>>

Re: [VOTE] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-06-04 Thread Jerry Peng
>> > >>>>>>>>>> On Jun 1, 2025, at 9:47 PM, Xiao Li > wrote: > >>>>>>>>>> > >>>>>>>>>> +1 > >>>>>>>>>> > >>

[VOTE][RESULT] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-06-04 Thread L. C. Hsieh
The vote passes with 24 +1s (12 binding +1s), 1 +0s (1 binding +0s) and no -1s. Thanks to all who helped with the vote! (* = binding) +1: Dongjoon Hyun (*) Yuanjian Li (*) Tathagata Das (*) Huaxin Gao (*) Xiao Li (*) DB Tsai (*) L.C. Hsieh (*) Denny Lee Gengliang Wang (*) Sakthi Anish Shrigondeka

Re: [VOTE] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-06-04 Thread L. C. Hsieh
>>>>> >>>>>>>>>> +1 looking forward to seeing real-time mode. >>>>>>>>>> Sent from my iPhone >>>>>>>>>> >>>>>>>>>> On Jun 1, 2025, at 9:47 PM, Xiao Li wrote: >>>>

Re: [VOTE] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-06-02 Thread Cheng Pan
+1 (non-binding) Thanks, Cheng Pan > On Jun 2, 2025, at 03:00, L. C. Hsieh wrote: > > Hi all, > > I would like to start a vote on the new real-time mode in Apache Spark > Structured Streaming. > > Discussion thread: > https://lists.apache.org/thread/ovmfbzfkc3t9

Re: [VOTE] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-06-02 Thread Wenchen Fan
>> >>>>>>>>> >>>>>>>>> +1 >>>>>>>>> >>>>>>>>> On Sun, Jun 1, 2025 at 20:00, huaxin gao wrote: >>>>>>>>> >>>>>>>>>> +1 >>>>>>>>>>

Re: [VOTE] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-06-02 Thread Reynold Xin
>>>>>> +1 >>>>>>>> >>>>>>>> On Sun, Jun 1, 2025 at 7:50 PM Tathagata Das < >>>>>>>> tathagata.das1...@gmail.com> wrote: >>>>>>>> >>>>>>>>> +1

Re: [VOTE] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-06-02 Thread Kent Yao
t;>> >>>>>>> +1 >>>>>>> >>>>>>> On Sun, Jun 1, 2025 at 7:50 PM Tathagata Das < >>>>>>> tathagata.das1...@gmail.com> wrote: >>>>>>> >>>>>>>> +1 (binding) >>>&

Re: [VOTE] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-06-02 Thread bo yang
1, 2025, at 9:47 PM, Xiao Li wrote: >>>>>>>> >>>>>>>> +1 >>>>>>>> >>>>>>>> On Sun, Jun 1, 2025 at 20:00, huaxin gao wrote: >>>>>>>> >>>>>>>>> +1 >

Re: [VOTE] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-06-02 Thread Sandy Ryza
: >>>>>> >>>>>>> +1 (binding) >>>>>>> super excited about this! >>>>>>> >>>>>>> On Sun, Jun 1, 2025 at 10:45 PM Yuanjian Li >>>>>>> wrote: >>>>>>

Re: [VOTE] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-06-02 Thread Chao Sun
2025 at 7:50 PM Tathagata Das < >>>>> tathagata.das1...@gmail.com> wrote: >>>>> >>>>>> +1 (binding) >>>>>> super excited about this! >>>>>> >>>>>> On Sun, Jun 1, 2025 at 10:45

Re: [VOTE] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-06-02 Thread Rozov, Vlad
.com>> wrote: +1 Dongjoon On Sun, Jun 1, 2025 at 12:02 L. C. Hsieh mailto:vii...@gmail.com>> wrote: Hi all, I would like to start a vote on the new real-time mode in Apache Spark Structured Streaming. Discussion thread: https://lists.apache.org/thread/ovmfbzfkc3t9odvv5gs7

Re: [VOTE] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-06-02 Thread Jungtaek Lim
ut this! >>>>> >>>>> On Sun, Jun 1, 2025 at 10:45 PM Yuanjian Li >>>>> wrote: >>>>> >>>>>> +1 >>>>>> >>>>>> On Sun, Jun 1, 2025 at 19:00 Dongjoon Hyun >>>>>> wr

Re: [VOTE] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-06-02 Thread Wenchen Fan
On Sun, Jun 1, 2025 at 19:00 Dongjoon Hyun >>>>> wrote: >>>>> >>>>>> +1 >>>>>> >>>>>> Dongjoon >>>>>> >>>>>> >>>>>> On Sun, Jun 1, 2025 at 12:02 L. C. Hsieh wrote

Re: [VOTE] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-06-02 Thread Peter Toth
wrote: >>> >>>> +1 >>>> >>>> On Sun, Jun 1, 2025 at 19:00 Dongjoon Hyun >>>> wrote: >>>> >>>>> +1 >>>>> >>>>> Dongjoon >>>>> >>>>>

Re: [VOTE] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-06-02 Thread xianjin
com> wrote:Hi all, I would like to start a vote on the new real-time mode in Apache Spark Structured Streaming. Discussion thread: https://lists.apache.org/thread/ovmfbzfkc3t9odvv5gs75fhpvdckn90f SPIP: https://docs.google.com/document/d/1CvJvtlTGP6TwQIT4kW6GFT1JbdziAYOBvt60ybb7Dw8/edit?tab=t

Re: [VOTE] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-06-02 Thread Mich Talebzadeh
h-d-5205b2/> On Sun, 1 Jun 2025 at 20:01, L. C. Hsieh wrote: > Hi all, > > I would like to start a vote on the new real-time mode in Apache Spark > Structured Streaming. > > Discussion thread: > https://lists.apache.org/thread/ovmfbzfkc3t9odvv5gs75fhpvdckn90f > S

Re: [VOTE] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-06-02 Thread Mark Hamstra
> >> >>>>>> >> +1 >>>>>> >> >>>>>> >> On Sun, Jun 1, 2025 at 7:50 PM Tathagata Das >>>>>> >> wrote: >>>>>> >>> >>>>>> >>> +1 (binding

Re: [VOTE] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-06-02 Thread Liu Cao
d about this! >>>>> >>> >>>>> >>> On Sun, Jun 1, 2025 at 10:45 PM Yuanjian Li < >>>>> xyliyuanj...@gmail.com> wrote: >>>>> >>>> >>>>> >>>> +1 >>>>> >>>> >

Re: [VOTE] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-06-01 Thread Anish Shrigondekar
+1 (binding) >>>> >>> super excited about this! >>>> >>> >>>> >>> On Sun, Jun 1, 2025 at 10:45 PM Yuanjian Li >>>> wrote: >>>> >>>> >>>> >>>> +1 >>>> >>>>

Re: [VOTE] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-06-01 Thread Sakthi
> >>> >>> +1 (binding) >>> >>> super excited about this! >>> >>> >>> >>> On Sun, Jun 1, 2025 at 10:45 PM Yuanjian Li >>> wrote: >>> >>>> >>> >>>> +1 >>> >>>> >>>

Re: [VOTE] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-06-01 Thread Gengliang Wang
> >>>> >> >>>> +1 >> >>>> >> >>>> On Sun, Jun 1, 2025 at 19:00 Dongjoon Hyun >> wrote: >> >>>>> >> >>>>> +1 >> >>>>> >> >

Re: [VOTE] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-06-01 Thread L. C. Hsieh
25 at 19:00 Dongjoon Hyun wrote: >>>>> >>>>> +1 >>>>> >>>>> Dongjoon >>>>> >>>>> >>>>> On Sun, Jun 1, 2025 at 12:02 L. C. Hsieh wrote: >>>>>> >>>

Re: [VOTE] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-06-01 Thread Denny Lee
> super excited about this! > >>> > >>> On Sun, Jun 1, 2025 at 10:45 PM Yuanjian Li > wrote: > >>>> > >>>> +1 > >>>> > >>>> On Sun, Jun 1, 2025 at 19:00 Dongjoon Hyun > wrote: > >>>>>

Re: [VOTE] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-06-01 Thread DB Tsai
real-time mode in Apache Spark Structured Streaming. Discussion thread: https://lists.apache.org/thread/ovmfbzfkc3t9odvv5gs75fhpvdckn90f SPIP: https://docs.google.com/document/d/1CvJvtlTGP6TwQIT4kW6GFT1JbdziAYOBvt60ybb7Dw8/edit?tab=t.0#heading=h.ulas5788cm9t JIRA: https://issues.apache.org/jira/browse/S

Re: [VOTE] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-06-01 Thread Xiao Li
gt;>> On Sun, Jun 1, 2025 at 19:00 Dongjoon Hyun >>> wrote: >>> >>>> +1 >>>> >>>> Dongjoon >>>> >>>> >>>> On Sun, Jun 1, 2025 at 12:02 L. C. Hsieh wrote: >>>> >>>&

Re: [VOTE] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-06-01 Thread huaxin gao
t;>> >>> Dongjoon >>> >>> >>> On Sun, Jun 1, 2025 at 12:02 L. C. Hsieh wrote: >>> >>>> Hi all, >>>> >>>> I would like to start a vote on the new real-time mode in Apache Spark >>>> Structured S

Re: [VOTE] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-06-01 Thread Tathagata Das
t;>> Hi all, >>> >>> I would like to start a vote on the new real-time mode in Apache Spark >>> Structured Streaming. >>> >>> Discussion thread: >>> https://lists.apache.org/thread/ovmfbzfkc3t9odvv5gs75fhpvdckn90f >>> SPIP: >&g

Re: [VOTE] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-06-01 Thread Yuanjian Li
+1 On Sun, Jun 1, 2025 at 19:00 Dongjoon Hyun wrote: > +1 > > Dongjoon > > > On Sun, Jun 1, 2025 at 12:02 L. C. Hsieh wrote: > >> Hi all, >> >> I would like to start a vote on the new real-time mode in Apache Spark >> Structured Streaming. >>

Re: [VOTE] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-06-01 Thread Dongjoon Hyun
+1 Dongjoon On Sun, Jun 1, 2025 at 12:02 L. C. Hsieh wrote: > Hi all, > > I would like to start a vote on the new real-time mode in Apache Spark > Structured Streaming. > > Discussion thread: > https://lists.apache.org/thread/ovmfbzfkc3t9odvv5gs75fhpvdckn90f > SPIP:

[VOTE] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-06-01 Thread L. C. Hsieh
Hi all, I would like to start a vote on the new real-time mode in Apache Spark Structured Streaming. Discussion thread: https://lists.apache.org/thread/ovmfbzfkc3t9odvv5gs75fhpvdckn90f SPIP: https://docs.google.com/document/d/1CvJvtlTGP6TwQIT4kW6GFT1JbdziAYOBvt60ybb7Dw8/edit?tab=t.0#heading

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-30 Thread Jungtaek Lim
> if I get the right answer too slowly it becomes useless or wrong. What >>>>>>>> I >>>>>>>> call the "Late and Correct is Useless" Principle >>>>>>>> >>>>>>>> In summary, "Real-time Mode" s

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-30 Thread Sakthi
>>>>>>>> if I get the right answer too slowly it becomes useless or wrong. What >>>>>>>> I >>>>>>>> call the "Late and Correct is Useless" Principle >>>>>>>> >>>>>>>> In su

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-30 Thread Jules Damji
>> +1. Super excited by this initiative! > >>>>> > >>>>> On Wed, May 28, 2025 at 1:54 PM Yanbo Liang <yblia...@gmail.com> > >>>>> wrote: > >>>>> > >>>>>> +1 > >>>>>> > >>

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-30 Thread Mich Talebzadeh
the right answer too slowly it becomes useless or wrong. What >>>>>>>> I >>>>>>>> call the "Late and Correct is Useless" Principle >>>>>>>> >>>>>>>> In summary, "Real-time Mode" seems to describe

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-30 Thread Mark Hamstra
ease of use, >>>>>>> leveraging established, battle-tested components. I invite the audience >>>>>>> to >>>>>>> have a discussion on this. >>>>>>> >>>>>>> HTH >>>>>>> >

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-30 Thread L. C. Hsieh
>>>> mich.talebza...@gmail.com> wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> My point about "in real time application or data, there is nothing >>>>>>> as an answer which is supposed

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-30 Thread Denny Lee
>>>> >>>>> On Wed, May 28, 2025 at 9:08 AM Mich Talebzadeh < >>>>> mich.talebza...@gmail.com> wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> My point about "in real time application or data,

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-30 Thread xianjin
comes useless or wrong" is actually fundamental to why we need this Spark Structured Streaming proposal. The proposal is precisely about enabling Spark to power applications where, as I define it, the timeliness of the answer is as critical as its correctness. Spark's current streaming engine

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-30 Thread Jerry Peng
The timeliness is part of the >>>>>>>> application. >>>>>>>> if I get the right answer too slowly it becomes useless or wrong. What >>>>>>>> I >>>>>>>> call the "Late and Correct is Useless" Principl

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-30 Thread Mich Talebzadeh
>>>> have a discussion on this. >>>>>>> >>>>>>> HTH >>>>>>> >>>>>>> Dr Mich Talebzadeh, >>>>>>> Architect | Data Science | Financial Crime | Forensic Analysis | GDPR >>>>>>

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-29 Thread Jerry Peng
>>>> >>>>>> >>>>>> >>>>>> On Thu, 29 May 2025 at 19:15, Yang Jie wrote: >>>>>> >>>>>>> +1 >>>>>>> >>>>>>> On 2025/05/29 16:25:19 Xiao Li wrote: >

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-29 Thread Mark Hamstra
> > > +1. >>>>>> > > >>>>>> > > On Thu, May 29, 2025 at 3:36 PM DB Tsai >>>>>> wrote: >>>>>> > > >>>>>> > >> +1 >>>>>> > >> Sent from my iPhone >>

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-29 Thread Jerry Peng
>> > >> On May 29, 2025, at 12:15 AM, John Zhuge >>>>> wrote: >>>>> > >> >>>>> > >> +1 Nice feature >>>>> > >> >>>>> > >> On Wed, May 28, 2025 at

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-29 Thread Jerry Peng
>> > >>>>> >> > >>>>>> +1 >> > >>>>>> >> > >>>>>> On Wed, May 28, 2025 at 12:34 PM huaxin gao < >> huaxin.ga...@gmail.com> >> > >>>>>> wrote: >> >

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-29 Thread Mark Hamstra
> wrote: >>>> > >> >>>> > >>> +1 >>>> > >>> >>>> > >>> On Wed, May 28, 2025 at 19:31, Kent Yao wrote: >>>> > >>> >>>> > >>>> +1, LGTM. >>>> > >>>

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-29 Thread Jerry Peng
> > >>>> +1, LGTM. >>> > >>>> >>> > >>>> Kent >>> > >>>> >>> > >>>> On Thursday, May 29, 2025, Chao Sun wrote: >>> > >>>> >>> > >>>>> +1. Super excited by this in

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-29 Thread Mark Hamstra
>>> > >> On Wed, May 28, 2025 at 9:53 PM Yuanjian Li < >>>>> xyliyuanj...@gmail.com> >>>>> > >> wrote: >>>>> > >> >>>>> > >>> +1 >>>>> > >>> >>>>> >

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-29 Thread Jerry Peng
28, 2025 at 1:54 PM Yanbo Liang >> > >>>>> wrote: >> > >>>>> >> > >>>>>> +1 >> > >>>>>> >> > >>>>>> On Wed, May 28, 2025 at 12:34 PM huaxin gao < >> huaxin.ga...@gmai

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-29 Thread Mich Talebzadeh
8, 2025 at 12:34 PM huaxin gao < > huaxin.ga...@gmail.com> > > >>>>>> wrote: > > >>>>>> > > >>>>>>> +1 > > >>>>>>> By unifying batch and low-latency streaming in Spark, we can > >

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-29 Thread Yang Jie
>>>>>>> > >>>>>>>>>> A stronger definition of real time. The engineering definition of > >>>>>>>>>> real time is roughly fast enough to be interactive > >>>>>>>>>> > >>

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-29 Thread Jerry Peng
, we can >>>>>>>> eliminate the need for separate streaming engines, reducing system >>>>>>>> complexity and operational cost. Excited to see this direction! >>>>>>>> >>>>>>>> On Wed, May 28, 2025 at 9:0

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-29 Thread Xiao Li
>>> complexity and operational cost. Excited to see this direction! >>>>>>> >>>>>>> On Wed, May 28, 2025 at 9:08 AM Mich Talebzadeh < >>>>>>> mich.talebza...@gmail.com> wrote: >>>>>>> >>>>

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-29 Thread Yuming Wang
alebza...@gmail.com> wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> My point about "in real time application or data, there is nothing >>>>>>> as an answer which is supposed to be late and correct.

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-29 Thread DB Tsai
nd correct. The timeliness is part of the application. if I get the right answer too slowly it becomes useless or wrong" is actually fundamental to why we need this Spark Structured Streaming proposal. The proposal is precisely about enabling Spark to power applications where, as I define it, the

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-29 Thread John Zhuge
>>>> mich.talebza...@gmail.com> wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> My point about "in real time application or data, there is nothing as >>>>>> an answer which is supposed to be late and correct

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-28 Thread Yuanjian Li
data, there is nothing as >>>>> an answer which is supposed to be late and correct. The timeliness is part >>>>> of the application. if I get the right answer too slowly it becomes >>>>> useless >>>>> or wrong" is actually

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-28 Thread Kent Yao
> mich.talebza...@gmail.com> wrote: >>> >>>> Hi, >>>> >>>> My point about "in real time application or data, there is nothing as >>>> an answer which is supposed to be late and correct. The timeliness is part >>>> of the

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-28 Thread Chao Sun
"in real time application or data, there is nothing as an >>> answer which is supposed to be late and correct. The timeliness is part of >>> the application. if I get the right answer too slowly it becomes useless or >>> wrong" is actually fundamental to *why* we need this

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-28 Thread Yanbo Liang
swer too slowly it becomes useless or >> wrong" is actually fundamental to *why* we need this Spark Structured >> Streaming proposal. >> >> The proposal is precisely about enabling Spark to power applications >> where, as I define it, the *timeliness* of the answer is as

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-28 Thread huaxin gao
"in real time application or data, there is nothing as an > answer which is supposed to be late and correct. The timeliness is part of > the application. if I get the right answer too slowly it becomes useless or > wrong" is actually fundamental to *why* we need this Spark Structured >

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-28 Thread Mich Talebzadeh
at the query should run in the new ultra >>> low-latency execution mode. A time interval can also be specified, e.g. >>> “300 Seconds”, to indicate how long each micro-batch should run for. >>> " >>> >>> will inevitably depend on many factors. Not tha

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-28 Thread Denny Lee
>> Architect | Data Science | Financial Crime | Forensic Analysis | GDPR >> >> view my Linkedin profile >> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/> >> >> >> >> >> >> On Wed, 28 May 2025 at 05:13, Jerry Peng >>

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-28 Thread Mich Talebzadeh
, 28 May 2025 at 05:13, Jerry Peng > wrote: > >> Hi all, >> >> I want to start a discussion thread for the SPIP titled “Real-Time Mode >> in Apache Spark Structured Streaming” that I've been working on with Siying >> Dong, Indrajit Roy, Chao Sun, Jungtaek Lim,

Re: [DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-28 Thread Mich Talebzadeh
is | GDPR view my Linkedin profile <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/> On Wed, 28 May 2025 at 05:13, Jerry Peng wrote: > Hi all, > > I want to start a discussion thread for the SPIP titled “Real-Time Mode in > Apache Spark Structured Streaming” that I

[DISCUSS] SPIP: Real-Time Mode in Apache Spark Structured Streaming

2025-05-27 Thread Jerry Peng
Hi all, I want to start a discussion thread for the SPIP titled “Real-Time Mode in Apache Spark Structured Streaming” that I've been working on with Siying Dong, Indrajit Roy, Chao Sun, Jungtaek Lim, and Michael Armbrust: [JIRA <https://issues.apache.org/jira/browse/SPARK-52330>]

Re: Building an Event-Driven Real-Time Data Processor with Spark Structured Streaming and API Integration

2024-02-09 Thread Mich Talebzadeh
explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction. On Fri, 9 Feb 2024 at 16:16, Mich Talebzadeh wrote: > Appreciate your thoughts on this, Personally I think Spark Structured > Streaming can be used effectively i

Building an Event-Driven Real-Time Data Processor with Spark Structured Streaming and API Integration

2024-02-09 Thread Mich Talebzadeh
I'd appreciate your thoughts on this. Personally, I think Spark Structured Streaming can be used effectively in an Event Driven Architecture as well as for continuous streaming. From the link here <https://www.linkedin.com/posts/activity-7161748945801617409-v29V?utm_source=share&

Re: Spark Structured Streaming and Flask REST API for Real-Time Data Ingestion and Analytics.

2024-01-09 Thread Mich Talebzadeh
Hi Ashok, Thanks for pointing out the databricks article Scalable Spark Structured Streaming for REST API Destinations | Databricks Blog <https://www.databricks.com/blog/scalable-spark-structured-streaming-rest-api-destinations> I browsed it and it is basically similar to many of us in

Re: Spark Structured Streaming and Flask REST API for Real-Time Data Ingestion and Analytics.

2024-01-08 Thread Mich Talebzadeh
destruction. On Mon, 8 Jan 2024 at 19:30, Mich Talebzadeh wrote: > Thought it might be useful to share my idea with fellow forum members. During > the breaks, I worked on the *seamless integration of Spark Structured > Streaming with Flask REST API for real-time data ingestion and analyt

Spark Structured Streaming and Flask REST API for Real-Time Data Ingestion and Analytics.

2024-01-08 Thread Mich Talebzadeh
Thought it might be useful to share my idea with fellow forum members. During the breaks, I worked on the *seamless integration of Spark Structured Streaming with Flask REST API for real-time data ingestion and analytics*. The use case revolves around a scenario where data is generated through

SPARK-24156: Kafka messages left behind in Spark Structured Streaming

2023-10-19 Thread Phillip Henry
Hi, folks, A few years ago, I asked about SSS not processing the final batch left on a Kafka topic when using groupBy, OutputMode.Append and withWatermark. At the time, Jungtaek Lim kindly pointed out (27/7/20) that this was expected behaviour, that (if I have this correct) a message needs to arr

Re: SPIP: Shutting down spark structured streaming when the streaming process completed current process

2023-02-19 Thread Mich Talebzadeh
t due to my on-going > personal stuff. I'll adjust the JIRA first. > > Thanks, > Dongjoon. > > > On Sat, Feb 18, 2023 at 10:51 AM Mich Talebzadeh < > mich.talebza...@gmail.com> wrote: > >> https://issues.apache.org/jira/browse/SPARK-42485 >> >> >

Re: SPIP: Shutting down spark structured streaming when the streaming process completed current process

2023-02-18 Thread Holden Karau
m> wrote: > >> https://issues.apache.org/jira/browse/SPARK-42485 >> >> >> Spark Structured Streaming is a very useful tool in dealing with Event >> Driven Architecture. In an Event Driven Architecture, there is generally a >> main loop that listens for e

Re: SPIP: Shutting down spark structured streaming when the streaming process completed current process

2023-02-18 Thread Dongjoon Hyun
ongjoon. On Sat, Feb 18, 2023 at 10:51 AM Mich Talebzadeh wrote: > https://issues.apache.org/jira/browse/SPARK-42485 > > > Spark Structured Streaming is a very useful tool in dealing with Event > Driven Architecture. In an Event Driven Architecture, there is generally a > main l

SPIP: Shutting down spark structured streaming when the streaming process completed current process

2023-02-18 Thread Mich Talebzadeh
https://issues.apache.org/jira/browse/SPARK-42485 Spark Structured Streaming is a very useful tool in dealing with Event Driven Architecture. In an Event Driven Architecture, there is generally a main loop that listens for events and then triggers a call-back function when one of those events is

Fwd: Shutting down spark structured streaming when the streaming process completed current process

2023-02-07 Thread Mich Talebzadeh
ill in no case be liable for any monetary damages arising from such loss, damage or destruction. -- Forwarded message - From: Mich Talebzadeh Date: Fri, 23 Apr 2021 at 10:36 Subject: Shutting down spark structured streaming when the streaming process completed current process To: u

Re: How to gracefully shutdown Spark Structured Streaming

2021-04-26 Thread Mich Talebzadeh
Spark Structured Streaming AKA SSS is a very useful tool in dealing with Event Driven Architecture. In an Event Driven Architecture, there is generally a main loop that listens for events and then triggers a call-back function when one of those events is detected. In a streaming application the

How to gracefully shutdown Spark Structured Streaming

2021-04-26 Thread Mich Talebzadeh
Hi, Apologies. I just want to ensure that my subscription to dev@spark.apache.org works OK. Regards, Mich view my Linkedin profile *Disclaimer:* Use it at your own risk. Any and all responsibility for any loss, damage or destru

Re: Checkpointing in Spark Structured Streaming

2021-03-22 Thread Jungtaek Lim
be able to prevent concurrent streaming queries trying to >> update to the same checkpoint which might possibly mess up the checkpoint. >> You need to make sure there's only one streaming query running against a >> specific checkpoint. >> >> 1. https://issues.apache.o

Re: Checkpointing in Spark Structured Streaming

2021-03-22 Thread Rohit Agrawal
at 1:55 AM Rohit Agrawal wrote: > >> Hi, >> >> I have been experimenting with the Continuous mode and the Micro batch >> mode in Spark Structured Streaming. When enabling checkpoint to S3 instead >> of the local File System we see that Continuous mode has no chang

Re: Checkpointing in Spark Structured Streaming

2021-03-22 Thread Jungtaek Lim
issues.apache.org/jira/browse/SPARK-34383 On Tue, Mar 23, 2021 at 1:55 AM Rohit Agrawal wrote: > Hi, > > I have been experimenting with the Continuous mode and the Micro batch > mode in Spark Structured Streaming. When enabling checkpoint to S3 instead > of the local File System we s

Checkpointing in Spark Structured Streaming

2021-03-22 Thread Rohit Agrawal
Hi, I have been experimenting with the Continuous mode and the Micro batch mode in Spark Structured Streaming. When enabling checkpoint to S3 instead of the local File System we see that Continuous mode has no change in latency (expected due to async checkpointing) however the Micro-batch mode

Re: [Spark Structured Streaming on K8S]: Debug - File handles/descriptor (unix pipe) leaking

2018-07-23 Thread Abhishek Tripathi
Hello Dev! A Spark Structured Streaming job with a simple window aggregation is leaking file descriptors when Kubernetes is the cluster manager. It seems to be a bug. I am using HDFS as the file system for checkpointing. Has anyone observed the same? Thanks for any help. Please find more details in the trailing email. For

Re: Kafka Spark structured streaming latency benchmark.

2017-01-02 Thread Prashant Sharma
This issue was fixed in https://issues.apache.org/jira/browse/SPARK-18991. --Prashant On Tue, Dec 20, 2016 at 6:16 PM, Prashant Sharma wrote: > Hi Shixiong, > > Thanks for taking a look, I am trying to run and see if making > ContextCleaner run more frequently and/or making it non blocking wil

Re: Kafka Spark structured streaming latency benchmark.

2016-12-20 Thread Prashant Sharma
Hi Shixiong, Thanks for taking a look, I am trying to run and see if making ContextCleaner run more frequently and/or making it non blocking will help. --Prashant On Tue, Dec 20, 2016 at 4:05 AM, Shixiong(Ryan) Zhu wrote: > Hey Prashant. Thanks for your codes. I did some investigation and it

Re: Kafka Spark structured streaming latency benchmark.

2016-12-20 Thread Jacek Laskowski
Hi, (what a timing. Just reviewed CC yesterday!) In ALS they trigger cleaning up shufflemapstages themselves so if I understood the issue the streaming part could do it too. Jacek On 19 Dec 2016 11:35 p.m., "Shixiong(Ryan) Zhu" wrote: > Hey Prashant. Thanks for your codes. I did some investig

Re: Kafka Spark structured streaming latency benchmark.

2016-12-19 Thread Shixiong(Ryan) Zhu
Hey Prashant. Thanks for your codes. I did some investigation and it turned out that ContextCleaner is too slow and its "referenceQueue" keeps growing. My hunch is cleaning broadcast is very slow since it's a blocking call. On Mon, Dec 19, 2016 at 12:50 PM, Shixiong(Ryan) Zhu < shixi...@databricks

Re: Kafka Spark structured streaming latency benchmark.

2016-12-19 Thread Shixiong(Ryan) Zhu
Hey, Prashant. Could you track the GC root of byte arrays in the heap? On Sat, Dec 17, 2016 at 10:04 PM, Prashant Sharma wrote: > Furthermore, I ran the same thing with 26 GB as the memory, which would > mean 1.3GB per thread of memory. My jmap >

Re: Kafka Spark structured streaming latency benchmark.

2016-12-17 Thread Prashant Sharma
Furthermore, I ran the same thing with 26 GB as the memory, which would mean 1.3GB per thread of memory. My jmap results and jstat results

Re: Spark structured streaming

2016-03-08 Thread Michael Armbrust
> > > - > > "Courage doesn't always roar. Sometimes courage is the quiet voice at the > > end of the day saying I will try again" > > > > > > > >

Re: Spark structured streaming

2016-03-08 Thread Jacek Laskowski
wski > To:Praveen Devarao/India/IBM@IBMIN > Cc: user , dev > Date:08/03/2016 04:17 pm > Subject:Re: Spark structured streaming > > > > > Hi Praveen, > > I've spent few hours on the changes rel

Re: Spark structured streaming

2016-03-08 Thread Praveen Devarao
try again" From: Jacek Laskowski To: Praveen Devarao/India/IBM@IBMIN Cc: user , dev Date: 08/03/2016 04:17 pm Subject: Re: Spark structured streaming Hi Praveen, I've spent few hours on the changes related to streaming dataframes (included in the SPARK-8360) and

Re: Spark structured streaming

2016-03-08 Thread Jacek Laskowski
Hi Praveen, I've spent a few hours on the changes related to streaming DataFrames (included in SPARK-8360) and concluded that it's currently only possible to read.stream(), but not write.stream(), since there are no streaming Sinks yet. Regards, Jacek Laskowski https://medium.com/@jacekl

Spark structured streaming

2016-03-08 Thread Praveen Devarao
Hi, I would like to get my hands on the structured streaming feature coming out in Spark 2.0. I have tried looking around for code samples to get started but am not able to find any. The only things I could look into are the test cases that have been committed under the JIRA umbrella ht