Re: [DISCUSSION] SPIP: Asynchronous Offset Management in Structured Streaming

Mridul Muralidharan Wed, 30 Nov 2022 06:57:50 -0800

Thanks for all the clarifications and details Jerry, Jungtaek :-)
This looks like an exciting improvement to Structured Streaming - looking
forward to it becoming part of Apache Spark !


Regards,
Mridul


On Mon, Nov 28, 2022 at 8:40 PM Jerry Peng <jerry.boyang.p...@gmail.com>
wrote:

> Hi all,
>
> I will add my two cents.  Improving the microbatch execution engine does
> not prevent us from working on or improving the continuous execution engine
> in the future.  These are orthogonal issues.  The new mode I am proposing
> for the microbatch execution engine is intended to lower the latency of the
> execution engine that most people use today.  We can view it as an
> incremental improvement on the existing engine. I see the continuous
> execution engine as a partially completed rewrite of Spark Streaming that
> may serve as the "future" engine powering Spark Streaming.  Improving the
> "current" engine does not mean we cannot work on a "future" engine.  The two
> are not mutually exclusive. I would like to focus the discussion on the
> merits of this feature with regard to the current micro-batch execution
> engine, and not on the future of the continuous execution engine.
>
> Best,
>
> Jerry
>
>
> On Wed, Nov 23, 2022 at 3:17 AM Jungtaek Lim <kabhwan.opensou...@gmail.com>
> wrote:
>
>> Hi Mridul,
>>
>> I'd like to be clear to avoid any misunderstanding - the decision was
>> not led by me. (I'm just one of the engineers on the team. Not even TL.) As
>> you can see from the direction, there was an internal consensus to not
>> revisit the continuous mode. There are various reasons, which I think we
>> know already. You seem to remember that I have raised concerns about
>> continuous mode, but did you notice that was over 2 years ago? I still see
>> no traction around the project. The main reason I dropped the discussion
>> was the promised effort to integrate push-based shuffle into continuous
>> mode to support shuffle, but no effort has been made so far.
>>
>> The goal of this SPIP is to have an alternative approach for dealing with
>> the same workload, given that we no longer have confidence in the success
>> of continuous mode. But I also want to make clear that deprecating and
>> eventually retiring continuous mode is not a goal of this project. If that
>> happens eventually, that would be a side-effect. Someone may have concerns
>> that we have two different projects aiming for a similar thing, but I'd
>> rather see both projects competing. Anyone willing to improve
>> continuous mode can start making the effort right now. This SPIP does not
>> block it.
>>
>>
>> On Wed, Nov 23, 2022 at 5:29 PM Mridul Muralidharan <mri...@gmail.com>
>> wrote:
>>
>>>
>>> Hi Jungtaek,
>>>
>>>   Given that the goal of the SPIP is reducing latency for stateless apps,
>>> which should reasonably fit continuous mode design goals, it feels odd to
>>> not support it in the proposal.
>>>
>>> I know you have raised concerns about continuous mode in the past as well
>>> on the dev@ list, and we are further ignoring it in this proposal (and
>>> possibly in other enhancements in the past few releases).
>>>
>>> Do you want to revisit the discussion to support it and propose a vote
>>> on that ? And move it to deprecated ?
>>>
>>> I would be much more comfortable with this SPIP not supporting CM if CM
>>> were deprecated.
>>>
>>> Thoughts ?
>>>
>>> Regards,
>>> Mridul
>>>
>>>
>>>
>>>
>>> On Wed, Nov 23, 2022 at 1:16 AM Jerry Peng <jerry.boyang.p...@gmail.com>
>>> wrote:
>>>
>>>> Jungtaek,
>>>>
>>>> Thanks for taking up the role to shepherd this SPIP!  Thank you also for
>>>> chiming in with your thoughts concerning the continuous mode!
>>>>
>>>> Best,
>>>>
>>>> Jerry
>>>>
>>>> On Tue, Nov 22, 2022 at 5:57 PM Jungtaek Lim <
>>>> kabhwan.opensou...@gmail.com> wrote:
>>>>
>>>>> Just FYI, I'm shepherding this SPIP project.
>>>>>
>>>>> I think the major meta question would be, "why don't we spend
>>>>> effort on continuous mode rather than initiating another feature aiming
>>>>> for the same workload?". Jerry already updated the doc to answer the
>>>>> question, but I can also share my thoughts about it.
>>>>>
>>>>> I feel like the current "continuous mode" is a niche solution. (It's
>>>>> not to blame. If you have to deal with such a workload but can't rewrite
>>>>> the underlying engine from scratch, then there are really few options.)
>>>>> Since the implementation relies on workarounds for things the
>>>>> architecture does not support natively, e.g. distributed snapshots, it
>>>>> gets quite tricky to maintain and expand the project. It also requires
>>>>> 3rd parties to implement a separate source and sink implementation, and
>>>>> I'm not sure how many 3rd parties have actually done so.
>>>>>
>>>>> Eventually, "continuous mode" became an area whose details no one in
>>>>> the active community knows or is willing to maintain. I wouldn't say
>>>>> we are confident enough to remove the "experimental" tag, although the
>>>>> feature has been shipped for years. It was introduced in Spark 2.3;
>>>>> surprising enough?
>>>>>
>>>>> We went back and thought about the approach from scratch. Jerry came
>>>>> up with an idea that leverages the existing microbatch execution, so it
>>>>> is relatively stable and does not require 3rd parties to support another
>>>>> mode. It adds complexity to microbatch execution, but it's a lot less
>>>>> complicated than the existing continuous mode, and definitely much less
>>>>> so than creating a new record-to-record engine from scratch.
>>>>>
>>>>> That said, we want to propose and move forward with the new approach.
>>>>>
>>>>> ps. Eventually we could probably discuss retiring continuous mode if
>>>>> the new approach gets accepted and eventually considered as a stable one
>>>>> after several minor releases. That's just me.
>>>>>
>>>>> On Wed, Nov 23, 2022 at 5:16 AM Jerry Peng <
>>>>> jerry.boyang.p...@gmail.com> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I would like to start the discussion for a SPIP, Asynchronous Offset
>>>>>> Management in Structured Streaming.  The high level summary of the SPIP
>>>>>> is that currently in Structured Streaming we perform a couple of offset
>>>>>> management operations for progress tracking purposes synchronously on the
>>>>>> critical path which can contribute significantly to processing latency.
>>>>>> If we were to make these operations asynchronous and less frequent we can
>>>>>> dramatically improve latency for certain types of workloads.
>>>>>>
>>>>>> I have put together a SPIP to implement such a mechanism.  Please
>>>>>> take a look!
>>>>>>
>>>>>> SPIP Jira: https://issues.apache.org/jira/browse/SPARK-39591
>>>>>>
>>>>>> SPIP doc:
>>>>>> https://docs.google.com/document/d/1iPiI4YoGCM0i61pBjkxcggU57gHKf2jVwD7HWMHgH-Y/edit?usp=sharing
>>>>>>
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Jerry
>>>>>>
>>>>>
