Re: [DISCUSS] SPIP: Structured Streaming - Arbitrary State API v2

2024-01-09 Thread Jungtaek Lim
Friendly reminder, VOTE thread is now live!
https://lists.apache.org/thread/16ryx828bwoth31hobknxnjfxjxj07mf
Votes cast in this thread are not counted, so please make sure you vote in the
VOTE thread. Thanks!



Re: [DISCUSS] SPIP: Structured Streaming - Arbitrary State API v2

2024-01-08 Thread Jungtaek Lim
Thanks everyone for the feedback!

Given that we have received positive feedback without major concerns, I will
initiate the vote thread soon. Please vote in that thread as well.

Thanks again!



Re: [DISCUSS] SPIP: Structured Streaming - Arbitrary State API v2

2024-01-08 Thread Bhuwan Sahni
+1 on the newer APIs. I believe these APIs provide a much more powerful
mechanism for the user to perform arbitrary state management in Structured
Streaming queries.

Thanks
Bhuwan Sahni


-- 

*Bhuwan Sahni*
Staff Software Engineer

bhuwan.sa...@databricks.com
500 108th Ave. NE
Bellevue, WA 98004
USA


Re: [DISCUSS] SPIP: Structured Streaming - Arbitrary State API v2

2024-01-08 Thread L. C. Hsieh
+1

I left some comments in the SPIP doc and got replies quickly. The new
API looks good and more comprehensive. I think it will help Spark
Structured Streaming to be more useful in more complicated streaming
use cases.


-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: [DISCUSS] SPIP: Structured Streaming - Arbitrary State API v2

2024-01-05 Thread Burak Yavuz
I'm also a +1 on the newer APIs. We learned a lot from using
flatMapGroupsWithState, and I believe that we can make the APIs a lot easier
to use.



Re: [DISCUSS] SPIP: Structured Streaming - Arbitrary State API v2

2024-01-05 Thread Shixiong Zhu
+1. Looking forward to seeing how the new API brings in new streaming use
cases!

Best Regards,
Shixiong Zhu




Re: [DISCUSS] SPIP: Structured Streaming - Arbitrary State API v2

2023-11-29 Thread Anish Shrigondekar
Hi dev,

I have addressed the comments that Jungtaek had on the doc. Bumping the thread
once again to see if other folks have any feedback on the proposal.

Thanks,
Anish



Re: [DISCUSS] SPIP: Structured Streaming - Arbitrary State API v2

2023-11-27 Thread Jungtaek Lim
Bumping this thread for better reach after the long holiday. Please review
the proposal, which opens up the chance to address complex streaming use
cases. Thanks!



Re: [DISCUSS] SPIP: Structured Streaming - Arbitrary State API v2

2023-11-22 Thread Jungtaek Lim
Thanks Anish for proposing the SPIP and initiating this thread! I believe this
SPIP will help with a bunch of complex streaming use cases.

dev@: We happen to be starting this discussion during the Thanksgiving
holidays. We understand people in the US may not have time to review the
SPIP, and we plan to bump this thread early next week. We are open to any
feedback from outside the US during the holiday. We can either address all
feedback together after the holiday (Anish is in the US), or I can answer if
the feedback is more of a question. Thanks!



[DISCUSS] SPIP: Structured Streaming - Arbitrary State API v2

2023-11-22 Thread Anish Shrigondekar
Hi dev,

I would like to start a discussion on "Structured Streaming - Arbitrary
State API v2". This proposal aims to address a bunch of limitations we see
today using the mapGroupsWithState/flatMapGroupsWithState operators. The
detailed set of limitations is described in the SPIP doc.

We propose to support various features such as multiple state variables
(flexible data modeling), composite types, enhanced timer functionality,
support for chaining operators after the new operator, handling initial
state along with the state data source, schema evolution, etc. This will
allow users to write more powerful streaming state management logic
primarily used in operational use cases. Other built-in stateful operators
could also benefit from such changes in the future.

JIRA: https://issues.apache.org/jira/browse/SPARK-45939
SPIP:
https://docs.google.com/document/d/1QtC5qd4WQEia9kl1Qv74WE0TiXYy3x6zeTykygwPWig/edit?usp=sharing
Design Doc:
https://docs.google.com/document/d/1QjZmNZ-fHBeeCYKninySDIoOEWfX6EmqXs2lK097u9o/edit?usp=sharing

cc - @Jungtaek Lim, who has graciously
agreed to be the shepherd for this project.

Looking forward to your feedback!

Thanks,
Anish