Thanks Yuanjian for your support!

I've left a comment, but to repeat it here: I agree with your point. It's
really hard for a new feature to be stable from its initial version, and
we might need to break backward compatibility for (semantic) bug
fixes/improvements. Maybe we could mark the data source as
incubating/experimental and wait for a couple of minor releases to see
whether the options/behaviors can be finalized.

On Wed, Oct 18, 2023 at 4:24 PM Yuanjian Li <xyliyuanj...@gmail.com> wrote:

> +1, I have no issues with the practicality and value of this feature
> itself.
> I've left some comments concerning ongoing maintenance and
> compatibility-related matters, which we can continue to discuss.
>
> On Tue, Oct 17, 2023 at 5:23 AM Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
>
>> Thanks Bartosz and Anish for your support!
>>
>> I'll wait a couple more days to see whether we hear more voices on
>> this. We could probably start a VOTE thread if there are no
>> objections.
>>
>> On Tue, Oct 17, 2023 at 5:48 AM Anish Shrigondekar <
>> anish.shrigonde...@databricks.com> wrote:
>>
>>> Hi Jungtaek,
>>>
>>> Thanks for putting this together. +1 from me and looks good overall.
>>> Posted some minor comments/questions to the doc.
>>>
>>> Thanks,
>>> Anish
>>>
>>> On Mon, Oct 16, 2023 at 11:25 AM Bartosz Konieczny <
>>> bartkoniec...@gmail.com> wrote:
>>>
>>>> Thank you, Jungtaek, for your answers! It's clear now.
>>>>
>>>> +1 from me. It seems like a prerequisite for further ops-related
>>>> improvements to state store management. I'm thinking especially of
>>>> state rebalancing that could rely on this read+write state store API. I
>>>> don't mean dynamic state rebalancing, which could probably be
>>>> implemented with lower latency directly in the stateful API. Instead, I'm
>>>> thinking of an offline job that rebalances the state, after which the
>>>> stateful pipeline is restarted with the changed number of shuffle
>>>> partitions.
>>>>
>>>> Best,
>>>> Bartosz.
>>>>
>>>> On Mon, Oct 16, 2023 at 6:19 PM Jungtaek Lim <
>>>> kabhwan.opensou...@gmail.com> wrote:
>>>>
>>>>> bump for better reach
>>>>>
>>>>> On Thu, Oct 12, 2023 at 4:26 PM Jungtaek Lim <
>>>>> kabhwan.opensou...@gmail.com> wrote:
>>>>>
>>>>>> Sorry, please use this link instead for SPIP doc:
>>>>>> https://docs.google.com/document/d/1_iVf_CIu2RZd3yWWF6KoRNlBiz5NbSIK0yThqG0EvPY/edit?usp=sharing
>>>>>>
>>>>>>
>>>>>> On Thu, Oct 12, 2023 at 3:58 PM Jungtaek Lim <
>>>>>> kabhwan.opensou...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi dev,
>>>>>>>
>>>>>>> I'd like to start a discussion on "State Data Source - Reader".
>>>>>>>
>>>>>>> This proposal aims to introduce a new data source "statestore" which
>>>>>>> enables reading the state rows from an existing checkpoint via an
>>>>>>> offline (batch) query. This will enable users to 1) create unit tests
>>>>>>> against a stateful query that verify the state value (especially
>>>>>>> flatMapGroupsWithState), and 2) gather more context on the status when
>>>>>>> an incident occurs, especially for incorrect output.
>>>>>>>
>>>>>>> *SPIP*:
>>>>>>> https://docs.google.com/document/d/1HjEupRv8TRFeULtJuxRq_tEG1Wq-9UNu-ctGgCYRke0/edit?usp=sharing
>>>>>>> *JIRA*: https://issues.apache.org/jira/browse/SPARK-45511
>>>>>>>
>>>>>>> Looking forward to your feedback!
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Jungtaek Lim (HeartSaVioR)
>>>>>>>
>>>>>>> ps. The scope of the project is narrowed to the reader in this SPIP,
>>>>>>> since the writer requires us to consider more cases. We are planning
>>>>>>> on it.
>>>>>>>
>>>>>>
>>>>
>>>> --
>>>> Bartosz Konieczny
>>>> freelance data engineer
>>>> https://www.waitingforcode.com
>>>> https://github.com/bartosz25/
>>>> https://twitter.com/waitingforcode
>>>>
>>>>
