Re: [DISCUSS] SPIP: State Data Source - Reader

Jungtaek Lim Wed, 18 Oct 2023 19:40:27 -0700

I also want to repeat here the comment Liang-Chi left on the SPIP doc, since
it is a general and common question for any new data source. I'd like to
address it for everyone.

> As far as I know, the author implemented a third-party tool for querying
> the state store as a data source a long time ago. I've suggested that some
> users use the tool before. It is a useful tool for special cases because
> there is no other tool/feature for the purpose.
> I think for such an effort to add a new data source, one usual question is
> why it has to be in the Spark repo instead of a third-party tool,
> especially since this is not a frequently used one. Even for Structured
> Streaming users, only in rare cases is it necessary to look into the state
> store content.

I don't think we expect the data source to be used rarely. We see two major
use cases: 1) unit tests against stateful queries, and 2) looking into the
state during an incident to get the full context. 2) is probably not
something users run into often, so for that case it is fair to say the new
feature may not be used frequently. But 1) is definitely tied to daily work.
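
For 1), a rough sketch of how a test helper might read the state back. It
assumes the source is registered under the short name "statestore" (the name
in the proposal) and takes the checkpoint location as the load path. The
helper name `read_state` is hypothetical, and the final API may differ.

```python
def read_state(spark, checkpoint_location):
    """Return the state rows under checkpoint_location as a batch DataFrame.

    `spark` is an existing SparkSession. The format name "statestore" comes
    from the proposal; passing the checkpoint location to load() is an
    assumption made for this sketch.
    """
    return spark.read.format("statestore").load(checkpoint_location)
```

A test would run the streaming query against a temporary checkpoint
directory, stop it, and then assert on the rows returned by
`read_state(spark, checkpoint_dir)`.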

Even for 2), this looks like an essential feature that has to be provided
out of the box. Suppose this feature does not exist and a user hits an
incident in production with a stateful query. During RCA, they realize that
the state is a black box and that their only option is to deduce the state
values indirectly, most likely by modifying the query heavily and feeding it
artificial inputs. If I were that user, I would consider this gap a
fundamental issue of SS. Flink has provided this out of the box for years
(State Processor API), so it also matters from a competitive standpoint.

We see this effort as a stepping stone. As the comments on the SPIP doc and
the earlier replies show, people also see the proposal as groundwork for the
writer part, which would give us a chance to lift the long-standing
restriction that the number of shuffle partitions is fixed for a stateful
query. I'd argue that this is a rather fundamental limitation of SS, and I
have seen many complaints about it. I don't think it is right to delegate
solving such a fundamental issue to a 3rd party. This is probably a stronger
argument than the reader part itself.

Here's another aspect: during the work, we found gaps in what the checkpoint
records. For example, the prefix scan information is not stored in the
checkpoint, and that makes a big difference when restoring the state from
the state files. When it comes to state repartitioning, the repartitioning
is based on the grouping keys in the operator (not the state key), so we
will need additional information for that as well. If this feature lives in
a 3rd party project, it will be very painful to make the changes on both
sides together. It also brings another headache: versioning and a
compatibility matrix.
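
To make the grouping key vs. state key point concrete, here is a minimal
sketch of offline repartitioning. It is not Spark's implementation: CRC32
stands in for Spark's own hash partitioning, and the names `partition_for`,
`repartition_state`, and `grouping_key_of` are hypothetical. The state key
is assumed to carry extra fields beyond the grouping key, and how to extract
the grouping key is exactly the kind of additional information the
checkpoint would need to provide.

```python
import zlib
from collections import defaultdict


def partition_for(grouping_key, num_partitions):
    """Map a grouping key to a partition id in [0, num_partitions).

    CRC32 is a stand-in chosen only because it is deterministic and
    dependency-free; Spark uses its own hash partitioning.
    """
    digest = zlib.crc32(repr(grouping_key).encode("utf-8"))
    return digest % num_partitions


def repartition_state(state_rows, grouping_key_of, new_num_partitions):
    """Regroup (state_key, value) rows by the hash of their grouping key.

    grouping_key_of extracts the operator's grouping key from a state key.
    Partitioning on the state key itself would scatter rows that belong to
    the same group.
    """
    partitions = defaultdict(list)
    for state_key, value in state_rows:
        grouping_key = grouping_key_of(state_key)
        target = partition_for(grouping_key, new_num_partitions)
        partitions[target].append((state_key, value))
    return dict(partitions)
```

Rows that share a grouping key end up in the same new partition even when
their state keys differ, which is what the operator needs after a restart
with a different number of shuffle partitions.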

I hope this helps persuade people that this should be added to the Spark
repo rather than living on its own.

On Thu, Oct 19, 2023 at 11:08 AM Jungtaek Lim <[email protected]>
wrote:

> Thanks Raghu for your support!
>
> Btw, I'd like to relay the support expressed on the JIRA ticket itself: I
> see support from Chaoqin and Praveen. Thanks both!
>
>
>
> On Thu, Oct 19, 2023 at 5:56 AM Raghu Angadi <[email protected]>
> wrote:
>
>> +1 overall and a big +1 to keeping offline state-rebalancing as a primary
>> use case.
>>
>> Raghu.
>>
>> On Mon, Oct 16, 2023 at 11:25 AM Bartosz Konieczny <
>> [email protected]> wrote:
>>
>>> Thank you, Jungtaek, for your answers! It's clear now.
>>>
>>> +1 for me. It seems like a prerequisite for further ops-related
>>> improvements to state store management. I'm thinking especially of
>>> state rebalancing, which could rely on this read+write state store API.
>>> I don't mean dynamic state rebalancing, which could probably be
>>> implemented with lower latency directly in the stateful API. Instead,
>>> I'm thinking of an offline job that rebalances the state, after which
>>> the stateful pipeline is restarted with the changed number of shuffle
>>> partitions.
>>>
>>> Best,
>>> Bartosz.
>>>
>>> On Mon, Oct 16, 2023 at 6:19 PM Jungtaek Lim <
>>> [email protected]> wrote:
>>>
>>>> bump for better reach
>>>>
>>>> On Thu, Oct 12, 2023 at 4:26 PM Jungtaek Lim <
>>>> [email protected]> wrote:
>>>>
>>>>> Sorry, please use this link instead for SPIP doc:
>>>>> https://docs.google.com/document/d/1_iVf_CIu2RZd3yWWF6KoRNlBiz5NbSIK0yThqG0EvPY/edit?usp=sharing
>>>>>
>>>>>
>>>>> On Thu, Oct 12, 2023 at 3:58 PM Jungtaek Lim <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Hi dev,
>>>>>>
>>>>>> I'd like to start a discussion on "State Data Source - Reader".
>>>>>>
>>>>>> This proposal aims to introduce a new data source "statestore" which
>>>>>> enables reading the state rows from an existing checkpoint via an
>>>>>> offline (batch) query. This will enable users to 1) create unit tests
>>>>>> against stateful queries that verify the state value (especially
>>>>>> flatMapGroupsWithState), and 2) gather more context on the status
>>>>>> when an incident occurs, especially for incorrect output.
>>>>>>
>>>>>> *SPIP*:
>>>>>> https://docs.google.com/document/d/1HjEupRv8TRFeULtJuxRq_tEG1Wq-9UNu-ctGgCYRke0/edit?usp=sharing
>>>>>> *JIRA*: https://issues.apache.org/jira/browse/SPARK-45511
>>>>>>
>>>>>> Looking forward to your feedback!
>>>>>>
>>>>>> Thanks,
>>>>>> Jungtaek Lim (HeartSaVioR)
>>>>>>
>>>>>> ps. The scope of the project is narrowed to the reader in this SPIP,
>>>>>> since the writer requires us to consider more cases. We are planning
>>>>>> on it.
>>>>>>
>>>>>
>>>
>>> --
>>> Bartosz Konieczny
>>> freelance data engineer
>>> https://www.waitingforcode.com
>>> https://github.com/bartosz25/
>>> https://twitter.com/waitingforcode
>>>
>>>
