Hello,
Our objective is to maintain a persistent, time-relevant state per key. Currently, 
we use non-overlapping windows and apply GroupByKey.create() to collect each 
key's elements for a window. We then sort those elements by timestamp and 
iterate over them in order to update the state associated with that key.
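
To make the per-window step concrete, here is a minimal sketch in plain Java 
with no Beam dependency. It is not our production code: the Event and State 
classes, their fields, and the applyWindow method are hypothetical names chosen 
for illustration, and "keep the latest value and a count" is only an assumed 
stand-in for the real update rule.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class WindowedStateSketch {

    /** Hypothetical element type: an event timestamp (epoch millis) and a value. */
    static final class Event {
        final long timestampMillis;
        final long value;

        Event(long timestampMillis, long value) {
            this.timestampMillis = timestampMillis;
            this.value = value;
        }
    }

    /** Hypothetical per-key state carried from one window to the next. */
    static final class State {
        final long latestValue;
        final long lastTimestampMillis;
        final int eventCount;

        State(long latestValue, long lastTimestampMillis, int eventCount) {
            this.latestValue = latestValue;
            this.lastTimestampMillis = lastTimestampMillis;
            this.eventCount = eventCount;
        }
    }

    /**
     * Applies one window's grouped elements for a single key to the state
     * left behind by the previous window. The input is copied and sorted by
     * timestamp, because grouped values arrive in no particular order.
     */
    static State applyWindow(State previous, Iterable<Event> windowEvents) {
        List<Event> sorted = new ArrayList<>();
        for (Event e : windowEvents) {
            sorted.add(e);
        }
        sorted.sort(Comparator.comparingLong((Event e) -> e.timestampMillis));

        State state = previous;
        for (Event e : sorted) {
            // Assumed update rule: remember the most recent event and count all events.
            state = new State(e.value, e.timestampMillis, state.eventCount + 1);
        }
        return state;
    }
}
```

In the real pipeline, the previous state passed to such a method is read from 
Bigtable and the returned state is written back once per key and window.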

At the moment, we rely on external storage (Bigtable) to persist state across 
windows. However, the I/O overhead of reading from and writing to Bigtable has 
become significant. As a result, we're exploring the possibility of using 
stateful processing to manage this state more efficiently.

Best,

Shaochen


> On 5 May 2025, at 17:46, Kenneth Knowles <k...@apache.org> wrote:
> 
> Hello!
> 
> This is not possible in a simple way, for one main reason: windows are 
> processed simultaneously.
> 
> Many windows may have some state and incoming data at the same time, even if 
> the time ranges of your windows do not overlap. So, sharing state across 
> windows would need concurrency control (potentially distributed concurrency 
> control) or it would need to wait for all data to arrive and then sort it. 
> Beam state is designed to avoid this, so that it can scale up efficiently.
> 
> If you want to have some running accumulator that processes some data in 
> order, there may be a way to express it. For example, there is a 
> @RequiresTimeSortedInput annotation that can sometimes handle it.
> 
> Can you share more about your use case?
> 
> Kenn
> 
> On Fri, May 2, 2025 at 9:03 AM Shaochen Bai <shaoc...@kisi.io> wrote:
>> Hello,
>> 
>> I read that in stateful processing 
>> <https://beam.apache.org/blog/stateful-processing/>
>> with Apache Beam, “a state cell is scoped to a key+window pair.” What if I 
>> want to maintain a persistent state across windows? Is there a workaround 
>> for this, and what are the common practices in such cases?
>> 
>> Thanks!
>> 
>> Best,
>> Shaochen
>>  
>> 
>> 
>> ---
>> This email is confidential/privileged. If you're not the intended recipient, 
>> please delete it and notify us immediately; please do not copy/use/disclose 
>> it for any purpose, to anyone. Thank you!

