[ 
https://issues.apache.org/jira/browse/BEAM-10305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-10305:
--------------------------------------
    Description: 
When the state cache is enabled in the Python SDK, the batch mode of the Flink 
Runner currently only allows a single user state cell because a new cache token 
is generated for each state cell; the caching code in the Python SDK Harness 
only supports one cache token per user state handler. 

Theoretically, multiple cache tokens would work, but they would only add to the 
payload. We should make sure to send just a single cache token in batch mode 
(which is already the case in streaming).

  was:When the state cache is enabled in the Python SDK, the batch mode of the 
Flink Runner only allows a single state cell because a new cache token is 
generated for each state cell. The caching code in the Python SDK Harness only 
supports one global cache token per user state handler. Theoretically multiple 
cache tokens would work but would just be adding to the payload. We should make 
sure to just send a single cache token in batch mode (which is already the case 
in streaming).


> InMemoryBagUserStateFactory creates a cache token per state cell
> ----------------------------------------------------------------
>
>                 Key: BEAM-10305
>                 URL: https://issues.apache.org/jira/browse/BEAM-10305
>             Project: Beam
>          Issue Type: Bug
>          Components: java-fn-execution, runner-flink, sdk-py-harness
>            Reporter: Maximilian Michels
>            Assignee: Maximilian Michels
>            Priority: P3
>
> When the state cache is enabled in the Python SDK, the batch mode of the 
> Flink Runner currently only allows a single user state cell because a new 
> cache token is generated for each state cell; the caching code in the Python 
> SDK Harness only supports one cache token per user state handler. 
> Theoretically, multiple cache tokens would work, but they would only add to 
> the payload. We should make sure to send just a single cache token in batch 
> mode (which is already the case in streaming).
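
The mismatch described above can be illustrated with a minimal sketch. This is 
not Beam's actual code: PerCellTokenFactory, SharedTokenFactory and 
select_user_state_token are hypothetical names, and the assumption is only 
that the runner side hands out cache tokens while the harness side accepts at 
most one token for user state.

```python
import uuid


class PerCellTokenFactory:
    """Hypothetical runner-side factory: mints a new token per state cell."""

    def __init__(self):
        self._tokens = {}

    def token_for(self, state_id):
        # A fresh token per cell, so N cells produce N distinct tokens.
        if state_id not in self._tokens:
            self._tokens[state_id] = uuid.uuid4().bytes
        return self._tokens[state_id]

    def cache_tokens(self):
        return list(self._tokens.values())


class SharedTokenFactory:
    """Hypothetical alternative: one token shared by all state cells."""

    def __init__(self):
        self._token = uuid.uuid4().bytes
        self._cells = set()

    def token_for(self, state_id):
        self._cells.add(state_id)
        return self._token

    def cache_tokens(self):
        # At most one token, regardless of how many cells exist.
        return [self._token] if self._cells else []


def select_user_state_token(cache_tokens):
    """Hypothetical harness-side check: accept at most one token."""
    if len(cache_tokens) > 1:
        raise ValueError(
            "expected at most one user state cache token, got %d"
            % len(cache_tokens))
    return cache_tokens[0] if cache_tokens else None


# Two state cells registered with each factory.
per_cell = PerCellTokenFactory()
shared = SharedTokenFactory()
for state_id in ("count", "buffer"):
    per_cell.token_for(state_id)
    shared.token_for(state_id)
```

With two state cells, the per-cell factory yields two tokens, which the 
single-token check rejects, while the shared factory yields one token that 
passes.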



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
