[ 
https://issues.apache.org/jira/browse/BEAM-10305?focusedWorklogId=450557&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-450557
 ]

ASF GitHub Bot logged work on BEAM-10305:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 24/Jun/20 17:58
            Start Date: 24/Jun/20 17:58
    Worklog Time Spent: 10m 
      Work Description: lukecwik commented on pull request #12062:
URL: https://github.com/apache/beam/pull/12062#issuecomment-648975358


   > > IIUC, this changes all state handlers to use the same cache token
   > > value, but doesn't reduce the number of user state cache tokens in
   > > the process bundle request to 1?
   > 
   > Yes, it reduces the number of user state cache tokens for process
   > bundle requests per process bundle descriptor to 1.
   
   I think the logic in
   `ByteStringStateRequestHandlerToBagUserStateHandlerFactoryAdapter#getCacheTokens`
   (https://github.com/apache/beam/blob/f98104a22b69972744a13378e17af5f2361fbb3e/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/state/StateRequestHandlers.java#L590)
   is going to add N copies of the same cache token to the
   `ProcessBundleRequest`.
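
   To illustrate the concern, here is a minimal Python sketch. It is not
   Beam's code: the function names and the byte-string token are
   hypothetical stand-ins. It assumes each of N state handlers reports the
   same token value and shows that naive collection yields N entries,
   while an order-preserving de-duplication yields one.

```python
from typing import Iterable, List


def collect_cache_tokens(handler_tokens: Iterable[bytes]) -> List[bytes]:
    """Naive collection: one entry per handler, duplicates included."""
    return list(handler_tokens)


def collect_distinct_cache_tokens(handler_tokens: Iterable[bytes]) -> List[bytes]:
    """Keep only the first occurrence of each token, preserving order."""
    return list(dict.fromkeys(handler_tokens))


# Three hypothetical state handlers that all share one token value.
shared_token = b"token-1"
per_handler_tokens = [shared_token, shared_token, shared_token]

naive = collect_cache_tokens(per_handler_tokens)
distinct = collect_distinct_cache_tokens(per_handler_tokens)

print(len(naive), len(distinct))
```

   Whether de-duplication belongs at collection time or in the handler
   factory is a design choice this sketch does not settle.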


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 450557)
    Time Spent: 1h 10m  (was: 1h)

> InMemoryBagUserStateFactory creates a cache token per state cell
> ----------------------------------------------------------------
>
>                 Key: BEAM-10305
>                 URL: https://issues.apache.org/jira/browse/BEAM-10305
>             Project: Beam
>          Issue Type: Bug
>          Components: java-fn-execution, runner-flink, sdk-py-harness
>            Reporter: Maximilian Michels
>            Assignee: Maximilian Michels
>            Priority: P3
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> When the state cache is enabled in the Python SDK, the batch mode of the 
> Flink Runner currently only allows a single user state cell because a new 
> cache token is generated for each state cell; the caching code in the Python 
> SDK Harness only supports one cache token per user state handler. 
> Theoretically, multiple cache tokens would work, but they would only add 
> to the payload. We should make sure to send just a single cache token in 
> batch mode (which is already the case in streaming).
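
The difference described above can be sketched as follows. This is a
hypothetical illustration in Python, not the actual
InMemoryBagUserStateFactory: the class and method names are invented for
the example. One factory mints a new token for every state cell, the
other hands out a single token shared by all cells it creates.

```python
import uuid
from typing import Dict


class PerCellTokenFactory:
    """Hypothetical factory that mints a new token for every state cell."""

    def __init__(self) -> None:
        self._tokens: Dict[str, bytes] = {}

    def token_for(self, state_id: str) -> bytes:
        # A fresh random token is created the first time a cell is seen.
        if state_id not in self._tokens:
            self._tokens[state_id] = uuid.uuid4().bytes
        return self._tokens[state_id]


class SharedTokenFactory:
    """Hypothetical factory that reuses one token for all state cells."""

    def __init__(self) -> None:
        self._token = uuid.uuid4().bytes

    def token_for(self, state_id: str) -> bytes:
        # The state id is ignored: every cell gets the same token.
        return self._token


state_ids = ["count", "buffer", "timer_marker"]  # illustrative cell names

per_cell = PerCellTokenFactory()
shared = SharedTokenFactory()

per_cell_tokens = {per_cell.token_for(s) for s in state_ids}
shared_tokens = {shared.token_for(s) for s in state_ids}

print(len(per_cell_tokens), len(shared_tokens))
```

With the shared variant, a consumer that supports only one cache token
per user state handler sees a single distinct value regardless of how
many state cells exist.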



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
