[ https://issues.apache.org/jira/browse/BEAM-13930?focusedWorklogId=727376&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-727376 ]
ASF GitHub Bot logged work on BEAM-13930:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 15/Feb/22 18:45
Start Date: 15/Feb/22 18:45
Worklog Time Spent: 10m
Work Description: lukecwik merged pull request #16836:
URL: https://github.com/apache/beam/pull/16836

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 727376)
Time Spent: 2.5h (was: 2h 20m)

> Address StateSpec inconsistency between Runner and Fn API
> ---------------------------------------------------------
>
> Key: BEAM-13930
> URL: https://issues.apache.org/jira/browse/BEAM-13930
> Project: Beam
> Issue Type: Improvement
> Components: beam-model, sdk-java-core, sdk-py-core
> Reporter: Luke Cwik
> Assignee: Luke Cwik
> Priority: P2
> Fix For: 2.37.0
>
> Time Spent: 2.5h
> Remaining Estimate: 0h
>
> The ability to mix and match runners and SDKs is accomplished through two
> portability layers:
> 1. The Runner API provides an SDK-and-runner-independent definition of a Beam
> pipeline
> 2. The Fn API allows a runner to invoke SDK-specific user-defined functions
> Apache Beam pipelines support executing stateful DoFns[1]. To support this
> execution the Runner API defines multiple user state specifications:
> * ReadModifyWriteStateSpec
> * BagStateSpec
> * OrderedListStateSpec
> * CombiningStateSpec
> * MapStateSpec
> * SetStateSpec
> The Fn API[2] defines APIs[3] to get, append and clear user state currently
> supporting a BagUserState and MultimapUserState protocol.
> Since there is no clear mapping between the Runner API and Fn API state
> specifications, there is no way for a runner to know that it supports a given
> API necessary to support the execution of the pipeline. The Runner will also
> have to manage additional runtime metadata associated with which protocol was
> used for a type of state so that it can successfully manage the state’s
> lifetime once it can be garbage collected.
> Please see the doc[4] for further details and a proposal on how to address
> this shortcoming.
> 1: https://beam.apache.org/blog/stateful-processing/
> 2:
> https://github.com/apache/beam/blob/3ad05523f4cdf5122fc319276fcb461f768af39d/model/fn-execution/src/main/proto/beam_fn_api.proto#L742
> 3: https://s.apache.org/beam-fn-state-api-and-bundle-processing
> 4: http://doc/1ELKTuRTV3C5jt_YoBBwPdsPa5eoXCCOSKQ3GPzZrK7Q

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
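
To make the problem in the issue description concrete, here is a minimal sketch of the check a runner could perform if each Runner API state spec declared the Fn API protocol it relies on. The mapping table, the enum, and the function name are all hypothetical and exist only for illustration; the issue text does not define the mapping, and the linked doc[4] may propose something different. OrderedListStateSpec is left unmapped on purpose, since the issue names only the BagUserState and MultimapUserState protocols.

```python
from enum import Enum


class FnApiProtocol(Enum):
    """The two Fn API user state protocols named in the issue."""
    BAG_USER_STATE = "BagUserState"
    MULTIMAP_USER_STATE = "MultimapUserState"


# ASSUMPTION: an illustrative mapping only, not taken from the issue or the
# proposal doc. OrderedListStateSpec is deliberately absent.
ASSUMED_PROTOCOL_FOR_SPEC = {
    "ReadModifyWriteStateSpec": FnApiProtocol.BAG_USER_STATE,
    "BagStateSpec": FnApiProtocol.BAG_USER_STATE,
    "CombiningStateSpec": FnApiProtocol.BAG_USER_STATE,
    "MapStateSpec": FnApiProtocol.MULTIMAP_USER_STATE,
    "SetStateSpec": FnApiProtocol.MULTIMAP_USER_STATE,
}


def unsupported_state_specs(pipeline_specs, runner_protocols):
    """Return the state specs the runner cannot serve, in input order.

    A spec is unsupported if it has no known protocol, or if its protocol
    is not among the protocols the runner implements.
    """
    missing = []
    for spec in pipeline_specs:
        protocol = ASSUMED_PROTOCOL_FOR_SPEC.get(spec)
        if protocol is None or protocol not in runner_protocols:
            missing.append(spec)
    return missing
```

With an explicit mapping of this kind, a runner could reject a pipeline before execution instead of discovering the gap at runtime, and it would not need to record separately which protocol was used for each piece of state.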