Chad Dombrova created BEAM-8271:
-----------------------------------

             Summary: StateGetRequest/Response continuation_token should be string
                 Key: BEAM-8271
                 URL: https://issues.apache.org/jira/browse/BEAM-8271
             Project: Beam
          Issue Type: Improvement
          Components: beam-model
            Reporter: Chad Dombrova
            Assignee: Chad Dombrova

I've been working on adding typing to the Python code and came across a discrepancy regarding the type of the continuation token. The .proto defines it as bytes, but the code treats it as a string (i.e. unicode):

{code:java}
// A request to get state.
message StateGetRequest {
  // (Optional) If specified, signals to the runner that the response
  // should resume from the following continuation token.
  //
  // If unspecified, signals to the runner that the response should start
  // from the beginning of the logical continuable stream.
  bytes continuation_token = 1;
}

// A response to get state representing a logical byte stream which can be
// continued using the state API.
message StateGetResponse {
  // (Optional) If specified, represents a token which can be used with the
  // state API to get the next chunk of this logical byte stream. The end of
  // the logical byte stream is signalled by this field being unset.
  bytes continuation_token = 1;

  // Represents a part of a logical byte stream. Elements within
  // the logical byte stream are encoded in the nested context and
  // concatenated together.
  bytes data = 2;
}
{code}

From FnApiRunner.StateServicer:

{code:python}
def blocking_get(self, state_key, continuation_token=None):
  with self._lock:
    full_state = self._state[self._to_key(state_key)]
    if self._use_continuation_tokens:
      # The token is "nonce:index".
      if not continuation_token:
        token_base = 'token_%x' % len(self._continuations)
        self._continuations[token_base] = tuple(full_state)
        return b'', '%s:0' % token_base
      else:
        token_base, index = continuation_token.split(':')
        ix = int(index)
        full_state = self._continuations[token_base]
        if ix == len(full_state):
          return b'', None
        else:
          return full_state[ix], '%s:%d' % (token_base, ix + 1)
    else:
      assert not continuation_token
      return b''.join(full_state), None
{code}

This could be a problem in Python 3, where str and bytes are distinct types. All other id values are string, whereas bytes is reserved for data, so I think that the proto should be changed to string.
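
To make the Python 3 concern concrete, here is a minimal sketch that is not taken from the Beam code base. It assumes the token reaches the parsing code as a bytes object, which is what a proto field declared as bytes yields in Python 3. The helpers parse_token and parse_token_bytes_safe are hypothetical names; the first only mirrors the split(':') step of blocking_get above.

```python
# Hypothetical stand-in for the token parsing step in blocking_get:
# it splits a "nonce:index" token using a str separator.
def parse_token(continuation_token):
    token_base, index = continuation_token.split(':')
    return token_base, int(index)


# A str token parses fine.
str_result = parse_token('token_0:3')

# A bytes token (assumed here to be what a proto `bytes` field hands
# back in Python 3) cannot be split with a str separator.
try:
    parse_token(b'token_0:3')
    bytes_error = None
except TypeError as exc:
    bytes_error = exc


# One possible workaround if the field stayed `bytes`: decode at the
# boundary. Changing the proto field to `string` would make this
# conversion unnecessary.
def parse_token_bytes_safe(continuation_token):
    if isinstance(continuation_token, bytes):
        continuation_token = continuation_token.decode('utf-8')
    return parse_token(continuation_token)


bytes_safe_result = parse_token_bytes_safe(b'token_0:3')
```

In Python 2 the str and bytes types are the same, so the mismatch goes unnoticed there; it only surfaces as a TypeError under Python 3.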