[ 
https://issues.apache.org/jira/browse/BEAM-5428?focusedWorklogId=319982&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-319982
 ]

ASF GitHub Bot logged work on BEAM-5428:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 28/Sep/19 09:48
            Start Date: 28/Sep/19 09:48
    Worklog Time Spent: 10m 
      Work Description: mxm commented on pull request #9418: [BEAM-5428] 
Implement cross-bundle user state caching in the Python SDK
URL: https://github.com/apache/beam/pull/9418#discussion_r329304677
 
 

 ##########
 File path: sdks/python/apache_beam/runners/worker/bundle_processor.py
 ##########
 @@ -199,26 +199,19 @@ def finish(self):
 
 
 class _StateBackedIterable(object):
-  def __init__(self, state_handler, state_key, coder_or_impl):
+  def __init__(self, state_handler, state_key, coder_or_impl,
+               is_cached=False):
     self._state_handler = state_handler
     self._state_key = state_key
     if isinstance(coder_or_impl, coders.Coder):
       self._coder_impl = coder_or_impl.get_impl()
     else:
       self._coder_impl = coder_or_impl
+    self._is_cached = is_cached
 
   def __iter__(self):
-    # This is the continuation token this might be useful
-    data, continuation_token = 
self._state_handler.blocking_get(self._state_key)
-    while True:
-      input_stream = coder_impl.create_InputStream(data)
-      while input_stream.size() > 0:
-        yield self._coder_impl.decode_from_stream(input_stream, True)
-      if not continuation_token:
-        break
-      else:
-        data, continuation_token = self._state_handler.blocking_get(
-            self._state_key, continuation_token)
+    return self._state_handler.blocking_get(
 
 Review comment:
   There was no continuation token logic removed. Robert raised the point that 
the state is materialized no matter how many continuation tokens are available 
(only if Caching is enabled). This can result in too large state being loaded. 
However, neither is the continuation token used by the Runner currently, nor is 
this likely to happen for user state, rather for side inputs which are 
currently not cached. 
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 319982)
    Time Spent: 25h 50m  (was: 25h 40m)

> Implement cross-bundle state caching.
> -------------------------------------
>
>                 Key: BEAM-5428
>                 URL: https://issues.apache.org/jira/browse/BEAM-5428
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-py-harness
>            Reporter: Robert Bradshaw
>            Assignee: Maximilian Michels
>            Priority: Major
>          Time Spent: 25h 50m
>  Remaining Estimate: 0h
>
> Tech spec: 
> [https://docs.google.com/document/d/1BOozW0bzBuz4oHJEuZNDOHdzaV5Y56ix58Ozrqm2jFg/edit#heading=h.7ghoih5aig5m]
> Relevant document: 
> [https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit#|https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit]
> Mailing list link: 
> [https://lists.apache.org/thread.html/caa8d9bc6ca871d13de2c5e6ba07fdc76f85d26497d95d90893aa1f6@%3Cdev.beam.apache.org%3E]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to