[ https://issues.apache.org/jira/browse/BEAM-8618?focusedWorklogId=379917&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-379917 ]
ASF GitHub Bot logged work on BEAM-8618:
----------------------------------------
Author: ASF GitHub Bot
Created on: 31/Jan/20 11:34
Start Date: 31/Jan/20 11:34
Worklog Time Spent: 10m
Work Description: sunjincheng121 commented on pull request #10655:
[BEAM-8618] Tear down unused DoFns periodically in Python SDK harness.
URL: https://github.com/apache/beam/pull/10655#discussion_r373437414
##########
File path: sdks/python/apache_beam/runners/worker/sdk_worker.py
##########
@@ -280,6 +283,7 @@ def get(self, instruction_id, bundle_descriptor_id):
     try:
       # pop() is threadsafe
       processor = self.cached_bundle_processors[bundle_descriptor_id].pop()
+      self.last_access_time[bundle_descriptor_id] = time.time()
     except IndexError:
Review comment:
Regarding the single bundle processor case, it does no harm to update the
time, because the list of cached bundle processors is then empty. However,
when there are multiple bundle processors, the update also refreshes the time
for the remaining cached bundle processors and so improves the cache hit
rate. I think this is the main difference between solutions (1) and (2).
That said, I'm fine with either solution, as I think both of them work. I
will update the PR to follow solution (2) if, based on your experience, you
favor it.
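
To make the hit-rate argument concrete, here is a minimal sketch, not the
code in the PR. `FakeProcessor`, `ProcessorCache`, `release`, `evict_idle`,
the `refresh_on_get` flag, and the injected clock are all hypothetical names
introduced for illustration; only `cached_bundle_processors`-style per-id
lists and a per-id `last_access_time` come from the diff above. It assumes an
eviction pass that tears down every cached processor of a descriptor whose
timestamp is older than a timeout.

```python
import collections
import time


class FakeProcessor(object):
  """Hypothetical stand-in for a bundle processor; records teardown."""

  def __init__(self):
    self.shut_down = False

  def shutdown(self):
    self.shut_down = True


class ProcessorCache(object):
  """Toy cache: idle processors per descriptor id, one timestamp per id."""

  def __init__(self, clock=time.time, refresh_on_get=True):
    self.clock = clock
    self.refresh_on_get = refresh_on_get
    self.cached = collections.defaultdict(list)
    self.last_access_time = {}

  def get(self, descriptor_id):
    try:
      processor = self.cached[descriptor_id].pop()
      if self.refresh_on_get:
        # Mirrors the added line in the diff: refresh the time on a cache hit.
        self.last_access_time[descriptor_id] = self.clock()
    except IndexError:
      processor = FakeProcessor()
    return processor

  def release(self, descriptor_id, processor):
    # Assumption: returning a processor to the cache also refreshes the time.
    self.cached[descriptor_id].append(processor)
    self.last_access_time[descriptor_id] = self.clock()

  def evict_idle(self, timeout):
    """Shuts down all cached processors of ids idle longer than `timeout`."""
    now = self.clock()
    evicted = 0
    for descriptor_id, processors in list(self.cached.items()):
      if now - self.last_access_time.get(descriptor_id, now) > timeout:
        while processors:
          processors.pop().shutdown()
          evicted += 1
    return evicted


def run(refresh_on_get):
  """Two cached processors; one is taken at t=50, eviction runs at t=70."""
  now = [0.0]
  cache = ProcessorCache(clock=lambda: now[0], refresh_on_get=refresh_on_get)
  cache.release('d1', FakeProcessor())
  cache.release('d1', FakeProcessor())
  now[0] = 50.0
  cache.get('d1')
  now[0] = 70.0
  return cache.evict_idle(timeout=60)


print(run(True), run(False))  # prints "0 1"
```

With the refresh, the processor that is still cached survives the eviction
pass and can serve the next request; without it, that processor is torn down
even though its descriptor was used 20 seconds earlier.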
----------------------------------------------------------------
Issue Time Tracking
-------------------
Worklog Id: (was: 379917)
Time Spent: 3h 10m (was: 3h)
> Tear down unused DoFns periodically in Python SDK harness
> ---------------------------------------------------------
>
> Key: BEAM-8618
> URL: https://issues.apache.org/jira/browse/BEAM-8618
> Project: Beam
> Issue Type: Improvement
> Components: sdk-py-harness
> Reporter: sunjincheng
> Assignee: sunjincheng
> Priority: Major
> Fix For: 2.20.0
>
> Time Spent: 3h 10m
> Remaining Estimate: 0h
>
> Per the discussion on the mailing list (details can be found in [1]), the
> teardown of DoFns should be supported in the portability framework. It
> happens in two places:
> 1) Upon termination of the control service
> 2) Periodically, for unused DoFns
> The aim of this JIRA is to add support for tearing down unused DoFns
> periodically in the Python SDK harness.
> [1]
> https://lists.apache.org/thread.html/0c4a4cf83cf2e35c3dfeb9d906e26cd82d3820968ba6f862f91739e4@%3Cdev.beam.apache.org%3E
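
The periodic case described above could be driven by a small background
task. The following is only a sketch under stated assumptions, not the
mechanism the SDK harness uses: `PeriodicTask` and `teardown_unused` are
hypothetical names, and the callback stands in for whatever logic decides
which DoFns are unused.

```python
import threading


class PeriodicTask(object):
  """Runs `fn` every `interval` seconds on a daemon thread until stopped."""

  def __init__(self, interval, fn):
    self._interval = interval
    self._fn = fn
    self._stop = threading.Event()
    self._thread = threading.Thread(target=self._run)
    self._thread.daemon = True

  def start(self):
    self._thread.start()

  def _run(self):
    # Event.wait() returns False on timeout and True once stop() is called,
    # so the loop both sleeps between runs and exits promptly on shutdown.
    while not self._stop.wait(self._interval):
      self._fn()

  def stop(self):
    self._stop.set()
    self._thread.join()


calls = []
fired = threading.Event()


def teardown_unused():
  # Hypothetical placeholder for tearing down DoFns that have been idle.
  calls.append(1)
  fired.set()


task = PeriodicTask(0.01, teardown_unused)
task.start()
fired.wait(5)  # block until the callback has run at least once
task.stop()
```

Waiting on an `Event` instead of calling `time.sleep` lets `stop()` interrupt
the wait, which also fits the first case in the description: the task can be
stopped cleanly when the control service terminates.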
--
This message was sent by Atlassian Jira
(v8.3.4#803005)