[ https://issues.apache.org/jira/browse/BEAM-12118?focusedWorklogId=579528&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-579528 ]
ASF GitHub Bot logged work on BEAM-12118:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 08/Apr/21 20:21
Start Date: 08/Apr/21 20:21
Worklog Time Spent: 10m
Work Description: amaliujia commented on a change in pull request #14480:
URL: https://github.com/apache/beam/pull/14480#discussion_r610070432
##########
File path:
sdks/java/harness/src/main/java/org/apache/beam/fn/harness/data/QueueingBeamFnDataClient.java
##########
@@ -98,35 +117,53 @@ private boolean allDone() {
*
* <p>All {@link InboundDataClient}s will be failed if processing throws an exception.
*
- * <p>This method is NOT thread safe. This should only be invoked by a single thread, and is
- * intended for use with a newly constructed QueueingBeamFnDataClient in {@link
- * ProcessBundleHandler#processBundle}.
+ * <p>This method is NOT thread safe. This should only be invoked once by a single thread.
*/
public void drainAndBlock() throws Exception {
+ // If all of the inbound clients have already added to the queue, we don't
+ // use the poison to stop and instead poll. This avoids a possible deadlock
+ // if the queue is full and we are unable to queue the poison.
+ boolean requirePoison;
+ synchronized (inboundDataClients) {
+ Preconditions.checkState(!isDraining);
+ isDraining = true;
+ requirePoison = !inboundDataClients.isEmpty();
Review comment:
Question:
`addPoison` above is set based on `inboundDataClients.isEmpty() &&
isDraining`, so why does `requirePoison` not require the same check?
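To make the two checks in the question concrete, here is a minimal, self-contained sketch of how a completion callback and the drain entry point could coordinate under one lock. It is not the code from the pull request: the class `DrainHandshakeSketch`, the methods `register`, `complete` and `startDraining`, and the use of string client ids are all hypothetical. In this sketch, `startDraining` sets `isDraining` to true inside the same synchronized block that computes `requirePoison`, so repeating the `isDraining` check there would add nothing. Whether that matches the author's reasoning is not confirmed by this thread.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch; names and structure are assumptions, not Beam's API.
final class DrainHandshakeSketch {
  static final Object POISON = new Object();

  private final Set<String> inboundDataClients = new HashSet<>();
  // Unbounded here for simplicity; a bounded queue raises the deadlock
  // concern described in the diff comment.
  private final BlockingQueue<Object> queue = new LinkedBlockingQueue<>();
  private boolean isDraining = false;

  void register(String clientId) {
    synchronized (inboundDataClients) {
      inboundDataClients.add(clientId);
    }
  }

  /** Called when a client finishes. Returns true if this call enqueued the poison. */
  boolean complete(String clientId) {
    boolean addPoison;
    synchronized (inboundDataClients) {
      if (!inboundDataClients.remove(clientId)) {
        return false; // unknown or already completed client
      }
      // Only the last client to finish, and only once draining has started,
      // is responsible for waking the drainer.
      addPoison = inboundDataClients.isEmpty() && isDraining;
    }
    if (addPoison) {
      queue.add(POISON);
    }
    return addPoison;
  }

  /** Marks the start of draining. Returns true if the drainer must wait for the poison. */
  boolean startDraining() {
    synchronized (inboundDataClients) {
      if (isDraining) {
        throw new IllegalStateException("already draining");
      }
      isDraining = true; // set under the same lock, so no separate check is needed
      return !inboundDataClients.isEmpty();
    }
  }
}
```

If every client finishes before draining starts, no poison is ever queued and `startDraining` returns false, which corresponds to the polling path mentioned in the diff comment.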
Issue Time Tracking
-------------------
Worklog Id: (was: 579528)
Time Spent: 50m (was: 40m)
> QueuingBeamFnDataClient adds polling latency to completing bundle processing
> ----------------------------------------------------------------------------
>
> Key: BEAM-12118
> URL: https://issues.apache.org/jira/browse/BEAM-12118
> Project: Beam
> Issue Type: Bug
> Components: java-fn-execution
> Reporter: Sam Whittle
> Assignee: Sam Whittle
> Priority: P2
> Time Spent: 50m
> Remaining Estimate: 0h
>
> Currently the inboundDataClients are registered with receive, and they add
> data to a queue. There is no explicit indication from the clients that they
> are no longer going to add values to the queue.
> Within QueueingBeamFnDataClient.drainAndBlock the queue is therefore polled
> and if nothing is present all clients are polled to see if they are complete.
> This design makes for unfortunate tradeoffs on poll timeout:
> - CPU is wasted with a small timeout
> - additional latency in noticing completion with a larger timeout
> With the existing InboundDataClient interface, we could have a separate
> thread call awaitCompletion on all of the clients and then shut down the queue
> (perhaps by adding a poison pill).
> Or we could modify the InboundDataClient interface to allow registering interest
> in when the client is done producing elements. The existing clients all seem
> to be based upon futures, which allow that.
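The future-based option described above can be sketched as follows. This is an illustration under stated assumptions, not Beam's implementation: `PoisonPillSketch`, `drain` and the `POISON` sentinel are hypothetical names, each producer is represented by a plain `CompletableFuture<Void>`, and every producer is assumed to enqueue all of its elements before completing its future. The consumer blocks on `take()` instead of polling with a timeout, so it neither burns CPU nor adds latency once the last producer finishes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch; names are assumptions, not Beam's API.
final class PoisonPillSketch {
  // Sentinel compared by identity so it cannot collide with real elements.
  private static final Object POISON = new Object();

  /** Consumes elements until the poison pill arrives and returns them in order. */
  static List<Object> drain(
      BlockingQueue<Object> queue, List<CompletableFuture<Void>> producersDone)
      throws InterruptedException {
    // Once every producer future has completed (normally or exceptionally),
    // enqueue the pill exactly once. This assumes an unbounded queue; with a
    // bounded queue, add() could throw if the queue is full.
    CompletableFuture.allOf(producersDone.toArray(new CompletableFuture[0]))
        .whenComplete((ignored, error) -> queue.add(POISON));

    List<Object> consumed = new ArrayList<>();
    while (true) {
      Object next = queue.take(); // blocks without a poll timeout
      if (next == POISON) {
        return consumed;
      }
      consumed.add(next);
    }
  }
}
```

The same shape would work with a dedicated thread that calls awaitCompletion on each client and then adds the pill; the future callback simply avoids tying up that extra thread.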