[ https://issues.apache.org/jira/browse/HIVE-23746?focusedWorklogId=463697&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-463697 ]
ASF GitHub Bot logged work on HIVE-23746:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 29/Jul/20 07:23
Start Date: 29/Jul/20 07:23
Worklog Time Spent: 10m
Work Description: mustafaiman commented on a change in pull request #1291:
URL: https://github.com/apache/hive/pull/1291#discussion_r461849212
##########
File path: llap-common/src/java/org/apache/hadoop/hive/llap/AsyncPbRpcProxy.java
##########
@@ -351,9 +403,81 @@ protected CallableRequest(REQUEST request, ExecuteRequestCallback<RESPONSE> call
return callback;
}
+ /**
+ * Override this method to make a synchronous request and wait for response.
+ * @return
+ * @throws Exception
+ */
public abstract RESPONSE call() throws Exception;
}
+ /**
+ * Asynchronous request to a node. The request must override {@link #callInternal()}
+ * @param <REQUEST>
+ * @param <RESPONSE>
+ */
+ protected static abstract class AsyncCallableRequest<REQUEST extends Message, RESPONSE extends Message>
+ extends NodeCallableRequest<REQUEST, RESPONSE> {
+
+ private final long TIMEOUT = 60000;
+ private final long EXPONENTIAL_BACKOFF_START = 10;
+ private final int FAST_RETRIES = 5;
+ private AsyncGet<Message, Exception> getFuture;
+
+ protected AsyncCallableRequest(LlapNodeId nodeId, REQUEST request,
+ ExecuteRequestCallback<RESPONSE> callback) {
+ super(nodeId, request, callback);
+ }
+
+ @Override
+ public RESPONSE call() throws Exception {
+ boolean asyncMode = Client.isAsynchronousMode();
+ long deadline = System.currentTimeMillis() + TIMEOUT;
+ int numRetries = 0;
+ long nextBackoffMs = EXPONENTIAL_BACKOFF_START;
Review comment:
The backoff is on the order of milliseconds, not seconds, so the retries
back off for 10, 20, 40, 80, and 160 ms. I think these values are reasonable
because we expect these calls to complete very quickly (unlike
CallableRequest, they do not wait for a response from the executor). Given
that the delays are in milliseconds, this seems like a reasonable strategy.
Do you agree?
I'll rename EXPONENTIAL_BACKOFF_START to BACKOFF_START.
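To make the arithmetic concrete, here is a minimal sketch of the doubling schedule described in this comment. It uses the 10 ms start and 5 fast retries from the diff. The class name BackoffSketch and its methods are hypothetical and are not part of the patch. The sketch only computes the delays, since the diff excerpt does not show how the real call() consumes them or what happens after the fast retries.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not code from the pull request.
class BackoffSketch {
    // Values copied from the constants in the diff (BACKOFF_START is the proposed new name).
    static final long BACKOFF_START_MS = 10;
    static final int FAST_RETRIES = 5;
    static final long TIMEOUT_MS = 60000;

    /** Returns the delay in ms before each of the first `retries` retries, doubling every time. */
    static List<Long> schedule(long startMs, int retries) {
        List<Long> delays = new ArrayList<>();
        long next = startMs;
        for (int i = 0; i < retries; i++) {
            delays.add(next);
            next *= 2; // exponential growth: 10, 20, 40, ...
        }
        return delays;
    }

    /** Sums the delays to show the total time spent waiting. */
    static long total(List<Long> delays) {
        long sum = 0;
        for (long d : delays) {
            sum += d;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Long> delays = schedule(BACKOFF_START_MS, FAST_RETRIES);
        System.out.println("delays (ms): " + delays);
        System.out.println("total wait (ms): " + total(delays) + " of " + TIMEOUT_MS);
    }
}
```

With these values the five delays add up to 310 ms, a small fraction of the 60000 ms TIMEOUT.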
----------------------------------------------------------------
Issue Time Tracking
-------------------
Worklog Id: (was: 463697)
Time Spent: 20m (was: 10m)
> Send task attempts async from AM to daemons
> -------------------------------------------
>
> Key: HIVE-23746
> URL: https://issues.apache.org/jira/browse/HIVE-23746
> Project: Hive
> Issue Type: Sub-task
> Components: llap
> Reporter: Mustafa Iman
> Assignee: Mustafa Iman
> Priority: Major
> Labels: pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>
> LlapTaskCommunicator uses a sync client to send task attempts. There is a
> fixed number of communication threads (10 by default). This causes
> unnecessary delays when there are enough free execution slots in daemons but
> they do not receive all the tasks because of this bottleneck.
> LlapTaskCommunicator can use an async client to pass these tasks to daemons.