[
https://issues.apache.org/jira/browse/LENS-899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15430705#comment-15430705
]
Rajat Khandelwal commented on LENS-899:
---------------------------------------
I'm planning to move ahead with the *Failed Attempts* approach, but since the
code has changed a lot since the last discussion, I'll describe the entire
approach once again. I'd like to get thoughts on the following:
I'll use the *BackOffRetryHandler* construct already available in the Lens code
to handle retries. There will be one retry handler instance at the server
level, which acts as the backoff policy for retries in the Lens server. The
policy can be exponential backoff, periodic backoff, or anything else.
Exponential backoff is already implemented; I'll add periodic backoff.
The policy will be configurable at the server level, so all queries are retried
with the same policy. Optionally, with minor changes, the policy can also be
set at the query level or the driver level: a query-level policy is preferred
over a driver-level policy, which in turn is preferred over the server-level
policy. With this, we have a policy for each query.
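To make the precedence rule concrete, here is a minimal sketch of how a
per-query backoff policy could be resolved. All names below (BackoffPolicy,
PeriodicBackoff, ExponentialBackoff, NoRetry, BackoffPolicyResolver) are
hypothetical and chosen for illustration; this is not the actual
*BackOffRetryHandler* API in Lens, and the retry-counting convention is an
assumption.

```java
import java.util.Optional;

/** Hypothetical backoff contract; NOT the actual Lens BackOffRetryHandler API. */
interface BackoffPolicy {
  /** True when no more retries are allowed after this many failed attempts. */
  boolean hasExhaustedRetries(int failedAttempts);

  /** Earliest time (epoch millis) at which the next attempt may be launched. */
  long nextAttemptTime(int failedAttempts, long lastFailureMillis);
}

/** Waits a fixed interval between attempts. */
final class PeriodicBackoff implements BackoffPolicy {
  private final long intervalMillis;
  private final int maxRetries;

  PeriodicBackoff(long intervalMillis, int maxRetries) {
    this.intervalMillis = intervalMillis;
    this.maxRetries = maxRetries;
  }

  @Override
  public boolean hasExhaustedRetries(int failedAttempts) {
    // Assumption: the original attempt plus maxRetries retries are allowed.
    return failedAttempts > maxRetries;
  }

  @Override
  public long nextAttemptTime(int failedAttempts, long lastFailureMillis) {
    return lastFailureMillis + intervalMillis;
  }
}

/** Doubles the wait after every failed attempt. */
final class ExponentialBackoff implements BackoffPolicy {
  private final long baseMillis;
  private final int maxRetries;

  ExponentialBackoff(long baseMillis, int maxRetries) {
    this.baseMillis = baseMillis;
    this.maxRetries = maxRetries;
  }

  @Override
  public boolean hasExhaustedRetries(int failedAttempts) {
    return failedAttempts > maxRetries;
  }

  @Override
  public long nextAttemptTime(int failedAttempts, long lastFailureMillis) {
    // Cap the number of doublings so the shift cannot overflow.
    int doublings = Math.min(Math.max(failedAttempts - 1, 0), 20);
    return lastFailureMillis + (baseMillis << doublings);
  }
}

/** Disallows retries: it always reports that retries are exhausted. */
final class NoRetry implements BackoffPolicy {
  @Override
  public boolean hasExhaustedRetries(int failedAttempts) {
    return true;
  }

  @Override
  public long nextAttemptTime(int failedAttempts, long lastFailureMillis) {
    return Long.MAX_VALUE;
  }
}

/** Picks the most specific policy: query level, then driver level, then server level. */
final class BackoffPolicyResolver {
  private BackoffPolicyResolver() {
  }

  static BackoffPolicy resolve(Optional<BackoffPolicy> queryLevel,
                               Optional<BackoffPolicy> driverLevel,
                               BackoffPolicy serverLevel) {
    return queryLevel.orElseGet(() -> driverLevel.orElse(serverLevel));
  }
}
```

With only a driver-level override present, resolve(Optional.empty(),
Optional.of(driverPolicy), serverPolicy) returns driverPolicy; with no
overrides, it falls back to the server-level policy.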
Once a query fails, its backoff policy is asked whether it has exhausted its
retries. (A policy that disallows retries altogether would always answer yes
to that question.) If retries are not exhausted and a retry should be done (who
decides that is described below), a *QueryRelauncher* (tentative name) thread
is spawned. This thread is similar to the *QueryLauncher* thread, with minor
modifications; if possible, I'll see whether the two can be merged. In its
constructor, the QueryRelauncher thread will extract the FailedAttempt
instance, persist it, and change the status to *LAUNCHING*. In its run method,
it will consult the backoff policy, which will either say that the attempt can
be launched immediately or provide the next time at which it can be
relaunched. In the latter case, the thread will do *Thread.sleep* for the
remaining duration. When the thread wakes, the policy should say that a retry
is now possible (if not, it loops). The query is then launched on the driver
and the normal status update flow follows, looping back to the flow described
in this paragraph on further failures.
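The wait-then-launch loop described above could look roughly like the
following sketch. The constructor arguments (a supplier for the next allowed
attempt time, a clock, and a launch callback) and the Status enum are
hypothetical stand-ins introduced for illustration; the real thread would work
against the query context, the FailedAttempt instance, and the driver instead.

```java
import java.util.function.LongSupplier;

/** Simplified, hypothetical status values for this sketch only. */
enum Status { FAILED, LAUNCHING, LAUNCHED }

/** Sketch of the relaunch flow; names and constructor arguments are assumptions. */
final class QueryRelauncher implements Runnable {
  private final LongSupplier nextAttemptTimeMillis; // answer from the backoff policy
  private final LongSupplier clock;                 // current time in epoch millis
  private final Runnable launchOnDriver;            // stands in for the driver launch
  private volatile Status status;

  QueryRelauncher(LongSupplier nextAttemptTimeMillis, LongSupplier clock,
                  Runnable launchOnDriver) {
    this.nextAttemptTimeMillis = nextAttemptTimeMillis;
    this.clock = clock;
    this.launchOnDriver = launchOnDriver;
    // The real constructor would also extract and persist the FailedAttempt here.
    this.status = Status.LAUNCHING;
  }

  Status getStatus() {
    return status;
  }

  @Override
  public void run() {
    try {
      long waitMillis;
      // Re-check the policy after every wake-up; sleep again if it is still too early.
      while ((waitMillis = nextAttemptTimeMillis.getAsLong() - clock.getAsLong()) > 0) {
        Thread.sleep(waitMillis);
      }
      launchOnDriver.run();
      status = Status.LAUNCHED;
    } catch (InterruptedException e) {
      // Restore the interrupt flag and give up on this relaunch.
      Thread.currentThread().interrupt();
    }
  }
}
```

Passing the clock in as a supplier is a design choice of the sketch: it keeps
the loop testable without depending on wall-clock time.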
Now, the decision of whether or not to retry rests on a set of retry policies.
Note that a retry policy is different from a backoff policy. A retry policy
takes a QueryContext and decides whether to retry or not. The most commonly
used policy will be an error-based policy, which looks at the error and gives
its decision. The policies will be configurable at the driver, server, and
query levels. The decisions of all policies will be ANDed to give the final
verdict.
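As an illustration of ANDing the retry policies, here is a minimal sketch. The
RetryPolicy interface, ErrorBasedRetryPolicy, RetryDecider, and the simplified
QueryContextStub (used in place of the real QueryContext) are hypothetical
names, and the error codes are made up. Treating an empty policy list as "do
not retry" is also an assumption of this sketch.

```java
import java.util.List;
import java.util.Set;

/** Simplified, hypothetical stand-in for the real QueryContext. */
final class QueryContextStub {
  final String errorCode;

  QueryContextStub(String errorCode) {
    this.errorCode = errorCode;
  }
}

/** Hypothetical retry policy: decides from the context whether a retry makes sense. */
interface RetryPolicy {
  boolean shouldRetry(QueryContextStub ctx);
}

/** Error-based policy: retries only for a configured set of error codes. */
final class ErrorBasedRetryPolicy implements RetryPolicy {
  private final Set<String> retriableErrorCodes;

  ErrorBasedRetryPolicy(Set<String> retriableErrorCodes) {
    this.retriableErrorCodes = retriableErrorCodes;
  }

  @Override
  public boolean shouldRetry(QueryContextStub ctx) {
    return retriableErrorCodes.contains(ctx.errorCode);
  }
}

/** Combines the driver, server, and query level policies with a logical AND. */
final class RetryDecider {
  private RetryDecider() {
  }

  static boolean shouldRetry(List<RetryPolicy> policies, QueryContextStub ctx) {
    // allMatch is true for an empty stream, so the empty case is handled
    // explicitly: with no policies configured, this sketch does not retry.
    return !policies.isEmpty() && policies.stream().allMatch(p -> p.shouldRetry(ctx));
  }
}
```

Because the verdicts are ANDed, a single policy answering "no" is enough to
veto the retry, regardless of the level at which it was configured.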
> Create Attempt framework
> ------------------------
>
> Key: LENS-899
> URL: https://issues.apache.org/jira/browse/LENS-899
> Project: Apache Lens
> Issue Type: Sub-task
> Components: server
> Reporter: Rajat Khandelwal
> Assignee: Rajat Khandelwal
>