[
https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15050634#comment-15050634
]
Rajat Khandelwal commented on LENS-743:
---------------------------------------
After some offline discussions, I'm inclined towards this approach:
Define a concept of Attempt for queries. LensQuery is the user-facing class for
a query. Right now it contains fields for one attempt:
{noformat}
private QueryStatus status;
private String driverOpHandle;
private long driverStartTime;
private long driverFinishTime;
{noformat}
These will be extracted into a class `Attempt`, and LensQuery will now contain
a list of attempts.
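A minimal sketch of what this extraction could look like, assuming the query object keeps its attempts in a list. Only the four field names come from the snippet above; the class `LensQuerySketch`, the constructor, the accessors, and the use of a plain `String` in place of `QueryStatus` are hypothetical and not taken from the Lens code base.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical holder for the per-attempt fields listed above.
class Attempt {
    // A String stands in for QueryStatus to keep the sketch self-contained.
    private final String status;
    private final String driverOpHandle;
    private final long driverStartTime;
    private final long driverFinishTime;

    Attempt(String status, String driverOpHandle, long driverStartTime, long driverFinishTime) {
        this.status = status;
        this.driverOpHandle = driverOpHandle;
        this.driverStartTime = driverStartTime;
        this.driverFinishTime = driverFinishTime;
    }

    String getStatus() { return status; }
    String getDriverOpHandle() { return driverOpHandle; }
    long getDriverStartTime() { return driverStartTime; }
    long getDriverFinishTime() { return driverFinishTime; }
}

// Hypothetical stand-in for the user-facing query class (not the real LensQuery).
class LensQuerySketch {
    private final List<Attempt> attempts = new ArrayList<>();

    void addAttempt(Attempt attempt) { attempts.add(attempt); }

    // Read-only view of all attempts, oldest first.
    List<Attempt> getAttempts() { return Collections.unmodifiableList(attempts); }

    // The last attempt decides the final outcome of the query; null if none exist yet.
    Attempt getLastAttempt() {
        return attempts.isEmpty() ? null : attempts.get(attempts.size() - 1);
    }

    int getNumberOfAttempts() { return attempts.size(); }
}
```

In this shape, the "number of attempts" value and the last attempt's details can both be derived from the list.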
From the Lens database side:
There will be a separate table for attempts, where all attempts of queries will
be stored. The query ultimately finishes when the last attempt finishes, and the
finished_queries table already has fields belonging to an attempt, so I'm thinking
the last attempt's details can go to that table. Though we'll need to add a column
"number of attempts" in finished_queries.
Similar changes will be done in QueryContext.
On the query execution side, updateFinishedQuery will check for FAILURE and
decide whether to retry or not. A retry will consist of launching the
query again on the driver and creating another attempt. In this process, the
previous attempt will be saved to the DB.
If the decision is not to retry the query, it'll follow the normal code
path.
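The flow described above could be sketched roughly as follows. This is an illustration under assumptions, not the actual Lens implementation: `RetryPolicy`, `RetryingQuery`, `AttemptRecord`, and `AttemptState` are hypothetical names, the in-memory list stands in for the attempts table, and the signature of `updateFinishedQuery` here is invented for the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical attempt states; the real QueryStatus has more values.
enum AttemptState { RUNNING, SUCCESSFUL, FAILED }

// Hypothetical record of one attempt.
class AttemptRecord {
    private final int number;
    private AttemptState state = AttemptState.RUNNING;

    AttemptRecord(int number) { this.number = number; }

    int getNumber() { return number; }
    AttemptState getState() { return state; }
    void setState(AttemptState state) { this.state = state; }
}

// Hypothetical hook that decides whether a failure is worth retrying.
interface RetryPolicy {
    boolean shouldRetry(int attemptsSoFar, String errorMessage);
}

class RetryingQuery {
    // Stands in for the separate attempts table.
    private final List<AttemptRecord> savedAttempts = new ArrayList<>();
    private AttemptRecord currentAttempt = new AttemptRecord(1);
    private boolean finished = false;

    // Returns true if a new attempt was launched, false if the query finished.
    boolean updateFinishedQuery(AttemptState state, String errorMessage, RetryPolicy policy) {
        currentAttempt.setState(state);
        if (state == AttemptState.FAILED
                && policy.shouldRetry(currentAttempt.getNumber(), errorMessage)) {
            // Save the previous attempt, then start the next one.
            savedAttempts.add(currentAttempt);
            currentAttempt = new AttemptRecord(currentAttempt.getNumber() + 1);
            return true;
        }
        // No retry: follow the normal path and mark the query finished.
        finished = true;
        return false;
    }

    List<AttemptRecord> getSavedAttempts() { return savedAttempts; }
    AttemptRecord getCurrentAttempt() { return currentAttempt; }
    boolean isFinished() { return finished; }
}
```

A policy for transient errors might, for example, be `(n, err) -> n < 3 && err != null && err.contains("not reachable")`; both the limit of three attempts and the message match are placeholders.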
[~amareshwari] [~Puneetkgupta] Please add if I missed anything. I'll do the
same.
> Query failure retries for transient errors
> ------------------------------------------
>
> Key: LENS-743
> URL: https://issues.apache.org/jira/browse/LENS-743
> Project: Apache Lens
> Issue Type: Improvement
> Components: server
> Reporter: Amareshwari Sriramadasu
> Assignee: Rajat Khandelwal
>
> There have to be retries for query failures for transient errors like network
> errors (Hive server not reachable/ Metastore not reachable/ DB not
> reachable). Retries should be available for each phase - submission,
> execution, updating status, fetching results and formatting.
> Right now, any such failure results in marking query as failed.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)