[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15581425#comment-15581425 ]

Hudson commented on LENS-743:
-----------------------------

FAILURE: Integrated in Jenkins build Lens-Commit #1360 (See [https://builds.apache.org/job/Lens-Commit/1360/])
LENS-743: Query retry framework for retrying upon transient failures (rajatgupta59: rev 38ab6c6082b6221502daac979551e8c5fca72241)
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/query/events/QueryFailed.java
* (add) lens-api/src/main/java/org/apache/lens/api/query/FailedAttempt.java
* (edit) lens-server-api/src/main/java/org/apache/lens/server/api/driver/DriverQueryStatus.java
* (edit) lens-server/src/test/java/org/apache/lens/server/scheduler/util/SchedulerTestUtils.java
* (delete) lens-server-api/src/main/java/org/apache/lens/server/api/query/QueryFailed.java
* (edit) lens-server-api/src/main/java/org/apache/lens/server/api/query/FinishedLensQuery.java
* (add) lens-server/src/test/resources/drivers/retry/double_failure/driver-site.xml
* (delete) lens-server/src/main/java/org/apache/lens/server/query/FIFOQueryComparator.java
* (delete) lens-server-api/src/main/java/org/apache/lens/server/api/query/QueryRunning.java
* (edit) lens-server/src/test/java/org/apache/lens/server/query/constraint/TotalQueryCostCeilingConstraintTest.java
* (edit) lens-driver-es/src/test/resources/hive-site.xml
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/query/StatusUpdateFailureContext.java
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/retry/BackOffRetryHandler.java
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/query/comparators/QueryPriorityComparator.java
* (delete) lens-server-api/src/main/java/org/apache/lens/server/api/common/BackOffRetryHandler.java
* (add) lens-server/src/main/java/org/apache/lens/server/query/constraint/RetryPolicyToConstraingAdapter.java
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/retry/ImmediateRetryHandler.java
* (edit) lens-driver-hive/src/test/java/org/apache/lens/driver/hive/TestRemoteHiveDriver.java
* (edit) lens-server/src/main/java/org/apache/lens/server/query/constraint/DefaultQueryLaunchingConstraintsChecker.java
* (edit) lens-server/src/test/java/org/apache/lens/server/query/TestQueryIndependenceFromSessionClose.java
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/retry/ChainedRetryPolicyDecider.java
* (edit) lens-server-api/src/main/java/org/apache/lens/server/api/query/AbstractQueryContext.java
* (edit) lens-server/pom.xml
* (edit) lens-server-api/src/main/java/org/apache/lens/server/api/query/constraint/QueryLaunchingConstraint.java
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/query/events/PriorityChange.java
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/query/events/QueryAccepted.java
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/query/events/StatusChange.java
* (edit) lens-server-api/src/main/java/org/apache/lens/server/api/query/constraint/MaxConcurrentDriverQueriesConstraint.java
* (edit) checkstyle/src/main/resources/checkstyle.xml
* (edit) lens-regression/src/main/java/org/apache/lens/regression/core/constants/DriverConfig.java
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/query/events/QueryRejected.java
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/retry/DefaultRetryPolicyDecider.java
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/query/events/QueryQueuedForRetry.java
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/query/events/QueuePositionChange.java
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/query/events/QueryEvent.java
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/retry/RetryPolicyDecider.java
* (edit) lens-driver-jdbc/src/main/java/org/apache/lens/driver/jdbc/JDBCDriver.java
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/retry/FailureContext.java
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/query/events/QueryQueued.java
* (delete) lens-server-api/src/main/java/org/apache/lens/server/api/query/QueryRejected.java
* (delete) lens-server/src/main/java/org/apache/lens/server/query/QueryComparator.java
* (edit) lens-driver-jdbc/src/main/java/org/apache/lens/driver/jdbc/JDBCDriverConfConstants.java
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/retry/OperationRetryHandlerFactory.java
* (delete) lens-server-api/src/test/java/org/apache/lens/server/api/common/TestExponentialBackOffRetryHandler.java
* (edit) lens-server/src/test/java/org/apache/lens/server/query/constraint/DefaultQueryLaunchingConstraintsCheckerTest.java
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/query/events/QueryEnded.java
* (edit)
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15581383#comment-15581383 ]

Hudson commented on LENS-743:
-----------------------------

FAILURE: Integrated in Jenkins build Lens-Commit-Java8 #277 (See [https://builds.apache.org/job/Lens-Commit-Java8/277/])
LENS-743: Query retry framework for retrying upon transient failures (rajatgupta59: rev 38ab6c6082b6221502daac979551e8c5fca72241)
* (edit) lens-server/src/test/java/org/apache/lens/server/scheduler/util/SchedulerTestUtils.java
* (delete) lens-server-api/src/main/java/org/apache/lens/server/api/query/QueryQueued.java
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/query/StatusUpdateFailureContext.java
* (edit) lens-server-api/src/main/java/org/apache/lens/server/api/driver/LensDriver.java
* (delete) lens-server/src/main/java/org/apache/lens/server/query/QueryCostComparator.java
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/retry/FibonacciExponentialBackOffRetryHandler.java
* (edit) lens-driver-es/src/main/java/org/apache/lens/driver/es/ESDriver.java
* (edit) lens-server/src/test/java/org/apache/lens/server/query/constraint/ThreadSafeEstimatedQueryCollectionTest.java
* (edit) lens-server/src/main/java/org/apache/lens/server/query/ResultFormatter.java
* (edit) lens-api/src/main/java/org/apache/lens/api/query/QueryStatus.java
* (edit) lens-driver-jdbc/src/main/java/org/apache/lens/driver/jdbc/MaxJDBCConnectionCheckConstraint.java
* (edit) lens-server/src/main/java/org/apache/lens/server/metrics/MetricsServiceImpl.java
* (edit) lens-server-api/src/test/java/org/apache/lens/server/api/query/TestQueryContext.java
* (delete) lens-server-api/src/main/java/org/apache/lens/server/api/query/QueryEnded.java
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/query/events/QueryAccepted.java
* (edit) checkstyle/src/main/resources/checkstyle.xml
* (edit) lens-regression/src/main/java/org/apache/lens/regression/core/constants/DriverConfig.java
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/query/events/StatusChange.java
* (edit) lens-server/src/test/java/org/apache/lens/server/query/collect/QueryCollectUtil.java
* (edit) lens-server/src/test/java/org/apache/lens/server/query/TestEventService.java
* (edit) lens-server-api/src/main/java/org/apache/lens/server/api/query/constraint/MaxConcurrentDriverQueriesConstraint.java
* (delete) lens-server/src/main/java/org/apache/lens/server/query/QueryPriorityComparator.java
* (add) lens-server-api/src/test/java/org/apache/lens/server/api/query/comparators/ChainedComparatorTest.java
* (add) lens-server/src/test/java/org/apache/lens/server/query/retry/MockDriverForRetries.java
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/retry/DefaultRetryPolicyDecider.java
* (edit) lens-server/src/main/java/org/apache/lens/server/query/QueryEventHttpNotifier.java
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/query/events/QuerySuccess.java
* (delete) lens-server/src/main/java/org/apache/lens/server/query/QueryComparator.java
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/retry/BackOffRetryHandler.java
* (edit) lens-server/src/main/java/org/apache/lens/server/query/LensServerDAO.java
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/retry/ChainedRetryPolicyDecider.java
* (edit) lens-server/src/main/java/org/apache/lens/server/scheduler/SchedulerQueryEventListener.java
* (add) lens-server/src/test/resources/drivers/mock/single_failure/failing-query-driver-site.xml
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/driver/DriverConfiguration.java
* (add) lens-server/src/test/resources/drivers/retry/single_failure/driver-site.xml
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/query/events/QueryRunning.java
* (edit) lens-driver-hive/src/test/java/org/apache/lens/driver/hive/TestHiveDriver.java
* (edit) lens-api/src/test/java/org/apache/lens/api/jaxb/YAMLToStringStrategyTest.java
* (add) lens-server/src/test/java/org/apache/lens/server/query/retry/TestServerRetryPolicyDecider.java
* (edit) lens-server/src/test/java/org/apache/lens/server/query/constraint/DefaultQueryLaunchingConstraintsCheckerTest.java
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/retry/FailureContext.java
* (edit) lens-driver-es/src/test/resources/hive-site.xml
* (delete) lens-server-api/src/main/java/org/apache/lens/server/api/common/OperationRetryHandlerFactory.java
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/retry/NoRetryHandler.java
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/query/events/QueryClosed.java
* (edit) lens-driver-jdbc/src/main/java/org/apache/lens/driver/jdbc/JDBCDriverConfConstants.java
* (add) lens-server-api/src/main/java/org/apache/lens/server/api/query/comparators/QueryPriorityComparator.java
* (edit)
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15581294#comment-15581294 ]

Rajat Khandelwal commented on LENS-743:
---------------------------------------

Committed myself.

> Query failure retries for transient errors
> ------------------------------------------
>
> Key: LENS-743
> URL: https://issues.apache.org/jira/browse/LENS-743
> Project: Apache Lens
> Issue Type: Improvement
> Components: server
> Reporter: Amareshwari Sriramadasu
> Assignee: Rajat Khandelwal
> Labels: gsoc2016, java
> Fix For: 2.7
>
> Attachments: LENS-743.09.patch, LENS-743.11.patch, LENS-743.12.patch,
> LENS-743.13.patch, LENS-743.14.patch, LENS-743.15.patch, LENS-743.16.patch,
> LENS-743.18.patch
>
>
> There have to be retries for query failures for transient errors like network
> errors (Hive server not reachable/ Metastore not reachable/ DB not
> reachable). Retries should be available for each phase - submission,
> execution, updating status, fetching results and formatting.
> Right now, any such failure results in marking query as failed.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15572113#comment-15572113 ]

Hadoop QA commented on LENS-743:

Applied patch: [LENS-743.18.patch|https://issues.apache.org/jira/secure/attachment/12833052/LENS-743.18.patch] and ran command: mvn clean install -fae. Result: Success. Build Job: https://builds.apache.org/job/PreCommit-Lens-Build/1059/
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15571597#comment-15571597 ]

Hadoop QA commented on LENS-743:

Applied patch: [LENS-743.18.patch|https://issues.apache.org/jira/secure/attachment/12833052/LENS-743.18.patch] and ran command: mvn clean install -fae. Result: Failure. Build Job: https://builds.apache.org/job/PreCommit-Lens-Build/1055/
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15571412#comment-15571412 ]

Hadoop QA commented on LENS-743:

Applied patch: [LENS-743.18.patch|https://issues.apache.org/jira/secure/attachment/12833052/LENS-743.18.patch] and ran command: mvn clean install -fae. Result: Failure. Build Job: https://builds.apache.org/job/PreCommit-Lens-Build/1051/
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15571089#comment-15571089 ]

Hadoop QA commented on LENS-743:

Applied patch: [LENS-743.18.patch|https://issues.apache.org/jira/secure/attachment/12833052/LENS-743.18.patch] and ran command: mvn clean install -fae. Result: Failure. Build Job: https://builds.apache.org/job/PreCommit-Lens-Build/1048/
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15571014#comment-15571014 ]

Rajat Khandelwal commented on LENS-743:
---------------------------------------

Taking patch from reviewboard and attaching
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15568761#comment-15568761 ]

Hadoop QA commented on LENS-743:

Patch does not apply. Build job: https://builds.apache.org/job/PreCommit-Lens-Build/1047/
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15561557#comment-15561557 ]

Rajat Khandelwal commented on LENS-743:
---------------------------------------

About time! :)
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15561547#comment-15561547 ]

Hadoop QA commented on LENS-743:

Applied patch: [LENS-743.15.patch|https://issues.apache.org/jira/secure/attachment/12832415/LENS-743.15.patch] and ran command: mvn clean install -fae. Result: Success. Build Job: https://builds.apache.org/job/PreCommit-Lens-Build/1035/
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15561515#comment-15561515 ]

Hadoop QA commented on LENS-743:

Applied patch: [LENS-743.15.patch|https://issues.apache.org/jira/secure/attachment/12832415/LENS-743.15.patch] and ran command: mvn clean install -fae. Result: Failure. Build Job: https://builds.apache.org/job/PreCommit-Lens-Build/1036/
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15561471#comment-15561471 ]

Rajat Khandelwal commented on LENS-743:
---------------------------------------

Taking patch from reviewboard and attaching
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15561455#comment-15561455 ]

Hadoop QA commented on LENS-743:

Applied patch: [LENS-743.14.patch|https://issues.apache.org/jira/secure/attachment/12832411/LENS-743.14.patch] and ran command: mvn clean install -fae. Result: Failure. Build Job: https://builds.apache.org/job/PreCommit-Lens-Build/1029/
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15561358#comment-15561358 ]

Rajat Khandelwal commented on LENS-743:
---------------------------------------

Taking patch from reviewboard and attaching
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15559443#comment-15559443 ]

Hadoop QA commented on LENS-743:

Applied patch: [LENS-743.13.patch|https://issues.apache.org/jira/secure/attachment/12832318/LENS-743.13.patch] and ran command: mvn clean install -fae. Result: Failure. Build Job: https://builds.apache.org/job/PreCommit-Lens-Build/1028/
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15559416#comment-15559416 ]

Rajat Khandelwal commented on LENS-743:
---------------------------------------

Taking patch from reviewboard and attaching
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15559382#comment-15559382 ]

Hadoop QA commented on LENS-743:

Applied patch: [LENS-743.12.patch|https://issues.apache.org/jira/secure/attachment/12832315/LENS-743.12.patch] and ran command: mvn clean install -fae. Result: Failure. Build Job: https://builds.apache.org/job/PreCommit-Lens-Build/1020/
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15559371#comment-15559371 ]

Rajat Khandelwal commented on LENS-743:
---------------------------------------

Taking patch from reviewboard and attaching
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15556314#comment-15556314 ]

Hadoop QA commented on LENS-743:

Patch does not apply. Build job: https://builds.apache.org/job/PreCommit-Lens-Build/1018/
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1817#comment-1817 ]

Rajat Khandelwal commented on LENS-743:
---------------------------------------

Taking patch from reviewboard and attaching
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15551859#comment-15551859 ] Hadoop QA commented on LENS-743: Patch does not apply. Build job: https://builds.apache.org/job/PreCommit-Lens-Build/1001/ > Query failure retries for transient errors > -- > > Key: LENS-743 > URL: https://issues.apache.org/jira/browse/LENS-743 > Project: Apache Lens > Issue Type: Improvement > Components: server >Reporter: Amareshwari Sriramadasu >Assignee: Rajat Khandelwal > Labels: gsoc2016, java > Attachments: LENS-743.09.patch > > > There have to be retries for query failures for transient errors like network > errors (Hive server not reachable/ Metastore not reachable/ DB not > reachable). Retries should be available for each phase - submission, > execution, updating status, fetching results and formatting. > Right now, any such failure results in marking query as failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15551818#comment-15551818 ] Rajat Khandelwal commented on LENS-743: --- Taking patch from reviewboard and attaching > Query failure retries for transient errors > -- > > Key: LENS-743 > URL: https://issues.apache.org/jira/browse/LENS-743 > Project: Apache Lens > Issue Type: Improvement > Components: server >Reporter: Amareshwari Sriramadasu >Assignee: Rajat Khandelwal > Labels: gsoc2016, java > Attachments: LENS-743.09.patch > > > There have to be retries for query failures for transient errors like network > errors (Hive server not reachable/ Metastore not reachable/ DB not > reachable). Retries should be available for each phase - submission, > execution, updating status, fetching results and formatting. > Right now, any such failure results in marking query as failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15535812#comment-15535812 ] Rajat Khandelwal commented on LENS-743: --- The policies and retries are persisted and recovered upon restart. So it resumes from where it left off. So if retries are not exhausted and query is in queue, it'll be restored in the queue, from where QuerySubmitter takes care of taking retry policy into consideration. > Query failure retries for transient errors > -- > > Key: LENS-743 > URL: https://issues.apache.org/jira/browse/LENS-743 > Project: Apache Lens > Issue Type: Improvement > Components: server >Reporter: Amareshwari Sriramadasu >Assignee: Rajat Khandelwal > Labels: gsoc2016, java > > There have to be retries for query failures for transient errors like network > errors (Hive server not reachable/ Metastore not reachable/ DB not > reachable). Retries should be available for each phase - submission, > execution, updating status, fetching results and formatting. > Right now, any such failure results in marking query as failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15532095#comment-15532095 ] Rajat Khandelwal commented on LENS-743: --- More changes:
h4. Query API changes
The details API will return all attempts. The status API will return the current attempt's status. In case of retries and status being zero, the progress message should help users understand that their query is being retried.
h4. Unification of Drivers
Common functionality of drivers has been pulled out of the individual implementations into AbstractLensDriver.
* Each driver implemented configure, in which it:
** Added its own resource file
** Set configuration based on the last step, which can be returned in getConf
Both these steps are now moved to AbstractLensDriver.
* Each driver initialized its own query launching constraints and waiting query selection policies. This is also moved to AbstractLensDriver.
* In line with these, initialization of the retry policy decider is also done in AbstractLensDriver.
* A subclass of Configuration has been added for driver configuration. This allows for flexible and unifiable properties for the above things. I'll explain with an example. For reading the retry policy, the driver will first check the "lens.driver.driver_type.retry.policy" key. If that's not present, it'll look for "lens.driver.retry.policy". If that's also not there, it'll check for "retry.policy". This change allows us to get rid of duplicate properties like {"lens.driver.hive.query.launching.constraint.factories" and "lens.driver.jdbc.query.launching.constraint.factories"} and use just "query.launching.constraint.factories". This change doesn't force the move, it just facilitates it, so backward compatibility is maintained. Another upside is that any new driver now has these things readily available with configuration and no code.
h4. Query Events
Query events have been logically grouped into a module.
h4. Transaction semantics in finished queries
Finished queries are inserted by:
* turn autocommit off
* insert query
* insert attempts
* commit
This change renders LENS-1308 moot.
> Query failure retries for transient errors > -- > > Key: LENS-743 > URL: https://issues.apache.org/jira/browse/LENS-743 > Project: Apache Lens > Issue Type: Improvement > Components: server >Reporter: Amareshwari Sriramadasu >Assignee: Rajat Khandelwal > Labels: gsoc2016, java > > There have to be retries for query failures for transient errors like network > errors (Hive server not reachable/ Metastore not reachable/ DB not > reachable). Retries should be available for each phase - submission, > execution, updating status, fetching results and formatting. > Right now, any such failure results in marking query as failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
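The three-step property lookup described in the comment above (driver-type-specific key, then driver-wide key, then bare key) can be sketched roughly as follows. This is only an illustration under assumptions: `DriverConfSketch`, its constructor and `getWithFallback` are hypothetical names, not the actual Lens driver-configuration class, and a plain `Map` stands in for Hadoop's `Configuration`.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the driver configuration subclass; not the actual Lens class. */
class DriverConfSketch {
  private final Map<String, String> props = new HashMap<>();
  private final String driverType; // e.g. "hive" or "jdbc"

  DriverConfSketch(String driverType) {
    this.driverType = driverType;
  }

  void set(String key, String value) {
    props.put(key, value);
  }

  /**
   * Looks up a short key such as "retry.policy", trying the most specific name first:
   * lens.driver.[type].[key], then lens.driver.[key], then [key].
   */
  String getWithFallback(String key) {
    String[] candidates = {
      "lens.driver." + driverType + "." + key,
      "lens.driver." + key,
      key
    };
    for (String candidate : candidates) {
      String value = props.get(candidate);
      if (value != null) {
        return value;
      }
    }
    return null; // not configured under any of the three names
  }
}
```

With such a lookup, an existing "lens.driver.hive.retry.policy" entry would keep taking precedence for the Hive driver, while a new driver could rely on the bare "retry.policy" key alone, which is the backward-compatibility point made in the comment.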
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15529044#comment-15529044 ] Rajat Khandelwal commented on LENS-743: --- It will retry only on d3. That is also dependent on whether server retry policy allows more than one retry (since according to server retry policy, one retry has been done while trying on d2). > Query failure retries for transient errors > -- > > Key: LENS-743 > URL: https://issues.apache.org/jira/browse/LENS-743 > Project: Apache Lens > Issue Type: Improvement > Components: server >Reporter: Amareshwari Sriramadasu >Assignee: Rajat Khandelwal > Labels: gsoc2016, java > > There have to be retries for query failures for transient errors like network > errors (Hive server not reachable/ Metastore not reachable/ DB not > reachable). Retries should be available for each phase - submission, > execution, updating status, fetching results and formatting. > Right now, any such failure results in marking query as failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
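For the three-driver scenario (d1, d2, d3) that this reply addresses, the behaviour can be sketched as the bookkeeping below. It is a simplified illustration, not Lens code: `ServerRetrySketch`, `driverGaveUp` and `candidates` are hypothetical names, and the assumption that every driver that gives up consumes one server-level retry is inferred from the reply rather than taken from the patch.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of server-level retry bookkeeping across drivers; not Lens code. */
class ServerRetrySketch {
  private final int maxServerRetries;                          // assumed server policy limit
  private final Set<String> exhausted = new LinkedHashSet<>(); // drivers whose own policy said "no retry"

  ServerRetrySketch(int maxServerRetries) {
    this.maxServerRetries = maxServerRetries;
  }

  /** Records that a driver's own retry policy has given up on the query. */
  void driverGaveUp(String driver) {
    exhausted.add(driver);
  }

  /** Drivers still eligible for the next attempt; empty once server retries are used up. */
  List<String> candidates(List<String> allDrivers) {
    List<String> result = new ArrayList<>();
    // Moving on after each exhausted driver counts as one server-level retry.
    if (exhausted.size() > maxServerRetries) {
      return result;
    }
    for (String driver : allDrivers) {
      if (!exhausted.contains(driver)) {
        result.add(driver);
      }
    }
    return result;
  }
}
```

In this sketch, once d1 and d2 have both given up, only d3 remains a candidate, and only if the server limit allows at least two retries.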
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15528395#comment-15528395 ] Puneet Gupta commented on LENS-743: --- [~prongs] Thanks for documenting this one in such detail. +1 for the dual retry policy (driver + server). I had one doubt. Suppose we have 3 drivers (d1, d2, d3). The query was run on d1 and failed twice, and d1's policy confirmed a no-retry. The server policy said to try on d2 or d3, and d2 was chosen based on cost. Then 2 tries were done on d2 and they failed too, and d2's policy confirmed a no-retry. Will the server policy now try on d3 alone, or on d1 and d3? > Query failure retries for transient errors > -- > > Key: LENS-743 > URL: https://issues.apache.org/jira/browse/LENS-743 > Project: Apache Lens > Issue Type: Improvement > Components: server >Reporter: Amareshwari Sriramadasu >Assignee: Rajat Khandelwal > Labels: gsoc2016, java > > There have to be retries for query failures for transient errors like network > errors (Hive server not reachable/ Metastore not reachable/ DB not > reachable). Retries should be available for each phase - submission, > execution, updating status, fetching results and formatting. > Right now, any such failure results in marking query as failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15525996#comment-15525996 ] Rajat Khandelwal commented on LENS-743: --- Have posted revision 3 of the patch. The patch has become quite big. Will summarize changes done till now in another comment. > Query failure retries for transient errors > -- > > Key: LENS-743 > URL: https://issues.apache.org/jira/browse/LENS-743 > Project: Apache Lens > Issue Type: Improvement > Components: server >Reporter: Amareshwari Sriramadasu >Assignee: Rajat Khandelwal > Labels: gsoc2016, java > > There have to be retries for query failures for transient errors like network > errors (Hive server not reachable/ Metastore not reachable/ DB not > reachable). Retries should be available for each phase - submission, > execution, updating status, fetching results and formatting. > Right now, any such failure results in marking query as failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15507348#comment-15507348 ] Rajat Khandelwal commented on LENS-743: --- Created https://reviews.apache.org/r/52088/ > Query failure retries for transient errors > -- > > Key: LENS-743 > URL: https://issues.apache.org/jira/browse/LENS-743 > Project: Apache Lens > Issue Type: Improvement > Components: server >Reporter: Amareshwari Sriramadasu >Assignee: Rajat Khandelwal > Labels: gsoc2016, java > > There have to be retries for query failures for transient errors like network > errors (Hive server not reachable/ Metastore not reachable/ DB not > reachable). Retries should be available for each phase - submission, > execution, updating status, fetching results and formatting. > Right now, any such failure results in marking query as failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15177727#comment-15177727 ] Rajat Khandelwal commented on LENS-743: --- Hi [~chidelmun] Wonderful to see that you're interested. To get started with Apache Lens, you should consult http://lens.apache.org/. The website contains a high-level description of the project, along with resources describing how to contribute. Try to set up the environment. Feel free to mail the dev mailing list (dev@lens.apache.org) if you get stuck anywhere. This issue is for adding a new feature in Lens. So after setting up, the next step would be to get to know all the moving parts surrounding that feature. We'll help you understand once you get to that stage. Then when you're actually coding the feature, we'll provide constructive feedback through github/jira/reviewboard, ensuring a good-quality feature addition in Lens along with a huge learning experience for you :) Hope to see you on the mailing lists :) > Query failure retries for transient errors > -- > > Key: LENS-743 > URL: https://issues.apache.org/jira/browse/LENS-743 > Project: Apache Lens > Issue Type: Improvement > Components: server >Reporter: Amareshwari Sriramadasu >Assignee: Rajat Khandelwal > Labels: gsoc2016 > > There have to be retries for query failures for transient errors like network > errors (Hive server not reachable/ Metastore not reachable/ DB not > reachable). Retries should be available for each phase - submission, > execution, updating status, fetching results and formatting. > Right now, any such failure results in marking query as failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15177690#comment-15177690 ] Delveri Munang commented on LENS-743: - Hi, I am Delveri Munang, a software engineering student at the University of Buea, Cameroon, and I wish to work on this project during GSoC 2016. I have good knowledge of Java and network programming. I look forward to hearing from anyone soon with pointers on where to get started. > Query failure retries for transient errors > -- > > Key: LENS-743 > URL: https://issues.apache.org/jira/browse/LENS-743 > Project: Apache Lens > Issue Type: Improvement > Components: server >Reporter: Amareshwari Sriramadasu >Assignee: Rajat Khandelwal > Labels: gsoc2016 > > There have to be retries for query failures for transient errors like network > errors (Hive server not reachable/ Metastore not reachable/ DB not > reachable). Retries should be available for each phase - submission, > execution, updating status, fetching results and formatting. > Right now, any such failure results in marking query as failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052402#comment-15052402 ] Rajat Khandelwal commented on LENS-743: --- I'm thinking the Lens server will consult the driver regarding the retry mechanism. The driver will decide based on the failure type. The policy can be exponential backoff or immediate retry. These are only high-level constructs, but I feel they'll get clearer as I dive into coding. > Query failure retries for transient errors > -- > > Key: LENS-743 > URL: https://issues.apache.org/jira/browse/LENS-743 > Project: Apache Lens > Issue Type: Improvement > Components: server >Reporter: Amareshwari Sriramadasu >Assignee: Rajat Khandelwal > > There have to be retries for query failures for transient errors like network > errors (Hive server not reachable/ Metastore not reachable/ DB not > reachable). Retries should be available for each phase - submission, > execution, updating status, fetching results and formatting. > Right now, any such failure results in marking query as failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
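A driver-side decision keyed on failure type, as proposed in the comment above, might look something like the sketch below. Everything here is an assumption made for illustration: `RetryDeciderSketch` and its `Decision` enum are hypothetical, and the mapping of `java.net` exception types to decisions is an arbitrary example rather than anything stated in the thread.

```java
import java.net.ConnectException;
import java.net.SocketTimeoutException;

/** Hypothetical sketch of a driver deciding on a retry from the failure type; not Lens code. */
class RetryDeciderSketch {
  enum Decision { RETRY_IMMEDIATELY, RETRY_WITH_BACKOFF, DO_NOT_RETRY }

  /** Walks the cause chain and picks a decision for the first recognised transient error. */
  static Decision decide(Throwable failure) {
    for (Throwable t = failure; t != null; t = t.getCause()) {
      if (t instanceof ConnectException) {
        // Assumed mapping: an unreachable server is likely to stay down briefly, so back off.
        return Decision.RETRY_WITH_BACKOFF;
      }
      if (t instanceof SocketTimeoutException) {
        // Assumed mapping: a single timeout is retried right away.
        return Decision.RETRY_IMMEDIATELY;
      }
    }
    // Anything unrecognised (e.g. a semantic error in the query) is treated as permanent.
    return Decision.DO_NOT_RETRY;
  }
}
```

The server would then only need to act on the returned decision, which keeps knowledge of driver-specific error types inside the driver.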
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052190#comment-15052190 ] Puneet Gupta commented on LENS-743: ---
Configurations
* Retry attempts (driver specific / error specific or common?)
* Wait duration between retries (can have a default value and also be based on error type)
How will the APIs respond?
* query details: show details for all attempts
* query status: show status of latest attempt
* show logs: show logs for all attempts
> Query failure retries for transient errors > -- > > Key: LENS-743 > URL: https://issues.apache.org/jira/browse/LENS-743 > Project: Apache Lens > Issue Type: Improvement > Components: server >Reporter: Amareshwari Sriramadasu >Assignee: Rajat Khandelwal > > There have to be retries for query failures for transient errors like network > errors (Hive server not reachable/ Metastore not reachable/ DB not > reachable). Retries should be available for each phase - submission, > execution, updating status, fetching results and formatting. > Right now, any such failure results in marking query as failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15048410#comment-15048410 ] Rajat Khandelwal commented on LENS-743: --- I'm thinking creating a new query handle will result in addition of a lot of book-keeping code. Instead of that, I'm thinking we should add a retry-count field in query context. The email failure message will be sent to the user as usual. User will be notified that we're re-trying the query. Query handle will be same, all the state transitions will be same, just that after FAILED, it'll move to QUEUED/RUNNING state. > Query failure retries for transient errors > -- > > Key: LENS-743 > URL: https://issues.apache.org/jira/browse/LENS-743 > Project: Apache Lens > Issue Type: Improvement > Components: server >Reporter: Amareshwari Sriramadasu >Assignee: Rajat Khandelwal > Labels: newbie > > There have to be retries for query failures for transient errors like network > errors (Hive server not reachable/ Metastore not reachable/ DB not > reachable). Retries should be available for each phase - submission, > execution, updating status, fetching results and formatting. > Right now, any such failure results in marking query as failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
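The idea above of keeping one query handle and adding a retry count, with FAILED leading back to QUEUED, can be sketched as below. `QueryContextSketch`, its fields and its `Status` enum are hypothetical simplifications; the real Lens query context and status types carry far more state, and the configured retry limit is an assumption.

```java
/** Hypothetical, heavily simplified query context; not the actual Lens class. */
class QueryContextSketch {
  enum Status { QUEUED, RUNNING, FAILED, SUCCESSFUL }

  private final String handle;  // stays the same across all attempts
  private final int maxRetries; // assumed to come from configuration
  private int retryCount = 0;
  private Status status = Status.QUEUED;

  QueryContextSketch(String handle, int maxRetries) {
    this.handle = handle;
    this.maxRetries = maxRetries;
  }

  String getHandle() { return handle; }
  int getRetryCount() { return retryCount; }
  Status getStatus() { return status; }

  /** QUEUED -> RUNNING, as for any query. */
  void launch() {
    status = Status.RUNNING;
  }

  /** Marks the current attempt failed; returns true if the query was re-queued for another attempt. */
  boolean attemptFailed() {
    status = Status.FAILED;
    if (retryCount < maxRetries) {
      retryCount++;
      status = Status.QUEUED; // FAILED is no longer terminal while retries remain
      return true;
    }
    return false; // retries exhausted: the query stays FAILED
  }
}
```

Because the handle never changes, clients polling by handle would simply observe the status moving from FAILED back to QUEUED or RUNNING.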
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15048543#comment-15048543 ] Rajat Khandelwal commented on LENS-743: --- One major thing to consider is whether, in the event of a retry, the query should go back to the queue or be launched directly on the driver. I believe that the retry should not start from the beginning; it should just re-launch on the driver. But that has the hazard of one query taking up a major chunk of resources in retries, hindering the execution of other queries. It would be helpful if others could share their thoughts on this. > Query failure retries for transient errors > -- > > Key: LENS-743 > URL: https://issues.apache.org/jira/browse/LENS-743 > Project: Apache Lens > Issue Type: Improvement > Components: server >Reporter: Amareshwari Sriramadasu >Assignee: Rajat Khandelwal > Labels: newbie > > There have to be retries for query failures for transient errors like network > errors (Hive server not reachable/ Metastore not reachable/ DB not > reachable). Retries should be available for each phase - submission, > execution, updating status, fetching results and formatting. > Right now, any such failure results in marking query as failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15048541#comment-15048541 ] Puneet Gupta commented on LENS-743: ---
*Different IDs Approach*
*Pros*
1. Clean separation between runs, with separate log files/DB entry for each run
*Cons*
We have to add extra functionality that the user has to be aware of. Maybe the retry option should be switched off by default so that existing users are not affected.
1. API to access all linked ids.
2. Existing APIs may need an option to show the status/details/result/etc. for the latest retry by passing the original handle (for convenience's sake)?
Please add to the list.
> Query failure retries for transient errors > -- > > Key: LENS-743 > URL: https://issues.apache.org/jira/browse/LENS-743 > Project: Apache Lens > Issue Type: Improvement > Components: server >Reporter: Amareshwari Sriramadasu >Assignee: Rajat Khandelwal > Labels: newbie > > There have to be retries for query failures for transient errors like network > errors (Hive server not reachable/ Metastore not reachable/ DB not > reachable). Retries should be available for each phase - submission, > execution, updating status, fetching results and formatting. > Right now, any such failure results in marking query as failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15048560#comment-15048560 ] Amareshwari Sriramadasu commented on LENS-743: -- Note on this : Retry can actually happen for each phase. For example, if result formatting failed. Only formatting can be retried and whole query re-run is not required. > Query failure retries for transient errors > -- > > Key: LENS-743 > URL: https://issues.apache.org/jira/browse/LENS-743 > Project: Apache Lens > Issue Type: Improvement > Components: server >Reporter: Amareshwari Sriramadasu >Assignee: Rajat Khandelwal > > There have to be retries for query failures for transient errors like network > errors (Hive server not reachable/ Metastore not reachable/ DB not > reachable). Retries should be available for each phase - submission, > execution, updating status, fetching results and formatting. > Right now, any such failure results in marking query as failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15048582#comment-15048582 ] Rajat Khandelwal commented on LENS-743: --- Agree with the attempts approach. We can add the attempt number as a column in db. > Query failure retries for transient errors > -- > > Key: LENS-743 > URL: https://issues.apache.org/jira/browse/LENS-743 > Project: Apache Lens > Issue Type: Improvement > Components: server >Reporter: Amareshwari Sriramadasu >Assignee: Rajat Khandelwal > > There have to be retries for query failures for transient errors like network > errors (Hive server not reachable/ Metastore not reachable/ DB not > reachable). Retries should be available for each phase - submission, > execution, updating status, fetching results and formatting. > Right now, any such failure results in marking query as failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15048596#comment-15048596 ] Puneet Gupta commented on LENS-743: --- Good question. Adding to that: I would want a quick retry in case the failure happens early on (what counts as early on is debatable). Say a person submitted a query, it was promoted to run only after 10 hours, and then it fails within a few seconds/minutes due to a transient failure. But if a person submitted a query, it was promoted to run only after 10 hours, and then it fails after 10 more hours, should we re-run immediately? From the user's perspective, yes; from the Lens perspective, not sure. Another thing we need to consider is that if we do a quick retry, the cause of the transient failure may still persist. Should we wait and try? How long should we wait? Should all new queries also wait (because they'll fail anyway), say in case the failures are because the Hive server is clogged (excessive GC, etc.)? Another thought: maybe we should have the first re-run immediately (after some set wait time based on the type of error), and the subsequent runs can have exponential wait times. > Query failure retries for transient errors > -- > > Key: LENS-743 > URL: https://issues.apache.org/jira/browse/LENS-743 > Project: Apache Lens > Issue Type: Improvement > Components: server >Reporter: Amareshwari Sriramadasu >Assignee: Rajat Khandelwal > > There have to be retries for query failures for transient errors like network > errors (Hive server not reachable/ Metastore not reachable/ DB not > reachable). Retries should be available for each phase - submission, > execution, updating status, fetching results and formatting. > Right now, any such failure results in marking query as failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
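The last suggestion in the comment above (a fixed wait before the first re-run, then exponentially growing waits) can be sketched as a small schedule function. `BackOffSketch.waitMillis` and its parameters are hypothetical names chosen for this illustration, and the upper cap on the wait is an added assumption that the comment does not mention.

```java
/** Hypothetical wait-time schedule for retries; not the Lens BackOffRetryHandler. */
class BackOffSketch {
  /**
   * Wait before retry number n (starting at 1): base, 2*base, 4*base, ...
   * capped at maxWaitMillis (the cap is an assumption of this sketch).
   */
  static long waitMillis(int retryNumber, long baseWaitMillis, long maxWaitMillis) {
    if (retryNumber < 1) {
      throw new IllegalArgumentException("retryNumber starts at 1");
    }
    long wait = baseWaitMillis;
    // Double once per earlier retry, stopping early once the cap is reached.
    for (int i = 1; i < retryNumber && wait < maxWaitMillis; i++) {
      wait *= 2;
    }
    return Math.min(wait, maxWaitMillis);
  }
}
```

For example, with a base wait of 1000 ms and a cap of 60000 ms, the first three retries would wait 1000, 2000 and 4000 ms. The base wait could itself be chosen per error type, as the comment suggests.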
[jira] [Commented] (LENS-743) Query failure retries for transient errors
[ https://issues.apache.org/jira/browse/LENS-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15021880#comment-15021880 ] Rajat Khandelwal commented on LENS-743: --- [~yash...@gmail.com] Are you working on this? We at InMobi were thinking of picking this up if no one is working on it. > Query failure retries for transient errors > -- > > Key: LENS-743 > URL: https://issues.apache.org/jira/browse/LENS-743 > Project: Apache Lens > Issue Type: Improvement > Components: server >Reporter: Amareshwari Sriramadasu > Labels: newbie > > There have to be retries for query failures for transient errors like network > errors (Hive server not reachable/ Metastore not reachable/ DB not > reachable). Retries should be available for each phase - submission, > execution, updating status, fetching results and formatting. > Right now, any such failure results in marking query as failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)