Github user chemikadze commented on a diff in the pull request:

    https://github.com/apache/incubator-griffin/pull/421#discussion_r219724832

    --- Diff: service/src/main/java/org/apache/griffin/core/util/YarnNetUtil.java ---
    @@ -56,6 +62,14 @@ public static boolean update(String url, JobInstanceBean instance) {
                     instance.setState(LivySessionStates.toLivyState(state));
                 }
                 return true;
    +        } catch (HttpClientErrorException e) {
    +            LOGGER.warn("client error {} from yarn: {}",
    +                    e.getMessage(), e.getResponseBodyAsString());
    +            if (e.getStatusCode() == HttpStatus.NOT_FOUND) {
    +                // in sync with Livy behavior, see com.cloudera.livy.utils.SparkYarnApp
    +                instance.setState(DEAD);
    --- End diff --

    Only 404 is handled here, which should not be the result of a network issue. It looks like any kind of error reported by the YARN client (after internal retries) results in the job being marked DEAD on the Livy side:

    https://github.com/cloudera/livy/blob/master/server/src/main/scala/com/cloudera/livy/utils/SparkYarnApp.scala#L307

    I'll need to double-check whether not-found applications ever get retried, to make sure the behavior is the same as on the Livy side. If not -- then that's what Livy would do.
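    A minimal standalone sketch of the distinction the comment raises. The names here (`JobState`, `stateForClientError`) are hypothetical, not Griffin or Livy APIs; the real code uses Spring's `HttpClientErrorException` and Griffin's `LivySessionStates`. It models the suggested behavior: any 4xx client error from YARN (after retries), not just 404, would mark the job DEAD, while server-side/network errors would leave the state unchanged.

    ```java
    // Hypothetical sketch only -- models the error-classification logic
    // discussed above, not the actual Griffin/Livy implementation.
    public class YarnErrorHandlingSketch {

        enum JobState { UNCHANGED, DEAD }

        // Decide the job state for a failed YARN REST call with the
        // given HTTP status. Mirrors the claim that Livy's SparkYarnApp
        // treats any client error (after internal retries) as DEAD,
        // whereas the diff under review only handles 404.
        static JobState stateForHttpStatus(int httpStatus) {
            if (httpStatus >= 400 && httpStatus < 500) {
                // Client error: YARN does not know the application
                // (404) or rejects the request -- retrying won't help.
                return JobState.DEAD;
            }
            // 5xx or transport-level failures may be transient,
            // so keep the current state and let polling retry.
            return JobState.UNCHANGED;
        }

        public static void main(String[] args) {
            System.out.println(stateForHttpStatus(404));
            System.out.println(stateForHttpStatus(400));
            System.out.println(stateForHttpStatus(503));
        }
    }
    ```

    Under this sketch, 404 and 400 both map to DEAD, while 503 leaves the state untouched, which is the broadening of the 404-only check that the comment is questioning.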
---