[jira] [Commented] (HIVE-17270) Qtest results show wrong number of executors

Peter Vary (JIRA) Tue, 08 Aug 2017 11:50:36 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-17270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16118850#comment-16118850
 ]


Peter Vary commented on HIVE-17270:
-----------------------------------

{code:title=SparkSessionImpl}
  @Override
  public ObjectPair<Long, Integer> getMemoryAndCores() throws Exception {
    SparkConf sparkConf = hiveSparkClient.getSparkConf();
    int numExecutors = hiveSparkClient.getExecutorCount();
[..]
    int totalCores;
    String masterURL = sparkConf.get("spark.master");
    if (masterURL.startsWith("spark")) {
[..]
    } else {
      int coresPerExecutor = sparkConf.getInt("spark.executor.cores", 1);
      totalCores = numExecutors * coresPerExecutor;
    }
    totalCores = totalCores / sparkConf.getInt("spark.task.cpus", 1);

    long memoryPerTaskInBytes = totalMemory / totalCores;
    LOG.info("Spark cluster current has executors: " + numExecutors
        + ", total cores: " + totalCores + ", memory per executor: "
        + executorMemoryInMB + "M, memoryFraction: " + memoryFraction);
    return new ObjectPair<Long, Integer>(Long.valueOf(memoryPerTaskInBytes),
        Integer.valueOf(totalCores));
  }
{code}
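
To make the arithmetic in the excerpt above concrete, here is a minimal standalone sketch of the non-standalone branch only. The class and method names are hypothetical, and the input values are assumptions: 2 cores per executor is my reading of the hive-site.xml mentioned in the issue description, and {{spark.task.cpus}} is assumed to be left at its default of 1. Under those assumptions, an executor count of 1 yields the "total cores: 2" seen in the log, while the configured 2 executors would yield 4.

```java
public class TotalCoresSketch {

    // Mirrors only the "else" branch and the spark.task.cpus division
    // from the quoted getMemoryAndCores(); the elided parts are not modeled.
    static int totalCores(int numExecutors, int coresPerExecutor, int taskCpus) {
        int total = numExecutors * coresPerExecutor;
        return total / taskCpus;
    }

    public static void main(String[] args) {
        int coresPerExecutor = 2; // assumed value of spark.executor.cores
        int taskCpus = 1;         // assumed default of spark.task.cpus

        // What the log reports: 1 executor.
        System.out.println(totalCores(1, coresPerExecutor, taskCpus)); // prints 2

        // What the configuration asks for: 2 executors.
        System.out.println(totalCores(2, coresPerExecutor, taskCpus)); // prints 4
    }
}
```

So the core count in the log is consistent with the formula; only the executor count fed into it looks off.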

So my guess is that the problem is in {{hiveSparkClient.getExecutorCount()}}.

The implementation looks right, but... who knows :)
{code:title=SparkClientImpl.GetExecutorCountJob}
  private static class GetExecutorCountJob implements Job<Integer> {
    private static final long serialVersionUID = 1L;

    @Override
    public Integer call(JobContext jc) throws Exception {
      // minus 1 here otherwise driver is also counted as an executor
      int count = jc.sc().sc().getExecutorMemoryStatus().size() - 1;
      return Integer.valueOf(count);
    }
  }
{code}
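
As a rough illustration of how a "size minus one" count could report 1 on the first call and 2 on the second, here is a self-contained sketch that uses a plain map in place of the real memory-status map. Everything in it is hypothetical: the class name, the host:port keys, and the memory values. It also assumes that executors are added to the map one at a time as they start up, which would match the behavior described in the issue but is not confirmed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ExecutorCountSketch {

    // Same rule as the quoted job: every entry except the driver is an executor.
    static int executorCount(Map<String, long[]> memoryStatus) {
        return memoryStatus.size() - 1;
    }

    public static void main(String[] args) {
        // Stand-in for the status map; keys and values are made up.
        Map<String, long[]> status = new LinkedHashMap<>();
        status.put("driver-host:40000", new long[] {512L, 512L});
        status.put("executor1-host:40001", new long[] {512L, 512L});

        // First query: only one executor has registered so far (assumption).
        System.out.println(executorCount(status)); // prints 1

        // Later: the second executor has registered as well.
        status.put("executor2-host:40002", new long[] {512L, 512L});
        System.out.println(executorCount(status)); // prints 2
    }
}
```

If this is what happens, the count itself is correct at the moment it is taken; it is just taken before all requested executors are up.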

> Qtest results show wrong number of executors
> --------------------------------------------
>
>                 Key: HIVE-17270
>                 URL: https://issues.apache.org/jira/browse/HIVE-17270
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>    Affects Versions: 3.0.0
>            Reporter: Peter Vary
>            Assignee: Peter Vary
>
> The hive-site.xml shows that TestMiniSparkOnYarnCliDriver uses 2 cores 
> and 2 executor instances to run the queries. See: 
> https://github.com/apache/hive/blob/master/data/conf/spark/yarn-client/hive-site.xml#L233
> When reading the log files for the query tests, I see the following:
> {code}
> 2017-08-08T07:41:03,315  INFO [0381325d-2c8c-46fb-ab51-423defaddd84 main] 
> session.SparkSession: Spark cluster current has executors: 1, total cores: 2, 
> memory per executor: 512M, memoryFraction: 0.4
> {code}
> See: 
> http://104.198.109.242/logs/PreCommit-HIVE-Build-6299/succeeded/171-TestMiniSparkOnYarnCliDriver-insert_overwrite_directory2.q-scriptfile1.q-vector_outer_join0.q-and-17-more/logs/hive.log
> When running the tests against a real cluster, I found that the first time I 
> run an explain query I see 1 executor, but the second time I run it I see 
> 2 executors.
> Changing a Spark configuration setting on the cluster also resets this 
> behavior: the first run shows 1 executor again, and the second run shows 
> 2 executors.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
