[ https://issues.apache.org/jira/browse/HIVE-10671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539287#comment-14539287 ]
Rui Li commented on HIVE-10671:
-------------------------------
Why does each table have 2 sizes? The following is the output of the same
command on my cluster:
{code}
[root@node13-1 ~]# hadoop fs -du -h /user/hive/warehouse/tpch_flat_orc_320.db
2.4 G /user/hive/warehouse/tpch_flat_orc_320.db/customer
53.8 G /user/hive/warehouse/tpch_flat_orc_320.db/lineitem
1.7 K /user/hive/warehouse/tpch_flat_orc_320.db/nation
12.6 G /user/hive/warehouse/tpch_flat_orc_320.db/orders
1.2 G /user/hive/warehouse/tpch_flat_orc_320.db/part
9.2 G /user/hive/warehouse/tpch_flat_orc_320.db/partsupp
980 /user/hive/warehouse/tpch_flat_orc_320.db/region
156.8 M /user/hive/warehouse/tpch_flat_orc_320.db/supplier
{code}
Q22 runs for about 57s in both yarn-client and yarn-cluster mode on my side.
I'll try other cases.
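
For anyone who wants to repeat the comparison, here is a minimal sketch of how the two modes could be timed from a script. It assumes the Hive CLI is on the PATH, that the query lives in a local file (the file name in the usage note is hypothetical), and that passing `hive.execution.engine` and `spark.master` via `--hiveconf` is enough to switch modes on your setup. The measured wall-clock time includes session and Spark application startup, so it will not match the per-query time quoted above exactly.

```python
import subprocess
import time


def build_hive_command(spark_master, query_file):
    """Build a Hive CLI command that runs query_file on Spark with the given master."""
    return [
        "hive",
        "--hiveconf", "hive.execution.engine=spark",
        "--hiveconf", f"spark.master={spark_master}",
        "-f", query_file,
    ]


def time_query(spark_master, query_file, runner=subprocess.run):
    """Run the query once and return the elapsed wall-clock time in seconds.

    `runner` is injectable so the timing logic can be exercised without a cluster.
    """
    cmd = build_hive_command(spark_master, query_file)
    start = time.perf_counter()
    runner(cmd, check=True)  # raises CalledProcessError if the hive command fails
    return time.perf_counter() - start
```

Calling `time_query("yarn-client", "tpch_q22.sql")` and `time_query("yarn-cluster", "tpch_q22.sql")` a few times each, and discarding the first run of each, should give a rough idea of whether the gap reproduces.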
> yarn-cluster mode offers a degraded performance from yarn-client [Spark
> Branch]
> -------------------------------------------------------------------------------
>
> Key: HIVE-10671
> URL: https://issues.apache.org/jira/browse/HIVE-10671
> Project: Hive
> Issue Type: Bug
> Components: Spark
> Reporter: Xuefu Zhang
> Assignee: Rui Li
>
> With Hive on Spark, users noticed that in certain cases
> spark.master=yarn-client offers 2x or 3x better performance than if
> spark.master=yarn-cluster. However, yarn-cluster is what we recommend and
> support. Thus, we should investigate and fix the problem. One such query is
> TPC-H 22.