Re: HIVE-28488/28489/28490 and the performance of Hive 4.0.1 on MR3 1.12 (vs Trino 453)

Sungwoo Park Thu, 28 Nov 2024 18:57:12 -0800

Hello,

> We've merged all three pull requests. Thanks for your contributions.
>


The updated version of HIVE-28489 reduces the total running time of 10TB
TPC-DS by about another 100 seconds, so the total running time now drops
from around 5700s to around 5200s. Considering the maturity of the Hive
compiler today, I would say this is a significant improvement in
performance. Thank you for merging the patches.
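
As a rough check of what these figures mean in relative terms, here is a minimal calculation. It uses only the approximate totals quoted above; the variable names are just for illustration.

```python
# Approximate totals quoted above for 10TB TPC-DS, in seconds.
before_s = 5700
after_s = 5200

saved_s = before_s - after_s
reduction_pct = 100 * saved_s / before_s  # share of the original time saved
speedup = before_s / after_s              # ratio of old time to new time

print(f"saved {saved_s}s, {reduction_pct:.1f}% less time, {speedup:.2f}x speedup")
```

Since both totals are rounded, the result is only good to about one significant digit: a reduction of a little under 10 percent.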


> > 1. The query plan is identical, but Trino is much faster. This is due to
> > the architectural difference between Trino and Hive (on shuffle-intensive
> > queries): Trino is based on MPP and thus uses the push model, while Hive
> > uses the pull model. There is not much we can do about this type of
> > query. (Note that the push model has its own drawbacks and thus does not
> > always win over the pull model. That's why Trino is much slower than Hive
> > on many queries.)
>
> If we'd like to accelerate those queries, we may be able to enhance
> `tez.runtime.pipelined-shuffle.enabled`. I've not used this feature,
> and IMO the priority is lower considering the use case of Apache Hive.
>

Setting tez.runtime.pipelined-shuffle.enabled to true can be useful for
particular types of queries, but it produces only a small speedup on average
(no more than 4 percent) when tested with 10TB TPC-DS. Besides, it has the
drawback that it cannot be used with speculative execution (which is
important for dealing with occasional fetch delays).

Pipelined shuffling in Tez is different from pipelined shuffling in Trino,
which never writes to local disks. In Tez, the output of mappers is always
written to local disks regardless of pipelined shuffling. It's just that
the output can be written incrementally in separate chunks, and each
chunk can be shuffled to downstream tasks as soon as it is created.
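
The difference described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not Tez code: the function names and the event log are hypothetical, a "write" stands for data landing on the mapper's local disk, and the non-pipelined case is modeled as merging all chunks into one final output before anything is fetched.

```python
from typing import Iterator, List, Tuple

Event = Tuple[str, List[int]]


def mapper_chunks(records: List[int], chunk_size: int) -> Iterator[List[int]]:
    """Yield mapper output as sorted chunks of at most chunk_size records."""
    for start in range(0, len(records), chunk_size):
        yield sorted(records[start:start + chunk_size])


def run_mapper(records: List[int], chunk_size: int, pipelined: bool) -> List[Event]:
    """Return the sequence of local-disk writes and downstream fetches."""
    events: List[Event] = []
    pending: List[List[int]] = []
    for chunk in mapper_chunks(records, chunk_size):
        events.append(("write", chunk))      # always written to local disk
        if pipelined:
            events.append(("fetch", chunk))  # fetchable as soon as it exists
        else:
            pending.append(chunk)
    if not pipelined:
        merged = sorted(x for chunk in pending for x in chunk)
        events.append(("write", merged))     # final merged output
        events.append(("fetch", merged))     # fetched only after the merge
    return events


def first_fetch_index(events: List[Event]) -> int:
    """Position of the first fetch in the event log."""
    return next(i for i, (kind, _) in enumerate(events) if kind == "fetch")


records = [5, 3, 8, 1, 9, 2]
pipelined_events = run_mapper(records, chunk_size=2, pipelined=True)
plain_events = run_mapper(records, chunk_size=2, pipelined=False)
print(first_fetch_index(pipelined_events), first_fetch_index(plain_events))
```

In both modes every chunk is written to local disk first; the only thing that changes in this model is how early the first fetch can start.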

So, this problem can be solved only at the level of the execution engine.
For me, it's a high-priority problem because Hive in LLAP mode comes quite
close to Trino in performance and is already a strong contender as an
interactive query engine.


> > 2. Trino generates a query plan that is clearly more efficient than
> > Hive. We made some attempt to find a solution in Hive, but came to a
> > preliminary conclusion that this would require a significant change in the
> > query compiler (e.g., if the decision made later during query compilation
> > is inconsistent with an earlier assumption, retry with a different
> > assumption until consistency is reached).
>
> Interesting. I would like to know more details about those points. We
> can help with that part, as I am also involved with Trino.
>

An example is query 29 of TPC-DS. Trino chooses MapJoin while Hive chooses
DynamicPartitionedHashJoin, and the join ordering is very different:

Trino: ((ss + sr) + cs) + i + s
Hive: (cs + sr) + (ss + i) + s

As a result, Hive shuffles a lot of intermediate data and runs much slower
than Trino (e.g., 104 seconds vs 15 seconds on 10TB TPC-DS). Before trying
to find a fix in Hive, I would like to understand why Trino chooses a
simple yet efficient query plan for query 29. I am not actively working on
this problem at the moment, and will create a JIRA issue when I make some
progress. Thanks.
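
To make the structural difference between the two join orders for query 29 concrete, here is a small sketch that encodes them as nested tuples. It assumes that the unparenthesized `+` in the notation above associates to the left; the names `Plan`, `is_left_deep`, and `tables` are hypothetical helpers, and the sketch compares tree shape only, not join algorithms or costs.

```python
from typing import List, Tuple, Union

# A plan is either a table abbreviation or a (left, right) join.
Plan = Union[str, Tuple["Plan", "Plan"]]

# Join orders from above, read left-associatively (an assumption).
trino: Plan = (((("ss", "sr"), "cs"), "i"), "s")
hive: Plan = ((("cs", "sr"), ("ss", "i")), "s")


def is_left_deep(plan: Plan) -> bool:
    """True if the right input of every join is a single table."""
    if isinstance(plan, str):
        return True
    left, right = plan
    return isinstance(right, str) and is_left_deep(left)


def tables(plan: Plan) -> List[str]:
    """List the table abbreviations in the plan, left to right."""
    if isinstance(plan, str):
        return [plan]
    left, right = plan
    return tables(left) + tables(right)


print("Trino left-deep:", is_left_deep(trino))
print("Hive left-deep:", is_left_deep(hive))
print("Same tables:", sorted(tables(trino)) == sorted(tables(hive)))
```

Under this reading, the Trino order is a left-deep tree in which every join adds one table to the running result, while the Hive order joins two intermediate results, (cs + sr) and (ss + i), with each other.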

Regards,

--- Sungwoo Park

Re: HIVE-28488/28489/28490 and the performance of Hive 4.0.1 on MR3 1.12 (vs Trino 453)
