Hello Hive users,

We have released Hive on MR3 1.10. MR3 is an execution engine similar to MapReduce and Tez, and it supports Hadoop, Kubernetes, and standalone mode. Hive on MR3 (Hive-MR3) runs Hive 3.1.3 with MR3 as its execution backend. If you are interested, please give it a try.

In MR3 1.10, we have rewritten the shuffle library in Tez. In the previous version, each task managed its own fetchers independently of the other tasks. Now all fetchers inside a container are managed by a common shuffle server.

For those interested in a performance comparison, here are the latest results of testing Hive-MR3 1.9/1.10, Trino 435, and Spark 3.4.1 using the (original) TPC-DS benchmark at the 10TB scale. All the systems were tested with Java 17.

Hive-MR3 1.9: total 6473 seconds, geo-mean 25.0 seconds.
Hive-MR3 1.10: total 6138 seconds, geo-mean 24.4 seconds.
Trino 435: total 6950 seconds, geo-mean 19.2 seconds. Query 23 returns wrong results. Query 72 fails.
Spark 3.4.1 (using Parquet instead of ORC): total 19044 seconds, geo-mean 35.9 seconds.

Thank you,

--- Sungwoo
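
P.S. For readers less familiar with the two summary numbers reported above, here is a minimal Python sketch of how a total and a geometric mean can be computed from per-query running times. The function name `summarize` and the sample times are made up for illustration and are not taken from the benchmark. The sketch only shows why a system can have a lower geo-mean and a higher total at the same time: the geometric mean damps the effect of a few long-running queries, while the total does not.

```python
import math

def summarize(times_sec):
    """Return (total, geometric mean) of per-query running times in seconds."""
    total = sum(times_sec)
    # Geometric mean via the mean of logarithms, which avoids overflow
    # when multiplying many running times together.
    geo_mean = math.exp(sum(math.log(t) for t in times_sec) / len(times_sec))
    return total, geo_mean

# Hypothetical per-query times (assumption: NOT from the TPC-DS runs above).
system_a = [20.0, 20.0, 20.0, 20.0]  # uniform running times
system_b = [5.0, 5.0, 5.0, 100.0]    # fast on most queries, one very slow query

total_a, geo_a = summarize(system_a)
total_b, geo_b = summarize(system_b)
print(f"A: total {total_a:.1f}s, geo-mean {geo_a:.1f}s")
print(f"B: total {total_b:.1f}s, geo-mean {geo_b:.1f}s")
```

In this made-up example, system B has the larger total but the smaller geo-mean, so it helps to read the two numbers together.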