Hi Hive users,

I am happy to announce the release of MR3 1.7. MR3 is an execution engine for big data processing, and its main application, Hive on MR3, is an alternative to Hive-Tez and Hive-LLAP. I would like to summarize its main features.

1. Hive on MR3 on Hadoop

Hive on MR3 is easy to install on Hadoop. In particular, you don't have to upgrade Tez and Hadoop to matching versions.

2. Hive on MR3 on Kubernetes

MR3 provides native support for Kubernetes, and Hive on MR3 can run directly on Kubernetes (without having to operate Hadoop on Kubernetes). On public clouds like Amazon EKS, one can take advantage of autoscaling and spot instances.

3. Hive on MR3 in standalone mode

From version 1.7, MR3 supports standalone mode, which does not require a resource manager like Hadoop or Kubernetes. By exploiting standalone mode, one can run Hive on MR3 in virtually any type of cluster. Now installing Hive on MR3 is as simple as installing Trino/Presto.

4. Performance

Based on the 10TB TPC-DS benchmark, Hive on MR3 runs faster than Hive-LLAP (8074s vs 8680s). It is slightly slower than Trino 418 (8074s vs 7424s), but returns correct results on all 99 queries, while Trino fails or returns wrong results on some queries.

5. Java 17

As an experimental feature, MR3 supports Java 17, and Hive on MR3 can run with Java 17. On the same TPC-DS benchmark, upgrading Java from 8 to 17 yields a speedup of about 8% (from 8074s to 7415s).

6. Correctness

Hive on MR3 is based on Hive branch-3.1 and has backported over 700 patches. In addition to the q-tests included in the source code of Hive, we use the TPC-DS benchmark to check the correctness of query compilation. Hive on MR3 returns correct results on all 99 queries. (Note: The current master branch of Hive returns wrong results on some queries in TPC-DS.)

7. Misc

Other applications of MR3 include Spark on MR3 and MapReduce on MR3. For example, you can run MapReduce jobs directly on Kubernetes!

For the full documentation (including quick start guide and release notes), please see:

https://mr3docs.datamonad.com/

The git repository for Hive on MR3 can be used to build Hive on Tez as well (by ignoring the last few commits):

https://github.com/mr3project/hive-mr3

Thanks,

--- Sungwoo