Hi,

We've been running Hive 2.0.1 on Tez 0.8.4 for a few weeks now. Most of the queries we run work. However, some queries that process millions to billions of rows don't finish when Tez is the execution engine.
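For reference, the execution engine in Hive is chosen through the `hive.execution.engine` property. The snippet below is only a sketch of how the two engines can be selected; it assumes the property is set per session rather than in hive-site.xml, which may not match the actual setup here.

```sql
-- Assumption: engine chosen per session, not in hive-site.xml.
-- Run queries on Tez:
set hive.execution.engine=tez;

-- Fall back to classic MapReduce for comparison:
set hive.execution.engine=mr;
```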
Here's an example of a simple query that does not finish:

    select count(distinct external_id) from t1;

Table t1 has more than 300 million rows. The mappers finish pretty quickly, and the query is supposed to run only 1 reducer. That reducer does not finish.

Here's a screenshot of another query that ran for over 8 hours, where the map output record count is about a billion:

[image: Inline image 1]

When I switched the execution engine to mr, the query finished in 30 minutes.

Are there any knobs we have to tweak?

--
Regards,
Premal Shah.