I found that the setting
hive.optimize.countdistinct=true;
is the problem. It looks like
https://issues.apache.org/jira/browse/HIVE-16654 introduced this side effect.
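A quick way to check this is to turn the optimization off for the session and rerun the reported query. This is only a sketch: it assumes the property can be overridden at session level in your environment, and it reuses the table and date range from the original report.

```sql
-- Disable the count(distinct) rewrite for this session only
-- (assumption: session-level overrides are allowed on your cluster).
set hive.optimize.countdistinct=false;

-- Rerun the query from the original report and compare the result
-- with the one obtained when the property is left at true.
select count(*), count(distinct mid)
from db1.table1
where log_date between '2020-07-20' and '2020-07-26';
```

If the Hive 3.1.2 result then matches Hive 2.3.2, that points to the rewrite as the cause.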
Best regards,
Eugene Chung (Korean : 정의근)
Hi everyone,
We created a video demo of fault tolerance in Hive on MR3 on Kubernetes,
using Hive 3.1.2 and MR3 1.1. Hope you enjoy it!
https://youtu.be/uoZGsMUlhew
Cheers,
--- Sungwoo
Hi,
For the same query, for example,
select count(*), count(distinct mid)
from db1.table1
where log_date between '2020-07-20' and '2020-07-26';
Hive 2.3.2 and Hive 3.1.2 give different results for the same input.
Note that db1.table1 is an ORC table partitioned by the log_date
Hi,
We are currently trying to use the Hive Warehouse Connector (HWC) to read
transactional tables in Hive (3.0.0.3.1) from Spark (2.3.0). It seems that
there is no other way to do so when the Hive tables are transactional.
Our application (Spring Boot and Spark) is running fine without the HWC