Hi all, I read the Hive tutorial, and it says that "no two aggregations can have different DISTINCT columns". Could anyone tell me the reason for this restriction? Also, will the DISTINCT in the following query be translated into a map-reduce job, or is it done locally?
INSERT OVERWRITE TABLE pv_gender_agg
SELECT pv_users.gender,
       count(DISTINCT pv_users.userid),
       count(DISTINCT pv_users.ip)
FROM pv_users
GROUP BY pv_users.gender;
--
Best Regards
Jeff Zhang
