Hi,

We have some user data with the columns (userId, company, client, country, region, city), and we want to count distinct userIds grouped by several column combinations, such as:

select count(distinct userId) group by company
select count(distinct userId) group by company, client
select count(distinct userId) group by company, client, country
select count(distinct userId) group by company, client, country, region

etc. Each of these queries triggers its own shuffle stage. Since each grouping key is a prefix of the next one (for example, (company, client) already contains company), is there a way to reduce the number of shuffle stages?
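To make the idea concrete, here is a minimal sketch in plain Python (not Spark code) of the reuse we are hoping for: group once at the finest level, then derive every coarser level from the level below it instead of rescanning the data. The sample rows and the function name `distinct_users_by_prefix` are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows: (userId, company, client, country, region, city)
rows = [
    ("u1", "acme", "web", "US", "west", "SF"),
    ("u1", "acme", "web", "US", "west", "LA"),
    ("u2", "acme", "web", "US", "east", "NYC"),
    ("u2", "acme", "app", "DE", "south", "Munich"),
    ("u3", "globex", "web", "US", "west", "SF"),
]

def distinct_users_by_prefix(rows, num_keys=4):
    """Count distinct userIds for every prefix of the first num_keys dimensions."""
    # Single pass over the raw data: distinct userIds at the finest level
    finest = defaultdict(set)
    for user_id, *dims in rows:
        finest[tuple(dims[:num_keys])].add(user_id)

    levels = {num_keys: finest}
    # Derive each coarser level from the next finer one.
    # Sets are unioned rather than counts summed, because the same
    # user can appear under several finer-grained keys.
    for n in range(num_keys - 1, 0, -1):
        coarser = defaultdict(set)
        for key, users in levels[n + 1].items():
            coarser[key[:n]] |= users
        levels[n] = coarser

    # Map: prefix length -> {key tuple: distinct user count}
    return {
        n: {key: len(users) for key, users in groups.items()}
        for n, groups in levels.items()
    }

counts = distinct_users_by_prefix(rows)
```

Here `counts[1]` corresponds to "group by company" and `counts[2]` to "group by company, client". In Spark SQL, `GROUP BY ROLLUP(company, client, country, region)` looks like the closest built-in way to express all these groupings in one query, though we have not verified how many shuffles it actually needs for count(distinct).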
Thanks for any replies.