Hi,
I read the skew handling strategy described in HIVE-964 and have a couple of
questions.
1. How do we get the big keys for a table? Do we launch a MapReduce job to
build a histogram on each table?
2. Once we have the big/skewed keys, we also have small/non-skewed keys. Do
we process these non-skewed keys in the same way (replicate join), or in the
traditional way (redistribution join)?
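
To make question 2 concrete, here is roughly the split I have in mind. This is
only my own sketch in Python, not taken from HIVE-964 or from Hive code: the
function names, the count threshold, the in-memory key counting, and the
num_reducers parameter are all assumptions for illustration.

```python
from collections import Counter, defaultdict


def find_skewed_keys(rows, threshold):
    """Return keys that occur more than `threshold` times (hypothetical rule)."""
    counts = Counter(key for key, _ in rows)
    return {key for key, n in counts.items() if n > threshold}


def hybrid_join(big, small, threshold, num_reducers=2):
    """Join `big` and `small` on key, tagging each output row with its path."""
    skewed = find_skewed_keys(big, threshold)

    small_by_key = defaultdict(list)
    for key, value in small:
        small_by_key[key].append(value)

    # Replicate path: small-side rows for skewed keys are copied to every task,
    # so the big-side rows for those keys never have to be shuffled.
    replicated = {key: small_by_key.get(key, []) for key in skewed}

    out = []
    buckets = defaultdict(list)
    for key, value in big:
        if key in skewed:
            for other in replicated[key]:
                out.append((key, value, other, "replicate"))
        else:
            # Redistribution path: route the row to a reducer by key hash.
            buckets[hash(key) % num_reducers].append((key, value))

    for bucket in buckets.values():
        for key, value in bucket:
            for other in small_by_key.get(key, []):
                out.append((key, value, other, "redistribute"))
    return out
```

With this split, only the rows for non-skewed keys go through the shuffle. My
question is whether the design intends something like this, or whether all
keys take the replicate path.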

Thanks,
-Gang



