[ https://issues.apache.org/jira/browse/HIVE-7074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gopal V resolved HIVE-7074.
---------------------------
    Resolution: Not a Problem

> The reducer parallelism should be a prime number for better stride protection
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-7074
>                 URL: https://issues.apache.org/jira/browse/HIVE-7074
>             Project: Hive
>          Issue Type: Improvement
>          Components: Statistics
>            Reporter: Gopal V
>            Assignee: Gopal V
>         Attachments: HIVE-7074.1.patch
>
>
> The current Hive reducer parallelism results in stride issues with key
> distribution.
> For example, a JOIN generating only even-numbered keys will get strided
> onto only some of the reducers.
> The probability of distribution skew is controlled by the number of common
> factors shared by the hashcode of the key and the number of buckets.
> Using a prime number within the reducer estimation will cut that probability
> down by a significant amount.
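
The stride effect described in the issue can be reproduced with a small sketch. It is not Hive's reducer estimation code: it assumes keys are plain integers whose hash code is the key itself, and that a key is routed to reducer `hash % num_reducers`. The function name `reducers_used` and the sample workload are hypothetical, chosen only for illustration.

```python
from collections import Counter


def reducers_used(keys, num_reducers):
    """Count how many keys land on each reducer under hash-modulo partitioning."""
    # Assumption: the hash code of an integer key is the key itself, and the
    # target reducer is hash % num_reducers.
    return Counter(key % num_reducers for key in keys)


# Hypothetical workload: a JOIN that emits only even keys.
even_keys = range(0, 10_000, 2)

composite = reducers_used(even_keys, 16)  # 16 shares the factor 2 with every key
prime = reducers_used(even_keys, 17)      # 17 shares no factor with the stride

print(len(composite), "of 16 reducers receive data")  # 8 of 16
print(len(prime), "of 17 reducers receive data")      # 17 of 17
```

With 16 reducers, every even key maps to an even reducer index, so half of the reducers stay idle. With 17 reducers the same keys reach all of them. A prime count only avoids strides that are not multiples of that prime: keys that are all multiples of 17 would still collapse onto a single reducer.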