[ https://issues.apache.org/jira/browse/HIVE-2146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028563#comment-13028563 ]
Siying Dong commented on HIVE-2146:
-----------------------------------

Review board: https://reviews.apache.org/r/685/

> Block Sampling should adjust number of reducers accordingly to make it useful
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-2146
>                 URL: https://issues.apache.org/jira/browse/HIVE-2146
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Siying Dong
>            Assignee: Siying Dong
>         Attachments: HIVE-2146.1.patch, HIVE-2146.2.patch
>
>
> Currently the number of reducers is not adjusted for block sampling, so
> queries like:
>   select c from tab tablesample(1 percent) group by c;
> can generate a huge number of reducers even though the input is sampled down
> to a small size.
> We need to shrink the number of reducers to make block sampling more useful.
> Because the number of reducers is currently determined before the splits are
> computed, the way to do it is probably not clean, but we can make a good
> guess.
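
To illustrate the "good guess" mentioned in the description, here is a minimal sketch of scaling a size-based reducer estimate by the sampled percentage. It is not the code in the attached patches. The class name, method name, and parameters are hypothetical; bytesPerReducer and maxReducers are assumed to stand in for settings such as hive.exec.reducers.bytes.per.reducer and hive.exec.reducers.max.

```java
public final class SampledReducerEstimate {
    private SampledReducerEstimate() {}

    /**
     * Hypothetical helper (not Hive's API): estimates the reducer count from
     * the input size that is expected to remain after block sampling.
     *
     * @param totalInputBytes size of the unsampled input
     * @param samplePercent   percentage given to tablesample(n percent), 0..100
     * @param bytesPerReducer target amount of input per reducer (assumed setting)
     * @param maxReducers     upper bound on reducers (assumed setting)
     */
    public static int estimate(long totalInputBytes, double samplePercent,
                               long bytesPerReducer, int maxReducers) {
        if (totalInputBytes < 0 || bytesPerReducer <= 0 || maxReducers < 1
                || samplePercent < 0.0 || samplePercent > 100.0) {
            throw new IllegalArgumentException("invalid argument");
        }
        // Guess the sampled size before the splits are known.
        double sampledBytes = totalInputBytes * (samplePercent / 100.0);
        // One reducer per bytesPerReducer of sampled input, rounded up.
        long reducers = (long) Math.ceil(sampledBytes / bytesPerReducer);
        // Always use at least one reducer and never exceed the cap.
        reducers = Math.max(1L, reducers);
        return (int) Math.min(reducers, (long) maxReducers);
    }

    public static void main(String[] args) {
        long total = 450L * 1_000_000_000L;   // 450 GB of input (illustrative)
        long perReducer = 1_000_000_000L;     // 1 GB per reducer (illustrative)
        System.out.println(estimate(total, 100.0, perReducer, 999));
        System.out.println(estimate(total, 1.0, perReducer, 999));
    }
}
```

With these illustrative numbers, the unsampled estimate is 450 reducers, while the 1 percent sample yields 5. The result is only an estimate, since the actual sampled size depends on which blocks end up being selected.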