[ https://issues.apache.org/jira/browse/HIVE-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817450#comment-13817450 ]
Ashutosh Chauhan commented on HIVE-5657:
----------------------------------------

+1
Can you create a follow-up jira for removing unnecessary if(firstRow) from processOp(), seems like work in that if block can be done in initializeOp() ?
Also, you need to reupload your patch since seems like Hive QA hasn't picked it up yet.

> TopN produces incorrect results with count(distinct)
> ----------------------------------------------------
>
>                 Key: HIVE-5657
>                 URL: https://issues.apache.org/jira/browse/HIVE-5657
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>            Priority: Critical
>         Attachments: D13797.1.patch, D13797.2.patch, HIVE-5657.02.patch,
> HIVE-5657.1.patch.txt, example.patch
>
>
> Attached patch illustrates the problem.
> limit_pushdown test has various other cases of aggregations and distincts,
> incl. count-distinct, that work correctly (that said, src dataset is bad for
> testing these things because every count, for example, produces one record
> only), so something must be special about this.
> I am not very familiar with distinct- code and these nuances; if someone
> knows a quick fix feel free to take this, otherwise I will probably start
> looking next week.

--
This message was sent by Atlassian JIRA
(v6.1#6144)