[ https://issues.apache.org/jira/browse/DRILL-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Steven Phillips resolved DRILL-2801.
------------------------------------
    Resolution: Duplicate

> ORDER BY produces extra records
> -------------------------------
>
>                 Key: DRILL-2801
>                 URL: https://issues.apache.org/jira/browse/DRILL-2801
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Relational Operators
>    Affects Versions: 0.8.0
>            Reporter: Sudheesh Katkam
>            Assignee: Steven Phillips
>            Priority: Critical
>             Fix For: 1.0.0
>
>         Attachments: data.csv
>
>
> Running in embedded mode on my mac.
> {code}
> $ wc -w data.csv
> 50000 data.csv
> {code}
> Here's the query:
> {code}
> 0: jdbc:drill:zk=local> SELECT count(*) FROM dfs.`data.csv`;
> +------------+
> |   EXPR$0   |
> +------------+
> | 50000      |
> +------------+
> 1 row selected (0.223 seconds)
> 0: jdbc:drill:zk=local> SELECT columns[0] FROM dfs.`data.csv` ORDER BY columns[0];
> +------------+
> |   EXPR$0   |
> +------------+
> ...
> | 6          |
> +------------+
> 50,001 rows selected (0.928 seconds)
> 0: jdbc:drill:zk=local> SELECT tab.col, COUNT(tab.col) FROM (SELECT columns[0] col FROM dfs.`data.csv` ORDER BY columns[0]) tab GROUP BY tab.col;
> +------------+------------+
> |    col     |   EXPR$1   |
> +------------+------------+
> | 2          | 10000      |
> | 3          | 10000      |
> | 4          | 10000      |
> | 5          | 10001      |
> | 6          | 10000      |
> +------------+------------+
> 5 rows selected (0.704 seconds)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)