KomachiSion opened a new issue #2094: Lose result set when use DISTINCT syntax. URL: https://github.com/apache/incubator-shardingsphere/issues/2094 ## Bug Report SQL `SELECT DISTINCT col as col, count(*) from table_name group by col` should has the same result with SQL `SELECT col, count(*) from table_name group by col`. But the former loses data. ### Which version of ShardingSphere did you use? 4.0.0-RC1 ### Which project did you use? Sharding-JDBC or Sharding-Proxy? Sharding-Proxy ### Expected behavior Same result for two SQL. ### Actual behavior The result with DISTINCT syntax lose data. ### Reason analyze (If you can) The final result only contains the result of the first sharding-table of each sharding-database. And I checked the proxy log, SQL has been routed to all sharding nodes. ``` [INFO ] 14:06:15.839 [ShardingSphere-Command-5] ShardingSphere-SQL - Rule Type: sharding [INFO ] 14:06:15.839 [ShardingSphere-Command-5] ShardingSphere-SQL - Logic SQL: select distinct user_id as user_id, count(*) from t_order group by user_id [INFO ] 14:06:15.839 [ShardingSphere-Command-5] ShardingSphere-SQL - SQLStatement: SelectStatement(super=DQLStatement(super=AbstractSQLStatement(type=DQL, tables=Tables(tables=[Table(name=t_order, alias=Optional.absent())]), routeConditions=Conditions(orCondition=OrCondition(andConditions=[])), encryptConditions=Conditions(orCondition=OrCondition(andConditions=[])), sqlTokens=[TableToken(skippedSchemaNameLength=0, tableName=t_order, leftDelimiter=, rightDelimiter=), SQLToken(startIndex=74)], parametersIndex=0, logicSQL=select distinct user_id as user_id, count(*) from t_order group by user_id)), containStar=false, firstSelectItemStartIndex=16, selectListStopIndex=43, groupByLastIndex=73, items=[org.apache.shardingsphere.core.parse.parser.context.selectitem.DistinctSelectItem@2f5d24d9, AggregationSelectItem(type=COUNT, innerExpression=(*), alias=Optional.absent(), derivedAggregationSelectItems=[], index=-1)], groupByItems=[OrderItem(owner=Optional.absent(), name=Optional.of(user_id), orderDirection=ASC, nullOrderDirection=ASC, index=-1, expression=null, alias=Optional.of(user_id))], orderByItems=[OrderItem(owner=Optional.absent(), name=Optional.of(user_id), orderDirection=ASC, nullOrderDirection=ASC, index=-1, expression=null, alias=Optional.of(user_id))], limit=null, subqueryStatement=null, subqueryStatements=[], subqueryConditions=[]) [INFO ] 14:06:15.839 [ShardingSphere-Command-5] ShardingSphere-SQL - Actual SQL: ds_0 ::: select distinct user_id as user_id, count(*) from t_order_0 group by user_id ORDER BY user_id ASC [INFO ] 14:06:15.839 [ShardingSphere-Command-5] ShardingSphere-SQL - Actual SQL: ds_0 ::: select distinct user_id as user_id, count(*) from t_order_1 group by user_id ORDER BY user_id ASC [INFO ] 14:06:15.839 [ShardingSphere-Command-5] ShardingSphere-SQL - Actual SQL: ds_1 ::: select distinct user_id as user_id, count(*) from t_order_0 group by user_id ORDER BY user_id ASC [INFO ] 14:06:15.839 [ShardingSphere-Command-5] ShardingSphere-SQL - Actual SQL: ds_1 ::: select distinct user_id as user_id, count(*) from t_order_1 group by user_id ORDER BY user_id ASC ``` So, I guess there are some mistake for merger. ### Steps to reproduce the behavior, such as: SQL to execute, sharding rule configuration, when exception occur etc. sharding-configuration ``` schemaName: sharding_db dataSources: ds_0: url: jdbc:mysql://127.0.0.1:3306/demo_ds_0?serverTimezone=UTC&useSSL=false username: root password: connectionTimeoutMilliseconds: 30000 idleTimeoutMilliseconds: 60000 maxLifetimeMilliseconds: 1800000 maxPoolSize: 50 ds_1: url: jdbc:mysql://127.0.0.1:3306/demo_ds_1?serverTimezone=UTC&useSSL=false username: root password: connectionTimeoutMilliseconds: 30000 idleTimeoutMilliseconds: 60000 maxLifetimeMilliseconds: 1800000 maxPoolSize: 50 shardingRule: tables: t_order: actualDataNodes: ds_${0..1}.t_order_${0..1} tableStrategy: inline: shardingColumn: order_id algorithmExpression: t_order_${order_id % 2} keyGenerator: type: SNOWFLAKE column: order_id t_order_item: actualDataNodes: ds_${0..1}.t_order_item_${0..1} tableStrategy: inline: shardingColumn: order_id algorithmExpression: t_order_item_${order_id % 2} keyGenerator: type: SNOWFLAKE column: order_item_id bindingTables: - t_order,t_order_item broadcastTables: - t_config defaultDatabaseStrategy: inline: shardingColumn: user_id algorithmExpression: ds_${user_id % 2} defaultTableStrategy: none: ``` execute SQL ``` select distinct user_id as user_id, count(*) from t_order group by user_id ``` ### Example codes for reproduce this issue (such as a github link).
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services
