KomachiSion opened a new issue #2094: Lose result set when use DISTINCT syntax.
URL: https://github.com/apache/incubator-shardingsphere/issues/2094
 
 
   ## Bug Report
   
   SQL `SELECT DISTINCT col as col, count(*) from table_name group by col` 
should has the same result with SQL `SELECT col, count(*) from table_name group 
by col`. 
   But the former loses data.
   
   ### Which version of ShardingSphere did you use?
   
   4.0.0-RC1
   
   ### Which project did you use? Sharding-JDBC or Sharding-Proxy?
   
   Sharding-Proxy
   
   ### Expected behavior
   
   Same result for two SQL.
   
   ### Actual behavior
   
   The result with DISTINCT syntax lose data.
   
   ### Reason analyze (If you can)
   
   The final result only contains the result of the first sharding-table of 
each sharding-database.
   And I checked the proxy log, SQL has been routed to all sharding nodes.
   ```
   [INFO ] 14:06:15.839 [ShardingSphere-Command-5] ShardingSphere-SQL - Rule 
Type: sharding
   [INFO ] 14:06:15.839 [ShardingSphere-Command-5] ShardingSphere-SQL - Logic 
SQL: select distinct user_id as user_id, count(*) from t_order group by user_id
   [INFO ] 14:06:15.839 [ShardingSphere-Command-5] ShardingSphere-SQL - 
SQLStatement: 
SelectStatement(super=DQLStatement(super=AbstractSQLStatement(type=DQL, 
tables=Tables(tables=[Table(name=t_order, alias=Optional.absent())]), 
routeConditions=Conditions(orCondition=OrCondition(andConditions=[])), 
encryptConditions=Conditions(orCondition=OrCondition(andConditions=[])), 
sqlTokens=[TableToken(skippedSchemaNameLength=0, tableName=t_order, 
leftDelimiter=, rightDelimiter=), SQLToken(startIndex=74)], parametersIndex=0, 
logicSQL=select distinct user_id as user_id, count(*) from t_order group by 
user_id)), containStar=false, firstSelectItemStartIndex=16, 
selectListStopIndex=43, groupByLastIndex=73, 
items=[org.apache.shardingsphere.core.parse.parser.context.selectitem.DistinctSelectItem@2f5d24d9,
 AggregationSelectItem(type=COUNT, innerExpression=(*), 
alias=Optional.absent(), derivedAggregationSelectItems=[], index=-1)], 
groupByItems=[OrderItem(owner=Optional.absent(), name=Optional.of(user_id), 
orderDirection=ASC, nullOrderDirection=ASC, index=-1, expression=null, 
alias=Optional.of(user_id))], orderByItems=[OrderItem(owner=Optional.absent(), 
name=Optional.of(user_id), orderDirection=ASC, nullOrderDirection=ASC, 
index=-1, expression=null, alias=Optional.of(user_id))], limit=null, 
subqueryStatement=null, subqueryStatements=[], subqueryConditions=[])
   [INFO ] 14:06:15.839 [ShardingSphere-Command-5] ShardingSphere-SQL - Actual 
SQL: ds_0 ::: select distinct user_id as user_id, count(*) from t_order_0 group 
by user_id ORDER BY user_id ASC 
   [INFO ] 14:06:15.839 [ShardingSphere-Command-5] ShardingSphere-SQL - Actual 
SQL: ds_0 ::: select distinct user_id as user_id, count(*) from t_order_1 group 
by user_id ORDER BY user_id ASC 
   [INFO ] 14:06:15.839 [ShardingSphere-Command-5] ShardingSphere-SQL - Actual 
SQL: ds_1 ::: select distinct user_id as user_id, count(*) from t_order_0 group 
by user_id ORDER BY user_id ASC 
   [INFO ] 14:06:15.839 [ShardingSphere-Command-5] ShardingSphere-SQL - Actual 
SQL: ds_1 ::: select distinct user_id as user_id, count(*) from t_order_1 group 
by user_id ORDER BY user_id ASC 
   ```
   So, I guess there are some mistake for merger.
   
   ### Steps to reproduce the behavior, such as: SQL to execute, sharding rule 
configuration, when exception occur etc.
   
   sharding-configuration
   ```
   schemaName: sharding_db
   
   dataSources:
    ds_0:
      url: jdbc:mysql://127.0.0.1:3306/demo_ds_0?serverTimezone=UTC&useSSL=false
      username: root
      password:
      connectionTimeoutMilliseconds: 30000
      idleTimeoutMilliseconds: 60000
      maxLifetimeMilliseconds: 1800000
      maxPoolSize: 50
    ds_1:
      url: jdbc:mysql://127.0.0.1:3306/demo_ds_1?serverTimezone=UTC&useSSL=false
      username: root
      password:
      connectionTimeoutMilliseconds: 30000
      idleTimeoutMilliseconds: 60000
      maxLifetimeMilliseconds: 1800000
      maxPoolSize: 50
   
   shardingRule:
    tables:
      t_order:
        actualDataNodes: ds_${0..1}.t_order_${0..1}
        tableStrategy:
          inline:
            shardingColumn: order_id
            algorithmExpression: t_order_${order_id % 2}
        keyGenerator:
          type: SNOWFLAKE
          column: order_id
      t_order_item:
        actualDataNodes: ds_${0..1}.t_order_item_${0..1}
        tableStrategy:
          inline:
            shardingColumn: order_id
            algorithmExpression: t_order_item_${order_id % 2}
        keyGenerator:
          type: SNOWFLAKE
          column: order_item_id
    bindingTables:
      - t_order,t_order_item
    broadcastTables:
      - t_config
    defaultDatabaseStrategy:
      inline:
        shardingColumn: user_id
        algorithmExpression: ds_${user_id % 2}
    defaultTableStrategy:
      none:
   ```
   
   execute SQL
   ```
   select distinct user_id as user_id, count(*) from t_order group by user_id
   ```
   
   ### Example codes for reproduce this issue (such as a github link).
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

Reply via email to