xielongfei opened a new issue, #29215:
URL: https://github.com/apache/shardingsphere/issues/29215
1. **Brief Description**: I have a user table split into 32 subtables. When
querying the user list, the returned total number of records does not match the
actual data (related to my business, please refer to the detailed description
below).
2. **Versions**:
- Shardingsphere-jdbc 5.2.0
- MyBatis Plus 3.3.1
3. **Table Structure**:
```sql
CREATE TABLE user (
id INT,
name VARCHAR, -- Name
idCard VARCHAR, -- ID Card
areaId INT, -- Area ID (Sharding Key)
...
)
```
There is one user table, divided into 32 tables based on `areaId`,
corresponding to user_0, user_1, ..., user_31. 99% of queries can be executed
on a single table, based on the sharding key.
However, there is a business scenario where I need to perform a global
deduplication based on names and ID cards. This means the same name and ID card
can exist in both user_0 and user_1.
4. **Query Execution**: When I query the list, MyBatis Plus generates two
logical SQL queries:
```sql
SELECT COUNT(1) FROM ( SELECT DISTINCT r.NAME, r.idCard FROM user r WHERE
r.areaId IN (1, 2, 3, 4) ) TOTAL;
SELECT DISTINCT r.NAME, r.idCard FROM user r WHERE r.areaId IN (1, 2, 3,
4);
```
Actual executed SQL (I modified the sharding condition values):
```sql
-- Count Queries
SELECT COUNT(1) FROM ( SELECT DISTINCT r.NAME, r.idCard FROM user_1 r
WHERE r.areaId IN (1) ) TOTAL;
SELECT COUNT(1) FROM ( SELECT DISTINCT r.NAME, r.idCard FROM user_2 r
WHERE r.areaId IN (2) ) TOTAL;
SELECT COUNT(1) FROM ( SELECT DISTINCT r.NAME, r.idCard FROM user_3 r
WHERE r.areaId IN (3) ) TOTAL;
SELECT COUNT(1) FROM ( SELECT DISTINCT r.NAME, r.idCard FROM user_4 r
WHERE r.areaId IN (4) ) TOTAL;
-- Select Queries
SELECT DISTINCT r.NAME, r.idCard FROM user_1 r WHERE r.areaId IN (1);
SELECT DISTINCT r.NAME, r.idCard FROM user_2 r WHERE r.areaId IN (2);
SELECT DISTINCT r.NAME, r.idCard FROM user_3 r WHERE r.areaId IN (3);
SELECT DISTINCT r.NAME, r.idCard FROM user_4 r WHERE r.areaId IN (4);
```

The count of 599508 is incorrect. When clicking to navigate to page 29975,
no data is returned.
5. **Problem Caused**:
When counting the total, Shardingsphere simply adds them up,
but when querying data, Shardingsphere deduplicates them through a streaming
merge concept, leading to data inconsistency.
If my description is unclear, please point it out so I can correct it. Thank
you.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail:
[email protected]
For queries about this service, please contact Infrastructure at:
[email protected]