nextdreamblue opened a new issue, #15129:
URL: https://github.com/apache/doris/issues/15129

   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no 
similar issues.
   
   
   ### Version
   
   master
   
   ### What's Wrong?
   
   当是普通的unique 表(非merge-on-write),以及agg表的时候,一个大的数据导入,会产生若干个segment,会触发segment 
compaction,而segment 
compaction并没有对普通的unique表和agg表中,想同rowset下不同segment下有相同key(主键)的值进行合理的去重或者聚合,导致合并后的segment有相同主键,然后执行查询,就会查询到相同的主键行
   
   enable_segcompaction = false时,则没复现问题
   
   ### What You Expected?
   
   查询返回正确的结果
   
   ### How to Reproduce?
   
   使用这个测试数据
   
https://doris-agg-table-bug-1301087413.cos.ap-beijing.myqcloud.com/agg_table_bug/segcompaction_lineitem
   数据中有若干行相同主键的数据
   
   创建表
   ```
   CREATE TABLE `lineitem_unique` (
     `l_orderkey` bigint(20) NULL,
     `l_partkey` int(11) NULL,
     `l_suppkey` int(11) NULL,
     `l_linenumber` int(11) NULL,
     `l_quantity` decimal(15, 2) NULL,
     `l_extendedprice` decimal(15, 2) NULL,
     `l_discount` decimal(15, 2) NULL,
     `l_tax` decimal(15, 2) NULL,
     `l_returnflag` char(1) NULL,
     `l_linestatus` char(1) NULL,
     `l_shipdate` date NULL,
     `l_commitdate` date NULL,
     `l_receiptdate` date NULL,
     `l_shipinstruct` char(25) NULL,
     `l_shipmode` char(10) NULL,
     `l_comment` varchar(44) NULL
   ) ENGINE=OLAP
   UNIQUE KEY(`l_orderkey`, `l_partkey`, `l_suppkey`, `l_linenumber`)
   COMMENT 'OLAP'
   DISTRIBUTED BY HASH(`l_orderkey`, `l_partkey`, `l_suppkey`, `l_linenumber`) 
BUCKETS 1
   PROPERTIES (
   "replication_allocation" = "tag.location.default: 1",
   "in_memory" = "false",
   "storage_format" = "V2",
   "disable_auto_compaction" = "false"
   );
   
   
   CREATE TABLE `lineitem_agg` (
     `l_orderkey` bigint(20) NULL,
     `l_partkey` int(11) NULL,
     `l_suppkey` int(11) NULL,
     `l_linenumber` int(11) NULL,
     `l_quantity` decimal(15, 2) REPLACE NULL,
     `l_extendedprice` decimal(15, 2) REPLACE NULL,
     `l_discount` decimal(15, 2) REPLACE NULL,
     `l_tax` decimal(15, 2) REPLACE NULL,
     `l_returnflag` char(1) REPLACE NULL,
     `l_linestatus` char(1) REPLACE NULL,
     `l_shipdate` date REPLACE NULL,
     `l_commitdate` date REPLACE NULL,
     `l_receiptdate` date REPLACE NULL,
     `l_shipinstruct` char(25) REPLACE NULL,
     `l_shipmode` char(10) REPLACE NULL,
     `l_comment` varchar(44) REPLACE NULL
   ) ENGINE=OLAP
   AGGREGATE KEY(`l_orderkey`, `l_partkey`, `l_suppkey`, `l_linenumber`)
   COMMENT 'OLAP'
   DISTRIBUTED BY HASH(`l_orderkey`, `l_partkey`, `l_suppkey`, `l_linenumber`) 
BUCKETS 1
   PROPERTIES (
   "replication_allocation" = "tag.location.default: 1",
   "in_memory" = "false",
   "storage_format" = "V2",
   "disable_auto_compaction" = "false"
   );
   ```
   
   使用stream_load导入上边的数据文件
   
   查询
   ```
   MySQL [test]> select l_orderkey,l_partkey,l_suppkey,l_linenumber,l_comment 
from lineitem_unique where l_orderkey=5970000001 and l_partkey=75814607 and 
l_suppkey=5814608 and l_linenumber=1;
   +------------+-----------+-----------+--------------+-----------+
   | l_orderkey | l_partkey | l_suppkey | l_linenumber | l_comment |
   +------------+-----------+-----------+--------------+-----------+
   | 5970000001 |  75814607 |   5814608 |            1 |  00_1124  |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_184   |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_245   |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_307   |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_368   |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_430   |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_491   |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_552   |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_61    |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_614   |
   +------------+-----------+-----------+--------------+-----------+
   10 rows in set (0.01 sec)
   ```
   
   获得了10行主键完全相同的key
   
   查询另外一个表
   ```
   MySQL [test]> select l_orderkey,l_partkey,l_suppkey,l_linenumber,l_comment 
from lineitem_agg where l_orderkey=5970000001 and l_partkey=75814607 and 
l_suppkey=5814608 and l_linenumber=1;
   +------------+-----------+-----------+--------------+-----------+
   | l_orderkey | l_partkey | l_suppkey | l_linenumber | l_comment |
   +------------+-----------+-----------+--------------+-----------+
   | 5970000001 |  75814607 |   5814608 |            1 |  00_1124  |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_191   |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_255   |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_319   |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_383   |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_447   |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_511   |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_575   |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_63    |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_638   |
   +------------+-----------+-----------+--------------+-----------+
   10 rows in set (0.05 sec)
   ```
   
   
   ### Anything Else?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to