guanghuan96 opened a new issue, #29443:
URL: https://github.com/apache/doris/issues/29443

   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no 
similar issues.
   
   
   ### Version
   
   2.0.3
   
   ### What's Wrong?
   
   完整sql语句如下:
   ```
   select 
         count(distinct employee_no) as employee_no   
   from retail_dim.dim_ehr_org_staff_position_all_1229 
   where employee_no in (
       select distinct employee_no  
       from retail_dim.dim_ehr_org_staff_position_all_1229
       where office_start_date = '2023-12-01'
   ) 
   ```
   -- 问题描述:
   where条件里的子查询查出的结果集条数为:2012条
   select count(distinct employee_no)   
   from retail_dim.dim_ehr_org_staff_position_all_1229
   where office_start_date = '2023-12-01'
   
   但是外面整个语句查询数量却少了一半,只有918条
   select 
      count(distinct employee_no)    
   from retail_dim.dim_ehr_org_staff_position_all_1229 
   where employee_no in (
       select distinct employee_no  
       from retail_dim.dim_ehr_org_staff_position_all_1229
       where office_start_date = '2023-12-01'
     ) 
   
   ```
   CREATE TABLE `dim_ehr_org_staff_position_all_1229` (
     `employee_no` char(20) NULL COMMENT '员工工号',
     `organ_new_no` char(20) NULL COMMENT '组织编码',
     `position_no` char(10) NULL COMMENT '岗位编号',
     `office_start_date` date NULL COMMENT '任职开始日期',
     `office_status` tinyint(4) NULL COMMENT '在本组织任职状态(1:在职,0:失效)',
     `employee_name` char(50) NULL COMMENT '员工姓名',
     `parent_employee_no` char(10) NULL COMMENT '上级领导编码',
     `parent_employee_name` varchar(50) NULL COMMENT '上级领导姓名',
     `unit_type` char(20) NULL COMMENT '组织类型',
     `organ_new_name` varchar(200) NULL COMMENT '组织名称',
     `position_name` char(50) NULL COMMENT '岗位名称',
     `position_type` tinyint(4) NULL COMMENT '岗位类型(01:管理线,02:陈列线)',
     `brand_unit_no` char(20) NULL COMMENT '品牌部编码',
     `region_no` char(10) NULL COMMENT '小区编码',
     `region_name` char(30) NULL COMMENT '小区名称',
     `office_end_date` date NULL COMMENT '任职结束日期',
     `update_time` datetime NULL COMMENT '更新时间'
   ) ENGINE=OLAP
   UNIQUE KEY(`employee_no`, `organ_new_no`, `position_no`, 
`office_start_date`, `office_status`)
   COMMENT '员工任职拉链表'
   DISTRIBUTED BY HASH(`employee_no`) BUCKETS 1
   PROPERTIES (
   "replication_allocation" = "tag.location.default: 3",
   "is_being_synced" = "false",
   "storage_format" = "V2",
   "light_schema_change" = "true",
   "disable_auto_compaction" = "false",
   "enable_single_replica_compaction" = "false"
   );
   ```
   
   
   -- 执行计划如下:
   PLAN FRAGMENT 0
     OUTPUT EXPRS:
       employee_no[#46]
     PARTITION: UNPARTITIONED
   
     HAS_COLO_PLAN_NODE: false
   
     VRESULT SINK
   
     809:VAGGREGATE (merge finalize)
     |  output: count(employee_no[#45])[#46]
     |  group by: 
     |  cardinality=1
     |  
     806:VEXCHANGE
        offset: 0
   
   PLAN FRAGMENT 1
   
     PARTITION: HASH_PARTITIONED: employee_no[#21]
   
     HAS_COLO_PLAN_NODE: true
   
     STREAM DATA SINK
       EXCHANGE ID: 806
       UNPARTITIONED
   
     803:VAGGREGATE (update serialize)
     |  output: partial_count(employee_no[#44])[#45]
     |  group by: 
     |  cardinality=1
     |  
     800:VAGGREGATE (merge serialize)
     |  group by: employee_no[#43]
     |  cardinality=42,546
     |  
     797:VAGGREGATE (update serialize)
     |  group by: employee_no[#42]
     |  cardinality=85,092
     |  
     794:VHASH JOIN
     |  join op: LEFT SEMI JOIN(COLOCATE[])[]
     |  equal join conjunct: employee_no[#40] = employee_no[#20]
     |  runtime filters: RF000[in_or_bloom] <- 
employee_no[#20](21273/32768/1048576)
     |  cardinality=170,185
     |  vec output tuple id: 6
     |  vIntermediate tuple ids: 5 
     |  hash output slot ids: 40 
     |  
     |----791:VAGGREGATE (update finalize)
     |    |  group by: employee_no[#19]
     |    |  cardinality=21,273
     |    |  
     |    781:VOlapScanNode
     |       TABLE: 
default_cluster:retail_dim.dim_ehr_org_staff_position_all_1229(dim_ehr_org_staff_position_all_1229),
 PREAGGREGATION: OFF. Reason: __DORIS_DELETE_SIGN__ is used as conjuncts.
     |       PREDICATES: office_start_date[#3] = '2023-12-01' AND 
__DORIS_DELETE_SIGN__[#17] = 0
     |       partitions=1/1 (dim_ehr_org_staff_position_all_1229), tablets=1/1, 
tabletList=4534320
     |       cardinality=170185, avgRowSize=121.58043, numNodes=3
     |       pushAggOp=NONE
     |       projections: employee_no[#0]
     |       project output tuple id: 1
     |    
     774:VOlapScanNode
        TABLE: 
default_cluster:retail_dim.dim_ehr_org_staff_position_all_1229(dim_ehr_org_staff_position_all_1229),
 PREAGGREGATION: OFF. Reason: __DORIS_DELETE_SIGN__ is used as conjuncts.
        PREDICATES: __DORIS_DELETE_SIGN__[#38] = 0
        runtime filters: RF000[in_or_bloom] -> employee_no[#21]
        partitions=1/1 (dim_ehr_org_staff_position_all_1229), tablets=1/1, 
tabletList=4534320
        cardinality=170185, avgRowSize=121.58043, numNodes=3
        pushAggOp=NONE
        projections: employee_no[#21]
        project output tuple id: 4
   
   
   
   
   
   
   ### What You Expected?
   
   期望得到正确的结果:
   整个sql查询出来的结果应是2012条
   
   ### How to Reproduce?
   
   _No response_
   
   ### Anything Else?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to