thinh2 commented on issue #12057:
URL: https://github.com/apache/datafusion/issues/12057#issuecomment-2323096635

   Hi @2010YOUY01 ,
   
   I have been stuck on this bug for several days without any progress. Do you 
have any recommendations for debugging the query execution issue? I am now able 
to reproduce the issue, and after turning on `RUST_LOG=trace`, here is the 
information I collected, along with some of my guesses and questions:
   
   - Query's physical plan:
   
   ` WindowAggExec: wdw=[sum(Int64(1)) PARTITION BY [Boolean(false) = 
Boolean(false)] ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING: 
Ok(Field { name: "sum(Int64(1)) PARTITION BY [Boolean(false) = Boolean(false)] 
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING", data_type: Int64, 
nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }), frame: 
WindowFrame { units: Rows, start_bound: Preceding(UInt64(NULL)), end_bound: 
Following(UInt64(NULL)), is_causal: false }]
         CoalesceBatchesExec: target_batch_size=8192
           RepartitionExec: partitioning=Hash([true], 4), input_partitions=4
             RepartitionExec: partitioning=RoundRobinBatch(4), 
input_partitions=1
               ProjectionExec: expr=[]
                 CoalesceBatchesExec: target_batch_size=8192
                   FilterExec: (false > (v1@0 = v1@0)) IS DISTINCT FROM true
                     MemoryExec: partitions=1, partition_sizes=[1]
   `
   
   - Debug log with error:
   `
   [2024-08-31T01:31:23Z DEBUG datafusion_physical_plan::stream] Stopping 
execution: plan returned error: WindowAggExec: wdw=[sum(Int64(1)) PARTITION BY 
[Boolean(false) = Boolean(false)] ROWS BETWEEN UNBOUNDED PRECEDING AND 
UNBOUNDED FOLLOWING: Ok(Field { name: "sum(Int64(1)) PARTITION BY 
[Boolean(false) = Boolean(false)] ROWS BETWEEN UNBOUNDED PRECEDING AND 
UNBOUNDED FOLLOWING", data_type: Int64, nullable: true, dict_id: 0, 
dict_is_ordered: false, metadata: {} }), frame: WindowFrame { units: Rows, 
start_bound: Preceding(UInt64(NULL)), end_bound: Following(UInt64(NULL)), 
is_causal: false }]
   ` 
   - From my understanding, I suspect the issue is caused by the 
`RepartitionExec: partitioning=Hash([true], 4), input_partitions=4` node. My 
guess is that this hash `RepartitionExec` receives an empty `RecordBatch` 
because of the empty `ProjectionExec: expr=[]`, and that processing this empty 
`RecordBatch` leads to the panic. Is my guess correct, and how can I verify it? 
I think it contradicts your assumption that the issue is `related to 
repartition execution in window functions`. Which part of the physical plan 
corresponds to the `repartition execution in window functions`?
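
   To make the `Hash([true], 4)` part of the guess concrete, here is a toy 
model of hash partitioning on a constant key. It is only a sketch and not 
DataFusion's implementation: `partition_for` and `route_rows` are hypothetical 
helpers, and the standard-library `DefaultHasher` stands in for whatever 
hashing DataFusion applies to Arrow arrays. It only illustrates that a constant 
partition expression sends every row to a single output partition, so the other 
three receive no rows.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical helper: choose an output partition from one row's key value.
fn partition_for<K: Hash>(key: &K, num_partitions: usize) -> usize {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    (hasher.finish() % num_partitions as u64) as usize
}

/// Hypothetical helper: count how many rows land in each output partition.
fn route_rows<K: Hash>(keys: &[K], num_partitions: usize) -> Vec<usize> {
    let mut counts = vec![0; num_partitions];
    for key in keys {
        counts[partition_for(key, num_partitions)] += 1;
    }
    counts
}

fn main() {
    // The partition expression is the constant `true` for every row,
    // mirroring `Hash([true], 4)` in the plan above.
    let keys = vec![true; 5];
    let counts = route_rows(&keys, 4);

    // Exactly one entry is non-zero; the other partitions get no rows.
    println!("rows per partition: {:?}", counts);
}
```

   If the guess is right, the interesting cases are the partitions that get 
no rows and the batch that has rows but zero columns after `ProjectionExec: 
expr=[]`. This model cannot confirm either of them; it only shows why most 
output partitions would be empty here.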


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to