thinh2 commented on issue #12057:
URL: https://github.com/apache/datafusion/issues/12057#issuecomment-2323096635
Hi @2010YOUY01,
I have been stuck on this bug for several days without any progress. Do you
have any recommendations for debugging the query execution issue? I can now
reproduce the issue, and after turning on `RUST_LOG=trace`, here is the
information I collected about it, along with some guesses and questions:
- Query's physical plan:
`
WindowAggExec: wdw=[sum(Int64(1)) PARTITION BY [Boolean(false) = Boolean(false)] ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING: Ok(Field { name: "sum(Int64(1)) PARTITION BY [Boolean(false) = Boolean(false)] ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING", data_type: Int64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }), frame: WindowFrame { units: Rows, start_bound: Preceding(UInt64(NULL)), end_bound: Following(UInt64(NULL)), is_causal: false }]
  CoalesceBatchesExec: target_batch_size=8192
    RepartitionExec: partitioning=Hash([true], 4), input_partitions=4
      RepartitionExec: partitioning=RoundRobinBatch(4), input_partitions=1
        ProjectionExec: expr=[]
          CoalesceBatchesExec: target_batch_size=8192
            FilterExec: (false > (v1@0 = v1@0)) IS DISTINCT FROM true
              MemoryExec: partitions=1, partition_sizes=[1]
`
- Debug log with error:
`
[2024-08-31T01:31:23Z DEBUG datafusion_physical_plan::stream] Stopping
execution: plan returned error: WindowAggExec: wdw=[sum(Int64(1)) PARTITION BY
[Boolean(false) = Boolean(false)] ROWS BETWEEN UNBOUNDED PRECEDING AND
UNBOUNDED FOLLOWING: Ok(Field { name: "sum(Int64(1)) PARTITION BY
[Boolean(false) = Boolean(false)] ROWS BETWEEN UNBOUNDED PRECEDING AND
UNBOUNDED FOLLOWING", data_type: Int64, nullable: true, dict_id: 0,
dict_is_ordered: false, metadata: {} }), frame: WindowFrame { units: Rows,
start_bound: Preceding(UInt64(NULL)), end_bound: Following(UInt64(NULL)),
is_causal: false }]
`
- From my understanding, I suspect the issue is caused by the operator
`RepartitionExec: partitioning=Hash([true], 4), input_partitions=4`. My guess
is that this hash `RepartitionExec` receives an empty `RecordBatch` because of
the empty `ProjectionExec: expr=[]`, and that processing this empty
`RecordBatch` leads to the panic. Is my guess correct, and how can I verify
it? I think it contradicts your assumption that the issue is
`related to repartition execution in window functions`. May I know what the
physical query plan for the `repartition execution in window functions` looks
like?
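
To make the guess above more concrete, here is a minimal toy model of the situation, written with the Rust standard library only. It is a sketch under stated assumptions, not DataFusion's or Arrow's actual implementation: `ToyBatch` and `hash_partition_on_constant` are hypothetical names, and the row count of 3 is made up. It does not reproduce the panic. It only illustrates two points: a batch produced by `ProjectionExec: expr=[]` has zero columns but may still have rows, and hashing on a constant key such as `true` sends every row to the same output partition, leaving the other partitions empty.

```rust
// Toy model only: these types are hypothetical and are NOT DataFusion's or Arrow's API.
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// A batch that tracks its row count separately from its columns,
/// so a batch with zero columns can still carry rows.
#[derive(Debug, Clone, PartialEq)]
struct ToyBatch {
    num_rows: usize,
    columns: Vec<Vec<i64>>,
}

/// Hash-partition `batch` on a constant key (modelling `Hash([true], 4)`).
/// Every row has the same key, so every row lands in the same partition.
fn hash_partition_on_constant(
    batch: &ToyBatch,
    key: bool,
    num_partitions: usize,
) -> Vec<ToyBatch> {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    let target = (hasher.finish() % num_partitions as u64) as usize;

    (0..num_partitions)
        .map(|p| {
            if p == target {
                // All rows go here; the row count must be carried explicitly
                // because there are no columns to infer it from.
                batch.clone()
            } else {
                // Other partitions keep the same (empty) schema and get no rows.
                ToyBatch {
                    num_rows: 0,
                    columns: batch.columns.iter().map(|_| Vec::new()).collect(),
                }
            }
        })
        .collect()
}

fn main() {
    // Assumed shape of the projection output: rows but no columns.
    let projected = ToyBatch { num_rows: 3, columns: vec![] };
    let parts = hash_partition_on_constant(&projected, true, 4);

    let total_rows: usize = parts.iter().map(|b| b.num_rows).sum();
    let non_empty = parts.iter().filter(|b| b.num_rows > 0).count();
    println!(
        "partitions={} non_empty={} total_rows={}",
        parts.len(),
        non_empty,
        total_rows
    );
}
```

If the model is a fair picture, the open question is whether the real operators preserve the row count of a zero-column batch through the hash repartition. Running the reproduction with `RUST_BACKTRACE=1` should show which operator actually panics.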
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]