wangyum opened a new issue, #11752:
URL: https://github.com/apache/gluten/issues/11752
### Backend
VL (Velox)
### Bug description
## **Description**
Gluten's columnar writer optimization wraps `AdaptiveSparkPlanExec` with
`ColumnarToCarrierRow` to avoid unnecessary columnar-to-row conversions.
However, this breaks the pattern matching used in Apache Spark PR #51432, which
relies on:
```scala
queryExecution.executedPlan match {
  case ae: AdaptiveSparkPlanExec =>
    ae.context.shuffleIds.asScala.keys
}
```
When `AdaptiveSparkPlanExec` is wrapped by `ColumnarToCarrierRow`, the
pattern matching fails, making shuffle IDs inaccessible.
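The failure mode can be sketched with hypothetical stand-in case classes (simplified substitutes for the real Spark/Gluten plan nodes, which are not reproduced here): a typed pattern on the root only matches a bare `AdaptiveSparkPlanExec`, so any wrapper node makes the match fall through.

```scala
// Stand-in plan nodes (hypothetical simplifications of the real classes).
sealed trait SparkPlan
case class AdaptiveSparkPlanExec(shuffleIds: Seq[Int]) extends SparkPlan
case class ColumnarToCarrierRow(child: SparkPlan) extends SparkPlan

object MatchDemo {
  // Mirrors the shape of the match in Spark PR #51432: shuffle IDs are
  // only reachable when AdaptiveSparkPlanExec is the root of the plan.
  def extractShuffleIds(plan: SparkPlan): Seq[Int] = plan match {
    case ae: AdaptiveSparkPlanExec => ae.shuffleIds
    case _                         => Seq.empty // wrapped plans fall through here
  }

  def main(args: Array[String]): Unit = {
    val aqe = AdaptiveSparkPlanExec(Seq(1, 2))
    println(extractShuffleIds(aqe))                       // List(1, 2)
    println(extractShuffleIds(ColumnarToCarrierRow(aqe))) // List() -- match fails
  }
}
```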
### **Root Cause**
In `GlutenWriterColumnarRules.injectFakeRowAdaptor()`, when the child is an
`AdaptiveSparkPlanExec`, the original implementation:
1. Created a new `AdaptiveSparkPlanExec` with `supportsColumnar=true`
2. Wrapped this with `genColumnarToCarrierRow()` →
`ColumnarToCarrierRow(AdaptiveSparkPlanExec(...))`
This structure hides `AdaptiveSparkPlanExec` inside `ColumnarToCarrierRow`,
breaking any external pattern matching.
### **Solution**
Refactored the wrapping logic to:
1. Wrap `aqe.inputPlan` with `genColumnarToCarrierRow()` first →
`ColumnarToCarrierRow(inputPlan)`
2. Create a new `AdaptiveSparkPlanExec` with the wrapped child →
`AdaptiveSparkPlanExec(ColumnarToCarrierRow(...))`
3. Set `supportsColumnar=false` since the child is already wrapped
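The two wrapping orders can be contrasted with a hedged sketch (again using hypothetical stand-in case classes rather than the actual Gluten implementation): the old order buries the AQE node under the wrapper, while the new order keeps it at the root so external pattern matching still works.

```scala
// Stand-in plan nodes (hypothetical simplifications of the real classes).
sealed trait SparkPlan
case class LeafPlan(name: String) extends SparkPlan
case class AdaptiveSparkPlanExec(child: SparkPlan, supportsColumnar: Boolean)
    extends SparkPlan
case class ColumnarToCarrierRow(child: SparkPlan) extends SparkPlan

object WrapDemo {
  // Old order: wrap the whole AQE node -> ColumnarToCarrierRow(AQE(...)),
  // hiding AdaptiveSparkPlanExec from callers matching on the root.
  def oldWrap(aqe: AdaptiveSparkPlanExec): SparkPlan =
    ColumnarToCarrierRow(aqe.copy(supportsColumnar = true))

  // New order: wrap the input plan first -> AQE(ColumnarToCarrierRow(...)),
  // with supportsColumnar = false since the child is already wrapped.
  def newWrap(aqe: AdaptiveSparkPlanExec): SparkPlan =
    AdaptiveSparkPlanExec(ColumnarToCarrierRow(aqe.child), supportsColumnar = false)

  def main(args: Array[String]): Unit = {
    val aqe = AdaptiveSparkPlanExec(LeafPlan("scan"), supportsColumnar = false)
    // Only the new order leaves AdaptiveSparkPlanExec as the root node.
    println(oldWrap(aqe).isInstanceOf[AdaptiveSparkPlanExec]) // false
    println(newWrap(aqe).isInstanceOf[AdaptiveSparkPlanExec]) // true
  }
}
```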
### Gluten version
main branch
### Spark version
spark-4.0.x
### Spark configurations
_No response_
### System information
_No response_
### Relevant logs
_No response_
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]