gaoyangxiaozhu commented on code in PR #6390:
URL: https://github.com/apache/incubator-gluten/pull/6390#discussion_r1675025735
##########
gluten-core/src/main/scala/org/apache/gluten/extension/columnar/OffloadSingleNode.scala:
##########
@@ -299,16 +299,15 @@ case class OffloadProject() extends OffloadSingleNode with LogLevelUtil {
// Project is still not transformable after remove `input_file_name` expressions.
projectExec
} else {
- // the project with `input_file_name` expression should have at most
- // one data source, reference:
+ // the project with `input_file_name` expression may have multiple data source
+ // by union all, reference:
// https://github.com/apache/spark/blob/e459674127e7b21e2767cc62d10ea6f1f941936c
- // /sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala#L506
+ // /sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala#L519
val leafScans = findScanNodes(projectExec)
- assert(leafScans.size <= 1)
- if (leafScans.isEmpty || FallbackTags.nonEmpty(leafScans(0))) {
+ if (leafScans.isEmpty || leafScans.forall(FallbackTags.nonEmpty)) {
Review Comment:
It looks like, for `union`, the data types and schemas of all scans should be
the same, so it is OK to use `forall` here. Thanks!
##########
gluten-core/src/main/scala/org/apache/gluten/extension/columnar/OffloadSingleNode.scala:
##########
@@ -299,16 +299,15 @@ case class OffloadProject() extends OffloadSingleNode with LogLevelUtil {
// Project is still not transformable after remove `input_file_name` expressions.
projectExec
} else {
- // the project with `input_file_name` expression should have at most
- // one data source, reference:
+ // the project with `input_file_name` expression may have multiple data source
+ // by union all, reference:
// https://github.com/apache/spark/blob/e459674127e7b21e2767cc62d10ea6f1f941936c
- // /sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala#L506
+ // /sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala#L519
val leafScans = findScanNodes(projectExec)
- assert(leafScans.size <= 1)
- if (leafScans.isEmpty || FallbackTags.nonEmpty(leafScans(0))) {
+ if (leafScans.isEmpty || leafScans.forall(FallbackTags.nonEmpty)) {
Review Comment:
Thanks for the fix to address the `union` case. I think we need to change
`forall` to `exists` here so that we always keep the fallback path when at
least one scan node falls back. Otherwise, the fallback scan node may hit
property-not-found issues because property replication happens.
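
To make the difference concrete, here is a minimal, self-contained sketch. It
does not use the real Gluten classes: `Scan` and its `fallbackTagged` flag are
hypothetical stand-ins for a scan node and for what
`FallbackTags.nonEmpty(scan)` would report, and the two helper names are made
up for illustration.

```scala
// Hypothetical stand-in for a scan plan node; `fallbackTagged` mimics the
// result of FallbackTags.nonEmpty(scan) from the diff above.
final case class Scan(name: String, fallbackTagged: Boolean)

object FallbackCheck {
  // Condition as written in the PR: keep the vanilla project only when
  // every scan under it falls back (or there is no scan at all).
  def keepVanillaForall(scans: Seq[Scan]): Boolean =
    scans.isEmpty || scans.forall(_.fallbackTagged)

  // Suggested condition: keep the vanilla project as soon as any scan
  // under it falls back (or there is no scan at all).
  def keepVanillaExists(scans: Seq[Scan]): Boolean =
    scans.isEmpty || scans.exists(_.fallbackTagged)

  def main(args: Array[String]): Unit = {
    // A union whose two branches disagree: one scan falls back, one does not.
    val mixed = Seq(
      Scan("left", fallbackTagged = true),
      Scan("right", fallbackTagged = false))

    println(keepVanillaForall(mixed)) // false
    println(keepVanillaExists(mixed)) // true
  }
}
```

With a mixed union, `forall` takes the offload branch even though one scan has
fallen back, while `exists` keeps the fallback path. The two conditions agree
when the scans are all tagged, all untagged, or absent.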