kevinwilfong commented on code in PR #10569:
URL:
https://github.com/apache/incubator-gluten/pull/10569#discussion_r2317137834
##########
backends-velox/src/main/scala/org/apache/gluten/backendsapi/velox/VeloxIteratorApi.scala:
##########
@@ -141,12 +141,24 @@ class VeloxIteratorApi extends IteratorApi with Logging {
val planByteArray = wsCtx.root.toProtobuf.toByteArray
splitInfos.zipWithIndex.map {
- case (splitInfos, index) =>
+ case (splits, index) =>
+ val splitsByteArray = splits.zipWithIndex.map {
Review Comment:
@Yohahaha I see now why they originally added it at the split level for
ClickHouse
They only wanted to add the table schema if it was necessary, which today is
only if a file is in the TextFileFormat. Given this is a property of the
partition/split we don't know this at the time we're constructing the plan.
If we always set the table schema then it's possible the table schema has
some column type that Gluten doesn't support (e.g. TimestampNTZ) which causes
an exception serializing the ReadRelNode.
We can optionally add it to the substrait plan for the case where we want to
support index based column resolution, which can be determined at plan
generation time, however, to support file formats that always depend on knowing
the schema like Text files, I think we'll need to keep it at the split level.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]