viirya commented on a change in pull request #35850:
URL: https://github.com/apache/spark/pull/35850#discussion_r827565712
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NestedColumnAliasing.scala
##########
@@ -321,6 +321,38 @@ object GeneratorNestedColumnAliasing {
// need to prune nested columns through Project and under Generate. The difference is
// when `nestedSchemaPruningEnabled` is on, nested columns will be pruned further at
// file format readers if it is supported.
+
+ // There are [[ExtractValue]] expressions on or not on the output of the generator. Generator
+ // can also have different types:
+ // 1. For [[ExtractValue]]s not on the output of the generator, theoretically speaking, there
+ // lots of expressions that we can push down, including non ExtractValues and GetArrayItem
+ // and GetMapValue. But to be safe, we only handle GetStructField and GetArrayStructFields.
Review comment:
The first item reads a bit oddly. If the point is which `ExtractValue`
expressions can be pushed down, you can simply list them. I'm not sure what
"For [[ExtractValue]]s not on the output of the generator..." means. Do you
mean on the output of the generator?
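
To make the two properties in question concrete, here is a minimal sketch in plain Scala. It is a toy model, not Spark's Catalyst code: the case classes only borrow the names of the real expressions, and `ExtractorSketch`, `rootAttr`, `onGeneratorOutput`, and `isHandled` are hypothetical helpers introduced for illustration. It separates "does this extractor read from the generator's output?" from "is this one of the extractor types the code comment says is handled?".

```scala
// Toy stand-ins only: these are NOT Spark's Catalyst classes.
sealed trait Expr
case class Attr(name: String) extends Expr
case class GetStructField(child: Expr, field: String) extends Expr
case class GetArrayStructFields(child: Expr, field: String) extends Expr
case class GetArrayItem(child: Expr, index: Int) extends Expr
case class GetMapValue(child: Expr, key: String) extends Expr

object ExtractorSketch {
  // Follow an extractor chain down to the attribute it ultimately reads from.
  @annotation.tailrec
  def rootAttr(e: Expr): Attr = e match {
    case a: Attr                    => a
    case GetStructField(c, _)       => rootAttr(c)
    case GetArrayStructFields(c, _) => rootAttr(c)
    case GetArrayItem(c, _)         => rootAttr(c)
    case GetMapValue(c, _)          => rootAttr(c)
  }

  // True when the extractor reads from an attribute produced by the generator.
  def onGeneratorOutput(e: Expr, generatorOutput: Set[String]): Boolean =
    generatorOutput.contains(rootAttr(e).name)

  // Mirrors the statement in the code comment: only GetStructField and
  // GetArrayStructFields are handled. Checks the outermost extractor only.
  def isHandled(e: Expr): Boolean = e match {
    case _: GetStructField | _: GetArrayStructFields => true
    case _                                           => false
  }

  def main(args: Array[String]): Unit = {
    // Assumed example: a generator such as explode(items) producing `item`.
    val generatorOutput = Set("item")
    val onOutput  = GetStructField(Attr("item"), "b")
    val offOutput = GetStructField(Attr("other"), "c")

    println(onGeneratorOutput(onOutput, generatorOutput))  // true
    println(onGeneratorOutput(offOutput, generatorOutput)) // false
    println(isHandled(GetArrayItem(Attr("items"), 0)))     // false
  }
}
```

Listing the handled types directly, as `isHandled` does, is the kind of explicit enumeration the comment above is asking for.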