viirya commented on a change in pull request #28761:
URL: https://github.com/apache/spark/pull/28761#discussion_r466542348
##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/orc/OrcScanBuilder.scala
##########
@@ -60,10 +61,8 @@ case class OrcScanBuilder(
// changed `hadoopConf` in executors.
OrcInputFormat.setSearchArgument(hadoopConf, f, schema.fieldNames)
}
-    val dataTypeMap = schema.map(f => quoteIfNeeded(f.name) -> f.dataType).toMap
-    // TODO (SPARK-25557): ORC doesn't support nested predicate pushdown, so they are removed.
-    val newFilters = filters.filter(!_.containsNestedColumn)
-    _pushedFilters = OrcFilters.convertibleFilters(schema, dataTypeMap, newFilters).toArray
Review comment:
The config exists for DSv1 compatibility, so it only controls DSv1 file-based data sources. For DSv2, nested predicate pushdown is up to the data source implementation: once a source implements the interface, we assume it supports nested filter pushdown. Our ORC filter tests cover both v1 and v2.
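
For reference, a hedged sketch of the DSv1-side knob I'm describing. The config key below matches the SPARK-25557-era work as far as I know, but treat it and the example path as assumptions and verify the key against SQLConf; DSv2 readers such as OrcScanBuilder are not governed by it.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

val spark = SparkSession.builder().master("local[*]").getOrCreate()

// Assumed config key (verify against SQLConf): enables nested predicate
// pushdown only for the listed DSv1 file sources.
spark.conf.set(
  "spark.sql.optimizer.nestedPredicatePushdown.supportedFileSources",
  "parquet,orc")

// Hypothetical data path. With a DSv1 ORC read, the predicate on the
// nested field is pushed down only when "orc" is in the list above; a
// DSv2 read goes through OrcScanBuilder, which decides on its own via
// OrcFilters.convertibleFilters.
val df = spark.read.format("orc").load("/path/to/people.orc")
df.filter(col("person.name") === "alice").show()
```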