Github user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/22597#discussion_r224947556
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilters.scala ---
@@ -67,6 +67,16 @@ private[sql] object OrcFilters {
}
}
+ // Since ORC 1.5.0 (ORC-323), we need to quote column names with `.` characters
+ // in order to distinguish predicate pushdown for nested columns.
+ private def quoteAttributeNameIfNeeded(name: String) : String = {
+ if (!name.contains("`") && name.contains(".")) {
--- End diff ---
@HyukjinKwon . Actually, Spark 2.3.2 ORC (native/hive) doesn't support backtick
characters in column names; it fails on the **write** operation. And, although
Spark 2.4.0 broadens the set of supported special characters in column names
(e.g. `.` and `"`), the backtick character is still not handled.
So, for that one, I'll proceed in another PR since it's an improvement
rather than a regression.
Also, cc @gatorsmile and @dbtsai .
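For reference (not part of the original diff), here is a minimal sketch of how the truncated helper above might be completed, assuming backtick quoting is the mechanism ORC 1.5.0+ (ORC-323) expects for column names containing `.`; the `else` branch and the exact quoting form are assumptions based on the visible condition:

```scala
private[sql] object OrcFilters {
  // Sketch (assumption): wrap names containing `.` in backticks so that
  // ORC 1.5.0+ does not interpret them as references to nested fields.
  private def quoteAttributeNameIfNeeded(name: String): String = {
    if (!name.contains("`") && name.contains(".")) {
      s"`$name`"   // e.g. "a.b" becomes "`a.b`"
    } else {
      name         // names without dots, or already containing backticks, pass through
    }
  }
}
```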
---