Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/19871#discussion_r154847064
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala ---
@@ -568,8 +574,13 @@ object DataSource extends Logging {
"org.apache.spark.Logging")
/** Given a provider name, look up the data source class definition. */
- def lookupDataSource(provider: String): Class[_] = {
- val provider1 = backwardCompatibilityMap.getOrElse(provider, provider)
+ def lookupDataSource(provider: String, conf: SQLConf): Class[_] = {
+ val provider1 = backwardCompatibilityMap.getOrElse(provider, provider) match {
+ case name if name.equalsIgnoreCase("orc") &&
+ conf.getConf(SQLConf.ORC_IMPLEMENTATION) == "native" =>
+ classOf[OrcFileFormat].getCanonicalName
+ case name => name
--- End diff --
If `ORC_IMPLEMENTATION` is `hive`, we leave the provider as it was, which
may be `orc`. Then we will hit the `Multiple sources found` issue, won't we? Both
the old and new ORC implementations have the same short name `orc`.
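
To make the concern concrete, here is a minimal, self-contained sketch of the lookup. It is a simplified model, not the actual `DataSource.lookupDataSource` code: `OrcLookupSketch`, `Source`, `registered`, `rewrite` and `lookup` are hypothetical names, the registry is a hard-coded list rather than a service loader, and the error strings are only illustrative. It assumes both ORC implementations register the short name `orc`.

```scala
object OrcLookupSketch {
  // Hypothetical stand-in for a registered data source: short name plus class name.
  final case class Source(shortName: String, className: String)

  // Assumed registry contents: both ORC implementations claim the short name "orc".
  val registered: Seq[Source] = Seq(
    Source("orc", "org.apache.spark.sql.execution.datasources.orc.OrcFileFormat"),
    Source("orc", "org.apache.spark.sql.hive.orc.OrcFileFormat"),
    Source("parquet", "org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat")
  )

  // Same shape as the match in the diff: only the "native" case is rewritten
  // to a fully qualified class name; every other provider passes through.
  def rewrite(provider: String, orcImpl: String): String = provider match {
    case name if name.equalsIgnoreCase("orc") && orcImpl == "native" =>
      "org.apache.spark.sql.execution.datasources.orc.OrcFileFormat"
    case name => name
  }

  // Simplified resolution: a dotted name is treated as a class name and returned
  // as-is; a short name must match exactly one registered source.
  def lookup(provider: String, orcImpl: String): Either[String, String] = {
    val name = rewrite(provider, orcImpl)
    if (name.contains(".")) Right(name)
    else registered.filter(_.shortName.equalsIgnoreCase(name)) match {
      case Seq(single) => Right(single.className)
      case Seq()       => Left(s"Failed to find data source: $name")
      case many        =>
        Left(s"Multiple sources found for $name (${many.map(_.className).mkString(", ")})")
    }
  }
}
```

In this model, `OrcLookupSketch.lookup("orc", "native")` resolves directly to the native class name, while `OrcLookupSketch.lookup("orc", "hive")` falls through to the short-name search and returns a `Left` because two sources match. One possible way out, under the same assumptions, would be to add a second case that rewrites `orc` to the Hive class name when the implementation is `hive`.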
---