Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/19571#discussion_r146784952
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcFileFormat.scala ---
@@ -58,10 +58,7 @@ class OrcFileFormat extends FileFormat with DataSourceRegister with Serializable
sparkSession: SparkSession,
options: Map[String, String],
files: Seq[FileStatus]): Option[StructType] = {
- OrcFileOperator.readSchema(
- files.map(_.getPath.toString),
- Some(sparkSession.sessionState.newHadoopConf())
- )
+ org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.readSchema(sparkSession, files)
--- End diff --
I am not sure about this one either. This looks like a complete rewrite of
`org.apache.spark.sql.hive.orc.OrcFileOperator.readSchema`. Is this change
required to fix this issue?
---