Github user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/19124#discussion_r136916230
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcFileFormat.scala ---
@@ -169,6 +172,18 @@ class OrcFileFormat extends FileFormat with
DataSourceRegister with Serializable
}
}
}
+
+ private def checkFieldName(name: String): Unit = {
+ try {
+ TypeDescription.fromString(s"struct<$name:int>")
+ } catch {
+ case _: IllegalArgumentException =>
+ throw new AnalysisException(
+ s"""Attribute name "$name" contains invalid character(s).
--- End diff --
I agree with you that `column` is more accurate here. I originally
borrowed the wording from `ParquetSchemaConverter`:
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaConverter.scala#L565-L572
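For reference, a minimal standalone sketch of the check with the `column` wording might look like the following. It assumes `orc-core` is on the classpath. The enclosing object name `OrcColumnNameCheck` is hypothetical, and it throws `IllegalArgumentException` rather than `AnalysisException` only so that it compiles outside Spark's `sql` package; the final message text in the PR may differ.

```scala
import org.apache.orc.TypeDescription

// Hypothetical standalone wrapper; in the PR the method is a private
// member of OrcFileFormat and throws AnalysisException instead.
object OrcColumnNameCheck {

  def checkFieldName(name: String): Unit = {
    try {
      // Let ORC's own schema parser decide whether the name is acceptable
      // by embedding it in a one-field struct type string.
      TypeDescription.fromString(s"struct<$name:int>")
    } catch {
      case _: IllegalArgumentException =>
        // "Column" replaces the "Attribute" wording borrowed from Parquet.
        throw new IllegalArgumentException(
          s"""Column name "$name" contains invalid character(s).""")
    }
  }
}
```

Delegating to `TypeDescription.fromString` keeps the rule in sync with whatever the ORC library itself accepts, instead of maintaining a separate list of forbidden characters as the Parquet converter does.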