Github user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/22932#discussion_r232430599
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcFileFormat.scala ---
@@ -274,6 +278,15 @@ private[orc] class OrcOutputWriter(
override def close(): Unit = {
if (recordWriterInstantiated) {
+ // Hive 1.2.1 ORC initializes its private `writer` field at the first write.
+ try {
+ val writerField = recordWriter.getClass.getDeclaredField("writer")
+ writerField.setAccessible(true)
+ val writer = writerField.get(recordWriter).asInstanceOf[Writer]
+ writer.addUserMetadata(SPARK_VERSION_METADATA_KEY, UTF_8.encode(SPARK_VERSION_SHORT))
+ } catch {
+ case NonFatal(e) => log.warn(e.toString, e)
+ }
--- End diff --
For this case, I'll refactor all the new code (lines 281-289) out of `close()`.
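
As a rough illustration only, and not the refactoring that was actually made in the PR, the reflective block in the diff could be pulled into a small helper along these lines. `FakeWriter`, `FakeRecordWriter`, `OrcMetadataSketch`, and `addVersionMetadata` are hypothetical names invented for this sketch; they stand in for the Hive classes so the example runs without Hive on the classpath, and the metadata value is simplified to a `String` instead of an encoded buffer.

```scala
import scala.collection.mutable
import scala.util.control.NonFatal

// Hypothetical stand-in for Hive's ORC `Writer`; it only records metadata.
class FakeWriter {
  val userMetadata: mutable.Map[String, String] = mutable.Map.empty
  def addUserMetadata(key: String, value: String): Unit = {
    userMetadata(key) = value
  }
}

// Hypothetical stand-in for the Hive record writer: the private `writer`
// field stays null until the first write, as the comment in the diff describes.
class FakeRecordWriter {
  private var writer: FakeWriter = null
  def write(): Unit = {
    if (writer == null) writer = new FakeWriter
  }
}

object OrcMetadataSketch {
  // Reads the private `writer` field reflectively and adds one metadata entry.
  // Returns true on success; false if the field is missing, still null,
  // of an unexpected type, or inaccessible.
  def addVersionMetadata(recordWriter: AnyRef, key: String, version: String): Boolean =
    try {
      val writerField = recordWriter.getClass.getDeclaredField("writer")
      writerField.setAccessible(true)
      writerField.get(recordWriter) match {
        case w: FakeWriter =>
          w.addUserMetadata(key, version)
          true
        case _ =>
          false // null (nothing written yet) or an unexpected type
      }
    } catch {
      case NonFatal(e) =>
        // Mirrors the diff: log and continue instead of failing close().
        Console.err.println(s"Could not add metadata: $e")
        false
    }
}
```

With a helper of this shape, `close()` would only need a single call such as `OrcMetadataSketch.addVersionMetadata(recordWriter, "example.version", "x.y")`, where the key and version strings are placeholders and not Spark's actual constants.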