sandip-db commented on code in PR #56600:
URL: https://github.com/apache/spark/pull/56600#discussion_r3449146014


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/xml/StaxXmlGenerator.scala:
##########
@@ -72,7 +72,13 @@ class StaxXmlGenerator(
   private val binaryFormatter = ToStringBase.getBinaryFormatter
 
   private val gen = {
-    val factory = XMLOutputFactory.newInstance()
+    // Instantiate the Woodstox factory directly from the shaded Hadoop classes instead of
+    // using XMLOutputFactory.newInstance(). The latter resolves an implementation via the
+    // service-loader mechanism, which could pick up a different (unshaded) StAX provider on the
+    // classpath. Such a provider would not understand the shaded WstxOutputProperties keys set
+    // below and would throw IllegalArgumentException. Constructing the shaded factory directly
+    // guarantees the properties and the implementation always match.
+    val factory = new WstxOutputFactory()
     // to_xml disables structure validation to allow multiple root tags
     factory.setProperty(WstxOutputProperties.P_OUTPUT_VALIDATE_STRUCTURE, validateStructure)
     factory.setProperty(WstxOutputProperties.P_OUTPUT_VALIDATE_NAMES, options.validateName)

Review Comment:
   Good point. I am approving the current PR.
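
   For anyone who wants to observe the failure mode the code comment in the hunk describes, here is a minimal sketch that uses only the JDK StAX API. It is not the PR's code. The object name, the helper names, and the property key are hypothetical and exist only for illustration. It prints whichever provider `XMLOutputFactory.newInstance()` resolves on the current classpath and checks whether that provider rejects a key it does not recognize.

   ```scala
   import javax.xml.stream.XMLOutputFactory

   object StaxProviderLookupSketch {
     // Hypothetical key, used only for illustration; it is NOT a real Woodstox property.
     val HypotheticalKey: String = "example.hypothetical.validateStructure"

     /** Class name of whichever StAX provider newInstance() resolves on this classpath. */
     def resolvedProvider(): String =
       XMLOutputFactory.newInstance().getClass.getName

     /** True if the resolved provider rejects `key` with IllegalArgumentException. */
     def rejectsKey(key: String): Boolean = {
       val factory = XMLOutputFactory.newInstance()
       try {
         factory.setProperty(key, java.lang.Boolean.FALSE)
         false
       } catch {
         // The StAX API documents this exception for unsupported properties.
         case _: IllegalArgumentException => true
       }
     }

     def main(args: Array[String]): Unit = {
       println(s"Resolved provider: ${resolvedProvider()}")
       println(s"Rejects hypothetical key: ${rejectsKey(HypotheticalKey)}")
     }
   }
   ```

   The resolved class name depends on what is on the classpath, which is the source of the mismatch. Constructing a concrete factory class directly, as the hunk does with `new WstxOutputFactory()`, removes that dependency.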



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

