sandip-db opened a new pull request, #41832:
URL: https://github.com/apache/spark/pull/41832

   ### What changes were proposed in this pull request?
   XML is a widely used data format. An external spark-xml package 
(https://github.com/databricks/spark-xml) is available to read and write XML 
data in Spark. Making spark-xml built-in will provide a better user experience 
for Spark SQL and Structured Streaming. The proposal is to inline the code from 
the spark-xml package.
   
   The PR has 4 main commits:
   i) The first commit is a vanilla copy of the spark-xml src files to 
spark/connector.
   ii) The second commit fixes Scala format issues and updates or adds the ASF 
license in the relevant files.
   iii) The third commit adds mvn dependencies, etc.
   iv) The fourth commit uses SharedSparkSession and testFiles to access 
resource files in the XML unit tests under all environments (sbt, mvn, 
IntelliJ).
   
   
   ### Why are the changes needed?
   Built-in support for the XML data source would provide a better user 
experience than having to import an external package.
   
   
   ### Does this PR introduce _any_ user-facing change?
   Yes, it adds built-in support for the XML data source.
   
   
   ### How was this patch tested?
   Ran the new unit tests that came with the imported spark-xml package.
   Also ran ./dev/run-tests
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

