ufuksungu commented on PR #44022: URL: https://github.com/apache/spark/pull/44022#issuecomment-1917525146
@srowen, before this PR I ran into a scenario while using the original XML repository (I know you are one of its contributors). Assume I have some complex XML files with many nested levels, each containing different tags. For each tag, I need to apply specific operations or transformations to the related data and save the result somewhere else. In other words: apply an operation to a tag, save it, then proceed to the next or inner tag. Some tags may also be irrelevant (say, everything after the 8th level). So I want to apply my operations to the XML level by level, and lazily.

To get a structured format when proceeding into a nested tag, I would have to apply some transformations manually. However, the repository already has functions (InferSchema.infer and StaxXmlParser.parseColumn) that convert XML to structured data. To use them, I think I need to keep the inner XML in its original format. At the time, I thought it made more sense to reuse the existing functions, and that is why I opened this PR.
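A rough sketch of the level-by-level idea described above, in plain Python with only the standard library. This is not the Spark or spark-xml API; `split_level` and the sample document are hypothetical and only illustrate keeping each inner element as raw XML text, so that an existing XML-to-structure parser could be applied to it later, one level at a time.

```python
import xml.etree.ElementTree as ET
from typing import Iterator, Tuple


def split_level(xml_text: str) -> Iterator[Tuple[str, str]]:
    """Lazily yield (tag, raw_xml) for each direct child of the root.

    Only one level is unpacked; deeper content stays as an XML string.
    """
    root = ET.fromstring(xml_text)
    for child in root:
        child.tail = None  # drop any text that trails the element
        yield child.tag, ET.tostring(child, encoding="unicode")


# Hypothetical sample document with two nested levels.
doc = (
    "<order><id>7</id>"
    "<items><item><sku>A1</sku></item><item><sku>B2</sku></item></items>"
    "</order>"
)

# Level 1: process or save each tag; inner XML is still in its original format.
level1 = list(split_level(doc))

# Level 2: descend only into the tag we care about, on demand.
items_xml = level1[1][1]
level2 = list(split_level(items_xml))
```

Because `split_level` is a generator, levels that are irrelevant are never parsed further unless the caller asks for them.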
