Re: [VOTE] SPIP: XML data source support

Mich Talebzadeh Sat, 29 Jul 2023 02:29:10 -0700

+1 for me.

Though Databricks did a good job releasing the code.


GitHub - databricks/spark-xml: XML data source for Spark SQL and DataFrames
<https://github.com/databricks/spark-xml>

Mich Talebzadeh,
Solutions Architect/Engineering Lead
Palantir Technologies Limited
London
United Kingdom


   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Sat, 29 Jul 2023 at 06:34, Jia Fan <[email protected]> wrote:

>
> + 1
>
>
> On 29 Jul 2023, at 13:06, Adrian Pop-Tifrea <[email protected]> wrote:
>
> +1, the more data source formats, the better, and if the solution is
> already thoroughly tested, I say we should go for it.
>
> On Sat, Jul 29, 2023, 06:35 Xiao Li <[email protected]> wrote:
>
>> +1
>>
>> On Fri, Jul 28, 2023 at 15:54 Sean Owen <[email protected]> wrote:
>>
>>> +1 I think that porting the package 'as is' into Spark is probably
>>> worthwhile.
>>> That's relatively easy; the code is already pretty battle-tested and not
>>> that big and even originally came from Spark code, so is more or less
>>> similar already.
>>>
>>> One thing it never got was DSv2 support, which means XML reading would
>>> still be somewhat behind other formats. (I was not able to implement it.)
>>> This isn't a necessary goal right now, but it could be part of the
>>> rationale for moving it into the Spark code base.
>>>
>>> On Fri, Jul 28, 2023 at 5:38 PM Sandip Agarwala
>>> <[email protected]> wrote:
>>>
>>>> Dear Spark community,
>>>>
>>>> I would like to start the vote for "SPIP: XML data source support".
>>>>
>>>> XML is a widely used data format. An external spark-xml package (
>>>> https://github.com/databricks/spark-xml) is available to read and
>>>> write XML data in Spark. Making spark-xml built-in will provide a better
>>>> user experience for Spark SQL and Structured Streaming. The proposal is to
>>>> inline code from the spark-xml package.
>>>>
>>>> SPIP link:
>>>>
>>>> https://docs.google.com/document/d/1ZaOBT4-YFtN58UCx2cdFhlsKbie1ugAn-Fgz_Dddz-Q/edit?usp=sharing
>>>>
>>>> JIRA:
>>>> https://issues.apache.org/jira/browse/SPARK-44265
>>>>
>>>> Discussion Thread:
>>>> https://lists.apache.org/thread/q32hxgsp738wom03mgpg9ykj9nr2n1fh
>>>>
>>>> Please vote on the SPIP for the next 72 hours:
>>>> [ ] +1: Accept the proposal as an official SPIP
>>>> [ ] +0
>>>> [ ] -1: I don’t think this is a good idea because __.
>>>>
>>>> Thanks, Sandip
>>>>
>>>
>
