Re: [VOTE][Format] JSON canonical extension type
> I spoke to the DuckDB maintainers about this. DuckDB has a JSON extension
> which defines a JSON column type. They intend to have DuckDB's Arrow
> integrations recognize this arrow.json extension name on input and set it
> on output.

That's great to hear! Thanks for checking with DuckDB, Ian.

Rok
Re: [VOTE][Format] JSON canonical extension type
Thanks Rok and Pradeep for your work to advance this proposal.

I spoke to the DuckDB maintainers about this. DuckDB has a JSON extension which defines a JSON column type. They intend to have DuckDB's Arrow integrations recognize this arrow.json extension name on input and set it on output.

Ian

On Tue, May 7, 2024 at 8:21 AM Rok Mihevc wrote:

> Hi all,
>
> With 9 +1 votes (4 binding, 5 non-binding) and 0 -1 votes the proposal is
> approved as shown below and in the PR [1].
> Thank you everyone who voted and helped shape this proposal. Once the
> language is merged we'll proceed with work on the C++ implementation PR
> [2].
>
> [1] https://github.com/apache/arrow/pull/41257
> [2] https://github.com/apache/arrow/pull/13901
>
> Rok
> ---
>
> JSON
>
> * Extension name: `arrow.json`.
>
> * The storage type of this extension is ``StringArray`` or
>   ``LargeStringArray`` or ``StringViewArray``.
>   Only UTF-8 encoded JSON as specified in `rfc8259`_ is supported.
>
> * Extension type parameters:
>
>   This type does not have any parameters.
>
> * Description of the serialization:
>
>   Metadata is either an empty string or a JSON string with an empty object.
>   In the future, additional fields may be added, but they are not required
>   to interpret the array.
Re: [VOTE][Format] JSON canonical extension type
Hi all,

With 9 +1 votes (4 binding, 5 non-binding) and 0 -1 votes the proposal is approved as shown below and in the PR [1].
Thank you everyone who voted and helped shape this proposal. Once the language is merged we'll proceed with work on the C++ implementation PR [2].

[1] https://github.com/apache/arrow/pull/41257
[2] https://github.com/apache/arrow/pull/13901

Rok
---

JSON

* Extension name: `arrow.json`.

* The storage type of this extension is ``StringArray`` or
  ``LargeStringArray`` or ``StringViewArray``.
  Only UTF-8 encoded JSON as specified in `rfc8259`_ is supported.

* Extension type parameters:

  This type does not have any parameters.

* Description of the serialization:

  Metadata is either an empty string or a JSON string with an empty object.
  In the future, additional fields may be added, but they are not required
  to interpret the array.
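As an illustration of the serialization rule in the approved text above, here is a minimal Python sketch using only the standard library. It assumes the extension is attached through Arrow's usual field-level custom metadata keys, `ARROW:extension:name` and `ARROW:extension:metadata`. The helper names `extension_field_metadata` and `is_acceptable_metadata` are hypothetical, and accepting a non-empty JSON object is one reading of "additional fields may be added", not something the proposal spells out.

```python
import json

EXTENSION_NAME = "arrow.json"


def extension_field_metadata(serialized: str = "") -> dict:
    """Build the field-level key/value pairs that tag a column as arrow.json.

    Hypothetical helper: the two keys are Arrow's standard extension-type
    metadata keys; the default serialization is the empty string.
    """
    return {
        "ARROW:extension:name": EXTENSION_NAME,
        "ARROW:extension:metadata": serialized,
    }


def is_acceptable_metadata(serialized: str) -> bool:
    """Accept "" or a string holding a JSON object.

    Assumption: unknown fields inside the object are tolerated, since the
    text says future fields are not required to interpret the array.
    """
    if serialized == "":
        return True
    try:
        return isinstance(json.loads(serialized), dict)
    except ValueError:
        # json.JSONDecodeError is a ValueError subclass
        return False
```

A writer following this sketch would emit either `""` or `"{}"`, while a reader would ignore any fields it does not recognize.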
Re: [VOTE][Format] JSON canonical extension type
+1 (non-binding)

On Mon, May 6, 2024 at 12:14 PM Wes McKinney wrote:

> +1
Re: [VOTE][Format] JSON canonical extension type
+1

On Tue, Apr 30, 2024 at 4:03 PM Antoine Pitrou wrote:

> +1 (binding) for the current proposal, i.e. with the RFC 8259
> requirement and the 3 current String types allowed.
>
> Regards
>
> Antoine.
Re: [VOTE][Format] JSON canonical extension type
+1 (binding) for the current proposal, i.e. with the RFC 8259 requirement and the 3 current String types allowed.

Regards

Antoine.

On 30/04/2024 at 19:26, Rok Mihevc wrote:

> Hi all, thanks for the votes and comments so far.
> I've amended [1] the proposed language with the RFC-8259 requirement as it
> seems to be almost unanimously requested. New language is below.
> To Micah's comment regarding rejecting Binary arrays [2] - please discuss
> in the PR.
>
> Let's leave the vote open until after the May holiday.
>
> Rok
>
> [1] https://github.com/apache/arrow/pull/41257/commits/594945010e3b7d393b411aad971743ffcdbdbc8e
> [2] https://github.com/apache/arrow/pull/41257#discussion_r1583441040
>
> JSON
>
> * Extension name: `arrow.json`.
>
> * The storage type of this extension is ``StringArray`` or
>   ``LargeStringArray`` or ``StringViewArray``.
>   *Only UTF-8 encoded JSON as specified in `rfc8259`_ is supported.*
>
> * Extension type parameters:
>
>   This type does not have any parameters.
>
> * Description of the serialization:
>
>   Metadata is either an empty string or a JSON string with an empty object.
>   In the future, additional fields may be added, but they are not required
>   to interpret the array.
Re: [VOTE][Format] JSON canonical extension type
+1 (non-binding)

Thanks for moving these two forward, Rok!

On Tue, Apr 30, 2024 at 19:26, Rok Mihevc wrote:

> Hi all, thanks for the votes and comments so far.
> I've amended [1] the proposed language with the RFC-8259 requirement as it
> seems to be almost unanimously requested. New language is below.
> To Micah's comment regarding rejecting Binary arrays [2] - please discuss
> in the PR.
>
> Let's leave the vote open until after the May holiday.
>
> Rok
>
> [1] https://github.com/apache/arrow/pull/41257/commits/594945010e3b7d393b411aad971743ffcdbdbc8e
> [2] https://github.com/apache/arrow/pull/41257#discussion_r1583441040
>
> JSON
>
> * Extension name: `arrow.json`.
>
> * The storage type of this extension is ``StringArray`` or
>   ``LargeStringArray`` or ``StringViewArray``.
>   *Only UTF-8 encoded JSON as specified in `rfc8259`_ is supported.*
>
> * Extension type parameters:
>
>   This type does not have any parameters.
>
> * Description of the serialization:
>
>   Metadata is either an empty string or a JSON string with an empty object.
>   In the future, additional fields may be added, but they are not required
>   to interpret the array.
Re: [VOTE][Format] JSON canonical extension type
Hi all, thanks for the votes and comments so far.
I've amended [1] the proposed language with the RFC-8259 requirement as it seems to be almost unanimously requested. New language is below.
To Micah's comment regarding rejecting Binary arrays [2] - please discuss in the PR.

Let's leave the vote open until after the May holiday.

Rok

[1] https://github.com/apache/arrow/pull/41257/commits/594945010e3b7d393b411aad971743ffcdbdbc8e
[2] https://github.com/apache/arrow/pull/41257#discussion_r1583441040

JSON

* Extension name: `arrow.json`.

* The storage type of this extension is ``StringArray`` or
  ``LargeStringArray`` or ``StringViewArray``.
  *Only UTF-8 encoded JSON as specified in `rfc8259`_ is supported.*

* Extension type parameters:

  This type does not have any parameters.

* Description of the serialization:

  Metadata is either an empty string or a JSON string with an empty object.
  In the future, additional fields may be added, but they are not required
  to interpret the array.
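To make the amended requirement concrete, the following Python sketch checks whether a raw value is UTF-8 text holding one JSON value. It is only an approximation of RFC 8259 built on the standard `json` module: that parser accepts `NaN` and `Infinity` by default, so the sketch rejects them through the `parse_constant` hook. The function name `is_rfc8259_utf8_json` is hypothetical, and nothing in the thread says implementations must validate values this way.

```python
import json


def _reject_constant(name: str) -> None:
    # json.loads calls this hook for NaN, Infinity and -Infinity, which
    # the RFC 8259 grammar does not allow as JSON values.
    raise ValueError(f"{name} is not valid RFC 8259 JSON")


def is_rfc8259_utf8_json(raw: bytes) -> bool:
    """Return True if `raw` is UTF-8 text that parses as a single JSON value."""
    try:
        text = raw.decode("utf-8")  # strict decoding: raises on invalid UTF-8
        json.loads(text, parse_constant=_reject_constant)
    except ValueError:
        # UnicodeDecodeError and json.JSONDecodeError both subclass ValueError
        return False
    return True
```

Decoding explicitly as UTF-8 before parsing matters here, because `json.loads` given bytes would otherwise also auto-detect UTF-16 and UTF-32 input.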
Re: [VOTE][Format] JSON canonical extension type
+1 (binding)

I agree we should be explicit about RFC-8259

On Mon, Apr 29, 2024 at 4:46 PM David Li wrote:

> +1 (binding)
>
> assuming we explicitly state RFC-8259
Re: [VOTE][Format] JSON canonical extension type
+1 (binding)

assuming we explicitly state RFC-8259

On Tue, Apr 30, 2024, at 08:02, Matt Topol wrote:

> +1 (binding)
Re: [VOTE][Format] JSON canonical extension type
+1 (binding)

On Mon, Apr 29, 2024 at 5:36 PM Ian Cook wrote:

> +1 (non-binding)
>
> I added a comment in the PR suggesting that we explicitly refer to RFC-8259
> in CanonicalExtensions.rst.
Re: [VOTE][Format] JSON canonical extension type
+1 (non-binding)

I added a comment in the PR suggesting that we explicitly refer to RFC-8259 in CanonicalExtensions.rst.

On Mon, Apr 29, 2024 at 1:21 PM Micah Kornfield wrote:

> +1, I added a comment to the PR because I think we should recommend
> implementations specifically reject parsing Binary arrays with the
> annotation in case we want to support non-UTF8 encodings in the future
> (even though IIRC these aren't really JSON spec compliant).
Re: [VOTE][Format] JSON canonical extension type
+1, I added a comment to the PR because I think we should recommend implementations specifically reject parsing Binary arrays with the annotation in case we want to support non-UTF8 encodings in the future (even though IIRC these aren't really JSON spec compliant).

On Fri, Apr 19, 2024 at 1:24 PM Rok Mihevc wrote:

> Hi all,
>
> Following discussions [1][2] and preliminary implementation work (by
> Pradeep Gollakota) [3] I would like to propose a vote to add language for
> JSON canonical extension type to CanonicalExtensions.rst as in PR [4] and
> written below.
> A draft C++ implementation PR can be seen here [3].
>
> [1] https://lists.apache.org/thread/p3353oz6lk846pnoq6vk638tjqz2hm1j
> [2] https://lists.apache.org/thread/7xph3476g9rhl9mtqvn804fqf5z8yoo1
> [3] https://github.com/apache/arrow/pull/13901
> [4] https://github.com/apache/arrow/pull/41257 <- proposed change
>
> The vote will be open for at least 72 hours.
>
> [ ] +1 Accept this proposal
> [ ] +0
> [ ] -1 Do not accept this proposal because...
>
> JSON
>
> * Extension name: `arrow.json`.
>
> * The storage type of this extension is ``StringArray`` or
>   ``LargeStringArray`` or ``StringViewArray``.
>   Only UTF-8 encoded JSON is supported.
>
> * Extension type parameters:
>
>   This type does not have any parameters.
>
> * Description of the serialization:
>
>   Metadata is either an empty string or a JSON string with an empty object.
>   In the future, additional fields may be added, but they are not required
>   to interpret the array.
>
> Rok
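The storage-type rule and Micah's suggestion about Binary arrays can be pictured with a small Python sketch. It is a hypothetical reader-side check, not part of the proposal: the type labels `string`, `large_string`, `string_view` and `binary` are placeholder strings standing in for whatever type objects a real Arrow library exposes, and `storage_error` is an invented name. Whether rejection of Binary storage is actually recommended was left to the PR discussion.

```python
from typing import Optional

# Assumed labels for the three storage types named in the proposal.
ALLOWED_STORAGE_TYPES = frozenset({"string", "large_string", "string_view"})


def storage_error(extension_name: str, storage_type: str) -> Optional[str]:
    """Return an error message if an arrow.json column uses disallowed storage.

    Columns carrying any other extension name are not this check's concern,
    so None is returned for them.
    """
    if extension_name != "arrow.json":
        return None
    if storage_type in ALLOWED_STORAGE_TYPES:
        return None
    return (
        f"arrow.json requires string, large_string or string_view storage, "
        f"got {storage_type}"
    )
```

Under this sketch, a column annotated `arrow.json` over `binary` storage is reported as an error rather than silently parsed, which keeps the door open for other encodings to be defined later.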