Re: AW: Personal feedback on your last release on Apache Arrow ADBC 0.11.0

2024-04-17 Thread Christofer Dutz
Hi all,

No need to apologize. It was me who didn't find you. I guess that can happen
when you go through all vote threads of one third of all Apache projects in a
couple of days.

But yes: adding that "binding" for PMC members is always a good idea.

Chris

Sent from Outlook for Android

From: Jean-Baptiste Onofré 
Sent: Thursday, April 18, 2024 7:18:00 AM
To: dev@arrow.apache.org 
Cc: lidav...@apache.org ; cd...@apache.org 

Subject: Re: AW: Personal feedback on your last release on Apache Arrow ADBC 
0.11.0

Something that I do on releases is to count the binding and non-binding
votes in the result email (to have a clear result).

Something like:

"The vote passed with the following result:
+1 (binding): foo, bar
+1 (non binding): other, other
No other vote.
"

Regards
JB

On Thu, Apr 18, 2024 at 1:54 AM Sutou Kouhei  wrote:
>
> Hi,
>
> Sorry for the confusion I caused... My country (Japan) uses
> the "${FAMILY_NAME} ${FIRST_NAME}" order for names. I found a
> recommendation[1] from my country that "${FAMILY_NAME}
> ${FIRST_NAME}" is preferred to "${FIRST_NAME}
> ${FAMILY_NAME}" in English contexts too. So I switched to
> the "${FAMILY_NAME} ${FIRST_NAME}" style a few years ago.
>
> [1]
> https://www.bunka.go.jp/kokugo_nihongo/sisaku/joho/joho/kakuki/22/tosin04/17.html
>
> But I couldn't change the "Public name" field at
> https://id.apache.org/ . So
> https://people.apache.org/phonebook.html?uid=kou still uses
> the "${FIRST_NAME} ${FAMILY_NAME}" style.
>
>
> Should I use "+1 (binding)" instead of just "+1" to avoid
> this confusion?
>
>
> Thanks,
> --
> kou
>
> In
>  
> 
>   "AW: Personal feedback on your last release on Apache Arrow ADBC 0.11.0" on 
> Wed, 17 Apr 2024 08:44:10 +,
>   Christofer Dutz  wrote:
>
> > When looking at whimsy, I can’t see any person named Sutou Kouhei listed as 
> > member of the Arrow PMC.
> >
> > Cut that … I was looking for Sutou Kouhei, but it’s Kouhei Sutou … yeah … 
> > ok … then please ignore my mumbling ;-)
> >
> > And yeah … the result now also moved to the same page … guess it was sent 
> > out a while after the Announce … guess that’s why I missed it.
> >
> > Thanks for following up …
> >
> > Chris
> >
> > From: David Li
> > Date: Wednesday, April 17, 2024 at 10:36
> > To: Christofer Dutz, dev@arrow.apache.org
> > Subject: Re: Personal feedback on your last release on Apache Arrow ADBC
> > 0.11.0
> > Hi Christofer,
> >
> > Sutou Kouhei is part of the PMC.
> >
> > Additionally, there is a result email: 
> > https://lists.apache.org/thread/gb5k69pd3k6lnbzw978fm7ppx1p9cx15
> >
> > On Wed, Apr 17, 2024, at 16:52, Christofer Dutz wrote:
> >> Hi all,
> >>
> >> While reviewing your project's activity in the last quarter as part of
> >> my preparation for today's board meeting, I came across your last vote
> >> on Apache Arrow ADBC 0.11.0 RC0.
> >>
> >> Technically I count only 2 binding +1 votes:
> >> - Matthew Topol
> >> - Dewey Dunnington
> >>
> >> All others are not part of the PMC.
> >>
> >> I assume the Release Manager David implicitly counted himself as +1;
> >> however, the concept of an implicit vote does not exist at Apache. If you
> >> want to save sending an additional email, add something like "this
> >> also counts as my +1 vote" to your email, or - even better - send an
> >> explicit vote email.
> >>
> >> It would also be good to have a RESULT email containing the result of a
> >> vote.
> >>
> >> So right now we would need a third binding vote as soon as possible
> >> (Possibly also for other votes, where we had the release manager
> >> provide the missing third vote).
> >>
> >> Chris
> >>
> >> PS: Please keep me in CC as I'm not subscribed here.


Re: AW: Personal feedback on your last release on Apache Arrow ADBC 0.11.0

2024-04-17 Thread Jean-Baptiste Onofré
Something that I do on releases is to count the binding and non-binding
votes in the result email (to have a clear result).

Something like:

"The vote passed with the following result:
+1 (binding): foo, bar
+1 (non binding): other, other
No other vote.
"
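
For illustration only, a summary in that format could be generated with a
small Python sketch like the one below. It is not part of any Arrow or ASF
tooling: the Vote type and both function names are made up for this example,
and the pass rule (at least three binding +1 votes and more binding +1 than
binding -1) reflects my understanding of the ASF release policy rather than
anything stated in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    """One vote in a release thread (hypothetical helper type)."""
    voter: str
    value: int      # +1, 0 or -1
    binding: bool   # True when the voter is a PMC member


def summarize(votes):
    """Group voters by vote value and by binding / non binding."""
    lines = []
    for value, label in ((1, "+1"), (0, "+0"), (-1, "-1")):
        for binding, kind in ((True, "binding"), (False, "non binding")):
            names = [v.voter for v in votes
                     if v.value == value and v.binding == binding]
            if names:
                lines.append(f"{label} ({kind}): {', '.join(names)}")
    return "\n".join(lines)


def passes(votes, minimum_binding=3):
    """Assumed rule: >= 3 binding +1 and more binding +1 than binding -1."""
    plus = sum(1 for v in votes if v.binding and v.value == 1)
    minus = sum(1 for v in votes if v.binding and v.value == -1)
    return plus >= minimum_binding and plus > minus


if __name__ == "__main__":
    # Placeholder names taken from the example summary above.
    example = [Vote("foo", 1, True), Vote("bar", 1, True),
               Vote("other", 1, False)]
    print(summarize(example))
    print("passed" if passes(example) else "not enough binding votes yet")
```

With only two binding +1 votes, as in the placeholder data, passes() returns
False, which is the situation the original mail in this thread was worried
about.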

Regards
JB

On Thu, Apr 18, 2024 at 1:54 AM Sutou Kouhei  wrote:
>
> Hi,
>
> Sorry for the confusion I caused... My country (Japan) uses
> the "${FAMILY_NAME} ${FIRST_NAME}" order for names. I found a
> recommendation[1] from my country that "${FAMILY_NAME}
> ${FIRST_NAME}" is preferred to "${FIRST_NAME}
> ${FAMILY_NAME}" in English contexts too. So I switched to
> the "${FAMILY_NAME} ${FIRST_NAME}" style a few years ago.
>
> [1]
> https://www.bunka.go.jp/kokugo_nihongo/sisaku/joho/joho/kakuki/22/tosin04/17.html
>
> But I couldn't change the "Public name" field at
> https://id.apache.org/ . So
> https://people.apache.org/phonebook.html?uid=kou still uses
> the "${FIRST_NAME} ${FAMILY_NAME}" style.
>
>
> Should I use "+1 (binding)" instead of just "+1" to avoid
> this confusion?
>
>
> Thanks,
> --
> kou
>
> In
>  
> 
>   "AW: Personal feedback on your last release on Apache Arrow ADBC 0.11.0" on 
> Wed, 17 Apr 2024 08:44:10 +,
>   Christofer Dutz  wrote:
>
> > When looking at whimsy, I can’t see any person named Sutou Kouhei listed as 
> > member of the Arrow PMC.
> >
> > Cut that … I was looking for Sutou Kouhei, but it’s Kouhei Sutou … yeah … 
> > ok … then please ignore my mumbling ;-)
> >
> > And yeah … the result now also moved to the same page … guess it was sent 
> > out a while after the Announce … guess that’s why I missed it.
> >
> > Thanks for following up …
> >
> > Chris
> >
> > From: David Li
> > Date: Wednesday, April 17, 2024 at 10:36
> > To: Christofer Dutz, dev@arrow.apache.org
> > Subject: Re: Personal feedback on your last release on Apache Arrow ADBC
> > 0.11.0
> > Hi Christofer,
> >
> > Sutou Kouhei is part of the PMC.
> >
> > Additionally, there is a result email: 
> > https://lists.apache.org/thread/gb5k69pd3k6lnbzw978fm7ppx1p9cx15
> >
> > On Wed, Apr 17, 2024, at 16:52, Christofer Dutz wrote:
> >> Hi all,
> >>
> >> While reviewing your project's activity in the last quarter as part of
> >> my preparation for today's board meeting, I came across your last vote
> >> on Apache Arrow ADBC 0.11.0 RC0.
> >>
> >> Technically I count only 2 binding +1 votes:
> >> - Matthew Topol
> >> - Dewey Dunnington
> >>
> >> All others are not part of the PMC.
> >>
> >> I assume the Release Manager David implicitly counted himself as +1;
> >> however, the concept of an implicit vote does not exist at Apache. If you
> >> want to save sending an additional email, add something like "this
> >> also counts as my +1 vote" to your email, or - even better - send an
> >> explicit vote email.
> >>
> >> It would also be good to have a RESULT email containing the result of a
> >> vote.
> >>
> >> So right now we would need a third binding vote as soon as possible
> >> (Possibly also for other votes, where we had the release manager
> >> provide the missing third vote).
> >>
> >> Chris
> >>
> >> PS: Please keep me in CC as I'm not subscribed here.


Re: [VOTE] Release Apache Arrow 16.0.0 - RC0

2024-04-17 Thread Jean-Baptiste Onofré
+1 (non binding)

I checked:
- hash and signature are OK
- ASF headers are there
- no binary found in source distribution
- build ok
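
As a rough sketch of the hash part of the first check (the signature check
with GPG is not covered), the digest comparison could look like this in
Python. The function names are made up for this example, and it assumes the
.sha512 file starts with the hex digest, as sha512sum-style files do; the
file names in the comment are placeholders, not actual artifact names.

```python
import hashlib
from pathlib import Path


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum_file(artifact, checksum_file):
    """Compare the artifact's digest with the first token of the checksum file.

    Assumption: the checksum file has the form "<hex digest>  <file name>".
    """
    expected = Path(checksum_file).read_text().split()[0].lower()
    return sha512_of(artifact) == expected


# Hypothetical usage; replace the placeholders with the downloaded files:
# matches_checksum_file("source.tar.gz", "source.tar.gz.sha512")
```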

I ran the Java and C++ tests on macOS, all good.

Thanks !
Regards
JB

On Wed, Apr 17, 2024 at 10:01 AM Raúl Cumplido  wrote:
>
> Hi,
>
> I would like to propose the following release candidate (RC0) of Apache
> Arrow version 16.0.0. This is a release consisting of 378
> resolved GitHub issues[1].
>
> This release candidate is based on commit:
> 6a28035c2b49b432dc63f5ee7524d76b4ed2d762 [2]
>
> The source release rc0 is hosted at [3].
> The binary artifacts are hosted at [4][5][6][7][8][9][10][11].
> The changelog is located at [12].
>
> Please download, verify checksums and signatures, run the unit tests,
> and vote on the release. See [13] for how to validate a release candidate.
>
> See also a verification result on GitHub pull request [14].
>
> The vote will be open for at least 72 hours.
>
> [ ] +1 Release this as Apache Arrow 16.0.0
> [ ] +0
> [ ] -1 Do not release this as Apache Arrow 16.0.0 because...
>
> [1]: 
> https://github.com/apache/arrow/issues?q=is%3Aissue+milestone%3A16.0.0+is%3Aclosed
> [2]: 
> https://github.com/apache/arrow/tree/6a28035c2b49b432dc63f5ee7524d76b4ed2d762
> [3]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-16.0.0-rc0
> [4]: https://apache.jfrog.io/artifactory/arrow/almalinux-rc/
> [5]: https://apache.jfrog.io/artifactory/arrow/amazon-linux-rc/
> [6]: https://apache.jfrog.io/artifactory/arrow/centos-rc/
> [7]: https://apache.jfrog.io/artifactory/arrow/debian-rc/
> [8]: https://apache.jfrog.io/artifactory/arrow/java-rc/16.0.0-rc0
> [9]: https://apache.jfrog.io/artifactory/arrow/nuget-rc/16.0.0-rc0
> [10]: https://apache.jfrog.io/artifactory/arrow/python-rc/16.0.0-rc0
> [11]: https://apache.jfrog.io/artifactory/arrow/ubuntu-rc/
> [12]: 
> https://github.com/apache/arrow/blob/6a28035c2b49b432dc63f5ee7524d76b4ed2d762/CHANGELOG.md
> [13]: https://arrow.apache.org/docs/developers/release_verification.html
> [14]: https://github.com/apache/arrow/pull/41235


Re: AW: Personal feedback on your last release on Apache Arrow ADBC 0.11.0

2024-04-17 Thread Jean-Baptiste Onofré
Hi

Yeah it’s better to state the vote as binding or non-binding.
It simplifies the count ;)

Thanks !
Regards
JB

On Thu, Apr 18, 2024 at 01:54, Sutou Kouhei wrote:

> Hi,
>
> Sorry for the confusion I caused... My country (Japan) uses
> the "${FAMILY_NAME} ${FIRST_NAME}" order for names. I found a
> recommendation[1] from my country that "${FAMILY_NAME}
> ${FIRST_NAME}" is preferred to "${FIRST_NAME}
> ${FAMILY_NAME}" in English contexts too. So I switched to
> the "${FAMILY_NAME} ${FIRST_NAME}" style a few years ago.
>
> [1]
> https://www.bunka.go.jp/kokugo_nihongo/sisaku/joho/joho/kakuki/22/tosin04/17.html
>
> But I couldn't change the "Public name" field at
> https://id.apache.org/ . So
> https://people.apache.org/phonebook.html?uid=kou still uses
> the "${FIRST_NAME} ${FAMILY_NAME}" style.
>
>
> Should I use "+1 (binding)" instead of just "+1" to avoid
> this confusion?
>
>
> Thanks,
> --
> kou
>
> In
>  <
> as2pr05mb102478736f4542dd0e7120ee8a2...@as2pr05mb10247.eurprd05.prod.outlook.com
> >
>   "AW: Personal feedback on your last release on Apache Arrow ADBC 0.11.0"
> on Wed, 17 Apr 2024 08:44:10 +,
>   Christofer Dutz  wrote:
>
> > When looking at whimsy, I can’t see any person named Sutou Kouhei listed
> > as member of the Arrow PMC.
> >
> > Cut that … I was looking for Sutou Kouhei, but it’s Kouhei Sutou … yeah
> > … ok … then please ignore my mumbling ;-)
> >
> > And yeah … the result now also moved to the same page … guess it was
> > sent out a while after the Announce … guess that’s why I missed it.
> >
> > Thanks for following up …
> >
> > Chris
> >
> > From: David Li
> > Date: Wednesday, April 17, 2024 at 10:36
> > To: Christofer Dutz, dev@arrow.apache.org <dev@arrow.apache.org>
> > Subject: Re: Personal feedback on your last release on Apache Arrow ADBC
> > 0.11.0
> > Hi Christofer,
> >
> > Sutou Kouhei is part of the PMC.
> >
> > Additionally, there is a result email:
> > https://lists.apache.org/thread/gb5k69pd3k6lnbzw978fm7ppx1p9cx15
> >
> > On Wed, Apr 17, 2024, at 16:52, Christofer Dutz wrote:
> >> Hi all,
> >>
> >> While reviewing your project's activity in the last quarter as part of
> >> my preparation for today's board meeting, I came across your last vote
> >> on Apache Arrow ADBC 0.11.0 RC0.
> >>
> >> Technically I count only 2 binding +1 votes:
> >> - Matthew Topol
> >> - Dewey Dunnington
> >>
> >> All others are not part of the PMC.
> >>
> >> I assume the Release Manager David implicitly counted himself as +1;
> >> however, the concept of an implicit vote does not exist at Apache. If you
> >> want to save sending an additional email, add something like "this
> >> also counts as my +1 vote" to your email, or - even better - send an
> >> explicit vote email.
> >>
> >> It would also be good to have a RESULT email containing the result of a
> >> vote.
> >>
> >> So right now we would need a third binding vote as soon as possible
> >> (Possibly also for other votes, where we had the release manager
> >> provide the missing third vote).
> >>
> >> Chris
> >>
> >> PS: Please keep me in CC as I'm not subscribed here.
>


Re: [VOTE] Release Apache Arrow 16.0.0 - RC0

2024-04-17 Thread Dominik Moritz
Totally understand. I didn’t know how involved this process is. 16.1.0
makes sense.

I made myself a mail filter so I don’t miss future “feature freeze” emails.
As an idea, can we add the target dates to milestones on GitHub as well?
That would certainly help me as I don’t have the cycles to actively monitor
the mailing list.

Speaking of which, can I make a 16.1.0 milestone so I can correct issues
and make correct release notes?

On Apr 17, 2024 at 14:34:12, Raúl Cumplido  wrote:

> Hi Dominik,
>
> I am sorry the announcement was missed. I did send an email one month
> ago [1] and shared the dates on Zulip and the Arrow community call. I
> probably should send an email once the feature freeze is about to be
> performed as a reminder and to give more visibility.
>
> At this point I would prefer to create a 16.0.1 or 16.1.0 release as
> the 16.0.0 RC0 is pretty stable (this took a lot of time to achieve :)
> ) and has already enough votes for it to be released. Would that be
> ok?
>
> Thanks,
> Raúl
>
> [1] https://lists.apache.org/thread/lhdxnk4j0rl3sbtswlyvkp2rq13539fg
>
>
> On Wed, Apr 17, 2024 at 19:20, Dominik Moritz wrote:
>
> I’m sorry that we missed the announcement for the release but there are a
> few ArrowJS changes that we had marked for arrow 16 that are now in main. I
> created a PR with those changes to make it easier to see:
> https://github.com/apache/arrow/pull/41261. Can you add those to the RC1 if
> not into RC0?
>
> On Apr 17, 2024 at 12:43:25, Ruoxi Sun  wrote:
>
> > +1 (non-binding)
> >
> > On my Intel Mac, OS version Sonoma 14.2.1 (23C71), verified cpp and go:
> >
> > TEST_DEFAULT=0 TEST_GO=1 TEST_CPP=1 ./verify-release-candidate.sh 16.0.0 0
> >
> > I also tried to verify python:
> >
> > TEST_DEFAULT=0 TEST_PYTHON=1 ./verify-release-candidate.sh 16.0.0 0
> >
> > It succeeded except for [1] (as for several previous versions), which
> > should be trivial.
> >
> > [1] https://github.com/apache/arrow/issues/39679
> >
> > *Regards,*
> > *Rossi SUN*
> >
> > Raúl Cumplido wrote on Thu, Apr 18, 2024 at 00:33:
> >
> > +1 (binding)
> >
> > I've successfully verified sources and binaries:
> >
> > TEST_DEFAULT=0 TEST_SOURCE=1 dev/release/verify-release-candidate.sh 16.0.0 0
> > TEST_DEFAULT=0 TEST_BINARIES=1 dev/release/verify-release-candidate.sh 16.0.0 0
> >
> > with:
> >   * Python 3.10.12
> >   * gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04)
> >   * NVIDIA CUDA cuda_11.5.r11.5/compiler.30672275_0
> >   * openjdk 17.0.10 2024-01-16
> >   * ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x86_64-linux-gnu]
> >   * 7.0.117
> >   * Ubuntu 22.04 LTS
> >
> > On Wed, Apr 17, 2024 at 16:33, Joris Van den Bossche wrote:
> > >
> > > +1 (binding)
> > >
> > > Tested source with conda on Ubuntu
> > >
> > > On Wed, 17 Apr 2024 at 16:28, Vibhatha Abeykoon wrote:
> > > >
> > > > I executed the following
> > > >
> > > > # Verifying C++
> > > >
> > > > ```bash
> > > > TEST_DEFAULT=0 TEST_CPP=1 ./verify-release-candidate.sh 16.0.0 0
> > > > ```
> > > >
> > > > # Verifying C++ and Python
> > > >
> > > > ```bash
> > > > TEST_DEFAULT=0 TEST_CPP=1 TEST_PYTHON=1 verify-release-candidate.sh 16.0.0 0
> > > > ```
> > > >
> > > > # Verifying C++ and Java
> > > >
> > > > ```bash
> > > > TEST_DEFAULT=0 TEST_INTEGRATION_CPP=1 TEST_INTEGRATION_JAVA=1 ./verify-release-candidate.sh 16.0.0 0
> > > > ```
> > > >
> > > > with:
> > > > * Python 3.10.12
> > > > * gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
> > > > * openjdk version "21.0.2" 2024-01-16
> > > > * Ubuntu 22.04.4 LTS
> > > >
> > > > Verifying C++, Python and Java
> > > >
> > > > +1 (non-binding)
> > > >
> > > > On Wed, Apr 17, 2024 at 6:16 PM David Li wrote:
> > > >
> > > > > +1
> > > > >
> > > > > tested sources on Debian 12, x86-64
> > > > >
> > > > > On Wed, Apr 17, 2024, at 18:14, Raúl Cumplido wrote:
> > > > > > Hi,
> > > > > >
> > > > > > Just a minor note, the binary verification for
> > > > > > verify-rc-binaries-wheels-windows failed with [1].
> > > > > > This can be avoided by implementing the solution proposed in this
> > > > > > comment by Kou [2]. See more details there.
> > > > > >
> > > > > > As shared in the comment we don't think this is a blocker as it just
> > > > > > requires setting TZDIR and downloading the IANA database for the ORC test

Re: [VOTE] Release Apache Arrow 16.0.0 - RC0

2024-04-17 Thread Gang Wu
+1 (non-binding)

Successfully verified C++ on macOS 12.5.1 with AppleClang 13.1.6.13160021
by running
`TEST_DEFAULT=0 TEST_CPP=1 ./verify-release-candidate.sh 16.0.0 0`
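
For anyone who wants to script several such runs, the invocation above could
be wrapped in Python roughly as follows. This is only a sketch: the two
function names are made up, and it assumes (as the commands in this thread
suggest) that TEST_DEFAULT=0 switches every check off and each TEST_<NAME>=1
switches one back on. The script path is whatever location you run it from.

```python
import os
import subprocess


def build_verification(version, rc, components,
                       script="./verify-release-candidate.sh"):
    """Return (env, argv) for one run of the verification script.

    `components` are suffixes of the TEST_* switches seen in this thread,
    for example "CPP", "PYTHON" or "SOURCE".
    """
    env = dict(os.environ)
    env["TEST_DEFAULT"] = "0"              # assumed: disable all checks first
    for name in components:
        env[f"TEST_{name.upper()}"] = "1"  # then enable the requested ones
    return env, [script, str(version), str(rc)]


def run_verification(version, rc, components, **kwargs):
    """Run the script and raise CalledProcessError on a non-zero exit."""
    env, argv = build_verification(version, rc, components, **kwargs)
    return subprocess.run(argv, env=env, check=True)


# Equivalent of the command above (only works inside an Arrow checkout):
# run_verification("16.0.0", 0, ["CPP"])
```

Keeping the command construction separate from the actual run makes it easy
to print or log what would be executed before starting a long verification.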

Best,
Gang


On Thu, Apr 18, 2024 at 3:20 AM Rok Mihevc  wrote:

> +1
>
> I've successfully verified sources on Ubuntu 22.04:
>
> TEST_DEFAULT=0 TEST_SOURCE=1 dev/release/verify-release-candidate.sh 16.0.0 0
>
> Rok
>
> On Wed, Apr 17, 2024 at 8:36 PM Raúl Cumplido 
> wrote:
>
> > Hi Dominik,
> >
> > I am sorry the announcement was missed. I did send an email one month
> > ago [1] and shared the dates on Zulip and the Arrow community call. I
> > probably should send an email once the feature freeze is about to be
> > performed as a reminder and to give more visibility.
> >
> > At this point I would prefer to create a 16.0.1 or 16.1.0 release as
> > the 16.0.0 RC0 is pretty stable (this took a lot of time to achieve :)
> > ) and has already enough votes for it to be released. Would that be
> > ok?
> >
> > Thanks,
> > Raúl
> >
> > [1] https://lists.apache.org/thread/lhdxnk4j0rl3sbtswlyvkp2rq13539fg
> >
> >
> > On Wed, Apr 17, 2024 at 19:20, Dominik Moritz wrote:
> > >
> > > I’m sorry that we missed the announcement for the release but there are a
> > > few ArrowJS changes that we had marked for arrow 16 that are now in main. I
> > > created a PR with those changes to make it easier to see:
> > > https://github.com/apache/arrow/pull/41261. Can you add those to the RC1 if
> > > not into RC0?
> > >
> > > On Apr 17, 2024 at 12:43:25, Ruoxi Sun  wrote:
> > >
> > > > +1 (non-binding)
> > > >
> > > > On my Intel Mac, OS version Sonoma 14.2.1 (23C71), verified cpp and go:
> > > >
> > > > TEST_DEFAULT=0 TEST_GO=1 TEST_CPP=1 ./verify-release-candidate.sh 16.0.0 0
> > > >
> > > > I also tried to verify python:
> > > >
> > > > TEST_DEFAULT=0 TEST_PYTHON=1 ./verify-release-candidate.sh 16.0.0 0
> > > >
> > > > It succeeded except for [1] (as for several previous versions), which
> > > > should be trivial.
> > > >
> > > > [1] https://github.com/apache/arrow/issues/39679
> > > >
> > > > *Regards,*
> > > > *Rossi SUN*
> > > >
> > > >
> > > > Raúl Cumplido wrote on Thu, Apr 18, 2024 at 00:33:
> > > >
> > > > +1 (binding)
> > > >
> > > > I've successfully verified sources and binaries:
> > > >
> > > > TEST_DEFAULT=0 TEST_SOURCE=1 dev/release/verify-release-candidate.sh 16.0.0 0
> > > > TEST_DEFAULT=0 TEST_BINARIES=1 dev/release/verify-release-candidate.sh 16.0.0 0
> > > >
> > > > with:
> > > >   * Python 3.10.12
> > > >   * gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04)
> > > >   * NVIDIA CUDA cuda_11.5.r11.5/compiler.30672275_0
> > > >   * openjdk 17.0.10 2024-01-16
> > > >   * ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x86_64-linux-gnu]
> > > >   * 7.0.117
> > > >   * Ubuntu 22.04 LTS
> > > >
> > > > On Wed, Apr 17, 2024 at 16:33, Joris Van den Bossche wrote:
> > > > >
> > > > > +1 (binding)
> > > > >
> > > > > Tested source with conda on Ubuntu
> > > > >
> > > > > On Wed, 17 Apr 2024 at 16:28, Vibhatha Abeykoon <vibha...@gmail.com> wrote:
> > > > > >
> > > > > > I executed the following
> > > > > >
> > > > > > # Verifying C++
> > > > > >
> > > > > > ```bash
> > > > > > TEST_DEFAULT=0 TEST_CPP=1 ./verify-release-candidate.sh 16.0.0 0
> > > > > > ```
> > > > > >
> > > > > > # Verifying C++ and Python
> > > > > >
> > > > > > ```bash
> > > > > > TEST_DEFAULT=0 TEST_CPP=1 TEST_PYTHON=1 verify-release-candidate.sh 16.0.0 0
> > > > > > ```
> > > > > >
> > > > > > # Verifying C++ and Java
> > > > > >
> > > > > > ```bash
> > > > > > TEST_DEFAULT=0 TEST_INTEGRATION_CPP=1 TEST_INTEGRATION_JAVA=1 ./verify-release-candidate.sh 16.0.0 0
> > > > > > ```
> > > > > >
> > > > > > with:
> > > > > > * Python 3.10.12
> > > > > > * gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
> > > > > > * openjdk version "21.0.2" 2024-01-16
> > > > > > * Ubuntu 22.04.4 LTS
> > > > > >
> > > > > > Verifying C++, Python and Java
> > > > > >
> > > > > > +1 (non-binding)
> > > > > >
> > > > > > On Wed, Apr 17, 2024 at 6:16 PM David Li wrote:
> > > > > >
> > > > > > > +1
> > > > > > >
> > > > > > > tested sources on Debian 12, x86-64
> > > > > > >
> > > > > > > On Wed, Apr 17, 2024, at 18:14, Raúl Cumplido wrote:
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > Just a minor note, the binary verification for
> > > > > > > > verify-rc-binaries-wheels-windows faile

Re: AW: Personal feedback on your last release on Apache Arrow ADBC 0.11.0

2024-04-17 Thread Sutou Kouhei
Hi,

Sorry for the confusion I caused... My country (Japan) uses
the "${FAMILY_NAME} ${FIRST_NAME}" order for names. I found a
recommendation[1] from my country that "${FAMILY_NAME}
${FIRST_NAME}" is preferred to "${FIRST_NAME}
${FAMILY_NAME}" in English contexts too. So I switched to
the "${FAMILY_NAME} ${FIRST_NAME}" style a few years ago.

[1]
https://www.bunka.go.jp/kokugo_nihongo/sisaku/joho/joho/kakuki/22/tosin04/17.html

But I couldn't change the "Public name" field at
https://id.apache.org/ . So
https://people.apache.org/phonebook.html?uid=kou still uses
the "${FIRST_NAME} ${FAMILY_NAME}" style.


Should I use "+1 (binding)" instead of just "+1" to avoid
this confusion?


Thanks,
-- 
kou

In 
 

  "AW: Personal feedback on your last release on Apache Arrow ADBC 0.11.0" on 
Wed, 17 Apr 2024 08:44:10 +,
  Christofer Dutz  wrote:

> When looking at whimsy, I can’t see any person named Sutou Kouhei listed as 
> member of the Arrow PMC.
> 
> Cut that … I was looking for Sutou Kouhei, but it’s Kouhei Sutou … yeah … ok 
> … then please ignore my mumbling ;-)
> 
> And yeah … the result now also moved to the same page … guess it was sent out 
> a while after the Announce … guess that’s why I missed it.
> 
> Thanks for following up …
> 
> Chris
> 
> From: David Li
> Date: Wednesday, April 17, 2024 at 10:36
> To: Christofer Dutz, dev@arrow.apache.org
> Subject: Re: Personal feedback on your last release on Apache Arrow ADBC
> 0.11.0
> Hi Christofer,
> 
> Sutou Kouhei is part of the PMC.
> 
> Additionally, there is a result email: 
> https://lists.apache.org/thread/gb5k69pd3k6lnbzw978fm7ppx1p9cx15
> 
> On Wed, Apr 17, 2024, at 16:52, Christofer Dutz wrote:
>> Hi all,
>>
>> While reviewing your project's activity in the last quarter as part of
>> my preparation for today's board meeting, I came across your last vote
>> on Apache Arrow ADBC 0.11.0 RC0.
>>
>> Technically I count only 2 binding +1 votes:
>> - Matthew Topol
>> - Dewey Dunnington
>>
>> All others are not part of the PMC.
>>
>> I assume the Release Manager David implicitly counted himself as +1;
>> however, the concept of an implicit vote does not exist at Apache. If you
>> want to save sending an additional email, add something like "this
>> also counts as my +1 vote" to your email, or - even better - send an
>> explicit vote email.
>>
>> It would also be good to have a RESULT email containing the result of a vote.
>>
>> So right now we would need a third binding vote as soon as possible
>> (Possibly also for other votes, where we had the release manager
>> provide the missing third vote).
>>
>> Chris
>>
>> PS: Please keep me in CC as I'm not subscribed here.


Re: [VOTE] Release Apache Arrow 16.0.0 - RC0

2024-04-17 Thread Rok Mihevc
+1

I've successfully verified sources on Ubuntu 22.04:

TEST_DEFAULT=0 TEST_SOURCE=1 dev/release/verify-release-candidate.sh 16.0.0 0

Rok

On Wed, Apr 17, 2024 at 8:36 PM Raúl Cumplido 
wrote:

> Hi Dominik,
>
> I am sorry the announcement was missed. I did send an email one month
> ago [1] and shared the dates on Zulip and the Arrow community call. I
> probably should send an email once the feature freeze is about to be
> performed as a reminder and to give more visibility.
>
> At this point I would prefer to create a 16.0.1 or 16.1.0 release as
> the 16.0.0 RC0 is pretty stable (this took a lot of time to achieve :)
> ) and has already enough votes for it to be released. Would that be
> ok?
>
> Thanks,
> Raúl
>
> [1] https://lists.apache.org/thread/lhdxnk4j0rl3sbtswlyvkp2rq13539fg
>
>
> On Wed, Apr 17, 2024 at 19:20, Dominik Moritz wrote:
> >
> > I’m sorry that we missed the announcement for the release but there are a
> > few ArrowJS changes that we had marked for arrow 16 that are now in main. I
> > created a PR with those changes to make it easier to see:
> > https://github.com/apache/arrow/pull/41261. Can you add those to the RC1 if
> > not into RC0?
> >
> > On Apr 17, 2024 at 12:43:25, Ruoxi Sun  wrote:
> >
> > > +1 (non-binding)
> > >
> > > On my Intel Mac, OS version Sonoma 14.2.1 (23C71), verified cpp and go:
> > >
> > > TEST_DEFAULT=0 TEST_GO=1 TEST_CPP=1 ./verify-release-candidate.sh 16.0.0 0
> > >
> > > I also tried to verify python:
> > >
> > > TEST_DEFAULT=0 TEST_PYTHON=1 ./verify-release-candidate.sh 16.0.0 0
> > >
> > > It succeeded except for [1] (as for several previous versions), which
> > > should be trivial.
> > >
> > > [1] https://github.com/apache/arrow/issues/39679
> > >
> > > *Regards,*
> > > *Rossi SUN*
> > >
> > >
> > > Raúl Cumplido wrote on Thu, Apr 18, 2024 at 00:33:
> > >
> > > +1 (binding)
> > >
> > > I've successfully verified sources and binaries:
> > >
> > > TEST_DEFAULT=0 TEST_SOURCE=1 dev/release/verify-release-candidate.sh 16.0.0 0
> > > TEST_DEFAULT=0 TEST_BINARIES=1 dev/release/verify-release-candidate.sh 16.0.0 0
> > >
> > > with:
> > >   * Python 3.10.12
> > >   * gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04)
> > >   * NVIDIA CUDA cuda_11.5.r11.5/compiler.30672275_0
> > >   * openjdk 17.0.10 2024-01-16
> > >   * ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x86_64-linux-gnu]
> > >   * 7.0.117
> > >   * Ubuntu 22.04 LTS
> > >
> > > On Wed, Apr 17, 2024 at 16:33, Joris Van den Bossche wrote:
> > > >
> > > > +1 (binding)
> > > >
> > > > Tested source with conda on Ubuntu
> > > >
> > > > On Wed, 17 Apr 2024 at 16:28, Vibhatha Abeykoon wrote:
> > > > >
> > > > > I executed the following
> > > > >
> > > > > # Verifying C++
> > > > >
> > > > > ```bash
> > > > > TEST_DEFAULT=0 TEST_CPP=1 ./verify-release-candidate.sh 16.0.0 0
> > > > > ```
> > > > >
> > > > > # Verifying C++ and Python
> > > > >
> > > > > ```bash
> > > > > TEST_DEFAULT=0 TEST_CPP=1 TEST_PYTHON=1 verify-release-candidate.sh 16.0.0 0
> > > > > ```
> > > > >
> > > > > # Verifying C++ and Java
> > > > >
> > > > > ```bash
> > > > > TEST_DEFAULT=0 TEST_INTEGRATION_CPP=1 TEST_INTEGRATION_JAVA=1 ./verify-release-candidate.sh 16.0.0 0
> > > > > ```
> > > > >
> > > > > with:
> > > > > * Python 3.10.12
> > > > > * gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
> > > > > * openjdk version "21.0.2" 2024-01-16
> > > > > * Ubuntu 22.04.4 LTS
> > > > >
> > > > > Verifying C++, Python and Java
> > > > >
> > > > > +1 (non-binding)
> > > > >
> > > > > On Wed, Apr 17, 2024 at 6:16 PM David Li wrote:
> > > > >
> > > > > > +1
> > > > > >
> > > > > > tested sources on Debian 12, x86-64
> > > > > >
> > > > > > On Wed, Apr 17, 2024, at 18:14, Raúl Cumplido wrote:
> > > > > > > Hi,
> > > > > > >
> > > > > > > Just a minor note, the binary verification for
> > > > > > > verify-rc-binaries-wheels-windows failed with [1].
> > > > > > > This can be avoided by implementing the solution proposed in this
> > > > > > > comment by Kou [2]. See more details there.
> > > > > > >
> > > > > > > As shared in the comment we don't think this is a blocker as it just
> > > > > > > requires setting TZDIR and downloading the IANA database for the ORC test
> > > > > > > to pass on Windows.
> > > > > > >
> > > > > > > Kind regards,
> > > > > > > Raúl
> > > > > > >
> > > > > > > [1]
> > > > > > > https://github.com/ursacomputing/crossbow/actions/runs/871526299

Re: [VOTE] Release Apache Arrow 16.0.0 - RC0

2024-04-17 Thread Raúl Cumplido
Hi Dominik,

I am sorry the announcement was missed. I did send an email one month
ago [1] and shared the dates on Zulip and the Arrow community call. I
probably should send an email once the feature freeze is about to be
performed as a reminder and to give more visibility.

At this point I would prefer to create a 16.0.1 or 16.1.0 release as
the 16.0.0 RC0 is pretty stable (this took a lot of time to achieve :)
) and has already enough votes for it to be released. Would that be
ok?

Thanks,
Raúl

[1] https://lists.apache.org/thread/lhdxnk4j0rl3sbtswlyvkp2rq13539fg


On Wed, Apr 17, 2024 at 19:20, Dominik Moritz wrote:
>
> I’m sorry that we missed the announcement for the release but there are a
> few ArrowJS changes that we had marked for arrow 16 that are now in main. I
> created a PR with those changes to make it easier to see:
> https://github.com/apache/arrow/pull/41261. Can you add those to the RC1 if
> not into RC0?
>
> On Apr 17, 2024 at 12:43:25, Ruoxi Sun  wrote:
>
> > +1 (non-binding)
> >
> > On my Intel Mac, OS version Sonoma 14.2.1 (23C71), verified cpp and go:
> >
> > TEST_DEFAULT=0 TEST_GO=1 TEST_CPP=1 ./verify-release-candidate.sh 16.0.0 0
> >
> > I also tried to verify python:
> >
> > TEST_DEFAULT=0 TEST_PYTHON=1 ./verify-release-candidate.sh 16.0.0 0
> >
> > It succeeded except for [1] (as for several previous versions), which
> > should be trivial.
> >
> > [1] https://github.com/apache/arrow/issues/39679
> >
> > *Regards,*
> > *Rossi SUN*

Re: [VOTE] Release Apache Arrow 16.0.0 - RC0

2024-04-17 Thread Dominik Moritz
I’m sorry that we missed the announcement for the release but there are a
few ArrowJS changes that we had marked for arrow 16 that are now in main. I
created a PR with those changes to make it easier to see:
https://github.com/apache/arrow/pull/41261. Can you add those to the RC1 if
not into RC0?

On Apr 17, 2024 at 12:43:25, Ruoxi Sun  wrote:

> +1 (non-binding)
>
> On my Intel Mac, OS version Sonoma 14.2.1 (23C71), verified cpp and go:
>
> TEST_DEFAULT=0 TEST_GO=1 TEST_CPP=1 ./verify-release-candidate.sh 16.0.0 0
>
> I also tried to verify python:
>
> TEST_DEFAULT=0 TEST_PYTHON=1 ./verify-release-candidate.sh 16.0.0 0
>
> It succeeded except for [1] (as for several previous versions), which
> should be trivial.
>
> [1] https://github.com/apache/arrow/issues/39679
>
> *Regards,*
> *Rossi SUN*
>
Re: [VOTE] Release Apache Arrow 16.0.0 - RC0

2024-04-17 Thread Ruoxi Sun
+1 (non-binding)

On my Intel Mac, OS version Sonoma 14.2.1 (23C71), verified cpp and go:

TEST_DEFAULT=0 TEST_GO=1 TEST_CPP=1 ./verify-release-candidate.sh 16.0.0 0

I also tried to verify python:

TEST_DEFAULT=0 TEST_PYTHON=1 ./verify-release-candidate.sh 16.0.0 0

It succeeded except for [1] (as for several previous versions), which
should be trivial.

[1] https://github.com/apache/arrow/issues/39679

*Regards,*
*Rossi SUN*


On Thu, 18 Apr 2024 at 00:33, Raúl Cumplido wrote:

> +1 (binding)
>
> I've successfully verified sources and binaries:
>
> TEST_DEFAULT=0 TEST_SOURCE=1 dev/release/verify-release-candidate.sh 16.0.0 0
> TEST_DEFAULT=0 TEST_BINARIES=1 dev/release/verify-release-candidate.sh 16.0.0 0
>
> with:
>   * Python 3.10.12
>   * gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04)
>   * NVIDIA CUDA cuda_11.5.r11.5/compiler.30672275_0
>   * openjdk 17.0.10 2024-01-16
>   * ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x86_64-linux-gnu]
>   * 7.0.117
>   * Ubuntu 22.04 LTS

Re: [VOTE] Release Apache Arrow 16.0.0 - RC0

2024-04-17 Thread Raúl Cumplido
+1 (binding)

I've successfully verified sources and binaries:

TEST_DEFAULT=0 TEST_SOURCE=1 dev/release/verify-release-candidate.sh 16.0.0 0
TEST_DEFAULT=0 TEST_BINARIES=1 dev/release/verify-release-candidate.sh 16.0.0 0

with:
  * Python 3.10.12
  * gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04)
  * NVIDIA CUDA cuda_11.5.r11.5/compiler.30672275_0
  * openjdk 17.0.10 2024-01-16
  * ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x86_64-linux-gnu]
  * 7.0.117
  * Ubuntu 22.04 LTS
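
An environment list like the one above can be collected with a small helper so that every entry carries a label. This is only a minimal sketch: the function name `report_version` and the choice of tools are assumptions for illustration, not part of the Arrow release tooling.

```bash
# Print "* <label>: <first line of version output>", or "not found" when the
# tool is missing. The label text and the tool list below are assumptions.
report_version() {
  label=$1
  shift
  if command -v "$1" >/dev/null 2>&1; then
    printf '* %s: %s\n' "$label" "$("$@" 2>&1 | head -n 1)"
  else
    printf '* %s: not found\n' "$label"
  fi
}

report_version "Python" python3 --version
report_version "gcc" gcc --version
report_version "Java" java -version
report_version "Ruby" ruby --version
```

Missing tools are reported rather than treated as errors, so the output can be pasted into a vote email as-is.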

On Wed, 17 Apr 2024 at 16:33, Joris Van den Bossche wrote:
>
> +1 (binding)
>
> Tested source with conda on Ubuntu
>
> On Wed, 17 Apr 2024 at 16:28, Vibhatha Abeykoon  wrote:
> >
> > I executed the following
> >
> > # Verifying C++
> >
> > ```bash
> > TEST_DEFAULT=0 TEST_CPP=1 ./verify-release-candidate.sh 16.0.0 0
> > ```
> >
> > # Verifying C++ and Python
> >
> > ```bash
> > TEST_DEFAULT=0 TEST_CPP=1 TEST_PYTHON=1 ./verify-release-candidate.sh 16.0.0 0
> > ```
> >
> > # Verifying C++ and Java
> >
> > ```bash
> > TEST_DEFAULT=0 TEST_INTEGRATION_CPP=1 TEST_INTEGRATION_JAVA=1 \
> >   ./verify-release-candidate.sh 16.0.0 0
> > ```
> >
> > with:
> > * Python 3.10.12
> > * gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
> > * openjdk version "21.0.2" 2024-01-16
> > * Ubuntu 22.04.4 LTS
> >
> > Verifying C++, Python and Java
> >
> > +1 (non-binding)
> >
> >
> > On Wed, Apr 17, 2024 at 6:16 PM David Li  wrote:
> >
> > > +1
> > >
> > > tested sources on Debian 12, x86-64
> > >
> > > On Wed, Apr 17, 2024, at 18:14, Raúl Cumplido wrote:
> > > > Hi,
> > > >
> > > > Just a minor note, the binary verification for
> > > > verify-rc-binaries-wheels-windows failed with [1].
> > > > This can be avoided by implementing the solution proposed in this
> > > > comment by Kou [2]. See more details there.
> > > >
> > > > As shared in the comment, we don't think this is a blocker, as it just
> > > > requires setting TZDIR and downloading the IANA database for the ORC
> > > > test to pass on Windows.
> > > >
> > > > Kind regards,
> > > > Raúl
> > > >
> > > > [1] https://github.com/ursacomputing/crossbow/actions/runs/8715262993/job/23907626092#step:6:5681
> > > > [2] https://github.com/apache/arrow/pull/41235#issuecomment-2060264968
> > > >
> > > > On Wed, 17 Apr 2024 at 11:01, Raúl Cumplido wrote:
> > > >>
> > > >> Hi,
> > > >>
> > > >> I would like to propose the following release candidate (RC0) of Apache
> > > >> Arrow version 16.0.0. This is a release consisting of 378
> > > >> resolved GitHub issues[1].
> > > >>
> > > >> This release candidate is based on commit:
> > > >> 6a28035c2b49b432dc63f5ee7524d76b4ed2d762 [2]
> > > >>
> > > >> The source release rc0 is hosted at [3].
> > > >> The binary artifacts are hosted at [4][5][6][7][8][9][10][11].
> > > >> The changelog is located at [12].
> > > >>
> > > >> Please download, verify checksums and signatures, run the unit tests,
> > > >> and vote on the release. See [13] for how to validate a release candidate.
> > > >>
> > > >> See also a verification result on GitHub pull request [14].
> > > >>
> > > >> The vote will be open for at least 72 hours.
> > > >>
> > > >> [ ] +1 Release this as Apache Arrow 16.0.0
> > > >> [ ] +0
> > > >> [ ] -1 Do not release this as Apache Arrow 16.0.0 because...
> > > >>
> > > >> [1]: https://github.com/apache/arrow/issues?q=is%3Aissue+milestone%3A16.0.0+is%3Aclosed
> > > >> [2]: https://github.com/apache/arrow/tree/6a28035c2b49b432dc63f5ee7524d76b4ed2d762
> > > >> [3]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-16.0.0-rc0
> > > >> [4]: https://apache.jfrog.io/artifactory/arrow/almalinux-rc/
> > > >> [5]: https://apache.jfrog.io/artifactory/arrow/amazon-linux-rc/
> > > >> [6]: https://apache.jfrog.io/artifactory/arrow/centos-rc/
> > > >> [7]: https://apache.jfrog.io/artifactory/arrow/debian-rc/
> > > >> [8]: https://apache.jfrog.io/artifactory/arrow/java-rc/16.0.0-rc0
> > > >> [9]: https://apache.jfrog.io/artifactory/arrow/nuget-rc/16.0.0-rc0
> > > >> [10]: https://apache.jfrog.io/artifactory/arrow/python-rc/16.0.0-rc0
> > > >> [11]: https://apache.jfrog.io/artifactory/arrow/ubuntu-rc/
> > > >> [12]: https://github.com/apache/arrow/blob/6a28035c2b49b432dc63f5ee7524d76b4ed2d762/CHANGELOG.md
> > > >> [13]: https://arrow.apache.org/docs/developers/release_verification.html
> > > >> [14]: https://github.com/apache/arrow/pull/41235
> > >
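
The vote email quoted above asks voters to verify checksums and signatures. A minimal sketch of those two checks by hand follows, assuming GNU coreutils `sha256sum` and `gpg` are installed and the signer's key has already been imported. The function names and the artifact file names in the usage comment are assumptions for illustration; the thread itself relies on verify-release-candidate.sh.

```bash
# Compare a file's SHA-256 digest with the first field of its checksum file.
# Returns 0 on match, non-zero otherwise.
verify_sha256() {
  artifact=$1
  checksum_file=$2
  expected=$(cut -d ' ' -f 1 "$checksum_file")
  actual=$(sha256sum "$artifact" | cut -d ' ' -f 1)
  [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Check a detached signature; assumes the signer's key is already imported.
verify_signature() {
  artifact=$1
  signature_file=$2
  gpg --verify "$signature_file" "$artifact"
}

# Example usage (file names are assumptions, not taken from the thread):
#   verify_sha256 apache-arrow-16.0.0.tar.gz apache-arrow-16.0.0.tar.gz.sha256
#   verify_signature apache-arrow-16.0.0.tar.gz apache-arrow-16.0.0.tar.gz.asc
```

Both functions only report success or failure through their exit status, so they can be chained with `&&` before running any tests.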


Re: Unsupported/Other Type

2024-04-17 Thread David Li
Yes, this would be for an extension type. 

On Wed, Apr 17, 2024, at 23:25, Weston Pace wrote:
>> people generally find use in Arrow schemas independently of concrete data.
>
> This makes sense.  I think we do want to encourage use of Arrow as a "type
> system" even if there is no data involved.  And, given that we cannot
> easily change a field's data type property to "optional" it makes sense to
> use a dedicated type and so I would be in favor of such a proposal (we
> may eventually add an "unknown type" concept in Substrait as well, it's
> come up several times, and so we could use this in that context).
>
> I think that I would still prefer a canonical extension type (with storage
> type null) over a new dedicated type.

Re: Unsupported/Other Type

2024-04-17 Thread Antoine Pitrou



I think this should be:
- a canonical extension type
- with a parameter unambiguously identifying the type for applications 
supporting it (such as "org.postgres.pg_lsn")
- with storage type left for each implementation to decide, but with a 
recommendation to use either 1) binary, 2) fixed-size-binary or 3) null.


Regards

Antoine.


On 17/04/2024 at 16:25, Weston Pace wrote:
>> people generally find use in Arrow schemas independently of concrete data.
>
> This makes sense.  I think we do want to encourage use of Arrow as a "type
> system" even if there is no data involved.  And, given that we cannot
> easily change a field's data type property to "optional" it makes sense to
> use a dedicated type and so I would be in favor of such a proposal (we
> may eventually add an "unknown type" concept in Substrait as well, it's
> come up several times, and so we could use this in that context).
>
> I think that I would still prefer a canonical extension type (with storage
> type null) over a new dedicated type.
>
> On Wed, Apr 17, 2024 at 5:39 AM Antoine Pitrou wrote:
>
>>
>> Ah! Well, I think this could be an interesting proposal, but someone
>> should put together a more formal proposal, perhaps as a draft PR.
>>
>> Regards
>>
>> Antoine.
>>
>>
>> On 17/04/2024 at 11:57, David Li wrote:
>> > For an unsupported/other extension type.
>> >
>> > On Wed, Apr 17, 2024, at 18:32, Antoine Pitrou wrote:
>> >> What is "this proposal"?
>> >>
>> >>
>> >> On 17/04/2024 at 10:38, David Li wrote:
>> >>> Should I take it that this proposal is dead in the water? While we
>> >>> could define our own Unknown/Other type for, say, the ADBC PostgreSQL
>> >>> driver, it might be useful to have a singular type for consumers to
>> >>> latch on to.
>> >>>
>> >>> On Fri, Apr 12, 2024, at 07:32, David Li wrote:
>> >>>> I think an "Other" extension type is slightly different than an
>> >>>> arbitrary extension type, though: the latter may be understood
>> >>>> downstream but the former represents a point at which a component
>> >>>> explicitly declares it does not know how to handle a field. In this
>> >>>> example, the PostgreSQL ADBC driver might be able to provide a
>> >>>> representation regardless, but a different driver (or say, the JDBC
>> >>>> adapter, which cannot necessarily get a bytestring for an arbitrary
>> >>>> JDBC type) may want an Other type to signal that it would fail if
>> >>>> asked to provide particular columns.
>> >>>>
>> >>>> On Fri, Apr 12, 2024, at 02:30, Dewey Dunnington wrote:
>> >>>> > Depending where your Arrow-encoded data is used, either extension
>> >>>> > types or generic field metadata are options. We have this problem in
>> >>>> > the ADBC Postgres driver, where we can convert *most* Postgres types
>> >>>> > to an Arrow type but there are some others where we can't or don't
>> >>>> > know or don't implement a conversion. Currently for these we return
>> >>>> > opaque binary (the Postgres COPY representation of the value) but put
>> >>>> > field metadata so that a consumer can implement a workaround for an
>> >>>> > unsupported type. It would be arguably better to have implemented
>> >>>> > this as an extension type; however, field metadata felt like less of
>> >>>> > a commitment when I first worked on this.
>> >>>> >
>> >>>> > Cheers,
>> >>>> >
>> >>>> > -dewey
>> >>>> >
>> >>>> > On Thu, Apr 11, 2024 at 1:20 PM Norman Jordan wrote:
>> >>>> >>
>> >>>> >> I was using UUID as an example. It looks like extension types
>> >>>> >> cover my original request.
>> >>>> >>
>> >>>> >> From: Felipe Oliveira Carvalho
>> >>>> >> Sent: Thursday, April 11, 2024 7:15 AM
>> >>>> >> To: dev@arrow.apache.org
>> >>>> >> Subject: Re: Unsupported/Other Type
>> >>>> >>
>> >>>> >> The OP used UUID as an example. Would that be enough or the request
>> >>>> >> is for a flexible mechanism that allows the creation of one-off
>> >>>> >> nominal types for very specific use-cases?
>> >>>> >>
>> >>>> >> --
>> >>>> >> Felipe
>> >>>> >>
>> >>>> >> On Thu, 11 Apr 2024 at 05:06 Antoine Pitrou wrote:
>> >>>> >>
>> >>>> >>>
>> >>>> >>> Yes, JSON and UUID are obvious candidates for new canonical
>> >>>> >>> extension types. XML also comes to mind, but I'm not sure there's
>> >>>> >>> much of a use case for it.
>> >>>> >>>
>> >>>> >>> Regards
>> >>>> >>>
>> >>>> >>> Antoine.
>> >>>> >>>
>> >>>> >>>
>> >>>> >>> On 10/04/2024 at 22:55, Wes McKinney wrote:
>> >>>> >>>> In the past we have discussed adding a canonical type for UUID
>> >>>> >>>> and JSON. I still think this is a good idea and could improve
>> >>>> >>>> ergonomics in downstream language bindings (e.g. by exposing JSON
>> >>>> >>>> querying functions or automatically boxing UUIDs in built-in UUID
>> >>>> >>>> types, like the Python uuid library). Has anyone done any work on
>> >>>> >>>> this to anyone's knowledge?
>> >>>> >>>>
>> >>>> >>>> On Wed, Apr 10, 2024 at 3:05 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
>> >>>> >>>>
>> >>>> >>>> > Hi Norman,
>> >>>> >>>> > Arrow has a concept of extension types [1] along with the
>> >>>> >>>> > possibility of proposing new canonical extension types [2].  This
>> >>>> >>>> > seems to cover the use-cases you mention but I might be
>> >>>> >>>> > misunderstanding?
>> >>>> >>>> >
>> >>>> >>>> > Thanks,
>> >>>> >>>> > Micah
>> >>>> >>>> >
>> >>>> >>>> > [1] https://arrow.apache.org/docs/format/Columnar.html#format-metadata-extension-types
>> >>>> >>>> > [2] https://arrow.apache.org/docs/format/CanonicalExtensions.html
>> >>>> >>>> >
>> >>>> >>>> >
>> >>>> >>>> > On Wed, Apr 10, 2024 at 11:44 AM Norman Jordan wrote:
>> >>>> >>>> >
>> >>>> >>>> >> Problem Description
>> >>>> >>>> >>
>> >>>> >>>> >> Currently Arrow schemas can only contain columns of types
>> >>>> >>>> >> supported by Arrow. In some cases an Arrow schema maps to an
>> >>>> >>>> >> external schema. This can result in the Arrow schema not being
>> >>>> >>>> >> able to support all the columns from the external schema.
>> >>>> >>>> >>
>> >>>> >>>> >> Consider an external system that contains a column of type
Re: [VOTE] Release Apache Arrow 16.0.0 - RC0

2024-04-17 Thread Joris Van den Bossche
+1 (binding)

Tested source with conda on Ubuntu

On Wed, 17 Apr 2024 at 16:28, Vibhatha Abeykoon  wrote:
>
> I executed the following
>
> # Verifying C++
>
> ```bash
> TEST_DEFAULT=0 TEST_CPP=1 ./verify-release-candidate.sh 16.0.0 0
> ```
>
> # Verifying C++ and Python
>
> ```bash
> TEST_DEFAULT=0 TEST_CPP=1 TEST_PYTHON=1 ./verify-release-candidate.sh 16.0.0 0
> ```
>
> # Verifying C++ and Java
>
> ```bash
> TEST_DEFAULT=0 TEST_INTEGRATION_CPP=1 TEST_INTEGRATION_JAVA=1 \
>   ./verify-release-candidate.sh 16.0.0 0
> ```
>
> with:
> * Python 3.10.12
> * gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
> * openjdk version "21.0.2" 2024-01-16
> * Ubuntu 22.04.4 LTS
>
> Verifying C++, Python and Java
>
> +1 (non-binding)
>
>
> On Wed, Apr 17, 2024 at 6:16 PM David Li  wrote:
>
> > +1
> >
> > tested sources on Debian 12, x86-64
> >
> > On Wed, Apr 17, 2024, at 18:14, Raúl Cumplido wrote:
> > > Hi,
> > >
> > > Just a minor note, the binary verification for
> > > verify-rc-binaries-wheels-windows failed with [1].
> > > This can be avoided by implementing the solution proposed in this
> > > comment by Kou [2]. See more details there.
> > >
> > > As shared in the comment, we don't think this is a blocker, as it just
> > > requires setting TZDIR and downloading the IANA database for the ORC
> > > test to pass on Windows.
> > >
> > > Kind regards,
> > > Raúl
> > >
> > > [1] https://github.com/ursacomputing/crossbow/actions/runs/8715262993/job/23907626092#step:6:5681
> > > [2] https://github.com/apache/arrow/pull/41235#issuecomment-2060264968
> > >
> > > On Wed, 17 Apr 2024 at 11:01, Raúl Cumplido wrote:
> > >>
> > >> Hi,
> > >>
> > >> I would like to propose the following release candidate (RC0) of Apache
> > >> Arrow version 16.0.0. This is a release consisting of 378
> > >> resolved GitHub issues[1].
> > >>
> > >> This release candidate is based on commit:
> > >> 6a28035c2b49b432dc63f5ee7524d76b4ed2d762 [2]
> > >>
> > >> The source release rc0 is hosted at [3].
> > >> The binary artifacts are hosted at [4][5][6][7][8][9][10][11].
> > >> The changelog is located at [12].
> > >>
> > >> Please download, verify checksums and signatures, run the unit tests,
> > >> and vote on the release. See [13] for how to validate a release candidate.
> > >>
> > >> See also a verification result on GitHub pull request [14].
> > >>
> > >> The vote will be open for at least 72 hours.
> > >>
> > >> [ ] +1 Release this as Apache Arrow 16.0.0
> > >> [ ] +0
> > >> [ ] -1 Do not release this as Apache Arrow 16.0.0 because...
> > >>
> > >> [1]: https://github.com/apache/arrow/issues?q=is%3Aissue+milestone%3A16.0.0+is%3Aclosed
> > >> [2]: https://github.com/apache/arrow/tree/6a28035c2b49b432dc63f5ee7524d76b4ed2d762
> > >> [3]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-16.0.0-rc0
> > >> [4]: https://apache.jfrog.io/artifactory/arrow/almalinux-rc/
> > >> [5]: https://apache.jfrog.io/artifactory/arrow/amazon-linux-rc/
> > >> [6]: https://apache.jfrog.io/artifactory/arrow/centos-rc/
> > >> [7]: https://apache.jfrog.io/artifactory/arrow/debian-rc/
> > >> [8]: https://apache.jfrog.io/artifactory/arrow/java-rc/16.0.0-rc0
> > >> [9]: https://apache.jfrog.io/artifactory/arrow/nuget-rc/16.0.0-rc0
> > >> [10]: https://apache.jfrog.io/artifactory/arrow/python-rc/16.0.0-rc0
> > >> [11]: https://apache.jfrog.io/artifactory/arrow/ubuntu-rc/
> > >> [12]: https://github.com/apache/arrow/blob/6a28035c2b49b432dc63f5ee7524d76b4ed2d762/CHANGELOG.md
> > >> [13]: https://arrow.apache.org/docs/developers/release_verification.html
> > >> [14]: https://github.com/apache/arrow/pull/41235
> >


Re: [VOTE] Release Apache Arrow 16.0.0 - RC0

2024-04-17 Thread Vibhatha Abeykoon
I executed the following

# Verifying C++

```bash
TEST_DEFAULT=0 TEST_CPP=1 ./verify-release-candidate.sh 16.0.0 0
```

# Verifying C++ and Python

```bash
TEST_DEFAULT=0 TEST_CPP=1 TEST_PYTHON=1 ./verify-release-candidate.sh 16.0.0 0
```

# Verifying C++ and Java

```bash
TEST_DEFAULT=0 TEST_INTEGRATION_CPP=1 TEST_INTEGRATION_JAVA=1 \
  ./verify-release-candidate.sh 16.0.0 0
```
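
A note for anyone copying a command with several TEST_* flags that has been wrapped across two lines: the assignments only reach the script when they are on the same logical line as the command. A minimal sketch of the difference, using a stand-in `sh -c` child process instead of the real verify-release-candidate.sh (the variable name `child` is hypothetical):

```bash
# Stand-in for the verification script: prints the flags the child process sees.
child='echo "cpp=${TEST_INTEGRATION_CPP:-unset} java=${TEST_INTEGRATION_JAVA:-unset}"'

# Backslash continuation: the assignments stay attached to the command.
joined=$(TEST_INTEGRATION_CPP=1 TEST_INTEGRATION_JAVA=1 \
  sh -c "$child")

# No continuation: the first line only sets unexported shell variables,
# so the child process started on the next line does not inherit them.
split=$(
  TEST_INTEGRATION_CPP=1 TEST_INTEGRATION_JAVA=1
  sh -c "$child"
)

echo "joined: $joined"   # joined: cpp=1 java=1
echo "split:  $split"    # split:  cpp=unset java=unset
```

This assumes the two variables are not already exported in the calling shell; if they are, the second form would pass them through as well.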

with:
* Python 3.10.12
* gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
* openjdk version "21.0.2" 2024-01-16
* Ubuntu 22.04.4 LTS

Verifying C++, Python and Java

+1 (non-binding)


On Wed, Apr 17, 2024 at 6:16 PM David Li  wrote:

> +1
>
> tested sources on Debian 12, x86-64
>
> On Wed, Apr 17, 2024, at 18:14, Raúl Cumplido wrote:
> > Hi,
> >
> > Just a minor note, the binary verification for
> > verify-rc-binaries-wheels-windows failed with [1].
> > This can be avoided by implementing the solution proposed in this
> > comment by Kou [2]. See more details there.
> >
> > As shared in the comment we don't think this is a blocker as it just
> > requires to set TZDIR and download the IANA database for the ORC test
> > to pass on Windows.
> >
> > Kind regards,
> > Raúl
> >
> > [1] https://github.com/ursacomputing/crossbow/actions/runs/8715262993/job/23907626092#step:6:5681
> > [2] https://github.com/apache/arrow/pull/41235#issuecomment-2060264968
> >
> > On Wed, 17 Apr 2024 at 11:01, Raúl Cumplido wrote:
> >>
> >> Hi,
> >>
> >> I would like to propose the following release candidate (RC0) of Apache
> >> Arrow version 16.0.0. This is a release consisting of 378
> >> resolved GitHub issues[1].
> >>
> >> This release candidate is based on commit:
> >> 6a28035c2b49b432dc63f5ee7524d76b4ed2d762 [2]
> >>
> >> The source release rc0 is hosted at [3].
> >> The binary artifacts are hosted at [4][5][6][7][8][9][10][11].
> >> The changelog is located at [12].
> >>
> >> Please download, verify checksums and signatures, run the unit tests,
> >> and vote on the release. See [13] for how to validate a release
> >> candidate.
> >>
> >> See also a verification result on GitHub pull request [14].
> >>
> >> The vote will be open for at least 72 hours.
> >>
> >> [ ] +1 Release this as Apache Arrow 16.0.0
> >> [ ] +0
> >> [ ] -1 Do not release this as Apache Arrow 16.0.0 because...
> >>
> >> [1]: https://github.com/apache/arrow/issues?q=is%3Aissue+milestone%3A16.0.0+is%3Aclosed
> >> [2]: https://github.com/apache/arrow/tree/6a28035c2b49b432dc63f5ee7524d76b4ed2d762
> >> [3]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-16.0.0-rc0
> >> [4]: https://apache.jfrog.io/artifactory/arrow/almalinux-rc/
> >> [5]: https://apache.jfrog.io/artifactory/arrow/amazon-linux-rc/
> >> [6]: https://apache.jfrog.io/artifactory/arrow/centos-rc/
> >> [7]: https://apache.jfrog.io/artifactory/arrow/debian-rc/
> >> [8]: https://apache.jfrog.io/artifactory/arrow/java-rc/16.0.0-rc0
> >> [9]: https://apache.jfrog.io/artifactory/arrow/nuget-rc/16.0.0-rc0
> >> [10]: https://apache.jfrog.io/artifactory/arrow/python-rc/16.0.0-rc0
> >> [11]: https://apache.jfrog.io/artifactory/arrow/ubuntu-rc/
> >> [12]: https://github.com/apache/arrow/blob/6a28035c2b49b432dc63f5ee7524d76b4ed2d762/CHANGELOG.md
> >> [13]: https://arrow.apache.org/docs/developers/release_verification.html
> >> [14]: https://github.com/apache/arrow/pull/41235
>


Re: Unsupported/Other Type

2024-04-17 Thread Weston Pace
> people generally find use in Arrow schemas independently of concrete data.

This makes sense.  I think we do want to encourage use of Arrow as a "type
system" even if there is no data involved.  And, given that we cannot
easily change a field's data type property to "optional" it makes sense to
use a dedicated type and so I would be in favor of such a proposal (we
may eventually add an "unknown type" concept in Substrait as well, it's
come up several times, and so we could use this in that context).

I think that I would still prefer a canonical extension type (with storage
type null) over a new dedicated type.

On Wed, Apr 17, 2024 at 5:39 AM Antoine Pitrou  wrote:

>
> Ah! Well, I think this could be an interesting proposal, but someone
> should put a more formal proposal, perhaps as a draft PR.
>
> Regards
>
> Antoine.
>
>
> On 17/04/2024 at 11:57, David Li wrote:
> > For an unsupported/other extension type.
> >
> > On Wed, Apr 17, 2024, at 18:32, Antoine Pitrou wrote:
> >> What is "this proposal"?
> >>
> >>
> >> On 17/04/2024 at 10:38, David Li wrote:
> >>> Should I take it that this proposal is dead in the water? While we
> could define our own Unknown/Other type for say the ADBC PostgreSQL driver
> it might be useful to have a singular type for consumers to latch on to.
> >>>
> >>> On Fri, Apr 12, 2024, at 07:32, David Li wrote:
>  I think an "Other" extension type is slightly different than an
>  arbitrary extension type, though: the latter may be understood
>  downstream but the former represents a point at which a component
>  explicitly declares it does not know how to handle a field. In this
>  example, the PostgreSQL ADBC driver might be able to provide a
>  representation regardless, but a different driver (or say, the JDBC
>  adapter, which cannot necessarily get a bytestring for an arbitrary
>  JDBC type) may want an Other type to signal that it would fail if
> asked
>  to provide particular columns.
> 
>  On Fri, Apr 12, 2024, at 02:30, Dewey Dunnington wrote:
> > Depending where your Arrow-encoded data is used, either extension
> > types or generic field metadata are options. We have this problem in
> > the ADBC Postgres driver, where we can convert *most* Postgres types
> > to an Arrow type but there are some others where we can't or don't
> > know or don't implement a conversion. Currently for these we return
> > opaque binary (the Postgres COPY representation of the value) but put
> > field metadata so that a consumer can implement a workaround for an
> > unsupported type. It would be arguably better to have implemented
> this
> > as an extension type; however, field metadata felt like less of a
> > commitment when I first worked on this.
> >
> > Cheers,
> >
> > -dewey
> >
> > On Thu, Apr 11, 2024 at 1:20 PM Norman Jordan
> >  wrote:
> >>
> >> I was using UUID as an example. It looks like extension types
> covers my original request.
> >> 
> >> From: Felipe Oliveira Carvalho 
> >> Sent: Thursday, April 11, 2024 7:15 AM
> >> To: dev@arrow.apache.org 
> >> Subject: Re: Unsupported/Other Type
> >>
> >> The OP used UUID as an example. Would that be enough or the request
> is for
> >> a flexible mechanism that allows the creation of one-off nominal
> types for
> >> very specific use-cases?
> >>
> >> —
> >> Felipe
> >>
> >> On Thu, 11 Apr 2024 at 05:06 Antoine Pitrou 
> wrote:
> >>
> >>>
> >>> Yes, JSON and UUID are obvious candidates for new canonical
> extension
> >>> types. XML also comes to mind, but I'm not sure there's much of a
> use
> >>> case for it.
> >>>
> >>> Regards
> >>>
> >>> Antoine.
> >>>
> >>>
> >>> On 10/04/2024 at 22:55, Wes McKinney wrote:
>  In the past we have discussed adding a canonical type for UUID
> and JSON.
> >>> I
>  still think this is a good idea and could improve ergonomics in
> >>> downstream
>  language bindings (e.g. by exposing JSON querying function or
> >>> automatically
>  boxing UUIDs in built-in UUID types, like the Python uuid
> library). Has
>  anyone done any work on this to anyone's knowledge?
> 
>  On Wed, Apr 10, 2024 at 3:05 PM Micah Kornfield <
> emkornfi...@gmail.com>
>  wrote:
> 
> > Hi Norman,
> > Arrow has a concept of extension types [1] along with the
> possibility of
> > proposing new canonical extension types [2].  This seems to
> cover the
> > use-cases you mention but I might be misunderstanding?
> >
> > Thanks,
> > Micah
> >
> > [1]
> >
> >
> >>>
> https://arrow.apache.org/docs/format/Columnar.html#format-metadata-extension-types
> > [2]
> https://arrow.apache.org/docs/format/CanonicalExten

Re: [VOTE] Release Apache Arrow 16.0.0 - RC0

2024-04-17 Thread David Li
+1

tested sources on Debian 12, x86-64

On Wed, Apr 17, 2024, at 18:14, Raúl Cumplido wrote:
> Hi,
>
> Just a minor note, the binary verification for
> verify-rc-binaries-wheels-windows failed with [1].
> This can be avoided by implementing the solution proposed in this
> comment by Kou [2]. See more details there.
>
> As shared in the comment we don't think this is a blocker as it just
> requires to set TZDIR and download the IANA database for the ORC test
> to pass on Windows.
>
> Kind regards,
> Raúl
>
> [1] 
> https://github.com/ursacomputing/crossbow/actions/runs/8715262993/job/23907626092#step:6:5681
> [2] https://github.com/apache/arrow/pull/41235#issuecomment-2060264968
>
> On Wed, 17 Apr 2024 at 11:01, Raúl Cumplido wrote:
>>
>> Hi,
>>
>> I would like to propose the following release candidate (RC0) of Apache
>> Arrow version 16.0.0. This is a release consisting of 378
>> resolved GitHub issues[1].
>>
>> This release candidate is based on commit:
>> 6a28035c2b49b432dc63f5ee7524d76b4ed2d762 [2]
>>
>> The source release rc0 is hosted at [3].
>> The binary artifacts are hosted at [4][5][6][7][8][9][10][11].
>> The changelog is located at [12].
>>
>> Please download, verify checksums and signatures, run the unit tests,
>> and vote on the release. See [13] for how to validate a release candidate.
>>
>> See also a verification result on GitHub pull request [14].
>>
>> The vote will be open for at least 72 hours.
>>
>> [ ] +1 Release this as Apache Arrow 16.0.0
>> [ ] +0
>> [ ] -1 Do not release this as Apache Arrow 16.0.0 because...
>>
>> [1]: 
>> https://github.com/apache/arrow/issues?q=is%3Aissue+milestone%3A16.0.0+is%3Aclosed
>> [2]: 
>> https://github.com/apache/arrow/tree/6a28035c2b49b432dc63f5ee7524d76b4ed2d762
>> [3]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-16.0.0-rc0
>> [4]: https://apache.jfrog.io/artifactory/arrow/almalinux-rc/
>> [5]: https://apache.jfrog.io/artifactory/arrow/amazon-linux-rc/
>> [6]: https://apache.jfrog.io/artifactory/arrow/centos-rc/
>> [7]: https://apache.jfrog.io/artifactory/arrow/debian-rc/
>> [8]: https://apache.jfrog.io/artifactory/arrow/java-rc/16.0.0-rc0
>> [9]: https://apache.jfrog.io/artifactory/arrow/nuget-rc/16.0.0-rc0
>> [10]: https://apache.jfrog.io/artifactory/arrow/python-rc/16.0.0-rc0
>> [11]: https://apache.jfrog.io/artifactory/arrow/ubuntu-rc/
>> [12]: 
>> https://github.com/apache/arrow/blob/6a28035c2b49b432dc63f5ee7524d76b4ed2d762/CHANGELOG.md
>> [13]: https://arrow.apache.org/docs/developers/release_verification.html
>> [14]: https://github.com/apache/arrow/pull/41235


Re: Unsupported/Other Type

2024-04-17 Thread David Li
I'll see if I can write this out more.

@Weston, indeed this is some sort of "planning stage" but I think a concrete 
type is still useful. For example, wherever we use Arrow and adapt a foreign 
catalog, we may need _something_ to indicate the presence of a column that we 
do not know how to interpret. It would be bad to simply pretend the column does 
not exist, and it would be inconvenient for the user to have a hard error. This 
comes up with the Java JDBC adapter, where currently we just give a hard error 
when we don't know how to convert a type, even if the user is just inquiring 
about the schema of the table, as well as the ADBC Postgres driver, as 
discussed.
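
A minimal sketch of that idea in plain Python (not the Arrow Java or ADBC API): an adapter keeps a column it cannot interpret by emitting a placeholder "other" field instead of dropping it or raising. The `KNOWN_TYPES` table, the `Field` class, the `"other"` type name and the `foreign_type` metadata key are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from foreign type names to Arrow type names;
# a real adapter would carry a much larger table.
KNOWN_TYPES = {"INTEGER": "int32", "VARCHAR": "utf8", "DOUBLE": "float64"}


@dataclass
class Field:
    name: str
    type: str  # an Arrow type name, or "other" as a placeholder
    metadata: dict = field(default_factory=dict)


def adapt_schema(columns):
    """Map (name, foreign_type) pairs to fields without failing on unknown types."""
    fields = []
    for name, foreign_type in columns:
        arrow_type = KNOWN_TYPES.get(foreign_type)
        if arrow_type is None:
            # Keep the column visible and record what the source called it.
            fields.append(Field(name, "other", {"foreign_type": foreign_type}))
        else:
            fields.append(Field(name, arrow_type))
    return fields


schema = adapt_schema([("id", "INTEGER"), ("shape", "GEOMETRY")])
for f in schema:
    print(f.name, f.type, f.metadata)
```

With this shape, a user who only asks for the table schema still sees the `shape` column, and the failure can be deferred until its data is actually requested.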

Otherwise, we'd have to come up with our own encoding of Arrow schemas that 
allows for Option<DataType>, and invent our own conventions in each language/in 
ADBC, and so on. Perhaps we could call this an abuse of Arrow schemas given 
that Arrow was meant to describe concrete in-memory data, but I think user 
requests for features like JSON encodings of Arrow schemas (even if we've made 
no progress on them) show that people generally find use in Arrow schemas 
independently of concrete data.

On Wed, Apr 17, 2024, at 20:09, Antoine Pitrou wrote:
> Ah! Well, I think this could be an interesting proposal, but someone 
> should put a more formal proposal, perhaps as a draft PR.
>
> Regards
>
> Antoine.
>
>
> On 17/04/2024 at 11:57, David Li wrote:
>> For an unsupported/other extension type.
>> 
>> On Wed, Apr 17, 2024, at 18:32, Antoine Pitrou wrote:
>>> What is "this proposal"?
>>>
>>>
>>> On 17/04/2024 at 10:38, David Li wrote:
 Should I take it that this proposal is dead in the water? While we could 
 define our own Unknown/Other type for say the ADBC PostgreSQL driver it 
 might be useful to have a singular type for consumers to latch on to.

 On Fri, Apr 12, 2024, at 07:32, David Li wrote:
> I think an "Other" extension type is slightly different than an
> arbitrary extension type, though: the latter may be understood
> downstream but the former represents a point at which a component
> explicitly declares it does not know how to handle a field. In this
> example, the PostgreSQL ADBC driver might be able to provide a
> representation regardless, but a different driver (or say, the JDBC
> adapter, which cannot necessarily get a bytestring for an arbitrary
> JDBC type) may want an Other type to signal that it would fail if asked
> to provide particular columns.
>
> On Fri, Apr 12, 2024, at 02:30, Dewey Dunnington wrote:
>> Depending where your Arrow-encoded data is used, either extension
>> types or generic field metadata are options. We have this problem in
>> the ADBC Postgres driver, where we can convert *most* Postgres types
>> to an Arrow type but there are some others where we can't or don't
>> know or don't implement a conversion. Currently for these we return
>> opaque binary (the Postgres COPY representation of the value) but put
>> field metadata so that a consumer can implement a workaround for an
>> unsupported type. It would be arguably better to have implemented this
>> as an extension type; however, field metadata felt like less of a
>> commitment when I first worked on this.
>>
>> Cheers,
>>
>> -dewey
>>
>> On Thu, Apr 11, 2024 at 1:20 PM Norman Jordan
>>  wrote:
>>>
>>> I was using UUID as an example. It looks like extension types covers my 
>>> original request.
>>> 
>>> From: Felipe Oliveira Carvalho 
>>> Sent: Thursday, April 11, 2024 7:15 AM
>>> To: dev@arrow.apache.org 
>>> Subject: Re: Unsupported/Other Type
>>>
>>> The OP used UUID as an example. Would that be enough or the request is 
>>> for
>>> a flexible mechanism that allows the creation of one-off nominal types 
>>> for
>>> very specific use-cases?
>>>
>>> —
>>> Felipe
>>>
>>> On Thu, 11 Apr 2024 at 05:06 Antoine Pitrou  wrote:
>>>

 Yes, JSON and UUID are obvious candidates for new canonical extension
 types. XML also comes to mind, but I'm not sure there's much of a use
 case for it.

 Regards

 Antoine.


 On 10/04/2024 at 22:55, Wes McKinney wrote:
> In the past we have discussed adding a canonical type for UUID and 
> JSON.
 I
> still think this is a good idea and could improve ergonomics in
 downstream
> language bindings (e.g. by exposing JSON querying function or
 automatically
> boxing UUIDs in built-in UUID types, like the Python uuid library). 
> Has
> anyone done any work on this to anyone's knowledge?
>
> On Wed, Apr 10, 2024 at 3:05 PM Micah Kornfield 
> 
> wrote:
>
>> Hi Norman,
>> Arrow ha

Re: Unsupported/Other Type

2024-04-17 Thread Antoine Pitrou



Ah! Well, I think this could be an interesting proposal, but someone 
should put a more formal proposal, perhaps as a draft PR.


Regards

Antoine.


On 17/04/2024 at 11:57, David Li wrote:

For an unsupported/other extension type.

On Wed, Apr 17, 2024, at 18:32, Antoine Pitrou wrote:

What is "this proposal"?


On 17/04/2024 at 10:38, David Li wrote:

Should I take it that this proposal is dead in the water? While we could define 
our own Unknown/Other type for say the ADBC PostgreSQL driver it might be 
useful to have a singular type for consumers to latch on to.

On Fri, Apr 12, 2024, at 07:32, David Li wrote:

I think an "Other" extension type is slightly different than an
arbitrary extension type, though: the latter may be understood
downstream but the former represents a point at which a component
explicitly declares it does not know how to handle a field. In this
example, the PostgreSQL ADBC driver might be able to provide a
representation regardless, but a different driver (or say, the JDBC
adapter, which cannot necessarily get a bytestring for an arbitrary
JDBC type) may want an Other type to signal that it would fail if asked
to provide particular columns.

On Fri, Apr 12, 2024, at 02:30, Dewey Dunnington wrote:

Depending where your Arrow-encoded data is used, either extension
types or generic field metadata are options. We have this problem in
the ADBC Postgres driver, where we can convert *most* Postgres types
to an Arrow type but there are some others where we can't or don't
know or don't implement a conversion. Currently for these we return
opaque binary (the Postgres COPY representation of the value) but put
field metadata so that a consumer can implement a workaround for an
unsupported type. It would be arguably better to have implemented this
as an extension type; however, field metadata felt like less of a
commitment when I first worked on this.
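
A rough sketch of the consumer-side workaround described above, in plain Python rather than against the real driver: the consumer looks at field metadata to learn the original Postgres type name and decodes the opaque bytes itself. The metadata key `ADBC:postgresql:typname` is assumed here, and `point` is used only as a hypothetical unsupported type whose binary form is assumed to be two big-endian float64 values.

```python
import struct

# Assumed metadata key carrying the original Postgres type name.
TYPNAME_KEY = "ADBC:postgresql:typname"


def decode_point(raw: bytes):
    """Decode a value assumed to be two big-endian float64s (x, y)."""
    return struct.unpack("!dd", raw)


# Hypothetical registry of consumer-supplied decoders, keyed by type name.
DECODERS = {"point": decode_point}


def decode_column(field_metadata, values):
    """Decode opaque binary values if a decoder is registered for the field."""
    decoder = DECODERS.get(field_metadata.get(TYPNAME_KEY))
    if decoder is None:
        return list(values)  # no workaround known: leave the bytes untouched
    return [None if v is None else decoder(v) for v in values]


raw_values = [struct.pack("!dd", 1.5, -2.0), None]
decoded = decode_column({TYPNAME_KEY: "point"}, raw_values)
print(decoded)
```

An extension type would carry the same information in the type itself rather than in free-form metadata, which is the trade-off mentioned above.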

Cheers,

-dewey

On Thu, Apr 11, 2024 at 1:20 PM Norman Jordan
 wrote:


I was using UUID as an example. It looks like extension types covers my 
original request.

From: Felipe Oliveira Carvalho 
Sent: Thursday, April 11, 2024 7:15 AM
To: dev@arrow.apache.org 
Subject: Re: Unsupported/Other Type

The OP used UUID as an example. Would that be enough or the request is for
a flexible mechanism that allows the creation of one-off nominal types for
very specific use-cases?

—
Felipe

On Thu, 11 Apr 2024 at 05:06 Antoine Pitrou  wrote:



Yes, JSON and UUID are obvious candidates for new canonical extension
types. XML also comes to mind, but I'm not sure there's much of a use
case for it.

Regards

Antoine.


On 10/04/2024 at 22:55, Wes McKinney wrote:

In the past we have discussed adding a canonical type for UUID and JSON. I
still think this is a good idea and could improve ergonomics in downstream
language bindings (e.g. by exposing JSON querying function or automatically
boxing UUIDs in built-in UUID types, like the Python uuid library). Has
anyone done any work on this to anyone's knowledge?

On Wed, Apr 10, 2024 at 3:05 PM Micah Kornfield 
wrote:


Hi Norman,
Arrow has a concept of extension types [1] along with the possibility of
proposing new canonical extension types [2].  This seems to cover the
use-cases you mention but I might be misunderstanding?

Thanks,
Micah

[1] https://arrow.apache.org/docs/format/Columnar.html#format-metadata-extension-types

[2] https://arrow.apache.org/docs/format/CanonicalExtensions.html

On Wed, Apr 10, 2024 at 11:44 AM Norman Jordan
 wrote:


Problem Description

Currently Arrow schemas can only contain columns of types supported by
Arrow. In some cases an Arrow schema maps to an external schema. This can
result in the Arrow schema not being able to support all the columns from
the external schema.

Consider an external system that contains a column of type UUID. To model
the schema in Arrow, the user has two choices:

 1.  Do not include the UUID column in the Arrow schema

 2.  Map the column to an existing Arrow type. This will not include the
original type information. A UUID can be mapped to a FixedSizeBinary, but
consumers of the Arrow schema will be unable to distinguish a
FixedSizeBinary field from a UUID field.
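
To make the second choice concrete, here is a small plain-Python sketch (no Arrow library involved). A field is modelled as a `(storage_type, metadata)` tuple, which is an assumption of this sketch; `ARROW:extension:name` is the field metadata key the Arrow columnar format uses for extension types, while `example.uuid` is a made-up extension name.

```python
import uuid

# Field metadata key used by the Arrow format to name an extension type.
EXT_NAME_KEY = "ARROW:extension:name"


def describe(field):
    """Return the extension name if present, otherwise the storage type."""
    storage_type, metadata = field
    return metadata.get(EXT_NAME_KEY, storage_type)


value = uuid.UUID("12345678-1234-5678-1234-567812345678")
assert len(value.bytes) == 16  # a UUID fits a 16-byte fixed-size binary

# Without metadata the field is just 16 opaque bytes per value.
plain = ("fixed_size_binary[16]", {})
# With extension metadata a consumer can tell it is meant to be a UUID.
tagged = ("fixed_size_binary[16]", {EXT_NAME_KEY: "example.uuid"})

print(describe(plain))
print(describe(tagged))
print(uuid.UUID(bytes=value.bytes) == value)
```

The storage bytes are identical in both cases; only the metadata lets a consumer recover the original type.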

Possible Solution

 *   Add a new type code that represents unsupported types

 *   Values for the new type are represented as variable length binary

Some drivers can expose data even when they don’t understand the data
type. For example, the PostgreSQL driver will return the raw bytes for
fields of an unknown type. Using an explicit type lets clients know that
they should convert values if they were able to determine the actual data
type.

Questions

 *   What is the impact on existing clients when they encounter fields of
the unsupported type?

 *   Is it safe to assume that all unsupported values can safely be
converted to a variable length binary?

 *   How can we preserve information about t

Re: Unsupported/Other Type

2024-04-17 Thread Weston Pace
> may want an Other type to signal that it would fail if asked to provide
> particular columns.

I interpret "would fail" to mean we are still speaking in some kind of
"planning stage" and not yet actually creating arrays.  So I don't know
that this needs to be a data type.  In other words, I see this as
`std::optional<DataType>` and not a unique instance of `DataType`.

However, if you did need to actually create an array, and you wanted some
way of saying "there is no data here because I failed to interpret the
type" then maybe you could create an extension type based on the null type?
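
The two options can be put side by side in a short plain-Python sketch (not Arrow code). `PlannedField` models the planning-stage view where the type is simply optional, and `UnknownField` stands in for an extension type whose storage is the null type; both class names and the `example.unknown` extension name are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class PlannedField:
    """Planning-stage field: the type may simply be absent."""
    name: str
    type: Optional[str]  # None means "could not interpret the source type"


@dataclass
class UnknownField:
    """Stand-in for an extension type whose storage is the null type."""
    name: str
    extension_name: str = "example.unknown"  # hypothetical, not canonical
    storage_type: str = "null"


def materialize(planned: list[PlannedField]) -> list[Union[PlannedField, UnknownField]]:
    """Give every field a concrete type so arrays could be created for it."""
    return [UnknownField(f.name) if f.type is None else f for f in planned]


plan = [PlannedField("id", "int64"), PlannedField("shape", None)]
fields = materialize(plan)
print([type(f).__name__ for f in fields])
```

The optional form only works while no arrays exist; once data has to be produced, the null-storage placeholder gives the column a concrete type without carrying any values.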

On Wed, Apr 17, 2024 at 2:57 AM David Li  wrote:

> For an unsupported/other extension type.
>
> On Wed, Apr 17, 2024, at 18:32, Antoine Pitrou wrote:
> > What is "this proposal"?
> >
> >
> > On 17/04/2024 at 10:38, David Li wrote:
> >> Should I take it that this proposal is dead in the water? While we
> could define our own Unknown/Other type for say the ADBC PostgreSQL driver
> it might be useful to have a singular type for consumers to latch on to.
> >>
> >> On Fri, Apr 12, 2024, at 07:32, David Li wrote:
> >>> I think an "Other" extension type is slightly different than an
> >>> arbitrary extension type, though: the latter may be understood
> >>> downstream but the former represents a point at which a component
> >>> explicitly declares it does not know how to handle a field. In this
> >>> example, the PostgreSQL ADBC driver might be able to provide a
> >>> representation regardless, but a different driver (or say, the JDBC
> >>> adapter, which cannot necessarily get a bytestring for an arbitrary
> >>> JDBC type) may want an Other type to signal that it would fail if asked
> >>> to provide particular columns.
> >>>
> >>> On Fri, Apr 12, 2024, at 02:30, Dewey Dunnington wrote:
>  Depending where your Arrow-encoded data is used, either extension
>  types or generic field metadata are options. We have this problem in
>  the ADBC Postgres driver, where we can convert *most* Postgres types
>  to an Arrow type but there are some others where we can't or don't
>  know or don't implement a conversion. Currently for these we return
>  opaque binary (the Postgres COPY representation of the value) but put
>  field metadata so that a consumer can implement a workaround for an
>  unsupported type. It would be arguably better to have implemented this
>  as an extension type; however, field metadata felt like less of a
>  commitment when I first worked on this.
> 
>  Cheers,
> 
>  -dewey
> 
>  On Thu, Apr 11, 2024 at 1:20 PM Norman Jordan
>   wrote:
> >
> > I was using UUID as an example. It looks like extension types covers
> my original request.
> > 
> > From: Felipe Oliveira Carvalho 
> > Sent: Thursday, April 11, 2024 7:15 AM
> > To: dev@arrow.apache.org 
> > Subject: Re: Unsupported/Other Type
> >
> > The OP used UUID as an example. Would that be enough or the request
> is for
> > a flexible mechanism that allows the creation of one-off nominal
> types for
> > very specific use-cases?
> >
> > —
> > Felipe
> >
> > On Thu, 11 Apr 2024 at 05:06 Antoine Pitrou 
> wrote:
> >
> >>
> >> Yes, JSON and UUID are obvious candidates for new canonical
> extension
> >> types. XML also comes to mind, but I'm not sure there's much of a
> use
> >> case for it.
> >>
> >> Regards
> >>
> >> Antoine.
> >>
> >>
> >> On 10/04/2024 at 22:55, Wes McKinney wrote:
> >>> In the past we have discussed adding a canonical type for UUID and
> JSON.
> >> I
> >>> still think this is a good idea and could improve ergonomics in
> >> downstream
> >>> language bindings (e.g. by exposing JSON querying function or
> >> automatically
> >>> boxing UUIDs in built-in UUID types, like the Python uuid
> library). Has
> >>> anyone done any work on this to anyone's knowledge?
> >>>
> >>> On Wed, Apr 10, 2024 at 3:05 PM Micah Kornfield <
> emkornfi...@gmail.com>
> >>> wrote:
> >>>
>  Hi Norman,
>  Arrow has a concept of extension types [1] along with the
> possibility of
>  proposing new canonical extension types [2].  This seems to cover
> the
>  use-cases you mention but I might be misunderstanding?
> 
>  Thanks,
>  Micah
> 
>  [1]
> 
> 
> >>
> https://arrow.apache.org/docs/format/Columnar.html#format-metadata-extension-types
>  [2] https://arrow.apache.org/docs/format/CanonicalExtensions.html
> 
>  On Wed, Apr 10, 2024 at 11:44 AM Norman Jordan
>   wrote:
> 
> > Problem Description
> >
> > Currently Arrow schemas can only contain columns of types
> supported by
> > Arrow. In some cases an Arrow schema maps to an external schema.
> This
> >> can
> > resu

Re: Unsupported/Other Type

2024-04-17 Thread David Li
For an unsupported/other extension type.

On Wed, Apr 17, 2024, at 18:32, Antoine Pitrou wrote:
> What is "this proposal"?
>
>
> On 17/04/2024 at 10:38, David Li wrote:
>> Should I take it that this proposal is dead in the water? While we could 
>> define our own Unknown/Other type for say the ADBC PostgreSQL driver it 
>> might be useful to have a singular type for consumers to latch on to.
>> 
>> On Fri, Apr 12, 2024, at 07:32, David Li wrote:
>>> I think an "Other" extension type is slightly different than an
>>> arbitrary extension type, though: the latter may be understood
>>> downstream but the former represents a point at which a component
>>> explicitly declares it does not know how to handle a field. In this
>>> example, the PostgreSQL ADBC driver might be able to provide a
>>> representation regardless, but a different driver (or say, the JDBC
>>> adapter, which cannot necessarily get a bytestring for an arbitrary
>>> JDBC type) may want an Other type to signal that it would fail if asked
>>> to provide particular columns.
>>>
>>> On Fri, Apr 12, 2024, at 02:30, Dewey Dunnington wrote:
 Depending where your Arrow-encoded data is used, either extension
 types or generic field metadata are options. We have this problem in
 the ADBC Postgres driver, where we can convert *most* Postgres types
 to an Arrow type but there are some others where we can't or don't
 know or don't implement a conversion. Currently for these we return
 opaque binary (the Postgres COPY representation of the value) but put
 field metadata so that a consumer can implement a workaround for an
 unsupported type. It would be arguably better to have implemented this
 as an extension type; however, field metadata felt like less of a
 commitment when I first worked on this.

 Cheers,

 -dewey

 On Thu, Apr 11, 2024 at 1:20 PM Norman Jordan
  wrote:
>
> I was using UUID as an example. It looks like extension types covers my 
> original request.
> 
> From: Felipe Oliveira Carvalho 
> Sent: Thursday, April 11, 2024 7:15 AM
> To: dev@arrow.apache.org 
> Subject: Re: Unsupported/Other Type
>
> The OP used UUID as an example. Would that be enough or the request is for
> a flexible mechanism that allows the creation of one-off nominal types for
> very specific use-cases?
>
> —
> Felipe
>
> On Thu, 11 Apr 2024 at 05:06 Antoine Pitrou  wrote:
>
>>
>> Yes, JSON and UUID are obvious candidates for new canonical extension
>> types. XML also comes to mind, but I'm not sure there's much of a use
>> case for it.
>>
>> Regards
>>
>> Antoine.
>>
>>
>> On 10/04/2024 at 22:55, Wes McKinney wrote:
>>> In the past we have discussed adding a canonical type for UUID and JSON.
>> I
>>> still think this is a good idea and could improve ergonomics in
>> downstream
>>> language bindings (e.g. by exposing JSON querying function or
>> automatically
>>> boxing UUIDs in built-in UUID types, like the Python uuid library). Has
>>> anyone done any work on this to anyone's knowledge?
>>>
>>> On Wed, Apr 10, 2024 at 3:05 PM Micah Kornfield 
>>> wrote:
>>>
 Hi Norman,
 Arrow has a concept of extension types [1] along with the possibility 
 of
 proposing new canonical extension types [2].  This seems to cover the
 use-cases you mention but I might be misunderstanding?

 Thanks,
 Micah

 [1]


>> https://arrow.apache.org/docs/format/Columnar.html#format-metadata-extension-types
 [2] https://arrow.apache.org/docs/format/CanonicalExtensions.html

 On Wed, Apr 10, 2024 at 11:44 AM Norman Jordan
  wrote:

> Problem Description
>
> Currently Arrow schemas can only contain columns of types supported by
> Arrow. In some cases an Arrow schema maps to an external schema. This
>> can
> result in the Arrow schema not being able to support all the columns
>> from
> the external schema.
>
> Consider an external system that contains a column of type UUID. To
>> model
> the schema in Arrow, the user has two choices:
>
> 1.  Do not include the UUID column in the Arrow schema
>
> 2.  Map the column to an existing Arrow type. This will not 
> include
>> the
> original type information. A UUID can be mapped to a FixedSizeBinary,
>> but
> consumers of the Arrow schema will be unable to distinguish a
> FixedSizeBinary field from a UUID field.
>
> Possible Solution
>
> *   Add a new type code that represents unsupported types
>
> *   Values for the new type are represented as v

Re: AW: Personal feedback on your last release on Apache Arrow ADBC 0.11.0

2024-04-17 Thread Antoine Pitrou



Out of curiosity, did you notice this by chance or do you have some kind 
of script that processes ASF mailing-list archives for possible voting 
irregularities?


Regards

Antoine.


On 17/04/2024 at 10:44, Christofer Dutz wrote:

When looking at whimsy, I can’t see any person named Sutou Kouhei listed as 
member of the Arrow PMC.

Cut that … I was looking for Sutou Kouhei, but it’s Kouhei Sutou … yeah … ok … 
then please ignore my mumbling ;-)

And yeah … the result now also moved to the same page … guess it was sent out a 
while after the Announce … guess that’s why I missed it.

Thanks for following up …

Chris

From: David Li
Date: Wednesday, April 17, 2024 at 10:36
To: Christofer Dutz, dev@arrow.apache.org
Subject: Re: Personal feedback on your last release on Apache Arrow ADBC 0.11.0
Hi Christofer,

Sutou Kouhei is part of the PMC.

Additionally, there is a result email: 
https://lists.apache.org/thread/gb5k69pd3k6lnbzw978fm7ppx1p9cx15

On Wed, Apr 17, 2024, at 16:52, Christofer Dutz wrote:

Hi all,

while reviewing your project's activity in the last quarter as part of
my preparation for today's board meeting, I came across your last vote
on Apache Arrow ADBC 0.11.0 RC0.

Technically I count only 2 binding +1 votes:
- Matthew Topol
- Dewey Dunnington

All others are not part of the PMC.

I assume the Release Manager David implicitly counted himself as +1;
however, the concept of an implicit vote does not exist at Apache. If you
want to save sending an additional email, add something like "this
also counts as my +1 vote" to your email, or - even better - send an
explicit vote email.

It would also be good to have a RESULT email containing the result of a vote.

So right now we would need a third binding vote as soon as possible
(Possibly also for other votes, where we had the release manager
provide the missing third vote).
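
The counting described here can be sketched in a few lines of Python, assuming the usual ASF rule that a release needs at least three binding +1 votes and more binding +1 than binding -1 votes. The roster and vote list below are illustrative only, not the real PMC roster or the real vote thread, and `name_key` and `tally` are hypothetical helpers. Comparing names without regard to word order avoids missing a member who writes the family name first.

```python
def name_key(name: str) -> frozenset:
    """Order-insensitive key, so 'Sutou Kouhei' matches 'Kouhei Sutou'."""
    return frozenset(name.lower().split())


def tally(votes, pmc_members):
    """Split (name, vote) pairs into binding and non-binding results."""
    pmc = {name_key(m) for m in pmc_members}
    result = {"binding +1": [], "non-binding +1": [], "binding -1": []}
    for name, vote in votes:
        is_binding = name_key(name) in pmc
        if vote == 1:
            result["binding +1" if is_binding else "non-binding +1"].append(name)
        elif vote == -1 and is_binding:
            result["binding -1"].append(name)
    plus, minus = len(result["binding +1"]), len(result["binding -1"])
    result["passed"] = plus >= 3 and plus > minus
    return result


# Illustrative roster and votes only.
pmc_members = ["Matthew Topol", "Dewey Dunnington", "Kouhei Sutou"]
votes = [
    ("Matthew Topol", 1),
    ("Dewey Dunnington", 1),
    ("Sutou Kouhei", 1),
    ("Example Contributor", 1),
]
outcome = tally(votes, pmc_members)
print(outcome["binding +1"], outcome["passed"])
```

Having voters write "+1 (binding)" explicitly makes this kind of lookup unnecessary.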

Chris

PS: Please keep me in CC as I'm not subscribed here.




Re: Unsupported/Other Type

2024-04-17 Thread Antoine Pitrou



What is "this proposal"?


On 17/04/2024 at 10:38, David Li wrote:

Should I take it that this proposal is dead in the water? While we could define 
our own Unknown/Other type for say the ADBC PostgreSQL driver it might be 
useful to have a singular type for consumers to latch on to.

On Fri, Apr 12, 2024, at 07:32, David Li wrote:

I think an "Other" extension type is slightly different than an
arbitrary extension type, though: the latter may be understood
downstream but the former represents a point at which a component
explicitly declares it does not know how to handle a field. In this
example, the PostgreSQL ADBC driver might be able to provide a
representation regardless, but a different driver (or say, the JDBC
adapter, which cannot necessarily get a bytestring for an arbitrary
JDBC type) may want an Other type to signal that it would fail if asked
to provide particular columns.

On Fri, Apr 12, 2024, at 02:30, Dewey Dunnington wrote:

Depending where your Arrow-encoded data is used, either extension
types or generic field metadata are options. We have this problem in
the ADBC Postgres driver, where we can convert *most* Postgres types
to an Arrow type but there are some others where we can't or don't
know or don't implement a conversion. Currently for these we return
opaque binary (the Postgres COPY representation of the value) but put
field metadata so that a consumer can implement a workaround for an
unsupported type. It would be arguably better to have implemented this
as an extension type; however, field metadata felt like less of a
commitment when I first worked on this.

Cheers,

-dewey

On Thu, Apr 11, 2024 at 1:20 PM Norman Jordan
 wrote:


I was using UUID as an example. It looks like extension types cover my
original request.

From: Felipe Oliveira Carvalho
Sent: Thursday, April 11, 2024 7:15 AM
To: dev@arrow.apache.org
Subject: Re: Unsupported/Other Type

The OP used UUID as an example. Would that be enough, or is the request for
a flexible mechanism that allows the creation of one-off nominal types for
very specific use-cases?

—
Felipe

On Thu, 11 Apr 2024 at 05:06 Antoine Pitrou  wrote:



Yes, JSON and UUID are obvious candidates for new canonical extension
types. XML also comes to mind, but I'm not sure there's much of a use
case for it.

Regards

Antoine.


On 10/04/2024 at 22:55, Wes McKinney wrote:

In the past we have discussed adding a canonical type for UUID and JSON. I
still think this is a good idea and could improve ergonomics in downstream
language bindings (e.g. by exposing JSON querying function or automatically
boxing UUIDs in built-in UUID types, like the Python uuid library). Has
anyone done any work on this to anyone's knowledge?

On Wed, Apr 10, 2024 at 3:05 PM Micah Kornfield 
wrote:


Hi Norman,
Arrow has a concept of extension types [1] along with the possibility of
proposing new canonical extension types [2].  This seems to cover the
use-cases you mention but I might be misunderstanding?

Thanks,
Micah

[1] https://arrow.apache.org/docs/format/Columnar.html#format-metadata-extension-types
[2] https://arrow.apache.org/docs/format/CanonicalExtensions.html

On Wed, Apr 10, 2024 at 11:44 AM Norman Jordan
 wrote:


Problem Description

Currently Arrow schemas can only contain columns of types supported by
Arrow. In some cases an Arrow schema maps to an external schema. This can
result in the Arrow schema not being able to support all the columns from
the external schema.

Consider an external system that contains a column of type UUID. To model
the schema in Arrow, the user has two choices:

1.  Do not include the UUID column in the Arrow schema

2.  Map the column to an existing Arrow type. This will not include the
original type information. A UUID can be mapped to a FixedSizeBinary, but
consumers of the Arrow schema will be unable to distinguish a
FixedSizeBinary field from a UUID field.

Possible Solution

*   Add a new type code that represents unsupported types

*   Values for the new type are represented as variable length binary

Some drivers can expose data even when they don’t understand the data
type. For example, the PostgreSQL driver will return the raw bytes for
fields of an unknown type. Using an explicit type lets clients know that
they should convert values if they were able to determine the actual data
type.

Questions

*   What is the impact on existing clients when they encounter fields of
the unsupported type?

*   Is it safe to assume that all unsupported values can safely be
converted to a variable length binary?

*   How can we preserve information about the original type?











AW: AW: Personal feedback on your last release on Apache Arrow ADBC 0.11.0

2024-04-17 Thread Christofer Dutz
Yeah … sorry to disappoint :-/

I actually manually go through all main mailing-lists of all projects reporting 
in a given month (usually commits@, issues@, notifications@, dev@ and private@).
I did build a little tool with Tampermonkey that allows me to quickly navigate 
the lists, but I manually check the things that I think are important to me 
(Voting being one of them).

The best anomaly-detection device is my gut-feeling ;-)

Chris

From: Antoine Pitrou
Date: Wednesday, 17 April 2024 at 11:34
To: dev@arrow.apache.org, Christofer Dutz, David Li, Christofer Dutz
Subject: Re: AW: Personal feedback on your last release on Apache Arrow ADBC
0.11.0

Out of curiosity, did you notice this by chance or do you have some kind
of script that processes ASF mailing-list archives for possible voting
irregularities?

Regards

Antoine.


On 17/04/2024 at 10:44, Christofer Dutz wrote:
> When looking at whimsy, I can’t see any person named Sutou Kouhei listed as 
> member of the Arrow PMC.
>
> Cut that … I was looking for Sutou Kouhei, but it’s Kouhei Sutou … yeah … ok 
> … then please ignore my mumbling ;-)
>
> And yeah … the result now also moved to the same page … guess it was sent out 
> a while after the Announce … guess that’s why I missed it.
>
> Thanks for following up …
>
> Chris
>
> From: David Li
> Date: Wednesday, 17 April 2024 at 10:36
> To: Christofer Dutz, dev@arrow.apache.org
> Subject: Re: Personal feedback on your last release on Apache Arrow ADBC
> 0.11.0
> Hi Christofer,
>
> Sutou Kouhei is part of the PMC.
>
> Additionally, there is a result email: 
> https://lists.apache.org/thread/gb5k69pd3k6lnbzw978fm7ppx1p9cx15
>
> On Wed, Apr 17, 2024, at 16:52, Christofer Dutz wrote:
>> Hi all,
>>
>> while reviewing your project's activity in the last quarter as part of
>> my preparation for today's board meeting, I came across your last vote
>> on Apache Arrow ADBC 0.11.0 RC0.
>>
>> Technically I count only 2 binding +1 votes:
>> - Matthew Topol
>> - Dewey Dunnington
>>
>> All others are not part of the PMC.
>>
>> I assume the Release Manager David implicitly counted himself as +1,
>> however, the concept of an implicit vote does not exist at Apache. If you
>> want to save sending an additional email, add something like "this
>> also counts as my +1 vote" to your email, or - even better - send an
>> explicit vote email.
>>
>> It would also be good to have a RESULT email containing the result of a vote.
>>
>> So right now we would need a third binding vote as soon as possible
>> (Possibly also for other votes, where we had the release manager
>> provide the missing third vote).
>>
>> Chris
>>
>> PS: Please keep me in CC as I'm not subscribed here.
>


Re: [VOTE] Release Apache Arrow 16.0.0 - RC0

2024-04-17 Thread Raúl Cumplido
Hi,

Just a minor note, the binary verification for
verify-rc-binaries-wheels-windows failed with [1].
This can be avoided by implementing the solution proposed in this
comment by Kou [2]. See more details there.

As shared in the comment, we don't think this is a blocker, as it just
requires setting TZDIR and downloading the IANA database for the ORC test
to pass on Windows.

Kind regards,
Raúl

[1] 
https://github.com/ursacomputing/crossbow/actions/runs/8715262993/job/23907626092#step:6:5681
[2] https://github.com/apache/arrow/pull/41235#issuecomment-2060264968

On Wed, 17 Apr 2024 at 11:01, Raúl Cumplido wrote:
>
> Hi,
>
> I would like to propose the following release candidate (RC0) of Apache
> Arrow version 16.0.0. This is a release consisting of 378
> resolved GitHub issues[1].
>
> This release candidate is based on commit:
> 6a28035c2b49b432dc63f5ee7524d76b4ed2d762 [2]
>
> The source release rc0 is hosted at [3].
> The binary artifacts are hosted at [4][5][6][7][8][9][10][11].
> The changelog is located at [12].
>
> Please download, verify checksums and signatures, run the unit tests,
> and vote on the release. See [13] for how to validate a release candidate.
>
> See also a verification result on GitHub pull request [14].
>
> The vote will be open for at least 72 hours.
>
> [ ] +1 Release this as Apache Arrow 16.0.0
> [ ] +0
> [ ] -1 Do not release this as Apache Arrow 16.0.0 because...
>
> [1]: 
> https://github.com/apache/arrow/issues?q=is%3Aissue+milestone%3A16.0.0+is%3Aclosed
> [2]: 
> https://github.com/apache/arrow/tree/6a28035c2b49b432dc63f5ee7524d76b4ed2d762
> [3]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-16.0.0-rc0
> [4]: https://apache.jfrog.io/artifactory/arrow/almalinux-rc/
> [5]: https://apache.jfrog.io/artifactory/arrow/amazon-linux-rc/
> [6]: https://apache.jfrog.io/artifactory/arrow/centos-rc/
> [7]: https://apache.jfrog.io/artifactory/arrow/debian-rc/
> [8]: https://apache.jfrog.io/artifactory/arrow/java-rc/16.0.0-rc0
> [9]: https://apache.jfrog.io/artifactory/arrow/nuget-rc/16.0.0-rc0
> [10]: https://apache.jfrog.io/artifactory/arrow/python-rc/16.0.0-rc0
> [11]: https://apache.jfrog.io/artifactory/arrow/ubuntu-rc/
> [12]: 
> https://github.com/apache/arrow/blob/6a28035c2b49b432dc63f5ee7524d76b4ed2d762/CHANGELOG.md
> [13]: https://arrow.apache.org/docs/developers/release_verification.html
> [14]: https://github.com/apache/arrow/pull/41235


[VOTE] Release Apache Arrow 16.0.0 - RC0

2024-04-17 Thread Raúl Cumplido
Hi,

I would like to propose the following release candidate (RC0) of Apache
Arrow version 16.0.0. This is a release consisting of 378
resolved GitHub issues[1].

This release candidate is based on commit:
6a28035c2b49b432dc63f5ee7524d76b4ed2d762 [2]

The source release rc0 is hosted at [3].
The binary artifacts are hosted at [4][5][6][7][8][9][10][11].
The changelog is located at [12].

Please download, verify checksums and signatures, run the unit tests,
and vote on the release. See [13] for how to validate a release candidate.
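
For the checksum part of that step, a minimal Python sketch is shown here. It is not the project's verification script (see [13] for that); the function name sha512_matches is hypothetical, and the sketch assumes the checksum file starts with the hex digest, as in the output of common shasum-style tools. Signature verification is not covered.

```python
import hashlib
from pathlib import Path


def sha512_matches(artifact: Path, checksum_file: Path) -> bool:
    """Compare an artifact's SHA-512 with the first token of its checksum file."""
    # Assumption: the checksum file looks like "<hex digest>  <file name>".
    expected = checksum_file.read_text().split()[0].lower()
    digest = hashlib.sha512()
    with artifact.open("rb") as fh:
        # Read in 1 MiB chunks so large tarballs do not need to fit in memory.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected
```

It would be called with the path of a downloaded artifact and the path of the checksum file published next to it.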

See also a verification result on GitHub pull request [14].

The vote will be open for at least 72 hours.

[ ] +1 Release this as Apache Arrow 16.0.0
[ ] +0
[ ] -1 Do not release this as Apache Arrow 16.0.0 because...

[1]: 
https://github.com/apache/arrow/issues?q=is%3Aissue+milestone%3A16.0.0+is%3Aclosed
[2]: 
https://github.com/apache/arrow/tree/6a28035c2b49b432dc63f5ee7524d76b4ed2d762
[3]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-16.0.0-rc0
[4]: https://apache.jfrog.io/artifactory/arrow/almalinux-rc/
[5]: https://apache.jfrog.io/artifactory/arrow/amazon-linux-rc/
[6]: https://apache.jfrog.io/artifactory/arrow/centos-rc/
[7]: https://apache.jfrog.io/artifactory/arrow/debian-rc/
[8]: https://apache.jfrog.io/artifactory/arrow/java-rc/16.0.0-rc0
[9]: https://apache.jfrog.io/artifactory/arrow/nuget-rc/16.0.0-rc0
[10]: https://apache.jfrog.io/artifactory/arrow/python-rc/16.0.0-rc0
[11]: https://apache.jfrog.io/artifactory/arrow/ubuntu-rc/
[12]: 
https://github.com/apache/arrow/blob/6a28035c2b49b432dc63f5ee7524d76b4ed2d762/CHANGELOG.md
[13]: https://arrow.apache.org/docs/developers/release_verification.html
[14]: https://github.com/apache/arrow/pull/41235


Re: Unsupported/Other Type

2024-04-17 Thread David Li
Should I take it that this proposal is dead in the water? While we could define 
our own Unknown/Other type for say the ADBC PostgreSQL driver it might be 
useful to have a singular type for consumers to latch on to.

On Fri, Apr 12, 2024, at 07:32, David Li wrote:
> I think an "Other" extension type is slightly different than an 
> arbitrary extension type, though: the latter may be understood 
> downstream but the former represents a point at which a component 
> explicitly declares it does not know how to handle a field. In this 
> example, the PostgreSQL ADBC driver might be able to provide a 
> representation regardless, but a different driver (or say, the JDBC 
> adapter, which cannot necessarily get a bytestring for an arbitrary 
> JDBC type) may want an Other type to signal that it would fail if asked 
> to provide particular columns.
>
> On Fri, Apr 12, 2024, at 02:30, Dewey Dunnington wrote:
>> Depending on where your Arrow-encoded data is used, either extension
>> types or generic field metadata are options. We have this problem in
>> the ADBC Postgres driver, where we can convert *most* Postgres types
>> to an Arrow type but there are some others where we can't or don't
>> know or don't implement a conversion. Currently for these we return
>> opaque binary (the Postgres COPY representation of the value) but put
>> field metadata so that a consumer can implement a workaround for an
>> unsupported type. It would be arguably better to have implemented this
>> as an extension type; however, field metadata felt like less of a
>> commitment when I first worked on this.
>>
>> Cheers,
>>
>> -dewey
>>
>> On Thu, Apr 11, 2024 at 1:20 PM Norman Jordan
>>  wrote:
>>>
>>> I was using UUID as an example. It looks like extension types cover my
>>> original request.
>>>
>>> From: Felipe Oliveira Carvalho
>>> Sent: Thursday, April 11, 2024 7:15 AM
>>> To: dev@arrow.apache.org
>>> Subject: Re: Unsupported/Other Type
>>>
>>> The OP used UUID as an example. Would that be enough, or is the request for
>>> a flexible mechanism that allows the creation of one-off nominal types for
>>> very specific use-cases?
>>>
>>> —
>>> Felipe
>>>
>>> On Thu, 11 Apr 2024 at 05:06 Antoine Pitrou  wrote:
>>>
>>> >
>>> > Yes, JSON and UUID are obvious candidates for new canonical extension
>>> > types. XML also comes to mind, but I'm not sure there's much of a use
>>> > case for it.
>>> >
>>> > Regards
>>> >
>>> > Antoine.
>>> >
>>> >
>>> > On 10/04/2024 at 22:55, Wes McKinney wrote:
>>> > > In the past we have discussed adding a canonical type for UUID and JSON. I
>>> > > still think this is a good idea and could improve ergonomics in downstream
>>> > > language bindings (e.g. by exposing JSON querying function or automatically
>>> > > boxing UUIDs in built-in UUID types, like the Python uuid library). Has
>>> > > anyone done any work on this to anyone's knowledge?
>>> > >
>>> > > On Wed, Apr 10, 2024 at 3:05 PM Micah Kornfield 
>>> > > wrote:
>>> > >
>>> > >> Hi Norman,
>>> > >> Arrow has a concept of extension types [1] along with the possibility of
>>> > >> proposing new canonical extension types [2].  This seems to cover the
>>> > >> use-cases you mention but I might be misunderstanding?
>>> > >>
>>> > >> Thanks,
>>> > >> Micah
>>> > >>
>>> > >> [1] https://arrow.apache.org/docs/format/Columnar.html#format-metadata-extension-types
>>> > >> [2] https://arrow.apache.org/docs/format/CanonicalExtensions.html
>>> > >>
>>> > >> On Wed, Apr 10, 2024 at 11:44 AM Norman Jordan
>>> > >>  wrote:
>>> > >>
>>> > >>> Problem Description
>>> > >>>
>>> > >>> Currently Arrow schemas can only contain columns of types supported by
>>> > >>> Arrow. In some cases an Arrow schema maps to an external schema. This can
>>> > >>> result in the Arrow schema not being able to support all the columns from
>>> > >>> the external schema.
>>> > >>>
>>> > >>> Consider an external system that contains a column of type UUID. To model
>>> > >>> the schema in Arrow, the user has two choices:
>>> > >>>
>>> > >>> 1.  Do not include the UUID column in the Arrow schema
>>> > >>>
>>> > >>> 2.  Map the column to an existing Arrow type. This will not include the
>>> > >>> original type information. A UUID can be mapped to a FixedSizeBinary, but
>>> > >>> consumers of the Arrow schema will be unable to distinguish a
>>> > >>> FixedSizeBinary field from a UUID field.
>>> > >>>
>>> > >>> Possible Solution
>>> > >>>
>>> > >>> *   Add a new type code that represents unsupported types
>>> > >>>
>>> > >>> *   Values for the new type are represented as variable length binary
>>> > >>>
>>> > >>> Some drivers can expose data even when they don’t understand the data
>>> > >>> type. For example, the PostgreSQL driver will return the raw bytes for
>>> > >>> fields of an unknown type. Using an explicit type lets clients know that
>>> > >>> they sh

AW: Personal feedback on your last release on Apache Arrow ADBC 0.11.0

2024-04-17 Thread Christofer Dutz
When looking at whimsy, I can’t see any person named Sutou Kouhei listed as 
member of the Arrow PMC.

Cut that … I was looking for Sutou Kouhei, but it’s Kouhei Sutou … yeah … ok … 
then please ignore my mumbling ;-)

And yeah … the result now also moved to the same page … guess it was sent out a 
while after the Announce … guess that’s why I missed it.

Thanks for following up …

Chris

From: David Li
Date: Wednesday, 17 April 2024 at 10:36
To: Christofer Dutz, dev@arrow.apache.org
Subject: Re: Personal feedback on your last release on Apache Arrow ADBC 0.11.0
Hi Christofer,

Sutou Kouhei is part of the PMC.

Additionally, there is a result email: 
https://lists.apache.org/thread/gb5k69pd3k6lnbzw978fm7ppx1p9cx15

On Wed, Apr 17, 2024, at 16:52, Christofer Dutz wrote:
> Hi all,
>
> while reviewing your project's activity in the last quarter as part of
> my preparation for today's board meeting, I came across your last vote
> on Apache Arrow ADBC 0.11.0 RC0.
>
> Technically I count only 2 binding +1 votes:
> - Matthew Topol
> - Dewey Dunnington
>
> All others are not part of the PMC.
>
> I assume the Release Manager David implicitly counted himself as +1,
> however, the concept of an implicit vote does not exist at Apache. If you
> want to save sending an additional email, add something like "this
> also counts as my +1 vote" to your email, or - even better - send an
> explicit vote email.
>
> It would also be good to have a RESULT email containing the result of a vote.
>
> So right now we would need a third binding vote as soon as possible
> (Possibly also for other votes, where we had the release manager
> provide the missing third vote).
>
> Chris
>
> PS: Please keep me in CC as I'm not subscribed here.


Re: Personal feedback on your last release on Apache Arrow ADBC 0.11.0

2024-04-17 Thread David Li
Hi Christofer,

Sutou Kouhei is part of the PMC.

Additionally, there is a result email: 
https://lists.apache.org/thread/gb5k69pd3k6lnbzw978fm7ppx1p9cx15

On Wed, Apr 17, 2024, at 16:52, Christofer Dutz wrote:
> Hi all,
>
> while reviewing your project's activity in the last quarter as part of 
> my preparation for today's board meeting, I came across your last vote 
> on Apache Arrow ADBC 0.11.0 RC0.
>
> Technically I count only 2 binding +1 votes:
> - Matthew Topol
> - Dewey Dunnington
>
> All others are not part of the PMC.
>
> I assume the Release Manager David implicitly counted himself as +1, 
> however, the concept of an implicit vote does not exist at Apache. If you 
> want to save sending an additional email, add something like "this 
> also counts as my +1 vote" to your email, or - even better - send an 
> explicit vote email.
>
> It would also be good to have a RESULT email containing the result of a vote.
>
> So right now we would need a third binding vote as soon as possible 
> (Possibly also for other votes, where we had the release manager 
> provide the missing third vote).
>
> Chris
>
> PS: Please keep me in CC as I'm not subscribed here.


Personal feedback on your last release on Apache Arrow ADBC 0.11.0

2024-04-17 Thread Christofer Dutz
Hi all,

while reviewing your project's activity in the last quarter as part of my 
preparation for today's board meeting, I came across your last vote on Apache 
Arrow ADBC 0.11.0 RC0.

Technically I count only 2 binding +1 votes:
- Matthew Topol
- Dewey Dunnington

All others are not part of the PMC.

I assume the Release Manager David implicitly counted himself as +1, however, 
the concept of an implicit vote does not exist at Apache. If you want to save 
sending an additional email, add something like "this also counts as my +1 
vote" to your email, or - even better - send an explicit vote email.

It would also be good to have a RESULT email containing the result of a vote.

So right now we would need a third binding vote as soon as possible (Possibly 
also for other votes, where we had the release manager provide the missing 
third vote).

Chris

PS: Please keep me in CC as I'm not subscribed here.