Re: [VOTE] Removing validity bitmap from Arrow union types

2020-06-30 Thread Ryan Murray
+1 (non binding) On Tue, Jun 30, 2020 at 5:29 AM Ben Kietzman wrote: > +1 (non binding) > > On Tue, Jun 30, 2020, 00:24 Wes McKinney wrote: > > > +1 (binding) > > > > On Mon, Jun 29, 2020 at 11:09 PM Micah Kornfield > > wrote: > > > > > > +1 (binding) (I had a couple of nits on language,

[NIGHTLY] Arrow Build Report for Job nightly-2020-06-30-0

2020-06-30 Thread Crossbow
Arrow Build Report for Job nightly-2020-06-30-0 All tasks: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-06-30-0 Failed Tasks: - centos-7-aarch64: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-06-30-0-travis-centos-7-aarch64 -

Re: [VOTE] Permitting unsigned integers for Arrow dictionary indices

2020-06-30 Thread Uwe L. Korn
+1 (binding) On Tue, Jun 30, 2020, at 6:24 AM, Wes McKinney wrote: > +1 (binding) > > On Mon, Jun 29, 2020 at 11:11 PM Ben Kietzman > wrote: > > > > +1 (non binding) > > > > On Mon, Jun 29, 2020, 18:00 Wes McKinney wrote: > > > > > Hi, > > > > > > As discussed on the mailing list [1], it has

Re: [VOTE] Increment MetadataVersion in Schema.fbs from V4 to V5 for 1.0.0 release

2020-06-30 Thread Antoine Pitrou
+1 (binding) On 29/06/2020 at 23:42, Wes McKinney wrote: > Hi, > > As discussed on the mailing list [1], in order to demarcate the > pre-1.0.0 and post-1.0.0 worlds, and to allow the > forward-compatibility-protection changes we are making to actually > work (i.e. so that libraries can

Re: [VOTE] Increment MetadataVersion in Schema.fbs from V4 to V5 for 1.0.0 release

2020-06-30 Thread Uwe L. Korn
+1 (binding) On Tue, Jun 30, 2020, at 11:11 AM, Neville Dipale wrote: > +1 (non-binding) > > On Tue, 30 Jun 2020 at 06:29, Ben Kietzman wrote: > > > +1 (non binding) > > > > On Tue, Jun 30, 2020, 00:25 Wes McKinney wrote: > > > > > +1 (binding) > > > > > > On Mon, Jun 29, 2020 at 10:49 PM

Re: [VOTE] Permitting unsigned integers for Arrow dictionary indices

2020-06-30 Thread Antoine Pitrou
+1 (binding) On 29/06/2020 at 23:59, Wes McKinney wrote: > Hi, > > As discussed on the mailing list [1], it has been proposed to allow > the use of unsigned dictionary indices (which is already technically > possible in our metadata serialization, but not allowed according to > the language

Re: [VOTE] Removing validity bitmap from Arrow union types

2020-06-30 Thread Antoine Pitrou
+0 On 29/06/2020 at 23:23, Wes McKinney wrote: > Hi, > > As discussed on the mailing list [1], it has been proposed to remove > the validity bitmap buffer from Union types in the columnar format > specification and instead let value validity be determined exclusively > by constituent arrays

Re: [VOTE] Increment MetadataVersion in Schema.fbs from V4 to V5 for 1.0.0 release

2020-06-30 Thread Neville Dipale
+1 (non-binding) On Tue, 30 Jun 2020 at 06:29, Ben Kietzman wrote: > +1 (non binding) > > On Tue, Jun 30, 2020, 00:25 Wes McKinney wrote: > > > +1 (binding) > > > > On Mon, Jun 29, 2020 at 10:49 PM Micah Kornfield > > wrote: > > > > > > +1 (binding) > > > > > > On Mon, Jun 29, 2020 at 2:43 PM

Re: Arrow as a common open standard for machine learning data

2020-06-30 Thread Joaquin Vanschoren
Hi all, Sorry for restarting an old thread, but we've had a _lot_ of discussions over the past 9 months or so on how to store machine learning datasets internally. We've written a blog post about it and would love to hear your thoughts:

Re: [VOTE] Increment MetadataVersion in Schema.fbs from V4 to V5 for 1.0.0 release

2020-06-30 Thread Neal Richardson
+1 (binding) On Tue, Jun 30, 2020 at 2:53 AM Antoine Pitrou wrote: > +1 (binding) > > > On 29/06/2020 at 23:42, Wes McKinney wrote: > > Hi, > > > > As discussed on the mailing list [1], in order to demarcate the > > pre-1.0.0 and post-1.0.0 worlds, and to allow the > >

Re: [VOTE] Permitting unsigned integers for Arrow dictionary indices

2020-06-30 Thread Neal Richardson
+1 (binding) On Tue, Jun 30, 2020 at 2:52 AM Antoine Pitrou wrote: > > +1 (binding) > > On 29/06/2020 at 23:59, Wes McKinney wrote: > > Hi, > > > > As discussed on the mailing list [1], it has been proposed to allow > > the use of unsigned dictionary indices (which is already technically > >

Re: [VOTE] Permitting unsigned integers for Arrow dictionary indices

2020-06-30 Thread Francois Saint-Jacques
+1 (binding) On Tue, Jun 30, 2020 at 10:55 AM Neal Richardson wrote: > > +1 (binding) > > On Tue, Jun 30, 2020 at 2:52 AM Antoine Pitrou wrote: > > > > > +1 (binding) > > > > On 29/06/2020 at 23:59, Wes McKinney wrote: > > > Hi, > > > > > > As discussed on the mailing list [1], it has been

Re: Bot to set "In Progress" status in JIRA

2020-06-30 Thread Neville Dipale
Thanks Wes, I noticed this today, but was a bit confused as to why you reassigned a JIRA to yourself, then back to me. This clarifies what happened :) Neville On Tue, 30 Jun 2020 at 15:39, Wes McKinney wrote: > hi, > > Yesterday I set up a bot to set issues to In Progress if they have an >

Re: Arrow as a common open standard for machine learning data

2020-06-30 Thread Nicholas Poorman
Joaquin, After reading your proposal I think there may be some things you may want to consider. It sounds like you are trying to come up with a one size fits all solution but it may be better to define your requirements based on your needs and environment. For starters, where do you plan to

Re: [DISCUSS] Ongoing LZ4 problems with Parquet files

2020-06-30 Thread Uwe L. Korn
I'm also in favor of disabling support for now. Having to deal with broken files or the detection of various incompatible implementations in the long-term will harm more than not supporting LZ4 for a while. Snappy is generally more used than LZ4 in this category as it has been available since

Re: Arrow as a common open standard for machine learning data

2020-06-30 Thread Wes McKinney
On Tue, Jun 30, 2020 at 8:09 AM Nicholas Poorman wrote: > > Joaquin, > > After reading your proposal I think there may be some things you may want > to consider. > > It sounds like you are trying to come up with a one size fits all solution > but it may be better to define your requirements based

Re: Bot to set "In Progress" status in JIRA

2020-06-30 Thread Wes McKinney
I'm setting up the bot user right now, so this should go away soon. On Tue, Jun 30, 2020 at 9:39 AM Neville Dipale wrote: > > Thanks Wes, > > I noticed this today, but was a bit confused as to why you reassigned a > JIRA to yourself, then back to me. > This clarifies what happened :) > > Neville

Bot to set "In Progress" status in JIRA

2020-06-30 Thread Wes McKinney
hi, Yesterday I set up a bot to set issues to In Progress if they have an assignee once a pull request has been opened. A consequence of issuing the "Start Progress" transition in JIRA is that it assigns the issue to the JIRA user, so the bot (which uses my JIRA credentials) will then

Re: [VOTE] Removing validity bitmap from Arrow union types

2020-06-30 Thread Wes McKinney
FYI: I just submitted a PR implementing this in C++ and in the integration tests. It was not too awful: https://github.com/apache/arrow/pull/7598 On Tue, Jun 30, 2020 at 4:52 AM Antoine Pitrou wrote: > > +0 > > > On 29/06/2020 at 23:23, Wes McKinney wrote: > > Hi, > > > > As discussed on the

Re: [VOTE] Removing validity bitmap from Arrow union types

2020-06-30 Thread Sutou Kouhei
+1 (binding) In "[VOTE] Removing validity bitmap from Arrow union types" on Mon, 29 Jun 2020 16:23:23 -0500, Wes McKinney wrote: > Hi, > > As discussed on the mailing list [1], it has been proposed to remove > the validity bitmap buffer from Union types in the columnar format >

Re: [VOTE] Increment MetadataVersion in Schema.fbs from V4 to V5 for 1.0.0 release

2020-06-30 Thread Sutou Kouhei
+1 (binding) In "[VOTE] Increment MetadataVersion in Schema.fbs from V4 to V5 for 1.0.0 release" on Mon, 29 Jun 2020 16:42:45 -0500, Wes McKinney wrote: > Hi, > > As discussed on the mailing list [1], in order to demarcate the > pre-1.0.0 and post-1.0.0 worlds, and to allow the >