Re: [C++] Reducing branching in compute/kernels/vector_selection.cc

2021-06-24 Thread Nate Bauernfeind
> Basically, it resets/sets the borrow bit in the EFLAGS register based on the if condition, and runs `outpos = outpos - (-1) - borrow_bit`. That's clever, and I clearly didn't see that! On Thu, Jun 24, 2021 at 8:57 PM Yibo Cai wrote: > > > On 6/25/21 6:58 AM, Nate Bauernfeind wrote: > > FYI, the benc
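
The trick quoted in this preview can be illustrated with a small sketch. It is not the code from vector_selection.cc or from the benchmark in the thread; the function names `FilterBranching` and `FilterBranchless` and the byte-per-element `keep` mask are assumptions made for illustration. The branchless variant always stores the value and then adds the condition (0 or 1) to `outpos`. A compiler may lower that addition to a compare followed by a subtract-with-borrow, which is the `outpos - (-1) - borrow_bit` form described above, though the generated code depends on the compiler and flags.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: branching filter. Copies values[i] to out only when
// keep[i] is nonzero and returns the number of elements written.
std::size_t FilterBranching(const std::vector<int32_t>& values,
                            const std::vector<uint8_t>& keep,
                            std::vector<int32_t>& out) {
  assert(keep.size() == values.size());
  assert(out.size() >= values.size());
  std::size_t outpos = 0;
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (keep[i] != 0) {  // data-dependent branch, hard to predict on random masks
      out[outpos++] = values[i];
    }
  }
  return outpos;
}

// Hypothetical helper: branchless filter. Stores unconditionally, then
// advances outpos by 0 or 1. A rejected value is overwritten by the next
// store, so out must have room for values.size() elements.
std::size_t FilterBranchless(const std::vector<int32_t>& values,
                             const std::vector<uint8_t>& keep,
                             std::vector<int32_t>& out) {
  assert(keep.size() == values.size());
  assert(out.size() >= values.size());
  std::size_t outpos = 0;
  for (std::size_t i = 0; i < values.size(); ++i) {
    out[outpos] = values[i];                          // unconditional store
    outpos += static_cast<std::size_t>(keep[i] != 0); // adds 0 or 1, no branch
  }
  return outpos;
}
```

Both functions return the same count and leave the same selected prefix in `out`; only elements at or beyond the returned count may differ, because the branchless version can leave a rejected value there.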

Re: [C++] Reducing branching in compute/kernels/vector_selection.cc

2021-06-24 Thread Yibo Cai
On 6/25/21 6:58 AM, Nate Bauernfeind wrote: FYI, the bench was slightly broken, but the results stand. benchmark::DoNotOptimize(output[rand()]); Since rand() has a range of 0 to RAND_MAX, it blows past the output array (of length 4k). It segfaults in GCC; I'm not sure why the Clang benchmark

Re: [VOTE][RUST] Release Apache Arrow Rust 4.4.0 RC1

2021-06-24 Thread Sutou Kouhei
+1 I ran the following command line on Debian GNU/Linux sid: dev/release/verify-release-candidate.sh 4.4.0 1 Thanks, -- kou In "[VOTE][RUST] Release Apache Arrow Rust 4.4.0 RC1" on Thu, 24 Jun 2021 18:15:46 -0400, Andrew Lamb wrote: > Hi, > > I would like to propose a release of Ap

Re: [ANNOUNCE] Official media types (MIME types) for Apache Arrow formats

2021-06-24 Thread Sutou Kouhei
Hi, We've documented these extensions in https://github.com/apache/arrow/pull/10512 . Could someone add media types to our docs? And the FAQ page's source is here: https://github.com/apache/arrow-site/blob/master/faq.md Thanks, -- kou In <20210624091707.5a0e1b5b@fsol> "Re: [ANNOUNCE] Offic

Re: [C++] Reducing branching in compute/kernels/vector_selection.cc

2021-06-24 Thread Nate Bauernfeind
FYI, the bench was slightly broken, but the results stand. > benchmark::DoNotOptimize(output[rand()]); Since rand() has a range of 0 to RAND_MAX, it blows past the output array (of length 4k). It segfaults in GCC; I'm not sure why the Clang benchmark is happy with that. I modified [1] it to: > ben

[VOTE][RUST] Release Apache Arrow Rust 4.4.0 RC1

2021-06-24 Thread Andrew Lamb
Hi, I would like to propose a release of Apache Arrow Rust Implementation, version 4.4.0. This release candidate is based on commit: 32b835e5bee228d8a52015190596f4c33765849a [1] The proposed release tarball and signatures are hosted at [2]. The changelog is located at [3]. Please download, ver

Re: [STRAW POLL] (How) should Arrow define storage for "Instant"s

2021-06-24 Thread David Li
I would also be in favor of option C, or also E if having that distinction in the schema is important to some application. -David On Thu, Jun 24, 2021, at 17:16, Andrew Lamb wrote: > C > > On Thu, Jun 24, 2021 at 5:05 PM Rok Mihevc wrote: > > > C > > > > On Thu, Jun 24, 2021 at 9:55 PM Nate B

Re: [STRAW POLL] (How) should Arrow define storage for "Instant"s

2021-06-24 Thread Andrew Lamb
C On Thu, Jun 24, 2021 at 5:05 PM Rok Mihevc wrote: > C > > On Thu, Jun 24, 2021 at 9:55 PM Nate Bauernfeind < > natebauernfe...@deephaven.io> wrote: > > > Option C. > > > > On Thu, Jun 24, 2021 at 1:53 PM Joris Peeters < > joris.mg.peet...@gmail.com> > > wrote: > > > > > C > > > > > > On Thu, J

Re: [STRAW POLL] (How) should Arrow define storage for "Instant"s

2021-06-24 Thread Rok Mihevc
C On Thu, Jun 24, 2021 at 9:55 PM Nate Bauernfeind < natebauernfe...@deephaven.io> wrote: > Option C. > > On Thu, Jun 24, 2021 at 1:53 PM Joris Peeters > wrote: > > > C > > > > On Thu, Jun 24, 2021 at 8:39 PM Antoine Pitrou > wrote: > > > > > > > > Option C. > > > > > > > > > On 24/06/2021 at 21

Re: [STRAW POLL] (How) should Arrow define storage for "Instant"s

2021-06-24 Thread Nate Bauernfeind
Option C. On Thu, Jun 24, 2021 at 1:53 PM Joris Peeters wrote: > C > > On Thu, Jun 24, 2021 at 8:39 PM Antoine Pitrou wrote: > > > > > Option C. > > > > > > On 24/06/2021 at 21:24, Weston Pace wrote: > > > > > > This proposal states that Arrow should define how to encode an Instant > > > into

Re: [STRAW POLL] (How) should Arrow define storage for "Instant"s

2021-06-24 Thread Joris Peeters
C On Thu, Jun 24, 2021 at 8:39 PM Antoine Pitrou wrote: > > Option C. > > > On 24/06/2021 at 21:24, Weston Pace wrote: > > > > This proposal states that Arrow should define how to encode an Instant > > into Arrow data. There are several ways this could happen, some which > > change schema.fbs

Re: [STRAW POLL] (How) should Arrow define storage for "Instant"s

2021-06-24 Thread Antoine Pitrou
Option C. On 24/06/2021 at 21:24, Weston Pace wrote: This proposal states that Arrow should define how to encode an Instant into Arrow data. There are several ways this could happen, some which change schema.fbs and some which do not. --- For sample arguments (currently grouped as "for c

Re: [STRAW POLL] (How) should Arrow define storage for "Instant"s

2021-06-24 Thread Micah Kornfield
My preference would be C with the caveat that we replace the word with "define" as that the convention for encoding Instant is Timestamp with timezone "UTC". My second choice would be E. (This could either be an extension of option C, or support any Timestamp with Timezone). If there are concre

Re: [STRAW POLL] (How) should Arrow define storage for "Instant"s

2021-06-24 Thread Weston Pace
[1] https://lists.apache.org/thread.html/r8216e5de3efd2935e3907ad9bd20ce07e430952f84de69b36337e5eb%40%3Cdev.arrow.apache.org%3E [2] https://docs.google.com/document/d/1xEKRhs-GUSMwjMhgmQdnCNMXwZrA10226AcXRoP8g9E/edit?usp=sharing [3] https://docs.google.com/document/d/1QDwX4ypfNvESc2ywcT1ygaf2Y1R

[STRAW POLL] (How) should Arrow define storage for "Instant"s

2021-06-24 Thread Weston Pace
The discussion in [1] led to the following question. Before we proceed to a vote, it was decided we should do a straw poll to settle on an approach (which can then be voted on in a +1/-1 fashion). --- Some date & time libraries have three temporal concepts. For the sake of this document we will c

Re: [VOTE] Clarify meaning of timestamp without time zone to equal the concept of "LocalDateTime"

2021-06-24 Thread Micah Kornfield
+1 (binding) On Thu, Jun 24, 2021 at 12:17 PM Weston Pace wrote: > The discussion in [1] led to the following proposal which I would like > to submit for a vote. > > --- > Arrow allows a timestamp column to omit the time zone property. This > has caused confusion because some people have interp

[VOTE] Clarify meaning of timestamp without time zone to equal the concept of "LocalDateTime"

2021-06-24 Thread Weston Pace
The discussion in [1] led to the following proposal which I would like to submit for a vote. --- Arrow allows a timestamp column to omit the time zone property. This has caused confusion because some people have interpreted a timestamp without a time zone to be an Instant while others have interp

Re: [C++] Reducing branching in compute/kernels/vector_selection.cc

2021-06-24 Thread Niranda Perera
I created a JIRA for this. I will make the changes in the select kernels and report back with benchmark results: https://issues.apache.org/jira/browse/ARROW-13170 On Thu, Jun 24, 2021 at 12:27 AM Yibo Cai wrote: > Did a quick test. For random bitmaps and my trivial test code, the > branch-less code is

Re: [ANNOUNCE] Official media types (MIME types) for Apache Arrow formats

2021-06-24 Thread Maarten Breddels
Great work, nice to see this formalized. On Thu, Jun 24, 2021 at 9:17 AM Antoine Pitrou wrote: > > Can we document them in the format docs and/or in the FAQ? > > > On Thu, 24 Jun 2021 10:47:34 +0900 (JST) > Sutou Kouhei wrote: > > Hi, > > > > The official media types (MIME types) for Apache

Re: [ANNOUNCE] Official media types (MIME types) for Apache Arrow formats

2021-06-24 Thread Antoine Pitrou
Can we document them in the format docs and/or in the FAQ? On Thu, 24 Jun 2021 10:47:34 +0900 (JST) Sutou Kouhei wrote: > Hi, > > The official media types (MIME types) for Apache Arrow > formats are registered to IANA: > > * > https://www.iana.org/assignments/media-types/application/vnd.a

Re: [Python] Drop Python 3.6 and Numpy 1.16 support?

2021-06-24 Thread Joris Van den Bossche
Note that the last bug-fix release of Python 3.6 already happened on 2018-12-11 (3.6.8 release), and since then it has only received source-only security releases. But I agree with Antoine that it's currently not a big burden to keep Python 3.6 a bit longer. With the change of the relea

Re: [Python] Drop Python 3.6 and Numpy 1.16 support?

2021-06-24 Thread Antoine Pitrou
We definitely can. The cost of supporting Python 3.6 is rather low. On 23/06/2021 at 20:40, Micah Kornfield wrote: Could we postpone dropping Python 3.6 support to be in line with the Python core maintainers' deadline? Or at least until the Arrow 6 release? Thanks, Micah On Wed,