RE: [DISCUSS][Format] Starting the draft implementation of the ArrayView array format

2023-04-25 Thread wish maple
I think the ArrayVector can have these benefits: 1. Converting a batch in Velox or another system to an Arrow array could be much more lightweight. 2. Modifying, filtering, and copying an array or string could be much more lightweight. Velox can make a Vector mutable; it seems that an Arrow array cannot. Seems it

Re: [DISCUSS][Format] Starting the draft implementation of the ArrayView array format

2023-04-25 Thread Will Jones
I suppose one common use case is materializing list columns after some expanding operation like a join or unnest. That's a case where I could imagine a lot of repetition of values. Haven't yet thought of common cases where there is overlap but not full duplication, but am eager to hear any. The

Re: [DISCUSS][Format] Starting the draft implementation of the ArrayView array format

2023-04-25 Thread Raphael Taylor-Davies
Unless I am missing something, I think the selection use-case could be equally well served by a dictionary-encoded BinaryArray/ListArray, and would have the benefit of not requiring any modifications to the existing format or kernels. The major additional flexibility of the proposed encoding
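
A rough illustration of the dictionary-encoding point above, in plain Python rather than any Arrow API. The names `dictionary`, `indices`, `select`, and `decode` are hypothetical and do not come from the thread. The sketch assumes a dictionary-encoded list column is a set of distinct lists plus an index per row, so a selection only builds a new index buffer and leaves the list values untouched.

```python
# Hypothetical stand-in for a dictionary-encoded list column:
# `dictionary` holds the distinct list values, `indices` holds one entry per row.
dictionary = [[1, 2], [3], [4, 5, 6]]
indices = [2, 0, 0, 1]


def select(indices, selection):
    """Apply a selection by rewriting only the index buffer."""
    return [indices[i] for i in selection]


def decode(dictionary, indices):
    """Materialize the logical rows (only needed for inspection)."""
    return [dictionary[i] for i in indices]


# Keep rows 3, 1, 1 (repetition and reordering are both allowed).
selected_indices = select(indices, [3, 1, 1])
selected_rows = decode(dictionary, selected_indices)
```

The list values in `dictionary` are shared between the original and the selected column; only the small index list is copied.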

Re: [VOTE] Formalize how to change format

2023-04-25 Thread Sutou Kouhei
Hi, I've added one more note about documentation: We must update the corresponding documentation (files in ``_) too. https://github.com/apache/arrow/pull/35174#issuecomment-1522572677 See also the preview URL:

[DISCUSS][Format][Flight] Ordered data support

2023-04-25 Thread Sutou Kouhei
Hi, I would like to propose adding support for ordered data to Apache Arrow Flight. If anyone has comments on this proposal, please share them here or on the issue for this proposal: https://github.com/apache/arrow/issues/34852 This is one of the proposals in "[DISCUSS] Flight RPC/Flight SQL/ADBC

Re: [DISCUSS][Format] Starting the draft implementation of the ArrayView array format

2023-04-25 Thread David Li
Is there a need for a 64-bit offsets version the same way we have List and LargeList? And just to be clear, the difference with List is that the lists don't have to be stored in their logical order (or in other words, offsets do not have to be nondecreasing and so we also need sizes)? On Wed,
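
The distinction raised in the question above can be sketched in plain Python. This is only an illustration under assumptions: the helper names `list_rows` and `list_view_rows` are hypothetical, and the layout follows what the question describes (a List needs only nondecreasing offsets; a view layout with arbitrary offsets also needs a size per row), not a finalized specification.

```python
def list_rows(values, offsets):
    """List-style layout: row i spans values[offsets[i]:offsets[i + 1]].

    Offsets must be nondecreasing, so n rows need n + 1 offsets and no sizes.
    """
    return [values[offsets[i]:offsets[i + 1]] for i in range(len(offsets) - 1)]


def list_view_rows(values, offsets, sizes):
    """View-style layout: row i spans values[offsets[i]:offsets[i] + sizes[i]].

    Offsets may appear in any order and may overlap, so each row carries
    its own size.
    """
    return [values[o:o + s] for o, s in zip(offsets, sizes)]


values = [10, 20, 30, 40, 50]

# Rows stored in their logical order.
as_list = list_rows(values, [0, 2, 2, 5])

# Rows stored out of order, with one range reused by two rows.
as_view = list_view_rows(values, offsets=[2, 0, 0], sizes=[3, 2, 2])
```

Here `as_view` reorders and repeats ranges of `values` without copying any child elements, which a single nondecreasing offsets buffer cannot express.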

Re: [DISCUSS][Format] Starting the draft implementation of the ArrayView array format

2023-04-25 Thread Weston Pace
For context, there was some discussion on this back in [1]. At that time this was called "sequence view" but I do not like that name. However, array-view array is a little confusing. Given this is similar to list can we go with list-view array? > Thanks for the introduction. I'd be interested

Re: [DISCUSS][Format] Starting the draft implementation of the ArrayView array format

2023-04-25 Thread Will Jones
Hi Felipe, Thanks for the introduction. I'd be interested to hear about the applications Velox has found for these vectors, and in what situations they are useful. This could be contrasted with the current ListArray implementations. IIUC it would be fairly cheap to transform a ListArray to an

[DISCUSS][Format] Starting the draft implementation of the ArrayView array format

2023-04-25 Thread Felipe Oliveira Carvalho
Hi folks, I would like to start a public discussion on the inclusion of a new array format to Arrow — array-view array. The name is also up for debate. This format is inspired by Velox's ArrayVector format [1]. Logically, this array represents an array of arrays. Each element is an array-view

Re: [VOTE] Release Apache Arrow 12.0.0 - RC0

2023-04-25 Thread Jacob Wujciak
I checked out a trace for the cmake issue and LLVM 15.0.7 is found correctly. The issue comes from `llvm_map_components_to_libnames`, which complains about X86 not being in the list of libraries. But we don't add that but rather it gets appended in the function?

Arrow community meeting April 26 at 16:00 UTC

2023-04-25 Thread Ian Cook
Hi all, Our biweekly Arrow community meeting is tomorrow at 16:00 UTC / 12:00 EDT. Zoom meeting URL: https://zoom.us/j/87649033008?pwd=SitsRHluQStlREM0TjJVYkRibVZsUT09 Meeting ID: 876 4903 3008 Passcode: 958092 The notes for this and future instances of this meeting will be captured in this

[RESULT][VOTE][RUST] Release Apache Arrow Rust 38.0.0 RC1

2023-04-25 Thread Raphael Taylor-Davies
With 5 +1 votes (4 binding) the release is approved. The release is available here: https://dist.apache.org/repos/dist/release/arrow/arrow-rs-38.0.0 It has also been released to crates.io. Thank you to everyone who helped verify this release. Raphael On 21/04/2023 20:10, Andrew Lamb wrote:

Re: [VOTE] Release Apache Arrow 12.0.0 - RC0

2023-04-25 Thread Raúl Cumplido
I have created the following issue for the new wheels test failure around pandas 2.0.1 : https://github.com/apache/arrow/issues/35321 I don't think we should create a new RC for that issue but I'm happy to know other people's thoughts around that. El lun, 24 abr 2023 a las 21:12, Raúl Cumplido

Re: [VOTE][RUST][DataFusion] Release DataFusion Python Bindings 23.0.0 RC2

2023-04-25 Thread L. C. Hsieh
+1 (binding) Verified on M1 Mac. Thanks Andy. On Mon, Apr 24, 2023 at 6:13 PM Andy Grove wrote: > > Hi, > > I would like to propose a release of Apache Arrow DataFusion Python > Bindings, > version 23.0.0. > > This release candidate is based on commit: >