Re: Columns/Field index semantic for parquet FileReader

2020-11-16 Thread Radu Teodorescu
OK, I retract my last question on mapping arrow fields to parquet leaf nodes: all the pieces are there and it’s a five-line function. I still feel a bit thrown off by the column index semantics, but I see how it can open up more interesting requests where one would want a subset of a struct.
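For readers following the thread, here is a minimal sketch of the mapping mentioned above, assuming column indices count leaf nodes in depth-first order. The dict-based schema model, the helper names `count_leaves` and `leaf_indices`, and the example fields are all hypothetical illustrations and not part of the Arrow or Parquet APIs.

```python
from typing import List

# Hypothetical schema model (not an Arrow API): a leaf is a type name (str),
# a struct is a dict mapping child field names to leaves or nested structs.

def count_leaves(node) -> int:
    """Count the leaf columns under a node, depth first."""
    if isinstance(node, dict):
        return sum(count_leaves(child) for child in node.values())
    return 1

def leaf_indices(schema: dict, field_name: str) -> List[int]:
    """Return the leaf column indices covered by one top-level field."""
    start = 0
    for name, node in schema.items():
        n = count_leaves(node)
        if name == field_name:
            return list(range(start, start + n))
        start += n  # skip over the leaves of earlier fields
    raise KeyError(field_name)

# Assumed example schema: one struct field between two primitive fields.
schema = {
    "id": "int64",
    "top": {"a": "int32", "b": {"c": "string", "d": "double"}},
    "ts": "timestamp",
}

print(leaf_indices(schema, "top"))
```

Under this model a struct field expands to one index per leaf beneath it, which is why selecting a subset of a struct becomes a matter of passing a subset of those indices.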

C++: Cache RecordBatch

2020-11-16 Thread Rares Vernica
Hello, I'm using an arrow::io::BufferReader and arrow::ipc::RecordBatchStreamReader to read an arrow::RecordBatch from a file. There is only one batch in the file so I do a single RecordBatchStreamReader::ReadNext call. I store the populated RecordBatch in memory for reuse (cache). The memory buffe

Columns/Field index semantic for parquet FileReader

2020-11-16 Thread Radu Teodorescu
Hi, (my apologies if this has already been discussed) I just took a stab at the struct support in parquet FileReader and I am a bit confused by the column index semantic when trying to read a subset of columns from a subset of row groups: Say I have a single column arrow table top: struct {

Re: Using arrow/compute/kernels/*internal.h headers

2020-11-16 Thread Niranda Perera
Hi Ben and Wes, Based on our discussion, I did the following. https://hastebin.com/ajadonados.cpp It seems to be working fine. Would love to get your feedback on this! :-) But I have a couple of concerns. 1. Say I want to communicate the intermediate state data across multiple processes. Unfortun

Re: [Discuss] Arrow Release Schedule

2020-11-16 Thread Keerat Singh
Thank you, Kou and Wes, for your responses. As per the discussion in the last sync call [11-Nov], there was talk of releasing more frequently, and help is needed with the build process. There was also a discussion on creating specific tickets on which help is needed from the community. Just wanted

Re: [FlightRPC][C++][Python] Exposing internal middlewares

2020-11-16 Thread David Li
Hey James, The latter approach sounds fine. You could actually refactor the bit that wraps the Python object in a C++ middleware instance[1] into a cdef method onto the interface. Then the native-C++ middleware could override it and return the C++ object directly and the Python implementations

[FlightRPC][C++][Python] Exposing internal middlewares

2020-11-16 Thread James Duong
Hi, I've been working on porting the cookie middleware from Java to C++/Python clients in this PR: https://github.com/apache/arrow/pull/8655 I'm looking at the Python impl now. The Python Flight API for middlewares seems oriented towards writing middlewares directly in Python, which would get wra

Re: [DISCUSS] Alternative design for KMS interaction in parquet-cpp

2020-11-16 Thread Gidon Gershinsky
Thanks Ben, I left a few comments there. Cheers, Gidon On Mon, Nov 16, 2020 at 2:58 AM Benjamin Kietzman wrote: > @Gidon > > Copied the gist to a google doc for commenting: > > https://docs.google.com/document/d/11qz84ajysvVo5ZAV9mXKOeh6ay4-xgkBrubggCP5220/edit# > > @Micah > > > it would be pr

Re: Travis CI jobs gummed up on Arrow PRs?

2020-11-16 Thread Andrew Lamb
Thank you! On Sun, Nov 15, 2020 at 8:30 PM Kazuaki Ishizaki wrote: > I have just reported this issue at the TravisCI forum. > > > https://travis-ci.community/t/s390x-jobs-have-not-been-almost-executed/10581 > > Regards, > Kazuaki Ishizaki, > > Sutou Kouhei wrote on 2020/11/16 10:02:18: > > > Fr

Re: [DataFusion] Blocking async of async is not async

2020-11-16 Thread Rémi Dettai
Hi! Thanks for your insights! @andrew the gist I sent does start two runtimes and works. The constraint seems to be that you cannot start a new runtime (or block on one) in the async thread pool, but it seems legal to start a new on

[NIGHTLY] Arrow Build Report for Job nightly-2020-11-16-0

2020-11-16 Thread Crossbow
Arrow Build Report for Job nightly-2020-11-16-0 All tasks: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-11-16-0 Failed Tasks: - centos-8-aarch64: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-11-16-0-travis-centos-8-aarch64 - conda-win-v