Re: Exposing low-level Parquet encryption to Python user (or, maybe not)

2021-02-16 Thread Gidon Gershinsky
Precisely, the main change is in the threading model. AFAIK, the document proposes a model that fits pandas, but might be problematic for other users of this library. Technically, this is not a showstopper though; if the community decides on this model, it will be compatible with the high-level encry

Re: Exposing low-level Parquet encryption to Python user (or, maybe not)

2021-02-16 Thread Gidon Gershinsky
Hi Antoine, My part there is mostly review and some advice. The bulk of the work is done by Tham, and by the community members who've reviewed the PR; my frustration is with seeing it in limbo for a while now. Regarding the remaining comments - currently, the main sticking points are the change pr

Re: Pcap2Arrow - Packet capture and data conversion tool to Apache Arrow on the fly

2021-02-16 Thread Micah Kornfield
Nice work, glad Arrow proved useful. On Mon, Feb 15, 2021 at 11:44 PM Kohei KaiGai wrote: > Hello, > > Let me share my recent works below: > https://github.com/heterodb/pg-strom/wiki/804:-Pcap2Arrow > > This standalone command-line tool allows to capture network packets > from network interface

Re: Threading Improvements Proposal

2021-02-16 Thread Micah Kornfield
> > If a method could potentially run some kind of long term blocking I/O > wait then yes. So reading / writing tables & datasets, IPC, > filesystem APIs, etc. will all need to adapt. It doesn't have to be > all at once. CPU only functions would remain as they are. So table > manipulation, comp

Re: Exposing low-level Parquet encryption to Python user (or, maybe not)

2021-02-16 Thread Micah Kornfield
I think some of the comments might be conflicting. One of the concerns (covered in Ben's doc, and one I would need to refresh myself on before offering an opinion) was the threading model we expect in the library. On Tue, Feb 16, 2021 at 8:03 AM Antoine Pitrou wrote: > > Hi Gidon, > > Le 16/02/2

Arrow sync call February 17 at 12:00 US/Eastern, 17:00 UTC

2021-02-16 Thread Neal Richardson
Hi all, Reminder that our biweekly call is coming up at https://meet.google.com/vtm-teks-phx. All are welcome to join. Notes will be shared with the mailing list afterward. Neal

Re: Push force to master by mistake

2021-02-16 Thread Krisztián Szűcs
Also tried to force push to master: $ git push upstream master -f Alias tip: gpu master -f Total 0 (delta 0), reused 0 (delta 0), pack-reused 0 remote: error: GH006: Protected branch update failed for refs/heads/master. remote: error: Cannot force-push to this protected branch To https://github.co

Re: Push force to master by mistake

2021-02-16 Thread Krisztián Szűcs
According to the API, the master branch has been set as protected: curl \ -H "Accept: application/vnd.github.v3+json" \ https://api.github.com/repos/apache/arrow/branches { "name": "master", "commit": { "sha": "b89cddc3766676b5e48dad219259530c1706513f", "url": "https://

Re: Push force to master by mistake

2021-02-16 Thread Wes McKinney
This has been enabled — if things appear to be working correctly, could someone comment on the INFRA Jira so it can be closed? Thanks! On Sun, Feb 14, 2021 at 3:12 PM Wes McKinney wrote: > > https://issues.apache.org/jira/browse/INFRA-21421 > > On Sun, Feb 14, 2021 at 3:10 PM Jorge Cardoso Leitão

Re: Exposing low-level Parquet encryption to Python user (or, maybe not)

2021-02-16 Thread Antoine Pitrou
Hi Gidon, On 16/02/2021 at 16:42, Gidon Gershinsky wrote: > Regarding the high-level layer, I think it waits for a progress at > https://docs.google.com/document/d/11qz84ajysvVo5ZAV9mXKOeh6ay4-xgkBrubggCP5220/edit?usp=sharing > No activity there since last November. This is unfortunate, becaus

Re: [Rust] [DataFusion] Topic for next Rust Sync Call

2021-02-16 Thread Dominik Moritz
Somewhat related, I tried to compile DataFusion to WASM and it didn’t work because of some dependencies: https://issues.apache.org/jira/projects/ARROW/issues/ARROW-11615. I wonder whether DataFusion could have a feature flag for only shipping what is WASM compatible? On Feb 15, 2021 at 12:13:04,

Re: Exposing low-level Parquet encryption to Python user (or, maybe not)

2021-02-16 Thread Gidon Gershinsky
Regarding the high-level layer, I think it waits for a progress at https://docs.google.com/document/d/11qz84ajysvVo5ZAV9mXKOeh6ay4-xgkBrubggCP5220/edit?usp=sharing No activity there since last November. This is unfortunate, because Tham has put a lot of work in coding the high-level layer (and addr

Re: Exposing low-level Parquet encryption to Python user (or, maybe not)

2021-02-16 Thread Itamar Turner-Trauring
On Mon, Feb 15, 2021, at 2:49 PM, Micah Kornfield wrote: > Sorry I realized I had a typo in my email. We should definitely namespace > dangerous apis appropriately. Decryption doesn't seem necessarily dangerous? In any case, I will start with PR for decryption only and we can see how that goes

Re: [C++] adopting an SIMD library - xsimd / GPU optimization

2021-02-16 Thread Antoine Pitrou
Hi Joe, Thanks for your message, which is asking several questions at once. A bunch of answers below: 1) The number of contributors and contributor activity are an important metric to guess whether a project will receive continued maintenance over the years. On this record, nsimd seems to have

Re: Threading Improvements Proposal

2021-02-16 Thread Antoine Pitrou
Thanks for this useful writeup and the ensuing discussion. What you're proposing basically looks sound to me. Regards Antoine. On 16/02/2021 at 09:29, Weston Pace wrote: > Thanks for the input. I appreciate you both taking the time to look > through this. I'll consolidate the points her

[NIGHTLY] Arrow Build Report for Job nightly-2021-02-16-0

2021-02-16 Thread Crossbow
Arrow Build Report for Job nightly-2021-02-16-0 All tasks: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-02-16-0 Failed Tasks: - conda-linux-gcc-py37-aarch64: URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-02-16-0-drone-conda-linux

Re: Threading Improvements Proposal

2021-02-16 Thread Weston Pace
Thanks for the input. I appreciate you both taking the time to look through this. I'll consolidate the points here. From Wes: > I hypothesize that the bottom of the stack is a thread pool with a queue-per-thread that implements work stealing. Yes. I think there may be two pools (one for I/O