Re: [C++] AppendValues for numeric types with invalid slots omitted from source

2020-10-19 Thread Micah Kornfield
For reference, the code that Parquet uses to space out values is in rle_decoder.h [1]. It uses both BitBlockCounter and BitRunReader. BitBlockCounter is faster than BitRunReader, but in micro-benchmarks BitRunReader still provides some benefit when nulls are fairly infrequent. It is worth noting
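To make "spacing out" concrete, here is a minimal, self-contained sketch of the block-wise idea. It does not use Arrow's BitBlockCounter or BitRunReader classes; the helpers PopCount64, LoadBits and SpaceOut are hypothetical names invented for illustration. It assumes an LSB-first validity bitmap and that the dense input holds exactly one value per set bit.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not Arrow's API): count the set bits in a 64-bit word.
inline size_t PopCount64(uint64_t word) {
  size_t count = 0;
  while (word != 0) {
    word &= word - 1;  // clear the lowest set bit
    ++count;
  }
  return count;
}

// Hypothetical helper: gather n (<= 64) validity bits starting at bit_offset.
// Assumes LSB-first bit order within each byte.
inline uint64_t LoadBits(const std::vector<uint8_t>& bitmap, size_t bit_offset,
                         size_t n) {
  uint64_t word = 0;
  for (size_t i = 0; i < n; ++i) {
    const size_t bit = bit_offset + i;
    if ((bitmap[bit / 8] >> (bit % 8)) & 1) word |= (uint64_t{1} << i);
  }
  return word;
}

// Expand densely packed values (nulls omitted) into num_slots slots.
// Precondition: dense.size() equals the number of set bits in the first
// num_slots bits of validity. Null slots are left as 0.
std::vector<int32_t> SpaceOut(const std::vector<int32_t>& dense,
                              const std::vector<uint8_t>& validity,
                              size_t num_slots) {
  assert(validity.size() * 8 >= num_slots);
  std::vector<int32_t> out(num_slots, 0);
  size_t src = 0;
  for (size_t pos = 0; pos < num_slots; pos += 64) {
    const size_t n = std::min<size_t>(64, num_slots - pos);
    const uint64_t word = LoadBits(validity, pos, n);
    const size_t valid = PopCount64(word);
    if (valid == n) {
      // Fast path: the whole block is valid, so copy it in one go.
      assert(src + n <= dense.size());
      std::copy(dense.begin() + src, dense.begin() + src + n,
                out.begin() + pos);
      src += n;
    } else if (valid != 0) {
      // Mixed block: fall back to testing each bit.
      for (size_t i = 0; i < n; ++i) {
        if ((word >> i) & 1) {
          assert(src < dense.size());
          out[pos + i] = dense[src++];
        }
      }
    }
    // valid == 0: the whole block is null, nothing to copy.
  }
  assert(src == dense.size());
  return out;
}
```

For example, SpaceOut({10, 20, 30, 40}, {0x0D, 0x02}, 10) places the four values in slots 0, 2, 3 and 9 and leaves the other slots at 0. A run-based reader would instead report consecutive runs of valid and null bits, which may pay off when nulls are rare and runs are long.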

Re: [Discuss] Provide pluggable APIs to support user customized compression codec

2020-10-19 Thread Wes McKinney
What is the purpose of the key-value metadata aside from automatically loading the plugin library if it's available (which seems like a security risk if reading a data file can cause a shared library to be loaded dynamically)? Is it necessary to have that metadata for it to be safe to use the

Re: Arrow C Data Interface

2020-10-19 Thread Wes McKinney
Hi Pasha, Copying dev@. You can see how DuckDB interacts with the pyarrow data structures through the C interface here; maybe it's helpful: https://github.com/cwida/duckdb/blob/master/tools/pythonpkg/duckdb_python.cpp We haven't defined a Python API (either C API level or Python API level) so that

Re: [VOTE] Release Apache Arrow 2.0.0 - RC2

2020-10-19 Thread Sutou Kouhei
I'll work on "update msys2". In "Re: [VOTE] Release Apache Arrow 2.0.0 - RC2" on Tue, 20 Oct 2020 00:52:24 +0200, Krisztián Szűcs wrote: > Current status: > 1. [done] rebase master > 2. [done] upload source > 3. [done] upload binaries > 4. [in-pr] update website > 5. [done] upload

Re: [VOTE] Release Apache Arrow 2.0.0 - RC2

2020-10-19 Thread Krisztián Szűcs
Current status: 1. [done] rebase master 2. [done] upload source 3. [done] upload binaries 4. [in-pr] update website 5. [done] upload ruby gems 6. [done] upload js packages 8. [done] upload C# packages 9. [done] upload rust crates 10. [in-pr] update conda recipes 11. [done] upload wheels to

Re: [VOTE] Release Apache Arrow 2.0.0 - RC2

2020-10-19 Thread Andy Grove
The Rust crates (arrow, arrow-flight, parquet, datafusion) have been uploaded to crates.io. The new parquet_derive crate could not be published: https://issues.apache.org/jira/browse/ARROW-10350 On Mon, Oct 19, 2020 at 3:40 PM Krisztián Szűcs wrote: > Thank You Andy! > > Current status: > 1.

Re: [VOTE] Release Apache Arrow 2.0.0 - RC2

2020-10-19 Thread Krisztián Szűcs
Thank You Andy! Current status: 1. [done] rebase master 2. [done] upload source 3. [done] upload binaries 4. [in-pr] update website 5. [done] upload ruby gems 6. [ ] upload js packages 8. [done] upload C# packages 9. [andygrove] upload rust crates 10. [in-pr] update conda recipes 11.

Re: [VOTE] Release Apache Arrow 2.0.0 - RC2

2020-10-19 Thread Krisztián Szűcs
Current status of post release tasks: 1. [done] rebase master 2. [done] upload source 3. [done] upload binaries 4. [kszucs] update website 5. [done] upload ruby gems 6. [ ] upload js packages 8. [done] upload C# packages 9. [ ] upload rust crates 10. [ ] update conda recipes 11. [kszucs]

Re: [VOTE] Release Apache Arrow 2.0.0 - RC2

2020-10-19 Thread Krisztián Szűcs
The VOTE carries with 3 binding +1 votes, 1 non-binding +1 vote and 1 non-binding +0 vote. I'm starting the post-release tasks and will keep you posted about the remaining tasks. Thanks everyone! On Mon, Oct 19, 2020 at 7:45 PM Uwe L. Korn wrote: > > +0 from my side, I see no big issues. I was

Re: [VOTE] Release Apache Arrow 2.0.0 - RC2

2020-10-19 Thread Uwe L. Korn
+0 from my side; I see no big issues. I was able to verify the wheels, but the source verification fails due to the llvm package issues on brew, so I'm not able to +1 this time. Uwe On Mon, Oct 19, 2020, at 7:38 PM, Krisztián Szűcs wrote: > On Mon, Oct 19, 2020 at 5:32 PM Uwe L. Korn wrote: > >

Re: [VOTE] Release Apache Arrow 2.0.0 - RC2

2020-10-19 Thread Krisztián Szűcs
On Mon, Oct 19, 2020 at 5:32 PM Uwe L. Korn wrote: > > > > On Mon, Oct 19, 2020, at 5:07 PM, Neal Richardson wrote: > > I wouldn't expect the default S3 region to depend on locale. It does > > depend > > on aws-sdk-cpp version, as we saw in ARROW-10066; see > >

Re: [VOTE] Release Apache Arrow 2.0.0 - RC2

2020-10-19 Thread Uwe L. Korn
On Mon, Oct 19, 2020, at 5:07 PM, Neal Richardson wrote: > I wouldn't expect the default S3 region to depend on locale. It does > depend > on aws-sdk-cpp version, as we saw in ARROW-10066; see > https://aws.amazon.com/blogs/developer/aws-sdk-for-c-version-1-8-developer-preview/ > for how the

Re: [VOTE] Release Apache Arrow 2.0.0 - RC2

2020-10-19 Thread Neal Richardson
I wouldn't expect the default S3 region to depend on locale. It does depend on aws-sdk-cpp version, as we saw in ARROW-10066; see https://aws.amazon.com/blogs/developer/aws-sdk-for-c-version-1-8-developer-preview/ for how the new version determines the default version. And a change in Homebrew

Re: [Discuss] Provide pluggable APIs to support user customized compression codec

2020-10-19 Thread Antoine Pitrou
Hi, Again, I think the whole plugin concept falls outside of Arrow. It would be much simpler to allow people to override the compression codec factory. Then applications can define "plugins" if they want to. Regards Antoine. On 19/10/2020 at 03:30, Xie, Qi wrote: > Hi, all > >
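As a rough illustration of what an overridable codec factory could look like, here is a minimal sketch. It is not Arrow's API: Codec, CodecFactory, CodecRegistry, IdentityCodec and ReverseCodec are all hypothetical names, and the "compression" is a placeholder so the example stays self-contained.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical codec interface (an assumption, not Arrow's Codec class).
class Codec {
 public:
  virtual ~Codec() = default;
  virtual std::string Compress(const std::string& input) const = 0;
};

// Placeholder "built-in" codec: returns the input unchanged.
class IdentityCodec : public Codec {
 public:
  std::string Compress(const std::string& input) const override {
    return input;
  }
};

// Placeholder application-supplied codec: reverses the input so the
// override is observable. A real one would call into a custom library.
class ReverseCodec : public Codec {
 public:
  std::string Compress(const std::string& input) const override {
    return std::string(input.rbegin(), input.rend());
  }
};

using CodecFactory = std::function<std::unique_ptr<Codec>()>;

// Hypothetical registry: maps a codec name to the factory that creates it.
class CodecRegistry {
 public:
  // Registering an existing name replaces the previous factory.
  void Register(const std::string& name, CodecFactory factory) {
    factories_[name] = std::move(factory);
  }

  // Returns nullptr when no factory is registered under this name.
  std::unique_ptr<Codec> Create(const std::string& name) const {
    auto it = factories_.find(name);
    if (it == factories_.end()) return nullptr;
    return it->second();
  }

 private:
  std::map<std::string, CodecFactory> factories_;
};
```

An application would register its own factory under an existing name, for instance reg.Register("gzip", [] { return std::unique_ptr<Codec>(new ReverseCodec()); }), and every later reg.Create("gzip") would return the replacement. Nothing is loaded dynamically by the library itself; whether and how to load a shared library stays an application decision.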

Re: [VOTE] Release Apache Arrow 2.0.0 - RC2

2020-10-19 Thread Uwe L. Korn
Trying to verify on macOS but ran into the following two issues: * The default S3 region is "eu-central-1" for me despite setting LANG=C * llvm@10 is not available for Homebrew anymore, see also https://github.com/Homebrew/homebrew-core/pull/62798#issuecomment-711606370