[NIGHTLY] Arrow Build Report for Job nightly-2020-02-14-0

2020-02-14 Thread Crossbow
Arrow Build Report for Job nightly-2020-02-14-0 All tasks: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-14-0 Failed Tasks: - conda-linux-gcc-py27: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-14-0-azure-conda-linux-gcc-py27 -

Thinking about 0.16.1 patch release

2020-02-14 Thread Wes McKinney
It seems inevitable that a few annoying regressions will pop up as 0.16.0 becomes more widely deployed, e.g. ARROW-7841 was just reported and patched. I created a 0.16.1 Fix Version in JIRA so that we can tag issues that we may want to cherry pick into a maint-0.16.x branch. We probably would want

Re: [VOTE] Adopt Arrow in-process C Data Interface specification

2020-02-14 Thread Wes McKinney
There is only 1 binding +1 vote so far; we should probably wait for three before closing the vote (it's possible that lazy consensus could be employed here, but there is not much harm in waiting a few more days). On Thu, Feb 13, 2020 at 8:15 PM Francois Saint-Jacques wrote: > > +1 > > On Thu, Feb 13, 2020

[jira] [Created] (ARROW-7859) [R] Minor patches for CRAN submission 0.16.0.2

2020-02-14 Thread Neal Richardson (Jira)
Neal Richardson created ARROW-7859: -- Summary: [R] Minor patches for CRAN submission 0.16.0.2 Key: ARROW-7859 URL: https://issues.apache.org/jira/browse/ARROW-7859 Project: Apache Arrow

[jira] [Created] (ARROW-7860) [C++] Support cast to/from halffloat

2020-02-14 Thread Neal Richardson (Jira)
Neal Richardson created ARROW-7860: -- Summary: [C++] Support cast to/from halffloat Key: ARROW-7860 URL: https://issues.apache.org/jira/browse/ARROW-7860 Project: Apache Arrow Issue Type:

Re: Thinking about 0.16.1 patch release

2020-02-14 Thread Neal Richardson
In terms of wheel verification, I added https://issues.apache.org/jira/browse/ARROW-7853 to the 0.16.1 tag. It's for CI using pip to install the wheels we build nightly. Obviously this is not required for 0.16.1 but it would make the release verification simpler because we would be running that

[jira] [Created] (ARROW-7861) [C++][Parquet] Add fuzz regression corpus for parquet reader

2020-02-14 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-7861: - Summary: [C++][Parquet] Add fuzz regression corpus for parquet reader Key: ARROW-7861 URL: https://issues.apache.org/jira/browse/ARROW-7861

[jira] [Created] (ARROW-7862) [R] Linux installation should run quieter by default

2020-02-14 Thread Neal Richardson (Jira)
Neal Richardson created ARROW-7862: -- Summary: [R] Linux installation should run quieter by default Key: ARROW-7862 URL: https://issues.apache.org/jira/browse/ARROW-7862 Project: Apache Arrow

Re: Schemaless serialization

2020-02-14 Thread Tewfik Zeghmi
Hi Micah, The primary language is Python. I'm hoping that the overhead of metadata is small compared to the schema information. Thank you! On Fri, Feb 14, 2020 at 3:07 PM Micah Kornfield wrote: > Hi Tewfik, > What language? it is possible to serialize them separately but the right

Re: Thinking about 0.16.1 patch release

2020-02-14 Thread Krisztián Szűcs
Actually, we are verifying the wheels after building them; the manylinux wheels are verified in fresh environments using Docker. We should first investigate why the Windows wheel issue didn't surface during its verification [1]. [1]:

Re: Schemaless serialization

2020-02-14 Thread Micah Kornfield
Hi Tewfik, What language? It is possible to serialize them separately, but the right hooks might not be exposed in all languages. There is still going to be a higher overhead for single-row values in Arrow compared to Avro due to metadata requirements. Thanks, Micah On Fri, Feb 14, 2020 at 1:33

Re: Thinking about 0.16.1 patch release

2020-02-14 Thread Neal Richardson
IIUC the difference in the verification script is that we construct the wheel file name, download it, and install the file, rather than have pip query a repo and download the latest: https://github.com/apache/arrow/blob/master/dev/release/verify-release-candidate-wheels.bat#L69-L72 Verification
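As a rough illustration of what "constructing the wheel file name" involves, the sketch below builds a name following the standard wheel naming convention (distribution, version, Python tag, ABI tag, platform tag). The helper `wheel_filename` and the tag values are hypothetical and are not taken from the verification script; optional build tags are ignored for simplicity.

```python
import re


def wheel_filename(distribution, version, python_tag, abi_tag, platform_tag):
    """Build a wheel file name from its components (no build tag)."""
    # Runs of characters other than alphanumerics, "_" and "." in the
    # distribution name are replaced with "_" in wheel file names.
    safe_name = re.sub(r"[^\w.]+", "_", distribution)
    return "{}-{}-{}-{}-{}.whl".format(
        safe_name, version, python_tag, abi_tag, platform_tag
    )


# Hypothetical example tags for a 64-bit Windows CPython 3.7 wheel.
name = wheel_filename("pyarrow", "0.16.0", "cp37", "cp37m", "win_amd64")
print(name)
```

Installing a file named this way directly exercises a specific artifact, whereas letting pip query a repository also exercises pip's own selection of a compatible wheel, which is the difference the message describes.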

Schemaless serialization

2020-02-14 Thread Tewfik Zeghmi
Hi, I have a use case of creating a feature store to serve low latency traffic. Given a key, we need the ability to save and read a feature vector in a low latency Key Value store. Serializing an Arrow table with one row takes 1344 bytes, while the same singular row serialized with Avro

[jira] [Created] (ARROW-7863) [C++][Python][CI] Ensure running HDFS related tests

2020-02-14 Thread Kouhei Sutou (Jira)
Kouhei Sutou created ARROW-7863: --- Summary: [C++][Python][CI] Ensure running HDFS related tests Key: ARROW-7863 URL: https://issues.apache.org/jira/browse/ARROW-7863 Project: Apache Arrow Issue

[jira] [Created] (ARROW-7858) [C++][Python] Support casting an Extension type to its storage type

2020-02-14 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-7858: Summary: [C++][Python] Support casting an Extension type to its storage type Key: ARROW-7858 URL: https://issues.apache.org/jira/browse/ARROW-7858

Re: [NIGHTLY] Arrow Build Report for Job nightly-2020-02-13-0

2020-02-14 Thread Joris Van den Bossche
I opened an issue for the pandas-master failure: https://issues.apache.org/jira/browse/ARROW-7857 On Thu, 13 Feb 2020 at 21:08, Crossbow wrote: > > Arrow Build Report for Job nightly-2020-02-13-0 > > All tasks: > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-13-0 > >

[jira] [Created] (ARROW-7857) [Python] Failing test with pandas master for extension type conversion

2020-02-14 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-7857: Summary: [Python] Failing test with pandas master for extension type conversion Key: ARROW-7857 URL: https://issues.apache.org/jira/browse/ARROW-7857