[jira] [Created] (ARROW-7306) [C++] Add Result-returning version of FileSystemFromUri

2019-12-03 Thread Kenta Murata (Jira)
Kenta Murata created ARROW-7306: --- Summary: [C++] Add Result-returning version of FileSystemFromUri Key: ARROW-7306 URL: https://issues.apache.org/jira/browse/ARROW-7306 Project: Apache Arrow

[jira] [Created] (ARROW-7305) High memory usage writing pyarrow.Table to parquet

2019-12-03 Thread Bogdan Klichuk (Jira)
Bogdan Klichuk created ARROW-7305: - Summary: High memory usage writing pyarrow.Table to parquet Key: ARROW-7305 URL: https://issues.apache.org/jira/browse/ARROW-7305 Project: Apache Arrow

Re: [C++] CSV string column category to dictionary/indices?

2019-12-03 Thread ntfs hard
Hello, Thank you for your advice! I'll try to adapt it to my code. Best, -- On Tue, Dec 3, 2019 at 17:16, Antoine Pitrou wrote: > > Agreed. I've opened https://issues.apache.org/jira/browse/ARROW-7302 to > track it. > > Regards > > Antoine. > > > On 03/12/2019 at 04:55, Wes McKinney wrote: > > An

Re: [NIGHTLY] Arrow Build Report for Job nightly-2019-12-03-0

2019-12-03 Thread Krisztián Szűcs
I'm going to wait until tomorrow, and fall back to the previous organization depending on INFRA's answer, or the absence of one. On Tue, Dec 3, 2019 at 10:39 PM Wes McKinney wrote: > Can we point to an image outside of the Apache DockerHub org in the > meantime? This is part of why I was asking

Re: [NIGHTLY] Arrow Build Report for Job nightly-2019-12-03-0

2019-12-03 Thread Wes McKinney
Can we point to an image outside of the Apache DockerHub org in the meantime? This is part of why I was asking in that JIRA why we need to be dependent on INFRA to set permissions for us on development matters. On Tue, Dec 3, 2019 at 3:31 PM Krisztián Szűcs wrote: > > I'm waiting for

Re: [NIGHTLY] Arrow Build Report for Job nightly-2019-12-03-0

2019-12-03 Thread Krisztián Szűcs
I'm waiting for https://issues.apache.org/jira/browse/INFRA-19499 On Tue, Dec 3, 2019 at 10:07 PM Neal Richardson wrote: > Is there a jira for this yet? > > On Tue, Dec 3, 2019 at 12:31 PM Wes McKinney wrote: > > > The manylinux builds are failing because of a missing Docker image > > > > $

Re: [NIGHTLY] Arrow Build Report for Job nightly-2019-12-03-0

2019-12-03 Thread Neal Richardson
Is there a jira for this yet? On Tue, Dec 3, 2019 at 12:31 PM Wes McKinney wrote: > The manylinux builds are failing because of a missing Docker image > > $ docker-compose pull $BUILD_IMAGE > Pulling centos-python-manylinux2010 ... > ERROR: for centos-python-manylinux2010 manifest for >

Re: [DISCUSS] C data interface updated

2019-12-03 Thread Wes McKinney
I started a vote and left comments (mostly clarifications) on the specification PR. I'm reviewing the C++ patch and will post comments when I can. On Wed, Nov 13, 2019 at 4:39 AM Antoine Pitrou wrote: > > > Yes, we probably need to hold a vote at some point. Perhaps wait > a week or so

Re: [NIGHTLY] Arrow Build Report for Job nightly-2019-12-03-0

2019-12-03 Thread Wes McKinney
The manylinux builds are failing because of a missing Docker image $ docker-compose pull $BUILD_IMAGE Pulling centos-python-manylinux2010 ... ERROR: for centos-python-manylinux2010 manifest for apache/arrow-dev:amd64-centos-6.10-python-manylinux2010 not found manifest for

[jira] [Created] (ARROW-7304) clang-tidy diagnostics not emitted for most headers

2019-12-03 Thread Elvis Stansvik (Jira)
Elvis Stansvik created ARROW-7304: - Summary: clang-tidy diagnostics not emitted for most headers Key: ARROW-7304 URL: https://issues.apache.org/jira/browse/ARROW-7304 Project: Apache Arrow

Re: [VOTE] Adopt Arrow in-process C Data Interface specification

2019-12-03 Thread Neal Richardson
+1 (non-binding) On Tue, Dec 3, 2019 at 10:56 AM Wes McKinney wrote: > +1 (binding) > > On Tue, Dec 3, 2019 at 12:54 PM Wes McKinney wrote: > > > > hello, > > > > We have been discussing the creation of a minimalist C-based data > > interface for applications to exchange Arrow columnar data

[jira] [Created] (ARROW-7303) [C++] Refactor benchmarks to use new Result APIs

2019-12-03 Thread Ben Kietzman (Jira)
Ben Kietzman created ARROW-7303: --- Summary: [C++] Refactor benchmarks to use new Result APIs Key: ARROW-7303 URL: https://issues.apache.org/jira/browse/ARROW-7303 Project: Apache Arrow Issue

Re: [VOTE] Adopt Arrow in-process C Data Interface specification

2019-12-03 Thread Wes McKinney
+1 (binding) On Tue, Dec 3, 2019 at 12:54 PM Wes McKinney wrote: > > hello, > > We have been discussing the creation of a minimalist C-based data > interface for applications to exchange Arrow columnar data structures > with each other. Some notable features of this interface include: > > * A

[VOTE] Adopt Arrow in-process C Data Interface specification

2019-12-03 Thread Wes McKinney
hello, We have been discussing the creation of a minimalist C-based data interface for applications to exchange Arrow columnar data structures with each other. Some notable features of this interface include: * A small amount of header-only C code can be copied into downstream applications, no

Re: predict whether pa.array() will produce ChunkedArray

2019-12-03 Thread Wes McKinney
hi John, The documentation says array : pyarrow.Array or pyarrow.ChunkedArray A ChunkedArray instead of an Array is returned if: - the object data overflowed binary storage. - the object's ``__arrow_array__`` protocol method returned a chunked array.
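The first documented condition ("the object data overflowed binary storage") can be approximated ahead of time with a rough, conservative check. The sketch below is not pyarrow's own logic: it assumes the limit comes from the 32-bit offsets used by the `binary` and `string` types, so that a single array holds at most 2**31 - 1 bytes of value data. The helper names `total_binary_bytes` and `may_be_chunked` and the constant `MAX_BINARY_BYTES` are hypothetical, and the sketch does not cover the `__arrow_array__` case.

```python
# Assumption (not read from pyarrow): binary/string arrays use 32-bit
# offsets, so one array can hold at most 2**31 - 1 bytes of value data.
MAX_BINARY_BYTES = 2**31 - 1


def total_binary_bytes(values, encoding="utf-8"):
    """Sum the encoded byte length of every non-null str/bytes value."""
    total = 0
    for value in values:
        if value is None:
            continue  # nulls contribute no value bytes
        if isinstance(value, str):
            value = value.encode(encoding)
        total += len(value)
    return total


def may_be_chunked(values, limit=MAX_BINARY_BYTES):
    """Conservative guess: True if the data cannot fit in one binary array."""
    return total_binary_bytes(values) > limit
```

When the actual result type matters, checking it after the call with `isinstance(result, pa.ChunkedArray)` is more reliable than any prediction, since the internal threshold may change between versions.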

predict whether pa.array() will produce ChunkedArray

2019-12-03 Thread John Muehlhausen
Given input data and a type, how do we predict whether array() will produce ChunkedArray? I figure the formula involves: - the length of input - the type, and max length (to be conservative) for variable length types - some constant(s) that Arrow knows internally... that may change in the future?

Re: [C++] CSV string column category to dictionary/indices?

2019-12-03 Thread Antoine Pitrou
Agreed. I've opened https://issues.apache.org/jira/browse/ARROW-7302 to track it. Regards Antoine. On 03/12/2019 at 04:55, Wes McKinney wrote: > An option was recently added to dictionary encode all string columns > >

[jira] [Created] (ARROW-7302) [C++] CSV: allow converting a column to a specific dictionary type

2019-12-03 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-7302: - Summary: [C++] CSV: allow converting a column to a specific dictionary type Key: ARROW-7302 URL: https://issues.apache.org/jira/browse/ARROW-7302 Project: Apache

[NIGHTLY] Arrow Build Report for Job nightly-2019-12-03-0

2019-12-03 Thread Crossbow
Arrow Build Report for Job nightly-2019-12-03-0 All tasks: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-12-03-0 Failed Tasks: - test-debian-10-rust-nightly-2019-09-25: URL: