Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-09-21 Thread Fan Liya
Hi Micah, Thanks for your summary. Your proposal sounds reasonable to me. Best, Liya Fan On Tue, Sep 22, 2020 at 1:16 PM Micah Kornfield wrote: > I wanted to give this thread a bump, does the proposal I made below sound > reasonable? > > On Sun, Sep 13, 2020 at 9:57 PM Micah Kornfield > wrot

Re: Arrow as a streaming format

2020-09-21 Thread Micah Kornfield
> > Is there any chance you could point me to those abstractions so that I may > have a look and play around with them? Sorry if there doesn't exist anything in Java (and I realize that might have been what you were expecting). I was thinking of C++/Python which have ChunkedArray classes. The c

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-09-21 Thread Micah Kornfield
I wanted to give this thread a bump. Does the proposal I made below sound reasonable? On Sun, Sep 13, 2020 at 9:57 PM Micah Kornfield wrote: > If I read the responses so far it seems like the following might be a good > compromise/summary: > > 1. It does not seem too invasive to support native e

Re: Fix for - TypeError: to_pandas() got an unexpected keyword argument 'timestamp_as_object'

2020-09-21 Thread Micah Kornfield
I commented on the issue. I don't believe this is a pyarrow bug. It is an artifact of having conflicting versions of pyarrow in the environment. My comment on the issue: I think this might be due to conflicting versions of arrow in the notebook > environment? I believe this is caused by python-b
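One way to check for the kind of version conflict described above is to print which pyarrow distribution is installed and which file the import would actually resolve to. The following is only a minimal sketch using the Python standard library (Python 3.8+); the helper name `describe_install` is hypothetical and not part of pyarrow or of the linked issue.

```python
import importlib.metadata
import importlib.util


def describe_install(dist_name, module_name=None):
    """Return (installed_version, import_origin) for a package.

    Hypothetical helper for illustration only. Either element is None
    when the distribution metadata or the importable module is not found.
    """
    module_name = module_name or dist_name
    try:
        # Version recorded in the installed distribution's metadata.
        version = importlib.metadata.version(dist_name)
    except importlib.metadata.PackageNotFoundError:
        version = None
    # Location the import system would load the module from.
    spec = importlib.util.find_spec(module_name)
    origin = spec.origin if spec is not None else None
    return version, origin


if __name__ == "__main__":
    version, origin = describe_install("pyarrow")
    print(f"pyarrow version: {version}")
    print(f"pyarrow imported from: {origin}")
```

Running this inside the notebook environment and again in a plain shell, then comparing the two results, may show whether an older copy of pyarrow is shadowing the expected one.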

Re: Fix for - TypeError: to_pandas() got an unexpected keyword argument 'timestamp_as_object'

2020-09-21 Thread Wes McKinney
The patch https://github.com/apache/arrow/pull/7169 in theory should not have broken downstream projects. Can someone open a JIRA issue? On Mon, Sep 21, 2020 at 8:39 PM BG Srinivas wrote: > > Hi > > Is this a known issue ? Is there a fix for this issue planned on 1.0.1 ? > > https://github.com/Go

Fix for - TypeError: to_pandas() got an unexpected keyword argument 'timestamp_as_object'

2020-09-21 Thread BG Srinivas
Hi, Is this a known issue? Is there a fix for this issue planned for 1.0.1? https://github.com/GoogleCloudPlatform/python-docs-samples/issues/4573 Thanks, BG Srinivas

Re: [DISCUSS] Rethinking our approach to scheduling CPU and IO work in C++?

2020-09-21 Thread Ben Kietzman
FWIW boost.coroutine and boost.asio provide composable coroutines, non blocking IO, and configurable scheduling for CPU work out of the box. The boost libraries are not lightweight but they are robust and cross platform, so I think asio is worth consideration. On Sat, Sep 19, 2020 at 8:22 PM Wes

Re: PyArrow: Incrementally using ParquetWriter without keeping entire dataset in memory (large than memory parquet files)

2020-09-21 Thread Niklas B
Hi, I’ve tried both with little success. I made a JIRA: https://issues.apache.org/jira/browse/ARROW-10052 Looking at it now when I've made a minimal example I see something I didn't see/realize before which is that while the memory usage is i

[NIGHTLY] Arrow Build Report for Job nightly-2020-09-21-0

2020-09-21 Thread Crossbow
Arrow Build Report for Job nightly-2020-09-21-0 All tasks: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-09-21-0 Failed Tasks: - conda-linux-gcc-py36-aarch64: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-09-21-0-drone-conda-linux-gcc-py3