Re: Support for TIMESTAMP_NANOS in parquet-cpp

2018-11-08 Thread Roman Karlstetter
I would be willing to implement that. I’ll probably need some advice on my patch though, as I’m fairly new to the parquet code. Roman From: Wes McKinney Sent: Thursday, November 8, 2018 23:22 To: dev@arrow.apache.org Subject: Re: Support for TIMESTAMP_NANOS in parquet-cpp I opened an

[jira] [Created] (ARROW-3733) [GLib] Add to_string() to GArrowTable and GArrowColumn

2018-11-08 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-3733: --- Summary: [GLib] Add to_string() to GArrowTable and GArrowColumn Key: ARROW-3733 URL: https://issues.apache.org/jira/browse/ARROW-3733 Project: Apache Arrow

Re: Assign/update : NA bitmap vs sentinel

2018-11-08 Thread Phillip Cloud
There is one database that I'm aware of that uses sentinels _and_ supports complex types with missing values: Kx's KDB+. This has led to some seriously strange choices like the ASCII space character being used as the sentinel value for strings. See https://code.kx.com/wiki/Reference/Datatypes for
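To make the trade-off in this thread concrete, here is a minimal pure-Python sketch contrasting the two null representations. It is an illustration only, not Arrow's or KDB+'s implementation: the function names, the choice of the minimum int64 as the integer sentinel, and the sample data are all assumptions. The bitmap follows the least-significant-bit-first convention that Arrow uses for validity bitmaps, and the string case mirrors the space-as-sentinel choice described in the message above.

```python
# Illustrative sketch only: names and the int64 sentinel are assumptions.
INT_SENTINEL = -2**63   # hypothetical "missing" marker for integers
STR_SENTINEL = " "      # space as the string sentinel, as described for KDB+


def encode_with_sentinel(values, sentinel):
    """Replace None with an in-band sentinel value."""
    return [sentinel if v is None else v for v in values]


def nulls_from_sentinel(encoded, sentinel):
    """Recover the null mask by comparing against the sentinel."""
    return [v == sentinel for v in encoded]


def build_validity_bitmap(values):
    """Pack one validity bit per value, least-significant bit first."""
    bitmap = bytearray((len(values) + 7) // 8)
    for i, v in enumerate(values):
        if v is not None:
            bitmap[i // 8] |= 1 << (i % 8)
    return bytes(bitmap)


def nulls_from_bitmap(bitmap, length):
    """Recover the null mask from the out-of-band bitmap."""
    return [((bitmap[i // 8] >> (i % 8)) & 1) == 0 for i in range(length)]


ints = [1, None, 3]
int_nulls_sentinel = nulls_from_sentinel(
    encode_with_sentinel(ints, INT_SENTINEL), INT_SENTINEL
)
int_nulls_bitmap = nulls_from_bitmap(build_validity_bitmap(ints), len(ints))

# A genuine single-space string collides with the string sentinel,
# while the bitmap keeps it distinct from a missing value.
strs = [" ", None, "a"]
str_nulls_sentinel = nulls_from_sentinel(
    encode_with_sentinel(strs, STR_SENTINEL), STR_SENTINEL
)
str_nulls_bitmap = nulls_from_bitmap(build_validity_bitmap(strs), len(strs))
```

In this sketch the sentinel approach reports the real `" "` entry as missing, because the marker lives in the same value space as the data; the bitmap stores validity separately and so has no such collision.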

[jira] [Created] (ARROW-3732) [R] Add functions to write RecordBatch or Schema to Message value, then read back

2018-11-08 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-3732: --- Summary: [R] Add functions to write RecordBatch or Schema to Message value, then read back Key: ARROW-3732 URL: https://issues.apache.org/jira/browse/ARROW-3732

Re: Support for TIMESTAMP_NANOS in parquet-cpp

2018-11-08 Thread Wes McKinney
I opened an issue here https://issues.apache.org/jira/browse/ARROW-3729. Patches would be welcome. On Sat, Oct 20, 2018 at 12:55 PM Wes McKinney wrote: > > hi Roman, > > We would welcome adding such a document to the Arrow wiki > https://cwiki.apache.org/confluence/display/ARROW. As to your other

[jira] [Created] (ARROW-3729) [C++] Support for writing TIMESTAMP_NANOS Parquet metadata

2018-11-08 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-3729: --- Summary: [C++] Support for writing TIMESTAMP_NANOS Parquet metadata Key: ARROW-3729 URL: https://issues.apache.org/jira/browse/ARROW-3729 Project: Apache Arrow

[jira] [Created] (ARROW-3731) [R] R API for reading and writing Parquet files

2018-11-08 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-3731: --- Summary: [R] R API for reading and writing Parquet files Key: ARROW-3731 URL: https://issues.apache.org/jira/browse/ARROW-3731 Project: Apache Arrow Issue

[jira] [Created] (ARROW-3730) [Python] Output a representation of pyarrow.Schema that can be used to reconstruct a schema in a script

2018-11-08 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-3730: --- Summary: [Python] Output a representation of pyarrow.Schema that can be used to reconstruct a schema in a script Key: ARROW-3730 URL:

Re: [ANNOUNCE] New Arrow PMC member: Krisztián Szűcs

2018-11-08 Thread Li Jin
Congrats! On Thu, Nov 8, 2018 at 4:02 PM Uwe L. Korn wrote: > Congratulations Krisztián! > > On Thu, Nov 8, 2018, at 9:56 PM, Philipp Moritz wrote: > > Congrats and welcome Krisztián! > > > > On Thu, Nov 8, 2018 at 11:48 AM Wes McKinney > wrote: > > > > > The Project Management Committee (PMC)

Re: [ANNOUNCE] New Arrow committers: Romain François, Sebastien Binet, Yosuke Shiro

2018-11-08 Thread Li Jin
Welcome! On Thu, Nov 8, 2018 at 4:01 PM Uwe L. Korn wrote: > Welcome to all of you! > > On Thu, Nov 8, 2018, at 8:56 PM, Wes McKinney wrote: > > On behalf of the Arrow PMC, I'm happy to announce that Romain > > François, Sebastien Binet, and Yosuke Shiro have been invited to be > > committers

Re: Assign/update : NA bitmap vs sentinel

2018-11-08 Thread Wes McKinney
hey Matt, Thanks for giving your perspective on the mailing list. My objective in writing about this recently (http://wesmckinney.com/blog/bitmaps-vs-sentinel-values/, though I need to update since the sentinel case can be done more efficiently than what's there now) was to help dispel the

[jira] [Created] (ARROW-3717) Add GCSFSWrapper for DaskFileSystem

2018-11-08 Thread Emmett McQuinn (JIRA)
Emmett McQuinn created ARROW-3717: - Summary: Add GCSFSWrapper for DaskFileSystem Key: ARROW-3717 URL: https://issues.apache.org/jira/browse/ARROW-3717 Project: Apache Arrow Issue Type: New

[jira] [Created] (ARROW-3720) [GLib] Use "indices" instead of "indexes"

2018-11-08 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-3720: --- Summary: [GLib] Use "indices" instead of "indexes" Key: ARROW-3720 URL: https://issues.apache.org/jira/browse/ARROW-3720 Project: Apache Arrow Issue Type:

[jira] [Created] (ARROW-3723) [Plasma] [Ruby] Add Ruby bindings of Plasma

2018-11-08 Thread Yosuke Shiro (JIRA)
Yosuke Shiro created ARROW-3723: --- Summary: [Plasma] [Ruby] Add Ruby bindings of Plasma Key: ARROW-3723 URL: https://issues.apache.org/jira/browse/ARROW-3723 Project: Apache Arrow Issue Type:

[jira] [Created] (ARROW-3725) [GLib] Add field readers to GArrowStructDataType

2018-11-08 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-3725: --- Summary: [GLib] Add field readers to GArrowStructDataType Key: ARROW-3725 URL: https://issues.apache.org/jira/browse/ARROW-3725 Project: Apache Arrow Issue

Re: [ANNOUNCE] New Arrow PMC member: Krisztián Szűcs

2018-11-08 Thread Uwe L. Korn
Congratulations Krisztián! On Thu, Nov 8, 2018, at 9:56 PM, Philipp Moritz wrote: > Congrats and welcome Krisztián! > > On Thu, Nov 8, 2018 at 11:48 AM Wes McKinney wrote: > > > The Project Management Committee (PMC) for Apache Arrow has invited > > Krisztián Szűcs to become a PMC member and

Re: [ANNOUNCE] New Arrow committers: Romain François, Sebastien Binet, Yosuke Shiro

2018-11-08 Thread Uwe L. Korn
Welcome to all of you! On Thu, Nov 8, 2018, at 8:56 PM, Wes McKinney wrote: > On behalf of the Arrow PMC, I'm happy to announce that Romain > François, Sebastien Binet, and Yosuke Shiro have been invited to be > committers on the project. > > Welcome, and thanks for your contributions!

Re: [ANNOUNCE] New Arrow committers: Romain François, Sebastien Binet, Yosuke Shiro

2018-11-08 Thread Antoine Pitrou
It's nice to have new people onboard. Welcome everyone :-) On 08/11/2018 at 20:56, Wes McKinney wrote: > On behalf of the Arrow PMC, I'm happy to announce that Romain > François, Sebastien Binet, and Yosuke Shiro have been invited to be > committers on the project. > > Welcome, and thanks

Re: [ANNOUNCE] New Arrow PMC member: Krisztián Szűcs

2018-11-08 Thread Philipp Moritz
Congrats and welcome Krisztián! On Thu, Nov 8, 2018 at 11:48 AM Wes McKinney wrote: > The Project Management Committee (PMC) for Apache Arrow has invited > Krisztián Szűcs to become a PMC member and we are pleased to announce > that he has accepted. > > Congratulations and welcome, Krisztián! >

[jira] [Created] (ARROW-3724) [GLib] Update gitignore

2018-11-08 Thread Yosuke Shiro (JIRA)
Yosuke Shiro created ARROW-3724: --- Summary: [GLib] Update gitignore Key: ARROW-3724 URL: https://issues.apache.org/jira/browse/ARROW-3724 Project: Apache Arrow Issue Type: Improvement

[ANNOUNCE] New Arrow PMC member: Krisztián Szűcs

2018-11-08 Thread Wes McKinney
The Project Management Committee (PMC) for Apache Arrow has invited Krisztián Szűcs to become a PMC member and we are pleased to announce that he has accepted. Congratulations and welcome, Krisztián!

Re: Creating Buffer directly from pointer/length

2018-11-08 Thread Wes McKinney
I opened https://issues.apache.org/jira/browse/ARROW-3727 about adding examples. I will mention adding an example for CUDA as well. On Thu, Nov 8, 2018 at 2:30 PM Randy Zwitch wrote: > > Thanks Uwe, Wes, Pearu and Antoine. This is in the pyarrow docs, but no > example, so I'll open up a JIRA so that

Re: Creating Buffer directly from pointer/length

2018-11-08 Thread Randy Zwitch
Thanks Uwe, Wes, Pearu and Antoine. This is in the pyarrow docs, but no example, so I'll open up a JIRA so that it might be more obvious to the next person. On 11/8/18 12:59 PM, Uwe L. Korn wrote: Hello Randy, you are looking for

Re: Creating Buffer directly from pointer/length

2018-11-08 Thread Uwe L. Korn
Hello Randy, you are looking for https://arrow.apache.org/docs/python/generated/pyarrow.foreign_buffer.html#pyarrow.foreign_buffer This takes an address, size and a Python object for having a reference on the object. In your case the last one can be None. Note that this will not do a copy and

[jira] [Created] (ARROW-3728) Merging Parquet Files - Pandas Meta in Schema Mismatch

2018-11-08 Thread Micah Williamson (JIRA)
Micah Williamson created ARROW-3728: --- Summary: Merging Parquet Files - Pandas Meta in Schema Mismatch Key: ARROW-3728 URL: https://issues.apache.org/jira/browse/ARROW-3728 Project: Apache Arrow

[jira] [Created] (ARROW-3726) [Rust] CSV Reader & Writer

2018-11-08 Thread nevi_me (JIRA)
nevi_me created ARROW-3726: -- Summary: [Rust] CSV Reader & Writer Key: ARROW-3726 URL: https://issues.apache.org/jira/browse/ARROW-3726 Project: Apache Arrow Issue Type: New Feature

[jira] [Created] (ARROW-3727) [Python] Document use of pyarrow.foreign_buffer in Sphinx

2018-11-08 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-3727: --- Summary: [Python] Document use of pyarrow.foreign_buffer in Sphinx Key: ARROW-3727 URL: https://issues.apache.org/jira/browse/ARROW-3727 Project: Apache Arrow

Re: Creating Buffer directly from pointer/length

2018-11-08 Thread Pearu Peterson
Hi, For host memory, you can use pyarrow.foreign_buffer, see https://arrow.apache.org/docs/python/generated/pyarrow.foreign_buffer.html For device memory, one can use pyarrow.cuda.foreign_buffer. HTH, Pearu On Thu, Nov 8, 2018 at 7:53 PM Randy Zwitch wrote: > Within OmniSci (MapD), we

Re: Creating Buffer directly from pointer/length

2018-11-08 Thread Wes McKinney
Yes, see pyarrow.foreign_buffer. If this isn't in the documentation, could you open a JIRA to fix that? Thanks Wes On Thu, Nov 8, 2018, 11:53 AM Randy Zwitch wrote: > Within OmniSci (MapD), we have the following code that takes a pointer > and length and reads to a NumPy array before calling py_buffer:

Re: Creating Buffer directly from pointer/length

2018-11-08 Thread Antoine Pitrou
You should be able to use pa.foreign_buffer(): https://arrow.apache.org/docs/python/generated/pyarrow.foreign_buffer.html#pyarrow.foreign_buffer Regards Antoine. On 08/11/2018 at 18:49, Randy Zwitch wrote: > Within OmniSci (MapD), we have the following code that takes a pointer > and

Creating Buffer directly from pointer/length

2018-11-08 Thread Randy Zwitch
Within OmniSci (MapD), we have the following code that takes a pointer and length and reads to a NumPy array before calling py_buffer: https://github.com/omnisci/pymapd/blob/master/pymapd/shm.pyx#L31-L52 Is it possible to eliminate the NumPy step and go directly to an Arrow buffer? There is

[jira] [Created] (ARROW-3718) [Gandiva] Remove spurious gtest include

2018-11-08 Thread Philipp Moritz (JIRA)
Philipp Moritz created ARROW-3718: - Summary: [Gandiva] Remove spurious gtest include Key: ARROW-3718 URL: https://issues.apache.org/jira/browse/ARROW-3718 Project: Apache Arrow Issue Type:

[jira] [Created] (ARROW-3722) [C++] Allow specifying column types to CSV reader

2018-11-08 Thread Antoine Pitrou (JIRA)
Antoine Pitrou created ARROW-3722: - Summary: [C++] Allow specifying column types to CSV reader Key: ARROW-3722 URL: https://issues.apache.org/jira/browse/ARROW-3722 Project: Apache Arrow

[jira] [Created] (ARROW-3721) [Gandiva] [Python] Support all Gandiva literals

2018-11-08 Thread Philipp Moritz (JIRA)
Philipp Moritz created ARROW-3721: - Summary: [Gandiva] [Python] Support all Gandiva literals Key: ARROW-3721 URL: https://issues.apache.org/jira/browse/ARROW-3721 Project: Apache Arrow Issue