[jira] [Created] (ARROW-5580) Correct definitions of timestamp functions in Gandiva

2019-06-12 Thread Prudhvi Porandla (JIRA)
Prudhvi Porandla created ARROW-5580: --- Summary: Correct definitions of timestamp functions in Gandiva Key: ARROW-5580 URL: https://issues.apache.org/jira/browse/ARROW-5580 Project: Apache Arrow

[jira] [Created] (ARROW-5579) [Java] shade flatbuffer dependency

2019-06-12 Thread Pindikura Ravindra (JIRA)
Pindikura Ravindra created ARROW-5579: - Summary: [Java] shade flatbuffer dependency Key: ARROW-5579 URL: https://issues.apache.org/jira/browse/ARROW-5579 Project: Apache Arrow Issue

Re: [VOTE] Formalizing "Extension Type" metadata in Arrow binary protocol

2019-06-12 Thread Sutou Kouhei
+1 In "[VOTE] Formalizing "Extension Type" metadata in Arrow binary protocol" on Mon, 10 Jun 2019 15:28:22 -0500, Wes McKinney wrote: > hi folks, > > In two mailing list threads [1] [2] we have discussed adding an > "extension type" mechanism to the Arrow binary/IPC protocol. The idea >

Re: [DISCUSS] Timing of release and making a 1.0.0 release marking Arrow protocol stability

2019-06-12 Thread Sutou Kouhei
Hi, I like the plan too. If nobody wants to be a release manager for 0.14.0, I can be a release manager. I've been busy recently but I'll be able to make time from June 24. Thanks, -- kou In "Re: [DISCUSS] Timing of release and making a 1.0.0 release marking Arrow protocol stability" on Mon, 10

Re: [ANNOUNCE] New Arrow committer: Francois Saint-Jacques

2019-06-12 Thread Robert Nishihara
Congratulations! On Wed, Jun 12, 2019 at 4:16 PM Philipp Moritz wrote: > Congrats François :) > > On Wed, Jun 12, 2019 at 3:37 PM Antoine Pitrou wrote: > > > > > Welcome on the team François :-) > > > > > > Le 12/06/2019 à 17:45, Wes McKinney a écrit : > > > On behalf of the Arrow PMC I'm

Re: [ANNOUNCE] New Arrow committer: Francois Saint-Jacques

2019-06-12 Thread Philipp Moritz
Congrats François :) On Wed, Jun 12, 2019 at 3:37 PM Antoine Pitrou wrote: > > Welcome on the team François :-) > > > Le 12/06/2019 à 17:45, Wes McKinney a écrit : > > On behalf of the Arrow PMC I'm happy to announce that Francois has > > accepted an invitation to become an Arrow committer! > >

Re: [ANNOUNCE] New Arrow committer: Francois Saint-Jacques

2019-06-12 Thread Antoine Pitrou
Welcome on the team François :-) Le 12/06/2019 à 17:45, Wes McKinney a écrit : > On behalf of the Arrow PMC I'm happy to announce that Francois has > accepted an invitation to become an Arrow committer! > > Welcome, and thank you for your contributions! >

[jira] [Created] (ARROW-5578) [C++][Flight] Flight does not build out of the box on Alpine Linux

2019-06-12 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-5578: --- Summary: [C++][Flight] Flight does not build out of the box on Alpine Linux Key: ARROW-5578 URL: https://issues.apache.org/jira/browse/ARROW-5578 Project: Apache Arrow

[jira] [Created] (ARROW-5577) [C++] Link failure due to googletest shared library on Alpine Linux

2019-06-12 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-5577: --- Summary: [C++] Link failure due to googletest shared library on Alpine Linux Key: ARROW-5577 URL: https://issues.apache.org/jira/browse/ARROW-5577 Project: Apache

[jira] [Created] (ARROW-5576) [C++] Flaky thrift_ep tarball downloads

2019-06-12 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-5576: --- Summary: [C++] Flaky thrift_ep tarball downloads Key: ARROW-5576 URL: https://issues.apache.org/jira/browse/ARROW-5576 Project: Apache Arrow Issue Type: Bug

Re: [ANNOUNCE] New Arrow committer: Francois Saint-Jacques

2019-06-12 Thread Krisztián Szűcs
Congratulations François! On Wed, Jun 12, 2019 at 10:30 PM Bryan Cutler wrote: > Congratulations! > > On Wed, Jun 12, 2019 at 10:49 AM Chao Sun wrote: > > > Congrats Francois! > > > > On Wed, Jun 12, 2019 at 9:01 AM Micah Kornfield > > wrote: > > > > > Congrats Francois, well deserved! > > >

Re: [ANNOUNCE] New Arrow committer: Francois Saint-Jacques

2019-06-12 Thread Bryan Cutler
Congratulations! On Wed, Jun 12, 2019 at 10:49 AM Chao Sun wrote: > Congrats Francois! > > On Wed, Jun 12, 2019 at 9:01 AM Micah Kornfield > wrote: > > > Congrats Francois, well deserved! > > > > On Wed, Jun 12, 2019 at 8:46 AM Wes McKinney > wrote: > > > > > On behalf of the Arrow PMC I'm

Re: Avro to Arrow?

2019-06-12 Thread Wes McKinney
hi Tim, I should think that the reader API should support deserializing a blob of schemaless Avro records as an Arrow record batch, or even feeding the reader one serialized record at a time to build a record batch incrementally - Wes On Wed, Jun 12, 2019 at 1:25 PM Tim Swast wrote: > > > Let
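A minimal sketch of the incremental approach described above, in plain Python rather than any real Arrow or Avro API: records that have already been decoded to dicts are fed in one at a time and accumulated column by column. The class name `IncrementalBatchBuilder`, its methods, and the sample field names are hypothetical and exist only for illustration.

```python
from typing import Any, Dict, List


class IncrementalBatchBuilder:
    """Hypothetical helper: accumulates row-oriented records into columns."""

    def __init__(self, field_names: List[str]) -> None:
        self.field_names = list(field_names)
        # One growing list per column, mirroring a columnar batch layout.
        self._columns: Dict[str, List[Any]] = {
            name: [] for name in self.field_names
        }
        self.num_rows = 0

    def append(self, record: Dict[str, Any]) -> None:
        # A missing field is stored as None so all columns stay equal length.
        for name in self.field_names:
            self._columns[name].append(record.get(name))
        self.num_rows += 1

    def finish(self) -> Dict[str, List[Any]]:
        # Hand back the accumulated columns and reset for the next batch.
        batch = self._columns
        self._columns = {name: [] for name in self.field_names}
        self.num_rows = 0
        return batch


# Feed records one at a time, then emit a single columnar batch.
builder = IncrementalBatchBuilder(["id", "name"])
for rec in [{"id": 1, "name": "a"}, {"id": 2}, {"id": 3, "name": "c"}]:
    builder.append(rec)
batch = builder.finish()
```

A real reader would decode each serialized Avro record against a known schema before appending, and would use typed Arrow builders instead of Python lists; the sketch only shows the row-to-column accumulation step.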

Re: Arrow as a common open standard for machine learning data

2019-06-12 Thread Joaquin Vanschoren
Hi Neal, Thanks, that explains the arrow-parquet relationship very nicely. So, at the moment you would recommend Parquet for any form of archival storage, right? We could also experiment with storing data as both Parquet and Arrow for now. Still curious about the other questions, like meta-data,

Re: [jira] [Created] (ARROW-5573) [Archery] Write a short user guide in a README

2019-06-12 Thread William Wood
Thanks Wes Sent from my iPhone > On Jun 12, 2019, at 12:08 PM, Wes McKinney wrote: > > Hi William, > > You can email dev-unsubscr...@arrow.apache.org > > You might consider setting up an email filter to archive new issue > notifications if you only want to follow email discussions. > > Wes

Re: Arrow as a common open standard for machine learning data

2019-06-12 Thread Neal Richardson
Hi Joaquin, I recognize that this doesn't answer all of your questions, but we are in the process of adding a FAQ to the arrow.apache.org website that speaks to some of them: https://github.com/apache/arrow/blob/master/site/faq.md Neal On Wed, Jun 12, 2019 at 3:39 AM Joaquin Vanschoren <

Re: Avro to Arrow?

2019-06-12 Thread Tim Swast
> Let me know if you want to collaborate on it. Thanks Micah. What are your thoughts on reading schemaless Avro bytes? One of the reasons I have started experimenting with the fork is that fastavro had trouble reading more than one row at a time from a schemaless reader. Tim Swast

Re: Arrow sync call tomorrow (June 12) at 12:00 US/Eastern, 16:00 UTC

2019-06-12 Thread Neal Richardson
Attendees: Neal Richardson François Saint-Jacques Micah Kornfield Ravindra Pindikura Ben Kietzman John Muehlhausen Micah raised several issues: * C++ status: 2 PRs: 1. make status pluggable, clean up error codes ( https://github.com/apache/arrow/pull/4484). Antoine has been reviewing, but it

Re: [jira] [Created] (ARROW-5573) [Archery] Write a short user guide in a README

2019-06-12 Thread Wes McKinney
Hi William, You can email dev-unsubscr...@arrow.apache.org You might consider setting up an email filter to archive new issue notifications if you only want to follow email discussions. Wes On Wed, Jun 12, 2019, 1:05 PM William Wood wrote: > Hello, > > Can someone tell me how to get off this

Re: [jira] [Created] (ARROW-5573) [Archery] Write a short user guide in a README

2019-06-12 Thread William Wood
Hello, Can someone tell me how to get off this mailing list or remove me? My email is willwo...@yahoo.com Thank you Sent from my iPhone On Jun 12, 2019, at 8:00 AM, Krisztian Szucs (JIRA) wrote:

Re: [ANNOUNCE] New Arrow committer: Francois Saint-Jacques

2019-06-12 Thread Chao Sun
Congrats Francois! On Wed, Jun 12, 2019 at 9:01 AM Micah Kornfield wrote: > Congrats Francois, well deserved! > > On Wed, Jun 12, 2019 at 8:46 AM Wes McKinney wrote: > > > On behalf of the Arrow PMC I'm happy to announce that Francois has > > accepted an invitation to become an Arrow

Re: Apache Arrow / Plasma Benchmarks in HPC

2019-06-12 Thread Lentner, Geoffrey R
Okay cool. I’ll keep those things in mind as well. Thanks. Geoff On Jun 12, 2019, at 1:22 PM, Wes McKinney wrote: hi Geoff, It's great to hear about your benchmarking results. If you'd like to submit a pull request to the project with a blog post to showcase your

Re: Apache Arrow / Plasma Benchmarks in HPC

2019-06-12 Thread Wes McKinney
hi Geoff, It's great to hear about your benchmarking results. If you'd like to submit a pull request to the project with a blog post to showcase your results please be our guest (the blog content is all under the site/ directory, let us know if you have any issues). The only limitation on the

Apache Arrow / Plasma Benchmarks in HPC

2019-06-12 Thread Lentner, Geoffrey R
Hi everyone. I work as a data scientist in the research computing group at Purdue University. Mostly I help facilitate the use of Purdue’s supercomputing clusters by research faculty by helping with scientific software development, consulting on data analysis and data management, holding

Re: [VOTE] Formalizing "Extension Type" metadata in Arrow binary protocol

2019-06-12 Thread Bryan Cutler
+1 (non-binding) On Mon, Jun 10, 2019, 1:29 PM Wes McKinney wrote: > hi folks, > > In two mailing list threads [1] [2] we have discussed adding an > "extension type" mechanism to the Arrow binary/IPC protocol. The idea > is to be able to "annotate" built-in Arrow data types with a type name >

Using pyarrow.Table for long-term storage of pandas DataFrames

2019-06-12 Thread Bogdan Klichuk
Trying to come up with a solution for quick serialization and long-term storage of pandas DataFrames. DataFrame content is tabular but user-provided and can be arbitrary, so it might contain both entirely text columns and entirely numeric/boolean columns. ## Main goals are: * Serialize dataframe as quickly as

Re: [ANNOUNCE] New Arrow committer: Francois Saint-Jacques

2019-06-12 Thread Micah Kornfield
Congrats Francois, well deserved! On Wed, Jun 12, 2019 at 8:46 AM Wes McKinney wrote: > On behalf of the Arrow PMC I'm happy to announce that Francois has > accepted an invitation to become an Arrow committer! > > Welcome, and thank you for your contributions! >

[ANNOUNCE] New Arrow committer: Francois Saint-Jacques

2019-06-12 Thread Wes McKinney
On behalf of the Arrow PMC I'm happy to announce that Francois has accepted an invitation to become an Arrow committer! Welcome, and thank you for your contributions!

[jira] [Created] (ARROW-5575) arrowConfig.cmake includes uninstalled targets

2019-06-12 Thread Matthijs Brobbel (JIRA)
Matthijs Brobbel created ARROW-5575: --- Summary: arrowConfig.cmake includes uninstalled targets Key: ARROW-5575 URL: https://issues.apache.org/jira/browse/ARROW-5575 Project: Apache Arrow

[jira] [Created] (ARROW-5574) [R] documentation error for read_arrow()

2019-06-12 Thread Romain François (JIRA)
Romain François created ARROW-5574: -- Summary: [R] documentation error for read_arrow() Key: ARROW-5574 URL: https://issues.apache.org/jira/browse/ARROW-5574 Project: Apache Arrow Issue

[jira] [Created] (ARROW-5573) [Archery] Write a short user guide in a README

2019-06-12 Thread Krisztian Szucs (JIRA)
Krisztian Szucs created ARROW-5573: -- Summary: [Archery] Write a short user guide in a README Key: ARROW-5573 URL: https://issues.apache.org/jira/browse/ARROW-5573 Project: Apache Arrow

[jira] [Created] (ARROW-5572) [Python] raise error message when passing invalid filter in parquet reading

2019-06-12 Thread Joris Van den Bossche (JIRA)
Joris Van den Bossche created ARROW-5572: Summary: [Python] raise error message when passing invalid filter in parquet reading Key: ARROW-5572 URL: https://issues.apache.org/jira/browse/ARROW-5572

[jira] [Created] (ARROW-5571) [R] Rework handing of ARROW_R_WITH_PARQUET

2019-06-12 Thread Romain François (JIRA)
Romain François created ARROW-5571: -- Summary: [R] Rework handing of ARROW_R_WITH_PARQUET Key: ARROW-5571 URL: https://issues.apache.org/jira/browse/ARROW-5571 Project: Apache Arrow Issue

Arrow as a common open standard for machine learning data

2019-06-12 Thread Joaquin Vanschoren
Dear all, Thanks for creating Arrow! I'm part of OpenML.org, an open source initiative/platform for sharing machine learning datasets and models. We are currently storing data in either ARFF or Parquet, but are looking into whether e.g. Feather or a mix of Feather and Parquet could be the new

Re: [Discuss][Java][Typical use cases for dictionary encoding string vectors]

2019-06-12 Thread Fan Liya
@Micah Kornfield Thanks a lot for your comments. In the doc, we identify 3 problems for the current dictionary encoding use case (there can be more, so please give your valuable suggestions): 1. there should be a convenient way to provide access to both encoded/decoded data. 2. the constructor
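To make the first problem above concrete (access to both encoded and decoded data), here is a minimal sketch of dictionary encoding for a list of strings in plain Python. It is not the Arrow Java API; the function names `dictionary_encode` and `dictionary_decode` and the sample values are hypothetical.

```python
from typing import Dict, List, Tuple


def dictionary_encode(values: List[str]) -> Tuple[List[str], List[int]]:
    """Return (dictionary, indices); indices[i] points into dictionary."""
    dictionary: List[str] = []
    positions: Dict[str, int] = {}
    indices: List[int] = []
    for value in values:
        # Assign the next free index the first time a value is seen.
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[value])
    return dictionary, indices


def dictionary_decode(dictionary: List[str], indices: List[int]) -> List[str]:
    """Rebuild the original values by looking each index up."""
    return [dictionary[i] for i in indices]


raw = ["apple", "pear", "apple", "fig", "pear"]
dictionary, indices = dictionary_encode(raw)
decoded = dictionary_decode(dictionary, indices)
```

Keeping the dictionary and the index list side by side is what lets a caller read either form: the compact indices for comparisons and grouping, or the decoded strings when the actual values are needed.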

Re: [Discuss][Java][Typical use cases for dictionary encoding string vectors]

2019-06-12 Thread Micah Kornfield
Hi Liya Fan, Thank you for doing this. I need to take a closer look at the PR in question and the dictionary encoding code but this seems like it is on the right track. Could other Java contributors with more familiarity in the space look over the document to make sure it makes sense to them?

[jira] [Created] (ARROW-5570) Update Avro C++ code to conform to Arrow style guide and get it compiling.

2019-06-12 Thread Micah Kornfield (JIRA)
Micah Kornfield created ARROW-5570: -- Summary: Update Avro C++ code to conform to Arrow style guide and get it compiling. Key: ARROW-5570 URL: https://issues.apache.org/jira/browse/ARROW-5570

[jira] [Created] (ARROW-5569) import avro C++ code to code base.

2019-06-12 Thread Micah Kornfield (JIRA)
Micah Kornfield created ARROW-5569: -- Summary: import avro C++ code to code base. Key: ARROW-5569 URL: https://issues.apache.org/jira/browse/ARROW-5569 Project: Apache Arrow Issue Type: