Re: [Rust] crate versions and release process

2019-01-04 Thread Kouhei Sutou
Hi, I have no opinion about the versions of sub-crates. When should we bump the version of a sub-crate? Is that a matter for Rust developers rather than release managers? I just want to know whether release managers need to care about sub-crate versions or not. Thanks, -- kou In "[Rust] crate versions

[jira] [Created] (ARROW-4161) [GLib] Add GPlasmaClientOptions

2019-01-04 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-4161: --- Summary: [GLib] Add GPlasmaClientOptions Key: ARROW-4161 URL: https://issues.apache.org/jira/browse/ARROW-4161 Project: Apache Arrow Issue Type: New Feature

[jira] [Created] (ARROW-4160) [Rust] Add README and executable files to parquet

2019-01-04 Thread Chao Sun (JIRA)
Chao Sun created ARROW-4160: --- Summary: [Rust] Add README and executable files to parquet Key: ARROW-4160 URL: https://issues.apache.org/jira/browse/ARROW-4160 Project: Apache Arrow Issue Type:

Re: Thread safety of C++ Struct/Schema::GetFieldIndex

2019-01-04 Thread Gene Novark
Eager assignment is certainly the simplest fix. I'm not sure if lazy initialization is based off measurements or whether it was a "premature optimization". In any case, I agree with Antoine that the simple move assignment will not be thread safe, not only for the reasons he describes but also

Re: Thread safety of C++ Struct/Schema::GetFieldIndex

2019-01-04 Thread Antoine Pitrou
However, eager initialization could probably work too. I'm not sure why we used lazy initialization like this. Perhaps someone worried about the cost of incremental schema construction using repeated Schema::AddField() calls (but that's gonna be wasteful anyway)? Le 04/01/2019 à 23:42,
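As a rough illustration of the eager alternative discussed here, the sketch below builds the name lookup once in the constructor, so later lookups only read it. `EagerFieldIndex` and its members are hypothetical names chosen for illustration; they are not Arrow's actual classes, and the key and value types are assumptions.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a schema-like type; not Arrow's real API.
class EagerFieldIndex {
 public:
  explicit EagerFieldIndex(std::vector<std::string> names)
      : names_(std::move(names)), name_to_index_(BuildIndex(names_)) {}

  // Only reads the map, so concurrent calls are fine as long as
  // no thread mutates the object.
  int GetFieldIndex(const std::string& name) const {
    auto it = name_to_index_.find(name);
    return it == name_to_index_.end() ? -1 : it->second;
  }

 private:
  static std::unordered_map<std::string, int> BuildIndex(
      const std::vector<std::string>& names) {
    std::unordered_map<std::string, int> index;
    for (std::size_t i = 0; i < names.size(); ++i) {
      index.emplace(names[i], static_cast<int>(i));  // first occurrence wins
    }
    return index;
  }

  // Declared in this order so names_ is initialized before name_to_index_.
  const std::vector<std::string> names_;
  const std::unordered_map<std::string, int> name_to_index_;
};
```

The trade-off is the one raised in the message: every constructed object pays for the map, even if no lookup by name ever happens.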

Re: Thread safety of C++ Struct/Schema::GetFieldIndex

2019-01-04 Thread Antoine Pitrou
Generally speaking, normal assignments are not thread-safe. Intuitively they could be (and perhaps in some simple cases - such as aligned machine types - they will turn out to be on some specific compiler/CPU combinations), but C++ makes no guarantee about that (for example, an assignment can be

Re: Thread safety of C++ Struct/Schema::GetFieldIndex

2019-01-04 Thread Antoine Pitrou
In other words, something like: class StructType { mutable std::shared_ptr<std::unordered_map<std::string, int>> name_to_index_; std::shared_ptr<std::unordered_map<std::string, int>> GetNameToIndex() const { if (!std::atomic_load(&name_to_index_)) { name_to_index = std::make_shared<std::unordered_map<std::string, int>>(); // ... initialize name_to_index ...

Re: Thread safety of C++ Struct/Schema::GetFieldIndex

2019-01-04 Thread Wes McKinney
That works. I would have thought that the deterministic state of unordered_map might make move-assignment safe, but perhaps not On Fri, Jan 4, 2019 at 4:33 PM Antoine Pitrou wrote: > > > The move-assigning would definitely not be thread-safe. > > One possibility would be to wrap the

Re: Thread safety of C++ Struct/Schema::GetFieldIndex

2019-01-04 Thread Antoine Pitrou
The move-assigning would definitely not be thread-safe. One possibility would be to wrap the std::unordered_map in a std::shared_ptr, and use the atomic functions for shared_ptr: https://en.cppreference.com/w/cpp/memory/shared_ptr/atomic Regards Antoine. Le 04/01/2019 à 23:17, Wes McKinney
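A minimal sketch of the shared_ptr idea, assuming a hypothetical `LazyFieldIndex` class (not Arrow's actual type) and a `std::string` to `int` map. It uses the `std::atomic_load` and `std::atomic_store` overloads for `std::shared_ptr` from `<memory>`; note that these free functions are deprecated as of C++20 in favor of `std::atomic<std::shared_ptr<T>>`.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a type with a lazily built name lookup.
class LazyFieldIndex {
 public:
  using Map = std::unordered_map<std::string, int>;

  explicit LazyFieldIndex(std::vector<std::string> names)
      : names_(std::move(names)) {}

  int GetFieldIndex(const std::string& name) const {
    std::shared_ptr<Map> map = GetNameToIndex();
    auto it = map->find(name);
    return it == map->end() ? -1 : it->second;
  }

 private:
  std::shared_ptr<Map> GetNameToIndex() const {
    std::shared_ptr<Map> current = std::atomic_load(&name_to_index_);
    if (!current) {
      // Build a complete map privately, then publish it atomically.
      // Two threads may both build one; the maps are equivalent, and
      // each thread keeps its own copy alive through its shared_ptr.
      current = std::make_shared<Map>();
      for (std::size_t i = 0; i < names_.size(); ++i) {
        current->emplace(names_[i], static_cast<int>(i));
      }
      std::atomic_store(&name_to_index_, current);
    }
    return current;
  }

  std::vector<std::string> names_;
  mutable std::shared_ptr<Map> name_to_index_;
};
```

Readers never observe a half-built map, because the pointer is published only after the map is fully populated and nothing modifies the map afterwards.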

Re: Thread safety of C++ Struct/Schema::GetFieldIndex

2019-01-04 Thread Wes McKinney
hi Gene, Yes, feel free to submit a PR to fix this. I would suggest populating function-local std::unordered_map and then move-assigning it into name_to_index_ -- I think this should not have race conditions. If you do want to add a mutex, it could be a static one rather than creating a new mutex
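The mutex variant mentioned here might look roughly like the sketch below. `LockedFieldIndex` is a hypothetical class, not Arrow's code. Because another reply in this thread points out that an unguarded move-assignment would not be thread-safe, this sketch takes the shared static mutex on every lookup, including the move-assignment of the function-local map.

```cpp
#include <cstddef>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in; every instance shares one static mutex.
class LockedFieldIndex {
 public:
  explicit LockedFieldIndex(std::vector<std::string> names)
      : names_(std::move(names)) {}

  int GetFieldIndex(const std::string& name) const {
    std::lock_guard<std::mutex> lock(IndexMutex());
    if (!initialized_) {
      // Populate a function-local map, then move-assign it under the lock.
      std::unordered_map<std::string, int> local;
      for (std::size_t i = 0; i < names_.size(); ++i) {
        local.emplace(names_[i], static_cast<int>(i));
      }
      name_to_index_ = std::move(local);
      initialized_ = true;
    }
    auto it = name_to_index_.find(name);
    return it == name_to_index_.end() ? -1 : it->second;
  }

 private:
  // Function-local static: initialized once, thread-safely, since C++11.
  static std::mutex& IndexMutex() {
    static std::mutex mutex;
    return mutex;
  }

  std::vector<std::string> names_;
  mutable std::unordered_map<std::string, int> name_to_index_;
  mutable bool initialized_ = false;
};
```

A single static mutex avoids storing a mutex in every object, at the cost of serializing lookups across all instances.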

Thread safety of C++ Struct/Schema::GetFieldIndex

2019-01-04 Thread Gene Novark
These are both effectively-immutable accessors with lazy initialization. However, when accessed from multiple threads a race can occur initializing the name_to_index_ map. This seems like a bug rather than a purposeful design choice based off the cpp/conventions.rst section on Immutability. I'm

Re: Timeline for Arrow 0.12.0 release

2019-01-04 Thread Wes McKinney
hi all, We should try to cut a release candidate for 0.12 as soon as practical. Since we're just coming off the holidays, it would be good to work for a few more business days to close out as many outstanding patches as possible, and be in position to start a vote sometime next week. There's a

[jira] [Created] (ARROW-4159) [C++] Check for -Wdocumentation issues

2019-01-04 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-4159: --- Summary: [C++] Check for -Wdocumentation issues Key: ARROW-4159 URL: https://issues.apache.org/jira/browse/ARROW-4159 Project: Apache Arrow Issue Type:

Re: Rust bindings for Gandiva

2019-01-04 Thread paddy horan
Hey Andy, I am very interested in this, I’m also looking into adding explicit SIMD to our existing “array_ops”. Maybe we can plan out what is needed on the developer wiki so that we can all help out where we are able. I’ve seen it mentioned here and there but what is the current state of

[jira] [Created] (ARROW-4158) [Dev] Allow maintainers to use a GitHub API token when merging pull requests

2019-01-04 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-4158: --- Summary: [Dev] Allow maintainers to use a GitHub API token when merging pull requests Key: ARROW-4158 URL: https://issues.apache.org/jira/browse/ARROW-4158 Project:

Re: Compiling C++ Arrow Flight

2019-01-04 Thread Tim Bisson
Thanks Wes, Yea, we might be trying to play with the code a bit too early. I made a little more progress, but became stuck during make. I might be at the same place as kszucs: https://github.com/apache/arrow/pull/2547#issuecomment-425744800 This could be completely incorrect, but I made it past

[jira] [Created] (ARROW-4157) [C++] -Wdocumentation failures with clang 6.0 on Ubuntu 18.04

2019-01-04 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-4157: --- Summary: [C++] -Wdocumentation failures with clang 6.0 on Ubuntu 18.04 Key: ARROW-4157 URL: https://issues.apache.org/jira/browse/ARROW-4157 Project: Apache Arrow

Re: [Rust] crate versions and release process

2019-01-04 Thread Marco Neumann
+1 The only thing to keep in mind is that versions are a statement regarding API stability (aka semantic versioning). It is easy to forget about these things in a monorepo since you can fix all the breaking changes in the PR that introduced them. So whoever cuts the release must account for that

Rust bindings for Gandiva

2019-01-04 Thread Andy Grove
Now that the Rust implementation of Arrow is maturing, I'm interested in having bindings for Gandiva for query execution, rather than duplicating this in Rust. I will likely start looking at this soon but wanted to see if anyone else here is particularly interested in this area of functionality?

Re: Parallel processing

2019-01-04 Thread Antoine Pitrou
Le 04/01/2019 à 14:06, Romain Francois a écrit : > That should be fine indeed. I just don't want the task to start "right now" > as SerialTaskGroup::AppendReal() seems to be doing. It would probably be fine to add a DeferredSerialTaskGroup or something. Regards Antoine.
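One way to picture the deferred behavior being discussed is the sketch below. Only the name `DeferredSerialTaskGroup` comes from the message; the interface (`Append`, `Finish`, `pending`) is an assumption made for illustration and is not Arrow's TaskGroup API.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical class: holds tasks instead of running them on Append().
class DeferredSerialTaskGroup {
 public:
  // Stores the task; nothing runs yet.
  void Append(std::function<void()> task) {
    tasks_.push_back(std::move(task));
  }

  // Runs all held tasks in order on the calling thread.
  void Finish() {
    std::vector<std::function<void()>> pending;
    pending.swap(tasks_);  // leaves the group empty before running
    for (auto& task : pending) {
      task();
    }
  }

  std::size_t pending() const { return tasks_.size(); }

 private:
  std::vector<std::function<void()>> tasks_;
};
```

With this shape, appended tasks stay queued until `Finish()` is called, which matches the wish that a task should not start "right now".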

Re: Parallel processing

2019-01-04 Thread Romain Francois
That should be fine indeed. I just don't want the task to start "right now" as SerialTaskGroup::AppendReal() seems to be doing. > Le 4 janv. 2019 à 13:38, Antoine Pitrou a écrit : > > > Le 04/01/2019 à 12:24, Romain Francois a écrit : >> >> I guess that just means I need some way to hold

Re: Parallel processing

2019-01-04 Thread Antoine Pitrou
Le 04/01/2019 à 12:24, Romain Francois a écrit : > > I guess that just means I need some way to hold the tasks before they go in > the task groups. You can make the task a lambda function which will capture the necessary data by value (such as any shared_ptr pointing to the data the task
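The by-value capture suggested here can be sketched as follows. `MakeSumTask` is a hypothetical helper, and summing integers merely stands in for whatever work the real task would do.

```cpp
#include <functional>
#include <memory>
#include <numeric>
#include <vector>

// Returns a task that co-owns its input and output through shared_ptr
// copies captured by value, so both stay alive until the task is destroyed.
std::function<void()> MakeSumTask(std::shared_ptr<std::vector<int>> data,
                                  std::shared_ptr<long> out) {
  return [data, out] {
    *out = std::accumulate(data->begin(), data->end(), 0L);
  };
}
```

Because the lambda holds its own `shared_ptr` copies, the caller can drop its reference to the data and run the task later without leaving a dangling pointer.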

Re: Building arrow using Xcode on Mac OS

2019-01-04 Thread Hatem Helal
Thanks Uwe! I created the following issue: https://issues.apache.org/jira/browse/ARROW-4156 Unfortunately, my terminal history buffer was too small for the entire output but I'm running it again now and will upload a text file with the entire history for you to debug. Let me know if there is

[jira] [Created] (ARROW-4156) [C++] xcodebuild failure for cmake generated project

2019-01-04 Thread Hatem Helal (JIRA)
Hatem Helal created ARROW-4156: -- Summary: [C++] xcodebuild failure for cmake generated project Key: ARROW-4156 URL: https://issues.apache.org/jira/browse/ARROW-4156 Project: Apache Arrow Issue

Re: Building arrow using Xcode on Mac OS

2019-01-04 Thread Uwe L. Korn
Hello Hatem, I don't know of anyone that has used Xcode to build Arrow yet. We're normally using `-GNinja` or the default make generator to build it. As I have a Mac, I'll have a look at this but "cmake -G Xcode" is not running for me at the moment. To help us debug this, can you open a JIRA

Building arrow using Xcode on Mac OS

2019-01-04 Thread Hatem Helal
Hi all, I wonder if anyone on this list has tried building arrow using Xcode on Mac OS? I've used "cmake -G Xcode" to generate a project but calling xcodebuild fails. I've copied the syndrome below in case anyone has seen this before. Another observation is that the dylibs aren't generated

Re: Parallel processing

2019-01-04 Thread Romain Francois
Thanks. I think Task Group almost suits my needs. I might need some extra layer around it. Here is my use case. When converting a record batch to R data structures, all R allocation has to happen on the main thread, but then filling the vectors can (for some of them) be done in a task that

Re: Format specification document?

2019-01-04 Thread Sebastien Binet
Hi, Theoretically, it's defined there: - https://arrow.apache.org/docs/ipc.html - https://arrow.apache.org/docs/metadata.html hth, -s sent from my droid On Fri, Jan 4, 2019, 02:15 Kohei KaiGai Hello, > > I'm now trying to understand the Apache Arrow format for my application. > Is there a

Re: [Rust] crate versions and release process

2019-01-04 Thread Krisztián Szűcs
Agree, +1 On Thu, Jan 3, 2019 at 10:49 PM Andy Grove wrote: > +1 from me. Keeping the code in a single repo makes sense but no need to > artificially keep versions numbers consistent between the sub-crates. > > Andy. > > On Wed, Jan 2, 2019 at 10:28 PM Chao Sun wrote: > > > Hi, > > > > This is