Re: Symbol not found: _PyCObject_Type (MacOS El Capitan, Python 3.6)
hi Quang -- I recommend clearing out your CMake temporary files after making
any conda environment changes. If you activate a different conda environment,
CMake will not know to recompute variables related to Python's header files
and libraries. So it might have been that you invoked CMake with Python 2
activated and later activated Python 3.

- Wes

On Tue, May 15, 2018 at 5:15 AM, Quang Vu wrote:
> Yes Antoine, that happens when compiling Arrow under an activated conda
> environment.
> Thank you for all the info you are helping me with!
>
> Quang.
>
> On Mon, May 14, 2018 at 3:34 PM Antoine Pitrou wrote:
>>
>> To give a bit more insight: you should compile Arrow with your conda
>> environment activated, so that it picks the right Python version (3.6.5,
>> in your case). If it's still picking the wrong Python version, that
>> might be a bug.
>>
>> Regards
>>
>> Antoine.
>>
>> Le 14/05/2018 à 20:50, Quang Vu a écrit :
>> > Thanks Antoine,
>> >
>> > I will need to learn more about the compilation process that happens
>> > on my Mac, to see how it linked to Python 2. I am not familiar with
>> > that process, but this is a good pointer for my issue. Thank you for
>> > your response!
>> >
>> > Quang.
>> >
>> > On Mon, May 14, 2018 at 12:50 PM Antoine Pitrou wrote:
>> >
>> >>
>> >> Hi Quang,
>> >>
>> >> It sounds like you have compiled Arrow against a Python 2 install but
>> >> are now trying to use it with Python 3. This won't work: the same
>> >> Python version must be used when compiling and when using PyArrow.
>> >>
>> >> ("PyCObject" is a Python 2-specific API that doesn't exist anymore in
>> >> Python 3.)
>> >>
>> >> Regards
>> >>
>> >> Antoine.
>> >>
>> >> Le 14/05/2018 à 18:34, Quang Vu a écrit :
>> >>> Hi Arrow dev,
>> >>>
>> >>> I am having trouble installing and setting up my development
>> >>> environment for Arrow. I wonder if anyone is familiar with the issue.
>> >>>
>> >>> My system info:
>> >>> - MacOS 10.11.6 (El Capitan)
>> >>> - conda 4.5.1
>> >>> - python 3.6.5
>> >>> - arrow's current commit: 4b8511
>> >>>
>> >>> Installing the Arrow C++ libraries and Parquet both succeed, but
>> >>> importing `pyarrow` fails:
>> >>>
>> >>> $ python -c 'import pyarrow'
>> >>>
>> >>> Traceback (most recent call last):
>> >>>   File "<string>", line 1, in <module>
>> >>>   File "/Users/myuser/code/arrow/python/pyarrow/__init__.py",
>> >>>     line 47, in <module>
>> >>>     from pyarrow.lib import cpu_count, set_cpu_count
>> >>> ImportError: dlopen(/Users/myuser/code/arrow/python/pyarrow/lib.cpython-36m-darwin.so, 2): Symbol not found: _PyCObject_Type
>> >>>   Referenced from: /Users/myuser/miniconda3/envs/pyarrow-test/lib/libarrow_python.10.dylib
>> >>>   Expected in: flat namespace
>> >>>  in /Users/myuser/miniconda3/envs/pyarrow-test/lib/libarrow_python.10.dylib
>> >>>
>> >>> If anyone has suggestions on what the problem is, please let me know.
>> >>> Thanks!
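A quick sanity check before rebuilding is to confirm which Python the activated environment actually exposes, since CMake's Python detection resolves headers and libraries from that interpreter. This is an illustrative sketch (the printed paths will differ on every machine); it only uses the standard library:

```python
# Print the interpreter, version, and header/library locations that a
# fresh CMake configuration run inside this environment should pick up.
import sys
import sysconfig

print("executable:", sys.executable)
print("version   :", sysconfig.get_python_version())
print("includes  :", sysconfig.get_paths()["include"])
print("libdir    :", sysconfig.get_config_var("LIBDIR"))
```

If the version printed here is not the one PyArrow will run under, delete the CMake build directory (or at least `CMakeCache.txt`) and reconfigure from the activated environment.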
Re: [VOTE] Accept donation of Arrow Ruby bindings
+1

I’ve been through IP clearance a few times, and can help if needed.

-Taylor

> On May 11, 2018, at 6:47 PM, Wes McKinney wrote:
>
> Dear all,
>
> Arrow PMC member Kouhei Sutou has developed Ruby bindings to the GLib
> C interface for Apache Arrow
>
> * https://github.com/red-data-tools/red-arrow
> * https://github.com/red-data-tools/red-arrow-gpu
>
> He is proposing to pull these projects into Apache Arrow to develop
> them all in the same place
>
> https://github.com/apache/arrow/pull/1990
>
> We are proposing to accept this code into the Apache project. If the
> vote passes, the PMC and Kou will work together to complete the ASF IP
> Clearance process (http://incubator.apache.org/ip-clearance/) and
> import the Ruby bindings for inclusion in a future release:
>
> [ ] +1 : Accept contribution of Ruby bindings
> [ ] 0 : No opinion
> [ ] -1 : Reject contribution because...
>
> Here is my vote: +1
>
> The vote will be open for at least 72 hours.
>
> Thanks,
> Wes
Re: [VOTE] Accept donation of Arrow Ruby bindings
+1. Thanks

On Sun, May 13, 2018 at 10:48 AM, Uwe L. Korn wrote:
> +1, thanks for the code donation and building the Ruby bindings.
>
> Uwe
>
> On Sat, May 12, 2018, at 8:53 AM, Kouhei Sutou wrote:
> > Hi,
> >
> > Thanks for starting the vote!
> >
> > +1
> >
> > Thanks,
> > --
> > kou
> >
> > In "[VOTE] Accept donation of Arrow Ruby bindings" on Fri, 11 May 2018
> > 18:47:52 -0400, Wes McKinney wrote:
> >
> > > Dear all,
> > >
> > > Arrow PMC member Kouhei Sutou has developed Ruby bindings to the GLib
> > > C interface for Apache Arrow
> > >
> > > * https://github.com/red-data-tools/red-arrow
> > > * https://github.com/red-data-tools/red-arrow-gpu
> > >
> > > He is proposing to pull these projects into Apache Arrow to develop
> > > them all in the same place
> > >
> > > https://github.com/apache/arrow/pull/1990
> > >
> > > We are proposing to accept this code into the Apache project. If the
> > > vote passes, the PMC and Kou will work together to complete the ASF IP
> > > Clearance process (http://incubator.apache.org/ip-clearance/) and
> > > import the Ruby bindings for inclusion in a future release:
> > >
> > > [ ] +1 : Accept contribution of Ruby bindings
> > > [ ] 0 : No opinion
> > > [ ] -1 : Reject contribution because...
> > >
> > > Here is my vote: +1
> > >
> > > The vote will be open for at least 72 hours.
> > >
> > > Thanks,
> > > Wes
Re: [CI] Code coverage reports
Hi,

There's now a draft PR that generates and uploads Python / Cython code
coverage. See an example report here:
https://codecov.io/gh/apache/arrow/pull/2050/list/

Regards

Antoine.

On Sat, 12 May 2018 16:18:47 +0200 Antoine Pitrou wrote:
> Le 12/05/2018 à 00:55, Wes McKinney a écrit :
> >
> > Thanks for doing this! I am sure our code coverage has suffered as a
> > result of not having the reports. I wonder what it would take to get
> > C++ coverage that includes lines touched by Python unit test execution
>
> Nothing, because it already does :-)
> I'm now working on Python / Cython code coverage.
>
> Regards
>
> Antoine.
[jira] [Created] (ARROW-2586) Make child builders of ListBuilder and StructBuilder shared_ptr's
Joshua Storck created ARROW-2586:

Summary: Make child builders of ListBuilder and StructBuilder shared_ptr's
Key: ARROW-2586
URL: https://issues.apache.org/jira/browse/ARROW-2586
Project: Apache Arrow
Issue Type: Improvement
Reporter: Joshua Storck

This is needed for changes in this PR that make it possible to deserialize
arbitrary nested structures in Parquet (ARROW-1644):
https://github.com/apache/parquet-cpp/pull/462

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Created] (ARROW-2585) Add Decimal128::FromBigEndian
Joshua Storck created ARROW-2585:

Summary: Add Decimal128::FromBigEndian
Key: ARROW-2585
URL: https://issues.apache.org/jira/browse/ARROW-2585
Project: Apache Arrow
Issue Type: Improvement
Reporter: Joshua Storck

This code is being moved from
https://github.com/apache/parquet-cpp/blob/8046481235e558344c3aa059c83ee86b9f67/src/parquet/arrow/reader.cc#L1049
for use in this PR: https://github.com/apache/parquet-cpp/pull/462
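The operation being moved is, at its core, a sign-extending big-endian decode of raw bytes into a 128-bit integer. Python's arbitrary-precision integers make the idea easy to sketch; this is an illustration of the concept only, not the C++ `Decimal128::FromBigEndian` API:

```python
def from_big_endian(data: bytes) -> int:
    # Two's-complement, big-endian decode of up to 16 bytes (the width
    # of a Decimal128 value); shorter inputs are sign-extended by
    # int.from_bytes via the signed flag.
    if len(data) > 16:
        raise ValueError("Decimal128 holds at most 16 bytes")
    return int.from_bytes(data, byteorder="big", signed=True)

print(from_big_endian(b"\x01\x00"))  # 256
print(from_big_endian(b"\xff\xff"))  # -1
```

Parquet stores decimals as big-endian byte arrays of minimal length, which is why a decoder like this must handle inputs shorter than the full 16 bytes.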
[jira] [Created] (ARROW-2584) [JS] Node v10 issues
Brian Hulette created ARROW-2584:

Summary: [JS] Node v10 issues
Key: ARROW-2584
URL: https://issues.apache.org/jira/browse/ARROW-2584
Project: Apache Arrow
Issue Type: Bug
Components: JavaScript
Reporter: Brian Hulette
Assignee: Paul Taylor

Build and tests fail with Node v10. Fix these issues and bump CI to use
Node v10.
[jira] [Created] (ARROW-2583) [Rust] Buffer should be typeless
Andy Grove created ARROW-2583:

Summary: [Rust] Buffer should be typeless
Key: ARROW-2583
URL: https://issues.apache.org/jira/browse/ARROW-2583
Project: Apache Arrow
Issue Type: Improvement
Components: Rust
Reporter: Andy Grove
Fix For: 0.10.0

See the comments in https://github.com/apache/arrow/pull/1971 for background.
The summary is that Buffer should just deal with untyped memory, e.g.
`*const u8`, and all type handling should be moved to the Array layer, e.g.
`BufferArray`. This would be more consistent with the other implementations.
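The layering being proposed -- untyped bytes at the Buffer level, with type interpretation applied only at the Array level -- can be illustrated in Python (rather than Rust) using a raw byte string as the "buffer" and a typed `memoryview` cast as the "array" view over it:

```python
import array

# Untyped buffer: 16 bytes of raw memory with no element type attached,
# analogous to a Buffer over `*const u8` in the Rust proposal.
raw = bytes(array.array("i", [1, 2, 3, 4]))
print(len(raw))  # 16 on platforms where "i" is a 32-bit int

# Typed layer: reinterpret the same bytes as 32-bit signed integers
# without copying, analogous to a typed Array wrapping the buffer.
typed = memoryview(raw).cast("i")
print(list(typed))  # [1, 2, 3, 4]
```

The point of the design is that the same raw buffer could be viewed under any element type, so type-specific logic never leaks into the memory-management layer.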
Re: file-system specification
Hi Martin,

On Wed, 9 May 2018 11:28:15 -0400 Martin Durant wrote:
> I have sketched out a possible start of a Python-wide file-system
> specification
> https://github.com/martindurant/filesystem_spec
>
> This came about from my work on some other (remote) file-system
> implementations for Python, particularly in the context of Dask. Since
> Arrow also cares about both local files and, for example, HDFS, I thought
> that people on this list may have comments and opinions about a possible
> standard that we ought to converge on. I do not think that my suggestions
> so far are necessarily right or even good in many cases, but I want to
> get the conversation going.

Here are some comments:

- API naming: you seem to favour re-using Unix command-line monikers in
  some places, while using more regular verbs or names in other places. I
  think it should be consistent. Since the Unix command line doesn't
  exactly cover the exposed functionality, and since Unix tends to favour
  short cryptic names, I think it's better to use Python-like naming
  (which is also more familiar to non-Unix users). For example, "move" or
  "rename" or "replace" instead of "mv", etc.

- **kwargs parameters: a couple of APIs (`mkdir`, `put`...) allow passing
  arbitrary parameters, which I assume are intended to be backend-specific.
  That makes it difficult to add other optional parameters to those APIs
  in the future. So I'd make the backend-specific directives a single
  (optional) dict parameter rather than **kwargs.

- `invalidate_cache` doesn't state whether it invalidates recursively or
  not (recursively sounds better intuitively?). Also, I think it would be
  more flexible to take a list of paths rather than a single path.

- `du`: the effect of the `deep` parameter isn't obvious to me. I don't
  know what it would mean *not* to recurse here: what is the size of a
  directory if you don't recurse into it?

- `glob` may need a formal definition (are trailing slashes significant
  for directory or symlink resolution? this kind of thing), though you may
  want to keep edge cases backend-specific.

- are `head` and `tail` at all useful? They can be easily recreated using
  a generic `open` facility.

- `read_block` tries to do too much in a single API IMHO, and using `open`
  directly is more flexible anyway.

- if `touch` is intended to emulate the Unix API of the same name, the
  docstring should state "Create empty file or update last modification
  timestamp".

- the information dicts returned by several APIs (`ls`, `info`) need
  standardizing, at least for the non-backend-specific fields.

- if the backend is a networked filesystem with non-trivial latency,
  perhaps the operations deserve to be batched (operate on several paths
  at once), though I will happily defer to your expertise on the topic.

Regards

Antoine.
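The suggestions above can be sketched as a minimal abstract interface. All names here (`AbstractFileSystem`, `MemoryFileSystem`, the `options` parameter) are hypothetical and not part of any existing spec; the sketch just shows Python-style verbs instead of Unix monikers, a single optional `options` dict instead of **kwargs, and an `invalidate_cache` that accepts a list of paths:

```python
from abc import ABC, abstractmethod
from typing import Optional

class AbstractFileSystem(ABC):
    @abstractmethod
    def move(self, src: str, dst: str) -> None:
        """Move/rename a file (Python-style name, rather than 'mv')."""

    @abstractmethod
    def makedir(self, path: str, options: Optional[dict] = None) -> None:
        """Create a directory. Backend-specific directives go in the
        single optional 'options' dict instead of **kwargs, so new
        standard parameters can still be added later."""

    def invalidate_cache(self, paths: Optional[list] = None) -> None:
        """Invalidate cached listings for a list of paths (all if None)."""

class MemoryFileSystem(AbstractFileSystem):
    """Trivial in-memory backend for demonstration."""

    def __init__(self):
        self.files = {}
        self.dirs = set()

    def move(self, src, dst):
        self.files[dst] = self.files.pop(src)

    def makedir(self, path, options=None):
        self.dirs.add(path)

fs = MemoryFileSystem()
fs.files["/a.txt"] = b"data"
fs.move("/a.txt", "/b.txt")
print(sorted(fs.files))  # ['/b.txt']
```

Keeping the backend-specific knobs in one dict also makes the generic signature introspectable, which **kwargs-based APIs are not.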
[jira] [Created] (ARROW-2582) [GLib] Add negate functions for Decimal128
yosuke shiro created ARROW-2582:

Summary: [GLib] Add negate functions for Decimal128
Key: ARROW-2582
URL: https://issues.apache.org/jira/browse/ARROW-2582
Project: Apache Arrow
Issue Type: Improvement
Reporter: yosuke shiro