[jira] [Resolved] (ARROW-6777) [GLib][CI] Unpin gobject-introspection gem
[ https://issues.apache.org/jira/browse/ARROW-6777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kouhei Sutou resolved ARROW-6777. - Fix Version/s: 1.0.0 Resolution: Fixed Issue resolved by pull request 5572 [https://github.com/apache/arrow/pull/5572] > [GLib][CI] Unpin gobject-introspection gem > -- > > Key: ARROW-6777 > URL: https://issues.apache.org/jira/browse/ARROW-6777 > Project: Apache Arrow > Issue Type: Improvement > Components: GLib >Reporter: Kouhei Sutou >Assignee: Kouhei Sutou >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6777) [GLib][CI] Unpin gobject-introspection gem
[ https://issues.apache.org/jira/browse/ARROW-6777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6777: -- Labels: pull-request-available (was: ) > [GLib][CI] Unpin gobject-introspection gem > -- > > Key: ARROW-6777 > URL: https://issues.apache.org/jira/browse/ARROW-6777 > Project: Apache Arrow > Issue Type: Improvement > Components: GLib >Reporter: Kouhei Sutou >Assignee: Kouhei Sutou >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-6777) [GLib][CI] Unpin gobject-introspection gem
Kouhei Sutou created ARROW-6777: --- Summary: [GLib][CI] Unpin gobject-introspection gem Key: ARROW-6777 URL: https://issues.apache.org/jira/browse/ARROW-6777 Project: Apache Arrow Issue Type: Improvement Components: GLib Reporter: Kouhei Sutou Assignee: Kouhei Sutou -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6774) [Rust] Reading parquet file is slow
[ https://issues.apache.org/jira/browse/ARROW-6774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated ARROW-6774: --- Summary: [Rust] Reading parquet file is slow (was: Reading parquet file is slow) > [Rust] Reading parquet file is slow > --- > > Key: ARROW-6774 > URL: https://issues.apache.org/jira/browse/ARROW-6774 > Project: Apache Arrow > Issue Type: Improvement > Components: Rust >Affects Versions: 0.15.0 >Reporter: Adam Lippai >Priority: Major > > Using the example at > [https://github.com/apache/arrow/tree/master/rust/parquet] is slow. > The snippet > {code:none} > let reader = SerializedFileReader::new(file).unwrap(); > let mut iter = reader.get_row_iter(None).unwrap(); > let start = Instant::now(); > while let Some(record) = iter.next() {} > let duration = start.elapsed(); > println!("{:?}", duration); > {code} > Runs for 17sec for a ~160MB parquet file. > If there is a more effective way to load a parquet file, it would be nice to > add it to the readme. > P.S.: My goal is to construct an ndarray from it, I'd be happy for any tips. -- This message was sent by Atlassian Jira (v8.3.4#803005)
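[Editorial note: the row iterator in the snippet above materializes a generic record for every row, which is typically the dominant cost; columnar access avoids that per-value overhead. A rough illustration of the two access patterns in plain Python (the `columns` dict is only a stand-in for already-decoded Parquet column data, not the parquet crate's actual API):]

```python
def sum_by_rows(columns):
    # Row-wise pattern: assemble a record per row, then pick one field.
    # The per-row dict construction mirrors the overhead of a row iterator.
    n = len(columns["x"])
    total = 0
    for i in range(n):
        record = {name: col[i] for name, col in columns.items()}
        total += record["x"]
    return total

def sum_by_column(columns):
    # Columnar pattern: operate on one contiguous column directly.
    return sum(columns["x"])

columns = {"x": list(range(100_000)), "y": [0.5] * 100_000}
assert sum_by_rows(columns) == sum_by_column(columns)
```

Both functions compute the same result, but the columnar path does no per-row work, which is why column-oriented readers are usually the faster way to load Parquet.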
[jira] [Updated] (ARROW-6760) [C++] JSON: improve error message when column changed type
[ https://issues.apache.org/jira/browse/ARROW-6760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6760: -- Labels: pull-request-available (was: ) > [C++] JSON: improve error message when column changed type > -- > > Key: ARROW-6760 > URL: https://issues.apache.org/jira/browse/ARROW-6760 > Project: Apache Arrow > Issue Type: Bug > Components: Python >Reporter: harikrishnan >Assignee: Ben Kietzman >Priority: Major > Labels: pull-request-available > Attachments: dummy.jl > > > When a column accidentally changes type in a JSON file (which is not > supported), it would be nice to get the column name that gives this problem > in the error message. > --- > I am trying to parse a simple json file. While doing so, am getting the error > {{JSON parse error: A column changed from string to number}} > {code} > from pyarrow import json > r = json.read_json('dummy.jl') > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-6776) [Python] Need a lite version of pyarrow
[ https://issues.apache.org/jira/browse/ARROW-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16943279#comment-16943279 ] Wes McKinney commented on ARROW-6776: - Our wheel build scripts can be found at https://github.com/apache/arrow/tree/master/python/manylinux1 It's easy to build your own wheels; just follow the README. We ship many optional components that you can turn off to make smaller wheels. The duplicated shared library issue is https://issues.apache.org/jira/browse/ARROW-5082. You are welcome to try to resolve this. My team and I have decided not to spend time on wheel-related issues anymore, but other Arrow community members are welcome to do what they wish. > [Python] Need a lite version of pyarrow > --- > > Key: ARROW-6776 > URL: https://issues.apache.org/jira/browse/ARROW-6776 > Project: Apache Arrow > Issue Type: Improvement > Components: Python >Affects Versions: 0.14.1 >Reporter: Haowei Yu >Priority: Major > > Currently I am building a library package on top of pyarrow, so I include > pyarrow as a dependency and ship it to our customers. However, when our > customers install our package, it also installs pyarrow and pyarrow's > dependency (numpy), and the dependency size is huge. > {code:bash} > (py36env) [hyu@c6x64-hyu-newuser-final-clone connector]$ ls -l --block-size=M > /home/hyu/py36env/lib/python3.6/site-packages/pyarrow/ > total 186M > {code} > numpy is around 80 MB, so the total is more than 250 MB. > Our customers want to bundle all dependencies and run the code inside AWS > Lambda, but they hit the size limit and cannot run the code. > Looking into pyarrow, I saw that multiple .so files are shipped both with and > without a version suffix; I wonder if you can remove one of them (either with > or without the suffix), which would reduce the package size by at least half.
> Further, our library just wants to use IPC and read data as record batches; I > don't need Arrow Flight at all (which is the biggest .so file, at around > 100 MB). I wonder if you could publish a lite version of pyarrow so that I can > specify the lite version as the dependency. Or maybe I need to build my own > lite version and push it to PyPI. However, that approach causes further > problems if our customer is using the "fat" version of pyarrow, unless the > lite version of pyarrow uses a different namespace. > Another alternative is for me to bundle pyarrow with our library (copy the > whole directory into a vendored namespace) and ship it to our customers > without specifying pyarrow as a dependency. The advantage of this approach is > that I can build pyarrow with whatever options/sub-modules/libraries I need. > However, I tried a lot but failed, because pyarrow uses absolute imports and > fails to import its modules from the new location. > Any insight into how I should resolve this issue? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6776) [Python] Need a lite version of pyarrow
[ https://issues.apache.org/jira/browse/ARROW-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haowei Yu updated ARROW-6776: - Description: Currently I am building a library packages on top of pyarrow, so I include pyarrow as a dependency and ship it to our customer. However, when our customer installed our packages, it will also install pyarrow and pyarrow's dependency (numpy). However the dependency size is huge. {code:bash} (py36env) [hyu@c6x64-hyu-newuser-final-clone connector]$ ls -l --block-size=M /home/hyu/py36env/lib/python3.6/site-packages/pyarrow/ total 186M {code} And numpy is around 80MB. Total is more than 250 MB. Our customer want to bundle all dependency and run the code inside AWS Lambda, however they hit the size limit and failed to run the code. Looking into the pyarrow, I saw multiple .so files are shipped both with and without version suffix, I wonder if you can remove the one of them (either with or without suffix), it will at least reduce the package size by half. Further, our library just want to use IPC and read data as record batch, I don't need arrow flight at all (which is the biggest .so file and takes around 100 MB). I wonder if you can push a lite version of the pyarrow so that I can specify lite version as the dependency. Or maybe I need to build my own lite version and push it pypi. However, this approach cause further problem if our customer is using the "fat" version of pyarrow unless you the change the namespace of lite version of pyarrow. Another alternative is that I bundle the pyarrow with our library ( copy the whole directory into vendored namespace) and ship it to our customer without specifying pyarrow as a dependency. The advantage of this one is that I can build pyarrow with whatever option/sub-module/libraries I need. However, I tried a lot but failed because pyarrow use absolute import and it will fail to import the script in the new location. Any insight how I should resolve this issue? 
was: Currently I am building a library packages on top of pyarrow, so I include pyarrow as a dependency and ship it to our customer. However, when our customer installed our packages, it will also install pyarrow and pyarrow's dependency (numpy). However the dependency size is huge. {code:bash} (py36env) [hyu@c6x64-hyu-newuser-final-clone connector]$ ls -l --block-size=M /home/hyu/py36env/lib/python3.6/site-packages/pyarrow/ total 186M {code} And numpy is around 80MB. Total is more than 250 MB. Our customer want to bundle all dependency and run the code inside AWS Lambda, however they hit the size limit and failed to run the code. Looking into the pyarrow, I saw multiple .so files are shipped both with and without version suffix, I wonder if you can remove the one of them (either with or without suffix), it will at least reduce the package size by half. Further, our library just want to use IPC and read data as record batch, I don't need arrow flight at all (which is the biggest .so file and takes around 100 MB). I wonder if you can push a lite version of the pyarrow so that I can specify lite version as the dependency. Or maybe I need to build my own lite version and push it pypi. However, this approach cause further problem if our customer is using the "fat" version of pyarrow unless you the change the namespace of lite version of pyarrow. Another alternative is that I bundle the pyarrow with our library ( copy the whole directory into vendored namespace) and ship it to our customer without specifying pyarrow as a dependency. The advantage of this one is that I can build pyarrow with whatever option/sub-module/libraries I need. However, I tried a lot but failed because pyarrow use absolute import and it will fail to import the script in the new location. Any insight how I should resolve this issue? 
> [Python] Need a lite version of pyarrow > --- > > Key: ARROW-6776 > URL: https://issues.apache.org/jira/browse/ARROW-6776 > Project: Apache Arrow > Issue Type: Improvement > Components: Python >Affects Versions: 0.14.1 >Reporter: Haowei Yu >Priority: Major > > Currently I am building a library packages on top of pyarrow, so I include > pyarrow as a dependency and ship it to our customer. However, when our > customer installed our packages, it will also install pyarrow and pyarrow's > dependency (numpy). However the dependency size is huge. > {code:bash} > (py36env) [hyu@c6x64-hyu-newuser-final-clone connector]$ ls -l --block-size=M > /home/hyu/py36env/lib/python3.6/site-packages/pyarrow/ > total 186M > {code} > And numpy is around 80MB. Total is more than 250 MB. > Our customer want to bundle all dependency and run the code inside AWS > Lambda, however they hit the size limit and failed to run the code. >
[jira] [Created] (ARROW-6776) [Python] Need a lite version of pyarrow
Haowei Yu created ARROW-6776: Summary: [Python] Need a lite version of pyarrow Key: ARROW-6776 URL: https://issues.apache.org/jira/browse/ARROW-6776 Project: Apache Arrow Issue Type: Improvement Components: Python Affects Versions: 0.14.1 Reporter: Haowei Yu Currently I am building a library package on top of pyarrow, so I include pyarrow as a dependency and ship it to our customers. However, when our customers install our package, it also installs pyarrow and pyarrow's dependency (numpy), and the dependency size is huge. {code:bash} (py36env) [hyu@c6x64-hyu-newuser-final-clone connector]$ ls -l --block-size=M /home/hyu/py36env/lib/python3.6/site-packages/pyarrow/ total 186M {code} numpy is around 80 MB, so the total is more than 250 MB. Our customers want to bundle all dependencies and run the code inside AWS Lambda, but they hit the size limit and cannot run the code. Looking into pyarrow, I saw that multiple .so files are shipped both with and without a version suffix; I wonder if you can remove one of them (either with or without the suffix), which would reduce the package size by at least half. Further, our library just wants to use IPC and read data as record batches; I don't need Arrow Flight at all (which is the biggest .so file, at around 100 MB). I wonder if you could publish a lite version of pyarrow so that I can specify the lite version as the dependency. Or maybe I need to build my own lite version and push it to PyPI. However, that approach causes further problems if our customer is using the "fat" version of pyarrow, unless the lite version of pyarrow uses a different namespace. Another alternative is for me to bundle pyarrow with our library (copy the whole directory into a vendored namespace) and ship it to our customers without specifying pyarrow as a dependency. The advantage of this approach is that I can build pyarrow with whatever options/sub-modules/libraries I need.
However, I tried a lot but failed, because pyarrow uses absolute imports and fails to import its modules from the new location. Any insight into how I should resolve this issue? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-6761) [Rust] Travis CI builds not respecting rust-toolchain
[ https://issues.apache.org/jira/browse/ARROW-6761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paddy Horan resolved ARROW-6761. Resolution: Fixed Issue resolved by pull request 5561 [https://github.com/apache/arrow/pull/5561] > [Rust] Travis CI builds not respecting rust-toolchain > - > > Key: ARROW-6761 > URL: https://issues.apache.org/jira/browse/ARROW-6761 > Project: Apache Arrow > Issue Type: Bug > Components: Rust >Affects Versions: 1.0.0 >Reporter: Andy Grove >Assignee: Andy Grove >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 2h 10m > Remaining Estimate: 0h > > Travis builds recently started failing with a Rust ICE (Internal Compiler > Error) which has been reported to the Rust compiler team > ([https://github.com/rust-lang/rust/issues/64908]). > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-6775) Proposal for several Array utility functions
Zhuo Peng created ARROW-6775: Summary: Proposal for several Array utility functions Key: ARROW-6775 URL: https://issues.apache.org/jira/browse/ARROW-6775 Project: Apache Arrow Issue Type: Wish Reporter: Zhuo Peng Hi, We developed several utilities that compute / access certain properties of Arrays, and we wonder whether it makes sense to get them into upstream (into both the C++ API and pyarrow) and, assuming yes, where the best place to put them is. Maybe I have overlooked existing APIs that already do the same; in that case, please point them out. 1/ ListLengthFromListArray(ListArray&) Returns the lengths of the lists in a ListArray, as an Int32Array (or Int64Array for large lists). For example: [[1, 2, 3], [], None] => [3, 0, 0] (or [3, 0, None], but we hope the returned array can be converted to numpy) 2/ GetBinaryArrayTotalByteSize(BinaryArray&) Returns the total byte size of a BinaryArray (basically offset[len - 1] - offset[0]). Alternatively, a BinaryArray::Flatten() -> Uint8Array would work. 3/ GetArrayNullBitmapAsByteArray(Array&) Returns the array's null bitmap as a UInt8Array (which can be efficiently converted to a bool numpy array). 4/ GetFlattenedArrayParentIndices(ListArray&) Makes an int32 array of the same length as the flattened ListArray, where returned_array[i] == j means the i-th element in the flattened ListArray came from the j-th list in the ListArray. For example: [[1,2,3], [], None, [4,5]] => [0, 0, 0, 3, 3] -- This message was sent by Atlassian Jira (v8.3.4#803005)
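[Editorial note: utilities 1/ and 4/ from the proposal can be sketched in pure Python over the ListArray representation (a value-offsets buffer plus a validity mask). The helper names below just mirror the proposal; they are not an existing pyarrow API.]

```python
def list_lengths(offsets, validity=None):
    # ListLengthFromListArray: length of each list; a null slot yields 0
    # (the [3, 0, 0] variant from the proposal rather than [3, 0, None]).
    return [
        0 if (validity is not None and not validity[i])
        else offsets[i + 1] - offsets[i]
        for i in range(len(offsets) - 1)
    ]

def flattened_parent_indices(offsets):
    # GetFlattenedArrayParentIndices: for element i of the flattened values,
    # the index of the list it came from.
    out = []
    for parent, (start, end) in enumerate(zip(offsets, offsets[1:])):
        out.extend([parent] * (end - start))
    return out

# [[1, 2, 3], [], None] has offsets [0, 3, 3, 3] and validity [1, 1, 0]
assert list_lengths([0, 3, 3, 3], [True, True, False]) == [3, 0, 0]
# [[1, 2, 3], [], None, [4, 5]] has offsets [0, 3, 3, 3, 5]
assert flattened_parent_indices([0, 3, 3, 3, 5]) == [0, 0, 0, 3, 3]
```

Both helpers run in time linear in the output size, which is the behavior one would expect from C++ implementations operating directly on the offsets buffer.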
[jira] [Created] (ARROW-6774) Reading parquet file is slow
Adam Lippai created ARROW-6774: -- Summary: Reading parquet file is slow Key: ARROW-6774 URL: https://issues.apache.org/jira/browse/ARROW-6774 Project: Apache Arrow Issue Type: Improvement Components: Rust Affects Versions: 0.15.0 Reporter: Adam Lippai Using the example at [https://github.com/apache/arrow/tree/master/rust/parquet] is slow. The snippet {code:none} let reader = SerializedFileReader::new(file).unwrap(); let mut iter = reader.get_row_iter(None).unwrap(); let start = Instant::now(); while let Some(record) = iter.next() {} let duration = start.elapsed(); println!("{:?}", duration); {code} runs for 17 seconds on a ~160 MB parquet file. If there is a more efficient way to load a parquet file, it would be nice to add it to the README. P.S.: My goal is to construct an ndarray from it; I'd be happy for any tips. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6766) [Python] libarrow_python..dylib does not exist
[ https://issues.apache.org/jira/browse/ARROW-6766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tarek Allam updated ARROW-6766: --- Description: After following the instructions found on the developer guides for Python, I was able to build fine by using:

{code:bash}
# Assuming immediately prior one has run:
# $ git clone g...@github.com:apache/arrow.git
# $ conda create -y -n pyarrow-dev -c conda-forge \
#     --file arrow/ci/conda_env_unix.yml \
#     --file arrow/ci/conda_env_cpp.yml \
#     --file arrow/ci/conda_env_python.yml \
#     compilers python=3.7
# $ conda activate pyarrow-dev
# $ brew update && brew bundle --file=arrow/cpp/Brewfile

export ARROW_HOME=$(pwd)/arrow/dist
export LD_LIBRARY_PATH=$(pwd)/arrow/dist/lib:$LD_LIBRARY_PATH
export CC=`which clang`
export CXX=`which clang++`

mkdir arrow/cpp/build
pushd arrow/cpp/build
cmake -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
      -DCMAKE_INSTALL_LIBDIR=lib \
      -DARROW_FLIGHT=OFF \
      -DARROW_GANDIVA=OFF \
      -DARROW_ORC=ON \
      -DARROW_PARQUET=ON \
      -DARROW_PYTHON=ON \
      -DARROW_PLASMA=ON \
      -DARROW_BUILD_TESTS=ON \
      ..
make -j4
make install
popd
{code}

But when I run:

{code:bash}
pushd arrow/python
export PYARROW_WITH_FLIGHT=0
export PYARROW_WITH_GANDIVA=0
export PYARROW_WITH_ORC=1
export PYARROW_WITH_PARQUET=1
python setup.py build_ext --inplace
popd
{code}

I get the following errors:

{code}
-- Build output directory: /Users/tallamjr/Github/arrow/python/build/temp.macosx-10.9-x86_64-3.7/release
-- Found the Arrow core library: /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow.dylib
-- Found the Arrow Python library: /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow_python.dylib
CMake Error: File /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow..dylib does not exist.
...
CMake Error: File /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow..dylib does not exist.
CMake Error at CMakeLists.txt:230 (configure_file):
  configure_file Problem configuring file
Call Stack (most recent call first):
  CMakeLists.txt:315 (bundle_arrow_lib)
CMake Error: File /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow_python..dylib does not exist.
CMake Error at CMakeLists.txt:226 (configure_file):
  configure_file Problem configuring file
Call Stack (most recent call first):
  CMakeLists.txt:320 (bundle_arrow_lib)
CMake Error: File /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow_python..dylib does not exist.
CMake Error at CMakeLists.txt:230 (configure_file):
  configure_file Problem configuring file
Call Stack (most recent call first):
  CMakeLists.txt:320 (bundle_arrow_lib)
{code}

What is quite strange is that the libraries seem to indeed be there, but they have an additional version component such as `libarrow.15.dylib`, e.g.:

{code}
$ ls -l libarrow_python.15.dylib && echo $PWD
lrwxr-xr-x 1 tallamjr staff 28 Oct 2 14:02 libarrow_python.15.dylib -> libarrow_python.15.0.0.dylib
/Users/tallamjr/github/arrow/dist/lib
{code}

I am not exactly sure what the issue here is, but it appears that the version is not captured as a variable used by CMake. I have run the same setup on `master` (`7d18c1c`) and on `apache-arrow-0.14.0` (`a591d76`), which both seem to produce the same errors. Apologies if this is not quite the format for JIRA issues here, or perhaps it's not the correct platform for this; I'm very new to the project and to contributing to Apache in general. Thanks

was: After following the instructions found on the developer guides for Python, I was able to build fine by using:

{code:bash}
# Assuming immediately prior one has run:
# $ git clone g...@github.com:apache/arrow.git
# $ conda create -y -n pyarrow-dev -c conda-forge \
#     --file arrow/ci/conda_env_unix.yml \
#     --file arrow/ci/conda_env_cpp.yml \
#     --file arrow/ci/conda_env_python.yml \
#     compilers python=3.7
# $ conda activate pyarrow-dev
# $ brew update && brew bundle --file=arrow/cpp/Brewfile

export ARROW_HOME=$(pwd)/arrow/dist
export LD_LIBRARY_PATH=$(pwd)/arrow/dist/lib:$LD_LIBRARY_PATH
export CC=`which clang`
export CXX=`which clang++`

mkdir arrow/cpp/build
pushd arrow/cpp/build
cmake -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
      -DCMAKE_INSTALL_LIBDIR=lib \
      -DARROW_FLIGHT=OFF \
      -DARROW_GANDIVA=OFF \
      -DARROW_ORC=ON \
      -DARROW_PARQUET=ON \
      -DARROW_PYTHON=ON \
      -DARROW_PLASMA=ON \
      -DARROW_BUILD_TESTS=ON \
      ..
make -j4
make install
popd
{code}

But when I run:

{code:bash}
pushd arrow/python
export PYARROW_WITH_FLIGHT=1
export PYARROW_WITH_GANDIVA=1
export PYARROW_WITH_ORC=1
export PYARROW_WITH_PARQUET=1
python setup.py build_ext --inplace
popd
{code}

I get the following errors: -- Build output
[jira] [Assigned] (ARROW-6773) [C++] Filter kernel returns invalid data when filtering with an Array slice
[ https://issues.apache.org/jira/browse/ARROW-6773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson reassigned ARROW-6773: -- Assignee: Neal Richardson (was: Ben Kietzman) > [C++] Filter kernel returns invalid data when filtering with an Array slice > --- > > Key: ARROW-6773 > URL: https://issues.apache.org/jira/browse/ARROW-6773 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Reporter: Neal Richardson >Assignee: Neal Richardson >Priority: Major > Fix For: 1.0.0 > > > See ARROW-3808. This failing test reproduces the issue: > {code:java} > --- a/cpp/src/arrow/compute/kernels/filter_test.cc > +++ b/cpp/src/arrow/compute/kernels/filter_test.cc > @@ -151,6 +151,12 @@ TYPED_TEST(TestFilterKernelWithNumeric, FilterNumeric) { >this->AssertFilter("[7, 8, 9]", "[null, 1, 0]", "[null, 8]"); >this->AssertFilter("[7, 8, 9]", "[1, null, 1]", "[7, null, 9]"); > > + this->AssertFilterArrays( > +ArrayFromJSON(this->type_singleton(), "[7, 8, 9]"), > +ArrayFromJSON(boolean(), "[0, 1, 1, 1, 0, 1]")->Slice(3, 3), > +ArrayFromJSON(this->type_singleton(), "[7, 9]") > + ); > + > {code} > {code:java} > arrow/cpp/src/arrow/testing/gtest_util.cc:82: Failure > Failed > @@ -2, +2 @@ > +0 > [ FAILED ] TestFilterKernelWithNumeric/9.FilterNumeric, where TypeParam = > arrow::DoubleType (0 ms) > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-6766) [Python] libarrow_python..dylib does not exist
[ https://issues.apache.org/jira/browse/ARROW-6766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16943073#comment-16943073 ] Wes McKinney commented on ARROW-6766: - Looks like your build isn't picking up the library SO/ABI version correctly. I'm not sure what's wrong, someone else may have an idea > [Python] libarrow_python..dylib does not exist > -- > > Key: ARROW-6766 > URL: https://issues.apache.org/jira/browse/ARROW-6766 > Project: Apache Arrow > Issue Type: Bug > Components: Python >Affects Versions: 0.14.0, 0.15.0 >Reporter: Tarek Allam >Priority: Major > > {{After following the instructions found on the developer guides for Python, > I was}} > {{able to build fine by using:}} > {{# Assuming immediately prior one has run:}} > {{# $ git clone g...@github.com:apache/arrow.git}} > # $ conda create -y -n pyarrow-dev -c conda-forge > # --file arrow/ci/conda_env_unix.yml > # --file arrow/ci/conda_env_cpp.yml > # --file arrow/ci/conda_env_python.yml > # compilers > {{# python=3.7}} > {{# $ conda activate pyarrow-dev}} > {{# $ brew update && brew bundle --file=arrow/cpp/Brewfile}}{{export > ARROW_HOME=$(pwd)/arrow/dist}} > {{export LD_LIBRARY_PATH=$(pwd)/arrow/dist/lib:$LD_LIBRARY_PATH}}{{export > CC=`which clang`}} > {{export CXX=`which clang++`}}{{mkdir arrow/cpp/build \}} > pushd arrow/cpp/build \ > cmake -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \ > -DCMAKE_INSTALL_LIBDIR=lib \ > -DARROW_FLIGHT=OFF \ > -DARROW_GANDIVA=OFF \ > -DARROW_ORC=ON \ > -DARROW_PARQUET=ON \ > -DARROW_PYTHON=ON \ > -DARROW_PLASMA=ON \ > -DARROW_BUILD_TESTS=ON \ > .. 
> {{make -j4}} > {{make install}} > {{popd}} > But when I run: > {{pushd arrow/python}} > {{export PYARROW_WITH_FLIGHT=1}} > {{export PYARROW_WITH_GANDIVA=1}} > {{export PYARROW_WITH_ORC=1}} > {{export PYARROW_WITH_PARQUET=1}} > {{python setup.py build_ext --inplace}} > {{popd}} > I get the following errors: > {{-- Build output directory: > /Users/tallamjr/Github/arrow/python/build/temp.macosx-10.9-x86_64-3.7/release}} > {{-- Found the Arrow core library: > /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow.dylib}} > {{-- Found the Arrow Python library: > /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow_python.dylib}} > {{CMake Error: File > /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow..dylib does not > exist.}}{{...}}{{CMake Error: File > /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow..dylib does not exist.}} > {{CMake Error at CMakeLists.txt:230 (configure_file):}} > \{{ configure_file Problem configuring file}} > {{Call Stack (most recent call first):}} > \{{ CMakeLists.txt:315 (bundle_arrow_lib)}} > {{CMake Error: File > /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow_python..dylib does not > exist.}} > {{CMake Error at CMakeLists.txt:226 (configure_file):}} > \{{ configure_file Problem configuring file}} > {{Call Stack (most recent call first):}} > \{{ CMakeLists.txt:320 (bundle_arrow_lib)}} > {{CMake Error: File > /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow_python..dylib does not > exist.}} > {{CMake Error at CMakeLists.txt:230 (configure_file):}} > \{{ configure_file Problem configuring file}} > {{Call Stack (most recent call first):}} > \{{ CMakeLists.txt:320 (bundle_arrow_lib)}} > > What is quite strange is that the libraries seem to indeed be there but they > have an addition component such as `libarrow.15.dylib` .e.g: > {{$ ls -l libarrow_python.15.dylib && echo $PWD}} > {{lrwxr-xr-x 1 tallamjr staff 28 Oct 2 14:02 libarrow_python.15.dylib ->}} > {{libarrow_python.15.0.0.dylib}} > {{/Users/tallamjr/github/arrow/dist/lib}} > I guess 
I am not exactly sure what the issue here is, but it appears that > the version is not captured as a variable used by CMake. I have run the > same setup on `master` (`7d18c1c`) and on `apache-arrow-0.14.0` (`a591d76`), > which both seem to produce the same errors. > Apologies if this is not quite the format for JIRA issues here, or perhaps > it's not the correct platform for this; I'm very new to the project and to > contributing to Apache in general. Thanks > -- This message was sent by Atlassian Jira (v8.3.4#803005)
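[Editorial note: the diagnosis above matches the error text: the bundling step splices an SO/ABI version string into the library filename, so an empty (undetected) version string produces the double dot in `libarrow..dylib`. A hypothetical one-line reconstruction of that filename composition, not the actual CMake code:]

```python
def bundled_lib_name(base, so_version, suffix="dylib"):
    # Hypothetical mirror of how a bundling step composes the filename
    # from the detected SO/ABI version.
    return f"{base}.{so_version}.{suffix}"

# With the version detected correctly:
assert bundled_lib_name("libarrow", "15") == "libarrow.15.dylib"
# With an empty version string, the double dot from the report appears:
assert bundled_lib_name("libarrow_python", "") == "libarrow_python..dylib"
```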
[jira] [Created] (ARROW-6773) [C++] Filter kernel returns invalid data when filtering with an Array slice
Neal Richardson created ARROW-6773: -- Summary: [C++] Filter kernel returns invalid data when filtering with an Array slice Key: ARROW-6773 URL: https://issues.apache.org/jira/browse/ARROW-6773 Project: Apache Arrow Issue Type: Bug Components: C++ Reporter: Neal Richardson Assignee: Ben Kietzman Fix For: 1.0.0 See ARROW-3808. This failing test reproduces the issue: {code:java} --- a/cpp/src/arrow/compute/kernels/filter_test.cc +++ b/cpp/src/arrow/compute/kernels/filter_test.cc @@ -151,6 +151,12 @@ TYPED_TEST(TestFilterKernelWithNumeric, FilterNumeric) { this->AssertFilter("[7, 8, 9]", "[null, 1, 0]", "[null, 8]"); this->AssertFilter("[7, 8, 9]", "[1, null, 1]", "[7, null, 9]"); + this->AssertFilterArrays( +ArrayFromJSON(this->type_singleton(), "[7, 8, 9]"), +ArrayFromJSON(boolean(), "[0, 1, 1, 1, 0, 1]")->Slice(3, 3), +ArrayFromJSON(this->type_singleton(), "[7, 9]") + ); + {code} {code:java} arrow/cpp/src/arrow/testing/gtest_util.cc:82: Failure Failed @@ -2, +2 @@ +0 [ FAILED ] TestFilterKernelWithNumeric/9.FilterNumeric, where TypeParam = arrow::DoubleType (0 ms) {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
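[Editorial note: the failure is consistent with the filter kernel reading the selection buffer from position 0 and ignoring the slice's offset. A minimal pure-Python sketch of the offset-aware behavior the failing test expects; plain lists stand in for Arrow buffers:]

```python
def filter_array(values, mask_buffer, mask_offset=0):
    # An Array slice shares its parent's buffer and records an offset;
    # a correct kernel must start reading the buffer at that offset.
    mask = mask_buffer[mask_offset : mask_offset + len(values)]
    return [v for v, keep in zip(values, mask) if keep]

# Mirrors the failing test: [0, 1, 1, 1, 0, 1] sliced at (3, 3) is [1, 0, 1].
assert filter_array([7, 8, 9], [0, 1, 1, 1, 0, 1], mask_offset=3) == [7, 9]
```

Dropping the `mask_offset` indexing reproduces the bug: the kernel would filter against `[0, 1, 1]` and return `[8, 9]` instead of `[7, 9]`.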
[jira] [Updated] (ARROW-6766) [Python] libarrow_python..dylib does not exist
[ https://issues.apache.org/jira/browse/ARROW-6766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-6766: Priority: Major (was: Blocker) > [Python] libarrow_python..dylib does not exist > -- > > Key: ARROW-6766 > URL: https://issues.apache.org/jira/browse/ARROW-6766 > Project: Apache Arrow > Issue Type: Bug > Components: Python >Affects Versions: 0.14.0, 0.15.0 >Reporter: Tarek Allam >Priority: Major > > {{After following the instructions found on the developer guides for Python, > I was}} > {{able to build fine by using:}} > {{# Assuming immediately prior one has run:}} > {{# $ git clone g...@github.com:apache/arrow.git}} > # $ conda create -y -n pyarrow-dev -c conda-forge > # --file arrow/ci/conda_env_unix.yml > # --file arrow/ci/conda_env_cpp.yml > # --file arrow/ci/conda_env_python.yml > # compilers > {{# python=3.7}} > {{# $ conda activate pyarrow-dev}} > {{# $ brew update && brew bundle --file=arrow/cpp/Brewfile}}{{export > ARROW_HOME=$(pwd)/arrow/dist}} > {{export LD_LIBRARY_PATH=$(pwd)/arrow/dist/lib:$LD_LIBRARY_PATH}}{{export > CC=`which clang`}} > {{export CXX=`which clang++`}}{{mkdir arrow/cpp/build \}} > pushd arrow/cpp/build \ > cmake -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \ > -DCMAKE_INSTALL_LIBDIR=lib \ > -DARROW_FLIGHT=OFF \ > -DARROW_GANDIVA=OFF \ > -DARROW_ORC=ON \ > -DARROW_PARQUET=ON \ > -DARROW_PYTHON=ON \ > -DARROW_PLASMA=ON \ > -DARROW_BUILD_TESTS=ON \ > .. 
> {{make -j4}} > {{make install}} > {{popd}} > But when I run: > {{pushd arrow/python}} > {{export PYARROW_WITH_FLIGHT=1}} > {{export PYARROW_WITH_GANDIVA=1}} > {{export PYARROW_WITH_ORC=1}} > {{export PYARROW_WITH_PARQUET=1}} > {{python setup.py build_ext --inplace}} > {{popd}} > I get the following errors: > {{-- Build output directory: > /Users/tallamjr/Github/arrow/python/build/temp.macosx-10.9-x86_64-3.7/release}} > {{-- Found the Arrow core library: > /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow.dylib}} > {{-- Found the Arrow Python library: > /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow_python.dylib}} > {{CMake Error: File > /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow..dylib does not > exist.}}{{...}}{{CMake Error: File > /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow..dylib does not exist.}} > {{CMake Error at CMakeLists.txt:230 (configure_file):}} > \{{ configure_file Problem configuring file}} > {{Call Stack (most recent call first):}} > \{{ CMakeLists.txt:315 (bundle_arrow_lib)}} > {{CMake Error: File > /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow_python..dylib does not > exist.}} > {{CMake Error at CMakeLists.txt:226 (configure_file):}} > \{{ configure_file Problem configuring file}} > {{Call Stack (most recent call first):}} > \{{ CMakeLists.txt:320 (bundle_arrow_lib)}} > {{CMake Error: File > /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow_python..dylib does not > exist.}} > {{CMake Error at CMakeLists.txt:230 (configure_file):}} > \{{ configure_file Problem configuring file}} > {{Call Stack (most recent call first):}} > \{{ CMakeLists.txt:320 (bundle_arrow_lib)}} > > What is quite strange is that the libraries seem to indeed be there but they > have an addition component such as `libarrow.15.dylib` .e.g: > {{$ ls -l libarrow_python.15.dylib && echo $PWD}} > {{lrwxr-xr-x 1 tallamjr staff 28 Oct 2 14:02 libarrow_python.15.dylib ->}} > {{libarrow_python.15.0.0.dylib}} > {{/Users/tallamjr/github/arrow/dist/lib}} > I guess 
I am not exactly sure what the issue here is but it appears to be that > the version is not captured as a variable that is used by CMAKE? I have run > the > same setup on `master` (`7d18c1c`) and on `apache-arrow-0.14.0` (`a591d76`) > which both seem to produce same errors. > Apologies if this is not quite the format for JIRA issues here or perhaps if > it's not the correct platform for this, I'm very new to the project and > contributing to apache in general. Thanks > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6581) [C++] Fix fuzzit job submission
[ https://issues.apache.org/jira/browse/ARROW-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-6581: --- Summary: [C++] Fix fuzzit job submission (was: [C++] Fuzzing job broken) > [C++] Fix fuzzit job submission > --- > > Key: ARROW-6581 > URL: https://issues.apache.org/jira/browse/ARROW-6581 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Reporter: Antoine Pitrou >Assignee: Antoine Pitrou >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 4h 40m > Remaining Estimate: 0h > > See [https://circleci.com/gh/ursa-labs/crossbow/2978] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-6581) [C++] Fuzzing job broken
[ https://issues.apache.org/jira/browse/ARROW-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs resolved ARROW-6581. Fix Version/s: 1.0.0 Resolution: Fixed Issue resolved by pull request 5407 [https://github.com/apache/arrow/pull/5407] > [C++] Fuzzing job broken > > > Key: ARROW-6581 > URL: https://issues.apache.org/jira/browse/ARROW-6581 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Reporter: Antoine Pitrou >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 4h 40m > Remaining Estimate: 0h > > See [https://circleci.com/gh/ursa-labs/crossbow/2978] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-6581) [C++] Fuzzing job broken
[ https://issues.apache.org/jira/browse/ARROW-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs reassigned ARROW-6581: -- Assignee: Antoine Pitrou > [C++] Fuzzing job broken > > > Key: ARROW-6581 > URL: https://issues.apache.org/jira/browse/ARROW-6581 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Reporter: Antoine Pitrou >Assignee: Antoine Pitrou >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 4h 40m > Remaining Estimate: 0h > > See [https://circleci.com/gh/ursa-labs/crossbow/2978] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-6759) [JS] Run less comprehensive every-commit build, relegate multi-target builds perhaps to nightlies
[ https://issues.apache.org/jira/browse/ARROW-6759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16943014#comment-16943014 ] Paul Taylor commented on ARROW-6759: Yeah no sweat, we can change the `ci/travis_script_js.sh` build and test commands to only test the UMD builds. Historically these have the most issues since they're minified, so if they pass everything should pass: {code:bash} npm run build -- -m umd -t es5 -t es2015 -t esnext npm test -- -m umd -t es5 -t es2015 -t esnext {code} > [JS] Run less comprehensive every-commit build, relegate multi-target builds > perhaps to nightlies > - > > Key: ARROW-6759 > URL: https://issues.apache.org/jira/browse/ARROW-6759 > Project: Apache Arrow > Issue Type: Improvement > Components: JavaScript >Reporter: Wes McKinney >Priority: Major > Fix For: 1.0.0 > > > The JavaScript CI build is taking 25-30 minutes nowadays. This could be > abbreviated by testing fewer deployment targets. We obviously still need to > test all the deployment targets but we could do that nightly instead of on > every commit -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-6772) [C++] Add operator== for interfaces with an Equals() method
[ https://issues.apache.org/jira/browse/ARROW-6772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16943006#comment-16943006 ] Ben Kietzman commented on ARROW-6772: - {{operator==}} for Schemas was added by [https://github.com/apache/arrow/pull/5529] > [C++] Add operator== for interfaces with an Equals() method > --- > > Key: ARROW-6772 > URL: https://issues.apache.org/jira/browse/ARROW-6772 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Ben Kietzman >Assignee: Ben Kietzman >Priority: Major > > A common pattern in tests is {{ASSERT_TRUE(schm->Equals(*other))}}. The > addition of overloaded equality operators will allow this to be written > {{ASSERT_EQ(*schm, *other)}}, which is more idiomatic GTEST usage and will > allow more informative assertion failure messages. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-6756) [C++][Python] Include HDFS `getfacl` in `pyarrow.hdfs.HadoopFileSystem`
[ https://issues.apache.org/jira/browse/ARROW-6756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16943003#comment-16943003 ] bb commented on ARROW-6756: --- All the libhdfs docs I found said: {quote}The libhdfs APIs are a subset of the [Hadoop FileSystem APIs|https://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/FileSystem.html]. {quote} It goes on to say that: {quote}The header file for libhdfs describes each API in detail and is available in {{$HADOOP_PREFIX/src/c++/libhdfs/hdfs.h}} {quote} I grep'd hdfs.h and didn't see any explicit "acl" reference, nor could I find it in the Hadoop GitHub repo, but it's possible that the API is referenced under another name or I am looking in the wrong places? > [C++][Python] Include HDFS `getfacl` in `pyarrow.hdfs.HadoopFileSystem` > --- > > Key: ARROW-6756 > URL: https://issues.apache.org/jira/browse/ARROW-6756 > Project: Apache Arrow > Issue Type: Wish > Components: Python >Affects Versions: 0.13.0 >Reporter: bb >Priority: Major > Labels: arrow, hdfs, pyarrow > > Extended HDFS filesystem attributes are exposed through the `getfacl` command. > It would be immensely helpful to have this information accessible via: > {code:java} > pyarrow.hdfs.HadoopFileSystem{code} > > Link to the official Hadoop docs where this is discussed in more detail: > [https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html#getfacl] > Sample output from the *nix shell: > {code:java} > $ hadoop fs -getfacl /path/to/hdfs/dir > # file: /path/to/hdfs/dir > # owner: hive > # group: hive > user::rwx > group:unix_group_with_acl_privs_defined:rwx > group::--- > user:hive:rwx > group:hive:rwx > mask::rwx > other::--x{code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-6614) [C++][Dataset] Implement FileSystemDataSourceDiscovery
[ https://issues.apache.org/jira/browse/ARROW-6614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Kietzman resolved ARROW-6614. -- Fix Version/s: 0.15.0 Resolution: Fixed Issue resolved by pull request 5529 [https://github.com/apache/arrow/pull/5529] > [C++][Dataset] Implement FileSystemDataSourceDiscovery > -- > > Key: ARROW-6614 > URL: https://issues.apache.org/jira/browse/ARROW-6614 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Francois Saint-Jacques >Assignee: Francois Saint-Jacques >Priority: Major > Labels: dataset, pull-request-available > Fix For: 0.15.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > DataSourceDiscovery is what allows inferring a Schema and constructing a > DataSource with a PartitionScheme. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-6614) [C++][Dataset] Implement FileSystemDataSourceDiscovery
[ https://issues.apache.org/jira/browse/ARROW-6614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Kietzman reassigned ARROW-6614: Assignee: Francois Saint-Jacques > [C++][Dataset] Implement FileSystemDataSourceDiscovery > -- > > Key: ARROW-6614 > URL: https://issues.apache.org/jira/browse/ARROW-6614 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Francois Saint-Jacques >Assignee: Francois Saint-Jacques >Priority: Major > Labels: dataset, pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > DataSourceDiscovery is what allows inferring a Schema and constructing a > DataSource with a PartitionScheme. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-6772) [C++] Add operator== for interfaces with an Equals() method
Ben Kietzman created ARROW-6772: --- Summary: [C++] Add operator== for interfaces with an Equals() method Key: ARROW-6772 URL: https://issues.apache.org/jira/browse/ARROW-6772 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Ben Kietzman Assignee: Ben Kietzman A common pattern in tests is {{ASSERT_TRUE(schm->Equals(*other))}}. The addition of overloaded equality operators will allow this to be written {{ASSERT_EQ(*schm, *other)}}, which is more idiomatic GTEST usage and will allow more informative assertion failure messages. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6771) [Packaging][Python] Missing pytest dependency from conda and wheel builds
[ https://issues.apache.org/jira/browse/ARROW-6771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6771: -- Labels: pull-request-available (was: ) > [Packaging][Python] Missing pytest dependency from conda and wheel builds > - > > Key: ARROW-6771 > URL: https://issues.apache.org/jira/browse/ARROW-6771 > Project: Apache Arrow > Issue Type: Improvement > Components: Packaging, Python >Reporter: Krisztian Szucs >Assignee: Krisztian Szucs >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > > Multiple python packaging nightlies are failing: > {code} > Failed Tasks: > - conda-osx-clang-py36: > URL: > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-02-0-azure-conda-osx-clang-py36 > - conda-osx-clang-py37: > URL: > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-02-0-azure-conda-osx-clang-py37 > - conda-win-vs2015-py36: > URL: > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-02-0-azure-conda-win-vs2015-py36 > - wheel-manylinux1-cp27mu: > URL: > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-02-0-travis-wheel-manylinux1-cp27mu > - conda-linux-gcc-py27: > URL: > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-02-0-azure-conda-linux-gcc-py27 > - wheel-osx-cp27m: > URL: > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-02-0-travis-wheel-osx-cp27m > - docker-spark-integration: > URL: > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-02-0-circle-docker-spark-integration > - wheel-win-cp35m: > URL: > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-02-0-appveyor-wheel-win-cp35m > - conda-win-vs2015-py37: > URL: > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-02-0-azure-conda-win-vs2015-py37 > - conda-linux-gcc-py37: > URL: > 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-02-0-azure-conda-linux-gcc-py37 > - wheel-manylinux2010-cp27mu: > URL: > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-02-0-travis-wheel-manylinux2010-cp27mu > - conda-linux-gcc-py36: > URL: > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-02-0-azure-conda-linux-gcc-py36 > - wheel-win-cp37m: > URL: > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-02-0-appveyor-wheel-win-cp37m > - wheel-win-cp36m: > URL: > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-02-0-appveyor-wheel-win-cp36m > - gandiva-jar-osx: > URL: > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-02-0-travis-gandiva-jar-osx > - conda-osx-clang-py27: > URL: > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-02-0-azure-conda-osx-clang-py27 > {code} > Because of missing, recently introduced pytest-lazy-fixture test dependency: > {code} > + pytest -m 'not requires_testing_data' --pyargs pyarrow > = test session starts > == > platform linux -- Python 3.7.3, pytest-5.2.0, py-1.8.0, pluggy-0.13.0 > hypothesis profile 'default' -> > database=DirectoryBasedExampleDatabase('$SRC_DIR/.hypothesis/examples') > rootdir: $SRC_DIR > plugins: hypothesis-4.38.1 > collected 1437 items / 1 errors / 3 deselected / 5 skipped / 1428 selected > ERRORS > > __ ERROR collecting tests/test_fs.py > ___ > ../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehol/lib/python3.7/site-packages/pyarrow/tests/test_fs.py:91: > in > pytest.lazy_fixture('localfs'), > E AttributeError: module 'pytest' has no attribute 'lazy_fixture' > === warnings summary > === > $PREFIX/lib/python3.7/site-packages/_pytest/mark/structures.py:324 > 
$PREFIX/lib/python3.7/site-packages/_pytest/mark/structures.py:324: > PytestUnknownMarkWarning: Unknown pytest.mark.s3 - is this a typo? You > can register custom marks to avoid this warning - for details, see > https://docs.pytest.org/en/latest/mark.html > PytestUnknownMarkWarning, > -- Docs: https://docs.pytest.org/en/latest/warnings.html > !!! Interrupted: 1 errors during collection > > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-6771) [Packaging][Python] Missing pytest dependency from conda and wheel builds
Krisztian Szucs created ARROW-6771: -- Summary: [Packaging][Python] Missing pytest dependency from conda and wheel builds Key: ARROW-6771 URL: https://issues.apache.org/jira/browse/ARROW-6771 Project: Apache Arrow Issue Type: Improvement Components: Packaging, Python Reporter: Krisztian Szucs Assignee: Krisztian Szucs Fix For: 1.0.0 Multiple python packaging nightlies are failing: {code} Failed Tasks: - conda-osx-clang-py36: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-02-0-azure-conda-osx-clang-py36 - conda-osx-clang-py37: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-02-0-azure-conda-osx-clang-py37 - conda-win-vs2015-py36: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-02-0-azure-conda-win-vs2015-py36 - wheel-manylinux1-cp27mu: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-02-0-travis-wheel-manylinux1-cp27mu - conda-linux-gcc-py27: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-02-0-azure-conda-linux-gcc-py27 - wheel-osx-cp27m: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-02-0-travis-wheel-osx-cp27m - docker-spark-integration: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-02-0-circle-docker-spark-integration - wheel-win-cp35m: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-02-0-appveyor-wheel-win-cp35m - conda-win-vs2015-py37: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-02-0-azure-conda-win-vs2015-py37 - conda-linux-gcc-py37: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-02-0-azure-conda-linux-gcc-py37 - wheel-manylinux2010-cp27mu: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-02-0-travis-wheel-manylinux2010-cp27mu - conda-linux-gcc-py36: URL: 
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-02-0-azure-conda-linux-gcc-py36 - wheel-win-cp37m: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-02-0-appveyor-wheel-win-cp37m - wheel-win-cp36m: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-02-0-appveyor-wheel-win-cp36m - gandiva-jar-osx: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-02-0-travis-gandiva-jar-osx - conda-osx-clang-py27: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-10-02-0-azure-conda-osx-clang-py27 {code} Because of missing, recently introduced pytest-lazy-fixture test dependency: {code} + pytest -m 'not requires_testing_data' --pyargs pyarrow = test session starts == platform linux -- Python 3.7.3, pytest-5.2.0, py-1.8.0, pluggy-0.13.0 hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('$SRC_DIR/.hypothesis/examples') rootdir: $SRC_DIR plugins: hypothesis-4.38.1 collected 1437 items / 1 errors / 3 deselected / 5 skipped / 1428 selected ERRORS __ ERROR collecting tests/test_fs.py ___ ../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehol/lib/python3.7/site-packages/pyarrow/tests/test_fs.py:91: in pytest.lazy_fixture('localfs'), E AttributeError: module 'pytest' has no attribute 'lazy_fixture' === warnings summary === $PREFIX/lib/python3.7/site-packages/_pytest/mark/structures.py:324 $PREFIX/lib/python3.7/site-packages/_pytest/mark/structures.py:324: PytestUnknownMarkWarning: Unknown pytest.mark.s3 - is this a typo? You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/latest/mark.html PytestUnknownMarkWarning, -- Docs: https://docs.pytest.org/en/latest/warnings.html !!! 
Interrupted: 1 errors during collection {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6770) [CI][Travis] Download Minio quietly
[ https://issues.apache.org/jira/browse/ARROW-6770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-6770: --- Description: To remove verbose output https://travis-ci.org/pitrou/arrow/jobs/592577525#L191 > [CI][Travis] Download Minio quietly > --- > > Key: ARROW-6770 > URL: https://issues.apache.org/jira/browse/ARROW-6770 > Project: Apache Arrow > Issue Type: Improvement > Components: Continuous Integration >Reporter: Krisztian Szucs >Assignee: Krisztian Szucs >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > To remove verbose output > https://travis-ci.org/pitrou/arrow/jobs/592577525#L191 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6770) [CI][Travis] Download Minio quietly
[ https://issues.apache.org/jira/browse/ARROW-6770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6770: -- Labels: pull-request-available (was: ) > [CI][Travis] Download Minio quietly > --- > > Key: ARROW-6770 > URL: https://issues.apache.org/jira/browse/ARROW-6770 > Project: Apache Arrow > Issue Type: Improvement > Components: Continuous Integration >Reporter: Krisztian Szucs >Assignee: Krisztian Szucs >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-5655) [Python] Table.from_pydict/from_arrays not using types in specified schema correctly
[ https://issues.apache.org/jira/browse/ARROW-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-5655: -- Labels: pull-request-available (was: ) > [Python] Table.from_pydict/from_arrays not using types in specified schema > correctly > - > > Key: ARROW-5655 > URL: https://issues.apache.org/jira/browse/ARROW-5655 > Project: Apache Arrow > Issue Type: Bug > Components: Python >Reporter: Joris Van den Bossche >Assignee: Krisztian Szucs >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > > Example with {{from_pydict}} (from > https://github.com/apache/arrow/pull/4601#issuecomment-503676534): > {code:python} > In [15]: table = pa.Table.from_pydict( > ...: {'a': [1, 2, 3], 'b': [3, 4, 5]}, > ...: schema=pa.schema([('a', pa.int64()), ('c', pa.int32())])) > In [16]: table > Out[16]: > pyarrow.Table > a: int64 > c: int32 > In [17]: table.to_pandas() > Out[17]: >a c > 0 1 3 > 1 2 0 > 2 3 4 > {code} > Note that the specified schema has 1) different column names and 2) has a > non-default type (int32 vs int64) which leads to corrupted values. > This is partly due to {{Table.from_pydict}} not using the type information in > the schema to convert the dictionary items to pyarrow arrays. But then it is > also {{Table.from_arrays}} that is not correctly casting the arrays to > another dtype if the schema specifies as such. > Additional question for {{Table.pydict}} is whether it actually should > override the 'b' key from the dictionary as column 'c' as defined in the > schema (this behaviour depends on the order of the dictionary, which is not > guaranteed below python 3.6). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-5855) [Python] Add support for Duration type
[ https://issues.apache.org/jira/browse/ARROW-5855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche reassigned ARROW-5855: Assignee: Joris Van den Bossche > [Python] Add support for Duration type > -- > > Key: ARROW-5855 > URL: https://issues.apache.org/jira/browse/ARROW-5855 > Project: Apache Arrow > Issue Type: Improvement > Components: Python >Reporter: Joris Van den Bossche >Assignee: Joris Van den Bossche >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > Add support for the Duration type (added in C++: ARROW-835, ARROW-5261) > - add DurationType and DurationArray wrappers > - add inference support for datetime.timedelta / np.timedelta64 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-5855) [Python] Add support for Duration type
[ https://issues.apache.org/jira/browse/ARROW-5855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-5855: -- Labels: pull-request-available (was: ) > [Python] Add support for Duration type > -- > > Key: ARROW-5855 > URL: https://issues.apache.org/jira/browse/ARROW-5855 > Project: Apache Arrow > Issue Type: Improvement > Components: Python >Reporter: Joris Van den Bossche >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > > Add support for the Duration type (added in C++: ARROW-835, ARROW-5261) > - add DurationType and DurationArray wrappers > - add inference support for datetime.timedelta / np.timedelta64 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-6756) [C++][Python] Include HDFS `getfacl` in `pyarrow.hdfs.HadoopFileSystem`
[ https://issues.apache.org/jira/browse/ARROW-6756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942897#comment-16942897 ] Wes McKinney commented on ARROW-6756: - Is this exposed in libhdfs? > [C++][Python] Include HDFS `getfacl` in `pyarrow.hdfs.HadoopFileSystem` > --- > > Key: ARROW-6756 > URL: https://issues.apache.org/jira/browse/ARROW-6756 > Project: Apache Arrow > Issue Type: Wish > Components: Python >Affects Versions: 0.13.0 >Reporter: bb >Priority: Major > Labels: arrow, hdfs, pyarrow > > Extended HDFS filesystem attributes are exposed through the `getfacl` command. > It would be immensely helpful to have this information accessible via: > {code:java} > pyarrow.hdfs.HadoopFileSystem{code} > > Link to the official Hadoop docs where this is discussed in more detail: > [https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html#getfacl] > Sample output from the *nix shell: > {code:java} > $ hadoop fs -getfacl /path/to/hdfs/dir > # file: /path/to/hdfs/dir > # owner: hive > # group: hive > user::rwx > group:unix_group_with_acl_privs_defined:rwx > group::--- > user:hive:rwx > group:hive:rwx > mask::rwx > other::--x{code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6756) [C++][Python] Include HDFS `getfacl` in `pyarrow.hdfs.HadoopFileSystem`
[ https://issues.apache.org/jira/browse/ARROW-6756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-6756: Summary: [C++][Python] Include HDFS `getfacl` in `pyarrow.hdfs.HadoopFileSystem` (was: Include HDFS `getfacl` in `pyarrow.hdfs.HadoopFileSystem`) > [C++][Python] Include HDFS `getfacl` in `pyarrow.hdfs.HadoopFileSystem` > --- > > Key: ARROW-6756 > URL: https://issues.apache.org/jira/browse/ARROW-6756 > Project: Apache Arrow > Issue Type: Wish > Components: Python >Affects Versions: 0.13.0 >Reporter: bb >Priority: Major > Labels: arrow, hdfs, pyarrow > > Extended HDFS filesystem attributes are exposed through the `getfacl` command. > It would be immensely helpful to have this information accessible via: > {code:java} > pyarrow.hdfs.HadoopFileSystem{code} > > Link to the official Hadoop docs where this is discussed in more detail: > [https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html#getfacl] > Sample output from the *nix shell: > {code:java} > $ hadoop fs -getfacl /path/to/hdfs/dir > # file: /path/to/hdfs/dir > # owner: hive > # group: hive > user::rwx > group:unix_group_with_acl_privs_defined:rwx > group::--- > user:hive:rwx > group:hive:rwx > mask::rwx > other::--x{code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-6770) [CI][Travis] Download Minio quietly
[ https://issues.apache.org/jira/browse/ARROW-6770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs reassigned ARROW-6770: -- Assignee: Krisztian Szucs > [CI][Travis] Download Minio quietly > --- > > Key: ARROW-6770 > URL: https://issues.apache.org/jira/browse/ARROW-6770 > Project: Apache Arrow > Issue Type: Improvement > Components: Continuous Integration >Reporter: Krisztian Szucs >Assignee: Krisztian Szucs >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-6770) [CI][Travis] Download Minio quietly
Krisztian Szucs created ARROW-6770: -- Summary: [CI][Travis] Download Minio quietly Key: ARROW-6770 URL: https://issues.apache.org/jira/browse/ARROW-6770 Project: Apache Arrow Issue Type: Improvement Components: Continuous Integration Reporter: Krisztian Szucs -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-5802) [CI] Dockerize "lint" Travis CI job
[ https://issues.apache.org/jira/browse/ARROW-5802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques reassigned ARROW-5802: - Assignee: Francois Saint-Jacques > [CI] Dockerize "lint" Travis CI job > --- > > Key: ARROW-5802 > URL: https://issues.apache.org/jira/browse/ARROW-5802 > Project: Apache Arrow > Issue Type: Improvement > Components: Continuous Integration >Reporter: Wes McKinney >Assignee: Francois Saint-Jacques >Priority: Major > Fix For: 1.0.0 > > > Run via docker-compose; also enables contributors to lint locally -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-6768) [C++][Dataset] Implement dataset::Scan to Table helper function
[ https://issues.apache.org/jira/browse/ARROW-6768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques reassigned ARROW-6768: - Assignee: Francois Saint-Jacques > [C++][Dataset] Implement dataset::Scan to Table helper function > --- > > Key: ARROW-6768 > URL: https://issues.apache.org/jira/browse/ARROW-6768 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Francois Saint-Jacques >Assignee: Francois Saint-Jacques >Priority: Major > Labels: dataset > > The Scan interface exposes classes (ScanTask/Iterator) which are not of > interest to all callers. This would implement `Status > Scan::Materialize(std::shared_ptr<Table>* out)` so consumers can call > this function instead of consuming and dispatching the streaming interface. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-6769) [C++][Dataset] End to End dataset integration test case
Francois Saint-Jacques created ARROW-6769: - Summary: [C++][Dataset] End to End dataset integration test case Key: ARROW-6769 URL: https://issues.apache.org/jira/browse/ARROW-6769 Project: Apache Arrow Issue Type: New Feature Components: C++ Reporter: Francois Saint-Jacques 1. Create a DataSource from a known directory and a PartitionScheme. 2. Create a Dataset from the previous DataSource. 3. Request a ScannerBuilder from previous Dataset. 4. Add filter expression to ScannerBuilder (and other options). 5. Finalize into a Scan operation. 6. Materialize into an arrow::Table. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-6769) [C++][Dataset] End to End dataset integration test case
[ https://issues.apache.org/jira/browse/ARROW-6769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques reassigned ARROW-6769: - Assignee: Francois Saint-Jacques > [C++][Dataset] End to End dataset integration test case > --- > > Key: ARROW-6769 > URL: https://issues.apache.org/jira/browse/ARROW-6769 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Francois Saint-Jacques >Assignee: Francois Saint-Jacques >Priority: Major > Labels: dataset > > 1. Create a DataSource from a known directory and a PartitionScheme. > 2. Create a Dataset from the previous DataSource. > 3. Request a ScannerBuilder from previous Dataset. > 4. Add filter expression to ScannerBuilder (and other options). > 5. Finalize into a Scan operation. > 6. Materialize into an arrow::Table. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6767) [JS] lazily bind batches in scan/scanReverse
[ https://issues.apache.org/jira/browse/ARROW-6767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6767: -- Labels: pull-request-available (was: ) > [JS] lazily bind batches in scan/scanReverse > > > Key: ARROW-6767 > URL: https://issues.apache.org/jira/browse/ARROW-6767 > Project: Apache Arrow > Issue Type: Improvement > Components: JavaScript >Reporter: Taylor Baldwin >Priority: Minor > Labels: pull-request-available > > Call {{bind(batch)}} lazily in {{scan}} and {{scanReverse}}, that is, only > when the predicate has matched a record in a batch. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-6768) [C++][Dataset] Implement dataset::Scan to Table helper function
Francois Saint-Jacques created ARROW-6768: - Summary: [C++][Dataset] Implement dataset::Scan to Table helper function Key: ARROW-6768 URL: https://issues.apache.org/jira/browse/ARROW-6768 Project: Apache Arrow Issue Type: New Feature Components: C++ Reporter: Francois Saint-Jacques The Scan interface exposes classes (ScanTask/Iterator) which are not of interest to all callers. This would implement `Status Scan::Materialize(std::shared_ptr<Table>* out)` so consumers can call this function instead of consuming and dispatching the streaming interface. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-6767) [JS] lazily bind batches in scan/scanReverse
Taylor Baldwin created ARROW-6767: - Summary: [JS] lazily bind batches in scan/scanReverse Key: ARROW-6767 URL: https://issues.apache.org/jira/browse/ARROW-6767 Project: Apache Arrow Issue Type: Improvement Components: JavaScript Reporter: Taylor Baldwin Call {{bind(batch)}} lazily in {{scan}} and {{scanReverse}}, that is, only when the predicate has matched a record in a batch. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6766) [Python] libarrow_python..dylib does not exist
[ https://issues.apache.org/jira/browse/ARROW-6766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tarek Allam updated ARROW-6766: --- Description: After following the instructions found on the developer guide for Python, I was able to build fine by using:
{code:bash}
# Assuming immediately prior one has run:
# $ git clone g...@github.com:apache/arrow.git
# $ conda create -y -n pyarrow-dev -c conda-forge \
#     --file arrow/ci/conda_env_unix.yml \
#     --file arrow/ci/conda_env_cpp.yml \
#     --file arrow/ci/conda_env_python.yml \
#     compilers \
#     python=3.7
# $ conda activate pyarrow-dev
# $ brew update && brew bundle --file=arrow/cpp/Brewfile

export ARROW_HOME=$(pwd)/arrow/dist
export LD_LIBRARY_PATH=$(pwd)/arrow/dist/lib:$LD_LIBRARY_PATH
export CC=`which clang`
export CXX=`which clang++`

mkdir arrow/cpp/build
pushd arrow/cpp/build
cmake -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
      -DCMAKE_INSTALL_LIBDIR=lib \
      -DARROW_FLIGHT=OFF \
      -DARROW_GANDIVA=OFF \
      -DARROW_ORC=ON \
      -DARROW_PARQUET=ON \
      -DARROW_PYTHON=ON \
      -DARROW_PLASMA=ON \
      -DARROW_BUILD_TESTS=ON \
      ..
make -j4
make install
popd
{code}
But when I run:
{code:bash}
pushd arrow/python
export PYARROW_WITH_FLIGHT=1
export PYARROW_WITH_GANDIVA=1
export PYARROW_WITH_ORC=1
export PYARROW_WITH_PARQUET=1
python setup.py build_ext --inplace
popd
{code}
I get the following errors:
{code:none}
-- Build output directory: /Users/tallamjr/Github/arrow/python/build/temp.macosx-10.9-x86_64-3.7/release
-- Found the Arrow core library: /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow.dylib
-- Found the Arrow Python library: /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow_python.dylib
CMake Error: File /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow..dylib does not exist.
...
CMake Error: File /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow..dylib does not exist.
CMake Error at CMakeLists.txt:230 (configure_file):
  configure_file Problem configuring file
Call Stack (most recent call first):
  CMakeLists.txt:315 (bundle_arrow_lib)
CMake Error: File /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow_python..dylib does not exist.
CMake Error at CMakeLists.txt:226 (configure_file):
  configure_file Problem configuring file
Call Stack (most recent call first):
  CMakeLists.txt:320 (bundle_arrow_lib)
CMake Error: File /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow_python..dylib does not exist.
CMake Error at CMakeLists.txt:230 (configure_file):
  configure_file Problem configuring file
Call Stack (most recent call first):
  CMakeLists.txt:320 (bundle_arrow_lib)
{code}
What is quite strange is that the libraries do seem to be there, but with an additional version component, e.g. `libarrow.15.dylib`:
{code:none}
$ ls -l libarrow_python.15.dylib && echo $PWD
lrwxr-xr-x 1 tallamjr staff 28 Oct 2 14:02 libarrow_python.15.dylib -> libarrow_python.15.0.0.dylib
/Users/tallamjr/github/arrow/dist/lib
{code}
I am not exactly sure what the issue is here, but it appears that the version is not captured in a variable that is used by CMake. I have run the same setup on `master` (`7d18c1c`) and on `apache-arrow-0.14.0` (`a591d76`), both of which produce the same errors. Apologies if this is not quite the right format for JIRA issues here, or if this is not the correct platform for this; I'm very new to the project and to contributing to Apache in general. Thanks
[jira] [Commented] (ARROW-6765) 0.14.1 not available on Windows
[ https://issues.apache.org/jira/browse/ARROW-6765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942852#comment-16942852 ] Yannik commented on ARROW-6765: --- thanks for pointing out! > 0.14.1 not available on Windows > --- > > Key: ARROW-6765 > URL: https://issues.apache.org/jira/browse/ARROW-6765 > Project: Apache Arrow > Issue Type: Bug > Components: Python >Affects Versions: 0.14.1 > Environment: Windows >Reporter: Yannik >Priority: Major > > On linux, I can install pyarrow 0.14.1 from pip, but on windows the latest > seems to be 0.14.0. Why is that? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6766) [Python] libarrow_python..dylib does not exist
[ https://issues.apache.org/jira/browse/ARROW-6766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tarek Allam updated ARROW-6766: --- Summary: [Python] libarrow_python..dylib does not exist (was: libarrow_python..dylib does not exist) > [Python] libarrow_python..dylib does not exist > -- > > Key: ARROW-6766 > URL: https://issues.apache.org/jira/browse/ARROW-6766 > Project: Apache Arrow > Issue Type: Bug > Components: Python >Affects Versions: 0.14.0, 0.15.0 >Reporter: Tarek Allam >Priority: Blocker > > {{After following the instructions found on the developer guides for Python, > I was}} > {{able to build fine by using:}} > {{# Assuming immediately prior one has run:}} > {{# $ git clone g...@github.com:apache/arrow.git}} > {{# $ conda create -y -n pyarrow-dev -c conda-forge \}} > {{# --file arrow/ci/conda_env_unix.yml \}} > {{# --file arrow/ci/conda_env_cpp.yml \}} > {{# --file arrow/ci/conda_env_python.yml \}} > {{# compilers \}} > {{# python=3.7}} > {{# $ conda activate pyarrow-dev}} > {{# $ brew update && brew bundle --file=arrow/cpp/Brewfile}}{{export > ARROW_HOME=$(pwd)/arrow/dist}} > {{export LD_LIBRARY_PATH=$(pwd)/arrow/dist/lib:$LD_LIBRARY_PATH}}{{export > CC=`which clang`}} > {{export CXX=`which clang++`}}{{mkdir arrow/cpp/build}} > {{pushd arrow/cpp/build}}{{cmake -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \}} > {{ -DCMAKE_INSTALL_LIBDIR=lib \}} > {{ -DARROW_FLIGHT=OFF \}} > {{ -DARROW_GANDIVA=OFF \}} > {{ -DARROW_ORC=ON \}} > {{ -DARROW_PARQUET=ON \}} > {{ -DARROW_PYTHON=ON \}} > {{ -DARROW_PLASMA=ON \}} > {{ -DARROW_BUILD_TESTS=ON \}} > {{ ..}} > {{make -j4}} > {{make install}} > {{popd}} > But when I run: > {{pushd arrow/python}} > {{export PYARROW_WITH_FLIGHT=1}} > {{export PYARROW_WITH_GANDIVA=1}} > {{export PYARROW_WITH_ORC=1}} > {{export PYARROW_WITH_PARQUET=1}} > {{python setup.py build_ext --inplace}} > {{popd}} > I get the following errors: > {{-- Build output directory: > 
/Users/tallamjr/Github/arrow/python/build/temp.macosx-10.9-x86_64-3.7/release}} > {{-- Found the Arrow core library: > /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow.dylib}} > {{-- Found the Arrow Python library: > /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow_python.dylib}} > {{CMake Error: File /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow..dylib > does not exist.}}{{...}}{{CMake Error: File > /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow..dylib does not exist.}} > {{CMake Error at CMakeLists.txt:230 (configure_file):}} > {{ configure_file Problem configuring file}} > {{Call Stack (most recent call first):}} > {{ CMakeLists.txt:315 (bundle_arrow_lib)}} > {{CMake Error: File > /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow_python..dylib does not > exist.}} > {{CMake Error at CMakeLists.txt:226 (configure_file):}} > {{ configure_file Problem configuring file}} > {{Call Stack (most recent call first):}} > {{ CMakeLists.txt:320 (bundle_arrow_lib)}} > {{CMake Error: File > /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow_python..dylib does not > exist.}} > {{CMake Error at CMakeLists.txt:230 (configure_file):}} > {{ configure_file Problem configuring file}} > {{Call Stack (most recent call first):}} > {{ CMakeLists.txt:320 (bundle_arrow_lib)}} > > What is quite strange is that the libraries seem to indeed be there but they > have an addition component such as `libarrow.15.dylib` .e.g: > {{$ ls -l libarrow_python.15.dylib && echo $PWD}} > {{lrwxr-xr-x 1 tallamjr staff 28 Oct 2 14:02 libarrow_python.15.dylib ->}} > {{libarrow_python.15.0.0.dylib}} > {{/Users/tallamjr/github/arrow/dist/lib}} > I guess I am not exactly sure what the issue here is but it appears to be that > the version is not captured as a variable that is used by CMAKE? I have run > the > same setup on `master` (`7d18c1c`) and on `apache-arrow-0.14.0` (`a591d76`) > which both seem to produce same errors. 
> Apologies if this is not quite the format for JIRA issues here or perhaps if > it's not the correct platform for this, I'm very new to the project and > contributing to apache in general. Thanks > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-6766) libarrow_python..dylib does not exist
Tarek Allam created ARROW-6766: -- Summary: libarrow_python..dylib does not exist Key: ARROW-6766 URL: https://issues.apache.org/jira/browse/ARROW-6766 Project: Apache Arrow Issue Type: Bug Components: Python Affects Versions: 0.14.0, 0.15.0 Reporter: Tarek Allam After following the instructions found on the developer guide for Python, I was able to build fine by using:
{code:bash}
# Assuming immediately prior one has run:
# $ git clone g...@github.com:apache/arrow.git
# $ conda create -y -n pyarrow-dev -c conda-forge \
#     --file arrow/ci/conda_env_unix.yml \
#     --file arrow/ci/conda_env_cpp.yml \
#     --file arrow/ci/conda_env_python.yml \
#     compilers \
#     python=3.7
# $ conda activate pyarrow-dev
# $ brew update && brew bundle --file=arrow/cpp/Brewfile

export ARROW_HOME=$(pwd)/arrow/dist
export LD_LIBRARY_PATH=$(pwd)/arrow/dist/lib:$LD_LIBRARY_PATH
export CC=`which clang`
export CXX=`which clang++`

mkdir arrow/cpp/build
pushd arrow/cpp/build
cmake -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
      -DCMAKE_INSTALL_LIBDIR=lib \
      -DARROW_FLIGHT=OFF \
      -DARROW_GANDIVA=OFF \
      -DARROW_ORC=ON \
      -DARROW_PARQUET=ON \
      -DARROW_PYTHON=ON \
      -DARROW_PLASMA=ON \
      -DARROW_BUILD_TESTS=ON \
      ..
make -j4
make install
popd
{code}
But when I run:
{code:bash}
pushd arrow/python
export PYARROW_WITH_FLIGHT=1
export PYARROW_WITH_GANDIVA=1
export PYARROW_WITH_ORC=1
export PYARROW_WITH_PARQUET=1
python setup.py build_ext --inplace
popd
{code}
I get the following errors:
{code:none}
-- Build output directory: /Users/tallamjr/Github/arrow/python/build/temp.macosx-10.9-x86_64-3.7/release
-- Found the Arrow core library: /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow.dylib
-- Found the Arrow Python library: /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow_python.dylib
CMake Error: File /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow..dylib does not exist.
...
CMake Error: File /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow..dylib does not exist.
CMake Error at CMakeLists.txt:230 (configure_file):
  configure_file Problem configuring file
Call Stack (most recent call first):
  CMakeLists.txt:315 (bundle_arrow_lib)
CMake Error: File /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow_python..dylib does not exist.
CMake Error at CMakeLists.txt:226 (configure_file):
  configure_file Problem configuring file
Call Stack (most recent call first):
  CMakeLists.txt:320 (bundle_arrow_lib)
CMake Error: File /usr/local/anaconda3/envs/pyarrow-dev/lib/libarrow_python..dylib does not exist.
CMake Error at CMakeLists.txt:230 (configure_file):
  configure_file Problem configuring file
Call Stack (most recent call first):
  CMakeLists.txt:320 (bundle_arrow_lib)
{code}
What is quite strange is that the libraries do seem to be there, but with an additional version component, e.g. `libarrow.15.dylib`:
{code:none}
$ ls -l libarrow_python.15.dylib && echo $PWD
lrwxr-xr-x 1 tallamjr staff 28 Oct 2 14:02 libarrow_python.15.dylib -> libarrow_python.15.0.0.dylib
/Users/tallamjr/github/arrow/dist/lib
{code}
I am not exactly sure what the issue is here, but it appears that the version is not captured in a variable that is used by CMake. I have run the same setup on `master` (`7d18c1c`) and on `apache-arrow-0.14.0` (`a591d76`), both of which produce the same errors. Apologies if this is not quite the right format for JIRA issues here, or if this is not the correct platform for this; I'm very new to the project and to contributing to Apache in general. Thanks -- This message was sent by Atlassian Jira (v8.3.4#803005)
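The repeated `libarrow_python..dylib` in the errors above is consistent with an empty version component being spliced into the filename template; a tiny sketch of that failure mode (the `version` placeholder is illustrative, not the actual CMake cache variable):

```python
# An empty version string in the filename template reproduces the
# double-dot names from the CMake errors above.
template = "libarrow_python.{version}.dylib"

broken = template.format(version="")    # version never captured
fixed = template.format(version="15")   # version propagated correctly

print(broken)  # libarrow_python..dylib
print(fixed)   # libarrow_python.15.dylib
```
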
[jira] [Resolved] (ARROW-6755) [Release] Improvements to Windows release verification script
[ https://issues.apache.org/jira/browse/ARROW-6755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou resolved ARROW-6755. --- Fix Version/s: (was: 1.0.0) 0.15.0 Resolution: Fixed Issue resolved by pull request 5559 [https://github.com/apache/arrow/pull/5559] > [Release] Improvements to Windows release verification script > - > > Key: ARROW-6755 > URL: https://issues.apache.org/jira/browse/ARROW-6755 > Project: Apache Arrow > Issue Type: Improvement > Components: Developer Tools >Reporter: Wes McKinney >Priority: Major > Labels: pull-request-available > Fix For: 0.15.0 > > Time Spent: 20m > Remaining Estimate: 0h > > * Only build dynamic libraries (we don't need the static libs to verify, and > I got "compiler is out of heap space" errors when I built locally just now, > will have to investigate that some more later) > * Maybe some other things -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-6755) [Release] Improvements to Windows release verification script
[ https://issues.apache.org/jira/browse/ARROW-6755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou reassigned ARROW-6755: - Assignee: Wes McKinney > [Release] Improvements to Windows release verification script > - > > Key: ARROW-6755 > URL: https://issues.apache.org/jira/browse/ARROW-6755 > Project: Apache Arrow > Issue Type: Improvement > Components: Developer Tools >Reporter: Wes McKinney >Assignee: Wes McKinney >Priority: Major > Labels: pull-request-available > Fix For: 0.15.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > * Only build dynamic libraries (we don't need the static libs to verify, and > I got "compiler is out of heap space" errors when I built locally just now, > will have to investigate that some more later) > * Maybe some other things -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (ARROW-6765) 0.14.1 not available on Windows
[ https://issues.apache.org/jira/browse/ARROW-6765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney closed ARROW-6765. --- Resolution: Won't Fix > 0.14.1 not available on Windows > --- > > Key: ARROW-6765 > URL: https://issues.apache.org/jira/browse/ARROW-6765 > Project: Apache Arrow > Issue Type: Bug > Components: Python >Affects Versions: 0.14.1 > Environment: Windows >Reporter: Yannik >Priority: Major > > On linux, I can install pyarrow 0.14.1 from pip, but on windows the latest > seems to be 0.14.0. Why is that? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-2863) [Python] Add context manager APIs to RecordBatch*Writer/Reader classes
[ https://issues.apache.org/jira/browse/ARROW-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs reassigned ARROW-2863: -- Assignee: Krisztian Szucs > [Python] Add context manager APIs to RecordBatch*Writer/Reader classes > -- > > Key: ARROW-2863 > URL: https://issues.apache.org/jira/browse/ARROW-2863 > Project: Apache Arrow > Issue Type: Improvement > Components: Python >Reporter: Wes McKinney >Assignee: Krisztian Szucs >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > This would cause the {{close}} method to be called when the scope exits -- This message was sent by Atlassian Jira (v8.3.4#803005)
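The requested behavior is the standard Python context-manager protocol; a minimal sketch with a dummy writer (the wrapper and class names are illustrative; the real change would live on the pyarrow writer/reader classes themselves):

```python
class ClosingWriter:
    # Sketch of the requested API: leaving a `with` block always calls
    # close(), even if the body raises.
    def __init__(self, writer):
        self._writer = writer

    def __enter__(self):
        return self._writer

    def __exit__(self, exc_type, exc, tb):
        self._writer.close()
        return False  # do not swallow exceptions


class DummyWriter:
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


w = DummyWriter()
with ClosingWriter(w) as writer:
    pass  # write_batch(...) calls would go here
print(w.closed)  # True
```
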
[jira] [Commented] (ARROW-5655) [Python] Table.from_pydict/from_arrays not using types in specified schema correctly
[ https://issues.apache.org/jira/browse/ARROW-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942841#comment-16942841 ] Joris Van den Bossche commented on ARROW-5655: -- [~kszucs] I think this might already be fixed in the mean-time. Wes and I did some work related to schema handling the last month > [Python] Table.from_pydict/from_arrays not using types in specified schema > correctly > - > > Key: ARROW-5655 > URL: https://issues.apache.org/jira/browse/ARROW-5655 > Project: Apache Arrow > Issue Type: Bug > Components: Python >Reporter: Joris Van den Bossche >Assignee: Krisztian Szucs >Priority: Major > Fix For: 1.0.0 > > > Example with {{from_pydict}} (from > https://github.com/apache/arrow/pull/4601#issuecomment-503676534): > {code:python} > In [15]: table = pa.Table.from_pydict( > ...: {'a': [1, 2, 3], 'b': [3, 4, 5]}, > ...: schema=pa.schema([('a', pa.int64()), ('c', pa.int32())])) > In [16]: table > Out[16]: > pyarrow.Table > a: int64 > c: int32 > In [17]: table.to_pandas() > Out[17]: >a c > 0 1 3 > 1 2 0 > 2 3 4 > {code} > Note that the specified schema has 1) different column names and 2) has a > non-default type (int32 vs int64) which leads to corrupted values. > This is partly due to {{Table.from_pydict}} not using the type information in > the schema to convert the dictionary items to pyarrow arrays. But then it is > also {{Table.from_arrays}} that is not correctly casting the arrays to > another dtype if the schema specifies as such. > Additional question for {{Table.pydict}} is whether it actually should > override the 'b' key from the dictionary as column 'c' as defined in the > schema (this behaviour depends on the order of the dictionary, which is not > guaranteed below python 3.6). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-6765) 0.14.1 not available on Windows
[ https://issues.apache.org/jira/browse/ARROW-6765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942839#comment-16942839 ] Krisztian Szucs commented on ARROW-6765: The shipped 0.14.1 wheels were broken for windows, so we've decided to remove them. The linking issue affecting the 0.14.1 wheels should be fixed now by https://issues.apache.org/jira/browse/ARROW-6584 The current 0.15 release is under vote, you can try to download and install the release candidate wheel from: https://bintray.com/apache/arrow/python-rc/0.15.0-rc2#files/python-rc/0.15.0-rc2 > 0.14.1 not available on Windows > --- > > Key: ARROW-6765 > URL: https://issues.apache.org/jira/browse/ARROW-6765 > Project: Apache Arrow > Issue Type: Bug > Components: Python >Affects Versions: 0.14.1 > Environment: Windows >Reporter: Yannik >Priority: Major > > On linux, I can install pyarrow 0.14.1 from pip, but on windows the latest > seems to be 0.14.0. Why is that? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-6757) [Python] Creating csv.ParseOptions() causes "Windows fatal exception: access violation" with Visual Studio 2017
[ https://issues.apache.org/jira/browse/ARROW-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942835#comment-16942835 ] Wes McKinney commented on ARROW-6757: - I haven't had a chance to investigate further yet. Will report back once I learn something > [Python] Creating csv.ParseOptions() causes "Windows fatal exception: access > violation" with Visual Studio 2017 > --- > > Key: ARROW-6757 > URL: https://issues.apache.org/jira/browse/ARROW-6757 > Project: Apache Arrow > Issue Type: Bug > Components: Python >Reporter: Wes McKinney >Priority: Major > Fix For: 1.0.0 > > > I encountered this when trying to verify the release with MSVC 2017. It may > be particular to this machine or build (though it's 100% reproducible for > me). I will check the Windows wheels to see if it occurs there, too > {code} > (C:\tmp\arrow-verify-release\conda-env) λ python > Python 3.7.3 | packaged by conda-forge | (default, Jul 1 2019, 22:01:29) > [MSC v.1900 64 bit (AMD64)] :: Anaconda, Inc. on win32 > Type "help", "copyright", "credits" or "license" for more information. > >>> import pyarrow.csv as pc > >>> pc.ParseOptions() > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-5655) [Python] Table.from_pydict/from_arrays not using types in specified schema correctly
[ https://issues.apache.org/jira/browse/ARROW-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs reassigned ARROW-5655: -- Assignee: Krisztian Szucs > [Python] Table.from_pydict/from_arrays not using types in specified schema > correctly > - > > Key: ARROW-5655 > URL: https://issues.apache.org/jira/browse/ARROW-5655 > Project: Apache Arrow > Issue Type: Bug > Components: Python >Reporter: Joris Van den Bossche >Assignee: Krisztian Szucs >Priority: Major > Fix For: 1.0.0 > > > Example with {{from_pydict}} (from > https://github.com/apache/arrow/pull/4601#issuecomment-503676534): > {code:python} > In [15]: table = pa.Table.from_pydict( > ...: {'a': [1, 2, 3], 'b': [3, 4, 5]}, > ...: schema=pa.schema([('a', pa.int64()), ('c', pa.int32())])) > In [16]: table > Out[16]: > pyarrow.Table > a: int64 > c: int32 > In [17]: table.to_pandas() > Out[17]: >a c > 0 1 3 > 1 2 0 > 2 3 4 > {code} > Note that the specified schema has 1) different column names and 2) has a > non-default type (int32 vs int64) which leads to corrupted values. > This is partly due to {{Table.from_pydict}} not using the type information in > the schema to convert the dictionary items to pyarrow arrays. But then it is > also {{Table.from_arrays}} that is not correctly casting the arrays to > another dtype if the schema specifies as such. > Additional question for {{Table.pydict}} is whether it actually should > override the 'b' key from the dictionary as column 'c' as defined in the > schema (this behaviour depends on the order of the dictionary, which is not > guaranteed below python 3.6). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-6762) [C++] JSON reader segfaults on newline
[ https://issues.apache.org/jira/browse/ARROW-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942811#comment-16942811 ] Antoine Pitrou commented on ARROW-6762: --- Well, the attached PR removes the limitation then.

> [C++] JSON reader segfaults on newline
> --
>
> Key: ARROW-6762
> URL: https://issues.apache.org/jira/browse/ARROW-6762
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++
> Reporter: Joris Van den Bossche
> Assignee: Antoine Pitrou
> Priority: Major
> Labels: json, pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Using the {{SampleRecord.jl}} attachment from ARROW-6737, I notice that
> trying to read this file on master results in a segfault:
> {code}
> In [1]: from pyarrow import json
>    ...: import pyarrow.parquet as pq
>    ...: r = json.read_json('SampleRecord.jl')
> WARNING: Logging before InitGoogleLogging() is written to STDERR
> F1002 09:56:55.362766 13035 reader.cc:93] Check failed:
> (string_view(*next_partial).find_first_not_of(" \t\n\r")) ==
> (string_view::npos)
> *** Check failure stack trace: ***
> Aborted (core dumped)
> {code}
> while with 0.14.1 this works fine:
> {code}
> In [24]: from pyarrow import json
>    ...: import pyarrow.parquet as pq
>    ...: r = json.read_json('SampleRecord.jl')
> In [25]: r
> Out[25]:
> pyarrow.Table
> _type: string
> provider_name: string
> arrival: timestamp[s]
> berthed: timestamp[s]
> berth: null
> cargoes: list<item: struct<movement: string, product: string, volume: string, volume_unit: string, buyer: null, seller: null>>
>   child 0, item: struct<movement: string, product: string, volume: string, volume_unit: string, buyer: null, seller: null>
>     child 0, movement: string
>     child 1, product: string
>     child 2, volume: string
>     child 3, volume_unit: string
>     child 4, buyer: null
>     child 5, seller: null
> departure: timestamp[s]
> eta: null
> installation: null
> port_name: string
> next_zone: null
> reported_date: timestamp[s]
> shipping_agent: null
> vessel: struct<beam: null, build_year: null, call_sign: null, dead_weight: null, dwt: null, flag_code: null, flag_name: null, gross_tonnage: null, imo: string, length: int64, mmsi: null, name: string, type: null, vessel_type: null>
>   child 0, beam: null
>   child 1, build_year: null
>   child 2, call_sign: null
>   child 3, dead_weight: null
>   child 4, dwt: null
>   child 5, flag_code: null
>   child 6, flag_name: null
>   child 7, gross_tonnage: null
>   child 8, imo: string
>   child 9, length: int64
>   child 10, mmsi: null
>   child 11, name: string
>   child 12, type: null
>   child 13, vessel_type: null
> In [26]: pa.__version__
> Out[26]: '0.14.1'
> {code}
> cc [~apitrou] [~bkietz]

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (ARROW-6762) [C++] JSON reader segfaults on newline
[ https://issues.apache.org/jira/browse/ARROW-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942801#comment-16942801 ] Ben Kietzman edited comment on ARROW-6762 at 10/2/19 1:19 PM: -- This is a [stated limitation|https://github.com/apache/arrow/blob/master/cpp/src/arrow/json/options.h#L50] of the JSON parser when parsing with strict newline delimiters. Still, it shouldn't crash; we should probably change the debug assertion to an informative error message suggesting {{newlines_in_values=true}} or appending an empty line. was (Author: bkietz): This is a [stated limitation|https://github.com/apache/arrow/blob/master/cpp/src/arrow/json/options.h#L50] of the JSON parser when parsing with strict newline delimiters. Still, it shouldn't crash. I'll change the debug assertion to an informative error message suggesting {{newlines_in_values=true}} or appending an empty line. > [C++] JSON reader segfaults on newline > -- > > Key: ARROW-6762 > URL: https://issues.apache.org/jira/browse/ARROW-6762 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Reporter: Joris Van den Bossche >Assignee: Antoine Pitrou >Priority: Major > Labels: json, pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Using the {{SampleRecord.jl}} attachment from ARROW-6737, I notice that > trying to read this file on master results in a segfault: > {code} > In [1]: from pyarrow import json >...: import pyarrow.parquet as pq >...: >...: r = json.read_json('SampleRecord.jl') > WARNING: Logging before InitGoogleLogging() is written to STDERR > F1002 09:56:55.362766 13035 reader.cc:93] Check failed: > (string_view(*next_partial).find_first_not_of(" \t\n\r")) == > (string_view::npos) > *** Check failure stack trace: *** > Aborted (core dumped) > {code} > while with 0.14.1 this works fine: > {code} > In [24]: from pyarrow import json > ...: import pyarrow.parquet as pq > ...: > ...: r = json.read_json('SampleRecord.jl') > > > In [25]: r > > > Out[25]: > 
pyarrow.Table > _type: string > provider_name: string > arrival: timestamp[s] > berthed: timestamp[s] > berth: null > cargoes: list volume_unit: string, buyer: null, seller: null>> > child 0, item: struct volume_unit: string, buyer: null, seller: null> > child 0, movement: string > child 1, product: string > child 2, volume: string > child 3, volume_unit: string > child 4, buyer: null > child 5, seller: null > departure: timestamp[s] > eta: null > installation: null > port_name: string > next_zone: null > reported_date: timestamp[s] > shipping_agent: null > vessel: struct null, dwt: null, flag_code: null, flag_name: null, gross_tonnage: null, imo: > string, length: int64, mmsi: null, name: string, type: null, vessel_type: > null> > child 0, beam: null > child 1, build_year: null > child 2, call_sign: null > child 3, dead_weight: null > child 4, dwt: null > child 5, flag_code: null > child 6, flag_name: null > child 7, gross_tonnage: null > child 8, imo: string > child 9, length: int64 > child 10, mmsi: null > child 11, name: string > child 12, type: null > child 13, vessel_type: null > In [26]: pa.__version__ > > > Out[26]: '0.14.1' > {code} > cc [~apitrou] [~bkietz] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-6762) [C++] JSON reader segfaults on newline
[ https://issues.apache.org/jira/browse/ARROW-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942801#comment-16942801 ] Ben Kietzman commented on ARROW-6762: - This is a [stated limitation|https://github.com/apache/arrow/blob/master/cpp/src/arrow/json/options.h#L50] of the JSON parser when parsing with strict newline delimiters. Still, it shouldn't crash. I'll change the debug assertion to an informative error message suggesting {{newlines_in_values=true}} or appending an empty line. > [C++] JSON reader segfaults on newline > -- > > Key: ARROW-6762 > URL: https://issues.apache.org/jira/browse/ARROW-6762 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Reporter: Joris Van den Bossche >Assignee: Antoine Pitrou >Priority: Major > Labels: json, pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Using the {{SampleRecord.jl}} attachment from ARROW-6737, I notice that > trying to read this file on master results in a segfault: > {code} > In [1]: from pyarrow import json >...: import pyarrow.parquet as pq >...: >...: r = json.read_json('SampleRecord.jl') > WARNING: Logging before InitGoogleLogging() is written to STDERR > F1002 09:56:55.362766 13035 reader.cc:93] Check failed: > (string_view(*next_partial).find_first_not_of(" \t\n\r")) == > (string_view::npos) > *** Check failure stack trace: *** > Aborted (core dumped) > {code} > while with 0.14.1 this works fine: > {code} > In [24]: from pyarrow import json > ...: import pyarrow.parquet as pq > ...: > ...: r = json.read_json('SampleRecord.jl') > > > In [25]: r > > > Out[25]: > pyarrow.Table > _type: string > provider_name: string > arrival: timestamp[s] > berthed: timestamp[s] > berth: null > cargoes: list volume_unit: string, buyer: null, seller: null>> > child 0, item: struct volume_unit: string, buyer: null, seller: null> > child 0, movement: string > child 1, product: string > child 2, volume: string > child 3, volume_unit: string > child 4, buyer: null > 
child 5, seller: null > departure: timestamp[s] > eta: null > installation: null > port_name: string > next_zone: null > reported_date: timestamp[s] > shipping_agent: null > vessel: struct null, dwt: null, flag_code: null, flag_name: null, gross_tonnage: null, imo: > string, length: int64, mmsi: null, name: string, type: null, vessel_type: > null> > child 0, beam: null > child 1, build_year: null > child 2, call_sign: null > child 3, dead_weight: null > child 4, dwt: null > child 5, flag_code: null > child 6, flag_name: null > child 7, gross_tonnage: null > child 8, imo: string > child 9, length: int64 > child 10, mmsi: null > child 11, name: string > child 12, type: null > child 13, vessel_type: null > In [26]: pa.__version__ > > > Out[26]: '0.14.1' > {code} > cc [~apitrou] [~bkietz] -- This message was sent by Atlassian Jira (v8.3.4#803005)
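The strict-newline limitation Ben describes can be illustrated outside Arrow: a newline-delimited JSON reader that splits its buffer at the last {{\n}} leaves a non-empty partial remainder when the file has no trailing newline, which is exactly the situation the failed DCHECK above asserts never happens. A minimal Python sketch (stdlib only; the function name is illustrative, not Arrow's):

```python
import json

def split_chunk(chunk: bytes):
    """Split a buffered chunk at the last newline, mimicking a
    newline-delimited JSON reader: everything up to the final '\n'
    is parseable now; the tail is a partial line carried forward."""
    cut = chunk.rfind(b"\n") + 1
    return chunk[:cut], chunk[cut:]

data_with_nl = b'{"a": 1}\n{"a": 2}\n'
data_without_nl = b'{"a": 1}\n{"a": 2}'   # no trailing newline

whole, partial = split_chunk(data_with_nl)
assert partial == b""          # clean: nothing left over

whole, partial = split_chunk(data_without_nl)
assert partial == b'{"a": 2}'  # leftover record the reader must not drop

# The leftover still parses; appending '\n' to the file (or enabling
# newlines_in_values) sidesteps the strict-delimiter assumption.
records = [json.loads(line) for line in whole.splitlines()]
if partial.strip():
    records.append(json.loads(partial))
assert records == [{"a": 1}, {"a": 2}]
```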
[jira] [Assigned] (ARROW-6760) [C++] JSON: improve error message when column changed type
[ https://issues.apache.org/jira/browse/ARROW-6760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Kietzman reassigned ARROW-6760: --- Assignee: Ben Kietzman > [C++] JSON: improve error message when column changed type > -- > > Key: ARROW-6760 > URL: https://issues.apache.org/jira/browse/ARROW-6760 > Project: Apache Arrow > Issue Type: Bug > Components: Python >Reporter: harikrishnan >Assignee: Ben Kietzman >Priority: Major > Attachments: dummy.jl > > > When a column accidentally changes type in a JSON file (which is not > supported), it would be nice to get the column name that gives this problem > in the error message. > --- > I am trying to parse a simple json file. While doing so, am getting the error > {{JSON parse error: A column changed from string to number}} > {code} > from pyarrow import json > r = json.read_json('dummy.jl') > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6762) [C++] JSON reader segfaults on newline
[ https://issues.apache.org/jira/browse/ARROW-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6762: -- Labels: json pull-request-available (was: json) > [C++] JSON reader segfaults on newline > -- > > Key: ARROW-6762 > URL: https://issues.apache.org/jira/browse/ARROW-6762 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Reporter: Joris Van den Bossche >Assignee: Antoine Pitrou >Priority: Major > Labels: json, pull-request-available > > Using the {{SampleRecord.jl}} attachment from ARROW-6737, I notice that > trying to read this file on master results in a segfault: > {code} > In [1]: from pyarrow import json >...: import pyarrow.parquet as pq >...: >...: r = json.read_json('SampleRecord.jl') > WARNING: Logging before InitGoogleLogging() is written to STDERR > F1002 09:56:55.362766 13035 reader.cc:93] Check failed: > (string_view(*next_partial).find_first_not_of(" \t\n\r")) == > (string_view::npos) > *** Check failure stack trace: *** > Aborted (core dumped) > {code} > while with 0.14.1 this works fine: > {code} > In [24]: from pyarrow import json > ...: import pyarrow.parquet as pq > ...: > ...: r = json.read_json('SampleRecord.jl') > > > In [25]: r > > > Out[25]: > pyarrow.Table > _type: string > provider_name: string > arrival: timestamp[s] > berthed: timestamp[s] > berth: null > cargoes: list volume_unit: string, buyer: null, seller: null>> > child 0, item: struct volume_unit: string, buyer: null, seller: null> > child 0, movement: string > child 1, product: string > child 2, volume: string > child 3, volume_unit: string > child 4, buyer: null > child 5, seller: null > departure: timestamp[s] > eta: null > installation: null > port_name: string > next_zone: null > reported_date: timestamp[s] > shipping_agent: null > vessel: struct null, dwt: null, flag_code: null, flag_name: null, gross_tonnage: null, imo: > string, length: int64, mmsi: null, name: string, type: null, vessel_type: > null> > child 
0, beam: null > child 1, build_year: null > child 2, call_sign: null > child 3, dead_weight: null > child 4, dwt: null > child 5, flag_code: null > child 6, flag_name: null > child 7, gross_tonnage: null > child 8, imo: string > child 9, length: int64 > child 10, mmsi: null > child 11, name: string > child 12, type: null > child 13, vessel_type: null > In [26]: pa.__version__ > > > Out[26]: '0.14.1' > {code} > cc [~apitrou] [~bkietz] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-2863) [Python] Add context manager APIs to RecordBatch*Writer/Reader classes
[ https://issues.apache.org/jira/browse/ARROW-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-2863: -- Labels: pull-request-available (was: ) > [Python] Add context manager APIs to RecordBatch*Writer/Reader classes > -- > > Key: ARROW-2863 > URL: https://issues.apache.org/jira/browse/ARROW-2863 > Project: Apache Arrow > Issue Type: Improvement > Components: Python >Reporter: Wes McKinney >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > > This would cause the {{close}} method to be called when the scope exits -- This message was sent by Atlassian Jira (v8.3.4#803005)
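The behavior ARROW-2863 requests is the standard Python context-manager protocol. A minimal sketch of what the addition amounts to, using a hypothetical stand-in class rather than pyarrow's actual writer:

```python
class RecordBatchWriterSketch:
    """Hypothetical stand-in for pyarrow's RecordBatch*Writer classes,
    showing the two methods the JIRA proposes adding."""

    def __init__(self):
        self.closed = False
        self.batches = []

    def write_batch(self, batch):
        self.batches.append(batch)

    def close(self):
        self.closed = True

    # The context-manager protocol:
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()    # runs even if the with-block raises
        return False    # do not suppress exceptions

with RecordBatchWriterSketch() as writer:
    writer.write_batch({"x": [1, 2, 3]})

assert writer.closed  # close() was called when the scope exited
```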
[jira] [Updated] (ARROW-6765) 0.14.1 not available on Windows
[ https://issues.apache.org/jira/browse/ARROW-6765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yannik updated ARROW-6765: -- Component/s: Python > 0.14.1 not available on Windows > --- > > Key: ARROW-6765 > URL: https://issues.apache.org/jira/browse/ARROW-6765 > Project: Apache Arrow > Issue Type: Bug > Components: Python >Affects Versions: 0.14.1 > Environment: Windows >Reporter: Yannik >Priority: Major > > On linux, I can install pyarrow 0.14.1 from pip, but on windows the latest > seems to be 0.14.0. Why is that? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-6765) 0.14.1 not available on Windows
Yannik created ARROW-6765: - Summary: 0.14.1 not available on Windows Key: ARROW-6765 URL: https://issues.apache.org/jira/browse/ARROW-6765 Project: Apache Arrow Issue Type: Bug Affects Versions: 0.14.1 Environment: Windows Reporter: Yannik On Linux, I can install pyarrow 0.14.1 from pip, but on Windows the latest available seems to be 0.14.0. Why is that? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-6213) [C++] tests fail for AVX512
[ https://issues.apache.org/jira/browse/ARROW-6213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942740#comment-16942740 ] Charles Coulombe commented on ARROW-6213: - Ok, I'll try what you suggest. I'll keep you posted. > [C++] tests fail for AVX512 > --- > > Key: ARROW-6213 > URL: https://issues.apache.org/jira/browse/ARROW-6213 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Affects Versions: 0.14.1 > Environment: CentOS 7.6.1810, Intel Xeon Processor (Skylake, IBRS) > avx512 >Reporter: Charles Coulombe >Priority: Minor > Fix For: 2.0.0 > > Attachments: arrow-0.14.1-c++-failed-tests-cmake-conf.txt, > arrow-0.14.1-c++-failed-tests.txt, > easybuild-arrow-0.14.1-20190809.34.MgMEK.log > > > When building libraries for avx512 with GCC 7.3.0, two C++ tests fail. > {noformat} > The following tests FAILED: > 28 - arrow-compute-compare-test (Failed) > 30 - arrow-compute-filter-test (Failed) > Errors while running CTest{noformat} > while for avx2 they pass. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-6750) [Python] test_fs.py not silent
[ https://issues.apache.org/jira/browse/ARROW-6750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs reassigned ARROW-6750: -- Assignee: Antoine Pitrou > [Python] test_fs.py not silent > -- > > Key: ARROW-6750 > URL: https://issues.apache.org/jira/browse/ARROW-6750 > Project: Apache Arrow > Issue Type: Bug > Components: Python >Reporter: Antoine Pitrou >Assignee: Antoine Pitrou >Priority: Minor > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 40m > Remaining Estimate: 0h > > Some errors get displayed at the end of {{test_fs.py}}: > {code} > $ python -m pytest --tb=native pyarrow/tests/test_fs.py > === > test session starts > > platform linux -- Python 3.7.3, pytest-5.1.1, py-1.8.0, pluggy-0.12.0 > hypothesis profile 'dev' -> max_examples=10, > database=DirectoryBasedExampleDatabase('/home/antoine/arrow/dev/python/.hypothesis/examples') > rootdir: /home/antoine/arrow/dev/python, inifile: setup.cfg > plugins: timeout-1.3.3, repeat-0.8.0, hypothesis-3.82.1, lazy-fixture-0.5.2, > forked-1.0.2, xdist-1.28.0 > collected 90 items > > > pyarrow/tests/test_fs.py > .. > [100%] > > 90 passed in 1.33s > > 19-08-07T01-59-21Z > vary : Origin > x-amz-request-id : 15C988FDC2359A9C > x-xss-protection : 1; mode=block > [ERROR] 2019-10-01 13:29:28.597 AWSClient [139765563750208] HTTP response > code: 409 > Exception name: BucketAlreadyOwnedByYou > Error message: Your previous request to create the named bucket succeeded and > you already own it. > 9 response headers: > accept-ranges : bytes > content-length : 366 > content-security-policy : block-all-mixed-content > content-type : application/xml > date : Tue, 01 Oct 2019 13:29:28 GMT > server : MinIO/RELEASE.2019-08-07T01-59-21Z > vary : Origin > x-amz-request-id : 15C988FDC317620E > x-xss-protection : 1; mode=block > [etc.] > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6750) [Python] Silence S3 error logs by default
[ https://issues.apache.org/jira/browse/ARROW-6750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs updated ARROW-6750: --- Summary: [Python] Silence S3 error logs by default (was: [Python] test_fs.py not silent) > [Python] Silence S3 error logs by default > - > > Key: ARROW-6750 > URL: https://issues.apache.org/jira/browse/ARROW-6750 > Project: Apache Arrow > Issue Type: Bug > Components: Python >Reporter: Antoine Pitrou >Assignee: Antoine Pitrou >Priority: Minor > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 40m > Remaining Estimate: 0h > > Some errors get displayed at the end of {{test_fs.py}}: > {code} > $ python -m pytest --tb=native pyarrow/tests/test_fs.py > === > test session starts > > platform linux -- Python 3.7.3, pytest-5.1.1, py-1.8.0, pluggy-0.12.0 > hypothesis profile 'dev' -> max_examples=10, > database=DirectoryBasedExampleDatabase('/home/antoine/arrow/dev/python/.hypothesis/examples') > rootdir: /home/antoine/arrow/dev/python, inifile: setup.cfg > plugins: timeout-1.3.3, repeat-0.8.0, hypothesis-3.82.1, lazy-fixture-0.5.2, > forked-1.0.2, xdist-1.28.0 > collected 90 items > > > pyarrow/tests/test_fs.py > .. > [100%] > > 90 passed in 1.33s > > 19-08-07T01-59-21Z > vary : Origin > x-amz-request-id : 15C988FDC2359A9C > x-xss-protection : 1; mode=block > [ERROR] 2019-10-01 13:29:28.597 AWSClient [139765563750208] HTTP response > code: 409 > Exception name: BucketAlreadyOwnedByYou > Error message: Your previous request to create the named bucket succeeded and > you already own it. > 9 response headers: > accept-ranges : bytes > content-length : 366 > content-security-policy : block-all-mixed-content > content-type : application/xml > date : Tue, 01 Oct 2019 13:29:28 GMT > server : MinIO/RELEASE.2019-08-07T01-59-21Z > vary : Origin > x-amz-request-id : 15C988FDC317620E > x-xss-protection : 1; mode=block > [etc.] > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-6750) [Python] test_fs.py not silent
[ https://issues.apache.org/jira/browse/ARROW-6750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs resolved ARROW-6750. Fix Version/s: 1.0.0 Resolution: Fixed Issue resolved by pull request 5553 [https://github.com/apache/arrow/pull/5553] > [Python] test_fs.py not silent > -- > > Key: ARROW-6750 > URL: https://issues.apache.org/jira/browse/ARROW-6750 > Project: Apache Arrow > Issue Type: Bug > Components: Python >Reporter: Antoine Pitrou >Priority: Minor > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 40m > Remaining Estimate: 0h > > Some errors get displayed at the end of {{test_fs.py}}: > {code} > $ python -m pytest --tb=native pyarrow/tests/test_fs.py > === > test session starts > > platform linux -- Python 3.7.3, pytest-5.1.1, py-1.8.0, pluggy-0.12.0 > hypothesis profile 'dev' -> max_examples=10, > database=DirectoryBasedExampleDatabase('/home/antoine/arrow/dev/python/.hypothesis/examples') > rootdir: /home/antoine/arrow/dev/python, inifile: setup.cfg > plugins: timeout-1.3.3, repeat-0.8.0, hypothesis-3.82.1, lazy-fixture-0.5.2, > forked-1.0.2, xdist-1.28.0 > collected 90 items > > > pyarrow/tests/test_fs.py > .. > [100%] > > 90 passed in 1.33s > > 19-08-07T01-59-21Z > vary : Origin > x-amz-request-id : 15C988FDC2359A9C > x-xss-protection : 1; mode=block > [ERROR] 2019-10-01 13:29:28.597 AWSClient [139765563750208] HTTP response > code: 409 > Exception name: BucketAlreadyOwnedByYou > Error message: Your previous request to create the named bucket succeeded and > you already own it. > 9 response headers: > accept-ranges : bytes > content-length : 366 > content-security-policy : block-all-mixed-content > content-type : application/xml > date : Tue, 01 Oct 2019 13:29:28 GMT > server : MinIO/RELEASE.2019-08-07T01-59-21Z > vary : Origin > x-amz-request-id : 15C988FDC317620E > x-xss-protection : 1; mode=block > [etc.] > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-6764) [C++] Simplify readahead implementation
Antoine Pitrou created ARROW-6764: - Summary: [C++] Simplify readahead implementation Key: ARROW-6764 URL: https://issues.apache.org/jira/browse/ARROW-6764 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Antoine Pitrou The current implementation is very ad-hoc and allows unused padding arguments. We could refactor it using the Iterator facility. -- This message was sent by Atlassian Jira (v8.3.4#803005)
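The refactor Antoine suggests, replacing ad-hoc readahead machinery with an Iterator-based design, can be sketched generically: a background thread drains a source iterator into a bounded queue so the consumer overlaps I/O with processing. This is an illustrative Python analogue, not Arrow's C++ implementation:

```python
import threading
import queue

_SENTINEL = object()

def readahead(source, buffer_size=4):
    """Wrap any iterator so items are produced ahead of consumption
    by a background thread, up to buffer_size items deep."""
    q = queue.Queue(maxsize=buffer_size)

    def producer():
        for item in source:
            q.put(item)       # blocks once the buffer is full
        q.put(_SENTINEL)      # signal exhaustion

    threading.Thread(target=producer, daemon=True).start()

    while True:
        item = q.get()
        if item is _SENTINEL:
            return
        yield item

# The wrapped iterator yields the same items, in the same order.
assert list(readahead(iter(range(10)))) == list(range(10))
```

Error propagation from the producer thread is omitted here; a real implementation would also forward exceptions through the queue.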
[jira] [Assigned] (ARROW-6762) [C++] JSON reader segfaults on newline
[ https://issues.apache.org/jira/browse/ARROW-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou reassigned ARROW-6762: - Assignee: Antoine Pitrou > [C++] JSON reader segfaults on newline > -- > > Key: ARROW-6762 > URL: https://issues.apache.org/jira/browse/ARROW-6762 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Reporter: Joris Van den Bossche >Assignee: Antoine Pitrou >Priority: Major > Labels: json > > Using the {{SampleRecord.jl}} attachment from ARROW-6737, I notice that > trying to read this file on master results in a segfault: > {code} > In [1]: from pyarrow import json >...: import pyarrow.parquet as pq >...: >...: r = json.read_json('SampleRecord.jl') > WARNING: Logging before InitGoogleLogging() is written to STDERR > F1002 09:56:55.362766 13035 reader.cc:93] Check failed: > (string_view(*next_partial).find_first_not_of(" \t\n\r")) == > (string_view::npos) > *** Check failure stack trace: *** > Aborted (core dumped) > {code} > while with 0.14.1 this works fine: > {code} > In [24]: from pyarrow import json > ...: import pyarrow.parquet as pq > ...: > ...: r = json.read_json('SampleRecord.jl') > > > In [25]: r > > > Out[25]: > pyarrow.Table > _type: string > provider_name: string > arrival: timestamp[s] > berthed: timestamp[s] > berth: null > cargoes: list volume_unit: string, buyer: null, seller: null>> > child 0, item: struct volume_unit: string, buyer: null, seller: null> > child 0, movement: string > child 1, product: string > child 2, volume: string > child 3, volume_unit: string > child 4, buyer: null > child 5, seller: null > departure: timestamp[s] > eta: null > installation: null > port_name: string > next_zone: null > reported_date: timestamp[s] > shipping_agent: null > vessel: struct null, dwt: null, flag_code: null, flag_name: null, gross_tonnage: null, imo: > string, length: int64, mmsi: null, name: string, type: null, vessel_type: > null> > child 0, beam: null > child 1, build_year: null > 
child 2, call_sign: null > child 3, dead_weight: null > child 4, dwt: null > child 5, flag_code: null > child 6, flag_name: null > child 7, gross_tonnage: null > child 8, imo: string > child 9, length: int64 > child 10, mmsi: null > child 11, name: string > child 12, type: null > child 13, vessel_type: null > In [26]: pa.__version__ > > > Out[26]: '0.14.1' > {code} > cc [~apitrou] [~bkietz] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-4226) [C++] Add CSF sparse tensor support
[ https://issues.apache.org/jira/browse/ARROW-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942607#comment-16942607 ] Rok Mihevc commented on ARROW-4226: --- I was just reading it :) I'll start working on this sometime this week. > [C++] Add CSF sparse tensor support > --- > > Key: ARROW-4226 > URL: https://issues.apache.org/jira/browse/ARROW-4226 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Kenta Murata >Assignee: Rok Mihevc >Priority: Minor > Labels: sparse > Fix For: 1.0.0 > > > [https://github.com/apache/arrow/pull/2546#pullrequestreview-156064172] > {quote}Perhaps in the future, if zero-copy and future-proof-ness is really > what we want, we might want to add the CSF (compressed sparse fiber) format, > a generalisation of CSR/CSC. I'm currently working on adding it to > PyData/Sparse, and I plan to make it the preferred format (COO will still be > around though). > {quote} -- This message was sent by Atlassian Jira (v8.3.4#803005)
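CSF organizes a sorted COO tensor as a tree: each dimension stores its distinct index values plus pointers into the next dimension. A rough Python sketch of the conversion, assuming lexicographically sorted coordinates (the layout Arrow eventually adopts may differ in details):

```python
def coo_to_csf(coords):
    """Convert lexicographically sorted COO coordinates into CSF-style
    per-dimension (indices, pointers) arrays. Illustrative sketch only.

    indices[d]  : index values of the tree nodes at level d
    pointers[d] : child ranges into indices[d + 1], one extra entry at
                  the front (len == len(indices[d]) + 1)
    Cell values would live in a separate array aligned with indices[-1].
    """
    ndim = len(coords[0])
    indices = [[] for _ in range(ndim)]
    pointers = [[0] for _ in range(ndim - 1)]

    def build(level, start, stop):
        i = start
        while i < stop:
            val = coords[i][level]
            j = i
            while j < stop and coords[j][level] == val:
                j += 1                      # group equal indices at this level
            indices[level].append(val)
            if level < ndim - 1:
                build(level + 1, i, j)      # recurse into the child slice
                pointers[level].append(len(indices[level + 1]))
            i = j

    build(0, 0, len(coords))
    return indices, pointers

# 3-D tensor with nonzeros at (0,0,1), (0,1,0), (1,0,0), (1,0,2)
idx, ptr = coo_to_csf([(0, 0, 1), (0, 1, 0), (1, 0, 0), (1, 0, 2)])
assert idx == [[0, 1], [0, 1, 0], [1, 0, 0, 2]]
assert ptr == [[0, 2, 3], [0, 1, 2, 4]]
```

For a 2-D tensor this degenerates to CSR, which is why CSF is described in the quoted PR review as a generalisation of CSR/CSC.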
[jira] [Commented] (ARROW-4226) [C++] Add CSF sparse tensor support
[ https://issues.apache.org/jira/browse/ARROW-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942602#comment-16942602 ] Kenta Murata commented on ARROW-4226: - [~rokm] Did you check the issue [pydata/sparse#125|https://github.com/pydata/sparse/issues/125]? > [C++] Add CSF sparse tensor support > --- > > Key: ARROW-4226 > URL: https://issues.apache.org/jira/browse/ARROW-4226 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Kenta Murata >Assignee: Rok Mihevc >Priority: Minor > Labels: sparse > Fix For: 1.0.0 > > > [https://github.com/apache/arrow/pull/2546#pullrequestreview-156064172] > {quote}Perhaps in the future, if zero-copy and future-proof-ness is really > what we want, we might want to add the CSF (compressed sparse fiber) format, > a generalisation of CSR/CSC. I'm currently working on adding it to > PyData/Sparse, and I plan to make it the preferred format (COO will still be > around though). > {quote} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6763) [Python] Parquet s3 tests are skipped because dependencies are not installed
[ https://issues.apache.org/jira/browse/ARROW-6763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6763: -- Labels: pull-request-available (was: ) > [Python] Parquet s3 tests are skipped because dependencies are not installed > > > Key: ARROW-6763 > URL: https://issues.apache.org/jira/browse/ARROW-6763 > Project: Apache Arrow > Issue Type: Test > Components: Python >Reporter: Joris Van den Bossche >Priority: Minor > Labels: pull-request-available > > Currently the s3 parquet test is skipped on both Travis as ursabot -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-4226) [C++] Add CSF sparse tensor support
[ https://issues.apache.org/jira/browse/ARROW-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rok Mihevc reassigned ARROW-4226: - Assignee: Rok Mihevc (was: Kenta Murata) > [C++] Add CSF sparse tensor support > --- > > Key: ARROW-4226 > URL: https://issues.apache.org/jira/browse/ARROW-4226 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Kenta Murata >Assignee: Rok Mihevc >Priority: Minor > Labels: sparse > Fix For: 1.0.0 > > > [https://github.com/apache/arrow/pull/2546#pullrequestreview-156064172] > {quote}Perhaps in the future, if zero-copy and future-proof-ness is really > what we want, we might want to add the CSF (compressed sparse fiber) format, > a generalisation of CSR/CSC. I'm currently working on adding it to > PyData/Sparse, and I plan to make it the preferred format (COO will still be > around though). > {quote} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-4225) [C++] Add CSC sparse matrix support
[ https://issues.apache.org/jira/browse/ARROW-4225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942598#comment-16942598 ] Rok Mihevc commented on ARROW-4225: --- Ok, I'll check the paper and see if I can get somewhere. :) > [C++] Add CSC sparse matrix support > --- > > Key: ARROW-4225 > URL: https://issues.apache.org/jira/browse/ARROW-4225 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Kenta Murata >Assignee: Kenta Murata >Priority: Minor > Labels: sparse > Fix For: 1.0.0 > > > CSC sparse matrix is necessary for integration with existing sparse matrix > libraries (umfpack, superlu). > https://github.com/apache/arrow/pull/2546#issuecomment-422135645 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-6762) [C++] JSON reader segfaults on newline
[ https://issues.apache.org/jira/browse/ARROW-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942597#comment-16942597 ] Antoine Pitrou commented on ARROW-6762: --- Ok, the issue is that the JSON reader assumes the file always ends with a newline. Some JSON files may not have a newline at the end of the last line. So it would not crash in release mode (it's a debug assertion), but probably produce the wrong result. > [C++] JSON reader segfaults on newline > -- > > Key: ARROW-6762 > URL: https://issues.apache.org/jira/browse/ARROW-6762 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Reporter: Joris Van den Bossche >Priority: Major > Labels: json > > Using the {{SampleRecord.jl}} attachment from ARROW-6737, I notice that > trying to read this file on master results in a segfault: > {code} > In [1]: from pyarrow import json >...: import pyarrow.parquet as pq >...: >...: r = json.read_json('SampleRecord.jl') > WARNING: Logging before InitGoogleLogging() is written to STDERR > F1002 09:56:55.362766 13035 reader.cc:93] Check failed: > (string_view(*next_partial).find_first_not_of(" \t\n\r")) == > (string_view::npos) > *** Check failure stack trace: *** > Aborted (core dumped) > {code} > while with 0.14.1 this works fine: > {code} > In [24]: from pyarrow import json > ...: import pyarrow.parquet as pq > ...: > ...: r = json.read_json('SampleRecord.jl') > > > In [25]: r > > > Out[25]: > pyarrow.Table > _type: string > provider_name: string > arrival: timestamp[s] > berthed: timestamp[s] > berth: null > cargoes: list volume_unit: string, buyer: null, seller: null>> > child 0, item: struct volume_unit: string, buyer: null, seller: null> > child 0, movement: string > child 1, product: string > child 2, volume: string > child 3, volume_unit: string > child 4, buyer: null > child 5, seller: null > departure: timestamp[s] > eta: null > installation: null > port_name: string > next_zone: null > reported_date: timestamp[s] > shipping_agent: 
null > vessel: struct null, dwt: null, flag_code: null, flag_name: null, gross_tonnage: null, imo: > string, length: int64, mmsi: null, name: string, type: null, vessel_type: > null> > child 0, beam: null > child 1, build_year: null > child 2, call_sign: null > child 3, dead_weight: null > child 4, dwt: null > child 5, flag_code: null > child 6, flag_name: null > child 7, gross_tonnage: null > child 8, imo: string > child 9, length: int64 > child 10, mmsi: null > child 11, name: string > child 12, type: null > child 13, vessel_type: null > In [26]: pa.__version__ > > > Out[26]: '0.14.1' > {code} > cc [~apitrou] [~bkietz] -- This message was sent by Atlassian Jira (v8.3.4#803005)
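Until the debug assertion is replaced with a proper error, the crash Antoine describes can be avoided by making sure the payload ends with a newline before it reaches the reader. A trivial, hypothetical helper:

```python
def ensure_trailing_newline(data: bytes) -> bytes:
    """Append a final '\n' if the newline-delimited JSON payload
    lacks one, so every record is newline-terminated."""
    return data if data.endswith(b"\n") else data + b"\n"

assert ensure_trailing_newline(b'{"a": 1}') == b'{"a": 1}\n'
assert ensure_trailing_newline(b'{"a": 1}\n') == b'{"a": 1}\n'
```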
[jira] [Created] (ARROW-6763) [Python] Parquet s3 tests are skipped because dependencies are not installed
Joris Van den Bossche created ARROW-6763: Summary: [Python] Parquet s3 tests are skipped because dependencies are not installed Key: ARROW-6763 URL: https://issues.apache.org/jira/browse/ARROW-6763 Project: Apache Arrow Issue Type: Test Components: Python Reporter: Joris Van den Bossche Currently the s3 parquet test is skipped on both Travis and ursabot -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-6752) [Go] implement Stringer for Null array
[ https://issues.apache.org/jira/browse/ARROW-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastien Binet resolved ARROW-6752. Fix Version/s: 0.15.0 Resolution: Fixed Issue resolved by pull request [https://github.com/apache/arrow/pull/] > [Go] implement Stringer for Null array > -- > > Key: ARROW-6752 > URL: https://issues.apache.org/jira/browse/ARROW-6752 > Project: Apache Arrow > Issue Type: New Feature > Components: Go >Reporter: Sebastien Binet >Assignee: Sebastien Binet >Priority: Major > Labels: pull-request-available > Fix For: 0.15.0 > > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-6760) [C++] JSON: improve error message when column changed type
[ https://issues.apache.org/jira/browse/ARROW-6760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942582#comment-16942582 ] Joris Van den Bossche commented on ARROW-6760: -- Indeed, a better error message would be nice. Renamed the issue to reflect this. > [C++] JSON: improve error message when column changed type > -- > > Key: ARROW-6760 > URL: https://issues.apache.org/jira/browse/ARROW-6760 > Project: Apache Arrow > Issue Type: Bug > Components: Python >Reporter: harikrishnan >Priority: Major > Attachments: dummy.jl > > > When a column accidentally changes type in a JSON file (which is not > supported), it would be nice to get the column name that gives this problem > in the error message. > --- > I am trying to parse a simple json file. While doing so, am getting the error > {{JSON parse error: A column changed from string to number}} > {code} > from pyarrow import json > r = json.read_json('dummy.jl') > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6760) [C++] JSON: improve error message when column changed type
[ https://issues.apache.org/jira/browse/ARROW-6760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-6760: - Description: When a column accidentally changes type in a JSON file (which is not supported), it would be nice to get the column name that gives this problem in the error message. --- I am trying to parse a simple json file. While doing so, am getting the error {{JSON parse error: A column changed from string to number}} {code} from pyarrow import json r = json.read_json('dummy.jl') {code} was: I am trying to parse a simple json file. While doing so, am getting the error {{JSON parse error: A column changed from string to number}} {code} from pyarrow import json r = json.read_json('dummy.jl') {code} > [C++] JSON: improve error message when column changed type > -- > > Key: ARROW-6760 > URL: https://issues.apache.org/jira/browse/ARROW-6760 > Project: Apache Arrow > Issue Type: Bug > Components: Python >Reporter: harikrishnan >Priority: Major > Attachments: dummy.jl > > > When a column accidentally changes type in a JSON file (which is not > supported), it would be nice to get the column name that gives this problem > in the error message. > --- > I am trying to parse a simple json file. While doing so, am getting the error > {{JSON parse error: A column changed from string to number}} > {code} > from pyarrow import json > r = json.read_json('dummy.jl') > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6760) [C++] JSON: improve error message when column changed type
[ https://issues.apache.org/jira/browse/ARROW-6760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-6760: - Summary: [C++] JSON: improve error message when column changed type (was: JSON parse error: A column changed from string to number) > [C++] JSON: improve error message when column changed type > -- > > Key: ARROW-6760 > URL: https://issues.apache.org/jira/browse/ARROW-6760 > Project: Apache Arrow > Issue Type: Bug > Components: Python >Reporter: harikrishnan >Priority: Major > Attachments: dummy.jl > > > I am trying to parse a simple json file. While doing so, am getting the error > {{JSON parse error: A column changed from string to number}} > {code} > from pyarrow import json > r = json.read_json('dummy.jl') > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-4225) [C++] Add CSC sparse matrix support
[ https://issues.apache.org/jira/browse/ARROW-4225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942578#comment-16942578 ] Kenta Murata commented on ARROW-4225: - @rok No, I didn't. I guess there is no library that has CSF tensor implementation, although the paper exists. So I'm waiting for pydata/sparse's implementation. > [C++] Add CSC sparse matrix support > --- > > Key: ARROW-4225 > URL: https://issues.apache.org/jira/browse/ARROW-4225 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Kenta Murata >Assignee: Kenta Murata >Priority: Minor > Labels: sparse > Fix For: 1.0.0 > > > CSC sparse matrix is necessary for integration with existing sparse matrix > libraries (umfpack, superlu). > https://github.com/apache/arrow/pull/2546#issuecomment-422135645 -- This message was sent by Atlassian Jira (v8.3.4#803005)
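For context on the ticket above: a CSC (compressed sparse column) matrix stores three buffers — the non-zero values in column-major order, the row index of each value, and a pointer array marking where each column starts. A minimal pure-Python sketch of that layout (illustration only; the function name is hypothetical and this is not Arrow's C++ implementation):

```python
def dense_to_csc(matrix):
    """Convert a row-major dense matrix (list of lists) to CSC buffers."""
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if n_rows else 0
    data, indices, indptr = [], [], [0]
    for col in range(n_cols):          # walk column by column
        for row in range(n_rows):
            value = matrix[row][col]
            if value != 0:
                data.append(value)     # non-zero values, column-major order
                indices.append(row)    # row index of each stored value
        indptr.append(len(data))       # offset where the next column starts
    return data, indices, indptr

dense = [
    [1, 0, 2],
    [0, 0, 3],
    [4, 5, 6],
]
data, indices, indptr = dense_to_csc(dense)
print(data)     # [1, 4, 5, 2, 3, 6]
print(indices)  # [0, 2, 2, 0, 1, 2]
print(indptr)   # [0, 2, 3, 6]
```

Libraries such as umfpack and superlu consume exactly these three buffers, which is why the ticket calls CSC support necessary for interoperability.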
[jira] [Commented] (ARROW-6760) JSON parse error: A column changed from string to number
[ https://issues.apache.org/jira/browse/ARROW-6760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942574#comment-16942574 ] harikrishnan commented on ARROW-6760: - Ah, I see. Thanks for the quick reply [~apitrou]. Yes, listing the column name in the error message would definitely be a real time-saver when it comes to debugging. > JSON parse error: A column changed from string to number > > > Key: ARROW-6760 > URL: https://issues.apache.org/jira/browse/ARROW-6760 > Project: Apache Arrow > Issue Type: Bug > Components: Python >Reporter: harikrishnan >Priority: Major > Attachments: dummy.jl > > > I am trying to parse a simple json file. While doing so, am getting the error > {{JSON parse error: A column changed from string to number}} > {code} > from pyarrow import json > r = json.read_json('dummy.jl') > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-4225) [C++] Add CSC sparse matrix support
[ https://issues.apache.org/jira/browse/ARROW-4225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942568#comment-16942568 ] Rok Mihevc commented on ARROW-4225: --- That's great @mrkn! Did you also start on [CSF|https://issues.apache.org/jira/browse/ARROW-4226]? If not I'll pick it up. > [C++] Add CSC sparse matrix support > --- > > Key: ARROW-4225 > URL: https://issues.apache.org/jira/browse/ARROW-4225 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Kenta Murata >Assignee: Kenta Murata >Priority: Minor > Labels: sparse > Fix For: 1.0.0 > > > CSC sparse matrix is necessary for integration with existing sparse matrix > libraries (umfpack, superlu). > https://github.com/apache/arrow/pull/2546#issuecomment-422135645 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-4226) [C++] Add CSF sparse tensor support
[ https://issues.apache.org/jira/browse/ARROW-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kenta Murata reassigned ARROW-4226: --- Assignee: Kenta Murata > [C++] Add CSF sparse tensor support > --- > > Key: ARROW-4226 > URL: https://issues.apache.org/jira/browse/ARROW-4226 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Kenta Murata >Assignee: Kenta Murata >Priority: Minor > Labels: sparse > Fix For: 1.0.0 > > > [https://github.com/apache/arrow/pull/2546#pullrequestreview-156064172] > {quote}Perhaps in the future, if zero-copy and future-proof-ness is really > what we want, we might want to add the CSF (compressed sparse fiber) format, > a generalisation of CSR/CSC. I'm currently working on adding it to > PyData/Sparse, and I plan to make it the preferred format (COO will still be > around though). > {quote} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-6737) Nested column branch had multiple children
[ https://issues.apache.org/jira/browse/ARROW-6737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942567#comment-16942567 ] Joris Van den Bossche commented on ARROW-6737: -- I noticed that reading this file on master actually gives problems, while it works on 0.14.1, so opened ARROW-6762 for that. > Nested column branch had multiple children > -- > > Key: ARROW-6737 > URL: https://issues.apache.org/jira/browse/ARROW-6737 > Project: Apache Arrow > Issue Type: Bug > Components: Python >Reporter: harikrishnan >Priority: Major > Attachments: SampleRecord.jl > > > {code} > from pyarrow import json > import pyarrow.parquet as pq > r = json.read_json('example.jl') > pq.write_table(r, 'example.parquet') > {code} > Doing the above operation resulting in {{ArrowInvalid: Nested column branch > had multiple children}} > Posting it here as per the request from > https://github.com/apache/arrow/issues/4045#issuecomment-535867640 > The sample schema looks like this > {code} > package_version: string > source_version: string > uuid: string > _type: string > position: struct draught_raw: null, heading: double, lat: double, lon: double, nav_state: > int64, received_time: timestamp[s], speed: double> > child 0, ais_type: string > child 1, course: double > child 2, draught: double > child 3, draught_raw: null > child 4, heading: double > child 5, lat: double > child 6, lon: double > child 7, nav_state: int64 > child 8, received_time: timestamp[s] > child 9, speed: double > provider_name: string > vessel: struct null, dwt: null, flag_code: null, flag_name: string, gross_tonnage: null, > imo: string, length: null, mmsi: string, name: string, type: null, > vessel_type: string> > child 0, beam: null > child 1, build_year: null > child 2, call_sign: string > child 3, dead_weight: null > child 4, dwt: null > child 5, flag_code: null > child 6, flag_name: string > child 7, gross_tonnage: null > child 8, imo: string > child 9, length: null > child 10, mmsi: string > 
child 11, name: string > child 12, type: null > child 13, vessel_type: string > source_provider: string > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-4222) [C++] Support equality comparison between COO and CSR sparse tensors in SparseTensorEquals
[ https://issues.apache.org/jira/browse/ARROW-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kenta Murata reassigned ARROW-4222: --- Assignee: Kenta Murata > [C++] Support equality comparison between COO and CSR sparse tensors in > SparseTensorEquals > -- > > Key: ARROW-4222 > URL: https://issues.apache.org/jira/browse/ARROW-4222 > Project: Apache Arrow > Issue Type: Improvement > Components: C++ >Reporter: Kenta Murata >Assignee: Kenta Murata >Priority: Minor > Labels: sparse > Fix For: 2.0.0 > > > Currently SparseTensorEquals always returns false when it gets COO and CSR > sparse tensors. > It should support comparing the items in the sparse tensors. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-4225) [C++] Add CSC sparse matrix support
[ https://issues.apache.org/jira/browse/ARROW-4225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kenta Murata reassigned ARROW-4225: --- Assignee: Kenta Murata (was: Rok Mihevc) > [C++] Add CSC sparse matrix support > --- > > Key: ARROW-4225 > URL: https://issues.apache.org/jira/browse/ARROW-4225 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Kenta Murata >Assignee: Kenta Murata >Priority: Minor > Labels: sparse > Fix For: 1.0.0 > > > CSC sparse matrix is necessary for integration with existing sparse matrix > libraries (umfpack, superlu). > https://github.com/apache/arrow/pull/2546#issuecomment-422135645 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-6762) [C++] JSON reader segfaults on newline
Joris Van den Bossche created ARROW-6762: Summary: [C++] JSON reader segfaults on newline Key: ARROW-6762 URL: https://issues.apache.org/jira/browse/ARROW-6762 Project: Apache Arrow Issue Type: Bug Components: C++ Reporter: Joris Van den Bossche Using the {{SampleRecord.jl}} attachment from ARROW-6737, I notice that trying to read this file on master results in a segfault: {code} In [1]: from pyarrow import json ...: import pyarrow.parquet as pq ...: ...: r = json.read_json('SampleRecord.jl') WARNING: Logging before InitGoogleLogging() is written to STDERR F1002 09:56:55.362766 13035 reader.cc:93] Check failed: (string_view(*next_partial).find_first_not_of(" \t\n\r")) == (string_view::npos) *** Check failure stack trace: *** Aborted (core dumped) {code} while with 0.14.1 this works fine: {code} In [24]: from pyarrow import json ...: import pyarrow.parquet as pq ...: ...: r = json.read_json('SampleRecord.jl') In [25]: r Out[25]: pyarrow.Table _type: string provider_name: string arrival: timestamp[s] berthed: timestamp[s] berth: null cargoes: list> child 0, item: struct child 0, movement: string child 1, product: string child 2, volume: string child 3, volume_unit: string child 4, buyer: null child 5, seller: null departure: timestamp[s] eta: null installation: null port_name: string next_zone: null reported_date: timestamp[s] shipping_agent: null vessel: struct child 0, beam: null child 1, build_year: null child 2, call_sign: null child 3, dead_weight: null child 4, dwt: null child 5, flag_code: null child 6, flag_name: null child 7, gross_tonnage: null child 8, imo: string child 9, length: int64 child 10, mmsi: null child 11, name: string child 12, type: null child 13, vessel_type: null In [26]: pa.__version__ Out[26]: '0.14.1' {code} cc [~apitrou] [~bkietz] -- This message was sent by Atlassian Jira (v8.3.4#803005)
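The failed `Check failed: (string_view(*next_partial).find_first_not_of(" \t\n\r")) == (string_view::npos)` in reader.cc asserts that whatever trails the last complete line of a chunk is whitespace only. A minimal sketch of that invariant in Python (hypothetical helper, not Arrow's C++ code), showing the kind of input that satisfies or violates it:

```python
# The DCHECK from the crash log requires the leftover "partial" bytes at the
# end of a chunk to contain nothing but whitespace.
WHITESPACE = " \t\n\r"

def trailing_partial_is_whitespace(chunk: bytes) -> bool:
    """Return True if everything after the last newline is whitespace."""
    last_newline = chunk.rfind(b"\n")
    partial = chunk[last_newline + 1:]
    return all(chr(byte) in WHITESPACE for byte in partial)

# A file ending in a clean newline satisfies the invariant ...
assert trailing_partial_is_whitespace(b'{"a": 1}\n')
# ... while a truncated record at the end does not.
assert not trailing_partial_is_whitespace(b'{"a": 1}\n{"b":')
```

In a debug build a violated invariant like this aborts the process, which matches the reported core dump rather than a Python-level error.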
[jira] [Commented] (ARROW-6760) JSON parse error: A column changed from string to number
[ https://issues.apache.org/jira/browse/ARROW-6760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942566#comment-16942566 ] Antoine Pitrou commented on ARROW-6760: --- Hmm, we should probably give better error messages. [~bkietz] In this case, though, it seems the "length" field is first a string, then an integer. Arrow only accepts homogeneous JSON, i.e. all objects in the same file must have the same schema. > JSON parse error: A column changed from string to number > > > Key: ARROW-6760 > URL: https://issues.apache.org/jira/browse/ARROW-6760 > Project: Apache Arrow > Issue Type: Bug > Components: Python >Reporter: harikrishnan >Priority: Major > Attachments: dummy.jl > > > I am trying to parse a simple json file. While doing so, am getting the error > {{JSON parse error: A column changed from string to number}} > {code} > from pyarrow import json > r = json.read_json('dummy.jl') > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
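The improvement requested here — naming the column whose type changed — can be sketched as a pure-Python pass over the JSON Lines input (illustration only; this is not Arrow's C++ reader, and the helper name is hypothetical):

```python
import json

def first_type_change(lines):
    """Return (column, old_type, new_type) for the first field whose JSON
    type changes between records, or None if the input is homogeneous."""
    seen = {}
    for line in lines:
        record = json.loads(line)
        for key, value in record.items():
            kind = type(value).__name__
            if key in seen and seen[key] != kind:
                return key, seen[key], kind   # name the offending column
            seen[key] = kind
    return None

# Mirrors the reported case: "length" is first a string, then an integer.
rows = ['{"name": "a", "length": "120"}',
        '{"name": "b", "length": 140}']
print(first_type_change(rows))  # ('length', 'str', 'int')
```

An error message built from that tuple ("column 'length' changed from string to number") would let users find the bad record without bisecting the file.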
[jira] [Commented] (ARROW-4225) [C++] Add CSC sparse matrix support
[ https://issues.apache.org/jira/browse/ARROW-4225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942565#comment-16942565 ] Kenta Murata commented on ARROW-4225: - @rok I've already started to work on this ticket. I'm sorry for forgetting to update the ticket property. > [C++] Add CSC sparse matrix support > --- > > Key: ARROW-4225 > URL: https://issues.apache.org/jira/browse/ARROW-4225 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Kenta Murata >Assignee: Rok Mihevc >Priority: Minor > Labels: sparse > Fix For: 1.0.0 > > > CSC sparse matrix is necessary for integration with existing sparse matrix > libraries (umfpack, superlu). > https://github.com/apache/arrow/pull/2546#issuecomment-422135645 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-4225) [C++] Add CSC sparse matrix support
[ https://issues.apache.org/jira/browse/ARROW-4225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rok Mihevc reassigned ARROW-4225: - Assignee: Rok Mihevc > [C++] Add CSC sparse matrix support > --- > > Key: ARROW-4225 > URL: https://issues.apache.org/jira/browse/ARROW-4225 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Kenta Murata >Assignee: Rok Mihevc >Priority: Minor > Labels: sparse > Fix For: 1.0.0 > > > CSC sparse matrix is necessary for integration with existing sparse matrix > libraries (umfpack, superlu). > https://github.com/apache/arrow/pull/2546#issuecomment-422135645 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-6757) [Python] Creating csv.ParseOptions() causes "Windows fatal exception: access violation" with Visual Studio 2017
[ https://issues.apache.org/jira/browse/ARROW-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942562#comment-16942562 ] Antoine Pitrou commented on ARROW-6757: --- Does the debugger tell you something? > [Python] Creating csv.ParseOptions() causes "Windows fatal exception: access > violation" with Visual Studio 2017 > --- > > Key: ARROW-6757 > URL: https://issues.apache.org/jira/browse/ARROW-6757 > Project: Apache Arrow > Issue Type: Bug > Components: Python >Reporter: Wes McKinney >Priority: Major > Fix For: 1.0.0 > > > I encountered this when trying to verify the release with MSVC 2017. It may > be particular to this machine or build (though it's 100% reproducible for > me). I will check the Windows wheels to see if it occurs there, too > {code} > (C:\tmp\arrow-verify-release\conda-env) λ python > Python 3.7.3 | packaged by conda-forge | (default, Jul 1 2019, 22:01:29) > [MSC v.1900 64 bit (AMD64)] :: Anaconda, Inc. on win32 > Type "help", "copyright", "credits" or "license" for more information. > >>> import pyarrow.csv as pc > >>> pc.ParseOptions() > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (ARROW-6737) Nested column branch had multiple children
[ https://issues.apache.org/jira/browse/ARROW-6737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche closed ARROW-6737. Resolution: Duplicate > Nested column branch had multiple children > -- > > Key: ARROW-6737 > URL: https://issues.apache.org/jira/browse/ARROW-6737 > Project: Apache Arrow > Issue Type: Bug > Components: Python >Reporter: harikrishnan >Priority: Major > Attachments: SampleRecord.jl > > > {code} > from pyarrow import json > import pyarrow.parquet as pq > r = json.read_json('example.jl') > pq.write_table(r, 'example.parquet') > {code} > Doing the above operation resulting in {{ArrowInvalid: Nested column branch > had multiple children}} > Posting it here as per the request from > https://github.com/apache/arrow/issues/4045#issuecomment-535867640 > The sample schema looks like this > {code} > package_version: string > source_version: string > uuid: string > _type: string > position: struct draught_raw: null, heading: double, lat: double, lon: double, nav_state: > int64, received_time: timestamp[s], speed: double> > child 0, ais_type: string > child 1, course: double > child 2, draught: double > child 3, draught_raw: null > child 4, heading: double > child 5, lat: double > child 6, lon: double > child 7, nav_state: int64 > child 8, received_time: timestamp[s] > child 9, speed: double > provider_name: string > vessel: struct null, dwt: null, flag_code: null, flag_name: string, gross_tonnage: null, > imo: string, length: null, mmsi: string, name: string, type: null, > vessel_type: string> > child 0, beam: null > child 1, build_year: null > child 2, call_sign: string > child 3, dead_weight: null > child 4, dwt: null > child 5, flag_code: null > child 6, flag_name: string > child 7, gross_tonnage: null > child 8, imo: string > child 9, length: null > child 10, mmsi: string > child 11, name: string > child 12, type: null > child 13, vessel_type: string > source_provider: string > {code} > > -- This message was sent 
by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ARROW-6737) Nested column branch had multiple children
[ https://issues.apache.org/jira/browse/ARROW-6737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16942560#comment-16942560 ] Joris Van den Bossche commented on ARROW-6737: -- Thanks for providing the sample file. This is indeed a duplicate of ARROW-1644. Nested lists/structs are currently not yet supported in the Arrow parquet IO implementation. > Nested column branch had multiple children > -- > > Key: ARROW-6737 > URL: https://issues.apache.org/jira/browse/ARROW-6737 > Project: Apache Arrow > Issue Type: Bug > Components: Python >Reporter: harikrishnan >Priority: Major > Attachments: SampleRecord.jl > > > {code} > from pyarrow import json > import pyarrow.parquet as pq > r = json.read_json('example.jl') > pq.write_table(r, 'example.parquet') > {code} > Doing the above operation resulting in {{ArrowInvalid: Nested column branch > had multiple children}} > Posting it here as per the request from > https://github.com/apache/arrow/issues/4045#issuecomment-535867640 > The sample schema looks like this > {code} > package_version: string > source_version: string > uuid: string > _type: string > position: struct draught_raw: null, heading: double, lat: double, lon: double, nav_state: > int64, received_time: timestamp[s], speed: double> > child 0, ais_type: string > child 1, course: double > child 2, draught: double > child 3, draught_raw: null > child 4, heading: double > child 5, lat: double > child 6, lon: double > child 7, nav_state: int64 > child 8, received_time: timestamp[s] > child 9, speed: double > provider_name: string > vessel: struct null, dwt: null, flag_code: null, flag_name: string, gross_tonnage: null, > imo: string, length: null, mmsi: string, name: string, type: null, > vessel_type: string> > child 0, beam: null > child 1, build_year: null > child 2, call_sign: string > child 3, dead_weight: null > child 4, dwt: null > child 5, flag_code: null > child 6, flag_name: string > child 7, gross_tonnage: null > child 8, imo: string > 
child 9, length: null > child 10, mmsi: string > child 11, name: string > child 12, type: null > child 13, vessel_type: string > source_provider: string > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
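The "multiple children" in the error above comes from struct columns such as {{position}} and {{vessel}}, each of which flattens to one leaf column per child — the shape the Parquet writer could not yet handle at the time (tracked in ARROW-1644). A tiny sketch of that flattening (pure Python, hypothetical helper, not Arrow's implementation):

```python
def flatten(record, prefix=""):
    """Flatten nested dicts into one dotted path per leaf column."""
    columns = {}
    for key, value in record.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            # A struct branch expands into multiple child columns.
            columns.update(flatten(value, prefix=path + "."))
        else:
            columns[path] = value
    return columns

record = {"uuid": "123", "position": {"lat": 1.0, "lon": 2.0}}
print(flatten(record))
# {'uuid': '123', 'position.lat': 1.0, 'position.lon': 2.0}
```

Here the single branch {{position}} produces two leaf columns, which is exactly the situation the {{ArrowInvalid: Nested column branch had multiple children}} message rejects.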
[jira] [Updated] (ARROW-6761) [Rust] Travis CI builds not respecting rust-toolchain
[ https://issues.apache.org/jira/browse/ARROW-6761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove updated ARROW-6761: -- Summary: [Rust] Travis CI builds not respecting rust-toolchain (was: [Rust] Builds failing due to Rust Internal Compiler Error) > [Rust] Travis CI builds not respecting rust-toolchain > - > > Key: ARROW-6761 > URL: https://issues.apache.org/jira/browse/ARROW-6761 > Project: Apache Arrow > Issue Type: Bug > Components: Rust >Affects Versions: 1.0.0 >Reporter: Andy Grove >Assignee: Andy Grove >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > Travis builds recently started failing with a Rust ICE (Internal Compiler > Error) which has been reported to the Rust compiler team > ([https://github.com/rust-lang/rust/issues/64908]). > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ARROW-6730) [CI] Use GitHub Actions for "C++ with clang 7" docker image
[ https://issues.apache.org/jira/browse/ARROW-6730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kouhei Sutou resolved ARROW-6730. - Fix Version/s: 1.0.0 Resolution: Fixed Issue resolved by pull request 5530 [https://github.com/apache/arrow/pull/5530] > [CI] Use GitHub Actions for "C++ with clang 7" docker image > --- > > Key: ARROW-6730 > URL: https://issues.apache.org/jira/browse/ARROW-6730 > Project: Apache Arrow > Issue Type: New Feature > Components: Continuous Integration >Reporter: Francois Saint-Jacques >Assignee: Francois Saint-Jacques >Priority: Major > Labels: pull-request-available > Fix For: 1.0.0 > > Time Spent: 4h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ARROW-6730) [CI] Use GitHub Actions for "C++ with clang 7" docker image
[ https://issues.apache.org/jira/browse/ARROW-6730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kouhei Sutou reassigned ARROW-6730: --- Assignee: Francois Saint-Jacques > [CI] Use GitHub Actions for "C++ with clang 7" docker image > --- > > Key: ARROW-6730 > URL: https://issues.apache.org/jira/browse/ARROW-6730 > Project: Apache Arrow > Issue Type: New Feature > Components: Continuous Integration >Reporter: Francois Saint-Jacques >Assignee: Francois Saint-Jacques >Priority: Major > Labels: pull-request-available > Time Spent: 4h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ARROW-6730) [CI] Use GitHub Actions for "C++ with clang 7" docker image
[ https://issues.apache.org/jira/browse/ARROW-6730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kouhei Sutou updated ARROW-6730: Summary: [CI] Use GitHub Actions for "C++ with clang 7" docker image (was: [CI] Use Github Actions for "C++ with clang 7" docker image) > [CI] Use GitHub Actions for "C++ with clang 7" docker image > --- > > Key: ARROW-6730 > URL: https://issues.apache.org/jira/browse/ARROW-6730 > Project: Apache Arrow > Issue Type: New Feature > Components: Continuous Integration >Reporter: Francois Saint-Jacques >Priority: Major > Labels: pull-request-available > Time Spent: 4h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)