[jira] [Created] (ARROW-12501) [CI][Ruby] Remove needless workaround for MinGW build
Kouhei Sutou created ARROW-12501:

Summary: [CI][Ruby] Remove needless workaround for MinGW build
Key: ARROW-12501
URL: https://issues.apache.org/jira/browse/ARROW-12501
Project: Apache Arrow
Issue Type: Improvement
Components: Continuous Integration, Ruby
Reporter: Kouhei Sutou
Assignee: Kouhei Sutou

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-12500) [C++][Dataset] Consolidate similar tests for file formats
David Li created ARROW-12500:

Summary: [C++][Dataset] Consolidate similar tests for file formats
Key: ARROW-12500
URL: https://issues.apache.org/jira/browse/ARROW-12500
Project: Apache Arrow
Issue Type: Improvement
Components: C++
Reporter: David Li
Assignee: David Li
Fix For: 5.0.0

Between CSV/Parquet/IPC we have a number of very similar or in some cases essentially identical tests. As we're doing more refactoring and development it would be nice to consolidate these tests so that we can ensure all formats behave consistently and get the same level of testing. For instance, ARROW-11772 now adds more comprehensive tests for scanning IPC which don't yet apply to Parquet/CSV.

This sort of consolidation may also be nice to do in Python.
[jira] [Created] (ARROW-12499) [C++][Compute] Add ScalarAggregateOptions to Any and All kernels
Rok Mihevc created ARROW-12499:

Summary: [C++][Compute] Add ScalarAggregateOptions to Any and All kernels
Key: ARROW-12499
URL: https://issues.apache.org/jira/browse/ARROW-12499
Project: Apache Arrow
Issue Type: Improvement
Components: C++, Python, R
Reporter: Rok Mihevc
Assignee: Rok Mihevc

Follow up to ARROW-9054 and ARROW-12185 - see [comment|https://github.com/apache/arrow/pull/10032#pullrequestreview-641468079].
[jira] [Created] (ARROW-12498) cannot bind 'std::unique_ptr'
Mauricio 'Pachá' Vargas Sepúlveda created ARROW-12498:

Summary: cannot bind 'std::unique_ptr'
Key: ARROW-12498
URL: https://issues.apache.org/jira/browse/ARROW-12498
Project: Apache Arrow
Issue Type: Bug
Components: C++, Continuous Integration
Reporter: Mauricio 'Pachá' Vargas Sepúlveda

centos-7-amd64 has repeatedly failed at nightly builds:

2021-04-21: [https://github.com/ursacomputing/crossbow/runs/2397521625#step:8:902]
2021-04-20: [https://github.com/ursacomputing/crossbow/runs/2387780946#step:8:901]

the error reads:
{code:java}
/root/rpmbuild/BUILD/apache-arrow-3.1.0.dev705/cpp/src/arrow/adapters/orc/adapter.cc:581:10: error: cannot bind 'std::unique_ptr' lvalue to 'std::unique_ptr&&'
{code}
[jira] [Created] (ARROW-12497) [C++] Implement array expression from R in C++
Nic Crane created ARROW-12497:

Summary: [C++] Implement array expression from R in C++
Key: ARROW-12497
URL: https://issues.apache.org/jira/browse/ARROW-12497
Project: Apache Arrow
Issue Type: New Feature
Components: C++
Reporter: Nic Crane

As discussed here: https://github.com/apache/arrow/pull/10056#discussion_r616985185

Currently, the R implementation allows for array expressions to be built which are later evaluated within a single call to Filter rather than in multiple operations. This functionality should be moved to the C++ level so it's dealt with at a lower level.
[jira] [Created] (ARROW-12496) [C++][Dataset] Ensure Scanner tests fully cover async
David Li created ARROW-12496:

Summary: [C++][Dataset] Ensure Scanner tests fully cover async
Key: ARROW-12496
URL: https://issues.apache.org/jira/browse/ARROW-12496
Project: Apache Arrow
Issue Type: Improvement
Reporter: David Li
Assignee: David Li

Some of the tests for scanners don't fully cover the async scanner as they scan a single fragment, which isn't supported by AsyncScanner.
[jira] [Created] (ARROW-12495) [C++][Python] NumPy buffer sets is_mutable_ to true but does not set mutable_data_ when the NumPy array is writable
Wes McKinney created ARROW-12495:

Summary: [C++][Python] NumPy buffer sets is_mutable_ to true but does not set mutable_data_ when the NumPy array is writable
Key: ARROW-12495
URL: https://issues.apache.org/jira/browse/ARROW-12495
Project: Apache Arrow
Issue Type: Bug
Components: C++, Python
Reporter: Wes McKinney
Fix For: 4.0.0

Bug is evident
{code}
NumPyBuffer::NumPyBuffer(PyObject* ao) : Buffer(nullptr, 0) {
  PyAcquireGIL lock;
  arr_ = ao;
  Py_INCREF(ao);

  if (PyArray_Check(ao)) {
    PyArrayObject* ndarray = reinterpret_cast<PyArrayObject*>(ao);
    data_ = reinterpret_cast<const uint8_t*>(PyArray_DATA(ndarray));
    size_ = PyArray_SIZE(ndarray) * PyArray_DESCR(ndarray)->elsize;
    capacity_ = size_;

    if (PyArray_FLAGS(ndarray) & NPY_ARRAY_WRITEABLE) {
      is_mutable_ = true;
    }
  }
}
{code}
[jira] [Created] (ARROW-12494) [C++] ORC adapter fails to compile on GCC 4.8
Krisztian Szucs created ARROW-12494:

Summary: [C++] ORC adapter fails to compile on GCC 4.8
Key: ARROW-12494
URL: https://issues.apache.org/jira/browse/ARROW-12494
Project: Apache Arrow
Issue Type: Improvement
Components: C++
Reporter: Krisztian Szucs
Assignee: Krisztian Szucs
Fix For: 4.0.0

The CentOS 7 packaging build failed during the release: https://github.com/ursacomputing/crossbow/runs/2400255864
[jira] [Created] (ARROW-12493) Support DictionaryArray in CSV and JSON formatters
Raphael Taylor-Davies created ARROW-12493:

Summary: Support DictionaryArray in CSV and JSON formatters
Key: ARROW-12493
URL: https://issues.apache.org/jira/browse/ARROW-12493
Project: Apache Arrow
Issue Type: Improvement
Reporter: Raphael Taylor-Davies
Assignee: Raphael Taylor-Davies

Currently the CSV and JSON formatters do not support DictionaryArray.
[jira] [Created] (ARROW-12492) [Python] Add an helper method to decode a DictionaryArray back to a plain Array
Alessandro Molina created ARROW-12492:

Summary: [Python] Add an helper method to decode a DictionaryArray back to a plain Array
Key: ARROW-12492
URL: https://issues.apache.org/jira/browse/ARROW-12492
Project: Apache Arrow
Issue Type: New Feature
Components: Python
Reporter: Alessandro Molina

To create a DictionaryArray, pyarrow currently offers the {{Array.dictionary_encode}} helper, but there is no obvious way to do the reverse.

A {{DictionaryArray.decode}} helper would provide an obvious way to get back a plain, unrolled Array from the dictionary-encoded version.
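For readers unfamiliar with dictionary encoding, the round trip that ARROW-12492 asks for can be sketched in plain Python. This is only an illustration of the idea and not pyarrow code: the functions `dictionary_encode` and `dictionary_decode` below are hypothetical stand-ins operating on lists, and `None` is assumed to play the role of a null entry.

```python
from typing import Hashable, List, Optional, Sequence, Tuple


def dictionary_encode(
    values: Sequence[Optional[Hashable]],
) -> Tuple[List[Optional[int]], List[Hashable]]:
    """Split values into (indices, dictionary); each distinct value is stored once."""
    dictionary: List[Hashable] = []
    positions = {}  # maps a value to its position in `dictionary`
    indices: List[Optional[int]] = []
    for value in values:
        if value is None:
            # Nulls get a null index and never enter the dictionary.
            indices.append(None)
            continue
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[value])
    return indices, dictionary


def dictionary_decode(
    indices: Sequence[Optional[int]], dictionary: Sequence[Hashable]
) -> List[Optional[Hashable]]:
    """Rebuild the plain sequence by looking each index up in the dictionary."""
    return [None if i is None else dictionary[i] for i in indices]


values = ["a", "b", "a", "c", "b"]
indices, dictionary = dictionary_encode(values)
decoded = dictionary_decode(indices, dictionary)
```

Decoding is just a lookup of every index in the dictionary, so with an actual pyarrow DictionaryArray a similar effect can probably be had today by taking the `dictionary` values at the positions given by `indices`. The proposed helper would make that step discoverable rather than add new capability.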