[jira] [Commented] (ARROW-2150) [Python] array equality defaults to identity
[ https://issues.apache.org/jira/browse/ARROW-2150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392984#comment-16392984 ]

ASF GitHub Bot commented on ARROW-2150:
---

wesm closed pull request #1729: ARROW-2150: [Python] Raise NotImplementedError when comparing with pyarrow.Array for now
URL: https://github.com/apache/arrow/pull/1729

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

diff --git a/python/pyarrow/array.pxi b/python/pyarrow/array.pxi
index e785c0ec5..f05806cfa 100644
--- a/python/pyarrow/array.pxi
+++ b/python/pyarrow/array.pxi
@@ -267,6 +267,10 @@ cdef class Array:
         self.ap = sp_array.get()
         self.type = pyarrow_wrap_data_type(self.sp_array.get().type())

+    def __richcmp__(Array self, object other, int op):
+        raise NotImplementedError('Comparisons with pyarrow.Array are not '
+                                  'implemented')
+
     def _debug_print(self):
         with nogil:
             check_status(DebugPrint(deref(self.ap), 0))
diff --git a/python/pyarrow/tests/test_array.py b/python/pyarrow/tests/test_array.py
index f034d78b3..4c14c1c61 100644
--- a/python/pyarrow/tests/test_array.py
+++ b/python/pyarrow/tests/test_array.py
@@ -158,6 +158,16 @@ def test_array_ref_to_ndarray_base():
     assert sys.getrefcount(arr) == (refcount + 1)


+def test_array_eq_raises():
+    # ARROW-2150: we are raising when comparing arrays until we define the
+    # behavior to either be elementwise comparisons or data equality
+    arr1 = pa.array([1, 2, 3], type=pa.int32())
+    arr2 = pa.array([1, 2, 3], type=pa.int32())
+
+    with pytest.raises(NotImplementedError):
+        arr1 == arr2
+
+
 def test_dictionary_from_numpy():
     indices = np.repeat([0, 1, 2], 2)
     dictionary = np.array(['foo', 'bar', 'baz'], dtype=object)

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> [Python] array equality defaults to identity
>
> Key: ARROW-2150
> URL: https://issues.apache.org/jira/browse/ARROW-2150
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Affects Versions: 0.8.0
> Reporter: Antoine Pitrou
> Assignee: Wes McKinney
> Priority: Minor
> Labels: pull-request-available
> Fix For: 0.9.0
>
> I'm not sure this is deliberate, but it doesn't look very desirable to me:
> {code}
> >>> pa.array([1,2,3], type=pa.int32()) == pa.array([1,2,3], type=pa.int32())
> False
> {code}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
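For readers without a Cython build at hand, the effect of the patch above can be approximated in plain Python. This is only a sketch: StrictArray and comparison_raises are hypothetical names made up for illustration, not pyarrow APIs, and pure Python has no `__richcmp__`, so each comparison operator is assigned explicitly instead.

```python
class StrictArray:
    """Hypothetical stand-in for pyarrow.Array; not the real class."""

    def __init__(self, values):
        self.values = list(values)

    def _refuse(self, other):
        # Same policy as the patch: refuse to compare until semantics are decided.
        raise NotImplementedError(
            "Comparisons with StrictArray are not implemented")

    # Cython's __richcmp__ handles all six operators in one method;
    # in plain Python each operator is bound to the same function.
    __eq__ = __ne__ = __lt__ = __le__ = __gt__ = __ge__ = _refuse


def comparison_raises(left, right):
    """Return True if evaluating `left == right` raises NotImplementedError."""
    try:
        left == right
    except NotImplementedError:
        return True
    return False


arr1 = StrictArray([1, 2, 3])
arr2 = StrictArray([1, 2, 3])
print(comparison_raises(arr1, arr2))  # True
```

The point of raising rather than returning a value is that code written against the interim behaviour fails loudly, so a later switch to elementwise comparison or data equality cannot silently change results.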
[jira] [Commented] (ARROW-2150) [Python] array equality defaults to identity
[ https://issues.apache.org/jira/browse/ARROW-2150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392323#comment-16392323 ]

ASF GitHub Bot commented on ARROW-2150:
---

wesm opened a new pull request #1729: ARROW-2150: [Python] Raise NotImplementedError when comparing with pyarrow.Array for now
URL: https://github.com/apache/arrow/pull/1729
[jira] [Commented] (ARROW-2150) [Python] array equality defaults to identity
[ https://issues.apache.org/jira/browse/ARROW-2150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16364224#comment-16364224 ]

Antoine Pitrou commented on ARROW-2150:
---

{quote}For the time being, I believe it should be a thin storage layer, and that the user API for analytics and computations should evolve separately.
{quote}
Such as letting the user create a Dask array from pyarrow buffers?
[jira] [Commented] (ARROW-2150) [Python] array equality defaults to identity
[ https://issues.apache.org/jira/browse/ARROW-2150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16364209#comment-16364209 ]

Wes McKinney commented on ARROW-2150:
---

For the time being, I believe it should be a thin storage layer, and that the user API for analytics and computations should evolve separately. Once we have a more sizable collection of kernel functions to work with we should spend some time thinking about what the API should look like.
[jira] [Commented] (ARROW-2150) [Python] array equality defaults to identity
[ https://issues.apache.org/jira/browse/ARROW-2150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363908#comment-16363908 ]

Antoine Pitrou commented on ARROW-2150:
---

A more general question: what is the ultimate aim for the Array API? Do we want to reproduce the Numpy API, or just provide a thin storage layer?
[jira] [Commented] (ARROW-2150) [Python] array equality defaults to identity
[ https://issues.apache.org/jira/browse/ARROW-2150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16362587#comment-16362587 ]

Wes McKinney commented on ARROW-2150:
---

I think we should make this raise for now, and later perform vector comparison like NumPy (once we have a kernel available to do this)
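The two candidate behaviours discussed in this thread (a single bool for the whole array versus one bool per element, as NumPy does) can be contrasted with a small plain-Python sketch. The helper names data_equal and elementwise_equal are hypothetical and operate on ordinary lists; they are not pyarrow functions and say nothing about how an eventual kernel would be implemented.

```python
def data_equal(left, right):
    """Whole-array equality: a single bool for the pair of sequences."""
    return len(left) == len(right) and all(
        a == b for a, b in zip(left, right))


def elementwise_equal(left, right):
    """NumPy-style comparison: one bool per position."""
    if len(left) != len(right):
        raise ValueError("length mismatch")
    return [a == b for a, b in zip(left, right)]


a = [1, 2, 3]
b = [1, 5, 3]
print(data_equal(a, b))         # False
print(elementwise_equal(a, b))  # [True, False, True]
```

Because the two functions return different types for the same inputs, picking one as the meaning of `==` is an API decision that is hard to reverse, which is presumably why raising was preferred in the interim.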
[jira] [Commented] (ARROW-2150) [Python] array equality defaults to identity
[ https://issues.apache.org/jira/browse/ARROW-2150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16362407#comment-16362407 ]

Antoine Pitrou commented on ARROW-2150:
---

So the intended behaviour is to return a bool value, not an array of bools as with Numpy arrays?
[jira] [Commented] (ARROW-2150) [Python] array equality defaults to identity
[ https://issues.apache.org/jira/browse/ARROW-2150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16362384#comment-16362384 ]

Uwe L. Korn commented on ARROW-2150:
---

No, this is at the moment only a lack of implementation. We have functions in C++ that can actually take care of the comparisons (it is as simple as calling Array::Equals()), but they are probably not yet invoked in Python.
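The `False` in the original report follows from ordinary Python semantics: a class that defines no `__eq__` inherits the default from `object`, which compares identity. A minimal sketch, assuming a hypothetical Wrapper class (not pyarrow code) whose explicit `equals` method plays the role of the C++ equality routine mentioned above:

```python
class Wrapper:
    """Hypothetical wrapper that defines no __eq__, so == falls back to identity."""

    def __init__(self, values):
        self.values = list(values)

    def equals(self, other):
        # Explicit value comparison, standing in for the C++ equality routine.
        return isinstance(other, Wrapper) and self.values == other.values


x = Wrapper([1, 2, 3])
y = Wrapper([1, 2, 3])
print(x == y)       # False: two distinct objects
print(x == x)       # True: same object
print(x.equals(y))  # True: same contents
```

Two wrappers with identical contents compare unequal under `==` simply because nothing routes the operator to the value comparison.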