[ https://issues.apache.org/jira/browse/ARROW-2288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16393464#comment-16393464 ]
ASF GitHub Bot commented on ARROW-2288:
---------------------------------------
wesm closed pull request #1723: ARROW-2288: [Python] Fix slicing logic
URL: https://github.com/apache/arrow/pull/1723
This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:
diff --git a/python/pyarrow/array.pxi b/python/pyarrow/array.pxi
index e785c0ec5..cc65c0771 100644
--- a/python/pyarrow/array.pxi
+++ b/python/pyarrow/array.pxi
@@ -205,15 +205,25 @@ def asarray(values, type=None):
 def _normalize_slice(object arrow_obj, slice key):
-    cdef Py_ssize_t n = len(arrow_obj)
+    cdef:
+        Py_ssize_t start, stop, step
+        Py_ssize_t n = len(arrow_obj)
     start = key.start or 0
-    while start < 0:
+    if start < 0:
         start += n
+        if start < 0:
+            start = 0
+    elif start >= n:
+        start = n
     stop = key.stop if key.stop is not None else n
-    while stop < 0:
+    if stop < 0:
         stop += n
+        if stop < 0:
+            stop = 0
+    elif stop >= n:
+        stop = n
     step = key.step or 1
     if step != 1:
diff --git a/python/pyarrow/tests/test_array.py b/python/pyarrow/tests/test_array.py
index f034d78b3..4a337ad23 100644
--- a/python/pyarrow/tests/test_array.py
+++ b/python/pyarrow/tests/test_array.py
@@ -132,17 +132,18 @@ def test_array_slice():
     # Test slice notation
     assert arr[2:].equals(arr.slice(2))
-
     assert arr[2:5].equals(arr.slice(2, 3))
-
     assert arr[-5:].equals(arr.slice(len(arr) - 5))
-
     with pytest.raises(IndexError):
         arr[::-1]
-
     with pytest.raises(IndexError):
         arr[::2]
+    n = len(arr)
+    for start in range(-n * 2, n * 2):
+        for stop in range(-n * 2, n * 2):
+            assert arr[start:stop].to_pylist() == arr.to_pylist()[start:stop]
+
 def test_array_factory_invalid_type():
     arr = np.array([datetime.timedelta(1), datetime.timedelta(2)])
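
For readers who want to try the patched logic without building pyarrow, here is a minimal pure-Python sketch of the clamping shown in the diff, applied to a plain list. The names normalize_slice and slice_list are hypothetical, the IndexError message is illustrative, and the final max(stop - start, 0) is an assumption about how an inverted range should be handled, since the diff does not show the rest of the function.

```python
def normalize_slice(n, key):
    """Clamp a step-1 slice to [0, n], mirroring the patched branches in the diff."""
    start = key.start or 0
    if start < 0:
        start += n
        if start < 0:
            start = 0
    elif start >= n:
        start = n

    stop = key.stop if key.stop is not None else n
    if stop < 0:
        stop += n
        if stop < 0:
            stop = 0
    elif stop >= n:
        stop = n

    step = key.step or 1
    if step != 1:
        # Illustrative message; the real wording is not shown in the diff.
        raise IndexError("only slices with step 1 supported")

    # Assumption: return (offset, length), with an empty result when stop < start.
    return start, max(stop - start, 0)


def slice_list(values, key):
    """Apply the normalized (offset, length) pair to a plain Python list."""
    offset, length = normalize_slice(len(values), key)
    return values[offset:offset + length]


# Same exhaustive comparison as the new test, but against a list.
values = [1, 2, 3, 4]
n = len(values)
for start in range(-n * 2, n * 2):
    for stop in range(-n * 2, n * 2):
        assert slice_list(values, slice(start, stop)) == values[start:stop]
```

The loop is the same brute-force check the patch adds to test_array_slice: every start/stop pair in [-2n, 2n) has to agree with built-in list slicing.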
----------------------------------------------------------------
> [Python] slicing logic defective
> --------------------------------
>
> Key: ARROW-2288
> URL: https://issues.apache.org/jira/browse/ARROW-2288
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Affects Versions: 0.8.0
> Reporter: Antoine Pitrou
> Assignee: Antoine Pitrou
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.9.0
>
>
> The slicing logic tends to go too far when normalizing large negative bounds,
> which leads to results not in line with Python's slicing semantics:
> {code}
> >>> arr = pa.array([1,2,3,4])
> >>> arr[-99:100]
> <pyarrow.lib.Int64Array object at 0x7f550813a318>
> [
> 2,
> 3,
> 4
> ]
> >>> arr.to_pylist()[-99:100]
> [1, 2, 3, 4]
> >>>
> >>>
> >>> arr[-6:-5]
> <pyarrow.lib.Int64Array object at 0x7f54cd76a908>
> [
> 3
> ]
> >>> arr.to_pylist()[-6:-5]
> []
> {code}
> Also note this crash:
> {code}
> >>> arr[10:13]
> /home/antoine/arrow/cpp/src/arrow/array.cc:105 Check failed: (offset) <= (data.length)
> Abandon (core dumped)
> {code}
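
The results reported above can be reproduced with a small pure-Python sketch of the pre-patch normalization, i.e. the two `while` loops removed by the pull request. The names old_normalize_slice and old_slice are hypothetical, and returning an (offset, length) pair is an assumption based on the arr.slice(offset, length) calls in the tests.

```python
def old_normalize_slice(n, key):
    """Pre-patch behaviour: keep adding n until the bound is non-negative."""
    start = key.start or 0
    while start < 0:
        start += n  # wraps around repeatedly instead of clamping to 0
    stop = key.stop if key.stop is not None else n
    while stop < 0:
        stop += n
    # Assumption: the caller slices with (offset, length); no upper clamp is applied.
    return start, stop - start


def old_slice(values, key):
    """Apply the unclamped (offset, length) pair to a plain Python list."""
    offset, length = old_normalize_slice(len(values), key)
    return values[offset:offset + length]


values = [1, 2, 3, 4]
print(old_slice(values, slice(-99, 100)))       # [2, 3, 4] instead of [1, 2, 3, 4]
print(old_slice(values, slice(-6, -5)))         # [3] instead of []
print(old_normalize_slice(4, slice(10, 13)))    # (10, 3): offset is past the end
```

In the first two cases the repeated addition wraps a large negative bound to an arbitrary in-range position rather than to 0. In the third, a Python list silently returns an empty result, but an offset of 10 on a length-4 array is what trips the `(offset) <= (data.length)` check in the C++ layer.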