On Mon, Dec 30, 2024 at 1:51 PM Robert Kern <robert.k...@gmail.com> wrote:

> On Mon, Dec 30, 2024 at 10:28 AM Mark Harfouche via NumPy-Discussion <
> numpy-discussion@python.org> wrote:
>
>> Happy new year everybody!
>>
>> I've been upgrading my code to start to support array indexing and in my
>> tests I found something that was well documented, but surprising to me.
>>
>> I've tried to read through
>> https://numpy.org/doc/stable/user/basics.indexing.html#combining-advanced-and-basic-indexing
>> and even after multiple passes, I still find it very terse...
>>
>> Consider a mutli dimensional dataset:
>>
>> import numpy as np
>> shape = (10, 20, 30)
>> original = np.arange(np.prod(shape)).reshape(shape)
>>
>> Let's consider we want to collapse dim 0 to a single entry
>> Let's consider we want a subset from dim 1, with a slice
>> Let's consider that we want want 3 elements from dim 2
>>
>> i = 2
>> j = slice(1, 6)
>> k = slice(7, 10)
>> out_basic = original[i, j, k]
>> assert out_basic.shape == (5, 3)
>>
>> Now consider we want to provide freedom to have instead of a slice for k,
>> an arbitrary "array"
>>
>> k = [7, 11, 13]
>> out_array = original[i, j, k]
>> assert out_array.shape == (5, 3), f"shape is actually {out_array.shape}"
>>
>> AssertionError: shape is actually (3, 5)
>>
>
>
>> This is somewhat very surprising to me. I totally understand that things
>> won't change in terms of this kind of indexing in numpy, but is there a way
>> I can adjust my indexing strategy to regain the ability to slice into my
>> array in a "single shot".
>>
>> The main usecase is for arrays that are truly huge, but chucked in ways
>> where slicing into them can be quite efficient. This multi-dimensional
>> imaging data. Each chunk is quite "huge" so this kind of metadata
>> manipulation is worthwhile to avoid unecessary IO.
>>
>> Perhaps there is a "simple" distinction I am missing, for example using a
>> tuple for k instead of a list????
>>
>
> No, there's no simple solution that you're missing. The kind of indexing
> that you want has been considered in NEP 21 (which called it "orthogonal
> indexing"), which saw some progress, but has largely been left fallow.
> There's been some movement on the Array API specification side, so that
> might spur some movement.
>
> https://numpy.org/neps/nep-0021-advanced-indexing.html
>
> Jaime Rio made a pure Python implementation that might work for you
> (though I'm not sure about the performance for large arrays with big
> slices), but it's buried in a close PR (still works, though):
>
> https://github.com/numpy/numpy/pull/5749/files
>
> >>> original_oindex = OrthogonalIndexer(original)
> >>> original_oindex[i, j, k]
> array([[1237, 1241, 1243],
>        [1267, 1271, 1273],
>        [1297, 1301, 1303],
>        [1327, 1331, 1333],
>        [1357, 1361, 1363]])
>
> Note that some other array implementations like Xarray and Zarr provide
> the `.oindex` property which does the orthogonal indexing semantics
> (roughly) per NEP 21, so those might be options for you.
>
> --
> Robert Kern
>

Thank you Robert,

I will have a read through NEP21. It seems like it discusses many aspects
that I ran into.
_______________________________________________
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com

Reply via email to