aboderinsamuel commented on PR #50203:
URL: https://github.com/apache/arrow/pull/50203#issuecomment-4779363878

   Thanks @tadeja @rok @AlenkaF, really helpful. I reproduced tadeja's cases 
and both are real: a wrong-shape array (e.g. (3,2) into a (2,3) tensor) is 
silently accepted, and permuted tensor types store the wrong layout (the 
to_numpy round-trip doesn't return the input).
   
   Root cause: array.pxi swaps the extension type for its storage_type before 
conversion and re-wraps with wrap_array after, so the C++ converter only ever 
sees the flat fixed_size_list and can't know the tensor's shape or permutation. 
Flattening is correct for a plain fixed_size_list, but shape validation and 
permutation handling need to live in the Python/Cython layer, where the 
FixedShapeTensorType is still intact.
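
   To make the root cause concrete, here is a minimal pure-Python sketch (no 
pyarrow or NumPy) of a flatten-then-rewrap path. The helpers `flatten` and 
`to_storage` are hypothetical names that only mimic the behaviour described 
above; they are not the actual array.pxi or C++ converter code.

```python
from math import prod


def flatten(nested):
    """Flatten arbitrarily nested lists in C (row-major) order."""
    if not isinstance(nested, list):
        return [nested]
    out = []
    for item in nested:
        out.extend(flatten(item))
    return out


def to_storage(element, list_size):
    """Hypothetical stand-in for the storage-level conversion.

    Only the flat length is checked, because the fixed_size_list storage
    type knows its list size but nothing about the tensor's shape.
    """
    flat = flatten(element)
    if len(flat) != list_size:
        raise ValueError(f"expected {list_size} values, got {len(flat)}")
    return flat


tensor_shape = (2, 3)
list_size = prod(tensor_shape)  # number of values per tensor element

right = [[1, 2, 3], [4, 5, 6]]    # shape (2, 3)
wrong = [[1, 2], [3, 4], [5, 6]]  # shape (3, 2)

# Both inputs flatten to the same six values, so a check that only sees
# the flat storage cannot tell them apart.
storage_right = to_storage(right, list_size)
storage_wrong = to_storage(wrong, list_size)
```

   Since both calls succeed and return identical storage, any shape check has 
to happen before the extension type is swapped out.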
   
   Plan:
   1. C++: switch from PyArray_Ravel to the explicit PyArray_CheckFromAny + 
NPY_ARRAY_C_CONTIGUOUS approach, reading PyArray_DATA directly (per @AlenkaF). 
This should also let me handle the byte-order case in the same call.
   2. Cython: validate each element's shape against the tensor's shape (so 
(3,2) into (2,3) raises an error), and reject permuted types with a clear 
error for now. Full permutation support would come as a follow-up, unless 
you'd prefer I handle the transpose here.
   3. Also tighten the comment + error message and document that we always 
output C order, per @rok.
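
   As a rough illustration of step 2, the check could look something like the 
pure-Python sketch below. The function name `check_element`, its parameters, 
and the choice of exception types are assumptions made for illustration only; 
the real check would live in Cython and read the shape and permutation from 
the FixedShapeTensorType.

```python
def check_element(element_shape, tensor_shape, permutation=None):
    """Hypothetical pre-conversion check (illustrative names only).

    Rejects non-identity permutations for now, then requires the element's
    shape to match the tensor type's shape exactly.
    """
    ndim = len(tensor_shape)
    # A permutation of None or (0, 1, ..., ndim - 1) means plain C order.
    if permutation is not None and list(permutation) != list(range(ndim)):
        raise NotImplementedError(
            "converting arrays to a permuted fixed-shape tensor type "
            "is not supported yet"
        )
    if tuple(element_shape) != tuple(tensor_shape):
        raise ValueError(
            f"element shape {tuple(element_shape)} does not match "
            f"tensor shape {tuple(tensor_shape)}"
        )


# Usage: a matching shape passes, a transposed shape is rejected.
for shape in [(2, 3), (3, 2)]:
    try:
        check_element(shape, (2, 3))
        print(shape, "accepted")
    except ValueError as exc:
        print(shape, "rejected:", exc)
```

   Checking the permutation first means a permuted type always fails with the 
"not supported yet" message rather than a possibly misleading shape error.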
   
   Does this direction sound right before I push?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
