rok commented on PR #37166: URL: https://github.com/apache/arrow/pull/37166#issuecomment-1693515812
Looking at [PyTorch `NestedTensor`](https://github.com/pytorch/pytorch/blob/3a3cf0e09d475df9237c95ebd14debf650e0c038/aten/src/ATen/native/nested/README.md#nestedtensors), it stores strides per tensor, whereas our proposal stores a single permutation for the entire array, from which per-tensor strides can be derived using each tensor's shape.

In PyTorch, if strides are not provided to the [constructor](https://github.com/pytorch/pytorch/blob/3c11184ca8fe35de70d1147f0f9d9beea8dc7e48/aten/src/ATen/NestedTensorImpl.cpp#L211C1-L221), they are apparently calculated assuming a row-major (C-contiguous) layout: https://github.com/pytorch/pytorch/blob/3a3cf0e09d475df9237c95ebd14debf650e0c038/aten/src/ATen/NestedTensorImpl.cpp#L104C19-L130

This makes me assume the strides are stored for caching rather than to allow a different data layout per tensor. Should we adjust the proposal to store permutations (or strides) per row too, or leave that for another extension (`GeneralVariableShapeTensorType`)?

@lhoestq @mariosasko @AlenkaF @jorisvandenbossche @ezyang
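
To illustrate how one array-wide permutation plus each tensor's shape could yield per-tensor strides, here is a minimal sketch. The function names, the element-count (not byte) strides, and the convention that `permutation[i]` names the logical dimension stored at physical position `i` (outermost first) are assumptions made for this example; they are not taken from the Arrow proposal or from PyTorch.

```python
def row_major_strides(shape, itemsize=1):
    """Strides of a C-contiguous buffer with the given shape."""
    strides = []
    running = itemsize
    # Walk from the innermost dimension outwards, accumulating the step size.
    for dim in reversed(shape):
        strides.append(running)
        running *= dim
    return tuple(reversed(strides))


def strides_from_permutation(shape, permutation, itemsize=1):
    """Derive logical strides from a logical shape and a storage permutation.

    Assumed convention (hypothetical): permutation[i] is the logical
    dimension stored at physical position i, outermost first.
    """
    # Shape of the data as it is physically laid out in memory.
    physical_shape = [shape[dim] for dim in permutation]
    physical_strides = row_major_strides(physical_shape, itemsize)
    # Map each physical stride back to the logical dimension it belongs to.
    strides = [0] * len(shape)
    for position, dim in enumerate(permutation):
        strides[dim] = physical_strides[position]
    return tuple(strides)


# One permutation shared by the whole array, a different shape per row.
shapes = [(2, 3), (4, 1)]
permutation = (1, 0)
per_tensor_strides = [strides_from_permutation(s, permutation) for s in shapes]
```

With the identity permutation this reduces to the row-major default, which matches the behaviour described for PyTorch when no strides are passed. Storing strides per row would only be needed if individual tensors could use different layouts.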
