[GitHub] [arrow] rok commented on pull request #37166: GH-24868: [C++] Add a Tensor logical value type with varying dimensions, implemented using ExtensionType

via GitHub Fri, 25 Aug 2023 08:08:51 -0700


rok commented on PR #37166:
URL: https://github.com/apache/arrow/pull/37166#issuecomment-1693515812


   Looking at [PyTorch 
`NestedTensor`](https://github.com/pytorch/pytorch/blob/3a3cf0e09d475df9237c95ebd14debf650e0c038/aten/src/ATen/native/nested/README.md#nestedtensors),
it stores strides per tensor, whereas our proposal stores a single 
permutation for the entire array, from which per-tensor strides can be 
derived using each tensor's shape. In PyTorch, if strides are not provided to the 
[constructor](https://github.com/pytorch/pytorch/blob/3c11184ca8fe35de70d1147f0f9d9beea8dc7e48/aten/src/ATen/NestedTensorImpl.cpp#L211C1-L221),
they are apparently calculated assuming a row-major (C-contiguous) layout.
   
https://github.com/pytorch/pytorch/blob/3a3cf0e09d475df9237c95ebd14debf650e0c038/aten/src/ATen/NestedTensorImpl.cpp#L104C19-L130
   This makes me assume the strides are stored as a cache, and not to allow 
a different data layout per tensor.
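   
   For illustration, here is a minimal sketch of how per-tensor strides could be derived from one shared permutation plus each tensor's shape. It is not the PR's or PyTorch's implementation. The function names are hypothetical, and the convention is an assumption: `shape` is the logical shape, and `permutation[i]` names the logical dimension stored at physical position `i` (outermost first), with the data contiguous in that physical order.

```python
from typing import List, Sequence


def row_major_strides(shape: Sequence[int], itemsize: int = 1) -> List[int]:
    """Strides of a C-contiguous (row-major) layout, in units of itemsize."""
    strides = [0] * len(shape)
    step = itemsize
    # Walk from the innermost (last) dimension outwards.
    for i in range(len(shape) - 1, -1, -1):
        strides[i] = step
        step *= shape[i]
    return strides


def strides_from_permutation(
    shape: Sequence[int], permutation: Sequence[int], itemsize: int = 1
) -> List[int]:
    """Derive logical strides from a logical shape and a shared permutation.

    Assumed convention (hypothetical, see lead-in): permutation[i] is the
    logical dimension stored at physical position i.
    """
    if sorted(permutation) != list(range(len(shape))):
        raise ValueError("permutation must be a reordering of range(ndim)")
    # Shape as laid out in memory, outermost dimension first.
    physical_shape = [shape[dim] for dim in permutation]
    physical_strides = row_major_strides(physical_shape, itemsize)
    # Map each physical stride back to the logical dimension it belongs to.
    strides = [0] * len(shape)
    for position, dim in enumerate(permutation):
        strides[dim] = physical_strides[position]
    return strides


# One permutation for the whole array, different shapes per row.
shapes = [(2, 3), (4, 5)]
permutation = [1, 0]
per_tensor_strides = [strides_from_permutation(s, permutation) for s in shapes]
```

   Under this sketch the strides differ per tensor only because the shapes differ. Storing strides or a permutation per row would additionally let each tensor choose its own layout.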
   
   Should we change the proposal to store permutations (or strides) per row as well, 
or leave that for a separate extension (`GeneralVariableShapeTensorType`)? @lhoestq 
@mariosasko @AlenkaF @jorisvandenbossche @ezyang
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
