rok commented on PR #33925: URL: https://github.com/apache/arrow/pull/33925#issuecomment-1422767883
> Actually, what exactly is the dim_names proposal.

The proposed logic for calculating `strides` from `dim_names` [is here](https://github.com/apache/arrow/pull/8510/files#diff-eccd12baabdd9d2f59798d3791dfa0fef6272622c18b614bd54a395d08f05956R46-R67), and the tests it should pass would be:

```
auto ext_type_4 = extension::tensor_array(int64(), {3, 4, 7}, {"x", "y", "z"});
ASSERT_EQ(ext_type_4->strides(), (std::vector<int64_t>{224, 56, 8}));
ASSERT_EQ(ext_type_4->strides({"x", "y", "z"}), (std::vector<int64_t>{224, 56, 8}));
ASSERT_EQ(ext_type_4->strides({"y", "z", "x"}), (std::vector<int64_t>{56, 8, 224}));
ASSERT_EQ(ext_type_4->strides({"z", "x", "y"}), (std::vector<int64_t>{8, 224, 56}));
```

> I assumed that the ordering of the names tell you how to interpret the physical data layout (outside in). If so, if you loaded one of these tensors into PyTorch, we would just ignore the dim names entirely and just give you a tensor in the physical layout. To reinterpret the meaning of the data is application specific; after all, who decides if you call it N or BATCH or something else? Which wouldn't really be in the purview of a framework to decide

Right! Our current thinking is to provide `dim_names` in the extension and the `->strides(dim_names)` helper to allow applications to derive their own strides if they choose another dimension order. That implies the application has some knowledge of `dim_names`, but I think that's a given?

> If you wanted PyTorch to automatically restride; eg give the same logical view for both row-major and col-major matrix, you need a different rep. A layout permutation is sufficiently general that it would work for us. Strides would work too. A fixed enum of names and their logical positions would work as well (but people typically dislike that.)

Storing a layout permutation is an interesting idea too, but that requires knowledge of the "natural order".

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
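
For readers following the `->strides(dim_names)` discussion in the comment above, here is a minimal sketch of how such a helper could behave, assuming the physical layout is row-major in the order given by `dim_names`. The names `RowMajorStrides` and `StridesFor` are hypothetical and are not part of the Arrow API; the linked PR may implement this differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not the Arrow API): byte strides of a row-major
// (C-contiguous) tensor with the given shape and element byte width.
std::vector<int64_t> RowMajorStrides(const std::vector<int64_t>& shape,
                                     int64_t byte_width) {
  std::vector<int64_t> strides(shape.size());
  int64_t stride = byte_width;
  // Walk from the innermost dimension outwards, accumulating the extent.
  for (size_t i = shape.size(); i-- > 0;) {
    strides[i] = stride;
    stride *= shape[i];
  }
  return strides;
}

// Hypothetical helper: return the physical strides reordered to follow the
// dimension order requested by the application.
std::vector<int64_t> StridesFor(const std::vector<int64_t>& shape,
                                const std::vector<std::string>& dim_names,
                                const std::vector<std::string>& requested,
                                int64_t byte_width) {
  assert(shape.size() == dim_names.size());
  const std::vector<int64_t> physical = RowMajorStrides(shape, byte_width);
  std::vector<int64_t> result;
  result.reserve(requested.size());
  for (const std::string& name : requested) {
    // Look up where the requested name sits in the physical dimension order.
    auto it = std::find(dim_names.begin(), dim_names.end(), name);
    if (it == dim_names.end()) {
      throw std::invalid_argument("unknown dimension name: " + name);
    }
    result.push_back(physical[it - dim_names.begin()]);
  }
  return result;
}
```

With shape `{3, 4, 7}`, names `{"x", "y", "z"}` and 8-byte `int64` elements, requesting `{"y", "z", "x"}` picks the physical strides in that order, which matches the expectations listed in the proposed tests. Note that the sketch only reorders strides; it still relies on the application knowing what each name means.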
