jorisvandenbossche commented on PR #33925:
URL: https://github.com/apache/arrow/pull/33925#issuecomment-1415428053

   From inline discussion at 
https://github.com/apache/arrow/pull/33925#discussion_r1094350467:
   
   > [@thomasw21] Sometimes users will manipulate non-contiguous formats without realising it. Typically `torch` uses `to(memory_format=torch.channels_last)`, which is essentially equivalent to `.permute(0,2,3,1).contiguous()`. This API is often used in computer vision as some operations run faster when the tensors are not actually stored in "row major" fashion. So loading images in that specific format directly would help, which means storing non-row-major tensors.
   > 
   > ...
   > 
   > More on this: 
https://pytorch.org/blog/accelerating-pytorch-vision-models-with-channels-last-on-cpu/
   
   @thomasw21 the page you link mentions the difference between "physical order" and "logical order". So the actual, physical layout in memory that pytorch uses can be either channels first (NCHW) or channels last (NHWC). But in both cases, this physical order is row-major. However, when you interact with those tensors as a user, pytorch always _shows_ the data as if it is NCHW, independent of the physical layout; that is what they call the "logical order". And it's when viewing the data as NCHW while the physical layout is NHWC that you get custom strides / a non-C-contiguous layout _for that view_. But when considering the physical layout as NHWC, the data is still row-major.
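   A small sketch of that distinction, using only standard PyTorch APIs (the stride values in the comments are for the shapes used here):

```python
import torch

# Logical order is always NCHW when indexing.
x = torch.arange(2 * 3 * 4 * 5, dtype=torch.float32).reshape(2, 3, 4, 5)

# Convert only the *physical* layout to channels last (NHWC).
y = x.to(memory_format=torch.channels_last)

print(x.shape, y.shape)  # both torch.Size([2, 3, 4, 5]): the logical order is unchanged
print(x.stride())        # (60, 20, 5, 1) -> row-major NCHW buffer
print(y.stride())        # (60, 1, 15, 3) -> NCHW *view* over a row-major NHWC buffer
print(y.is_contiguous())                                   # False, as an NCHW view
print(y.is_contiguous(memory_format=torch.channels_last))  # True: physically row-major NHWC

# Reading y's storage linearly gives exactly a row-major NHWC traversal of the data.
nhwc = x.permute(0, 2, 3, 1).contiguous()           # explicit row-major NHWC copy
physical = torch.as_strided(y, (y.numel(),), (1,))  # y's buffer in storage order
assert torch.equal(physical, nhwc.flatten())
```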
   
   And for the purpose of this specification and being able to transfer FixedShapeTensor data using Arrow, it's the physical order that matters, plus (optionally) knowing which dimension order that physical row-major layout corresponds to (i.e. NHWC or NCHW for pytorch).
   
   > So loading images in that specific format directly would help, which means storing non-row-major tensors.
   
   So given the above, my understanding is that it is still perfectly possible to store either format that pytorch uses as row-major data in the proposed FixedShapeTensor array, and to load it back directly (without a copy).
   You could then use the proposed `"dim_names"` to keep track of the actual layout: for example, if the physical layout is "channels last" (NHWC), you could specify `"dim_names": ["H", "W", "C"]`, and when reading that data into a pytorch tensor, you know the physical order but are still free to _view_ it in a different logical order.
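   As a rough sketch of that round trip (the `pa.fixed_shape_tensor(...)` factory and its `dim_names` argument follow this proposal and may not match the final pyarrow API; the rest is standard numpy/torch interop):

```python
import numpy as np
import pyarrow as pa
import torch

# Physical layout: row-major NHWC, e.g. 2 images of 4x5 pixels with 3 channels.
images_nhwc = np.arange(2 * 4 * 5 * 3, dtype=np.float32).reshape(2, 4, 5, 3)

# Store as a FixedShapeTensor array, recording the physical dimension order in dim_names.
tensor_type = pa.fixed_shape_tensor(pa.float32(), (4, 5, 3), dim_names=["H", "W", "C"])
storage = pa.FixedSizeListArray.from_arrays(pa.array(images_nhwc.ravel()), 4 * 5 * 3)
arr = pa.ExtensionArray.from_storage(tensor_type, storage)

# Read back without copying: wrap the Arrow buffer as numpy, then as a torch tensor
# (torch may warn that the numpy array is read-only).
flat = arr.storage.values.to_numpy()               # zero-copy view of the Arrow buffer
nhwc = torch.from_numpy(flat).reshape(2, 4, 5, 3)  # physical order, as recorded in dim_names
nchw = nhwc.permute(0, 3, 1, 2)                    # logical NCHW view, still no copy
print(nchw.shape)                                             # torch.Size([2, 3, 4, 5])
print(nchw.is_contiguous(memory_format=torch.channels_last))  # True
```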

