[
https://issues.apache.org/jira/browse/ARROW-8714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17100811#comment-17100811
]
Christian Hudon commented on ARROW-8714:
----------------------------------------
Proposed approach: use two columns. The first contains the elements of all the
tensors (each in row-major order), stored back to back; the second contains,
for each tensor, a tuple with that tensor's shape. The start offset of the data
for the next tensor can be computed from the start offset and shape of the
previous one. Does that sound like the right approach? Would we also need a
separate column containing the pre-computed start index for each tensor?
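
To make the question concrete, here is a minimal sketch of the two-column
layout in plain Python. It is only an illustration, not Arrow code: ordinary
lists stand in for the columns, and the names start_offsets and get_tensor are
hypothetical helpers introduced for this example.

```python
from itertools import accumulate
from math import prod


def start_offsets(shapes):
    """Derive start offsets into the flat values column from the shapes.

    Returns len(shapes) + 1 entries; the last one is the total element count.
    """
    sizes = [prod(shape) for shape in shapes]  # element count of each tensor
    return [0] + list(accumulate(sizes))


def get_tensor(values, shapes, offsets, i):
    """Return the row-major elements and the shape of the i-th tensor."""
    return values[offsets[i]:offsets[i + 1]], shapes[i]


# Column 1: elements of a 2x3 tensor followed by those of a 1x2 tensor.
values = [1, 2, 3, 4, 5, 6, 7, 8]
# Column 2: one shape tuple per tensor.
shapes = [(2, 3), (1, 2)]

# Hypothetical third column: the pre-computed start index of each tensor.
offsets = start_offsets(shapes)
```

In this sketch, deriving the offsets means a pass over all the preceding
shapes, so reading tensor i without the offsets column costs time proportional
to i, while a stored offsets column makes the lookup constant-time. That is the
trade-off behind the last question above.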
> [C++] Add a Tensor logical value type with varying dimensions, implemented
> using ExtensionType
> ----------------------------------------------------------------------------------------------
>
> Key: ARROW-8714
> URL: https://issues.apache.org/jira/browse/ARROW-8714
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++, Format
> Reporter: Christian Hudon
> Priority: Major
>
> Support for tensors in Table, RecordBatch, etc., where each row is a tensor
> of a different shape (e.g. images of different sizes), but of the same
> underlying type (e.g. int32). Implemented as an ExtensionType, so no need to
> change the format.
> I don't foresee needing rows whose tensors have different numbers of
> dimensions, so if support for that falls out easily from the implementation
> of the case where every row has a tensor with the same number of dimensions,
> great. If it adds a lot of complexity, that case would be postponed.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)