[ 
https://issues.apache.org/jira/browse/ARROW-550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15863058#comment-15863058
 ] 

Philipp Moritz commented on ARROW-550:
--------------------------------------

We do need tensors nested within other types, but supporting only tensors of 
primitive types is fine. We have written our own sequence type to support 
nesting within lists or dicts, see 
https://github.com/ray-project/ray/blob/master/src/numbuf/cpp/src/numbuf/sequence.h.
A case that comes up a lot, for example in deep learning, is dictionaries of 
tensors (these are weight collections for neural networks).

Yes, fixed-size types are what we need. To handle dtype=object, we convert the 
tensors to lists and then use our sequence type.

Can you clarify what exactly you mean regarding the 2G limitation? If tensors 
larger than 2G are not accessible from Java that'd be ok for us for now.

Also, using arrow::DataType sounds good!

> [Format] Add a TensorMessage type
> ---------------------------------
>
>                 Key: ARROW-550
>                 URL: https://issues.apache.org/jira/browse/ARROW-550
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Format
>            Reporter: Wes McKinney
>
> Since all data message types at the moment are 1-dimensional, a "tensor" 
> message will contain an array of dimensions and an order flag (C order vs. 
> Fortran order) to enable data to be interpreted as multiple dimensions. This 
> is similar to multidimensional arrays in APL or Fortran or MATLAB, ndarrays 
> in NumPy, etc.
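
As a rough illustration of the idea in the issue description, the sketch below 
shows how a flat, 1-dimensional buffer plus an array of dimensions and an order 
flag can be read as a multidimensional tensor. It is plain Python and is not 
the Arrow format or API; the names strides_for and element_at are hypothetical 
helpers invented for this example, and "C"/"F" as flag values are an assumption 
borrowed from NumPy's convention.

```python
def strides_for(shape, order="C"):
    """Return per-dimension strides (in elements) for a dense tensor."""
    if order not in ("C", "F"):
        raise ValueError("order must be 'C' or 'F'")
    strides = [0] * len(shape)
    step = 1
    # C order: the last dimension varies fastest.
    # Fortran order: the first dimension varies fastest.
    if order == "C":
        dims = range(len(shape) - 1, -1, -1)
    else:
        dims = range(len(shape))
    for d in dims:
        strides[d] = step
        step *= shape[d]
    return strides


def element_at(buffer, shape, index, order="C"):
    """Look up one element of the tensor stored in the flat buffer."""
    strides = strides_for(shape, order)
    offset = sum(i * s for i, s in zip(index, strides))
    return buffer[offset]


# The same six values, interpreted as a 2x3 tensor in both orders.
flat = [0, 1, 2, 3, 4, 5]
shape = (2, 3)
c_view = [[element_at(flat, shape, (r, c), "C") for c in range(3)]
          for r in range(2)]
f_view = [[element_at(flat, shape, (r, c), "F") for c in range(3)]
          for r in range(2)]
```

The point of the sketch is that the data buffer itself stays 1-dimensional; 
only the shape and the order flag change how an index maps to an offset.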



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
