Hi All,

Thank you very much for creating this proposal, Alenka!

I noticed the following in the notes [1] shared from the February 15th Arrow 
Community Meeting:

"Members of Hugging Face, Ray, and PyTorch community have given input and some 
of it was incorporated - It would be good to have input from some other 
companies and project communities including Lance, NumPy, Posit, MATLAB, 
DLPack, CUDA/RAPIDS, Arrow Rust, Xarray, Julia, Fortran, TensorFlow, LinkedIn"

Based on the inclusion of MATLAB in the list above, I've shared this proposal 
with some colleagues at MathWorks who have expertise in the deep learning area. 
They will respond here if they have any additional input.

That being said, I recognize that this proposal is already nearing the voting 
phase.

[1] https://lists.apache.org/thread/bblcwwq7gl1x2hsr1qsormv9f3vr23jn

Best Regards,

Kevin Gurney

________________________________
From: Rok Mihevc <rok.mih...@gmail.com>
Sent: Thursday, February 23, 2023 8:12 AM
To: dev@arrow.apache.org <dev@arrow.apache.org>
Subject: Re: [VOTE] Format: Fixed shape tensor Canonical Extension Type

That makes sense indeed.
Do we have any more comments on the language of the proposal [1] or should
we proceed to vote?

Rok

[1] https://github.com/apache/arrow/pull/33925/files

On Wed, Feb 22, 2023 at 2:13 PM Antoine Pitrou <anto...@python.org> wrote:

>
> That's a good point.
>
> Regards
>
> Antoine.
>
>
> Le 22/02/2023 à 14:11, Dewey Dunnington a écrit :
> > I don't think having both dimension names and permutation is
> > redundant...dimension names can also serve as human-readable tags that
> > help a human interpret the values. If reading a NetCDF, for example, one
> > might store the dimension variable names. When determining type equality
> > it may be useful that
> > {..., permutation = [2, 0, 1], dim_names = ["C", "H", "W"]}
> > is not equal to
> > {..., permutation = [2, 0, 1], dim_names = ["x", "y", "z"]}.
> >
> > On Wed, Feb 22, 2023 at 4:56 AM Rok Mihevc <rok.mih...@gmail.com> wrote:
> >
> >>>
> >>>>>
> >>>>> Should we rule that `dim_names` and `permutation` are mutually
> >>>>> exclusive?
> >>>>>
> >>>>
> >>>> Since `dim_names` have to "map to the physical layout (row-major)"
> >>>> that means permutation will always be trivial which indeed makes it
> >>>> unnecessary to store both.
> >>>
> >>> I don't think it is necessarily needed to explicitly make them
> >>> mutually exclusive. I don't know how useful this would be in practice,
> >>> but you certainly *can* specify both in a meaningful way. Re-using the
> >>> example of NHWC data, which is physically stored as NCHW, you can keep
> >>> track of this by specifying a permutation of [2, 0, 1], but at the
> >>> same time you could also still save the dimension names as ["C", "H",
> >>> "W"].
> >>>
> >>
> >> I'll advocate for the original comment, but I'm ok either way. Having
> >> both `dim_names` and `permutation` is redundant - if the user knows
> >> their desired order of `dim_names` they can derive the permutation. If
> >> they don't use `dim_names` they probably don't want them.
> >>
> >
>
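
For readers following the `dim_names` / `permutation` discussion quoted above,
here is a minimal sketch of the two points being debated. It does not use the
Arrow API: `TensorTypeParams`, `logical_dim_names`, and `derive_permutation`
are hypothetical names made up for illustration, the shape is arbitrary, and
the convention that logical dimension i is physical dimension permutation[i]
is an assumption, not something the thread or the proposal text confirms here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TensorTypeParams:
    """Hypothetical stand-in for the extension type's parameters."""
    shape: Tuple[int, ...]
    dim_names: Optional[Tuple[str, ...]] = None    # names in physical order
    permutation: Optional[Tuple[int, ...]] = None  # assumed: logical[i] = physical[permutation[i]]


def logical_dim_names(params: TensorTypeParams) -> Optional[Tuple[str, ...]]:
    """Reorder the physical dim_names into logical order (assumed convention)."""
    if params.dim_names is None:
        return None
    if params.permutation is None:
        return params.dim_names
    return tuple(params.dim_names[i] for i in params.permutation)


def derive_permutation(physical: Tuple[str, ...], desired: Tuple[str, ...]) -> Tuple[int, ...]:
    """Derive a permutation from physical names and a desired logical order."""
    return tuple(physical.index(name) for name in desired)


# Same shape and permutation, different names: the parameter sets differ,
# which is the type-equality point raised in the thread.
a = TensorTypeParams((3, 224, 224), ("C", "H", "W"), (2, 0, 1))
b = TensorTypeParams((3, 224, 224), ("x", "y", "z"), (2, 0, 1))
types_equal = (a == b)

# If the desired order of names is known, the permutation can be derived,
# which is the redundancy point raised in the thread.
derived = derive_permutation(("C", "H", "W"), ("W", "C", "H"))
```

Both arguments hold in this sketch: the names distinguish otherwise identical
parameter sets, and a known desired name order is enough to recover the
permutation. Which of the two matters more is the open question in the thread.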
