Re: [C++] Computation functions in Apache Arrow

Weston Pace Thu, 22 Jul 2021 09:54:11 -0700

> How can I write it in a single instruction line or do I need it sequentially?

You cannot do this as a single compute call today, so for now you
need to chain the calls sequentially (multiply, then add).
ARROW-12060[1] aims to add support for running expressions as a single
call, which may fill this need.

> What do I need to do if gain or offset has a different data type?

The compute machinery does some implicit casting.  So long as both
types are numeric you should be ok.  More details can be found at [2].
If the types are dissimilar (e.g. string/int32) you may need to
explicitly cast (with the cast compute function) the data yourself.

> Does arrow support matrix operations?

No.  You may be thinking of interpreting a table as a matrix.  It is
natural to think of datasets and matrices as similar (e.g. I think
Python allows you to perform matrix multiplication on datasets), but I
don't think I've seen any discussion of doing the same in Arrow.

On the other hand, there has been some interest in the past in
representing tensors as a logical data type in Arrow.  A rank-2 tensor
is either the same as a matrix or very similar to a matrix (depending
on who you ask).  Matrix multiplication could be implemented as a
compute kernel for arrays of rank-2 tensors.  That being said, I have
not seen any discussion or JIRA issues on tensor compute functions, so
I don't know of anyone working on that.

[1] https://issues.apache.org/jira/browse/ARROW-12060
[2] https://arrow.apache.org/docs/cpp/compute.html#implicit-casts

On Thu, Jul 22, 2021 at 4:04 AM Bjoern Bachmann <[email protected]> wrote:
>
> Hey,
>
> I would like to better understand arrows built-in compute functionality. For 
> example if I've single array "X" of float type and I want to do the following 
> calc:
>
> X*gain + offset and I
>
> How can I write it in a single instruction line or do I need it sequentially? 
> What do I need to do if gain or offset has a different data type?
>
> Does arrow support matrix operations?
>
> Thanks!
>
> Bjoern
