Hi Razvan,
I'm not sure about plans around tensors. However, depending on how you are
trying to transfer the data and consume it, you might consider using an
extension type [1]. For the physical representation you could model it as
something like:
{
RowLabel : Date32/64
ColumnLabels : FixedSizeList<String> (dictionary encoded)
Data : FixedSizeList<float>
}
which would be more compact than making N individual columns if N is
large. You would have to handle the mapping from column label to index at
the application level though.
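As a rough, library-independent sketch of that layout (plain Python, not
Arrow API calls; the sample dates, labels and the lookup() helper are made
up for illustration, and the column labels are stored once here instead of
once per row to keep it short):

```python
from array import array
from datetime import date

# Hypothetical example data: M = 2 rows (dates), N = 3 columns (string ids).
row_labels = [date(2019, 7, 11), date(2019, 7, 12)]
column_labels = ["AAA", "BBB", "CCC"]
matrix = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
]

n_cols = len(column_labels)

# Data: one fixed-size list of N floats per row, stored back to back in a
# single contiguous float32 buffer (row-major), similar in spirit to the
# values buffer behind a fixed-size list column.
data = array("f", (value for row in matrix for value in row))

# Application-level mapping from column label to position inside each list.
column_index = {label: i for i, label in enumerate(column_labels)}


def lookup(row: int, label: str) -> float:
    """Return the value at (row, label) from the flat row-major buffer."""
    return data[row * n_cols + column_index[label]]


print(row_labels[1], lookup(1, "BBB"))
```

The point is that each row costs one list slot of N values rather than N
separate columns, and the label-to-offset dictionary is the only extra
bookkeeping the application has to carry.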
Hope this helps.
-Micah
[1]
https://github.com/apache/arrow/blob/6fb850cf57fd6227573cca6d43a46e1d5d2b0a66/docs/source/format/Metadata.rst#extension-types
On Fri, Jul 12, 2019 at 1:53 PM Razvan Chitu <[email protected]>
wrote:
> Sure. I'd like to bundle an M x N shaped tensor along with the M row labels
> (dates) and N column labels (string identifiers) in one response.
>
> Razvan
>
> On Fri, Jul 12, 2019, 6:53 PM Wes McKinney <[email protected]> wrote:
>
> > hi Razvan -- can you clarify what "together with a row and a column
> > index" means?
> >
> > On Fri, Jul 12, 2019 at 11:17 AM Razvan Chitu <[email protected]>
> > wrote:
> > >
> > > Hi,
> > >
> > > Does the IPC format currently support streaming a tensor together with a
> > > row and a column index? If not, are there any plans for this to be
> > > supported? It'd be quite useful for matrices that could have 10s of
> > > thousands of either rows, columns or both. For my use case I am currently
> > > representing matrices as record batches, but performance is not that great
> > > when there are many columns and few rows.
> > >
> > > Thanks,
> > > Razvan
> >
>