>
> Generally, the preferred pattern is one VectorSchemaRoot that
> gets reloaded each time.  So an API like "df.loadVectorSchemaRoot(root)"
> probably makes more sense but we can iterate on this.
>

Could you expand on what exactly you mean by this?

I'm still a bit hazy on the best practices for sending the Arrow
response in Flight, and this seems like an important point.
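
To check my understanding, here is a minimal sketch in plain Java.
ColumnBatch, load and stream are hypothetical stand-ins made up for
illustration (ColumnBatch plays the role of the VectorSchemaRoot, load the
role of the suggested df.loadVectorSchemaRoot(root)); none of this is the
actual Arrow or Flight API. The idea, as I read it, is that the container is
created once and refilled for every chunk, rather than built anew per chunk:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

class ReusableBatchSketch {

    /** Hypothetical stand-in for a VectorSchemaRoot: named columns plus a row count. */
    static final class ColumnBatch {
        final Map<String, List<Object>> columns = new LinkedHashMap<>();
        int rowCount = 0;

        ColumnBatch(List<String> columnNames) {
            for (String name : columnNames) {
                columns.put(name, new ArrayList<>());
            }
        }

        /** Drops the previous contents but keeps the column containers for reuse. */
        void clear() {
            for (List<Object> column : columns.values()) {
                column.clear();
            }
            rowCount = 0;
        }
    }

    /** Hypothetical analogue of df.loadVectorSchemaRoot(root): refill an existing batch. */
    static void load(ColumnBatch batch, List<Map<String, Object>> rows) {
        batch.clear();
        for (Map<String, Object> row : rows) {
            // Row-to-column conversion: append each field to its column (null if absent).
            for (Map.Entry<String, List<Object>> column : batch.columns.entrySet()) {
                column.getValue().add(row.get(column.getKey()));
            }
        }
        batch.rowCount = rows.size();
    }

    /** Pushes rows through the same batch in chunks; returns the number of chunks sent. */
    static int stream(List<Map<String, Object>> rows, int chunkSize,
                      ColumnBatch batch, Consumer<ColumnBatch> sink) {
        int sent = 0;
        for (int start = 0; start < rows.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, rows.size());
            load(batch, rows.subList(start, end)); // reload, do not reallocate
            sink.accept(batch);                    // stand-in for sending the batch
            sent++;
        }
        return sent;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = List.of(
            Map.<String, Object>of("id", 1, "name", "Person 1"),
            Map.<String, Object>of("id", 2, "name", "Person 2"),
            Map.<String, Object>of("id", 3, "name", "Person 3"));

        ColumnBatch batch = new ColumnBatch(List.of("id", "name")); // created once
        stream(rows, 2, batch, b -> System.out.println(b.rowCount + " rows: " + b.columns));
    }
}
```

Is that roughly the shape you have in mind, with the real VectorSchemaRoot in
place of ColumnBatch?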


> ... creating a new contrib module that maps
> from java objects (just like there are JDBC and Avro ones) seems
> worthwhile.  If you are interested in contributing something like this I
> think a short design doc would be worthwhile.
>

Where would be the best place to post this?

I was thinking of GitHub issues, but I am GitHub-centric, so I'm
not sure whether JIRA or the mailing list would be better.

Thanks, Micah!


On Sun, Mar 13, 2022 at 12:46 AM Micah Kornfield <emkornfi...@gmail.com>
wrote:

> Hi Gavin,
>
> > Just curious whether there is any interest/intention of possibly making a
> > higher level API around the basic FlightSQL one?
>
>
> IIUC, I don't think this is an issue with Flight but one with generic
> conversion of data into Arrow.  I don't think anyone is actively
> working on something like this, but creating a new contrib module that maps
> from java objects (just like there are JDBC and Avro ones) seems
> worthwhile.  If you are interested in contributing something like this I
> think a short design doc would be worthwhile.
>
> > VectorSchemaRoot root = df.toVectorSchemaRoot();
> > listener.setVectorSchemaRoot(root);
> > listener.sendVectorSchemaRootContents();
>
>
> A small nit.  Generally, the preferred pattern is one VectorSchemaRoot that
> gets reloaded each time.  So an API like "df.loadVectorSchemaRoot(root)"
> probably makes more sense but we can iterate on this.  This wasn't commonly
> understood when some of the other contrib modules were developed.
>
> Cheers,
> Micah
>
>
> On Sat, Mar 12, 2022 at 12:15 PM Gavin Ray <ray.gavi...@gmail.com> wrote:
>
> > While trying to implement and introduce the idea of adopting FlightSQL,
> > the largest challenge was the API itself.
> >
> > I know it's meant to be low-level. But I found that most of the
> > development time was in code to convert to/from row-based data (i.e.
> > Map<String, Object>) and Java types, and columnar data + Arrow types.
> >
> > I'm likely in the minority position here -- I know that Arrow and
> > FlightSQL users are largely looking at transferring large volumes of data
> > and servicing OLAP-type workloads.
> > But the thing that excites me most about FlightSQL isn't its performance
> > (always nice to have), but that it's a language-agnostic standard for
> > data access.
> >
> > That has broad implications -- for all kinds of data-access workloads and
> > business use cases.
> >
> > The challenge is that in trying to advocate for it, when presenting a
> > proof-of-concept,
> > rather than what a developer might expect to see, something like:
> >
> > // FlightSQL handler code
> > List<Map<String, Object>> results = ....;
> > results.add(Map.of("id", 1, "name", "Person 1"));
> > return results;
> >
> > A significant portion of the code is in Arrow-specific implementation
> > details: creating a VectorSchemaRoot, FieldVector, de-serializing the
> > results on the client, etc.
> >
> > Just curious whether there is any interest/intention of possibly making a
> > higher level API around the basic FlightSQL one?
> > Maybe something closer to the traditional notion of a row-based
> > "DataFrame" or "Table", like:
> >
> > DataFrame df = new DataFrame();
> > df.addColumn("id", ArrowTypes.Int);
> > df.addColumn("name", ArrowTypes.VarChar);
> > df.addRow(Map.of("id", 1, "name", "Person 1"));
> > VectorSchemaRoot root = df.toVectorSchemaRoot();
> > listener.setVectorSchemaRoot(root);
> > listener.sendVectorSchemaRootContents();
> >
>
