> Generally, the preferred pattern is one VectorSchemaRoot that
> gets reloaded each time. So an API like "df.loadVectorSchemaRoot(root)"
> probably makes more sense but we can iterate on this.
Could you expand on what exactly you mean by this? I'm still a bit blurry on
the best practices for sending the Arrow response in Flight, and it seems like
an important point.

> ... creating a new contrib module that maps
> from java objects (just like there are JDBC and Avro ones) seems
> worthwhile. If you are interested in contributing something like this I
> think a short design doc would be worth-while.

Where would be the best place to post this? I was thinking about GitHub
issues, but I am GitHub-centric and not sure whether JIRA or the mailing list
would be better.

Thanks, Micah!

On Sun, Mar 13, 2022 at 12:46 AM Micah Kornfield <emkornfi...@gmail.com> wrote:

> Hi Gavin,
>
> > Just curious whether there is any interest/intention of possibly making a
> > higher level API around the basic FlightSQL one?
>
> IIUC, I don't think this is an issue with Flight but one with generic
> conversion of data into Arrow. I don't think anyone is actively
> working on something like this, but creating a new contrib module that maps
> from java objects (just like there are JDBC and Avro ones) seems
> worthwhile. If you are interested in contributing something like this I
> think a short design doc would be worth-while.
>
> > VectorSchemaRoot root = df.toVectorSchemaRoot();
> > listener.setVectorSchemaRoot(root);
> > listener.sendVectorSchemaRootContents();
>
> A small nit. Generally, the preferred pattern is one VectorSchemaRoot that
> gets reloaded each time. So an API like "df.loadVectorSchemaRoot(root)"
> probably makes more sense but we can iterate on this. This wasn't commonly
> understood when some of the other contrib modules were developed.
>
> Cheers,
> Micah
>
> On Sat, Mar 12, 2022 at 12:15 PM Gavin Ray <ray.gavi...@gmail.com> wrote:
>
> > While trying to implement and introduce the idea of adopting FlightSQL,
> > the largest challenge was the API itself.
> >
> > I know it's meant to be low-level. But I found that most of the
> > development time went into code converting between row-based data
> > (i.e. Map<String, Object>) with Java types and columnar data with
> > Arrow types.
> >
> > I'm likely in the minority position here -- I know that Arrow and
> > FlightSQL users are largely looking at transferring large volumes of
> > data and servicing OLAP-type workloads.
> > But the thing that excites me most about FlightSQL isn't its performance
> > (always nice to have), but that it's a language-agnostic standard for
> > data access.
> >
> > That has broad implications -- for all kinds of data-access workloads and
> > business use cases.
> >
> > The challenge is that in trying to advocate for it, when presenting a
> > proof-of-concept,
> > rather than what a developer might expect to see, something like:
> >
> > // FlightSQL handler code
> > List<Map<String, Object>> results = ....;
> > results.add(Map.of("id", 1, "name", "Person 1"));
> > return results;
> >
> > A significant portion of the code is in Arrow-specific implementation
> > details:
> > creating a VectorSchemaRoot, FieldVector, de-serializing the results on
> > the client, etc.
> >
> > Just curious whether there is any interest/intention of possibly making a
> > higher level API around the basic FlightSQL one?
> > Maybe something closer to the traditional notion of a row-based
> > "DataFrame" or "Table", like:
> >
> > DataFrame df = new DataFrame();
> > df.addColumn("id", ArrowTypes.Int);
> > df.addColumn("name", ArrowTypes.VarChar);
> > df.addRow(Map.of("id", 1, "name", "Person 1"));
> > VectorSchemaRoot root = df.toVectorSchemaRoot();
> > listener.setVectorSchemaRoot(root);
> > listener.sendVectorSchemaRootContents();
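
For anyone following the thread, here is a rough sketch of how I read the
"one root that gets reloaded" suggestion. It is plain Java with no Arrow
dependency: ReusableBatch is a hypothetical stand-in for a VectorSchemaRoot,
and BatchSender, load, and sendAll are names I made up for illustration, not
Arrow or Flight APIs. The point is only that the column buffers are created
once and refilled per batch from row-based data, rather than building a new
container for every batch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical stand-in for a VectorSchemaRoot: one buffer per column, reused across batches. */
final class ReusableBatch {
    private final Map<String, List<Object>> columns = new LinkedHashMap<>();
    private int rowCount = 0;

    ReusableBatch(List<String> columnNames) {
        for (String name : columnNames) {
            columns.put(name, new ArrayList<>());
        }
    }

    /** Clears the previous contents and refills the same column buffers from row maps. */
    void load(List<Map<String, Object>> rows) {
        for (List<Object> column : columns.values()) {
            column.clear();
        }
        for (Map<String, Object> row : rows) {
            for (Map.Entry<String, List<Object>> entry : columns.entrySet()) {
                // A key missing from the row is stored as null in that column.
                entry.getValue().add(row.get(entry.getKey()));
            }
        }
        rowCount = rows.size();
    }

    int rowCount() {
        return rowCount;
    }

    List<Object> column(String name) {
        return columns.get(name);
    }
}

/** Hypothetical sender: streams rows in chunks while reloading a single batch. */
final class BatchSender {
    static int sendAll(List<Map<String, Object>> rows, int batchSize,
                       ReusableBatch batch, Consumer<ReusableBatch> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int sent = 0;
        for (int start = 0; start < rows.size(); start += batchSize) {
            int end = Math.min(start + batchSize, rows.size());
            batch.load(rows.subList(start, end)); // reload, do not reallocate
            sink.accept(batch);                   // stand-in for "send this batch"
            sent++;
        }
        return sent;
    }
}
```

If I understand the Flight Java API correctly, the real equivalent would be
handing the root to the listener once and then signalling after each reload
that the next batch is ready, but please correct me if that is not what was
meant.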