Re: [DISCUSS] Looking for feedback on my Rust library

Aljaž Eržen Fri, 15 Mar 2024 04:17:51 -0700

> If your audience is "regular developers" then they often do not know or want 
> to speak Arrow.


That's a good point. The thing is that I don't really have an audience: it is
just something I needed and have now open sourced in case anyone else finds it
useful. It really goes against what "regular developers" need, since it
converts data from a row-major format (which "regular developers" find useful)
into Arrow. Even worse: the data traverses the network in an inefficient
row-major format. So no, this library really is not for "regular developers".
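
To make the row-major to columnar conversion concrete, here is a minimal
sketch in plain Rust. It is not the connector_arrow API: `Person`,
`PersonColumns` and `rows_to_columns` are hypothetical names made up for
illustration, and plain `Vec`s stand in for Arrow arrays.

```rust
// Hypothetical row type, mirroring the `person` table from the quoted example.
#[derive(Debug, Clone, PartialEq)]
struct Person {
    id: i32,
    name: String,
    data: Option<Vec<u8>>,
}

// Hypothetical columnar layout: one Vec per column instead of one struct per row.
// A real implementation would build Arrow arrays here instead of plain Vecs.
#[derive(Debug, Default, PartialEq)]
struct PersonColumns {
    ids: Vec<i32>,
    names: Vec<String>,
    data: Vec<Option<Vec<u8>>>,
}

// Transpose row-major input into column-major output, one row at a time.
fn rows_to_columns(rows: Vec<Person>) -> PersonColumns {
    let mut cols = PersonColumns::default();
    for row in rows {
        cols.ids.push(row.id);
        cols.names.push(row.name);
        cols.data.push(row.data);
    }
    cols
}

fn main() {
    let rows = vec![
        Person { id: 1, name: "Ada".to_string(), data: None },
        Person { id: 2, name: "Bob".to_string(), data: Some(vec![0xde, 0xad]) },
    ];
    let cols = rows_to_columns(rows);
    println!("{:?}", cols);
}
```

Every value still arrives row by row and is copied into its column, which is
the overhead I mean above.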

This is exactly the kind of feedback I came looking for! Next time I need to
start the question with "how does this piece of code fit into the ecosystem?".

On Fri, Mar 15, 2024 at 4:01 AM Weston Pace <weston.p...@gmail.com> wrote:
>
> Felipe's points are good.
>
> I don't know that you need to adapt the entire ADBC, it sort of depends
> what you're after.  I see what you've got right now as more of an SQL
> abstraction layer.  For example, similar to things like [1][2][3] (though 3
> is more of an ORM).  If you like the SQL interface that you've come up with
> then you could add, in addition to your postgres / sqlite / etc. bindings,
> an ADBC implementation.  This would adapt anything that implements ADBC to
> your interface.  This way you could get, in theory, free support for
> backends like flight sql or snowflake, and you could replace your duckdb /
> postgres backends if you wanted.
>
> I will pass on some feedback I received recently.  If your audience is
> "regular developers" (e.g. not data engineers, people building webapps, ML
> apps, etc.) then they often do not know or want to speak Arrow.  They see
> it as an essential component, but one that is sort of a "database internal"
> or a "data engineering thing".  For example, in the python / pyarrow world
> people are happy to know that arrow data is traversing their network but,
> when they want to actually work with it (e.g. display results to users),
> they convert it to python lists or pandas (fortunately arrow makes this
> easy).
>
> For example, if you look at postgres' rust bindings you will see that
> people process results like this:
>
> ```
> for row in client.query("SELECT id, name, data FROM person", &[])? {
>     let id: i32 = row.get(0);
>     let name: &str = row.get(1);
>     let data: Option<&[u8]> = row.get(2);
>
>     println!("found person: {} {} {:?}", id, name, data);
> }
> ```
>
> The `get` method is generic and can return any type implementing the
> `FromSql` trait.  This lets rust devs use types they are familiar with
> (e.g. `&str`, `i32`, `&[u8]`) instead of having to learn a new technology
> (whatever postgres is using internally).
>
> On the other hand, if your audience is, in fact, data engineers, then that
> sort of native row-based interface is going to be too inefficient.  So there
> are definitely uses for both.
>
> [1] https://sequelize.org/v3/
> [2] https://docs.rs/quaint/latest/quaint/
> [3] https://www.sqlalchemy.org/
>
> On Thu, Mar 14, 2024 at 4:19 PM Felipe Oliveira Carvalho <
> felipe...@gmail.com> wrote:
>
> > Two comments:
> >
> > ——
> >
> > Since this library is analogous to things like ADBC, ODBC, and JDBC, it’s
> > more of a “driver” than a “connector”. This might make your life easier
> > when explaining what it does.
> >
> > It’s not a black and white thing, but “connector” might imply networking to
> > some people.
> >
> > I believe you delegate the networking bits of interacting with PostgreSQL
> > to a Rust connector.
> >
> > ——
> >
> > This library would be more interesting if it could be a wrapper of
> > language-agnostic database standards like ADBC and ODBC. The Rust compiler
> > can call and expose functions that follow the C ABI — the only true code
> > interface standard available on every OS/Architecture pair.
> >
> > This would mean that any database that exposes ADBC/ODBC can be used from
> > your driver. You would still offer a rich Rust interface, but everything
> > would translate to well-defined operations that vendors implement. This
> > also reduces the chances of you providing things that are heavily biased
> > towards the way the databases you supported initially work.
> >
> > —
> > Felipe
> >
> >
> >
> > On Tue, 12 Mar 2024 at 09:28 Aljaž Eržen <al...@erzen.si> wrote:
> >
> > > Hello good folks of Apache Arrow! I come looking for feedback on my
> > > Rust crate connector_arrow [1], which is an Arrow database client that
> > > is able to connect to multiple databases over their native protocols.
> > >
> > > It is very similar to ADBC, but better adapted for the Rust ecosystem,
> > > as it can be compiled with plain cargo and uses established crates for
> > > connecting to the databases.
> > >
> > > The main feedback I need is the API exposed by the library [2]. I've
> > > tried to keep it minimal and it turned out much more concise than the
> > > api exposed by ADBC. Have I missed important features?
> > >
> > > Aljaž Mur Eržen
> > >
> > > [1]: https://docs.rs/connector_arrow/latest/connector_arrow/
> > > [2]: https://github.com/aljazerzen/connector_arrow/blob/main/connector_arrow/src/api.rs
> > >
> >



-- 

Aljaž Mur Eržen
