Thank you for bringing this topic up.

Expanding on what you suggested, here is a proposal for a vision:

DataFusion's vision is to become *the de facto query engine* of choice for
new analytic applications, by leveraging the unique features of Rust and
Apache Arrow to provide:
1. Best-in-class query performance on a single node
2. A feature-complete declarative query interface via (most of) PostgreSQL
   syntax
3. A feature-rich procedural interface for creating and running execution
   plans
4. High-performance extensibility at every layer

The current readme [2] describes *what* DataFusion is, but does not really
give a vision going forward. A few months ago we tried a "what is everyone
thinking of working on" type approach [1] to create a roadmap. While that
was insightful, I agree that having a single unified (even if vague) goal
would be very helpful.

I would welcome other thoughts as well. If there appears to be some
consensus, then we can make a PR to add the proposal to the DataFusion readme.

@Andy Grove <andygrov...@gmail.com>  do you have any thoughts?

Andrew


[1]
https://docs.google.com/document/d/1qspsOM_dknOxJKdGvKbC1aoVoO0M3i6x1CIo58mmN2Y/edit?userstoinvite=jonas.hansen%40airbus.com&ts=604a2a22&actionButton=1
[2] https://github.com/apache/arrow-datafusion#readme

On Tue, Jun 22, 2021 at 3:18 AM Jiayu Liu <ji...@hey.com.invalid> wrote:

> Hi,
>
> This is a question about DataFusion's vision and roadmap.
>
> As a new contributor, I wonder what vision and roadmap most of the
> contributors can align on, or are already aligned on.
>
> Maybe due to my lack of prior context I might have missed such
> discussion, or maybe this is intentionally left to be open so that
> different contributors and companies can have their own features to be
> compatible. But I still believe in the value of having one, and it can
> somehow be shown in the README.md or contributing guideline, so that
> users and the community can see what to expect from and contribute to.
>
> By "vision" I mean something that's necessarily vague and serving as an
> overarching goal, e.g. "leveraging rust and arrow and become the most
> performant SQL-compatible query engine on a single node", or "fully
> compatible with (most of) PostgreSQL syntax and pluggable in most of the
> web-scale analytical engines".
>
> I believe having this in place can help push the project forward,
> especially in cases of trade-offs, e.g. sticking to the newest Rust release
> vs. providing LTS, or incorporating as many features as possible (e.g.
> recursive CTE? BSON support? query materializations?) vs. keeping the
> binary size small and putting everything else into plugins.
>
