alamb commented on issue #8373:
URL:
https://github.com/apache/arrow-datafusion/issues/8373#issuecomment-2016612420
I took another pass through the paper. In addition to some wordsmithing and
whitespace engineering, I expanded the abstract, both so the front page doesn't
look as empty and so it summarizes the content of the paper (in addition to its
conclusion / main point), to help readers decide whether the paper is of
interest to them.
Here is the current text:
> Apache Arrow DataFusion\cite{DataFusion} is a fast, embeddable, and
extensible query engine written in Rust\cite{Rust} that uses Apache
Arrow\cite{Arrow} as its memory model. In this paper we describe the
technologies on which it is built and how it fits into long-term database
implementation trends. We then enumerate the features of a modern OLAP engine,
and outline optimizations required for high performance. Next we describe
DataFusion's architecture and extension APIs to illustrate the interfaces used
in modular query engines to integrate with the systems built on them. Finally,
we demonstrate that open standards and extensible design do not preclude
state-of-the-art performance using a series of experimental comparisons to
DuckDB\cite{DuckDB}.
>
> While the individual techniques used in DataFusion have been previously
described many times, DataFusion differs from other industrial-strength engines by
providing competitive performance \textit{and} an open architecture that can be
customized using more than 10 major extension APIs. This flexibility has led to
use in many commercial and open source databases, machine learning pipelines,
and other data-intensive systems. We anticipate that the accessibility and
versatility of DataFusion, along with its competitive performance, will further
the proliferation of high-performance custom data infrastructures tailored to
specific needs assembled from modular components\cite{ComposableManifesto,
ComposableCodex}.
Here is what it looks like
