alamb commented on issue #8373:
URL: https://github.com/apache/arrow-datafusion/issues/8373#issuecomment-1997452950

   This morning, I started working on the first page
   
   > Add more examples / better explanation of systems built on DataFusion (we 
have some good new examples I know of since -- Arroyo, Comet, and LanceDB comes 
to mind)
   
   (that looks pretty much done now to me)
   
   > The main criticism / weakness cited is that DataFusion doesn't demonstrate 
sufficient technical novelty other than integration of various existing ideas. 
I think this is a very valid point, and maybe we should re-emphasize the point 
more that it isn't technical novelty of any part, but the overall system.
   
   I reworded the abstract to make the "not novel" point more explicit. Here 
is what I came up with:
   
   "Apache Arrow DataFusion\cite{DataFusion} is a fast, embeddable, and 
extensible query engine written in Rust\cite{Rust} that uses Apache 
Arrow\cite{Arrow} as its memory model. While the individual techniques used by 
DataFusion have been previously described, it differs from other industrial 
strength engines by providing competitive performance \textit{and} an open 
architecture that can be customized using over 10 major extension APIs. This 
flexibility has led to its use in many commercial and open source databases, 
machine learning pipelines, and other data-intensive systems. We anticipate 
that the accessibility and versatility of DataFusion, along with its 
competitive performance, will further enable the proliferation of 
high-performance custom data infrastructures tailored to specific needs."
   
   
   > Please move the figure out of the first page, or to the bottom of the 
first page. It is distracting to read the caption of Figure 1 before the 
abstract.
   
   I personally like the visual impact of the figure at the beginning, so I 
would prefer to keep it where it is. However, as the reviewer points out, the 
extended caption on the figure repeated the abstract. I therefore reduced the 
caption to the following, which I think captures the essence with less 
distraction:
   
   "When building with DataFusion, system designers implement domain-specific 
features via extension APIs (blue), rather than re-implementing standard OLAP 
query engine technology (green)."
   
   I also updated the figure with the new DataFusion logo 
https://github.com/apache/arrow-datafusion/issues/8788 (thanks @pinarbayata)
   
   I think the first page is now looking quite good:
   ![Screenshot 2024-03-14 at 9 21 34 
AM](https://github.com/apache/arrow-datafusion/assets/490673/0110f3af-d069-4919-a087-00cbf4f1bbdb)

