Re: [PR] Document parallelism and thread scheduling in the architecture guid [arrow-datafusion]

via GitHub Wed, 24 Jan 2024 16:05:47 -0800


tustvold commented on code in PR #8986:
URL: https://github.com/apache/arrow-datafusion/pull/8986#discussion_r1465691772



##########
datafusion/core/src/lib.rs:
##########
@@ -342,16 +342,20 @@
 //!
 //! [`ExecutionPlan`]s process data using the [Apache Arrow] memory
 //! format, making heavy use of functions from the [arrow]
-//! crate. Calling [`execute`] produces 1 or more partitions of data,
-//! consisting an operator that implements
-//! [`SendableRecordBatchStream`].
-//!
-//! Values are represented with [`ColumnarValue`], which are either
+//! crate. Values are represented with [`ColumnarValue`], which are either
 //! [`ScalarValue`] (single constant values) or [`ArrayRef`] (Arrow
 //! Arrays).
 //!
-//! Balanced parallelism is achieved using [`RepartitionExec`], which
-//! implements a [Volcano style] "Exchange".
+//! Calling [`execute`] produces 1 or more partitions of data,
+//! as a [`SendableRecordBatchStream`], which implements a pull based execution
+//! API. Calling `.next().await` will incrementally compute and return the next
+//! [`RecordBatch`]. Balanced parallelism is achieved using [Volcano style]
+//! "Exchange" operations implemented by  [`RepartitionExec`].

Review Comment:
   ```suggestion
   //! "Exchange" operations implemented by [`RepartitionExec`].
   ```



##########
datafusion/core/src/lib.rs:
##########
@@ -364,8 +368,26 @@
 //!
 //! [`RepartitionExec`]: 
https://docs.rs/datafusion/latest/datafusion/physical_plan/repartition/struct.RepartitionExec.html
 //! [Volcano style]: 
https://w6113.github.io/files/papers/volcanoparallelism-89.pdf
+//! [Morsel-Driven Parallelism]: https://db.in.tum.de/~leis/papers/morsels.pdf
+//! [DataFusion paper submitted SIGMOD]: 
https://github.com/apache/arrow-datafusion/files/13874720/DataFusion_Query_Engine___SIGMOD_2024.pdf
 //! [implementors of `ExecutionPlan`]: 
https://docs.rs/datafusion/latest/datafusion/physical_plan/trait.ExecutionPlan.html#implementors
 //!
+//! ## Thread Scheduling
+//!
+//! DataFusion incrementally computes output from a 
[`SendableRecordBatchStream`]
+//! with `target_partitions` threads. Parallelism is implementing using 
multiple
+//! [Tokio] [`task`]s, which are executed by threads managed by a tokio 
Runtime.
+//! While tokio is most commonly used
+//! for asynchronous network I/O, its combination of an efficient, 
work-stealing
+//! scheduler, first class compiler support for automatic continuation 
generation,
+//! and exceptional performance makes it a compelling choice for CPU intensive
+//! applications as well, as explained in more detail in [Using Rustlang’s 
Async Tokio

Review Comment:
   ```suggestion
   //! applications as well. This is explained in more detail in [Using 
Rustlang’s Async Tokio
   ```



##########
datafusion/core/src/lib.rs:
##########
@@ -364,8 +368,26 @@
 //!
 //! [`RepartitionExec`]: 
https://docs.rs/datafusion/latest/datafusion/physical_plan/repartition/struct.RepartitionExec.html
 //! [Volcano style]: 
https://w6113.github.io/files/papers/volcanoparallelism-89.pdf
+//! [Morsel-Driven Parallelism]: https://db.in.tum.de/~leis/papers/morsels.pdf
+//! [DataFusion paper submitted SIGMOD]: 
https://github.com/apache/arrow-datafusion/files/13874720/DataFusion_Query_Engine___SIGMOD_2024.pdf

Review Comment:
   ```suggestion
   //! [DataFusion paper submitted to SIGMOD]: 
https://github.com/apache/arrow-datafusion/files/13874720/DataFusion_Query_Engine___SIGMOD_2024.pdf
   ```



##########
datafusion/core/src/lib.rs:
##########
@@ -342,16 +342,20 @@
 //!
 //! [`ExecutionPlan`]s process data using the [Apache Arrow] memory
 //! format, making heavy use of functions from the [arrow]
-//! crate. Calling [`execute`] produces 1 or more partitions of data,
-//! consisting an operator that implements
-//! [`SendableRecordBatchStream`].
-//!
-//! Values are represented with [`ColumnarValue`], which are either
+//! crate. Values are represented with [`ColumnarValue`], which are either
 //! [`ScalarValue`] (single constant values) or [`ArrayRef`] (Arrow
 //! Arrays).
 //!
-//! Balanced parallelism is achieved using [`RepartitionExec`], which
-//! implements a [Volcano style] "Exchange".
+//! Calling [`execute`] produces 1 or more partitions of data,
+//! as a [`SendableRecordBatchStream`], which implements a pull based execution
+//! API. Calling `.next().await` will incrementally compute and return the next
+//! [`RecordBatch`]. Balanced parallelism is achieved using [Volcano style]
+//! "Exchange" operations implemented by  [`RepartitionExec`].
+//!
+//! While some recent research such as [Morsel-Driven Parallelism] describes 
challenges
+//! with the pull style Volcano execution model on NUMA architectures, in 
practice DataFusion achieves
+//! similar scalability as systems that use morsel driven approach such as 
DuckDB.
+//! See the [DataFusion paper submitted SIGMOD] for more details.

Review Comment:
   ```suggestion
   //! See the [DataFusion paper submitted to SIGMOD] for more details.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] Document parallelism and thread scheduling in the architecture guid [arrow-datafusion]

Reply via email to