zhuqi-lucas commented on code in PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#discussion_r2128809331


##########
datafusion/datasource/src/source.rs:
##########
@@ -179,12 +180,17 @@ pub trait DataSource: Send + Sync + Debug {
 /// the [`FileSource`] trait.
 ///
 /// [`FileSource`]: crate::file::FileSource
+/// We now add a `cooperative` flag to
+/// let it optionally yield back to the runtime periodically.
+/// Default is `true`, meaning it will yield back to the runtime for cooperative scheduling.
 #[derive(Clone, Debug)]
 pub struct DataSourceExec {
     /// The source of the data -- for example, `FileScanConfig` or `MemorySourceConfig`
     data_source: Arc<dyn DataSource>,
     /// Cached plan properties such as sort order
     cache: PlanProperties,
+    /// Indicates whether to enable cooperative yielding mode.
+    cooperative: bool,

Review Comment:
   Thank you @ozankabak for this question. Here is my understanding.
   
   I first tried adding the flag to `data_source`, but that does not seem like a good approach because:
   
   1. We would need to add it to every `DataSource` implementation, such as `FileScanConfig` and `MemorySourceConfig`.
   2. Every new source built on `DataSource` would need to add it again.
   3. `FileScanConfig` also wraps different `FileSource` implementations (`ParquetSource`, `CsvSource`, etc.), which makes injecting this setting more complex.
   
   It is easier to handle this at a higher level, and `DataSourceExec` is the best place I have found so far.
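
   To illustrate the trade-off, here is a minimal, std-only sketch. It is not the code in this PR: `MemorySource`, `RangeSource`, `with_cooperative`, and the synchronous `execute(interval)` that only counts yield points are hypothetical stand-ins, and the real `DataSource` trait and async streams are much richer. The point is only that the sources know nothing about the flag, so one field on the exec covers all of them.

```rust
use std::fmt::Debug;
use std::sync::Arc;

/// Simplified stand-in for the `DataSource` trait (assumption: the real
/// trait produces record batch streams, not an iterator of integers).
trait DataSource: Send + Sync + Debug {
    fn open(&self) -> Box<dyn Iterator<Item = i32>>;
}

/// Hypothetical source #1: knows nothing about cooperative yielding.
#[derive(Debug)]
struct MemorySource {
    rows: Vec<i32>,
}

impl DataSource for MemorySource {
    fn open(&self) -> Box<dyn Iterator<Item = i32>> {
        Box::new(self.rows.clone().into_iter())
    }
}

/// Hypothetical source #2: also needs no changes to gain yielding.
#[derive(Debug)]
struct RangeSource {
    end: i32,
}

impl DataSource for RangeSource {
    fn open(&self) -> Box<dyn Iterator<Item = i32>> {
        Box::new(0..self.end)
    }
}

/// Simplified exec node: the flag lives here, once, for every source.
#[derive(Clone, Debug)]
struct DataSourceExec {
    data_source: Arc<dyn DataSource>,
    cooperative: bool,
}

impl DataSourceExec {
    /// Cooperative mode defaults to `true`, as in the doc comment above.
    fn new(data_source: Arc<dyn DataSource>) -> Self {
        Self { data_source, cooperative: true }
    }

    /// Hypothetical builder to opt out of yielding.
    fn with_cooperative(mut self, cooperative: bool) -> Self {
        self.cooperative = cooperative;
        self
    }

    /// Drains the source and returns `(items, yield_points)`.
    /// `interval` must be greater than zero.
    fn execute(&self, interval: usize) -> (Vec<i32>, usize) {
        let mut items = Vec::new();
        let mut yield_points = 0;
        for (i, item) in self.data_source.open().enumerate() {
            items.push(item);
            if self.cooperative && (i + 1) % interval == 0 {
                // A real operator would hand control back to the runtime
                // here; this sketch only counts the opportunity.
                yield_points += 1;
            }
        }
        (items, yield_points)
    }
}

fn main() {
    let mem = DataSourceExec::new(Arc::new(MemorySource { rows: vec![1, 2, 3, 4] }));
    let range = DataSourceExec::new(Arc::new(RangeSource { end: 6 })).with_cooperative(false);
    println!("{:?}", mem.execute(2));
    println!("{:?}", range.execute(2));
}
```

   If the flag lived on the sources instead, both `MemorySource` and `RangeSource` (and every future source) would each need their own field and yielding logic, which is the duplication described in points 1 and 2.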




