zhuqi-lucas commented on code in PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#discussion_r2128809331
##########
datafusion/datasource/src/source.rs:
##########
@@ -179,12 +180,17 @@ pub trait DataSource: Send + Sync + Debug {
 /// the [`FileSource`] trait.
 ///
 /// [`FileSource`]: crate::file::FileSource
+/// We now add a `cooperative` flag to
+/// let it optionally yield back to the runtime periodically.
+/// Default is `true`, meaning it will yield back to the runtime for cooperative scheduling.
 #[derive(Clone, Debug)]
 pub struct DataSourceExec {
     /// The source of the data -- for example, `FileScanConfig` or `MemorySourceConfig`
     data_source: Arc<dyn DataSource>,
     /// Cached plan properties such as sort order
     cache: PlanProperties,
+    /// Indicates whether to enable cooperative yielding mode.
+    cooperative: bool,

Review Comment:
   Thank you @ozankabak for the question. Here is my understanding.

   I first tried adding the flag to data_source itself, but that does not look like a good approach, because:
   1. We would need to add it to every existing data_source implementation, such as FileScanConfig and MemorySourceConfig.
   2. Every new source built on data_source would need to add it again.
   3. FileScanConfig in turn wraps a different FileSource per format (ParquetSource, CsvSource, etc.), which makes it more complex to inject the flag through this config.

   It is easier to do this at a higher level, and DataSourceExec is the best place I have found so far.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

For additional commands, e-mail: github-h...@datafusion.apache.org
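
To illustrate the reasoning in the review comment, here is a minimal sketch with simplified stand-in types. It is not the real DataFusion API: `SourceExec`, `MemorySource`, `RangeSource`, the `batches` method, and `run` are all hypothetical names, and counting "yield points" stands in for actually yielding to an async runtime. The point it tries to show is that a flag on the wrapper covers every source at once, so no source implementation has to change.

```rust
use std::fmt::Debug;
use std::sync::Arc;

/// Simplified stand-in for a data source trait (hypothetical, one method only).
pub trait DataSource: Send + Sync + Debug {
    /// Returns all batches this source produces.
    fn batches(&self) -> Vec<Vec<i32>>;
}

/// Hypothetical in-memory source. Note that it knows nothing about yielding.
#[derive(Debug)]
pub struct MemorySource {
    data: Vec<Vec<i32>>,
}

impl DataSource for MemorySource {
    fn batches(&self) -> Vec<Vec<i32>> {
        self.data.clone()
    }
}

/// A second hypothetical source; it also needs no changes for the flag to apply.
#[derive(Debug)]
pub struct RangeSource {
    batches: usize,
    rows_per_batch: usize,
}

impl DataSource for RangeSource {
    fn batches(&self) -> Vec<Vec<i32>> {
        (0..self.batches)
            .map(|b| {
                (0..self.rows_per_batch)
                    .map(|r| (b * self.rows_per_batch + r) as i32)
                    .collect()
            })
            .collect()
    }
}

/// Wrapper playing the role of the exec node: the flag lives here, once.
#[derive(Clone, Debug)]
pub struct SourceExec {
    data_source: Arc<dyn DataSource>,
    cooperative: bool,
}

impl SourceExec {
    /// Cooperative mode defaults to `true`, matching the doc comment in the diff.
    pub fn new(data_source: Arc<dyn DataSource>) -> Self {
        Self { data_source, cooperative: true }
    }

    /// Builder-style override of the flag (an assumed helper, not from the PR).
    pub fn with_cooperative(mut self, cooperative: bool) -> Self {
        self.cooperative = cooperative;
        self
    }

    /// Drains the source and returns `(rows_read, yield_points)`.
    /// A real implementation would yield to the runtime where the counter is bumped.
    pub fn run(&self) -> (usize, usize) {
        let mut rows = 0;
        let mut yield_points = 0;
        for batch in self.data_source.batches() {
            rows += batch.len();
            if self.cooperative {
                yield_points += 1;
            }
        }
        (rows, yield_points)
    }
}
```

In this sketch, wrapping either source in `SourceExec::new(...)` gives it cooperative behavior by default, and `.with_cooperative(false)` turns it off, without touching `MemorySource` or `RangeSource`. Pushing the flag down into the trait instead would require each implementation to carry and honor it, which is the duplication the comment is trying to avoid.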