comphead commented on PR #17242:
URL: https://github.com/apache/datafusion/pull/17242#issuecomment-3243650877
Thanks @waynexia for the diagram and explanation.
I definitely agree with the simplification: the abstractions are more
flexible than needed, and getting this simplified would be awesome. For
instance, all the details related to a specific datasource can be
encapsulated in the DataSource provider implementation.
For example if user would like to onboard the Orc file,
In the diagram it might still be confusing to have the memory datasource
under the file source configs.
File sources are format dependent and thus have specific readers and configs
per format, whereas memory has no dependency on the format, opener, etc. It
should still depend on some memory scan config though.
Making some changes to the proposal:
```
                   ┌────────────────────┐
                   │   TableProvider    │
                   │  (File / Memory)   │
                   └─────────┬──────────┘
                             │
          ┌──────────────────┴──────────────────┐
          │                                     │
 ┌────────▼─────────┐                   ┌───────▼────────┐
 │ FileTableProvider│                   │ MemoryProvider │
 └────────┬─────────┘                   └───────┬────────┘
          │                                     │
          │ uses                                │ uses
          ▼                                     ▼
┌───────────────────┐                 ┌───────────────────┐
│  FileScanConfig   │                 │ MemoryScanConfig  │
└─────────┬─────────┘                 └─────────┬─────────┘
          │                                     │
          └──────────────────┬──────────────────┘
                             │
                     ┌───────▼─────────┐
                     │   ScanConfig    │  (trait)
                     └───────┬─────────┘
                             │
         ┌───────────────────┼───────────────────┐
         │                   │                   │
    ┌────▼─────┐       ┌─────▼────┐        ┌─────▼─────┐
    │FileFormat│       │FileOpener│        │FileStream │
    │(Parquet, │       └──────────┘        └───────────┘
    │ Avro,    │
    │ JSON, ..)│
    └──────────┘
```
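A rough Rust sketch of the middle layer of the diagram, where both scan configs sit behind one `ScanConfig` trait. Everything here is a hypothetical stand-in: the fields and the methods `describe` and `partition_count` are invented for illustration and are not the actual DataFusion definitions.

```rust
// Hypothetical sketch only: these types mirror the names in the diagram,
// but their fields and methods are assumptions, not DataFusion's real API.

/// Shared abstraction both scan configs implement (the "(trait)" box).
trait ScanConfig {
    /// Short human-readable description of the source.
    fn describe(&self) -> String;
    /// Number of partitions the scan would produce.
    fn partition_count(&self) -> usize;
}

/// File scans are format dependent, so the config carries format state.
struct FileScanConfig {
    format: String,
    paths: Vec<String>,
}

/// Memory scans need no format, opener or stream; they only hold data.
struct MemoryScanConfig {
    partitions: Vec<Vec<i64>>,
}

impl ScanConfig for FileScanConfig {
    fn describe(&self) -> String {
        format!("file scan ({})", self.format)
    }
    // Assumption for the sketch: one partition per file.
    fn partition_count(&self) -> usize {
        self.paths.len()
    }
}

impl ScanConfig for MemoryScanConfig {
    fn describe(&self) -> String {
        "memory scan".to_string()
    }
    fn partition_count(&self) -> usize {
        self.partitions.len()
    }
}

fn main() {
    // Callers can treat both kinds of source uniformly through the trait.
    let configs: Vec<Box<dyn ScanConfig>> = vec![
        Box::new(FileScanConfig {
            format: "parquet".to_string(),
            paths: vec!["a.parquet".to_string(), "b.parquet".to_string()],
        }),
        Box::new(MemoryScanConfig { partitions: vec![vec![1, 2, 3]] }),
    ];
    for config in &configs {
        println!("{} with {} partition(s)", config.describe(), config.partition_count());
    }
}
```

The point of the split is that only `FileScanConfig` knows about formats, while `MemoryScanConfig` stays free of any file-related state.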
However, as you correctly mentioned, FileFormat, FileOpener and FileStream
can probably be encapsulated into some facade object that takes a config and
provides a `RecordBatchStream`, hiding all the specifics inside.
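One possible shape for such a facade, sketched in Rust under loud assumptions: `ScanFacade`, `SourceConfig`, `fake_decode` and the `BatchStream` alias are all hypothetical, and a boxed iterator of `Vec<i64>` stands in for a real `RecordBatchStream`.

```rust
// Hypothetical sketch: none of these names are DataFusion APIs.

/// Stand-in for an Arrow record batch.
type Batch = Vec<i64>;
/// Stand-in for `RecordBatchStream`: a boxed iterator over batches.
type BatchStream = Box<dyn Iterator<Item = Batch>>;

/// Config handed to the facade (invented for this sketch).
enum SourceConfig {
    File { format: String, paths: Vec<String> },
    Memory { batches: Vec<Batch> },
}

/// Facade: takes a config, returns a stream, hides everything else.
struct ScanFacade;

impl ScanFacade {
    fn open(config: SourceConfig) -> Result<BatchStream, String> {
        match config {
            // Memory needs no format, opener or stream machinery.
            SourceConfig::Memory { batches } => Ok(Box::new(batches.into_iter())),
            // Format, opener and stream selection would all live here,
            // invisible to the caller.
            SourceConfig::File { format, paths } => {
                if !matches!(format.as_str(), "parquet" | "avro" | "json") {
                    return Err(format!("unsupported format: {}", format));
                }
                Ok(Box::new(paths.into_iter().map(|p| fake_decode(&p))))
            }
        }
    }
}

/// Fake decoder: yields one batch containing the path length, so the
/// example runs without any real file I/O.
fn fake_decode(path: &str) -> Batch {
    vec![path.len() as i64]
}

fn main() {
    let mem = ScanFacade::open(SourceConfig::Memory {
        batches: vec![vec![1, 2], vec![3]],
    })
    .unwrap();
    println!("{:?}", mem.collect::<Vec<_>>());

    let file = ScanFacade::open(SourceConfig::File {
        format: "parquet".to_string(),
        paths: vec!["a.parquet".to_string()],
    })
    .unwrap();
    println!("{:?}", file.collect::<Vec<_>>());
}
```

The caller only ever sees a config going in and a stream coming out, which is the property the facade is meant to provide.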