matthewmturner commented on issue #2656:
URL: 
https://github.com/apache/arrow-datafusion/issues/2656#issuecomment-1141646635

   FYI, https://github.com/apache/arrow-datafusion/issues/1836 and 
https://github.com/apache/arrow-datafusion/pull/1863 contain the background 
conversation on the API design for this.
   
   The original idea was to use this as shown below (slightly updated from how 
it was discussed in the above issue) - i.e. easily register multiple tables 
from a prefix into a schema.
   
   ```
   let object_store = S3FileSystem::default();
   let schema = ObjectStoreSchemaProvider::new();
   schema.register_store(object_store);
   let tables = object_store.list_dir("s3://active/schema1");
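   // A plain loop rather than `.map(..)`: the body uses `.await?`, and a
   // lazy `map` closure would never run without being consumed.
   for file in tables.iter() {
       let config = ListingTableConfig::new(object_store, file).infer().await?;
       let name = extract_name_from_file(&file);
       schema.register_listing_table(name, file, config);
   }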
   ```
   
   Thinking about it now, I think the same could be done without this 
abstraction, like the following:
   
   ```
   let object_store = S3FileSystem::default();
   let schema = MemorySchemaProvider::new();
   let tables = object_store.list_dir("s3://active/schema1");
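   // A plain loop rather than `.map(..)`: the body uses `.await?`, and a
   // lazy `map` closure would never run without being consumed.
   for file in tables.iter() {
       let config = ListingTableConfig::new(object_store, file).infer().await?;
       let table = ListingTable::try_new(config)?;
       let name = extract_name_from_file(&file);
       schema.register_table(name, table)?;
   }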
   ```
   
   If you agree, then yes I do think we can remove `ObjectStoreSchemaProvider`.
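
For reference, `extract_name_from_file` in the snippets above is a hypothetical helper, not an existing DataFusion function. A minimal sketch of what it might look like, assuming the listed entries are plain string paths or URIs and that the table name is simply the last path segment without its extension:

```rust
use std::path::Path;

/// Hypothetical helper (not part of DataFusion): derive a table name from
/// an object path or URI by taking the last path segment and dropping its
/// final extension.
fn extract_name_from_file(file: &str) -> Option<String> {
    // Ignore trailing slashes so a "directory" style prefix still yields a name.
    let trimmed = file.trim_end_matches('/');
    // The last '/'-separated segment is the file or directory name.
    let last = trimmed.rsplit('/').next()?;
    // `file_stem` strips only the final extension, e.g. "orders.parquet" -> "orders".
    let stem = Path::new(last).file_stem()?.to_str()?;
    if stem.is_empty() {
        None
    } else {
        Some(stem.to_string())
    }
}
```

With this sketch, `s3://active/schema1/orders.parquet` and `s3://active/schema1/orders/` would both map to a table named `orders`. Returning an `Option` leaves it to the caller to decide how to handle entries that have no usable name.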

