returnString commented on pull request #552:
URL: https://github.com/apache/arrow-datafusion/pull/552#issuecomment-866756001


   Hrm, I think if we're moving this to the interface, we need to codify some 
notion of "unsupported operation". That's actually why I left it out initially 
- registering a schema inside e.g. the information_schema catalog doesn't 
really make sense, because it's a read-only projection of DB internals, and I 
didn't want to commit to more public API than was necessary.
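
One way the "unsupported operation" notion could look is sketched below. This is only an illustration under stated assumptions: `CatalogError`, `Schema`, `Catalog`, `MemoryCatalog`, and `InformationSchemaCatalog` are hypothetical, simplified stand-ins written against the standard library, not DataFusion's actual traits or types. The idea is that the trait method defaults to an error, so read-only catalogs need no extra code and only mutable ones override it.

```rust
use std::collections::HashMap;
use std::fmt;

/// Hypothetical error type for this sketch; not part of DataFusion's API.
#[derive(Debug, PartialEq)]
pub enum CatalogError {
    /// The catalog does not support the requested mutation.
    Unsupported(String),
}

impl fmt::Display for CatalogError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CatalogError::Unsupported(msg) => write!(f, "unsupported operation: {}", msg),
        }
    }
}

impl std::error::Error for CatalogError {}

/// Hypothetical stand-in for a schema: just a list of table names.
#[derive(Debug, Clone, PartialEq)]
pub struct Schema {
    pub tables: Vec<String>,
}

/// Simplified stand-in for a catalog provider trait.
pub trait Catalog {
    fn schema_names(&self) -> Vec<String>;

    /// Default implementation refuses the mutation, so read-only
    /// catalogs do not have to implement anything.
    fn register_schema(
        &mut self,
        name: &str,
        _schema: Schema,
    ) -> Result<Option<Schema>, CatalogError> {
        Err(CatalogError::Unsupported(format!(
            "cannot register schema '{}' in a read-only catalog",
            name
        )))
    }
}

/// Mutable in-memory catalog: overrides the default and accepts schemas.
#[derive(Default)]
pub struct MemoryCatalog {
    schemas: HashMap<String, Schema>,
}

impl Catalog for MemoryCatalog {
    fn schema_names(&self) -> Vec<String> {
        let mut names: Vec<String> = self.schemas.keys().cloned().collect();
        names.sort();
        names
    }

    /// Returns the previously registered schema under `name`, if any.
    fn register_schema(
        &mut self,
        name: &str,
        schema: Schema,
    ) -> Result<Option<Schema>, CatalogError> {
        Ok(self.schemas.insert(name.to_string(), schema))
    }
}

/// Read-only projection of internals: keeps the default `register_schema`.
pub struct InformationSchemaCatalog;

impl Catalog for InformationSchemaCatalog {
    fn schema_names(&self) -> Vec<String> {
        vec!["information_schema".to_string()]
    }
}
```

With this shape, callers that only hold a trait object get a recoverable error rather than a panic when they try to mutate a read-only catalog.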
   
   In my DataFusion-powered projects, I typically treat ExecutionContext 
instances as immutable, which simplifies a lot of the setup. Essentially this 
entails creating catalogs using concrete types like `MemoryCatalogProvider` 
and then just attaching those to a new context, so I can work with the 
type-specific impls rather than only the trait methods. I'm not sure how 
widely adopted this is as a methodology, but I've found it works well!
   
   For example, if I were building a traditional database, here's how I'd 
execute queries:
   - build the list of catalogs (and internally, schemas/tables) the user has 
permissions to access (this relies on out-of-band data)
   - create an execution context populated with said catalog list
   - run the query using the context
   - discard the context
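
The steps above can be sketched roughly as follows. To keep the example self-contained it uses hypothetical stand-in types (`Catalog`, `QueryContext`) and helper functions (`visible_catalogs`, `run_for_user`) built on the standard library only; none of these are DataFusion APIs, and "running a query" is reduced to a table lookup.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for a catalog: just a list of table names.
#[derive(Debug, Clone, PartialEq)]
pub struct Catalog {
    pub tables: Vec<String>,
}

/// Hypothetical stand-in for an execution context. It exposes no
/// mutating methods, so it is effectively immutable once built.
pub struct QueryContext {
    catalogs: HashMap<String, Catalog>,
}

impl QueryContext {
    pub fn new(catalogs: HashMap<String, Catalog>) -> Self {
        Self { catalogs }
    }

    /// Placeholder for real query execution: only resolves a table name.
    pub fn table_exists(&self, catalog: &str, table: &str) -> bool {
        self.catalogs
            .get(catalog)
            .map_or(false, |c| c.tables.iter().any(|t| t == table))
    }
}

/// Step 1: keep only the catalogs the user may access. The `allowed`
/// set represents the out-of-band permission data.
pub fn visible_catalogs(
    all: &HashMap<String, Catalog>,
    allowed: &HashSet<String>,
) -> HashMap<String, Catalog> {
    all.iter()
        .filter(|(name, _)| allowed.contains(*name))
        .map(|(name, catalog)| (name.clone(), catalog.clone()))
        .collect()
}

/// Steps 2 to 4: build a fresh context, use it once, and let it drop.
pub fn run_for_user(
    all: &HashMap<String, Catalog>,
    allowed: &HashSet<String>,
    catalog: &str,
    table: &str,
) -> bool {
    let ctx = QueryContext::new(visible_catalogs(all, allowed));
    ctx.table_exists(catalog, table)
    // `ctx` goes out of scope here and is discarded.
}
```

Because the context is rebuilt per query, permission checks reduce to deciding what goes into the catalog list: anything the user cannot see simply does not exist in their context.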
   
   Obviously this relies on the context setup being quite cheap, but I don't 
see any moves toward making that a particularly intensive process 😄 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

