zfarrell opened a new pull request, #22499:
URL: https://github.com/apache/datafusion/pull/22499

   ## Which issue does this PR close?
   
   <!-- No linked issue. Happy to file one if reviewers prefer. -->
   
   - Closes #.
   
   ## Rationale for this change
   
   Downstream catalog implementations that resolve schemas asynchronously
   cannot reuse `InformationSchemaProvider`: it enumerates schemas via the
   synchronous `CatalogProvider::schema_names()`, so an async-only catalog
   has to provide its own `information_schema.schemata` view. Today that
   requires either duplicating the column layout and the row-building
   logic or reaching into private items.
   
   Exposing `InformationSchemataBuilder` and a `schemata_schema()` factory
   lets external crates emit byte-for-byte-compatible `schemata` batches
   without copy-pasting the contract.
   
   ## What changes are included in this PR?
   
   - `pub fn schemata_schema() -> SchemaRef` — extracts the column-layout
     factory. `InformationSchemata::new` now calls it instead of inlining
     the schema, so there is a single source of truth.
   - `InformationSchemataBuilder` becomes `pub` (was private) with a
     `Default` impl and a public `new()`. `add_schemata` and `finish` are
     bumped to `pub`. The function bodies and parameter types
     (`&str` / `Option<&str>`) are unchanged.
   - `finish` now returns `Result<RecordBatch>` instead of panicking via
     an internal `.unwrap()`. The one internal caller
     (`PartitionStream::execute` for `InformationSchemata`) previously
     wrapped the call as `Ok(builder.finish())`; it now returns
     `builder.finish()` directly, since that expression already produces
     the `Result`.
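
   As a rough illustration of the three changes above, here is a std-only
   sketch of the same shape: one schema factory, a `Default`-constructible
   builder, and a `finish` that returns a `Result` which the caller passes
   through. `Batch`, `SchemataBuilder`, and `execute` are hypothetical
   stand-ins, not the DataFusion or Arrow types, and the column names are
   assumed from the SQL-standard `schemata` layout rather than copied from
   the PR diff.

   ```rust
   use std::sync::Arc;

   /// Stand-in for Arrow's `SchemaRef`: an ordered list of column names.
   type SchemaRef = Arc<Vec<&'static str>>;

   /// Stand-in for `RecordBatch`: a schema plus row-major nullable strings.
   #[derive(Debug)]
   struct Batch {
       schema: SchemaRef,
       rows: Vec<Vec<Option<String>>>,
   }

   /// Single source of truth for the column layout (column names assumed).
   fn schemata_schema() -> SchemaRef {
       Arc::new(vec![
           "catalog_name",
           "schema_name",
           "schema_owner",
           "default_character_set_catalog",
           "default_character_set_schema",
           "default_character_set_name",
           "sql_path",
       ])
   }

   /// Hypothetical stand-in for `InformationSchemataBuilder`.
   #[derive(Default)]
   struct SchemataBuilder {
       rows: Vec<Vec<Option<String>>>,
   }

   impl SchemataBuilder {
       fn new() -> Self {
           Self::default()
       }

       /// Appends one row; the owner is optional and the last four
       /// columns are always null in this sketch.
       fn add_schemata(&mut self, catalog: &str, schema: &str, owner: Option<&str>) {
           self.rows.push(vec![
               Some(catalog.to_string()),
               Some(schema.to_string()),
               owner.map(str::to_string),
               None,
               None,
               None,
               None,
           ]);
       }

       /// Returns an error instead of panicking if a row does not fit the schema.
       fn finish(&mut self) -> Result<Batch, String> {
           let schema = schemata_schema();
           let rows = std::mem::take(&mut self.rows);
           if rows.iter().any(|r| r.len() != schema.len()) {
               return Err("row width does not match schemata schema".to_string());
           }
           Ok(Batch { schema, rows })
       }
   }

   /// Caller returns `finish()` as-is rather than wrapping it in `Ok(..)`.
   fn execute() -> Result<Batch, String> {
       let mut builder = SchemataBuilder::new();
       builder.add_schemata("datafusion", "public", None);
       builder.add_schemata("datafusion", "information_schema", Some("admin"));
       builder.finish()
   }
   ```

   An async-only catalog would follow the same pattern: resolve its schema
   names however it needs to, feed them to the builder, and propagate the
   `Result` from `finish`.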
   
   ## Are these changes tested?
   
   Yes. A new unit test `schemata_builder_emits_canonical_schema_and_rows`
   exercises the public API end-to-end via `Default::default()`, asserts
   the produced batch's schema matches `schemata_schema()`, and verifies
   the null pattern for `schema_owner`, the three `default_character_set_*`
   columns, and `sql_path`. The pre-existing internal users
   (`InformationSchemata::new`, `PartitionStream::execute`) continue to
   exercise the same code path through the unchanged
   `InformationSchemata::builder()` constructor.
   
   ## Are there any user-facing changes?
   
   Yes. New public items in `datafusion-catalog`: `schemata_schema` and
   `InformationSchemataBuilder` (with its `new` / `add_schemata` / `finish`
   methods and `Default` impl). No existing public API is broken. The
   `Result<RecordBatch>` return on `finish` is a first-time-public surface,
   not a regression.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

