zfarrell opened a new pull request, #22499:
URL: https://github.com/apache/datafusion/pull/22499
## Which issue does this PR close?
<!-- No linked issue. Happy to file one if reviewers prefer. -->
- Closes #.
## Rationale for this change
Downstream catalog implementations that resolve schemas asynchronously
cannot reuse `InformationSchemaProvider`: it enumerates schemas via the
synchronous `CatalogProvider::schema_names()`, so an async-only catalog
has to provide its own `information_schema.schemata` view. Today that
requires either duplicating the column layout and the row-building
logic, or reaching into private items.
Exposing `InformationSchemataBuilder` and a `schemata_schema()` factory
lets external crates emit byte-for-byte-compatible `schemata` batches
without copy-pasting the contract.
## What changes are included in this PR?
- `pub fn schemata_schema() -> SchemaRef` — extracts the column-layout
factory. `InformationSchemata::new` now calls it instead of inlining
the schema, so there is a single source of truth.
- `InformationSchemataBuilder` becomes `pub` (was private) with a
`Default` impl and a public `new()`. `add_schemata` and `finish` are
bumped to `pub`. The function bodies and parameter types
(`&str` / `Option<&str>`) are unchanged.
- `finish` now returns `Result<RecordBatch>` instead of panicking via
an internal `.unwrap()`. The one internal caller
(`PartitionStream::execute` for `InformationSchemata`) previously
wrapped the call as `Ok(builder.finish())`; it now returns
`builder.finish()` directly, since that expression already yields the
`Result`.
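
For reviewers who want the shape of the change at a glance, here is a minimal std-only sketch of the pattern described above. It is not the code in this PR: `Batch`, `BuildError`, `SchemataBuilderSketch`, `schemata_columns`, and `execute_sketch` are hypothetical stand-ins for `RecordBatch`, the DataFusion error type, `InformationSchemataBuilder`, `schemata_schema()`, and the `PartitionStream::execute` call site. The column names and the three-argument `add_schemata` signature are assumptions based on the null pattern and the `&str` / `Option<&str>` parameter types mentioned in this description.

```rust
use std::fmt;

/// Hypothetical stand-in for `schemata_schema()`: the single source of
/// truth for the column layout (names are assumed, not taken from the PR).
pub fn schemata_columns() -> Vec<&'static str> {
    vec![
        "catalog_name",
        "schema_name",
        "schema_owner",
        "default_character_set_catalog",
        "default_character_set_schema",
        "default_character_set_name",
        "sql_path",
    ]
}

/// Hypothetical stand-in for a `RecordBatch`: column names plus nullable rows.
#[derive(Debug, PartialEq)]
pub struct Batch {
    pub columns: Vec<&'static str>,
    pub rows: Vec<Vec<Option<String>>>,
}

/// Hypothetical stand-in for the error returned by a fallible `finish`.
#[derive(Debug)]
pub struct BuildError(String);

impl fmt::Display for BuildError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "failed to build schemata batch: {}", self.0)
    }
}

impl std::error::Error for BuildError {}

/// Sketch of a public builder with a `Default` impl and a public `new()`.
#[derive(Default)]
pub struct SchemataBuilderSketch {
    rows: Vec<Vec<Option<String>>>,
}

impl SchemataBuilderSketch {
    pub fn new() -> Self {
        Self::default()
    }

    /// Appends one row; the character-set columns and `sql_path` stay null.
    pub fn add_schemata(
        &mut self,
        catalog_name: &str,
        schema_name: &str,
        schema_owner: Option<&str>,
    ) {
        self.rows.push(vec![
            Some(catalog_name.to_string()),
            Some(schema_name.to_string()),
            schema_owner.map(str::to_string),
            None,
            None,
            None,
            None,
        ]);
    }

    /// Returns an error instead of panicking when the rows do not match
    /// the layout, mirroring the move away from an internal `.unwrap()`.
    pub fn finish(&mut self) -> Result<Batch, BuildError> {
        let columns = schemata_columns();
        let rows = std::mem::take(&mut self.rows);
        if let Some(bad) = rows.iter().find(|r| r.len() != columns.len()) {
            return Err(BuildError(format!(
                "expected {} columns, found {}",
                columns.len(),
                bad.len()
            )));
        }
        Ok(Batch { columns, rows })
    }
}

/// Sketch of the call site: the `Result` from `finish` is returned as-is,
/// where an infallible `finish` would have needed an `Ok(..)` wrapper.
pub fn execute_sketch(schemas: &[(&str, &str)]) -> Result<Batch, BuildError> {
    let mut builder = SchemataBuilderSketch::new();
    for (catalog, schema) in schemas {
        builder.add_schemata(catalog, schema, None);
    }
    builder.finish()
}
```

An external crate with an async-only catalog would follow the same flow: resolve its schema names however it needs to, feed them to the builder, and propagate the `Result` from `finish` with `?` or by returning it directly.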
## Are these changes tested?
Yes. A new unit test `schemata_builder_emits_canonical_schema_and_rows`
exercises the public API end-to-end via `Default::default()`, asserts
the produced batch's schema matches `schemata_schema()`, and verifies
the null pattern for `schema_owner`, the three `default_character_set_*`
columns, and `sql_path`. The pre-existing internal users
(`InformationSchemata::new`, `PartitionStream::execute`) continue to
exercise the same code path through the unchanged
`InformationSchemata::builder()` constructor.
## Are there any user-facing changes?
Yes. This adds new public items to `datafusion-catalog`:
`schemata_schema` and `InformationSchemataBuilder`, together with the
builder's `new` / `add_schemata` / `finish` methods and its `Default`
impl. No existing public API is broken. The `Result<RecordBatch>`
return type on `finish` is first-time-public surface, not a regression.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]