shyjsarah opened a new pull request, #324:
URL: https://github.com/apache/paimon-rust/pull/324
### Purpose
Linked issue: close #xxx
`paimon-datafusion` provides `register_vector_search` /
`register_full_text_search` to register table-valued functions (UDTFs) on a
session. But the Python binding `PySQLContext` only exposed `register_catalog`,
`set_current_*`, `register_batch`, and `sql`. There was no way to reach
`register_udtf` from Python, so these UDTFs were entirely unusable from
`pypaimon`.
This PR adds a single registration entry point for them to the Python binding.
### Brief change log
- **`bindings/python/src/context.rs`**: add
`SQLContext.register_table_function(name, default_database=None)`. A single
dispatch method (rather than one method per function) keeps the Python API
surface stable — it `match`es on the function name, currently handling
`vector_search` and `full_text_search`, and raises a clear `ValueError` for an
unknown name. The function is bound to the current catalog.
- **`crates/integrations/datafusion/src/sql_context.rs`**: change
`SQLContext::current_catalog` from private to `pub`. The binding needs the
registered `Arc<dyn Catalog>` to pass to `register_*`; exposing the accessor
lets it read from `SQLContext` instead of keeping a duplicate catalog handle.
- **`bindings/python/Cargo.toml`**: enable the `fulltext` feature on
`paimon-datafusion` (pulls in `tantivy` + `tempfile`, both pure-Rust) so
`register_full_text_search` is compiled into the binding.
Once `register_referenced_files_size` / `register_physical_files_size`
land on `main`, wiring them is a two-line addition to the `match` — the Python
signature does not change.
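
The dispatch described above can be modelled in plain Python. This is only an illustrative sketch, not the binding's code: the real method is implemented in Rust in `bindings/python/src/context.rs`, and the `SketchSQLContext` class, its simplified `register_catalog(name)` signature, the private `_register_*` helpers, and the choice of `RuntimeError` for the missing-catalog case are all assumptions made for the example.

```python
from typing import Dict, Optional, Tuple


class SketchSQLContext:
    """Hypothetical stand-in for the binding's SQLContext (not the real class)."""

    def __init__(self) -> None:
        # The real binding reads an Arc<dyn Catalog> from SQLContext::current_catalog;
        # a plain string is used here as a placeholder.
        self._current_catalog: Optional[str] = None
        # Records (catalog, default_database) per registered function name.
        self.registered: Dict[str, Tuple[str, Optional[str]]] = {}

    def register_catalog(self, name: str) -> None:
        # Assumed, simplified signature for illustration only.
        self._current_catalog = name

    def register_table_function(
        self, name: str, default_database: Optional[str] = None
    ) -> None:
        # Registration is bound to the current catalog, so one must exist first.
        # The exception type is an assumption; the PR only says the call raises.
        if self._current_catalog is None:
            raise RuntimeError("no catalog registered; call register_catalog first")

        # Single dispatch point: supporting another function means adding
        # one entry here, with no change to the public signature.
        handlers = {
            "vector_search": self._register_vector_search,
            "full_text_search": self._register_full_text_search,
        }
        handler = handlers.get(name)
        if handler is None:
            supported = ", ".join(sorted(handlers))
            raise ValueError(
                f"unknown table function {name!r}; supported: {supported}"
            )
        handler(default_database)

    def _register_vector_search(self, default_database: Optional[str]) -> None:
        # Placeholder for the Rust-side register_vector_search call.
        self.registered["vector_search"] = (self._current_catalog, default_database)

    def _register_full_text_search(self, default_database: Optional[str]) -> None:
        # Placeholder for the Rust-side register_full_text_search call.
        self.registered["full_text_search"] = (self._current_catalog, default_database)
```

In this model, calling `register_table_function("vector_search")` before `register_catalog` raises, an unknown name raises `ValueError` listing the supported names, and `default_database` is optional.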
### Tests
`bindings/python/tests/test_datafusion.py` — 5 new tests:
- `vector_search` / `full_text_search` register without error
- the optional `default_database` keyword is accepted
- an unknown function name raises a clear error
- calling it before any catalog is registered raises an error
Registration touches neither the Lumina nor the Tantivy runtime, so the
tests are deterministic and need no index fixtures.
### API and Format
- New Python API: `SQLContext.register_table_function`.
- New public Rust API: `SQLContext::current_catalog` (previously private).
- Build: the binding now enables `paimon-datafusion/fulltext` (adds
`tantivy`).
- No storage format change.
### Documentation
New Python-facing API. The Rust-side `docs/src/sql.md` already documents
the underlying `register_*` functions; the pypaimon-facing docs live in the
`apache/paimon` repo and can be updated as a follow-up.