shyjsarah opened a new pull request, #324:
URL: https://github.com/apache/paimon-rust/pull/324

   ### Purpose
   
     Linked issue: close #xxx
   
     `paimon-datafusion` provides `register_vector_search` and 
`register_full_text_search` to register table-valued functions (UDTFs) on a 
session. However, the Python binding `PySQLContext` only exposed 
`register_catalog`, `set_current_*`, `register_batch`, and `sql`. There was no 
way to reach `register_udtf` from Python, so these UDTFs were unusable from 
`pypaimon`.
   
     This PR adds a single registration entry point, 
`register_table_function`, to the Python binding.
   
     ### Brief change log
   
     - **`bindings/python/src/context.rs`**: add 
`SQLContext.register_table_function(name, default_database=None)`. A single 
dispatch method (rather than one method per function) keeps the Python API 
surface stable — it `match`es on the function name, currently handling 
`vector_search` and `full_text_search`, and raises a clear `ValueError` for an 
unknown name. The function is bound to the current catalog.
     - **`crates/integrations/datafusion/src/sql_context.rs`**: change 
`SQLContext::current_catalog` from private to `pub`. The binding needs the 
registered `Arc<dyn Catalog>` to pass to `register_*`; exposing the accessor 
lets it read from `SQLContext` instead of keeping a duplicate catalog handle.
     - **`bindings/python/Cargo.toml`**: enable the `fulltext` feature on 
`paimon-datafusion` (pulls in `tantivy` + `tempfile`, both pure-Rust) so 
`register_full_text_search` is compiled into the binding.
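
The dispatch-by-name design described above can be modeled in a few lines of pure Python. This is only an illustrative sketch, not the binding's code: `SketchSQLContext`, its `registered` dictionary, the string-valued catalog, the `RuntimeError` raised when no catalog is registered, and the order of the two checks are all assumptions made for the example. Only the method name `register_table_function(name, default_database=None)`, the two supported function names, and the `ValueError` for unknown names come from the description.

```python
from typing import Callable, Dict, Optional, Tuple


class SketchSQLContext:
    """Toy stand-in for the binding's SQLContext; NOT the real pypaimon class."""

    def __init__(self) -> None:
        # Hypothetical: the current catalog is modeled as a plain name.
        self._catalog: Optional[str] = None
        # Records what was registered, purely so the sketch is observable.
        self.registered: Dict[str, Tuple[str, Optional[str]]] = {}
        # One entry per supported function. Supporting another function
        # means adding an entry here; the public signature stays the same.
        self._registrars: Dict[str, Callable[[str, Optional[str]], None]] = {
            "vector_search": self._make_registrar("vector_search"),
            "full_text_search": self._make_registrar("full_text_search"),
        }

    def _make_registrar(self, name: str) -> Callable[[str, Optional[str]], None]:
        def register(catalog: str, default_database: Optional[str]) -> None:
            self.registered[name] = (catalog, default_database)

        return register

    def register_catalog(self, name: str) -> None:
        self._catalog = name

    def register_table_function(
        self, name: str, default_database: Optional[str] = None
    ) -> None:
        # Assumed error type: the description only says this case "raises".
        if self._catalog is None:
            raise RuntimeError("no catalog registered")
        registrar = self._registrars.get(name)
        if registrar is None:
            supported = ", ".join(sorted(self._registrars))
            raise ValueError(
                f"unknown table function {name!r}; supported: {supported}"
            )
        registrar(self._catalog, default_database)


ctx = SketchSQLContext()
ctx.register_catalog("paimon")
ctx.register_table_function("vector_search")
ctx.register_table_function("full_text_search", default_database="default")
```

Keeping the set of supported names in one place is what lets the Python signature stay fixed as more functions are wired in.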
   
     Once `register_referenced_files_size` / `register_physical_files_size` 
land on `main`, wiring them is a two-line addition to the `match` — the Python 
signature does not change.
   
     ### Tests
   
     `bindings/python/tests/test_datafusion.py` — 5 new tests:
     - `vector_search` / `full_text_search` register without error
     - the optional `default_database` keyword is accepted
     - an unknown function name raises a clear error
     - calling before any catalog is registered raises
   
     Registration touches neither the Lumina nor the Tantivy runtime, so the 
tests are deterministic and need no index fixtures.
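
One plausible reading of why registration needs no index fixtures is that registering a table function only records it, and any index work happens when the function is later called in a query. The toy model below illustrates that idea under that assumption; `ToyRegistry`, `toy_full_text_search`, and the `opened` list are hypothetical names for this sketch and do not correspond to DataFusion or Paimon APIs.

```python
from typing import Callable, Dict, List


class ToyRegistry:
    """Hypothetical registry: stores callables by name, runs them on demand."""

    def __init__(self) -> None:
        self._functions: Dict[str, Callable[..., List[str]]] = {}

    def register_udtf(self, name: str, func: Callable[..., List[str]]) -> None:
        # Registration only records the callable; nothing is executed here.
        self._functions[name] = func

    def call(self, name: str, *args: str) -> List[str]:
        # The function body (and any index access) runs only at call time.
        return self._functions[name](*args)


opened: List[str] = []  # records each time the pretend index is opened


def toy_full_text_search(table: str, query: str) -> List[str]:
    opened.append(table)  # stands in for opening a search index
    return [f"{table}:{query}"]


registry = ToyRegistry()
registry.register_udtf("full_text_search", toy_full_text_search)
opened_after_registration = list(opened)  # snapshot taken before any call

rows = registry.call("full_text_search", "docs", "paimon")
```

In this model a registration-only test never reaches the code that opens an index, which matches the claim that such tests stay deterministic.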
   
     ### API and Format
   
     - New Python API: `SQLContext.register_table_function`.
     - New public Rust API: `SQLContext::current_catalog` (previously private).
     - Build: the binding now enables `paimon-datafusion/fulltext` (adds 
`tantivy`).
     - No storage format change.
   
     ### Documentation
   
     New Python-facing API. The Rust-side `docs/src/sql.md` already documents 
the underlying `register_*` functions; the pypaimon-facing docs live in the 
`apache/paimon` repo and can be updated as a follow-up.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
