kosiew commented on code in PR #22584:
URL: https://github.com/apache/datafusion/pull/22584#discussion_r3354869357


##########
datafusion/ffi/src/physical_optimizer.rs:
##########
@@ -31,6 +32,99 @@ use crate::execution_plan::FFI_ExecutionPlan;
 use crate::util::FFI_Result;
 use crate::{df_result, sresult_return};
 
+/// A stable struct for sharing [`PhysicalOptimizerContext`] across FFI boundaries.
+///
+/// This provides access to configuration options and an optional statistics registry
+/// for optimizer rules that need extended context.
+#[repr(C)]
+#[derive(Debug)]
+pub struct FFI_PhysicalOptimizerContext {
+    pub config_options:
+        unsafe extern "C" fn(&FFI_PhysicalOptimizerContext) -> FFI_ConfigOptions,
+
+    /// Returns true if a statistics registry is available.
+    pub has_statistics_registry:
+        unsafe extern "C" fn(&FFI_PhysicalOptimizerContext) -> bool,
+

Review Comment:
   `has_statistics_registry` looks unused at the moment. It is exposed in the 
`#[repr(C)]` struct and wired up in `new()`, but 
`optimize_with_context_fn_wrapper` always builds `ForeignOptimizerContext` with 
`statistics_registry: None`.
   
   Could we either remove this function pointer and 
`context_has_statistics_registry_fn` to keep the ABI surface smaller, or call 
it from the wrapper so the behavior matches the doc comment?
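
   To illustrate the second option, here is a minimal, self-contained sketch of a wrapper that actually calls the function pointer. All names in it (`FfiContext`, `PrivateData`, `ForeignContextView`, `build_view`) are hypothetical stand-ins, not the types in this PR, and the real wrapper would need to decide what to do with the flag.

```rust
use std::ffi::c_void;

/// Hypothetical, simplified stand-in for `FFI_PhysicalOptimizerContext`.
#[repr(C)]
pub struct FfiContext {
    pub has_statistics_registry: unsafe extern "C" fn(&FfiContext) -> bool,
    pub release: unsafe extern "C" fn(&mut FfiContext),
    pub private_data: *const c_void,
}

/// Provider-side data hidden behind `private_data`.
struct PrivateData {
    has_registry: bool,
}

unsafe extern "C" fn has_registry_fn(ctx: &FfiContext) -> bool {
    // SAFETY: `private_data` was produced by `FfiContext::new` from a
    // `Box<PrivateData>` and has not been released yet.
    let data = unsafe { &*(ctx.private_data as *const PrivateData) };
    data.has_registry
}

unsafe extern "C" fn release_fn(ctx: &mut FfiContext) {
    if !ctx.private_data.is_null() {
        // SAFETY: same provenance as above; the pointer is nulled afterwards
        // so a second call is a no-op.
        drop(unsafe { Box::from_raw(ctx.private_data as *mut PrivateData) });
        ctx.private_data = std::ptr::null();
    }
}

impl FfiContext {
    pub fn new(has_registry: bool) -> Self {
        let data = Box::new(PrivateData { has_registry });
        Self {
            has_statistics_registry: has_registry_fn,
            release: release_fn,
            private_data: Box::into_raw(data) as *const c_void,
        }
    }
}

impl Drop for FfiContext {
    fn drop(&mut self) {
        unsafe { (self.release)(self) }
    }
}

/// Hypothetical consumer-side view built by the wrapper.
pub struct ForeignContextView {
    pub registry_available: bool,
}

/// Calls the function pointer instead of hard-coding "no registry".
pub fn build_view(ctx: &FfiContext) -> ForeignContextView {
    let registry_available = unsafe { (ctx.has_statistics_registry)(ctx) };
    ForeignContextView { registry_available }
}

fn main() {
    let ctx = FfiContext::new(true);
    println!("registry available: {}", build_view(&ctx).registry_available);
}
```

   If the wrapper has no use for the flag, removing the pointer is the simpler choice.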



##########
datafusion/ffi/src/tests/mod.rs:
##########
@@ -113,6 +113,8 @@ pub struct ForeignLibraryModule {
 
     pub create_physical_optimizer_rule: extern "C" fn() -> FFI_PhysicalOptimizerRule,
 
+    pub create_context_aware_optimizer_rule: extern "C" fn() -> FFI_PhysicalOptimizerRule,

Review Comment:
   `ForeignLibraryModule` gained a required public field here, so existing 
struct literal construction would break. `cargo-semver-checks` flags this as 
`constructible_struct_adds_field`.
   
   The API change label covers this, and I realize this lives under the test 
module for cross-library integration coverage. Just calling it out so the 
public API impact is explicit.



##########
datafusion/ffi/src/physical_optimizer.rs:
##########
@@ -41,6 +135,12 @@ pub struct FFI_PhysicalOptimizerRule {
         config: FFI_ConfigOptions,
     ) -> FFI_Result<FFI_ExecutionPlan>,
 
+    pub optimize_with_context: unsafe extern "C" fn(
+        &Self,
+        plan: &FFI_ExecutionPlan,
+        context: &FFI_PhysicalOptimizerContext,
+    ) -> FFI_Result<FFI_ExecutionPlan>,
+

Review Comment:
   Small ABI-shape note for future extensions: `optimize_with_context` was 
inserted between `optimize` and `name`, so all later fields move by one slot. 
`cargo-semver-checks` also reports this as 
`repr_c_plain_struct_fields_reordered`.
   
   Since this PR already has an acknowledged major-version API change, I do not 
think this needs to block the PR. Still, for vtable-style `#[repr(C)]` structs, 
it is usually safer to add new function pointers after the existing ones, just 
before the data fields, instead of inserting them in the middle. That limits 
how much of the existing layout shifts.
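
   The layout effect can be checked with `std::mem::offset_of!` (stable since Rust 1.77). The sketch below uses hypothetical stand-in structs (`VtableV1`, `VtableInserted`, `VtableAppended`) with a single hypothetical `Callback` signature; they are not the real `FFI_PhysicalOptimizerRule`, which has more fields.

```rust
use std::mem::offset_of;

/// Hypothetical callback signature shared by all function-pointer slots.
pub type Callback = unsafe extern "C" fn() -> i32;

/// Layout before the change.
#[repr(C)]
pub struct VtableV1 {
    pub optimize: Callback,
    pub name: Callback,
    pub private_data: *const u8,
}

/// New pointer inserted in the middle: `name` moves by one slot.
#[repr(C)]
pub struct VtableInserted {
    pub optimize: Callback,
    pub optimize_with_context: Callback,
    pub name: Callback,
    pub private_data: *const u8,
}

/// New pointer added after the existing pointers: `name` stays put.
#[repr(C)]
pub struct VtableAppended {
    pub optimize: Callback,
    pub name: Callback,
    pub optimize_with_context: Callback,
    pub private_data: *const u8,
}

fn main() {
    println!("v1 name offset:       {}", offset_of!(VtableV1, name));
    println!("inserted name offset: {}", offset_of!(VtableInserted, name));
    println!("appended name offset: {}", offset_of!(VtableAppended, name));
}
```

   Note that `private_data` still moves in both variants; only the existing function-pointer slots keep their offsets when the new pointer is placed after them.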



##########
datafusion/ffi/src/physical_optimizer.rs:
##########
@@ -31,6 +32,99 @@ use crate::execution_plan::FFI_ExecutionPlan;
 use crate::util::FFI_Result;
 use crate::{df_result, sresult_return};
 
+/// A stable struct for sharing [`PhysicalOptimizerContext`] across FFI boundaries.
+///
+/// This provides access to configuration options and an optional statistics registry
+/// for optimizer rules that need extended context.
+#[repr(C)]
+#[derive(Debug)]
+pub struct FFI_PhysicalOptimizerContext {
+    pub config_options:
+        unsafe extern "C" fn(&FFI_PhysicalOptimizerContext) -> FFI_ConfigOptions,
+
+    /// Returns true if a statistics registry is available.
+    pub has_statistics_registry:
+        unsafe extern "C" fn(&FFI_PhysicalOptimizerContext) -> bool,
+
+    /// Release the memory of the private data.
+    pub release: unsafe extern "C" fn(&mut FFI_PhysicalOptimizerContext),
+
+    /// Internal data. Only accessed by the provider.
+    pub private_data: *const c_void,
+}
+
+unsafe impl Send for FFI_PhysicalOptimizerContext {}
+unsafe impl Sync for FFI_PhysicalOptimizerContext {}
+
+struct OptimizerContextPrivateData {
+    config: ConfigOptions,
+    statistics_registry: Option<StatisticsRegistry>,
+}
+
+impl FFI_PhysicalOptimizerContext {
+    pub fn new(context: &dyn PhysicalOptimizerContext) -> Self {
+        let private_data = Box::new(OptimizerContextPrivateData {
+            config: context.config_options().clone(),
+            statistics_registry: context.statistics_registry().cloned(),
+        });
+        let private_data = Box::into_raw(private_data) as *const c_void;
+

Review Comment:
   This clone looks unnecessary now. `StatisticsRegistry` is cloned into 
`OptimizerContextPrivateData`, but the provider side never reads it because 
`ForeignOptimizerContext` is always created with `statistics_registry: None`.
   
   Since the registry cannot safely cross the FFI boundary here, could this 
just store `statistics_registry: None` and avoid the extra allocation on each 
`optimize_with_context` call?
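
   A rough sketch of that idea, using hypothetical stand-ins (`Registry`, `PrivateData`, `build_private_data`) rather than the PR's types. The stand-in registry deliberately does not implement `Clone`, so the code compiles only because nothing clones it.

```rust
/// Hypothetical stand-in for `StatisticsRegistry`; intentionally not `Clone`.
pub struct Registry {
    pub entries: Vec<String>,
}

/// Hypothetical stand-in for `OptimizerContextPrivateData`.
pub struct PrivateData {
    pub config: String,
    pub statistics_registry: Option<Registry>,
}

/// Builds the private data without touching the caller's registry.
/// The registry argument is accepted only to mirror the original call shape.
pub fn build_private_data(config: &str, _registry: Option<&Registry>) -> PrivateData {
    PrivateData {
        config: config.to_owned(),
        // The registry does not cross the boundary, so do not copy it.
        statistics_registry: None,
    }
}

fn main() {
    let registry = Registry {
        entries: vec!["table_a".to_owned()],
    };
    let data = build_private_data("target_partitions=8", Some(&registry));
    println!("config = {}", data.config);
    println!("registry stored: {}", data.statistics_registry.is_some());
}
```

   With this shape, a `has_statistics_registry` callback backed by the stored field would always report `false`, which ties back to the earlier question about whether that pointer should exist at all.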



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

