Re: [PR] feat: implement metrics for AsyncFuncExec [datafusion]

via GitHub Mon, 05 Jan 2026 02:21:52 -0800


2010YOUY01 commented on code in PR #19626:
URL: https://github.com/apache/datafusion/pull/19626#discussion_r2660992607



##########
datafusion/core/tests/user_defined/user_defined_async_scalar_functions.rs:
##########
@@ -81,6 +88,30 @@ async fn test_async_udf_with_non_modular_batch_size() -> 
Result<()> {
     Ok(())
 }
 
+// This test checks if metrics are printed for `AsyncFuncExec`
+#[tokio::test]
+async fn test_async_udf_metrics() -> Result<()> {
+    let ctx = register_table_and_udf()?;
+
+    let df = ctx
+        .sql(
+            "EXPLAIN ANALYZE SELECT id, test_async_udf(prompt) as result FROM 
test_table",
+        )
+        .await?;
+
+    let result = df.collect().await?;
+
+    let explain_analyze_str = format_batches(&result)?.to_string();
+    let async_func_exec_without_metrics =
+        explain_analyze_str.split("\n").any(|metric_line| {
+            metric_line.contains("AsyncFuncExec") && 
!metric_line.contains("output_rows")

Review Comment:
   I suggest to assert on the output rows value (like 
`...contains("output_rows=x")...`)



##########
datafusion/physical-plan/src/async_func.rs:
##########
@@ -207,8 +209,11 @@ impl ExecutionPlan for AsyncFuncExec {
             let async_exprs_captured = Arc::clone(&async_exprs_captured);
             let schema_captured = Arc::clone(&schema_captured);
             let config_options = Arc::clone(&config_options_ref);
+            let baseline_metrics_captured = baseline_metrics.clone();
 
             async move {
+                let timer = 
baseline_metrics_captured.elapsed_compute().timer();

Review Comment:
   I think it would count the `await` time for async functions, and the 
`elapsed_compute` metric inside `BaselienMetrics` is not expected to count 
awaiting time.
   
   So in order to implement this, we need to figure out how to pass the timer 
to the async expression evaluation step.
   
   I suggest we remove the time counting for now, since 0s output is an obvious 
'unimplemented', if we include the await time inside `elapsed_compute`, the 
output might be confusing.
   
   And we can implement `elpased_compute` correctly in a follow-up PR, other 
metrics inside `BaselineMetrics` should be working after this PR.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] feat: implement metrics for AsyncFuncExec [datafusion]

Reply via email to