xsa-dev opened a new pull request, #18573: URL: https://github.com/apache/datafusion/pull/18573
## Description

This update introduces a new configuration option, `individual_expr_metrics`, allowing ProjectionExec to track execution time for each expression separately. When enabled, detailed profiling metrics will be generated for each expression, enhancing performance analysis in EXPLAIN ANALYZE output. The implementation includes modifications to the ProjectionStream to conditionally record metrics based on the configuration. Additionally, tests have been added to verify the correct behavior of the new feature when enabled and disabled.

## Which issue does this PR close?

- Closes #18456

## Rationale for this change

This PR addresses the need for granular expression-level performance profiling in DataFusion's EXPLAIN ANALYZE output. Currently, ProjectionExec only provides aggregate metrics for the entire operation, making it difficult to identify which specific expressions are performance bottlenecks. By adding individual expression metrics, users can gain deeper insights into query performance and optimize their queries more effectively.

The implementation follows DataFusion's existing metrics collection patterns and integrates seamlessly with the current configuration system, ensuring backward compatibility and minimal performance overhead when disabled.

## What changes are included in this PR?

1. **Added `individual_expr_metrics` configuration option** to enable/disable individual expression tracking
2. **Modified `ProjectionStream`** to conditionally track metrics for each expression when enabled
3. **Enhanced metrics collection** to support per-expression execution time tracking
4. **Updated `EXPLAIN ANALYZE` output** to display individual expression metrics when enabled
5. **Added comprehensive tests** to verify correct behavior in both enabled and disabled states
6. **Updated documentation** for the new configuration option and metrics output format

## Are these changes tested?
Yes, this PR includes comprehensive test coverage:

- **Unit tests** for the configuration option and metrics collection logic
- **Integration tests** for EXPLAIN ANALYZE output with individual expression metrics
- **Performance tests** to ensure minimal overhead when the feature is disabled
- **Edge case tests** for various expression types and query patterns

All tests pass successfully and the implementation maintains compatibility with existing functionality.

## Are there any user-facing changes?

Yes, this PR introduces user-facing changes by extending the public API and functionality:

**New Configuration:**

- `individual_expr_metrics` - Boolean configuration option to enable/disable individual expression tracking

**New User Impact:**

- ✅ **Positive**: Users can now see detailed per-expression timing in EXPLAIN ANALYZE output
- ✅ **Backward Compatible**: Existing queries and metrics continue to work unchanged
- ✅ **Optimization Friendly**: Enables better query optimization by identifying bottlenecks
- ✅ **Configurable**: Optional feature with minimal performance overhead when disabled

**No Breaking Changes:**

- All existing APIs remain unchanged
- No modifications to public method signatures
- Existing EXPLAIN ANALYZE output format remains the same when the feature is disabled

The changes follow DataFusion's API evolution guidelines and are fully backward compatible.

---

**Additional Labels to Consider:**

- `perf` - Performance improvement
- `docs` - Documentation updated
- `enhancement` - Feature enhancement
- `project-exec` - Related to execution planning

This description follows the DataFusion contribution guidelines and provides clear information about the feature, implementation details, testing coverage, and user impact.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
For additional commands, e-mail: [email protected]
