mistercrunch opened a new pull request, #34642: URL: https://github.com/apache/superset/pull/34642
WiP ## Summary Implements Phase 1 of SQL expression validation for Apache Superset, providing a robust backend infrastructure that validates SQL expressions safely and efficiently. This feature adds validation capabilities to all SQL expression inputs across Superset, helping prevent syntax errors and improving user experience. ## Problem Currently, users can enter invalid SQL expressions in various places throughout Superset (calculated columns, custom metrics, filters, etc.) without any validation feedback until they try to execute queries. This leads to: - Poor user experience with late error detection - Potential security risks from malformed SQL - Risk of full table scans during validation attempts - Inconsistent error handling across different input locations ## Solution ### Backend Infrastructure ✅ Completed **Core Components:** 1. **BaseEngineSpec.validate_sql_expression()** - Database-agnostic validation method - Builds safe validation queries with `WHERE 0=1` predicate to prevent execution - Integrates dataset `fetch_values_predicate` to prevent full table scans - Provides complete validation service with sophisticated error extraction - Returns structured results with `is_valid` flag and cleaned error messages 2. **ValidateExpressionCommand** - Command pattern implementation - Delegates to engine specs for database-specific validation - Supports expression types: `column`, `metric`, `where`, `having` - Handles dataset context and table/schema/catalog resolution - Backward compatible with existing `filter` type 3. **New API Endpoint** - `/api/v1/database/{pk}/validate_expression/` - RESTful endpoint for validating SQL expressions - Comprehensive OpenAPI documentation - Proper error handling and response formatting 4. **Frontend Components** - Reusable validation UI components - `SqlExpressionValidator` - Core validation component with error display - `SQLEditorWithValidation` - Enhanced SQL editor with built-in validation - TypeScript-compliant with proper error handling ### Safety Features - **Prevents Full Scans**: Uses `WHERE 0=1` combined with dataset predicates - **No SQL Execution**: Validates syntax without running actual queries - **Error Cleanup**: Removes wrapper query artifacts from error messages - **Timeout Handling**: Respects `SQLLAB_VALIDATION_TIMEOUT` configuration - **Database Agnostic**: Works across all supported database engines ### Testing - **Comprehensive Backend Tests**: 11 unit tests covering all expression types, error scenarios, and edge cases - **Frontend Component Tests**: Full test coverage for validation UI components - **Type Safety**: All TypeScript compilation errors resolved - **Code Quality**: Passes mypy, ruff, eslint, and pre-commit hooks ## Architecture Benefits - **Extensible Design**: Future per-engine optimizations can be added easily - **Clean Separation**: Engine specs handle validation, commands orchestrate workflow - **Structured Errors**: Uses existing `SQLValidationAnnotation` for consistent error reporting - **Performance Focused**: Designed for sub-500ms validation response times ## Implementation Phases - ✅ **Phase 1**: Backend validation infrastructure (this PR) - 🔄 **Phase 2**: Frontend integration across all 6 SQL input locations - 📋 **Phase 3**: Enhanced error messages and performance monitoring ## Files Changed ### Backend - `superset/db_engine_specs/base.py` - Core validation implementation (+315 lines) - `superset/commands/database/validate_expression.py` - Command pattern (+133 lines) - `superset/databases/api.py` - New API endpoint (+65 lines) - `superset/databases/schemas.py` - Request/response schemas (+35 lines) ### Frontend - `src/components/SqlExpressionValidator/` - Reusable validation components - `src/components/SQLEditorWithValidation/` - Enhanced SQL editor - Integration updates in metric, filter, and column control components ### Tests - `tests/unit_tests/commands/databases/validate_expression_test.py` - Backend tests (+433 lines) - `src/components/*/**.test.tsx` - Frontend component tests ## Breaking Changes None. This is a purely additive feature that maintains backward compatibility. ## How to Test 1. **Backend API Testing**: ```bash # Start Superset development environment # Test the validation endpoint curl -X POST "http://localhost:8088/api/v1/database/1/validate_expression/" \ -H "Content-Type: application/json" \ -d '{"expression": "SELECT * FROM", "expression_type": "column"}' ``` 2. **Frontend Testing**: - Navigate to chart creation - Add custom SQL metrics or calculated columns - Validation button should appear with proper error feedback 3. **Unit Tests**: ```bash pytest tests/unit_tests/commands/databases/validate_expression_test.py -v npm run test src/components/SqlExpressionValidator/ ``` ## Future Work This PR establishes the foundation for comprehensive SQL expression validation. Phase 2 will integrate validation buttons across all SQL input locations in the Superset UI, providing users with immediate feedback on expression validity. --- **Ready for Review** ✅ - All tests passing - TypeScript compilation successful - Code quality checks passed - Comprehensive documentation included -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: notifications-unsubscr...@superset.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: notifications-unsubscr...@superset.apache.org For additional commands, e-mail: notifications-h...@superset.apache.org