mistercrunch opened a new pull request, #34642:
URL: https://github.com/apache/superset/pull/34642

   WiP
   
   ## Summary
   
   Implements Phase 1 of SQL expression validation for Apache Superset, 
providing a robust backend infrastructure that validates SQL expressions safely 
and efficiently. This feature adds validation capabilities to all SQL 
expression inputs across Superset, helping prevent syntax errors and improving 
user experience.
   
   ## Problem
   
   Currently, users can enter invalid SQL expressions in various places 
throughout Superset (calculated columns, custom metrics, filters, etc.) without 
any validation feedback until they try to execute queries. This leads to:
   
   - Poor user experience with late error detection
   - Potential security risks from malformed SQL
   - Risk of full table scans during validation attempts
   - Inconsistent error handling across different input locations
   
   ## Solution
   
   ### Backend Infrastructure ✅ Completed
   
   **Core Components:**
   
   1. **BaseEngineSpec.validate_sql_expression()** - Database-agnostic 
validation method
      - Builds safe validation queries with `WHERE 0=1` predicate to prevent 
execution
      - Integrates dataset `fetch_values_predicate` to prevent full table scans
      - Provides complete validation service with sophisticated error extraction
      - Returns structured results with `is_valid` flag and cleaned error 
messages
   
   2. **ValidateExpressionCommand** - Command pattern implementation
      - Delegates to engine specs for database-specific validation
      - Supports expression types: `column`, `metric`, `where`, `having` 
      - Handles dataset context and table/schema/catalog resolution
      - Backward compatible with existing `filter` type
   
   3. **New API Endpoint** - `/api/v1/database/{pk}/validate_expression/`
      - RESTful endpoint for validating SQL expressions
      - Comprehensive OpenAPI documentation
      - Proper error handling and response formatting
   
   4. **Frontend Components** - Reusable validation UI components
      - `SqlExpressionValidator` - Core validation component with error display
      - `SQLEditorWithValidation` - Enhanced SQL editor with built-in validation
      - TypeScript-compliant with proper error handling
   
   ### Safety Features
   
   - **Prevents Full Scans**: Uses `WHERE 0=1` combined with dataset predicates
   - **No SQL Execution**: Validates syntax without running actual queries  
   - **Error Cleanup**: Removes wrapper query artifacts from error messages
   - **Timeout Handling**: Respects `SQLLAB_VALIDATION_TIMEOUT` configuration
   - **Database Agnostic**: Works across all supported database engines
   
   ### Testing
   
   - **Comprehensive Backend Tests**: 11 unit tests covering all expression 
types, error scenarios, and edge cases
   - **Frontend Component Tests**: Full test coverage for validation UI 
components
   - **Type Safety**: All TypeScript compilation errors resolved
   - **Code Quality**: Passes mypy, ruff, eslint, and pre-commit hooks
   
   ## Architecture Benefits
   
   - **Extensible Design**: Future per-engine optimizations can be added easily
   - **Clean Separation**: Engine specs handle validation, commands orchestrate 
workflow
   - **Structured Errors**: Uses existing `SQLValidationAnnotation` for 
consistent error reporting
   - **Performance Focused**: Designed for sub-500ms validation response times
   
   ## Implementation Phases
   
   - ✅ **Phase 1**: Backend validation infrastructure (this PR)
   - 🔄 **Phase 2**: Frontend integration across all 6 SQL input locations  
   - 📋 **Phase 3**: Enhanced error messages and performance monitoring
   
   ## Files Changed
   
   ### Backend
   - `superset/db_engine_specs/base.py` - Core validation implementation (+315 
lines)
   - `superset/commands/database/validate_expression.py` - Command pattern 
(+133 lines) 
   - `superset/databases/api.py` - New API endpoint (+65 lines)
   - `superset/databases/schemas.py` - Request/response schemas (+35 lines)
   
   ### Frontend  
   - `src/components/SqlExpressionValidator/` - Reusable validation components
   - `src/components/SQLEditorWithValidation/` - Enhanced SQL editor
   - Integration updates in metric, filter, and column control components
   
   ### Tests
   - `tests/unit_tests/commands/databases/validate_expression_test.py` - 
Backend tests (+433 lines)
   - `src/components/*/**.test.tsx` - Frontend component tests
   
   ## Breaking Changes
   
   None. This is a purely additive feature that maintains backward 
compatibility.
   
   ## How to Test
   
   1. **Backend API Testing**:
      ```bash
      # Start Superset development environment
      # Test the validation endpoint
      curl -X POST 
"http://localhost:8088/api/v1/database/1/validate_expression/"; \
           -H "Content-Type: application/json" \
           -d '{"expression": "SELECT * FROM", "expression_type": "column"}'
      ```
   
   2. **Frontend Testing**: 
      - Navigate to chart creation
      - Add custom SQL metrics or calculated columns  
      - Validation button should appear with proper error feedback
   
   3. **Unit Tests**:
      ```bash
      pytest tests/unit_tests/commands/databases/validate_expression_test.py -v
      npm run test src/components/SqlExpressionValidator/
      ```
   
   ## Future Work
   
   This PR establishes the foundation for comprehensive SQL expression 
validation. Phase 2 will integrate validation buttons across all SQL input 
locations in the Superset UI, providing users with immediate feedback on 
expression validity.
   
   ---
   
   **Ready for Review** ✅
   - All tests passing
   - TypeScript compilation successful  
   - Code quality checks passed
   - Comprehensive documentation included


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscr...@superset.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscr...@superset.apache.org
For additional commands, e-mail: notifications-h...@superset.apache.org

Reply via email to