cj-zhukov commented on PR #20474:
URL: https://github.com/apache/datafusion/pull/20474#issuecomment-3940458481

   ### High-Level Overview
   This PR introduces a validation script to prevent dangling configuration 
settings in sqllogictest (`.slt`) files.
   
   This PR Adds:
   - A new shell script `ci/scripts/check_slt_configs.sh` that scans all `.slt` 
files and detects DataFusion configuration options that are set to: `set 
datafusion.<config> = true` but are never reset back to: `set 
datafusion.<config> = false`
   - Updates to existing `.slt` files where dangling boolean configs were found.
   All identified configs have been explicitly reset to false to ensure files 
restore the default session state.
   - A new CI job: `sqllogictest-config-check` that runs the validation script 
and fails the workflow if any abandoned boolean configuration is detected.
   
   Current Limitations:
   - Only checks boolean flags set to `true` and id does not validate 
non-boolean configuration changes (e.g. default_catalog, default_schema, etc.)
   - Does not respect ordering: It only verifies that a matching = false exists 
somewhere in the file and It does not ensure the reset occurs after the 
corresponding = true.
   - Does not validate toggle correctness: it does not verify correct pairing, 
nesting, or the number of enable/disable occurrence
   
   Follow-Up Improvements:
   - Parse (`.slt`) files properly instead of relying on regex/grep.
   - Track configuration state transitions in order.
   - Validate correct pairing and restoration semantics.
   - Support non-boolean configuration values.
   This could be implemented in Rust using a proper parser (e.g., `nom`), 
similar to what is already used in `datafusion-examples`.
   I’m happy to work on a follow-up PR to evolve this validation into a more 
robust and structured implementation.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to