davidzollo opened a new issue, #10651:
URL: https://github.com/apache/seatunnel/issues/10651

   ## Background
   
   SeaTunnel already has much of the foundation needed for AI-assisted task 
config generation:
   
   1. `PluginDiscovery.getPlugins()` can expose installed plugins together with 
`OptionRule`.
   2. `ConfigBuilder` can parse HOCON / JSON, apply variable substitution, and 
keep unresolved placeholders.
   3. `ConfigParserUtil.checkGraph` and `ConfigValidator` already provide graph 
and option-rule validation capabilities.
   
   However, SeaTunnel still lacks the minimum closed loop required to make 
AI-generated task configs accurate and reliable:
   
   1. `seatunnel.sh --check` is not implemented yet 
(`SeaTunnelConfValidateCommand` is still a TODO).
   2. Connector metadata exposed by the current connector-check path is 
flattened into required / optional options and loses the full rule structure.
   3. Validation failures surface mostly as plain exception text, which is 
not machine-readable.
   
   As a result, AI-generated configs cannot yet be reliably constrained by real 
connector metadata or validated through an official reusable path.
   
   ## Goal
   
   Enable AI to generate accurate SeaTunnel task config files by using real 
connector metadata as hard constraints and validating the final generated HOCON 
through an official dry-run path.
   
   The AI output should be the final SeaTunnel HOCON itself, not an 
intermediate DSL.
   
   ## Proposal
   
   Build an initial end-to-end flow for AI-assisted config generation:
   
   1. Select connectors only from currently installed plugins.
   2. Export full connector metadata from real `OptionRule` / `Option` / 
`SingleChoiceOption`.
   3. Generate SeaTunnel HOCON under metadata constraints.
   4. Validate the generated HOCON through a reusable config-level dry-run 
validation path.
   5. Return structured validation results and errors.
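
The five steps above can be sketched as a single pass. This is a minimal, language-neutral illustration in Python, not SeaTunnel code: `generate_config`, `fake_generate`, `fake_validate`, the `NO_SOURCE` error code, and the `ExampleSource` connector name are all hypothetical stand-ins for the real generator and validator.

```python
from typing import Callable, Dict, List, Tuple


def generate_config(
    prompt: str,
    installed: Dict[str, dict],
    generate: Callable[[str, Dict[str, dict]], str],
    validate: Callable[[str], List[dict]],
) -> Tuple[str, List[dict]]:
    """Run the flow once and return (hocon_text, validation_errors)."""
    # Steps 1-2: the generator only sees installed connectors and their metadata.
    hocon = generate(prompt, installed)  # step 3: produce the final HOCON text
    errors = validate(hocon)             # step 4: dry-run check on that text
    return hocon, errors                 # step 5: structured result


def fake_generate(prompt: str, installed: Dict[str, dict]) -> str:
    # Hypothetical stand-in for an LLM call: picks the first installed connector.
    name = sorted(installed)[0]
    return f"source {{ {name} {{ }} }}"


def fake_validate(hocon: str) -> List[dict]:
    # Hypothetical stand-in for the dry-run validator.
    if "source" in hocon:
        return []
    return [{"code": "NO_SOURCE", "message": "missing source block"}]


hocon, errors = generate_config(
    "copy data", {"ExampleSource": {}}, fake_generate, fake_validate
)
print(hocon)
print(errors)
```

Passing the generator and validator in as callables keeps the flow independent of any LLM vendor SDK, in line with the non-goals of this issue.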
   
   ## Scope
   
   This issue includes:
   
   1. Implement `SeaTunnelConfValidateCommand` as the official validation entry.
   2. Add a reusable `JobConfigValidator`.
   3. Add structured `ValidationResult` / `ValidationError`.
   4. Enhance connector metadata export so full `OptionRule` can be returned as 
JSON.
   5. Add an initial AI-assisted generation entry that accepts a 
natural-language prompt and returns generated HOCON together with validation 
output.
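
One possible shape for the structured `ValidationResult` / `ValidationError` in item 3 is sketched below in Python, purely to illustrate the JSON output; the real classes would be Java types inside SeaTunnel. The field names (`code`, `path`, `message`, `valid`) and the `MISSING_REQUIRED_OPTION` code are assumptions, not something this issue defines.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class ValidationError:
    # All field names are hypothetical; the issue only asks for errors that
    # are structured and machine-readable.
    code: str     # stable identifier, e.g. "MISSING_REQUIRED_OPTION"
    path: str     # location inside the job config, e.g. "source[0].url"
    message: str  # human-readable explanation


@dataclass
class ValidationResult:
    errors: List[ValidationError] = field(default_factory=list)

    @property
    def valid(self) -> bool:
        # A result is valid exactly when no errors were recorded.
        return not self.errors

    def to_json(self) -> str:
        payload = {"valid": self.valid, "errors": [asdict(e) for e in self.errors]}
        return json.dumps(payload, indent=2)


result = ValidationResult()
result.errors.append(
    ValidationError("MISSING_REQUIRED_OPTION", "source[0].url", "option 'url' is required")
)
print(result.to_json())
```

A stable `code` plus a config `path` lets a caller react to specific failures without parsing exception text.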
   
   Metadata export must preserve the real rule structure, including:
   
   1. Required rules.
   2. Exclusive rules.
   3. Bundled rules.
   4. Conditional rules with the full `Expression` / `Condition` tree.
   5. `SingleChoiceOption` candidate values.
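
To illustrate what "preserving the rule structure" could look like, here is a hypothetical JSON export for a made-up connector, written as a Python sketch. The key names (`kind`, `options`, `expression`, `op`, `operands`, `choices`) and the connector `ExampleSource` are assumptions, and the and/or tree is a simplification rather than the actual `Expression` / `Condition` classes.

```python
import json

# Hypothetical export for one made-up connector. The key names are
# illustrative and do not describe an existing SeaTunnel format.
metadata = {
    "plugin": "ExampleSource",
    "options": {
        "url": {"type": "string"},
        "mode": {"type": "enum", "choices": ["batch", "stream"]},
        "incremental": {"type": "boolean"},
        "username": {"type": "string"},
        "password": {"type": "string"},
        "token": {"type": "string"},
        "poll_interval": {"type": "int"},
    },
    "rules": [
        {"kind": "required", "options": ["url"]},
        {"kind": "exclusive", "options": ["password", "token"]},
        {"kind": "bundled", "options": ["username", "password"]},
        {
            "kind": "conditional",
            "options": ["poll_interval"],
            # Nested tree: mode == stream OR (mode == batch AND incremental == true)
            "expression": {
                "op": "or",
                "operands": [
                    {"option": "mode", "equals": "stream"},
                    {
                        "op": "and",
                        "operands": [
                            {"option": "mode", "equals": "batch"},
                            {"option": "incremental", "equals": True},
                        ],
                    },
                ],
            },
        },
    ],
}


def evaluate(expr, config):
    """Evaluate an and/or expression tree against a flat option map."""
    if "op" in expr:
        results = [evaluate(child, config) for child in expr["operands"]]
        return all(results) if expr["op"] == "and" else any(results)
    # Leaf node: compare one option against its expected value.
    return config.get(expr["option"]) == expr["equals"]


conditional = metadata["rules"][3]
print(json.dumps(conditional, indent=2))
print(evaluate(conditional["expression"], {"mode": "batch", "incremental": True}))
```

A flattened required / optional list could not express the nested `or` / `and` above, which is why the tree has to survive the export.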
   
   ## Accuracy Constraints
   
   Accuracy should come from hard constraints rather than unconstrained 
prompting:
   
   1. Connector names must come from installed plugins only.
   2. Option keys must come from exported metadata only.
   3. Enum values must come from `SingleChoiceOption` only.
   4. Validation must run against the final generated HOCON.
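
Constraints 1 to 3 amount to membership checks against the exported metadata. The Python sketch below assumes a hypothetical metadata map (`INSTALLED`), a made-up connector `ExampleSource`, and a hypothetical helper `check_constraints`; it operates on an already-parsed option map, so constraint 4 (parsing and validating the final HOCON) is outside its scope.

```python
from typing import Any, Dict, List

# Hypothetical exported metadata, keyed by installed connector name.
INSTALLED: Dict[str, Dict[str, Any]] = {
    "ExampleSource": {
        "url": {"type": "string"},
        "mode": {"type": "enum", "choices": ["batch", "stream"]},
    }
}


def check_constraints(plugin: str, options: Dict[str, Any]) -> List[str]:
    """Return a list of constraint violations (empty when all checks pass)."""
    # Constraint 1: the connector must be an installed plugin.
    if plugin not in INSTALLED:
        return [f"unknown connector: {plugin}"]
    known = INSTALLED[plugin]
    problems = []
    for key, value in options.items():
        # Constraint 2: the option key must exist in the exported metadata.
        if key not in known:
            problems.append(f"unknown option: {plugin}.{key}")
            continue
        # Constraint 3: enum values must be one of the listed choices.
        choices = known[key].get("choices")
        if choices is not None and value not in choices:
            problems.append(f"invalid value for {plugin}.{key}: {value!r}")
    return problems


print(check_constraints("ExampleSource", {"mode": "fast", "host": "h"}))
```

Rejecting unknown names outright, rather than guessing a close match, is what makes these constraints hard rather than advisory.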
   
   ## Non-goals
   
   This issue does not include:
   
   1. Binding to a specific LLM vendor SDK.
   2. Full Web / REST / UI integration.
   3. Full factory semantic validation for every connector.
   4. Schema discovery or external system probing for all connectors.
   5. Introducing a new intermediate DSL, `JobIntent`, or a custom renderer.
   
   ## Acceptance Criteria
   
   1. SeaTunnel provides an official validation entry via 
`seatunnel.sh --check -c job.conf`.
   2. Validation output is structured and machine-readable.
   3. Connector metadata can be exported as JSON with full `OptionRule` 
structure.
   4. AI generation is constrained by installed connectors and exported 
metadata.
   5. The generated result is final SeaTunnel HOCON and is returned together 
with validation output.
   6. Conditional rules preserve the full `Expression` / `Condition` tree 
instead of being flattened.
   
   ## Note
   
   The validator in this issue should be explicitly positioned as reusable 
config-level dry-run validation.
   
   The current runtime parse path may instantiate plugin implementations, so 
in its first step this issue should not claim that the dry-run validation is 
equivalent to full runtime validation.
   

