davidzollo opened a new issue, #10651: URL: https://github.com/apache/seatunnel/issues/10651
## Background

SeaTunnel already has much of the foundation needed for AI-assisted task config generation:

1. `PluginDiscovery.getPlugins()` can expose installed plugins together with `OptionRule`.
2. `ConfigBuilder` can parse HOCON / JSON, apply variable substitution, and keep unresolved placeholders.
3. `ConfigParserUtil.checkGraph` and `ConfigValidator` already provide graph and option-rule validation capabilities.

However, SeaTunnel still lacks the minimum closed loop required to make AI-generated task configs accurate and reliable:

1. `seatunnel.sh --check` is still not implemented (`SeaTunnelConfValidateCommand` is TODO).
2. Connector metadata exposed by the current connector-check path is flattened into required / optional options and loses the full rule structure.
3. Validation failures are mostly plain exception text and are not machine-readable.

As a result, AI-generated configs cannot yet be reliably constrained by real connector metadata or validated through an official reusable path.

## Goal

Enable AI to generate accurate SeaTunnel task config files by using real connector metadata as hard constraints and validating the final generated HOCON through an official dry-run path.

The AI output should be the final SeaTunnel HOCON itself, not an intermediate DSL.

## Proposal

Build an initial end-to-end flow for AI-assisted config generation:

1. Select connectors only from currently installed plugins.
2. Export full connector metadata from real `OptionRule` / `Option` / `SingleChoiceOption`.
3. Generate SeaTunnel HOCON under metadata constraints.
4. Validate the generated HOCON through a reusable config-level dry-run validation path.
5. Return structured validation results and errors.

## Scope

This issue includes:

1. Implement `SeaTunnelConfValidateCommand` as the official validation entry.
2. Add a reusable `JobConfigValidator`.
3. Add structured `ValidationResult` / `ValidationError`.
4. Enhance connector metadata export so full `OptionRule` can be returned as JSON.
5. Add an initial AI-assisted generation entry that accepts a natural-language prompt and returns generated HOCON together with validation output.

Metadata export must preserve the real rule structure, including:

1. Required rules.
2. Exclusive rules.
3. Bundled rules.
4. Conditional rules with the full `Expression` / `Condition` tree.
5. `SingleChoiceOption` candidate values.

## Accuracy Constraints

Accuracy should come from hard constraints rather than unconstrained prompting:

1. Connector names must come from installed plugins only.
2. Option keys must come from exported metadata only.
3. Enum values must come from `SingleChoiceOption` only.
4. Validation must run against the final generated HOCON.

## Non-goals

This issue does not include:

1. Binding to a specific LLM vendor SDK.
2. Full Web / REST / UI integration.
3. Full factory semantic validation for every connector.
4. Schema discovery or external system probing for all connectors.
5. Introducing a new intermediate DSL, `JobIntent`, or a custom renderer.

## Acceptance Criteria

1. SeaTunnel provides an official validation entry via `seatunnel.sh --check -c job.conf`.
2. Validation output is structured and machine-readable.
3. Connector metadata can be exported as JSON with full `OptionRule` structure.
4. AI generation is constrained by installed connectors and exported metadata.
5. The generated result is final SeaTunnel HOCON and is returned together with validation output.
6. Conditional rules preserve the full `Expression` / `Condition` tree instead of being flattened.

## Note

The validator in this issue should be explicitly positioned as reusable config-level dry-run validation. The current runtime parse path may instantiate plugin implementations, so this issue should avoid over-claiming full runtime-equivalent validation in the first step.
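## Illustrative sketch (non-normative)

To make the accuracy constraints and the structured validation output more concrete, here is a minimal Java sketch of how option keys and enum values might be checked against exported metadata, with failures reported as machine-readable errors. Everything in it is an assumption made for illustration: the fields of `ValidationResult` / `ValidationError`, the error codes, the simplified `ConnectorMeta` stand-in (required rules and choice candidates only), and the `DemoSink` connector with its `url` / `mode` options. None of these shapes are defined by this issue or by existing SeaTunnel code, and exclusive, bundled, and conditional rules are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: these shapes are illustrative, not SeaTunnel's actual API.
final class ValidationSketch {

    /** One machine-readable failure: a stable code, the config path, and a message. */
    static final class ValidationError {
        final String code;
        final String path;
        final String message;

        ValidationError(String code, String path, String message) {
            this.code = code;
            this.path = path;
            this.message = message;
        }

        String toJson() {
            return "{\"code\":\"" + esc(code) + "\",\"path\":\"" + esc(path)
                    + "\",\"message\":\"" + esc(message) + "\"}";
        }
    }

    /** Aggregated outcome; valid only when no errors were collected. */
    static final class ValidationResult {
        final List<ValidationError> errors = new ArrayList<>();

        boolean isValid() {
            return errors.isEmpty();
        }

        String toJson() {
            StringBuilder sb = new StringBuilder("{\"valid\":" + isValid() + ",\"errors\":[");
            for (int i = 0; i < errors.size(); i++) {
                if (i > 0) sb.append(',');
                sb.append(errors.get(i).toJson());
            }
            return sb.append("]}").toString();
        }
    }

    /** Simplified stand-in for exported connector metadata (assumed shape). */
    static final class ConnectorMeta {
        final Set<String> allowedKeys = new HashSet<>();
        final Set<String> requiredKeys = new HashSet<>();
        final Map<String, Set<String>> choices = new HashMap<>();
    }

    /** Escapes backslashes and double quotes so values can be embedded in JSON strings. */
    static String esc(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Checks one connector block against the metadata of installed connectors. */
    static ValidationResult check(String connector, Map<String, String> options,
                                  Map<String, ConnectorMeta> installed) {
        ValidationResult result = new ValidationResult();
        ConnectorMeta meta = installed.get(connector);
        if (meta == null) {
            result.errors.add(new ValidationError(
                    "UNKNOWN_CONNECTOR", connector, "connector is not installed"));
            return result;
        }
        for (String required : meta.requiredKeys) {
            if (!options.containsKey(required)) {
                result.errors.add(new ValidationError(
                        "MISSING_REQUIRED", connector + "." + required, "required option is missing"));
            }
        }
        for (Map.Entry<String, String> e : options.entrySet()) {
            String path = connector + "." + e.getKey();
            Set<String> candidates = meta.choices.get(e.getKey());
            if (!meta.allowedKeys.contains(e.getKey())) {
                result.errors.add(new ValidationError(
                        "UNKNOWN_OPTION", path, "option key is not in exported metadata"));
            } else if (candidates != null && !candidates.contains(e.getValue())) {
                result.errors.add(new ValidationError(
                        "INVALID_CHOICE", path, "value is not one of " + candidates));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ConnectorMeta meta = new ConnectorMeta();            // hypothetical connector metadata
        meta.allowedKeys.addAll(Arrays.asList("url", "mode"));
        meta.requiredKeys.add("url");
        meta.choices.put("mode", new HashSet<>(Arrays.asList("append", "overwrite")));
        Map<String, ConnectorMeta> installed = new HashMap<>();
        installed.put("DemoSink", meta);

        Map<String, String> generated = new LinkedHashMap<>();
        generated.put("mode", "upsert");                      // not a candidate value
        generated.put("table", "t1");                         // not an exported key
        System.out.println(check("DemoSink", generated, installed).toJson());
    }
}
```

Collecting every failure instead of stopping at the first exception is the design choice this sketch is meant to show: a generation loop can then feed the full error list, with codes and config paths, back into the next attempt.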
