andrewmusselman opened a new issue, #22:
URL: https://github.com/apache/tooling-gofannon/issues/22

   ### Summary
   When users declare an output schema with multiple keys, the composer LLM 
frequently generates agent code returning `{outputText: ...}` only, ignoring 
the declared schema. The runtime validator at 
`dependencies.py:validate_output_against_schema` correctly flags the mismatch 
as a warning, but the generated code itself is wrong — every run produces 
validation noise and surprises users who expected their schema to be honored.
    
   ### Details
   **Symptom:**
   - User declares output schema `{count: integer, ratio: number}`.
   - Generated `run()` function returns `{outputText: "Found 42 items"}`.
   - Runtime emits `"missing required output keys"` + `"unexpected output key 
outputText"` warnings.
   - User sees neither the count nor the ratio in their downstream consumer; 
agent appears broken.
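
The symptom can be reproduced in isolation. The sketch below is a hypothetical stand-in, not the actual `validate_output_against_schema` in `dependencies.py`: the function name `check_output_keys`, its signature, and the exact warning format are assumptions chosen to mirror the two warnings quoted above.

```python
def check_output_keys(output: dict, schema: dict) -> list:
    """Hypothetical key check; returns a list of warning strings."""
    warnings = []
    # Keys the schema declares but the agent did not return.
    missing = [key for key in schema if key not in output]
    if missing:
        warnings.append("missing required output keys: " + ", ".join(missing))
    # Keys the agent returned that the schema never declared.
    for key in output:
        if key not in schema:
            warnings.append(f"unexpected output key {key}")
    return warnings


declared = {"count": "integer", "ratio": "number"}
generated = {"outputText": "Found 42 items"}  # what the generated run() returns

warnings = check_output_keys(generated, declared)
for message in warnings:
    print(message)
```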

   **Root cause:**
   The composer prompt at `agent_factory/prompts.py` (around line 140, the 
`output_schema=output_schema_str` template binding) includes the schema in its 
instructions, but the LLM doesn't reliably follow it. An earlier fix, S2 (Runs 
page reads from saved agent doc), removed an upstream bug in which the runs page 
sent the *default* `{outputText: "string"}` schema as if it were the user's 
declared schema. The user's actual schema now reaches the composer prompt, yet 
the LLM still defaults to `{outputText}`.
    
   **Impact:**
   - Every generated agent with multi-key output is wrong on first generation.
   - User has to manually edit the `return` statement after generation.
   - Validation warnings clutter every run, masking real problems.
   ### Remediation
   Three angles, in increasing effort:
    
   1. **Strengthen the prompt.** Add few-shot examples of correct vs. incorrect 
returns, use more emphatic phrasing, and place the schema in the user message 
instead of the system prompt for higher signal. **Recommended starting point.**
   2. **Validate generated code post-hoc.** Parse the AST and check that the 
keys of each `return` statement match the schema keys; if they don't, 
regenerate or surface an error before saving.
   3. **Generate the return statement directly from the schema.** Emit a 
`return {key1: ..., key2: ...}` skeleton for the LLM to fill in, rather than 
asking for a free-form return.
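
Options 1 and 3 both come down to deriving prompt text from the declared schema. A minimal sketch, assuming the schema is available as a plain `dict` mapping key names to type names; the helper names `return_skeleton` and `few_shot_section` and the prompt wording are hypothetical and not taken from `prompts.py`:

```python
def return_skeleton(schema: dict) -> str:
    """Build a `return {...}` line with one placeholder per declared key."""
    fields = ", ".join(f'"{key}": ...' for key in schema)
    return "return {" + fields + "}"


def few_shot_section(schema: dict) -> str:
    """Render prompt text contrasting a correct and an incorrect return."""
    lines = [
        "The run() function MUST return exactly these keys:",
        *[f"  - {key} ({type_name})" for key, type_name in schema.items()],
        "",
        "Correct:",
        "    " + return_skeleton(schema),
        "",
        "Incorrect (ignores the declared schema):",
        '    return {"outputText": ...}',
    ]
    return "\n".join(lines)


schema = {"count": "integer", "ratio": "number"}
prompt_section = few_shot_section(schema)
print(prompt_section)
```

A real version would presumably need to skip the "incorrect" example when the declared schema is itself the single `outputText` key, since the two examples would then be identical.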
   ### Acceptance Criteria
   - [ ] Fixed: Composer prompt updated with few-shot examples for multi-key 
returns
   - [ ] Test added: Schemas with 1, 2, 3, and 5 output keys all produce 
matching `return` statements
   - [ ] Test added: Schemas using non-string types (int, list, dict) are 
honored in generated code
   - [ ] Eval added: Regression test comparing pre-fix and post-fix generation 
on 10 representative schemas
   - [ ] Documentation: prompt-engineering notes recorded in 
`agent_factory/prompts.py` comments
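
The two test criteria need a mechanical definition of a "matching" `return` statement. One possibility is the AST check from remediation option 2, sketched here with the standard-library `ast` module. The helper names, the `run(input_args)` signature in the sample sources, and the restriction to dict-literal returns are all assumptions for illustration:

```python
import ast


def returned_key_sets(source: str, func_name: str = "run") -> list:
    """Collect the string keys of every dict-literal `return` in func_name."""
    key_sets = []
    for node in ast.walk(ast.parse(source)):
        is_func = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if not (is_func and node.name == func_name):
            continue
        # Note: this also visits returns of functions nested inside func_name.
        for inner in ast.walk(node):
            if isinstance(inner, ast.Return) and isinstance(inner.value, ast.Dict):
                key_sets.append({
                    key.value
                    for key in inner.value.keys
                    if isinstance(key, ast.Constant) and isinstance(key.value, str)
                })
    return key_sets


def returns_match_schema(source: str, schema: dict) -> bool:
    """True only if at least one dict return exists and all match the schema keys."""
    key_sets = returned_key_sets(source)
    return bool(key_sets) and all(keys == set(schema) for keys in key_sets)


schema = {"count": "integer", "ratio": "number"}
bad_source = 'def run(input_args):\n    return {"outputText": "Found 42 items"}\n'
good_source = 'def run(input_args):\n    return {"count": 42, "ratio": 0.5}\n'

print(returns_match_schema(bad_source, schema))
print(returns_match_schema(good_source, schema))
```

This only inspects literal `return {...}` statements. Code that builds the dict in a variable and returns the variable yields no key sets and is therefore reported as a mismatch, which a stricter or looser policy might handle differently.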
   ### References
   - File: `webapp/packages/api/user-service/agent_factory/prompts.py:140`
   - File: 
`webapp/packages/api/user-service/dependencies.py:validate_output_against_schema`
   - Tracker: FIXES.md item #3
   ### Priority
   **High** - Visible on every multi-key-output agent run; affects perceived 
product quality immediately. Independent of other items; smallest of the 
high-priority fixes.



