andreahlert opened a new pull request, #12:
URL: https://github.com/apache/comdev/pull/12

   ## Problem
   
   `search_list` currently formats the first 30 emails of the backend response 
and prints `... and N more emails` for the rest. Because there is no `limit` or 
`offset` parameter, the remaining `N` entries are unreachable: calling 
`search_list` again just renders the same first 30. For LLM clients doing 
list-wide analysis (e.g. enumerating every thread on a list, walking a month of 
archive), this hides most of the data behind a hard wall.
   
   Concrete example I hit while writing this PR: 
`[email protected]` has 93 emails / 46 threads. Calling 
`search_list` returns 30; the other 63 emails are invisible. Working around it 
requires chaining many narrow `from:` / `subject:` queries hoping to cover the 
full set, with no guarantee of completeness.
   
   ## Solution
   
   Two optional parameters on `search_list`:
   
   - `limit` (int, 1..200, default 30) — caps the rendered window
   - `offset` (int, >= 0, default 0) — skips the first N entries
   
   Defaults preserve current behaviour. The backend call is unchanged; both 
parameters only affect which slice of the already-fetched result set is 
formatted into the response. No new API hits, no new auth surface, no new env 
vars.
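
   A minimal sketch of the windowing described above, assuming the fetched 
result set is a plain array. The helper name `sliceWindow` and the choice to 
clamp out-of-range values (rather than reject them) are illustrative 
assumptions, not necessarily what `index.js` does.

```javascript
// Hypothetical helper: names and clamping behaviour are assumptions.
const DEFAULT_LIMIT = 30;
const MAX_LIMIT = 200;

function sliceWindow(emails, { limit = DEFAULT_LIMIT, offset = 0 } = {}) {
  // Assumption: non-integer input falls back to the default,
  // out-of-range input is clamped into 1..200 / >= 0.
  const safeLimit = Number.isInteger(limit)
    ? Math.min(Math.max(limit, 1), MAX_LIMIT)
    : DEFAULT_LIMIT;
  const safeOffset = Number.isInteger(offset) ? Math.max(offset, 0) : 0;

  return {
    total: emails.length, // size of the full, already-fetched result set
    offset: safeOffset,
    items: emails.slice(safeOffset, safeOffset + safeLimit),
  };
}
```

   With a 93-entry result set, `sliceWindow(emails)` yields the first 30 
entries and `sliceWindow(emails, { limit: 30, offset: 30 })` yields entries 
31-60. No extra backend call is involved in either case.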
   
   The trailing summary is upgraded:
   
   - Before: `... and N more emails` (no way to reach them)
   - After: `Showing X-Y of Z.` plus `... K more emails. Re-query with offset=N 
to continue.` (deterministic pagination)
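
   A rough sketch of how the new footer could be derived from the window 
position. `formatFooter` is a hypothetical name, and the wording of the empty 
window case is assumed (the PR text only says "offset is past the end").

```javascript
// Hypothetical helper; wording follows the PR description above.
function formatFooter(total, offset, shown) {
  if (shown === 0) {
    // Assumed wording for a window that starts past the last entry.
    return `Showing 0 of ${total}. offset is past the end.`;
  }
  const first = offset + 1;         // 1-based index of first rendered email
  const last = offset + shown;      // 1-based index of last rendered email
  const remaining = total - last;   // emails still not rendered

  let footer = `Showing ${first}-${last} of ${total}.`;
  if (remaining > 0) {
    // `last` doubles as the offset of the next window.
    footer += ` ... ${remaining} more emails. Re-query with offset=${last} to continue.`;
  }
  return footer;
}
```

   For example, `formatFooter(93, 30, 30)` returns `Showing 31-60 of 93. ... 33 
more emails. Re-query with offset=60 to continue.`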
   
   ## Why now
   
   This is the first thing a long-running LLM session needs that the MCP server 
doesn't have. The original 30-cap is the right default for browse-style use, 
but offers no escape hatch for analysis-style use. The 200 ceiling on `limit` 
keeps the worst-case response size sane (~40 KB of summaries) while removing the 
"you can only ever see 30" wall.
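
   For analysis-style use, walking a whole list then comes down to following 
the footer hint until it disappears. A sketch of such a client loop, where 
`callSearchList` is a hypothetical stand-in for the MCP tool call and is assumed 
to resolve to the formatted text response:

```javascript
// Hypothetical client loop; `callSearchList` is an assumed async wrapper
// around the `search_list` tool that resolves to the response text.
async function fetchAllPages(callSearchList, args, limit = 200) {
  const pages = [];
  let offset = 0;

  while (true) {
    const text = await callSearchList({ ...args, limit, offset });
    pages.push(text);

    // The footer carries the next offset; no hint means the last page.
    const hint = /Re-query with offset=(\d+)/.exec(text);
    if (!hint) break;

    const next = Number(hint[1]);
    if (next <= offset) break; // guard against a hint that does not advance
    offset = next;
  }
  return pages;
}
```

   At `limit=200` the 93-email example list fits in a single call; at the 
default `limit=30` the same loop would make four calls.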
   
   ## Backwards compatibility
   
   - No schema field renamed or removed.
   - Calling without `limit`/`offset` produces the same first 30 as before; 
only the trailing text changes (`Showing 1-30 of 93.` plus the re-query hint, 
instead of `... and 63 more emails`).
   - No change to `get_email`, `get_thread`, `get_mbox`, or any auth/lifecycle 
tool.
   
   ## Test plan
   
   - [x] `npm test` (43 existing tests pass)
   - [x] `node --check index.js` (syntax clean)
   - [ ] Manual smoke test against `lists.apache.org`:
     - [ ] `search_list(list=dev-magpie, domain=airflow.apache.org)` — same 
first 30 as before, new footer line
     - [ ] `search_list(... limit=100)` — renders up to 93 (the full hit set)
     - [ ] `search_list(... limit=30, offset=30)` — renders emails 31-60, 
footer hints `offset=60`
     - [ ] `search_list(... offset=999)` — renders 0 emails, "offset is past 
the end" footer
   
   ## Notes for reviewers
   
   - I considered cursor-based pagination, but the backend `/api/stats.lua` 
endpoint returns the full set in one response, so an opaque cursor would just 
be a re-encoded numeric offset. Plain `offset` is simpler and equivalent.
   - The same cap pattern exists on `parts.slice(0, 15)` (participants) and 
`truncate(text, 8000/4000/10000)` (bodies / mbox). Not touched here — scope of 
this PR is just `search_list`. Happy to follow up with a second PR if the 
design lands well.
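
   To illustrate the cursor point in the first note: with the full result set 
already in hand, an opaque cursor would carry nothing beyond the offset. The 
helpers below are purely hypothetical and are not part of this PR.

```javascript
// Hypothetical cursor helpers, shown only to illustrate the equivalence.
function encodeCursor(offset) {
  // The "opaque" token is just the offset wrapped in JSON and base64.
  return Buffer.from(JSON.stringify({ offset })).toString('base64');
}

function decodeCursor(cursor) {
  const { offset } = JSON.parse(Buffer.from(cursor, 'base64').toString('utf8'));
  return offset;
}
```

   `decodeCursor(encodeCursor(60))` gives back `60`, so the cursor adds an 
encoding step without adding information.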


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


