Re: [DISCUSS] [ASTERIXDB-3695] A Model Context Protocol (MCP) server

Toby Lehman Thu, 28 May 2026 08:56:57 -0700

Hey … how much AI are you guys employing in either design or code writing?
I’m wondering if AI can actually help yet with complex applications.


TJL

On Wed, May 27, 2026 at 11:26 PM Mike Carey <[email protected]> wrote:

> Wow - sounds like great progress so far - very nice!
>
> Cheers,
>
> Mike
>
> On 5/27/26 11:01 PM, Vivek Gangavarapu wrote:
> > Hello all,
> >
> > A short update on the MCP server work under ASTERIXDB-3695, so the dev
> > list knows what's coming over the next few weeks.
> >
> > The goal is an MCP server for AsterixDB. MCP (Model Context Protocol) is
> > the emerging standard that lets LLMs and AI agents — Claude Desktop, IDE
> > assistants, and similar clients — call external tools in a structured
> > way. The server lets one of these agents explore an AsterixDB instance
> > and run queries against it, without anyone writing custom glue for each
> > client. It runs as a separate sidecar process and talks to the cluster
> > over the existing HTTP API, so nothing inside AsterixDB itself has to
> > change.
> >
> > A few principles are shaping the design. The server is strictly
> > read-only — every query is sent with readonly=true, so the CC itself
> > rejects anything that isn't a query; there are no writes and no DDL. The
> > cluster stays the single source of truth: the gateway keeps no query
> > state of its own, so it can't drift out of sync with the CC. Output is
> > bounded by default, with time and size limits on every query, so an
> > agent can't accidentally pull an unbounded result into its context
> > window. And the tooling is columnar-aware — when it reports a dataset's
> > schema it also reports whether the dataset is row- or column-formatted,
> > so the agent can reason about access patterns instead of guessing.
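
A minimal Python sketch of the read-only and bounded-output principles
described above. Only readonly=true comes from the message itself; the
"timeout" field name, the default limits, and the helper names
build_query_request and bound_results are assumptions made for
illustration, not the prototype's actual code.

```python
import json

# Hypothetical defaults; the thread does not state the real limits.
DEFAULT_TIMEOUT = "10s"
DEFAULT_MAX_ROWS = 100
DEFAULT_MAX_BYTES = 32_000


def build_query_request(statement, timeout=DEFAULT_TIMEOUT):
    """Build the form fields for a read-only query request."""
    return {
        "statement": statement,
        "readonly": "true",  # from the thread: the CC rejects non-queries
        "timeout": timeout,  # assumed name for the server-side time limit
    }


def bound_results(rows, max_rows=DEFAULT_MAX_ROWS, max_bytes=DEFAULT_MAX_BYTES):
    """Keep rows until the row cap or the serialized-size cap is reached."""
    kept, used = [], 0
    for row in rows:
        size = len(json.dumps(row).encode("utf-8"))
        if len(kept) >= max_rows or used + size > max_bytes:
            # Tell the agent the result was cut instead of failing silently.
            return {"rows": kept, "truncated": True}
        kept.append(row)
        used += size
    return {"rows": kept, "truncated": False}
```

Returning an explicit "truncated" flag lets the agent decide whether to
narrow the query rather than assume it has seen the whole result.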
> >
> > Beyond running queries, a big part of the value is giving the agent
> > accurate context so it stops guessing. Two directions here. First,
> > functions: the plan is to expose the built-in function catalog along with
> > full signatures — argument counts, types, return shape — plus any
> > user-defined functions read from the metadata, so the agent uses real
> > function names and arities instead of inventing them. Second, query
> > plans: the query service can already return the optimized logical plan
> > and the Hyracks job (the same ones the web UI visualizes), and feeding
> > those back to the agent as context is a genuinely interesting lever — it
> > lets the model see how its query was actually compiled and executed,
> > reason about index usage, and refine the query rather than guessing
> > blindly. That plan/job introspection is one of the things I'm most keen
> > to build out.
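
A rough Python sketch of the plan-introspection idea above: ask for the
optimized logical plan and the Hyracks job alongside a read-only query,
then trim them before handing them to the model. The request field names
("optimized-logical-plan", "job", "plan-format"), the "plans" response key
and its sub-keys, and both helper names are assumptions for illustration
and should be checked against the query service documentation.

```python
import json


def build_plan_request(statement):
    """Form fields asking for plans along with a read-only query."""
    return {
        "statement": statement,
        "readonly": "true",                # from the thread
        "optimized-logical-plan": "true",  # assumed parameter name
        "job": "true",                     # assumed parameter name
        "plan-format": "json",             # assumed parameter name
    }


def plan_context(response, max_chars=4000):
    """Extract plan text from a response dict, capped per plan."""
    plans = response.get("plans", {})  # assumed response layout
    context = {}
    for key in ("optimizedLogicalPlan", "job"):
        if key in plans:
            text = json.dumps(plans[key], sort_keys=True)
            context[key] = text[:max_chars]  # keep the context window bounded
    return context
```

Capping each plan separately keeps a large job description from crowding
out the logical plan, which is usually the part that shows index usage.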
> >
> > On where it stands: there's an early prototype working end-to-end
> > against a live cluster. For development and testing I'm driving it with
> > a local Ollama model (a small quantized 7B), so the full loop —
> > natural-language request to tool call to executed SQL++ and back — can
> > be exercised offline without depending on a hosted model. An LLM client
> > can already list the datasets in an instance, inspect a chosen dataset's
> > schema, and run a query through the server and get results back; that
> > path is proven, not just mocked. That first slice covers query execution
> > plus schema and dataset discovery. The next slices build on it:
> > asynchronous handling for long-running queries, the function catalog and
> > plan/index introspection above, and the remaining safety guardrails.
> > It's written in Python on the official MCP SDK; I'll share the repo and
> > a README once the layout settles.
> >
> > Early feedback is very welcome — especially from people who know the
> > query service and metadata internals well, since that's where I most
> > want to avoid wrong assumptions. Happy to take questions.
> >
> > Thanks,
> >   Vivek
> >
