Hi Vivek,
That's an interesting challenge with functions. Currently there really
isn't any way to get a list of all the available functions in the
system.
Even the list of functions in BuiltinFunctions.java isn't entirely
complete. Functions can also be added early on in bootstrap, which is
what the geospatial functions do.
It seems like having some kind of datasource function that would
return a list of functions, arity, and type restrictions would be
useful. The type part might be kind of hard; I can't think of how that
would work exactly off the top of my head. Arity would certainly be
very simple though.
One other complication is that some functions aren't really designed
to be called in a query. These are used internally by some kinds of
aggregates. So those would probably need to be filtered out by default
at least.
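
To make the shape of that concrete, here is a rough sketch in Python
(since that is what the MCP server is written in) of what a client
might receive and how the default filtering could behave. Everything
in it is an assumption for illustration: the FunctionInfo record, the
"internal" flag, the use of -1 for variadic arity, and the sample
function names are all made up, and none of this exists in AsterixDB
today.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class FunctionInfo:
    """Hypothetical catalog row: one entry per (name, arity) pair."""
    name: str
    arity: int              # assumption: -1 stands for a variadic function
    internal: bool = False  # assumption: marks functions not meant for queries


def list_functions(catalog: Iterable[FunctionInfo],
                   include_internal: bool = False) -> List[FunctionInfo]:
    """Return catalog rows sorted by name and arity.

    Internal functions (e.g. helpers used by aggregates) are dropped
    unless the caller explicitly asks for them.
    """
    rows = [f for f in catalog if include_internal or not f.internal]
    return sorted(rows, key=lambda f: (f.name, f.arity))


# Made-up sample rows, not real AsterixDB function names.
SAMPLE_CATALOG = [
    FunctionInfo("example-scalar", 1),
    FunctionInfo("example-concat", -1),
    FunctionInfo("example-local-agg", 1, internal=True),
]

if __name__ == "__main__":
    for f in list_functions(SAMPLE_CATALOG):
        print(f.name, f.arity)
```

Type restrictions are deliberately left out of the record, since how
to express them is the open part.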

- Ian

On Wed, Jun 3, 2026 at 1:21 AM Vivek Gangavarapu
<[email protected]> wrote:
>
> Hi all,
>
> I'd like to share an AsterixDB MCP Server I've been building as part of
> GSoC, a gateway that lets an LLM run read-only SQL++ against a live
> AsterixDB cluster.
>
> *Repo (fork & test welcome):*
> https://github.com/Vivek1106-04/asterixdb-mcp-server
>
> *Run it locally:*
>   git clone https://github.com/Vivek1106-04/asterixdb-mcp-server.git
>   cd asterixdb-mcp-server
>   python3 -m venv .venv
>   source .venv/bin/activate
>   pip install -e ".[dev]"
>   export ASTERIXDB_MCP_CC_BASE_URL=http://localhost:19002   # your CC
>   asterixdb-mcp-server   # serves over stdio
>
>   Connect a client — for Claude Desktop, add to claude_desktop_config.json:
>
>   {
>     "mcpServers": {
>       "asterixdb": {
>         "command": "/absolute/path/to/.venv/bin/asterixdb-mcp-server",
>         "env": { "ASTERIXDB_MCP_CC_BASE_URL": "http://localhost:19002" }
>       }
>     }
>   }
>
> Any MCP-capable client works the same way.  Full capability list, config,
> and security model are in the README and SECURITY.md.
>
> *Design Question — Built-in Functions Discovery:*
>
> I have hit one architectural roadblock regarding the list_functions tool
> that I wanted to get your clarity on. While reading through the
> documentation and source code to build the tool, I found out that AsterixDB
> has around 800 built-in functions.
>
> Currently, the tool successfully filters UDFs from Metadata.Function. For
> the built-ins, however, I am using a curated starter set of about 45
> functions (simple working code to test with the LLM).
>
> To give the LLM the full analytical power of the cluster, I am trying to
> figure out the best way for the MCP gateway to handle all 800 functions
> without building a heavy runtime Java-to-Python codegen pipeline.
>
> My proposed solution is to write a standalone Python extraction script that
> parses the AsterixDB source locally, generates a static
> builtin_functions.json catalog, and bundles it with the MCP package to be
> loaded into memory on server startup.
>
> Is this static approach a good direction, or should we consider other
> methods for exposing built-in function signatures to external clients?
>
> Feedback, bug reports, and thoughts on the functions catalog are very
> welcome!
>
> Thanks,
>
> Vivek
