Nice! One thought: we might need the function metadata to
provide some sort of hint about the function's latency, since the CBO
will clearly have to be extended to handle expensive functions. I've been
using some of the callable functions from Unstructured.io recently, and
those can take (many) seconds per call...! (They perform operations
like converting a page of a PDF into a canonical JSON extract.) I imagine
that LLM-backed functions will have similar latency
while they are "thinking".
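To make the latency-hint idea a bit more concrete, here is a rough sketch of how an optimizer could use such a hint when ordering filter predicates. It is not AsterixDB code and not a proposal for the APE's actual metadata format: the names PredicateInfo, latency_hint_s, and order_predicates are all hypothetical, and the ordering rule is the textbook "rank" heuristic for expensive predicates, (selectivity - 1) / cost, applied in ascending order.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PredicateInfo:
    """Hypothetical optimizer-side view of one filter predicate."""
    name: str
    selectivity: float      # estimated fraction of tuples that pass, in [0, 1]
    latency_hint_s: float   # assumed per-call latency hint from function metadata


def rank(p: PredicateInfo) -> float:
    """Rank heuristic: lower (more negative) means 'evaluate earlier'."""
    if p.latency_hint_s <= 0:
        raise ValueError("latency hint must be positive")
    return (p.selectivity - 1.0) / p.latency_hint_s


def order_predicates(preds: List[PredicateInfo]) -> List[PredicateInfo]:
    """Return the predicates sorted by ascending rank."""
    return sorted(preds, key=rank)


def expected_cost_per_tuple(preds: List[PredicateInfo]) -> float:
    """Expected seconds spent per input tuple when evaluating in the given order."""
    cost = 0.0
    surviving = 1.0  # fraction of tuples that reach the current predicate
    for p in preds:
        cost += surviving * p.latency_hint_s
        surviving *= p.selectivity
    return cost


# Illustrative numbers only: a cheap built-in filter and a slow external call.
cheap = PredicateInfo("cheap_builtin_filter", selectivity=0.1, latency_hint_s=1e-6)
slow = PredicateInfo("external_pdf_extract", selectivity=0.5, latency_hint_s=3.0)

plan = order_predicates([slow, cheap])
```

With these made-up numbers the cheap filter is placed first, so the multi-second external call only runs on the tuples that survive it. Without any latency hint, the optimizer has no basis for preferring one order over the other.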
Cheers,
Mike
On 5/7/26 3:32 PM, Ian Maxon wrote:
Hello fellow devs,
I made a draft trying to coalesce some of the ideas I've had for a
long time regarding a feature that would be really nice to have:
External Functions. I think it should be pretty easy to implement, at
least initially, by reusing some of the Python UDF infrastructure,
especially on the optimizer and runtime side. The document is here:
https://cwiki.apache.org/confluence/display/ASTERIXDB/APE+34%3A+External+Functions
This is certainly not a final or complete document, so I would
definitely appreciate any thoughts others might have.
Thanks!
- Ian