On Monday, 8 June 2026 at 08:45:50 UTC, Mike Parker wrote:
We've got a handful of talks about LLMs lined up for DConf this year. As a sort of primer, you can read this new blog post by Danny Arends, ['Teaching an AI to Know Itself: Building a Local LLM Agent in D'](https://blog.dlang.org/2026/06/07/teaching-an-ai-to-know-itself-building-a-local-llm-agent-in-d/), in which he talks about [the local agentic LLM project he built in D](https://github.com/DannyArends/DLLM) called DLLM.
I wonder what are the benefits of using bindings to the model instead of HTTP requests. Most agentic software nowadays support both local models and remote models (OpenAI/Anrthopic).
And even though llama.cpp is a great project with a lot of research that is happening in the repository, it is not the most performant. And other inference engines are available https://www.runanywhere.ai/blog/metalrt-fastest-llm-decode-engine-apple-silicon
So HTTP interaction will allow to use any local inference engine or remote, as most of them support OpenAI API - which became the standard.
