On Wed, 21 Jan 2026 18:36:40 +0100,
Kirill A. Korinsky <[email protected]> wrote:
> 
> On Wed, 21 Jan 2026 18:18:29 +0100,
> Chris Cappuccio <[email protected]> wrote:
> > 
> > Kirill A. Korinsky [[email protected]] wrote:
> > > 
> > > Here is a version where I:
> > > 1. added a runtime dependency on textproc/ripgrep
> > > 2. patched the required current_exe() usage
> > > 3. added a README with a sample of how it can be used against a llama.cpp server
> > > 
> > 
> > I like this. I had done a similar README. In model_providers you should add:
> > 
> > headers = {}
> > query_params = {}
> > 
> > They might not break anything today but they are of no use to llama.cpp and
> > the future is less certain. They are designed for openai's interface.
> > 
> 
> Do you mean something like that?
> 
>       headers = {}
>       query_params = {}
> 
>       model_provider = "local"
>       model = "local"
> 
>       [model_providers.local]
>       name = "llama-server"
>       base_url = "http://127.0.0.1:8080/v1"
>       wire_api = "chat"
> 
> 
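
(On the placement above: "In model_providers" could also be read as putting
the two keys inside the provider table itself rather than at the top level.
A sketch of that reading, in case that is what was meant -- same values as
above, only headers and query_params moved under the table:

      model_provider = "local"
      model = "local"

      [model_providers.local]
      name = "llama-server"
      base_url = "http://127.0.0.1:8080/v1"
      wire_api = "chat"
      headers = {}
      query_params = {}
)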

BTW https://github.com/ggml-org/llama.cpp/pull/18486 was merged.

I'll cook an update for llama.cpp and ggml in a few days. Based on the
discussion there, they made it in scope, which allows dropping wire_api
from the config.

If it works, I suggest importing it without the deprecated wire_api = "chat"
line in the README.
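
For reference, that would leave the README stanza looking roughly like this
(just a sketch with the wire_api line dropped; whether it actually works
depends on the llama.cpp/ggml update mentioned above):

      model_provider = "local"
      model = "local"

      [model_providers.local]
      name = "llama-server"
      base_url = "http://127.0.0.1:8080/v1"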

-- 
wbr, Kirill
