also llama.cpp is better in many ways

but in python, with huggingface's accelerate and transformers packages,
passing device_map='auto' to from_pretrained will spread the model
across gpu vram and cpu ram, giving you more total memory to work with,
and loading is fast (mmap-based) if the checkpoint is in safetensors
format
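a minimal sketch of what that looks like, assuming you have transformers,
accelerate, and torch installed; the model id and memory caps here are
placeholders, not anything specific -- max_memory is the optional knob
that caps how much each device gets before spilling to the next:

```python
# Hypothetical sketch: split a model across GPU and CPU RAM with
# accelerate's device_map="auto". Model id and memory caps are
# made-up examples, substitute your own.
def build_load_kwargs(gpu_gib=None, cpu_gib=None):
    """Build from_pretrained kwargs for auto device placement.

    Without max_memory, accelerate fills devices greedily; with it,
    each device is capped and the rest spills to the next one.
    """
    kwargs = {"device_map": "auto"}
    if gpu_gib is not None and cpu_gib is not None:
        kwargs["max_memory"] = {0: f"{gpu_gib}GiB", "cpu": f"{cpu_gib}GiB"}
    return kwargs


if __name__ == "__main__":
    from transformers import AutoModelForCausalLM

    # safetensors checkpoints load via mmap, so this is fast and
    # doesn't double the RAM needed during load
    model = AutoModelForCausalLM.from_pretrained(
        "some-org/some-model",          # placeholder model id
        **build_load_kwargs(gpu_gib=10, cpu_gib=48),
    )
    print(model.hf_device_map)          # shows which layer went where
```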

note that huggingface's libs do tend to be somewhat crippled,
user-focused things, which is maybe why i know them