* Gideon Silberman Moro <gerardomor...@gmail.com> [2025-02-05 09:39]:
> Hi everyone,
>
> I'm looking for a way to automatically link notes in Zetteldeft using AI.
> Ideally, I'd like an approach that analyzes the content of my notes and
> suggests or creates links between relevant ones.
>
> Has anyone experimented with integrating AI (e.g., LLMs, embeddings, or
> external tools like OpenAI or local models) to automate or enhance
> Zetteldeft's linking process? Are there existing Emacs packages or
> workflows that could help with this (without the need of an API)?
Hi!

You can automate linking in Zetteldeft using AI by leveraging local models or embeddings. Here's a quick approach:

1. **Embeddings**: Use a local model (e.g., Sentence Transformers) to generate embeddings for your notes. Compare embeddings to find semantic similarities and suggest links. Tools like `transformers` or `gensim` can help.

   Personally, I work with a Dynamic Knowledge Repository, which in turn encompasses Org documents and all other kinds of documents, so my information is ordered in a PostgreSQL database. Using LangChain and similar tools for chunking is necessary in that sense, as chunks make it possible to augment the context better and to find relevant documents more reliably. RAG could be used as well; it depends, of course, on how much data you have. My "Meta" Org holds about 70,000 documents, and they are all hyperlinks, but what about hyperlinks within hyperlinks? That purpose of hyperlinking automatically is possible by using either RAG or embeddings. Providing RAG or embeddings is rather easy when a database is involved, considering that a vector type already exists for PostgreSQL.

2. **LLMs**: Run a local LLM (e.g., IBM Granite, or Microsoft Phi as fully free software) to analyze note content and suggest links. You can script this in Emacs Lisp on top of Python.

3. **Emacs Packages**: I would not recommend any at this moment; your request is very specific. I am making my own LLM functions. Here is one of them that works and can be adjusted:

(defun rcd-llm-llamafile (prompt &optional memory rcd-llm-model)
  "Send PROMPT to Llama file.  Optional MEMORY and MODEL may be used."
  (let* ((rcd-llm-model (cond ((boundp 'rcd-llm-model) rcd-llm-model)
                              (t "LLaMA_CPP")))
         (memory (cond ((and memory rcd-llm-use-users-llm-memory)
                        (concat "Following is user's memory, until the"
                                " END-OF-MEMORY-TAG: \n\n"
                                memory
                                "\n\n END-OF-MEMORY-TAG\n\n"))))
         (prompt (cond (memory (concat memory "\n\n" prompt))
                       (t prompt)))
         (temperature 0.8)
         (max-tokens -1)
         (top-p 0.95)
         (stream :json-false)
         (buffer
          (let ((url-request-method "POST")
                (url-request-extra-headers
                 '(("Content-Type" . "application/json")
                   ("Authorization" . "Bearer no-key")))
                (prompt (encode-coding-string prompt 'utf-8))
                (url-request-data
                 (encode-coding-string
                  (setq rcd-llm-last-json
                        (json-encode
                         `((model . ,rcd-llm-model)
                           (messages . [((role . "system")
                                         (content . "You are a helpful assistant. Answer short."))
                                        ((role . "user")
                                         (content . ,prompt))])
                           (temperature . ,temperature)
                           (max_tokens . ,max-tokens)
                           (top_p . ,top-p)
                           (stream . ,stream))))
                  'utf-8)))
            (url-retrieve-synchronously
             ;; "http://127.0.0.1:8080/v1/chat/completions"
             "http://192.168.188.140:8080/v1/chat/completions"))))
    (rcd-llm-response buffer)))

As you can read, it uses some memory if necessary, and that memory can also be the list of links which you would like to insert. So the solution could be a simple function whose context or system message contains the summaries and the list of links; a simple prompt could then instruct the LLM to hyperlink it all. Additionally, you could use a grammar instruction from llama.cpp.

No API needed if you stick to local models!

Your idea is great. Let me say it this way: the solution to your problem is much closer than we think. It is just there; it requires some tuning, and it can already work. It requires planning of the knowledge. I don't want all links hyperlinked just because they match; the 70,000 documents are there, but I don't want them all hyperlinked. I want specific hyperlinks hyperlinked. Many of them are also ranked, as I have worked with many, so I would like it by rank too.
You have to plan first how to sort the information, which information, and so on. Then you provide it to embeddings, but how? Where are you going to store the vectors? Or RAG? Using PostgreSQL and its vector type is a good way to go.

-- 
Jean Louis