My takeaways from a conversation with Ihor:

- Well, Org's own automatic cache (`org-element-cache-persistent`) is sparse, so it will not be useful directly.
- But the output of running `(org-element-parse-buffer)` in a given file could be written somewhere, via org-persist perhaps, and then I could work with those outputs manually.
- Reading each parse tree may be slow, so it may still make sense to keep a second-order cache, in the form of tables for where all SCHEDULED-timestamps are and things like that.

Thanks Ihor! I'll conclude a yes on the feasibility analysis.

I do wonder if it'd make sense for Org itself to have such an API (think a thin wrapper around org-element-parse-buffer results), because then org-agenda / org-ql / etc could be rewritten to make use of it, solving their perf issues no matter how many files you feed in.

Martin

On Fri, 23 May 2025 18:48:08 +0200 (CEST), "Martin Edström" <meedst...@runbox.eu> wrote:

> Hi
>
> I've made the package org-mem (https://github.com/meedstrom/org-mem), which
> has its own parser.
>
> I'm wondering if it would be possible to write something similar to use Org's
> own parse trees instead, since it does seem able to persist cache to disk.
>
> Here's what I envision:
>
> - Let's say at init, you're given a list of 2,000 files not yet cached, or
>   where the mtime shows recent change
> - One or more async Emacs processes can work over time to visit them, parse
>   them, and write the cached parse tree to disk
> - The main Emacs process can have an API for working with a cached parse tree
>   for a given file, without ever opening that file.
> - Packages can then query things like "is there an active timestamp anywhere
>   in these 2,000 files" and get an instant answer.
>
> Before I write code -- is that realistic to do?
>
> Martin Edström