My takeaways from a conversation with Ihor

- Well, Org's own automatic cache (`org-element-cache-persistent`) is sparse, 
so it will not be useful directly.

- But the output of running `(org-element-parse-buffer)` in a given file could 
be written somewhere, via org-persist perhaps, and then I could work with those 
outputs manually.

- Reading each parse tree may be slow, so it may still make sense to keep a 
second-order cache, in the form of tables for where all SCHEDULED-timestamps 
are and things like that.
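
To make that last point concrete, here is a minimal sketch of such a table.
It is not anything Ihor proposed in detail.  The names `my/scheduled-table`,
`my/collect-scheduled` and `my/scheduled-in` are hypothetical, the table lives
only in memory, and it is built straight from a fresh parse rather than from a
parse tree stored on disk.  Writing either to disk (via org-persist or
otherwise) is left out.

```elisp
;;; -*- lexical-binding: t; -*-
(require 'org)
(require 'org-element)

;; Hypothetical second-order cache: file name -> (MTIME . ENTRIES).
(defvar my/scheduled-table (make-hash-table :test #'equal))

(defun my/collect-scheduled (file)
  "Parse FILE without visiting it; return a list of (HEADING . TIMESTAMP)."
  (with-temp-buffer
    (insert-file-contents file)
    ;; Enable Org syntax but skip user hooks, which are not needed here.
    (delay-mode-hooks (org-mode))
    ;; Headline granularity is enough: each headline element carries the
    ;; SCHEDULED timestamp from its planning line in the :scheduled property.
    (org-element-map (org-element-parse-buffer 'headline) 'headline
      (lambda (headline)
        (let ((ts (org-element-property :scheduled headline)))
          (when ts
            (cons (org-element-property :raw-value headline)
                  (org-element-property :raw-value ts))))))))

(defun my/scheduled-in (file)
  "Return SCHEDULED entries for FILE, reparsing only if its mtime changed."
  (let ((mtime (file-attribute-modification-time (file-attributes file)))
        (cached (gethash file my/scheduled-table)))
    (if (and cached (time-equal-p (car cached) mtime))
        (cdr cached)
      (let ((entries (my/collect-scheduled file)))
        (puthash file (cons mtime entries) my/scheduled-table)
        entries))))
```

With something like this, asking whether any of N files has a SCHEDULED
timestamp only costs a parse for files whose mtime changed since the last
query.  For every other file it is one hash-table lookup.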

Thanks Ihor! 

I'll conclude that the answer on feasibility is yes.  I do wonder whether it
would make sense for Org itself to have such an API (think of a thin wrapper
around org-element-parse-buffer results), because then org-agenda, org-ql,
etc. could be rewritten to make use of it, solving their performance issues
no matter how many files you feed in.

Martin


On Fri, 23 May 2025 18:48:08 +0200 (CEST), "Martin Edström" 
<meedst...@runbox.eu> wrote:

> Hi
> 
> I've made the package org-mem (https://github.com/meedstrom/org-mem), which 
> has its own parser.
> 
> I'm wondering if it would be possible to write something similar to use Org's 
> own parse trees instead, since it does seem able to persist cache to disk.
> 
> Here's what I envision:
> 
> - Let's say at init, you're given a list of 2,000 files not yet cached, or 
> where the mtime shows recent change
> - One or more async Emacs processes can work over time to visit them, parse 
> them, and write the cached parse tree to disk
> - The main Emacs process can have an API for working with a cached parse tree 
> for a given file, without ever opening that file.
> - Packages can then query things like "is there an active timestamp anywhere 
> in these 2,000 files" and get an instant answer.
> 
> Before I write code -- is that realistic to do?
> 
> Martin Edström
