Karl Voit writes:

> Hello John,
> Great to read your thoughts on the topic - I am a huge admirer of
> your work and we both seem to cope with similar issues with
> Org-mode.

Thanks! I am an equal admirer of the Memacs package. I think we share
some common interests there. I came across the mother of all demos
recently: http://dougengelbart.org/events/1968-demo-highlights.html. It
was inspired by the early Memex ideas.

> * John Kitchin <jkitc...@andrew.cmu.edu> wrote:
>> One is to use the new dynamic module capability to write an org parser in
>> C, or a dedicated agenda function, which would presumably be faster than in
>> elisp.  This seems hard, and for me would certainly be a multiyear project
>> I am sure! The downside of this is the need to compile the module. I don't
>> know how easy it would be to make this work across platforms with the
>> relatively easy install org-mode currently has. This could have a side
>> benefit though of a c-lib that could be used by others to expand where
>> org-mode is used.
> I'm not a fan of C at all but having the parser in C with the
> possibility to use this parser for external tools as well sounds
> awesome to me. After all, I've written a primitive parser for a
> sub-set of Org-mode for https://github.com/novoid/lazyblorg using
> Python.

I am also not so fond of C, but I am starting to learn it these days. I
have had several use cases where I either generate or want to consume
org in something other than Emacs. I don't know that C is the right
thing to use; maybe a Cython library with a very thin C interface would
work as well. It is on my bucket list to learn how to do something like
that.

>> The other way that might work is to rely more heavily on a cached version
>> of the files, perhaps in a different format than elisp, that is faster to
>> work with. The approach I have explored in this is to index org files into
>> a sqlite database. The idea then would be to generate the agenda from a sql
>> query. I use something like this already to "find stuff in orgmode
>> anywhere". One of the reasons I wrote this is the org-agenda list of files
>> isn't practical for me because my files are so scattered on my file system.
>> I had a need to be able to find TODOs in research projects in a pretty wide
>> range of locations.
>> The code I use is at
>> https://github.com/jkitchin/scimax/blob/master/org-db.el, and from one
>> database I can find headlines, contacts, locations, TODO headlines across
>> my file system, all the files that contain a particular link, and my own
>> recent org files.
> I didn't try org-db.el yet. So far, I survived using "git grep" and
> counsel-grep [0]

The main limitation of those tools, for me, is that I don't know how you
would find a headline with an EMAIL property and some tag, for example,
since those are on different lines.

>> I am moderately motivated to switch from sqlite to MongoDB
> Is org-db.el your standard way of accessing information or do you
> use it only for occasional searches where you assume that the usual
> methods would be slow?

I think of it as my extended memory. For the buffers that are currently
open/my agenda files/recent files/project files, I usually use the
regular tools to move around and search. org-db is more for jumping to a
headline I had open last month for example, or for files that are not in
my agenda. Or to get a list of headlines with EMAIL properties and some
tag combination from any org-file I have ever opened. This is just a
query (https://github.com/jkitchin/scimax/blob/master/org-db.el#L499) on
the sqlite db, and then I can build a completion command in helm or ivy
to select them. (For contacts, I usually use my elisp cached contacts
code: https://github.com/jkitchin/scimax/blob/master/contacts.el.)
I don't claim it is optimal, but it handles about 6200 contacts from 32
org-files easily!
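
A minimal sketch of the kind of query described above, written in Python
with the standard sqlite3 module. The three-table schema (headlines,
properties, tags), the function names, and the sample rows are all
assumptions made for illustration; the actual tables in org-db.el may be
organized differently.

```python
import sqlite3

# Hypothetical schema for illustration; not the real org-db.el layout.
SCHEMA = """
CREATE TABLE headlines (id INTEGER PRIMARY KEY, filename TEXT, title TEXT);
CREATE TABLE properties (headline_id INTEGER, name TEXT, value TEXT);
CREATE TABLE tags (headline_id INTEGER, tag TEXT);
"""


def build_demo_db():
    """Create an in-memory database with a few made-up headlines."""
    con = sqlite3.connect(":memory:")
    con.executescript(SCHEMA)
    con.executemany("INSERT INTO headlines VALUES (?, ?, ?)", [
        (1, "contacts.org", "Jane Doe"),
        (2, "contacts.org", "Sam Roe"),
        (3, "projects.org", "Write report"),
    ])
    con.executemany("INSERT INTO properties VALUES (?, ?, ?)", [
        (1, "EMAIL", "jane@example.com"),
        (2, "EMAIL", "sam@example.com"),
    ])
    con.executemany("INSERT INTO tags VALUES (?, ?)", [
        (1, "collaborator"),
        (2, "student"),
        (3, "collaborator"),
    ])
    return con


def headlines_with_email_and_tag(con, tag):
    """Return (title, email, filename) for headlines that have an EMAIL
    property and carry the given tag."""
    query = """
    SELECT h.title, p.value, h.filename
    FROM headlines h
    JOIN properties p ON p.headline_id = h.id AND p.name = 'EMAIL'
    JOIN tags t ON t.headline_id = h.id AND t.tag = ?
    ORDER BY h.title
    """
    return con.execute(query, (tag,)).fetchall()


con = build_demo_db()
rows = headlines_with_email_and_tag(con, "collaborator")
print(rows)
```

Because the property and the tag live in separate rows keyed by the
headline, the two joins express "EMAIL property and some tag" directly,
which is the case a line-oriented grep struggles with. The returned rows
could then feed a completion list.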

>> The main point of the database was to get a query language, persistence and
>> good performance. I have also used caches to speed up using bibtex files,
>> and my org-contacts with reasonable performance. These have been all elisp,
>> with no additional dependencies. Maybe one could do something similar to
>> keep an agenda cache that is persistent and updated via hook functions.
> Oh yeah. My org-contacts were unusable without at least some minor
> performance improvements as well. Most important to me: improving
> manipulation of properties using [1].

The other reason I wanted org-db is also performance related. I have
close to 3400 (and growing) org files indexed in it and I want to search
across them, but I don't need them in my agenda list. I want a
structured search, e.g. to search specifically for headlines with
properties and tags, or links, or by file tags, src block, etc. I
haven't gotten completely there yet.
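
To make a structured search like this possible, each file first has to
be broken into records. The following Python sketch shows one way that
indexing step might look for a small subset of Org syntax (headlines,
trailing tags, property drawers, and bracket links). The function name,
the regular expressions, and the record layout are assumptions for
illustration only; this is not how org-db.el does it, and it ignores
TODO keywords, src blocks, and file-level tags.

```python
import re

# Simplified patterns for a subset of Org syntax (assumptions, not a full parser).
HEADLINE = re.compile(r"^(\*+)\s+(.*?)(?:\s+:([\w@#%:]+):)?\s*$")
PROPERTY = re.compile(r"^\s*:([^:\s]+):\s+(.*\S)\s*$")
LINK = re.compile(r"\[\[([^\]]+)\](?:\[[^\]]*\])?\]")


def index_org_text(text):
    """Split Org text into one record per headline, collecting its tags,
    its property drawer entries, and the link targets in its body."""
    records = []
    current = None
    in_drawer = False
    for line in text.splitlines():
        m = HEADLINE.match(line)
        if m:
            stars, title, tags = m.groups()
            current = {
                "level": len(stars),
                "title": title,
                "tags": tags.split(":") if tags else [],
                "properties": {},
                "links": [],
            }
            records.append(current)
            in_drawer = False
            continue
        if current is None:
            continue  # text before the first headline is skipped
        stripped = line.strip()
        if stripped == ":PROPERTIES:":
            in_drawer = True
        elif stripped == ":END:":
            in_drawer = False
        elif in_drawer:
            p = PROPERTY.match(line)
            if p:
                current["properties"][p.group(1)] = p.group(2)
        else:
            # Body line: keep only the link targets, not the descriptions.
            current["links"].extend(LINK.findall(line))
    return records


SAMPLE = """\
* Jane Doe :collaborator:work:
:PROPERTIES:
:EMAIL: jane@example.com
:END:
Met at the workshop, see [[https://example.com/paper][the paper]].
** Follow up
[[file:notes.org]]
"""

records = index_org_text(SAMPLE)
for record in records:
    print(record)
```

Records in this shape map naturally onto database rows (one per
headline, plus one per tag, property, and link), so a search can filter
on any combination of them.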

Another db approach is described here:
https://github.com/wvxvw/sphinx-mode. It is more comprehensive than my
sqlite implementation, and uses Sphinx on top of MySQL I think. I
haven't worked with it though.

> For example, org-set-property takes almost 20 seconds to give me its
> interactive input line in my main Org-mode file. This is a no-go.
> [1] helped me here a lot.
> [0] https://github.com/novoid/dot-emacs/blob/master/config.org#optimizing-search-methods
> [1] https://github.com/novoid/dot-emacs/blob/master/config.org#my-org-region-to-property--my-map-p

Professor John Kitchin
Doherty Hall A207F
Department of Chemical Engineering
Carnegie Mellon University
Pittsburgh, PA 15213
