Thank you for your response, Pengji.
On Sat, Sep 21, 2024 at 08:25:10AM +0800, Pengji Zhang wrote:
Hi Frederick,
Frederick Eaton <frede...@ofb.net> writes:
I am trying to figure out how to adapt a script I wrote for
filtering messages, to apply notmuch tags to each message. A
difficulty is that the messages are already in the Notmuch database,
because another tool has delivered them to a maildir and run
"notmuch new".
Now, Notmuch can provide me with the paths of all the new
(unfiltered) messages, which I can give to my script. The question I
have is, once the filter is done, how can the script tell Notmuch
which message to apply the tags to?
I am not sure if I understand you correctly. If the problem here is to
distinguish existing messages and new messages, would the config
option 'new.tags' work? For example, use
notmuch config set new.tags new
to give all new messages a 'new' tag.
No, I already have that configuration. The first sentence described what I
already know how to do; the second sentence is what I'm trying to do.
Suppose the filter script reads a message from a particular file and decides
that it is spam. How does the filter tell Notmuch that the message
corresponding to that file is spam? You seem to be saying below that the filter
script should extract the Message-ID and use it to identify the message to
Notmuch, since file paths of the messages are not indexed. Probably what my
script should be doing for each message is appending a line to a batch file
like this:
+spam -new -- id:some_message_id@foo
+inbox -new -- id:some_other@baz
and then passing the batch file to "notmuch tag"?
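
A minimal sketch of that batch-file approach, assuming the filter sees the raw bytes of each message. The helper names (message_id, quote_term, batch_line) are hypothetical and not part of Notmuch. The quoting follows my reading of the boolean-term rules in notmuch-tag(1): wrap the term in double quotes and double any embedded double quotes. Messages with no Message-ID header yield None and would need separate handling.

```python
import email


def message_id(raw: bytes):
    """Return the Message-ID without its angle brackets, or None if absent."""
    msg = email.message_from_bytes(raw)
    value = msg.get("Message-ID")
    if value is None:
        return None
    value = str(value).strip()
    if value.startswith("<") and value.endswith(">"):
        value = value[1:-1]
    return value or None


def quote_term(term: str) -> str:
    """Quote a search term: enclose it in double quotes and double any
    double quotes inside it (assumed from notmuch-tag(1) batch rules)."""
    return '"' + term.replace('"', '""') + '"'


def batch_line(raw: bytes, tag_changes):
    """Build one batch line such as: +spam -new -- id:"some_message_id@foo".

    Returns None when the message has no Message-ID to search for.
    """
    mid = message_id(raw)
    if mid is None:
        return None
    return " ".join(tag_changes) + " -- id:" + quote_term(mid)
```

The collected lines could then be written to a file and applied in one pass with "notmuch tag --batch --input=FILE".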
I've tentatively concluded that the best way to locate each message
in the Notmuch database is to extract the Message-ID and search for
it with "id:"? But the FAQ says that multiple messages can have the
same Message-ID (and some spam messages don't have one at all).
IIRC, in the Notmuch database tags are associated with message IDs, so
you probably do not need to worry about this.
This time, I'm not sure I understand.
If I could access the message using the filename that the script is
processing, it would seem slightly more reliable. It seems like
there should be some way to allow a Notmuch database entry to be
accessed directly by filename, without even creating a Notmuch-style
search query containing that filename, but rather by passing the
filename as a command-line argument to "notmuch". It would be nice
not to have to worry about quoting and unquoting.
I am not sure if this is useful, given that (presumably) Notmuch uses
message IDs as keys. Besides, those filenames are usually generated
automatically and quite cryptic.
It might be useful for the reasons I stated, namely in case the Message-ID does
not exist or is not unique.
When I try to search for a message using "path:", nothing seems to
work.
[...]
There were no results for any of the "path:" searches, although the
"id:" search worked. I am using version 0.32.2 and can update if
this may be related to a bug that was fixed in the past few years.
I have never used 0.32.2 so I am not sure if there are any
differences, but for version 0.38.3, the prefix "path:" is used to
search for messages in some *directory*, and the query should be
*relative* to the maildir.
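
As a rough illustration of the point above, here is a sketch that turns an absolute message filename into a "path:" query for its containing directory, relative to the mail root. The function name path_query and the example paths are hypothetical. Such a query matches every message in that directory, not the single file, and treating an empty string as the mail root is an assumption based on my reading of notmuch-search-terms(7).

```python
import os.path


def path_query(mail_root: str, message_file: str) -> str:
    """Return a path: query for the directory that holds message_file.

    The directory is expressed relative to mail_root with no leading slash.
    """
    rel_dir = os.path.relpath(os.path.dirname(message_file), mail_root)
    if rel_dir == ".":
        # Assumption: an empty directory name refers to the mail root itself.
        rel_dir = ""
    # Quote the directory so that spaces in folder names survive parsing.
    return 'path:"' + rel_dir.replace('"', '""') + '"'
```

For example, path_query("/home/u/mail", "/home/u/mail/INBOX/cur/123.host:2,S") gives path:"INBOX/cur".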
I highly recommend the manual page 'notmuch-search-terms(7)' and also
other pages if you have time. They are informative and well written,
and very helpful for writing message processing scripts.
Thank you for interpreting that section for me. The manual pages may be
informative and well written, but if my opinion matters, then I think that they
could be made slightly clearer than they are. For example, explaining directly
to the user that there is no index of path names would help clarify what can be
done with the software. Also, a short example of using Notmuch in a filter
script would be useful in one of the manual pages, particularly illustrating
the case where the programmer wants to re-tag a message that is provided as a
file or on stdin.
My copy of the notmuch-search-terms manual page says:
path:<directory-path> or path:<directory-path>/** or path:/<regex>/
The path: prefix searches for email messages that are in particular
directories within the mail store. The directory must be
specified relative to the top-level maildir (and without the
leading slash). ...
I see now that this text only says that Notmuch supports searches by directory
name, but on first read it wasn't clear to me whether "directory-path" means a
"path to a directory" or a "file path consisting of directories followed by a
filename", particularly as there is no obvious reason for Notmuch not to index
filenames. I think "path:<directory>" would be clearer, as would wording along
these lines: "The path: prefix matches email messages that are stored in a
specified directory on the filesystem, which must be specified relative to the
top-level maildir, and here is how to find out what the 'top-level maildir' is
when you have for example $HOME/mail/notmuch/ configured as your database path
in ~/.notmuch-config ...". Even clearer would be to explain why the "path:"
search prefix only accepts directories, point out that it should be called
"dir:" instead of "path:", and warn the user that the search will be
inefficient because there is no index of filenames.
Thank you,
Frederick