Re: Reimagining notmuch-git/nmbug

2023-04-03 Thread Felipe Contreras
On Mon, Apr 3, 2023 at 5:46 AM David Bremner  wrote:
>
> David Bremner  writes:
>
> >
> > I'm intrigued (and indeed I hadn't really thought about the degree to
> > which we were re-inventing git-fast-import and friends); however so far
> > my experiments did not get far enough to say anything conclusive.
> >
>
> I did manage to finish, about 70 minutes elapsed.
>
> Although you're probably right that a file of tags is the right
> representation (it is what git-annex uses also), I think we'd need to
> define a custom merge driver to take unions of lists in the same way
> that git-annex does. Otherwise merging will be less automagic than it is
> now.

I'm not familiar with git-annex, I would need to see an example of
such merging happening.
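
For reference, a minimal sketch of what "taking unions of lists" could mean for two versions of a tags file. It is not taken from git-annex; the one-tag-per-line file format and the name `merge_tag_lists` are assumptions made for illustration only.

```ruby
# Hypothetical union merge of two tag files (one tag per line).
# Assumption: the order of tags in the file carries no meaning, so the
# result is sorted to keep it stable across merges.
def merge_tag_lists(ours, theirs)
  merged = ours.lines(chomp: true) | theirs.lines(chomp: true)
  merged.reject(&:empty?).sort.join("\n") + "\n"
end

puts merge_tag_lists("inbox\nunread\n", "inbox\nflagged\n")
```

Note that a plain union can never propagate the removal of a tag, since a tag dropped on one side is still present on the other. Git also has a built-in `union` merge driver that can be selected per path in gitattributes, which might be enough for a first experiment.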

One advantage of using the fast-import format is that it's easy to
change it, or support multiple formats.

In fact, the format could be specified in the URL, like
`nm::1:$HOME/mail` for the current notmuch-git format, and
`nm::2:$HOME/mail` for the new.
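
A rough sketch of how the helper might pick the format out of such an address. The function name `parse_nm_address`, the default of format 1 when no prefix is given, and the assumption that the helper sees the address without the `nm::` part are all hypothetical.

```ruby
# Assumption: an address without a numeric prefix means the current format.
DEFAULT_FORMAT = 1

# Split "2:/home/user/mail" into [2, "/home/user/mail"].
# An address with no "<digits>:" prefix is returned with DEFAULT_FORMAT.
def parse_nm_address(address)
  if (m = address.match(/\A(\d+):(.+)\z/))
    [m[1].to_i, m[2]]
  else
    [DEFAULT_FORMAT, address]
  end
end

p parse_nm_address('2:/home/user/mail')
p parse_nm_address('/home/user/mail')
```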

-- 
Felipe Contreras
___
notmuch mailing list -- notmuch@notmuchmail.org
To unsubscribe send an email to notmuch-le...@notmuchmail.org


Re: Reimagining notmuch-git/nmbug

2023-04-03 Thread Felipe Contreras
On Mon, Apr 3, 2023 at 6:37 PM David Bremner  wrote:
>
> Felipe Contreras  writes:
>
> >
> > That should work to update existing tags, but how are we going to
> > detect if a message has disappeared? Or is that not a thing?
>
> Indeed the same thought had occurred to me not long ago. I remembered
> (belatedly) that I'd been through some similar thought process with nmbug.
> Messages can and do disappear. So I guess that optimization is not OK,
> at least not without some complications.
>
> > Does "lastmod:0.." get all the revisions? If so, it might make sense
> > to set $lastmod to 0 initially.
> >
> > Then we could unconditionally do:
> >
> > $db.query('lastmod:%d..' % $lastmod, sort: Notmuch::SORT_UNSORTED)
>
> That would work, but as you point out, we'd need to deal with deletions
> somehow. It occurs to me that wr_export also needs to be able to handle
> disappearing message-ids. I suppose like notmuch-restore it can just
> complain and skip any missing ones. It's tempting to try to do some kind
> of lazy cleanup at that point, but I don't really see how that fits with
> the remote-helper protocol.

We could have an external tool, something like `git-notmuch-fsck`, that
the user has to run regularly, as `git fsck` had to be in the past.

Or we could say that after jumping a certain threshold of lastmod we
delete all the messages and start from scratch, perhaps every 1000
revisions.

Or maybe the query could generate a virtual tag if a message was
deleted since the previous lastmod (e.g. "nm::deleted"). Then it would
be trivial for the remote helper to tell that to git.

I lean towards the threshold, because that way the user doesn't need
to do anything, and no modifications are needed in libnotmuch.
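
One possible reading of the threshold idea, as a sketch only: do a full re-export (with `deleteall`) whenever lastmod has moved into a new block of 1000 revisions since the cached value, and an incremental `lastmod:` query otherwise. The name `plan_import` and the returned hash are hypothetical; nothing here talks to notmuch or git.

```ruby
# The "every 1000 revisions" figure mentioned above.
FULL_RESYNC_INTERVAL = 1000

# Decide how the next import should run.
# cached_lastmod is nil on the first import.
def plan_import(cached_lastmod, current_lastmod, interval: FULL_RESYNC_INTERVAL)
  crossed = cached_lastmod.nil? ||
            current_lastmod / interval != cached_lastmod / interval
  if crossed
    # Start from scratch: this also drops messages that have disappeared.
    { mode: :full, deleteall: true, query: '' }
  else
    # Only export messages modified since the cached revision.
    { mode: :incremental, deleteall: false,
      query: format('lastmod:%d..', cached_lastmod) }
  end
end

p plan_import(nil, 50)
p plan_import(990, 995)
p plan_import(990, 1003)
```

With this reading a deleted message can linger in the git repository for at most one interval, at the cost of a slow fetch each time the boundary is crossed.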

Cheers.

-- 
Felipe Contreras


Re: Reimagining notmuch-git/nmbug

2023-04-03 Thread David Bremner
Felipe Contreras  writes:

>
> That should work to update existing tags, but how are we going to
> detect if a message has disappeared? Or is that not a thing?

Indeed the same thought had occurred to me not long ago. I remembered
(belatedly) that I'd been through some similar thought process with nmbug.
Messages can and do disappear. So I guess that optimization is not OK,
at least not without some complications.

> Does "lastmod:0.." get all the revisions? If so, it might make sense
> to set $lastmod to 0 initially.
>
> Then we could unconditionally do:
>
> $db.query('lastmod:%d..' % $lastmod, sort: Notmuch::SORT_UNSORTED)

That would work, but as you point out, we'd need to deal with deletions
somehow. It occurs to me that wr_export also needs to be able to handle
disappearing message-ids. I suppose like notmuch-restore it can just
complain and skip any missing ones. It's tempting to try to do some kind
of lazy cleanup at that point, but I don't really see how that fits with
the remote-helper protocol.
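
A sketch of the "complain and skip" behaviour, with a plain Hash standing in for the notmuch database so that it runs on its own. The name `apply_tags` and the shape of `changes` are hypothetical and are not taken from git-remote-nm.

```ruby
# db:      Hash of message-id => array of tags (stand-in for the database)
# changes: Hash of message-id => array of tags coming from git
# Returns the list of message-ids that were skipped because they are missing.
def apply_tags(db, changes, log: $stderr)
  skipped = []
  changes.each do |message_id, tags|
    unless db.key?(message_id)
      log.puts "warning: no message with id #{message_id}, skipping"
      skipped << message_id
      next
    end
    db[message_id] = tags.sort
  end
  skipped
end

db = { 'a@example.org' => ['inbox'] }
apply_tags(db, { 'a@example.org' => %w[unread inbox], 'gone@example.org' => ['spam'] })
p db
```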

d


Re: Data loss

2023-04-03 Thread Carl Worth
Ouch.

It's really unfortunate if notmuch-mutt makes it that easy to throw away
your email.

That sounds like a nasty bug that should be fixed in that program.

As for recovering, I suppose there _is_ a fair amount of detail in your
notmuch index from all of the position-indexed terms, (as long as you
haven't run "notmuch new" again since the mail was deleted).

If you still have a large Xapian database in the notmuch database
directory, it would be theoretically possible to recover a lot of the
email content. But I don't know that anyone has ever written a recovery
tool to help with that process.

-Carl

On Mon, Apr 03 2023, Fulvio Pizzigoni wrote:
> On Mon, Apr 03, 2023 at 08:39:02PM +0200, Michael J Gruber wrote:
>> On Mon, Apr 3, 2023 at 20:17, Fulvio Pizzigoni wrote:
>> >
>> > Hi Carl, and thanks for your prompt answer.
>> >
>> > As you suggested, I am adding notmuch@notmuchmail.org as well.
>> >
>> > This is what I did:
>> > fulvio@linux:~$ notmuch setup
>> > Your full name [fulvio]:
>> > Your primary email address [my address]: return
>> > Additional email address [Press 'Enter' if none]: return
>> > Top-level directory of your email archive [/home/fulvio/.mutt]: return
>> > Tags to apply to all new messages (separated by spaces) [unread inbox]: 
>> > return
>> > Tags to exclude when searching messages (separated by spaces) [deleted 
>> > spam]: return
>> > fulvio@linux:~$
>> >
>> > After this my .mutt directory (~ 4 GB of mailboxes) looks like this:
>> > fulvio@linux:~$ ll .mutt
>> > totale 12
>> > drwxr-xr-x 2 fulvio fulvio 4096 18 feb 20.32 cur
>> > drwxr-xr-x 2 fulvio fulvio 4096 18 feb 20.32 new
>> > drwxr-xr-x 2 fulvio fulvio 4096 18 feb 20.32 tmp
>> > fulvio@linux:~$ du -h .mutt
>> > 4,0K .mutt/new
>> > 4,0K .mutt/tmp
>> > 4,0K .mutt/cur
>> > 16K .mutt
>> >
>> > That's all.
>> >
>> > Terrible!
>> >
>> > What happened?
>> >
>> > Thanks in advance.
>> >
>> > Fulvio
>
> Hi Michael.
>
>> 
>> First of all: Do you have a back-up? 
>
> G, no :-((
>
>> 
>> notmuch itself does not delete mails, as Carl pointed out. In addition
>> to notmuch you mentioned notmuch-mutt. Did you run that manually or
>> using some mutt config? 
>
> Yes I did this manually:
> fulvio@linux:~$ notmuch-mutt -o .mutt/ search 
>
>> It creates a maildir of symlinks with search
>> results. In order to do so, it deletes the maildir ... 
>
> I think that's the cause.
>
> ... And I think it's irreparable. Am I wrong?
>
> Thanks in advance.
>
> Fulvio
>
>> Usually this  sits in a cache dir, though.
>> 
>> Michael
>> 
>> > On Mon, Apr 03, 2023 at 09:27:22AM -0700, Carl Worth wrote:
>> > > Hi Fulvio,
>> > >
>> > > I've never used notmuch-mutt.
>> > >
>> > > But notmuch itself doesn't delete any mail. It's really paranoid that
>> > > way, (knowing how valuable mail is).
>> > >
>> > > I would suggest you write an email to the notmuch@notmuchmail.org
>> > > mailing list where you will be able to reach more people likely to have
>> > > experience with all of the software you were using.
>> > >
>> > > And if you could provide more details on the actual steps you used, that
>> > > would be useful. For example, you said "configuration process" and "at
>> > > the end". But what actual commands were you running on those steps, for
>> > > example?
>> > >
>> > > -Carl
>> > >
>> > > On Mon, Apr 03 2023, Fulvio Pizzigoni wrote:
>> > > > Hi.
>> > > >
>> > > > I have installed packages notmuch and notmuch-mutt.
>> > > >
>> > > > During the configuration process I indicated the directory used by
>> > > > Mutt for its mailboxes.
>> > > >
>> > > > The configuration process did not show any alert.
>> > > >
>> > > > At the end the mailboxes in that directory had been removed; 3 new
>> > > > empty subdirectories had been created: cur, new, tmp.
>> > > >
>> > > > Was the data irremediably lost?
>> > > >
>> > > > Thanks in advance.
>> > > >
>> > > > Fulvio Pizzigoni


Re: Data loss

2023-04-03 Thread Fulvio Pizzigoni
On Mon, Apr 03, 2023 at 08:39:02PM +0200, Michael J Gruber wrote:
> On Mon, Apr 3, 2023 at 20:17, Fulvio Pizzigoni wrote:
> >
> > Hi Carl, and thanks for your prompt answer.
> >
> > As you suggested, I am adding notmuch@notmuchmail.org as well.
> >
> > This is what I did:
> > fulvio@linux:~$ notmuch setup
> > Your full name [fulvio]:
> > Your primary email address [my address]: return
> > Additional email address [Press 'Enter' if none]: return
> > Top-level directory of your email archive [/home/fulvio/.mutt]: return
> > Tags to apply to all new messages (separated by spaces) [unread inbox]: 
> > return
> > Tags to exclude when searching messages (separated by spaces) [deleted 
> > spam]: return
> > fulvio@linux:~$
> >
> > After this my .mutt directory (~ 4 GB of mailboxes) looks like this:
> > fulvio@linux:~$ ll .mutt
> > totale 12
> > drwxr-xr-x 2 fulvio fulvio 4096 18 feb 20.32 cur
> > drwxr-xr-x 2 fulvio fulvio 4096 18 feb 20.32 new
> > drwxr-xr-x 2 fulvio fulvio 4096 18 feb 20.32 tmp
> > fulvio@linux:~$ du -h .mutt
> > 4,0K .mutt/new
> > 4,0K .mutt/tmp
> > 4,0K .mutt/cur
> > 16K .mutt
> >
> > That's all.
> >
> > Terrible!
> >
> > What happened?
> >
> > Thanks in advance.
> >
> > Fulvio

Hi Michael.

> 
> First of all: Do you have a back-up? 

G, no :-((

> 
> notmuch itself does not delete mails, as Carl pointed out. In addition
> to notmuch you mentioned notmuch-mutt. Did you run that manually or
> using some mutt config? 

Yes I did this manually:
fulvio@linux:~$ notmuch-mutt -o .mutt/ search 

> It creates a maildir of symlinks with search
> results. In order to do so, it deletes the maildir ... 

I think that's the cause.

... And I think it's irreparable. Am I wrong?

Thanks in advance.

Fulvio

> Usually this  sits in a cache dir, though.
> 
> Michael
> 
> > On Mon, Apr 03, 2023 at 09:27:22AM -0700, Carl Worth wrote:
> > > Hi Fulvio,
> > >
> > > I've never used notmuch-mutt.
> > >
> > > But notmuch itself doesn't delete any mail. It's really paranoid that
> > > way, (knowing how valuable mail is).
> > >
> > > I would suggest you write an email to the notmuch@notmuchmail.org
> > > mailing list where you will be able to reach more people likely to have
> > > experience with all of the software you were using.
> > >
> > > And if you could provide more details on the actual steps you used, that
> > > would be useful. For example, you said "configuration process" and "at
> > > the end". But what actual commands were you running on those steps, for
> > > example?
> > >
> > > -Carl
> > >
> > > On Mon, Apr 03 2023, Fulvio Pizzigoni wrote:
> > > > Hi.
> > > >
> > > > I have installed packages notmuch and notmuch-mutt.
> > > >
> > > > During the configuration process I indicated the directory used by
> > > > Mutt for its mailboxes.
> > > >
> > > > The configuration process did not show any alert.
> > > >
> > > > At the end the mailboxes in that directory had been removed; 3 new
> > > > empty subdirectories had been created: cur, new, tmp.
> > > >
> > > > Was the data irremediably lost?
> > > >
> > > > Thanks in advance.
> > > >
> > > > Fulvio Pizzigoni


Re: Data loss

2023-04-03 Thread Fulvio Pizzigoni
On Mon, Apr 03, 2023 at 03:38:21PM -0300, David Bremner wrote:
> Fulvio Pizzigoni  writes:
> >
> > After this my .mutt directory (~ 4 GB of mailboxes) looks like this:
> > fulvio@linux:~$ ll .mutt
> > totale 12
> > drwxr-xr-x 2 fulvio fulvio 4096 18 feb 20.32 cur
> > drwxr-xr-x 2 fulvio fulvio 4096 18 feb 20.32 new
> > drwxr-xr-x 2 fulvio fulvio 4096 18 feb 20.32 tmp
> 
> Hi Fulvio;

Hi David.

> 
> I don't understand your situation yet, but I have a preliminary
> question.
> 
> Is the date on your computer more or less correct? 

Yes, it is.

> You mention the
> directories cur, new, and tmp being created by running notmuch setup;

It is correct.
 
> on the other hand your listing above shows they already existed in
> February.

It is the date when I did the setup.

Thanks in advance.

Fulvio

> 
> d


[PATCH 2/2] lib: thread-safe s-expression query parser

2023-04-03 Thread Kevin Boulain
Follow-up of 6273966d, now that sfsexp 1.4.1 doesn't rely on globals
anymore by default (https://github.com/mjsottile/sfsexp/issues/21).

This simply defers the initial query generation to use the thread-safe
helper (xapian_query_match_all) instead of Xapian::Query::MatchAll.
---
 lib/parse-sexp.cc | 85 +--
 test/T810-tsan.sh |  1 -
 2 files changed, 53 insertions(+), 33 deletions(-)

diff --git a/lib/parse-sexp.cc b/lib/parse-sexp.cc
index 9cadbc13..825bb9f9 100644
--- a/lib/parse-sexp.cc
+++ b/lib/parse-sexp.cc
@@ -3,6 +3,7 @@
 #if HAVE_SFSEXP
 #include "sexp.h"
 #include "unicode-util.h"
+#include "xapian-extra.h"
 
 /* _sexp is used for file scope symbols to avoid clashing with
  * definitions from sexp.h */
@@ -53,68 +54,85 @@ operator& (_sexp_flag_t a, _sexp_flag_t b)
static_cast(a) & static_cast(b));
 }
 
+typedef enum {
+SEXP_INITIAL_MATCH_ALL,
+SEXP_INITIAL_MATCH_NOTHING,
+} _sexp_initial_t;
+
+inline Xapian::Query
+_sexp_initial_query (_sexp_initial_t initial)
+{
+switch (initial) {
+case SEXP_INITIAL_MATCH_ALL:
+   return xapian_query_match_all ();
+case SEXP_INITIAL_MATCH_NOTHING:
+   return Xapian::Query::MatchNothing;
+}
+assert (! "unreachable");
+}
+
 typedef struct  {
 const char *name;
 Xapian::Query::op xapian_op;
-Xapian::Query initial;
+_sexp_initial_t initial;
 _sexp_flag_t flags;
 } _sexp_prefix_t;
 
 static _sexp_prefix_t prefixes[] =
 {
-{ "and",Xapian::Query::OP_AND,  Xapian::Query::MatchAll,
+{ "and",Xapian::Query::OP_AND,  SEXP_INITIAL_MATCH_ALL,
   SEXP_FLAG_NONE },
-{ "attachment", Xapian::Query::OP_AND,  Xapian::Query::MatchAll,
+{ "attachment", Xapian::Query::OP_AND,  SEXP_INITIAL_MATCH_ALL,
   SEXP_FLAG_FIELD | SEXP_FLAG_WILDCARD | SEXP_FLAG_EXPAND },
-{ "body",   Xapian::Query::OP_AND,  Xapian::Query::MatchAll,
+{ "body",   Xapian::Query::OP_AND,  SEXP_INITIAL_MATCH_ALL,
   SEXP_FLAG_FIELD },
-{ "date",   Xapian::Query::OP_INVALID,  Xapian::Query::MatchAll,
+{ "date",   Xapian::Query::OP_INVALID,  SEXP_INITIAL_MATCH_ALL,
   SEXP_FLAG_RANGE },
-{ "from",   Xapian::Query::OP_AND,  Xapian::Query::MatchAll,
+{ "from",   Xapian::Query::OP_AND,  SEXP_INITIAL_MATCH_ALL,
   SEXP_FLAG_FIELD | SEXP_FLAG_WILDCARD | SEXP_FLAG_REGEX | SEXP_FLAG_EXPAND },
-{ "folder", Xapian::Query::OP_OR,   Xapian::Query::MatchNothing,
+{ "folder", Xapian::Query::OP_OR,   SEXP_INITIAL_MATCH_NOTHING,
   SEXP_FLAG_FIELD | SEXP_FLAG_BOOLEAN | SEXP_FLAG_WILDCARD | SEXP_FLAG_REGEX | SEXP_FLAG_EXPAND |
   SEXP_FLAG_PATHNAME },
-{ "id", Xapian::Query::OP_OR,   Xapian::Query::MatchNothing,
+{ "id", Xapian::Query::OP_OR,   SEXP_INITIAL_MATCH_NOTHING,
   SEXP_FLAG_FIELD | SEXP_FLAG_BOOLEAN | SEXP_FLAG_WILDCARD | SEXP_FLAG_REGEX },
-{ "infix",  Xapian::Query::OP_INVALID,  Xapian::Query::MatchAll,
+{ "infix",  Xapian::Query::OP_INVALID,  SEXP_INITIAL_MATCH_ALL,
   SEXP_FLAG_SINGLE | SEXP_FLAG_ORPHAN },
-{ "is", Xapian::Query::OP_AND,  Xapian::Query::MatchAll,
+{ "is", Xapian::Query::OP_AND,  SEXP_INITIAL_MATCH_ALL,
   SEXP_FLAG_FIELD | SEXP_FLAG_BOOLEAN | SEXP_FLAG_WILDCARD | SEXP_FLAG_REGEX | SEXP_FLAG_EXPAND },
-{ "lastmod",   Xapian::Query::OP_INVALID,  Xapian::Query::MatchAll,
+{ "lastmod",   Xapian::Query::OP_INVALID,  SEXP_INITIAL_MATCH_ALL,
   SEXP_FLAG_RANGE },
-{ "matching",   Xapian::Query::OP_AND,  Xapian::Query::MatchAll,
+{ "matching",   Xapian::Query::OP_AND,  SEXP_INITIAL_MATCH_ALL,
   SEXP_FLAG_DO_EXPAND },
-{ "mid",Xapian::Query::OP_OR,   Xapian::Query::MatchNothing,
+{ "mid",Xapian::Query::OP_OR,   SEXP_INITIAL_MATCH_NOTHING,
   SEXP_FLAG_FIELD | SEXP_FLAG_BOOLEAN | SEXP_FLAG_WILDCARD | SEXP_FLAG_REGEX },
-{ "mimetype",   Xapian::Query::OP_AND,  Xapian::Query::MatchAll,
+{ "mimetype",   Xapian::Query::OP_AND,  SEXP_INITIAL_MATCH_ALL,
   SEXP_FLAG_FIELD | SEXP_FLAG_WILDCARD | SEXP_FLAG_EXPAND },
-{ "not",Xapian::Query::OP_AND_NOT,  Xapian::Query::MatchAll,
+{ "not",Xapian::Query::OP_AND_NOT,  SEXP_INITIAL_MATCH_ALL,
   SEXP_FLAG_NONE },
-{ "of", Xapian::Query::OP_AND,  Xapian::Query::MatchAll,
+{ "of", Xapian::Query::OP_AND,  SEXP_INITIAL_MATCH_ALL,
   SEXP_FLAG_DO_EXPAND },
-{ "or", Xapian::Query::OP_OR,   Xapian::Query::MatchNothing,
+{ "or", Xapian::Query::OP_OR,   

[PATCH 1/2] test: showcase thread-unsafe s-expression query parser

2023-04-03 Thread Kevin Boulain
The test fails quite reliably for me:
  T810-tsan: Testing run code with TSan enabled against the library
   PASS   create
   PASS   query
   FAIL   sexp query
  --- T810-tsan.3.EXPECTED2023-04-03 19:53:04.400771102 +
  +++ T810-tsan.3.OUTPUT  2023-04-03 19:53:04.402771109 +
  @@ -1,2 +1,44 @@
   == stdout ==
   == stderr ==
  +==
  +WARNING: ThreadSanitizer: data race (pid=21372)
  +  Read of size 4 at 0x7b100188 by thread T2:
  +#0 Xapian::Internal::intrusive_ptr::intrusive_ptr(Xapian::Internal::intrusive_ptr const&) .../xapian/intrusive_ptr.h:107 (libnotmuch.so.5+0x3017b)
  +#1 Xapian::Query::Query(Xapian::Query const&) .../xapian/query.h:328 (libnotmuch.so.5+0x2fcbe)
  +#2 _sexp_to_xapian_query lib/parse-sexp.cc:707 (libnotmuch.so.5+0x43f1f)
  +#3 _notmuch_sexp_string_to_xapian_query(_notmuch_database*, char const*, Xapian::Query&) lib/parse-sexp.cc:729 (libnotmuch.so.5+0x44207)
  +#4 _notmuch_query_ensure_parsed_sexpr lib/query.cc:240 (libnotmuch.so.5+0x2c59e)
  +#5 _notmuch_query_ensure_parsed lib/query.cc:258 (libnotmuch.so.5+0x2c646)
  +#6 _notmuch_query_search_documents lib/query.cc:362 (libnotmuch.so.5+0x2cc0e)
  +#7 notmuch_query_search_messages lib/query.cc:350 (libnotmuch.so.5+0x2cb76)
  +#8 thread CWD/test2.c:17 (test2+0x4012f4)
  +
  +  Previous write of size 4 at 0x7b100188 by thread T1:
  +#0 Xapian::Internal::intrusive_ptr::intrusive_ptr(Xapian::Internal::intrusive_ptr const&) .../xapian/intrusive_ptr.h:107 (libnotmuch.so.5+0x3018e)
  +#1 Xapian::Query::Query(Xapian::Query const&) .../xapian/query.h:328 (libnotmuch.so.5+0x2fcbe)
  +#2 _sexp_to_xapian_query lib/parse-sexp.cc:707 (libnotmuch.so.5+0x43f1f)
  +#3 _notmuch_sexp_string_to_xapian_query(_notmuch_database*, char const*, Xapian::Query&) lib/parse-sexp.cc:729 (libnotmuch.so.5+0x44207)
  +#4 _notmuch_query_ensure_parsed_sexpr lib/query.cc:240 (libnotmuch.so.5+0x2c59e)
  +#5 _notmuch_query_ensure_parsed lib/query.cc:258 (libnotmuch.so.5+0x2c646)
  +#6 _notmuch_query_search_documents lib/query.cc:362 (libnotmuch.so.5+0x2cc0e)
  +#7 notmuch_query_search_messages lib/query.cc:350 (libnotmuch.so.5+0x2cb76)
  +#8 thread CWD/test2.c:17 (test2+0x4012f4)
  +
  +  Location is heap block of size 56 at 0x7b100180 allocated by main thread:
  +#0 operator new(unsigned long)  (libtsan.so.2+0x8ba83)
  +#1 Xapian::Query::Query(std::__cxx11::basic_string, std::allocator > const&, unsigned int, unsigned int)  (libxapian.so.30+0x9f200)
  +#2 __static_initialization_and_destruction_0(int, int)  (libxapian.so.30+0xa27ac)
  +#3 _GLOBAL__sub_I_query.cc  (libxapian.so.30+0xa286d)
  +#4 call_init  (ld-linux-x86-64.so.2+0x5e1d)
  +
  +  Thread T2 (tid=21375, running) created by main thread at:
  +#0 pthread_create  (libtsan.so.2+0x62de6)
  +#1 main CWD/test2.c:24 (test2+0x4013ba)
  +
  +  Thread T1 (tid=21374, running) created by main thread at:
  +#0 pthread_create  (libtsan.so.2+0x62de6)
  +#1 main CWD/test2.c:23 (test2+0x401380)
  +
  +SUMMARY: ThreadSanitizer: data race .../xapian/intrusive_ptr.h:107 in Xapian::Internal::intrusive_ptr::intrusive_ptr(Xapian::Internal::intrusive_ptr const&)
  +==
  +ThreadSanitizer: reported 1 warnings
---
 test/T810-tsan.sh | 40 
 1 file changed, 40 insertions(+)

diff --git a/test/T810-tsan.sh b/test/T810-tsan.sh
index 7e877b27..c9008c6b 100755
--- a/test/T810-tsan.sh
+++ b/test/T810-tsan.sh
@@ -89,4 +89,44 @@ cat <<EOF > EXPECTED
 EOF
 test_expect_equal_file EXPECTED OUTPUT
 
+if [ $NOTMUCH_HAVE_SFSEXP -eq 1 ]; then
+test_begin_subtest "sexp query"
+test_subtest_known_broken
+test_C ${MAIL_DIR} ${MAIL_DIR}-2 <<EOF
+#include <notmuch-test.h>
+#include <pthread.h>
+
+void *thread (void *arg) {
+  char *mail_dir = arg;
+  notmuch_database_t *db;
+  /*
+   * Query generation from s-expression used the thread-unsafe
+   * Xapian::Query::MatchAll.
+   */
+  EXPECT0(notmuch_database_open_with_config (mail_dir,
+ NOTMUCH_DATABASE_MODE_READ_ONLY,
+ NULL, NULL, &db, NULL));
+  notmuch_query_t *query;
+  EXPECT0(notmuch_query_create_with_syntax (db, "(from *)", NOTMUCH_QUERY_SYNTAX_SEXP, &query));
+  notmuch_messages_t *messages;
+  EXPECT0(notmuch_query_search_messages (query, &messages));
+  return NULL;
+}
+
+int main (int argc, char **argv) {
+  pthread_t t1, t2;
+  EXPECT0(pthread_create (&t1, NULL, thread, argv[1]));
+  EXPECT0(pthread_create (&t2, NULL, thread, argv[2]));
+  EXPECT0(pthread_join (t1, 

[PATCH] ruby: query: fix get sort

2023-04-03 Thread Felipe Contreras
The order was wrong, right now `query.sort` doesn't return a number.

Signed-off-by: Felipe Contreras 
---
 bindings/ruby/query.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/bindings/ruby/query.c b/bindings/ruby/query.c
index 8a2b4d3d..077def02 100644
--- a/bindings/ruby/query.c
+++ b/bindings/ruby/query.c
@@ -45,7 +45,7 @@ notmuch_rb_query_get_sort (VALUE self)
 
 Data_Get_Notmuch_Query (self, query);
 
-return FIX2INT (notmuch_query_get_sort (query));
+return INT2FIX (notmuch_query_get_sort (query));
 }
 
 /*
-- 
2.40.0+fc1



Re: Reimagining notmuch-git/nmbug

2023-04-03 Thread Felipe Contreras
On Mon, Apr 3, 2023 at 2:40 PM David Bremner  wrote:
>
> David Bremner  writes:
>
> > Indeed that speeds up the initial clone on this machine from 39 minutes
> > (I switched machines) to 30s. I will play with it a bit more, and report
> > back.
>
> It's not a showstopper, but "git pull" takes about 1/2 the wall time
> (about 2/3 of the CPU time) of the original clone, even if there is only
> one tag changed.

Yes, as things stand every fetch should take about as much time as the
original clone, since it re-exports every message.

> Two potential improvements I can think of.
>
> - notmuch-dump.c calls notmuch_query_set_sort (query,
>   NOTMUCH_SORT_UNSORTED). I think I managed to do this (diff below),
>   but performance gain was negligible.

OK.

> - Since you cache the lastmod value, you should be able to use it in a
>   query. This does make a big difference in my experiments. I had to
>   remove the 'deleteall' (otherwise only the changed messages are left
>   in the git repo). I'm not 100% sure this is correct; hopefully you see
>   quicker than I. In any case the lastmod query is what notmuch-git
>   uses.

That should work to update existing tags, but how are we going to
detect if a message has disappeared? Or is that not a thing?

> diff --git a/git-remote-nm b/git-remote-nm
> index c668b38..cabea26 100755
> --- a/git-remote-nm
> +++ b/git-remote-nm
> @@ -148,9 +148,11 @@ def wr_import(ref)
>wr_data("lastmod: %d\n" % ($lastmod || 0))
>wr_l 'from refs/notmuch/master^0' if $lastmod
>
> -  wr_l 'deleteall'
> +#  wr_l 'deleteall'
>
> -  $db.query('').search_messages.each do |msg|
> +  $query=$db.query("lastmod:%d.." % ($lastmod || 0) )

Does "lastmod:0.." get all the revisions? If so, it might make sense
to set $lastmod to 0 initially.

Then we could unconditionally do:

$db.query('lastmod:%d..' % $lastmod, sort: Notmuch::SORT_UNSORTED)

-- 
Felipe Contreras


Re: Reimagining notmuch-git/nmbug

2023-04-03 Thread David Bremner
David Bremner  writes:

> Indeed that speeds up the initial clone on this machine from 39 minutes
> (I switched machines) to 30s. I will play with it a bit more, and report
> back.

It's not a showstopper, but "git pull" takes about 1/2 the wall time
(about 2/3 of the CPU time) of the original clone, even if there is only
one tag changed.

Two potential improvements I can think of.

- notmuch-dump.c calls notmuch_query_set_sort (query,
  NOTMUCH_SORT_UNSORTED). I think I managed to do this (diff below),
  but performance gain was negligible. 

- Since you cache the lastmod value, you should be able to use it in a
  query. This does make a big difference in my experiments. I had to
  remove the 'deleteall' (otherwise only the changed messages are left
  in the git repo). I'm not 100% sure this is correct; hopefully you see
  quicker than I. In any case the lastmod query is what notmuch-git
  uses.

diff --git a/git-remote-nm b/git-remote-nm
index c668b38..cabea26 100755
--- a/git-remote-nm
+++ b/git-remote-nm
@@ -148,9 +148,11 @@ def wr_import(ref)
   wr_data("lastmod: %d\n" % ($lastmod || 0))
   wr_l 'from refs/notmuch/master^0' if $lastmod
 
-  wr_l 'deleteall'
+#  wr_l 'deleteall'
 
-  $db.query('').search_messages.each do |msg|
+  $query=$db.query("lastmod:%d.." % ($lastmod || 0) )
+  $query.sort=Notmuch::SORT_UNSORTED
+  $query.search_messages.each do |msg|
 hash = Blake2b.hex(msg.message_id, Blake2b::Key.none, 2)
 dir1, dir2 = hash[..1], hash[2..]
     wr_l 'M 644 inline %s/%s/%s/tags' % [dir1, dir2, encode_filename(msg.message_id)]




Re: Reimagining notmuch-git/nmbug

2023-04-03 Thread David Bremner
Felipe Contreras  writes:

> By distributing the files in multiple directories like notmuch-git
> does using BLAKE2b, the operation is much faster.
>
> I've pushed the changes. There's now a dependency, but you can just
> `gem install blake2b`.
>
> I'm able to clone the database of the performance corpus in 5 seconds:
>
> % git clone --bare nm::$PWD/mail mail.git

Indeed that speeds up the initial clone on this machine from 39 minutes
(I switched machines) to 30s. I will play with it a bit more, and report
back.

I had just finished a pretty graph showing nonlinear growth of the old
version, but I guess nobody cares now ;)

d



Re: Data loss

2023-04-03 Thread Michael J Gruber
On Mon, Apr 3, 2023 at 20:17, Fulvio Pizzigoni wrote:
>
> Hi Carl, and thanks for your prompt answer.
>
> As you suggested, I am adding notmuch@notmuchmail.org as well.
>
> This is what I did:
> fulvio@linux:~$ notmuch setup
> Your full name [fulvio]:
> Your primary email address [my address]: return
> Additional email address [Press 'Enter' if none]: return
> Top-level directory of your email archive [/home/fulvio/.mutt]: return
> Tags to apply to all new messages (separated by spaces) [unread inbox]: return
> Tags to exclude when searching messages (separated by spaces) [deleted spam]: 
> return
> fulvio@linux:~$
>
> After this my .mutt directory (~ 4 GB of mailboxes) looks like this:
> fulvio@linux:~$ ll .mutt
> totale 12
> drwxr-xr-x 2 fulvio fulvio 4096 18 feb 20.32 cur
> drwxr-xr-x 2 fulvio fulvio 4096 18 feb 20.32 new
> drwxr-xr-x 2 fulvio fulvio 4096 18 feb 20.32 tmp
> fulvio@linux:~$ du -h .mutt
> 4,0K .mutt/new
> 4,0K .mutt/tmp
> 4,0K .mutt/cur
> 16K .mutt
>
> That's all.
>
> Terrible!
>
> What happened?
>
> Thanks in advance.
>
> Fulvio

First of all: Do you have a back-up?

notmuch itself does not delete mails, as Carl pointed out. In addition
to notmuch you mentioned notmuch-mutt. Did you run that manually or
using some mutt config? It creates a maildir of symlinks with search
results. In order to do so, it deletes the maildir ... Usually this
sits in a cache dir, though.

Michael

> On Mon, Apr 03, 2023 at 09:27:22AM -0700, Carl Worth wrote:
> > Hi Fulvio,
> >
> > I've never used notmuch-mutt.
> >
> > But notmuch itself doesn't delete any mail. It's really paranoid that
> > way, (knowing how valuable mail is).
> >
> > I would suggest you write an email to the notmuch@notmuchmail.org
> > mailing list where you will be able to reach more people likely to have
> > experience with all of the software you were using.
> >
> > And if you could provide more details on the actual steps you used, that
> > would be useful. For example, you said "configuration process" and "at
> > the end". But what actual commands were you running on those steps, for
> > example?
> >
> > -Carl
> >
> > On Mon, Apr 03 2023, Fulvio Pizzigoni wrote:
> > > Hi.
> > >
> > > I have installed packages notmuch and notmuch-mutt.
> > >
> > > During the configuration process I indicated the directory used by
> > > Mutt for its mailboxes.
> > >
> > > The configuration process did not show any alert.
> > >
> > > At the end the mailboxes in that directory had been removed; 3 new
> > > empty subdirectories had been created: cur, new, tmp.
> > >
> > > Was the data irremediably lost?
> > >
> > > Thanks in advance.
> > >
> > > Fulvio Pizzigoni


Re: Data loss

2023-04-03 Thread David Bremner
Fulvio Pizzigoni  writes:
>
> After this my .mutt directory (~ 4 GB of mailboxes) looks like this:
> fulvio@linux:~$ ll .mutt
> totale 12
> drwxr-xr-x 2 fulvio fulvio 4096 18 feb 20.32 cur
> drwxr-xr-x 2 fulvio fulvio 4096 18 feb 20.32 new
> drwxr-xr-x 2 fulvio fulvio 4096 18 feb 20.32 tmp

Hi Fulvio;

I don't understand your situation yet, but I have a preliminary
question.

Is the date on your computer more or less correct? You mention the
directories cur, new, and tmp being created by running notmuch setup; on
the other hand your listing above shows they already existed in
February.

d


Re: Data loss

2023-04-03 Thread Fulvio Pizzigoni






Hi Carl, and thanks for your prompt answer.

As you suggested, I am adding notmuch@notmuchmail.org as well.

This is what I did:
fulvio@linux:~$ notmuch setup 
Your full name [fulvio]: 
Your primary email address [my address]: return
Additional email address [Press 'Enter' if none]: return
Top-level directory of your email archive [/home/fulvio/.mutt]: return
Tags to apply to all new messages (separated by spaces) [unread inbox]: return
Tags to exclude when searching messages (separated by spaces) [deleted spam]: 
return
fulvio@linux:~$

After this my .mutt directory (~ 4 GB of mailboxes) looks like this:
fulvio@linux:~$ ll .mutt
totale 12
drwxr-xr-x 2 fulvio fulvio 4096 18 feb 20.32 cur
drwxr-xr-x 2 fulvio fulvio 4096 18 feb 20.32 new
drwxr-xr-x 2 fulvio fulvio 4096 18 feb 20.32 tmp
fulvio@linux:~$ du -h .mutt
4,0K .mutt/new
4,0K .mutt/tmp
4,0K .mutt/cur
16K .mutt

That's all.

Terrible!

What happened?

Thanks in advance.

Fulvio


On Mon, Apr 03, 2023 at 09:27:22AM -0700, Carl Worth wrote:
> Hi Fulvio,
> 
> I've never used notmuch-mutt.
> 
> But notmuch itself doesn't delete any mail. It's really paranoid that
> way, (knowing how valuable mail is).
> 
> I would suggest you write an email to the notmuch@notmuchmail.org
> mailing list where you will be able to reach more people likely to have
> experience with all of the software you were using.
> 
> And if you could provide more details on the actual steps you used, that
> would be useful. For example, you said "configuration process" and "at
> the end". But what actual commands were you running on those steps, for
> example?
> 
> -Carl
> 
> On Mon, Apr 03 2023, Fulvio Pizzigoni wrote:
> > Hi.
> >
> > I have installed packages notmuch and notmuch-mutt.
> >
> > During the configuration process I indicated the directory used by
> > Mutt for its mailboxes.
> >
> > The configuration process did not show any alert.
> >
> > At the end, the mailboxes in that directory had been removed; 3 new
> > empty sub-directories were created: cur, new, tmp.
> >
> > Was the data irremediably lost?
> >
> > Thanks in advance.
> >
> > Fulvio Pizzigoni


Re: Reimagining notmuch-git/nmbug

2023-04-03 Thread Felipe Contreras
On Mon, Apr 3, 2023 at 4:49 AM David Bremner  wrote:

> Performance-wise the initial clone seems pretty slow. For my 600k
> messages I have been waiting a while now.  htop tells me that
> git-fast-import has about 45 minutes of CPU time at this point.  This
> machine is not that fast, but for comparison an initial (i.e. fresh
> repo, no caching) "notmuch git commit" takes about 15-20s.

I found the problem. If all the files are in the same directory, `git
fast-import` spends a lot of time comparing all the paths.

By distributing the files in multiple directories like notmuch-git
does using BLAKE2b, the operation is much faster.

I've pushed the changes. There's now a dependency, but you can just
`gem install blake2b`.

I'm able to clone the database of the performance corpus in 5 seconds:

% git clone --bare nm::$PWD/mail mail.git
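
The directory fan-out described above can be sketched in a few lines. This
is only an illustration, not the code from git-remote-nm or notmuch-git: the
function name `sharded_path`, the `tags` prefix, the two-byte digest and the
two-level layout are all assumptions chosen for the example.

```python
from hashlib import blake2b
from urllib.parse import quote


def sharded_path(message_id, prefix="tags"):
    """Map a message id to a path spread over two directory levels.

    Hypothetical layout: <prefix>/<2 hex chars>/<2 hex chars>/<quoted id>.
    """
    # Percent-encode the id so it is safe to use as a single path component.
    name = quote(message_id, safe="")
    # A 2-byte BLAKE2b digest gives 4 hex characters, i.e. 256 * 256 buckets.
    digest = blake2b(name.encode("utf-8"), digest_size=2).hexdigest()
    return "/".join([prefix, digest[:2], digest[2:], name])


if __name__ == "__main__":
    print(sharded_path("some-message@example.org"))
```

With a layout like this no single directory holds more than a small share of
the entries, which is the property that makes the import fast.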

Cheers.

-- 
Felipe Contreras


Re: Reimagining notmuch-git/nmbug

2023-04-03 Thread Felipe Contreras
On Mon, Apr 3, 2023 at 4:49 AM David Bremner  wrote:
>
> Felipe Contreras  writes:
>
> > Hi,
> >
> > I noticed you promoted notmuch-git as a user tool to toy around with it.
> >
> > Very quickly I realized that most of what it does is something I've
> > been working on for at least 10 years: making git work with other
> > tools.
> >
> > I presume you haven't heard of git remote-helpers [1], because they do
> > precisely what notmuch-git is trying to do.
> >
> > As a proof of concept I created a remote helper for notmuch [2]. If
> > you have this script (`git-remote-nm`) anywhere in your path, git will
> > interpret URLs prefixed with "nm::" as notmuch transports, and you can
> > do:
> >
> >   git clone nm::$HOME/mail
>
> I'm intrigued (and indeed I hadn't really thought about the degree to
> which we were re-inventing git-fast-import and friends); however so far
> my experiments did not get far enough to say anything conclusive.
>
> I tried your script with the bindings from master (554690) but it does
> not seem to like my split configuration, where the database lives in
> ~/.local/share/share/notmuch/default/xapian.

Just clone the xapian database instead of the Maildir:

% git clone nm::$HOME/.local/share/share/notmuch/default/

> Performance-wise the initial clone seems pretty slow. For my 600k
> messages I have been waiting a while now.  htop tells me that
> git-fast-import has about 45 minutes of CPU time at this point.  This
> machine is not that fast, but for comparison an initial (i.e. fresh
> repo, no caching) "notmuch git commit" takes about 15-20s.

That's weird. In my tests generating the fast-export output is almost
instantaneous, which means `git fast-import` is the one that is slow.

And it seems it starts to get slow after a certain point, so perhaps
it's not optimized to receive many files in one go.

> If you need a larger corpus of messages to play with, the notmuch
> performance suite includes about 400k messages, and running T00-new.sh
> will build a notmuch database that you can clone.

I tried that: the database has 194562 messages, and it takes 1:43
minutes to clone on my machine.

It's weird that it takes so long on your machine.

Can you try to hardcode a search query to limit the number of messages?

Just put something in here:

$db.query('').search_messages.each

Cheers.

-- 
Felipe Contreras


Re: Reimagining notmuch-git/nmbug

2023-04-03 Thread David Bremner
David Bremner  writes:

>
> I'm intrigued (and indeed I hadn't really thought about the degree to
> which we were re-inventing git-fast-import and friends); however so far
> my experiments did not get far enough to say anything conclusive.
>

I did manage to finish, about 70 minutes elapsed.

Although you're probably right that a file of tags is the right
representation (it is what git-annex uses also), I think we'd need to
define a custom merge driver to take unions of lists in the same way
that git-annex does. Otherwise merging will be less automagic than it is
now.
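
To make the idea of a union merge concrete, here is a minimal sketch,
assuming one tag per line in each file. It is not git-annex's
implementation; the function names are hypothetical. It follows the
convention for custom git merge drivers, which receive the ancestor, current
and other versions as files (the %O, %A and %B placeholders) and leave the
result in the current-version file.

```python
def read_tags(path):
    """Read one tag per line, ignoring blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def union_merge(ours, theirs):
    """Return the sorted union of two tag lists, without duplicates."""
    return sorted(set(ours) | set(theirs))


def merge_files(ancestor_path, ours_path, theirs_path):
    """Merge in the style of a git merge driver: overwrite the 'ours' file.

    The ancestor is deliberately ignored in this sketch, so a tag removed
    on one side reappears if the other side still has it.
    """
    merged = union_merge(read_tags(ours_path), read_tags(theirs_path))
    with open(ours_path, "w", encoding="utf-8") as handle:
        handle.writelines(tag + "\n" for tag in merged)
    return 0  # exit status 0 tells git the merge was clean
```

Because the ancestor is ignored, a plain union never conflicts but also
never propagates tag removals; a real driver would have to decide how to
treat those.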


[PATCH] CLI/git: add reset command

2023-04-03 Thread David Bremner
Sometimes merging is not what we want with tags; in particular it
tends to keep tags in the local repo that have been removed elsewhere.
This commit provides a new reset command; the reset itself is trivial,
but the work is to provide a safety check that uses the existing
--force and git.safe_fraction machinery.
---
 notmuch-git.py   | 37 +++--
 test/T850-git.sh | 46 +-
 2 files changed, 80 insertions(+), 3 deletions(-)

diff --git a/notmuch-git.py b/notmuch-git.py
index 57098aae..80787814 100644
--- a/notmuch-git.py
+++ b/notmuch-git.py
@@ -369,7 +369,7 @@ class CachedIndex:
         _git(args=['read-tree', self.current_treeish], wait=True)
 
 
-def check_safe_fraction(status):
+def _check_fraction(change):
     safe = 0.1
     conf = _notmuch_config_get ('git.safe_fraction')
     if conf and conf != '':
@@ -380,7 +380,7 @@ def check_safe_fraction(status):
         _LOG.error('No existing tags with given prefix, stopping.'.format(safe))
         _LOG.error('Use --force to override.')
         exit(1)
-    change = len(status['added'])+len(status['deleted'])
+
     fraction = change/total
     _LOG.debug('total messages {:d}, change: {:d}, fraction: {:f}'.format(total,change,fraction))
     if fraction > safe:
@@ -388,6 +388,25 @@ def check_safe_fraction(status):
         _LOG.error('Use --force to override or reconfigure git.safe_fraction.')
         exit(1)
 
+def check_safe_fraction(status):
+
+    change = len(status['added'])+len(status['deleted'])
+    _check_fraction(change)
+
+def check_diff_fraction():
+
+    # check number of directories (i.e. messages) changed.
+    change_set = set()
+
+    with _git(args=['diff', '--name-only', 'HEAD', '@{upstream}'],
+              stdout=_subprocess.PIPE) as git:
+        for path in git.stdout:
+            change_set.add(_os.path.dirname(path))
+
+    change=len(change_set)
+    _check_fraction(change)
+
+
 def commit(treeish='HEAD', message=None, force=False):
     """
     Commit prefix-matching tags from the notmuch database to Git.
@@ -620,6 +639,15 @@ def push(repository=None, refspecs=None):
     _git(args=args, wait=True)
 
 
+def reset(force=False):
+    """
+    reset the local git branch to match the remote one
+    """
+    if not force:
+        check_diff_fraction()
+
+    _git(args=["reset","--soft","origin/master"],wait=True)
+
 def status():
     """
     Show pending updates in notmuch or git repo.
@@ -1048,6 +1076,7 @@ if __name__ == '__main__':
             'merge',
             'pull',
             'push',
+            'reset',
             'status',
             ]:
         func = locals()[command]
@@ -1142,6 +1171,10 @@ if __name__ == '__main__':
                     'Refspec (usually a branch name) to push.  See '
                     'the <refspec> entry in the OPTIONS section of '
                     'git-push(1) for other possibilities.'))
+        elif command == 'reset':
+            subparser.add_argument(
+                '-f', '--force', action='store_true',
+                help='reset a large fraction of tags.')
 
     args = parser.parse_args()
 
diff --git a/test/T850-git.sh b/test/T850-git.sh
index 55cec78a..ae6e7a03 100755
--- a/test/T850-git.sh
+++ b/test/T850-git.sh
@@ -213,6 +213,51 @@ cat <<EOF > EXPECTED
 EOF
 test_expect_equal_file EXPECTED OUTPUT
 
+test_begin_subtest "reset"
+notmuch git -C reset.git -p '' clone remote.git
+notmuch git -C reset.git checkout --force
+notmuch tag +test4 id:20091117190054.gu3...@dottiness.seas.harvard.edu
+notmuch git -C remote.git commit --force
+notmuch tag -test4 id:20091117190054.gu3...@dottiness.seas.harvard.edu
+notmuch git -C reset.git fetch
+notmuch git -C reset.git reset
+notmuch git -C reset.git checkout --force
+notmuch dump id:20091117190054.gu3...@dottiness.seas.harvard.edu | grep -v '^#' > OUTPUT
+cat <<EOF > EXPECTED
++inbox +signed +test2 +test3 +test4 +unread -- id:20091117190054.gu3...@dottiness.seas.harvard.edu
+EOF
+test_expect_equal_file EXPECTED OUTPUT
+
+test_begin_subtest "reset (require force for large change)"
+notmuch git -C reset2.git -p '' clone remote.git
+notmuch git -C reset2.git checkout --force
+notmuch tag +test5 '*'
+notmuch git -C remote.git commit --force
+notmuch tag -test5 '*'
+notmuch git -C reset2.git fetch
+test_expect_code 1 "notmuch git -C reset2.git -l debug reset"
+
+test_begin_subtest "reset (don't require force for large change to one message)"
+notmuch git -C reset3.git -p '' clone remote.git
+notmuch git -C reset3.git checkout --force
+notmuch dump id:20091117190054.gu3...@dottiness.seas.harvard.edu > BEFORE
+for tag in $(seq 1 100); do
+    notmuch tag +$tag id:20091117190054.gu3...@dottiness.seas.harvard.edu
+done
+notmuch git -C remote.git commit --force
+notmuch restore < BEFORE
+notmuch git -C reset3.git fetch
+test_expect_code 0 "notmuch git -C reset3.git -l debug reset"
+
+test_begin_subtest "reset --force"
+notmuch git -C reset4.git -p '' clone remote.git

Re: Reimagining notmuch-git/nmbug

2023-04-03 Thread David Bremner
Felipe Contreras  writes:

> Hi,
>
> I noticed you promoted notmuch-git as a user tool to toy around with it.
>
> Very quickly I realized that most of what it does is something I've
> been working on for at least 10 years: making git work with other
> tools.
>
> I presume you haven't heard of git remote-helpers [1], because they do
> precisely what notmuch-git is trying to do.
>
> As a proof of concept I created a remote helper for notmuch [2]. If
> you have this script (`git-remote-nm`) anywhere in your path, git will
> interpret URLs prefixed with "nm::" as notmuch transports, and you can
> do:
>
>   git clone nm::$HOME/mail

I'm intrigued (and indeed I hadn't really thought about the degree to
which we were re-inventing git-fast-import and friends); however so far
my experiments did not get far enough to say anything conclusive.

I tried your script with the bindings from master (554690) but it does
not seem to like my split configuration, where the database lives in
~/.local/share/share/notmuch/default/xapian.

$ git clone nm::/home/bremner/Maildir 
Cloning into 'Maildir'...
/home/bremner/.config/scripts/git-remote-nm:164:in `initialize': failed to read/write file (Notmuch::FileError)
        from /home/bremner/.config/scripts/git-remote-nm:164:in `new'
        from /home/bremner/.config/scripts/git-remote-nm:164:in `<main>'

If I make a fake .notmuch directory, then it seems to work.  I'm not
sure if this is an issue with the bindings or with the script.
Conceptually there is also the question of how to handle split
configurations as a URL.

Performance-wise the initial clone seems pretty slow. For my 600k
messages I have been waiting a while now.  htop tells me that
git-fast-import has about 45 minutes of CPU time at this point.  This
machine is not that fast, but for comparison an initial (i.e. fresh
repo, no caching) "notmuch git commit" takes about 15-20s.

If you need a larger corpus of messages to play with, the notmuch
performance suite includes about 400k messages, and running T00-new.sh
will build a notmuch database that you can clone.