First draft of logging functionality.

2010-10-12 Thread Michal Sojka
On Tue, 12 Oct 2010, David Bremner wrote:
> On Tue, 12 Oct 2010 11:06:17 +0200, Michal Sojka wrote:
> 
> > I'm not sure whether implementing the logging facility outside of the
> > notmuch library is a good thing. If somebody uses a third-party tool
> > (such as the python bindings) to manipulate their tags, those changes
> > won't appear in the log.
> 
> Yeah, thanks for that. I had been worrying about the same thing, but
> your message helped clarify things for me.
> 
> I guess log files should be opened in notmuch_database_open, and the
> actual logging in this case could happen from notmuch_message_add_tag.
> 
> From an atomicity point of view it might make more sense to store up a
> list of log lines, and dump them all from _notmuch_message_sync.
> One could attach a log buffer to a message, and flush that atomically
> when syncing the message back to the database.

Yes, the above sounds good to me.

> In this context, it is a little more tedious to have more than one log
> file.

Why multiple log files? You may have the buffers for message logs in
memory, right?

Another thing to keep in mind is how it will behave with 'notmuch
restore'. If you do a dump followed by a restore, you will have a very
long log with no useful information in it. Some optimization may be
employed here.

Bye
-Michal


First draft of logging functionality.

2010-10-12 Thread Michal Sojka
On Mon, 11 Oct 2010, david at tethera.net wrote:
> The patches following this message are my first attempt at
> implementing atomic logging for notmuch.  The idea is that such logs
> could be useful in synchronizing notmuch instances.
> 
> Feedback of any kind is welcome. I'm particularly interested in
> comments about the log format and performance.

Hi David,

I'm not sure whether implementing the logging facility outside of the
notmuch library is a good thing. If somebody uses a third-party tool
(such as the python bindings) to manipulate their tags, those changes
won't appear in the log.

-Michal


First draft of logging functionality.

2010-10-12 Thread David Bremner
On Tue, 12 Oct 2010 13:38:42 +0200, Michal Sojka wrote:
> 
> Why multiple log files? You may have the buffers for message logs in
> memory, right?

Well, maybe one log for tag operations, one for adding messages, etc... 
But it is easy enough to mark log entries by what they are.

> 
> Another thing to keep in mind is how it will behave with 'notmuch
> restore'. If you do a dump followed by a restore, you will have a very
> long log with no useful information in it. Some optimization may be
> employed here.
> 

Yes. This seems hard to optimize internally, but I was thinking of some
"log compression" function that comes up with a minimal equivalent set
of operations. I had in mind that this could be used to sync:
concatenate all the logs, and then compress to a minimal set of
operations.  This is still not completely thought out...
Another issue is that the buffer could get rather big during a restore,
but this is presumably fixable by flushing it if it gets too large.
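
One possible reading of the "log compression" idea, as a minimal Python
sketch rather than anything from the patches: keep only the last
operation for each (message id, tag) pair, since an earlier +tag or -tag
on the same pair is overridden by a later one. The function name
compress_log and the tuple layout are assumptions made for illustration,
and the input is assumed to be in chronological order.

```python
def compress_log(entries):
    """Reduce a tag log to the last operation per (message id, tag) pair.

    Each entry is (timestamp, message_id, op), where op is "+tag" or "-tag".
    Entries are assumed to be in chronological order.
    """
    last = {}
    for timestamp, message_id, op in entries:
        key = (message_id, op[1:])
        # Remove any earlier entry first so the re-inserted key moves to the
        # end; the output then stays ordered by the surviving operations.
        last.pop(key, None)
        last[key] = (timestamp, message_id, op)
    return list(last.values())
```

For syncing, logs from several instances could be concatenated and
sorted by timestamp before being passed in; how to break ties between
instances with skewed clocks is left open here.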

All the best,

David


First draft of logging functionality.

2010-10-12 Thread Rob Browning
david at tethera.net writes:

> The patches following this message are my first attempt at
> implementing atomic logging for notmuch.  The idea is that such logs
> could be useful in synchronizing notmuch instances.

For this to be completely safe, I suspect it may need to be adjusted to
do write ahead logging or something similar.  Otherwise operations could
be lost.

i.e. notmuch would atomically and safely write the intended (tag)
operation to the log, then perform the tag.  On startup, notmuch would
need to scan the log to detect and apply operations that hadn't been
fully completed (presumably due to a crash).
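
A minimal Python sketch of that write-ahead pattern, under stated
assumptions: the record layout, the function names (log_record,
tag_with_wal, apply_op, recover) and the in-memory dict standing in for
the notmuch database are all hypothetical. Replaying is safe here only
because adding or removing a tag is idempotent.

```python
import json
import os


def log_record(log_path, record):
    """Append one JSON record and force it to disk before returning."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
        log.flush()
        os.fsync(log.fileno())


def apply_op(tags, message_id, op):
    """Apply "+tag" or "-tag" to an in-memory {message_id: set} store."""
    current = tags.setdefault(message_id, set())
    if op.startswith("+"):
        current.add(op[1:])
    else:
        current.discard(op[1:])


def tag_with_wal(log_path, tags, seq, message_id, op):
    """Record the intent, perform the operation, then record completion."""
    log_record(log_path, {"seq": seq, "state": "intent", "id": message_id, "op": op})
    apply_op(tags, message_id, op)
    log_record(log_path, {"seq": seq, "state": "done"})


def recover(log_path, tags):
    """Re-apply intents with no matching 'done' record; return how many."""
    pending = {}
    if os.path.exists(log_path):
        with open(log_path, encoding="utf-8") as log:
            for line in log:
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    break  # torn final line from a crash in mid-write
                if record["state"] == "intent":
                    pending[record["seq"]] = record
                else:
                    pending.pop(record["seq"], None)
    for seq in sorted(pending):
        record = pending[seq]
        apply_op(tags, record["id"], record["op"])
        log_record(log_path, {"seq": seq, "state": "done"})
    return len(pending)
```

The cost of this approach is two synced writes per operation, which
would need measuring against the performance figures discussed in this
thread.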

More generally, while thinking about sync/logging a few days ago, I
wondered about using sqlite.  That would help with atomicity, rollback,
synchronizing multiple readers/writers, etc.  It might also make
operations more efficient once we implement all the features we want.

For example, with the log information in sqlite, a separate notmuch sync
to another machine could be reading from the log (and with limitations,
writing) in parallel with normal notmuch operations.  Depending on how
we decide to handle sync with multiple peers, the log may also need to
track which peers have seen what, prune appropriately, etc.
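
To make the sqlite idea concrete, here is a small Python sketch using
the standard sqlite3 module. The schema (tables tag_log and peer) and
the helper names are assumptions invented for illustration, not a
proposal for the actual format; each peer simply remembers the highest
log sequence number it has seen.

```python
import sqlite3


def open_log(path=":memory:"):
    """Open (or create) a log database with hypothetical tables."""
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE IF NOT EXISTS tag_log (
            seq        INTEGER PRIMARY KEY AUTOINCREMENT,
            stamp      INTEGER NOT NULL,
            message_id TEXT NOT NULL,
            op         TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS peer (
            name     TEXT PRIMARY KEY,
            last_seq INTEGER NOT NULL DEFAULT 0
        );
    """)
    return db


def append(db, stamp, message_id, op):
    """Add one log entry inside a transaction."""
    with db:  # commits on success, rolls back on an exception
        db.execute(
            "INSERT INTO tag_log (stamp, message_id, op) VALUES (?, ?, ?)",
            (stamp, message_id, op),
        )


def unseen(db, peer):
    """Return the entries the named peer has not yet been sent."""
    row = db.execute("SELECT last_seq FROM peer WHERE name = ?", (peer,)).fetchone()
    last_seq = row[0] if row else 0
    return db.execute(
        "SELECT seq, stamp, message_id, op FROM tag_log WHERE seq > ? ORDER BY seq",
        (last_seq,),
    ).fetchall()


def mark_seen(db, peer, seq):
    """Record that the peer has seen everything up to and including seq."""
    with db:
        db.execute(
            "INSERT OR REPLACE INTO peer (name, last_seq) VALUES (?, ?)",
            (peer, seq),
        )


def prune(db):
    """Drop entries that every known peer has already seen."""
    with db:
        db.execute(
            "DELETE FROM tag_log WHERE seq <= "
            "(SELECT COALESCE(MIN(last_seq), 0) FROM peer)"
        )
```

AUTOINCREMENT is used so that sequence numbers are never reused after
pruning, which keeps each peer's last_seq meaningful.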

Of course sqlite may not be appropriate, and would require performance
testing, etc., but we should probably think about the features we'll
eventually want, and consider how much work they're likely to require
with any given approach, regardless.

Hope this helps
-- 
Rob Browning
rlb @defaultvalue.org and @debian.org
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4


First draft of logging functionality.

2010-10-12 Thread David Bremner
On Tue, 12 Oct 2010 11:06:17 +0200, Michal Sojka wrote:

> I'm not sure whether implementing the logging facility outside of the
> notmuch library is a good thing. If somebody uses a third-party tool
> (such as the python bindings) to manipulate their tags, those changes
> won't appear in the log.

Yeah, thanks for that. I had been worrying about the same thing, but
your message helped clarify things for me.

I guess log files should be opened in notmuch_database_open, and the
actual logging in this case could happen from notmuch_message_add_tag.

From an atomicity point of view it might make more sense to store up a
list of log lines, and dump them all from _notmuch_message_sync.
One could attach a log buffer to a message, and flush that atomically
when syncing the message back to the database. In this context, it is a
little more tedious to have more than one log file. 
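
The buffering idea can be sketched roughly as follows. This is a toy
Python model, not the notmuch library: the Message class, its methods
and the _log_buffer attribute are hypothetical stand-ins for
notmuch_message_add_tag and _notmuch_message_sync, and a single write
call is only an approximation of an atomic flush.

```python
import json
import time


class Message:
    """Toy stand-in for a notmuch message; not the real library object."""

    def __init__(self, message_id, log_file):
        self.message_id = message_id
        self.tags = set()
        self._log_file = log_file
        self._log_buffer = []  # log lines held back until sync()

    def _log(self, op):
        line = f"{int(time.time())} {json.dumps(self.message_id)} {json.dumps(op)}\n"
        self._log_buffer.append(line)

    def add_tag(self, tag):
        self.tags.add(tag)
        self._log("+" + tag)

    def remove_tag(self, tag):
        self.tags.discard(tag)
        self._log("-" + tag)

    def sync(self):
        """Write all buffered lines with one write call, then clear the buffer.

        A real implementation would store the tags in the database at this
        point as well.
        """
        if self._log_buffer:
            self._log_file.write("".join(self._log_buffer))
            self._log_file.flush()
            self._log_buffer.clear()
```

Because the log file is handed to each message, a single shared file is
the simple case; several files would mean several buffers per message.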

d


___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


First draft of logging functionality.

2010-10-11 Thread da...@tethera.net
The patches following this message are my first attempt at
implementing atomic logging for notmuch.  The idea is that such logs
could be useful in synchronizing notmuch instances.

Feedback of any kind is welcome. I'm particularly interested in
comments about the log format and performance.

In my tests, logging adds about a 10% speed penalty (tagging ~3700
messages) if enabled. I'd be curious if people for whom tagging is
slow could tell me if they take a bigger hit. If you want to test, add
a stanza like

[log]
tags = /path/to/logfile

to your notmuch config.

About the format, I am currently using

seconds-since-epoch json-quoted-message-id json-quoted-plus-minus-tag

I know some of you will not be thrilled with the quoting format; I'm
open to other ideas, but this is what already existed in the notmuch
code.  At the moment I wanted to push the quoting fairly far down the
stack and use the same for all potential logging, but perhaps this is
too much "design by convenient implementation".
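
As a rough illustration of the line format described above, here is a
minimal Python sketch. The function names format_entry and parse_entry
are hypothetical, and json.dumps stands in for the quoting routine in
the notmuch code, which may escape some characters differently.

```python
import json
import time


def format_entry(message_id, tag_op, timestamp=None):
    """Build one log line: epoch seconds, quoted message id, quoted +tag/-tag."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{timestamp} {json.dumps(message_id)} {json.dumps(tag_op)}"


def parse_entry(line):
    """Split a log line back into (timestamp, message_id, tag_op)."""
    stamp, rest = line.split(" ", 1)
    decoder = json.JSONDecoder()
    # raw_decode returns the decoded value and the index where it ended,
    # so spaces inside the quoted fields do not confuse the split.
    message_id, end = decoder.raw_decode(rest)
    tag_op, _ = decoder.raw_decode(rest[end:].lstrip())
    return int(stamp), message_id, tag_op
```

The quoting is what makes the space-separated layout unambiguous when a
message id or tag itself contains spaces or quotes.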

Canadian-Thanksgiving-Greetings,

David

