First draft of logging functionality.
On Tue, 12 Oct 2010, David Bremner wrote:
> On Tue, 12 Oct 2010 11:06:17 +0200, Michal Sojka wrote:
>
> > I'm not sure whether implementing logging facility outside of notmuch
> > library is a good thing. If somebody will use a third-party tool (such
> > as python bindings) to manipulate his tags, they won't appear in the
> > log.
>
> Yeah, thanks for that. I had been worrying about the same thing, but
> your message helped clarify things for me.
>
> I guess log files should be opened in notmuch_database_open, and the
> actual logging in this case could happen from notmuch_message_add_tag.
>
> From an atomicity point of view it might make more sense to store up a
> list of log lines, and dump them all from _notmuch_message_sync.
> One could attach a log buffer to a message, and flush that atomically
> when syncing the message back to the database.

Yes, the above sounds good to me.

> In this context, it is a little more tedious to have more than one log
> file.

Why multiple log files? You may have the buffers for message logs in
memory, right?

Another thing to keep in mind is how it will behave with 'notmuch
restore'. If you do a dump followed by a restore, you will have a very
long log with no useful information in it. Some optimization may be
employed here.

Bye
-Michal
First draft of logging functionality.
On Mon, 11 Oct 2010, david at tethera.net wrote:
> The patches following this message are my first attempt at
> implementing atomic logging for notmuch. The idea is that such logs
> could be useful in synchronizing notmuch instances.
>
> Feedback of any kind is welcome. I'm particularly interested in
> comments about the log format and performance.

Hi David,

I'm not sure whether implementing logging facility outside of notmuch
library is a good thing. If somebody will use a third-party tool (such
as python bindings) to manipulate his tags, they won't appear in the
log.

-Michal
First draft of logging functionality.
On Tue, 12 Oct 2010 13:38:42 +0200, Michal Sojka wrote:
>
> Why multiple log files? You may have the buffers for message logs in
> memory, right?

Well, maybe one log for tag operations, one for adding messages,
etc... But it is easy enough to mark log entries by what they are.

>
> Another thing to keep in mind is how it will behave with 'notmuch
> restore'. If you do a dump followed by a restore, you will have a very
> long log with no useful information in it. Some optimization may be
> employed here.
>

Yes. This seems hard to optimize internally, but I was thinking of
some "log compression" function that comes up with a minimal
equivalent set of operations. I had in mind that this could be used
to sync: concatenate all the logs, and then compress to a minimal set
of operations. This is still not completely thought out...

Another issue is that the buffer could get rather big during a
restore, but this is presumably fixable by flushing it if it gets too
large.

All the best,

David
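One way to read the "log compression" idea above is sketched below. It assumes each log entry has already been parsed into a (message-id, "+tag" or "-tag") pair, and that adding or removing a tag is idempotent, so only the last operation per message and tag affects the final state. The function name compress_log is hypothetical and not part of notmuch.

```python
def compress_log(entries):
    """Reduce (message_id, op) pairs to the last op seen per (message, tag).

    Each op is a string such as "+inbox" or "-inbox". Because tag
    additions and removals are idempotent, replaying only the last op
    for each (message, tag) pair yields the same final tags as
    replaying the whole log.
    """
    last = {}
    for message_id, op in entries:
        key = (message_id, op[1:])
        # Remove and re-insert so the output order follows the most
        # recent occurrence of each key.
        last.pop(key, None)
        last[key] = op
    return [(message_id, op) for (message_id, _tag), op in last.items()]
```

For syncing, the logs from several instances could be concatenated in timestamp order and passed through such a function. How to resolve conflicting operations from different peers is left open here, as it is in the thread.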
First draft of logging functionality.
david at tethera.net writes:

> The patches following this message are my first attempt at
> implementing atomic logging for notmuch. The idea is that such logs
> could be useful in synchronizing notmuch instances.

For this to be completely safe, I suspect it may need to be adjusted
to do write ahead logging or something similar. Otherwise operations
could be lost.

i.e. notmuch would atomically and safely write the intended (tag)
operation to the log, then perform the tag. On startup, notmuch would
need to scan the log to detect and apply operations that hadn't been
fully completed (presumably due to a crash).

More generally, while thinking about sync/logging a few days ago, I
wondered about using sqlite. That would help with atomicity, rollback,
synchronizing multiple readers/writers, etc.

It might also make operations more efficient once we implement all
the features we want. For example, with the log information in sqlite,
a separate notmuch sync to another machine could be reading from the
log (and with limitations, writing) in parallel with normal notmuch
operations.

Depending on how we decide to handle sync with multiple peers, the log
may also need to track which peers have seen what, prune
appropriately, etc.

Of course sqlite may not be appropriate, and would require performance
testing, etc., but we should probably think about the features we'll
eventually want, and consider how much work they're likely to require
with any given approach, regardless.

Hope this helps
--
Rob Browning
rlb @defaultvalue.org and @debian.org
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4
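The write-ahead scheme described above, combined with the sqlite suggestion, might look roughly like the following sketch. The table name oplog, its columns, and all function names are assumptions made for illustration. The tag store is replaced by an in-memory dict, and tracking of peers is not covered.

```python
import sqlite3


def open_log(path=":memory:"):
    """Open (or create) the operation log; the schema is hypothetical."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS oplog ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " message_id TEXT NOT NULL,"
        " op TEXT NOT NULL,"
        " applied INTEGER NOT NULL DEFAULT 0)"
    )
    db.commit()
    return db


def log_intent(db, message_id, op):
    """Record the intended operation before touching the tag store."""
    with db:  # commits on success, rolls back on error
        cur = db.execute(
            "INSERT INTO oplog (message_id, op) VALUES (?, ?)",
            (message_id, op),
        )
    return cur.lastrowid


def mark_applied(db, entry_id):
    """Mark a logged operation as fully carried out."""
    with db:
        db.execute("UPDATE oplog SET applied = 1 WHERE id = ?", (entry_id,))


def apply_op(tags, message_id, op):
    """Apply '+tag' or '-tag' to a dict standing in for the tag store."""
    current = tags.setdefault(message_id, set())
    if op[0] == "+":
        current.add(op[1:])
    else:
        current.discard(op[1:])


def recover(db, tags):
    """On startup, re-apply every logged operation not marked applied."""
    pending = db.execute(
        "SELECT id, message_id, op FROM oplog WHERE applied = 0 ORDER BY id"
    ).fetchall()
    for entry_id, message_id, op in pending:
        apply_op(tags, message_id, op)
        mark_applied(db, entry_id)
    return len(pending)
```

Replaying an operation that was in fact applied just before a crash is harmless in this sketch, because adding or removing a tag twice gives the same result as doing it once.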
First draft of logging functionality.
On Tue, 12 Oct 2010 11:06:17 +0200, Michal Sojka wrote:

> I'm not sure whether implementing logging facility outside of notmuch
> library is a good thing. If somebody will use a third-party tool (such
> as python bindings) to manipulate his tags, they won't appear in the
> log.

Yeah, thanks for that. I had been worrying about the same thing, but
your message helped clarify things for me.

I guess log files should be opened in notmuch_database_open, and the
actual logging in this case could happen from notmuch_message_add_tag.

From an atomicity point of view it might make more sense to store up a
list of log lines, and dump them all from _notmuch_message_sync.
One could attach a log buffer to a message, and flush that atomically
when syncing the message back to the database.

In this context, it is a little more tedious to have more than one log
file.

d
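The per-message buffer idea above could be sketched as follows. The class MessageLogBuffer is hypothetical and is written in Python rather than the library's C. It stands in for a buffer attached to a message, with flush() playing the role of the step that would run from _notmuch_message_sync. The flush uses a single append-mode write, and whether that is truly atomic with respect to other writers depends on the platform and the size of the write, so treat it as an approximation.

```python
import os


class MessageLogBuffer:
    """Collects log lines for one message until the message is synced."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.lines = []

    def record(self, line):
        """Queue one log line; nothing is written to disk yet."""
        self.lines.append(line)

    def flush(self):
        """Append all pending lines with one write call, then clear."""
        if not self.lines:
            return
        data = ("\n".join(self.lines) + "\n").encode("utf-8")
        fd = os.open(
            self.log_path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644
        )
        try:
            os.write(fd, data)
            os.fsync(fd)  # ask the OS to push the data to stable storage
        finally:
            os.close(fd)
        self.lines.clear()
```

With a single log file, every buffer can append to the same path. With several log files, each buffer would need to know which file an entry belongs to, which is the extra tedium mentioned above.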
First draft of logging functionality.
The patches following this message are my first attempt at
implementing atomic logging for notmuch. The idea is that such logs
could be useful in synchronizing notmuch instances.

Feedback of any kind is welcome. I'm particularly interested in
comments about the log format and performance.

In my tests, logging adds about a 10% speed penalty (tagging ~3700
messages) if enabled. I'd be curious if people for whom tagging is
slow could tell me if they take a bigger hit. If you want to test, add
a stanza like

    [log]
    tags = /path/to/logfile

to your notmuch config.

About the format, I am currently using

    seconds-since-epoch json-quoted-message-id json-quoted-plus-minus-tag

I know some of you will not be thrilled with the quoting format; I'm
open to other ideas, but this is what was already existing in notmuch
code. At the moment I wanted to push the quoting fairly far down the
stack and use the same for all potential logging, but perhaps this is
too much "design by convenient implementation".

Canadian-Thanksgiving-Greetings,

David
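To make the log format concrete, here is a minimal sketch that writes and parses one such line. It assumes Python's json module is an acceptable stand-in for the quoting already present in notmuch code (the two may differ in detail, for example in how non-ASCII characters are escaped) and that the three fields are separated by single spaces. The function names are hypothetical.

```python
import json
import time


def format_log_line(message_id, tag, add, timestamp=None):
    """Build one line: epoch seconds, quoted message id, quoted +tag/-tag."""
    if timestamp is None:
        timestamp = int(time.time())
    sign = "+" if add else "-"
    return f"{timestamp} {json.dumps(message_id)} {json.dumps(sign + tag)}"


def parse_log_line(line):
    """Split a line back into (timestamp, message_id, add, tag)."""
    stamp, rest = line.split(" ", 1)
    decoder = json.JSONDecoder()
    # raw_decode returns the decoded value and the index where it ended,
    # which lets the quoted message id contain spaces.
    message_id, end = decoder.raw_decode(rest)
    signed_tag, _ = decoder.raw_decode(rest[end:].lstrip())
    return int(stamp), message_id, signed_tag[0] == "+", signed_tag[1:]
```

Quoting both fields keeps the line parseable even when a message id or tag contains spaces or quote characters.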