[notmuch] Potential problem using Git for mail (was: Idea for storing tags)

2010-01-12 Thread martin f krafft
thus spoke Scott Robinson [2010.01.12.1644 +1300]:
> Then the whole structure is controlled via git.
> Conflict-resolution and sync comes for free.

I've just had a good think about this, also because the idea of
abandoning IMAP and using Git has been around for a while and
I have not really wrapped my head around it.

If the MDA delivers to Git, then potentially, you might get into
a situation where you cannot write your own changes back to the
repo. This is also a DoS scenario: I'll just keep sending you
e-mail, and if I manage to pass your mail filters, I'll basically
commit to your mail repository at regular intervals. Say those are
5 seconds. In order for you to write updates to the repo, e.g. to
update tags, then you would need to pull, rebase, and push all
within 5 seconds, for otherwise you'd try to push non-fast-forwards.

This is a bit unrealistic, surely, but there's a real annoyance in it:
you'd have to pull/rebase/push until a push succeeds, that is, until you
found a time window between pull and push during which the MDA
didn't write to the repo. This might take a long time. If this
happens in the background by Cron, it's not a real concern, but if
this becomes a UI issue, I wouldn't know how to handle it.
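
A minimal sketch of the pull/rebase/push loop described above, to make the failure mode concrete. The function name, the attempt limit, and the backoff with jitter are assumptions of this sketch, not something proposed in the thread; the two callables stand in for `git pull --rebase` and `git push`.

```python
import random
import time
from typing import Callable


def sync_with_retry(
    pull_rebase: Callable[[], None],
    push: Callable[[], bool],
    max_attempts: int = 10,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Repeat pull/rebase/push until a push lands or attempts run out.

    Returns the number of attempts used; raises RuntimeError on giving up.
    """
    for attempt in range(1, max_attempts + 1):
        pull_rebase()  # stands in for `git pull --rebase`
        if push():     # stands in for `git push`; False = non-fast-forward rejection
            return attempt
        # Wait a little longer each time, with jitter, so that retries do not
        # stay in lockstep with an MDA committing at regular intervals.
        sleep(base_delay * attempt * random.uniform(0.5, 1.5))
    raise RuntimeError(f"push still rejected after {max_attempts} attempts")
```

The attempt limit is what turns "this might take a long time" into something a UI could report instead of blocking indefinitely.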

-- 
martin | http://madduck.net/ | http://two.sentenc.es/

don't hate yourself in the morning -- sleep till noon.

spamtraps: madduck.bogus at madduck.net


[notmuch] Idea for storing tags

2010-01-12 Thread martin f krafft
thus spoke Scott Robinson [2010.01.12.1644 +1300]:
> I wrote a script to store and sync my tags.
> 
>   * One filename per message-ID.
>   * Line-feed separated tags in each file.
> 
> Then the whole structure is controlled via git.
> Conflict-resolution and sync comes for free.

How do you ensure that the external tag store and your mail store do
not go out of sync? I assume that mails without a tagfile are simply
untagged, so that's hardly the issue. However, if you delete a mail,
how do you ensure that the tag database is cleaned up?

Also, do you attach tags automatically, e.g. with procmail on the
server? If so, how do you initiate git-pull locally?

Would you consider sharing your script?
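
Since the script itself is not shown, here is a rough sketch of the layout described in the quote (one file per message-ID, one tag per line), plus one possible answer to the clean-up question. All function names are hypothetical, and percent-encoding the message-ID into a filename is an assumption of this sketch; the original script may do it differently.

```python
from pathlib import Path
from urllib.parse import quote, unquote


def tag_path(store: Path, message_id: str) -> Path:
    # Percent-encode so "/" and similar characters in a Message-ID are filename-safe.
    return store / quote(message_id, safe="")


def write_tags(store: Path, message_id: str, tags: set[str]) -> None:
    store.mkdir(parents=True, exist_ok=True)
    path = tag_path(store, message_id)
    if tags:
        # Sorted output keeps diffs small when the directory is tracked in git.
        path.write_text("".join(f"{tag}\n" for tag in sorted(tags)))
    elif path.exists():
        path.unlink()  # no tagfile simply means "untagged"


def read_tags(store: Path, message_id: str) -> set[str]:
    path = tag_path(store, message_id)
    if not path.exists():
        return set()
    return {line for line in path.read_text().splitlines() if line}


def prune_orphans(store: Path, live_message_ids: set[str]) -> list[str]:
    """Delete tagfiles whose message no longer exists; return the removed IDs."""
    removed = []
    for path in store.iterdir():
        if path.name.startswith(".") or not path.is_file():
            continue  # leave .git, .gitignore and directories alone
        message_id = unquote(path.name)
        if message_id not in live_message_ids:
            path.unlink()
            removed.append(message_id)
    return sorted(removed)
```

`prune_orphans` would have to be fed the set of message-IDs currently in the mail store, so it only helps if something runs it after mail is deleted.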

-- 
martin | http://madduck.net/ | http://two.sentenc.es/

"all prejudices come from the intestines."
 - friedrich nietzsche

spamtraps: madduck.bogus at madduck.net


[notmuch] Idea for storing tags

2010-01-12 Thread David A. Harding
On Tue, Jan 12, 2010 at 11:19:09AM +1300, martin f krafft wrote:
> I think [tag leakage] it makes in-headers unusable. After all, I don't
> ever want anyone else to know that I tag e-mails from my boss as
> "from-idiots", 

You can cryptographically hash tags so that third-parties can't read
the contents of the in-headers. For security, a salt should be appended
to the tag name to make dictionary attacks on the tags more difficult.
For their owners' convenience, mail clients will want a mapping of hash
to tag name.
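
A small sketch of the salted hashing, with the salt appended to the tag name as described. SHA-256 and the helper names are assumptions of this sketch; no particular hash function is specified here.

```python
import hashlib
import secrets


def new_salt() -> str:
    # 16 random bytes, hex-encoded; kept private alongside the mapping.
    return secrets.token_hex(16)


def hash_tag(tag: str, salt: str) -> str:
    # Salt appended to the tag name before hashing.
    return hashlib.sha256((tag + salt).encode("utf-8")).hexdigest()


def build_mapping(tags: list[str], salt: str) -> dict[str, str]:
    # Hash-to-tag mapping, so the owner's mail client can show readable names.
    return {hash_tag(tag, salt): tag for tag in tags}
```

Only the hex digests would appear in the in-headers; without the salt, a third party cannot simply hash a dictionary of likely tag names and compare.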

> [...] pseudo-mails stored in Maildir and synchronised by IMAP

A single RFC2822 message can store the salt and hash-to-tag database. It
could contain a clear subject and directions to the end user not to move
or delete it. This would not, I think, terribly confuse existing mail
clients or their users.
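
As an illustration, such a message could be built and read back with Python's standard `email` package. The JSON body, the subject wording, and the `X-Tag-Database` marker header are all hypothetical choices made for this sketch.

```python
import json
from email import message_from_string, policy
from email.message import EmailMessage


def build_tag_db_message(salt: str, mapping: dict[str, str]) -> str:
    msg = EmailMessage()
    msg["Subject"] = "Tag database: please do not move or delete"
    msg["X-Tag-Database"] = "1"  # hypothetical marker so clients can find it
    body = json.dumps({"salt": salt, "tags": mapping}, indent=1, sort_keys=True)
    msg.set_content(body)
    return msg.as_string()


def parse_tag_db_message(raw: str) -> tuple[str, dict[str, str]]:
    msg = message_from_string(raw, policy=policy.default)
    data = json.loads(msg.get_content())
    return data["salt"], data["tags"]
```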

-Dave
-- 
David A. Harding  Website:  http://dtrt.org/
1 (609) 997-0765  Email:  dave at dtrt.org
Jabber/XMPP:  dharding at jabber.org


[notmuch] Potential problem using Git for mail (was: Idea for storing tags)

2010-01-12 Thread Jameson Rollins
On Tue, Jan 12, 2010 at 05:51:53PM +1300, martin f krafft wrote:
> If the MDA delivers to Git, then potentially, you might get into
> a situation where you cannot write your own changes back to the
> repo. This is also a DoS scenario: I'll just keep sending you
> e-mail, and if I manage to pass your mail filters, I'll basically
> commit to your mail repository at regular intervals. Say those are
> 5 seconds. In order for you to write updates to the repo, e.g. to
> update tags, then you would need to pull, rebase, and push all
> within 5 seconds, for otherwise you'd try to push non-fast-forwards.
> 
> This is a bit unrealistic, surely, but there's a real annoyance in it:
> you'd have to pull/rebase/push until a push succeeds, that is, until you
> found a time window between pull and push during which the MDA
> didn't write to the repo. This might take a long time. If this
> happens in the background by Cron, it's not a real concern, but if
> this becomes a UI issue, I wouldn't know how to handle it.

What about if just the tag information is stored in the repository,
and not the mail itself?  In that case only the user would be pushing
into the repo and you wouldn't have to worry about the DoS scenario.

jamie.


[notmuch] Some Xapian tips and thoughts on rebuilding

2010-01-12 Thread Carl Worth
On Tue, 12 Jan 2010 09:46:14 +0800, Kan-Ru Chen  wrote:
> After compacting my database, the size shrunk significantly, but the
> number of messages also changed. Beware that you might lose messages
> after compacting if you are trying this.
> 
> run on xapian-svn r13824

Yikes. That's very discouraging. I'll try to do some more testing to see
if I can replicate that, and if so, look closer into what is happening.

-Carl


[notmuch] Bug with commit 2e96464f9705be4ec772280cad71a6c9d5831e6f

2010-01-12 Thread Ali Polatel
racin at free.fr wrote:
> Hello,
> 
> I just updated notmuch and now notmuch new cannot update my mail anymore...
> It tells me that there are 700 files found, but tells that there's no new
> mail.
> 
> I did a git bisect, which tells me the first bad commit is commit 
> 2e96464f9705be4ec772280cad71a6c9d5831e6f.
> 
> I did not try to use the new xapian database or to update xapian; maybe this 
> is the problem.
> 
> I tested with several tools to get mail in the maildir format, including 
> mb2md and getmail, and I always get the problem.
Same problem here. I tried upgrading xapian to xapian-core-1.1.3_svn13824, but
the problem persists.
Here's what happens on my system:
3074 alip at harikalardiyari> rm -fr .maildir/.notmuch
3075 alip at harikalardiyari> notmuch new
Found 210302 total files (that's not much mail).
No new mail.
3076 alip at harikalardiyari> notmuch search from:alip at exherbo.org
3077 alip at harikalardiyari>

> 
> I will try to investigate a bit more.
> 
> 
> Matthieu

-- 
Regards,
Ali Polatel


[notmuch] Idea for storing tags

2010-01-12 Thread martin f krafft
Folks, over in #notmuch, we just floated an idea that I'd like to
get out to you. We've been debating storing tags for messages.
Therefore I am cross-posting. Please forgive me.

So far, there are two approaches:

1. External database, which has the downside of not being
   synchronisable with standard IMAP, like the rest of your mail
   (assuming you use IMAP). Also, it's possible for mailstore and
   database to get out of sync.

2. In-headers, which has the downside of leaking (e.g. when
   bouncing), and incurs the risks associated with message rewrites
   (which I think is pretty much ignorable, but it's still there).
   Also, there's a performance issue, but in the context of an
   indexer like notmuch, this is negligible.

   The leakage is real, though, and I think it makes in-headers
   unusable. After all, I don't ever want anyone else to know that
   I tag e-mails from my boss as "from-idiots", and I forward and
   bounce mail on a regular basis. I could tell my MTA to remove
   those headers, but I might forget to do that on a new system.
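
   To illustrate what "removing those headers" would involve before
   a forward or bounce, here is a minimal sketch using Python's
   standard `email` package. The header name `X-Tags` is purely
   hypothetical; no header name has been agreed on.

```python
from email import message_from_string

TAG_HEADER = "X-Tags"  # hypothetical name for the in-header tag field


def strip_tag_headers(raw: str, header: str = TAG_HEADER) -> str:
    """Return the message with every occurrence of the tag header removed."""
    msg = message_from_string(raw)
    del msg[header]  # deletes all occurrences; no error if the header is absent
    return msg.as_string()
```

   This only protects mail that actually passes through such a
   filter, which is exactly the weakness: it has to be set up again
   on every system.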

We also previously determined that IMAP keywords are pretty much
useless as they are stored per mailbox, not per message, not
standardised, and limited in their length anyway [0]. This also
means that we don't really need to investigate sensibly storing tags
in Maildir (e.g. with xattrs), because IMAP cannot transport them.

0. http://lists.madduck.net/pipermail/mailtags/2007-August/msg00016.html

Seriously, who implemented IMAPv4rev1 and what sort of crack were
they smoking??

I remember there was some KDE groupware contacts manager that used
IMAP to synchronise contacts. At first, this sounds horrible, but
when you detach IMAP from RFC822, it becomes a generic synchronising
protocol. The next step is then straightforward, and I want to
share this idea with you:

How about using pseudo-mails stored in Maildir and synchronised by
IMAP? E.g. every folder could have a subfolder .TAGS and if we find
a way to smartly pair messages between parent and subfolder, we'd
have a tag store alongside the mailstore it refers to, but without
the danger of leakage, and without having to rewrite messages.
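
One way the pairing could work, sketched with Python's standard `email` package: each pseudo-mail in .TAGS names the Message-ID it belongs to and lists the tags in its body. The `X-Tags-For` header and both function names are hypothetical, chosen only for this illustration.

```python
from email.message import EmailMessage


def make_tag_mail(message_id: str, tags: list[str]) -> EmailMessage:
    pseudo = EmailMessage()
    pseudo["Subject"] = f"Tags for {message_id}"
    pseudo["X-Tags-For"] = message_id  # hypothetical pairing header
    pseudo.set_content("".join(f"{tag}\n" for tag in tags))  # one tag per line
    return pseudo


def pair_tags(mails: list[EmailMessage],
              tag_mails: list[EmailMessage]) -> dict[str, list[str]]:
    """Map each real message's Message-ID to the tags from its pseudo-mail."""
    tags_by_id = {
        str(t["X-Tags-For"]): t.get_content().splitlines() for t in tag_mails
    }
    return {
        str(m["Message-ID"]): tags_by_id.get(str(m["Message-ID"]), [])
        for m in mails
    }
```

Messages without a pseudo-mail come out untagged, and the original messages are never rewritten.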

The major problem with this is when clients don't understand this
"protocol", for then they will display all .TAGS folders as regular
IMAP folders, and try to treat the messages therein as regular
mails. Somewhere sometime this is bound to blow up and I don't
really know how to prevent that.

Anyway, the idea is out now. Thoughts?

-- 
martin | http://madduck.net/ | http://two.sentenc.es/

echo Prpv a\'rfg cnf har cvcr | tr Pacfghnrvp Cnpstuaeic

spamtraps: madduck.bogus at madduck.net


[notmuch] Bug with commit 2e96464f9705be4ec772280cad71a6c9d5831e6f

2010-01-12 Thread Carl Worth
On Mon, 11 Jan 2010 00:29:05 +0100 (CET), racin at free.fr wrote:
> I just updated notmuch and now notmuch new cannot update my mail anymore...
> It tells me that there are 700 files found, but tells that there's no new
> mail.

Hi Matthieu,

Thanks for testing the new notmuch. I appreciate that. And I'm sorry to
hear that you weren't pleasantly rewarded for your efforts. I tried
as well as I could to avoid bugs slipping through, but I also knew I
couldn't test for everything.

> I did a git bisect, which tells me the first bad commit is commit
> 2e96464f9705be4ec772280cad71a6c9d5831e6f.

Can you tell me what you were bisecting? Were you attempting the
database upgrade over again each time? Or just testing whether you could
update a post-upgraded database each time?

That commit looks remarkably benign. It doesn't really change anything
about how the database is interpreted or upgraded (it merely adds a
list to defer the printing of a few filenames). Could you perform a test
to confirm the results of bisection, such as manually applying and
reverting this patch on the immediately preceding version and checking
whether the behavior changes?

> I did not try to use the new xapian database or to update xapian;
> maybe this is the problem.

It shouldn't be. The new notmuch is intended to work just fine with an
old Xapian. And I *think* that I tested that at least a few times.

> I will try to investigate a bit more.

Thanks. I'll look forward to any further information you can provide.

And if it would be practical for you to share your email database with
me (privately, of course) so that I can investigate, I would be glad to
do that. Obviously, I'll understand if that's not possible. Just contact
me off-list if you'd like to arrange something like that.

-Carl


[notmuch] Some Xapian tips and thoughts on rebuilding

2010-01-12 Thread Kan-Ru Chen
On Sun, 10 Jan 2010 09:43:38 -0800, Carl Worth  wrote:
> Compacting your database
> 
> One final tip. I recently started experimenting with a Xapian feature
> for compacting a database. This is available only via a command-line
> program, (named xapian-compact in the 1.0 releases and
> xapian-compact-1.1 in the current Xapian from svn). This functionality
> is not yet available in the Xapian library interface or else I would
> probably make notmuch call it after building the database.
> 
> If you want to experiment with xapian-compact, you'll want to call it
> with a command something like the following:
> 
>  xapian-compact-1.1 --no-renumber ~/mail/.notmuch/xapian ~/mail/.notmuch/xapian-compact
> 
> The --no-renumber argument is essential with a notmuch database, since
> (as of database format version 1), notmuch stores Xapian document IDs
> internally within terms. If you forget this, you'll find that all of
> your searches will return results that are unable to locate any of the
> filenames corresponding to your mail.

After compacting my database, the size shrunk significantly, but the
number of messages also changed. Beware that you might lose messages
after compacting if you are trying this.

run on xapian-svn r13824

> 
> After running the above command, you could then move your existing
> .notmuch/xapian away and move .notmuch/xapian-compact in its place to
> test, and then discard the original .notmuch/xapian if you're happy with
> the result.
> 
> For me, this compaction took my 5.0GB down to 3.1GB. So my database is
> now less than half the size of what I started with with flint, (and can
> conceivably be cached entirely within memory on my machine!), which is
> quite delightful.
> 
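
Given the risk of losing messages, the swap step quoted above is worth doing in a way that keeps the original around. A minimal sketch, assuming the directory names from the quoted command; the function name and the `xapian.orig` backup name are hypothetical.

```python
from pathlib import Path


def swap_in_compacted(notmuch_dir: Path) -> Path:
    """Move xapian aside and put xapian-compact in its place.

    Returns the path of the preserved original so it can be inspected,
    restored, or deleted once the compacted database has been checked.
    """
    original = notmuch_dir / "xapian"
    compacted = notmuch_dir / "xapian-compact"
    backup = notmuch_dir / "xapian.orig"  # hypothetical backup name

    if not original.is_dir() or not compacted.is_dir():
        raise FileNotFoundError("expected both xapian and xapian-compact")
    if backup.exists():
        raise FileExistsError(f"{backup} already exists; refusing to overwrite")

    original.rename(backup)     # keep the uncompacted database
    compacted.rename(original)  # compacted copy becomes the live database
    return backup
```

The backup should only be discarded after confirming that the message count matches what it was before compaction.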

-- 
Kan-Ru Chen | http://kanru.info

Q: Why are my replies five sentences or less?
A: http://five.sentenc.es/