[notmuch] Problems importing my mail...

2009-11-28 Thread Jeffrey Ollie
On Sat, Nov 28, 2009 at 12:09 AM, Carl Worth wrote:
> On Fri, 27 Nov 2009 19:09:56 -0600, Jeffrey Ollie wrote:
>>
>> $ ./notmuch new
>> Found 328184 total files.
>
> That's certainly not the largest number of messages we've seen indexed
> successfully by notmuch, (I think Keith has near 3 times that
> number). [Maybe notmuch should be reporting the total size of the mail
> store as well...]

Heh, I'm not done downloading them all yet, but I doubt that I'll hit
the 1M mark, maybe 500-600K.

>> Warning: Unexpected extra parts of multipart/signed. Indexing anyway.
>
> Oh, that's a warning I put in place because I wasn't sure if it was
> legitimate for a multipart/signed message to have more than two
> parts. I'd actually be interested to know if the mail is correct, (and I
> should just eliminate the warning), or if the mail is somehow malformed
> and the warning is correct.

No, I think it's legitimate to have multiple parts inside of a
multipart/signed (just very rare).  I've identified the message that
caused the warning.  I'm including it as an attachment, hopefully it
won't get tagged as spam because it's a response to a spam report that
I sent a while back.

>> Note: Ignoring non-mail file:
>> /home/jeff/mail/message/6/5/65c74c15a686187bb6bbf9958f494fc6b80068034a659a9ad44991b08c58f2d2
>> Note: Ignoring non-mail file:
>> /home/jeff/mail/message/7/9/7902699be42c8a8e46fbbb4501726517e86b22c56a189f7625a6da49081b2451
>> Note: Ignoring non-mail file:
>> /home/jeff/mail/message/8/0/802071f7fcd8b0b74a19e1ca64e5468184fee0c9171bacb77ae1fe1669c426ee
>
> Those you should check to see if they actually do look like mail
> messages. Notmuch decides to ignore a file when it can't find any of the
> following headers: Subject:, From:, or To:.

Yes, none of those appear to be complete mail messages. Why they
are on one of my IMAP servers remains to be seen.

>> A Xapian exception occurred creating message: Db block overwritten -
>> are there multiple writers?
>> Error: A Xapian exception occurred. Halting processing.
>
> That's an error I've never seen before. We might want to talk to the
> Xapian folks to see what that could be. There's really no way there can
> be multiple writers here. So I don't know what the actual problem might
> be.
>
>> Internal error: Message with document ID of 175013 has no thread ID.
>>  (lib/message.cc:353).
>> [jeff@max1 notmuch]$ ./notmuch new
>> Internal error: Message with document ID of 175013 has no thread ID.
>>  (lib/message.cc:353).
>
> Hmm... we could probably do better here. The fatal error you're getting
> here is for an invariant that notmuch thinks is fairly important, (no
> mail document should exist without a thread ID). Meanwhile, however, when
> adding a new message we do actually create a mail document in the
> database, and only later resolve its thread ID and add that to the
> database as well. A better solution would be to resolve the thread ID
> before adding anything to the database so that this invariant would
> never be violated.
>
> Some people have been proposing a "notmuch gc" command or so for
> cleaning up problems like this.
>
> In the meantime, you could explore the current state of your database by
> changing the code that's currently giving you an internal error to
> instead return a fake thread ID. For example:
>
>     if (i == message->doc.termlist_end () || id[0] != *prefix)
>         message->thread_id = talloc_strdup (message, "");
>     else
>         message->thread_id = talloc_strdup (message, id.c_str () + 1);

Unfortunately I deleted the database and am in the process of
recreating it with the verbose flag turned on.  So far the problem has
not occurred again.  So if there's a real bug somewhere I'm wondering
if there isn't a timing-related component to it.

-- 
Jeff Ollie
-- next part --
A non-text attachment was scrubbed...
Name: multipart.zip
Type: application/zip
Size: 5446 bytes
Desc: not available
URL: 



[notmuch] Problems importing my mail...

2009-11-27 Thread Carl Worth
On Sat, 28 Nov 2009 00:37:07 -0600, Jeffrey Ollie wrote:
> On Sat, Nov 28, 2009 at 12:09 AM, Carl Worth wrote:
> >> Warning: Unexpected extra parts of multipart/signed. Indexing anyway.
> >
> > Oh, that's a warning I put in place because I wasn't sure if it was
> > legitimate for a multipart/signed message to have more than two
> > parts. I'd actually be interested to know if the mail is correct, (and I
> > should just eliminate the warning), or if the mail is somehow malformed
> > and the warning is correct.
> 
> No, I think it's legitimate to have multiple parts inside of a
> multipart/signed (just very rare).

OK. Then it sounds like the warning flagged an actual bug.

What the code is trying to do is simply to avoid indexing the signature
part. It's currently doing this by skipping the second part of a
multipart/signed message. Perhaps instead it should just be skipping the
final part? (It appears that that would be correct given the example
message.)
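
A minimal sketch of the "skip the final part" idea, using a plain struct
in place of real MIME part objects. The `Part` type and the
`parts_to_index` name are hypothetical (they are not notmuch or GMime
API), and the sketch assumes the signature is always the last part:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a MIME part; notmuch itself works with
// GMime objects rather than plain strings.
struct Part {
    std::string content_type;
    std::string body;
};

// Return the parts of a multipart/signed message that should be indexed.
// Assumption: the signature is the final part, so everything before it
// is content, however many parts that turns out to be.
std::vector<Part> parts_to_index(const std::vector<Part>& parts) {
    if (parts.empty())
        return {};
    return std::vector<Part>(parts.begin(), parts.end() - 1);
}
```

With a three-part message, this keeps the first two parts and drops only
the trailing signature, whereas skipping the second part would drop
content and index the signature.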

> Yes, none of those appear to be complete mail messages. Why they
> are on one of my IMAP servers remains to be seen.

OK. I'll let you puzzle that piece out.

> Unfortunately I deleted the database and am in the process of
> recreating it with the verbose flag turned on.  So far the problem has
> not occurred again.  So if there's a real bug somewhere I'm wondering
> if there isn't a timing-related component to it.

Alright. Well, let us know if things go wrong again.

-Carl



[notmuch] Problems importing my mail...

2009-11-27 Thread Carl Worth
On Fri, 27 Nov 2009 19:09:56 -0600, Jeffrey Ollie wrote:
> I'm having some problems importing my mail.  I've got quite a bit
> stored up, and some of them I'm sure are quite large.  After several
> hours I get the following.  Is notmuch running out of memory on me?

Comments below (as far as I can make them):

> $ ./notmuch new
> Found 328184 total files.

That's certainly not the largest number of messages we've seen indexed
successfully by notmuch, (I think Keith has near 3 times that
number). [Maybe notmuch should be reporting the total size of the mail
store as well...]

> Warning: Unexpected extra parts of multipart/signed. Indexing anyway.

Oh, that's a warning I put in place because I wasn't sure if it was
legitimate for a multipart/signed message to have more than two
parts. I'd actually be interested to know if the mail is correct, (and I
should just eliminate the warning), or if the mail is somehow malformed
and the warning is correct.

> Note: Ignoring non-mail file:
> /home/jeff/mail/message/6/5/65c74c15a686187bb6bbf9958f494fc6b80068034a659a9ad44991b08c58f2d2
> Note: Ignoring non-mail file:
> /home/jeff/mail/message/7/9/7902699be42c8a8e46fbbb4501726517e86b22c56a189f7625a6da49081b2451
> Note: Ignoring non-mail file:
> /home/jeff/mail/message/8/0/802071f7fcd8b0b74a19e1ca64e5468184fee0c9171bacb77ae1fe1669c426ee

Those you should check to see if they actually do look like mail
messages. Notmuch decides to ignore a file when it can't find any of the
following headers: Subject:, From:, or To:.
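
To make that check concrete, here is a rough sketch of the heuristic as
described above. It is not the actual notmuch code: `looks_like_mail`
and `starts_with_header` are hypothetical names, and the sketch simply
scans the header block of a file for any one of the three headers:

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <sstream>
#include <string>

// Case-insensitive test that `line` begins with `name`
// (`name` must be given in lower case, e.g. "from:").
static bool starts_with_header(const std::string& line, const std::string& name) {
    if (line.size() < name.size())
        return false;
    return std::equal(name.begin(), name.end(), line.begin(),
                      [](char expected, char actual) {
                          return expected == static_cast<char>(
                              std::tolower(static_cast<unsigned char>(actual)));
                      });
}

// Hypothetical helper: true if the header block contains at least one of
// Subject:, From:, or To:. A file with none of them is treated as non-mail.
bool looks_like_mail(const std::string& text) {
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty() || line == "\r")
            break;  // a blank line ends the header block
        if (starts_with_header(line, "subject:") ||
            starts_with_header(line, "from:") ||
            starts_with_header(line, "to:"))
            return true;
    }
    return false;
}
```

A file that fails this kind of test is probably a truncated download or
some other non-message file, which is why it is worth inspecting by hand.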

> A Xapian exception occurred creating message: Db block overwritten -
> are there multiple writers?
> Error: A Xapian exception occurred. Halting processing.

That's an error I've never seen before. We might want to talk to the
Xapian folks to see what that could be. There's really no way there can
be multiple writers here. So I don't know what the actual problem might
be.

> Internal error: Message with document ID of 175013 has no thread ID.
>  (lib/message.cc:353).
> [jeff@max1 notmuch]$ ./notmuch new
> Internal error: Message with document ID of 175013 has no thread ID.
>  (lib/message.cc:353).

Hmm... we could probably do better here. The fatal error you're getting
here is for an invariant that notmuch thinks is fairly important, (no
mail document should exist without a thread ID). Meanwhile, however, when
adding a new message we do actually create a mail document in the
database, and only later resolve its thread ID and add that to the
database as well. A better solution would be to resolve the thread ID
before adding anything to the database so that this invariant would
never be violated.
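
Roughly, the ordering I have in mind looks like the sketch below. This
is only an illustration: an in-memory map stands in for the Xapian
database, and `MailStore`, `add_message`, and `resolve_thread_id` are
made-up names, not the real notmuch functions. The point is just that
the thread ID is known before anything is stored:

```cpp
#include <cassert>
#include <map>
#include <string>

struct Document {
    std::string message_id;
    std::string thread_id;
};

// Hypothetical in-memory stand-in for the database.
class MailStore {
public:
    // Resolve the thread ID first; only then store the document, so a
    // stored document can never lack a thread ID.
    void add_message(const std::string& message_id,
                     const std::string& in_reply_to) {
        std::string thread_id = resolve_thread_id(in_reply_to);
        assert(!thread_id.empty());
        docs_[message_id] = Document{message_id, thread_id};
    }

    // Look up a stored document, or return nullptr if it is unknown.
    const Document* find(const std::string& message_id) const {
        auto it = docs_.find(message_id);
        return it == docs_.end() ? nullptr : &it->second;
    }

private:
    // Reuse the parent's thread if the parent is known, else start a new one.
    std::string resolve_thread_id(const std::string& in_reply_to) {
        auto parent = docs_.find(in_reply_to);
        if (parent != docs_.end())
            return parent->second.thread_id;
        return "thread-" + std::to_string(next_thread_++);
    }

    std::map<std::string, Document> docs_;
    unsigned next_thread_ = 1;
};
```

If an exception interrupts `add_message` in this arrangement, the store
is left either without the message or with a complete document, never
with a half-written one.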

Some people have been proposing a "notmuch gc" command or so for
cleaning up problems like this.

In the meantime, you could explore the current state of your database by
changing the code that's currently giving you an internal error to
instead return a fake thread ID. For example:

    if (i == message->doc.termlist_end () || id[0] != *prefix)
        message->thread_id = talloc_strdup (message, "");
    else
        message->thread_id = talloc_strdup (message, id.c_str () + 1);

Good luck!

-Carl



[notmuch] Problems importing my mail...

2009-11-27 Thread Jeffrey Ollie
I'm having some problems importing my mail.  I've got quite a bit
stored up, and some of them I'm sure are quite large.  After several
hours I get the following.  Is notmuch running out of memory on me?

$ ./notmuch new
Found 328184 total files.
Warning: Unexpected extra parts of multipart/signed. Indexing anyway.
Note: Ignoring non-mail file:
/home/jeff/mail/message/6/5/65c74c15a686187bb6bbf9958f494fc6b80068034a659a9ad44991b08c58f2d2
Note: Ignoring non-mail file:
/home/jeff/mail/message/7/9/7902699be42c8a8e46fbbb4501726517e86b22c56a189f7625a6da49081b2451
Note: Ignoring non-mail file:
/home/jeff/mail/message/8/0/802071f7fcd8b0b74a19e1ca64e5468184fee0c9171bacb77ae1fe1669c426ee
A Xapian exception occurred creating message: Db block overwritten -
are there multiple writers?
Error: A Xapian exception occurred. Halting processing.
Internal error: Message with document ID of 175013 has no thread ID.
 (lib/message.cc:353).
[jeff@max1 notmuch]$ ./notmuch new
Internal error: Message with document ID of 175013 has no thread ID.
 (lib/message.cc:353).

-- 
Jeff Ollie