Re: [notmuch] indexing mail?
On 01/23/2010 07:09 AM, Carl Worth wrote:
> Your original patch was sent as application/octet-stream which made
> it awkward to read, (I would have to manually save it rather than
> just being able to read it within emacs with notmuch).

uum yeah thanks. I'll try to figure out how this works.

> But I've since pushed a separate patch to fix this bug. Please give
> it a try and let me know what you think.

will do later.

>> It takes around half an hour for my 60K mail on reiserfs, but it
>> did take 10 minutes already on ext4.
>
> What operation is taking that long?

notmuch new, rescanning my entire 60k mails on every single new message
I get.

>> I suggest having a different approach to feed new mail in, such as:
>>
>>     for i in (fetchmail)
>>     do
>>         notmuch new $i
>>     done
>
> I'm still not sure what is slow for you,

scanning 60k mails. :D That's not fixable, other than by not doing that.

> but I'm also not sure how the above would help.

It doesn't scan all 60K individually but only the single new one.

--
Arvid
Asgaard Technologies
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch
Re: [notmuch] indexing mail?
On 01/23/2010 03:29 PM, Arvid Picciani wrote:
> done I'm still not sure what is slow for you, scanning 60k mails. :D
> That's not fixable, other than by not doing that. but I'm also not
> sure how the above would help. It doesn't scan all 60K individually
> but only the single new one.

Thunderfuck is a pain. It sent the mail with a different quoting than
the one it showed me, again. Desperately need to get notmuch working :(

--
Arvid
Asgaard Technologies
Re: [notmuch] indexing mail?
On Sat, 23 Jan 2010 19:09:35 +1300, Carl Worth cwo...@cworth.org wrote:
> But I've since pushed a separate patch to fix this bug. Please give
> it a try and let me know what you think.

I just gave it a try, and building failed because of a seeming
misspelling on line 285 (`DT_UKNOWN'), from commit
344c48a47de23cc63f1885d850b82359d1a34064. Fixing the misspelling fixed
the build.

Thanks, as always, for all your work on this.

Best,
Jesse
Re: [notmuch] indexing mail?
Hi Olly,

Olly == Olly Betts o...@survex.com writes:

Olly> On 2010-01-15, Dirk-Jan C Binnema wrote:
>> Olly == Olly Betts o...@survex.com writes:
Olly> Not a full patch, but I already posted what this code should look like
Olly> to handle both systems without d_type, and those which return DT_UNKNOWN:
Olly> http://article.gmane.org/gmane.mail.notmuch.general/1044

>     static gboolean _set_dtype (const char* path, struct dirent *entry)

Olly> Underscore prefixed identifiers are reserved by ISO C at file-scope;
Olly> using them yourself is undefined behaviour...

Ah, thanks for reminding, I thought it was __ and _C (capital), but you
are right:

,---- (7.1.3 Reserved identifiers)
| - All identifiers that begin with an underscore and either an uppercase
|   letter or another underscore are always reserved for any use.
|
| - All identifiers that begin with an underscore are always reserved for
|   use as identifiers with file scope in both the ordinary and tag name
|   spaces.
`----

>     /* we only care about dirs, regular files and links */
>     if (S_ISREG (statbuf.st_mode))
>             entry->d_type = DT_REG;
>     else if (S_ISDIR (statbuf.st_mode))
>             entry->d_type = DT_DIR;
>     else if (S_ISLNK (statbuf.st_mode))
>             entry->d_type = DT_LNK;

Olly> This addresses the case where the FS returns DT_UNKNOWN for d_type,
Olly> but doesn't deal with the case of platforms where struct dirent has
Olly> no d_type member - from the Linux readdir man page:

Olly>     The only fields in the dirent structure that are mandated by
Olly>     POSIX.1 are: d_name[], of unspecified size, with at most NAME_MAX
Olly>     characters preceding the terminating null byte; and (as an XSI
Olly>     extension) d_ino. The other fields are unstandardized, and not
Olly>     present on all systems; see NOTES below for some further details.

Olly> And in NOTES:

Olly>     Other than Linux, the d_type field is available mainly only on BSD
Olly>     systems.

Yes, my patch could be generalized a bit more, just like your patch
could avoid hardcoding the '/' separator :)

In practice though, which Unices in use today do not support d_type?

Best wishes,
Dirk.

--
Dirk-Jan C. Binnema                  Helsinki, Finland
e:d...@djcbsoftware.nl               w:www.djcbsoftware.nl
pgp: D09C E664 897D 7D39 5047 A178 E96A C7A1 017D DA3C
Re: [notmuch] indexing mail?
On Fri, 15 Jan 2010 21:57:32 +0200, Dirk-Jan C. Binnema djcb.b...@gmail.com wrote:
> Olly> Underscore prefixed identifiers are reserved by ISO C at file-scope;
> Olly> using them yourself is undefined behaviour...
>
> Ah, thanks for reminding, I thought it was __ and _C (capital), but you
> are right:
>
> ,---- (7.1.3 Reserved identifiers)
> | - All identifiers that begin with an underscore and either an uppercase
> |   letter or another underscore are always reserved for any use.
> |
> | - All identifiers that begin with an underscore are always reserved for
> |   use as identifiers with file scope in both the ordinary and tag name
> |   spaces.
> `----

But please don't be too strict about this. Please feel very free to use
any identifier with a _notmuch prefix. And really, feel free to use just
about any underscore-prefixed identifier that you want that doesn't
clash with anything on your system. Then if we do identify an actual
clash somewhere, we can fix it.

I think it was stupid of Posix to steal _ as a reserved prefix, and I
really don't have a problem ignoring that.

This is like I described in a recent mail---trying to prevent all
portability problems just isn't worth the effort. It's much easier to
fix problems that actually occur in practice.

-Carl
Re: [notmuch] indexing mail?
On Thu, 14 Jan 2010 18:38:54 +0100, Adrian Perez de Castro ape...@igalia.com wrote:
>> the offending commit is 2c4555f1a56602ff1dd55a63699810522ba4d91e
>>
>> from readdir (3):
>>
>>     Currently, only some file systems (among them: Btrfs, ext2, ext3,
>>     and ext4) have full support returning the file type in d_type. All
>>     applications must properly handle a return of DT_UNKNOWN.

Yes. The broken code was my mistake. I clearly didn't read the above
warning closely enough. Sorry about that!

> I am using XFS, which always returns DT_UNKNOWN. Taking into account
> that there is a good deal of people using filesystems other than the
> ones you mention, and that other non-linux filesystems may also return
> DT_UNKNOWN, in my opinion there should be a fall-back. I will try to
> post a patch Anytime Soon™.

We definitely want the fallback. I can attempt to code it, but I don't
have ready access to an afflicted filesystem, so I'd need help testing
anyway.

I'd love to see a patch for this bug soon. Be sure to CC me when the
patch is sent and that will help me commit it sooner.

> Also, I have the feeling that the d_type field from struct dirent may
> not be available in some OSes because it is a BSD extension.

I'm generally quite bad at determining whether functionality I'm using
in my software is non-portable. As proven in this case, even when the
man page tells me something is not portable I don't always notice, (and
often, the man pages aren't even that useful).

Beyond that, even if something is *known* to be theoretically
non-portable, it can be a waste of time to code compatibility paths that
nobody will be running in practice.

So I've basically gotten to the point where I just code for what works
on my system, (not out of disregard for what other people run---just
that it's impossible for me to know what subset of functionality is
actually relevant).

Then, at the same time, I'm quite happy to accept code to improve the
portability when people note that things are broken on other systems.
See the git history and email archives for examples of how we fixed
strndup and getline portability problems.

I know that waiting for people to notice it's broken isn't the nicest
thing we could do with our code. But I don't really know a much better
way. I'm happy to entertain suggestions here.

-Carl
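To illustrate the kind of after-the-fact portability fix mentioned above, here is a minimal sketch of a replacement for strndup on systems that lack it. This is not the code from the notmuch git history; the name compat_strndup is hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fallback for platforms without strndup(): copy at most
 * n bytes of s into freshly allocated, NUL-terminated storage.
 * Returns NULL if allocation fails; the caller frees the result. */
static char *
compat_strndup (const char *s, size_t n)
{
    size_t len = 0;
    char *copy;

    assert (s != NULL);

    /* Find the length to copy without reading past n bytes. */
    while (len < n && s[len] != '\0')
        len++;

    copy = malloc (len + 1);
    if (copy == NULL)
        return NULL;

    memcpy (copy, s, len);
    copy[len] = '\0';
    return copy;
}
```

A build system would typically compile such a function in only when a configure-time check finds that the platform does not provide strndup itself.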
Re: [notmuch] indexing mail?
On 2010-01-14, Carl Worth wrote:
> On Thu, 14 Jan 2010 18:38:54 +0100, Adrian Perez de Castro ape...@igalia.com wrote:
>> I am using XFS, which always returns DT_UNKNOWN. Taking into account
>> that there is a good deal of people using filesystems other than the
>> ones you mention, and that other non-linux filesystems may also return
>> DT_UNKNOWN, in my opinion there should be a fall-back. I will try to
>> post a patch Anytime Soon™.
>
> We definitely want the fallback. I can attempt to code it, but I don't
> have ready access to an afflicted filesystem, so I'd need help testing
> anyway.
>
> I'd love to see a patch for this bug soon. Be sure to CC me when the
> patch is sent and that will help me commit it sooner.

Not a full patch, but I already posted what this code should look like
to handle both systems without d_type, and those which return DT_UNKNOWN:

http://article.gmane.org/gmane.mail.notmuch.general/1044

Cheers,
Olly
Re: [notmuch] indexing mail?
Olly == Olly Betts o...@survex.com writes:

Olly> On 2010-01-14, Carl Worth wrote:
>> On Thu, 14 Jan 2010 18:38:54 +0100, Adrian Perez de Castro ape...@igalia.com wrote:
>>> I am using XFS, which always returns DT_UNKNOWN. Taking into account
>>> that there is a good deal of people using filesystems other than the
>>> ones you mention, and that other non-linux filesystems may also return
>>> DT_UNKNOWN, in my opinion there should be a fall-back. I will try to
>>> post a patch Anytime Soon™.
>>
>> We definitely want the fallback. I can attempt to code it, but I don't
>> have ready access to an afflicted filesystem, so I'd need help testing
>> anyway.
>>
>> I'd love to see a patch for this bug soon. Be sure to CC me when the
>> patch is sent and that will help me commit it sooner.

Olly> Not a full patch, but I already posted what this code should look like
Olly> to handle both systems without d_type, and those which return DT_UNKNOWN:
Olly> http://article.gmane.org/gmane.mail.notmuch.general/1044

I take a slightly different approach in mu:

    /* if the file system does not support entry->d_type, we add it ourselves
     * this is slower (extra stat) but at least it works
     */
    static gboolean
    _set_dtype (const char* path, struct dirent *entry)
    {
            struct stat statbuf;
            char fullpath[4096];

            snprintf (fullpath, sizeof(fullpath), "%s%c%s",
                      path, G_DIR_SEPARATOR, entry->d_name);

            if (stat (fullpath, &statbuf) != 0) {
                    g_warning ("stat failed on %s: %s",
                               fullpath, strerror(errno));
                    return FALSE;
            }

            /* we only care about dirs, regular files and links */
            if (S_ISREG (statbuf.st_mode))
                    entry->d_type = DT_REG;
            else if (S_ISDIR (statbuf.st_mode))
                    entry->d_type = DT_DIR;
            else if (S_ISLNK (statbuf.st_mode))
                    entry->d_type = DT_LNK;

            return TRUE;
    }

and then in some other places:

    /* handle FSs that don't support entry->d_type */
    if (entry->d_type == DT_UNKNOWN)
            _set_dtype (dirname, entry);

Note that this is untested as of yet.

Best wishes,
Dirk.

--
Dirk-Jan C. Binnema                  Helsinki, Finland
e:d...@djcbsoftware.nl               w:www.djcbsoftware.nl
pgp: D09C E664 897D 7D39 5047 A178 E96A C7A1 017D DA3C
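For reference, a minimal self-contained sketch of the same stat-based fallback without the GLib dependency. It is not the mu or notmuch implementation: the helper name type_from_lstat is hypothetical, the '/' separator is hardcoded for brevity, and it returns the type instead of writing into the dirent. It uses lstat() rather than stat(), because stat() follows symbolic links and so an S_ISLNK test on its result can never match.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* glibc: keep DT_* and lstat visible in strict -std= modes */
#endif
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: classify DIR/NAME with lstat().
 * Returns DT_REG, DT_DIR or DT_LNK, and DT_UNKNOWN for anything else
 * or when the path is too long or lstat() fails. */
static unsigned char
type_from_lstat (const char *dir, const char *name)
{
    char fullpath[4096];
    struct stat st;
    int n;

    assert (dir != NULL && name != NULL);

    n = snprintf (fullpath, sizeof fullpath, "%s/%s", dir, name);
    if (n < 0 || (size_t) n >= sizeof fullpath)
        return DT_UNKNOWN;          /* path did not fit in the buffer */

    if (lstat (fullpath, &st) != 0)
        return DT_UNKNOWN;          /* entry vanished or is unreadable */

    /* we only care about regular files, directories and links */
    if (S_ISREG (st.st_mode))
        return DT_REG;
    if (S_ISDIR (st.st_mode))
        return DT_DIR;
    if (S_ISLNK (st.st_mode))
        return DT_LNK;
    return DT_UNKNOWN;
}
```

A directory walker would call this only when readdir() reports DT_UNKNOWN, so filesystems that do fill in d_type avoid the extra system call per entry.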
Re: [notmuch] indexing mail?
On 2010-01-15, Dirk-Jan C Binnema wrote:
> Olly == Olly Betts o...@survex.com writes:
>
> Olly> Not a full patch, but I already posted what this code should look like
> Olly> to handle both systems without d_type, and those which return DT_UNKNOWN:
> Olly> http://article.gmane.org/gmane.mail.notmuch.general/1044
>
>     static gboolean _set_dtype (const char* path, struct dirent *entry)

Underscore prefixed identifiers are reserved by ISO C at file-scope;
using them yourself is undefined behaviour...

>     /* we only care about dirs, regular files and links */
>     if (S_ISREG (statbuf.st_mode))
>             entry->d_type = DT_REG;
>     else if (S_ISDIR (statbuf.st_mode))
>             entry->d_type = DT_DIR;
>     else if (S_ISLNK (statbuf.st_mode))
>             entry->d_type = DT_LNK;

This addresses the case where the FS returns DT_UNKNOWN for d_type, but
doesn't deal with the case of platforms where struct dirent has no
d_type member - from the Linux readdir man page:

    The only fields in the dirent structure that are mandated by
    POSIX.1 are: d_name[], of unspecified size, with at most NAME_MAX
    characters preceding the terminating null byte; and (as an XSI
    extension) d_ino. The other fields are unstandardized, and not
    present on all systems; see NOTES below for some further details.

And in NOTES:

    Other than Linux, the d_type field is available mainly only on BSD
    systems.

Cheers,
Olly
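The two cases distinguished above (d_type present but DT_UNKNOWN, and no d_type member at all) can be sketched as follows. This is an illustration only, not the code from the linked article: the enum and function names are hypothetical, and it assumes the C library defines the _DIRENT_HAVE_D_TYPE macro when struct dirent has a d_type member, as glibc does. Where the macro is absent the code never touches d_type and simply always calls lstat(), so a project targeting other platforms might prefer a configure-time check.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* glibc: keep DT_* and lstat visible in strict -std= modes */
#endif
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical result type, independent of the DT_* constants so that
 * it also exists on platforms without d_type. */
enum entry_kind { ENTRY_OTHER, ENTRY_FILE, ENTRY_DIR, ENTRY_LINK };

/* Slow path: one lstat() call per entry. */
static enum entry_kind
kind_from_lstat (const char *dir, const char *name)
{
    char fullpath[4096];
    struct stat st;
    int n = snprintf (fullpath, sizeof fullpath, "%s/%s", dir, name);

    if (n < 0 || (size_t) n >= sizeof fullpath)
        return ENTRY_OTHER;
    if (lstat (fullpath, &st) != 0)
        return ENTRY_OTHER;
    if (S_ISREG (st.st_mode))
        return ENTRY_FILE;
    if (S_ISDIR (st.st_mode))
        return ENTRY_DIR;
    if (S_ISLNK (st.st_mode))
        return ENTRY_LINK;
    return ENTRY_OTHER;
}

/* Use d_type when the platform has it and the filesystem filled it in;
 * otherwise fall back to lstat(). */
static enum entry_kind
classify_entry (const char *dir, const struct dirent *entry)
{
    assert (dir != NULL && entry != NULL);

#ifdef _DIRENT_HAVE_D_TYPE
    switch (entry->d_type) {
    case DT_REG:     return ENTRY_FILE;
    case DT_DIR:     return ENTRY_DIR;
    case DT_LNK:     return ENTRY_LINK;
    case DT_UNKNOWN: break;              /* e.g. XFS in this thread */
    default:         return ENTRY_OTHER; /* fifo, socket, device, ... */
    }
#endif
    return kind_from_lstat (dir, entry->d_name);
}
```

Returning a private enum rather than assigning to entry->d_type keeps the caller free of any reference to the non-standard member.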