On Mon, May 09, 2016 at 08:57:55PM -0400, Damien Riegel wrote: > On Mon, May 09, 2016 at 06:42:33PM -0500, Derek Martin wrote: > > On Sat, May 07, 2016 at 04:14:36AM -0400, Damien Riegel wrote: > > OK, I get it now. I think this could work OK if the parent did a > > single stat, and perhaps did an opendir if the object was a directory, > > and then passed copies/pointers/whatever of those objects into the > > registered probe functions (pointers for sure). There's no need to do > > multiple stats. Then at least the only efficiency loss is the extra > > function calls. If dir, the probe function should assume it needs to > > do rewinddir(). > > Thanks for your input Derek. My code was based on the assumption that > mx_get_magic is idempotent, but maybe it is not true.
Now that I've actually looked at the code... ;-) I think it is, in the sense that you mean. That's not the issue though, if I understand correctly. I think your probe design, keeping the knowledge about a mailbox implementation separate from the rest of mutt, is a good idea, and I didn't mean to discourage that. I'm just pointing out that as implemented by your current set of patches, it has a performance concern, but it's one that can be overcome. Maybe I'm not making myself clear, so I'll try again. The issue is that Mutt is going to call your registered probe functions one at a time, until one of them returns that it matches the mailbox type. Currently in all cases this is a function that wraps the mx_get_magic() function, which always checks every possible path, until it finds one that matches. Kevin was (I think) pointing out that this is inefficient, which it is. You'll stat each relevant path up to n times, where n is the number of mail box types you've registered. Currently the mx_get_magic() function and friends stat roughly 10 paths, so you'll do up to roughly n*10 stat() calls. As you say, a normal stat takes negligible time and the user will never notice; and probably the data will end up served from some cache somewhere in the filesystem layer an/or hardware. However, for example, if the mail store is on an NFS server with a slow link, in a worst-case scenario may add up to being very noticable to the user (NFS caching may or may not help hide some of this). If there were 10 mailbox types and your latency to the NFS server was 100ms, this might end up taking roughly 2 * 100ms * 10 mailboxes * 10 paths = 20,000ms, or 20s. Yuck. What I'm suggesting is that if the function that runs through the probe functions does the work in advance, and caches as much of it as makes sense, you can get most of that back. I just took a look at the mx_get_magic() function and those that it calls... The lot of that basically just stats a bunch of paths--there aren't even any opendir() calls. SO: - register a TYPE of folder, where the type is either local file stor or not (or something of the sort) - register a list of paths that your mailbox driver needs to stat to identify mail stores of its type (can be NULL) - FIRST the driver function should check for non-local file stores (like IMAP and POP), since those require no stat() calls (and I believe should mostly just be a string compare, specifying the protocol). - THEN have the driver function pre-fetch the stat structs and cache them somehow (linked list, hash table, whatever), and pass that structure to the probe functions Now you can break up mx_get_magic() into its relative peices, and make each piece the probe function for that specific mailbox type. All must take a pointer to your cache structure, from which they'll get the struct stat structures they'll use to make their decision. Then they'll just pull out the stat structs they need. You could do this by having two different probe functions in your struct--one for local file store, and another for non-local. This would be sufficient to distinguish the type, though for neatness you may want to have a field that specifies it explicitly. For each mail store type you would register the correct one and leave the other NULL. Then your driver function would call all of the probe functions for non-local, which need not take your stat cache structure. This way you only need to generate the stats if at least one of them is needed. Alternatively, have just one, use the type field, and leave the stat struct cache argument NULL for the non-local types (and still wait to generate the cache until they all fail). Does this make sense? Note also that you need not necessarily implement something like this for the first pass... It's of course up to Kevin how and when (and if) he wants to take the various patches. -- Derek D. Martin http://www.pizzashack.org/ GPG Key ID: 0xDFBEAD02 -=-=-=-=- This message is posted from an invalid address. Replying to it will result in undeliverable mail due to spam prevention. Sorry for the inconvenience. -- Derek D. Martin http://www.pizzashack.org/ GPG Key ID: 0xDFBEAD02 -=-=-=-=- This message is posted from an invalid address. Replying to it will result in undeliverable mail due to spam prevention. Sorry for the inconvenience. -- Derek D. Martin http://www.pizzashack.org/ GPG Key ID: 0xDFBEAD02 -=-=-=-=- This message is posted from an invalid address. Replying to it will result in undeliverable mail due to spam prevention. Sorry for the inconvenience.
pgpcm2SSBvnjf.pgp
Description: PGP signature
