Re: MESSAGE quota resource implemention
On 05/09/11 20:16, Greg Banks wrote: > On 02/09/11 20:03, Bron Gondwana wrote: >> On Fri, Sep 02, 2011 at 07:36:20PM +1000, Greg Banks wrote: >> >>> How's about this for a strategy? >>> >>> When a quota resource is first enabled, (i.e. the limit is changed from >>> UNLIMITED to some finite value), the usage is stored as some special >>> value which I'll call INDETERMINATE. >> >> What about 'getquota'? I don't support any solution which leaves getquota >> returning bogus values or failing to respond. That's just icky and >> confusing. >> >> I don't think you can avoid two passes, and I don't even think you can >> avoid two values during if you really want to be good about it. >> >> [...] >> >> There are pure ways to do this, that guarantee consistency. [...] >> >> That's a real, robust solution. But it's pretty heavy engineering. > > After some thought, I agreed, decided it's quite doable, and decided to > start. I have an implementation coded, so far the hardest bit is testing it. Implemented, tested, and I'd appreciate a review. https://github.com/gnb/cyrus-imapd/commit/af8d5bd3b40c4cb49fb7c943c85cb0aeab209215 https://github.com/gnb/cassandane/commit/c529f745b51aa0b566020ecbffca774e783124ee https://github.com/gnb/cassandane/commit/569eea2b9a2ad56b98c3c5cea60509689fa49bad -- Greg.
Re: MESSAGE quota resource implemention
On Mon, Sep 05, 2011 at 07:19:00PM +0200, Julien Coloos wrote: > Le 05/09/2011 12:06, Bron Gondwana a écrit : > >On Mon, Sep 05, 2011 at 02:32:40PM +1000, Greg Banks wrote: > > > >> - add a 32b mailbox index header entry to track the storage in bytes > >>used by all annotations for the mailbox itself or for messages in the > >>mailbox > >Why not 64b? Admittedly hanging gigabytes of annotations off a single > >folders is probably evil, but this is the mailbox header - 4 bytes per > >mailbox to not even have to think about it is super-cheap. We keep a > >handful of "blanks" around in the header already. > > > >... > > > >> - in mailbox_commit_quota(), if the field is not "unknown", then > >>calculate the delta in usage and apply to the quota db. > > > This new field seems useful for the quota utility and when setting a > new quotaroot: it wouldn't need to check each annotation associated > to the mailbox (or messages) and would rely on that field, as it > currently does for total message size and now the number of messages > (with my patch). > I don't know what other people do with cyrus, but here those two > situations do not happen that often: > - quotaroot is usually set once upon creating the user mailbox > - quota utility is rarely used, and usually on one mailbox at a time > So maybe I am missing something, but my question would be: is it > really worth it ? I would challenge the "usually on one mailbox at a time" theory - it's often on one quota root at a time (i.e. - a single user), but also can be used on the entire server. Or for someone with a domain quota, on all the users in a single domain. This can be quite the race condition if someone has a lot of mailboxes and a lot of traffic on said mailboxes. Short of killing their connections and blocking LMTP traffic, you can't guarantee atomicity on actions that cross multiple mailboxes. But - the new field definitely has value. I like the idea of just trusting that field in the quota utility and for regular tasks. Bron.
Re: MESSAGE quota resource implemention
Le 05/09/2011 12:06, Bron Gondwana a écrit : On Mon, Sep 05, 2011 at 02:32:40PM +1000, Greg Banks wrote: - add a 32b mailbox index header entry to track the storage in bytes used by all annotations for the mailbox itself or for messages in the mailbox Why not 64b? Admittedly hanging gigabytes of annotations off a single folders is probably evil, but this is the mailbox header - 4 bytes per mailbox to not even have to think about it is super-cheap. We keep a handful of "blanks" around in the header already. ... - in mailbox_commit_quota(), if the field is not "unknown", then calculate the delta in usage and apply to the quota db. This new field seems useful for the quota utility and when setting a new quotaroot: it wouldn't need to check each annotation associated to the mailbox (or messages) and would rely on that field, as it currently does for total message size and now the number of messages (with my patch). I don't know what other people do with cyrus, but here those two situations do not happen that often: - quotaroot is usually set once upon creating the user mailbox - quota utility is rarely used, and usually on one mailbox at a time So maybe I am missing something, but my question would be: is it really worth it ? Regards Julien
Re: MESSAGE quota resource implemention
Le 05/09/2011 06:12, Greg Banks a écrit : Julien, I think we agreed on everything else, right? I'm looking forward to your next iteration. After picking your 'uquota_t removal' commit, I also removed it on my end, and changed the code according to our previous discussions. Adding an helper function which fills a quota_t array with the current messages usage from mailbox struct made the code a bit cleaner and more generic :) I still have to look for transferring limits upon DUMP/UNDUMP. And for the time being, I still kept the quota.sets[] thing (we will see later if we can do it more smartly). Regards Julien
Re: MESSAGE quota resource implemention
On 05/09/11 20:06, Bron Gondwana wrote: > On Mon, Sep 05, 2011 at 02:32:40PM +1000, Greg Banks wrote: >> Ok, I'm now convinced my first attempt at annotation quotas sucked too >> hard, here's how I want to re-implement it. Let me know what you think. > >> - add a 32b mailbox index header entry to track the storage in bytes >> used by all annotations for the mailbox itself or for messages in the >> mailbox > > Why not 64b? Admittedly hanging gigabytes of annotations off a single > folders is probably evil, Do we even have a db backend that would support that? > but this is the mailbox header - 4 bytes per > mailbox to not even have to think about it is super-cheap. We keep a > handful of "blanks" around in the header already. > >> - at header upgrade time, write the special value ~0 into this field, >> meaning "unknown" > > Funky. If you didn't want to get all special-value about it, you could > use a flag over in the flags field as well, there are only about 5 of > the 32 in use so far. Sure. >> - in mailbox_commit_quota(), if the field is not "unknown", then >> calculate the delta in usage and apply to the quota db. > > I assume this is done the same way it's done now? Something like that. > By taking a "snapshot" > of the value at the start, and calculating a diff to apply at the end. > That was actually a filthy workaround (one of many, dammit) for using > unsigned datatypes. If we made this a signed 64 bit datatype as well, > with a flag off to the side for "unknown", then we could actually store > a diff in the mailbox object directly - and update both this field and > the quota's value during the commit. Ok. >> - make reconstruct scan the annotations db to figure out the correct >> value for this new field. > > Yep - that makes sense. Reconstruct already does this for the > quota_mailbox_used field, doesn't it. Yes. > And any other derived > "aggregate" values in the header. > >> This means that the annotation STORE and SETMETADATA paths will be >> updating the quota db in the same place that APPEND et al do, >> mailbox_commit_quota(). This should work around Bug #3529 and also make >> the code neater. > > Nice. Yeah, all I have to do is work out how to futz with the annotate API yet again :( -- Greg.
Re: MESSAGE quota resource implemention
On 02/09/11 20:03, Bron Gondwana wrote: > On Fri, Sep 02, 2011 at 07:36:20PM +1000, Greg Banks wrote: > >> How's about this for a strategy? >> >> When a quota resource is first enabled, (i.e. the limit is changed from >> UNLIMITED to some finite value), the usage is stored as some special >> value which I'll call INDETERMINATE. > > What about 'getquota'? I don't support any solution which leaves getquota > returning bogus values or failing to respond. That's just icky and > confusing. > > I don't think you can avoid two passes, and I don't even think you can > avoid two values during if you really want to be good about it. > > [...] > > There are pure ways to do this, that guarantee consistency. [...] > > That's a real, robust solution. But it's pretty heavy engineering. After some thought, I agreed, decided it's quite doable, and decided to start. I have an implementation coded, so far the hardest bit is testing it. -- Greg.
Re: MESSAGE quota resource implemention
On Mon, Sep 05, 2011 at 02:32:40PM +1000, Greg Banks wrote: > Ok, I'm now convinced my first attempt at annotation quotas sucked too > hard, here's how I want to re-implement it. Let me know what you think. > - add a 32b mailbox index header entry to track the storage in bytes > used by all annotations for the mailbox itself or for messages in the > mailbox Why not 64b? Admittedly hanging gigabytes of annotations off a single folders is probably evil, but this is the mailbox header - 4 bytes per mailbox to not even have to think about it is super-cheap. We keep a handful of "blanks" around in the header already. > - at header upgrade time, write the special value ~0 into this field, > meaning "unknown" Funky. If you didn't want to get all special-value about it, you could use a flag over in the flags field as well, there are only about 5 of the 32 in use so far. > - in mailbox_commit_quota(), if the field is not "unknown", then > calculate the delta in usage and apply to the quota db. I assume this is done the same way it's done now? By taking a "snapshot" of the value at the start, and calculating a diff to apply at the end. That was actually a filthy workaround (one of many, dammit) for using unsigned datatypes. If we made this a signed 64 bit datatype as well, with a flag off to the side for "unknown", then we could actually store a diff in the mailbox object directly - and update both this field and the quota's value during the commit. > - make reconstruct scan the annotations db to figure out the correct > value for this new field. Yep - that makes sense. Reconstruct already does this for the quota_mailbox_used field, doesn't it. And any other derived "aggregate" values in the header. > This means that the annotation STORE and SETMETADATA paths will be > updating the quota db in the same place that APPEND et al do, > mailbox_commit_quota(). This should work around Bug #3529 and also make > the code neater. Nice. Bron.