Re: MESSAGE quota resource implemention

2011-09-18 Thread Greg Banks
On 17/09/11 01:37, Julien Coloos wrote:
 As discussed here and on IRC, I rebased my commits with all the changes:
   - the 'utility' methods have been promoted to 'command' ones, which
 are now generic and may (options hash) concern cyrus binaries
   - the 'start_command_bg' method is now replaced by a 'background'
 option in 'run_command'
   - dropped the 'mode' option, leaving only the 'redirects' one
   - moved 'unpack' method from Cassandane::Unit::TestCase to
 Cassandane::Instance, leaving the tar utility determine itself the kind
 of the tar file
 - by default destination is the current cassandane instance base
 directory; alternatively one can specify a relative path -
 'path/to/file' - to this directory, or an absolute path  - '/path/to/file'
   - use the pre-existing 'user.cassandane' mailbox for quota upgrade tests
   - fixed a few minor things (like avoiding short-life zombies by
 harvesting some cassandane dead children)
   - refactored a few more stuff
   - removed stuff that became unnecessary
 
 Hoping this will work out this time :)

Excellent work, thanks a lot.  I particularly like the refactoring in
Quota.pm that allows the tests to express checks of reported quota
numbers in a way that's concise but not sensitive to the order reported
by the server.

I've pulled your changes and pushed them out to github and cmu.


-- 
Greg.


Re: MESSAGE quota resource implemention

2011-09-16 Thread Julien Coloos

As discussed here and on IRC, I rebased my commits with all the changes:
  - the 'utility' methods have been promoted to 'command' ones, which 
are now generic and may (options hash) concern cyrus binaries
  - the 'start_command_bg' method is now replaced by a 'background' 
option in 'run_command'

  - dropped the 'mode' option, leaving only the 'redirects' one
  - moved 'unpack' method from Cassandane::Unit::TestCase to 
Cassandane::Instance, leaving the tar utility determine itself the kind 
of the tar file
- by default destination is the current cassandane instance base 
directory; alternatively one can specify a relative path - 
'path/to/file' - to this directory, or an absolute path  - '/path/to/file'

  - use the pre-existing 'user.cassandane' mailbox for quota upgrade tests
  - fixed a few minor things (like avoiding short-life zombies by 
harvesting some cassandane dead children)

  - refactored a few more stuff
  - removed stuff that became unnecessary

Hoping this will work out this time :)


Regards
Julien


Re: MESSAGE quota resource implemention

2011-09-15 Thread Greg Banks
On 15/09/11 03:14, Julien Coloos wrote:
 Le 12/09/2011 23:09, Bron Gondwana a écrit :

   - for 'old' mailboxes (those created before the annotation storage
 usage field in the index header), current annotations usage shall be
 computed (and added to the quota entry) upon upgrading; this way
 users won't have to run 'quota -f' for all quotaroots after
 switching to this new version ;)
 Definitely.  Upgrading usually handles things like that.  It's
 the right way[tm].

 Pushed a few lines of code in the 'upgrade_index' function. Though it
 still lacks the annotation storage usage computing.
 It's in the branch 'gnb/annotate/fixes'; commit
 ed61f721487804b205e399c538fc35ddc153bd93.

Thanks Julien, I've cherry-picked this commit and the Fixes commit
from the other day into my annotate branch.

BTW I'm working on making reconstruct calculate the X-ANNOTATION-STORAGE
quota.  Some of that code may also be useful during upgrade and for
handling message expunging.

 
 Actually I just gave up the 'old' test: there is no easy way to
 simulate upgrading mailbox index, or at least I don't feel confident
 enough to make it in cassandane :(

No, there isn't a really good way at the moment.

 Easiest way is the same way I did for the broken quotas test.
 Have a tar file with the contents of the old mailbox which
 you unpack onto the filesystem, and then open the mailbox and
 check that the upgraded fields are what you would expect.

 I also did something similar for the crash on thread test,
 where there were 5 messages which were known to be able to
 crash the THREAD command.  I unpacked the folder contents
 with those messages in a test.
 Nice idea :)

Bron's method of saving a tarball of the state of a mailbox generated by
an older release is reasonably straightforward.  The difficulty however
is going to be working out whether the result of the upgrade is correct.
 In the general case, you can't just test that the first open of the
mailbox doesn't report an error from upgrading, you need to test at least:

 * the mailbox is re-written with the latest version number

 * all the new fields have correct or at least feasible values

 * the messages previously present are still present and in the same
order with the same contents and the same metadata (uid, flags,
internaldate, etc)

To do this properly you really need to save, independently of the Cyrus
mailbox store, a whole lot of information about the expected results of
the mailbox open so that Cassandane can check it.  The two other
alternatives are both worse: a) don't check and b) hardcode checks into
the Cassandane code which depend intimately on details of the data in
the mailbox state tarball.

I'm thinking that what we need is a standalone utility built from the
Cassandane source tree.  Someone would point this utility at a live
working Cyrus instance (not one controlled by Cassandane), and it would
create a new folder and fill the folder with known data, then harvest
the on-disk state left by Cyrus and packages that state up into a
tarball along with a separate description of the messages.  That tarball
could then be checked into the Cassandane tree and used to run upgrade
tests with reasonable post-upgrade checks.  The advantages I see for
this approach are:

 * when (not 'if') we need to improve either the data generated or the
checks run, we can improve this utility and then re-capture new state as
time permits.

 * the versions of Cyrus tested can be on separate machines from
Cassandane itself

   - which allows testing upgrades from really old versions which don't
install or run on modern OSes due to library issues, and

   - allows testing upgrades from Cyrus on other platforms (e.g.
platforms with different endianness or wordsize).


BTW at some point in the future I want Cassandane to be able to handle
multiple installed versions of Cyrus, so that we can test configurations
like cross-version replication.  But that's not really the same problem
as upgrade testing.  For upgrade testing, we don't need functioning code
for the old Cyrus versions, just a snapshot of their spoor.  For
cross-version replication or murder testing we need live installs to
poke.  The difference is critical, because I figure we'll need to
support upgrade from much older versions than we support in a
cross-version murder, and those older versions will be quite challenging
to install or get running on the same platform that we develop on.

 Pushed a few things.

Thanks. I pulled in your commit Added quota MESSAGE resource (RFC 2087)
test cases. yesterday and then refactored the new tests a bit, but
didn't get a chance to push it out again until today.


 Added an option hash so that when running commands:
   - test can run system commands (based on the current code which would
 run cyrus commands inside the current cassandane instance directory)



   - finer I/O redirections are possible
   - working directory can be specified
 While adding those new 'features', I still kept 

Re: MESSAGE quota resource implemention

2011-09-14 Thread Julien Coloos

Le 12/09/2011 23:09, Bron Gondwana a écrit :



  - for 'old' mailboxes (those created before the annotation storage
usage field in the index header), current annotations usage shall be
computed (and added to the quota entry) upon upgrading; this way
users won't have to run 'quota -f' for all quotaroots after
switching to this new version ;)

Definitely.  Upgrading usually handles things like that.  It's
the right way[tm].

Pushed a few lines of code in the 'upgrade_index' function. Though it 
still lacks the annotation storage usage computing.
It's in the branch 'gnb/annotate/fixes'; commit 
ed61f721487804b205e399c538fc35ddc153bd93.



Actually I just gave up the 'old' test: there is no easy way to
simulate upgrading mailbox index, or at least I don't feel confident
enough to make it in cassandane :(

Easiest way is the same way I did for the broken quotas test.
Have a tar file with the contents of the old mailbox which
you unpack onto the filesystem, and then open the mailbox and
check that the upgraded fields are what you would expect.

I also did something similar for the crash on thread test,
where there were 5 messages which were known to be able to
crash the THREAD command.  I unpacked the folder contents
with those messages in a test.

Nice idea :)
Pushed a few things.
Added an option hash so that when running commands:
  - test can run system commands (based on the current code which would 
run cyrus commands inside the current cassandane instance directory)

  - finer I/O redirections are possible
  - working directory can be specified
While adding those new 'features', I still kept the 'run_utility' method 
(cyrus commands) and simply added the 'run_command' method for other 
commands.


Then, as you suggested on IRC, I added an 'unpack' method to extract 
tar/gz/bz2 (and combinations) files using system commands.


And I added back my quota upgrade test from a cyrus v2.4(.11) mailbox, 
using two tar.gz files containing a mailbox content and its quota file.


It's in the 'quotamessage/gnb/annotate' branch; commits 
d2bf4e4f42f7f8c9b53713441116ddbea5b0a265, 
86a52daeed5a03f078b88de67d3d10b51a7f8cc4 and 
fb827e8fc77529a5e23465d2c19d6e88adf7cae8.



Maybe you won't keep everything I pushed, but I hope some parts will be 
helpful :)


Regards
Julien


Re: MESSAGE quota resource implemention

2011-09-12 Thread Julien Coloos

Le 09/09/2011 14:18, Greg Banks a écrit :

Ok, here you go. Not completely tested yet, so caveat emptor.


Had to change two things:
 - mailbox_quota_check now expects a quota diff array which is good :), 
but change is now really applied if all diffs are ' 0' (instead of '= 
0' previously); in some cases '0' is used to check that mailbox is not 
currently overquota (e.g. in LMTP service), so either we should go back 
to testing '= 0', or change callers that did rely on that test
 - changing the value of annotations quota storage in the index 
(mailbox_use_annot_quota function) do dirty the quota, but not the 
index; the new value is thus not committed (at least when setting 
mailbox annotations), and the quota entry becomes false when later 
deleting the mailbox


If needed, you can look in our 'cyrus-imapd' repository at the 
'gnb/annotate/fixes' branch. There is one commit for those matters.



Remaining things concerning annotations:
 - when deleting messages, annotations length is not substracted; the 
solution may not be that simple, since I believe users are allowed to 
unexpunge mails: so since index entry is still here - until real 
unlinking -, annotations may have to stay there too - until unlinking too
 - for 'old' mailboxes (those created before the annotation storage 
usage field in the index header), current annotations usage shall be 
computed (and added to the quota entry) upon upgrading; this way users 
won't have to run 'quota -f' for all quotaroots after switching to this 
new version ;)



Then maybe some of the cassandane tests I pushed on our repository would
need to be refreshed (at least the one that checks what happens for
legacy mailboxes on which we add one of the newly handled quota resources).

That's next week's focus I think.
Actually I just gave up the 'old' test: there is no easy way to simulate 
upgrading mailbox index, or at least I don't feel confident enough to 
make it in cassandane :(
Other tests do work. Once the annotations usage is subtracted upon 
messages deletion (see before), all tests shall pass :)


I rebased the 'quotamessage/gnb/annotate' branch of our 'cassandane' 
repository today, leaving that test aside.



Regards
Julien


Re: MESSAGE quota resource implemention

2011-09-12 Thread Bron Gondwana
On Mon, Sep 12, 2011 at 04:46:52PM +0200, Julien Coloos wrote:
 Le 09/09/2011 14:18, Greg Banks a écrit :
 Ok, here you go. Not completely tested yet, so caveat emptor.
 Remaining things concerning annotations:
  - when deleting messages, annotations length is not substracted;
 the solution may not be that simple, since I believe users are
 allowed to unexpunge mails: so since index entry is still here -
 until real unlinking -, annotations may have to stay there too -
 until unlinking too

When they unexpunge, you add it back.  The decision to have
delayed expunge and WHEN to expunge is a server administrator
decision that's outside the control of the user.

With storage, we subtract the usage when we mark the record
expunged.  Annotation length should be handled the same way.
This will have to be done when the message gets expunged,
by finding the size of the annotations and subtracting that
also.

  - for 'old' mailboxes (those created before the annotation storage
 usage field in the index header), current annotations usage shall be
 computed (and added to the quota entry) upon upgrading; this way
 users won't have to run 'quota -f' for all quotaroots after
 switching to this new version ;)

Definitely.  Upgrading usually handles things like that.  It's
the right way[tm].

 Actually I just gave up the 'old' test: there is no easy way to
 simulate upgrading mailbox index, or at least I don't feel confident
 enough to make it in cassandane :(

Easiest way is the same way I did for the broken quotas test.
Have a tar file with the contents of the old mailbox which
you unpack onto the filesystem, and then open the mailbox and
check that the upgraded fields are what you would expect.

I also did something similar for the crash on thread test,
where there were 5 messages which were known to be able to
crash the THREAD command.  I unpacked the folder contents
with those messages in a test.

Regards,

Bron.


Re: MESSAGE quota resource implemention

2011-09-07 Thread Greg Banks



Sent from my iPhone

On 08/09/2011, at 0:45, Julien Coloos julien.col...@atos.net wrote:


Le 06/09/2011 10:23, Greg Banks a écrit :


a) my commit Make quota -f less racy is going to cause lots of  
clashes

(sorry!)

b) Bron and I both think that your commit Compute each quota  
resource

upon setting it for the first time. is unnecessary, given that

  i) quota -f doesn't suck now, and

  ii) soon, all of the quota-able quantities will be tracked in  
fields

in the index header.

So I think we'll need another round, sorry :(  Given that the
annotations quota is broken and I'll be reimplementing it anyway, you
may as well ignore QUOTA_ANNOTSTORAGE in all commits, and leave out  
the
function annotatemore_computequota() for now.  We'll use something  
very
much like it for reconstruct later.  I'm hoping to be able to pull  
your
next round of changes into my annotate branch before I reimplement  
the

annotation quota in the next few days.

...

I'm still not convinced we'll need quota.sets[], but I'll play along.

Thanks again for your work, and sorry that my annotate branch wasn't
quite as stable a base as you first thought :)
So, I saved my current branch to 'quotamessage-0/gnb/annotate' and  
rebased my patches on current 'annotate' branch (with less racy  
'quota -f').
I removed everything related to recomputing from my patches (as well  
as quota.sets[]).




Excellent, I'll take a look at these when I get into the office.


What is missing now is the new index field, which value will be used  
in mailbox_get_usage function. Since my changes do rely on this  
function, and sometimes computes a delta compared to a previous call  
of that function, it may not need to be updated afterwards ... I hope.


That seems likely. I have an almost-building diff which adds that field.

Then maybe some of the cassandane tests I pushed on our repository  
would need to be refreshed (at least the one that checks what  
happens for legacy mailboxes on which we add one of the newly  
handled quota resources).




Yep.

Greg.

Re: MESSAGE quota resource implemention

2011-09-06 Thread Greg Banks
On 05/09/11 20:16, Greg Banks wrote:
 On 02/09/11 20:03, Bron Gondwana wrote:
 On Fri, Sep 02, 2011 at 07:36:20PM +1000, Greg Banks wrote:

 How's about this for a strategy?

 When a quota resource is first enabled, (i.e. the limit is changed from
 UNLIMITED to some finite value), the usage is stored as some special
 value which I'll call INDETERMINATE.

 What about 'getquota'?  I don't support any solution which leaves getquota
 returning bogus values or failing to respond.  That's just icky and
 confusing.

 I don't think you can avoid two passes, and I don't even think you can
 avoid two values during if you really want to be good about it.

 [...]

 There are pure ways to do this, that guarantee consistency. [...]

 That's a real, robust solution.  But it's pretty heavy engineering.
 
 After some thought, I agreed, decided it's quite doable, and decided to
 start. I have an implementation coded, so far the hardest bit is testing it.

Implemented, tested, and I'd appreciate a review.

https://github.com/gnb/cyrus-imapd/commit/af8d5bd3b40c4cb49fb7c943c85cb0aeab209215

https://github.com/gnb/cassandane/commit/c529f745b51aa0b566020ecbffca774e783124ee
https://github.com/gnb/cassandane/commit/569eea2b9a2ad56b98c3c5cea60509689fa49bad



-- 
Greg.


Re: MESSAGE quota resource implemention

2011-09-06 Thread Julien Coloos

Le 06/09/2011 10:23, Greg Banks a écrit :


So I think we'll need another round, sorry :(  Given that the
annotations quota is broken and I'll be reimplementing it anyway, you
may as well ignore QUOTA_ANNOTSTORAGE in all commits, and leave out the
function annotatemore_computequota() for now.  We'll use something very
much like it for reconstruct later.  I'm hoping to be able to pull your
next round of changes into my annotate branch before I reimplement the
annotation quota in the next few days.

If the new code is capable of returning actual resource usages upon 
getquota (I think Bron wanted it that way), then I guess I can drop some 
more of the functions I added (annotatemore_computequota, but also 
mboxlist_updatequota and associated code including the quota.sets[]; 
then maybe I could also change the code as proposed earlier and prevent 
writing of resource limit in quota entry if it is QUOTA_UNLIMITED). Will 
look at it tomorrow.


Worked on DUMP/UNDUMP today: 
https://github.com/worldline-messaging/cyrus-imapd/commit/45148b20a4f2343d72aa7436e7255a92508d7bf8
I used an IMAP list to send data, as for annotations. For future usages, 
maybe the UNDUMP code could try to skip IMAP lists if it finds any 
instead of the expected file content LITERAL ? (to prevent breaking 
compatibility too many times, in case other things would need to be 
transmitted this way through DUMP/UNDUMP).


By the way, I noticed that UNDUMPing annotations fails for now (function 
annotate_state_write uses 'int_mboxname' in the annotation state struct, 
but it is NULL in this context).




I'm still not convinced we'll need quota.sets[], but I'll play along.

Thanks again for your work, and sorry that my annotate branch wasn't
quite as stable a base as you first thought :)

No problem, I was somehow prepared to that kind of scenario :)


Regards
Julien


Re: MESSAGE quota resource implemention

2011-09-06 Thread Julien Coloos

Le 06/09/2011 19:48, Julien Coloos a écrit :

Le 06/09/2011 10:23, Greg Banks a écrit :


So I think we'll need another round, sorry :(  Given that the
annotations quota is broken and I'll be reimplementing it anyway, you
may as well ignore QUOTA_ANNOTSTORAGE in all commits, and leave out the
function annotatemore_computequota() for now.  We'll use something very
much like it for reconstruct later.  I'm hoping to be able to pull your
next round of changes into my annotate branch before I reimplement the
annotation quota in the next few days.

If the new code is capable of returning actual resource usages upon 
getquota (I think Bron wanted it that way), then I guess I can drop 
some more of the functions I added (annotatemore_computequota, but 
also mboxlist_updatequota and associated code including the 
quota.sets[]; then maybe I could also change the code as proposed 
earlier and prevent writing of resource limit in quota entry if it is 
QUOTA_UNLIMITED). Will look at it tomorrow.

When I said 'actual', I meant based on the upcoming mailbox index field.
But wait, I got a bit confused with the last point about playing along 
with quota.sets[]. If what you discussed with Bron is that, in the end, 
usage (re)computing will only be done with the quota utility (no more 
automatic computing upon setquota, neither for getquota), then after 
upgrading people shall call 'quota -f' once and for all and quota.sets[] 
must disappear - and all resources must be written in the quota entry -, 
otherwise users would need to call 'quota -f' on a given quotaroot each 
time they set a new quota resource limit to a mailbox.



Regards
Julien


Re: MESSAGE quota resource implemention

2011-09-05 Thread Bron Gondwana
On Mon, Sep 05, 2011 at 02:32:40PM +1000, Greg Banks wrote:
 Ok, I'm now convinced my first attempt at annotation quotas sucked too
 hard, here's how I want to re-implement it.  Let me know what you think.

  - add a 32b mailbox index header entry to track the storage in bytes
 used by all annotations for the mailbox itself or for messages in the
 mailbox

Why not 64b?  Admittedly hanging gigabytes of annotations off a single
folders is probably evil, but this is the mailbox header - 4 bytes per
mailbox to not even have to think about it is super-cheap.  We keep a
handful of blanks around in the header already.

  - at header upgrade time, write the special value ~0 into this field,
 meaning unknown

Funky.  If you didn't want to get all special-value about it, you could
use a flag over in the flags field as well, there are only about 5 of
the 32 in use so far.

  - in mailbox_commit_quota(), if the field is not unknown, then
 calculate the delta in usage and apply to the quota db.

I assume this is done the same way it's done now?  By taking a snapshot
of the value at the start, and calculating a diff to apply at the end.
That was actually a filthy workaround (one of many, dammit) for using
unsigned datatypes.  If we made this a signed 64 bit datatype as well,
with a flag off to the side for unknown, then we could actually store
a diff in the mailbox object directly - and update both this field and
the quota's value during the commit.

  - make reconstruct scan the annotations db to figure out the correct
 value for this new field.

Yep - that makes sense.  Reconstruct already does this for the
quota_mailbox_used field, doesn't it.  And any other derived
aggregate values in the header.

 This means that the annotation STORE and SETMETADATA paths will be
 updating the quota db in the same place that APPEND et al do,
 mailbox_commit_quota().  This should work around Bug #3529 and also make
 the code neater.

Nice.

Bron.


Re: MESSAGE quota resource implemention

2011-09-05 Thread Greg Banks
On 02/09/11 20:03, Bron Gondwana wrote:
 On Fri, Sep 02, 2011 at 07:36:20PM +1000, Greg Banks wrote:
 
 How's about this for a strategy?

 When a quota resource is first enabled, (i.e. the limit is changed from
 UNLIMITED to some finite value), the usage is stored as some special
 value which I'll call INDETERMINATE.
 
 What about 'getquota'?  I don't support any solution which leaves getquota
 returning bogus values or failing to respond.  That's just icky and
 confusing.
 
 I don't think you can avoid two passes, and I don't even think you can
 avoid two values during if you really want to be good about it.
 
 [...]
 
 There are pure ways to do this, that guarantee consistency. [...]
 
 That's a real, robust solution.  But it's pretty heavy engineering.

After some thought, I agreed, decided it's quite doable, and decided to
start. I have an implementation coded, so far the hardest bit is testing it.



-- 
Greg.


Re: MESSAGE quota resource implemention

2011-09-05 Thread Greg Banks
On 05/09/11 20:06, Bron Gondwana wrote:
 On Mon, Sep 05, 2011 at 02:32:40PM +1000, Greg Banks wrote:
 Ok, I'm now convinced my first attempt at annotation quotas sucked too
 hard, here's how I want to re-implement it.  Let me know what you think.
 
  - add a 32b mailbox index header entry to track the storage in bytes
 used by all annotations for the mailbox itself or for messages in the
 mailbox
 
 Why not 64b?  Admittedly hanging gigabytes of annotations off a single
 folders is probably evil,

Do we even have a db backend that would support that?

 but this is the mailbox header - 4 bytes per
 mailbox to not even have to think about it is super-cheap.  We keep a
 handful of blanks around in the header already.
 
  - at header upgrade time, write the special value ~0 into this field,
 meaning unknown
 
 Funky.  If you didn't want to get all special-value about it, you could
 use a flag over in the flags field as well, there are only about 5 of
 the 32 in use so far.

Sure.

  - in mailbox_commit_quota(), if the field is not unknown, then
 calculate the delta in usage and apply to the quota db.
 
 I assume this is done the same way it's done now?

Something like that.

  By taking a snapshot
 of the value at the start, and calculating a diff to apply at the end.
 That was actually a filthy workaround (one of many, dammit) for using
 unsigned datatypes.  If we made this a signed 64 bit datatype as well,
 with a flag off to the side for unknown, then we could actually store
 a diff in the mailbox object directly - and update both this field and
 the quota's value during the commit.

Ok.

  - make reconstruct scan the annotations db to figure out the correct
 value for this new field.
 
 Yep - that makes sense.  Reconstruct already does this for the
 quota_mailbox_used field, doesn't it.

Yes.

  And any other derived
 aggregate values in the header.
 
 This means that the annotation STORE and SETMETADATA paths will be
 updating the quota db in the same place that APPEND et al do,
 mailbox_commit_quota().  This should work around Bug #3529 and also make
 the code neater.
 
 Nice.

Yeah, all I have to do is work out how to futz with the annotate API yet
again :(


-- 
Greg.


Re: MESSAGE quota resource implemention

2011-09-05 Thread Julien Coloos

Le 05/09/2011 06:12, Greg Banks a écrit :


Julien, I think we agreed on everything else, right?  I'm looking
forward to your next iteration.
After picking your 'uquota_t removal' commit, I also removed it on my 
end, and changed the code according to our previous discussions. Adding 
an helper function which fills a quota_t array with the current messages 
usage from mailbox struct made the code a bit cleaner and more generic :)
I still have to look for transferring limits upon DUMP/UNDUMP. And for 
the time being, I still kept the quota.sets[] thing (we will see later 
if we can do it more smartly).



Regards
Julien


Re: MESSAGE quota resource implemention

2011-09-05 Thread Bron Gondwana
On Mon, Sep 05, 2011 at 07:19:00PM +0200, Julien Coloos wrote:
 Le 05/09/2011 12:06, Bron Gondwana a écrit :
 On Mon, Sep 05, 2011 at 02:32:40PM +1000, Greg Banks wrote:
 
   - add a 32b mailbox index header entry to track the storage in bytes
 used by all annotations for the mailbox itself or for messages in the
 mailbox
 Why not 64b?  Admittedly hanging gigabytes of annotations off a single
 folders is probably evil, but this is the mailbox header - 4 bytes per
 mailbox to not even have to think about it is super-cheap.  We keep a
 handful of blanks around in the header already.
 
 ...
 
   - in mailbox_commit_quota(), if the field is not unknown, then
 calculate the delta in usage and apply to the quota db.
 
 This new field seems useful for the quota utility and when setting a
 new quotaroot: it wouldn't need to check each annotation associated
 to the mailbox (or messages) and would rely on that field, as it
 currently does for total message size and now the number of messages
 (with my patch).
 I don't know what other people do with cyrus, but here those two
 situations do not happen that often:
   - quotaroot is usually set once upon creating the user mailbox
   - quota utility is rarely used, and usually on one mailbox at a time
 So maybe I am missing something, but my question would be: is it
 really worth it ?

I would challenge the usually on one mailbox at a time theory - it's
often on one quota root at a time (i.e. - a single user), but also can
be used on the entire server.  Or for someone with a domain quota, on
all the users in a single domain.

This can be quite the race condition if someone has a lot of mailboxes
and a lot of traffic on said mailboxes.  Short of killing their
connections and blocking LMTP traffic, you can't guarantee atomicity
on actions that cross multiple mailboxes.

But - the new field definitely has value.  I like the idea of just
trusting that field in the quota utility and for regular tasks.

Bron.


Re: MESSAGE quota resource implemention

2011-09-02 Thread Bron Gondwana
On Fri, Sep 02, 2011 at 09:37:24AM +1000, Rob Mueller wrote:
 
 Actually, really I'd like to create a new UNIQUEID - and store
 all the files in paths based on uniqueid rather than on folder
 name.  This would not only mean renames could be fast and
 atomic, but that delayed delete would be fast.  The downside is
 a more opaque filesystem layout.  Oh, another upside - file path
 limitations don't exist so much any more.
 
 While UNIQUEID is nice, the opaqueness is annoying. Personally, I
 liked the idea that we talked about a while back which I think was:
 
 $spooldir/a/user/aardvark/user.aardvark/
 $spooldir/a/user/aardvark/user.aardvark.drafts/
 $spooldir/a/user/aardvark/user.aardvark.trash/
 $spooldir/a/user/aardvark/user.aardvark.foo/
 $spooldir/a/user/aardvark/user.aardvark.foo.bar/
 $spooldir/a/user/aardvark/user.aardvark.abc.xyz/

I still have a patch somewhere that does that.

 So you end up with every folder for a user in one dir. This solves
 the current messy handling of sub-dirs (eg currently you have to
 create the intermediate dir /abc/ even though there's no entry in
 mailboxes.db for it), and makes renaming any folder very cheap still
 (because you can do it with a single dir rename, rather than having
 to move each message file), but you don't go completely opaque.

It does give you a 256 character folder tree limit on many operating
systems.

 Delete handling is still easier, just rename to
 DELETED.$oldname.$UNIQUEID or something like that, because it's
 cheap to rename anyway.

Sorry, make that 239, once you make space for timestamps and dots.

 Of course it means re-organising folder layout for every
 installation out there, but maybe we need to bump major versions
 anyway, cyrus 3.0 here we come :)

tools/rehash - I have one of those that can handle this plus all the
other existing formats as well.  It's on a git branch somewhere.

Bron.


Re: MESSAGE quota resource implemention

2011-09-02 Thread Greg Banks
On 01/09/11 22:22, Bron Gondwana wrote:
 On Thu, Sep 01, 2011 at 01:27:00PM +0200, Julien Coloos wrote:
 Le 01/09/2011 03:03, Greg Banks a écrit :

 
 But more generally, the update the quotaroot is atomic-safe,
 because the mailbox doesn't add/remove things from the quotaroot
 racily - but quota -f IS racy.  The only way around that would
 be a dual pass mark and collect thing, where you marked each
 mailbox as we're re-calculating, so don't update the quota usage
 in the first sweep, then came back and removed the mark as you
 read the value.  Tricky, but doable.  The downside is that a
 crash part way through can leave you in a broken state.  But
 that's true of everything.
 
 But I agree that in case of underflow detection throwing a warning
 in syslog might help draw the attention when logs are analysed.
 
 when.  haha.
 
 (maybe at a few sites that care... but for the vast majority of
  sites, if you're depending on them reading syslog you've already
  lost.  Software that understands that and is robust in the face
  of errors is much nicer for the poor suckers on the receiving end
  of all this)

If the software was robust, underflow would not happen and we would not
need to test for it and handle it.  Thus the log messages are not
operational messages intended for the sysadmin, but warnings about
internal Cyrus problems intended for Cyrus developers, and syslog is a
suitable place for them.

How's about this for a strategy?

When a quota resource is first enabled, (i.e. the limit is changed from
UNLIMITED to some finite value), the usage is stored as some special
value which I'll call INDETERMINATE.

Changing a mailbox' quotaroot to an existing quotaroot, resets all the
quotaroot's usages to INDETERMINATE.

Usage deltas applied to the INDETERMINATE value silently yield
INDETERMINATE back.

Usage deltas which would underflow a finite (i.e. not INDETERMINATE)
usage value are clamped with a warning about an internal consistency
problem.

A regularly run process such as cyr_expire finds quotaroots which have
one or more INDETERMINATE usages and runs quota -f on them.

In this model, INDETERMINATE value acts like your dual pass mark.


 
 In the 1st hunk in cmd_append(), at this point in the code I
 believe totalsize = 0, so you could easily pass 0 for the new last
 argument as well.
 Yes at this point totalsize is still 0.
 The current code, which handles MULTIAPPEND, does a preliminary
 check to see if mailbox is not overquota, then receives all
 messages, and finally checks that all messages can fit in the
 mailbox.
 Since we know for sure at least one message is coming, I though it
 could be a good idea to early-check that at least one more message
 would fit in the mailbox before receiving anything. Otherwise we are
 staging new file only to trash it at the end (when overquota).

As far as I can see the only point of that first append_check() call is
to fail early in the case of a permission fail.

 So
 actually I wonder if the code could be updated to check (LITERAL
 parameter) that each message about to be received would fit in the
 mailbox, instead of checking only once at the end ?

Literals, and binary, and urls...

I do think there's an argument to made for moving the maintenance of the
totalsize parameter into the struct appendstate, and once you do that
it's possible to do quota/permission checks at finer grained points.
But why bother?  Why slow down the common case of not exceeding quota?

 
 There are some bugs about this already.  In particularl the opposite
 case, checking that we actually want to append something before
 aborting a sieve delivery - because it may be discarded or redirected
 or even duplicate suppressed anyway.  Something to keep in mind.
 

 
 I'm thinking that there's now three places in the code which take
 a mailbox* and fill out an array of quota diffs, interpreting the
 contents of the struct mailbox.  That should really be
 centralised.
 I'm not sure to see what you have in mind here.
 Are you talking about the places where the QUOTA_STORAGE and
 QUOTA_MESSAGE entries of the quota diff array are computed
 relatively to the 'quota_mailbox_used' and 'exists' index fields of
 the struct mailbox ? 

Yes.


 One place please :)  Ideally I'd like to absorb more of the quota
 stuff into mailbox.c.  Greg and I have some debate about this - how
 much is too much for that file to be doing.  Probably it should be
 abstracted into a couple of layers of stuff - but I really do like
 the consistency of having just a couple of function calls:
 
 mailbox_append_index_record; and
 mailbox_update_index_record
 
 which do all the consistency checking and counter updating inside.
 Plus of course a mailbox_check_quota thing that takes a set of
 quota checks to do and sees if there will be space for the
 planned changes!

After implementing the X-ANNOTATION-STORAGE quota, I'm almost completely
convinced I got it very wrong, and should have left all the quota
updating in 

Re: MESSAGE quota resource implemention

2011-09-02 Thread Bron Gondwana
On Fri, Sep 02, 2011 at 07:36:20PM +1000, Greg Banks wrote:
 If the software was robust, underflow would not happen and we would not
 need to test for it and handle it.  Thus the log messages are not
 operational messages intended for the sysadmin, but warnings about
 internal Cyrus problems intended for Cyrus developers, and syslog is a
 suitable place for them.

In theory, yes - assuming you can keep blackbox control over all the
filesystems that Cyrus is operating over, and the sysadmin never restores
from backup or otherwise screws with any of the underlying files.

 How's about this for a strategy?
 
 When a quota resource is first enabled, (i.e. the limit is changed from
 UNLIMITED to some finite value), the usage is stored as some special
 value which I'll call INDETERMINATE.

What about 'getquota'?  I don't support any solution which leaves getquota
returning bogus values or failing to respond.  That's just icky and
confusing.

I don't think you can avoid two passes, and I don't even think you can
avoid two values during if you really want to be good about it.

Anyway - moving a new folder into a quotaroot is NOT racy.  You just need
to read quota_mailbox_used on the mailbox, then lock the old quota root,
subtract N bytes and unlock it - then lock the NEW quota root, add N bytes
and unlock it again.  No problem.

The only issue is updating WITHOUT changing the quotaroot, which is an
issue because a particular mailbox doesn't know if it's already been counted
in the new quota value or not, so if it should be updating the value.

There are pure ways to do this, that guarantee consistency.  I think the best
way is probably some sort of A/B thing, where you label the quotaroot as A
or B in the mailbox - AND in the quota root.  So the initial state looks like
this:

ROOT: A
A: $usage
B: INVALID

When you want to run a quota -f you set 'B' to zero, and then run the update
logic over all mailboxes, updating B with the value as you go, and setting
the quotaroot in the mailbox to be in state 'B', so it also updates B.
Any mailbox in state 'B' will update both A AND B, because the root is still
in state A.  Mailboxes in state 'A' will only update 'A', because they match
the root.

When you have finished quota -f, both values are being updated simultaneously
by all mailboxes.  You also have two fields which you can compare, and a
guarantee that they were both atomically updated, so if they're not the same
then there was definitely corruption, not just a race condition.  So you can
report that.

Then you update the quota root to say 'ROOT: B', and 'A' invalid.  Or even
just A: zero.  If anything continues to update the wrong field then you
also have corruption (probably a mailbox outside the quotaroot pointing to
it, which is pretty silly)

That's a real, robust solution.  But it's pretty heavy engineering.

 As far as I can see the only point of that first append_check() call is
 to fail early in the case of a permission fail.

It's nice to fail before the client starts uploading.  Failing any later
than that is kinda pointless, because you're not saving the client
bandwidth - which may matter.  The disk IO is unlikely to matter to the
server, since it's a rare case.

 convinced I got it very wrong, and should have left all the quota
 updating in mailbox_commit_quota().  I was trying hard to avoid adding a
 field to the index header to track the storage used by all the
 annotations for the mailbox and for messages in the mailbox; but I'm
 really not happy with the results :(

Well, we're not committed to keeping it that way - it's not as if it's
even in the wild except for some really early adopters of the master
branch, who deserve whatever pain they get (mostly, that's us - and we
know enough to be able to clean up any mess)

Bron.


Re: MESSAGE quota resource implemention

2011-09-01 Thread Greg Banks

Julien Coloos wrote:

Hi,

As discussed on IRC last week, I worked on implementing MESSAGE quota 
resource in cyrus (see RFC 2087; STORAGE resource already being handled).
I created a branch based on Greg's 'annotate' one on github, since his 
work on annotation storage management made mine a lot easier.


Details on the changes I made:

cyrus-imapd:
  - added MESSAGE quota resource management
- updated cunit test
- added 'quotawarnmsg' option which behaviour is similar to 
'quotawarnkb'; warning message shown when selecting folder in IMAP 
tells which resource limit was triggered
- added 'autocreatequotamsg' option, to set MESSAGE limit 
(unlimited by default) when user auto-creates its mailbox

  - xfer transfers all non-unlimited resources
  - added helper function to compute annotation storage usage for a 
given mailbox

  - changed the way quota entries are read/written
- resources presence in fetched entry is remembered
- when writing entry, only present resources are written
- when setting resources limits, entries which were not present 
upon fetching are now marked as present and their current usage is 
computed
  - quota utility lists and computes (-f) all resources associated to 
mailbox
The branch 'quotamessage/gnb/annotate' is available here: 
git://github.com/worldline-messaging/cyrus-imapd.git. It is based on 
Greg's 'annotate' branch on github.




Reviewed:

commit 2d4fb0 Added quota MESSAGE resource (RFC 2087) management.

imap/quota.h

Why #include stdint.h?  It's not like we're using any of the typedefs 
there.  There's a good argument to be made that we *should* do so, but 
until then the #include doesn't seem useful.


I'm unconvinced about the need for quota.sets[].  After a lot of reading 
it seems to me that (quota.sets[x]) and (quota.limits[x] != 
QUOTA_UNLIMITED) are identical or should be.  Also, are you setting 
sets[] anywhere except when loading from the db?


I like quota_update_useds(), it does seem necessary to be able to update 
several usages together.


Otherwise, looks good.

imap/quota_db.c

In quota_write(), your sanity check should goto the buf_free() at the 
end of the function rather than return early.  Otherwise, looks good.


imap/append.c

Looks good.

imap/append.h

Looks good

imap/cyr_virusscan.c

Looks good

imap/imap_err.et

Looks good

imap/imapd.c

In the 1st hunk in cmd_append(), at this point in the code I believe 
totalsize = 0, so you could easily pass 0 for the new last argument as well.


Otherwise looks good.

BTW, reading your code I realised a number of places I failed to account 
for X-ANNOTATION-STORAGE quota properly :(  I think I'll have to change 
a number of those new pairs of quota_t arguments to a quota_t 
quotadiff[QUOTA_NUMRESOURCES].


imap/index.c

Looks good

imapd/lmtpd.c

Looks good

imap/lmtpengine.c

Looks good

imap/lmtpengine.h

Looks good


imap/mailbox.c

I'm not sure it's a good idea to set mailbox-i.exists = 0 in 
mailbox_delete().  Perhaps it would be better to check for 
(mailbox-i.options  OPT_MAILBOX_DELETED) when sampling 
mailbox-i.exists in mailbox_commit_quota().


Otherwise, looks good.

imap/mailbox.h

Looks good

imap/mbdump.c

Looks good

imap/nntpd.c

Looks good

lib/imapoptions

Looks good.

commit 0e613c Removed unused code.

Looks good.

commit c8aeef Compute each quota resource upon setting it for the first 
time.


commit b30c10 List and compute all set quota resources.

Oh, here's why you include stdint.h:  you're using PRIdMAX and not 
QUOTA_REPORT_FMT, which is used only in imapd.c.


I'm thinking that there's now three places in the code which take a 
mailbox* and fill out an array of quota diffs, interpreting the contents 
of the struct mailbox.  That should really be centralised.



--
Greg.



Re: MESSAGE quota resource implemention

2011-09-01 Thread Julien Coloos

Hi, thanks for your comments and reviewing the code

Le 01/09/2011 03:03, Greg Banks a écrit :
I think that there are two things that may also be done concerning 
quota entries:
  - always recompute resource usage when changing its limit from 
unlimited to bounded
- currently it is only done once when setting the usage limit 
for the first time
- that way, it may not be necessary to track resource presence 
when reading/writing quota entries
- but maybe it could be too time consuming in some cases, since 
it seems to be possible to associate a quota resource to a whole 
domain (recomputing usage for all mailboxes would then take some time)

  - do not write resource in quota entry when its usage is unlimited
- except for the STORAGE resource, for backward compatibility
- would also help keeping quota entries size to the bare minimum

What do you think ?


This one's tough, I wasn't sure what to do.  However, I'm happy to 
leave it to the administrator to have to manually run quota -f (maybe 
twice!) if they set a quota on a resource that is already being used.  
I'm unconvinced that automatically doing the equivalent of -f as a 
side effect of setting the first limit is necessary or wise.  Perhaps 
we should a) document it clearly and b) detect the situation and put 
an obvious message saying something like you may need to run quota -f 
... where the human making the change will see it.  Also, such a 
message might be useful when usage underflow is detected.


My concern here is that there are times when quota changes are not done 
manually by a human, but happen automatically as part of platform 
provisioning scheme. For example, on many platforms we manage, if a user 
subscribes to some offer his/her mailbox quota will automatically be 
upgraded (IMAP action). So, at least in our case, I though it might be 
useful.


But I agree that in case of underflow detection throwing a warning in 
syslog might help draw the attention when logs are analysed.




Le 01/09/2011 10:15, Greg Banks a écrit :

commit 2d4fb0 Added quota MESSAGE resource (RFC 2087) management.

imap/quota.h

Why #include stdint.h?  It's not like we're using any of the 
typedefs there.  There's a good argument to be made that we *should* 
do so, but until then the #include doesn't seem useful.


As you realised in the next commit, I removed QUOTA_REPORT_FMT 
(appearing only in quota.c) to use PRIdMAX/PRIuMAX and cast to 
(u)intmax_t upon formatting. Using those appeared as a future-proof 
solution in the discussion about 32/64-bit variables thread I started 
some time ago :)
So yeah, the include could have been done *only* in quota.c, but I 
remembered that Bron wanted to get rid of (u)quota_t later. So I decided 
to put the include in quota.h for later use.


I'm unconvinced about the need for quota.sets[].  After a lot of 
reading it seems to me that (quota.sets[x]) and (quota.limits[x] != 
QUOTA_UNLIMITED) are identical or should be.  Also, are you setting 
sets[] anywhere except when loading from the db?


These would be perfectly identical if implementing the last point I 
brought up (not writing the resource in the quota entry if it's usage is 
unlimited). But then I need to compute actual quota usage at the right 
time. And when I saw that the code would allow to associate a quota to a 
whole domain (ouch, performance issue foreseen if quota is to be 
computed too often), I decided on the solution I proposed: compute it 
only once the first time it is set, hence the need of quota.sets[] 
(since when changing from bounded to unlimited limit, the resource is 
still kept in the quota entry to prevent recomputing its usage next time).


Otherwise, sets[] is updated when setting quota limits (in 
mboxlist_setquotas).



imap/quota_db.c

In quota_write(), your sanity check should goto the buf_free() at the 
end of the function rather than return early.  Otherwise, looks good.



Ah, forgot that one.
Actually I took a shortcut here, since buf_free should not be necessary 
with the current code (which does append strings; so if the len is still 
0, no string was malloc'ed). But for sanity reasons you are right, it is 
wiser to goto the end of the function and do the expected cleanup stuff. 
This shall be future-proof.



imap/imapd.c

In the 1st hunk in cmd_append(), at this point in the code I believe 
totalsize = 0, so you could easily pass 0 for the new last argument as 
well.

Yes at this point totalsize is still 0.
The current code, which handles MULTIAPPEND, does a preliminary check to 
see if mailbox is not overquota, then receives all messages, and finally 
checks that all messages can fit in the mailbox.
Since we know for sure at least one message is coming, I though it could 
be a good idea to early-check that at least one more message would fit 
in the mailbox before receiving anything. Otherwise we are staging new 
file only to trash it at the end (when overquota). So actually I wonder 
if the code could be 

Re: MESSAGE quota resource implemention

2011-09-01 Thread Bron Gondwana
On Thu, Sep 01, 2011 at 01:27:00PM +0200, Julien Coloos wrote:
 Le 01/09/2011 03:03, Greg Banks a écrit :
 This one's tough, I wasn't sure what to do.  However, I'm happy to
 leave it to the administrator to have to manually run quota -f
 (maybe twice!) if they set a quota on a resource that is already
 being used.  I'm unconvinced that automatically doing the
 equivalent of -f as a side effect of setting the first limit is
 necessary or wise.  Perhaps we should a) document it clearly and
 b) detect the situation and put an obvious message saying
 something like you may need to run quota -f ... where the human
 making the change will see it.  Also, such a message might be
 useful when usage underflow is detected.
 
 My concern here is that there are times when quota changes are not
 done manually by a human, but happen automatically as part of
 platform provisioning scheme. For example, on many platforms we
 manage, if a user subscribes to some offer his/her mailbox quota
 will automatically be upgraded (IMAP action). So, at least in our
 case, I though it might be useful.

A provisioning system could run quota -f itself after making the
change, of course.

But more generally, the update the quotaroot is atomic-safe,
because the mailbox doesn't add/remove things from the quotaroot
racily - but quota -f IS racy.  The only way around that would
be a dual pass mark and collect thing, where you marked each
mailbox as we're re-calculating, so don't update the quota usage
in the first sweep, then came back and removed the mark as you
read the value.  Tricky, but doable.  The downside is that a
crash part way through can leave you in a broken state.  But
that's true of everything.

 But I agree that in case of underflow detection throwing a warning
 in syslog might help draw the attention when logs are analysed.

when.  haha.

(maybe at a few sites that care... but for the vast majority of
 sites, if you're depending on them reading syslog you've already
 lost.  Software that understands that and is robust in the face
 of errors is much nicer for the poor suckers on the receiving end
 of all this)

 Le 01/09/2011 10:15, Greg Banks a écrit :
 commit 2d4fb0 Added quota MESSAGE resource (RFC 2087) management.
 
 imap/quota.h
 
 Why #include stdint.h?  It's not like we're using any of the
 typedefs there.  There's a good argument to be made that we
 *should* do so, but until then the #include doesn't seem useful.
 
 As you realised in the next commit, I removed QUOTA_REPORT_FMT
 (appearing only in quota.c) to use PRIdMAX/PRIuMAX and cast to
 (u)intmax_t upon formatting. Using those appeared as a future-proof
 solution in the discussion about 32/64-bit variables thread I
 started some time ago :)
 So yeah, the include could have been done *only* in quota.c, but I
 remembered that Bron wanted to get rid of (u)quota_t later. So I
 decided to put the include in quota.h for later use.

I don't mind keeping quota_t around.  Using it makes it a bit
clearer that you intend to store a quota in this variable.  It's
like free documentation.  Just uquota_t is pointless in a 64bit
required world.

 In the 1st hunk in cmd_append(), at this point in the code I
 believe totalsize = 0, so you could easily pass 0 for the new last
 argument as well.
 Yes at this point totalsize is still 0.
 The current code, which handles MULTIAPPEND, does a preliminary
 check to see if mailbox is not overquota, then receives all
 messages, and finally checks that all messages can fit in the
 mailbox.
 Since we know for sure at least one message is coming, I though it
 could be a good idea to early-check that at least one more message
 would fit in the mailbox before receiving anything. Otherwise we are
 staging new file only to trash it at the end (when overquota). So
 actually I wonder if the code could be updated to check (LITERAL
 parameter) that each message about to be received would fit in the
 mailbox, instead of checking only once at the end ?

There are some bugs about this already.  In particularl the opposite
case, checking that we actually want to append something before
aborting a sieve delivery - because it may be discarded or redirected
or even duplicate suppressed anyway.  Something to keep in mind.

 imap/mailbox.c
 
 I'm not sure it's a good idea to set mailbox-i.exists = 0 in
 mailbox_delete().  Perhaps it would be better to check for
 (mailbox-i.options  OPT_MAILBOX_DELETED) when sampling
 mailbox-i.exists in mailbox_commit_quota().
 Hmm, I wondered about it too. At the end of mailbox_delete, upon
 calling mailbox_close, all physical files in the mailbox are
 deleted, and it didn't seem like 'exists' was used anywhere
 in-between, so I thought it would be alright.
 But given the complexity of the code at that point (lot of cleaning
 when deleting the mailbox), and unforeseen future usages, I guess
 you are right. Actually maybe even quota_mailbox_use shall be
 treated that way ?

FYI - I'm planning to be a bit more lazy about 

Re: MESSAGE quota resource implemention

2011-09-01 Thread Michael Menge

Quoting Bron Gondwana br...@fastmail.fm:


Actually, really I'd like to create a new UNIQUEID - and store
all the files in paths based on uniqueid rather than on folder
name.  This would not only mean renames could be fast and
atomic, but that delayed delete would be fast.  The downside is
a more opaque filesystem layout.  Oh, another upside - file path
limitations don't exist so much any more.



How do you restore a folder from a filebased backup?
Or exclude some folders form backup (e.g. trash folders)?




M.MengeTel.: (49) 7071/29-70316
Universität Tübingen   Fax.: (49) 7071/29-5912
Zentrum für Datenverarbeitung  mail:  
michael.me...@zdv.uni-tuebingen.de

Wächterstraße 76
72074 Tübingen

smime.p7s
Description: S/MIME Signatur


Re: MESSAGE quota resource implemention

2011-09-01 Thread Rob Mueller



Actually, really I'd like to create a new UNIQUEID - and store
all the files in paths based on uniqueid rather than on folder
name.  This would not only mean renames could be fast and
atomic, but that delayed delete would be fast.  The downside is
a more opaque filesystem layout.  Oh, another upside - file path
limitations don't exist so much any more.


While UNIQUEID is nice, the opaqueness is annoying. Personally, I liked the 
idea that we talked about a while back which I think was:


$spooldir/a/user/aardvark/user.aardvark/
$spooldir/a/user/aardvark/user.aardvark.drafts/
$spooldir/a/user/aardvark/user.aardvark.trash/
$spooldir/a/user/aardvark/user.aardvark.foo/
$spooldir/a/user/aardvark/user.aardvark.foo.bar/
$spooldir/a/user/aardvark/user.aardvark.abc.xyz/

So you end up with every folder for a user in one dir. This solves the 
current messy handling of sub-dirs (eg currently you have to create the 
intermediate dir /abc/ even though there's no entry in mailboxes.db for it), 
and makes renaming any folder very cheap still (because you can do it with a 
single dir rename, rather than having to move each message file), but you 
don't go completely opaque.


Delete handling is still easier, just rename to DELETED.$oldname.$UNIQUEID 
or something like that, because it's cheap to rename anyway.


Of course it means re-organising folder layout for every installation out 
there, but maybe we need to bump major versions anyway, cyrus 3.0 here we 
come :)


Rob



Re: MESSAGE quota resource implemention

2011-08-31 Thread Bron Gondwana
On Wed, Aug 31, 2011 at 05:50:36PM +0200, Julien Coloos wrote:
 Things that may be worth noting:
   - DUMP/UNDUMP currently does nothing special about MESSAGE or
 X-ANNOTATION-STORAGE quota resources
 - should it be transferred ?

I'd like to replace DUMP/UNDUMP with replication protocol
communications for XFER.

 - without breaking backward compatibility, limits could only be
 transferred through a 'fake' file entry, as for annotations

But for now, that's definitely the pragmatic way to go.

   - quota usage is currently stored in a uquota_t variable, and
 delta is computed as quota_t; so theorically there could be overflow
 issues if quota usage to add/substract cannot be held in a quota_t;
 in practice it should be unlikely since that would mean a usage of
 over 2^63-1

I propose we scrap uquota_t - it is un-necessary for the medium-term
future, now that we're requiring 64 bit types.

 I think that there are two things that may also be done concerning
 quota entries:
   - always recompute resource usage when changing its limit from
 unlimited to bounded

That's not too expensive really - we do it in the quota tool anyway.

The bigger issue is locking.  You need to be very careful of deadlock
situations if you do this.  We have that issue in the quota tool now.

The only _real_ way to do it reliably would be to go through and
remove all the quota roots from each mailbox, then go through again
one at a time and add them back, recording usage as you go.  Otherwise
there's no way of knowing if your end result is consistent.

 - that way, it may not be necessary to track resource presence
 when reading/writing quota entries
 - but maybe it could be too time consuming in some cases, since
 it seems to be possible to associate a quota resource to a whole
 domain (recomputing usage for all mailboxes would then take some
 time)

Yeah, now this is what you could call all sorts of bad names.  It's
really not sane to keep a quota counter at EVERY possible level in
the tree just incase someone wants to hang a limit there later.
You would get insane amounts of contention on the domain quota
lock for a busy domain, which would be un-necessary.

 PS: should I open a new bugzilla ticket for that ?

Why not - numbers are cheap.

 PPS:
 cunit: on my computer cunit tests succeed except the 'foreach' one
 in the 'quota' suite which timeouts (it seems messing with 4096
 quotalegacy is too much for my computer).

There's stuff you can do with fd limits.  Check out /etc/pam.d/* for
pam_limits.so, and then /etc/security/limits.conf.  I just had to
learn about that stuff today!

Bron.