Re: [vchkpw] vpopmail development

2009-01-12 Thread Manvendra Bhangui
On Fri, 2009-01-09 at 08:57 -0600, Matt Brookings wrote:
> This would not work because users can be deleted out of the hash tree
> anywhere.  It appears your patch assumes a FILO ordering of user additions
> and deletions.
I did not explain this properly. The ordering would be FIFO.

> If the hashes, 'a' through 'd' existed, and the 'b' hash directory cleared
> out, your method would fail to backfill correctly.
Let's take an example. Suppose:
there are 100 users (with 100 directories) in /var/vpopmail/domains
there are 100 users (with 100 directories) in /var/vpopmail/domains/0
there are 100 users (with 100 directories) in /var/vpopmail/domains/1
there are 100 users (with 100 directories) in /var/vpopmail/domains/2
there are  50 users (with  50 directories) in /var/vpopmail/domains/3

Now let's say I delete a user who has a directory
in /var/vpopmail/domains/1.
The backfill code will put the entry '1' on the first line of the file
dir_control_free.
Also, let us say that we delete two users in /var/vpopmail/domains/2.
The backfill code in vdeluser will put the entry '2' twice in the file
dir_control_free.

So after deleting 3 users, the file dir_control_free will have 3 lines
1
2
2
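
The delete-side bookkeeping described above can be sketched roughly as
follows. This is only an illustration, not the code from the patch: the
function name record_free_slot(), the explicit file path argument, and the
use of flock() for locking are all assumptions made for the example.

```c
#define _DEFAULT_SOURCE   /* expose flock() under strict -std= modes on glibc */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper: append one hash directory name (for example "1")
 * as a new line at the end of the free list file.
 * Returns 0 on success, -1 on error. */
int record_free_slot(const char *free_file, const char *hash)
{
    assert(free_file != NULL && hash != NULL);

    int fd = open(free_file, O_WRONLY | O_APPEND | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    /* Hold an exclusive lock so concurrent deletes do not interleave. */
    if (flock(fd, LOCK_EX) != 0) {
        close(fd);
        return -1;
    }

    size_t len = strlen(hash);
    int ok = write(fd, hash, len) == (ssize_t)len &&
             write(fd, "\n", 1) == 1;

    close(fd);            /* closing the descriptor also releases the lock */
    return ok ? 0 : -1;
}
```

Calling it with "1", "2" and "2" in that order leaves the three lines
shown above in the file, oldest deletion first.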


So now we have 99 users in /var/vpopmail/domains/1
and we have 98 users in /var/vpopmail/domains/2

Now the modified vadduser will call a function called backfill(), which
will open this file, lock it, pick up the first line, delete that
line, and return the value as user_hash.

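#ifdef USERS_BIG_DIR
  /* go into a user hash dir if required */
  /* try a freed slot first; backfill() returns NULL when none is left */
  if (!(user_hash = backfill(domain)))
  {
    /* no freed slot: fall back to the regular dir_control sequence */
    open_big_dir(domain, uid, gid);
    user_hash = next_big_dir(uid, gid);
    close_big_dir(domain, uid, gid);
    chdir(user_hash);
  }
#endif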

Each time the function backfill() is called it will deplete the file
dir_control_free by one line and will always return the first line as
the user_hash. When all lines get depleted, backfill() will return NULL
in which case the regular dir_control will again come into effect and
start from where it had left earlier.
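
A rough sketch of that "take the first line, rewrite the rest" behaviour
is below. It is not the backfill() from the patch: the name
pop_free_slot(), the caller-supplied output buffer, the explicit file path
and the flock() locking are assumptions chosen to keep the example
self-contained.

```c
#define _DEFAULT_SOURCE   /* expose flock()/ftruncate() under strict -std= modes */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/file.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: remove the first line of the free list and copy it
 * into out. Returns out, or NULL when the list is missing, empty,
 * malformed (blank or over-long first line), or an I/O error occurs. */
char *pop_free_slot(const char *free_file, char *out, size_t outlen)
{
    assert(free_file != NULL && out != NULL && outlen > 0);

    int fd = open(free_file, O_RDWR);
    if (fd < 0)
        return NULL;                 /* no free list: nothing to backfill */

    char *buf = NULL, *result = NULL;
    struct stat st;
    if (flock(fd, LOCK_EX) != 0 || fstat(fd, &st) != 0 || st.st_size == 0)
        goto done;

    size_t size = (size_t)st.st_size;
    buf = malloc(size);
    if (buf == NULL || read(fd, buf, size) != (ssize_t)size)
        goto done;

    /* The first line is the slot to hand out. */
    char *nl = memchr(buf, '\n', size);
    size_t linelen = nl ? (size_t)(nl - buf) : size;
    if (linelen == 0 || linelen >= outlen)
        goto done;

    /* Write everything after the first newline back to the start of the
     * file, then cut the file down to the new length. */
    size_t skip = nl ? linelen + 1 : size;
    size_t rest = size - skip;
    if (lseek(fd, 0, SEEK_SET) != 0 ||
        write(fd, buf + skip, rest) != (ssize_t)rest ||
        ftruncate(fd, (off_t)rest) != 0)
        goto done;

    memcpy(out, buf, linelen);
    out[linelen] = '\0';
    result = out;

done:
    free(buf);
    close(fd);                       /* also releases the lock */
    return result;
}
```

Note that every call rewrites the whole remainder of the file, so the cost
of one add grows with the length of the free list.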

The advantage of this method is that you can use the find command to
generate the missing directories in dir_control_free to catch up with
the actual dir_control.
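
For a rebuild done in C rather than with find, the counting step might look
roughly like this. It is a sketch under assumptions: the helper names are
made up, a user directory is recognised by containing a Maildir
subdirectory (so nested hash directories are not counted), and the limit of
100 is passed in by the caller.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: count entries of hash_dir that look like user
 * directories, i.e. that contain a "Maildir" subdirectory.
 * Returns the count, or -1 if hash_dir cannot be opened. */
int count_user_dirs(const char *hash_dir)
{
    assert(hash_dir != NULL);

    DIR *d = opendir(hash_dir);
    if (d == NULL)
        return -1;

    int count = 0;
    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        if (strcmp(e->d_name, ".") == 0 || strcmp(e->d_name, "..") == 0)
            continue;

        char path[4096];
        int n = snprintf(path, sizeof path, "%s/%s/Maildir",
                         hash_dir, e->d_name);
        if (n < 0 || (size_t)n >= sizeof path)
            continue;                /* path too long: skip this entry */

        struct stat st;
        if (stat(path, &st) == 0 && S_ISDIR(st.st_mode))
            count++;
    }
    closedir(d);
    return count;
}

/* Number of lines this hash directory should contribute to the free
 * list, given a per-directory limit. Returns -1 on error. */
int free_slots(const char *hash_dir, int limit)
{
    int used = count_user_dirs(hash_dir);
    if (used < 0)
        return -1;
    return used < limit ? limit - used : 0;
}
```

A rebuild tool would call free_slots() for each hash directory and write
that many copies of the directory name into dir_control_free.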

Another way to explain this is that when backfill is in operation,
dir_control stops working and when backfill() gets depleted and stops
working, dir_control starts working





Re: [vchkpw] vpopmail development

2009-01-12 Thread Joshua Megerman
On Monday 12 January 2009 07:48:17 am ISP Lists wrote:
> Can someone please provide a brief discussion as to when a vpopmail hashed
> folder tree becomes big enough to warrant backfilling?  Or, is big
> just one concern amongst others such as: rate of deletes and adds,
> filesystem choice...
> I'm not quite picking up why the backfill is important.

Well, I don't know what other people are considering too big, but I actually 
wrote a backfill patch when I was working at a medium-sized college.  We kept 
all 62 top-level hash directories on separate partitions, but didn't ever 
want to go to second level hashes - and with ~1200 adds (incoming freshmen) 
and deletes (outgoing seniors) every year, this became an issue pretty quick.  
The other issue with backfill is that the current implementation makes it 
so that you can easily exceed the 100 users per hash dir limit by deleting 
users from prior hash dirs and then adding new ones since the only check for 
a new hash dir is total users/100.  My patch and the reasons therefore can 
be found at 
http://sourceforge.net/tracker/index.php?func=detail&aid=1619600&group_id=85937&atid=577800.

The main reason it's not currently slated for inclusion is that it's for the 
mysql backend only, and whatever process is used to provide backfill must be 
available for all backends.

One last note - the idea of maintaining a list of backfill slots in a text 
file is a pretty good one, but it still doesn't address the issue of not 
properly calculating the number of users in a directory...

Josh
-- 
Joshua Megerman
SJGames MIB #5273 - OGRE AI Testing Division
You can't win; You can't break even; You can't even quit the game.
  - Layman's translation of the Laws of Thermodynamics
j...@honorablemenschen.com




Re: [vchkpw] vpopmail development

2009-01-12 Thread Matt Brookings
Manvendra Bhangui wrote:
> Now let say I delete a user who has a directory
> in /var/vpopmail/domains/1
> The backfill code will put the entry '1' in the first line in the file
> dir_control_free.
>
> So after deleting 3 users, the file dir_control_free will have 3 lines
> 1
> 2
> 2
> Each time the function backfill() is called it will deplete the file
> dir_control_free by one line and will always return the first line as
> the user_hash. When all lines get depleted, backfill() will return NULL
> in which case the regular dir_control will again come into effect and
> start from where it had left earlier.

Okay.  I can definitely see how this would work.  It is a reasonable
solution, and I'd be very interested to see a completed patch against
the CVS head.

One comment: it's okay for user deletion to be an expensive call, since it
happens far less often than queries for user information.  However, on a
system with many users (for instance, one where the hash levels have tripled
up), the dir_control_free file would become very large, and your solution
requires a rewrite of that file occasionally.

It would be interesting to see a more efficient method where duplicates,
as in your example, the hash directory 2, could be listed a single time.
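
One way to picture such a compact form is a list of (hash directory, free
count) pairs instead of one line per freed slot. The sketch below is only
an in-memory illustration with made-up names and fixed sizes; it says
nothing about how the pairs would be stored on disk.

```c
#include <assert.h>
#include <string.h>

#define HASH_NAME_MAX 16   /* assumed upper bound on a hash dir name */
#define MAX_ENTRIES   64   /* assumed upper bound on distinct entries */

struct free_entry { char hash[HASH_NAME_MAX]; int count; };
struct free_list  { struct free_entry e[MAX_ENTRIES]; size_t n; };

/* Record one freed slot in hash. A directory that is already listed
 * just has its counter bumped. Returns 0 on success, -1 if full or the
 * name is too long. */
int free_list_add(struct free_list *fl, const char *hash)
{
    assert(fl != NULL && hash != NULL);
    if (strlen(hash) >= HASH_NAME_MAX)
        return -1;

    for (size_t i = 0; i < fl->n; i++) {
        if (strcmp(fl->e[i].hash, hash) == 0) {
            fl->e[i].count++;
            return 0;
        }
    }
    if (fl->n == MAX_ENTRIES)
        return -1;

    strcpy(fl->e[fl->n].hash, hash);
    fl->e[fl->n].count = 1;
    fl->n++;
    return 0;
}

/* Hand out one slot from the oldest listed directory. out must hold at
 * least HASH_NAME_MAX bytes. Returns out, or NULL when the list is empty. */
const char *free_list_take(struct free_list *fl, char *out)
{
    assert(fl != NULL && out != NULL);
    if (fl->n == 0)
        return NULL;

    strcpy(out, fl->e[0].hash);
    if (--fl->e[0].count == 0) {
        /* Directory is full again: drop its entry. */
        memmove(&fl->e[0], &fl->e[1], (fl->n - 1) * sizeof fl->e[0]);
        fl->n--;
    }
    return out;
}
```

With the deletions from the example (1, 2, 2) this holds two entries
instead of three lines. The trade-off is that ordering becomes "oldest
directory first" rather than strictly "oldest deletion first": deletions in
the order 1, 2, 1 would be handed back as 1, 1, 2.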

Remember that this feature does not yet exist, and that there are probably
many systems with backfilling needs that go back years.  Potentially this
patch could hit a system with four levels of hashing simply because there's
been a lot of additions and deletions.  If the backfill patch doesn't take
this into consideration, we may need to consider writing some sort of
utility to analyze and clean up a system that is overhashed.

> The advantage of this method is that you can use the find command to
> generate the missing directories in dir_control_free to catch up with
> the actual dir_control.
>
> Another way to explain this is that when backfill is in operation,
> dir_control stops working and when backfill() gets depleted and stops
> working, dir_control starts working

Agreed.
-- 
/*
Matt Brookings m...@inter7.com   GnuPG Key D9414F70
Software developer Systems technician
Inter7 Internet Technologies, Inc. (815)776-9465
*/


Re: [vchkpw] vpopmail development

2009-01-12 Thread Matt Brookings
ISP Lists wrote:
> Can someone please provide a brief discussion as to when a vpopmail hashed
> folder tree becomes big enough to warrant backfilling?  Or, is big
> just one concern amongst others such as: rate of deletes and adds,
> filesystem choice...
> I'm not quite picking up why the backfill is important.

You've got it backwards.  Backfilling becomes important when adding users.
vpopmail hashes directories at around 100 users per directory.  It also does
this with domain directories.  The problem is that the hashing does not take
user removal into account.  If you add 1000 users, and delete the first 500,
the hashing leaves empty hash directories and continues to add new ones, rather
than re-using previously created hash directories that are no longer full.
-- 
/*
Matt Brookings m...@inter7.com   GnuPG Key D9414F70
Software developer Systems technician
Inter7 Internet Technologies, Inc. (815)776-9465
*/


Re: [vchkpw] vpopmail development

2009-01-12 Thread Matt Brookings
Joshua Megerman wrote:
> One last note - the idea of maintaining a list of backfill slots in a text
> file is a pretty good one, but it still doesn't address the issue of not
> properly calculating the number of users in a directory...

What are you referring to when you say it doesn't properly calculate the number
of users?  The current hashing structure keeps track of additions only.  Once
users are removed, it is no longer up to date with correct user counts.  That's
what we're addressing with this proposed patch.
-- 
/*
Matt Brookings m...@inter7.com   GnuPG Key D9414F70
Software developer Systems technician
Inter7 Internet Technologies, Inc. (815)776-9465
*/


Re: [vchkpw] vpopmail development

2009-01-12 Thread DAve

Matt Brookings wrote:

> Remember that this feature does not yet exist, and that there are probably
> many systems with backfilling needs that go back years.  Potentially this
> patch could hit a system with four levels of hashing simply because there's
> been a lot of additions and deletions.  If the backfill patch doesn't take
> this into consideration, we may need to consider writing some sort of
> utility to analyze and clean, a system that is overhashed.


My system would be one of those, here are the stats from just one domain 
after 4 years of use. I have been putting off hacking together a Perl 
script to move everything around and update the MySQL tables. Honestly, 
I cannot say there is any performance hit even with the dirs this messed up.


[r...@newnfs:/usr/local/scripts/old-scripts]# ./dircheck.sh tls.net
dir 0 -35
dir 1 -38
dir 2 -36
dir 3 -30
dir 4 -32
dir 5 -38
dir 6 -32
dir 7 -33
dir 8 -38
dir 9 -33
dir A -31
dir B -26
dir C -45
dir D -32
dir E -32
dir F -19
dir G -36
dir H -42
dir I -39
dir J -30
dir K -34
dir L -30
dir M -33
dir N -31
dir O -33
dir P -26
dir Q -27
dir R -24
dir S -29
dir T -31
dir U -32
dir V -25
dir W -38
dir X -45
dir Y -30
dir Z -30
dir a -31
dir b -11
dir c -31
dir d -36
dir e - 3
dir f - 2
dir g - 5
dir h -64
dir i -14
dir j -13
dir k -13
dir l -13
dir m -26
dir n -17
dir o -36
dir p -16
dir q -17
dir r -33
dir s -23
dir t -30
dir u -   620
dir v -   759

DAve
--
The whole internet thing is sucking the life out of me,
there ain't no pony in there.
