On Monday 07 June 2004 3:08 pm, Rick Widmer wrote:
> Jeremy Kitchen wrote:
> > On Monday 07 June 2004 02:22 pm, Jeremy Kitchen wrote:
> >>>As an ex-database admin, I wrote the many-domains=no option as
> >>>an optimization to the database schema to save disk space.
> >>>
> >>>Consider hosting one domain with 1 million email accounts.
> >>>The pw_domain(64 char) field is redundant (only one domain). So
> >>>we would be wasting 64MB of space in the database. So instead
> >>>the table is named after the domain.
> >>
> >>I was about to ask you this but you ran off before I could.. so I'll just
> >>post for discussion purposes.
> >>
> >>Isn't this what the varchar field is for?
>
> Yes it is, but the current database layout uses char fields.  Still if
> you have one domain with 1 million accounts, you have at least 4MB of
> duplicated domain names, if the domain name is a.au, better than the
> 64MB the current char field would use, but not as good as 0 bytes if the
> table is named for the domain and you don't store the domain name at
> all.  It is interesting in theory, but there probably aren't many sites
> like this.
>
> It does bring up an interesting question...  would it be a good idea to
> change from char to varchar for all the fields? 

Standard database design says to *never* use varchars. And
if you do use them, you better have a darn good reason.
It's easy to spot a newbie doing database design, 
all fields are varchars. 

Varchars make it very difficult for the SQL engine to optimize
disk to memory paging. Which in practice means no optimization
for tables with varchars. When each row is a fixed length, the engine
can calculate how many rows will fit in a memory block. Then it can
do one disk read per memory block (this is the optimization). 
With varchar fields the engine needs to read in a section of disk into
a temporary memory buffer. Then parse the varchar lengths to calculate
row lengths, one row at a time. The end result is more disk reads.
And disk I/O is the bottleneck in databases. So it all boils down to 
significant performance improvements of fixed lenght rows over variable
lenght rows. 

> It should give a 
> substantial reduction in database size, and I don't think it will be
> that much slower. 
> Someone _should_ be able to do an alter table and    
> find out if anything breaks...
>
>
> Rick
>
> p.s.  It does bring up a real question - does anyone actually use
> separate tables for each domain anymore?  It would be nice if we could
> remove the option some day.  (It causes a number of four state ifdef
> structures that are kind of messy.)

I know what you mean. I kinda wish I never wrote the two versions.
But removing it would break backward compatibility.

Ken

Reply via email to