On Mon, 2004-11-08 at 16:22, Matthew T. O'Connor wrote: > Hans Kristian Rosbach wrote: > > >On Fri, 2004-11-05 at 21:32, Sean Chittenden wrote: > > > > > >>CREATE TABLE headers ( > >> id SERIAL PRIMARY KEY, > >> name TEXT NOT NULL > >>); > >>CREATE UNIQUE INDEX headers_name_udx ON headers(name); > >> > >>CREATE TABLE header_msg_map ( > >> id SERIAL8 PRIMARY KEY, > >> physmessage_id INT NOT NULL, > >> header_id INT NOT NULL, > >> value TEXT NOT NULL, > >> hdr_order INT2 NOT NULL, > >> FOREIGN KEY (header_id) REFERENCES headers(id) > >>); > >>CREATE INDEX header_msg_map_idx ON header_msg_map(physmessage_id, > >>header_id); > >> > >> > > > >Well, I agree that you would save a little space. > > > >But I do not however think that it is worth the extra join and > >complexity. The join will increase dbload and tmp-file-usage > >and could make the dbserver overloaded at a far lower load. > > > >Space is cheap, even in huge amounts. IO-speed is NOT cheap. > >pgsql will also compress varchar fields, and a b-tree index on > >it will provide for the searching without decompressing. > > > On the flip I think it will reduce overall DB load since it would search > the very large header_msg_map table using integer comparison rather than > string. A moderate size DBMail database is likely to have 1M messages > in it which would translate to 20-30M headers so that could be a not > only a significant space savings, but a huge performance win when > searching. Also, space saving translates to IO saving.
No, b-tree indexing allows you to not search the whole table for matches. The index search is not a string search and very fast. Also, remember that you add several bytes to the database by adding the id in both tables (8 bytes each) plus the separate indexes on both of id. (Btw, I ussume you meant SERIAL8 in both tables). So unless you can save _on_average_ atleast (16 bytes + index + table overhead) then it's actually going to increase database size. And how many header names are 16+ bytes? Not many.. -=Dead2=-