> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] Behalf Of Greg
> Sent: Wednesday, October 08, 2003 3:11 PM
> To: PgSQL Performance ML
> Subject: Re: [PERFORM] Compare rows
> Josh Berkus wrote:
> > Greg,
> >>The data represents metrics at a point in time on a system for
> >>network, disk, memory, bus, controller, and so-on. Rx, Tx, errors,
> >>speed, and whatever else can be gathered.
> >>We arrived at this one 642 column table after testing the whole
> >>process from data gathering, methods of temporarily storing then
> >>loading to the database. Initially, 37+ tables were in use but
> >>the one big-un has saved us over 3.4 minutes.
> > Hmmm ... if few of those columns are NULL, then you are
> probably right ...
> > this is probably the most normalized design. If, however,
> many of columns
> > are NULL the majority of the time, then the design you should
> be using is a
> > vertial child table, of the form ( value_type | value ).
> > Such a vertical child table would also make your comparison
> between instances
> > *much* easier, as it could be executed via a simple
> 4-table-outer-join and 3
> > where clauses. So even if you don't have a lot of NULLs, you
> probably want
> > to consider this.
> You lost me on that one. What's a "vertical child table"?
Parent table Fkey | Option | Value
| OS | Solaris
| DISK1 | 30g
^^^^^^^^ ^^^-- values
fields are values in a column rather than 'fields'
> Statistically, about 6% of the rows use more than 200 of the columns,
> 27% of the rows use 80-199 or more columns, 45% of the rows use 40-79
> columns and the remaining 22% of the rows use 39 or less of the columns.
> That is a lot of NULLS. Never gave that much thought.
> To ensure query efficiency, hide the NULLs and simulate the multiple
> tables I have a boatload of indexes, ensure that every query makees use
> of an index, and have created 37 views. It's worked pretty well so
> >>The reason for my initial question was this. We save changes only.
> >>In other words, if system S has row T1 for day D1 and if on day D2
> >>we have another row T1 (excluding our time column) we don't want
> >>to save it.
> > If re-designing the table per the above is not a possibility,
> then I'd suggest
> > that you locate 3-5 columns that:
> > 1) are not NULL for any row;
> > 2) combined, serve to identify a tiny subset of rows, i.e. 3%
> or less of the
> > table.
> There are always, always, always 7 columns that contain data.
> > Then put a multi-column index on those columns, and do your
> > Hopefully the planner should pick up on the availablity of the
> index and scan
> > only the rows retrieved by the index. However, there is the distinct
> > possibility that the presence of 637 WHERE criteria will
> confuse the planner,
> > causing it to resort to a full table seq scan; in that case,
> you will want to
> > use a subselect to force the issue.
> That's what I'm trying to avoid is a big WHERE (c1,c2,...,c637) <>
> (d1,d2,...,d637) clause. Ugly.
> > Or, as Joe Conway suggested, you could figure out some kind of
> value hash that
> > uniquely identifies your rows.
> I've given that some though and though appealing I don't think I'd care
> to spend the CPU cycles to do it. Best way I can figure to accomplish
> it would be to generate an MD5 on each row without the timestamp and
> store it in another column, create an index on the MD5 column, generate
> MD5 on each line I want to insert. Makes for a simple WHERE...
> Okay. I'll give it a whirl. What's one more column, right?
> Greg Spiegelberg
> Sr. Product Development Engineer
> Cranel, Incorporated.
> Phone: 614.318.4314
> Fax: 614.431.8388
> Email: [EMAIL PROTECTED]
> Cranel. Technology. Integrity. Focus.
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
> (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])