Re: [SQL] Re: Restriction by grouping problem.
At 11:47 27/07/01 +1000, Philip Warner wrote: > >...the FROM statement would be a lot nicer if your had a table of SIDs, or >at least had a good way of generating the list of SIDs. > A final (& possibly best) option, is: Select MDT.GETFILE, [anything else from 'best' record] From (Select Distinct SID From LOGS) as S, (Select * From LOGS L Where L.SID = S.SID Order By DATETIME Asc Limit 1) as MDT But unfortunately PG 7.1 does not allow this - the second select can not reference the first select. I have not tried it in 7.1.2. Philip Warner| __---_ Albatross Consulting Pty. Ltd. |/ - \ (A.B.N. 75 008 659 498) | /(@) __---_ Tel: (+61) 0500 83 82 81 | _ \ Fax: (+61) 0500 83 82 82 | ___ | Http://www.rhyme.com.au |/ \| |---- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/ ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
[SQL] nullif BUG???
I try > select nullif(NULL,5) ++ | case | ++ | | ++ -- ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [SQL] nullif BUG???
Guard, > > select nullif(NULL,5) > > ++ > | case | > ++ > | | > ++ > Er... what were you expecting, exactly? Except for IS NULL (and COALESCE, which uses IS NULL) any operation involving a NULL is also NULL. -Josh __AGLIO DATABASE SOLUTIONS___ Josh Berkus Complete information technology [EMAIL PROTECTED] and data management solutions (415) 565-7293 for law firms, small businessesfax 621-2533 and non-profit organizations. San Francisco ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
[SQL] Re: D308-E9AF-4C11 : CONFIRM from pgsql-sql (subscribe)
A Question for those of you who consider yourself crack Database Designers. I am currently moving a large database(100+Tables) into pgSQL... with the intention of deploying against 'any' SQL database in the future. The development side will be rigorously using Standard SQL constructs with no unique/proprietary extensions. My question concerns establishing the relationships. Currently Relationships between tables are established via a Unique Integer ID like this: *=APrimaryKey PartTypesCustomer Parts - - PartTypeID CustomerID PartID *PartType*Customer PartTypeID Address CustomerID *PartNumber(2FieldPrimaryKey) *PartRevision(2FieldPrimaryKey) PartName HOWEVER; I have read lots of texts describing the Relational Design should be instead like this: *=APrimaryKey PartTypesCustomer Parts - - *PartType*Customer PartType Address *PartNumber(2FieldPrimaryKey) *PartRevison(2FieldPrimaryKey) PartName Customer Both Techniques have a unique foreign key back to the parent tables but one uses No.Meaningful.Info.Integer.Data for the ForeignKey while the second uses Human.Understandable.ForeignKeys Is one recommended over the other??? Sure appreciate the commentary before I get in too deep with all these tables. Thanks! ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
[SQL] Database Design Question
A Question for those of you who consider yourself crack Database Designers. I am currently moving a large database(100+Tables) into pgSQL... with the intention of deploying against 'any' SQL database in the future. The development side will be rigorously using Standard SQL constructs with no unique/proprietary extensions. My question concerns establishing the relationships. Currently Relationships between tables are established via a Unique Integer ID like this: *=APrimaryKey PartTypesCustomer Parts - - PartTypeID CustomerID PartID *PartType*Customer PartTypeID Address CustomerID *PartNumber(2FieldPrimaryKey) *PartRevision(2FieldPrimaryKey) PartName HOWEVER; I have read lots of texts describing the Relational Design should be instead like this: *=APrimaryKey PartTypesCustomer Parts - - *PartType*Customer PartType Address *PartNumber(2FieldPrimaryKey) *PartRevison(2FieldPrimaryKey) PartName Customer Both Techniques have a unique foreign key back to the parent tables but one uses No.Meaningful.Info.Integer.Data for the ForeignKey while the second uses Human.Understandable.ForeignKeys Is one recommended over the other??? Sure appreciate the commentary before I get in too deep with all these tables. Thanks! ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
RE: [SQL] Database Design Question
I'm by no means a crack database designer, but I do have my ideas on this subject. I prefer that every table has a unique integer (auto-incrementing) primary key. Why? Consistency. With the alternative, some tables may not have a clear-cut candidate for a unique id. In your example, you had "customer". How many "John Smith"s would it take before we decide that is not a good identifier. On the other hand, some tables would have perfectly logical identifiers. Part numbers, SSNs So, you would need to create some tables with integer primary keys, and others would have some other natural identifier. That to me is inconsistent. Every table should be, IMHO, predictable in it's definition of a primary key. I don't even have to guess what the names of my primary keys are either because the are all named _ID. Always. I've only come up with these thoughts on my own, and have not extensively tried the other way, so I'd be interested in hearing other's ideas for the other side. Thanks, Jimmie Fulton Systems Administrator Environmental Health & Safety Office Emory University School Of Medicine -Original Message- From: Gonzo Rock [mailto:[EMAIL PROTECTED]] Sent: Friday, July 27, 2001 2:03 PM To: [EMAIL PROTECTED] Subject: [SQL] Database Design Question A Question for those of you who consider yourself crack Database Designers. I am currently moving a large database(100+Tables) into pgSQL... with the intention of deploying against 'any' SQL database in the future. The development side will be rigorously using Standard SQL constructs with no unique/proprietary extensions. My question concerns establishing the relationships. Currently Relationships between tables are established via a Unique Integer ID like this: *=APrimaryKey PartTypesCustomer Parts - - PartTypeID CustomerID PartID *PartType*Customer PartTypeID Address CustomerID *PartNumber(2FieldPrimaryKey) *PartRevision(2FieldPrimaryKey) PartName HOWEVER; I have read lots of texts describing the Relational Design should be instead like this: *=APrimaryKey PartTypesCustomer Parts - - *PartType*Customer PartType Address *PartNumber(2FieldPrimaryKey) *PartRevison(2FieldPrimaryKey) PartName Customer Both Techniques have a unique foreign key back to the parent tables but one uses No.Meaningful.Info.Integer.Data for the ForeignKey while the second uses Human.Understandable.ForeignKeys Is one recommended over the other??? Sure appreciate the commentary before I get in too deep with all these tables. Thanks! ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
RE: [SQL] Database Design Question
You would not necessarily want clients searching by the integer-based key. That is used internally by the database or application. Clients wouldn't know what the keys are anyway because they are not used for anything except referential integrity/relation of data. You would have Part_ID, and Part_Number. Part_ID is defining the relationship and Part_Number is human readable for searching. If you don't want your clients to see the Part_ID, create a view that excludes it and make them use the views for queries. As stated already by James, you can still make Part_Number unique even if a primary key already exists. Hope this makes sense/is helpful, Jimmie Fulton Systems Administrator Environmental Health & Safety Office Emory University School Of Medicine -Original Message- From: Gonzo Rock [mailto:[EMAIL PROTECTED]] Sent: Friday, July 27, 2001 4:22 PM To: [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Subject: RE: [SQL] Database Design Question OK... Fair Enough... Good Points indeed y'all. Well... What about the problem of users trying to Query the Database?? You know... like when using Crystal Reports or something?. SELECT * from HistoryTable WHERE PartID = SomeInteger vs SELECT * from HistoryTable WHERE PartNum = 12345636 AND PartRev = C How are they supposed to know What the PartID is ?? Anyway, that I why I was considering changing... current users always have trouble peering into the database... They don't quite get it. At 02:31 PM 7/27/01 -0400, Mike Mascari wrote: >I prefer using unique integer ids generated from sequences rather than >keys composed of meaningful values. > >Advantages: > >Client side applications can store/handle the unique integer ids more >readily than having to deal with composite primary keys composed of >varying data types. For example, I can stuff the id associated with a >particular record easily in list boxes, combo boxes, edit controls, etc. >via SetItemData() or some other appropriate method. Its a bit more >complicated to track database records via composite keys of something >like: part no, vendor no, vendor group. > >Updating the data doesn't require cascading updates. If you use keys >with meaning, the referential integrity constraints must support >cascading updates so if the key changes in the primary table the change >is cascaded to all referencing tables as well. Earlier versions of most >databases (Access, Oracle, etc.) only provided cascading deletes under >the assumption you would be using sequence generated keys. > >Downside: > >Many queries might require more joins against the primary table to fetch >the relevant information associated with the numerical id, whereas keys >composed of solely the values with which they are associated might not >require the joins, which will speed some applications. I now have some >queries with 20-way joins. But PostgreSQL provides a way to explicitly >set the path the planner will choose and so the execution of the query >is instantaneous. I'm not sure about other databases. In earlier >versions, I had to denormalize a bit solely for performance reasons. > >In the past, I used to use composite keys and switched to the purely >sequence generated path and don't regret it at all. Of course, you'll >still have a unique constraint on the what-would-have-been meaningful >primary key. > >Hope that helps, > >Mike Mascari >[EMAIL PROTECTED] > >Gonzo Rock wrote: >> >> A Question for those of you who consider yourself crack Database Designers. >> >> I am currently moving a large database(100+Tables) into pgSQL... with the intention of deploying against 'any' SQL database in the future. The development side will be rigorously using Standard SQL constructs with no unique/proprietary extensions. >> >> My question concerns establishing the relationships. >> >> Currently Relationships between tables are established via a Unique Integer ID like this: >> >> *=APrimaryKey >> >> PartTypesCustomer Parts >> - - >> PartTypeID CustomerID PartID >> *PartType*Customer PartTypeID >> Address CustomerID >> *PartNumber(2FieldPrimaryKey) >> *PartRevision(2FieldPrimaryKey) >>PartName >> >> >> HOWEVER; I have read lots of texts describing the Relational Design should be instead like this: >> >> *=APrimaryKey >> >> PartTypesCustomer Parts >> - - >> *PartType*Customer PartType >> Address *PartNumber(2FieldPrimaryKey) >> *PartRevison(2FieldPrimaryKey) >>PartName >>Customer >> >> Both Techniques have a unique foreign key back to the parent tables but one uses No.Meaningful.Info.Integer.Data for the ForeignKey while the second uses Human.Understandable.ForeignKeys >> >> Is one recommended over the other??? Sure appreciate the commentary befo
Re: [SQL] Re: Restriction by grouping problem.
Phillip, > What it effectively does is executes the second SELECT for each row > of the > first select, a bit like a column-select expression, but it allows > access > to all rows. Ah! I knew that there had to be a simple way to do what you want to do in PGSQL: SELECT MIN(getfile) as bestfile FROM logs, (SELECT SID, MAX(logtime) as lasttime FROM logs) MDT WHERE logs.SID = MDT.SID AND logs.logtime = MDT.lasttime GROUP BY SID This gives you the "getfile" from the logs with the latest timestamp. In the event of two with the exact same logtime, it selects the first one alphabetically so that you don't get two files for one SID. -Josh __AGLIO DATABASE SOLUTIONS___ Josh Berkus Complete information technology [EMAIL PROTECTED] and data management solutions (415) 565-7293 for law firms, small businessesfax 621-2533 and non-profit organizations. San Francisco ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [SQL] Re: Restriction by grouping problem.
At 17:19 27/07/01 -0700, Josh Berkus wrote: >Phillip, > >> What it effectively does is executes the second SELECT for each row >> of the >> first select, a bit like a column-select expression, but it allows >> access >> to all rows. > >Ah! I knew that there had to be a simple way to do what you want to do >in PGSQL: The first two examples I sent do it 'simply' in PG. Unfortunately the example you give will only return one row, and since Jeff wanted one gettime for each SID, we'd need to modify both your and my first solution to: Select SID,Min(GETFILE) From (Select Min(DATETIME),SID From LOGS Group by SID) as MDT, LOGS L Where L.SID = MDT.SID And L.DATETIME = MDT.DATETIME This will work, but will not solve the larger problem of 'give me all the attrs of of the record with the least datetime for a given SID'. Jeff does not need this, but it is a common problem. Simply using min/max works for one attr, but fails to provide a consistent view of multiple attrs. PG has no elegant single-statement solution to this problem. The SQL solution I use in other systems is, as I mentioned earlier: Select [any list of attrs from the 'best' record] From (Select Distinct SID From LOGS) as S, -- Get the grouping attrs (Select * From LOGS L Where L.SID = S.SID -- For each group do a 'row-select' Order By DATETIME Asc Limit 1) as MDT The second select can be thought of as analagous to a column-select-expression, but allowing access to all attrs of the resulting row. If you want f1,f2,and f3 from LOGS, then a similar result would be achieved (inelegantly) by: Select (Select F1 From LOGS L Where L.SID = S.SID Order By DATETIME Asc Limit 1) as F1, (Select F2 From LOGS L Where L.SID = S.SID Order By DATETIME Asc Limit 1) as F2, (Select F3 From LOGS L Where L.SID = S.SID Order By DATETIME Asc Limit 1) as F3 From (Select Distinct SID From LOGS) as S, -- Get the grouping attrs Assuming DATETIME is unique then F1, F2, F3 will all come from the same row and you will have a consistent record. I have not checked, but I'd guess that PG will create a cross block with four entries, whereas the original syntax above should just use 2 entries. Like I said, it's just an optimizer hint. Philip Warner| __---_ Albatross Consulting Pty. Ltd. |/ - \ (A.B.N. 75 008 659 498) | /(@) __---_ Tel: (+61) 0500 83 82 81 | _ \ Fax: (+61) 0500 83 82 82 | ___ | Http://www.rhyme.com.au |/ \| |---- PGP key available upon request, | / and from pgp5.ai.mit.edu:11371 |/ ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
[SQL] Why does this plpgslq always return 1?
Can anyone tell me why this always return 1? Thanks! CREATE FUNCTION passrate(date, date, text) RETURNS float AS ' DECLARE begindate ALIAS FOR $1; enddate ALIAS FOR $2; passfail ALIAS FOR $3; ret float; countp float; counttotal float; BEGIN SELECT INTO countp COUNT(*) FROM benchmark WHERE passfail = passfail; SELECT INTO counttotal COUNT(*) FROM benchmark; ret := countp / counttotal; RETURN ret; END;' LANGUAGE 'plpgsql'; John ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
[SQL] RE: Database Design Question
On Fri, 27 Jul 2001, Jimmie Fulton wrote:
> I'm by no means a crack database designer, but I do have my ideas on this
> subject. I prefer that every table has a unique integer (auto-incrementing)
> primary key. Why? Consistency. With the alternative, some tables may not
> have a clear-cut candidate for a unique id. In your example, you had
> "customer". How many "John Smith"s would it take before we decide that is
> not a good identifier. On the other hand, some tables would have perfectly
> logical identifiers. Part numbers, SSNs So, you would need to create
> some tables with integer primary keys, and others would have some other
> natural identifier. That to me is inconsistent. Every table should be,
> IMHO, predictable in it's definition of a primary key. I don't even have to
> guess what the names of my primary keys are either because the are all named
> _ID. Always. I've only come up with these thoughts on my own,
> and have not extensively tried the other way, so I'd be interested in
> hearing other's ideas for the other side.
For large scale databases, there are theories aplenty about proper naming,
etc., and these should be be investigated. For small/simple databases,
this might be overkill in complexity and learning curve.
I teach a series of classes on small-scale database design for nonprofit
organizations, and recommend some simple rules:
* for tables where there is no candidate key (ie a Person table where
there is no SSN, etc.), use the table name + "id", and use a SERIAL-type.
* for tables where there is a candidate key, and that candidate key meets
all the usual criteria (always present, always unique, changes very
rarely), use the table name + "code", and use the appropriate type
(text/int/whatever), UNLESS
* there exists a very common name for this piece of info. Rather than
calling a SSN a "personcode" (in the above example), call it the SSN
* always put the primary key first in the table
Why not always use a SERIAL integer? It's a pain when a perfectly good
value exists. For example, in a table keeping track of US states, their
capitals, and governors, does it really make sense to code Maryland as
"45", when a memorable, unique, unlikely-to-change code already exists
("md")? Using a random number when a real-world code could do only forces
your user to do that lookup themselves.
[apologies to the international readers: Maryland is a state in the USA,
and "MD" is the postal code abbreviation for it]
I think that you could make some basic rules that would give you a system
that is intuitive and easy-to-rememeber, _without_ sacrificing clarity for
rigidity.
hth,
--
Joel Burton <[EMAIL PROTECTED]>
Director of Information Systems, Support Center of Washington
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
Re: [SQL] Why does this plpgslq always return 1?
On Fri, 27 Jul 2001, John Oakes wrote: > Can anyone tell me why this always return 1? Thanks! > > CREATE FUNCTION passrate(date, date, text) RETURNS float AS ' > > DECLARE > begindate ALIAS FOR $1; > enddate ALIAS FOR $2; > passfail ALIAS FOR $3; > ret float; > countp float; > counttotal float; > > > BEGIN > > SELECT INTO countp COUNT(*) > FROM benchmark > WHERE passfail = passfail; Ehhh? Were you expecting that to substitue for *one* of those passfails and not the other? ;) Rename the alias variable to something else. > > SELECT INTO counttotal COUNT(*) > FROM benchmark; > > ret := countp / counttotal; > RETURN ret; > END;' > LANGUAGE 'plpgsql'; ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
