Re: [SQL] Returning NULL results?
On Tue, 2005-10-11 at 15:39 +0100, Richard Huxton wrote:
> Mario Splivalo wrote:
> >
> > Of course, if it returns no rows, I'm presented with an error, saying:
> >
> > ERROR: record "rec" has no field "_qv"
> >
> > This is logical. My question would be: is there a way around this
> > without first doing SELECT COUNT(*) FROM tmbBids WHERE ..., and then
> > if COUNT(*) is zero THEN bidCount := NULL, ELSE if COUNT(*) is not
> > zero then do the SELECT from the beginning?
>
> Could you check the FOUND variable? As in IF NOT FOUND THEN RETURN NULL

That actually works, but it's a bit of a fuss to use. I need to have it inside the FOR loop, then exit the loop if FOUND is NOT TRUE, which makes the code even harder to read/maintain.

Do you think that anytime soon we would be able to do SELECTs from temporary tables without EXECing those selects? And some qualifier for variables (@ in T-SQL, $ in PHP, or something similar) would make the code even more readable. Just my thought.

Btw, on postgres 7.4 you can do a SELECT from a table, and if the query returns no rows, the FOR loop is skipped. So it's safe to do something like:

someVar := NULL;
_SQL := 'SELECT someVar FROM someTable WHERE someCondition...';
FOR rec IN EXECUTE _SQL LOOP
    someVar := rec.someVar;
END LOOP;

Now, if the select returns no rows, the LOOP/END LOOP is skipped, and someVar keeps the NULL it was set to before the FOR.

Mario
--
Mario Splivalo
Mob-Art
[EMAIL PROTECTED]

"I can do it quick, I can do it cheap, I can do it well. Pick any two."
Re: [SQL] SEVEN cross joins?!?!?
Frank Bax wrote:
> > Are you saying that you WANT to generate a cross-join, score the
> > millions of results and then pick the best 10? It's doing what you
> > want, but you'd like it to be faster. Or are you saying that you'd
> > like to avoid the explosion in rows altogether? In either case - I
> > don't suppose you could provide a real example of the query, so we
> > can see exactly what you're trying to do.
>
> There is no "best 10". I currently limit each subselect to 10 items so
> that query will actually run without crashing. I would like to remove
> the "ORDER BY itemid LIMIT 10" mentioned above. At the end of the
> query I have a "LIMIT 100" clause which will stay and produces a list
> of "best 100" combos. Either of your solutions would be acceptable;
> since avoiding the "explosion" would also make the query faster.
> Current calculations indicate that running the query without "LIMIT
> 10" in subselect would take years to process.

OK - so at the heart of the problem is the fact that you want to search a space with 100 billion possible states. There are three main options:

1. Brute force and patience - simple and is guaranteed to produce the "best" answers. You can use cursors/multiple queries to manage the workload. The disadvantage is that it's probably slower than you'd like.

2. Smarter algorithms - possibly something genetic to work towards local maxima. Or, if you'd like to keep it simple, just split your 7 locations into 2,2,3 and solve for each separately.

3. Statistical - examine a subset of possible states and accept you'll only be able to say "almost best" to 99% confidence or similar.

I'd be tempted by #2 - there are probably some combinations you can rule out, which combined with a split/recombine should reduce the number of states to query.

That any help?
--
Richard Huxton
Archonet Ltd
Re: [SQL] SEVEN cross joins?!?!?
Richard Huxton wrote:
> OK - so at the heart of the problem is the fact that you want to
> search a space with 100 billion possible states. There are three main
> options [...]
>
> I'd be tempted by #2 - there are probably some combinations you can
> rule out, which combined with a split/recombine should reduce the
> number of states to query.

I'll second this recommendation. The OP is trying to drive a nail with a screwdriver. He needs a program, not a query.

--
Daryl Richter
(daryl (at) (brandywine dot com))
[SQL] Strange join...maybe some improvements???
I have 3 tables:

CREATE TABLE orders (
    id serial,
    id_ag int4,
    id_modell int4    -- linked to modell.id
);

CREATE TABLE modell (
    id serial,
    id_hersteller int4
);

CREATE TABLE contactpartner (
    id serial,
    id_ag int4,       -- linked to orders.id_ag or modell.id_hersteller
    id_user int4
);
I get a list of id_ag values from contactpartner which belong to a user (AG_LIST). Then I have to select/count all the rows from the orders table that have orders.id_ag in the AG_LIST, or whose modell.id_hersteller is in the AG_LIST.
I have this query:

SELECT count(o.id)
FROM orders o
INNER JOIN modell m ON m.id = o.id_modell
WHERE o.id_ag IN (SELECT id_ag FROM contactpartner cp WHERE id_user = 15)
   OR m.id_hersteller IN (SELECT id_ag FROM contactpartner cp WHERE id_user = 15);
Aggregate  (cost=7828.60..7828.60 rows=1 width=4) (actual time=1145.150..1145.151 rows=1 loops=1)
  ->  Hash Join  (cost=1689.64..7706.32 rows=48913 width=4) (actual time=153.059..1136.457 rows=9395 loops=1)
        Hash Cond: ("outer".id_modell = "inner".id)
        Join Filter: ((hashed subplan) OR (hashed subplan))
        ->  Seq Scan on orders o  (cost=0.00..3129.17 rows=65217 width=12) (actual time=0.031..94.444 rows=65217 loops=1)
        ->  Hash  (cost=1218.07..1218.07 rows=66607 width=8) (actual time=151.211..151.211 rows=0 loops=1)
              ->  Seq Scan on modell m  (cost=0.00..1218.07 rows=66607 width=8) (actual time=0.044..87.154 rows=66607 loops=1)
        SubPlan
          ->  Index Scan using contactpartner_id_user_idx on contactpartner cp  (cost=0.00..6.02 rows=2 width=4) (actual time=0.010..0.018 rows=4 loops=1)
                Index Cond: (id_user = 15)
          ->  Index Scan using contactpartner_id_user_idx on contactpartner cp  (cost=0.00..6.02 rows=2 width=4) (actual time=0.092..0.116 rows=4 loops=1)
                Index Cond: (id_user = 15)
Total runtime: 1145.689 ms
I also tried this one:

SELECT count(o.id)
FROM orders o
INNER JOIN modell m ON m.id = o.id_modell
INNER JOIN contactpartner cp ON cp.id_user = 15
    AND (o.id_ag = cp.id_ag OR cp.id_ag = m.id_hersteller);
Aggregate  (cost=11658.63..11658.63 rows=1 width=4) (actual time=1691.570..1691.570 rows=1 loops=1)
  ->  Nested Loop  (cost=7752.40..11657.27 rows=542 width=4) (actual time=213.945..1683.515 rows=9416 loops=1)
        Join Filter: (("inner".id_ag = "outer".id_ag) OR ("outer".id_ag = "inner".id_hersteller))
        ->  Index Scan using contactpartner_id_user_idx on contactpartner cp  (cost=0.00..6.02 rows=2 width=4) (actual time=0.108..0.188 rows=4 loops=1)
              Index Cond: (id_user = 15)
        ->  Materialize  (cost=7752.40..8723.57 rows=65217 width=12) (actual time=37.586..352.620 rows=65217 loops=4)
              ->  Hash Join  (cost=1677.59..7368.18 rows=65217 width=12) (actual time=150.220..1153.872 rows=65217 loops=1)
                    Hash Cond: ("outer".id_modell = "inner".id)
                    ->  Seq Scan on orders o  (cost=0.00..3129.17 rows=65217 width=12) (actual time=0.034..95.133 rows=65217 loops=1)
                    ->  Hash  (cost=1218.07..1218.07 rows=66607 width=8) (actual time=149.961..149.961 rows=0 loops=1)
                          ->  Seq Scan on modell m  (cost=0.00..1218.07 rows=66607 width=8) (actual time=0.032..86.378 rows=66607 loops=1)
Total runtime: 1696.253 ms
but this brings me duplicate rows (the same o.id appears more than once) when o.id_ag and m.id_hersteller are different but both are in the AG_LIST.
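For what it's worth, counting distinct order ids would collapse those duplicates in the second formulation - a sketch with that one change, untested against this schema:

SELECT count(DISTINCT o.id)
FROM orders o
INNER JOIN modell m ON m.id = o.id_modell
INNER JOIN contactpartner cp ON cp.id_user = 15
    AND (o.id_ag = cp.id_ag OR cp.id_ag = m.id_hersteller);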
Is there any way to speed up this query???
Regards,
Andy.
[SQL] Text->Date conversion in a WHERE clause
I have a table that has some columns which store 'custom' fields, so the content varies according to the user that the row belongs to. For one of the groups of users the field is a date (the type of the field is 'text' though). I'm trying to perform a query where it only returns values in a certain date range, so in the WHERE clause I have:

WHERE cust3 <> ''
AND cust3::text::timestamp > CURRENT_DATE - interval '1 month'

This results in the error 'ERROR: date/time field value out of range: "052-44-5863"'. Now that is obviously not a valid date, but there is actually more to the where clause, and the first part of it excludes all rows where the user is not even the correct type, so the row which includes the field '052-44-5863' should really not even be checked.

My main confusion lies in the assumption I made that the offending row would not even be included, as it should have already been discarded. Is this not the case? How can I overcome this problem? There appears to be no isDate() function in postgresql like there is in sql server.

Regards,
Collin Peters
Re: [SQL] regular expression
[Please copy the mailing list on replies.]

On Wed, Oct 12, 2005 at 04:28:47PM -0400, [EMAIL PROTECTED] wrote:
> Thanks for the reply. It helped a lot. I was going to ask a similar
> question regarding regular expressions, but this time I do not need
> the numeric part. What I mean is, say the values may be in these
> formats:
>
> "TSWUU"       ---need--- "TSWUU"
> "4 - DSC"     ---need--- "DSC"
> "6768 - THY"  ---need--- "THY"
>
> basically the string may or may not start with a number; I need the
> substring of characters, and if it starts with a number, discard it.

This isn't a PostgreSQL issue, but rather one of understanding how regular expressions work in general. See the previously mentioned "Pattern Matching" section of the PostgreSQL documentation for the PostgreSQL-specific details, and use a search engine to find a regular expression tutorial; they're frequently used in Perl and other languages so a lot of learning material exists. If you can find a copy in a library or bookstore, the book _Mastering Regular Expressions_ by Jeffrey E. F. Friedl, published by O'Reilly, is a good resource.

--
Michael Fuhr
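For reference, the thread's three examples can be handled with PostgreSQL's substring() and a POSIX pattern - a sketch, where the column and table names are placeholders:

SELECT substring(val FROM '[A-Za-z]+$') FROM some_table;
-- "TSWUU"       -> "TSWUU"
-- "4 - DSC"     -> "DSC"
-- "6768 - THY"  -> "THY"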
Re: [SQL] Text->Date conversion in a WHERE clause
As an example:

CREATE TABLE userdata (
    userdata_id serial NOT NULL,
    user_id smallint,
    data text
);

CREATE TABLE users (
    user_id serial NOT NULL,
    name text,
    "type" smallint
);

INSERT INTO userdata (userdata_id, user_id, data) VALUES (1, 1, '2005-01-01');
INSERT INTO userdata (userdata_id, user_id, data) VALUES (2, 2, '2005-10-10');
INSERT INTO userdata (userdata_id, user_id, data) VALUES (3, 3, '052-44-5863');
INSERT INTO userdata (userdata_id, user_id, data) VALUES (4, 4, '052-44-5863');

INSERT INTO users (user_id, name, "type") VALUES (1, 'Jim', 1);
INSERT INTO users (user_id, name, "type") VALUES (2, 'John', 1);
INSERT INTO users (user_id, name, "type") VALUES (3, 'Bob', 2);
INSERT INTO users (user_id, name, "type") VALUES (4, 'Bill', 2);

Then run the query:

SELECT *
FROM (
    SELECT u.user_id, ud.data
    FROM users u, userdata ud
    WHERE u.user_id = ud.user_id
      AND u.type = 1
) subusers
WHERE subusers.data::text::date < now();

Returns the message:

ERROR: date/time field value out of range: "052-44-5863"

So my question is: how does this query ever even SEE the row containing "052-44-5863"? The sub-query doesn't return that row, so I don't see how it can get this error.

Regards,
Collin Peters
Re: [SQL] pg, mysql comparison with "group by" clause
Stephan Szabo <[EMAIL PROTECTED]> writes:

> On Tue, 11 Oct 2005, Rick Schumeyer wrote:
>
> > I'm not sure what I was thinking, but I tried the following query in pg:
> >
> > SELECT * FROM t GROUP BY state;
> >
> > pg returns an error.
> >
> > Mysql, OTOH, returns the first row for each state. (The first row
> > with "AK", the first row with "PA", etc.)
> >
> > I'm no SQL expert, but it seems to me that the pg behavior is
> > correct, and the mysql result is just weird. Am I correct?
>
> In your case, it sounds like the mysql result is wrong. I believe
> SQL99 would allow it if the other columns were functionally dependent
> upon state (as there'd by definition only be one value for the other
> columns per group).

I believe this is a documented feature. MySQL treats "select a,b from t group by a" equivalently to Postgres's "select distinct on (a) a,b from t".

I suppose "equivalent" isn't quite true. It's more general since it allows aggregate functions as well. The equivalently general Postgres syntax is to have a first() aggregate function and do "select a,first(b) from t group by a".

I'm sure it's very convenient.

--
greg
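There is no built-in first() aggregate, but one can be sketched in a few lines (names are illustrative; the polymorphic types need 8.0+, and which row is "first" depends on input order unless you feed it a sorted subselect):

-- The strict transition function keeps the first non-null value it sees:
-- the state starts NULL, gets replaced by the first non-null input, and
-- is then returned unchanged for every later row.
CREATE FUNCTION first_accum(anyelement, anyelement) RETURNS anyelement
    AS 'SELECT $1' LANGUAGE sql IMMUTABLE STRICT;

CREATE AGGREGATE first (
    BASETYPE = anyelement,
    SFUNC    = first_accum,
    STYPE    = anyelement
);

-- Usage: SELECT a, first(b) FROM t GROUP BY a;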
Re: [SQL] regular expression
Michael Fuhr wrote:
> This isn't a PostgreSQL issue, but rather one of understanding how
> regular expressions work in general. [...] If you can find a copy in
> a library or bookstore, the book _Mastering Regular Expressions_ by
> Jeffrey E. F. Friedl, published by O'Reilly, is a good resource.

If you have Perl installed, do a 'man perlretut'. It is quite a wonderful tutorial. That is what I used to learn regular expressions.

Yasir
Re: [SQL] Text->Date conversion in a WHERE clause
Collin Peters <[EMAIL PROTECTED]> writes:

> I have a table that has some columns which store 'custom' fields so
> the content varies according to the user that the row belongs to. For
> one of the groups of users the field is a date (the type of the field
> is 'text' though). I'm trying to perform a query where it only returns
> values in a certain date range so in the WHERE clause I have
>
> WHERE cust3 <> ''
> AND cust3::text::timestamp > CURRENT_DATE - interval '1 month'
>
> This results in the error 'ERROR: date/time field value out of range:
> "052-44-5863"'. Now that is obviously not a valid date but there is
> actually more to the where clause and the first part of it excludes
> all rows where the user is not even the correct type, so the row which
> includes the field '052-44-5863' should really not even be checked.

I think you have to use a CASE expression like:

WHERE CASE WHEN user_type = 1
           THEN cust3::timestamp > CURRENT_DATE - interval '1 month'
           ELSE false
      END
  AND ...

There's no advantage to doing this kind of thing though. Good database design principles dictate having one column for each piece of data. Just leave the columns for which the data is inappropriate NULL. NULLs take effectively no space.

--
greg
Re: [SQL] pg, mysql comparison with "group by" clause
On Wed, 2005-10-12 at 16:54, Greg Stark wrote:
> Stephan Szabo <[EMAIL PROTECTED]> writes:
> > In your case, it sounds like the mysql result is wrong. I believe
> > SQL99 would allow it if the other columns were functionally
> > dependent upon state (as there'd by definition only be one value for
> > the other columns per group).
>
> I believe this is a documented feature.

Hehe. When I turn on my windshield wipers and my airbag deploys, is it a documented "feature" if the dealership told me about this behaviour ahead of time?

In much the same way, while this behaviour may be documented by MySQL, I can't imagine it really being called a feature. But at least this misbehaviour is documented. However, I think most people in the MySQL universe just stumble onto it by accident when they try it and it works. I'd at least prefer it to throw a warning or notice or something.

> MySQL treats "select a,b from t group by a" equivalently to Postgres's
> "select distinct on (a) a,b from t"
>
> I suppose "equivalent" isn't quite true. It's more general since it
> allows aggregate functions as well. The equivalently general Postgres
> syntax is to have a first() aggregate function and do "select
> a,first(b) from t group by a".

A Subselect would let you do such a thing as well, and while it's more complicated to write, it is likely to be easier to tell just what it's doing.

> I'm sure it's very convenient.

Not always, but I'd rather get the right answer with difficulty than the wrong one with ease. :)
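For the original "first row per state" question, the subselect Scott mentions could be sketched like this (assuming t has a unique id column to pin down which row counts as "first"):

-- One row per state: the row with the smallest id in that state.
SELECT *
FROM t
WHERE id = (SELECT min(t2.id) FROM t t2 WHERE t2.state = t.state);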
Re: [SQL] Text->Date conversion in a WHERE clause
Collin Peters <[EMAIL PROTECTED]> writes:
> WHERE cust3 <> ''
> AND cust3::text::timestamp > CURRENT_DATE - interval '1 month'
>
> This results in the error 'ERROR: date/time field value out of range:
> "052-44-5863"'. Now that is obviously not a valid date but there is
> actually more to the where clause and the first part of it excludes
> all rows where the user is not even the correct type, so the row which
> includes the field '052-44-5863' should really not even be checked.

Your mistake is in supposing that WHERE clauses are guaranteed to be evaluated in a particular order. Such guarantees are made only for a very few specific constructs such as CASE. See
http://www.postgresql.org/docs/8.0/static/sql-expressions.html#SYNTAX-EXPRESS-EVAL

regards, tom lane
Re: [SQL] pg, mysql comparison with "group by" clause
> Not always, but I'd rather get the right answer with difficulty than
> the wrong one with ease. :)

agreed. I made it a point to mention this so-called "feature" in my book. This is a bug they never fixed and they decided to call it a feature. It is, imo, *ridiculous*.

Regards,
Anthony
Re: [SQL] Text->Date conversion in a WHERE clause
[EMAIL PROTECTED] writes:
> SELECT *
> FROM (
>     SELECT u.user_id, ud.data
>     FROM users u, userdata ud
>     WHERE u.user_id = ud.user_id
>       AND u.type = 1
> ) subusers
> WHERE subusers.data::text::date < now();
>
> So my question is how does this query ever even SEE the row containing
> "052-44-5863"? The sub-query doesn't return that row so I don't see
> how it can get this error.

BTW, the fallacy in this idea is that the planner pushes WHERE clauses as far down the plan tree as it can. EXPLAIN would show you the actual plan tree, but it's probably along the lines of

    Join using u.user_id = ud.user_id
        Scan users u where u.type = 1
        Scan userdata ud where ud.data::text::date < now()

If we did not do this, it would pretty much cripple the performance of queries involving views (since a view is nothing but a macro for a sub-select).

regards, tom lane
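Putting the two answers in this thread together: since CASE is one of the few constructs with a guaranteed evaluation order, the cast can be guarded so it never touches non-date strings even when the planner pushes the clause down. A sketch against Collin's example tables (the date-shape regex is an illustrative assumption, not a full validity check):

SELECT u.user_id, ud.data
FROM users u, userdata ud
WHERE u.user_id = ud.user_id
  AND u.type = 1
  AND CASE WHEN ud.data ~ '^[0-9]{4}-[0-9]{2}-[0-9]{2}$'
           THEN ud.data::date < now()   -- cast only runs on date-shaped strings
           ELSE false
      END;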
Re: [SQL] pg, mysql comparison with "group by" clause
Scott Marlowe <[EMAIL PROTECTED]> writes:

> Hehe. When I turn on my windshield wipers and my airbag deploys, is it
> a documented "feature" if the dealership told me about this behaviour
> ahead of time?

Well it's more like my car where the dashboard dims when I turn on my headlights, which annoys me to no end since I learned to always put my headlights on even in the day.

> In much the same way, while this behaviour may be documented by MySQL,
> I can't imagine it really being called a feature. But at least this
> misbehaviour is documented. However, I think most people in the MySQL
> universe just stumble onto it by accident when they try it and it
> works. I'd at least prefer it to throw a warning or notice or
> something.

I don't see why you think people stumble on this by accident. I think it's actually an extremely common need. So common that Postgres has the same feature (though less general) and invented a whole syntax to handle it.

I think most MySQL users don't stumble on it, they learn it as the way to handle the common use case when you join a master table against a detail table and then want to aggregate all the detail records. In standard SQL you have to write GROUP BY ... and list every single column you need from the master table. Forcing the database to do a lot of redundant comparisons and sort on uselessly long keys where in fact you only really need it to sort and group by the primary key.

Remember, most MySQL users learn MySQL first, and only later learn what is standard SQL and what isn't.

> A Subselect would let you do such a thing as well, and while it's more
> complicated to write, it is likely to be easier to tell just what it's
> doing.

Subselects have their own problems here. Mainly Postgres's optimizer, as good as it is, doesn't treat them with the same code paths as joins and can't find all the same plans for them. But in any case you cannot always write a subselect that's equivalent to an arbitrary join.

--
greg
[SQL] Update timestamp on update
I have a table like:

CREATE TABLE products (
    id int,
    status int,
    last_status_change timestamp DEFAULT now()
);

What I would like is that whenever the status is changed, the last_status_change timestamp is updated to the current time.

I have had a look at the rules and what I want would be similar to:

CREATE RULE last_status_change AS
    ON UPDATE TO products
    WHERE NEW.status <> OLD.status
    DO UPDATE products SET last_status_change = now()
       WHERE id = OLD.id;

Except of course that the above is recursive and doesn't work. How can I do this?

Jeff
Re: [SQL] Update timestamp on update
Jeff Williams <[EMAIL PROTECTED]> writes:
> last_status_change timestamp DEFAULT now()

> What I would like is that whenever the status is changed the
> last_status_change timestamp is updated to the current time.

For this you use an ON UPDATE trigger; rules are not a good way to solve it. See the documentation about triggers. The first example on this page does it along with a few other things:
http://developer.postgresql.org/docs/postgres/plpgsql-trigger.html

regards, tom lane
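For Jeff's exact table, the trigger Tom points at would look something like this minimal PL/pgSQL sketch (the function and trigger names are illustrative, and it assumes PL/pgSQL is installed in the database):

CREATE FUNCTION stamp_status_change() RETURNS trigger AS '
BEGIN
    -- touch the timestamp only when status really changes
    -- (IS DISTINCT FROM would also handle NULL statuses)
    IF NEW.status <> OLD.status THEN
        NEW.last_status_change := now();
    END IF;
    RETURN NEW;
END;
' LANGUAGE plpgsql;

CREATE TRIGGER products_status_stamp BEFORE UPDATE ON products
    FOR EACH ROW EXECUTE PROCEDURE stamp_status_change();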
Re: [SQL] SEVEN cross joins?!?!?
At 09:00 AM 10/12/05, Daryl Richter wrote:
> Richard Huxton wrote:
> > OK - so at the heart of the problem is the fact that you want to
> > search a space with 100 billion possible states. There are three
> > main options [...]
> >
> > I'd be tempted by #2 - there are probably some combinations you can
> > rule out, which combined with a split/recombine should reduce the
> > number of states to query.
>
> I'll second this recommendation. The OP is trying to drive a nail
> with a screwdriver. He needs a program, not a query.

Richard, you've summed it up nicely.

Splitting locations into subsets (like 2,2,3) doesn't work because it is possible that low values in one location can be offset by high values in another location, and still result in an excellent combo.

The good news is these suggestions got me thinking outside the box. I think I can program a modified brute-force that bypasses large numbers of combos early. It might still be too large/slow, so I'd be interested in finding more info about these "smarter algorithms" in option 2. Where do I look?

Greg: my son's the gamer; I'm just trying to help him out.
Re: [SQL] pg, mysql comparison with "group by" clause
Greg Stark <[EMAIL PROTECTED]> writes:
> I think most MySQL users don't stumble on it, they learn it as the way
> to handle the common use case when you join a master table against a
> detail table and then want to aggregate all the detail records. In
> standard SQL you have to write GROUP BY ... and list every single
> column you need from the master table. Forcing the database to do a
> lot of redundant comparisons and sort on uselessly long keys where in
> fact you only really need it to sort and group by the primary key.

Actually, if you're grouping by a table's primary key, the SQL99 spec says you shouldn't have to explicitly list the other columns from that table --- they are guaranteed to have unique values per group anyway. This is a single case in the "functional dependency" stuff. That verbiage is incredibly dense and I don't think we want to tackle all of it any time soon, but the primary-key case probably wouldn't be very hard to implement.

We really ought to have this in TODO ... I'm sure it's been discussed before.

regards, tom lane
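Concretely, for a hypothetical master/detail pair where dept.deptno is the primary key, the SQL99 rule Tom describes would allow the first form, while PostgreSQL at the time of this thread requires the second:

-- Allowed under SQL99 functional dependencies (deptno is the pk,
-- so dname is uniquely determined within each group):
SELECT d.deptno, d.dname, count(*)
FROM dept d JOIN emp e ON e.deptno = d.deptno
GROUP BY d.deptno;

-- What PostgreSQL requires instead:
SELECT d.deptno, d.dname, count(*)
FROM dept d JOIN emp e ON e.deptno = d.deptno
GROUP BY d.deptno, d.dname;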
Re: [SQL] SEVEN cross joins?!?!?
> Splitting locations into subsets (like 2,2,3) doesn't work because it
> is possible that low values in one location can be offset by high
> values in another location, and still result in an excellent combo.
>
> The good news is these suggestions got me thinking outside the box. I
> think I can program a modified brute-force that bypasses large numbers
> of combos early. It might still be too large/slow, so I'd be
> interested in finding more info about these "smarter algorithms" in
> option 2.

Could you define a view with a calculated field, say, 2 * a1 + 6 * a2 + 3 * a3, and then use this aggregate to score the individual rows? I haven't looked at the exact nature of the problem, but if you're multiplying a table by itself, it may be worth redefining the problem in terms of a simple ranking algorithm, define a column to calculate this, then sort by that column.

Best regards,
Ben Stewart
--
Robert Bosch (Australia) Pty. Ltd.
Engineering Quality Services, Student Software Engineer (RBAU/EQS4)
Locked Bag 66 - Clayton South, VIC 3169 - AUSTRALIA
Tel: +61 3 9541-7002  Fax: +61 3 9541-7700
mailto:[EMAIL PROTECTED]  http://www.bosch.com.au/
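A minimal sketch of the view Ben describes (the table name, the columns a1..a3, and the weights are all placeholder assumptions):

CREATE VIEW scored_items AS
    SELECT itemid, 2 * a1 + 6 * a2 + 3 * a3 AS score
    FROM items;

-- Rank rows by the precomputed score rather than scoring inside the big cross join:
SELECT itemid, score
FROM scored_items
ORDER BY score DESC
LIMIT 100;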
Re: [SQL] Update timestamp on update
Tom Lane wrote:
> Jeff Williams <[EMAIL PROTECTED]> writes:
> > last_status_change timestamp DEFAULT now()
> >
> > What I would like is that whenever the status is changed the
> > last_status_change timestamp is updated to the current time.
>
> For this you use an ON UPDATE trigger; rules are not a good way to
> solve it. See the documentation about triggers. The first example on
> this page does it along with a few other things:
> http://developer.postgresql.org/docs/postgres/plpgsql-trigger.html

Thanks. Triggers was my first thought, but chapter 35 on Triggers didn't really indicate a way I could do this easily, and scared me with a lot of C code.

Maybe it is a good idea to present some of the more common things you would want to do with triggers in the triggers chapter?

Jeff
Re: [SQL] Update timestamp on update
Jeff Williams <[EMAIL PROTECTED]> writes:
> Thanks. Triggers was my first thought, but chapter 35 on Triggers
> didn't really indicate a way I could do this easily and scared me with
> a lot of C code.

Yeah. This is a documentation issue that's bothered me for awhile. The problem is that we treat the PL languages as add-ons and therefore the documentation of the "core" system shouldn't rely on them ... but that leaves us presenting C-code triggers as the only examples in chapter 35. There is a paragraph in there suggesting you go look at the PL languages first, but obviously it's not getting the job done.

Anybody have a better idea?

regards, tom lane
Re: [SQL] Update timestamp on update
> The problem is that we treat the PL languages as add-ons and therefore
> the documentation of the "core" system shouldn't rely on them ... but
> that leaves us presenting C-code triggers as the only examples in
> chapter 35. There is a paragraph in there suggesting you go look at
> the PL languages first, but obviously it's not getting the job done.
>
> Anybody have a better idea?

Include a sample trigger for each PL that is shipped with the standard distribution, along with comments to the effect that "it is suggested that the documentation for $PL is read before creating triggers"? It may also be worth including a comment that explicitly states that all(?) PLs are supported in triggers.

Best regards,
Ben Stewart
--
Robert Bosch (Australia) Pty. Ltd.
Engineering Quality Services, Student Software Engineer (RBAU/EQS4)
Locked Bag 66 - Clayton South, VIC 3169 - AUSTRALIA
mailto:[EMAIL PROTECTED]  http://www.bosch.com.au/
Re: [SQL] pg, mysql comparison with "group by" clause
> I don't see why you think people stumble on this by accident. I think
> it's actually an extremely common need.

I can't imagine how that's a common need at all. It makes no sense. When you add an additional column in the select, it must be included in the group by, as it changes the meaning of the query. Consider:

select deptno, count(*) from emp group by deptno;

DEPTNO   COUNT(*)
------   --------
    10          3
    20          5
    30          6

the query above counts the number of employees in each department. Now consider the following query:

select deptno, job, count(*) from emp group by deptno, job;

DEPTNO   JOB        COUNT(*)
------   ---------  --------
    10   CLERK             1
    10   MANAGER           1
    10   PRESIDENT         1
    20   CLERK             2
    20   ANALYST           2
    20   MANAGER           1
    30   CLERK             1
    30   MANAGER           1
    30   SALESMAN          4

the query above counts the number of different job types in each department. In mysql, you would be allowed to do the following:

select deptno, job, count(*) from emp group by deptno;

but it makes no sense. What value would it return and what does it mean? How can that possibly represent reliable data? What would the result set above look like? It would be meaningless in a production system. Honestly, it's a silly bug that mysql has touted as a feature, and I can't imagine why people think it's useful.

> I think most MySQL users don't stumble on it, they learn it as the way
> to handle the common use case when you join a master table against a
> detail table and then want to aggregate all the detail records.

Huh? I don't follow that at all. Perhaps you're confusing the concept with window functions, which neither pg nor mysql have implemented yet? For example, using window functions allows you to return aggregate and detailed info simultaneously:

select ename, deptno, job,
       count(*) over (partition by deptno)      as emps_per_dept,
       count(*) over (partition by deptno, job) as job_per_dept,
       count(*) over ()                         as total
from emp;

ENAME    DEPTNO  JOB        EMPS_PER_DEPT  JOB_PER_DEPT  TOTAL
------   ------  ---------  -------------  ------------  -----
MILLER       10  CLERK                  3             1     14
CLARK        10  MANAGER                3             1     14
KING         10  PRESIDENT              3             1     14
SCOTT        20  ANALYST                5             2     14
FORD         20  ANALYST                5             2     14
SMITH        20  CLERK                  5             2     14
ADAMS        20  CLERK                  5             2     14
JONES        20  MANAGER                5             1     14
JAMES        30  CLERK                  6             1     14
BLAKE        30  MANAGER                6             1     14
ALLEN        30  SALESMAN               6             4     14
MARTIN       30  SALESMAN               6             4     14
TURNER       30  SALESMAN               6             4     14
WARD         30  SALESMAN               6             4     14

But this is not a group by, this is aggregating and windowing, which is quite different from mysql adding that nasty little bug and calling it a feature.

- a
Re: [SQL] Update timestamp on update
On Oct 13, 2005, at 11:52, Tom Lane wrote:
> Jeff Williams <[EMAIL PROTECTED]> writes:
> > Thanks. Triggers was my first thought, but chapter 35 on Triggers
> > didn't really indicate a way I could do this easily and scared me
> > with a lot of c code.
>
> Yeah. This is a documentation issue that's bothered me for awhile.
> The problem is that we treat the PL languages as add-ons and therefore
> the documentation of the "core" system shouldn't rely on them ... but
> that leaves us presenting C-code triggers as the only examples in
> chapter 35. There is a paragraph in there suggesting you go look at
> the PL languages first, but obviously it's not getting the job done.

I think examples in PL/pgSQL would perhaps be most appropriate, as it's a PostgreSQL-specific language, arguably easy to read, and a PL that doesn't rely on any external dependencies, so it should be relatively easy for people to install it if they want. (IIRC, there's even been discussion in the past of whether or not PL/pgSQL should be installed by default.)

Michael Glaesemann
grzm myrealbox com
Re: [SQL] pg, mysql comparison with "group by" clause
Tom,

Don't do it man... It's not the same. The standard has it wrong (that, or I misunderstood your explanation). But this statement:

> you shouldn't have to explicitly list the other columns from that
> table --- they are guaranteed to have unique values per group anyway

...is way off. By changing the values in the select/group by you are changing the Group! How can you arbitrarily add or exclude a column? You can't do it. Look, all it takes is one example to prove the standard wrong... Consider this simple example (in oracle, but the same in pg):

create table foo (id number primary key, name varchar2(10));
insert into foo values (1, 'joe');
insert into foo values (2, 'joe');
insert into foo values (3, 'jim');

select * from foo;

 ID  NAME
---  ----
  1  joe
  2  joe
  3  jim

ok, lets get the count by ID (which is unique):

select id, count(*) from foo group by id;

 ID  COUNT(*)
---  --------
  1         1
  2         1
  3         1

makes sense... the values for ID, 1,2,3, are our groups, and the count aggregate op shows that there's one member per group. now let's get the count by name:

select name, count(*) from foo group by name;

NAME  COUNT(*)
----  --------
jim          1
joe          2

so far so good. The queries and results are representative of the actual data. The values for name, "jim" and "joe", are our groups, and the count aggregate op shows that there's 1 member in the "jim" group and two members in the "joe" group. But, as soon as we add id to the group by...

select name, count(*) from foo group by name, id;

NAME  COUNT(*)
----  --------
jim          1
joe          1
joe          1

it changes the query! Even tho id is unique, it changes the query. The group, by its definition (it's in the group by), is no longer name or id, it's both name and id. How can you simply remove id? Which result set should it return, the first or the second? Makes no sense... If it ain't obvious why, simply plug id into the select:

select id, name, count(*) from foo group by name, id;

 ID  NAME  COUNT(*)
---  ----  --------
  3  jim          1
  1  joe          1
  2  joe          1

so, how can it be that you don't have to list the other columns (in this case "name")? it makes no sense, because if you remove either column from the group by (id or name) it changes the meaning of the query.

The way you guys do it now is correct. Mysql has it wrong. And if the standard states that you can leave out the pk from a group by, they are wrong too, as the simple examples above prove. Fyi, Oracle just bought innodb, so I'd not be too concerned with mysql and their so-called "features" anyway.

Regards,
Anthony
Re: [SQL] pg, mysql comparison with "group by" clause
"Anthony Molinaro" <[EMAIL PROTECTED]> writes: > By changing the values in the select/group by you are changing > Group! How can you arbitrarily add or exclude a column? > You can't do it. Go back and reread the previous posts again. You missed the whole point. -- greg ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [SQL] pg, mysql comparison with "group by" clause
Greg,

You'll have to pardon me... I saw this comment:

> "I don't see why you think people stumble on this by accident.
> I think it's actually an extremely common need."

Which, if referring to the ability to have items in the select that do not need to be included in the group (excluding constants and the like), is just silly.

OTOH, if you're all poking fun at a mysql bug that they try to pass off as a feature, then yes, I'm a clod and I missed that the first time around :)

Regards,
Anthony
Re: [SQL] pg, mysql comparison with "group by" clause
"Anthony Molinaro" <[EMAIL PROTECTED]> writes: > Greg, > You'll have to pardon me... > > I saw this comment: > > "I don't see why you think people stumble on this by accident. > I think it's actually an extremely common need." > > Which, if referring to the ability to have items in the select that do not > need to be included in the group, (excluding constants and the like) is just > silly. Well the "constants and the like" are precisely the point. There are plenty of cases where adding the column to the GROUP BY is unnecessary and since Postgres makes no attempt to prune them out, inefficient. And constants aren't the only such case. The most common case is columns that are coming from a table where the primary key is already included in the GROUP BY list. In the case of columns coming from a table where the primary key is already in the GROUP BY list it's possible for the database to deduce that it's unnecessary to group on that column. But it's also possible to have cases where the programmer has out of band knowledge that it's unnecessary but the database doesn't have that knowledge. The most obvious case that comes to mind is a denormalized data model that includes a redundant column. select dept_id, dept_name, count(*) from employee_list For example if dept_name is guaranteed to be the same for all records with the same dept_id. Of course that's generally considered poor design but it doesn't mean there aren't thousands of databases out there with data models like that. -- greg ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[SQL] UPDATE Trigger on multiple tables
Is it possible to have a single trigger on multiple tables simultaneously? Example:

CREATE TRIGGER emp_cust_stamp BEFORE INSERT OR UPDATE ON employees, customers
    FOR EACH ROW EXECUTE PROCEDURE last_updated_stamp();

I tried something like the above but get an error message at the comma. I tried using the keyword AND as well. I couldn't find anything on this in the docs. I have many different tables in my databases which all have a "last_updated" field, and it seems less tedious to be able to enforce updating this field database-wide using just one trigger. Is it possible?

Ferindo
Re: [SQL] UPDATE Trigger on multiple tables
Ferindo Middleton Jr <[EMAIL PROTECTED]> writes:
> Is it possible to have a single trigger on multiple tables
> simultaneously? Example:

> CREATE TRIGGER emp_cust_stamp BEFORE INSERT OR UPDATE ON employees, customers
>     FOR EACH ROW EXECUTE PROCEDURE last_updated_stamp();

No. You can use the same function for multiple triggers, but you have to CREATE TRIGGER for each table separately.

regards, tom lane
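Spelled out for the tables in the question (this assumes last_updated_stamp() is an existing PL trigger function that sets NEW.last_updated, as the original trigger intended):

-- One trigger per table, all sharing the same function:
CREATE TRIGGER emp_stamp BEFORE INSERT OR UPDATE ON employees
    FOR EACH ROW EXECUTE PROCEDURE last_updated_stamp();

CREATE TRIGGER cust_stamp BEFORE INSERT OR UPDATE ON customers
    FOR EACH ROW EXECUTE PROCEDURE last_updated_stamp();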
Re: [SQL] pg, mysql comparison with "group by" clause
Greg,

Ok, I think I see where you're going (I don't agree, but I think I get you now). So, using your example of:

> dept_name is guaranteed to be the same for all records with the same
> dept_id.

Here:

select d.deptno, d.dname
from emp e, dept d
where e.deptno = d.deptno;

DEPTNO  DNAME
------  ----------
    10  ACCOUNTING
    10  ACCOUNTING
    10  ACCOUNTING
    20  RESEARCH
    20  RESEARCH
    20  RESEARCH
    20  RESEARCH
    20  RESEARCH
    30  SALES
    30  SALES
    30  SALES
    30  SALES
    30  SALES
    30  SALES

ok, so there's your scenario. And you're suggesting that one should be able to do the following query?

select d.deptno, d.dname, count(*)
from emp e, dept d
where e.deptno = d.deptno;

if that's what you suggest, then we'll just have to agree to disagree. That query needs a group by. What you're suggesting is, imo, a wholly unnecessary shortcut (almost as bad as that ridiculous "natural join" - whoever came up with that should be tarred and feathered). I think I see your point now, I just disagree. You're depending on syntax to work based on data integrity? Hmmm, don't think I like that idea.

What performance improvement do you get from leaving group by out? Look at the query above: doesn't a count of distinct deptno,dname pairs have to take place anyway? What do you save by excluding group by? Are you suggesting COUNT be computed for each row (windowed) or that COUNT is computed for each group? If you want repeating rows, then you want windowing. For example:

select d.deptno, d.dname,
       count(*) over (partition by d.deptno, d.dname) cnt
from emp e, dept d
where e.deptno = d.deptno;

DEPTNO  DNAME       CNT
------  ----------  ---
    10  ACCOUNTING    3
    10  ACCOUNTING    3
    10  ACCOUNTING    3
    20  RESEARCH      5
    20  RESEARCH      5
    20  RESEARCH      5
    20  RESEARCH      5
    20  RESEARCH      5
    30  SALES         6
    30  SALES         6
    30  SALES         6
    30  SALES         6
    30  SALES         6
    30  SALES         6

if you want "groups", then use group by:

select d.deptno, d.dname, count(*) cnt
from emp e, dept d
where e.deptno = d.deptno
group by d.deptno, d.dname;

DEPTNO  DNAME       CNT
------  ----------  ---
    10  ACCOUNTING    3
    20  RESEARCH      5
    30  SALES         6

what you're suggesting doesn't seem to fit in at all, particularly when pg implements window functions. If you're suggesting the pg optimizer isn't doing the right thing with group by queries, then this is an optimizer issue and that should be hacked, not group by. If you're suggesting certain rows be ditched or shortcuts be taken, then the optimizer should do that, not the programmer writing sql. Db2 and oracle have no problem doing these queries; I don't see why pg should have a problem.

imo, the only items that should not be listed in the group by are:

1. constants and deterministic functions
2. scalar subqueries
3. window functions

1 - because the value is the same for each row
2&3 - because they are evaluated after the grouping takes place

regards,
Anthony
