Re: [SQL] Returning NULL results?

2005-10-12 Thread Mario Splivalo
On Tue, 2005-10-11 at 15:39 +0100, Richard Huxton wrote:
> Mario Splivalo wrote:
> > 
> > Of course, if it returns no rows, I'm presented with an error, saying:
> > 
> > ERROR:  record "rec" has no field "_qv"
> > 
> > This is logical. My question would be: is there a way around this
> > without first doing SELECT COUNT(*) FROM tmbBids WHERE ..., and then
> > if COUNT(*) is zero THEN bidCount := NULL, ELSE if COUNT(*) is not
> > zero then do the SELECT from the beginning?
> 
> Could you check the FOUND variable? As in IF NOT FOUND THEN RETURN NULL
> 

That actually works, but it's a bit of a fuss to use. I need to have it
inside the FOR loop, then exit the loop if FOUND is not true, which makes
the code harder to read/maintain. Do you think that any time soon we will
be able to do SELECTs from temporary tables without EXECing those selects?
And some qualifier for variables (@ in T-SQL, $ in PHP, or something
similar) would make the code even more readable. Just my thought.

Btw, on PostgreSQL 7.4 you can do a SELECT from a temporary table, and if
the query returns no rows, the FOR loop is skipped. So it's safe to do
something like:

someVar := NULL;
_SQL := 'SELECT someVar FROM someTable WHERE someCondition...';
FOR rec IN EXECUTE _SQL LOOP someVar := rec.someVar; END LOOP;

Now, if the select returns no rows, the LOOP/END LOOP is skipped, and
someVar keeps the NULL it was set to before the FOR.
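For comparison, Richard's FOUND-based variant might look something like
this (a sketch only, with the same placeholder names as above; a FOR ...
EXECUTE loop sets FOUND according to whether it iterated at least once):

FOR rec IN EXECUTE _SQL LOOP
    someVar := rec.someVar;
END LOOP;
IF NOT FOUND THEN
    someVar := NULL;  -- the query returned no rows
END IF;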

Mario
-- 
Mario Splivalo
Mob-Art
[EMAIL PROTECTED]

"I can do it quick, I can do it cheap, I can do it well. Pick any two."



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [SQL] SEVEN cross joins?!?!?

2005-10-12 Thread Richard Huxton

Frank Bax wrote:
Are you saying that you WANT to generate a cross-join, score the 
millions of results and then pick the best 10? It's doing what you 
want, but you'd like it to be faster.


Or are you saying that you'd like to avoid the explosion in rows 
altogether?


In either case - I don't suppose you could provide a real example of 
the query, so we can see exactly what you're trying to do.



There is no "best 10".  I currently limit each subselect to 10 items so 
that query will actually run without crashing.  I would like to remove 
the "ORDER BY itemid LIMIT 10" mentioned above.  At the end of the query 
I have a "LIMIT 100" clause which will stay and produces a list of "best 
100" combos.


Either of your solutions would be acceptable; since avoiding the 
"explosion" would also make the query faster.  Current calculations 
indicate that running the query without "LIMIT 10" in subselect would 
take years to process.


OK - so at the heart of the problem is the fact that you want to search 
a space with 100 billion possible states. There are three main options:


1. Brute force and patience - simple and is guaranteed to produce the 
"best" answers. You can use cursors/multiple queries to manage the 
workload. The disadvantage is that it's probably slower than you'd like.
2. Smarter algorithms - possibly something genetic to work towards local 
maxima. Or, if you'd like to keep it simple, just split your 7 locations 
into 2,2,3 and solve for each separately.
3. Statistical - examine a subset of possible states and accept you'll 
only be able to say "almost best" to 99% confidence or similar.


I'd be tempted by #2 - there are probably some combinations you can rule 
out, which combined with a split/recombine should reduce the number of 
states to query.
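As a rough sketch of the split/recombine idea in option 2 (the schema here
is entirely hypothetical -- assume a table items(loc, itemid, score) with
one row per candidate item per location): score small groups of locations
separately, keep only the best partial combos from each group, and
cross-join just those survivors.

-- best pairs for locations 1 and 2 (likewise best_34, best_567):
CREATE TEMP TABLE best_12 AS
SELECT a.itemid AS i1, b.itemid AS i2, a.score + b.score AS score
FROM items a CROSS JOIN items b
WHERE a.loc = 1 AND b.loc = 2
ORDER BY score DESC LIMIT 1000;

-- then recombine only the survivors:
-- SELECT ... FROM best_12 CROSS JOIN best_34 CROSS JOIN best_567
-- ORDER BY best_12.score + best_34.score + best_567.score DESC LIMIT 100;

This prunes the 100-billion-state space to (at most) 1000^3 recombinations,
at the cost of possibly missing combos whose parts score poorly in
isolation.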


That any help?
--
  Richard Huxton
  Archonet Ltd



Re: [SQL] SEVEN cross joins?!?!?

2005-10-12 Thread Daryl Richter

Richard Huxton wrote:

[snip]




I'll second this recommendation.  The OP is trying to drive a nail with 
a screwdriver.  He needs a program, not a query.







--
Daryl Richter
(daryl (at)(brandywine dot com))




[SQL] Strange join...maybe some improvements???

2005-10-12 Thread Andy



I have 3 tables:

CREATE TABLE orders (
    id serial,
    id_ag int4,
    id_modell int4  -- linked to modell.id
);

CREATE TABLE modell (
    id serial,
    id_hersteller int4
);

CREATE TABLE contactpartner (
    id serial,
    id_ag int4,  -- linked to orders.id_ag or modell.id_hersteller
    id_user int4
);

I get a list of id_ag values from contactpartner which belong to a user
(AG_LIST). Then I have to select/count all the rows from the orders table
that have orders.id_ag in AG_LIST or modell.id_hersteller in AG_LIST.

I have this query:

SELECT count(o.id) FROM orders o
INNER JOIN modell m ON m.id = o.id_modell
WHERE o.id_ag IN (SELECT id_ag FROM contactpartner cp WHERE id_user = 15)
   OR m.id_hersteller IN (SELECT id_ag FROM contactpartner cp WHERE id_user = 15)

Aggregate  (cost=7828.60..7828.60 rows=1 width=4) (actual time=1145.150..1145.151 rows=1 loops=1)
  ->  Hash Join  (cost=1689.64..7706.32 rows=48913 width=4) (actual time=153.059..1136.457 rows=9395 loops=1)
        Hash Cond: ("outer".id_modell = "inner".id)
        Join Filter: ((hashed subplan) OR (hashed subplan))
        ->  Seq Scan on orders o  (cost=0.00..3129.17 rows=65217 width=12) (actual time=0.031..94.444 rows=65217 loops=1)
        ->  Hash  (cost=1218.07..1218.07 rows=66607 width=8) (actual time=151.211..151.211 rows=0 loops=1)
              ->  Seq Scan on modell m  (cost=0.00..1218.07 rows=66607 width=8) (actual time=0.044..87.154 rows=66607 loops=1)
        SubPlan
          ->  Index Scan using contactpartner_id_user_idx on contactpartner cp  (cost=0.00..6.02 rows=2 width=4) (actual time=0.010..0.018 rows=4 loops=1)
                Index Cond: (id_user = 15)
          ->  Index Scan using contactpartner_id_user_idx on contactpartner cp  (cost=0.00..6.02 rows=2 width=4) (actual time=0.092..0.116 rows=4 loops=1)
                Index Cond: (id_user = 15)
Total runtime: 1145.689 ms

I tried also this one:

SELECT count(o.id) FROM orders o
INNER JOIN modell m ON m.id = o.id_modell
INNER JOIN contactpartner cp ON cp.id_user = 15
    AND (o.id_ag = cp.id_ag OR cp.id_ag = m.id_hersteller)

Aggregate  (cost=11658.63..11658.63 rows=1 width=4) (actual time=1691.570..1691.570 rows=1 loops=1)
  ->  Nested Loop  (cost=7752.40..11657.27 rows=542 width=4) (actual time=213.945..1683.515 rows=9416 loops=1)
        Join Filter: (("inner".id_ag = "outer".id_ag) OR ("outer".id_ag = "inner".id_hersteller))
        ->  Index Scan using contactpartner_id_user_idx on contactpartner cp  (cost=0.00..6.02 rows=2 width=4) (actual time=0.108..0.188 rows=4 loops=1)
              Index Cond: (id_user = 15)
        ->  Materialize  (cost=7752.40..8723.57 rows=65217 width=12) (actual time=37.586..352.620 rows=65217 loops=4)
              ->  Hash Join  (cost=1677.59..7368.18 rows=65217 width=12) (actual time=150.220..1153.872 rows=65217 loops=1)
                    Hash Cond: ("outer".id_modell = "inner".id)
                    ->  Seq Scan on orders o  (cost=0.00..3129.17 rows=65217 width=12) (actual time=0.034..95.133 rows=65217 loops=1)
                    ->  Hash  (cost=1218.07..1218.07 rows=66607 width=8) (actual time=149.961..149.961 rows=0 loops=1)
                          ->  Seq Scan on modell m  (cost=0.00..1218.07 rows=66607 width=8) (actual time=0.032..86.378 rows=66607 loops=1)
Total runtime: 1696.253 ms

but this brings me some duplicate rows (the same o.id) in the situation
where o.id_ag and m.id_hersteller are different, but both are still in
AG_LIST.

Is there any way to speed up this query?

Regards,
Andy.


[SQL] Text->Date conversion in a WHERE clause

2005-10-12 Thread Collin Peters
I have a table that has some columns which store 'custom' fields so the
content varies according to the user that the row belongs to.  For one
of the groups of users the field is a date (the type of the field is
'text' though).  I'm trying to perform a query where it only returns
values in a certain date range so in the WHERE clause I have

WHERE cust3 <> ''
AND cust3::text::timestamp > CURRENT_DATE - interval '1 month'

This results in the error 'ERROR:  date/time field value out of range:
"052-44-5863"'.  Now that is obviously not a valid date but there
is actually more to the where clause and the first part of it excludes
all rows where the user is not even the correct type, so the row which
includes the field '052-44-5863' should really not even be checked.

My main confusion lies in my assumption that the offending row would not
even be included, as it should already have been discarded. Is this not
the case? How can I overcome this problem? There appears to be no
isDate() function in PostgreSQL like there is in SQL Server.

Regards,
Collin Peters



Re: [SQL] regular expression

2005-10-12 Thread Michael Fuhr
[Please copy the mailing list on replies.]

On Wed, Oct 12, 2005 at 04:28:47PM -0400, [EMAIL PROTECTED] wrote:
> Thanks for the reply. It helped a lot. I was going to ask a similar
> question regarding regular expressions, but this time I don't need the
> numeric part. What I mean is:
> say that this may be in these format 
>  "TSWUU"  ---need--- "TSWUU"
>  "4 - DSC"---need--- "DSC"
>  "6768 - THY" ---need--- "THY"
> basically the string may or may not start with a number;
> if it starts with a number, discard that part and keep the rest

This isn't a PostgreSQL issue, but rather one of understanding how
regular expressions work in general.  See the previously mentioned
"Pattern Matching" section of the PostgreSQL documentation for the
PostgreSQL-specific details, and use a search engine to find a
regular expression tutorial; they're frequently used in Perl and
other languages so a lot of learning material exists.  If you can
find a copy in a library or bookstore, the book _Mastering Regular
Expressions_ by Jeffrey E. F. Friedl, published by O'Reilly, is a
good resource.
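For the specific examples above, one hedged sketch using POSIX pattern
matching (the table and column names here are made up for illustration):

SELECT substring(code from '[A-Za-z].*$') FROM codes;

-- "TSWUU"      -> "TSWUU"
-- "4 - DSC"    -> "DSC"
-- "6768 - THY" -> "THY"

substring(string from pattern) returns the portion of the string matching
the pattern, so anchoring the match at the first letter skips any leading
digits, spaces, and dashes.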

-- 
Michael Fuhr



Re: [SQL] Text->Date conversion in a WHERE clause

2005-10-12 Thread cadiolis
As an example:

CREATE TABLE userdata (
userdata_id serial NOT NULL,
user_id smallint,
data text
);

CREATE TABLE users (
user_id serial NOT NULL,
name text,
"type" smallint
);

INSERT INTO userdata (userdata_id, user_id, data) VALUES (1, 1,
'2005-01-01');
INSERT INTO userdata (userdata_id, user_id, data) VALUES (2, 2,
'2005-10-10');
INSERT INTO userdata (userdata_id, user_id, data) VALUES (3, 3,
'052-44-5863');
INSERT INTO userdata (userdata_id, user_id, data) VALUES (4, 4,
'052-44-5863');

INSERT INTO users (user_id, name, "type") VALUES (1, 'Jim', 1);
INSERT INTO users (user_id, name, "type") VALUES (2, 'John', 1);
INSERT INTO users (user_id, name, "type") VALUES (3, 'Bob', 2);
INSERT INTO users (user_id, name, "type") VALUES (4, 'Bill', 2);


Then run the query:
SELECT *
FROM
(
SELECT u.user_id, ud.data
FROM users u, userdata ud
WHERE u.user_id = ud.user_id
AND u.type = 1
) subusers
WHERE subusers.data::text::date <  now();

Returns the message: ERROR:  date/time field value out of range:
"052-44-5863"

So my question is how does this query ever even SEE the row containing
"052-44-5863"?  The sub-query doesn't return that row so I don't see
how it can get this error.

Regards,
Collin Peters




Re: [SQL] pg, mysql comparison with "group by" clause

2005-10-12 Thread Greg Stark
Stephan Szabo <[EMAIL PROTECTED]> writes:

> On Tue, 11 Oct 2005, Rick Schumeyer wrote:
> 
> > I'm not sure what I was thinking, but I tried the following query in pg:
> >
> > SELECT * FROM t GROUP BY state;
> >
> > pg returns an error.
> >
> > Mysql, OTOH, returns the first row for each state.  (The first row with
> > "AK", the first row with "PA", etc.)
> >
> > I'm no SQL expert, but it seems to me that the pg behavior is correct, and
> > the mysql result is just weird.  Am I correct?
> 
> In your case, it sounds like the mysql result is wrong. I believe SQL99
> would allow it if the other columns were functionally dependant upon state
> (as there'd by definition only be one value for the other columns per
> group).

I believe this is a documented feature.

MySQL treats "select a,b from t group by a" equivalently to Postgres's 
"select distinct on (a) a,b from t"

I suppose "equivalent" isn't quite true. It's more general since it allows
aggregate functions as well. The equivalently general Postgres syntax is to
have a first() aggregate function and do "select a,first(b) from t group by a".
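A sketch of the two spellings side by side (table and column names are
hypothetical; note that first() is not a built-in aggregate in Postgres
and would have to be created with CREATE AGGREGATE):

-- MySQL-style, not valid in Postgres:
--   SELECT a, b FROM t GROUP BY a;

-- Postgres spelling of "one row per value of a":
SELECT DISTINCT ON (a) a, b FROM t ORDER BY a;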

I'm sure it's very convenient.

-- 
greg




Re: [SQL] regular expression

2005-10-12 Thread Yasir Malik

This isn't a PostgreSQL issue, but rather one of understanding how
regular expressions work in general.  See the previously mentioned
"Pattern Matching" section of the PostgreSQL documentation for the
PostgreSQL-specific details, and use a search engine to find a
regular expression tutorial; they're frequently used in Perl and
other languages so a lot of learning material exists.  If you can
find a copy in a library or bookstore, the book _Mastering Regular
Expressions_ by Jeffrey E. F. Friedl, published by O'Reilly, is a
good resource.

If you have Perl installed, do a 'man perlretut'.  It is quite a wonderful 
tutorial.  That is what I used to learn regular expressions.


Yasir



Re: [SQL] Text->Date conversion in a WHERE clause

2005-10-12 Thread Greg Stark
Collin Peters <[EMAIL PROTECTED]> writes:

> I have a table that has some columns which store 'custom' fields so the
> content varies according to the user that the row belongs to.  For one
> of the groups of users the field is a date (the type of the field is
> 'text' though).  I'm trying to perform a query where it only returns
> values in a certain date range so in the WHERE clause I have
> 
> WHERE cust3 <> ''
> AND cust3::text::timestamp > CURRENT_DATE - interval '1 month'
> 
> This results in the error 'ERROR:  date/time field value out of range:
> "052-44-5863"'.  Now that is obviously not a valid date but there
> is actually more to the where clause and the first part of it excludes
> all rows where the user is not even the correct type, so the row which
> includes the field '052-44-5863' should really not even be checked.

I think you have to use a CASE expression like:

WHERE CASE WHEN user_type = 1
           THEN cust3::timestamp > CURRENT_DATE - interval '1 month'
           ELSE false
      END
  AND ...



There's no advantage to doing this kind of thing though. Good database design
principles dictate having one column for each piece of data. Just leave the
columns for which the data is inappropriate NULL. NULLs take effectively no
space.

-- 
greg




Re: [SQL] pg, mysql comparison with "group by" clause

2005-10-12 Thread Scott Marlowe
On Wed, 2005-10-12 at 16:54, Greg Stark wrote:
> Stephan Szabo <[EMAIL PROTECTED]> writes:
> 
> > On Tue, 11 Oct 2005, Rick Schumeyer wrote:
> > 
> > > I'm not sure what I was thinking, but I tried the following query in pg:
> > >
> > > SELECT * FROM t GROUP BY state;
> > >
> > > pg returns an error.
> > >
> > > Mysql, OTOH, returns the first row for each state.  (The first row with
> > > "AK", the first row with "PA", etc.)
> > >
> > > I'm no SQL expert, but it seems to me that the pg behavior is correct, and
> > > the mysql result is just weird.  Am I correct?
> > 
> > In your case, it sounds like the mysql result is wrong. I believe SQL99
> > would allow it if the other columns were functionally dependant upon state
> > (as there'd by definition only be one value for the other columns per
> > group).
> 
> I believe this is a documented feature.

Hehe.  When I turn on my windshield wipers and my airbag deploys, is it
a documented "feature" if the dealership told me about this behaviour
ahead of time?  In much the same way, while this behaviour may be
documented by MySQL, I can't imagine it really being called a feature. 
But at least this misbehaviour is documented.  However, I think most
people in the MySQL universe just stumble onto it by accident when they
try it and it works.  I'd at least prefer it to throw a warning or
notice or something.

> MySQL treats "select a,b from t group by a" equivalently to Postgres's 
> "select distinct on (a) a,b from t"
> 
> I suppose "equivalent" isn't quite true. It's more general since it allows
> aggregate functions as well. The equivalently general Postgres syntax is to
> have a first() aggregate function and do "select a,first(b) from t group by 
> a".

A Subselect would let you do such a thing as well, and while it's more
complicated to write, it is likely to be easier to tell just what it's
doing.

> I'm sure it's very convenient.

Not always, but I'd rather get the right answer with difficulty than the
wrong one with ease. :)



Re: [SQL] Text->Date conversion in a WHERE clause

2005-10-12 Thread Tom Lane
Collin Peters <[EMAIL PROTECTED]> writes:
> WHERE cust3 <> ''
> AND cust3::text::timestamp > CURRENT_DATE - interval '1 month'

> This results in the error 'ERROR:  date/time field value out of range:
> "052-44-5863"'.  Now that is obviously not a valid date but there
> is actually more to the where clause and the first part of it excludes
> all rows where the user is not even the correct type, so the row which
> includes the field '052-44-5863' should really not even be checked.

Your mistake is in supposing that WHERE clauses are guaranteed to be
evaluated in a particular order.  Such guarantees are made only for a
very few specific constructs such as CASE.  See
http://www.postgresql.org/docs/8.0/static/sql-expressions.html#SYNTAX-EXPRESS-EVAL

regards, tom lane



Re: [SQL] pg, mysql comparison with "group by" clause

2005-10-12 Thread Anthony Molinaro
> Not always, but I'd rather get the right answer with difficulty than
> the wrong one with ease. :)

agreed. 

I made it a point to mention this so called "feature" in my book.

This is a bug they never fixed and they decided to call it a feature.

It is, imo, *ridiculous*.

Regards,
  Anthony

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Scott Marlowe
Sent: Wednesday, October 12, 2005 6:25 PM
To: Greg Stark
Cc: Stephan Szabo; Rick Schumeyer; [email protected]
Subject: Re: [SQL] pg, mysql comparison with "group by" clause

[snip]



Re: [SQL] Text->Date conversion in a WHERE clause

2005-10-12 Thread Tom Lane
[EMAIL PROTECTED] writes:
> SELECT *
> FROM
> (
>   SELECT u.user_id, ud.data
>   FROM users u, userdata ud
>   WHERE u.user_id = ud.user_id
>   AND u.type = 1
> ) subusers
> WHERE subusers.data::text::date <  now();

> So my question is how does this query ever even SEE the row containing
> "052-44-5863"?  The sub-query doesn't return that row so I don't see
> how it can get this error.

BTW, the fallacy in this idea is that the planner pushes WHERE clauses
as far down the plan tree as it can.  EXPLAIN would show you the actual
plan tree, but it's probably along the lines of

Join using u.user_id = ud.user_id
Scan users u where u.type = 1
Scan userdata ud where ud.data::text::date <  now();

If we did not do this, it would pretty much cripple the performance
of queries involving views (since a view is nothing but a macro for a
sub-select).
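For reference, a CASE-guarded form of Collin's query (untested sketch,
using the table and column names from his example; CASE is one of the few
constructs with a guaranteed evaluation order, so the cast is only
attempted for rows of the right type):

SELECT u.user_id, ud.data
FROM users u, userdata ud
WHERE u.user_id = ud.user_id
  AND CASE WHEN u.type = 1
           THEN ud.data::date < now()
           ELSE false
      END;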

regards, tom lane



Re: [SQL] pg, mysql comparison with "group by" clause

2005-10-12 Thread Greg Stark
Scott Marlowe <[EMAIL PROTECTED]> writes:

> Hehe.  When I turn on my windshield wipers and my airbag deploys, is it
> a documented "feature" if the dealership told me about this behaviour
> ahead of time?  

Well it's more like my car where the dashboard dims when I turn on my
headlights which annoys me to no end since I learned to always put my
headlights on even in the day.

> In much the same way, while this behaviour may be documented by MySQL, I
> can't imagine it really being called a feature. But at least this
> misbehaviour is documented. However, I think most people in the MySQL
> universe just stumble onto it by accident when they try it and it works. I'd
> at least prefer it to throw a warning or notice or something.

I don't see why you think people stumble on this by accident. I think it's
actually an extremely common need. So common that Postgres has the same
feature (though less general) and invented a whole syntax to handle it.

I think most MySQL users don't stumble on it, they learn it as the way to
handle the common use case when you join a master table against a detail table
and then want to aggregate all the detail records. In standard SQL you have to
write GROUP BY ... and list every single column you need from the master
table. Forcing the database to do a lot of redundant comparisons and sort on
uselessly long keys where in fact you only really need it to sort and group by
the primary key.

Remember, most MySQL users learn MySQL first, and only later learn what is
standard SQL and what isn't. 

> A Subselect would let you do such a thing as well, and while it's more
> complicated to write, it is likely to be easier to tell just what it's
> doing.

Subselects have their own problems here. Mainly Postgres's optimizer, as good
as it is, doesn't treat them with the same code paths as joins and can't find
all the same plans for them. But in any case you cannot always write a
subselect that's equivalent to an arbitrary join.


-- 
greg




[SQL] Update timestamp on update

2005-10-12 Thread Jeff Williams
I have a table like:

CREATE TABLE products (
id int,
status int,
last_status_change timestamp DEFAULT now()
);

What I would like is that whenever the status is changed the
last_status_change timestamp is updated to the current time. I have had
a look at the rules and what I want would be similar to:

CREATE RULE last_status_change AS ON UPDATE
TO products WHERE NEW.status <> OLD.status
DO UPDATE products SET last_status_change = now() WHERE id = OLD.id;

Except of course that the above is recursive and doesn't work.

How can I do this?

Jeff



Re: [SQL] Update timestamp on update

2005-10-12 Thread Tom Lane
Jeff Williams <[EMAIL PROTECTED]> writes:
> last_status_change timestamp DEFAULT now()

> What I would like is that whenever the status is changed the
> last_status_change timestamp is updated to the current time.

For this you use an ON UPDATE trigger; rules are not a good way to solve
it.  See the documentation about triggers.  The first example on this
page does it along with a few other things:
http://developer.postgresql.org/docs/postgres/plpgsql-trigger.html
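A minimal sketch of such a trigger for Jeff's table (untested; the
function and trigger names are made up, and the quoted-body style is used
since dollar quoting is not available before 8.0):

CREATE OR REPLACE FUNCTION stamp_status_change() RETURNS trigger AS '
BEGIN
    -- only touch the timestamp when status actually changes
    IF NEW.status IS DISTINCT FROM OLD.status THEN
        NEW.last_status_change := now();
    END IF;
    RETURN NEW;
END;
' LANGUAGE plpgsql;

CREATE TRIGGER products_status_stamp
    BEFORE UPDATE ON products
    FOR EACH ROW EXECUTE PROCEDURE stamp_status_change();

Being a BEFORE trigger, it modifies the row in place rather than issuing a
second UPDATE, so it avoids the recursion problem of the rule approach.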

regards, tom lane



Re: [SQL] SEVEN cross joins?!?!?

2005-10-12 Thread Frank Bax

At 09:00 AM 10/12/05, Daryl Richter wrote:


Richard Huxton wrote:

[snip]


I'll second this recommendation.  The OP is trying to drive a nail with a 
screwdriver.  He needs a program, not a query.



Richard, you've summed it up nicely.

Splitting locations into subsets (like 2,2,3) doesn't work because it is 
possible that low values in one location can be offset by high values in 
another location, and still result in an excellent combo.


The good news is these suggestions got me thinking outside the box.  I 
think I can program a modified brute-force that bypasses large numbers of 
combos early.  It might still be too large/slow, so I'd be interested in 
finding more info about these "smarter algorithms" in option 2.  Where do I 
look?


Greg: my son's the gamer; I'm just trying to help him out.




Re: [SQL] pg, mysql comparison with "group by" clause

2005-10-12 Thread Tom Lane
Greg Stark <[EMAIL PROTECTED]> writes:
> I think most MySQL users don't stumble on it, they learn it as the way
> to handle the common use case when you join a master table against a
> detail table and then want to aggregate all the detail records. In
> standard SQL you have to write GROUP BY ... and list every single
> column you need from the master table. Forcing the database to do a
> lot of redundant comparisons and sort on uselessly long keys where in
> fact you only really need it to sort and group by the primary key.

Actually, if you're grouping by a table's primary key, the SQL99 spec
says you shouldn't have to explicitly list the other columns from that
table --- they are guaranteed to have unique values per group anyway.
This is a single case in the "functional dependency" stuff.  That
verbiage is incredibly dense and I don't think we want to tackle all of
it any time soon, but the primary-key case probably wouldn't be very
hard to implement.  We really ought to have this in TODO ... I'm sure
it's been discussed before.

regards, tom lane



Re: [SQL] SEVEN cross joins?!?!?

2005-10-12 Thread Stewart Ben (RBAU/EQS4) *

> Splitting locations into subsets (like 2,2,3) doesn't work
> because it is possible that low values in one location can be offset
> by high values in another location, and still result in an excellent
> combo.
>
> The good news is these suggestions got me thinking outside
> the box.  I think I can program a modified brute-force that bypasses
> large numbers of combos early.  It might still be too large/slow, so
> I'd be interested in finding more info about these "smarter
> algorithms" in option 2.

Could you define a view with a calculated field, say, 2 * a1 + 6 * a2 +
3 * a3, and then use this aggregate to score the individual rows? I
haven't looked at the exact nature of the problem, but if you're
multiplying a table by itself, it may be worth redefining the problem in
terms of a simple ranking algorithm, define a column to calculate this,
then sort by that column.
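
E.g., something along these lines (table name, columns and weights
invented purely for illustration):

```sql
-- Score each row once in a view...
CREATE VIEW scored_items AS
  SELECT id, 2 * a1 + 6 * a2 + 3 * a3 AS score
    FROM items;

-- ...then rank candidates by the precomputed score.
SELECT id, score
  FROM scored_items
 ORDER BY score DESC
 LIMIT 10;
```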

Best regards,

Ben Stewart

--
Robert Bosch (Australia) Pty. Ltd.
Engineering Quality Services, Student Software Engineer (RBAU/EQS4)
Locked Bag 66 - Clayton South, VIC 3169 - AUSTRALIA
Tel: +61 3 9541-7002 Fax: +61 3 9541-7700
mailto:[EMAIL PROTECTED]
http://www.bosch.com.au/ 

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [SQL] Update timestamp on update

2005-10-12 Thread Jeff Williams
Tom Lane wrote:

>Jeff Williams <[EMAIL PROTECTED]> writes:
>  
>
>>last_status_change timestamp DEFAULT now()
>>
>>
>
>  
>
>>What I would like is that whenever the status is changed the
>>last_status_change timestamp is updated to the current time.
>>
>>
>
>For this you use an ON UPDATE trigger; rules are not a good way to solve
>it.  See the documentation about triggers.  The first example on this
>page does it along with a few other things:
>http://developer.postgresql.org/docs/postgres/plpgsql-trigger.html
>  
>
Thanks. Triggers were my first thought, but chapter 35 on Triggers didn't
really indicate a way I could do this easily, and scared me with a lot of
C code. Maybe it would be a good idea to present some of the more common
things you would want to do with triggers in the triggers chapter?
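
For the record, the sort of thing I was after looks roughly like this
(table name invented, and this is an untested sketch):

```sql
CREATE FUNCTION stamp_status_change() RETURNS trigger AS '
BEGIN
    -- only touch the timestamp when status actually changed
    -- (note: a <> comparison ignores NULL statuses)
    IF NEW.status <> OLD.status THEN
        NEW.last_status_change := now();
    END IF;
    RETURN NEW;
END;
' LANGUAGE plpgsql;

CREATE TRIGGER t_stamp_status BEFORE UPDATE ON mytable
    FOR EACH ROW EXECUTE PROCEDURE stamp_status_change();
```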

Jeff

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [SQL] Update timestamp on update

2005-10-12 Thread Tom Lane
Jeff Williams <[EMAIL PROTECTED]> writes:
> Thanks. Triggers was my first thought, but chapter 35 on Triggers didn't
> really indicate a way I could do this easily and scared me with a lot of
> c code.

Yeah.  This is a documentation issue that's bothered me for awhile.
The problem is that we treat the PL languages as add-ons and therefore
the documentation of the "core" system shouldn't rely on them ... but
that leaves us presenting C-code triggers as the only examples in
chapter 35.  There is a paragraph in there suggesting you go look at
the PL languages first, but obviously it's not getting the job done.

Anybody have a better idea?

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [SQL] Update timestamp on update

2005-10-12 Thread Stewart Ben (RBAU/EQS4) *

> The problem is that we treat the PL languages as add-ons and therefore
> the documentation of the "core" system shouldn't rely on them ... but
> that leaves us presenting C-code triggers as the only examples in
> chapter 35.  There is a paragraph in there suggesting you go look at
> the PL languages first, but obviously it's not getting the job done.
> 
> Anybody have a better idea?

Include a sample trigger for each PL that is shipped with the standard
distribution, along with comments to the effect that "it is suggested
that the documentation for $PL is read before creating triggers"?

It may also be worth including a comment that explicitly states that
all(?) PLs are supported in triggers.

Best regards,

Ben Stewart

--
Robert Bosch (Australia) Pty. Ltd.
Engineering Quality Services, Student Software Engineer (RBAU/EQS4)
Locked Bag 66 - Clayton South, VIC 3169 - AUSTRALIA
mailto:[EMAIL PROTECTED]
http://www.bosch.com.au/ 

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [SQL] pg, mysql comparison with "group by" clause

2005-10-12 Thread Anthony Molinaro
> I don't see why you think people stumble on this by accident. I
> think it's actually an extremely common need.

I can't imagine how that's a common need at all.

It makes no sense.

When you add an additional column in the select, it must be included in
the group by as it changes the meaning of the query.

Consider:

select deptno, count(*) from emp group by deptno;

DEPTNO  COUNT(*)
------  --------
    10         3
    20         5
    30         6


the query above counts the number of employees in each department.


Now consider the following query:

select deptno,job,count(*) from emp group by deptno,job;

DEPTNO  JOB        COUNT(*)
------  ---------  --------
    10  CLERK             1
    10  MANAGER           1
    10  PRESIDENT         1
    20  CLERK             2
    20  ANALYST           2
    20  MANAGER           1
    30  CLERK             1
    30  MANAGER           1
    30  SALESMAN          4


the query above counts the number of different job types in each
department.

In mysql, you would be allowed to do the following:

select deptno,job,count(*) from emp group by deptno;

but it makes no sense. What value would it return, and what does it mean?
How can that possibly represent reliable data? What would the result set
above look like? It would be meaningless in a production system,
honestly.


It's a silly bug that mysql has touted as a feature,
and I can't imagine why people think it's useful.


> I think most MySQL users don't stumble on it, they learn it as the
> way to handle the common use case when you join a master table
> against a detail table and then want to aggregate all the detail
> records.

Huh? I don't follow that at all.
Perhaps you're confusing the concept with window functions, which neither
pg nor mysql have implemented yet?
For example, using window functions allows you to return aggregate and
detailed info simultaneously:

select ename,
   deptno,
   job,
   count(*)over(partition by deptno) as emps_per_dept,
   count(*)over(partition by deptno,job) as job_per_dept,
   count(*)over() as total
   from emp

ENAME   DEPTNO  JOB        EMPS_PER_DEPT  JOB_PER_DEPT  TOTAL
------  ------  ---------  -------------  ------------  -----
MILLER      10  CLERK                  3             1     14
CLARK       10  MANAGER                3             1     14
KING        10  PRESIDENT              3             1     14
SCOTT       20  ANALYST                5             2     14
FORD        20  ANALYST                5             2     14
SMITH       20  CLERK                  5             2     14
ADAMS       20  CLERK                  5             2     14
JONES       20  MANAGER                5             1     14
JAMES       30  CLERK                  6             1     14
BLAKE       30  MANAGER                6             1     14
ALLEN       30  SALESMAN               6             4     14
MARTIN      30  SALESMAN               6             4     14
TURNER      30  SALESMAN               6             4     14
WARD        30  SALESMAN               6             4     14


But this is not a group by, this is aggregating and windowing, 
which is quite different from mysql adding that nasty little bug 
and calling it a feature.

 - a

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Greg Stark
Sent: Wednesday, October 12, 2005 9:13 PM
To: Scott Marlowe
Cc: Greg Stark; Stephan Szabo; Rick Schumeyer; [email protected]
Subject: Re: [SQL] pg, mysql comparison with "group by" clause

Scott Marlowe <[EMAIL PROTECTED]> writes:

> Hehe.  When I turn on my windshield wipers and my airbag deploys, is
it
> a documented "feature" if the dealership told me about this behaviour
> ahead of time?  

Well it's more like my car where the dashboard dims when I turn on my
headlights which annoys me to no end since I learned to always put my
headlights on even in the day.

> In much the same way, while this behaviour may be documented by
> MySQL, I can't imagine it really being called a feature. But at least
> this misbehaviour is documented. However, I think most people in the
> MySQL universe just stumble onto it by accident when they try it and
> it works. I'd at least prefer it to throw a warning or notice or
> something.

I don't see why you think people stumble on this by accident. I think
it's actually an extremely common need. So common that Postgres has the
same feature (though less general) and invented a whole syntax to
handle it.

I think most MySQL users don't stumble on it, they learn it as the way
to handle the common use case when you join a master table against a
detail table and then want to aggregate all the detail records. In
standard SQL you have to write GROUP BY ... and list every single
column you need from the master table. Forcing the database to do a lot
of redundant comparisons and sort on uselessly long keys where in fact
you only really need it to sort and group by the primary key.

Remember, m

Re: [SQL] Update timestamp on update

2005-10-12 Thread Michael Glaesemann


On Oct 13, 2005, at 11:52 , Tom Lane wrote:


Jeff Williams <[EMAIL PROTECTED]> writes:

Thanks. Triggers was my first thought, but chapter 35 on Triggers
didn't really indicate a way I could do this easily and scared me with
a lot of C code.



Yeah.  This is a documentation issue that's bothered me for awhile.
The problem is that we treat the PL languages as add-ons and therefore
the documentation of the "core" system shouldn't rely on them ... but
that leaves us presenting C-code triggers as the only examples in
chapter 35.  There is a paragraph in there suggesting you go look at
the PL languages first, but obviously it's not getting the job done.


I think examples in PL/pgSQL would perhaps be most appropriate, as it's
a PostgreSQL-specific language, arguably easy to read, and a PL that
doesn't rely on any external dependencies, so it should be relatively
easy for people to install it if they want. (IIRC, there's even been
discussion in the past of whether or not PL/pgSQL should be installed
by default.)



Michael Glaesemann
grzm myrealbox com




---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [SQL] pg, mysql comparison with "group by" clause

2005-10-12 Thread Anthony Molinaro
Tom,
  Don't do it man...
  It's not the same. The standard has it wrong (that, or I
  misunderstood your explanation). But this statement:

> you shouldn't have to explicitly list the other columns from that
> table --- they are guaranteed to have unique values per group anyway

...is way off.
By changing the values in the select/group by you are changing the
group! How can you arbitrarily add or exclude a column?
You can't do it.

Look, all it takes is one example to prove the standard wrong...



Consider this simple example (in oracle, but the same in pg):


create table foo (id number primary key, name varchar2(10));

insert into foo values (1,'joe');

insert into foo values (2,'joe');

insert into foo values (3,'jim');


select * from foo;

ID NAME
-- --
 1 joe
 2 joe
 3 jim


ok, let's get the count by ID (which is unique):

select id, count(*) from foo group by id;

ID   COUNT(*)
-- --
 1  1
 2  1
 3  1


makes sense...

the values for ID, 1,2,3 are our groups
and the count aggregate op shows that there's
one member per group.


now let's get the count by name:

select name, count(*) from foo group by name;

NAME COUNT(*)
-- --
jim 1
joe 2

so far so good. The queries and results are representative of the actual
data.  The values for name, "jim" and "joe" are our groups
and the count aggregate op shows that there's 1 member in the "jim"
group and two members in the "joe" group.


But, as soon as we add id to the group by...

select name, count(*) from foo group by name,id;

NAME COUNT(*)
-- --
jim 1
joe 1
joe 1

it changes the query! Even though id is unique, it changes the query.
The group, by definition (it's in the group by), is no longer
name or id, it's both name and id.
How can you simply remove id? Which result set should it return,
the first or the second? Makes no sense...

If it ain't obvious why, simply plug id into the select:

select id, name, count(*) from foo group by name,id;

ID NAME COUNT(*)
--- -- --
  3 jim 1
  1 joe 1
  2 joe 1



so, how can it be that you don't have to list the other columns
(in this case "name")? It makes no sense, because if you remove
either column from the group by (id or name) it changes the meaning of
the query.
The way you guys do it now is correct. Mysql has it wrong.
And if the standard states that you can leave out the pk from a group
by, they are wrong too, as the simple examples above prove.

Fyi, Oracle just bought innodb, so I'd not be too concerned with mysql
and their so-called "features" anyway.


Regards,
  Anthony

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Tom Lane
Sent: Wednesday, October 12, 2005 9:51 PM
To: Greg Stark
Cc: Scott Marlowe; Stephan Szabo; Rick Schumeyer;
[email protected]
Subject: Re: [SQL] pg, mysql comparison with "group by" clause 

Greg Stark <[EMAIL PROTECTED]> writes:
> I think most MySQL users don't stumble on it, they learn it as the way
> to handle the common use case when you join a master table against a
> detail table and then want to aggregate all the detail records. In
> standard SQL you have to write GROUP BY ... and list every single
> column you need from the master table. Forcing the database to do a
> lot of redundant comparisons and sort on uselessly long keys where in
> fact you only really need it to sort and group by the primary key.

Actually, if you're grouping by a table's primary key, the SQL99 spec
says you shouldn't have to explicitly list the other columns from that
table --- they are guaranteed to have unique values per group anyway.
This is a special case of the "functional dependency" stuff.  That
verbiage is incredibly dense and I don't think we want to tackle all of
it any time soon, but the primary-key case probably wouldn't be very
hard to implement.  We really ought to have this in TODO ... I'm sure
it's been discussed before.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [SQL] pg, mysql comparison with "group by" clause

2005-10-12 Thread Greg Stark
"Anthony Molinaro" <[EMAIL PROTECTED]> writes:

> By changing the values in the select/group by you are changing 
> Group! How can you arbitrarily add or exclude a column?
> You can't do it.

Go back and reread the previous posts again. You missed the whole point.

-- 
greg


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [SQL] pg, mysql comparison with "group by" clause

2005-10-12 Thread Anthony Molinaro
Greg,
  You'll have to pardon me...
 
I saw this comment:

"I don't see why you think people stumble on this by accident. 
I think it's actually an extremely common need."

Which, if referring to the ability to have items in the select that do
not need to be included in the group by (excluding constants and the
like), is just silly.

OTOH, if you're all poking fun at a mysql bug that they try to pass
off as a feature, then yes, I'm a clod and I missed that the first time
around :)

Regards,
  Anthony

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, October 12, 2005 11:43 PM
To: Anthony Molinaro
Cc: Tom Lane; Greg Stark; Scott Marlowe; Stephan Szabo; Rick Schumeyer;
[email protected]
Subject: Re: [SQL] pg, mysql comparison with "group by" clause

"Anthony Molinaro" <[EMAIL PROTECTED]> writes:

> By changing the values in the select/group by you are changing 
> Group! How can you arbitrarily add or exclude a column?
> You can't do it.

Go back and reread the previous posts again. You missed the whole point.

-- 
greg


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [SQL] pg, mysql comparison with "group by" clause

2005-10-12 Thread Greg Stark
"Anthony Molinaro" <[EMAIL PROTECTED]> writes:

> Greg,
>   You'll have to pardon me...
>  
> I saw this comment:
> 
> "I don't see why you think people stumble on this by accident. 
> I think it's actually an extremely common need."
> 
> Which, if referring to the ability to have items in the select that do not
> need to be included in the group, (excluding constants and the like) is just
> silly.

Well the "constants and the like" are precisely the point. There are plenty of
cases where adding the column to the GROUP BY is unnecessary and since
Postgres makes no attempt to prune them out, inefficient. And constants aren't
the only such case. The most common case is columns that are coming from a
table where the primary key is already included in the GROUP BY list.

In the case of columns coming from a table where the primary key is already in
the GROUP BY list it's possible for the database to deduce that it's
unnecessary to group on that column. 

But it's also possible to have cases where the programmer has out of band
knowledge that it's unnecessary but the database doesn't have that knowledge.
The most obvious case that comes to mind is a denormalized data model that
includes a redundant column.

  select dept_id, dept_name, count(*) from employee_list

For example if dept_name is guaranteed to be the same for all records with the
same dept_id. Of course that's generally considered poor design but it doesn't
mean there aren't thousands of databases out there with data models like that.
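
For reference, the strictly standard spelling of that query has to list
the redundant column anyway, e.g.:

```sql
-- Standard SQL: dept_name must appear in GROUP BY even though it is
-- functionally dependent on dept_id in this (denormalized) model.
select dept_id, dept_name, count(*)
  from employee_list
 group by dept_id, dept_name;
```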

-- 
greg


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[SQL] UPDATE Trigger on multiple tables

2005-10-12 Thread Ferindo Middleton Jr
Is it possible to have a single trigger on multiple tables 
simultaneously? Example:


CREATE TRIGGER emp_cust_stamp BEFORE INSERT OR UPDATE ON employees, customers
   FOR EACH ROW EXECUTE PROCEDURE last_updated_stamp();

I tried something like the above but get an error message at the comma. I tried using the keyword AND as well. 
I couldn't find anything on this in the docs. I have many different tables in my databases which all have a 
"last_updated" field and it seems less tedious to be able to enforce updating this trigger database-wide using 
just one trigger. Is it possible?


Ferindo



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [SQL] UPDATE Trigger on multiple tables

2005-10-12 Thread Tom Lane
Ferindo Middleton Jr <[EMAIL PROTECTED]> writes:
> Is it possible to have a single trigger on multiple tables 
> simultaneously? Example:

> CREATE TRIGGER emp_cust_stamp BEFORE INSERT OR UPDATE ON employees, customers
> FOR EACH ROW EXECUTE PROCEDURE last_updated_stamp();

No.  You can use the same function for multiple triggers, but you have
to CREATE TRIGGER for each table separately.
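
That is, something like this (trigger names invented, function name from
your example):

```sql
-- one trigger per table, both calling the same function
CREATE TRIGGER emp_stamp BEFORE INSERT OR UPDATE ON employees
    FOR EACH ROW EXECUTE PROCEDURE last_updated_stamp();

CREATE TRIGGER cust_stamp BEFORE INSERT OR UPDATE ON customers
    FOR EACH ROW EXECUTE PROCEDURE last_updated_stamp();
```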

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [SQL] pg, mysql comparison with "group by" clause

2005-10-12 Thread Anthony Molinaro
Greg,

Ok, I think I see where you're going (I don't agree, but I think
I get you now).

So, using your example of: 
"dept_name is guaranteed to be the same for all records with the
same dept_id."

Here:

select d.deptno,d.dname 
  from emp  e, dept d 
 where e.deptno=d.deptno

DEPTNO DNAME
-- --
10 ACCOUNTING
10 ACCOUNTING
10 ACCOUNTING
20 RESEARCH
20 RESEARCH
20 RESEARCH
20 RESEARCH
20 RESEARCH
30 SALES
30 SALES
30 SALES
30 SALES
30 SALES
30 SALES


 ok, so there's your scenario.

And you're suggesting that one should be able to
do the following query?

select d.deptno,d.dname,count(*) 
  from emp  e, dept d 
 where e.deptno=d.deptno


if that's what you suggest, then we'll just have to agree to disagree.

That query needs a group by. What you're suggesting is, imo,
a wholly unnecessary shortcut (almost as bad as that ridiculous "natural
join" - whoever came up with that should be tarred and feathered).

I think I see your point now, I just disagree.
You're depending on syntax to work based on data integrity?
Hmmm, don't think I like that idea...
 
What performance improvement do you get from leaving group by out?
Look at the query above: doesn't a count of distinct deptno,dname pairs
have to take place anyway? What do you save by excluding group by?
Are you suggesting COUNT be computed for each row (windowed) or that
COUNT is computed for each group?

If you want repeating rows, then you want windowing. 
For example:

select d.deptno,d.dname,count(*)over(partition by d.deptno,d.dname) cnt 
  from emp  e, dept d 
 where e.deptno=d.deptno

DEPTNO DNAME  CNT
-- -- ---
10 ACCOUNTING   3
10 ACCOUNTING   3
10 ACCOUNTING   3
20 RESEARCH 5
20 RESEARCH 5
20 RESEARCH 5
20 RESEARCH 5
20 RESEARCH 5
30 SALES6
30 SALES6
30 SALES6
30 SALES6
30 SALES6
30 SALES6



if you want "groups", then use group by:

select d.deptno,d.dname,count(*) cnt 
  from emp  e, dept d
 where e.deptno=d.deptno 
 group by d.deptno,d.dname

DEPTNO DNAME  CNT
-- -- ---
10 ACCOUNTING   3
20 RESEARCH 5
30 SALES6


what you're suggesting doesn't seem to fit in at all,
particularly when pg implements window functions.

If you're suggesting the pg optimizer isn't doing the right thing
with group by queries, then this is an optimizer issue and 
that should be hacked, not group by. If you're suggesting certain
rows be ditched or shortcuts be taken, then the optimizer should do
that, not the programmer writing sql.

Db2 and oracle have no problem doing these queries, I don't see
why pg should have a problem.

imo, the only items that should not be listed in the group by
are:

1. constants and deterministic functions
2. scalar subqueries
3. window functions

1 - because the value is same for each row
2&3 - because they are evaluated after the grouping takes place

regards,
  Anthony

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
Sent: Thursday, October 13, 2005 12:25 AM
To: Anthony Molinaro
Cc: [EMAIL PROTECTED]; Tom Lane; Scott Marlowe; Stephan Szabo; Rick
Schumeyer; [email protected]
Subject: Re: [SQL] pg, mysql comparison with "group by" clause

"Anthony Molinaro" <[EMAIL PROTECTED]> writes:

> Greg,
>   You'll have to pardon me...
>  
> I saw this comment:
> 
> "I don't see why you think people stumble on this by accident. 
> I think it's actually an extremely common need."
> 
> Which, if referring to the ability to have items in the select that
> do not need to be included in the group by (excluding constants and
> the like), is just silly.

Well the "constants and the like" are precisely the point. There are
plenty of cases where adding the column to the GROUP BY is unnecessary
and since Postgres makes no attempt to prune them out, inefficient. And
constants aren't the only such case. The most common case is columns
that are coming from a table where the primary key is already included
in the GROUP BY list.

In the case of columns coming from a table where the primary key is
already in the GROUP BY list it's possible for the database to deduce
that it's unnecessary to group on that column.

But it's also possible to have cases where the programmer has out of
band knowledge that it's unnecessary but the database doesn't have that
knowledge. The most obvious case that comes to mind is a denormalized
data model that includes a redundant column.

  select dept_id, dept_name, count(*) from employee_list

For example if dept_name is guaranteed to be the same for all records
with the same dept_id. Of course that's generally considered poor
design but it doesn't mean there aren't thousands of databases out
there with data models like that.

-- 
greg


---(end of broad