Re: [SQL] Aggregates with NaN values

2008-12-05 Thread Mark Roberts

On Thu, 2008-12-04 at 13:01 -0500, Sean Davis wrote:
> I am happy to see NaN and infinity handled in input.  I would now like
> to compute aggregates (avg, min, max, etc) on columns with NaN values
> in them.  The standard behavior (it appears) is to have the aggregate
> return NaN if the data contain one-or-more NaN values.  I am used to
> using coalesce with NULL values, but that doesn't work with NaN.  I
> can deal with these using CASE statuement to assign a value, but is
> there a standard way of dealing with the NaN (or Infinity, for that
> matter) cases to get a behvavior where they are "ignored" by an
> aggregate?
> 
> Thanks,
> Sean
> 

Have you considered using a where clause?

-Mark


-- 
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


[SQL] Best way to "and" from a one-to-many joined table?

2008-12-05 Thread Bryce Nesbitt




Dear Experts,

I'm looking for a good technique to do "and" searches on one-to-many
joined tables.  For example, to find people with both 'dark hair' and
'president':

# select * from test_people join test_attributes using (people_id);
+---+-+---+
| people_id | person_name |   attribute   |
+---+-+---+
|    10 | Satan   | The Devil |
| 9 | Santa   | Imaginary |
| 8 | Obamba  | Dark Hair |
| 8 | Obamba  | Dark Hair |
| 8 | Obamba  | USA President |
|    10 | Satan   | Dark Hair |
+---+-+---+

# select person_name from test_people where people_id in
(select people_id from test_attributes where attribute='USA President' 

 INTERSECT
 select people_id from test_attributes where attribute='Dark Hair');

# select person_name from test_people
where people_id in
(select people_id from test_attributes where attribute='USA President')
and people_id in
(select people_id from test_attributes where attribute='Dark Hair');

# select people_id,count(*) as count from test_people
join test_attributes using (people_id)
where attribute='Dark Hair' or attribute='USA President'
group by people_id having count(*) >= 2;


A postgres specific solution is OK, but SQL92 is better.  I had the
"in" solution recommended to me, but it's performing dramatically
poorly on huge tables.

Thanks for any references to a solution!  -Bryce




Re: [SQL] Aggregates with NaN values

2008-12-05 Thread Sean Davis
On Fri, Dec 5, 2008 at 1:51 PM, Mark Roberts
<[EMAIL PROTECTED]> wrote:
>
> On Thu, 2008-12-04 at 13:01 -0500, Sean Davis wrote:
>> I am happy to see NaN and infinity handled in input.  I would now like
>> to compute aggregates (avg, min, max, etc) on columns with NaN values
>> in them.  The standard behavior (it appears) is to have the aggregate
>> return NaN if the data contain one-or-more NaN values.  I am used to
>> using coalesce with NULL values, but that doesn't work with NaN.  I
>> can deal with these using CASE statuement to assign a value, but is
>> there a standard way of dealing with the NaN (or Infinity, for that
>> matter) cases to get a behvavior where they are "ignored" by an
>> aggregate?
>>
>> Thanks,
>> Sean
>>
>
> Have you considered using a where clause?

Thanks, Mark.  Yes.  I have about 20 columns over which I want to
simultaneously compute aggregates.  Each has NaN's in different rows,
so a where clause won't do what I need.

The CASE statement approach works fine, though.

Sean

-- 
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] Best way to "and" from a one-to-many joined table?

2008-12-05 Thread Oliveiros Cristina
Howdy, Bryce

Could you please try this out and tell me if it gave what you want.

Best,
Oliveiros

SELECT person_name
FROM test_people p
JOIN test_attributes a
ON ((a.people_id = p.people_id) AND (a."attribute" = @firstAttr))
JOIN test_attributes b
ON ((b."people_id" = p."people_id") AND (b."attribute" = @secondAttr));

  - Original Message - 
  From: Bryce Nesbitt 
  To: sql pgsql 
  Sent: Friday, December 05, 2008 6:55 PM
  Subject: [SQL] Best way to "and" from a one-to-many joined table?


  Dear Experts,

  I'm looking for a good technique to do "and" searches on one-to-many joined 
tables.  For example, to find people with both 'dark hair' and 'president':

  # select * from test_people join test_attributes using (people_id);
  +---+-+---+
  | people_id | person_name |   attribute   |
  +---+-+---+
  |10 | Satan   | The Devil |
  | 9 | Santa   | Imaginary |
  | 8 | Obamba  | Dark Hair |
  | 8 | Obamba  | Dark Hair |
  | 8 | Obamba  | USA President |
  |10 | Satan   | Dark Hair |
  +---+-+---+

  # select person_name from test_people where people_id in
  (select people_id from test_attributes where attribute='USA President'  
   INTERSECT
   select people_id from test_attributes where attribute='Dark Hair');

  # select person_name from test_people
  where people_id in
  (select people_id from test_attributes where attribute='USA President')
  and people_id in
  (select people_id from test_attributes where attribute='Dark Hair');

  # select people_id,count(*) as count from test_people
  join test_attributes using (people_id)
  where attribute='Dark Hair' or attribute='USA President'
  group by people_id having count(*) >= 2;


  A postgres specific solution is OK, but SQL92 is better.  I had the "in" 
solution recommended to me, but it's performing dramatically poorly on huge 
tables.

  Thanks for any references to a solution!  -Bryce


Re: [SQL] Best way to "and" from a one-to-many joined table?

2008-12-05 Thread Milan Oparnica

Hi,

This is how I do it, and it runs fast:

select p.*
from test_people p inner join test_attributes a on p.people_id = 
a.people_id

where a."attribute" = @firstAttr or a."attribute" = @secondAttr

If you have many attributes to search for you can replace the where part 
with


where a."attribute" in (@firstAttr,@secondAttr,...)

For best results, you can index the field "attribute" on test_attributes 
table. Be aware of case sensitivity of PG text search.


Best regards,

Milan Oparnica


Oliveiros Cristina wrote:

Howdy, Bryce
 
Could you please try this out and tell me if it gave what you want.
 
Best,

Oliveiros
 
SELECT person_name

FROM test_people p
JOIN test_attributes a
ON ((a.people_id = p.people_id) AND (a."attribute" = @firstAttr))
JOIN test_attributes b
ON ((b."people_id" = p."people_id") AND (b."attribute" = @secondAttr));

- Original Message -
*From:* Bryce Nesbitt 
*To:* sql pgsql 
*Sent:* Friday, December 05, 2008 6:55 PM
*Subject:* [SQL] Best way to "and" from a one-to-many joined table?

Dear Experts,

I'm looking for a good technique to do "and" searches on one-to-many
joined tables.  For example, to find people with both 'dark hair'
and 'president':

# select * from test_people join test_attributes using (people_id);
+---+-+---+
| people_id | person_name |   attribute   |
+---+-+---+
|10 | Satan   | The Devil |
| 9 | Santa   | Imaginary |
| 8 | Obamba  | Dark Hair |
| 8 | Obamba  | Dark Hair |
| 8 | Obamba  | USA President |
|10 | Satan   | Dark Hair |
+---+-+---+

# select person_name from test_people where people_id in
(select people_id from test_attributes where attribute='USA President' 
 INTERSECT

 select people_id from test_attributes where attribute='Dark Hair');

# select person_name from test_people
where people_id in
(select people_id from test_attributes where attribute='USA President')
and people_id in
(select people_id from test_attributes where attribute='Dark Hair');

# select people_id,count(*) as count from test_people
join test_attributes using (people_id)
where attribute='Dark Hair' or attribute='USA President'
group by people_id having count(*) >= 2;


A postgres specific solution is OK, but SQL92 is better.  I had the
"in" solution recommended to me, but it's performing dramatically
poorly on huge tables.

Thanks for any references to a solution!  -Bryce


--
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] Best way to "and" from a one-to-many joined table?

2008-12-05 Thread Bryce Nesbitt




It works (with a DISTINCT clause added because of the duplicated row
for Obama).  It has a nice clean looking explain plan.  It has the
slowest execution time on this sample table (though that might not mean
anything).

SELECT
DISTINCT
person_name
FROM test_people p
JOIN test_attributes a
ON ((a.people_id = p.people_id) AND (a."attribute" = 'Dark Hair'))
JOIN test_attributes b
ON ((b."people_id" = p."people_id") AND (b."attribute" = 'USA
President'));

Here's the full test table

$ pg_dump --table=test_people --table=test_attributes -p 5433 -i
CREATE TABLE test_attributes (
    people_id integer,
    attribute text
);
COPY test_attributes (people_id, attribute) FROM stdin;
10    The Devil
9    Imaginary
8    Dark Hair
8    Dark Hair
8    USA President
10    Dark Hair
\.

CREATE TABLE test_people (
    people_id integer DEFAULT nextval('test_sequence'::regclass) NOT
NULL,
    person_name text
);
COPY test_people (people_id, person_name) FROM stdin;
8    Obamba
9    Santa
10    Satan
\.


Oliveiros Cristina wrote:

  
  
  
  Howdy, Bryce
  Could you please try this out and
tell me if it gave what you want.
  Best,
  Oliveiros
   
  SELECT person_name
FROM test_people p
JOIN test_attributes a
ON ((a.people_id = p.people_id) AND (a."attribute" = @firstAttr))
JOIN test_attributes b
ON ((b."people_id" = p."people_id") AND (b."attribute" = @secondAttr));







Re: [SQL] Best way to "and" from a one-to-many joined table?

2008-12-05 Thread Oliveiros Cristina
Hello, Bryce.
It wasn't supposed to output duplicates.

I have assumed that on the test_attributes u didn't have duplicate records,
i.e.,
you didn't have the same pair (people_id, attribute) more than once... But
it seems you do...
And Hence the duplicate row for Obama .
Why is that?
One person can have exactly the same attribute twice?? :-)

On the execution speed, I do declare that query optimization is an area
I know very little about (just to avoid  saying that i know nothing :p ) ,
maybe someone
with more knowledge than me can help you better, but from my
own experience, not just with postgres, but also with other sgbd ,
I can tell that subqueries of the kind WHERE x in (SELECT ... )
have the tendency to be slow, that's why I tried to provide you
a solution with the JOINs


Best,
Oliveiros


2008/12/5 Bryce Nesbitt <[EMAIL PROTECTED]>

>  It works (with a DISTINCT clause added because of the duplicated row for
> Obama).  It has a nice clean looking explain plan.  It has the slowest
> execution time on this sample table (though that might not mean anything).
>
> SELECT
> DISTINCT
> person_name
> FROM test_people p
> JOIN test_attributes a
> ON ((a.people_id = p.people_id) AND (a."attribute" = 'Dark Hair'))
> JOIN test_attributes b
> ON ((b."people_id" = p."people_id") AND (b."attribute" = 'USA President'));
>
> Here's the full test table
>
> $ pg_dump --table=test_people --table=test_attributes -p 5433 -i
> CREATE TABLE test_attributes (
> people_id integer,
> attribute text
> );
> COPY test_attributes (people_id, attribute) FROM stdin;
> 10The Devil
> 9Imaginary
> 8Dark Hair
> 8Dark Hair
> 8USA President
> 10Dark Hair
> \.
>
> CREATE TABLE test_people (
> people_id integer DEFAULT nextval('test_sequence'::regclass) NOT NULL,
> person_name text
> );
> COPY test_people (people_id, person_name) FROM stdin;
> 8Obamba
> 9Santa
> 10Satan
> \.
>
>
> Oliveiros Cristina wrote:
>
> Howdy, Bryce
> Could you please try this out and tell me if it gave what you want.
> Best,
> Oliveiros
>
> SELECT person_name
> FROM test_people p
> JOIN test_attributes a
> ON ((a.people_id = p.people_id) AND (a."attribute" = @firstAttr))
> JOIN test_attributes b
> ON ((b."people_id" = p."people_id") AND (b."attribute" = @secondAttr));
>
>
>
>