Re: [SQL] full join in view

2003-01-15 Thread Tambet Matiisen


> 
> If you need only not null-kdt_id and dor_id, then just change 
> your joins
> into inner joins.
> 

The whole idea is to show expected expenses (in "koostude_detailid") even if they were 
actually not spent (the same material is not listed in "dokumentide_read"). And also 
show actual expenses (in "dokumentide_read"), even if we did not expect them (the same 
material is not listed in "koostude_detailid"). Uh, my english is bit rough, but hope 
you get the idea. 

Anyway, that's why full join seemed exactly the right thing. As I understand now, the 
reason why my original query does not use indexes, is because of sub-query, not full 
join. And I think I understood the problem of exposing the right field in sub-query, 
but my query doesn't seem to have the same problem.

> If you are sure, that you will get only not-null results, you 
> don't need
> to include koostud and marerjalid.
> 

The problem is, that both kdt_kst_id and dor_kst_id can be null, but they never are at 
the same time. It's not correct to expose either of them as kst_id. That's why my 
original query used coalesce to get kst_id, which is always not null. But using 
coalesce field for filtering of course disabled indexes. Including koostud table in 
query was good idea, because now I have kst_id, which is always not null. 

I was not able to eliminate "materjalid" from my query, because that would have forced 
me to use full join between "koostude_detailid" and "dokumentide_read" again. Which is 
not automatically bad thing, but this forces me to write query from "dokumentide_read" 
as sub-query (whether row in "dokumentide_read" is active or not depends if 
corresponding row in "dokumendid" is approved or not (kinnitaja is not null)). And 
this sub-query does not use indexes. And cross join is bad.

I think I have to experiment bit more. Does anyone know a good tool (preferably free) 
to generate test data? I've got into habit disabling seqscan to see what indexes get 
used. More data would give more adequate execution plans.

  Tambet

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



[SQL] Performance of request of type WHERE ... IN ( ... )

2003-01-15 Thread Th Templ
Hello,

I have performance problem of an request of type ... WHERE ... IN ( ... ).
How to improve performance of this type of request when a group of id in the 
'in' is important.
Thanks for your answers.
Templth


_
Add photos to your e-mail with MSN 8. Get 2 months FREE*. 
http://join.msn.com/?page=features/featuredemail


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly


Re: [SQL] Performance of request of type WHERE ... IN ( ... )

2003-01-15 Thread Achilleus Mantzios
On Wed, 15 Jan 2003, Th Templ wrote:

> Hello,
>
> I have performance problem of an request of type ... WHERE ... IN ( ... ).
> How to improve performance of this type of request when a group of id in the
> 'in' is important.

Try WHERE ... EXISTS (...).


> Thanks for your answers.
> Templth
>
>
> _
> Add photos to your e-mail with MSN 8. Get 2 months FREE*.
> http://join.msn.com/?page=features/featuredemail
>
>
> ---(end of broadcast)---
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to [EMAIL PROTECTED] so that your
> message can get through to the mailing list cleanly
>

==
Achilleus Mantzios
S/W Engineer
IT dept
Dynacom Tankers Mngmt
Nikis 4, Glyfada
Athens 16610
Greece
tel:+30-10-8981112
fax:+30-10-8981877
email:  [EMAIL PROTECTED]
[EMAIL PROTECTED]


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



[SQL] index on to_char(created, 'YYYY') doesn't work

2003-01-15 Thread Andreas Joseph Krogh
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

The following does not work:

create index session_u_idx on session (to_char(created, ''));
ERROR:  parser: parse error at or near "''" at character 57

Can I make a function to do this and index using the result of that funtion? 
Do anyone have an example of such a function?

- -- 
Andreas Joseph Krogh <[EMAIL PROTECTED]>
There will always be someone who agrees with you
but is, inexplicably, a moron.

gpg public_key: http://dev.officenet.no/~andreak/public_key.asc
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.0.7 (GNU/Linux)

iD8DBQE+JUfhUopImDh2gfQRAme7AJ4jDB+e97rvEicGrxBniD1ddQ1gZgCfbbGl
azbrt7/+xGJUuLSQC7fF+vQ=
=3pKN
-END PGP SIGNATURE-


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [SQL] index on to_char(created, 'YYYY') doesn't work

2003-01-15 Thread Andreas Joseph Krogh
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Wednesday 15 January 2003 11:37, you wrote:
> The following does not work:
>
> create index session_u_idx on session (to_char(created, ''));
> ERROR:  parser: parse error at or near "''" at character 57
>
> Can I make a function to do this and index using the result of that
> funtion? Do anyone have an example of such a function?

I tried the following function:
- -
create function drus (timestamp) returns varchar AS'
DECLARE
str_created VARCHAR;
created ALIAS FOR $1;
BEGIN
str_created:= to_char(created, '''');
RETURN str_created;
END;
' LANGUAGE 'plpgsql';

create index session_u_idx on session (drus(created));
- -
But it failes with: 
ERROR:  DefineIndex: index function must be marked isImmutable

Now the question is how do I mark an index function isImmutable?

- -- 
Andreas Joseph Krogh <[EMAIL PROTECTED]>
There will always be someone who agrees with you
but is, inexplicably, a moron.

gpg public_key: http://dev.officenet.no/~andreak/public_key.asc
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.0.7 (GNU/Linux)

iD8DBQE+JUtlUopImDh2gfQRAl1XAKCkEDKxRDys/Di1gLLRDx6h0TGiPwCeI4FN
DNdajyaQTd27f8MeaWZ+xUE=
=T3we
-END PGP SIGNATURE-


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [SQL] index on to_char(created, 'YYYY') doesn't work

2003-01-15 Thread Achilleus Mantzios
On Wed, 15 Jan 2003, Andreas Joseph Krogh wrote:

> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> On Wednesday 15 January 2003 11:37, you wrote:
> > The following does not work:
> >
> > create index session_u_idx on session (to_char(created, ''));
> > ERROR:  parser: parse error at or near "''" at character 57
> >
> > Can I make a function to do this and index using the result of that
> > funtion? Do anyone have an example of such a function?
>
> I tried the following function:
> - -
> create function drus (timestamp) returns varchar AS'
> DECLARE
> str_created VARCHAR;
> created ALIAS FOR $1;
> BEGIN
> str_created:= to_char(created, '''');
> RETURN str_created;
> END;
> ' LANGUAGE 'plpgsql';

add
WITH (iscachable)

>
> create index session_u_idx on session (drus(created));
> - -
> But it failes with:
> ERROR:  DefineIndex: index function must be marked isImmutable
>
> Now the question is how do I mark an index function isImmutable?
>
> - --
> Andreas Joseph Krogh <[EMAIL PROTECTED]>
>   There will always be someone who agrees with you
>   but is, inexplicably, a moron.
>
> gpg public_key: http://dev.officenet.no/~andreak/public_key.asc
> -BEGIN PGP SIGNATURE-
> Version: GnuPG v1.0.7 (GNU/Linux)
>
> iD8DBQE+JUtlUopImDh2gfQRAl1XAKCkEDKxRDys/Di1gLLRDx6h0TGiPwCeI4FN
> DNdajyaQTd27f8MeaWZ+xUE=
> =T3we
> -END PGP SIGNATURE-
>
>
> ---(end of broadcast)---
> TIP 4: Don't 'kill -9' the postmaster
>

==
Achilleus Mantzios
S/W Engineer
IT dept
Dynacom Tankers Mngmt
Nikis 4, Glyfada
Athens 16610
Greece
tel:+30-10-8981112
fax:+30-10-8981877
email:  [EMAIL PROTECTED]
[EMAIL PROTECTED]


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [SQL] index on to_char(created, 'YYYY') doesn't work

2003-01-15 Thread Andreas Joseph Krogh
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Wednesday 15 January 2003 16:12, you wrote:
> On Wed, 15 Jan 2003, Andreas Joseph Krogh wrote:
> > -BEGIN PGP SIGNED MESSAGE-
> > Hash: SHA1
> >
> > On Wednesday 15 January 2003 11:37, you wrote:
> > > The following does not work:
> > >
> > > create index session_u_idx on session (to_char(created, ''));
> > > ERROR:  parser: parse error at or near "''" at character 57
> > >
> > > Can I make a function to do this and index using the result of that
> > > funtion? Do anyone have an example of such a function?
> >
> > I tried the following function:
> > - -
> > create function drus (timestamp) returns varchar AS'
> > DECLARE
> > str_created VARCHAR;
> > created ALIAS FOR $1;
> > BEGIN
> > str_created:= to_char(created, '''');
> > RETURN str_created;
> > END;
> > ' LANGUAGE 'plpgsql';
>
> add
> WITH (iscachable)

Thank you, not _that_ works:-)
But now this doesn't work:
create index session_u_idx on session (drus(created), username);

Can't I have multicolumn-indexes with functions? Any idea how to rewrite that 
so it works?
Here is my session table:
CREATE TABLE session (
session_id varchar(256) NOT NULL PRIMARY KEY,
created timestamp DEFAULT 'now' NOT NULL,
last_accessed timestamp NOT NULL,
destroyed timestamp NOT NULL,
username varchar -- Allow sessions from not logged in users
);

Here is my query I wish to optimize using indexes:
SELECT to_char(created, 'IW') as week, count(session_id) from session WHERE 
username IS NULL and to_char(created, '') = '2002' group by week ORDER BY 
week;

Any hints on optimizing this query, index-usage etc?

- -- 
Andreas Joseph Krogh <[EMAIL PROTECTED]>
There will always be someone who agrees with you
but is, inexplicably, a moron.

gpg public_key: http://dev.officenet.no/~andreak/public_key.asc
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.0.7 (GNU/Linux)

iD8DBQE+JU8hUopImDh2gfQRAuA5AKCXyqCZk92d6oCgyJ/Auf8c4xkSaQCgr4Lq
/+r2WSydbYWXNomMvbmt2E8=
=N6NQ
-END PGP SIGNATURE-


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



[SQL] sort by relevance

2003-01-15 Thread sector119
HI, PPL!

How am I able to sort query results by relevance?
I use contrib/tsearch to search using fill text index!

-- 
WBR, sector119



msg09000/pgp0.pgp
Description: PGP signature


[SQL] Oracle outer join porting question

2003-01-15 Thread Marko Asplund

i'm trying to port an existing application from Oracle8i to PostgreSQL but
i'm having problems understanding a certain outer join query type used in
the application. the query includes a normal outer join between two tables
but also uses outer join syntax to join a table with a constant. here's a
simplified version of the query:

SELECT doc.id,doc.title,sub.user_id,sub.operation
  FROM document doc, document_subscription sub
  WHERE 6 = sub.user_id(+) AND sub.document_id(+) = doc.id;

what does the '6 = sub.user_id(+)' condition exactly do in this query?  
how would this be translated SQL92 join syntax used by PostgreSQL?

i've tried converting it to:

SELECT doc.id,doc.title,sub.user_id,sub.operation
  FROM document doc LEFT OUTER JOIN document_subscription sub
  ON sub.document_id = doc.id
  WHERE (sub.user_id = 6 OR sub.user_id IS NULL);

but this query is missing the rows in the documents table which have a
corresponding document_subscription row with 'not user_id = 6'.

here're also simplified definitions of the two tables used in the query
and some test data:

CREATE TABLE document (
  id INTEGER,
  title VARCHAR(100),
  PRIMARY KEY(id)
);
CREATE TABLE document_subscription (
  document_id INTEGER NOT NULL,
  user_id INTEGER NOT NULL,
  operation VARCHAR(10)
);

INSERT INTO document VALUES (1, 'doc1');
INSERT INTO document VALUES (2, 'doc2');
INSERT INTO document VALUES (4, 'doc4');
INSERT INTO document_subscription VALUES (1, 5, 'op1');
INSERT INTO document_subscription VALUES (2, 5, 'op2');
INSERT INTO document_subscription VALUES (2, 6, 'op2');

best regards,
-- 
aspahttp://www.kronodoc.fi/


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [SQL] Oracle outer join porting question

2003-01-15 Thread Dan Langille
On 15 Jan 2003 at 16:31, Marko Asplund wrote:

> 
> i'm trying to port an existing application from Oracle8i to PostgreSQL but
> i'm having problems understanding a certain outer join query type used in
> the application. the query includes a normal outer join between two tables
> but also uses outer join syntax to join a table with a constant. here's a
> simplified version of the query:
> 
> SELECT doc.id,doc.title,sub.user_id,sub.operation
>   FROM document doc, document_subscription sub
>   WHERE 6 = sub.user_id(+) AND sub.document_id(+) = doc.id;
> 
> what does the '6 = sub.user_id(+)' condition exactly do in this query?  
> how would this be translated SQL92 join syntax used by PostgreSQL?
> 
> i've tried converting it to:
> 
> SELECT doc.id,doc.title,sub.user_id,sub.operation
>   FROM document doc LEFT OUTER JOIN document_subscription sub
>   ON sub.document_id = doc.id
>   WHERE (sub.user_id = 6 OR sub.user_id IS NULL);
> 
> but this query is missing the rows in the documents table which have a
> corresponding document_subscription row with 'not user_id = 6'.

What about this:
SELECT doc.id,doc.title,sub.user_id,sub.operation
  FROM document doc LEFT OUTER JOIN document_subscription sub
  ON sub.document_id = doc.id;

 id | title | user_id | operation
+---+-+---
  1 | doc1  |   5 | op1
  2 | doc2  |   5 | op2
  2 | doc2  |   6 | op2
  4 | doc4  | |
(4 rows)


> 
> here're also simplified definitions of the two tables used in the query
> and some test data:

Thanks for supplying the table and data.  That makes things much 
easier.

-- 
Dan Langille : http://www.langille.org/


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [SQL] Oracle outer join porting question

2003-01-15 Thread Tambet Matiisen


> -Original Message-
> From: Marko Asplund [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, January 15, 2003 4:31 PM
> To: [EMAIL PROTECTED]
> Subject: [SQL] Oracle outer join porting question
> 
> 
> 
> i'm trying to port an existing application from Oracle8i to 
> PostgreSQL but
> i'm having problems understanding a certain outer join query 
> type used in
> the application. the query includes a normal outer join 
> between two tables
> but also uses outer join syntax to join a table with a 
> constant. here's a
> simplified version of the query:
> 
> SELECT doc.id,doc.title,sub.user_id,sub.operation
>   FROM document doc, document_subscription sub
>   WHERE 6 = sub.user_id(+) AND sub.document_id(+) = doc.id;
> 
> what does the '6 = sub.user_id(+)' condition exactly do in 
> this query?  
> how would this be translated SQL92 join syntax used by PostgreSQL?
> 
> i've tried converting it to:
> 
> SELECT doc.id,doc.title,sub.user_id,sub.operation
>   FROM document doc LEFT OUTER JOIN document_subscription sub
>   ON sub.document_id = doc.id
>   WHERE (sub.user_id = 6 OR sub.user_id IS NULL);
> 
> but this query is missing the rows in the documents table which have a
> corresponding document_subscription row with 'not user_id = 6'.
> 

Try this:

SELECT doc.id,doc.title,sub.user_id,sub.operation
  FROM document doc LEFT OUTER JOIN document_subscription sub
  ON sub.document_id = doc.id AND sub.user_id = 6; 

  Tambet

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [SQL] index on to_char(created, 'YYYY') doesn't work

2003-01-15 Thread Achilleus Mantzios
On Wed, 15 Jan 2003, Andreas Joseph Krogh wrote:

> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> On Wednesday 15 January 2003 16:12, you wrote:
> > On Wed, 15 Jan 2003, Andreas Joseph Krogh wrote:
> > > -BEGIN PGP SIGNED MESSAGE-
> > > Hash: SHA1
> > >
> > > On Wednesday 15 January 2003 11:37, you wrote:
> > > > The following does not work:
> > > >
> > > > create index session_u_idx on session (to_char(created, ''));
> > > > ERROR:  parser: parse error at or near "''" at character 57
> > > >
> > > > Can I make a function to do this and index using the result of that
> > > > funtion? Do anyone have an example of such a function?
> > >
> > > I tried the following function:
> > > - -
> > > create function drus (timestamp) returns varchar AS'
> > > DECLARE
> > > str_created VARCHAR;
> > > created ALIAS FOR $1;
> > > BEGIN
> > > str_created:= to_char(created, '''');
> > > RETURN str_created;
> > > END;
> > > ' LANGUAGE 'plpgsql';
> >
> > add
> > WITH (iscachable)
>
> Thank you, not _that_ works:-)
> But now this doesn't work:
> create index session_u_idx on session (drus(created), username);

Functinal indexes are single column indexes.

Why dont you change your function to:

create function drus (timestamp,varchar) returns varchar A

and return the concatenation of to_char(created, '''')||$2

and then create the index as usual (passing the date and the username
as params to your function)

>
> Can't I have multicolumn-indexes with functions? Any idea how to rewrite that
> so it works?
> Here is my session table:
> CREATE TABLE session (
>   session_id varchar(256) NOT NULL PRIMARY KEY,
>   created timestamp DEFAULT 'now' NOT NULL,
>   last_accessed timestamp NOT NULL,
>   destroyed timestamp NOT NULL,
>   username varchar -- Allow sessions from not logged in users
> );
>
> Here is my query I wish to optimize using indexes:
> SELECT to_char(created, 'IW') as week, count(session_id) from session WHERE
> username IS NULL and to_char(created, '') = '2002' group by week ORDER BY
> week;
>
> Any hints on optimizing this query, index-usage etc?
>
> - --
> Andreas Joseph Krogh <[EMAIL PROTECTED]>
>   There will always be someone who agrees with you
>   but is, inexplicably, a moron.
>
> gpg public_key: http://dev.officenet.no/~andreak/public_key.asc
> -BEGIN PGP SIGNATURE-
> Version: GnuPG v1.0.7 (GNU/Linux)
>
> iD8DBQE+JU8hUopImDh2gfQRAuA5AKCXyqCZk92d6oCgyJ/Auf8c4xkSaQCgr4Lq
> /+r2WSydbYWXNomMvbmt2E8=
> =N6NQ
> -END PGP SIGNATURE-
>
>
> ---(end of broadcast)---
> TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
>

==
Achilleus Mantzios
S/W Engineer
IT dept
Dynacom Tankers Mngmt
Nikis 4, Glyfada
Athens 16610
Greece
tel:+30-10-8981112
fax:+30-10-8981877
email:  [EMAIL PROTECTED]
[EMAIL PROTECTED]


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [SQL] Oracle outer join porting question

2003-01-15 Thread Dan Langille
On 15 Jan 2003 at 16:40, Tambet Matiisen wrote:

> Try this:
> 
> SELECT doc.id,doc.title,sub.user_id,sub.operation
>   FROM document doc LEFT OUTER JOIN document_subscription sub
>   ON sub.document_id = doc.id AND sub.user_id = 6; 

FWIW:

test=# SELECT doc.id,doc.title,sub.user_id,sub.operation
test-#   FROM document doc LEFT OUTER JOIN document_subscription sub
test-#   ON sub.document_id = doc.id AND sub.user_id = 6;
 id | title | user_id | operation
+---+-+---
  1 | doc1  | |
  2 | doc2  |   6 | op2
  4 | doc4  | |
-- 
Dan Langille : http://www.langille.org/


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [SQL] index on to_char(created, 'YYYY') doesn't work

2003-01-15 Thread Andreas Joseph Krogh
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Wednesday 15 January 2003 18:55, Achilleus Mantzios wrote:
> On Wed, 15 Jan 2003, Andreas Joseph Krogh wrote:
> > -BEGIN PGP SIGNED MESSAGE-
> > Hash: SHA1
> >
> > On Wednesday 15 January 2003 16:12, you wrote:
> > > On Wed, 15 Jan 2003, Andreas Joseph Krogh wrote:
> > > > -BEGIN PGP SIGNED MESSAGE-
> > > > Hash: SHA1
> > > >
> > > > On Wednesday 15 January 2003 11:37, you wrote:
> > > > > The following does not work:
> > > > >
> > > > > create index session_u_idx on session (to_char(created, ''));
> > > > > ERROR:  parser: parse error at or near "''" at character 57
> > > > >
> > > > > Can I make a function to do this and index using the result of that
> > > > > funtion? Do anyone have an example of such a function?
> > > >
> > > > I tried the following function:
> > > > - -
> > > > create function drus (timestamp) returns varchar AS'
> > > > DECLARE
> > > > str_created VARCHAR;
> > > > created ALIAS FOR $1;
> > > > BEGIN
> > > > str_created:= to_char(created, '''');
> > > > RETURN str_created;
> > > > END;
> > > > ' LANGUAGE 'plpgsql';
> > >
> > > add
> > > WITH (iscachable)
> >
> > Thank you, not _that_ works:-)
> > But now this doesn't work:
> > create index session_u_idx on session (drus(created), username);
>
> Functinal indexes are single column indexes.
>
> Why dont you change your function to:
>
> create function drus (timestamp,varchar) returns varchar A
>
> and return the concatenation of to_char(created, '''')||$2
>
> and then create the index as usual (passing the date and the username
> as params to your function)

OK, thank you.
Let me explain what I want to do:
I have the following schema for gathering statistics from the web:

CREATE TABLE session (
session_id varchar(256) NOT NULL PRIMARY KEY,
created timestamp DEFAULT 'now' NOT NULL,
last_accessed timestamp NOT NULL,
destroyed timestamp NOT NULL,
username varchar -- Allow sessions from not logged in users
);

create or replace function drus (timestamp) returns varchar AS'
DECLARE
str_created VARCHAR;
created ALIAS FOR $1;
BEGIN
str_created:= to_char(created, '''');
RETURN str_created;
END;
' LANGUAGE 'plpgsql' WITH (iscachable);

create index session_u_idx on session (drus(created)) where username is null;

Now I want to get statistics for number of hits pr. week where users are not 
lnogged in(username IS NULL) for the year 2002:

select to_char(created, 'IW') as week, count(session_id) from session WHERE 
username IS NULL and drus(created) = '2002' group by week ORDER BY week;
 week | count
- --+---
 01   |  6321
 18   |74
 19   | 12153
 20   | 17125
 21   | 22157
 22   | 25316
 23   | 24265
 24   | 26234
 25   | 28583
 26   | 29156
 27   | 28335
 28   | 23587
 29   | 23203

This table is quite large(900 000 rows) and the query takes several minutes to 
run, which makes the browser timeout.
Do I have a design-issue here, should I rather batch-generate the stats in its 
own table so I don't have to process all the data(900 000 rows) each time.

Is there any way to optimize/rewrite this query? Is the use of to_char on the 
timestamp wrong, should I use another comparation method for getting the year 
2002?

- -- 
Andreas Joseph Krogh <[EMAIL PROTECTED]>
There will always be someone who agrees with you
but is, inexplicably, a moron.

gpg public_key: http://dev.officenet.no/~andreak/public_key.asc
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.0.7 (GNU/Linux)

iD8DBQE+JXiPUopImDh2gfQRAlf/AJ9xlcUDqa7NcXghtse8PAqQxkf1lACdEGxH
vBXYxoFZnS6J35iQGw+14wE=
=xCVY
-END PGP SIGNATURE-


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [SQL] index on to_char(created, 'YYYY') doesn't work

2003-01-15 Thread Achilleus Mantzios
On Wed, 15 Jan 2003, Andreas Joseph Krogh wrote:

> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> On Wednesday 15 January 2003 18:55, Achilleus Mantzios wrote:
> > On Wed, 15 Jan 2003, Andreas Joseph Krogh wrote:
> > > -BEGIN PGP SIGNED MESSAGE-
> > > Hash: SHA1
> > >
> > > On Wednesday 15 January 2003 16:12, you wrote:
> > > > On Wed, 15 Jan 2003, Andreas Joseph Krogh wrote:
> > > > > -BEGIN PGP SIGNED MESSAGE-
> > > > > Hash: SHA1
> > > > >
> > > > > On Wednesday 15 January 2003 11:37, you wrote:
> > > > > > The following does not work:
> > > > > >
> > > > > > create index session_u_idx on session (to_char(created, ''));
> > > > > > ERROR:  parser: parse error at or near "''" at character 57
> > > > > >
> > > > > > Can I make a function to do this and index using the result of that
> > > > > > funtion? Do anyone have an example of such a function?
> > > > >
> > > > > I tried the following function:
> > > > > - -
> > > > > create function drus (timestamp) returns varchar AS'
> > > > > DECLARE
> > > > > str_created VARCHAR;
> > > > > created ALIAS FOR $1;
> > > > > BEGIN
> > > > > str_created:= to_char(created, '''');
> > > > > RETURN str_created;
> > > > > END;
> > > > > ' LANGUAGE 'plpgsql';
> > > >
> > > > add
> > > > WITH (iscachable)
> > >
> > > Thank you, not _that_ works:-)
> > > But now this doesn't work:
> > > create index session_u_idx on session (drus(created), username);
> >
> > Functinal indexes are single column indexes.
> >
> > Why dont you change your function to:
> >
> > create function drus (timestamp,varchar) returns varchar A
> >
> > and return the concatenation of to_char(created, '''')||$2
> >
> > and then create the index as usual (passing the date and the username
> > as params to your function)
>
> OK, thank you.
> Let me explain what I want to do:

You didnt try it!!

Change your to_char(created, '''')||$2 to
to_char(created, '''')||(coalesce($2,)
(provided there is no user named mister '' :)

then perform your query like:

select to_char(created, 'IW') as week, count(session_id) from session
WHERE drus(created,username) = '2002' group by week ORDER BY
week;

do a explain analyze to see index and performance issues.

> I have the following schema for gathering statistics from the web:
>
> CREATE TABLE session (
>   session_id varchar(256) NOT NULL PRIMARY KEY,
>   created timestamp DEFAULT 'now' NOT NULL,
>   last_accessed timestamp NOT NULL,
>   destroyed timestamp NOT NULL,
>   username varchar -- Allow sessions from not logged in users
> );
>
> create or replace function drus (timestamp) returns varchar AS'
> DECLARE
> str_created VARCHAR;
> created ALIAS FOR $1;
> BEGIN
> str_created:= to_char(created, '''');
> RETURN str_created;
> END;
> ' LANGUAGE 'plpgsql' WITH (iscachable);
>
> create index session_u_idx on session (drus(created)) where username is null;
>
> Now I want to get statistics for number of hits pr. week where users are not
> lnogged in(username IS NULL) for the year 2002:
>
> select to_char(created, 'IW') as week, count(session_id) from session WHERE
> username IS NULL and drus(created) = '2002' group by week ORDER BY week;
>  week | count
> - --+---
>  01   |  6321
>  18   |74
>  19   | 12153
>  20   | 17125
>  21   | 22157
>  22   | 25316
>  23   | 24265
>  24   | 26234
>  25   | 28583
>  26   | 29156
>  27   | 28335
>  28   | 23587
>  29   | 23203
>
> This table is quite large(900 000 rows) and the query takes several minutes to
> run, which makes the browser timeout.
> Do I have a design-issue here, should I rather batch-generate the stats in its
> own table so I don't have to process all the data(900 000 rows) each time.
>
> Is there any way to optimize/rewrite this query? Is the use of to_char on the
> timestamp wrong, should I use another comparation method for getting the year
> 2002?
>
> - --
> Andreas Joseph Krogh <[EMAIL PROTECTED]>
>   There will always be someone who agrees with you
>   but is, inexplicably, a moron.
>
> gpg public_key: http://dev.officenet.no/~andreak/public_key.asc
> -BEGIN PGP SIGNATURE-
> Version: GnuPG v1.0.7 (GNU/Linux)
>
> iD8DBQE+JXiPUopImDh2gfQRAlf/AJ9xlcUDqa7NcXghtse8PAqQxkf1lACdEGxH
> vBXYxoFZnS6J35iQGw+14wE=
> =xCVY
> -END PGP SIGNATURE-
>
>
> ---(end of broadcast)---
> TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
>

==
Achilleus Mantzios
S/W Engineer
IT dept
Dynacom Tankers Mngmt
Nikis 4, Glyfada
Athens 16610
Greece
tel:+30-10-8981112
fax:+30-10-8981877
email:  [EMAIL PROTECTED]
[EMAIL PROTECTED]


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

Re: [SQL] index on to_char(created, 'YYYY') doesn't work

2003-01-15 Thread Tom Lane
Andreas Joseph Krogh <[EMAIL PROTECTED]> writes:
> create index session_u_idx on session (drus(created)) where username is null;

> select to_char(created, 'IW') as week, count(session_id) from session WHERE 
> username IS NULL and drus(created) = '2002' group by week ORDER BY week;

> This table is quite large(900 000 rows) and the query takes several
> minutes to run, which makes the browser timeout.

Is the query actually using the index?  It looks like it should, but
there's no substitute for making sure.  May we see the output of
EXPLAIN ANALYZE for this query?  How does it change if you force
indexscan or force seqscan (by setting enable_seqscan or
enable_indexscan to 0 respectively)?

> Is the use of to_char on the timestamp wrong

I'd be inclined to write cast(extract(year from created) as int) so that
the index key is int rather than varchar --- but this is probably just a
marginal efficiency hack.  The real problem is very likely that the
query selects such a large fraction of the table rows that the index
isn't buying you anything.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])



Re: [SQL] Performance of request of type WHERE ... IN ( ... )

2003-01-15 Thread Josh Berkus
Th,

> > I have performance problem of an request of type ... WHERE ... IN (
> ... ).
> > How to improve performance of this type of request when a group of
> id in the
> > 'in' is important.
> 
> Try WHERE ... EXISTS (...).

The explanation of the difference btw. IN and EXISTS is in the
PostgreSQL FAQ on the web site.

-Josh

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [SQL] index on to_char(created, 'YYYY') doesn't work

2003-01-15 Thread Steve Crawford
I don't have "real-life" experience with partial 
indexes but...

You probably won't see an increase in speed unless the index use can get you 
down to a really small fraction of your total row count (I don't know just 
how small but remember being surprised at the number but realizing, on 
reflection, that it made sense. It was something like 10% or less).

One thing you could try is to use a partial index (one containing only the 
rows in which you are interested).

Here's a really dumb example:

create index foo on session (username) where username is null and 
to_char(created, '') = '2002';

Why dumb? Because the index will only contain nulls. You could probably 
choose a more intelligent index based on the other queries you do. Still, 
this index could increase your query speed considerably (as long as your 
where in creating the index matches the where in your query - if you change 
your query to 2003 you will be back to a sequential scan).

BTW, I tried to create an index on the to_char function and had no luck - 
seems like it should work but it doesn't on 7.2.3 or 7.3.1.

Cheers,
Steve

On Wednesday 15 January 2003 4:08 am, Andreas Joseph Krogh wrote:
> On Wednesday 15 January 2003 16:12, you wrote:
> > On Wed, 15 Jan 2003, Andreas Joseph Krogh wrote:
> > > -BEGIN PGP SIGNED MESSAGE-
> > > Hash: SHA1
> > >
> > > On Wednesday 15 January 2003 11:37, you wrote:
> > > > The following does not work:
> > > >
> > > > create index session_u_idx on session (to_char(created, ''));
> > > > ERROR:  parser: parse error at or near "''" at character 57
> > > >
> > > > Can I make a function to do this and index using the result of that
> > > > funtion? Do anyone have an example of such a function?
> > >
> > > I tried the following function:
> > > - -
> > > create function drus (timestamp) returns varchar AS'
> > > DECLARE
> > > str_created VARCHAR;
> > > created ALIAS FOR $1;
> > > BEGIN
> > > str_created:= to_char(created, '''');
> > > RETURN str_created;
> > > END;
> > > ' LANGUAGE 'plpgsql';
> >
> > add
> > WITH (iscachable)
>
> Thank you, not _that_ works:-)
> But now this doesn't work:
> create index session_u_idx on session (drus(created), username);
>
> Can't I have multicolumn-indexes with functions? Any idea how to rewrite
> that so it works?
> Here is my session table:
> CREATE TABLE session (
>   session_id varchar(256) NOT NULL PRIMARY KEY,
>   created timestamp DEFAULT 'now' NOT NULL,
>   last_accessed timestamp NOT NULL,
>   destroyed timestamp NOT NULL,
>   username varchar -- Allow sessions from not logged in users
> );
>
> Here is my query I wish to optimize using indexes:
> SELECT to_char(created, 'IW') as week, count(session_id) from session WHERE
> username IS NULL and to_char(created, '') = '2002' group by week ORDER
> BY week;
>
> Any hints on optimizing this query, index-usage etc?

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



[SQL] RFC: A brief guide to nulls

2003-01-15 Thread dev
There have been a few posts recently where people have had problems with
nulls. Anyone got comments on the below before I submit it to techdocs?

TIA

- Richard Huxton

A Brief Guide to NULLs
==

What is a null?
===
A null is *not* an empty string.
A null is *not* a value.
A null is *not* a "special" value.
A null is the absence of a value.


What do nulls mean?
===
Well, they *should* mean one of two things:
1. There is no applicable value
2. There is a value but it is unknown

Example 1: Imagine you have a customer table with name and sex fields.
If you get a new customer "ACME Widgets Ltd", the sex field is meaningless
since your customer is a company (case 1).
If you get a new customer "Jackie Smith" they might be male or female, but
you might not know (case 2).

Example 2: You have an address table with (street,city,county,postalcode)
fields.
You might insert an address ("10 Downing Street","London",Null,"WC1 1AA")
since you don't have a valid county.
You might also insert an address ("1 Any Street","Maidstone","Kent",Null)
where there *must be* a valid postalcode, but you don't know what it is.

It might be useful to be able to distinguish between these two cases - not
applicable and unknown, but there is only one option "Null" available to
us, so we can't.


How do nulls work?
==
There is one very important rule when dealing with nulls. The result of
any operation or comparison, when applied to a null is null. The only
exception is testing if a value is null.

Example: with the customer table above you could run the following queries:
  SELECT * FROM customer WHERE sex='M';
  SELECT * FROM customer WHERE sex<>'M';
Now you might think this returns all customers, but it will miss those
where sex is null. You've asked for all rows where the value of sex is 'M'
and all those with values not equal to 'M' but not rows with *no value at
all*

It might help to think of a database as a set of statements you *know* to
be true. A null indicates that you *cannot say anything at all* about that
field. You can't say what it is, you can't say what it isn't, you can only
say there is some information missing.

So, to see all the customers with unknown or inapplicable sex you would need:
  SELECT * FROM customer WHERE sex IS NULL;

There are actually three possible results for a test in SQL - True (the
test passed), False (the test failed) and Null (you tested against a
null). A result of null usually gets treated as False, so testing against
nulls always fails.

If you try to perform an operation on nulls, again the result is always
null. So the results of all of the following are null:
  SELECT 'abc' || null;
  SELECT 1 + null;
  SELECT sqrt(null::numeric);
The first case can be especially confusing. Concatenating a null string to
a string value will return null, not the original value.


Uniqueness and nulls

If you define a unique index on a column it prevents you inserting two
values that are the same. It does not prevent you inserting as many nulls
as you like. How could it, you don't have a value so it can't be the same
as any other.

Example: We create a table "ta" with a unique constraint on column "b"
  CREATE TABLE ta (
a int4,
b varchar(3),
PRIMARY KEY (a)
  );
  CREATE UNIQUE INDEX ta_b_idx ON ta (b);
  INSERT INTO ta VALUES (1,'aaa');  -- succeeds
  INSERT INTO ta VALUES (2,'bbb');  -- succeeds
  INSERT INTO ta VALUES (3,null);   -- succeeds
  INSERT INTO ta VALUES (4,'bbb');  -- fails
  INSERT INTO ta VALUES (5,null);   -- succeeds!

Given the definition of what a null is, you only have two choices: allow
multiple nulls or allow no nulls. If you want no nulls, define the column
as NOT NULL when creating the table.


Keys and nulls
==
No column that is part of a primary key can be null. When you define a
PRIMARY KEY, none of the columns mentioned can take a null value.
Postgresql makes sure of this by defining the columns as NOT NULL for you.

Example: With table "ta" we just created, \d ta will show column a as
being not null. Otherwise, we could insert rows with a set to null and
have no way to tell them apart.


Subqueries and nulls

Since tests always fail when testing against nulls you can have unexpected
results with sub-queries.

Example: Assume we have a companies table and a diary table. Diary entries
are usually related to a particular company but not always.
  SELECT co_name FROM companies WHERE co_id NOT IN (SELECT dy_company FROM
diary);
If any row in diary contains a null dy_company then you will get *no
results*.
We can expand the query like so:
  WHERE co_id NOT IN (SELECT dy_company FROM diary)
  WHERE co_id NOT IN (1, 2, null, 3...)
  WHERE NOT (co_id=1 OR co_id=2 OR co_id=null OR co_id=3...)
  WHERE NOT (... OR null OR ...)
  WHERE NOT (null)
  WHERE null

You either need to explicitly check for null values, or define the column
in question as NOT NU

Re: [SQL] index on to_char(created, 'YYYY') doesn't work

2003-01-15 Thread Andreas Joseph Krogh
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Wednesday 15 January 2003 20:02, you wrote:
> You didnt try it!!
>
> Change your to_char(created, '''')||$2 to
> to_char(created, '''')||(coalesce($2,)
> (provided there is no user named mister '' :)
>
> then perform your query like:
>
> select to_char(created, 'IW') as week, count(session_id) from session
> WHERE drus(created,username) = '2002' group by week ORDER BY
> week;
>
> do a explain analyze to see index and performance issues.

I didn't try it because I don't have a problem with the optimizer utilizing 
the index anymore. As you can se in the attachment the index is used.

Quoting Tom Lane:
"he real problem is very likely that the
query selects such a large fraction of the table rows that the index
isn't buying you anything."

nbeweb=> select count(*) from session;
 count
- 
 899691
(1 row)

nbeweb=> select count(*) from session where username IS NULL;
 count
- 
 898377
(1 row)

The output of EXPLAIN and EXPLAIN ANALYZE is in the attachment.

Can anyone explain to me how to reed the output from ANALYZE. It seems most of 
the time is spent sorting and grouping. Are there any ways to optimize this?

- -- 
Andreas Joseph Krogh <[EMAIL PROTECTED]>
There will always be someone who agrees with you
but is, inexplicably, a moron.

gpg public_key: http://dev.officenet.no/~andreak/public_key.asc
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.0.7 (GNU/Linux)

iD8DBQE+JZx+UopImDh2gfQRAjfJAKCv4uXE2PhtmWfCvm/6pRkumfM8KACgmeDF
AX9HeKVu9SErXxpaUh9ys4A=
=sPIN
-END PGP SIGNATURE-

nbeweb=> EXPLAIN select to_char(created, 'IW') as week, count(session_id) from session 
WHERE username IS NULL and drus(created) = '2002' group by week ORDER BY week;
NOTICE:  QUERY PLAN:

Aggregate  (cost=92.62..92.74 rows=2 width=154)
  ->  Group  (cost=92.62..92.68 rows=22 width=154)
->  Sort  (cost=92.62..92.62 rows=22 width=154)
  ->  Index Scan using session_u_idx on session  (cost=0.00..92.12 rows=22 
width=154)

EXPLAIN
nbeweb=> EXPLAIN ANALYZE select to_char(created, 'IW') as week, count(session_id) from 
session WHERE username IS NULL and drus(created) = '2002' group by week ORDER BY week;
NOTICE:  QUERY PLAN:

Aggregate  (cost=92.62..92.74 rows=2 width=154) (actual time=764457.99..837660.30 
rows=36 loops=1)
  ->  Group  (cost=92.62..92.68 rows=22 width=154) (actual time=763934.48..830755.34 
rows=846053 loops=1)
->  Sort  (cost=92.62..92.62 rows=22 width=154) (actual 
time=763934.45..798906.16 rows=846053 loops=1)
  ->  Index Scan using session_u_idx on session  (cost=0.00..92.12 rows=22 
width=154) (actual time=140.71..253768.07 rows=846053 loops=1)
Total runtime: 837759.64 msec

EXPLAIN
nbeweb=>


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [SQL] query speed joining tables

2003-01-15 Thread Josh Berkus
Vernon,
  
 > In regarding of recomposing multivalued field as a separated table,
I
 > have observed some advantages and 
> disadvantages of the approach. Good on search as you have pointed out
 > and bad on updating data, two operations 
> needed: deletion and insertion. A query may need to join a lot of
 > table together. In Christ's personal application, for 
> example, there are many mulitvalued fields such as relationship
 > status other then ethnicity. There will be some very long 
> and complex queries. 

 Hey, it's your database. In my 8-year experience as a professional
 DBA, few considerations ever outweigh normalization in a relational
 database. You are merely trading the immediate inconvenience of having
 to construct complex queries and data-saving functions for the
eventual
 huge inconvenience (or possibly disaster) of having your data
corrupted
 or at least having to modify it by hand, row-by-row.
 
(Pardon me if I'm a little strident, but I've spend a good portion of
 my career cleaning up other's, and sometimes my own, database design
 mistakes and I had to see a disaster-in-the-making repeated)
 
To put it another way: Your current strategy is saving a penny now in
 order to pay a dollar tommorrow. 

 For example, you currently store multiple ethnicities in a free-form
 text field. What happens when:
 1) Your organization decides they need to split "Asian" into "Chinese"
 and "Other Asian"?
 2) Someone types "aisan" by mistake?
 3) You stop tracking another ethnicity, and want to purge it from the
 database?
 4) Your administrator decides that Ethnicity needs to be ordered as
 "primary ethnicity" and "other ethnicities"?
 5) You need to do complex queries like (Asian and/or Caucasian but not
 Hispanic or African)? Your current strategy would require 4 seperate
 functional indexes to support that query, or do a table scan with 4
 row-by-row fuzzy text matches ... slow and memory-intensive either
way.
 
As I said, it's your database, and if it's a low-budget project
 destined to be thrown away in 3 months, then go for it. If, however,
 you expect this database to be around for a while, you owe it to
 yourself and your co-workers to design it right.
 
If you want an education on database normalization, pick up Fabian
 Pascal's "Practical Issues in Database Design".
 
-Josh Berkus

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [SQL] index on to_char(created, 'YYYY') doesn't work

2003-01-15 Thread Josh Berkus
Steve,

> BTW, I tried to create an index on the to_char function and had no
> luck - 
> seems like it should work but it doesn't on 7.2.3 or 7.3.1.

That's because functional indexes can't take any arguments other than
column names.   Therefore you'd need to:

CREATE FUNCTION to_year (timestamp) RETURNS varchar AS
'SELECT to_char($1, '''');'
LANGUAGE 'sql' WITH (ISCACHABLE, ISSTRICT)
(above is 7.2.3 syntax)

Then use the to_year function in place of to_char for creating your
index.

-Josh Berkus

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [SQL] sort by relevance

2003-01-15 Thread Oleg Bartunov
On Wed, 15 Jan 2003 [EMAIL PROTECTED] wrote:

> HI, PPL!
>
> How am I able to sort query results by relevance?
> I use contrib/tsearch to search using fill text index!
>

Use OpenFTS (openfts.sourceforge.net) for relevance ranking.
we might add relevance feature to tsearch though.


>

Regards,
Oleg
_
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



[SQL] lost on self joins

2003-01-15 Thread Matthew Nuzum
Sometimes recursion makes my head spin...

Imagine that I have a database that holds the structure of my
filesystem.  There is a table called files that contains every piece of
info you would ever want to know about a file, including a unique ID
called fileid.
 | files

x| fileid
 | filename
 | ...

Then, you have a table called folders which looks like:
 | folders
==
x| folderid
 | parentid (relates to folders.folderid)
 | foldername

Finaly, a table to allow a many to many join called files_folders
 | files_folders

x| ffid
 | folderid (fk to folders.folderid)
 | fileid (fk to files.fileid)

Now, I'd like to create a view that shows everything in files, as well
as the complete path to the file.  However because I don't know how many
levels deep the file is nested, I'm not sure how to get that complete
path.  Here is conceptually what should come out:

 | files_view
==
x| fileid
 | filename
 | ...
 | full_path

Something that won't work is: 
SELECT files.*, folders.foldername, folders2.foldername 
FROM files, folders, folders folders2, files_folders ff
WHERE files.fileid = ff.fileid 
  AND ff.folderid = folders.folderid 
  AND folders.parentid;

The problem is that files that are not in a folder won't show up, and if
a folder is more than two levels deep it will only show the two highest
levels.

Can anyone suggest a way for me to get the information I need?  I'm very
content to use a simple pl/pgsql function, however I don't know how I'd
use recursion there.

Thanks,

--
Matthew Nuzum
www.bearfruit.org
[EMAIL PROTECTED]




---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [SQL] query speed joining tables

2003-01-15 Thread Vernon Wu

Hi, Josh,

I appreciate you share your experience here. I definitely don't have that many years' 
DBA experience behind, and are 
learning to get DB design right at the first place. 

What I stated is my observation on my project with over twenty multivalued detail 
tables. I have a selection query 
contained 200 characters, involving 10 tables, and using subquery. The performance is 
not bad after properly indexing, 
least than 3 second (what the planner says). I will have longer queries later and hope 
they won't have any performance 
problem.

Thank you for recommending another DB book after the "Database Design For Mere 
Mortals". I will read the book.

Vernon   


15/01/2003 9:50:22 AM, "Josh Berkus" <[EMAIL PROTECTED]> wrote:

>Vernon,
>  
> > In regarding of recomposing multivalued field as a separated table,
>I
> > have observed some advantages and 
>> disadvantages of the approach. Good on search as you have pointed out
> > and bad on updating data, two operations 
>> needed: deletion and insertion. A query may need to join a lot of
> > table together. In Christ's personal application, for 
>> example, there are many mulitvalued fields such as relationship
> > status other then ethnicity. There will be some very long 
>> and complex queries. 
>
> Hey, it's your database. In my 8-year experience as a professional
> DBA, few considerations ever outweigh normalization in a relational
> database. You are merely trading the immediate inconvenience of having
> to construct complex queries and data-saving functions for the
>eventual
> huge inconvenience (or possibly disaster) of having your data
>corrupted
> or at least having to modify it by hand, row-by-row.
> 
>(Pardon me if I'm a little strident, but I've spend a good portion of
> my career cleaning up other's, and sometimes my own, database design
> mistakes and I had to see a disaster-in-the-making repeated)
> 
>To put it another way: Your current strategy is saving a penny now in
> order to pay a dollar tommorrow. 
>
> For example, you currently store multiple ethnicities in a free-form
> text field. What happens when:
> 1) Your organization decides they need to split "Asian" into "Chinese"
> and "Other Asian"?
> 2) Someone types "aisan" by mistake?
> 3) You stop tracking another ethnicity, and want to purge it from the
> database?
> 4) Your administrator decides that Ethnicity needs to be ordered as
> "primary ethnicity" and "other ethnicities"?
> 5) You need to do complex queries like (Asian and/or Caucasian but not
> Hispanic or African)? Your current strategy would require 4 seperate
> functional indexes to support that query, or do a table scan with 4
> row-by-row fuzzy text matches ... slow and memory-intensive either
>way.
> 
>As I said, it's your database, and if it's a low-budget project
> destined to be thrown away in 3 months, then go for it. If, however,
> you expect this database to be around for a while, you owe it to
> yourself and your co-workers to design it right.
> 
>If you want an education on database normalization, pick up Fabian
> Pascal's "Practical Issues in Database Design".
> 
>-Josh Berkus
>




---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [SQL] A brief guide to nulls

2003-01-15 Thread Otto Hirr
Another viewpoint, is to go a bit further into implementation...
i.e. for every field, there are two items...
*one item being the null/notnull info
*other item being the value info of the field per the type at def time
This faciliatates the concepts of
where  is null ... testing the null/notnull info
where  is not null ... testing the null/notnull info
where  = "somevalue" ... testing the value info
etc.

Trying to "hide" the concept that every field has TWO items,
the value and whether it is null/notnull, seems to confuse people.
By SIMPLY stating that every SQL field has two items, then causes
a scripter to think a bit more deeply about a where clause...

Hence making your example ...

>   SELECT * FROM customer WHERE sex='M';
>   SELECT * FROM customer WHERE sex<>'M';
> Now you might think this returns all customers, but it will miss those
> where sex is null. You've asked for all rows where the value of sex is 'M'
> and all those with values not equal to 'M' but not rows with *no value at
all*

all that less mysterious...
i.e. the above two selects only tested one of the two items for the SQL
field,
the other item to test is the null/notnull item.


.. Otto


> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED]]On Behalf Of [EMAIL PROTECTED]
> Sent: Wednesday, January 15, 2003 9:23 AM
> To: [EMAIL PROTECTED]
> Subject: RFC: A brief guide to nulls
>
>
> There have been a few posts recently where people have had
> problems with
> nulls. Anyone got comments on the below before I submit it to
> techdocs?
>
> TIA
>
> - Richard Huxton
>
> A Brief Guide to NULLs
> ==
>
> What is a null?
> ===
> A null is *not* an empty string.
> A null is *not* a value.
> A null is *not* a "special" value.
> A null is the absence of a value.
>
>
> What do nulls mean?
> ===
> Well, they *should* mean one of two things:
> 1. There is no applicable value
> 2. There is a value but it is unknown
>
> Example 1: Imagine you have a customer table with name and sex fields.
> If you get a new customer "ACME Widgets Ltd", the sex field
> is meaningless
> since your customer is a company (case 1).
> If you get a new customer "Jackie Smith" they might be male
> or female, but
> you might not know (case 2).
>
> Example 2: You have an address table with
> (street,city,county,postalcode)
> fields.
> You might insert an address ("10 Downing
> Street","London",Null,"WC1 1AA")
> since you don't have a valid county.
> You might also insert an address ("1 Any
> Street","Maidstone","Kent",Null)
> where there *must be* a valid postalcode, but you don't know
> what it is.
>
> It might be useful to be able to distinguish between these
> two cases - not
> applicable and unknown, but there is only one option "Null"
> available to
> us, so we can't.
>
>
> How do nulls work?
> ==
> There is one very important rule when dealing with nulls. The
> result of
> any operation or comparison, when applied to a null is null. The only
> exception is testing if a value is null.
>
> Example: with the customer table above you could run the
> following queries:
>   SELECT * FROM customer WHERE sex='M';
>   SELECT * FROM customer WHERE sex<>'M';
> Now you might think this returns all customers, but it will miss those
> where sex is null. You've asked for all rows where the value
> of sex is 'M'
> and all those with values not equal to 'M' but not rows with
> *no value at
> all*
>
> It might help to think of a database as a set of statements
> you *know* to
> be true. A null indicates that you *cannot say anything at
> all* about that
> field. You can't say what it is, you can't say what it isn't,
> you can only
> say there is some information missing.
>
> So, to see all the customers with unknown or inapplicable sex
> you would need:
>   SELECT * FROM customer WHERE sex IS NULL;
>
> There are actually three possible results for a test in SQL -
> True (the
> test passed), False (the test failed) and Null (you tested against a
> null). A result of null usually gets treated as False, so
> testing against
> nulls always fails.
>
> If you try to perform an operation on nulls, again the result
> is always
> null. So the results of all of the following are null:
>   SELECT 'abc' || null;
>   SELECT 1 + null;
>   SELECT sqrt(null::numeric);
> The first case can be especially confusing. Concatenating a
> null string to
> a string value will return null, not the original value.
>
>
> Uniqueness and nulls
> 
> If you define a unique index on a column it prevents you inserting two
> values that are the same. It does not prevent you inserting
> as many nulls
> as you like. How could it, you don't have a value so it can't
> be the same
> as any other.
>
> Example: We create a table "ta" with a unique constraint on column "b"
>   CREATE TABLE ta (
> a int4,
>   b varchar(3),
>   PRIMARY KEY (a)
>   );
>   CREATE UNIQUE INDEX ta_b_idx ON ta (b);
>   INSERT INTO

Re: [SQL] RFC: A brief guide to nulls

2003-01-15 Thread Andrew Perrin
*Disclaimer:* I do not have any formal training in database
theory. However, I have done quite a lot of work with databases in
practice, and have some understanding of "missing values" in statistics.

I would tend to think of the "no applicable value" case of a NULL as being
the result of poor or lazy planning, and therefore not really true NULL
values. (This is not to say I don't *do* it, but that it's not
theoretically appropriate to interpret a NULL as being "no applicable
value.")  To use your example 1, I would think a properly-planned and
structured database should account for the possibility of a sexless
customer by means of relationality:

customers:
custid
name

cust_sexes:
custid
sex

which would get rid of the NULL for the sex of ACME Widgets Ltd.  By
contrast, it wouldn't get rid of the NULL for Jackie Smith, who would
appropriately be represented by an entry in customers (, 'Jackie
Smith') and one in cust_sexes (, NULL).

(Otherwise the introduction is excellent.)

Any comments?

Andy

--
Andrew J Perrin - http://www.unc.edu/~aperrin
Assistant Professor of Sociology, U of North Carolina, Chapel Hill
[EMAIL PROTECTED] * andrew_perrin (at) unc.edu


On Wed, 15 Jan 2003 [EMAIL PROTECTED] wrote:

> There have been a few posts recently where people have had problems with
> nulls. Anyone got comments on the below before I submit it to techdocs?
> 
> TIA
> 
> - Richard Huxton
> 
> A Brief Guide to NULLs
> ==
> 
> What is a null?
> ===
> A null is *not* an empty string.
> A null is *not* a value.
> A null is *not* a "special" value.
> A null is the absence of a value.
> 
> 
> What do nulls mean?
> ===
> Well, they *should* mean one of two things:
> 1. There is no applicable value
> 2. There is a value but it is unknown
> 
> Example 1: Imagine you have a customer table with name and sex fields.
> If you get a new customer "ACME Widgets Ltd", the sex field is meaningless
> since your customer is a company (case 1).
> If you get a new customer "Jackie Smith" they might be male or female, but
> you might not know (case 2).
> 
> Example 2: You have an address table with (street,city,county,postalcode)
> fields.
> You might insert an address ("10 Downing Street","London",Null,"WC1 1AA")
> since you don't have a valid county.
> You might also insert an address ("1 Any Street","Maidstone","Kent",Null)
> where there *must be* a valid postalcode, but you don't know what it is.
> 
> It might be useful to be able to distinguish between these two cases - not
> applicable and unknown, but there is only one option "Null" available to
> us, so we can't.
> 
> 
> How do nulls work?
> ==
> There is one very important rule when dealing with nulls. The result of
> any operation or comparison, when applied to a null is null. The only
> exception is testing if a value is null.
> 
> Example: with the customer table above you could run the following queries:
>   SELECT * FROM customer WHERE sex='M';
>   SELECT * FROM customer WHERE sex<>'M';
> Now you might think this returns all customers, but it will miss those
> where sex is null. You've asked for all rows where the value of sex is 'M'
> and all those with values not equal to 'M' but not rows with *no value at
> all*
> 
> It might help to think of a database as a set of statements you *know* to
> be true. A null indicates that you *cannot say anything at all* about that
> field. You can't say what it is, you can't say what it isn't, you can only
> say there is some information missing.
> 
> So, to see all the customers with unknown or inapplicable sex you would need:
>   SELECT * FROM customer WHERE sex IS NULL;
> 
> There are actually three possible results for a test in SQL - True (the
> test passed), False (the test failed) and Null (you tested against a
> null). A result of null usually gets treated as False, so testing against
> nulls always fails.
> 
> If you try to perform an operation on nulls, again the result is always
> null. So the results of all of the following are null:
>   SELECT 'abc' || null;
>   SELECT 1 + null;
>   SELECT sqrt(null::numeric);
> The first case can be especially confusing. Concatenating a null string to
> a string value will return null, not the original value.
> 
> 
> Uniqueness and nulls
> 
> If you define a unique index on a column it prevents you inserting two
> values that are the same. It does not prevent you inserting as many nulls
> as you like. How could it, you don't have a value so it can't be the same
> as any other.
> 
> Example: We create a table "ta" with a unique constraint on column "b"
>   CREATE TABLE ta (
> a int4,
>   b varchar(3),
>   PRIMARY KEY (a)
>   );
>   CREATE UNIQUE INDEX ta_b_idx ON ta (b);
>   INSERT INTO ta VALUES (1,'aaa');  -- succeeds
>   INSERT INTO ta VALUES (2,'bbb');  -- succeeds
>   INSERT INTO ta

Re: [SQL] query speed joining tables

2003-01-15 Thread Josh Berkus

Vernon,

> What I stated is my observation on my project with over twenty multivalued 
detail tables. I have a selection query 
> contained 200 characters, involving 10 tables, and using subquery. The 
performance is not bad after properly indexing, 
> least than 3 second (what the planner says). I will have longer queries 
later and hope they won't have any performance 
> problem.

One trick for you is to create a custom aggregate for string contination for 
each detail table, and that will allow you to list the values in the detail 
table as if they were a continuous text string.   A concat aggregate is even 
fast on PostgreSQL.


CREATE FUNCTION "comma_cat" (text,text) RETURNS text AS '
SELECT CASE WHEN $2 IS NULL OR $2 =  THEN $1
WHEN $1 IS NULL OR $1 =  THEN $2
ELSE $1 || '', '' || $2
END
' LANGUAGE 'sql';

CREATE FUNCTION "br_cat" (text, text) RETURNS text AS '
SELECT CASE WHEN $2 IS NULL OR $2 =  THEN $1
WHEN $1 IS NULL OR $1 =  THEN $2
ELSE $1 ||  || $2
END
' LANGUAGE 'sql';

--create aggregate with html  between items
CREATE AGGREGATE br_list ( BASETYPE = text, SFUNC = br_cat, STYPE = text, 
INITCOND = '' );

--create aggregate with commas between items
CREATE AGGREGATE comma_list ( BASETYPE = text, SFUNC = comma_cat, STYPE = 
text,
INITCOND = '' );

-Josh Berkus

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [SQL] RFC: A brief guide to nulls

2003-01-15 Thread Josh Berkus
Richard,

Thanks for writing this!  Would it be cool if we included it in the Linux 
Weekly News?   I'll ask Robert.

Two edits:

On Wednesday 15 January 2003 09:23, [EMAIL PROTECTED] wrote:
> There have been a few posts recently where people have had problems with
> nulls. Anyone got comments on the below before I submit it to techdocs?
> 
> TIA
> 
> - Richard Huxton
> 
> A Brief Guide to NULLs
> ==
> 
> What is a null?
> ===
> A null is *not* an empty string.
> A null is *not* a value.
> A null is *not* a "special" value.
> A null is the absence of a value.
> 
> 
> What do nulls mean?
> ===
> Well, they *should* mean one of two things:
> 1. There is no applicable value
> 2. There is a value but it is unknown
> 
> Example 1: Imagine you have a customer table with name and sex fields.
> If you get a new customer "ACME Widgets Ltd", the sex field is meaningless
> since your customer is a company (case 1).
> If you get a new customer "Jackie Smith" they might be male or female, but
> you might not know (case 2).
> 
> Example 2: You have an address table with (street,city,county,postalcode)
> fields.
> You might insert an address ("10 Downing Street","London",Null,"WC1 1AA")
> since you don't have a valid county.
> You might also insert an address ("1 Any Street","Maidstone","Kent",Null)
> where there *must be* a valid postalcode, but you don't know what it is.
> 
> It might be useful to be able to distinguish between these two cases - not
> applicable and unknown, but there is only one option "Null" available to
> us, so we can't.
> 
> 
> How do nulls work?
> ==
> There is one very important rule when dealing with nulls. The result of
> any operation or comparison, when applied to a null is null. The only
> exception is testing if a value is null.
> 
> Example: with the customer table above you could run the following queries:
>   SELECT * FROM customer WHERE sex='M';
>   SELECT * FROM customer WHERE sex<>'M';
> Now you might think this returns all customers, but it will miss those
> where sex is null. You've asked for all rows where the value of sex is 'M'
> and all those with values not equal to 'M' but not rows with *no value at
> all*
> 
> It might help to think of a database as a set of statements you *know* to
> be true. A null indicates that you *cannot say anything at all* about that
> field. You can't say what it is, you can't say what it isn't, you can only
> say there is some information missing.
> 
> So, to see all the customers with unknown or inapplicable sex you would 
need:
>   SELECT * FROM customer WHERE sex IS NULL;

** MOVE the above paragraph 

> 
> There are actually three possible results for a test in SQL - True (the
> test passed), False (the test failed) and Null (you tested against a
> null). A result of null usually gets treated as False, so testing against
> nulls always fails.
> 
> If you try to perform an operation on nulls, again the result is always
> null. So the results of all of the following are null:
>   SELECT 'abc' || null;
>   SELECT 1 + null;
>   SELECT sqrt(null::numeric);
> The first case can be especially confusing. Concatenating a null string to
> a string value will return null, not the original value.

** ... down to here, and define the IS NULL and IS NOT NULL operators for the 
reader.

> 
> Uniqueness and nulls
> 
> If you define a unique index on a column it prevents you inserting two
> values that are the same. It does not prevent you inserting as many nulls
> as you like. How could it, you don't have a value so it can't be the same
> as any other.
> 
> Example: We create a table "ta" with a unique constraint on column "b"
>   CREATE TABLE ta (
> a int4,
>   b varchar(3),
>   PRIMARY KEY (a)
>   );
>   CREATE UNIQUE INDEX ta_b_idx ON ta (b);
>   INSERT INTO ta VALUES (1,'aaa');  -- succeeds
>   INSERT INTO ta VALUES (2,'bbb');  -- succeeds
>   INSERT INTO ta VALUES (3,null);   -- succeeds
>   INSERT INTO ta VALUES (4,'bbb');  -- fails
>   INSERT INTO ta VALUES (5,null);   -- succeeds!
> 
> Given the definition of what a null is, you only have two choices: allow
> multiple nulls or allow no nulls. If you want no nulls, define the column
> as NOT NULL when creating the table.
> 
> 
> Keys and nulls
> ==
> No column that is part of a primary key can be null. When you define a
> PRIMARY KEY, none of the columns mentioned can take a null value.
> Postgresql makes sure of this by defining the columns as NOT NULL for you.
> 
> Example: With table "ta" we just created, \d ta will show column a as
> being not null. Otherwise, we could insert rows with a set to null and
> have no way to tell them apart.
> 
> 
> Subqueries and nulls
> 
> Since tests always fail when testing against nulls you can have unexpected
> results with sub-queries.
> 
> Example: Assume we have a companies table and a diary table. Diary entries
> are usually relate

Re: [SQL] sub-select with aggregate

2003-01-15 Thread Tom Lane
[ followup on a gripe from October ]

Tomasz Myrta <[EMAIL PROTECTED]> writes:
> I want to perform query looking like this:

> select
>   user_id,
>   a/sum_a as percent_a,
>   b/sum_b as percent_b
> from
>   users join
>   (select
> group_id,
> sum(a) as sum_a,
> sum(b) as sum_b
>from users group by group_id) X using (group_id)
> where group_id=3;

> This query works, but very slow. Subquery with aggregate is performed 
> for all table rows instead of group_id=3.

Just FYI, this problem is fixed as of CVS tip.  Using an empty table
with an index on group_id, I get a plan like so:

 Nested Loop  (cost=17.10..34.21 rows=1 width=36)
   ->  Subquery Scan x  (cost=17.10..17.11 rows=1 width=12)
 ->  HashAggregate  (cost=17.10..17.11 rows=1 width=12)
   ->  Index Scan using users_group_id on users  (cost=0.00..17.07 rows=5 
width=12)
 Index Cond: (3 = group_id)
   ->  Index Scan using users_group_id on users  (cost=0.00..17.08 rows=1 width=16)
 Index Cond: (users.group_id = "outer".group_id)
 Filter: (group_id = 3)

Notice the condition on group_id has been propagated into both sides of
the join.

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [SQL] RFC: A brief guide to nulls

2003-01-15 Thread Antti Haapala

This should be added (From PostgreSQL integrated docs...):

SQL uses a three-valued Boolean logic where the null value represents
"unknown". Observe the following truth tables:

a | b | a AND b | a OR b
--+---+-+
TRUE  | TRUE  | TRUE| TRUE
TRUE  | FALSE | FALSE   | TRUE
TRUE  | NULL  | NULL| TRUE
FALSE | FALSE | FALSE   | FALSE
FALSE | NULL  | FALSE   | NULL
NULL  | NULL  | NULL| NULL



So, if there's any trues in the chain of ORs, the whole expression will be
true, not null. This conforms to NULL representing unknown value. If you
have "true or unknown", of course whole result is true regardless of the
"unknown". Let's check this example:

> Subqueries and nulls
> 
> Since tests always fail when testing against nulls you can have unexpected
> results with sub-queries.
>
> Example: Assume we have a companies table and a diary table. Diary entries
> are usually related to a particular company but not always.
>   SELECT co_name FROM companies WHERE co_id NOT IN (SELECT dy_company FROM
> diary);
> If any row in diary contains a null dy_company then you will get *no
> results*.
> We can expand the query like so:
>   WHERE co_id NOT IN (SELECT dy_company FROM diary)
>   WHERE co_id NOT IN (1, 2, null, 3...)
>   WHERE NOT (co_id=1 OR co_id=2 OR co_id=null OR co_id=3...)
>   WHERE NOT (... OR null OR ...)

>   WHERE NOT (null)
(erm... actually not)

template1=# select 5 in (1, 2, 9, null);
 ?column?
--

(1 row)

template1=# select 2 in (1, 2, 9, null);
 ?column?
--
 t
(1 row)

WHERE NOT (null/true) -> evaluates to
WHERE null/false

So the result was the same - but only for this example.  Suppose you take
the NOT away:

template1=# select * from a where a.i in (1, 2, 9, null);
 i
---
 1
 2
(2 rows)

I surely see two lines (not 0)... :)

And this could be added too, for clarification: "SELECT clause lists lines
for which the WHERE expression is certainly known to be true." ;)

-- 
Antti Haapala



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [SQL] lost on self joins

2003-01-15 Thread Tomasz Myrta
Matthew Nuzum wrote:


Sometimes recursion makes my head spin...

Imagine that I have a database that holds the structure of my
filesystem.  There is a table called files that contains every piece of
info you would ever want to know about a file, including a unique ID
called fileid.
| files

x| fileid
| filename
| ...

Then, you have a table called folders which looks like:
| folders
==
x| folderid
| parentid (relates to folders.folderid)
| foldername

Finaly, a table to allow a many to many join called files_folders
| files_folders

x| ffid
| folderid (fk to folders.folderid)
| fileid (fk to files.fileid)


Strange. Do you need this table? Can one file exist in several directories?
If not, you can just add "folderid" field into table files.


Now, I'd like to create a view that shows everything in files, as well
as the complete path to the file.  However because I don't know how many
levels deep the file is nested, I'm not sure how to get that complete
path.  Here is conceptually what should come out:

| files_view
==
x| fileid
| filename
| ...
| full_path

Something that won't work is: 
SELECT files.*, folders.foldername, folders2.foldername 
FROM files, folders, folders folders2, files_folders ff
WHERE files.fileid = ff.fileid 
 AND ff.folderid = folders.folderid 
 AND folders.parentid;

The problem is that files that are not in a folder won't show up, and if
a folder is more than two levels deep it will only show the two highest
levels.

Can anyone suggest a way for me to get the information I need?  I'm very
content to use a simple pl/pgsql function, however I don't know how I'd
use recursion there.

What would you say about this:

create or replace function parent_dir(varchar,integer) returns varchar as '
DECLARE
 curr_name	ALIAS for $1;
 curr_id	ALIAS for $2;
 par_name	varchar;
 par_id	integer;
begin
 select into par_name,par_id foldername,parentid from folders where folderid=curr_id;
 if not found or par_name is null then
   --finish
   return curr_name;
 else
   --find upper folder
   return parent_dir(par_name || ''/'' || curr_name,par_id);
 end if;
end;
' LANGUAGE 'plpgsql';

Using:
select parent_dir('',folderid) as fullpath...;
or 
select parent_dir(filename,folderid) as fullfilename...;

Your query would look like this:
SELECT files.*,
parent_dir('',folderid) as fullfoldername
FROM files f join files_folders ff using (fileid);

Regards,
Tomasz Myrta


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [SQL] sort by relevance

2003-01-15 Thread Ross J. Reedstrom
On Wed, Jan 15, 2003 at 09:19:47PM +0300, Oleg Bartunov wrote:
> On Wed, 15 Jan 2003 [EMAIL PROTECTED] wrote:
> 
> > HI, PPL!
> >
> > How am I able to sort query results by relevance?
> > I use contrib/tsearch to search using fill text index!
> >
> 
> Use OpenFTS (openfts.sourceforge.net) for relevance ranking.
> we might add relevance feature to tsearch though.

Here's another tsearch 'customer' who'd love that feature.

Ross

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [SQL] lost on self joins

2003-01-15 Thread Matthew Nuzum
> -Original Message-
> From: Tomasz Myrta [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, January 15, 2003 3:00 PM
> To: Matthew Nuzum
> Cc: [EMAIL PROTECTED]
> Subject: Re: [SQL] lost on self joins
> >Finaly, a table to allow a many to many join called files_folders
> > | files_folders
> >
> >x| ffid
> > | folderid (fk to folders.folderid)
> > | fileid (fk to files.fileid)
> 
> Strange. Do you need this table? Can one file exist in several
> directories?
> If not, you can just add "folderid" field into table files.

Good point.  No, it can't exist in multiple folders, so I guess it's
overkill to do a many to many here.  Thanks for the moment of clarity.

> What would you say about this:
> 
> create or replace function parent_dir(varchar,integer) returns varchar
as
> '
> DECLARE
>   curr_name   ALIAS for $1;
>   curr_id ALIAS for $2;
>   par_namevarchar;
>   par_id  integer;
> begin
>   select into par_name,par_id foldername,parentid from folders where
> folderid=curr_id;
>   if not found or par_name is null then
> --finish
> return curr_name;
>   else
> --find upper folder
> return parent_dir(par_name || ''/'' || curr_name,par_id);
>   end if;
> end;
> ' LANGUAGE 'plpgsql';
> 
> Using:
> select parent_dir('',folderid) as fullpath...;
> or
> select parent_dir(filename,folderid) as fullfilename...;
> 
> Your query would look like this:
> SELECT files.*,
> parent_dir('',folderid) as fullfoldername
> FROM files f join files_folders ff using (fileid);

Well, I guess I would say Thanks!  You make it look so easy.

As I was planning this in my mind, I didn't have the equivalent of your
first parameter to parent_dir, but now I see it's necessary to get the
full path when you recurse up the folder hierarchy.

Thanks a lot for your response.

Matthew Nuzum
www.bearfruit.org
[EMAIL PROTECTED]


---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [SQL] RFC: A brief guide to nulls

2003-01-15 Thread Bruno Wolff III
On Wed, Jan 15, 2003 at 17:23:09 -,
  [EMAIL PROTECTED] wrote:
> 
> How do nulls work?
> ==
> There is one very important rule when dealing with nulls. The result of
> any operation or comparison, when applied to a null is null. The only
> exception is testing if a value is null.

I suggest rephrasing this to make it clear you need to use is null to
do this and that = null won't do what you expect.

> If you try to perform an operation on nulls, again the result is always
> null. So the results of all of the following are null:

The above isn't true (i.e. null op something isn't ALWAYS null). For example:

bruno=> select 't'::boolean or null;
 ?column?
--
 t
(1 row)

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [SQL] lost on self joins

2003-01-15 Thread Bruno Wolff III
On Wed, Jan 15, 2003 at 13:29:47 -0500,
  Matthew Nuzum <[EMAIL PROTECTED]> wrote:
> 
> Can anyone suggest a way for me to get the information I need?  I'm very
> content to use a simple pl/pgsql function, however I don't know how I'd
> use recursion there.

Look at the tablefunc contrib package. I believe it provides the ability
to do what you are trying to do. You might need to have 7.3 to have it
available.

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [SQL] RFC: A brief guide to nulls

2003-01-15 Thread Antti Haapala

A few more comments...

> It might be useful to be able to distinguish between these two cases - not
> applicable and unknown, but there is only one option "Null" available to
> us, so we can't.

If we really need to distinguish between these two cases, I think null
shouldn't be used as a N/A value but some other like empty string or 0.
(IMHO it's preferable not to use null as N/A at all).

For example sex could be classified as
'n'  - not applicable
'f'  - female
'm'  - male
null - yet unknown

> Example: with the customer table above you could run the following queries:
>   SELECT * FROM customer WHERE sex='M';
>   SELECT * FROM customer WHERE sex<>'M';
> Now you might think this returns all customers, but it will miss those
> where sex is null. You've asked for all rows where the value of sex is 'M'
> and all those with values not equal to 'M' but not rows with *no value at
> all*

these could be explained as
select all customers who surely are men
select all customers who surely aren't men

if customers sex is unknown - null, we can't decide whether they're men or
not.

> The first case can be especially confusing. Concatenating a null string to
> a string value will return null, not the original value.

Isn't it null, not null string? ;)

> Keys and nulls
> ==
> No column that is part of a primary key can be null. When you define a
> PRIMARY KEY, none of the columns mentioned can take a null value.
> Postgresql makes sure of this by defining the columns as NOT NULL for you.

... because primary keys are to uniquelly identify rows in a table, and
how's an unknown values going to do that :)

-- 
Antti Haapala
+358 50 369 3535
ICQ: #177673735


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [SQL] lost on self joins

2003-01-15 Thread Ross J. Reedstrom
On Wed, Jan 15, 2003 at 03:19:38PM -0500, Matthew Nuzum wrote:
> > -Original Message-
> > From: Tomasz Myrta [mailto:[EMAIL PROTECTED]]
> > Sent: Wednesday, January 15, 2003 3:00 PM
> > To: Matthew Nuzum
> > Cc: [EMAIL PROTECTED]
> > Subject: Re: [SQL] lost on self joins
> > >Finaly, a table to allow a many to many join called files_folders
> > > | files_folders
> > >
> > >x| ffid
> > > | folderid (fk to folders.folderid)
> > > | fileid (fk to files.fileid)
> > 
> > Strange. Do you need this table? Can one file exist in several
> > directories?
> > If not, you can just add "folderid" field into table files.
> 
> Good point.  No, it can't exist in multiple folders, so I guess it's
> overkill to do a many to many here.  Thanks for the moment of clarity.

Unless you're attempting to accurately map Unix filesystem sematics, where
the exact same file _can_ be in more than one place in the filesystem
(hard links).  It's all about the inode. One of the wierder bits of unix
that you don't often see used in common occurances.

Ross

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



[SQL] joining from multiple tables

2003-01-15 Thread Joseph Shraibman
I have a table I want to join on, but the conditions that restrict it span more than one 
table.  For example:

create table num_tab (thekey int primary key, val int, class char);
create table class_tab (class char primary key, tkey int);
create table txt_tab (thekey int primary key, class int, txt text);

insert into num_tab values (1, 1, 'o');
insert into num_tab values (2, 2, 'e');
insert into num_tab values (3, 3, 'o');
insert into num_tab values (4, 4, 'e');
insert into num_tab values (5, 5, 'o');
insert into num_tab values (6, 6, 'e');

insert into class_tab values('o', 1);
insert into class_tab values('e', 2);

insert into txt_tab values (2, 2,'two');
insert into txt_tab values (4, 2,'four');
insert into txt_tab values (6, 2,'six');

select distinct n.thekey, n.val, t.txt  FROM  num_tab n LEFT JOIN txt_tab t ON n.thekey = 
t.thekey
WHERE  n.thekey < 5 AND t.class = class_tab.tkey AND n.class = class_tab.class;
produces:

 thekey | val | txt
+-+--
  2 |   2 | two
  4 |   4 | four
... which is not what we want, because 1,3, and 5 aren't included, but:

select distinct n.thekey, n.val, t.txt  FROM  num_tab n LEFT JOIN txt_tab t ON n.thekey = 
t.thekey  AND t.class = class_tab.tkey AND n.class = class_tab.class
WHERE  n.thekey < 5;
 produces:
NOTICE:  Adding missing FROM-clause entry for table "class_tab"
ERROR:  JOIN/ON clause refers to "class_tab", which is not part of JOIN

So how do I do this?


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


[SQL] show data from two tables together

2003-01-15 Thread Matthew Nuzum
Well, this is somewhat of a follow up to my previous post regarding self
joins.  Now what I'm hoping to do is "virtually" combine the results
from two different record sets into one apparent record set.

Here is the skeleton of my application's data structure.  There is a
table called "folders" and a table called "files".

They look like:
 | files | folders
=   
x| fileid   x| folderid
 | filename  | foldername
 | folderid  | parentid
 | dsply_order   | dsply_order

files.folderid is fk to folders.folderid, folders.parentid is field for
self joining to folderid.

Now my difficulty is that I want to see results for the two tables
together.  As you know, folders can co-exist with files inside of a
folder.  Therefore if I know my current folder is 23, then I'd like to
SELECT * FROM files WHERE files.folderid = 23 AND SELECT * FROM folders
WHERE folders.parentid = 23 and then take the combined record sets and
ORDER BY dsply_order.

The only way that I know how to do this is to do two queries and put the
results into an array and sort it outside of postgres.  However my goal
is to make life simpler for the application developers and just give
them a view that has the data they need.

As a side note, I'd probably need to add a field that would indicate 1
if the file came from files otherwise count(folders.*) WHERE parentid =
folderid so that I can see if the folder is empty.

As another side note, this operation will be performed quite frequently
and should be fast.

As I think about it, it seems that the only logical way would be to do
this at the application level, not inside postgres.  Please correct me
if I'm wrong.

--
Matthew Nuzum
www.bearfruit.org
[EMAIL PROTECTED]


---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [SQL] RFC: A brief guide to nulls

2003-01-15 Thread Peter Eisentraut
[EMAIL PROTECTED] writes:

> A null is *not* a value.
> A null is *not* a "special" value.
> A null is the absence of a value.

A quotation directly from the SQL standard:

Every data type includes a special value, called the null value,

This seems to directly contradict those three statements.

-- 
Peter Eisentraut   [EMAIL PROTECTED]


---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [SQL] joining from multiple tables

2003-01-15 Thread Stephan Szabo

On Wed, 15 Jan 2003, Joseph Shraibman wrote:

> select distinct n.thekey, n.val, t.txt  FROM  num_tab n LEFT JOIN txt_tab t ON 
>n.thekey =
> t.thekey
> WHERE  n.thekey < 5 AND t.class = class_tab.tkey AND n.class = class_tab.class;
> produces:

Note that the above uses the non-standard postgres behavior of adding from
clauses, it's not technically valid SQL.

>   thekey | val | txt
> +-+--
>2 |   2 | two
>4 |   4 | four
> ... which is not what we want, because 1,3, and 5 aren't included, but:
>
> select distinct n.thekey, n.val, t.txt  FROM  num_tab n LEFT JOIN txt_tab t ON 
>n.thekey =
> t.thekey  AND t.class = class_tab.tkey AND n.class = class_tab.class
> WHERE  n.thekey < 5;
>   produces:
> NOTICE:  Adding missing FROM-clause entry for table "class_tab"
> ERROR:  JOIN/ON clause refers to "class_tab", which is not part of JOIN
>
> So how do I do this?

I think you want something like:

select distinct n.thekey, n.val, t.txt  FROM class_tab JOIN num_tab n
using (class) LEFT JOIN txt_tab t on (t.thekey=n.thekey and t.class =
class_tab.tkey);


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [SQL] RFC: A brief guide to nulls

2003-01-15 Thread Ludwig Lim

--- [EMAIL PROTECTED] wrote:
> There have been a few posts recently where people
> have had problems with
> nulls. Anyone got comments on the below before I
> submit it to techdocs?
> 
> TIA
> 
> - Richard Huxton
> 
> A Brief Guide to NULLs
> ==
> 
> What is a null?
> ===
> A null is *not* an empty string.
> A null is *not* a value.
> A null is *not* a "special" value.
> A null is the absence of a value.
> 
> 
> What do nulls mean?
> ===
> Well, they *should* mean one of two things:
> 1. There is no applicable value
> 2. There is a value but it is unknown

  Good job!, it sure helps people who don't much
background on formal database theory.

  What about adding this to the section "What does
nulls mean"
 --> 3) No value has yet been assigned to that
particular attribute (field).

  
  I think it would also be nice if you can add a
section on functions that deals with NULL such as
nullif() and coalesce(). These functions help users
interpret NULL values.


best regards,

ludwig



__
Do you Yahoo!?
New DSL Internet Access from SBC & Yahoo!
http://sbc.yahoo.com

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [SQL] show data from two tables together

2003-01-15 Thread Stephan Szabo

On Wed, 15 Jan 2003, Matthew Nuzum wrote:

> Well, this is somewhat of a follow up to my previous post regarding self
> joins.  Now what I'm hoping to do is "virtually" combine the results
> from two different record sets into one apparent record set.

Fortunately we have the set functions, specifically UNION ALL in this
case.

> Here is the skeleton of my application's data structure.  There is a
> table called "folders" and a table called "files".
>
> They look like:
>  | files | folders
> =   
> x| fileid   x| folderid
>  | filename  | foldername
>  | folderid  | parentid
>  | dsply_order   | dsply_order
>
> files.folderid is fk to folders.folderid, folders.parentid is field for
> self joining to folderid.
>
> As a side note, I'd probably need to add a field that would indicate 1
> if the file came from files otherwise count(folders.*) WHERE parentid =
> folderid so that I can see if the folder is empty.
>
> As another side note, this operation will be performed quite frequently
> and should be fast.
>
> As I think about it, it seems that the only logical way would be to do
> this at the application level, not inside postgres.  Please correct me
> if I'm wrong.

Maybe something like (minus the number of files/empty part):

CREATE VIEW viewname AS
 SELECT fileid, filename, folderid, dsply_order FROM files
  UNION ALL
 SELECT folderid, foldername, parentid, dsply_order FROM folders;

SELECT * from viewname where folderid=23 order by dsply_order;


Depending on whether you want a count of files or just an empty
or not, and whether you want info on whether a particular entry
is a file or folder, you'll probably need to add to the above.



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [SQL] show data from two tables together

2003-01-15 Thread Matthew Nuzum
Hmm... I've never used this before.  I'll try it.

Thanks for your help and your quick reply!

--
Matthew Nuzum
www.bearfruit.org
[EMAIL PROTECTED]
 

> Fortunately we have the set functions, specifically UNION ALL in this
> case.
> 

> Maybe something like (minus the number of files/empty part):
> 
> CREATE VIEW viewname AS
>  SELECT fileid, filename, folderid, dsply_order FROM files
>   UNION ALL
>  SELECT folderid, foldername, parentid, dsply_order FROM folders;
> 
> SELECT * from viewname where folderid=23 order by dsply_order;



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [SQL] RFC: A brief guide to nulls

2003-01-15 Thread Tom Lane
Peter Eisentraut <[EMAIL PROTECTED]> writes:
> [EMAIL PROTECTED] writes:
>> A null is *not* a value.
>> A null is *not* a "special" value.
>> A null is the absence of a value.

> A quotation directly from the SQL standard:
> Every data type includes a special value, called the null value,
> This seems to directly contradict those three statements.

I think you can look at it either way.  The traditional mathematical
approach to this sort of thing has been to consider that every data
type includes an "undefined" value (sometimes called "bottom", often
written as an upside-down T).  But the specific semantics assigned to
this concept in SQL definitely correspond to the idea that there's
a missing data entry.  And those who like to think about the bits prefer
to imagine a separate "its-null" flag bit, as someone else noted in this
thread.

The real bottom line is that the language provides you with a concept
"NULL" that has very specific (and less than intuitive) semantics.
To make use of this concept in your application, you have to interpret
it in a way that is useful for your application --- and doesn't conflict
with the SQL-defined semantics.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])



[SQL] pg_dump problem

2003-01-15 Thread Rudi Starcevic
Hi,

After doing a pg_dump on a database I'm unable to access the file.

My command is simply 'mv' ::

mv camper.dump20020116 camper_bak/

The error I get is ::

mv: camper.dump20020116: Value too large for defined data type

Strange. It seems to be saying the file I created is too large to handle.
Do you know where I've gone wrong ?
I just want to dump one of my databases, move it to a back up dir. then 
make changes to the running database.

Thank you kindly
Regards
Rudi Starcevic




---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly


Re: [SQL] sort by relevance

2003-01-15 Thread sector119
On Wed, Jan 15, 2003 at 09:19:47PM +0300, Oleg Bartunov wrote:
> On Wed, 15 Jan 2003 [EMAIL PROTECTED] wrote:
> 
> > HI, PPL!
> >
> > How am I able to sort query results by relevance?
> > I use contrib/tsearch to search using fill text index!
> >
> 
> Use OpenFTS (openfts.sourceforge.net) for relevance ranking.

Am I able to use OpenFTS if I save data at db or only if I use txt files?? 

> we might add relevance feature to tsearch though.

It would be great to add this feature to tsearch! I need it so much!
Wheh we will be able to see this feature at PostgreSQL CVS? :))

-- 
WBR, sector119



msg09040/pgp0.pgp
Description: PGP signature


Re: [SQL] sort by relevance

2003-01-15 Thread Rajesh Kumar Mallah.

Yet another customer for relevence ranking ;-)

malz.

On Wednesday 15 January 2003 11:49 pm, Oleg Bartunov wrote:
> On Wed, 15 Jan 2003 [EMAIL PROTECTED] wrote:
> > HI, PPL!
> >
> > How am I able to sort query results by relevance?
> > I use contrib/tsearch to search using fill text index!
>
> Use OpenFTS (openfts.sourceforge.net) for relevance ranking.
> we might add relevance feature to tsearch though.
>
>
>
>
>   Regards,
>   Oleg
> _
> Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
> Sternberg Astronomical Institute, Moscow University (Russia)
> Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
> phone: +007(095)939-16-83, +007(095)939-23-83
>
>
> ---(end of broadcast)---
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org

-- 
Rajesh Kumar Mallah,
Project Manager (Development)
Infocom Network Limited, New Delhi
phone: +91(11)6152172 (221) (L) ,9811255597 (M)

Visit http://www.trade-india.com ,
India's Leading B2B eMarketplace.



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])