Re: [GENERAL] Slow index performance

2015-07-03 Thread Christian Schröder
You are right ... How embarrassing ... Why did I not see this?
I will change the index and check again; I expect that will fix the problem.

Thanks a lot!
Christian


Deriva GmbH Financial IT and Consulting
Christian Schröder
Geschäftsführer
Hans-Böckler-Straße 2 | D-37079 Göttingen
Tel: +49 (0)551 489 500-42
Fax: +49 (0)551 489 500-91
http://www.deriva.de

Amtsgericht Göttingen | HRB 3240
Geschäftsführer: Christian Schröder

-Ursprüngliche Nachricht-
Von: Marc Mamin [mailto:m.ma...@intershop.de] 
Gesendet: Freitag, 3. Juli 2015 08:58
An: Christian Schröder; pgsql-general@postgresql.org
Betreff: RE: [GENERAL] Slow index performance



 -Original Message-
 From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general- 
 ow...@postgresql.org] On Behalf Of Christian Schröder
 Sent: Freitag, 3. Juli 2015 07:36
 To: pgsql-general@postgresql.org
 Subject: [GENERAL] Slow index performance
 
 Hi all,
 we have a strange performance issue in one of our databases (using 
 PostgreSQL 9.1.18). Maybe you can help me understand what’s going on.
 
 We have two identical tables (rec_isins_current, rec_isins_archive) 
 with the following structure:
 
  Table "ts_frontend.rec_isins_current"
    Column   |  Type   | Modifiers
 ------------+---------+-----------
  attachment | integer | not null
  isin       | isin    | not null
 Indexes:
     "rec_isins_current_pkey" PRIMARY KEY, btree (attachment, isin), tablespace "extra"
     "rec_isins_current_attachment" btree (attachment), tablespace "extra"


Hello, 

Are you sure that the column order of the PKs is the same in both tables?

(attachment, isin) or (isin, attachment).

When isin is in second place, Postgres has to read the whole index to find
matching records.
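
A sketch of the fix under discussion (index name hypothetical; whether to add a
second index or to reorder the primary key depends on the other queries):

-- an isin-leading index lets an isin-only lookup descend the btree
-- instead of scanning it end to end
CREATE INDEX rec_isins_current_isin ON ts_frontend.rec_isins_current (isin);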

regards,

Marc Mamin

  Foreign-key constraints:
      "rec_isins_attachment_fkey" FOREIGN KEY (attachment) REFERENCES
  ts_frontend.attachments(id) ON UPDATE RESTRICT ON DELETE CASCADE
 Inherits: ts_frontend.rec_isins
 
 The isin type is a domain type which has char(12) as its base type.
 Both tables inherit from ts_frontend.rec_isins, which is empty and is 
 only used to search both tables in a single query.
 
 When we search for an isin in both tables (using the parent table, but 
 the behavior is the same if we directly search in one of the tables), 
 the primary key index is used. However, while the archive table is 
 pretty fast, the current table is much slower:
 
 # explain analyze select * from ts_frontend.rec_isins where isin = 
 'foo';
 
  QUERY PLAN
  ----------------------------------------------------------------------------------------------------------------------------
   Result  (cost=0.00..565831.43 rows=501 width=17) (actual time=6080.778..6080.778 rows=0 loops=1)
     ->  Append  (cost=0.00..565831.43 rows=501 width=17) (actual time=6080.777..6080.777 rows=0 loops=1)
           ->  Seq Scan on rec_isins  (cost=0.00..0.00 rows=1 width=36) (actual time=0.001..0.001 rows=0 loops=1)
                 Filter: ((isin)::bpchar = 'foo'::bpchar)
           ->  Index Scan using rec_isins_archive_pkey on rec_isins_archive rec_isins  (cost=0.00..621.61 rows=405 width=17) (actual time=10.335..10.335 rows=0 loops=1)
                 Index Cond: ((isin)::bpchar = 'foo'::bpchar)
           ->  Index Scan using rec_isins_current_pkey on rec_isins_current rec_isins  (cost=0.00..565209.82 rows=95 width=17) (actual time=6070.440..6070.440 rows=0 loops=1)
                 Index Cond: ((isin)::bpchar = 'foo'::bpchar)
   Total runtime: 6080.824 ms
 
 This is strange, because the archive table is four times larger than 
 the current table and the archive index is also four times larger than 
 the current index:
 
             relname            | relfilenode | reltablespace | pg_table_size
 -------------------------------+-------------+---------------+---------------
  rec_isins                     |   514533886 |             0 |          8192
  rec_isins_pkey                |   514533892 |             0 |          8192
  rec_isins_attachment          |   514533899 |             0 |          8192
 -------------------------------+-------------+---------------+---------------
  rec_isins_archive             |   507194804 |             0 |   10923393024
  rec_isins_archive_pkey        |   507197615 |     139300915 |    9048784896
  rec_isins_archive_attachment  |   507197692 |     139300915 |    4706050048
 -------------------------------+-------------+---------------+---------------
  rec_isins_current             |   631621090 |             0 |    2696216576
  rec_isins_current_pkey        |   631621096 |     139300915 |    2098552832
  rec_isins_current_attachment  |   631621107 |     139300915 |    1160683520

[GENERAL] Slow index performance

2015-07-02 Thread Christian Schröder
Hi all,
we have a strange performance issue in one of our databases (using PostgreSQL 
9.1.18). Maybe you can help me understand what’s going on.

We have two identical tables (rec_isins_current, rec_isins_archive) with the 
following structure:

         Table "ts_frontend.rec_isins_current"
   Column   |  Type   | Modifiers
------------+---------+-----------
 attachment | integer | not null
 isin       | isin    | not null
Indexes:
    "rec_isins_current_pkey" PRIMARY KEY, btree (attachment, isin), tablespace "extra"
    "rec_isins_current_attachment" btree (attachment), tablespace "extra"
Foreign-key constraints:
    "rec_isins_attachment_fkey" FOREIGN KEY (attachment) REFERENCES ts_frontend.attachments(id) ON UPDATE RESTRICT ON DELETE CASCADE
Inherits: ts_frontend.rec_isins

The isin type is a domain type which has char(12) as its base type.
Both tables inherit from ts_frontend.rec_isins, which is empty and is only used 
to search both tables in a single query.

When we search for an isin in both tables (using the parent table, but the 
behavior is the same if we directly search in one of the tables), the primary 
key index is used. However, while the archive table is pretty fast, the current 
table is much slower:

# explain analyze select * from ts_frontend.rec_isins where isin = 'foo';

                                                                     QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=0.00..565831.43 rows=501 width=17) (actual time=6080.778..6080.778 rows=0 loops=1)
   ->  Append  (cost=0.00..565831.43 rows=501 width=17) (actual time=6080.777..6080.777 rows=0 loops=1)
         ->  Seq Scan on rec_isins  (cost=0.00..0.00 rows=1 width=36) (actual time=0.001..0.001 rows=0 loops=1)
               Filter: ((isin)::bpchar = 'foo'::bpchar)
         ->  Index Scan using rec_isins_archive_pkey on rec_isins_archive rec_isins  (cost=0.00..621.61 rows=405 width=17) (actual time=10.335..10.335 rows=0 loops=1)
               Index Cond: ((isin)::bpchar = 'foo'::bpchar)
         ->  Index Scan using rec_isins_current_pkey on rec_isins_current rec_isins  (cost=0.00..565209.82 rows=95 width=17) (actual time=6070.440..6070.440 rows=0 loops=1)
               Index Cond: ((isin)::bpchar = 'foo'::bpchar)
 Total runtime: 6080.824 ms

This is strange, because the archive table is four times larger than the 
current table and the archive index is also four times larger than the current 
index:

            relname            | relfilenode | reltablespace | pg_table_size
-------------------------------+-------------+---------------+---------------
 rec_isins                     |   514533886 |             0 |          8192
 rec_isins_pkey                |   514533892 |             0 |          8192
 rec_isins_attachment          |   514533899 |             0 |          8192
-------------------------------+-------------+---------------+---------------
 rec_isins_archive             |   507194804 |             0 |   10923393024
 rec_isins_archive_pkey        |   507197615 |     139300915 |    9048784896
 rec_isins_archive_attachment  |   507197692 |     139300915 |    4706050048
-------------------------------+-------------+---------------+---------------
 rec_isins_current             |   631621090 |             0 |    2696216576
 rec_isins_current_pkey        |   631621096 |     139300915 |    2098552832
 rec_isins_current_attachment  |   631621107 |     139300915 |    1160683520

Both tables are in the same tablespace (and thus on the same disk) and both 
indexes are also in the same tablespace (but in another than the tables).
The current table has been vacuumed full and reindexed.

Can anybody explain the difference? Why is the current table so slow? And what 
can we do to improve performance?

Thanks for your help,
Christian


Deriva GmbH Financial IT and Consulting
Christian Schröder
Geschäftsführer
Hans-Böckler-Straße 2 | D-37079 Göttingen
Tel: +49 (0)551 489 500-42
Fax: +49 (0)551 489 500-91
http://www.deriva.de

Amtsgericht Göttingen | HRB 3240
Geschäftsführer: Christian Schröder







Re: [GENERAL] Perl function leading to out of memory error

2013-02-22 Thread Christian Schröder

On 21.02.2013 01:02, Jeff Janes wrote:

CREATE DOMAIN isin AS char(12) CHECK (isin_ok(value));

Could you give some example values that meet this criteria?  Trying to
reverse engineer your helper functions is no fun.
A valid example is DE000710. The same with another check digit is 
of course invalid (e.g. DE000711).

CREATE OR REPLACE FUNCTION foo(isin char(12)) RETURNS void AS $$
my ($isin) = @_;

my $stmt = spi_prepare('
 INSERT INTO foo (isin)
 VALUES ($1)', 'isin');
spi_exec_prepared($stmt, $isin);
spi_freeplan($stmt);
$$ LANGUAGE plperl VOLATILE STRICT;

That should be easy enough to re-write as plpgsql.  If you do, does it
still leak?
I have no idea how to create a prepared statement in plpgsql. I have 
tested the following function:


CREATE OR REPLACE FUNCTION foo(isin char(12)) RETURNS void AS $$
BEGIN
INSERT INTO foo (isin) VALUES (isin);
END;
$$ LANGUAGE plpgsql VOLATILE STRICT;

This works without any problem.
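
(A side note, not from the original exchange: plpgsql prepares and caches plans
for its embedded SQL automatically, which is why the function above needs no
explicit prepare. At the SQL level the explicit equivalent would look like this,
using a real-world ISIN because the thread's example value is truncated:)

PREPARE ins_isin (isin) AS INSERT INTO foo (isin) VALUES ($1);
EXECUTE ins_isin('US0378331005');  -- passes isin_ok(), so the domain cast succeeds
DEALLOCATE ins_isin;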

Regards,
Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Amtsgericht Göttingen | HRB 3240
Geschäftsführer: Dirk Baule, Christian Schröder
Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer





Re: [GENERAL] Perl function leading to out of memory error

2013-02-20 Thread Christian Schröder

On 19.02.2013 12:41, Tom Lane wrote:

Jan Strube j...@deriva.de writes:

we have a Java daemon that's repeatedly calling a Perl function inside
our database (version 9.1.8). The function is called about 200 times per
second. While the Java program is running you can watch the memory usage
of the postmaster grow continuously until after a few hours we get an
out of memory error from Postgres. In the log you see a lot of
ExprContext... messages.

I tried to reproduce this, without much success.  Can you extract a
self-contained test case?
after some investigation it seems that the error has to do with a domain 
type that we have defined in our database. We have defined the following 
helper functions:


CREATE OR REPLACE FUNCTION isin_pz(text) RETURNS integer AS $$
DECLARE
    c char;
    s text := '';
    l integer;
    d integer;
    w integer;
    sum integer := 0;
BEGIN
    IF char_length($1) != 11 THEN
        RETURN null;
    END IF;

    IF substr($1, 1, 2) < 'AA' OR substr($1, 1, 2) > 'ZZ' THEN
        RETURN null;
    END IF;

    FOR pos IN 1 .. 11 LOOP
        c := substr($1, pos, 1);
        IF c >= '0' AND c <= '9' THEN
            s := s || c;
        ELSE
            IF c >= 'A' AND c <= 'Z' THEN
                s := s || to_char(ascii(c) - 55, 'FM99');
            ELSE
                RETURN null;
            END IF;
        END IF;
    END LOOP;

    l := char_length(s);
    FOR pos IN 1 .. l LOOP
        d := to_number(substr(s, pos, 1), '0');
        w := ((l-pos+1) % 2) + 1;
        IF w * d >= 10 THEN
            sum := sum + (w * d) % 10 + 1;
        ELSE
            sum := sum + (w * d);
        END IF;
    END LOOP;

    RETURN (10 - (sum % 10)) % 10;
END;
$$
LANGUAGE 'plpgsql' IMMUTABLE STRICT;

CREATE OR REPLACE FUNCTION isin_ok(text) RETURNS boolean AS $$
DECLARE
    pz integer;
BEGIN
    IF char_length($1) != 12 OR substr($1, 1, 2) < 'AA' OR substr($1, 1, 2) > 'ZZ' THEN
        RETURN false;
    END IF;

    pz := public.isin_pz(substr($1, 1, 11));
    IF pz IS NULL THEN
        RETURN false;
    END IF;

    RETURN to_char(pz, 'FM9') = substr($1, 12, 1);
END;
$$
LANGUAGE 'plpgsql' IMMUTABLE STRICT;

They are used to define the domain type isin as follows:

CREATE DOMAIN isin AS char(12) CHECK (isin_ok(value));

Now we can create our test case. Create the following table:

CREATE TABLE foo (isin char(12) NOT NULL);

And this function:

CREATE OR REPLACE FUNCTION foo(isin char(12)) RETURNS void AS $$
my ($isin) = @_;

my $stmt = spi_prepare('
INSERT INTO foo (isin)
VALUES ($1)', 'isin');
spi_exec_prepared($stmt, $isin);
spi_freeplan($stmt);
$$ LANGUAGE plperl VOLATILE STRICT;

If we now repeatedly call this function (e.g. using a Perl script) we 
can see the memory consumption rise continuously until the out of memory 
error occurs.
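
The repro can also be driven from SQL alone, without a Perl script (a sketch;
the real-world ISIN stands in for the thread's truncated example value):

-- each call runs spi_prepare/spi_exec_prepared/spi_freeplan once
SELECT foo('US0378331005') FROM generate_series(1, 200000);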


Interestingly, if we change the type specification in the call to 
spi_prepare from isin to char(12) the problem no longer occurs.


Can you explain this behavior?

Regards,
Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Amtsgericht Göttingen | HRB 3240
Geschäftsführer: Dirk Baule, Christian Schröder
Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer





[GENERAL] Wrong estimation of rows for hash join

2009-10-16 Thread Christian Schröder

Hi list,
I have the following query:
   SELECT *
   FROM base
   INNER JOIN pt USING (x) WHERE pt.y IN ('1121', '11411', '11421', 
'1161', '1162');


explain analyze yields the following result:

                                                       QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=529.18..164930.70 rows=28374 width=0) (actual time=10.834..4877.326 rows=245298 loops=1)
   Hash Cond: (b.x = pt.x)
   ->  Seq Scan on b  (cost=0.00..159579.93 rows=1210093 width=4) (actual time=0.018..2464.871 rows=1210250 loops=1)
   ->  Hash  (cost=527.41..527.41 rows=142 width=4) (actual time=10.775..10.775 rows=138 loops=1)
         ->  Seq Scan on pt  (cost=0.00..527.41 rows=142 width=4) (actual time=0.057..10.556 rows=138 loops=1)
               Filter: (y = ANY ('{1121,11411,11421,1161,1162}'::bpchar[]))
 Total runtime: 5170.837 ms

As you can see, the estimated number of rows in the join is much lower 
than the actual number. Normally, the join is part of a more complex 
query which gets really slow, probably (among other reasons) because the 
query planner uses the incorrect row estimate for its further plans.


Question is: What can I do to improve that estimation? There is a 
foreign key from base.x to pt.x and both fields are declared not null.
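
One concrete experiment (a sketch, not from the original post; the target value
is arbitrary, and 8.x keeps no cross-table join statistics, so this may only
help partially):

-- raise the per-column statistics targets for the joined/filtered columns
ALTER TABLE pt ALTER COLUMN y SET STATISTICS 1000;
ALTER TABLE pt ALTER COLUMN x SET STATISTICS 1000;
ALTER TABLE base ALTER COLUMN x SET STATISTICS 1000;
ANALYZE pt;
ANALYZE base;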


Side question: What can I do to improve the speed of the hash join 
itself? I understand that 2.5 secs are needed to perform the sequential 
scan of table b, but where do the remaining 2.5 secs come from?


Thanks a lot in advance,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer





Re: [GENERAL] Performance of full outer join in 8.3

2009-04-17 Thread Christian Schröder

Tom Lane wrote:

I've applied a patch for this.  It will be in 8.3.8, or if you're
in a hurry you can grab it from our CVS server or here:
  

Thanks a lot for your effort and the quick response!

Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer




Re: [GENERAL] Performance of full outer join in 8.3

2009-04-16 Thread Christian Schröder

Grzegorz Jaśkiewicz wrote:

On Wed, Apr 15, 2009 at 1:25 PM, Simon Riggs si...@2ndquadrant.com wrote:
  

The cost of the query seems accurate, so the absence of
attachment_isins_attachment_idx on the 8.3 plan looks to be the reason.
There's no way it would choose to scan 8115133 rows on the pkey if the
other index was available and usable.



hence my question: is there an index on it in the 8.3 version of the db?
  

I added an index on this column, but it didn't change the query plan.
Stupid question: Do I have to analyze again or perform a reindex after 
adding the index?


Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer




[GENERAL] Performance of full outer join in 8.3

2009-04-15 Thread Christian Schröder

Hi list,
we have just migrated one of our databases from 8.2.12 to 8.3.7. We now 
experience a strange problem: A query that was really fast on the 8.2 
server is now much slower on the 8.3 server (1 ms vs. 60 sec). I had a 
look at the query plan and it is completely different. Both servers run 
on the same machine. The configuration (planner constants etc.) is 
identical. The database has been vacuum analyzed after the migration. So 
why the difference?


This is the query:
select isin from ts_frontend.attachment_isins full OUTER JOIN 
ts_frontend.rec_isins using (attachment,isin)  WHERE attachment=2698120  
GROUP BY isin limit 1000;


Here is the explain analyze in 8.2:


                                                                          QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=826.44..826.61 rows=17 width=32) (actual time=0.163..0.172 rows=2 loops=1)
   ->  HashAggregate  (cost=826.44..826.61 rows=17 width=32) (actual time=0.159..0.162 rows=2 loops=1)
         ->  Merge Full Join  (cost=799.62..826.40 rows=17 width=32) (actual time=0.122..0.144 rows=2 loops=1)
               Merge Cond: (("outer"."?column3?" = "inner"."?column3?") AND (attachment_isins.attachment = rec_isins.attachment))
               Filter: (COALESCE(attachment_isins.attachment, rec_isins.attachment) = 2698120)
               ->  Sort  (cost=13.39..13.74 rows=138 width=20) (actual time=0.065..0.067 rows=1 loops=1)
                     Sort Key: (attachment_isins.isin)::bpchar, attachment_isins.attachment
                     ->  Index Scan using attachment_isins_attachment_idx on attachment_isins  (cost=0.00..8.49 rows=138 width=20) (actual time=0.042..0.047 rows=1 loops=1)
                           Index Cond: (attachment = 2698120)
               ->  Sort  (cost=786.23..794.80 rows=3429 width=20) (actual time=0.045..0.049 rows=2 loops=1)
                     Sort Key: (rec_isins.isin)::bpchar, rec_isins.attachment
                     ->  Index Scan using idx_rec_isins_attachment on rec_isins  (cost=0.00..584.89 rows=3429 width=20) (actual time=0.019..0.024 rows=2 loops=1)
                           Index Cond: (attachment = 2698120)
 Total runtime: 0.302 ms
(14 rows)

And this is the 8.3 plan:
  
                                                                       QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=345890.35..345900.35 rows=1000 width=26) (actual time=53926.706..53927.071 rows=2 loops=1)
   ->  HashAggregate  (cost=345890.35..346296.11 rows=40576 width=26) (actual time=53926.702..53927.061 rows=2 loops=1)
         ->  Merge Full Join  (cost=71575.91..345788.91 rows=40576 width=26) (actual time=10694.727..53926.559 rows=2 loops=1)
               Merge Cond: (((rec_isins.isin)::bpchar = (attachment_isins.isin)::bpchar) AND (rec_isins.attachment = attachment_isins.attachment))
               Filter: (COALESCE(attachment_isins.attachment, rec_isins.attachment) = 2698120)
               ->  Index Scan using rec_isin_pkey on rec_isins  (cost=0.00..229562.97 rows=8115133 width=17) (actual time=0.141..18043.605 rows=8036226 loops=1)
               ->  Materialize  (cost=71575.91..78318.19 rows=539383 width=17) (actual time=10181.074..14471.215 rows=539101 loops=1)
                     ->  Sort  (cost=71575.91..72924.36 rows=539383 width=17) (actual time=10181.064..13019.906 rows=539101 loops=1)
                           Sort Key: attachment_isins.isin, attachment_isins.attachment
                           Sort Method:  external merge  Disk: 18936kB
                           ->  Seq Scan on attachment_isins  (cost=0.00..13111.83 rows=539383 width=17) (actual time=0.036..912.963 rows=539101 loops=1)
 Total runtime: 53937.213 ms
(12 rows)

These are the table definitions:

           Table "ts_frontend.attachment_isins"
    Column    |              Type              | Modifiers
--------------+--------------------------------+-----------
 attachment   | integer                        | not null
 isin         | isin                           | not null
 editor       | name                           |
 last_changed | timestamp(0) without time zone |
Indexes:
    "attachment_isins_pkey" PRIMARY KEY, btree (attachment, isin)
    "attachment_isins_attachment_idx" btree (attachment)
    "attachment_isins_attachment_isin" btree (attachment, isin)
    "attachment_isins_isin_idx" btree (isin)
Foreign-key constraints:
    "attachment_isins_attachment_fkey" FOREIGN KEY (attachment) REFERENCES ts_frontend.attachments(id) ON UPDATE CASCADE ON DELETE CASCADE

       Table "ts_frontend.rec_isins"
   Column   |  Type   | Modifiers
------------+---------+-----------
 attachment | integer | not null

Re: [GENERAL] Performance of full outer join in 8.3

2009-04-15 Thread Christian Schröder

Grzegorz Jaśkiewicz wrote:

set work_mem=24000; before running the query.

postgres is doing merge and sort on disc, that's always slow.
  
Ok, but why is the plan different in 8.2? As you can see the same query 
is really fast in 8.2, but slow in 8.3.

is there an index on column isin ?
  
There is a separate index on the isin column of the attachment_isins 
table (attachment_isins_isin_idx). The other table (rec_isins) has the 
combination of attachment and isin as primary key which creates an 
implicit index. Can this index be used for the single column isin? And 
again: Why doesn't this matter in 8.2??
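
For reference, the suggestion above as a self-contained experiment (a sketch;
24000 with no unit means kB for this setting):

SET work_mem = '24MB';  -- session-local, nothing else is affected
EXPLAIN ANALYZE
SELECT isin
FROM ts_frontend.attachment_isins
FULL OUTER JOIN ts_frontend.rec_isins USING (attachment, isin)
WHERE attachment = 2698120
GROUP BY isin
LIMIT 1000;
RESET work_mem;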


Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer




Re: [GENERAL] Performance of subselects

2009-03-09 Thread Christian Schröder

Scott Marlowe wrote:

you can run out of memory if too many connections try to use too much
of it at the same time, that's why it is advisable to set work_mem per
connection/query, should the connection/query require more.


Definitely.
  
I understand why this is advisable; however, something inside me hates
the idea of putting this kind of database-specific stuff inside an
application. What about portability? Why does the application developer
have to know about database internals? He knows SQL; that should be
sufficient.
I have the (maybe naive) idea of a clear separation of database 
administration (including performance tuning) and application 
development. Is this idea completely wrong?


Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer




Re: [GENERAL] Performance of subselects

2009-03-08 Thread Christian Schröder

Tom Lane wrote:

No, they're not the same; NOT IN has different semantics for nulls.
  
But in this case the column in the subselect has a not-null constraint. 
Does the planner recognize this constraint?

You're probably at the threshold where it doesn't think the hashtable
would fit in work_mem.
  
I have read in the docs that the work_mem value should be increased
carefully because the total memory used can be many times the value of
work_mem. Are there any statistics available about how many concurrent
sort or hash operations are running and how much memory they consume?
This would help to find out whether the value can be changed without
running out of memory.


Regards,
   Christian Schröder

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer




[GENERAL] Performance of subselects

2009-03-05 Thread Christian Schröder

Hi list,
if I want to find all records from a table that don't have a matching 
record in another table there are at least two ways to do it: Using a 
left outer join or using a subselect. I always thought that the planner 
would create identical plans for both approaches, but actually they are 
quite different which leads to a bad performance in one case.

I tried the following test case:

chschroe=# create table a (id integer primary key);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index a_pkey 
for table a

CREATE TABLE
chschroe=# create table b (id serial not null, fk integer not null, 
primary key (id, fk));
NOTICE:  CREATE TABLE will create implicit sequence b_id_seq for 
serial column b.id
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index b_pkey 
for table b

CREATE TABLE
chschroe=# insert into a select generate_series(1, 500000);
INSERT 0 500000
chschroe=# insert into b(fk) select generate_series(1, 450000);
INSERT 0 450000
chschroe=# analyze a;
ANALYZE
chschroe=# analyze b;
ANALYZE
chschroe=# explain analyze select * from b where fk not in (select id from a);
                                                         QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
 Seq Scan on b  (cost=10645.00..1955718703.00 rows=225000 width=8) (actual time=65378590.167..65378590.167 rows=0 loops=1)
   Filter: (NOT (subplan))
   SubPlan
     ->  Materialize  (cost=10645.00..18087.00 rows=500000 width=4) (actual time=0.008..72.326 rows=225000 loops=450000)
           ->  Seq Scan on a  (cost=0.00..7703.00 rows=500000 width=4) (actual time=0.008..894.163 rows=450000 loops=1)
 Total runtime: 65378595.489 ms
(6 rows)
chschroe=# explain analyze select b.* from b left outer join a on b.fk = a.id where a.id is null;
                                                   QUERY PLAN
----------------------------------------------------------------------------------------------------------------
 Hash Left Join  (cost=16395.00..38041.00 rows=225000 width=8) (actual time=1040.840..1040.840 rows=0 loops=1)
   Hash Cond: (b.fk = a.id)
   Filter: (a.id IS NULL)
   ->  Seq Scan on b  (cost=0.00..6933.00 rows=450000 width=8) (actual time=0.010..149.508 rows=450000 loops=1)
   ->  Hash  (cost=7703.00..7703.00 rows=500000 width=4) (actual time=408.126..408.126 rows=500000 loops=1)
         ->  Seq Scan on a  (cost=0.00..7703.00 rows=500000 width=4) (actual time=0.007..166.168 rows=500000 loops=1)
 Total runtime: 1041.945 ms
(7 rows)

Is there any difference between the two approaches that explain why the 
plans are so different? There would be a difference if the subselect 
could generate null values, but since the id field is a primary key 
field, it should be implicitly declared not null.
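
For comparison, a third formulation (not in the original post; it is equivalent
here only because b.fk and a.id are both not null). On 8.2 it runs as a
correlated subplan, while 8.4 and later turn it into a hash anti-join:

SELECT b.*
FROM b
WHERE NOT EXISTS (SELECT 1 FROM a WHERE a.id = b.fk);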


Another interesting thing: If table a contains only 400,000 rows 
(instead of 500,000) the query planner decides to use a hashed subplan 
and performance is fine again:


chschroe=# explain analyze select * from b where fk not in (select id from a);
                                                  QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Seq Scan on b  (cost=7163.00..15221.00 rows=225000 width=8) (actual time=472.969..497.096 rows=50000 loops=1)
   Filter: (NOT (hashed subplan))
   SubPlan
     ->  Seq Scan on a  (cost=0.00..6163.00 rows=400000 width=4) (actual time=0.010..124.503 rows=400000 loops=1)
 Total runtime: 509.632 ms
(5 rows)

Why this different plan?

All tests have been performed on a PostgreSQL 8.2.9 server:
chschroe=# select version();
                                                     version
------------------------------------------------------------------------------------------------------------------
 PostgreSQL 8.2.9 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20061115 (prerelease) (SUSE Linux)
(1 row)

Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer





Re: [GENERAL] Polymorphic setof record function?

2009-01-15 Thread Christian Schröder

Merlin Moncure wrote:

On 1/13/09, Christian Schröder c...@deriva.de wrote:
  

Hi list,
 I have written a function that returns a setof record. The function has a
table name as a parameter and the resulting records have the same structure
as this table. Is there any easy way to specify this when I call the
function? If the table has many columns then it's annoying to specify all of
them.
 I need something like:
   select * from myfunc('mytable') as x(like mytable)
 or
   select * from myfunc('mytable') as x(mytable%TYPE)

 Is there any solution for PostgreSQL 8.2?



Unfortunately to the best of my knowledge there is no way to do this.
 I think what you want is to have sql functions that specialize on
type in the way that templates do in C++.
  
That would certainly be the best solution, but I would also be happy 
with some syntactic sugar: The function may still be declared as 
returning a set of records, so that I would still have to declare their 
actual return type in the query. However, I would like to have an easy 
way to express: the record will have the same structure as table x.
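
A partial workaround that already works on 8.2, sketched with a hypothetical
table: the column list still has to be written once, inside a thin per-table
wrapper, but callers no longer repeat it:

CREATE TABLE mytable (id integer, label text);  -- hypothetical example table

-- wrapper declared with the table's own rowtype
CREATE FUNCTION myfunc_mytable() RETURNS SETOF mytable AS $$
    SELECT * FROM myfunc('mytable') AS x(id integer, label text);
$$ LANGUAGE sql;

SELECT * FROM myfunc_mytable();  -- no column list at the call site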


Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer




[GENERAL] Polymorphic setof record function?

2009-01-13 Thread Christian Schröder

Hi list,
I have written a function that returns a setof record. The function has 
a table name as a parameter and the resulting records have the same 
structure as this table. Is there any easy way to specify this when I 
call the function? If the table has many columns then it's annoying to 
specify all of them.

I need something like:
   select * from myfunc('mytable') as x(like mytable)
or
   select * from myfunc('mytable') as x(mytable%TYPE)

Is there any solution for PostgreSQL 8.2?

Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer





Re: [GENERAL] Query planner and foreign key constraints

2009-01-12 Thread Christian Schröder

Christian Schröder wrote:
When I join both tables using key1 and key2 there will be exactly 
1630788 rows because for each row in table2 there *must* exist a row 
in table1. But the query planner doesn't think so:


# explain analyze select * from table1 inner join table2 using (key1, key2);
                                                                 QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=0.00..94916.58 rows=39560 width=44) (actual time=0.103..7105.960 rows=1630788 loops=1)
   Merge Cond: ((table1.key1 = table2.key1) AND (table1.key2 = table2.key2))
   ->  Index Scan using table1_pkey on table1  (cost=0.00..22677.65 rows=896576 width=20) (actual time=0.038..689.059 rows=896576 loops=1)
   ->  Index Scan using table2_pkey on table2  (cost=0.00..59213.16 rows=1630788 width=44) (actual time=0.049..1108.220 rows=1630788 loops=1)
 Total runtime: 7525.492 ms
(5 rows)
A wrong assumption about the number of rows in a join seems to be one 
major cause for the long running queries that we experience. Here is 
another example (part of a larger query):


->  Hash Join  (cost=18.73..369864.68 rows=33583 width=24) (actual time=2.994..9636.586 rows=883068 loops=1)
      Hash Cond: ((b."ID_ISSUER_GROUP" = ic."ID_ISSUER_GROUP") AND (substr((b."ISIN")::text, 1, 2) = (ic."ID_COUNTRY")::text))
      ->  Seq Scan on "Z_BASE" b  (cost=0.00..265745.99 rows=883099 width=20) (actual time=0.048..5380.554 rows=883099 loops=1)
      ->  Hash  (cost=9.89..9.89 rows=589 width=14) (actual time=2.793..2.793 rows=589 loops=1)
            ->  Seq Scan on "ISSUER_CODES" ic  (cost=0.00..9.89 rows=589 width=14) (actual time=0.047..1.151 rows=589 loops=1)


This join has about 25 times more rows than the query planner thinks. In 
my naive thinking, all further planning is simply wild guessing ...


What can I do to address this issue? I tried to create a functional
index on substr(b."ISIN", 1, 2), but execution time became even worse.
(I cancelled the query after several minutes.) Is there any way to tell
the query planner about the (non-)selectivity of the hash condition?
Would it help to increase the statistics target of one of the columns?
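
As a concrete form of that last idea (a sketch; the target value is arbitrary,
and 8.2 keeps no cross-column statistics, so the combined selectivity of the
two hash conditions may stay wrong regardless):

ALTER TABLE "Z_BASE" ALTER COLUMN "ID_ISSUER_GROUP" SET STATISTICS 1000;
ALTER TABLE "ISSUER_CODES" ALTER COLUMN "ID_COUNTRY" SET STATISTICS 1000;
ANALYZE "Z_BASE";
ANALYZE "ISSUER_CODES";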


Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer




Re: [GENERAL] SPI_ERROR_CONNECT in plperl function

2009-01-08 Thread Christian Schröder

Tom Lane wrote:

Hmph ... looks like plperl is shy a few SPI_push/SPI_pop calls.
  


I've applied a patch for this --- it'll be in the next set of update
releases.
  

Great. Thanks a lot! The whole PostgreSQL stuff is really amazing! :-)

Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer




[GENERAL] What determines the cost of an index scan?

2009-01-04 Thread Christian Schröder

Hi list,
I have experienced the following situation: A join between two tables 
(one with ~900k rows, the other with ~1600k rows) takes about 20 sec on 
our productive database. I have created two tables in our test database 
with the same data, but with fewer fields. (I have omitted several 
fields that do not participate in the join.) If I try the join in our 
test database it takes about 8 sec. Both queries have the same query plan:


prod=# explain analyze select 1 from dtng."Z_UL" inner join dtng."Z_BARRIER" using ("ISIN", "ID_NOTATION");
                                                                      QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=0.00..131201.15 rows=39376 width=0) (actual time=0.198..16086.185 rows=1652076 loops=1)
   Merge Cond: ((("Z_UL"."ISIN")::bpchar = ("Z_BARRIER"."ISIN")::bpchar) AND ("Z_UL"."ID_NOTATION" = "Z_BARRIER"."ID_NOTATION"))
   ->  Index Scan using "Z_UL_pkey" on "Z_UL"  (cost=0.00..34806.57 rows=897841 width=20) (actual time=0.075..1743.396 rows=897841 loops=1)
   ->  Index Scan using "Z_BARRIER_ISIN_ID_NOTATION_key" on "Z_BARRIER"  (cost=0.00..83255.17 rows=1652076 width=20) (actual time=0.076..3389.676 rows=1652076 loops=1)
 Total runtime: 18123.042 ms

test=# explain analyze select 1 from table1 inner join table2 using (key1, key2);
                                                                 QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=0.00..82443.05 rows=36158 width=0) (actual time=0.092..8036.490 rows=1652076 loops=1)
   Merge Cond: ((table1.key1 = table2.key1) AND (table1.key2 = table2.key2))
   ->  Index Scan using table1_pkey on table1  (cost=0.00..22719.56 rows=897841 width=20) (actual time=0.026..845.916 rows=897841 loops=1)
   ->  Index Scan using table2_key1_key2_key on table2  (cost=0.00..46638.20 rows=1652076 width=20) (actual time=0.049..1843.047 rows=1652076 loops=1)
 Total runtime: 8460.956 ms

No disk io occurs in either server, so I guess that the whole data is 
already in memory. Both servers are idle. Both use the same PostgreSQL 
version (8.2.9). Both servers are 64bit machines. However, the servers 
have different CPUs and memory: The production server has 4 Dual-Core 
AMD Opteron 8214 processors (2.2 GHz) and 16 GB memory, the test server 
has 2 Dual-Core Intel Xeon 5130 processors (2.0 GHz) and 8 GB memory. I 
have not yet done a CPU and memory benchmark, but this is my next step.


Where does this difference come from? Pure cpu performance? Do the 
additional fields in the productive database have an impact on the 
performance? Or do I miss something?
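
One cheap check before benchmarking hardware (a sketch, not from the original
mail): the omitted columns make the production rows wider, and wider rows mean
more heap pages per index scan, so comparing the on-disk sizes quantifies that
directly:

-- on the production server:
SELECT pg_relation_size('dtng."Z_UL"')      AS table_bytes,
       pg_relation_size('dtng."Z_UL_pkey"') AS pkey_bytes;

-- on the test server:
SELECT pg_relation_size('table1')      AS table_bytes,
       pg_relation_size('table1_pkey') AS pkey_bytes;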


Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer




Re: [GENERAL] Query planner and foreign key constraints

2009-01-04 Thread Christian Schröder

Christian Schröder wrote:

in our PostgreSQL 8.2.9 database I have these tables:

   create table table1 (
   key1 char(12),
   key2 integer,
   primary key (key1, key2)
   );

   create table table2 (
   key1 char(12),
   key2 integer,
   key3 varchar(20),
   primary key (key1, key2, key3),
   foreign key (key1, key2) references table1 (key1, key2)
   );

Table1 has 896576 rows. Table2 has 1630788 rows. The statistics target 
of the columns key1 and key2 in both tables has been set to 1000. Both 
tables have been analyzed.
When I join both tables using key1 and key2 there will be exactly 
1630788 rows because for each row in table2 there *must* exist a row 
in table1. But the query planner doesn't think so:


# explain analyze select * from table1 inner join table2 using (key1, key2);
                                                                 QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=0.00..94916.58 rows=39560 width=44) (actual time=0.103..7105.960 rows=1630788 loops=1)
   Merge Cond: ((table1.key1 = table2.key1) AND (table1.key2 = table2.key2))
   ->  Index Scan using table1_pkey on table1  (cost=0.00..22677.65 rows=896576 width=20) (actual time=0.038..689.059 rows=896576 loops=1)
   ->  Index Scan using table2_pkey on table2  (cost=0.00..59213.16 rows=1630788 width=44) (actual time=0.049..1108.220 rows=1630788 loops=1)
 Total runtime: 7525.492 ms
(5 rows)

You can also find the query plan at 
http://explain-analyze.info/query_plans/2648-query-plan-1371.


What can I do to make the query planner realize that the join will 
have 1630788 rows? This join is part of a view which I then use in 
other joins and this wrong assumption leads to really bad performance.
I have not yet found any solution. My queries still take several minutes 
to complete. :-(

No ideas at all?

Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer




[GENERAL] SPI_ERROR_CONNECT in plperl function

2009-01-04 Thread Christian Schröder

Hi list,
I have found the following problem: I have declared a domain datatype 
with a check constraint. The check constraint uses a plpgsql function:


CREATE FUNCTION domain_ok(value integer) RETURNS boolean AS $$
BEGIN
    RETURN value > 0;
END;
$$ LANGUAGE plpgsql;

CREATE DOMAIN testdomain integer CHECK (domain_ok(value));

I then wrote a plperl function that returns a set of this domain type:

CREATE FUNCTION testfunc() RETURNS SETOF testdomain AS $$
return_next(42);
return undef;
$$ LANGUAGE plperl;

When I try to call the function I get the following error message:

test=# select * from testfunc();
ERROR:  error from Perl function: SPI_connect failed: SPI_ERROR_CONNECT 
at line 2.


The problem goes away if I declare the testfunc to return an integer. 
Even a domain with a normal check constraint (i.e. without one that 
calls another function) does not cause this error.


Is this a known bug? Or do I miss something?

Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer




Re: [GENERAL] Query planner and foreign key constraints

2008-12-30 Thread Christian Schröder

Filip Rembiałkowski wrote:


  create table table1 (
  key1 char(12),
  key2 integer,
  primary key (key1, key2)
  );

  create table table2 (
  key1 char(12),
  key2 integer,
  key3 varchar(20),
  primary key (key1, key2, key3),
  foreign key (key1, key2) references table1 (key1, key2)
  );


just a guess, but - did you try to declare NOT NULL on FK columns?

your assumption that for each row in table2 there *must* exist a row 
in table1 will be enforced then.


maybe the planner will make use of this ...
All columns are implictly declared not null because they are part of 
the primary key of the tables:


# \d table1
      Table "public.table1"
 Column |     Type      | Modifiers
--------+---------------+-----------
 key1   | character(12) | not null
 key2   | integer       | not null
Indexes:
    "table1_pkey" PRIMARY KEY, btree (key1, key2)

# \d table2
          Table "public.table2"
 Column |         Type          | Modifiers
--------+-----------------------+-----------
 key1   | character(12)         | not null
 key2   | integer               | not null
 key3   | character varying(20) | not null
Indexes:
    "table2_pkey" PRIMARY KEY, btree (key1, key2, key3)
Foreign-key constraints:
    "table2_key1_fkey" FOREIGN KEY (key1, key2) REFERENCES table1(key1, key2)


Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer





[GENERAL] Query planner and foreign key constraints

2008-12-29 Thread Christian Schröder

Hi list,
in our PostgreSQL 8.2.9 database I have these tables:

   create table table1 (
   key1 char(12),
   key2 integer,
   primary key (key1, key2)
   );

   create table table2 (
   key1 char(12),
   key2 integer,
   key3 varchar(20),
   primary key (key1, key2, key3),
   foreign key (key1, key2) references table1 (key1, key2)
   );

Table1 has 896576 rows. Table2 has 1630788 rows. The statistics target 
of the columns key1 and key2 in both tables has been set to 1000. Both 
tables have been analyzed.
When I join both tables using key1 and key2 there will be exactly 
1630788 rows because for each row in table2 there *must* exist a row in 
table1. But the query planner doesn't think so:


# explain analyze select * from table1 inner join table2 using (key1, key2);
                                                                 QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=0.00..94916.58 rows=39560 width=44) (actual time=0.103..7105.960 rows=1630788 loops=1)
   Merge Cond: ((table1.key1 = table2.key1) AND (table1.key2 = table2.key2))
   ->  Index Scan using table1_pkey on table1  (cost=0.00..22677.65 rows=896576 width=20) (actual time=0.038..689.059 rows=896576 loops=1)
   ->  Index Scan using table2_pkey on table2  (cost=0.00..59213.16 rows=1630788 width=44) (actual time=0.049..1108.220 rows=1630788 loops=1)
 Total runtime: 7525.492 ms
(5 rows)

You can also find the query plan at 
http://explain-analyze.info/query_plans/2648-query-plan-1371.


What can I do to make the query planner realize that the join will have 
1630788 rows? This join is part of a view which I then use in other 
joins and this wrong assumption leads to really bad performance.


Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer





Re: [GENERAL] inherit table and its data

2008-11-21 Thread Christian Schröder

Dilyan Berkovski wrote:
I am using PostgreSQL 8.2, and I am interested in creating a table B that inherits table A, but with all its data!

create table B (a int) inherits (A); just adds the structure of table A, not its data.
  
PostgreSQL's inheritance works the other way around: If table B inherits 
table A and you select all records from A you will also get all records 
from B (but only the fields that are inherited from A). This is 
consistent with the definition of inheritance in object-oriented 
programming.
Consider the example from the manual (chapter 5.8): Assume you have a 
table with cities and a table with capitals which inherits from the 
cities table. If you select from the capitals table, why should you 
expect to get all cities, even those that are not capitals?
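
The manual's example, condensed into a runnable sketch (data values made up):

CREATE TABLE cities (name text, population int);
CREATE TABLE capitals (state char(2)) INHERITS (cities);

INSERT INTO cities (name, population) VALUES ('Gotham', 900000);
INSERT INTO capitals (name, population, state) VALUES ('Madison', 250000, 'WI');

SELECT name FROM cities;    -- returns both rows: the parent sees child rows
SELECT name FROM capitals;  -- returns only Madison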



Is it possible to do this without the classic way (a trigger on table A so each
insert/update is also done in B), or is it not possible without triggers?
  

I don't think so.

Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer





Re: [GENERAL] MS Access and PostgreSQL - a warning to people thinking about it

2008-11-19 Thread Christian Schröder

Craig Ringer wrote:

If I'm wrong about any of this (which is not unlikely, really) then if
anyone else is lucky enough to be using Access with PostgreSQL and
knows of a better solution or workaround, please feel free to correct me.
  
We have been working with the combination of a PostgreSQL backend and an 
Access frontend for several years. Since we still use Access 2003 I 
cannot tell if things have become worse with Access 2007, but my 
experiences with this combination are not too bad. I would agree that 
there is some tricky stuff about it, and one should always consider 
another solution (e.g. a web-based frontend), but I would not consider 
the Access solution a no-go.

The big issue is with Access's linked table support via ODBC (at least
as of Access 2007). Unlike tools like Hibernate, which are capable of
executing filters, queries across multiple tables, etc server-side,
Access will ALWAYS fetch the full contents of the linked table then do
its filters and joins client-side.
  
That's not exactly what I observed. Access actually knows about primary 
keys (at least sometimes *g*). When you link the table it tries to find 
the primary key fields and somehow stores this information. If no 
primary key exists you will be asked to select one or more fields that 
uniquely identify a record.
When you then open a linked table without any filter, Access fetches all 
records from the primary key column(s). Of course this can already be a 
bad idea, but it's at least better than fetching all the data. The next 
step is to fetch a couple of records (50 when I just tried it) 
identified by their primary keys (some nasty SELECT <lengthy field
list> FROM <table> WHERE pk = ... OR pk = ... OR pk = ... statements).
When you skip to another part of the table, the next block of records is 
fetched.

As far as I can tell there is no way to get it to execute even simple
filters (think WHERE id = 99) server-side while still using Access's
built-in support for linked tables etc. If you want to do joins, filters,
  
I tried to apply an Access filter (not using a query) and the result was 
again fetched in two steps: First the matching primary keys were 
selected using a where clause (SELECT <pk> FROM <table> WHERE
<condition>) and then the full data of the matching records was fetched
(this time with one query for each record).
When I create a query (not pass-through) that joins two tables I 
actually get a join in the generated backend query. It's again the 
two-step approach, fetching the primary keys first (SELECT <pk1>, <pk2>
FROM <table1>, <table2> WHERE <condition> AND <join condition>) and
then the data. Interestingly, the data is fetched separately from both 
tables after the matching records have been identified.


I would never praise Access as the ideal frontend for PostgreSQL, but it
may turn out to be the best solution when you need a quick and simple
frontend and bandwidth is not too restricted. Of course there are some problems
that must be considered:


   * Auto-generated keys are a problem.
   * Nullable booleans are not supported. (But you can use a smallint
 instead with only 0 and -1 allowed which works fine.)
   * Timestamps must always be clipped to precision 0.
   * Dates before 01.01.0200 are not supported.
   * ...

Regards,
   Christian

P.S.: I used Access 2003, a PostgreSQL 8.2 backend and the latest 8.2 
version of the PostgreSQL ODBC driver.


--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer





Re: [GENERAL] No serial type

2008-11-18 Thread Christian Schröder

Scott Marlowe wrote:

Serial is a pseudotype.  It represents creating an int or bigint and
a sequence then assigning a default value for the column and setting
dependency in the db so the sequence will be dropped when the table
gets dropped.  If you don't want to recreate the table, you can do
this:

create table a (i int primary key, info text);
create sequence a_i_seq;
alter table a alter column i set default nextval('a_i_seq');
  
You could even use create sequence a_i_seq owned by a.i. This would 
cause the sequence to be dropped when the table is dropped which I think 
is the default behaviour if you create a column with type serial.
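
Putting the pieces together, the fully spelled-out equivalent of declaring the
column as i serial primary key:

CREATE TABLE a (i int PRIMARY KEY, info text);
CREATE SEQUENCE a_i_seq OWNED BY a.i;  -- dropped automatically with column a.i
ALTER TABLE a ALTER COLUMN i SET DEFAULT nextval('a_i_seq');

INSERT INTO a (info) VALUES ('first row');  -- i gets 1 from the sequence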


Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer





Re: [GENERAL] Database recovery

2008-11-13 Thread Christian Schröder

Christian Schröder wrote:
we have a PostgreSQL 8.2 database on an xfs filesystem. By mistake the 
first blocks of this filesystem were overwritten. An xfs_repair 
reconstructed the superblock and also found many orphaned files and 
directories. Actually, all we have on the filesystem now is in 
lost+found. ;-)
When I look in lost+found I have many files that *could* be database 
files, but since the original name of the files is unknown I cannot 
tell for sure. I have found a directory that looks like the original 
data directory, with stuff like postmaster.log, pg_hba.conf and 
even subdirectories base, global etc. in it. I have been able to 
start postgresql from this directory, but when I tried to access the 
most important database I got a message that the database directory 
could not be found. Indeed, this directory is missing in base, but 
there is a chance that some of the other files might be the original 
content of this directory.
Is there any way to find out which of the files is really a postgres 
data file? Or even for which database? Although the database file and 
page layout are described in the manual, I could not find an exact 
description of the file format, e.g. any magic numbers at the 
beginning of the file.

Hmmm, no idea? :-(

Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Database recovery

2008-11-10 Thread Christian Schröder

Hi list,
we have a PostgreSQL 8.2 database on an xfs filesystem. By mistake the 
first blocks of this filesystem were overwritten. An xfs_repair 
reconstructed the superblock and also found many orphaned files and 
directories. Actually, all we have on the filesystem now is in 
lost+found. ;-)
When I look in lost+found I have many files that *could* be database 
files, but since the original name of the files is unknown I cannot tell 
for sure. I have found a directory that looks like the original data 
directory, with stuff like postmaster.log, pg_hba.conf and even 
subdirectories base, global etc. in it. I have been able to start 
postgresql from this directory, but when I tried to access the most 
important database I got a message that the database directory could not 
be found. Indeed, this directory is missing in base, but there is a 
chance that some of the other files might be the original content of 
this directory.
Is there any way to find out which of the files is really a postgres 
data file? Or even for which database? Although the database file and 
page layout are described in the manual, I could not find an exact 
description of the file format, e.g. any magic numbers at the beginning 
of the file.
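
Two catalog queries can at least produce the expected names (a sketch based on
the documented layout: each database is a directory base/<oid>, and each
relation's data file is named after its relfilenode). Run them against the
instance you managed to start, or any other cluster holding the same databases:

SELECT oid, datname FROM pg_database;              -- the base/<oid>/ directories
-- connect to each database, then map table names to file names:
SELECT relname, relfilenode FROM pg_class ORDER BY relfilenode;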


Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer




Re: [GENERAL] Storage location of temporary files

2008-11-04 Thread Christian Schröder

Tomasz Ostrowski wrote:

This is wrong. RAID5 is slower than RAID1.
You should go for RAID1+0 for fast and reliable storage. Or RAID0 for
even faster but unreliable.
  
I did not find a clear statement about this. I agree that RAID10 would 
be better than RAID5, but in some situations RAID5 at least seems to be 
faster than RAID1.


If I have 5 disks available, how should I use them to get best 
performance without the risk of severe data loss? If I use 4 of the 
disks to build a RAID10 then I will have only 1 remaining drive, e.g. to 
put the pgsql_tmp directories there. In this scenario I would not have 
the WAL on a separate disk.
Or should I use 3 disks to build a RAID5, 1 disk for tempspace and 1 
disk for WAL? How important is data integrity for the WAL? If the WAL 
disk fails, can this corrupt my data? Or would I just lose the data 
after the last checkpoint?
Or maybe I should use 2 disks as RAID1 for the database, 2 disks as 
RAID1 for the WAL and the remaining disk for the tempspace?


Regards,
  Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer






[GENERAL] Storage location of temporary files

2008-10-31 Thread Christian Schröder

Hi list,
I want to optimize the performance of our PostgreSQL 8.2 server. Up to 
now the server has a raid1 where the whole database is located 
(including tha WAL files). We will now move the database to a raid5 
(which should be faster than the raid1) and will also move the WAL to a 
separate disk (which should, according to the docs, also increase the 
performance).
But I see the temporary files as another important performance 
bottleneck. From the docs (chapter 17.4.1):


   work_mem (integer)
   Specifies the amount of memory to be used by internal sort
   operations and hash tables before switching to temporary disk files.
   [...]

We have rather complex queries and as I far as I see from the disk usage 
patterns the system makes use of temporary disk files. So I would like 
to use a faster disk for these temporary files, too, but I could not 
find where the temporary files are located. Is there a separate 
directory? I have found a pgsql_tmp directory inside of the database 
directories (base/oid/pgsql_tmp). Is this what I'm looking for?

Thanks for your help!

Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer





Re: [GENERAL] Storage location of temporary files

2008-10-31 Thread Christian Schröder

Christian Schröder wrote:
So I would like to use a faster disk for these temporary files, too, 
but I could not find where the temporary files are located. Is there a 
separate directory? I have found a pgsql_tmp directory inside of the 
database directories (base/oid/pgsql_tmp). Is this what I'm 
looking for?

Just one addition: I have found the following in the 8.3 docs:

   temp_tablespaces (string)
   This variable specifies tablespace(s) in which to create temporary
   objects (temp tables and indexes on temp tables) when a CREATE
   command does not explicitly specify a tablespace. Temporary files
   for purposes such as sorting large data sets are also created in
   these tablespace(s).

So my problem seems to have been addressed in the 8.3 release. Maybe we 
can upgrade our database, but until that happens I will need another 
solution, so my question remains ...
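
If we do upgrade, the 8.3 setup would presumably look something like 
this (an untested sketch; the path is made up):

    -- 8.3+: put temporary files on a dedicated fast disk
    CREATE TABLESPACE fasttemp LOCATION '/mnt/fastdisk/pg_temp';
    -- per session:
    SET temp_tablespaces = 'fasttemp';
    -- or for all sessions, in postgresql.conf: temp_tablespaces = 'fasttemp'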


Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer




[GENERAL] Views and permissions

2008-01-21 Thread Christian Schröder

Hi list,
yesterday I moved our database from one server to another. I did a full 
dump of the database and imported the dump into the new server. Since 
then I have a strange problem which I cannot explain ...

I have a table public.EDITORS:

           Table "public.EDITORS"
  Column  |          Type          |      Modifiers
----------+------------------------+---------------------
 code     | character(2)           | not null
 active   | smallint               | not null default -1
 name     | character varying(100) |
 username | name                   | not null
Indexes:
    "EDITORS_pkey" PRIMARY KEY, btree (code)
    "EDITORS_username_key" UNIQUE, btree (username)

And I have a view ts_frontend.v_editors:

     View "ts_frontend.v_editors"
  Column   |          Type          | Modifiers
-----------+------------------------+-----------
 code      | character(2)           |
 name      | character varying(100) |
 username  | name                   |
 usergroup | text                   |
View definition:
 SELECT "EDITORS".code, "EDITORS".name, "EDITORS".username, ( SELECT
        CASE
            WHEN "EDITORS".code = ANY (ARRAY['AF'::bpchar, 
'CS'::bpchar, 'FK'::bpchar, 'FW'::bpchar, 'JK'::bpchar, 'JS'::bpchar, 
'KJ'::bpchar, 'KR'::bpchar, 'MP'::bpchar, 'PB'::bpchar, 'RB'::bpchar, 
'RR'::bpchar, 'SJ'::bpchar]) THEN 'a'::text
            WHEN "EDITORS".code = ANY (ARRAY['JA'::bpchar, 
'AG'::bpchar, 'BK'::bpchar]) THEN 'o'::text
            ELSE 'z'::text
        END AS "case") AS usergroup
   FROM "EDITORS"
  WHERE "EDITORS".active <> 0
  ORDER BY "EDITORS".name;

A user www has read access on both the view and the table. When I log 
into the database as this user and execute the view's sql, everything 
works fine. But when I try to select from the view, I get an ERROR:  
permission denied for relation EDITORS.
How can this happen? As far as I understand, views are simply rewrite 
rules, so it should make no difference if I use the view or directly use 
the sql. Moreover, this error never happened before I moved to the new 
server. The new server completely replaced the old one (it has the same 
name, ip address etc.) so I cannot imagine how the migration can 
influence this behaviour.

If it is important: The PostgreSQL version is 8.2.6.

Thanks a lot for any hints,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer





Re: [GENERAL] Views and permissions

2008-01-21 Thread Christian Schröder

Albe Laurenz wrote:

One possibility I see is that there is more than one table
called EDITORS and they get confused.

What do you get when you

SELECT t.oid, n.nspname, t.relname
FROM pg_catalog.pg_class t JOIN
 pg_catalog.pg_namespace n ON t.relnamespace = n.oid
WHERE t.relname='EDITORS';
  

As expected, there is only one table:

   oid   | nspname | relname
---------+---------+---------
 3045229 | public  | EDITORS
(1 row)


Can you show us the permissions for ts_frontend.v_editors as well
as for any EDITORS table you find (e.g. using \z in psql).
  
                            Access privileges for database "zertifikate"
 Schema |  Name   | Type  |                                  Access privileges
--------+---------+-------+------------------------------------------------------------------------------------------------------
 public | EDITORS | table | {chschroe=arwdxt/chschroe,zert_readers=r/chschroe,zert_writers=arwd/chschroe,ts_frontend=x/chschroe}
(1 row)

                  Access privileges for database "zertifikate"
   Schema    |   Name    | Type |                        Access privileges
-------------+-----------+------+--------------------------------------------------------------------------
 ts_frontend | v_editors | view | {ts_frontend=arwdxt/ts_frontend,www=r/ts_frontend,backup=r/ts_frontend}
(1 row)

The user www is a member of the zert_readers group:

zertifikate=# select pg_has_role('www', 'zert_readers', 'MEMBER');
 pg_has_role
-------------
 t
(1 row)

I have also tried to drop and recreate the view, but it didn't help.

Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer




Re: [GENERAL] Views and permissions

2008-01-21 Thread Christian Schröder

Albe Laurenz wrote:

User ts_frontend, the owner of the view ts_frontend.v_editors, does not
have the SELECT privilege on the underlying table public.EDITORS.

Because of that neither he nor anybody else can select from the view,
although ts_frontend is able to create the view.
  
Indeed, you are right! Granting select permission to the ts_frontend 
user (more precisely: granting membership to the zert_readers role) 
solved the problem.

This is strange because ts_frontend can select from EDITORS because
of the membership to role zert_readers.
  
No, the user ts_frontend is (was) not a member of the group 
zert_readers, but the user www who uses the view is. Until now I 
always thought that the user that *uses* the view must have the 
appropriate privileges, but it seems to depend also on the privileges of 
the user that *defines* the view.
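
So, in terms of the objects above, the fix boils down to one of these 
(sketch):

    -- give the view's owner direct read access to the underlying table:
    GRANT SELECT ON public."EDITORS" TO ts_frontend;
    -- or, as I actually did, make the owner a member of the reader role:
    GRANT zert_readers TO ts_frontend;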

Since this database is from a pg_dump from another database where things
worked as expected:
- What is the version of that database?
- Do permissions look identical in that database?
  
Ok, I have found my mistake: During migration of the roles, I did not 
handle them the way I should have: I only migrated group memberships 
for users, but not for other groups. Maybe I should correct my 
migration script and remove the distinction between users and groups 
altogether. Or is there a way to migrate the roles using the PostgreSQL 
tools? I normally dump the databases one by one (using pg_dump and not 
pg_dumpall), so the system catalogs (especially the roles) must be 
transferred separately.


That doesn't explain why views behave the way they do, but at least it 
describes why things suddenly stopped working.


Many thanks for your help!!

Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer





Re: [GENERAL] Views and permissions

2008-01-21 Thread Christian Schröder

Tom Lane wrote:

Table accesses done by a view are checked according to the privileges
of the owner of the view, not of whoever invoked the view.  It's a
bit inconsistent because function calls done in the view are not handled
that way (though I hope we change them to match, someday).
  
Phew, sometimes I'm surprised about my own stupidity! I have used this 
more than once to create views that give people access to tables they 
would otherwise not be allowed to read, but I simply did not recognize 
that the same mechanism was at work in this case.
And by now I also found the section in the manual where this is 
described (35.4, if someone is interested).

You can use pg_dumpall -g to get a dump of just global objects (roles
and tablespaces).  If you do want to stick to hand-rolled scripts, then
  
Thanks, I didn't know this option. The next migration will be much 
easier with this!

yeah, you need to take another look at it.  Since 8.1 there is very
little difference between users and groups --- they are all roles, and
the only actual difference is the default settings of their LOGIN and
INHERITS flags.  See the CREATE ROLE reference page for details.
  
Yes, I know the new role concept, but I didn't realize that it had these 
impacts on my script. Anyway, I won't need it anymore, now that you told 
me the pg_dumpall -g solution.
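
For the record, in role terms the old users and groups map roughly like 
this (a sketch using the names from this thread):

    -- since 8.1 there are only roles; a "user" is a role with LOGIN,
    -- a "group" is a role without it:
    CREATE ROLE zert_readers NOLOGIN;          -- old-style group
    CREATE ROLE www LOGIN PASSWORD 'secret';   -- old-style user
    GRANT zert_readers TO www;                 -- old-style group membership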


Thanks again,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer




[GENERAL] and then / or else

2007-11-17 Thread Christian Schröder

Hi list,
the logical operators "and" and "or" are commutative, i.e. there is no 
short-circuiting. Especially when doing PL/pgSQL development it would 
sometimes be very handy to have this short-circuiting. Unfortunately, 
the trick from the docs (chapter 4.2.12) using "case ... then" does not 
work inside an "if" statement (the "then" of the "case" is interpreted 
as belonging to the "if" and thus leads to a syntax error).
Of course, I can sometimes use nested "if" statements, but that doesn't 
always meet the logical requirements and leads to less readable code.
Some programming languages (e.g. Eiffel) know the operators "and then" 
and "or else", which are explicitly short-circuited. Would it be 
possible to add these operators to PostgreSQL in general or at least to 
the PL/pgSQL syntax?


Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer




Re: [GENERAL] and then / or else

2007-11-17 Thread Christian Schröder

Michael Glaesemann wrote:

On Nov 17, 2007, at 3:53 , Christian Schröder wrote:

Unfortunately, the trick from the docs (chapter 4.2.12) using "case 
... then" does not work inside an "if" statement (the "then" of the 
"case" is interpreted as belonging to the "if" and thus leads to a 
syntax error).


I think if you use parentheses you can avoid the syntax error:
Ah, I didn't know that parentheses are allowed here. (And I must admit I 
didn't try.)
Nonetheless, I think it would improve readability to have an extra 
operator for this.
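
For the archives, here is the parenthesized form of the trick, wrapped 
in a function so it is self-contained (a made-up, untested example):

    CREATE OR REPLACE FUNCTION ratio_is_large(x numeric, y numeric)
    RETURNS boolean AS $$
    BEGIN
        -- The parentheses keep the CASE's THEN from being attached to
        -- the IF, and CASE guarantees y / x is never evaluated when x = 0:
        IF (CASE WHEN x <> 0 THEN y / x > 1 ELSE false END) THEN
            RETURN true;
        END IF;
        RETURN false;
    END;
    $$ LANGUAGE plpgsql;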


Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer




Re: [GENERAL] (Never?) Kill Postmaster?

2007-11-13 Thread Christian Schröder

Tom Lane wrote:

Hah, I've got it.  (Should have searched Red Hat's bugzilla sooner.)
What you are hitting is a glibc bug, as explained here:
http://sources.redhat.com/ml/libc-hacker/2007-10/msg00010.html
If libpthread is loaded after first use of dcgettext, then subsequent
uses are at risk of hanging because they start to use a mutex lock
that was never initialized.  And libperl brings in libpthread.
  


Many thanks for your efforts which have been crowned with success at last!!

What remains inscrutable to me is why this problem did not arise 
earlier. I cannot remember any changes that I have made to the system 
recently. Or maybe it was a defective update of the glibc? I did not 
see in the bug report any hint about when this bug was introduced. 
Obviously it has something to do with the users' connections being 
dropped, but we have been working with dial-up connections for many 
years and never saw this bug.


By the way, does the --enable-thread-safety switch have anything to do 
with the problem?



So you need to pester SuSE for a glibc with that fix in it ...
  


Phew, that might be hard work, but I will give it a try. Until I have 
success I will probably restart the database server once a week to 
remove those old connections.


Thanks a lot for your help!
   Christian

P.S.: Can I change the root password again?

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer




Re: [GENERAL] (Never?) Kill Postmaster?

2007-11-11 Thread Christian Schröder

Tom Lane wrote:

Christian Schröder [EMAIL PROTECTED] writes:
  
Although I do not yet have any processes that are stuck inside a 
statement, there are some that are idle, but do not respond to SIGINT or 
even SIGTERM. Is this sufficient?



Dunno.  Have you looked at their call stacks with gdb?
  

(gdb) bt
#0  0x2b24aeee0a68 in __lll_mutex_lock_wait () from 
/lib64/libpthread.so.0
#1  0x2b24aeedde88 in pthread_rwlock_rdlock () from 
/lib64/libpthread.so.0

#2  0x2b24a5814e23 in _nl_find_msg () from /lib64/libc.so.6
#3  0x2b24a5815c83 in __dcigettext () from /lib64/libc.so.6
#4  0x2b24a585df0b in strerror_r () from /lib64/libc.so.6
#5  0x2b24a585dd33 in strerror () from /lib64/libc.so.6
#6  0x005f4daa in expand_fmt_string ()
#7  0x005f6d14 in errmsg ()
#8  0x005185f3 in pq_recvbuf ()
#9  0x00518987 in pq_getbyte ()
#10 0x0057eb69 in PostgresMain ()
#11 0x00558218 in ServerLoop ()
#12 0x00558db8 in PostmasterMain ()
#13 0x0051a213 in main ()

Seems to be the same as for the processes that were stuck inside of a 
statement.


I recompiled the server with debugging symbols enabled and then did the 
following experiment: I started a query which I knew would take some 
time. While the query executed I disconnected my dial-up line. After 
reconnecting the backend process was still there (still SELECTing). 
Meanwhile the query is finished and the process is idle, but it's still 
present. I tried to kill -SIGINT the process and it didn't respond. 
pg_cancel_backend also didn't work. However, a kill -SIGTERM did 
kill the process (in contrast to the processes I saw before) and the 
call stack is different:


(gdb) bt
#0  0x2ac5d1d0f0c5 in recv () from /lib64/libc.so.6
#1  0x00514292 in secure_read (port=0x996730, ptr=0x917760, 
len=8192) at be-secure.c:313

#2  0x00518574 in pq_recvbuf () at pqcomm.c:723
#3  0x00518987 in pq_getbyte () at pqcomm.c:764
#4  0x0057eb69 in PostgresMain (argc=4, argv=0x97f830, 
username=0x97f6d0 chschroe) at postgres.c:301

#5  0x00558218 in ServerLoop () at postmaster.c:2934
#6  0x00558db8 in PostmasterMain (argc=3, argv=0x97bfd0) at 
postmaster.c:966

#7  0x0051a213 in main (argc=3, argv=0x97bfd0) at main.c:188

I now did the same again and will wait some time. Maybe the process will 
come to the same state the other processes had been. If not, I'm pretty 
sure that my users will be able to create some really persistent 
processes again ... ;-)


Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer




Re: [GENERAL] (Never?) Kill Postmaster?

2007-11-11 Thread Christian Schröder

Tom Lane wrote:


I recompiled the server with debugging symbols enabled and then did the 
following experiment: I started a query which I knew would take some 
time. While the query executed I disconnected my dial-up line. After 
reconnecting the backend process was still there (still SELECTing). 
Meanwhile the query is finished and the process is idle, but it's still 
present.



That is probably not the same situation because (assuming the query
didn't produce a lot of output) the kernel does not yet think that the
network connection is lost irretrievably.  You'd have to wait for the
TCP timeout interval to elapse, whereupon the kernel would report the
connection lost (EPIPE or ECONNRESET error), whereupon we'd enter the
code path shown above.
  


That's what I also thought, but unfortunately it doesn't seem to be the 
same situation. After a while, the connection disappears and the server 
log says (quite correctly, in my opinion):


2007-11-11 18:45:43 CET - chschroe LOG:  could not receive data from 
client: Die Wartezeit für die Verbindung ist abgelaufen
2007-11-11 18:45:43 CET - chschroe LOG:  unexpected EOF on client 
connection

(The German text in the first message means "the connection timed out".)



One thing I'm suddenly thinking might be related: didn't you mention
that you have some process that goes around and SIGINT's backends that
it thinks are running too long?  I'm wondering if a SIGINT event is a
necessary component of producing the problem ...
  


Maybe. On the other hand, I sent a SIGINT to my process and it 
nonetheless didn't show the strange behaviour. To test this I will 
change the script so that it will leave alone the processes of one of 
the users.


I think the users will create some more unkillable processes tomorrow. 
As soon as I have some, I will send you the login data for the server so 
that you can have a look for yourself.


Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer




Re: [GENERAL] (Never?) Kill Postmaster?

2007-11-10 Thread Christian Schröder

Tom Lane wrote:

I don't think you ever mentioned exactly what platform you're running
on; it seems to be some 64-bit Linux variant but you didn't say which.
  


The machine has two dual-core Xeon 5130 cpus. The os is openSUSE 10.2 
(x86-64). The output of uname -a is:


Linux db2 2.6.18.8-0.7-default #1 SMP Tue Oct 2 17:21:08 UTC 2007 x86_64 
x86_64 x86_64 GNU/Linux


Did I mention the PostgreSQL version is 8.2.5?


On my machine the corresponding lines point to /usr/lib64:
  


You are right. Of course the directory /usr/lib64 exists and contains 
lots of shared libraries and other stuff, but indeed no directory perl5.



Maybe this is just a cross-distribution difference in file layouts,
but I'm suddenly wondering if there's a 32-vs-64-bit issue here.
Exactly which perl packages have you got installed?
  


rpm says: perl-5.8.8-32. The somewhat lengthy output of perl -V is:

Summary of my perl5 (revision 5 version 8 subversion 8) configuration:
 Platform:
   osname=linux, osvers=2.6.18, archname=x86_64-linux-thread-multi
   uname='linux eisler 2.6.18 #1 smp tue nov 21 12:59:21 utc 2006 
x86_64 x86_64 x86_64 gnulinux '
   config_args='-ds -e -Dprefix=/usr -Dvendorprefix=/usr 
-Dinstallusrbinperl -Dusethreads -Di_db -Di_dbm -Di_ndbm -Di_gdbm 
-Duseshrplib=true -Doptimize=-O2 -fmessage-length=0 -Wall 
-D_FORTIFY_SOURCE=2 -g -Wall -pipe'

   hint=recommended, useposix=true, d_sigaction=define
   usethreads=define use5005threads=undef useithreads=define 
usemultiplicity=define

   useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
   use64bitint=define use64bitall=define uselongdouble=undef
   usemymalloc=n, bincompat5005=undef
 Compiler:
   cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS 
-DDEBUGGING -fno-strict-aliasing -pipe -Wdeclaration-after-statement 
-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
   optimize='-O2 -fmessage-length=0 -Wall -D_FORTIFY_SOURCE=2 -g -Wall 
-pipe',
   cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBUGGING 
-fno-strict-aliasing -pipe -Wdeclaration-after-statement'
   ccversion='', gccversion='4.1.2 20061115 (prerelease) (SUSE Linux)', 
gccosandvers=''

   intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
   d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
   ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', 
lseeksize=8

   alignbytes=8, prototype=define
 Linker and Libraries:
   ld='cc', ldflags =' -L/usr/local/lib64'
   libpth=/lib64 /usr/lib64 /usr/local/lib64
   libs=-lm -ldl -lcrypt -lpthread
   perllibs=-lm -ldl -lcrypt -lpthread
   libc=/lib64/libc-2.5.so, so=so, useshrplib=true, libperl=libperl.so
   gnulibc_version='2.5'
 Dynamic Linking:
   dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E 
-Wl,-rpath,/usr/lib/perl5/5.8.8/x86_64-linux-thread-multi/CORE'

   cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib64'


Characteristics of this binary (from libperl):
 Compile-time options: DEBUGGING MULTIPLICITY PERL_IMPLICIT_CONTEXT
   PERL_MALLOC_WRAP THREADS_HAVE_PIDS USE_64_BIT_ALL
   USE_64_BIT_INT USE_ITHREADS USE_LARGE_FILES
   USE_PERLIO USE_REENTRANT_API
 Built under linux
 Compiled at Nov 25 2006 11:02:03
 @INC:
   /usr/lib/perl5/5.8.8/x86_64-linux-thread-multi
   /usr/lib/perl5/5.8.8
   /usr/lib/perl5/site_perl/5.8.8/x86_64-linux-thread-multi
   /usr/lib/perl5/site_perl/5.8.8
   /usr/lib/perl5/site_perl
   /usr/lib/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi
   /usr/lib/perl5/vendor_perl/5.8.8
   /usr/lib/perl5/vendor_perl
   .

One more information about the current situation: Besides the processes 
that are stuck in a statement (meanwhile I have five and the most recent 
belongs to yet another user) I also see many idle connections. I'm 
pretty sure that the users are out of office at the moment and their 
workstations are shut down, so the connections should not exist at all. 
Since I have now a total of 69 connections and already touched the 
connection limit yesterday I will now restart the server. Thus, I won't 
be able to provide any more detailed information until the problem 
arises again.


Of course, I can still give you general information about the system.

Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer





Re: [GENERAL] (Never?) Kill Postmaster?

2007-11-10 Thread Christian Schröder

Tom Lane wrote:

OK.  For the moment I confess bafflement.  You had offered access
to your system to probe more carefully --- once you've built up
two or three stuck processes again, I would like to take a look.
  


Although I do not yet have any processes that are stuck inside a 
statement, there are some that are idle, but do not respond to SIGINT or 
even SIGTERM. Is this sufficient? Then I could change the root password 
and send all the login data to you.


Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer




Re: [GENERAL] (Never?) Kill Postmaster?

2007-11-09 Thread Christian Schröder

Tom Lane wrote:

Christian Schröder [EMAIL PROTECTED] writes:
  
I don't want to kill -9 the processes because the last time 
I did this the database was in recovery mode for a substantial amount of 
time.



A useful tip on that: if you perform a manual CHECKPOINT just before
issuing the kills, recovery time should be minimal.
  
Thanks for the tip. For the moment, I will leave the processes in their 
current state to be able to perform further tests in case you need more 
data. But when I start feeling uncomfortable with the situation I will 
follow your advice.
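
For the archives, the sequence would roughly be:

    -- write out all dirty buffers first, so crash recovery after the
    -- kill has almost no WAL to replay:
    CHECKPOINT;
    -- then, from a shell: kill -9 <pid of the stuck backend>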


Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer




Re: [GENERAL] (Never?) Kill Postmaster?

2007-11-09 Thread Christian Schröder

Tom Lane wrote:

control has already returned from the kernel.  What I think is that the
perl stuff your session has done has included some action that changed
the condition of the backend process ... exactly what, I have no idea.
Can you show us the plperl functions that were used in these sessions?
  
Considering the skill level of the users I don't think that they 
explicitly used any perl function. So the only remaining perl stuff are 
the triggers that I mentioned earlier. There are two triggers in the 
database to which the users were connected (see 
http://archives.postgresql.org/pgsql-general/2007-11/msg00245.php). 
There are three more triggers in other databases which might have an impact:


=
   if ($_TD->{argc} != 1) {
       elog(ERROR, 'do_fast_export_log() needs the table\'s 
primary key as sole argument, multiple keys are not yet supported.');
   }

   my $table = $_TD->{table_name};
   my $id = $_TD->{old}{$_TD->{args}[0]};

   while (my ($col, $value) = each %{$_TD->{old}}) {
       my $oldVal = $_TD->{old}{$col};
       my $newVal = $_TD->{new}{$col};

       $oldVal =~ s/'/''/g;
       $oldVal =~ s/\t/t/g;
       $newVal =~ s/'/''/g;
       $newVal =~ s/\t/t/g;

       if ($oldVal ne $newVal) {
           my $data = "$id\t$table\t$col\t$oldVal\t$newVal";
           my $query = spi_exec_query("insert into 
fast_export(data) values(E'$data')");
       }
   }

   spi_exec_query("notify fast_export");
   return;
=
   elog(ERROR, 'store_session_user needs the field name as argument.')
       if $_TD->{argc} < 1;

   my ($field) = @{$_TD->{args}};

   elog(ERROR, "Field '$field' does not exist in target table.\n")
       unless exists $_TD->{new}{$field};

   my $query = spi_exec_query('SELECT session_user', 1);
   $_TD->{new}{$field} = $query->{rows}[0]{session_user};
   return 'MODIFY';
=
   if ($_TD->{argc} > 0) {
       foreach (@{$_TD->{args}}) {
           $_TD->{new}{$_} = $_TD->{old}{$_}
               if exists $_TD->{old}{$_};
       }
       return 'MODIFY';
   }
   else {
       return;
   }
=

Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer




Re: [GENERAL] (Never?) Kill Postmaster?

2007-11-08 Thread Christian Schröder

Hi all,
any news about this issue? Anything else that I can do to help you? 
Meanwhile there are 4 connections in the same state. (I did not do the 
whole investigation on all 4, but since they all do not respond on a 
SIGINT I assume that they all have the same problem.)
It may also be interesting that the 4 processes belong to the same two 
users that already caused this problem earlier. Maybe it really has 
something to do with their unstable internet connection? (I mentioned 
this in an earlier mail.)
I have also noticed that one of these two users has many open 
connections which are all idle. I guess that those connections are in 
fact dead, but the database did not close them, for whatever reason. The 
pg_stat_activity entries for the corresponding backend processes are as 
follows:


 procpid | usename | current_query | waiting |          query_start          |         backend_start
---------+---------+---------------+---------+-------------------------------+-------------------------------
   26033 | dpyrek  | <IDLE>        | f       | 2007-11-08 10:21:01.555232+01 | 2007-11-08 09:55:01.59932+01
   18331 | dpyrek  | <IDLE>        | f       | 2007-11-07 11:34:24.968852+01 | 2007-11-07 11:08:29.043762+01
   18940 | dpyrek  | <IDLE>        | f       | 2007-11-07 14:29:52.987176+01 | 2007-11-07 13:14:48.609031+01
   25868 | dpyrek  | <IDLE>        | f       | 2007-11-08 09:47:46.938991+01 | 2007-11-08 09:13:34.101351+01
    6719 | dpyrek  | <IDLE>        | f       | 2007-11-06 12:06:14.875588+01 | 2007-11-06 11:10:00.566644+01
   17987 | dpyrek  | <IDLE>        | f       | 2007-11-07 10:31:50.517275+01 | 2007-11-07 10:11:07.310338+01
   31808 | dpyrek  | <IDLE>        | f       | 2007-11-08 22:55:03.931727+01 | 2007-11-08 22:55:03.766638+01
   25484 | dpyrek  | <IDLE>        | f       | 2007-11-08 08:32:57.265377+01 | 2007-11-08 07:44:30.845967+01
    5972 | dpyrek  | <IDLE>        | f       | 2007-11-06 08:51:54.57437+01  | 2007-11-06 08:14:03.560602+01
    6241 | dpyrek  | <IDLE>        | f       | 2007-11-06 09:59:02.018452+01 | 2007-11-06 09:20:49.092246+01
    6136 | dpyrek  | <IDLE>        | f       | 2007-11-06 09:14:40.729837+01 | 2007-11-06 08:57:29.55187+01
   12645 | dpyrek  | <IDLE>        | f       | 2007-11-02 10:08:24.856929+01 | 2007-11-02 09:35:37.640976+01
   25254 | dpyrek  | <IDLE>        | f       | 2007-11-08 07:29:04.547081+01 | 2007-11-08 06:33:47.707759+01
   20275 | dpyrek  | <IDLE>        | f       | 2007-11-03 09:14:12.73829+01  | 2007-11-03 08:57:05.555972+01
   20216 | dpyrek  | <IDLE>        | f       | 2007-11-03 08:46:40.555354+01 | 2007-11-03 08:40:31.756993+01
   12435 | dpyrek  | <IDLE>        | f       | 2007-11-02 09:28:53.361365+01 | 2007-11-02 08:48:11.589485+01
   19633 | dpyrek  | <IDLE>        | f       | 2007-11-03 08:34:16.263487+01 | 2007-11-03 05:46:16.811069+01
   12156 | dpyrek  | <IDLE>        | f       | 2007-11-02 08:10:11.558427+01 | 2007-11-02 07:49:03.442489+01
    4899 | dpyrek  | <IDLE>        | f       | 2007-11-01 12:42:30.880391+01 | 2007-11-01 10:56:18.513398+01
   11988 | dpyrek  | <IDLE>        | f       | 2007-11-02 07:38:10.315758+01 | 2007-11-02 07:02:52.438251+01
    4490 | dpyrek  | <IDLE>        | f       | 2007-11-01 09:51:42.216745+01 | 2007-11-01 09:34:18.63771+01


An strace of one of these connections yields the following result:

db2:/home/pgsql/data # strace -p 4899
Process 4899 attached - interrupt to quit
futex(0x994000, FUTEX_WAIT, 2, NULL

This looks identical (even with the same uaddr parameter) for the other 
processes.


In the log file I find many lines like this:

2007-11-08 22:56:19 CET - dpyrek LOG:  could not receive data from 
client: Die Wartezeit für die Verbindung ist abgelaufen

2007-11-08 22:56:19 CET - dpyrek LOG:  unexpected EOF on client connection

(The German text means "the connection timed out".)

I'm not sure that these lines correspond to the dead connections, but at 
least it's the same user.


Does this additional information help you in any way? I'm a little bit 
afraid that eventually the maximum number of connections will be 
exceeded. I don't want to kill -9 the processes because the last time 
I did this the database was in recovery mode for a substantial amount of 
time.


Any help is still highly appreciated!

Regards,
   Christian

P.S.: If nothing else helps I could also grant one of you guys root 
access to the database machine.


--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer





Re: [GENERAL] (Never?) Kill Postmaster?

2007-11-06 Thread Christian Schröder

Tom Lane wrote:

What we can be reasonably certain of is that that backend wasn't
reaching any CHECK_FOR_INTERRUPTS() macros.  Whether it was hung up
waiting for something, or caught in a tight loop somewhere, is
impossible to say without more data than we have.  AFAIR the OP didn't
even mention whether the backend appeared to be consuming CPU cycles
(which'd be a pretty fair tip about which of those to believe, but still
not enough to guess *where* the problem is). A gdb backtrace would tell
us more.
  


It happened again! I'm not sure if I should be happy because we can now 
maybe find the cause of the problem, or worried because it's our 
production database ... At least the process doesn't seem to consume 
CPU (it doesn't show up in top), so I won't kill it this time, but 
instead try to get all the information that you guys need.

What I already did was an strace with the following result:

db2:/home/pgsql/data # strace -p 7129
Process 7129 attached - interrupt to quit
futex(0x994000, FUTEX_WAIT, 2, NULL)= -1 EINTR (Interrupted system call)
--- SIGINT (Interrupt) @ 0 (0) ---
rt_sigreturn(0x2)   = -1 EINTR (Interrupted system call)
futex(0x994000, FUTEX_WAIT, 2, NULL

That interrupt will have been the script that tries to remove 
long-running queries. The same lines seem to repeat over and over again.


Then I attached a gdb to the process and printed out a backtrace:

db2:/home/pgsql/data # gdb --pid=7129
GNU gdb 6.5
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain 
conditions.

Type show copying to see the conditions.
There is absolutely no warranty for GDB.  Type show warranty for details.
This GDB was configured as x86_64-suse-linux.
Attaching to process 7129
Reading symbols from /usr/local/pgsql_8.2.5/bin/postgres...done.
Using host libthread_db library /lib64/libthread_db.so.1.
Reading symbols from /lib64/libcrypt.so.1...done.
Loaded symbols for /lib64/libcrypt.so.1
Reading symbols from /lib64/libdl.so.2...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/libm.so.6...done.
Loaded symbols for /lib64/libm.so.6
Reading symbols from /lib64/libc.so.6...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /usr/local/pgsql_8.2.5/lib/plpgsql.so...done.
Loaded symbols for /usr/local/pgsql_8.2.5/lib/plpgsql.so
Reading symbols from /usr/local/pgsql_8.2.5/lib/plperl.so...done.
Loaded symbols for /usr/local/pgsql_8.2.5/lib/plperl.so
Reading symbols from 
/usr/lib/perl5/5.8.8/x86_64-linux-thread-multi/CORE/libperl.so...done.
Loaded symbols for 
/usr/lib/perl5/5.8.8/x86_64-linux-thread-multi/CORE/libperl.so

Reading symbols from /lib64/libpthread.so.0...done.
[Thread debugging using libthread_db enabled]
[New Thread 47248855881456 (LWP 7129)]
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from 
/usr/lib/perl5/5.8.8/x86_64-linux-thread-multi/auto/Opcode/Opcode.so...done.
Loaded symbols for 
/usr/lib/perl5/5.8.8/x86_64-linux-thread-multi/auto/Opcode/Opcode.so

0x2af904809a68 in __lll_mutex_lock_wait () from /lib64/libpthread.so.0
(gdb) bt
#0  0x2af904809a68 in __lll_mutex_lock_wait () from 
/lib64/libpthread.so.0
#1  0x2af904806e88 in pthread_rwlock_rdlock () from 
/lib64/libpthread.so.0

#2  0x2af8fb13de23 in _nl_find_msg () from /lib64/libc.so.6
#3  0x2af8fb13ec83 in __dcigettext () from /lib64/libc.so.6
#4  0x2af8fb186f0b in strerror_r () from /lib64/libc.so.6
#5  0x2af8fb186d33 in strerror () from /lib64/libc.so.6
#6  0x005f4daa in expand_fmt_string ()
#7  0x005f6d14 in errmsg ()
#8  0x005182cc in internal_flush ()
#9  0x005183b6 in internal_putbytes ()
#10 0x0051841c in pq_putmessage ()
#11 0x005199c4 in pq_endmessage ()
#12 0x00440c6a in printtup ()
#13 0x004fc1b8 in ExecutorRun ()
#14 0x00580451 in PortalRunSelect ()
#15 0x00581446 in PortalRun ()
#16 0x0057d625 in exec_simple_query ()
#17 0x0057ea72 in PostgresMain ()
#18 0x00558218 in ServerLoop ()
#19 0x00558db8 in PostmasterMain ()
#20 0x0051a213 in main ()

Do you need anything else? Can you still tell what's happening?

Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer





[GENERAL] How does the query planner make its plan?

2007-11-06 Thread Christian Schröder

Hi list,
once again I do not understand how the query planner works and why it 
apparently does not find the best result.
I have a table with about 125 million rows. There is a char(5) column 
with a (non-unique) index. When I try to find the distinct values in 
this column using the following sql statement:


select distinct exchange from foo

the query planner chooses not to use the index, but performs a 
sequential scan. When I disfavour the use of sequential scans (set 
enable_seqscan = off) the performance is more than 6 times better. Why 
does the query planner's plan go wrong? The table has been vacuum 
analyzed just before I ran the queries.


Here is the plan when I let the query planner alone:

                                  QUERY PLAN
--------------------------------------------------------------------------------
 Unique  (cost=23057876.40..23683350.48 rows=4 width=9)
   ->  Sort  (cost=23057876.40..23370613.44 rows=125094816 width=9)
         Sort Key: exchange
         ->  Seq Scan on quotes  (cost=0.00..3301683.16 rows=125094816 width=9)
(4 rows)

This is what really happens:

                                  QUERY PLAN
--------------------------------------------------------------------------------
 Unique  (cost=23057876.40..23683350.48 rows=4 width=9) (actual time=1577159.744..1968911.024 rows=4 loops=1)
   ->  Sort  (cost=23057876.40..23370613.44 rows=125094816 width=9) (actual time=1577159.742..1927400.118 rows=125094818 loops=1)
         Sort Key: exchange
         ->  Seq Scan on quotes  (cost=0.00..3301683.16 rows=125094816 width=9) (actual time=0.022..169744.162 rows=125094818 loops=1)
 Total runtime: 1969844.753 ms
(5 rows)

With enable_seqscan = off I get this plan:

                                  QUERY PLAN
--------------------------------------------------------------------------------
 Unique  (cost=0.00..89811549.81 rows=4 width=9)
   ->  Index Scan using quotes_exchange_key on quotes  (cost=0.00..89498812.77 rows=125094816 width=9)
(2 rows)

And again with execution times:

 
                                  QUERY PLAN
--------------------------------------------------------------------------------
 Unique  (cost=0.00..89811549.81 rows=4 width=9) (actual time=0.079..313068.922 rows=4 loops=1)
   ->  Index Scan using quotes_exchange_key on quotes  (cost=0.00..89498812.77 rows=125094816 width=9) (actual time=0.078..273787.493 rows=125094818 loops=1)
 Total runtime: 313068.967 ms
(3 rows)
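
A workaround I am considering, in case the planner cannot do better 
here: emulate a "loose index scan" by hand, so that only one index probe 
per distinct value is needed (an untested sketch):

    CREATE OR REPLACE FUNCTION distinct_exchanges() RETURNS SETOF char(5) AS $$
    DECLARE
        cur char(5);
    BEGIN
        SELECT min(exchange) INTO cur FROM quotes;   -- one index probe
        WHILE cur IS NOT NULL LOOP
            RETURN NEXT cur;
            -- leapfrog to the next distinct value with another index probe:
            SELECT min(exchange) INTO cur FROM quotes WHERE exchange > cur;
        END LOOP;
        RETURN;
    END;
    $$ LANGUAGE plpgsql;

    -- SELECT * FROM distinct_exchanges();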

I understand that from looking at the estimations (89811549.81 with 
index scan vs. 23683350.48 with sequential scan) the query planner had 
to choose the sequential scan. So maybe I have to tune the planner cost 
constants? Indeed I did some changes to these values, but in my opinion 
this should make index scans preferable:


#seq_page_cost = 1.0# measured on an arbitrary scale
#random_page_cost = 4.0 # same scale as above
random_page_cost = 1.0
#cpu_tuple_cost = 0.01  # same scale as above
#cpu_index_tuple_cost = 0.005   # same scale as above
cpu_index_tuple_cost = 0.001
#cpu_operator_cost = 0.0025 # same scale as above
#effective_cache_size = 128MB
effective_cache_size = 4GB

The machine is a dedicated database server with two dual-core Xeon 
processors and 8 GB memory.


Thanks for your help,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer





Re: [GENERAL] (Never?) Kill Postmaster?

2007-11-06 Thread Christian Schröder

Tom Lane wrote:

* The only place internal_flush would call errmsg is here:

ereport(COMMERROR,
(errcode_for_socket_access(),
 errmsg(could not send data to client: %m)));

So why is it unable to send data to the client?
  
The user accesses the database from his home office. He is connected to 
the internet using a dial-in connection which seems to be disconnected 
from time to time. The problem seems to be specific for this user, so it 
may have something to do with his type of connection.



* How the heck would strerror() be blocking on a thread mutex, when
the backend is single-threaded?

We recently discovered that it was possible for pltcl to cause the
backend to become multithreaded:
http://archives.postgresql.org/pgsql-patches/2007-09/msg00194.php
I see from your dump that this session has been using plperl, so I
wonder whether plperl has the same sort of issue.  Can you determine
exactly what's been done with plperl in this session?
  


Not exactly. There are several triggers whose functions are implemented 
in perl:


   elog(ERROR, 'set_serial needs at least 2 arguments.')
       if $_TD->{argc} < 2;

   my ($colname, $seqname) = @{$_TD->{args}};
   my $result = spi_exec_query("SELECT nextval('$seqname')");
   $_TD->{new}{$colname} = $result->{rows}[0]{nextval};
   return 'MODIFY';

and

   elog(ERROR, 'log_changes needs at least 2 arguments.')
       if $_TD->{argc} < 2;

   my @args = @{$_TD->{args}};

   shift(@args) =~ /([^.]*(?=\.))?\.?(.*)/;
   my $log_table = defined($1) ? qq($1.$2) : qq($2);

   if ($_TD->{event} eq 'UPDATE') {
       my $id = join(',', @{$_TD->[EMAIL PROTECTED]);
       $id =~ s/'/\\'/g;
       while (my ($key, $value) = each %{$_TD->{old}}) {
           if ($value ne $_TD->{new}{$key}) {
               my $newvalue = $_TD->{new}{$key};
               $value =~ s/'/\\'/g;
               $newvalue =~ s/'/\\'/g;
               my $query = sprintf(qq(INSERT INTO $log_table VALUES 
(current_timestamp, session_user, '%s', '%s', '%s', '%s', '%s')),
                   $id,
                   $_TD->{relname},
                   $key,
                   $value,
                   $newvalue);
               spi_exec_query($query);
           }
       }
   }
   else {
       my $id = join(',', @{$_TD->{$_TD->{event} eq 'INSERT' ? 'new' : 
'old'[EMAIL PROTECTED]);
       $id =~ s/'/\\'/g;
       my $query = sprintf(qq(INSERT INTO $log_table VALUES 
(current_timestamp, session_user, '%s', '%s', '(%s)', null, null)),
           $id,
           $_TD->{relname},
           $_TD->{event});
       spi_exec_query($query);
   }
   return;


Also, can you confirm that there is actually more than one thread active
in this process?  On Linux ps -Lw would show threads.
  


Since ps -Lw only showed me my own processes, I did ps -AT -o 
pid,ppid,lwp,nlwp,command instead. This is the relevant line:


 PID  PPID   LWP NLWP COMMAND
...
 7129  3934  7129    1 postgres: dpyrek uspi 88.70.241.205(10612) SELECT
...

I'm not sure if this data is correct because NLWP (the number of threads 
in the process) is 1 for all processes.


Is it necessary to recompile the server with debugging information 
enabled? As I already mentioned, this is a production database, so I 
would have to do it at night, which would be somewhat uncomfortable for 
me ... not to mention the bad feeling when changing a production 
database server ...


Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer





Re: [GENERAL] (Never?) Kill Postmaster?

2007-11-06 Thread Christian Schröder

Alvaro Herrera wrote:

Please try thread apply all bt full on gdb.
  


The first lines where the symbols are loaded are of course identical. 
The output of the command is in my opinion not very helpful:


(gdb) thread apply all bt full

Thread 1 (Thread 47248855881456 (LWP 7129)):
#0  0x2af904809a68 in __lll_mutex_lock_wait () from 
/lib64/libpthread.so.0

No symbol table info available.
#1  0x2af904806e88 in pthread_rwlock_rdlock () from 
/lib64/libpthread.so.0

No symbol table info available.
#2  0x2af8fb13de23 in _nl_find_msg () from /lib64/libc.so.6
No symbol table info available.
#3  0x2af8fb13ec83 in __dcigettext () from /lib64/libc.so.6
No symbol table info available.
#4  0x2af8fb186f0b in strerror_r () from /lib64/libc.so.6
No symbol table info available.
#5  0x2af8fb186d33 in strerror () from /lib64/libc.so.6
No symbol table info available.
#6  0x005f4daa in expand_fmt_string ()
No symbol table info available.
#7  0x005f6d14 in errmsg ()
No symbol table info available.
#8  0x005182cc in internal_flush ()
No symbol table info available.
#9  0x005183b6 in internal_putbytes ()
No symbol table info available.
#10 0x0051841c in pq_putmessage ()
No symbol table info available.
#11 0x005199c4 in pq_endmessage ()
No symbol table info available.
#12 0x00440c6a in printtup ()
No symbol table info available.
#13 0x004fc1b8 in ExecutorRun ()
No symbol table info available.
#14 0x00580451 in PortalRunSelect ()
No symbol table info available.
#15 0x00581446 in PortalRun ()
No symbol table info available.
#16 0x0057d625 in exec_simple_query ()
No symbol table info available.
#17 0x0057ea72 in PostgresMain ()
No symbol table info available.
#18 0x00558218 in ServerLoop ()
No symbol table info available.
#19 0x00558db8 in PostmasterMain ()
No symbol table info available.
#20 0x0051a213 in main ()
No symbol table info available.
#0  0x2af904809a68 in __lll_mutex_lock_wait () from 
/lib64/libpthread.so.0


What does this tell you?

Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer




Re: [GENERAL] (Never?) Kill Postmaster?

2007-10-31 Thread Christian Schröder

Tom Lane wrote:

Michael Harris [EMAIL PROTECTED] writes:
  

The tip is ''kill -9' the postmaster', which has two important
differences to the scenario I just described:
1) kill -9 means the OS kills the process without allowing it to clean
up after itself
2) The postmaster is the master postgresql backend process. If you want
to kill a single query you would not want to kill that.



Right: the tip is to not kill -9 the parent process; it's not saying
anything about child server processes.

If you've got a child process that's unresponsive to SIGINT then you
can send it a SIGKILL instead; the downside is that this will force a
restart of the other children too, that is you're interrupting all
database sessions not only the one.  But Postgres will recover
automatically and I don't think I've ever heard of anyone getting data
corruption as a result of such a thing.
  


I have been in exactly this situation today: One statement took several
hours to complete, so it had to be cancelled. I tried a
pg_cancel_backend and a "kill -2" (which means SIGINT on our Linux
box), but nothing happened. Since I remembered this thread, I tried a
"kill -9" on this child process. As you described, all other connections
were reset, too, and this was the message in the server log:

2007-10-31 22:48:28 CET - chschroe WARNING:  terminating connection
because of crash of another server process
2007-10-31 22:48:28 CET - chschroe DETAIL:  The postmaster has
commanded this server process to roll back the current transaction and
exit, because another server process exited abnormally and possibly
corrupted shared memory.

But then, when I tried to reconnect to the database, I received the
following message:

2007-10-31 22:50:01 CET - chschroe FATAL:  the database system is in
recovery mode

Ok, you wrote "Postgres will recover automatically", but could this take 
several minutes? Is that what "recovery mode" means? When nothing seemed 
to happen for several minutes, I performed a (fortunately clean) restart 
of the whole server. The log messages for the server restart looked 
normal to me:


2007-10-31 22:53:15 CET -  LOG:  received smart shutdown request
2007-10-31 22:53:21 CET -  LOG:  all server processes terminated;
reinitializing
2007-10-31 22:53:58 CET -  LOG:  database system was interrupted at
2007-10-31 22:46:46 CET
2007-10-31 22:53:58 CET -  LOG:  checkpoint record is at 153/FE9FAF20
2007-10-31 22:53:58 CET -  LOG:  redo record is at 153/FE9FAF20; undo
record is at 0/0; shutdown FALSE
2007-10-31 22:53:58 CET -  LOG:  next transaction ID: 0/128715865;
next OID: 58311787
2007-10-31 22:53:58 CET -  LOG:  next MultiXactId: 4704; next
MultiXactOffset: 9414
2007-10-31 22:53:58 CET -  LOG:  database system was not properly shut
down; automatic recovery in progress
2007-10-31 22:53:58 CET -  LOG:  redo starts at 153/FE9FAF70
2007-10-31 22:53:58 CET -  LOG:  record with zero length at 153/FEA05E70
2007-10-31 22:53:58 CET -  LOG:  redo done at 153/FEA05E40
2007-10-31 22:53:58 CET -  LOG:  database system is ready

I hope that no data got corrupted. Is there any way to check this?

What is the conclusion of this experience? Is it contrary to the above
statements dangerous to kill (-9) a subprocess?

Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer






Re: [GENERAL] (Never?) Kill Postmaster?

2007-10-31 Thread Christian Schröder

Tom Lane wrote:
Ok, you wrote "Postgres will recover automatically", but could this take 
several minutes?



Yeah, potentially.  I don't suppose you have any idea how long it'd been
since your last checkpoint, but what do you have checkpoint_timeout and
checkpoint_segments set to?
  


I did not change these parameters from their default values, so 
checkpoint_timeout is 5 min and checkpoint_segments is 8.



What I'd like to know about is why the child process was unresponsive to
SIGINT in the first place.  There's little we can do about long-running
plpython functions, for instance, but if it was looping in Postgres
proper then we should do something about that.  Can you reproduce this
problem easily?
  


Unfortunately not. I have tried the same query and it took only about 1 
sec to complete. In fact, it's a simple seq scan with a single filter 
condition. No user defined functions are involved.
Maybe it has something to do with the users connecting from their 
Windows machines to the PostgreSQL server using psqlodbc. On the other 
hand, this is not the first time that such a user connection had to be 
terminated, and we never experienced this problem before.
If I see the phenomenon again I will use strace or something similar to 
find out what the backend process is doing.


Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer





[GENERAL] current_user changes immediately after login

2007-10-31 Thread Christian Schröder

Hi list,
I have a strange problem: When I connect to one of my databases, the 
current_user immediately changes without any interaction from my side. 
This is what I do:


[EMAIL PROTECTED]:~> psql -h db2 testdb
Welcome to psql 8.2.5, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
  \h for help with SQL commands
  \? for help with psql commands
  \g or terminate with semicolon to execute query
  \q to quit

testdb=> select session_user;
 session_user
--------------
 chschroe
(1 row)

testdb=> select current_user;
 current_user
--------------
 itstaff
(1 row)

How can this happen? The consequence of this is that I cannot access the 
tables because the role itstaff has no privileges. "reset role" does 
not have any effect. After "set role none" the current_user is equal to 
the session_user and the tables are accessible again.


This problem does only occur with this database. Do you have any 
explanation?


Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer





Re: [GENERAL] current_user changes immediately after login

2007-10-31 Thread Christian Schröder

Tom Lane wrote:

Christian Schröder [EMAIL PROTECTED] writes:
  
I have a strange problem: When I connect to one of my databases, the 
current_user immediately changes without any interaction from my side. 



That's bizarre.  Do you have anything in ~/.psqlrc?  I'm also wondering
(though too lazy to experiment) whether this effect could be produced
via "ALTER DATABASE foo SET role = bar" ... checking
pg_database.datconfig would be enough to eliminate that idea ...
  


I had found the problem just before I read your mail and you are 
perfectly right with your guess: These are the corresponding entries 
from the pg_database system catalog (without the datacl column):


 datname |   datconfig
---------+----------------
 testdb  | {role=itstaff}

So it seems that one of the users (one with the appropriate permissions 
for this database) has issued this alter database command (probably 
without really understanding what he did *g*).

After "alter database testdb reset role" everything is ok now.
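
For anyone hitting the same thing, the whole diagnosis and fix in two 
statements (sketch):

    -- settings stored per database, applied at connect time:
    SELECT datname, datconfig FROM pg_database WHERE datconfig IS NOT NULL;
    -- clear the stored role switch:
    ALTER DATABASE testdb RESET role;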

Thanks for your help!

Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer









Re: [GENERAL] (Never?) Kill Postmaster?

2007-10-24 Thread Christian Schröder

Ow Mun Heng wrote:

look for the query's procpid and then issue a select
pg_cancel_backend('the_id')
  


Does it do any harm if I kill (either with signal 9 or signal 15) the 
single backend process (and not the postmaster)?


Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer




Re: [GENERAL] Performance Issues

2007-09-21 Thread Christian Schröder

Alvaro Herrera wrote:

Christian Schröder wrote:

  
I think it is my job as db admin to make the database work the way my users 
need it, and not the user's job to find a solution that fits the database's 
needs ...


Is there really nothing that I can do?



You can improve the selectivity estimator function.  One idea is that if
you are storing something that's not really a general character string,
develop a specific datatype, with a more precise selectivity estimator.
If you are you up to coding in C, that is.
  


Hm, that sounds interesting! I will definitely give it a try.
Will that also solve the problem of combining more than one of these 
conditions? As far as I can see, the main issue at the moment is that we 
often have "... where test like '11%' and test not like '113%'" in our 
queries. Even if the selectivity estimation of a single condition is 
improved, it will still be wrong to multiply the selectivities.
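
To make the problem concrete, with made-up numbers (the table name is a 
placeholder):

    -- Hypothetical estimates for the two conditions taken alone:
    --   P(test LIKE '11%')      = 0.5
    --   P(test NOT LIKE '113%') = 0.9   (i.e. P(test LIKE '113%') = 0.1)
    -- Assuming independence, the planner multiplies: 0.5 * 0.9 = 0.45.
    -- But every '113%' row is also an '11%' row, so the true fraction
    -- is 0.5 - 0.1 = 0.4; with more overlapping conditions the error
    -- compounds.
    EXPLAIN SELECT * FROM mytable
     WHERE test LIKE '11%' AND test NOT LIKE '113%';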


I think I will have a look at the src/backend/optimizer/util/plancat.c, 
src/backend/optimizer/path/clausesel.c and 
src/backend/utils/adt/selfuncs.c files after my holiday.


Kind regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer





Re: [GENERAL] Performance Issues

2007-09-20 Thread Christian Schröder

John D. Burger wrote:

Christian Schröder wrote:

Or would it be possible to tweak how the planner determines the 
selectivity? I have read in the docs (chapter 54.1) that in case of 
more than one condition in the where clause, independency is assumed. 
In my case (... where test like '11%' and test not like '113%') 
this is clearly not the case, so it might be an interesting point to 
address.


I think the planner does think about the interactions of inequalities, 
so if you can express your query with less-than and friends, or even 
with BETWEEN, you might get a better plan.  I don't know the details 
of your setup, but you can do things like this with any ordered type:


   where test between '11' and '113'
or test = '114'

I know this does not match the exact semantics of your query, but 
hopefully you get the idea.


There are two drawbacks of this solution:

  1. It is not always possible to rewrite the like or substring
 queries with standard relational operators.
  2. It is annoying for my users that they have to tweak the query
     until they find a solution that takes 5 seconds to finish instead
     of 4 hours.

I think it is my job as db admin to make the database work the way my 
users need it, and not the user's job to find a solution that fits the 
database's needs ...


Is there really nothing that I can do?

Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org/


[GENERAL] Performance Issues (was: like vs substring again)

2007-09-18 Thread Christian Schröder

Hi list,
I am still fighting with the really slow database queries (see 
http://www.nabble.com/%22like%22-vs-%22substring%22-again-t4447906.html), 
and I still believe that the cause of the problem is that the query 
planner makes incorrect estimations about the selectivity of the where 
clauses.
I wondered if it is possible to make the query planner perform a 
sequential scan over a table *before* it starts planning? If I know that 
a table has only about 3000 rows, the overhead due to this sequential 
scan can be ignored. On the other hand, this would give the planner an 
exact data basis for its planning.
Or would it be possible to tweak how the planner determines the 
selectivity? I have read in the docs (chapter 54.1) that in case of more 
than one condition in the where clause, independence is assumed. In my 
case (... where test like '11%' and test not like '113%') this is 
clearly not the case, so it might be an interesting point to address.
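
To make the effect of this independence assumption visible, here is a
toy reproduction (a sketch with invented data; the real table differs):

-- About 3000 codes that all start with '11' and none with '113':
CREATE TEMP TABLE toy AS
SELECT (1100000 + i)::text AS test
FROM generate_series(1, 3000) AS s(i);
ANALYZE toy;

-- The planner estimates each condition separately and multiplies the
-- selectivities; compare its row estimate with the true 3000 rows:
EXPLAIN ANALYZE SELECT * FROM toy
WHERE test LIKE '11%' AND test NOT LIKE '113%';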

Do you have any other tips for me?

Kind regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[GENERAL] like vs substring again

2007-09-15 Thread Christian Schröder

Hi list,

last week I asked a question about a query with several joins and a 
like operator which was really slow. When I replaced like with 
substring (which was possible because the comparison was simply bla 
like '123%') the query became much faster because the query 
optimizer chose a different plan.


Gregory Stark gave me the hint (thanks, Gregory!) that the query 
optimizer made wrong assumptions about the selectivity of the like. 
When I used substring the assumptions became better and so it chose a 
better (faster) plan. I then increased the statistics target of the 
column and the query with like became as fast as when I used 
substring. So far, so good.


Now I have a similar problem: I have a query (which doesn't look too 
complicated to me) which takes about 4.5 hours on a 2 GHz Dual-Core Xeon 
machine. The query joins several tables and has two comparisons, one 
with "not like", the other with "substring(...) =". When I use "like" 
and "not like" together, or "substring(...) =" and "substring(...) <>" 
together, the query takes about 5 seconds. The plan is identical in both 
cases and different to the plan when I mix the comparisons. The most 
obvious difference is the number of rows the query optimizer expects to 
get from the table which is filtered: It expects 1 if I mix the 
comparison operators and 84 if I consistently use like or substring. 
The real number of selected rows is 1667 (from a total of 2884 rows), so 
both estimations are rather wrong. Note that this is exactly the same 
column for which I increased the statistics target to 500 after last 
week's discussion ...


I then set up a test table with the problematic column and filled it 
with the same data. The test table looks as follows:


  Table pg_temp_3.temp
Column | Type  | Modifiers
+---+---
test   | character(10) | not null

I set the statistics target to 1000 for this column and ran the 
following queries:


explain analyze select * from temp where test like '11%' and test not like 
'113%';
 QUERY PLAN
--
Seq Scan on temp  (cost=0.00..62.26 rows=39 width=14) (actual 
time=0.012..1.229 rows=1678 loops=1)
  Filter: ((test ~~ '11%'::text) AND (test !~~ '113%'::text))
Total runtime: 1.655 ms
(3 rows)

explain analyze select * from temp where substring(test from 1 for 2) = '11' and 
substring(test from 1 for 3) <> '113';
  QUERY PLAN
-
Seq Scan on temp  (cost=0.00..91.10 rows=14 width=14) (actual 
time=0.020..3.282 rows=1678 loops=1)
  Filter: ((substring((test)::text, 1, 2) = '11'::text) AND 
(substring((test)::text, 1, 3) <> '113'::text))
Total runtime: 3.719 ms
(3 rows)

explain analyze select * from temp where substring(test from 1 for 2) = '11' 
and test not like '113%';
QUERY PLAN
-
Seq Scan on temp  (cost=0.00..76.68 rows=1 width=14) (actual 
time=0.018..2.469 rows=1678 loops=1)
  Filter: ((substring((test)::text, 1, 2) = '11'::text) AND (test !~~ 
'113%'::text))
Total runtime: 2.914 ms
(3 rows)

As far as I understand, all queries are semantically identical and have 
the same result set. However, the query optimizer makes very different 
estimations about the number of rows the queries would return. All the 
estimations are far from reality, and at least the last one leads to 
fatal results when this where clause is part of a more complex query.


So I have the following questions:

  1. Why does the query optimizer not recognize that the expressions
 are equivalent?
  2. What can I do to improve the estimation of the query optimizer? I
 tried to create an index (with opclass bpchar_pattern_ops; see the
 sketch below) which was actually used in the first query, but did not
 improve the estimation or the execution speed.

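For reference, the index attempt from question 2 would look roughly
like this (a sketch; the index name is made up):

-- Pattern operator class so that LIKE 'prefix%' can use the index
-- even in a non-C locale:
CREATE INDEX temp_test_pattern_idx ON temp (test bpchar_pattern_ops);
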
Thanks again for any help!

Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Query with like is really slow

2007-09-09 Thread Christian Schröder




Gregory Stark wrote:


  Christian Schröder [EMAIL PROTECTED] writes:

...
  
  
-  Seq Scan on table2  (cost=0.00..186.64 rows=2 width=4) (actual time=0.052..2.259 rows=42 loops=1)
  Filter: (c ~~ '1131%'::text)

  
  ...
  
  
-  Seq Scan on table2  (cost=0.00..200.89 rows=14 width=4) (actual time=0.084..3.419 rows=42 loops=1)
  Filter: ("substring"((c)::text, 1, 4) = 1131'::text)

  
  ...

  
  
My question is: Why do I have to optimize my query (use "substring" instead
of "like") instead of having the database do this for me? Or is there a
difference between both queries which I cannot see?

  
  
The only difference is that the optimizer understands LIKE better than it does
substring and so it tries harder to come up with a good estimate of how many
rows will match. In this case it seems its estimate is actually better (by
pure luck) with the substring() call. But it's still not very good.

Have these tables been analyzed recently? If so try raising the statistics
target on the "c" column. If the number of rows estimated goes up from 2 to
the 14 it's estimating with substring() then you'll get the better plan.
Hopefully it would be even better than that though.
  


Yes, all tables are "vacuum analyzed" twice per day. (I did not have
time to configure the auto-vacuum feature.)

But after increasing the statistics target of the column to 20 and
re-analyzing the table the query planner chose the better plan and the
query got sped up dramatically. You seem to have found the problem!
I have now increased the default statistics target from 10 to 20 and
the statistics target of this column to 500. We have about 190 distinct
values in this column, so with a statistics target of 500 the
statistics should be as exact as possible. (At least if I have
understood well what this parameter means.) Since we have many queries
that rely on this column, it seems a good idea to me to have the best
statistics about it. I cannot see any disadvantage of this approach, at
least if I do it only for a single column. Or am I overlooking anything?
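
For reference, the per-column change described above looks roughly like
this (a sketch using the anonymized names from this thread):

-- Raise the statistics target for one column, then re-collect:
ALTER TABLE table2 ALTER COLUMN c SET STATISTICS 500;
ANALYZE table2;

-- The instance-wide default is default_statistics_target in
-- postgresql.conf (raised here from 10 to 20).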


  
  
And last question: I do not really understand the first query plan. The actual
time for the outer nested loop is 532673.631 ms. As far as I have understood
the docs this includes the child nodes. But I cannot find the time-consuming
child node. I only see two child nodes: The inner nested loop (which took
31.692 ms) and the index scan (which took 243.643 ms). Or do I have to multiply
the 243.643 ms with 1627 (number of loops)? But even then I get 396407.161 ms,
which is still far away from the 532673.631 ms in the parent node.

  
  
The nested loop still has to do some work. Actually it's quite possible that
that extra overhead in nested loop is largely gettimeofday() calls for the
explain analyze. Does the query take less time run without explain analyze
than it does run with it?
  

You seem to be right with your assumption that most of the extra time
is spent in the gettimeofday() calls: Without "explain analyze" the
query took about 6 minutes, which is close to the 396 seconds that I get
from multiplying the number of loops (1627) by the actual time per
loop (243.643 ms).

Many thanks for your very helpful explanations!

Regards,
 Christian
-- 
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer




[GENERAL] Query with like is really slow

2007-09-07 Thread Christian Schröder

Hi list,
if you please have a look at the following query:

SELECT DISTINCT a FROM table1
INNER JOIN table2 USING (b)
INNER JOIN view1 USING (a)
WHERE c like '1131%'
AND d IS NOT NULL
AND e IS NOT NULL;

Unfortunately, I have not been able to construct a suitable test case, 
so I had to take the query from our productive system and had to replace 
all table and field names because our table layout is considered 
confidential. The tables and views are explained at the end of this mail.


This query is really slow, so I had a look at the query plan:

 
QUERY PLAN

-
Unique  (cost=14960.17..14960.18 rows=2 width=16) (actual 
time=532691.185..532692.714 rows=1625 loops=1)
  -  Sort  (cost=14960.17..14960.17 rows=2 width=16) (actual 
time=532691.183..532691.632 rows=1625 loops=1)

Sort Key: table1.a
-  Nested Loop  (cost=0.00..14960.16 rows=2 width=16) (actual 
time=145.972..532673.631 rows=1625 loops=1)

  Join Filter: ((table1.a)::bpchar = (table3.a)::bpchar)
  -  Nested Loop  (cost=0.00..1673.61 rows=2 width=16) 
(actual time=0.094..31.692 rows=1627 loops=1)
-  Seq Scan on table2  (cost=0.00..186.64 rows=2 
width=4) (actual time=0.052..2.259 rows=42 loops=1)

  Filter: (c ~~ '1131%'::text)
-  Index Scan using table1_b_index on table1  
(cost=0.00..743.39 rows=8 width=20) (actual time=0.047..0.625 rows=39 
loops=42)

  Index Cond: (table1.b = table2.b)
  Filter: ((d IS NOT NULL) AND (e IS NOT NULL))
  -  Index Scan using table3_f on table3  
(cost=0.00..3737.05 rows=232498 width=16) (actual time=0.092..243.643 
rows=225893 loops=1627)

Index Cond: (f = 'foo'::bpchar)
Total runtime: 532693.200 ms
(14 rows)

If I replace where c like '1131%' with where substring(c from 1 for 
4) = '1131' (which to me seems to be exactly equivalent) I get a 
completely different query plan which is much faster:



QUERY PLAN

-
Unique  (cost=30626.05..30626.11 rows=13 width=16) (actual 
time=378.237..379.773 rows=1625 loops=1)
  -  Sort  (cost=30626.05..30626.08 rows=13 width=16) (actual 
time=378.236..378.685 rows=1625 loops=1)

Sort Key: table1.a
-  Hash Join  (cost=9507.43..30625.80 rows=13 width=16) 
(actual time=22.189..368.361 rows=1625 loops=1)

  Hash Cond: ((table3.a)::bpchar = (table1.a)::bpchar)
  -  Seq Scan on table3  (cost=0.00..20246.38 rows=232498 
width=16) (actual time=0.012..253.335 rows=225893 loops=1)

Filter: (f = 'foo'::bpchar)
  -  Hash  (cost=9507.22..9507.22 rows=17 width=16) 
(actual time=20.921..20.921 rows=1627 loops=1)
-  Nested Loop  (cost=0.00..9507.22 rows=17 
width=16) (actual time=0.121..19.837 rows=1627 loops=1)
  -  Seq Scan on table2  (cost=0.00..200.89 
rows=14 width=4) (actual time=0.084..3.419 rows=42 loops=1)
Filter: (substring((c)::text, 1, 4) = 
'1131'::text)
  -  Index Scan using table1_b_index on 
table1  (cost=0.00..664.64 rows=8 width=20) (actual time=0.024..0.364 
rows=39 loops=42)

Index Cond: (table1.b = table2.b)
Filter: ((d IS NOT NULL) AND (e IS NOT 
NULL))

Total runtime: 380.259 ms
(15 rows)

My question is: Why do I have to optimize my query (use substring 
instead of like) instead of having the database do this for me? Or is 
there a difference between both queries which I cannot see?
What can I do to get better results when using like? Do I have to add 
indices?
And last question: I do not really understand the first query plan. The 
actual time for the outer nested loop is 532673.631 ms. As far as I have 
understood the docs this includes the child nodes. But I cannot find the 
time-consuming child node. I only see two child nodes: The inner nested 
loop (which took 31.692 ms) and the index scan (which took 243.643 ms). 
Or do I have to multiply the 243.643 ms with 1627 (number of loops)? But 
even then I get 396407.161 ms, which is still far away from the 
532673.631 ms in the parent node.


Thanks for any help!

Regards,
   Christian

Appendix:

The tables and views look like this (I have left out the fields and 
constraints that don't participate in the queries):


  Table table1
Column |   Type   | Modifiers
+--+---
a  

Re: [GENERAL] out of memory error

2007-08-24 Thread Christian Schröder

Martijn van Oosterhout wrote:

You've got it completely wrong.


Hm, you seem to be right. :(
I have now decreased the shared_buffers setting to 128 MB. I have also 
found some tuning pages with warnings about not setting the value too 
high. I'm sure that I have read these pages before, but I seem to have 
been blind ...


Ok, many thanks for putting me right!

Regards,
  Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] out of memory error

2007-08-24 Thread Christian Schröder




Mikko Partio wrote:

  
  Isn't 128MB quite low considering the "current standard" of 25%
- 50% of total RAM?


I had also read a statement about using this amount of memory as shared
buffers. Exactly that was the reason why I set it to such a high value,
but I am now convinced that this is wrong.

http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html#shbuf
and
http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_conf_e.html
are quite clear about this.

On the other hand,
http://edoceo.com/liber/db-postgresql-performance.php says:

  
Shared Memory
PostgreSQL uses lots of this, view ipcs to prove
it, the more shared memory the better as more data (tables) can be
loaded.
On a dedicated database server it's not uncommon to give half the memory
to the database.
  

and

  
shared_buffers = N
Set anywhere from 1/4 to 1/2 physical memory, must set kernel
shared memory max first.
Will see noticeable difference.
  

Since the first links are also mentioned on the official PostgreSQL
website (http://www.postgresql.org/docs/techdocs.2) I think they should
be trusted more.

Regards,
 Christian
-- 
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer




Re: [GENERAL] out of memory error

2007-08-24 Thread Christian Schröder
Side note: Why does Thunderbird send HTML mails even though it is 
configured to send plain text mails? Sorry for that! And sorry for being off-topic.


Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [GENERAL] out of memory error

2007-08-23 Thread Christian Schröder

hubert depesz lubaczewski wrote:

On Wed, Aug 22, 2007 at 07:07:20PM +0200, Christian Schröder wrote:
  

These are the current settings from the server configuration:
   shared_buffers = 3GB



this is *way* too much. i would suggest lowering it to 1gig *at most*.
  

Ok, I can do this, but why can more memory be harmful?

   max memory size (kbytes, -m) 3441565



this looks too close to shared_buffers. again - lower it.
  

What happens if I set shared_buffers higher than the ulimit?

The machine is a linux box with 4 GB memory running PostgreSQL 8.2.4.



is it by any chance i386 architecture?
  
Linux db2 2.6.18.8-0.3-default #1 SMP Tue Apr 17 08:42:35 UTC 2007 
x86_64 x86_64 x86_64 GNU/Linux


Intel(R) Xeon(R) CPU 5130  @ 2.00GHz with 4 logical processors (2 physical)

vm.overcommit_memory = 2 # No memory overcommit.

Regards,
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] out of memory error

2007-08-23 Thread Christian Schröder




Tom Lane wrote:

  
Ok, I can do this, but why can more memory be harmful?

  
  
Because you've left no room for anything else?  The kernel, the various
other daemons, the Postgres code itself, and the local memory for each
Postgres process all require more than zero space.
  


So does this mean that the stuff you mentioned needs more than 1 GB of
memory? I seem to have underestimated the amount of memory that is
needed for these purposes. :(


  
Even more to the point, with such a large shared-buffer space, the
kernel probably will be tempted to swap out whatever parts of it seem
less used at the moment.  That is far more harmful to performance than
not having had the buffer in the first place --- it can easily triple
the amount of disk I/O involved.  (Thought experiment: dirty buffer is
written to disk, versus dirty buffer is swapped out to disk, then later
has to be swapped in so it can be written to wherever it should have
gone.)

Bottom line is that PG shared buffers are not so important as to deserve
3/4ths of your RAM.
  


Thanks for your tips! I have changed the "shared_buffers" setting back
to 2 GB. It was set to 2 GB before, but we also had "out of memory"
errors with this setting, so I raised it to 3 GB.
Could you please help me understand what's happening? The server is a
dedicated database server. Few other daemons are running, most of them
are system services that do not consume a considerable amount of
memory. No web server or similar is running on this machine.
Moreover, the output of "free" confuses me:

 db2:~ # free -m
              total       used       free     shared    buffers     cached
 Mem:          3954       3724        229          0          0       3097
 -/+ buffers/cache:        627       3326
 Swap:         2055        628       1426

Doesn't that mean that plenty of memory is unused? I always thought
that the memory used for buffers and caches can be thought of as free
memory. Isn't this correct?
Regarding the memory needs of the PostgreSQL server itself: Is there
any estimation how much memory will be needed besides the shared
buffers? What exactly does "out of memory" mean? Who requested the
memory and why could this memory request not be fulfilled?
I can post the memory overview from the log file, but I don't know if
it's considered impolite to post so many lines to this mailing list.

Thanks a lot again for your help,
 Christian
-- 
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer




[GENERAL] out of memory error

2007-08-22 Thread Christian Schröder

Hi list,
I am struggling with some out of memory errors in our PostgreSQL 
database which I do not understand. Perhaps someone can give me a hint.
The application which causes the errors runs multi-threaded with 10 
threads. Each of the threads performs several select statements on the 
database. Some of the statements are rather complicated (joins over 
multiple tables etc.). From time to time some of the statements lead to 
out of memory errors. The errors are not reproducible, and if I run the 
statements alone everything works fine.


When I watch the output of free -m while the application runs, the 
used memory (without buffers) is always near 500 MB:
                total       used       free     shared    buffers     cached
   Mem:          3954       3410        543          0          0       2942
   -/+ buffers/cache:        467       3486
   Swap:         2055        556       1498

These are the current settings from the server configuration:
   shared_buffers = 3GB
   work_mem = 8MB
   maintenance_work_mem = 256MB
   max_stack_depth = 4MB

The output of ulimit -a is as follows:
   core file size  (blocks, -c) 0
   data seg size   (kbytes, -d) unlimited
   file size   (blocks, -f) unlimited
   pending signals (-i) 38912
   max locked memory   (kbytes, -l) 32
   max memory size (kbytes, -m) 3441565
   open files  (-n) 1024
   pipe size(512 bytes, -p) 8
   POSIX message queues (bytes, -q) 819200
   stack size  (kbytes, -s) 8192
   cpu time   (seconds, -t) unlimited
   max user processes  (-u) 38912
   virtual memory  (kbytes, -v) 4922720
   file locks  (-x) unlimited

The machine is a linux box with 4 GB memory running PostgreSQL 8.2.4.

I can post the details about the memory status from the log file if it 
is needed.


Are there any configuration parameters that influence the amount of 
available memory (besides shared_buffers which seems to be high enough)?


Thanks for any help!
   Christian

--
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] How to enforce uniqueness when NULL values are present?

2007-03-17 Thread Christian Schröder
Thank you for your tips. I think I will change the tables and use some
minimal date instead of a null value to represent a constraint that is
valid all the time. An additional advantage of this approach is that I
can then make sure that the time intervals (I not only have a start
date, but also an end date) don't overlap.
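
A sketch of that design (names are hypothetical; -infinity requires the
timestamp type rather than date):

CREATE TABLE limits (
    id          integer   NOT NULL,
    valid_from  timestamp NOT NULL DEFAULT '-infinity',  -- valid all the time
    valid_to    timestamp NOT NULL DEFAULT 'infinity',
    value       double precision,
    PRIMARY KEY (id, valid_from)
);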
Nevertheless, I think that there are some examples where a null value
does not exactly mean unknown. But this is beyond the scope of this
discussion.

Regards,
Christian

-- 
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] issue with SELECT settval(..);

2007-03-17 Thread Christian Schröder
Alain Roger wrote:
 insert into immense.statususer (statususer_id, statususer_type) values
 (SELECT nextval( 'statususer_statususer_id_seq' ),'customer');
The correct syntax would be:

insert into immense.statususer (statususer_id, statususer_type) values
((SELECT nextval( 'statususer_statususer_id_seq' )),'customer');

The sub-select must be put in parentheses. However, the much simpler
statement

insert into immense.statususer (statususer_id, statususer_type) values
(nextval( 'statususer_statususer_id_seq' ),'customer');

will do the same without a sub-select.

Regards
Christian

-- 
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] How to enforce uniqueness when NULL values are present?

2007-03-11 Thread Christian Schröder
Berend Tober wrote:
 Christian Schröder wrote:
 Peter Eisentraut wrote:
  
 A first step in that direction would be to rethink the apparently
 troublesome use of null values.  
 Some of the limits are
 only valid after a given date, whereas other limits are valid all the
 time. How would you put this information into one or more tables? Of
 course, I could use a special date to indicate that a limit is valid all
 the time (e.g. 1970-01-01), but I don't think that this is better design
 than representing this with a NULL value. 

 I disagree. Using -infinity fits your defined needs unambiguously,
 except that you have to use timestamp data type rather than just date
I agree that this would be a correct model for the given application.
But wouldn't it be possible to think of a scenario where the same
problem arises? The core of my problem is that some of the records are
identified more precisely than others. Some of them are identified
using one field, whereas others need a second field to be uniquely
identified. Couldn't we construct examples for this?
Of course, if a NULL always means unknown, then this approach doesn't
make sense. Where can I find an authoritative definition of what NULL
means? As I have quoted before, according to Wikipedia (far from
being authoritative!) a NULL can also mean "not applicable".

Regards,
Christian

-- 
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[GENERAL] How to enforce uniqueness when NULL values are present?

2007-03-10 Thread Christian Schröder
Hi list!
Consider the following table definition:

 Column |   Type   | Modifiers
+--+---
 id | integer  | not null
 date   | date |
 value  | double precision |

The id and date field together are some sort of primary key. As you see,
the date field is nullable. For the entries, the following should be
ensured:

   1. If a record with a given id and a null value in the date field
  exists, no other record with the same id is allowed.
   2. If multiple records with the same id exist, they must have
  different values in the date field and none of them must have a
  null value in this field.

How can I enforce these constraints?

Since primary keys must not contain nullable fields, I cannot define a
primary key. I tried to define two separate partial unique indices, one
for the records with a null value as date, one for those with a non-null
value:
create unique index idx1 on test (id) where date is null;
create unique index idx2 on test (id, date) where date is not null;

This ensures that at most one record with a given id and a null value as
date is possible, and that multiple records with the same id must have
different dates. However, it is still possible to insert one record
without a date and one or more records with dates, which violates my
above constraints.
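
For illustration, the remaining gap (hypothetical rows; table and
indices as defined above):

INSERT INTO test VALUES (1, NULL, 1.0);          -- accepted by idx1
INSERT INTO test VALUES (1, '2007-01-01', 2.0);  -- accepted by idx2
-- Both rows now coexist, violating constraint 1 above.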

My next idea was to create my own operator class which treats null values
as equal. For example, my special comparison operator =* would have the
following behaviour:
'2007-01-01'::date =* '2007-01-01'::date -> true
'2007-01-01'::date =* '2007-01-02'::date -> false
'2007-01-01'::date =* null -> true (!)
null =* '2007-01-01'::date -> true (!)
null =* null -> true (!)

If these operators would be used when checking for uniqueness, the
records with a null date would always be equal to any record with a
non-null date; thus, it would not be allowed to insert more than one
record with the same id unless they had different non-null dates.
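
Spelled out, the idea might look like this (a sketch; function and
operator are hypothetical, and a complete btree operator class would
need further support functions):

-- Equality that treats NULL as matching everything:
CREATE FUNCTION date_eq_null(date, date) RETURNS boolean AS $$
    SELECT $1 IS NULL OR $2 IS NULL OR $1 = $2;
$$ LANGUAGE sql IMMUTABLE;

CREATE OPERATOR =* (
    LEFTARG   = date,
    RIGHTARG  = date,
    PROCEDURE = date_eq_null
);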

Unfortunately, this doesn't work. :-(  I assume that the date column is
never used at all, so my comparison operator is never consulted. So what
can I do to make this work?

I hope someone has a solution for me. Many thanks in advance!

Christian

P.S.: I'm using PostgreSQL 8.2.3

-- 
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer




---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] How to enforce uniqueness when NULL values are present?

2007-03-10 Thread Christian Schröder
Peter Eisentraut wrote:
 I submit that you should rethink your database schema and properly 
 normalize it.  You are attempting to retool the algebra that underlies 
   
I don't quite understand why this is a question of normalization. As far
as I can see, my table seems to be normalized as far as possible.
 A first step in that direction would be to rethink the apparently 
 troublesome use of null values.
   
Let me give you a more concrete example of the usage of my table. The
table was as follows:

 Column |   Type   | Modifiers
+--+---
 id | integer  | not null
 date   | date |
 value  | double precision |

Let's assume that the values in this table are some limits that are
given for different data (identified by the id). Some of the limits are
only valid after a given date, whereas other limits are valid all the
time. How would you put this information into one or more tables? Of
course, I could use a special date to indicate that a limit is valid all
the time (e.g. 1970-01-01), but I don't think that this is better design
than representing this with a NULL value. Or I could split the data into
two different tables, one with the date column and one without. But then
I had to work with two tables with more or less the same meaning.
Wouldn't it be quite strange to model the same entities (the limits)
with two tables?
I know that it's always dangerous to quote Wikipedia. Let me do it
anyway: "Attributes in tables in SQL database management systems can
optionally be designated as NULL. This indicates that the actual value
of the column is unknown _or not applicable_."
(http://en.wikipedia.org/wiki/Null_%28SQL%29) This is exactly what I
once learned about NULL values, and "not applicable" is exactly why I
use NULL values in my example.

Regards,
Christian

-- 
Deriva GmbH Tel.: +49 551 489500-42
Financial IT and Consulting Fax:  +49 551 489500-91
Hans-Böckler-Straße 2  http://www.deriva.de
D-37079 Göttingen

Deriva CA Certificate: http://www.deriva.de/deriva-ca.cer




---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq