Re: [HACKERS] Warts with SELECT DISTINCT

2006-05-04 Thread Bruno Wolff III
On Thu, May 04, 2006 at 01:32:45 -0400,
  Greg Stark [EMAIL PROTECTED] wrote:
 Bruno Wolff III [EMAIL PROTECTED] writes:
 
  On Thu, May 04, 2006 at 00:05:16 -0400,
Greg Stark [EMAIL PROTECTED] wrote:
   Bruno Wolff III [EMAIL PROTECTED] writes:
   
 Whereas it shouldn't be hard to prove that this is equivalent:
 
 stark= explain select col1 from test group by upper(col1),col1 order 
 by upper(col1);
  QUERY PLAN  
 -
  Group  (cost=88.50..98.23 rows=200 width=32)
-  Sort  (cost=88.50..91.58 rows=1230 width=32)
  Sort Key: upper(col1), col1
  -  Seq Scan on test  (cost=0.00..25.38 rows=1230 width=32)
 (4 rows)

I don't think you can assume that that will be true for any locale. If 
there
are two different characters that both have the same uppercase version, 
this
will break things.
   
   No it won't.
  
  Sure it will, because when you do the group by you will get a different
  number of groups. When grouping by the original characters you will get
  separate groups for characters that have the same uppercase character, where
  as when grouing by the uppercased characters you won't.
 
 But grouping on *both* will produce the same groups as grouping on the
 original characters alone.

OK, I misssed that. My brain only saw upper(col) and not the immediately
following ,col1.
I aggree that grouping by col1 and upper(col1), col1 will give you the same
groups. And hence the queries should be equivalent.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Warts with SELECT DISTINCT

2006-05-04 Thread Bruno Wolff III
On Thu, May 04, 2006 at 01:13:20 -0400,
  Tom Lane [EMAIL PROTECTED] wrote:
 
 I think it's a fair point that we could allow SELECT DISTINCT x ORDER BY
 foo(x) if foo() is stable, but that does not imply that sorting by x is
 interchangeable with sorting by foo(x).  foo = abs is a trivial
 counterexample.

I misunderstood Greg's example. Sorting by (foo(x), x) is a suitable
replacement for sorting by foo(x). So that it would be OK to rewrite
SELECT DISTINCT x ORDER BY foo(x)
as
SELECT DISTINCT ON (foo(x), x) x ORDER BY foo(x)

Whether or not this is worthwhile to automate, I am not in a good position
to judge.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Warts with SELECT DISTINCT

2006-05-04 Thread Tom Lane
Bruno Wolff III [EMAIL PROTECTED] writes:
 ... it would be OK to rewrite
 SELECT DISTINCT x ORDER BY foo(x)
 as
 SELECT DISTINCT ON (foo(x), x) x ORDER BY foo(x)

This assumes that x = y implies foo(x) = foo(y), which is something
that's not necessarily the case, mainly because a datatype's =
function need not have a lot to do with the behavior of arbitrary
functions foo(), especially if foo() yields a different datatype.
The citext datatype is an easy counterexample: it thinks foo = Foo,
but md5() of those values will not yield the same answers.

The bottom line here is that this sort of deduction requires more
understanding of the properties of datatypes and functions than
our existing catalogs allow the planner to obtain.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] Revised R* tree using GiST

2006-05-04 Thread fernando esparza

Hello,

I'm finishing my career in computer engineering. To finish that I've
to develop a new implementation of R*-tree. My teacher Prof. Dr.
Seeger, http://dbs.mathematik.uni-marburg.de/Home/People/Professor has
already one done, now he wants me to add it to PostgreSQL using Gist.
Now that gist is the PostgreSQL core it's very dificult for me to know
from where to start with. Could you please give me some kind of guide?

Thank you. Sincerely yours.

Fernando Esparza

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Revised R* tree using GiST

2006-05-04 Thread Oleg Bartunov

On Thu, 4 May 2006, fernando esparza wrote:


Hello,

I'm finishing my career in computer engineering. To finish that I've
to develop a new implementation of R*-tree. My teacher Prof. Dr.
Seeger, http://dbs.mathematik.uni-marburg.de/Home/People/Professor has
already one done, now he wants me to add it to PostgreSQL using Gist.
Now that gist is the PostgreSQL core it's very dificult for me to know
from where to start with. Could you please give me some kind of guide?


Some information is available on 
http://www.sai.msu.su/~megera/oddmuse/index.cgi/GiST
We have also GiST programming tutorial (in Russian)
http://www.sai.msu.su/~megera/postgres/talks/gist_tutorial.html



Thank you. Sincerely yours.

Fernando Esparza

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings



Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Revised R* tree using GiST

2006-05-04 Thread Teodor Sigaev
For you it will be easy to understand GiST interface looking into 
src/backend/access/gist/gistproc.c. It implements RTree.


fernando esparza wrote:

Hello,

I'm finishing my career in computer engineering. To finish that I've
to develop a new implementation of R*-tree. My teacher Prof. Dr.
Seeger, http://dbs.mathematik.uni-marburg.de/Home/People/Professor has
already one done, now he wants me to add it to PostgreSQL using Gist.
Now that gist is the PostgreSQL core it's very dificult for me to know
from where to start with. Could you please give me some kind of guide?

Thank you. Sincerely yours.

Fernando Esparza

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Rethinking locking for database create/drop vs

2006-05-04 Thread Simon Riggs
On Wed, 2006-05-03 at 16:15 -0400, Tom Lane wrote:
 This is motivated by Jim Buttafuoco's recent gripe about not being
 able to connect while a DROP DATABASE is in progress:
 http://archives.postgresql.org/pgsql-hackers/2006-05/msg00074.php

...

  If dropdb() takes such a lock before it checks for active
 backends, then the connection sequence can look like this:
 
   1. read pg_database flat file to find out OID of target DB
   2. initialize far enough to be able to start a transaction,
  and do so
   3. take a shared lock on the target DB by OID
   4. re-read pg_database flat file and verify DB still exists

Many people never CREATE or DROP databases. They just do everything in
the default database (name is release dependent) - at least on their
main system(s). It would be valid to optimize for that case.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Typo in ginxlog.c

2006-05-04 Thread Simon Riggs
On Tue, 2006-05-02 at 15:01 -0400, Alvaro Herrera wrote:
 Just noticed a typo in newly added ginxlog.c.  I don't have line
 numbers, but in ginRedoSplit() it reads:
 
   PageSetLSN(rpage, lsn);
   PageSetTLI(lpage, ThisTimeLineID);
   MarkBufferDirty(rbuffer);
 
   PageSetLSN(lpage, lsn);
   PageSetTLI(lpage, ThisTimeLineID);
   MarkBufferDirty(lbuffer);
 
 
 Notice the first call to PageSetTLI should be
 
   PageSetTLI(rpage, ThisTimeLineID);
 

Well spotted. We'd have not corrected that until someone's db failed.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] Semi-undocumented functions in libpq

2006-05-04 Thread Martijn van Oosterhout
Comparing the lists of functions exported by libpq and those declared
by libpq-fe.h turns up a fair number of descrepancies. Most of these
functions are declared by internal header files. For clarity I think we
should clarify the situation, either explicity declare them for
external users, or stop exporting them. Now that we've bumped the major
version of libpq, now is the perfect time to decide.

The functions in question are:

Used by psql:
all the PQExpBuffer functions (./internal/pqexpbuffer.h)
pg_encoding_to_char  (./server/mb/pg_wchar.h)
pqsignal (./server/libpq/pqsignal.h)

Used by initdb and createdb:
pg_char_to_encoding  (./server/mb/pg_wchar.h)
Used by initdb:
pg_valid_server_encoding (./server/mb/pg_wchar.h)

Not used by anyone:
pg_utf_mblen (./server/mb/pg_wchar.h)
pgresStatus  (a way to access info from PQresStatus but not 
declared publically anywhere)

-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.


signature.asc
Description: Digital signature


[HACKERS] autovacuum logging, part deux.

2006-05-04 Thread Larry Rosenman
Gentlepeople,
Now that the patch is out for keeping the last
autovacuum/vacuum/analyze/autoanalyze
timestamp in the stats system is pending, what's the consensus view on
what, if any,
logging changes are wanted for autovacuum?

I have the time and inclination to cut code quickly for it.

Thanks,
Larry Rosenman

-- 
Larry Rosenman  
Database Support Engineer

PERVASIVE SOFTWARE. INC.
12365B RIATA TRACE PKWY
3015
AUSTIN TX  78727-6531 

Tel: 512.231.6173
Fax: 512.231.6597
Email: [EMAIL PROTECTED]
Web: www.pervasive.com 

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Warts with SELECT DISTINCT

2006-05-04 Thread Bruno Wolff III
On Thu, May 04, 2006 at 02:39:33 -0400,
  Tom Lane [EMAIL PROTECTED] wrote:
 Bruno Wolff III [EMAIL PROTECTED] writes:
  ... it would be OK to rewrite
  SELECT DISTINCT x ORDER BY foo(x)
  as
  SELECT DISTINCT ON (foo(x), x) x ORDER BY foo(x)
 
 This assumes that x = y implies foo(x) = foo(y), which is something
 that's not necessarily the case, mainly because a datatype's =
 function need not have a lot to do with the behavior of arbitrary
 functions foo(), especially if foo() yields a different datatype.
 The citext datatype is an easy counterexample: it thinks foo = Foo,
 but md5() of those values will not yield the same answers.
 
 The bottom line here is that this sort of deduction requires more
 understanding of the properties of datatypes and functions than
 our existing catalogs allow the planner to obtain.

Thanks for pointing that out. I should have realized that this was the same
(or at least close to) issue I was thinking would be a problem initially, but
then I started thinking that '=' promised more than it did and assumed that
x = y implies foo(x) = foo(y), which as you point out isn't always true.

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Rethinking locking for database create/drop vs connection startup

2006-05-04 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 On Wed, 2006-05-03 at 16:15 -0400, Tom Lane wrote:
 If dropdb() takes such a lock before it checks for active
 backends, then the connection sequence can look like this:

 Many people never CREATE or DROP databases. They just do everything in
 the default database (name is release dependent) - at least on their
 main system(s). It would be valid to optimize for that case.

I'm not particularly concerned about people with only a couple of
databases --- reading the flat file isn't going to take any meaningful
amount of time for them anyway.  It's the folks with hundreds of
databases who might have a beef.  But those are exactly the people
who need create/drop database to be bulletproof.

As I've been working on this patch I've found that it will clean up a
whole lot of related issues, so I'm getting more and more convinced
it's the Right Thing.  Some points:

* Connecting will actually take RowExclusiveLock (ordinary writer's
lock), while CREATE DATABASE takes ShareLock on the template DB, and
of course DROP/RENAME DATABASE take AccessExclusiveLock.  This provides
for the first time an absolute guarantee that CREATE DATABASE gets a
consistent copy of the template: before we could never ensure that
someone didn't connect to the template and change it while the copy was
in progress.  At the same time, two CREATE DATABASEs can safely use the
same template, and of course two concurrent connections don't block
each other.

* Since we're trying not to take any table-level exclusive locks on
pg_database anymore, we need a different solution in flatfiles.c to
ensure only one transaction writes the flat file at a time.  To do this
I'm going to have a dedicated lock, used only in the flatfile code, that
is taken just before trying to write the file and held till commit
(which is immediately after).  This eliminates the former risk of
deadlock associated with manual updates to pg_database, and as a bonus
holds the exclusive lock for a much shorter period of time.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


[HACKERS] pseudo-type record arguments for PL-functions

2006-05-04 Thread Markus Schiltknecht
Hi,

I'm trying to write a PL/Python function which is to be called from a
rule. I'd need pass the OLD and NEW tuple records to the function.
Unfortunately that does not work: 'pl/python functions cannot take type
record'.

What I have figured out by reading the source code: OLD and NEW are
pseudo types (otherwise, pl/python would not have thrown that error)
(plpython.c:1088). During parsing of SQL function definitions the
arguments for the function are checked. In a comment I've read:
Disallow pseudotypes in arguments (pg_proc:546). I checked the other
PLs and found out, that no one can handle pseudo-arguments.

What exactly are pseudo types? Why are the OLD and NEW records of a rule
pseudo-types? Why can PLs not handle pseudo-types?

Or is it possible to write a C-function which takes the OLD and NEW
records of a rule as arguments? Is there an example of such a thing?

Regards

Markus



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] autovacuum logging, part deux.

2006-05-04 Thread Chris Browne
[EMAIL PROTECTED] (Larry Rosenman) writes:
 Gentlepeople,
 Now that the patch is out for keeping the last
 autovacuum/vacuum/analyze/autoanalyze
 timestamp in the stats system is pending, what's the consensus view on
 what, if any,
 logging changes are wanted for autovacuum?

 I have the time and inclination to cut code quickly for it.

It would be Really Nice if it could draw in the verbose stats as to
what the VACUUM did...

e.g. - to collect some portion (INFO?  DETAIL?  I'm easy :-)) of the
information that PostgreSQL generates at either INFO: or DETAIL:
levels.

/* [EMAIL PROTECTED]/dba2 vacdb=*/ vacuum verbose analyze vacuum_requests;
INFO:  vacuuming public.vacuum_requests
INFO:  index vacuum_requests_pkey now contains 2449 row versions in 64 pages
DETAIL:  3 index pages have been deleted, 3 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index vr_priority now contains 0 row versions in 19 pages
DETAIL:  16 index pages have been deleted, 16 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  vacuum_requests: found 0 removable, 2449 nonremovable row versions in 
65 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 2809 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  vacuuming pg_toast.pg_toast_95167460
INFO:  index pg_toast_95167460_index now contains 0 row versions in 1 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  pg_toast_95167460: found 0 removable, 0 nonremovable row versions in 0 
pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  analyzing public.vacuum_requests
INFO:  vacuum_requests: 65 pages, 2449 rows sampled, 2449 estimated total rows
VACUUM

-- 
cbbrowne,@,acm.org
http://cbbrowne.com/info/x.html
If you  stand in the middle  of a library and  shout Argh at
the top of your voice, everyone just stares at you. If you do the same
thing on an aeroplane, why does everyone join in?

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] autovacuum logging, part deux.

2006-05-04 Thread Rod Taylor
I don't know about anyone else, but the only time I look at that mess is
to find poor tuple/table or tuple/index ratios and other indications
that vacuum isn't working as well as it should be.

How about this instead:

Log when the actual autovacuum_vacuum_scale_factor (dead space cleaned
up) was more than 2 times the autovacuum_vacuum_scale_factor listed in
postgresql.conf. This means autovacuum isn't keeping up to what you want
it to.

Another interesting case would be a large amount of empty space in the
index or table (say 3x autovacuum_vacuum_scale_factor). This may
indicate unnecessary bloat and something to fix.

Aside from that, the raw numbers don't really interest me.

On Thu, 2006-05-04 at 14:46 +, Chris Browne wrote:
 [EMAIL PROTECTED] (Larry Rosenman) writes:
  Gentlepeople,
  Now that the patch is out for keeping the last
  autovacuum/vacuum/analyze/autoanalyze
  timestamp in the stats system is pending, what's the consensus view on
  what, if any,
  logging changes are wanted for autovacuum?
 
  I have the time and inclination to cut code quickly for it.
 
 It would be Really Nice if it could draw in the verbose stats as to
 what the VACUUM did...
 
 e.g. - to collect some portion (INFO?  DETAIL?  I'm easy :-)) of the
 information that PostgreSQL generates at either INFO: or DETAIL:
 levels.
 
 /* [EMAIL PROTECTED]/dba2 vacdb=*/ vacuum verbose analyze vacuum_requests;
 INFO:  vacuuming public.vacuum_requests
 INFO:  index vacuum_requests_pkey now contains 2449 row versions in 64 pages
 DETAIL:  3 index pages have been deleted, 3 are currently reusable.
 CPU 0.00s/0.00u sec elapsed 0.00 sec.
 INFO:  index vr_priority now contains 0 row versions in 19 pages
 DETAIL:  16 index pages have been deleted, 16 are currently reusable.
 CPU 0.00s/0.00u sec elapsed 0.00 sec.
 INFO:  vacuum_requests: found 0 removable, 2449 nonremovable row versions 
 in 65 pages
 DETAIL:  0 dead row versions cannot be removed yet.
 There were 2809 unused item pointers.
 0 pages are entirely empty.
 CPU 0.00s/0.00u sec elapsed 0.00 sec.
 INFO:  vacuuming pg_toast.pg_toast_95167460
 INFO:  index pg_toast_95167460_index now contains 0 row versions in 1 pages
 DETAIL:  0 index pages have been deleted, 0 are currently reusable.
 CPU 0.00s/0.00u sec elapsed 0.00 sec.
 INFO:  pg_toast_95167460: found 0 removable, 0 nonremovable row versions in 
 0 pages
 DETAIL:  0 dead row versions cannot be removed yet.
 There were 0 unused item pointers.
 0 pages are entirely empty.
 CPU 0.00s/0.00u sec elapsed 0.00 sec.
 INFO:  analyzing public.vacuum_requests
 INFO:  vacuum_requests: 65 pages, 2449 rows sampled, 2449 estimated total 
 rows
 VACUUM
 
-- 


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] autovacuum logging, part deux.

2006-05-04 Thread Larry Rosenman
Rod Taylor wrote:
 I don't know about anyone else, but the only time I look at that mess
 is to find poor tuple/table or tuple/index ratios and other
 indications that vacuum isn't working as well as it should be.
 
 How about this instead:
 
 Log when the actual autovacuum_vacuum_scale_factor (dead space cleaned
 up) was more than 2 times the autovacuum_vacuum_scale_factor listed in
 postgresql.conf. This means autovacuum isn't keeping up to what you
 want it to.
 
 Another interesting case would be a large amount of empty space in the
 index or table (say 3x autovacuum_vacuum_scale_factor). This may
 indicate unnecessary bloat and something to fix.
 
 Aside from that, the raw numbers don't really interest me.


Does anyone think we should have a stats view for the last vacuum stats
for each table?

I.E. capture all the verbose info somewhere?

Or,  do people just want to increase the logging?

I still don't see a consensus on what needs to come out. 

Do we still need the autovacuum_verbosity type change?

LER

 
-- 
Larry Rosenman  
Database Support Engineer

PERVASIVE SOFTWARE. INC.
12365B RIATA TRACE PKWY
3015
AUSTIN TX  78727-6531 

Tel: 512.231.6173
Fax: 512.231.6597
Email: [EMAIL PROTECTED]
Web: www.pervasive.com 

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] autovacuum logging, part deux.

2006-05-04 Thread Rod Taylor
On Thu, 2006-05-04 at 11:25 -0500, Larry Rosenman wrote:
 Rod Taylor wrote:
  I don't know about anyone else, but the only time I look at that mess
  is to find poor tuple/table or tuple/index ratios and other
  indications that vacuum isn't working as well as it should be.
  
  How about this instead:
  
  Log when the actual autovacuum_vacuum_scale_factor (dead space cleaned
  up) was more than 2 times the autovacuum_vacuum_scale_factor listed in
  postgresql.conf. This means autovacuum isn't keeping up to what you
  want it to.
  
  Another interesting case would be a large amount of empty space in the
  index or table (say 3x autovacuum_vacuum_scale_factor). This may
  indicate unnecessary bloat and something to fix.
  
  Aside from that, the raw numbers don't really interest me.
 
 
 Does anyone think we should have a stats view for the last vacuum stats
 for each table?

This would actually suit me better as it would be trivial to plug into a
monitoring system with home-brew per table thresholds at that point.

-- 


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Warts with SELECT DISTINCT

2006-05-04 Thread Greg Stark
Bruno Wolff III [EMAIL PROTECTED] writes:

 Thanks for pointing that out. I should have realized that this was the same
 (or at least close to) issue I was thinking would be a problem initially, but
 then I started thinking that '=' promised more than it did and assumed that
 x = y implies foo(x) = foo(y), which as you point out isn't always true.

Hm. This goes back to the earlier conversation about whether = should ever be
true for two objects that aren't, well, equal. I thought there was some
consensus at the time that sorting should impose a superficial ordering on
items that compare equal but aren't in fact the same.

-- 
greg


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Semi-undocumented functions in libpq

2006-05-04 Thread Tom Lane
Martijn van Oosterhout kleptog@svana.org writes:
 Comparing the lists of functions exported by libpq and those declared
 by libpq-fe.h turns up a fair number of descrepancies. Most of these
 functions are declared by internal header files. For clarity I think we
 should clarify the situation, either explicity declare them for
 external users, or stop exporting them. Now that we've bumped the major
 version of libpq, now is the perfect time to decide.

Already done no?  (at least on the platforms where we know how to
restrict it)

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] pseudo-type record arguments for PL-functions

2006-05-04 Thread Tom Lane
Markus Schiltknecht [EMAIL PROTECTED] writes:
 What exactly are pseudo types?

See
http://developer.postgresql.org/docs/postgres/extend-type-system.html

 Why can PLs not handle pseudo-types?

No one's done the work to figure out which ones are sensible to support
and then add the logic needed to support them.

In your particular case, the problem is that plpython isn't prepared to
handle rowtypes determined at runtime.  I'm not sure if the recently
submitted plpython patch fixes that or not.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] patch review, please: Autovacuum/Vacuum times via stats.

2006-05-04 Thread Jim C. Nasby
On Wed, May 03, 2006 at 02:25:54PM -0400, Tom Lane wrote:
  On Tue, May 02, 2006 at 05:49:33PM -0500, Jim C. Nasby wrote:
  Back in the discussion of this someone had mentioned capturing all the
  info that you'd get from a vacuum verbose; dead tuples, etc. What do
  people think about that? In particular I think it'd be handy to know how
  many pages vacuum wanted in the FSM vs. how many it got; this would make
  it much easier for people to ensure that the FSM is large enough.
 
 Isn't this already dealt with by contrib/pg_freespacemap?

AFAIK that does nothing to tell you how much space is desired by
relations. It would tell you if the FSM is nearly full, but I'm not sure
that's very reliable, especially given how every relation that wants
space in the FSM is giving a sizeable minimum number of pages (16,
iirc). Even putting that aside, it'd be nice to have an exact amount of
space that was needed.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Typo in ginxlog.c

2006-05-04 Thread Jim C. Nasby
On Thu, May 04, 2006 at 08:46:54AM +0100, Simon Riggs wrote:
 On Tue, 2006-05-02 at 15:01 -0400, Alvaro Herrera wrote:
  Just noticed a typo in newly added ginxlog.c.  I don't have line
  numbers, but in ginRedoSplit() it reads:
  
  PageSetLSN(rpage, lsn);
  PageSetTLI(lpage, ThisTimeLineID);
  MarkBufferDirty(rbuffer);
  
  PageSetLSN(lpage, lsn);
  PageSetTLI(lpage, ThisTimeLineID);
  MarkBufferDirty(lbuffer);
  
  
  Notice the first call to PageSetTLI should be
  
  PageSetTLI(rpage, ThisTimeLineID);
  
 
 Well spotted. We'd have not corrected that until someone's db failed.

It also begs the question of if there should be a function/macro that
handles those 3 steps...
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Warts with SELECT DISTINCT

2006-05-04 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes:
 Hm. This goes back to the earlier conversation about whether = should ever be
 true for two objects that aren't, well, equal. I thought there was some
 consensus at the time that sorting should impose a superficial ordering on
 items that compare equal but aren't in fact the same.

We forced that recently for text strings (overriding locale-specific
cases in strcoll()), but there certainly was not any intent to decree
that every datatype must do likewise.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] patch review, please: Autovacuum/Vacuum times via stats.

2006-05-04 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes:
 On Wed, May 03, 2006 at 02:25:54PM -0400, Tom Lane wrote:
 Isn't this already dealt with by contrib/pg_freespacemap?

 AFAIK that does nothing to tell you how much space is desired by
 relations.

I thought the latest patch arranged to expose per-relation lastPageCount
values.  If it doesn't then it certainly still needs work.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] autovacuum logging, part deux.

2006-05-04 Thread Jim C. Nasby
On Thu, May 04, 2006 at 12:37:48PM -0400, Rod Taylor wrote:
 On Thu, 2006-05-04 at 11:25 -0500, Larry Rosenman wrote:
  Rod Taylor wrote:
   I don't know about anyone else, but the only time I look at that mess
   is to find poor tuple/table or tuple/index ratios and other
   indications that vacuum isn't working as well as it should be.
   
   How about this instead:
   
   Log when the actual autovacuum_vacuum_scale_factor (dead space cleaned
   up) was more than 2 times the autovacuum_vacuum_scale_factor listed in
   postgresql.conf. This means autovacuum isn't keeping up to what you
   want it to.
   
   Another interesting case would be a large amount of empty space in the
   index or table (say 3x autovacuum_vacuum_scale_factor). This may
   indicate unnecessary bloat and something to fix.
   
   Aside from that, the raw numbers don't really interest me.
  
  
  Does anyone think we should have a stats view for the last vacuum stats
  for each table?
 
 This would actually suit me better as it would be trivial to plug into a
 monitoring system with home-brew per table thresholds at that point.

+1. But I also think it would be handy to have some means to better
control autovacuum logging, probably via something like
autovacuum_verbosity.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] patch review, please: Autovacuum/Vacuum times via stats.

2006-05-04 Thread Jim C. Nasby
On Thu, May 04, 2006 at 01:20:32PM -0400, Tom Lane wrote:
 Jim C. Nasby [EMAIL PROTECTED] writes:
  On Wed, May 03, 2006 at 02:25:54PM -0400, Tom Lane wrote:
  Isn't this already dealt with by contrib/pg_freespacemap?
 
  AFAIK that does nothing to tell you how much space is desired by
  relations.
 
 I thought the latest patch arranged to expose per-relation lastPageCount
 values.  If it doesn't then it certainly still needs work.

After CVS-upping... yes, both lastpagecount and nextpage are now
included. But unfortunately the README says next to nothing about what
they mean...

   lastpagecount  |  | Count of pages examined for useful
  |  | free space.
   nextpage   |  | page index (from 0) to start next 
  |  | search at.

Perhaps if there was some discussion about how the backend used these
two values it would make more sense, but right now I'm not seeing how
lastpagecount relates to how many pages weren't included in the FSM that
should have been...
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Semi-undocumented functions in libpq

2006-05-04 Thread Martijn van Oosterhout
On Thu, May 04, 2006 at 12:47:39PM -0400, Tom Lane wrote:
 Martijn van Oosterhout kleptog@svana.org writes:
  Comparing the lists of functions exported by libpq and those declared
  by libpq-fe.h turns up a fair number of descrepancies. Most of these
  functions are declared by internal header files. For clarity I think we
  should clarify the situation, either explicity declare them for
  external users, or stop exporting them. Now that we've bumped the major
  version of libpq, now is the perfect time to decide.
 
 Already done no?  (at least on the platforms where we know how to
 restrict it)

These functions are all in the exports.txt. I was just wondering if we
wanted to cut that list down any more...

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.


signature.asc
Description: Digital signature


Re: [HACKERS] pseudo-type record arguments for PL-functions

2006-05-04 Thread Thomas Hallgren

Tom Lane wrote:

Why can PLs not handle pseudo-types?


No one's done the work to figure out which ones are sensible to support
and then add the logic needed to support them.

PL/Java will handle the RECORD type correctly. I'm just finalizing a new, more flexible, 
type mapping implementation for PL/Java and it would be easy to add support for more pseudo 
types too. But what others would make sense?


Kind Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Semi-undocumented functions in libpq

2006-05-04 Thread Tom Lane
Martijn van Oosterhout kleptog@svana.org writes:
 On Thu, May 04, 2006 at 12:47:39PM -0400, Tom Lane wrote:
 Already done no?  (at least on the platforms where we know how to
 restrict it)

 These functions are all in the exports.txt. I was just wondering if we
 wanted to cut that list down any more...

AFAIK, everything that is in exports.txt was put there for a reason.
I'm happy with the situation as it stands (other than wanting to enforce
the exports.txt restriction on more platforms ...)

Did we come to a decision about whether to implement symbol versioning
for libpq?

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] pseudo-type record arguments for PL-functions

2006-05-04 Thread Tom Lane
Thomas Hallgren [EMAIL PROTECTED] writes:
 PL/Java will handle the RECORD type correctly. I'm just finalizing a new, 
 more flexible, 
 type mapping implementation for PL/Java and it would be easy to add support 
 for more pseudo 
 types too. But what others would make sense?

If you've got record/anyelement/anyarray support, you've probably pretty
much covered the bases.

Looking at the list (table 8-20), it strikes me that there's nothing
very pseudo about cstring anymore --- it could certainly be treated as
an ordinary datatype.  Not sure if there's any point in changing though.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] patch review, please: Autovacuum/Vacuum times via stats.

2006-05-04 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes:
 After CVS-upping... yes, both lastpagecount and nextpage are now
 included. But unfortunately the README says next to nothing about what
 they mean...

Yeah, this needs a bit of work ... will have at it.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [GENERAL] Adding ON UPDATE CASCADE to an existing foreign key

2006-05-04 Thread Jim C. Nasby
Moving to -hackers...

On Thu, May 04, 2006 at 09:17:31AM -0700, Stephan Szabo wrote:
 On Thu, 4 May 2006, Rich Doughty wrote:
 
  I have a foreign key constraint that I'd like to alter. I'd rather not
  drop and re-create it due to the size of the table involved. All I need
  to do is add an ON UPDATE CASCADE.
 
  Is it ok to set confupdtype to 'c' in pg_constraint (and will this be
  all that's needed) or is it safer to drop and recreate the constraint?
 
 I don't think that's going to work, you'd probably need to change the
 function associated with the trigger involved too.  It's probably safer to
 do the drop and create.

It would be nice if there was a way to do this that didn't involve
re-validating all the data. Can this be added as a TODO?
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[HACKERS] [EMAIL PROTECTED]: Re: [GENERAL] 8.1.4 anytime soon?]

2006-05-04 Thread Jim C. Nasby
Natives getting restless... :)

Any plans for a release?

- Forwarded message from Bruno Wolff III [EMAIL PROTECTED] -
On Tue, May 02, 2006 at 14:20:03 -0400,
  Matthew T. O'Connor matthew@zeut.net wrote:
 Hey all, I was just wondering if there were any plans to get 8.1.4 
 release in the near future.  I'm seeing semi-frequent out of memory 
 errors that are related to a bugfix that Tom put in post 8.1.3. (Yes I 
 know I can compile from source, but I'd rather not do that on my 
 production server.)

There was some talk a while ago (maybe a month?) about getting a 8.1.4
release out relatively soon. (This was after the partial block write issue
was discovered.), but I haven't seen anything recently.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


- End forwarded message -

-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Semi-undocumented functions in libpq

2006-05-04 Thread Martijn van Oosterhout
On Thu, May 04, 2006 at 03:21:56PM -0400, Tom Lane wrote:
 AFAIK, everything that is in exports.txt was put there for a reason.
 I'm happy with the situation as it stands (other than wanting to enforce
 the exports.txt restriction on more platforms ...)

In that case, shouldn't we add to libpq-fe.h all the functions that
libpq is exporting?

 Did we come to a decision about whether to implement symbol versioning
 for libpq?

Not really. For distributors it's nice because it means that they don't
need to recompile the entire software stack just because a new postgres
release enters the archive. But for run-of-the-mill users who compile
postgres themselves, they won't notice one way or the other.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.


signature.asc
Description: Digital signature


Re: [HACKERS] pseudo-type record arguments for PL-functions

2006-05-04 Thread David Fetter
On Thu, May 04, 2006 at 09:02:02PM +0200, Thomas Hallgren wrote:
 Tom Lane wrote:
   Why can PLs not handle pseudo-types?
 
 No one's done the work to figure out which ones are sensible to
 support and then add the logic needed to support them.
 
 PL/Java will handle the RECORD type correctly. I'm just finalizing a
 new, more flexible, type mapping implementation for PL/Java and it
 would be easy to add support for more pseudo types too. But what
 others would make sense?

Ideally, some way to get all kinds of user-defined types.  DOMAINs,
too. :)

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778AIM: dfetter666
  Skype: davidfetter

Remember to vote!

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Semi-undocumented functions in libpq

2006-05-04 Thread Tom Lane
Martijn van Oosterhout kleptog@svana.org writes:
 On Thu, May 04, 2006 at 03:21:56PM -0400, Tom Lane wrote:
 AFAIK, everything that is in exports.txt was put there for a reason.
 I'm happy with the situation as it stands (other than wanting to enforce
 the exports.txt restriction on more platforms ...)

 In that case, shouldn't we add to libpq-fe.h all the functions that
 libpq is exporting?

No, because we're only intending that stuff like psql and pg_dump use 'em.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Is a SERIAL column a black box, or not?

2006-05-04 Thread Bruce Momjian

I have read this thread and I agree with Magnus that we have both new
and experienced users, and we need something simple like SERIAL for new
users.

I agree that having SERIAL be a macro is probably less that useless ---
you can create SERIAL easily, but to remove a table you have to
understand the underlying system, so we actually add complexity by
having SERIAL as a macro --- if we did that, we might as well just
remove SERIAL if all it does is expand to DEFAULT nextval().

My idea is to create a new SECURITY DEFINER function called
serial_nextval(), and use that for SERIAL defaults.  That will fix the
sequence permission issue Magnus mentioned, and better document for new
users what the DEFAULT does (it is related to SERIAL).  It might also
help us flag cases where we should be modifying things during ALTER.

---

Magnus Hagander wrote:
  We started with #2 and have been moving slowly towards #1, 
  but I think there's a limit to how far we want to go in that 
  direction.  A black box approach isn't especially 
  user-friendly in my opinion; it's not solving any problems, 
  it's just refusing to deal with the implications of ALTER 
  TABLE and friends.  
 
 I think it's a matter of user-friendliness for *who*. A black box would
 definitly be a lot more user-friendly for a beginner, or someone who
 really doesn't care for more than just an auto-numbering column (which
 I'm sure is a lot of cases).
 
 For example, I've lost count of the number of times I've had to explain
 to people yes, I know you just created a table with a column, but when
 you need to GRANT permissions you need to do it twice - once for the
 column and once for the sequence you didn't know you created. I don't
 recall any of these cases ending with hey, what a handy feature that I
 can tweak the sequence independently.
 
 For an expert user it's certainly handy, though.
 
 
  What's more, the further we go in that 
  direction the more problems we'll have in pg_dump.  We've 
  already got issues there; for example, if someone renames a 
  serial's sequence or tweaks its sequence parameters, this 
  will not be preserved by dump/restore.
 
 If it was a proper black box, that wouldn't happen, since there would
 be no way to make those changes, right? So that argument would really be
 helped in either direction, with the problem mainly showing in the
 middle ground where we are now.
 
 
  The other concern the hidden dependency addresses is the idea 
  that the sequence ought to be silently dropped if the table 
  (or just the column) is dropped.  I wonder though if that 
  behavior is really worth the increasing contortions we're 
  going through to try to make things work 
  conveniently/transparently in other respects.  We're buying 
  simplicity for one case at the cost of tremendous 
  complication for other cases.
 
 I bet loads of databases would be filled with no-longer-used sequences
 in this case. But that may not really be a problem, since they don't
 exactly occupy loads of space when they just sit there...
 
 
  In short, I think there's a reasonably good case to be made 
  for losing the hidden dependency and re-adopting the 
  viewpoint that saying SERIAL is
  *exactly* the same as making a sequence and then making a 
  default expression that uses the sequence.  Nothing behind 
  the curtain.
 
 That certainly does have the merit of being very predictable behaviour -
 which is good. 
 
 Another note is that it's definitly going to make it harder for people
 coming in from other databases, that have IDENTITY or AUTO_NUMBER or
 whatever the feature is called there. They're likely to go even more
 what?! than now...
 
 If it's not obvious yet :-P, I'd be in favour of having SERIAL as
 black-box as possible, and then just use manual CREATE SEQUENCE and
 DEFAULT nextval() for when you need a more advanced case. But that's as
 seen from a user perspective, without regard for backend complexity.
 
 //Magnus
 
 ---(end of broadcast)---
 TIP 6: explain analyze is your friend
 

-- 
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Is a SERIAL column a black box, or not?

2006-05-04 Thread Tom Lane
Bruce Momjian pgman@candle.pha.pa.us writes:
 My idea is to create a new SECURITY DEFINER function called
 serial_nextval(), and use that for SERIAL defaults.

You haven't thought about this at all.  Who will own that function?
Surely we don't want to create a new one for every SERIAL column.
And even if we did, what magic will cause its ownership to change
when the table's owner is changed?

I'm leaning towards the idea that we need special syntax, along the
lines of
DEFAULT nextval('some_seq') AS OWNER
which would result in generating a special expression node type at
the time the DEFAULT expression is inserted into a query plan (and
no earlier).  At runtime this node would temporarily switch
current_user, just as we do for SECURITY_DEFINER functions --- but by
postponing the determination of which user to switch to until the plan
is built, we avoid trouble with ALTER TABLE OWNER.

Per Bruno's earlier comments, we probably need the same feature for
table CHECK constraints.  Might be interesting to think about it for
domain check constraints too, though that's getting a bit far afield
unless someone has a convincing use-case.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [pgsql-hackers-win32] [HACKERS] Build with Visual Studio MSVC

2006-05-04 Thread Gurjeet Singh

Hi William(uniware), Chuck and Hackers,

   I have been interested in doing complete PGSQL development in MSVC
for a long time now. With reference to one of Chuck's mails to
-hackers-win32 with the same subject, you said that you were able to
successfully compile PG 8.1 with some minor tweaks.

   Also, William has 'vcproject' hosted on pgfoundry, I downloaded
it, and tried compiling vcproject\msvc\postgres\postgres.dsw on
VC++6.0. It failed miserably with over 1000 errors. I am sure there's
some tweaks needed here too!!!

   First of all, I would like to build entire server using just VC++
6.0, with NO mingw toolchain (or try to minimize it's dependency as
much as possible). If successful, I'd be glad to maintain it too for
future releases, and add support for other components/tools too.

   My main grudge is that if we are supporting almost all flovours of
nixens and compilers (close to 34 according to official website), then
why are we leaving Windows platform alone? This will bring in quite a
lot more developers.

   I am sure it's not going to be easy, but I am sure with this great
community suppport, we sure can achieve it.

   Will and Chuck, please send the detailed steps of how to compile
pgsql the way you did it. All, please send in quirks, watch-outs when
trying to build using VC.

Thanks in advance,
Gurjeet.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [pgsql-hackers-win32] [HACKERS] Build with Visual Studio MSVC

2006-05-04 Thread Jonah H. Harris

On 5/4/06, Gurjeet Singh [EMAIL PROTECTED] wrote:

My main grudge is that if we are supporting almost all flovours of
nixens and compilers (close to 34 according to official website), then
why are we leaving Windows platform alone? This will bring in quite a
lot more developers.


Sorry, but this sounds pretty rude.  If you have a grudge, do
something about it and stop whining about our support for tons of *nix
platforms instead of Windows.  For the most part, the reason *nix
platform support is more popular is because that's what most of us use
on a daily basis.

Keep in mind this is an open source project and a lot of us have more
important things to work on than simple porting.  It's easy to
criticize, it's much harder to do the actual work.

--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324

---(end of broadcast)---
TIP 6: explain analyze is your friend