Re: [HACKERS] message for constraint

2006-01-14 Thread Peter Eisentraut
Jaime Casanova wrote:
 From time to time people ask me if there is a way to customize
 messages for constraints so they could be more informative to the
 user...

What about this?

= create table foo (fld int4 constraint fld must contain positive numbers 
check (fld  0));
CREATE TABLE
= insert into foo values (-5);
ERROR:  new row for relation foo violates check constraint fld must contain 
positive numbers

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[HACKERS] Surrogate keys (Was: enums)

2006-01-14 Thread Leandro Guimarães Faria Corcete DUTRA
Rod Taylor pg at rbt.ca writes:

 The basic idea is that most of us break out schemas by creating fake
 primary keys for the purpose of obtaining performance because using the
 proper primary key (single or multiple columns) is often very slow.

This is one thing I simply can't understand.

If you still declare the natural key(s) as UNIQUEs, you have just made 
performance worse.  Now there are two keys to be checked on UPDATEs and 
INSERTs, two indexes to be updated, and probably a SEQUENCE too.

If you don't, you have just thrown away centralised, optimised integrity 
checking, and will probably have to remember to do a slower SELECT before 
updating.

Certainly decoupling presentation from storage would be nice, but even before 
that generalised use of surrogate keys seems to me a knee-jerk reaction.


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Surrogate keys (Was: enums)

2006-01-14 Thread Michael Glaesemann


On Jan 13, 2006, at 21:42 , Leandro Guimarães Faria Corcete DUTRA wrote:


If you still declare the natural key(s) as UNIQUEs, you have just made
performance worse.  Now there are two keys to be checked on UPDATEs  
and

INSERTs, two indexes to be updated, and probably a SEQUENCE too.


For UPDATEs and INSERTs, the proper primary key also needs to be  
checked, but keys are used for more than just checking uniqueness:  
they're also often used in JOINs. Joining against a single integer  
I'd think it quite a different proposition (I'd think faster in terms  
of performance) than joining against, say, a text column or a  
composite key.


Michael Glaesemann
grzm myrealbox com




---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Surrogate keys (Was: enums)

2006-01-14 Thread Lukas Smith

Michael Glaesemann wrote:


On Jan 13, 2006, at 21:42 , Leandro Guimarães Faria Corcete DUTRA wrote:


If you still declare the natural key(s) as UNIQUEs, you have just made
performance worse.  Now there are two keys to be checked on UPDATEs and
INSERTs, two indexes to be updated, and probably a SEQUENCE too.


For UPDATEs and INSERTs, the proper primary key also needs to be 
checked, but keys are used for more than just checking uniqueness: 
they're also often used in JOINs. Joining against a single integer I'd 
think it quite a different proposition (I'd think faster in terms of 
performance) than joining against, say, a text column or a composite key.


Well this is a balancing decision. You certainly slow down inserts. You 
might also increase the stress on the table because you have to 
translate between the different keys. It also depends on the join type 
you end up doing. It also obviously depends on how large your original 
primary key is. However whatever your situation is: make sure you do not 
end up doing premature optimization.


regards,
Lukas

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Surrogate keys (Was: enums)

2006-01-14 Thread Rod Taylor
On Fri, 2006-01-13 at 12:42 +, Leandro Guimarães Faria Corcete DUTRA
wrote:
 Rod Taylor pg at rbt.ca writes:
 
  The basic idea is that most of us break out schemas by creating fake
  primary keys for the purpose of obtaining performance because using the
  proper primary key (single or multiple columns) is often very slow.
 
 This is one thing I simply can't understand.
 
 If you still declare the natural key(s) as UNIQUEs, you have just made 
 performance worse.  Now there are two keys to be checked on UPDATEs and 
 INSERTs, two indexes to be updated, and probably a SEQUENCE too.

Indeed. Using a surrogate key is not free and that is why it would be
something the DBA would specify during table creation.

The main goal would be to give the option of using a surrogate key
without being forced to expose it to the applications using the
database. It is a feature akin to table spaces in that it can help
performance but without the application or standard users knowing why.

-- 


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Surrogate keys (Was: enums)

2006-01-14 Thread Lukas Smith

Rod Taylor wrote:

If you still declare the natural key(s) as UNIQUEs, you have just made 
performance worse.  Now there are two keys to be checked on UPDATEs and 
INSERTs, two indexes to be updated, and probably a SEQUENCE too.


Indeed. Using a surrogate key is not free and that is why it would be
something the DBA would specify during table creation.

The main goal would be to give the option of using a surrogate key
without being forced to expose it to the applications using the
database. It is a feature akin to table spaces in that it can help
performance but without the application or standard users knowing why.


Just this morning my father came to me (he is getting into SQL now that 
he is retired) with an issue where a surrogate key probably makes sense. 
He is storing a tree of plant families that can get fairly deep. The 
primary key is the scientific name. In order to improve performance and 
get rid of the recursive lookups he currently does he now wants to use 
materialized paths (parent name/sub name/sub sub name). He decided 
not to go with nested paths since that makes it very hard to hand fix 
things in the tree structure. Obviously using the scientific name in the 
materialized paths can quickly give you a really wide column if you have 
a fairly deep tree. In that case it could be beneficial to introduce a 
surrogate key.


The only annoying bit is that he frequently needs to sync with an 
external database where they use no surrogate key so the import slows 
down because he needs to check if a surrogate key has been introduced 
for every given scientific name before writing to the database.


regards,
Lukas

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Surrogate keys (Was: enums)

2006-01-14 Thread mark
On Fri, Jan 13, 2006 at 12:42:55PM +, Leandro Guimarães Faria Corcete DUTRA 
wrote:
 Rod Taylor pg at rbt.ca writes:
  The basic idea is that most of us break out schemas by creating fake
  primary keys for the purpose of obtaining performance because using the
  proper primary key (single or multiple columns) is often very slow.
 This is one thing I simply can't understand.

 If you still declare the natural key(s) as UNIQUEs, you have just made 
 performance worse.  Now there are two keys to be checked on UPDATEs and 
 INSERTs, two indexes to be updated, and probably a SEQUENCE too.

Not to completely defend the practice - but in some applications,
INSERT is much less frequent than UPDATE, and that UPDATE requires a
unique check on the primary key and the surrogate key, as well as an
update, should be considered (and I believe is considered) a
PostgreSQL performance bug. It's undesirable and unnecessary behaviour
for the majority of uses (where they key does not change as a part of
the update).

 Certainly decoupling presentation from storage would be nice, but even before
 that generalised use of surrogate keys seems to me a knee-jerk reaction.

Yes, I agree. As per a previous thread, I'm one of those using it to
generalize my query / update implementation into common base code. I
have other reasons - but I confess to this being the real reason.

In my case, the cost of maintaining the code that queries / updates is
more expensive than the cost of having an extra unique index, and the
storage and performance impacts this has on my data. :-)

Is my primary reason good on its own, without the other more legitimate
justifications? It's good enough for me. I expect others to strongly
disagree.

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] message for constraint

2006-01-14 Thread Jaime Casanova
On 1/14/06, Peter Eisentraut [EMAIL PROTECTED] wrote:
 Jaime Casanova wrote:
  From time to time people ask me if there is a way to customize
  messages for constraints so they could be more informative to the
  user...

 What about this?

 = create table foo (fld int4 constraint fld must contain positive numbers 
 check (fld  0));
 CREATE TABLE
 = insert into foo values (-5);
 ERROR:  new row for relation foo violates check constraint fld must 
 contain positive numbers

 --
 Peter Eisentraut
 http://developer.postgresql.org/~petere/


ok, i didn't know you can use such names...

--
regards,
Jaime Casanova
(DBA: DataBase Aniquilator ;)

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] PG process architecture

2006-01-14 Thread Denis Lussier

EnterpriseDB is a clone AND a fork. :-) We work hard to free customers from Oracle vendor lock-in AND stay up with the latest releases and patches of Postgres AND don't break any Postgres native and/or ANSI functionality. 


--Denis Lussier
 Founder  CTO
 www.EnterpriseDB.com



On 1/11/06, Joshua D. Drake [EMAIL PROTECTED]
 wrote: 
EnterpriseDB is not a clone, it is a fork. That said, it is very coolthat theyhave created an Oracle look-a-like for much less but if you want PostgreSQL 
that isn't EnterpriseDB.


Re: [HACKERS] [PATCHES] Fix for running from admin account on win32

2006-01-14 Thread Magnus Hagander
Naturally, just a minute after sending the patch, I realised how it can
be done better ;)

If the job object code is moved to the postmaster, it'll work when not
running as a service as well. And there's no way to break out of the job
object -  it's just the other part.

Yes, I'll set up a new patch :-) Meanwhlie, if there are any other
comments on this one, I'm still interested in those.

//Magnus


 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of 
 Magnus Hagander
 Sent: Saturday, January 14, 2006 6:09 PM
 To: pgsql-patches@postgresql.org
 Subject: [PATCHES] Fix for running from admin account on win32
 
 Attached is a two part patch for the admin functionality on win32.
 
 The first part is a simple bugfix to 
 backend/port/win32/security.c. The function that checks if 
 the user has admin privileges didn't check it the SID in the 
 token was enabled or not. All actual access checks done by 
 the OS does check this, so we should too :-) This is required 
 for the second part of the patch to work, but also in some 
 scenarios with third-party tools that modify the token.
 
 
 The second part enables pg_ctl to give up admin privleges 
 when starting the server. This works for both standalone and 
 service, but only when running on Windows 2000 or newer. The 
 APIs simply didn't exist in NT4.
 pg_ctl still works in NT4, but is unable to give up 
 privileges. Since we still do the privilege check in the 
 postmaster, this is not a problem.
 This has to be implemented in pg_ctl, because if it's done 
 in-process it's possi ble to get the admin privs back. 
 
 It also implements a job object wrapper around all processes created.
 This only works when running as a service, because the job 
 object is destroyed when pg_ctl exits (it's automatically 
 destroyed when the last handle is closed). However, when 
 running as a service it increases security further by 
 preventing new processes from being started with a different 
 user, access to clipboard, windows restarting and desktop 
 access. It also limits further any chance of accessing admin 
 privileges, more than we have today.
 
 Finally, the job object provides an excellent point for 
 monitoring the server. It will contain aggregate statistics 
 of how many processes are running (or have been running), how 
 much CPU is being used (has been used), memory activity etc. 
 As a whole for postmaster+all children, not one a piece. This 
 functionality is all provided by default by the windows 
 performance monitor when you use job objects.
 
 
 It turned out the mingw headers *and* libraries were 
 incomplete wrt these functions, so I had to do it with 
 runtime loading of DLLs. Since I had to do this anyway, it 
 was trivial to do this for all the NT4 functions, and just 
 have it work there. So the discussion I started yesterday 
 about NT4 compatibility doesn't really apply to this case - 
 but it's still a good discussionto have I think.
 
 
 //Magnus
 
 
 D:\msys\1.0\local\pgsql\binpostmaster -D ..\data Execution 
 of PostgreSQL by a user with administrative permissions is 
 not permitted.
 The server must be started under an unprivileged user ID to 
 prevent possible system security compromises.  See the 
 documentation for more information on how to properly start 
 the server.
 
 D:\msys\1.0\local\pgsql\binpg_ctl -D ..\data start 
 postmaster starting
 
 D:\msys\1.0\local\pgsql\binLOG:  database system was shut down at
 2006-01-14 17
 :42:14 W. Europe Standard Time
 LOG:  checkpoint record is at 0/39FD88
 LOG:  redo record is at 0/39FD88; undo record is at 0/0; shutdown TRUE
 LOG:  next transaction ID: 582; next OID: 16389
 LOG:  next MultiXactId: 1; next MultiXactOffset: 0
 LOG:  database system is ready
 LOG:  transaction ID wrap limit is 2147484146, limited by 
 database postgres
 

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PATCHES] Fix for running from admin account on win32

2006-01-14 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 If the job object code is moved to the postmaster, it'll work when not
 running as a service as well.

I'd just as soon keep all that Windows-specific cruft in pg_ctl.
So I think the way you've got it set up is fine.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] [PATCHES] Fix for running from admin account on win32

2006-01-14 Thread Magnus Hagander
  If the job object code is moved to the postmaster, it'll 
 work when not 
  running as a service as well.
 
 I'd just as soon keep all that Windows-specific cruft in pg_ctl.
 So I think the way you've got it set up is fine.

Well, it'd all be localised to the backend/port/win32 directory of
course, except for a single call - where there is already windows
cruft.

But. I see your point. And the  vast majority of production
installations run as service anyway. So I won't spend any time making
those changes, I'll leave what's on -patches now.

//Magnus

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Surrogate keys (Was: enums)

2006-01-14 Thread Martijn van Oosterhout
On Sat, Jan 14, 2006 at 11:06:07AM -0500, [EMAIL PROTECTED] wrote:
 Not to completely defend the practice - but in some applications,
 INSERT is much less frequent than UPDATE, and that UPDATE requires a
 unique check on the primary key and the surrogate key, as well as an
 update, should be considered (and I believe is considered) a
 PostgreSQL performance bug. It's undesirable and unnecessary behaviour
 for the majority of uses (where they key does not change as a part of
 the update).

Unique check? An index is an index and when you do an UPDATE the new
tuple has to be added to the index. At this point it doesn't matter if
the index is unique or not, all indexes cost something.

Since after the UPDATE the tuple with that primary key appears two (or
more) times in the table, a check needs to be made that they don't
overlap timewise. Are you claiming you could avoid this check and still
guarentee correctness in the face of concurrent transactions?

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


signature.asc
Description: Digital signature


Re: [HACKERS] Surrogate keys (Was: enums)

2006-01-14 Thread mark
On Sat, Jan 14, 2006 at 09:33:39PM +0100, Martijn van Oosterhout wrote:
 On Sat, Jan 14, 2006 at 11:06:07AM -0500, [EMAIL PROTECTED] wrote:
  Not to completely defend the practice - but in some applications,
  INSERT is much less frequent than UPDATE, and that UPDATE requires a
  unique check on the primary key and the surrogate key, as well as an
  update, should be considered (and I believe is considered) a
  PostgreSQL performance bug. It's undesirable and unnecessary behaviour
  for the majority of uses (where they key does not change as a part of
  the update).
 Unique check? An index is an index and when you do an UPDATE the new
 tuple has to be added to the index. At this point it doesn't matter if
 the index is unique or not, all indexes cost something.

 Since after the UPDATE the tuple with that primary key appears two (or
 more) times in the table, a check needs to be made that they don't
 overlap timewise. Are you claiming you could avoid this check and still
 guarentee correctness in the face of concurrent transactions?

I'm claiming that I agree with this TODO item:

- Prevent index uniqueness checks when UPDATE does not modify the column

  Uniqueness (index) checks are done when updating a column
  even if the column is not modified by the UPDATE.

Definately, the check is unnecessary. If it was unique before we made
the change, we know it will be unique after we've made the change.
The check shouldn't be performed for the primary key, or for the
surrogate key, if neither of these keys are modified in any way.

Perhaps you are challenging my addition of the phrase as well as an
update, with a hint on my part, that I feel the update is unnecessary
as well. I may have been wrong to add these 5 words. The MVCC
implementation has numerous costs, and perhaps this is one of them
that cannot be avoided. :-(

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Warm-up cache may have its virtue

2006-01-14 Thread Qingqing Zhou

Qingqing Zhou [EMAIL PROTECTED] wrote

 I wonder if we should really implement file-system-cache-warmup strategy
 which we have discussed before. There are two natural good places to do
 this:

 (1) sequentail scan
 (2) bitmap index scan


For the sake of memory, there is a third place a warm-up cache or pre-read 
is beneficial (OS won't help us):
(3) xlog recovery

Regards,
Qingqing



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Surrogate keys (Was: enums)

2006-01-14 Thread Greg Stark

Leandro Guimarães Faria Corcete DUTRA [EMAIL PROTECTED] writes:

 Certainly decoupling presentation from storage would be nice, but even before 
 that generalised use of surrogate keys seems to me a knee-jerk reaction.

I hate knee-jerk reactions too, but just think of all the pain of people
dealing with databases where they used Social Security numbers for primary
keys. I would never use an attribute that represents some real-world datum as
a primary key any more.

In my experience there are very few occasions where I want a real non-sequence
generated primary key. I've never regretted having a sequence generated
primary key, and I've certainly had occasions to regret not having one.

-- 
greg


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org