Re: [HACKERS] File leak?

2004-06-14 Thread Heikki Linnakangas
On Sun, 13 Jun 2004, Tom Lane wrote:

 Bruce Momjian [EMAIL PROTECTED] writes:
  (viz, log at the instant of file creation, and the replayer would have
  to keep track of whether it sees the creating transaction commit and
  delete the file if not).

  I don't see how we could WAL log it because we don't fsync the WAL until
  our transaction completes, right, or are you thinking we would do a
  special fsync when we add the record?

 Right, we would have to XLogFlush the file-creation WAL record before we
 could actually create the file.  This is in line with the standard WAL
 rule: the WAL record must hit disk before the data file change it
 describes does.  Assuming that the filesystem fsync's the created inode
 immediately, that means we have to flush first.

I'm afraid that's not enough. Checkpoints spoil it, think:

1. CREATE TABLE foobar ...
2. INSERT 
3. checkpoint
4. crash

The replay would not see the file-creation WAL record.

We need some additional stash for the pending file-creations to make them
survive checkpoints.

 I'm not sure what the performance implications of this would be; it's
 likely that pushing the cost somewhere else would be better.

I don't think that file creation is that common for it to matter..

- Heikki


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Releasing 7.4.3 ...

2004-06-14 Thread Peter Eisentraut
Marc G. Fournier wrote:
 On Sun, 13 Jun 2004, Peter Eisentraut wrote:
  Marc G. Fournier wrote:
  Anyone else, please test the tar ball for any bug/nits ...
  specifically, Peter, can you check that I've built/included the
  right documentation?
 
  Try reading the list of supported platforms at the bottom of the
  INSTALL file...

 k, and that is supposed to tell me what?

There is supposed to be a table there; I just see unaligned gargage.  
The INSTALL file looks quite bad.


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] I just got it: PostgreSQL Application Server -- a

2004-06-14 Thread Mark Kirkwood

Joshua D. Drake wrote:

The PostgreSQL Enhanced Server (How's that name? Maybe we call it Zerver
and use PEZ?) idea is how to take the excellent core of PostgreSQL and
productize it in much the same way distributions take the Linux kernel and
may a GNU/Linux system.
 

It would seem to me that this is more correct in the commercial space. 
Of course I am biased but
what you are talking about sounds a whole lot like RedHat Enterprise 
versus Fedora etc

And Postgresql Inc, Command Prompt, Slony etc...
regards
Mark
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] [PATCHES] Compiling libpq with VisualC

2004-06-14 Thread Magnus Hagander
 What is the recommended way to create mutex objects 
 (CreateMutex) from
 Win32 libraries?  There must be a clean way like there is 
 in pthreads.
 
 
 
 A mutex is inherently a global object. CreateMutex(NULL, 
 FALSE, NULL) 
 will return a handle to an unowned mutex.
 
   
 
 That's not the problem. Under pthread, it's possible to 
 initialize a mutex from compile time:
 
 static pthread_mutex_t init_mutex = PTHREAD_MUTEX_INITIALIZER;
 
 This means that the mutex is immediately valid, no races with 
 the initialization. I couldn't find an equivalent Win32 feature.

AFAIK, there is no such thing on Win32. The clean way is probably to
rqeuire the library to export a function InitialyzeFooLibrary() that
does it (like Winsock does with requiring WSAStartup()).

To do something like it though, you can use a named mutex. Then doing,
in pseudocode:

if (CreateMutex(...,my_unique_mutex_name) == ERROR_ALREADY_EXISTS)
   OpenMutex(...,my_unique_mutex_name)

Assuming nobody closes the mutex between your attempt to create and open
(which shouldn't happen if you just ignore closing it until process
exit), this should be safe.

Store the HANDLE to the Mutex in TLS, and have each thread do the
create/open when it needs the mutex (e.g. wrap the wait on the mutex in
a function/macro that will create/open the mutex if it's
INVALID_HANDLE_VALUE, which you assign it to by default).


You need a unique name for the mutex, since it's not per-process but
per-sessino. But that can easily be constructed from the pid.

//Magnus


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [pgsql-hackers-win32] [HACKERS] [PATCHES] Compiling libpq with

2004-06-14 Thread Andreas Pflug
Magnus Hagander wrote:
What is the recommended way to create mutex objects 
   

(CreateMutex) from
   

Win32 libraries?  There must be a clean way like there is 
   

in pthreads.
   

  

   

A mutex is inherently a global object. CreateMutex(NULL, 
 

FALSE, NULL) 
   

will return a handle to an unowned mutex.

 

That's not the problem. Under pthread, it's possible to 
initialize a mutex from compile time:

   static pthread_mutex_t init_mutex = PTHREAD_MUTEX_INITIALIZER;
This means that the mutex is immediately valid, no races with 
the initialization. I couldn't find an equivalent Win32 feature.
   

AFAIK, there is no such thing on Win32. The clean way is probably to
rqeuire the library to export a function InitialyzeFooLibrary() that
does it (like Winsock does with requiring WSAStartup()).
To do something like it though, you can use a named mutex. Then doing,
in pseudocode:
if (CreateMutex(...,my_unique_mutex_name) == ERROR_ALREADY_EXISTS)
  OpenMutex(...,my_unique_mutex_name)
Assuming nobody closes the mutex between your attempt to create and open
(which shouldn't happen if you just ignore closing it until process
exit), this should be safe.
Store the HANDLE to the Mutex in TLS, and have each thread do the
create/open when it needs the mutex (e.g. wrap the wait on the mutex in
a function/macro that will create/open the mutex if it's
INVALID_HANDLE_VALUE, which you assign it to by default).
You need a unique name for the mutex, since it's not per-process but
per-sessino. But that can easily be constructed from the pid.
 

A libpq patch avoiding the InitializeFooLibrary() creating the mutex 
on-demand is in pgsql-patches already.

+#ifndef WIN32
static pthread_mutex_t singlethread_lock = PTHREAD_MUTEX_INITIALIZER;
+#else
+static pthread_mutex_t singlethread_lock;
+static long mutex_initialized = 0;
+if (!InterlockedExchange(mutex_initialized, 1L))
+pthread_mutex_init(singlethread_lock, NULL); // wraps 
CreateMutex(NULL,FALSE,NULL)
+#endif

Regards,
Andreas
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] I just got it: PostgreSQL Application Server -- a

2004-06-14 Thread pgsql

The PostgreSQL Enhanced Server (How's that name? Maybe we call it
 Zerver
and use PEZ?) idea is how to take the excellent core of PostgreSQL and
productize it in much the same way distributions take the Linux kernel
 and
may a GNU/Linux system.



 It would seem to me that this is more correct in the commercial space.
 Of course I am biased but
 what you are talking about sounds a whole lot like RedHat Enterprise
 versus Fedora etc

No, I don't think I agree. It does not need to be Commercial as it is
similar to Apache Jacarta too.

If you are going to do a complex project with PostgreSQL, you sort of have
a lot of construction ahead of you. Yea, it is a great SQL engine, but to
build a high speed web site, or virtually any complex project, you will
need a lot of add-ons.  Rather than have everyone duplate the effort of
finding the extensions, why not have a project with all this stuff
installed. AFAIK, and correct me if I'm wrong, having functions installed
doesn't affect performance.



---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Releasing 7.4.3 ...

2004-06-14 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
 The INSTALL file looks quite bad.

It's clearly been freshly generated.  The formatting does seem worse
than in previous versions, but it's not so bad I'd want to hold up
the release to fix it.

My guess is that the wrong version of lynx is being used to build it.
I remember being dissatisfied with the output of developer.pg.org's
version of lynx back when we were making this file manually.
(The version I have here is 2.8.5rel.1 (04 Feb 2004) and it seems to
do fine.)

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] File leak?

2004-06-14 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes:
 I'm afraid that's not enough. Checkpoints spoil it, think:

 1. CREATE TABLE foobar ...
 2. INSERT 
 3. checkpoint
 4. crash

 The replay would not see the file-creation WAL record.

Good point.  That makes it messy enough that we probably don't want to
do it that way.  Scan-for-unreferenced-files is looking a lot more
robust (although it has its own interesting race-condition issues if
you try to do it in a live system).

 I'm not sure what the performance implications of this would be; it's
 likely that pushing the cost somewhere else would be better.

 I don't think that file creation is that common for it to matter..

Maybe not for regular tables, but for temp tables I'm less convinced.
If we could do the unreferenced-file scan only at completion of a crash
recovery then it'd be zero cost in all normal paths ...

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Releasing 7.4.3 ...

2004-06-14 Thread Tom Lane
I wrote:
 My guess is that the wrong version of lynx is being used to build it.
 I remember being dissatisfied with the output of developer.pg.org's
 version of lynx back when we were making this file manually.
 (The version I have here is 2.8.5rel.1 (04 Feb 2004) and it seems to
 do fine.)

Actually, having just tried it, 2.8.5rel.1 generates output that is
nearly indistinguishable from what's in this morning's snapshot.
In particular I do not get the ASCII-art decoration for the table of
supported platforms that I see in the 7.4.2 version of INSTALL.
It comes out exactly like what's in the snapshot.

I'm now wondering about changes in the SGML stylesheets used to generate
the .html file we then send to lynx.  But I would definitely put this
in the category of something to investigate and fix later.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Weird 'bit' type behaviour

2004-06-14 Thread Tom Lane
Christopher Kings-Lynne [EMAIL PROTECTED] writes:
 Is there any reason for this behaviour:
 test=# select 1::bit;
   bit
 -
   0
 (1 row)

This is actually 1::int4::bit(1), and what you are getting is
the sign bit.  See previous discussions about int-to-bit conversion
and which part of the int we ought to take.

 What about these?

Again, it's a field-width issue.  These all default to bit(1) and
you're shifting out of the defined width.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Delaying the planning of unnamed statements until Bind

2004-06-14 Thread Tom Lane
Oliver Jowett [EMAIL PROTECTED] writes:
 I guess that ExprState does not live long enough to be useful.

Actually the opposite: it lasts too long, namely the entire execution of
a query.  I don't think there's any convenient way to reset it on the
timescale appropriate for STABLE values (ie, once per scan, as opposed
to once per query).

 How about introducing a function modifier that provides stronger 
 guarantees than STABLE, along the lines of immutable during execution 
 of a single SQL statement?

Why?

I suspect that if we did have two flavors of STABLE, we'd just have a
lot of people getting it wrong :-(.  A big advantage of the current
definition is exactly that it is pretty weak...

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[HACKERS] Passing typmod to cast functions (for int-to-bit casting)

2004-06-14 Thread Tom Lane
Chris K-L's recent complaint reminded me that we had been talking about
redefining int-to-bit(n) casting to take the rightmost n bits of the
integer value, instead of the leftmost n bits.  The implementation
reason why it works the way it does is that the existing cast function
effectively converts to bit(32), and then after that we cast to bit(n),
and the bitstring width-conversion transformation takes the leftmost
bits of the bitstring, which is per SQL spec.

The only convenient way I can see to handle this is to extend the cast
stuff so that the cast function can be passed an additional parameter
which is the target typmod.  Armed with that info, inttobit() could
align its output bits properly for the upcoming bitstring truncation.

Now that cast functions are selected through pg_cast, this should be a
fairly straightforward change.  Does anyone have a problem with it?
I'm not sure the functionality is actually useful for anything except
this one issue, but arguably it's a general-purpose mechanism...

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] File leak?

2004-06-14 Thread Gaetano Mendola
Bruce Momjian wrote:
Tom Lane wrote:
Now that I think about it, I believe Bruce recently removed this on my
advice; I was thinking that the problem shouldn't occur anymore now that

True.

we WAL-log file creation and deletion.  But actually the present form of
the WAL entries doesn't ensure that a file created by a transaction that
crashes before committing will go away, because file deletion actions
are only logged (and replayed) at transaction commit/abort.  So it
probably should go back in.  Or else we could add more WAL logging

Wording updated to:
* Remove unreferenced table files created by a transactions that were
  in-progress when the server crashed
I don't think is a good idea put the words: when the server crashed in a TODO
list, may be is better write: when the server is killed abruptly.
My 2 cents.
Regards
Gaetano Mendola


---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [HACKERS] Improving postgresql.conf

2004-06-14 Thread Gaetano Mendola
Scott Marlowe wrote:
On Fri, 2004-06-11 at 11:02, Bruce Momjian wrote:
Gaetano Mendola wrote:
[ PGP not available, raw data follows ]
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
Bruce Momjian wrote:
| Gaetano Mendola wrote:
|
|Bruce Momjian wrote:
|
|  I understand your points below.  However, the group has weighed in the
|  direction of clearly showing non-default values and not duplicating
|  documentation.  We can change that, but you will need more folks
|  agreeing with your direction.
|
|I don't remember the behaviour but tell me what happen if
|I comment out a value changing the value. Kill UP the postmater.
|Recommenting that value and now re killing the postmaster.
|
|I believe that postmaster will not run with the default value.
|Who will look the configuration file will not understand the right
|reality.
|
|
| If you comment a variable in postgresql.conf, it will use the
| default value.
That's not true at least with the version 7.4.2.
Try yourself, I did the experiment changing the cpu_tuple_cost and
commenting out the cpu_tuple_cost, after sending the SIGHUP to
postmaster the value remain: 0.005 that is not the default value at
all.
Oh, sorry, you are right.  Not sure if this is a bug or not.

This point has come up before, and I think it's intended behavior. 
Stopping and restarting the database will, of course, make it load the
defaults.
Yes and this doesn't help to understand how the postmaster is running
unless you use SHOW ALL.
Regards
Gaetano Mendola


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Releasing 7.4.3 ...

2004-06-14 Thread Marc G. Fournier
On Mon, 14 Jun 2004, Tom Lane wrote:
Peter Eisentraut [EMAIL PROTECTED] writes:
The INSTALL file looks quite bad.
It's clearly been freshly generated.  The formatting does seem worse
than in previous versions, but it's not so bad I'd want to hold up
the release to fix it.
My guess is that the wrong version of lynx is being used to build it.
I remember being dissatisfied with the output of developer.pg.org's
version of lynx back when we were making this file manually.
(The version I have here is 2.8.5rel.1 (04 Feb 2004) and it seems to
do fine.)
svr1# lynx --version
Lynx Version 2.8.5rel.1 (04 Feb 2004)
libwww-FM 2.14, SSL-MM 1.4.1, OpenSSL 0.9.7d
Built on freebsd4.9 Feb 28 2004 22:01:12

Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Delaying the planning of unnamed statements until Bind

2004-06-14 Thread Oliver Jowett
Tom Lane wrote:
Oliver Jowett [EMAIL PROTECTED] writes:
I guess that ExprState does not live long enough to be useful.

Actually the opposite: it lasts too long, namely the entire execution of
a query.  I don't think there's any convenient way to reset it on the
timescale appropriate for STABLE values (ie, once per scan, as opposed
to once per query).
I think you misunderstand what I was suggesting. Given your earlier 
clarification of what STABLE means, it isn't correct to mark expressions 
involving a STABLE function as constant-at-execution-time, so those 
results would not be cached. But there are still other expression trees 
that would benefit, e.g. those involving an IMMUTABLE function with 
parameterized arguments.

How about introducing a function modifier that provides stronger 
guarantees than STABLE, along the lines of immutable during execution 
of a single SQL statement?
Why?
It's not directly useful currently, as there's no expression caching 
going on. If there was expression caching, the stronger guarantees would 
allow you to cache a wider range of expressions.

I suspect that if we did have two flavors of STABLE, we'd just have a
lot of people getting it wrong :-(.  A big advantage of the current
definition is exactly that it is pretty weak...
It seems quite hard to build a STABLE function that doesn't also satisfy 
the stronger requirements. I can't think of how you'd do it as a SQL 
function at all, off the top of my head. What sort of function were you 
thinking of that is STABLE-safe but doesn't satisfy the stronger 
requirements?

-O
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Delaying the planning of unnamed statements until Bind

2004-06-14 Thread Tom Lane
Oliver Jowett [EMAIL PROTECTED] writes:
 But there are still other expression trees 
 that would benefit, e.g. those involving an IMMUTABLE function with 
 parameterized arguments.

Oh, you are thinking of some very-long-lived cache.  This has been
proposed and rejected before; it's just not apparent that the costs of
maintaining and searching such a cache are justified by the possible
benefits.  Most of the functions that actually appear in SQL commands
are cheap enough to evaluate that it'd not be worthwhile to do this at
all for them, ever --- the costs of executing datatype-specific
comparison functions to verify a hashtable hit would equal or exceed
the cost of evaluating the function.

There certainly are expensive user functions out there, and if we knew
which ones those were, it might be worth caching their values.  But we
don't presently have any way to identify such functions.  More, I'm not
convinced that very many of the ones that are that expensive can
reasonably be marked IMMUTABLE; an expensive function is likely one that
does database accesses.

 It seems quite hard to build a STABLE function that doesn't also satisfy 
 the stronger requirements. I can't think of how you'd do it as a SQL 
 function at all, off the top of my head. What sort of function were you 
 thinking of that is STABLE-safe but doesn't satisfy the stronger 
 requirements?

Anything at all that inspects database contents is probably STABLE and
not anything stronger, since it could potentially be affected by
intra-transaction updates.  (The definition of STABLE is partly
motivated by MVCC semantics, particularly the fact that updates executed
by a command only become visible at CommandCounterIncrement boundaries.)

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Frequently updated tables

2004-06-14 Thread pgsql
 [EMAIL PROTECTED] wrote:


I have been talking about two types of problems which are both based on
PostgreSQL's behavior with frequently updated tables.

Summary table: In the single row table system, you have to vacuum very
requently, and this affects performance.

Frequently updated tables: think about the session table for a website.
Each new user gets a new session row. Everytime they refresh or act in
 the
site, the row is updated. When they leave or their session times out, the
row is deleted. I wrote a RAM only session manager for PHP because
PostgreSQL couldn't handle the volume. (2000 hits a second)



 It would be interesting to see if the vacuum delay patch, fsm tuning +
 vacuum scheduling could have changed this situation. Clearly there is an
 issue here (hence a patch...), but ISTM that just as significant is the
 fact that it is difficult to know how to configure the various bits and
 pieces, and also difficult to know if it has been done optimally.

If you have an active site, with hundreds or thousands of hits a second,
vacuuming the table constantly is not practical.

I don't think anyone who has seriously looked at these issues has
concluded that PostgreSQL works fine in these cases. The question is
 what,
if anything, can be done? The frequent update issue really affects
PostgreSQL's acceptance in web applications, and one which MySQL seems to
do a better job.




 As an aside, I have had similar issues with DB2 and high update tables -
 lock escalations (locklist tuning needed). It is not just
 non-overwriting storage managers that need the magic tuning wand :-)

Funny, I've used DB2 for a few projects, but never for a web session
system. This is an interesting data point thanks.

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Frequently updated tables

2004-06-14 Thread Mark Kirkwood
[EMAIL PROTECTED] wrote:
I have been talking about two types of problems which are both based on
PostgreSQL's behavior with frequently updated tables.
Summary table: In the single row table system, you have to vacuum very
requently, and this affects performance.
Frequently updated tables: think about the session table for a website.
Each new user gets a new session row. Everytime they refresh or act in the
site, the row is updated. When they leave or their session times out, the
row is deleted. I wrote a RAM only session manager for PHP because
PostgreSQL couldn't handle the volume. (2000 hits a second)
 

It would be interesting to see if the vacuum delay patch, fsm tuning + 
vacuum scheduling could have changed this situation. Clearly there is an 
issue here (hence a patch...), but ISTM that just as significant is the 
fact that it is difficult to know how to configure the various bits and 
pieces, and also difficult to know if it has been done optimally.

If you have an active site, with hundreds or thousands of hits a second,
vacuuming the table constantly is not practical.
I don't think anyone who has seriously looked at these issues has
concluded that PostgreSQL works fine in these cases. The question is what,
if anything, can be done? The frequent update issue really affects
PostgreSQL's acceptance in web applications, and one which MySQL seems to
do a better job.
 

As an aside, I have had similar issues with DB2 and high update tables - 
lock escalations (locklist tuning needed). It is not just 
non-overwriting storage managers that need the magic tuning wand :-)

regards
Mark
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Coding question

2004-06-14 Thread Ramanujam H S Iyengar
Hi,
I think replacing the RELOID with TYPEOID should get you the required Datum 
...
later you may have to use

Form_pg_type pform = (Form_pg_type) GETSTRUCT(tuple);
to get a reference to the type tuple.
I have worked this way but iam not sure whether this is correct or not !!
If iam wrong some body can correct me !!
-Ramu
If typTup is of type Form_pg_type, is this use of ObjectIdGetDatum legal?
tuple = SearchSysCache(RELOID, ObjectIdGetDatum(typTup-typrelid),
  0, 0, 0);
If not, how do I turn -typrelid into an Oid type for safe passage
through ObjectIdGetDatum?
Chris
_
Contact brides  grooms FREE! http://www.shaadi.com/index.php?ptnr=hmltag 
Only on www.shaadi.com. Register now!

---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


[HACKERS] Spinlock assembly cleanup

2004-06-14 Thread Tom Lane
Pursuant to the gripes raised by Martin Pitt ---

I've consulted some gcc experts within Red Hat and come to the following
conclusions:

* We should consistently refer to the spinlock contents via a
read/write operand declared like +m(*lock).  This is consistent
with longstanding practice in the Linux kernel and therefore is unlikely
to get broken in future gcc releases.  The existing ports that use
matched input and output parameters will break, or at least draw nasty
warnings, in upcoming gcc releases.

* Not all of the ports currently declare *lock as an input operand,
but this seems rather dangerous to me; I think all should use +m.

* Some but not all the ports list memory as a clobbered operand.
The gcc manual saith

: If your assembler instruction modifies memory in an unpredictable
: fashion, add `memory' to the list of clobbered registers.  This will
: cause GNU CC to not keep memory values cached in registers across the
: assembler instruction.

Now as far as I can see, none of the spinlock sequences directly clobber
any memory other than the spinlock itself, and so (as long as the lock
is stated to be an output operand) one might think the memory clobber
marking to be excessive.  However, I am thinking it is actually a good
idea and we ought to add the marking to all ports, not remove it.  The
thought is that what we are actually using the spinlock for is to guard
access to values in shared memory, and therefore the act of waiting for
a spinlock can be seen as waiting for other memory variables to assume
values they didn't necessarily have last time we looked.  If gcc caches
shared variables in registers across a spinlock acquisition, the code is
broken.

The alternative to doing this would be to always use volatile pointers
to access shared memory, but I don't want to do that --- in the first
place it's notationally cumbersome, and in the second place it would
hurt performance unnecessarily.  Within straight-line code that holds a
spinlock there is no reason to treat shared memory as volatile.  It's
only when crossing a spinlock boundary that you must reload from memory,
and that seems to be exactly what the memory modifier declares for us.

(I am assuming here that marking the asm fragment volatile does not
necessarily do what the memory modifier does; I can't see anything in
the gcc docs that claims volatile includes the effects of memory.)

So I'd like to make all the gcc-asm fragments for spinlocks follow these
rules.  Comments?

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Accelerating aggregates

2004-06-14 Thread Greg Stark
Steve Atkins [EMAIL PROTECTED] writes:

 On Fri, Jun 11, 2004 at 12:17:57PM -0400, Greg Stark wrote:

  no, read committed would see any other updates that have been committed since
  the start of your transaction. 
 
 Uhm... only updates within the current transaction. 

No, read committed refers to being able to read any updates that are
committed, even if they were committed after the start of your transaction:


For example:



db= begin;
BEGIN

db= begin;
BEGIN

db= insert into test values (1);
INSERT 6725927 1
db= select * from test;
 a 
---
 1
(1 row)

db= select * from test;
 a 
---
(0 rows)

db= commit;
COMMIT

db= select * from test;
 a 
---
 1
(1 row)


-- 
greg


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


[HACKERS] building rpms from source rpm's

2004-06-14 Thread Dave Cramer
I am getting the following error:
error: parse error in expression
error: /usr/src/redhat/SPECS/postgresql-7.4.2-1PGDG.spec:98:
parseExpressionBoolean returns -1
error: Package has no %description: postgresql

When I execute 

rpmbuild --rebuild  --define 'build9x 1' --define 'tcldevel 0' --define
'perl 0' --define 'tcl 0' --define 'tkpkg 0' --define 'test 0' --define
'newintarray 1' --define 'kerberos 0'
  -vv  SRPMS/postgresql-7.4.2-1PGDG.src.rpm

Dave
-- 
Dave Cramer
519 939 0336
ICQ # 14675561


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[HACKERS] 7.4.3 running a bit late ...

2004-06-14 Thread Marc G. Fournier
got everything built this evening, just giving it a bit of time to 
propogate out to the mirrors before announcing ...


Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Passing typmod to cast functions (for int-to-bit casting)

2004-06-14 Thread Christopher Kings-Lynne
Now that cast functions are selected through pg_cast, this should be a
fairly straightforward change.  Does anyone have a problem with it?
I'm not sure the functionality is actually useful for anything except
this one issue, but arguably it's a general-purpose mechanism...
Does that help with making CREATE TABLE AS SELECT pick up varchar 
lengths, etc.?

Chris
---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Delaying the planning of unnamed statements until Bind

2004-06-14 Thread Oliver Jowett
Tom Lane wrote:
Oliver Jowett [EMAIL PROTECTED] writes:
But there are still other expression trees 
that would benefit, e.g. those involving an IMMUTABLE function with 
parameterized arguments.
Oh, you are thinking of some very-long-lived cache.  This has been
proposed and rejected before; it's just not apparent that the costs of
maintaining and searching such a cache are justified by the possible
benefits.  Most of the functions that actually appear in SQL commands
are cheap enough to evaluate that it'd not be worthwhile to do this at
all for them, ever --- the costs of executing datatype-specific
comparison functions to verify a hashtable hit would equal or exceed
the cost of evaluating the function.
I was actually thinking of only caching when the structure of the 
expression tree means that it is known to be constant across some period 
-- e.g. (non-subquery) Params remain constant across a single query 
execution. So there's no hashtable or datatype-specific comparisons 
involved. The cache only lives as long as you can guarantee the 
expression tree remains constant. I'm just trying to work out the best 
lifetime for the cache (or, equivalently, the types of expression tree 
that can be marks as cacheable).

There certainly are expensive user functions out there, and if we knew
which ones those were, it might be worth caching their values.  But we
don't presently have any way to identify such functions.  More, I'm not
convinced that very many of the ones that are that expensive can
reasonably be marked IMMUTABLE; an expensive function is likely one that
does database accesses.
Fair enough. My concern is that if every query is parameterized by an 
interface layer (as I'm planning to do with the JDBC driver), that one 
expensive IMMUTABLE function is going to bite the application. So I'd 
still like to see some sort of caching so that those few queries don't 
run significantly slower, assuming that the cost of caching in the 
common case is minor.

It seems quite hard to build a STABLE function that doesn't also satisfy 
the stronger requirements. I can't think of how you'd do it as a SQL 
function at all, off the top of my head. What sort of function were you 
thinking of that is STABLE-safe but doesn't satisfy the stronger 
requirements?

Anything at all that inspects database contents is probably STABLE and
not anything stronger, since it could potentially be affected by
intra-transaction updates.  (The definition of STABLE is partly
motivated by MVCC semantics, particularly the fact that updates executed
by a command only become visible at CommandCounterIncrement boundaries.)
Does that mean that these functions satisfy IMMUTABLE until the next 
CommandCounterIncrement? That sounds more cacheable than the 
tablescan-based definition.

-O
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Spinlock assembly cleanup

2004-06-14 Thread Bruce Momjian

Sounds good to me.  Consistencyis important because it lets us fix
problems across all cpu types.  I am not 100% excited about the memory
part because it invalidates all register memory values, not just the
shared memory location.  We are specifically accessing a memory address
as part of the ASM, so I don't see how it could pull that memory value
from a register behind our back.

---

Tom Lane wrote:
 Pursuant to the gripes raised by Martin Pitt ---
 
 I've consulted some gcc experts within Red Hat and come to the following
 conclusions:
 
 * We should consistently refer to the spinlock contents via a
 read/write operand declared like +m(*lock).  This is consistent
 with longstanding practice in the Linux kernel and therefore is unlikely
 to get broken in future gcc releases.  The existing ports that use
 matched input and output parameters will break, or at least draw nasty
 warnings, in upcoming gcc releases.
 
 * Not all of the ports currently declare *lock as an input operand,
 but this seems rather dangerous to me; I think all should use +m.
 
 * Some but not all the ports list memory as a clobbered operand.
 The gcc manual saith
 
 : If your assembler instruction modifies memory in an unpredictable
 : fashion, add `memory' to the list of clobbered registers.  This will
 : cause GNU CC to not keep memory values cached in registers across the
 : assembler instruction.
 
 Now as far as I can see, none of the spinlock sequences directly clobber
 any memory other than the spinlock itself, and so (as long as the lock
 is stated to be an output operand) one might think the memory clobber
 marking to be excessive.  However, I am thinking it is actually a good
 idea and we ought to add the marking to all ports, not remove it.  The
 thought is that what we are actually using the spinlock for is to guard
 access to values in shared memory, and therefore the act of waiting for
 a spinlock can be seen as waiting for other memory variables to assume
 values they didn't necessarily have last time we looked.  If gcc caches
 shared variables in registers across a spinlock acquisition, the code is
 broken.
 
 The alternative to doing this would be to always use volatile pointers
 to access shared memory, but I don't want to do that --- in the first
 place it's notationally cumbersome, and in the second place it would
 hurt performance unnecessarily.  Within straight-line code that holds a
 spinlock there is no reason to treat shared memory as volatile.  It's
 only when crossing a spinlock boundary that you must reload from memory,
 and that seems to be exactly what the memory modifier declares for us.
 
 (I am assuming here that marking the asm fragment volatile does not
 necessarily do what the memory modifier does; I can't see anything in
 the gcc docs that claims volatile includes the effects of memory.)
 
 So I'd like to make all the gcc-asm fragments for spinlocks follow these
 rules.  Comments?
 
   regards, tom lane
 
 ---(end of broadcast)---
 TIP 5: Have you checked our extensive FAQ?
 
http://www.postgresql.org/docs/faqs/FAQ.html
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Passing typmod to cast functions (for int-to-bit casting)

2004-06-14 Thread Tom Lane
Christopher Kings-Lynne [EMAIL PROTECTED] writes:
 Does that help with making CREATE TABLE AS SELECT pick up varchar 
 lengths, etc.?

Not any better than it does now, no ... but AFAIK simple cases work okay
on that.  What's your gripe exactly?

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Passing typmod to cast functions (for int-to-bit casting)

2004-06-14 Thread Christopher Kings-Lynne
Not any better than it does now, no ... but AFAIK simple cases work okay
on that.  What's your gripe exactly?
No gripe - was just pointing out a situation that might be improved 
slightly be carrying around typmod info.

Chris
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


[HACKERS] OWNER TO on all objects

2004-06-14 Thread Christopher Kings-Lynne
Hi,
This is a preview patch - DON'T COMMIT IT TO HEAD!
What I've done in this patch is add the following:
ALTER AGGREGATE / OWNER TO
ALTER CONVERSION / OWNER TO
ALTER FUNCTION / OWNER TO
ALTER OPERATOR / OWNER TO
ALTER OPERATOR CLASS / OWNER TO
ALTER SCHEMA / OWNER TO
ALTER TYPE / OWNER TO
That means we can change the owner of all objects.
Next, I modified pg_dump to remove all SET SESSION AUTHORIZATION 
commands for object creation.  (I left them in on the COPY commands).

Then I made it so that pg_dump will output an OWNER TO statement after 
every object creation.

This means that pg_dump can dump a restorable dump in cases where, say, 
a super user created a language, and then had their superuser privs 
dropped, or when a user has created a table, but has then had their 
create privileges removed.

At the moment, i'm happy with how it dumps and reloads the regression 
database, and i'm working on adding tests for all OWNER TO in the 
regression suite.  Full doc updates are already included.

Please review and give me feedback!  The patch is large, but not at all 
complex :)

Some questions:
* Do we need the set session auth for COPY commands still?
* Are there any subtle implications of changing owners that I haven't 
realised?  I know that it will affect SECURITY DEFINER for functions, 
but I put that in the docs.

* Is doing this ok: ObjectIdGetDatum(typTup-typrelid)
* Is there any reason there is no RENAME TO command for operators?
Chris


owner.tar.gz
Description: GNU Zip compressed data

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] building rpms from source rpm's

2004-06-14 Thread Greg Stark

Dave Cramer [EMAIL PROTECTED] writes:

 I am getting the following error:
 error: parse error in expression

What does this have to do with accelerating aggregates?

Please don't start new threads by responding to existing threads.

-- 
greg


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] OWNER TO on all objects

2004-06-14 Thread Tom Lane
Christopher Kings-Lynne [EMAIL PROTECTED] writes:
 Then I made it so that pg_dump will output an OWNER TO statement after 
 every object creation.

Perhaps better to put these out towards the end of the dump, not right
after the creation of the object?  Or is that what you're doing?

I would envision the safest procedure as creating all objects, loading
all data, etc, then all ALTER OWNERs, then all GRANT/REVOKEs.

 * Do we need the set session auth for COPY commands still?

Not if you still own the table while loading into it (see above point).

However, this all assumes a complete dump/restore.  Consider data-only
restores.  Consider partial restores using pg_restore's options for
that.  What happens then?  It'd likely be appropriate to issue set
session auth during scenarios involving pre-existing objects.

 * Is there any reason there is no RENAME TO command for operators?

Lack of round tuits, no doubt.

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Spinlock assembly cleanup

2004-06-14 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 I am not 100% excited about the memory
 part because it invalidates all register memory values, not just the
 shared memory location.

That's exactly the point.

 We are specifically accessing a memory address
 as part of the ASM, so I don't see how it could pull that memory value
 from a register behind our back.

We need to prevent gcc from caching values of *other* memory locations
behind our backs.  Consider code on the order of

spinlockacquire(lock);
while (sharedvar == 0)
{
spinlockrelease(lock);
// we expect someone else to acquire lock and
// set sharedvar here...
spinlockacquire(lock);
}

If this is all inline code, and we didn't declare sharedvar as volatile,
then the compiler would be within its rights to assume that sharedvar
doesn't change, hence load it into a register once and not reload it
from shared memory after reacquiring the spinlock.  This will of course
fail to do what we want it to.

We haven't seen failures of this kind because our direct use of
spinlocks is pretty constricted, and (for example) LWLockAcquire is
careful to use a volatile pointer for all accesses to the LWLock
fields.  However this is inefficient: while it owns the spinlock,
LWLockAcquire doesn't really need to treat all the other fields
as volatile, so there are probably a few wasted loads in there.
And the requirement for using volatile pointers is something that is
likely to bite us if we start using spinlocks directly in more parts
of the code.  Not to mention that if anyone cranks up the optimization
level to the point where LWLockAcquire can get inlined into other
functions, those functions will break immediately, because they are
not saying volatile for every shared memory access.

So I think it's best to fix it as part of the TAS asm definition.

As things stand at the moment, there's not going to be any efficiency
loss, because LWLockAcquire brute-forces the same result with a volatile
pointer, and its callers aren't going to expect to be able to cache
global variables across a function call anyway.  In the long run when
you consider global inlining of functions, the possibility is there for
the efficiency to be better not worse if we do things this way.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] 7.4.3 running a bit late ...

2004-06-14 Thread Tom Lane
Marc G. Fournier [EMAIL PROTECTED] writes:
 got everything built this evening, just giving it a bit of time to 
 propogate out to the mirrors before announcing ...

Final tarball looks good from here ... but don't forget to update
the symlinks at the top level of the ftp site.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] OWNER TO on all objects

2004-06-14 Thread Christopher Kings-Lynne
Perhaps better to put these out towards the end of the dump, not right
after the creation of the object?  Or is that what you're doing?
I just inserted the ALTER OWNER statement between the CREATE and the 
GRANTs.  Why do you want them at the end of the dump?

I would envision the safest procedure as creating all objects, loading
all data, etc, then all ALTER OWNERs, then all GRANT/REVOKEs.
I don't yet understand your reasoning for wanting this all at the end...
Not if you still own the table while loading into it (see above point).
Can we not load as superuser?
However, this all assumes a complete dump/restore.  Consider data-only
restores.  Consider partial restores using pg_restore's options for
that.  What happens then?  It'd likely be appropriate to issue set
session auth during scenarios involving pre-existing objects.
OK, i will test all those situations...  What scenarios did you have in 
mind?

Chris
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] OWNER TO on all objects

2004-06-14 Thread Tom Lane
Christopher Kings-Lynne [EMAIL PROTECTED] writes:
 I just inserted the ALTER OWNER statement between the CREATE and the 
 GRANTs.  Why do you want them at the end of the dump?

So that the initial owner is still owner when he does COPY, ALTER TABLE
ADD PRIMARY KEY, etc etc.  Else you're gonna have problems.

The regression database is next to useless as a testbed for this,
btw, since all the objects in it are owned by the superuser anyway.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] OWNER TO on all objects

2004-06-14 Thread Christopher Kings-Lynne
So that the initial owner is still owner when he does COPY, ALTER TABLE
ADD PRIMARY KEY, etc etc.  Else you're gonna have problems.
I was thinking of doing all COPY and ALTER as superuser as well...
Or are you trying to make it work when run as non-super?  Which is won't 
since ALTER OWNER will require superuser.

ie. the entire script always runs as a single user, most usefully a 
superuser.

Chris
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly