Re: [HACKERS] File leak?
On Sun, 13 Jun 2004, Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: (viz, log at the instant of file creation, and the replayer would have to keep track of whether it sees the creating transaction commit and delete the file if not). I don't see how we could WAL log it because we don't fsync the WAL until our transaction completes, right, or are you thinking we would do a special fsync when we add the record? Right, we would have to XLogFlush the file-creation WAL record before we could actually create the file. This is in line with the standard WAL rule: the WAL record must hit disk before the data file change it describes does. Assuming that the filesystem fsync's the created inode immediately, that means we have to flush first. I'm afraid that's not enough. Checkpoints spoil it, think: 1. CREATE TABLE foobar ... 2. INSERT 3. checkpoint 4. crash The replay would not see the file-creation WAL record. We need some additional stash for the pending file-creations to make them survive checkpoints. I'm not sure what the performance implications of this would be; it's likely that pushing the cost somewhere else would be better. I don't think that file creation is that common for it to matter.. - Heikki ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Releasing 7.4.3 ...
Marc G. Fournier wrote: On Sun, 13 Jun 2004, Peter Eisentraut wrote: Marc G. Fournier wrote: Anyone else, please test the tar ball for any bug/nits ... specifically, Peter, can you check that I've built/included the right documentation? Try reading the list of supported platforms at the bottom of the INSTALL file... k, and that is supposed to tell me what? There is supposed to be a table there; I just see unaligned gargage. The INSTALL file looks quite bad. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] I just got it: PostgreSQL Application Server -- a
Joshua D. Drake wrote: The PostgreSQL Enhanced Server (How's that name? Maybe we call it Zerver and use PEZ?) idea is how to take the excellent core of PostgreSQL and productize it in much the same way distributions take the Linux kernel and may a GNU/Linux system. It would seem to me that this is more correct in the commercial space. Of course I am biased but what you are talking about sounds a whole lot like RedHat Enterprise versus Fedora etc And Postgresql Inc, Command Prompt, Slony etc... regards Mark ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] [PATCHES] Compiling libpq with VisualC
What is the recommended way to create mutex objects (CreateMutex) from Win32 libraries? There must be a clean way like there is in pthreads. A mutex is inherently a global object. CreateMutex(NULL, FALSE, NULL) will return a handle to an unowned mutex. That's not the problem. Under pthread, it's possible to initialize a mutex from compile time: static pthread_mutex_t init_mutex = PTHREAD_MUTEX_INITIALIZER; This means that the mutex is immediately valid, no races with the initialization. I couldn't find an equivalent Win32 feature. AFAIK, there is no such thing on Win32. The clean way is probably to rqeuire the library to export a function InitialyzeFooLibrary() that does it (like Winsock does with requiring WSAStartup()). To do something like it though, you can use a named mutex. Then doing, in pseudocode: if (CreateMutex(...,my_unique_mutex_name) == ERROR_ALREADY_EXISTS) OpenMutex(...,my_unique_mutex_name) Assuming nobody closes the mutex between your attempt to create and open (which shouldn't happen if you just ignore closing it until process exit), this should be safe. Store the HANDLE to the Mutex in TLS, and have each thread do the create/open when it needs the mutex (e.g. wrap the wait on the mutex in a function/macro that will create/open the mutex if it's INVALID_HANDLE_VALUE, which you assign it to by default). You need a unique name for the mutex, since it's not per-process but per-sessino. But that can easily be constructed from the pid. //Magnus ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [pgsql-hackers-win32] [HACKERS] [PATCHES] Compiling libpq with
Magnus Hagander wrote: What is the recommended way to create mutex objects (CreateMutex) from Win32 libraries? There must be a clean way like there is in pthreads. A mutex is inherently a global object. CreateMutex(NULL, FALSE, NULL) will return a handle to an unowned mutex. That's not the problem. Under pthread, it's possible to initialize a mutex from compile time: static pthread_mutex_t init_mutex = PTHREAD_MUTEX_INITIALIZER; This means that the mutex is immediately valid, no races with the initialization. I couldn't find an equivalent Win32 feature. AFAIK, there is no such thing on Win32. The clean way is probably to rqeuire the library to export a function InitialyzeFooLibrary() that does it (like Winsock does with requiring WSAStartup()). To do something like it though, you can use a named mutex. Then doing, in pseudocode: if (CreateMutex(...,my_unique_mutex_name) == ERROR_ALREADY_EXISTS) OpenMutex(...,my_unique_mutex_name) Assuming nobody closes the mutex between your attempt to create and open (which shouldn't happen if you just ignore closing it until process exit), this should be safe. Store the HANDLE to the Mutex in TLS, and have each thread do the create/open when it needs the mutex (e.g. wrap the wait on the mutex in a function/macro that will create/open the mutex if it's INVALID_HANDLE_VALUE, which you assign it to by default). You need a unique name for the mutex, since it's not per-process but per-sessino. But that can easily be constructed from the pid. A libpq patch avoiding the InitializeFooLibrary() creating the mutex on-demand is in pgsql-patches already. +#ifndef WIN32 static pthread_mutex_t singlethread_lock = PTHREAD_MUTEX_INITIALIZER; +#else +static pthread_mutex_t singlethread_lock; +static long mutex_initialized = 0; +if (!InterlockedExchange(mutex_initialized, 1L)) +pthread_mutex_init(singlethread_lock, NULL); // wraps CreateMutex(NULL,FALSE,NULL) +#endif Regards, Andreas ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] I just got it: PostgreSQL Application Server -- a
The PostgreSQL Enhanced Server (How's that name? Maybe we call it Zerver and use PEZ?) idea is how to take the excellent core of PostgreSQL and productize it in much the same way distributions take the Linux kernel and may a GNU/Linux system. It would seem to me that this is more correct in the commercial space. Of course I am biased but what you are talking about sounds a whole lot like RedHat Enterprise versus Fedora etc No, I don't think I agree. It does not need to be Commercial as it is similar to Apache Jacarta too. If you are going to do a complex project with PostgreSQL, you sort of have a lot of construction ahead of you. Yea, it is a great SQL engine, but to build a high speed web site, or virtually any complex project, you will need a lot of add-ons. Rather than have everyone duplate the effort of finding the extensions, why not have a project with all this stuff installed. AFAIK, and correct me if I'm wrong, having functions installed doesn't affect performance. ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] Releasing 7.4.3 ...
Peter Eisentraut [EMAIL PROTECTED] writes: The INSTALL file looks quite bad. It's clearly been freshly generated. The formatting does seem worse than in previous versions, but it's not so bad I'd want to hold up the release to fix it. My guess is that the wrong version of lynx is being used to build it. I remember being dissatisfied with the output of developer.pg.org's version of lynx back when we were making this file manually. (The version I have here is 2.8.5rel.1 (04 Feb 2004) and it seems to do fine.) regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] File leak?
Heikki Linnakangas [EMAIL PROTECTED] writes: I'm afraid that's not enough. Checkpoints spoil it, think: 1. CREATE TABLE foobar ... 2. INSERT 3. checkpoint 4. crash The replay would not see the file-creation WAL record. Good point. That makes it messy enough that we probably don't want to do it that way. Scan-for-unreferenced-files is looking a lot more robust (although it has its own interesting race-condition issues if you try to do it in a live system). I'm not sure what the performance implications of this would be; it's likely that pushing the cost somewhere else would be better. I don't think that file creation is that common for it to matter.. Maybe not for regular tables, but for temp tables I'm less convinced. If we could do the unreferenced-file scan only at completion of a crash recovery then it'd be zero cost in all normal paths ... regards, tom lane ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] Releasing 7.4.3 ...
I wrote: My guess is that the wrong version of lynx is being used to build it. I remember being dissatisfied with the output of developer.pg.org's version of lynx back when we were making this file manually. (The version I have here is 2.8.5rel.1 (04 Feb 2004) and it seems to do fine.) Actually, having just tried it, 2.8.5rel.1 generates output that is nearly indistinguishable from what's in this morning's snapshot. In particular I do not get the ASCII-art decoration for the table of supported platforms that I see in the 7.4.2 version of INSTALL. It comes out exactly like what's in the snapshot. I'm now wondering about changes in the SGML stylesheets used to generate the .html file we then send to lynx. But I would definitely put this in the category of something to investigate and fix later. regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Weird 'bit' type behaviour
Christopher Kings-Lynne [EMAIL PROTECTED] writes: Is there any reason for this behaviour: test=# select 1::bit; bit - 0 (1 row) This is actually 1::int4::bit(1), and what you are getting is the sign bit. See previous discussions about int-to-bit conversion and which part of the int we ought to take. What about these? Again, it's a field-width issue. These all default to bit(1) and you're shifting out of the defined width. regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] Delaying the planning of unnamed statements until Bind
Oliver Jowett [EMAIL PROTECTED] writes: I guess that ExprState does not live long enough to be useful. Actually the opposite: it lasts too long, namely the entire execution of a query. I don't think there's any convenient way to reset it on the timescale appropriate for STABLE values (ie, once per scan, as opposed to once per query). How about introducing a function modifier that provides stronger guarantees than STABLE, along the lines of immutable during execution of a single SQL statement? Why? I suspect that if we did have two flavors of STABLE, we'd just have a lot of people getting it wrong :-(. A big advantage of the current definition is exactly that it is pretty weak... regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
[HACKERS] Passing typmod to cast functions (for int-to-bit casting)
Chris K-L's recent complaint reminded me that we had been talking about redefining int-to-bit(n) casting to take the rightmost n bits of the integer value, instead of the leftmost n bits. The implementation reason why it works the way it does is that the existing cast function effectively converts to bit(32), and then after that we cast to bit(n), and the bitstring width-conversion transformation takes the leftmost bits of the bitstring, which is per SQL spec. The only convenient way I can see to handle this is to extend the cast stuff so that the cast function can be passed an additional parameter which is the target typmod. Armed with that info, inttobit() could align its output bits properly for the upcoming bitstring truncation. Now that cast functions are selected through pg_cast, this should be a fairly straightforward change. Does anyone have a problem with it? I'm not sure the functionality is actually useful for anything except this one issue, but arguably it's a general-purpose mechanism... regards, tom lane ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] File leak?
Bruce Momjian wrote: Tom Lane wrote: Now that I think about it, I believe Bruce recently removed this on my advice; I was thinking that the problem shouldn't occur anymore now that True. we WAL-log file creation and deletion. But actually the present form of the WAL entries doesn't ensure that a file created by a transaction that crashes before committing will go away, because file deletion actions are only logged (and replayed) at transaction commit/abort. So it probably should go back in. Or else we could add more WAL logging Wording updated to: * Remove unreferenced table files created by a transactions that were in-progress when the server crashed I don't think is a good idea put the words: when the server crashed in a TODO list, may be is better write: when the server is killed abruptly. My 2 cents. Regards Gaetano Mendola ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Improving postgresql.conf
Scott Marlowe wrote: On Fri, 2004-06-11 at 11:02, Bruce Momjian wrote: Gaetano Mendola wrote: [ PGP not available, raw data follows ] -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Bruce Momjian wrote: | Gaetano Mendola wrote: | |Bruce Momjian wrote: | | I understand your points below. However, the group has weighed in the | direction of clearly showing non-default values and not duplicating | documentation. We can change that, but you will need more folks | agreeing with your direction. | |I don't remember the behaviour but tell me what happen if |I comment out a value changing the value. Kill UP the postmater. |Recommenting that value and now re killing the postmaster. | |I believe that postmaster will not run with the default value. |Who will look the configuration file will not understand the right |reality. | | | If you comment a variable in postgresql.conf, it will use the | default value. That's not true at least with the version 7.4.2. Try yourself, I did the experiment changing the cpu_tuple_cost and commenting out the cpu_tuple_cost, after sending the SIGHUP to postmaster the value remain: 0.005 that is not the default value at all. Oh, sorry, you are right. Not sure if this is a bug or not. This point has come up before, and I think it's intended behavior. Stopping and restarting the database will, of course, make it load the defaults. Yes and this doesn't help to understand how the postmaster is running unless you use SHOW ALL. Regards Gaetano Mendola ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] Releasing 7.4.3 ...
On Mon, 14 Jun 2004, Tom Lane wrote: Peter Eisentraut [EMAIL PROTECTED] writes: The INSTALL file looks quite bad. It's clearly been freshly generated. The formatting does seem worse than in previous versions, but it's not so bad I'd want to hold up the release to fix it. My guess is that the wrong version of lynx is being used to build it. I remember being dissatisfied with the output of developer.pg.org's version of lynx back when we were making this file manually. (The version I have here is 2.8.5rel.1 (04 Feb 2004) and it seems to do fine.) svr1# lynx --version Lynx Version 2.8.5rel.1 (04 Feb 2004) libwww-FM 2.14, SSL-MM 1.4.1, OpenSSL 0.9.7d Built on freebsd4.9 Feb 28 2004 22:01:12 Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: [EMAIL PROTECTED] Yahoo!: yscrappy ICQ: 7615664 ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] Delaying the planning of unnamed statements until Bind
Tom Lane wrote: Oliver Jowett [EMAIL PROTECTED] writes: I guess that ExprState does not live long enough to be useful. Actually the opposite: it lasts too long, namely the entire execution of a query. I don't think there's any convenient way to reset it on the timescale appropriate for STABLE values (ie, once per scan, as opposed to once per query). I think you misunderstand what I was suggesting. Given your earlier clarification of what STABLE means, it isn't correct to mark expressions involving a STABLE function as constant-at-execution-time, so those results would not be cached. But there are still other expression trees that would benefit, e.g. those involving an IMMUTABLE function with parameterized arguments. How about introducing a function modifier that provides stronger guarantees than STABLE, along the lines of immutable during execution of a single SQL statement? Why? It's not directly useful currently, as there's no expression caching going on. If there was expression caching, the stronger guarantees would allow you to cache a wider range of expressions. I suspect that if we did have two flavors of STABLE, we'd just have a lot of people getting it wrong :-(. A big advantage of the current definition is exactly that it is pretty weak... It seems quite hard to build a STABLE function that doesn't also satisfy the stronger requirements. I can't think of how you'd do it as a SQL function at all, off the top of my head. What sort of function were you thinking of that is STABLE-safe but doesn't satisfy the stronger requirements? -O ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Delaying the planning of unnamed statements until Bind
Oliver Jowett [EMAIL PROTECTED] writes: But there are still other expression trees that would benefit, e.g. those involving an IMMUTABLE function with parameterized arguments. Oh, you are thinking of some very-long-lived cache. This has been proposed and rejected before; it's just not apparent that the costs of maintaining and searching such a cache are justified by the possible benefits. Most of the functions that actually appear in SQL commands are cheap enough to evaluate that it'd not be worthwhile to do this at all for them, ever --- the costs of executing datatype-specific comparison functions to verify a hashtable hit would equal or exceed the cost of evaluating the function. There certainly are expensive user functions out there, and if we knew which ones those were, it might be worth caching their values. But we don't presently have any way to identify such functions. More, I'm not convinced that very many of the ones that are that expensive can reasonably be marked IMMUTABLE; an expensive function is likely one that does database accesses. It seems quite hard to build a STABLE function that doesn't also satisfy the stronger requirements. I can't think of how you'd do it as a SQL function at all, off the top of my head. What sort of function were you thinking of that is STABLE-safe but doesn't satisfy the stronger requirements? Anything at all that inspects database contents is probably STABLE and not anything stronger, since it could potentially be affected by intra-transaction updates. (The definition of STABLE is partly motivated by MVCC semantics, particularly the fact that updates executed by a command only become visible at CommandCounterIncrement boundaries.) regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Frequently updated tables
[EMAIL PROTECTED] wrote: I have been talking about two types of problems which are both based on PostgreSQL's behavior with frequently updated tables. Summary table: In the single row table system, you have to vacuum very requently, and this affects performance. Frequently updated tables: think about the session table for a website. Each new user gets a new session row. Everytime they refresh or act in the site, the row is updated. When they leave or their session times out, the row is deleted. I wrote a RAM only session manager for PHP because PostgreSQL couldn't handle the volume. (2000 hits a second) It would be interesting to see if the vacuum delay patch, fsm tuning + vacuum scheduling could have changed this situation. Clearly there is an issue here (hence a patch...), but ISTM that just as significant is the fact that it is difficult to know how to configure the various bits and pieces, and also difficult to know if it has been done optimally. If you have an active site, with hundreds or thousands of hits a second, vacuuming the table constantly is not practical. I don't think anyone who has seriously looked at these issues has concluded that PostgreSQL works fine in these cases. The question is what, if anything, can be done? The frequent update issue really affects PostgreSQL's acceptance in web applications, and one which MySQL seems to do a better job. As an aside, I have had similar issues with DB2 and high update tables - lock escalations (locklist tuning needed). It is not just non-overwriting storage managers that need the magic tuning wand :-) Funny, I've used DB2 for a few projects, but never for a web session system. This is an interesting data point thanks. ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] Frequently updated tables
[EMAIL PROTECTED] wrote: I have been talking about two types of problems which are both based on PostgreSQL's behavior with frequently updated tables. Summary table: In the single row table system, you have to vacuum very requently, and this affects performance. Frequently updated tables: think about the session table for a website. Each new user gets a new session row. Everytime they refresh or act in the site, the row is updated. When they leave or their session times out, the row is deleted. I wrote a RAM only session manager for PHP because PostgreSQL couldn't handle the volume. (2000 hits a second) It would be interesting to see if the vacuum delay patch, fsm tuning + vacuum scheduling could have changed this situation. Clearly there is an issue here (hence a patch...), but ISTM that just as significant is the fact that it is difficult to know how to configure the various bits and pieces, and also difficult to know if it has been done optimally. If you have an active site, with hundreds or thousands of hits a second, vacuuming the table constantly is not practical. I don't think anyone who has seriously looked at these issues has concluded that PostgreSQL works fine in these cases. The question is what, if anything, can be done? The frequent update issue really affects PostgreSQL's acceptance in web applications, and one which MySQL seems to do a better job. As an aside, I have had similar issues with DB2 and high update tables - lock escalations (locklist tuning needed). It is not just non-overwriting storage managers that need the magic tuning wand :-) regards Mark ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Coding question
Hi, I think replacing the RELOID with TYPEOID should get you the required Datum ... later you may have to use Form_pg_type pform = (Form_pg_type) GETSTRUCT(tuple); to get a reference to the type tuple. I have worked this way but iam not sure whether this is correct or not !! If iam wrong some body can correct me !! -Ramu If typTup is of type Form_pg_type, is this use of ObjectIdGetDatum legal? tuple = SearchSysCache(RELOID, ObjectIdGetDatum(typTup-typrelid), 0, 0, 0); If not, how do I turn -typrelid into an Oid type for safe passage through ObjectIdGetDatum? Chris _ Contact brides grooms FREE! http://www.shaadi.com/index.php?ptnr=hmltag Only on www.shaadi.com. Register now! ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] Spinlock assembly cleanup
Pursuant to the gripes raised by Martin Pitt --- I've consulted some gcc experts within Red Hat and come to the following conclusions: * We should consistently refer to the spinlock contents via a read/write operand declared like +m(*lock). This is consistent with longstanding practice in the Linux kernel and therefore is unlikely to get broken in future gcc releases. The existing ports that use matched input and output parameters will break, or at least draw nasty warnings, in upcoming gcc releases. * Not all of the ports currently declare *lock as an input operand, but this seems rather dangerous to me; I think all should use +m. * Some but not all the ports list memory as a clobbered operand. The gcc manual saith : If your assembler instruction modifies memory in an unpredictable : fashion, add `memory' to the list of clobbered registers. This will : cause GNU CC to not keep memory values cached in registers across the : assembler instruction. Now as far as I can see, none of the spinlock sequences directly clobber any memory other than the spinlock itself, and so (as long as the lock is stated to be an output operand) one might think the memory clobber marking to be excessive. However, I am thinking it is actually a good idea and we ought to add the marking to all ports, not remove it. The thought is that what we are actually using the spinlock for is to guard access to values in shared memory, and therefore the act of waiting for a spinlock can be seen as waiting for other memory variables to assume values they didn't necessarily have last time we looked. If gcc caches shared variables in registers across a spinlock acquisition, the code is broken. The alternative to doing this would be to always use volatile pointers to access shared memory, but I don't want to do that --- in the first place it's notationally cumbersome, and in the second place it would hurt performance unnecessarily. Within straight-line code that holds a spinlock there is no reason to treat shared memory as volatile. It's only when crossing a spinlock boundary that you must reload from memory, and that seems to be exactly what the memory modifier declares for us. (I am assuming here that marking the asm fragment volatile does not necessarily do what the memory modifier does; I can't see anything in the gcc docs that claims volatile includes the effects of memory.) So I'd like to make all the gcc-asm fragments for spinlocks follow these rules. Comments? regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] Accelerating aggregates
Steve Atkins [EMAIL PROTECTED] writes: On Fri, Jun 11, 2004 at 12:17:57PM -0400, Greg Stark wrote: no, read committed would see any other updates that have been committed since the start of your transaction. Uhm... only updates within the current transaction. No, read committed refers to being able to read any updates that are committed, even if they were committed after the start of your transaction: For example: db= begin; BEGIN db= begin; BEGIN db= insert into test values (1); INSERT 6725927 1 db= select * from test; a --- 1 (1 row) db= select * from test; a --- (0 rows) db= commit; COMMIT db= select * from test; a --- 1 (1 row) -- greg ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[HACKERS] building rpms from source rpm's
I am getting the following error: error: parse error in expression error: /usr/src/redhat/SPECS/postgresql-7.4.2-1PGDG.spec:98: parseExpressionBoolean returns -1 error: Package has no %description: postgresql When I execute rpmbuild --rebuild --define 'build9x 1' --define 'tcldevel 0' --define 'perl 0' --define 'tcl 0' --define 'tkpkg 0' --define 'test 0' --define 'newintarray 1' --define 'kerberos 0' -vv SRPMS/postgresql-7.4.2-1PGDG.src.rpm Dave -- Dave Cramer 519 939 0336 ICQ # 14675561 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] 7.4.3 running a bit late ...
got everything built this evening, just giving it a bit of time to propogate out to the mirrors before announcing ... Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: [EMAIL PROTECTED] Yahoo!: yscrappy ICQ: 7615664 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] Passing typmod to cast functions (for int-to-bit casting)
Now that cast functions are selected through pg_cast, this should be a fairly straightforward change. Does anyone have a problem with it? I'm not sure the functionality is actually useful for anything except this one issue, but arguably it's a general-purpose mechanism... Does that help with making CREATE TABLE AS SELECT pick up varchar lengths, etc.? Chris ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] Delaying the planning of unnamed statements until Bind
Tom Lane wrote: Oliver Jowett [EMAIL PROTECTED] writes: But there are still other expression trees that would benefit, e.g. those involving an IMMUTABLE function with parameterized arguments. Oh, you are thinking of some very-long-lived cache. This has been proposed and rejected before; it's just not apparent that the costs of maintaining and searching such a cache are justified by the possible benefits. Most of the functions that actually appear in SQL commands are cheap enough to evaluate that it'd not be worthwhile to do this at all for them, ever --- the costs of executing datatype-specific comparison functions to verify a hashtable hit would equal or exceed the cost of evaluating the function. I was actually thinking of only caching when the structure of the expression tree means that it is known to be constant across some period -- e.g. (non-subquery) Params remain constant across a single query execution. So there's no hashtable or datatype-specific comparisons involved. The cache only lives as long as you can guarantee the expression tree remains constant. I'm just trying to work out the best lifetime for the cache (or, equivalently, the types of expression tree that can be marks as cacheable). There certainly are expensive user functions out there, and if we knew which ones those were, it might be worth caching their values. But we don't presently have any way to identify such functions. More, I'm not convinced that very many of the ones that are that expensive can reasonably be marked IMMUTABLE; an expensive function is likely one that does database accesses. Fair enough. My concern is that if every query is parameterized by an interface layer (as I'm planning to do with the JDBC driver), that one expensive IMMUTABLE function is going to bite the application. So I'd still like to see some sort of caching so that those few queries don't run significantly slower, assuming that the cost of caching in the common case is minor. It seems quite hard to build a STABLE function that doesn't also satisfy the stronger requirements. I can't think of how you'd do it as a SQL function at all, off the top of my head. What sort of function were you thinking of that is STABLE-safe but doesn't satisfy the stronger requirements? Anything at all that inspects database contents is probably STABLE and not anything stronger, since it could potentially be affected by intra-transaction updates. (The definition of STABLE is partly motivated by MVCC semantics, particularly the fact that updates executed by a command only become visible at CommandCounterIncrement boundaries.) Does that mean that these functions satisfy IMMUTABLE until the next CommandCounterIncrement? That sounds more cacheable than the tablescan-based definition. -O ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Spinlock assembly cleanup
Sounds good to me. Consistencyis important because it lets us fix problems across all cpu types. I am not 100% excited about the memory part because it invalidates all register memory values, not just the shared memory location. We are specifically accessing a memory address as part of the ASM, so I don't see how it could pull that memory value from a register behind our back. --- Tom Lane wrote: Pursuant to the gripes raised by Martin Pitt --- I've consulted some gcc experts within Red Hat and come to the following conclusions: * We should consistently refer to the spinlock contents via a read/write operand declared like +m(*lock). This is consistent with longstanding practice in the Linux kernel and therefore is unlikely to get broken in future gcc releases. The existing ports that use matched input and output parameters will break, or at least draw nasty warnings, in upcoming gcc releases. * Not all of the ports currently declare *lock as an input operand, but this seems rather dangerous to me; I think all should use +m. * Some but not all the ports list memory as a clobbered operand. The gcc manual saith : If your assembler instruction modifies memory in an unpredictable : fashion, add `memory' to the list of clobbered registers. This will : cause GNU CC to not keep memory values cached in registers across the : assembler instruction. Now as far as I can see, none of the spinlock sequences directly clobber any memory other than the spinlock itself, and so (as long as the lock is stated to be an output operand) one might think the memory clobber marking to be excessive. However, I am thinking it is actually a good idea and we ought to add the marking to all ports, not remove it. The thought is that what we are actually using the spinlock for is to guard access to values in shared memory, and therefore the act of waiting for a spinlock can be seen as waiting for other memory variables to assume values they didn't necessarily have last time we looked. If gcc caches shared variables in registers across a spinlock acquisition, the code is broken. The alternative to doing this would be to always use volatile pointers to access shared memory, but I don't want to do that --- in the first place it's notationally cumbersome, and in the second place it would hurt performance unnecessarily. Within straight-line code that holds a spinlock there is no reason to treat shared memory as volatile. It's only when crossing a spinlock boundary that you must reload from memory, and that seems to be exactly what the memory modifier declares for us. (I am assuming here that marking the asm fragment volatile does not necessarily do what the memory modifier does; I can't see anything in the gcc docs that claims volatile includes the effects of memory.) So I'd like to make all the gcc-asm fragments for spinlocks follow these rules. Comments? regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] Passing typmod to cast functions (for int-to-bit casting)
Christopher Kings-Lynne [EMAIL PROTECTED] writes: Does that help with making CREATE TABLE AS SELECT pick up varchar lengths, etc.? Not any better than it does now, no ... but AFAIK simple cases work okay on that. What's your gripe exactly? regards, tom lane ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Passing typmod to cast functions (for int-to-bit casting)
Not any better than it does now, no ... but AFAIK simple cases work okay on that. What's your gripe exactly? No gripe - was just pointing out a situation that might be improved slightly be carrying around typmod info. Chris ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
[HACKERS] OWNER TO on all objects
Hi, This is a preview patch - DON'T COMMIT IT TO HEAD! What I've done in this patch is add the following: ALTER AGGREGATE / OWNER TO ALTER CONVERSION / OWNER TO ALTER FUNCTION / OWNER TO ALTER OPERATOR / OWNER TO ALTER OPERATOR CLASS / OWNER TO ALTER SCHEMA / OWNER TO ALTER TYPE / OWNER TO That means we can change the owner of all objects. Next, I modified pg_dump to remove all SET SESSION AUTHORIZATION commands for object creation. (I left them in on the COPY commands). Then I made it so that pg_dump will output an OWNER TO statement after every object creation. This means that pg_dump can dump a restorable dump in cases where, say, a super user created a language, and then had their superuser privs dropped, or when a user has created a table, but has then had their create privileges removed. At the moment, i'm happy with how it dumps and reloads the regression database, and i'm working on adding tests for all OWNER TO in the regression suite. Full doc updates are already included. Please review and give me feedback! The patch is large, but not at all complex :) Some questions: * Do we need the set session auth for COPY commands still? * Are there any subtle implications of changing owners that I haven't realised? I know that it will affect SECURITY DEFINER for functions, but I put that in the docs. * Is doing this ok: ObjectIdGetDatum(typTup-typrelid) * Is there any reason there is no RENAME TO command for operators? Chris owner.tar.gz Description: GNU Zip compressed data ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] building rpms from source rpm's
Dave Cramer [EMAIL PROTECTED] writes: I am getting the following error: error: parse error in expression What does this have to do with accelerating aggregates? Please don't start new threads by responding to existing threads. -- greg ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] OWNER TO on all objects
Christopher Kings-Lynne [EMAIL PROTECTED] writes: Then I made it so that pg_dump will output an OWNER TO statement after every object creation. Perhaps better to put these out towards the end of the dump, not right after the creation of the object? Or is that what you're doing? I would envision the safest procedure as creating all objects, loading all data, etc, then all ALTER OWNERs, then all GRANT/REVOKEs. * Do we need the set session auth for COPY commands still? Not if you still own the table while loading into it (see above point). However, this all assumes a complete dump/restore. Consider data-only restores. Consider partial restores using pg_restore's options for that. What happens then? It'd likely be appropriate to issue set session auth during scenarios involving pre-existing objects. * Is there any reason there is no RENAME TO command for operators? Lack of round tuits, no doubt. regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Spinlock assembly cleanup
Bruce Momjian [EMAIL PROTECTED] writes: I am not 100% excited about the memory part because it invalidates all register memory values, not just the shared memory location. That's exactly the point. We are specifically accessing a memory address as part of the ASM, so I don't see how it could pull that memory value from a register behind our back. We need to prevent gcc from caching values of *other* memory locations behind our backs. Consider code on the order of spinlockacquire(lock); while (sharedvar == 0) { spinlockrelease(lock); // we expect someone else to acquire lock and // set sharedvar here... spinlockacquire(lock); } If this is all inline code, and we didn't declare sharedvar as volatile, then the compiler would be within its rights to assume that sharedvar doesn't change, hence load it into a register once and not reload it from shared memory after reacquiring the spinlock. This will of course fail to do what we want it to. We haven't seen failures of this kind because our direct use of spinlocks is pretty constricted, and (for example) LWLockAcquire is careful to use a volatile pointer for all accesses to the LWLock fields. However this is inefficient: while it owns the spinlock, LWLockAcquire doesn't really need to treat all the other fields as volatile, so there are probably a few wasted loads in there. And the requirement for using volatile pointers is something that is likely to bite us if we start using spinlocks directly in more parts of the code. Not to mention that if anyone cranks up the optimization level to the point where LWLockAcquire can get inlined into other functions, those functions will break immediately, because they are not saying volatile for every shared memory access. So I think it's best to fix it as part of the TAS asm definition. As things stand at the moment, there's not going to be any efficiency loss, because LWLockAcquire brute-forces the same result with a volatile pointer, and its callers aren't going to expect to be able to cache global variables across a function call anyway. In the long run when you consider global inlining of functions, the possibility is there for the efficiency to be better not worse if we do things this way. regards, tom lane ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] 7.4.3 running a bit late ...
Marc G. Fournier [EMAIL PROTECTED] writes: got everything built this evening, just giving it a bit of time to propogate out to the mirrors before announcing ... Final tarball looks good from here ... but don't forget to update the symlinks at the top level of the ftp site. regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] OWNER TO on all objects
Perhaps better to put these out towards the end of the dump, not right after the creation of the object? Or is that what you're doing? I just inserted the ALTER OWNER statement between the CREATE and the GRANTs. Why do you want them at the end of the dump? I would envision the safest procedure as creating all objects, loading all data, etc, then all ALTER OWNERs, then all GRANT/REVOKEs. I don't yet understand your reasoning for wanting this all at the end... Not if you still own the table while loading into it (see above point). Can we not load as superuser? However, this all assumes a complete dump/restore. Consider data-only restores. Consider partial restores using pg_restore's options for that. What happens then? It'd likely be appropriate to issue set session auth during scenarios involving pre-existing objects. OK, i will test all those situations... What scenarios did you have in mind? Chris ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] OWNER TO on all objects
Christopher Kings-Lynne [EMAIL PROTECTED] writes: I just inserted the ALTER OWNER statement between the CREATE and the GRANTs. Why do you want them at the end of the dump? So that the initial owner is still owner when he does COPY, ALTER TABLE ADD PRIMARY KEY, etc etc. Else you're gonna have problems. The regression database is next to useless as a testbed for this, btw, since all the objects in it are owned by the superuser anyway. regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] OWNER TO on all objects
So that the initial owner is still owner when he does COPY, ALTER TABLE ADD PRIMARY KEY, etc etc. Else you're gonna have problems. I was thinking of doing all COPY and ALTER as superuser as well... Or are you trying to make it work when run as non-super? Which is won't since ALTER OWNER will require superuser. ie. the entire script always runs as a single user, most usefully a superuser. Chris ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly