Re: [HACKERS] Posix Shared Mem patch
On Jun 27, 2012, at 7:34 AM, Robert Haas wrote: On Wed, Jun 27, 2012 at 12:00 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: So, here's a patch. Instead of using POSIX shmem, I just took the expedient of using mmap() to map a block of MAP_SHARED|MAP_ANONYMOUS memory. The sysv shm is still allocated, but it's just a copy of PGShmemHeader; the real shared memory is the anonymous block. This won't work if EXEC_BACKEND is defined so it just falls back on straight sysv shm in that case. Um. I hadn't thought about the EXEC_BACKEND interaction, but that seems like a bit of a showstopper. I would not like to give up the ability to debug EXEC_BACKEND mode on Unixen. Would Posix shmem help with that at all? Why did you choose not to use the Posix API, anyway? It seemed more complicated. If we use the POSIX API, we've got to have code to find a non-colliding name for the shm, and we've got to arrange to clean it up at process exit. Anonymous shm doesn't require a name and goes away automatically when it's no longer in use. With respect to EXEC_BACKEND, I wasn't proposing to kill it, just to make it continue to use a full-sized sysv shm. I solved this by unlinking the posix shared memory segment immediately after creation. The file descriptor to the shared memory is inherited, so, by definition, only the postmaster children can access the memory. This ensures that shared memory cleanup is immediate after the postmaster and all children close, as well. The fcntl locking is not required to protect the posix shared memory- it can protect itself. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Posix Shared Mem patch
On Jun 26, 2012, at 5:44 PM, Josh Berkus wrote: On that, I used to be of the opinion that this is a good compromise (a small amount of interlock space, plus mostly posix shmem), but I've heard since then (I think via AgentM indirectly, but I'm not sure) that there are cases where even the small SysV segment can cause problems -- notably when other software tweaks shared memory settings on behalf of a user, but only leaves just-enough for the software being installed. This is most likely on platforms that don't have a high SysV shmem limit by default, so installers all feel the prerogative to increase the limit, but there's no great answer for how to compose a series of such installations. It only takes one installer that says whatever, I'm just catenating stuff to sysctl.conf that works for me to sabotage Postgres' ability to start. Personally, I see this as rather an extreme case, and aside from AgentM himself, have never run into it before. Certainly it would be useful to not need SysV RAM at all, but it's more important to get a working patch for 9.3. This can be trivially reproduced if one runs an old (SysV shared memory-based) postgresql alongside a potentially newer postgresql with a smaller SysV segment. This can occur with applications that bundle postgresql as part of the app. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Posix Shared Mem patch
On Jun 26, 2012, at 6:12 PM, Daniel Farina wrote: (Emphasis mine). I don't think that -hackers at the time gave the zero-shmem rationale much weight (I also was not that happy about the safety mechanism of that patch), but upon more reflection (and taking into account *other* software that may mangle shmem settings) I think it's something at least worth thinking about again one more time. What killed the patch was an attachment to the deemed-less-safe strategy for avoiding bogus shmem attachments already in it, but I don't seem to recall anyone putting a whole lot of thought at the time into the zero-shmem case from what I could read on the list, because a small interlock with nattach seemed good-enough. I'm simply suggesting that for additional benefits it may be worth thinking about getting around nattach and thus SysV shmem, especially with regard to safety, in an open-ended way. Maybe there's a solution (like Robert's FIFO suggestion?) that is not too onerous and can satisfy everyone. I solved this via fcntl locking. I also set up gdb to break in critical regions to test the interlock and I found no flaw in the design. More eyes would be welcome, of course. https://github.com/agentm/postgres/tree/posix_shmem Cheers, M
Re: [HACKERS] Posix Shared Mem patch
On 06/26/2012 07:30 PM, Tom Lane wrote: A.M. age...@themactionfaction.com writes: On Jun 26, 2012, at 6:12 PM, Daniel Farina wrote: I'm simply suggesting that for additional benefits it may be worth thinking about getting around nattach and thus SysV shmem, especially with regard to safety, in an open-ended way. I solved this via fcntl locking. No, you didn't, because fcntl locks aren't inherited by child processes. Too bad, because they'd be a great solution otherwise. You claimed this last time and I replied: http://archives.postgresql.org/pgsql-hackers/2011-04/msg00656.php I address this race condition by ensuring that a lock-holding violator is the postmaster or a postmaster child. If such a condition is detected, the child exits immediately without touching the shared memory. POSIX shmem is inherited via file descriptors. This is possible because the locking API allows one to request which PID violates the lock. The child expects the lock to be held and checks that the PID is the parent. If the lock is not held, that means that the postmaster is dead, so the child exits immediately. Cheers, M
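The F_GETLK check described above -- asking the kernel which PID would block an exclusive lock -- can be sketched as follows. This is a minimal illustration of the mechanism, not the patch itself; the file path and helper name are invented:

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

/* Sketch: the parent (standing in for the postmaster) holds a read
 * lock; a second process probes with F_GETLK, which reports the PID of
 * the conflicting lock holder. Note that fcntl locks are NOT inherited
 * across fork(), which is why the child must probe for itself. */
static pid_t demo_lock_owner(void)
{
    char path[] = "/tmp/pg_lock_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;

    struct flock fl;
    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_RDLCK;             /* shared lock on the whole file */
    fl.l_whence = SEEK_SET;
    if (fcntl(fd, F_SETLK, &fl) < 0)
        return -1;

    int pipefd[2];
    pipe(pipefd);
    pid_t child = fork();
    if (child == 0)
    {
        /* a would-be second postmaster asks who would block a write lock */
        int fd2 = open(path, O_RDWR);
        struct flock probe;
        memset(&probe, 0, sizeof(probe));
        probe.l_type = F_WRLCK;
        probe.l_whence = SEEK_SET;
        fcntl(fd2, F_GETLK, &probe);  /* fills in probe.l_pid on conflict */
        write(pipefd[1], &probe.l_pid, sizeof(pid_t));
        _exit(0);
    }
    pid_t owner = -1;
    read(pipefd[0], &owner, sizeof(pid_t));
    waitpid(child, NULL, 0);
    unlink(path);
    return owner;                    /* PID of the lock holder (the parent) */
}
```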
Re: [HACKERS] Posix Shared Mem patch
On 06/26/2012 07:15 PM, Alvaro Herrera wrote: Excerpts from Tom Lane's message of mar jun 26 18:58:45 -0400 2012: Even if you actively try to configure the shmem settings to exactly fill shmmax (which I concede some installation scripts might do), it's going to be hard to do because of the 8K granularity of the main knob, shared_buffers. Actually it's very easy -- just try to start postmaster on a system with not enough shmmax and it will tell you how much shmem it wants. Then copy that number verbatim in the config file. This might fail on picky systems such as MacOSX that require some exact multiple or power of some other parameter, but it works fine on Linux. Except that we have to account for other installers. A user can install an application in the future which clobbers the value and then the original application will fail to run. The options to get the first app working are: a) to re-install the first app (potentially preventing the second app from running) b) to have the first app detect the failure and readjust the value (guessing what it should be) and potentially forcing a reboot c) to have the user manually adjust the value and potentially force a reboot The failure usually gets blamed on the first application. That's why we had to nuke SysV shmem. Cheers, M
Re: [HACKERS] pg_upgrade --logfile option documentation
On Mar 7, 2012, at 11:39 PM, Bruce Momjian wrote: On Thu, Mar 01, 2012 at 09:06:10PM -0500, Bruce Momjian wrote: OK, combining your and Robert's ideas, how about I have pg_upgrade write the server log to a file, and the pg_dump output to a file (with its stderr), and if pg_upgrade fails, I report the failure and mention those files. If pg_upgrade succeeds, I remove the files? pg_upgrade already creates temporary files that it removes on completion. OK, I have completed a rework of pg_upgrade logging. pg_upgrade had 4 logging options, -g, -G, -l, and -v, and still it wasn't possible to get useful logging. :-( What I have done with this patch is to remove -g, -G, and -l, and unconditionally write to 4 log files in the current directory (in addition to the 3 SQL files I already create). If pg_upgrade succeeds, the files are removed, but if it fails (or if the new -r/retain option is used), the files remain. Here is a sample failure when I create a plpgsql function in the old server, but truncate plpgsql.so in the new server: It looks like the patch will overwrite the logs in the current working directory, for example, if pg_upgrade is run twice in the same place. Is that intentional? I had imagined that the logs would have been dumped in the /tmp directory so that one can compare results if the first pg_upgrade run had been errant. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade --logfile option documentation
On Mar 8, 2012, at 10:00 AM, Bruce Momjian wrote: Yes. I was afraid that continually appending to a log file on every run would be too confusing. I could do only appends, or number the log files, but those seemed confusing. the /tmp directory so that one can compare results if the first pg_upgrade run had been errant. You would have to copy the file to a new name before re-running pg_upgrade. The only reason I truncate them on start is that I am appending to them in many places in the code, and it was easier to just truncate them on start rather than to remember where I first write to them. mktemps?
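The mktemp suggestion above could look something like this -- a hypothetical sketch only (the template prefix is invented, and pg_upgrade does not actually do this): each run gets a uniquely named log file, so a second run cannot clobber the first run's logs.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical: create a per-run log file with a unique name via
 * mkstemp(), which atomically creates the file and returns an fd. */
static int make_run_log(char *path_out, size_t len)
{
    char template_buf[] = "/tmp/pg_upgrade_server_XXXXXX";
    int fd = mkstemp(template_buf);
    if (fd < 0)
        return -1;
    strncpy(path_out, template_buf, len - 1);
    path_out[len - 1] = '\0';
    return fd;
}
```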
Re: [HACKERS] pg_upgrade --logfile option documentation
On Mar 8, 2012, at 4:37 PM, Peter Eisentraut wrote: On tor, 2012-03-08 at 10:06 -0500, A.M. wrote: The only reason I truncate them on start is that I am appending to them in many places in the code, and it was easier to just truncate them on start rather than to remember where I first write to them. mktemps? I don't want to see some tool unconditionally writing files (log or otherwise) with unpredictable names. That would make it impossible to clean up in a wrapper script. The point of writing temp files to the /tmp/ directory is that they don't need to be cleaned up. You really prefer having log files written to your current working directory? I don't know of any utility that pollutes the cwd like that- it seems like an easy way to forget where one left the log files. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade --logfile option documentation
On Feb 29, 2012, at 6:02 PM, Bruce Momjian wrote: On Wed, Feb 29, 2012 at 04:34:24PM -0500, Robert Haas wrote: On Tue, Feb 28, 2012 at 9:45 PM, Bruce Momjian br...@momjian.us wrote: OK, I have implemented both Robert's and Àlvaro's ideas in my patch. I only add the .old suffix to pg_controldata when link mode is used, and I now do it after the schema has been created (the most common failure case for pg_upgrade), and just before we actually link files --- both very good ideas. Thanks for working on this. I think this will be a significant usability improvement. Glad I got such good feedback and ideas. Any ideas about improving the error reporting more generally, so that when reloading the dump fails, the user can easily see what went belly-up, even if they didn't use -l? The only idea I have is to write the psql log to a temporary file and report the last X lines from the file in case of failure. Does that help? Perhaps pg_upgrade can print the path to the temp file containing the log and instruct the user to look there for more detail. Cheers, M
Re: [HACKERS] leakproof
On Feb 26, 2012, at 10:39 AM, Peter Eisentraut wrote: On ons, 2012-02-22 at 10:56 -0500, Andrew Dunstan wrote: The trouble with leakproof is that it doesn't point to what it is that's not leaking, which is information rather than memory, as many might imagine (and I did) without further hints. I'm not sure any single English word would be as descriptive as I'd like. Well, we have RETURNS NULL ON NULL INPUT, so maybe DOES NOT LEAK INFORMATION. ;-) If you are willing to go full length, then the computer science term is referential transparency, no? http://en.wikipedia.org/wiki/Referential_transparency_(computer_science) So a function could be described as REFERENTIALLY TRANSPARENT. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: GUC_REPORT for protocol tunables was: Re: [HACKERS] Optimize binary serialization format of arrays with fixed size elements
On Jan 23, 2012, at 2:49 PM, Tom Lane wrote: Marko Kreen mark...@gmail.com writes: [ bytea_output doesn't need to be GUC_REPORT because format is autodetectable ] Fair enough. Anyway we're really about two years too late to revisit that. Btw, it does not seems that per-request metainfo change requires major version. It just client can send extra metainfo packet before bind+execute, if it knows server version is good enough. That is nonsense. You're changing the protocol, and then saying that clients should consult the server version instead of the protocol version to know what to do. 2. Can we postpone minor data format changes on the wire until there is proper way for clients to request on-the-wire formats? I think that people are coming around to that position, ie, we need a well-engineered solution to the versioning problem *first*, and should not accept incompatible minor improvements until we have that. One simple way clients could detect the binary encoding at startup would be to pass known test parameters and match against the returned values. If the client cannot match the response, then it should choose the text representation. Alternatively, the 16-bit int in the Bind and RowDescription messages could be incremented to indicate a new format and then clients can specify the highest version of the binary format which they support. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
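The versioned-format idea above could reduce to something like the following hypothetical helper (this is not part of any actual protocol; the function and its semantics are invented for illustration): client and server each advertise the highest format version they support, and the connection uses the minimum of the two, falling back to the text representation when nothing else matches.

```c
/* Hypothetical: if the 16-bit result-format code in Bind/RowDescription
 * were treated as a version, both sides could settle on the highest
 * format both support; 0 (text) is always supported. */
static int negotiate_format(int client_max, int server_max)
{
    int v = client_max < server_max ? client_max : server_max;
    return v > 0 ? v : 0;   /* 0 = text representation, the safe fallback */
}
```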
Re: GUC_REPORT for protocol tunables was: Re: [HACKERS] Optimize binary serialization format of arrays with fixed size elements
On Jan 23, 2012, at 4:45 PM, Merlin Moncure wrote: On Mon, Jan 23, 2012 at 2:00 PM, A.M. age...@themactionfaction.com wrote: One simple way clients could detect the binary encoding at startup would be to pass known test parameters and match against the returned values. If the client cannot match the response, then it should choose the text representation. Alternatively, the 16-bit int in the Bind and RowDescription messages could be incremented to indicate a new format and then clients can specify the highest version of the binary format which they support. Prefer the version. But why send this over and over with each bind? Wouldn't you negotiate that when connecting? Most likely, optionally, doing as much as you can from the server version? Personally I'm not really enthusiastic about a solution that adds a non-avoidable penalty to all queries. Also, a small nit: this problem is not specific to binary formats. Text formats can and do change, albeit rarely, with predictable headaches for the client. I see no reason to deal with text/binary differently. The only difference between text/binary wire formats in my eyes is that the text formats are documented. merlin In terms of backwards compatibility (to support the widest range of clients), wouldn't it make sense to freeze each format option? That way, an updated text version could also assume a new int16 format identifier. The client would simply pass its preferred format. This could also allow for multiple in-flight formats; for example, if a client anticipates a large in-bound bytea column, it could specify format X which indicates the server should gzip the result before sending. That same format may not be preferable on a different request. Cheers, M
Re: [HACKERS] POSIX shared memory patch status
On Jun 16, 2011, at 11:51 AM, Heikki Linnakangas wrote: What's the current state of the POSIX shared memory patch? I grabbed the patch from http://archives.postgresql.org/message-id/d9edacf7-53f1-4355-84f8-2e74cd19d...@themactionfaction.com and it doesn't seem to apply cleanly any more. Are you planning to continue working on it? If I understood the conclusion of the discussions correctly, the current plan is to continue using a small SysV shared memory segment for the interlock, and POSIX shared memory for the rest. That lets us stay below SHMMAX even if it's small, which is convenient for admins. Was there a conclusion on whether we should use fcntl() to provide some extra safety in the current interlock mechanism? I'm feeling that that should probably be split off to a separate patch, it would be easier to review separately. Hello, Please try a merge from my github branch: https://github.com/agentm/postgres/tree/posix_shmem I don't believe any conclusions were reached because the debate concerned whether or not fcntl locking was sufficient. I thought so while others pointed out that the proposed interlock would not work with multi-client NFSv3 despite the fact that the current interlock doesn't either. I originally made the patch because SysV memory sometimes requires reboots which is especially annoying when expanding an existing production db server. Even if the next version of postgresql incorporates a hybrid SysV/POSIX shmem setup, reboots may still be required if one runs any other processes requiring SysV shmem (such as older versions of postgresql). In any case, I lost interest in maintaining the patch and would not object to having the patch removed from the CommitFest. Cheers, M
Re: [HACKERS] Process wakeups when idle and power consumption
On May 5, 2011, at 4:08 PM, Alvaro Herrera wrote: Excerpts from Peter Geoghegan's message of jue may 05 16:49:25 -0300 2011: I'll need to take a look at statistics, autovacuum and Logger processes too, to see if they present more subtle opportunities for reduced idle power consumption. More subtle? Autovacuum wakes up once per second and it could sleep a lot longer if it weren't for the loop that checks for signals. I think that could be improved a lot. Could kqueue be of use here? Non-kqueue-supporting platforms could always fall back to the existing select(). Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] POSIX shared memory redux
On Apr 13, 2011, at 9:30 PM, Robert Haas wrote: On Wed, Apr 13, 2011 at 6:11 PM, A.M. age...@themactionfaction.com wrote: I don't see why we need to get rid of SysV shared memory; needing less of it seems just as good. 1. As long as one keeps SysV shared memory around, the postgresql project has to maintain the annoying platform-specific document on how to configure the poorly named kernel parameters. If the SysV region is very small, that means I can run more postgresql instances within the same kernel limits, but one can still hit the limits. My patch allows the postgresql project to delete that page and the hassles with it. 2. My patch proves that SysV is wholly unnecessary. Are you attached to it? (Pun intended.) With all due respect, I think this is an unproductive conversation. Your patch proves that SysV is wholly unnecessary only if we also agree that fcntl() locking is just as reliable as the nattch interlock, and Tom and I are trying to explain why we don't believe that's the case. Saying that we're just wrong without responding to our points substantively doesn't move the conversation forward. Sorry- it wasn't meant to be an attack- just a dumb pun. I am trying to argue that, even if the fcntl is unreliable, the startup procedure is just as reliable as it is now. The reasons being: 1) the SysV nattch method's primary purpose is to protect the shmem region. This is no longer necessary in my patch because the shared memory is unlinked immediately after creation, so only the initial postmaster and its children have access. 
2) the standard postgresql lock file remains the same Furthermore, there is indeed a case where the SysV nattch cannot work while the fcntl locking can indeed catch: if two separate machines have a postgresql data directory mounted over NFS, postgresql will currently allow both machines to start a postmaster in that directory because the SysV nattch check fails and then the pid in the lock file is the pid on the first machine, so postgresql will say starting anyway. With fcntl locking, this can be fixed. SysV only has presence on one kernel. In case it's not clear, here again is what we're concerned about: A System V shm *cannot* be removed until nobody is attached to it. A lock file can be removed, or the lock can be accidentally released by the apparently innocuous operation of closing a file descriptor. Both you and Tom have somehow assumed that the patch alters current postgresql behavior. In fact, the opposite is true. I haven't changed any of the existing behavior. The robust behavior remains. I merely added fcntl interlocking on top of the lock file to replace the SysV shmem check. This seems contradictory. If you replaced the SysV shmem check, then it's not there, which means you altered the behavior. From what I understood, the primary purpose of the SysV check was to protect the shared memory from multiple stompers. The interlock was a neat side-effect. The lock file contents are currently important to get the pid of a potential, conflicting postmaster. With the fcntl API, we can return a live conflicting PID (whether a postmaster or a stuck child), so that's an improvement. This could be used, for example, for STONITH, to reliably kill a dying replication clone- just loop on the pids returned from the lock. Even if the fcntl check passes, the pid in the lock file is checked, so the lock file behavior remains the same. If you were to implement a daemon with a shared data directory but no shared memory, how would implement the interlock? 
Would you still insist on SysV shmem? Unix daemons generally rely on lock files alone. Perhaps there is a different API on which we can agree. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] POSIX shared memory redux
On Apr 14, 2011, at 8:22 AM, Florian Weimer wrote: * Tom Lane: Well, the fundamental point is that ignoring NFS is not the real world. We can't tell people not to put data directories on NFS, and even if we did tell them not to, they'd still do it. And NFS locking is not trustworthy, because the remote lock daemon can crash and restart (forgetting everything it ever knew) while your own machine and the postmaster remain blissfully awake. Is this still the case with NFSv4? Does the local daemon still keep the lock state? The lock handling has been fixed in NFSv4. http://nfs.sourceforge.net/ NFS Version 4 introduces support for byte-range locking and share reservation. Locking in NFS Version 4 is lease-based, so an NFS Version 4 client must maintain contact with an NFS Version 4 server to continue extending its open and lock leases. http://linux.die.net/man/2/flock flock(2) does not lock files over NFS. Use fcntl(2) instead: that does work over NFS, given a sufficiently recent version of Linux and a server which supports locking. I would need some more time to dig up what recent version of Linux specifies, but NFSv4 is likely required. None of this is to say that an fcntl lock might not be a useful addition to what we do already. It is to say that fcntl can't just replace what we do already, because there are real-world failure cases that the current solution handles and fcntl alone wouldn't. If it requires NFS misbehavior (possibly in an older version), and you have to start postmasters on separate nodes (which you normally wouldn't do), doesn't this make it increasingly unlikely that it's going to be triggered in the wild? With the patch I offer, it would be possible to use shared storage and failover postgresql nodes on different machines over NFS. (The second postmaster blocks and waits for the lock to be released.) Obviously, such a setup isn't as strong as using replication, but given a sufficiently fail-safe shared storage setup, it could be made reliable. 
Cheers, M
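The blocking-handoff behavior described above (the second postmaster waiting in F_SETLKW until the first releases the lock) can be sketched as a toy two-process demo; names are invented and this compresses the handoff into one program purely for illustration:

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

/* Sketch: the "standby" blocks in F_SETLKW until the "primary" releases
 * its lock (here released explicitly; process exit would release it
 * too), then takes over. */
static int demo_failover_handoff(void)
{
    char path[] = "/tmp/pg_failover_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;

    struct flock fl;
    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fcntl(fd, F_SETLK, &fl);          /* "primary" takes the lock */

    int pipefd[2];
    pipe(pipefd);
    pid_t child = fork();
    if (child == 0)
    {
        /* "standby": blocks here until the primary's lock goes away */
        int fd2 = open(path, O_RDWR);
        struct flock wait_lock;
        memset(&wait_lock, 0, sizeof(wait_lock));
        wait_lock.l_type = F_WRLCK;
        wait_lock.l_whence = SEEK_SET;
        fcntl(fd2, F_SETLKW, &wait_lock);
        char ok = 1;
        write(pipefd[1], &ok, 1);     /* standby has taken over */
        _exit(0);
    }

    fl.l_type = F_UNLCK;              /* primary "dies": release the lock */
    fcntl(fd, F_SETLK, &fl);

    char got = 0;
    read(pipefd[0], &got, 1);
    waitpid(child, NULL, 0);
    unlink(path);
    return got;
}
```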
Re: [HACKERS] POSIX shared memory redux
On Apr 13, 2011, at 11:37 PM, Tom Lane wrote: A.M. age...@themactionfaction.com writes: 1. As long one keeps SysV shared memory around, the postgresql project has to maintain the annoying platform-specific document on how to configure the poorly named kernel parameters. No, if it's just a small area, I don't see that that's an issue. You're going to max out on other things (like I/O bandwidth) long before you run into the limit on how many postmasters you can have from this. The reason that those parameters are problematic now is that people tend to want *large* shmem segments and the typical defaults aren't friendly to that. That's assuming that no other processes on the system are using up the available segments (such as older postgresql instances). 2. My patch proves that SysV is wholly unnecessary. Are you attached to it? (Pun intended.) You were losing this argument already, but ad hominem attacks are pretty much guaranteed to get people to tune you out. I apologized to Robert Haas in another post- no offense was intended. There are real, substantive, unfixable reasons to not trust fcntl locking completely. ...on NFS which the postgresql community doesn't recommend anyway. But even in that case, the existing lock file (even without the fcntl lock), can catch that case via the PID in the file contents. That is what I meant when I claimed that the behavior remains the same. I would encourage you to take a look at the patch. Just to be perfectly clear: I have not read your patch, and am not likely to before the next commitfest starts, because I have approximately forty times too many things to do already. I'm just going off your own abbreviated description of the patch. But from what I know about fcntl locking, it's not a sufficient substitute for the SysV shmem interlock, because it has failure modes that the SysV interlock doesn't, and those failure modes occur in real-world cases. 
Yeah, it'd be nice to also be able to lock against other postmasters on other NFS clients, but we hardly ever hear of somebody getting burnt by the lack of that (and fcntl wouldn't be a bulletproof defense anyway). On the other hand, accidentally trying to start a duplicate postmaster on the same machine is an everyday occurrence. I really do appreciate the time you have put into feedback. I posed this question also to Robert Haas: is there a different API which you would find acceptable? I chose fcntl because it seemed well-suited for this task, but the feedback has been regarding NFS v4 concerns. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] POSIX shared memory redux
On Apr 13, 2011, at 8:36 PM, Robert Haas wrote: I don't see why we need to get rid of SysV shared memory; needing less of it seems just as good. 1. As long as one keeps SysV shared memory around, the postgresql project has to maintain the annoying platform-specific document on how to configure the poorly named kernel parameters. If the SysV region is very small, that means I can run more postgresql instances within the same kernel limits, but one can still hit the limits. My patch allows the postgresql project to delete that page and the hassles with it. 2. My patch proves that SysV is wholly unnecessary. Are you attached to it? (Pun intended.) In answer to your off-list question, one of the principal ways I've seen fcntl() locking fall over and die is when someone removes the lock file. You might think that this could be avoided by picking something important like pg_control as the lock file, but it turns out that doesn't really work: http://0pointer.de/blog/projects/locking.html Tom's point is valid too. Many storage appliances present themselves as an NFS server, so it's very plausible for the data directory to be on an NFS server, and there's no guarantee that flock() won't be broken there. If our current interlock were known to be unreliable also maybe we wouldn't care very much, but AFAICT it's been extremely robust. Both you and Tom have somehow assumed that the patch alters current postgresql behavior. In fact, the opposite is true. I haven't changed any of the existing behavior. The robust behavior remains. I merely added fcntl interlocking on top of the lock file to replace the SysV shmem check. If someone deletes the postgresql lock file over NFS, the data directory is equally screwed, but with my patch there is a chance that two machines sharing a properly-configured NFS mount can properly interlock- postgresql cannot offer that today, so this is a feature upgrade with no loss. The worst case scenario is today's behavior. 
My original goal remains to implement POSIX shared memory, but Tom Lane was right to point out that the current interlocking check relies on SysV, so, even though the startup locking is really orthogonal to shared memory, I implemented what could be considered a separate patch for that and rolled it into one. I would encourage you to take a look at the patch. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] POSIX shared memory redux
On Apr 11, 2011, at 6:06 PM, Robert Haas wrote: On Sun, Apr 10, 2011 at 5:03 PM, A.M. age...@themactionfaction.com wrote: To ensure that no two postmasters can startup in the same data directory, I use fcntl range locking on the data directory lock file, which also works properly on (properly configured) NFS volumes. Whenever a postmaster or postmaster child starts, it acquires a read (non-exclusive) lock on the data directory's lock file. When a new postmaster starts, it queries if anything would block a write (exclusive) lock on the lock file which returns a lock-holding PID in the case when other postgresql processes are running. This seems a lot leakier than what we do now (imagine, for example, shared storage) and I'm not sure what the advantage is. I was imagining keeping some portion of the data in sysv shm, and moving the big stuff to a POSIX shm that would operate alongside it. What do you mean by leakier? The goal here is to extinguish SysV shared memory for portability and convenience benefits. The mini-SysV proposal was implemented and shot down by Tom Lane. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] POSIX shared memory redux
On Apr 11, 2011, at 7:25 PM, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: On Sun, Apr 10, 2011 at 5:03 PM, A.M. age...@themactionfaction.com wrote: To ensure that no two postmasters can startup in the same data directory, I use fcntl range locking on the data directory lock file, which also works properly on (properly configured) NFS volumes. Whenever a postmaster or postmaster child starts, it acquires a read (non-exclusive) lock on the data directory's lock file. When a new postmaster starts, it queries if anything would block a write (exclusive) lock on the lock file which returns a lock-holding PID in the case when other postgresql processes are running. This seems a lot leakier than what we do now (imagine, for example, shared storage) and I'm not sure what the advantage is. BTW, the above-described solution flat out doesn't work anyway, because it has a race condition. Postmaster children have to reacquire the lock after forking, because fcntl locks aren't inherited during fork(). And that means you can't tell whether there's a just-started backend that hasn't yet acquired the lock. It's really critical for our purposes that SysV shmem segments are inherited at fork() and so there's no window where a just-forked backend isn't visible to somebody checking the state of the shmem segment. Then you haven't looked at my patch because I address this race condition by ensuring that a lock-holding violator is the postmaster or a postmaster child. If such a condition is detected, the child exits immediately without touching the shared memory. POSIX shmem is inherited via file descriptors. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] POSIX shared memory redux
On Apr 11, 2011, at 7:13 PM, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: On Mon, Apr 11, 2011 at 3:11 PM, A.M. age...@themactionfaction.com wrote: What do you mean by leakier? The goal here is to extinguish SysV shared memory for portability and convenience benefits. The mini-SysV proposal was implemented and shot down by Tom Lane. I mean I'm not convinced that fcntl() locking will be as reliable. I'm not either. Particularly not on NFS. (Although on NFS you have other issues to worry about too, like postmasters on different machines being able to reach the same data directory. I wonder if we should do both SysV and fcntl locking ...) Is there an example of a recent system where fcntl is broken (ignoring NFS)? I believe my patch addresses all potential race conditions and uses the APIs properly to guarantee single-postmaster data directory usage and I tested on Darwin and a two-year-old Linux kernel. In the end, fcntl locking relies on the same kernel which provides the SysV user count, so I'm not sure what makes it less reliable, but I have heard that twice now, so I am open to hearing about your experiences. I know Tom shot that down before, but I still think it's probably the best way forward. Did I? I think I pointed out that there's zero gain in portability as long as we still depend on SysV shmem to work. However, if you're doing it for other reasons than portability, it might make sense anyway. The question is whether there are adequate other reasons. I provided an example of postmaster-failover relying on F_SETLKW in the email with the patch. Also, as you point out above, fcntl locking at least has a chance of working over NFS. The advantage I see is that we would be able to more easily allocate larger chunks of shared memory without changing kernel parameters, Yes, getting out from under the SHMMAX bugaboo would be awfully nice. Yes, please! That is my primary motivation for this patch. 
and perhaps even to dynamically resize shared memory chunks. This I don't really believe will ever work reliably, especially not in 32-bit machines. Whatever your kernel API is, you still have the problem of finding address space contiguous to what you were already using. Even if expanding shmem involves copying large regions of memory, it could be at least useful to adjust buffer sizes live without a restart. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] lowering privs in SECURITY DEFINER function
On Apr 8, 2011, at 7:20 PM, Alvaro Herrera wrote: Excerpts from A.M.'s message of mié abr 06 19:08:35 -0300 2011: That's really strange considering that the new role may not normally have permission to switch to the original role. How would you handle the case where the security definer role is not the super user? As I said to Jeff, it's up to the creator of the wrapper function to ensure that things are safe. Perhaps this new operation should only be superuser-callable, for example. How would you prevent general SQL attacks when manually popping the authentication stack is allowed? The popping and pushing operations would be restricted. You can only pop a single frame, and pushing it back before returning is mandatory. It might be worth thinking about extending this functionality to cover the case for connection pooling. If some SQL can re-tool an existing connection to have the properties of a new connection by a different role, then that would reduce the use-case of connection pooling. If that authorization chain can be pushed and popped by a password or some security token, for example, then that would cover the use cases I mention in this thread: http://archives.postgresql.org/pgsql-general/2006-04/msg00917.php Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] lowering privs in SECURITY DEFINER function
On Apr 6, 2011, at 5:33 PM, Alvaro Herrera wrote: Hi, A customer of ours has for a long time the desire to be able to return to the previous privilege level (i.e. the caller privs) inside a SECURITY DEFINER function. I find that this notion is not at all covered in the SQL standard, yet the use case is certainly valid from a security-conscious point of view. (Consider, for example, that you may want to enable a user to run some operation to which he is authorized, but you want to carry out some privileged operation before/after doing so: for example, disable triggers, run an update, re-enable triggers.) An easy way to somewhat solve this problem is to provide another security definer function that calls the intermediate operation, owned by a role with lower privileges. But this doesn't really solve the problem, because you are then providing a way to return to an arbitrary role, not to the specific role that's calling the function. I think part of the solution here would be to be able to tell what's the previous role, i.e. the one just below the topmost stack item in the authorization stack. Then, at least you know what to call SET SESSION AUTHORIZATION to. Thoughts? This area seems fraught with security problems, yet it is a necessary piece of the security puzzle. That's really strange considering that the new role may not normally have permission to switch to the original role. How would you handle the case where the security definer role is not the super user? How would you prevent general SQL attacks when manually popping the authentication stack is allowed? Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI patch version 14
On Feb 9, 2011, at 12:25 PM, Robert Haas wrote: On Wed, Feb 9, 2011 at 10:38 AM, Markus Wanner mar...@bluegap.ch wrote: On 02/09/2011 04:16 PM, David Fetter wrote: On Tue, Feb 08, 2011 at 09:09:48PM -0500, Robert Haas wrote: Frankly, I think this is an example of how our current shared memory model is a piece of garbage. What other model(s) might work better? Thread based, dynamically allocatable and resizeable shared memory, as most other projects and developers use, for example. Or less invasively, a small sysv shm to prevent the double-postmaster problem, and allocate the rest using POSIX shm. Such a patch was proposed and rejected: http://thread.gmane.org/gmane.comp.db.postgresql.devel.general/94791 Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] test_fsync label adjustments
On Jan 18, 2011, at 3:55 PM, Bruce Momjian wrote: I have modified test_fsync to use test labels that match wal_sync_method values, and added more tests for open_sync with different sizes. This should make the program easier for novices to understand. Here is a test run for Ubuntu 11.04:

$ ./test_fsync
2000 operations per test

Compare file sync methods using one 8k write:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync (non-direct I/O)*    85.127 ops/sec
        open_datasync (direct I/O)         87.119 ops/sec
        fdatasync                          81.006 ops/sec
        fsync                              82.621 ops/sec
        fsync_writethrough                 n/a
        open_sync (non-direct I/O)*        84.412 ops/sec
        open_sync (direct I/O)             91.006 ops/sec

* This non-direct I/O mode is not used by Postgres.

I am curious how this is targeted at novices. A naive user might enable the fastest option, which could be exactly wrong. For this to be useful to novices, I suspect the tool will need to generate platform-specific suggestions, no? Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] test_fsync label adjustments
On Jan 18, 2011, at 5:16 PM, Bruce Momjian wrote: A.M. wrote: On Jan 18, 2011, at 3:55 PM, Bruce Momjian wrote: I have modified test_fsync to use test labels that match wal_sync_method values, and added more tests for open_sync with different sizes. This should make the program easier for novices to understand. Here is a test run for Ubuntu 11.04: $ ./test_fsync 2000 operations per test Compare file sync methods using one 8k write: (in wal_sync_method preference order, except fdatasync is Linux's default) open_datasync (non-direct I/O)* 85.127 ops/sec open_datasync (direct I/O) 87.119 ops/sec fdatasync 81.006 ops/sec fsync 82.621 ops/sec fsync_writethrough n/a open_sync (non-direct I/O)* 84.412 ops/sec open_sync (direct I/O) 91.006 ops/sec * This non-direct I/O mode is not used by Postgres. I am curious how this is targeted at novices. A naive user might enable the fastest option which could be exactly wrong. For this to be useful to novices, I suspect the tool will need to generate platform-specific suggestions, no? Uh, why isn't the fastest option right for them? It is hardware/kernel specific when you run it --- how could it be better? Because the fastest option may not be syncing to disk. For example, the only option that makes sense on OS X is fsync_writethrough- it would be helpful if the tool pointed that out (on OS X only, obviously). Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] test_fsync label adjustments
On Jan 18, 2011, at 5:21 PM, Bruce Momjian wrote: A.M. wrote: On Jan 18, 2011, at 5:16 PM, Bruce Momjian wrote: A.M. wrote: On Jan 18, 2011, at 3:55 PM, Bruce Momjian wrote: I have modified test_fsync to use test labels that match wal_sync_method values, and added more tests for open_sync with different sizes. This should make the program easier for novices to understand. Here is a test run for Ubuntu 11.04: $ ./test_fsync 2000 operations per test Compare file sync methods using one 8k write: (in wal_sync_method preference order, except fdatasync is Linux's default) open_datasync (non-direct I/O)* 85.127 ops/sec open_datasync (direct I/O) 87.119 ops/sec fdatasync 81.006 ops/sec fsync 82.621 ops/sec fsync_writethrough n/a open_sync (non-direct I/O)* 84.412 ops/sec open_sync (direct I/O) 91.006 ops/sec * This non-direct I/O mode is not used by Postgres. I am curious how this is targeted at novices. A naive user might enable the fastest option which could be exactly wrong. For this to be useful to novices, I suspect the tool will need to generate platform-specific suggestions, no? Uh, why isn't the fastest option right for them? It is hardware/kernel specific when you run it --- how could it be better? Because the fastest option may not be syncing to disk. For example, the only option that makes sense on OS X is fsync_writethrough- it would be helpful if the tool pointed that out (on OS X only, obviously). Yes, that would be a serious problem. :-( I am not sure how we would address this --- your point is a good one. One general idea I had would be to offer some heuristics such as "this sync rate is comparable to that of one SATA drive" or "comparable to RAID 10 with X drives" or "this rate is likely too fast to actually be syncing". But then you are stuck with making sure that the heuristics are kept up-to-date, which would be annoying. Otherwise, the only option I see is to detect the kernel and compare against a list of known problematic methods. 
Perhaps it would be easier to compare against a whitelist. Also, the tool would likely need to parse mount output to account for problems with specific filesystems. I am just throwing around some ideas... Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] test_fsync label adjustments
On Jan 18, 2011, at 5:41 PM, Bruce Momjian wrote: A.M. wrote: Because the fastest option may not be syncing to disk. For example, the only option that makes sense on OS X is fsync_writethrough- it would be helpful if the tool pointed that out (on OS X only, obviously). Yes, that would be a serious problem. :-( I am not sure how we would address this --- your point is a good one. One general idea I had would be to offer some heuristics such as "this sync rate is comparable to that of one SATA drive" or "comparable to RAID 10 with X drives" or "this rate is likely too fast to actually be syncing". But then you are stuck with making sure that the heuristics are kept up-to-date, which would be annoying. That fails for RAID BBUs. Well, it's nothing more than a heuristic- it is still nice to know whether or not the fancy hardware RAID I just set up is similar to Josh Berkus' RAID setup or a single SATA drive (which would hint at a misconfiguration). As you said, perhaps a wiki is better for this. But a wiki won't integrate with this tool, which is why I would hesitate to point novices to this tool... should the tool point to the wiki? Otherwise, the only option I see is to detect the kernel and compare against a list of known problematic methods. Perhaps it would be easier to compare against a whitelist. Also, the tool would likely need to parse mount output to account for problems with specific filesystems. I am just throwing around some ideas... That sounds pretty complicated. One idea would be the creation of a wiki where people could post their results, or ideally a tool that could read the output and load it into a database for analysis with other results. The OS X example is pretty cut-and-dried- it would be nice if there were some kind of hints in the tool pointing in the right direction, or at least a few words of warning: the fastest option may not be the safest- read the docs. 
Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] making an unlogged table logged
On Jan 5, 2011, at 2:37 PM, Robert Haas wrote: On Wed, Jan 5, 2011 at 2:36 PM, Simon Riggs si...@2ndquadrant.com wrote: The lock strength selected on the master doesn't need to be the same as the lock strength on the standby. You could quite easily generate AEL lock records to send to standby, without actually taking that lock level on the master. True. Question: what does an unlogged table look like on the standby? Are they visible at all, does it throw an error, or do they just look empty. We probably need some docs in the HS section to explain that. I hope the answer isn't look empty since that is effectively data loss for people spreading queries across multiple nodes. Error. Hm- if the unlogged tables are being used as HTTP transient state storage, it would be handy to have that (admittedly non-essential) data on the standby when it becomes master, even if there are no guarantees surrounding the data beyond it looked like this at some point. Since the tables are not writing WAL, would it be possible to allow for writing to unlogged tables on the standby to allow for out-of-band syncing? Otherwise, it seems the only alternative is to push changes to a separate database on the standby machine and then suck the data in when it becomes master. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unlogged tables
On Nov 17, 2010, at 4:00 PM, Kevin Grittner wrote: Robert Haas robertmh...@gmail.com wrote: OK, so we're proposing a hierarchy like this. 1. PERMANENT (already exists). 2. UNLOGGED (what this patch currently implements). 3. UNSYNCED (future work). 4. GLOBAL TEMPORARY (future work). 5. LOCAL TEMPORARY (our current temp tables). All of the above would have real uses in our shop. It's possible to imagine a few more stops on this hierarchy. Some of these might be slightly preferred over the above in certain circumstances, but that's getting down to fine tuning. I think the five listed above are more important than the speculative ones mentioned. I don't particularly care for the name UNSYNCED ... EVANESCENT? I'm starting not to like UNLOGGED much either ... EPHEMERAL? Actually, the UNSYNCED and UNLOGGED seem fairly clear. Unless one thinks that the types could be combined- perhaps a table declaration could use both UNLOGGED and UNSYNCED? Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] create c function with void argument bug?
Hello, It seems that this:

CREATE OR REPLACE FUNCTION test_fsync_speed() RETURNS float
  AS '$libdir/test_fsync_speed', 'test_fsync_speed'
  LANGUAGE C IMMUTABLE STRICT;

is not equivalent to this (note void argument):

CREATE OR REPLACE FUNCTION test_fsync_speed(void) RETURNS float
  AS '$libdir/test_fsync_speed', 'test_fsync_speed'
  LANGUAGE C IMMUTABLE STRICT;

Two functions are created.

test=# \df
                              List of functions
 Schema |       Name       | Result data type | Argument data types |  Type
--------+------------------+------------------+---------------------+--------
 public | test_fsync_speed | double precision |                     | normal
 public | test_fsync_speed | double precision | void                | normal

Furthermore, I can't figure out how to call the void argument variant. Why is void accepted as an argument?

create function testvoid(void) returns void as '' LANGUAGE PLPGSQL;
ERROR:  PL/pgSQL functions cannot accept type void

I punched the void argument in just by chance- am I just stupid? Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Exposing an installation's default value of unix_socket_directory
On Oct 21, 2010, at 4:19 PM, Robert Haas wrote: 2010/10/21 Tom Lane t...@sss.pgh.pa.us: Alvaro Herrera alvhe...@commandprompt.com writes: Excerpts from Cédric Villemain's message of jue oct 21 16:01:30 -0300 2010: I agree this is interesting information to get, but wonder how pg_config can know that and it looks to me that this information as nothing to do in pg_config pg_config is all about installation, socket_dir is a postgresql.conf setting. Yeah -- how is pg_config to know? All it can tell you is what was the compiled-in default. That's what I wanted, actually. If you've set a non-default value in postgresql.conf, SHOW will tell you about that, but it fails to expose the default value. Maybe you should go the SHOW route. The user could connect via TCP and find out the socket directory that way. Yeah, the SHOW case is not useless by any means. I think adding this to pg_config is sensible. Sure, the user could have moved the socket directory. But it's a place to start looking. So why not? Because pg_config is supposed to return the current state of a cluster? Because it might indicate a connection to the wrong server? Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] O_DSYNC broken on MacOS X?
On Oct 19, 2010, at 11:22 AM, Bruce Momjian wrote: Greg Smith wrote: A.M. wrote: Perhaps a simpler tool could run a basic fsyncs-per-second test and prompt the DBA to check that the numbers are within the realm of possibility. This is what the test_fsync utility that already ships with the database should be useful for. The way Bruce changed it to report numbers in commits/second for 9.0 makes it a lot easier to use for this purpose than it used to be. I think there's still some additional improvements that could be made there, but it's a tricky test to run accurately. The test_fsync was designed to test various things like whether several open-sync writes are better than two writes and an fsync, and whether you can fsync data written on a different file descriptor. It is really a catch-all test right now, not one specific for choosing sync methods. I am working on simplifying the test_fsync tool and making it a contrib function which can be run by the superuser based on the configured fsync method. That way, the list can ask a user to run it to report fsyncs-per-second for suspiciousness. The goal is to make it more accessible. I was also thinking about adding some notes along the lines of "Your drive fsync speed rates between a 5400 RPM SATA drive and a 7200 RPM SATA drive." or "Your drive fsync speed rates as high as RAM- your fsync method may be wrong." Currently, the test tool is not even compiled by default. Thoughts? Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] O_DSYNC broken on MacOS X?
On Oct 7, 2010, at 12:26 PM, Robert Haas wrote: On Thu, Oct 7, 2010 at 11:45 AM, Greg Smith g...@2ndquadrant.com wrote: Robert Haas wrote: Proposed doc patch attached. Looks accurate to me. I like the additional linking to the Reliability page you put in there too. Heavily referencing that important page from related areas is a good thing, particularly now that it's got a lot more details than it used to. Cool, thanks for the fast review. I suspect there are more details that could stand to be added to the WAL reliability page as well, but I don't know what they are so I can't add them. I still have the feeling that we have not put quite a large enough red, blinking light around this issue, but I don't have a concrete suggestion. I think the general problem is that there is no simple way to verify that a PostgreSQL commit is pushing the bits to persistent storage. It would be helpful if there were a platform-specific, volume-specific tool to deduce this. Currently, there is no warning light that goes on when commits are not persistent. On Linux, a tool could check filesystem parameters, hdparm (if relevant), and hard drive and controller specs (possibly against a blacklist of known liars). Perhaps a simpler tool could run a basic fsyncs-per-second test and prompt the DBA to check that the numbers are within the realm of possibility. How else can a DBA today ensure that a commit is a commit? Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] O_DSYNC broken on MacOS X?
On Sep 30, 2010, at 5:02 PM, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: Oh, I missed that. Actually, I wasn't really so concerned with whether his benchmark is correct. I *am* concerned about being broken out of the box on MacOS X. Actually, the problem with OSX is that OSX is broken out of the box, at least by that standard. The system's normal configuration is that fsync() does nothing, That is not correct. fsync and friends on Darwin synchronizes I/O and flushes dirty kernel caches to the disk which meets the specification and is distinctly different from doing nothing. The fsync() function can be used by an application to indicate that all data for the open file description named by fildes is to be transferred to the storage device associated with the file described by fildes in an implementation-dependent manner. http://opengroup.org/onlinepubs/007908799/xsh/fsync.html On MacOS X, fsync() always has and always will flush all file data from host memory to the drive on which the file resides. http://lists.apple.com/archives/Darwin-dev/2005/Feb/msg00072.html I'm not sure whether we should select fsync_writethrough as the default on OSX. We don't make an equivalent attempt to prevent OS or storage malfeasance on other Unixoid platforms --- in fact, I'd say OSX is a bit ahead of the game in that you *can* force writethrough without resorting to arcane hacks with hdparm or some such. We could definitely stand to be a bit more verbose about documenting the platform-specific issues in this area. Not only is this issue platform-specific, it is also bus-, controller- and disk-specific. Luckily, hardware that ships from Apple responds properly to F_FULLFSYNC. It's too bad there is no cross-platform way to ask what level of hardware-syncing is available. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Multi-branch committing in git, revisited
On Sep 21, 2010, at 11:19 PM, Tom Lane wrote: Offhand I think I like Andrew's recommendation of a shortlived branch better. In essence your idea is using the tip of master itself as a shortlived branch, which is maybe a bit too cute. If you get distracted and need to do something else for awhile, the tip of master is not where you want your not-yet-pushable work to be. For uncommitted work, see also git stash. http://www.kernel.org/pub/software/scm/git/docs/git-stash.html Cheers, M
Re: [HACKERS] string function - format function proposal
On Aug 31, 2010, at 5:07 PM, Pavel Stehule wrote: Hello attached WIP patch. I implement only basic format's tags related to SQL: string, value, literal, sql identifier. These tags are basic, but there is no obstacle to implementing any other formats or enhancing the syntax. The mix with the to_char function is more complex than I expected - so I am not thinking about it for now (there is more than one to_char function). snip It would be pretty handy if plpgsql EXECUTE could operate like this with USING to support identifiers. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] new notify payload as string
With 9.0b4, I am testing the new NOTIFY payload feature. One thing I noticed is that it seems impossible to differentiate at the receiving end from: NOTIFY test; and NOTIFY test,''; So, it is impossible to differentiate between a notification with an empty string payload and a notification without a payload due to the backend protocol defining the payload as a string. Perhaps a boolean could be added to the backend protocol to account for this and then extra could be set to NULL in libpq. This could be handy when converting codebases that use LISTEN/NOTIFY when one wishes to be warned of old NOTIFYs being used without a payload. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Committers info for the git migration - URGENT!
On Aug 25, 2010, at 6:49 PM, Tom Lane wrote: Magnus Hagander mag...@hagander.net writes: The current mapping used is the same one as on git.postgresql.org (see attached file). BTW, I noticed that this list omits several old committers: Julian Assange pr...@suburbia.net That is _the_ Julian Assange who is in the news now. Very cool! http://en.wikipedia.org/wiki/Julian_Assange#Career_as_computer_programmer -M
Re: [HACKERS] Progress indication prototype
On Aug 18, 2010, at 9:02 AM, Robert Haas wrote: On Wed, Aug 18, 2010 at 8:45 AM, Greg Stark st...@mit.edu wrote: On Tue, Aug 17, 2010 at 11:29 PM, Dave Page dp...@pgadmin.org wrote: Which is ideal for monitoring your own connection - having the info in the pg_stat_activity is also valuable for monitoring and system administration. Both would be ideal :-) Hm, I think I've come around to the idea that having the info in pg_stat_activity would be very nice. I can just picture sitting in pgadmin while a bunch of reports are running and seeing progress bars for all of them... But progress bars alone aren't really the big prize. I would really love to see the explain plans for running queries. This would improve the DBA's view of what's going on in the system immensely. Currently you have to grab the query and try to set up a similar environment for it to run explain on it. If analyze has run since or if the tables have grown or shrunk or if the query was run with some constants as parameters it can be awkward. If some of the tables in the query were temporary tables it can be impossible. You can never really be sure you're looking at precisely the same plan as the other user's session is running. But stuffing the whole json or xml explain plan into pg_stat_activity seems like it doesn't really fit the same model that the existing infrastructure is designed around. It could be quite large and if we want to support progress feedback it could change quite frequently. We do stuff the whole query there (up to a limited size) so maybe I'm all wet and stuffing the explain plan in there would be fine? It seems to me that progress reporting could add quite a bit of overhead. For example, in the whole-database vacuum case, the most logical way to report progress would be to compute pages visited divided by pages to be visited. But the total number of pages to be visited is something that doesn't need to be computed in advance unless someone cares about progress. 
I don't think we want to incur that overhead in all cases just on the off chance someone might ask. We need to think about ways to structure this so that it only costs when someone's using it. I wish that I could get explain analyze output step-by-step while running a long query instead of seeing it jump out at the end of execution. Some queries never end and it would be nice to see which step is spinning (explain can be a red herring). To me the progress bar is nice, but I don't see how it would be reliable enough to draw any inferences (such as execution time). If I could get the explain analyze results *and* the actual query results, that would be a huge win, too. Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Testing with concurrent sessions
On Jan 7, 2010, at 12:39 PM, Greg Sabino Mullane wrote: I'm still *very* interested in making a libpq-less pure perl driver, if anyone feels like funding it, let me know! :) You mean this one: http://search.cpan.org/~arc/DBD-PgPP-0.07/lib/DBD/PgPP.pm ? Cheers, M -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Has anyone used CLANG yet?
On Dec 9, 2009, at 4:23 PM, Chris Browne wrote: This is a C front end for the LLVM compiler... I noticed that it entered Debian/Unstable today: http://packages.debian.org/sid/main/clang I thought it would be interesting to see if PostgreSQL compiles with this, as an alternative compiler that should presumably become more and more available on Linux et al. (And I suppose that the randomly selected .sig is supremely apropos!) configure blows up here at the following:

conftest.c:75:28: error: invalid token after top level declarator
extern unsigned int PASCAL accept (unsigned int, void *, void *);

I suspect there's something about PASCAL that's a problem, as clang is nominally supposed to be a C compiler ;-). I haven't looked deeper, so haven't the remotest idea how deep the issue lies. At any rate, I should poke at this further soon, but if it seems interesting to others, well, CLANG is now an easy install on some number of systems!

Clang works for me on MacOS 10.6.2:

    /Developer/usr/bin/clang --version
    clang version 1.0.1 (http://llvm.org/svn/llvm-project/cfe/tags/Apple/clang-24 exported)
    Target: x86_64-apple-darwin10

    CC=/Developer/usr/bin/clang ./configure --prefix=/Users/agentm/pgsql841/
    make -j 8
    /Users/agentm/pgsql841/initdb -E UTF8 ../data
    ./pg_ctl -D ../data/ start
    server starting
    RD07:bin agentm$ LOG: database system was shut down at 2009-12-09 17:01:51 EST
    LOG: database system is ready to accept connections
    LOG: autovacuum launcher started

    /Users/agentm/pgsql841/psql postgres
    psql (8.4.1)
    Type "help" for help.

    postgres=# select 1;
     ?column?
    ----------
            1
    (1 row)

I do see lots of warnings regarding unsupported compiler flags:

    clang: warning: argument unused during compilation: '-no-cpp-precomp'
    clang: warning: argument unused during compilation: '-O2'
    clang: warning: argument unused during compilation: '-Wall'
    clang: warning: argument unused during compilation: '-Wmissing-prototypes'
    clang: warning: argument unused during compilation: '-Wpointer-arith'
    clang: warning: argument unused during compilation: '-Wdeclaration-after-statement'
    clang: warning: argument unused during compilation: '-Wendif-labels'
    clang: warning: argument unused during compilation: '-fno-strict-aliasing'
    clang: warning: argument unused during compilation: '-fwrapv'

and some code-based warnings:

    print.c:1105:24: warning: field width should have type 'int', but argument has type 'unsigned int' [-Wformat]
            fprintf(fout, "%-s%*s", hlineptr[line_count].ptr,
    pl_exec.c:3529:6: warning: expression result unused [-Wunused-value]
            ItemPointerSetInvalid(&(tmptup.t_self));
    ../../../../src/include/storage/itemptr.h:134:2: note: instantiated from:
            BlockIdSet(&((pointer)->ip_blkid), InvalidBlockNumber), \
    ../../../../src/include/storage/block.h:86:2: note: instantiated from:
            AssertMacro(PointerIsValid(blockId)), \
    ../../../../src/include/postgres.h:675:39: note: instantiated from:
    #define AssertMacro(condition)  ((void)true)
    ../../../../src/include/c.h:185:15: note: instantiated from:
    #define true    ((bool) 1)

You are probably running configure with gcc, no?

FYI: with clang:

    time make (not -j 8)
    real    1m46.511s
    user    1m26.295s
    sys     0m14.639s

with gcc:

    time make
    real    2m41.934s
    user    2m20.778s
    sys     0m17.441s

    du -h pgsql841gcc/bin/*
     52K    pgsql841gcc/bin/clusterdb
     52K    pgsql841gcc/bin/createdb
     60K    pgsql841gcc/bin/createlang
     52K    pgsql841gcc/bin/createuser
     52K    pgsql841gcc/bin/dropdb
     60K    pgsql841gcc/bin/droplang
     52K    pgsql841gcc/bin/dropuser
    616K    pgsql841gcc/bin/ecpg
     72K    pgsql841gcc/bin/initdb
     32K    pgsql841gcc/bin/pg_config
     28K    pgsql841gcc/bin/pg_controldata
     36K    pgsql841gcc/bin/pg_ctl
    280K    pgsql841gcc/bin/pg_dump
     68K    pgsql841gcc/bin/pg_dumpall
     36K    pgsql841gcc/bin/pg_resetxlog
    128K    pgsql841gcc/bin/pg_restore
    4.6M    pgsql841gcc/bin/postgres
    4.0K    pgsql841gcc/bin/postmaster
    340K    pgsql841gcc/bin/psql
     52K    pgsql841gcc/bin/reindexdb
     32K    pgsql841gcc/bin/vacuumdb

    du -h pgsql841/bin/* (clang build)
     52K    pgsql841/bin/clusterdb
     52K    pgsql841/bin/createdb
     60K    pgsql841/bin/createlang
     52K    pgsql841/bin/createuser
     48K    pgsql841/bin/dropdb
     60K    pgsql841/bin/droplang
     48K    pgsql841/bin/dropuser
    612K    pgsql841/bin/ecpg
     72K    pgsql841/bin/initdb
     28K    pgsql841/bin/pg_config
     28K    pgsql841/bin/pg_controldata
     36K    pgsql841/bin/pg_ctl
    272K    pgsql841/bin/pg_dump
     68K    pgsql841/bin/pg_dumpall
     36K    pgsql841/bin/pg_resetxlog
    124K    pgsql841/bin/pg_restore
    4.5M    pgsql841/bin/postgres
    4.0K    pgsql841/bin/postmaster
    344K    pgsql841/bin/psql
     52K    pgsql841/bin/reindexdb
     32K    pgsql841/bin/vacuumdb

Cheers, M
Re: [HACKERS] Listen / Notify rewrite
On Nov 11, 2009, at 4:25 PM, Joachim Wieland wrote: Hi, Attached is a patch for a new listen/notify implementation. In a few words, the patch reimplements listen/notify as an SLRU-based queue which works similarly to the sinval structure. Essentially it is a ring buffer on disk with pages mapped into shared memory for read/write access. Additionally the patch does the following (see below for details): 1. It removes the pg_listener relation and 2. adds the possibility to specify a payload parameter, i.e. executing in SQL NOTIFY foo 'payload'; and 'payload' will be delivered to any listening backend. 3. Every distinct notification is delivered. 4. Order is preserved, i.e. if txn 1 first does NOTIFY foo, then NOTIFY bar, a backend (listening to both foo and bar) will always first receive the notification foo and then the notification bar. 5. It's now listen to a channel, not listen to a relation anymore... Hi Joachim, Thank you for implementing this; LISTEN/NOTIFY without a payload has been a major problem for me to work around. I understand that coalescing multiple notifications of the same name happens now, but I have never understood why. I had hoped that someone revisiting this feature would remove it. My use case is autonomous transactions. From a developer's standpoint, NOTIFY specifies a form of remote trigger for further action, outside the transaction, to occur. From that point of view, I don't see why NOTIFY foo; NOTIFY foo; is equivalent to NOTIFY foo;. I understand the use case where a per-row trigger could generate lots of spurious notifications, but it seems that if anything is generating spurious notifications, the notification is in the wrong place. The documentation strongly implies that notification names are usually table names ("I changed this table, take a look at it to see what's new."), but with the new payload, this becomes even less the case. And no such association is enforced by the NOTIFY and LISTEN commands anyway. 
http://www.postgresql.org/docs/8.4/static/sql-notify.html With the coalescing, I feel like I am being punished for using NOTIFY with something other than a table-name use case. At least with this new payload, I can set the payload to the transaction ID and be certain that all the notifications I sent are processed (and in order, even!), but could you explain why the coalescing is still necessary? While coalescing could be achieved in the receiving-end NOTIFY callback ( if(payload ID was already processed) continue; ), non-coalescing behavior cannot be achieved when the backend does the coalescing. For backwards compatibility, payload-less NOTIFY could coalesce while NOTIFYs with a payload would not coalesce, but, on the other hand, that might be confusing. In any case, thank you for improving this long-neglected subsystem! Best regards, M
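The transaction-ID workaround mentioned above might look like the following sketch. The channel name, table, and payload value are illustrative, and the two-argument NOTIFY form assumes the payload syntax proposed in this patch:

```sql
-- Under the proposed patch, notifications with distinct payloads are not
-- coalesced, so tagging each one with the sending transaction's ID makes
-- every notification unique and guarantees delivery of all of them.
LISTEN jobs;

-- In the notifying transaction (payload value illustrative):
INSERT INTO job_queue (task) VALUES ('rebuild-index');
NOTIFY jobs, 'txid:12345';
```

The receiving side can still deduplicate by remembering which payload IDs it has already processed, which is the direction of coalescing that remains possible when the backend does not coalesce.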
Re: [HACKERS] Listen / Notify rewrite
On Nov 11, 2009, at 10:43 PM, Tom Lane wrote: Andrew Chernow a...@esilo.com writes: I thought of a compromise: add the number of times a notification was generated (coalesced count+1) to the callback data. That would satisfy any backwards compatibility concerns and my use case too! If you are suggesting that the server poke data into the notifier's opaque payload, I vote no. Maybe the NOTIFY command can include a switch to enable this behavior. No syntax suggestions at this point. I agree, we should not have the system modifying the payload string for this. And don't bother suggesting a third column in the result --- we'd have to change the FE/BE protocol for that, and it's not going to happen. The existing precedent is that the system collapses identical notifications without payloads. So we could possibly get away with saying that identical payload-less notifies are collapsed but those with a payload are not. That doesn't really seem to satisfy the POLA though. I think Joachim's definition is fine, and anyone who needs delivery of distinct notifications can easily make his payload strings unique to ensure it. The notification count could be a secondary payload which does not affect the first, but I guess I'm the only one complaining about the coalescing... -M
Re: [HACKERS] SE-PostgreSQL Specifications
On Jul 25, 2009, at 11:06 AM, Tom Lane wrote: Sam Mason s...@samason.me.uk writes: Yes, that seems reasonable. The fact that you're still talking about confined users is slightly worrying and would seem to imply that there is still a superuser/normal user divide--it's probably just a terminology thing though. There had better still be superusers. Or do you want the correctness of your backups to depend on whether your SELinux policy is correct? The first time somebody loses critical data because SELinux suppressed it from their pg_dump output, they're going to be on the warpath. This behavior is no different from taking/using an SE-enabled filesystem backup. And woe to the admin who doesn't test his backups; caveat emptor. Still, it would be nice if pg_dump warned or stopped if the backup it created was completely useless (missing data dependencies), no? Cheers, M
Re: [HACKERS] [RFC] obtaining the function call stack
On Jul 13, 2009, at 4:51 PM, decibel wrote: On Jul 13, 2009, at 2:33 PM, Tom Lane wrote: Alvaro Herrera alvhe...@commandprompt.com writes: Tom Lane wrote: The performance and error recovery implications are unfavorable. Just how badly do you need this, and for what? Mainly for debugging. The situation is such that there are a lot of functions and very high load. The functions have embedded debug elogs and the intention is to call them only if the function was called in a particular context. I can't really see that as sufficiently widely useful to justify inserting such a mechanism. I suspect also that you are defining the problem the wrong way --- this user doesn't want a generic fmgr call stack, he wants a plpgsql stack. Which is something the plpgsql debugger could be taught to do, if it doesn't already, thus avoiding the overhead the 99.9% of the time that you don't need it. Actually, this could conceivably be called from other languages, such as plPerl. But it sounds like this can be done via an add-on, so no need to add it directly to the backend. How would I go about generating a meaningful backtrace for a plpgsql function that calls a plperl function? One would also expect a C function which calls a plpgsql function to appear, too, no? Shouldn't there be a unified backtrace subsystem? Cheers, M
Re: [HACKERS] idea: global temp tables
On Apr 27, 2009, at 4:44 PM, Pavel Stehule wrote: Hello I am thinking about global temp tables. One possible solution is creating a global temporary table like a normal table and, in the planner stage, checking use of this table. When some global temporary table is detected, then a real temporary table is created and used in the execution plan. It's like: CREATE GLOBAL TEMP TABLE foo(a varchar); -- create global empty table foo SELECT * FROM foo; a) if a relevant temp table for foo exists, use it; b) when not, then CREATE TEMP TABLE pg_temp_1.foo(LIKE foo INCLUDING DEFAULTS INCLUDING CONSTRAINTS INCLUDING INDEXES); and transform the origin query to SELECT * FROM pg_temp_1.foo; Ideas? Notes? Objections? When will postgresql offer global temporary tables with data which are shared among sessions? Such tables are great for transient data such as web session data where writing to the WAL is a waste. (On DB startup, the tables would simply be empty.) We're currently stuck with the memcached plugin, which makes it impossible to use database constructs such as foreign keys against the temporary data. Cheers, M
Re: [HACKERS] idea: global temp tables
On Apr 27, 2009, at 5:39 PM, Dimitri Fontaine wrote: Hi, On 27 Apr 2009, at 23:32, A.M. wrote: When will postgresql offer global temporary tables with data which are shared among sessions? Such tables are great for transient data such as web session data where writing to the WAL is a waste. (On DB startup, the tables would simply be empty.) We're currently stuck with the memcached plugin which makes it impossible to use database constructs such as foreign keys against the temporary data. If using 8.3 you can SET LOCAL synchronous_commit TO off; for web session management transactions, it'll skip the WAL fsync'ing, which is already a good start. That's pretty close, but it's not table specific and wouldn't let us reliably mix transient data changes with real data changes. Cheers, M
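The 8.3-era workaround discussed above looks like this in practice (table and values are illustrative); note that, as the reply points out, it applies per transaction rather than per table, so transient and durable writes in one transaction share the same durability setting:

```sql
BEGIN;
SET LOCAL synchronous_commit TO off;  -- skip the commit-time WAL fsync wait
INSERT INTO web_session (id, data) VALUES ('abc123', '...');
COMMIT;  -- returns without waiting for the WAL record to reach disk
```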
Re: [HACKERS] idea: global temp tables
On Apr 27, 2009, at 6:01 PM, Kevin Grittner wrote: A.M. age...@themactionfaction.com wrote: On Apr 27, 2009, at 5:39 PM, Dimitri Fontaine wrote: On 27 Apr 2009, at 23:32, A.M. wrote: When will postgresql offer global temporary tables with data which are shared among sessions? Such tables are great for transient data such as web session data where writing to the WAL is a waste. (On DB startup, the tables would simply be empty.) We're currently stuck with the memcached plugin which makes it impossible to use database constructs such as foreign keys against the temporary data. If using 8.3 you can SET LOCAL synchronous_commit TO off; for web session management transactions, it'll skip the WAL fsync'ing, which is already a good start. That's pretty close, but it's not table specific and wouldn't let us reliably mix transient data changes with real data changes. Yeah, we have a dozen or so tables we use with the pattern you describe; so the feature you describe would also have some value for us. To avoid confusion, we don't refer to these as temporary tables, but rather as permanent work tables. Again, I can't comment on practical issues regarding implementation; but it would be a nice feature to add some day. The tricky bit would be to figure out how to ensure that it got cleaned up properly, especially if PostgreSQL went down or client processes went down before tidying up. Actually, for our usage, that's the easiest part: truncate all the permanent work tables whenever the db starts. That's really the only sane thing to do anyway. That's what I mean by transient data: if it's there, that's great; if not, I can re-generate it (cache) or I don't care, because if the database goes down, then the data is useless on restart anyway. Cheers, M
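A minimal sketch of the "permanent work table" pattern described above, assuming an initialization step that runs once at server start, before application connections arrive (all names are illustrative):

```sql
-- An ordinary table that holds only transient, regenerable data.
CREATE TABLE session_cache (
    session_id text PRIMARY KEY,
    payload    text,
    created_at timestamptz NOT NULL DEFAULT now()
);

-- Run from a startup script on every database start:
TRUNCATE session_cache;
```

Because the table is a normal table, foreign keys and other constraints work against it, which is exactly what the memcached approach cannot offer.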
Re: [HACKERS] gcc: why optimize for size flag is not the default
On Mar 11, 2009, at 3:18 PM, Grzegorz Jaskiewicz wrote: On 11 Mar 2009, at 13:51, Marko Kreen wrote: Linux kernel is moving to use -Os everywhere. AFAIK their argument is that kernel code should not be doing anything CPU-intensive, thus minimal cache usage is more important than unrolled loops. This also seems to hint that -Os is not really appropriate for Postgres. Although it would be a good fit for e.g. PgBouncer. While it might be right in the case of the linux kernel (which I won't totally agree with personally), I don't see any reason to compare it with postgresql. The kernel is extensively used by everything in the system, hence their reasoning. Postgresql is an application. MacOS X defaults to and recommends -Os with the rationale that smaller code causes less paging and less CPU instruction cache thrashing. http://developer.apple.com/ReleaseNotes/DeveloperTools/RN-GCC3/index.html For deployment builds, the recommended setting is -Os, which produces the smallest possible binary size. Generally, a binary that's smaller is also faster. That's because a large application spends much of its time paging its binary code in and out of memory. The smaller the binary, the less the application needs to page. For example, say a binary uses aggressive function inlining. That binary saves time with fewer function calls, but it could easily spend far more time paging the binary code containing those inlined functions in and out of memory. -Os Optimize for size. -Os enables all -O2 optimizations that do not typically increase code size. It also performs further optimizations designed to reduce code size. -Os is still optimizing, but using a slightly different heuristic as to what optimization means. That said, if postgresql is paging out, the DBA probably has postgresql or the server misconfigured. Cheers, M
Re: [HACKERS] Prepping to break every past release...
On Mar 4, 2009, at 6:07 PM, Josh Berkus wrote: Andrew, Back on that track, I'd like to see a facility whereby we could provide an alias (or synonym, to use a nearby subject) for columns and other objects. That would help to overcome naming glitches without breaking things quite so much. Believe it or not, a large PostgreSQL user in LA just buttonholed me about that particular feature idea at SCALE. So it might be generally useful as well -- not just for the system catalogs, but to allow businesses with long-use databases to manage change over time. Schema change is a strong motivator for applications to access the database through views and functions only. A column with multiple names would likely make it *more* painful to migrate schemata. Cheers, M
Re: [HACKERS] QuickLZ compression algorithm (Re: Inclusion in the PostgreSQL backend for toasting rows)
On Jan 5, 2009, at 1:16 PM, Stephen R. van den Berg wrote: Upon reading the DFSG, it seems you have a point... However... QuickLZ is dual licensed: a. Royalty-free-perpetuous-use as part of the PostgreSQL backend or any derived works of PostgreSQL which link in *at least* 50% of the original PostgreSQL codebase. How does one even define 50% of the original PostgreSQL codebase? What nonsense. -M
Re: [HACKERS] PostgreSQL future ideas
On Sep 25, 2008, at 5:50 PM, Chris Browne wrote: [EMAIL PROTECTED] (Gevik Babakhani) writes: An advantage of C++ is that it would reduce a lot of the OO code written in C in PostgreSQL, but it is a big effort for a small gain. It will increase the number of bugs. Do not forget also that C++ compilers are not so common (or so good) on different platforms. If somebody is interested in that, yes, but like a fork (PostgreSQL++ :-). Reducing OO code that is written in C is one of my major interests. After some investigating myself, it appears that having the codebase fully rewritten in C++ will have an impact on the performance. So I guess such an effort will result in the code being more C++ish rather than fully OO: a mixture of C with some OO taste. I'm not convinced that it would be a good idea at all to make the system fully OO, nor that C++ would be a meaningful tool to use to that end. After all, C++ can certainly be used in decidedly non-OO ways. For instance, the STL is NOT an OO framework, and the author of the STL, obviously something of a fan of C++, characterizes OO as almost as much of a hoax as Artificial Intelligence. http://en.wikipedia.org/wiki/Object-oriented_programming#Criticism I tend to agree with that characterization. Further, C++ suffers from the same not-OO-at-its-base problem as Java, which contributes complexity as well as hurting the OO-ness of it. Better idea is to start to use C99 in PostgreSQL ;-). I have not investigated this yet, but I am very interested to know what the advantages would be of upgrading the code to C99 standards. It would give us heartburn on any platforms where the preferred compiler doesn't grok C99, for sure. As much as I'm OK with using GCC, it would seem unfortunate to force people into using GCC everywhere, and preclude using other compilers. (And actually, I'm more ambivalent about GCC than that; I'm not totally happy with how GCC has gone, but that's another tale for another day...) 
Speaking of language choice, no one said that _all_ the source code would need to be rewritten. It would be nice, for example, if PostgreSQL rewrote the current GUC system with a glue language like Lua (which is also very C-like). Cheers, M
Re: [HACKERS] Proposal of SE-PostgreSQL patches (for CommitFest:Sep)
On Sep 24, 2008, at 2:38 PM, Josh Berkus wrote: Peter, Yeah, but do we even have the slightest bit of information about what exactly would be required to achieve the required levels? And whether this patch does it? And whether there would be alternative, more desirable ways to achieve a similar level? Actually, I have some direct communication that SEPostgres will lead to PostgreSQL being widely adopted in at least one US government agency. Can't say more, of course. ;-) And the consideration is predicated on the PostgreSQL community accepting the patch? This sounds like an opportunity for one of the numerous enterprise spin-offs. Cheers, M
Re: [HACKERS] Proposal of SE-PostgreSQL patches (for CommitFest:Sep)
On Sep 19, 2008, at 1:42 PM, Robert Haas wrote: It's too early to vote. :-) The second and third options have a prerequisite. The purpose of them is to match the granularity of access controls provided by SE-PostgreSQL and native PostgreSQL. However, I have not seen a clear reason why these different security mechanisms have to have the same granularity in access controls. Have you seen a clear reason why they should NOT have the same granularity? I realize that SELinux has become quite popular and that a lot of people use it - but certainly not everyone. There might be some parts of the functionality that are not really severable, and if that is the case, fine. But I think there should be some consideration of which parts can be usefully exposed via SQL and which can't. If the parts that can be are independently useful, then I think they should be available, but ultimately that's a judgment call and people may come to different conclusions. If the SE-PostgreSQL effort simply implemented SQL interfaces to increase security granularity, it wouldn't be SE-PostgreSQL at all. SE-PostgreSQL integrates with a set of optional system-wide access controls and is only marginally related to SQL-level database features. Since it relies on an optional component, it doesn't really make sense that anyone is surprised that the patch doesn't improve the security granularity of vanilla PostgreSQL. Cheers, M
Re: [HACKERS] Protocol 3, Execute, maxrows to return, impact?
On Jul 28, 2008, at 1:54 PM, Dave Cramer wrote: On 28-Jul-08, at 12:45 PM, Stephen R. van den Berg wrote: Dave Cramer wrote: On 27-Jul-08, at 3:00 PM, Stephen R. van den Berg wrote: Stephen R. van den Berg wrote: The driver beats libpq in speed by about 62%. Anyone interested in taking a peek at the (GPL copyright) driver, I temporarily put up a small package which contains the working driver in Pike at: http://admin.cuci.nl/psgsql.pike.tar.gz This is very exciting news, I'd love to look at it. Is there any way it could be re-licensed so that it can be incorporated into, say, the JDBC driver? Since I wrote it, I can relicense it any which way I want. What kind of license would you like to have? As Joshua mentioned, BSD is the preferred postgresql license. As I understand it, I can't even look at your code and subsequently use anything in the JDBC driver. The GPL does not cover implementation ideas, not to mention that the author just described the implementation. Furthermore, one could not take anything from the Pike driver for the JDBC driver because it is a completely different language. It seems like you have confused the GPL with an NDA. Anyway, what does "The driver beats libpq in speed by about 62%" mean? Cheers, M
Re: [HACKERS] digest
On Jul 10, 2008, at 1:20 PM, Fabrízio de Royes Mello wrote: set pgsql-hackers digest Postgresql hackers have been successfully digested. *burp* -M
Re: [HACKERS] MERGE Specification
On Apr 22, 2008, at 1:47 PM, Simon Riggs wrote: On Mon, 2008-04-21 at 22:27 -0400, Gregory Stark wrote: Simon Riggs [EMAIL PROTECTED] writes: Unrelated to rule processing, you did read the bit about MERGE and race conditions? ISTM that MERGE as it stands isn't very useful for anything other than large data loads since it's going to cause problems if used concurrently. If there are race conditions, what advantage does it offer over writing plpgsql or client code to do it? That's an excellent question. I'm not trying to sell you anything here. MERGE is a SQL Standard command, supported by Oracle, DB2, SQLServer etc, so there is reason enough to implement it. There may be also reasons to implement other syntaxes, other behaviours, which would be non-standard. If people want the latter first/second/not at all then please speak; it's not an either-or situation. I would expect MERGE to be slightly faster than a well coded PL/pgSQL function, but there won't be too much in it. It will allow the statement to be more easily parallelised in the form it currently takes, I would note. I thought the whole advantage of having a built-in command is that it could do the kind of locking our unique constraints do to avoid race conditions. As I've said elsewhere, we could have it lock each row; it's just more overhead if we do, and not necessary at all for bulk data merging. I was hoping to use MERGE alongside the other standard DML. Its purpose is to set the truth regardless of past states. Why should it be relegated to the bulk-loading basement alongside COPY? Cheers, M
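For comparison, the hand-coded PL/pgSQL upsert that MERGE is being measured against typically looks like this sketch (table, column, and function names are illustrative); without the row locking discussed above, it is subject to exactly the race in question:

```sql
CREATE FUNCTION merge_counter(k text) RETURNS void AS $$
BEGIN
    UPDATE counters SET n = n + 1 WHERE key = k;
    IF NOT FOUND THEN
        -- A concurrent session can insert the same key between the UPDATE
        -- and this INSERT, so callers must be prepared to retry on a
        -- unique_violation error.
        INSERT INTO counters (key, n) VALUES (k, 1);
    END IF;
END;
$$ LANGUAGE plpgsql;
```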
Re: [HACKERS] MERGE Specification
On Apr 21, 2008, at 4:08 PM, Simon Riggs wrote: The following two files specify the behaviour of the MERGE statement and how it will work in the world of PostgreSQL. In places, this supersedes my recent commentary on MERGE, particularly with regard to triggers. Neither of these files is intended for CVS. The HTML file was generated from SGML source, though the latter is not included here for clarity. The test file shows how I expect a successful test run to look when a regression test is executed with a working version of the final MERGE patch applied. It has behavioural comments in it also, to make it slightly more readable. If anybody has any questions, ask them now please, before I begin detailed implementation. MERGE will not invoke Rules. Does this imply that MERGE cannot be used on views, or that the resulting INSERTs or UPDATEs do not work on views? Cheers, M
Re: [HACKERS] An idea for parallelizing COPY within one backend
On Feb 27, 2008, at 9:11 AM, Florian G. Pflug wrote: Dimitri Fontaine wrote: Of course, the backends still have to parse the input given by pgloader, which only pre-processes data. I'm not sure having the client prepare the data some more (binary format or whatever) is a wise idea, as you mentioned and wrt Tom's follow-up. But maybe I'm all wrong, so I'm all ears! As far as I understand, pgloader starts N threads or processes that open up N individual connections to the server. In that case, moving the text-binary conversion from the backend into the loader won't give any additional performance, I'd say. The reason that I'd love some within-one-backend solution is that it'd allow you to utilize more than one CPU for a restore within a *single* transaction. This is something that a client-side solution won't be able to deliver, unless major changes to the architecture of postgres happen first... It seems like multiple backends should be able to take advantage of 2PC for transaction safety. Cheers, M
Re: [HACKERS] FW: bitemporal functionality for PostgreSQL
On Feb 1, 2008, at 10:42 AM, Luke Porter wrote: All Is there an interest in developing bitemporal functionality in PostgreSQL? Regards Luke I can only speak for myself, but: definitely! Based on the googling I did on "bitemporal database", I kind of do this already with PostgreSQL. Some of my tables are insert-only and each row includes a "committed time" timestamp. That way, I don't need a separate audit log table, and fixing someone's mistake is as simple as copying old rows. The downside to this is that I need a view to represent the current truth, and calculating the truth is more expensive than a simple table would be. Can you explain in more detail or provide references to how PostgreSQL could potentially handle temporal data better? One idea I had would be to blow the transaction ID up to 128 bits (no more wrapping!) and have it represent the nanoseconds since the epoch. Cheers, M
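The insert-only pattern described above can be sketched like this (table and column names are hypothetical): every change is a new row stamped with its commit time, and a view derives the "current truth" as the newest row per key:

```sql
-- Insert-only history table: corrections are new rows, never updates.
CREATE TABLE account_history (
    account_id integer NOT NULL,
    balance    numeric NOT NULL,
    committed  timestamptz NOT NULL DEFAULT now()
);

-- "Current truth": the most recently committed row for each account.
-- This view is the extra computation the email mentions as the downside.
CREATE VIEW account_current AS
    SELECT DISTINCT ON (account_id) account_id, balance
    FROM account_history
    ORDER BY account_id, committed DESC;
```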
Re: [HACKERS] Naming of the prefab snowball stemmer dictionaries
On Aug 22, 2007, at 11:10 , Tom Lane wrote: I notice that the existing tsearch documentation that we've imported fairly consistently refers to Snowball dictionaries with names like en_stem, ru_stem, etc. However, CVS HEAD is set up to create them with names english, russian, etc. As I've been absorbing more of the docs I'm starting to wonder whether this is a good idea. ISTM that these names encourage a novice to think that the one dictionary is all you could need for a given language; and there are enough examples of more-complex setups in the docs to make it clear that in fact Snowball is not the be-all and end-all of dictionaries. I'm thinking that going back to the old naming convention (or something like it --- maybe english_stem, russian_stem, etc) would be better. It'd help to give the right impression, namely that these dictionaries are a component of a solution but not necessarily all you need. Please use ISO 639 codes plus any qualifiers to reduce confusion. http://en.wikipedia.org/wiki/List_of_ISO_639-1_codes -M
Re: [HACKERS] no cascade triggers?
On Jun 26, 2007, at 10:04 , Chris Mair wrote: Hello, On the Italian list we're discussing a case where a user reportedly worked around this (i.e. got rid of unwanted cascading calls) by writing an on-insert trigger procedure something along the lines of: ALTER TABLE tab DISABLE TRIGGER USER; -- do more inserts into the same table ALTER TABLE tab ENABLE TRIGGER USER; While this reportedly worked well in 8.2.1, it does not in 8.2.4, resulting in an error: ERROR: relation distinta_base1 is being used by active queries in this session SQL state: 55006 Now -- while we agree that disabling a trigger from inside itself is a somewhat strange thing to do, we cannot see a good and easy solution to the problem (of avoiding cascading trigger calls). General question: would a no cascade clause for triggers be a todo item? Instead of putting a trigger on the table, put a rule+trigger on a wrapper view. Cheers, M
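For the record, later PostgreSQL releases grew a direct answer to the no-cascade question: pg_trigger_depth() (available in 9.2 and up) reports how many trigger levels deep the current statement is, so a trigger can simply skip its body when fired by its own cascaded insert. A minimal sketch, with all table and column names invented:

```sql
-- Hypothetical trigger that would otherwise recurse: it inserts into
-- the same table it is attached to. The pg_trigger_depth() guard
-- (PostgreSQL 9.2+) breaks the cascade without disabling the trigger.
CREATE OR REPLACE FUNCTION tab_ins_trig() RETURNS trigger AS $$
BEGIN
    IF pg_trigger_depth() > 1 THEN
        RETURN NEW;  -- fired by a cascaded insert: do nothing
    END IF;
    -- do more inserts into the same table
    INSERT INTO tab (val) VALUES (NEW.val || '-copy');
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER tab_ins AFTER INSERT ON tab
    FOR EACH ROW EXECUTE PROCEDURE tab_ins_trig();
```

This avoids both the ALTER TABLE ... DISABLE TRIGGER workaround and its 55006 failure mode, since no catalog change is attempted mid-statement.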
Re: [HACKERS] My honours project - databases using dynamically attached entity-properties
On Mar 15, 2007, at 11:31 , Ron Mayer wrote: Josh Berkus wrote: And then what? dynamically construct all your SQL queries? Sure, sounds like a simple solution to me... Not to mention DB security issues. How do you secure your database when your web client has DDL access? So, Edward, the really *interesting* idea would be to come up with a secure, normalized way to do UDFs *without* EAV tables. People would be very impressed. I have a system with many essentially user-defined fields, and was thinking of creating something similar to an Array type and writing some GiST indexes for it. My current workaround is to store them as a YAML document and use tsearch to index it (with application logic to further refine the results) - but an EAV datatype that could be put in tables and effectively indexed would be of quite a bit of interest here. And yes, a better way to do UDFs would be even cooler. Out of all the databases that I have used, PostgreSQL offers the most flexible DDL - mostly for one reason: it can operate within transactions. To handle arbitrary strings as column identifiers, the column names could actually be stripped down to lower-case letters and the real title could be stored in a separate table or as column comments. Mr. Berkus' concern regarding the security implications is already handled by privilege separation or security-definer functions. The OP's concern about the difficulty of querying a schema structure is alleviated via any number of APIs in Perl, JDBC, etc. It seems to me that PostgreSQL is especially well-suited to run DDL at runtime, so what's the issue? -M
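The transactional-DDL point above can be made concrete; every identifier in this sketch is invented for illustration:

```sql
-- Sketch: add a user-defined field at runtime. Because PostgreSQL DDL
-- is transactional, the column, its human-readable title, and its
-- comment either all appear together or not at all.
BEGIN;

ALTER TABLE user_data ADD COLUMN fav_color text;

-- Map the sanitized column name back to the user's arbitrary title.
INSERT INTO column_titles (table_name, column_name, title)
VALUES ('user_data', 'fav_color', 'Favorite Color!');

COMMENT ON COLUMN user_data.fav_color IS 'Favorite Color!';

COMMIT;
```

If any step fails, the rollback removes the half-created field cleanly, which is the property most other databases' non-transactional DDL cannot offer.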
Re: [HACKERS] COMMIT NOWAIT Performance Option
On Mar 3, 2007, at 23:19 , Robert Treat wrote: A similar idea we've been kicking around would be having a set storage parameter = nologging option for ALTER TABLE which would, as its name implies, cause the system to ignore writing WAL logs for the table, much like it does for temp tables now. One caveat would be that you would probably need to forbid such a table from being the parent side of an FK relationship, but otherwise this should be fairly safe even for replay, since ALTER TABLE needs an exclusive lock, so you have finite points where data for the table would be written or ignored. I can't think of any reason why a global temp table couldn't be referenced by another global temp table, but maybe I'm missing something. Whenever postgres starts, it would simply truncate the tables to ensure temporary compliance (the no-foot-gun approach). I could then place such tables in a ramdisk tablespace and make PostgreSQL a transaction-safe memcached replacement. This would also be great for materialized views. Cheers, M
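For the record, this idea did eventually land in core: PostgreSQL 9.1 added UNLOGGED tables, which skip WAL exactly as proposed and are truncated after a crash (the "no-foot-gun approach" above). A minimal sketch with an invented table:

```sql
-- PostgreSQL 9.1+: no WAL is written for this table; it is emptied
-- automatically after an unclean shutdown and is not replicated.
CREATE UNLOGGED TABLE session_cache (
    key   text PRIMARY KEY,
    value text,
    seen  timestamptz DEFAULT now()
);
```

Combined with a tablespace on fast storage, this is close to the "transaction-safe memcached replacement" the post imagines.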
Re: [HACKERS] COMMIT NOWAIT Performance Option
On Feb 26, 2007, at 18:58 , Simon Riggs wrote: On Mon, 2007-02-26 at 23:25 +0000, Richard Huxton wrote: Simon Riggs wrote: Proposal: Implement a new option for COMMIT, for enhancing performance, providing a MySQL-like trade-off between performance and robustness for *only* those that want it. COMMIT NOWAIT Isn't the domain of transient data relegated to certain tables instead of specific transactions? Wouldn't it be easier to create un-WAL-logged global temp tables? -M
Re: [HACKERS] [Monotone-devel] Re: SCMS question
On Feb 23, 2007, at 11:24 , Andreas Pflug wrote: It probably _can_ never happen, because that would have to be a one-for-all solution, embracing both centric and distributed repositories, combining contradictory goals. So the first question to answer is: Will PostgreSQL continue with a single repository (the project was managed very successfully this way for a long time), or try a distributed approach. IMHO the facts would vote for a central repository, which would drastically reduce SCM candidates. Any distributed SCM can inherently be used as a central repo. The project leaders would merely designate one place from whence builds are generated and developers would simply sync with the central repo. In fact, distributed SCMs fit the open source development model better because any failure of the central repo (crash, buy-out, sabotage, needing a project fork) cannot cause chaos: a new central repo is designated - essentially an instant failover. Cheers, M
Re: [HACKERS] Three weeks left until feature freeze
On Thu, July 13, 2006 11:03 am, Jonah H. Harris wrote: This is my point exactly. As with many things, we keep skirting the real issue by going with an "improve the smaller component" approach such as "promote pgfoundry more". I have never seen this approach work, but maybe someone has an example of another OSS project that has successfully excluded major components like this? Perl? CPAN? Many modules are included but how they are chosen is somewhat arbitrary. However, those modules can be updated from CPAN (without redownloading Perl).
Re: [HACKERS] Fixed length datatypes. WAS [GENERAL] UUID's as
On Wed, June 28, 2006 1:14 pm, [EMAIL PROTECTED] wrote: On Wed, Jun 28, 2006 at 06:39:16PM +0200, Thomas Hallgren wrote: Instead, I would like to humbly request the inclusion of a UUID datatype (or an opaque 128-bit datatype) in the core package. It's increasingly common and some databases (MS SQL Server) already have built-in support for it. We have it. We're just not putting in the effort required to have it included in core, as it's too much effort to convince people that the type has value, that it is generic, and would be widely used without being abused. All the geometric types that I'll never use in core, with few or no uses, including functions to operate on these types, and no UUID type... Hehe... To me, that's irony... :-) Is it on pgfoundry? From past discussions, the new criterion for getting something into core is to first determine if it is successful on pgfoundry.
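For the record, this request was eventually granted: a core uuid type arrived in PostgreSQL 8.3, with generation functions first in contrib (uuid-ossp, pgcrypto) and, from PostgreSQL 13, in core itself. A sketch with an invented table:

```sql
-- PostgreSQL 8.3+ has a core 128-bit uuid type.
-- gen_random_uuid() is core in 13+; in earlier versions it is
-- provided by the pgcrypto extension.
CREATE TABLE widget (
    id   uuid PRIMARY KEY DEFAULT gen_random_uuid(),
    name text
);
```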
Re: [HACKERS] vacuum row?
On Mon, June 26, 2006 9:37 am, Mark Woodward wrote: On 6/24/06, Mark Woodward [EMAIL PROTECTED] wrote: I originally suggested a methodology for preserving MVCC and everyone is confusing it as update in place, this is not what I intended. Actually, you should've presented your idea as performing MVCC the way Firebird does... the idea is basically the same. Doing some research never hurts... especially with this crowd. Is it really necessary to make personal comments like this? Let's discuss ideas, not personalities or people. The whole issue was how to address updates steadily degrading performance. I wanted to brainstorm the issue and find a solution. I tossed out a first guess at an algorithm to start the ball rolling. Was it perfect? No. Was it intended to be? No. It was intended to spark a discussion, get people, first, to recognize the problem, and then to think about possible solutions. I find that this, while chaotic, usually finds the best solutions. There are a lot of good and smart people here who understand this process and see it for what it is. Unfortunately, some don't. It isn't about research, per se, because it is assumed that we all know the various issues involved to some degree. It is about using the collective knowledge of the group and coming up with an answer. Actually, it is. There are plenty of databases that don't need an expensive separate process to clean out dead space, so it would be wise to see how those alternatives handle MVCC to see what this project can glean from others' work - that is the point of open source. Firebird, for example, has a half-dozen articles on how it handles MVCC and dead tuples. In particular, it puts the vacuum burden on the sessions and fires off a separate cleanup (sweep) thread. After all, what will know better than the transaction itself on what needs to be cleaned up?
Linking goodness: http://www.firebirdsql.org/doc/whitepapers/fb_vs_ibm_vs_oracle.htm http://www.ibphoenix.com/main.nfs?a=ibphoenixpage=ibp_expert4 -M
Re: [HACKERS] vacuum, performance, and MVCC
On Fri, June 23, 2006 9:56 am, Martijn van Oosterhout wrote: On Fri, Jun 23, 2006 at 03:30:45PM +0200, Csaba Nagy wrote: It sounds like you have a big problem and you need a big solution. Well, Postgres does a decent job as it is. The problem is under peak load, sometimes it gets bogged down and the usual things like vacuum will not help immediately. I think a few more features like the dead space map for quick vacuum and even something like the original post's proposition would make postgres fly under heavy load too... I know there have been a number of suggestions in the past to deal with this thing. Some I don't remember being mentioned in this thread are: - Once a tuple has been determined to be invisible to everyone, truncate it to just the header. This would probably work wonders for frequently updated wide tables. However, this requires keeping track of the oldest active xact; I'm not sure how that works at the moment. - Have the bgwriter do cleanup work before writing out a block. It could probably do the truncation bit above, but totally removing old tuples requires cleaning out the indexes too, which AIUI is the hard part of vacuuming. One totally whacked-out idea I just thought of: Instead of just truncating tuples when they're invisible, mark them dying and make the data section store an array of CTIDs pointing to the index tuples pointing to it. Lookups that find the tuple via an index could store the CTID of the index tuple before continuing. If the bgwriter sees it has a full set, it can efficiently remove the tuple straight away. There are of course drawbacks to this approach; you'd probably need something like the half-dirty pages to avoid a large increase in write load. If it's even beneficial at all given concurrency issues. A lot of these recommendations sound like garbage collection ideas found in modern programming languages.
Perhaps it would be worth considering allowing sessions to keep track of which pages they alter and spawn a separate process per connection to sweep up slowly along the way. Also, it's nice that vacuum now has slow-down settings, but why isn't there an option to autovacuum during periods of idleness and pause when busy? -M
Re: [HACKERS] let's meet
On Thu, June 22, 2006 1:42 pm, Josh Berkus wrote: Marc, Sorry folks, my fault ... hit the 'accept' button too fast So, was Wilbur really as attractive as she claimed? In all seriousness, it's actually a pretty clever spam ploy. 1) Spam mailing lists of nerdy, desperate guys with thinly-veiled fake solicitations for sex. 2) Thousands of naive mailing list subscribers respond to the email. 3) Email addresses of said desperate men are harvested and sold. 4) Aforementioned men are spammed with ads for B16 p3n15 pills. 5) ??? 6) Profit! Best regards, Wilbur
Re: [HACKERS] That EXPLAIN ANALYZE patch still needs work
It would be nice to keep the gettimeofday()s wherever they are most useful on hardware/software where they are cheap. Perhaps a compile-time option? On Fri, June 9, 2006 11:18 am, Martijn van Oosterhout wrote: On Fri, Jun 09, 2006 at 10:00:20AM -0400, Tom Lane wrote: To tell you the truth, this information makes me even less pleased with the sampling-gettimeofday patch than I was before. If gettimeofday() in itself increases the runtime of a node by a factor of 10, then just trying to subtract off that time is no solution. There's too much impact on surrounding nodes, and too much roundoff error anyhow. I had thought we were applying an order-of-ten-percent correction by subtracting SampleOverhead, not an order-of-10x correction :-( Eh? The whole point is to call gettimeofday() much less often. If you call it 1000th as often, then the correction is only on the order of one hundredth of the normal query time... Subtracting SampleOverhead is only a correction on the order of a few percent, it's the reduced calling of gettimeofday() that provides the benefit. Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ From each according to his ability. To each according to his ability to litigate.
Re: Ending EXPLAIN ANALYZE early (was Re: [HACKERS] That EXPLAIN
I think what he meant was a separate EXPLAIN-CANCEL message on a cancel-type connection, which would be completely backwards compatible. Old clients simply wouldn't be able to use the special EXPLAIN cancel, just like it is now. On Thu, June 8, 2006 3:01 pm, Tom Lane wrote: Gregory Stark [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] writes: Would it be possible to make a whole new protocol message for EXPLAIN results? I'm really unwilling to get into that. For one thing, that would absolutely positively break *all* use of EXPLAIN from un-fixed clients, whether you were trying to stop early or not. The reason I'm suggesting that is because it might make it easier to implement the SIGINFO handler that sends incremental EXPLAIN results on demand that I was describing earlier. Doesn't matter, because that's not happening ;-) SIGINFO isn't portable, and even if it were, a signal handler couldn't possibly generate EXPLAIN output (remember those catalog accesses). regards, tom lane
Re: [HACKERS] [PATCHES] Implementing RESET CONNECTION ...
To complete the connection pooling for multiple users, it would be great to have a protocol-level option to change roles semi-permanently (to reduce permissions). RESET SESSION AUTHORIZATION would then bounce back to that (new, set) role until another protocol-level role rollback. This would allow completely reusable connections per database while maintaining a real sandbox for each connection. On Tue, April 25, 2006 10:19 am, Bruce Momjian wrote: OK, what would people like done with this patch? Our TODO list has: * -Add RESET CONNECTION command to reset all session state This would include resetting of all variables (RESET ALL), dropping of temporary tables, removing any NOTIFYs, cursors, open transactions, prepared queries, currval()s, etc. This could be used for connection pooling. We could also change RESET ALL to have this functionality. The difficulty of this feature is allowing RESET ALL to not affect changes made by the interface driver for its internal use. One idea is for this to be a protocol-only feature. Another approach is to notify the protocol when a RESET CONNECTION command is used. This patch does everything except reset currval(), but the big missing item is that it doesn't handle the protocol issues outlined in the TODO item. However, there also has been very little discussion on exactly how the protocol stuff would work. Should we add it for 8.2 and see if we get any problem reports?
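For the record, most of this TODO item was later satisfied by DISCARD ALL, added in PostgreSQL 8.3. A sketch of how a pooler would use it today (the role name is invented; note that RESET SESSION AUTHORIZATION returns to the login role, not to a pool-set role as proposed above):

```sql
-- PostgreSQL 8.3+: reset session state before reusing a pooled
-- connection. Covers GUCs, temporary tables, prepared statements,
-- cursors, and session authorization.
DISCARD ALL;

-- Rough SQL-level analogue of the role-sandbox idea in the post:
SET SESSION AUTHORIZATION app_user;  -- pool hands the connection out
RESET SESSION AUTHORIZATION;         -- pool takes it back
```

The protocol-level "semi-permanent" role change the post asks for - one a client inside the sandbox cannot undo - is still not what SET SESSION AUTHORIZATION provides, which is why the sandbox here is only advisory.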
Re: [HACKERS] control pg_hba.conf via SQL
Could postgres offer at least a read-only view of the data in the interim? Ordering could be controlled by line number. On Thu, March 30, 2006 10:14 am, Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: Tom Lane wrote: If your pg_hba.conf looks like host all all 0.0.0.0/32 md5 there's not much call to update it dynamically ... There'll be a call to update it once - to 0.0.0.0/0 ;-) Doh ;-). Should make more effort to check my throwaway examples ... But it's not clear to me why a CONNECT right shouldn't encompass all the things that hba does, i.e. connect method, source address and auth method. Because that stuff doesn't fit into either the syntax of GRANT or the system tables that store grant information. It's talking about concepts that don't even exist in the SQL world (while users and databases certainly do). Also, we know from experience that there's value in applying an ordered set of tests in pg_hba.conf --- in particular, rules about local vs local net vs anywhere connections are most easily expressed that way. We would need some substitute rule or concept in order to do the same work in GRANT, and I don't see what that would be. Recently in another thread someone was remarking about how ugly MySQL's authentication methods are. I think that's in part because they have chosen to wedge the client hostname into their concept of user. It doesn't fit nicely. regards, tom lane
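The read-only view suggested at the top of this message did eventually appear: PostgreSQL 10 added the pg_hba_file_rules system view, with ordering by line number exactly as proposed:

```sql
-- PostgreSQL 10+: inspect the parsed pg_hba.conf without touching
-- the file; rules are reported in their on-disk order.
SELECT line_number, type, database, user_name, address, auth_method
FROM pg_hba_file_rules
ORDER BY line_number;
```

Writing to pg_hba.conf via SQL, by contrast, never made it in, for the GRANT-mismatch reasons Tom Lane gives above.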
Re: [HACKERS] [NOVICE] Question on TRUNCATE privleges
The author doesn't mention why he got a 600x increase- perhaps he bypassed the delete triggers which was OK for his situation. I don't like the notion that an optimization requires additional privileges...why not detect an unqualified delete and call truncate instead IFF there are no delete triggers on the table? I'm not entirely sure that requiring ownership of the table is the appropriate restriction for TRUNCATE. It made some sense back when TRUNCATE wasn't transaction-safe, but now that it is, you could almost argue that ordinary DELETE privilege should allow TRUNCATE. Almost. The hole in the argument is that TRUNCATE doesn't run ON DELETE triggers and so it could possibly be used to bypass things the table owner wants to have happen. You could equate TRUNCATE to DROP TRIGGER(s), DELETE, CREATE TRIGGER(s) ... but DROP TRIGGER requires ownership. CREATE TRIGGER only requires TRIGGER privilege which is grantable. So one answer is to change DROP TRIGGER to require TRIGGER privilege (which would mean user A could remove a trigger installed by user B, if both have TRIGGER privileges on the table) and then say you can TRUNCATE if you have both DELETE and TRIGGER privileges. It looks to me like the asymmetry between CREATE TRIGGER and DROP TRIGGER is actually required by SQL99, though, so changing it would be a hard sell (unless SQL2003 fixes it?). Comments anyone? regards, tom lane
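For the record, PostgreSQL 8.4 settled this debate a different way: TRUNCATE became its own grantable privilege, rather than being derived from DELETE plus TRIGGER or tied to ownership. A sketch with invented names:

```sql
-- PostgreSQL 8.4+: TRUNCATE is a distinct, explicitly grantable right.
GRANT TRUNCATE ON TABLE log_entries TO batch_loader;

-- ON DELETE triggers are still bypassed by TRUNCATE, so the grant is
-- an explicit statement of trust by the table owner - which resolves
-- the "bypass the owner's triggers" objection raised above.
```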
[HACKERS] Documentation: Fast Backward/Forward
A simple documentation enhancement request: please provide Fast Backward/Forward links at the bottom of the page as well.
[HACKERS] Errors uniquely identified
There doesn't seem to be any way to get error reporting with unique identifiers for each type of error (or warning or notice). Is this something planned for the future? Would this be a daunting task because the various messages are strewn throughout the sources? Would centralized error strings (and ids) be appreciated for easier automated parsing? Just some thoughts... I might offer to tackle this unless there is some reason not to do so.
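Much of this request is answered today by PostgreSQL's SQLSTATE scheme: every error carries a five-character machine-readable code (catalogued in Appendix A of the documentation), even though the message strings themselves remain scattered through the sources. A sketch of catching one by its code (the DO block needs PostgreSQL 9.0+; the table name is invented):

```sql
-- The error's unique identifier is its SQLSTATE; undefined_table
-- corresponds to code 42P01.
DO $$
BEGIN
    INSERT INTO nonexistent_table VALUES (1);
EXCEPTION WHEN undefined_table THEN
    RAISE NOTICE 'caught SQLSTATE %', SQLSTATE;  -- reports 42P01
END;
$$;
```

Client libraries expose the same code (e.g. via the error's SQLSTATE field in libpq), so automated parsing does not need to match message text.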