Re: [HACKERS] [COMMITTERS] pgsql: Add restart_after_crash GUC.
On Tue, Jul 20, 2010 at 9:47 AM, Robert Haas wrote:
> Log Message:
> ---
> Add restart_after_crash GUC.

In postgresql.conf.sample, on/off is used for boolean values, but true/false is used for exit_on_error and restart_after_crash. Sorry, I overlooked that inconsistency when reviewing the original patch. I have attached the bug-fix patch.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

true_false_to_on_off_0727.patch

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
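[For context: PostgreSQL boolean GUCs accept several spellings (on/off, true/false, yes/no, 1/0), but postgresql.conf.sample conventionally shows on/off. A sketch of the inconsistency the patch fixes; the surrounding lines are illustrative, not the exact sample-file contents:]

```
# postgresql.conf.sample convention: boolean settings shown as on/off
#autovacuum = on

# The two new settings were documented with true/false instead:
#exit_on_error = false
#restart_after_crash = true

# After the fix, consistent with the rest of the file:
#exit_on_error = off
#restart_after_crash = on
```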
Re: [HACKERS] Synchronous replication
On Wed, Jul 21, 2010 at 4:36 PM, Fujii Masao wrote:
>> I was actually hoping to see a patch for these things first, before any of
>> the synchronous replication stuff. Eliminating the polling loops is
>> important, latency will be laughable otherwise, and it will help the
>> synchronous case too.
>
> First, note that the poll loops in the backend and walreceiver don't
> exist without the synchronous replication stuff.
>
> Yes, I'll start with the change to the poll loop in the walsender. I'm
> thinking that we should make the backend signal the walsender to send the
> outstanding WAL immediately, as the synchronous replication patch I
> submitted last year did. I use a signal here because the walsender needs
> to wait for the request from the backend and the ack message from the
> standby *concurrently* in synchronous replication. If we used a semaphore
> instead of a signal, the walsender would not be able to respond to the
> ack immediately, which would also degrade performance.
>
> The problem with this idea is that a signal can be sent for every
> transaction commit. I'm not sure whether such frequent signaling really
> harms replication performance. When I benchmarked the previous synchronous
> replication patch based on this idea, AFAIR the results showed no impact
> from the signaling. Thoughts? Do you have a better idea?

The attached patch changes the backend so that it signals the walsender to wake up from its sleep and send WAL immediately. It doesn't include any other synchronous replication stuff. The signal is sent right after a COMMIT, PREPARE TRANSACTION, COMMIT PREPARED or ABORT PREPARED record has been fsync'd. To suppress redundant signaling, I added a flag indicating whether the walsender is ready to send WAL up to the currently-fsync'd location. Only when the flag is false does the backend set it to true and send the signal to the walsender; when the flag is already true, no signal needs to be sent.
The flag is set to false right before the walsender sends WAL.

The code is also available in my git repository:
git://git.postgresql.org/git/users/fujii/postgres.git
branch: wakeup-walsnd

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

change_poll_loop_in_walsender_0727.patch
Re: [HACKERS] Synchronous replication
On Mon, Jul 26, 2010 at 8:25 PM, Robert Haas wrote:
> On Mon, Jul 26, 2010 at 6:48 AM, Marko Tiikkaja wrote:
>> On 7/26/10 1:44 PM +0300, Fujii Masao wrote:
>>> On Mon, Jul 26, 2010 at 6:36 PM, Yeb Havinga wrote:
>>>> I wasn't entirely clear. My suggestion was to have only
>>>> acknowledge_commit = {no|recv|fsync|replay} instead of
>>>> replication_mode = {async|recv|fsync|replay}
>>>
>>> Okay, I'll change the patch accordingly.
>>
>> For what it's worth, I think replication_mode is a lot clearer.
>> Acknowledge_commit sounds like it would do something similar to
>> asynchronous_commit.
>
> I agree.

Given the result of the vote, I'll leave the parameter named "replication_mode" as it is.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] Synchronous replication
On Tue, Jul 27, 2010 at 12:36 PM, Joshua Tolley wrote:
> Perhaps I'm hijacking the wrong thread for this, but I wonder if the quorum
> idea is really the best thing for us. I've been thinking about Oracle's way
> of doing things[1]. In short, there are three different modes: availability,
> performance, and protection. "Protection" appears to mean that at least one
> standby has applied the log; "availability" means at least one standby has
> received the log info (it doesn't specify whether that info has been fsynced
> or applied, but presumably does not mean "applied", since it's distinct from
> "protection" mode); "performance" means replication is asynchronous. I'm not
> sure this method is perfect, but it might be simpler than the quorum behavior
> that has been considered, and adequate for actual use cases.

In my case, I'd like to set up one synchronous standby in the near rack for high availability, and one asynchronous standby at a remote site for disaster recovery. Can Oracle's way cover that case?

"Availability" mode with two standbys might create a somewhat similar situation. That is, since the ACK from the near standby usually arrives first, the near standby acts as the synchronous one and the remote one as asynchronous. But the ACK from the remote standby can arrive first, so it's not guaranteed that the near standby has received the log info before the transaction commit returns "success" to the client. In that case, we would have to fail over to the remote standby even though it's not under the control of the clusterware. This is a problem for me.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] SSL cipher and version
Robert Haas writes:
> On Mon, Jul 26, 2010 at 9:57 AM, Dave Page wrote:
>> On Mon, Jul 26, 2010 at 2:49 PM, Robert Haas wrote:
>>> Any objections to me committing this?
>>
>> Might wanna fix this first:
>>
>> +PG_FUNCTION_INFO_V1(ssl_veresion);
>
> Wow. It works remarkably well without fixing that, but I'll admit
> that does seem lucky.

Well, it's got no arguments, which is the main thing that works differently in call protocol V1. I think you'd find that the PG_RETURN_NULL case doesn't really work, though ...

regards, tom lane
Re: [HACKERS] Synchronous replication
On Thu, Jul 22, 2010 at 10:37:12AM +0200, Yeb Havinga wrote:
> Fujii Masao wrote:
> Initially I also expected the quorum to behave like described by
> Aidan/option 2. Also, IMHO the name "quorum" is a bit short, like having
> "maximum" but not saying a max_something.
>
> quorum_min_sync_standbys
> quorum_max_sync_standbys

Perhaps I'm hijacking the wrong thread for this, but I wonder if the quorum idea is really the best thing for us. I've been thinking about Oracle's way of doing things[1]. In short, there are three different modes: availability, performance, and protection. "Protection" appears to mean that at least one standby has applied the log; "availability" means at least one standby has received the log info (it doesn't specify whether that info has been fsynced or applied, but presumably does not mean "applied", since it's distinct from "protection" mode); "performance" means replication is asynchronous. I'm not sure this method is perfect, but it might be simpler than the quorum behavior that has been considered, and adequate for actual use cases.

[1] http://download.oracle.com/docs/cd/B28359_01/server.111/b28294/protection.htm#SBYDB02000
(alternatively, http://is.gd/dLkq4)

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com
Re: [HACKERS] dynamically allocating chunks from shared memory
On Mon, Jul 26, 2010 at 3:16 PM, Kevin Grittner wrote:
> Robert Haas wrote:
>
>> I actually think that memory management is one of the weakest
>> elements of our current architecture
>
> I'm actually pretty impressed by the memory contexts in PostgreSQL.
> Apparently I'm not alone in that, either; a paper by Hellerstein,
> Stonebraker, and Hamilton[1] has this in section 7.2 (Memory
> Allocator):
>
> "The interested reader may want to browse the open-source PostgreSQL
> code. This utilizes a fairly sophisticated memory allocator."
>
> I think the problem here is that we don't extend that sophistication
> to shared memory.

That's one aspect of it; the other is that we don't have much global coordination about how we use it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] dynamically allocating chunks from shared memory
Robert Haas wrote:
> I actually think that memory management is one of the weakest
> elements of our current architecture

I'm actually pretty impressed by the memory contexts in PostgreSQL. Apparently I'm not alone in that, either; a paper by Hellerstein, Stonebraker, and Hamilton[1] has this in section 7.2 (Memory Allocator):

"The interested reader may want to browse the open-source PostgreSQL code. This utilizes a fairly sophisticated memory allocator."

I think the problem here is that we don't extend that sophistication to shared memory.

-Kevin

[1] Joseph M. Hellerstein, Michael Stonebraker and James Hamilton. 2007. Architecture of a Database System. Foundations and Trends(R) in Databases Vol. 1, No. 2 (2007) 141-259. http://db.cs.berkeley.edu/papers/fntdb07-architecture.pdf
Re: [HACKERS] dynamically allocating chunks from shared memory
On Mon, Jul 26, 2010 at 1:50 PM, Markus Wanner wrote:
> Note however, that a thread based design doesn't have this problem *at
> all*. Memory generally is shared (between threads) and you can
> dynamically allocate more or less (until Linux' OOM killer hits you..
> yet another story). The OS reuses memory you don't currently need even
> for other applications.
>
> Users as well as developers know the threaded model (arguably, much
> better than the process based one). So that's what we get compared to.
> And what developers (including me) are used to.

I'm sort of used to the process model, myself, but I may be in the minority.

> I think we are getting by with fixed allocations at the moment, because
> we did a lot to get by with it. By working around these limitations.
>
> However, that's just my thinking. Thank you for your inputs.

I completely agree with you that fixed allocations suck. We're just disagreeing (hopefully, in a friendly and collegial fashion) about what to do about it.

I actually think that memory management is one of the weakest elements of our current architecture, though for somewhat different reasons than the ones you're thinking about. Besides the fact that we have various smaller pools of dynamically shared memory (e.g. a separate ring of buffers for each SLRU), I'm also unhappy about some of the things we do with backend-private memory, work_mem being the biggest culprit by far: it's very difficult for the DBA to set the knobs in a way that uses all of the memory he wants to allocate to the database efficiently, with no overruns and none left over. The case where you can count on the database and all of your temporary files, etc. to fit in RAM is really an exceptional case: in general, you need to assume that there will be more demand for memory than there will be memory available, and as much as possible you want the system (rather than the user) to decide how it should optimally be allocated.

The query planner and executor actually do have most of what is needed to execute queries using more or less memory, but they lack the global intelligence needed for intelligent decision-making. Letting the OS buffer cache, rather than the PG buffer cache, handle most of the system's memory helps, but it's not a complete solution.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] dynamically allocating chunks from shared memory
Hi,

On 07/26/2010 07:16 PM, Robert Haas wrote:
> Of course, there are other parts of the system (a whole bunch of them)
> that use shared memory also, and perhaps some of those could be
> modified to use the dynamic allocator as well. But they're getting by
> without it now, so maybe they don't really need it. The SLRU stuff, I
> think, works more or less like shared buffers (so you have the same
> set of issues) and I think most of the other users are allocating
> small, fixed-size chunks.

Yeah, I see your point(s).

Note however, that a thread based design doesn't have this problem *at all*. Memory generally is shared (between threads) and you can dynamically allocate more or less (until Linux' OOM killer hits you.. yet another story). The OS reuses memory you don't currently need even for other applications.

Users as well as developers know the threaded model (arguably, much better than the process based one). So that's what we get compared to, and what developers (including me) are used to.

I think we are getting by with fixed allocations at the moment because we did a lot to get by with them, by working around these limitations.

However, that's just my thinking. Thank you for your inputs.

Regards

Markus Wanner
Re: [HACKERS] dynamically allocating chunks from shared memory
On Mon, Jul 26, 2010 at 12:51 PM, Markus Wanner wrote:
>> Dynamically allocating out of a 2MB
>> segment gives up most of that flexibility.
>
> Absolutely, that's why I'd like to see other modules that use the
> dynamic allocator. The more the better.

Right, I agree. The problem is that I don't think they can. The elephant in the room is shared_buffers, which I believe to be typically BY FAR the largest consumer of shared memory. It would be absolutely fantastic if we had a shared_buffers implementation that could free up unused buffers when they're not needed, or add more when required. But there are several reasons why I don't believe that will ever happen.

One, much of the code that uses shared_buffers relies on shared_buffers being located at a fixed memory address in a contiguous chunk, and it's hard to see how we could change that assumption without sacrificing performance.

Two, the overall size of the shared memory arena is largely dependent on the size of shared_buffers, so unless you also have the ability to resize the arena on the fly (which is well-nigh impossible with our current architecture, and maybe with any architecture), resizing shared_buffers doesn't actually add that much flexibility.

Three, the need for shared buffers is elastic rather than absolute: stealing a few shared buffers for a defined purpose (like sending imessages) is perfectly reasonable, but it's rarely going to be a good idea for the buffer manager to proactively free up memory just in case some other part of the system might need some. If you have a system that normally has 4GB of shared buffers and some other module borrows 100MB and then returns it, the system will just cache less data while that memory is in use and then go right back to caching more once it's returned. That's very nice, and it's hard to see how else to achieve that result.

Of course, there are other parts of the system (a whole bunch of them) that use shared memory also, and perhaps some of those could be modified to use the dynamic allocator as well. But they're getting by without it now, so maybe they don't really need it. The SLRU stuff, I think, works more or less like shared buffers (so you have the same set of issues) and I think most of the other users are allocating small, fixed-size chunks.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] dynamically allocating chunks from shared memory
Hi,

On 07/26/2010 06:33 PM, Robert Haas wrote:
> It would be nice to think, for example, that this could be used as
> infrastructure for parallel query to stream results back from worker
> processes to the backend connected to the user. If you're using 16
> processors to concurrently scan 16 partitions of an appendrel and
> stream those results back to the master

Now, *that* sounds like music to my ears ;-) Or put another way: yes, I think imessages and the bgworker infrastructure stuff could enable, or at least help with, that goal.

> Dynamically allocating out of a 2MB
> segment gives up most of that flexibility.

Absolutely, that's why I'd like to see other modules that use the dynamic allocator. The more the better.

> What I think will end up happening here is that you'll always have to
> size the segment used by the dynamic allocator considerably larger
> than the amount of memory you expect to actually be used, so that
> performance doesn't go into the toilet when it fills up. As Markus
> pointed out upthread, you'll always need some hard limit on the amount
> of space that imessages can use, but you can make that limit much
> larger if it's not reserved for a single purpose. If you use the
> "temporarily allocated shared buffers" method, then you could set the
> default limit to something like "64MB, but not more than 1/8th of
> shared buffers".

I've been thinking about such rules as well. They quickly get more complex once you take OOM situations and their counter-measures into account. In a way, fixing every separate pool to its specific size is simply the simplest rule-set I can think of. The dynamic allocator buys you more flexibility, but choosing good limits and rules between the sub-systems is another issue.

Regards

Markus Wanner
Re: [HACKERS] Functional dependencies and GROUP BY
On Sat, Jul 24, 2010 at 06:23, Peter Eisentraut wrote:
> Another open question I thought of was whether we should put the
> dependency record on the pg_index row, or the pg_constraint row, or
> perhaps the pg_class row. Right now, it is using pg_index, because that
> was easiest to code up, but I suspect that once we have not-null
> constraints in pg_constraint, it will be more consistent to make all
> dependencies go against pg_constraint rather than a mix of several
> catalogs.

I think pg_index is OK for primary keys. However, for the not-null case we have to use pg_constraint... So given that we end up having to code that anyway, it seems like it will end up being cleaner and more consistent to always use the pg_constraint row(s). So +1 for using pg_constraint instead of pg_index from me.
Re: [HACKERS] dynamically allocating chunks from shared memory
Hi,

On 07/26/2010 04:31 PM, Alvaro Herrera wrote:
> Excerpts from Robert Haas's message of lun jul 26 08:52:46 -0400 2010:
>> Here's another idea. Instead of making imessages use an SLRU, how
>> about having it steal pages from shared_buffers? This would require
>> segmenting messages into small enough chunks that they'd fit, but the
>> nice part is that it would avoid the need to have a completely
>> separate shared memory arena. Ideally, we'd make the infrastructure
>> general enough that things like SLRU could use it also; and get rid of
>> or reduce in size some of the special-purpose chunks we're now
>> allocating.

To me that sounds like solving the same kind of problem for every module separately, and somewhat differently each time. I tend to like general solutions (often too much, but that's another story), and to me a completely dynamic memory allocator still seems to solve this generically (and far more elegantly than 'stealing pages' sounds).

> Right now we allocate a single large arena, and the lot of
> shared_buffers, SLRU pools, locking objects, etc are all allocated
> from there.

Uh.. they all allocate from different, statically sized pools, don't they?

> If we want another 2 MB for "dynamic shmem", we'd just allocate 2 MB
> more in that large arena and give those to this new code.

That's how it could work if we used a dynamic allocator. But currently, if I understand correctly, once the shared_buffers pool is full, it cannot steal memory from the SLRU pools. Or am I mistaken?

Regards

Markus Wanner
Re: [HACKERS] dynamically allocating chunks from shared memory
On Mon, Jul 26, 2010 at 10:31 AM, Alvaro Herrera wrote:
> Excerpts from Robert Haas's message of lun jul 26 08:52:46 -0400 2010:
>> Here's another idea. Instead of making imessages use an SLRU, how
>> about having it steal pages from shared_buffers? This would require
>> segmenting messages into small enough chunks that they'd fit, but the
>> nice part is that it would avoid the need to have a completely
>> separate shared memory arena. Ideally, we'd make the infrastructure
>> general enough that things like SLRU could use it also; and get rid of
>> or reduce in size some of the special-purpose chunks we're now
>> allocating.
>
> What's the problem you see with "another shared memory arena"? Right
> now we allocate a single large arena, and the lot of shared_buffers,
> SLRU pools, locking objects, etc are all allocated from there. If we
> want another 2 MB for "dynamic shmem", we'd just allocate 2 MB more in
> that large arena and give those to this new code.

But that's not a very flexible design. If you discover that you need 3MB instead of 2MB, you get to restart the entire cluster. If you discover that you need 1MB instead of 2MB, you get to either restart the entire cluster, or waste 1MB of shared memory. And since actual usage will almost certainly fluctuate, you'll almost certainly be wasting some shared memory that could otherwise be used for other purposes some of the time. Now, granted, we have this problem already today, and granted also, 2MB is not an enormous amount of memory on today's machines. If we really think that 2MB will always be adequate for every purpose for which we wish to use unicast messaging, then perhaps it's OK, but I'm not convinced that's true. It would be nice to think, for example, that this could be used as infrastructure for parallel query to stream results back from worker processes to the backend connected to the user.
If you're using 16 processors to concurrently scan 16 partitions of an appendrel and stream those results back to the master, will 128kB/backend be enough memory to avoid pipeline stalls? What if there's replication going on at the same time? What if there's other concurrent activity that also uses imessages? Or even better, what if there's other concurrent activity that uses the dynamic allocator but NOT imessages? If the point of having a dynamic allocator is that it's eventually going to be used by lots of different subsystems, then we had better have a fairly high degree of confidence that it actually will, but in fact we've made very little effort to characterize who the other users might be and whether the stated implementation limitations will be adequate for them. Frankly, I doubt it. One of the major reasons why malloc() is so powerful is that you don't have to decide in advance how much memory you're going to need, as you would if you put the structure in the data segment. Dynamically allocating out of a 2MB segment gives up most of that flexibility. What I think will end up happening here is that you'll always have to size the segment used by the dynamic allocator considerably larger than the amount of memory you expect to actually be used, so that performance doesn't go into the toilet when it fills up. As Markus pointed out upthread, you'll always need some hard limit on the amount of space that imessages can use, but you can make that limit much larger if it's not reserved for a single purpose. If you use the "temporarily allocated shared buffers" method, then you could set the default limit to something like "64MB, but not more than 1/8th of shared buffers". Since the memory won't get used unless it's needed, you don't really have to care whether a particular installation is likely to need some, none, or all of that; whereas if you're allocating nailed-down memory, you're going to want a much smaller default - a couple of MB, at most. 
Furthermore, if you do happen to be running on a 64GB machine with 8GB of shared_buffers and 64MB isn't adequate, you can easily make it possible to bump that value up by changing a GUC and hitting reload. With the "nailed-down shared memory" approach, you're locked into whatever you decide at postmaster start.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] patch (for 9.1) string functions
On Mon, Jul 26, 2010 at 11:07 AM, Robert Haas wrote:
> On Mon, Jul 26, 2010 at 10:39 AM, Merlin Moncure wrote:
>> It was absolutely a good decision because it prevented type inference
>> in ways that were ambiguous or surprising (for a canonical case see:
>> http://www.mail-archive.com/pgsql-gene...@postgresql.org/msg93224.html).
>>
>> || operator is still pretty tolerant in the 8.3+ world.
>> select interval_col || bool_col; -- error
>> select interval_col::text || bool_col; -- text concatenation
>> select text_col || interval_col || bool_col; -- text concatenation
>>
>> variadic text would require text casts on EVERY non 'unknown' argument
>> which drops it below the threshold of usefulness IMO -- it would be
>> far stricter than vanilla || concatenation. Ok, pure bikeshed here
>> (shutting my trap now!), but concat() is one of those wonder functions
>> that you want to make as usable and terse as possible. I don't see
>> the value in making it overly strict.
>
> I'm just very skeptical that we should give our functions argument
> types that are essentially fantasy. CONCAT() doesn't concatenate
> integers or intervals or boxes: it concatenates strings, and only
> strings. Surely the right fix, if our casting rules are too
> restrictive, is to fix the casting rules; not to start adding a bunch
> of kludgery in every function we define.
>
> The problem with your canonical example (and the other examples of
> this type I've seen) is that they are underspecified. Given two
> identically-named operators, one of which accepts types T1 and T2, and
> the other of which accepts types T3 and T4, what is one to do with T1
> OP T4? You can cast T1 to T3 and call the first operator, or you can
> cast T4 to T2 and call the second one, and it's really not clear which
> is right, so you had better throw an error. The same applies to
> functions: given foo(text) and foo(date) and the call
> foo('2010-04-15'), you had better complain, or you may end up with a
> very sad user. But sometimes our current casting rules require casts
> in situations where they don't seem necessary, such as
> LPAD(integer_column, 5, '0'). That's not ambiguous because there's
> only one definition of LPAD, and the chances that the user will be
> unpleasantly surprised if you call it seem quite low.
>
> In other words, this problem is not unique to CONCAT().

Shoot, can't resist :-). Are the casting rules broken? If you want to do lpad without casts for integers, you can define it explicitly -- feature, not bug. You can basically do this for any function with fixed arguments, either in userland or in core. lpad(int) actually introduces a problem case, with the minus sign possibly requiring application-specific intervention, so things are probably correct exactly as they are. Casting rules need to be tight because, if they are not, they can introduce independent behaviors when you underspecify, as you noted above.

ISTM you are unhappy with the "any" variadic mechanism in general, not the casting rules. "any" essentially means: "the types you pass to me are irrelevant, because I'm going to look up behaviors on the server and apply them as I see fit". You've defined a regular methodology across ALL types and allow the user to pass anything -- I see nothing at all wrong with this as long as it's only implemented in very special cases. If you happen to not like the predefined 'box' behavior, nothing is stopping you from massaging it before sending it in. Also, there's no guarantee that the behavior hidden under the "any" can be reproduced via a manual cast... concat() is something of a special case where you can.

merlin
Re: [HACKERS] patch (for 9.1) string functions
On Mon, Jul 26, 2010 at 10:39 AM, Merlin Moncure wrote:
> On Mon, Jul 26, 2010 at 9:26 AM, Robert Haas wrote:
>> On Mon, Jul 26, 2010 at 9:10 AM, Merlin Moncure wrote:
>>>> Is CONCAT('foo', NULL) => 'foo' really the behavior that everyone
>>>> else implements here? And why does CONCAT() take a variadic "ANY"
>>>> argument? Shouldn't that be variadic TEXT?
>>>
>>> What does that accomplish, besides forcing you to sprinkle every
>>> concat call with text casts (maybe that's not a bad thing?)?
>>
>> You could ask the same thing about the existing || operator. And in
>> fact, we used to have that behavior. We changed it in 8.3. Perhaps
>> that was a good decision and perhaps it wasn't, but I don't think
>> using CONCAT() to make an end-run around that decision is the way to
>> go.
>
> It was absolutely a good decision because it prevented type inference
> in ways that were ambiguous or surprising (for a canonical case see:
> http://www.mail-archive.com/pgsql-gene...@postgresql.org/msg93224.html).
>
> || operator is still pretty tolerant in the 8.3+ world.
> select interval_col || bool_col; -- error
> select interval_col::text || bool_col; -- text concatenation
> select text_col || interval_col || bool_col; -- text concatenation
>
> variadic text would require text casts on EVERY non 'unknown' argument
> which drops it below the threshold of usefulness IMO -- it would be
> far stricter than vanilla || concatenation. Ok, pure bikeshed here
> (shutting my trap now!), but concat() is one of those wonder functions
> that you want to make as usable and terse as possible. I don't see
> the value in making it overly strict.

I'm just very skeptical that we should give our functions argument types that are essentially fantasy. CONCAT() doesn't concatenate integers or intervals or boxes: it concatenates strings, and only strings. Surely the right fix, if our casting rules are too restrictive, is to fix the casting rules; not to start adding a bunch of kludgery in every function we define.

The problem with your canonical example (and the other examples of this type I've seen) is that they are underspecified. Given two identically-named operators, one of which accepts types T1 and T2, and the other of which accepts types T3 and T4, what is one to do with T1 OP T4? You can cast T1 to T3 and call the first operator, or you can cast T4 to T2 and call the second one, and it's really not clear which is right, so you had better throw an error. The same applies to functions: given foo(text) and foo(date) and the call foo('2010-04-15'), you had better complain, or you may end up with a very sad user. But sometimes our current casting rules require casts in situations where they don't seem necessary, such as LPAD(integer_column, 5, '0'). That's not ambiguous because there's only one definition of LPAD, and the chances that the user will be unpleasantly surprised if you call it seem quite low.

In other words, this problem is not unique to CONCAT().

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] SSL cipher and version
On Mon, Jul 26, 2010 at 9:57 AM, Dave Page wrote: > On Mon, Jul 26, 2010 at 2:49 PM, Robert Haas wrote: >> Last week, I ran across a situation where I needed to know the SSL >> version and cipher in use for a particular database connection. >> Magnus pointed me to contrib/sslinfo, but that didn't have quite what >> I needed. The attached patch adds two additional functions to >> contrib/sslinfo to report this information. >> >> Any objections to me committing this? > > Might wanna fix this first: > > +PG_FUNCTION_INFO_V1(ssl_veresion); > Wow. It works remarkably well without fixing that, but I'll admit that does seem lucky. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch (for 9.1) string functions
On Mon, Jul 26, 2010 at 9:26 AM, Robert Haas wrote: > On Mon, Jul 26, 2010 at 9:10 AM, Merlin Moncure wrote: >>> CONCAT('foo', NULL) => 'foo' really the behavior that everyone else >>> implements here? And why does CONCAT() take a variadic "ANY" >>> argument? Shouldn't that be variadic TEXT? >> >> What does that accomplish, besides forcing you to sprinkle every >> concat call with text casts (maybe that's not a bad thing?)? > > You could ask the same thing about the existing || operator. And in > fact, we used to have that behavior. We changed it in 8.3. Perhaps > that was a good decision and perhaps it wasn't, but I don't think > using CONCAT() to make an end-run around that decision is the way to > go. It was absolutely a good decision because it prevented type inference in ways that were ambiguous or surprising (for a canonical case see: http://www.mail-archive.com/pgsql-gene...@postgresql.org/msg93224.html). || operator is still pretty tolerant in the 8.3+ world. select interval_col || bool_col; -- error select interval_col::text || bool_col; -- text concatenation select text_col || interval_col || bool_col; -- text concatenation variadic text would require text casts on EVERY non 'unknown' argument which drops it below the threshold of usefulness IMO -- it would be far stricter than vanilla || concatenation. Ok, pure bikeshed here (shutting my trap now!), but concat() is one of those wonder functions that you want to make as usable and terse as possible. I don't see the value in making it overly strict. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] dynamically allocating chunks from shared memory
Excerpts from Robert Haas's message of lun jul 26 08:52:46 -0400 2010: > Here's another idea. Instead of making imessages use an SLRU, how > about having it steal pages from shared_buffers? This would require > segmenting messages into small enough chunks that they'd fit, but the > nice part is that it would avoid the need to have a completely > separate shared memory arena. Ideally, we'd make the infrastructure > general enough that things like SLRU could use it also; and get rid of > or reduce in size some of the special-purpose chunks we're now > allocating. What's the problem you see with "another shared memory arena"? Right now we allocate a single large arena, and the lot of shared_buffers, SLRU pools, locking objects, etc are all allocated from there. If we want another 2 MB for "dynamic shmem", we'd just allocate 2 MB more in that large arena and give those to this new code. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
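Alvaro's description — one large arena at startup, with shared_buffers, SLRU pools, locks, and any new "dynamic shmem" pool all carved out of it — amounts to a bump allocator over a single region. A toy sketch in Python (a bytearray stands in for the mapped segment; the pool names and sizes are illustrative, not PostgreSQL's actual ShmemAlloc code):

```python
# Toy model of carving fixed pools out of one shared memory arena at
# startup. Each "allocation" is just an (offset, size) slice of the
# single region; adding a new pool costs one more carve() call.

class Arena:
    def __init__(self, size):
        self.buf = bytearray(size)   # stands in for the mmap'd segment
        self.next = 0                # bump pointer
        self.pools = {}

    def carve(self, name, size):
        if self.next + size > len(self.buf):
            raise MemoryError("arena exhausted")
        off = self.next
        self.next += size
        self.pools[name] = (off, size)
        return off

MB = 1024 * 1024
arena = Arena(32 * MB)
arena.carve("shared_buffers", 24 * MB)
arena.carve("slru_pools", 4 * MB)
arena.carve("lock_tables", 2 * MB)
# "Another 2 MB for dynamic shmem" is just one more carve:
arena.carve("dynamic_shmem", 2 * MB)
print(sorted(arena.pools))
```

This is why Alvaro sees no problem with "another arena": it is not a second segment, just a larger request against the one that already exists.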
Re: [HACKERS] SSL cipher and version
On Mon, Jul 26, 2010 at 2:49 PM, Robert Haas wrote: > Last week, I ran across a situation where I needed to know the SSL > version and cipher in use for a particular database connection. > Magnus pointed me to contrib/sslinfo, but that didn't have quite what > I needed. The attached patch adds two additional functions to > contrib/sslinfo to report this information. > > Any objections to me committing this? Might wanna fix this first: +PG_FUNCTION_INFO_V1(ssl_veresion); -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] SSL cipher and version
Last week, I ran across a situation where I needed to know the SSL version and cipher in use for a particular database connection. Magnus pointed me to contrib/sslinfo, but that didn't have quite what I needed. The attached patch adds two additional functions to contrib/sslinfo to report this information. Any objections to me committing this? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company ssl_cipher_and_version.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] psql \timing output supressed in quiet mode
Peter Eisentraut wrote: > In psql's quiet mode, the output of the \timing option is suppressed, > except in the \copy command. That inconsistency should be fixed in any > case. It seems to me that if I explicitly turn on timing, then that > should be unaffected by the quiet mode. Comments? +1 -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch (for 9.1) string functions
On Mon, Jul 26, 2010 at 9:10 AM, Merlin Moncure wrote: >> CONCAT('foo', NULL) => 'foo' really the behavior that everyone else >> implements here? And why does CONCAT() take a variadic "ANY" >> argument? Shouldn't that be variadic TEXT? > > What does that accomplish, besides forcing you to sprinkle every > concat call with text casts (maybe that's not a bad thing?)? You could ask the same thing about the existing || operator. And in fact, we used to have that behavior. We changed it in 8.3. Perhaps that was a good decision and perhaps it wasn't, but I don't think using CONCAT() to make an end-run around that decision is the way to go. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch (for 9.1) string functions
> >> CONCAT('foo', NULL) => 'foo' really the behavior that everyone else >> implements here? And why does CONCAT() take a variadic "ANY" >> argument? Shouldn't that be variadic TEXT? > CONCAT with a variadic text parameter would be limited by the existing default casts to text - for example, you can't implicitly cast date to text or int to text. postgres=# create or replace function concat(variadic text[]) returns text as $$select string_agg(x,'') from unnest($1) x$$ language sql; CREATE FUNCTION postgres=# select concat('a','b'); concat ab (1 row) Time: 20,812 ms postgres=# select concat('a',10); ERROR: function concat(unknown, integer) does not exist LINE 1: select concat('a',10); ^ HINT: No function matches the given name and argument types. You might need to add explicit type casts. so with variadic "any"[] concat doesn't need explicit casts. Regards Pavel Stehule p.s. Inside the function, every value is transformed to text. > merlin > -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch (for 9.1) string functions
On Mon, Jul 26, 2010 at 8:02 AM, Robert Haas wrote: > Regardless of where this function ends up, the concat_ws documentation > should contain some mention of the fact that "ws" is intended to mean > "with separator", big +1 on that -- I've been loosely following the thread and I had assumed 'ws' meant 'wide string' all this time :-). > Come to think of it, have we checked that the behavior of LEFT, RIGHT, > REVERSE, etc. is the same on other DBs, especially as far as nulls, > empty strings, too-large or negative subscripts, etc is concerned? Probably 'standard' behavior wrt null would be to be strict; return null if any argument is null. The proposed behavior seems ok though. > CONCAT('foo', NULL) => 'foo' really the behavior that everyone else > implements here? And why does CONCAT() take a variadic "ANY" > argument? Shouldn't that be variadic TEXT? What does that accomplish, besides forcing you to sprinkle every concat call with text casts (maybe that's not a bad thing?)? merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] dynamically allocating chunks from shared memory
On Thu, Jul 22, 2010 at 3:09 PM, Markus Wanner wrote: >> FWIW I don't think you should be thinking in "replacing imessages with >> SLRU". I rather think you should be thinking in how can you implement >> the imessages API on top of SLRU. > > Well, I'm rather comparing SLRU with the dynamic allocator. So far I'm > unconvinced that SLRU would be a better base for imessages than a dynamic > allocator. (And I'm arguing that SLRU should use a dynamic allocator > underneath). Here's another idea. Instead of making imessages use an SLRU, how about having it steal pages from shared_buffers? This would require segmenting messages into small enough chunks that they'd fit, but the nice part is that it would avoid the need to have a completely separate shared memory arena. Ideally, we'd make the infrastructure general enough that things like SLRU could use it also; and get rid of or reduce in size some of the special-purpose chunks we're now allocating. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
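Segmenting a message into page-sized chunks, as Robert's idea requires, is the easy part; the fiddly parts are stealing the pages and keeping reassembly in order. A sketch of just the segmentation, in Python (the 8 kB page size matches PostgreSQL's default BLCKSZ, but the per-chunk header size and layout are purely illustrative assumptions, not the imessages API):

```python
# Sketch of splitting an imessage into chunks small enough to fit in a
# stolen buffer page, and reassembling them. The header size here is a
# hypothetical reservation for things like sender id and sequence number.

PAGE_SIZE = 8192               # PostgreSQL's default block size
HEADER = 16                    # assumed per-chunk header (not real layout)
PAYLOAD = PAGE_SIZE - HEADER

def segment(message: bytes):
    """Split a message into payload-sized chunks, one per stolen page."""
    return [message[i:i + PAYLOAD] for i in range(0, len(message), PAYLOAD)]

def reassemble(chunks):
    """Concatenate chunks back into the original message."""
    return b"".join(chunks)

msg = b"x" * 20000
chunks = segment(msg)
print(len(chunks))             # a 20 kB message needs 3 pages
assert all(len(c) <= PAYLOAD for c in chunks)
assert reassemble(chunks) == msg
```

The cost of the scheme is exactly this chunking overhead; the benefit, as the email says, is avoiding a second shared memory arena.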
Re: [HACKERS] multibyte-character aware support for function "downcase_truncate_identifier()"
On Mon, Jul 26, 2010 at 12:40 AM, Rajanikant Chirmade wrote: > Since discussion stopped in discussion thread > > http://archives.postgresql.org/pgsql-bugs/2006-09/msg00128.php > > Are there any implications of this change in handling identifiers ? > > Thanks & Regards, > Rajanikant Chirmade An even more relevant message appears to be this one: http://archives.postgresql.org/pgsql-bugs/2006-09/msg00133.php Both this and the comment in downcase_truncate_identifier() suggest that the current method is attributable to lack of support for Unicode-aware case normalization and is known not to work correctly in all locales. Locale and encoding stuff isn't really my area of expertise, but if we now have support for Unicode-aware case normalization, shouldn't we be using it here, too? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
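The gap Robert describes is easy to see outside PostgreSQL: an ASCII-only downcase leaves non-Latin identifiers untouched, while a Unicode-aware mapping folds them. A sketch in Python, where `str.lower()` stands in for a proper Unicode case-mapping routine (the `ascii_downcase` below is a simplified model of the A–Z-only approach, not the actual downcase_truncate_identifier() code):

```python
# ASCII-only downcasing (roughly the A-Z-only behavior described above)
# versus Unicode-aware lowercasing. str.lower() stands in for a real
# Unicode case-mapping routine.

def ascii_downcase(ident: str) -> str:
    """Map only A-Z to a-z; leave every other character alone."""
    return "".join(
        chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in ident
    )

for ident in ["MyTable", "ΣΟΦΙΑ", "ÄRGER"]:
    print(ident, "->", ascii_downcase(ident), "vs", ident.lower())
```

For plain-ASCII identifiers the two agree; for Greek or accented identifiers the ASCII-only version silently does nothing, which is exactly the "not correct in all locales" behavior the linked thread complains about.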
Re: [HACKERS] patch (for 9.1) string functions
On Sun, Jul 25, 2010 at 11:29 PM, Itagaki Takahiro wrote: > I think RAISE is badly designed. Using % as a placeholder has a limitation > to format strings. For example, format() cannot work as concat(): > SELECT format('%%', 123, 456) => ERROR It's hard to argue with this, as far as it goes. > So, my proposal is renaming stringfunc//sprintf() to format(), > and moving it into the core. I think sprintf() is superior to format() > in every aspect; '%s%s' works as concat(), and '%s%%' can append > % without blanks. So forget about format() and put sprintf() in contrib/stringfunc. That's not an argument for putting anything in core. Perhaps such an argument can be made, but this isn't it. > Then, concat_ws() will be moved into core because contrib/stringfunc > only has the function now. In addition, I'd like to include the function for > the compatibility to MySQL. Also, concat() and concat_ws() can share > the implementation. Regardless of where this function ends up, the concat_ws documentation should contain some mention of the fact that "ws" is intended to mean "with separator", and that the naming is chosen for compatibility with MySQL. As for where to put it, I see no terribly compelling reason why it needs to be in core. You can write array_to_string(array[txt1, txt2, txt3], sep) and get the same effect as concat_ws(sep, txt1, txt2, txt3). I don't really want to start maintaining duplicate functionality for things like this, especially since MySQL users will no doubt expect that our implementation will be bug-compatible. If the output isn't identical to what MySQL does for every set of arguments, it'll be reported as a bug. Come to think of it, have we checked that the behavior of LEFT, RIGHT, REVERSE, etc. is the same on other DBs, especially as far as nulls, empty strings, too-large or negative subscripts, etc is concerned? Is CONCAT('foo', NULL) => 'foo' really the behavior that everyone else implements here? 
And why does CONCAT() take a variadic "ANY" argument? Shouldn't that be variadic TEXT? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
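The equivalence Robert points out — concat_ws(sep, a, b, c) behaving like array_to_string(array[a, b, c], sep), both skipping NULLs rather than propagating them — can be pinned down as a tiny reference model. A Python sketch, with None standing in for SQL NULL (a model of the semantics under discussion, not a claim about every edge case of either implementation):

```python
# Reference model of the two-function equivalence discussed above, with
# None standing in for SQL NULL. Both skip NULL arguments rather than
# propagating them, matching MySQL's CONCAT_WS.

def concat_ws(sep, *args):
    """Join non-NULL arguments with a separator."""
    return sep.join(str(a) for a in args if a is not None)

def array_to_string(arr, sep):
    """Join non-NULL array elements with a separator."""
    return sep.join(str(a) for a in arr if a is not None)

print(concat_ws(",", "a", None, "b"))            # a,b
print(array_to_string(["a", None, "b"], ","))    # a,b
assert concat_ws("-", "x", None, "y") == array_to_string(["x", None, "y"], "-")
```

A model this small is also the right tool for the bug-compatibility worry: any argument combination where the two diverge from MySQL is a candidate bug report.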
Re: [HACKERS] bg worker: overview
Markus Wanner writes: > To simplify, you might want to start a bgworker on database 'postgres', > which then acts as a sub-coordinator (and doesn't really need to use its > database connection). Yeah, that sounds like the simplest way forward, so that it's easy for this "user daemon" to communicate with the coordinator. Typically this would start a backend and LOAD a module, which would enter the user daemon main loop, I guess. Then all the usual backend code facilities are there, even SQL and calling user defined function. > Well, there are different types of imessages defined in imsg.h. If you are > coding something within Postgres, you'd just add all the required messages > types there. There's no such thing as an external registration for new > message types. Given that imessages can have a payload, maybe the simplest way there would be to add a "IMSGT_EXEC_QUERY_PARAMS" message type, the payload of which would be composed of the SQL text and its parameters. I get it that requiring a bgworker backend connected to a given database is already part of the API right? > So, the bgworker infrastructure could probably satisfy the internal > communication needs. But how does this ticker daemon talk to the outside? > Does it need to open a socket and listen there? Or do the requests to that > queue come in via SQL? The ticker only job is to manage a "ticks" table per database. All it needs for that is a libpq connection, really, but given your model it'd be a single backend (worker) that would send imessages to the coordinator so that a background worker would tick, by executing this SQL: select pgq.ticker()). So that would be a lot of changes to follow your facilities, it's unclear to me how much we're twisting it so that it fits. Well, maybe that's not a good example after all. Dave, want to see about pgagent? 
Regards, -- Dimitri Fontaine PostgreSQL DBA, Architecte -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Synchronous replication
On Mon, Jul 26, 2010 at 6:48 AM, Marko Tiikkaja wrote: > On 7/26/10 1:44 PM +0300, Fujii Masao wrote: >> >> On Mon, Jul 26, 2010 at 6:36 PM, Yeb Havinga wrote: >>> >>> I wasn't entirely clear. My suggestion was to have only >>> >>> acknowledge_commit = {no|recv|fsync|replay} >>> >>> instead of >>> >>> replication_mode = {async|recv|fsync|replay} >> >> Okay, I'll change the patch accordingly. > > For what it's worth, I think replication_mode is a lot clearer. > Acknowledge_commit sounds like it would do something similar to > asynchronous_commit. I agree. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] gincostestimate
On 26/07/10 12:58, Oleg Bartunov wrote: Jan, On Sun, 25 Jul 2010, Jan Urbaski wrote: On 02/07/10 14:33, Teodor Sigaev wrote: Patch implements much more accuracy estimation of cost for GIN index scan than generic cost estimation function. I was able to reproduce his issue, that is: select id from ftstest where body_fts @@ to_tsquery('commonterm80'); was choosing a sequential scan, which was resulting in much longer execution than the bitmap index plan that I got after disabling seqscans. I then applied the patch, recompiled PG and tried again... and nothing changed. I first tried running ANALYSE and then dropping and recreating the GIN index, but the planner still chooses the seq scan. read thread http://archives.postgresql.org/pgsql-hackers/2010-04/msg01407.php There is always a fuzz factor, as Tom said, about 1% in path cost comparisons. You may compare plans for 'commonterm60', 'commonterm40'. OK, I thought this might be the case, as with the patch the sequential scan is winning only by a small margin. Thanks, Jan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
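The 1% fuzz factor Oleg refers to means two paths whose costs differ by less than that margin are treated as costing the same, so a marginally cheaper plan cannot displace the incumbent. A sketch of that comparison rule in Python (modelled on the behavior described in the thread, not copied from the planner's costsize.c):

```python
# Model of a fuzzy path-cost comparison: costs within FUZZ of each
# other are considered a tie, so a competing plan must beat the
# incumbent by a clear margin to win.

FUZZ = 1.01   # 1% fuzz factor

def cheaper(a, b):
    """True only if path cost a is clearly cheaper than path cost b."""
    return a * FUZZ < b

print(cheaper(100.0, 150.0))   # True: clearly cheaper
print(cheaper(100.0, 100.5))   # False: within the 1% fuzz, treated as a tie
```

This is why Jan's more accurate GIN estimate did not flip the plan: the bitmap path came out slightly cheaper, but not cheaper by more than the fuzz.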
Re: [HACKERS] gincostestimate
Jan, On Sun, 25 Jul 2010, Jan Urbaski wrote: On 02/07/10 14:33, Teodor Sigaev wrote: Patch implements much more accuracy estimation of cost for GIN index scan than generic cost estimation function. Hi, I'm reviewing this patch, and to begin with it I tried to reproduce the problem that originally came up on -performance in http://archives.postgresql.org/pgsql-performance/2009-10/msg00393.php I attached scripts The links from that mail are now dead, so I set up my own test environment: * one table testfts(id serial, body text, body_fts tsvector) * 5 rows, each with 1000 random words taken from /usr/share/dict/british-english-insane (the wbritish-insane Debian package) separated by a single space * each row also had the word "commonterm" at the end, 80% had commonterm80, 60% had commonterm60 etc (using the same methodology as Jesper, that commonterm60 can appear only if commonterm80 is in the row) * a GIN index on the tsvectors I was able to reproduce his issue, that is: select id from ftstest where body_fts @@ to_tsquery('commonterm80'); was choosing a sequential scan, which was resulting in much longer execution than the bitmap index plan that I got after disabling seqscans. I then applied the patch, recompiled PG and tried again... and nothing changed. I first tried running ANALYSE and then dropping and recreating the GIN index, but the planner still chooses the seq scan. read thread http://archives.postgresql.org/pgsql-hackers/2010-04/msg01407.php There is always a fuzz factor, as Tom said, about 1% in path cost comparisons. You may compare plans for 'commonterm60', 'commonterm40'. Full explains below (the NOTICE is a debugging aid from the patch, which I temporarily enabled to see if it's picking up the code). from this debug you can see that cost estimation now are much accurate than before. 
I'll continue reading the code and trying to understand what it does, but in the meantime: am I doing something wrong that I don't see the planner switching to the bitmap index plan? I see that the difference in costs is small, so maybe I just need to tweak the planner knobs a bit? Is the output below expected? I think Tom explained this http://archives.postgresql.org/pgsql-hackers/2010-04/msg01426.php Cheers, Jan wulczer=# explain analyse select id from ftstest where body_fts @@ to_tsquery('commonterm80'); NOTICE: GIN stats: nEntryPages: 49297.00 nDataPages: 16951.00 nPendingPages :0.00 nEntries: 277521.00 QUERY PLAN -- Seq Scan on ftstest (cost=0.00..1567.00 rows=39890 width=4) (actual time=221.893..33179.794 rows=39923 loops=1) Filter: (body_fts @@ to_tsquery('commonterm80'::text)) Total runtime: 33256.661 ms (3 rows) wulczer=# set enable_seqscan to false; SET Time: 0.257 ms wulczer=# explain analyse select id from ftstest where body_fts @@ to_tsquery('commonterm80'); NOTICE: GIN stats: nEntryPages: 49297.00 nDataPages: 16951.00 nPendingPages :0.00 nEntries: 277521.00 QUERY PLAN Bitmap Heap Scan on ftstest (cost=449.15..1864.50 rows=39890 width=4) (actual time=107.421..181.284 rows=39923 loops=1) Recheck Cond: (body_fts @@ to_tsquery('commonterm80'::text)) -> Bitmap Index Scan on ftstest_gin_idx (cost=0.00..439.18 rows=39890 width=0) (actual time=97.057..97.057 rows=39923 loops=1) Index Cond: (body_fts @@ to_tsquery('commonterm80'::text)) Total runtime: 237.218 ms (5 rows) Time: 237.999 ms Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83
Re: [HACKERS] Synchronous replication
On 7/26/10 1:44 PM +0300, Fujii Masao wrote: On Mon, Jul 26, 2010 at 6:36 PM, Yeb Havinga wrote: I wasn't entirely clear. My suggestion was to have only acknowledge_commit = {no|recv|fsync|replay} instead of replication_mode = {async|recv|fsync|replay} Okay, I'll change the patch accordingly. For what it's worth, I think replication_mode is a lot clearer. Acknowledge_commit sounds like it would do something similar to asynchronous_commit. Regards, Marko Tiikkaja -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Synchronous replication
On Mon, Jul 26, 2010 at 6:36 PM, Yeb Havinga wrote: > Fujii Masao wrote: >> >> I still like >> >> replication_mode = {async|recv|fsync|replay} >> >> rather than >> >> synchronous_replication = {on|off} >> acknowledge_commit = {no|recv|fsync|replay} >> > > Hello Fujii, > > I wasn't entirely clear. My suggestion was to have only > > acknowledge_commit = {no|recv|fsync|replay} > > instead of > > replication_mode = {async|recv|fsync|replay} Okay, I'll change the patch accordingly. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Synchronous replication
Fujii Masao wrote: I still like replication_mode = {async|recv|fsync|replay} rather than synchronous_replication = {on|off} acknowledge_commit = {no|recv|fsync|replay} Hello Fujii, I wasn't entirely clear. My suggestion was to have only acknowledge_commit = {no|recv|fsync|replay} instead of replication_mode = {async|recv|fsync|replay} regards, Yeb Havinga -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: quorum commit Re: [HACKERS] Synchronous replication
Fujii Masao wrote: In the following case, how should quorum commit behave? 1. quorum_standbys = 2; there are three connected synchronous standbys 2. One standby sends the ACK back and fails 3. The ACK arrives from another standby 4. How should quorum commit behave? (a) Transaction commit returns a "success" since the master has already received two ACKs (b) Transaction commit waits for the "last" ACK since only one of currently connected standbys has sent the ACK I'd opt for option (b) if that doesn't make the code very complex, or expensive (to check connected state when reaching quorum). regards, Yeb Havinga -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
quorum commit Re: [HACKERS] Synchronous replication
On Thu, Jul 22, 2010 at 5:37 PM, Yeb Havinga wrote: > Fujii Masao wrote: >> >> How should the synchronous replication behave when the number of connected >> standby servers is less than quorum? >> >> 1. Ignore quorum. The current patch adopts this. If the ACKs from all >> connected standbys have arrived, transaction commit is successful >> even if the number of standbys is less than quorum. If there is no >> connected standby, transaction commit always is successful without >> regard to quorum. >> >> 2. Observe quorum. Aidan wants this. Until the number of connected >> standbys has become more than or equal to quorum, transaction commit >> waits. >> >> Which is the right behavior of quorum commit? Or we should add new >> parameter specifying the behavior of quorum commit? >> > > Initially I also expected the quorum to behave like described by > Aidan/option 2. I have another question about the detailed design of quorum commit. In the following case, how should quorum commit behave? 1. quorum_standbys = 2; there are three connected synchronous standbys 2. One standby sends the ACK back and fails 3. The ACK arrives from another standby 4. How should quorum commit behave? (a) Transaction commit returns a "success" since the master has already received two ACKs (b) Transaction commit waits for the "last" ACK since only one of currently connected standbys has sent the ACK Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
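The two behaviours Fujii enumerates can be made concrete with a small model: standbys ack and may then disconnect, and commit completion depends on whether acks are counted in total, option (a), or only from standbys that are still connected, option (b). A Python sketch (a hypothetical model of the scenario in the email, not the patch's code):

```python
# Toy model of the quorum-commit question above: quorum = 2, three
# standbys; one acks and then fails, then a second ack arrives.

def commit_done_a(acks, connected, quorum):
    """Option (a): every ack ever received counts toward the quorum."""
    return len(acks) >= quorum

def commit_done_b(acks, connected, quorum):
    """Option (b): only acks from currently connected standbys count."""
    return len(acks & connected) >= quorum

quorum = 2
connected = {"s1", "s2", "s3"}
acks = set()

acks.add("s1")                  # step 2: s1 sends its ACK back...
connected.discard("s1")         # ...and then fails
acks.add("s2")                  # step 3: the ACK arrives from s2

print(commit_done_a(acks, connected, quorum))   # True: two ACKs received
print(commit_done_b(acks, connected, quorum))   # False: only one live standby acked
```

The model makes the trade-off visible: (a) returns success on the strength of an ack from a standby that may no longer hold the data, while (b), as Yeb prefers, waits until two currently connected standbys have acked.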
Re: [HACKERS] Synchronous replication
On Mon, Jul 26, 2010 at 5:27 PM, Yeb Havinga wrote: > Fujii Masao wrote: >>> >>> Intuitively by looking at the enumeration of replication_mode I'd think >>> that >>> the sync standbys are all standby's that operate in a not async mode. >>> That >>> would be clearer with a boolean sync (or not) and for sync standbys the >>> replication_mode specified. >>> >> >> You mean that something like synchronous_replication as the recovery.conf >> parameter should be added in addition to replication_mode? Since >> increasing >> the number of similar parameters would confuse users, I don't like do >> that. >> > > I think what would be confusing if there is a mismatch between implemented > concepts and parameters. > > 1 does the master wait for standby servers on commit? > 2 how many acknowledgements must the master receive before it can continue? > 3 is a standby server a synchronous one, i.e. does it acknowledge a commit? > 4 when do standby servers acknowledge a commit? > 5 does it only wait when the standby's are connected, or also when they are > not connected? > 6..? > > When trying to match parameter names for the concepts above: > 1 - does not exist, but can be answered with quorum_standbys = 0 > 2 - quorum_standbys > 3 - yes, if replication_mode != async (here is were I thought I had to think > to much) > 4 - replication modes recv, fsync and replay bot not async > 5 - Zoltan's strict_sync_replication parameter > > Just an idea, what about > for 4: acknowledge_commit = {no|recv|fsync|replay} > then 3 = yes, if acknowledge_commit != no Thanks for the clarification. I still like replication_mode = {async|recv|fsync|replay} rather than synchronous_replication = {on|off} acknowledge_commit = {no|recv|fsync|replay} because the former is more intuitive for me and I don't want to increase the number of parameters. We need to hear from some users in this respect. If most want the latter, of course, I'd love to adopt it. 
Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Synchronous replication
Fujii Masao wrote: Intuitively by looking at the enumeration of replication_mode I'd think that the sync standbys are all standby's that operate in a not async mode. That would be clearer with a boolean sync (or not) and for sync standbys the replication_mode specified. You mean that something like synchronous_replication as the recovery.conf parameter should be added in addition to replication_mode? Since increasing the number of similar parameters would confuse users, I don't like to do that. I think what would be confusing is a mismatch between implemented concepts and parameters. 1 does the master wait for standby servers on commit? 2 how many acknowledgements must the master receive before it can continue? 3 is a standby server a synchronous one, i.e. does it acknowledge a commit? 4 when do standby servers acknowledge a commit? 5 does it only wait when the standbys are connected, or also when they are not connected? 6..? When trying to match parameter names for the concepts above: 1 - does not exist, but can be answered with quorum_standbys = 0 2 - quorum_standbys 3 - yes, if replication_mode != async (here is where I thought I had to think too much) 4 - replication modes recv, fsync and replay but not async 5 - Zoltan's strict_sync_replication parameter Just an idea, what about for 4: acknowledge_commit = {no|recv|fsync|replay} then 3 = yes, if acknowledge_commit != no regards, Yeb Havinga -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review of Synchronous Replication patches
On Sat, Jul 24, 2010 at 4:40 PM, wrote: > Instead, I will post a patch that unifies my configuration choices with > Fujii's patch. Do you have suggestions for better worded GUCs? > "slave" seems to be phased out by "standby" for political correctness, so > "synchronous_standby" instead of "synchronous_slave". You mentioned > "min_sync_replication_clients" -> "quorum_min_sync_standbys". What else? I think that the meaning of my "quorum" parameter is the same as that of your "min_sync_replication_clients". Right? I'm planning to add new parameter specifying the behavior of quorum commit when the number of connected synchronous standbys becomes less than "quorum". 1. Ignore quorum. If the ACKs from all connected standbys have arrived, transaction commit is successful even if the number of standbys is less than quorum. If there is no connected standby, transaction commit always is successful without regard to quorum. 2. Observe quorum. Until the number of connected standbys has become more than or equal to quorum, transaction commit waits. http://archives.postgresql.org/pgsql-hackers/2010-07/msg01327.php Is the meaning of this parameter the same as that of your "strict_sync_replication"? > You also noticed that my patch addressed 2PC, maybe I will have to add > this part to Fujii's patch, too. Note: I haven't yet read his patch, > maybe working with LSNs instead of XIDs make this work automatically, > I don't know. We should definitely test. Yeah, transaction commit seems to have to wait for replication in not only RecordTransactionCommit() but also EndPrepare(), RecordTransactionCommitPrepared() and RecordTransactionAbortPrepared(). I'll fix that. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] security label support, part.2
The attached patches are revised ones, as follows.

* A new SECURITY LABEL statement replaces the previous ALTER TABLE statement with the SECURITY LABEL TO option. It has the following syntax:

SECURITY LABEL [ FOR <provider> ] ON <object> IS '<label>';

E.g.) SECURITY LABEL ON TABLE t1 IS 'system_u:object_r:sepgsql_table_t:s0';

* It supports multiple security providers assigning their security labels to a database object. The pg_seclabel catalog was modified as follows:

CATALOG(pg_seclabel,3037) BKI_WITHOUT_OIDS
{
    Oid     reloid;    /* OID of table containing the object */
    Oid     objoid;    /* OID of the object itself */
    int4    subid;     /* column number, or 0 if not used */
+   text    tag;       /* identifier of external security provider */
    text    label;     /* security label of the object */
} FormData_pg_seclabel;

The new 'tag' field identifies which security provider manages this security label. For example, SE-PostgreSQL uses "selinux" as its identifier.

* The security hook to check relabeling is now registered using register_object_relabel_hook(), which takes the tag of an ESP module and a function pointer to the security hook. ExecSecLabelStmt() picks up the appropriate security hook, so the right one is invoked even when multiple modules are loaded.

* Add _copySecLabelStmt() in nodes/copyfuncs.c and _equalSecLabelStmt() in nodes/equalfuncs.c, because I had forgotten to add them when the new parse node (SecLabelStmt) was added.

* Add descriptions of the pg_seclabel catalog and the SECURITY LABEL statement to the documentation.

Thanks, (2010/07/23 22:36), Robert Haas wrote: > On Fri, Jul 23, 2010 at 8:59 AM, KaiGai Kohei wrote: >> (2010/07/23 20:44), Robert Haas wrote: >>> >>> 2010/7/23 KaiGai Kohei: > > Hmm. How about if there's just one provider loaded, you can omit it, > but if you fail to specify it and there's >1 loaded, we just throw an > error saying you didn't specify whose label it is. > Perhaps, we need to return the caller a state whether one provider checked the given label at least, or not. >>> >>> Return to the caller?
This is an SQL command. You either get an >>> error, or you don't. >>> >> Ahh, I was talking about the relationship between the core PG code and the ESP >> module. >> It means the security hook returns a state which informs the core PG code >> whether one provider checked the given label, so the core PG code can >> decide whether it raises an actual error to users, or not. >> >> In other words, I'd like to suggest a security hook which returns the tag >> of an ESP module, as follows: >> >> const char * >> check_object_relabel_hook(const ObjectAddress *object, >> const char *provider, >> const char *seclabel); > > I don't think that's a very good design. What I had in mind was a > simple API for security providers to register themselves (including > their names), and then the core code will only call the relevant > security provider. I did try to explain this in point #3 of my > original review. > -- KaiGai Kohei pgsql-v9.1-security-label-2.v2.patch Description: application/octect-stream pgsql-v9.1-security-label-1.v2.patch Description: application/octect-stream