[HACKERS] Re: [COMMITTERS] pgsql: Add ERROR msg for GLOBAL/LOCAL TEMP is not yet implemented
On 9 June 2012 17:19, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: On 9 June 2012 16:46, Tom Lane t...@sss.pgh.pa.us wrote: I don't believe there was actual consensus for this change, It was hardly a subject of marked disagreement. It was hardly a subject of discussion, as yet. Personally I'm pretty doubtful about suddenly starting to throw errors for syntax we've accepted without complaint for over nine years, on merely the grounds that we *might* get around to making it do something different in the future. You yourself have complained loudly about compatibility breaks that were considerably better founded than this. Possibly a NOTICE or WARNING (with some other text than this) would be a better choice for warning people that a compatibility break might be coming. OK, I will revert pending further discussion and agreement. The reason for action was simply to close an unobtrusive open item, but it's clear it wasn't. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Visual Studio 2012 RC
On Sunday, June 10, 2012, Brar Piening wrote: Magnus Hagander wrote: I don't have too much hope for them actually changing it - they seem hell-bent on forcing everybody into metro, and this seems to be yet another way to do that. But we can always hope... Looks like they've learnt their lesson... http://blogs.msdn.com/b/visualstudio/archive/2012/06/08/visual-studio-express-2012-for-windows-desktop.aspx Yeah, though what I didn't realise was that 2012 won't target XP (and 2k3?). Urgh. -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [RFC] Interface of Row Level Security
Sorry for my late reply. 2012/6/6 Florian Pflug f...@phlo.org: On Jun5, 2012, at 22:33 , Kohei KaiGai wrote: 2012/6/5 Florian Pflug f...@phlo.org: I can live with any behaviour, as long as it doesn't depend on details of the query plan. My vote would be for always using the role which was active at statement creation time (i.e. at PREPARE/DECLARE time) for RLS purposes though. That way, we could treat the role as being IMMUTABLE I think. I have the same opinion on the point that the active role should be consistent, but differ a bit in the details. I think it should be the timing at the beginning of execution, not plan creation time, just as existing permission checks are applied at InitPlan() or its callees. From a consistency POV, I'd agree. But binding the active role after planning means giving up a lot of query optimization opportunities. It will prevent the role from participating in constant folding, which translates to a potentially enormous loss of performance. Take the following policy function as an example (role = 'some_role' AND expensive_function(row)) OR (role = 'other_role' AND row.field = 'some_value') and assume that there's an index on 'field'. Now, if role is known to be 'other_role', the planner can eliminate the expensive_function() from the plan, and satisfy row.field = 'some_value' with an index scan. Without that knowledge, you'll probably get a sequential scan. I do value consistency, but in this case the benefit of not being consistent with other privilege checks outweighs the drawbacks, I think. I don't deny the possibility of special optimizations using information determined at the planner stage, in the near future. However, I wonder whether the first version of this patch should include all the desirable features at once. It seems to me the heart of the problem is that current_user operates outside of snapshot control. According to the definition of a stable function, its result should not change during a particular scan.
Yeah, well, ISTM that our definition of STABLE doesn't really take functions whose return value depends on GUCs into account. Probably, has_XXX_privilege() should work with a reliable data source that is safe from user-id switching. But it is tough work to make the whole GUC machinery snapshot-aware… We don't need that for every GUC, though. I'd say it's sufficient to somehow provide the policy function with a stable value of the active role (whatever active ends up meaning). If we do that, we can simply document that making the policy function depend on other GUCs is a seriously bad idea. We *do* need that stable active role value though, since we cannot very well advise against using current_role in the policy function without providing an alternative… I have no preference about the implementation details of this. As long as we can have an actually stable user-id by the v9.3 release, it is sufficient for me. To reiterate, my point is that we won't get away with zero planner changes even without RLSBYPASS. In fact, we'll be paying *most* of the complexity costs of RLSBYPASS whether we actually add that feature or not. Which makes not adding it look like a pretty bad deal to me. It seems to me 95% of our opinion is common, except for a few detailed points. I don't dislike the idea of the RLSBYPASS permission, but I'd like to investigate this idea with more time, until v9.3 is closed. At least, nobody opposed allowing superusers to bypass the RLS policy. So it can be a minimal starting point, can't it? The RLS patch will take hundreds of lines due to syntax support and query rewriting stuff. It is always a good idea to focus on the core functionality at the starting point. Absolutely. But at the same time, it's important to check that the design allows the additional features to be added later easily. In the case of RLS, I'm worried that by decreeing that it's the role at execution time, not planning time, that counts, we're painting ourselves into a corner.
I view RLSBYPASS as a good sanity check for that. Another one is the GRANT-with-filter-function idea. Both seem to fall into place quite naturally if handled while planning - you simply have to decide which additional WHERE clause to add, if any. Without any negative performance effects from policies which don't apply to the current role, if the optimizer does its job. Getting the same level of performance when policies are added unconditionally seems quite impossible. You might be able to make up for some losses by caching STABLE values (which amounts to another constant folding pass), but you won't ever be able to make up for a lost index scan opportunity or other optimizations which change the structure of the whole query plan. If we have a mechanism to check permissions at the planner stage, it should perform as if it works at the executor stage, as the existing permission mechanism does. Thus, there is no inconsistency even if
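Florian's constant-folding argument can be made concrete with a small sketch. This is illustrative Python, not PostgreSQL source; `expensive_function` and `row.field` are just the placeholder names from his example. It simulates what a planner could do if the active role were known at plan time: fold the known role into the policy predicate, leaving only the branch that can actually apply.

```python
# Hypothetical sketch of plan-time constant folding of an RLS policy.
# Not the real planner: it just shows which residual WHERE clause
# survives once the role is a known constant.

def policy_clause(role):
    """Residual clause after folding a known role into:
    (role = 'some_role' AND expensive_function(row))
      OR (role = 'other_role' AND row.field = 'some_value')
    """
    branches = []
    if role == 'some_role':
        branches.append("expensive_function(row)")
    if role == 'other_role':
        # This branch is index-scannable on 'field'.
        branches.append("row.field = 'some_value'")
    # No branch applies: the policy filters out every row.
    return " OR ".join(branches) if branches else "false"

print(policy_clause('other_role'))  # only the index-scannable branch remains
print(policy_clause('unrelated'))   # folds to constant false
```

If instead the role is bound at executor start, the full two-branch predicate must survive planning, and the `expensive_function(row)` call site blocks the index-scan plan even for roles that could never reach it.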
Re: [HACKERS] Visual Studio 2012 RC
On Sun, Jun 10, 2012 at 3:16 AM, Brar Piening b...@gmx.de wrote: Magnus Hagander wrote: I don't have too much hope for them actually changing it - they seem hell-bent on forcing everybody into metro, and this seems to be yet another way to do that. But we can always hope... Looks like they've learnt their lesson... http://blogs.msdn.com/b/visualstudio/archive/2012/06/08/visual-studio-express-2012-for-windows-desktop.aspx Yeah. Didn't expect that to happen, but happy to see that it did! :-) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [v9.3] Extra Daemons (Re: [HACKERS] elegant and effective way for running jobs inside a database)
2012/6/8 Simon Riggs si...@2ndquadrant.com: On 25 April 2012 10:40, Kohei KaiGai kai...@kaigai.gr.jp wrote: I tried to implement a patch according to the idea. It allows extensions to register an entry point for self-managed daemon processes, which the postmaster then starts and stops in the normal manner. The patch needs much work yet, but has many good ideas. There doesn't seem to be a place where we pass the parameter to say which one of the multiple daemons a particular process should become. It would be helpful for testing to make the example module call 2 daemons each with slightly different characteristics or parameters, so we can test the full function of the patch. This patch was intended to register a daemon multiple times with different names such as auth-counter-1 or auth-counter-2. But I agree with the suggestion that taking a parameter to identify each daemon makes the interface better than the original one. I think it's essential that we allow these processes to execute SQL, so we must correctly initialise them as backends and set up signalling. Which also means we need a parameter to limit the number of such processes. It should be controllable with a flag of RegisterExtraDaemon(). Although it helps reduce code duplication in the case when extra daemons execute SQL, some other use cases may not need SQL execution. Also, I prefer to call these bgworker processes, which is more similar to the autovacuum worker and bgwriter naming. That also gives a clue as to how to set up signalling etc.. I don't think we should allow these processes to override sighup and sigterm. Signal handling should be pretty standard, just as it is with normal backends. Hmm. CHECK_FOR_INTERRUPTS() might be sufficient to handle signaling behavior according to the standard. I have a prototype that has some of these characteristics, so I see our work as complementary. At present, I don't think this patch would be committable in CF1, but I'd like to make faster progress with it than that.
Do you want to work on this more, or would you like me to merge our prototypes into a more likely candidate? I'm not in favor of duplicating similar efforts. If available, could you merge some ideas from my patch into your prototypes? Thanks, -- KaiGai Kohei kai...@kaigai.gr.jp
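The registration interface being discussed - each worker registered under a name, with a parameter saying which of several instances a process should become, and a flag for whether it needs backend (SQL) access - can be sketched in miniature. This is an illustrative Python model, not the proposed C API; `register_extra_daemon` stands in for the hypothetical `RegisterExtraDaemon()` mentioned above, and the field names are assumptions.

```python
# Toy model of an extra-daemon registry in the style the thread discusses:
# the same entry point registered twice with different parameters, so the
# example module can exercise two slightly different worker instances.

from dataclasses import dataclass
from typing import Callable

@dataclass
class ExtraDaemon:
    name: str
    main: Callable[[int], None]  # entry point; receives its own parameter
    param: int                   # which of several instances to become
    needs_sql: bool              # initialise as a full backend or not

registry: list[ExtraDaemon] = []

def register_extra_daemon(name, main, param, needs_sql=False):
    registry.append(ExtraDaemon(name, main, param, needs_sql))

def auth_counter(param):
    print(f"auth-counter instance {param} running")

# Two instances with slightly different parameters, as Simon suggests
# for testing the full function of the patch.
register_extra_daemon("auth-counter", auth_counter, 1)
register_extra_daemon("auth-counter", auth_counter, 2)
```

The point of the `param` argument is that the postmaster can fork one process per registry entry and hand each its own parameter, rather than requiring the extension to register distinct names like auth-counter-1 and auth-counter-2.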
Re: [HACKERS] incorrect handling of the timeout in pg_receivexlog
On Wed, Jun 6, 2012 at 8:10 PM, Fujii Masao masao.fu...@gmail.com wrote: On Tue, Jun 5, 2012 at 11:42 PM, Magnus Hagander mag...@hagander.net wrote: Works for me. We still need a (reworked) patch, though, right? We just move where the conversion between seconds and milliseconds happens? Attached is the updated version of the patch. Thanks. I definitely don't think we need subsecond granularity in the user-facing number. Even a second is pretty short. Yep. (We do need to retain the ability to set it to 0 = off of course). Yep, a value of zero disables the status updates, and the patch adds that explanation into the documentation of pg_basebackup and pg_receivexlog. Applied, with some small modifications. For example, you don't need a frontend-specific copy of #define's that are in the backend, since those don't require linking to the backend, just the #include. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
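The unit handling agreed on above - a user-facing interval in whole seconds, converted once to milliseconds internally, with 0 meaning disabled - is simple enough to state as a sketch. This is an assumed restatement of the behaviour discussed, not the committed code.

```python
# Sketch of the seconds-to-milliseconds conversion for the status
# interval: user-facing granularity is whole seconds, internal timeouts
# are milliseconds, and 0 disables status updates entirely.

def status_interval_ms(user_seconds: int) -> int:
    if user_seconds < 0:
        raise ValueError("status interval must be >= 0")
    return user_seconds * 1000  # 0 stays 0, meaning "off"

assert status_interval_ms(10) == 10000
assert status_interval_ms(0) == 0  # disabled
```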
Re: [HACKERS] pg_receivexlog and feedback message
On Thu, Jun 7, 2012 at 12:40 PM, Magnus Hagander mag...@hagander.net wrote: On Thursday, June 7, 2012, Fujii Masao wrote: On Thu, Jun 7, 2012 at 6:25 PM, Magnus Hagander mag...@hagander.net wrote: On Thursday, June 7, 2012, Fujii Masao wrote: On Thu, Jun 7, 2012 at 5:05 AM, Magnus Hagander mag...@hagander.net wrote: On Wed, Jun 6, 2012 at 8:26 PM, Fujii Masao masao.fu...@gmail.com wrote: On Tue, Jun 5, 2012 at 11:44 PM, Magnus Hagander mag...@hagander.net wrote: On Tue, Jun 5, 2012 at 4:42 PM, Fujii Masao masao.fu...@gmail.com wrote: On Tue, Jun 5, 2012 at 9:53 PM, Magnus Hagander mag...@hagander.net wrote: Right now, pg_receivexlog sets: replymsg->write = InvalidXLogRecPtr; replymsg->flush = InvalidXLogRecPtr; replymsg->apply = InvalidXLogRecPtr; when it sends its status updates. I'm thinking it should set replymsg->write = blockpos instead. Why? That way you can see in pg_stat_replication what has actually been received by pg_receivexlog - not just what we last sent. This can be useful in combination with an archive_command that can block WAL recycling until it has been saved to the standby. And it would be useful as a general monitoring thing as well. I think the original reason was that it shouldn't interfere with synchronous replication - but it does take away a fairly useful use case... I think that not only replymsg->write but also ->flush should be set to blockpos in pg_receivexlog. Which allows pg_receivexlog to behave as a synchronous standby, so we can write WAL to both local and remote synchronously. I believe there are some use cases for synchronous pg_receivexlog. pg_receivexlog doesn't currently fsync() after every write. It only fsync():s complete files. So we'd need to set ->flush only at the end of a segment, right? Yes. Currently the status update is sent for each status interval. In sync replication, a transaction has to wait for a while even after pg_receivexlog has written or flushed the WAL data.
So we should add a new option which specifies whether pg_receivexlog sends the status packet back as soon as it writes or flushes the WAL data, like the walreceiver does? That might be useful, but I think that's 9.3 material at this point. Fair enough. That's a new feature rather than a bugfix. But I think we can get setting the write location in as a bugfix. Also set the flush location? Sending the flush location back seems helpful when using pg_receivexlog for WAL archiving purposes. By seeing the flush location we can ensure that the WAL file has been archived durably (IOW, the WAL file has been flushed in the remote archive area). You can do that with the write location as well, as long as you round it down. You mean to prevent pg_receivexlog from sending back the end of a WAL file as the write location *before* it completes the WAL file? If so, yes. But why do you want to keep the flush location invalid? No. pg_receivexlog sends back the correct write location. Whoever does the check (through pg_stat_replication) rounds down, so it only counts it once pg_receivexlog has acknowledged receiving the whole file. I'm not against doing the flush location as well, I'm just worried about feature-creep :-) But let's see how big a change that would turn out to be... How about this? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ receivelog_flushpos.patch Description: Binary data
Re: [HACKERS] New Postgres committer: Kevin Grittner
On Fri, Jun 8, 2012 at 1:04 AM, Peter Geoghegan pe...@2ndquadrant.com wrote: On 7 June 2012 23:40, Peter Geoghegan pe...@2ndquadrant.com wrote: On 7 June 2012 23:15, Tom Lane t...@sss.pgh.pa.us wrote: Please join me in welcoming him aboard. Congratulations, Kevin. Idle thought for the web team: Now might be a good time to take down the blurb on .org in which Kevin describes the mailing list support as amazing. :-) Should we, though? We have quotes from e.g. Simon in the list as well - just because he's a committer doesn't make it any less valid, does it? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
[HACKERS] Boyer-Moore, no less
Searching for Boyer-Moore in the manual, one only finds the following line (in the 8.4 release notes): Improve the performance of text_position() and related functions by using Boyer-Moore-Horspool searching (David Rowley) I have two questions: 1. Where in postgres is the Boyer-Moore algorithm used? (What happened to text_position() anyway?) 2. Shouldn't there be, for 'advocacy' reasons, a mention of this well-known algorithm in the manual proper? (I can make a doc-patch if needed) Thanks, Erik Rijkers
Re: [HACKERS] pg_basebackup --xlog compatibility break
On Mon, Jun 4, 2012 at 2:14 PM, Peter Eisentraut pete...@gmx.net wrote: On tis, 2012-05-29 at 22:31 +0200, Magnus Hagander wrote: Yeah, good arguments all around, I agree too :-) Next question is - suggestions for naming of said parameter? --xlog-method=something? And/or -Xsomething, which would automatically enable -x? How's this? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ basebackup_xlog_param.patch Description: Binary data
Re: [HACKERS] Boyer-Moore, no less
On 10.06.2012 14:28, Erik Rijkers wrote: Searching for Boyer-Moore in the manual, one only finds the following line (in the 8.4 release notes): Improve the performance of text_position() and related functions by using Boyer-Moore-Horspool searching (David Rowley) I have two questions: 1. Where in postgres is the Boyer-Moore algorithm used? (What happened to text_position() anyway?) text_position() is in src/backend/utils/adt/varlena.c. It was split into a few subroutines, which together implement the Boyer-Moore algorithm. The internal text_position() function backs the SQL-style POSITION(substring IN string) syntax. In hindsight, the release notes should've spelled out POSITION(substring IN string), rather than just assume that people know that text_position() refers to that. 2. Shouldn't there be, for 'advocacy' reasons, a mention of this well-known algorithm in the manual proper? (I can make a doc-patch if needed) Doesn't seem necessary to me. I don't see any natural place to put such a notice in the docs. I think people just rightly assume that the built-in functions are implemented efficiently, and there's no need to go into details. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
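For readers unfamiliar with the algorithm the release note names, here is a minimal Boyer-Moore-Horspool search. This is an illustrative Python version, not the C implementation in src/backend/utils/adt/varlena.c; it mirrors the 1-based convention of SQL's POSITION(substring IN string).

```python
# Minimal Boyer-Moore-Horspool substring search. The skip table records,
# for each character of the needle except the last, how far the search
# window may shift when that character is the last one in the window.

def bmh_position(haystack: str, needle: str) -> int:
    """1-based position of needle in haystack, 0 if absent,
    like SQL's POSITION(substring IN string)."""
    n, m = len(haystack), len(needle)
    if m == 0:
        return 1
    skip = {c: m - i - 1 for i, c in enumerate(needle[:-1])}
    pos = 0
    while pos + m <= n:
        if haystack[pos:pos + m] == needle:
            return pos + 1
        # Shift by the skip for the last character of the current window;
        # characters not in the needle allow a full-length shift.
        pos += skip.get(haystack[pos + m - 1], m)
    return 0

assert bmh_position("Thomas", "hom") == 2  # like POSITION('hom' IN 'Thomas')
```

The payoff over naive search is the full-length shift on a mismatch against a character that does not occur in the needle, which is what makes long-needle searches sublinear on average.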
Re: [HACKERS] Ability to listen on two unix sockets
Peter Eisentraut pete...@gmx.net writes: On lör, 2012-06-09 at 18:26 -0400, Tom Lane wrote: That's not actually quite the same thing as what I suggest above. Currently, unix_socket_directory *overrides* the compiled-in choice. I'm suggesting that it would be better to invent a list that is *added to* the compiled-in choice. If we think it would be best to still be able to override that, then I'd vote for keeping unix_socket_directory as is, and then adding a list named something like secondary_socket_directories. But if we just turn unix_socket_directory into a list, I think the lack of separation between primary and secondary directories will be confusing. By that logic, any list-valued parameter should be split into a primary and secondary setting. Well, no: the key point here is that there will be one directory that is special because it's the one baked into libpq. I agree that for the purposes of the backend in isolation, we might as well just have a list. What's less clear is whether, when considering the backend+client ecosystem as a whole, the special status of the configure-time socket directory ought to be reflected in the way we set up the GUCs. I have to admit that I'm not totally sold on either approach. regards, tom lane
Re: [HACKERS] pg_basebackup --xlog compatibility break
On Sun, Jun 10, 2012 at 8:43 PM, Magnus Hagander mag...@hagander.net wrote: On Mon, Jun 4, 2012 at 2:14 PM, Peter Eisentraut pete...@gmx.net wrote: On tis, 2012-05-29 at 22:31 +0200, Magnus Hagander wrote: Yeah, good arguments all around, I agree too :-) Next question is - suggestions for naming of said parameter? --xlog-method=something? And/or -Xsomething, which would automatically enable -x? How's this? Looks good to me. Regards, -- Fujii Masao
[HACKERS] Something weird happening in the buildfarm
Although HEAD builds seem to still be happy, all back branches have been failing with git errors for the last six hours or so. Who broke what? regards, tom lane
Re: [HACKERS] [pgsql-www] Something weird happening in the buildfarm
On Sun, Jun 10, 2012 at 3:05 PM, Tom Lane t...@sss.pgh.pa.us wrote: Although HEAD builds seem to still be happy, all back branches have been failing with git errors for the last six hours or so. Who broke what? We had a server failure on the box that runs git.postgresql.org, so it was unavailable for a few hours (I think a bit over 3 hours). It has been back up for a while now, though, so they should be going green now... I think the reason it's showing red on so many back branches still is that they just don't build as often. All tries were almost 5 hours ago, which is around the time it came back up. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] Ability to listen on two unix sockets
On Sun, Jun 10, 2012 at 8:36 AM, Tom Lane t...@sss.pgh.pa.us wrote: Peter Eisentraut pete...@gmx.net writes: On lör, 2012-06-09 at 18:26 -0400, Tom Lane wrote: That's not actually quite the same thing as what I suggest above. Currently, unix_socket_directory *overrides* the compiled-in choice. I'm suggesting that it would be better to invent a list that is *added to* the compiled-in choice. If we think it would be best to still be able to override that, then I'd vote for keeping unix_socket_directory as is, and then adding a list named something like secondary_socket_directories. But if we just turn unix_socket_directory into a list, I think the lack of separation between primary and secondary directories will be confusing. By that logic, any list-valued parameter should be split into a primary and secondary setting. Well, no: the key point here is that there will be one directory that is special because it's the one baked into libpq. I agree that for the purposes of the backend in isolation, we might as well just have a list. What's less clear is whether, when considering the backend+client ecosystem as a whole, the special status of the configure-time socket directory ought to be reflected in the way we set up the GUCs. I have to admit that I'm not totally sold on either approach. I think we should consider this in the context of allowing both additional UNIX sockets and additional TCP ports. In the case of TCP ports, it's clearly no good to turn port into a list, because one port number has to be primary, since it goes into the PID file and also affects the naming of any UNIX sockets created. If we add secondary_socket_dirs, I think we will also need secondary_ports. One idea might be to have one new GUC that serves both purposes: additional_sockets = '12345, /foo' I'm sure there are other ways to skin the cat as well. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
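Robert's one-GUC idea above - a single list mixing extra TCP ports and extra socket directories, as in additional_sockets = '12345, /foo' - can be sketched with a tiny parser. The GUC name and the disambiguation rule (digits mean a port, a leading slash means a directory) are assumptions for illustration, not settled design.

```python
# Sketch of parsing a hypothetical additional_sockets GUC whose entries
# are either extra TCP ports (numeric) or extra Unix socket directories
# (absolute paths), e.g. additional_sockets = '12345, /foo'.

def parse_additional_sockets(value: str):
    ports, socket_dirs = [], []
    for entry in (e.strip() for e in value.split(",")):
        if not entry:
            continue
        if entry.isdigit():
            ports.append(int(entry))
        elif entry.startswith("/"):
            socket_dirs.append(entry)
        else:
            raise ValueError(f"unrecognised additional_sockets entry: {entry!r}")
    return ports, socket_dirs

assert parse_additional_sockets("12345, /foo") == ([12345], ["/foo"])
```

One design consequence worth noting: keeping the primary port and socket directory in their existing GUCs, with only the extras in this list, preserves the special status of the primary port (it goes into the PID file and names the Unix sockets) that Robert points out.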
Re: [HACKERS] log_newpage header comment
On Sat, Jun 9, 2012 at 1:43 AM, Amit kapila amit.kap...@huawei.com wrote: On further review, I think that we ought to make MarkBufferDirty() the caller's job, because sometimes we may need to xlog only if XLogIsNeeded(), but the buffer's got to get marked dirty either way. In case Xlog is not required, wouldn't it fsync the data? So in that case even MarkBufferDirty() would also not be required. Uh... no. The whole point of doing things in shared buffers is that you don't have to write and fsync the buffers immediately. Instead, buffer eviction handles that stuff for you. So I think the new function should just do step 5 - emit XLOG and set LSN/TLI. In the API log_newpage_buffer(), as the buffer already contains the page to be logged, can't it be assumed that the page will be initialized, with no need to check PageIsNew before setting LSN/TLI? I don't see why it's any different from log_newpage() in that regard. That data is initialized before being written, as well, but someone contemplated the possible need to write a page of all zeros. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] log_newpage header comment
On Sat, Jun 9, 2012 at 10:49 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Whee, testing is fun. Second try. I'm concerned by the fact that neither the original nor the new code bother to test whether the relation is WAL-loggable. It may be that ginbuildempty cannot be invoked for temp tables, but it still seems like an oversight waiting to bite you. I'd be happier if there were a RelationNeedsWAL test here. Come to think of it, the other foobuildempty functions aren't checking this either. That's a good thing, because if they were, it would be wrong. RelationNeedsWAL() tells you whether the main fork is WAL-logged, but the buildempty routines exist for the purpose of constructing the init fork, which should always be WAL-logged. A related point is that most of the other existing calls to log_newpage are covered by if (XLogIsNeeded()) tests. It's not immediately obvious why these two shouldn't be. After some reflection I think that's correct, but probably the comments for log_newpage and log_newpage_buffer need to explain the different WAL-is-needed tests that apply to the two usages. (I'm also thinking that the XLogIsNeeded macro is very poorly named, as the situations it should be used in are far more narrow than the macro name suggests. Haven't consumed enough caffeine to think of a better name, though.) XLogIsNeeded() is, indeed, not a very good name: it's telling us whether wal_level > minimal, and thus whether there might be a standby. When log_newpage() is used to rebuild a relation, we use WAL bypass whenever possible, since the whole relation will be removed if the transaction rolls back, but we must still log it if a standby exists. In contrast, the init forks need to make it to the standby regardless. I'm not sure that I agree that it's the job of log_newpage() to explain to people calling it which things they must conditionalize WAL generation on: after all, none of this is unique to new-page records.
If we emit some other record while creating an index on the primary, we can skip that when !XLogIsNeeded(), too. Any operation we perform on any relation fork other than the init fork can be conditionalized on RelationNeedsWAL(), not just new-page records. The charter of the function is just to avoid duplicating the mumbo-jumbo that's required to emit the record. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
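The rules Robert and Tom lay out in this exchange can be condensed into a small decision function. This is a sketch of one reading of the thread, not PostgreSQL code - the real logic lives in C macros like RelationNeedsWAL() and XLogIsNeeded(), and the string-valued parameters here are simplifications.

```python
# Condensed restatement of the WAL-logging rules discussed above:
# - init forks are always WAL-logged (they must reach the standby);
# - forks of temp/unlogged relations are never WAL-logged;
# - bulk rebuilds via log_newpage() may bypass WAL unless wal_level
#   above "minimal" means a standby might need the pages;
# - ordinary changes to a permanent relation are always WAL-logged.

def must_wal_log(fork, relation_is_permanent, wal_level, bulk_rebuild=False):
    if fork == "init":
        return True                    # always, regardless of wal_level
    if not relation_is_permanent:
        return False                   # temp/unlogged: RelationNeedsWAL() false
    if bulk_rebuild:
        return wal_level != "minimal"  # i.e. the XLogIsNeeded() test
    return True                        # ordinary case: RelationNeedsWAL() true

assert must_wal_log("init", False, "minimal")   # unlogged table's init fork
assert not must_wal_log("main", True, "minimal", bulk_rebuild=True)
```

This makes Robert's point visible: the bulk-rebuild bypass and the init-fork exception are properties of the caller's situation, not of log_newpage() itself, which is why the conditional tests belong at the call sites.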
Re: [HACKERS] pg_receivexlog and feedback message
On Sun, Jun 10, 2012 at 7:55 PM, Magnus Hagander mag...@hagander.net wrote: How about this?

+ /*
+  * Set flushed position to the last byte in the previous
+  * file. Per above we know that xrecoff%XLOG_SEG_SIZE=0
+  */
+ flushedpos = blockpos;
+ if (flushedpos.xrecoff == 0)
+ {
+     flushedpos.xlogid--;
+     flushedpos.xrecoff = XLogFileSize-1;
+ }
+ else
+     flushedpos.xrecoff--;

flushedpos.xrecoff doesn't need to be decremented by one. If xrecoff % XLOG_SEG_SIZE = 0, the position should be the last byte of the previous (i.e., flushed) WAL file. Regards, -- Fujii Masao
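Fujii's review point can be restated with a small sketch. This is an interpretation of the thread, not the committed code: XLogRecPtr is modeled as an (xlogid, xrecoff) pair, and the convention assumed is that a position at a segment boundary already denotes "everything before this byte is flushed", so no -1 adjustment is needed.

```python
# Sketch of reporting the flush position from pg_receivexlog, which only
# fsync():s complete segment files: at a segment boundary the current
# block position itself already marks the end of the flushed segment.

XLOG_SEG_SIZE = 16 * 1024 * 1024  # 16 MB, the default segment size

def flushed_position(xlogid, xrecoff):
    """Flush position to report, given the current block position."""
    if xrecoff % XLOG_SEG_SIZE == 0:
        # On a segment boundary: the whole previous segment is flushed,
        # and this position already expresses that - no decrement.
        return (xlogid, xrecoff)
    # Mid-segment: only data up to the last completed segment is durable.
    return (xlogid, xrecoff - xrecoff % XLOG_SEG_SIZE)

assert flushed_position(0, XLOG_SEG_SIZE) == (0, XLOG_SEG_SIZE)
assert flushed_position(0, XLOG_SEG_SIZE + 5) == (0, XLOG_SEG_SIZE)
```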
Re: [HACKERS] Re: [COMMITTERS] pgsql: Add ERROR msg for GLOBAL/LOCAL TEMP is not yet implemented
Tom Lane wrote: Kevin Grittner writes: Because the current support for temporary tables is relatively similar to the standard's description of LOCAL TEMPORARY TABLES, but nothing at all like the standard's description of GLOBAL TEMPORARY TABLES. Um ... did you read the spec before stating that? Well, I did, but not recently. Clearly I should not have trusted my memory. A fresh review brought it all back to me. When I was reviewing the standard (in 1998) to decide what to implement for the SQL parser in the Wisconsin Courts framework I was designing, I decided to effectively consider any compound statement block (delimited by BEGIN/END) in a trigger or stored procedure to be equivalent to a module in terms of features such as temporary tables -- because we didn't have a concept of modules and wanted to borrow features from the standard which were defined in terms of modules. So over the years I muddled what was in the standard with implementation details of our framework. I apologize for the confusion. So there are three types of temporary tables defined in the standard, and the PostgreSQL implementation doesn't look like any of them. The bad thing is that PostgreSQL supports syntax for two of them without matching the standard semantics. :-( CREATE TEMPORARY TABLE (without LOCAL or GLOBAL) would be an extension which could not be confused with standard syntax, so arguably refusing to accept those would be the right thing to do from a standards compliance perspective -- it would be a legitimate PostgreSQL extension that way, but the breakage of working code which would result from suddenly doing that could never be considered acceptable. What to do about all this? Unless we expect to implement modules in the next release or two, perhaps we should address the LOCAL noise-word in the docs, with a note that its usage is non-standard and discouraged since it might conflict with standard usage in future releases.
Since we've had an actual patch submitted for implementing GLOBAL temporary tables, perhaps that merits a run-time warning in 9.2 (also supported by a warning in the docs). I think we've dug ourselves into a hole by supporting standard syntax with non-standard semantics. As the saying goes, when you find yourself in that position, the first thing to do is to stop digging. -Kevin
Re: [HACKERS] Temporary tables under hot standby
On Fri, Jun 08, 2012 at 01:26:20PM -0400, Robert Haas wrote: On Sun, Apr 29, 2012 at 4:02 PM, Noah Misch n...@leadboat.com wrote: On Tue, Apr 24, 2012 at 11:55:15PM -0400, Noah Misch wrote: Concerning everyone's favorite topic, how to name the new type of table, I liked Tom's proposal[1] to make CREATE TEMP TABLE retain current behavior and have CREATE GLOBAL TEMP TABLE and/or CREATE LOCAL TEMP TABLE request the new SQL-standard variety. (I'd vote for using CREATE GLOBAL and retaining CREATE LOCAL for future expansion.) As he mentions, to get there, we'd ideally start by producing a warning instead of silently accepting GLOBAL as a noise word. Should we put such a warning into 9.2? Here is the change I'd make. This is listed on the open items list. I haven't ever heard anyone propose to redefine CREATE LOCAL TEMP TABLE to mean anything different than CREATE TEMP TABLE, so I'm disinclined to warn about that. From a documentation perspective, it will be awkward to explain (or decline to explain) that both GLOBAL TEMPORARY and LOCAL TEMPORARY are standard syntaxes with non-standard behavior, only one of which emits a warning. That unduly telegraphs a prediction about which one will change first/ever. Maybe that's nonetheless the right pragmatic answer. I would be more open to warning people about CREATE GLOBAL TEMP TABLE - frankly, it's pretty wonky that we allow that but treat GLOBAL as a noise word in the first place. But I'm a little disinclined to have the message speculate about what might happen in future versions of PostgreSQL. Such predictions don't have a very good track record of being accurate. I feel the predictions in question (This may specify different semantics in future versions of PostgreSQL. and This usage is deprecated and may specify standard-compliant behavior in the future.) were broad enough to mitigate this concern. If we ever do change the interpretation of this syntax, to what could it be other than the standard behavior?
We're not likely to introduce a different but still-nonstandard behavior for this standard syntax. I wrote the verbiage that way for the benefit of users encountering the new warning. They might reasonably ask, "Why did the PostgreSQL developers create this work for me?" No objection to removing the errhint, but I think the documentation wording should stay. Concerning whether to make this a WARNING or an ERROR, does anyone still object to WARNING? Thanks, nm -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
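For readers following along, this is the syntax under discussion (a minimal illustration; in PostgreSQL through 9.2, GLOBAL and LOCAL here are accepted but have no effect):

```sql
-- All three statements currently create an ordinary session-local temp table:
CREATE TEMPORARY TABLE t1 (a int);
CREATE GLOBAL TEMPORARY TABLE t2 (a int);  -- GLOBAL is a noise word today
CREATE LOCAL TEMPORARY TABLE t3 (a int);   -- LOCAL likewise
```

The proposal is to warn on the GLOBAL form (and possibly the LOCAL form) so that its meaning can someday be changed to the SQL-standard one.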
Re: [HACKERS] Time for pgindent run?
On Wed, Jun 06, 2012 at 09:40:45PM -0400, Robert Haas wrote: On Tue, Jun 5, 2012 at 10:25 AM, Bruce Momjian br...@momjian.us wrote: On Tue, Jun 05, 2012 at 10:21:14AM -0400, Tom Lane wrote: Bruce Momjian br...@momjian.us writes: Is everyone ready for me to run pgindent? We are nearing the first commit-fest (June 15) and will have to branch the git tree soon. Also, we should do the pgindent run well before the commitfest, so that authors of pending patches have time to rebase their patches in case pgindent changes the code they are patching ... Ah, good point. That will affect commit-fest patches. We could run it only on the 9.3 branch, but that makes double-patching very hard. Is everyone good for a pgindent run this week? The sooner the better. +1 Barring a pgindent commit in the next ~2 days, I think we should branch and run it independently on master, after the 2012-06 CF, and on REL9_2_STABLE.
[HACKERS] Streaming-only Remastering
So currently we have a major limitation in binary replication, where it is not possible to remaster your system (that is, designate the most caught-up standby as the new master) based on streaming replication only. This is a major limitation because the requirement to copy physical logs over scp (or similar methods), manage and expire them more than doubles the administrative overhead of managing replication. This becomes even more of a problem if you're doing cascading replication. Therefore I think this is a high priority for 9.3. As far as I can tell, the change required for remastering over streaming is relatively small; we just need to add a new record type to the streaming protocol, and then start writing the timeline change to that. Are there other steps required which I'm not seeing? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com San Francisco
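To illustrate the overhead Josh describes: with 9.2-era replication, a promoted standby's peers cannot learn of the timeline switch over the streaming connection alone, so every remaining standby still needs archive access. A sketch of the kind of recovery.conf this forces on each standby (hostnames and paths are placeholders):

```
standby_mode = 'on'
primary_conninfo = 'host=new-master port=5432'
# Streaming alone cannot follow the timeline change; the timeline history
# file and any WAL archived by the new master must come via restore_command:
restore_command = 'scp archive-host:/wal-archive/%f %p'
recovery_target_timeline = 'latest'
```

It is the restore_command plumbing (shared archive storage, expiry, monitoring) that the proposal would make unnecessary for remastering.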
Re: [HACKERS] Time for pgindent run?
On Sun, Jun 10, 2012 at 01:47:10PM -0400, Noah Misch wrote: On Wed, Jun 06, 2012 at 09:40:45PM -0400, Robert Haas wrote: On Tue, Jun 5, 2012 at 10:25 AM, Bruce Momjian br...@momjian.us wrote: On Tue, Jun 05, 2012 at 10:21:14AM -0400, Tom Lane wrote: Bruce Momjian br...@momjian.us writes: Is everyone ready for me to run pgindent? We are nearing the first commit-fest (June 15) and will have to branch the git tree soon. Also, we should do the pgindent run well before the commitfest, so that authors of pending patches have time to rebase their patches in case pgindent changes the code they are patching ... Ah, good point. That will affect commit-fest patches. We could run it only on the 9.3 branch, but that makes double-patching very hard. Is everyone good for a pgindent run this week? The sooner the better. +1 Barring a pgindent commit in the next ~2 days, I think we should branch and run it independently on master, after the 2012-06 CF, and on REL9_2_STABLE. OK, running it now. Sorry. I am at a conference and forgot about this. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] Time for pgindent run?
On Sun, Jun 10, 2012 at 02:58:03PM -0400, Bruce Momjian wrote: On Sun, Jun 10, 2012 at 01:47:10PM -0400, Noah Misch wrote: On Wed, Jun 06, 2012 at 09:40:45PM -0400, Robert Haas wrote: On Tue, Jun 5, 2012 at 10:25 AM, Bruce Momjian br...@momjian.us wrote: On Tue, Jun 05, 2012 at 10:21:14AM -0400, Tom Lane wrote: Bruce Momjian br...@momjian.us writes: Is everyone ready for me to run pgindent? We are nearing the first commit-fest (June 15) and will have to branch the git tree soon. Also, we should do the pgindent run well before the commitfest, so that authors of pending patches have time to rebase their patches in case pgindent changes the code they are patching ... Ah, good point. That will affect commit-fest patches. We could run it only on the 9.3 branch, but that makes double-patching very hard. Is everyone good for a pgindent run this week? The sooner the better. +1 Barring a pgindent commit in the next ~2 days, I think we should branch and run it independently on master, after the 2012-06 CF, and on REL9_2_STABLE. OK, running it now. Sorry. I am at a conference and forgot about this. Done. Sorry for the delay. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] Streaming-only Remastering
On Sun, Jun 10, 2012 at 11:47 AM, Joshua Berkus j...@agliodbs.com wrote: So currently we have a major limitation in binary replication, where it is not possible to remaster your system (that is, designate the most caught-up standby as the new master) based on streaming replication only. This is a major limitation because the requirement to copy physical logs over scp (or similar methods), manage and expire them more than doubles the administrative overhead of managing replication. This becomes even more of a problem if you're doing cascading replication. Therefore I think this is a high priority for 9.3. As far as I can tell, the change required for remastering over streaming is relatively small; we just need to add a new record type to the streaming protocol, and then start writing the timeline change to that. Are there other steps required which I'm not seeing? Problem that may exist and is likely out of scope: It is possible for a master with multiple slave servers to have slaves which have not read all of the logs off of the master. It is annoying to have to rebuild a replica because it was 1kb behind in reading logs from the master. If the new master could deliver the last bit of the old master's logs, that would be very nice. -- Rob Wultsch wult...@gmail.com
[HACKERS] unlink for DROPs after releasing locks (was Re: Should I implement DROP INDEX CONCURRENTLY?)
On Wed, Aug 24, 2011 at 03:38:09PM -0400, Tom Lane wrote: Merlin Moncure mmonc...@gmail.com writes: On Wed, Aug 24, 2011 at 1:24 PM, Daniel Farina dan...@heroku.com wrote: At Heroku we use CREATE INDEX CONCURRENTLY with great success, but recently when frobbing around some indexes I realized that there is no equivalent for DROP INDEX, and this is a similar but lesser problem (as CREATE INDEX takes much longer), as DROP INDEX takes an ACCESS EXCLUSIVE lock on the parent table while doing the work to unlink files, which nominally one would think to be trivial, but I assure you it is not at times for even indexes that are a handful of gigabytes (let's say ~= a dozen). Are you sure that you are really waiting on the time to unlink the file? There's other stuff going on in there like waiting for lock, plan invalidation, etc. Point being, maybe the time-consuming stuff can't really be deferred, which would make the proposal moot. Assuming the issue really is the physical unlinks (which I agree I'd like to see some evidence for), I wonder whether the problem could be I'd believe it. With a cold cache (sudo sysctl -w vm.drop_caches=3), my desktop takes 2.6s to rm a 985 MiB file. addressed by moving smgrDoPendingDeletes() to after locks are released, instead of before, in CommitTransaction/AbortTransaction. There does not seem to be any strong reason why we have to do that before lock release, since incoming potential users of a table should not be trying to access the old physical storage after that anyway. Agreed. We now have $OLD_SUBJECT, but this is a win independently. I have reviewed the code that runs between the old and new call sites, and I did not identify a hazard of moving it as you describe.
nm

diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 8f00186..8e4a455 100644
*** a/src/backend/access/transam/xact.c
--- b/src/backend/access/transam/xact.c
*************** CommitTransaction(void)
*** 1944,1957 ****
  	 */
  	AtEOXact_Inval(true);

- 	/*
- 	 * Likewise, dropping of files deleted during the transaction is best done
- 	 * after releasing relcache and buffer pins.  (This is not strictly
- 	 * necessary during commit, since such pins should have been released
- 	 * already, but this ordering is definitely critical during abort.)
- 	 */
- 	smgrDoPendingDeletes(true);
-
  	AtEOXact_MultiXact();

  	ResourceOwnerRelease(TopTransactionResourceOwner,
--- 1944,1949 ----
*************** CommitTransaction(void)
*** 1961,1966 ****
--- 1953,1969 ----
  						 RESOURCE_RELEASE_AFTER_LOCKS,
  						 true, true);

+ 	/*
+ 	 * Likewise, dropping of files deleted during the transaction is best done
+ 	 * after releasing relcache and buffer pins.  (This is not strictly
+ 	 * necessary during commit, since such pins should have been released
+ 	 * already, but this ordering is definitely critical during abort.)  Since
+ 	 * this may take many seconds, also delay until after releasing locks.
+ 	 * Other backends will observe the attendant catalog changes and not
+ 	 * attempt to access affected files.
+ 	 */
+ 	smgrDoPendingDeletes(true);
+
  	/* Check we've released all catcache entries */
  	AtEOXact_CatCache(true);
*************** AbortTransaction(void)
*** 2354,2360 ****
  	AtEOXact_Buffers(false);
  	AtEOXact_RelationCache(false);
  	AtEOXact_Inval(false);
- 	smgrDoPendingDeletes(false);
  	AtEOXact_MultiXact();
  	ResourceOwnerRelease(TopTransactionResourceOwner,
  						 RESOURCE_RELEASE_LOCKS,
--- 2357,2362 ----
*************** AbortTransaction(void)
*** 2362,2367 ****
--- 2364,2370 ----
  	ResourceOwnerRelease(TopTransactionResourceOwner,
  						 RESOURCE_RELEASE_AFTER_LOCKS,
  						 false, true);
+ 	smgrDoPendingDeletes(false);
  	AtEOXact_CatCache(false);
  	AtEOXact_GUC(false, 1);
*************** AbortSubTransaction(void)
*** 4238,4250 ****
  		AtEOSubXact_RelationCache(false, s->subTransactionId,
  								  s->parent->subTransactionId);
  		AtEOSubXact_Inval(false);
- 		AtSubAbort_smgr();
  		ResourceOwnerRelease(s->curTransactionOwner,
  							 RESOURCE_RELEASE_LOCKS,
  							 false, false);
  		ResourceOwnerRelease(s->curTransactionOwner,
  							 RESOURCE_RELEASE_AFTER_LOCKS,
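The cold-cache `rm` timing Noah cites can be roughly reproduced with a sketch like this (assumptions: Linux, a throwaway file under /tmp; the page-cache drop needs root, so it is left commented out, and absolute numbers vary with filesystem and disk):

```shell
# Create a moderately large file, then time its removal.
dd if=/dev/zero of=/tmp/unlink_timing_test bs=1M count=64 2>/dev/null
# For a true cold-cache number, first drop the page cache (root only):
#   sysctl -w vm.drop_caches=3
time rm /tmp/unlink_timing_test
```

With the cache dropped, the unlink must read and update filesystem metadata from disk, which is where the multi-second times come from.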
[HACKERS] Resource Owner reassign Locks
As discussed in several different email threads here and on pgsql-performance, when using pg_dump on a large number of objects, the server has quadratic behavior in LockReassignCurrentOwner, where it has to dig through the entire local lock table to push one or two locks up from the portal being dropped to its parent. The attached patch fixes that by remembering up to 10 local locks, and pushing them up specifically rather than digging through the entire lock table. If the list overflows, then it reverts to the old behavior of digging through the entire local lock table. The same change was made to the case where the locks are being released, rather than reassigned (i.e. subtransaction abort rather than commit). I have no evidence that digging through the local lock table during bulk release was ever a bottleneck, but it seemed inconsistent not to make that change as well. When it needs to forget a lock, it searches backwards in the list of remembered locks and then just moves the last lock into the place of the one to be forgotten. Other ResourceOwner Forget functions slide the entire list down to close the gap, rather than using the selection-sort-like method. I don't understand why they do that. If Forgets are strictly LIFO, then it would not make a difference. If they are mostly LIFO but occasionally not, maybe the insertion method would win over the selection method. From what I can tell, locks are mostly released in bulk anyway at transaction end, and rarely released explicitly. This patch reduces the time needed to dump 20,000 simple empty tables from 3m50.903s to 20.695s, with of course larger gains at larger numbers.
dropdb test; createdb test
for f in `seq 1 1 400` ; do
	perl -le "print qq{create table foo\$_ (k integer, v integer);} foreach ($f..$f+)" | psql test > /dev/null
	time pg_dump test | wc -c
done

For degrading performance in other cases, I think the best test case is pgbench -P (implemented in another patch in this commitfest), which has a loop that pushes one or two locks up from a portal to the parent (which already owns them, due to previous rounds of the same loop) very frequently. There might be a performance degradation of 0.5% or so, but it is less than the run-to-run variation. I plan to run some longer tests to get a better estimate. If there is a degradation in that range, how important is that? I wanted to test a real worst case, which would be a malicious ordering of lock releases (take 10 locks, release the first lock taken, then release the other 9 in reverse order), but in the absence of an UNLOCK TABLE command, I can't figure out how to engineer that. Cheers, Jeff

resowner_lock_v1.patch Description: Binary data
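For readers skimming the patch description, the bookkeeping can be sketched as an illustrative model (Python, not the actual C code; the cache size of 10 and the fall-back-on-overflow rule come from the description above, while names like `reassign_to_parent` are invented for this sketch):

```python
MAX_RESOWNER_LOCKS = 10  # small per-owner cache, per the patch description

class ResourceOwner:
    def __init__(self, parent=None):
        self.parent = parent
        self.locks = []          # remembered local locks
        self.overflowed = False  # True => must fall back to full-table scan

    def remember_lock(self, lock):
        if self.overflowed:
            return               # already gave up tracking individually
        if len(self.locks) >= MAX_RESOWNER_LOCKS:
            self.locks.clear()
            self.overflowed = True
        else:
            self.locks.append(lock)

    def forget_lock(self, lock):
        if self.overflowed:
            return
        # Search backwards; fill the hole with the last entry instead of
        # sliding the whole list down (the "selection-sort-like" method).
        for i in range(len(self.locks) - 1, -1, -1):
            if self.locks[i] == lock:
                self.locks[i] = self.locks[-1]
                self.locks.pop()
                return

    def reassign_to_parent(self, lock_table):
        if self.overflowed:
            # Old behavior: dig through the entire local lock table.
            for lock, owner in lock_table.items():
                if owner is self:
                    lock_table[lock] = self.parent
        else:
            # New behavior: push up only the remembered locks.
            for lock in self.locks:
                lock_table[lock] = self.parent
                self.parent.remember_lock(lock)
        self.locks.clear()
```

The quadratic behavior disappears because dropping each of N portals no longer scans a lock table that is itself O(N) in size, except in the rare overflow case.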
Re: [HACKERS] Temporary tables under hot standby
Noah Misch n...@leadboat.com writes: On Fri, Jun 08, 2012 at 01:26:20PM -0400, Robert Haas wrote: I haven't ever heard anyone propose to redefine CREATE LOCAL TEMP TABLE to mean anything different than CREATE TEMP TABLE, so I'm disinclined to warn about that. From a documentation perspective, it will be awkward to explain (or decline to explain) that both GLOBAL TEMPORARY and LOCAL TEMPORARY are standard syntaxes with non-standard behavior, only one of which emits a warning. Yeah. If we're going to touch this at all, I think we should warn about both, because they are both being interpreted in a non-standards-compliant fashion. It's possible that different message texts would be appropriate, though. If we create the infrastructure necessary to make GLOBAL TEMP standards-compliant, it would not be totally unreasonable (IMO) to make LOCAL TEMP act like GLOBAL TEMP. It would still be non-compliant, but closer than it is today. Moreover, if you argue that the whole session is one SQL module, it could actually be seen as compliant, in a subsetty kind of way. (Or so I think; but I've not read the relevant parts of the spec very recently either.) regards, tom lane
Re: [HACKERS] [PATCH] Support for foreign keys with arrays
On Wed, Mar 28, 2012 at 04:25:06PM +0200, Marco Nenciarini wrote: On Mon, 19/03/2012 at 18.41 +0100, Marco Nenciarini wrote: Attached is v5, which should address all the remaining issues. Please find attached v6 of the EACH Foreign Key patch. From v5, only cosmetic changes to the documentation were made. This has bitrotted; please refresh. Also, please evaluate Peter's feedback: http://archives.postgresql.org/message-id/1333693277.32606.9.ca...@vanquo.pezone.net Thanks, nm
Re: [HACKERS] Ability to listen on two unix sockets
On Sun, 2012-06-10 at 09:41 -0400, Robert Haas wrote: I think we should consider this in the context of allowing both additional UNIX sockets and additional TCP ports. In the case of TCP ports, it's clearly no good to turn port into a list, because one port number has to be primary, since it goes into the PID file Not necessarily. The port number in the PID file is only used for pg_ctl to contact the server, and for that it only needs some port, not the primary one. Also, we write the first listen_address into the PID file for the same reason. So if you think there should be a primary port, then there should also be a primary listen_addresses. and also affects the naming of any UNIX sockets created. Why would that matter? If you configure M ports and N Unix socket locations, you get M*N actual sockets created. If we add secondary_socket_dirs, I think we will also need secondary_ports. One idea might be to have one new GUC that serves both purposes: additional_sockets = '12345, /foo' I was getting around to that, although I don't follow the specific syntax you have chosen here. I would like something where you set host and port together, so you can listen on port 5432 on localhost, and port 5433 on *, for example. So maybe listen_addresses = localhost:5432,*:5433 Web servers usually allow that sort of thing, but if you dig deep there, the configuration settings and their interactions can get pretty complicated. For example, you can usually set a default port and then override it in the listen_addresses equivalent.
Re: [HACKERS] Ability to listen on two unix sockets
On Sun, Jun 10, 2012 at 5:06 PM, Peter Eisentraut pete...@gmx.net wrote: On sön, 2012-06-10 at 09:41 -0400, Robert Haas wrote: I think we should consider this in the context of allowing both additional UNIX sockets and additional TCP ports. In the case of TCP ports, it's clearly no good to turn port into a list, because one port number has to be primary, since it goes into the PID file Not necessarily. The port number in the PID file is only used for pg_ctl to contact the server, and for that it only needs some port, not the primary one. Also, we write the first listen_address into the PID file for the same reason. So if you think there should be a primary port, then there should also be a primary listen_addresses. Fair enough, as far as this part goes, but... and also affects the naming of any UNIX sockets created. Why would that matter? If you configure M ports and N Unix socket locations, you get M*N actual sockets created. ...I *seriously* doubt that this is the behavior anyone wants. Creating M sockets per directory seems patently silly. If we add secondary_socket_dirs, I think we will also need secondary_ports. One idea might be to have one new GUC that serves both purposes: additional_sockets = '12345, /foo' I was getting around to that, although I don't follow the specific syntax you have chosen here. I was thinking that each element could be of the form /path or port. But I guess it should really be /path or host:port. I would like something where you set host and port together, so you can listen on port 5432 on localhost, and port 5433 on *, for example. So maybe listen_addresses = localhost:5432,*:5433 Web servers usually allow that sort of thing, but if you dig deep there, the configuration settings and their interactions can get pretty complicated. For example, you can usually set a default port and then override it in the listen_addresses equivalent. That seems like the obvious syntax, but I'm fuzzy on the details. 
We could let 'port' continue to mean the default port, and then listen_addresses can contain either unadorned addresses (in which case we bind to that address using the default port) or address:port designators (in which case we bind to the given address and port). But then suppose port = 1234 and listen_addresses = '5.5.5.5:6789'. Presumably the UNIX socket is still /tmp/.s.PGSQL.1234, but then what ends up in the lock file? PostmasterMain's current algorithm for figuring that out would write 5.5.5.5 for the host and 1234 for the port, which seems like nonsense. Such confusion is why I thought we might fall back on listing all the additional listen locations in a new, separate GUC. But perhaps there is a way to make it work. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
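The "/path vs. host:port" classification Robert describes could be sketched as follows (a hypothetical illustration of the proposed parsing, not PostgreSQL code; the function name is invented):

```python
def classify_listen_element(elem, default_port=5432):
    """Classify one element of a hypothetical additional-sockets list.

    A leading slash means a Unix socket directory (the same convention
    libpq uses for its host parameter); anything else is host[:port],
    with the port falling back to the server's default port setting.
    """
    elem = elem.strip()
    if elem.startswith("/"):
        return ("unix", elem, None)
    host, sep, port = elem.rpartition(":")
    if sep and port.isdigit():
        return ("tcp", host, int(port))
    return ("tcp", elem, default_port)

settings = [classify_listen_element(e)
            for e in "localhost:5432, *:5433, /var/run/postgresql".split(",")]
```

Note that a naive split on ':' would already mishandle bare IPv6 addresses, which illustrates the kind of ambiguity concern raised elsewhere in the thread.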
Re: [HACKERS] unlink for DROPs after releasing locks (was Re: Should I implement DROP INDEX CONCURRENTLY?)
On Sun, Jun 10, 2012 at 4:19 PM, Noah Misch n...@leadboat.com wrote: On Wed, Aug 24, 2011 at 03:38:09PM -0400, Tom Lane wrote: Merlin Moncure mmonc...@gmail.com writes: On Wed, Aug 24, 2011 at 1:24 PM, Daniel Farina dan...@heroku.com wrote: At Heroku we use CREATE INDEX CONCURRENTLY with great success, but recently when frobbing around some indexes I realized that there is no equivalent for DROP INDEX, and this is a similar but lesser problem (as CREATE INDEX takes much longer), as DROP INDEX takes an ACCESS EXCLUSIVE lock on the parent table while doing the work to unlink files, which nominally one would think to be trivial, but I assure you it is not at times for even indexes that are a handful of gigabytes (let's say ~= a dozen). Are you sure that you are really waiting on the time to unlink the file? there's other stuff going on in there like waiting for lock, plan invalidation, etc. Point being, maybe the time consuming stuff can't really be deferred which would make the proposal moot. Assuming the issue really is the physical unlinks (which I agree I'd like to see some evidence for), I wonder whether the problem could be I'd believe it. With a cold cache (sudo sysctl -w vm.drop_caches=3), my desktop takes 2.6s to rm a 985 MiB file. addressed by moving smgrDoPendingDeletes() to after locks are released, instead of before, in CommitTransaction/AbortTransaction. There does not seem to be any strong reason why we have to do that before lock release, since incoming potential users of a table should not be trying to access the old physical storage after that anyway. Agreed. We now have $OLD_SUBJECT, but this is a win independently. I have reviewed the code that runs between the old and new call sites, and I did not identify a hazard of moving it as you describe. I looked at this when we last discussed it and didn't see a problem either, so I tend to agree that we ought to go ahead and do this, unless someone else sees a problem. 
Holding locks for even slightly less time is a good idea, and if it turns out to be substantially less time in some cases, then we win more. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Ability to listen on two unix sockets
Robert Haas robertmh...@gmail.com writes: On Sun, Jun 10, 2012 at 5:06 PM, Peter Eisentraut pete...@gmx.net wrote: On Sun, 2012-06-10 at 09:41 -0400, Robert Haas wrote: If we add secondary_socket_dirs, I think we will also need secondary_ports. One idea might be to have one new GUC that serves both purposes: additional_sockets = '12345, /foo' I was getting around to that, although I don't follow the specific syntax you have chosen here. I was thinking that each element could be of the form /path or port. But I guess it should really be /path or host:port. I'm uncomfortable with the potential for ambiguity there. I think we'd be much better off having two lists, one for TCP addresses and one for Unix socket directories. I'm unconvinced that allowing multiple port numbers is worth the amount of confusion it will cause. In particular, we've traditionally used the port number as part of the key for resources such as shared memory. I think we'd want the number used for that purpose to be what is written into the lock file ... but then what if the postmaster is not actually listening on *any* actual socket with that number? pg_ctl will not be happy. regards, tom lane
Re: [HACKERS] unlink for DROPs after releasing locks (was Re: Should I implement DROP INDEX CONCURRENTLY?)
Robert Haas robertmh...@gmail.com writes: On Sun, Jun 10, 2012 at 4:19 PM, Noah Misch n...@leadboat.com wrote: Agreed. We now have $OLD_SUBJECT, but this is a win independently. I have reviewed the code that runs between the old and new call sites, and I did not identify a hazard of moving it as you describe. I looked at this when we last discussed it and didn't see a problem either, so I tend to agree that we ought to go ahead and do this, +1, as long as you mean in 9.3 not 9.2. I don't see any risk either, but the time for taking new risks in 9.2 is past. Noah, please add this patch to the upcoming CF, if you didn't already. regards, tom lane
Re: [HACKERS] Ability to listen on two unix sockets
On Sun, Jun 10, 2012 at 5:35 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Sun, Jun 10, 2012 at 5:06 PM, Peter Eisentraut pete...@gmx.net wrote: On Sun, 2012-06-10 at 09:41 -0400, Robert Haas wrote: If we add secondary_socket_dirs, I think we will also need secondary_ports. One idea might be to have one new GUC that serves both purposes: additional_sockets = '12345, /foo' I was getting around to that, although I don't follow the specific syntax you have chosen here. I was thinking that each element could be of the form /path or port. But I guess it should really be /path or host:port. I'm uncomfortable with the potential for ambiguity there. I think we'd be much better off having two lists, one for TCP addresses and one for Unix socket directories. I suggested it mostly because we already use that convention in libpq: leading slash = pathname. I'm unconvinced that allowing multiple port numbers is worth the amount of confusion it will cause. In particular, we've traditionally used the port number as part of the key for resources such as shared memory. I think we'd want the number used for that purpose to be what is written into the lock file ... but then what if the postmaster is not actually listening on *any* actual socket with that number? pg_ctl will not be happy. Well, that's why I shied away from the syntax Peter is proposing. I think if we leave the semantics of listen_addresses and port alone, and just add a new GUC for extra places to listen, there's no problem. If you look at the patch I posted upthread, you'll see that I set things up so that we'll still fail if the primary port can't be listened on; once we've established that we can listen there, we'll try to set up the other sockets as well. I think that's a pretty sane way to allow this (which a number of people, not only me, seem to support) without creating surprising semantics.
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Backends stalled in 'startup' state: index corruption
On May 26, 2012, at 9:17 AM, Tom Lane wrote: Would you guys please try this in the problem databases: select a.ctid, c.relname from pg_attribute a join pg_class c on a.attrelid=c.oid where c.relnamespace=11 and c.relkind in ('r','i') order by 1 desc; If you see any block numbers above about 20 then maybe the triggering condition is a row relocation after all. Sorry for such a long delay on the reply. Took a while to get the data directory moved elsewhere:

select a.ctid, c.relname from pg_attribute a join pg_class c
on a.attrelid=c.oid where c.relnamespace=11 and c.relkind in ('r','i')
order by 1 desc;

   ctid  |          relname
---------+---------------------------
 (18,31) | pg_extension_name_index
 (18,30) | pg_extension_oid_index
 (18,29) | pg_seclabel_object_index
 (18,28) | pg_seclabel_object_index
 (18,27) | pg_seclabel_object_index
 (18,26) | pg_seclabel_object_index

As the next step, I'd suggest verifying that the stall is reproducible if you remove pg_internal.init (and that it's not there otherwise), and then strace'ing the single incoming connection to see what it's doing. It does take a little while, but not nearly as long as the stalls we were seeing before. The pg_internal.init is 108k in case that's an interesting data point. Any other tests you'd like me to run on that bad data dir? Also, thus far, the newly initdb'd system continues to hum along just fine. It's also been upgraded to 9.1.4, so if it was the rebuilding of pg_internal.init, then your fix should keep it happy.
Re: [HACKERS] [COMMITTERS] pgsql: Run pgindent on 9.2 source tree in preparation for first 9.3
Excerpts from Bruce Momjian's message of Sun Jun 10 15:20:34 -0400 2012: Run pgindent on 9.2 source tree in preparation for first 9.3 commit-fest. Hm, does this touch stuff that would also be modified by perltidy? I wonder if we should refrain from doing entab/detab on perl files and instead have perltidy touch such code. Perhaps the thing to do is ensure that perltidy also uses tabs instead of spaces.

 src/tools/msvc/Install.pm         | 1056 ++-
 src/tools/msvc/MSBuildProject.pm  |  325 +++---
 src/tools/msvc/Mkvcbuild.pm       | 1278 ---
 src/tools/msvc/Project.pm         |  607 ++--
 src/tools/msvc/Solution.pm        |  980 +-
 src/tools/msvc/VCBuildProject.pm  |  292 +++---
 src/tools/msvc/VSObjectFactory.pm |  152 ++--
 src/tools/msvc/build.pl           |   20 +-
 src/tools/msvc/builddoc.pl        |   32 +-
 src/tools/msvc/config_default.pl  |   38 +-
 src/tools/msvc/gendef.pl          |   66 +-
 src/tools/msvc/install.pl         |    4 +-
 src/tools/msvc/pgbison.pl         |   14 +-
 src/tools/msvc/pgflex.pl          |   60 +-
 src/tools/msvc/vcregress.pl       |  400

-- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
[HACKERS] Re: [COMMITTERS] pgsql: Run pgindent on 9.2 source tree in preparation for first 9.3
On Sun, Jun 10, 2012 at 08:55:13PM -0400, Alvaro Herrera wrote: Excerpts from Bruce Momjian's message of Sun Jun 10 15:20:34 -0400 2012: Run pgindent on 9.2 source tree in preparation for first 9.3 commit-fest. Hm, does this touch stuff that would also be modified by perltidy? I wonder if we should refrain from doing entab/detab on perl files and instead have perltidy touch such code.

 src/tools/msvc/Install.pm         | 1056 ++-
 src/tools/msvc/MSBuildProject.pm  |  325 +++---
 src/tools/msvc/Mkvcbuild.pm       | 1278 ---
 src/tools/msvc/Project.pm         |  607 ++--
 src/tools/msvc/Solution.pm        |  980 +-
 src/tools/msvc/VCBuildProject.pm  |  292 +++---
 src/tools/msvc/VSObjectFactory.pm |  152 ++--
 src/tools/msvc/build.pl           |   20 +-
 src/tools/msvc/builddoc.pl        |   32 +-
 src/tools/msvc/config_default.pl  |   38 +-
 src/tools/msvc/gendef.pl          |   66 +-
 src/tools/msvc/install.pl         |    4 +-
 src/tools/msvc/pgbison.pl         |   14 +-
 src/tools/msvc/pgflex.pl          |   60 +-
 src/tools/msvc/vcregress.pl       |  400

The Perl files were modified by perltidy and not by pgindent, as documented in the pgindent README:

	9) Indent the Perl MSVC code:
	   cd src/tools/msvc
	   perltidy -b -bl -nsfs -naws -l=100 -ole=unix *.pl *.pm

Perhaps the thing to do is ensure that perltidy also uses tabs instead of spaces. If you would like 'entab' run on the Perl files, let me know. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] log_newpage header comment
> Uh... no. The whole point of doing things in shared buffers is that you
> don't have to write and fsync the buffers immediately. Instead, buffer
> eviction handles that stuff for you.

So you mean to say that there exist operations where XLOG is not required even though the buffer is marked dirty for later eviction?

> I don't see why it's any different from log_newpage() in that regard.
> That data is initialized before being written, as well, but someone
> contemplated the possible need to write a page of all zeros.

The comment above the code indicates that the page is uninitialized. Does the code consider a page of all zeros as uninitialized, or is the comment not appropriate?

-----Original Message-----
From: Robert Haas [mailto:robertmh...@gmail.com]
Sent: Sunday, June 10, 2012 7:14 PM
To: Amit Kapila
Cc: Tom Lane; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] log_newpage header comment

On Sat, Jun 9, 2012 at 1:43 AM, Amit Kapila <amit.kap...@huawei.com> wrote:
> > On further review, I think that we ought to make MarkBufferDirty() the
> > caller's job, because sometimes we may need to xlog only if
> > XLogIsNeeded(), but the buffer's got to get marked dirty either way.
>
> In case the place where XLOG is not required, wouldn't it fsync the data?
> So in that case even MarkBufferDirty() will also not be required.

Uh... no. The whole point of doing things in shared buffers is that you don't have to write and fsync the buffers immediately. Instead, buffer eviction handles that stuff for you.

> > So I think the new function should just do step 5 - emit XLOG and set
> > LSN/TLI.
>
> In the API log_newpage_buffer(), since the buffer already contains the page
> to be logged, can't it be assumed that the page will be initialized, with
> no need to check PageIsNew before setting the LSN/TLI?

I don't see why it's any different from log_newpage() in that regard. That data is initialized before being written, as well, but someone contemplated the possible need to write a page of all zeros.
-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
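[Editorial note: the refactoring discussed above splits the work so that callers always mark the buffer dirty themselves, while the proposed helper performs only "step 5" (emit the XLOG record and set the page LSN/TLI). A rough pseudocode sketch of that split follows; the record-construction details are not specified in the thread and are left abstract here.]

```
-- Caller side: dirtying is unconditional, WAL-logging is not
MarkBufferDirty(buf)                  -- always required once the page changed
if XLogIsNeeded():                    -- WAL record may be skippable
    log_newpage_buffer(buf)

-- Proposed helper: "step 5" only (emit XLOG, set LSN/TLI)
log_newpage_buffer(buf):
    page := BufferGetPage(buf)
    lsn  := XLogInsert(full-page image of page)   -- record details unspecified
    PageSetLSN(page, lsn)
    PageSetTLI(page, ThisTimeLineID)              -- TLI field as of 9.2-era pages
```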