Re: [HACKERS] Problem with pg_upgrade (8.4 - 9.0) due to ALTER DATABASE SET ROLE
On Mon, Feb 28, 2011 at 06:21, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Robert Treat <r...@xzilla.net> writes:
>> Did anything ever come of this discussion?
>
> I think it's a TODO --- nothing done about it as yet, AFAIR.
>
>> On one of the databases I was upgrading, I ran into a similar problem
>> with roles that are set as roles. The problem seems to stem from
>> pg_dumpall dumping roles in alphabetical order:
>>
>> CREATE ROLE asha;
>> ALTER ROLE asha SET role TO 'omniti';
>> ... sometime later ...
>> CREATE ROLE omniti;
>
> That seems like a pretty bizarre thing to do. Why would you want such
> a setting?

I'm sure there are several reasons. I've seen (and done) this more than
once to ensure that the owner of newly created objects is the shared
role and not the individual, for example.

-- 
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
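The ordering problem above can be sketched outside of pg_dumpall itself. The fix amounts to a two-pass dump: emit every CREATE ROLE before any ALTER ROLE ... SET, so a setting may reference a role that sorts later alphabetically. This is a minimal Python sketch of that idea; the data structures and function names are mine, not pg_dumpall's (whose real fix would live in C):

```python
# Hypothetical in-memory view of the roles to dump, keyed by role name,
# mapping each role to its per-role settings.
roles = {
    "asha": {"role": "omniti"},    # setting references a later role
    "omniti": {},
    "robert": {"role": "omniti"},  # works today only by accident of sorting
}

def dump_roles(roles):
    stmts = []
    # Pass 1: create all roles first, so every name exists.
    for name in sorted(roles):
        stmts.append(f"CREATE ROLE {name};")
    # Pass 2: only now emit per-role settings; forward references are safe.
    for name in sorted(roles):
        for var, val in roles[name].items():
            stmts.append(f"ALTER ROLE {name} SET {var} TO '{val}';")
    return stmts

stmts = dump_roles(roles)
# All three CREATEs precede both ALTERs.
assert all(s.startswith("CREATE") for s in stmts[:3])
```

The same two-pass structure would apply to ALTER DATABASE ... SET ROLE, the case pg_upgrade tripped over.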
Re: [HACKERS] Native XML
On 02/27/2011 11:57 PM, Peter Eisentraut wrote:
> On sön, 2011-02-27 at 10:45 -0500, Tom Lane wrote:
>> Hmm, so this doesn't rely on libxml2 at all? Given the amount of pain
>> that library has caused us, getting out from under it seems like a
>> mighty attractive idea.
>
> This doesn't replace the existing xml functionality, so it won't help
> getting rid of libxml.

Right, what I published on github.com doesn't replace the libxml2
functionality, and I didn't say it does at this moment. The idea is to
design (or rather start designing) a low-level XML API on which SQL/XML
functionality can be based.

As long as XSLT can be considered a separate topic, Postgres uses a very
small subset of what libxml2 offers, and thus it might not be that
difficult to implement the same level of functionality in a new way. In
addition, I think that using a low-level API that the Postgres
development team fully controls would speed up enhancements of the XML
functionality in the future. When I thought of implementing some
functionality listed on the official TODO, I was a little discouraged by
the workarounds that need to be added in order to deal with libxml2
memory management. Also, parsing the document each time it's accessed
(which involves parser initialization and finalization) is neither
comfortable nor, ultimately, efficient.

A question is, of course, whether a potential new implementation must
necessarily replace the existing one, immediately or at all. What I
published is implemented as a new data type, and thus pg_type.h and
pg_proc.h are the only files where something needs to be merged. From a
technical point of view, the new type can co-exist with the existing one
easily. This, however, raises the question of whether such co-existence
(whether temporary or permanent) would be acceptable for users, i.e.
whether it wouldn't bring significant confusion. That's something I'm
not able to answer.
Re: [HACKERS] Sync Rep v17
On 2011-02-25 20:40, Jaime Casanova wrote:
> On Fri, Feb 25, 2011 at 10:41 AM, Yeb Havinga <yebhavi...@gmail.com> wrote:
>> I also did some initial testing on this patch and got the queue related
>> errors with 1 client. With the code change from Jaime above I still got
>> a lot of 'not on queue' warnings. I tried to understand how the queue
>> was supposed to work - resulting in the changes below that also
>> incorporate a suggestion from Fujii upthread, to exit early when myproc
>> was found.
>
> yes, looking at the code, the warning and your patch... it seems yours
> is the right solution... I'm compiling right now to test again and see
> the effects. Robert, maybe you can test your failure case again? i'm
> really sure it's related to this...

I did some more testing over the weekend with this patched v17 patch.
Since you've posted a v18 patch, let me write down some findings with
the v17 patch before continuing with the v18 patch. The tests were done
on an x86_64 platform, 1Gbit network interfaces, 3 servers. Non-default
configuration changes are copy-pasted at the end of this mail.

1) No automatic switch to the other synchronous standby
- start master server, add synchronous standby 1
- change allow_standalone_primary to off
- add second synchronous standby
- wait until pg_stat_replication shows both standbys are in STREAMING state
- stop standby 1

What happens is that the master stalls, where I expected that it
would've switched to standby 2 to acknowledge commits.

The following thing was pilot error, but since I was test-piloting a new
plane, I still think it might be useful feedback. In my opinion, any
number and order of pg_ctl stops and starts on both the master and
standby servers, as long as they are not with -m immediate, should never
cause the state I reached.

2) Reaching some sort of shutdown deadlock state
- start master server, add synchronous standby
- change allow_standalone_primary to off

Then I did all sorts of test things, everything still ok.
Then I wanted to shut down everything, and maybe because of some
symmetry (stack-like) I did the following because I didn't think it
through:
- pg_ctl stop on standby (didn't actually wait until done, but
  immediately went to the other terminal)
- pg_ctl stop on master

Oh wait... the master needs to sync transactions:
- start standby again, but now: FATAL: the database system is shutting down

There is no clean way to get out of this situation.
allow_standalone_primary in the face of shutdowns might be tricky. Maybe
shutdown must be prohibited from entering the shutting-down phase when
allow_standalone_primary = off and there is no sync standby; that would
allow the sync standby to attach again.

3) PANIC on standby server

At some point a standby suddenly disconnected after I started a new
pgbench run on an existing master/standby pair, with the following error
in the logfile:

LOCATION: libpqrcv_connect, libpqwalreceiver.c:171
PANIC: XX000: heap_update_redo: failed to add tuple
CONTEXT: xlog redo hot_update: rel 1663/16411/16424; tid 305453/15; new 305453/102
LOCATION: heap_xlog_update, heapam.c:4724
LOG: 0: startup process (PID 32597) was terminated by signal 6: Aborted

This might be due to pilot error as well; I did several tests over the
weekend, and after this error I was more alert about immediate
shutdowns, starting with a clean backup after each, and I haven't seen
similar errors since.

4) The performance of syncrep seems to be quite an improvement over the
previous syncrep patches; I've seen tps figures of O(650) where the
others were more like O(20). The O(650) tps is limited by the speed of
the standby server I used: at several points the master would halt only
because of heavy disk activity at the standby. A warning in the docs
might be right: be sure to use good IO hardware for your synchronous
replicas! With that bottleneck gone, I suspect the current syncrep
version can go beyond 1000 tps over 1 Gbit.
regards,
Yeb Havinga

recovery.conf:
standby_mode = 'on'
primary_conninfo = 'host=mg73 user=repuser password=pwd application_name=standby1'
trigger_file = '/tmp/postgresql.trigger.5432'

postgresql.conf non-default parameters:
log_error_verbosity = verbose
log_min_messages = warning
log_min_error_statement = warning
listen_addresses = '*'          # what IP address(es) to listen on
search_path = '$user, public, hl7'
archive_mode = on
archive_command = 'test ! -f /data/backup_in_progress || cp -i %p /archive/%f </dev/null'
checkpoint_completion_target = 0.9
checkpoint_segments = 16
default_statistics_target = 500
constraint_exclusion = on
max_connections = 120
maintenance_work_mem = 128MB
effective_cache_size = 1GB
work_mem = 44MB
wal_buffers = 8MB
shared_buffers = 128MB
wal_level = 'archive'
max_wal_senders = 4
wal_keep_segments = 1000        # 16000MB (for production increase this)
synchronous_standby_names = 'standby1,standby2,standby3'
synchronous_replication = on
allow_standalone_primary = off
Re: [HACKERS] wCTE: about the name of the feature
On 2011-02-28 8:20 AM +0200, Tom Lane wrote:
> Marko Tiikkaja <marko.tiikk...@cs.helsinki.fi> writes:
>> On 2011-02-24 6:40 PM, I wrote:
>>> I am planning on working on the documentation this weekend.
>>
>> And here's my attempt. The language is a bit poor at some places but I
>> can't think of anything better.
>
> Applied after some rather heavy editorialization.

Thanks again.

>> I tried to be more strict about using subquery when talking about WITHs
>> in general since INSERT/UPDATE/DELETE is not a subquery in my book.
>
> I undid most of those changes --- it didn't seem to add anything to be
> strict in this way, and anyway you hadn't done it consistently, eg the
> syntax still had with_query.

I wasn't so sure about those changes either. It does seem more
consistent this way.

Regards,
Marko Tiikkaja
[HACKERS] PG signal handler and non-reentrant malloc/free calls
Hi,

I believe we have a case where not holding off interrupts while doing a
malloc() can cause a deadlock due to system- or libc-level locking. In
this case, a pg_ctl stop in fast mode was issued, and that caused a
backend to handle the interrupt while it was inside the malloc call.
Then, as part of the abort processing, in the subtransaction cleanup
code path, this same backend tried to clear memory contexts, leading to
an eventual free() call. The free() call tried to take the same lock
which was already held by malloc() earlier, resulting in a deadlock!
Will try to get the call stack if needed.

The malloc/free functions are known to not be re-entrant. Doesn't it
make sense to hold off interrupts while making such calls inside the
AllocSet* set of functions? Thankfully the locations are not very many:
AllocSetContextCreate, AllocSetAlloc and AllocSetFree seem to be the
only candidates.

Comments, thoughts?

Regards,
Nikhils
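For readers following along, the proposed guard can be simulated in a few lines of Python. This mirrors the shape of PostgreSQL's HOLD_INTERRUPTS/RESUME_INTERRUPTS/CHECK_FOR_INTERRUPTS machinery but is only a stand-in sketch, not the actual backend code: an interrupt that arrives while the holdoff count is nonzero is remembered, not serviced, so abort processing can never run while the allocator's lock is held.

```python
# Simulated interrupt-holdoff protocol around a non-reentrant call.
InterruptHoldoffCount = 0
InterruptPending = False
serviced = []

def signal_arrives():
    """Signal handler: only note the interrupt (like PostgreSQL's die())."""
    global InterruptPending
    InterruptPending = True

def check_for_interrupts():
    """Service a pending interrupt only when interrupts are not held off."""
    global InterruptPending
    if InterruptPending and InterruptHoldoffCount == 0:
        InterruptPending = False
        serviced.append("aborted")  # subxact cleanup, free(), etc.

def guarded_alloc():
    global InterruptHoldoffCount
    InterruptHoldoffCount += 1      # HOLD_INTERRUPTS()
    signal_arrives()                # fast-shutdown SIGTERM lands mid-malloc
    check_for_interrupts()          # deferred: holdoff count is nonzero
    InterruptHoldoffCount -= 1      # RESUME_INTERRUPTS()
    check_for_interrupts()          # now it is safe to run abort processing

guarded_alloc()
assert serviced == ["aborted"]      # interrupt serviced, but only after resume
```

The point of the sketch is that the interrupt is deferred, not lost: the abort still happens, just never inside the critical region.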
Re: [HACKERS] PG signal handler and non-reentrant malloc/free calls
On 28.02.2011 14:04, Nikhil Sontakke wrote:
> I believe we have a case where not holding off interrupts while doing a
> malloc() can cause a deadlock due to system or libc level locking. In
> this case, a pg_ctl stop in fast mode was resorted to and that caused a
> backend to handle the interrupt when it was inside the malloc call. Now
> as part of the abort processing, in the subtransaction cleanup code
> path, this same backend tried to clear memory contexts, leading to an
> eventual free() call. The free() call tried to take the same lock which
> was already held by malloc() earlier resulting into a deadlock!

Our signal handlers shouldn't try to do anything that complicated.
die(), which handles the SIGTERM caused by a fast shutdown in backends,
doesn't do abort processing itself. It just sets a global variable.
Unless ImmediateInterruptOK is set, but that's only set around a few
blocking system calls where it is safe to do so.

(Checks...) Actually, md5_crypt_verify() looks suspicious: it does
ImmediateInterruptOK = true, and then calls palloc() and pfree().

> Will try to get the call stack if needed.

Yes, please.

-- 
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Replication server timeout patch
On Sun, Feb 27, 2011 at 11:52 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
>> There are two things that I think are pretty clear. If the receiver
>> has wal_receiver_status_interval=0, then we should ignore
>> replication_timeout for that connection.
>
> The patch still doesn't check that wal_receiver_status_interval is set
> up properly. I'll implement that later.

Done. I attached the updated patch.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

replication_timeout_v3.patch
Description: Binary data
[HACKERS] Bug of the hot standby feedback
Hi,

When I implemented the replication timeout patch, I found a bug in the
HS feedback feature: when wal_receiver_status_interval is zero and
hot_standby_feedback is enabled, walreceiver sends the feedback too
aggressively. I think that the attached patch should be applied to fix
the bug.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

hot_standby_feedback_bug_v1.patch
Description: Binary data
Re: [HACKERS] Native XML
On 02/28/2011 04:25 AM, Anton wrote:
> Right, what I published on github.com doesn't replace the libxml2
> functionality and I didn't say it does at this moment. The idea is to
> design (or rather start designing) a low-level XML API on which SQL/XML
> functionality can be based. As long as XSLT can be considered a sort of
> separate topic, then Postgres uses a very small subset of what libxml2
> offers and thus it might not be that difficult to implement the same
> level of functionality in a new way.
>
> [...]
>
> What I published is implemented as a new data type and thus pg_type.h
> and pg_proc.h are the only files where something needs to be merged.
> From a technical point of view, the new type can co-exist with the
> existing one easily. This however implies a question if such
> co-existence (whether temporary or permanent) would be acceptable for
> users, i.e. if it wouldn't bring some/significant confusion. That's
> something I'm not able to answer.
The only reason we need the XML stuff in core at all, and not in a
separate module, is the odd syntax requirements of SQL/XML. But those
operators work on the xml type, and not on any new type you might
invent.

Which TODO items were you trying to implement? And what were the
blockers?

We really can't just consider XSLT, and more importantly XPath, as
separate topics. Any alternative XML implementation that doesn't include
XPath is going to be unacceptably incomplete, IMNSHO.

cheers

andrew
Re: [HACKERS] WIP: cross column correlation ...
On Sun, Feb 27, 2011 at 5:17 PM, Josh Berkus <j...@agliodbs.com> wrote:
>> I think there would be value in giving the DBA an easier way to see
>> which tables are hot, but I am really leery about the idea of trying
>> to feed that directly into the query planner. I think this is one of
>> those cases where we let people tune it manually for starters, and
>> then wait for feedback. Eventually someone will say "oh, I never tune
>> that by hand any more, ever since I wrote this script which does the
>> following computation... and I just run it out of cron". And then we
>> will get out the party hats. But we will never get the experience we
>> need to say what that auto-tuning algorithm will be unless we first
>> provide the knob for someone to fiddle with manually.
>
> I'm not disagreeing with that. I'm saying first, we give DBAs a way to
> see which tables are currently hot. Such a feature has multiple
> benefits, making it worth the overhead and/or coding effort. Whether
> we're shooting for autotuning or manual tuning, it starts with having
> the data.

Well, what we have now is a bunch of counters in pg_stat_all_tables and
pg_statio_all_tables. Making that easier for the DBA almost seems like
more of a job for a third-party tool that, say, graphs it, than a job
for PG itself. But if you have an idea, I'm all ears.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Native XML
Andrew Dunstan <and...@dunslane.net> writes:
> On 02/28/2011 04:25 AM, Anton wrote:
>> A question is of course, if potential new implementation must
>> necessarily replace the existing one, immediately or at all. What I
>> published is implemented as a new data type and thus pg_type.h and
>> pg_proc.h are the only files where something needs to be merged. From
>> technical point of view, the new type can co-exist with the existing
>> easily. This however implies a question if such co-existence (whether
>> temporary or permanent) would be acceptable for users, i.e. if it
>> wouldn't bring some/significant confusion. That's something I'm not
>> able to answer.
>
> The only reason we need the XML stuff in core at all and not in a
> separate module is because of the odd syntax requirements of SQL/XML.
> But those operators work on the xml type, and not on any new type you
> might invent.

Well, in principle we could allow them to work on both, just the same
way that (for instance) + is a standardized operator but works on more
than one datatype. But I agree that the prospect of two parallel types
with essentially duplicate functionality isn't pleasing at all.

I think a reasonable path forward for this work would be to develop and
extend the non-libxml-based type as an extension, outside of core, with
the idea that it might replace the core implementation if it ever gets
complete enough. The main thing that that would imply, which you might
not bother with otherwise, is an ability to deal with existing
plain-text-style stored values. This doesn't seem terribly hard to do
IMO --- one easy way would be to insert an initial zero byte in all
new-style values as a flag to distinguish them from old-style. The
forced parsing that would occur to deal with an old-style value would be
akin to detoasting and could be hidden in the same access macros.

> We really can't just consider XSLT, and more importantly XPath, as
> separate topics. Any alternative XML implementation that doesn't
> include XPath is going to be unacceptably incomplete, IMNSHO.

Agreed. The single most pressing problem we've got with XML right now is
the poor state of the XPath extensions in contrib/xml2. If we don't see
a meaningful step forward in that area, a new implementation of the xml
datatype isn't likely to win acceptance.

regards, tom lane
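Tom's flag-byte idea can be sketched in a few lines. This is a stand-in with invented names, not the real access-macro code; the key observation is that valid XML text can never begin with a NUL byte, so a leading zero byte unambiguously marks a new-style (predigested) value:

```python
# Distinguish predigested ("new-style") stored values from legacy
# plain-text ones by a leading zero byte.
NEW_STYLE_FLAG = b"\x00"

def store_new_style(predigested: bytes) -> bytes:
    """Tag a predigested value with the flag byte before storing it."""
    return NEW_STYLE_FLAG + predigested

def fetch(value: bytes) -> bytes:
    """Return the predigested form, parsing old-style text on demand."""
    if value.startswith(NEW_STYLE_FLAG):
        return value[1:]                 # already predigested
    # old-style plain text: force-parse it, akin to detoasting
    return parse_to_predigested(value)

def parse_to_predigested(text: bytes) -> bytes:
    # stand-in for the real XML parser
    return b"parsed:" + text

assert fetch(store_new_style(b"tree")) == b"tree"
assert fetch(b"<doc/>") == b"parsed:<doc/>"
```

Callers only ever see the predigested form, so old stored values keep working while new writes pay the parsing cost once.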
Re: [HACKERS] Native XML
On Sun, Feb 27, 2011 at 10:20 PM, Andrew Dunstan <and...@dunslane.net> wrote:
> No, I think the xpath implementation is from libxml2. But in any case,
> I think the problem is in the whole design of the xpath_table function,
> and not in the library used for running the xpath queries. i.e. it's
> our fault, and not the library's. (mutters about workmen and tools)

Yeah, I think the problem is that we picked a poor definition for the
xpath_table() function. That poor definition will be equally capable of
causing us headaches on top of any other implementation.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Native XML
On 02/28/2011 10:30 AM, Tom Lane wrote:
> The single most pressing problem we've got with XML right now is the
> poor state of the XPath extensions in contrib/xml2. If we don't see a
> meaningful step forward in that area, a new implementation of the xml
> datatype isn't likely to win acceptance.

xpath_table is severely broken by design, IMNSHO. We need a new design,
but I'm reluctant to work on that until someone does LATERAL, because a
replacement would be much nicer to design with it than without it. But I
don't believe replacing the underlying XML/XPath implementation would
help us fix it at all.

cheers

andrew
Re: [HACKERS] Native XML
Andrew Dunstan <and...@dunslane.net> writes:
> xpath_table is severely broken by design IMNSHO. We need a new design,
> but I'm reluctant to work on that until someone does LATERAL, because a
> replacement would be much nicer to design with it than without it.

Well, maybe I'm missing something, but I don't really understand why
xpath_table's design is so unreasonable. Also, what would a better
solution look like, exactly? (Feel free to assume LATERAL is available.)

regards, tom lane
Re: [HACKERS] PG signal handler and non-reentrant malloc/free calls
Heikki Linnakangas <heikki.linnakan...@enterprisedb.com> writes:
> Unless ImmediateInterruptOK is set, but it's only set around a few
> blocking system calls where it is safe to do so. (Checks...) Actually,
> md5_crypt_verify() looks suspicious, it does ImmediateInterruptOK =
> true, and then calls palloc() and pfree().

Hm, yeah, and ClientAuthentication() seems way too optimistic about what
it does with that set, too. I'm not sure what we can do about it,
though. The general shape of the problem here is that we're about to go
off into uncooperative third-party libraries like krb5, so if we don't
enable interrupts we're going to have problems honoring the
authentication timeout.

regards, tom lane
Re: [HACKERS] Problem with pg_upgrade (8.4 - 9.0) due to ALTER DATABASE SET ROLE
On Mon, Feb 28, 2011 at 3:42 AM, Magnus Hagander <mag...@hagander.net> wrote:
> On Mon, Feb 28, 2011 at 06:21, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> Robert Treat <r...@xzilla.net> writes:
>>> Did anything ever come of this discussion?
>>
>> I think it's a TODO --- nothing done about it as yet, AFAIR.
>>
>>> On one of the databases I was upgrading, I ran into a similar problem
>>> with roles that are set as roles. The problem seems to stem from
>>> pg_dumpall dumping roles in alphabetical order:
>>>
>>> CREATE ROLE asha;
>>> ALTER ROLE asha SET role TO 'omniti';
>>> ... sometime later ...
>>> CREATE ROLE omniti;
>>
>> That seems like a pretty bizarre thing to do. Why would you want such
>> a setting?
>
> I'm sure there are several. I've seen (and done) this more than once to
> ensure that the owner of newly created objects is the shared role and
> not the individual, for example.

Yeah, there are actually several of the roles that get set to the omniti
role, like the robert role, which doesn't have any issue because it
comes alphabetically after omniti. This also helps folks get around
several permission-related issues (simplified management, uniform
permissions across users, simplified dependencies, etc.), but object
ownership is a key part of it.

Robert Treat
play: xzilla.net
work: omniti.com
hiring: l42.org/Lg
Re: [HACKERS] Native XML
On Mon, Feb 28, 2011 at 10:30 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Well, in principle we could allow them to work on both, just the same
> way that (for instance) + is a standardized operator but works on more
> than one datatype. But I agree that the prospect of two parallel types
> with essentially duplicate functionality isn't pleasing at all.

The real issue here is whether we want to store XML as text (as we do
now) or in some predigested form which would make outputting the whole
thing slower but speed up things like xpath lookups. We had the same
issue with JSON, and due to the uncertainty about which way to go with
it we ended up integrating nothing into core at all. It's really not
clear that there is one way of doing this that is right for all use
cases. If you are storing XML in an xml column just to get it validated,
and doing no processing in the DB, then you'd probably prefer our
current representation. If you want to build functional indexes on xpath
expressions, and then run queries that extract data using other xpath
expressions, you would probably prefer the other representation.

I tend to think that it would be useful to have both text and
predigested types for both XML and JSON, but I am not too eager to begin
integrating more stuff into core or contrib until it spends some time on
pgfoundry or github or wherever people publish their PostgreSQL
extensions these days and we have a few users prepared to testify to its
awesomeness.

In any case, the definitional problems with xpath_table(), and/or the
memory management problems with libxml2, are not the basis on which we
should be making this decision.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Native XML
On 02/28/2011 10:51 AM, Tom Lane wrote:
> Andrew Dunstan <and...@dunslane.net> writes:
>> xpath_table is severely broken by design IMNSHO. We need a new design,
>> but I'm reluctant to work on that until someone does LATERAL, because
>> a replacement would be much nicer to design with it than without it.
>
> Well, maybe I'm missing something, but I don't really understand why
> xpath_table's design is so unreasonable. Also, what would a better
> solution look like exactly? (Feel free to assume LATERAL is available.)

What's unreasonable about it is that the supplied paths are independent
of each other, and evaluated in the context of the entire XML document.
Let's take the given example in the docs, changed slightly to assume
each piece of XML can have more than one article listing in it (i.e.
'article' is not the root node of the document):

SELECT *
FROM xpath_table('article_id', 'article_xml', 'articles',
                 '//article/author|//article/pages|//article/title',
                 'date_entered > ''2003-01-01''')
AS t(article_id integer, author text, page_count integer, title text);

There is nothing that says that the author has to come from the same
article as the title, nor is there any way of saying that they must. If
an article node is missing author or pages or title, or has more than
one where its siblings do not, they will line up wrongly.

An alternative would be to supply a single xpath expression that would
specify the context nodes to be iterated over (in this case that would
be '//article') and a set of xpath expressions to be evaluated in the
context of those nodes (in this case 'author|pages|title', or better
yet, supply these as a text array). We'd produce exactly one row for
each node found by the context expression, and take the first value
found by each of the column expressions in that context (or we could
error out if we found more than one, or supply an array if the result
field is an array).

So with LATERAL taking care of the rest, the function signature could be
something like:

xpath_table_new(doc xml, context_xpath text, column_xpath text[])
    returns setof record

Given this, you could not get a row with title and author from different
article nodes in the source document, like you can now.

cheers

andrew
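As a sanity check of the proposed semantics - iterate over the context nodes, then evaluate each column path relative to the current node - here is a small Python sketch. It is my own stand-in using ElementTree's limited XPath support, not the proposed C implementation, and the function name mirrors the hypothetical signature above:

```python
import xml.etree.ElementTree as ET

def xpath_table_new(doc: str, context_xpath: str, column_xpaths: list):
    """One row per context node; each column evaluated relative to it."""
    root = ET.fromstring(doc)
    rows = []
    for node in root.findall(context_xpath):
        # take the first match of each column path within this context node
        rows.append(tuple(node.findtext(path) for path in column_xpaths))
    return rows

doc = """<articles>
  <article><author>ann</author><title>first</title></article>
  <article><title>untitled-author</title></article>
</articles>"""

rows = xpath_table_new(doc, ".//article", ["author", "title"])
# The second article is missing an author, but its title cannot "slide
# up" into the first row, as it can with today's xpath_table.
assert rows == [("ann", "first"), (None, "untitled-author")]
```

The missing-author case is the interesting one: because columns are scoped per context node, a missing value yields NULL in that row instead of misaligning sibling values.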
[HACKERS] EXPLAIN doesn't show sufficient info for wCTE cases
EXPLAIN currently shows ModifyTable nodes as just "Insert", "Update", or
"Delete", without any indication of the target table. This was more or
less good enough when there could only be one such node per query, but
it's looking pretty inadequate to me as I play around with
data-modifying statements in WITH.

The obvious thing to do is show the target table much as we do for table
scan nodes, e.g. "Update on my_table". There is a deficiency in that,
which is that for inherited UPDATE/DELETE cases a single ModifyTable
node could have multiple target tables. But after reflecting on it a
bit, I think it would be good enough to show the parent table name. The
individual child plans will necessarily include scans of the individual
child tables, so you can figure out which is which from that if you need
to know.

Alternatively we could list all the target tables in a new node
attribute, e.g.

Update (costs...)
  Target Tables: foo_parent, foo_child1, ...

But in the majority of cases this seems like a waste of precious
vertical space.

Thoughts?

regards, tom lane
Re: [HACKERS] EXPLAIN doesn't show sufficient info for wCTE cases
On Mon, Feb 28, 2011 at 11:39 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> EXPLAIN currently shows ModifyTable nodes as just Insert, Update, or
> Delete, without any indication of the target table. This was more or
> less good enough when there could only be one such node per query, but
> it's looking pretty inadequate to me as I play around with
> data-modifying statements in WITH.
>
> The obvious thing to do is show the target table much as we do for
> table scan nodes, eg Update on my_table. There is a deficiency in that,
> which is that for inherited UPDATE/DELETE cases a single ModifyTable
> node could have multiple target tables. But after reflecting on it a
> bit, I think it would be good enough to show the parent table name.
>
> [...]

I think it's good to include the table name, for sure. I *think* I agree
that it isn't necessary to include the child names.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] pl/python custom exceptions for SPI
On lör, 2011-02-12 at 11:58 +0100, Jan Urbański wrote:
> On 11/02/11 10:53, Jan Urbański wrote:
>> On 10/02/11 22:26, Steve Singer wrote:
>
> Here's an updated patch with documentation. It's an incremental patch
> on top of the latest explicit-subxacts version.

Committed.
Re: [HACKERS] knngist - 0.8
On Fri, Feb 18, 2011 at 1:07 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> There might be more issues, I haven't read the patch in detail. But
> anyway, I'm going to set it to Waiting on Author. I think it needs at
> least a day or so's work, and I can't put in that kind of time on it
> now.

Since no one has stepped up to fix these issues, I have marked this
patch Returned with Feedback.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
[HACKERS] PL/pgSQL return value in after triggers
PL/pgSQL trigger functions currently require a value to be returned, even though that value is not used for anything in case of a trigger fired AFTER. I was wondering if we could relax that. It would make things a bit more robust and produce clearer PL/pgSQL code. The specific case I'm concerned about is that a trigger function could accidentally be run in a BEFORE trigger even though it was not meant for that. It is common practice that trigger functions for AFTER triggers return NULL, which would have unpleasant effects if used in a BEFORE trigger. I think it is very uncommon to have the same function usable for BEFORE and AFTER triggers, so it would be valuable to have coding support specifically for AFTER triggers. We could just allow RETURN without argument, or perhaps no RETURN at all. Comments? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
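The hazard described above can be sketched outside the server. In this toy Python model of row-trigger dispatch (hypothetical names; the real PL/pgSQL rule is that a BEFORE ROW trigger returning NULL suppresses the operation while an AFTER trigger's return value is ignored), the same NULL-returning function is harmless as an AFTER trigger but silently cancels every row as a BEFORE trigger:

```python
def fire_row_trigger(when, func, row):
    """Toy model of row-trigger dispatch.

    BEFORE: the function's return value matters; None (NULL)
    suppresses the operation, a row value replaces the input row.
    AFTER: the return value is discarded entirely, which is why a
    mandatory RETURN there is pure boilerplate.
    """
    result = func(row)
    if when == "BEFORE":
        return result          # None means "skip this row"
    return row                 # AFTER: result ignored

def audit_trigger(row):
    # typical AFTER-trigger style: do side work, then RETURN NULL
    return None

# Harmless as an AFTER trigger...
assert fire_row_trigger("AFTER", audit_trigger, {"id": 1}) == {"id": 1}
# ...but the same function as a BEFORE trigger suppresses the row.
assert fire_row_trigger("BEFORE", audit_trigger, {"id": 1}) is None
```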
Re: [HACKERS] pl/python custom exceptions for SPI
Peter Eisentraut pete...@gmx.net writes: On lör, 2011-02-12 at 11:58 +0100, Jan Urbański wrote: Here's an updated patch with documentation. It's an incremental patch on top of the latest explicit-subxacts version. Committed. I'm seeing a core dump as well as multiple inconsistencies in error message spelling in the plpython regression tests on a Fedora 13 box (python 2.6.4). Several buildfarm critters don't look too happy either. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PL/pgSQL return value in after triggers
On Mon, Feb 28, 2011 at 12:07 PM, Peter Eisentraut pete...@gmx.net wrote: PL/pgSQL trigger functions currently require a value to be returned, even though that value is not used for anything in case of a trigger fired AFTER. I was wondering if we could relax that. It would make things a bit more robust and produce clearer PL/pgSQL code. The specific case I'm concerned about is that a trigger function could accidentally be run in a BEFORE trigger even though it was not meant for that. It is common practice that trigger functions for AFTER triggers return NULL, which would have unpleasant effects if used in a BEFORE trigger. I think it is very uncommon to have the same function usable for BEFORE and AFTER triggers, so it would be valuable to have coding support specifically for AFTER triggers. We could just allow RETURN without argument, or perhaps no RETURN at all. Comments? It has bugged me for years that after triggers need to contain a useless RETURN statement, but I'm not sure now is the time to go fix it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PL/pgSQL return value in after triggers
Peter Eisentraut pete...@gmx.net writes: PL/pgSQL trigger functions currently require a value to be returned, even though that value is not used for anything in case of a trigger fired AFTER. I was wondering if we could relax that. I got bit by that just a couple days ago --- I supposed that a trigger that wasn't returning anything useful shouldn't need an explicit RETURN. So +1 for doing something about it. However, unless it's a very small and simple patch, I concur with Robert that it might be a bit late to consider this for 9.1. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Snapshot synchronization, again...
On Sun, Feb 27, 2011 at 8:33 PM, Joachim Wieland j...@mcknight.de wrote: On Sun, Feb 27, 2011 at 3:04 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Why exactly, Heikki do you think the hash is more troublesome? It just feels wrong to rely on cryptography just to save some shared memory. Remember that it's not only about saving shared memory, it's also about making sure that the snapshot reflects a state of the database that has actually existed at some point in the past. Furthermore, we can easily invalidate a snapshot that we have published earlier by deleting its checksum in shared memory as soon as the original transaction commits/aborts. And for these two a checksum seems to be a good fit. Saving memory then comes as a benefit and makes all those happy who don't want to argue about how many slots to reserve in shared memory or don't want to have another GUC for what will probably be a low-usage feature. But you can do all of this with files too, can't you? Just remove or truncate the file when the snapshot is no longer valid. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pl/python custom exceptions for SPI
On mån, 2011-02-28 at 12:08 -0500, Tom Lane wrote: Peter Eisentraut pete...@gmx.net writes: On lör, 2011-02-12 at 11:58 +0100, Jan Urbański wrote: Here's an updated patch with documentation. It's an incremental patch on top of the latest explicit-subxacts version. Committed. I'm seeing a core dump as well as multiple inconsistencies in error message spelling in the plpython regression tests on a Fedora 13 box (python 2.6.4). Several buildfarm critters don't look too happy either. Fixed. (Well, some of it. We'll see ...) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] mysql2pgsql.perl update
Hi all. I spent some time updating mysql2pgsql.perl. Changes were driven by an attempt to migrate a redmine database. Original code was failing for a number of reasons (regex recursion explosion, ...). I was wondering if there's a more formal / appropriate place to put this. I'd also appreciate a separate set of eyes on my changes. I grabbed an initial copy from http://pgfoundry.org/projects/mysql2pgsql/ Git repo of the changes at https://github.com/SmartReceipt/mysql2pgsql/commits/master/ Commit history follows.

commit 3cbe2cfa8782d250e5c4ee814c4585c96105fb45 (HEAD, sr/master, origin/master, master)
Author: Andrew Hammond andrew.george.hamm...@gmail.com
Date: Sat Feb 26 12:36:36 2011 -0800

    simplify handling of mysql autoincrement to use serial8 datatype

commit 5c559b7073e6f6e72ce11f0f45be4d13cc30fd9a
Author: Andrew Hammond andrew.george.hamm...@gmail.com
Date: Sat Feb 26 12:26:46 2011 -0800

    multi-value inserts are supported in pgsql as of 8.2 and way faster

commit f5798a7911e5114378fd4764a62288e1826f6b56
Author: Andrew Hammond andrew.george.hamm...@gmail.com
Date: Sat Feb 26 12:22:32 2011 -0800

    replace \\ with

commit d730da318da301157783bf15c6add3e58e32b57e
Author: Andrew Hammond andrew.george.hamm...@gmail.com
Date: Sat Feb 26 12:11:49 2011 -0800

    best practices for handling quoted strings and use lookahead rather than unnecessary match and replace

commit 20610fcb50d082e2c46bf27190e4d30dba966c27
Author: Andrew Hammond andrew.george.hamm...@gmail.com
Date: Sat Feb 26 12:08:40 2011 -0800

    whitespace indenting normalization

commit 4281e1c314501f4209245ac55d31f1e43b4ddc21
Author: Andrew Hammond andrew.george.hamm...@gmail.com
Date: Sat Feb 26 11:58:52 2011 -0800

    pgfoundry version 2007-12-04 17:29

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Snapshot synchronization, again...
Robert Haas robertmh...@gmail.com writes: On Sun, Feb 27, 2011 at 8:33 PM, Joachim Wieland j...@mcknight.de wrote: Remember that it's not only about saving shared memory, it's also about making sure that the snapshot reflects a state of the database that has actually existed at some point in the past. But you can do all of this with files too, can't you? Just remove or truncate the file when the snapshot is no longer valid. Yeah. I think adopting a solution similar to 2PC state files is a very reasonable way to go here. This isn't likely to be a high-usage or performance-critical feature, so it's not essential to keep the information in shared memory for performance reasons. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
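A 2PC-state-file-style scheme along the lines suggested here can be sketched in a few lines of Python. The directory name, token format, and function names below are invented for illustration, not the eventual PostgreSQL implementation: the exporting transaction writes the snapshot to a file, an importer trusts the file's existence rather than a shared-memory checksum, and invalidation is just unlinking the file when the exporter commits or aborts.

```python
import os
import tempfile

SNAPDIR = tempfile.mkdtemp()   # stand-in for something like a pg_snapshots/ directory

def export_snapshot(token, snapshot_bytes):
    # like writing a 2PC state file: one file per exported snapshot
    with open(os.path.join(SNAPDIR, token), "wb") as f:
        f.write(snapshot_bytes)

def import_snapshot(token):
    # validity check is simply "does the file still exist?"
    path = os.path.join(SNAPDIR, token)
    if not os.path.exists(path):
        raise LookupError("snapshot %s is no longer valid" % token)
    with open(path, "rb") as f:
        return f.read()

def invalidate_snapshot(token):
    # called when the exporting transaction commits or aborts
    os.unlink(os.path.join(SNAPDIR, token))

export_snapshot("000001", b"xmin=100 xmax=200")
assert import_snapshot("000001") == b"xmin=100 xmax=200"
invalidate_snapshot("000001")
```

No shared-memory slots to size, no GUC, and no cryptography: the file system provides both the storage and the invalidation semantics.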
Re: [HACKERS] Why our counters need to be time-based WAS: WIP: cross column correlation ...
Well, what we have now is a bunch of counters in pg_stat_all_tables and pg_statio_all_tables. Right. What I'm saying is those aren't good enough, and have never been good enough. Counters without a time basis are pretty much useless for performance monitoring/management (Baron Schwartz has a blog post talking about this, but I can't find it right now).

Take, for example, a problem I was recently grappling with for Nagios. I'd like to do a check as to whether or not tables are getting autoanalyzed often enough. After all, autovac can fall behind, and we'd want to be alerted of that. The problem is, in order to measure whether or not autoanalyze is behind, you need to count how many inserts, updates, and deletes have happened since the last autoanalyze. pg_stat_user_tables just gives us the counters since the last reset ... and the reset time isn't even stored in PostgreSQL. This means that, without adding external tools like pg_statsinfo, we can't autotune autoanalyze at all.

There are quite a few other examples where the counters could contribute to autotuning and DBA performance monitoring if only they were time-based. As it is, they're useful for finding unused indexes and that's about it.

One possibility, of course, would be to take pg_statsinfo and make it part of core. There are a couple of disadvantages to that: (1) the storage and extra objects required, which would then require us to add extra management routines as well; (2) pg_statsinfo only stores top-level view history, meaning that it wouldn't be very adaptable to improvements we make in system views in the future. On the other hand, anything which increases the size of pg_statistic would be a nightmare. One possible compromise solution might be to implement code for the stats collector to automatically reset the stats at a given clock interval.
If we combined this with keeping the reset time, and keeping a snapshot of the stats from the last clock tick (and their reset time) that would be good enough for most monitoring. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
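The monitoring check described above becomes trivial arithmetic once two timestamped snapshots of the counters exist. A sketch in Python (the column names mirror pg_stat_user_tables, but the snapshotting itself is exactly the piece PostgreSQL does not provide today):

```python
def activity_rates(prev, cur, elapsed_seconds):
    """Turn two snapshots of monotonically growing counters
    (e.g. n_tup_ins / n_tup_upd / n_tup_del from pg_stat_user_tables)
    into per-second rates over the interval between them. Without a
    stored snapshot time, the bare counters cannot yield this."""
    return {k: (cur[k] - prev[k]) / float(elapsed_seconds) for k in cur}

prev = {"n_tup_ins": 1000, "n_tup_upd": 400, "n_tup_del": 50}
cur  = {"n_tup_ins": 7000, "n_tup_upd": 1000, "n_tup_del": 50}
rates = activity_rates(prev, cur, 600)   # snapshots 10 minutes apart
assert rates["n_tup_ins"] == 10.0        # 6000 inserts / 600 s
assert rates["n_tup_del"] == 0.0         # no delete activity
```

A Nagios-style check would then compare these rates (or the raw deltas since the last autoanalyze) against a threshold, which is impossible with reset-relative counters alone.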
Re: [HACKERS] Why our counters need to be time-based WAS: WIP: cross column correlation ...
On Mon, Feb 28, 2011 at 10:04:54AM -0800, Josh Berkus wrote: Take, for example, a problem I was recently grappling with for Nagios. I'd like to do a check as to whether or not tables are getting autoanalyzed often enough. After all, autovac can fall behind, and we'd want to be alerted of that. The problem is, in order to measure whether or not autoanalyze is behind, you need to count how many inserts, updates, and deletes have happened since the last autoanalyze. pg_stat_user_tables just gives us the counters since the last reset ... and the reset time isn't even stored in PostgreSQL. The solution I use for that is to use munin to monitor everything and let it generate alerts based on the levels. It's not great, but better than nothing. The problem, as you say, is that you want to know the rates rather than the absolute values. The problem with rates is that you can get wildly different results depending on the time interval you're looking at. For the concrete example above, autoanalyse has to be able to determine if there is work to do, so the information must be somewhere. I'm guessing it's not easily available? If you had a function is_autovacuumcandidate you'd be done, of course. But there are of course lots of stats people want; it's just not clear how to get them. What you really need is to store the stats every few minutes, but that's what munin does. I doubt it's worth building RRD-like capabilities into postgres. Have a nice day, -- Martijn van Oosterhout klep...@svana.org http://svana.org/kleptog/ Patriotism is when love of your own people comes first; nationalism, when hate for people other than your own comes first. - Charles de Gaulle
Re: [HACKERS] Review: Fix snapshot taking inconsistencies
Marko Tiikkaja marko.tiikk...@cs.helsinki.fi writes: [ latest version of snapshot-taking patch ] I started to look at this, and find myself fairly confused as to what the purpose is anymore. Reviewing the thread, there has been a lot of discussion of refactoring the API of pg_parse_and_rewrite and related functions exported by postgres.c; but the current patch seems to have abandoned that goal (except for removing pg_parse_and_rewrite itself, which doesn't seem to me to have a lot of point except as part of a more general refactoring). With respect to the issue of changing snapshot timing, most of the discussion around that seemed to start from assumptions about the behavior of wCTEs that we've now abandoned. And there was some discussion about rule behavior too, but it's not clear to me whether this patch intends to change that or not. The lack of any documentation change doesn't help here. So: exactly what is the intended behavioral change as of now, and what is the argument supporting that change? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why our counters need to be time-based WAS: WIP: cross column correlation ...
On Mon, Feb 28, 2011 at 1:04 PM, Josh Berkus j...@agliodbs.com wrote: On the other hand, anything which increases the size of pg_statistic would be a nightmare. Hmm? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: Fix snapshot taking inconsistencies
On Mon, Feb 28, 2011 at 1:22 PM, Tom Lane t...@sss.pgh.pa.us wrote: Marko Tiikkaja marko.tiikk...@cs.helsinki.fi writes: [ latest version of snapshot-taking patch ] I started to look at this, and find myself fairly confused as to what the purpose is anymore. Reviewing the thread, there has been a lot of discussion of refactoring the API of pg_parse_and_rewrite and related functions exported by postgres.c; but the current patch seems to have abandoned that goal (except for removing pg_parse_and_rewrite itself, which doesn't seem to me to have a lot of point except as part of a more general refactoring). With respect to the issue of changing snapshot timing, most of the discussion around that seemed to start from assumptions about the behavior of wCTEs that we've now abandoned. And there was some discussion about rule behavior too, but it's not clear to me whether this patch intends to change that or not. The lack of any documentation change doesn't help here. So: exactly what is the intended behavioral change as of now, and what is the argument supporting that change? IIUC, this is the result of countless rounds of communal bikeshedding around: http://archives.postgresql.org/pgsql-hackers/2010-07/msg01256.php -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why our counters need to be time-based WAS: WIP: cross column correlation ...
On 2/28/11 10:24 AM, Robert Haas wrote: On Mon, Feb 28, 2011 at 1:04 PM, Josh Berkus j...@agliodbs.com wrote: On the other hand, anything which increases the size of pg_statistic would be a nightmare. Hmm? Like replacing each statistic with a series of time-based buckets, which would then increase the size of the table by 5X to 10X. That was the first solution I thought of, and rejected. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pl/python custom exceptions for SPI
Peter Eisentraut pete...@gmx.net writes: On mån, 2011-02-28 at 12:08 -0500, Tom Lane wrote: I'm seeing a core dump as well as multiple inconsistencies in error message spelling in the plpython regression tests on a Fedora 13 box (python 2.6.4). Several buildfarm critters don't look too happy either. Fixed. (Well, some of it. We'll see ...) Core dump is still there. It appears to be a python assertion failure. I installed python's debuginfo and got this backtrace:

Program terminated with signal 6, Aborted.
#0  0x0032a36328f5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64        return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
Missing separate debuginfos, use: debuginfo-install keyutils-libs-1.2-6.fc12.x86_64 krb5-libs-1.7.1-17.fc13.x86_64 libcom_err-1.41.10-7.fc13.x86_64 libselinux-2.0.94-2.fc13.x86_64 openssl-1.0.0c-1.fc13.x86_64 zlib-1.2.3-23.fc12.x86_64
(gdb) bt
#0  0x0032a36328f5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x0032a36340d5 in abort () at abort.c:92
#2  0x0032a362b8b5 in __assert_fail (assertion=0x32a5b46391 "gc->gc.gc_refs != 0", file=<value optimized out>, line=277, function=<value optimized out>) at assert.c:81
#3  0x0032a5b0853e in visit_decref (op=<module at remote 0x7f11c3666d38>, data=<value optimized out>) at Modules/gcmodule.c:277
#4  0x0032a5a7cbd9 in dict_traverse (op=
    {'info': <built-in function info>, 'notice': <built-in function notice>, 'Fatal': <type at remote 0x1bba7e0>, 'log': <built-in function log>, 'prepare': <built-in function prepare>, 'spiexceptions': <module at remote 0x7f11c3666d38>, 'SPIError': <type at remote 0x1bbacc0>, 'Error': <type at remote 0x1bba300>, 'execute': <built-in function execute>, '__package__': None, 'quote_ident': <built-in function quote_ident>, 'warning': <built-in function warning>, 'subtransaction': <built-in function subtransaction>, 'quote_literal': <built-in function quote_literal>, 'quote_nullable': <built-in function quote_nullable>, 'error': <built-in function error>, 'debug': <built-in function debug>, '__name__': 'plpy', 'fatal': <built-in function fatal>, '__doc__': None}, visit=0x32a5b084c0 <visit_decref>, arg=0x0) at Objects/dictobject.c:2003
#5  0x0032a5b08c9f in subtract_refs (generation=1) at Modules/gcmodule.c:296
#6  collect (generation=1) at Modules/gcmodule.c:817
#7  0x0032a5b096fa in collect_generations (basicsize=<value optimized out>) at Modules/gcmodule.c:924
#8  _PyObject_GC_Malloc (basicsize=<value optimized out>) at Modules/gcmodule.c:1363
#9  0x0032a5b0972e in _PyObject_GC_NewVar (tp=0x32a5d899a0, nitems=1) at Modules/gcmodule.c:1383
#10 0x0032a5a9703f in PyTuple_New (size=1) at Objects/tupleobject.c:69
#11 0x0032a5af3697 in r_object (p=0x7fffe1f5f330) at Python/marshal.c:788
#12 0x0032a5af389c in r_object (p=0x7fffe1f5f330) at Python/marshal.c:927
#13 0x0032a5af36dd in r_object (p=0x7fffe1f5f330) at Python/marshal.c:794
#14 0x0032a5af389c in r_object (p=0x7fffe1f5f330) at Python/marshal.c:927
#15 0x0032a5af36dd in r_object (p=0x7fffe1f5f330) at Python/marshal.c:794
#16 0x0032a5af389c in r_object (p=0x7fffe1f5f330) at Python/marshal.c:927
#17 0x0032a5af3fd8 in PyMarshal_ReadObjectFromString (str=<value optimized out>, len=<value optimized out>) at Python/marshal.c:1107
#18 0x0032a5af5952 in PyMarshal_ReadLastObjectFromFile (fp=0x1c66e50) at Python/marshal.c:1066
#19 0x0032a5aedb39 in read_compiled_module (cpathname=0x7fffe1f63540 "/usr/lib64/python2.6/string.pyc", fp=<value optimized out>) at Python/import.c:767
#20 0x0032a5aef69d in load_source_module (name=0x7fffe1f656a0 "string", pathname=0x7fffe1f645c0 "/usr/lib64/python2.6/string.py", fp=0x1c29b30) at Python/import.c:991
#21 0x0032a5af0cb5 in import_submodule (mod=None, subname=0x7fffe1f656a0 "string", fullname=0x7fffe1f656a0 "string") at Python/import.c:2589
#22 0x0032a5af0f34 in load_next (mod=None, altmod=None, p_name=<value optimized out>, buf=0x7fffe1f656a0 "string", p_buflen=0x7fffe1f65698) at Python/import.c:2409
#23 0x0032a5af1582 in import_module_level (name=0x0, globals=<value optimized out>, locals=<value optimized out>, fromlist=None, level=<value optimized out>) at Python/import.c:2131
#24 0x0032a5af22c4 in PyImport_ImportModuleLevel (name=0x7f11c40c2084 "string", globals=
    {'plpy': <module at remote 0x7f11c3666ad0>, 'GD': {}, 'args': [], '__builtins__': <module at remote 0x7f11c4158830>, '__name__': '__main__', 'SD': {}, '__doc__': None, '__plpython_procedure_import_succeed_41194': <function at remote 0x7f11c3652aa0>, '__package__': None}, locals=None, fromlist=None, level=<value optimized out>) at Python/import.c:2182
#25 0x0032a5ad762f in builtin___import__ (self=<value optimized out>, args=<value optimized out>, kwds=<value optimized out>) at Python/bltinmodule.c:48
#26 0x0032a5a43db3 in PyObject_Call (func=<built-in function __import__>, arg=<value optimized out>, kw=<value optimized out>) at
Re: [HACKERS] Sync Rep v17
On Mon, 2011-02-21 at 18:06 +0900, Fujii Masao wrote: Thanks for the patch!

Thanks for the review. Code available at git://github.com/simon2ndQuadrant/postgres.git

PREPARE TRANSACTION and ROLLBACK PREPARED should wait for replication as well as COMMIT PREPARED?

PREPARE - Yes
ROLLBACK - No
Further discussion welcome

What if fast shutdown is requested while RecordTransactionCommit is waiting in SyncRepWaitForLSN? ISTM fast shutdown cannot complete until replication has been successfully done (i.e., until at least one synchronous standby has connected to the master especially if allow_standalone_primary is disabled). Is this OK?

A behaviour - important, though needs further discussion.

We should emit WARNING when the synchronous standby with wal_receiver_status_interval = 0 connects to the master. Because, in that case, a transaction unexpectedly would wait for replication infinitely.

This can't happen because a WALSender only activates as a sync standby once it has received a reply from the chosen standby.

+	/* Need a modifiable copy of string */
+	rawstring = pstrdup(SyncRepStandbyNames);
+
+	/* Parse string into list of identifiers */
+	if (!SplitIdentifierString(rawstring, ',', &elemlist))

pfree(rawstring) and list_free(elemlist) should be called also if SplitIdentifierString returns TRUE. Otherwise, memory-leak would happen.

Fixed, thanks

+		ereport(FATAL,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("invalid list syntax for parameter \"synchronous_standby_names\"")));
+		return false;

return false is not required here though that might be harmless.

Compiler likes it.

-- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Sync Rep v17
On Mon, 2011-02-28 at 10:31 +0100, Yeb Havinga wrote: 1) no automatic switch to other synchronous standby - start master server, add synchronous standby 1 - change allow_standalone_primary to off - add second synchronous standby - wait until pg_stat_replication shows both standby's are in STREAMING state - stop standby 1 what happens is that the master stalls, where I expected that it would've switched to standby 2 acknowledge commits. The following thing was pilot error, but since I was test-piloting a new plane, I still think it might be usual feedback. In my opinion, any number and order of pg_ctl stops and starts on both the master and standby servers, as long as they are not with -m immediate, should never cause the state I reached. The behaviour of allow_synchronous_standby = off is pretty much untested and does seem to have various gotchas in there. 2) reaching some sort of shutdown deadlock state - start master server, add synchronous standby - change allow_standalone_primary to off then I did all sorts of test things, everything still ok. Then I wanted to shutdown everything, and maybe because of some symmetry (stack like) I did the following because I didn't think it through - pg_ctl stop on standby (didn't actualy wait until done, but immediately in other terminal) - pg_ctl stop on master O wait.. master needs to sync transactions - start standby again. but now: FATAL: the database system is shutting down There is no clean way to get out of this situation. allow_standalone_primary in the face of shutdowns might be tricky. Maybe shutdown must be prohibited to enter the shutting down phase in allow_standalone_primary = off together with no sync standby, that would allow for the sync standby to attach again. The behaviour of allow_synchronous_standby = off is not something I'm worried about personally and I've argued all along it sounds pretty silly to me. If someone wants to spend some time defining how it *should* work that might help matters. 
I'm inclined to remove it before commit if it can't work cleanly, to be re-added at a later date if it makes sense.

3) PANIC on standby server At some point a standby suddenly disconnected after I started a new pgbench run on an existing master/standby pair, with the following error in the logfile.

LOCATION: libpqrcv_connect, libpqwalreceiver.c:171
PANIC: XX000: heap_update_redo: failed to add tuple
CONTEXT: xlog redo hot_update: rel 1663/16411/16424; tid 305453/15; new 305453/102
LOCATION: heap_xlog_update, heapam.c:4724
LOG: 0: startup process (PID 32597) was terminated by signal 6: Aborted

This might be due to pilot error as well; I did a several tests over the weekend and after this error I was more alert on remembering immediate shutdowns/starting with a clean backup after that, and didn't see similar errors since.

Good. There are no changes in the patch for that section of code.

4) The performance of the syncrep seems to be quite an improvement over the previous syncrep patches, I've seen tps-ses of O(650) where the others were more like O(20). The O(650) tps is limited by the speed of the standby server I used - at several times the master would halt only because of heavy disk activity at the standby. A warning in the docs might be right: be sure to use good IO hardware for your synchronous replicas! With that bottleneck gone, I suspect the current syncrep version can go beyond 1000tps over 1 Gbit.

Good, thanks. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Sync Rep v17
On Thu, 2011-02-24 at 22:08 +0900, Fujii Masao wrote: On Tue, Feb 22, 2011 at 2:38 PM, Fujii Masao masao.fu...@gmail.com wrote: I've read about two-tenths of the patch, so I'll submit further comments about the rest later. Sorry for the slow reviewing... Here are further comments:

Thanks for your comments. Code available at git://github.com/simon2ndQuadrant/postgres.git

+	{"synchronous_standby_names", PGC_SIGHUP, WAL_REPLICATION,
+		gettext_noop("List of potential standby names to synchronise with."),
+		NULL,
+		GUC_LIST_INPUT | GUC_IS_NAME

Why did you add GUC_IS_NAME here? I don't think that it's reasonable to limit the length of this parameter to 63. Because dozens of standby names might be specified in the parameter.

OK, misunderstanding by me causing bug. Fixed

SyncRepQueue->qlock should be initialized by calling SpinLockInit?

Fixed

+ * Portions Copyright (c) 2010-2010, PostgreSQL Global Development Group

Typo: s/2010/2011

Fixed

sync_replication_timeout_client would mess up the wait-forever option. So, when allow_standalone_primary is disabled, ISTM that sync_replication_timeout_client should have no effect.

Agreed, done.

Please check max_wal_senders before calling SyncRepWaitForLSN for non-replication case.

SyncRepWaitForLSN() handles this

SyncRepRemoveFromQueue seems not to be as short-term as we can use the spinlock. Instead, LW lock should be used there.

+	old_status = get_ps_display(&len);
+	new_status = (char *) palloc(len + 21 + 1);
+	memcpy(new_status, old_status, len);
+	strcpy(new_status + len, " waiting for sync rep");
+	set_ps_display(new_status, false);
+	new_status[len] = '\0';		/* truncate off " waiting" */

Updating the PS display should be skipped if update_process_title is false.

Fixed.

+	/*
+	 * XXX extra code needed here to maintain sorted invariant.

Yeah, such a code is required. I think that we can shorten the time it takes to find an insert position by searching the list backwards. Because the given LSN is expected to be relatively new in the queue.

Sure, just skipped that because of time pressure. Will add.

+	 * Our approach should be same as racing car - slow in, fast out.
+	 */

Really? Even when removing the entry from the queue, we need to search the queue as well as we do in the add-entry case. Why don't you make walsenders remove the entry from the queue, instead?

This models wakeup behaviour of LWlocks

+	long		timeout = SyncRepGetWaitTimeout();
<snip>
+	bool		timeout = false;
<snip>
+		else if (timeout > 0 &&
+				 TimestampDifferenceExceeds(GetCurrentTransactionStopTimestamp(),
+											now, timeout))
+		{
+			release = true;
+			timeout = true;
+		}

You seem to mix up two timeout variables.

Yes, good catch. Fixed.

+			if (proc->lwWaitLink == MyProc)
+			{
+				/*
+				 * Remove ourselves from middle of queue.
+				 * No need to touch head or tail.
+				 */
+				proc->lwWaitLink = MyProc->lwWaitLink;
+			}

When we find ourselves, we should break out of the loop soon, instead of continuing the loop to the end?

Incorporated in Yeb's patch

-- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
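The backward-scan insertion Fujii suggests above, keeping the wait queue sorted by searching from the tail because a newly generated LSN is almost always larger than everything already waiting, can be sketched in Python (the real patch works on a PGPROC linked list in C; this just illustrates the scan direction):

```python
def insert_lsn_backwards(queue, lsn):
    """Keep `queue` (a Python list of LSNs, ascending) sorted by
    scanning from the tail. In the common case the new LSN is the
    largest seen so far, so the loop terminates immediately."""
    i = len(queue)
    while i > 0 and queue[i - 1] > lsn:
        i -= 1
    queue.insert(i, lsn)

q = []
for lsn in [100, 250, 175, 300]:   # mostly-increasing arrivals
    insert_lsn_backwards(q, lsn)
assert q == [100, 175, 250, 300]
```

For the typical append-at-tail arrival pattern this does O(1) comparisons per insert, versus a full scan from the head.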
Re: [HACKERS] Sync Rep v17
On Tue, 2011-02-22 at 14:38 +0900, Fujii Masao wrote: On Mon, Feb 21, 2011 at 6:06 PM, Fujii Masao masao.fu...@gmail.com wrote: I've read about a tenth of the patch, so I'll submit another comments about the rest later. Here are another comments: Thanks for your comments Code available at git://github.com/simon2ndQuadrant/postgres.git SyncRepReleaseWaiters should be called when walsender exits. Otherwise, if the standby crashes while a transaction is waiting for replication, it waits infinitely. Will think on this. sync_rep_service and potential_sync_standby are not required to be in the WalSnd shmem because only walsender accesses them. For use in debug, if not later monitoring +static bool +SyncRepServiceAvailable(void) +{ + bool result = false; + + SpinLockAcquire(WalSndCtl-ctlmutex); + result = WalSndCtl-sync_rep_service_available; + SpinLockRelease(WalSndCtl-ctlmutex); + + return result; +} Fixed volatile pointer needs to be used to prevent code rearrangement. + slock_t ctlmutex; /* locks shared variables shown above */ cltmutex should be initialized by calling SpinLockInit. Fixed + /* + * Stop providing the sync rep service, even if there are + * waiting backends. + */ + { + SpinLockAcquire(WalSndCtl-ctlmutex); + WalSndCtl-sync_rep_service_available = false; + SpinLockRelease(WalSndCtl-ctlmutex); + } sync_rep_service_available should be set to false only when there is no synchronous walsender. The way I had it is correct because if (MyWalSnd-sync_rep_service) then if we're the sync walsender, so if we stop being it, then there isn't one. A potential walsender might then become the sync walsender. I think you'd like it if there was no gap at the point the potential wal sender takes over? Just not sure how to do that robustly at present. Will think some more. + /* + * When we first start replication the standby will be behind the primary. 
+ * For some applications, for example, synchronous replication, it is + * important to have a clear state for this initial catchup mode, so we + * can trigger actions when we change streaming state later. We may stay + * in this state for a long time, which is exactly why we want to be + * able to monitor whether or not we are still here. + */ + WalSndSetState(WALSNDSTATE_CATCHUP); + The above has already been committed. Please remove that from the patch. Removed I don't like calling SyncRepReleaseWaiters for each feedback because I guess that it's too frequent. How about receiving all the feedbacks available from the socket, and then calling SyncRepReleaseWaiters as well as walreceiver does? Possibly, but an optimisation for later when we have behaviour correct. + bool ownLatch; /* do we own the above latch? */ We can just remove the ownLatch flag. Agreed, removed -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Sync Rep v17
On Fri, 2011-02-25 at 16:41 +0100, Yeb Havinga wrote: --- a/src/backend/replication/syncrep.c +++ b/src/backend/replication/syncrep.c @@ -274,6 +274,8 @@ SyncRepRemoveFromQueue(void) } else { + bool found = false; + while (proc->lwWaitLink != NULL) { /* Are we the next proc in our traversal of the queue? */ @@ -284,17 +286,19 @@ SyncRepRemoveFromQueue(void) * No need to touch head or tail. */ proc->lwWaitLink = MyProc->lwWaitLink; + found = true; + break; } - if (proc->lwWaitLink == NULL) - elog(WARNING, "could not locate ourselves on wait queue"); proc = proc->lwWaitLink; } + if (!found) + elog(WARNING, "could not locate ourselves on wait queue"); - if (proc->lwWaitLink == NULL) /* At tail */ + /* If MyProc was removed from the tail, maintain list invariant head==tail */ + if (proc->lwWaitLink == NULL) { - Assert(proc == MyProc); - /* Remove ourselves from tail of queue */ + Assert(proc != MyProc); /* impossible since that is the head=MyProc branch above */ Assert(queue->tail == MyProc); queue->tail = proc; proc->lwWaitLink = NULL; Used your suggested fix Code available at git://github.com/simon2ndQuadrant/postgres.git I needed to add this to make the documentation compile --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -2010,6 +2010,9 @@ SET ENABLE_SEQSCAN TO OFF; You should also consider setting <varname>hot_standby_feedback</varname> as an alternative to using this parameter. </para> + </listitem> + </varlistentry> + </variablelist> <sect2 id="runtime-config-sync-rep"> Separate bug, will fix -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] WIP: cross column correlation ...
Excerpts from Robert Haas's message of sáb feb 26 02:24:26 -0300 2011: On Fri, Feb 25, 2011 at 2:26 PM, Alvaro Herrera alvhe...@commandprompt.com wrote: Excerpts from Rod Taylor's message of vie feb 25 14:03:58 -0300 2011: How practical would it be for analyze to keep a record of response times for given sections of a table as it randomly accesses them and generate some kind of a map for expected response times for the pieces of data it is analysing? I think what you want is random_page_cost that can be tailored per tablespace. We have that. Oh, right. But it's not the same as tracking *sections of a table*. I dunno. I imagine if you have a section of a table in different storage than other sections, you created a tablespace and moved the partition holding that section there. Otherwise, how do you prevent the tuples from moving to other sections? (We don't really have a concept of sections of a table.) -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: Fix snapshot taking inconsistencies
Robert Haas robertmh...@gmail.com writes: On Mon, Feb 28, 2011 at 1:22 PM, Tom Lane t...@sss.pgh.pa.us wrote: So: exactly what is the intended behavioral change as of now, and what is the argument supporting that change? IIUC, this is the result of countless rounds of communal bikeshedding around: Quite :-(. But I'm not sure where the merry-go-round stopped. http://archives.postgresql.org/pgsql-hackers/2010-07/msg01256.php Please notice that the very terms of discussion in that message depend on a view of wCTEs that has got nothing to do with what was applied. I'm afraid that the goals of this patch might be similarly obsolete. I definitely don't want to apply the patch in a hurry just because we're down to the end of the commitfest. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] mysql2pgsql.perl update
Andrew, I spent some time updating mysql2pgsql.perl. Changes were driven by an attempt to migrate a redmine database. Original code was failing for a number of reasons (regex recursion explosion, . I was wondering if there's a more formal / appropriate place to put this. I'd also appreciate a separate set of eyes on my changes. I grabbed an initial copy from http://pgfoundry.org/projects/mysql2pgsql/ Git repo of the changes at https://github.com/SmartReceipt/mysql2pgsql/commits/master/ I can give you ownership of the pgFoundry project. What's your pgfoundry ID? -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Re: [HACKERS] Why our counters need to be time-based WAS: WIP: cross column correlation ...
Josh Berkus j...@agliodbs.com writes: On 2/28/11 10:24 AM, Robert Haas wrote: On Mon, Feb 28, 2011 at 1:04 PM, Josh Berkus j...@agliodbs.com wrote: On the other hand, anything which increases the size of pg_statistic would be a nightmare. Hmm? Like replacing each statistic with a series of time-based buckets, which would then increase the size of the table by 5X to 10X. That was the first solution I thought of, and rejected. I think Josh is thinking of the stats collector's dump file, not pg_statistic. Ultimately we need to think of a reporting mechanism that's a bit smarter than rewrite the whole file for any update ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
Since no one has stepped up to fix these issues, I have marked this patch Returned with Feedback. This is just contrib/btree_GIST, yes? -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why our counters need to be time-based WAS: WIP: cross column correlation ...
On Mon, Feb 28, 2011 at 1:50 PM, Tom Lane t...@sss.pgh.pa.us wrote: Josh Berkus j...@agliodbs.com writes: On 2/28/11 10:24 AM, Robert Haas wrote: On Mon, Feb 28, 2011 at 1:04 PM, Josh Berkus j...@agliodbs.com wrote: On the other hand, anything which increases the size of pg_statistic would be a nightmare. Hmm? Like replacing each statistic with a series of time-based buckets, which would then increase the size of the table by 5X to 10X. That was the first solution I thought of, and rejected. I think Josh is thinking of the stats collector's dump file, not pg_statistic. Yeah. Ultimately we need to think of a reporting mechanism that's a bit smarter than rewrite the whole file for any update ... Well, we have these things called tables. Any chance of using those? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: Fix snapshot taking inconsistencies
On Mon, Feb 28, 2011 at 1:45 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Mon, Feb 28, 2011 at 1:22 PM, Tom Lane t...@sss.pgh.pa.us wrote: So: exactly what is the intended behavioral change as of now, and what is the argument supporting that change? IIUC, this is the result of countless rounds of communal bikeshedding around: Quite :-(. But I'm not sure where the merry-go-round stopped. http://archives.postgresql.org/pgsql-hackers/2010-07/msg01256.php Please notice that the very terms of discussion in that message depend on a view of wCTEs that has got nothing to do with what was applied. I'm afraid that the goals of this patch might be similarly obsolete. No, I don't think so. IIUC, the problem is that EXPLAIN ANALYZE runs the rewrite products with different snapshot handling than the regular execution path. So in theory you could turn on auto_explain and have the semantics of your queries change. That would be Bad. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: Fix snapshot taking inconsistencies
On 2011-02-28 8:22 PM, Tom Lane wrote: Marko Tiikkaja marko.tiikk...@cs.helsinki.fi writes: [ latest version of snapshot-taking patch ] I started to look at this, and find myself fairly confused as to what the purpose is anymore. Reviewing the thread, there has been a lot of discussion of refactoring the API of pg_parse_and_rewrite and related functions exported by postgres.c; but the current patch seems to have abandoned that goal (except for removing pg_parse_and_rewrite itself, which doesn't seem to me to have a lot of point except as part of a more general refactoring). With respect to the issue of changing snapshot timing, most of the discussion around that seemed to start from assumptions about the behavior of wCTEs that we've now abandoned. And there was some discussion about rule behavior too, but it's not clear to me whether this patch intends to change that or not. The lack of any documentation change doesn't help here. So: exactly what is the intended behavioral change as of now, and what is the argument supporting that change? The only intended change is what I was wondering in the original post: snapshot handling between normal execution and EXPLAIN ANALYZE when a query expands to multiple trees because of rewrite rules. Like I said earlier, this is just a bugfix now that wCTEs don't need it anymore. Regards, Marko Tiikkaja
Re: [HACKERS] Review: Fix snapshot taking inconsistencies
Robert Haas robertmh...@gmail.com writes: On Mon, Feb 28, 2011 at 1:45 PM, Tom Lane t...@sss.pgh.pa.us wrote: I'm afraid that the goals of this patch might be similarly obsolete. No, I don't think so. IIUC, the problem is that EXPLAIN ANALYZE runs the rewrite products with different snapshot handling than the regular execution path. Possibly, but it's not clear to me that this patch fixes that. As I said, it's no longer obvious what the patch means to do, and I'd like a clear statement of that. So in theory you could turn on auto_explain and have the semantics of your queries change. That would be Bad. That's just FUD. auto_explain doesn't run EXPLAIN ANALYZE. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: Fix snapshot taking inconsistencies
Marko Tiikkaja marko.tiikk...@cs.helsinki.fi writes: On 2011-02-28 8:22 PM, Tom Lane wrote: So: exactly what is the intended behavioral change as of now, and what is the argument supporting that change? The only intended change is what I was wondering in the original post: snapshot handling between normal execution and EXPLAIN ANALYZE when a query expands to multiple trees because of rewrite rules. Like I said earlier, this is just a bugfix now that wCTEs don't need it anymore. OK, and which behavior is getting changed, to what? I am not interested in trying to reverse-engineer a specification from the patch. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
On Mon, Feb 28, 2011 at 1:53 PM, Josh Berkus j...@agliodbs.com wrote: Since no one has stepped up to fix these issues, I have marked this patch Returned with Feedback. This is just contrib/btree_GIST, yes? Yes, core KNN was committed by Tom during the November CommitFest. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: Fix snapshot taking inconsistencies
On Mon, Feb 28, 2011 at 2:01 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Mon, Feb 28, 2011 at 1:45 PM, Tom Lane t...@sss.pgh.pa.us wrote: I'm afraid that the goals of this patch might be similarly obsolete. No, I don't think so. IIUC, the problem is that EXPLAIN ANALYZE runs the rewrite products with different snapshot handling than the regular execution path. Possibly, but it's not clear to me that this patch fixes that. As I said, it's no longer obvious what the patch means to do, and I'd like a clear statement of that. Fair enough. I assume Marko will provide that shortly. I believe the consensus was to make the regular case behave like EXPLAIN ANALYZE rather than the other way around... So in theory you could turn on auto_explain and have the semantics of your queries change. That would be Bad. That's just FUD. auto_explain doesn't run EXPLAIN ANALYZE. Oh, woops. I stand corrected. But I guess the query might behave differently with and without EXPLAIN ANALYZE? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Sync Rep v17
On Thu, 2011-02-24 at 18:13 -0800, Daniel Farina wrote: I have also reproduced this. Notably, things seem fine as long as pgbench is confined to one backend, but as soon as two are used (-c 2) by the feature I can get segfaults. Sorry that you all experienced this. I wasn't able to get concurrent queue accesses even with -c 8, so I spent about half a day last week investigating a possible spinlock locking flaw. That meant the code in that area was untested, which is most obvious now. I guess that means I should test on different hardware in future. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] mysql2pgsql.perl update
On Mon, Feb 28, 2011 at 10:49 AM, Josh Berkus j...@agliodbs.com wrote: Andrew, I spent some time updating mysql2pgsql.perl. Changes were driven by an attempt to migrate a redmine database. Original code was failing for a number of reasons (regex recursion explosion, . I was wondering if there's a more formal / appropriate place to put this. I'd also appreciate a separate set of eyes on my changes. I grabbed an initial copy from http://pgfoundry.org/projects/mysql2pgsql/ Git repo of the changes at https://github.com/SmartReceipt/mysql2pgsql/commits/master/ I can give you ownership of the pgFoundry project. What's your pgfoundry ID? pgfoundry id is ahammond. does foundry support git or should I just link to a repo on github? A
Re: [HACKERS] Review: Fix snapshot taking inconsistencies
On 2011-02-28 9:03 PM, Tom Lane wrote: Marko Tiikkajamarko.tiikk...@cs.helsinki.fi writes: On 2011-02-28 8:22 PM, Tom Lane wrote: So: exactly what is the intended behavioral change as of now, and what is the argument supporting that change? The only intended change is what I was wondering in the original post: snapshot handling between normal execution and EXPLAIN ANALYZE when a query expands to multiple trees because of rewrite rules. Like I said earlier, this is just a bugfix now that wCTEs don't need it anymore. OK, and which behavior is getting changed, to what? I am not interested in trying to reverse-engineer a specification from the patch. My recollection is (and the archives seem to agree) that normal execution and SQL functions were changed to only advance the CID instead of taking a new snapshot. EXPLAIN ANALYZE and SPI (not exactly sure about this one) did that already so they were just changed to use the new API. Regards, Marko Tiikkaja -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EXPLAIN doesn't show sufficient info for wCTE cases
On Mon, Feb 28, 2011 at 11:44:06AM -0500, Robert Haas wrote: On Mon, Feb 28, 2011 at 11:39 AM, Tom Lane t...@sss.pgh.pa.us wrote: EXPLAIN currently shows ModifyTable nodes as just Insert, Update, or Delete, without any indication of the target table. This was more or less good enough when there could only be one such node per query, but it's looking pretty inadequate to me as I play around with data-modifying statements in WITH. The obvious thing to do is show the target table much as we do for table scan nodes, eg Update on my_table. There is a deficiency in that, which is that for inherited UPDATE/DELETE cases a single ModifyTable node could have multiple target tables. But after reflecting on it a bit, I think it would be good enough to show the parent table name. The individual child plans will necessarily include scans of the individual child tables, so you can figure out which is which from that if you need to know. Alternatively we could list all the target tables in a new node attribute, eg Update (costs...) Target Tables: foo_parent, foo_child1, ... But in the majority of cases this seems like a waste of precious vertical space. Thoughts? I think it's good to include the table name, for sure. I *think* I agree that it isn't necessary to include the child names. Would this affect the use case of breaking up a too-long table into partitions? WITH f AS ( DELETE FROM ONLY foo WHERE foo_ts >= '2011-01-01' AND foo_ts < '2011-02-01' RETURNING * ) INSERT INTO foo_201101 SELECT * FROM f; Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [HACKERS] knngist - 0.8
Robert Haas robertmh...@gmail.com writes: On Mon, Feb 28, 2011 at 1:53 PM, Josh Berkus j...@agliodbs.com wrote: Since no one has stepped up to fix these issues, I have marked this patch Returned with Feedback. This is just contrib/btree_GIST, yes? Yes, core KNN was committed by Tom during the November CommitFest. Right. However, it's disappointing that this isn't in, because the number of use cases for KNN-gist in core isn't very large. We really need support for KNN in btree_gist to make it useful. Given that it is a contrib module, I personally wouldn't object to it getting patched later, like during alpha or beta. But somebody's got to do the work, and I've got a dozen higher-priority problems right now. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] mysql2pgsql.perl update
does foundry support git or should I just link to a repo on github? If you prefer using git, the latter. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] knngist - 0.8
On Mon, Feb 28, 2011 at 2:25 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Mon, Feb 28, 2011 at 1:53 PM, Josh Berkus j...@agliodbs.com wrote: Since no one has stepped up to fix these issues, I have marked this patch Returned with Feedback. This is just contrib/btree_GIST, yes? Yes, core KNN was committed by Tom during the November CommitFest. Right. However, it's disappointing that this isn't in, because the number of use cases for KNN-gist in core isn't very large. We really need support for KNN in btree_gist to make it useful. Given that it is a contrib module, I personally wouldn't object to it getting patched later, like during alpha or beta. But somebody's got to do the work, and I've got a dozen higher-priority problems right now. Well, we can argue about whether it's too late for 9.1 if and when a patch shows up. Right now we don't have that problem. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why our counters need to be time-based WAS: WIP: cross column correlation ...
Robert Haas robertmh...@gmail.com writes: On Mon, Feb 28, 2011 at 1:50 PM, Tom Lane t...@sss.pgh.pa.us wrote: Ultimately we need to think of a reporting mechanism that's a bit smarter than rewrite the whole file for any update ... Well, we have these things called tables. Any chance of using those? Having the stats collector write tables would violate the classical form of the heisenberg principle (thou shalt avoid having thy measurement tools affect that which is measured), not to mention assorted practical problems like not wanting the stats collector to take locks or run transactions. The ideal solution would likely be for the stats collector to expose its data structures as shared memory, but I don't think we get to do that under SysV shmem --- it doesn't like variable-size shmem much. Maybe that's another argument for looking harder into mmap or POSIX shmem, although it's not clear to me how well either of those fixes that. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Sync Rep v17
On Mon, 2011-02-21 at 21:35 +0900, Tatsuo Ishii wrote: Well, good news all round. v17 implements what I believe to be the final set of features for sync rep. This one I'm actually fairly happy with. It can be enjoyed best at DEBUG3. The patch is very lite touch on a few areas of code, plus a chunk of specific code, all on master-side. Pretty straight really. I'm sure problems will be found, it's not long since I completed this; thanks to Daniel Farina for your help with patch assembly. + <primary><varname>synchronous_standby_names</varname> configuration parameter</primary> + </indexterm> + <listitem> + <para> +Specifies a list of standby names that can become the sole +synchronous standby. Other standby servers connect that are also on +the list become potential standbys. If the current synchronous standby +goes away it will be replaced with one of the potential standbys. +Specifying more than one standby name can allow very high availability. + </para> Can anybody please enlighten me? I do not quite follow Other standby servers connect that are also on the list become potential standbys part. Can I read this as Other standby servers that are also on the list become potential synchronous standbys? Yes I have reworded it to see if that improves the explanation Code available at git://github.com/simon2ndQuadrant/postgres.git untagged text included here for clarity synchronous_standby_names Specifies a list of standby names that can become the sole synchronous standby. At any time there can be only one synchronous standby server. The first standby to connect that is listed here will become the synchronous standby server. Other standby servers that connect will then become potential synchronous standbys. If the current synchronous standby disconnects for whatever reason it will be replaced with one of the potential standbys. Specifying more than one standby name can allow very high availability.
The standby name is currently taken as the application_name of the standby, as set in the primary_conninfo on the standby. Names are not enforced for uniqueness, though clearly that could lead to confusion and misconfiguration. Specifying multiple standbys with the same name does not allow more than one standby to be the current synchronous standby. If a standby is removed from the list of servers then it will stop being the synchronous standby, allowing another to take its place. Standbys may also be added to the list without restarting the server. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
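The reworded description above can be condensed into a configuration sketch. The host and standby names below are illustrative, not taken from the thread:

```
# postgresql.conf on the primary -- the first listed standby to connect
# becomes the synchronous standby; the rest are potential standbys:
synchronous_standby_names = 'boston1, boston2'

# recovery.conf on each standby -- the application_name in
# primary_conninfo is what must match an entry in the list above:
primary_conninfo = 'host=primary port=5432 application_name=boston1'
```

If boston1 disconnects, boston2 (a potential synchronous standby) is promoted to the synchronous role, which is what makes listing several names a high-availability measure.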
Re: [HACKERS] Review: Fix snapshot taking inconsistencies
Marko Tiikkaja marko.tiikk...@cs.helsinki.fi writes: On 2011-02-28 9:03 PM, Tom Lane wrote: OK, and which behavior is getting changed, to what? I am not interested in trying to reverse-engineer a specification from the patch. My recollection is (and the archives seem to agree) that normal execution and SQL functions were changed to only advance the CID instead of taking a new snapshot. EXPLAIN ANALYZE and SPI (not exactly sure about this one) did that already so they were just changed to use the new API. OK, so the intent is that in all cases, we just advance CID and don't take a new snapshot between queries that were generated (by rule expansion) from a single original parsetree? But we still take a new snap between original parsetrees? Works for me. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why our counters need to be time-based WAS: WIP: cross column correlation ...
On Feb 28, 2011, at 14:31, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: On Mon, Feb 28, 2011 at 1:50 PM, Tom Lane t...@sss.pgh.pa.us wrote: Ultimately we need to think of a reporting mechanism that's a bit smarter than rewrite the whole file for any update ... Well, we have these things called tables. Any chance of using those? Having the stats collector write tables would violate the classical form of the heisenberg principle (thou shalt avoid having thy measurement tools affect that which is measured), not to mention assorted practical problems like not wanting the stats collector to take locks or run transactions. The ideal solution would likely be for the stats collector to expose its data structures as shared memory, but I don't think we get to do that under SysV shmem --- it doesn't like variable-size shmem much. Maybe that's another argument for looking harder into mmap or POSIX shmem, although it's not clear to me how well either of those fixes that. Spitballing here, but could sqlite be an intermediate, compromise solution? Michael Glaesemann grzm seespotcode net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Review: Fix snapshot taking inconsistencies
On 2011-02-28 9:36 PM, Tom Lane wrote: Marko Tiikkajamarko.tiikk...@cs.helsinki.fi writes: On 2011-02-28 9:03 PM, Tom Lane wrote: OK, and which behavior is getting changed, to what? I am not interested in trying to reverse-engineer a specification from the patch. My recollection is (and the archives seem to agree) that normal execution and SQL functions were changed to only advance the CID instead of taking a new snapshot. EXPLAIN ANALYZE and SPI (not exactly sure about this one) did that already so they were just changed to use the new API. OK, so the intent is that in all cases, we just advance CID and don't take a new snapshot between queries that were generated (by rule expansion) from a single original parsetree? But we still take a new snap between original parsetrees? Works for me. Exactly. Regards, Marko Tiikkaja -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why our counters need to be time-based WAS: WIP: cross column correlation ...
Spitballing here, but could sqlite be an intermediate, compromise solution? For a core PostgreSQL component ?!?!? -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why our counters need to be time-based WAS: WIP: cross column correlation ...
On Mon, 2011-02-28 at 11:39 -0800, Josh Berkus wrote: Spitballing here, but could sqlite be an intermediate, compromise solution? For a core PostgreSQL component ?!?!? Sure, why not? It is ACID compliant, has the right kind of license, has a standard API that we are all used to. It seems like a pretty decent solution in consideration. We don't need MVCC for this problem. JD -- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579 Consulting, Training, Support, Custom Development, Engineering http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt
Re: [HACKERS] Why our counters need to be time-based WAS: WIP: cross column correlation ...
Em 28-02-2011 15:50, Tom Lane escreveu: Ultimately we need to think of a reporting mechanism that's a bit smarter than rewrite the whole file for any update ... What about splitting statistic file per database? -- Euler Taveira de Oliveira http://www.timbira.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pl/python do not delete function arguments
On lör, 2011-02-26 at 09:43 +0100, Jan Urbański wrote: I'm officially at a loss on how to fix that bug without some serious gutting of how PL/Python arguments work. If someone comes up with a brilliant way to solve this problem, we can commit it after beta, or even during the 9.2 cycle (should the brilliant solution be backpatcheable). We'd essentially be trading off freeing something too soon with freeing it not at all. I'm not sure how good that tradeoff is. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] EXPLAIN doesn't show sufficient info for wCTE cases
David Fetter da...@fetter.org writes: On Mon, Feb 28, 2011 at 11:44:06AM -0500, Robert Haas wrote: I think it's good to include the table name, for sure. I *think* I agree that it isn't necessary to include the child names. Would this affect the use case of breaking up a too-long table into partitions? WITH f AS ( DELETE FROM ONLY foo WHERE foo_ts >= '2011-01-01' AND foo_ts < '2011-02-01' RETURNING * ) INSERT INTO foo_201101 SELECT * FROM f; In that example, each ModifyTable node is constrained to a single target table, so I'm not sure what your point is. regards, tom lane
Re: [HACKERS] Sync Rep v17
On Fri, 2011-02-25 at 16:41 +0100, Yeb Havinga wrote: I needed to add this to make the documentation compile --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -2010,6 +2010,9 @@ SET ENABLE_SEQSCAN TO OFF; You should also consider setting <varname>hot_standby_feedback</varname> as an alternative to using this parameter. </para> + </listitem> + </varlistentry> + </variablelist></sect2> <sect2 id="runtime-config-sync-rep"> Corrected, thanks. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Why our counters need to be time-based WAS: WIP: cross column correlation ...
Joshua D. Drake j...@commandprompt.com writes: On Mon, 2011-02-28 at 11:39 -0800, Josh Berkus wrote: Spitballing here, but could sqlite be an intermediate, compromise solution? For a core PostgreSQL component ?!?!? Sure, why not? Because it's fifty times more mechanism than we need here? We don't want a SQL interface (not even a lightweight one) and it's unclear that we ever want the data to go to disk at all. regards, tom lane
Re: [HACKERS] Why our counters need to be time-based WAS: WIP: cross column correlation ...
Euler Taveira de Oliveira eu...@timbira.com writes: On 28-02-2011 15:50, Tom Lane wrote: Ultimately we need to think of a reporting mechanism that's a bit smarter than rewrite the whole file for any update ... What about splitting the statistics file per database? That would improve matters for some usage patterns, but I'm afraid only a minority. regards, tom lane
Re: [HACKERS] Why our counters need to be time-based WAS: WIP: cross column correlation ...
On Mon, Feb 28, 2011 at 2:31 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Mon, Feb 28, 2011 at 1:50 PM, Tom Lane t...@sss.pgh.pa.us wrote: Ultimately we need to think of a reporting mechanism that's a bit smarter than rewrite the whole file for any update ... Well, we have these things called tables. Any chance of using those? Having the stats collector write tables would violate the classical form of the Heisenberg principle (thou shalt avoid having thy measurement tools affect that which is measured), not to mention assorted practical problems like not wanting the stats collector to take locks or run transactions. The ideal solution would likely be for the stats collector to expose its data structures as shared memory, but I don't think we get to do that under SysV shmem --- it doesn't like variable-size shmem much. Maybe that's another argument for looking harder into mmap or POSIX shmem, although it's not clear to me how well either of those fixes that. Well, certainly, you could make it work with mmap() - you could arrange a mechanism whereby anyone who tries to reference off the end of the portion they've mapped calls stat() on the file and remaps it at its now-increased size. But you'd need to think carefully about locking and free-space management, which is where it starts to sound an awful lot like you're reinventing the idea of a heap. Maybe there's a way to design some kind of lighter weight mechanism, but the complexity of the problem is not obviously a lot less than the general problem of storing frequently updated tabular data. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[HACKERS] OSSP gone missing? Fate of UUID?
A couple of weeks ago when installing uuid-ossp on a new server, I noticed that the OSSP site is gone. I haven't found anything on the web to indicate what happened. Anyone know? Michael Glaesemann grzm seespotcode net
Re: [HACKERS] Sync Rep v17
On Sat, 2011-02-19 at 22:52 -0500, Robert Haas wrote: On Sat, Feb 19, 2011 at 3:28 AM, Simon Riggs si...@2ndquadrant.com wrote: First, we should be clear to explain that you are referring to the fact that the request synchronous_commit = off synchronous_replication = on makes no sense in the way the replication system is currently designed, even though it is a wish-list item to make it work in 9.2+ What exactly do you mean by make it work? We can either (1) wait for the local commit and the remote commit (synchronous_commit=on, synchronous_replication=on), (2) wait for the local commit only (synchronous_commit=on, synchronous_replication=off), or (3) wait for neither (synchronous_commit=off, synchronous_replication=off). There's no fourth possible behavior, AFAICS. Currently, no, since as we discussed earlier we currently need to fsync WAL locally before it gets sent to standby. The question is whether synchronous_commit=off, synchronous_replication=on should behave like (1) or (3) Yes, that is the right question. You have it as #1; I'm arguing it should be #3. I realize it's an arguable point; I'm just arguing for what makes most sense to me. Various comments follow on thread. We can pick this up once we've committed the main patch. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Sync Rep v17
On Sat, 2011-02-19 at 23:26 -0500, Robert Haas wrote: I believe the problem is that the definition of IsOnSyncRepQueue is bogus, so that the loop in SyncRepWaitOnQueue always takes the first branch. Sorry, don't see that. Jaime/Yeb fix applied. It was a little confusing to me setting this up that setting only synchronous_replication did nothing; I had to also set synchronous_standby_names. We might need a cross-check there. I'm inclined to make an empty synchronous_standby_names mean that any standby can become the sync standby. That seems more useful behaviour and avoids the need for a cross-check (what exactly would we check??). I believe the docs for synchronous_replication also need some updating; this part appears to be out of date: +between primary and standby. The commit wait will last until the +first reply from any standby. Multiple standby servers allow +increased availability and possibly increase performance as well. Agreed The words on the primary in the next sentence may not be necessary any more either, as I believe this parameter now has no effect anywhere else. Agreed Docs changed: git://github.com/simon2ndQuadrant/postgres.git -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
[HACKERS] Re: Why our counters need to be time-based WAS: WIP: cross column correlation ...
On Mon, Feb 28, 2011 at 6:31 PM, Josh Berkus j...@agliodbs.com wrote: Like replacing each statistic with a series of time-based buckets, which would then increase the size of the table by 5X to 10X. That was the first solution I thought of, and rejected. I don't understand what you're talking about at all here. I think there are a lot of unsolved problems in monitoring but the one thing I think everyone is pretty clear on is that the right way to export metrics like these is to export a counter and then have some external component periodically copy the counter into some history table and calculate the derivative, second derivative, running average of the first derivative, etc. What's needed here is for someone to write a good mrtg/rrd/whatever replacement using postgres as its data store. If you're monitoring something sensitive then you would store the data in a *different* postgres server to avoid Tom's complaint. There may be aspects of the job that Postgres does poorly but we can focus on improving those parts of Postgres rather than looking for another database. And frankly Postgres isn't that bad a tool for it -- when I did some performance analysis recently I actually ended up loading the data into Postgres so I could do some of the aggregations using window functions anyways. -- greg
Re: [HACKERS] Sync Rep v17
On Mon, Feb 28, 2011 at 4:13 PM, Simon Riggs si...@2ndquadrant.com wrote: On Sat, 2011-02-19 at 23:26 -0500, Robert Haas wrote: I believe the problem is that the definition of IsOnSyncRepQueue is bogus, so that the loop in SyncRepWaitOnQueue always takes the first branch. Sorry, don't see that. Jaime/Yeb fix applied. It was a little confusing to me setting this up that setting only synchronous_replication did nothing; I had to also set synchronous_standby_names. We might need a cross-check there. I'm inclined to make an empty synchronous_standby_names mean that any standby can become the sync standby. That seems more useful behaviour and avoids the need for a cross-check (what exactly would we check??). Hmm, that is a little surprising but might be reasonable. My thought was that we would check that if synchronous_replication=on then synchronous_standbys must be non-empty. I think there ought to be some way for the admin to turn synchronous replication *off* though, in a way that an individual user cannot override. How will we do that? Docs changed: git://github.com/simon2ndQuadrant/postgres.git I'm hoping you're going to post an updated patch once the current rash of updates is all done. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Spatio-Temporal Functions
On Sat, Feb 26, 2011 at 1:43 PM, Nick Raj nickrajj...@gmail.com wrote: Hi, I am writing some spatio-temporal functions in PostGIS. Like, ST_Enters( obj_geom, boundary_geom, time t1, time t2) For example: does a vehicle enter ABC between time t1 and t2? Suppose I take only one object geometry at a time and compare; then I could not produce the output, because Enters is a continuous function of time, i.e. Enters is made up of Disjoint -> Touches -> Inside. So, to create this function, I require all the objects' geometries before they go into the function, so that it can be called only once. My first problem is that create or replace function ST_Enters(, geometry,timestamp,timestamp) returns boolean Which kind of data type should be used, because I have to hold more than 1000 rows and geometry[] doesn't work for it. So, can anyone tell me which data type can be used to hold this? Second, how can I use this function in SQL? Because suppose I write a query like select ST_Enters ( obj_geom, (select boundary_geom from boundary), '25/2/2011 12:23:32','26/2/2011') from vehicledata where vehicleid= 'XYZ'; then again only one object geometry at a time is matched with the boundary geometry; if I store the in-between data in some temp table, that creates a lot of extra computation on the table, and if it ends up partway through, the table may contain some wrong data. So, how can I use this function in SQL in a continuous manner? Thanks in advance. I think you might need to ask these questions on the PostGIS lists... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Sync Rep v17
On Mon, 2011-02-28 at 16:22 -0500, Robert Haas wrote: Docs changed: git://github.com/simon2ndQuadrant/postgres.git I'm hoping you're going to post an updated patch once the current rash of updates is all done. Immediately prior to commit, yes. Everybody else has been nudging me towards developing in public view, commit by commit on a public repo. So that's what I'm doing now, as promised earlier. That should help people object to specific commits if they no likey. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Sync Rep v17
On Mon, Feb 28, 2011 at 4:36 PM, Simon Riggs si...@2ndquadrant.com wrote: On Mon, 2011-02-28 at 16:22 -0500, Robert Haas wrote: Docs changed: git://github.com/simon2ndQuadrant/postgres.git I'm hoping you're going to post an updated patch once the current rash of updates is all done. Immediately prior to commit, yes. Everybody else has been nudging me towards developing in public view, commit by commit on a public repo. So that's what I'm doing now, as promised earlier. That should help people object to specific commits if they no likey. It took a few days for the problems with the last version to shake out. I think you should give people about that much time again. It's not realistic to suppose that everyone will follow your git repo in detail. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] SR standby hangs
On Tue, Feb 22, 2011 at 11:34 AM, Tom Lane t...@sss.pgh.pa.us wrote: Greg Stark gsst...@mit.edu writes: On Tue, Feb 22, 2011 at 12:55 PM, Robert Haas robertmh...@gmail.com wrote: A little OT, but ISTM that the buffer pin mechanism by its nature is prone to lock upgrade hazards. Except that pins don't block exclusive locks so there's no deadlock risk. The oddity here is on Vacuum's super-exclusive lock which is the real equivalent of an exclusive lock. However there's the added bonus that there can only be one vacuum on a table at a time. That makes it safe. We have seen deadlocks arising from this type of scenario: autovac has vacuum lock on table X autovac blocks waiting for cleanup lock on buffer B in X process P has pin on B due to a suspended query (eg cursor) P tries to get exclusive lock on X, is blocked by autovac's lock The heavyweight-lock manager fails to recognize deadlock because it doesn't know about the buffer-level LWLock. It might be interesting to have autovacuum skip a block it finds pinned for too long. +1, although as somebody pointed out nearby, this will only be legal if it's not a vacuum-to-prevent-wraparound situation. Another approach to this problem would be to jigger things so that the query doesn't hold a buffer pin while suspended. I'm not quite sure how to make that work, but maybe it's possible. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Re: [ADMIN] PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
On 28.02.2011 23:28, daveg wrote: On Wed, Jan 12, 2011 at 10:46:14AM +0200, Heikki Linnakangas wrote: We'll likely need to go back and forth a few times with various debugging patches until we get to the heart of this.. Anything new on this? I'm seeing it on one of my clients' production boxes. I haven't heard anything from the OP since. Also, what is the significance, ie what is the risk or damage potential if this flag is set incorrectly? Sequential scans will honor the flag, so you might see some dead rows incorrectly returned by a sequential scan. That's the only damage, but an incorrectly set flag could be a sign of something more sinister, like corrupt tuple headers. The flag should never be set incorrectly, so if you see that message you have hit a bug in PostgreSQL, or you have bad hardware. This flag is quite new, so a bug in PostgreSQL is quite possible. If you still have a backup that contains those incorrectly set flags, I'd like to see what the page looks like. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Re: Why our counters need to be time-based WAS: WIP: cross column correlation ...
I don't understand what you're talking about at all here. I think there are a lot of unsolved problems in monitoring but the one thing I think everyone is pretty clear on is that the right way to export metrics like these is to export a counter and then have some external component periodically copy the counter into some history table and calculate the derivative, second derivative, running average of the first derivative, etc. You missed the original point of the discussion, which was to have stats we could use for auto-tuning internally. Not to export them. For example, there are optimizations we could make with the query planner if we knew which tables and indexes were hot in general. That's how we started this discussion, and it's not solved by storing the stats history on another server. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com
Re: [HACKERS] Re: [ADMIN] PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
On Wed, Jan 12, 2011 at 10:46:14AM +0200, Heikki Linnakangas wrote: On 12.01.2011 06:21, Fujii Masao wrote: On Sat, Dec 25, 2010 at 2:09 PM, Maxim Boguk maxim.bo...@gmail.com wrote: While I was trying to create a reproducible test case for BUG #5798 I encountered a very strange effect on two of my servers (both servers have the same hardware platform/OS (freebsd 7.2) and PostgreSQL 8.4.4). Very simple test table created as: CREATE TABLE test (id integer); INSERT INTO test select generate_series(0,1); And I try repeatable vacuum of that table with this script: perl -e "foreach (1..10) {system \"psql -d test -h -c 'vacuum test'\";}" And about once a minute (at really random intervals: it can be 5 minutes without problems, or 3 vacuums in a row showing the same error) I get these errors: WARNING: PD_ALL_VISIBLE flag was incorrectly set in relation test page 1 ... WARNING: PD_ALL_VISIBLE flag was incorrectly set in relation test page 30 for all pages of the relation. Oh, interesting. This is the first time anyone can reliably reproduce that. I can't reproduce that on my laptop with that script, though, so I'm going to need your help to debug this. Can you compile PostgreSQL with the attached patch, and rerun the test? It will dump the pages with incorrectly set flags to files in /tmp/, and adds a bit more detail in the WARNING. Please run the test until you get those warnings, and tar up the created /tmp/pageimage* files, and post them along with the warning generated. We'll likely need to go back and forth a few times with various debugging patches until we get to the heart of this.. Anything new on this? I'm seeing it on one of my clients' production boxes. Also, what is the significance, ie what is the risk or damage potential if this flag is set incorrectly? Thanks -dg -- David Gould da...@sonic.net If simplicity worked, the world would be overrun with insects. 
Re: [HACKERS] Native XML
On 02/28/2011 05:23 PM, Robert Haas wrote: On Mon, Feb 28, 2011 at 10:30 AM, Tom Lane t...@sss.pgh.pa.us wrote: Well, in principle we could allow them to work on both, just the same way that (for instance) + is a standardized operator but works on more than one datatype. But I agree that the prospect of two parallel types with essentially duplicate functionality isn't pleasing at all. The real issue here is whether we want to store XML as text (as we do now) or as some predigested form which would make outputting the whole thing slower but speed up things like xpath lookups. We had the same issue with JSON, and due to the uncertainty about which way to go with it we ended up integrating nothing into core at all. It's really not clear that there is one way of doing this that is right for all use cases. If you are storing xml in an xml column just to get it validated, and doing no processing in the DB, then you'd probably prefer our current representation. If you want to build functional indexes on xpath expressions, and then run queries that extract data using other xpath expressions, you would probably prefer the other representation. Yes, it was actually the focal point of my considerations: whether to store plain text or 'something else'. It's interesting to know that such uncertainty already existed in another area. Maybe it's specific to other open source projects too... I tend to think that it would be useful to have both text and predigested types for both XML and JSON, but I am not too eager to begin integrating more stuff into core or contrib until it spends some time on pgfoundry or github or wherever people publish their PostgreSQL extensions these days and we have a few users prepared to testify to its awesomeness. It definitely makes sense to develop this new functionality separately for some time. 
It's kind of exciting to develop something new, but spending significant effort on the 'native XML' probably needs a somewhat higher level of consensus than what appeared in this discussion. In that context, the remark about users and their needs is something that I can't ignore. Thanks to all for contributions to this discussion. In any case, the definitional problems with xpath_table(), and/or the memory management problems with libxml2, are not the basis on which we should be making this decision.
Re: [HACKERS] Native XML
Anton antonin.hou...@gmail.com wrote: it was actually the focal point of my considerations: whether to store plain text or 'something else'. Given that there were similar issues for other hierarchical data types, perhaps we need something similar to tsvector, but for hierarchical data. The extra layer of abstraction might not cost much when used for XML compared to the possible benefit with other data. It seems likely to be a very nice fit with GiST indexes. So under this idea, you would always have the text (or maybe byte array?) version of the XML, and you could shard it to a separate column for fast searches. -Kevin
Re: [HACKERS] Re: Why our counters need to be time-based WAS: WIP: cross column correlation ...
On Mon, Feb 28, 2011 at 4:13 PM, Greg Stark gsst...@mit.edu wrote: On Mon, Feb 28, 2011 at 6:31 PM, Josh Berkus j...@agliodbs.com wrote: What's needed here is for someone to write a good mrtg/rrd/whatever replacement using postgres as its data store. If you're monitoring something sensitive then you would store the data in a *different* postgres server to avoid Tom's complaint. There may be aspects of the job that Postgres does poorly but we can focus on improving those parts of Postgres rather than looking for another database. And frankly Postgres isn't that bad a tool for it -- when I did some performance analysis recently I actually ended up loading the data into Postgres so I could do some of the aggregations using window functions anyways. Greg, see https://labs.omniti.com/labs/reconnoiter, but also see Josh's nearby email about how he's trying to solve this internal to the database. Robert Treat play: xzilla.net work: omniti.com hiring: l42.org/Lg
Re: [HACKERS] Sync Rep v17
On Mon, 2011-02-28 at 16:55 -0500, Robert Haas wrote: On Mon, Feb 28, 2011 at 4:36 PM, Simon Riggs si...@2ndquadrant.com wrote: On Mon, 2011-02-28 at 16:22 -0500, Robert Haas wrote: Docs changed: git://github.com/simon2ndQuadrant/postgres.git I'm hoping you're going to post an updated patch once the current rash of updates is all done. Immediately prior to commit, yes. Everybody else has been nudging me towards developing in public view, commit by commit on a public repo. So that's what I'm doing now, as promised earlier. That should help people object to specific commits if they no likey. It took a few days for the problems with the last version to shake out. I think you should give people about that much time again. It's not realistic to suppose that everyone will follow your git repo in detail. Yeh, I'm not rushing to commit. And even afterwards I expect comments that will mean I'm editing this for next month at least. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Re: Why our counters need to be time-based WAS: WIP: cross column correlation ...
j...@agliodbs.com (Josh Berkus) writes: I don't understand what you're talking about at all here. I think there are a lot of unsolved problems in monitoring but the one thing I think everyone is pretty clear on is that the right way to export metrics like these is to export a counter and then have some external component periodically copy the counter into some history table and calculate the derivative, second derivative, running average of the first derivative, etc. You missed the original point of the discussion, which was to have stats we could use for auto-tuning internally. Not to export them. For example, there are optimizations we could make with the query planner if we knew which tables and indexes were hot in general. That's how we started this discussion, and it's not solved by storing the stats history on another server. There's value to both, and there's no dearth of monitoring frameworks that people keep on replacing with successors, so there's certainly room for both ;-). Recent stuff about such... https://lopsa.org/content/philosophy-monitoring https://labs.omniti.com/labs/reconnoiter I'm not quite sure what ought to be in PostgreSQL as a built-in; I suspect that what's eventually needed is to be able to correlate things across database instances, so that when Tom says, "I need to know what data the planner's working on," the answer can be "OK, got that..." This data is surely useful to get out of the system, so I'd bias towards something sorta like what Greg suggests. And the closed-ended answer may prevent us from asking more sophisticated questions, also not a notably good thing... -- (reverse (concatenate 'string moc.liamg @ enworbbc)) If tautologies do not convey information, mathematicians would not be surprised by them. -- Mark Miller
[HACKERS] PL developer summit, May 21 at PgCon
Hi! I've been contacting PL developers directly over the last couple of weeks about having a PL summit at PgCon this year. The overwhelming response was: yes, let's do it! So, we will have the summit starting at 9:30am, Saturday May 21 at PgCon in Ottawa. We'll meet in one of the session rooms on campus. Details on what has been talked about so far are here: http://www.chesnok.com/daily/2011/02/28/pl-developer-summit-at-pgcon-may-21/ To RSVP, please fill out this form: https://spreadsheets0.google.com/viewform?formkey=dFIxWEU3T3I2WGlORVhkQW4zZVQ0VkE6MQ We'll provide lunch for those that RSVP. I've started a wiki page for the agenda: http://wiki.postgresql.org/wiki/PgCon_2011_PL_Summit Thanks! -selena -- http://chesnok.com
Re: [HACKERS] Native XML
On 02/28/2011 05:28 PM, Kevin Grittner wrote: Anton antonin.hou...@gmail.com wrote: it was actually the focal point of my considerations: whether to store plain text or 'something else'. There seems to be an almost universal assumption that storing XML in its native form (i.e. a text stream) is going to produce inefficient results. Maybe it will, but I think it needs to be fairly convincingly demonstrated. And then we would have to consider the costs. For example, unless we implemented our own XPath processor to work with our own XML format (do we really want to do that?), to evaluate an XPath expression for a piece of XML we'd actually need to produce the text format from our internal format before passing it to some external library to parse into its internal format and then process the XPath expression. That means we'd actually be making things worse, not better. But this is clearly the sort of processing people want to do - see today's discussion upthread about xpath_table. I'm still waiting to hear what it is that the OP is finding hard to do because we use libxml2. Given that there were similar issues for other hierarchical data types, perhaps we need something similar to tsvector, but for hierarchical data. The extra layer of abstraction might not cost much when used for XML compared to the possible benefit with other data. It seems likely to be a very nice fit with GiST indexes. So under this idea, you would always have the text (or maybe byte array?) version of the XML, and you could shard it to a separate column for fast searches. Tsearch should be able to handle XML now. It certainly knows how to recognize XML tags. cheers andrew
Re: [HACKERS] SSI bug?
An updated patch to address this issue is attached. It fixes a couple issues related to use of the backend-local lock table hint: - CheckSingleTargetForConflictsIn now correctly handles the case where a lock that's being held is not reflected in the local lock table. This fixes the assertion failure reported in this thread. - PredicateLockPageCombine now retains locks for the page that is being removed, rather than removing them. This prevents a potentially dangerous false-positive inconsistency where the local lock table believes that a lock is held, but it is actually not. - add some more comments documenting the times when the local lock table can be inconsistent with reality, as reflected in the shared memory table. This patch also incorporates Kevin's changes to copy locks when creating a new version of a tuple rather than trying to maintain a linkage between different versions. So this is a patch that should apply against HEAD and addresses all outstanding SSI bugs known to Kevin or myself. Besides the usual regression and isolation tests, I have tested this by running DBT-2 on a 16-core machine to verify that there are no assertion failures that only show up under concurrent access. Dan -- Dan R. K. Ports MIT CSAIL http://drkp.net/ diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c index ba01874..7a0e1a9c 100644 --- a/src/backend/access/nbtree/nbtree.c +++ b/src/backend/access/nbtree/nbtree.c @@ -824,7 +824,6 @@ restart: if (_bt_page_recyclable(page)) { /* Okay to recycle this page */ - Assert(!PageIsPredicateLocked(rel, blkno)); RecordFreeIndexPage(rel, blkno); vstate->totFreePages++; stats->pages_deleted++; diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c index d660ce5..580af2a 100644 --- a/src/backend/storage/lmgr/predicate.c +++ b/src/backend/storage/lmgr/predicate.c @@ -124,10 +124,6 @@ * SerializableXactHashLock * - Protects both PredXact and SerializableXidHash. 
* - * PredicateLockNextRowLinkLock - * - Protects the priorVersionOfRow and nextVersionOfRow fields of - * PREDICATELOCKTARGET when linkage is being created or destroyed. - * * * Portions Copyright (c) 1996-2011, PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California @@ -444,8 +440,6 @@ static void ReleaseOneSerializableXact(SERIALIZABLEXACT *sxact, bool partial, bool summarize); static bool XidIsConcurrent(TransactionId xid); static void CheckTargetForConflictsIn(PREDICATELOCKTARGETTAG *targettag); -static bool CheckSingleTargetForConflictsIn(PREDICATELOCKTARGETTAG *targettag, - PREDICATELOCKTARGETTAG *nexttargettag); static void FlagRWConflict(SERIALIZABLEXACT *reader, SERIALIZABLEXACT *writer); static void OnConflict_CheckForSerializationFailure(const SERIALIZABLEXACT *reader, SERIALIZABLEXACT *writer); @@ -1044,7 +1038,6 @@ InitPredicateLocks(void) PredXact-LastSxactCommitSeqNo = FirstNormalSerCommitSeqNo - 1; PredXact-CanPartialClearThrough = 0; PredXact-HavePartialClearedThrough = 0; - PredXact-NeedTargetLinkCleanup = false; requestSize = mul_size((Size) max_table_size, PredXactListElementDataSize); PredXact-element = ShmemAlloc(requestSize); @@ -1651,9 +1644,11 @@ PageIsPredicateLocked(const Relation relation, const BlockNumber blkno) * Important note: this function may return false even if the lock is * being held, because it uses the local lock table which is not * updated if another transaction modifies our lock list (e.g. to - * split an index page). However, it will never return true if the - * lock is not held. We only use this function in circumstances where - * such false negatives are acceptable. + * split an index page). However, it will almost never return true if + * the lock is not held; it can only do so in rare circumstances when + * a coarser-granularity lock that covers this one is being held. 
We + * are careful to only use this function in circumstances where such + * errors are acceptable. */ static bool PredicateLockExists(const PREDICATELOCKTARGETTAG *targettag) @@ -1717,6 +1712,9 @@ GetParentPredicateLockTag(const PREDICATELOCKTARGETTAG *tag, /* * Check whether the lock we are considering is already covered by a * coarser lock for our transaction. + * + * Like PredicateLockExists, this function might return a false + * negative, but it will never return a false positive. */ static bool CoarserLockCovers(const PREDICATELOCKTARGETTAG *newtargettag) @@ -1747,7 +1745,6 @@ static void RemoveTargetIfNoLongerUsed(PREDICATELOCKTARGET *target, uint32 targettaghash) { PREDICATELOCKTARGET *rmtarget; - PREDICATELOCKTARGET *next; Assert(LWLockHeldByMe(SerializablePredicateLockListLock)); @@ -1755,33 +1752,6 @@ RemoveTargetIfNoLongerUsed(PREDICATELOCKTARGET *target, uint32 targettaghash) if
Re: [HACKERS] WIP: cross column correlation ...
But it's not the same as tracking *sections of a table*. I dunno. I imagine if you have a section of a table in different storage than other sections, you created a tablespace and moved the partition holding that section there. Otherwise, how do you prevent the tuples from moving to other sections? (We don't really have a concept of sections of a table.)

A section could be as simple as being on the inner or outer part of a single disk, or as complicated as being in the SSD cache of a spinning disk, or in the multi-gigabyte cache on the RAID card or SAN due to being consistently accessed. Section is the wrong word, though. If primary key values under 10 million are consistently accessed, they will be cached even if they do get moved through the structure. Values over 10M may be fast if they happen to be on the same page as a hot value, but probably aren't. This is very evident when dealing with time-based data in what can be a very large structure: 1% may be very hot and in memory while 99% is not. Partitioning only helps if you can predict what will be hot in the future; sometimes an outside source (world events) impacts which section of the structure is hot.

regards, Rod
Re: [HACKERS] Sync Rep v17
On Mon, 2011-02-28 at 18:40 +, Simon Riggs wrote:

SyncRepReleaseWaiters should be called when walsender exits. Otherwise, if the standby crashes while a transaction is waiting for replication, it waits infinitely. Will think on this.

The behaviour seems correct to me:

If allow_standalone_primary = off, then you wish to wait forever (at your request...)

If allow_standalone_primary = on, then we sit and wait until we hit the client timeout, which occurs even after the last standby has gone.

--
Simon Riggs http://www.2ndQuadrant.com/books/
PostgreSQL Development, 24x7 Support, Training and Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
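[Editor's note] The two behaviours Simon describes reduce to a single decision on allow_standalone_primary. A minimal sketch, with illustrative names that are not PostgreSQL's actual implementation:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model of the waiter behaviour described above:
 * off -> wait forever for a standby ack (by explicit request);
 * on  -> the wait is bounded by the client-side timeout, even after
 *        the last standby has gone away. */
typedef enum { WAIT_FOREVER, WAIT_UNTIL_CLIENT_TIMEOUT } SyncRepWait;

static SyncRepWait sync_rep_wait_mode(bool allow_standalone_primary)
{
    if (!allow_standalone_primary)
        return WAIT_FOREVER;
    return WAIT_UNTIL_CLIENT_TIMEOUT;
}
```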
Re: [HACKERS] Re: [ADMIN] PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
On Feb 28, 2011, at 3:28 PM, daveg wrote:

On Wed, Jan 12, 2011 at 10:46:14AM +0200, Heikki Linnakangas wrote:

On 12.01.2011 06:21, Fujii Masao wrote:

On Sat, Dec 25, 2010 at 2:09 PM, Maxim Boguk maxim.bo...@gmail.com wrote:

While trying to create a reproducible test case for BUG #5798, I encountered a very strange effect on two of my servers (both servers have the same hardware platform/OS (FreeBSD 7.2) and PostgreSQL 8.4.4). A very simple test table created as:

CREATE TABLE test (id integer);
INSERT INTO test select generate_series(0,1);

And I try a repeatable vacuum of that table with the script:

perl -e "foreach (1..10) {system \"psql -d test -h -c 'vacuum test'\";}"

And about once a minute (at really random intervals: there can be 5 minutes without problems, or 3 vacuums in a row showing the same error) I get the following errors:

WARNING: PD_ALL_VISIBLE flag was incorrectly set in relation test page 1
...
WARNING: PD_ALL_VISIBLE flag was incorrectly set in relation test page 30

for all pages of the relation.

Oh, interesting. This is the first time anyone can reliably reproduce that. I can't reproduce it on my laptop with that script, though, so I'm going to need your help to debug this. Can you compile PostgreSQL with the attached patch, and rerun the test? It will dump the pages with incorrectly set flags to files in /tmp/, and adds a bit more detail in the WARNING. Please run the test until you get those warnings, tar up the created /tmp/pageimage* files, and post them along with the warning generated. We'll likely need to go back and forth a few times with various debugging patches until we get to the heart of this..

Anything new on this? I'm seeing it on one of my clients' production boxes. Also, what is the significance, i.e. what is the risk or damage potential if this flag is set incorrectly?

Was this cluster upgraded to 8.4.4 from 8.4.0?
It sounds to me like a known bug in 8.4.0 which was fixed by this commit:

commit 7fc7a7c4d082bfbd579f49e92b046dd51f1faf5f
Author: Tom Lane t...@sss.pgh.pa.us
Date: Mon Aug 24 02:18:32 2009 +

    Fix a violation of WAL coding rules in the recent patch to include an
    all tuples visible flag in heap page headers. The flag update *must*
    be applied before calling XLogInsert, but heap_update and the tuple
    moving routines in VACUUM FULL were ignoring this rule. A crash and
    replay could therefore leave the flag incorrectly set, causing rows
    to appear visible in seqscans when they should not be. This might
    explain recent reports of data corruption from Jeff Ross and others.
    In passing, do a bit of editorialization on comments in
    visibilitymap.c.

oy:postgresql machack$ git describe --tag 7fc7a7c4d082bfbd579f49e92b046dd51f1faf5f
REL8_4_0-190-g7fc7a7c

If the flag got twiddled while running as 8.4.0, the incorrect PD_ALL_VISIBLE flag would (obviously) not be fixed by the upgrade to 8.4.4. (Is this a separate issue?)

Regards,

David
--
David Christensen
End Point Corporation
da...@endpoint.com
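[Editor's note] The WAL coding rule this commit enforces can be shown with a toy model: a change that must survive crash-and-replay has to be on the page before XLogInsert captures it, otherwise replay restores the page without the change. All names below are illustrative stand-ins, not PostgreSQL code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy page with a single flag standing in for PD_ALL_VISIBLE. */
typedef struct {
    bool all_visible;
} Page;

/* What replay would restore: a full-page image taken at xlog_insert time. */
static Page wal_full_page_image;

static void xlog_insert(const Page *page)
{
    /* records the page exactly as it is right now */
    wal_full_page_image = *page;
}

static void crash_and_replay(Page *page)
{
    /* the in-memory page is lost; replay restores the WAL image */
    *page = wal_full_page_image;
}
```

Applying the flag update before xlog_insert survives replay; applying it after (the bug the commit fixes) means replay silently reverts the page to the state the WAL record captured.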
Re: [HACKERS] Re: [ADMIN] PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
On Tue, Mar 1, 2011 at 10:43 AM, David Christensen da...@endpoint.com wrote:

Was this cluster upgraded to 8.4.4 from 8.4.0? It sounds to me like a known bug in 8.4.0 which was fixed by this commit:

commit 7fc7a7c4d082bfbd579f49e92b046dd51f1faf5f
Author: Tom Lane t...@sss.pgh.pa.us
Date: Mon Aug 24 02:18:32 2009 +

    Fix a violation of WAL coding rules in the recent patch to include an
    all tuples visible flag in heap page headers. The flag update *must*
    be applied before calling XLogInsert, but heap_update and the tuple
    moving routines in VACUUM FULL were ignoring this rule. A crash and
    replay could therefore leave the flag incorrectly set, causing rows
    to appear visible in seqscans when they should not be. This might
    explain recent reports of data corruption from Jeff Ross and others.
    In passing, do a bit of editorialization on comments in
    visibilitymap.c.

oy:postgresql machack$ git describe --tag 7fc7a7c4d082bfbd579f49e92b046dd51f1faf5f
REL8_4_0-190-g7fc7a7c

If the flag got twiddled while running as 8.4.0, the incorrect PD_ALL_VISIBLE flag would (obviously) not be fixed by the upgrade to 8.4.4. (Is this a separate issue?)

Yes, it's a different issue. I observed it on 8.4.2.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum
On Tue, Mar 1, 2011 at 1:43 AM, David Christensen da...@endpoint.com wrote:

Was this cluster upgraded to 8.4.4 from 8.4.0? It sounds to me like a known bug in 8.4.0 which was fixed by this commit:

The reproduction script described was running vacuum repeatedly. A single vacuum run ought to be sufficient to clean up the problem if it was left over. I wonder if it would help to write a regression test that runs 100 or so vacuums and see if the buildfarm turns up any examples of this behaviour.

--
greg