[HACKERS] Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
From: Amit Kapila [amit.kap...@huawei.com], via pgsql-hackers-ow...@postgresql.org
Sent: Monday, October 15, 2012 7:28 PM
To: robertmh...@gmail.com; j...@agliodbs.com
Cc: pgsql-hackers@postgresql.org
Subject: [HACKERS] Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]

On Saturday, October 13, 2012 1:24 PM Amit Kapila wrote:
On Tue, 26 Jun 2012 17:04:42 -0400 Robert Haas wrote:
>> I see you posted up a follow-up email asking Tom what he had in mind.
>> Personally, I don't think this needs incredibly complicated testing.
>> I think you should just test a workload involving inserting and/or
>> updating rows with lots of trailing NULL columns, and then another
>> workload with a table of similar width that... doesn't. If we can't
>> find a regression - or, better, we find a win in one or both cases -
>> then I think we're done here.

> As per the last discussion of this patch, performance data needs to be
> provided before its review can proceed further.
> So, following your suggestion and the discussion of this patch, I have
> collected the performance data below.
> Results were taken with the following configuration:
> 1. Schema: UNLOGGED TABLE with 2,000,000 records, all columns of INT type.
> 2. shared_buffers = 10GB
> 3. All performance results were taken with a single connection.
> 4. Performance was measured for the INSERT operation (insert into temptable select * from inittable).
> Platform details:
> Operating System: Suse-Linux 10.2 x86_64
> Hardware: 4 cores (Intel(R) Xeon(R) CPU L5408 @ 2.13GHz)
> RAM: 24GB

Further to the performance data, I have completed my review of the patch.
Please find the patch addressing the review comments attached to this mail.
IMO it is now ready for a committer.

With Regards,
Amit Kapila.
Truncate-trailing-null-columns-from-heap-rows.v2.patch Description: Truncate-trailing-null-columns-from-heap-rows.v2.patch -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] hash_search and out of memory
On Thu, Oct 18, 2012 at 8:35 AM, Tom Lane wrote:
> I wrote:
>> Hitoshi Harada writes:
>>> If OOM happens during expand_table() in hash_search_with_hash_value()
>>> for RelationCacheInsert,
>
> the palloc-based allocator does throw errors. I think that when that
> was designed, we were thinking that palloc-based hash tables would be
> thrown away anyway after an error, but of course that's not true for
> long-lived tables such as the relcache hash table.
>
> I'm not terribly comfortable with trying to use a PG_TRY block to catch
> an OOM error - there are too many ways that could break, and this code
> path is by definition not very testable. I think moving up the
> expand_table action is probably the best bet. Will you submit a patch?

Here it is. I factored out the bucket-finding code to re-calculate it
after expansion.

Thanks,
--
Hitoshi Harada

hashoom.patch
Description: Binary data
Re: [HACKERS] [PATCH] Support for Array ELEMENT Foreign Keys
I wrote:
> Or we could go back to using ARRAY here --- that should be safe since
> ARRAY is already fully reserved.

Ugh ... no, that doesn't work, because ARRAY[...] is allowed in c_expr
and hence b_expr. So the ambiguity would still be there. We'd need a
different fully-reserved keyword to go this way.

regards, tom lane
Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ compatibility
On Thu, Oct 18, 2012 at 10:03 PM, Hannu Krosing wrote:
> Hmm. Maybe we should think of implementing this as REMOTE TABLE, that
> is a table which gets no real data stored locally, but all inserts go
> through WAL and are replayed as real inserts on the slave side.

FWIW, MySQL calls this exact concept the "black hole" storage engine.

Regards,
Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de
Re: [HACKERS] [PATCH] Support for Array ELEMENT Foreign Keys
Marco Nenciarini writes:
> Please find the attached refreshed patch (v2) which fixes the loose ends
> you found.

I've started looking at this patch, and the first thing I notice is that
the syntax doesn't work. It's ambiguous, and this:

 %left JOIN CROSS LEFT FULL RIGHT INNER_P NATURAL
 /* kluge to keep xml_whitespace_option from causing shift/reduce conflicts */
 %right PRESERVE STRIP_P
+%nonassoc ELEMENT
 %%

is not in any way an acceptable fix. All that that will do is cause an
arbitrary choice to be made when it's not clear what to do. Half the
time the arbitrary choice will be wrong. Consider for example

regression=# create table t1 (f1 int[] default 4! element references t2);
ERROR: column "element" does not exist

The parser has resolved the ambiguity about whether "!" is an infix or
postfix operator by assuming it's infix. (Yeah, I realize we've "fixed"
some similar cases with precedence hacks, but they are cases that were
forced on us by brain-dead syntax choices in the SQL standard. We don't
need to go there for syntax we're making up ourselves.)

We could get around that by making ELEMENT a fully reserved word, but I
don't think that's a really acceptable solution. ELEMENT is reserved
according to more recent versions of the SQL standard, but only as a
built-in function name, and in any case reserving it is very likely to
break people's existing applications.

Another possibility is to forget about the column-constraint ELEMENT
REFERENCES syntax, and only support the table-constraint syntax with
ELEMENT inside the column list --- I've not checked, but I think that
syntax doesn't have any ambiguity problems.

Or we could go back to using ARRAY here --- that should be safe since
ARRAY is already fully reserved. Or we could choose some other syntax.
I'm wondering about dropping the use of a keyword entirely, and instead
using '[]' decoration.
This wouldn't work too badly in the table constraint case:

 FOREIGN KEY (foo, bar[]) REFERENCES t (x, y)

but I'm less sure where to put the decoration for the column constraint
case. Thoughts?

regards, tom lane
Re: [HACKERS] Deprecating RULES
On Thu, Oct 18, 2012 at 6:10 PM, Josh Berkus wrote:
> Daniel,
>
>> I'm not going to disagree with that; I only feel it's reasonable to
>> ask those who react so strongly against deprecation why they think
>> what they do, and to receive a clinical response, because not everyone
>> has seen those use cases. My level of interest in deprecation is only
>> as far as "if those who have to deal with the RULES implementation
>> don't want to work on it anymore in favor of other things, I think the
>> pain to users of deprecation is, from my vantage point, manageable if
>> given some time."
>
> Note that you have heard from one of the people maintaining RULES, who
> doesn't find them problematic to maintain (Tom). Note that the original
> hackers calling for deprecation do not work on RULEs except where they
> touch other features.
>
> And I'll say again: if you want a full list of use-cases for RULEs, you
> need to go further than the -hackers list. This is a small, insular
> community which does not represent the majority of PostgreSQL users.
> You have gone to Heroku's users, but given the nature of your user base,
> they seem like the least likely group of people to use RULEs. That's
> like me polling a bunch of Data Warehousing geeks and then declaring
> that we don't really need SERIALIZABLE.

I have tried very assiduously to avoid generalizing, even though perhaps
I have failed. I made a false assumption that nobody wanted to work on
RULES, and I am very sorry for that.

What I do not like, and stand by, is the terse dismissal of even the
idea of deprecation, with a feeling that it is entirely unnecessary to
explain that assessment in any detail for dozens of emails. I don't
think that's a healthy thing, especially for a community where
(traditionally) deprecation comes up so seldom. I don't think this took
place so much in the other deprecation threads that spawned in this
one's wake. I can only offer the data I have.
Please do not over-read what I have been trying to communicate, or
conflate my position with those of other individuals, if you feel there
is a chance of that. Let us please consider the matter resolved unless
you feel you have more pointers for me -- anyone can send them to me
individually, perhaps. I am not soldiering for the deprecation of RULES,
but rather questioning the community's approach to a mere suggestion of
certain kinds of change. Perhaps this is a one-off, because discussion
in the other threads has been seemingly healthier.

--
fdr
Re: [HACKERS] Deprecating RULES
Daniel,

> I'm not going to disagree with that; I only feel it's reasonable to
> ask those who react so strongly against deprecation why they think
> what they do, and to receive a clinical response, because not everyone
> has seen those use cases. My level of interest in deprecation is only
> as far as "if those who have to deal with the RULES implementation
> don't want to work on it anymore in favor of other things, I think the
> pain to users of deprecation is, from my vantage point, manageable if
> given some time."

Note that you have heard from one of the people maintaining RULES, who
doesn't find them problematic to maintain (Tom). Note that the original
hackers calling for deprecation do not work on RULEs except where they
touch other features.

And I'll say again: if you want a full list of use-cases for RULEs, you
need to go further than the -hackers list. This is a small, insular
community which does not represent the majority of PostgreSQL users.
You have gone to Heroku's users, but given the nature of your user base,
they seem like the least likely group of people to use RULEs. That's
like me polling a bunch of Data Warehousing geeks and then declaring
that we don't really need SERIALIZABLE.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: [HACKERS] Deprecating RULES
On Thu, Oct 18, 2012 at 1:55 PM, Andrew Dunstan wrote:
> On 10/18/2012 01:11 PM, Daniel Farina wrote:
>
>> Here's another use case from my history with RULES that didn't seem
>> to pan out so well: in my recollection, one way to use rules is to
>> retarget operations that happen against a view and move them to a
>> table, and as I recall, to make this work as expected one had to
>> have a very wordy RULE (for UPDATEs) with a litany of (fairly simple)
>> equality and not-null conditions to make it work as one would expect
>> (to not under-constrain the UPDATE). This became a maintenance
>> headache whenever attributes were added to the underlying relation.
>
> Yes, but you also get a similar headache with a trigger. Unless you're
> VERY careful you can get a trigger failure by adding an attribute, and
> an almost guaranteed one by removing an attribute. It's true that the
> language for specifying the operations is more expressive, but no
> matter what mechanism you use, changing the shape of the objects can
> get you into trouble.
>
> I've never said that rules are perfect, nor that they should be used
> whenever possible. What I have said is that there are known cases where
> they are the best solution currently available. I still think that.

I'm not going to disagree with that; I only feel it's reasonable to ask
those who react so strongly against deprecation why they think what they
do, and to receive a clinical response, because not everyone has seen
those use cases. My level of interest in deprecation is only as far as
"if those who have to deal with the RULES implementation don't want to
work on it anymore in favor of other things, I think the pain to users
of deprecation is, from my vantage point, manageable if given some
time."
I also want to be very clear that I know my vantage point is skewed, but
I feel that exposing what assessment of user activity I can to -hackers
is important, and the best I can do when it comes to considering topics
like these.

--
fdr
Re: [HACKERS] pg_stat_lwlocks view - lwlocks statistics, round 2
On Thu, Oct 18, 2012 at 10:36 PM, Robert Haas wrote:
> Sadly, the situation on Windows doesn't look so good. I don't remember
> the exact numbers but I think it was something like 40 or 60 or 80
> times slower on the Windows box one of my colleagues tested than it is
> on Linux.

Do you happen to know the hardware and Windows version? Windows'
QueryPerformanceCounter, which instr_time.h uses, should use RDTSC-based
timing when the hardware supports it, just like Linux. I don't know
whether Windows can avoid the syscall overhead, though.

> Maybe it's worth finding a platform where pg_test_timing reports that
> timing is very slow and then measuring how much impact this has on
> something like a pgbench or pgbench -S workload.

This can easily be tested on Linux by changing to the hpet or acpi_pm
clocksource. There probably still are platforms that do worse than
those, but probably not by orders of magnitude.

Regards,
Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de
Re: [HACKERS] Deprecations in authentication
On Thu, 2012-10-18 at 12:38 -0400, Tom Lane wrote:
> I think the argument that it causes user confusion is a fairly strong
> one, though.

What is confusing, IMO, is changing the hba syntax all the time.
Re: [HACKERS] Deprecations in authentication
On Thu, 2012-10-18 at 13:20 +0200, Magnus Hagander wrote:
> In particular, we made a couple of changes over several releases back
> in the authentication config, that we should perhaps consider
> finishing by removing the old stuff now?
>
> 1. krb5 authentication. We've had gssapi since 8.3 (which means in all
> supported versions). krb5 has been deprecated, also since 8.3. Time to
> remove it?
>
> 2. ident-over-unix-sockets was renamed to "peer" in 9.1, with the old
> syntax deprecated but still mapping to the new one. Has it been there
> long enough that we should start throwing an error for ident on unix?

The hba syntax changes between 8.3 and 8.4 continue to annoy me to this
day, so I'd like to avoid these in the future, especially if they are
for mostly cosmetic reasons. I think any change should be backward
compatible with all supported versions, or alternatively with 8.4, since
that's incompatible with 8.3 anyway. (Those two will be the same before
9.3 goes out.)

So, in my opinion, krb5 could be removed, assuming that gssapi is a full
substitute. But ident-over-unix-sockets should stay, at least until 9.0
is EOL.
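For concreteness, the two deprecated spellings under discussion look like this in pg_hba.conf (illustrative fragment only; host/database/user fields are placeholders):

```
# Deprecated since 8.3; "gss" is the replacement spelling:
#host   all   all   0.0.0.0/0   krb5
host    all   all   0.0.0.0/0   gss

# "ident" on a unix-domain socket was renamed to "peer" in 9.1;
# the old spelling still maps to the new one:
#local  all   all               ident
local   all   all               peer
```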
Re: [HACKERS] [PATCH] Prefetch index pages for B-Tree index scans
On Thu, Oct 18, 2012 at 5:30 PM, Claudio Freire wrote:
> Backward:
>
>  QUERY PLAN
> --
>  GroupAggregate (cost=0.00..4149039.04 rows=90257289 width=4) (actual time=28.190..157708.405 rows=9001 loops=1)
>    -> Index Only Scan Backward using pgbench_accounts_pkey on pgbench_accounts (cost=0.00..2795179.71 rows=90257289 width=4) (actual time=28.178..135282.317 rows=9001 loops=1)
>         Index Cond: ((aid >= 1000) AND (aid <= 2))
>         Heap Fetches: 0
>  Total runtime: 160735.539 ms
>
> I/O throughput averages 12MB/s (a small increase), and the 3-second
> difference seems related to it (it's consistent).
> I/O utilization averages 88% (an important increase).
>
> This last result makes me think deeper prefetching could be
> potentially beneficial (it would result in read merges), but it's
> rather hard to implement without a skiplist of leaf pages. Maybe the
> backward-sequential pattern could be detected. I'll have to tinker
> with that.

Fun. That didn't take long. With the attached anti-sequential scan
patch, and effective_io_concurrency=8:

 QUERY PLAN
 -
 GroupAggregate (cost=0.00..4149039.04 rows=90257289 width=4) (actual time=26.964..84299.789 rows=9001 loops=1)
   -> Index Only Scan Backward using pgbench_accounts_pkey on pgbench_accounts (cost=0.00..2795179.71 rows=90257289 width=4) (actual time=26.955..62761.774 rows=9001 loops=1)
        Index Cond: ((aid >= 1000) AND (aid <= 2))
        Heap Fetches: 0
 Total runtime: 87170.355 ms

I/O throughput 22MB/s (twice as fast).
I/O utilization 95% (I was expecting 100% but... hey... good enough).

With e_i_c=24, it gets to 100% utilization and 30MB/s (that's 3 times
faster). So, I'd like to know what you think, but maybe for
back-sequential scans, prefetch should be set to a multiple (e.g. 24x)
of e_i_c, in order to exploit read request merges.

postgresql-git-bt_prefetch_backseq.diff
Description: Binary data
Re: [HACKERS] Skip checkpoint on promoting from streaming replication
On 18 October 2012 21:22, Alvaro Herrera wrote:
> This patch seems to have been neglected by both its submitter and the
> reviewer. Also, Simon said he was going to set it
> returned-with-feedback on his last reply, but I see it as needs-review
> still in the CF app. Is this something that is going to be reconsidered
> and resubmitted for the next commitfest? If so, please close it up in
> the current one.

I burned time on the unlogged table problems, so I haven't got round to
this yet. I'm happier than I was with this. I'm also conscious that this
is very important and that there are no later patch dependencies, so
there's no rush to commit it and every reason to make sure it happens
without any mistakes. It will be there for 9.3.

--
Simon Riggs                    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Deprecating RULES
> Well, it'd be nice to be able to rewrite a query referring to a table
> to still refer to that same table, but you can't, because you get
> infinite recursion.

If that were possible, it would be quite easy to express row/column
level security policies with it.

> If you could do that, it'd presumably be a WHOLE lot faster than
> triggers. Because frankly, if triggers were infinitely fast, I don't
> think we'd be having this conversation. But they're not. They're slow;
> really slow, and while we may be able to improve that somewhat by some
> means, they're basically always going to be slow. Being able to
> rewrite queries inside the server is useful, and rules are not a very
> good solution to that problem, but right now they're the only thing
> we've got.

Moreover, there are no triggers fired on selects.

Best regards
--
Luben Karavelov
Re: [HACKERS] embedded list
Alvaro Herrera writes:
> Oops. I mentioned this explicitly somewhere in the discussion. I
> assumed you had seen that, and that you would have complained had you
> found it objectionable.

Sorry, I've been too busy to pay very much attention to this patch.

>> I think we should remove the head argument at least from dlist_delete,
>> and probably also dlist_insert_after and dlist_insert_before.

> There are more functions that get the list head just to run the check.
> Can I assume that you don't propose removing the argument from those?
> (dlist_next_node and dlist_prev_node, I think, are the only ones.)

Yeah, I wondered whether to do the same for those. But it's less of an
issue there, because in practice the caller is almost certainly going to
also need to do dlist_has_next or dlist_has_prev respectively, and those
require the list header.

On the other hand, applying the same principle to slists, you could
argue that slist_has_next and slist_next_node should not require the
head pointer (since requiring it throws away an advantage of slists).
If we wanted to remove the head pointer from those, there would be some
value in not having the head argument in dlist_next_node/dlist_prev_node,
for symmetry with slist_next_node. I'm not as excited about these since
it seems relatively less likely to matter. What do you think?

regards, tom lane
Re: [HACKERS] Deprecating RULES
On 10/18/2012 01:11 PM, Daniel Farina wrote:
> Here's another use case from my history with RULES that didn't seem
> to pan out so well: in my recollection, one way to use rules is to
> retarget operations that happen against a view and move them to a
> table, and as I recall, to make this work as expected one had to
> have a very wordy RULE (for UPDATEs) with a litany of (fairly simple)
> equality and not-null conditions to make it work as one would expect
> (to not under-constrain the UPDATE). This became a maintenance
> headache whenever attributes were added to the underlying relation.

Yes, but you also get a similar headache with a trigger. Unless you're
VERY careful you can get a trigger failure by adding an attribute, and
an almost guaranteed one by removing an attribute. It's true that the
language for specifying the operations is more expressive, but no
matter what mechanism you use, changing the shape of the objects can
get you into trouble.

I've never said that rules are perfect, nor that they should be used
whenever possible. What I have said is that there are known cases where
they are the best solution currently available. I still think that.

cheers

andrew
Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework
On 2012-10-18 20:08, Tom Lane wrote:
> Alvaro Herrera writes:
>> Boszormenyi Zoltan wrote:
>>> this is the latest one, fixing a bug in the accounting of
>>> per-statement lock timeout handling and tweaking some comments.
>> Tom, are you able to give this patch some more time on this commitfest?
>
> I'm still hoping to get to it, but I've been spending a lot of time on
> bug fixing rather than patch review lately :-(. If you're hoping to
> close out the current CF soon, maybe we should just slip it to the
> next one.

Fine by me. Thanks.

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de http://www.postgresql.at/
Re: [HACKERS] embedded list
Tom Lane wrote:
> Alvaro Herrera writes:
>> Here's the final version. I think this is ready to go in.
>
> I got around to reviewing this today. I'm pretty seriously annoyed at
> the definition of dlist_delete: it should *not* require the list header.
> The present coding simply throws away one of the primary advantages of
> a doubly-linked list over a singly-linked list, namely that you don't
> have to have your hands on the list header in order to unlink a node.
> This isn't merely academic either, as I see that the patch to catcache
> code actually added a field to struct catctup to support making the
> list header available. That's a complete waste of 8 bytes (on a 64-bit
> machine) per catalog cache entry. The only thing it buys for us is
> the ability to run dlist_check, which is something that isn't even
> compiled (not even in an Assert build), and which doesn't actually do
> that much useful even if it is compiled --- for instance, there's no way
> to verify that the nodes were actually in the list claimed.

Oops. I mentioned this explicitly somewhere in the discussion. I assumed
you had seen that, and that you would have complained had you found it
objectionable. (It's hard enough to figure out whether people don't
respond because they don't have a problem with something, or just
because they didn't see it.)

On the other hand, it's convenient to remove them, because in
predicate.c there are plenty of SHM_QUEUE node removals, which are
clearly easier to port over to dlist if we don't have to figure out
exactly which list each node is in. (Maybe in other SHM_QUEUE users as
well, but that's the most complex of the bunch.)

> I think we should remove the head argument at least from dlist_delete,
> and probably also dlist_insert_after and dlist_insert_before.

There are more functions that get the list head just to run the check.
Can I assume that you don't propose removing the argument from those?
(dlist_next_node and dlist_prev_node, I think, are the only ones.)
--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] Prefetch index pages for B-Tree index scans
I've noticed, while doing some reporting queries, that index scans fail
to saturate server resources on compute-intensive queries. The problem
is that, just after fetching a page, postgres starts computing stuff
before fetching the next. This results in an I/O - compute - I/O -
compute alternation that leaves both CPU and disk idle part of the time.

I've also noticed that earlier patches attempted to implement prefetch
of index pages, yet they don't seem to have been committed. I'm
wondering why. The patches themselves were quite complex, attempting to
prefetch heap tuples in addition to index tuples. I can see how this
would be beneficial, but heap prefetch is notoriously more complicated,
and with the advent of index-only scans, maybe index-page-only
prefetching would be of use.

To test this hypothesis, I wrote the attached patch. pgbench doesn't
seem to see any performance impact (expected, since pgbench uses really
tiny queries that touch a single page). Using pgbench's data with a
scale of 1000, I tried some other queries that were more taxing of
index-only scans. This was done on my personal computer, with a really
lame I/O subsystem (compared to database servers), and with 9.2.1 rather
than git, but I think it should be significant anyway. I will try to
verify on RAID-ed disks, but I'm in short supply of test systems at the
moment.

Pgbench's biggest index (on pgbench_accounts.aid) is, not surprisingly,
quite sequential, since it has just been created. So I tested both
forward and backward index scans, to get an idea of how the patch
impacts sequential and non-sequential workloads. I haven't been able to
test truly random ones yet - I'm trying to load up a dump from a
production database that's both big and messy enough to test.
A few things worth noting: the patch avoids doing prefetch on
single-page index access, and only prefetches when block numbers aren't
sequentially increasing (only when scanning the index nonsequentially
will prefetch be attempted), since I noticed the fadvise call in those
cases was being counterproductive.

So, here we go. The base query I used is:

explain analyze select aid, count(*) from pgbench_accounts where aid
between 1000 and 2 group by aid order by aid;

For backward scans, just order by aid desc. The full command used is:

sudo bash -c 'echo 3 > /proc/sys/vm/drop_caches' ; pg_ctl start -l
${PGLOG} ; sleep 5 ; ( sleep 10 ; echo 'set effective_io_concurrency
to 0;' ; echo 'explain analyze select aid, count(*) from
pgbench_accounts where aid between 1000 and 2 group by aid order by
aid;' ) | psql -h localhost -p 5433 pgbench ; sleep 5 ; pg_ctl stop

The server is started and stopped every time to make sure the shared
cache is empty, and the sleeps are there to keep the backlash I/O from
dropping caches from influencing the benchmark.
Results:

With effective_io_concurrency set to 0 (patch disabled):

Forward:
 QUERY PLAN
 GroupAggregate (cost=0.00..4149039.04 rows=90257289 width=4) (actual time=47.552..31113.353 rows=9001 loops=1)
   -> Index Only Scan using pgbench_accounts_pkey on pgbench_accounts (cost=0.00..2795179.71 rows=90257289 width=4) (actual time=47.542..13197.982 rows=9001 loops=1)
        Index Cond: ((aid >= 1000) AND (aid <= 2))
        Heap Fetches: 0
 Total runtime: 33648.500 ms

I/O throughput averages 60MB/s (clearly sequential).
I/O utilization averages around 30%.

Backward:
 QUERY PLAN
 --
 GroupAggregate (cost=0.00..4149039.04 rows=90257289 width=4) (actual time=10.279..159704.590 rows=9001 loops=1)
   -> Index Only Scan Backward using pgbench_accounts_pkey on pgbench_accounts (cost=0.00..2795179.71 rows=90257289 width=4) (actual time=10.266..132853.382 rows=9001 loops=1)
        Index Cond: ((aid >= 1000) AND (aid <= 2))
        Heap Fetches: 0
 Total runtime: 163202.869 ms

I/O throughput averages 11MB/s (clearly not fully random, but not
sequential either).
I/O utilization averages 68%.

With effective_io_concurrency set to 1 (patch enabled):

Forward:
 QUERY PLAN
 GroupAggregate (cost=0.00..4149039.04 rows=90257289 width=4) (actual time=47.582..30474.222 rows=9001 loops=1)
   -> Index Only Scan using pgbench_accounts_pkey on pgbench_accounts (cost=0.00..2795179.71 rows=90257289 width=4) (actual time=47.571..12208.340 rows=9001 loops=1)
        Index Cond: ((aid >= 1000) AND (aid <= 2))
        Heap Fetches: 0
 Total runtime: 32875.6
Re: [HACKERS] Skip checkpoint on promoting from streaming replication
This patch seems to have been neglected by both its submitter and the
reviewer. Also, Simon said he was going to set it returned-with-feedback
on his last reply, but I see it as needs-review still in the CF app. Is
this something that is going to be reconsidered and resubmitted for the
next commitfest? If so, please close it up in the current one.

Thanks.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] embedded list
Alvaro Herrera writes:
> Here's the final version. I think this is ready to go in.

I got around to reviewing this today. I'm pretty seriously annoyed at
the definition of dlist_delete: it should *not* require the list header.
The present coding simply throws away one of the primary advantages of a
doubly-linked list over a singly-linked list, namely that you don't have
to have your hands on the list header in order to unlink a node. This
isn't merely academic either, as I see that the patch to the catcache
code actually added a field to struct catctup to support making the list
header available. That's a complete waste of 8 bytes (on a 64-bit
machine) per catalog cache entry. The only thing it buys for us is the
ability to run dlist_check, which is something that isn't even compiled
(not even in an Assert build), and which doesn't actually do that much
useful even if it is compiled --- for instance, there's no way to verify
that the nodes were actually in the list claimed.

I think we should remove the head argument at least from dlist_delete,
and probably also dlist_insert_after and dlist_insert_before.

regards, tom lane
Re: [HACKERS] Deprecating RULES
On Wed, Oct 17, 2012 at 7:25 PM, Tom Lane wrote:
> Josh Berkus writes:
>> I would tend to say "well, they're not hurting anyone, why not keep
>> them?" Except that we're gathering an increasing number of features
>> (RETURNING, FDWs, CTEs, Command triggers) which don't work well together
>> with RULEs.
>
> Really? On what do you base that claim? The only one of those that I
> might believe is command triggers, but AFAIK we only have/plan command
> triggers for DDL, so there's no overlap.
>
> I'm fairly annoyed by the entire tenor of this conversation, because
> the people who are hollering the loudest seem to be people who have
> never actually touched any of the rules code, but nonetheless seem
> prepared to tell those of us who have what to spend our time on.
>
> Now having said that, I would definitely like to see rules in their
> current form go away eventually. But not without a substitute.
> Triggers are not a complete replacement, and no amount of wishful
> thinking makes them so.
>
> Perhaps it would be more profitable to try to identify the pain points
> that make people so eager to get rid of rules, and then see if we could
> alleviate them. One big problem I know about offhand is the
> multiple-evaluation risk, which seems at least in principle fixable.
> What others are there?

Well, it'd be nice to be able to rewrite a query referring to a table to still refer to that same table, but you can't, because you get infinite recursion. If you could do that, it'd presumably be a WHOLE lot faster than triggers. Because frankly, if triggers were infinitely fast, I don't think we'd be having this conversation. But they're not. They're slow; really slow, and while we may be able to improve that somewhat by some means, they're basically always going to be slow.

Being able to rewrite queries inside the server is useful, and rules are not a very good solution to that problem, but right now they're the only thing we've got.
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stat_lwlocks view - lwlocks statistics, round 2
On Tue, Oct 16, 2012 at 11:31 AM, Satoshi Nagayasu wrote:
> A flight-recorder must not be disabled. Collecting
> performance data must be a top priority for the DBA.

This analogy is inapposite, though, because a flight recorder rarely crashes the aircraft. If it did, people might have second thoughts about the "never disable the flight recorder" rule.

I have had a couple of different excuses to look into the overhead of timing lately, and it does indeed seem that on many modern Linux boxes even extremely frequent gettimeofday calls produce only very modest amounts of overhead. Sadly, the situation on Windows doesn't look so good. I don't remember the exact numbers, but I think it was something like 40 or 60 or 80 times slower on the Windows box one of my colleagues tested than it is on Linux. And it turns out that that overhead really is measurable and does matter if you do it in a code path that gets run frequently. Of course I am enough of a Linux geek that I don't use Windows myself and curse my fate when I do have to use it, but the reality is that we have a huge base of users who only use PostgreSQL at all because it runs on Windows, and we can't just throw those people under the bus. I think that older platforms like HP/UX likely have problems in this area as well, although I confess to not having tested.

That having been said, if we're going to do this, this is probably the right approach, because it only calls gettimeofday() in the case where the lock acquisition is contended, and that is a lot cheaper than calling it in all cases. Maybe it's worth finding a platform where pg_test_timing reports that timing is very slow and then measuring how much impact this has on something like a pgbench or pgbench -S workload. We might find that it is in fact negligible. I'm pretty certain that it will be almost if not entirely negligible on Linux, but that's not really the case we need to worry about.
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [BUGS] BUG #7521: Cannot disable WAL log while using pg_dump
You may disable full_page_writes, but as you can see from my previous post, disabling it did not do the trick. My ZFS dataset's USED property continues to increase.

On Wed, Oct 17, 2012 at 3:55 PM, ichbinrene wrote:
> I'm experiencing the exact same issue:
>
> PostgreSQL 9.1.6 on i686-pc-linux-gnu, compiled by gcc (GCC) 4.4.6 20120305
> (Red Hat 4.4.6-4), 32-bit
> CentOS release 6.3 (Final)
>
> I might also turn full_page_writes off, but I fear for data integrity in
> case of a crash.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Incorrect behaviour when using a GiST index on points
On Thu, Oct 11, 2012 at 07:17:28AM -0400, Noah Misch wrote:
> On Tue, Oct 02, 2012 at 01:58:40PM -0400, Noah Misch wrote:
> > > > On Mon, Aug 27, 2012 at 7:43 PM, Tom Lane wrote:
> > > >> There's also the big-picture question of whether we should just get rid
> > > >> of fuzzy comparisons in the geometric types instead of trying to hack
> > > >> indexes to work around them.
> >
> > In any event, I think we should entertain a patch to make the GiST operator
> > class methods bug-compatible with corresponding operators. Even if we decide
> > to change operator behavior in HEAD, the back branches could use it.
>
> We have broad agreement that the specific implementation of fuzz in geometric
> comparison operators is shoddy, but nobody has voiced interest in designing a
> concrete improvement. I propose adding a TODO item "Remove or improve
> rounding in geometric comparison operators", endorsing Alexander's design, and
> reviewing his patch. Objections?

TODO added, and here's a review:

The patch adds no regression tests; it should add tests illustrating the problems it fixes.

I audited the other indexable geometric operators for similar problems. This passage in gist_point_consistent_internal(), which handles (point,point) operators, caught my suspicion:

    case RTSameStrategyNumber:
        if (isLeaf)
        {
            result = FPeq(key->low.x, query->x) &&
                     FPeq(key->low.y, query->y);
        }
        else
        {
            result = (query->x <= key->high.x &&
                      query->x >= key->low.x &&
                      query->y <= key->high.y &&
                      query->y >= key->low.y);
        }
        break;

A leaf entry reachable from an internal entry may fall exactly on the internal-entry bounding box. Since we would accept a fuzzy match at the leaf level, I think we must also accept a fuzzy match at the internal level.
> *** a/src/backend/access/gist/gistproc.c
> --- b/src/backend/access/gist/gistproc.c
> ***
> *** 1326,1331 gist_point_consistent(PG_FUNCTION_ARGS)
> --- 1327,1333
>       bool       *recheck = (bool *) PG_GETARG_POINTER(4);
>       bool        result;
>       StrategyNumber strategyGroup = strategy / GeoStrategyNumberOffset;
> +     BOX        *query, *key;

This function now has "query" variables within subsidiary blocks redundant with and masking this one. Avoid doing that.

>       switch (strategyGroup)
>       {
> ***
> *** 1337,1348 gist_point_consistent(PG_FUNCTION_ARGS)
>               *recheck = false;
>               break;
>           case BoxStrategyNumberGroup:
> !             result = DatumGetBool(DirectFunctionCall5(
> !                                         gist_box_consistent,
> !                                         PointerGetDatum(entry),
> !                                         PG_GETARG_DATUM(1),
> !                                         Int16GetDatum(RTOverlapStrategyNumber),
> !                                         0, PointerGetDatum(recheck)));
>               break;
>           case PolygonStrategyNumberGroup:
>           {
> --- 1339,1356
>               *recheck = false;
>               break;
>           case BoxStrategyNumberGroup:
> !             /*
> !              * This code repeats logic of on_ob which uses simple comparison
> !              * rather than FP* functions.
> !              */
> !             query = PG_GETARG_BOX_P(1);
> !             key = DatumGetBoxP(entry->key);
> !
> !             *recheck = false;
> !             result = key->high.x >= query->low.x &&
> !                      key->low.x <= query->high.x &&
> !                      key->high.y >= query->low.y &&
> !                      key->low.y <= query->high.y;

For leaf entries, this correctly degenerates to on_pb(). For internal entries, it must, but does not, implement box_overlap(). (The fuzzy box_overlap() would be fine.) I recommend making gist_point_consistent()'s treatment of boxes resemble its treatment of circles and polygons; that eases verifying their correctness. Call gist_box_consistent. Then, for leaf entries, call box_contain_pt(). GiST "consistent" functions often validate the strategy number, but the circle, polygon and box branches of gist_point_consistent silently assum
Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ ccompatibility
On Thu, Oct 18, 2012 at 2:56 PM, Hannu Krosing wrote: > * works as table on INSERTS up to inserting logical WAL record describing > the > insert but no data is inserted locally. > > with all things that follow from the local table having no data > - unique constraints don't make sense > - indexes make no sense > - updates and deletes hit no data > - etc. . . Yep, I think I was understanding those aspects. I think I disagree that "indexes make no sense." I think that it would be meaningful to have an index type for this, one that is a pointer at WAL records, to enable efficiently jumping to the right WAL log to start accessing a data stream, given an XID. That's a fundamentally different sort of index than we have today (much the way that hash indexes, GiST indexes, and BTrees differ from one another). I'm having a hard time thinking about what happens if you have cascaded replication, and want to carry records downstream. In that case, the XIDs from the original system aren't miscible with the XIDs in a message queue on a downstream database, and I'm not sure what we'd want to do. Keep the original XIDs in a side attribute, maybe? It seems weird, at any rate. Or perhaps data from foreign sources has got to go into a separate queue/'sorta-table', and thereby have two XIDs, the "source system XID" and the "when we loaded it in locally XID." -- When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?" -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Truncate if exists
On Tue, Oct 16, 2012 at 2:12 PM, Stafford, David x78061 wrote:
> On Mon, 15 Oct 2012 10:21:18 -0700, Robert Haas wrote:
>> Yeah, I think the functionality that we need is pretty much there
>> already today. What we need to do is to get the syntax to a point
>> where people can write the code they want to write without getting
>> tangled up by it.
>>
>> I think the invention of DO was a big step in the right direction
>> ...
>> With DO, you can write the logic you want
>> as an SQL statement, it's just a clunky and awkward SQL statement. In
>> my view the goal ought to be to refine that mechanism to remove the
>> clunkiness and awkwardness, rather than to invent something completely
>> new.
>
> As someone who has worked with a number of databases now, I find that none
> of them really gets this DDL integration completely right. What I'd like
> to see is 1) a predicate to easily test things about the schema (does this
> table, column, index, schema, etc. exist? does it have the right type?)
> and 2) a way to conditionally execute DDL (and DML, which should fall
> right out, but it isn't really what this discussion is covering). I would
> propose extending the current EXISTS / NOT EXISTS predicate as follows:
>
> [NOT] EXISTS TABLE tab [ ( col [type] [ , col [type]]... ) ]
> [NOT] EXISTS COLUMN tab.col [type]
> [NOT] EXISTS INDEX tab.idx [ ( col [ , col]... ) ]
> [NOT] EXISTS CONSTRAINT tab.cstrt      -- not sure what else might be useful here
> [NOT] EXISTS ( select )                -- this is the current EXISTS predicate, of course
> [NOT] EXISTS ANY ROWS FROM tab [join] [WHERE predicate] [GROUP BY col [ , col]...]
>                                        -- exactly the same as (select 1 FROM etc.),
>                                        -- only because I like it better
>
> (the latter [which by no means am I nuts over; it's just that when
> extending EXISTS I can't stop neatening it up to my personal preferences]
> could be extended with [NOT] EXISTS MORE THAN n ROWS FROM and [NOT] EXISTS
> EXACTLY n ROWS FROM.)
> > There is a new SQL statement: IF predicate true-statement [ELSE > false-statement]. > > To actually execute this new IF statement, the executor would need an IF node > that evaluates the predicate (with ANDs and ORs, just like all SQL > predicates) and then executes the rest only if the predicate came out TRUE > (or NOT FALSE; I forget which is usually used, and the difference with NULL > could be useful, as long as it matches other predicates). This moves one > more bit of procedural logic into the executor. > > Another wrinkle is that the dependent statement might not compile, due to > missing tables or whatnot. Actually executing it while it doesn't compile is > an error, but we want to defer that error until we actually decide we need to > execute it. Also, it's probably good to try compiling it again at that > point. So my thought would be to try planning the dependent statement(s); if > they compile, hook them to the IF node; if not, hook a DEFERRED node to the > IF node. The DEFERRED node has the parse tree (or raw string, whatever makes > sense) of the statement; on execution it tries again to plan that statement; > if it succeeds, run it; if not, error out. > > I'd also add a SEQUENCE node to the executor. It just runs its children in > order (could be n-ary, or if fixed arity nodes are what is in the > planner/executor today, could be binary, first left, then right, and right > could be another SEQUENCE). The DEFERRED node means that a CREATE statement > could precede use of what is created in the same sequence and all could get > planned (with some deferral to execution time) in advance and run in one > lump. This implements DO at the executor level. > > The biggest concepts left from plpgsql are looping and variables. Most > variables could be modeled as a single row value; SQL already can update a > row, so the planning of assignments and calculations of scalars (and arrays, > I think) already fits into things the planner knows about. 
Table variables > (which I don't know that plpgsql supports, but someday it should) are less > defined. Adding plpgsql's loops to the executor would let whole functions > run under one trip through the executor. This is beyond just improving the > DDL support for scripts. > > I have written a number of database upgrade scripts. Over time we've made > them less fragile, by checking for the existence of tables, indexes, and most > recently, columns. The usual sequence is: > 1) check the existence of an index; check that the first few columns are > correct; if not, drop the index > 2) repeat for other indexes that have changed definition over time > 3) check the existence of the table; create with current layout if it is > missing > 4) chec
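For comparison, the existence-check-then-DDL sequences described in these upgrade scripts are what the DO idiom discussed upthread expresses today. This is a hedged sketch; the table, column, and index names are purely illustrative.

```sql
DO $$
BEGIN
    -- drop an index whose definition has changed over time
    IF EXISTS (SELECT 1 FROM pg_indexes
               WHERE schemaname = 'public'
                 AND indexname = 'accounts_old_idx') THEN
        DROP INDEX accounts_old_idx;
    END IF;

    -- add a column only if it is missing
    IF NOT EXISTS (SELECT 1 FROM information_schema.columns
                   WHERE table_name = 'accounts'
                     AND column_name = 'archived_at') THEN
        ALTER TABLE accounts ADD COLUMN archived_at timestamptz;
    END IF;
END
$$;
```

The proposal above would collapse each of those catalog subselects into a one-line predicate such as `NOT EXISTS COLUMN accounts.archived_at`, which is exactly the "clunkiness removal" Robert describes.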
Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ ccompatibility
On 10/18/2012 08:36 PM, Claudio Freire wrote:
> The CREATE QUEUE command, in fact, could be creating such a channel.
> The channel itself won't be WAL-only, just the messages going through
> it. This (I think) would solve locking issues.

Hmm. Maybe we should think of implementing this as REMOTE TABLE, that is, a table which gets no real data stored locally, but all inserts go through WAL and are replayed as real inserts on the slave side.

Then if you want matviews or partitioned tables, you just attach triggers to the table on the slave side to do them. This would be tangential to their use as pure queues, which would happen at the level of plugins to logical replication.

-- Hannu

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ ccompatibility
On 10/18/2012 07:33 PM, Josh Berkus wrote:
> Simon,
>
>> It's hard to work out how to reply to this because it's just so off
>> base. I don't agree with the restrictions you think you see at all,
>> saying it politely rather than giving a one word answer.
>
> You have inside knowledge of Hannu's design.

Actually, Simon has currently no more knowledge of this specific design than you do - I posted this on this list as soon as I had figured it out as a possible solution to a specific problem: supporting full pgQ/Londiste functionality in WAL-based logical replication with minimal overhead. (Well, actually I let it settle a few weeks, but I did not discuss this off-list before.)

Simon may have a better grasp of it thanks to having done work on the BDR/Logical Replication design and thus having a better, or at least more recent, understanding of the issues involved in Logical Replication.

When mapping londiste/Slony message capture to logical WAL, the WAL already _is_ the event queue for replication. NOT LOGGED tables make it also usable for non-replication things using the same mechanisms. (The equivalent in a trigger-based system would be a log trigger which captures the insert event and then cancels the insert.)

> I am merely going from his description *on this list*, because that's
> all I have to go on. He requested comments, so here I am, commenting.
> I'm *hoping* that it's merely the description which is poor and not the
> conception of the feature. *As Hannu described the feature* it sounds
> useless and obscure, and miles away from powering any kind of general
> queueing mechanism.
If we describe a queue as something you put stuff in at one end and get it out in the same or some other specific order at the other end, then WAL _is_ a queue when you use it for replication (if you just write to it, it is a "Log"; if you write and read, it is a "Queue").

That is, the WAL already is a form of persistent and ordered (that is how WAL works) stream of messages ("WAL records") that are generated on the "master" and replayed on one or more consumers (called "slaves" in the case of simple replication). All it takes to make this scenario work is keeping track of the LSN, or simply the log position, on the slave side.

What you seem to be wanting is support for cooperative consumers, that is, multiple consumers on the same queue working together and sharing the work of processing the incoming events. This can be easily achieved using a single ordered event stream and extra bookkeeping structures on the consumer side (look at the cooperative consumer samples in skytools).

What I suggested was an optimisation for the case where you know that you will never need the data on the master side and are only interested in it on the slave side. By writing rows/events/messages only to the log (or stream or queue), you avoid the need to later clean them up on the master by either DELETE or TRUNCATE or rotating tables.

For both physical and logical streaming the WAL _is_ the queue of events that were recorded on the master and need to be replayed on the slave. Thanks to introducing logical replication, it now makes sense to have actions recorded _only_ in this queue, and this is what the whole RFC was about.

I recommend that you familiarize yourself a bit with skytools/pgQ to get a better feel for the things I am talking about. Londiste is just one application built on a general event logging, transport and transform/replay (that is what I'd call queueing :) ) system, pgQ. pgQ does have its roots in Slony (and earlier) replication systems, but it is by no means _only_ a replication system.
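The slave-side bookkeeping described above can be as small as one table per set of consumers. This is a hedged sketch, not pgQ's actual schema; all names are illustrative, and the position is stored as plain text (e.g. an LSN string such as '0/16B3740') rather than any particular type.

```sql
CREATE TABLE consumer_position (
    consumer_name text PRIMARY KEY,
    last_position text NOT NULL    -- LSN / log position already processed
);

-- a consumer resumes from its own recorded position ...
SELECT last_position
  FROM consumer_position
 WHERE consumer_name = 'matview_refresher';

-- ... and advances it after processing a batch of events
UPDATE consumer_position
   SET last_position = '0/16B3740'
 WHERE consumer_name = 'matview_refresher';
```

Cooperative consumers add one more layer of the same kind of bookkeeping (claiming batches of events), but the master-side stream itself stays a single ordered log either way.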
The LOG ONLY tables are _not_ needed for pure replication (like Slony), but they make replication + queueing type solutions like skytools/pgQ much more efficient, as they do away with the need to maintain the queued data on the master side where it will never be needed (just to repeat this once more).

> Or anything we discussed at the clustering meetings. And, again, if
> you didn't want comments, you shouldn't have posted an RFC.

I did want comments, and as far as I know I do not see you as hostile :)

I do understand that what you mean by QUEUE (and especially as a MESSAGE QUEUE) is different from what I described. You seem to want specifically an implementation of cooperative consumers for a generic queue. The answer is yes, it is possible to build this on WAL, or on the table-based event logs/queues of londiste/slony. It just takes a little extra management on the receiving side to do the record locking and distribution between cooperating consumers.

>> All we're discussing is moving a successful piece of software into
>> core, which has been discussed for years at the international
>> technical meetings we've both been present at. I think an open
>> viewpoint on the feasibility of that would be reasonable, especially
>> when it comes from one of the original designers.
>
> When I ask you for technical clarification or bring up potential
> problems with a 2Q feature, you consistently treat it as a personal attack
Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY
On 18 October 2012 10:20, Andres Freund wrote:
> On Thursday, October 18, 2012 06:12:02 AM Kevin Grittner wrote:
>> Kevin Grittner wrote:
>> > Hmm. The comment is probably better now, but I've been re-checking
>> > the code, and I think my actual code change is completely wrong.
>> > Give me a bit to sort this out.
>>
>> I'm having trouble seeing a way to make this work without rearranging
>> the code for concurrent drop to get to a state where it has set
>> indisvalid = false, made that visible to all processes, and ensured
>> that all scans of the index are complete -- while indisready is still
>> true. That is the point where TransferPredicateLocksToHeapRelation()
>> could be safely called. Then we would need to set indisready = false,
>> make that visible to all processes, and ensure that all access to the
>> index is complete. I can't see where it works to set both flags at
>> the same time. I want to sleep on it to see if I can come up with any
>> other way, but right now that's the only way I'm seeing to make DROP
>> INDEX CONCURRENTLY compatible with SERIALIZABLE transactions. :-(
>
> In a nearby bug I had to restructure the code in a way that's similar to
> this anyway, so that seems fine. Maybe you can fix the bug on top of the
> two attached patches?

First patch and first test committed. Working on second patch/test.

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ ccompatibility
On Thu, Oct 18, 2012 at 2:33 PM, Josh Berkus wrote: >> I should also add that this is an switchable sync/asynchronous >> transactional queue, whereas LISTEN/NOTIFY is a synchronous >> transactional queue. > > Thanks for explaining. New here, I missed half the conversation, but since it's been brought up and (to me wrongfully) dismissed, I'd like to propose: NOTIFY [ALL|ONE] [REMOTE|LOCAL|CLUSTER|DOWNSTREAM] ASYNCHRONOUSLY LISTEN [REMOTE|LOCAL|CLUSTER|UPSTREAM] too for good measure. That ought to work out fine as SQL constructs go, implementation aside. That's not enough for matviews, but it is IMO a good starting point. All you need after that, are triggers for notifying automatically upon insert, and some mechanism to attach triggers to a channel for the receiving side. Since channels are limited to short strings, maybe a different kind of object (but with similar manipulation syntax) ought to be created. The CREATE QUEUE command, in fact, could be creating such a channel. The channel itself won't be WAL-only, just the messages going through it. This (I think) would solve locking issues. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
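For reference, the existing local-only primitives that these proposed REMOTE/CLUSTER/DOWNSTREAM qualifiers would extend look like this today; the channel name and payload are illustrative.

```sql
-- session A: subscribe to a channel
LISTEN queue_events;

-- session B: deliver a payload to all local listeners at commit
NOTIFY queue_events, 'row 42 inserted';

-- or, equivalently, from inside a function or trigger body:
SELECT pg_notify('queue_events', 'row 42 inserted');
```

The limitations Claudio alludes to are real: NOTIFY payloads are short strings, delivery is to currently connected local listeners only, and undelivered notifications are not persisted, which is why a durable channel-like object would need something closer to the CREATE QUEUE proposal.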
Re: [HACKERS] [v9.3] Row-Level Security
Kohei KaiGai escribió:

> The revised patch fixes the problem that Dean pointed out.

Robert, would you be able to give this latest version of the patch a look? (KaiGai, does it still apply cleanly? If not, please submit a rebased version.)

-- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SP-GiST for ranges based on 2d-mapping and quad-tree
Jeff Davis escribió:

> On Sat, 2012-08-18 at 18:10 +0400, Alexander Korotkov wrote:
> >
> > Thanks! There is a separate patch for adjacent. I've reworked adjacent
> > check in order to make it more clear.
>
> I am taking a look at this patch now. A few quick comments:
> * I tried some larger tests and they seemed to work. I haven't reviewed
> the spgist code changes in detail though.

Jeff, Heikki, any input on the subsequent version of this patch?

-- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Statistics and selectivity estimation for ranges
Heikki, would you be able to give this patch a look and perhaps commit it?

-- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] gistchoose vs. bloat
Alexander Korotkov escribió:

> > 4. It looks like the randomization is happening while trying to compare
> > the penalties. I think it may be more readable to separate those two
> > steps; e.g.
> >
> >     /* create a mapping whether randomization is on or not */
> >     for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
> >         offsets[i - FirstOffsetNumber] = i;
> >
> >     if (randomization)
> >         /* randomize offsets array */
> >
> >     for (i = 0; i < maxoff; i++)
> >     {
> >         offset = offsets[i];
> >         ...
> >     }
> >
> > That's just an idea; if you think it's more readable as-is (or if I am
> > misunderstanding) then let me know.
>
> Actually, the current implementation comes from the idea of creating
> possibly less overhead when randomization is off. I'll try to measure
> overhead in the worst case. If it is low enough then your proposal looks
> reasonable to me.

Were you able to do these measurements? If not, I'll defer to your and Jeff's judgement on what's the best next step here. Jeff, do you think we need more review of this patch?

-- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework
Alvaro Herrera writes: > Boszormenyi Zoltan escribió: >> this is the latest one, fixing a bug in the accounting >> of per-statement lock timeout handling and tweaking >> some comments. > Tom, are you able to give this patch some more time on this commitfest? I'm still hoping to get to it, but I've been spending a lot of time on bug fixing rather than patch review lately :-(. If you're hoping to close out the current CF soon, maybe we should just slip it to the next one. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework
Boszormenyi Zoltan escribió: > Hi, > > this is the latest one, fixing a bug in the accounting > of per-statement lock timeout handling and tweaking > some comments. Tom, are you able to give this patch some more time on this commitfest? (If not, I think it would be fair to boot it to CF3; this is final in a series, there's nothing that depends on it, and there's been good movement on it; there's plenty of time before the devel cycle closes.) -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ ccompatibility
Simon,

> It's hard to work out how to reply to this because it's just so off
> base. I don't agree with the restrictions you think you see at all,
> saying it politely rather than giving a one word answer.

You have inside knowledge of Hannu's design. I am merely going from his description *on this list*, because that's all I have to go on. He requested comments, so here I am, commenting. I'm *hoping* that it's merely the description which is poor and not the conception of the feature. *As Hannu described the feature* it sounds useless and obscure, and miles away from powering any kind of general queueing mechanism. Or anything we discussed at the clustering meetings.

And, again, if you didn't want comments, you shouldn't have posted an RFC.

> All we're discussing is moving a successful piece of software into
> core, which has been discussed for years at the international
> technical meetings we've both been present at. I think an open
> viewpoint on the feasibility of that would be reasonable, especially
> when it comes from one of the original designers.

When I ask you for technical clarification or bring up potential problems with a 2Q feature, you consistently treat it as a personal attack and are emotionally defensive instead of answering my technical questions. This, in turn, frustrates the heck out of me (and others) because we can't get the technical questions answered. I don't want you to justify yourself, I want a clear technical spec.

I'm asking these questions because I'm excited about ReplicationII, and I want it to be the best feature it can possibly be. Or, as we tell many new contributors, "We wouldn't bring up potential problems and ask lots of questions if we weren't interested in the feature."
Now, on to the technical questions:

>> QUEUE emphasizes the aspect of a logged-only table that it accepts
>> "records" in a certain order, persists these, and then guarantees
>> that they can be read out in exactly the same order - all this being
>> guaranteed by existing WAL mechanisms.
>>
>> It is not meant to be a full implementation of an application level
>> queuing system, though, but just the capture, persisting and
>> distribution parts.
>>
>> Using this as an "application level queue" needs a set of interface
>> functions to extract the events and also to keep track of the processed
>> events. As there is no general consensus what these should be (like
>> whether processing the same event twice is allowed) this part is left
>> for specific queue consumer implementations.

While implementations vary, I think you'll find that the set of operations required for a full-featured application queue is remarkably similar across projects. Personally, I've worked with celery, Redis, AMQ, and RabbitMQ, as well as a custom solution on top of pgQ. The design, as you've described it, makes several of these requirements unreasonably convoluted to implement.

It sounds to me like the needs of internal queueing and application queueing may be hopelessly divergent. That was always possible, and maybe the answer is to forget about application queueing and focus on making this mechanism work for replication and for matviews, the two features we *know* we want it for. Which don't need the application queueing features I described AFAIK.

> The two halves of the queue are the TAIL/entry point and the HEAD/exit
> point. As you point out these could be on different servers,
> wherever the logical changes flow to, but could also be on the same
> server. When the head and tail are on the same server, the MESSAGE
> QUEUE syntax seems appropriate, but I agree that calling it that when
> it's just a head or just a tail seems slightly misleading.
Yeah, that's why I was asking for clarification; the way Hannu described it, it sounded like it *couldn't* be read on the insert node, but only on a replica. > We do, I think, want a full queue implementation in core. We also want > to allow other queue implementations to interface with Postgres, so we > probably want to allow "first half" only as well. Meaning we want both > head and tail separately in core code. The question is whether we > require both head and tail in core before we allow commit, to which I > would say I think adding the tail first is OK, and adding the head > later when we know exactly the design. I'm just pointing out that some of the requirements of the design for the replication queue may conflict with a design for a full-featured application queue. I don't quite follow you on what you mean by "head" vs. "tail". Explain? > Having said that, the LOGGING ONLY syntax makes me shiver. Better name? I suck at names. Sorry. > I should also add that this is a switchable sync/asynchronous > transactional queue, whereas LISTEN/NOTIFY is a synchronous > transactional queue. Thanks for explaining. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Deprecations in authentication
On 10/18/2012 04:43 AM, Simon Riggs wrote: On 18 October 2012 12:20, Magnus Hagander wrote: Since Simon stirred up a hornets' nest suggesting deprecation of a number of features, I figured I'd take it one step further and suggest removal of some previously deprecated features :) I'm laughing at the analogy that angry and unintelligent agents responded to my proposals, but there was no stirring action from me. I believe the stirring occurred when you dropped the idea in the proverbial bucket. It is not possible to drop even the tiniest pebble into any ideology of our community without some plague-causing flying insects swarming, just in case. You and I included. JD -- Command Prompt, Inc. - http://www.commandprompt.com/ PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, Postgres-XC @cmdpromptinc - 509-416-6579
Re: [HACKERS] Deprecating RULES
On Thu, Oct 18, 2012 at 6:46 AM, Andrew Dunstan wrote: > > On 10/17/2012 07:25 PM, Tom Lane wrote: > >> >> I'm fairly annoyed by the entire tenor of this conversation, because >> the people who are hollering the loudest seem to be people who have >> never actually touched any of the rules code, but nonetheless seem >> prepared to tell those of us who have what to spend our time on. > > > +1 > > I too have been quite annoyed. Sorry that I'm an offender. I also did not like the way the conversation was going for some time; for me, I felt like I didn't understand a lot of the terse rejections that materialized immediately on behalf of users that I personally cannot identify, and I felt those rejections weren't couched in neutral language that encouraged clarification, either. I'm glad things have moved beyond that. > The biggest pain people have mentioned is that they don't work with COPY. I > am in fact about to start working on a project which will probably alleviate > that pain point. I'm not going to say much more, and I would not have said > anything right now except that there is this sudden rush to deprecate rules, > or announce a future removal of the feature. However, I hope to have a > proposal to put to the community by about the end of November. I have encountered this as a papercut. Here's another use case from my history with RULES that didn't pan out so well: In my recollection, one way to use rules is to retarget operations that happen against a view and move them to a table, and as I recall, making this work as one expected required a very wordy RULE (for UPDATEs) with a litany of (fairly simple) equality and not-null conditions (to not under-constrain the UPDATE). This became a maintenance headache whenever attributes were added to the underlying relation. 
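A minimal sketch of the pattern just described (table, view, and column names here are invented purely for illustration):

```sql
-- Hypothetical schema.
CREATE TABLE accounts (id int PRIMARY KEY, owner text, balance numeric);
CREATE VIEW accounts_v AS SELECT id, owner, balance FROM accounts;

-- Retarget UPDATEs against the view onto the underlying table.  Every
-- column must be mentioned explicitly in SET and qualified in WHERE to
-- keep the UPDATE from being under-constrained, so each new attribute
-- on "accounts" means revisiting this rule by hand.
CREATE RULE accounts_v_upd AS
    ON UPDATE TO accounts_v
    DO INSTEAD
    UPDATE accounts
       SET owner = NEW.owner,
           balance = NEW.balance
     WHERE accounts.id = OLD.id
       AND accounts.id IS NOT NULL;
```

This is only a sketch of the maintenance burden fdr describes, not his actual schema.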
It was also quite complex, as I recall, when one wanted to maintain an interface but normalize the underlying table and split writes into two or more places. It has been quite some time; does that sound like a correct rendering of the problem? -- fdr
Re: [HACKERS] [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown
On Tue, Oct 16, 2012 at 9:31 PM, Heikki Linnakangas wrote: > On 15.10.2012 19:31, Fujii Masao wrote: >> >> On Mon, Oct 15, 2012 at 11:27 PM, Heikki Linnakangas >> wrote: >>> >>> On 15.10.2012 13:13, Heikki Linnakangas wrote: Oh, I didn't remember that we've documented the specific structs that we pass around. It's quite bogus anyway to explain the messages the way we do currently, as they are actually dependent on the underlying architecture's endianess and padding. I think we should refactor the protocol to not transmit raw structs, but use pq_sentint and friends to construct the messages. This was discussed earlier (see http://archives.postgresql.org/message-id/4fe2279c.2070...@enterprisedb.com), I think there's consensus that 9.3 would be a good time to do that as we changed the XLogRecPtr format anyway. >>> >>> >>> >>> This is what I came up with. The replication protocol is now >>> architecture-independent. The WAL format itself is still >>> architecture-independent, of course, but this is useful if you want to >>> e.g >>> use pg_receivexlog to back up a server that runs on a different platform. >>> >>> I chose the int64 format to transmit timestamps, even when compiled with >>> --disable-integer-datetimes. >>> >>> Please review if you have the time.. >> >> >> Thanks for the patch! >> >> When I ran pg_receivexlog, I encountered the following error. > > > Yeah, clearly I didn't test this near enough... > > I fixed the bugs you bumped into, new version attached. Thanks for updating the patch! We should remove the check of integer_datetime by pg_basebackup background process and pg_receivexlog? Currently, they always check it, and then if its setting value is not the same between a client and server, they fail. Thanks to the patch, ISTM this check is no longer required. + pq_sendint64(&reply_message, GetCurrentIntegerTimestamp()); In XLogWalRcvSendReply() and XLogWalRcvSendHSFeedback(), GetCurrentTimestamp() is called twice. 
I think that we can skip the latter call if integer-datetime is enabled because the return values of GetCurrentTimestamp() and GetCurrentIntegerTimestamp() are in the same format. It's worth reducing the number of GetCurrentTimestamp() calls, I think. elog(DEBUG2, "sending write %X/%X flush %X/%X apply %X/%X", -(uint32) (reply_message.write >> 32), (uint32) reply_message.write, -(uint32) (reply_message.flush >> 32), (uint32) reply_message.flush, -(uint32) (reply_message.apply >> 32), (uint32) reply_message.apply); +(uint32) (writePtr >> 32), (uint32) writePtr, +(uint32) (flushPtr >> 32), (uint32) flushPtr, +(uint32) (applyPtr >> 32), (uint32) applyPtr); elog(DEBUG2, "write %X/%X flush %X/%X apply %X/%X", -(uint32) (reply.write >> 32), (uint32) reply.write, -(uint32) (reply.flush >> 32), (uint32) reply.flush, -(uint32) (reply.apply >> 32), (uint32) reply.apply); +(uint32) (writePtr >> 32), (uint32) writePtr, +(uint32) (flushPtr >> 32), (uint32) flushPtr, +(uint32) (applyPtr >> 32), (uint32) applyPtr); Isn't it worth logging not only the WAL location but also the replyRequested flag in these debug messages? The rest of the patch looks good to me. >> + hdrlen = sizeof(int64) + sizeof(int64) + >> sizeof(int64); >> + hdrlen = sizeof(int64) + sizeof(int64) + >> sizeof(char); >> >> These should be macros, to avoid calculation overhead? > > > The compiler will calculate this at compilation time, it's going to be a > constant at runtime. Yes, you're right. Regards, -- Fujii Masao
Re: [HACKERS] Bug in -c CLI option of pg_dump/pg_restore
On Thu, 2012-10-18 at 12:19 -0300, Alvaro Herrera wrote: > Robert Haas escribió: > > On Tue, Oct 16, 2012 at 10:31 AM, Guillaume Lelarge > > wrote: > > > Any comments on this? > > > > I'm not sure I'd want to back-patch this, since it is a behavior > > change, but I do think it's probably a good idea to change it for 9.3. > > Hm, but the bug is said to happen only in 9.2, so if we don't backpatch > we would leave 9.2 alone exhibiting this behavior. > Yeah, Alvaro got it right. The behaviour changed in 9.2. This patch needs to be applied on 9.2 and master, nothing else. If the patch is good enough though... -- Guillaume http://blog.guillaume.lelarge.info http://www.dalibo.com
Re: [HACKERS] Deprecating RULES
On 10/17/2012 04:25 PM, Tom Lane wrote: ...Now having said that, I would definitely like to see rules in their current form go away eventually. But not without a substitute. Triggers are not a complete replacement, and no amount of wishful thinking makes them so. ... Perhaps it would be more profitable to try to identify the pain points that make people so eager to get rid of rules, and then see if we could alleviate them. Alternately/additionally identify the deficiencies in triggers that drive users to prefer rules. For example, a common need is to update a log table whenever updates are made to a main table. Using rules to accomplish this is very easy to understand and write, even for most beginners. (Understanding it properly, including its limitations and dangers, is another issue, of course.) It is also easy to maintain. If you drop the table, the rule is cleaned up as well. With triggers you need to select from a variety of available languages, write a function in that language and write a trigger that calls that function. Dropping the table will remove the trigger, but the user must remember to delete the function as well, if desired. Nothing insurmountable but inconvenient compared to the use of a rule. Per the documentation "PostgreSQL only allows the execution of a user-defined function for the triggered action. The standard allows the execution of a number of other SQL commands..." There may be valid reasons why implementing that part of the SQL standard in PostgreSQL is difficult or unwise but removing that limitation on triggers would eliminate one annoyance that pushes users toward rules. Cheers, Steve
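The log-table comparison above can be made concrete with a short sketch (schema and names invented for illustration; the rule version carries the well-known multiple-evaluation caveats):

```sql
-- Hypothetical schema.
CREATE TABLE main (id int PRIMARY KEY, val text);
CREATE TABLE main_log (id int, val text, logged_at timestamptz);

-- Rule version: one statement, and it vanishes with DROP TABLE main.
CREATE RULE main_log_upd AS
    ON UPDATE TO main
    DO ALSO
    INSERT INTO main_log VALUES (OLD.id, OLD.val, now());

-- Trigger version: requires a function in some PL, which outlives the
-- table unless dropped separately.
CREATE FUNCTION main_log_fn() RETURNS trigger LANGUAGE plpgsql AS $$
BEGIN
    INSERT INTO main_log VALUES (OLD.id, OLD.val, now());
    RETURN NEW;
END;
$$;
CREATE TRIGGER main_log_trg
    BEFORE UPDATE ON main
    FOR EACH ROW EXECUTE PROCEDURE main_log_fn();
```

Neither version is from the thread; it just illustrates the beginner-ergonomics gap Steve describes.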
Re: [HACKERS] xlog filename formatting functions in recovery
On Thu, Oct 18, 2012 at 9:22 AM, Alvaro Herrera wrote: > Daniel, I assume you are submitting an updated version based on the > feedback that has been provided. I will mark this patch returned with > feedback in the current CF; please submit the next version to CF3. Thank you for reminding me. So the approach that seems basically reasonable is to add an overload for these functions to accept a timeline number, if I read the feedback correctly? -- fdr
Re: [HACKERS] Deprecations in authentication
Robert Haas writes: > On Thu, Oct 18, 2012 at 7:20 AM, Magnus Hagander wrote: >> 2. ident-over-unix-sockets was renamed to "peer" in 9.1, with the old >> syntax deprecated but still mapping to the new one. Has it been there >> long enough that we should start throwing an error for ident on unix? > Definitely not. I see no reason to change that, well, really ever. > But certainly not after just two releases. It seems to me like a > useful convenience that does no real harm. I think the argument that it causes user confusion is a fairly strong one, though. regards, tom lane
[HACKERS] Very minor feature suggestion
It might be nice for psql to have a 'htmlcaption' boolean pset option that would wrap the provided title/caption, if any, in a caption tag in the HTML report output, when using html format. Motivation: When I use: \pset title 'Some title' or \C 'Some title' psql emits the text: Title is "Some title". even when using html format. This seems more like a diagnostic/annotation (like 'Title is unset', which is what you get from a plain "\pset title") than report output. For casual use, even "\echo Some title" is more pleasant, except for post-processing by scripts, which people are no doubt doing. When using html format, it would arguably be better for the title to be wrapped in a caption tag inside the table itself. You couldn't change the default behavior, but the html caption would be a nice option. Regards, Kevin Murphy
Re: [HACKERS] xlog filename formatting functions in recovery
Daniel, I assume you are submitting an updated version based on the feedback that has been provided. I will mark this patch returned with feedback in the current CF; please submit the next version to CF3. Thanks to the (rather numerous!) reviewers. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] Re: [PATCH 3/8] Add support for a generic wal reading facility dubbed XLogReader
This patch doesn't seem to be going anywhere, sadly. Since we're a bit late in the commitfest and this patch hasn't seen any activity for a long time, I'll mark it as returned-with-feedback. I hope one or both versions are resubmitted (with additional fixes?) for the next commitfest, and that the discussion continues to determine which of the two approaches is the best. Thanks. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] tuplesort memory usage: grow_memtuples
On 16 October 2012 21:47, Peter Geoghegan wrote: > The same basic strategy for sizing the tuplesort memtuples array > also exists in tuplestore. I wonder if we should repeat this there? I > suppose that that could follow later. Incidentally, the basis of this remark is commit 2689abf0, where Tom decided to keep the two in sync. That's a precedent for what we need to do here, I suppose. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Deprecations in authentication
On Thu, Oct 18, 2012 at 7:20 AM, Magnus Hagander wrote: > Since Simon stirred up a hornets' nest suggesting deprecation of a > number of features, I figured I'd take it one step further and suggest > removal of some previously deprecated features :) > > In particular, we made a couple of changes over several releases back > in the authentication config, that we should perhaps consider > finishing by removing the old stuff now? > > 1. krb5 authentication. We've had gssapi since 8.3 (which means in all > supported versions). krb5 has been deprecated, also since 8.3. Time to > remove it? That seems like a sufficiently long deprecation window, but is gssapi a full substitute for krb5? I don't really have a strong opinion on this, not being a user myself. > 2. ident-over-unix-sockets was renamed to "peer" in 9.1, with the old > syntax deprecated but still mapping to the new one. Has it been there > long enough that we should start throwing an error for ident on unix? Definitely not. I see no reason to change that, well, really ever. But certainly not after just two releases. It seems to me like a useful convenience that does no real harm. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
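For readers following along, the rename under discussion amounts to a one-word change on a "local" line in pg_hba.conf (the lines below are illustrative, not from any server in the thread):

```
# pg_hba.conf fragment -- since 9.1 the local-socket method is spelled
# "peer"; the old spelling "ident" on a "local" line still maps to it.
# TYPE  DATABASE  USER  METHOD
local   all       all   peer     # preferred spelling (9.1+)
# local all       all   ident    # deprecated spelling, same behavior
```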
Re: [HACKERS] PATCH: pgbench - aggregation of info written into log
Tomas Vondra wrote: > Also, I've realized the last interval may not be logged at all - I'll > take a look into this in the next version of the patch. I didn't see any later version of this patch posted anywhere. I guess it'll have to wait until the next commitfest. Please fix the remaining issues and resubmit. Thanks to Robert Haas, Jeff Janes and Pavel Stehule for the reviews. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] PATCH: optimized DROP of multiple tables within a transaction
Tomas Vondra wrote: > Hi, > > thanks for the review. I'll look into that in ~2 weeks, once the > pgconf.eu > is over. Excellent. Please submit the updated version to the upcoming commitfest when you have it. I'm marking this patch Returned with Feedback. Many thanks to Shigeru Hanada for the review and benchmark. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] tuplesort memory usage: grow_memtuples
Greg Stark escribió: > On Tue, Oct 16, 2012 at 9:47 PM, Peter Geoghegan > wrote: > > The patch will now been marked "ready for committer". Does this need > > doc changes, in light of what is arguably a behavioural difference? > > You only mentioned release notes. > > I'm happy to look at this one, probably next week at pgconf.eu. It seems like > a > reasonable size patch to get back into things. > > That's assuming my committer bits haven't lapsed and people are ok > with me stepping back into things? Feel free. Your committer bits should still work. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Global Sequences
On 18 October 2012 16:15, Tom Lane wrote: > But I'd want to see a pretty > bulletproof argument why overriding *only* nextval is sufficient > (and always will be) before accepting a hook for just nextval. If we > build an equivalent amount of functionality piecemeal it's going to > be a lot uglier than if we recognize we need this type of concept > up front. We discussed that we need only nextval() and setval() elsewhere, but adding others is fairly cheap so we can chew on that when we have a patch to discuss. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Global Sequences
On 18 October 2012 16:08, Robert Haas wrote: > To make > it even better, add some generic options that can be passed through to > the underlying handler. Agreed. >> Or maybe better, invent a level of indirection like a "sequence access >> method" (comparable to index access methods) that provides a compatible >> set of substitute functions for sequence operations. If you want to >> override nextval() for a sequence, don't you likely also need to >> override setval(), currval(), etc? Not to mention overriding ALTER >> SEQUENCE's behavior. > > This might be better, but it's also possibly more mechanism than we > truly need here. But then again, if we're going to end up with more > than a handful of handlers, we probably do want to do this. Let me have a play and see what comes out simplest. Somewhere in the middle seems about right. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Bug in -c CLI option of pg_dump/pg_restore
Robert Haas escribió: > On Tue, Oct 16, 2012 at 10:31 AM, Guillaume Lelarge > wrote: > > Any comments on this? > > I'm not sure I'd want to back-patch this, since it is a behavior > change, but I do think it's probably a good idea to change it for 9.3. Hm, but the bug is said to happen only in 9.2, so if we don't backpatch we would leave 9.2 alone exhibiting this behavior. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] First draft of snapshot snapshot building design document
On Thursday, October 18, 2012 04:47:12 PM Robert Haas wrote: > On Tue, Oct 16, 2012 at 7:30 AM, Andres Freund wrote: > > On Thursday, October 11, 2012 01:02:26 AM Peter Geoghegan wrote: > >> The design document [2] really just explains the problem (which is the > >> need for catalog metadata at a point in time to make sense of heap > >> tuples), without describing the solution that this patch offers with > >> any degree of detail. Rather, [2] says "How we build snapshots is > >> somewhat intricate and complicated and seems to be out of scope for > >> this document", which is unsatisfactory. I look forward to reading the > >> promised document that describes this mechanism in more detail. > > > > Here's the first version of the promised document. I hope it answers most > > of the questions. > > > > Input welcome! > > I haven't grokked all of this in its entirety, but I'm kind of > uncomfortable with the relfilenode -> OID mapping stuff. I'm > wondering if we should, when logical replication is enabled, find a > way to cram the table OID into the XLOG record. It seems like that > would simplify things. > > If we don't choose to do that, it's worth noting that you actually > need 16 bytes of data to generate a unique identifier for a relation, > as in database OID + tablespace OID + relfilenode# + backend ID. > Backend ID might be ignorable because WAL-based logical replication is > going to ignore temporary relations anyway, but you definitely need > the other two. ... Hm. I should take a look at the way temporary tables are represented. As you say, it is not going to matter for WAL decoding, but still... > Another thing to think about is that, like catalog snapshots, > relfilenode mappings have to be time-relativized; that is, you need to > know what the mapping was at the proper point in the WAL sequence, not > what it is now. In practice, the risk here seems to be minimal, > because it takes a while to churn through 4 billion OIDs. 
However, I > suspect it pays to think about this fairly carefully because if we do > ever run into a situation where the OID counter wraps during a time > period comparable to the replication lag, the bugs will be extremely > difficult to debug. I think with rollbacks + restarts we might even be able to see the same relfilenode earlier. > Anyhow, adding the table OID to the WAL header would chew up a few > more bytes of WAL space, but it seems like it might be worth it to > avoid having to think very hard about all of these issues. I don't think it's necessary to change wal logging here. The relfilenode mapping is now looked up using the timetravel snapshot we've built, using (spcNode, relNode) as the key, so the time-relativized lookup is "builtin". If we screw that up, way much more is broken anyway. Two problems are left: 1) (reltablespace, relfilenode) is not unique in pg_class because InvalidOid is stored for relfilenode if it's a shared or nailed table. That's not a problem for the lookup because we've already checked the relmapper before that, so we never look those up anyway. But it violates documented requirements of syscache.c. Even after some looking I haven't found any problem that could cause. 2) We need to decide whether a HEAP[1-2]_* record did catalog changes when building/updating snapshots. Unfortunately we also need to do this *before* we've built the first snapshot. For now treating all tables as catalog modifying before we built the snapshot seems to work fine. I think encoding the oid in the xlog header wouldn't help all that much here, because I am pretty sure we want to have the set of "catalog tables" be extensible at some point... Greetings, Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
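A sketch of the catalog side of the lookup being discussed, at current time only (the point of the thread is that the decoder must run the equivalent under a timetravel snapshot; the filenode value below is invented):

```sql
-- Map a (tablespace, filenode) pair back to a table OID.  Per the
-- caveat above, relfilenode is 0 for mapped (shared/nailed) relations,
-- which must be resolved via the relmapper first -- such rows can
-- never match an on-disk filenode here.
SELECT oid, relname
  FROM pg_class
 WHERE reltablespace = 0          -- 0 means the database's default tablespace
   AND relfilenode   = 16384;     -- example filenode, invented
```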
Re: [HACKERS] hash_search and out of memory
I wrote: > Hitoshi Harada writes: >> If OOM happens during expand_table() in hash_search_with_hash_value() >> for RelationCacheInsert, > What OOM? expand_table is supposed to return without doing anything > if it can't expand the table. If that's not happening, that's a bug > in the hash code. Oh, wait, I take that back --- the palloc-based allocator does throw errors. I think that when that was designed, we were thinking that palloc-based hash tables would be thrown away anyway after an error, but of course that's not true for long-lived tables such as the relcache hash table. I'm not terribly comfortable with trying to use a PG_TRY block to catch an OOM error - there are too many ways that could break, and this code path is by definition not very testable. I think moving up the expand_table action is probably the best bet. Will you submit a patch? regards, tom lane
Re: [HACKERS] Review for pg_dump: Sort overloaded functions in deterministic order
Joel Jacobson wrote: > Hi Joachim, > > Attached, please find new patch. Test unchanged. This was committed, as discussed in the original patch's thread. It would be great if reviewers could reply to the email that submits the patch, instead of creating a thread of their own. It helps keep things better organized. Thanks for the review. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] [PATCH] pg_dump: Sort overloaded functions in deterministic order
Joel Jacobson wrote: > On Wed, Oct 17, 2012 at 11:43 PM, Alvaro Herrera > wrote: > > Uh, the patch you posted keeps the pg_get_function_identity_arguments > > call in dumpFunc, but there is now also a new one in getFuncs. Do we > > need to remove the second one? > > It could be done, but unfortunately we cannot use the value computed > in dumpFunc(), > because getFuncs() is called before dumpFunc(). Right, I got that from the discussion. > What could be done is to keep the changes in getFuncs(), and also > change dumpFunc() > to use the value computed in getFuncs(), but I think the gain is small > in relation > to the complexity of changing dumpFunc(), as we would still need to > make the two other > function calls in the SQL query in dumpFunc() to pg_get_function_arguments() > and > pg_get_function_result(). Changing pg_dump is complex enough whatever the change, yes. I have not touched this. > > Here's an updated patch for your consideration. I was about to push > > this when I noticed the above. The only change here is that the extra > > code that tests for new remoteVersions in the second "else if" branch of > > getFuncs and getAggregates has been removed, since it cannot ever be > > reached. > > Looks really good. Thanks, pushed it. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] [RFC][PATCH] wal decoding, attempt #2 - Design Documents (really attached)
On 18 October 2012 16:18, Christopher Browne wrote: > A "shim" adds complexity, but retains the "upgrade across versions" > use case, and reduces the need to keep supporting elder versions of > Slony. Right. Upgrading across major versions is likely to continue to remain a very important use-case for Slony. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown
On Wed, Oct 17, 2012 at 8:46 PM, Amit Kapila wrote: >> On Monday, October 15, 2012 3:43 PM Heikki Linnakangas wrote: >> On 13.10.2012 19:35, Fujii Masao wrote: >> > On Thu, Oct 11, 2012 at 11:52 PM, Heikki Linnakangas >> > wrote: >> >> Ok, thanks. Committed. >> > >> > I found one typo. The attached patch fixes that typo. >> >> Thanks, fixed. >> >> > ISTM you need to update the protocol.sgml because you added >> > the field 'replyRequested' to WalSndrMessage and StandbyReplyMessage. > > >> >> > Is it worth adding the same mechanism (send back the reply immediately >> > if walsender request a reply) into pg_basebackup and pg_receivexlog? >> >> Good catch. Yes, they should be taught about this too. I'll look into >> doing that too. > > If you have not started and you don't have objection, I can pickup this to > complete it. > > For both (pg_basebackup and pg_receivexlog), we need to get a timeout > parameter from user in command line, as > there is no conf file here. New Option can be -t (parameter name can be > recvtimeout). > > The main changes will be in function ReceiveXlogStream(), it is a common > function for both > Pg_basebackup and pg_receivexlog. Handling will be done in same way as we > have done in walreceiver. > > Suggestions/Comments? Before implementing the timeout parameter, I think that it's better to change both pg_basebackup background process and pg_receivexlog so that they send back the reply message immediately when they receive the keepalive message requesting the reply. Currently, they always ignore such keepalive message, so status interval parameter (-s) in them always must be set to the value less than replication timeout. We can avoid this troublesome parameter setting by introducing the same logic of walreceiver into both pg_basebackup background process and pg_receivexlog. 
Regards, -- Fujii Masao
Re: [HACKERS] [RFC][PATCH] wal decoding, attempt #2 - Design Documents (really attached)
On Thu, Oct 18, 2012 at 9:49 AM, Peter Geoghegan wrote:
> On 16 October 2012 15:26, Jan Wieck wrote:
>> This means that the transition time from the existing, trigger based
>> approach to the new WAL based mechanism will see both technologies in
>> parallel, which is no small thing to support.
>
> So, you're talking about a shim between the two in order to usefully
> support inter-version replication, or are you just thinking about
> making a clean break in compatibility for Postgres versions prior to
> 9.3 in a new release branch?

It's early to assume either.

In Slony 2.0, we accepted that we were breaking compatibility with versions of Postgres before 8.3; we accepted that because there were considerable 'manageability' benefits (e.g. - system catalogues no longer hacked up, so pg_dump works against all nodes, and some dramatically reduced locking). But that had the attendant cost that we have had to continue fixing bugs on 1.2, to a degree, even until now, because people on Postgres versions earlier than 8.3 have no way to use version 2.0.

Those merits and demerits apply pretty clearly to this. It would be somewhat attractive for a "version 2.3" (or, more likely, "3.0", to indicate the break from earlier versions) to make the clean break to the new-in-PG-9.3 facilities. It is attractive in that we could:

a) Safely remove the trigger-based log capture apparatus (or, at least, I'm assuming so), and

b) Consciously upgrade to take advantage of all the latest cool stuff found in Postgres 9.3. (I haven't got any particular features in mind; perhaps we add RANGE comparators for xid to 9.3, and make extensive use of xid_range types? That would be something that couldn't reasonably get hacked to work in anything before 9.2...)

c) Drop out any special cases having to do with support of versions 8.3, 8.4, 9.0, 9.1, and 9.2.
But, of course, we'd be leaving everyone running 8.3 thru 9.2 behind, if we did so, and would correspondingly shackle ourselves to supporting the 2.x branches for still longer. And this would mean that this Slony "3.0" would expressly NOT support one of our intended use cases, namely to support upgrading from elder versions of Postgres. A "shim" adds complexity, but retains the "upgrade across versions" use case, and reduces the need to keep supporting elder versions of Slony. -- When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?"
Re: [HACKERS] Global Sequences
Robert Haas writes: > On Tue, Oct 16, 2012 at 1:29 PM, Tom Lane wrote: >> Or maybe better, invent a level of indirection like a "sequence access >> method" (comparable to index access methods) that provides a compatible >> set of substitute functions for sequence operations. If you want to >> override nextval() for a sequence, don't you likely also need to >> override setval(), currval(), etc? Not to mention overriding ALTER >> SEQUENCE's behavior. > This might be better, but it's also possibly more mechanism than we > truly need here. But then again, if we're going to end up with more > than a handful of handlers, we probably do want to do this. It's definitely a lot of mechanism, and if we can get away with something simpler that's fine with me. But I'd want to see a pretty bulletproof argument why overriding *only* nextval is sufficient (and always will be) before accepting a hook for just nextval. If we build an equivalent amount of functionality piecemeal it's going to be a lot uglier than if we recognize we need this type of concept up front. regards, tom lane
Re: [HACKERS] Bug in -c CLI option of pg_dump/pg_restore
On Tue, Oct 16, 2012 at 10:31 AM, Guillaume Lelarge wrote: > Any comments on this? I'm not sure I'd want to back-patch this, since it is a behavior change, but I do think it's probably a good idea to change it for 9.3. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Global Sequences
On Tue, Oct 16, 2012 at 1:29 PM, Tom Lane wrote: > In particular, the reason proposing a hook first seems backwards is that > if we have a catalog-level representation that some sequences are local > and others not, we should be using that to drive the determination of > whether to call a substitute function --- and maybe which one to call. > For instance, I could see attaching a function OID to each sequence > and then having nextval() call that function, instead of a hook per se. Yeah, I like that. That makes it easy to configure your database so that some sequences have special behavior (which the database designer can set up however they like) and others can be just vanilla, and the plugin doesn't have to try to figure out which ones are which (which was my first concern in reading Simon's original proposal). To make it even better, add some generic options that can be passed through to the underlying handler. So something like: ALTER SEQUENCE wump SET HANDLER (nextval my_magical_nextval, setval my_magical_setval) OPTIONS (any_label_you_want_the_handlers_to_get 'some_text_associated_with_the_label', another_label 'some_more_text'); That way you could say, for example, that sequence wump should get its values from coordinator node 172.24.16.93 and that the global identifier for this sequence is UUID e15ea6e6-43d5-4f65-8efd-cf28a14a2d70. That way you can avoid having to make any assumptions about how local sequence names on particular nodes are mapped onto global names. > Or maybe better, invent a level of indirection like a "sequence access > method" (comparable to index access methods) that provides a compatible > set of substitute functions for sequence operations. If you want to > override nextval() for a sequence, don't you likely also need to > override setval(), currval(), etc? Not to mention overriding ALTER > SEQUENCE's behavior. This might be better, but it's also possibly more mechanism than we truly need here. 
But then again, if we're going to end up with more than a handful of handlers, we probably do want to do this. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Deprecations in authentication
On 18 October 2012 12:43, Simon Riggs wrote: > On 18 October 2012 12:20, Magnus Hagander wrote: > >> Since Simon stirred up a hornets nest suggesting deprecation of a >> number of features, I figured I'd take it one step further and suggest >> removal of some previously deprecated features :) > > I'm laughing at the analogy that angry and unintelligent agents > responded to my proposals, but there was no stirring action from me. Hmm, this looks like a stirring action in itself, so I withdraw and apologise. You are right that some people are angry and so IMHO it was wrong of me to try to joke about that. My point was only that I had acted in good faith, rather than to deliberately cause annoyance. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] First draft of snapshot snapshot building design document
On Tue, Oct 16, 2012 at 7:30 AM, Andres Freund wrote: > On Thursday, October 11, 2012 01:02:26 AM Peter Geoghegan wrote: >> The design document [2] really just explains the problem (which is the >> need for catalog metadata at a point in time to make sense of heap >> tuples), without describing the solution that this patch offers with >> any degree of detail. Rather, [2] says "How we build snapshots is >> somewhat intricate and complicated and seems to be out of scope for >> this document", which is unsatisfactory. I look forward to reading the >> promised document that describes this mechanism in more detail. > > Here's the first version of the promised document. I hope it answers most of > the questions. > > Input welcome! I haven't grokked all of this in its entirety, but I'm kind of uncomfortable with the relfilenode -> OID mapping stuff. I'm wondering if we should, when logical replication is enabled, find a way to cram the table OID into the XLOG record. It seems like that would simplify things. If we don't choose to do that, it's worth noting that you actually need 16 bytes of data to generate a unique identifier for a relation, as in database OID + tablespace OID + relfilenode# + backend ID. Backend ID might be ignorable because WAL-based logical replication is going to ignore temporary relations anyway, but you definitely need the other two. There's nothing, for example, to keep you from having two relations with the same value in pg_class.relfilenode in the same database but in different tablespaces. It's unlikely to happen, because for new relations we set OID = relfilenode, but a subsequent rewrite can bring it about if the stars align just right. (Such situations are, of course, a breeding ground for bugs, which might make you question whether our current scheme for assigning relfilenodes has much of anything to recommend it.) 
Another thing to think about is that, like catalog snapshots, relfilenode mappings have to be time-relativized; that is, you need to know what the mapping was at the proper point in the WAL sequence, not what it is now. In practice, the risk here seems to be minimal, because it takes a while to churn through 4 billion OIDs. However, I suspect it pays to think about this fairly carefully because if we do ever run into a situation where the OID counter wraps during a time period comparable to the replication lag, the bugs will be extremely difficult to debug. Anyhow, adding the table OID to the WAL header would chew up a few more bytes of WAL space, but it seems like it might be worth it to avoid having to think very hard about all of these issues. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [BUG] False indication in pg_stat_replication.sync_state
On Thu, Oct 18, 2012 at 5:42 PM, Kyotaro HORIGUCHI wrote:
> Hello. My colleague found that pg_stat_replication.sync_state
> shows a false state under some conditions.
>
> This prevents Pacemaker from completing a fail-over that could
> safely be done.
>
> The point is in walsender.c, pg_stat_get_wal_senders() below (as
> of REL9_2_1):
>
> 1555:        if (walsnd->pid != 0)
> 1556:        {
> 1557:            /*
> 1558:             * Treat a standby such as a pg_basebackup background process
> 1559:             * which always returns an invalid flush location, as an
> 1560:             * asynchronous standby.
> 1561:             */
> !1562:           sync_priority[i] = XLogRecPtrIsInvalid(walsnd->flush) ?
> 1563:                0 : walsnd->sync_standby_priority;
>
> Here, XLogRecPtrIsInvalid(walsnd->flush) is defined as
> (walsnd->flush.xrecoff == 0), which becomes true as usual at every
> WAL 'file' (not segment) boundary. xrecoff == 0 is certainly
> invalid as the start point of a WAL *RECORD*, but should be
> considered valid in a replication stream. This check was introduced
> in 9.2.0 and in the minor-version update between 9.1.4 and 9.1.5.
>
> | DEBUG: write 4/0 flush 3/FEFFEC68 apply 3/FEFFEC68
> | DEBUG: write 4/0 flush 4/0 apply 3/FEFFEC68
> | DEBUG: HOGE: flush = 3/FEFFEC68 sync_priority[0] = 1
> | DEBUG: write 4/111C0 flush 4/0 apply 3/FEFFECC0
> !| DEBUG: HOGE: flush = 4/0 sync_priority[0] = 0
>
> This value of zero in sync_priority[0] makes sync_status 'async'
> erroneously and confuses Pacemaker.
>
> # The log lines marked with 'HOGE' above were printed by applying the
> # patch at the bottom of this message and invoking 'select
> # sync_state from pg_stat_replication' periodically. To increase
> # the chance to see the symptom, sleep 1 second at 'file'
> # boundaries :-)
>
> Heikki's recent(?) commit
> 0ab9d1c4b31622e9176472b4276f3e9831e3d6ba, which changes the format
> of XLogRecPtr from a logid:xrecoff struct to a 64-bit linear address,
> would fix the false indication.
> But I suppose this patch won't be
> applied to the existing 9.1.x and 9.2.x because of the modification
> to the streaming protocol.
>
> As far as I can see from the patch, it wouldn't change the meaning of
> XLogRecPtr to change XLogRecPtrIsInvalid from (xrecoff == 0) to
> (xrecoff == 0 && xlogid == 0). But this change affects a rather
> wide portion of the WAL-handling code, even though what is needed here
> is only to stop the false indication.
>
> On the other hand, pg_basebackup seems to return 0/0 as the flush and
> apply positions, so it seems enough to add xlogid == 0 to
> the condition. The attached patch for REL9_2_1 does this and
> yields the following result:
>
> | DEBUG: write 2/FEFFFD48 flush 2/FEFFFD48 apply 2/FEFF7AB0
> | DEBUG: write 3/0 flush 2/FEFFFD48 apply 2/FEFF7E88
> | DEBUG: write 3/0 flush 3/0 apply 2/FEFFFD48
> | DEBUG: HOGE: flush = 2/FEFFFD48 sync_priority[0] = 1
> | DEBUG: write 3/E338 flush 3/0 apply 2/FE80
> !| DEBUG: HOGE: flush = 3/0 sync_priority[0] = 1
>
> I think this patch should be applied for 9.2.2 and 9.1.7.

Looks good to me, though I don't think the source code comment needs to be updated in the way the patch does.

Regards,

--
Fujii Masao
Re: [HACKERS] September 2012 commitfest
Amit Kapila wrote: > For the Patch, Trim trailing NULL columns, I have provided the performance > data required > and completed the review. There are only a few review comments which can be > addressed. > So is it possible that I complete them and mark it as "Ready For Committer", > or what else can be the way to proceed for this patch > if the author doesn't respond? Sure, you can do that. -- Álvaro Herrera  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] hash_search and out of memory
Hitoshi Harada writes: > If OOM happens during expand_table() in hash_search_with_hash_value() > for RelationCacheInsert, What OOM? expand_table is supposed to return without doing anything if it can't expand the table. If that's not happening, that's a bug in the hash code. regards, tom lane
Re: [HACKERS] [RFC][PATCH] wal decoding, attempt #2 - Design Documents (really attached)
On 16 October 2012 15:26, Jan Wieck wrote: > This means that the transition time from the existing, trigger based > approach to the new WAL based mechanism will see both technologies in > parallel, which is no small thing to support. So, you're talking about a shim between the two in order to usefully support inter-version replication, or are you just thinking about making a clean break in compatibility for Postgres versions prior to 9.3 in a new release branch? -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Deprecating RULES
On 10/17/2012 07:25 PM, Tom Lane wrote:
> I'm fairly annoyed by the entire tenor of this conversation, because
> the people who are hollering the loudest seem to be people who have
> never actually touched any of the rules code, but nonetheless seem
> prepared to tell those of us who have what to spend our time on.

+1

I too have been quite annoyed. Now having said that, I would definitely like to see rules in their current form go away eventually. But not without a substitute. Triggers are not a complete replacement, and no amount of wishful thinking makes them so.

> Perhaps it would be more profitable to try to identify the pain points
> that make people so eager to get rid of rules, and then see if we
> could alleviate them. One big problem I know about offhand is the
> multiple-evaluation risk, which seems at least in principle fixable.
> What others are there?

Yeah. That's by far the best approach. It has the merit of being positive rather than just taking something away that people do use, even if it's only a relatively small number of users.

The biggest pain people have mentioned is that they don't work with COPY. I am in fact about to start working on a project which will probably alleviate that pain point. I'm not going to say much more, and I would not have said anything right now except that there is this sudden rush to deprecate rules, or announce a future removal of the feature. However, I hope to have a proposal to put to the community by about the end of November.

On that point, it's also worth noting that FDWs provide a nice workaround: instead of doing a straight COPY TO mytable FROM myfile, you set up the source as a foreign table using file_fdw or my file_text_array_fdw, and then do INSERT INTO mytable SELECT ... FROM my_foreign_table; Maybe this too would be worth mentioning in the docs, maybe in the rules section with an xref from the copy section.

So, please, hold off for a little bit.
I don't mind putting warnings in the docs, but I'd really rather we waited on any announcement of a future possible deprecation, or log warnings that using rules will cause zombies to eat your brainz. cheers andrew
Re: [HACKERS] Deprecations in authentication
Simon Riggs wrote: > On 18 October 2012 12:20, Magnus Hagander wrote: > > > Since Simon stirred up a hornets nest suggesting deprecation of a > > number of features, I figured I'd take it one step further and suggest > > removal of some previously deprecated features :) > > I'm laughing at the analogy that angry and unintelligent agents > responded to my proposals, but there was no stirring action from me. We may all be stupid individually, but it's the swarm that matters. -- Álvaro Herrera  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY
Andres Freund wrote:
> On Thursday, October 18, 2012 06:12:02 AM Kevin Grittner wrote:
>> I'm having trouble seeing a way to make this work without
>> rearranging the code for concurrent drop to get to a state where
>> it has set indisvalid = false, made that visible to all processes,
>> and ensured that all scans of the index are complete -- while
>> indisready is still true. That is the point where
>> TransferPredicateLocksToHeapRelation() could be safely called.
>> Then we would need to set indisready = false, make that visible to
>> all processes, and ensure that all access to the index is
>> complete. I can't see where it works to set both flags at the same
>> time.

> In a nearby bug I had to restructure the code in a way that's
> similar to this anyway, so that seems fine. Maybe you can fix the
> bug on top of the two attached patches?

Perfect; these two patches provide a spot in the code which is exactly
right for handling the predicate lock adjustments. Attached is a patch
which applies on top of the two you sent.

Thanks!

-Kevin

*** a/src/backend/catalog/index.c
--- b/src/backend/catalog/index.c
***************
*** 1320,1325 **** index_drop(Oid indexId, bool concurrent)
--- 1320,1337 ----
   * In the concurrent case we make sure that nobody can be looking at the
   * indexes by dropping the index in multiple steps, so we don't need a full
   * fledged AccessExlusiveLock yet.
+  *
+  * All predicate locks on the index are about to be made invalid. Promote
+  * them to relation locks on the heap. For correctness the index must not
+  * be seen with indisvalid = true during query planning after the move
+  * starts, so that the index will not be used for a scan after the
+  * predicate lock move, as this could create new predicate locks on the
+  * index which would not ensure a heap relation lock. Also, the index must
+  * not be seen during execution of a heap tuple insert with indisready =
+  * false before the move is complete, since the conflict with the
+  * predicate lock on the index gap could be missed before the lock on the
+  * heap relation is in place to detect a conflict based on the heap tuple
+  * insert.
   */
  heapId = IndexGetRelation(indexId, false);
  if (concurrent)
***************
*** 1439,1444 **** index_drop(Oid indexId, bool concurrent)
--- 1451,1464 ----
  }

  /*
+  * No more predicate locks will be acquired on this index, and we're
+  * about to stop doing inserts into the index which could show
+  * conflicts with existing predicate locks, so now is the time to move
+  * them to the heap relation.
+  */
+ TransferPredicateLocksToHeapRelation(userIndexRelation);
+
+ /*
   * now we are sure that nobody uses the index for queries, they just
   * might have it opened for updating it. So now we can unset
   * ->indisready and wait till nobody could update the index anymore.
***************
*** 1507,1518 **** index_drop(Oid indexId, bool concurrent)
  	userHeapRelation = heap_open(heapId, ShareUpdateExclusiveLock);
  	userIndexRelation = index_open(indexId, AccessExclusiveLock);
  }
!
! /*
!  * All predicate locks on the index are about to be made invalid. Promote
!  * them to relation locks on the heap.
!  */
! TransferPredicateLocksToHeapRelation(userIndexRelation);

  /*
   * Schedule physical removal of the files
--- 1527,1534 ----
  	userHeapRelation = heap_open(heapId, ShareUpdateExclusiveLock);
  	userIndexRelation = index_open(indexId, AccessExclusiveLock);
  }
! else
! 	TransferPredicateLocksToHeapRelation(userIndexRelation);

  /*
   * Schedule physical removal of the files
Re: [HACKERS] Re: DROP INDEX CONCURRENTLY is not really concurrency safe & leaves around undroppable indexes
On 18 October 2012 12:56, Abhijit Menon-Sen wrote: > At 2012-09-25 01:46:18 +0200, and...@2ndquadrant.com wrote: >> >> The attached patch fixes this issue. Haven't looked at the other one >> in detail yet. > > Here are tests for both bugs. They currently fail with HEAD. > > Note that the first test now uses PREPARE instead of the SELECTs in the > original example, which no longer works after commit #96cc18 because of > the re-added AcceptInvalidationMessages calls (archaeology by Andres). Thanks, I'll apply these now. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] Re: DROP INDEX CONCURRENTLY is not really concurrency safe & leaves around undroppable indexes
At 2012-09-25 01:46:18 +0200, and...@2ndquadrant.com wrote:
>
> The attached patch fixes this issue. Haven't looked at the other one
> in detail yet.

Here are tests for both bugs. They currently fail with HEAD.

Note that the first test now uses PREPARE instead of the SELECTs in the
original example, which no longer works after commit #96cc18 because of
the re-added AcceptInvalidationMessages calls (archaeology by Andres).

--
Abhijit

diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 75e33bc..d964aaf 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -14,3 +14,5 @@ test: fk-contention
 test: fk-deadlock
 test: fk-deadlock2
 test: eval-plan-qual
+test: drop-index-concurrently-1
+test: drop-index-concurrently-2
diff --git a/src/test/isolation/specs/drop-index-concurrently-1.spec b/src/test/isolation/specs/drop-index-concurrently-1.spec
new file mode 100644
index 000..83c44ab
--- /dev/null
+++ b/src/test/isolation/specs/drop-index-concurrently-1.spec
@@ -0,0 +1,33 @@
+setup
+{
+  CREATE TABLE test_dc(id serial primary key, data int);
+  INSERT INTO test_dc(data) SELECT * FROM generate_series(1, 100);
+  CREATE INDEX test_dc_data ON test_dc(data);
+}
+
+teardown
+{
+  DROP TABLE test_dc;
+}
+
+session "s1"
+step "noseq" { SET enable_seqscan = false; }
+step "prepi" { PREPARE getrow_idx AS SELECT * FROM test_dc WHERE data=34; }
+step "preps" { PREPARE getrow_seq AS SELECT * FROM test_dc WHERE data::text=34::text; }
+step "begin" { BEGIN; }
+step "explaini" { EXPLAIN (COSTS OFF) EXECUTE getrow_idx; }
+step "explains" { EXPLAIN (COSTS OFF) EXECUTE getrow_seq; }
+step "selecti" { EXECUTE getrow_idx; }
+step "selects" { EXECUTE getrow_seq; }
+step "end" { COMMIT; }
+
+session "s2"
+setup { BEGIN; }
+step "select2" { SELECT * FROM test_dc WHERE data=34; }
+step "insert2" { INSERT INTO test_dc(data) SELECT * FROM generate_series(1, 100); }
+step "end2" { COMMIT; }
+
+session "s3"
+step "drop" { DROP INDEX CONCURRENTLY test_dc_data; }
+
+permutation "noseq" "prepi" "preps" "begin" "explaini" "explains" "select2" "drop" "insert2" "end2" "selecti" "selects" "end"
diff --git a/src/test/isolation/expected/drop-index-concurrently-1.out b/src/test/isolation/expected/drop-index-concurrently-1.out
new file mode 100644
index 000..c42ac91
--- /dev/null
+++ b/src/test/isolation/expected/drop-index-concurrently-1.out
@@ -0,0 +1,36 @@
+Parsed test spec with 3 sessions
+
+starting permutation: noseq prepi preps begin explaini explains select2 drop insert2 end2 selecti selects end
+step noseq: SET enable_seqscan = false;
+step prepi: PREPARE getrow_idx AS SELECT * FROM test_dc WHERE data=34;
+step preps: PREPARE getrow_seq AS SELECT * FROM test_dc WHERE data::text=34::text;
+step begin: BEGIN;
+step explaini: EXPLAIN (COSTS OFF) EXECUTE getrow_idx;
+QUERY PLAN
+
+Index Scan using test_dc_data on test_dc
+  Index Cond: (data = 34)
+step explains: EXPLAIN (COSTS OFF) EXECUTE getrow_seq;
+QUERY PLAN
+
+Seq Scan on test_dc
+  Filter: ((data)::text = '34'::text)
+step select2: SELECT * FROM test_dc WHERE data=34;
+id             data
+
+34             34
+step drop: DROP INDEX CONCURRENTLY test_dc_data;
+step insert2: INSERT INTO test_dc(data) SELECT * FROM generate_series(1, 100);
+step end2: COMMIT;
+step selecti: EXECUTE getrow_idx;
+id             data
+
+34             34
+134            34
+step selects: EXECUTE getrow_seq;
+id             data
+
+34             34
+134            34
+step end: COMMIT;
+step drop: <... completed>
diff --git a/src/test/isolation/specs/drop-index-concurrently-2.spec b/src/test/isolation/specs/drop-index-concurrently-2.spec
new file mode 100644
index 000..273f735
--- /dev/null
+++ b/src/test/isolation/specs/drop-index-concurrently-2.spec
@@ -0,0 +1,21 @@
+setup
+{
+  CREATE TABLE test_dc(id serial primary key, data int);
+  CREATE INDEX test_dc_data ON test_dc(data);
+}
+
+session "s1"
+setup { BEGIN; }
+step "explain" { EXPLAIN (COSTS OFF) SELECT * FROM test_dc WHERE data=34343; }
+step "rollback" { ROLLBACK; }
+step "droptab" { DROP TABLE test_dc; }
+step "selecti" { SELECT indexrelid::regclass, indisvalid, indisready FROM pg_index WHERE indexrelid = 'test_dc_data'::regclass; }
+step "dropi" { DROP INDEX test_dc_data; }
+
+session "s2"
+step "drop" { DROP INDEX CONCURRENTLY test_dc_data; }
+
+session "s3"
+step "cancel" { SELECT pg_cancel_backend(pid) FROM pg_stat_activity WHERE query = 'DROP INDEX CONCURRENTLY test_dc_data;'; }
+
+permutation "explain" "drop" "cancel" "rollback" "droptab" "selecti" "dropi"
diff --git a/src/test/isolation/expected/drop-index-concurrently-2.out b/src/test/isolation/expected/drop-index-concurrently-2.out
new file mode 100644
index 000..4802777
--- /dev/null
+++ b/src/test/isolation/expected/drop-index-concurrently-2.out
@@ -0,0 +1,24 @@
+Parsed test spec
Re: [HACKERS] Deprecations in authentication
On 18 October 2012 12:20, Magnus Hagander wrote: > Since Simon stirred up a hornets nest suggesting deprecation of a > number of features, I figured I'd take it one step further and suggest > removal of some previously deprecated features :) I'm laughing at the analogy that angry and unintelligent agents responded to my proposals, but there was no stirring action from me. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Deprecations in authentication
On 18 October 2012 12:37, Magnus Hagander wrote: > On Thu, Oct 18, 2012 at 1:32 PM, Simon Riggs wrote: >> On 18 October 2012 12:20, Magnus Hagander wrote: >> >>> 2. ident-over-unix-sockets was renamed to "peer" in 9.1, with the old >>> syntax deprecated but still mapping to the new one. Has it been there >>> long enough that we should start throwing an error for ident on unix? >> >> Any reason to remove? Having two names for same thing is a happy place >> for users with bad/fond memories. It costs little and no errors are >> associated with using the old name (are there?). > > The only real reason for that one would be confusion. e.g. using ident > over tcp is for most people very insecure, whereas ident over unix > sockets is very secure. there are exceptions to both those, but for > the majority of cases we are using the same name for one thing that > has very good security and one that has very bad. And confusion when > it comes to security is usually not a good thing. I'll go with that. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Deprecations in authentication
On Thu, Oct 18, 2012 at 1:32 PM, Simon Riggs wrote: > On 18 October 2012 12:20, Magnus Hagander wrote: > >> 2. ident-over-unix-sockets was renamed to "peer" in 9.1, with the old >> syntax deprecated but still mapping to the new one. Has it been there >> long enough that we should start throwing an error for ident on unix? > > Any reason to remove? Having two names for same thing is a happy place > for users with bad/fond memories. It costs little and no errors are > associated with using the old name (are there?). The only real reason for that one would be confusion. E.g., using ident over TCP is for most people very insecure, whereas ident over unix sockets is very secure. There are exceptions to both of those, but for the majority of cases we are using the same name for one thing that has very good security and one that has very bad. And confusion when it comes to security is usually not a good thing. The krb5 one is more about maintaining code, but there is not much cost to keeping ident-over-unix, that's true. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] Deprecations in authentication
On 18 October 2012 12:20, Magnus Hagander wrote: > 2. ident-over-unix-sockets was renamed to "peer" in 9.1, with the old > syntax deprecated but still mapping to the new one. Has it been there > long enough that we should start throwing an error for ident on unix? Any reason to remove? Having two names for the same thing is a happy place for users with bad/fond memories. It costs little, and no errors are associated with using the old name (are there?). -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] Deprecations in authentication
Since Simon stirred up a hornet's nest suggesting deprecation of a number of features, I figured I'd take it one step further and suggest removal of some previously deprecated features :) In particular, we made a couple of changes over several releases back in the authentication config that we should perhaps consider finishing by removing the old stuff now? 1. krb5 authentication. We've had gssapi since 8.3 (which means in all supported versions). krb5 has been deprecated, also since 8.3. Time to remove it? 2. ident-over-unix-sockets was renamed to "peer" in 9.1, with the old syntax deprecated but still mapping to the new one. Has it been there long enough that we should start throwing an error for ident on unix? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
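For readers following along, the two deprecations map onto pg_hba.conf lines roughly as follows (illustrative sample lines only, not from any real configuration):

```
# Deprecated (pre-9.1 spelling): "ident" on a local line actually meant peer auth
local   all   all                 ident

# Current spelling since 9.1:
local   all   all                 peer

# krb5, deprecated since 8.3, and its GSSAPI replacement:
host    all   all   0.0.0.0/0     krb5
host    all   all   0.0.0.0/0     gss
```

The method names (ident, peer, krb5, gss) are the real pg_hba.conf keywords; the database/user/address columns here are placeholders.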
Re: [HACKERS] Global Sequences
Simon Riggs writes: > Not sure how it is cleaner when we have to have trigger stuff hanging > around to make one object pretend to be another. That also creates a > chain of dependency which puts this into the future, rather than now. Yes, that part isn't cleaner at all. The part where we have a separate object to deal with I like better, and I tried to reconcile the two viewpoints. Note that the event trigger would come installed and disabled; the user would only have to activate it: ALTER EVENT TRIGGER distribute_my_cluster ENABLE; Still the same issue. > The goal is make-sequences-work, not to invent something new that > might be cooler or more useful. If we create something new, then we > need to consider the references Daniel described, but that is a whole > different thing and already accessible if you need/want that. So OK, I withdraw my consensus proposal. I tried. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] [PATCH] pg_dump: Sort overloaded functions in deterministic order
On Wed, Oct 17, 2012 at 11:43 PM, Alvaro Herrera wrote: > Uh, the patch you posted keeps the pg_get_function_identity_arguments > call in dumpFunc, but there is now also a new one in getFuncs. Do we > need to remove the second one? It could be done, but unfortunately we cannot use the value computed in dumpFunc(), because getFuncs() is called before dumpFunc(). The patch currently only affects getFuncs(), it doesn't touch dumpFunc(). What could be done is to keep the changes in getFuncs(), and also change dumpFunc() to use the value computed in getFuncs(), but I think the gain is small in relation to the complexity of changing dumpFunc(), as we would still need to make the two other function calls in the SQL query in dumpFunc() to pg_get_function_arguments() and pg_get_function_result(). > Here's an updated patch for your consideration. I was about to push > this when I noticed the above. The only change here is that the extra > code that tests for new remoteVersions in the second "else if" branch of > getFuncs and getAggregates has been removed, since it cannot ever be > reached. Looks really good.
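A sketch of the kind of query getFuncs() could issue (illustrative, not the patch's exact SQL): pg_get_function_identity_arguments() produces a canonical argument-type string, which makes (proname, funcargs) a deterministic sort key for overloaded functions.

```sql
-- Sketch: fetch each function's identity argument string once in getFuncs(),
-- so overloaded functions sort deterministically by (name, argument types)
-- rather than by OID, which can differ between dumps.
SELECT p.tableoid, p.oid, p.proname,
       pg_catalog.pg_get_function_identity_arguments(p.oid) AS funcargs
FROM pg_catalog.pg_proc p
ORDER BY p.proname,
         pg_catalog.pg_get_function_identity_arguments(p.oid);
```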
Re: [HACKERS] Bugs in CREATE/DROP INDEX CONCURRENTLY
On Thursday, October 18, 2012 06:12:02 AM Kevin Grittner wrote: > Kevin Grittner wrote: > > Hmm. The comment is probably better now, but I've been re-checking > > the code, and I think my actual code change is completely wrong. > > Give me a bit to sort this out. > > I'm having trouble seeing a way to make this work without rearranging > the code for concurrent drop to get to a state where it has set > indisvalid = false, made that visible to all processes, and ensured > that all scans of the index are complete -- while indisready is still > true. That is the point where TransferPredicateLocksToHeapRelation() > could be safely called. Then we would need to set indisready = false, > make that visible to all processes, and ensure that all access to the > index is complete. I can't see where it works to set both flags at > the same time. I want to sleep on it to see if I can come up with any > other way, but right now that's the only way I'm seeing to make DROP > INDEX CONCURRENTLY compatible with SERIALIZABLE transactions. :-( In a nearby bug I had to restructure the code in a way that's similar to this anyway, so that seems fine. Maybe you can fix the bug on top of the two attached patches? Greetings, Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services From 037daa464d3fc63dbc943b13dd90f477a4fc9aba Mon Sep 17 00:00:00 2001 From: Andres Freund Date: Tue, 25 Sep 2012 01:41:29 +0200 Subject: [PATCH 1/2] Fix concurrency issues in concurrent index drops Previously a DROP INDEX CONCURRENTLY started with unsetting indisvalid *and* indisready. That's problematic if some transaction is still looking at the index and another transaction makes changes. See the example below. Now we do the drop in three stages, just as a concurrent index build. First unset indisvalid, wait, unset indisready, wait, drop index. 
Example: Session 1: CREATE TABLE test_drop_concurrently(id serial primary key, data int); INSERT INTO test_drop_concurrently(data) SELECT * FROM generate_series(1, 10); CREATE INDEX test_drop_concurrently_data ON test_drop_concurrently(data); BEGIN; EXPLAIN ANALYZE SELECT * FROM test_drop_concurrently WHERE data = 34343; SELECT * FROM test_drop_concurrently WHERE data = 34343; (1 row) Session 2: BEGIN; SELECT * FROM test_drop_concurrently WHERE data = 34343; Session 3: DROP INDEX CONCURRENTLY test_drop_concurrently_data; (in-progress) Session 2: INSERT INTO test_drop_concurrently(data) SELECT * FROM generate_series(1, 10); COMMIT; Session 1: SELECT * FROM test_drop_concurrently WHERE data = 34343; (1 row) SET enable_bitmapscan = false; SET enable_indexscan = false; SELECT * FROM test_drop_concurrently WHERE data = 34343; (2 rows) --- src/backend/catalog/index.c | 99 --- 1 file changed, 84 insertions(+), 15 deletions(-) diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c index 464950b..b39536e 100644 --- a/src/backend/catalog/index.c +++ b/src/backend/catalog/index.c @@ -1316,6 +1316,10 @@ index_drop(Oid indexId, bool concurrent) * table lock strong enough to prevent all queries on the table from * proceeding until we commit and send out a shared-cache-inval notice + that will make them update their index lists. + * + * In the concurrent case we make sure that nobody can be looking at the + * indexes by dropping the index in multiple steps, so we don't need a full + * fledged AccessExclusiveLock yet. */ heapId = IndexGetRelation(indexId, false); if (concurrent) @@ -1336,7 +1340,19 @@ index_drop(Oid indexId, bool concurrent) /* * Drop Index concurrently is similar in many ways to creating an index - * concurrently, so some actions are similar to DefineIndex() + * concurrently, so some actions are similar to DefineIndex() just in the + * reverse order. 
+ * + * First we unset indisvalid so queries starting afterwards don't use the + * index to answer queries anymore. We have to keep indisready = true + * though so transactions that are still using the index can continue to + * see valid index contents. E.g. when they are using READ COMMITTED mode, + * and another transaction that started later makes changes and + * commits, they need to see those new tuples in the index. + * + * After all transactions that could possibly have used it for queries + * ended we can unset indisready and wait till nobody could be updating it + * anymore. */ if (concurrent) { @@ -1355,21 +1371,14 @@ elog(ERROR, "cache lookup failed for index %u", indexId); indexForm = (Form_pg_index) GETSTRUCT(tuple); - indexForm->indisvalid = false; /* make unusable for queries */ - indexForm->indisready = false; /* make invisible to changes */ + indexForm->indisvalid = false; /* make unusable for new queries */ + /* we keep indisready == true so it still gets updated */ simple_heap_update(i
[HACKERS] [BUG] False indication in pg_stat_replication.sync_state
Hello. My colleague found that pg_stat_replication.sync_state shows a false state under some conditions. This prevents Pacemaker from completing a fail-over that could safely be done. The point is in walsender.c, pg_stat_get_wal_senders(), below (as of REL9_2_1): 1555:if (walsnd->pid != 0) 1556:{ 1557: /* 1558: * Treat a standby such as a pg_basebackup background process 1559: * which always returns an invalid flush location, as an 1560: * asynchronous standby. 1561: */ ! 1562: sync_priority[i] = XLogRecPtrIsInvalid(walsnd->flush) ? 1563: 0 : walsnd->sync_standby_priority; Here, XLogRecPtrIsInvalid(walsnd->flush) is defined as (walsnd->flush.xrecoff == 0), which becomes true at every WAL 'file' (not segment) boundary. xrecoff == 0 is certainly invalid as the start point of a WAL *RECORD*, but should be considered valid in a replication stream. This check was introduced in 9.2.0 and in the minor upgrade between 9.1.4 and 9.1.5. | DEBUG: write 4/0 flush 3/FEFFEC68 apply 3/FEFFEC68 | DEBUG: write 4/0 flush 4/0 apply 3/FEFFEC68 | DEBUG: HOGE: flush = 3/FEFFEC68 sync_priority[0] = 1 | DEBUG: write 4/111C0 flush 4/0 apply 3/FEFFECC0 !| DEBUG: HOGE: flush = 4/0 sync_priority[0] = 0 This zero value of sync_priority[0] makes sync_state 'async' erroneously and confuses Pacemaker. # The log lines marked with 'HOGE' above were printed by applying the # patch at the bottom of this message and invoking 'select # sync_state from pg_stat_replication' periodically. To increase # the chance of seeing the symptom, sleep 1 second at 'file' # boundaries :-) Heikki's recent(?) commit 0ab9d1c4b31622e9176472b4276f3e9831e3d6ba, which changes the format of XLogRecPtr from a logid:xrecoff struct to a 64-bit linear address, would fix the false indication. But I suppose that patch won't be applied to the existing 9.1.x and 9.2.x because of the modification to the streaming protocol. 
As far as I can see from the patch, it wouldn't change the meaning of XLogRecPtr to change XLogRecPtrIsInvalid from (xrecoff == 0) to (xrecoff == 0 && xlogid == 0). But this change affects a rather wide portion of the WAL-handling code, whereas all that is needed here is to stop the false indication. On the other hand, pg_basebackup seems to return 0/0 as its flush and apply positions, so it seems enough to add xlogid == 0 to the condition. The patch attached for REL9_2_1 does this and yields the following result. | DEBUG: write 2/FEFFFD48 flush 2/FEFFFD48 apply 2/FEFF7AB0 | DEBUG: write 3/0 flush 2/FEFFFD48 apply 2/FEFF7E88 | DEBUG: write 3/0 flush 3/0 apply 2/FEFFFD48 | DEBUG: HOGE: flush = 2/FEFFFD48 sync_priority[0] = 1 | DEBUG: write 3/E338 flush 3/0 apply 2/FE80 !| DEBUG: HOGE: flush = 3/0 sync_priority[0] = 1 I think this patch should be applied for 9.2.2 and 9.1.7. regards, -- Kyotaro Horiguchi NTT Open Source Software Center = The patch for this test. diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c index 064ddd5..19f79d1 100644 --- a/src/backend/replication/walsender.c +++ b/src/backend/replication/walsender.c @@ -618,6 +618,10 @@ ProcessStandbyReplyMessage(void) reply.flush.xlogid, reply.flush.xrecoff, reply.apply.xlogid, reply.apply.xrecoff); + if (reply.write.xrecoff == 0 || + reply.flush.xrecoff == 0) + sleep(1); + /* * Update shared state for this WalSender process based on reply data from * standby. @@ -1561,7 +1565,10 @@ pg_stat_get_wal_senders(PG_FUNCTION_ARGS) */ sync_priority[i] = XLogRecPtrIsInvalid(walsnd->flush) ? 
0 : walsnd->sync_standby_priority; - + elog(DEBUG1, "HOGE: flush = %X/%X sync_priority[%d] = %d", +walsnd->flush.xlogid, walsnd->flush.xrecoff, +i, sync_priority[i]); + if (walsnd->state == WALSNDSTATE_STREAMING && walsnd->sync_standby_priority > 0 && (priority == 0 || diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c index 064ddd5..1d4cbc4 100644 --- a/src/backend/replication/walsender.c +++ b/src/backend/replication/walsender.c @@ -1555,11 +1555,11 @@ pg_stat_get_wal_senders(PG_FUNCTION_ARGS) if (walsnd->pid != 0) { /* - * Treat a standby such as a pg_basebackup background process - * which always returns an invalid flush location, as an + * Treat a standby such as a pg_basebackup background process which + * always returns 0/0 (InvalidXLogRecPtr) as flush location, as an * asynchronous standby. */ - sync_priority[i] = XLogRecPtrIsInvalid(walsnd->f
[HACKERS] Database object names and libpq in UTF-8 locale on Windows
Hello, Using PostgreSQL 9.2.1 on Windows, I am facing a strange character set issue with a UTF-8 database. Maybe this is expected, but I want to be sure that I am not missing something. On Windows, I have created a database with: ENCODING = 'UTF-8' LC_COLLATE = 'English_United States.1252' LC_CTYPE = 'English_United States.1252' I am using libpq with a simple C program. The program handles UTF-8 strings. While it's not a traditional "UNICODE" Windows application using WCHAR, it should be possible to send and get back UTF-8 data with PostgreSQL. Interacting with the Windows system in the system locale is another story, but from a pure C / SQL / libpq point of view, as long as the PostgreSQL client encoding is properly defined as UTF-8, it should work, and it mostly does: - I can use UTF-8 string constants in my queries. - I can pass UTF-8 data to the database with parameterized queries. - I can fetch UTF-8 data from the database. - I can create db object names with UTF-8 characters. But the db object names must be specified with double quotes: when I do not use quoted db object names, I get a strange problem. The table is created, I can use it in my program, I can even use it in the pgAdmin query tool, but in the pgAdmin db browser, there is no table name displayed in the treeview... Further, when selecting schema information from pg_class, I can see that the table exists, but there is nothing displayed in the relname attribute... It appears that the problem disappears when using a "C" collation and char type: ENCODING = 'UTF-8' LC_COLLATE = 'C' LC_CTYPE = 'C' I suspect this has something to do with the fact that non-quoted identifiers are converted to lowercase, and because my LC_CTYPE is English_United States.1252, the conversion to lowercase fails... But: - why does PostgreSQL accept to create the table with invalid UTF-8 characters? - why can I make queries in the query tool with the UTF-8 table name? 
(note that show client_encoding in the query tool gives me "UNICODE" - is this the same as "UTF-8"?) So is there a bug, or do I have to use a C collation and char type for UTF-8 databases on Windows? I know that UTF-8 is not supported by the Windows C/POSIX library (setlocale, and co). Does it mean that it's not realistic to use UTF-8 encoding for PostgreSQL databases on Windows...? Thanks for reading. Sebastien FLAESCH Four Js Development Tools
Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ compatibility
On 17 October 2012 11:26, Hannu Krosing wrote: > LOGGED ONLY TABLE is a very technical description of the realisation - I'd > prefer it to work as much like a table as possible, similar to how VIEW > currently works - for all usages that make sense, you can simply > substitute it for a TABLE > > QUEUE emphasizes the aspect of a logged only table that it accepts > "records" in a certain order, persists these and then guarantees > that they can be read out in exactly the same order - all this being > guaranteed by existing WAL mechanisms. > > It is not meant to be a full implementation of an application level queuing > system though but just the capture, persisting and distribution parts > > Using this as an "application level queue" needs a set of interface > functions to extract the events and also to keep track of the processed > events. As there is no general consensus on what these should be (like if > processing the same event twice is allowed) this part is left for specific > queue consumer implementations. The two halves of the queue are the TAIL/entry point and the HEAD/exit point. As you point out these could be on different servers, wherever the logical changes flow to, but could also be on the same server. When the head and tail are on the same server, the MESSAGE QUEUE syntax seems appropriate, but I agree that calling it that when it's just a head or just a tail seems slightly misleading. I guess the question is whether we provide a full implementation or just the first half. We do, I think, want a full queue implementation in core. We also want to allow other queue implementations to interface with Postgres, so we probably want to allow "first half" only as well. Meaning we want both head and tail separately in core code. The question is whether we require both head and tail in core before we allow commit, to which I would say I think adding the tail first is OK, and adding the head later when we know exactly the design. 
Having said that, the LOGGING ONLY syntax makes me shiver. Better name? I should also add that this is a switchable sync/asynchronous transactional queue, whereas LISTEN/NOTIFY is a synchronous transactional queue. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ compatibility
On 17 October 2012 21:25, Josh Berkus wrote: > >> It is not meant to be a full implementation of an application level queuing >> system though but just the capture, persisting and distribution parts >> >> Using this as an "application level queue" needs a set of interface >> functions to extract the events and also to keep track of the processed >> events. As there is no general consensus on what these should be (like if >> processing the same event twice is allowed) this part is left for specific >> queue consumer implementations. > > Well, but AFAICT, you've already prohibited features through your design > which are essential to application-level queues, and are implemented by, > for example, pgQ. > > 1. your design only allows the queue to be read on replicas, not on the > node where the item was inserted. > > 2. if you can't UPDATE or DELETE queue items -- or LOCK them -- how on > earth would a client know which items they have executed and which they > haven't? > > 3. Double-down on #2 in a multithreaded environment. It's hard to work out how to reply to this because it's just so off base. I don't agree with the restrictions you think you see at all, saying it politely rather than giving a one-word answer. The problem here is you phrase these things with too much certainty, seeing only barriers. The "how on earth?" vibe is not appropriate at all. It's perfectly fine to ask for answers to those difficult questions, but don't presume that there are no answers, or that you know with certainty they are even hard ones. By phrasing things in such a closed way the only way forwards is through you, which does not help. All we're discussing is moving a successful piece of software into core, which has been discussed for years at the international technical meetings we've both been present at. I think an open viewpoint on the feasibility of that would be reasonable, especially when it comes from one of the original designers. 
I apologise for making a personal comment, but this does affect the technical discussion. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Global Sequences
On 16 October 2012 18:29, Tom Lane wrote: > Or maybe better, invent a level of indirection like a "sequence access > method" (comparable to index access methods) that provides a compatible > set of substitute functions for sequence operations. If you want to > override nextval() for a sequence, don't you likely also need to > override setval(), currval(), etc? Not to mention overriding ALTER > SEQUENCE's behavior. Agreed, though with the exact API as discussed in the portion of the thread with Markus. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services