Re: [HACKERS] pg_upgrade --logfile option documentation
On Wed, Feb 29, 2012 at 06:22:27PM -0500, A.M. wrote:
> On Feb 29, 2012, at 6:02 PM, Bruce Momjian wrote:
> > On Wed, Feb 29, 2012 at 04:34:24PM -0500, Robert Haas wrote:
> >> On Tue, Feb 28, 2012 at 9:45 PM, Bruce Momjian wrote:
> >>> OK, I have implemented both Robert's and Álvaro's ideas in my patch.
> >>> I only add the .old suffix to pg_controldata when link mode is used, and
> >>> I now do it after the schema has been created (the most common failure
> >>> case for pg_upgrade), and just before we actually link files --- both
> >>> very good ideas.
> >>
> >> Thanks for working on this.  I think this will be a significant
> >> usability improvement.
> >
> > Glad I got such good feedback and ideas.
> >
> >> Any ideas about improving the error reporting more generally, so that
> >> when reloading the dump fails, the user can easily see what went
> >> belly-up, even if they didn't use -l?
> >
> > The only idea I have is to write the psql log to a temporary file and
> > report the last X lines from the file in case of failure.  Does that
> > help?
>
> Perhaps pg_upgrade can print the path to the temp file containing the
> log and instruct the user to look there for more detail.

Hey, that's a good idea.  I would always write the pg_dump output to a
log file.  If the dump succeeds, I remove the file; if not, I tell users
to read the log file for details about the failure --- good idea.

--
  Bruce Momjian                        http://momjian.us
  EnterpriseDB                         http://enterprisedb.com

  + It's impossible for everything to be true. +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Client Messages
On Thu, Mar 1, 2012 at 4:13 AM, Alvaro Herrera wrote:
> Excerpts from Heikki Linnakangas's message of jue ene 26 15:58:58 -0300 2012:
>> On 26.01.2012 17:31, Tom Lane wrote:
>>> The idea that occurs to me is to have the code that uses the GUC do a
>>> verify_mbstr(noerror) on it, and silently ignore it if it doesn't pass
>>> (maybe with a LOG message).  This would have to be documented of course,
>>> but it seems better than the potential consequences of trying to send a
>>> wrongly-encoded string.
>>
>> Hmm, fine with me.  It would be nice to plug the hole that these bogus
>> characters can leak elsewhere into the system through current_setting,
>> though.  Perhaps we could put the verify_mbstr() call somewhere in guc.c,
>> to forbid incorrectly encoded characters from being stored in the guc
>> variable in the first place.
>
> This patch is listed as "Needs review" but that seems to be wrong --
> it's "waiting on author", I think.

Yes.  I marked the patch as "waiting on author".

> Do we have an updated patch?  Fujii?

No.  I believe that the author, Jim, will submit the updated version.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] 16-bit page checksums for 9.2
Jim Nasby writes:
> On 2/29/12 3:53 PM, Alvaro Herrera wrote:
>> .. in fact this is precisely what killed Zdenek Kotala's idea of
>> upgrading.

> This is also why Simon has avoided the whole upgrade thing with his 16 bit
> checksum idea (otherwise presumably we'd be looking at bigger checksums
> anyway).

> I get that fussing around with the version field is ugly.  If there was
> another way to do this without breaking pg_upgrade then it would be silly
> to mess with the version field.  Unfortunately, there is no other way.

Fundamentally, what is going on here is that several of us think that we
need a page format upgrade infrastructure first, and then we can think
about adding checksums.  Simon would like to cram checksums in without
building such infrastructure, regardless of the consequences in code
ugliness or future maintainability.

Personally, I think that is a bad tradeoff.  Eventually we are going to
have to build that infrastructure, and the more we've kluged the page
header format in the meanwhile, the more unpleasant it's going to be.

			regards, tom lane
Re: [HACKERS] Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)
I'm curious about the LeafNode stuff.  Is this something that could be
done by expression_tree_walker?  I'm not completely familiar with it so
maybe there's some showstopper such as some node tags not being
supported, or maybe it just doesn't help.  But I'm curious.

--
Álvaro Herrera
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] 16-bit page checksums for 9.2
On 2/29/12 3:53 PM, Alvaro Herrera wrote:
> Excerpts from Tom Lane's message of mié feb 29 18:34:27 -0300 2012:
>> Robert Haas writes:
>>> On Wed, Feb 29, 2012 at 2:33 PM, Heikki Linnakangas wrote:
>>>> The utility would run in the old cluster before upgrading, so the
>>>> flag would have to be present in the old version.  pg_upgrade would
>>>> check that the flag is set, refusing to upgrade if it isn't, with an
>>>> error like "please run pre-upgrade utility first".
>>
>>> I find that a pretty unappealing design; it seems to me it'd be much
>>> easier to make the new cluster cope with everything.
>>
>> Easier for who?  I don't care for the idea of code that has to cope
>> with two page formats, or before long N page formats, because if we
>> don't have some mechanism like this then we will never be able to
>> decide that an old data format is safely dead.
>
> .. in fact this is precisely what killed Zdenek Kotala's idea of
> upgrading.

This is also why Simon has avoided the whole upgrade thing with his 16
bit checksum idea (otherwise presumably we'd be looking at bigger
checksums anyway).

I get that fussing around with the version field is ugly.  If there was
another way to do this without breaking pg_upgrade then it would be
silly to mess with the version field.  Unfortunately, there is no other
way.

Page checksumming is something a lot of people (myself included) want to
see.  Being able to get it in 9.2 would be a big win over crossing our
fingers that at some point in the future (who knows when) we'll maybe
figure out the page format upgrade issue.  While we should definitely be
looking into that issue it's definitely not going to get fixed in 9.2.

ISTM that checksums are actually ready to go if people can just swallow
the bitter pill of a screwed-up page version field until we can actually
get an upgrade utility in place (and until we get that utility in place
I don't see that the version field would do us any good anyway...)

--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell)                http://jim.nasby.net
Re: [HACKERS] pg_upgrade --logfile option documentation
On Feb 29, 2012, at 6:02 PM, Bruce Momjian wrote:
> On Wed, Feb 29, 2012 at 04:34:24PM -0500, Robert Haas wrote:
>> On Tue, Feb 28, 2012 at 9:45 PM, Bruce Momjian wrote:
>>> OK, I have implemented both Robert's and Álvaro's ideas in my patch.
>>> I only add the .old suffix to pg_controldata when link mode is used, and
>>> I now do it after the schema has been created (the most common failure
>>> case for pg_upgrade), and just before we actually link files --- both
>>> very good ideas.
>>
>> Thanks for working on this.  I think this will be a significant
>> usability improvement.
>
> Glad I got such good feedback and ideas.
>
>> Any ideas about improving the error reporting more generally, so that
>> when reloading the dump fails, the user can easily see what went
>> belly-up, even if they didn't use -l?
>
> The only idea I have is to write the psql log to a temporary file and
> report the last X lines from the file in case of failure.  Does that
> help?

Perhaps pg_upgrade can print the path to the temp file containing the
log and instruct the user to look there for more detail.

Cheers,
M
Re: [HACKERS] Collect frequency statistics for arrays
Nathan Boley writes:
> On Wed, Feb 29, 2012 at 2:43 PM, Tom Lane wrote:
>> Nathan Boley writes:
>>> If I understand your suggestion, queries of the form
>>>     SELECT * FROM rel
>>>     WHERE ARRAY[ 1,2,3,4 ] <= x
>>>       AND x <= ARRAY[ 1, 2, 3, 1000 ];
>>> would no longer use an index.  Is that correct?

>> No, just that we'd no longer have statistics relevant to that, and would
>> have to fall back on default selectivity assumptions.

> Which, currently, would mean queries of that form would typically use
> a table scan, right?

No, it doesn't.

> What about MCV's?  Will those be removed as well?

Sure.  Those seem even less useful.

			regards, tom lane
Re: [HACKERS] Collect frequency statistics for arrays
On Wed, Feb 29, 2012 at 2:43 PM, Tom Lane wrote:
> Nathan Boley writes:
>> On Wed, Feb 29, 2012 at 12:39 PM, Tom Lane wrote:
>>> I am starting to look at this patch now.  I'm wondering exactly why the
>>> decision was made to continue storing btree-style statistics for arrays,
>>> in addition to the new stuff.
>
>> If I understand your suggestion, queries of the form
>
>>     SELECT * FROM rel
>>     WHERE ARRAY[ 1,2,3,4 ] <= x
>>       AND x <= ARRAY[ 1, 2, 3, 1000 ];
>
>> would no longer use an index.  Is that correct?
>
> No, just that we'd no longer have statistics relevant to that, and would
> have to fall back on default selectivity assumptions.

Which, currently, would mean queries of that form would typically use
a table scan, right?

> Do you think that such applications are so common as to justify
> bloating pg_statistic for everybody that uses arrays?

I have no idea, but it seems like it will be a substantial regression
for the people that are.

What about MCV's?  Will those be removed as well?

Best,
Nathan
Re: [HACKERS] pg_upgrade --logfile option documentation
On Wed, Feb 29, 2012 at 04:34:24PM -0500, Robert Haas wrote:
> On Tue, Feb 28, 2012 at 9:45 PM, Bruce Momjian wrote:
> > OK, I have implemented both Robert's and Álvaro's ideas in my patch.
> > I only add the .old suffix to pg_controldata when link mode is used, and
> > I now do it after the schema has been created (the most common failure
> > case for pg_upgrade), and just before we actually link files --- both
> > very good ideas.
>
> Thanks for working on this.  I think this will be a significant
> usability improvement.

Glad I got such good feedback and ideas.

> Any ideas about improving the error reporting more generally, so that
> when reloading the dump fails, the user can easily see what went
> belly-up, even if they didn't use -l?

The only idea I have is to write the psql log to a temporary file and
report the last X lines from the file in case of failure.  Does that
help?

--
  Bruce Momjian                        http://momjian.us
  EnterpriseDB                         http://enterprisedb.com

  + It's impossible for everything to be true. +
Re: [HACKERS] Parameterized-path cost comparisons need some work
Robert Haas writes:
> On Wed, Feb 29, 2012 at 2:08 PM, Tom Lane wrote:
>> I think you're just assuming that without any solid evidence.  My point
>> is precisely that if the more-parameterized path *fails* to generate
>> fewer rows, we want add_path to notice that and throw it out (or at
>> least be able to throw it out, if there's not another reason to keep it).

> Well, my "evidence" is that a parameterized path should pretty much
> always include a parameterized path somewhere in there - otherwise,
> what is parameterization doing for us?

Well, yes, we know that much.

> And that's going to reduce the
> row count.  I may be missing something, but I'm confused as to why
> this isn't nearly tautological.

We don't know that --- I will agree it's likely, but that doesn't make
it so certain that we can assume it without checking.  A join condition
won't necessarily eliminate any rows.

(... thinks about that for awhile ...)

One thing we could possibly do is have indxpath.c arbitrarily reject
parameterizations that don't produce a smaller estimated number of rows
than an unparameterized scan.  Admittedly, this still doesn't *prove*
the assumption for join relations, but maybe it brings the odds to where
it's okay for add_path to make such an assumption.

(... thinks some more ...)

No, that doesn't get us there, because that doesn't establish that a
more-parameterized path produces fewer rows than some path that requires
less parameterization, yet not none at all.  You really want add_path
carrying out those comparisons.  In your previous example, it's entirely
possible that path D is dominated by B or C because of poor choices of
join quals.

			regards, tom lane
Re: [HACKERS] 16-bit page checksums for 9.2
Robert Haas writes:
> On Wed, Feb 29, 2012 at 4:34 PM, Tom Lane wrote:
>> Easier for who?  I don't care for the idea of code that has to cope with
>> two page formats, or before long N page formats, because if we don't
>> have some mechanism like this then we will never be able to decide that
>> an old data format is safely dead.

> Huh?  You can drop support for an old page format any time you like.
> You just decree that version X+1 no longer supports it, and you can't
> pg_upgrade to it until all traces of the old page format are gone.

And how would a DBA know that?

> If you're going to require an offline rewrite of the whole old cluster
> before doing the upgrade, how much better is it than just breaking the
> page format and telling pg_upgrade users they're out of luck?

The difference is that people aren't playing russian roulette with their
data when they upgrade.  The point of the mechanisms being discussed
here is to know, for certain, that a large database no longer contains
pages of the old format.

			regards, tom lane
Re: [HACKERS] Collect frequency statistics for arrays
Nathan Boley writes:
> On Wed, Feb 29, 2012 at 12:39 PM, Tom Lane wrote:
>> I am starting to look at this patch now.  I'm wondering exactly why the
>> decision was made to continue storing btree-style statistics for arrays,
>> in addition to the new stuff.

> If I understand your suggestion, queries of the form

>     SELECT * FROM rel
>     WHERE ARRAY[ 1,2,3,4 ] <= x
>       AND x <= ARRAY[ 1, 2, 3, 1000 ];

> would no longer use an index.  Is that correct?

No, just that we'd no longer have statistics relevant to that, and would
have to fall back on default selectivity assumptions.  Do you think that
such applications are so common as to justify bloating pg_statistic for
everybody that uses arrays?

			regards, tom lane
Re: [HACKERS] Parameterized-path cost comparisons need some work
On Wed, Feb 29, 2012 at 2:08 PM, Tom Lane wrote:
> Robert Haas writes:
>> On Wed, Feb 29, 2012 at 1:40 PM, Tom Lane wrote:
>>> Indeed, and the code already knows that.  However, in this example, path
>>> A is capable of dominating the other three (being strictly less
>>> parameterized than them), and B and C are each capable of dominating D.
>>> The problem is just that I'd neglected to consider that rowcount now
>>> also becomes a figure of merit.
>
>> In theory, yes, but in practice, won't it nearly always be the case
>> that a less parameterized path generates more rows, and a more
>> parameterized path generates less rows, so that neither dominates the
>> other?
>
> I think you're just assuming that without any solid evidence.  My point
> is precisely that if the more-parameterized path *fails* to generate
> fewer rows, we want add_path to notice that and throw it out (or at
> least be able to throw it out, if there's not another reason to keep it).

Well, my "evidence" is that a parameterized path should pretty much
always include a parameterized path somewhere in there - otherwise,
what is parameterization doing for us?  And that's going to reduce the
row count.  I may be missing something, but I'm confused as to why
this isn't nearly tautological.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Collect frequency statistics for arrays
On Wed, Feb 29, 2012 at 12:39 PM, Tom Lane wrote:
> Alexander Korotkov writes:
>> On Mon, Jan 23, 2012 at 7:58 PM, Noah Misch wrote:
>>> I've attached a new version that includes the UINT64_FMT fix, some edits
>>> of your newest comments, and a rerun of pgindent on the new files.  I see
>>> no other issues precluding commit, so I am marking the patch Ready for
>>> Committer.  If I made any of the comments worse, please post another
>>> update.
>
>> Changes look reasonable for me.  Thanks!
>
> I am starting to look at this patch now.  I'm wondering exactly why the
> decision was made to continue storing btree-style statistics for arrays,
> in addition to the new stuff.

If I understand your suggestion, queries of the form

    SELECT * FROM rel
    WHERE ARRAY[ 1,2,3,4 ] <= x
      AND x <= ARRAY[ 1, 2, 3, 1000 ];

would no longer use an index.  Is that correct?

Are you suggesting removing MCV's in lieu of MCE's as well?

-Nathan
Re: [HACKERS] Parameterized-path cost comparisons need some work
> The most obvious thing to do about this is to consider that one path can
> dominate another on cost only if it is both cheaper *and* produces no
> more rows.  But I'm concerned about the cost of inserting yet another
> test condition into add_path, which is slow enough already.  Has anyone
> got an idea for a different formulation that would avoid that?

Will it discard the seq. scan path if its number of rows is more and its
cost is less, or will it keep both paths and only later, based on the
cost estimation for the join, discard one of them?

Could it be that if the seq. scan has lower cost and its number of rows
is greater only by a certain threshold, only then is the seq. scan kept;
otherwise the index scan will dominate?

-----Original Message-----
From: pgsql-hackers-ow...@postgresql.org
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Tom Lane
Sent: Wednesday, February 29, 2012 4:05 AM
To: pgsql-hackers@postgreSQL.org
Subject: [HACKERS] Parameterized-path cost comparisons need some work

I was experimenting today with a test case sent to me by somebody at Red
Hat.  The details aren't too important, except that it involves an inner
join on the inside of a left join (where it cannot commute with the left
join).  I can reproduce the behavior with a standard regression test
table, if I crank random_page_cost up a bit:

regression=# set random_page_cost TO 5;
SET
regression=# explain select * from tenk1 t1 left join
               (tenk1 t2 join tenk1 t3 on t2.thousand = t3.unique2)
               on t1.hundred = t2.hundred
             where t1.unique1 = 1;
                                         QUERY PLAN
---------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=0.00..507.16 rows=100 width=732)
   ->  Index Scan using tenk1_unique1 on tenk1 t1  (cost=0.00..10.28 rows=1 width=244)
         Index Cond: (unique1 = 1)
   ->  Nested Loop  (cost=0.00..495.63 rows=100 width=488)
         ->  Index Scan using tenk1_hundred on tenk1 t2  (cost=0.00..446.98 rows=100 width=244)
               Index Cond: (t1.hundred = hundred)
         ->  Index Scan using tenk1_unique2 on tenk1 t3  (cost=0.00..0.48 rows=1 width=244)
               Index Cond: (unique2 = t2.thousand)
(8 rows)

regression=# set random_page_cost TO 6;
SET
regression=# explain select * from tenk1 t1 left join
               (tenk1 t2 join tenk1 t3 on t2.thousand = t3.unique2)
               on t1.hundred = t2.hundred
             where t1.unique1 = 1;
                                      QUERY PLAN
--------------------------------------------------------------------------------------
 Hash Right Join  (cost=928.30..2573.80 rows=100 width=732)
   Hash Cond: (t2.hundred = t1.hundred)
   ->  Hash Join  (cost=916.00..2523.00 rows=10000 width=488)
         Hash Cond: (t2.thousand = t3.unique2)
         ->  Seq Scan on tenk1 t2  (cost=0.00..458.00 rows=10000 width=244)
         ->  Hash  (cost=458.00..458.00 rows=10000 width=244)
               ->  Seq Scan on tenk1 t3  (cost=0.00..458.00 rows=10000 width=244)
   ->  Hash  (cost=12.29..12.29 rows=1 width=244)
         ->  Index Scan using tenk1_unique1 on tenk1 t1  (cost=0.00..12.29 rows=1 width=244)
               Index Cond: (unique1 = 1)
(10 rows)

WTF?  How did it suddenly fail to find the double-nested-loop plan, and
have to settle for a much worse plan?  The reason seems to be that for
large enough random_page_cost, the seqscan on t2 is costed as cheaper
than an indexscan that selects a significant fraction of the table.
(Notice the t2 scans are nearly the same cost in these two examples.)
That causes add_path to decide that the unparameterized seqscan is
unconditionally better than the parameterized indexscan, and throw out
the latter.  Now it is impossible to form the parameterized nestloop
subplan.

The flaw in this logic, of course, is that the seqscan might be cheaper
than the parameterized indexscan, but *it produces a whole lot more
rows*, meaning that any subsequent join will be a lot more expensive.
Previously add_path didn't have to worry about that, because all
ordinary paths for a given relation produce the same number of rows (and
we studiously kept things like inner indexscan paths out of add_path's
view of the world).

The most obvious thing to do about this is to consider that one path can
dominate another on cost only if it is both cheaper *and* produces no
more rows.  But I'm concerned about the cost of inserting yet another
test condition into add_path, which is slow enough already.  Has anyone
got an idea for a different formulation that would avoid that?

			regards, tom lane
[HACKERS] Weak-memory specific problem in ResetLatch/WaitLatch (follow-up analysis)
Hi all,

[Bcc'ed Tom Lane as he had done the initial investigation on this.]

Following up on the earlier discussions in

[1] http://archives.postgresql.org/pgsql-hackers/2010-11/msg01575.php
[2] http://archives.postgresql.org/pgsql-hackers/2011-08/msg00330.php

with an initial fix in

[3] http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=4e15a4db5e65e43271f8d20750d6500ab12632d0

we (a group of researchers from Oxford and London) did a more formal
analysis using a chain of automated verification tools.  This suite of
tools enables us to automatically check for the presence or absence of
bugs in (smaller) bits of software when run under weak memory model
semantics, as is the case on x86 or PowerPC.  [In this context we are
always interested in further source code examples that make significant
use of shared-memory concurrency on such platforms.]

Using our tool chain, we were eager to (1) confirm the bug and (2)
validate the proposed fix.  See the very end for our analysis of the
proposed fix.

The example that Tom Lane initially posted in [2] can be further
simplified to the following bit of self-contained (C) code (with line
numbers), where WaitLatch is a simple busy wait as "while(!latch[.]);"
and ResetLatch is just "latch[.]=0;".  Then running two of these worker
functions in parallel suffices to reproduce the problem that was
initially reported.

     1  #define WORKERS 2
     2  volatile _Bool latch[WORKERS];
     3  volatile _Bool flag[WORKERS];
     4
     5  void worker(int i)
     6  {
     7    while(!latch[i]);
     8    for(;;)
     9    {
    10      assert(!latch[i] || flag[i]);
    11      latch[i] = 0;
    12      if(flag[i])
    13      {
    14        flag[i] = 0;
    15        flag[(i+1) % WORKERS] = 1;
    16        latch[(i+1) % WORKERS] = 1;
    17      }
    18
    19      while(!latch[i]);
    20    }
    21  }

We developed a tool that is able to analyse a piece of concurrent C code
and determine whether it contains bugs.  Our tool confirms the presence
of at least two bugs in the piece of code under discussion.

The first problem corresponds to a message-passing idiom (view with a
fixed-width font; the info in front of each statement states the line
number each worker/thread is executing):

    Worker 0            | Worker 1
    ====================+==========================
    (0:15) flag[1]=1;   | (1:7)  while (!latch[1]);
    (0:16) latch[1]=1;  | (1:12) if (flag[1])

where we can observe latch[1] holding 1 and flag[1] holding 0 in the
end.  This behaviour can happen on the PowerPC architecture for three
reasons.  First, write-write pairs can be reordered, for example the
write to flag and the write to latch by Worker 0.  Second, read-read
pairs can be reordered, for example the read of the while condition and
the read of the if condition by Worker 1.  Finally, the store to latch
by Worker 0 might not be atomic.

This corresponds to a scenario where we first execute Worker 0 up to
line 17.  All the reads and writes go directly to memory, except the
assignments at lines 14 and 15, where the values 0 and 1 are stored
respectively into the processor-local buffers of flag[0] and flag[1].
Then the second worker starts, reading the freshly updated value 1 of
latch[1].  It then exits the blocking while (line 7).  But latch[1]
still holds 1, and flag[1] is still holding 0, as Worker 0 has not yet
flushed the write of 1 waiting in its buffer.  This means that the
condition of the if statement would not be true, the critical section
would be skipped, and the program would arrive at line 19, without
having authorised the next worker to enter the critical section, and
would loop forever.

This seems to correspond to the scenario discussed in [2] above, where
the wait in line 19 times out.  This is confirmed by the fact that this
behaviour is commonly observed on several generations of Power machines
(e.g. 1.7G/167G on Power 7).

Let us now focus on the second bug; it corresponds to a load-buffering
idiom:

    Worker 0             | Worker 1
    =====================+=====================
    (0:12) if (flag[0])  | (1:12) if (flag[1])
    (0:15) flag[1]=1;    | (1:15) flag[0]=1;

where we can observe both flag[0] and flag[1] holding 1 in the end.
This behaviour is valid on the PowerPC architecture for two reasons.
First, PowerPC can reorder read-write pairs, hence the read of the if
condition on either thread can be done after setting the flag to 1.
Second, the fact that PowerPC relaxes the atomicity of stores is another
reason for this to happen.  Our understanding is that this behaviour is
not yet implemented by real-world Power machines.  Yet, since it is
documented to be permitted by the architecture, this could lead to
actual bugs on future generations of Power machines if not synchronised
correctly.

In [3] it was suggested to fix the problem by placing a barrier in
ResetLatch, which corresponds to placing it between lines 11 and 12 in
the code above.  This amounts to placing a barrier between the two reads
(lines 7/19 and 12, i.e., between WaitLatch and the if(flag[1])) of
Worker 1.  Placing a sync (i.e., the stro
Re: [HACKERS] 16-bit page checksums for 9.2
On Wed, Feb 29, 2012 at 4:34 PM, Tom Lane wrote:
> Robert Haas writes:
>> On Wed, Feb 29, 2012 at 2:33 PM, Heikki Linnakangas wrote:
>>> The utility would run in the old cluster before upgrading, so the flag
>>> would have to be present in the old version.  pg_upgrade would check
>>> that the flag is set, refusing to upgrade if it isn't, with an error
>>> like "please run pre-upgrade utility first".
>
>> I find that a pretty unappealing design; it seems to me it'd be much
>> easier to make the new cluster cope with everything.
>
> Easier for who?  I don't care for the idea of code that has to cope with
> two page formats, or before long N page formats, because if we don't
> have some mechanism like this then we will never be able to decide that
> an old data format is safely dead.

Huh?  You can drop support for an old page format any time you like.
You just decree that version X+1 no longer supports it, and you can't
pg_upgrade to it until all traces of the old page format are gone.

If you're going to require an offline rewrite of the whole old cluster
before doing the upgrade, how much better is it than just breaking the
page format and telling pg_upgrade users they're out of luck?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] 16-bit page checksums for 9.2
Excerpts from Tom Lane's message of mié feb 29 18:34:27 -0300 2012:
> Robert Haas writes:
>> On Wed, Feb 29, 2012 at 2:33 PM, Heikki Linnakangas wrote:
>>> The utility would run in the old cluster before upgrading, so the flag
>>> would have to be present in the old version.  pg_upgrade would check
>>> that the flag is set, refusing to upgrade if it isn't, with an error
>>> like "please run pre-upgrade utility first".
>
>> I find that a pretty unappealing design; it seems to me it'd be much
>> easier to make the new cluster cope with everything.
>
> Easier for who?  I don't care for the idea of code that has to cope with
> two page formats, or before long N page formats, because if we don't
> have some mechanism like this then we will never be able to decide that
> an old data format is safely dead.

.. in fact this is precisely what killed Zdenek Kotala's idea of
upgrading.

--
Álvaro Herrera
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] 16-bit page checksums for 9.2
Robert Haas writes:
> On Wed, Feb 29, 2012 at 2:33 PM, Heikki Linnakangas wrote:
>> The utility would run in the old cluster before upgrading, so the flag
>> would have to be present in the old version.  pg_upgrade would check
>> that the flag is set, refusing to upgrade if it isn't, with an error
>> like "please run pre-upgrade utility first".

> I find that a pretty unappealing design; it seems to me it'd be much
> easier to make the new cluster cope with everything.

Easier for who?  I don't care for the idea of code that has to cope with
two page formats, or before long N page formats, because if we don't
have some mechanism like this then we will never be able to decide that
an old data format is safely dead.

			regards, tom lane
Re: [HACKERS] pg_upgrade --logfile option documentation
On Tue, Feb 28, 2012 at 9:45 PM, Bruce Momjian wrote:
> OK, I have implemented both Robert's and Álvaro's ideas in my patch.
> I only add the .old suffix to pg_controldata when link mode is used, and
> I now do it after the schema has been created (the most common failure
> case for pg_upgrade), and just before we actually link files --- both
> very good ideas.

Thanks for working on this.  I think this will be a significant
usability improvement.

Any ideas about improving the error reporting more generally, so that
when reloading the dump fails, the user can easily see what went
belly-up, even if they didn't use -l?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] 16-bit page checksums for 9.2
On Wed, Feb 29, 2012 at 2:33 PM, Heikki Linnakangas wrote: > On 29.02.2012 21:30, Robert Haas wrote: >> >> On Wed, Feb 29, 2012 at 2:18 PM, Alvaro Herrera >> wrote: >>> >>> Note that if we want such an utility to walk and transform pages, we >>> probably need a marker in the catalogs somewhere so that pg_upgrade can >>> make sure that it was done in all candidate tables -- which is something >>> that we should get in 9.2 so that it can be used in 9.3. Such a marker >>> would also allow us get rid of HEAP_MOVED_IN and HEAP_MOVED_OUT. >> >> >> Getting rid of HEAP_MOVED_IN and HEAP_MOVED_OUT would be really nice, >> but I don't see why we need to squeeze anything into 9.2 for any of >> this. pg_upgrade can certainly handle the addition of a new pg_class >> column, and can arrange for in-place upgrades from pre-9.3 versions to >> 9.3 to set the flag to the appropriate value. > > The utility would run in the old cluster before upgrading, so the the flag > would have to be present in the old version. pg_upgrade would check that the > flag is set, refusing to upgrade if it isn't, with an error like "please run > pre-upgrade utility first". I find that a pretty unappealing design; it seems to me it'd be much easier to make the new cluster cope with everything. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] LIST OWNED BY...
Excerpts from Thom Brown's message of mié feb 29 17:50:14 -0300 2012: > On 29 February 2012 20:33, Euler Taveira de Oliveira > wrote: > > On 29-02-2012 15:23, Thom Brown wrote: > >> Or just change it to output a verbose notice without changing the syntax? > >> > > I can't see why we will do it only for DROP OWNED. Chat messages are > > annoying > > unless the user asks for it (that's why I suggested VERBOSE). > > Possibly, but I think if you're going ahead and dropping objects owned > by a role, you'll tend to be interested in what those things were > rather than blindly removing things from the database without actually > knowing what they are. Sure. Initially DROP OWNED was supposed to be used only after REASSIGN OWNED. So you give your objects to someone else, then DROP OWNED to revoke whatever privileges you have (which is the only thing that REASSIGN OWNED wouldn't have touched), then commit suicide. There is no danger of dropping useful objects in this scenario because you already gave them away. (Unless, of course, some other poor soul is busy reassigning objects to you). I agree that it might be useful to list objects owned by a given user. Maybe this belongs in a view or a function, however, not a new command. After all, the mechanism that those commands use to display objects is not very user-friendly, nor does it even claim to be complete (the lists are truncated after 200 objects or something like that). More generally, maybe the thing to do is have a SRF that reports objects that depend on some other object. If you specify a role as the referenced object, you get what this thread is about. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY with hints, rebirth
On Wed, Feb 29, 2012 at 6:14 PM, Christopher Browne wrote: > On Wed, Feb 29, 2012 at 11:14 AM, Simon Riggs wrote: >> But it is very effective at avoiding 4 out of the 5 writes you mention. > > For the "common case," would we not want to have (1) [WAL] and (2) > [Writing the pre-frozen tuple]? > > If we only write the tuple (2), and don't capture WAL, then the COPY > wouldn't be replicable, right? Well, my answer is a question: how would you like it to work? The way I coded it is that it will still write WAL if wal_level is set, so it would be replicable. So it only works when writing to a newly created table but is otherwise separate to whether WAL is skipped. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Collect frequency statistics for arrays
On Thu, Mar 1, 2012 at 1:09 AM, Tom Lane wrote: > Alexander Korotkov writes: > > On Thu, Mar 1, 2012 at 12:39 AM, Tom Lane wrote: > >> I am starting to look at this patch now. I'm wondering exactly why the > >> decision was made to continue storing btree-style statistics for arrays, > > > Probably, btree statistics really does matter for some sort of arrays? > For > > example, arrays representing paths in the tree. We could request a > subtree > > in a range query on such arrays. > > That seems like a pretty narrow, uncommon use-case. Also, to get > accurate stats for such queries that way, you'd need really enormous > histograms. I doubt that the existing parameters for histogram size > will permit meaningful estimation of more than the first array entry > (since we don't make the histogram any larger than we do for a scalar > column). > > The real point here is that the fact that we're storing btree-style > stats for arrays is an accident, backed into by having added btree > comparators for arrays plus analyze.c's habit of applying default > scalar-oriented analysis functions to any type without an explicit > typanalyze entry. I don't recall that we ever thought hard about > it or showed that those stats were worth anything. > OK. I don't object to removing btree stats from arrays. What do you think about the pg_stats view in this case? Should it combine the values histogram and the array length histogram in a single column, as we do for MCV and MCELEM? -- With best regards, Alexander Korotkov.
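The array length histogram Alexander mentions, and Tom's point about histogram resolution, are easier to see with a toy model. The sketch below is illustrative only — the function names and the linear interpolation inside a bucket are mine, not the patch's — showing how an equi-depth histogram of array lengths supports estimates such as "what fraction of rows have arrays of at most k elements":

```python
def length_histogram(lengths, nbuckets):
    """Equi-depth histogram: nbuckets+1 boundary values chosen so each
    bucket covers roughly the same number of observed array lengths."""
    s = sorted(lengths)
    return [s[round(i * (len(s) - 1) / nbuckets)] for i in range(nbuckets + 1)]

def fraction_at_most(hist, bound):
    """Estimate the fraction of arrays with length <= bound, interpolating
    linearly within the bucket that contains the bound."""
    nbuckets = len(hist) - 1
    if bound < hist[0]:
        return 0.0
    if bound >= hist[-1]:
        return 1.0
    for i in range(nbuckets):
        lo, hi = hist[i], hist[i + 1]
        if lo <= bound < hi:
            frac = (bound - lo) / (hi - lo) if hi > lo else 1.0
            return (i + frac) / nbuckets
    return 1.0
```

Tom's resolution point also falls out of this model: with a fixed number of buckets, the estimate can only ever resolve the distribution as finely as the bucket boundaries allow, no matter how complex the underlying values are.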
Re: [HACKERS] Collect frequency statistics for arrays
Alexander Korotkov writes: > On Thu, Mar 1, 2012 at 12:39 AM, Tom Lane wrote: >> I am starting to look at this patch now. I'm wondering exactly why the >> decision was made to continue storing btree-style statistics for arrays, > Probably, btree statistics really does matter for some sort of arrays? For > example, arrays representing paths in the tree. We could request a subtree > in a range query on such arrays. That seems like a pretty narrow, uncommon use-case. Also, to get accurate stats for such queries that way, you'd need really enormous histograms. I doubt that the existing parameters for histogram size will permit meaningful estimation of more than the first array entry (since we don't make the histogram any larger than we do for a scalar column). The real point here is that the fact that we're storing btree-style stats for arrays is an accident, backed into by having added btree comparators for arrays plus analyze.c's habit of applying default scalar-oriented analysis functions to any type without an explicit typanalyze entry. I don't recall that we ever thought hard about it or showed that those stats were worth anything. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] controlling the location of server-side SSL files
Peter Eisentraut writes: > On ons, 2012-02-29 at 14:27 -0500, Tom Lane wrote: >> Hm? Obviously I misunderstood what changes you were proposing to make, >> so would you mind spelling it out? > The details are to be determined, but a possible change would likely be > that instead of looking for a file and using it if and only if found, > there would be some kind of connection parameter saying "use this file > for this functionality", and otherwise it's not used. The particular > example would be the CRL file. Mph. That seems unlikely to be a net win to me. The scenario I'm imagining is that you ("you" being DBA for some group of people) didn't have a CRL file before, and now you need one. Your administration problem is to get that CRL file into place for all your users. If we change as above, then you still have that admin problem, plus now you have another: getting all your users to use the new connection parameter. Which, as a rule, is going to be tough (for example, psql has no easy way to make that happen). The new admin problem offers you no leverage at all on the old one, either, since a user who's not acquired the CRL file more than likely hasn't changed his connection habits either. There may or may not be some value in a connection parameter that allows specifying a location besides ~/.postgresql/ for the SSL support files. But I don't find any attraction in changing the default behavior. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Collect frequency statistics for arrays
On Thu, Mar 1, 2012 at 12:39 AM, Tom Lane wrote: > I am starting to look at this patch now. I'm wondering exactly why the > decision was made to continue storing btree-style statistics for arrays, > in addition to the new stuff. The pg_statistic rows for array columns > tend to be unreasonably wide already, and as-is this patch will make > them quite a lot wider. I think it requires more than a little bit of > evidence to continue storing stats that seem to have only small > probability of usefulness. > > In particular, if we didn't store that stuff, we'd not need to widen the > number of columns in pg_statistic, which would noticeably reduce both > the footprint of the patch and the probability of breaking external > code. > Initially, I used existing slots for new statistics, but I've changed this after the first review: http://archives.postgresql.org/pgsql-hackers/2011-07/msg00780.php Probably, btree statistics really does matter for some sort of arrays? For example, arrays representing paths in the tree. We could request a subtree in a range query on such arrays. -- With best regards, Alexander Korotkov.
Re: [HACKERS] LIST OWNED BY...
On 29 February 2012 20:33, Euler Taveira de Oliveira wrote: > On 29-02-2012 15:23, Thom Brown wrote: >> Or just change it to output a verbose notice without changing the syntax? >> > I can't see why we will do it only for DROP OWNED. Chat messages are annoying > unless the user asks for it (that's why I suggested VERBOSE). Possibly, but I think if you're going ahead and dropping objects owned by a role, you'll tend to be interested in what those things were rather than blindly removing things from the database without actually knowing what they are. -- Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Collect frequency statistics for arrays
Alexander Korotkov writes: > On Mon, Jan 23, 2012 at 7:58 PM, Noah Misch wrote: >> I've attached a new version that includes the UINT64_FMT fix, some edits of >> your newest comments, and a rerun of pgindent on the new files. I see no >> other issues precluding commit, so I am marking the patch Ready for >> Committer. >> If I made any of the comments worse, please post another update. > Changes looks reasonable for me. Thanks! I am starting to look at this patch now. I'm wondering exactly why the decision was made to continue storing btree-style statistics for arrays, in addition to the new stuff. The pg_statistic rows for array columns tend to be unreasonably wide already, and as-is this patch will make them quite a lot wider. I think it requires more than a little bit of evidence to continue storing stats that seem to have only small probability of usefulness. In particular, if we didn't store that stuff, we'd not need to widen the number of columns in pg_statistic, which would noticeably reduce both the footprint of the patch and the probability of breaking external code. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] LIST OWNED BY...
On 29-02-2012 15:23, Thom Brown wrote: > Or just change it to output a verbose notice without changing the syntax? > I can't see why we would do it only for DROP OWNED. Chatty messages are annoying unless the user asks for them (that's why I suggested VERBOSE). -- Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] review: CHECK FUNCTION statement
Hello 2012/2/29 Alvaro Herrera : > > I think the way we're passing down the options to the checker is a bit > of a mess. The way it is formulated, it seems to me that we'd need to > add support code in the core CheckFunction for each option we might want > to accept in the PL-specific checkers -- including what type of value > the option receives. As an example, right now we have all but one > option taking a string argument (judging from usage of defGetString()); > however, option fatal_errors takes a boolean value, and it converts to > string "on" or "off" which is supposed to be passed down to the checker. > This doesn't seem very future-proof. I don't agree - we don't have any datatype that can hold a "key/value" list and can be used via the SQL interface. So a two-dimensional text array is a simple and generic data type. Theoretically we could use the JSON type now, but I think an array is more generic and better checked than JSON (and it requires less code - JSON is still just a string too). We don't plan to modify the parser for better JSON support, so a text array is more user-friendly. I think the PL checker function can be called simply with this concept. But I am not against your proposals. Actually, the "concept" of generic options was required in the initial discussion and is implemented, but not used. It cannot be removed, though, because we probably don't want to change the API in the next version. > > (Also, the patch seems to be passing the fatal_errors value twice: first > in the options array, where it is ignored by the plpgsql checker, and a > second time as a separate boolean option. This needs a cleanup). > This is by design. One request for the checker function (and the CHECK FUNCTION statement) was generic support for some optional data. This can make sense for plperl or plpython, though it is not used now. fatal_errors is only a proof of concept and can be removed. I plan to use these options in 9.3 for checking of "inline blocks". 
> I don't see any good way to pass down generic options in a generic way. > Maybe we can just state that all option values are going to be passed as > strings -- is that good enough? The other option would be to pass them > using something like pg_node_tree, but then it wouldn't be possible to > call the checker directly instead of through CHECK FUNCTION, which I > think was a requirement. And it'd be a stronger argument against usage > of SPI to call the checker function from CHECK FUNCTION, but that's an > unsolved problem. Using just a string needs more complex parsing, and I don't like something like pg_node_tree either, because it cannot be simply created by hand for a direct call of the checker function. Please accept the fact that we want to call the PL checker function directly - and then all parameters of this function should be simple to create - and using a two-dimensional array is really simple: ARRAY [['source',$$$$]]. I don't understand why you don't like passing generic options in a generic way. Why? Regards Pavel Stehule > > -- > Álvaro Herrera > The PostgreSQL Company - Command Prompt, Inc. > PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
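Pavel's scheme — every option arrives as text in a two-dimensional array of [key, value] pairs, and each checker coerces the values it understands — can be sketched in a few lines. The helper below is hypothetical (the real checkers are C functions, and only fatal_errors is assumed to be boolean, following the discussion):

```python
def parse_checker_options(pairs):
    """Interpret a list of [key, value] text pairs, as a PL checker might
    receive them from a two-dimensional text array like
    ARRAY[['fatal_errors', 'on'], ['source', '...']].

    All values arrive as strings; the checker coerces the ones it knows
    about (here only 'fatal_errors', treated as a boolean) and leaves the
    rest as text for other checkers or future options."""
    bool_map = {"on": True, "true": True, "off": False, "false": False}
    opts = {}
    for key, value in pairs:
        if key == "fatal_errors":
            if value.lower() not in bool_map:
                raise ValueError("invalid boolean for fatal_errors: %r" % value)
            opts[key] = bool_map[value.lower()]
        else:
            opts[key] = value  # unknown options stay as plain text
    return opts
```

This is the trade-off both sides are describing: the transport stays fully generic (text pairs), at the cost of each checker doing its own per-option type coercion.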
Re: [HACKERS] 16-bit page checksums for 9.2
Excerpts from Heikki Linnakangas's message of mié feb 29 16:29:26 -0300 2012: > On 29.02.2012 21:18, Alvaro Herrera wrote: > > Excerpts from Heikki Linnakangas's message of mié feb 29 16:09:02 -0300 > > 2012: > >> I thought my view on how this should be done was already clear, but just > >> in case it isn't, let me restate: Enlarge the page header to make room > >> for the checksum. To handle upgrades, put code in the backend to change > >> the page format from old version to new one on-the-fly, as pages are > >> read in. Because we're making the header larger, we need to ensure that > >> there's room on every page. To do that, write a utility that you run on > >> the cluster before running pg_upgrade, which moves tuples to ensure > >> that. To ensure that the space doesn't get used again before upgrading, > >> change the old version so that it reserves those N bytes in all new > >> insertions and updates (I believe that approach has been discussed > >> before and everyone is comfortable with backpatching such a change). All > >> of this in 9.3. > > > > Note that if we want such a utility to walk and transform pages, we > > probably need a marker in the catalogs somewhere so that pg_upgrade can > > make sure that it was done in all candidate tables -- which is something > > that we should get in 9.2 so that it can be used in 9.3. > > In the simplest form, the utility could just create a magic file in the > data directory to indicate that it has run. All we need is a boolean > flag, unless you want to be fancy and make the utility restartable. > Implemented that way, there's no need to have anything in the catalogs. Well, I find it likely that people with huge and/or high velocity databases would want to get fancy about it, so that they can schedule it as they see fit. What I wouldn't like is a utility that is optional, so that we end up after upgrade with some data that has the new page format and some that doesn't. 
-- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] "make check" in src/test/isolation is unworkable
On 02/29/2012 02:33 PM, Peter Eisentraut wrote: On sön, 2011-05-08 at 19:35 -0400, Tom Lane wrote: I believe that the "make check" target in src/test/isolation is fundamentally unportable, as is illustrated by the fact that buildfarm member coypu is currently choking on it. The reason is that the pg_isolation_regress program depends on libpq, and in particular it depends on having an *installed* libpq. Anyplace where it appears to work, it's because you already installed Postgres, or at least libpq. I came across this old issue. Unless I'm missing something, there is no reason why pg_isolation_regress needs to be linked with libpq at all, and it works fine without it. If we removed the libpq link, then it would work just like pg_regress and could support "make check". Apparently, -Wl,--as-needed isn't working too well here. I believe we can't rely on it working. If we really don't need libpq then why not just filter it out? cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] controlling the location of server-side SSL files
On ons, 2012-02-29 at 14:27 -0500, Tom Lane wrote: > Peter Eisentraut writes: > > On ons, 2012-02-29 at 14:20 -0500, Tom Lane wrote: > >> In particular, I observe that we get pushback anytime we break something > >> in a way that makes SSL config files be required on the client side; > >> see bug #6302 for most recent example. > > > *If* we were to make a change in libpq analogous to the server side, the > > effect would be to make the files less required, which could actually > > help the case of bug #6302. > > Hm? Obviously I misunderstood what changes you were proposing to make, > so would you mind spelling it out? The details are to be determined, but a possible change would likely be that instead of looking for a file and using it if and only if found, there would be some kind of connection parameter saying "use this file for this functionality", and otherwise it's not used. The particular example would be the CRL file. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 16-bit page checksums for 9.2
On 29.02.2012 21:30, Robert Haas wrote: On Wed, Feb 29, 2012 at 2:18 PM, Alvaro Herrera wrote: Note that if we want such an utility to walk and transform pages, we probably need a marker in the catalogs somewhere so that pg_upgrade can make sure that it was done in all candidate tables -- which is something that we should get in 9.2 so that it can be used in 9.3. Such a marker would also allow us get rid of HEAP_MOVED_IN and HEAP_MOVED_OUT. Getting rid of HEAP_MOVED_IN and HEAP_MOVED_OUT would be really nice, but I don't see why we need to squeeze anything into 9.2 for any of this. pg_upgrade can certainly handle the addition of a new pg_class column, and can arrange for in-place upgrades from pre-9.3 versions to 9.3 to set the flag to the appropriate value. The utility would run in the old cluster before upgrading, so the flag would have to be present in the old version. pg_upgrade would check that the flag is set, refusing to upgrade if it isn't, with an error like "please run pre-upgrade utility first". -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] "make check" in src/test/isolation is unworkable
On sön, 2011-05-08 at 19:35 -0400, Tom Lane wrote: > I believe that the "make check" target in src/test/isolation is > fundamentally unportable, as is illustrated by the fact that buildfarm > member coypu is currently choking on it. The reason is that the > pg_isolation_regress program depends on libpq, and in particular it > depends on having an *installed* libpq. Anyplace where it appears to > work, it's because you already installed Postgres, or at least libpq. I came across this old issue. Unless I'm missing something, there is no reason why pg_isolation_regress needs to be linked with libpq at all, and it works fine without it. If we removed the libpq link, then it would work just like pg_regress and could support "make check". Apparently, -Wl,--as-needed isn't working too well here. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 16-bit page checksums for 9.2
On Wed, Feb 29, 2012 at 2:18 PM, Alvaro Herrera wrote: > Note that if we want such an utility to walk and transform pages, we > probably need a marker in the catalogs somewhere so that pg_upgrade can > make sure that it was done in all candidate tables -- which is something > that we should get in 9.2 so that it can be used in 9.3. Such a marker > would also allow us get rid of HEAP_MOVED_IN and HEAP_MOVED_OUT. Getting rid of HEAP_MOVED_IN and HEAP_MOVED_OUT would be really nice, but I don't see why we need to squeeze anything into 9.2 for any of this. pg_upgrade can certainly handle the addition of a new pg_class column, and can arrange for in-place upgrades from pre-9.3 versions to 9.3 to set the flag to the appropriate value. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 16-bit page checksums for 9.2
On 29.02.2012 21:18, Alvaro Herrera wrote: Excerpts from Heikki Linnakangas's message of mié feb 29 16:09:02 -0300 2012: I thought my view on how this should be done was already clear, but just in case it isn't, let me restate: Enlarge the page header to make room for the checksum. To handle upgrades, put code in the backend to change the page format from old version to new one on-the-fly, as pages are read in. Because we're making the header larger, we need to ensure that there's room on every page. To do that, write a utility that you run on the cluster before running pg_upgrade, which moves tuples to ensure that. To ensure that the space doesn't get used again before upgrading, change the old version so that it reserves those N bytes in all new insertions and updates (I believe that approach has been discussed before and everyone is comfortable with backpatching such a change). All of this in 9.3. Note that if we want such an utility to walk and transform pages, we probably need a marker in the catalogs somewhere so that pg_upgrade can make sure that it was done in all candidate tables -- which is something that we should get in 9.2 so that it can be used in 9.3. In the simplest form, the utility could just create a magic file in the data directory to indicate that it has run. All we need is a boolean flag, unless you want to be fancy and make the utility restartable. Implemented that way, there's no need to have anything in the catalogs. Such a marker would also allow us get rid of HEAP_MOVED_IN and HEAP_MOVED_OUT. True. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 16-bit page checksums for 9.2
On Wed, Feb 29, 2012 at 2:09 PM, Heikki Linnakangas wrote: > On 29.02.2012 19:54, Simon Riggs wrote: >> I'm beginning to lose faith that objections are being raised at a >> rational level. It's not a panel game with points for clever answers, >> its an engineering debate about how to add features real users want. >> And they do want, so let me solve the problems by agreeing something >> early enough to allow it to be implemented, rather than just >> discussing it until we run out of time. > > I thought my view on how this should be done was already clear, but just in > case it isn't, let me restate: Enlarge the page header to make room for the > checksum. To handle upgrades, put code in the backend to change the page > format from old version to new one on-the-fly, as pages are read in. Because > we're making the header larger, we need to ensure that there's room on every > page. To do that, write a utility that you run on the cluster before running > pg_upgrade, which moves tuples to ensure that. To ensure that the space > doesn't get used again before upgrading, change the old version so that it > reserves those N bytes in all new insertions and updates (I believe that > approach has been discussed before and everyone is comfortable with > backpatching such a change). All of this in 9.3. This could all be done on-line if we had a per-table flag indicating which page version is in use. You can use some variant of CLUSTER or VACUUM or ALTER TABLE to rewrite the table using the page version of your choice. This is also more granular, in that it allows checksums (or other features) to be turned on and off on a per-table basis. When you read the table, the same flag tells you which page version to expect, and you can error out if you haven't got that (or if the CRCs don't match). 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
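Robert's scheme — a per-table flag saying which page format to expect, with checksum verification on read — can be sketched as follows. This is an illustrative model only: it uses a full CRC-32 in a made-up header slot for clarity, whereas the actual proposal packs a 16-bit checksum into the page header, and the field offset here is invented:

```python
import struct
import zlib

PAGE_SIZE = 8192
CHECKSUM_OFF = 8  # hypothetical offset of the checksum field in the header

def set_checksum(page: bytearray) -> None:
    """Store a CRC computed over the page with the checksum field itself
    zeroed, so verification can recompute the same value later."""
    struct.pack_into("<I", page, CHECKSUM_OFF, 0)
    struct.pack_into("<I", page, CHECKSUM_OFF, zlib.crc32(page))

def verify_page(page: bytearray, table_has_checksums: bool) -> bool:
    """Verify a page on read-in, but only if the table-level flag says
    this table was rewritten into the checksummed page format."""
    if not table_has_checksums:
        return True  # old page format: nothing to check
    stored, = struct.unpack_from("<I", page, CHECKSUM_OFF)
    copy = bytearray(page)
    struct.pack_into("<I", copy, CHECKSUM_OFF, 0)
    return stored == zlib.crc32(copy)
```

The per-table flag is what makes the feature granular: tables not yet rewritten simply skip verification, while rewritten tables error out (or at least report) on a mismatch.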
Re: [HACKERS] controlling the location of server-side SSL files
Peter Eisentraut writes: > On ons, 2012-02-29 at 14:20 -0500, Tom Lane wrote: >> In particular, I observe that we get pushback anytime we break something >> in a way that makes SSL config files be required on the client side; >> see bug #6302 for most recent example. > *If* we were to make a change in libpq analogous to the server side, the > effect would be to make the files less required, which could actually > help the case of bug #6302. Hm? Obviously I misunderstood what changes you were proposing to make, so would you mind spelling it out? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] controlling the location of server-side SSL files
On ons, 2012-02-29 at 14:20 -0500, Tom Lane wrote: > In particular, I observe that we get pushback anytime we break something > in a way that makes SSL config files be required on the client side; > see bug #6302 for most recent example. *If* we were to make a change in libpq analogous to the server side, the effect would be to make the files less required, which could actually help the case of bug #6302. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] controlling the location of server-side SSL files
Peter Eisentraut writes: > On ons, 2012-02-08 at 09:16 +0100, Magnus Hagander wrote: >> Yes, ignoring a missing file in a security context is definitely not good. >> It should throw an error. >> >> We have a few bad defaults from the old days around SSL for this, but if it >> requires breaking backwards compatibility to get it right, I think we >> should still do it. > Btw., should we also consider making similar changes on the libpq side? I think that breaking compatibility of libpq's behavior is a whole lot harder sell than changing things in a way that only affects what people have to put into postgresql.conf. We've always treated the latter as something that can change across major versions. In particular, I observe that we get pushback anytime we break something in a way that makes SSL config files be required on the client side; see bug #6302 for most recent example. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 16-bit page checksums for 9.2
Excerpts from Heikki Linnakangas's message of mié feb 29 16:09:02 -0300 2012: > On 29.02.2012 19:54, Simon Riggs wrote: > > I'm beginning to lose faith that objections are being raised at a > > rational level. It's not a panel game with points for clever answers, > > it's an engineering debate about how to add features real users want. > > And they do want, so let me solve the problems by agreeing something > > early enough to allow it to be implemented, rather than just > > discussing it until we run out of time. > > I thought my view on how this should be done was already clear, but just > in case it isn't, let me restate: Enlarge the page header to make room > for the checksum. To handle upgrades, put code in the backend to change > the page format from old version to new one on-the-fly, as pages are > read in. Because we're making the header larger, we need to ensure that > there's room on every page. To do that, write a utility that you run on > the cluster before running pg_upgrade, which moves tuples to ensure > that. To ensure that the space doesn't get used again before upgrading, > change the old version so that it reserves those N bytes in all new > insertions and updates (I believe that approach has been discussed > before and everyone is comfortable with backpatching such a change). All > of this in 9.3. Note that if we want such a utility to walk and transform pages, we probably need a marker in the catalogs somewhere so that pg_upgrade can make sure that it was done in all candidate tables -- which is something that we should get in 9.2 so that it can be used in 9.3. Such a marker would also allow us to get rid of HEAP_MOVED_IN and HEAP_MOVED_OUT. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Client Messages
Excerpts from Heikki Linnakangas's message of jue ene 26 15:58:58 -0300 2012: > On 26.01.2012 17:31, Tom Lane wrote: > > The idea that occurs to me is to have the code that uses the GUC do a > > verify_mbstr(noerror) on it, and silently ignore it if it doesn't pass > > (maybe with a LOG message). This would have to be documented of course, > > but it seems better than the potential consequences of trying to send a > > wrongly-encoded string. > > Hmm, fine with me. It would be nice to plug the hole that these bogus > characters can leak elsewhere into the system through current_setting, > though. Perhaps we could put the verify_mbstr() call somewhere in guc.c, > to forbid incorrectly encoded characters from being stored in the guc > variable in the first place. This patch is listed as "Needs review" but that seems to be wrong -- it's "waiting on author", I think. Do we have an updated patch? Fujii? -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parameterized-path cost comparisons need some work
Robert Haas writes: > On Wed, Feb 29, 2012 at 1:40 PM, Tom Lane wrote: >> Indeed, and the code already knows that. However, in this example, path >> A is capable of dominating the other three (being strictly less >> parameterized than them), and B and C are each capable of dominating D. >> The problem is just that I'd neglected to consider that rowcount now >> also becomes a figure of merit. > In theory, yes, but in practice, won't it nearly always be the case > that a less parameterized path generates more rows, and a more > parameterized path generates less rows, so that neither dominates the > other? I think you're just assuming that without any solid evidence. My point is precisely that if the more-parameterized path *fails* to generate fewer rows, we want add_path to notice that and throw it out (or at least be able to throw it out, if there's not another reason to keep it). regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 16-bit page checksums for 9.2
On 29.02.2012 19:54, Simon Riggs wrote: > I'm beginning to lose faith that objections are being raised at a > rational level. It's not a panel game with points for clever answers, > it's an engineering debate about how to add features real users want. > And they do want, so let me solve the problems by agreeing something > early enough to allow it to be implemented, rather than just > discussing it until we run out of time. I thought my view on how this should be done was already clear, but just in case it isn't, let me restate: Enlarge the page header to make room for the checksum. To handle upgrades, put code in the backend to change the page format from old version to new one on-the-fly, as pages are read in. Because we're making the header larger, we need to ensure that there's room on every page. To do that, write a utility that you run on the cluster before running pg_upgrade, which moves tuples to ensure that. To ensure that the space doesn't get used again before upgrading, change the old version so that it reserves those N bytes in all new insertions and updates (I believe that approach has been discussed before and everyone is comfortable with backpatching such a change). All of this in 9.3. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
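Heikki's scheme has two preconditions that interlock: the pre-upgrade utility must reserve room on every old-format page, and the read path must then be able to convert pages on the fly. A minimal toy model of that read-path conversion is sketched below; the struct, version constants, and function names are illustrative assumptions, not PostgreSQL's actual page layout.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of on-the-fly page format conversion -- NOT PostgreSQL's
 * real layout. The on-disk page carries a format version; when an
 * old-format page is read in, it is upgraded to the new format, which
 * adds a 16-bit checksum field. The pre-upgrade utility is assumed to
 * have already guaranteed the extra room on every old page. */
#define OLD_PAGE_VERSION 4
#define NEW_PAGE_VERSION 5

typedef struct {
    uint16_t version;   /* page format version */
    uint16_t checksum;  /* meaningful only when version == NEW_PAGE_VERSION */
    uint16_t lower;     /* offset of start of free space */
    uint16_t upper;     /* offset of end of free space */
} ToyPageHeader;

/* Convert an old-format page as it is read in. Returns 0 on success,
 * -1 if the page lacks the reserved room (i.e. the utility never ran). */
static int convert_page_on_read(ToyPageHeader *hdr)
{
    if (hdr->version == NEW_PAGE_VERSION)
        return 0;                  /* already in the new format */
    if (hdr->upper - hdr->lower < 2)
        return -1;                 /* no room reserved for the checksum */
    hdr->version = NEW_PAGE_VERSION;
    hdr->checksum = 0;             /* computed later, when the page is written */
    return 0;
}
```

The -1 branch is exactly why Álvaro's catalog marker matters: without proof that the utility visited every table, the read path can encounter a page it cannot convert.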
Re: [HACKERS] Parameterized-path cost comparisons need some work
On Wed, Feb 29, 2012 at 1:40 PM, Tom Lane wrote: >> I am having trouble >> constructing an example, but I feel like there might be cases where >> it's possible to have path A, not parameterized, path B, parameterized >> by R, and path C, parameterized by S, and maybe also path D, >> parameterized by both R and S. In that case, I feel like paths B and >> C are incomparable. > > Indeed, and the code already knows that. However, in this example, path > A is capable of dominating the other three (being strictly less > parameterized than them), and B and C are each capable of dominating D. > The problem is just that I'd neglected to consider that rowcount now > also becomes a figure of merit. In theory, yes, but in practice, won't it nearly always be the case that a less parameterized path generates more rows, and a more parameterized path generates less rows, so that neither dominates the other? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] review: CHECK FUNCTION statement
I think the way we're passing down the options to the checker is a bit of a mess. The way it is formulated, it seems to me that we'd need to add support code in the core CheckFunction for each option we might want to accept in the PL-specific checkers -- including what type of value the option receives. As an example, right now we have all but one option taking a string argument (judging from usage of defGetString()); however, option fatal_errors takes a boolean value, and it converts to string "on" or "off" which is supposed to be passed down to the checker. This doesn't seem very future-proof. (Also, the patch seems to be passing the fatal_errors value twice: first in the options array, where it is ignored by the plpgsql checker, and a second time as a separate boolean option. This needs a cleanup). I don't see any good way to pass down generic options in a generic way. Maybe we can just state that all option values are going to be passed as strings -- is that good enough? The other option would be to pass them using something like pg_node_tree, but then it wouldn't be possible to call the checker directly instead of through CHECK FUNCTION, which I think was a requirement. And it'd be a stronger argument against usage of SPI to call the checker function from CHECK FUNCTION, but that's an unsolved problem. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
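The "all option values are passed as strings" idea could look roughly like the sketch below: core CHECK FUNCTION hands the PL-specific checker a flat array of name/value string pairs, and the checker parses whatever types it understands (e.g. a boolean encoded as "on"/"off"). The names and types here are hypothetical, not the patch's actual API.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical generic option representation: every value travels as a
 * string, so the core never needs per-option type knowledge. */
typedef struct {
    const char *name;
    const char *value;
} CheckerOption;

/* A PL checker's helper: look up a boolean option by name, interpreting
 * "on" as true and anything else as false; fall back to a default when
 * the option is absent. */
static bool checker_option_bool(const CheckerOption *opts, int nopts,
                                const char *name, bool dflt)
{
    for (int i = 0; i < nopts; i++)
        if (strcmp(opts[i].name, name) == 0)
            return strcmp(opts[i].value, "on") == 0;
    return dflt;
}
```

Under this convention fatal_errors would travel once, as the string pair, and the redundant separate boolean parameter mentioned above would disappear.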
Re: [HACKERS] Parameterized-path cost comparisons need some work
Robert Haas writes: > On Tue, Feb 28, 2012 at 5:35 PM, Tom Lane wrote: >> The most obvious thing to do about this is to consider that one path can >> dominate another on cost only if it is both cheaper *and* produces no >> more rows. > Hmm. Are you sure that's the right rule? On further reflection it had seemed like we might have to treat the rowcount as another independent metric of value. > I am having trouble > constructing an example, but I feel like there might be cases where > it's possible to have path A, not parameterized, path B, parameterized > by R, and path C, parameterized by S, and maybe also path D, > parameterized by both R and S. In that case, I feel like paths B and > C are incomparable. Indeed, and the code already knows that. However, in this example, path A is capable of dominating the other three (being strictly less parameterized than them), and B and C are each capable of dominating D. The problem is just that I'd neglected to consider that rowcount now also becomes a figure of merit. > Even if one is cheaper than the other and also > produces fewer rows, they're not the same rows; so the expense of a > subsequent join might be different with one vs. the other. Yeah, if the quals being used are different, that's possible in principle; but I don't foresee being able to estimate such a thing. I think just using the number of rows is as good as we're likely to be able to do. > Even in the simple case where there's only one possible > parameterization, it seems very difficult to compare the cost of a > parameterized path A with the cost of an unparameterized path B. I don't believe this. We can certainly compare costs, and we can certainly compare rowcount. The possibility that the specific set of rows selected might affect subsequent join costs seems to me to be too second-order to justify ignoring these first-order differences. 
The variant of the argument that had occurred to me is that if you consider that rowcount is an independent metric, then it might be that a less-parameterized path can never dominate a more-parameterized path, because even if the former is cheaper it should always generate more rows. However, I'm not convinced that I believe that --- in particular, with a poor choice of parameterizing quals it might not be true. If we did have a more-parameterized path that wasn't any more selective, we'd really want add_path to notice that and throw it out. > I almost wonder if the bottom-up approach is the wrong way to tackle > this entirely. I don't find this idea too attractive ... and in any case we're not doing rm -rf src/backend/optimizer/ and start over for 9.2. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
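The dominance rule Tom describes -- parameterization, cost, and now rowcount all acting as figures of merit -- can be sketched as a toy comparison (this is an illustration, not the real add_path code; the actual planner also weighs startup cost and pathkeys):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy path descriptor; param_rels is a bitmask of the outer rels the
 * path requires as parameters. All of this is illustrative only. */
typedef struct {
    double   cost;        /* total cost estimate */
    double   rows;        /* output rowcount estimate */
    unsigned param_rels;  /* required outer rels, as a bitmask */
} ToyPath;

/* true when 'a' requires no outer rels beyond those 'b' requires,
 * i.e. 'a' is no more parameterized than 'b' */
static bool params_subset(const ToyPath *a, const ToyPath *b)
{
    return (a->param_rels & ~b->param_rels) == 0;
}

/* 'a' can knock out 'b' only if it is no more parameterized, no more
 * expensive, AND -- the new figure of merit -- produces no more rows.
 * A more-parameterized path that fails to be more selective loses. */
static bool dominates(const ToyPath *a, const ToyPath *b)
{
    return params_subset(a, b) && a->cost <= b->cost && a->rows <= b->rows;
}
```

With this rule, the unparameterized path A throws out a doubly-parameterized path D that costs more yet returns more rows, while A and B (parameterized, cheaper, fewer rows) remain incomparable, exactly as in the example above.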
Re: [HACKERS] controlling the location of server-side SSL files
On Wed, 2012-02-08 at 09:16 +0100, Magnus Hagander wrote: > > I'm still worried about this. If we ignore a missing root.crt, then the > > effect is that authentication and certificate verification might fail, > > which would be annoying, but you'd notice it soon enough. But if we > > ignore a missing root.crl, we are creating a security hole. > > > > Yes, ignoring a missing file in a security context is definitely not good. > It should throw an error. > > We have a few bad defaults from the old days around SSL for this, but if it > requires breaking backwards compatibility to get it right, I think we > should still do it. Btw., should we also consider making similar changes on the libpq side? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] LIST OWNED BY...
On Wed, Feb 29, 2012 at 12:20 PM, Thom Brown wrote: > On 29 February 2012 17:16, Tom Lane wrote: >> Thom Brown writes: >>> So could we introduce either a command to show which objects are owned >>> by a particular role, or allow a dry-run of DROP OWNED BY? >> >> It's always been possible to do that: >> >> begin; >> drop owned by joe; >> rollback; >> >> I believe this is already the recommended approach if you're concerned >> about what DROP CASCADE will do. > > No, the cascade part is fine. It's the objects which won't cause a > cascade that are an issue. Putting it in a transaction for rolling > back doesn't help find out what it intends to drop. > > How can the user tell what the statement would drop (ignoring cascades)? It's certainly possible to write a query for this, but I think this gets back to the old argument about whether every client (and every end-user) should be required to reimplement this, or whether maybe we ought to provide some server functionality around it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] LIST OWNED BY...
On 29 February 2012 18:15, Euler Taveira de Oliveira wrote: > On 29-02-2012 14:20, Thom Brown wrote: >> No, the cascade part is fine. It's the objects which won't cause a >> cascade that are an issue. Putting it in a transaction for rolling >> back doesn't help find out what it intends to drop. >> > DROP OWNED BY foo VERBOSE? Or just change it to output a verbose notice without changing the syntax? -- Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] LIST OWNED BY...
On 29-02-2012 14:20, Thom Brown wrote: > No, the cascade part is fine. It's the objects which won't cause a > cascade that are an issue. Putting it in a transaction for rolling > back doesn't help find out what it intends to drop. > DROP OWNED BY foo VERBOSE? -- Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY with hints, rebirth
On Wed, Feb 29, 2012 at 11:14 AM, Simon Riggs wrote: > But it is very effective at avoiding 4 out of the 5 writes you mention. For the "common case," would we not want to have (1) [WAL] and (2) [Writing the pre-frozen tuple]? If we only write the tuple (2), and don't capture WAL, then the COPY wouldn't be replicable, right? -- When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?" -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parameterized-path cost comparisons need some work
On Tue, Feb 28, 2012 at 5:35 PM, Tom Lane wrote: > The most obvious thing to do about this is to consider that one path can > dominate another on cost only if it is both cheaper *and* produces no > more rows. Hmm. Are you sure that's the right rule? I am having trouble constructing an example, but I feel like there might be cases where it's possible to have path A, not parameterized, path B, parameterized by R, and path C, parameterized by S, and maybe also path D, parameterized by both R and S. In that case, I feel like paths B and C are incomparable. Even if one is cheaper than the other and also produces fewer rows, they're not the same rows; so the expense of a subsequent join might be different with one vs. the other. Even in the simple case where there's only one possible parameterization, it seems very difficult to compare the cost of a parameterized path A with the cost of an unparameterized path B. No matter how much cheaper A is than B, the eventual nested loop join might be so inefficient as to completely wipe A out. On the flip side, even if A is more expensive than B, it's nearly always going to produce at least somewhat fewer rows. There are degenerate cases where that might not be true, like a parameterized join where the table we're index-scanning has only one value in the parameterized column, but presumably that's rare. So it may be worth keeping A around just because the subsequent join could turn out to be much cheaper with the smaller row count. Thus it seems unlikely that we'll be able to conclude that either A or B is categorically superior. If you accept that reasoning, then it seems like there's little if any point in ever comparing a parameterized path to a non-parameterized path; or of comparing two differently-parameterized paths against each other. 
You basically need a separate bucket for each group of paths, where they compete against each other but not against differently-parameterized paths; and then after you form the nested loop, all the paths that are now parameterized the same way (but weren't before) can finally go head-to-head against each other. I almost wonder if the bottom-up approach is the wrong way to tackle this entirely. Suppose we're trying to build paths for {A B}. We first build unparameterized paths for {A} and {B} and compute join paths for {A B}. Now we know the cheapest way to build {A B} without using parameterization, so we can make some deductions about a plan of the form: Nested Loop -> A -> B (parameterized by A) In particular, if we take the best cost either overall or for a path with the pathkeys that will be produced by this join, and divide by the number of rows for A, a parameterized path for B only makes sense if the total cost is less than that value. Now we have a meaningful bound that we can use to limit consideration of paths for B: anything that's more expensive than that bound should be chucked. If B is just a single rel that's not that interesting, but if B is a joinrel, then the bound applies individually to any subset of its members. So we start replanning B as a separate join problem, but any path for any baserel or joinrel that exceeds our cutoff cost gets chucked; and if any rel ends up with no workable paths, then we just forget the whole thing. Of course, the problem with this approach (aside from complexity) is that you might end up planning the same subproblem multiple times with different cost bounds. You could cache the results of previous computations so that you only replan it when the cost bound goes up, but that's still pretty awful. So maybe this is unworkable. But the current approach seems problematic, too, because you don't really know enough to throw very much away at the time that you need to make that decision. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
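The cost-bound pruning test in Robert's top-down sketch reduces to a single comparison: a path for B parameterized by A is worth keeping only if running it once per outer row of A can still beat the best unparameterized plan for {A B}. A minimal illustration (the real planner would also have to account for startup cost and pathkeys):

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative version of the pruning bound described above: keep a
 * parameterized inner path only if repeating it for every outer row
 * undercuts the cheapest known unparameterized join plan. */
static bool worth_keeping(double best_unparam_join_cost,
                          double outer_rows,
                          double param_inner_path_cost)
{
    return param_inner_path_cost * outer_rows < best_unparam_join_cost;
}
```

Equivalently, the cutoff cost Robert derives for any path inside B is best_unparam_join_cost divided by the outer rowcount.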
Re: [HACKERS] 16-bit page checksums for 9.2
On Wed, Feb 29, 2012 at 4:41 PM, Robert Haas wrote: > On Wed, Feb 29, 2012 at 11:24 AM, Simon Riggs wrote: >> On Wed, Feb 29, 2012 at 3:46 PM, Heikki Linnakangas >> wrote: >> Are you saying you would accept the patch if we had this? >> >>> I think I would still be uncomfortable with the hacks in the page header. >> >> There are no "hacks". There are some carefully designed changes with >> input from multiple people, including yourself, and it copes as >> gracefully as it can with backwards compatibility requirements. > > You have comments from three different people, all experienced > hackers, disagreeing with this position; Who is the third person you speak of? Perhaps they will speak again if they wish to be heard. > Heikki and I have both > proposed alternate approaches. No, you've proposed rejecting the patch in favour of some future dream. We all agree on the future dream but it clearly happens in the future and that could easily be more than 1 release ahead. If you have comments that would allow a patch in this release, I am happy to hear them. I hear only two people seeking to reject a patch that adds value for users. Quite frankly, the comments about flag bits are ridiculous and bogus. > I'm not sure that we're at a point > where we can say that we know what the best solution is, but I think > it is clear that there's enough concern about this that you ought not > to be denying that there is a problem. There are some weird cases that can cause problems. My wish is to resolve those so everyone is happy. If those weird cases are simply used as an excuse to reject, then I don't accept that and nor will our users. Of course, if there were a significant issue, it would be immediate rejection but there isn't. I'm trying to commit a useful patch in this release. If you'd like to work with me to find a solution acceptable to all, I'd be happy to include suggestions, just as I have already included comments from Heikki, Bruce, Noah and others. 
I do accept that Heikki now says that the code I added at his request is just a hack. Assuming I'm going to commit something in this release, what should it look like? At Heikki's request/idea I will be working on a 2-phase database level parameter that will give us checksumming on the whole database after a scan to enable it. That sounds like it will resolve the corner cases relatively cleanly. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 16-bit page checksums for 9.2
On Wed, Feb 29, 2012 at 5:44 PM, Robert Haas wrote: >>> You have comments from three different people, all experienced >>> hackers, disagreeing with this position; >> >> Who is the third person you speak of? Perhaps they will speak again if >> they wish to be heard. > > Tom Lane. It was the very first email posted in response to the very > first version of this patch you ever posted. Tom objected to not being able to tell which version a data block was. At length, we have discussed this on list and there is no issue. It is clear what page format a block has. Please ping him if you believe he has other rational objections to committing something and he isn't listening. I'm beginning to lose faith that objections are being raised at a rational level. It's not a panel game with points for clever answers, it's an engineering debate about how to add features real users want. And they do want, so let me solve the problems by agreeing something early enough to allow it to be implemented, rather than just discussing it until we run out of time. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 16-bit page checksums for 9.2
On Wed, Feb 29, 2012 at 12:26 PM, Simon Riggs wrote: > On Wed, Feb 29, 2012 at 4:41 PM, Robert Haas wrote: >> On Wed, Feb 29, 2012 at 11:24 AM, Simon Riggs wrote: >>> On Wed, Feb 29, 2012 at 3:46 PM, Heikki Linnakangas >>> wrote: >>> > Are you saying you would accept the patch if we had this? >>> I think I would still be uncomfortable with the hacks in the page header. >>> >>> There are no "hacks". There are some carefully designed changes with >>> input from multiple people, including yourself, and it copes as >>> gracefully as it can with backwards compatibility requirements. >> >> You have comments from three different people, all experienced >> hackers, disagreeing with this position; > > Who is the third person you speak of? Perhaps they will speak again if > they wish to be heard. Tom Lane. It was the very first email posted in response to the very first version of this patch you ever posted. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] LIST OWNED BY...
On 29 February 2012 17:16, Tom Lane wrote: > Thom Brown writes: >> So could we introduce either a command to show which objects are owned >> by a particular role, or allow a dry-run of DROP OWNED BY? > > It's always been possible to do that: > > begin; > drop owned by joe; > rollback; > > I believe this is already the recommended approach if you're concerned > about what DROP CASCADE will do. No, the cascade part is fine. It's the objects which won't cause a cascade that are an issue. Putting it in a transaction for rolling back doesn't help find out what it intends to drop. How can the user tell what the statement would drop (ignoring cascades)? -- Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] LIST OWNED BY...
Thom Brown writes: > So could we introduce either a command to show which objects are owned > by a particular role, or allow a dry-run of DROP OWNED BY? It's always been possible to do that: begin; drop owned by joe; rollback; I believe this is already the recommended approach if you're concerned about what DROP CASCADE will do. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] LIST OWNED BY...
Hi all, If someone wants to drop objects owned by a particular role, they'll use DROP OWNED BY role. However, the implications of this statement aren't easily known, and once you've run it, it's not communicated which objects were dropped. So could we introduce either a command to show which objects are owned by a particular role, or allow a dry-run of DROP OWNED BY? Also when DROP OWNED BY has been executed, shouldn't we get feedback about what it did? At the moment the only feedback provided is what objects the statement cascaded to. -- Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 16-bit page checksums for 9.2
On Wed, Feb 29, 2012 at 11:24 AM, Simon Riggs wrote: > On Wed, Feb 29, 2012 at 3:46 PM, Heikki Linnakangas > wrote: > >>> Are you saying you would accept the patch if we had this? > >> I think I would still be uncomfortable with the hacks in the page header. > > There are no "hacks". There are some carefully designed changes with > input from multiple people, including yourself, and it copes as > gracefully as it can with backwards compatibility requirements. You have comments from three different people, all experienced hackers, disagreeing with this position; Heikki and I have both proposed alternate approaches. I'm not sure that we're at a point where we can say that we know what the best solution is, but I think it is clear that there's enough concern about this that you ought not to be denying that there is a problem. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 16-bit page checksums for 9.2
On Wed, Feb 29, 2012 at 3:46 PM, Heikki Linnakangas wrote: >> Are you saying you would accept the patch if we had this? > I think I would still be uncomfortable with the hacks in the page header. There are no "hacks". There are some carefully designed changes with input from multiple people, including yourself, and it copes as gracefully as it can with backwards compatibility requirements. > Less so than in the current form - you wouldn't need a flag to indicate > whether the page has a valid checksum or not, which would clean it up quite > a bit - but still. I agree this would remove the reliance on a bit in the page header and so make it even more robust. I'll add the 2 phase enabling feature, making it happen at database level. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
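The two-phase, database-level enabling agreed here can be sketched as a small state machine (the state names and functions are made up for illustration; the patch's actual mechanism may differ): an intermediate state writes checksums on every page write but never verifies them, while a background scan rewrites all existing pages; only when that scan completes does verification begin.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative states for two-phase checksum enabling. */
typedef enum {
    CHECKSUMS_OFF,       /* neither written nor verified */
    CHECKSUMS_ENABLING,  /* written but not verified; cluster scan running */
    CHECKSUMS_ON         /* written and verified on read */
} ChecksumState;

/* Write checksums in both the enabling and on states. */
static bool checksum_should_write(ChecksumState s)
{
    return s != CHECKSUMS_OFF;
}

/* Verify only once every page is known to carry a checksum. */
static bool checksum_should_verify(ChecksumState s)
{
    return s == CHECKSUMS_ON;
}

/* Transition taken when the background scan finishes rewriting
 * every page of the database. */
static ChecksumState scan_finished(ChecksumState s)
{
    return s == CHECKSUMS_ENABLING ? CHECKSUMS_ON : s;
}
```

Because verification is deferred until the scan completes, no per-page "has a valid checksum" flag bit is needed, which is the cleanup Heikki points to.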
Re: [HACKERS] COPY with hints, rebirth
On Sat, Feb 25, 2012 at 6:58 PM, Simon Riggs wrote: > On Sat, Feb 25, 2012 at 6:24 PM, Kevin Grittner > wrote: >> Simon Riggs wrote: >> >>> This patch extends that and actually sets the tuple header flag as >>> HEAP_XMIN_COMMITTED during the load. >> >> Fantastic! > I think we could add that as an option on COPY, since "breaking MVCC" > in that way is only a bad thing if it happens accidentally without the > user's permission and knowledge - which they may wish to give in many > cases, such as reloading a database from a dump. I've coded up COPY FREEZE to do just that. This version doesn't require any changes to transaction machinery. But it is very effective at avoiding 4 out of the 5 writes you mention. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index a73b022..4912449 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -34,6 +34,7 @@ COPY { table_name [ (
 format_name
 OIDS [ boolean ]
+FREEZE [ boolean ]
 DELIMITER 'delimiter_character'
 NULL 'null_string'
 HEADER [ boolean ]
@@ -181,6 +182,23 @@ COPY { table_name [ (
+FREEZE
+
+
+ Specifies copying the data with rows already frozen, just as they
+ would be after running the VACUUM FREEZE command.
+ This only occurs if the table being loaded has been created in this
+ subtransaction, there are no cursors open and there are no older
+ snapshots held by this transaction.
+ Note that all sessions will immediately be able to see the data
+ if it has been successfully loaded, which specifically operates
+ outside of the normal definitions of MVCC. This is a performance
+ option intended for use only during initial data loads.
+
+
+
+
+
 DELIMITER
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index c910863..68534bf 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2050,7 +2050,13 @@ heap_prepare_insert(Relation relation, HeapTuple tup, TransactionId xid,
 	tup->t_data->t_infomask &= ~(HEAP_XACT_MASK);
 	tup->t_data->t_infomask2 &= ~(HEAP2_XACT_MASK);
 	tup->t_data->t_infomask |= HEAP_XMAX_INVALID;
-	HeapTupleHeaderSetXmin(tup->t_data, xid);
+	if (options & HEAP_INSERT_FREEZE)
+	{
+		HeapTupleHeaderSetXmin(tup->t_data, FrozenTransactionId);
+		tup->t_data->t_infomask |= HEAP_XMIN_COMMITTED;
+	}
+	else
+		HeapTupleHeaderSetXmin(tup->t_data, xid);
 	HeapTupleHeaderSetCmin(tup->t_data, cid);
 	HeapTupleHeaderSetXmax(tup->t_data, 0);		/* for cleanliness */
 	tup->t_tableOid = RelationGetRelid(relation);
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 110480f..ec0bed8 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -43,6 +43,7 @@
 #include "utils/builtins.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
+#include "utils/portal.h"
 #include "utils/rel.h"
 #include "utils/snapmgr.h"
@@ -108,6 +109,7 @@ typedef struct CopyStateData
 	char	   *filename;		/* filename, or NULL for STDIN/STDOUT */
 	bool		binary;			/* binary format? */
 	bool		oids;			/* include OIDs? */
+	bool		freeze;			/* freeze rows on loading? */
 	bool		csv_mode;		/* Comma Separated Value format? */
 	bool		header_line;	/* CSV header line? */
 	char	   *null_print;		/* NULL marker string (server encoding!) */
@@ -889,6 +891,14 @@ ProcessCopyOptions(CopyState cstate,
 				 errmsg("conflicting or redundant options")));
 		cstate->oids = defGetBoolean(defel);
 	}
+	else if (strcmp(defel->defname, "freeze") == 0)
+	{
+		if (cstate->freeze)
+			ereport(ERROR,
+					(errcode(ERRCODE_SYNTAX_ERROR),
+					 errmsg("conflicting or redundant options")));
+		cstate->freeze = defGetBoolean(defel);
+	}
 	else if (strcmp(defel->defname, "delimiter") == 0)
 	{
 		if (cstate->delim)
@@ -1922,6 +1932,11 @@ CopyFrom(CopyState cstate)
 		hi_options |= HEAP_INSERT_SKIP_FSM;
 		if (!XLogIsNeeded())
 			hi_options |= HEAP_INSERT_SKIP_WAL;
+
+		if (cstate->freeze &&
+			ThereAreNoPriorRegisteredSnapshots() &&
+			ThereAreNoReadyPortals())
+			hi_options |= HEAP_INSERT_FREEZE;
 	}
 	/*
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index d1ce2ab..cd2b243 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -2315,6 +2315,10 @@ copy_opt_item:
 				{
 					$$ = makeDefElem("oids", (Node *)makeInteger(TRUE));
 				}
+			| FREEZE
+				{
+					$$ = makeDefElem("freeze", (Node *)makeInteger(TRUE));
+				}
 			| DELIMITER opt_as Sconst
 				{
 					$$ = makeDefElem("delimiter", (Node *)makeString($3));
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index cfb73c1..24075db 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -1055,3 +1055,22 @@ pg_cursor(PG_FUNCTIO
Re: [HACKERS] swapcache-style cache?
On Mon, Feb 27, 2012 at 11:54 PM, Andrea Suisani wrote:
> On 02/28/2012 04:52 AM, Rob Wultsch wrote:
>> On Wed, Feb 22, 2012 at 2:31 PM, james wrote:
>>> Has anyone considered managing a system like the DragonFLY swapcache
>>> for a DBMS like PostgreSQL?
>>
>> https://www.facebook.com/note.php?note_id=388112370932
>
> in the same vein:
>
> http://bcache.evilpiepirate.org/
>
> from the main page:
>
> "Bcache is a patch for the Linux kernel to use SSDs to cache other block
> devices. It's analogous to L2Arc for ZFS, but Bcache also does writeback
> caching, and it's filesystem agnostic. It's designed to be switched on
> with a minimum of effort, and to work well without configuration on any
> setup. By default it won't cache sequential IO, just the random reads
> and writes that SSDs excel at. It's meant to be suitable for desktops,
> servers, high end storage arrays, and perhaps even embedded."
>
> it was submitted to linux kernel mailing list a bunch of time, the last
> one:
>
> https://lkml.org/lkml/2011/9/10/13
>
> Andrea

I am pretty sure I won't get fired (or screw up the IPO) by saying that I have a high opinion of Flashcache (at least within the fb environment).

Is anyone using bcache at scale?

-- Rob Wultsch wult...@gmail.com

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 16-bit page checksums for 9.2
On 29.02.2012 17:42, Simon Riggs wrote:
> On Wed, Feb 29, 2012 at 3:30 PM, Heikki Linnakangas wrote:
>> Surely it can be done online. You'll just need a third state between
>> off and on, where checksums are written but not verified, while the
>> cluster is scanned.
>
> Are you saying you would accept the patch if we had this?

I think I would still be uncomfortable with the hacks in the page header. Less so than in the current form - you wouldn't need a flag to indicate whether the page has a valid checksum or not, which would clean it up quite a bit - but still.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 16-bit page checksums for 9.2
On Wed, Feb 29, 2012 at 3:30 PM, Heikki Linnakangas wrote:
> Surely it can be done online. You'll just need a third state between off
> and on, where checksums are written but not verified, while the cluster
> is scanned.

Are you saying you would accept the patch if we had this?

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server
Shigeru Hanada writes:
> How about removing postgresql_fdw_validator from backend binary, and
> changing dblink to use contrib/postgresql_fdw's validator? It breaks
> some backward compatibility and requires contrib/postgresql_fdw to be
> installed before using contrib/dblink with foreign servers, but ISTM
> that it doesn't become so serious.

I don't think that creating such a dependency is acceptable. Even if we didn't mind the dependency, you said yourself that contrib/postgresql_fdw's validator will accept stuff that's not appropriate for dblink. If we don't think postgresql_fdw_validator belongs in core after all, we should just move it to dblink.

regards, tom lane

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 16-bit page checksums for 9.2
On 29.02.2012 17:01, Simon Riggs wrote:
> On Wed, Feb 29, 2012 at 2:40 PM, Heikki Linnakangas wrote:
>> On 22.02.2012 14:30, Simon Riggs wrote:
>>> Agreed. No reason to change a checksum unless we rewrite the block,
>>> no matter whether page_checksums is on or off.
>>
>> This can happen:
>>
>> 1. checksums are initially enabled. A page is written, with a correct
>> checksum.
>> 2. checksums turned off.
>> 3. A hint bit is set on the page.
>> 4. While the page is being written out, someone pulls the power cord,
>> and you get a torn write. The hint bit change made it to disk, but the
>> clearing of the checksum in the page header did not.
>> 5. Sometime after restart, checksums are turned back on.
>>
>> The page now has an incorrect checksum on it. The next time it's read,
>> you get a checksum error.
>
> Yes, you will. And you'll get a checksum error because the block no
> longer passes. So an error should be reported. We can and should
> document that turning this on/off/on can cause problems. Hopefully
> crashing isn't that common a situation.

Hopefully not, but then again, you get interested in fiddling with this setting when you do experience crashes. This feature needs to be trustworthy in the face of crashes.

>> I'm pretty uncomfortable with this idea of having a flag on the page
>> itself to indicate whether it has a checksum or not. No matter how many
>> bits we use for that flag. You can never be quite sure that all your
>> data is covered by the checksum, and there's a lot of room for subtle
>> bugs like the above, where a page is reported as corrupt when it isn't,
>> or vice versa.
>
> That is necessary to allow upgrade. It's not there for any other reason.

Understood. I'm uncomfortable with it regardless of why it's there.

>> This thing needs to be reliable and robust. The purpose of a checksum
>> is to have an extra sanity check, to detect faulty hardware. If it's
>> complicated, whenever you get a checksum mismatch, you'll be wondering
>> if you have broken hardware or if you just bumped on a PostgreSQL bug.
>> I think you need a flag in pg_control or somewhere to indicate whether
>> checksums are currently enabled or disabled, and a mechanism to scan
>> and rewrite all the pages with checksums, before they are verified.
>
> That would require massive downtime, so again, it has been ruled out
> for practicality.

Surely it can be done online. You'll just need a third state between off and on, where checksums are written but not verified, while the cluster is scanned.

>> I've said this before, but I still don't like the hacks with the
>> version number in the page header. Even if it works, I would much
>> prefer the straightforward option of extending the page header for the
>> new field. Yes, it means you have to deal with pg_upgrade, but it's a
>> hurdle we'll have to jump at some point anyway.
>
> What you suggest might happen in the next release, or maybe longer.
> There may be things that block it completely, so it might never happen.
> My personal opinion is that it is not possible to make further block
> format changes until we have a fully online upgrade process, otherwise
> we block people from upgrading - not everybody can take their site down
> to run pg_upgrade. I plan to work on that, but it may not happen for
> 9.3;

Yep, we should bite the bullet and work on that.

> perhaps you will object to that also when it comes.

Heh, quite possible :-). But only if there's a reason. I do want to see a solution to this, I have a few page-format changing ideas myself that I'd like to implement at some point.

> This patch is very specifically something that makes the best of the
> situation, now, for those that want and need it. If you don't want it,
> you don't have to use it.

Whether I use it or not, I'll have to live with it in the source tree.

> But that shouldn't stop us giving it to the people that do want it. I'm
> hearing general interest and support for this feature from people that
> run their business on PostgreSQL.

If you ask someone "would you like to have checksums in PostgreSQL?", he'll say "sure". If you ask him "would you like to keep the PostgreSQL source tree as simple as possible, to make it easy for new developers to join the effort?", he'll say "yes". It's all about how you frame the question.

Even if you want to have checksums on data pages, it doesn't necessarily mean you want them so badly you can't wait another release or two for a cleaner solution, or that you'd be satisfied with the implementation proposed here.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 16-bit page checksums for 9.2
On Wed, Feb 29, 2012 at 2:40 PM, Heikki Linnakangas wrote:
> On 22.02.2012 14:30, Simon Riggs wrote:
>> On Wed, Feb 22, 2012 at 7:06 AM, Noah Misch wrote:
>>> On Sun, Feb 19, 2012 at 05:04:06PM -0500, Robert Haas wrote:
>>>> Another disadvantage of the current scheme is that there's no
>>>> particularly easy way to know that your whole cluster has checksums.
>>>> No matter how we implement checksums, you'll have to rewrite every
>>>> table in the cluster in order to get them fully turned on. But with
>>>> the current design, there's no easy way to know how much of the
>>>> cluster is actually checksummed. If you shut checksums off, they'll
>>>> linger until those pages are rewritten, and there's no easy way to
>>>> find the relations from which they need to be removed, either.
>>>
>>> I'm not seeing value in rewriting pages to remove checksums, as
>>> opposed to just ignoring those checksums going forward. Did you have
>>> a particular scenario in mind?
>>
>> Agreed. No reason to change a checksum unless we rewrite the block, no
>> matter whether page_checksums is on or off.
>
> This can happen:
>
> 1. checksums are initially enabled. A page is written, with a correct
> checksum.
> 2. checksums turned off.
> 3. A hint bit is set on the page.
> 4. While the page is being written out, someone pulls the power cord,
> and you get a torn write. The hint bit change made it to disk, but the
> clearing of the checksum in the page header did not.
> 5. Sometime after restart, checksums are turned back on.
>
> The page now has an incorrect checksum on it. The next time it's read,
> you get a checksum error.

Yes, you will. And you'll get a checksum error because the block no longer passes. So an error should be reported. We can and should document that turning this on/off/on can cause problems. Hopefully crashing isn't that common a situation. The production default would be "off". The default in the patch is "on" only for testing.

> I'm pretty uncomfortable with this idea of having a flag on the page
> itself to indicate whether it has a checksum or not. No matter how many
> bits we use for that flag. You can never be quite sure that all your
> data is covered by the checksum, and there's a lot of room for subtle
> bugs like the above, where a page is reported as corrupt when it isn't,
> or vice versa.

That is necessary to allow upgrade. It's not there for any other reason.

> This thing needs to be reliable and robust. The purpose of a checksum
> is to have an extra sanity check, to detect faulty hardware. If it's
> complicated, whenever you get a checksum mismatch, you'll be wondering
> if you have broken hardware or if you just bumped on a PostgreSQL bug.
> I think you need a flag in pg_control or somewhere to indicate whether
> checksums are currently enabled or disabled, and a mechanism to scan
> and rewrite all the pages with checksums, before they are verified.

That would require massive downtime, so again, it has been ruled out for practicality.

> I've said this before, but I still don't like the hacks with the
> version number in the page header. Even if it works, I would much
> prefer the straightforward option of extending the page header for the
> new field. Yes, it means you have to deal with pg_upgrade, but it's a
> hurdle we'll have to jump at some point anyway.

What you suggest might happen in the next release, or maybe longer. There may be things that block it completely, so it might never happen. My personal opinion is that it is not possible to make further block format changes until we have a fully online upgrade process, otherwise we block people from upgrading - not everybody can take their site down to run pg_upgrade. I plan to work on that, but it may not happen for 9.3; perhaps you will object to that also when it comes.

So we simply cannot rely on this "jam tomorrow" vision.

This patch is very specifically something that makes the best of the situation, now, for those that want and need it. If you don't want it, you don't have to use it. But that shouldn't stop us giving it to the people that do want it. I'm hearing general interest and support for this feature from people that run their business on PostgreSQL.

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI rw-conflicts and 2PC
Heikki Linnakangas wrote:
> On 23.02.2012 01:36, Jeff Davis wrote:
>> On Tue, 2012-02-14 at 19:32 -0500, Dan Ports wrote:
>>> Hmm, it occurs to me if we have to abort a transaction due to
>>> serialization failure involving a prepared transaction, we might
>>> want to include the prepared transaction's gid in the errdetail.
>>
>> I like this idea.
>
> +1. Anyone want to put together a patch?

Unless Dan claims it before I start the work, I'll do it.

-Kevin

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 16-bit page checksums for 9.2
On 22.02.2012 14:30, Simon Riggs wrote:
> On Wed, Feb 22, 2012 at 7:06 AM, Noah Misch wrote:
>> On Sun, Feb 19, 2012 at 05:04:06PM -0500, Robert Haas wrote:
>>> Another disadvantage of the current scheme is that there's no
>>> particularly easy way to know that your whole cluster has checksums.
>>> No matter how we implement checksums, you'll have to rewrite every
>>> table in the cluster in order to get them fully turned on. But with
>>> the current design, there's no easy way to know how much of the
>>> cluster is actually checksummed. If you shut checksums off, they'll
>>> linger until those pages are rewritten, and there's no easy way to
>>> find the relations from which they need to be removed, either.
>>
>> I'm not seeing value in rewriting pages to remove checksums, as
>> opposed to just ignoring those checksums going forward. Did you have a
>> particular scenario in mind?
>
> Agreed. No reason to change a checksum unless we rewrite the block, no
> matter whether page_checksums is on or off.

This can happen:

1. checksums are initially enabled. A page is written, with a correct checksum.
2. checksums turned off.
3. A hint bit is set on the page.
4. While the page is being written out, someone pulls the power cord, and you get a torn write. The hint bit change made it to disk, but the clearing of the checksum in the page header did not.
5. Sometime after restart, checksums are turned back on.

The page now has an incorrect checksum on it. The next time it's read, you get a checksum error.

I'm pretty uncomfortable with this idea of having a flag on the page itself to indicate whether it has a checksum or not. No matter how many bits we use for that flag. You can never be quite sure that all your data is covered by the checksum, and there's a lot of room for subtle bugs like the above, where a page is reported as corrupt when it isn't, or vice versa.

This thing needs to be reliable and robust. The purpose of a checksum is to have an extra sanity check, to detect faulty hardware. If it's complicated, whenever you get a checksum mismatch, you'll be wondering if you have broken hardware or if you just bumped on a PostgreSQL bug. I think you need a flag in pg_control or somewhere to indicate whether checksums are currently enabled or disabled, and a mechanism to scan and rewrite all the pages with checksums, before they are verified.

I've said this before, but I still don't like the hacks with the version number in the page header. Even if it works, I would much prefer the straightforward option of extending the page header for the new field. Yes, it means you have to deal with pg_upgrade, but it's a hurdle we'll have to jump at some point anyway.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
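[Editorial note: the five-step torn-write scenario above can be sketched with a toy page model. This is purely an illustration of why a per-page "has checksum" flag is untrustworthy after a crash; it is not PostgreSQL's page layout, and the additive checksum is deliberately trivial.]

```python
# Toy page model: byte 0 is a "page has a checksum" flag, bytes 1-2 hold a
# 16-bit checksum, the rest is data. Illustrative only -- not PostgreSQL code.
PAGE_SIZE = 64

def checksum(page: bytearray) -> int:
    # Covers everything except the checksum field itself; a deliberately
    # trivial additive checksum, just for demonstration.
    return (page[0] + sum(page[3:])) & 0xFFFF

def write_with_checksum(page: bytearray) -> None:
    page[0] = 1                                   # flag: checksum present
    page[1:3] = checksum(page).to_bytes(2, "big")

def verify(page: bytearray) -> bool:
    if page[0] == 0:                              # flag says: no checksum
        return True
    return int.from_bytes(page[1:3], "big") == checksum(page)

page = bytearray(PAGE_SIZE)
write_with_checksum(page)                         # 1. checksums on, page written
assert verify(page)

# 2.-4. checksums are turned off, a hint bit is set, and a torn write
# persists the data change but not the cleared flag/checksum in the header.
page[10] |= 0x01                                  # hint-bit change reaches disk
# (the header update clearing page[0] is lost in the torn write)

# 5. checksums re-enabled: the stale flag makes an undamaged page look corrupt.
assert not verify(page)
print("false checksum failure on an undamaged page")
```

The point of the sketch is Heikki's objection: with the flag on the page itself, verification cannot distinguish "stale flag after a crash" from real corruption, whereas a cluster-wide state in pg_control sidesteps that ambiguity.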
Re: [HACKERS] SSI rw-conflicts and 2PC
On 23.02.2012 01:36, Jeff Davis wrote:
> On Tue, 2012-02-14 at 19:32 -0500, Dan Ports wrote:
>> On Tue, Feb 14, 2012 at 09:27:58AM -0600, Kevin Grittner wrote:
>>> Heikki Linnakangas wrote:
>>>> On 14.02.2012 04:57, Dan Ports wrote:
>>>>> The easiest answer would be to just treat every prepared transaction
>>>>> found during recovery as though it had a conflict in and out. This
>>>>> is roughly a one-line change, and it's certainly safe.
>>>>
>>>> +1.
>>>
>>> I don't even see this as much of a problem. Prepared transactions
>>> hanging around for arbitrary periods of time cause all kinds of
>>> problems already. Those using them need to be careful to resolve them
>>> quickly -- and if there's a crash involved, I think it's reasonable to
>>> say they should be resolved before continuing normal online
>>> operations.

Committed this now. (sorry for the delay)

>> Hmm, it occurs to me if we have to abort a transaction due to
>> serialization failure involving a prepared transaction, we might want
>> to include the prepared transaction's gid in the errdetail.
>
> I like this idea.

+1. Anyone want to put together a patch?

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] review: CHECK FUNCTION statement
Hello

2012/2/28 Alvaro Herrera :
>
> In gram.y we have a new check_option_list nonterminal. This is mostly
> identical to explain_option_list, except that the option args do not
> take a NumericOnly (only opt_boolean_or_string and empty). I wonder if
> it's really worthwhile having a bunch of separate productions for this;
> how about we just use the existing explain_option_list instead and get
> rid of those extra productions?
>
> elog() is used in many user-facing messages (errors and notices). Full
> ereport() calls should be used there, so that messages are marked for
> translations and so on.

I replaced elog with ereport for all non-internal errors.

> Does the patched pg_dump work with older servers?

It should.

> I don't like CheckFunction being declared in defrem.h. It seems
> completely out of place there. I don't see any better place though, so
> I'm thinking maybe we should have a new header file for it (say
> commands/functions.h; but we already have executor/functions.h so
> perhaps it's better to find another name). This addition means that
> there's a distressingly large number of .c files that are now getting
> dest.h, which was previously pretty confined.

Please fix it as you wish.

Regards

Pavel

> --
> Álvaro Herrera
> The PostgreSQL Company - Command Prompt, Inc.
> PostgreSQL Replication, Consulting, Custom Development, 24x7 support

check_function-2012-02-29-1.diff.gz Description: GNU Zip compressed data

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Unnecessary WAL archiving after failover
Hi,

In streaming replication, after failover, the new master might have lots of un-applied WAL files with the old timeline ID. They are the WAL files which were recycled as future ones while the server was running as a standby. Since they will never be used later, they don't need to be archived after failover. But since they have neither a .ready nor a .done file in archive_status, checkpoints after failover newly create .ready files for them, and then finally they are archived, which might cause a disk I/O spike in both WAL and archive storage.

To avoid this problem, I think that un-applied WAL files with the old timeline ID should be marked as already-archived and recycled immediately at the end of recovery. Thoughts?

Regards,

-- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
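[Editorial note: at the filesystem level, the proposal amounts to creating a .done status file for every leftover segment carrying the old timeline ID at the end of recovery, so the archiver never sees a .ready for them. A minimal sketch, with invented function names and not the actual xlog.c implementation:]

```python
import os
import tempfile

def mark_old_timeline_done(pg_wal: str, old_tli: int) -> list:
    """Create archive_status/<segment>.done for every WAL segment whose
    timeline matches old_tli, so later checkpoints never emit .ready for it.
    Illustrative sketch only."""
    status_dir = os.path.join(pg_wal, "archive_status")
    os.makedirs(status_dir, exist_ok=True)
    marked = []
    for seg in os.listdir(pg_wal):
        # WAL segment names are 24 hex digits: the first 8 are the timeline.
        if len(seg) == 24 and all(c in "0123456789ABCDEF" for c in seg):
            if int(seg[:8], 16) == old_tli:
                open(os.path.join(status_dir, seg + ".done"), "w").close()
                marked.append(seg)
    return sorted(marked)

# Demo: two segments on old timeline 1, one already on the new timeline 2.
wal = tempfile.mkdtemp()
for name in ("000000010000000000000003", "000000010000000000000004",
             "000000020000000000000004"):
    open(os.path.join(wal, name), "w").close()

print(mark_old_timeline_done(wal, old_tli=1))
# only the timeline-1 segments get .done markers
```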
Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server
2012/2/29 Shigeru Hanada :
> (2012/02/29 4:07), Peter Eisentraut wrote:
>> Let's at least be clear about the reasons here. The fact that
>> postgresql_fdw_validator exists means (a) there is a possible naming
>> conflict that has not been discussed yet, and/or (b) the name is
>> already settled and we need to think of a way to make
>> postgresql_fdw_validator work with the new actual FDW.
>
> We can avoid conflict of name by using postgres_fdw or pgsql_fdw, but it
> doesn't solve fundamental issue. ISTM that maintaining two similar
> validators is wasteful and confusing, and FDW for PostgreSQL should be
> just one, at least in the context of core distribution.
>
> Current pgsql_fdw_validator accepts every FDW options which is accepted
> by postgresql_fdw_validator, and additionally accepts FDW specific
> options such as fetch_count. So, if dblink can ignore unknown FDW
> options, pgsql_fdw_validator can be used to create foreign servers for
> dblink connection.
>
> How about removing postgresql_fdw_validator from backend binary, and
> changing dblink to use contrib/postgresql_fdw's validator? It breaks
> some backward compatibility and requires contrib/postgresql_fdw to be
> installed before using contrib/dblink with foreign servers, but ISTM
> that it doesn't become so serious.

+1

pavel stehule

> Of course dblink is still available by itself if user specifies
> connection information with "key = value" string, not with server name.
>
> One concern is how to avoid duplicated list of valid libpq options.
> Adding new libpq function, like below, which returns 1 when given name
> is a valid libpq option would help.
>
> int PQisValidOption(const char *keyword);
>
> --
> Shigeru Hanada

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server
(2012/02/29 4:07), Peter Eisentraut wrote:
> Let's at least be clear about the reasons here. The fact that
> postgresql_fdw_validator exists means (a) there is a possible naming
> conflict that has not been discussed yet, and/or (b) the name is already
> settled and we need to think of a way to make postgresql_fdw_validator
> work with the new actual FDW.

We can avoid conflict of name by using postgres_fdw or pgsql_fdw, but it doesn't solve fundamental issue. ISTM that maintaining two similar validators is wasteful and confusing, and FDW for PostgreSQL should be just one, at least in the context of core distribution.

Current pgsql_fdw_validator accepts every FDW options which is accepted by postgresql_fdw_validator, and additionally accepts FDW specific options such as fetch_count. So, if dblink can ignore unknown FDW options, pgsql_fdw_validator can be used to create foreign servers for dblink connection.

How about removing postgresql_fdw_validator from backend binary, and changing dblink to use contrib/postgresql_fdw's validator? It breaks some backward compatibility and requires contrib/postgresql_fdw to be installed before using contrib/dblink with foreign servers, but ISTM that it doesn't become so serious.

Of course dblink is still available by itself if user specifies connection information with "key = value" string, not with server name.

One concern is how to avoid duplicated list of valid libpq options. Adding new libpq function, like below, which returns 1 when given name is a valid libpq option would help.

int PQisValidOption(const char *keyword);

-- Shigeru Hanada

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
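[Editorial note: the duplication problem is that every validator must carry its own copy of libpq's option list. The proposed PQisValidOption would reduce that to a membership test against libpq's own table, as this sketch illustrates. The option names shown are a small illustrative subset, not the full libpq list, and the function names are invented:]

```python
# Sketch of the idea behind the proposed PQisValidOption(keyword): a lookup
# against libpq's own option table, so callers need not duplicate the list.
# LIBPQ_OPTIONS is a small illustrative subset of real libpq options.
LIBPQ_OPTIONS = frozenset({
    "host", "hostaddr", "port", "dbname", "user", "password",
    "connect_timeout", "sslmode", "application_name",
})

def pq_is_valid_option(keyword: str) -> bool:
    return keyword in LIBPQ_OPTIONS

# An FDW validator can then accept libpq options plus its own extras
# (such as the fetch_count option mentioned above) without a second copy:
FDW_EXTRA_OPTIONS = {"fetch_count"}

def validate(options: dict) -> list:
    """Return the option names that neither libpq nor the FDW recognizes."""
    return [k for k in options
            if not (pq_is_valid_option(k) or k in FDW_EXTRA_OPTIONS)]

print(validate({"host": "db1", "fetch_count": "100", "bogus": "x"}))
# -> ['bogus']
```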
Re: [HACKERS] swapcache-style cache?
On 02/28/2012 08:54 AM, Andrea Suisani wrote:
> On 02/28/2012 04:52 AM, Rob Wultsch wrote:
>> On Wed, Feb 22, 2012 at 2:31 PM, james wrote:
>>> Has anyone considered managing a system like the DragonFLY swapcache
>>> for a DBMS like PostgreSQL?
>>
>> https://www.facebook.com/note.php?note_id=388112370932
>
> in the same vein:
>
> http://bcache.evilpiepirate.org/
>
> [cut]
>
> it was submitted to the linux kernel mailing list a number of times,
> most recently:
>
> https://lkml.org/lkml/2011/9/10/13

forgot to mention another good write-up:

https://lwn.net/Articles/394672/

Andrea

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)
On Mon, Feb 27, 2012 at 4:26 AM, Peter Geoghegan wrote:
> This does not appear to have any user-visible effect on caret position
> for all variations in coercion syntax, while giving me everything that
> I need. I had assumed that we were relying on things being this way,
> but apparently this is not the case. The system is correctly blaming
> the coercion token when it finds the coercion is at fault, and the
> const token when it finds the Const node at fault, just as it did
> before. So this looks like a case of removing what amounts to dead
> code.

To shed some light on that hypothesis, attached is a patch whereby I use 'semantic analysis by compiler error' to show the extent of the reach of the changes by renaming (codebase-wide) the Const node's location symbol. The scope whereby the error token will change position is small and amenable to analysis. I don't see a problem, nor wide-reaching consequences. As Peter says: probably dead code.

Note that the cancellation of the error position happens very soon, after an invocation of stringTypeDatum (on two sides of a branch). Pre- and post-patch, it puts the caret at the beginning of the constant string, even in the event there is a failure to parse it properly to the destined type.

-- fdr

Straw-man-to-show-the-effects-of-the-change-to-const.patch Description: Binary data

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parameterized-path cost comparisons need some work
On Tue, Feb 28, 2012 at 10:35 PM, Tom Lane wrote:
> The flaw in this logic, of course, is that the seqscan might be cheaper
> than the parameterized indexscan, but *it produces a whole lot more
> rows*, meaning that any subsequent join will be a lot more expensive.
> Previously add_path didn't have to worry about that, because all
> ordinary paths for a given relation produce the same number of rows
> (and we studiously kept things like inner indexscan paths out of
> add_path's view of the world).
>
> The most obvious thing to do about this is to consider that one path can
> dominate another on cost only if it is both cheaper *and* produces no
> more rows. But I'm concerned about the cost of inserting yet another
> test condition into add_path, which is slow enough already. Has anyone
> got an idea for a different formulation that would avoid that?

It seems clear that we shouldn't be making that decision at that point. It would be better to default towards processing fewer rows initially and then swoop in later to improve decision making on larger plans.

Can't we save the SeqScan costs at every node, then re-add SeqScan plans as a post-processing step iff the index/nested-loop plans appear costly? So have an additional post-processing step that only cuts in with larger plans.

Seqscan plans are bad for many reasons, such as pushing data out of cache, making the result more sensitive to growing data volumes or selectivity mistakes, as well as producing confusing stats for people trying to add the right indexes.

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
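[Editorial note: Tom's "cheaper AND no more rows" condition turns path domination into a two-dimensional Pareto filter. A minimal sketch of that rule, keeping only cost and row count; the real add_path also compares startup cost, sort order, and more:]

```python
from dataclasses import dataclass

@dataclass
class Path:
    name: str
    cost: float
    rows: float

def add_path(paths: list, new: Path) -> list:
    """Keep a path only if no other path is both cheaper and produces no
    more rows; drop existing paths the newcomer dominates. Sketch only."""
    for old in paths:
        if old.cost <= new.cost and old.rows <= new.rows:
            return paths                      # new path is dominated: reject
    kept = [p for p in paths
            if not (new.cost <= p.cost and new.rows <= p.rows)]
    return kept + [new]

paths: list = []
paths = add_path(paths, Path("seqscan", cost=100.0, rows=10000))
# Cheaper on cost alone, so the old one-dimensional rule would have rejected
# it, despite it producing far fewer rows for the join above:
paths = add_path(paths, Path("param indexscan", cost=120.0, rows=50))
print([p.name for p in paths])   # both survive: neither dominates the other
```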