Re: [HACKERS] new autovacuum criterion for visible pages
On Sun, Jan 22, 2017 at 3:27 AM, Stephen Frostwrote: > All, > > * Simon Riggs (si...@2ndquadrant.com) wrote: >> On 12 August 2016 at 01:01, Tom Lane wrote: >> > Michael Paquier writes: >> >> In short, autovacuum will need to scan by itself the VM of each >> >> relation and decide based on that. >> > >> > That seems like a worthwhile approach to pursue. The VM is supposed to be >> > small, and if you're worried it isn't, you could sample a few pages of it. >> > I do not think any of the ideas proposed so far for tracking the >> > visibility percentage on-the-fly are very tenable. >> >> Sounds good, but we can't scan the VM for every table, every minute. >> We need to record something that will tell us how many VM bits have >> been cleared, which will then allow autovac to do a simple SELECT to >> decide what needs vacuuming. >> >> Vik's proposal to keep track of the rows inserted seems like the best >> approach to this issue. > > I tend to agree with Simon on this. I'm also worried that an approach > which was based off of a metric like "% of table not all-visible" might > result in VACUUM running over and over on a table because it isn't able > to actually make any progress towards improving that percentage. We'd > have to have some kind of "cool-off" period or something. > > Tracking INSERTs and then kicking off a VACUUM based on them seems to > address that in a natural way and also seems like something that users > would generally understand as it's very similar to what we do for > UPDATEs and DELETEs. > > Tracking the INSERTs as a reason to VACUUM is also very natural when you > consider the need to update BRIN indexes. > Another possible advantage of tracking INSERTs is for hash indexes where after split we need to remove tuples from buckets that underwent split recently. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Protect syscache from bloating with negative cache entries
Jim Nasbywrites: > The other (possibly naive) question I have is how useful negative > entries really are? Will Postgres regularly incur negative lookups, or > will these only happen due to user activity? It varies depending on the particular syscache, but in at least some of them, negative cache entries are critical for performance. See for example RelnameGetRelid(), which basically does a RELNAMENSP cache lookup for each schema down the search path until it finds a match. For any user table name with the standard search_path, there's a guaranteed failure in pg_catalog before you can hope to find a match. If we don't have negative cache entries, then *every invocation of this function has to go to disk* (or at least to shared buffers). It's possible that we could revise all our lookup patterns to avoid this sort of thing. But I don't have much faith in that always being possible, and exactly none that we won't introduce new lookup patterns that need it in future. I spent some time, for instance, wondering if RelnameGetRelid could use a SearchSysCacheList lookup instead, doing the lookup on table name only and then inspecting the whole list to see which entry is frontmost according to the current search path. But that has performance failure modes of its own, for example if you have identical table names in a boatload of different schemas. We do it that way for some other cases such as function lookups, but I think it's much less likely that people have identical function names in N schemas than that they have identical table names in N schemas. If you want to poke into this for particular test scenarios, building with CATCACHE_STATS defined will yield a bunch of numbers dumped to the postmaster log at each backend exit. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] remote_apply for logical replication?
On Sun, Jan 22, 2017 at 4:06 AM, Petr Jelinekwrote: > Because we don't have intermediate steps in logical replication, writes > happen immediately and in whole transactions so whatever was received by > the time we send reply is already written (it might not necessarily be > that way forever so the code may become more complicated eventually). Ah. Thanks for the explanation. Very cool work, congratulations! -- Thomas Munro http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Protect syscache from bloating with negative cache entries
On 12/26/16 2:31 AM, Kyotaro HORIGUCHI wrote: The points of discussion are the following, I think. 1. The first patch seems working well. It costs the time to scan the whole of a catcache that have negative entries for other reloids. However, such negative entries are created by rather unusual usages. Accesing to undefined columns, and accessing columns on which no statistics have created. The whole-catcache scan occurs on ATTNAME, ATTNUM and STATRELATTINH for every invalidation of a relcache entry. I took a look at this. It looks sane, though I've got a few minor comment tweaks: + * Remove negative cache tuples maching a partial key. s/maching/matching/ +/* searching with a paritial key needs scanning the whole cache */ s/needs/means/ + * a negative cache entry cannot be referenced so we can remove s/referenced/referenced,/ I was wondering if there's a way to test the performance impact of deleting negative entries. 2. The second patch also works, but flushing negative entries by hash values is inefficient. It scans the bucket corresponding to given hash value for OIDs, then flushing negative entries iterating over all the collected OIDs. So this costs more time than 1 and flushes involving entries that is not necessary to be removed. If this feature is valuable but such side effects are not acceptable, new invalidation category based on cacheid-oid pair would be needed. I glanced at this and it looks sane. Didn't go any farther since this one's pretty up in the air. ISTM it'd be better to do some kind of aging instead of patch 2. The other (possibly naive) question I have is how useful negative entries really are? Will Postgres regularly incur negative lookups, or will these only happen due to user activity? I can't think of a case where an app would need to depend on fast negative lookup (in other words, it should be considered a bug in the app). I can see where getting rid of them completely might be problematic, but maybe we can just keep a relatively small number of them around. I'm thinking a simple LRU list of X number of negative entries; when that fills you reuse the oldest one. You'd have to pay the LRU maintenance cost on every negative hit, but if those shouldn't be that common it shouldn't be bad. That might well necessitate another GUC, but it seems a lot simpler than most of the other ideas. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksums by default?
On 1/21/17 10:02 AM, Tom Lane wrote: Magnus Haganderwrites: Is it time to enable checksums by default, and give initdb a switch to turn it off instead? Have we seen *even one* report of checksums catching problems in a useful way? I've experienced multiple corruption events that were ultimately tracked down to storage problems. These first manifested as corruption to PG page structures, and I have no way to know how much user data might have been corrupted. Obviously I can't prove that checksums would have caught all of those errors, but there's one massive benefit they would have given us: we'd know that Postgres was not the source of the corruption. That would have eliminated a lot of guesswork. So that we can stop guessing about performance, I did a simple benchmark on my laptop. Stock config except for synchronous_commit=off and checkpoint_timeout=1min, with the idea being that we want to test flushing buffers. Both databases initialized with scale=50, or 800MB. shared_buffers was 128MB. After a couple runs in each database to create dead tuples, runs were performed with pgbench -rT 300 & sleep 2 && PGPORT=5444 pgbench -rT 300 No checksumschecksums 818 tps 758 tps 821 tps 877 tps 879 tps 799 tps 739 tps 808 tps 867 tps 845 tps 854 tps 831 tps Looking at per-statement latency, the variation is always in the update to branches. I'll try to get some sequential runs tonight. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksums by default?
Tom, * Tom Lane (t...@sss.pgh.pa.us) wrote: > Not at all; I just think that it's not clear that they are a net win > for the average user, and so I'm unconvinced that turning them on by > default is a good idea. I could be convinced otherwise by suitable > evidence. What I'm objecting to is turning them on without making > any effort to collect such evidence. As it happens, rather unexpectedly, we had evidence of a bit-flip happening on a 9.1.24 install show up on IRC today: https://paste.fedoraproject.org/533186/85041907/ What that shows is the output from: select * from heap_page_items(get_raw_page('theirtable', 4585)); With a row whose t_ctid is (134222313,18). Looking at the base-2 format of 4585 and 134222313: 0001 0001 1110 1001 1000 0001 0001 1110 1001 There appears to be other issues with the page also but this was discovered through a pg_dump where the user was trying to get data out to upgrade to something more recent. Not clear if the errors on the page all happened at once or if it was over time, of course, but it's at least possible that this particular area of storage has been degrading over time and that identifying an error when it was just the bit-flip in the t_ctid (thanks to a checksum) might have allowed the user to pull out the data. During the discussion on IRC, someone else mentioned a similar problem which was due to not having ECC memory in their server. As discussed, that might mean that we wouldn't have caught the corruption since we only calculate the checksum on the way out of shared_buffers, but it's also entirely possible that we would have because it could have happened in kernel space too. We're still working with the user to see if we can get their data out, but that looks like pretty good evidence that maybe we should care about enabling checksums to catch corruption before it causes undo pain for our users. The raw page is here: https://paste.fedoraproject.org/533195/48504224/ if anyone is curious to look at it further (we're looking through it too). Thanks! Stephen signature.asc Description: Digital signature
Re: Updating MATERIALIZED VIEWs (Re: [HACKERS] delta relations in AFTER triggers)
On 1/20/17 5:38 PM, Nico Williams wrote: If these triggers could be automatically generated, that sure would be nice, but some control would be needed over when to update the MV vs. mark it as needing a refresh. FWIW, pg_classy[1], which is still a work in progress, would allow for that. The idea is that you define a code template which you can then call with arguments (such as the name of a matview table), and those arguments get put into the template before executing the resulting SQL. Most of the basic framework for that is in place; I just need to finish code that will allow for the extension to track arbitrary database objects that were created. 1: https://github.com/decibel/pg_classy/blob/master/doc/pg_classy.asc -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksums by default?
Thomas, * Thomas Munro (thomas.mu...@enterprisedb.com) wrote: > On Sun, Jan 22, 2017 at 7:37 AM, Stephen Frostwrote: > > Exactly, and that awareness will allow a user to prevent further data > > loss or corruption. Slow corruption over time is a very much known and > > accepted real-world case that people do experience, as well as bit > > flipping enough for someone to write a not-that-old blog post about > > them: > > > > https://blogs.oracle.com/ksplice/entry/attack_of_the_cosmic_rays1 > > I have no doubt that low frequency cosmic ray bit flipping in main > memory is a real phenomenon, having worked at a company that runs > enough computers to see ECC messages in kernel logs on a regular > basis. But our checksums can't actually help with that, can they? We > verify checksums on the way into shared buffers, and compute new > checksums on the way back to disk, so any bit-flipping that happens in > between those two times -- while your data is a sitting duck in shared > buffers -- would not be detected by this scheme. That's ECC's job. Ideally, everyone's gonna run with ECC and have that handle it. That said, there's still the possibility that the bit is flipped after we've calculated the checksum but before it's hit disk, or before it's actually been written out to the storage system underneath. You're correct that if the bit is flipped before we go to write the buffer out that we won't detect that case, but there's not much help for that without compromising performance more than even I'd be ok with. > So the risk being defended against is corruption while in the disk > subsystem, whatever that might consist of (and certainly that includes > more buffers in strange places that themselves are susceptible to > memory faults etc, and hopefully they have their own error detection > and correction). Certainly the ZFS community thinks that pile of > turtles can't be trusted and that extra checks are worthwhile, and you > can find anecdotal reports and studies about filesystem corruption > being detected, for example in the links from > https://en.wikipedia.org/wiki/ZFS#Data_integrity . Agreed. wrt your point above, if you consider "everything that happens after we have passed over a given bit to incorporate its value into our CRC" to be "disk subsystem" then I think we're in agreement on this point and that there's a bunch of stuff that happens then which could be caught by checking our CRC. I've seen some other funny things out there in the wild too though, like a page suddenly being half-zero'd because the virtualization system ran out of memory and barfed. I realize that our CRC might not catch such a case if it's in our shared buffers before we write the page out, but if it happens in the kernel's write buffer after we pushed it from shared buffers then our CRC would detect it. Which actually brings up another point when it comes to if CRCs save from data-loss: they certainly do if you catch it happening before you have expired the WAL and the WAL data is clean. > So +1 for enabling it by default. I always turn that on. Ditto. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Checksums by default?
On Sun, Jan 22, 2017 at 7:37 AM, Stephen Frostwrote: > Exactly, and that awareness will allow a user to prevent further data > loss or corruption. Slow corruption over time is a very much known and > accepted real-world case that people do experience, as well as bit > flipping enough for someone to write a not-that-old blog post about > them: > > https://blogs.oracle.com/ksplice/entry/attack_of_the_cosmic_rays1 I have no doubt that low frequency cosmic ray bit flipping in main memory is a real phenomenon, having worked at a company that runs enough computers to see ECC messages in kernel logs on a regular basis. But our checksums can't actually help with that, can they? We verify checksums on the way into shared buffers, and compute new checksums on the way back to disk, so any bit-flipping that happens in between those two times -- while your data is a sitting duck in shared buffers -- would not be detected by this scheme. That's ECC's job. So the risk being defended against is corruption while in the disk subsystem, whatever that might consist of (and certainly that includes more buffers in strange places that themselves are susceptible to memory faults etc, and hopefully they have their own error detection and correction). Certainly the ZFS community thinks that pile of turtles can't be trusted and that extra checks are worthwhile, and you can find anecdotal reports and studies about filesystem corruption being detected, for example in the links from https://en.wikipedia.org/wiki/ZFS#Data_integrity . So +1 for enabling it by default. I always turn that on. -- Thomas Munro http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] new autovacuum criterion for visible pages
All, * Simon Riggs (si...@2ndquadrant.com) wrote: > On 12 August 2016 at 01:01, Tom Lanewrote: > > Michael Paquier writes: > >> In short, autovacuum will need to scan by itself the VM of each > >> relation and decide based on that. > > > > That seems like a worthwhile approach to pursue. The VM is supposed to be > > small, and if you're worried it isn't, you could sample a few pages of it. > > I do not think any of the ideas proposed so far for tracking the > > visibility percentage on-the-fly are very tenable. > > Sounds good, but we can't scan the VM for every table, every minute. > We need to record something that will tell us how many VM bits have > been cleared, which will then allow autovac to do a simple SELECT to > decide what needs vacuuming. > > Vik's proposal to keep track of the rows inserted seems like the best > approach to this issue. I tend to agree with Simon on this. I'm also worried that an approach which was based off of a metric like "% of table not all-visible" might result in VACUUM running over and over on a table because it isn't able to actually make any progress towards improving that percentage. We'd have to have some kind of "cool-off" period or something. Tracking INSERTs and then kicking off a VACUUM based on them seems to address that in a natural way and also seems like something that users would generally understand as it's very similar to what we do for UPDATEs and DELETEs. Tracking the INSERTs as a reason to VACUUM is also very natural when you consider the need to update BRIN indexes. I am a bit worried that if we focus just on if the VM needs to be updated or not that we might miss out on cases where we need to VACUUM because the BRIN indexes are out of date. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Checksums by default?
On Sat, Jan 21, 2017 at 10:16 PM, Michael Banckwrote: > On Sat, Jan 21, 2017 at 09:02:25PM +0200, Ants Aasma wrote: >> On Sat, Jan 21, 2017 at 6:41 PM, Andreas Karlsson wrote: >> > It might be worth looking into using the CRC CPU instruction to reduce this >> > overhead, like we do for the WAL checksums. Since that is a different >> > algorithm it would be a compatibility break and we would need to support >> > the >> > old algorithm for upgraded clusters.. >> >> We looked at that when picking the algorithm. At that point it seemed >> that CRC CPU instructions were not universal enough to rely on them. >> The algorithm we ended up on was designed to be fast on SIMD hardware. >> Unfortunately on x86-64 that required SSE4.1 integer instructions, so >> with default compiles there is a lot of performance left on table. A >> low hanging fruit would be to do CPU detection like the CRC case and >> enable a SSE4.1 optimized variant when those instructions are >> available. IIRC it was actually a lot faster than the naive hardware >> CRC that is used for WAL and about on par with interleaved CRC. > > I am afraid that won't fly with most end-user packages, cause > distributions can't just built packages on their newest machine and then > users get SIGILL or whatever cause their 2014 server doesn't have that > instruction, or would they still work? > > So you would have to do runtime detection of the CPU features, and use > the appropriate code if they are available. Which also makes regression > testing harder, as not all codepaths would get exercised all the time. Runtime detection is exactly what I had in mind. The code path would also be the same as at least the two most important compilers only need a compilation flag change. And the required instruction was introduced in 2007 so I think anybody who is concerned about performance is covered. Regards, Ants Aasma -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Failure in commit_ts tap tests
David Bowenwrites: > If the intent of the test is to compare strings, you should be using ne not > !=. Sure. We were just confused as to how it had ever appeared to work. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Add function to import operating system collations
Peter Eisentrautwrites: > On 1/21/17 12:50 PM, Tom Lane wrote: >> I have to question the decision to make "no locales" a hard error. >> What's the point of that? In fact, should we even be bothering with >> a warning, considering how often initdb runs unattended these days? > Hmm, it was a warning in initdb, so making it an error now is probably a > mistake. We should change it back to a warning at least. I'd drop it altogether I think ... it was useful debug back in the day but now I doubt it is worth much. > Also, if we add ICU initialization to this, then it's not clear how we > would report if one provider provided zero locales but another did > provide some. Would it help to redefine the function as returning the number of locale entries it successfully added? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksums by default?
-BEGIN PGP SIGNED MESSAGE- Hash: RIPEMD160 Tom Lane points out: > Yeah, and there's a bunch of usability tooling that we don't have, > centered around "what do you do after you get a checksum error?". I've asked myself this as well, and came up with a proof of conecpt repair tool called pg_healer: http://blog.endpoint.com/2016/09/pghealer-repairing-postgres-problems.html It's very rough, but my vision is that someday Postgres will have a background process akin to autovacuum that constantly sniffs out corruption problems and (optionally) repairs them. The ability to self-repair is very limited unless checksums are enabled. I agree that there is work needed and problems to be solved with our checksum implementation (e.g. what if cosmic ray hits the checksum itself!?), but I would love to see what we do have enabled by default so we dramatically increase the pool of people with checksums enabled. - -- Greg Sabino Mullane g...@turnstep.com End Point Corporation http://www.endpoint.com/ PGP Key: 0x14964AC8 201701211522 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -BEGIN PGP SIGNATURE- iEYEAREDAAYFAliDw5oACgkQvJuQZxSWSshy4QCfXokvagoishfTUnmujjpBNTUT q7IAn0dR74bFy0mj0EMoTU7Taj0db3Sh =qBEJ -END PGP SIGNATURE- -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pdf versus single-html
On 2017-01-21 17:12, Tom Lane wrote: Erik Rijkerswrites: With the pdf increasingly in readability-decline (more and more text parts fall off the (right) side of the page and quite a few tables contain unreable bits) I would like to have a single page html. Given the size of our manual, I find it hard to believe that most readers would perform acceptably with that ... have you tried just concatenating all the parts manually and testing performance? Yes, id did a straightforward concatenation; and it's really not a problem: it comes to a 15 MB file (12 MB when skipping the release pages). Either size (in standard firefox) takes some time (seconds) but scrolling and searching is no problem. Maybe I wasn't clear: I wouldn't argue to make single-file the default output, of course; just an option to choose if it's at all possible. And even if such an option isn't deemed feasible I was hoping that someone more xslt-knowledgeable than I am has some hints on how to produce a single html file in the correct order and with link correctly (self-)referencing. I might figure it out myself - in that case will post here. thanks, Erik Rijkers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: [COMMITTERS] pgsql: Add function to import operating system collations
On 1/21/17 12:50 PM, Tom Lane wrote: > I have to question the decision to make "no locales" a hard error. > What's the point of that? In fact, should we even be bothering with > a warning, considering how often initdb runs unattended these days? Hmm, it was a warning in initdb, so making it an error now is probably a mistake. We should change it back to a warning at least. The reason for the warning originally was to have some output in case the whole collation initialization fails altogether and does nothing. For example, at the time, we didn't know how reliable and portable `locale -a` actually was. Maybe this didn't actually turn out to be a problem. Also, if we add ICU initialization to this, then it's not clear how we would report if one provider provided zero locales but another did provide some. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksums by default?
On Sat, Jan 21, 2017 at 09:02:25PM +0200, Ants Aasma wrote: > On Sat, Jan 21, 2017 at 6:41 PM, Andreas Karlssonwrote: > > It might be worth looking into using the CRC CPU instruction to reduce this > > overhead, like we do for the WAL checksums. Since that is a different > > algorithm it would be a compatibility break and we would need to support the > > old algorithm for upgraded clusters.. > > We looked at that when picking the algorithm. At that point it seemed > that CRC CPU instructions were not universal enough to rely on them. > The algorithm we ended up on was designed to be fast on SIMD hardware. > Unfortunately on x86-64 that required SSE4.1 integer instructions, so > with default compiles there is a lot of performance left on table. A > low hanging fruit would be to do CPU detection like the CRC case and > enable a SSE4.1 optimized variant when those instructions are > available. IIRC it was actually a lot faster than the naive hardware > CRC that is used for WAL and about on par with interleaved CRC. I am afraid that won't fly with most end-user packages, cause distributions can't just built packages on their newest machine and then users get SIGILL or whatever cause their 2014 server doesn't have that instruction, or would they still work? So you would have to do runtime detection of the CPU features, and use the appropriate code if they are available. Which also makes regression testing harder, as not all codepaths would get exercised all the time. Michael -- Michael Banck Projektleiter / Senior Berater Tel.: +49 2166 9901-171 Fax: +49 2166 9901-100 Email: michael.ba...@credativ.de credativ GmbH, HRB Mönchengladbach 12080 USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksums by default?
-BEGIN PGP SIGNED MESSAGE- Hash: RIPEMD160 tl;dr +1 from me for changing the default, it is worth it. Tom Lane wrote: > Have we seen *even one* report of checksums catching > problems in a usefuld way? Sort of chicken-and-egg, as most places don't have it enabled. Which leads us to: Stephen Frost replies: > This isn't the right question. > > The right question is "have we seen reports of corruption which > checksums *would* have caught?" Well, I've seen corruption that almost certainly would have got caught much earlier than stumbling upon it later on when the corruption happened to finally trigger an error. I don't normally report such things to the list: it's almost always a hardware bug or bad RAM. I would only post if it were caused by a Postgres bug. Tom Lane wrote: > I think this will be making the average user pay X% for nothing. I think you mean "the average user who doesn't check what initdb options are available". And we can certainly post a big notice about this in the release notes, so people can use the initdb option - --disable-data-checksums if they want. > ... pay X% for nothing. It is not for nothing, it is for increasing reliability by detecting (and pinpointing!) corruption as early as possible. - -- Greg Sabino Mullane g...@turnstep.com End Point Corporation http://www.endpoint.com/ PGP Key: 0x14964AC8 201701211513 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -BEGIN PGP SIGNATURE- iEYEAREDAAYFAliDwU4ACgkQvJuQZxSWSsi06QCgpPUg4SljERHMWP9tTJnoIRic U2cAoLZINh2rSECNYOwjldlD4dK00FiV =pYQ/ -END PGP SIGNATURE- -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Failure in commit_ts tap tests
If the intent of the test is to compare strings, you should be using ne not !=. David Bowen On Sat, Jan 21, 2017 at 10:38 AM, Pavan Deolaseewrote: > > > On Sat, Jan 21, 2017 at 9:39 PM, Tom Lane wrote: > >> Pavan Deolasee writes: >> > On Sat, Jan 21, 2017 at 9:09 PM, Tom Lane wrote: >> >> Hm, but what of the "null" value? Also, I get >> >> >> >> $ perl -e 'use warnings; use Test::More; ok("2017-01-01" != "null", >> "ok");' >> >> Argument "null" isn't numeric in numeric ne (!=) at -e line 1. >> >> Argument "2017-01-01" isn't numeric in numeric ne (!=) at -e line 1. >> >> ok 1 - ok >> >> > It declares the test as "passed", right? >> >> Oh! So it does. That is one darn weird behavior of the != operator. >> >> > Indeed! See this: > > # first numeric matches, doesn't check beyond that > $ perl -e 'if ("2017-23" != "2017-24") {print "Not equal\n"} else {print > "Equal\n"}' > Equal > > # first numeric doesn't match, operators works ok > $ perl -e 'if ("2017-23" != "2018-24") {print "Not equal\n"} else {print > "Equal\n"}' > Not equal > > # comparison of numeric with non-numeric, works ok > $ perl -e 'if ("2017-23" != "Foo") {print "Not equal\n"} else {print > "Equal\n"}' > Not equal > > # numeric on RHS, works ok > $ perl -e 'if ("Foo" != "2018-24") {print "Not equal\n"} else {print > "Equal\n"}' > Not equal > > These tests show that the operator returns the correct result it finds a > numeric value at the start of the string, either on LHS or RHS. Also, it > will only compare the numeric values until first non-numeric character is > found. > > # no numeric on either side > $ perl -e 'if ("Fri 2017-23" != "Fri 2017-23") {print "Not equal\n"} else > {print "Equal\n"}' > Equal > > *# no numeric on either side, arbitrary strings declared as equal* > $ perl -e 'if ("Fri 2017-23" != "Foo") {print "Not equal\n"} else {print > "Equal\n"}' > Equal > > These two tests show why we saw no failure earlier. If neither LHS or RHS > string has a starting numeric value, the operator declares the arguments as > equal, irrespective of their values. I tested the same with == operator and > that also exhibits the same behaviour. Weird and I wonder how it's not a > source of constant bugs in perl code (I don't use perl a lot, so may be > those who do are used to either turning warnings on or know this already. > > >> >> >> There's still the point that we're not actually exercising this script >> in the buildfarm ... >> > > Yes indeed. > > Thanks, > Pavan > > -- > Pavan Deolasee http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services >
Re: [HACKERS] Checksums by default?
* Ants Aasma (ants.aa...@eesti.ee) wrote: > On Sat, Jan 21, 2017 at 7:39 PM, Petr Jelinek >wrote: > > So in summary "postgresql.conf options are easy to change" while "initdb > > options are hard to change", I can see this argument used both for > > enabling or disabling checksums by default. As I said I would be less > > worried if it was easy to turn off, but we are not there afaik. And even > > then I'd still want benchmark first. > > Adding support for disabling checksums is almost trivial as it only > requires flipping a value in the control file. And I have somewhere > sitting around a similarly simple tool for turning on checksums while > the database is offline. FWIW, based on customers and fellow > conference goers I have talked to most would gladly take the > performance hit, but not the downtime to turn it on on an existing > database. I've had the same reaction from folks I've talked to, unless it was the cases where they were just floored that we didn't have them enabled by default and now they felt the need to go get them enabled on all their systems... Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Checksums by default?
On Sat, Jan 21, 2017 at 7:39 PM, Petr Jelinekwrote: > So in summary "postgresql.conf options are easy to change" while "initdb > options are hard to change", I can see this argument used both for > enabling or disabling checksums by default. As I said I would be less > worried if it was easy to turn off, but we are not there afaik. And even > then I'd still want benchmark first. Adding support for disabling checksums is almost trivial as it only requires flipping a value in the control file. And I have somewhere sitting around a similarly simple tool for turning on checksums while the database is offline. FWIW, based on customers and fellow conference goers I have talked to most would gladly take the performance hit, but not the downtime to turn it on on an existing database. Regards, Ants Aasma -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksums by default?
On Sat, Jan 21, 2017 at 6:41 PM, Andreas Karlssonwrote: > On 01/21/2017 04:48 PM, Stephen Frost wrote: >> >> * Fujii Masao (masao.fu...@gmail.com) wrote: >>> >>> If the performance overhead by the checksums is really negligible, >>> we may be able to get rid of wal_log_hints parameter, as well. >> >> >> Prior benchmarks showed it to be on the order of a few percent, as I >> recall, so I'm not sure that we can say it's negligible (and that's not >> why Magnus was proposing changing the default). > > > It might be worth looking into using the CRC CPU instruction to reduce this > overhead, like we do for the WAL checksums. Since that is a different > algorithm it would be a compatibility break and we would need to support the > old algorithm for upgraded clusters.. We looked at that when picking the algorithm. At that point it seemed that CRC CPU instructions were not universal enough to rely on them. The algorithm we ended up on was designed to be fast on SIMD hardware. Unfortunately on x86-64 that required SSE4.1 integer instructions, so with default compiles there is a lot of performance left on table. A low hanging fruit would be to do CPU detection like the CRC case and enable a SSE4.1 optimized variant when those instructions are available. IIRC it was actually a lot faster than the naive hardware CRC that is used for WAL and about on par with interleaved CRC. That said the actual checksum calculation was not a big issue for performance. The only way to make it really matter was with a larger than shared buffers smaller than RAM workload with tiny per page execution overhead. My test case was SELECT COUNT(*) on wide rows with a small fill factor. Having to WAL log hint bits is the major issue. Regards, Ants Aasma -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksums by default?
* Tom Lane (t...@sss.pgh.pa.us) wrote: > Andres Freundwrites: > > Sure, it might be easy, but we don't have it. Personally I think > > checksums just aren't even ready for prime time. If we had: > > - ability to switch on/off at runtime (early patches for that have IIRC > > been posted) > > - *builtin* tooling to check checksums for everything > > - *builtin* tooling to compute checksums after changing setting > > - configurable background sweeps for checksums > > Yeah, and there's a bunch of usability tooling that we don't have, > centered around "what do you do after you get a checksum error?". > AFAIK there's no way to check or clear such an error; but without > such tools, I'm afraid that checksums are as much of a foot-gun > as a benefit. Uh, ignore_checksum_failure and zero_damanged_pages ...? Not that I'd suggest flipping those on for your production database the first time you see a checksum failure, but we aren't completely without a way to address such cases. Or the to-be-implemented ability to disable checksums for a cluster. > I think developing all this stuff is a good long-term activity, > but I'm hesitant about turning checksums loose on the average > user before we have it. What I dislike about this stance is that it just means we're going to have more and more systems out there that won't have checksums enabled, and there's not going to be an easy way to fix that. > To draw a weak analogy, our checksums right now are more or less > where our replication was in 9.1 --- we had it, but there were still > an awful lot of rough edges. It was only just in this past release > cycle that we started to change default settings towards the idea > that they should support replication by default. I think the checksum > tooling likewise needs years of maturation before we can say that it's > realistically ready to be the default. Well, checksums were introduced in 9.3, which would mean that this is really only being proposed a year earlier than the replication timeline case, if I'm following correctly. I do agree that checksums have not seen quite as much love as the replicaiton work, though I'm tempted to argue that's because they aren't an "interesting" feature now that we've got them- but even those uninteresting features really need someone to champion them when they're important. Unfortunately, our situation with checksums does make me feel a bit like they were added to satisfy a check-box requirement and then not really developed very much further. Following your weak analogy though, it's not like users are going to have to dump/restore their entire cluster to change their systems to take advantage of the new replication capabilities. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Checksums by default?
* Andres Freund (and...@anarazel.de) wrote: > On 2017-01-21 13:03:52 -0500, Stephen Frost wrote: > > * Andres Freund (and...@anarazel.de) wrote: > > > On 2017-01-21 12:46:05 -0500, Stephen Frost wrote: > > > > Do you run with all defaults in those environments? > > > > > > Irrelevant - changing requires re-initdb'ing. That's unrealistic. > > > > I disagree. Further, we can add an option to be able to disable > > checksums without needing to re-initdb pretty trivially, which addresses > > the case where someone's having a problem because it's enabled, as > > discussed. > > Sure, it might be easy, but we don't have it. Personally I think > checksums just aren't even ready for prime time. If we had: > - ability to switch on/off at runtime (early patches for that have IIRC > been posted) > - *builtin* tooling to check checksums for everything > - *builtin* tooling to compute checksums after changing setting > - configurable background sweeps for checksums I'm certainly all for adding, well, all of that. I don't think we need *all* of it before considering enabling checksums by default, but I do think it'd be great if we had people working on adding those. > then the story would look differently. Right now checksums just aren't > particularly useful due to not having the above. Just checking recent > data doesn't really guarantee much - failures are more likely in old > data, and the data might even be read from ram. I agree that failures tend to be more likely in old data, though, as with everything, "it depends." Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Checksums by default?
* Tom Lane (t...@sss.pgh.pa.us) wrote: > Stephen Frostwrites: > > Because I see having checksums as, frankly, something we always should > > have had (as most other databases do, for good reason...) and because > > they will hopefully prevent data loss. I'm willing to give us a fair > > bit to minimize the risk of losing data. > > To be perfectly blunt, that's just magical thinking. Checksums don't > prevent data loss in any way, shape, or form. In fact, they can *cause* > data loss, or at least make it harder for you to retrieve your data, > in the event of bugs causing false-positive checksum failures. This is not a new argument, at least to me, and I don't agree with it. > What checksums can do for you, perhaps, is notify you in a reasonably > timely fashion if you've already lost data due to storage-subsystem > problems. But in a pretty high percentage of cases, that fact would > be extremely obvious anyway, because of visible data corruption. Exactly, and that awareness will allow a user to prevent further data loss or corruption. Slow corruption over time is a very much known and accepted real-world case that people do experience, as well as bit flipping enough for someone to write a not-that-old blog post about them: https://blogs.oracle.com/ksplice/entry/attack_of_the_cosmic_rays1 A really nice property of checksums on pages is that they also tell you what data you *didn't* lose, which can be extremely valuable. > I think the only really clear benefit accruing from checksums is that > they make it easier to distinguish storage-subsystem failures from > Postgres bugs. That can certainly be a benefit to some users, but > I remain dubious that the average user will find it worth any noticeable > amount of overhead. Or memory errors, or kernel bugs, or virtualization bugs, if they happen at the right time. We keep adding to the bits between the DB and the storage and to think they're all perfect is certainly a step farther than I'd go. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Checksums by default?
Andres Freundwrites: > Sure, it might be easy, but we don't have it. Personally I think > checksums just aren't even ready for prime time. If we had: > - ability to switch on/off at runtime (early patches for that have IIRC > been posted) > - *builtin* tooling to check checksums for everything > - *builtin* tooling to compute checksums after changing setting > - configurable background sweeps for checksums Yeah, and there's a bunch of usability tooling that we don't have, centered around "what do you do after you get a checksum error?". AFAIK there's no way to check or clear such an error; but without such tools, I'm afraid that checksums are as much of a foot-gun as a benefit. I think developing all this stuff is a good long-term activity, but I'm hesitant about turning checksums loose on the average user before we have it. To draw a weak analogy, our checksums right now are more or less where our replication was in 9.1 --- we had it, but there were still an awful lot of rough edges. It was only just in this past release cycle that we started to change default settings towards the idea that they should support replication by default. I think the checksum tooling likewise needs years of maturation before we can say that it's realistically ready to be the default. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksums by default?
On 2017-01-21 13:04:18 -0500, Tom Lane wrote: > Andres Freundwrites: > > On 2017-01-21 12:46:05 -0500, Stephen Frost wrote: > >> Do you run with all defaults in those environments? > > > Irrelevant - changing requires re-initdb'ing. That's unrealistic. > > If you can't turn checksums *off* without re-initdb, that raises the > stakes for this enormously. Not right now, no. > But why is that so hard? Don't think it is, it's just that nobody has done it ;) Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksums by default?
On 2017-01-21 13:03:52 -0500, Stephen Frost wrote: > * Andres Freund (and...@anarazel.de) wrote: > > On 2017-01-21 12:46:05 -0500, Stephen Frost wrote: > > > Do you run with all defaults in those environments? > > > > Irrelevant - changing requires re-initdb'ing. That's unrealistic. > > I disagree. Further, we can add an option to be able to disable > checksums without needing to re-initdb pretty trivially, which addresses > the case where someone's having a problem because it's enabled, as > discussed. Sure, it might be easy, but we don't have it. Personally I think checksums just aren't even ready for prime time. If we had: - ability to switch on/off at runtime (early patches for that have IIRC been posted) - *builtin* tooling to check checksums for everything - *builtin* tooling to compute checksums after changing setting - configurable background sweeps for checksums then the story would look differently. Right now checksums just aren't particularly useful due to not having the above. Just checking recent data doesn't really guarantee much - failures are more likely in old data, and the data might even be read from ram. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksums by default?
Stephen Frostwrites: > Because I see having checksums as, frankly, something we always should > have had (as most other databases do, for good reason...) and because > they will hopefully prevent data loss. I'm willing to give us a fair > bit to minimize the risk of losing data. To be perfectly blunt, that's just magical thinking. Checksums don't prevent data loss in any way, shape, or form. In fact, they can *cause* data loss, or at least make it harder for you to retrieve your data, in the event of bugs causing false-positive checksum failures. What checksums can do for you, perhaps, is notify you in a reasonably timely fashion if you've already lost data due to storage-subsystem problems. But in a pretty high percentage of cases, that fact would be extremely obvious anyway, because of visible data corruption. I think the only really clear benefit accruing from checksums is that they make it easier to distinguish storage-subsystem failures from Postgres bugs. That can certainly be a benefit to some users, but I remain dubious that the average user will find it worth any noticeable amount of overhead. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksums by default?
* Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote: > > Do you run with all defaults in those environments? > > For initdb? Mostly yes. Ok, fine, but you probably wouldn't if this change went in. For me, it's the other way- I have to go enable checksums at initdb time unless there's an excuse not to, and that's only when I get to be involved at initdb time, which is often not the case, sadly. There's certainly lots and lots of environments where the extra CPU and WAL aren't even remotely close to being an issue. * Andres Freund (and...@anarazel.de) wrote: > On 2017-01-21 12:46:05 -0500, Stephen Frost wrote: > > Do you run with all defaults in those environments? > > Irrelevant - changing requires re-initdb'ing. That's unrealistic. I disagree. Further, we can add an option to be able to disable checksums without needing to re-initdb pretty trivially, which addresses the case where someone's having a problem because it's enabled, as discussed. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Checksums by default?
Andres Freundwrites: > On 2017-01-21 12:46:05 -0500, Stephen Frost wrote: >> Do you run with all defaults in those environments? > Irrelevant - changing requires re-initdb'ing. That's unrealistic. If you can't turn checksums *off* without re-initdb, that raises the stakes for this enormously. But why is that so hard? Seems like if you just stop checking them, you're done. I see that we have the state recorded in pg_control, but surely we could teach some utility or other to update that file while the postmaster is stopped. I think a reasonable prerequisite before we even consider this change is a patch to make it possible to turn off checksumming. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksums by default?
On 2017-01-21 12:46:05 -0500, Stephen Frost wrote: > > I stand by the opinion that changing default which affect performance > > without any benchmark is bad idea. > > I'd be surprised if the performance impact has really changed all that > much since the code went in. Perhaps that's overly optimistic of me. Back then there were cases with well over 20% overhead. More edge cases, but that's a lot. And our scalability back then was a lot worse than where we are today. > > And for the record, I care much less about overall TPS, I care a lot > > more about amount of WAL produced because in 90%+ environments that I > > work with any increase in WAL amount means at least double the increase > > in network bandwidth due to replication. > > Do you run with all defaults in those environments? Irrelevant - changing requires re-initdb'ing. That's unrealistic. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksums by default?
On 21/01/17 18:46, Stephen Frost wrote: > * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote: >> As we don't know the performance impact is (there was no benchmark done >> on reasonably current code base) I really don't understand how you can >> judge if it's worth it or not. > > Because I see having checksums as, frankly, something we always should > have had (as most other databases do, for good reason...) and because > they will hopefully prevent data loss. I'm willing to give us a fair > bit to minimize the risk of losing data. > >> I stand by the opinion that changing default which affect performance >> without any benchmark is bad idea. > > I'd be surprised if the performance impact has really changed all that > much since the code went in. Perhaps that's overly optimistic of me. > My problem is that we are still only guessing. And while my gut also tells me that the TPS difference will not be big, it also tells me that changes like this in important software like PostgreSQL should not be made based purely on it. >> And for the record, I care much less about overall TPS, I care a lot >> more about amount of WAL produced because in 90%+ environments that I >> work with any increase in WAL amount means at least double the increase >> in network bandwidth due to replication. > > Do you run with all defaults in those environments? > For initdb? Mostly yes. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Add function to import operating system collations
Peter Eisentrautwrites: > Add function to import operating system collations BTW, hamster's been failing since this went in: running bootstrap script ... ok performing post-bootstrap initialization ... 2017-01-19 03:17:38.748 JST [14656] FATAL: no usable system locales were found 2017-01-19 03:17:38.748 JST [14656] STATEMENT: SELECT pg_import_system_collations(if_not_exists => false, schema => 'pg_catalog'); child process exited with exit code 1 Looking at older reports, I see that it didn't find any system locales then either, but we did not treat it as a hard error: running bootstrap script ... ok performing post-bootstrap initialization ... No usable system locales were found. Use the option "--debug" to see details. ok syncing data to disk ... ok I have to question the decision to make "no locales" a hard error. What's the point of that? In fact, should we even be bothering with a warning, considering how often initdb runs unattended these days? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksums by default?
* Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote: > As we don't know the performance impact is (there was no benchmark done > on reasonably current code base) I really don't understand how you can > judge if it's worth it or not. Because I see having checksums as, frankly, something we always should have had (as most other databases do, for good reason...) and because they will hopefully prevent data loss. I'm willing to give us a fair bit to minimize the risk of losing data. > I stand by the opinion that changing default which affect performance > without any benchmark is bad idea. I'd be surprised if the performance impact has really changed all that much since the code went in. Perhaps that's overly optimistic of me. > And for the record, I care much less about overall TPS, I care a lot > more about amount of WAL produced because in 90%+ environments that I > work with any increase in WAL amount means at least double the increase > in network bandwidth due to replication. Do you run with all defaults in those environments? Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Checksums by default?
On 21/01/17 18:15, Stephen Frost wrote: > * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote: >> On 21/01/17 17:51, Stephen Frost wrote: >>> I'm quite sure that the performance numbers for the CREATE TABLE + COPY >>> case with wal_level=minimal would have been *far* better than for >>> wal_level > minimal. >> >> Which is random usecase very few people do on regular basis. Checksums >> affect *everybody*. > > It's not a 'random usecase very few people do on a regular basis', it's > a different usecase which a subset of our users do. I agree that > changing the default for checksums would affect everyone who uses just > the defaults. > > What I think is different about checksums, really, is that you have to > pass an option to initdb to enable them. That's right from a technical > perspective, but I seriously doubt everyone re-reads the 'man' page for > initdb when they're setting up a new cluster, and if they didn't read > the release notes for that particular release where checksums were > introduced, they might not be aware that they exist or that they're > defaulted to off. > > Values in postgresql.conf, at least in my experience, get a lot more > regular review as there's always things changing there from > release-to-release and anyone even slightly experienced with PG knows > that our defaults in postgresql.conf pretty much suck and have to be > adjusted. I expect we'd see a lot more people using checksums if they > were in postgresql.conf somehow. I don't have any particular answer to > that problem, just thought it interesting to consider how flags to > initdb differ from postgresql.conf configuration options. So in summary "postgresql.conf options are easy to change" while "initdb options are hard to change", I can see this argument used both for enabling or disabling checksums by default. As I said I would be less worried if it was easy to turn off, but we are not there afaik. And even then I'd still want benchmark first. > >> What the benchmarks gave us is a way to do informed decision for common >> use. All I am asking for here is to be able to do informed decision as >> well. > > From above: > >> The change of wal_level was supported by benchmark, I think it's >> reasonable to ask for this to be as well. > > No, it wasn't, it was that people felt the cases where changing > wal_level would seriously hurt performance didn't out-weigh the value of > making the change to the default. > > In other words, the change of the *default* for checksums, at least in > my view, is well worth the performance impact. > As we don't know the performance impact is (there was no benchmark done on reasonably current code base) I really don't understand how you can judge if it's worth it or not. I stand by the opinion that changing default which affect performance without any benchmark is bad idea. And for the record, I care much less about overall TPS, I care a lot more about amount of WAL produced because in 90%+ environments that I work with any increase in WAL amount means at least double the increase in network bandwidth due to replication. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksums by default?
Andres Freundwrites: > What wouldn't hurt is enabling it by default in pg_regress on master for > a while. That seems like a good thing to do independent of flipping the > default. Yeah, I could get behind that. I'm not certain how much the regression tests really stress checksumming: most of the tests work with small amounts of data that probably never leave shared buffers. But it'd be some hard evidence at least. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksums by default?
* Andres Freund (and...@anarazel.de) wrote: > On 2017-01-21 12:09:53 -0500, Tom Lane wrote: > > Also, if we do decide to do that, there's the question of timing. > > As I mentioned, one of the chief risks I see is the possibility of > > false-positive checksum failures due to bugs; I think that code has seen > > sufficiently little field use that we should have little confidence that > > no such bugs remain. So if we're gonna do it, I'd prefer to do it at the > > very start of a devel cycle, so as to have the greatest opportunity to > > find bugs before we ship the new default. > > What wouldn't hurt is enabling it by default in pg_regress on master for > a while. That seems like a good thing to do independent of flipping the > default. Oh. I like that idea, a lot. +1. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Checksums by default?
* Tom Lane (t...@sss.pgh.pa.us) wrote: > Stephen Frostwrites: > > As for checksums, I do see value in them and I'm pretty sure that the > > author of that particular feature did as well, or we wouldn't even have > > it as an option. You seem to be of the opinion that we might as well > > just rip all of that code and work out as being useless. > > Not at all; I just think that it's not clear that they are a net win > for the average user, and so I'm unconvinced that turning them on by > default is a good idea. I could be convinced otherwise by suitable > evidence. What I'm objecting to is turning them on without making > any effort to collect such evidence. > Also, if we do decide to do that, there's the question of timing. > As I mentioned, one of the chief risks I see is the possibility of > false-positive checksum failures due to bugs; I think that code has seen > sufficiently little field use that we should have little confidence that > no such bugs remain. So if we're gonna do it, I'd prefer to do it at the > very start of a devel cycle, so as to have the greatest opportunity to > find bugs before we ship the new default. I can agree with a goal to make sure we aren't enabling code that a bunch of people are going to be seeing false positives due to, as that would certainly be bad. I also agree that further testing is generally a good idea, I just dislike the idea that we would prevent this change to the default due to performance concerns of a few percent in a default install. I'll ask around with some of the big PG installed bases (RDS, Heroku, etc) and see if they're a) enabling checksums already (and therefore doing a lot of testing of those code paths), and b) if they are, if they have seen any true or false positive reports from it. If we hear back that the large installed bases in those environments are already running with checksums enabled, without issues, then I'm not sure that we need to wait until the start of the next devel cycle to change the default. Alternativly, if we end up wanting to change the on-disk format, as discussed elsewhere on this thread, then we might want to wait and make that change at the same time as a change to the default, or at least not change the default right before we decide to change the format. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Checksums by default?
* Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote: > On 21/01/17 17:51, Stephen Frost wrote: > > I'm quite sure that the performance numbers for the CREATE TABLE + COPY > > case with wal_level=minimal would have been *far* better than for > > wal_level > minimal. > > Which is random usecase very few people do on regular basis. Checksums > affect *everybody*. It's not a 'random usecase very few people do on a regular basis', it's a different usecase which a subset of our users do. I agree that changing the default for checksums would affect everyone who uses just the defaults. What I think is different about checksums, really, is that you have to pass an option to initdb to enable them. That's right from a technical perspective, but I seriously doubt everyone re-reads the 'man' page for initdb when they're setting up a new cluster, and if they didn't read the release notes for that particular release where checksums were introduced, they might not be aware that they exist or that they're defaulted to off. Values in postgresql.conf, at least in my experience, get a lot more regular review as there's always things changing there from release-to-release and anyone even slightly experienced with PG knows that our defaults in postgresql.conf pretty much suck and have to be adjusted. I expect we'd see a lot more people using checksums if they were in postgresql.conf somehow. I don't have any particular answer to that problem, just thought it interesting to consider how flags to initdb differ from postgresql.conf configuration options. > What the benchmarks gave us is a way to do informed decision for common > use. All I am asking for here is to be able to do informed decision as > well. From above: > The change of wal_level was supported by benchmark, I think it's > reasonable to ask for this to be as well. > >>> > >>> No, it wasn't, it was that people felt the cases where changing > >>> wal_level would seriously hurt performance didn't out-weigh the value of > >>> making the change to the default. In other words, the change of the *default* for checksums, at least in my view, is well worth the performance impact, and I came to that conclusion based on the previously published work when the feature was being developed, which was a few percent, as I recall, though I'd be fine with having it be the default even if it was 5%. At what point would you say we shouldn't have the default be to have checksums enabled? 1%? 5%? 10%? All that said, to be clear, I don't have any objection to Tomas (or whomever) doing performance testing of this case if he's interested and has time, but that I don't feel we really need a huge analysis of this. We did an analysis when the feature was developed and I doubt we would have been able to even get it included if it resulted in a 10% or more performance hit, though it wasn't that high, as I recall. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Checksums by default?
On 2017-01-21 12:09:53 -0500, Tom Lane wrote: > Also, if we do decide to do that, there's the question of timing. > As I mentioned, one of the chief risks I see is the possibility of > false-positive checksum failures due to bugs; I think that code has seen > sufficiently little field use that we should have little confidence that > no such bugs remain. So if we're gonna do it, I'd prefer to do it at the > very start of a devel cycle, so as to have the greatest opportunity to > find bugs before we ship the new default. What wouldn't hurt is enabling it by default in pg_regress on master for a while. That seems like a good thing to do independent of flipping the default. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksums by default?
Stephen Frostwrites: > As for checksums, I do see value in them and I'm pretty sure that the > author of that particular feature did as well, or we wouldn't even have > it as an option. You seem to be of the opinion that we might as well > just rip all of that code and work out as being useless. Not at all; I just think that it's not clear that they are a net win for the average user, and so I'm unconvinced that turning them on by default is a good idea. I could be convinced otherwise by suitable evidence. What I'm objecting to is turning them on without making any effort to collect such evidence. Also, if we do decide to do that, there's the question of timing. As I mentioned, one of the chief risks I see is the possibility of false-positive checksum failures due to bugs; I think that code has seen sufficiently little field use that we should have little confidence that no such bugs remain. So if we're gonna do it, I'd prefer to do it at the very start of a devel cycle, so as to have the greatest opportunity to find bugs before we ship the new default. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksums by default?
On 21/01/17 17:51, Stephen Frost wrote: > * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote: >> On 21/01/17 17:31, Stephen Frost wrote: >>> This is just changing the *default*, not requiring checksums to always >>> be enabled. We do not hold the same standards for our defaults as we do >>> for always-enabled code, for clear reasons- not every situation is the >>> same and that's why we have defaults that people can change. >> >> I can buy that. If it's possible to turn checksums off without >> recreating data directory then I think it would be okay to have default on. > > I'm glad to hear that. > The change of wal_level was supported by benchmark, I think it's reasonable to ask for this to be as well. >>> >>> No, it wasn't, it was that people felt the cases where changing >>> wal_level would seriously hurt performance didn't out-weigh the value of >>> making the change to the default. >> >> Really? > > Yes. > >> https://www.postgresql.org/message-id/d34ce5b5-131f-66ce-f7c5-eb406dbe0...@2ndquadrant.com > > From the above link: > >> So while it'd be trivial to construct workloads demonstrating the >> optimizations in wal_level=minimal (e.g. initial loads doing CREATE >> TABLE + COPY + CREATE INDEX in a single transaction), but that would be >> mostly irrelevant I guess. > >> Instead, I've decided to run regular pgbench TPC-B-like workload on a >> bunch of different scales, and measure throughput + some xlog stats with >> each of the three wal_level options. > > In other words, there was no performance testing of the cases where > wal_level=minimal (the old default) optimizations would have been > compared against wal_level > minimal. > > I'm quite sure that the performance numbers for the CREATE TABLE + COPY > case with wal_level=minimal would have been *far* better than for > wal_level > minimal. Which is random usecase very few people do on regular basis. Checksums affect *everybody*. What the benchmarks gave us is a way to do informed decision for common use. All I am asking for here is to be able to do informed decision as well. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksums by default?
On 2017-01-22 00:41:55 +0900, Fujii Masao wrote: > On Sun, Jan 22, 2017 at 12:18 AM, Petr Jelinek >wrote: > > On 21/01/17 11:39, Magnus Hagander wrote: > >> Is it time to enable checksums by default, and give initdb a switch to > >> turn it off instead? > > > > I'd like to see benchmark first, both in terms of CPU and in terms of > > produced WAL (=network traffic) given that it turns on logging of hint bits. > > +1 > > If the performance overhead by the checksums is really negligible, > we may be able to get rid of wal_log_hints parameter, as well. It's not just the performance overhead, but also the volume of WAL due to the additional FPIs... Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksums by default?
On 2017-01-21 11:39:18 +0100, Magnus Hagander wrote: > Is it time to enable checksums by default, and give initdb a switch to turn > it off instead? -1 - the WAL overhead is quite massive, and in contrast to the other GUCs recently changed you can't just switch this around. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksums by default?
* Andreas Karlsson (andr...@proxel.se) wrote: > On 01/21/2017 04:48 PM, Stephen Frost wrote: > >* Fujii Masao (masao.fu...@gmail.com) wrote: > >>If the performance overhead by the checksums is really negligible, > >>we may be able to get rid of wal_log_hints parameter, as well. > > > >Prior benchmarks showed it to be on the order of a few percent, as I > >recall, so I'm not sure that we can say it's negligible (and that's not > >why Magnus was proposing changing the default). > > It might be worth looking into using the CRC CPU instruction to > reduce this overhead, like we do for the WAL checksums. Since that > is a different algorithm it would be a compatibility break and we > would need to support the old algorithm for upgraded clusters.. +1. I'd be all for removing the option and requiring checksums if we do that and it turns out that the performance hit ends up being less than 1%. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Checksums by default?
* Tom Lane (t...@sss.pgh.pa.us) wrote: > Stephen Frostwrites: > > * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote: > >> The change of wal_level was supported by benchmark, I think it's > >> reasonable to ask for this to be as well. > > > No, it wasn't, it was that people felt the cases where changing > > wal_level would seriously hurt performance didn't out-weigh the value of > > making the change to the default. > > It was "supported" in the sense that somebody took the trouble to measure > the impact, so that we had some facts on which to base the value judgment > that the cost was acceptable. In the case of checksums, you seem to be in > a hurry to arrive at a conclusion without any supporting evidence. No, no one measured the impact in the cases where wal_level=minimal makes a big difference, that I saw, at least. Further info with links to what was done are in my reply to Petr. As for checksums, I do see value in them and I'm pretty sure that the author of that particular feature did as well, or we wouldn't even have it as an option. You seem to be of the opinion that we might as well just rip all of that code and work out as being useless. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Checksums by default?
* Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote: > On 21/01/17 17:31, Stephen Frost wrote: > > This is just changing the *default*, not requiring checksums to always > > be enabled. We do not hold the same standards for our defaults as we do > > for always-enabled code, for clear reasons- not every situation is the > > same and that's why we have defaults that people can change. > > I can buy that. If it's possible to turn checksums off without > recreating data directory then I think it would be okay to have default on. I'm glad to hear that. > >> The change of wal_level was supported by benchmark, I think it's > >> reasonable to ask for this to be as well. > > > > No, it wasn't, it was that people felt the cases where changing > > wal_level would seriously hurt performance didn't out-weigh the value of > > making the change to the default. > > Really? Yes. > https://www.postgresql.org/message-id/d34ce5b5-131f-66ce-f7c5-eb406dbe0...@2ndquadrant.com From the above link: > So while it'd be trivial to construct workloads demonstrating the > optimizations in wal_level=minimal (e.g. initial loads doing CREATE > TABLE + COPY + CREATE INDEX in a single transaction), but that would be > mostly irrelevant I guess. > Instead, I've decided to run regular pgbench TPC-B-like workload on a > bunch of different scales, and measure throughput + some xlog stats with > each of the three wal_level options. In other words, there was no performance testing of the cases where wal_level=minimal (the old default) optimizations would have been compared against wal_level > minimal. I'm quite sure that the performance numbers for the CREATE TABLE + COPY case with wal_level=minimal would have been *far* better than for wal_level > minimal. That case was entirely punted on as "mostly irrelevant" even though there are known production environments where those optimizations make a huge difference. Those are OLAP cases though, and not nearly enough folks around here seem to care one bit about them, which I continue to be disappointed by. Even so, I *did* agree with the change to the default of wal_level, based on an understanding of its value and that users could change to wal_level=minimal if they wished to, just as I am arguing that same thing here when it comes to checksums. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Checksums by default?
Stephen Frostwrites: > * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote: >> The change of wal_level was supported by benchmark, I think it's >> reasonable to ask for this to be as well. > No, it wasn't, it was that people felt the cases where changing > wal_level would seriously hurt performance didn't out-weigh the value of > making the change to the default. It was "supported" in the sense that somebody took the trouble to measure the impact, so that we had some facts on which to base the value judgment that the cost was acceptable. In the case of checksums, you seem to be in a hurry to arrive at a conclusion without any supporting evidence. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksums by default?
On 01/21/2017 04:48 PM, Stephen Frost wrote: * Fujii Masao (masao.fu...@gmail.com) wrote: If the performance overhead by the checksums is really negligible, we may be able to get rid of wal_log_hints parameter, as well. Prior benchmarks showed it to be on the order of a few percent, as I recall, so I'm not sure that we can say it's negligible (and that's not why Magnus was proposing changing the default). It might be worth looking into using the CRC CPU instruction to reduce this overhead, like we do for the WAL checksums. Since that is a different algorithm it would be a compatibility break and we would need to support the old algorithm for upgraded clusters.. Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksums by default?
On 21/01/17 17:31, Stephen Frost wrote: > Petr, > > * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote: >> On 21/01/17 16:40, Stephen Frost wrote: >>> * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote: On 21/01/17 11:39, Magnus Hagander wrote: > Is it time to enable checksums by default, and give initdb a switch to > turn it off instead? I'd like to see benchmark first, both in terms of CPU and in terms of produced WAL (=network traffic) given that it turns on logging of hint bits. >>> >>> Benchmarking was done previously, but I don't think it's really all that >>> relevant, we should be checksum'ing by default because we care about the >>> data and it's hard to get checksums enabled on a running system. >> >> I do think that performance implications are very relevant. And I >> haven't seen any serious benchmark that would incorporate all current >> differences between using and not using checksums. > > This is just changing the *default*, not requiring checksums to always > be enabled. We do not hold the same standards for our defaults as we do > for always-enabled code, for clear reasons- not every situation is the > same and that's why we have defaults that people can change. I can buy that. If it's possible to turn checksums off without recreating data directory then I think it would be okay to have default on. >> The change of wal_level was supported by benchmark, I think it's >> reasonable to ask for this to be as well. > > No, it wasn't, it was that people felt the cases where changing > wal_level would seriously hurt performance didn't out-weigh the value of > making the change to the default. > Really? https://www.postgresql.org/message-id/d34ce5b5-131f-66ce-f7c5-eb406dbe0...@2ndquadrant.com https://www.postgresql.org/message-id/83b33502-1bf8-1ffb-7c73-5b61ddeb6...@2ndquadrant.com -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Failure in commit_ts tap tests
On Sat, Jan 21, 2017 at 9:39 PM, Tom Lanewrote: > Pavan Deolasee writes: > > On Sat, Jan 21, 2017 at 9:09 PM, Tom Lane wrote: > >> Hm, but what of the "null" value? Also, I get > >> > >> $ perl -e 'use warnings; use Test::More; ok("2017-01-01" != "null", > "ok");' > >> Argument "null" isn't numeric in numeric ne (!=) at -e line 1. > >> Argument "2017-01-01" isn't numeric in numeric ne (!=) at -e line 1. > >> ok 1 - ok > > > It declares the test as "passed", right? > > Oh! So it does. That is one darn weird behavior of the != operator. > > Indeed! See this: # first numeric matches, doesn't check beyond that $ perl -e 'if ("2017-23" != "2017-24") {print "Not equal\n"} else {print "Equal\n"}' Equal # first numeric doesn't match, operators works ok $ perl -e 'if ("2017-23" != "2018-24") {print "Not equal\n"} else {print "Equal\n"}' Not equal # comparison of numeric with non-numeric, works ok $ perl -e 'if ("2017-23" != "Foo") {print "Not equal\n"} else {print "Equal\n"}' Not equal # numeric on RHS, works ok $ perl -e 'if ("Foo" != "2018-24") {print "Not equal\n"} else {print "Equal\n"}' Not equal These tests show that the operator returns the correct result it finds a numeric value at the start of the string, either on LHS or RHS. Also, it will only compare the numeric values until first non-numeric character is found. # no numeric on either side $ perl -e 'if ("Fri 2017-23" != "Fri 2017-23") {print "Not equal\n"} else {print "Equal\n"}' Equal *# no numeric on either side, arbitrary strings declared as equal* $ perl -e 'if ("Fri 2017-23" != "Foo") {print "Not equal\n"} else {print "Equal\n"}' Equal These two tests show why we saw no failure earlier. If neither LHS or RHS string has a starting numeric value, the operator declares the arguments as equal, irrespective of their values. I tested the same with == operator and that also exhibits the same behaviour. Weird and I wonder how it's not a source of constant bugs in perl code (I don't use perl a lot, so may be those who do are used to either turning warnings on or know this already. > > > There's still the point that we're not actually exercising this script > in the buildfarm ... > Yes indeed. Thanks, Pavan -- Pavan Deolasee http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Checksums by default?
Stephen Frostwrites: > * Tom Lane (t...@sss.pgh.pa.us) wrote: >> Have we seen *even one* report of checksums catching problems in a useful >> way? > This isn't the right question. I disagree. If they aren't doing something useful for people who have turned them on, what's the reason to think they'd do something useful for the rest? > The right question is "have we seen reports of corruption which > checksums *would* have caught?" Sure, that's also a useful question, one which hasn't been answered. A third useful question is "have we seen any reports of false-positive checksum failures?". Even one false positive, IMO, would have costs that likely outweigh any benefits for typical installations with reasonably reliable storage hardware. I really do not believe that there's a case for turning on checksums by default, and I *certainly* won't go along with turning them on without somebody actually making that case. "Is it time yet" is not an argument. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksums by default?
Petr, * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote: > On 21/01/17 16:40, Stephen Frost wrote: > > * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote: > >> On 21/01/17 11:39, Magnus Hagander wrote: > >>> Is it time to enable checksums by default, and give initdb a switch to > >>> turn it off instead? > >> > >> I'd like to see benchmark first, both in terms of CPU and in terms of > >> produced WAL (=network traffic) given that it turns on logging of hint > >> bits. > > > > Benchmarking was done previously, but I don't think it's really all that > > relevant, we should be checksum'ing by default because we care about the > > data and it's hard to get checksums enabled on a running system. > > I do think that performance implications are very relevant. And I > haven't seen any serious benchmark that would incorporate all current > differences between using and not using checksums. This is just changing the *default*, not requiring checksums to always be enabled. We do not hold the same standards for our defaults as we do for always-enabled code, for clear reasons- not every situation is the same and that's why we have defaults that people can change. There are interesting arguments to be made about if checksum'ing is every worthwhile at all (some seem to see that the feature is entirely useless and we should just rip that code out, but I don't agree with that), or if we should just always enable it (because fewer options is a good thing and we care about our user's data and checksum'ing is worth the performance hit if it's a small hit; I'm more on the fence when it comes to this one as I have heard people say that they've run into cases where it does enough of a difference in performance to matter for them). We don't currently configure the defaults for any system to be the fastest possible performance, or we wouldn't have changed wal_level and we would have move aggressive settings for things like default work_mem, maintenance_work_mem, shared_buffers, max_wal_size, checkpoint_completion_target, all of the autovacuum settings, effective_io_concurrency, effective_cache_size, etc, etc. > The change of wal_level was supported by benchmark, I think it's > reasonable to ask for this to be as well. No, it wasn't, it was that people felt the cases where changing wal_level would seriously hurt performance didn't out-weigh the value of making the change to the default. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Checksums by default?
On 21/01/17 16:40, Stephen Frost wrote: > Petr, > > * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote: >> On 21/01/17 11:39, Magnus Hagander wrote: >>> Is it time to enable checksums by default, and give initdb a switch to >>> turn it off instead? >> >> I'd like to see benchmark first, both in terms of CPU and in terms of >> produced WAL (=network traffic) given that it turns on logging of hint bits. > > Benchmarking was done previously, but I don't think it's really all that > relevant, we should be checksum'ing by default because we care about the > data and it's hard to get checksums enabled on a running system. > I do think that performance implications are very relevant. And I haven't seen any serious benchmark that would incorporate all current differences between using and not using checksums. The change of wal_level was supported by benchmark, I think it's reasonable to ask for this to be as well. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pdf versus single-html
Erik Rijkerswrites: > With the pdf increasingly in readability-decline (more and more text > parts fall off the (right) side of the page and quite a few tables > contain unreable bits) I would like to have a single page html. Given the size of our manual, I find it hard to believe that most readers would perform acceptably with that ... have you tried just concatenating all the parts manually and testing performance? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksums by default?
* Tom Lane (t...@sss.pgh.pa.us) wrote: > Magnus Haganderwrites: > > Is it time to enable checksums by default, and give initdb a switch to turn > > it off instead? > > Have we seen *even one* report of checksums catching problems in a useful > way? This isn't the right question. The right question is "have we seen reports of corruption which checksums *would* have caught?" Admittedly, that's a much harder question to answer, but I've definitely seen various reports of corruption in the field, but it's reasonably rare (which I am sure we can all be thankful for). I can't say for sure which of those cases would have been caught if checksums had been enabled, but I have a hard time believing that none of them would have been caught sooner if checksums had been enabled and regular checksum validation was being performed. Given our current default and the relative rarity that it happens, it'll be a great deal longer until we see such a report- but when we do (and I don't doubt that we will, eventually) what are we going to do about it? Tell the vast majority of people who still don't have checksums enabled because it wasn't the default that they need to pg_dump/reload? That's not a good way to treat our users. > I think this will be making the average user pay X% for nothing. Have we seen *even one* report of someone having to disable checksums for performance reasons? If so, that's an argument for giving a way for users who really trust their hardware, virtualization system, kernel, storage network, and everything else involved, to disable checksums (as I suggested elsewhere), not a reason to keep the current default. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Failure in commit_ts tap tests
Pavan Deolaseewrites: > On Sat, Jan 21, 2017 at 9:09 PM, Tom Lane wrote: >> Hm, but what of the "null" value? Also, I get >> >> $ perl -e 'use warnings; use Test::More; ok("2017-01-01" != "null", "ok");' >> Argument "null" isn't numeric in numeric ne (!=) at -e line 1. >> Argument "2017-01-01" isn't numeric in numeric ne (!=) at -e line 1. >> ok 1 - ok > It declares the test as "passed", right? Oh! So it does. That is one darn weird behavior of the != operator. > I am not saying that's a correct > behaviour, but that's why we didn't catch the problem earlier. Check. Mystery solved. There's still the point that we're not actually exercising this script in the buildfarm ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Failure in commit_ts tap tests
On Sat, Jan 21, 2017 at 9:09 PM, Tom Lanewrote: > > > Hm, but what of the "null" value? Also, I get > > $ perl -e 'use warnings; use Test::More; ok("2017-01-01" != "null", "ok");' > Argument "null" isn't numeric in numeric ne (!=) at -e line 1. > Argument "2017-01-01" isn't numeric in numeric ne (!=) at -e line 1. > ok 1 - ok > It declares the test as "passed", right? I am not saying that's a correct behaviour, but that's why we didn't catch the problem earlier. May be its a bug in the Test module? If I add a non-numeric prefix to LHS string, test fails. $ perl -e 'use warnings; use Test::More; ok("Fri 2017-46" != "null", "ok");' Argument "null" isn't numeric in numeric ne (!=) at -e line 1. Argument "Fri 2017-46" isn't numeric in numeric ne (!=) at -e line 1. not ok 1 - ok # Failed test 'ok' # at -e line 1. # Tests were run but no plan was declared and done_testing() was not seen. Thanks, Pavan -- Pavan Deolasee http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Checksums by default?
Magnus Haganderwrites: > Is it time to enable checksums by default, and give initdb a switch to turn > it off instead? Have we seen *even one* report of checksums catching problems in a useful way? I think this will be making the average user pay X% for nothing. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksums by default?
* Fujii Masao (masao.fu...@gmail.com) wrote: > On Sun, Jan 22, 2017 at 12:18 AM, Petr Jelinek >wrote: > > On 21/01/17 11:39, Magnus Hagander wrote: > >> Is it time to enable checksums by default, and give initdb a switch to > >> turn it off instead? > > > > I'd like to see benchmark first, both in terms of CPU and in terms of > > produced WAL (=network traffic) given that it turns on logging of hint bits. > > +1 > > If the performance overhead by the checksums is really negligible, > we may be able to get rid of wal_log_hints parameter, as well. Prior benchmarks showed it to be on the order of a few percent, as I recall, so I'm not sure that we can say it's negligible (and that's not why Magnus was proposing changing the default). Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Checksums by default?
On Sun, Jan 22, 2017 at 12:18 AM, Petr Jelinekwrote: > On 21/01/17 11:39, Magnus Hagander wrote: >> Is it time to enable checksums by default, and give initdb a switch to >> turn it off instead? > > I'd like to see benchmark first, both in terms of CPU and in terms of > produced WAL (=network traffic) given that it turns on logging of hint bits. +1 If the performance overhead by the checksums is really negligible, we may be able to get rid of wal_log_hints parameter, as well. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksums by default?
Petr, * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote: > On 21/01/17 11:39, Magnus Hagander wrote: > > Is it time to enable checksums by default, and give initdb a switch to > > turn it off instead? > > I'd like to see benchmark first, both in terms of CPU and in terms of > produced WAL (=network traffic) given that it turns on logging of hint bits. Benchmarking was done previously, but I don't think it's really all that relevant, we should be checksum'ing by default because we care about the data and it's hard to get checksums enabled on a running system. If this is going to be a serious argument made against making this change (and, frankly, I don't believe that it should be) then what we should do is simply provide a way for users to disable checksums. It would be one-way and require a restart, of course, but it wouldn't be hard to do. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Failure in commit_ts tap tests
Pavan Deolaseewrites: > I think I understand why it's only affecting me and not others. > I've PGDATESTYLE set to "Postgres, MDY" in my bashrc and that formats the > commit timestamp as "Fri Jan 20 07:59:52.322811 2017 PST". If I unset that, > the result comes in a format such as "2017-01-20 21:31:47.766371-08". > Looks like perl doesn't throw an error if it can parse the leading part of > the string as a numeric. It still throws a warning, but the test passes. Hm, but what of the "null" value? Also, I get $ perl -e 'use warnings; use Test::More; ok("2017-01-01" != "null", "ok");' Argument "null" isn't numeric in numeric ne (!=) at -e line 1. Argument "2017-01-01" isn't numeric in numeric ne (!=) at -e line 1. ok 1 - ok # Tests were run but no plan was declared and done_testing() was not seen. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksums by default?
On 21/01/17 11:39, Magnus Hagander wrote: > Is it time to enable checksums by default, and give initdb a switch to > turn it off instead? I'd like to see benchmark first, both in terms of CPU and in terms of produced WAL (=network traffic) given that it turns on logging of hint bits. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] remote_apply for logical replication?
On 21/01/17 01:34, Thomas Munro wrote: > Hi, > > In src/backend/replication/logical/worker.c: > > pq_sendbyte(reply_message, 'r'); > pq_sendint64(reply_message, recvpos); /* write */ > pq_sendint64(reply_message, flushpos); /* flush */ > pq_sendint64(reply_message, writepos); /* apply */ > > Is 'writepos' really applied? Why do we report the merely received > position as written? > Because we don't have intermediate steps in logical replication, writes happen immediately and in whole transactions so whatever was received by the time we send reply is already written (it might not necessarily be that way forever so the code may become more complicated eventually). > I haven't tried any of this stuff out yet so this may be a stupid > question, but I'm wondering: should xinfo & > XACT_COMPLETION_APPLY_FEEDBACK trigger logical replication to send a > reply, so that synchronous_commit = remote_apply would work? > In fact everything is remote_apply in logical replication for same reason described above. The differences can only happen when the subscription is running with synchronous_commit = off where flush position is behind. The xinfo & XACT_COMPLETION_APPLY_FEEDBACK does not affect logical replication though as it does not have access to that information (subscriber does not receive raw WAL and logical decoding does not decode this). -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksums by default?
Magnus, * Magnus Hagander (mag...@hagander.net) wrote: > On Sat, Jan 21, 2017 at 3:05 PM, Michael Paquier> wrote: > > > On Sat, Jan 21, 2017 at 7:39 PM, Magnus Hagander > > wrote: > > > Is it time to enable checksums by default, and give initdb a switch to > > turn > > > it off instead? > > > > > > I keep running into situations where people haven't enabled it, because > > (a) > > > they didn't know about it, or (b) their packaging system ran initdb for > > them > > > so they didn't even know they could. And of course they usually figure > > this > > > out once the db has enough data and traffic that the only way to fix it > > is > > > to set up something like slony/bucardo/pglogical and a whole new server > > to > > > deal with it.. (Which is something that would also be good to fix -- but > > > having the default changed would be useful as well) > > > > Perhaps that's not mandatory, but I think that one obstacle in > > changing this default is to be able to have pg_upgrade work from a > > checksum-disabled old instance to a checksum-enabled instance. That > > would really help with its adoption. > > That's a different usecase though. Agreed. > If we just change the default, then we'd have to teach pg_upgrade to > initialize the upgraded cluster without checksums. We still need to keep > that *option*, just reverse the default. Just to clarify- pg_upgrade doesn't init the new database, the user (or a distribution script) does. As such *pg_upgradecluster* would have to know to init the new cluster correctly based on the options the old cluster was init'd with, but it might actually already do that (not sure off-hand), and, even if it doesn't, it shouldn't be too hard to make it to that. > Being able to enable checksums on the fly is a different feature. Which I'd > really like to have. I have some unfinished code for it, but it's a bit too > unfinished so far :) Agreed. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Checksums by default?
* Michael Paquier (michael.paqu...@gmail.com) wrote: > On Sat, Jan 21, 2017 at 7:39 PM, Magnus Haganderwrote: > > Is it time to enable checksums by default, and give initdb a switch to turn > > it off instead? > > > > I keep running into situations where people haven't enabled it, because (a) > > they didn't know about it, or (b) their packaging system ran initdb for them > > so they didn't even know they could. And of course they usually figure this > > out once the db has enough data and traffic that the only way to fix it is > > to set up something like slony/bucardo/pglogical and a whole new server to > > deal with it.. (Which is something that would also be good to fix -- but > > having the default changed would be useful as well) > > Perhaps that's not mandatory, but I think that one obstacle in > changing this default is to be able to have pg_upgrade work from a > checksum-disabled old instance to a checksum-enabled instance. That > would really help with its adoption. That's moving the goal-posts here about 3000 miles away and I don't believe it's necessary to have that to make this change. I agree that it'd be great to have, of course, and we're looking at if we could do something like: backup a checksum-disabled system, perform a restore which adds checksums and marks the cluster as now having checksums. If we can work out a good way to do that *and* have it work with incremental backup/restore, then we could possibly provide a small-downtime-window way to upgrade to a database with checksums. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Checksums by default?
On Sat, Jan 21, 2017 at 3:05 PM, Michael Paquierwrote: > On Sat, Jan 21, 2017 at 7:39 PM, Magnus Hagander > wrote: > > Is it time to enable checksums by default, and give initdb a switch to > turn > > it off instead? > > > > I keep running into situations where people haven't enabled it, because > (a) > > they didn't know about it, or (b) their packaging system ran initdb for > them > > so they didn't even know they could. And of course they usually figure > this > > out once the db has enough data and traffic that the only way to fix it > is > > to set up something like slony/bucardo/pglogical and a whole new server > to > > deal with it.. (Which is something that would also be good to fix -- but > > having the default changed would be useful as well) > > Perhaps that's not mandatory, but I think that one obstacle in > changing this default is to be able to have pg_upgrade work from a > checksum-disabled old instance to a checksum-enabled instance. That > would really help with its adoption. > That's a different usecase though. If we just change the default, then we'd have to teach pg_upgrade to initialize the upgraded cluster without checksums. We still need to keep that *option*, just reverse the default. Being able to enable checksums on the fly is a different feature. Which I'd really like to have. I have some unfinished code for it, but it's a bit too unfinished so far :) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] Checksums by default?
On Sat, Jan 21, 2017 at 7:39 PM, Magnus Haganderwrote: > Is it time to enable checksums by default, and give initdb a switch to turn > it off instead? > > I keep running into situations where people haven't enabled it, because (a) > they didn't know about it, or (b) their packaging system ran initdb for them > so they didn't even know they could. And of course they usually figure this > out once the db has enough data and traffic that the only way to fix it is > to set up something like slony/bucardo/pglogical and a whole new server to > deal with it.. (Which is something that would also be good to fix -- but > having the default changed would be useful as well) Perhaps that's not mandatory, but I think that one obstacle in changing this default is to be able to have pg_upgrade work from a checksum-disabled old instance to a checksum-enabled instance. That would really help with its adoption. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksums by default?
* Magnus Hagander (mag...@hagander.net) wrote: > Is it time to enable checksums by default, and give initdb a switch to turn > it off instead? Yes, please. We've already agreed to make changes to have a better user experience and ask those who really care about certain performance aspects to have to configure for performance instead (see: wal_level changes), I view this as being very much in that same vein. I know one argument in the past has been that we don't have a tool that can be used to check all of the checksums, but that's also changed now that pgBackRest supports verifying checksums during backups. I'm all for adding a tool to core to perform a validation too, of course, though it does make a lot of sense to validate checksums during backup since you're reading all the pages anyway. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Review: GIN non-intrusive vacuum of posting tree
Hello Jeff! >Review of the code itself: How do you think, is there anything else to improve in that patch or we can mark it as 'Ready for committer'? I've checked the concurrency algorithm with original work of Lehman and Yao on B-link-tree. For me everything looks fine (safe, deadlock free and not increased probability of livelock due to LockBufferForCleanup call). Best regards, Andrey Borodin. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pdf versus single-html
Hi, With the pdf increasingly in readability-decline (more and more text parts fall off the (right) side of the page and quite a few tables contain unreable bits) I would like to have a single page html. (cf the bash scipting guide at http://tldp.org/guides.html even if that is smaller than our html (2.3 MB vs 12 MB)) Simple concatenation is trivial but to keep the texts in the correct order, and to keep the links working is a bit more complicated. Ideally I'd like a single file with chapter-headings either at the top or even (gulp) in a left-hand frame. The goal is to make the text directly searchable, like one can do in the pdf. It might even be good to include such a single-file html in the Makefile as an option. I'll give it a try but has anyone done this work already, perhaps? Thanks, Erik Rijkers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Checksums by default?
Is it time to enable checksums by default, and give initdb a switch to turn it off instead? I keep running into situations where people haven't enabled it, because (a) they didn't know about it, or (b) their packaging system ran initdb for them so they didn't even know they could. And of course they usually figure this out once the db has enough data and traffic that the only way to fix it is to set up something like slony/bucardo/pglogical and a whole new server to deal with it.. (Which is something that would also be good to fix -- but having the default changed would be useful as well) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/