Re: [HACKERS] 16-bit page checksums for 9.2
Heikki Linnakangas wrote:
> On 28.12.2011 01:39, Simon Riggs wrote:
>> On Tue, Dec 27, 2011 at 8:05 PM, Heikki Linnakangas wrote:
>>> On 25.12.2011 15:01, Kevin Grittner wrote:
>>>> I don't believe that. Double-writing is a technique to avoid torn pages, but it requires a checksum to work. This chicken-and-egg problem requires the checksum to be implemented first.
>>> I don't think double-writes require checksums on the data pages themselves, just on the copies in the double-write buffers. In the double-write buffer, you'll need some extra information per-page anyway, like a relfilenode and block number that indicates which page it is in the buffer.

You are clearly right -- if there is no checksum in the page itself, you can put one in the double-write metadata. I've never seen that discussed before, but I'm embarrassed that it never occurred to me.

>> How would you know when to look in the double write buffer?
> You scan the double-write buffer, and every page in the double-write buffer that has a valid checksum, you copy to the main storage. There's no need to check validity of pages in the main storage.

Right. I'll recap my understanding of double-write (from memory -- if there's a material error or omission, I hope someone will correct me).

The write-ups I've seen on double-write techniques have all the writes go first to the double-write buffer (a single, sequential file that stays around). This is done as sequential writing to a file which is overwritten pretty frequently, making the writes to a controller very fast, and a BBU write-back cache unlikely to actually write to disk very often. On good server-quality hardware, it should be blasting RAM-to-RAM very efficiently. The file is fsync'd (like I said, hopefully to BBU cache), then each page in the double-write buffer is written to the normal page location, and that is fsync'd. Once that is done, the database writes have no risk of being torn, and the double-write buffer is marked as empty.
This all happens at the point when you would be writing the page to the database, after the WAL-logging. On crash recovery you read through the double-write buffer from the start and write the pages which look good (including a good checksum) to the database before replaying WAL. If you find a checksum error in processing the double-write buffer, you assume that you never got as far as the fsync of the double-write buffer, which means you never started writing the buffer contents to the database, which means there can't be any torn pages there. If you get to the end and fsync, you can be sure any torn pages from a previous attempt to write to the database itself have been overwritten with the good copy in the double-write buffer. Either way, you move on to WAL processing. You wind up with a database free of torn pages before you apply WAL.

full_page_writes to the WAL are not needed as long as double-write is used for any pages which would have been written to the WAL. If checksums were written to the double-write buffer metadata instead of adding them to the page itself, this could be implemented alone. It would probably allow a modest speed improvement over using full_page_writes and would eliminate those full-page images from the WAL files, making them smaller.

If we do add a checksum to the page header, that could be used for testing for torn pages in the double-write buffer without needing a redundant calculation for double-write. With no torn pages in the actual database, checksum failures there would never be false positives.

To get this right for a checksum in the page header, double-write would need to be used for all cases where full_page_writes now are used (i.e., the first write of a page after a checkpoint), and for all unlogged writes (e.g., hint-bit-only writes). There would be no correctness problem for always using double-write, but it would be unnecessary overhead for other page writes, which I think we can avoid.
-Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] failed regress test
Hello

I checked yesterday's HEAD and there is a problem with the create_index test.

Regards
Pavel

[Attachment: regression.diffs (binary)]
Re: [HACKERS] 16-bit page checksums for 9.2
On Wed, Dec 28, 2011 at 5:45 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
> On 28.12.2011 11:22, Simon Riggs wrote:
>> On Wed, Dec 28, 2011 at 7:42 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
>>>> How would you know when to look in the double write buffer?
>>> You scan the double-write buffer, and every page in the double-write buffer that has a valid checksum, you copy to the main storage. There's no need to check validity of pages in the main storage.
>> OK, then we are talking at cross purposes. Double write buffers, in the way you explain them, allow us to remove full page writes. They clearly don't do anything to check page validity on read. Torn pages are not the only fault we wish to correct against... and the double writes idea is orthogonal to the idea of checksums.
> The reason we're talking about double write buffers in this thread is that double write buffers can be used to solve the problem with hint bits and checksums.

Torn pages are not the only problem we need to detect. You said "You scan the double write buffer". When exactly would you do that? Please explain how a double write buffer detects problems that do not occur as the result of a crash. We don't have much time, so please be clear and lucid.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] 16-bit page checksums for 9.2
Heikki Linnakangas wrote:
> Simon Riggs wrote:
>> OK, then we are talking at cross purposes. Double write buffers, in the way you explain them, allow us to remove full page writes. They clearly don't do anything to check page validity on read. Torn pages are not the only fault we wish to correct against... and the double writes idea is orthogonal to the idea of checksums.
> The reason we're talking about double write buffers in this thread is that double write buffers can be used to solve the problem with hint bits and checksums.

Exactly. Every time the issue of page checksums is raised, there are objections because OS or hardware crashes could cause torn pages for hint-bit-only writes which would be treated as serious errors (potentially indicating hardware failure) when they are in fact expected and benign. Some time before the thread dies, someone generally points out that double-write technology would be a graceful way to handle that, with the side benefit of smaller WAL files. All available evidence suggests it would also allow a small performance improvement, although I hesitate to emphasize that aspect of it; the other benefits fully justify the effort without that.

I do feel there is value in a page checksum patch even without torn page protection. The discussion on the list has convinced me that a failed checksum should be treated as seriously as other page format errors, rather than as a warning, even though (in the absence of torn page protection) torn hint-bit-only page writes would be benign.

As an example of how this might be useful, consider our central databases which contain all the detail replicated from the circuit court databases in all the counties. These are mission-critical, so we have redundant servers in separate buildings. At one point, one of them experienced hardware problems and we started seeing invalid pages. Since we can shift the load between these servers without down time, we moved all applications to other servers, and investigated.
Now, it's possible that for some time before we got errors on the bad pages, there could have been subtle corruption which didn't generate errors but presented bad data on our web site. A page checksum would help prevent that sort of problem, and a post-crash false positive might waste a little time in investigation, but that cost would be far outweighed by the benefit of better accuracy guarantees. Of course, it will be a big plus if we can roll this out in 9.2 in conjunction with a double-write feature. Not only will double-write probably be a bit faster than full_page_writes in the WAL log, but it will allow protection against torn pages on hint-bit-only writes without adding those writes to the WAL or doing any major rearrangement of where they sit that would break pg_upgrade. It would be nice not to have to put all sorts of caveats and explanations into the docs about how a checksum error might be benign due to hint bit writes. -Kevin
Re: [HACKERS] [RFC] grants vs. inherited tables
On Thu, Dec 29, 2011 at 03:12:50PM -0300, Alvaro Herrera wrote: Excerpts from Marko Kreen's message of jue dic 29 15:04:49 -0300 2011: 3) Have a way to format aclitem into something that can used to create GRANT statement easily. Eg: pg_get_privilege_info( IN priv aclitem, OUT rolename text, OUT privlist text[], OUT privlist_with_grant_option text[]); This allows doing complex introspection in pl/pgsql and also helps tools that want to re-create table structure in other databases. aclexplode? I guess that decides the name. :) -- marko
Re: [HACKERS] [RFC] grants vs. inherited tables
Excerpts from Marko Kreen's message of jue dic 29 15:04:49 -0300 2011: 3) Have a way to format aclitem into something that can used to create GRANT statement easily. Eg: pg_get_privilege_info( IN priv aclitem, OUT rolename text, OUT privlist text[], OUT privlist_with_grant_option text[]); This allows doing complex introspection in pl/pgsql and also helps tools that want to re-create table structure in other databases. aclexplode? -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] contrib/README
Excerpts from Dimitri Fontaine's message of mié dic 28 15:12:48 -0300 2011: Tom Lane t...@sss.pgh.pa.us writes: I wonder whether it's time to drop that file altogether ... it served a purpose back before we integrated contrib into the SGML docs, but now I'm not quite sure why we should bother with it. I wonder if we shouldn't keep the file and have it just point to the relevant documentation chapters (extend, contribs). +1 Only start-scripts needs to be listed separately. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] 16-bit page checksums for 9.2
On Thu, Dec 29, 2011 at 11:08:43AM -0600, Kevin Grittner wrote:
> Heikki Linnakangas wrote:
>> Simon Riggs wrote:
>>> OK, then we are talking at cross purposes. Double write buffers, in the way you explain them allow us to remove full page writes. They clearly don't do anything to check page validity on read. Torn pages are not the only fault we wish to correct against... and the double writes idea is orthogonal to the idea of checksums.
>> The reason we're talking about double write buffers in this thread is that double write buffers can be used to solve the problem with hint bits and checksums.
> Of course, it will be a big plus if we can roll this out in 9.2 in conjunction with a double-write feature. Not only will double-write probably be a bit faster than full_page_writes in the WAL log, but it will allow protection against torn pages on hint-bit-only writes without adding those writes to the WAL or doing any major rearrangement of where they sit that would break pg_upgrade.

[Thanks for your recent thread summaries.]

A double-write buffer, like a WAL-logged full-page image, is a technique for performing atomic writes wider than those automatically provided by components further down the storage stack. The two strategies have different performance characteristics, and we're told that a double-write buffer would better serve us overall. However, its benefits would not be *greater* for hint-only writes than for any other write. For that reason, I think we should consider these changes independently.

With page checksums enabled, remove the hazard of torn hint-only writes by ensuring that a WAL FPI has flushed since the last checkpoint. When necessary, emit an FPI-only record. Separately, optimize first-since-checkpoint writes by replacing FPIs with double-write buffers. The double-write patch will reduce the added WAL of the checksum/safe-hint-updates patch to zero. If the double-writes patch founders, we'll just have more-costly, yet equally reliable, page checksums.
nm
[HACKERS] [RFC] grants vs. inherited tables
Hello,

I tried to generalize a function that creates partitions for a table and found out it's impossible to do it for grants. Basically, what I want is a child table that takes its grants from the parent table. IMHO quite a reasonable request. But I don't see a way to do it in pl/pgsql. (Writing a parser in plpgsql for aclitemout() output does not count.)

The form for the create statement is:

CREATE TABLE part (
    LIKE parent INCLUDING INDEXES INCLUDING CONSTRAINTS   -- pre-9.0
    LIKE parent INCLUDING ALL                             -- 9.0+, skips grants
) INHERITS (parent);

Unless I'm missing something obvious, there is no way to take grants from the parent table. My suggestions:

1) Have a 'GRANTS' option for LIKE. Seems obvious.
2) Include the 'GRANTS' option in 'ALL'. Also obvious.
3) Have a way to format aclitem into something that can be used to create a GRANT statement easily. Eg:

    pg_get_privilege_info(
        IN priv aclitem,
        OUT rolename text,
        OUT privlist text[],
        OUT privlist_with_grant_option text[]);

This allows doing complex introspection in pl/pgsql and also helps tools that want to re-create table structure in other databases.

Although 1)+2) and 3) seem like alternatives, I suggest doing all of them, thus improving GRANT usage across the board.

Comments?

-- marko

NB: this mail is about designing and accepting TODO-items. I might do them myself sometime, but I don't mind if anyone implements them before me..
Re: [HACKERS] failed regress test
Pavel Stehule pavel.steh...@gmail.com writes: I checked yesterday HEAD and there is problem with create_index test What locale are you using? regards, tom lane
Re: [HACKERS] failed regress test
Hello

LANG=cs_CZ.utf8

Regards
Pavel

2011/12/29 Tom Lane t...@sss.pgh.pa.us:
> Pavel Stehule pavel.steh...@gmail.com writes:
>> I checked yesterday HEAD and there is problem with create_index test
> What locale are you using?
Re: [HACKERS] spinlocks on HP-UX
On 29.12.2011 04:36, Manabu Ori wrote:
> I believe the lwarx hint would do no harm on recent PowerPC processors. What I tested:
>
> (1) Built postgres on POWER6 + RHEL5, which got the lwarx hint included. Then copied this src tree to POWER5 + RHEL4 and ran make test, which finished successfully.
>
> (2) The lwarx test in configure failed on POWER5 + RHEL4.
>
> Note that POWER6 understands the lwarx hint and POWER5 doesn't. RHEL5 binutils supports the lwarx hint and RHEL4 binutils doesn't.
>
> The only concern is for very old PowerPC. Referring to the Power Instruction Set Architecture manual(*1), on some processors that precede PowerISA v2.00, executing lwarx with the hint will cause an illegal instruction error. The lwarx test in configure should fail on that kind of processor, judging from my test (2).

The Linux kernel does this (arch/powerpc/include/asm/ppc-opcode.h):

/*
 * Only use the larx hint bit on 64bit CPUs. e500v1/v2 based CPUs will treat a
 * larx with EH set as an illegal instruction.
 */
#ifdef CONFIG_PPC64
#define __PPC_EH(eh)	(((eh) & 0x1) << 0)
#else
#define __PPC_EH(eh)	0
#endif

We can't copy-paste code from Linux directly, and I'm not sure I like that particular phrasing of the macro, but perhaps we should steal the idea and only use the hint on 64-bit PowerPC processors? I presume all the processors that support the hint are 64-bit, so the question is, is there any 64-bit PowerPC processor that would get upset about it? It's quite arbitrary to tie it to the word length, but if it works as a dividing line in practice, I'm fine with it.

--
Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
[HACKERS] ALTER DOMAIN DROP CONSTRAINT doesn't catch errors
Is there a secret reason why ALTER DOMAIN foo DROP CONSTRAINT nonexistent; doesn't report any error? If not, I think we should add one and also add the usual IF EXISTS option.
[HACKERS] get_fn_expr_argtype() vs. internal calls
We document that a polymorphic C-language function may identify the concrete data type of each argument using calls to get_fn_expr_argtype(). That relies on FmgrInfo.fn_expr, which only the executor sets. Calls of internal origin, by way of {Direct,,Oid}FunctionCall*(), don't cons up an fn_expr, so get_fn_expr_argtype() just returns InvalidOid every time. (Indeed, we couldn't easily do better in many cases.) To what extent is it safe to rely on this situation remaining as it is? I ask on account of some second thoughts I had about CheckIndexCompatible(). When writing it, I did not explicitly consider operator classes having polymorphic opcintype. If get_fn_expr_argtype() were to work in a function called from the btree search code, CheckIndexCompatible() should impose stricter checks on indexes having opclasses of polymorphic opcintype. If that's not too likely to happen, I might just add a comment instead. Thanks, nm
Re: [HACKERS] failed regress test
Pavel Stehule pavel.steh...@gmail.com writes: 2011/12/29 Tom Lane t...@sss.pgh.pa.us: What locale are you using? LANG=cs_CZ.utf8 Fixed, thanks. (It'd be nice if there were a buildfarm member checking that locale, since it causes problems on a regular basis.) regards, tom lane
Re: [HACKERS] spinlocks on HP-UX
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: The Linux kernel does this (arch/powerpc/include/asm/ppc-opcode.h): Yeah, I was looking at that too. We can't copy-paste code from Linux directly, and I'm not sure I like that particular phrasing of the macro, but perhaps we should steal the idea and only use the hint on 64-bit PowerPC processors? The info that I've found says that the hint exists beginning in POWER6, and there were certainly 64-bit Power machines before that. However, it might be that the only machines that actually spit up on the hint bit (rather than ignore it) were 32-bit, in which case this would be a usable heuristic. Not sure how we can research that ... do we want to just assume the kernel guys know what they're doing? regards, tom lane
[HACKERS] SEGFAULT on SELECT * FROM view
Hi all,

When SELECTing from one particular view, I get reproducible SEGFAULTs in my pg 9.1.2 production database. To test for hardware errors, I ran memtest, which succeeded. I then asked for help in #postgres and got advice to create stacktraces, but I couldn't find the problem yet.

Scenario: VIEW 'vwa' on TABLE 'tba' and some subselects and aggregates. I've no idea what's wrong with this view, but a SELECT * FROM vwa WHERE myid = 110 LIMIT 100 OFFSET 0; and even EXPLAINing this statement SEGFAULTs the server.

gdb trace: http://pgsql.privatepaste.com/eacd1b6c5d
gdb ec output: http://pgsql.privatepaste.com/a61db5b564
smaps: http://pgsql.privatepaste.com/3c4f494015

Then I used the definition of vwa, and created vwb, and the very same SELECT (just with vwb) works perfectly fine. I've a pastebin link comparing the two view definitions, which I'm willing to share privately if that helps. The views actually differ, although they look identical with \d+ in the psql console, in that the newer view names more columns that were added to the referenced tables lately.

So, you tell me, what's wrong with the old view?

Thanks a lot,
Chris
Re: [HACKERS] SEGFAULT on SELECT * FROM view
chris r. chri...@gmx.net writes: So, you tell me, what's wrong with the old view? Nothing; that looks like a planner bug to me. Please submit a self-contained test case. regards, tom lane
Re: [HACKERS] SEGFAULT on SELECT * FROM view
>> So, you tell me, what's wrong with the old view?
> Nothing; that looks like a planner bug to me. Please submit a self-contained test case.

I ported the entire schema to my test DB server and could not reproduce the error there. Note that probably recreating the view solves this issue. Given this, how should I proceed to create a test case? Any tutorial on this? (I'm not too familiar with all this yet.)

Chris
Re: [HACKERS] failed regress test
2011/12/29 Tom Lane t...@sss.pgh.pa.us:
> Pavel Stehule pavel.steh...@gmail.com writes:
>> 2011/12/29 Tom Lane t...@sss.pgh.pa.us:
>>> What locale are you using?
>> LANG=cs_CZ.utf8
> Fixed, thanks. (It'd be nice if there were a buildfarm member checking that locale, since it causes problems on a regular basis.)
> regards, tom lane

Thank you. I hope we will get a server for the buildfarm that we can use to check some specific locales; we only have to sign some papers.

Regards
Pavel
Re: [HACKERS] 16-bit page checksums for 9.2
On Thu, Dec 29, 2011 at 6:44 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote:
> positives. To get this right for a checksum in the page header, double-write would need to be used for all cases where full_page_writes now are used (i.e., the first write of a page after a checkpoint), and for all unlogged writes (e.g., hint-bit-only writes). There would be no correctness problem for always using double-write, but it would be unnecessary overhead for other page writes, which I think we can avoid.

Unless I'm missing something, double-writes are needed for all writes, not only the first write of a page after a checkpoint. Consider this sequence of events:

1. Checkpoint
2. Double-write of page A (DW buffer write, sync, heap write)
3. Sync of heap, releasing DW buffer for new writes.
   ... some time goes by
4. Regular write of page A
5. OS writes one part of page A
6. Crash!

Now recovery comes along, page A is broken in the heap with no double-write buffer backup nor anything to recover it by in the WAL.

--
Ants Aasma
Re: [HACKERS] SEGFAULT on SELECT * FROM view
chris r. chri...@gmx.net writes: Nothing; that looks like a planner bug to me. Please submit a self-contained test case. I ported the entire schema to my test DB server and could not reproduce the error there. Note that probably recreating the view solves this issue. Given this, how should I proceed to create a test case? Any tutorial on this? (I'm not too familiar with all this yet.) It's possibly statistics-dependent; make sure you have the same stats targets on both DBs, and try re-analyzing a few times. Check other planner parameters are the same, too. regards, tom lane
Re: [HACKERS] 16-bit page checksums for 9.2
2011/12/30 Ants Aasma ants.aa...@eesti.ee:
> On Thu, Dec 29, 2011 at 6:44 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote:
>> positives. To get this right for a checksum in the page header, double-write would need to be used for all cases where full_page_writes now are used (i.e., the first write of a page after a checkpoint), and for all unlogged writes (e.g., hint-bit-only writes). There would be no correctness problem for always using double-write, but it would be unnecessary overhead for other page writes, which I think we can avoid.
> Unless I'm missing something, double-writes are needed for all writes, not only the first page after a checkpoint. Consider this sequence of events: 1. Checkpoint 2. Double-write of page A (DW buffer write, sync, heap write) 3. Sync of heap, releasing DW buffer for new writes. ... some time goes by 4. Regular write of page A 5. OS writes one part of page A 6. Crash! Now recovery comes along, page A is broken in the heap with no double-write buffer backup nor anything to recover it by in the WAL.

I guess the assumption is that the write in (4) is either backed by the WAL, or made safe by double writing. ISTM that such reasoning is only correct if the change that is expressed by the WAL record can be applied in the context of inconsistent (i.e., partially written) pages, which I assume is not the case (excuse my ignorance regarding such basic facts). So I think you are right.

Nicolas

--
A. Because it breaks the logical sequence of discussion.
Q. Why is top posting bad?
Re: [HACKERS] failed regress test
On 12/29/2011 05:06 PM, Tom Lane wrote: Pavel Stehule pavel.steh...@gmail.com writes: 2011/12/29 Tom Lane t...@sss.pgh.pa.us: What locale are you using? LANG=cs_CZ.utf8 Fixed, thanks. (It'd be nice if there were a buildfarm member checking that locale, since it causes problems on a regular basis.) I have added it to the duty roster for nightjar. You don't need a separate buildfarm member; a single member can check multiple locales. cheers andrew
Re: [HACKERS] backup_label during crash recovery: do we know how to solve it?
On Sat, Dec 3, 2011 at 8:04 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
> At the moment, if the situation is ambiguous, the system assumes that you're restoring from a backup. What your suggestion amounts to is to reverse that assumption, and assume instead that you're doing crash recovery on a system where a backup was being taken. In that case, if you take a backup with pg_base_backup(), and fail to archive the WAL files correctly, or forget to create a recovery.conf file, the database will happily start up from the backup, but is in fact corrupt. That is not good either.

Sorry for the lengthy time before getting around to writing a response, but I do think there is, in practice, a way around this conundrum, whose fundamental goal is to make sure that the backup is not, in actuality, a full binary copy of the database. A workaround that has a much smaller restart-hole is to move the backup_label in and out of the database directory after having copied it to the archive and before calling stop_backup. Not only is that workaround incorrect (it has a small window where the system will not be able to restart), but it's pretty inconvenient. How about this revised protocol (names and adjustments welcome), to enable a less-terrible approach?

New concepts:

pg_prepare_backup: readies postgres for backing up. Saves the backup_label content in volatile memory. The next start_backup will write that volatile information to disk, and the information within can be used to compute a backup-key.

backup-key: a subset of the backup label; all it needs (as far as I know) might be the database-id and then the WAL position (timeline, seg, offset) the backup is starting at.

Protocol:

1. select pg_prepare_backup();
   (Backup process remembers that backup-key is in progress; say, writes it to /backup-keys/%k)
2. select pg_start_backup();
   (perform copying)
3. select pg_stop_backup();
4. Backup process can optionally clear its state remembering the backup-key (rm /backup-keys/%k)

A crash at each point would be resolved this way:

Before step 1: Nothing has happened, so normal crash recovery.

Before step 2: (same, as it doesn't involve a state transition in postgres)

Before step 3: when the crash occurs and postgres starts up, postgres asks the external software if a backup was in progress, say via a backup-in-progress command. It is responsible for looking at /backup-keys/%k and saying yes, it was. The database can then do normal crash recovery. The backup can even be continuing through this time, I think.

Before step 4: The archiver may leak the backup-key. Because backup-keys using the information I defined earlier have an ordering, it should be possible to reap these if necessary at intervals.

Fundamentally, the way this approach gets around the 'physical copy' conundrum is asking the archiver software to remember something well out of the way of the database directory on the system that is being backed up. The main usability gain is that there will be a standardized way to have postgres check to see if it was doing a backup (and thus should use normal crash recovery) regardless of how it's started, rather than hacks around, say, upstart scripts on ubuntu, or pg_ctl that are idiosyncratic to what is a common need.

What do you think? I think this may even be backwards compatible, because if one doesn't call pg_prepare_backup then one can fall back to that upon calling pg_start_backup. The backup-in-progress command is additive, and doesn't change anything for systems that do not have it defined.

--
fdr
Re: [HACKERS] [RFC] grants vs. inherited tables
Excerpts from Marko Kreen's message of jue dic 29 15:22:49 -0300 2011: On Thu, Dec 29, 2011 at 03:12:50PM -0300, Alvaro Herrera wrote: Excerpts from Marko Kreen's message of jue dic 29 15:04:49 -0300 2011: 3) Have a way to format aclitem into something that can used to create GRANT statement easily. Eg: pg_get_privilege_info( IN priv aclitem, OUT rolename text, OUT privlist text[], OUT privlist_with_grant_option text[]); This allows doing complex introspection in pl/pgsql and also helps tools that want to re-create table structure in other databases. aclexplode? I guess that decides the name. :) I have the (hopefully wrong) impression that you're missing the fact that it already exists, at least in 9.0. I have a backported version of it we wrote for a customer, in case you're interested on using it in previous releases. Not that it's all that difficult to write ... -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] [NOVICE] index refuses to build
Merlin Moncure mmonc...@gmail.com writes: On Thu, Dec 29, 2011 at 5:10 PM, Jean-Yves F. Barbier 12u...@gmail.com wrote: CREATE INDEX tst1m_name_lu_ix ON tst1m(unaccent(name)); ERROR: functions in index expression must be marked IMMUTABLE your problem is the unaccent function. it's defined stable because the rules function it depends on can change after the index is built -- that would effectively introduce index corruption. it's possible to bypass that restriction, but are you sure that's what you want to do? Hmm ... it's clear why unaccent(text) is only stable, because it depends on the current search_path to find the unaccent dictionary. But I wonder whether it was an oversight that unaccent(regdictionary, text) is stable and not immutable. We don't normally mark functions as stable just because you could in principle change their behavior by altering some outside-the-database configuration files. regards, tom lane
[HACKERS] spinlocks on powerpc
2011/12/30 Tom Lane t...@sss.pgh.pa.us:
> Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
>> The Linux kernel does this (arch/powerpc/include/asm/ppc-opcode.h):
> Yeah, I was looking at that too.
>> We can't copy-paste code from Linux directly, and I'm not sure I like that particular phrasing of the macro, but perhaps we should steal the idea and only use the hint on 64-bit PowerPC processors?
> The info that I've found says that the hint exists beginning in POWER6, and there were certainly 64-bit Power machines before that. However, it might be that the only machines that actually spit up on the hint bit (rather than ignore it) were 32-bit, in which case this would be a usable heuristic. Not sure how we can research that ... do we want to just assume the kernel guys know what they're doing?

I'm a bit confused and might be missing the point, but... if we can decide whether to use the hint operand when we build postgres, I think it's better to check whether we can compile and run sample code with the lwarx hint operand than to refer to some arbitrary defines, such as FOO_PPC64 or something.

I still wonder when to judge hint availability: at compile time or at runtime. I don't have any idea how to decide that at runtime, though.

P.S. I changed the subject since it's no longer related to HPUX.

Regards,
Manabu Ori