Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On 9/22/11 6:59 PM, Euler Taveira de Oliveira eu...@timbira.com wrote:
>> If needed, I could do that, if I had the exact procedure... Currently, during the start of the backup I take the following information:
> Just show us the output of pg_start_backup and part of the standby log with the following message 'redo starts at' and the subsequent messages up to the failure.

Unfortunately, it's impossible, because the error message "Could not read from file "pg_clog/0001" at offset 32768: Success" is shown (and startup aborted) before the turn for the "redo starts at" message arrives. For comparison purposes, here is the log from the original message:

2011-09-21 13:41:05 CEST LOG: database system was interrupted; last known up at 2011-09-21 13:40:50 CEST
Restoring 0014.history
mv: cannot stat `/opt/PostgreSQL/9.1/archive/0014.history': No such file or directory
Restoring 0013.history
2011-09-21 13:41:05 CEST LOG: restored log file 0013.history from archive
! 2011-09-21 13:41:05 CEST LOG: entering standby mode
Restoring 0013000600DC
2011-09-21 13:41:05 CEST LOG: restored log file 0013000600DC from archive
Restoring 0013000600DB
2011-09-21 13:41:05 CEST LOG: restored log file 0013000600DB from archive
2011-09-21 13:41:05 CEST FATAL: could not access status of transaction 1188673
2011-09-21 13:41:05 CEST DETAIL: Could not read from file "pg_clog/0001" at offset 32768: Success.
2011-09-21 13:41:05 CEST LOG: startup process (PID 13819) exited with exit code 1
2011-09-21 13:41:05 CEST LOG: aborting startup due to startup process failure

As you can see, there is no "redo starts at" message. If we compare this to a healthy standby startup, we can see that the pg_clog error appears after "entering standby mode" but before "redo starts at", hence the latter is never reached. Healthy log example:

2011-09-23 09:52:31 CEST LOG: database system was interrupted; last known up at 2011-09-23 09:52:15 CEST
Restoring 0007.history
mv: cannot stat `/opt/PostgreSQL/9.1/archive/0007.history': No such file or directory
Restoring 0006.history
2011-09-23 09:52:31 CEST LOG: restored log file 0006.history from archive
2011-09-23 09:52:31 CEST LOG: entering standby mode
Restoring 00060065
2011-09-23 09:52:31 CEST LOG: restored log file 00060065 from archive
2011-09-23 09:52:31 CEST LOG: redo starts at 0/6520
2011-09-23 09:52:31 CEST LOG: consistent recovery state reached at 0/6600
2011-09-23 09:52:31 CEST LOG: database system is ready to accept read only connections
Restoring 00060066
mv: cannot stat `/opt/PostgreSQL/9.1/archive/00060066': No such file or directory
Restoring 0007.history
mv: cannot stat `/opt/PostgreSQL/9.1/archive/0007.history': No such file or directory
2011-09-23 09:52:31 CEST LOG: streaming replication successfully connected to primary

--
Best regards,
Linas Virbalas
http://flyingclusters.blogspot.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On 23.09.2011 11:02, Linas Virbalas wrote:
> On 9/22/11 6:59 PM, Euler Taveira de Oliveira eu...@timbira.com wrote:
>>> If needed, I could do that, if I had the exact procedure... Currently, during the start of the backup I take the following information:
>> Just show us the output of pg_start_backup and part of the standby log with the following message 'redo starts at' and the subsequent messages up to the failure.
> Unfortunately, it's impossible, because the error message "Could not read from file "pg_clog/0001" at offset 32768: Success" is shown (and startup aborted) before the turn for the "redo starts at" message arrives.

It looks to me like pg_clog/0001 exists, but it is shorter than recovery expects. Which shouldn't happen, of course, because the start-backup checkpoint should flush all the clog that's needed by recovery to disk before the backup procedure begins to copy them. Can you do "ls -l pg_clog" in the master and the backup, and pg_controldata on the backup dir, and post the results, please?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On Sep23, 2011, at 10:41 , Heikki Linnakangas wrote:
> On 23.09.2011 11:02, Linas Virbalas wrote:
>> On 9/22/11 6:59 PM, Euler Taveira de Oliveira eu...@timbira.com wrote:
>>>> If needed, I could do that, if I had the exact procedure... Currently, during the start of the backup I take the following information:
>>> Just show us the output of pg_start_backup and part of the standby log with the following message 'redo starts at' and the subsequent messages up to the failure.
>> Unfortunately, it's impossible, because the error message "Could not read from file "pg_clog/0001" at offset 32768: Success" is shown (and startup aborted) before the turn for the "redo starts at" message arrives.
> It looks to me like pg_clog/0001 exists, but it is shorter than recovery expects. Which shouldn't happen, of course, because the start-backup checkpoint should flush all the clog that's needed by recovery to disk before the backup procedure begins to copy them.

Yeah. What confuses me though is that we fail while trying to read from the clog. When do we do that during normal (non-standby) recovery?

One other possibility would be that the problem is somehow triggered by vacuum running while the start-backup checkpoint is commencing. That'd explain why the problem goes away with immediate checkpoints, because those make the window much smaller. But I haven't got a concrete theory of what's happening..

best regards,
Florian Pflug

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On 23.09.2011 11:48, Florian Pflug wrote:
> On Sep23, 2011, at 10:41 , Heikki Linnakangas wrote:
>> On 23.09.2011 11:02, Linas Virbalas wrote:
>>> On 9/22/11 6:59 PM, Euler Taveira de Oliveira eu...@timbira.com wrote:
>>>>> If needed, I could do that, if I had the exact procedure... Currently, during the start of the backup I take the following information:
>>>> Just show us the output of pg_start_backup and part of the standby log with the following message 'redo starts at' and the subsequent messages up to the failure.
>>> Unfortunately, it's impossible, because the error message "Could not read from file "pg_clog/0001" at offset 32768: Success" is shown (and startup aborted) before the turn for the "redo starts at" message arrives.
>> It looks to me like pg_clog/0001 exists, but it is shorter than recovery expects. Which shouldn't happen, of course, because the start-backup checkpoint should flush all the clog that's needed by recovery to disk before the backup procedure begins to copy them.
> Yeah. What confuses me though is that we fail while trying to read from the clog. When do we do that during normal (non-standby) recovery?

I believe the OP is setting up a standby. He didn't say if it's a hot standby, though. That changes the startup sequence a bit. At the end of recovery, we read the last clog page, containing the next XID that will be assigned to a transaction, and zero out the future part of that clog file (StartupCLOG). In hot standby, we do that earlier, after reading the checkpoint record but before we start replaying records.

> One other possibility would be that the problem is somehow triggered by vacuum running while the start-backup checkpoint is commencing. That'd explain why the problem goes away with immediate checkpoints, because those make the window much smaller. But I haven't got a concrete theory of what's happening..

Hmm, that's a good theory. The logic in SimpleLruPhysicalReadPage() tolerates non-existent files, because the old clog files might've been truncated away by a vacuum after the backup started. That's OK, because they will be recreated, and later deleted again, during the WAL replay. But what if something like this happens:

0. pg_start_backup().
1. rsync starts. It gets a list of all files. It notes that pg_clog/0001 exists.
2. vacuum runs, and deletes pg_clog/0001.
3. rsync starts to copy pg_clog/0001. Oops, it's gone. Instead of leaving it out altogether, it includes it as an empty file.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
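[Editor's note] To make the suspected race above concrete, here is a small, compilable C sketch (invented names; this is not rsync's actual code) of how a copier that builds its file list up front can end up writing a zero-length pg_clog segment when vacuum removes the file between the listing and the copy:

#include <stdio.h>
#include <errno.h>

/*
 * Hypothetical illustration of the suspected failure mode, not rsync source.
 * Copy "src" (which was present when the file list was built) to "dst".
 */
static int
copy_listed_file(const char *src, const char *dst)
{
    FILE  *in = fopen(src, "rb");
    int    src_errno = errno;
    FILE  *out = fopen(dst, "wb");  /* destination is created unconditionally */

    if (out == NULL)
    {
        if (in)
            fclose(in);
        return -1;
    }

    if (in == NULL)
    {
        /*
         * The source vanished (e.g. vacuum truncated away the clog segment)
         * after the file list was built: we are left with an empty file at
         * dst, which recovery will later fail to read far enough into.
         */
        fclose(out);
        return (src_errno == ENOENT) ? 0 : -1;
    }

    char   buf[8192];
    size_t n;

    while ((n = fread(buf, 1, sizeof(buf), in)) > 0)
        fwrite(buf, 1, n, out);

    fclose(in);
    fclose(out);
    return 0;
}

int
main(void)
{
    /* If "pg_clog/0001" was listed earlier but removed meanwhile, this
     * produces an empty copy instead of skipping the file entirely. */
    return copy_listed_file("pg_clog/0001", "backup/pg_clog/0001");
}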
Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases
Hi,

I think I now roughly understand the constructs and the underlying concept.

At Thu, 22 Sep 2011 12:35:56 -0400, Tom Lane t...@sss.pgh.pa.us wrote in 23159.1316709...@sss.pgh.pa.us
tgl> Sure, if the "increment the top byte" strategy proves to not accomplish
tgl> that effectively. But I'd prefer not to design a complex strategy until
tgl> it's been proven that a simpler one doesn't work.

OK, I understand that thought in general terms, but I have not grasped your measure of complexity. The current make_greater_string ticks up the tail byte of the bare byte sequence, checks the whole sequence for validity against the database encoding, and tries the next byte if that fails. The patch, on the other hand (although its style is corrupted..), searches for the last CHARACTER, tries to tick that CHARACTER up, and declines if that fails. Looking only within the scope of make_greater_string, I feel more complexity in the former because of its check-and-retry loop. Yes, "ticking the character up" has complexity of its own, but that complexity is encapsulated within a single function.

From a performance perspective, the current implementation always slips 64 times (0xff - 0xbf, for UTF-8), checks the WHOLE pattern string on every slip, and eventually declines for a small but not negligible set of around 100 code points (counting Japanese characters alone). The check-and-retry loop is no help in those cases, and it checks the whole pattern string at least once even when it eventually succeeds. The patch's algorithm, by contrast, scans the whole pattern string once to find the last character, makes no slips anywhere in the code space, and declines only at the points where the character length changes. (Of course it is possible to rescue those points as well, but that seems 'too expensive' for the gain, to me :) Not only the whole-string check, but also the validity check on the character after the increment, is essentially needless with this method.

To summarize my view: the two methods are not so different in performance for the strings the current method can already increment, and the new one seems rather more efficient and also applicable to most of the strings the current method cannot increment.

The patch currently checks the validity of the result as a last resort, guarding against a possible inconsistency caused by a careless change of the validity-check function (in other words, a change of the character set - very unlikely). Even that is unnecessary if the consistency between the incrementer and the validator has been checked after the last such modification. The four-byte memcpy could also be removed by changing the rewinding method. These modifications would make the operation cheap (I think..) for UTF-8, while leaving the behavior for other encodings untouched. As I wrote in a previous message, the modification to pg_wchar_table can be reverted until another incrementer comes along.

Can I have another chance to show another version of the patch along these lines?

Regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: [BUGS] BUG #6189: libpq: sslmode=require verifies server certificate if root.crt is present
On Wed, Aug 31, 2011 at 11:59, Srinivas Aji srinivas@emc.com wrote:
> The following bug has been logged online:
>
> Bug reference: 6189
> Logged by: Srinivas Aji
> Email address: srinivas@emc.com
> PostgreSQL version: 9.0.4
> Operating system: Linux
> Description: libpq: sslmode=require verifies server certificate if root.crt is present
> Details:
>
> From the documentation of sslmode values in http://www.postgresql.org/docs/9.0/static/libpq-ssl.html , it looks like libpq will not verify the server certificate when the option sslmode=require is used, and will perform different levels of certificate verification in the cases sslmode=verify-ca and sslmode=verify-full.
>
> The observed behaviour is a bit different. If the ~/.postgresql/root.crt file (or any other filename set through the sslrootcert option) is found, sslmode=require also performs the same level of certificate verification as verify-ca. The difference between require and verify-ca is that it is an error for the file to not exist when sslmode is verify-ca.

I looked at this again, and I'm pretty sure we did this intentionally. The idea being that before we had the verify-ca/verify-full options, adding the root cert would enable the verification. And we didn't want installations that previously did verify the certificate to stop doing so in the new version.

So basically, the behaviour that is by design is:

* require: if the root certificate exists, verify; if it doesn't exist, don't verify.
* verify-ca: if the root certificate exists, verify; if it doesn't exist, disconnect.

The question is, have we had the new options long enough now that we should change it so that we don't verify the cert in the case of cert-exists-but-verification-wasn't-explicitly-asked-for? Or should we just update the documentation to mention how this works?

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
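[Editor's note] For illustration, here is a minimal libpq client (host and database names are placeholders) showing the behaviour described above: with sslmode=require the connection is encrypted either way, but whether the server certificate is verified silently depends on whether the root.crt file happens to exist.

/* Minimal libpq connection sketch; connection parameters are placeholders. */
#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>

int
main(void)
{
    /*
     * With sslmode=require, libpq encrypts the connection, but the server
     * certificate is verified only if ~/.postgresql/root.crt (or the file
     * named by the sslrootcert parameter) is found.
     */
    const char *conninfo =
        "host=db.example.com dbname=postgres sslmode=require";

    PGconn *conn = PQconnectdb(conninfo);

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        PQfinish(conn);
        return EXIT_FAILURE;
    }

    /*
     * To always verify the certificate chain, request it explicitly:
     *   "host=db.example.com dbname=postgres sslmode=verify-ca"
     * which fails instead of silently downgrading when no root.crt exists.
     */
    printf("connected, SSL in use: %s\n", PQgetssl(conn) ? "yes" : "no");

    PQfinish(conn);
    return EXIT_SUCCESS;
}

Switching the conninfo string to sslmode=verify-ca (or verify-full) makes the verification explicit and turns a missing root certificate into a connection failure rather than a silent downgrade.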
Re: [HACKERS] patch: plpgsql - remove unnecessary ccache search when a array variable is updated
On Fri, Sep 23, 2011 at 12:26 AM, Pavel Stehule pavel.steh...@gmail.com wrote: I fixed crash that described Tom. Do you know about other? No, I just don't see a new version of the patch. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch: plpgsql - remove unnecessary ccache search when a array variable is updated
2011/9/23 Robert Haas robertmh...@gmail.com: On Fri, Sep 23, 2011 at 12:26 AM, Pavel Stehule pavel.steh...@gmail.com wrote: I fixed crash that described Tom. Do you know about other? No, I just don't see a new version of the patch. sorry - my mistake - I sent it only to Tom Regards Pavel -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company *** ./src/pl/plpgsql/src/gram.y.orig 2011-09-19 10:25:28.0 +0200 --- ./src/pl/plpgsql/src/gram.y 2011-09-22 22:02:11.0 +0200 *** *** 998,1003 --- 998,1004 new-dtype = PLPGSQL_DTYPE_ARRAYELEM; new-subscript = $3; new-arrayparentno = $1; + new-arraytypoid = InvalidOid; plpgsql_adddatum((PLpgSQL_datum *) new); *** ./src/pl/plpgsql/src/pl_exec.c.orig 2011-09-19 10:25:28.0 +0200 --- ./src/pl/plpgsql/src/pl_exec.c 2011-09-22 22:14:46.0 +0200 *** *** 3991,4008 PLpgSQL_expr *subscripts[MAXDIM]; int subscriptvals[MAXDIM]; bool oldarrayisnull; ! Oid arraytypeid, ! arrayelemtypeid; int32 arraytypmod; - int16 arraytyplen, - elemtyplen; - bool elemtypbyval; - char elemtypalign; Datum oldarraydatum, coerced_value; ArrayType *oldarrayval; ArrayType *newarrayval; SPITupleTable *save_eval_tuptable; /* * We need to do subscript evaluation, which might require --- 3991,4004 PLpgSQL_expr *subscripts[MAXDIM]; int subscriptvals[MAXDIM]; bool oldarrayisnull; ! Oid arraytypoid; int32 arraytypmod; Datum oldarraydatum, coerced_value; ArrayType *oldarrayval; ArrayType *newarrayval; SPITupleTable *save_eval_tuptable; + PLpgSQL_arrayelem *base_arrayelem; /* * We need to do subscript evaluation, which might require *** *** 4027,4032 --- 4023,4030 { PLpgSQL_arrayelem *arrayelem = (PLpgSQL_arrayelem *) target; + base_arrayelem = arrayelem; + if (nsubscripts = MAXDIM) ereport(ERROR, (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED), *** *** 4038,4061 /* Fetch current value of array datum */ exec_eval_datum(estate, target, ! arraytypeid, arraytypmod, oldarraydatum, oldarrayisnull); ! /* If target is domain over array, reduce to base type */ ! arraytypeid = getBaseTypeAndTypmod(arraytypeid, arraytypmod); ! /* ... and identify the element type */ ! arrayelemtypeid = get_element_type(arraytypeid); ! if (!OidIsValid(arrayelemtypeid)) ! ereport(ERROR, ! (errcode(ERRCODE_DATATYPE_MISMATCH), ! errmsg(subscripted object is not an array))); ! get_typlenbyvalalign(arrayelemtypeid, ! elemtyplen, ! elemtypbyval, ! elemtypalign); ! arraytyplen = get_typlen(arraytypeid); /* * Evaluate the subscripts, switch into left-to-right order. --- 4036,4067 /* Fetch current value of array datum */ exec_eval_datum(estate, target, ! arraytypoid, arraytypmod, oldarraydatum, oldarrayisnull); ! /* If base_arrayelem has initialized metadata, then use it */ ! if (base_arrayelem-arraytypoid != arraytypoid || ! base_arrayelem-arraytypmod != arraytypmod) ! { ! /* If target is domain over array, reduce to base type */ ! arraytypoid = getBaseTypeAndTypmod(arraytypoid, arraytypmod); ! base_arrayelem-arraytypoid = arraytypoid; ! base_arrayelem-arraytypmod = arraytypmod; ! /* ... and identify the element type */ ! base_arrayelem-arrayelemtypoid = get_element_type(arraytypoid); ! if (!OidIsValid(base_arrayelem-arrayelemtypoid)) ! ereport(ERROR, ! (errcode(ERRCODE_DATATYPE_MISMATCH), ! errmsg(subscripted object is not an array))); ! ! get_typlenbyvalalign(base_arrayelem-arrayelemtypoid, ! (base_arrayelem-elemtyplen), ! (base_arrayelem-elemtypbyval), ! (base_arrayelem-elemtypalign)); ! base_arrayelem-arraytyplen = get_typlen(arraytypoid); ! 
} /* * Evaluate the subscripts, switch into left-to-right order. *** *** 4093,4099 /* Coerce source value to match array element type. */ coerced_value = exec_simple_cast_value(value, valtype, ! arrayelemtypeid, arraytypmod, *isNull); --- 4099,4105 /* Coerce source value to match array element type. */ coerced_value = exec_simple_cast_value(value, valtype, ! base_arrayelem-arrayelemtypoid, arraytypmod, *isNull); *** *** 4107,4118 * array, either, so that's a no-op too. This is all ugly but * corresponds to the current
[HACKERS] Satisfy extension dependency by one of multiple extensions
Hello list, I have a use case where an extension dependency can be satisfied by one of five other extensions. Currently I'm unable to express that in the extension control file, since the elements from 'requires' are currently searched on exact name match. The attached patch changes this behaviour for list elements that end with a *, into prefix matching, so that e.g. table* matches tablefunc. This allows me to specify in a controlfile requires 'vocab*' which is satisfied by having either one of the following extensions loaded: vocab2005 vocab2006 vocab2008 vocab2009 vocab2010 thoughts? regards, Yeb Havinga -- Yeb Havinga http://www.mgrid.net/ Mastering Medical Data diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c new file mode 100644 index 278bbcf..5bf7f8e *** a/src/backend/commands/extension.c --- b/src/backend/commands/extension.c *** static void ApplyExtensionUpdates(Oid ex *** 114,145 Oid get_extension_oid(const char *extname, bool missing_ok) { ! Oid result; ! Relation rel; SysScanDesc scandesc; HeapTuple tuple; ScanKeyData entry[1]; rel = heap_open(ExtensionRelationId, AccessShareLock); ScanKeyInit(entry[0], Anum_pg_extension_extname, ! BTEqualStrategyNumber, F_NAMEEQ, CStringGetDatum(extname)); ! scandesc = systable_beginscan(rel, ExtensionNameIndexId, true, ! SnapshotNow, 1, entry); ! tuple = systable_getnext(scandesc); ! /* We assume that there can be at most one matching tuple */ ! if (HeapTupleIsValid(tuple)) ! result = HeapTupleGetOid(tuple); ! else ! result = InvalidOid; ! systable_endscan(scandesc); heap_close(rel, AccessShareLock); if (!OidIsValid(result) !missing_ok) --- 114,152 Oid get_extension_oid(const char *extname, bool missing_ok) { ! Oid result = InvalidOid; ! Relation rel, idx; SysScanDesc scandesc; HeapTuple tuple; ScanKeyData entry[1]; + int len = strlen(extname) - 1; + boolwildcard = (extname[len] == '*'); rel = heap_open(ExtensionRelationId, AccessShareLock); + idx = index_open(ExtensionNameIndexId, AccessShareLock); ScanKeyInit(entry[0], Anum_pg_extension_extname, ! BTGreaterEqualStrategyNumber, ! F_NAMEGE, CStringGetDatum(extname)); ! scandesc = systable_beginscan_ordered(rel, idx, ! SnapshotNow, 1, entry); ! if (HeapTupleIsValid(tuple = systable_getnext_ordered(scandesc, ForwardScanDirection))) ! { ! Form_pg_extension form = (Form_pg_extension) GETSTRUCT(tuple); ! if (wildcard ! ? (strncmp(extname, NameStr(form-extname), len) == 0) ! : (strcmp(extname, NameStr(form-extname)) == 0)) ! result = HeapTupleGetOid(tuple); ! } ! systable_endscan_ordered(scandesc); + index_close(idx, AccessShareLock); heap_close(rel, AccessShareLock); if (!OidIsValid(result) !missing_ok) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Satisfy extension dependency by one of multiple extensions
On 23.09.2011 14:56, Yeb Havinga wrote:
> I have a use case where an extension dependency can be satisfied by one of five other extensions. Currently I'm unable to express that in the extension control file, since the elements from 'requires' are currently searched on exact name match. The attached patch changes this behaviour for list elements that end with a *, into prefix matching, so that e.g. table* matches tablefunc.

That's going to quickly extend into even more complex rules, like "foo OR bar", "(foo OR bar) AND (foobar)", etc. IIRC the extension control file format was modeled after some other package management system (.deb?). You might want to look at the past discussions from when the extension control file format was decided.

We might want to have a system where an extension can declare that it provides capabilities, and then have another extension require those capabilities. That would be a neater solution for the case where there are multiple extensions that all provide the same capability.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] BUG #6189: libpq: sslmode=require verifies server certificate if root.crt is present
From: Magnus Hagander mag...@hagander.net To: Srinivas Aji srinivas@emc.com Cc: PostgreSQL-development pgsql-hackers@postgresql.org Sent: Friday, September 23, 2011 7:28:09 AM Subject: [HACKERS] Re: [BUGS] BUG #6189: libpq: sslmode=require verifies server certificate if root.crt is present On Wed, Aug 31, 2011 at 11:59, Srinivas Aji srinivas@emc.com wrote: The following bug has been logged online: Bug reference: 6189 Logged by: Srinivas Aji Email address: srinivas@emc.com PostgreSQL version: 9.0.4 Operating system: Linux Description: libpq: sslmode=require verifies server certificate if root.crt is present Details: ... The observed behaviour is a bit different. If the ~/.postgresql/root.crt file (or any other filename set through sslrootcert option) is found, sslmode=require also performs the same level of certificate verification as verify-ca. The difference between require and verify-ca is that it is an error for the file to not exist when sslmode is verify-ca. I looked at this again, and I'm pretty sure we did this intentionally. The idea being that before we had the verify-ca/verify-full options, adding the root cert would enable the verification. And we didn't want to turn installations that previously did verify the certificate to stop doing so in the new version. So basically, the behaviour that is by design is: * require: if certificate exists, verify. if certificate doesn't exist, don't verify. * verify-ca: if certificate exists, verify. if certificate doesn't exist, disconnect. The question is, have we had the new options long enough now that we should change it so that we don't verify the cert in the case of cert-exists-but-verification-wasn't-explicitly-asked-for? Or should we just update the documentation to mention how this works? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ Magnus, If you're accepting votes on this: I would say 'yes' - change the behavior to the most logically consistent ones; ie, isolate the verification bits a bit more explicitly. And, in documentation, indicate the deprecation of the old behavior. Our mileage, in practical terms, is that the perceived inconsistencies create a minor support hassle - we don't want to present any - even trivial - hurdle to adoption of SSL to our clients. Lou Picciano
Re: [HACKERS] Re: [BUGS] BUG #6189: libpq: sslmode=require verifies server certificate if root.crt is present
On Fri, Sep 23, 2011 at 14:35, Lou Picciano loupicci...@comcast.net wrote: On Wed, Aug 31, 2011 at 11:59, Srinivas Aji srinivas@emc.com wrote: The following bug has been logged online: Bug reference: 6189 Logged by: Srinivas Aji Email address: srinivas@emc.com PostgreSQL version: 9.0.4 Operating system: Linux Description: libpq: sslmode=require verifies server certificate if root.crt is present Details: ... The observed behaviour is a bit different. If the ~/.postgresql/root.crt file (or any other filename set through sslrootcert option) is found, sslmode=require also performs the same level of certificate verification as verify-ca. The difference between require and verify-ca is that it is an error for the file to not exist when sslmode is verify-ca. I looked at this again, and I'm pretty sure we did this intentionally. The idea being that before we had the verify-ca/verify-full options, adding the root cert would enable the verification. And we didn't want to turn installations that previously did verify the certificate to stop doing so in the new version. So basically, the behaviour that is by design is: * require: if certificate exists, verify. if certificate doesn't exist, don't verify. * verify-ca: if certificate exists, verify. if certificate doesn't exist, disconnect. The question is, have we had the new options long enough now that we should change it so that we don't verify the cert in the case of cert-exists-but-verification-wasn't-explicitly-asked-for? Or should we just update the documentation to mention how this works? Magnus, If you're accepting votes on this: I would say 'yes' - change the behavior to the most logically consistent ones; ie, isolate the verification bits a bit more explicitly. And, in documentation, indicate the deprecation of the old behavior. Our mileage, in practical terms, is that the perceived inconsistencies create a minor support hassle - we don't want to present any - even trivial - hurdle to adoption of SSL to our clients. There are really two options to this as well - we can backpatch such a change, or we can change it only in 9.2. I'm leaning towards a no on the backport, because that will change things for existing users. So probably a doc change in backbranches and a behaviour change in 9.2 would be the reasonable choice in this case. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases
On Thu, Sep 22, 2011 at 10:36 AM, Tom Lane t...@sss.pgh.pa.us wrote: Anyway, I won't stand in the way of the patch as long as it's modified to limit the number of values considered for any one character position to something reasonably small. I think that limit in both the old and new code is 1, except that the new code does it more efficiently. Am I confused? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases
On Fri, Sep 23, 2011 at 5:16 AM, Kyotaro HORIGUCHI horiguchi.kyot...@oss.ntt.co.jp wrote: Can I have another chance to show the another version of the patch according to the above? You can always post a new version of any patch. I think what you need to focus on is cleaning up the coding style to match PG project standards. In particular, you need to fix your brace positioning and the way function declarations are laid out. Someone will have to do that before this is committed, and ideally that person should be you rather than, say, me. :-) It would be good to fix the memcpy() issue Tom found too, and make a careful examination of the code for anything else that can be tidied up or made more bulletproof. It seems like we are getting close to agreement on the basic behavior, so that is good. We can always fine-tune that later; that shouldn't be much work. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On 9/23/11 12:05 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
>>> It looks to me like pg_clog/0001 exists, but it is shorter than recovery expects. Which shouldn't happen, of course, because the start-backup checkpoint should flush all the clog that's needed by recovery to disk before the backup procedure begins to copy them.
>> Yeah. What confuses me though is that we fail while trying to read from the clog. When do we do that during normal (non-standby) recovery?
> I believe the OP is setting up a standby. He didn't say if it's a hot standby, though. That changes the startup sequence a bit.

I confirm that it is a Hot Standby with Streaming Replication.

>> One other possibility would be that the problem is somehow triggered by vacuum running while the start-backup checkpoint is commencing. That'd explain why the problem goes away with immediate checkpoints, because those make the window much smaller. But I haven't got a concrete theory of what's happening..
> Hmm, that's a good theory. The logic in SimpleLruPhysicalReadPage() tolerates non-existent files, because the old clog files might've been truncated away by a vacuum after the backup started. That's OK, because they will be recreated, and later deleted again, during the WAL replay. But what if something like this happens:
> 0. pg_start_backup().
> 1. rsync starts. It gets a list of all files. It notes that pg_clog/0001 exists.
> 2. vacuum runs, and deletes pg_clog/0001.
> 3. rsync starts to copy pg_clog/0001. Oops, it's gone. Instead of leaving it out altogether, it includes it as an empty file.

I cannot confirm that. I have tried this scenario one more time and was observing how pg_clog/ looked on the master and standby. Here's an overview:

1. pg_start_backup('base_backup') - waiting...

2. Checking the size of pg_clog periodically on the master:

-bash-3.2$ ls -l pg_clog
total 8
-rw--- 1 postgres postgres 8192 Sep 23 14:31
-bash-3.2$ ls -l pg_clog
total 8
-rw--- 1 postgres postgres 8192 Sep 23 14:31
-bash-3.2$ ls -l pg_clog
total 16

3. Somewhere just before step (1) releases and we get into the rsync phase, the pg_clog size changes:

-rw--- 1 postgres postgres 16384 Sep 23 14:33
-bash-3.2$ ls -l pg_clog
total 16
-rw--- 1 postgres postgres 16384 Sep 23 14:33
-bash-3.2$ ls -l pg_clog
total 16
-rw--- 1 postgres postgres 16384 Sep 23 14:33

4. The rsync does transfer the file:

...
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,881 INFO management.script.ScriptPlugin base/16394/16405_fsm
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,881 INFO management.script.ScriptPlugin base/16394/pg_internal.init
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,886 INFO management.script.ScriptPlugin global/
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,886 INFO management.script.ScriptPlugin global/12691
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,886 INFO management.script.ScriptPlugin global/12696
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,886 INFO management.script.ScriptPlugin global/12697
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,886 INFO management.script.ScriptPlugin global/pg_internal.init
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,887 INFO management.script.ScriptPlugin pg_clog/
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,887 INFO management.script.ScriptPlugin pg_multixact/offsets/
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,887 INFO management.script.ScriptPlugin pg_notify/
...

But on the standby its size is the old one (thus, it seems, that the size changed after the rsync transfer and before pg_stop_backup() was called):

ls -l pg_clog/
total 8
-rw--- 1 postgres postgres 8192 Sep 23 14:31

5. Hence, the server doesn't start and pg_log complains:

2011-09-23 14:33:45 CEST LOG: streaming replication successfully connected to primary
2011-09-23 14:33:46 CEST FATAL: the database system is starting up
2011-09-23 14:33:46 CEST FATAL: the database system is starting up
2011-09-23 14:33:46 CEST FATAL: the database system is starting up
2011-09-23 14:33:46 CEST FATAL: the database system is starting up
2011-09-23 14:33:46 CEST FATAL: the database system is starting up
2011-09-23 14:33:46 CEST FATAL: the database system is starting up
2011-09-23 14:33:46 CEST FATAL: the database system is starting up
2011-09-23 14:33:46 CEST FATAL: could not access status of transaction 37206
2011-09-23 14:33:46 CEST DETAIL: Could not read from file "pg_clog/" at offset 8192: Success.

6. Now, if I do something that, of course, should never be done, and copy this file from master to the standby soon enough (without even starting/stopping backup), the standby starts up successfully.

--
Hope this helps,
Linas Virbalas
http://flyingclusters.blogspot.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] BUG #6189: libpq: sslmode=require verifies server certificate if root.crt is present
On Fri, Sep 23, 2011 at 8:38 AM, Magnus Hagander mag...@hagander.net wrote: On Fri, Sep 23, 2011 at 14:35, Lou Picciano loupicci...@comcast.net wrote: On Wed, Aug 31, 2011 at 11:59, Srinivas Aji srinivas@emc.com wrote: The following bug has been logged online: Bug reference: 6189 Logged by: Srinivas Aji Email address: srinivas@emc.com PostgreSQL version: 9.0.4 Operating system: Linux Description: libpq: sslmode=require verifies server certificate if root.crt is present Details: ... The observed behaviour is a bit different. If the ~/.postgresql/root.crt file (or any other filename set through sslrootcert option) is found, sslmode=require also performs the same level of certificate verification as verify-ca. The difference between require and verify-ca is that it is an error for the file to not exist when sslmode is verify-ca. I looked at this again, and I'm pretty sure we did this intentionally. The idea being that before we had the verify-ca/verify-full options, adding the root cert would enable the verification. And we didn't want to turn installations that previously did verify the certificate to stop doing so in the new version. So basically, the behaviour that is by design is: * require: if certificate exists, verify. if certificate doesn't exist, don't verify. * verify-ca: if certificate exists, verify. if certificate doesn't exist, disconnect. The question is, have we had the new options long enough now that we should change it so that we don't verify the cert in the case of cert-exists-but-verification-wasn't-explicitly-asked-for? Or should we just update the documentation to mention how this works? Magnus, If you're accepting votes on this: I would say 'yes' - change the behavior to the most logically consistent ones; ie, isolate the verification bits a bit more explicitly. And, in documentation, indicate the deprecation of the old behavior. Our mileage, in practical terms, is that the perceived inconsistencies create a minor support hassle - we don't want to present any - even trivial - hurdle to adoption of SSL to our clients. There are really two options to this as well - we can backpatch such a change, or we can change it only in 9.2. I'm leaning towards a no on the backport, because that will change things for existing users. So probably a doc change in backbranches and a behaviour change in 9.2 would be the reasonable choice in this case. I definitely don't think we should back-patch a behavior change that silently weakens security. That's not good. But what about not doing it in master, either? It seems to me that we could avoid ever breaking backward compatibility by adding a new option require-no-verify. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases
Robert Haas robertmh...@gmail.com writes: On Thu, Sep 22, 2011 at 10:36 AM, Tom Lane t...@sss.pgh.pa.us wrote: Anyway, I won't stand in the way of the patch as long as it's modified to limit the number of values considered for any one character position to something reasonably small. I think that limit in both the old and new code is 1, except that the new code does it more efficiently. Am I confused? Yes, or else I am. Consider a 4-byte UTF8 character at the end of the string. The existing code increments the last byte up to 255 (rejecting everything past 0xBF), then gives up and truncates that character away. So the maximum number of tries for that character position is between 0 and 127 depending on what the original character was (with at most 63 of the incremented values getting past the verifymbstr test). The proposed patch is going to iterate through all Unicode code points up to U+7F before giving up. Since it's possible that we need to increment something further left to succeed at all, this doesn't seem like a good plan. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
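[Editor's note] To make the comparison concrete, here is a minimal, self-contained C sketch (illustrative only - not the actual make_greater_string() code, and the non-continuation ceiling is an invented simplification) of the byte-increment step Tom describes, with the 0xBF ceiling that applies to UTF-8 continuation bytes:

#include <stdbool.h>
#include <stdio.h>

/*
 * Illustrative sketch: try the "increment the last byte" strategy on the
 * final byte of the pattern. A UTF-8 continuation byte is only valid in
 * the range 0x80..0xBF, so at most 63 increments can ever yield a
 * well-formed character; past 0xBF the caller must truncate the last
 * character and retry one character to the left.
 */
static bool
try_increment_last_byte(unsigned char *lastbyte, bool is_utf8_continuation)
{
    unsigned char ceiling = is_utf8_continuation ? 0xBF : 0xFF;

    if (*lastbyte >= ceiling)
        return false;       /* exhausted this position; truncate and retry */

    (*lastbyte)++;
    return true;            /* candidate produced; caller re-validates */
}

int
main(void)
{
    unsigned char b = 0xBE;     /* a continuation byte right below the limit */

    printf("%d\n", try_increment_last_byte(&b, true));   /* 1: bumped to 0xBF */
    printf("%d\n", try_increment_last_byte(&b, true));   /* 0: position exhausted */
    return 0;
}

A caller loops over this, re-validating and re-comparing the whole pattern after each bump, and truncates the last character to move one position further left once the ceiling is hit - which is exactly the retry cost the proposed patch tries to avoid by incrementing whole code points instead of raw bytes.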
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On Fri, Sep 23, 2011 at 8:47 AM, Linas Virbalas linas.virba...@continuent.com wrote: But on the standby its size is the old one (thus, it seems, that the size changed after the rsync transfer and before the pg_stop_backup() was called): Now that seems pretty weird - I don't think that file should ever shrink. Are you sure that the rsync isn't starting until after pg_start_backup() completes? Because if you were starting it just a tiny bit early, that would explain the observed symptoms perfectly, I think. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] BUG #6189: libpq: sslmode=require verifies server certificate if root.crt is present
Magnus Hagander mag...@hagander.net writes: I looked at this again, and I'm pretty sure we did this intentionally. Yeah, we did. Or should we just update the documentation to mention how this works? +1 for doc change only. I think the behavior was thought through carefully, and the wording of the docs maybe not so much. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] BUG #6189: libpq: sslmode=require verifies server certificate if root.crt is present
From: Magnus Hagander mag...@hagander.net To: Lou Picciano loupicci...@comcast.net Cc: PostgreSQL-development pgsql-hackers@postgresql.org, Srinivas Aji srinivas@emc.com Sent: Friday, September 23, 2011 8:38:00 AM Subject: Re: [HACKERS] Re: [BUGS] BUG #6189: libpq: sslmode=require verifies server certificate if root.crt is present On Fri, Sep 23, 2011 at 14:35, Lou Picciano loupicci...@comcast.net wrote: On Wed, Aug 31, 2011 at 11:59, Srinivas Aji srinivas@emc.com wrote: The following bug has been logged online: Bug reference: 6189 Logged by: Srinivas Aji Email address: srinivas@emc.com PostgreSQL version: 9.0.4 Operating system: Linux Description: libpq: sslmode=require verifies server certificate if root.crt is present Details: ... The observed behaviour is a bit different. If the ~/.postgresql/root.crt file (or any other filename set through sslrootcert option) is found, sslmode=require also performs the same level of certificate verification as verify-ca. The difference between require and verify-ca is that it is an error for the file to not exist when sslmode is verify-ca. I looked at this again, and I'm pretty sure we did this intentionally. The idea being that before we had the verify-ca/verify-full options, adding the root cert would enable the verification. And we didn't want to turn installations that previously did verify the certificate to stop doing so in the new version. So basically, the behaviour that is by design is: * require: if certificate exists, verify. if certificate doesn't exist, don't verify. * verify-ca: if certificate exists, verify. if certificate doesn't exist, disconnect. The question is, have we had the new options long enough now that we should change it so that we don't verify the cert in the case of cert-exists-but-verification-wasn't-explicitly-asked-for? Or should we just update the documentation to mention how this works? Magnus, If you're accepting votes on this: I would say 'yes' - change the behavior to the most logically consistent ones; ie, isolate the verification bits a bit more explicitly. And, in documentation, indicate the deprecation of the old behavior. Our mileage, in practical terms, is that the perceived inconsistencies create a minor support hassle - we don't want to present any - even trivial - hurdle to adoption of SSL to our clients. There are really two options to this as well - we can backpatch such a change, or we can change it only in 9.2. I'm leaning towards a no on the backport, because that will change things for existing users. So probably a doc change in backbranches and a behaviour change in 9.2 would be the reasonable choice in this case. Again, if you were soliciting votes, I'd take the aggressive stance: +1 for the backport to 9.1. Of the population using SSL, you'd be pulling out the subset getting all the way down into PKI implementation, then, those actually doing apps teasing out these differences in verification behavior... Among _that_ group, you're only concerned with recent adopters of 9.1, and only those who wouldn't be in a position to adapt pretty quickly. Probably a pretty small cohort for something this esoteric. In our case, we do run into it - for our new clients. We find ourselves in something of a support role regarding pqlib's SSL capabilities! Lou Picciano
Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases
On Fri, Sep 23, 2011 at 8:51 AM, Tom Lane t...@sss.pgh.pa.us wrote:
> Robert Haas robertmh...@gmail.com writes:
>> On Thu, Sep 22, 2011 at 10:36 AM, Tom Lane t...@sss.pgh.pa.us wrote:
>>> Anyway, I won't stand in the way of the patch as long as it's modified to limit the number of values considered for any one character position to something reasonably small.
>> I think that limit in both the old and new code is 1, except that the new code does it more efficiently. Am I confused?
> Yes, or else I am. Consider a 4-byte UTF8 character at the end of the string. The existing code increments the last byte up to 255 (rejecting everything past 0xBF), then gives up and truncates that character away. So the maximum number of tries for that character position is between 0 and 127 depending on what the original character was (with at most 63 of the incremented values getting past the verifymbstr test). The proposed patch is going to iterate through all Unicode code points up to U+7F before giving up. Since it's possible that we need to increment something further left to succeed at all, this doesn't seem like a good plan.

I think you're misreading the code. It does this:

while (len > 0)
{
    boring stuff;
    if (charincfunc(lastchar, charlen))
    {
        more boring stuff;
        if (we made a greater string)
            return it;
        cleanup;
    }
    truncate away last character;
}

I don't see how that's ever going to try more than one character in the same position. What may be confusing you is that the old code has two loops: an outer loop that tests whether we've made a greater string, and an inner loop that tests whether we've made a validly encoded string at all. In the new code, at least in the UTF-8 case, the inner loop is GONE altogether. Instead of iterating until we construct a valid character, we just use our mad UTF-8 skillz to assemble one, and return it. Or else I need to go drink a few cups of tea and look at this again.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] new createuser option for replication role
On Thu, Sep 22, 2011 at 12:45 PM, Fujii Masao masao.fu...@gmail.com wrote: Agreed. Attached is the updated version of the patch. It adds two options --replication and --no-replication. If neither specified, neither REPLICATION nor NOREPLICATION is specified in CREATE ROLE, i.e., in this case, replication privilege is granted to only superuser. Committed. Do we need to make any changes in interactive mode, when we prompt? In theory this could be either wanted or not wanted for either superusers or non-superusers, but I'm not really sure it's worth it, and I certainly don't want the command to go into interactive mode just because neither --replication nor --no-replication was specified. Thoughts? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Satisfy extension dependency by one of multiple extensions
On 2011-09-23 14:19, Heikki Linnakangas wrote: On 23.09.2011 14:56, Yeb Havinga wrote: I have a use case where an extension dependency can be satisfied by one of five other extensions. Currently I'm unable to express that in the extension control file, since the elements from 'requires' are currently searched on exact name match. The attached patch changes this behaviour for list elements that end with a *, into prefix matching, so that e.g. table* matches tablefunc. That's going to quickly extend into even more complex rules, like foo OR bar, (foo OR bar) AND (foobar) etc. IIRC the extension control file format was modeled after some other package management system (.deb ?). You might want to look at the past discussions when the extension control file format was decided. Ech.. 2364 hits on 'extension' in my mailbox. However I found a thread 'extension dependency checking' that also talks about version numbers and the 'provides' capability you mention below. We might want to have a system where an extension can declare that it provides capabilites, and then have another extension require those capabilities. That would be a neater solution to the case that there are multiple extensions that all provide the same capability. Yes that would be neater. I can't think of anything else however than to add 'extprovides' to pg_extension, fill it with an explicit 'provides' from the control file when present, or extname otherwise, and use that column to check the 'requires' list on extension creation time. -- Yeb Havinga http://www.mgrid.net/ Mastering Medical Data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [REVIEW] pg_last_xact_insert_timestamp
On Thu, Sep 15, 2011 at 4:52 AM, Fujii Masao masao.fu...@gmail.com wrote:
> Thanks for the review!

Kyotaro Horiguchi - Are you going to re-review the latest version of this patch?

Thanks,

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] BUG #6189: libpq: sslmode=require verifies server certificate if root.crt is present
On Fri, Sep 23, 2011 at 14:49, Robert Haas robertmh...@gmail.com wrote: On Fri, Sep 23, 2011 at 8:38 AM, Magnus Hagander mag...@hagander.net wrote: On Fri, Sep 23, 2011 at 14:35, Lou Picciano loupicci...@comcast.net wrote: On Wed, Aug 31, 2011 at 11:59, Srinivas Aji srinivas@emc.com wrote: The following bug has been logged online: Bug reference: 6189 Logged by: Srinivas Aji Email address: srinivas@emc.com PostgreSQL version: 9.0.4 Operating system: Linux Description: libpq: sslmode=require verifies server certificate if root.crt is present Details: ... The observed behaviour is a bit different. If the ~/.postgresql/root.crt file (or any other filename set through sslrootcert option) is found, sslmode=require also performs the same level of certificate verification as verify-ca. The difference between require and verify-ca is that it is an error for the file to not exist when sslmode is verify-ca. I looked at this again, and I'm pretty sure we did this intentionally. The idea being that before we had the verify-ca/verify-full options, adding the root cert would enable the verification. And we didn't want to turn installations that previously did verify the certificate to stop doing so in the new version. So basically, the behaviour that is by design is: * require: if certificate exists, verify. if certificate doesn't exist, don't verify. * verify-ca: if certificate exists, verify. if certificate doesn't exist, disconnect. The question is, have we had the new options long enough now that we should change it so that we don't verify the cert in the case of cert-exists-but-verification-wasn't-explicitly-asked-for? Or should we just update the documentation to mention how this works? Magnus, If you're accepting votes on this: I would say 'yes' - change the behavior to the most logically consistent ones; ie, isolate the verification bits a bit more explicitly. And, in documentation, indicate the deprecation of the old behavior. Our mileage, in practical terms, is that the perceived inconsistencies create a minor support hassle - we don't want to present any - even trivial - hurdle to adoption of SSL to our clients. There are really two options to this as well - we can backpatch such a change, or we can change it only in 9.2. I'm leaning towards a no on the backport, because that will change things for existing users. So probably a doc change in backbranches and a behaviour change in 9.2 would be the reasonable choice in this case. I definitely don't think we should back-patch a behavior change that silently weakens security. That's not good. But what about not doing it in master, either? It seems to me that we could avoid ever breaking backward compatibility by adding a new option require-no-verify. Hmm. Intersting. and we could then deprecate the require option and kill it off 4 releases later or so, I guess... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] DECLARE CURSOR must not contain data-modifying statements in WITH
On Wed, Sep 21, 2011 at 12:19 PM, Andres Freund and...@anarazel.de wrote:
>     /*
>      * We also disallow data-modifying WITH in a cursor. (This could be
>      * allowed, but the semantics of when the updates occur might be
>      * surprising.)
>      */
>     if (result->hasModifyingCTE)
>         ereport(ERROR,
>                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>                  errmsg("DECLARE CURSOR must not contain data-modifying statements in WITH")));
>
> Given that cursors are about the only sensible way to return larger amounts of data, that behaviour reduces the usefulness of wCTEs a bit. What's the exact cause of concern here? I personally don't think there is a problem documenting that you should fetch the cursor fully before relying on the updated tables to be in a sensible state. But that may be just me.

Well, it looks like right now you can't even use a simple INSERT .. RETURNING there:

rhaas=# create table wuzzle (a int);
CREATE TABLE
rhaas=# declare w cursor for insert into wuzzle select g from generate_series(1, 10) g returning g;
ERROR:  syntax error at or near "insert"
LINE 1: declare w cursor for insert into wuzzle select g from genera...
                             ^

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] new createuser option for replication role
2011/9/23 Robert Haas robertmh...@gmail.com:
> On Thu, Sep 22, 2011 at 12:45 PM, Fujii Masao masao.fu...@gmail.com wrote:
>> Agreed. Attached is the updated version of the patch. It adds two options --replication and --no-replication. If neither specified, neither REPLICATION nor NOREPLICATION is specified in CREATE ROLE, i.e., in this case, replication privilege is granted to only superuser.
> Committed. Do we need to make any changes in interactive mode, when we prompt? In theory this could be either wanted or not wanted for either superusers or non-superusers, but I'm not really sure it's worth it, and I certainly don't want the command to go into interactive mode just because neither --replication nor --no-replication was specified. Thoughts?

I believe the interactive mode is useless.

There is still an issue with the patch for the documentation: a superuser, even with NOREPLICATION, is allowed to perform pg_start_backup() and pg_stop_backup().

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] new createuser option for replication role
2011/9/23 Cédric Villemain cedric.villemain.deb...@gmail.com: 2011/9/23 Robert Haas robertmh...@gmail.com: On Thu, Sep 22, 2011 at 12:45 PM, Fujii Masao masao.fu...@gmail.com wrote: Agreed. Attached is the updated version of the patch. It adds two options --replication and --no-replication. If neither specified, neither REPLICATION nor NOREPLICATION is specified in CREATE ROLE, i.e., in this case, replication privilege is granted to only superuser. Committed. Do we need to make any changes in interactive mode, when we prompt? In theory this could be either wanted or not wanted for either superusers or non-superusers, but I'm not really sure it's worth it, and I certainly don't want the command to go into interactive mode just because neither --replication nor --no-replication was specified. Thoughts? I believe the intereactive mode is useless. There is still an issue with the patch for the documentation: a superuser, even with NOREPLICATION is allowed to perform pg_start_backup() and pg_stop_backup(). noise, sorry I've just read the commited patch which fixed that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation -- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] BUG #6189: libpq: sslmode=require verifies server certificate if root.crt is present
Excerpts from Magnus Hagander's message of vie sep 23 10:39:46 -0300 2011: On Fri, Sep 23, 2011 at 14:49, Robert Haas robertmh...@gmail.com wrote: On Fri, Sep 23, 2011 at 8:38 AM, Magnus Hagander mag...@hagander.net wrote: On Fri, Sep 23, 2011 at 14:35, Lou Picciano loupicci...@comcast.net wrote: On Wed, Aug 31, 2011 at 11:59, Srinivas Aji srinivas@emc.com wrote: The following bug has been logged online: Bug reference: 6189 Logged by: Srinivas Aji Email address: srinivas@emc.com PostgreSQL version: 9.0.4 Operating system: Linux Description: libpq: sslmode=require verifies server certificate if root.crt is present So basically, the behaviour that is by design is: * require: if certificate exists, verify. if certificate doesn't exist, don't verify. * verify-ca: if certificate exists, verify. if certificate doesn't exist, disconnect. I definitely don't think we should back-patch a behavior change that silently weakens security. That's not good. But what about not doing it in master, either? It seems to me that we could avoid ever breaking backward compatibility by adding a new option require-no-verify. Hmm. Intersting. and we could then deprecate the require option and kill it off 4 releases later or so, I guess... So we would have sslmode=verify-ca / require-no-verify / verify-full / disable / allow / prefer ? This seems strange to me. Why not have a second option to let the user indicate the desired SSL verification? sslmode=disable/allow/prefer/require sslverify=none/ca-if-present/ca/full (ca-if-present being the current require sslmode behavior). We could then deprecate sslmode=verify and verify-full and have them be synonyms of sslmode=require and corresponding sslverify. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [pgsql-advocacy] Unlogged vs. In-Memory
[ moving to -hacker s] On Thu, Sep 22, 2011 at 9:26 PM, Thom Brown t...@linux.com wrote: On 22 September 2011 17:38, Josh Berkus j...@agliodbs.com wrote: So are there any plans to allow swappable drive/volatile storage unlogged tables? Be our guest. ;-) Oh it can't be that difficult. On first glance it looks like it's a case of piggy-backing mdopen and getting it to treat RELPERSISTENCE_TEMP relations in the same way as it would for relations during the bootstrap script (i.e. create it if it doesn't exist)... then telling it not to try reading anything from the relation... or something like this. But I don't know C so... *puppy-dog eyes* I don't think that's it. It seems to me that what we really need to do is put the _init forks in a different directory than all the other forks (and then fix pg_upgrade to move them if upgrading from 9.1) - so the core code change is actually in relpathbackend(). The question is how you want to expose that to the user. I mean, you could just put them in another directory under the rug, and call it good. Users who know about the magic under the hood can fiddle with their mount points, and everyone else will be none the wiser. I'm not sure whether that's acceptable from a usability standpoint, however. If you want to expose an SQL-level interface, this gets to be a harder problem... starting with defining what exactly that interface should look like. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] BUG #6189: libpq: sslmode=require verifies server certificate if root.crt is present
On Fri, Sep 23, 2011 at 15:55, Alvaro Herrera alvhe...@commandprompt.com wrote: Excerpts from Magnus Hagander's message of vie sep 23 10:39:46 -0300 2011: On Fri, Sep 23, 2011 at 14:49, Robert Haas robertmh...@gmail.com wrote: On Fri, Sep 23, 2011 at 8:38 AM, Magnus Hagander mag...@hagander.net wrote: On Fri, Sep 23, 2011 at 14:35, Lou Picciano loupicci...@comcast.net wrote: On Wed, Aug 31, 2011 at 11:59, Srinivas Aji srinivas@emc.com wrote: The following bug has been logged online: Bug reference: 6189 Logged by: Srinivas Aji Email address: srinivas@emc.com PostgreSQL version: 9.0.4 Operating system: Linux Description: libpq: sslmode=require verifies server certificate if root.crt is present So basically, the behaviour that is by design is: * require: if certificate exists, verify. if certificate doesn't exist, don't verify. * verify-ca: if certificate exists, verify. if certificate doesn't exist, disconnect. I definitely don't think we should back-patch a behavior change that silently weakens security. That's not good. But what about not doing it in master, either? It seems to me that we could avoid ever breaking backward compatibility by adding a new option require-no-verify. Hmm. Intersting. and we could then deprecate the require option and kill it off 4 releases later or so, I guess... So we would have sslmode=verify-ca / require-no-verify / verify-full / disable / allow / prefer ? This seems strange to me. Why not have a second option to let the user indicate the desired SSL verification? sslmode=disable/allow/prefer/require sslverify=none/ca-if-present/ca/full (ca-if-present being the current require sslmode behavior). We could then deprecate sslmode=verify and verify-full and have them be synonyms of sslmode=require and corresponding sslverify. Hmm. I agree that the other suggestion was a bit weird, but I'm not sure I like the multiple-options approach either. That's going to require redesign of all software that deals with it at all today :S Maybe we should just update the docs and be done with it :-) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
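To make the comparison concrete, a sketch of the connection strings involved (the host is a placeholder; the sslverify keyword is purely hypothetical, taken from the proposal above):

    # today: whether the server certificate is verified under sslmode=require
    # depends on whether ~/.postgresql/root.crt happens to exist
    host=db.example.com sslmode=require

    # proposed: encryption and verification requested independently
    host=db.example.com sslmode=require sslverify=full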
Re: [HACKERS] index-only scans
2011/8/16 Anssi Kääriäinen anssi.kaariai...@thl.fi: On 08/14/2011 12:31 AM, Heikki Linnakangas wrote: The same idea could of course be used to calculate the effective cache hit ratio for each table. Cache hit ratio would have the problem of feedback loops, though. Yeah, I'm not excited about making the planner and statistics more dynamic. Feedback loops and plan instability are not fun. I might be a little out of my league here... But I was thinking about the cache hit ratio and feedback loops. I understand automatic tuning would be hard. But making automatic tuning easier (by using pg_tune for example) would be a big plus for most use cases. To make it easier to tune the page read costs automatically, it would be nice if there would be four variables instead of the current two: - random_page_cost is the cost of reading a random page from storage. Currently it is not, it is the cost of accessing a random page, taking in account it might be in memory. - seq_page_cost is the cost of reading pages sequentially from storage - memory_page_cost is the cost of reading a page in memory - cache_hit_ratio is the expected cache hit ratio memory_page_cost would be server global, random and seq page costs tablespace specific, and cache_hit_ratio relation specific. You would get the current behavior by tuning *_page_costs realistically, and setting cache_hit_ratio globally so that the expected random_page_cost / seq_page_cost stays the same as now. The biggest advantage of this would be that the correct values are much easier to detect automatically compared to current situation. This can be done using pg_statio_* views and IO speed testing. They should not be tuned automatically by PostgreSQL, at least not the cache_hit_ratio, as that leads to the possibility of feedback loops and plan instability. The variables would also be much easier to understand. There is the question if one should be allowed to tune the *_page_costs at all. If I am not missing something, it is possible to detect the correct values programmatically and they do not change if you do not change the hardware. Cache hit ratio is the real reason why they are currently so important for tuning. An example why the current random_page_cost and seq_page_cost tuning is not adequate is that you can only set random_page_cost per tablespace. That makes perfect sense if random_page_cost would be the cost of accessing a page in storage. But it is not, it is a combination of that and caching effects, so that it actually varies per relation (and over time). How do you set it correctly for a query where one relation is fully cached and another one not? Another problem is that if you use random_page_cost == seq_page_cost, you are effectively saying that everything is in cache. But if everything is in cache, the cost of page access relative to cpu_*_costs is way off. The more random_page_cost and seq_page_cost are different, the more they mean the storage access costs. When they are the same, they mean the memory page cost. There can be an order of magnitude in difference of a storage page cost and a memory page cost. So it is hard to tune the cpu_*_costs realistically for cases where sometimes data is in cache and sometimes not. Ok, enough hand waving for one post :) Sorry if this all is obvious / discussed before. 
My googling didn't turn out anything directly related, although these have some similarity: - Per-table random_page_cost for tables that we know are always cached [http://archives.postgresql.org/pgsql-hackers/2008-04/msg01503.php] - Script to compute random page cost [http://archives.postgresql.org/pgsql-hackers/2002-09/msg00503.php] - The science of optimization in practical terms? [http://archives.postgresql.org/pgsql-hackers/2009-02/msg00718.php], getting really interesting starting from here: [http://archives.postgresql.org/pgsql-hackers/2009-02/msg00787.php] late reply. You can add this link to your list: http://archives.postgresql.org/pgsql-hackers/2011-06/msg01140.php -- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
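As a concrete point of reference for the per-tablespace remark above, this much already exists today, with the proposed decomposition sketched in the comments (the tablespace name is made up):

    -- 9.0 and later: storage costs can be overridden per tablespace, but they
    -- still fold the cache-hit assumption into a single number
    ALTER TABLESPACE fast_ssd SET (seq_page_cost = 1.0, random_page_cost = 2.0);

    -- under the proposal, the planner would instead combine the pieces, roughly:
    --   effective_random_cost = cache_hit_ratio * memory_page_cost
    --                         + (1 - cache_hit_ratio) * random_page_cost
    -- with memory_page_cost global, the storage costs per tablespace, and
    -- cache_hit_ratio per relation.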
Re: [HACKERS] [pgsql-advocacy] Unlogged vs. In-Memory
On 23 September 2011 15:12, Robert Haas robertmh...@gmail.com wrote: [ moving to -hacker s] On Thu, Sep 22, 2011 at 9:26 PM, Thom Brown t...@linux.com wrote: On 22 September 2011 17:38, Josh Berkus j...@agliodbs.com wrote: So are there any plans to allow swappable drive/volatile storage unlogged tables? Be our guest. ;-) Oh it can't be that difficult. On first glance it looks like it's a case of piggy-backing mdopen and getting it to treat RELPERSISTENCE_TEMP relations in the same way as it would for relations during the bootstrap script (i.e. create it if it doesn't exist)... then telling it not to try reading anything from the relation... or something like this. But I don't know C so... *puppy-dog eyes* I don't think that's it. It seems to me that what we really need to do is put the _init forks in a different directory than all the other forks (and then fix pg_upgrade to move them if upgrading from 9.1) - so the core code change is actually in relpathbackend(). The question is how you want to expose that to the user. I mean, you could just put them in another directory under the rug, and call it good. Users who know about the magic under the hood can fiddle with their mount points, and everyone else will be none the wiser. I'm not sure whether that's acceptable from a usability standpoint, however. If you want to expose an SQL-level interface, this gets to be a harder problem... starting with defining what exactly that interface should look like. Couldn't this come under tablespace changes then? After all the use-case stated would require a separate tablespace, and you could do something like: CREATE VOLATILE TABLESPACE drive_made_of_wax_left_in_the_sun LOCATION '/mnt/ramdisk'; All objects then created or reassigned therein would insert magic stuff here. In theory it would be independent of UNLOGGEDness, but I can see this would be problematic because such tables wouldn't be allowed foreign key references to tables within a stable tablespace and vice-versa, since the wonky tablespace could collapse any minute and integrity with it. -- Thom Brown Twitter: @darkixion IRC (freenode): dark_ixion Registered Linux user: #516935 EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [BUGS] BUG #6189: libpq: sslmode=require verifies server certificate if root.crt is present
Excerpts from Magnus Hagander's message of vie sep 23 11:31:37 -0300 2011: On Fri, Sep 23, 2011 at 15:55, Alvaro Herrera alvhe...@commandprompt.com wrote: This seems strange to me. Why not have a second option to let the user indicate the desired SSL verification? sslmode=disable/allow/prefer/require sslverify=none/ca-if-present/ca/full (ca-if-present being the current require sslmode behavior). We could then deprecate sslmode=verify and verify-full and have them be synonyms of sslmode=require and corresponding sslverify. Hmm. I agree that the other suggestion was a bit weird, but I'm not sure I like the multiple-options approach either. That's going to require redesign of all software that deals with it at all today :S Why? They could continue to use the existing options; or switch to the new options if they wanted different behavior, as is the case of the OP. Maybe we should just update the docs and be done with it :-) That's another option, sure ... :-) -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Large C files
On Fri, Sep 9, 2011 at 11:28 PM, Peter Geoghegan pe...@2ndquadrant.com wrote: It's very difficult or impossible to anticipate how effective the tool will be in practice, but when you consider that it works and does not produce false positives for the first 3 real-world cases tested, it seems reasonable to assume that it's at least worth having around. Tom recently said of a previous pgrminclude campaign in July 2006 that It took us two weeks to mostly recover, but we were still dealing with some fallout in December. I think that makes the case for adding this tool or some refinement as a complement to pgrminclude in src/tools fairly compelling. I'm not opposed to adding something like this, but I think it needs to either be tied into the actual running of the script, or have a lot more documentation than it does now, or both. I am possibly stupid, but I can't understand from reading the script (or, honestly, the thread) exactly what kind of pgrminclude-induced errors this is protecting against; but even if we clarify that, it seems like it would be a lot of work to run it manually on all the files that might be affected by a pgrminclude run, unless we can somehow automate that. I'm also curious to see how much more fallout we're going to see from that run. We had a few glitches when it was first done, but it didn't seem like they were really all that bad. It might be that we'd be better off running pgrminclude a lot *more* often (like once a cycle, or even after every CommitFest), because the scope of the changes would then be far smaller and we wouldn't be dealing with 5 years of accumulated cruft all at once; we'd also get a lot more experience with what works or does not work with the script, which might lead to improvements in that script on a less-than-geologic time scale. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] DECLARE CURSOR must not contain data-modifying statements in WITH
On Friday 23 Sep 2011 15:42:48 Robert Haas wrote: On Wed, Sep 21, 2011 at 12:19 PM, Andres Freund and...@anarazel.de wrote: /* * We also disallow data-modifying WITH in a cursor. (This could be * allowed, but the semantics of when the updates occur might be * surprising.) */ if (result->hasModifyingCTE) ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("DECLARE CURSOR must not contain data-modifying statements in WITH"))); Given that cursors are about the only sensible way to return larger amounts of data, that behaviour reduces the usefulness of wCTEs a bit. What's the exact cause of concern here? I personally don't think there is a problem documenting that you should fetch the cursor fully before relying on the updated tables to be in a sensible state. But that may be just me. Well, it looks like right now you can't even use a simple INSERT .. RETURNING there: rhaas=# create table wuzzle (a int); CREATE TABLE rhaas=# declare w cursor for insert into wuzzle select g from generate_series(1, 10) g returning g; ERROR: syntax error at or near "insert" LINE 1: declare w cursor for insert into wuzzle select g from genera... One could argue that it's easier to implement it using a wCTE because the wCTE will simply materialize the query's result upfront. That makes handling the case where somebody fetches 3 tuples from a query updating 10 easier. That's a bit harder for the normal cursor case because there is no tuplestore around to do that (except the WITH HOLD case where that is only used on commit...). I find it an acceptable way to enforce using a CTE to do cursors on DML because it makes it more clear that they will be fully executed on start... Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [pgsql-advocacy] Unlogged vs. In-Memory
On Fri, Sep 23, 2011 at 10:37 AM, Thom Brown t...@linux.com wrote: Couldn't this come under tablespace changes then? After all the use-case stated would require a separate tablespace, and you could do something like: CREATE VOLATILE TABLESPACE drive_made_of_wax_left_in_the_sun LOCATION '/mnt/ramdisk'; All objects then created or reassigned therein would insert magic stuff here. In theory it would be independent of UNLOGGEDness, but I can see this would be problematic because such tables wouldn't be allowed foreign key references to tables within a stable tablespace and vice-versa, since the wonky tablespace could collapse any minute and integrity with it. I don't get it. It would certainly be possible to create a VOLATILE TABLESPACE in which only TEMPORARY tables could be created. We would just disallow the creation of anything other than a temporary table within that tablespace, and if the contents of the tablespace get wiped out, WDC. (Mind you, I think we'd likely want to insist that the pg-version directory manufactured by CREATE TABLESPACE would stick around... or else we'd need some provision for recreating it on every startup.) However, if you want a VOLATILE TABLESPACE to allow not only TEMPORARY but also UNLOGGED objects, it's not so simple, because the _init forks of an unlogged relation are not disposable. Those are not allowed to disappear, or you're going to be in trouble. So the issue still comes down to this: where are we gonna put those _init forks? I guess we could do something like this: CREATE TABLESPACE now_you_see_me_now_you_dont LOCATION '/mnt/highly_reliable_san' VOLATILE LOCATION '/mnt/ramdisk'; All forks of temporary relations, and all non-_init forks of non-temporary relations, could be stored in the VOLATILE LOCATION, while everything else could be stored in the regular LOCATION. Hmm... actually, I kind of like that. Thoughts? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [pgsql-advocacy] Unlogged vs. In-Memory
On Fri, Sep 23, 2011 at 10:54 AM, Robert Haas robertmh...@gmail.com wrote: CREATE TABLESPACE now_you_see_me_now_you_dont LOCATION '/mnt/highly_reliable_san' VOLATILE LOCATION '/mnt/ramdisk'; All forks of temporary relations, and all non-_init forks of non-temporary relations, could be stored in the VOLATILE LOCATION, while everything else could be stored in the regular LOCATION. Hmm... actually, I kind of like that. Thoughts? Gah. I mean, all forks of temporary relations, and all non-_init forks of *unlogged* relations, could be stored in the VOLATILE LOCATION. Permanent tables would have all forks in the regular LOCATION, along with _init forks of unlogged tables. Of course, that would have the problem that relpathbackend() would need to know the relpersistence value in order to compute the pathname, which I think is going to be ugly, come to think of it. Hmm... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
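Spelled out as a usage sketch of the syntax floated above (hypothetical, not implemented; the tablespace name, table, and paths are placeholders):

    CREATE TABLESPACE now_you_see_me_now_you_dont
        LOCATION '/mnt/highly_reliable_san'   -- permanent data plus _init forks
        VOLATILE LOCATION '/mnt/ramdisk';     -- temp relations and the other unlogged forks

    CREATE UNLOGGED TABLE session_cache (id int, payload text)
        TABLESPACE now_you_see_me_now_you_dont;
    -- after a crash, the _init fork kept on the durable location lets the startup
    -- process recreate an empty session_cache even if the ramdisk came up empty.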
Re: [HACKERS] DECLARE CURSOR must not contain data-modifying statements in WITH
On Fri, Sep 23, 2011 at 10:53 AM, Andres Freund and...@anarazel.de wrote: One could argue that its a easier to implement it using a wCTE because the query will be simply materialize the query upfront. That makes handling the case where somebody fetches 3 tuples from a query updating 10 easier. Thats a bit harder for the normal cursor case because there is no tuplestore around to do that (except the WITH HOLD case where that is only used on commit...). I find it an acceptable way to enforce using a CTE to do cursors on DML because it makes it more clear that they will be fully executed on start... Hmm, maybe. But if that's true, why does the comment read the way it does? If the updates all occur at the beginning, that wouldn't be surprising, would it? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)
On Thu, Sep 22, 2011 at 10:45 PM, Jeff Davis pg...@j-davis.com wrote: + for (i = 0; i < num_items; ++i) + /* do something with q->items[i] */ + +This code turns out to be unsafe, because the writer might increment +q->num_items before it finishes storing the new item into the appropriate slot. +More subtly, the reader might prefetch the contents of the q->items array +before reading q->num_items. How would the reader prefetch the contents of the items array, without knowing how big it is? I don't think this is as mysterious as it sounds. Imagine the compiler unrolled the loop so that it does something like fetch num_items into register if register > 0 fetch q[0], process it if register > 1 fetch q[1], process it ... Then the cpu could easily do branch prediction on the ifs and fetch q[0] while the fetch of num_items was still in progress. If it turns out that num_items is 0 then it would invalidate the operations initiated after the branch prediction but if it sees 1 (ie after the increment in the other process) then it's not so it doesn't. So you have two memory fetches which I guess I still imagine have to be initiated in the right order but they're both in flight at the same time. I have no idea how the memory controller works and I could easily imagine either one grabbing the value before the other. I'm not even clear how other processors can reasonably avoid this. It must be fantastically expensive to keep track of which branch predictions depended on which registers and which memory fetches the value of those registers depended on. And then it would have to somehow inform the memory controller of those old memory fetches that this new memory fetch is dependent on and have it ensure that the fetches are read in the right order? -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
But on the standby its size is the old one (thus, it seems, that the size changed after the rsync transfer and before the pg_stop_backup() was called): Now that seems pretty weird - I don't think that file should ever shrink. It seems I was not clear in my last example. The pg_clog file doesn't shrink. On the contrary, when rsync kicks in it is still small, but after it completes and somewhere around the pg_stop_backup() call, the pg_clog file grows on the master. Hence, it is left smaller on the standby. Are you sure that the rsync isn't starting until after pg_start_backup() completes? 100% - the procedure is scripted in a single-threaded application, so rsync is started only after pg_start_backup(...) returns. Because if you were starting it just a tiny bit early, that would explain the observed symptoms perfectly, I think. I agree, but that is not the case. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
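For reference, the scripted sequence being described is essentially this (a sketch; the label is a placeholder and the rsync step is shown only as a comment):

    -- on the master, from the backup script:
    SELECT pg_start_backup('standby_seed', true);  -- returns only once the checkpoint is done
    -- ... rsync of $PGDATA to the standby runs here, strictly after the call above ...
    SELECT pg_stop_backup();
    -- the question in this thread is what states pg_clog may legitimately be in on
    -- the standby, given that it keeps growing on the master while rsync is running.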
Re: [HACKERS] [pgsql-advocacy] Unlogged vs. In-Memory
On 23 September 2011 15:56, Robert Haas robertmh...@gmail.com wrote: On Fri, Sep 23, 2011 at 10:54 AM, Robert Haas robertmh...@gmail.com wrote: CREATE TABLESPACE now_you_see_me_now_you_dont LOCATION '/mnt/highly_reliable_san' VOLATILE LOCATION '/mnt/ramdisk'; All forks of temporary relations, and all non-_init forks of non-temporary relations, could be stored in the VOLATILE LOCATION, while everything else could be stored in the regular LOCATION. Hmm... actually, I kind of like that. Thoughts? Gah. I mean, all forks of temporary relations, and all non-_init forks of *unlogged* relations, could be stored in the VOLATILE LOCATION. Permanent tables would have all forks in the regular LOCATION, along with _init forks of unlogged tables. Of course, that would have the problem that relpathbackend() would need to know the relpersistence value in order to compute the pathname, which I think is going to be ugly, come to think of it. I doubt I understand the whole _init forks thing correctly, but can't the main tablespace provide sanctuary to such volatile supporting meta data (pg_version, _init and whatever else you're worried about) except the actual relation (and its vm/fsm)? Anything you can't afford to lose you get the main tablespace to look after. And instead of having a dir linked to the location in pg_tblspc, an actual dir could exist, containing items directly linked to items in the volatile location. Hmm... it doesn't sound quite right to me either. -- Thom Brown Twitter: @darkixion IRC (freenode): dark_ixion Registered Linux user: #516935 EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unite recovery.conf and postgresql.conf
On tis, 2011-09-20 at 16:38 -0400, Robert Haas wrote: For now, I think we're best off not changing the terminology, and confining the remit of this patch to (a) turning all of the existing recovery.conf parameters into GUCs and (b) replacing recovery.conf with a sentinel file (name TBD) to indicate that the server is to start in recovery mode. The naming isn't great but the more we change at once the less chance of reaching agreement. It seems like we have pretty broad agreement on the basics here, so let's start with that. The only thing that's slightly bogus about that is that if you were doing an archive recovery, you'd have to edit the main postgresql.conf with one-shot parameters for that particular recovery (and then delete them again, or leave them in place, confusing the next guy). But perhaps that's worth the overall simplification. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
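Concretely, the one-shot case described above would look something like this in postgresql.conf under the proposal (hypothetical fragment; none of these are GUCs today, and the path and timestamp are placeholders):

    # one-shot archive recovery, assuming the recovery.conf parameters become GUCs
    restore_command = 'cp /opt/PostgreSQL/9.1/archive/%f %p'
    recovery_target_time = '2011-09-23 12:00:00'
    # ...and remember to remove these again once the recovery has finished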
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On Fri, Sep 23, 2011 at 4:41 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Unfortunately, it's impossible, because the error message Could not read from file pg_clog/0001 at offset 32768: Success is shown (and startup aborted) before the turn for redo starts at message arrives. It looks to me that pg_clog/0001 exists, but it shorter than recovery expects. Which shouldn't happen, of course, because the start-backup checkpoint should flush all the clog that's needed by recovery to disk before the backup procedure begins to them. I think the point here is that recovery *never starts*. Something in the standby startup is looking for a value in a clog block that recovery hadn't had a chance to replay (produce) yet. So the standby is looking into the data directory *before* recovery has had a chance to run, and based on that, goes to look for something in a clog page that wasn't guaranteed to exist at the start of the backup period, and bombs out before recovery has a chance to start replaying WAL and write the new clog page. a. -- Aidan Van Dyk Create like a god, ai...@highrise.ca command like a king, http://www.highrise.ca/ work like a slave. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
Excerpts from Linas Virbalas's message of vie sep 23 09:47:20 -0300 2011: On 9/23/11 12:05 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: But on the standby its size is the old one (thus, it seems, that the size changed after the rsync transfer and before the pg_stop_backup() was called): ls -l pg_clog/ total 8 -rw--- 1 postgres postgres 8192 Sep 23 14:31 Sounds like rsync is caching the file size at the start of the run, and then copying that many bytes, ignoring the growth that occurred after it started. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On Sep 23, 2011 5:59 PM, Alvaro Herrera alvhe...@commandprompt.com wrote: Excerpts from Linas Virbalas's message of vie sep 23 09:47:20 -0300 2011: On 9/23/11 12:05 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: But on the standby its size is the old one (thus, it seems, that the size changed after the rsync transfer and before the pg_stop_backup() was called): ls -l pg_clog/ total 8 -rw--- 1 postgres postgres 8192 Sep 23 14:31 Sounds like rsync is caching the file size at the start of the run, and then copying that many bytes, ignoring the growth that occurred after it started. That pretty much matches what Daniel does when he got the same failure case. Is this not allowed? Shouldn't wal replay fix this? /Magnus
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
Hi, On Wednesday 21 Sep 2011 16:44:30 Linas Virbalas wrote: 2011-09-21 13:41:05 CEST DETAIL: Could not read from file pg_clog/0001 at offset 32768: Success. Any chance you can attach gdb to the startup process and provide a backtrace from the place where this message is printed? Greetings, Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unite recovery.conf and postgresql.conf
On Tue, Sep 20, 2011 at 5:23 PM, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: I sympathise with this view, to an extent. If people want to put all parameters in one file, they can do so. So +1 to that. Should they be forced to adopt that new capability by us deliberately breaking their existing setups? No. So -1 to that. If we do an automatic include of recovery.conf first, then follow by reading postgresql,conf then we will preserve the old as well as allowing the new. I don't buy this argument at all. I don't believe that recovery.conf is part of anyone's automated processes at all, let alone to an extent that they won't be able to cope with a change to rationalize the file layout. There are many. Tools I can name include pgpool, 2warm, PITRtools, but there are also various tools from Sun, an IBM reseller I have forgotten the name of, OmniTI and various other backup software providers. Those are just the ones I can recall quickly. We've encouraged people to write software on top and they have done so. Breaking backwards compatibility isn't something we do elsewhere, when its easy to keep it. I don't object to new functionality (and agreed to it upthread), just don't break the old way when there is no need. And most especially I don't buy that someone who does want to keep using it couldn't cope with adding an include to postgresql.conf manually. This just creates a barrier to upgrade. Most people using PostgreSQL use multiple releases, so its a pain to have to maintain multiple versions of scripts to have things work properly. Even people that don't mind changing won't be able to do it fully. That creates confusion, which leads to mistakes. These things relate to backup and recovery. Any changes to them give risk of data loss. Cosmetic considerations are secondary. There is no command to safely confirm that include recovery.conf is added to postgresql.conf, so leaving it optional makes it unclear as to whether the old ways will work or not. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unite recovery.conf and postgresql.conf
On 09/20/2011 09:23 AM, Tom Lane wrote: Simon Riggssi...@2ndquadrant.com writes: I sympathise with this view, to an extent. If we do an automatic include of recovery.conf first, then follow by reading postgresql,conf then we will preserve the old as well as allowing the new. I don't buy this argument at all. I don't believe that recovery.conf is part of anyone's automated processes at all, let alone to an extent that they won't be able to cope with a change to rationalize the file layout. And most especially I don't buy that someone who does want to keep using it couldn't cope with adding an include to postgresql.conf manually. As Simon has already appropriately posted You would be incorrect. Joshua D. Drake -- Command Prompt, Inc. - http://www.commandprompt.com/ PostgreSQL Support, Training, Professional Services and Development The PostgreSQL Conference - http://www.postgresqlconference.org/ @cmdpromptinc - @postgresconf - 509-416-6579 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Single pass vacuum - take 2
On Tue, Aug 30, 2011 at 6:38 AM, Pavan Deolasee pavan.deola...@gmail.com wrote: Yeah. If we don't know the status of the vacuum that collected the line pointer and marked it vacuum-dead, the next vacuum will pick it up again and stamp it with its own generation number. I'm still not really comfortable with the handling of vacuum generation numbers. If we're going to say that 2^30 is large enough that we don't need to worry about the counter wrapping around, then we need some justification for that position. Why can't we have 2^30 consecutive failed vacuums on a single table? Sure, it would take a long time, but we guard against many failure conditions that would take a long time, and the result is that we have fewer corner-case failures. I want an explanation of why it's *safe*, and what the smallest number of vacuum generations that we must support to make it safe is. If we blow the handling of this, we are going to eat the user's data, so we had better have a really convincing argument as to why what we're doing is OK. Here's a possible alternative implementation: we allow up to 32 vacuum generations to exist at once. We keep a 64 bit integer indicating the state of each vacuum generation: 00 = no line pointers with this vacuum generation exist in the heap, 01 = some line pointers with this vacuum generation may exist in the heap, but they are not removable, 11 = some line pointers with this vacuum generation exist in the heap, and they are removable. Then, when we start a VACUUM, we look for a vacuum generation with flags 01. If we find one, we adopt that as the generation number for this vacuum. If not, we look for one with flags 00, and if we find one, we set its flags to 01 and adopt it as the generation number for this vacuum. (If this too fails, then all vacuums are in state 11. There are several ways that could be handled - either we make a pass over the heap just to free dead line pointers, or we randomly select a vacuum generation number and push it back to state 01, or we make all line pointers encountered during the vacuum merely dead rather than dead-vacuumed; I think I like that option best.) When we complete the heap scan, we set the flags of any vacuum generation numbers that were previously 11 back to 00 (assuming we've visited all the not-all-visible pages). When we complete the index pass, we set the flags of our chosen vacuum generation number to 11. There is clearly room for argument about the details here; for example, as the algorithm is presented, it's hard to see how you would end up with more than one vacuum generation number in each state, so maybe you only need three values, not 32. I suppose it could be useful to have more values if you want to sometimes vacuum only part of the heap, because then you'd only get to mark vacuum generation numbers as unused on those occasions when you actually did scan the whole heap. But regardless of that detail, the thing I like about what I'm proposing here is that it provides a closed loop around the management of vacuum generation numbers - we always know the exact state of each vacuum generation number, as opposed to just hoping that by the billionth vacuum there won't be any leftovers. Of course, it may be also that we can convince ourselves that your algorithm as implemented is safe ... but I'm not convinced, yet. Another thing I'm not sure whether to worry about is the question of where we store the vacuum generation information. 
I mean, if we store it in pg_class, then what happens if the user does a manual update of pg_class just as we're updating the vacuum generation information? We had better make sure that there are no cases where we can accidentally think that it's OK to reclaim dead line pointers that really still have references, or we're going to end up with some awfully difficult-to-find bugs... never mind the possibility of the user manually updating the value and hosing themselves. Of course, we already have some of those issues - relfrozenxid probably has the same problems - and I'm not 100% sure whether this one is any worse. It would be really nice to have those non-transactional tables that Alvaro keeps mumbling about, though, or some other way to store this information. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] citext operator precedence fix
I'm OK with the proposed behavior change and I agree that it's probably what people want, but I am awfully suspicious that those extra casts are going to break something you haven't thought about. It might be worth posting a rough version first just to see if I (or someone else) can break it before you spend a lot of time on it. Additional breakage confirmed (hash functions, etc.) Looks like I need to add a lot more support functions and test. This is still worth doing, but don't expect it for the next commitfest. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On 23.09.2011 19:03, Magnus Hagander wrote: On Sep 23, 2011 5:59 PM, Alvaro Herrera alvhe...@commandprompt.com wrote: Excerpts from Linas Virbalas's message of vie sep 23 09:47:20 -0300 2011: On 9/23/11 12:05 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: But on the standby its size is the old one (thus, it seems, that the size changed after the rsync transfer and before the pg_stop_backup() was called): ls -l pg_clog/ total 8 -rw--- 1 postgres postgres 8192 Sep 23 14:31 Sounds like rsync is caching the file size at the start of the run, and then copying that many bytes, ignoring the growth that occurred after it started. That pretty much matches what Daniel does when he got the same failure case. Is this not allowed? Shouldn't wal replay fix this? That's OK. The effect is the same as if rsync had copied the file at the start. What's not OK is to store the file as an empty or truncated file in the backup, when the file is deleted from pg_clog while rsync is running. The file must have length >= the length the file had when backup started, or the file must be omitted from the backup altogether. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On Fri, Sep 23, 2011 at 11:43 AM, Aidan Van Dyk ai...@highrise.ca wrote: On Fri, Sep 23, 2011 at 4:41 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Unfortunately, it's impossible, because the error message Could not read from file pg_clog/0001 at offset 32768: Success is shown (and startup aborted) before the turn for redo starts at message arrives. It looks to me that pg_clog/0001 exists, but it shorter than recovery expects. Which shouldn't happen, of course, because the start-backup checkpoint should flush all the clog that's needed by recovery to disk before the backup procedure begins to them. I think the point here is that recover *never starts*. Something in the standby startup is looking for a value in a clog block that recovery hadn't had a chance to replay (produce) yet. Ah. I think you are right - Heikki made the same point. Maybe some of the stuff that happens just after this comment: /* * Initialize for Hot Standby, if enabled. We won't let backends in * yet, not until we've reached the min recovery point specified in * control file and we've established a recovery snapshot from a * running-xacts WAL record. */ ...actually needs to be postponed until after we've reached consistency? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On Sep 23, 2011, at 18:03, Magnus Hagander wrote: On Sep 23, 2011 5:59 PM, Alvaro Herrera alvhe...@commandprompt.com wrote: Sounds like rsync is caching the file size at the start of the run, and then copying that many bytes, ignoring the growth that occurred after it started. That pretty much matches what Daniel does when he got the same failure case. Is this not allowed? Shouldn't wal replay fix this? I don't see how it could be forbidden. ISTM that we absolutely *have* to be able to deal with each byte of a file's data, including its metadata, being in any state it was in between the time pg_start_backup() returned and pg_stop_backup() was issued. With the individual states being in no way synchronized. (OK, in reality restricting this to individual 512-byte blocks is probably enough, but still...). This, I think, is also the reason we need to force full_page_writes to on during a hot backup. If a page was modified at all between pg_start_backup() and pg_stop_backup(), we essentially have to assume it's totally garbled. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On Sep 23, 2011, at 18:45, Robert Haas wrote: Ah. I think you are right - Heikki made the same point. Maybe some of the stuff that happens just after this comment: /* * Initialize for Hot Standby, if enabled. We won't let backends in * yet, not until we've reached the min recovery point specified in * control file and we've established a recovery snapshot from a * running-xacts WAL record. */ ...actually needs to be postponed until after we've reached consistency? I came to the same conclusion. It seems that before we've reached consistency, we shouldn't attempt to read anything because the data can be pretty arbitrarily garbled. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unite recovery.conf and postgresql.conf
Simon, There are many. Tools I can name include pgpool, 2warm, PITRtools, but there are also various tools from Sun, an IBM reseller I have forgotten the name of, OmniTI and various other backup software providers. Those are just the ones I can recall quickly. We've encouraged people to write software on top and they have done so. Actually, just to correct this list: * there are no tools from Sun * pgPool2 does not deal with recovery.conf * there are additional ones: WAL-E, etc., which may or may not need to be edited. Breaking backwards compatibility isn't something we do elsewhere, when its easy to keep it. FWIW, I've already found that I have to modify all my backup scripts for 9.0 from 8.4, and again for 9.1 from 9.0. So we do break backwards compatibility, frequently. I don't object to new functionality (and agreed to it upthread), just don't break the old way when there is no need. I'm happy to make upgrades easier, but I want a path which eventually ends in recovery.conf going away. It's a bad API, confuses our users, and is difficult to support and maintain. If you think it's easier on our users to do that in stages over several versions rather than in one fell swoop, then let's plan it that way. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On 23.09.2011 19:49, Florian Pflug wrote: On Sep23, 2011, at 18:45 , Robert Haas wrote: Ah. I think you are right - Heikki made the same point. Maybe some of the stuff that happens just after this comment: /* * Initialize for Hot Standby, if enabled. We won't let backends in * yet, not until we've reached the min recovery point specified in * control file and we've established a recovery snapshot from a * running-xacts WAL record. */ ...actually needs to be postponed until after we've reached consistency? I came the the same conclusion. It seems the before we've reached consistency, we shouldn't attempt to read anything because the data can be pretty arbitrarily garbled. There are pretty clear rules on what state clog can be in. When you launch postmaster in a standby: * Any clog preceding the nextXid from the checkpoint record we start recovery from, must either be valid, or the clog file must be missing altogether (which can happen when it was vacuumed away while the backup in progress - if the clog is still needed at the end of backup it must not be missing, of course). * Any clog following nextXid can be garbled or missing. Recovery will overwrite any clog after nextXid from the WAL, but not the clog before it. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] citext operator precedence fix
On Fri, Sep 23, 2011 at 12:37 PM, Josh Berkus j...@agliodbs.com wrote: I'm OK with the proposed behavior change and I agree that it's probably what people want, but I am awfully suspicious that those extra casts are going to break something you haven't thought about. It might be worth posting a rough version first just to see if I (or someone else) can break it before you spend a lot of time on it. Additional breakage confirmed (hash functions, etc.) Looks like I need to add a lot more support functions and test. This is still worth doing, but don't expect it for the next commitfest. I would also be looking carefully at whether you can construct a scenario where the operator resolution code can't decide between =(text,citext) or =(text,text) - you probably need a third type like varchar or bpchar in the mix to trigger a failure, if there's one to be found. Or you might have a problem with citext = bpchar being ambiguous between =(text,citext) and =(varchar,text), or some such thing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
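The sort of cross-type cases worth adding to the tests while probing for that ambiguity (whether each one resolves cleanly, fails with operator is not unique, or silently picks the case-sensitive text operator is exactly what needs checking):

    SELECT 'Foo'::citext = 'foo'::text;
    SELECT 'Foo'::citext = 'foo'::varchar;
    SELECT 'Foo'::citext = 'foo'::bpchar;
    SELECT 'Foo'::citext IN ('foo'::text, 'bar'::varchar);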
Re: [HACKERS] unite recovery.conf and postgresql.conf
On Fri, Sep 23, 2011 at 12:51 PM, Josh Berkus j...@agliodbs.com wrote: I'm happy to make upgrades easier, but I want a path which eventually ends in recovery.conf going away. It's a bad API, confuses our users, and is difficult to support and maintain. I agree. GUC = Grand Unified Configuration, but recovery.conf is an example of where it's not so unified after all. We've already done a non-trivial amount of work to allow recovery.conf values to be specified without quotes, a random incompatibility with GUCs that resulted from having different parsing code for each file. If that were the last issue, then maybe it wouldn't be worth worrying about, but it's not. For example, it would be nice to have reload behavior on SIGHUP. And we keep talking about having an ALTER SYSTEM SET guc = value or SET PERMANENT guc = value command, and I think ALTER SYSTEM SET recovery_target_time = '...' would be pretty sweet. I don't want us to have to implement such things separately for postgresql.conf and recovery.conf. Now, it's true that Simon's proposal (of having recovery.conf automatically included) if it exists doesn't necessarily preclude those things. But it seems to me that it is adding a lot of complexity to core for a pretty minimal benefit to end-users, and that the semantics are not going to end up being very clean. For example, now you potentially have the situation where recovery.conf has work_mem=128MB and postgresql.conf has work_mem=4MB, and now when you end recovery you've got to make sure that everyone picks up the new setting. Now, in some sense you could say that's a feature addition, and I'm not going to deny that it might be useful to some people, but I think it's also going to require a fairly substantial convolution of the GUC machinery, and it's going to discourage people from moving away from recovery.conf. And like Josh, I think that ought to be the long-term goal, for the reasons he states. I don't want to go willy-nilly breaking third-party tools that work with PostgreSQL, but in this case I think that the reason there are so many tools in the first place is because what we're providing in core is not very good. If we are unwilling to improve it for fear of breaking compatibility with the tools, then we are stuck. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
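In other words, the end state being argued for is one where recovery settings behave like any other GUC, along the lines of (hypothetical; ALTER SYSTEM does not exist yet, only pg_reload_conf() does):

    ALTER SYSTEM SET recovery_target_time = '2011-09-23 12:00:00';
    SELECT pg_reload_conf();  -- picked up on SIGHUP like any other setting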
Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases
One idea: col like 'foo%' could be translated to col >= 'foo' and col <= 'foo' || 'zzz', where 'z' is the largest possible character. This should be good enough for calculating stats. How to find such a character, I do not know. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
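In other words, for estimation purposes only (a sketch; t and col are placeholders, and 'zzz' stands in for a suffix built from that largest character):

    -- the predicate being estimated:
    --   col LIKE 'foo%'
    -- would be costed as if it were the range:
    SELECT * FROM t WHERE col >= 'foo' AND col <= 'foo' || 'zzz';
    -- the open question above is how to pick the largest character for the
    -- column's collation.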
Re: [HACKERS] stored procedures
On Thu, Sep 1, 2011 at 12:18 PM, Josh Berkus j...@agliodbs.com wrote: On 8/31/11 12:15 PM, Merlin Moncure wrote: An out of process, autonomous transaction type implementation should probably not sit under stored procedures for a number of reasons -- mainly that it's going to expose too many implementation details to the user. For example, does a SP heavy app have 2*N running processes? Or do we slot them into a defined number of backends for that purpose? Yuck yuck. I like the AT feature, and kludge it frequently via dblink, but it's a solution for a different set of problems. I think that transaction control without parallelism would be the 80% solution. That is, an SP has transaction control, but those transactions are strictly serial, and cannot be run in parallel. For example, if you were writing an SP in PL/pgSQL, each BEGIN ... END block would be an explicit transaction, and standalone-only statements be allowed between BEGIN ... END blocks, or possibly in their own special block type (I prefer the latter). One issue we'd need to deal with is exception control around single-statement transactions and non-transactional statements (VACUUM, CREATE INDEX CONCURRENTLY, CHECKPOINT, etc.). In some cases, the user is going to want to catch exceptions and abort the SP, and in other cases ignore them, so both need to be possible. Totally agree -- was thinking about this very issue. One of the things I'd really like to see SP be able to do is to abstract some of the nasty details of MVCC away from the client -- setting isolation mode, replaying errors on serialization, etc. This requires error handling. Unfortunately, this (exception handling in non transaction context) is probably going to add some complexity to the implementation. Are we on the right track here (that is, maybe we really *should* be looking at out of process execution)? How do procedures fit in terms of execution from the tcop down? merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
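To pin down what transaction control without parallelism might look like from the user's side, a sketch in invented syntax (nothing like this exists; table names are placeholders, and each BEGIN ... END block is an explicit transaction as in Josh's description):

    CREATE PROCEDURE nightly_rollup() AS $$
    BEGIN  -- first transaction
        SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
        INSERT INTO daily_totals SELECT day, sum(amount) FROM sales GROUP BY day;
    END;   -- commits here
    VACUUM daily_totals;  -- standalone-only statement between transaction blocks
    BEGIN  -- second transaction
        DELETE FROM sales WHERE day < now() - interval '90 days';
    END;
    $$ LANGUAGE plpgsql;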
Re: [HACKERS] [pgsql-advocacy] Unlogged vs. In-Memory
On Fri, Sep 23, 2011 at 11:36 AM, Thom Brown t...@linux.com wrote: On 23 September 2011 15:56, Robert Haas robertmh...@gmail.com wrote: On Fri, Sep 23, 2011 at 10:54 AM, Robert Haas robertmh...@gmail.com wrote: CREATE TABLESPACE now_you_see_me_now_you_dont LOCATION '/mnt/highly_reliable_san' VOLATILE LOCATION '/mnt/ramdisk'; All forks of temporary relations, and all non-_init forks of non-temporary relations, could be stored in the VOLATILE LOCATION, while everything else could be stored in the regular LOCATION. Hmm... actually, I kind of like that. Thoughts? Gah. I mean, all forks of temporary relations, and all non-_init forks of *unlogged* relations, could be stored in the VOLATILE LOCATION. Permanent tables would have all forks in the regular LOCATION, along with _init forks of unlogged tables. Of course, that would have the problem that relpathbackend() would need to know the relpersistence value in order to compute the pathname, which I think is going to be ugly, come to think of it. I doubt I understand the whole _init forks thing correctly, Basically, for every unlogged table, you get an empty _init fork, and for every index of an unlogged table, you get an _init fork initialized to an empty index. The _init forks are copied over the main forks by the startup process before entering normal running. but can't the main tablespace provide sanctuary to such volatile supporting meta data (pg_version, _init and whatever else you're worried about) except the actual relation (and its vm/fsm)? Anything you can't afford to lose you get the main tablespace to look after. And instead of having a dir linked to the location in pg_tblspc, an actual dir could exist, containing items directly linked to items in the volatile location. Hmm... it doesn't sound quite right to me either. Well, we could certainly Decree From On High that the _init forks are all going to be stored under $PGDATA rather than in the tablespace directories. That would make things simple. Of course, it also means that if you want the _init forks stored somewhere, you are out of luck. Now maybe that is an unlikely scenario. Off the top of my head, the only case I can think of would be if the storage space or inode consumption requirements were problematic - and even then you could stick a symlink in there someplace to make it work, if you're the sort of person who knows how to do that. So maybe it's OK. But it makes me a little uneasy. When people ask to put stuff in a tablespace, I tend to think they want it to show up in that tablespace. Hmm, hmm... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load
On Fri, Sep 23, 2011 at 12:58 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: There are pretty clear rules on what state clog can be in. When you launch the postmaster on a standby: * Any clog preceding the nextXid from the checkpoint record we start recovery from must either be valid, or the clog file must be missing altogether (which can happen when it was vacuumed away while the backup was in progress - if the clog is still needed at the end of the backup it must not be missing, of course). * Any clog following nextXid can be garbled or missing. Recovery will overwrite any clog after nextXid from the WAL, but not the clog before it. So the actual error message in the last test was: 2011-09-21 13:41:05 CEST FATAL: could not access status of transaction 1188673 ...but we can't tell if that was before or after nextXid, which seems like it would be useful to know. If Linas can rerun his experiment, but also capture the output of pg_controldata before firing up the standby for the first time, then we'd be able to see that information. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
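As a sanity check on the file name and offset in that error message, a quick back-of-the-envelope query, assuming the default 8 kB block size and clog's layout of 2 status bits per transaction (32768 transactions per page, 32 pages per segment file):

    -- Map an xid to its clog segment file and the byte offset of its page.
    SELECT xid,
           xid / 32768                              AS clog_page,
           lpad(to_hex((xid / 32768) / 32), 4, '0') AS clog_segment_file,
           ((xid / 32768) % 32) * 8192              AS byte_offset_in_file
    FROM (VALUES (1188673)) AS t(xid);
    -- => page 36, file "0001", offset 32768, matching the failed read.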
Re: [HACKERS] [PATCH] Use new oom_score_adj without a new compile-time constant
On Mon, Sep 19, 2011 at 4:36 PM, Dan McGee d...@archlinux.org wrote: [ patch ] I suppose it's Tom who really needs to comment on this, but I'm not too enthusiastic about this approach. Duplicating the Linux kernel calculation into our code means that we could drift if the formula changes again. I like Tom's previous suggestion (I think) of allowing both constants to be defined - if they are, then we try oom_score_adj first and fall back to oom_adj if that fails. For bonus points, we could have postmaster stat() its own oom_score_adj file before forking and set a global variable to indicate the results. That way we'd only ever need to test once per postmaster startup (at least until someone figures out a way to swap out the running kernel without stopping the server...!). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Large C files
On 23 September 2011 15:46, Robert Haas robertmh...@gmail.com wrote: I'm not opposed to adding something like this, but I think it needs to either be tied into the actual running of the script, or have a lot more documentation than it does now, or both. I am possibly stupid, but I can't understand from reading the script (or, honestly, the thread) exactly what kind of pgrminclude-induced errors this is protecting against; The basic idea is simple. There is a high likelihood that if removing a header silently alters the behaviour of an object file, it will also alter the symbol table for the same object file - in particular, the value of function symbols (their local offset in the object file), which relates to the number of instructions in each function. Yes, that is imperfect, but it's better than nothing, and intuitively I think that the only things it won't catch in practice are completely inorganic bugs (i.e. things done for the express purpose of proving that it can be broken). Thinking about it now, though, I have to wonder if I could have done a better job with objdump -td. but even if we clarify that, it seems like it would be a lot of work to run it manually on all the files that might be affected by a pgrminclude run, unless we can somehow automate that. I would have automated it if anyone had expressed any interest in the basic idea - it might be an over-reaction to the problem. I'm not sure. I'd have had it detect which object files might have been affected (through directly including the header, or indirectly including it by proxy). It could rename them such that, for example, xlog.o was renamed to xlog_old.o. Then, you make your changes, rebuild, and run the program again in a different mode. It notices the *_old.o files, and runs nm-diff on them. I'm also curious to see how much more fallout we're going to see from that run. We had a few glitches when it was first done, but it didn't seem like they were really all that bad. It might be that we'd be better off running pgrminclude a lot *more* often (like once a cycle, or even after every CommitFest), because the scope of the changes would then be far smaller and we wouldn't be dealing with 5 years of accumulated cruft all at once; we'd also get a lot more experience with what works or does not work with the script, which might lead to improvements in that script on a less-than-geologic time scale. Fair point. I'm a little busy with other things right now, but I'll revisit it soon. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.2] database object class of contrib/sepgsql
On Mon, Sep 12, 2011 at 5:45 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote: The attached patch is a portion that we split off when we added the pg_shseclabel system catalog. It enables contrib/sepgsql to assign security labels on pg_database objects, which are used as the basis for computing the default security label of schema objects. Committed, although the fact that it didn't compile until I made schema.c include pg_database.h makes me wonder how thoroughly you tested this. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
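As a usage sketch of what this enables (assuming the sepgsql module is loaded; the label string is just an example of the usual SELinux format):

    -- Assign a security label to a database; sepgsql then uses it as the basis
    -- for computing the default labels of schemas created within that database.
    SECURITY LABEL FOR selinux
        ON DATABASE template1
        IS 'system_u:object_r:sepgsql_db_t:s0';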
Re: [HACKERS] unite recovery.conf and postgresql.conf
On Sep 23, 2011, at 17:46, Peter Eisentraut wrote: On tis, 2011-09-20 at 16:38 -0400, Robert Haas wrote: For now, I think we're best off not changing the terminology, and confining the remit of this patch to (a) turning all of the existing recovery.conf parameters into GUCs and (b) replacing recovery.conf with a sentinel file (name TBD) to indicate that the server is to start in recovery mode. The naming isn't great, but the more we change at once, the less chance of reaching agreement. It seems like we have pretty broad agreement on the basics here, so let's start with that. The only thing that's slightly bogus about that is that if you were doing an archive recovery, you'd have to edit the main postgresql.conf with one-shot parameters for that particular recovery (and then delete them again, or leave them in place, confusing the next guy). But perhaps that's worth the overall simplification. OTOH, if they're GUCs, you can specify them on the postmaster's command line. We could even get crazy and patch pg_ctl to allow: untar the base backup; pg_ctl recover -D dir --target_xid=<the xid that broke stuff> --target_inclusive=false; pg_ctl start -D dir. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)
On Thu, Sep 22, 2011 at 11:31 AM, Robert Haas robertmh...@gmail.com wrote: On Thu, Sep 22, 2011 at 11:25 AM, Thom Brown t...@linux.com wrote: s/visca-versa/vice-versa/ s/laods/loads/ Fixed. v4 attached. Since it seems like people are fairly happy with this now, I've gone ahead and committed this version. I suspect there are bugs, but I don't think we're going to get much further until we actually try to use this for something. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.2] Fix Leaky View Problem
On Mon, Sep 12, 2011 at 3:31 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote: I updated the patches of the fix-leaky-view problem according to the previous discussion. The NOLEAKY option was replaced by the LEAKPROOF option, and several regression test cases were added. The rest of the stuff is unchanged. You have a leftover reference to NOLEAKY. For the convenience of the reviewer, below is a summary of these patches: The Part-1 implements the corresponding SQL syntax, namely the security_barrier reloption of views, and the LEAKPROOF option on creation of functions, stored in the new pg_proc.proleakproof field. The way you have this implemented, we just blow away all view options whenever we do CREATE OR REPLACE VIEW. Is that the behavior we want? If a security_barrier view gets accidentally turned into a non-security_barrier view, doesn't that create a security hole? I'm also wondering if the way you're using ResetViewOptions() is the right way to handle this anyhow. Isn't that going to update pg_class twice? I guess that's probably harmless from a performance standpoint, but wouldn't it be better not to? I guess we could define something like AT_ReplaceRelOptions to handle this case. The documentation in general is not nearly adequate, at least IMHO. I'm a bit nervous about storing security_barrier in the RTE. What happens to stored rules if the security_barrier option gets changed later? More when I've had more time to look at this... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
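For readers without the patch at hand, a sketch of the two pieces of syntax under discussion (the view, table, and function names are invented for illustration):

    -- A security_barrier view: user-supplied quals are not pushed down past the
    -- view's own WHERE clause, so they cannot be used to leak hidden rows.
    CREATE VIEW my_accounts WITH (security_barrier = true) AS
        SELECT * FROM accounts WHERE owner = current_user;

    -- A LEAKPROOF function (recorded in pg_proc.proleakproof per the patch):
    -- it promises to reveal nothing about its arguments, e.g. through error
    -- messages, so the planner may push it down into a security_barrier view.
    CREATE FUNCTION is_positive(integer) RETURNS boolean
        AS 'SELECT $1 > 0'
        LANGUAGE sql IMMUTABLE LEAKPROOF;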
Re: [HACKERS] unite recovery.conf and postgresql.conf
Josh, There are many. Tools I can name include pgpool, 2warm, PITRtools, but there are also various tools from Sun, an IBM reseller I have forgotten the name of, OmniTI and various other backup software providers. Those are just the ones I can recall quickly. We've encouraged people to write software on top and they have done so. Actually, just to correct this list: * there are no tools from Sun * pgPool2 does not deal with recovery.conf I'm not sure what you mean by "not deal with", but part of pgpool-II's functionality assumes that we can easily generate recovery.conf. If recovery.conf is integrated into postgresql.conf, we need to edit postgresql.conf, which is a little bit harder than generating recovery.conf, I think. * there are additional ones: WAL-E, etc., which may or may not need to be edited. Breaking backwards compatibility isn't something we do elsewhere, when it's easy to keep it. FWIW, I've already found that I have to modify all my backup scripts for 9.0 from 8.4, and again for 9.1 from 9.0. So we do break backwards compatibility, frequently. I don't object to new functionality (and agreed to it upthread), just don't break the old way when there is no need. I'm happy to make upgrades easier, but I want a path which eventually ends in recovery.conf going away. It's a bad API, confuses our users, and is difficult to support and maintain. If you think it's easier on our users to do that in stages over several versions rather than in one fell swoop, then let's plan it that way. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unite recovery.conf and postgresql.conf
I'm not sure what you mean by "not deal with", but part of pgpool-II's functionality assumes that we can easily generate recovery.conf. If recovery.conf is integrated into postgresql.conf, we need to edit postgresql.conf, which is a little bit harder than generating recovery.conf, I think. Oh? Clearly I've been abusing pgPool2 then. Where's the code that generates that? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers