Re: [HACKERS] Hot standby, slot ids and stuff
On Thu, 2009-01-08 at 15:50 -0500, Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: On Thu, 2009-01-08 at 22:31 +0200, Heikki Linnakangas wrote: When a backend dies with FATAL, it writes an abort record before exiting. (I was under the impression it doesn't until few minutes ago myself, when I actually read the shutdown code :-)) Not in all cases; keep reading :-) If it doesn't, that's a bug. A FATAL exit is not supposed to leave the shared state corrupted, it's only supposed to be a forcible session termination. Any open transaction should be rolled back. Please look back at the earlier discussion we had on this exact point: http://archives.postgresql.org/pgsql-hackers/2008-09/msg01809.php Heikki, perhaps now you understand my continued opposition to your ideas for simplification: I had already thought of them and was forced to rule them out, not through my own choice. Tom, note that I listen to what you say and try to write code that meets those requirements. From here, I don't mind which way we go. I want code that is as simple as possible to do the job, whatever we agree that to be. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Including kerberos realm
Magnus Hagander wrote: Tom Lane wrote: Magnus Hagander mag...@hagander.net writes: Alvaro Herrera wrote: Not that this affects me in any way, but should there be a GUC variable to set the default behavior system-wide? I thought about that, but I don't want to add extra gucs without a good reason. You'd typically not have very many different lines in pg_hba for this, and just duplicating the parameter there would be ok I think. I'd rather move more of the krb parameters to be *just* in pg_hba.conf, but for now I left those in postgresql.conf as fallbacks.. If you think those parameters would make more sense in pg_hba.conf, let's just move them and be done with it. There has never been any intention that administrator-only GUCs would be promised compatible across versions. And the GUC mechanism is really rather a lot of overhead compared to options on a pg_hba line ... Well, it does make sense to have defaults in postgresql.conf - but I don't think it's worth the overhead. I'll commit the stuff I have for now and put it on my TODO to remove them completely from postgresql.conf later. I'll see if I have time to get it done for 8.4. Ok, I've applied a patch for this for the parameter krb_realm and krb_server_hostname, which are the ones that currently supported both. Should we also consider moving the remaining ones there? (krb_server_keyfile, krb_srvname, krb_caseinsens_users) They do make sense to set on a per-server basis, on the other hand they are the only remaining authentication-method-specific parameters left... //Magnus -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot standby, slot ids and stuff
Simon Riggs wrote: On Thu, 2009-01-08 at 15:50 -0500, Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: On Thu, 2009-01-08 at 22:31 +0200, Heikki Linnakangas wrote: When a backend dies with FATAL, it writes an abort record before exiting. (I was under the impression it doesn't until few minutes ago myself, when I actually read the shutdown code :-)) Not in all cases; keep reading :-) If it doesn't, that's a bug. A FATAL exit is not supposed to leave the shared state corrupted, it's only supposed to be a forcible session termination. Any open transaction should be rolled back. Please look back at the earlier discussion we had on this exact point: http://archives.postgresql.org/pgsql-hackers/2008-09/msg01809.php I think the running-xacts list we dump to WAL at every checkpoint is enough to handle that. Just treat the dead transaction as in-progress until the next running-xacts record. It's presumably extremely rare to have a process die with FATAL, and not write an abort record. A related issue is that currently the recovery PANICs if it runs out of recovery procs. I think that's not acceptable, regardless of whether we use slotids or some other method to avoid it in normal operation, because it means you can't recover at all if you set max_connections too low in the standby (or in the primary, and you have to recover from crash), or you run out of recovery procs because of an abort failed in the primary like discussed on that thread. The standby should just fast-forward to the next running-xacts record in that case. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Open item: kerberos warning message
For what it's worth this always bothered me. I often - but nit always - - have kerberos tickets gsst...@... lying around but my unix id is stark. I never set up kerberos authentication for postgres but whrn the tickets happen to be there it fails to authenticate. I think I complained about this in the past but I don't recall - it would have been a long time ago. -- Greg On 8 Jan 2009, at 11:22, Stephen Frost sfr...@snowman.net wrote: Magnus, et al, * Magnus Hagander (mag...@hagander.net) wrote: Looking at the open item about the new error message shown when Kerberos is compiled in, and not used: assword: FATAL: password authentication failed for user mha psql: pg_krb5_init: krb5_cc_get_principal: No credentials cache found FATAL: password authentication failed for user mha That is annoying, I can understand that. The reason this is happening is that we are initializing Kerberos even if we're not going to use it. The reason for doing *this*, is that if kerberos is compiled in, we use it to find out if we should try a different username than the one logged in to the local system - we look at the kerberos login. This made sense before we had mappings support because the only user you could possibly be in PG is the one you authenticated as. We don't do this for any other login, including kerberos over GSSAPI. AFAIK, we've heard no complaints. Well, I havn't moved all my systems to GSSAPI yet.. :) Thoughts? Now that we have support for mappings, I expect it will be more common for a user to authenticate with princ 'A' and then connect using their Unix id 'B' to a PG user 'B'. As such, I'm alright with dropping support for this. Users can always use -U (or equiv) if necessary. Thanks, Stephen -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot standby, slot ids and stuff
On Fri, 2009-01-09 at 12:33 +0200, Heikki Linnakangas wrote: Simon Riggs wrote: On Thu, 2009-01-08 at 15:50 -0500, Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: On Thu, 2009-01-08 at 22:31 +0200, Heikki Linnakangas wrote: When a backend dies with FATAL, it writes an abort record before exiting. (I was under the impression it doesn't until few minutes ago myself, when I actually read the shutdown code :-)) Not in all cases; keep reading :-) If it doesn't, that's a bug. A FATAL exit is not supposed to leave the shared state corrupted, it's only supposed to be a forcible session termination. Any open transaction should be rolled back. Please look back at the earlier discussion we had on this exact point: http://archives.postgresql.org/pgsql-hackers/2008-09/msg01809.php I think the running-xacts list we dump to WAL at every checkpoint is enough to handle that. Just treat the dead transaction as in-progress until the next running-xacts record. It's presumably extremely rare to have a process die with FATAL, and not write an abort record. I agree, but I'll wait for Tom to speak further. A related issue is that currently the recovery PANICs if it runs out of recovery procs. I think that's not acceptable, regardless of whether we use slotids or some other method to avoid it in normal operation, because it means you can't recover at all if you set max_connections too low in the standby (or in the primary, and you have to recover from crash), or you run out of recovery procs because of an abort failed in the primary like discussed on that thread. The standby should just fast-forward to the next running-xacts record in that case. What do you mean by fast forward? -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot standby, slot ids and stuff
Simon Riggs wrote: On Fri, 2009-01-09 at 12:33 +0200, Heikki Linnakangas wrote: A related issue is that currently the recovery PANICs if it runs out of recovery procs. I think that's not acceptable, regardless of whether we use slotids or some other method to avoid it in normal operation, because it means you can't recover at all if you set max_connections too low in the standby (or in the primary, and you have to recover from crash), or you run out of recovery procs because of an abort failed in the primary like discussed on that thread. The standby should just fast-forward to the next running-xacts record in that case. What do you mean by fast forward? I mean the standby should stop trying to track the in progress transactions in recovery procs, and apply the WAL records like it does before the consistent state is reached. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improving compressibility of WAL files
You don't want to just modify pg_standby to accept small files, because then you've made it harder to make absolutely sure when the file is ready to be processed if a non-atomic copy is being done. It is hard, but I think it is the right way forward. Anyway I think the size is not robust at all because some (most ? e.g. win32) non-atomic copy implementations will also show the final size right from the beginning. Could we use stricter file locking when opening WAL for recovery ? Or implement a wait loop when the crc check (+ a basic validity check) for the next record fails (and the next record is on a 512 byte boundary ?). I think standby and restore recovery can be treated differently to startup recovery because a copied wal file, even if the copy is not atomic, will not have trailing valid WAL records from a recycled WAL. (A solution that recycles WAL files for restore/standby would need to make sure it renames the files *after* restoring the content.) Btw how do we detect end of WAL when restoring a backup and WAL after PANIC ? 1) Provide the length as part of the archive command +1 Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Visibility map and freezing
Jeff Davis wrote: On Wed, 2009-01-07 at 09:34 +0200, Heikki Linnakangas wrote: autovacuum_freeze_max_age - autovacuum_freeze_scan_age vacuum_freeze_max_age - vacuum_freeze_scan_age vacuum_freeze_min_age - vacuum_freeze_tuple_age *_scan_age settings control when the table is fully scanned to freeze tuples and advance relfrozenxid, and vacuum_freeze_tuple_age controls how old a tuple needs to be to be frozen. One objection is that you can read freeze_scan to mean that a scan is frozen, like a tuple is frozen. Any better ideas? I see what you mean about the possible misinterpretation, but I think it's a big improvement, and I don't have a better suggestion. Thinking about this some more, I'm not too happy with those names either. vacuum_freeze_scan_age and autovacuum_freeze_scan_age don't mean quite the same thing, like vacuum_cost_delay and autovacuum_vacuum_cost_delay do, for example. I'm now leaning towards: autovacuum_freeze_max_age vacuum_freeze_table_age vacuum_freeze_min_age where autovacuum_freeze_max_age and vacuum_freeze_min_age are unchanged, and vacuum_freeze_table_age is the new setting that controls when VACUUM or autovacuum should perform a full scan of the table to advance relfrozenxid. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Automatic view update rules
--On Sonntag, Dezember 28, 2008 15:29:58 +0100 Bernd Helmle be...@oopsware.de wrote: maybe the better solution is to not allow such a view to be updatable Yes, it seems we have to check for target lists having negative attnums in checkTree(). Another solution would be to simply ignore those columns (extract them from the target list and include all updatable columns only). While looking at this it turned out that the problem of updatability is more complex than i originally thought: Consider a relation tree of the following views/relations: View1 - View2 - View3 - Table1 That means, View1 consists of View2 and so on. What happens now if someone is going to change View3, so that it's not updatable anymore? What the patch actually does is, scanning all relations/views involved in a current view (and cascading updates) und reject update rules as soon as it finds more than one relation within a view definition. Unfortunately this seems not to be enough, we had really check all involved views for updatability recursively. The infrastructure for this is already there, but i wonder if it could be made easier when we are going to maintain a separate is_updatable flag somewhere in the catalog, which would make checking the relation tree for updatability more easier. Comments? -- Thanks Bernd -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot standby, slot ids and stuff
On Fri, 2009-01-09 at 13:23 +0200, Heikki Linnakangas wrote: Simon Riggs wrote: On Fri, 2009-01-09 at 12:33 +0200, Heikki Linnakangas wrote: A related issue is that currently the recovery PANICs if it runs out of recovery procs. I think that's not acceptable, regardless of whether we use slotids or some other method to avoid it in normal operation, because it means you can't recover at all if you set max_connections too low in the standby (or in the primary, and you have to recover from crash), or you run out of recovery procs because of an abort failed in the primary like discussed on that thread. The standby should just fast-forward to the next running-xacts record in that case. What do you mean by fast forward? I mean the standby should stop trying to track the in progress transactions in recovery procs, and apply the WAL records like it does before the consistent state is reached. If you say something is not acceptable you need to say what behaviour you do find acceptable in its place. It's good to come up with new ideas, but it's not good to ignore the problems the new ideas have. This is a general point that applies both here and to your proposals with synch rep. It's much harder to say how it should work in a way that covers all the requirements and points others have made, otherwise its just handwaving. So, if we don't PANIC, how should we behave? Without full information on running-xacts we would be unable to take a snapshot, so should: * backends be forcibly disconnected? * backends hang waiting for snapshot info to be re-available again in X minutes worth of WAL time? * backends throw an ERROR: unable to provide snapshot at this time, DETAIL: retry your statement later. ...other alternatives and possibly prevent new connections. If max_connections is higher on primary then the standby will *never* be available for querying. Should we have multiple ERRORs depending upon whether the situation is hopefully-temporary or looks-permanent? Don't assume I want the PANIC. That clearly needs to be revisited if we change slotids. I just want to make a balanced judgement based upon a full consideration of the options. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot standby, slot ids and stuff
Simon Riggs wrote: On Fri, 2009-01-09 at 13:23 +0200, Heikki Linnakangas wrote: I mean the standby should stop trying to track the in progress transactions in recovery procs, and apply the WAL records like it does before the consistent state is reached. ... So, if we don't PANIC, how should we behave? Without full information on running-xacts we would be unable to take a snapshot, so should: * backends be forcibly disconnected? * backends hang waiting for snapshot info to be re-available again in X minutes worth of WAL time? * backends throw an ERROR: unable to provide snapshot at this time, DETAIL: retry your statement later. ...other alternatives and possibly prevent new connections. All of those seem reasonable to me. The 2nd option seems nicest, X minutes should probably be controlled by max_standby_delay, after which you can throw an error. If we care enough, we could also keep tracking the transactions in backend-private memory of the startup process, until there's enough room in proc array. That would make the outage shorter, because you wouldn't have to wait until the next running-xacts record, but only until enough transactions have finished that they all fit in proc array again. But whatever is the simplest, really. If max_connections is higher on primary then the standby will *never* be available for querying. Should we have multiple ERRORs depending upon whether the situation is hopefully-temporary or looks-permanent? Don't assume I want the PANIC. That clearly needs to be revisited if we change slotids. It needs to be revisited whether we change slotids or not, IMHO. Note that with slotids, you have a problem as soon as any of the slots that don't exist on standby are used, regardless of how many concurrent transactions there actually is. Without slots you only have a problem if you really have more than standby's max_connections concurrent transactions. That makes a big difference in practice. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Solve a problem of LC_TIME of windows.
ITAGAKI Takahiro wrote: Hiroshi Inoue in...@tpf.co.jp wrote: Seems LC_CTYPE and LC_TIME should be convertible even though we use wcsftime (which internally calls strftime?). Ok, wcsftime() requries both LC_TIME and LC_CTYPE are the same setting (at least encoding) on Windows. The attached patch is an updated version to fix cache_locale_time(). Now it sets LC_TIME and LC_CTYPE to the specified locale and restore them at end of the function. I tested the patch on Windows XP Japanese Edition (SJIS) with UTF-8 and EUCJP databases, and worked expectedly. #ifdef WIN32 codes seems to be ugly in the patch, but I have no other idea... I have applied this version of the patch (with only a minor further addition to the comment). Thank you all for your work and patience in getting this fixed! Let's hope it stays fixed :-) //Magnus -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot standby, slot ids and stuff
On Fri, 2009-01-09 at 14:38 +0200, Heikki Linnakangas wrote: Simon Riggs wrote: On Fri, 2009-01-09 at 13:23 +0200, Heikki Linnakangas wrote: I mean the standby should stop trying to track the in progress transactions in recovery procs, and apply the WAL records like it does before the consistent state is reached. ... So, if we don't PANIC, how should we behave? Without full information on running-xacts we would be unable to take a snapshot, so should: * backends be forcibly disconnected? * backends hang waiting for snapshot info to be re-available again in X minutes worth of WAL time? * backends throw an ERROR: unable to provide snapshot at this time, DETAIL: retry your statement later. ...other alternatives and possibly prevent new connections. All of those seem reasonable to me. The 2nd option seems nicest, X minutes should probably be controlled by max_standby_delay, after which you can throw an error. Hmm, we use the recovery procs to track transactions that have TransactionIds assigned. That means we will overflow only if we have approach 100% write transactions at any time, or if we have more write transactions in progress than we have max_connections on standby. So it sounds like the overflow situation would probably be both rare and, if it did occur, may not occur for long periods. If we care enough, we could also keep tracking the transactions in backend-private memory of the startup process, until there's enough room in proc array. That would make the outage shorter, because you wouldn't have to wait until the next running-xacts record, but only until enough transactions have finished that they all fit in proc array again. But whatever is the simplest, really. The above does sound best since it would allow us to have the snapshot hang for a short period. But at this stage of the game, more complex. For now though, since it looks like it would happen fairly rarely, I'd opt for the simplest: throw an ERROR. If max_connections is higher on primary then the standby will *never* be available for querying. Should we have multiple ERRORs depending upon whether the situation is hopefully-temporary or looks-permanent? Don't assume I want the PANIC. That clearly needs to be revisited if we change slotids. It needs to be revisited whether we change slotids or not, IMHO. Note that with slotids, you have a problem as soon as any of the slots that don't exist on standby are used, regardless of how many concurrent transactions there actually is. Without slots you only have a problem if you really have more than standby's max_connections concurrent transactions. That makes a big difference in practice. Sometimes, but mostly people set max_connections higher because they intend to use those extra connections. So no real advantage there against the slotid approach :-) -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Buffer pool statistics in Explain Analyze
ITAGAKI Takahiro itagaki.takah...@oss.ntt.co.jp writes: I think there two independent items here: [1] Should we add those statistics to pg_stat_statements or not? [2] Should we add those statistics to EXPLAIN ANALYZE or not? I wanted to have [1] and proposed it, but it is rejected from 8.4. However, the reason is not because we have little demand for it, but [1] and [2] are mixed in the patch and they are bad designed. No, I think you misunderstood me entirely. The reason that I rejected those parts of the patch is that I think the statistics that are available are wrong/useless. If the bufmgr.c counters were really good for something they'd have been exposed long since (and we'd probably never have built a lot of the other stats collection infrastructure). The EXPLAIN ANALYZE code you submitted is actually kinda cute, and I'd have had no problem with it if I thought it were displaying numbers that were useful and unlikely to be obsoleted in future releases. The work that needs to be done is on collecting the numbers more than displaying them. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] about truncate
David Fetter wrote: On Thu, Jan 08, 2009 at 02:39:52PM +0200, Peter Eisentraut wrote: David Fetter wrote: +1 for adding recursion to GRANT/REVOKE :) This area is under SQL standard control, so we can't really invent our own behavior. Consider the following: CREATE TABLE persons (name, email); CREATE TABLE employees (grade, salary) INHERITS (persons); GRANT SELECT ON persons TO allstaff; -- ??? GRANT SELECT ON employees TO managers; What you want in practice is that allstaff can read only those columns of employees that come from the persons table. Both recursive and nonrecursive GRANT do the wrong thing here. What *would* do the right thing here, or would anything? I think we don't need GRANT to be recursive, but instead the permission checks at runtime should allow SELECT * FROM persons; to succeed even if there are no permissions on employees. But only on the columns of persons and only if actually queried through persons. Needs a more detailed analysis, but that is how I imagine it ought to work. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] about truncate
Peter Eisentraut pete...@gmx.net writes: This area is under SQL standard control, so we can't really invent our own behavior. What *would* do the right thing here, or would anything? I think we don't need GRANT to be recursive, but instead the permission checks at runtime should allow SELECT * FROM persons; to succeed even if there are no permissions on employees. Hmm, if we are supposing that the spec should control this, then surely we can find chapter and verse spelling out what should happen. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] foreign_data test fails with non-C locale
The foreign_data test case is failing when I run make installcheck against a server that's been initialized with a locale other than C (en_GB.UTF-8). The reason is the different ordering of upper and lower case characters, per attached diff file. We can simply add an alternative expected output file, but I'd prefer not to if we can modify the test case instead. We could rename some of the object so that they sort the same in all locales, but that seems a bit awkward in this case. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com *** /home/hlinnaka/git-sandbox/pgsql/src/test/regress/expected/foreign_data.out Fri Jan 9 13:11:06 2009 --- /home/hlinnaka/git-sandbox/pgsql/src/test/regress/results/foreign_data.out Fri Jan 9 15:47:27 2009 *** *** 658,667 SELECT * FROM information_schema.foreign_servers ORDER BY 1, 2; foreign_server_catalog | foreign_server_name | foreign_data_wrapper_catalog | foreign_data_wrapper_name | foreign_server_type | foreign_server_version | authorization_identifier +-+--+---+-++-- - regression | S6 | regression | foo | || foreign_data_user regression | s4 | regression | foo | oracle || foreign_data_user regression | s5 | regression | foo | | 15.0 | regress_test_role regression | s6 | regression | foo | | 16.0 | regress_test_indirect regression | s8 | regression | postgresql| || foreign_data_user regression | st1 | regression | foo | || regress_test_indirect regression | st2 | regression | foo | || regress_test_role --- 658,667 SELECT * FROM information_schema.foreign_servers ORDER BY 1, 2; foreign_server_catalog | foreign_server_name | foreign_data_wrapper_catalog | foreign_data_wrapper_name | foreign_server_type | foreign_server_version | authorization_identifier +-+--+---+-++-- regression | s4 | regression | foo | oracle || foreign_data_user regression | s5 | regression | foo | | 15.0 | regress_test_role regression | s6 | regression | foo | | 16.0 | regress_test_indirect + regression | S6 | regression | foo | || foreign_data_user regression | s8 | regression | postgresql| || foreign_data_user regression | st1 | regression | foo | || regress_test_indirect regression | st2 | regression | foo | || regress_test_role *** *** 670,680 SELECT * FROM information_schema.foreign_server_options ORDER BY 1, 2, 3; foreign_server_catalog | foreign_server_name | option_name| option_value +-+--+-- - regression | S6 | mixed_case_names | true regression | s4 | dbname | b regression | s4 | host | a regression | s6 | dbname | b regression | s6 | host | a regression | s8 | connect_timeout | 30 regression | s8 | dbname | db1 (7 rows) --- 670,680 SELECT * FROM
Re: [HACKERS] SET TRANSACTION and SQL Standard
Simon Riggs wrote: I notice that we allow commands such as SET TRANSACTION read only read write read only; BEGIN TRANSACTION read only read only read only; Unsurprisingly, these violate the SQL Standard: * p.977 section 19.1 syntax (1) * p.957 section 17.3 syntax (2) Well, we allow a lot of things. Violations of the SQL standard happen when a command that appears in the standard doesn't do what the standard says. Allowing commands that are not in the standard is not a violation. While there is no huge use case for these particular cases, tolerating redundant options is sometimes useful. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] foreign_data test fails with non-C locale
Heikki Linnakangas wrote: The foreign_data test case is failing when I run make installcheck against a server that's been initialized with a locale other than C (en_GB.UTF-8). The reason is the different ordering of upper and lower case characters, per attached diff file. We can simply add an alternative expected output file, but I'd prefer not to if we can modify the test case instead. We could rename some of the object so that they sort the same in all locales, but that seems a bit awkward in this case. Regression tests have always failed on non-C locales AFAIK. The buildfarm goes out of its way to avoid that. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] foreign_data test fails with non-C locale
Andrew Dunstan wrote: Regression tests have always failed on non-C locales AFAIK. The buildfarm goes out of its way to avoid that. The regression tests should work just fine in non-C locales. If the buildfarm goes out of its way to avoid non-C locales, then it loses some significant code coverage, considering that there are several variant code paths for locales, and considering the amount of users that use them. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: new border setting in psql
On Thu, 8 Jan 2009 18:45:43 -0500 (EST) Greg Smith gsm...@gregsmith.com wrote: A. Einstein was a really smart dude. Which character in the above example would you escape. . is on the long list of characters to be escaped I sent out earlier. The parser looks for all sorts of enumeration syntaxes--A., I), (IV)--but they all require some punctuation which makes those characters the ones to focus on. I tried escaping the '.' but it didn't change the behaviour. I would suggest that if we want actual ReST-safe output we should create a border = 4 setting. The code changes would be minimal. All we need to do is check for a value of 4 in addition to checking whether escaping is necessary. This seems like a reasonable spec to me. If you just want that general format, you can get that and may very well end up with something that's useful ReST anyway with the border=3 mode your existing patch implements. As you demonstrated, there are several situations where a character you think might do something special turns out to be ignored, with \ being the most likely to cause trouble. Enter the following into my test rig http://www.druid.darcy/rest.py: +++ | id | name | +++ | 8 | T'est | +++ | 9 | T*est | +++ | 10 | T\est | +++ | 11 | T\\est | +++ | 12 | T_est | +++ | 13 | T`est | +++ Notice that only the backslash needs to be escaped. However, if you just add the backslash it won't work because the table will be malformed. You need to widen every other field as well. As Tom has pointed out, ReST has problems beyond our implementation. People who use it are aware of these warts. Given that I really think that the cleanest solution is to just give them border 3, don't mention ReST in the docs and have it happily work for 95% of the cases in a ReST processor. How about my other proposal under the Output filter for psql subject? That would let people create parsers as safe as they need them. I think that this proposal is still useful for those that need something quick and dirty though. -- D'Arcy J.M. Cain da...@druid.net | Democracy is three wolves http://www.druid.net/darcy/| and a sheep voting on +1 416 425 1212 (DoD#0082)(eNTP) | what's for dinner. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Including kerberos realm
Magnus Hagander wrote: Magnus Hagander wrote: Tom Lane wrote: Magnus Hagander mag...@hagander.net writes: Alvaro Herrera wrote: Not that this affects me in any way, but should there be a GUC variable to set the default behavior system-wide? I thought about that, but I don't want to add extra gucs without a good reason. You'd typically not have very many different lines in pg_hba for this, and just duplicating the parameter there would be ok I think. I'd rather move more of the krb parameters to be *just* in pg_hba.conf, but for now I left those in postgresql.conf as fallbacks.. If you think those parameters would make more sense in pg_hba.conf, let's just move them and be done with it. There has never been any intention that administrator-only GUCs would be promised compatible across versions. And the GUC mechanism is really rather a lot of overhead compared to options on a pg_hba line ... Well, it does make sense to have defaults in postgresql.conf - but I don't think it's worth the overhead. I'll commit the stuff I have for now and put it on my TODO to remove them completely from postgresql.conf later. I'll see if I have time to get it done for 8.4. Ok, I've applied a patch for this for the parameter krb_realm and krb_server_hostname, which are the ones that currently supported both. Should we also consider moving the remaining ones there? (krb_server_keyfile, krb_srvname, krb_caseinsens_users) They do make sense to set on a per-server basis, on the other hand they are the only remaining authentication-method-specific parameters left... If they make more sense in postgresql.conf, I would just leave them there. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improving compressibility of WAL files
Tom Lane wrote: Bruce Momjian br...@momjian.us writes: Tom Lane wrote: Isn't this redundant given the existence of pglesslog? It does the same as pglesslog, but is simpler to use because it is automatic. Which also means that everyone pays the performance penalty whether they get any benefit or not. The point of the external solution is to do the work only in installations that get some benefit. We've been over this ground before... If there is a performance penalty, you are right, but if the zeroing is done as part of the archiving, it seems near zero cost enough to do it all the time, no? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] parallel restore
Attached is the latest version. Changes: . some tidy up as variously requested. . some common code is factored out . some descriptive comments added . platform specific stuff (e.g. spawn, reap) is factored out . --truncate_before_load is gone, and we now do this during parallel restore if we created the table. For a non-parallel restore the equivalent would be to run the whole restore in a single transaction. One of the hardest parts of getting this to work was handling dependencies right. I will work on adding some more comments regarding that. Simon asked about a way to adjust the number of worker children as we go along. That's way out of scope at this stage. In testing it appears that the sweet spot is roughly m=number_of_processors, which makes some sense, but more experience will clarify this. cheers andrew parallel_restore_14.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SET TRANSACTION and SQL Standard
Peter Eisentraut pete...@gmx.net writes: Simon Riggs wrote: I notice that we allow commands such as SET TRANSACTION read only read write read only; BEGIN TRANSACTION read only read only read only; Well, we allow a lot of things. Violations of the SQL standard happen when a command that appears in the standard doesn't do what the standard says. Allowing commands that are not in the standard is not a violation. I agree that spec violation is not a good argument for rejecting these. However, self-consistency with our own common practice should be considered. In practically every utility command we have that takes a list of options, we throw conflicting or redundant options errors in similar cases. My own feeling is that the second example is okay but the first should be rejected, since (a) it's quite unclear what the user wants, and (b) the ensuing behavior would be determined by implementation artifacts like which order we processed the options in. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SET TRANSACTION and SQL Standard
On Fri, 2009-01-09 at 16:14 +0200, Peter Eisentraut wrote: Simon Riggs wrote: I notice that we allow commands such as SET TRANSACTION read only read write read only; BEGIN TRANSACTION read only read only read only; Unsurprisingly, these violate the SQL Standard: * p.977 section 19.1 syntax (1) * p.957 section 17.3 syntax (2) Well, we allow a lot of things. Violations of the SQL standard happen when a command that appears in the standard doesn't do what the standard says. Allowing commands that are not in the standard is not a violation. Except when the standard explicitly forbids it, as with the above. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] foreign_data test fails with non-C locale
Heikki Linnakangas wrote: The foreign_data test case is failing when I run make installcheck against a server that's been initialized with a locale other than C (en_GB.UTF-8). I have removed one of the differences but can't reproduce the other right now (although it looks consequential). I'll check that on a different machine. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pgindent does a pretty awful job with function-pointer typedefs
I wonder if this can be improved: http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/optimizer/planner.h.diff?r1=1.41;r2=1.42 Similar examples can be found elsewhere ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] foreign_data test fails with non-C locale
Andrew Dunstan wrote: Heikki Linnakangas wrote: The foreign_data test case is failing when I run make installcheck against a server that's been initialized with a locale other than C (en_GB.UTF-8). The reason is the different ordering of upper and lower case characters, per attached diff file. We can simply add an alternative expected output file, but I'd prefer not to if we can modify the test case instead. We could rename some of the object so that they sort the same in all locales, but that seems a bit awkward in this case. Regression tests have always failed on non-C locales AFAIK. The buildfarm goes out of its way to avoid that. No, that's the only test case that's failing. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SET TRANSACTION and SQL Standard
Simon Riggs wrote: On Fri, 2009-01-09 at 16:14 +0200, Peter Eisentraut wrote: Simon Riggs wrote: I notice that we allow commands such as SET TRANSACTION read only read write read only; BEGIN TRANSACTION read only read only read only; Unsurprisingly, these violate the SQL Standard: * p.977 section 19.1 syntax (1) * p.957 section 17.3 syntax (2) Well, we allow a lot of things. Violations of the SQL standard happen when a command that appears in the standard doesn't do what the standard says. Allowing commands that are not in the standard is not a violation. Except when the standard explicitly forbids it, as with the above. No, it just means that the statement SET TRANSACTION read only read write read only; doesn't conform to the standard, and it's therefore implementation-dependent what it does. See the meaning of shall in Syntax Rules, section 6.3.3.2 Terms denoting rule requirements. I agree with Tom that the 2nd form is harmless, but we should throw an error for the first. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] foreign_data test fails with non-C locale
Peter Eisentraut wrote: Andrew Dunstan wrote: Regression tests have always failed on non-C locales AFAIK. The buildfarm goes out of its way to avoid that. The regression tests should work just fine in non-C locales. If the buildfarm goes out of its way to avoid non-C locales, then it loses some significant code coverage, considering that there are several variant code paths for locales, and considering the amount of users that use them. It was discussed here at the time, IIRC, and we put in the check precisely because other locales broke the buildfarm. Originally buildfarm just inherited the locale from its environment. If it is no longer true that other locales break the tests, then I'm happy to examine alternatives. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Solve a problem of LC_TIME of windows.
Hi. Thanks all. I tried CVS-HEAD now. HIROSHI=# select to_char(now(),'TMDay'); to_char -- Saturday (1 行) HIROSHI=# set LC_MESSAGES=Ja; SET HIROSHI=# select to_char(now(),'TMDay'); to_char -- Saturday (1 行) Umm, It does not look at a comfortable result.:-( I will check it on tomorrow night. sorry busy now.. Regards, Hiroshi Saito - Original Message - From: Magnus Hagander mag...@hagander.net ITAGAKI Takahiro wrote: Hiroshi Inoue in...@tpf.co.jp wrote: Seems LC_CTYPE and LC_TIME should be convertible even though we use wcsftime (which internally calls strftime?). Ok, wcsftime() requries both LC_TIME and LC_CTYPE are the same setting (at least encoding) on Windows. The attached patch is an updated version to fix cache_locale_time(). Now it sets LC_TIME and LC_CTYPE to the specified locale and restore them at end of the function. I tested the patch on Windows XP Japanese Edition (SJIS) with UTF-8 and EUCJP databases, and worked expectedly. #ifdef WIN32 codes seems to be ugly in the patch, but I have no other idea... I have applied this version of the patch (with only a minor further addition to the comment). Thank you all for your work and patience in getting this fixed! Let's hope it stays fixed :-) //Magnus -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improving compressibility of WAL files
Greg Smith gsm...@gregsmith.com wrote: I thought at one point that the direction this was going toward was to provide the size of the WAL file as a parameter you can use in the archive_command: %p provides the path, %f the file name, and now %l the length. That makes an example archive command something like: head -c %l %p | gzip /mnt/server/archivedir/%f Hard to beat for performance. I thought there was some technical snag. Expanding it back to always be 16MB on the other side might require some trivial script, can't think of a standard UNIX tool suitable for that but it's easy enough to write. Untested, but it seems like something close to this would work: cat $p $( dd if=/dev/null blocks=1 ibs=$(( (16 * 1024 * 1024) - $(stat -c%s $p) )) ) -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improving compressibility of WAL files
Bruce Momjian br...@momjian.us writes: Tom Lane wrote: Which also means that everyone pays the performance penalty whether they get any benefit or not. The point of the external solution is to do the work only in installations that get some benefit. We've been over this ground before... If there is a performance penalty, you are right, but if the zeroing is done as part of the archiving, it seems near zero cost enough to do it all the time, no? It's the same cost no matter which process does it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SET TRANSACTION and SQL Standard
On Fri, 2009-01-09 at 17:11 +0200, Heikki Linnakangas wrote: Simon Riggs wrote: On Fri, 2009-01-09 at 16:14 +0200, Peter Eisentraut wrote: Simon Riggs wrote: I notice that we allow commands such as SET TRANSACTION read only read write read only; BEGIN TRANSACTION read only read only read only; Unsurprisingly, these violate the SQL Standard: * p.977 section 19.1 syntax (1) * p.957 section 17.3 syntax (2) Well, we allow a lot of things. Violations of the SQL standard happen when a command that appears in the standard doesn't do what the standard says. Allowing commands that are not in the standard is not a violation. Except when the standard explicitly forbids it, as with the above. No, it just means that the statement SET TRANSACTION read only read write read only; doesn't conform to the standard, and it's therefore implementation-dependent what it does. See the meaning of shall in Syntax Rules, section 6.3.3.2 Terms denoting rule requirements. which says If any condition required by Syntax Rules is not satisfied when the evaluation of Access or General Rules is attempted and the implementation is neither processing non-conforming SQL language nor processing conforming SQL language in a non-conforming manner, then an exception condition is raised: syntax error or access rule violation. If we *choose* to be an SQL implementation that conforms to the SQL standard, then it should throw an error. Of course, we can *choose* not to conform to the standard in this or any case, but exactly why would we? I thought we had made a choice to conform to the SQL Standard, unless we have specific reason not to. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improving compressibility of WAL files
Kevin Grittner kevin.gritt...@wicourts.gov writes: Greg Smith gsm...@gregsmith.com wrote: I thought at one point that the direction this was going toward was to provide the size of the WAL file as a parameter you can use in the archive_command: Hard to beat for performance. I thought there was some technical snag. Yeah: the archiver process doesn't have that information available. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: new border setting in psql
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 D'Arcy J.M. Cain a écrit : As Tom has pointed out, ReST has problems beyond our implementation. People who use it are aware of these warts. Given that I really think that the cleanest solution is to just give them border 3, don't mention ReST in the docs and have it happily work for 95% of the cases in a ReST processor. +1 How about my other proposal under the Output filter for psql subject? That would let people create parsers as safe as they need them. I think that this proposal is still useful for those that need something quick and dirty though. Can be interesting, but for my own usage border=3 will be enough. - -- Cédric Villemain Administrateur de Base de Données Cel: +33 (0)6 74 15 56 53 http://dalibo.com - http://dalibo.org -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.9 (GNU/Linux) iEYEARECAAYFAklneBwACgkQo/dppWjpEvxucQCeIuTatyfoEw/TkqAVLK/hI0wq WkIAn3mt4tnMz3BAjXIJqtmMlj9d4r4l =Ykl+ -END PGP SIGNATURE- -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improving compressibility of WAL files
All that is useless until we get a %l in archive_command... *I* didn't see an easy way to get at the written size later on in the chain (i.e. in the actual archiving), so I took the path of least resitance. The reason *I* shy way from pg_lesslog and pg_clearxlogtail, is that they seem to possibly be frail... I'm just scared of somethign changing in PG some time, and my pg_clearxlogtail not nowing, me forgetting to upgrade, and me not doing enough test of my actually restoring backups... Sure, it's all me being neglgent, but the simpler, the better... If I wrapped this zeroing in GUC, people could choose wether to pay the penalty or not, would that satisfy anyone? Again, *I* think that the force_switch case is going to happen when the admin's quite happy to pay that penalty... But obviously not everyone... a. * Kevin Grittner kevin.gritt...@wicourts.gov [090109 11:01]: Greg Smith gsm...@gregsmith.com wrote: I thought at one point that the direction this was going toward was to provide the size of the WAL file as a parameter you can use in the archive_command: %p provides the path, %f the file name, and now %l the length. That makes an example archive command something like: head -c %l %p | gzip /mnt/server/archivedir/%f Hard to beat for performance. I thought there was some technical snag. Expanding it back to always be 16MB on the other side might require some trivial script, can't think of a standard UNIX tool suitable for that but it's easy enough to write. Untested, but it seems like something close to this would work: cat $p $( dd if=/dev/null blocks=1 ibs=$(( (16 * 1024 * 1024) - $(stat -c%s $p) )) ) -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Aidan Van Dyk Create like a god, ai...@highrise.ca command like a king, http://www.highrise.ca/ work like a slave. signature.asc Description: Digital signature
Re: [HACKERS] SET TRANSACTION and SQL Standard
Simon Riggs si...@2ndquadrant.com writes: If any condition required by Syntax Rules is not satisfied when the evaluation of Access or General Rules is attempted and the implementation is neither processing non-conforming SQL language nor processing conforming SQL language in a non-conforming manner, then an exception condition is raised: syntax error or access rule violation. If we *choose* to be an SQL implementation that conforms to the SQL standard, then it should throw an error. That reading would forbid any nonstandard syntax whatsoever... What this is actually describing is the standards conformance checking mode that the standard says you ought to provide, but we never have (nor have most other vendors AFAIK). In SQL92 this was described as a SQL Flagger and it was optional. Not sure what the latest spec says about that. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Solve a problem of LC_TIME of windows.
Hiroshi Saito z-sa...@guitar.ocn.ne.jp writes: HIROSHI=# set LC_MESSAGES=Ja; SET HIROSHI=# select to_char(now(),'TMDay'); to_char -- Saturday (1 è¡) I thought this was supposed to be driven by LC_TIME now, not LC_MESSAGES. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] foreign_data test fails with non-C locale
Andrew Dunstan and...@dunslane.net writes: Peter Eisentraut wrote: The regression tests should work just fine in non-C locales. It was discussed here at the time, IIRC, and we put in the check precisely because other locales broke the buildfarm. Originally buildfarm just inherited the locale from its environment. I don't think we are prepared to buy into a general policy that the regression tests should pass in *any* locale; maintaining a large number of variant expected-files isn't very practical. However, the de facto policy is that we try to keep them passing in locales that are used by any of the regular developers. I think it would be useful to have buildfarm members testing in a few common locales. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improving compressibility of WAL files
On Fri, 2009-01-09 at 09:31 -0500, Bruce Momjian wrote: Tom Lane wrote: Bruce Momjian br...@momjian.us writes: Tom Lane wrote: Isn't this redundant given the existence of pglesslog? It does the same as pglesslog, but is simpler to use because it is automatic. Which also means that everyone pays the performance penalty whether they get any benefit or not. The point of the external solution is to do the work only in installations that get some benefit. We've been over this ground before... If there is a performance penalty, you are right, but if the zeroing is done as part of the archiving, it seems near zero cost enough to do it all the time, no? It can already be done as part of the archiving, using an external tool as Tom notes. Yes, we could make the archiver do this, but I see no big advantage over having it done externally. It's not faster, safer, easier. Not easier because we would want a parameter to turn it off when not wanted. The patch as stands is IMHO not acceptable because the work to zero the file is performed by the unlucky backend that hits EOF on the current WAL file, which is bad enough, but it is also performed while holding WALWriteLock. I like Greg Smith's analysis of this and his conclusion that we could provide a %l option, but even that would require work to have that info passed to the archiver. Perhaps inside the notification file which is already written and read by the write processes. But whether that can or should be done for this release is a different debate. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Hot standby, RestoreBkpBlocks and cleanup locks
The hot standby patch has some hacks to decide which full-page-images can be restored holding an exclusive lock and which ones need a vacuum-strength lock. It's not very pretty as is, as mentioned in comments too. How about we refactor things so that redo-functions are responsible for calling RestoreBkpBlocks? The redo function can then pass an argument indicating what kind of lock is required. We should also change XLogReadBufferExtended so that it doesn't lock the page; the caller knows better what kind of a lock it needs. That makes it more analogous with ReadBufferExtended too, although I think we should keep XLogReadBuffer() unchanged for now. See attached patch. One shortfall of this patch is that you can pass only one argument to RestoreBkpBlocks, but there can multiple backup blocks in one WAL record. That's OK in current usage, though. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com *** src/backend/access/gin/ginxlog.c --- src/backend/access/gin/ginxlog.c *** *** 438,443 gin_redo(XLogRecPtr lsn, XLogRecord *record) --- 438,445 { uint8 info = record-xl_info ~XLR_INFO_MASK; + RestoreBkpBlocks(lsn, record, false); + topCtx = MemoryContextSwitchTo(opCtx); switch (info) { *** src/backend/access/gist/gistxlog.c --- src/backend/access/gist/gistxlog.c *** *** 395,400 gist_redo(XLogRecPtr lsn, XLogRecord *record) --- 395,402 { uint8 info = record-xl_info ~XLR_INFO_MASK; + RestoreBkpBlocks(lsn, record, false); + MemoryContext oldCxt; oldCxt = MemoryContextSwitchTo(opCtx); *** src/backend/access/heap/heapam.c --- src/backend/access/heap/heapam.c *** *** 4777,4782 heap_redo(XLogRecPtr lsn, XLogRecord *record) --- 4777,4784 { uint8 info = record-xl_info ~XLR_INFO_MASK; + RestoreBkpBlocks(lsn, record, false); + switch (info XLOG_HEAP_OPMASK) { case XLOG_HEAP_INSERT: *** *** 4816,4827 heap2_redo(XLogRecPtr lsn, XLogRecord *record) --- 4818,4832 switch (info XLOG_HEAP_OPMASK) { case XLOG_HEAP2_FREEZE: + RestoreBkpBlocks(lsn, record, false); heap_xlog_freeze(lsn, record); break; case XLOG_HEAP2_CLEAN: + RestoreBkpBlocks(lsn, record, true); heap_xlog_clean(lsn, record, false); break; case XLOG_HEAP2_CLEAN_MOVE: + RestoreBkpBlocks(lsn, record, true); heap_xlog_clean(lsn, record, true); break; default: *** src/backend/access/nbtree/nbtxlog.c --- src/backend/access/nbtree/nbtxlog.c *** *** 714,719 btree_redo(XLogRecPtr lsn, XLogRecord *record) --- 714,721 { uint8 info = record-xl_info ~XLR_INFO_MASK; + RestoreBkpBlocks(lsn, record, false); + switch (info) { case XLOG_BTREE_INSERT_LEAF: *** src/backend/access/transam/xlog.c --- src/backend/access/transam/xlog.c *** *** 2934,2941 CleanupBackupHistory(void) * modifications of the page that appear in XLOG, rather than possibly * ignoring them as already applied, but that's not a huge drawback. */ ! static void ! RestoreBkpBlocks(XLogRecord *record, XLogRecPtr lsn) { Buffer buffer; Page page; --- 2934,2941 * modifications of the page that appear in XLOG, rather than possibly * ignoring them as already applied, but that's not a huge drawback. */ ! void ! RestoreBkpBlocks(XLogRecPtr lsn, XLogRecord *record, bool cleanup) { Buffer buffer; Page page; *** *** 2943,2948 RestoreBkpBlocks(XLogRecord *record, XLogRecPtr lsn) --- 2943,2951 char *blk; int i; + if (!(record-xl_info XLR_BKP_BLOCK_MASK)) + return; + blk = (char *) XLogRecGetData(record) + record-xl_len; for (i = 0; i XLR_MAX_BKP_BLOCKS; i++) { *** *** 2955,2960 RestoreBkpBlocks(XLogRecord *record, XLogRecPtr lsn) --- 2958,2968 buffer = XLogReadBufferExtended(bkpb.node, bkpb.fork, bkpb.block, RBM_ZERO); Assert(BufferIsValid(buffer)); + if (cleanup) + LockBufferForCleanup(buffer); + else + LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE); + page = (Page) BufferGetPage(buffer); if (bkpb.hole_length == 0) *** *** 5210,5218 StartupXLOG(void) TransactionIdAdvance(ShmemVariableCache-nextXid); } - if (record-xl_info XLR_BKP_BLOCK_MASK) - RestoreBkpBlocks(record, EndRecPtr); - RmgrTable[record-xl_rmid].rm_redo(EndRecPtr, record); /* Pop the error context stack */ --- 5218,5223 *** src/backend/access/transam/xlogutils.c --- src/backend/access/transam/xlogutils.c *** *** 217,224 XLogCheckInvalidPages(void) /* * XLogReadBuffer ! * A shorthand of XLogReadBufferExtended(), for reading from the main ! * fork. * * For historical reasons, instead of a ReadBufferMode argument, this only * supports RBM_ZERO (init == true) and RBM_NORMAL (init == false) modes. --- 217,234
Re: [HACKERS] Improving compressibility of WAL files
* Simon Riggs si...@2ndquadrant.com [090109 11:33]: The patch as stands is IMHO not acceptable because the work to zero the file is performed by the unlucky backend that hits EOF on the current WAL file, which is bad enough, but it is also performed while holding WALWriteLock. Agreed, but noting that that extra zero work is contitional on the force_swich, meaning that commits backup behind that WALWriteLock only during forced xlog switches (like archive_timeout and pg_backup). I actually did look through verify that when I made the patch, although I claim that verification to be something anybody else should beleive ;-) But my given output when I showd the stats/lines/etc did demonstrate that. I like Greg Smith's analysis of this and his conclusion that we could provide a %l option, but even that would require work to have that info passed to the archiver. Perhaps inside the notification file which is already written and read by the write processes. But whether that can or should be done for this release is a different debate. It's certainly not already in this commitfest, just like this patch. I thought the initial reaction after I posted it made it pretty clear it wasn't something people (other than a few of us) were willing to allow... a. -- Aidan Van Dyk Create like a god, ai...@highrise.ca command like a king, http://www.highrise.ca/ work like a slave. signature.asc Description: Digital signature
Re: [HACKERS] Solve a problem of LC_TIME of windows.
I thought this was supposed to be driven by LC_TIME now, not LC_MESSAGES. Uga, yes yes! HIROSHI=# set LC_TIME=Ja; SET HIROSHI=# select to_char(now(),'TMDay'); to_char - 土曜日 (1 行) Thanks:-) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SET TRANSACTION and SQL Standard
On Fri, 2009-01-09 at 11:20 -0500, Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: If any condition required by Syntax Rules is not satisfied when the evaluation of Access or General Rules is attempted and the implementation is neither processing non-conforming SQL language nor processing conforming SQL language in a non-conforming manner, then an exception condition is raised: syntax error or access rule violation. If we *choose* to be an SQL implementation that conforms to the SQL standard, then it should throw an error. That reading would forbid any nonstandard syntax whatsoever... No, it does allow you to choose on a case by case basis. But yes, I had thought our (not just my) default position was to conform to the standard. What this is actually describing is the standards conformance checking mode that the standard says you ought to provide, but we never have (nor have most other vendors AFAIK). In SQL92 this was described as a SQL Flagger and it was optional. Not sure what the latest spec says about that. I've been thinking about that as something for next release. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] foreign_data test fails with non-C locale
On Fri, Jan 9, 2009 at 5:24 PM, Tom Lane t...@sss.pgh.pa.us wrote: However, the de facto policy is that we try to keep them passing in locales that are used by any of the regular developers. I think it would be useful to have buildfarm members testing in a few common locales. If you define common locales, I can set up as many new animals as needed to cover the locales needed for any branch we'd like to test. Perhaps we should add a parameter to the buildfarm config file so that the buildfarm script can check the locale is accepted and set it directly. Considering that we won't have the locale information in the animal description, it's a good way to have it in the report. Just let me know. -- Guillaume -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot standby, RestoreBkpBlocks and cleanup locks
On Fri, 2009-01-09 at 18:30 +0200, Heikki Linnakangas wrote: The hot standby patch has some hacks to decide which full-page-images can be restored holding an exclusive lock and which ones need a vacuum-strength lock. It's not very pretty as is, as mentioned in comments too. Agreed! How about we refactor things so that redo-functions are responsible for calling RestoreBkpBlocks? The redo function can then pass an argument indicating what kind of lock is required. We should also change XLogReadBufferExtended so that it doesn't lock the page; the caller knows better what kind of a lock it needs. That makes it more analogous with ReadBufferExtended too, although I think we should keep XLogReadBuffer() unchanged for now. Much better idea, thanks. I felt a new rmgr function was overkill but couldn't think of how to do this. See attached patch. One shortfall of this patch is that you can pass only one argument to RestoreBkpBlocks, but there can multiple backup blocks in one WAL record. That's OK in current usage, though. If we're doing this because of cleanup locks, then I'd say we don't currently need a cleanup lock with any WAL record that uses multiple backup blocks. So we can just document that so anybody adding such a record in the future will be careful. So all seems good. Would you want to push ResolveRedoVisibilityConflicts() down into the rmgrs as well and make reachedSafeStartPoint a global? That is only called for cleanup records. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Automatic view update rules
--On Freitag, Januar 09, 2009 13:20:57 +0100 Bernd Helmle maili...@oopsware.de wrote: That means, View1 consists of View2 and so on. What happens now if someone is going to change View3, so that it's not updatable anymore? What the patch actually does is, scanning all relations/views involved in a current view (and cascading updates) und reject update rules as soon as it finds more than one relation within a view definition. Unfortunately this seems not to be enough, we had really check all involved views for updatability recursively. The infrastructure for this is already there, but i wonder if it could be made easier when we are going to maintain a separate is_updatable flag somewhere in the catalog, which would make checking the relation tree for updatability more easier. I've decided to check updatability of all involved views during view creation. Please find attached a new version with all other open issues adressed. -- Thanks Bernd view_update.patch.bz2 Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improving compressibility of WAL files
Aidan Van Dyk ai...@highrise.ca 01/09/09 10:22 AM The reason *I* shy way from pg_lesslog and pg_clearxlogtail, is that they seem to possibly be frail... I'm just scared of somethign changing in PG some time, and my pg_clearxlogtail not nowing, me forgetting to upgrade, and me not doing enough test of my actually restoring backups... A fair concern. I can't speak about pglesslog, but pg_clearxlogtail goes out of its way to minimize this risk. Changes to log records themselves can't break it; it is only dependent on the partitioning. It bails with a message to stderr and a non-zero return code if it finds anything obviously wrong. It also checks the WAL format for which it was compiled against the WAL format on which it was invoked, and issues a warning if they don't match. We ran into this once on a machine running multiple releases of PostgreSQL where the archive script invoked the wrong executable. It worked correctly in spite of the warning, but the warning was enough to alert us to the mismatch and change the path in the archive script. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot standby, RestoreBkpBlocks and cleanup locks
On Fri, 2009-01-09 at 18:30 +0200, Heikki Linnakangas wrote: How about we refactor things? Was that a patch against HEAD or a patch on patch? It would be useful to nibble away at the patch, committing all the necessary refactorings to make the patch work. That will reduce size of eventual commit. Do you want me to start using the GIT repo to make it easier to extract parts? You'd need to show me the setup you use first. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New patch for Column-level privileges
Jamie, et al, * Jaime Casanova (jcasa...@systemguards.com.ec) wrote: para ! Currently, productnamePostgreSQL/productname does not recognize ! column-level SELECT privileges when a JOIN is involved. One possible workaround is to create a view having just the desired columns and then grant privileges to that view. /para Remove this from the documentation, and: - Other minor cleanups (thanks KaiGai) - Added pg_dump support - Added support for 'ALL(col1)' grant/revokes (helped with pg_dump) - Added more regression tests - Updated documentation accordingly Please test, comment, etc. Thanks, Stephen colprivs_wip.2009010901.diff.gz Description: Binary data signature.asc Description: Digital signature
Re: [HACKERS] foreign_data test fails with non-C locale
Guillaume Smet wrote: On Fri, Jan 9, 2009 at 5:24 PM, Tom Lane t...@sss.pgh.pa.us wrote: However, the de facto policy is that we try to keep them passing in locales that are used by any of the regular developers. I think it would be useful to have buildfarm members testing in a few common locales. If you define common locales, I can set up as many new animals as needed to cover the locales needed for any branch we'd like to test. Perhaps we should add a parameter to the buildfarm config file so that the buildfarm script can check the locale is accepted and set it directly. Considering that we won't have the locale information in the animal description, it's a good way to have it in the report. Sure, we can easily have buildfarm's initdb step set any locale (and encoding, for that matter) we like. That's a simple change. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improving compressibility of WAL files
Tom Lane wrote: Kevin Grittner kevin.gritt...@wicourts.gov writes: Greg Smith gsm...@gregsmith.com wrote: I thought at one point that the direction this was going toward was to provide the size of the WAL file as a parameter you can use in the archive_command: Hard to beat for performance. I thought there was some technical snag. Yeah: the archiver process doesn't have that information available. Am I being really dim here - why isn't the first record in the WAL file a fixed-length record containing e.g. txid_start, time_start, txid_end, time_end, length? Write it once when you start using the file and once when it's finished. -- Richard Huxton Archonet Ltd -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improving compressibility of WAL files
* Richard Huxton d...@archonet.com [090109 12:22]: Yeah: the archiver process doesn't have that information available. Am I being really dim here - why isn't the first record in the WAL file a fixed-length record containing e.g. txid_start, time_start, txid_end, time_end, length? Write it once when you start using the file and once when it's finished. It would break the WAL write-block/sync-block forward only progress of the xlog, which avoids the whole torn-page problem that the heap has. a. -- Aidan Van Dyk Create like a god, ai...@highrise.ca command like a king, http://www.highrise.ca/ work like a slave. signature.asc Description: Digital signature
Re: [HACKERS] Hot standby, RestoreBkpBlocks and cleanup locks
Simon Riggs wrote: On Fri, 2009-01-09 at 18:30 +0200, Heikki Linnakangas wrote: How about we refactor things? Was that a patch against HEAD or a patch on patch? Against HEAD. It would be useful to nibble away at the patch, committing all the necessary refactorings to make the patch work. That will reduce size of eventual commit. Agreed. This in particular is a change I feel pretty confident to commit beforehand. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Maintaining patchset with GIT (was Re: Hot standby, RestoreBkpBlocks and cleanup locks)
Simon Riggs wrote: Do you want me to start using the GIT repo to make it easier to extract parts? It would be useful. Even more so for your own sanity, I think :-). I find maintaining multiple interdependent patches much easier with GIT, though it's still easy to get confused. Feel free to continue with CVS, of course. You'd need to show me the setup you use first. Well, the first thing to do is to clone the repository, see wiki for that. And get an account at git.postgresql.org so that you can publish your stuff as a git repository. (I should get one myself..) For reviewing hot standby, I created one hotstandbyv6a branch, and applied and committed all the patches in right order. But if you want to keep the patches separate, you should create a separate branch for each. If patch B depends on patch A, create branch A first, and then branch branch B from that. That's what I did for the relation forks and FSM work. Just remember to always commit to the right branch. Whenever you commit changes to branch A, also merge those changes to branch B with git checkout B; git merge A. To sync with PostgreSQL CVS HEAD: git merge origin/master To generate diffs, you can do for example git diff A..B to create a diff between A and B, or git diff origin/master..A to create a diff between PostgreSQL CVS HEAD and A. git log is also very helpful. There's a learning curve, but don't hesitate to ask. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: pgindent does a pretty awful job with function-pointer typedefs
Tom Lane wrote: I wonder if this can be improved: http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/optimizer/planner.h.diff?r1=1.41;r2=1.42 Similar examples can be found elsewhere ... Agreed, pgindent does a poor job aligning function pointers. I rerand planner.h here --- 'planner_hook_type' is defined as a typedef to BSD indent, but the alignment is clearly off, but I can't figure out how to improve it. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improving compressibility of WAL files
Tom Lane wrote: Kevin Grittner kevin.gritt...@wicourts.gov writes: Greg Smith gsm...@gregsmith.com wrote: I thought at one point that the direction this was going toward was to provide the size of the WAL file as a parameter you can use in the archive_command: Hard to beat for performance. I thought there was some technical snag. Yeah: the archiver process doesn't have that information available. OK, thanks, I understand now. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improving compressibility of WAL files
Aidan Van Dyk wrote: * Richard Huxton d...@archonet.com [090109 12:22]: Yeah: the archiver process doesn't have that information available. Am I being really dim here - why isn't the first record in the WAL file a fixed-length record containing e.g. txid_start, time_start, txid_end, time_end, length? Write it once when you start using the file and once when it's finished. It would break the WAL write-block/sync-block forward only progress of the xlog, which avoids the whole torn-page problem that the heap has. I thought that only applied when the filesystem page-size was less than the data we were writing? -- Richard Huxton Archonet Ltd -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improving compressibility of WAL files
Simon Riggs si...@2ndquadrant.com writes: Yes, we could make the archiver do this, but I see no big advantage over having it done externally. It's not faster, safer, easier. Not easier because we would want a parameter to turn it off when not wanted. And the other question to ask is how much effort and code should we be putting into the concept anyway. AFAICS, file-at-a-time WAL shipping is a stopgap implementation that will be dead as a doornail once the current efforts towards realtime replication are finished. There will still be some use for forced log switches in connection with backups, but that's not going to occur often enough to be important to optimize. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Visibility map and freezing
On Fri, 2009-01-09 at 13:49 +0200, Heikki Linnakangas wrote: Thinking about this some more, I'm not too happy with those names either. vacuum_freeze_scan_age and autovacuum_freeze_scan_age don't mean quite the same thing, like vacuum_cost_delay and autovacuum_vacuum_cost_delay do, for example. If the distinction you're making is that autovacuum_freeze_max_age affects the launching of a vacuum rather than the behavior of a vacuum, maybe we could incorporate the word launch like: autovacuum_launch_freeze_threshold I'm now leaning towards: autovacuum_freeze_max_age vacuum_freeze_table_age vacuum_freeze_min_age where autovacuum_freeze_max_age and vacuum_freeze_min_age are unchanged, and vacuum_freeze_table_age is the new setting that controls when VACUUM or autovacuum should perform a full scan of the table to advance relfrozenxid. I'm still bothered by the fact that max and min really mean the same thing here. I don't think we can perfectly capture the meaning of these GUCs in the name. I think our goal should be to avoid confusion between them. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Solve a problem of LC_TIME of windows.
Hiroshi Saito wrote: I thought this was supposed to be driven by LC_TIME now, not LC_MESSAGES. Uga, yes yes! HIROSHI=# set LC_TIME=Ja; SET HIROSHI=# select to_char(now(),'TMDay'); to_char - 土曜日 (1 行) Thanks:-) Great! Thanks for testing! //Magnus -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improving compressibility of WAL files
Tom Lane t...@sss.pgh.pa.us wrote: AFAICS, file-at-a-time WAL shipping is a stopgap implementation that will be dead as a doornail once the current efforts towards realtime replication are finished. As long as there is a way to rsync log data to multiple targets not running replicas, with compression because of low-speed WAN connections, I'm happy. Doesn't matter whether that is using existing techniques or the new realtime techniques. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improving compressibility of WAL files
On Fri, 2009-01-09 at 13:22 -0500, Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: Yes, we could make the archiver do this, but I see no big advantage over having it done externally. It's not faster, safer, easier. Not easier because we would want a parameter to turn it off when not wanted. And the other question to ask is how much effort and code should we be putting into the concept anyway. AFAICS, file-at-a-time WAL shipping is a stopgap implementation that will be dead as a doornail once the current efforts towards realtime replication are finished. There will still be some use for forced log switches in connection with backups, but that's not going to occur often enough to be important to optimize. Agreed. Half-filled WAL files were necessary to honour archive_timeout. With continuous streaming all WAL files will be 100% full before we switch, for most purposes. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hot standby, slot ids and stuff
On Fri, 2009-01-09 at 11:20 +, Simon Riggs wrote: On Fri, 2009-01-09 at 12:33 +0200, Heikki Linnakangas wrote: Simon Riggs wrote: On Thu, 2009-01-08 at 15:50 -0500, Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: On Thu, 2009-01-08 at 22:31 +0200, Heikki Linnakangas wrote: When a backend dies with FATAL, it writes an abort record before exiting. (I was under the impression it doesn't until few minutes ago myself, when I actually read the shutdown code :-)) Not in all cases; keep reading :-) If it doesn't, that's a bug. A FATAL exit is not supposed to leave the shared state corrupted, it's only supposed to be a forcible session termination. Any open transaction should be rolled back. Please look back at the earlier discussion we had on this exact point: http://archives.postgresql.org/pgsql-hackers/2008-09/msg01809.php I think the running-xacts list we dump to WAL at every checkpoint is enough to handle that. Just treat the dead transaction as in-progress until the next running-xacts record. It's presumably extremely rare to have a process die with FATAL, and not write an abort record. I agree, but I'll wait for Tom to speak further. OK, will proceed without this. Time is pressing. Heikki and I both agree that FATAL errors that fail to write abort records are rare and an acceptable problem in real usage. If they do occur, their nuisance factor is short-lived because of measures taken within the patch. Hot Standby does not *rely* upon there always an abort record for FATAL errors, so we cannot reasonably say the current design would be unacceptably fragile as I had once thought. So based upon that, out comes the slotid concept and luckily much of the cruftier aspects of the patch. Less code, probably fewer bugs. Good thing. I will produce a new v7 with those changes and merge the changes for v6b also, so we can begin review again from there. Hi ho, hi ho... -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: Maintaining patchset with GIT (was Re: Hot standby, RestoreBkpBlocks and cleanup locks)
On Fri, 2009-01-09 at 19:38 +0200, Heikki Linnakangas wrote: Simon Riggs wrote: Do you want me to start using the GIT repo to make it easier to extract parts? It would be useful. Thanks, this is all good. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Hiroshi, is this patch still needed? --- Hiroshi Inoue wrote: Magnus Hagander wrote: On 25 nov 2008, at 05.00, Tom Lane t...@sss.pgh.pa.us wrote: Hiroshi Inoue in...@tpf.co.jp writes: Tom Lane wrote: If that's true then this code is presently broken for *every* locale under Windows, not only Japanese. Maybe there are a few languages/countires where 2 encodings are widely used. UTF8 vs Latin-N? We already special-cases utf8... I think the thing us that as long as the encodings are compatible (latin1 with different names for example) it worked fine. In any case I think the problem is that gettext is looking at a setting that is not what we are looking at. Particularly with the 8.4 changes to allow per-database locale settings, this has got to be fixed in a bulletproof way. Attached is a new patch to apply bind_textdomain_codeset() to most server encodings. Exceptions are PG_SQL_ASCII, PG_MULE_INTERNAL and PG_EUC_JIS_2004. EUC-JP may be OK for EUC_JIS_2004. Unfortunately it's hard for Saito-san and me to check encodings other than EUC-JP. regards, Hiroshi Inoue *** mbutils.c.origSun Nov 23 08:42:57 2008 --- mbutils.c Wed Nov 26 12:17:12 2008 *** *** 822,830 --- 822,870 return clen; } + #ifdef WIN32 + static const struct codeset_map { + int encoding; + const char *codeset; + } codeset_map_array[] = { + {PG_UTF8, UTF-8}, + {PG_LATIN1, LATIN1}, + {PG_LATIN2, LATIN2}, + {PG_LATIN3, LATIN3}, + {PG_LATIN4, LATIN4}, + {PG_ISO_8859_5, ISO-8859-5}, + {PG_ISO_8859_6, ISO_8859-6}, + {PG_ISO_8859_7, ISO-8859-7}, + {PG_ISO_8859_8, ISO-8859-8}, + {PG_LATIN5, LATIN5}, + {PG_LATIN6, LATIN6}, + {PG_LATIN7, LATIN7}, + {PG_LATIN8, LATIN8}, + {PG_LATIN9, LATIN-9}, + {PG_LATIN10, LATIN10}, + {PG_KOI8R, KOI8-R}, + {PG_WIN1250, CP1250}, + {PG_WIN1251, CP1251}, + {PG_WIN1252, CP1252}, + {PG_WIN1253, CP1253}, + {PG_WIN1254, CP1254}, + {PG_WIN1255, CP1255}, + {PG_WIN1256, CP1256}, + {PG_WIN1257, CP1257}, + {PG_WIN1258, CP1258}, + {PG_WIN866, CP866}, + {PG_WIN874, CP874}, + {PG_EUC_CN, EUC-CN}, + {PG_EUC_JP, EUC-JP}, + {PG_EUC_KR, EUC-KR}, + {PG_EUC_TW, EUC-TW}}; + #endif /* WIN32 */ + void SetDatabaseEncoding(int encoding) { + const char *target_codeset = NULL; + if (!PG_VALID_BE_ENCODING(encoding)) elog(ERROR, invalid database encoding: %d, encoding); *** *** 846,852 */ #ifdef ENABLE_NLS if (encoding == PG_UTF8) ! if (bind_textdomain_codeset(postgres, UTF-8) == NULL) elog(LOG, bind_textdomain_codeset failed); #endif } --- 886,907 */ #ifdef ENABLE_NLS if (encoding == PG_UTF8) ! target_codeset = UTF-8; ! #ifdef WIN32 ! else ! { ! int i; ! ! for (i = 0; i sizeof(codeset_map_array) / sizeof(struct codeset_map); i++) ! if (codeset_map_array[i].encoding == encoding) ! { ! target_codeset = codeset_map_array[i].codeset; ! break; ! } ! } ! #endif /* WIN32 */ ! if (target_codeset != NULL) ! if (bind_textdomain_codeset(postgres, target_codeset) == NULL) elog(LOG, bind_textdomain_codeset failed); #endif } -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCHES] updated hash functions for postgresql v1
On Mon, 2008-12-22 at 13:47 -0600, Kenneth Marshall wrote: Dear PostgreSQL developers, I am re-sending this to keep this last change to the internal hash function on the radar. Hi Ken, A few comments: 1. New patch with very minor changes attached. 2. I reverted the change you made to indices.sgml. We still don't use WAL for hash indexes, and in my opinion we should continue to discourage their use until we do use WAL. We can add back in the comment that hash indexes are suitable for large keys if we have some results to show that. 3. There was a regression test failure in union.sql because the ordering of the results was different. I updated the regression test. 4. Hash functions affect a lot more than hash indexes, so I ran through a variety of tests that use a HashAggregate plan. Test setup and results are attached. These results show no difference between the old and the new code (about 0.1% better). 5. The hash index build time shows some improvement. The new code won in every instance in which a there were a lot of duplicates in the table (100 distinct values, 50K of each) by around 5%. The new code appeared to be the same or slightly worse in the case of hash index builds with few duplicates (100 distinct values, 5 of each). The difference was about 1% worse, which is probably just noise. Note: I'm no expert on hash functions. Take all of my tests with a grain of salt. I would feel a little better if I saw at least one test that showed better performance of the new code on a reasonable-looking distribution of data. The hash index build that you showed only took a second or two -- it would be nice to see a test that lasted at least a minute. Regards, Jeff Davis test_results.tar.gz Description: application/compressed-tar diff --git a/src/backend/access/hash/hashfunc.c b/src/backend/access/hash/hashfunc.c index 96d5643..8a236b5 100644 --- a/src/backend/access/hash/hashfunc.c +++ b/src/backend/access/hash/hashfunc.c @@ -200,39 +200,94 @@ hashvarlena(PG_FUNCTION_ARGS) * hash function, see http://burtleburtle.net/bob/hash/doobs.html, * or Bob's article in Dr. Dobb's Journal, Sept. 1997. * - * In the current code, we have adopted an idea from Bob's 2006 update - * of his hash function, which is to fetch the data a word at a time when - * it is suitably aligned. This makes for a useful speedup, at the cost - * of having to maintain four code paths (aligned vs unaligned, and - * little-endian vs big-endian). Note that we have NOT adopted his newer - * mix() function, which is faster but may sacrifice some randomness. + * In the current code, we have adopted Bob's 2006 update of his hash + * which fetches the data a word at a time when it is suitably aligned. + * This makes for a useful speedup, at the cost of having to maintain + * four code paths (aligned vs unaligned, and little-endian vs big-endian). + * It also two separate mixing functions mix() and final(), instead + * of a slower multi-purpose function. */ /* Get a bit mask of the bits set in non-uint32 aligned addresses */ #define UINT32_ALIGN_MASK (sizeof(uint32) - 1) +#define rot(x,k) (((x)(k)) | ((x)(32-(k /*-- * mix -- mix 3 32-bit values reversibly. - * For every delta with one or two bits set, and the deltas of all three - * high bits or all three low bits, whether the original value of a,b,c - * is almost all zero or is uniformly distributed, - * - If mix() is run forward or backward, at least 32 bits in a,b,c - * have at least 1/4 probability of changing. - * - If mix() is run forward, every bit of c will change between 1/3 and - * 2/3 of the time. (Well, 22/100 and 78/100 for some 2-bit deltas.) + * + * This is reversible, so any information in (a,b,c) before mix() is + * still in (a,b,c) after mix(). + * + * If four pairs of (a,b,c) inputs are run through mix(), or through + * mix() in reverse, there are at least 32 bits of the output that + * are sometimes the same for one pair and different for another pair. + * This was tested for: + * * pairs that differed by one bit, by two bits, in any combination + * of top bits of (a,b,c), or in any combination of bottom bits of + * (a,b,c). + * * differ is defined as +, -, ^, or ~^. For + and -, I transformed + * the output delta to a Gray code (a^(a1)) so a string of 1's (as + * is commonly produced by subtraction) look like a single 1-bit + * difference. + * * the base values were pseudorandom, all zero but one bit set, or + * all zero plus a counter that starts at zero. + * + * This does not achieve avalanche. There are input bits of (a,b,c) + * that fail to affect some output bits of (a,b,c), especially of a. The + * most thoroughly mixed value is c, but it doesn't really even achieve + * avalanche in c. + * + * This allows some parallelism. Read-after-writes are good at doubling + * the number of bits affected, so the goal of mixing pulls in the opposite + * direction as the goal of parallelism. I did what
Re: [HACKERS] [PATCHES] updated hash functions for postgresql v1
On Fri, Jan 09, 2009 at 12:04:15PM -0800, Jeff Davis wrote: On Mon, 2008-12-22 at 13:47 -0600, Kenneth Marshall wrote: Dear PostgreSQL developers, I am re-sending this to keep this last change to the internal hash function on the radar. Hi Ken, A few comments: 1. New patch with very minor changes attached. 2. I reverted the change you made to indices.sgml. We still don't use WAL for hash indexes, and in my opinion we should continue to discourage their use until we do use WAL. We can add back in the comment that hash indexes are suitable for large keys if we have some results to show that. 3. There was a regression test failure in union.sql because the ordering of the results was different. I updated the regression test. 4. Hash functions affect a lot more than hash indexes, so I ran through a variety of tests that use a HashAggregate plan. Test setup and results are attached. These results show no difference between the old and the new code (about 0.1% better). 5. The hash index build time shows some improvement. The new code won in every instance in which a there were a lot of duplicates in the table (100 distinct values, 50K of each) by around 5%. The new code appeared to be the same or slightly worse in the case of hash index builds with few duplicates (100 distinct values, 5 of each). The difference was about 1% worse, which is probably just noise. Note: I'm no expert on hash functions. Take all of my tests with a grain of salt. I would feel a little better if I saw at least one test that showed better performance of the new code on a reasonable-looking distribution of data. The hash index build that you showed only took a second or two -- it would be nice to see a test that lasted at least a minute. Regards, Jeff Davis Jeff, Thanks for the review. I would not really expect any differences in hash index build times other than normal noise variances. The most definitive benchmark that I have seen was done with my original patch submission in March posted by Luke of Greenplum: We just applied this and saw a 5 percent speedup on a hash aggregation query with four columns in a 'group by' clause run against a single TPC-H table. I wonder if they would be willing to re-run their test? Thanks again. Regards, Ken -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improving compressibility of WAL files
On Fri, 9 Jan 2009, Simon Riggs wrote: Half-filled WAL files were necessary to honour archive_timeout. With continuous streaming all WAL files will be 100% full before we switch, for most purposes. The main use case I'm concerned about losing support for is: 1) Two systems connected by a WAN with significant transmit latency 2) The secondary system runs a warm standby aimed at disaster recovery 3) Business requirements want the standby to never be more than (say) 5 minutes behind the primary, presuming the WAN is up 4) WAN traffic is expensive (money==bandwidth, one of the two is scarce) This seems a pretty common scenario in my experience. Right now, this case is served quite well like this: -archive_timeout='5 minutes' -[pglesslog|pg_clearxlogtail] | gzip | rsync The main concern I have with switching to a more synchronous scheme is that network efficiency drops as the payload breaks into smaller pieces. I haven't had enough time to keep up with all the sync rep advances recently to know for sure if there's a configuration there that's suitable for this case. If that can be configured to send only in relatively large chunks, while still never letting things lag too far behind, then I'd agree completely that the case for any of these WAL cleaner utilities is dead--presuming said support makes it into the next release. If that's not available, say because the only useful option sends in very small pieces, there may still be a need for some utility to fill in for this particular requirement. Luckily there are many to choose from if it comes to that. -- * Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New patch for Column-level privileges
Stephen Frost sfr...@snowman.net writes: Please test, comment, etc. A few random comments based on a very fast scan through the patch: The RTE fields really ought to be bitmaps not integer lists. The list representation is expensive to store, copy, etc. You could use the same approach the HOT patch used, ie offset the indexes by FirstLowInvalidHeapAttributeNumber (cf pull_varattnos). Patch is desperately lacking comments surrounding the altered meaning of ATTRIBUTE_TUPLE_SIZE, the fact that TupleDescs no longer contain complete copies of pg_attribute rows, etc. It might be a good idea to rename ATTRIBUTE_TUPLE_SIZE to ATTRIBUTE_TUPLE_FIXED_PART_SIZE or something like that. Don't like exporting allocacl() from acl.c nor having aclchk.c be so intimate with the internal representation of ACLs. Seems like the operations actually needed could be represented a bit more abstractly, ie copy an ACL or merge two ACLs. applyColumnPrivs is misnamed as it's not applying any privileges nor indeed doing much of anything directly to do with privileges. It should probably be named something more like findReferencedColumns. And what's with the special exception for SortGroupClause!? Actually, though, you probably shouldn't have applyColumnPrivsWalker at all. It requires an additional traversal of the parse tree, and an additional RTE search for each var, for no good reason. Can't we just mark the column as referenced in make_var() and maybe a couple other places that already have their hands on the RTE? I don't see anything in transformDeleteStatement to handle the fact that DELETE ... WHERE foo = 42 requires select on foo. In the gram.y changes, don't treat CREATE differently from the other cases. You need a test and error in the statement execution code anyway for the case of a privilege type inappropriately applied to columns, and it's better to throw that very specific error message than the generic syntax error that bison is going to throw for CREATE (column list). The comment added to InsertPgAttributeTuple seems not to belong to it (copy/paste error?) What's the point of disallowing column privileges on a sequence? (aclcheck.c line 800 or so) It's not nonsensical to want to restrict what someone can do in SELECT * FROM sequence. pg_attribute_aclmask seems to need a rethink. I don't like the amount of policy copied-and-pasted from pg_class_aclmask, nor the fact that it will fail for system attributes (attnum 0), nor the fact that it insists on looping over the attributes even if the relation privileges would provide what's needed. (Indeed, you shouldn't need that merge ACLs operation at all -- you could be ORing a couple of bitmasks instead, no?) I don't find what you've done with no_priv_msg[] to be particularly comfortable: if anyone ever tried to print an ACL_KIND_COLUMN error with the regular code path (hardly unlikely) you'd get a core dump due to the format wanting two %s arguments with only one supplied. I think the best thing is for no_priv_msg[] to include just + gettext_noop(permission denied for column %s), and then make aclcheck_error_col() use its own error text rather than pulling from the array. Don't like naming of Priv node type, it's way too nonspecific. Also, you need outfuncs.c support for it, see comment at head of that file. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Bruce Momjian br...@momjian.us writes: Hiroshi, is this patch still needed? This patch is for a problem that's entirely separate from the LC_TIME issue, if that's what you were wondering. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improving compressibility of WAL files
Greg Smith gsm...@gregsmith.com wrote: The main use case I'm concerned about losing support for is: 1) Two systems connected by a WAN with significant transmit latency 2) The secondary system runs a warm standby aimed at disaster recovery 3) Business requirements want the standby to never be more than (say) 5 minutes behind the primary, presuming the WAN is up 4) WAN traffic is expensive (money==bandwidth, one of the two is scarce) This seems a pretty common scenario in my experience. Right now, this case is served quite well like this: -archive_timeout='5 minutes' -[pglesslog|pg_clearxlogtail] | gzip | rsync You've come pretty close to describing our environment, other than having 72 primaries each using rsync to push the WAL files to another server at the same site while a server at the central site uses rsync to pull them back. We don't run warm standby on the backup server at the site of origin, and don't want to have to do so. It is critically important that the flow of xlog data never hold up the primary databases, and that failure to copy xlog to either of the targets not interfere with copying to the other. (We have WAN failures surprising often, sometimes for days at a time, and the backup server on-site is in the same rack of the same cabinet as the database server.) Compression of xlog data is important not only for WAN transmission, but for storage space. We keep two weeks of WAL files to allow recovery from either of the last two weekly backups, and we archive the first weekly backup of each month, with the WAL files needed for recovery, for one year. So it appears we care about somewhat similar issues. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Bruce Momjian wrote: Hiroshi, is this patch still needed? Yes though it should be slightly changed now. *set lc_messages does not work* issue isn't directly related to this issue. regards, Hiroshi Inoue -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCHES] updated hash functions for postgresql v1
On Fri, 2009-01-09 at 14:29 -0600, Kenneth Marshall wrote: Jeff, Thanks for the review. I would not really expect any differences in hash index build times other than normal noise variances. The most definitive benchmark that I have seen was done with my original patch submission in March posted by Luke of Greenplum: We just applied this and saw a 5 percent speedup on a hash aggregation query with four columns in a 'group by' clause run against a single TPC-H table. I wonder if they would be willing to re-run their test? Thanks again. Separating mix() and final() should have some performance benefit, right? Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Buffer pool statistics in Explain Analyze
Tom Lane t...@sss.pgh.pa.us writes: No, I think you misunderstood me entirely. The reason that I rejected those parts of the patch is that I think the statistics that are available are wrong/useless. If the bufmgr.c counters were really good for something they'd have been exposed long since (and we'd probably never have built a lot of the other stats collection infrastructure). The collective stats across the whole cluster and the individual stats for a specific query broken down by plan node are complementary. Depending on the circumstance people sometimes need each. I actually also wrote a patch exposing this same data. I think the bufmgr counters are flawed but still useful. Just as an example think of how often you have to explain why a sequential scan of a small table can be faster than an index scan. Seeing the index scan actually require more logical buffer fetches than a sequential scan would go a long way to clearing up that confusion. Better yet, users would be in a position to see whether the planner is actually estimating i/o costs accurately. The EXPLAIN ANALYZE code you submitted is actually kinda cute, and I'd have had no problem with it if I thought it were displaying numbers that were useful and unlikely to be obsoleted in future releases. The work that needs to be done is on collecting the numbers more than displaying them. I agree that we need more data -- my favourite direction is to use a programmatic interface to dtrace to find out how many buffer reads are satisfied from filesystem cache and how many from physical reads. But when we do that doesn't obviate the need for these stats, it would enhance them. You would get a clear view of how many buffer fetches were satisfied from shared buffers, filesystem cache, and physical reads. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services! -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 2 small patches that fix 8.3.5 compile issues on Vista+MingW+Msys
Uh, do we still need this patch? --- Charlie Savage wrote: Charlie Savage wrote: A couple of months ago I noted that 8.3.4 doesn't compile on Vista using MingW+msys under certain conditions: http://archives.postgresql.org/pgsql-hackers/2008-09/msg01496.php 8.3.5 has the same problem. Attached are two one line patches that fix it. Sorry, I attached incorrect patches in the original email. Here are the correct ones. Thanks, Charlie -- Charlie Savage http://cfis.savagexi.com *** libpq-be.h.oldWed Nov 5 23:32:50 2008 --- libpq-be.hWed Nov 5 23:35:52 2008 *** *** 47,52 --- 47,53 #ifdef ENABLE_SSPI #define SECURITY_WIN32 + #include ntsecapi.h #include security.h #undef SECURITY_WIN32 *** libpq-int.h.old Wed Nov 5 23:37:48 2008 --- libpq-int.h Wed Nov 5 23:38:14 2008 *** *** 54,59 --- 54,60 #ifdef ENABLE_SSPI #define SECURITY_WIN32 + #include ntsecapi.h #include security.h #undef SECURITY_WIN32 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [BUGS] BUG #4516: FOUND variable does not work after RETURN QUERY
Bruce Momjian br...@momjian.us writes: Uh, is this ready to be applied? I don't think any consensus has been reached on changing this behavior. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] per-database locale: createdb switches
Heikki Linnakangas wrote: Alvaro Herrera wrote: Tom Lane wrote: Alvaro Herrera alvhe...@commandprompt.com writes: Alvaro Herrera wrote: I like Teodor's proposal; I'll see about implementing that. Attached. You missed updating the sgml docs, and personally I'd be inclined to list -l before the individual --lc switches; otherwise it looks fine. Thanks, committed that way. I noticed that --lc-ctype and --lc-collate were forgotten in SGML docs, so I added them too. Should we have a shorthand CREATE DATABASE option like that as well? createdb is really about convenience; not sure it is warranted for CREATE DATABASE. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improving compressibility of WAL files
On Fri, 9 Jan 2009, Aidan Van Dyk wrote: *I* didn't see an easy way to get at the written size later on in the chain (i.e. in the actual archiving), so I took the path of least resitance. I was hoping it might fall out of the other work being done in that area, given how much that code is still being poked at right now. As Hannu pointed out, from a conceptual level you just need to carry along the same information that pg_current_xlog_location() returns to the archiver on all the paths where a segment might end early. If I wrapped this zeroing in GUC, people could choose wether to pay the penalty or not, would that satisfy anyone? Rather than creating a whole new GUC, it might it be possible to turn archive_mode into an enum setting: off, on, and cleaned as the modes perhaps. That would avoid making a new setting, with the downside that a bunch of critical code would look less clear than it does with a boolean. Again, *I* think that the force_switch case is going to happen when the admin's quite happy to pay that penalty... But obviously not everyone... I understand the case you've made for why it doesn't matter, and for almost every case you're right. The situation it may be vulnerable to is where a burst of transactions come in just as the archive timeout expires after minimal WAL activity. There I think you can end up with a bunch of clients waiting behind an almost full zero fill operation, which pushes up the worst-case latency. I've been able to measure the impact of the similar case where zero-filling a brand new segment can impact things; this would be much less like to happen because the timing would have to line up just wrong, but I think it's still possible. -- * Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches (r1389)
Tom Lane wrote: I guess I'm still wondering which part of this actually needs to be hand-coded so that it can be flexible. I'm envisioning the whole loop replaced by something like FillRelOptions((void *) rdopts, options, constanttable); where the constant table contains entries like { fillfactor, RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor) } I attach a patch that does things this way (it includes the btree test code because I'm too lazy right now to strip it out). I'm not really sure about removing the other macros completely, because they would be useful whenever one wanted to create something nonstandard. BTW, are we just assuming that there's never a possibility of no match? It seems like there ought to be an elog complaint if you get to the bottom of the loop; which again is something I don't see the point of writing out each time. We need to be quiet about it when not validating, I think. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. Index: src/backend/access/common/reloptions.c === RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/access/common/reloptions.c,v retrieving revision 1.17 diff -c -p -r1.17 reloptions.c *** src/backend/access/common/reloptions.c 8 Jan 2009 19:34:41 - 1.17 --- src/backend/access/common/reloptions.c 9 Jan 2009 23:40:36 - *** *** 34,43 * To add an option: * * (i) decide on a class (integer, real, bool, string), name, default value, ! * upper and lower bounds (if applicable). ! * (ii) add a record below. ! * (iii) add it to StdRdOptions if appropriate ! * (iv) add a block to the appropriate handling routine (probably * default_reloptions) * (v) don't forget to document the option * --- 34,44 * To add an option: * * (i) decide on a class (integer, real, bool, string), name, default value, ! * upper and lower bounds (if applicable); for string types, consider a ! * validation routine. ! * (ii) add a record below (or use add_type_reloption). ! * (iii) add it to the appropriate options struct (perhaps StdRdOptions) ! * (iv) add it to the appropriate handling routine (perhaps * default_reloptions) * (v) don't forget to document the option * *** parse_one_reloption(relopt_value *option *** 762,767 --- 763,854 pfree(value); } + void * + allocateReloptStruct(Size base, relopt_value *options, int numoptions) + { + Size size = base; + int i; + + for (i = 0; i numoptions; i++) + if (options[i].gen-type == RELOPT_TYPE_STRING) + size += GET_STRING_RELOPTION_LEN(options[i]) + 1; + + return palloc0(size); + } + + void + fillRelOptions(void *rdopts, Size basesize, relopt_value *options, + int numoptions, bool validate, reloptParseElem *elems, + int numelems) + { + int i; + int offset = basesize; + + for (i = 0; i numoptions; i++) + { + int j; + bool found = false; + + for (j = 0; j numelems; j++) + { + if (pg_strcasecmp(options[i].gen-name, elems[j].optname) == 0) + { + relopt_string *optstring; + char *itempos = ((char *) rdopts) + elems[j].offset; + char *string_val; + + switch (options[i].gen-type) + { + case RELOPT_TYPE_BOOL: + *(bool *) itempos = options[i].isset ? + options[i].values.bool_val : + ((relopt_bool *) options[i].gen)-default_val; + break; + case RELOPT_TYPE_INT: + *(int *) itempos = options[i].isset ? + options[i].values.int_val : + ((relopt_int *) options[i].gen)-default_val; + break; + case RELOPT_TYPE_REAL: + *(double *) itempos = options[i].isset ? + options[i].values.real_val : + ((relopt_real *) options[i].gen)-default_val; + break; + case RELOPT_TYPE_STRING: + optstring = (relopt_string *) options[i].gen; + if (options[i].isset) + string_val = options[i].values.string_val; + else if (!optstring-default_isnull) + string_val = optstring-default_val; + else + string_val = NULL; + + if (string_val == NULL) + *(int *) itempos = 0; + else + { + strcpy((char *) rdopts + offset, string_val); + *(int *) itempos = offset; + offset += strlen(string_val) + 1; + } + break; + default: + elog(ERROR, unrecognized reloption type %c, + options[i].gen-type); + break; + } + found = true; + break; + } + } + if (validate !found) + elog(ERROR, storate parameter \%s\ not found in parse table, + options[i].gen-name); + } + SET_VARSIZE(rdopts, offset); + } + + /* * Option parser for anything that uses StdRdOptions (i.e. fillfactor only) */ *** default_reloptions(Datum reloptions, boo *** 770,779 { relopt_value *options; StdRdOptions
Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches (r1389)
Alvaro Herrera wrote: Tom Lane wrote: I guess I'm still wondering which part of this actually needs to be hand-coded so that it can be flexible. I'm envisioning the whole loop replaced by something like FillRelOptions((void *) rdopts, options, constanttable); where the constant table contains entries like { fillfactor, RELOPT_TYPE_INT, offsetof(StdRdOptions, fillfactor) } I attach a patch that does things this way (it includes the btree test code because I'm too lazy right now to strip it out). The irony of doing things this way is that we've come full-circle from the original coding of these routines (the main difference being that the default values and checks no longer need to be written as code, but rather as table entries). -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improving compressibility of WAL files
* Greg Smith gsm...@gregsmith.com [090109 18:39]: I was hoping it might fall out of the other work being done in that area, given how much that code is still being poked at right now. As Hannu pointed out, from a conceptual level you just need to carry along the same information that pg_current_xlog_location() returns to the archiver on all the paths where a segment might end early. I was(am) also hoping that somethig falls out of sync-rep that gives me better PITR backups (better than a small archive_timeout)... That hope is what made me abandon this patch after the initial feedback. Rather than creating a whole new GUC, it might it be possible to turn archive_mode into an enum setting: off, on, and cleaned as the modes perhaps. That would avoid making a new setting, with the downside that a bunch of critical code would look less clear than it does with a boolean. I'm content to wait and see what falls out of sync-rep stuff... ... for now ... I understand the case you've made for why it doesn't matter, and for almost every case you're right. The situation it may be vulnerable to is where a burst of transactions come in just as the archive timeout expires after minimal WAL activity. There I think you can end up with a bunch of clients waiting behind an almost full zero fill operation, which pushes up the worst-case latency. I've been able to measure the impact of the similar case where zero-filling a brand new segment can impact things; this would be much less like to happen because the timing would have to line up just wrong, but I think it's still possible. Ya, and it's one of just many of the times PG hits these worst-latency spikes ;-) GEnerally, it's *very* good... and once in a while, when all the stars line up correctly, you get these spikes But even with these spikes, it's plenty fast enough for the stuff I've done... a. -- Aidan Van Dyk Create like a god, ai...@highrise.ca command like a king, http://www.highrise.ca/ work like a slave. signature.asc Description: Digital signature
Re: [HACKERS] New patch for Column-level privileges
Tom, et al, * Tom Lane (t...@sss.pgh.pa.us) wrote: A few random comments based on a very fast scan through the patch: Thanks for the feedback! The RTE fields really ought to be bitmaps not integer lists. The list representation is expensive to store, copy, etc. You could use the same approach the HOT patch used, ie offset the indexes by FirstLowInvalidHeapAttributeNumber (cf pull_varattnos). Done (was actually easier than I expected it to be). Patch is desperately lacking comments surrounding the altered meaning of ATTRIBUTE_TUPLE_SIZE, the fact that TupleDescs no longer contain complete copies of pg_attribute rows, etc. It might be a good idea to rename ATTRIBUTE_TUPLE_SIZE to ATTRIBUTE_TUPLE_FIXED_PART_SIZE or something like that. Done. Don't like exporting allocacl() from acl.c nor having aclchk.c be so intimate with the internal representation of ACLs. Seems like the operations actually needed could be represented a bit more abstractly, ie copy an ACL or merge two ACLs. Done. applyColumnPrivs is misnamed as it's not applying any privileges nor indeed doing much of anything directly to do with privileges. It should probably be named something more like findReferencedColumns. And what's with the special exception for SortGroupClause!? I'm not sure what the story with SortGroupClause is.. Might just have been a bit more difficult to do. KaiGai? Actually, though, you probably shouldn't have applyColumnPrivsWalker at all. It requires an additional traversal of the parse tree, and an additional RTE search for each var, for no good reason. Can't we just mark the column as referenced in make_var() and maybe a couple other places that already have their hands on the RTE? I certainly like this idea and I'll look into it, but it might take me a bit longer to find the appropriate places beyond make_var(). I don't see anything in transformDeleteStatement to handle the fact that DELETE ... WHERE foo = 42 requires select on foo. I've fixed this (I believe). In the gram.y changes, don't treat CREATE differently from the other cases. You need a test and error in the statement execution code anyway for the case of a privilege type inappropriately applied to columns, and it's better to throw that very specific error message than the generic syntax error that bison is going to throw for CREATE (column list). Done. The comment added to InsertPgAttributeTuple seems not to belong to it (copy/paste error?) Fixed. What's the point of disallowing column privileges on a sequence? (aclcheck.c line 800 or so) It's not nonsensical to want to restrict what someone can do in SELECT * FROM sequence. I've removed the offending code but to be honest I'm a bit nervous that there are other parts of the code that aren't expecting a sequence and may have to be changed. pg_attribute_aclmask seems to need a rethink. I don't like the amount of policy copied-and-pasted from pg_class_aclmask, nor the fact that it will fail for system attributes (attnum 0), nor the fact that it insists on looping over the attributes even if the relation privileges would provide what's needed. (Indeed, you shouldn't need that merge ACLs operation at all -- you could be ORing a couple of bitmasks instead, no?) I'll have to think of the entry points for pg_attribute_aclmask. In general, we shouldn't ever get to it if the relation privileges are sufficient but I think it's exposed to the user at some level, where we would be wrong to say a user doesn't have rights on the column when they do have the appropriate table-level rights. Unfortunately, aclmask() uses information beyond the bitmasks (who granted them, specifically) so I don't think we can just OR the bitmasks. With regard to looping through the attributes, I could split it up into two functions, would that be better? A function that handles all-attribute cases (either 'AND'd or 'OR'd together depending on what the caller wants) could be added pretty easily and then pg_attribute_aclmask could be reverted to just handling a single attribute at a time (which would fix it for system attributes as well..). I don't find what you've done with no_priv_msg[] to be particularly comfortable: if anyone ever tried to print an ACL_KIND_COLUMN error with the regular code path (hardly unlikely) you'd get a core dump due to the format wanting two %s arguments with only one supplied. I think the best thing is for no_priv_msg[] to include just + gettext_noop(permission denied for column %s), and then make aclcheck_error_col() use its own error text rather than pulling from the array. Done. Don't like naming of Priv node type, it's way too nonspecific. Also, you need outfuncs.c support for it, see comment at head of that file. Done. Updated patch attached. Thanks! Stephen colprivs_2009010902.diff.gz Description: Binary data signature.asc Description: Digital signature