Re: [HACKERS] review: tab completion for set search_path TO
Hello, I am sorry -- I expected that a review should start with a new thread. I was wrong, so for future reviews I will use the existing threads. Regards, Pavel

On 23. 6. 2014 at 7:39, Michael Paquier michael.paqu...@gmail.com wrote: On Sun, Jun 22, 2014 at 2:22 AM, Pavel Stehule pavel.steh...@gmail.com wrote: Hello, this patch https://commitfest.postgresql.org/action/patch_view?id=1443 is trivial with zero risk. The patch applies without any issues, and compilation was clean too. Just wondering: why create a new thread for a review rather than replying directly to the existing one? This makes the review/patch submission flow rather complicated to follow. -- Michael
Re: [HACKERS] pg_resetxlog to clear backup start/end locations.
Hello, thank you for the comments.

On Sun, Jun 22, 2014 at 8:54 PM, Simon Riggs si...@2ndquadrant.com wrote: On 13 June 2014 12:27, Fujii Masao masao.fu...@gmail.com wrote: I think that pg_resetxlog should reset backup locations by default since they are useless (rather harmful) after pg_resetxlog. Thought? +1 Do we regard that point as a bug that should be backpatched? Yep, I think so. ...

Ouch, I was too short-sighted :( It is pretty natural to do so after hearing that. I should have pointed this at the previous discussion. I assume the primary usage of this patch to be, as described before, dissolving a recovery freeze caused by a wrongly placed backup label. Crash recovery has already been done at that point, so resetxlog's current behavior also seems fitting for that situation, I suppose.

OK, I'm going to modify it to reset backup locations by default and remove the new option '-b' that did that. Since this seems to be regarded as a bug, I'll provide backpatches back to... 8.4? (The final release of 8.4 is scheduled for July 2014.) Concerning documentation, I see no need to edit it if no one wants it more detailed now that the visible option has been dropped.

regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
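For reference, the default behavior being agreed on here amounts to unconditionally clearing the backup fields of pg_control. A minimal sketch, assuming the ControlFileData field names of this era (the actual patch would live in pg_resetxlog.c's control-file rewrite path):

    /*
     * Sketch only: invalidate any in-progress backup state, since it is
     * meaningless (and potentially harmful) once the WAL has been reset.
     */
    ControlFile.backupStartPoint = InvalidXLogRecPtr;
    ControlFile.backupEndPoint = InvalidXLogRecPtr;
    ControlFile.backupEndRequired = false;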
Re: [HACKERS] How to change the pgsql source code and build it??
Hello, I don't know your environment, so I can't tell how many additional changes are needed, but I still think the message should not appear there.

@Fabrizio de Royes Mello, even after making changes as per your suggestion, I could see that initdb was failing for the same reason: /mswitch/pgsql/bin # ./initdb -D ../data/ ... creating template1 database in ../data/base/1 ... root execution of the PostgreSQL server is not permitted. The server must be started under an unprivileged user ID to prevent .. Please let me know if I need to make changes in any other place. Appreciate your help!

The complaint looks to be raised at the first one among the four points I have shown, as far as I can see. Are you sure that the 'postgres' binary invoked at the time was built after the modification? 'which postgres' will show you the file to be checked and replaced.

| postgresql $ find . -type f -print | xargs grep -nH 'geteuid() == 0'
| ./src/backend/main/main.c:377: if (geteuid() == 0)
| ./src/bin/pg_ctl/pg_ctl.c:2121: if (geteuid() == 0)
| ./src/bin/initdb/initdb.c:778: if (geteuid() == 0) /* 0 is root's uid */
| ./src/bin/pg_resetxlog/pg_resetxlog.c:250: if (geteuid() == 0)

On Fri, Jun 13, 2014 at 9:43 AM, Tom Lane t...@sss.pgh.pa.us wrote: Fabrízio de Royes Mello fabriziome...@gmail.com writes: On Thu, Jun 12, 2014 at 10:59 PM, Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote: Try replacing these conditions with (0 && geteuid() == 0) and you would see it run as root.

Maybe a compile option like '--enable-run-as-root' could be added to allow it without the need to change the source code.

If we wanted it to be easy to run PG as root, those tests wouldn't be there in the first place. We don't want that, so there will never be anything to override that policy as easily as setting a configure option. In the case at hand, the OP wants to run on some weird system that apparently doesn't have multiple users at all. It's a pretty safe bet that hacking out those geteuid tests is just the tip of the iceberg of what he's going to have to change to make it work, because it probably deviates from a typical Unix environment in a lot of other ways too. So I can't get too concerned about how easy this one aspect is for him.

-- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
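For context, the check that the grep output points at looks roughly like this in src/backend/main/main.c (an abridged sketch, not the exact source). The suggestion upthread is to short-circuit the condition, purely for experimentation on a single-user system:

    /* Sketch of the root check under discussion; replacing the condition
     * with "0 && geteuid() == 0" disables it.  This is for experimentation
     * only -- running the server as root is deliberately unsupported. */
    if (geteuid() == 0)
    {
        write_stderr("\"root\" execution of the PostgreSQL server is not permitted.\n"
                     "The server must be started under an unprivileged user ID to prevent\n"
                     "possible system security compromise.\n");
        exit(1);
    }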
Re: [Fwd: Re: [HACKERS] proposal: new long psql parameter --on-error-stop]
On Mon, Jun 23, 2014 at 12:04 AM, Pavel Stehule pavel.steh...@gmail.com wrote: Hello, third version with Erik's update

Here are some of my comments:

The psql documentation needs to be updated. At least the description of the new option this patch adds needs to be added to the document.

+printf(_(" --help-variables list of available configuration variables (options), then exit\n"));

We should get rid of "of" from the message, or add "show" in front of "list of"?

+printf(_(" ECHO write all input lines to standard output\n"));

This message seems not to be correct when ECHO=queries is set.

+printf(_(" COMP_KEYWORD_CASE determines which letter case to use when completing an SQL key word\n"));
+printf(_(" DBNAME name of currently connected database\n"));
+printf(_(" ECHO write all input lines to standard output\n"));

I found that some help message lines use the normal form of a verb, but others do not. Should we standardize them?

+printf(_(" PROMPT1, PROMPT2, PROMPT3 specify the psql prompt\n"));

When the option name field is long, we should add a new line just after the name field and align the starting position of the option explanation field. That is, for example, the above should be

printf(_(" PROMPT1, PROMPT2, PROMPT3\n specify the psql prompt\n"));

+printf(_(" ON_ERROR_ROLLBACK when on, ROLLBACK on error\n"));

This message seems incorrect to me. When this option is on and an error occurs in a transaction, the transaction continues rather than a ROLLBACK occurring, IIUC.

I did not check the whole set of help messages yet, but ISTM some messages are not correct. It's better to check them again.

+printf(_(" PSQL_RC alternative location of the user's .psqlrc file\n"));

Typo: PSQL_RC should be PSQLRC

+printf(_(" PGDATABASE same as the dbname connection parameter\n"));
+printf(_(" PGHOST same as the host connection parameter\n"));
+printf(_(" PGPORT same as the port connection parameter\n"));
+printf(_(" PGUSER same as the user connection parameter\n"));
+printf(_(" PGPASSWORD possibility to set password (not recommended)\n"));
+printf(_(" PGPASSFILE password file (default ~/.pgpass)\n"));

I don't think that psql needs to display help messages even for environment variables supported by libpq.

Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
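To make the alignment suggestion concrete: the wrapped form would rely on C string-literal concatenation, roughly like this (column widths illustrative):

    printf(_("  PROMPT1, PROMPT2, PROMPT3\n"
             "                     specify the psql prompt\n"));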
Re: [HACKERS] pg_resetxlog to clear backup start/end locations.
On Mon, Jun 23, 2014 at 3:49 PM, Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote: Hello, thank you for the comments.

On Sun, Jun 22, 2014 at 8:54 PM, Simon Riggs si...@2ndquadrant.com wrote: On 13 June 2014 12:27, Fujii Masao masao.fu...@gmail.com wrote: I think that pg_resetxlog should reset backup locations by default since they are useless (rather harmful) after pg_resetxlog. Thought? +1 Do we regard that point as a bug that should be backpatched? Yep, I think so. ...

Ouch, I was too short-sighted :( It is pretty natural to do so after hearing that. I should have pointed this at the previous discussion. I assume the primary usage of this patch to be, as described before, dissolving a recovery freeze caused by a wrongly placed backup label. Crash recovery has already been done at that point, so resetxlog's current behavior also seems fitting for that situation, I suppose.

One question is: is there a case where a user wants to reset only the backup locations? I'm not sure if there are such cases. If they exist, probably we should implement a new option which resets only the backup locations. Thought?

OK, I'm going to modify it to reset backup locations by default and remove the new option '-b' that did that. Since this seems to be regarded as a bug, I'll provide backpatches back to... 8.4? (The final release of 8.4 is scheduled for July 2014.)

I was thinking that we don't need to backpatch this to 8.4 because 8.4 doesn't have any backup locations. No?

Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [Fwd: Re: [HACKERS] proposal: new long psql parameter --on-error-stop]
Hello

2014-06-23 10:02 GMT+02:00 Fujii Masao masao.fu...@gmail.com: On Mon, Jun 23, 2014 at 12:04 AM, Pavel Stehule pavel.steh...@gmail.com wrote: Hello, third version with Erik's update

Here are some of my comments: [... review comments trimmed; quoted in full upthread ...]

+printf(_(" PGDATABASE same as the dbname connection parameter\n"));
+printf(_(" PGHOST same as the host connection parameter\n"));
+printf(_(" PGPORT same as the port connection parameter\n"));
+printf(_(" PGUSER same as the user connection parameter\n"));
+printf(_(" PGPASSWORD possibility to set password (not recommended)\n"));
+printf(_(" PGPASSFILE password file (default ~/.pgpass)\n"));

I don't think that psql needs to display help messages even for environment variables supported by libpq.

The main reason is PGPASSWORD -- it is probably the most-used environment variable with psql. PGPASSWORD=** psql is a very commonly used pattern.

Regards Pavel Stehule

Regards, -- Fujii Masao
Re: [HACKERS] Audit of logout
On Sat, Jun 21, 2014 at 12:59 PM, Joe Conway m...@joeconway.com wrote: On 06/13/2014 07:29 AM, Tom Lane wrote: Fujii Masao masao.fu...@gmail.com writes: On Thu, Jun 12, 2014 at 8:51 PM, Fujii Masao masao.fu...@gmail.com wrote: Some users enable log_disconnections in postgresql.conf to audit all logouts. But since log_disconnections is defined with PGC_BACKEND, it can be changed at connection start. This means that any client (even a non-superuser) can freely disable log_disconnections so as not to log his or her logout, even when the system admin enables it in postgresql.conf. Isn't this problematic for audit?

That's harmful for audit purposes. I think that we should make log_disconnections PGC_SUSET rather than PGC_BACKEND in order to forbid non-superusers from changing its setting. The attached patch does this.

This whole argument seems wrong unless I'm missing something: test=# set log_connections = on; ERROR: parameter log_connections cannot be set after connection start test=# set log_disconnections = off; ERROR: parameter log_disconnections cannot be set after connection start

You can change log_connections/log_disconnections via the connection options as follows:

$ grep log_disconnections $PGDATA/postgresql.conf
log_disconnections = on

$ psql -U hoge -d "options='-c log_disconnections=off'"
= show log_disconnections ;
 log_disconnections
--------------------
 off
(1 row)

= \du
                             List of roles
 Role name |                   Attributes                   | Member of
-----------+------------------------------------------------+-----------
 hoge      |                                                | {}
 postgres  | Superuser, Create role, Create DB, Replication | {}

I wonder whether we should just get rid of log_disconnections as a separate variable, instead logging disconnections when log_connections is set.

That might be a good idea, though David pointed out the merit of keeping those two parameters separate upthread, and I understand his thought. http://www.postgresql.org/message-id/1402675662004-5807224.p...@n5.nabble.com

Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
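For reference, the patch under discussion is essentially a one-word change to the log_disconnections entry in guc.c. A sketch, with the struct layout abridged from memory (treat the exact field list as an assumption):

    /* Before: PGC_BACKEND, which any client can override at connection
     * start via startup-packet options.  After: PGC_SUSET, so only
     * superusers can change it. */
    {
        {"log_disconnections", PGC_SUSET, LOGGING_WHAT,
            gettext_noop("Logs end of a session, including duration."),
            NULL
        },
        &Log_disconnections,
        false,
        NULL, NULL, NULL
    },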
Re: [Fwd: Re: [HACKERS] proposal: new long psql parameter --on-error-stop]
On Mon, Jun 23, 2014 at 5:10 PM, Pavel Stehule pavel.steh...@gmail.com wrote: Hello 2014-06-23 10:02 GMT+02:00 Fujii Masao masao.fu...@gmail.com: On Mon, Jun 23, 2014 at 12:04 AM, Pavel Stehule pavel.steh...@gmail.com wrote: Hello, third version with Erik's update [... review comments trimmed; quoted in full upthread ...] I don't think that psql needs to display help messages even for environment variables supported by libpq.

The main reason is PGPASSWORD -- it is probably the most-used environment variable with psql. PGPASSWORD=** psql is a very commonly used pattern.

But it's not recommended, as the help message which the patch added says ;)

Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [Fwd: Re: [HACKERS] proposal: new long psql parameter --on-error-stop]
2014-06-23 10:57 GMT+02:00 Fujii Masao masao.fu...@gmail.com: On Mon, Jun 23, 2014 at 5:10 PM, Pavel Stehule pavel.steh...@gmail.com wrote: [... review comments trimmed; quoted in full upthread ...]

The main reason is PGPASSWORD -- it is probably the most-used environment variable with psql. PGPASSWORD=** psql is a very commonly used pattern.

But it's not recommended, as the help message which the patch added says ;)

Why? Who can see this value?

regards Pavel

Regards, -- Fujii Masao
Re: [HACKERS] pg_resetxlog to clear backup start/end locations.
Hi,

At Mon, 23 Jun 2014 17:10:05 +0900, Fujii Masao masao.fu...@gmail.com wrote in cahgqgwfy_cdmfuriu6zoat2htqo_eijaj7vwewysol15oct...@mail.gmail.com:

One question is: is there a case where a user wants to reset only the backup locations? I'm not sure if there are such cases. If they exist, probably we should implement a new option which resets only the backup locations. Thought?

As I described above, I don't see an obvious use case where ONLY the backup locations need to be reset. Besides, the option you proposed would not only clear the backup locations but also inhibit the resetting of xlog info that would otherwise be done without the option. That behavior would puzzle users.

OK, I'm going to modify it to reset backup locations by default and remove the new option '-b' that did that. Since this seems to be regarded as a bug, I'll provide backpatches back to... 8.4? (The final release of 8.4 is scheduled for July 2014.)

I was thinking that we don't need to backpatch this to 8.4 because 8.4 doesn't have any backup locations. No?

I'm not so knowledgeable about ancient(?) specs :p Of course, that phrase meant "back to the oldest version for which this patch has any meaning".

regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] inherit support for foreign tables
Hi,

Selecting tableoid on the parent causes an error, ERROR: cannot extract system attribute from virtual tuple. The foreign table has an OID which can be reported as tableoid for the rows coming from that foreign table. Do we want to do that?

On Fri, Jun 20, 2014 at 1:34 PM, Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote: Hello, before continuing the discussion, I tried this patch on the current head. This patch applies cleanly except for one hunk, because of a modification just AFTER it, which did no harm. Finally, all regression tests succeeded. Attached is the rebased patch of v11 up to the current master.

regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
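As background: on a plain inheritance hierarchy, tableoid already identifies the source table for each row, which is the behavior being asked about here. A quick illustration (table name is hypothetical):

    -- On ordinary inheritance this reports which child each row came from;
    -- the report above is that it errors out when a child is a foreign table.
    SELECT tableoid::regclass AS source_table, *
    FROM parent_tbl;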
Re: [HACKERS] Add the number of pinning backends to pg_buffercache's output
On Sat, Apr 12, 2014 at 9:25 PM, Andres Freund and...@2ndquadrant.com wrote: Hi, The last week I twice had the need to see how many backends had some buffers pinned. Once during development and once while analyzing a stuck vacuum (waiting for a cleanup lock). I'd like to add a column to pg_buffercache exposing that. The name I've come up with is 'pinning_backends' to reflect the fact that it's not the actual pincount due to the backend private arrays.

This name sounds good to me.

+CREATE OR REPLACE VIEW pg_buffercache AS
+SELECT P.* FROM pg_buffercache_pages() AS P
+(bufferid integer, relfilenode oid, reltablespace oid, reldatabase oid,
+ relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
+ pincount int8);

pincount should be pinning_backends here?

This may be harmless but pinning_backends should be defined as int4 rather than int8 because BufferDesc->refcount is just defined as unsigned and it's converted to Datum by Int32GetDatum().

+-- complain if script is sourced in psql, rather than via CREATE EXTENSION

s/CREATE/ALTER

+\echo Use "CREATE EXTENSION pg_buffercache" to load this file. \quit

The message should be something like "ALTER EXTENSION pg_buffercache UPDATE TO '1.1'".

+/* might not be used, but the array is long enough */
+values[8] = Int32GetDatum(fctx->record[i].pinning_backends);
+nulls[8] = false;

Why is the above source comment needed?

Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add the number of pinning backends to pg_buffercache's output
On 2014-06-23 18:44:24 +0900, Fujii Masao wrote: On Sat, Apr 12, 2014 at 9:25 PM, Andres Freund and...@2ndquadrant.com wrote: Hi, The last week I twice had the need to see how many backends had some buffers pinned. Once during development and once while analyzing a stuck vacuum (waiting for a cleanup lock). I'd like to add a column to pg_buffercache exposing that. The name I've come up with is 'pinning_backends' to reflect the fact that it's not the actual pincount due to the backend private arrays.

This name sounds good to me.

+CREATE OR REPLACE VIEW pg_buffercache AS
+SELECT P.* FROM pg_buffercache_pages() AS P
+(bufferid integer, relfilenode oid, reltablespace oid, reldatabase oid,
+ relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
+ pincount int8);

pincount should be pinning_backends here?

Yes. I'd changed my mind around a couple of times and apparently didn't send the right version of the patch. Thanks.

This may be harmless but pinning_backends should be defined as int4 rather than int8 because BufferDesc->refcount is just defined as unsigned and it's converted to Datum by Int32GetDatum().

Well, in theory a uint32 can't generally be converted to an int32. That's why I chose an int64, because it's guaranteed to have sufficient range. It's fairly unlikely to have that many pins, but using an int64 seems cheap enough here.

+-- complain if script is sourced in psql, rather than via CREATE EXTENSION

s/CREATE/ALTER

+\echo Use "CREATE EXTENSION pg_buffercache" to load this file. \quit

Hm, right.

The message should be something like "ALTER EXTENSION pg_buffercache UPDATE TO '1.1'".

+/* might not be used, but the array is long enough */
+values[8] = Int32GetDatum(fctx->record[i].pinning_backends);
+nulls[8] = false;

Why is the above source comment needed?

It tries to explain that while the caller doesn't necessarily look at values[8] (if it's the old pg_proc entry), we're guaranteed to have allocated a long enough values/nulls array.

Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [Fwd: Re: [HACKERS] proposal: new long psql parameter --on-error-stop]
On Mon, Jun 23, 2014 at 6:06 PM, Pavel Stehule pavel.steh...@gmail.com wrote: 2014-06-23 10:57 GMT+02:00 Fujii Masao masao.fu...@gmail.com: [... review comments trimmed; quoted in full upthread ...]

The main reason is PGPASSWORD -- it is probably the most-used environment variable with psql. PGPASSWORD=** psql is a very commonly used pattern.

But it's not recommended, as the help message which the patch added says ;)

Why? Who can see this value?

I'm sure that the document explains this. http://www.postgresql.org/docs/devel/static/libpq-envars.html

--- PGPASSWORD behaves the same as the password connection parameter. Use of this environment variable is not recommended for security reasons, as some operating systems allow non-root users to see process environment variables via ps; instead consider using the ~/.pgpass file ---

Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
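For completeness, the ~/.pgpass alternative the documentation recommends looks like this (values illustrative; the file must be private, e.g. chmod 600, or libpq will ignore it):

    # ~/.pgpass -- one entry per line:
    # hostname:port:database:username:password
    localhost:5432:mydb:myuser:secret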
Re: [HACKERS] Add the number of pinning backends to pg_buffercache's output
On Mon, Jun 23, 2014 at 6:51 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-06-23 18:44:24 +0900, Fujii Masao wrote: [... earlier review exchange trimmed; quoted in full upthread ...]

This may be harmless but pinning_backends should be defined as int4 rather than int8 because BufferDesc->refcount is just defined as unsigned and it's converted to Datum by Int32GetDatum().

Well, in theory a uint32 can't generally be converted to an int32. That's why I chose an int64, because it's guaranteed to have sufficient range. It's fairly unlikely to have that many pins, but using an int64 seems cheap enough here.

Yep, you're right.

Why is the above source comment needed?

It tries to explain that while the caller doesn't necessarily look at values[8] (if it's the old pg_proc entry), we're guaranteed to have allocated a long enough values/nulls array.

Understood. I think you can commit this patch after fixing some minor things.

Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
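Putting the review together, the header of the 1.0-to-1.1 upgrade script would read roughly as follows, and the new column can then be queried directly (a sketch; the column list mirrors the view definition quoted earlier in the thread):

    -- complain if script is sourced in psql, rather than via ALTER EXTENSION
    \echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.1'" to load this file. \quit

    -- Example use of the new column: which buffers are pinned right now?
    SELECT bufferid, relfilenode, usagecount, pinning_backends
    FROM pg_buffercache
    WHERE pinning_backends > 0;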
Re: [HACKERS] crash with assertions and WAL_DEBUG
On 06/21/2014 01:58 PM, Heikki Linnakangas wrote: It's a bit difficult to attach the mark to the palloc calls, as neither the WAL_DEBUG nor LWLOCK_STATS code is calling palloc directly, but marking specific MemoryContexts as sanctioned ought to work. I'll take a stab at that.

I came up with the attached patch. It adds a function called MemoryContextAllowInCriticalSection(), which can be used to exempt specific memory contexts from the assertion. The following contexts are exempted:

* ErrorContext
* MdCxt, which is used in the checkpointer to absorb fsync requests. (The checkpointer process as a whole is no longer exempt.)
* The temporary StringInfos used in WAL_DEBUG (a new "WAL Debug" memory context is now created for them)
* The LWLock stats hash table (a new "LWLock stats" context is created for it)

Barring objections, I'll commit this to master, and remove the assertion from REL9_4_STABLE.

- Heikki

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index abc5682..e141a28 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -60,6 +60,7 @@
 #include "storage/spin.h"
 #include "utils/builtins.h"
 #include "utils/guc.h"
+#include "utils/memutils.h"
 #include "utils/ps_status.h"
 #include "utils/relmapper.h"
 #include "utils/snapmgr.h"
@@ -1258,6 +1259,25 @@ begin:;
 	if (XLOG_DEBUG)
 	{
 		StringInfoData buf;
+		static MemoryContext walDebugCxt = NULL;
+		MemoryContext oldCxt;
+
+		/*
+		 * Allocations within a critical section are normally not allowed,
+		 * because allocation failure would lead to a PANIC. But this is just
+		 * debugging code that no-one is going to enable in production, so we
+		 * don't care. Use a memory context that's exempt from the rule.
+		 */
+		if (walDebugCxt == NULL)
+		{
+			walDebugCxt = AllocSetContextCreate(TopMemoryContext,
+												"WAL Debug",
+												ALLOCSET_DEFAULT_MINSIZE,
+												ALLOCSET_DEFAULT_INITSIZE,
+												ALLOCSET_DEFAULT_MAXSIZE);
+			MemoryContextAllowInCriticalSection(walDebugCxt, true);
+		}
+		oldCxt = MemoryContextSwitchTo(walDebugCxt);

 		initStringInfo(&buf);
 		appendStringInfo(&buf, "INSERT @ %X/%X: ",
@@ -1282,10 +1302,11 @@ begin:;
 			appendStringInfoString(&buf, " - ");
 			RmgrTable[rechdr->xl_rmid].rm_desc(&buf, (XLogRecord *) recordbuf.data);
-			pfree(recordbuf.data);
 		}
 		elog(LOG, "%s", buf.data);
-		pfree(buf.data);
+
+		MemoryContextSwitchTo(oldCxt);
+		MemoryContextReset(walDebugCxt);
 	}
 #endif
diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index d23ac62..debce86 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -106,6 +106,7 @@ typedef struct lwlock_stats
 static int counts_for_pid = 0;
 static HTAB *lwlock_stats_htab;
+static MemoryContext lwlock_stats_cxt;
 #endif

 #ifdef LOCK_DEBUG
@@ -143,16 +144,34 @@ init_lwlock_stats(void)
 {
 	HASHCTL ctl;

-	if (lwlock_stats_htab != NULL)
+	if (lwlock_stats_cxt != NULL)
 	{
-		hash_destroy(lwlock_stats_htab);
+		MemoryContextDelete(lwlock_stats_cxt);
+		lwlock_stats_cxt = NULL;
+		/* the hash table went away with the context */
 		lwlock_stats_htab = NULL;
 	}

+	/*
+	 * The LWLock stats will be updated within a critical section, which
+	 * requires allocating new hash entries. Allocations within a critical
+	 * section are normally not allowed because running out of memory would
+	 * lead to a PANIC, but LWLOCK_STATS is debugging code that's not normally
+	 * turned on in production, so that's an acceptable risk. The hash entries
+	 * are small, so the risk of running out of memory is minimal in practice.
+	 */
+	lwlock_stats_cxt = AllocSetContextCreate(TopMemoryContext,
+											 "LWLock stats",
+											 ALLOCSET_DEFAULT_MINSIZE,
+											 ALLOCSET_DEFAULT_INITSIZE,
+											 ALLOCSET_DEFAULT_MAXSIZE);
+	MemoryContextAllowInCriticalSection(lwlock_stats_cxt, true);
+
 	MemSet(&ctl, 0, sizeof(ctl));
 	ctl.keysize = sizeof(lwlock_stats_key);
 	ctl.entrysize = sizeof(lwlock_stats);
 	ctl.hash = tag_hash;
+	ctl.hcxt = lwlock_stats_cxt;
 	lwlock_stats_htab = hash_create("lwlock stats", 16384, &ctl,
 									HASH_ELEM | HASH_FUNCTION);
 	counts_for_pid = MyProcPid;
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 3c1c81a..4264373 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -219,6 +219,16 @@ mdinit(void)
 										  &hash_ctl,
 										  HASH_ELEM | HASH_FUNCTION | HASH_CONTEXT);
 		pendingUnlinks = NIL;
+
+		/*
+		 * XXX: The checkpointer needs to add entries to the pending ops
+		 * table when absorbing fsync requests. That is done within a critical
+		 * section. It means that there's a theoretical possibility that you
+		 * run out of memory while absorbing fsync requests, which leads to
+		 * a PANIC. Fortunately the hash table is small so that's unlikely to
+		 * happen in practice.
+		 */
+		MemoryContextAllowInCriticalSection(MdCxt, true);
 	}
 }
diff
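The patch text is cut off above before the definition of the new function itself. A minimal sketch of its likely shape, assuming it simply records a per-context flag (the field name allowInCritSection is an assumption) that the allocation-time assertion then consults:

    /* Sketch: mark a context as safe for allocation inside a critical
     * section.  The assertion in the allocator would then be roughly
     * Assert(CritSectionCount == 0 || context->allowInCritSection). */
    void
    MemoryContextAllowInCriticalSection(MemoryContext context, bool allow)
    {
        AssertArg(MemoryContextIsValid(context));
        context->allowInCritSection = allow;
    }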
Re: [HACKERS] crash with assertions and WAL_DEBUG
On 2014-06-23 12:58:19 +0300, Heikki Linnakangas wrote: On 06/21/2014 01:58 PM, Heikki Linnakangas wrote: It's a bit difficult to attach the mark to the palloc calls, as neither the WAL_DEBUG nor LWLOCK_STATS code is calling palloc directly, but marking specific MemoryContexts as sanctioned ought to work. I'll take a stab at that.

I came up with the attached patch. It adds a function called MemoryContextAllowInCriticalSection(), which can be used to exempt specific memory contexts from the assertion. The following contexts are exempted: * ErrorContext * MdCxt, which is used in the checkpointer to absorb fsync requests. (The checkpointer process as a whole is no longer exempt.) * The temporary StringInfos used in WAL_DEBUG (a new "WAL Debug" memory context is now created for them) * The LWLock stats hash table (a new "LWLock stats" context is created for it) Barring objections, I'll commit this to master, and remove the assertion from REL9_4_STABLE.

Sounds generally sane. Some comments:

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index abc5682..e141a28 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -60,6 +60,7 @@
 #include "storage/spin.h"
 #include "utils/builtins.h"
 #include "utils/guc.h"
+#include "utils/memutils.h"
 #include "utils/ps_status.h"
 #include "utils/relmapper.h"
 #include "utils/snapmgr.h"
@@ -1258,6 +1259,25 @@ begin:;
 	if (XLOG_DEBUG)
 	{
 		StringInfoData buf;
+		static MemoryContext walDebugCxt = NULL;
+		MemoryContext oldCxt;
+
+		/*
+		 * Allocations within a critical section are normally not allowed,
+		 * because allocation failure would lead to a PANIC. But this is just
+		 * debugging code that no-one is going to enable in production, so we
+		 * don't care. Use a memory context that's exempt from the rule.
+		 */
+		if (walDebugCxt == NULL)
+		{
+			walDebugCxt = AllocSetContextCreate(TopMemoryContext,
+												"WAL Debug",
+												ALLOCSET_DEFAULT_MINSIZE,
+												ALLOCSET_DEFAULT_INITSIZE,
+												ALLOCSET_DEFAULT_MAXSIZE);
+			MemoryContextAllowInCriticalSection(walDebugCxt, true);
+		}
+		oldCxt = MemoryContextSwitchTo(walDebugCxt);

This will only work though if the first XLogInsert() isn't called from a critical section. I'm not sure it's a good idea to rely on that.

diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 3c1c81a..4264373 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -219,6 +219,16 @@ mdinit(void)
 										  &hash_ctl,
 										  HASH_ELEM | HASH_FUNCTION | HASH_CONTEXT);
 		pendingUnlinks = NIL;
+
+		/*
+		 * XXX: The checkpointer needs to add entries to the pending ops
+		 * table when absorbing fsync requests. That is done within a critical
+		 * section. It means that there's a theoretical possibility that you
+		 * run out of memory while absorbing fsync requests, which leads to
+		 * a PANIC. Fortunately the hash table is small so that's unlikely to
+		 * happen in practice.
+		 */
+		MemoryContextAllowInCriticalSection(MdCxt, true);
 	}
 }

Isn't that allowing a bit too much? We e.g. shouldn't allow _fdvec_alloc() within a critical section. Might make sense to create a child context for it.

Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Use a signal to trigger a memory context dump?
Hi, I wonder if it'd make sense to allow a signal to trigger a memory context dump? I and others more than once had the need to examine memory usage on production systems and using gdb isn't always realistic. I wonder if we could install a signal handler for some unused signal (e.g. SIGPWR) to dump memory. I'd also considered adding a SQL function that uses the SIGUSR1 signal multiplexing for the purpose but that's not necessarily nice if you have to investigate while SQL access isn't yet possible. There's also the problem that not all possibly interesting processes use the sigusr1 signal multiplexing. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
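A minimal sketch of the idea, assuming MemoryContextStats() is acceptable to call from a handler (it only writes to stderr, but its signal safety is exactly the kind of detail this proposal would need to settle):

    #include <signal.h>

    /* Hypothetical handler: dump per-context memory statistics on SIGPWR. */
    static void
    dump_memory_context_stats(SIGNAL_ARGS)
    {
        MemoryContextStats(TopMemoryContext);
    }

    /* somewhere during process startup: */
    pqsignal(SIGPWR, dump_memory_context_stats);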
Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL
On Fri, May 2, 2014 at 3:19 PM, Ian Barwick i...@2ndquadrant.com wrote: Hi Here is an initial version of an auditing extension for Postgres to generate log output suitable for compiling a comprehensive audit trail of database operations. You added this into CF, but its patch has not been posted yet. Are you planning to make a patch? Planned future improvements include: 1. Additional logging facilities, including to a separate audit log file and to syslog, and potentially logging to a table (possibly via a bgworker process). Currently output is simply emitted to the server log via ereport(). 2. To implement per-object auditing configuration, it would be nice to use extensible reloptions (or an equivalent mechanism) Is it possible to implement these outside PostgreSQL by using hooks? If not, it might be better to implement audit feature in core from the beginning. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] idle_in_transaction_timeout
On 2014-06-22 19:03:32 -0700, Kevin Grittner wrote: Andres Freund and...@2ndquadrant.com wrote: I think we'll want a version of this that just fails the transaction once we have the infrastructure. So we should choose a name that allows for a complementary GUC.

If we stick with the rule that what is to the left of _timeout is what is being cancelled, then a GUC to cancel a transaction which remains idle for too long could be called idle_transaction_timeout. Do you disagree with the general idea of following that pattern?

I think that'd be rather confusing. For one, it'd need to be idle_in_transaction_timeout, which already seems less clear (because the "transaction" belongs to "idle"), and for another, that distinction seems too subtle for users. The reason I suggested idle_in_transaction_termination/cancellation_timeout is that it maps nicely to pg_terminate/cancel_backend() and is rather descriptive.

Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL
(I'm replying as co-author of pgaudit.) At 2014-06-23 19:15:39 +0900, masao.fu...@gmail.com wrote: You added this into CF, but its patch has not been posted yet. Are you planning to make a patch? It's a self-contained contrib module. I thought Ian had posted a tarball, but it looks like he forgot to attach it (or decided to provide only a Github link). I've attached a tarball here for your reference. Planned future improvements include: 1. Additional logging facilities, including to a separate audit log file and to syslog, and potentially logging to a table (possibly via a bgworker process). Currently output is simply emitted to the server log via ereport(). 2. To implement per-object auditing configuration, it would be nice to use extensible reloptions (or an equivalent mechanism) Is it possible to implement these outside PostgreSQL by using hooks? There are some unresolved questions with #2 because the extensible reloptions patch seems to have lost favour, but I'm pretty sure we could figure out some alternative. If not, it might be better to implement audit feature in core from the beginning. Sure, we're open to that possibility. Do you have any ideas about what an in-core implementation should do/look like? -- Abhijit pgaudit.tgz Description: GNU Unix tar archive -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [Fwd: Re: [HACKERS] proposal: new long psql parameter --on-error-stop]
2014-06-23 11:53 GMT+02:00 Fujii Masao masao.fu...@gmail.com: On Mon, Jun 23, 2014 at 6:06 PM, Pavel Stehule pavel.steh...@gmail.com wrote: 2014-06-23 10:57 GMT+02:00 Fujii Masao masao.fu...@gmail.com: [... review comments trimmed; quoted in full upthread ...]

The main reason is PGPASSWORD -- it is probably the most-used environment variable with psql. PGPASSWORD=** psql is a very commonly used pattern.

But it's not recommended, as the help message which the patch added says ;)

Why? Who can see this value?

I'm sure that the document explains this.

OK, I was too Linux-centric :( It is safe there, and I didn't consider the others. Thank you for the info.

Regards Pavel

http://www.postgresql.org/docs/devel/static/libpq-envars.html --- PGPASSWORD behaves the same as the password connection parameter. Use of this environment variable is not recommended for security reasons, as some operating systems allow non-root users to see process environment variables via ps; instead consider using the ~/.pgpass file ---

Regards, -- Fujii Masao
Re: [HACKERS] tab completion for setting search_path
On 2014-06-22 20:02:57 -0700, Tom Lane wrote: Ian Barwick i...@2ndquadrant.com writes: On 23/06/14 00:58, Andres Freund wrote: I thought about committing this but couldn't get over this bit. If you type SELECT * FROM pg_cat<TAB> it'll get autocompleted to pg_catalog.pg_, and pg_temp<TAB> will list all the temp schemas, including the numeric and toast ones. So we have precedent for *not* bothering about excluding any schemas. I don't think we should start doing so in a piecemeal fashion in an individual command's completion.

There is an exception of sorts already for system schemas, in that although SELECT * FROM p<TAB> will list the system schemas, it will not list any tables from them, and won't until SELECT * FROM pg_<TAB> is entered (see note in tab-complete.c around line 3722). Personally I'd be mildly annoyed if every SET search_path TO p<TAB> resulted in all the system schemas being displayed when all I want is public; how about having these listed only once pg_ is entered, i.e. SET search_path TO pg_<TAB>?

I think there is a pretty strong practical argument for excluding the pg_temp and pg_toast schemas from completion for search_path, namely: when does anyone ever need to include those in their search_path explicitly?

Infrequently, yes. I've only done it when trying to break stuff ;)

The use-case for including pg_catalog in your path is perhaps a bit greater, but not by much.

I don't know. It feels like inappropriate nannyism to me -- more confusing than actually helpful. The schemas are there, so they should get autocompleted. But anyway, the common opinion seems to be swinging against my position, so let's do it that way.

Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
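Expressed as the kind of query tab-complete.c runs, the filter being converged on here would look roughly like this (illustrative only; the real completion query also matches against the prefix typed so far):

    -- Offer schemas for "SET search_path TO", hiding the temp and toast
    -- schemas; whether pg_catalog should also be hidden is the open
    -- question debated above.
    SELECT nspname
    FROM pg_catalog.pg_namespace
    WHERE nspname !~ '^pg_(toast|temp)';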
Re: [HACKERS] idle_in_transaction_timeout
On Mon, Jun 23, 2014 at 7:48 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-06-22 19:03:32 -0700, Kevin Grittner wrote: Andres Freund and...@2ndquadrant.com wrote: I think we'll want a version of this that just fails the transaction once we have the infrastructure. So we should choose a name that allows for a complementary GUC. If we stick with the rule that what is to the left of _timeout is what is being cancelled, then a GUC to cancel a transaction which remains idle for too long could be called idle_transaction_timeout. Do you disagree with the general idea of following that pattern?

I think that'd be rather confusing. For one, it'd need to be idle_in_transaction_timeout, which already seems less clear (because the "transaction" belongs to "idle"), and for another, that distinction seems too subtle for users. The reason I suggested idle_in_transaction_termination/cancellation_timeout is that it maps nicely to pg_terminate/cancel_backend() and is rather descriptive.

Maybe we can remove IIT_termination_timeout once we've implemented IIT_cancellation_timeout. Right? I'm not sure if IIT_termination_timeout would still be useful even in that case. *If* it's not useful, I think we don't need to have those two parameters and can just define one parameter, IIT_timeout. That's quite simple, and it's similar to the current style of statement_timeout and lock_timeout (IOW, we don't have something like statement_termination_timeout and lock_termination_timeout).

Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] idle_in_transaction_timeout
On 2014-06-23 20:29:17 +0900, Fujii Masao wrote: On Mon, Jun 23, 2014 at 7:48 PM, Andres Freund and...@2ndquadrant.com wrote: [... naming discussion trimmed; quoted in full upthread ...]

Maybe we can remove IIT_termination_timeout once we've implemented IIT_cancellation_timeout. Right? I'm not sure if IIT_termination_timeout would still be useful even in that case.

I think both can actually be sensible depending on the use case. It's also not nice to remove a feature without need when people have started to rely on it. For a web app, termination is probably more sensible; for interactive clients, cancellation.

*If* it's not useful, I think we don't need to have those two parameters and can just define one parameter, IIT_timeout. That's quite simple, and it's similar to the current style of statement_timeout and lock_timeout (IOW, we don't have something like statement_termination_timeout and lock_termination_timeout).

I don't think those really are comparable. A long idle-in-transaction state pretty much always indicates a problematic interaction with postgres.

Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] idle_in_transaction_timeout
On 06/22/2014 07:47 PM, Andres Freund wrote: On 2014-06-22 09:27:24 -0700, Kevin Grittner wrote: Andres Freund and...@2ndquadrant.com wrote: The idea with the GUC name is that if we ever get support for cancelling transactions we can name that idle_in_transaction_transaction_timeout? That seems a bit awkward... No, the argument was that for all the other *_timeout settings what came before _timeout was the thing that was being terminated. I think there were some votes in favor of the name on that basis, and none against. Feel free to give your reasons for supporting some other name. My reasons for not liking the current GUC name are hinted at above. I think we'll want a version of this that just fails the transaction once we have the infrastructure. So we should choose a name that allows for a complimentary GUC. CAKFQuwZCg2uur=tudz_c2auwbo87offghn9mx_hz4qd-b8f...@mail.gmail.com suggested On 2014-06-19 10:39:48 -0700, David G Johnston wrote: idle_in_transaction_timeout=10s idle_in_transaction_target=session|transaction but I don't like that much. Not sure what'd be good, the best I currently can come up with is: idle_in_transaction_termination_timeout = idle_in_transaction_cancellation_timeout = Except the transaction wouldn't be cancelled, it would be aborted. idle_in_transaction_abortion_timeout seems a little... weird. -- Vik -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] idle_in_transaction_timeout
On 2014-06-23 13:33:46 +0200, Vik Fearing wrote: On 06/22/2014 07:47 PM, Andres Freund wrote: On 2014-06-22 09:27:24 -0700, Kevin Grittner wrote: Andres Freund and...@2ndquadrant.com wrote: but I don't like that much. Not sure what'd be good, the best I currently can come up with is: idle_in_transaction_termination_timeout = idle_in_transaction_cancellation_timeout = Except the transaction wouldn't be cancelled, it would be aborted. That ship has sailed with pg_cancel_backend(), no? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] review: Non-recursive processing of AND/OR lists
Thanks! On Mon, Jun 16, 2014 at 3:58 PM, Tom Lane t...@sss.pgh.pa.us wrote: I wrote: Gurjeet Singh gurj...@singh.im writes: I tried to eliminate the 'pending' list, but I don't see a way around it. We need temporary storage somewhere to store the branches encountered on the right; in recursion case the call stack was serving that purpose. I still think we should fix this in the grammar, rather than introducing complicated logic to try to get rid of the recursion later. For example, as attached. I went looking for (and found) some additional obsoleted comments, and convinced myself that ruleutils.c is okay as-is, and pushed this. regards, tom lane -- Gurjeet Singh http://gurjeet.singh.im/ EDB www.EnterpriseDB.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] /proc/self/oom_adj is deprecated in newer Linux kernels
On Wed, Jun 18, 2014 at 8:15 PM, Tom Lane t...@sss.pgh.pa.us wrote: Gurjeet Singh gurj...@singh.im writes: Please find attached the patch. It includes the doc changes as well. Applied with some editorialization. Thanks! would it be possible to include this in 9.4 as well? Best regards, -- Gurjeet Singh http://gurjeet.singh.im/ EDB www.EnterpriseDB.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
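For context, these are the two Linux interfaces in question (values illustrative; lower values make the process less likely to be OOM-killed):

    # Deprecated interface (range -17..15; -17 disables the OOM killer):
    echo -17 > /proc/self/oom_adj
    # Its replacement in newer kernels (range -1000..1000; -1000 disables):
    echo -1000 > /proc/self/oom_score_adj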
[HACKERS] Re: Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls
Hi Abhijit - What's the status of this patch? The latest version of the patch needs a review, and I'd like to get it committed in this CF if possible. Thanks - Nick -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Use a signal to trigger a memory context dump?
Andres, * Andres Freund (and...@2ndquadrant.com) wrote: I wonder if it'd make sense to allow a signal to trigger a memory context dump? I and others more than once had the need to examine memory usage on production systems and using gdb isn't always realistic. +100 I keep thinking we have this and then keep being disappointed when I go try to find it. I wonder if we could install a signal handler for some unused signal (e.g. SIGPWR) to dump memory. Interesting thought, but.. I'd also considered adding a SQL function that uses the SIGUSR1 signal multiplexing for the purpose but that's not necessarily nice if you have to investigate while SQL access isn't yet possible. There's also the problem that not all possibly interesting processes use the sigusr1 signal multiplexing. I'd tend to think this would be sufficient. You're suggesting a case where you need to debug prior to SQL access (not specifically sure what you mean by that) or processes which are hopefully less likely to have memory issues, but you don't have gdb.. Another thought along the lines of getting information about running processes would be to see the call stack or execution plan.. I seem to recall there being a patch for the latter at one point? Thanks, Stephen
Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL
On Mon, Jun 23, 2014 at 7:51 PM, Abhijit Menon-Sen a...@2ndquadrant.com wrote: (I'm replying as co-author of pgaudit.) At 2014-06-23 19:15:39 +0900, masao.fu...@gmail.com wrote: You added this into CF, but its patch has not been posted yet. Are you planning to make a patch? It's a self-contained contrib module. I thought Ian had posted a tarball, but it looks like he forgot to attach it (or decided to provide only a Github link). I've attached a tarball here for your reference. Planned future improvements include: 1. Additional logging facilities, including to a separate audit log file and to syslog, and potentially logging to a table (possibly via a bgworker process). Currently output is simply emitted to the server log via ereport(). 2. To implement per-object auditing configuration, it would be nice to use extensible reloptions (or an equivalent mechanism) Is it possible to implement these outside PostgreSQL by using hooks? There are some unresolved questions with #2 because the extensible reloptions patch seems to have lost favour, but I'm pretty sure we could figure out some alternative. If not, it might be better to implement audit feature in core from the beginning. Sure, we're open to that possibility. Do you have any ideas about what an in-core implementation should do/look like? I don't have a good idea about that. But maybe we can merge pgaudit.log into log_statement for more flexible settings of what to log. BTW I found that pgaudit doesn't log bind parameters when prepared statements are executed. I feel this behavior is not good for audit purposes. Thought? Also I got only the following log message when I executed PREPARE hoge AS INSERT INTO mytbl VALUES($1) and EXECUTE hoge(1). This means that obviously we cannot track the statements via the PREPARE command... LOG: [AUDIT],2014-06-23 21:14:53.667059+09,postgres,postgres,postgres,WRITE,INSERT,TABLE,postgres.mytbl,execute hoge(1); Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
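For anyone wanting to try the module while reviewing: a sketch of a minimal configuration, going by the tarball's documentation. The exact class names accepted by pgaudit.log are defined by the module; 'read, write' here is an assumed example, not a guaranteed spelling.

    # postgresql.conf (sketch; class names are assumptions from the module docs)
    shared_preload_libraries = 'pgaudit'
    pgaudit.log = 'read, write'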
Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL
* Fujii Masao (masao.fu...@gmail.com) wrote: On Mon, Jun 23, 2014 at 7:51 PM, Abhijit Menon-Sen a...@2ndquadrant.com wrote: At 2014-06-23 19:15:39 +0900, masao.fu...@gmail.com wrote: You added this into CF, but its patch has not been posted yet. Are you planning to make a patch? It's a self-contained contrib module. I thought Ian had posted a tarball, but it looks like he forgot to attach it (or decided to provide only a Github link). I've attached a tarball here for your reference. I'm not a huge fan of adding this as a contrib module unless we can be quite sure that there's a path forward from here to a rework of the logging in core which would actually support the features pg_audit is adding, without a lot of pain and upgrade issues. Those issues have kept other contrib modules from being added to core. Splitting up contrib into other pieces, one of which is a 'features' area, might address that but we'd really need a way to have those pieces be able to include/add catalog tables, at least.. If not, it might be better to implement audit feature in core from the beginning. Sure, we're open to that possibility. Do you have any ideas about what an in-core implementation should do/look like? I don't have good idea about that. But maybe we can merge pgaudit.log into log_statement for more flexible settings of what to log. I'd expect a catalog table or perhaps changes to pg_class (maybe other things also..) to define what gets logged.. I'd also like to see the ability to log based on the connecting user, and we need to log under what privileges a command is executing, and, really, a whole host of other things.. Thanks, Stephen
[HACKERS] How to use the 'char() ' as data type and a function name in the same time.
Dear Hackers When I use the pg_catalog.char(integer) function in postgres, I can only use it like these: select pg_catalog.char(65); select "char"(65); But I want to use the function in the following way: select char(1); Of course, there would be a grammar error. I know the error is caused by the fact that 'char()' has been defined as one of the character data types in the 'src\backend\parser\gram.y' file. However, in SQL Server, 'char()' can be used as a data type and a function name at the same time. For example:

    select char(65)              -- char() used as a function name; returns 'A'
    select cast('1' as char(4)); -- char() used as data type; returns '1'

So, I wonder if there is a solution in PostgreSQL to use char() as a data type and a function name at the same time like in SQL Server. Any help appreciated.
Re: [HACKERS] Use a signal to trigger a memory context dump?
On 2014-06-23 08:36:02 -0400, Stephen Frost wrote: Andres, * Andres Freund (and...@2ndquadrant.com) wrote: I wonder if it'd make sense to allow a signal to trigger a memory context dump? I and others more than once had the need to examine memory usage on production systems and using gdb isn't always realistic. +100 I keep thinking we have this and then keep being disappointed when I go try to find it. I wonder if we could install a signal handler for some unused signal (e.g. SIGPWR) to dump memory. Interesting thought, but.. I'd also considered adding a SQL function that uses the SIGUSR1 signal multiplexing for the purpose but that's not necessarily nice if you have to investigate while SQL access isn't yet possible. There's also the problem that not all possibly interesting processes use the sigusr1 signal multiplexing. I'd tend to think this would be sufficient. You're suggesting a case where you need to debug prior to SQL access (not specifically sure what you mean by that) or processes which are hopefully less likely to have memory issues, but you don't have gdb.. prior to SQL access := Before crash recovery finished/hot standby reached consistency. And I don't agree that memory dumps from non-plain backends are that uninteresting. E.g. background workers and logical decoding walsenders both can be interesting. Another thought along the lines of getting information about running processes would be to see the call stack or execution plan.. I seem to recall there being a patch for the latter at one point? I think these are *much* more complicated. I don't want to tackle them at the same time, otherwise we'll never get anywhere. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] test failure on latest source
On 16/04/2014 18:55, Marco Atzeri wrote: On 16/04/2014 17:40, Tom Lane wrote: The bigger picture though is that this code isn't failing on the buildfarm. So what we need to ask is what's different about Marco's machine. good question. I checked again and I found that the fault is only on the cygwin 64 bit build but not on the cygwin 32 bit one. I was sure to have tested both, but it seems this time I missed the comparison. regards, tom lane Regards Marco to close the issue, we have identified the fault in the getaddrinfo system call on cygwin64. What happens is that the field ai_addrlen is defined as socklen_t in POSIX, but as size_t in the W32 API. On 64 bit, socklen_t is 4 bytes while size_t is 8 bytes. Setting all the hintp members manually (in contrast to calloc'ing it or memset'ing it to 0) leaves the 4 upper bytes of the ai_addrlen untouched. This in turn leads to a high probability that ai_addrlen has an invalid value when entering Winsock's getsockopt. Regards Marco -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
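In other words, the safe pattern on the caller's side is to zero the whole hints struct before filling individual members, so padding and any wider-than-expected fields (like cygwin64's 8-byte ai_addrlen) start out as zero. A minimal sketch (function and parameter names are illustrative):

    #include <string.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netdb.h>

    static int
    lookup(const char *host, const char *port, struct addrinfo **res)
    {
        struct addrinfo hints;

        /*
         * Zero every byte first.  Assigning the members one by one would
         * leave the upper 4 bytes of ai_addrlen untouched on cygwin64,
         * where it is an 8-byte size_t rather than a 4-byte socklen_t.
         */
        memset(&hints, 0, sizeof(hints));
        hints.ai_family = AF_UNSPEC;
        hints.ai_socktype = SOCK_STREAM;

        return getaddrinfo(host, port, &hints, res);
    }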
Re: [HACKERS] tab completion for setting search_path
Andres Freund and...@2ndquadrant.com wrote: On 2014-06-22 20:02:57 -0700, Tom Lane wrote: Ian Barwick i...@2ndquadrant.com writes: On 23/06/14 00:58, Andres Freund wrote: I thought about committing this but couldn't get over this bit. If you type SELECT * FROM pg_cat<tab> it'll get autocompleted to pg_catalog.pg_ and pg_temp<tab> will list all the temp schemas including the numeric and toast ones. So we have precedent for *not* bothering about excluding any schemas. I don't think we should start doing so in a piecemeal fashion in an individual command's completion. There is an exception of sorts already for system schemas, in that although SELECT * FROM p<tab> will list the system schemas, it will not list any tables from them, and won't until SELECT * FROM pg_<tab> is entered (see note in tab-complete.c around line 3722). Personally I'd be mildly annoyed if every SET search_path TO p<tab> resulted in all the system schemas being displayed when all I want is public; how about having these listed only once pg_ is entered, i.e. SET search_path TO pg_<tab>? I think there is a pretty strong practical argument for excluding the pg_temp and pg_toast schemas from completion for search_path, namely that when does anyone ever need to include those in their search_path explicitly? Infrequently, yes. I've only done it when trying to break stuff ;) The use-case for including pg_catalog in your path is perhaps a bit greater, but not by much. I don't know. It feels like inappropriate nannyism to me. More confusing than actually helpful. The schemas are there, so they should get autocompleted. But anyway, the common opinion seems to be swinging against my position, so let's do it that way. I would be for excluding the pg_toast, pg_toast_temp_n, and pg_temp_n schemas, and including public and pg_catalog. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
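A sketch of what the completion query could look like under Kevin's proposal; the actual query text in tab-complete.c will differ, and the LIKE patterns here are illustrative. pg\_toast% already covers both pg_toast and the pg_toast_temp_n schemas, while public and pg_catalog are left in.

    SELECT nspname
      FROM pg_catalog.pg_namespace
     WHERE nspname NOT LIKE 'pg\_toast%'
       AND nspname NOT LIKE 'pg\_temp\_%';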
Re: [HACKERS] Use a signal to trigger a memory context dump?
Andres, * Andres Freund (and...@2ndquadrant.com) wrote: On 2014-06-23 08:36:02 -0400, Stephen Frost wrote: I'd tend to think this would be sufficient. You're suggesting a case where you need to debug prior to SQL access (not specifically sure what you mean by that) or processes which are hopefully less likely to have memory issues, but you don't have gdb.. prior to SQL access := Before crash recovery finished/hot standby reached consistency. And I don't agree that memory dumps from non-plain backends are that uninteresting. E.g. background workers and logical decoding walsenders both can be interesting. I didn't mean they're uninteresting- I meant that if you're dealing with those kinds of issues, having gdb isn't as huge a hurdle.. Another thought along the lines of getting information about running processes would be to see the call stack or execution plan.. I seem to recall there being a patch for the latter at one point? I think these are *much* more complicated. I don't want to tackle them at the same time, otherwise we'll never get anywhere. Sure, just some things to keep in mind as you're thinking about changes in this area. Just to toss another random thought out there, what about an SQL function which does a LISTEN and then sends a signal to another backend which throws a NOTIFY with payload including the requested info? That'd be *very* useful as there are lots of cases where access to the logs isn't trivial (particularly if they've been properly locked down due to the sensitive info they can contain..). Thanks, Stephen
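For the archives, a rough sketch of the simplest variant discussed in this thread: a handler on an otherwise-unused signal that dumps the context tree. The choice of SIGUSR2 is an assumption, and calling MemoryContextStats() directly from a handler is not async-signal-safe, so this is strictly an ad-hoc debugging aid, not a design proposal.

    #include "postgres.h"

    #include <signal.h>

    #include "utils/memutils.h"

    /* Debugging aid only: dump memory context stats to stderr on signal. */
    static void
    dump_memory_contexts_handler(SIGNAL_ARGS)
    {
        /* MemoryContextStats() walks the tree, printing one line per context. */
        MemoryContextStats(TopMemoryContext);
    }

    /* ... during process startup (placement is an assumption) ... */
    /* pqsignal(SIGUSR2, dump_memory_contexts_handler); */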
[HACKERS] Re: [bug fix] multibyte messages are displayed incorrectly on the client
On 04/05/2014 07:56 AM, Tom Lane wrote: MauMau maumau...@gmail.com writes: Then, as a happy medium, how about disabling message localization only if the client encoding differs from the server one? That is, compare the client_encoding value in the startup packet with the result of GetPlatformEncoding(). If they don't match, call disable_message_localization(). AFAICT this is not what was agreed to in this thread. It puts far too much credence in the server-side default for client_encoding, which up to now has never been thought to be very interesting; indeed I doubt most people bother to set it at all. The reason that this issue is even on the table is that that default is too likely to be wrong, no? Also, whatever possessed you to use pg_get_encoding_from_locale to identify the server's encoding? That's expensive and seems fairly unlikely to yield the right answer. I don't remember offhand where we keep the postmaster's idea of what encoding messages should be in, but I'm fairly sure it's stored explicitly somewhere. Or if it isn't, we can for sure do better than recalculating it during every connection attempt. Having said all that, though, I'm unconvinced that this cure isn't worse than the disease. Somebody claimed upthread that no very interesting messages would be delocalized by a change like this, but that's complete nonsense: in particular, *every* message associated with client authentication will be sent in English if we go down this path. Given the nearly complete lack of complaints in the many years that this code has worked like this, I'm betting that most people will find a change like this to be a net reduction in friendliness. Given the changes here to extract client_encoding from the startup packet ASAP, I wonder whether the right thing isn't just to set the client encoding immediately when we do that. Most application libraries pass client encoding in the startup packet anyway (libpq certainly does). Based on Tom's comments above, I'm marking this as returned with feedback in the commitfest. I agree that setting client_encoding as early as possible seems like the right thing to do. Earlier in this thread, MauMau pointed out that we can't do encoding conversions until we have connected to the database because you need to read pg_conversion for that. That's because we support creating custom conversions with CREATE CONVERSION. Frankly, I don't think anyone cares about that feature. If we just dropped the CREATE/DROP CONVERSION feature altogether and hard-coded the conversions we have, there would be close to zero complaints. Even if you want to extend something around encodings and conversions, the CREATE CONVERSION interface is clunky. Firstly, conversions are per-database, and even schema-qualified, which just seems like an extra complication. You'll most likely want to modify the conversion across the whole system. Secondly, rather than define a new conversion between encodings, you'll likely want to define a whole new encoding with conversions to/from existing encodings, but you can't do that anyway without hacking the source code. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
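For context, this is the interface being called clunky: a conversion is a schema-qualified object wrapping a C-language function that must be created first, with the fixed signature the documentation prescribes. The names myconv, myfunc, mylib, and latin1_to_utf8 below are placeholders.

    -- the C-language conversion function must already exist:
    CREATE FUNCTION myfunc (integer, integer, cstring, internal, integer)
        RETURNS void AS 'mylib', 'latin1_to_utf8' LANGUAGE C STRICT;

    CREATE CONVERSION myconv FOR 'LATIN1' TO 'UTF8' FROM myfunc;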
Re: [HACKERS] replication identifier format
On Wed, Jun 18, 2014 at 12:46 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-06-18 12:36:13 -0400, Robert Haas wrote: I actually don't think any of the discussions I was involved in had the externally visible version of replication identifiers limited to 16bits? If you are referring to my patch, 16bits was just the width of the *internal* name that should basically never be looked at. User visible replication identifiers are always identified by an arbitrary string - whose format is determined by the user of the replication identifier facility. *BDR* currently stores the system identifier, the database id and a name in there - but that's nothing core needs to concern itself with. I don't think you're going to be able to avoid users needing to know about those IDs. The configuration table is going to have to be the same on all nodes, and how are you going to get that set up without those IDs being user-visible? Why? Users and other systems only ever see the external ID. Everything leaving the system is converted to the external form. The short id basically is only used in shared memory and in wal records. For both using longer strings would be problematic. In the patch I have the user can actually see them as they're stored in pg_replication_identifier, but there should never be a need for that. Hmm, so there's no requirement that the short IDs are consistent across different clusters that are replicating to each other? If that's the case, that might address my concern, but I'd probably want to go back through the latest patch and think about it a bit more. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] replication identifier format
On 2014-06-23 10:09:49 -0400, Robert Haas wrote: On Wed, Jun 18, 2014 at 12:46 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-06-18 12:36:13 -0400, Robert Haas wrote: I actually don't think any of the discussions I was involved in had the externally visible version of replication identifiers limited to 16bits? If you are referring to my patch, 16bits was just the width of the *internal* name that should basically never be looked at. User visible replication identifiers are always identified by an arbitrary string - whose format is determined by the user of the replication identifier facility. *BDR* currently stores the system identifier, the database id and a name in there - but that's nothing core needs to concern itself with. I don't think you're going to be able to avoid users needing to know about those IDs. The configuration table is going to have to be the same on all nodes, and how are you going to get that set up without those IDs being user-visible? Why? Users and other systems only ever see the external ID. Everything leaving the system is converted to the external form. The short id basically is only used in shared memory and in wal records. For both using longer strings would be problematic. In the patch I have the user can actually see them as they're stored in pg_replication_identifier, but there should never be a need for that. Hmm, so there's no requirement that the short IDs are consistent across different clusters that are replicating to each other? Nope. That seemed to be a hard requirement in the earlier discussions we had (~2 years ago). If that's the case, that might address my concern, but I'd probably want to go back through the latest patch and think about it a bit more. I'll send out a new version after I'm finished with the newest atomic ops patch. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Wait free LW_SHARED acquisition - v0.2
On Tue, Jun 17, 2014 at 8:56 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-06-17 20:47:51 +0530, Amit Kapila wrote: On Tue, Jun 17, 2014 at 6:35 PM, Andres Freund and...@2ndquadrant.com wrote: You have followed it pretty well as far as I can understand from your replies, as there is no reproducible test (which I think is bit tricky to prepare), so it becomes difficult to explain by theory. I'm working an updated patch that moves the releaseOK into the spinlocks. Maybe that's the problem already - it's certainly not correct as is. Sure, I will do the test/performance test with updated patch; you might want to include some more changes based on comments in mail below: You are right, it will wakeup the existing waiters, but I think the new logic has one difference which is that it can allow the backend to take Exclusive lock when there are already waiters in queue. As per above example even though Session-2 and Session-3 are in wait queue, Session-4 will be able to acquire Exclusive lock which I think was previously not possible. I think that was previously possible as well - in a slightly different set of circumstances though. After a process releases a lock and wakes up some of several waiters another process can come in and acquire the lock. Before the woken up process gets scheduled again. lwlocks aren't fair locks... Okay, but I think changing behaviour for lwlocks might impact some tests/applications. As they are not fair, I think defining exact behaviour is not easy and we don't have any concrete scenario which can be effected, so there should not be problem in accepting slightly different behaviour. Few more comments: 1. LWLockAcquireCommon() { .. iterations++; } In current logic, I could not see any use of these *iterations* variable. 2. LWLockAcquireCommon() { .. if (!LWLockDequeueSelf(l)) { /* * Somebody else dequeued us and has or will.. .. */ for (;;) { PGSemaphoreLock(proc-sem, false); if (!proc-lwWaiting) break; extraWaits++; } lock-releaseOK = true; } Do we want to set result = false; after waking in above code? The idea behind setting false is to indicate whether we get the lock immediately or not which previously was decided based on if it needs to queue itself? 3. LWLockAcquireCommon() { .. /* * Ok, at this point we couldn't grab the lock on the first try. We * cannot simply queue ourselves to the end of the list and wait to be * woken up because by now the lock could long have been released. * Instead add us to the queue and try to grab the lock again. If we * suceed we need to revert the queuing and be happy, otherwise we * recheck the lock. If we still couldn't grab it, we know that the * other lock will see our queue entries when releasing since they * existed before we checked for the lock. */ /* add to the queue */ LWLockQueueSelf(l, mode); /* we're now guaranteed to be woken up if necessary */ mustwait = LWLockAttemptLock(l, mode, false, potentially_spurious); .. } a. By reading above code and comments, it is not quite clear why second attempt is important unless somebody thinks on it or refer your comments in *Notes* section at top of file. I think it's better to indicate in some way so that code reader can refer to Notes section or whereever you are planing to keep those comments. b. There is typo in above comment suceed/succeed. 4. LWLockAcquireCommon() { .. if (!LWLockDequeueSelf(l)) { for (;;) { PGSemaphoreLock(proc-sem, false); if (!proc-lwWaiting) break; extraWaits++; } lock-releaseOK = true; .. 
} Setting releaseOK in above context might not be required because if the control comes in this part of code, it will not retry to acquire another time. 5. LWLockWaitForVar() { .. else mustwait = false; if (!mustwait) break; .. } I think we can directly break in else part in above code. 6. LWLockWaitForVar() { .. /* * Quick test first to see if it the slot is free right now. * * XXX: the caller uses a spinlock before this,... */ if (pg_atomic_read_u32(lock-lockcount) == 0) return true; } Does the part of comment that refers to spinlock is still relevant after using atomic ops? 7. LWLockWaitForVar() { .. /* * Add myself to wait queue. Note that this is racy, somebody else * could wakeup before we're finished queuing. * NB: We're using nearly the same twice-in-a-row lock acquisition * protocol as LWLockAcquire(). Check its comments for details. */ LWLockQueueSelf(l, LW_WAIT_UNTIL_FREE); /* we're now guaranteed to be woken up if necessary */ mustwait = LWLockAttemptLock(l, LW_EXCLUSIVE, false, potentially_spurious); } Why is it important to Attempt lock after queuing in this case, can't we just re-check exclusive lock as done before queuing? 8. LWLockWaitForVar() { .. PRINT_LWDEBUG(LWLockAcquire undo queue, lock, mode); break; } else { PRINT_LWDEBUG(LWLockAcquire waiting 4, lock, mode); } .. } a. I think instead of LWLockAcquire, here we should use LWLockWaitForVar b. Isn't it better
Re: [HACKERS] Atomics hardware support table supported architectures
On Thu, Jun 19, 2014 at 10:43 AM, Merlin Moncure mmonc...@gmail.com wrote: On Thu, Jun 19, 2014 at 7:07 AM, Abhijit Menon-Sen a...@2ndquadrant.com wrote: Let's not pretend to support platforms we have no practical way of verifying. This is key. The buildfarm defines the set of platforms we support. This criterion has been proposed before, but I'm not sure I really agree with it. If having code around that targets obscure platforms is hindering the development of new features, then that's a reason to get rid of it. If we're pretty sure it's useless and doesn't work, so is that. But if it just hasn't been tested lately, I don't personally think that's a sufficient reason to excise it. Telling people that they can't have even the most minimal platform support code in PostgreSQL unless they're willing to contribute and maintain a BF VM indefinitely is not very friendly. Of course, the risk of their platform getting broken is higher if they don't, but that's different than making it a hard requirement. We should be looking at ways to make it easier not harder for people who don't yet run PostgreSQL to start doing so. We are an open source community; we should try to be, as far as possible, open. But I think this is all a bit off-topic for this thread. Andres has already implemented a fallback for people who haven't got CAS and fetch-and-add on their platform, so whether or not we deprecate some more platforms has no bearing at all on this patch. The question that needs to be resolved in order to move forward here is this: Consider the set of platforms we support, ignoring any that don't have spinlocks. Do any of them have only one of CAS and fetch-and-add, or do they all have both or neither? If it's the latter, then we're talking about moving from a two-tiered system that looks like this: 1. No spinlocks. 2. Spinlocks. ...to a three-tiered system that looks like this: 1. No atomics at all. 2. Spinlocks but nothing else. 3. Spinlocks, CAS, and fetch-and-add. I think that's eminently reasonable from a code maintenance and developer testing perspective. It sounds like all common systems and architectures would fall in category 3, meaning that people wouldn't have to worry about (just for example) significant differences between the atomic ops supported on Intel vs. PPC. For older or more obscure platforms, categories 2 and 1 would still be there when needed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] releaseOk and LWLockWaitForVar
On Tue, Jun 17, 2014 at 5:47 PM, Andres Freund and...@2ndquadrant.com wrote: Hi Heikki, All, Amit just pointed me to a case where the lwlock scalability patch apparently causes problems and I went on to review it and came across the following problem in 9.4/master: LWLockWaitForVar() doesn't set releaseOk to true when waiting again. Isn't that a bug? What if there's another locker coming in after LWLockWaitForVar() returns from the PGSemaphoreLock() but before it has acquired the spinlock? I also think the above mentioned scenario is a problem if releaseOk is not set to true in that case. While looking at function LWLockWaitForVar(), especially the code below: TRACE_POSTGRESQL_LWLOCK_ACQUIRE(T_NAME(l), T_ID(l), LW_EXCLUSIVE); I think in this function tracing is done assuming the Exclusive lock was acquired; however, access might have been granted because of a variable update. Basically this function's trace doesn't distinguish whether access was granted because there was no other exclusive locker or because the variable was updated. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] replication identifier format
On Mon, Jun 23, 2014 at 10:11 AM, Andres Freund and...@2ndquadrant.com wrote: Why? Users and other systems only ever see the external ID. Everything leaving the system is converted to the external form. The short id basically is only used in shared memory and in wal records. For both using longer strings would be problematic. In the patch I have the user can actually see them as they're stored in pg_replication_identifier, but there should never be a need for that. Hmm, so there's no requirement that the short IDs are consistent across different clusters that are replicating to each other? Nope. That seemed to be a hard requirement in the earlier discussions we had (~2 years ago). Oh, great. Somehow I missed the fact that that had been addressed. I had assumed that we still needed global identifiers in which case I think they'd need to be 64+ bits (preferably more like 128). If they only need to be locally significant that makes things much better. Is there any real reason to add a pg_replication_identifier table, or should we just let individual replication solutions manage the identifiers within their own configuration tables? I guess one question is: What happens if there are multiple replication solutions in use on a single server? How do they coordinate? If that's the case, that might address my concern, but I'd probably want to go back through the latest patch and think about it a bit more. I'll send out a new version after I'm finished with the newest atomic ops patch. Sweet. I'm a little backed up right now, but will look at it when able. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP patch for multiple column assignment in UPDATE
On Thu, Jun 19, 2014 at 9:37 AM, Tom Lane t...@sss.pgh.pa.us wrote: Pavel Stehule pavel.steh...@gmail.com writes: I did some tests and it looks like it allows only some form of nested loop. [ shrug... ] It's a subplan. One evaluation per outer row is what people are expecting. Is it theoretically possible to convert a construct like this to a semi-join? I notice we don't, even when this new syntax isn't used: rhaas=# explain select a, (select b from bar where foo.a = bar.a) from foo; QUERY PLAN Seq Scan on foo (cost=0.00..855145.00 rows=1 width=4) SubPlan 1 -> Seq Scan on bar (cost=0.00..85.50 rows=1 width=4) Filter: (foo.a = a) Planning time: 0.078 ms ...but I'm wondering if that's an unimplemented feature or if there's some reason why it's a bad idea. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
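For comparison, the hand-written join form of the same query. Note it is only equivalent when bar.a is unique: the scalar-subquery form raises an error if the subquery returns more than one row per outer row, which is presumably part of why the planner cannot blindly make this transformation.

    SELECT foo.a, bar.b
      FROM foo
      LEFT JOIN bar ON foo.a = bar.a;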
Re: [HACKERS] replication commands and log_statements
On Fri, Jun 20, 2014 at 9:48 AM, Tom Lane t...@sss.pgh.pa.us wrote: Fujii Masao masao.fu...@gmail.com writes: OK, I've just implemented the patch (attached) which does this, i.e., redefines log_statement as a list. Thanks to the patch, log_statement can be set to none, ddl, mod, dml, all, and any combinations of them. The meanings of none, ddl, mod and all are the same as they are now. New setting value dml logs both data modification statements like INSERT and query access statements like SELECT and COPY TO. I still don't like this one bit. It's turning log_statement from a reasonably clean design into a complete mess, which will be made even worse after you add replication control to it. Well, I don't object to having a separate GUC for replication command logging if people prefer that design. But let's not have any delusions of adequacy about log_statement. I've had more than one conversation with customers about that particular parameter, all of which involved the customer being unhappy that there were only four choices and they couldn't log the stuff that they cared about without logging a lot of other stuff that they didn't care about. Now, providing more choices there will, indisputably, add complexity, but it will also provide utility. What we're talking about here is not unlike what we went through with EXPLAIN syntax. We repeatedly rejected patches that might have added useful functionality to EXPLAIN on the grounds that (1) we didn't want to make new keywords and (2) even if we did add new keywords, extending the EXPLAIN productions would produce grammar conflicts. Then, we implemented the extensible-options stuff, and suddenly it became possible for people to write patches adding useful functionality to EXPLAIN that didn't get sunk before it got out of the gate, and since then we've gotten a new EXPLAIN option every release or two, and IMHO all of those options are pretty useful. Similarly, building a logging facility that meets the needs of real users is going to require a configuration method more flexible than a total order with four choices. I happen to think a list of comma-separated tokens is a pretty good choice, but something else could be OK, too. We need something better than what we've got now, though. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
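Under the proposed (uncommitted) patch, the setting would accept a comma-separated list. A sketch, with values taken from Fujii's description upthread:

    # postgresql.conf, with the proposed list-valued log_statement
    log_statement = 'ddl, dml'    # log DDL plus INSERT/UPDATE/DELETE/SELECT/COPY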
Re: [HACKERS] How to use the 'char() ' as data type and a function name in the same time.
rohtodeveloper rohtodevelo...@outlook.com wrote: When I use the pg_catalog.char(integer) function in postgres, I can only use it like these: select pg_catalog.char(65); select "char"(65); But I want to use the function in the following way. select char(1); Try using the chr() function. What you are running into is a result of the fact that besides the char datatype, PostgreSQL has a "char" datatype, with a completely different implementation. While that may not have been the best decision, there is enough working code that depends on both of these types that it is unlikely to change. http://www.postgresql.org/docs/current/interactive/datatype-character.html Be careful, though; I would not generally recommend the use of either char or "char" in application tables or code. The char type is only included for standards compliance, and has surprising semantics and generally poorer performance than the alternatives. The "char" type is only intended for use in the system catalogs; if you are considering using it elsewhere, you should consider an enum instead. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
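To make Kevin's suggestion concrete: chr() is the ordinary function for code-point-to-text conversion, with ascii() as its inverse.

    SELECT chr(65);      -- 'A'
    SELECT ascii('A');   -- 65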
Re: [HACKERS] JSON and Postgres Variable Queries
On Fri, Jun 20, 2014 at 11:26 AM, Joey Caughey jcaug...@parrotmarketing.com wrote: I'm having an issue with JSON requests in Postgres and was wondering if anyone had an answer. I have an orders table with a field called "json_data". In the json data there is a plan's array with an id value in them. { "plan": { "id": "1" } } I can do regular queries that will work, like so: SELECT json_data->>'plan'->>'id' as plan_id FROM orders; But if I try to query on the data that is returned it will fail: SELECT json_data->>'plan'->>'id' as plan_id FROM orders WHERE plan_id = 1; OR SELECT json_data->>'plan'->>'id' as plan_id FROM orders GROUP BY plan_id; OR SELECT json_data->>'plan'->>'id' as plan_id FROM orders ORDER BY plan_id; Is this something that has been overlooked? or is there another way to go about this? You might find a sub-SELECT helpful: SELECT * FROM (SELECT json_data->>'plan'->>'id' as plan_id FROM orders) x WHERE plan_id = 1 It might be a generally useful thing for WHERE-clause items to be able to reference items from the target list by alias, or maybe it's problematic for some reason that I don't know about, but right now they can't. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] replication commands and log_statements
* Robert Haas (robertmh...@gmail.com) wrote: Similarly, building a logging facility that meets the needs of real users is going to require a configuration method more flexible than a total order with four choices. I happen to think a list of comma-separated tokens is a pretty good choice, but something else could be OK, too. We need something better than what we've got now, though. Absolutely, and not all of that should be done through postgresql.conf, imv. The level of flexibility that is being asked for, from what I've seen and heard, really needs to be met with new catalog tables or extending existing ones. The nearby thread on pg_audit would be a good example. I certainly don't want to be specifying specific table names or providing a list of roles through any GUC (or configuration file of any kind). I'm not averse to improving log_statement to a list with tokens that indicate various meaning levels but anything done through that mechanism will be very coarse. Thanks, Stephen
Re: [HACKERS] Atomics hardware support table supported architectures
On 2014-06-23 10:29:54 -0400, Robert Haas wrote: On Thu, Jun 19, 2014 at 10:43 AM, Merlin Moncure mmonc...@gmail.com wrote: On Thu, Jun 19, 2014 at 7:07 AM, Abhijit Menon-Sen a...@2ndquadrant.com wrote: Let's not pretend to support platforms we have no practical way of verifying. This is key. The buildfarm defines the set of platforms we support. This criterion has been proposed before, but I'm not sure I really agree with it. If having code around that targets obscure platforms is hindering the development of new features, then that's a reason to get rid of it. I think that's the case. That we still have !PG_USE_INLINE support although all buildfarm animals support it since 4c8aa8b (fixing acc) causes absurd constructs (STATIC_IF_INLINE) and fugly macro usage making it harder to read and modify code. I spent a good chunk of time just to make that work. Even if nobody's going to ever use it. That we need a fallback for older gccs instead of relying on somebody else having done the atomics abstraction causes a significant waste of time. That we have support for platforms that we haven't even documented as working (e.g. SuperH) or platforms that don't work in the real world (m32r) means that one has to think about and research so many more edge cases that it's really hard to make progress. I don't know how often I've now sequentially gone through s_lock.h, s_lock.c, src/backend/port/tas to understand all the supported combinations. If we're pretty sure it's useless and doesn't work, so is that. But if it just hasn't been tested lately, I don't personally think that's a sufficient reason to excise it. Sure, it just not being tested isn't a reason alone. But it's about more than that. I've now spent *far* too much time understanding the idioms of architectures that'll never get used again so I don't break them accidentally. Sure, that's my choice. But it's imo necessary for postgres to scale. I don't know about you, but for me reading/understanding/modifying assembly on some platform you've never used is *HARD*. And even harder if there's nobody around that's going to test it. It's entirely possible to write incorrect locking assembly that'll break when used. Telling people that they can't have even the most minimal platform support code in PostgreSQL unless they're willing to contribute and maintain a BF VM indefinitely is not very friendly. Of course, the risk of their platform getting broken is higher if they don't, but that's different than making it a hard requirement. I agree that we shouldn't actively try to break stuff. But having to understand and blindly modify unused code is at the other end of the spectrum from actively breaking platforms. It's actively hindering development. We should be looking at ways to make it easier not harder for people who don't yet run PostgreSQL to start doing so. We are an open source community; we should try to be, as far as possible, open. Do you really think supporting platforms that 20 people worldwide run for fun in their spare time once in a while (sparc v8plus) is achieving that? I think it's more likely that easier to understand code, quick review for patches and better testing help with that. But I think this is all a bit off-topic for this thread. Andres has already implemented a fallback for people who haven't got CAS and fetch-and-add on their platform, so whether or not we deprecate some more platforms has no bearing at all on this patch. While I indeed have that fallback code, that statement is still not entirely true. 
We still need to add atomics support for lots of platforms, otherwise they're just going to be 'less supported' than now. Are we fine with that, and will we just accept patches? The question that needs to be resolved in order to move forward here is this: Consider the set of platforms we support, ignoring any that don't have spinlocks. Do any of them have only one of CAS and fetch-and-add, or do they all have both or neither? If it's the latter, then we're talking about moving from a two-tiered system that looks like this: Since fetch-and-add is trivially implemented using CAS, there's not much need to distinguish between CAS and CAS + fetch_and_add. From my POV the restriction to just CAS/fetch_and_add isn't actually buying much. Pretty much all platforms but older gcc ones have atomic intrinsics since forever. Once you've dug up the documentation for one operation, not adding two more or so doesn't save much. 1. No spinlocks. 2. Spinlocks. ...to a three-tiered system that looks like this: 1. No atomics at all. 2. Spinlocks but nothing else. 3. Spinlocks, CAS, and fetch-and-add. I think that's eminently reasonable from a code maintenance and developer testing perspective. It sounds like all common systems and architectures would fall in category 3, meaning that people wouldn't have to worry about (just for example) significant differences between the atomic ops supported on Intel vs. PPC.
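To illustrate the "trivially implemented" claim: a sketch of fetch-and-add layered over compare-and-swap, spelled here with GCC's legacy __sync builtin (the actual patch wraps these operations differently).

    #include <stdint.h>

    /* Returns the value *ptr held before adding 'add'. */
    static uint32_t
    fetch_and_add_u32(volatile uint32_t *ptr, uint32_t add)
    {
        uint32_t old;

        /* Retry until no other thread modified *ptr between read and CAS. */
        do
        {
            old = *ptr;
        } while (!__sync_bool_compare_and_swap(ptr, old, old + add));

        return old;
    }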
Re: [HACKERS] replication commands and log_statements
On Mon, Jun 23, 2014 at 11:15 AM, Stephen Frost sfr...@snowman.net wrote: * Robert Haas (robertmh...@gmail.com) wrote: Similarly, building a logging facility that meets the needs of real users is going to require a configuration method more flexible than a total order with four choices. I happen to think a list of comma-separated tokens is a pretty good choice, but something else could be OK, too. We need something better than what we've got now, though. Absolutely, and not all of that should be done through postgresql.conf, imv. The level of flexibility that is being asked for, from what I've seen and heard, really needs to be met with new catalog tables or extending existing ones. The nearby thread on pg_audit would be a good example. I certainly don't want to be specifying specific table names or providing a list of roles through any GUC (or configuration file of any kind). I'm not adverse to improving log_statement to a list with tokens that indicate various meaning levels but anything done through that mechanism will be very coarse. True, but it would be enough to let you make a list of commands you'd like to audit, and I think that might be valuable enough to justify the price of admission. I certainly agree that logging based on which object is being accessed needs a different design, tied into the catalogs. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] please review source(SQLServer compatible)
On 06/23/2014 10:51 AM, rohtodeveloper wrote: Dear all, Our application will be switched from SQL Server to PostgreSQL. However, a few functions are not supported yet. So we decided to extend it. The functions are as follows: 1. SQL statement support: INSERT statement without INTO keyword, DELETE statement without FROM keyword 2. Built-in functions: SQUARE CHAR CHARINDEX LEN REPLICATE SPACE STR STUFF CONVERT DATALENGTH DATEADD DATEDIFF DATEPART DAY MONTH YEAR EOMONTH GETDATE SYSDATETIME 3. Operators: operator !< (Not Less Than), operator !> (Not Greater Than), operator + (String Concatenation) 4. Other: DataType support (smalldatetime, datetime, datetime2, uniqueidentifier), Date, Time, and Timestamp Escape Sequences, ODBC Scalar Functions OCTET_LENGTH CURRENT_DATE CURRENT_TIME The extended functions are almost completed but your opinion is very important to us. Would you please help us to review the extended source? The attachment is the diff source. Thank you very much. I think this effort is fundamentally misguided. It will mean a maintenance nightmare for you. You would be much better off migrating your app to rid it of these SQLServerisms, especially those that require backend changes. If you have layered your application correctly, so that the places where it calls SQL are relatively confined, then this should not be terribly difficult. If you have not, then you have bigger problems than these anyway. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
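To underline the point about fixing the application instead of the backend: most of the listed T-SQL-isms already have stock PostgreSQL spellings, for example (illustrative, not exhaustive):

    INSERT INTO t VALUES (1);          -- INTO is not optional
    DELETE FROM t WHERE id = 1;        -- FROM is not optional
    SELECT chr(65);                    -- CHAR(65)
    SELECT position('b' in 'abc');     -- CHARINDEX('b', 'abc')
    SELECT length('abc');              -- LEN('abc')
    SELECT repeat('x', 3);             -- REPLICATE('x', 3)
    SELECT now();                      -- GETDATE()
    SELECT 'a' || 'b';                 -- 'a' + 'b'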
Re: [HACKERS] Minmax indexes
Some comments, aside from the design wrt. bounding boxes etc. : On 06/15/2014 05:34 AM, Alvaro Herrera wrote: Robert Haas wrote: On Wed, Sep 25, 2013 at 4:34 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Here's an updated version of this patch, with fixes to all the bugs reported so far. Thanks to Thom Brown, Jaime Casanova, Erik Rijkers and Amit Kapila for the reports. I'm not very happy with the use of a separate relation fork for storing this data. Here's a new version of this patch. Now the revmap is not stored in a separate fork, but together with all the regular data, as explained elsewhere in the thread. Thanks! Please update the README accordingly. If I understand the code correctly, the revmap is a three-level deep structure. The bottom level consists of regular revmap pages, and each regular revmap page is filled with ItemPointerDatas, which point to the index tuples. The middle level consists of array revmap pages, and each array revmap page contains an array of BlockNumbers of the regular revmap pages. The top level is an array of BlockNumbers of the array revmap pages, and it is stored in the metapage. With 8k block size, that's just enough to cover the full range of 2^32-1 blocks that you'll need if you set mm_pages_per_range=1. Each regular revmap page can store about 8192/6 = 1365 item pointers, each array revmap page can store about 8192/4 = 2048 block references, and the size of the top array is 8192/4. That's just enough; to store the required number of array pages in the top array, the array needs to be (2^32/1365)/2048 ≈ 1536 elements large. But with 4k or smaller blocks, it's not enough. I wonder if it would be simpler to just always store the revmap pages in the beginning of the index, before any other pages. Finding the revmap page would then be just as easy as with a separate fork. When the table/index is extended so that a new revmap page is needed, move the existing page at that block out of the way. Locking needs some consideration, but I think it would be feasible and simpler than you have now. I have followed the suggestion by Amit to overwrite the index tuple when a new heap tuple is inserted, instead of creating a separate index tuple. This saves a lot of index bloat. This required a new entry point in bufpage.c, PageOverwriteItemData(). bufpage.c also has a new function PageIndexDeleteNoCompact which is similar in spirit to PageIndexMultiDelete except that item pointers do not change. This is necessary because the revmap stores item pointers, and such reference would break if we were to renumber items in index pages. ISTM that when the old tuple cannot be updated in-place, the new index tuple is inserted with mm_doinsert(), but the old tuple is never deleted. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
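Spelling out the capacity arithmetic in Heikki's 8k-block case, rounding up at each level:

    2^32 ranges / 1365 per regular revmap page  = 3,146,863 regular pages
    3,146,863 regular pages / 2048 per array page = 1,537 array pages
    1,537 top-level entries * 4 bytes            = ~6 kB, which fits in the 8 kB metapage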
Re: [HACKERS] replication identifier format
On 2014-06-23 10:45:51 -0400, Robert Haas wrote: On Mon, Jun 23, 2014 at 10:11 AM, Andres Freund and...@2ndquadrant.com wrote: Why? Users and other systems only ever see the external ID. Everything leaving the system is converted to the external form. The short id basically is only used in shared memory and in wal records. For both using longer strings would be problematic. In the patch I have the user can actually see them as they're stored in pg_replication_identifier, but there should never be a need for that. Hmm, so there's no requirement that the short IDs are consistent across different clusters that are replication to each other? Nope. That seemed to be a hard requirement in the earlier discussions we had (~2 years ago). Oh, great. Somehow I missed the fact that that had been addressed. I had assumed that we still needed global identifiers in which case I think they'd need to be 64+ bits (preferably more like 128). If they only need to be locally significant that makes things much better. Well, I was just talking about the 'short ids' here and how they are used in crash recovery/shmem et al. Those indeed don't need to be coordinated. If you ever use logical decoding on a system that receives changes from other systems (cascading replication, multimaster) you'll likely want to add the *long* form of that identifier to the output in the output plugin so the downstream nodes can identify the source. How one specific replication solution deals with coordinating this between systems is essentially that suite's problem. The external identifier currently is a 'text' column, so essentially unlimited. (Well, I just noticed that the table currently doesn't have a toast table assigned, so it's only a couple kb right now, but ...) Is there any real reason to add a pg_replication_identifier table, or should we just let individual replication solutions manage the identifiers within their own configuration tables? I don't think that'd work. During crash recovery the short/internal IDs are read from WAL records and need to be unique across *all* databases. Since there's no way for different replication solutions or even the same to coordinate this across databases (as there's no way to add shared relations) it has to be builtin. It's also useful so we can have stuff like the 'pg_replication_identifier_progress' view which tells you internal_id, external_id, remote_lsn, local_lsn. Just showing the internal ID would imo be bad. I guess one question is: What happens if there are multiple replication solutions in use on a single server? How do they coordinate? What's your concern here? You're wondering how they can make sure the identifiers they create are non-overlapping? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON and Postgres Variable Queries
On 06/23/2014 11:06 AM, Robert Haas wrote: On Fri, Jun 20, 2014 at 11:26 AM, Joey Caughey jcaug...@parrotmarketing.com wrote: I'm having an issue with JSON requests in Postgres and was wondering if anyone had an answer. I have an orders table with a field called "json_data". In the json data there is a plan's array with an id value in them. { "plan": { "id": "1" } } I can do regular queries that will work, like so: SELECT json_data->>'plan'->>'id' as plan_id FROM orders; But if I try to query on the data that is returned it will fail: SELECT json_data->>'plan'->>'id' as plan_id FROM orders WHERE plan_id = 1; OR SELECT json_data->>'plan'->>'id' as plan_id FROM orders GROUP BY plan_id; OR SELECT json_data->>'plan'->>'id' as plan_id FROM orders ORDER BY plan_id; Is this something that has been overlooked? or is there another way to go about this? You might find a sub-SELECT helpful: SELECT * FROM (SELECT json_data->>'plan'->>'id' as plan_id FROM orders) x WHERE plan_id = 1 It might be a generally useful thing for WHERE-clause items to be able to reference items from the target list by alias, or maybe it's problematic for some reason that I don't know about, but right now they can't. Once again, json_data->>'plan'->>'id' is an expression guaranteed to fail, since ->> returns text but expects its left hand operand to be json, unlike json_data->'plan'->>'id' or json_data#>>'{plan,id}'. So I don't believe the OP's original statement about what is and isn't working. The alias issue, of course, is not at all JSON-specific, and the subselect is one solution - a CTE is another. But you CAN use the alias in an ORDER BY or GROUP BY. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
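Putting Andrew's corrected operators and the alias workaround together, a sketch against the OP's assumed schema (orders table with a json_data column):

    WITH x AS (
        SELECT json_data->'plan'->>'id' AS plan_id   -- ->> yields text
          FROM orders
    )
    SELECT *
      FROM x
     WHERE plan_id = '1'      -- compare as text, since ->> returns text
     ORDER BY plan_id;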
Re: [HACKERS] Wait free LW_SHARED acquisition - v0.2
On 2014-06-23 19:59:10 +0530, Amit Kapila wrote: On Tue, Jun 17, 2014 at 8:56 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-06-17 20:47:51 +0530, Amit Kapila wrote: On Tue, Jun 17, 2014 at 6:35 PM, Andres Freund and...@2ndquadrant.com wrote: You have followed it pretty well as far as I can understand from your replies, as there is no reproducible test (which I think is bit tricky to prepare), so it becomes difficult to explain by theory. I'm working an updated patch that moves the releaseOK into the spinlocks. Maybe that's the problem already - it's certainly not correct as is. Sure, I will do the test/performance test with updated patch; you might want to include some more changes based on comments in mail below: I'm nearly finished in cleaning up the atomics part of the patch which also includes a bit of cleanup of the lwlocks code. Few more comments: 1. LWLockAcquireCommon() { .. iterations++; } In current logic, I could not see any use of these *iterations* variable. It's useful for debugging. Should be gone in the final code. 2. LWLockAcquireCommon() { .. if (!LWLockDequeueSelf(l)) { /* * Somebody else dequeued us and has or will.. .. */ for (;;) { PGSemaphoreLock(proc-sem, false); if (!proc-lwWaiting) break; extraWaits++; } lock-releaseOK = true; } Do we want to set result = false; after waking in above code? The idea behind setting false is to indicate whether we get the lock immediately or not which previously was decided based on if it needs to queue itself? Hm. I don't think it's clear which version is better. 3. LWLockAcquireCommon() { .. /* * Ok, at this point we couldn't grab the lock on the first try. We * cannot simply queue ourselves to the end of the list and wait to be * woken up because by now the lock could long have been released. * Instead add us to the queue and try to grab the lock again. If we * suceed we need to revert the queuing and be happy, otherwise we * recheck the lock. If we still couldn't grab it, we know that the * other lock will see our queue entries when releasing since they * existed before we checked for the lock. */ /* add to the queue */ LWLockQueueSelf(l, mode); /* we're now guaranteed to be woken up if necessary */ mustwait = LWLockAttemptLock(l, mode, false, potentially_spurious); .. } a. By reading above code and comments, it is not quite clear why second attempt is important unless somebody thinks on it or refer your comments in *Notes* section at top of file. I think it's better to indicate in some way so that code reader can refer to Notes section or whereever you are planing to keep those comments. Ok. b. There is typo in above comment suceed/succeed. Thanks, fixed. 4. LWLockAcquireCommon() { .. if (!LWLockDequeueSelf(l)) { for (;;) { PGSemaphoreLock(proc-sem, false); if (!proc-lwWaiting) break; extraWaits++; } lock-releaseOK = true; .. } Setting releaseOK in above context might not be required because if the control comes in this part of code, it will not retry to acquire another time. Hm. You're probably right. 5. LWLockWaitForVar() { .. else mustwait = false; if (!mustwait) break; .. } I think we can directly break in else part in above code. Well, there's another case of mustwait=false above which is triggered while the spinlock is held. Don't think it'd get simpler. 6. LWLockWaitForVar() { .. /* * Quick test first to see if it the slot is free right now. * * XXX: the caller uses a spinlock before this,... 
*/ if (pg_atomic_read_u32(lock->lockcount) == 0) return true; } Is the part of the comment that refers to the spinlock still relevant after using atomic ops? Yes. pg_atomic_read_u32() isn't a memory barrier (and explicitly documented not to be). 7. LWLockWaitForVar() { .. /* * Add myself to wait queue. Note that this is racy, somebody else * could wakeup before we're finished queuing. * NB: We're using nearly the same twice-in-a-row lock acquisition * protocol as LWLockAcquire(). Check its comments for details. */ LWLockQueueSelf(l, LW_WAIT_UNTIL_FREE); /* we're now guaranteed to be woken up if necessary */ mustwait = LWLockAttemptLock(l, LW_EXCLUSIVE, false, potentially_spurious); } Why is it important to attempt the lock after queuing in this case; can't we just re-check the exclusive lock as done before queuing? Well, that's how Heikki designed LWLockWaitForVar(). 8. LWLockWaitForVar() { .. PRINT_LWDEBUG("LWLockAcquire undo queue", lock, mode); break; } else { PRINT_LWDEBUG("LWLockAcquire waiting 4", lock, mode); } .. } a. I think instead of LWLockAcquire, here we should use LWLockWaitForVar, right? b. Isn't it better to use LOG_LWDEBUG instead of PRINT_LWDEBUG(), as PRINT_LWDEBUG() is generally used in this file at the entry of functions to log info about locks? Fine with me. 9. LWLockUpdateVar() { /*
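To make the lock-word manipulation under discussion concrete, a self-contained sketch using C11 atomics (the field name, bit layout, and helpers are made up for illustration; this is not the patch's code):

    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define EXCLUSIVE_BIT (1u << 31)        /* high bit marks an exclusive holder */

    typedef struct { _Atomic uint32_t lockcount; } LWLockSketch;

    /* One attempt at the lock; returns true when the caller must wait. */
    static bool attempt_lock(LWLockSketch *lock, bool exclusive)
    {
        uint32_t old = atomic_load(&lock->lockcount);
        for (;;)
        {
            uint32_t newval;
            if (exclusive)
            {
                if (old != 0)
                    return true;            /* held in any mode */
                newval = EXCLUSIVE_BIT;
            }
            else
            {
                if (old & EXCLUSIVE_BIT)
                    return true;            /* exclusively held */
                newval = old + 1;           /* one more shared holder */
            }
            /* weak CAS may fail spuriously; 'old' is refreshed on failure */
            if (atomic_compare_exchange_weak(&lock->lockcount, &old, newval))
                return false;               /* got it - wait-free for readers */
        }
    }

    int main(void)
    {
        LWLockSketch l = { 0 };
        printf("shared: mustwait=%d\n", attempt_lock(&l, false));  /* 0 */
        printf("excl:   mustwait=%d\n", attempt_lock(&l, true));   /* 1 */
        return 0;
    }

The queue-then-retry dance in points 3 and 7 exists because a second lock attempt after queuing guarantees that a releasing backend either hands over the lock or sees the queue entry.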
Re: [HACKERS] [BUGS] BUG #10728: json_to_recordset with nested json objects NULLs columns
Michael Paquier michael.paqu...@gmail.com writes: Digging more into that, I have found the issue and a fix for it. It happens that populate_recordset_object_start, which is used to initialize the process for the population of the record, is invoked *each* time a json object is found, re-creating the hash table for the parsing process every time, hence removing from PopulateRecordsetState all the entries already parsed and creating the problem reported by Matti. The fix I am proposing is rather simple: simply bypass the creation of the hash table if lex_level > 1, since we are in the presence of a nested object and can rely on the existing hash table. Yes, this code is clearly not handling the nested-objects case correctly. I had written a fix more or less equivalent to yours last night. However, it seems to me that these functions (json[b]_to_record[set]) are handling the nested-json-objects case in a fairly brain-dead fashion to start with. I would like to propose that we should think about getting rid of the use_json_as_text flag arguments altogether. What purpose do they serve? If we're going to the trouble of parsing the nested JSON objects anyway, why don't we just reconstruct the record from that data? (IOW, we probably actually should have nested hash tables in this case. I suspect that the current bug arose from incompletely-written logic to do it like that.) Since we've already decided to force an initdb for 9.4beta2, it's not quite too late to revisit this API, and I think it needs revisiting. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
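For readers following along, a sketch of the guard being proposed (illustrative only; the real callback lives in src/backend/utils/adt/jsonfuncs.c and the surrounding code is elided):

    static void
    populate_recordset_object_start(void *state)
    {
        PopulateRecordsetState *_state = (PopulateRecordsetState *) state;

        /* Only a top-level object starts a new record; for nested objects
         * (lex_level > 1), keep using the hash table already being filled. */
        if (_state->lex->lex_level > 1)
            return;

        /* ... existing code: build a fresh hash table for this record ... */
    }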
Re: [HACKERS] /proc/self/oom_adj is deprecated in newer Linux kernels
Gurjeet Singh gurj...@singh.im writes: would it be possible to include this in 9.4 as well? While this is clearly an improvement over what we had before, it's impossible to argue that it's a bug fix, and we are way past the 9.4 feature freeze deadline. In particular, packagers who've already done their 9.4 development work might be blindsided by us slipping this into 9.4 release. So while I wouldn't have a problem with putting this change into 9.4 from a technical standpoint, it's hard to argue that it'd meet project norms from a development-process standpoint. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL
Stephen Frost sfr...@snowman.net writes: I'd expect a catalog table or perhaps changes to pg_class (maybe other things also..) to define what gets logged. How exactly will that work for log messages generated in contexts where we do not have working catalog access? (postmaster, crash recovery, or pretty much anywhere where we're not in a valid transaction.) This strikes me as much like the periodic suggestions we hear to get rid of the GUC infrastructure in favor of keeping all those settings in a table. It doesn't work because too much of that info is needed below the level of working table access. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: [HACKERS] please review source(SQLServer compatible)
Andrew Dunstan and...@dunslane.net wrote: On 06/23/2014 10:51 AM, rohtodeveloper wrote: Our application will be switched from SQL Server to PostgreSQL. However, a few functions are not supported yet. So we decided to extend it. The functions are as follows: 1.SQL statement support INSERT statement without INTO keyword DELETE statement without FROM keyword Those would be pretty trivial to do in core; the question is whether the community would agree that a few extra lines in the parser (and compatibility sections of the docs) is worth it for portability from SQL Server and Sybase. 2.Built-in functions SQUARE CHAR CHARINDEX LEN REPLICATE SPACE STR STUFF CONVERT DATALENGTH DATEADD DATEDIFF DATEPART DAY MONTH YEAR EOMONTH GETDATE SYSDATETIME 3.Operator operator !< (Not Less Than) operator !> (Not Greater Than) operator + (String Concatenation) It seems likely that you could write an extension to add these (using the CREATE EXTENSION feature) and submit them to http://pgxn.org if you wanted to. Is there some reason you're not going this route? 4.Other DataType support (smalldatetime, datetime, datetime2, uniqueidentifier) Date, Time, and Timestamp Escape Sequences ODBC Scalar Functions OCTET_LENGTH CURRENT_DATE CURRENT_TIME You can add data types (including within extensions), and some of those are things which seem to be implemented in some form. test=# select current_date; date 2014-06-23 (1 row) test=# select current_time; timetz 10:44:36.958967-05 (1 row) test=# select octet_length('abcd'); octet_length -- 4 (1 row) test=# select octet_length('π'); octet_length -- 2 (1 row) If the point is that you want to change the semantics of existing valid PostgreSQL statements, that's probably not a good idea. The extended functions are almost completed but your opinion is very important to us. Would you please help us to review the extended source? http://wiki.postgresql.org/wiki/Submitting_a_Patch The attachment is the diff source. I think if you want someone to look at this, you really need to provide a single file with a unified or context diff of the entire source tree. And you may have trouble finding anyone willing to review it for free unless you are explicitly looking to share the code for free. I think this effort is fundamentally misguided. It will mean a maintenance nightmare for you. You would be much better off migrating your app to rid it of these SQLServerisms, especially those that require backend changes. If you have layered your application correctly, so that the places it calls SQL are relatively confined, then this should not be terribly difficult. If you have not, then you have bigger problems than these anyway. There is certainly something to that point of view, but implementing compatibility shims can reduce the effort of migration, and isn't always a bad idea. One thing which is just going to need to be fixed in the application code is any instance of UPDATE with a FROM clause. SQL Server and PostgreSQL both have non-standard extensions which support such syntax, but with different semantics, so such a statement written for SQL Server will probably run without throwing an error under PostgreSQL, but will not do what it did under SQL Server. In many cases it will run for a *very* long time, and if allowed to finish will probably update rows which were not intended. 
-- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
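On the extension route suggested above, a hedged sketch of how a few of the listed T-SQL functions could be shimmed in plain SQL, with only approximate semantics (e.g. T-SQL LEN ignores trailing spaces, which this does not emulate):

    CREATE FUNCTION getdate() RETURNS timestamp
        LANGUAGE sql VOLATILE AS 'SELECT now()::timestamp';

    CREATE FUNCTION charindex(needle text, haystack text) RETURNS integer
        LANGUAGE sql IMMUTABLE AS 'SELECT position($1 in $2)';

    CREATE FUNCTION replicate(txt text, n integer) RETURNS text
        LANGUAGE sql IMMUTABLE AS 'SELECT repeat($1, $2)';

    -- SQL Server's + as string concatenation:
    CREATE FUNCTION textcat_plus(text, text) RETURNS text
        LANGUAGE sql IMMUTABLE AS 'SELECT $1 || $2';
    CREATE OPERATOR + (LEFTARG = text, RIGHTARG = text, PROCEDURE = textcat_plus);

The !< and !> operators could be added the same way; the parser-level items (INSERT without INTO, DELETE without FROM) are the ones that genuinely require backend changes.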
[HACKERS] Re: [HACKERS] Re: [HACKERS] please review source(SQLServer compatible)
2014-06-23 18:00 GMT+02:00 Kevin Grittner kgri...@ymail.com: Andrew Dunstan and...@dunslane.net wrote: On 06/23/2014 10:51 AM, rohtodeveloper wrote: Our application will be switched from SQL Server to PostgreSQL. However, a few functions are not supported yet. So we decided to extend it. The functions are as follows: 1.SQL statement support INSERT statement without INTO keyword DELETE statement without FROM keyword Those would be pretty trivial to do in core; the question is whether the community would agree that a few extra lines in the parser (and compatibility sections of the docs) is worth it for portability from SQL Server and Sybase. I am strongly against - it is murder of ANSI SQL Regards Pavel ... -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Atomics hardware support table supported architectures
On Mon, Jun 23, 2014 at 11:16 AM, Andres Freund and...@2ndquadrant.com wrote: This criterion has been proposed before, but I'm not sure I really agree with it. If having code around that targets obscure platforms is hindering the development of new features, then that's a reason to get rid of it. I think that's the case. At some level, I agree, but see below. That we still have !PG_USE_INLINE support although all buildfarm animals support it since 4c8aa8b (fixing acc) causes absurd constructs (STATIC_IF_INLINE) and fugly macro usage making it harder to read and modify code. I spent a good chunk of time just to make that work, even though nobody's ever going to use it. I expect that we're eventually going to make a decision to require inline support, but that commit was only last month, so probably not just yet. We've been trying hard to maintain support for C89, but I think even Tom is starting to wonder whether we ought to let a few widely-implemented post-C89 features in. That we need fallbacks for older gccs instead of relying on somebody else having done the atomics abstraction causes a significant waste of time. I agree, but I feel that's time well spent. gcc is not the only compiler, not everyone can or wants to run bleeding-edge versions, and I don't trust the gcc implementors to a sufficient degree to throw away our existing TAS implementations wholesale and rely on theirs instead. Building cross-platform software is sometimes difficult; but it's also got a lot of value. I've spent enough time over the years working on obscure platforms to value software that works even on obscure platforms, and I'm glad that PostgreSQL tries to do that, even if we don't always succeed perfectly. That we have support for platforms that we haven't even documented as working (e.g. SuperH) or platforms that don't work in the real world (m32r) means that one has to think about and research so many more edge cases that it's really hard to make progress. I don't know how often I've now sequentially gone through s_lock.h, s_lock.c, src/backend/port/tas to understand all the supported combinations. I agree that s_lock.h is the most painful part of this whole thing, mostly because I'd really like to get to the point where SpinLockAcquire() and SpinLockRelease() function as compiler barriers. If we're pretty sure it's useless and doesn't work, so is that. But if it just hasn't been tested lately, I don't personally think that's a sufficient reason to excise it. Sure, it just not being tested isn't a reason alone. But it's about more than that. I've now spent *far* too much time understanding the idioms of architectures that'll never get used again so I don't break them accidentally. Sure, that's my choice. But it's imo necessary for postgres to scale. I don't know about you, but for me reading/understanding/modifying assembly on some platform you've never used is *HARD*. And even harder if there's nobody around who's going to test it. It's entirely possible to write incorrect locking assembly that'll break when used. Agree. Telling people that they can't have even the most minimal platform support code in PostgreSQL unless they're willing to contribute and maintain a BF VM indefinitely is not very friendly. Of course, the risk of their platform getting broken is higher if they don't, but that's different than making it a hard requirement. I agree that we shouldn't actively try to break stuff. But having to understand and blindly modify unused code is at the other end of the spectrum from actively breaking platforms. 
It's actively hindering development. It's actively hindering a small number of important patches, but most developers, most of the time, do not need to care. We should be looking at ways to make it easier, not harder, for people who don't yet run PostgreSQL to start doing so. We are an open source community; we should try to be, as far as possible, open. Do you really think supporting a platform that 20 people worldwide run for fun in their spare time once in a while (sparc v8plus) is achieving that? I think it's more likely that easier-to-understand code, quick review of patches, and better testing help with that. I don't think we can solve this problem with a sledgehammer. We can whittle down the number of platforms in s_lock.h that require special treatment, and eventually some of the odder cases may go away altogether, and that's great. But I'd rather not make this patch dependent on the rate at which we feel comfortable removing cases from that file. But I think this is all a bit off-topic for this thread. Andres has already implemented a fallback for people who haven't got CAS and fetch-and-add on their platform, so whether or not we deprecate some more platforms has no bearing at all on this patch. While I indeed have that fallback code, that statement is still not entirely true. We still need to add atomics support for lots of platforms,
Re: [HACKERS] Re: [bug fix] multibyte messages are displayed incorrectly on the client
Heikki Linnakangas hlinnakan...@vmware.com writes: Earlier in this thread, MauMau pointed out that we can't do encoding conversions until we have connected to the database, because you need to read pg_conversion for that. That's because we support creating custom conversions with CREATE CONVERSION. Frankly, I don't think anyone cares about that feature. If we just dropped the CREATE/DROP CONVERSION feature altogether and hard-coded the conversions we have, there would be close to zero complaints. Even if you want to extend something around encodings and conversions, the CREATE CONVERSION interface is clunky. Firstly, conversions are per-database, and even schema-qualified, which just seems like an extra complication. You'll most likely want to modify the conversion across the whole system. Secondly, rather than define a new conversion between encodings, you'll likely want to define a whole new encoding with conversions to/from existing encodings, but you can't do that anyway without hacking the source code. There's certainly something to be said for that position. If there were any prospect of extensions defining new encodings someday, I'd argue for keeping CREATE CONVERSION. But the performance headaches would be substantial, and there aren't new encodings coming down the pike often enough to justify the work involved, so I don't see us ever doing CREATE ENCODING; and that means that CREATE CONVERSION is of little value. I'd kind of like to see this go just because having catalog accesses involved in encoding conversion setup is messy and fragile. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
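For readers unfamiliar with the feature being questioned, its documented shape (the names here are made up; the conversion function itself must be a C function with a fixed signature):

    CREATE CONVERSION myconv
        FOR 'LATIN1' TO 'UTF8'
        FROM my_latin1_to_utf8;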
Re: [HACKERS] JSON and Postgres Variable Queries
Robert Haas robertmh...@gmail.com writes: You might find a sub-SELECT helpful: SELECT * FROM (SELECT json_data->>'plan'->>'id' as plan_id FROM orders) x WHERE plan_id = 1 It might be a generally useful thing for WHERE-clause items to be able to reference items from the target list by alias, or maybe it's problematic for some reason that I don't know about, Standards compliance? It's not just a trivial syntactic issue either, but a rather fundamental conceptual one: expressions in the target list are not supposed to be evaluated until after the WHERE clause is. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] How about a proper TEMPORARY TABLESPACE?
On Wed, Jun 18, 2014 at 9:39 PM, Matheus de Oliveira matioli.math...@gmail.com wrote: Then, to summarize, Matheus must do: * use an option instead of changing the syntax and catalog to indicate that a tablespace is used to store temp objects Yes. I myself wasn't sure the TEMPORARY syntax would be good, but I would just like to hear more about it. Storing it as an option seems nice to me. I kind of like the TEMPORARY syntax, by analogy with what we do for tables. On the other hand, this situation isn't quite analogous: the tablespace itself is permanent; it's only the objects inside the tablespace that are temporary. Hmm. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
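To make the two directions concrete, an illustrative sketch (neither form is committed, and the option name is invented for the example):

    -- option-based: the tablespace is ordinary, but flagged for temp use
    CREATE TABLESPACE fast_tmp LOCATION '/ssd/pg_tmp'
        WITH (only_temp_files = true);

    -- syntax-based, by analogy with temporary tables
    CREATE TEMPORARY TABLESPACE fast_tmp LOCATION '/ssd/pg_tmp';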
Re: [HACKERS] Use a signal to trigger a memory context dump?
From: Andres Freund and...@2ndquadrant.com I wonder if it'd make sense to allow a signal to trigger a memory context dump? I and others more than once had the need to examine memory usage on production systems and using gdb isn't always realistic. +1 It would be nice if there were a generic infrastructure through which the DBA can get information about running backends. I wish for a functionality to dump info for all backends with a single operation as well as one backend at a time, because it would be difficult to ask users to choose a specific backend or operate on all backends, especially on Windows. The candidate items are: * memory context * stack trace: I'd like to implement this. * GUC settings: to know that backends are running with the intended settings. * prepared statements (= pg_prepared_statements): to know if applications are taking advantage of prepared statements for performance. Regards MauMau -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: [BUGS] BUG #10728: json_to_recordset with nested json objects NULLs columns
On Mon, Jun 23, 2014 at 10:43 AM, Tom Lane t...@sss.pgh.pa.us wrote: ... However, it seems to me that these functions (json[b]_to_record[set]) are handling the nested-json-objects case in a fairly brain-dead fashion to start with. I would like to propose that we should think about getting rid of the use_json_as_text flag arguments altogether. What purpose do they serve? If we're going to the trouble of parsing the nested JSON objects anyway, why don't we just reconstruct the record from that data? I think they should be removed. (I called this out in the feature level review: http://www.postgresql.org/message-id/CAHyXU0wqadCJk7MMkeARuuY05VrD=axdn6wdcemtuwo5p4c...@mail.gmail.com). AIUI, the flag was introduced as a workaround to try and deal with mapping nested structures. The text-variant 'json' functions have had these flags as well. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] please review source(SQLServer compatible)
On 06/23/2014 04:51 PM, rohtodeveloper wrote: 1.SQL statement support INSERT statement without INTO keyword DELETE statement without FROM keyword Why would we want this? -- Vik -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] replication identifier format
On Mon, Jun 23, 2014 at 11:28 AM, Andres Freund and...@2ndquadrant.com wrote: Oh, great. Somehow I missed the fact that that had been addressed. I had assumed that we still needed global identifiers in which case I think they'd need to be 64+ bits (preferably more like 128). If they only need to be locally significant that makes things much better. Well, I was just talking about the 'short ids' here and how they are used in crash recovery/shmem et al. Those indeed don't need to be coordinated. If you ever use logical decoding on a system that receives changes from other systems (cascading replication, multimaster) you'll likely want to add the *long* form of that identifier to the output in the output plugin so the downstream nodes can identify the source. How one specific replication solution deals with coordinating this between systems is essentially that suite's problem. OK. The external identifier currently is a 'text' column, so essentially unlimited. (Well, I just noticed that the table currently doesn't have a toast table assigned, so it's only a couple kb right now, but ...) OK. I have no clear reason to dislike that. Is there any real reason to add a pg_replication_identifier table, or should we just let individual replication solutions manage the identifiers within their own configuration tables? I don't think that'd work. During crash recovery the short/internal IDs are read from WAL records and need to be unique across *all* databases. Since there's no way for different replication solutions or even the same to coordinate this across databases (as there's no way to add shared relations) it has to be builtin. That makes sense. It's also useful so we can have stuff like the 'pg_replication_identifier_progress' view which tells you internal_id, external_id, remote_lsn, local_lsn. Just showing the internal ID would imo be bad. OK. I guess one question is: What happens if there are multiple replication solutions in use on a single server? How do they coordinate? What's your concern here? You're wondering how they can make sure the identifiers they create are non-overlapping? Yeah, I was just thinking that might be why you installed a catalog table for this, but now I see that there are several other reasons also. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SQL access to database attributes
On Sun, Jun 22, 2014 at 2:59 PM, Pavel Stehule pavel.steh...@gmail.com wrote: I found only one problem - first patch introduce a new property CONNECTION_LIMIT and replace previously used CONNECTION LIMIT in documentation. But CONNECTION LIMIT is still supported, but it is not documented. So for some readers it can look like breaking compatibility, but it is false. This should be documented better. Yeah, I think the old syntax should be documented also. See, e.g., what we do for COPY. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
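For the record, the two spellings under discussion look like this (illustrative; per the thread, the first keeps working even where the docs switch to the second):

    CREATE DATABASE appdb WITH CONNECTION LIMIT = 10;   -- historical syntax
    CREATE DATABASE appdb WITH CONNECTION_LIMIT = 10;   -- spelling from the patch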
Re: [HACKERS] Atomics hardware support table supported architectures
Robert Haas robertmh...@gmail.com writes: On Mon, Jun 23, 2014 at 11:16 AM, Andres Freund and...@2ndquadrant.com wrote: Since fetch-and-add is trivially implemented using CAS, there's not much need to distinguish between CAS and CAS + fetch_and_add. From my POV the restriction to just CAS/fetch_and_add isn't actually buying much. Pretty much all platforms but older gcc ones have atomic intrinsics since forever. Once you've dug up the documentation for one operation not adding two more or so doesn't save much. Again, the concern that was expressed at the developer's meeting was that the performance characteristics of the CAS loop might be significantly different from a native atomic op as to cause us heartburn. I think that's a valid concern... but if everything in common use has both CAS and fetch-and-add, then I think there's probably no issue here. What I want to know is whether we are going to agree that the set of atomics is going to be CAS plus fetch_and_add plus *nothing else*. Andres seems to envision that those will be a minimal set and then we'll freely use any other atomic op that seems handy as long as we can provide a fallback implementation of it. I agree with Robert that keeping track of the actual cross-platform performance characteristics of such code would be a nightmare. What I want to see is a list of atomic ops that is short enough that we can agree we do not care about PG performance on any platform that hasn't got (fast versions of) all of them. Then we will code and optimize on the basis of having all those ops and no others. We can offer fallback implementations of those ops so that older platforms aren't entirely dead in the water, but they will not be the focus of any performance testing. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
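Since the point that fetch-and-add reduces to CAS keeps coming up, a self-contained sketch of that fallback shape (generic C with GCC __sync builtins; not the actual patch):

    #include <stdint.h>
    #include <stdio.h>

    static uint32_t
    fetch_add_u32(volatile uint32_t *ptr, uint32_t add_)
    {
        uint32_t old;
        do {
            old = *ptr;     /* snapshot current value */
            /* retry if another updater slipped in between read and CAS */
        } while (!__sync_bool_compare_and_swap(ptr, old, old + add_));
        return old;         /* fetch-and-add returns the previous value */
    }

    int main(void)
    {
        volatile uint32_t v = 40;
        printf("%u then %u\n", fetch_add_u32(&v, 2), v);   /* 40 then 42 */
        return 0;
    }

The performance concern raised here is whether such a loop under heavy contention behaves acceptably compared to a native locked add, since the loop can spin when many updaters collide.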
Re: [HACKERS] Atomics hardware support table supported architectures
On 2014-06-23 12:08:08 -0400, Robert Haas wrote: That we still have !PG_USE_INLINE support although all buildfarm animals support it since 4c8aa8b (fixing acc) causes absurd constructs (STATIC_IF_INLINE) and fugly macro usage making it harder to read and modify code. I spend a good chunk of time just to make that work. Even if nobody's going to ever use it. I expect that we're eventually going to make a decision to require inline support, but that commit was only last month, so probably not just yet. We were in pretty much the same position before - it was just the quiet inline test tripping us up... That we have support for platforms that we haven't even documented as working (e.g. SuperH) or platforms that don't work in the realword (m32r) means that that one has to think about and research so many more edge cases that it's really hard to make progress. I don't know how often I've now sequentially gone through s_lock.h, s_lock.c, src/backend/port/tas to understand all the supported combinations. I agree that s_lock.h is the most painful part of this whole thing, mostly because I'd really like to get to the point where SpinLockAcquire() and SpinLockRelease() function as compiler barriers. s_lock.h is basically gone in my patch btw. Only old comments + defines for TAS/TAS_SPIN/S_UNLOCK etc mapping the old calls onto atomic flags remain. The new code should, hopefully, all be barrier safe. I agree that we shouldn't actively try to break stuff. But having to understand blindly modify unused code is on the other hand of actively breaking platforms. It's actively hindering development. It's actively hindering a small number of important patches, but most developers, most of the time, do not need to care. True. I guess I just seem to find myself hit by this a bit heavily :) While I indeed have that fallback code, that's statement is still not entirely true. We still need to add atomics support for lots of platforms, otherwise they're just going to be 'less supported' than now. Are we fine with that and just'll accept patches? I guess it depends on which platforms are affected. I certainly don't think any new facilities we add need to be more comprehensive than what I did in barrier.h, and in fact I think we could omit a few of the things I have there, like PA-RISC, Itanium, and Alpha. And, yeah, if somebody needs an atomics implementation for a platform we haven't got yet, they can submit a patch. I'll omit !gcc PA-RISC unless somebody complains. I don't think I'll be able to modify hpux_hppa.s + build infrastructure correctly and there's no buildfarm. That means it'll require --disable-spinlocks. I've now also removed sunstudio*.s because sunstudio has provided intrinsics for a *long* while. No idea whether it'll work out of the box, but it shouldn't be hard to mop up eventual bugs. The question that needs to be resolved in order to move forward here is this: Consider the set of platforms we support, ignoring any that don't have spinlocks. Do any of them have only one of fetch-and-add, or do they all have both or neither? If it's the latter, then we're talking about moving from a two-tiered system that looks like this: Since fetch-and-add is trivially implemented using CAS, there's not much need to distinguish between CAS and CAS + fetch_and_add. From my POV the restriction to just CAS/fetch_and_add isn't actually buying much. Pretty much all platforms but older gcc ones have atomic intrinsics since forever. 
Once you've dug up the documentation for one operation not adding two more or so doesn't save much. Again, the concern that was expressed at the developer's meeting was that the performance characteristics of the CAS loop might be significantly different from a native atomic op as to cause us heartburn. I think that's a valid concern... but if everything in common use has both CAS and fetch-and-add, then I think there's probably no issue here. Well, there's platforms where both CAS and atomic add are implemented using load linked/store conditional. So neither really is 100% native. But that's essentially how CAS et al are implemented in microcode on pretty much all somewhat recent cpus anyway. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] /proc/self/oom_adj is deprecated in newer Linux kernels
On Mon, Jun 23, 2014 at 11:52 AM, Tom Lane t...@sss.pgh.pa.us wrote: Gurjeet Singh gurj...@singh.im writes: would it be possible to include this in 9.4 as well? While this is clearly an improvement over what we had before, it's impossible to argue that it's a bug fix, and we are way past the 9.4 feature freeze deadline. In particular, packagers who've already done their 9.4 development work might be blindsided by us slipping this into 9.4 release. So while I wouldn't have a problem with putting this change into 9.4 from a technical standpoint, it's hard to argue that it'd meet project norms from a development-process standpoint. While I'd love to reduce the number of future installations without this fix in place, I respect the decision to honor project policy. At the same time, this change does not break anything. It introduces new environment variables which change the behaviour, but behaves the old way in the absence of those variables. So I guess it's not changing the default behaviour in incompatible way. BTW, does the project publish the feature-freeze deadlines and other dates somewhere (apart from on this list). I was looking the other day and didn't find any reference. Either the commitfest app or the 'Developer' section of the wiki [1] seem to be good candidates for this kind of information. [1]: https://wiki.postgresql.org/wiki/Development_information Best regards, -- Gurjeet Singh http://gurjeet.singh.im/ EDB www.EnterpriseDB.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
On Wed, Jun 18, 2014 at 2:18 PM, Stephen Frost sfr...@snowman.net wrote: I'm also of the opinion that this isn't strictly necessary for the initial RLS offering in PG- there's a clear way we could migrate existing users to a multi-policy system from a single-policy system. Sure, to get the performance and optimization benefits that we'd presumably have in the multi-policy case they'd need to re-work their RLS configuration, but for users who care, they'll likely be very happy to do so to gain those benefits. I think a lot depends on the syntax we choose. If we choose a syntax that only makes sense in a single-policy framework, then I think allowing upgrades to a multi-policy syntax is going to be really difficult. On the other hand, if we choose a syntax that allows multiple policies, I suspect we can support multiple policies from the beginning without much extra effort. - Require the user to specify in some way which of the available policies they want applied, and then apply only that one. I'd want to at least see a way to apply an ordering to the policies being applied, or have PG work out which one is cheapest and try that one first. Cost-based comparison of policies that return different results doesn't seem sensible to me. I think exactly the opposite, for the query planning reasons previously stated. I think the policies will quickly get so complicated that they're no longer optimizable. Here's a simple example: - Policy 1 allows the user to access rows for which complexfunc() returns true. - Policy 2 allows the user to access rows for which a = 1. Most users have access only through policy 2, but some have access through policy 1. Users who have access through policy 1 will always get a sequential scan, This is the thing which I most object to- if the quals being provided at any level are leakproof and would be able to reduce the returned set sufficiently that an index scan is the best bet, we should be doing that. I don't anticipate the RLS quals to be as selective as the quals which the user is adding. I think it would be a VERY bad idea to design the system around the assumption that the RLS quals will be much more or less selective than the user-supplied quals. That's going to be different in different environments. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] releaseOk and LWLockWaitForVar
On 06/17/2014 03:17 PM, Andres Freund wrote: LWLockWaitForVar() doesn't set releaseOk to true when waiting again. Isn't that a bug? LWLockWaitForVar() waits in LW_WAIT_UNTIL_FREE mode, because it's not interested in acquiring the lock; it just wants to be woken up when the lock is released (or the var is updated). LWLockRelease doesn't clear releaseOK when it wakes up a LW_WAIT_UNTIL_FREE-mode waiter. What if there's another locker coming in after LWLockWaitForVar() returns from the PGSemaphoreLock() but before it has acquired the spinlock? Now, it might be that it's unproblematic because of the specific way these locks are used right now, but it doesn't seem like a good idea to leave it that way. In that scenario, LWLockWaitForVar() will grab the spinlock after the other process. What happens next depends on whether the value of the variable it guards was changed. If it was, LWLockWaitForVar() will see that it changed, and return false without waiting again. If the value didn't change, it will sleep until the new locker releases the lock. In either case, I don't see a problem with releaseOK. It seems correct as it is. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Atomics hardware support table supported architectures
On 2014-06-23 09:28:19 -0700, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: On Mon, Jun 23, 2014 at 11:16 AM, Andres Freund and...@2ndquadrant.com wrote: Since fetch-and-add is trivially implemented using CAS, there's not much need to distinguish between CAS and CAS + fetch_and_add. From my POV the restriction to just CAS/fetch_and_add isn't actually buying much. Pretty much all platforms but older gcc ones have atomic intrinsics since forever. Once you've dug up the documentation for one operation not adding two more or so doesn't save much. Again, the concern that was expressed at the developer's meeting was that the performance characteristics of the CAS loop might be significantly different from a native atomic op as to cause us heartburn. I think that's a valid concern... but if everything in common use has both CAS and fetch-and-add, then I think there's probably no issue here. What I want to know is whether we are going to agree that the set of atomics is going to be CAS plus fetch_and_add plus *nothing else*. It's going to be TAS, CAS, fetch_and_add, right? Since TAS is the only thing supported on some platforms? The only op I'm currently wondering about adding is an atomic exchange, without compare, to that set. All platforms that support CAS also have a non-comparing version of it. Right now the patch also uses __sync_fetch_and_sub() in the generic gcc implementation instead of doing the negation itself, but that's easily fixable. Andres seems to envision that those will be a minimal set and then we'll freely use any other atomic op that seems handy as long as we can provide a fallback implementation of it. Well, I *do* also want pg_atomic_fetch_and/or_u32() - but I'm totally fine with those two only being implemented with CAS. On all platforms. Otherwise the next scalability patch I'm going to submit will just have to open code a CAS loop for it, which doesn't seem to help. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] releaseOk and LWLockWaitForVar
On 06/23/2014 05:38 PM, Amit Kapila wrote: While looking at function LWLockWaitForVar(), especially the below code: TRACE_POSTGRESQL_LWLOCK_ACQUIRE(T_NAME(l), T_ID(l), LW_EXCLUSIVE); I think in this function tracing is done assuming the Exclusive lock was acquired; however access might have been granted because the variable was updated. Basically this function's trace doesn't distinguish whether access was granted because there is no other exclusive locker or because the variable was updated. Yeah. Not sure it's worth it to add new TRACE points for this; I'm not really familiar with the way the traces work or how people use them. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SQL access to database attributes
On 06/23/2014 06:21 PM, Robert Haas wrote: On Sun, Jun 22, 2014 at 2:59 PM, Pavel Stehule pavel.steh...@gmail.com wrote: I found only one problem - first patch introduce a new property CONNECTION_LIMIT and replace previously used CONNECTION LIMIT in documentation. But CONNECTION LIMIT is still supported, but it is not documented. So for some readers it can look like breaking compatibility, but it is false. This should be documented better. Yeah, I think the old syntax should be documented also. Why do we want to document syntax that should eventually be deprecated? See, e.g., what we do for COPY. Exactly. We're still carrying around baggage from 7.2! Backward compatibility: yes. Backward documentation: no. -- Vik -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SQL access to database attributes
2014-06-23 18:39 GMT+02:00 Vik Fearing vik.fear...@dalibo.com: On 06/23/2014 06:21 PM, Robert Haas wrote: On Sun, Jun 22, 2014 at 2:59 PM, Pavel Stehule pavel.steh...@gmail.com wrote: I found only one problem - first patch introduce a new property CONNECTION_LIMIT and replace previously used CONNECTION LIMIT in documentation. But CONNECTION LIMIT is still supported, but it is not documented. So for some readers it can look like breaking compatibility, but it is false. This should be documented better. Yeah, I think the old syntax should be documented also. Why do we want to document syntax that should eventually be deprecated? It is fair to our users. It can be deprecated, ok; we can write in the docs that this feature will be deprecated within the next three years. Don't use it - but it should be documented. Pavel See, e.g., what we do for COPY. Exactly. We're still carrying around baggage from 7.2! Backward compatibility: yes. Backward documentation: no. -- Vik
Re: [HACKERS] Atomics hardware support table supported architectures
On Mon, Jun 23, 2014 at 12:29 PM, Andres Freund and...@2ndquadrant.com wrote: That we have support for platforms that we haven't even documented as working (e.g. SuperH) or platforms that don't work in the realword (m32r) means that that one has to think about and research so many more edge cases that it's really hard to make progress. I don't know how often I've now sequentially gone through s_lock.h, s_lock.c, src/backend/port/tas to understand all the supported combinations. I agree that s_lock.h is the most painful part of this whole thing, mostly because I'd really like to get to the point where SpinLockAcquire() and SpinLockRelease() function as compiler barriers. s_lock.h is basically gone in my patch btw. Only old comments + defines for TAS/TAS_SPIN/S_UNLOCK etc mapping the old calls onto atomic flags remain. The new code should, hopefully, all be barrier safe. Urp. I'm not sure that's at all a good idea. I don't love s_lock.h, but if we make a wholesale change, we risk breaking things that work today in obscure ways that are hard to find and debug. I thought we were talking about adding new facilities with their own fallbacks, not massively reengineering the facilities we've already got. I'll omit !gcc PA-RISC unless somebody complains. I don't think I'll be able to modify hpux_hppa.s + build infrastructure correctly and there's no buildfarm. That means it'll require --disable-spinlocks. I dunno whether we should care in that particular case, but in general I don't think it's OK for lack of support for the *new* atomics to prevent us from using the *existing* TAS support. Again, the concern that was expressed at the developer's meeting was that the performance characteristics of the CAS loop might be significantly different from a native atomic op as to cause us heartburn. I think that's a valid concern... but if everything in common use has both CAS and fetch-and-add, then I think there's probably no issue here. Well, there's platforms where both CAS and atomic add are implemented using load linked/store conditional. So neither really is 100% native. But that's essentially how CAS et al are implemented in microcode on pretty much all somewhat recent cpus anyway. I think on platforms that expose LL/SC, we have to assume that spurious failures will be infrequent enough not to matter to performance. If they do, that's something that's going to have to be fixed by the hardware manufacturer, not us. There is a danger here that we could end up with optimizations that are wins on some platforms and losses on others, but I think we're going to have to live with that. Refusing to use atomic ops at all isn't a viable way forward. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SQL access to database attributes
On Mon, Jun 23, 2014 at 12:39 PM, Vik Fearing vik.fear...@dalibo.com wrote: On 06/23/2014 06:21 PM, Robert Haas wrote: On Sun, Jun 22, 2014 at 2:59 PM, Pavel Stehule pavel.steh...@gmail.com wrote: I found only one problem - first patch introduce a new property CONNECTION_LIMIT and replace previously used CONNECTION LIMIT in documentation. But CONNECTION LIMIT is still supported, but it is not documented. So for some readers it can look like breaking compatibility, but it is false. This should be documented better. Yeah, I think the old syntax should be documented also. Why do we want to document syntax that should eventually be deprecated? Because otherwise existing users will wonder if their dumps can still be restored on newer systems. And also, because documentation is, in general, a good thing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] /proc/self/oom_adj is deprecated in newer Linux kernels
Gurjeet Singh gurj...@singh.im writes: On Mon, Jun 23, 2014 at 11:52 AM, Tom Lane t...@sss.pgh.pa.us wrote: While this is clearly an improvement over what we had before, it's impossible to argue that it's a bug fix, and we are way past the 9.4 feature freeze deadline. In particular, packagers who've already done their 9.4 development work might be blindsided by us slipping this into 9.4 release. So while I wouldn't have a problem with putting this change into 9.4 from a technical standpoint, it's hard to argue that it'd meet project norms from a development-process standpoint. While I'd love to reduce the number of future installations without this fix in place, I respect the decision to honor project policy. At the same time, this change does not break anything. It introduces new environment variables which change the behaviour, but behaves the old way in the absence of those variables. Uh, no, it doesn't. We removed the dependence on -DLINUX_OOM_SCORE_ADJ. If a packager is expecting that to still work in 9.4, he's going to be unpleasantly surprised, because the system will silently fail to do what he's expecting: it will run all the backend processes at no-OOM-kill priority, which is likely to be bad. It is possible to make packages that will work either way, along the lines of the advice I sent to the Red Hat guys: https://bugzilla.redhat.com/show_bug.cgi?id=1110969 but I think it's a bit late in the cycle to be telling packagers to do that for 9.4. BTW, does the project publish the feature-freeze deadlines and other dates somewhere (apart from on this list). It's usually in the dev meeting minutes https://wiki.postgresql.org/wiki/PgCon_2014_Developer_Meeting#9.5_Schedule regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] checking for interrupts during heap insertion
Hi, While talking to Amit Kapila this morning, he mentioned to me that there seem to be no CHECK_FOR_INTERRUPTS() calls anywhere in heap_multi_insert() or the functions it calls. Should there be? By way of contrast, heapgetpage() has this: /* * Be sure to check for interrupts at least once per page. Checks at * higher code levels won't be able to stop a seqscan that encounters many * pages' worth of consecutive dead tuples. */ CHECK_FOR_INTERRUPTS(); In heap_multi_insert(), we first do heap_prepare_insert() on each tuple, which may involve dirtying many pages, since it handles TOAST. Then, we loop over the tuples themselves and dirty a bunch more pages. All of that will normally happen pretty quickly, but if the I/O subsystem is very slow for some reason, such as due to heavy system load, then it might take quite a long time. I'm thinking we might want a CHECK_FOR_INTERRUPTS() in the following two places: 1. Inside toast_save_datum, at the top of the loop that starts with while (data_todo > 0). 2. Inside heap_multi_insert, at the top of the loop that starts with while (ndone < ntuples). Thoughts? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
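A sketch of the two proposed additions (placement only; the surrounding code is PostgreSQL's and elided here):

    /* 1. toast_save_datum(), src/backend/access/heap/tuptoaster.c */
    while (data_todo > 0)
    {
        CHECK_FOR_INTERRUPTS();     /* proposed: once per TOAST chunk */
        /* ... form and insert the next chunk ... */
    }

    /* 2. heap_multi_insert(), src/backend/access/heap/heapam.c */
    while (ndone < ntuples)
    {
        CHECK_FOR_INTERRUPTS();     /* proposed: once per filled page */
        /* ... place as many tuples as will fit on one page ... */
    }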
Re: [HACKERS] Use a signal to trigger a memory context dump?
Andres Freund and...@2ndquadrant.com writes: I wonder if it'd make sense to allow a signal to trigger a memory context dump? I and others more than once had the need to examine memory usage on production systems and using gdb isn't always realistic. I wonder if we could install a signal handler for some unused signal (e.g. SIGPWR) to dump memory. I'd also considered adding a SQL function that uses the SIGUSR1 signal multiplexing for the purpose but that's not necessarily nice if you have to investigate while SQL access isn't yet possible. There's also the problem that not all possibly interesting processes use the sigusr1 signal multiplexing. Well, you can't just have the signal handler call MemoryContextStats directly. (Even if the memory manager's state were 100% interrupt-safe, which it ain't, fprintf itself might not be safe either.) The closest approximation that I think would be reasonable is to set a flag that would be noticed by the next CHECK_FOR_INTERRUPTS macro. So you're already buying into the assumption that the process executes CHECK_FOR_INTERRUPTS fairly often. Which probably means that assuming it's using the standard sigusr1 handler isn't a big extra limitation. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
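A sketch of the flag-plus-CHECK_FOR_INTERRUPTS approach described above (variable and handler names are illustrative, not an actual patch):

    static volatile sig_atomic_t MemoryContextDumpPending = false;

    static void
    MemoryDumpSignalHandler(SIGNAL_ARGS)
    {
        /* async-signal-safe: only record the request */
        MemoryContextDumpPending = true;
    }

    /* ... later, in the code reached via CHECK_FOR_INTERRUPTS(): */
    if (MemoryContextDumpPending)
    {
        MemoryContextDumpPending = false;
        /* safe here: normal execution context, stderr usable */
        MemoryContextStats(TopMemoryContext);
    }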
Re: [HACKERS] Minmax indexes
Heikki Linnakangas wrote: Some comments, aside from the design wrt. bounding boxes etc. : Thanks. I haven't commented on that sub-thread because I think it's possible to come up with a reasonable design that solves the issue by adding a couple of amprocs. I need to do some more thinking to ensure it is really workable, and then I'll post my ideas. On 06/15/2014 05:34 AM, Alvaro Herrera wrote: Robert Haas wrote: If I understand the code correctly, the revmap is a three-level deep structure. The bottom level consists of regular revmap pages, and each regular revmap page is filled with ItemPointerDatas, which point to the index tuples. The middle level consists of array revmap pages, and each array revmap page contains an array of BlockNumbers of the regular revmap pages. The top level is an array of BlockNumbers of the array revmap pages, and it is stored in the metapage. Yep, that's correct. Essentially, we still have the revmap as a linear space (containing TIDs); the other two levels on top of that are only there to enable locating the physical page numbers for each revmap logical page. I make one exception that the first logical revmap page is always stored in page 1, to optimize the case of a smallish table (~1360 page ranges; approximately 1.3 gigabytes of data at 128 pages per range, or 170 megabytes at 16 pages per range.) Each page has a page header (24 bytes) and special space (4 bytes), so it has 8192-28=8164 bytes available for data, so 1360 item pointers. With 8k block size, that's just enough to cover the full range of 2^32-1 blocks that you'll need if you set mm_pages_per_range=1. Each regular revmap page can store about 8192/6 = 1365 item pointers, each array revmap page can store about 8192/4 = 2048 block references, and the size of the top array is 8192/4. That's just enough; to store the required number of array pages in the top array, the array needs to be (2^32/1365)/2048 ≈ 1536 elements large. But with 4k or smaller blocks, it's not enough. Yeah. As I said elsewhere, actual useful values are likely to be close to the read-ahead setting of the underlying disk; by default that'd be 16 pages (128kB), but I think it's common advice to increase the kernel setting to improve performance. I don't think we need to prevent minmax indexes with pages_per_range=1, but I don't think we need to ensure that that setting works with the largest tables, either, because it doesn't make any sense to set it up like that. Also, while there are some recommendations to set up a system with larger page sizes (32kB), I have never seen any recommendation to set it lower. It wouldn't make sense to build a system that has very large tables and use a smaller page size. So in other words, yes, you're correct that the mechanism doesn't work in some cases (small page size and index configured for highest level of detail), but the conditions are such that I don't think it matters. ISTM the thing to do here is to do the math at index creation time, and if we find that we don't have enough space in the metapage for all array revmap pointers we need, bail out and require the index to be created with a larger pages_per_range setting. I wonder if it would be simpler to just always store the revmap pages in the beginning of the index, before any other pages. Finding the revmap page would then be just as easy as with a separate fork. When the table/index is extended so that a new revmap page is needed, move the existing page at that block out of the way. 
Locking needs some consideration, but I think it would be feasible and simpler than you have now. Moving index items around is not easy, because you'd have to adjust the revmap to rewrite the item pointers. I have followed the suggestion by Amit to overwrite the index tuple when a new heap tuple is inserted, instead of creating a separate index tuple. This saves a lot of index bloat. This required a new entry point in bufpage.c, PageOverwriteItemData(). bufpage.c also has a new function PageIndexDeleteNoCompact which is similar in spirit to PageIndexMultiDelete except that item pointers do not change. This is necessary because the revmap stores item pointers, and such references would break if we were to renumber items in index pages. ISTM that when the old tuple cannot be updated in-place, the new index tuple is inserted with mm_doinsert(), but the old tuple is never deleted. It's deleted by the next vacuum. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
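For concreteness, a small arithmetic sketch of the three-level addressing described above, using the 8kB fan-out figures from the thread (1365 item pointers per regular revmap page, 2048 block numbers per array page); the function and constant names are hypothetical:

#include <stdint.h>

#define IPTRS_PER_REVMAP_PAGE  1365   /* ~8192/6: item pointers per regular revmap page */
#define BLKNOS_PER_ARRAY_PAGE  2048   /* ~8192/4: block numbers per array revmap page */

/* Locate the revmap entry for a heap block: which slot of the metapage's
 * top array, which slot of that array revmap page, and which item pointer
 * slot of the regular revmap page the entry lives in. */
static void
revmap_locate(uint32_t heapBlk, uint32_t pages_per_range,
              uint32_t *top_slot, uint32_t *array_slot, uint32_t *page_slot)
{
    uint32_t entry   = heapBlk / pages_per_range;       /* linear revmap entry */
    uint32_t logpage = entry / IPTRS_PER_REVMAP_PAGE;   /* logical revmap page */

    *page_slot  = entry % IPTRS_PER_REVMAP_PAGE;
    *array_slot = logpage % BLKNOS_PER_ARRAY_PAGE;
    *top_slot   = logpage / BLKNOS_PER_ARRAY_PAGE;
}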
Re: [HACKERS] Atomics hardware support table supported architectures
On 2014-06-23 12:46:11 -0400, Robert Haas wrote: On Mon, Jun 23, 2014 at 12:29 PM, Andres Freund and...@2ndquadrant.com wrote: That we have support for platforms that we haven't even documented as working (e.g. SuperH) or platforms that don't work in the real world (m32r) means that one has to think about and research so many more edge cases that it's really hard to make progress. I don't know how often I've now sequentially gone through s_lock.h, s_lock.c, src/backend/port/tas to understand all the supported combinations. I agree that s_lock.h is the most painful part of this whole thing, mostly because I'd really like to get to the point where SpinLockAcquire() and SpinLockRelease() function as compiler barriers. s_lock.h is basically gone in my patch, btw. Only old comments + defines for TAS/TAS_SPIN/S_UNLOCK etc. mapping the old calls onto atomic flags remain. The new code should, hopefully, all be barrier safe. Urp. I'm not sure that's at all a good idea. I'm not sure either. But it got harder and harder to do it without that because of the currently interwoven dependencies. Barriers depend on spinlocks, atomics depend on barriers, atomics depend on spinlocks. And for a useful generic spinlock implementation, spinlocks depend on atomics. And then there's the problem that the spinlock-based atomics emulation needs to influence the semaphore spinlock implementation, because you otherwise can get deadlocks (imagine an atomic var being manipulated while a spinlock is held. If they map to the same semaphore...). I guess it's possible to remove all traces of s_lock.h from barrier.h; add an atomics-using fallback to s_lock.h, forcible using a define; and make the public part of the spinlock-based fallback not require the spinlock implementation somehow. But even if we do, I think we should get rid of s_lock.h in a later commit for 9.5. I don't love s_lock.h, but if we make a wholesale change, we risk breaking things that work today in obscure ways that are hard to find and debug. I thought we were talking about adding new facilities with their own fallbacks, not massively reengineering the facilities we've already got. I think we can separate it if we want, but I doubt it'll really make stuff simpler. I'll omit !gcc PA-RISC unless somebody complains. I don't think I'll be able to modify hpux_hppa.s + build infrastructure correctly and there's no buildfarm. That means it'll require --disable-spinlocks. I dunno whether we should care in that particular case, but in general I don't think it's OK for lack of support for the *new* atomics to prevent us from using the *existing* TAS support. I think I'll just need help from Tom for that case. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
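To make the layering problem concrete, here is a minimal sketch (not Andres's patch) of the spinlock-based atomics emulation being discussed. If slock_t itself is emulated with a semaphore under --disable-spinlocks, then taking ptr->mutex while the caller already holds a spinlock that maps to the same semaphore is exactly the self-deadlock he describes:

#include "postgres.h"
#include "storage/spin.h"

/* Sketch of an emulated 32-bit atomic counter; the type and function
 * names are assumptions, not the committed API. */
typedef struct
{
    slock_t         mutex;   /* protects value where no native atomics exist */
    volatile uint32 value;
} pg_atomic_uint32_sketch;

static uint32
pg_atomic_fetch_add_u32_sketch(pg_atomic_uint32_sketch *ptr, uint32 add)
{
    uint32 old;

    SpinLockAcquire(&ptr->mutex);
    old = ptr->value;
    ptr->value = old + add;
    SpinLockRelease(&ptr->mutex);
    return old;
}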
Re: [HACKERS] tab completion for setting search_path
On Mon, Jun 23, 2014 at 9:10 AM, Kevin Grittner kgri...@ymail.com wrote: Andres Freund and...@2ndquadrant.com wrote: On 2014-06-22 20:02:57 -0700, Tom Lane wrote: Ian Barwick i...@2ndquadrant.com writes: On 23/06/14 00:58, Andres Freund wrote: I thought about committing this but couldn't get over this bit. If you type SELECT * FROM pg_cat<tab> it'll get autocompleted to pg_catalog.pg_ and pg_temp<tab> will list all the temp schemas including the numeric and toast ones. So we have precedent for *not* bothering about excluding any schemas. I don't think we should start doing so in a piecemeal fashion in an individual command's completion. There is an exception of sorts already for system schemas, in that although SELECT * FROM p<tab> will list the system schemas, it will not list any tables from them, and won't until SELECT * FROM pg_<tab> is entered (see note in tab-complete.c around line 3722). Personally I'd be mildly annoyed if every SET search_path TO p<tab> resulted in all the system schemas being displayed when all I want is public; how about having these listed only once pg_ is entered, i.e. SET search_path TO pg_<tab>? I think there is a pretty strong practical argument for excluding the pg_temp and pg_toast schemas from completion for search_path, namely: when does anyone ever need to include those in their search_path explicitly? Infrequently, yes. I've only done it when trying to break stuff ;) The use-case for including pg_catalog in your path is perhaps a bit greater, but not by much. I don't know. It feels like inappropriate nannyism to me. More confusing than actually helpful. The schemas are there, so they should get autocompleted. But anyway, the common opinion seems to be swinging against my position, so let's do it that way. I would be for excluding the pg_toast, pg_toast_temp_n, and pg_temp_n schemas, and including public and pg_catalog. +1. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
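A sketch of what the agreed-upon filtering could look like in tab-complete.c terms. The macro name is hypothetical, but the shape follows the existing schema-completion queries (the %d/%s placeholders are filled in by the completion machinery, hence the doubled %%):

/* Hypothetical variant of the existing schema-list query: offer ordinary
 * schemas plus pg_catalog and public, but hide pg_toast*, pg_toast_temp_*
 * and pg_temp_* from SET search_path TO <tab>. */
#define Query_for_list_of_search_path_schemas \
"SELECT pg_catalog.quote_ident(nspname) "\
"  FROM pg_catalog.pg_namespace "\
" WHERE substring(pg_catalog.quote_ident(nspname),1,%d)='%s' "\
"   AND nspname NOT LIKE 'pg\\_toast%%' "\
"   AND nspname NOT LIKE 'pg\\_temp\\_%%'"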
Re: [HACKERS] Use a signal to trigger a memory context dump?
On 2014-06-23 10:07:36 -0700, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: I wonder if it'd make sense to allow a signal to trigger a memory context dump? I and others more than once had the need to examine memory usage on production systems and using gdb isn't always realistic. I wonder if we could install a signal handler for some unused signal (e.g. SIGPWR) to dump memory. I'd also considered adding a SQL function that uses the SIGUSR1 signal multiplexing for the purpose but that's not necessarily nice if you have to investigate while SQL access isn't yet possible. There's also the problem that not all possibly interesting processes use the sigusr1 signal multiplexing. Well, you can't just have the signal handler call MemoryContextStats directly. (Even if the memory manager's state were 100% interrupt-safe, which it ain't, fprintf itself might not be safe either.) Yea. And fprintf() definitely isn't. The closest approximation that I think would be reasonable is to set a flag that would be noticed by the next CHECK_FOR_INTERRUPTS macro. So you're already buying into the assumption that the process executes CHECK_FOR_INTERRUPTS fairly often. Which probably means that assuming it's using the standard sigusr1 handler isn't a big extra limitation. There seem to be far more subsystems doing CHECK_FOR_INTERRUPTS than using SIGUSR1 multiplexing. Several processes have their own SIGUSR1 handlers: * bgworkers (Which certainly is a major candidate for this. And: Isn't this a bug? Think recovery conflicts.) * startup process (certainly interesting as well) * checkpointer * walreceiver * walsender * wal writer * bgwriter * archiver * syslogger At least bgworkers, startup process, walsenders are definitely interesting from this POV. It very well might be best to provide a common sigusr1 implementation supporting a subset of multiplexing for some of those since they essentially all do the same... Although that'd require a fair bit of surgery in procsignal.c Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
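If the SIGUSR1 multiplexing route were taken instead, the wiring might look roughly like this; PROCSIG_MEMCXT_DUMP and the two helpers are hypothetical, while CheckProcSignal() and InterruptPending are the existing procsignal.c/miscadmin.h machinery:

#include "postgres.h"
#include <signal.h>
#include "miscadmin.h"
#include "storage/procsignal.h"
#include "utils/memutils.h"

static volatile sig_atomic_t MemCxtDumpPending = false;

/* Would be called from procsignal_sigusr1_handler() when
 * CheckProcSignal(PROCSIG_MEMCXT_DUMP) fires (hypothetical reason code). */
void
HandleMemCxtDumpInterrupt(void)
{
    MemCxtDumpPending = true;
    InterruptPending = true;    /* make CHECK_FOR_INTERRUPTS() look at us */
}

/* Would be called from ProcessInterrupts(). */
void
ProcessMemCxtDumpRequest(void)
{
    if (MemCxtDumpPending)
    {
        MemCxtDumpPending = false;
        MemoryContextStats(TopMemoryContext);
    }
}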
[HACKERS] Re: [HACKERS] please review source(SQLServer compatible)
Vik Fearing vik.fear...@dalibo.com wrote: On 06/23/2014 04:51 PM, rohtodeveloper wrote: 1. SQL statement support: INSERT statement without INTO keyword, DELETE statement without FROM keyword. Why would we want this? I'm pretty sure that the only argument for it is to ease migration of software from other DBMS products which allow that non-standard syntax for people who have chosen to use the non-standard form of the statement instead of the standard syntax (which is also available in all cases that I know of). If the SQL standard were static, I would actually lean toward allowing it, to make it easier for people to switch to PostgreSQL. The biggest down side I see is the possibility that some future version of the standard might implement some new syntax which is more difficult to implement if we need to also support this non-standard variation. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL
Tom, * Tom Lane (t...@sss.pgh.pa.us) wrote: Stephen Frost sfr...@snowman.net writes: I'd expect a catalog table or perhaps changes to pg_class (maybe other things also..) to define what gets logged. How exactly will that work for log messages generated in contexts where we do not have working catalog access? (postmaster, crash recovery, or pretty much anywhere where we're not in a valid transaction.) Certainly a great question, and we may have to address it by not supporting changes immediately (in other words, cache things at backend start, or only at specific times), or by reviewing what logging we do at those times and working out which of those can be controlled through the catalog and which can't. The logging which can't be managed through the catalog would have to be managed through GUCs or in another way. There's a difference between being able to have finer-grained control over certain log messages and having every single ereport() query the catalog. I'm not advocating the latter. This strikes me as much like the periodic suggestions we hear to get rid of the GUC infrastructure in favor of keeping all those settings in a table. It doesn't work because too much of that info is needed below the level of working table access. I'm not suggesting this. Thanks, Stephen signature.asc Description: Digital signature
Re: [Fwd: Re: [HACKERS] proposal: new long psql parameter --on-error-stop]
Hello I am sending a little bit modified patch per Fujii's comments - but I am not able to fix it more - it is a task for someone with better English skills than I have Regards Pavel 2014-06-23 10:02 GMT+02:00 Fujii Masao masao.fu...@gmail.com: On Mon, Jun 23, 2014 at 12:04 AM, Pavel Stehule pavel.steh...@gmail.com wrote: Hello third version with Erik's update Here are some of my comments: The psql documentation needs to be updated. At least the description of the new option this patch adds needs to be added to the document. +printf(_("  --help-variables  list of available configuration variables (options), then exit\n")); We should get rid of "of" from the message, or add "show" in front of "list of"? +printf(_("  ECHO  write all input lines to standard output\n")); This message seems not to be correct when ECHO=queries is set. +printf(_("  COMP_KEYWORD_CASE  determines which letter case to use when completing an SQL key word\n")); +printf(_("  DBNAME  name of currently connected database\n")); +printf(_("  ECHO  write all input lines to standard output\n")); I found that some help message lines use the normal form of a verb, but others do not. Should we standardize them? +printf(_("  PROMPT1, PROMPT2, PROMPT3  specify the psql prompt\n")); When the option name field is long, we should add a new line just after the name field and align the starting position of the option explanation field. That is, for example, the above should be printf(_("  PROMPT1, PROMPT2, PROMPT3\n  specify the psql prompt\n")); +printf(_("  ON_ERROR_ROLLBACK  when on, ROLLBACK on error\n")); This message seems incorrect to me. When this option is on and an error occurs in a transaction, the transaction continues rather than being rolled back, IIUC. I did not check the whole set of help messages yet, but ISTM some messages are not correct. It's better to check them again. +printf(_("  PSQL_RC  alternative location of the user's .psqlrc file\n")); Typo: "PSQL_RC" should be "PSQLRC" +printf(_("  PGDATABASE  same as the dbname connection parameter\n")); +printf(_("  PGHOST  same as the host connection parameter\n")); +printf(_("  PGPORT  same as the port connection parameter\n")); +printf(_("  PGUSER  same as the user connection parameter\n")); +printf(_("  PGPASSWORD  possibility to set password (not recommended)\n")); +printf(_("  PGPASSFILE  password file (default ~/.pgpass)\n")); I don't think that psql needs to display help messages even for the environment variables supported by libpq.
Regards, -- Fujii Masao

commit 4d8a267870f15a22818da226f72223db86944636
Author: Pavel Stehule pavel.steh...@gooddata.com
Date:   Mon Jun 23 19:38:41 2014 +0200

    without comments

diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c
index 3aa3c16..e960f34 100644
--- a/src/bin/psql/help.c
+++ b/src/bin/psql/help.c
@@ -78,12 +78,13 @@ usage(void)
 	printf(_("  -f, --file=FILENAME      execute commands from file, then exit\n"));
 	printf(_("  -l, --list               list available databases, then exit\n"));
 	printf(_("  -v, --set=, --variable=NAME=VALUE\n"
-	         "                           set psql variable NAME to VALUE\n"));
+	         "                           set psql variable NAME to VALUE e.g.: -v ON_ERROR_STOP=1\n"));
 	printf(_("  -V, --version            output version information, then exit\n"));
 	printf(_("  -X, --no-psqlrc          do not read startup file (~/.psqlrc)\n"));
 	printf(_("  -1 (\"one\"), --single-transaction\n"
 	         "                           execute as a single transaction (if non-interactive)\n"));
 	printf(_("  -?, --help               show this help, then exit\n"));
+	printf(_("  --help-variables         list of available configuration variables (options), then exit\n"));

 	printf(_("\nInput and output options:\n"));
 	printf(_("  -a, --echo-all           echo all input from script\n"));
@@ -279,6 +280,84 @@ slashUsage(unsigned short int pager)
 }

+/*
+ * show list of available variables (options) from command line
+ */
+void
+help_variables(void)
+{
+	printf(_("List of variables (options) for use from command line.\n"));
+
+	printf(_("psql variables:\n"));
+	printf(_("Usage:\n"));
+	printf(_("  psql --set=NAME=VALUE\n  or \\set NAME VALUE in interactive mode\n\n"));
+
+	printf(_("  AUTOCOMMIT         successful SQL commands are automatically committed\n"));
+	printf(_("  COMP_KEYWORD_CASE  determines which letter case to use when completing an SQL key word\n"));
+	printf(_("  DBNAME             name of currently connected database\n"));
+	printf(_("  ECHO               controls what input can be written to standard output [all, queries]\n"));
+	printf(_("  ECHO_HIDDEN        display internal queries (same as -E option)\n"));
+	printf(_("  ENCODING           current client character set encoding\n"));
+	printf(_("  FETCH_COUNT        fetch many rows
[HACKERS] Re: [HACKERS] Re: [HACKERS] please review source(SQLServer compatible)
2014-06-23 19:22 GMT+02:00 Kevin Grittner kgri...@ymail.com: Vik Fearing vik.fear...@dalibo.com wrote: On 06/23/2014 04:51 PM, rohtodeveloper wrote: 1. SQL statement support: INSERT statement without INTO keyword, DELETE statement without FROM keyword. Why would we want this? I'm pretty sure that the only argument for it is to ease migration of software from other DBMS products which allow that non-standard syntax for people who have chosen to use the non-standard form of the statement instead of the standard syntax (which is also available in all cases that I know of). There is a fork of PostgreSQL, http://www.tpostgres.org/se/, which can handle this task better. We don't support special syntax for Oracle anymore, nor for DB2, and I don't see any reason why we should do so for T-SQL. What's more, this is usually the simplest part of a migration from the Sybase family to PostgreSQL - the concepts of stored procedures, temp tables, and so on are totally different, so a simple migration is not possible without relatively hard changes to the PostgreSQL parser. If the SQL standard were static, I would actually lean toward allowing it, to make it easier for people to switch to PostgreSQL. The biggest down side I see is the possibility that some future version of the standard might implement some new syntax which is more difficult to implement if we need to also support this non-standard variation. yes. Regards Pavel -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] idle_in_transaction_timeout
On 06/22/2014 09:02 PM, Abhijit Menon-Sen wrote: I (somewhat reluctantly) agree with Kevin that idle_in_transaction_session_timeout (for FATAL) and idle_transaction_timeout (for ERROR) would work. Given that an IIT timeout has been a TODO for at least 6 years before being addressed, I'm not convinced that we need to worry about what an eventual error vs. fatal timeout should be named or how it should be scoped. Let's attack that when someone actually shows an inclination to work on it. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] /proc/self/oom_adj is deprecated in newer Linux kernels
On Mon, Jun 23, 2014 at 12:49 PM, Tom Lane t...@sss.pgh.pa.us wrote: Gurjeet Singh gurj...@singh.im writes: While I'd love to reduce the number of future installations without this fix in place, I respect the decision to honor project policy. At the same time, this change does not break anything. It introduces new environment variables which change the behaviour, but behaves the old way in the absence of those variables. Uh, no, it doesn't. We removed the dependence on -DLINUX_OOM_SCORE_ADJ. If a packager is expecting that to still work in 9.4, he's going to be unpleasantly surprised, because the system will silently fail to do what he's expecting: it will run all the backend processes at no-OOM-kill priority, which is likely to be bad. True. I didn't think from a packager's perspective. It is possible to make packages that will work either way, along the lines of the advice I sent to the Red Hat guys: https://bugzilla.redhat.com/show_bug.cgi?id=1110969 but I think it's a bit late in the cycle to be telling packagers to do that for 9.4. Understandable. BTW, does the project publish the feature-freeze deadlines and other dates somewhere (apart from on this list)? It's usually in the dev meeting minutes https://wiki.postgresql.org/wiki/PgCon_2014_Developer_Meeting#9.5_Schedule Thanks. But it doesn't spell out the specific dates, or even the month of feature-freeze. Best regards, -- Gurjeet Singh http://gurjeet.singh.im/ EDB www.EnterpriseDB.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
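For reference, a minimal sketch of the 9.4-style mechanism under discussion: the init script exports PG_OOM_ADJUST_FILE (and optionally PG_OOM_ADJUST_VALUE) before starting the postmaster, and each forked child re-enables OOM killing for itself. The function name is hypothetical; the environment variable names are the documented 9.4 ones:

#include <stdio.h>
#include <stdlib.h>

/* Called in the child right after fork(): the postmaster keeps whatever
 * OOM protection the init script gave it, backends become killable again. */
static void
reset_oom_score_in_child(void)
{
    const char *file = getenv("PG_OOM_ADJUST_FILE");  /* e.g. /proc/self/oom_score_adj */
    const char *val  = getenv("PG_OOM_ADJUST_VALUE"); /* defaults to "0" */
    FILE       *fp;

    if (file == NULL)
        return;                 /* no adjustment requested */
    if (val == NULL)
        val = "0";
    if ((fp = fopen(file, "w")) != NULL)
    {
        fputs(val, fp);
        fclose(fp);
    }
}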
Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?
* Robert Haas (robertmh...@gmail.com) wrote: On Wed, Jun 18, 2014 at 2:18 PM, Stephen Frost sfr...@snowman.net wrote: I'm also of the opinion that this isn't strictly necessary for the initial RLS offering in PG - there's a clear way we could migrate existing users to a multi-policy system from a single-policy system. Sure, to get the performance and optimization benefits that we'd presumably have in the multi-policy case they'd need to re-work their RLS configuration, but for users who care, they'll likely be very happy to do so to gain those benefits. I think a lot depends on the syntax we choose. If we choose a syntax that only makes sense in a single-policy framework, then I think allowing upgrades to a multi-policy syntax is going to be really difficult. On the other hand, if we choose a syntax that allows multiple policies, I suspect we can support multiple policies from the beginning without much extra effort. What are these policies going to depend on? Will they be allowed to overlap? I don't see multi-policy support as being very easily added. If there are specific ways to design the syntax which would make it easier to support multiple policies in the future, I'm all for it. Have any specific thoughts regarding that? - Require the user to specify in some way which of the available policies they want applied, and then apply only that one. I'd want to at least see a way to apply an ordering to the policies being applied, or have PG work out which one is cheapest and try that one first. Cost-based comparison of policies that return different results doesn't seem sensible to me. I keep coming back to the thought that, really, having multiple overlapping policies just adds unnecessary complication to the system for not much gain in real functionality. Being able to specify a policy per-role might be useful, but that's only one dimension and I can imagine a lot of other dimensions that one might want to use to control which policy is used. I think exactly the opposite, for the query planning reasons previously stated. I think the policies will quickly get so complicated that they're no longer optimizable. Here's a simple example: - Policy 1 allows the user to access rows for which complexfunc() returns true. - Policy 2 allows the user to access rows for which a = 1. Most users have access only through policy 2, but some have access through policy 1. Users who have access through policy 1 will always get a sequential scan. This is the thing which I most object to - if the quals being provided at any level are leakproof and would be able to reduce the returned set sufficiently that an index scan is the best bet, we should be doing that. I don't anticipate the RLS quals to be as selective as the quals which the user is adding. I think it would be a VERY bad idea to design the system around the assumption that the RLS quals will be much more or less selective than the user-supplied quals. That's going to be different in different environments. Fine - but do you really see the query planner having a problem pushing down whichever is the more selective qual, if the user-provided qual is marked as leakproof? I realize that you want multiple policies because you'd like a way for the RLS qual to be made simpler for certain cases while also having more complex quals for other cases. What I keep waiting to hear is exactly how you want to specify which policy is used because that's where it gets ugly and complicated.
I still really don't like the idea of trying to apply multiple policies inside of a single query execution. Thanks, Stephen signature.asc Description: Digital signature
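As a reference point for the pushdown argument above: the planner's existing security-barrier logic already encodes this rule, and something similar would presumably apply to RLS quals. A simplified sketch, where the wrapper is hypothetical but contain_leaked_vars() is the existing optimizer test:

#include "postgres.h"
#include "nodes/nodes.h"
#include "optimizer/clauses.h"

/* A user-supplied qual may be evaluated ahead of the row-security qual
 * only if it cannot leak row contents, i.e. every function it applies
 * to table data is marked LEAKPROOF. */
static bool
qual_safe_to_push_down(Node *qual)
{
    return !contain_leaked_vars(qual);
}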
Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL
At 2014-06-23 08:50:33 -0400, sfr...@snowman.net wrote: I'm not a huge fan of adding this as a contrib module I added it to the CF because we're interested in auditing functionality for 9.5, and as far as I can tell, there's nothing better available. At the moment, the contrib module has the advantage that it exists :-) and works with 9.[34] (and could be made to work with 9.2, though I didn't bother for the initial submission). unless we can be quite sure that there's a path forward from here to a rework of the logging in core which would actually support the features pg_audit is adding, without a lot of pain and upgrade issues. What sort of pain and upgrade issues did you have in mind? I'd expect a catalog table or perhaps changes to pg_class (maybe other things also..) to define what gets logged.. Please explain? (I wish extensions were able to add reloptions. That would have made it relatively easy for us to implement per-object audit logging.) I'd also like to see the ability to log based on the connecting user, and we need to log under what privileges a command is executing I imagine it's not useful to point out that you can do the former with pgaudit (using ALTER ROLE … SET), and that we log the effective userid for the latter (though maybe you had something more in mind)… and, really, a whole host of other things.. …but there's not a whole lot I can do with that, either. Is the minimal set of auditing features that we would need in-core very different from what pgaudit already has? -- Abhijit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Minmax indexes
On 06/23/2014 08:07 PM, Alvaro Herrera wrote: Heikki Linnakangas wrote: With 8k block size, that's just enough to cover the full range of 2^32-1 blocks that you'll need if you set mm_pages_per_range=1. Each regular revmap page can store about 8192/6 = 1365 item pointers, each array revmap page can store about 8192/4 = 2048 block references, and the size of the top array is 8192/4. That's just enough; to store the required number of array pages in the top array, the array needs to be (2^32/1365)/2048 = ~1536 elements large. But with 4k or smaller blocks, it's not enough. Yeah. As I said elsewhere, actual useful values are likely to be close to the read-ahead setting of the underlying disk; by default that'd be 16 pages (128kB), but I think it's common advice to increase the kernel setting to improve performance. My gut feeling is that it might well be best to set pages_per_range=1. Even if you do the same amount of I/O, thanks to kernel read-ahead, you might still avoid processing a lot of tuples. But I'd need to see some benchmarks to know. I don't think we need to prevent minmax indexes with pages_per_range=1, but I don't think we need to ensure that that setting works with the largest tables, either, because it doesn't make any sense to set it up like that. Also, while there are some recommendations to set up a system with larger page sizes (32kB), I have never seen any recommendation to set it lower. It wouldn't make sense to build a system that has very large tables and use a smaller page size. So in other words, yes, you're correct that the mechanism doesn't work in some cases (small page size and index configured for highest level of detail), but the conditions are such that I don't think it matters. ISTM the thing to do here is to do the math at index creation time, and if we find that we don't have enough space in the metapage for all the array revmap pointers we need, bail out and require the index to be created with a larger pages_per_range setting. Yeah, I agree that would be acceptable. I feel that the below would nevertheless be simpler: I wonder if it would be simpler to just always store the revmap pages in the beginning of the index, before any other pages. Finding the revmap page would then be just as easy as with a separate fork. When the table/index is extended so that a new revmap page is needed, move the existing page at that block out of the way. Locking needs some consideration, but I think it would be feasible and simpler than you have now. Moving index items around is not easy, because you'd have to adjust the revmap to rewrite the item pointers. Hmm. Two alternative schemes come to mind: 1. Move each index tuple off the page individually, updating the revmap while you do it, until the page is empty. Updating the revmap for a single index tuple isn't difficult; you have to do it anyway when an index tuple is replaced. (MMTuples don't contain a heap block number ATM, but IMHO they should, see below) 2. Store the new block number of the page that you moved out of the way in the revmap page, and leave the revmap pointers unchanged. The revmap pointers can be updated later, lazily. Both of those seem pretty straightforward. I have followed the suggestion by Amit to overwrite the index tuple when a new heap tuple is inserted, instead of creating a separate index tuple. This saves a lot of index bloat. This required a new entry point in bufpage.c, PageOverwriteItemData().
bufpage.c also has a new function PageIndexDeleteNoCompact which is similar in spirit to PageIndexMultiDelete except that item pointers do not change. This is necessary because the revmap stores item pointers, and such references would break if we were to renumber items in index pages. ISTM that when the old tuple cannot be updated in-place, the new index tuple is inserted with mm_doinsert(), but the old tuple is never deleted. It's deleted by the next vacuum. Ah, I see. Vacuum reads the whole index, and builds an in-memory hash table that contains an ItemPointerData for every tuple in the index. Doesn't that require a lot of memory for a large index? That might be acceptable - you ought to have plenty of RAM if you're pushing around multi-terabyte tables - but it would nevertheless be nice to not have a hard requirement for something as essential as vacuum. In addition to the hash table, remove_deletable_tuples() pallocs an array to hold an ItemPointer for every index tuple about to be removed. A single palloc is limited to 1GB, so that will fail outright if there are more than ~179 million dead index tuples. You're unlikely to hit that in practice, but if you do, you'll never be able to vacuum the index. So that's not very nice. Wouldn't it be simpler to remove the old tuple atomically with inserting the new tuple and updating the revmap? Or at least mark the old tuple as deletable, so that vacuum can just delete it, without building the large hash
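The ~179 million figure follows directly from the 1GB allocation cap; a quick back-of-the-envelope check, assuming the usual MaxAllocSize and 6-byte ItemPointerData:

#include <stdio.h>

#define MaxAllocSize  ((size_t) 0x3fffffff)  /* 1GB - 1, as in memutils.h */
#define SizeOfIptr    6                      /* BlockIdData (4) + OffsetNumber (2) */

int
main(void)
{
    /* prints 178956970, i.e. ~179 million item pointers per palloc */
    printf("%zu\n", MaxAllocSize / SizeOfIptr);
    return 0;
}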
Re: [HACKERS] checking for interrupts during heap insertion
On 06/23/2014 08:07 PM, Robert Haas wrote: While I was talking to Amit Kapila this morning, he mentioned to me that there seem to be no CHECK_FOR_INTERRUPTS() calls anywhere in heap_multi_insert() or the functions it calls. Should there be? Haven't heard any complaints, but I guess so... By way of contrast, heapgetpage() has this: /* * Be sure to check for interrupts at least once per page. Checks at * higher code levels won't be able to stop a seqscan that encounters many * pages' worth of consecutive dead tuples. */ CHECK_FOR_INTERRUPTS(); In heap_multi_insert(), we first do heap_prepare_insert() on each tuple, which may involve dirtying many pages, since it handles TOAST. Then, we loop over the tuples themselves and dirty a bunch more pages. All of that will normally happen pretty quickly, but if the I/O subsystem is very slow for some reason, such as due to heavy system load, then it might take quite a long time. I'm thinking we might want a CHECK_FOR_INTERRUPTS() in the following two places: 1. Inside toast_save_datum, at the top of the loop that starts with while (data_todo > 0). 2. Inside heap_multi_insert, at the top of the loop that starts with while (ndone < ntuples). Seems reasonable. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
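A placement sketch of the two proposed checks (9.4-era heapam.c and tuptoaster.c); only the CHECK_FOR_INTERRUPTS() lines are new, and the loop conditions are the ones quoted above:

/* in heap_multi_insert() */
while (ndone < ntuples)
{
    CHECK_FOR_INTERRUPTS();   /* proposed: allow cancel between page-fulls */
    /* ... existing body: fill and insert one page's worth of tuples ... */
}

/* in toast_save_datum() */
while (data_todo > 0)
{
    CHECK_FOR_INTERRUPTS();   /* proposed: TOASTing can dirty many pages */
    /* ... existing body: store one TOAST chunk ... */
}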
Re: [HACKERS] Minmax indexes
On Thu, Jun 19, 2014 at 12:32 PM, Greg Stark st...@mit.edu wrote: On Wed, Jun 18, 2014 at 4:51 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: Implementing something is a good way to demonstrate what it would look like. But no, I don't insist on implementing every possible design whenever a new feature is proposed. I liked Greg's sketch of what the opclass support functions would be. It doesn't seem significantly more complicated than what's in the patch now. As a counter-point to my own point, there will be nothing stopping us in the future from generalizing things. Dealing with catalogs is mostly book-keeping headaches and careful work. It's something that might be well-suited for a GSoC or first patch from someone looking to familiarize themselves with the system architecture. It's hard to invent a whole new underlying infrastructure at the same time as dealing with all that book-keeping, and it's hard for someone familiarizing themselves with the system to also have a great new idea. Having tasks like this that are easy to explain and that a mentor understands well can be easier to manage than tasks where the newcomer has some radical new idea. Generalizing this in the future would be highly likely to change the on-disk format for existing indexes, which would be a problem for pg_upgrade. I think we will likely be stuck with whatever the initial on-disk format looks like for a very long time, which is why I think we need to try rather hard to get this right the first time. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers